



Study with the several resources on Docsity
Earn points by helping other students or get them with a premium plan
Prepare for your exams
Study with the several resources on Docsity
Earn points to download
Earn points by helping other students or get them with a premium plan
Community
Ask the community for help and clear up your study doubts
Discover the best universities in your country according to Docsity users
Free resources
Download our free guides on studying techniques, anxiety management strategies, and thesis advice from Docsity tutors
CSE 490/590 – Midterm Exam –2025 V2 Solution|100% Complete Updated Questions and Answers.
Typology: Exams
1 / 7
This page cannot be seen from the preview
Don't miss anything!
Tag Cache Content 10b Mem(0x8) 01b Mem(0x5) 10b Mem(0xA) 11b Mem(0xF) Tag Cache Content 01b M(0x4) 01b M(0x5) 10b M(0xA) 11b M(0xF) Tag Cache Content 01b M(0x4) 01b M(0x5) 01b M(0x6) 11b M(0xF) Tag Cache Content 11b M(0xC) 01b M(0x5) 01b M(0x6) 11b M(0xF) Tag Cache Content 11b M(0xC) 01b M(0x5) 01b M(0x6) 11b M(0xF) .CSE 490/590 – Midterm Exam – 2025. V2 Solution [Question 1] (10 Points) Assume the presence of the following memories in a MIPS system: i. L1 cache ii. Main Memory iii. L3 cache iv. Registers v. SSDs vi. L2 cache vii. Hard drive Show the memory hierarchy and order them in terms of (State Increasing or decreasing order): a. Speed Registers - > L1 - > L2 - > L3 - > Main Memory - > SSDs - > Hard Drive (decreasing) b. Cost per byte Registers - > L1 - > L2 - > L3 - > Main Memory - > SSD - > Hard Drive (decreasing) a. Memory capacity (Size) Registers - > L1 - > L2 - > L3 - > Main Memory - > SSD - > Hard Drive (increasing) [Question 2] (10 Points) a) Consider a direct-mapped cache of size 4 bytes. Each index in the cache can hold only 1 word (here 1 word = 1 byte). Fill in the missing cache blocks at each step according to the address reference and specify whether it is a hit or a miss. The initial state of the cache is provided. Address references are (in order): Reference: 0x4 0x6 0xC 0x hit / miss hit / miss hit / miss hit / miss b) Calculate the hit rate for part a. ¼ = 0.
[Question 3] (12 Points) Consider three different processors P1, P2, and P3 executing the same instruction set. Processor Clock Speed in GHz CPI P1 3 2. P2 1.5 1. P3 4.0 2. Which processor has the highest performance expressed in instructions per second? (1 GHz = 10^9 Hz). Show your work. IPS = Cycles per second/CPI P1: 3/2.5 = 1. P2: 1.5/1.5 = 1 P3: 4/2 = 2 P3 is highest performance [Question 4] (12 Points) In the following instruction sequence, find all hazards. Rename the registers to eliminate the anti-dependencies and output dependences. Assume you have FP registers up to $F16 available. Show your work. div.s F0, F2, F mult.s F1, F0, F add.s F0, F2, F sub.s F8, F14, F Potential hazards RAW : F0 - > div.s and mult.s; WAR : F1 - > div.s and mult.s, F0 - > mult.s and add.s WAW : F0 - > div.s and add.s div.s F0, F2, F mult.s F9, F0, F add.s F10, F2, F sub.s F8, F14, F [Question 5] (12 Points)
Execution time(pipelined) < Execution time(non-pipelined) It shows that pipelining improved performance. [Question 7] (16 Points) Assume a 5 - stage, static dual-issue MIPS processor that can perform data forwarding. The 2 - issue MIPS helps us issue arithmetic and load/store instructions in parallel. The ideal 2 - issue MIPS, with no data dependencies will have a pipelining diagram as follows: For the instruction sequence given below, draw the pipelining diagram while taking all data dependencies into account. Show the data forwarding if necessary. Find the total number of clock cycles taken for the execution of the instruction sequence. An empty table for pipelining is given below for convenience add $s1, $s2, $s add $s3, $s2, $s sw $s1, 4($t2) add $s5, $s3, $s Address Instruction Pipeline Stages add s1,s2,s2 IF ID EX MEM WB add s3,s2,s3 IF ID EX MEM WB sw s1, 4(t2) IF ID EX MEM WB add s5, s3,s1 IF ID EX MEM WB 7 clock cycles
[Question 8] (16 Points) Consider the following floating-point instruction sequence on a processor (shown below) which uses Tomasulo’s Algorithm to dynamically schedule instructions (dual-issue per cycle – no speculation). The processor has the following non-pipelined execution units:
3 4 5 6 Rubric is the same as in V Adder Reservation Stations Tag 1 S1 Tag 2 S 1 2 y 4.0 8. 3 Instruction Executed: w x y Multiply Reservation Stations Tag 1 S1 Tag 2 S 4 x 8.0 4. 5 Instruction Executed: w x y FP Registers Busy Tag Data 0 Yes 2 3. 2 4. 4 Yes 4 2. 6 6. Adder Reservation Stations Tag 1 S1 Tag 2 S 1 2 y 4.0 8. 3 Instruction Executed: w x y Multiply Reservation Stations Tag 1 S1 Tag 2 S 4 x 8.0 4. 5 Instruction Executed: w x y FP Registers Busy Tag Data 0 Yes 2 3. 2 4. 4 Yes 4 2. 6 6. Adder Reservation Stations Tag 1 S1 Tag 2 S 1 2 3 Instruction Executed: w x y Multiply Reservation Stations Tag 1 S1 Tag 2 S 4 x 8.0 4. 5 Instruction Executed: w x y FP Registers Busy Tag Data 0 12. 2 4. 4 Yes 4 2. 6 6. Adder Reservation Stations Tag 1 S1 Tag 2 S 1 2 3 Instruction Executed: w x y Multiply Reservation Stations Tag 1 S1 Tag 2 S 4 5 Instruction Executed: w x y FP Registers Busy Tag Data 0 12. 2 4. 4 32. 6 6.