



Material Type: Assignment; Professor: Li; Class: Computer Organization; Subject: Systems and Computer Science; University: Howard University; Term: Unknown 2008;
Typology: Assignments
SYCS 201: Computer Organization Homework 5 Solutions
7.10 (12 pts) Since each block has 4 (= 2^2) words, the cache holds 16 / 4 = 4 = 2^2 blocks. Therefore, in a word address, bits 2 and 3 select the block index, while bits 0 and 1 locate the word within the block.
Given a word address X, the index of the block where the word is placed is ⌊X / (block size in words)⌋ modulo (total number of blocks), where ⌊Y⌋ denotes the largest integer less than or equal to Y.
For example, for the word address 27, the block index is ⌊27 / 4⌋ mod 4 = 6 mod 4 = 2.
Note that when a word is loaded into the cache, the three other words in the same block are also loaded. For example, when the word at address 27 is loaded, the words 24, 25 and 26 are loaded as well. According to the block index calculation shown above, the addresses 24 to 27 all map to the same block index.
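The index calculation above can be sketched in a few lines of Python. This is only an illustration; the function names `block_index` and `words_in_block` are made up for this sketch, and the defaults assume the four-word blocks and four-block cache of this exercise.

```python
def block_index(word_addr, block_words=4, num_blocks=4):
    """Direct-mapped block index: floor(X / block size) mod number of blocks."""
    return (word_addr // block_words) % num_blocks


def words_in_block(word_addr, block_words=4):
    """Word addresses that are loaded together with word_addr."""
    start = (word_addr // block_words) * block_words
    return list(range(start, start + block_words))


print(block_index(27))     # 2
print(words_in_block(27))  # [24, 25, 26, 27]
```

Integer division (`//`) plays the role of the floor operation for the non-negative addresses used here.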
Based on these calculations, the cache contents after each reference (and thus the hits and misses) are as follows.

| Referenced Word Address | Block 00 | Block 01 | Block 10 | Block 11 | Hit or Miss |
|---|---|---|---|---|---|
| 2 | [0,1,2,3] | | | | Miss |
| 3 | [0,1,2,3] | | | | Hit |
| 11 | [0,1,2,3] | | [8,9,10,11] | | Miss |
| 16 | [16,17,18,19] | | [8,9,10,11] | | Miss |
| 21 | [16,17,18,19] | [20,21,22,23] | [8,9,10,11] | | Miss |
| 13 | [16,17,18,19] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Miss |
| 64 | [64,65,66,67] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Miss |
| 48 | [48,49,50,51] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Miss |
| 19 | [16,17,18,19] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Miss |
| 11 | [16,17,18,19] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Hit |
| 3 | [0,1,2,3] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Miss |
| 22 | [0,1,2,3] | [20,21,22,23] | [8,9,10,11] | [12,13,14,15] | Hit |
| 4 | [0,1,2,3] | [4,5,6,7] | [8,9,10,11] | [12,13,14,15] | Miss |
| 27 | [0,1,2,3] | [4,5,6,7] | [24,25,26,27] | [12,13,14,15] | Miss |
| 6 | [0,1,2,3] | [4,5,6,7] | [24,25,26,27] | [12,13,14,15] | Hit |
| 11 | [0,1,2,3] | [4,5,6,7] | [8,9,10,11] | [12,13,14,15] | Miss |
The last line in the table shows the final content of the cache after all the references.
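The hit/miss sequence for the direct-mapped cache can be checked with a short simulation. The following is a minimal sketch, not part of the original solution; `simulate_direct_mapped` is a hypothetical helper, and the cache is modeled as a list holding one block number (word address divided by 4) per cache block.

```python
def simulate_direct_mapped(refs, block_words=4, num_blocks=4):
    """Return (outcomes, cache) for a direct-mapped cache.

    cache[i] holds the memory block number stored in cache block i,
    or None while that cache block is still empty.
    """
    cache = [None] * num_blocks
    outcomes = []
    for addr in refs:
        block = addr // block_words   # memory block containing this word
        idx = block % num_blocks      # cache block it must map to
        if cache[idx] == block:
            outcomes.append("Hit")
        else:
            cache[idx] = block        # load the block, evicting any old one
            outcomes.append("Miss")
    return outcomes, cache


REFS = [2, 3, 11, 16, 21, 13, 64, 48, 19, 11, 3, 22, 4, 27, 6, 11]
outcomes, final = simulate_direct_mapped(REFS)
print(outcomes.count("Hit"))  # 4
print(final)                  # [0, 1, 2, 3]
```

The final state `[0, 1, 2, 3]` lists block numbers, which correspond to the word ranges [0..3], [4..7], [8..11] and [12..15].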
7.12 (9 pts) The cache has 256 blocks and 16 words per block. Each block holds 16 words × 4 bytes per word = 64 bytes = 512 bits of data. To index the cache blocks, 8 bits are needed (there are 256 blocks, and 256 = 2^8). To index the bytes within a block, 6 bits are needed (each block has 64 bytes, and 64 = 2^6). The tag field is thus 32 – (8 + 6) = 18 bits. Each block has a tag and a valid bit in addition to the space for data storage. The total number of bits for the whole cache is therefore 256 × (data bits + tag bits + valid bit) = 256 × (512 + 18 + 1) = 135,936 bits.
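To make the field widths concrete, here is a small sketch that splits a 32-bit byte address into the 18-bit tag, 8-bit index and 6-bit byte offset derived above. The function name `split_address` and the sample address are illustrative assumptions, not part of the exercise.

```python
def split_address(addr, index_bits=8, offset_bits=6):
    """Split a 32-bit byte address into (tag, index, byte offset)."""
    offset = addr & ((1 << offset_bits) - 1)
    index = (addr >> offset_bits) & ((1 << index_bits) - 1)
    tag = addr >> (offset_bits + index_bits)
    return tag, index, offset


# Build a sample address from known fields, then take it apart again.
sample = (5 << 14) | (3 << 6) | 9
print(split_address(sample))      # (5, 3, 9)

# Total storage from the text: 256 blocks x (data + tag + valid) bits.
print(256 * (512 + 18 + 1))       # 135936
```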
7.14 (12 pts) The miss penalty is the time to transfer one block from main memory to the cache. Assume that it takes 1 clock cycle to send the address to the main memory. In the formulas below, each memory access takes 10 clock cycles and each transfer takes 1 clock cycle.
a. Configuration (a) requires one memory access for each word transferred. To retrieve a block, 16 transfers are needed, one word at a time. Therefore, 16 memory accesses are needed.
Miss penalty = 1 + 16×(10 + 1) = 177 clock cycles.
b. Configuration (b) requires one memory access for every four words transferred. To retrieve a block, 4 transfers are needed, four words at a time. Therefore, 4 memory accesses are needed.
Miss penalty = 1 + 4×(10 + 1) = 45 clock cycles.
c. In Configuration (c), one memory access makes four words ready for transfer. However, the four words still need to be transferred one at a time. Therefore, to retrieve a block, 4 memory accesses and 16 transfers are needed.
Miss penalty = 1 + 4×(10 + 4×1) = 57 clock cycles.
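The three miss penalties follow the same pattern, which the sketch below spells out. The constant and function names are assumptions made for illustration; the cycle counts (1 to send the address, 10 per memory access, 1 per transfer) and the 16-word block are taken from the formulas above.

```python
ADDRESS_CYCLES = 1    # send the address to main memory
ACCESS_CYCLES = 10    # one memory access
TRANSFER_CYCLES = 1   # one bus transfer
BLOCK_WORDS = 16      # words per cache block


def penalty_per_access_transfer(words_per_access):
    """Configurations (a) and (b): each access is followed by one transfer."""
    accesses = BLOCK_WORDS // words_per_access
    return ADDRESS_CYCLES + accesses * (ACCESS_CYCLES + TRANSFER_CYCLES)


def penalty_interleaved(banks):
    """Configuration (c): one access readies `banks` words, sent one at a time."""
    accesses = BLOCK_WORDS // banks
    return ADDRESS_CYCLES + accesses * (ACCESS_CYCLES + banks * TRANSFER_CYCLES)


print(penalty_per_access_transfer(1))  # 177, configuration (a)
print(penalty_per_access_transfer(4))  # 45, configuration (b)
print(penalty_interleaved(4))          # 57, configuration (c)
```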
7.17 (5 pts) AMAT = Hit time + Miss rate×Miss penalty = 2 ns + 0.05×(20 cycles×2 ns per cycle) = 4 ns
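As a quick check of the arithmetic, the AMAT formula can be written as a one-line function. The name `amat_ns` and its parameter names are hypothetical; the values plugged in are the ones used in the solution above.

```python
def amat_ns(hit_time_ns, miss_rate, miss_penalty_cycles, cycle_ns):
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_time_ns + miss_rate * (miss_penalty_cycles * cycle_ns)


# 2 ns hit time, 5% miss rate, 20-cycle penalty at 2 ns per cycle.
print(amat_ns(2, 0.05, 20, 2))  # about 4 ns
```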
| Referenced Word Address | Set 0, Block 0 | Set 0, Block 1 | Set 1, Block 0 | Set 1, Block 1 | Hit or Miss |
|---|---|---|---|---|---|
| 2 | [0,1,2,3] | | | | Miss |
| 3 | [0,1,2,3] | | | | Hit |
| 11 | [0,1,2,3] | [8,9,10,11] | | | Miss |
| 16 | [16,17,18,19] | [8,9,10,11] | | | Miss |
| 21 | [16,17,18,19] | [8,9,10,11] | [20,21,22,23] | | Miss |
| 13 | [16,17,18,19] | [8,9,10,11] | [20,21,22,23] | [12,13,14,15] | Miss |
| 64 | [16,17,18,19] | [64,65,66,67] | [20,21,22,23] | [12,13,14,15] | Miss |
| 48 | [48,49,50,51] | [64,65,66,67] | [20,21,22,23] | [12,13,14,15] | Miss |
| 19 | [48,49,50,51] | [16,17,18,19] | [20,21,22,23] | [12,13,14,15] | Miss |
| 11 | [8,9,10,11] | [16,17,18,19] | [20,21,22,23] | [12,13,14,15] | Miss |
| 3 | [8,9,10,11] | [0,1,2,3] | [20,21,22,23] | [12,13,14,15] | Miss |
| 22 | [8,9,10,11] | [0,1,2,3] | [20,21,22,23] | [12,13,14,15] | Hit |
| 4 | [8,9,10,11] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Miss |
| 27 | [24,25,26,27] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Miss |
| 6 | [24,25,26,27] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Hit |
| 11 | [24,25,26,27] | [8,9,10,11] | [20,21,22,23] | [4,5,6,7] | Miss |
The last line in the table shows the final content of the cache after all the references.
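The table with Set 0 and Set 1 is consistent with a two-way set-associative cache (2 sets of 2 four-word blocks) using LRU replacement. Both the organization and the LRU policy are assumptions inferred from the table rather than stated next to it. Under those assumptions, a minimal simulation looks like this; `simulate_set_associative` is a hypothetical name.

```python
from collections import OrderedDict


def simulate_set_associative(refs, block_words=4, num_sets=2, ways=2):
    """Set-associative cache with LRU replacement (assumed policy).

    Each set is an OrderedDict keyed by block number, ordered from
    least recently used to most recently used.
    """
    sets = [OrderedDict() for _ in range(num_sets)]
    outcomes = []
    for addr in refs:
        block = addr // block_words
        lines = sets[block % num_sets]
        if block in lines:
            lines.move_to_end(block)        # mark as most recently used
            outcomes.append("Hit")
        else:
            if len(lines) == ways:
                lines.popitem(last=False)   # evict the least recently used
            lines[block] = True
            outcomes.append("Miss")
    return outcomes, [sorted(s) for s in sets]


REFS = [2, 3, 11, 16, 21, 13, 64, 48, 19, 11, 3, 22, 4, 27, 6, 11]
outcomes, final_sets = simulate_set_associative(REFS)
print(outcomes.count("Hit"))  # 3
print(final_sets)             # [[2, 6], [1, 5]]
```

The sketch tracks which blocks each set holds, not which of the two positions inside the set a block occupies, so it reproduces the hit/miss column and the final set contents but not the exact column placement.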
7.10 (For a fully associative cache with four-word blocks and a total size of 16 words.) (12 pts)
Since the cache is fully associative, a block can be placed anywhere in the cache. Therefore, no bits of a word address are used to identify a block index, while bits 0 and 1 are used to locate the word within the block.
Based on the calculations, we have the cache content after each reference (and thus hits and misses) as follows. Assume LRU is used for block replacement.
| Referenced Word Address | Block 00 | Block 01 | Block 10 | Block 11 | Hit or Miss |
|---|---|---|---|---|---|
| 2 | [0,1,2,3] | | | | Miss |
| 3 | [0,1,2,3] | | | | Hit |
| 11 | [0,1,2,3] | [8,9,10,11] | | | Miss |
| 16 | [0,1,2,3] | [8,9,10,11] | [16,17,18,19] | | Miss |
| 21 | [0,1,2,3] | [8,9,10,11] | [16,17,18,19] | [20,21,22,23] | Miss |
| 13 | [12,13,14,15] | [8,9,10,11] | [16,17,18,19] | [20,21,22,23] | Miss |
| 64 | [12,13,14,15] | [64,65,66,67] | [16,17,18,19] | [20,21,22,23] | Miss |
| 48 | [12,13,14,15] | [64,65,66,67] | [48,49,50,51] | [20,21,22,23] | Miss |
| 19 | [12,13,14,15] | [64,65,66,67] | [48,49,50,51] | [16,17,18,19] | Miss |
| 11 | [8,9,10,11] | [64,65,66,67] | [48,49,50,51] | [16,17,18,19] | Miss |
| 3 | [8,9,10,11] | [0,1,2,3] | [48,49,50,51] | [16,17,18,19] | Miss |
| 22 | [8,9,10,11] | [0,1,2,3] | [20,21,22,23] | [16,17,18,19] | Miss |
| 4 | [8,9,10,11] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Miss |
| 27 | [24,25,26,27] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Miss |
| 6 | [24,25,26,27] | [0,1,2,3] | [20,21,22,23] | [4,5,6,7] | Hit |
| 11 | [24,25,26,27] | [8,9,10,11] | [20,21,22,23] | [4,5,6,7] | Miss |
The last line in the table shows the final content of the cache after all the references.
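The fully associative case with LRU replacement can be traced with a slot-based sketch that also reproduces which cache block each memory block lands in. The helper name `simulate_fully_associative` is hypothetical, and filling the lowest-numbered empty slot first is an assumption that happens to match the table.

```python
def simulate_fully_associative(refs, block_words=4, num_blocks=4):
    """Fully associative cache with LRU replacement, tracking slot positions."""
    slots = [None] * num_blocks     # memory block number held by each slot
    last_used = [-1] * num_blocks   # time of the most recent use of each slot
    outcomes = []
    for t, addr in enumerate(refs):
        block = addr // block_words
        if block in slots:
            slot = slots.index(block)
            outcomes.append("Hit")
        else:
            if None in slots:
                slot = slots.index(None)                  # first empty slot
            else:
                slot = last_used.index(min(last_used))    # LRU victim
            slots[slot] = block
            outcomes.append("Miss")
        last_used[slot] = t
    return outcomes, slots


REFS = [2, 3, 11, 16, 21, 13, 64, 48, 19, 11, 3, 22, 4, 27, 6, 11]
outcomes, final_slots = simulate_fully_associative(REFS)
print(outcomes.count("Hit"))  # 2
print(final_slots)            # [6, 2, 5, 1]
```

Block numbers 6, 2, 5 and 1 correspond to the word ranges [24..27], [8..11], [20..23] and [4..7].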
7.12 (for a two-way associative cache with one-word blocks and a total size of 16 words) (9 pts)
Since the cache holds 16 words in total and each block is one word, the cache has 16 blocks. Each block holds 1 word × 4 bytes per word = 4 bytes = 32 bits of data.
The cache is two-way associative, therefore each set has two blocks, and there are 16 / 2 = 8 sets. To index the sets, 3 bits are needed (8 = 2^3). To index the bytes within a block, 2 bits are needed (each block has 4 bytes, and 4 = 2^2). The tag field is thus 32 – (3 + 2) = 27 bits.
Each block has a tag and a valid bit in addition to the space for data storage. The total number of bits for the whole cache is therefore 16 × (data bits + tag bits + valid bit) = 16 × (32 + 27 + 1) = 960 bits.
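The bit count generalizes across associativities once the number of sets is known. The sketch below is an illustration under the same accounting used in the text (data, tag and one valid bit per block, 32-bit addresses, 4-byte words, power-of-two sizes); `cache_total_bits` and its parameter names are assumptions.

```python
def cache_total_bits(num_blocks, words_per_block, ways,
                     address_bits=32, bytes_per_word=4):
    """Total storage bits: blocks x (data + tag + valid)."""
    block_bytes = words_per_block * bytes_per_word
    data_bits = block_bytes * 8
    num_sets = num_blocks // ways
    # bit_length() - 1 equals log2 for exact powers of two.
    index_bits = num_sets.bit_length() - 1
    offset_bits = block_bytes.bit_length() - 1
    tag_bits = address_bits - index_bits - offset_bits
    return num_blocks * (data_bits + tag_bits + 1)


print(cache_total_bits(16, 1, ways=2))    # 960, the two-way case above
print(cache_total_bits(256, 16, ways=1))  # 135936, the direct-mapped case
```

Setting `ways=1` gives a direct-mapped cache, so the same function reproduces the 135,936-bit result from the first part of 7.12.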
7.12 (for a fully associative cache with one-word blocks and a total size of 16 words.) (9 pts)
Since the cache holds 16 words in total and each block is one word, the cache has 16 blocks. Each block holds 1 word × 4 bytes per word = 4 bytes = 32 bits of data.
The cache is fully associative, so no bits are needed for a set index. To index the bytes within a block, 2 bits are needed (each block has 4 bytes, and 4 = 2^2). The tag field is thus 32 – (0 + 2) = 30 bits.