Computer Architecture Pipelined Implementation, Lecture Slide - Computer Science, Slides of Computer Architecture and Organization

Carnegie Mellon University (CMU), Computer Architecture and Organization

Implementing Stalling, Pipeline Register Modes, Data Forwarding, Bypass Paths, Forwarding Priority, Implementing Forwarding, Handling Mispredictions

Typology: Slides

2010/2011

Uploaded on 10/08/2011

Randal E. Bryant

Carnegie Mellon University

CS:APP2e

CS:APP Chapter 4

Computer Architecture

Pipelined Implementation

Part I

http://csapp.cs.cmu.edu

Overview

General Principles of Pipelining

 Goal

 Difficulties

Creating a Pipelined Y86 Processor

 Rearranging SEQ

 Inserting pipeline registers

 Problems with data and control hazards

Computational Example

System

 Computation requires total of 300 picoseconds

 Additional 20 picoseconds to save result in register

 Must have clock cycle of at least 320 ps

[Figure: combinational logic (300 ps) feeding a register (20 ps), driven by the clock]

Delay = 320 ps, Throughput = 3.12 GIPS
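The arithmetic behind these figures can be checked with a short sketch. The function name `throughput_gips` and the constants are illustrative names, not part of the slides; the result of 3.125 matches the slide's 3.12 when truncated to two decimals.

```python
def throughput_gips(cycle_ps: float) -> float:
    """Operations per nanosecond (giga-operations per second) for a clock period in ps."""
    return 1000.0 / cycle_ps


LOGIC_PS = 300  # combinational logic delay, from the slide
REG_PS = 20     # time to save the result in the register, from the slide

# The clock period must cover the logic delay plus the register load.
cycle_ps = LOGIC_PS + REG_PS

print(cycle_ps)                   # 320
print(throughput_gips(cycle_ps))  # 3.125
```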

3-Way Pipelined Version

System

 Divide combinational logic into 3 blocks of 100 ps each

 Can begin new operation as soon as previous one passes through stage A.

 Begin new operation every 120 ps

 Overall latency increases

 360 ps from start to finish

[Figure: Comb. logic A, B, and C (100 ps each), each followed by a register (20 ps), all driven by the clock]

Delay = 360 ps, Throughput = 8.33 GIPS
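As a rough check of the numbers above, the sketch below derives cycle time, latency, and throughput from a list of stage delays. The helper `pipeline_timing` is a hypothetical name, and the model assumes every stage is followed by a register with the same load time.

```python
def pipeline_timing(stage_ps, reg_ps):
    """Return (cycle_ps, latency_ps, throughput_gips) for a simple pipeline.

    The clock period must fit the slowest stage plus one register load,
    and an operation needs one full cycle per stage to pass through.
    """
    cycle = max(stage_ps) + reg_ps
    latency = cycle * len(stage_ps)
    return cycle, latency, 1000.0 / cycle


cycle, latency, gips = pipeline_timing([100, 100, 100], reg_ps=20)
print(cycle, latency, round(gips, 2))  # 120 360 8.33
```

Compared with the unpipelined system, throughput rises from about 3.12 to 8.33 GIPS while latency grows from 320 ps to 360 ps because of the two extra register loads.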

Operating a Pipeline

[Figure: timing diagram of successive operations (OP) passing through stages A, B, and C, with time-axis ticks at 0, 120, 240, 360, 480, and 600 ps; accompanying snapshots show the three-stage pipeline (Comb. logic A, B, C at 100 ps each, registers at 20 ps each) at successive points in the clock cycle]
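The overlap of operations in the pipeline can be tabulated with a small sketch. The function `pipeline_schedule` and the labels OP1, OP2, OP3 are illustrative; the 120 ps cycle is taken from the 3-way pipelined example.

```python
def pipeline_schedule(num_ops, stages=("A", "B", "C"), cycle_ps=120):
    """List (cycle start time in ps, {stage: operation}) for each clock cycle.

    Operation k enters the first stage in cycle k and advances one stage
    per cycle, so in cycle c the stage at position s holds operation c - s.
    """
    schedule = []
    total_cycles = num_ops + len(stages) - 1
    for cycle in range(total_cycles):
        row = {}
        for position, stage in enumerate(stages):
            op = cycle - position
            if 0 <= op < num_ops:
                row[stage] = f"OP{op + 1}"
        schedule.append((cycle * cycle_ps, row))
    return schedule


for start_ps, row in pipeline_schedule(3):
    cells = "  ".join(f"{stage}:{row.get(stage, '--')}" for stage in "ABC")
    print(f"{start_ps:>4} ps  {cells}")
```

With three operations, the pipeline is full only in the cycle starting at 240 ps, when all three stages hold a different operation.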

Limitations: Nonuniform Delays

 Throughput limited by slowest stage

 Other stages sit idle for much of the time

 Challenging to partition system into balanced stages

[Figure: Comb. logic A (50 ps), B (150 ps), and C (100 ps), each followed by a register (20 ps); timing diagram of operations (OP) through stages A, B, C]

Delay = 510 ps, Throughput = 5.88 GIPS
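The effect of unbalanced stages can be worked through numerically. This is a minimal sketch using the delays from the figure; the variable names are illustrative.

```python
STAGES_PS = {"A": 50, "B": 150, "C": 100}  # stage delays from the figure
REG_PS = 20                                # register load time

# The slowest stage (B) sets the clock period for every stage.
cycle_ps = max(STAGES_PS.values()) + REG_PS
latency_ps = cycle_ps * len(STAGES_PS)
throughput_gips = 1000.0 / cycle_ps

# Time each stage's logic sits idle within one clock cycle.
idle_ps = {name: cycle_ps - REG_PS - delay for name, delay in STAGES_PS.items()}

print(cycle_ps, latency_ps, round(throughput_gips, 2))  # 170 510 5.88
print(idle_ps)  # {'A': 100, 'B': 0, 'C': 50}
```

Stage A is idle for 100 ps of every 170 ps cycle, which is why balanced partitioning matters.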

Data Dependencies

System

 Each operation depends on result from preceding one

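[Figure: combinational logic and register driven by the clock, with the result available to the next operation; timing diagram of successive operations (OP)]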

Data Hazards

 Result does not feed back around in time for next operation

 Pipelining has changed behavior of system

[Figure: three-stage pipeline (Comb. logic A, B, C, each followed by a register) driven by the clock; timing diagram of operations through stages A, B, C, including OP4]
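A simplified model can show how pipelining changes behavior when each operation depends on the preceding result. The sketch assumes the three stages together compute a function `f` and that a result only becomes visible to the operation that starts three cycles later; both function names are hypothetical.

```python
def run_unpipelined(f, x0, n):
    """Each operation reads the result of the one immediately before it."""
    x = x0
    results = []
    for _ in range(n):
        x = f(x)
        results.append(x)
    return results


def run_pipelined(f, x0, n, depth=3):
    """Operation k starts before operations k-1 .. k-depth+1 have finished,
    so the newest result it can read is that of operation k-depth (or x0)."""
    results = []
    for k in range(n):
        operand = results[k - depth] if k >= depth else x0
        results.append(f(operand))
    return results


def increment(x):
    return x + 1


print(run_unpipelined(increment, 0, 6))  # [1, 2, 3, 4, 5, 6]
print(run_pipelined(increment, 0, 6))    # [1, 1, 1, 2, 2, 2]
```

In the pipelined run the first result reaches the fourth operation rather than the second, so the sequence of values differs from the sequential one.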

SEQ Hardware

 Stages occur in sequence

 One operation in process at a time

SEQ+ Hardware

 Still sequential implementation

 Reorder PC stage to put at beginning

PC Stage

 Task is to select PC for current instruction

 Based on results computed by previous instruction

Processor State

 PC is no longer stored in register

 But, can determine PC based on other stored information
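One way to picture the PC stage is as a selection over values saved from the previous instruction. The sketch below is an assumption based on general Y86 semantics (call and taken jumps go to the constant target, ret goes to the address read from memory, everything else falls through); the class, field names, and string instruction codes are illustrative and do not come from the slides.

```python
from dataclasses import dataclass


@dataclass
class PrevInstr:
    """Values saved from the previous instruction (illustrative field names)."""
    icode: str   # "call", "jXX", "ret", or any other instruction name
    cnd: bool    # whether a conditional jump's condition held
    val_c: int   # constant word from the instruction (jump or call target)
    val_m: int   # value read from memory (return address for ret)
    val_p: int   # address of the sequentially following instruction


def select_pc(prev: PrevInstr) -> int:
    """Choose the current PC from the previous instruction's results."""
    if prev.icode == "call":
        return prev.val_c
    if prev.icode == "jXX" and prev.cnd:
        return prev.val_c
    if prev.icode == "ret":
        return prev.val_m
    return prev.val_p  # default: fall through to the next instruction


taken_jump = PrevInstr(icode="jXX", cnd=True, val_c=0x40, val_m=0, val_p=0x15)
print(hex(select_pc(taken_jump)))  # 0x40
```

No dedicated PC register is needed in this model: the PC is recomputed each cycle from state the previous instruction already left behind.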

Pipeline Stages

Fetch

 Select current PC

 Read instruction

 Compute incremented PC

Decode

 Read program registers

Execute

 Operate ALU

Memory

 Read or write data memory

Write Back

 Update register file

PIPE- Hardware

 Pipeline registers hold intermediate values from instruction execution

Forward (Upward) Paths

 Values passed from one stage to next

 Cannot jump past stages

 e.g., valC passes through decode
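The rule that values cannot jump past stages can be sketched with two pipeline registers. The classes `DReg` and `EReg`, their reduced field sets, and the register-file dictionary are hypothetical simplifications; only the idea that valC is carried through decode comes from the slide.

```python
from dataclasses import dataclass


@dataclass
class DReg:
    """Pipeline register between fetch and decode (simplified)."""
    icode: str
    val_c: int
    val_p: int


@dataclass
class EReg:
    """Pipeline register between decode and execute (simplified)."""
    icode: str
    val_c: int
    val_a: int
    val_b: int


def decode(d: DReg, regfile: dict, src_a: str, src_b: str) -> EReg:
    """Decode reads the program registers and does not use val_c itself,
    but must copy it forward so the execute stage can see it next cycle."""
    return EReg(icode=d.icode, val_c=d.val_c,
                val_a=regfile[src_a], val_b=regfile[src_b])


regs = {"%eax": 7, "%ebx": 3}
e = decode(DReg(icode="irmovl", val_c=100, val_p=6), regs, "%eax", "%ebx")
print(e.val_c)  # 100
```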

Feedback Paths

Predicted PC

 Guess value of next PC

Branch information

 Jump taken/not-taken

 Fall-through or target address

Return point

 Read from memory

Register updates

 To register file write ports

Predicting the PC

 Start fetch of new instruction after current one has completed fetch stage

 Not enough time to reliably determine next instruction

 Guess which instruction will follow

 Recover if prediction was incorrect
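The guess-then-recover idea can be sketched as two small functions. The specific strategy here (guess the target for jumps and calls, guess the next sequential address otherwise, make no guess for ret) is an assumption chosen for illustration, since the slide text does not state which strategy is used; all names are hypothetical.

```python
from typing import Optional


def predict_pc(icode: str, val_c: int, val_p: int) -> Optional[int]:
    """Guess the next PC using only values available in the fetch stage.

    Assumed strategy: jumps and calls are guessed to go to their target
    (val_c); ret has no guess because its target comes from memory;
    everything else continues at the next sequential address (val_p).
    """
    if icode in ("jXX", "call"):
        return val_c
    if icode == "ret":
        return None
    return val_p


def recovery_pc(icode: str, cnd: bool, val_p: int) -> Optional[int]:
    """Return the PC to restart from if the guess was wrong, else None.

    Under the assumed strategy only a conditional jump that turns out
    not to be taken is mispredicted; the correct path is the fall-through.
    """
    if icode == "jXX" and not cnd:
        return val_p
    return None


guess = predict_pc("jXX", val_c=0x40, val_p=0x15)
fix = recovery_pc("jXX", cnd=False, val_p=0x15)
print(hex(guess), hex(fix))  # 0x40 0x15
```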