CS/ECE 552 Lecture Notes: Chapter 6

Pipelining

Forecast
  Big Picture
  Datapath
  Control
  Data Hazards
    Stalls
    Forwarding
  Control Hazards
  Exceptions

Motivation
  Want to minimize:
    Time = insts/prog * CPI * cycle time = P * ? * ?
  Single-cycle implementation:
    CPI = 1
    Cycle = imem + reg_rd + alu + dmem + reg_wr + muxes & control
          = 500 + 250 + 500 + 500 + 250 + 0 + 0 = 2000 ps = 2 ns
    Time/prog = P * 2 ns

Motivation
  Multicycle implementation:
    CPI = 3, 4, 5
    Cycle = max(memory, regs, ALU, muxes & control) = max(500, 250, 500) = 500 ps
    Time/prog = P * 4 * 500 ps = P * 2000 ps = P * 2 ns
  Would like:
    CPI = 1 + hazards
    Cycle = 500 ps + overheads
    In reality, ~3x improvement

Big Picture
  Multicycle implementation:
    [Timing diagram, cycles 1 to 13: instructions i, i+1, i+2, ... pass through
     stages F D X M W one after another; each instruction starts only after
     the previous one has finished.]
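The motivation arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes stage latencies of 500 ps for memory and ALU and 250 ps for register access (the reading consistent with the 2 ns single-cycle total), an average multicycle CPI of 4, and zero mux/control delay. The names `STAGE_PS` and `time_per_instruction_ps` are made up for illustration.

```python
# Assumed stage latencies in picoseconds (muxes and control taken as 0).
STAGE_PS = {"imem": 500, "reg_rd": 250, "alu": 500, "dmem": 500, "reg_wr": 250}


def time_per_instruction_ps(cpi, cycle_ps):
    """Average time per instruction = CPI * cycle time."""
    return cpi * cycle_ps


# Single cycle: one long cycle that covers every stage, CPI = 1.
single_cycle = time_per_instruction_ps(1, sum(STAGE_PS.values()))

# Multicycle: cycle set by the slowest stage, assumed average CPI of 4.
multicycle = time_per_instruction_ps(4, max(STAGE_PS.values()))

# Ideal pipeline: same short cycle, CPI = 1 (no hazards, no overheads).
ideal_pipeline = time_per_instruction_ps(1, max(STAGE_PS.values()))

print(single_cycle, multicycle, ideal_pipeline)
```

Under these assumptions the single-cycle and multicycle designs tie, and the ideal pipeline is 4x faster; hazards and latch overheads account for the gap to the ~3x seen in practice.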
Big Picture
  Multicycle implementation (continued):
    Latency = 5 cycles
    Throughput = 1/5 instructions per cycle
    CPI = 5 cycles per instruction

Big Picture
  Pipelining: process instructions like a lunch buffet!
  ALL microprocessors today employ pipelining for speed
    E.g., Intel Pentium III and Compaq Alpha 21264

    cycle  1  2  3  4  5  6  7  8  9
    i      F  D  X  M  W
    i+1       F  D  X  M  W
    i+2          F  D  X  M  W
    i+3             F  D  X  M  W
    i+4                F  D  X  M  W

Big Picture
  latency = 5 cycles - no change
  throughput = 1 instruction per cycle
  CPI = 1 cycle per instruction
  CPI = cycles between instruction completions = 1!

Big Picture
  But
    datapath?
      note: five instructions in datapath in cycle 5
    control?
      must be generated by multiple instructions
    instructions may have data and control flow dependences
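The pipelined timing picture above can be generated mechanically. The sketch below assumes an ideal five-stage pipeline (F D X M W) with no hazards; `pipeline_rows`, `total_cycles`, and `render` are illustrative names, not part of the notes.

```python
STAGES = ["F", "D", "X", "M", "W"]


def pipeline_rows(n_insts):
    """Row k holds instruction i+k: k idle cycles, then the five stages."""
    return [["."] * k + STAGES for k in range(n_insts)]


def total_cycles(n_insts, depth=len(STAGES)):
    """Cycles to finish n instructions in an ideal pipeline of given depth."""
    return depth + n_insts - 1


def render(n_insts):
    """Return the staircase diagram as text, one instruction per line."""
    width = total_cycles(n_insts)
    lines = []
    for k, row in enumerate(pipeline_rows(n_insts)):
        cells = row + ["."] * (width - len(row))  # pad to full width
        label = "i" if k == 0 else f"i+{k}"
        lines.append(f"{label:<5}" + " ".join(cells))
    return "\n".join(lines)


print(render(5))
# Effective CPI approaches 1 as the instruction count grows.
print(total_cycles(1000) / 1000)
```

Reading column 5 of the rendered diagram top to bottom gives W, M, X, D, F: five different instructions occupy the datapath at once, which is the point the slide raises about the datapath and control.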
Datapath (Fig. 6.)
  [Figure: single-cycle datapath divided into five stages:
   IF: instruction fetch
   ID: instruction decode/register file read
   EX: execute/address calculation
   MEM: memory access
   WB: write back]

Datapath (Fig. 6.)
  [Figure: time (in clock cycles, CC 1 to CC 7) against program execution order
   (in instructions); three consecutive lw instructions overlap, each using
   instruction memory, register read, ALU, data memory, and register write
   in successive cycles.]

Big Picture
  Control
    Set by five different instructions
    Divide and conquer: carry IR down the pipeline

Data Dependence
  MIPS ISA requires the appearance of sequential execution
  One instruction produces a value used by a later instruction
  E.g.,
    add $, -, -
    sub -, $4, -

    cycle  1  2  3  4  5  6
    i      F  D  X  M  W*
    i+1       F  D* X  M  W

  (W* is where the value is written; D* is where the later instruction
   reads it, which is too early without intervention.)
Data Dependence
  Simple solution: stall the pipeline
  E.g.,
    add $, -, -
    sub -, $4, -

    add  F  D  X  M  W*
    sub     F  (stalled until W* has happened)  D*  X  M  W

  But CPI > 1; we will do better using forwarding

Control Dependence
  One instruction affects which instruction will execute next
    E.g., bne, j

    sw  $4, ($5)
    bne $2, $3, loop
    sub -, -, -

    cycle  1  2  3  4  5  6  7
    sw     F  D  X  M  W
    bne       F  D  X* M  W
    sub          F  D  X  M  W

Control Dependence
    sw  $4, ($5)
    bne $2, $3, loop
    sub -, -, -

    cycle  1  2  3  4  5  6  7  8  9
    sw     F  D  X  M  W
    bne       F  D  X* M  W
    ??                 F  D  X  M  W

  (The instruction after bne is not known until X*, so its fetch waits.)
  CPI > 1; we will do better

Pipelined Datapath
  Single-cycle datapath (recall Fig. 6.)
  Pipelined execution
    assume each instruction has its own datapath (Fig. 6.)
    but each instruction uses a different part in every cycle
    multiplex all on one datapath
    latch to separate cycles (as in multicycle) and instructions!
  Ignore data and control flow dependences for now
    data hazards
    control flow hazards
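To make the cost of the stall-only solution to data dependences concrete, the sketch below counts bubble cycles for a consumer that sits `distance` instructions after its producer. It assumes the five-stage F D X M W pipeline, a register write in W, a register read in D, and no forwarding. Whether a read can see a write made in the same cycle is a design choice the notes do not state, so it is exposed as a hypothetical flag.

```python
STAGE_INDEX = {"F": 0, "D": 1, "X": 2, "M": 3, "W": 4}


def stall_cycles(distance, same_cycle_write_then_read=False):
    """Bubbles needed so the consumer's D stage sees the producer's W result.

    distance: how many instructions after the producer the consumer is (>= 1).
    same_cycle_write_then_read: assumption flag; True models a register file
    that writes early in the cycle and reads late in the same cycle.
    """
    if distance < 1:
        raise ValueError("consumer must follow the producer")
    write_cycle = STAGE_INDEX["W"]  # producer fetched in cycle 0
    earliest_read = write_cycle if same_cycle_write_then_read else write_cycle + 1
    unstalled_read = distance + STAGE_INDEX["D"]  # consumer's D with no stall
    return max(0, earliest_read - unstalled_read)


for d in range(1, 5):
    print(d, stall_cycles(d), stall_cycles(d, same_cycle_write_then_read=True))
```

Back-to-back dependent instructions cost two or three bubbles under this model, which is why the notes move on to forwarding.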
Pipelined Datapath (Fig. 6.2)
  [Figure: pipelined datapath; the pipeline registers IF/ID, ID/EX, EX/MEM, and
   MEM/WB separate the PC and instruction memory, register file and sign extend,
   ALU and branch adder (shift left 2), and data memory.]

Pipelined Datapath
  Instruction flow
    add and load
  Write of registers
    pass register specifiers
  Any info needed by a later stage will be passed down
    store value through EX

Pipelined Control (Figure 6.25)
  [Figure: pipelined datapath annotated with control signals: PCSrc, RegWrite,
   ALUSrc, ALUOp, RegDst, Branch, MemRead, MemWrite, MemtoReg.]

Pipelined Control
  IF and ID
    none
  EX
    ALUOp, ALUSrc, RegDst
  MEM
    Branch, MemRead, MemWrite
  WB
    MemtoReg, RegWrite
Figure 6.29
  [Figure: the Control block feeds the ID/EX pipeline register; the EX, MEM, and
   WB control fields are carried on through EX/MEM and MEM/WB.]

Figure 6.3
  [Figure: pipelined datapath with the control signals connected through the
   pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB.]

Pipelined Control
  But controlled by different instructions
  Decode instructions and pass the signals down the pipe
  Control sequencing is embedded in the pipeline

Pipelining
  Not too complex yet
    data hazards
    control hazards
    exceptions
Data Hazards
    sub $2, $1, $3
    and $12, $2, $5
    or  $13, $6, $2
    add $14, $2, $2
    sw  $15, 100($2)

Data Hazards
  Must first detect hazards
    ID/EX.WriteRegister = IF/ID.ReadRegister1
    ID/EX.WriteRegister = IF/ID.ReadRegister2
    EX/MEM.WriteRegister = IF/ID.ReadRegister1
    EX/MEM.WriteRegister = IF/ID.ReadRegister2
    MEM/WB.WriteRegister = IF/ID.ReadRegister1
    MEM/WB.WriteRegister = IF/ID.ReadRegister2

Data Hazards
  Not all are hazards, because in some instructions
    WriteRegister is not used, e.g., sw
    ReadRegister is not used, e.g., addi, jump
  Do something only if necessary

Data Hazards
  Hazard detection unit
    several 5-bit (or 6-bit) comparators
  Response? Stall pipeline
    insts in IF and ID stay
    IF/ID pipeline latch not updated
    send nop down pipeline - called a bubble
    PC write, IF/ID write, and nop mux
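The six comparisons performed by the hazard detection unit can be sketched as a single set intersection. The function name `needs_stall` and the use of `None` for an unused register field are illustrative choices, not something the notes specify. Special handling of register $0 is left out for brevity.

```python
def needs_stall(if_id_reads, later_writes):
    """Return True if the instruction in IF/ID must wait.

    if_id_reads:  (ReadRegister1, ReadRegister2) of the instruction in IF/ID;
                  None marks a field the instruction does not use (e.g. addi).
    later_writes: WriteRegister held in ID/EX, EX/MEM, MEM/WB;
                  None marks an instruction that writes no register (e.g. sw).
    """
    reads = {r for r in if_id_reads if r is not None}
    writes = {w for w in later_writes if w is not None}
    # Any match among the 2 x 3 comparator pairs is a hazard.
    return bool(reads & writes)


# "and" reading $2 and $5 right behind "sub" writing $2: hazard.
print(needs_stall((2, 5), (2, None, None)))
# Same reader behind an "sw", which writes no register: no hazard.
print(needs_stall((2, 5), (None, None, None)))
```

In hardware each set membership test is one of the 5-bit comparators mentioned above; when the result is true, the PC and IF/ID latch are held and a nop is selected into ID/EX.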
Forwarding (Figure 6.38)
  [Figure: (a) no forwarding: the ALU operands come only from the ID/EX register
   values. (b) with forwarding: muxes controlled by ForwardA and ForwardB select
   each ALU operand from the register value, the EX/MEM result, or the MEM/WB
   result; a forwarding unit compares Rs and Rt against EX/MEM.RegisterRd and
   MEM/WB.RegisterRd.]

Data Hazard
  A better response - forwarding
    all of the above made sure reg read happens after reg write
  Instead of stalling
    use mux to select forwarded value rather than reg value
    control mux with hazard detection logic

Data Hazards
  Load followed by a use
    Can't avoid a stall
    Stall one cycle and then forward

Data Hazards
  Other options
    Disallow hazardous sequences
      compiler will never generate them
      assembly programmers will not use them
    If used, result is random
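The forwarding muxes and the load-use stall described above can be sketched as two small functions. The signal names mirror the figure labels (Rs, Rt, EX/MEM.RegisterRd, MEM/WB.RegisterRd). Three things are assumptions based on the usual textbook MIPS treatment rather than on these notes: the RegWrite qualifiers, the rule that register 0 is never forwarded, and the priority of EX/MEM over MEM/WB.

```python
def forward_select(src_reg, ex_mem_rd, ex_mem_reg_write, mem_wb_rd, mem_wb_reg_write):
    """Choose the source of one ALU operand: 'EX/MEM', 'MEM/WB', or 'REG'."""
    # Assumption: register 0 is hardwired to zero, so it is never forwarded.
    if ex_mem_reg_write and ex_mem_rd != 0 and ex_mem_rd == src_reg:
        return "EX/MEM"  # most recent producer takes priority
    if mem_wb_reg_write and mem_wb_rd != 0 and mem_wb_rd == src_reg:
        return "MEM/WB"
    return "REG"  # no hazard: use the value read from the register file


def load_use_stall(id_ex_mem_read, id_ex_rt, if_id_rs, if_id_rt):
    """True when a load in EX feeds the very next instruction: stall one cycle."""
    return id_ex_mem_read and id_ex_rt in (if_id_rs, if_id_rt)


# Operand $2 was produced by both of the two previous instructions.
print(forward_select(2, 2, True, 2, True))
# A load into $2 followed immediately by a use of $2.
print(load_use_stall(True, 2, 2, 5))
```

`forward_select` would be evaluated twice per cycle, once with Rs to drive ForwardA and once with Rt to drive ForwardB.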
Control Flow Hazards
  Control flow instructions
    branches, jumps, jals, returns
    can't fetch until branch outcome known
    too late for next IF

Control Flow Hazards
  What to do?
    Always stall
      easy to implement
      performs poorly
      1/6th of instructions are branches, each branch takes 3 cycles
      what is the CPI?

Control Flow Hazards
  Predict branch not taken
    let sequential instructions go down the pipeline

Control Flow Hazards
  Late flush of instructions on misprediction
    Complex
    must kill later instructions if incorrect
    must stop memory accesses and reg writes
      including loads (why?)
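The "what is the CPI?" question for the always-stall scheme can be worked out directly. The sketch assumes that one instruction in six is a branch, that a branch occupies 3 cycles, and that every other instruction takes 1 cycle. The 60% taken rate used for predict-not-taken is a made-up illustrative figure, not a value from the notes.

```python
from fractions import Fraction


def cpi_with_branches(branch_fraction, branch_cycles, penalized_fraction=Fraction(1)):
    """CPI = 1 + (fraction of branches) * (fraction that pay) * (extra cycles)."""
    penalty = branch_cycles - 1  # cycles lost beyond the branch's own slot
    return 1 + branch_fraction * penalized_fraction * penalty


# Always stall: every branch pays the full penalty.
always_stall = cpi_with_branches(Fraction(1, 6), 3)

# Predict not taken: only taken branches pay (hypothetical 60% taken rate).
predict_not_taken = cpi_with_branches(Fraction(1, 6), 3, Fraction(3, 5))

print(always_stall, float(always_stall))
print(predict_not_taken, float(predict_not_taken))
```

Under these assumptions always stalling gives a CPI of 4/3, about 33% slower than the ideal of 1, which is the motivation for predicting the branch instead.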
Control Flow Hazards
  Even better but more complex
    predict taken
    predict both
    dynamically adapt to program branch patterns
      significant fraction of chip real estate
      Pentium III
      Alpha 21264
    current topic of research

Control Flow Hazards
  Another option: delayed branches
    always execute following instruction
    delay slot
    put useful instruction there, nop otherwise
    losing popularity

Exceptions
  add $1, $2, $3 overflows!
  a surprise branch
    earlier instructions flow to completion
    kill later instructions
    save PC in EPC, set PC to exception handler, Cause, etc.
    cost a lot of designer sanity

Exceptions
  Even worse: multiple exceptions in one cycle
    I/O interrupt
    user trap to OS
    illegal instruction
    arithmetic overflow
    hardware error
    etc.
State of the Art: Superscalar
  [Timing diagram, cycles 1 to 13: instructions i through i+7 each pass through
   F D X M W, with more than one instruction entering the pipeline per cycle.]

State of the Art: Superscalar
  IF: parallel access to I-cache, require alignment?
  ID: replicate logic, fixed-length instrs? hazard checks? dynamic?
  EX: parallel/pipelined
  MEM: > 1 per cycle? If so, hazards, multi-ported D-cache?
  WB: different register files? multi-ported register files?
  more things replicated
  more possibilities for hazards
  more loss due to hazards (why?)

State of the Art: Out of Order
  execute later instructions while previous is waiting
  decouple into different units
    one to fetch/decode, several to execute, one to write back
  fetch in program order
  execute out of order
    speculatively!
  commit in order

Out of Order in the Limit
  [Figure: execution wavefront.
   Program form: static program, dynamic instruction stream, execution window,
   completed instructions.
   Processing phase: instruction fetch & branch prediction, dependence checking
   & dispatch, instruction issue, instruction execution, instruction reorder
   & commit.]
A Generic Out of Order Processor
  [Figure: block diagram with predecode, instr. cache, instr. buffer, rename &
   dispatch, floating pt. register file, integer register file, floating pt.
   instruction buffers, integer/address instruction buffers, functional units,
   functional units and data cache, re-order buffer, and memory interface.]

Review
  Big picture
  Datapath
  Control
    data hazards
      stalls
      forwarding
    control flow hazards
      branch prediction
  Exceptions