General concepts

  • A microprocessor runs best when it can synchronize the flow of instructions into the execution units with the required data without stalls.
  • The processor core pipeline controls the flow of instructions and data into the functional units.
  • At the start of the flow, the pipeline front end reads instructions from the instruction cache, aligns them for consumption, and places them in the instruction buffer (IB). The back end takes them from the IB, expands the bundle templates and disperses the instructions to the functional units (the template indicates which type of functional unit each instruction uses).
  • Any required register renaming happens, and then the data is delivered to the functional units from the registers
  • Now the instructions can be executed, because the functional unit has both the data and instructions locked and loaded.
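
The dispersal step described above can be pictured with a small sketch. It assumes the usual IA-64 slot letters (M, I, F, B) and ignores the long-immediate slots; the function name and the instruction strings are hypothetical, and real dispersal also has to account for which units are free, which is not modeled here.

```python
# Simplified sketch: map each letter of a bundle template to a unit type.
# UNIT_FOR_SLOT and disperse() are illustrative names, not a real API.
UNIT_FOR_SLOT = {
    "M": "memory",
    "I": "integer",
    "F": "floating-point",
    "B": "branch",
}

def disperse(template, instructions):
    """Pair each instruction in a bundle with the unit type its slot names."""
    if len(template) != len(instructions):
        raise ValueError("template and bundle must have the same number of slots")
    return [(UNIT_FOR_SLOT[slot], instr)
            for slot, instr in zip(template, instructions)]

# Example bundle using the MFI template (instruction text is made up).
routed = disperse("MFI", ["ld8 r4=[r5]", "fma f6=f7,f8,f9", "add r1=r2,r3"])
for unit, instr in routed:
    print(f"{unit:15s} <- {instr}")
```

The point is only that the template, not the instruction encoding itself, tells the back end which kind of unit each slot goes to.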

About the pipeline

(note, I made up the 'Name' for each)

| Stage | Mnemonic | Name | Does |
| 1 | IPG | Instruction Pointer Generation | Generates instruction pointers; starts L1 instruction cache and L1 ITLB accesses |
| 2 | ROT | ? | Formats the instruction stream; fills the instruction buffer |
| - | IB | Instruction Buffer | Buffers instructions between the front end and the back end |
| 3 | EXP | Expand | Expands instruction templates. The dispersal of instructions to functional units is organized and issued (but not completed?) |
| 4 | REN | Register Rename | Handles the register stack and register rotations. Instructions are also decoded. |
| 5 | REG | Register Delivery | Pumps functional units (FUs) with data from registers, or from other FUs for chained instructions via bypasses from EXE. Also generates the spill/fill instructions needed by the Register Stack Engine |
| 6 | EXE | Execute | Dispatches instructions and data to the functional units per the instruction template. Also uses bypasses to send single-cycle ALU data back to the REG stage |
| 7 | DET | Detect | Detects exceptions and branch mispredictions, and from these generates pipeline flushes, which cause the highest-priority pipeline stalls |
| 8 | WRB | Write Back | Writes output to the appropriate registers |

  • 8 stages: a 2-stage front end and a 6-stage back end
  • The IB (see above) connects stage 2 to stage 3
  • The front and back ends operate asynchronously
  • The back end interacts with micro pipelines to deal with FPUs, L1D and L2 caches.
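
To see why a buffer between the two halves lets them run asynchronously, here is a toy cycle-by-cycle model. It is a sketch under stated assumptions: the buffer capacity of 8, the per-cycle fetch and issue counts, and the function name are all made up for illustration and are not taken from the hardware.

```python
from collections import deque

def simulate(fetch_per_cycle, issue_per_cycle, ib_capacity=8):
    """Count cycles in which the back end gets fewer instructions than it wants.

    fetch_per_cycle: instructions the front end delivers each cycle
    issue_per_cycle: instructions the back end wants each cycle
    ib_capacity:     hypothetical buffer size, chosen for illustration
    """
    ib = deque()
    starved_cycles = 0
    for fetched, wanted in zip(fetch_per_cycle, issue_per_cycle):
        # The back end drains first, so it sees only what was buffered earlier.
        available = min(wanted, len(ib))
        if available < wanted:
            starved_cycles += 1
        for _ in range(available):
            ib.popleft()
        # The front end then refills, limited by the free space in the buffer.
        for _ in range(min(fetched, ib_capacity - len(ib))):
            ib.append("instr")
    return starved_cycles

# A two-cycle front-end stall (fetch = 0) is hidden when the buffer is full enough...
hidden = simulate([6, 6, 0, 0, 6], [0, 2, 2, 2, 2])
# ...but reaches the back end when the buffer runs dry.
exposed = simulate([2, 0, 0, 0, 2], [0, 2, 2, 2, 2])
print(hidden, exposed)
```

In the first run the back end never notices the front-end stall; in the second it is starved for several cycles, which is the only case where a front-end stall costs anything.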

Pipeline Stalls/Flushes

  • The main goal of optimization is to minimize stalls, which can occur in 3 of the back end stages: DET, EXE and REN
  • Stalls in the front end are only interesting if they succeed in causing stalls in the back end
  • Stalls have priorities that increase toward the end of the pipeline. If an exception or branch mispredict is detected at DET, the highest-priority stall occurs, leading to a pipeline flush. Once the pipeline is flushed, the other stalls no longer matter, because all work in the pipeline has to start over anyway.
  • Stalls also happen at DET when the back end and FPUs/L1D & L2 Caches/Multimedia unit interact, but these are lower in priority than branch mispredict/exceptions
  • Stalls in EXE happen when data is not in its required register in time for the functional unit. The pipeline stalls until the data arrives.
  • REN stalls when it runs out of registers.
  • There is also a stall when the front end can't keep the back end fed with enough instructions.
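
The priority rule above (the later the stage, the higher the priority) can be sketched as a simple lookup. The stage order comes from the table earlier on this page; the function name and the reason strings are hypothetical, and the finer ordering inside DET (flushes ahead of micro-pipeline stalls) is not modeled.

```python
# Back-end stages in pipeline order; a later stage means a higher stall priority.
BACK_END_ORDER = ["EXP", "REN", "REG", "EXE", "DET", "WRB"]

def dominant_stall(active):
    """Return the (stage, reason) pair that wins this cycle, or None.

    active maps a stage mnemonic to a short reason string (illustrative labels).
    """
    if not active:
        return None
    # Pick the stage that sits furthest down the pipeline.
    stage = max(active, key=BACK_END_ORDER.index)
    return stage, active[stage]

stalls = {
    "REN": "out of registers",
    "EXE": "operand not ready",
    "DET": "branch mispredict (flush)",
}
print(dominant_stall(stalls))
```

When optimizing, this suggests looking at the stalls closest to the end of the pipeline first, since they mask whatever is happening in the earlier stages.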

See also...

-- MattWalsh - 13 Jul 2004

Topic revision: r1 - 14 Jul 2004 - MattWalsh
 