Introduction and Overview
The BOOM Pipeline
The RISC-V ISA
The Chisel HCL
Quick-start
The BOOM Repository
The Rocket-Chip Repository
Instruction Fetch
The Rocket I-Cache
Fetching Compressed Instructions
The Fetch Buffer
The Fetch Target Queue
Branch Prediction
The Next-line Predictor (NLP)
The Backing Predictor (BPD)
The Decode Stage
RVC Changes
The Rename Stage
The Purpose of Renaming
The Explicit Renaming Design
The Rename Map Table
The Busy Table
The Free List
Stale Destination Specifiers
The Reorder Buffer (ROB) and the Dispatch Stage
The ROB Organization
ROB State
The Commit Stage
Exceptions and Flushes
Point of No Return (PNR)
The Issue Unit
Speculative Issue
Issue Slot
Issue Select Logic
Un-ordered Issue Queue
Age-ordered Issue Queue
Wake-up
The Register Files and Bypass Network
Register Read
Bypass Network
The Execute Pipeline
Execution Units
Functional Units
Branch Unit & Branch Speculation
Load/Store Unit
Floating Point Units
Floating Point Divide and Square-root Unit
Parameterization
Control/Status Register Instructions
The Rocket Custom Co-Processor Interface (RoCC)
The Load/Store Unit (LSU)
Store Instructions
Load Instructions
The BOOM Memory Model
Memory Ordering Failures
The Memory System and the Data-cache Shim
Micro-architectural Event Tracking
Setup HPM events to track
Reading HPM counters in software
Adding your own HPE
External Resources
Verification
RISC-V Tests
RISC-V Torture Tester
Continuous Integration (CI)
Debugging
FireSim Debugging
Pipeline Visualization
Physical Realization
Register Retiming
Pipelining Configuration Options
Future Work
The BOOM Custom Co-processor Interface (BOCC)
Parameterization
BOOM Parameters
Other Parameters
Frequently Asked Questions
The BOOM Ecosystem
Scala, Chisel, Generators, Configs, Oh My!
Terminology
Bibliography
Indices and tables
Bibliography
RISCV-BOOM Documentation

Chris Celio, Jerry Zhao, Abraham Gonzalez, Ben Korpan

May 15, 2019
CHAPTER 1

Introduction and Overview

The goal of this document is to describe the design and implementation of the Berkeley Out–of–Order Machine (BOOM).

BOOM is heavily inspired by the MIPS R10k and the Alpha 21264 out–of–order processors. Like the R10k and the 21264, BOOM is a unified physical register file design (also known as “explicit register renaming”).

The source code to BOOM can be found at https://github.com/riscv-boom/riscv-boom.

1.1 The BOOM Pipeline

Fig. 1.1: Default BOOM Pipeline with Stages

1.1.1 Overview

Conceptually, BOOM is broken up into 10 stages: Fetch, Decode, Register Rename, Dispatch, Issue, Register Read, Execute, Memory, Writeback and Commit. However, many of those stages are combined in the current implementation, yielding seven stages: Fetch, Decode/Rename, Rename/Dispatch, Issue/RegisterRead, Execute, Memory and Writeback (Commit occurs asynchronously, so it is not counted as part of the “pipeline”).

1.1.2 Stages

Fetch

Instructions are fetched from the Instruction Memory and pushed into a FIFO queue, known as the Fetch Buffer. Branch prediction also occurs in this stage, redirecting the fetched instructions as necessary.[1]

[1] While the Fetch Buffer is N-entries deep, it can instantly read out the first instruction on the front of the FIFO. Put another way, instructions don’t need to spend N cycles moving their way through the Fetch Buffer if there are no instructions in front of them.
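The Fetch Buffer footnote can be illustrated with a small behavioral model. The sketch below is written in Python purely for illustration and is not BOOM's Chisel implementation; the class names `FetchBuffer` and `ShiftPipe` are hypothetical. It contrasts a bounded FIFO, whose head entry is readable as soon as it has been written, with a rigid N-stage shift register, where every entry pays the full N-cycle latency.

```python
from collections import deque


class FetchBuffer:
    """Bounded FIFO (hypothetical model): the oldest entry sits at the head."""

    def __init__(self, depth):
        self.depth = depth
        self.entries = deque()

    def enqueue(self, inst):
        # A full buffer refuses the write; Fetch would have to stall.
        if len(self.entries) >= self.depth:
            return False
        self.entries.append(inst)
        return True

    def dequeue(self):
        # With nothing in front of it, a new entry is already at the head.
        return self.entries.popleft() if self.entries else None


class ShiftPipe:
    """Contrast case: a rigid shift register with a fixed latency of `depth`."""

    def __init__(self, depth):
        self.stages = [None] * depth

    def step(self, inst):
        # Each call models one cycle: emit the last stage, shift, insert.
        out = self.stages[-1]
        self.stages = [inst] + self.stages[:-1]
        return out
```

In this model, an instruction enqueued into an empty `FetchBuffer(depth=4)` is returned by the very next `dequeue()`, whereas the same instruction pushed into `ShiftPipe(depth=4)` only emerges after four further `step()` calls.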
Decode

Decode pulls instructions out of the Fetch Buffer and generates the appropriate Micro-Op(s) to place into the pipeline.[2]

Rename

The ISA, or “logical”, register specifiers (e.g. x0-x31) are then renamed into “physical” register specifiers.

Dispatch

The Micro-Op is then dispatched, or written, into a set of Issue Queues.

Issue

Micro-Ops sitting in an Issue Queue wait until all of their operands are ready and are then issued.[3] This is the beginning of the out–of–order piece of the pipeline.

Register Read

Issued Micro-Ops first read their register operands from the unified physical register file (or from the bypass network)...

Execute

... and then enter the Execute stage where the functional units reside. Issued memory operations perform their address calculations in the Execute stage, and then store the calculated addresses in the Load/Store Unit, which resides in the Memory stage.

Memory

The Load/Store Unit consists of three queues: a Load Address Queue (LAQ), a Store Address Queue (SAQ), and a Store Data Queue (SDQ). Loads are fired to memory when their address is present in the LAQ. Stores are fired to memory at Commit time (and naturally, stores cannot be committed until both their address and data have been placed in the SAQ and SDQ).

Writeback

ALU operations and load operations are written back to the physical register file.

Commit

The Reorder Buffer (ROB) tracks the status of each instruction in the pipeline. When the head of the ROB is not-busy, the ROB commits the instruction. For stores, the ROB signals to the store at the head of the Store Queue (SAQ/SDQ) that it can now write its data to memory.

[2] Because RISC-V is a RISC ISA, currently all instructions generate only a single Micro-Op. More details on how store Micro-Ops are handled can be found in The Memory System and the Data-cache Shim.
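The Commit rule above (only the head of the ROB may commit, and only once it is not busy) together with the store rule from the Memory stage (a store needs both its address and its data before it can commit) can be sketched as a behavioral model. This is an illustrative Python sketch under stated assumptions, not BOOM's Chisel code: the names `RobEntry`, `Rob`, and `memory_writes` are hypothetical, and the model commits at most one instruction per call, which is a simplification chosen for the example.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class RobEntry:
    name: str
    is_store: bool = False
    done: bool = False          # set when a non-store has written back
    addr: Optional[int] = None  # stands in for the store's SAQ entry
    data: Optional[int] = None  # stands in for the store's SDQ entry

    @property
    def busy(self) -> bool:
        # A store stays busy until both address and data are present.
        if self.is_store:
            return self.addr is None or self.data is None
        return not self.done


class Rob:
    def __init__(self):
        self.entries = deque()   # program order, oldest at the left
        self.memory_writes = []  # stores released to memory at commit

    def dispatch(self, name, is_store=False):
        entry = RobEntry(name, is_store)
        self.entries.append(entry)
        return entry

    def commit(self):
        """Commit the head entry if it is not busy; otherwise do nothing."""
        if not self.entries or self.entries[0].busy:
            return None
        head = self.entries.popleft()
        if head.is_store:
            # Only now may the store write its data to memory.
            self.memory_writes.append((head.addr, head.data))
        return head.name
```

In this model a younger instruction may finish first, but `commit()` keeps returning `None` until the oldest entry is no longer busy, so instructions still retire in program order and a store reaches `memory_writes` only at commit.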
[3] More precisely, Micro-Ops that are ready assert their request, and the issue scheduler chooses which Micro-Ops to issue that cycle.
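The request-and-select behavior described in this footnote can be sketched as follows. This is a hedged Python model rather than BOOM's actual issue logic: `IssueSlot`, `IssueQueue`, `wakeup`, and `select` are hypothetical names, and the selection policy used here (grant the first requesting slots in queue order, up to the issue width) is an assumption made only for the example.

```python
class IssueSlot:
    def __init__(self, name, sources):
        self.name = name
        # Physical source registers whose values are not yet available.
        self.waiting_on = set(sources)

    @property
    def request(self):
        # A Micro-Op asserts its request once every operand is ready.
        return not self.waiting_on


class IssueQueue:
    def __init__(self, issue_width):
        self.issue_width = issue_width
        self.slots = []

    def dispatch(self, name, sources):
        self.slots.append(IssueSlot(name, sources))

    def wakeup(self, preg):
        # A producer has written `preg`; mark it ready in every slot.
        for slot in self.slots:
            slot.waiting_on.discard(preg)

    def select(self):
        """Model one cycle: grant up to issue_width requesting Micro-Ops."""
        granted = [s for s in self.slots if s.request][: self.issue_width]
        self.slots = [s for s in self.slots if s not in granted]
        return [s.name for s in granted]
```

With an issue width of 1, a Micro-Op that is still waiting on an operand is skipped in favor of a younger one that is already requesting, which is how instructions begin to leave program order at this stage.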