hls_bluebook.pdf

Published: 2022-05-30; Uploaded by: admin; Category: Manuals; File size: 14.03M; Format: pdf; Pages: 256


Chapter 1: Making the Case for High-Level Synthesis

A broken design flow

Keeping up with the pace

Benefits of High-Level Synthesis

Reducing design and verification efforts

More effective reuse

Investing R&D resources where it really matters

Seizing the opportunity

Chapter 2: General C++ Style

2.1. File Organization

2.2. Building an Executable Using Makefiles

2.2.1. Makefile Naming

2.2.2. Comments

2.2.3. Macros

2.2.4. Targets

2.2.5. Phony Targets

2.2.6. Simple Makefile Example

2.3. Header/Include Files

2.4. Test Benches

2.5. Creating a Golden Reference Design

2.5.1. Make Sure You're Fully Testing the DUT

2.5.2. Uninitialized Variables

Chapter 3: Bit Accurate Data Types

3.1. Compilation, Debug, and Simulation Speed

3.1.1. Header Files and Typedefs

3.2. Integer Data Types

3.2.1. Unsigned integer

3.2.2. Signed Integer

3.3. Fixed Point Data Types

3.3.1. Unsigned Fixed Point

3.3.2. Signed Fixed Point

3.3.3. Quantization and Overflow

3.3.4. Truncation and Rounding

3.3.5. Saturation and Overflow

3.4. Operators

3.4.1. Bitwise Arithmetic Operators: *, +, -, /, &, |, ^, %

3.4.2. Bit Select Operator: []

3.4.3. Shift Operators: <<, >>

3.4.4. Shift Right Operator: >>

Unsigned Shift Right

Signed Shift Right

Shift Left Operator: <<

Unsigned Shift Left

Signed Shift Left

Unexpected Loss of Precision

3.5. Methods

3.5.1. Slice Read: slc

3.5.2. Slice Write: set_slc

3.5.3. Explicit Conversion Functions

3.5.4. Implicit Conversion Functions

3.6. Helper/Utility Functions

3.6.1. Array Uninitialization: ac::init_array

3.6.2. ceil, floor, and nbits

3.7. Complex Data Types

Chapter 4: Fundamentals of High Level Synthesis

4.1. The Top-level Design Module

4.1.1. Registered Outputs

4.1.2. Control Ports

4.1.3. Port Width

4.1.4. Port Direction

Input ports

Output ports

Inout Ports

4.2. High-level C++ Synthesis

4.2.1. Data Flow Graph Analysis

4.2.2. Resource Allocation

4.2.3. Scheduling

4.2.4. Classic RISC Pipelining

4.2.5. Loop Pipelining

4.3. for/while/do Loops

4.3.1. What's in a Loop?

"for" Loop

"while" Loop

"do" Loop

4.3.2. Rolled Loops

4.3.3. Loop Unrolling

Partial Loop Unrolling

Fully Unrolled Loops

Dependencies Between Loop Iterations

Loops with Constant Bounds

4.3.4. Loops with Conditional Bounds

4.3.5. Optimizing the Loop Counter

4.3.6. Optimizing the Loop Control

4.3.7. Nested Loops

Unconstrained Nested Loops

Pipelined Nested Loops

Unrolling Nested Loops

4.3.8. Sequential Loops

Simple Independent Sequential Loops

Effects of Unmerged Sequential Loops

Manual merging of sequential loops

4.4. Pipeline Feedback

4.4.1. Data Feedback

4.4.2. Control Feedback

4.5. Conditions

4.5.1. Sharing

if-else statement

switch statement

Keep it Simple

4.5.2. Functions and Multiple Conditional Returns

Replacing Conditional Returns with Flags

4.5.3. References

Chapter 5: Scheduling of IO and Memories

5.1. Unconditional IO

5.1.1. Pass by Reference

5.1.2. Pass by Value

5.2. Conditional IO

5.2.1. Pass by Reference

5.2.2. Pass by Value

5.2.3. Ready/acknowledge Behavior (wait)

5.2.4. Stalling the Pipeline

5.2.5. Manually Flushing the Pipeline

5.2.6. Writing IO for Throughput

Making IO Mergable

5.3. Memories

5.3.1. Automatic Mapping of Arrays to Memories

5.3.2. Automatic Memory Merging

5.3.3. Designing for Throughput When Using Memories

Non-Mutually Exclusive Memory Accesses

Making Memory Accesses Mutually Exclusive

Manually Merging Non-Mutually Exclusive Memory Accesses

Chapter 6: Sequential and Combinational Hardware

6.1. Shift Registers

6.1.1. Basic Shift Register

6.1.2. Shift Register with Enable

6.1.3. Shift Register with Synchronous Clear

6.1.4. Shift Register with Load

6.1.5. Shift Register Template Function

6.1.6. Class Based Shift Register

6.1.7. Helper Classes for Design Reuse

Log2Ceil

NextPow2

6.2. Multiplexors

6.2.1. Binary MUX

6.2.2. Automatic Binary to Onehot MUX Optimizations

6.2.3. Manual Optimization of Binary Selection MUXes

6.2.4. One Hot MUX

6.2.5. Priority Search Hardware

6.2.6. Finding Leading 1s in a Bit-vector

6.2.7. Improved Performance and Area Using the Brute Force Approach

6.2.8. Log2(N) Based Search

6.2.9. Recursive Template Search

6.2.10. Finding the Maximum Value in an Array

6.2.11. Algorithmic Coding Style

6.2.12. Recursive Template Search

6.3. Absolute Value (abs)

6.4. Linear Feedback Shift Register (LFSR)

6.5. Accumulator

6.6. Shifters

6.6.1. Barrel shifter

Logical

Arithmetic

Bi-directional

Rotating

6.6.2. Constant Shifts

Transforming Barrel Shifters into Constant Shifts

6.6.3. Transforming Dynamic Bit Masking

6.7. Adder Trees

6.7.1. Preventing Automatic Tree Balancing

6.7.2. Coding to Facilitate Automatic Tree Balancing

6.8. Lookup Tables (LUT)

Chapter 7: Memory Architectures

7.1. Memory-based Shift Register

7.1.1. Classic Shift Register Description mapped to Memories

7.1.2. Circular Buffer

7.1.3. Initialization loops

7.2. Memory Organization

7.2.1. Interleaving Memories

7.2.2. Automatic Interleaving

7.2.3. Manual Interleaving with Random Access

7.2.4. Manual Interleaving with Sequential Access

7.3. Widening the Word Width of Memories

7.3.1. Manually Increasing Word Width with Sequential Access

7.4. Caching

7.4.1. "Windowing" of 1-D Data Streams

7.4.2. Pure Algorithmic Description with Poor Memory Architecture

7.4.3. Analyzing Array Access Patterns

7.4.4. Shift Register Sliding Window Implementation

7.4.5. Boundary Conditions

7.4.6. 2-D Windowing

7.4.7. Pure Algorithmic Description with Poor Memory Architecture

7.4.8. Analyzing Array Access Patterns

7.4.9. Circular Line Buffer Sliding Window Implementation

Chapter 8: Hierarchical Design

8.1. Arrays Shared Between Blocks

8.1.1. Out-of-order Array Access

8.1.2. Algorithmic C Channel Class

8.1.3. Using Explicit Channels

8.1.4. Using Channels at the Top-level Interface and Testbench

8.1.5. Arrays Inside of Channels

8.1.6. Arrays Mapped to Registers

8.1.7. Arrays Mapped to Memories

8.2. Blocks with Common Interface Control Variables

8.2.1. Passing Control Variables Between Blocks

8.2.2. Connecting Interface Control Variables to Multiple Blocks

8.2.8. Duplicating Control IO

8.3. Reconvergence: Balancing the Latency Between Blocks

8.3.1. Deadlock

8.3.2. Automatic Pipeline Flushing

8.3.3. Manually Setting FIFO Depths

Chapter 9: Advanced Hierarchical Design

9.1. ac_channel Methods

9.1.1. Channel size: int size()

9.1.2. Non-blocking Read: bool nb_read(T &val)

9.2. Recommended Coding Style

9.3. Feedback

9.3.1. C++ Assertion

9.3.2. Preloading the Channels/FIFOs

9.3.3. Deadlock

9.3.4. Variable Rate or Data Dependent Feedback

Chapter 10: Digital Filters

10.1. FIR Filters

10.2. Register Based Filters

10.2.1. External Coefficients

10.2.2. Constant Coefficients

10.2.3. Loadable Coefficients

10.2.4. Symmetric Coefficients

10.2.5. Even Symmetric

10.2.6. Odd Symmetric

10.2.7. Transposed

10.2.8. Systolic

10.3. Multi-rate Filtering

10.4. Using Decimation in Filters

10.4.1. Algorithmic Decimation

10.4.2. Manual Decimation

10.5. Using Interpolation in Filters

10.5.1. Algorithmic Interpolation

10.5.2. Manual Interpolation

10.6. Multi-stage Decimation

10.6.1. Multi-block

10.6.2. Single-block

Chapter 11: FFT Transform

11.1. Radix-2 FFT

11.2. Floating Point Radix-2 In-place FFT

11.3. Some Final Thoughts

11.3.1. References

HLS Bluebook Software Version v10.4a September 2019

This document contains information that is proprietary to Mentor Graphics Corporation. The original recipient of this document may duplicate this document in whole or in part for internal business purposes only, provided that this entire notice appears in all copies. In duplicating any part of this document, the recipient agrees to make every reasonable effort to prevent the unauthorized use and distribution of the proprietary information.

This document is for information and instruction purposes. Mentor Graphics reserves the right to make changes in specifications and other information contained in this publication without prior notice, and the reader should, in all cases, consult Mentor Graphics to determine whether any changes have been made. The terms and conditions governing the sale and licensing of Mentor Graphics products are set forth in written agreements between Mentor Graphics and its customers. No representation or other affirmation of fact contained in this publication shall be deemed to be a warranty or give rise to any liability of Mentor Graphics whatsoever.

MENTOR GRAPHICS MAKES NO WARRANTY OF ANY KIND WITH REGARD TO THIS MATERIAL INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. MENTOR GRAPHICS SHALL NOT BE LIABLE FOR ANY INCIDENTAL, INDIRECT, SPECIAL, OR CONSEQUENTIAL DAMAGES WHATSOEVER (INCLUDING BUT NOT LIMITED TO LOST PROFITS) ARISING OUT OF OR RELATED TO THIS PUBLICATION OR THE INFORMATION CONTAINED IN IT, EVEN IF MENTOR GRAPHICS HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

U.S. GOVERNMENT LICENSE RIGHTS: The software and documentation were developed entirely at private expense and are commercial computer software and commercial computer software documentation within the meaning of the applicable acquisition regulations. Accordingly, pursuant to FAR 48 CFR 12.212 and DFARS 48 CFR 227.7202, use, duplication and disclosure by or for the U.S. Government or a U.S. Government subcontractor is subject solely to the terms and conditions set forth in the license agreement provided with the software, except for provisions which are contrary to applicable mandatory federal laws.

TRADEMARKS: The trademarks, logos and service marks ("Marks") used herein are the property of Mentor Graphics Corporation or other parties. No one is permitted to use these Marks without the prior written consent of Mentor Graphics or the owner of the Mark, as applicable. The use herein of a third-party Mark is not an attempt to indicate Mentor Graphics as a source of a product, but is intended to indicate a product from, or associated with, a particular third party. A current list of Mentor Graphics' trademarks may be viewed at: www.mentor.com/trademarks. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis.

End-User License Agreement: You can print a copy of the End-User License Agreement from: www.mentor.com/eula.

Mentor Graphics Corporation
8005 S.W. Boeckman Road, Wilsonville, Oregon 97070-7777
Telephone: 503.685.7000
Toll-Free Telephone: 800.592.2210
Website: www.mentor.com
SupportNet: supportnet.mentor.com/

