Vivado Design Suite User Guide: High-Level Synthesis
Revision History
Table of Contents
Ch. 1: High-Level Synthesis
Introduction to C-Based FPGA Design
High-Level Synthesis Benefits
High-Level Synthesis Basics
Scheduling and Binding Example
Extracting Control Logic and Implementing I/O Ports Example
Performance Metrics Example
Understanding Vivado HLS
Inputs and Outputs
Test Bench, Language Support, and C Libraries
Test Bench
Language Support
C, C++, and SystemC Language Constructs
C Libraries
C Library Example
Synthesis, Optimization, and Analysis
Optimization
Analysis
OpenCL API C Kernel Synthesis
RTL Verification
RTL Export
Using Vivado HLS
Creating a New Synthesis Project
Simulating the C Code
Reviewing the Output of C Simulation
Synthesizing the C Code
Creating an Initial Solution
Reviewing the Output of C Synthesis
Analyzing the Results of C Synthesis
Synthesis Reports
Analysis Perspective
Creating a New Solution
Applying Optimization Directives
Using Tcl Commands or Embedded Pragmas
Applying Optimization Directives to Global Variables
Applying Optimization Directives to Class Objects
Applying Optimization Directives to Templates
Using #Define with Pragma Directives
Failure to Satisfy Optimization Directives
Verifying the RTL is Correct
Reviewing the Output of C/RTL Co-Simulation
Packaging the IP
Reviewing the Output of IP Packaging
Example Vivado RTL Project
Example IP Integrator Project
Archiving the Project
Using the Command Prompt and Tcl Interface
Understanding the Windows Command Prompt
Improving Run Time and Capacity
Design Examples and References
Tutorials
Design Examples
Coding Examples
Data Types for Efficient Hardware
Advantages of Hardware Efficient Data Types
Overview of Arbitrary Precision Integer Data Types
Arbitrary Precision Integer Types with C
Arbitrary Precision Types with C++
Arbitrary Precision Types with SystemC
Overview of Arbitrary Precision Fixed-Point Data Types
Half-Precision Floating-Point Data Types
Managing Interfaces
Interface Synthesis
Interface Synthesis Overview
Clock and Reset Ports
Block-Level Interface Protocol
Port-Level Interface Protocol
Interface Synthesis and OpenCL API C
Interface Synthesis I/O Protocols
Block-Level Interface Protocols
Port-Level Interface Protocols: AXI4 Interfaces
Port-Level Interface Protocols: No I/O Protocol
Port-Level Interface Protocols: Wire Handshakes
Port-Level Interface Protocols: Memory Interfaces
Interface Synthesis and Structs
Interface Synthesis and Multi-Access Pointers
Specifying Interfaces
Interface Synthesis for SystemC
Applying Interface Directives with SystemC
Block RAM Memory Ports
SystemC AXI4-Stream Interface
SystemC AXI4-Lite Interface
SystemC AXI4 Master Interface
Specifying Manual Interface
Using AXI4 Interfaces
AXI4-Stream Interfaces
AXI4-Stream Interfaces without Side-Channels
AXI4-Stream Interfaces with Side-Channels
Packing Structs into AXI4-Stream Interfaces
AXI4-Lite Interface
Control Clock and Reset in AXI4-Lite Interfaces
C Driver Files
C Driver Files and Float Types
Controlling Hardware
Controlling Software
Customizing AXI4-Lite Slave Interfaces in IP Integrator
AXI4 Master Interface
Controlling AXI4 Burst Behavior
Creating an AXI4 Interface with 64-bit Address Capability
Controlling the Address Offset in an AXI4 Interface
Customizing AXI4 Master Interfaces in IP Integrator
Managing Interfaces with SSI Technology Devices
Optimizing the Design
Clock, Reset, and RTL Output
Specifying the Clock Frequency
Specifying the Reset
Initialization Behavior
Controlling the Reset Behavior
Initializing and Resetting Arrays
RTL Output
Optimizing for Throughput
Function and Loop Pipelining
Rewinding Pipelined Loops for Performance
Flushing Pipelines
Automatic Loop Pipelining
Addressing Failure to Pipeline
Partitioning Arrays to Improve Pipelining
Automatic Array Partitioning
Dependencies with Vivado HLS
Removing False Dependencies to Improve Loop Pipelining
Optimal Loop Unrolling to Improve Pipelining
Exploiting Task Level Parallelism: Dataflow Optimization
Dataflow Optimization Limitations
Configuring Dataflow Memory Channels
Specifying Arrays as Ping-Pong Buffers or FIFOs
Optimizing for Latency
Using Latency Constraints
Merging Sequential Loops to Reduce Latency
Flattening Nested Loops to Improve Latency
Optimizing for Area
Data Types and Bit-Widths
Function Inlining
Mapping Many Arrays into One Large Array
Horizontal Array Mapping
Mapping Vertical Arrays
Array Mapping and Special Considerations
Array Reshaping
Function Instantiation
Controlling Hardware Resources
Limiting the Number of Operators
Globally Minimizing Operators
Controlling the Hardware Cores
Globally Optimizing Hardware Cores
Optimizing Logic
Controlling Operator Pipelining
Optimizing Logic Expressions
Verifying the RTL
Automatically Verifying the RTL
Test Bench Requirements
Interface Synthesis Requirements
RTL Simulator Support
Unsupported Optimizations for Cosimulation
Simulating IP Cores
Using C/RTL Co-Simulation
Executing RTL Simulation
Verification of Directives
Analyzing RTL Simulations
Debugging C/RTL Cosimulation
Setting up the Environment
Optimization Directives
C Test Bench and C Source Code
Exporting the RTL Design
Synthesizing the RTL
Packaging IP Catalog Format
Software Driver Files
Exporting IP to System Generator
Importing the RTL into System Generator
Optimizing Ports
Exporting a Synthesized Checkpoint
Ch. 2: High-Level Synthesis C Libraries
Introduction to the Vivado HLS C Libraries
Arbitrary Precision Data Types Library
Using Arbitrary Precision Data Types
Arbitrary Integer Precision Types with C
Arbitrary Integer Precision Types with C++
Arbitrary Precision Integer Types with SystemC
Arbitrary Precision Fixed-Point Data Types
Example Using ap_fixed
Example Using sc_fixed
C Arbitrary Precision Integer Data Types
Advantages of C Arbitrary Precision Data Types
Validating Arbitrary Precision Types in C
Integer Promotion
C Arbitrary Precision Integer Types: Reference Information
C++ Arbitrary Precision Integer Types
C++ Arbitrary Precision Integer Types: Reference Information
C++ Arbitrary Precision Fixed-Point Types
C++ Arbitrary Precision Fixed-Point Types: Reference Information
HLS Stream Library
C Modeling and RTL Implementation
Global and Local Streams
Using HLS Streams
Blocking Reads and Writes
Blocking Write Methods
Blocking Read Methods
Non-Blocking Reads and Writes
Non-Blocking Writes
Fullness Test
Non-Blocking Read
Emptiness Test
Controlling the RTL FIFO Depth
C/RTL Co-Simulation Support
HLS Math Library
HLS Math Library Accuracy
C90 mode
C99 mode (-std=c99)
C++ Using math.h
C++ Using cmath
C++ Using cmath and namespace std
The HLS Math Library
Trigonometric Functions
Hyperbolic Functions
Exponential Functions
Logarithmic Functions
Power Functions
Error Functions
Gamma Functions
Rounding Functions
Remainder Functions
Floating-point
Difference Functions
Other Functions
Classification Functions
Comparison Functions
Relational Functions
Fixed-Point Math Functions
Trigonometric Functions
Exponential Functions
Power Functions
Verification and Math Functions
Verification Option 1: Standard Math Library and Verify Differences
Verification Option 2: HLS Math Library and Validate Differences
Verification Option 3: HLS Math Library File and Validate Differences
Common Synthesis Errors
C++ cmath.h
C math.h
Cautions
HLS Video Library
Using the Video Library
Video Data Types
Memory Line Buffer
Memory Window Buffer
Video Functions
OpenCV Interface Functions
AXI4-Interface Functions
Video Processing Functions
Using Video Functions
Optimizing Video Functions for Performance
HLS IP Libraries
FFT IP Library
FFT Static Parameters
FFT Run Time Configuration and Status
Using the FFT Function
FIR Filter IP Library
FIR Static Parameters
Using the FIR Function
Optional FIR Run Time Configuration
DDS IP Library
DDS Static Parameters
SRL IP Library
Mapping Directly into SRL Resources
Read from the Shifter
Read, Write, and Shift Data
Read, Write, and Enable-Shift
HLS Linear Algebra Library
Using the Linear Algebra Library
Optimizing the Linear Algebra Functions
Cholesky
Implementation Controls
Key Factors
Specifications
Cholesky Inverse and QR Inverse
Implementation Controls
Key Factors
Specifications
Matrix Multiply
Implementation Controls
Key Factors
Specifications
QRF
Implementation Controls
Key Factors
Specifications
SVD
Implementation Controls
Key Factors
Specifications
HLS DSP Library
Using the DSP Library
Ch. 3: High-Level Synthesis Coding Styles
Introduction to Coding Styles
Unsupported C Constructs
System Calls
Dynamic Memory Usage
Pointer Limitations
General Pointer Casting
Pointer Arrays
Function Pointers
Recursive Functions
Standard Template Libraries
C Test Bench
Productive Test Benches
Design Files and Test Bench Files
Combining Test Bench and Design Files
OpenCL API C Test Benches
Functions
Inlining functions
Impact of Coding Style
Loops
Variable Loop Bounds
Loop Pipelining
Imperfect Nested Loops
Loop Parallelism
Loop Dependencies
Unrolling Loops in C++ Classes
Arrays
Array Accesses and Performance
FIFO Accesses
Arrays on the Interface
Array Interfaces
FIFO Interfaces
Array Initialization
Implementing ROMs
Data Types
Standard Types
Floats and Doubles
Arbitrary Precision Data Types
Composite Data Types
Structs
Enumerated Types
Unions
Type Qualifiers
Volatile
Statics
Const
Vivado HLS Optimizations
Global Variables
Exposing Global Variables as I/O Ports
Pointers
Pointers on the Interface
Basic Pointers
Pointer Arithmetic
Multi-Access Pointer Interfaces: Streaming Data
Understanding Volatile Data
Modeling Streaming Data Interfaces
Multi-Access Pointers and RTL Simulation
C Builtin Functions
Hardware Efficient C Code
Typical C Code for a Convolution Function
Horizontal Convolution
Vertical Convolution
Border Pixels
Ensuring the Continuous Flow of Data and Data Reuse
Using HLS Streams for Streaming Data
Horizontal Convolution
Vertical Convolution
Border Pixels
Summary of C for Efficient Hardware
C++ Classes and Templates
Constructors, Destructors, and Virtual Functions
Global Variables and Classes
Templates
Using Templates to Create Unique Instances
Using Templates for Recursion
Assertions
SystemC Synthesis
Design Modeling
Using SC_MODULE
Using SC_METHOD
Instantiating SC_MODULES
Using SC_CTHREAD
Synthesis of Loops
Synthesis with Multiple Clocks
Communication Channels
Top-Level SystemC Ports
SystemC Interface Synthesis
RAM Port Synthesis
FIFO Port Synthesis
Unsupported SystemC Constructs
Modules and Constructors
Instantiating Modules
Module Constructors
Virtual Functions
Top-Level Interface Ports
Ch. 4: High-Level Synthesis Reference Guide
Command Reference
add_files
Description
Syntax
Options
Pragma
Examples
close_project
Description
Syntax
Options
Pragma
Examples
close_solution
Description
Syntax
Options
Pragma
Examples
config_array_partition
Description
Syntax
Options
Pragma
Examples
config_bind
Description
Syntax
Options
Pragma
Examples
config_compile
Description
Syntax
Options
Pragma
Examples
config_core
Description
Syntax
Options
Pragma
Examples
config_dataflow
Description
Syntax
Options
Pragma
Examples
config_interface
Description
Syntax
Options
Pragma
Examples
config_rtl
Description
Syntax
Options
Pragma
Examples
config_schedule
Description
Syntax
Options
Pragma
Examples
config_unroll
Description
Syntax
Options
Example
cosim_design
Description
Syntax
Options
Pragma
Examples
create_clock
Description
Syntax
Options
Pragma
Examples
csim_design
Description
Syntax
Options
Pragma
Examples
csynth_design
Description
Syntax
Options
Pragma
Examples
delete_project
Description
Syntax
Options
Pragma
Examples
delete_solution
Syntax
Description
Pragma
Examples
export_design
Description
Syntax
Options
Pragma
Examples
help
Description
Syntax
Options
Pragma
Examples
list_core
Description
Syntax
Options
Pragma
Examples
list_part
Description
Syntax
Pragma
Examples
open_project
Description
Syntax
Options
Pragma
Examples
open_solution
Description
Syntax
Options
Pragma
Examples
set_clock_uncertainty
Description
Syntax
Pragma
Examples
set_directive_allocation
Description
Syntax
Options
Pragma
Examples
set_directive_array_map
Description
Syntax
Options
Pragma
Examples
set_directive_array_partition
Description
Syntax
Options
Pragma
Examples
set_directive_array_reshape
Description
Syntax
Options
Pragma
Examples
set_directive_clock
Description
Syntax
Pragma
Examples
set_directive_dataflow
Description
Syntax
Pragma
Examples
set_directive_data_pack
Description
Syntax
Options
Pragma
Examples
set_directive_dependence
Description
Syntax
Options
Pragma
Examples
set_directive_expression_balance
Description
Syntax
Options
Pragma
Examples
set_directive_function_instantiate
Description
Syntax
Options
Pragma
Examples
set_directive_inline
Description
Syntax
Options
Pragma
Examples
set_directive_interface
Description
Syntax
Options
Pragma
Examples
set_directive_latency
Description
Syntax
Options
Pragma
Examples
set_directive_loop_flatten
Description
Syntax
Options
Pragma
Examples
set_directive_loop_merge
Description
Syntax
Options
Pragma
Examples
set_directive_loop_tripcount
Description
Syntax
Options
Pragma
Examples
set_directive_occurrence
Description
Syntax
Options
Pragma
Examples
set_directive_pipeline
Description
Syntax
Options
Pragma
Examples
set_directive_protocol
Description
Syntax
Options
Pragma
Examples
set_directive_reset
Description
Syntax
Options
Pragma
Examples
set_directive_resource
Description
Syntax
Options
Pragma
Examples
set_directive_stream
Description
Syntax
Options
Pragma
Examples
set_directive_top
Description
Syntax
Options
Pragma
Examples
set_directive_unroll
Description
Syntax
Options
Pragma
Examples
set_part
Description
Syntax
Options
Pragma
Examples
set_top
Description
Syntax
Options
Pragma
Examples
GUI Reference
Monitoring Variables
Resolving Header File Information
Resolving Comments in the Source Code
Customizing the GUI Behavior
Customizing the Console Window
Customizing the Key Behavior
Interface Synthesis Reference
Block-Level I/O Protocols
ap_ctrl_none
ap_ctrl_hs
ap_ctrl_chain
Port-Level I/O Protocols
ap_none
ap_stable
ap_hs (ap_ack, ap_vld, and ap_ovld)
ap_ack
ap_vld
ap_ovld
ap_memory, bram
ap_fifo
ap_bus
axis
s_axilite
m_axi
AXI4-Lite Slave C Driver Reference
XDut_Initialize
Synopsis
Description
XDut_CfgInitialize
Synopsis
Description
XDut_LookupConfig
Synopsis
Description
XDut_Release
Synopsis
Description
XDut_Start
Synopsis
Description
XDut_IsDone
Synopsis
Description
XDut_IsIdle
Synopsis
Description
XDut_IsReady
Synopsis
Description
XDut_Continue
Synopsis
Description
XDut_EnableAutoRestart
Synopsis
Description
XDut_DisableAutoRestart
Synopsis
Description
XDut_Set_ARG
Synopsis
Description
XDut_Set_ARG_vld
Synopsis
Description
XDut_Set_ARG_ack
Synopsis
Description
XDut_Get_ARG
Synopsis
Description
XDut_Get_ARG_vld
Synopsis
Description
XDut_Get_ARG_ack
Synopsis
Description
XDut_Get_ARG_BaseAddress
Synopsis
Description
XDut_Get_ARG_HighAddress
Synopsis
Description
XDut_Get_ARG_TotalBytes
Synopsis
Description
XDut_Get_ARG_BitWidth
Synopsis
Description
XDut_Get_ARG_Depth
Synopsis
Description
XDut_Write_ARG_Words
Synopsis
Description
XDut_Read_ARG_Words
Synopsis
Description
XDut_Write_ARG_Bytes
Synopsis
Description
XDut_Read_ARG_Bytes
Synopsis
Description
XDut_InterruptGlobalEnable
Synopsis
Description
XDut_InterruptGlobalDisable
Synopsis
Description
XDut_InterruptEnable
Synopsis
Description
XDut_InterruptDisable
Synopsis
Description
XDut_InterruptClear
Synopsis
Description
XDut_InterruptGetEnabled
Synopsis
Description
XDut_InterruptGetStatus
Synopsis
Description
HLS Video Functions Library
OpenCV Interface Functions
IplImage2AXIvideo
Synopsis
Parameters
Description
AXIvideo2IplImage
Synopsis
Parameters
Description
cvMat2AXIvideo
Synopsis
Parameters
Description
AXIvideo2cvMat
Synopsis
Parameters
Description
CvMat2AXIvideo
Synopsis
Parameters
Description
AXIvideo2CvMat
Synopsis
Parameters
Description
IplImage2hlsMat
Synopsis
Parameters
Description
hlsMat2IplImage
Synopsis
Parameters
Description
cvMat2hlsMat
Synopsis
Parameters
Description
hlsMat2cvMat
Synopsis
Parameters
Description
CvMat2hlsMat
Synopsis
Parameters
Description
hlsMat2CvMat
Synopsis
Parameters
Description
CvMat2hlsWindow
Synopsis
Parameters
Description
hlsWindow2CvMat
Synopsis
Parameters
Description
AXI4-Interface I/O Functions
hls::Array2Mat
Synopsis
Parameters
Description
hls::Mat2Array
Synopsis
Parameters
Description
hls::AXIvideo2Mat
Synopsis
Parameters
Description
hls::Mat2AXIvideo
Synopsis
Parameters
Description
Video Processing Functions
hls::AbsDiff
Synopsis
Parameters
Description
OpenCV Reference
hls::AddS
Synopsis
Parameters
Description
OpenCV Reference
hls::AddWeighted
Synopsis
Parameters
Description
OpenCV Reference
hls::And
Synopsis
Parameters
Description
OpenCV Reference
hls::Avg
Synopsis
Parameters
Description
OpenCV Reference
hls::AvgSdv
Synopsis
Parameters
Description
OpenCV Reference
hls::Cmp
Synopsis
Parameters
Description
OpenCV Reference
hls::CmpS
Synopsis
Parameters
Description
OpenCV Reference
hls::CornerHarris
Synopsis
Parameters
Description
OpenCV Reference
hls::CvtColor
Synopsis
Parameters
Description
OpenCV Reference
hls::Dilate
Synopsis
Parameters
Description
OpenCV Reference
hls::Duplicate
Synopsis
Parameters
Description
OpenCV Reference
hls::EqualizeHist
Synopsis
Parameters
Description
OpenCV Reference
hls::Erode
Synopsis
Parameters
Description
OpenCV Reference
hls::FASTX
Synopsis
Parameters
Description
OpenCV Reference
hls::Filter2D
Synopsis
Parameters
Description
OpenCV Reference
hls::FindStereoCorrespondenceBM
Synopsis
Parameters
Description
OpenCV Reference
hls::GaussianBlur
Synopsis
Parameters
Description
OpenCV Reference
hls::Harris
Synopsis
Parameters
Description
OpenCV Reference
hls::HoughLines2
Synopsis
Parameters
Description
OpenCV Reference
hls::Integral
Synopsis
Parameters
Description
OpenCV Reference
hls::InitUndistortRectifyMap
Synopsis
Parameters
Description
Limitations
OpenCV Reference
hls::Max
Synopsis
Parameters
Description
OpenCV Reference
hls::MaxS
Synopsis
Parameters
Description
OpenCV Reference
hls::Mean
Synopsis
Parameters
Description
OpenCV Reference
hls::Merge
Synopsis
Parameters
Description
OpenCV Reference
hls::Min
Synopsis
Parameters
Description
OpenCV Reference
hls::MinMaxLoc
Synopsis
Parameters
Description
OpenCV Reference
hls::MinS
Synopsis
Parameters
Description
OpenCV Reference
hls::Mul
Synopsis
Parameters
Description
OpenCV Reference
hls::Not
Synopsis
Parameters
Description
OpenCV Reference
hls::PaintMask
Synopsis
Parameters
Description
hls::PyrDown
Synopsis
Parameters
Description
OpenCV Reference
hls::PyrUp
Synopsis
Parameters
Description
OpenCV Reference
hls::Range
Synopsis
Parameters
Description
OpenCV Reference
hls::Remap
Synopsis
Parameters
Description
OpenCV Reference
hls::Reduce
Synopsis
Parameters
Description
OpenCV Reference
hls::Resize
Synopsis
Parameters
Description
OpenCV Reference
hls::Set
Synopsis
Parameters
Description
OpenCV Reference
hls::Scale
Synopsis
Parameters
Description
OpenCV Reference
hls::Sobel
Synopsis
Parameters
Description
OpenCV Reference
hls::Split
Synopsis
Parameters
Description
OpenCV Reference
hls::SubRS
Synopsis
Parameters
Description
OpenCV Reference
hls::SubS
Synopsis
Parameters
Description
OpenCV Reference
hls::Sum
Synopsis
Parameters
Description
OpenCV Reference
hls::Threshold
Synopsis
Parameters
Description
OpenCV Reference
hls::Zero
Synopsis
Parameters
Description
OpenCV Reference
HLS Linear Algebra Library Functions
matrix_multiply
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
cholesky
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
qrf
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
cholesky_inverse
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
qr_inverse
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
svd
Synopsis
Description
Parameters
Arguments
Return Values
Supported Data Types
Input Data Assumptions
Examples
HLS DSP Library Functions
HLS DSP Functions
awgn
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
nco
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
convolution_encoder
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
viterbi_decoder
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
atan2
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
sqrt
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
cmpy
Synopsis
Description
Parameters
Arguments
Return Values
Supported Base Data Types
Input Data Assumptions
HLS DSP Design Examples
C Arbitrary Precision Types
Compiling [u]int#W Types
Declaring/Defining [u]int#W Variables
Initialization and Assignment from Constants (Literals)
apint_string2bits()
apint_vstring2bits()
Support for console I/O (Printing)
apint_print()
apint_fprint()
Expressions Involving [u]int#W types
Zero- and Sign-Extension on Assignment from Narrower to Wider Variables
Truncation on Assignment of Wider to Narrower Variables
Binary Arithmetic Operators
Addition
Subtraction
Multiplication
Division
Modulus
Bitwise Logical Operators
Bitwise OR
Bitwise AND
Bitwise XOR
Shift Operators
Unsigned Integer Shift Right
Integer Shift Right
Unsigned Integer Shift Left
Integer Shift Left
Compound Assignment Operators
Relational Operators
Equality
Inequality
Less than
Greater than
Less than or equal to
Greater than or equal to
Bit-Level Operation: Support Function
Bit Manipulation
Length
Concatenation
Bit Selection
Set Bit Value
Range Selection
Set Range Value
Bit Reduction
AND Reduce
OR Reduce
XOR Reduce
NAND Reduce
NOR Reduce
XNOR Reduce
C++ Arbitrary Precision Types
Compiling ap_[u]int<> Types
Declaring/Defining ap_[u] Variables
Initialization and Assignment from Constants (Literals)
Support for Console I/O (Printing)
Using the C++ Standard Output Stream
Using the Standard C Library
Optional Argument One (Specifying the Radix)
Optional Argument Two (Printing as Signed Values)
Expressions Involving ap_[u]<> types
Zero- and Sign-Extension on Assignment From Narrower to Wider Variables
Truncation on Assignment of Wider to Narrower Variables
Class Methods and Operators
Binary Arithmetic Operators
Addition
Subtraction
Multiplication
Division
Modulus
Bitwise Logical Operators
Bitwise OR
Bitwise AND
Bitwise XOR
Unary Operators
Addition
Subtraction
Bitwise Inverse
Logical Invert
Ternary Operators
Shift Operators
Unsigned Integer Shift Right
Integer Shift Right
Unsigned Integer Shift Left
Integer Shift Left
Compound Assignment Operators
Increment and Decrement Operators
Pre-Increment
Post-Increment
Pre-Decrement
Post-Decrement
Relational Operators
Equality
Inequality
Less than
Greater than
Less than or equal to
Greater than or equal to
Other Class Methods, Operators, and Data Members
Bit-Level Operations
Length
Concatenation
Bit Selection
Range Selection
AND reduce
OR reduce
XOR reduce
NAND reduce
NOR reduce
XNOR reduce
Bit Reduction Method Examples
Bit Reverse
Reverse Method Example
Test Bit Value
Test Method Example
Set Bit Value
Set Bit (to 1)
Clear Bit (to 0)
Invert Bit
Rotate Right
Rotate Left
Bitwise NOT
Test Sign
Explicit Conversion Methods
To C/C++ “(u)int”
To C/C++ 64-bit “(u)int”
To C/C++ “double”
Sizeof
Compile Time Access to Data Type Attributes
C++ Arbitrary Precision Fixed-Point Types
ap_[u]fixed Representation
Quantization Modes
AP_RND
AP_RND_ZERO
AP_RND_MIN_INF
AP_RND_INF
AP_RND_CONV
AP_TRN
AP_TRN_ZERO
Overflow Modes
AP_SAT
AP_SAT_ZERO
AP_SAT_SYM
AP_WRAP
AP_WRAP_SM
Compiling ap_[u]fixed<> Types
Declaring and Defining ap_[u]fixed<> Variables
Initialization and Assignment from Constants (Literals)
Support for Console I/O (Printing)
Using the Standard C Library
Optional Argument One (Specifying the Radix)
Optional Argument Two (Printing as Signed Values)
Expressions Involving ap_[u]fixed<> types
Class Methods, Operators, and Data Members
Binary Arithmetic Operators
Addition
Subtraction
Multiplication
Division
Bitwise Logical Operators
Bitwise OR
Bitwise AND
Bitwise XOR
Increment and Decrement Operators
Pre-Increment
Post-Increment
Pre-Decrement
Post-Decrement
Unary Operators
Addition
Subtraction
Equality Zero
Bitwise Inverse
Shift Operators
Unsigned Shift Left
Signed Shift Left
Unsigned Shift Right
Signed Shift Right
Relational Operators
Equality
Inequality
Greater than or equal to
Less than or equal to
Greater than
Less than
Bit Operator
Bit-Select and Set
Bit Range
Range Select
Length
Explicit Conversion Methods
Fixed-to-double
Fixed-to-ap_int
Fixed-to-integer
Compile Time Access to Data Type Attributes
Comparison of SystemC and Vivado HLS Types
Default Constructor
Integer Division
Integer Modulus
Negative Shifts
Over-Shift Left
Range Operation
Division and Fixed-Point Types
Right Shift and Fixed-Point Types
Left Shift and Fixed-Point Types
Appx. A: Additional Resources and Legal Notices
Xilinx Resources
Solution Centers
Documentation Navigator and Design Hubs
References
Training Resources
Please Read: Important Legal Notices
Vivado Design Suite User Guide: High-Level Synthesis
UG902 (v2017.4) February 2, 2018
Revision History
The following table shows the revision history for this document.

Date          Version
02/02/2018    2017.4
12/20/2017    2017.4
04/05/2017    2017.1

Revision
• In Chapter 1, High-Level Synthesis, updated information on DATAFLOW optimization in Dataflow Optimization Limitations, Configuring Dataflow Memory Channels, and Specifying Arrays as Ping-Pong Buffers or FIFOs.
• Updated STREAM in Table 1-11.
• Updated Optimizing for Throughput.
• Removed qam_mod and qam_demod from Table 1-5, Table 2-31, and HLS DSP Library Functions.
• Added information on read-after-read to Dependencies with Vivado HLS.
• Updated MulnS in Table 1-14.
• Updates throughout Optimizing for Throughput.
• Updated default modes in Table 2-3 and Table 2-4.
• Removed examples from Optimizing the Linear Algebra Functions.
• Added information on improving memory resources to Arrays.
• Updated FIFO Interfaces.
• Added Using Templates to Create Unique Instances to Templates.
• Updated config_compile, set_directive_loop_tripcount, set_directive_resource, and set_directive_stream.
• Updated Figure 4-8 through Figure 4-14.
• Changed rounding behavior in AP_TRN and AP_TRN_ZERO.
• Added new section HLS Math Library in Chapter 2.
• Updated code examples in Pointers, apint_print(), Invert Bit, Dependencies with Vivado HLS, and Cholesky Inverse and QR Inverse.
• Removed -avg option for TRIPCOUNT throughout document.
• Updated Specifying Arrays as Ping-Pong Buffers or FIFOs and set_directive_stream with information about -depth.
• Clarified C/RTL co-simulation halting conditions in Interface Synthesis Requirements.
• Updated Half-Precision Floating-Point Data Types.
• Added Off mode information to AXI4-Stream Interfaces.
• Updated AXI4-Lite Interface.
• Updated C Modeling and RTL Implementation.
• Updated Non-Blocking Reads and Writes.
• Removed Table 3-2 (Floating Point Cores and Device Support) from Standard Types.
• Added support information for Function Pointers to Pointer Limitations.
• Updated -register_mode in set_directive_interface.
Table of Contents
Revision History

Chapter 1: High-Level Synthesis
Introduction to C-Based FPGA Design
Understanding Vivado HLS
Using Vivado HLS
Data Types for Efficient Hardware
Managing Interfaces
Optimizing the Design
Verifying the RTL
Exporting the RTL Design

Chapter 2: High-Level Synthesis C Libraries
Introduction to the Vivado HLS C Libraries
Arbitrary Precision Data Types Library
HLS Stream Library
HLS Math Library
HLS Video Library
HLS IP Libraries
HLS Linear Algebra Library
HLS DSP Library

Chapter 3: High-Level Synthesis Coding Styles
Introduction to Coding Styles
Unsupported C Constructs
C Test Bench
Functions
Loops
Arrays
Data Types
C Builtin Functions
Hardware Efficient C Code
C++ Classes and Templates
Assertions
SystemC Synthesis

Chapter 4: High-Level Synthesis Reference Guide
Command Reference
GUI Reference
Interface Synthesis Reference
AXI4-Lite Slave C Driver Reference
HLS Video Functions Library
HLS Linear Algebra Library Functions
HLS DSP Library Functions
C Arbitrary Precision Types
C++ Arbitrary Precision Types
C++ Arbitrary Precision Fixed-Point Types
Comparison of SystemC and Vivado HLS Types

Appendix A: Additional Resources and Legal Notices
Xilinx Resources
Solution Centers
Documentation Navigator and Design Hubs
References
Training Resources
Please Read: Important Legal Notices
Chapter 1
High-Level Synthesis

Introduction to C-Based FPGA Design
The Xilinx® Vivado® High-Level Synthesis (HLS) tool transforms a C specification into a register transfer level (RTL) implementation that you can synthesize into a Xilinx field programmable gate array (FPGA). You can write C specifications in C, C++, SystemC, or as an Open Computing Language (OpenCL™) API C kernel, and the FPGA provides a massively parallel architecture with benefits in performance, cost, and power over traditional processors. This chapter provides an overview of high-level synthesis.

Note: For more information on FPGA architectures and Vivado HLS basic concepts, see the Introduction to FPGA Design Using High-Level Synthesis (UG998) [Ref 1].

High-Level Synthesis Benefits
High-level synthesis bridges hardware and software domains, providing the following primary benefits:
• Improved productivity for hardware designers
  Hardware designers can work at a higher level of abstraction while creating high-performance hardware.
• Improved system performance for software designers
  Software developers can accelerate the computationally intensive parts of their algorithms on a new compilation target, the FPGA.

Using a high-level synthesis design methodology allows you to:
• Develop algorithms at the C-level
  Work at a level that is abstract from the implementation details, which consume development time.
• Verify at the C-level
  Validate the functional correctness of the design more quickly than with traditional hardware description languages.
• Control the C synthesis process through optimization directives
  Create specific high-performance hardware implementations.
• Create multiple implementations from the C source code using optimization directives
  Explore the design space, which increases the likelihood of finding an optimal implementation.
• Create readable and portable C source code
  Retarget the C source into different devices as well as incorporate the C source into new projects.

High-Level Synthesis Basics
High-level synthesis includes the following phases:
• Scheduling
  Determines which operations occur during each clock cycle based on:
  ° Length of the clock cycle or clock frequency
  ° Time it takes for the operation to complete, as defined by the target device
  ° User-specified optimization directives
  If the clock period is longer or a faster FPGA is targeted, more operations are completed within a single clock cycle, and all operations might complete in one clock cycle. Conversely, if the clock period is shorter or a slower FPGA is targeted, high-level synthesis automatically schedules the operations over more clock cycles, and some operations might need to be implemented as multicycle resources.
• Binding
  Determines which hardware resource implements each scheduled operation. To implement the optimal solution, high-level synthesis uses information about the target device.
• Control logic extraction
  Extracts the control logic to create a finite state machine (FSM) that sequences the operations in the RTL design.
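The effect of the clock period on scheduling, described above, can be sketched with a toy model in C. This is only an illustration under stated assumptions, not the algorithm Vivado HLS uses: the op_t type, the schedule_chain function, and every delay value are hypothetical, and the model handles only a simple chain in which each operation depends on the previous one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation record (not a Vivado HLS type). */
typedef struct {
    const char *name;   /* label for the operation */
    unsigned delay_ps;  /* assumed delay on some target device, in picoseconds */
    int cycle;          /* clock cycle assigned by schedule_chain() */
} op_t;

/*
 * Toy scheduler for a chain of dependent operations.
 * Operations are packed into the current clock cycle until the next one
 * would exceed the clock period; then a new cycle is started.
 * An operation slower than the clock period simply gets a cycle of its
 * own here; a real tool would treat it as a multicycle resource.
 * Returns the number of clock cycles used.
 */
int schedule_chain(op_t *ops, size_t n, unsigned clock_ps)
{
    assert(clock_ps > 0);

    int cycle = 0;
    unsigned used_ps = 0;  /* time already consumed in the current cycle */

    for (size_t i = 0; i < n; i++) {
        if (used_ps > 0 && used_ps + ops[i].delay_ps > clock_ps) {
            cycle++;       /* does not fit: move to the next clock cycle */
            used_ps = 0;
        }
        ops[i].cycle = cycle;
        used_ps += ops[i].delay_ps;
    }
    return n ? cycle + 1 : 0;
}
```

With assumed delays of 4000 ps for a multiplication and 2000 ps for each of two additions, a 10000 ps clock period places all three operations in one cycle, while a 6000 ps period places the multiplication and first addition in the first cycle and the second addition in the second cycle.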
High-level synthesis synthesizes the C code as follows:
• Top-level function arguments synthesize into RTL I/O ports
• C functions synthesize into blocks in the RTL hierarchy
  If the C code includes a hierarchy of sub-functions, the final RTL design includes a hierarchy of modules or entities that have a one-to-one correspondence with the original C function hierarchy. All instances of a function use the same RTL implementation or block.
• Loops in the C functions are kept rolled by default
  When loops are rolled, synthesis creates the logic for one iteration of the loop, and the RTL design executes this logic for each iteration of the loop in sequence. Using optimization directives, you can unroll loops, which allows all iterations to occur in parallel.
• Arrays in the C code synthesize into block RAM or UltraRAM in the final FPGA design
  If the array is on the top-level function interface, high-level synthesis implements the array as ports to access a block RAM outside the design.

High-level synthesis creates the optimal implementation based on default behavior, constraints, and any optimization directives you specify. You can use optimization directives to modify and control the default behavior of the internal logic and I/O ports. This allows you to generate variations of the hardware implementation from the same C code.

To determine if the design meets your requirements, you can review the performance metrics in the synthesis report generated by high-level synthesis. After analyzing the report, you can use optimization directives to refine the implementation. The synthesis report contains information on the following performance metrics:
• Area: Amount of hardware resources required to implement the design based on the resources available in the FPGA, including look-up tables (LUT), registers, block RAMs, and DSP48s.
• Latency: Number of clock cycles required for the function to compute all output values.
• Initiation interval (II): Number of clock cycles before the function can accept new input data.
• Loop iteration latency: Number of clock cycles it takes to complete one iteration of the loop.
• Loop initiation interval: Number of clock cycles before the next iteration of the loop starts to process data.
• Loop latency: Number of cycles to execute all iterations of the loop.
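To make the rolled and unrolled loop forms described above more concrete, the following C sketch writes the same computation both ways. The function names, the array length N, and the test values are illustrative assumptions. The hand-unrolled version only shows conceptually what full unrolling exposes; in Vivado HLS you would normally keep the loop and request unrolling with the UNROLL optimization directive (a `#pragma HLS unroll` in the loop body or the `set_directive_unroll` Tcl command) rather than rewriting the source.

```c
#include <assert.h>

#define N 4  /* illustrative array length */

/*
 * Rolled form: one loop body, executed N times in sequence.
 * If this were the top-level function, the arguments a and b would
 * become I/O ports, and the arrays would be accessed through ports
 * to memory outside the design.
 */
int dot_rolled(const int a[N], const int b[N])
{
    int acc = 0;
    for (int i = 0; i < N; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/*
 * Hand-unrolled form: all N multiplications appear as separate
 * operations, so nothing in the source forces them to run one
 * after another.
 */
int dot_unrolled(const int a[N], const int b[N])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2] + a[3] * b[3];
}

/* Small C-level check that both forms compute the same result. */
void check_dot(void)
{
    const int a[N] = {1, 2, 3, 4};
    const int b[N] = {5, 6, 7, 8};
    assert(dot_rolled(a, b) == 70);
    assert(dot_unrolled(a, b) == dot_rolled(a, b));
}
```

A C test bench could call check_dot() before synthesis to confirm that both forms agree.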
Scheduling and Binding Example
The following figure shows an example of the scheduling and binding phases for this code example:

    int foo(char x, char a, char b, char c) {
        char y;
        y = x*a+b+c;
        return y;
    }

Figure 1-1: Scheduling and Binding Example

In the scheduling phase of this example, high-level synthesis schedules the following operations to occur during each clock cycle:
• First clock cycle: Multiplication and the first addition
• Second clock cycle: Second addition and output generation

Note: In the preceding figure, the square between the first and second clock cycles indicates when an internal register stores a variable. In this example, high-level synthesis only requires that the output of the addition is registered across a clock cycle. The first cycle reads x, a, and b data ports. The second cycle reads data port c and generates output y.
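Before synthesis, the function in this example can be checked at the C level. The sketch below repeats foo, with the return statement terminated by a semicolon so that it compiles, and adds a hypothetical helper, test_foo, of the kind a C test bench might call. The helper name and the input values are assumptions chosen so that every result fits in a char.

```c
#include <assert.h>

/* Function from the scheduling and binding example. */
int foo(char x, char a, char b, char c)
{
    char y;
    y = x * a + b + c;  /* one multiplication followed by two additions */
    return y;
}

/*
 * Hypothetical test bench helper.
 * Inputs are kept small and non-negative so the result fits in a char
 * whether plain char is signed or unsigned on the host compiler.
 */
void test_foo(void)
{
    assert(foo(2, 3, 4, 5) == 15);     /* 2*3 + 4 + 5 */
    assert(foo(0, 7, 1, 1) == 2);      /* 0*7 + 1 + 1 */
    assert(foo(10, 10, 20, 7) == 127); /* 10*10 + 20 + 7 */
}
```

Because y is declared as char, a result outside the range of char does not survive the assignment intact, so a test bench for wider input ranges would need to account for that.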