
VPU API Reference Manual.pdf

Overview
Main Features
Programmability
Frame-Based Processing
Program Memory Management
Multi-Instances
Host Interface
Communication Models
Data Handling
Host Interface Registers
API-Based VPU Control
API Features
Simple Software Control
Handling Multi-Instances
Frame-Based Codec Processing
Type Definitions
Type Definitions (common data types)
Uint8
Uint16
Uint32
Uint64
Int64
PhysicalAddress
VirtualAddress
CodStd
RetCode
CodecCommand
GDI_TILED_MAP_TYPE
MirrorDirection
Mp4HeaderType
AvcHeaderType
EncHandle
DecHandle
Data and Structure Definitions
FrameBuffer
DecMaxFrmInfo
Rect
EncHeaderParam
EncParamSet
EncMp4Param
EncH263Param
EncAvcParam
EncMjpgParam
EncSliceMode
EncOpenParam
EncReportBufSize
EncInitialInfo
EncParam
EncReportInfo
EncOutputInfo
SearchRamParam
DecParamSet
DecOpenParam
DecReportBufSize
DecInitialInfo
ExtBufCfg
DecBufInfo
DecParam
DecReportInfo
Vp8ScaleInfo
Vp8PicInfo
AvcFpaSei
MvcPicInfo
DecOutputInfo
vpu_versioninfo
VPUMemAlloc
iram_t
API Definitions Overview
Basic Architecture
Decoder Operation Flow
Encoder Operation Flow
Control API
vpu_Init()
vpu_UnInit()
vpu_IsBusy()
jpu_IsBusy()
vpu_WaitForInt()
vpu_GetVersionInfo()
IOGetPhyMem()
IOFreePhyMem()
IOGetVirtMem()
IOFreeVirtMem()
IOGetIramBase()
vpu_SWReset()
Encoder API
vpu_EncOpen()
vpu_EncClose()
vpu_EncGetInitialInfo()
vpu_EncGetBitstreamBuffer()
vpu_EncUpdateBitstreamBuffer()
vpu_EncRegisterFrameBuffer()
vpu_EncStartOneFrame()
vpu_EncGetOutputInfo()
vpu_EncGiveCommand()
Decoder API
vpu_DecOpen()
vpu_DecClose()
vpu_DecGetInitialInfo()
vpu_DecSetEscSeqInit()
vpu_DecGetBitstreamBuffer()
vpu_DecUpdateBitstreamBuffer()
vpu_DecRegisterFrameBuffer()
vpu_DecStartOneFrame()
vpu_DecGetOutputInfo()
vpu_DecBitBufferFlush()
vpu_DecClrDispFlag()
vpu_DecGiveCommand()
VPU Control
VPU Initialization
Version Check of BIT Processor Microcode
BIT Processor Enable and Disable
BIT Processor Data Buffer Management
BIT Processor Microcode Management
Stream Buffer Management
Ring-Buffer Scheme (Packet Mode)
Interrupt Signaling Management
Encoder Control
Creating an Encoder Instance
Configuring VPU for Encoder Instance
Sequence Initialization
Registering Frame Buffers During Configuration Process
Generating High-Level Header Syntaxes
Running Picture Encoder on VPU
YUV Input Loading
Initiating Picture Encoding
Completion of Picture Encoding
Encoder Stream Handling
Acquiring Encoder Results
Terminating an Encoder Instance
Dynamic Configuration Commands (picture encoding operations)
Decoder Control
Creating a Decoder Instance
AVC Display Reordering
Configuring VPU for Decoder Instance
Feeding Bitstream into Stream Buffer
Sequence Initialization when configuring VPU for Decoder Instance
Registering Frame Buffers
Running Picture Decoder On VPU
Initiating Picture Decoding
Frame Skipping Option
I-Frame Search for Random Access and Trick Mode
Decoder Stream Handling
Completion of Picture Decoding
Acquiring Decoder Results
Reading Display Output
Reading Decoded Output
Reading Pre-Scan Result
Display Cropping in H.264
Next Decoded Frame Index
Reading Lack of Additional Work Buffer
Management of Displaying Buffers Decoded
Escape from Decoder Hang
Terminating a Decoder Instance
Stream End and Last Picture in Stream Buffer
Closing Current Instance
Dynamic Configuration Commands
Example Applications
VPU Library
VPU Example Application
Decode Stream to Display on LCD
Encode Stream from Camera Captured Data
Other Issues
Revision History
NXP Semiconductors
Document Number: IMXVPUAPI
Rev. 0, 10/2016

i.MX VPU Application Programming Interface Linux® Reference Manual

Contents
1 Overview
2 Host Interface
3 API Features
4 VPU Control
5 Revision History

1 Overview

This section discusses the capabilities of the i.MX 6 series VPU and explains its block diagram.

The i.MX 6 series Video Processing Unit (VPU) is a high-performance multi-standard video decoder and encoder engine that performs decoding and encoding operations for multiple standards. The VPU codec is fully compliant with H.264 BP/MP/HP, VC-1 SP/MP/AP, MPEG-4 SP/ASP except GMC, DivX (Xvid), MPEG-1/2, VP8, AVS, and MJPEG decoding, and with H.264, MPEG-4, H.263, and MJPG encoding. The VPU supports up to full HD 1920x1080 60i or 30p decoding and 1920x1088 encoding. It can encode or decode multiple video clips with multiple standards simultaneously.

A block diagram of the i.MX 6 series VPU is shown in Figure 1. The VPU connects with the system through the 32-bit AMBA3 APB bus for system control and the 64-bit AMBA3 AXI bus for data throughput. The VPU also takes advantage of on-chip memories to achieve high performance. Most video hardware blocks in the VPU are designed for shared usage between different video standards, which provides ultra-low power and low gate count with powerful performance. As shown in Figure 1, the VPU has a 16-bit DSP core, the BIT processor, which controls the internal video codec operations.
For simple and efficient control of the VPU by the host processor, the VPU provides a set of registers called the host interface registers. Most commands and responses between the host processor and the VPU are transmitted through the host interface registers. Stream data and some output picture data are accessed directly by the host processor and the VPU. For a more comprehensive way of controlling the VPU, a set of API functions is provided that includes all of the required operations from the host processor side.

Figure 1. i.MX 6 VPU Block Diagram

1.1 Main Features

The VPU is fully compliant with H.264 BP/MP/HP, VC-1 SP/MP/AP, MPEG-4 SP/ASP except GMC, DivX (Xvid), MPEG-1/2, VP8, AVS, and MJPEG. It supports image sizes up to full HD 1920x1080 60i or 30p for decoding and 1920x1088 for encoding. The VPU supports various error resilience tools, multiple decoding, and full-duplex multi-party calls simultaneously.

The VPU provides programmability, flexibility, and ease of upgrade in decoding, encoding, and the host interface, because all of the controls in the decoding and encoding process and in the host interface are implemented as firmware in the programmable BIT processor.

The detailed features of the VPU are as follows:
• Encoding
  • H.264
    • 1/4-pel accuracy motion estimation with programmable search range up to [+/-128, +/-64]
    • Search range is reconfigurable by SW
    • 16x16, 16x8, 8x16 and 8x8 block sizes
    • Configurable block sizes
    • Only one reference frame for motion estimation
    • Intra-prediction
      • Luma I4x4 Mode: 9 modes
      • Luma I16x16 Mode: 3 modes (Vertical, Horizontal, DC)
      • Chroma Mode: 3 modes (Vertical, Horizontal, DC)
    • Minimum encoding image size is 96 pixels horizontal and 16 pixels vertical
    • FMO/ASO tool of H.264 is not supported
  • MPEG-4
    • AC/DC prediction
    • 1/2-pel accuracy motion estimation with search range up to [+/-128, +/-64]
    • Search range is reconfigurable by SW
  • H.263
    • H.263 Baseline profile + Annex J, K (RS=0 and ASO=0), and T
    • 48x32 pixel minimum encoding image size (48 pixels horizontal and 32 pixels vertical)
• Decoding
  • H.264
    • Fully compatible with the ITU-T Recommendation H.264 specification in BP/MP and HP
    • CABAC/CAVLC
    • Supports MVC Stereo High profile
    • Variable block sizes: 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4
    • Error detection, concealment and error resilience tools
  • VC-1
    • All VC-1 profile features of the proposed SMPTE Standard for Television: VC-1 Compressed Video Bitstream Format and Decoding Process
    • Simple/Main/Advanced Profile
  • MPEG-4
    • Simple/Advanced Simple profile except GMC
  • H.263 Baseline profile + Annex I, J, K (except RS/ASO), and T
  • DivX version 3.x to 6.x
  • Xvid
  • MPEG-2
    • Fully compatible with the ISO/IEC 13818-2 MPEG-2 specification in main profile
    • I, P and B frames
    • Field coded picture (interlaced) and frame coded picture
  • AVS
    • Supports Jizhun profile level 6.2 (excluding the 4:2:2 use case)
  • VP8
    • Fully compatible with the VP8 decoder specification
    • Supports both simple and normal in-loop deblocking
    • 64x64 pixel minimum decoding size
• JPEG tools
  • MJPEG Baseline Process Encoder and Decoder
  • Baseline ISO/IEC 10918-1 JPEG compliance
  • Supports 1 or 3 color components
  • 3 components in a scan (interleaved only)
  • 8-bit samples for each component
  • Supports 4:2:0, 4:2:2, 2:2:4, 4:4:4 and 4:0:0 color formats (max. six 8x8 blocks in one MCU)
  • Minimum encoding size is 16x16 pixels
• Value added features
  • De-ringing
  • Pre/Post rotator/mirror
  • Built-in de-blocking filter for MPEG-2/MPEG-4 and DivX
• Programmability
  • 16-bit DSP processor dedicated to processing bitstream and controlling the codec hardware
  • General purpose registers and interrupt for communication to and from a host processor
• Optimal external memory accesses
  • Configurable frame buffer formats (linear or tiled) for longer burst-length
  • 2D cache for motion estimation and compensation to reduce external memory accesses
  • Secondary AXI port for on-chip memory to enhance performance
• Performance
  • All video decoder standards up to 1920x1088 @ 30 fps at 266 MHz
  • H.264 encoder up to 1920x1088 @ 30 fps at 266 MHz, MPEG-4 encoder up to 720p @ 30 fps at 266 MHz
  • MJPG decoder on 4:4:4 supports 120M pixels per second at 266 MHz
  • MJPG encoder on 4:4:4 supports 160M pixels per second at 266 MHz
• Interrupt
  • Interrupt from and to external host processor or interrupt controller

1.2 Programmability

The VPU has an internal DSP called the BIT processor which controls the internal hardware blocks for video decoder operations. The operation of the BIT processor is determined by the dedicated microcode called the BIT firmware. The VPU has a complete set of BIT firmware code as well as a complete set of VPU control functions called the VPU API. Therefore, application developers do not need to manage codec-specific issues on the host processor.

1.2.1 Frame-Based Processing

The BIT processor completes decoding operations on a frame-by-frame basis, which allows low-level independence of VPU operations from the host processor. While frame operations are running, there is no need for communication between the host processor and the VPU. Therefore, the VPU does not burden the host processor during decoder operations. After issuing a picture processing command, the host application performs its own operations until it is ready for the next picture processing operation or until it receives an interrupt from the VPU informing the host processor of completion of the picture processing.
1.2.2 Program Memory Management

The VPU has its own program memory to load BIT firmware for supporting application-specific operations. In order to use this internal memory efficiently, the BIT firmware has a dynamic re-loading scheme which enables the VPU to have a small amount of program memory. For example, if an MPEG-2 decoder operation is running on the VPU, then the VPU program memory is filled by the MPEG-2 decoder firmware. If an H.264 decoder operation is newly issued, then the BIT processor automatically loads the H.264 decoder firmware from the SDRAM into program memory.

Because of the frame-based operation of the VPU, the maximum rate of this dynamic reloading operation is approximately 30 times per second in a single-instance decoder use case. Since the amount of BIT firmware for one decoder standard is smaller than 16 Kbytes, this is not a large burden on the VPU in terms of performance or memory bandwidth.
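The reloading cost stated above can be checked with simple arithmetic. The following is a minimal sketch in C, assuming the two figures from the paragraph (about 30 reloads per second, firmware under 16 Kbytes per standard); the macro and function names are illustrative and are not part of the VPU API.

```c
#include <stdint.h>

/* Figures taken from the text above; both are upper bounds. */
#define MAX_RELOADS_PER_SEC 30u
#define MAX_FIRMWARE_BYTES  (16u * 1024u)

/* Upper bound on SDRAM read traffic, in bytes per second, caused by
 * reloading firmware of the given size at the given rate. */
uint32_t reload_bytes_per_sec(uint32_t reloads_per_sec, uint32_t firmware_bytes)
{
    return reloads_per_sec * firmware_bytes;
}
```

With these bounds the function returns 491520 bytes per second (480 Kbytes/s), which is small next to the traffic generated by full HD frame data.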
1.2.3 Multi-Instances

The VPU supports multiple instances, which can be helpful for multi-channel decoder applications. In order to support this multi-instance operation, the BIT processor uses an internal context parameter set for each decoder instance. When creating a new instance and starting a picture processing operation, a set of context parameters is created and updated automatically within the VPU. This internal context management scheme allows different decoder tasks running on the host processor to control VPU operations independently with their own instance numbers.

When creating a new instance, an application task receives a new handle specifying an instance if a new handle is available on the VPU. All the subsequent operations for the given application task are handled separately by the VPU using this task-specific handle. When writing a VPU driver, this handle can be regarded as a device ID or a port ID of the VPU for each task. Since the VPU can only perform one picture processing task at a time, the application task should check whether the VPU is ready before starting a new picture operation. An application can easily terminate a single task on the VPU by calling a function for closing a certain instance.

2 Host Interface

This section gives a general description of the interfaces that a host processor uses to control the i.MX 6 VPU.

2.1 Communication Models

The VPU requires a dedicated path for exchanging data and messages with the host processor. The VPU uses shared memory for exchanging data with the host processor. This shared memory is accessible through the AMBA host bus. Bitstream data and frame data are exchanged using this shared memory space. Independent of the data exchange path, a dedicated path for messages between the host processor and the VPU is provided using a set of VPU registers called the host interface registers.
All commands and responses between the host processor and VPU are exchanged through these registers as shown in figure below.
Figure 2. Data and Message Exchange Between Host and VPU

All bitstream and picture data is accessed directly by the host processor and the VPU. The related information about the data transfer, as well as commands and responses, is exchanged through the host interface.

The host interface of the VPU uses a set of registers accessible from the host processor. Some of these host registers are used for exchanging actual commands and responses, and other registers are used to give information about the internal status of the VPU to the host processor. Firmware running on the BIT processor is well optimized for a given set of commands and responses.

2.1.1 Data Handling

All pixel data and stream data transactions are performed by the host processor or the VPU through the shared memory space in SDRAM. In order to assure safe transactions between the host processor and the VPU, all the required information is stored in the host interface registers. Generally, these transactions are one-directional: on a single data buffer, either the host or the VPU writes the data and the other side reads it. Therefore, transactions are easily and safely controlled by using a pair of read and write pointers.

In addition to the common data buffers in shared memory, the BIT processor requires a certain amount of memory for processing, called the working buffer. The working buffer can only be accessed by the VPU. Frame buffers used in picture decoding are also managed exclusively by the VPU, which ensures safe decoding.

For proper streaming, the available free space in the decoder stream buffer can be determined from the buffer read pointer, write pointer, and buffer size. A set of APIs is provided for this purpose that can be called by the application at any time.

2.1.2 Host Interface Registers

A set of commands is provided for controlling codec operations on a frame-by-frame basis together with the corresponding responses.
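As a side note on the pointer-based scheme from 2.1.1 Data Handling, the free space of a circular stream buffer can be derived from the read pointer, write pointer, and buffer size. The sketch below is only an illustration: the StreamRing structure and both functions are hypothetical, and the convention of keeping one byte unused (so that a full buffer is distinguishable from an empty one) is an assumption, not something stated in this manual.

```c
#include <stdint.h>

/* Hypothetical description of a circular stream buffer in shared memory. */
typedef struct {
    uint32_t start; /* address of the first byte of the buffer */
    uint32_t size;  /* buffer size in bytes */
    uint32_t rd;    /* read pointer (consumer), in [start, start + size) */
    uint32_t wr;    /* write pointer (producer), in [start, start + size) */
} StreamRing;

/* Free bytes available to the producer. One byte is kept unused
 * (assumed convention) so that rd == wr always means "empty". */
uint32_t ring_free_bytes(const StreamRing *r)
{
    uint32_t used = (r->wr >= r->rd) ? (r->wr - r->rd)
                                     : (r->size - (r->rd - r->wr));
    return r->size - used - 1u;
}

/* Advance the write pointer by up to n bytes, wrapping at the end of
 * the buffer. Returns the number of bytes actually accepted. */
uint32_t ring_advance_wr(StreamRing *r, uint32_t n)
{
    uint32_t room = ring_free_bytes(r);
    if (n > room)
        n = room;
    r->wr = r->start + ((r->wr - r->start + n) % r->size);
    return n;
}
```

In this model the host would call ring_free_bytes() before copying bitstream data into the buffer, then report the new write position; the consumer side moves rd as it reads.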
Host interface registers can be partitioned into three categories as follows:
• BIT processor control registers update or show BIT processor status to host processors. Most of these registers are used for initializing BIT processor during boot-up.
• BIT processor global registers store all the global variables, which are preserved even when the active instance changes. All the buffer addresses and some global options are safely stored in these registers.
• BIT processor command I/O registers are overwritten or updated whenever a new command is transmitted from the host processor. All commands with input arguments and all corresponding responses with return values are handled using these registers. In addition, command I/O registers are used in a pre-defined way for each command to control the VPU.

2.2 API-Based VPU Control

Host applications generally control the VPU through a set of pre-defined APIs by sending a command and corresponding arguments to the VPU. After receiving an interrupt from the VPU signalling the completion of the requested operation, the host application acquires the results as shown in Figure 3.

Each API definition includes the requested command and the input and output data structures. The command given by the API function is always written to a dedicated I/O register, while the input and output data structures are transmitted through a set of command I/O registers that contain the input arguments and output results. Therefore, application developers do not need to know the details of the host register definitions and usage.

Figure 3. Software Control Model of VPU from Host Application

3 API Features

This section describes the important features of the i.MX 6 VPU API.
A set of API functions is provided to efficiently control the VPU. The VPU API covers all functions of the i.MX 6 VPU. This API-based approach speeds up the development of application software. Important features of the API for the i.MX 6 VPU are summarized in the following sections.

3.1 Simple Software Control

The i.MX 6 VPU API provides a simple way to control the i.MX 6 VPU and avoid errors in application software. The host application does not need to know the details of the i.MX 6 VPU internal operations. For example, in order to initialize the VPU, an application simply calls the initialization API, vpu_Init(), and no additional information is required for calling this API. The vpu_Init() API performs all the required steps for initializing the i.MX 6 VPU. When issuing a picture decoder operation, the application simply changes some variables included in the well-defined input data structure.

3.1.1 Handling Multi-Instances

The i.MX 6 VPU supports multiple instances for decoding and encoding at the same time, which can be used in multiple decoding and encoding and in multi-party call applications. To support multi-instance operations, the i.MX 6 VPU API provides a full set of functions for handling the instances with ease. When opening a new instance, the application receives a handle specifying the new instance, provided a new handle is available at that time. The operations for a given instance are separately controlled using the corresponding handle. An application can easily terminate a single task on the VPU by calling a function for closing a certain instance.

3.1.2 Frame-Based Codec Processing

The i.MX 6 VPU completes decoding and encoding operations on a frame-by-frame basis, which enables low-level independence of the VPU operations from the host processor. While frame processing operations are running, there is no need for communication between the host processor and the VPU. Therefore, the VPU does not burden the host processor during decoding and encoding operations.
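The combination of per-instance handles and one-picture-at-a-time processing can be illustrated with a small host-side model. This is a sketch under stated assumptions, not the VPU API: every sim_ name is hypothetical, the pool size of 4 is an assumed value, and sim_complete_frame() merely stands in for the completion interrupt.

```c
#include <stdbool.h>

#define SIM_MAX_INSTANCES 4 /* assumed pool size, not taken from the manual */

/* Hypothetical model of the instance pool and the single processing slot. */
typedef struct {
    bool in_use[SIM_MAX_INSTANCES];      /* which handles are open */
    int  frames_done[SIM_MAX_INSTANCES]; /* completed pictures per handle */
    int  busy_handle;                    /* handle with a picture in flight, or -1 */
} SimVpu;

void sim_init(SimVpu *v)
{
    for (int i = 0; i < SIM_MAX_INSTANCES; i++) {
        v->in_use[i] = false;
        v->frames_done[i] = 0;
    }
    v->busy_handle = -1;
}

/* Open an instance: returns a handle, or -1 when no handle is available. */
int sim_open(SimVpu *v)
{
    for (int i = 0; i < SIM_MAX_INSTANCES; i++) {
        if (!v->in_use[i]) {
            v->in_use[i] = true;
            v->frames_done[i] = 0;
            return i;
        }
    }
    return -1;
}

static bool sim_valid(const SimVpu *v, int h)
{
    return h >= 0 && h < SIM_MAX_INSTANCES && v->in_use[h];
}

/* Start one picture. Refused when the handle is not open or when
 * another picture is still in flight. */
bool sim_start_frame(SimVpu *v, int h)
{
    if (!sim_valid(v, h) || v->busy_handle != -1)
        return false;
    v->busy_handle = h;
    return true;
}

/* Stand-in for the completion interrupt: returns the handle whose
 * picture finished, or -1 when nothing was in flight. */
int sim_complete_frame(SimVpu *v)
{
    int h = v->busy_handle;
    if (h != -1) {
        v->frames_done[h]++;
        v->busy_handle = -1;
    }
    return h;
}

/* Close an instance. Refused while its picture is still in flight. */
bool sim_close(SimVpu *v, int h)
{
    if (!sim_valid(v, h) || v->busy_handle == h)
        return false;
    v->in_use[h] = false;
    return true;
}
```

Two tasks that each hold their own handle interleave naturally in this model: the second task's sim_start_frame() is refused until the first picture completes, which mirrors the advice to check that the VPU is ready before starting a new picture operation.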
3.2 Type Definitions

This section describes the types and structures used in VPU API.

3.2.1 Type Definitions (common data types)

This section describes the common data types used in the VPU API functions.

3.2.1.1 Uint8

    typedef unsigned char Uint8;

Description
8-bit unsigned integer type used for declaring pixel data.
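As an illustration of Uint8 as a pixel type, the sketch below computes how many Uint8 elements one 8-bit 4:2:0 picture needs. It assumes a planar, linear layout with no stride padding and even dimensions; the function name is hypothetical, and real frame buffer sizes depend on the alignment and map type the VPU requires.

```c
#include <stddef.h>

typedef unsigned char Uint8; /* as defined in 3.2.1.1 */

/* Bytes needed for one 8-bit 4:2:0 picture: a full-resolution luma
 * plane plus two chroma planes at half resolution in each dimension.
 * Assumes even width and height and no stride padding. */
size_t yuv420_frame_bytes(size_t width, size_t height)
{
    size_t luma   = width * height * sizeof(Uint8);
    size_t chroma = (width / 2) * (height / 2) * sizeof(Uint8);
    return luma + 2 * chroma;
}
```

For a 1920x1088 picture this gives 2088960 bytes of luma and 3133440 bytes in total.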