Network Working Group Y.-K. Wang
Internet Draft Qualcomm
Intended status: Standards track Y. Sanchez
Expires: June 2015 T. Schierl
Fraunhofer HHI
S. Wenger
Vidyo
M. M. Hannuksela
Nokia
December 8, 2014
RTP Payload Format for High Efficiency Video Coding
draft-ietf-payload-rtp-h265-07.txt
Abstract
This memo describes an RTP payload format for the video coding
standard ITU-T Recommendation H.265 and ISO/IEC International
Standard 23008-2, both also known as High Efficiency Video Coding
(HEVC) and developed by the Joint Collaborative Team on Video
Coding (JCT-VC). The RTP payload format allows for packetization
of one or more Network Abstraction Layer (NAL) units in each RTP
packet payload, as well as fragmentation of a NAL unit into
multiple RTP packets. Furthermore, it supports transmission of
an HEVC bitstream over a single as well as multiple RTP streams.
When multiple RTP streams are used, a single transport or multiple
transports may be utilized. The payload format has wide
applicability in videoconferencing, Internet video streaming, and
high bit-rate entertainment-quality video, among others.
Status of this Memo
This Internet-Draft is submitted to IETF in full conformance with
the provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
Wang, et al Expires June 8, 2015 [Page 1]
Internet-Draft RTP Payload Format for HEVC December 8, 2014
other groups may also distribute working documents as Internet-
Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other
documents at any time. It is inappropriate to use Internet-
Drafts as reference material or to cite them other than as "work
in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt.
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
This Internet-Draft will expire on June 8, 2015.
Copyright and License Notice
Copyright (c) 2014 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust’s Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with
respect to this document. Code Components extracted from this
document must include Simplified BSD License text as described in
Section 4.e of the Trust Legal Provisions and are provided
without warranty as described in the Simplified BSD License.
Table of Contents
Abstract.........................................................1
Status of this Memo..............................................1
Table of Contents................................................3
1 Introduction...................................................5
1.1 Overview of the HEVC Codec................................5
1.1.1 Coding-Tool Features.................................5
1.1.2 Systems and Transport Interfaces.....................7
1.1.3 Parallel Processing Support.........................14
1.1.4 NAL Unit Header.....................................16
1.2 Overview of the Payload Format...........................18
2 Conventions...................................................18
3 Definitions and Abbreviations.................................19
3.1 Definitions..............................................19
3.1.1 Definitions from the HEVC Specification.............19
3.1.2 Definitions Specific to This Memo...................21
3.2 Abbreviations............................................23
4 RTP Payload Format............................................24
4.1 RTP Header Usage.........................................24
4.2 Payload Header Usage.....................................27
4.3 Payload Structures.......................................27
4.4 Transmission Modes.......................................28
4.5 Decoding Order Number....................................29
4.6 Single NAL Unit Packets..................................31
4.7 Aggregation Packets (APs)................................32
4.8 Fragmentation Units (FUs)................................37
4.9 PACI packets.............................................40
4.9.1 Reasons for the PACI rules (informative)............43
4.9.2 PACI extensions (Informative).......................44
4.10 Temporal Scalability Control Information................45
5 Packetization Rules...........................................47
6 De-packetization Process......................................48
7 Payload Format Parameters.....................................50
7.1 Media Type Registration..................................51
7.2 SDP Parameters...........................................76
7.2.1 Mapping of Payload Type Parameters to SDP...........76
7.2.2 Usage with SDP Offer/Answer Model...................78
7.2.3 Usage in Declarative Session Descriptions...........87
7.2.4 Parameter Sets Considerations.......................88
7.2.5 Dependency Signaling in Multi-Stream Mode...........88
8 Use with Feedback Messages....................................89
8.1 Picture Loss Indication (PLI)............................90
8.2 Slice Loss Indication (SLI)..............................90
8.3 Reference Picture Selection Indication (RPSI)............91
8.4 Full Intra Request (FIR).................................92
9 Security Considerations.......................................93
10 Congestion Control...........................................94
11 IANA Consideration...........................................95
12 Acknowledgements.............................................95
13 References...................................................96
13.1 Normative References....................................96
13.2 Informative References..................................97
14 Authors’ Addresses...........................................99
1 Introduction
1.1 Overview of the HEVC Codec
High Efficiency Video Coding [HEVC], formally known as ITU-T
Recommendation H.265 and ISO/IEC International Standard 23008-2,
was ratified by ITU-T in April 2013 and reportedly provides
significant coding efficiency gains over H.264 [H.264].
As both H.264 [H.264] and its RTP payload format [RFC6184] are
widely deployed and generally known in the relevant implementer
communities, frequently only the differences between those two
specifications are highlighted in non-normative, explanatory
parts of this memo. Basic familiarity with both specifications
is assumed for those parts. However, the normative parts of this
memo do not require study of H.264 or its RTP payload format.
H.264 and HEVC share a similar hybrid video codec design.
Conceptually, both technologies include a video coding layer
(VCL), which is often used to refer to the coding-tool features,
and a network abstraction layer (NAL), which is often used to
refer to the systems and transport interface aspects of the
codecs.
1.1.1 Coding-Tool Features
Similarly to earlier hybrid-video-coding-based standards,
including H.264, the following basic video coding design is
employed by HEVC. A prediction signal is first formed either by
intra or motion compensated prediction, and the residual (the
difference between the original and the prediction) is then
coded. The gains in coding efficiency are achieved by
redesigning and improving almost all parts of the codec over
earlier designs. In addition, HEVC includes several tools to
make the implementation on parallel architectures easier. Below
is a summary of HEVC coding-tool features.
Quad-tree block and transform structure
One of the major tools that contribute significantly to the
coding efficiency of HEVC is the usage of flexible coding blocks
and transforms, which are defined in a hierarchical quad-tree
manner. Unlike H.264, where the basic coding block is a
macroblock of fixed size 16x16, HEVC defines a Coding Tree Unit
(CTU) of a maximum size of 64x64. Each CTU can be divided into
smaller units in a hierarchical quad-tree manner and can
represent smaller blocks down to size 4x4. Similarly, the
transforms used in HEVC can have different sizes, starting from
4x4 and going up to 32x32. Utilizing large blocks and transforms
contributes to the major gain of HEVC, especially at high
resolutions.
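The hierarchical quad-tree division described above can be
illustrated with a minimal Python sketch. It is not the HEVC
partitioning syntax: the function names, the should_split callback,
and the split rule are hypothetical, and the 64x64 and 4x4 limits
are simply the sizes quoted in the text above.

```python
def split_quadtree(x, y, size, should_split, min_size=4):
    """Recursively partition a square block.

    Returns a list of (x, y, size) leaf blocks. `should_split` is a
    hypothetical caller-supplied decision function; a real encoder
    would decide based on rate-distortion cost.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves.extend(
                    split_quadtree(x + dx, y + dy, half, should_split, min_size)
                )
        return leaves
    return [(x, y, size)]


def split_top_left(x, y, size):
    # Illustrative rule only: keep refining the block at the origin.
    return x == 0 and y == 0


# Partition one 64x64 block (the maximum CTU size quoted above).
leaves = split_quadtree(0, 0, 64, split_top_left)
for leaf in leaves:
    print(leaf)
```

Whatever split rule is supplied, the leaves tile the original block
exactly, so large blocks can cover smooth regions while small blocks
are spent only where detail requires them.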
Entropy coding
HEVC uses a single entropy coding engine, which is based on
Context Adaptive Binary Arithmetic Coding (CABAC), whereas H.264
uses two distinct entropy coding engines. CABAC in HEVC shares
many similarities with CABAC of H.264, but contains several
improvements. Those include improvements in coding efficiency
and lowered implementation complexity, especially for parallel
architectures.
In-loop filtering
H.264 includes an in-loop adaptive deblocking filter, where the
blocking artifacts around the transform edges in the
reconstructed picture are smoothed to improve the picture quality
and compression efficiency. In HEVC, a similar deblocking filter
is employed but with somewhat lower complexity. In addition,
pictures undergo a subsequent filtering operation called Sample
Adaptive Offset (SAO), which is a new design element in HEVC.
SAO basically adds a pixel-level offset in an adaptive manner and
usually acts as a de-ringing filter. It is observed that SAO
improves the picture quality, especially around sharp edges,
contributing substantially to visual quality improvements of
HEVC.
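As a rough illustration of what "adding a pixel-level offset in an
adaptive manner" can look like, the following Python sketch applies
a simplified one-dimensional edge-offset pass. It is an assumption-
laden sketch, not the SAO process of [HEVC]: the function names and
the offset values are hypothetical, only one direction is
considered, and border samples are left untouched.

```python
def sign(v):
    return (v > 0) - (v < 0)


def edge_category(left, cur, right):
    """Classify a sample against its two neighbors.

    0: no edge, 1: local minimum, 2: concave corner,
    3: convex corner, 4: local maximum.
    """
    s = sign(cur - left) + sign(cur - right)
    return {-2: 1, -1: 2, 0: 0, 1: 3, 2: 4}[s]


def sao_edge_1d(row, offsets, max_val=255):
    """Add a per-category offset to each interior sample of a row."""
    out = list(row)
    for i in range(1, len(row) - 1):
        cat = edge_category(row[i - 1], row[i], row[i + 1])
        out[i] = min(max(row[i] + offsets[cat], 0), max_val)
    return out


# Hypothetical offsets: raise valleys, lower peaks (reduces ringing).
offsets = {0: 0, 1: 2, 2: 1, 3: -1, 4: -2}
row = [50, 50, 46, 50, 54, 50, 50]
filtered = sao_edge_1d(row, offsets)
print(filtered)
```

Classification always uses the unfiltered neighbors, so the result
does not depend on the order in which samples are visited.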
Motion prediction and coding
There have been a number of improvements in this area that are
summarized as follows. The first category is motion merge and
advanced motion vector prediction (AMVP) modes. The motion
information of a prediction block can be inferred from the
spatially or temporally neighboring blocks. This is similar to
the DIRECT mode in H.264 but includes new aspects to incorporate
the flexible quad-tree structure and methods to improve the
parallel implementations. In addition, the motion vector
predictor can be signaled for improved efficiency. The second
category is high-precision interpolation. The interpolation
filter length is increased to 8-tap from 6-tap, which improves
the coding efficiency but also comes with increased complexity.
In addition, the interpolation filter is defined with higher
precision without any intermediate rounding operations to further
improve the coding efficiency.
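The difference between a 6-tap and an 8-tap interpolation filter
can be sketched in one dimension as follows. The coefficient values
are not taken from this memo; they are the commonly cited
half-sample luma coefficients and should be checked against [HEVC]
and [H.264]. The function name, the border clamping, and the single
rounding step are assumptions made for illustration.

```python
# Half-sample luma filter taps (assumed values, see lead-in).
HEVC_HALF = (-1, 4, -11, 40, 40, -11, 4, -1)  # 8 taps, sum 64
H264_HALF = (1, -5, 20, 20, -5, 1)            # 6 taps, sum 32


def half_sample(samples, pos, taps, max_val=255):
    """Interpolate midway between samples[pos] and samples[pos + 1]."""
    n = len(taps)
    start = pos - (n // 2 - 1)
    acc = 0
    for k, coeff in enumerate(taps):
        # Clamp indices so that border samples are repeated.
        i = min(max(start + k, 0), len(samples) - 1)
        acc += coeff * samples[i]
    norm = sum(taps)
    value = (acc + norm // 2) // norm  # round once, at the end
    return min(max(value, 0), max_val)


ramp = [0, 10, 20, 30, 40, 50, 60, 70, 80, 90]
print(half_sample(ramp, 4, HEVC_HALF))
print(half_sample(ramp, 4, H264_HALF))
```

Both filters reproduce a linear ramp exactly; they differ on
content with higher spatial frequencies, where the longer filter
approximates the ideal interpolator more closely at the cost of two
extra multiplications per output sample.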
Intra prediction and intra coding
Compared to 8 intra prediction modes in H.264, HEVC supports
angular intra prediction with 33 directions. This increased
flexibility improves both objective coding efficiency and visual
quality as the edges can be better predicted and ringing
artifacts around the edges can be reduced. In addition, the
reference samples are adaptively smoothed based on the prediction
direction. To avoid contouring artifacts, a new interpolative
prediction generation is included to improve the visual quality.
Furthermore, discrete sine transform (DST) is utilized instead of
traditional discrete cosine transform (DCT) for 4x4 intra
transform blocks.
Other coding-tool features
HEVC includes some tools for lossless coding and efficient screen
content coding, such as skipping the transform for certain
blocks. These tools are particularly useful, for example, when
streaming the user interface of a mobile device to a large
display.
1.1.2 Systems and Transport Interfaces
HEVC inherited the basic systems and transport interface
designs from H.264, such as the NAL-unit-based syntax structure, the
hierarchical syntax and data unit structure from sequence-level
parameter sets, multi-picture-level or picture-level parameter
sets, slice-level header parameters, lower-level parameters, the
supplemental enhancement information (SEI) message mechanism, the
hypothetical reference decoder (HRD) based video buffering model,
and so on. The differences in these aspects compared to H.264
are summarized below.
Video parameter set
A new type of parameter set, called video parameter set (VPS),
was introduced. For the first (2013) version of [HEVC], the
video parameter set NAL unit is required to be available prior to
its activation, while the information contained in the video
parameter set is not necessary for operation of the decoding
process. For future HEVC extensions, such as the 3D or scalable
extensions, the video parameter set is expected to include
information necessary for operation of the decoding process, e.g.
decoding dependency or information for reference picture set
construction of enhancement layers. The VPS provides a "big
picture" of a bitstream, including what types of operation points
are provided, the profile, tier, and level of the operation
points, and some other high-level properties of the bitstream
that can be used as the basis for session negotiation and content
selection, etc. (see section 7.1).
Profile, tier and level
The profile, tier and level syntax structure that can be included
in both VPS and sequence parameter set (SPS) includes 12 bytes of
data to describe the entire bitstream (including all temporally
scalable layers, which are referred to as sub-layers in the HEVC
specification), and can optionally include more profile, tier and
level information pertaining to individual temporally scalable
layers. The profile indicator indicates the "best viewed as"
profile when the bitstream conforms to multiple profiles, similar
to the major brand concept in the ISO base media file format
(ISOBMFF) [ISOBMFF] and file formats derived based on ISOBMFF,
such as the 3GPP file format [3GPPFF]. The profile, tier and
level syntax structure also includes the indications of whether
the bitstream is free of frame-packed content, whether the
bitstream is free of interlaced source content and free of field
pictures, i.e. contains only frame pictures of progressive