Informational R. Pantos, Ed.
Internet-Draft Apple Inc.
Intended status: Informational W. May
Expires: November 23, 2017 Major League Baseball Advanced Media
May 22, 2017
HTTP Live Streaming
draft-pantos-http-live-streaming-23
Abstract
This document describes a protocol for transferring unbounded streams
of multimedia data. It specifies the data format of the files and
the actions to be taken by the server (sender) and the clients
(receivers) of the streams. It describes version 7 of this protocol.
Status of This Memo
This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF). Note that other groups may also distribute
working documents as Internet-Drafts. The list of current Internet-
Drafts is at http://datatracker.ietf.org/drafts/current/.
Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."
This Internet-Draft will expire on November 23, 2017.
Copyright Notice
Copyright (c) 2017 IETF Trust and the persons identified as the
document authors. All rights reserved.
This document is subject to BCP 78 and the IETF Trust’s Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document.
This Informational Internet Draft is submitted as an RFC Editor
Contribution and/or non-IETF Document (not as a Contribution, IETF
Contribution, nor IETF Document) in accordance with BCP 78 and BCP
79.
Pantos & May Expires November 23, 2017 [Page 1]
Internet-Draft HTTP Live Streaming May 2017
This document may not be modified, and derivative works of it may not
be created, except to format it for publication as an RFC or to
translate it into languages other than English.
Table of Contents
1. Introduction to HTTP Live Streaming . . . . . . . . . . . . . 4
2. Overview . . . . . . . . . . . . . . . . . . . . . . . . . . 4
3. Media Segments . . . . . . . . . . . . . . . . . . . . . . . 6
3.1. Supported Media Segment Formats . . . . . . . . . . . . . 6
3.2. MPEG-2 Transport Streams . . . . . . . . . . . . . . . . 7
3.3. Fragmented MPEG-4 . . . . . . . . . . . . . . . . . . . . 7
3.4. Packed Audio . . . . . . . . . . . . . . . . . . . . . . 8
3.5. WebVTT . . . . . . . . . . . . . . . . . . . . . . . . . 8
4. Playlists . . . . . . . . . . . . . . . . . . . . . . . . . . 9
4.1. Definition of a Playlist . . . . . . . . . . . . . . . . 9
4.2. Attribute Lists . . . . . . . . . . . . . . . . . . . . . 10
4.3. Playlist Tags . . . . . . . . . . . . . . . . . . . . . . 12
4.3.1. Basic Tags . . . . . . . . . . . . . . . . . . . . . 12
4.3.1.1. EXTM3U . . . . . . . . . . . . . . . . . . . . . 12
4.3.1.2. EXT-X-VERSION . . . . . . . . . . . . . . . . . . 12
4.3.2. Media Segment Tags . . . . . . . . . . . . . . . . . 13
4.3.2.1. EXTINF . . . . . . . . . . . . . . . . . . . . . 13
4.3.2.2. EXT-X-BYTERANGE . . . . . . . . . . . . . . . . . 13
4.3.2.3. EXT-X-DISCONTINUITY . . . . . . . . . . . . . . . 14
4.3.2.4. EXT-X-KEY . . . . . . . . . . . . . . . . . . . . 14
4.3.2.5. EXT-X-MAP . . . . . . . . . . . . . . . . . . . . 16
4.3.2.6. EXT-X-PROGRAM-DATE-TIME . . . . . . . . . . . . . 17
4.3.2.7. EXT-X-DATERANGE . . . . . . . . . . . . . . . . . 17
4.3.2.7.1. Mapping SCTE-35 into EXT-X-DATERANGE . . . . 19
4.3.3. Media Playlist Tags . . . . . . . . . . . . . . . . . 21
4.3.3.1. EXT-X-TARGETDURATION . . . . . . . . . . . . . . 21
4.3.3.2. EXT-X-MEDIA-SEQUENCE . . . . . . . . . . . . . . 22
4.3.3.3. EXT-X-DISCONTINUITY-SEQUENCE . . . . . . . . . . 22
4.3.3.4. EXT-X-ENDLIST . . . . . . . . . . . . . . . . . . 23
4.3.3.5. EXT-X-PLAYLIST-TYPE . . . . . . . . . . . . . . . 23
4.3.3.6. EXT-X-I-FRAMES-ONLY . . . . . . . . . . . . . . . 23
4.3.4. Master Playlist Tags . . . . . . . . . . . . . . . . 24
4.3.4.1. EXT-X-MEDIA . . . . . . . . . . . . . . . . . . . 24
4.3.4.1.1. Rendition Groups . . . . . . . . . . . . . . 27
4.3.4.2. EXT-X-STREAM-INF . . . . . . . . . . . . . . . . 28
4.3.4.2.1. Alternative Renditions . . . . . . . . . . . 31
4.3.4.3. EXT-X-I-FRAME-STREAM-INF . . . . . . . . . . . . 32
4.3.4.4. EXT-X-SESSION-DATA . . . . . . . . . . . . . . . 33
4.3.4.5. EXT-X-SESSION-KEY . . . . . . . . . . . . . . . . 34
4.3.5. Media or Master Playlist Tags . . . . . . . . . . . . 34
4.3.5.1. EXT-X-INDEPENDENT-SEGMENTS . . . . . . . . . . . 34
4.3.5.2. EXT-X-START . . . . . . . . . . . . . . . . . . . 35
5. Key files . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.1. Structure of Key files . . . . . . . . . . . . . . . . . 36
5.2. IV for [AES_128] . . . . . . . . . . . . . . . . . . . . 36
6. Client/Server Responsibilities . . . . . . . . . . . . . . . 36
6.1. Introduction . . . . . . . . . . . . . . . . . . . . . . 36
6.2. Server Responsibilities . . . . . . . . . . . . . . . . . 36
6.2.1. General Server Responsibilities . . . . . . . . . . . 36
6.2.2. Live Playlists . . . . . . . . . . . . . . . . . . . 39
6.2.3. Encrypting Media Segments . . . . . . . . . . . . . . 40
6.2.4. Providing Variant Streams . . . . . . . . . . . . . . 41
6.3. Client Responsibilities . . . . . . . . . . . . . . . . . 42
6.3.1. General Client Responsibilities . . . . . . . . . . . 42
6.3.2. Loading the Media Playlist file . . . . . . . . . . . 43
6.3.3. Playing the Media Playlist file . . . . . . . . . . . 44
6.3.4. Reloading the Media Playlist file . . . . . . . . . . 45
6.3.5. Determining the next segment to load . . . . . . . . 46
6.3.6. Decrypting encrypted Media Segments . . . . . . . . . 46
7. Protocol version compatibility . . . . . . . . . . . . . . . 47
8. Playlist Examples . . . . . . . . . . . . . . . . . . . . . . 48
8.1. Simple Media Playlist . . . . . . . . . . . . . . . . . . 48
8.2. Live Media Playlist, using HTTPS . . . . . . . . . . . . 48
8.3. Playlist with encrypted Media Segments . . . . . . . . . 49
8.4. Master Playlist . . . . . . . . . . . . . . . . . . . . . 49
8.5. Master Playlist with I-Frames . . . . . . . . . . . . . . 50
8.6. Master Playlist with Alternative audio . . . . . . . . . 50
8.7. Master Playlist with Alternative video . . . . . . . . . 50
8.8. Session Data in a Master Playlist . . . . . . . . . . . . 51
8.9. CHARACTERISTICS attribute containing multiple
characteristics . . . . . . . . . . . . . . . . . . . . . 52
8.10. EXT-X-DATERANGE carrying SCTE-35 tags . . . . . . . . . . 52
9. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 52
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 52
11. Security Considerations . . . . . . . . . . . . . . . . . . . 53
12. References . . . . . . . . . . . . . . . . . . . . . . . . . 54
12.1. Normative References . . . . . . . . . . . . . . . . . . 54
12.2. Informative References . . . . . . . . . . . . . . . . . 57
Authors’ Addresses . . . . . . . . . . . . . . . . . . . . . . . 58
1. Introduction to HTTP Live Streaming
HTTP Live Streaming provides a reliable, cost-effective means of
delivering continuous and long-form video over the Internet. It
allows a receiver to adapt the bit rate of the media to the current
network conditions in order to maintain uninterrupted playback at the
best possible quality. It supports interstitial content boundaries.
It provides a flexible framework for media encryption. It can
efficiently offer multiple renditions of the same content, such as
audio translations. It offers compatibility with large-scale HTTP
caching infrastructure to support delivery to large audiences.
Since its first draft publication in 2009, HTTP Live Streaming has
been implemented and deployed by a wide array of content producers,
tools vendors, distributors, and device manufacturers. In the
subsequent eight years the protocol has been refined by extensive
review and discussion with a variety of media streaming implementors.
The purpose of this document is to facilitate interoperability
between HTTP Live Streaming implementations by describing the media
transmission protocol. Using this protocol, a client can receive a
continuous stream of media from a server for concurrent presentation.
This document describes version 7 of the protocol.
2. Overview
A multimedia presentation is specified by a Uniform Resource
Identifier (URI) [RFC3986] to a Playlist.
A Playlist is either a Media Playlist or a Master Playlist. Both are
UTF-8 text files containing URIs and descriptive tags.
A Media Playlist contains a list of Media Segments that, when played
sequentially, make up the multimedia presentation.
Here is an example of a Media Playlist:
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:9.009,
http://media.example.com/first.ts
#EXTINF:9.009,
http://media.example.com/second.ts
#EXTINF:3.003,
http://media.example.com/third.ts
The first line is the format identifier tag #EXTM3U. The line
containing #EXT-X-TARGETDURATION says that all Media Segments will be
10 seconds long or less. Then three Media Segments are declared.
The first and second are 9.009 seconds long; the third is 3.003
seconds.
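The reading of the example Playlist given above can be sketched in
code. The following Python is a minimal illustration only, not a
conforming parser: the function name parse_media_playlist and the
constant EXAMPLE_PLAYLIST are hypothetical, and the sketch handles
only the three tags that appear in the example.

```python
from typing import List, Tuple

# The example Media Playlist from this section, reproduced verbatim.
EXAMPLE_PLAYLIST = """\
#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:9.009,
http://media.example.com/first.ts
#EXTINF:9.009,
http://media.example.com/second.ts
#EXTINF:3.003,
http://media.example.com/third.ts
"""


def parse_media_playlist(text: str) -> Tuple[int, List[Tuple[float, str]]]:
    """Return (target_duration, [(duration, uri), ...]).

    Hypothetical helper: it understands only EXTM3U,
    EXT-X-TARGETDURATION and EXTINF, and ignores every other tag.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if not lines or lines[0] != "#EXTM3U":
        raise ValueError("first line must be the #EXTM3U tag")

    target_duration = None
    pending_duration = None
    segments: List[Tuple[float, str]] = []

    for line in lines[1:]:
        if line.startswith("#EXT-X-TARGETDURATION:"):
            target_duration = int(line.split(":", 1)[1])
        elif line.startswith("#EXTINF:"):
            # EXTINF carries "<duration>,[<title>]"; keep the duration.
            value = line.split(":", 1)[1]
            pending_duration = float(value.split(",", 1)[0])
        elif not line.startswith("#"):
            # A non-tag line is the URI of the segment just described.
            if pending_duration is None:
                raise ValueError("segment URI without a preceding EXTINF")
            segments.append((pending_duration, line))
            pending_duration = None

    if target_duration is None:
        raise ValueError("missing EXT-X-TARGETDURATION")
    return target_duration, segments


target, segments = parse_media_playlist(EXAMPLE_PLAYLIST)
for duration, uri in segments:
    print(f"{duration:6.3f} s  {uri}")
```

Applied to the example, the sketch yields a target duration of 10
and three segments, none of which is longer than the target.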
To play this Playlist, the client first downloads it and then
downloads and plays each Media Segment declared within it. The
client reloads the Playlist as described in this document to discover
any added segments. Data SHOULD be carried over HTTP [RFC7230], but
in general a URI can specify any protocol that can reliably transfer
the specified resource on demand.
A more complex presentation can be described by a Master Playlist. A
Master Playlist provides a set of Variant Streams, each of which
describes a different version of the same content.
A Variant Stream includes a Media Playlist that specifies media
encoded at a particular bit rate, in a particular format, and at a
particular resolution for media containing video.
A Variant Stream can also specify a set of Renditions. Renditions
are alternate versions of the content, such as audio produced in
different languages or video recorded from different camera angles.
Clients should switch between different Variant Streams to adapt to
network conditions. Clients should choose Renditions based on user
preferences.
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119 [RFC2119].
3. Media Segments
A Media Playlist contains a series of Media Segments which make up
the overall presentation. A Media Segment is specified by a URI and
optionally a byte range.
The duration of each Media Segment is indicated in the Media Playlist
by its EXTINF tag (Section 4.3.2.1).
Each segment in a Media Playlist has a unique integer Media Sequence
Number. The Media Sequence Number of the first segment in the Media
Playlist is either 0, or declared in the Playlist (Section 4.3.3.2).
The Media Sequence Number of every other segment is equal to the
Media Sequence Number of the segment that precedes it plus one.
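The numbering rule above amounts to a simple count from the first
segment. The Python below is an illustrative sketch; the function
name and the first_number parameter are assumptions, with
first_number standing in for the value declared in the Playlist
(or 0 when none is declared).

```python
from typing import List, Tuple


def assign_media_sequence_numbers(
    uris: List[str], first_number: int = 0
) -> List[Tuple[int, str]]:
    """Pair each segment URI with its Media Sequence Number.

    The first segment gets first_number; every later segment gets
    the number of the segment before it plus one.
    """
    return [(first_number + index, uri) for index, uri in enumerate(uris)]


numbered = assign_media_sequence_numbers(
    ["first.ts", "second.ts", "third.ts"], first_number=2680
)
for number, uri in numbered:
    print(number, uri)
```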
Each Media Segment MUST carry the continuation of the encoded
bitstream from the end of the segment with the previous Media
Sequence Number, where values in a series such as timestamps and
Continuity Counters MUST continue uninterrupted. The only exceptions
are the first Media Segment ever to appear in a Media Playlist, and
Media Segments which are explicitly signaled as discontinuities
(Section 4.3.2.3). Unmarked media discontinuities can trigger
playback errors.
Any Media Segment that contains video SHOULD include enough
information to initialize a video decoder and decode a continuous set
of frames that includes the final frame in the Segment; network
efficiency is optimized if there is enough information in the Segment
to decode all frames in the Segment. For example, any Media Segment
containing H.264 video SHOULD contain an IDR; frames prior to the
first IDR will be downloaded but possibly discarded.
3.1. Supported Media Segment Formats
All Media Segments MUST be in a format described in this section.
Transport of other media file formats is not defined.
Some media formats require a common sequence of bytes to initialize a
parser before a Media Segment can be parsed. This format-specific
sequence is called the Media Initialization Section. The Media
Initialization Section can be specified by an EXT-X-MAP
(Section 4.3.2.5) tag. The Media Initialization Section MUST NOT
contain sample data.
3.2. MPEG-2 Transport Streams
MPEG-2 Transport Streams are specified by [ISO_13818].
The Media Initialization Section of an MPEG-2 Transport Stream
Segment is a Program Association Table (PAT) followed by a Program
Map Table (PMT).
Transport Stream Segments MUST contain a single MPEG-2 Program;
playback of Multi-Program Transport Streams is not defined. Each
Transport Stream Segment MUST contain a PAT and a PMT, or have an
EXT-X-MAP (Section 4.3.2.5) tag applied to it. The first two
Transport Stream packets in a Segment without an EXT-X-MAP tag SHOULD
be a PAT and a PMT.
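As an informal illustration of the packet-order recommendation, the
Python sketch below checks whether a Segment begins with a PAT. It
relies on two facts from the MPEG-2 Systems packet layout: Transport
Stream packets are 188 bytes long and begin with the sync byte 0x47,
and the PAT is carried on PID 0. The function names are
hypothetical, and make_packet exists only to build test data.
Verifying that the second packet is the PMT would require parsing
the PAT to learn the PMT PID, which this sketch does not attempt.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47
PAT_PID = 0x0000


def packet_pid(packet: bytes) -> int:
    """Extract the 13-bit PID from one Transport Stream packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a 188-byte Transport Stream packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def starts_with_pat(segment: bytes) -> bool:
    """Return True if the first packet of the segment carries PID 0."""
    if len(segment) < TS_PACKET_SIZE:
        return False
    return packet_pid(segment[:TS_PACKET_SIZE]) == PAT_PID


def make_packet(pid: int) -> bytes:
    """Build a payload-only packet with filler bytes (test data only)."""
    header = bytes([
        SYNC_BYTE,
        0x40 | ((pid >> 8) & 0x1F),  # payload_unit_start_indicator + PID high bits
        pid & 0xFF,                  # PID low bits
        0x10,                        # payload only, continuity counter 0
    ])
    return header + bytes([0xFF]) * (TS_PACKET_SIZE - len(header))


segment = make_packet(PAT_PID) + make_packet(0x0100)
print(starts_with_pat(segment))
```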
3.3. Fragmented MPEG-4
MPEG-4 Fragments are specified by the ISO Base Media File Format
[ISOBMFF]. Unlike regular MPEG-4 files which have a Movie Box
('moov') that contains sample tables and a Media Data Box ('mdat')
containing the corresponding samples, an MPEG-4 Fragment consists of
a Movie Fragment Box ('moof') containing a subset of the sample table
and a Media Data Box containing those samples. Use of MPEG-4
Fragments does require a Movie Box for initialization, but that Movie
Box contains only non-sample-specific information such as track and
sample descriptions.
A Fragmented MPEG-4 (fMP4) Segment is a "segment" as defined by
Section 3 of [ISOBMFF], including the constraints on Media Data Boxes
in Section 8.16 [ISOBMFF].
The Media Initialization Section for an fMP4 Segment is an ISO Base
Media File that can initialize a parser for that Segment.
Broadly speaking, fMP4 Segments and Media Initialization Sections are
[ISOBMFF] files that also satisfy the constraints described in this
section.
The Media Initialization Section for an fMP4 Segment MUST contain a
File Type Box ('ftyp') containing a brand that is compatible with
'iso6' or higher. The File Type Box MUST be followed by a Movie Box.
The Movie Box MUST contain a Track Box ('trak') for every Track
Fragment Box ('traf') in the fMP4 Segment, with matching track_ID.
Each Track Box SHOULD contain a sample table, but its sample count
MUST be zero. Movie Header Boxes ('mvhd') and Track Header Boxes
('tkhd') MUST have durations of zero. A Movie Extends Box ('mvex')
MUST follow the last Track Box. Note that a CMAF Header [CMAF] meets
all these requirements.
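Some of the structural requirements above can be checked by walking
the box hierarchy. The Python sketch below is a partial illustration
under stated assumptions: it uses the standard ISO Base Media File
Format box header (a 32-bit big-endian size followed by a four-
character type, with size 1 introducing a 64-bit size and size 0
meaning "to end of data"); it reads "followed by" as "immediately
followed by"; and it does not check brands, track_ID matching,
sample counts, or durations. All function names are hypothetical.

```python
import struct
from typing import Iterator, List, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (type, payload) for each box found at one nesting level."""
    offset = 0
    while offset < len(data):
        if offset + 8 > len(data):
            raise ValueError("truncated box header")
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:  # 64-bit largesize follows the type field
            if offset + 16 > len(data):
                raise ValueError("truncated largesize field")
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:  # box extends to the end of the data
            size = len(data) - offset
        if size < header or offset + size > len(data):
            raise ValueError("invalid box size")
        yield box_type.decode("latin-1"), data[offset + header:offset + size]
        offset += size


def check_init_section(data: bytes) -> bool:
    """Check box order only: ftyp, then moov, with mvex after the last trak."""
    top = list(iter_boxes(data))
    if len(top) < 2 or top[0][0] != "ftyp" or top[1][0] != "moov":
        return False
    children: List[str] = [box_type for box_type, _ in iter_boxes(top[1][1])]
    last_trak = max(
        (i for i, box_type in enumerate(children) if box_type == "trak"),
        default=-1,
    )
    return "mvex" in children and children.index("mvex") > last_trak


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a box with a 32-bit size field (test data only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


ftyp = make_box(b"ftyp", b"iso6" + b"\x00\x00\x00\x00" + b"iso6")
moov = make_box(b"moov", make_box(b"mvhd") + make_box(b"trak") + make_box(b"mvex"))
print(check_init_section(ftyp + moov))
```

A real validator would also inspect the contents of each box; the
empty 'mvhd', 'trak', and 'mvex' boxes here exist only to exercise
the ordering check.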
In an fMP4 Segment, every Track Fragment Box MUST contain a Track
Fragment Decode Time Box ('tfdt'). fMP4 Segments MUST use
movie-fragment relative addressing. fMP4 Segments MUST NOT use external
data references. Note that a CMAF Segment meets these requirements.
An fMP4 Segment in a Playlist containing the EXT-X-I-FRAMES-ONLY
(Section 4.3.3.6) tag MAY omit the portion of the Media Data Box
following the I-frame sample data.
Each fMP4 Segment in a Media Playlist MUST have an EXT-X-MAP tag
applied to it.
3.4. Packed Audio
A Packed Audio Segment contains encoded audio samples and ID3 tags
that are simply packed together with minimal framing and no per-
sample timestamps. Supported Packed Audio formats are AAC with ADTS
framing [ISO_13818_7]; MP3 [ISO_13818_3]; AC-3 [AC_3]; and Enhanced
AC-3 [AC_3].
A Packed Audio Segment has no Media Initialization Section.
Each Packed Audio Segment MUST signal the timestamp of its first
sample with an ID3 PRIV tag [ID3] at the beginning of the segment.
The ID3 PRIV owner identifier MUST be
"com.apple.streaming.transportStreamTimestamp". The ID3 payload MUST
be a 33-bit MPEG-2 Program Elementary Stream timestamp expressed as a
big-endian eight-octet number, with the upper 31 bits set to zero.
Clients SHOULD NOT play Packed Audio Segments without this ID3 tag.
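The timestamp payload layout described above can be illustrated
with a short Python sketch. It covers only the body of the PRIV
frame (owner identifier, a terminating zero octet, then the private
data, as laid out by [ID3]); the surrounding ID3 tag header and
frame header are omitted. The function names are hypothetical.

```python
import struct

OWNER = b"com.apple.streaming.transportStreamTimestamp"
MAX_TIMESTAMP = (1 << 33) - 1  # largest value that fits in 33 bits


def encode_timestamp_payload(timestamp: int) -> bytes:
    """Pack a 33-bit timestamp as a big-endian eight-octet number."""
    if not 0 <= timestamp <= MAX_TIMESTAMP:
        raise ValueError("timestamp does not fit in 33 bits")
    return struct.pack(">Q", timestamp)


def decode_timestamp_payload(payload: bytes) -> int:
    """Unpack the eight-octet payload, rejecting nonzero upper bits."""
    if len(payload) != 8:
        raise ValueError("payload must be exactly eight octets")
    (value,) = struct.unpack(">Q", payload)
    if value >> 33:
        raise ValueError("upper 31 bits must be zero")
    return value


def build_priv_body(timestamp: int) -> bytes:
    """Owner identifier, zero terminator, then the timestamp payload."""
    return OWNER + b"\x00" + encode_timestamp_payload(timestamp)


def parse_priv_body(body: bytes) -> int:
    """Return the timestamp from a PRIV frame body with the expected owner."""
    owner, separator, payload = body.partition(b"\x00")
    if separator != b"\x00" or owner != OWNER:
        raise ValueError("unexpected PRIV owner identifier")
    return decode_timestamp_payload(payload)


body = build_priv_body(900000)
print(parse_priv_body(body))
```

Because the owner identifier contains no zero octet, splitting the
body at the first zero octet separates it cleanly from the payload
even when the payload itself begins with zero octets.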
3.5. WebVTT
A WebVTT Segment is a section of a WebVTT [WebVTT] file. WebVTT
Segments carry subtitles.
The Media Initialization Section of a WebVTT Segment is the WebVTT
header.
Each WebVTT Segment MUST contain all subtitle cues that are intended
to be displayed during the period indicated by the segment EXTINF
duration. The start time offset and end time offset of each cue MUST
indicate the total display time for that cue, even if part of the cue
time range is outside the Segment period. A WebVTT Segment MAY
contain no cues; this indicates that no subtitles are to be displayed
during that period.
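The cue requirements above imply that a cue spanning a segment
boundary appears in every segment it overlaps, with its original
start and end times intact. The Python sketch below illustrates one
way a packager might select cues for a segment. It makes several
assumptions: cue and segment times share one timeline measured in
seconds, a segment period is treated as the half-open interval
[start, start + duration), and the Cue type and function name are
hypothetical.

```python
from typing import List, NamedTuple


class Cue(NamedTuple):
    start: float  # cue start time, in seconds
    end: float    # cue end time, in seconds
    text: str


def cues_for_segment(
    cues: List[Cue], segment_start: float, segment_duration: float
) -> List[Cue]:
    """Return every cue whose time range overlaps the segment period.

    Cues are returned unmodified, so a cue that extends beyond the
    segment keeps its full start and end times.
    """
    segment_end = segment_start + segment_duration
    return [
        cue for cue in cues
        if cue.start < segment_end and cue.end > segment_start
    ]


cues = [
    Cue(0.0, 4.0, "first"),
    Cue(8.0, 12.0, "spans the boundary at 10 s"),
    Cue(25.0, 27.0, "third"),
]
for index in range(3):
    selected = cues_for_segment(cues, index * 10.0, 10.0)
    print(index, [cue.text for cue in selected])
```

In this example the second cue is emitted in both the first and the
second segment, and a period with no overlapping cues yields an
empty list, which corresponds to a WebVTT Segment containing no
cues.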
Each WebVTT Segment MUST either start with a WebVTT header or have an
EXT-X-MAP tag applied to it.