INTERNATIONAL TELECOMMUNICATION UNION
CCITT
THE INTERNATIONAL
TELEGRAPH AND TELEPHONE
CONSULTATIVE COMMITTEE
T.81
(09/92)
TERMINAL EQUIPMENT AND PROTOCOLS
FOR TELEMATIC SERVICES
INFORMATION TECHNOLOGY –
DIGITAL COMPRESSION AND CODING
OF CONTINUOUS-TONE STILL IMAGES –
REQUIREMENTS AND GUIDELINES
Recommendation T.81
Foreword
ITU (International Telecommunication Union) is the United Nations Specialized Agency in the field of
telecommunications. The CCITT (the International Telegraph and Telephone Consultative Committee) is a permanent
organ of the ITU. Some 166 member countries, 68 telecom operating entities, 163 scientific and industrial organizations
and 39 international organizations participate in CCITT which is the body which sets world telecommunications
standards (Recommendations).
The approval of Recommendations by the members of CCITT is covered by the procedure laid down in CCITT Resolution
No. 2 (Melbourne, 1988). In addition, the Plenary Assembly of CCITT, which meets every four years, approves
Recommendations submitted to it and establishes the study programme for the following period.
In some areas of information technology, which fall within CCITT’s purview, the necessary standards are prepared on a
collaborative basis with ISO and IEC. The text of CCITT Recommendation T.81 was approved on 18th September 1992.
The identical text is also published as ISO/IEC International Standard 10918-1.
___________________
CCITT NOTE
In this Recommendation, the expression “Administration” is used for conciseness to indicate both a telecommunication
administration and a recognized private operating agency.
All rights reserved. No part of this publication may be reproduced or utilized in any form or by any means, electronic or
mechanical, including photocopying and microfilm, without permission in writing from the ITU.
© ITU 1993
Contents
Introduction
1 Scope
2 Normative references
3 Definitions, abbreviations and symbols
4 General
5 Interchange format requirements
6 Encoder requirements
7 Decoder requirements
Annex A – Mathematical definitions
Annex B – Compressed data formats
Annex C – Huffman table specification
Annex D – Arithmetic coding
Annex E – Encoder and decoder control procedures
Annex F – Sequential DCT-based mode of operation
Annex G – Progressive DCT-based mode of operation
Annex H – Lossless mode of operation
Annex J – Hierarchical mode of operation
Annex K – Examples and guidelines
Annex L – Patents
Annex M – Bibliography
CCITT Rec. T.81 (1992 E)
Introduction
This CCITT Recommendation | ISO/IEC International Standard was prepared by CCITT Study Group VIII and the Joint
Photographic Experts Group (JPEG) of ISO/IEC JTC 1/SC 29/WG 10. This Experts Group was formed in 1986 to
establish a standard for the sequential progressive encoding of continuous tone grayscale and colour images.
Digital Compression and Coding of Continuous-tone Still images, is published in two parts:
– Requirements and guidelines;
– Compliance testing.
This part, Part 1, sets out requirements and implementation guidelines for continuous-tone still image encoding and
decoding processes, and for the coded representation of compressed image data for interchange between applications.
These processes and representations are intended to be generic, that is, to be applicable to a broad range of applications for
colour and grayscale still images within communications and computer systems. Part 2 sets out tests for determining
whether implementations comply with the requirements for the various encoding and decoding processes specified in Part 1.
The user’s attention is called to the possibility that – for some of the coding processes specified herein – compliance with
this Recommendation | International Standard may require use of an invention covered by patent rights. See Annex L for
further information.
The requirements which these processes must satisfy to be useful for specific image communications applications such as
facsimile, Videotex and audiographic conferencing are defined in CCITT Recommendation T.80. The intent is that the
generic processes of Recommendation T.80 will be incorporated into the various CCITT Recommendations for terminal
equipment for these applications.
In addition to the applications addressed by the CCITT and ISO/IEC, the JPEG committee has developed a compression
standard to meet the needs of other applications as well, including desktop publishing, graphic arts, medical imaging and
scientific imaging.
Annexes A, B, C, D, E, F, G, H and J are normative, and thus form an integral part of this Specification. Annexes K, L
and M are informative and thus do not form an integral part of this Specification.
This Specification aims to follow the guidelines of CCITT and ISO/IEC JTC 1 on Rules for presentation of CCITT |
ISO/IEC common text.
INTERNATIONAL STANDARD ISO/IEC 10918-1 : 1993(E)
CCITT RECOMMENDATION CCITT Rec. T.81 (1992 E)
INFORMATION TECHNOLOGY – DIGITAL COMPRESSION
AND CODING OF CONTINUOUS-TONE STILL IMAGES –
REQUIREMENTS AND GUIDELINES
1 Scope
This CCITT Recommendation | International Standard is applicable to continuous-tone – grayscale or colour – digital still
image data. It is applicable to a wide range of applications which require use of compressed images. It is not applicable to
bi-level image data.
This Specification
– specifies processes for converting source image data to compressed image data;
– specifies processes for converting compressed image data to reconstructed image data;
– gives guidance on how to implement these processes in practice;
– specifies coded representations for compressed image data.
NOTE – This Specification does not specify a complete coded image representation. Such representations may include
certain parameters, such as aspect ratio, component sample registration, and colour space designation, which are
application-dependent.
2 Normative references
The following CCITT Recommendations and International Standards contain provisions which, through reference in this
text, constitute provisions of this CCITT Recommendation | International Standard. At the time of publication, the
editions indicated were valid. All Recommendations and Standards are subject to revision, and parties to agreements
based on this CCITT Recommendation | International Standard are encouraged to investigate the possibility of applying
the most recent edition of the Recommendations and Standards listed below. Members of IEC and ISO maintain registers
of currently valid International Standards. The CCITT Secretariat maintains a list of currently valid CCITT
Recommendations.
– CCITT Recommendation T.80 (1992), Common components for image compression and communication –
Basic principles.
3 Definitions, abbreviations and symbols
3.1 Definitions and abbreviations
For the purposes of this Specification, the following definitions apply.
3.1.1 abbreviated format: A representation of compressed image data which is missing some or all of the table
specifications required for decoding, or a representation of table-specification data without frame headers, scan headers,
and entropy-coded segments.
3.1.2 AC coefficient: Any DCT coefficient for which the frequency is not zero in at least one dimension.
3.1.3 (adaptive) (binary) arithmetic decoding: An entropy decoding procedure which recovers the sequence of
symbols from the sequence of bits produced by the arithmetic encoder.
3.1.4 (adaptive) (binary) arithmetic encoding: An entropy encoding procedure which codes by means of a recursive
subdivision of the probability of the sequence of symbols coded up to that point.
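The recursive subdivision named in 3.1.4 can be illustrated by the following minimal sketch. It assumes a fixed probability for the symbol 0 and exact rational arithmetic. It is not the adaptive binary coder specified in Annex D, and the names subdivide and p_zero are hypothetical.

```python
from fractions import Fraction

def subdivide(symbols, p_zero=Fraction(3, 4)):
    """Return the (low, width) interval reached after coding binary symbols.

    Assumption: symbol 0 has the fixed probability p_zero (no adaptation).
    """
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        zero_width = width * p_zero
        if s == 0:
            width = zero_width        # keep the lower sub-interval
        else:
            low += zero_width         # move to the upper sub-interval
            width -= zero_width
    return low, width

low, width = subdivide([0, 0, 1, 0])
```

The final width equals the product of the probabilities of the coded symbols, so a more probable sequence is left with a wider interval and needs fewer bits to identify.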
3.1.5 application environment: The standards for data representation, communication, or storage which have been
established for a particular application.
3.1.6 arithmetic decoder: An embodiment of arithmetic decoding procedure.
3.1.7 arithmetic encoder: An embodiment of arithmetic encoding procedure.
3.1.8 baseline (sequential): A particular sequential DCT-based encoding and decoding process specified in this
Specification, and which is required for all DCT-based decoding processes.
3.1.9 binary decision: Choice between two alternatives.
3.1.10 bit stream: Partially encoded or decoded sequence of bits comprising an entropy-coded segment.
3.1.11 block: An 8 × 8 array of samples or an 8 × 8 array of DCT coefficient values of one component.
3.1.12 block-row: A sequence of eight contiguous component lines which are partitioned into 8 × 8 blocks.
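As an illustration of 3.1.11 and 3.1.12, the following minimal sketch partitions a component into 8 × 8 blocks, one block-row at a time. It assumes that the component dimensions are already multiples of eight. The function name split_into_blocks is hypothetical.

```python
def split_into_blocks(component, n=8):
    """Partition a 2-D list of samples into n x n blocks, block-row by block-row.

    Assumption: the number of lines and of samples per line are multiples of n.
    """
    rows, cols = len(component), len(component[0])
    blocks = []
    for top in range(0, rows, n):          # one block-row = n contiguous lines
        for left in range(0, cols, n):
            blocks.append([line[left:left + n] for line in component[top:top + n]])
    return blocks

# A 16 x 16 test component whose sample value encodes its position.
component = [[16 * r + c for c in range(16)] for r in range(16)]
blocks = split_into_blocks(component)
```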
3.1.13 byte: A group of 8 bits.
3.1.14 byte stuffing: A procedure in which either the Huffman coder or the arithmetic coder inserts a zero byte into
the entropy-coded segment following the generation of an encoded hexadecimal X’FF’ byte.
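The byte stuffing procedure of 3.1.14 can be sketched as follows. The sketch assumes that the input is the raw output of the entropy coder and contains no markers. The function names stuff_bytes and unstuff_bytes are hypothetical.

```python
def stuff_bytes(segment: bytes) -> bytes:
    """Insert a zero byte after every X'FF' byte of an entropy-coded segment."""
    out = bytearray()
    for b in segment:
        out.append(b)
        if b == 0xFF:
            out.append(0x00)
    return bytes(out)

def unstuff_bytes(segment: bytes) -> bytes:
    """Remove the stuffed zero bytes (assumes every X'FF' is followed by X'00')."""
    return segment.replace(b"\xff\x00", b"\xff")

stuffed = stuff_bytes(b"\x12\xff\x34\xff\x00")
```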
3.1.15 carry bit: A bit in the arithmetic encoder code register which is set if a carry-over in the code register overflows
the eight bits reserved for the output byte.
3.1.16 ceiling function: The mathematical procedure in which the greatest integer value of a real number is obtained
by selecting the smallest integer value which is greater than or equal to the real number.
3.1.17 class (of coding process): Lossy or lossless coding processes.
3.1.18 code register: The arithmetic encoder register containing the least significant bits of the partially completed
entropy-coded segment. Alternatively, the arithmetic decoder register containing the most significant bits of a partially
decoded entropy-coded segment.
3.1.19 coder: An embodiment of a coding process.
3.1.20 coding: Encoding or decoding.
3.1.21 coding model: A procedure used to convert input data into symbols to be coded.
3.1.22 (coding) process: A general term for referring to an encoding process, a decoding process, or both.
3.1.23 colour image: A continuous-tone image that has more than one component.
3.1.24 columns: Samples per line in a component.
3.1.25 component: One of the two-dimensional arrays which comprise an image.
3.1.26 compressed data: Either compressed image data or table specification data or both.
3.1.27 compressed image data: A coded representation of an image, as specified in this Specification.
3.1.28 compression: Reduction in the number of bits used to represent source image data.
3.1.29 conditional exchange: The interchange of MPS and LPS probability intervals whenever the size of the LPS
interval is greater than the size of the MPS interval (in arithmetic coding).
3.1.30 (conditional) probability estimate: The probability value assigned to the LPS by the probability estimation
state machine (in arithmetic coding).
3.1.31 conditioning table: The set of parameters which select one of the defined relationships between prior coding
decisions and the conditional probability estimates used in arithmetic coding.
3.1.32 context: The set of previously coded binary decisions which is used to create the index to the probability
estimation state machine (in arithmetic coding).
3.1.33 continuous-tone image: An image whose components have more than one bit per sample.
3.1.34 data unit: An 8 × 8 block of samples of one component in DCT-based processes; a sample in lossless processes.
3.1.35 DC coefficient: The DCT coefficient for which the frequency is zero in both dimensions.
3.1.36 DC prediction: The procedure used by DCT-based encoders whereby the quantized DC coefficient from the
previously encoded 8 × 8 block of the same component is subtracted from the current quantized DC coefficient.
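The subtraction described in 3.1.36 can be sketched as follows. The sketch assumes that the prediction for the first block is zero, which is not stated in the definition itself. The function names dc_differences and dc_reconstruct are hypothetical.

```python
def dc_differences(quantized_dc, initial_prediction=0):
    """Encoder side: replace each quantized DC value by its difference from
    the quantized DC value of the previously encoded block."""
    prediction = initial_prediction    # assumption: prediction starts at zero
    diffs = []
    for dc in quantized_dc:
        diffs.append(dc - prediction)
        prediction = dc
    return diffs

def dc_reconstruct(diffs, initial_prediction=0):
    """Decoder side: accumulate the differences to recover the DC values."""
    prediction = initial_prediction
    values = []
    for d in diffs:
        prediction += d
        values.append(prediction)
    return values

diffs = dc_differences([50, 52, 49, 49])
```

Because neighbouring blocks tend to have similar DC values, the differences are usually small in magnitude and cheaper to entropy-code than the values themselves.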
3.1.37 (DCT) coefficient: The amplitude of a specific cosine basis function – may refer to an original DCT coefficient,
to a quantized DCT coefficient, or to a dequantized DCT coefficient.
3.1.38 decoder: An embodiment of a decoding process.
3.1.39 decoding process: A process which takes as its input compressed image data and outputs a continuous-tone
image.
3.1.40 default conditioning: The values defined for the arithmetic coding conditioning tables at the beginning of
coding of an image.
3.1.41 dequantization: The inverse procedure to quantization by which the decoder recovers a representation of the
DCT coefficients.
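A minimal sketch of dequantization (3.1.41) follows. It assumes that each quantized coefficient is multiplied by the corresponding element of a quantization table. The table values shown are invented for illustration, and the function name dequantize is hypothetical.

```python
def dequantize(quantized_block, quant_table):
    """Multiply each quantized DCT coefficient by its quantization table element.

    Assumption: both arguments are 8 x 8 lists of integers in the same order.
    """
    return [[sq * q for sq, q in zip(sq_row, q_row)]
            for sq_row, q_row in zip(quantized_block, quant_table)]

# Illustrative data only: a mostly-zero block and a made-up table.
quantized = [[0] * 8 for _ in range(8)]
quantized[0][0], quantized[0][1] = 5, -2
table = [[16] * 8 for _ in range(8)]
table[0][1] = 11
dequantized = dequantize(quantized, table)
```

The result is only a representation of the original coefficients, since the rounding performed during quantization cannot be undone.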
3.1.42 differential component: The difference between an input component derived from the source image and the
corresponding reference component derived from the preceding frame for that component (in hierarchical mode coding).
3.1.43 differential frame: A frame in a hierarchical process in which differential components are either encoded or
decoded.
3.1.44 (digital) reconstructed image (data): A continuous-tone image which is the output of any decoder defined in
this Specification.
3.1.45 (digital) source image (data): A continuous-tone image used as input to any encoder defined in this
Specification.
3.1.46 (digital) (still) image: A set of two-dimensional arrays of integer data.
3.1.47 discrete cosine transform; DCT: Either the forward discrete cosine transform or the inverse discrete cosine
transform.
3.1.48 downsampling (filter): A procedure by which the spatial resolution of an image is reduced (in hierarchical
mode coding).
3.1.49 encoder: An embodiment of an encoding process.
3.1.50 encoding process: A process which takes as its input a continuous-tone image and outputs compressed image
data.
3.1.51 entropy-coded (data) segment: An independently decodable sequence of entropy encoded bytes of compressed
image data.
3.1.52 (entropy-coded segment) pointer: The variable which points to the most recently placed (or fetched) byte in
the entropy encoded segment.
3.1.53 entropy decoder: An embodiment of an entropy decoding procedure.
3.1.54 entropy decoding: A lossless procedure which recovers the sequence of symbols from the sequence of bits
produced by the entropy encoder.
3.1.55 entropy encoder: An embodiment of an entropy encoding procedure.
3.1.56 entropy encoding: A lossless procedure which converts a sequence of input symbols into a sequence of bits
such that the average number of bits per symbol approaches the entropy of the input symbols.
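The entropy referred to in 3.1.56 can be estimated as in the following minimal sketch. It assumes that the symbols are independent and that their probabilities are taken from observed frequencies. The function name entropy_bits_per_symbol is hypothetical.

```python
import math
from collections import Counter

def entropy_bits_per_symbol(symbols):
    """Return the Shannon entropy, in bits per symbol, of a symbol sequence.

    Assumption: probabilities are estimated from the frequencies in the input.
    """
    counts = Counter(symbols)
    total = len(symbols)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

h_uniform = entropy_bits_per_symbol("abcd")      # four equally likely symbols
h_skewed = entropy_bits_per_symbol("aaaaaaab")   # one dominant symbol
```

A skewed distribution has a lower entropy than a uniform one, which is why an entropy encoder can spend fewer bits per symbol on it.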
3.1.57 extended (DCT-based) process: A descriptive term for DCT-based encoding and decoding processes in which
additional capabilities are added to the baseline sequential process.
3.1.58 forward discrete cosine transform; FDCT: A mathematical transformation using cosine basis functions which
converts a block of samples into a corresponding block of original DCT coefficients.
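The transformation of 3.1.58 can be sketched by direct evaluation of the 8 × 8 cosine sums. The sketch assumes the usual normalization (a factor of 1/4, with 1/√2 applied at zero frequency in each dimension). The normative equations are those of Annex A. The function name fdct_8x8 is hypothetical, and no attempt is made at a fast algorithm.

```python
import math

def fdct_8x8(samples):
    """Direct (slow) 8 x 8 forward DCT of a block given as samples[y][x].

    Assumption: normalization 1/4 * C(u) * C(v), with C(0) = 1/sqrt(2), else 1.
    """
    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0

    coeffs = [[0.0] * 8 for _ in range(8)]
    for v in range(8):                 # vertical frequency
        for u in range(8):             # horizontal frequency
            total = 0.0
            for y in range(8):
                for x in range(8):
                    total += (samples[y][x]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[v][u] = 0.25 * c(u) * c(v) * total
    return coeffs

flat_block = [[10] * 8 for _ in range(8)]
coeffs = fdct_8x8(flat_block)
```

For a block of constant samples, all of the energy falls into the DC coefficient and every AC coefficient is zero up to rounding error.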
3.1.59 frame: A group of one or more scans (all using the same DCT-based or lossless process) through the data of one
or more of the components in an image.
3.1.60 frame header: A marker segment that contains a start-of-frame marker and associated frame parameters that are
coded at the beginning of a frame.
3.1.61 frequency: A two-dimensional index into the two-dimensional array of DCT coefficients.
3.1.62 (frequency) band: A contiguous group of coefficients from the zig-zag sequence (in progressive mode coding).
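A frequency band in the sense of 3.1.62 can be illustrated with the following minimal sketch. It assumes the conventional zig-zag ordering of an 8 × 8 coefficient array, which is defined elsewhere in this Specification and not in this clause. The names ZIGZAG and band are hypothetical.

```python
# (row, column) positions of the 64 coefficients in zig-zag order.
# Assumption: odd anti-diagonals run downwards, even ones run upwards.
ZIGZAG = sorted(
    ((r, c) for r in range(8) for c in range(8)),
    key=lambda rc: (rc[0] + rc[1],
                    rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]),
)

def band(start, end):
    """Return the positions of the contiguous zig-zag indices start..end."""
    return ZIGZAG[start:end + 1]

low_ac_band = band(1, 5)   # the first five AC coefficients
```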
3.1.63 full progression: A process which uses both spectral selection and successive approximation (in progressive
mode coding).
3.1.64 grayscale image: A continuous-tone image that has only one component.
3.1.65 hierarchical: A mode of operation for coding an image in which the first frame for a given component is
followed by frames which code the differences between the source data and the reconstructed data from the previous
frame for that component. Resolution changes are allowed between frames.
3.1.66 hierarchical decoder: A sequence of decoder processes in which the first frame for each component is followed
by frames which decode an array of differences for each component and adds it to the reconstructed data from the
preceding frame for that component.
3.1.67 hierarchical encoder: The mode of operation in which the first frame for each component is followed by frames
which encode the array of differences between the source data and the reconstructed data from the preceding frame for
that component.
3.1.68 horizontal sampling factor: The relative number of horizontal data units of a particular component with respect
to the number of horizontal data units in the other components.
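The effect of a horizontal sampling factor (3.1.68) on the width of a component can be sketched as follows. The sketch assumes that the number of columns of a component is the ceiling of the image width scaled by the ratio of that component's factor to the largest factor in the image. This relationship belongs to Annex A and is treated here as an assumption. The function name component_columns is hypothetical.

```python
def component_columns(image_columns, h, h_max):
    """Return ceil(image_columns * h / h_max) using integer arithmetic.

    Assumption: h is the component's horizontal sampling factor and h_max is
    the largest horizontal sampling factor among all components.
    """
    return (image_columns * h + h_max - 1) // h_max

full_width = component_columns(33, 2, 2)   # component with the largest factor
half_width = component_columns(33, 1, 2)   # component sampled at half the rate
```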
3.1.69 Huffman decoder: An embodiment of a Huffman decoding procedure.
3.1.70 Huffman decoding: An entropy decoding procedure which recovers the symbol from each variable length code
produced by the Huffman encoder.
3.1.71 Huffman encoder: An embodiment of a Huffman encoding procedure.
3.1.72 Huffman encoding: An entropy encoding procedure which assigns a variable length code to each input symbol.
3.1.73 Huffman table: The set of variable length codes required in a Huffman encoder and Huffman decoder.
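The assignment of variable length codes described in 3.1.72 and 3.1.73 can be illustrated with the following generic sketch. It builds code lengths from symbol frequencies and then assigns codes in order of increasing length. It does not follow the table-specification procedures of Annex C, and the names huffman_code_lengths and assign_codes are hypothetical.

```python
import heapq
from collections import Counter

def huffman_code_lengths(symbols):
    """Return {symbol: code length} for a generic Huffman code."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap items: (weight, unique tie-breaker, {symbol: depth so far}).
    heap = [(n, i, {s: 0}) for i, (s, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def assign_codes(lengths):
    """Assign prefix-free bit strings, shortest codes first."""
    codes, code, prev_len = {}, 0, 0
    for sym, length in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= length - prev_len
        codes[sym] = format(code, "0{}b".format(length))
        code += 1
        prev_len = length
    return codes

lengths = huffman_code_lengths("aaaabbc")
codes = assign_codes(lengths)
```

The most frequent symbol receives the shortest code, and no code is a prefix of another, so the decoder can recover each symbol without separators.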
3.1.74 image data: Either source image data or reconstructed image data.
3.1.75 interchange format: The representation of compressed image data for exchange between application
environments.
3.1.76 interleaved: The descriptive term applied to the repetitive multiplexing of small groups of data units from each
component in a scan in a specific order.
3.1.77 inverse discrete cosine transform; IDCT: A mathematical transformation using cosine basis functions which
converts a block of dequantized DCT coefficients into a corresponding block of samples.
3.1.78 Joint Photographic Experts Group; JPEG: The informal name of the committee which created this
Specification. The “joint” comes from the CCITT and ISO/IEC collaboration.
3.1.79 latent output: Output of the arithmetic encoder which is held, pending resolution of carry-over (in arithmetic
coding).
3.1.80 less probable symbol; LPS: For a binary decision, the decision value which has the smaller probability.
3.1.81 level shift: A procedure used by DCT-based encoders and decoders whereby each input sample is either
converted from an unsigned representation to a two’s complement representation or from a two’s complement
representation to an unsigned representation.
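The level shift of 3.1.81 can be sketched as follows. The sketch assumes that the conversion is performed by subtracting (or adding back) half of the sample range, that is 2^(P-1) for samples of P-bit precision. The function names level_shift and inverse_level_shift are hypothetical.

```python
def level_shift(sample, precision=8):
    """Encoder side: map an unsigned P-bit sample to a signed value.

    Assumption: the shift amount is 2**(precision - 1).
    """
    return sample - (1 << (precision - 1))

def inverse_level_shift(value, precision=8):
    """Decoder side: map a signed value back to the unsigned range."""
    return value + (1 << (precision - 1))

shifted = [level_shift(s) for s in (0, 128, 255)]
```

With 8-bit samples the unsigned range 0 to 255 becomes the signed range -128 to 127, which centres the data on zero before the FDCT is applied.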