wang-50214
wang_fm
August 23, 2001
14:22
Contents
PREFACE xxi
GLOSSARY OF NOTATIONS xxv
1 VIDEO FORMATION, PERCEPTION, AND REPRESENTATION 1
1.1 Color Perception and Specification 2
1.1.1 Light and Color, 2
1.1.2 Human Perception of Color, 3
1.1.3 The Trichromatic Theory of Color Mixture, 4
1.1.4 Color Specification by Tristimulus Values, 5
1.1.5 Color Specification by Luminance and Chrominance Attributes, 6
1.2 Video Capture and Display 7
1.2.1 Principles of Color Video Imaging, 7
1.2.2 Video Cameras, 8
1.2.3 Video Display, 10
1.2.4 Composite versus Component Video, 11
1.2.5 Gamma Correction, 11
1.3 Analog Video Raster 12
1.3.1 Progressive and Interlaced Scan, 12
1.3.2 Characterization of a Video Raster, 14
1.4 Analog Color Television Systems 16
1.4.1 Spatial and Temporal Resolution, 16
1.4.2 Color Coordinate, 17
1.4.3 Signal Bandwidth, 19
1.4.4 Multiplexing of Luminance, Chrominance, and Audio, 19
1.4.5 Analog Video Recording, 21
1.5 Digital Video 22
1.5.1 Notation, 22
1.5.2 ITU-R BT.601 Digital Video, 23
1.5.3 Other Digital Video Formats and Applications, 26
1.5.4 Digital Video Recording, 28
1.5.5 Video Quality Measure, 28
1.6 Summary 30
1.7 Problems 31
1.8 Bibliography 32
2 FOURIER ANALYSIS OF VIDEO SIGNALS AND FREQUENCY RESPONSE OF THE HUMAN VISUAL SYSTEM 33
2.1 Multidimensional Continuous-Space Signals and Systems 33
2.2 Multidimensional Discrete-Space Signals and Systems 36
2.3 Frequency Domain Characterization of Video Signals 38
2.3.1 Spatial and Temporal Frequencies, 38
2.3.2 Temporal Frequencies Caused by Linear Motion, 40
2.4 Frequency Response of the Human Visual System 42
2.4.1 Temporal Frequency Response and Flicker Perception, 43
2.4.2 Spatial Frequency Response, 45
2.4.3 Spatiotemporal Frequency Response, 46
2.4.4 Smooth Pursuit Eye Movement, 48
2.5 Summary 50
2.6 Problems 51
2.7 Bibliography 52
3 VIDEO SAMPLING 53
3.1 Basics of the Lattice Theory 54
3.2 Sampling over Lattices 59
3.2.1 Sampling Process and Sampled-Space Fourier Transform, 60
3.2.2 The Generalized Nyquist Sampling Theorem, 61
3.2.3 Sampling Efficiency, 63
3.2.4 Implementation of the Prefilter and Reconstruction Filter, 65
3.2.5 Relation between Fourier Transforms over Continuous, Discrete, and Sampled Spaces, 66
3.3 Sampling of Video Signals 67
3.3.1 Required Sampling Rates, 67
3.3.2 Sampling Video in Two Dimensions: Progressive versus Interlaced Scans, 69
3.3.3 Sampling a Raster Scan: BT.601 Format Revisited, 71
3.3.4 Sampling Video in Three Dimensions, 72
3.3.5 Spatial and Temporal Aliasing, 73
3.4 Filtering Operations in Cameras and Display Devices 76
3.4.1 Camera Apertures, 76
3.4.2 Display Apertures, 79
3.5 Summary 80
3.6 Problems 80
3.7 Bibliography 83
4 VIDEO SAMPLING RATE CONVERSION 84
4.1 Conversion of Signals Sampled on Different Lattices 84
4.1.1 Up-Conversion, 85
4.1.2 Down-Conversion, 87
4.1.3 Conversion between Arbitrary Lattices, 89
4.1.4 Filter Implementation and Design, and Other Interpolation Approaches, 91
4.2 Sampling Rate Conversion of Video Signals 92
4.2.1 Deinterlacing, 93
4.2.2 Conversion between PAL and NTSC Signals, 98
4.2.3 Motion-Adaptive Interpolation, 104
4.3 Summary 105
4.4 Problems 106
4.5 Bibliography 109
5 VIDEO MODELING 111
5.1 Camera Model 112
5.1.1 Pinhole Model, 112
5.1.2 CAHV Model, 114
5.1.3 Camera Motions, 116
5.2 Illumination Model 116
5.2.1 Diffuse and Specular Reflection, 116
5.2.2 Radiance Distribution under Differing Illumination and Reflection Conditions, 117
5.2.3 Changes in the Image Function Due to Object Motion, 119
5.3 Object Model 120
5.3.1 Shape Model, 121
5.3.2 Motion Model, 122
5.4 Scene Model 125
5.5 Two-Dimensional Motion Models 128
5.5.1 Definition and Notation, 128
5.5.2 Two-Dimensional Motion Models Corresponding to Typical Camera Motions, 130
5.5.3 Two-Dimensional Motion Corresponding to Three-Dimensional Rigid Motion, 133
5.5.4 Approximations of Projective Mapping, 136
5.6 Summary 137
5.7 Problems 138
5.8 Bibliography 139
6 TWO-DIMENSIONAL MOTION ESTIMATION 141
6.1 Optical Flow 142
6.1.1 Two-Dimensional Motion versus Optical Flow, 142
6.1.2 Optical Flow Equation and Ambiguity in Motion Estimation, 143
6.2 General Methodologies 145
6.2.1 Motion Representation, 146
6.2.2 Motion Estimation Criteria, 147
6.2.3 Optimization Methods, 151
6.3 Pixel-Based Motion Estimation 152
6.3.1 Regularization Using the Motion Smoothness Constraint, 153
6.3.2 Using a Multipoint Neighborhood, 153
6.3.3 Pel-Recursive Methods, 154
6.4 Block-Matching Algorithm 154
6.4.1 The Exhaustive Block-Matching Algorithm, 155
6.4.2 Fractional Accuracy Search, 157
6.4.3 Fast Algorithms, 159
6.4.4 Imposing Motion Smoothness Constraints, 161
6.4.5 Phase Correlation Method, 162
6.4.6 Binary Feature Matching, 163
6.5 Deformable Block-Matching Algorithms 165
6.5.1 Node-Based Motion Representation, 166
6.5.2 Motion Estimation Using the Node-Based Model, 167
6.6 Mesh-Based Motion Estimation 169
6.6.1 Mesh-Based Motion Representation, 171
6.6.2 Motion Estimation Using the Mesh-Based Model, 173
6.7 Global Motion Estimation 177
6.7.1 Robust Estimators, 177
6.7.2 Direct Estimation, 178
6.7.3 Indirect Estimation, 178
6.8 Region-Based Motion Estimation 179
6.8.1 Motion-Based Region Segmentation, 180
6.8.2 Joint Region Segmentation and Motion Estimation, 181
6.9 Multiresolution Motion Estimation 182
6.9.1 General Formulation, 182
6.9.2 Hierarchical Block Matching Algorithm, 184
6.10 Application of Motion Estimation in Video Coding 187
6.11 Summary 188
6.12 Problems 189
6.13 Bibliography 191
7 THREE-DIMENSIONAL MOTION ESTIMATION 194
7.1 Feature-Based Motion Estimation 195
7.1.1 Objects of Known Shape under Orthographic Projection, 195
7.1.2 Objects of Known Shape under Perspective Projection, 196
7.1.3 Planar Objects, 197
7.1.4 Objects of Unknown Shape Using the Epipolar Line, 198
7.2 Direct Motion Estimation 203
7.2.1 Image Signal Models and Motion, 204
7.2.2 Objects of Known Shape, 206
7.2.3 Planar Objects, 207
7.2.4 Robust Estimation, 209
7.3 Iterative Motion Estimation 212
7.4 Summary 213
7.5 Problems 214
7.6 Bibliography 215
8 FOUNDATIONS OF VIDEO CODING 217
8.1 Overview of Coding Systems 218
8.1.1 General Framework, 218
8.1.2 Categorization of Video Coding Schemes, 219
8.2 Basic Notions in Probability and Information Theory 221
8.2.1 Characterization of Stationary Sources, 221
8.2.2 Entropy and Mutual Information for Discrete Sources, 222
8.2.3 Entropy and Mutual Information for Continuous Sources, 226
8.3 Information Theory for Source Coding 227
8.3.1 Bound for Lossless Coding, 227
8.3.2 Bound for Lossy Coding, 229
8.3.3 Rate-Distortion Bounds for Gaussian Sources, 232
8.4 Binary Encoding 234
8.4.1 Huffman Coding, 235
8.4.2 Arithmetic Coding, 238
8.5 Scalar Quantization 241
8.5.1 Fundamentals, 241
8.5.2 Uniform Quantization, 243
8.5.3 Optimal Scalar Quantizer, 244
8.6 Vector Quantization 248
8.6.1 Fundamentals, 248
8.6.2 Lattice Vector Quantizer, 251
8.6.3 Optimal Vector Quantizer, 253
8.6.4 Entropy-Constrained Optimal Quantizer Design, 255
8.7 Summary 257
8.8 Problems 259
8.9 Bibliography 261
9 WAVEFORM-BASED VIDEO CODING 263
9.1 Block-Based Transform Coding 263
9.1.1 Overview, 264
9.1.2 One-Dimensional Unitary Transform, 266
9.1.3 Two-Dimensional Unitary Transform, 269
9.1.4 The Discrete Cosine Transform, 271
9.1.5 Bit Allocation and Transform Coding Gain, 273
9.1.6 Optimal Transform Design and the KLT, 279
9.1.7 DCT-Based Image Coders and the JPEG Standard, 281
9.1.8 Vector Transform Coding, 284
9.2 Predictive Coding 285
9.2.1 Overview, 285
9.2.2 Optimal Predictor Design and Predictive Coding Gain, 286
9.2.3 Spatial-Domain Linear Prediction, 290
9.2.4 Motion-Compensated Temporal Prediction, 291
9.3 Video Coding Using Temporal Prediction and Transform Coding 293
9.3.1 Block-Based Hybrid Video Coding, 293
9.3.2 Overlapped Block Motion Compensation, 296
9.3.3 Coding Parameter Selection, 299
9.3.4 Rate Control, 302
9.3.5 Loop Filtering, 305
9.4 Summary 308
9.5 Problems 309
9.6 Bibliography 311
10 CONTENT-DEPENDENT VIDEO CODING 314
10.1 Two-Dimensional Shape Coding 314
10.1.1 Bitmap Coding, 315
10.1.2 Contour Coding, 318
10.1.3 Evaluation Criteria for Shape Coding Efficiency, 323
10.2 Texture Coding for Arbitrarily Shaped Regions 324
10.2.1 Texture Extrapolation, 324
10.2.2 Direct Texture Coding, 325
10.3 Joint Shape and Texture Coding 326
10.4 Region-Based Video Coding 327
10.5 Object-Based Video Coding 328
10.5.1 Source Model F2D, 330
10.5.2 Source Models R3D and F3D, 332
10.6 Knowledge-Based Video Coding 336
10.7 Semantic Video Coding 338
10.8 Layered Coding System 339
10.9 Summary 342
10.10 Problems 343
10.11 Bibliography 344
11 SCALABLE VIDEO CODING 349
11.1 Basic Modes of Scalability 350
11.1.1 Quality Scalability, 350
11.1.2 Spatial Scalability, 353
11.1.3 Temporal Scalability, 356
11.1.4 Frequency Scalability, 356
11.1.5 Combination of Basic Schemes, 357
11.1.6 Fine-Granularity Scalability, 357
11.2 Object-Based Scalability 359
11.3 Wavelet-Transform-Based Coding 361
11.3.1 Wavelet Coding of Still Images, 363
11.3.2 Wavelet Coding of Video, 367
11.4 Summary 370
11.5 Problems 370
11.6 Bibliography 371
12 STEREO AND MULTIVIEW SEQUENCE PROCESSING 374
12.1 Depth Perception 375
12.1.1 Binocular Cues—Stereopsis, 375
12.1.2 Visual Sensitivity Thresholds for Depth Perception, 375
12.2 Stereo Imaging Principle 377
12.2.1 Arbitrary Camera Configuration, 377
12.2.2 Parallel Camera Configuration, 379
12.2.3 Converging Camera Configuration, 381
12.2.4 Epipolar Geometry, 383
12.3 Disparity Estimation 385
12.3.1 Constraints on Disparity Distribution, 386
12.3.2 Models for the Disparity Function, 387
12.3.3 Block-Based Approach, 388
12.3.4 Two-Dimensional Mesh-Based Approach, 388
12.3.5 Intra-Line Edge Matching Using Dynamic Programming, 391
12.3.6 Joint Structure and Motion Estimation, 392
12.4 Intermediate View Synthesis 393
12.5 Stereo Sequence Coding 396
12.5.1 Block-Based Coding and MPEG-2 Multiview Profile, 396
12.5.2 Incomplete Three-Dimensional Representation of Multiview Sequences, 398
12.5.3 Mixed-Resolution Coding, 398
12.5.4 Three-Dimensional Object-Based Coding, 399
12.5.5 Three-Dimensional Model-Based Coding, 400
12.6 Summary 400
12.7 Problems 402
12.8 Bibliography 403