Preface
Contents
Part I Introduction and Multimedia Data Representations
1 Introduction to Multimedia
1.1 What is Multimedia?
1.1.1 Components of Multimedia
1.2 Multimedia: Past and Present
1.2.1 Early History of Multimedia
1.2.2 Hypermedia, WWW, and Internet
1.2.3 Multimedia in the New Millennium
1.3 Multimedia Software Tools: A Quick Scan
1.3.1 Music Sequencing and Notation
1.3.2 Digital Audio
1.3.3 Graphics and Image Editing
1.3.4 Video Editing
1.3.5 Animation
1.3.6 Multimedia Authoring
1.4 Multimedia in the Future
1.5 Exercises
2 A Taste of Multimedia
2.1 Multimedia Tasks and Concerns
2.2 Multimedia Presentation
2.3 Data Compression
2.4 Multimedia Production
2.5 Multimedia Sharing and Distribution
2.6 Some Useful Editing and Authoring Tools
2.6.1 Adobe Premiere
2.6.2 Adobe Director
2.6.3 Adobe Flash
2.7 Exercises
3 Graphics and Image Data Representations
3.1 Graphics/Image Data Types
3.1.1 1-Bit Images
3.1.2 8-Bit Gray-Level Images
3.1.3 Image Data Types
3.1.4 24-Bit Color Images
3.1.5 Higher Bit-Depth Images
3.1.6 8-Bit Color Images
3.1.7 Color Lookup Tables
3.2 Popular File Formats
3.2.1 GIF
3.2.2 JPEG
3.2.3 PNG
3.2.4 TIFF
3.2.5 Windows BMP
3.2.6 Windows WMF
3.2.7 Netpbm Format
3.2.8 EXIF
3.2.9 PS and PDF
3.2.10 PTM
3.3 Exercises
4 Color in Image and Video
4.1 Color Science
4.1.1 Light and Spectra
4.1.2 Human Vision
4.1.3 Spectral Sensitivity of the Eye
4.1.4 Image Formation
4.1.5 Camera Systems
4.1.6 Gamma Correction
4.1.7 Color-Matching Functions
4.1.8 CIE Chromaticity Diagram
4.1.9 Color Monitor Specifications
4.1.10 Out-of-Gamut Colors
4.1.11 White Point Correction
4.1.12 XYZ to RGB Transform
4.1.13 Transform with Gamma Correction
4.1.14 L*a*b* (CIELAB) Color Model
4.1.15 More Color Coordinate Schemes
4.1.16 Munsell Color Naming System
4.2 Color Models in Images
4.2.1 RGB Color Model for Displays
4.2.2 Multisensor Cameras
4.2.3 Camera-Dependent Color
4.2.4 Subtractive Color: CMY Color Model
4.2.5 Transformation from RGB to CMY
4.2.6 Undercolor Removal: CMYK System
4.2.7 Printer Gamuts
4.2.8 Multi-ink Printers
4.3 Color Models in Video
4.3.1 Video Color Transforms
4.3.2 YUV Color Model
4.3.3 YIQ Color Model
4.3.4 YCbCr Color Model
4.4 Exercises
5 Fundamental Concepts in Video
5.1 Analog Video
5.1.1 NTSC Video
5.1.2 PAL Video
5.1.3 SECAM Video
5.2 Digital Video
5.2.1 Chroma Subsampling
5.2.2 CCIR and ITU-R Standards for Digital Video
5.2.3 High-Definition TV
5.2.4 Ultra High Definition TV (UHDTV)
5.3 Video Display Interfaces
5.3.1 Analog Display Interfaces
5.3.2 Digital Display Interfaces
5.4 3D Video and TV
5.4.1 Cues for 3D Percept
5.4.2 3D Camera Models
5.4.3 3D Movie and TV Based on Stereo Vision
5.4.4 The Vergence-Accommodation Conflict
5.4.5 Autostereoscopic (Glasses-Free) Display Devices
5.4.6 Disparity Manipulation in 3D Content Creation
5.5 Exercises
6 Basics of Digital Audio
6.1 Digitization of Sound
6.1.1 What is Sound?
6.1.2 Digitization
6.1.3 Nyquist Theorem
6.1.4 Signal-to-Noise Ratio (SNR)
6.1.5 Signal-to-Quantization-Noise Ratio (SQNR)
6.1.6 Linear and Nonlinear Quantization
6.1.7 Audio Filtering
6.1.8 Audio Quality Versus Data Rate
6.1.9 Synthetic Sounds
6.2 MIDI: Musical Instrument Digital Interface
6.2.1 MIDI Overview
6.2.2 Hardware Aspects of MIDI
6.2.3 Structure of MIDI Messages
6.2.4 General MIDI
6.2.5 MIDI-to-WAV Conversion
6.3 Quantization and Transmission of Audio
6.3.1 Coding of Audio
6.3.2 Pulse Code Modulation
6.3.3 Differential Coding of Audio
6.3.4 Lossless Predictive Coding
6.3.5 DPCM
6.3.6 DM
6.3.7 ADPCM
6.4 Exercises
Part II Multimedia Data Compression
7 Lossless Compression Algorithms
7.1 Introduction
7.2 Basics of Information Theory
7.3 Run-Length Coding
7.4 Variable-Length Coding
7.4.1 Shannon–Fano Algorithm
7.4.2 Huffman Coding
7.4.3 Adaptive Huffman Coding
7.5 Dictionary-Based Coding
7.6 Arithmetic Coding
7.6.1 Basic Arithmetic Coding Algorithm
7.6.2 Scaling and Incremental Coding
7.6.3 Integer Implementation
7.6.4 Binary Arithmetic Coding
7.6.5 Adaptive Arithmetic Coding
7.7 Lossless Image Compression
7.7.1 Differential Coding of Images
7.7.2 Lossless JPEG
7.8 Exercises
8 Lossy Compression Algorithms
8.1 Introduction
8.2 Distortion Measures
8.3 The Rate-Distortion Theory
8.4 Quantization
8.4.1 Uniform Scalar Quantization
8.4.2 Nonuniform Scalar Quantization
8.4.3 Vector Quantization
8.5 Transform Coding
8.5.1 Discrete Cosine Transform (DCT)
8.5.2 Karhunen–Loève Transform*
8.6 Wavelet-Based Coding
8.6.1 Introduction
8.6.2 Continuous Wavelet Transform*
8.6.3 Discrete Wavelet Transform*
8.7 Wavelet Packets
8.8 Embedded Zerotree of Wavelet Coefficients
8.8.1 The Zerotree Data Structure
8.8.2 Successive Approximation Quantization
8.8.3 EZW Example
8.9 Set Partitioning in Hierarchical Trees (SPIHT)
8.10 Exercises
9 Image Compression Standards
9.1 The JPEG Standard
9.1.1 Main Steps in JPEG Image Compression
9.1.2 JPEG Modes
9.1.3 A Glance at the JPEG Bitstream
9.2 The JPEG2000 Standard
9.2.1 Main Steps of JPEG2000 Image Compression*
9.2.2 Adapting EBCOT to JPEG2000
9.2.3 Region-of-Interest Coding
9.2.4 Comparison of JPEG and JPEG2000 Performance
9.3 The JPEG-LS Standard
9.3.1 Prediction
9.3.2 Context Determination
9.3.3 Residual Coding
9.3.4 Near-Lossless Mode
9.4 Bi-level Image Compression Standards
9.4.1 The JBIG Standard
9.4.2 The JBIG2 Standard
9.5 Exercises
10 Basic Video Compression Techniques
10.1 Introduction to Video Compression
10.2 Video Compression Based on Motion Compensation
10.3 Search for Motion Vectors
10.3.1 Sequential Search
10.3.2 2D Logarithmic Search
10.3.3 Hierarchical Search
10.4 H.261
10.4.1 Intra-Frame (I-Frame) Coding
10.4.2 Inter-Frame (P-Frame) Predictive Coding
10.4.3 Quantization in H.261
10.4.4 H.261 Encoder and Decoder
10.4.5 A Glance at the H.261 Video Bitstream Syntax
10.5 H.263
10.5.1 Motion Compensation in H.263
10.5.2 Optional H.263 Coding Modes
10.5.3 H.263+ and H.263++
10.6 Exercises
11 MPEG Video Coding: MPEG-1, 2, 4, and 7
11.1 Overview
11.2 MPEG-1
11.2.1 Motion Compensation in MPEG-1
11.2.2 Other Major Differences from H.261
11.2.3 MPEG-1 Video Bitstream
11.3 MPEG-2
11.3.1 Supporting Interlaced Video
11.3.2 MPEG-2 Scalabilities
11.3.3 Other Major Differences from MPEG-1
11.4 MPEG-4
11.4.1 Overview of MPEG-4
11.4.2 Video Object-Based Coding in MPEG-4
11.4.3 Synthetic Object Coding in MPEG-4
11.4.4 MPEG-4 Parts, Profiles and Levels
11.5 MPEG-7
11.5.1 Descriptor (D)
11.5.2 Description Scheme (DS)
11.5.3 Description Definition Language (DDL)
11.6 Exercises
12 New Video Coding Standards: H.264 and H.265
12.1 H.264
12.1.1 Motion Compensation
12.1.2 Integer Transform
12.1.3 Quantization and Scaling
12.1.4 Examples of H.264 Integer Transform and Quantization
12.1.5 Intra Coding
12.1.6 In-Loop Deblocking Filtering
12.1.7 Entropy Coding
12.1.8 Context-Adaptive Variable Length Coding (CAVLC)
12.1.9 Context-Adaptive Binary Arithmetic Coding (CABAC)
12.1.10 H.264 Profiles
12.1.11 H.264 Scalable Video Coding
12.1.12 H.264 Multiview Video Coding
12.2 H.265
12.2.1 Motion Compensation
12.2.2 Integer Transform
12.2.3 Quantization and Scaling
12.2.4 Intra Coding
12.2.5 Discrete Sine Transform
12.2.6 In-Loop Filtering
12.2.7 Entropy Coding
12.2.8 Special Coding Modes
12.2.9 H.265 Profiles
12.3 Comparisons of Video Coding Efficiency
12.3.1 Objective Assessment
12.3.2 Subjective Assessment
12.4 Exercises
13 Basic Audio Compression Techniques
13.1 ADPCM in Speech Coding
13.1.1 ADPCM
13.2 G.726 ADPCM, G.727-9
13.3 Vocoders
13.3.1 Phase Insensitivity
13.3.2 Channel Vocoder
13.3.3 Formant Vocoder
13.3.4 Linear Predictive Coding (LPC)
13.3.5 Code Excited Linear Prediction (CELP)
13.3.6 Hybrid Excitation Vocoders*
13.4 Exercises
14 MPEG Audio Compression
14.1 Psychoacoustics
14.1.1 Equal-Loudness Relations
14.1.2 Frequency Masking
14.1.3 Temporal Masking
14.2 MPEG Audio
14.2.1 MPEG Layers
14.2.2 MPEG Audio Strategy
14.2.3 MPEG Audio Compression Algorithm
14.2.4 MPEG-2 AAC (Advanced Audio Coding)
14.2.5 MPEG-4 Audio
14.3 Other Audio Codecs
14.3.1 Ogg Vorbis
14.4 MPEG-7 Audio and Beyond
14.5 Further Exploration
14.6 Exercises
Part III Multimedia Communications and Networking
15 Network Services and Protocols for Multimedia Communications
15.1 Protocol Layers of Computer Communication Networks
15.2 Local Area Network and Access Networks
15.2.1 LAN Standards
15.2.2 Ethernet Technology
15.2.3 Access Network Technologies
15.3 Internet Technologies and Protocols
15.3.1 Network Layer: IP
15.3.2 Transport Layer: TCP and UDP
15.3.3 Network Address Translation and Firewall
15.4 Multicast Extension
15.4.1 Router-Based Architectures: IP Multicast
15.4.2 Non Router-Based Multicast Architectures
15.5 Quality-of-Service for Multimedia Communications
15.5.1 Quality of Service
15.5.2 Internet QoS
15.5.3 Rate Control and Buffer Management
15.6 Protocols for Multimedia Transmission and Interaction
15.6.1 HyperText Transfer Protocol
15.6.2 Real-Time Transport Protocol
15.6.3 RTP Control Protocol
15.6.4 Real-Time Streaming Protocol
15.7 Case Study: Internet Telephony
15.7.1 Signaling Protocols: H.323 and Session Initiation Protocol
15.8 Further Exploration
15.9 Exercises
16 Internet Multimedia Content Distribution
16.1 Proxy Caching
16.1.1 Sliding-Interval Caching
16.1.2 Prefix Caching and Segment Caching
16.1.3 Rate-Split Caching and Work-Ahead Smoothing
16.1.4 Summary and Comparison
16.2 Content Distribution Networks (CDNs)
16.2.1 Representative: Akamai Streaming CDN
16.3 Broadcast/Multicast Video-on-Demand
16.3.1 Smart TV and Set-Top Box (STB)
16.3.2 Scalable Multicast/Broadcast VoD
16.4 Broadcast/Multicast for Heterogeneous Users
16.4.1 Stream Replication
16.4.2 Layered Multicast
16.5 Application-Layer Multicast
16.5.1 Representative: End-System Multicast (ESM)
16.5.2 Multi-tree Structure
16.6 Peer-to-Peer Video Streaming with Mesh Overlays
16.6.1 Representative: CoolStreaming
16.6.2 Hybrid Tree and Mesh Overlay
16.7 HTTP-Based Media Streaming
16.7.1 HTTP for Streaming
16.7.2 Dynamic Adaptive Streaming Over HTTP (DASH)
16.8 Exercises
17 Multimedia Over Wireless and Mobile Networks
17.1 Characteristics of Wireless Channels
17.1.1 Path Loss
17.1.2 Multipath Fading
17.2 Wireless Networking Technologies
17.2.1 1G Cellular Analog Wireless Networks
17.2.2 2G Cellular Networks: GSM and Narrowband CDMA
17.2.3 3G Cellular Networks: Wideband CDMA
17.2.4 4G Cellular Networks and Beyond
17.2.5 Wireless Local Area Networks
17.2.6 Bluetooth and Short-Range Technologies
17.3 Multimedia Over Wireless Channels
17.3.1 Error Detection
17.3.2 Error Correction
17.3.3 Error-Resilient Coding
17.3.4 Error Concealment
17.4 Mobility Management
17.4.1 Network Layer Mobile IP
17.4.2 Link-Layer Handoff Management
17.5 Further Exploration
17.6 Exercises
Part IV Multimedia Information Sharing and Retrieval
18 Social Media Sharing
18.1 Representative Social Media Services
18.1.1 User-Generated Content Sharing
18.1.2 Online Social Networking
18.2 User-Generated Media Content Sharing
18.2.1 YouTube Video Format and Meta-data
18.2.2 Characteristics of YouTube Video
18.2.3 Small-World in YouTube Videos
18.2.4 YouTube from a Partner's View
18.2.5 Enhancing UGC Video Sharing
18.3 Media Propagation in Online Social Networks
18.3.1 Sharing Patterns of Individual Users
18.3.2 Video Propagation Structure and Model
18.3.3 Video Watching and Sharing Behaviors
18.3.4 Coordinating Live Streaming and Online Storage
18.4 Further Exploration
18.5 Exercises
19 Cloud Computing for Multimedia Services
19.1 Cloud Computing Overview
19.1.1 Representative Storage Service: Amazon S3
19.1.2 Representative Computation Service: Amazon EC2
19.2 Multimedia Cloud Computing
19.3 Cloud-Assisted Media Sharing
19.3.1 Impact of Globalization
19.3.2 Case Study: Netflix
19.4 Computation Offloading for Multimedia Services
19.4.1 Requirements for Computation Offloading
19.4.2 Service Partitioning for Video Coding
19.4.3 Case Study: Cloud-Assisted Motion Estimation
19.5 Interactive Cloud Gaming
19.5.1 Issues and Challenges of Cloud Gaming
19.5.2 Real-World Implementation
19.6 Further Exploration
19.7 Exercises
20 Content-Based Retrieval in Digital Libraries
20.1 How Should We Retrieve Images?
20.2 Synopsis of Early CBIR Systems
20.3 C-BIRD: A Case Study
20.3.1 Color Histogram
20.3.2 Color Density and Color Layout
20.3.3 Texture Layout
20.3.4 Texture Analysis Details
20.3.5 Search by Illumination Invariance
20.3.6 Search by Object Model
20.4 Quantifying Search Results
20.5 Key Technologies in Current CBIR Systems
20.5.1 Robust Image Features and Their Representation
20.5.2 Relevance Feedback
20.5.3 Other Post-processing Techniques
20.5.4 Visual Concept Search
20.5.5 The Role of Users in Interactive CBIR Systems
20.6 Querying on Videos
20.7 Querying on Videos Based on Human Activity
20.7.1 Modeling Human Activity Structures
20.7.2 Experimental Results
20.8 Quality-Aware Mobile Visual Search
20.8.1 Related Work
20.8.2 Quality-Aware Method
20.8.3 Experimental Results
20.9 Exercises
Index