Preface
Contents
1 Introduction
1.1 Basics of Imaging Systems
1.1.1 Pin-Hole Camera Model
1.1.2 Camera Geometry and Projection Matrix
1.1.3 Lens Distortions
1.2 Stereo Vision Systems
1.2.1 Two-view Stereo Systems
1.2.1.1 Epipolar Geometry
1.2.1.2 Epipolar Rectification
1.2.1.3 The Correspondence Problem
1.2.2 N-view Stereo Systems and Structure from Motion
1.2.3 Calibrated and Uncalibrated 3D Reconstruction
1.3 Basics of Structured Light Depth Cameras
1.4 Basics of ToF Depth Cameras
1.4.1 ToF Operation Principle
1.4.2 Direct ToF Measurement Methods
1.4.2.1 Direct Pulse Modulation
1.4.2.2 Direct CW Modulation
1.4.3 Surface Measurement by Single Point and Matricial ToF Systems
1.4.4 ToF Depth Camera Components
1.4.4.1 Modulation Methods for ToF Depth Cameras
1.4.4.2 ToF Depth Camera Transmitter Basics
1.4.4.3 ToF Depth Camera Receiver Basics
1.5 Book Overview
References
Part I Operating Principles of Depth Cameras
2 Operating Principles of Structured Light Depth Cameras
2.1 Camera Virtualization
2.2 General Characteristics
2.2.1 Depth Resolution
2.3 Illuminator Design Approaches
2.3.1 Implementing Uniqueness by Signal Multiplexing
2.3.1.1 Wavelength Multiplexing
2.3.1.2 Range Multiplexing
2.3.1.3 Temporal Multiplexing
2.3.1.4 Spatial Multiplexing
2.3.2 Structured Light Systems Non-idealities
2.4 Examples of Structured Light Depth Cameras
2.4.1 The Intel RealSense F200
2.4.2 The Intel RealSense R200
2.4.3 The Primesense Camera (AKA Kinect™ v1)
2.5 Conclusions and Further Reading
References
3 Operating Principles of Time-of-Flight Depth Cameras
3.1 AM Modulation Within In-Pixel Photo-Mixing Devices
3.1.1 Sinusoidal Modulation
3.1.2 Square Wave Modulation
3.2 Imaging Characteristics of ToF Depth Cameras
3.3 Practical Implementation Issues of ToF Depth Cameras
3.3.1 Phase Wrapping
3.3.2 Harmonic Distortion
3.3.3 Photon-Shot Noise
3.3.4 Saturation and Motion Blur
3.3.5 Multipath Error
3.3.6 Flying Pixels
3.3.7 Other Noise Sources
3.4 Examples of ToF Depth Cameras
3.4.1 Kinect™ v2
3.4.2 MESA ToF Depth Cameras
3.4.3 PMD Devices
3.4.4 ToF Depth Cameras Based on SoftKinetic Technology
3.5 Conclusions and Further Reading
References
Part II Extraction of 3D Information from Depth Cameras Data
4 Calibration
4.1 Calibration of a Generic Imaging Device
4.1.1 Measurement's Accuracy, Precision and Resolution
4.1.2 General Calibration Procedure
4.1.3 Supervised and Unsupervised Calibration
4.1.4 Calibration Error
4.1.5 Geometric Calibration
4.1.6 Photometric Calibration
4.2 Calibration of Standard Cameras
4.2.1 Calibration of a Single Camera
4.2.2 Calibration of a Stereo Vision System
4.2.3 Extension to N-View Systems
4.3 Calibration of Depth Cameras
4.3.1 Calibration of Structured Light Depth Cameras
4.3.2 Calibration of ToF Depth Cameras
4.4 Calibration of Heterogeneous Imaging Systems
4.4.1 Calibration of a Depth Camera and a Standard Camera
4.4.2 Calibration of a Depth Camera and a Stereo Vision System
4.4.3 Calibration of Multiple Depth Cameras
4.5 Conclusions and Further Reading
References
5 Data Fusion from Depth and Standard Cameras
5.1 Acquisition Setup with Multiple Sensors
5.1.1 Example of Acquisition Setups
5.1.2 Data Registration
5.2 Fusion of a Depth Camera with a Single Color Camera
5.2.1 Local Filtering and Interpolation Techniques
5.2.2 Global Optimization Based Approaches
5.3 Fusion of a Depth Camera with a Stereo System
5.3.1 Local Fusion Methods
5.3.1.1 Confidence Based Techniques
5.3.1.2 Probabilistic Approaches
5.3.2 Global Optimization Based Approaches
5.3.2.1 MAP-MRF Probabilistic Fusion Framework
5.3.2.2 Other Global Optimization Based Frameworks
5.3.3 Other Approaches
5.4 Conclusions and Further Reading
References
Part III Applications of Depth Camera Data
6 Scene Segmentation Assisted by Depth Data
6.1 Scene Matting with Color and Depth Data
6.1.1 Single Frame Matting with Color and Depth Data
6.1.2 Video Matting with Color and Depth Data
6.2 Scene Segmentation from Color and Depth Data
6.2.1 Single Frame Segmentation from Color and Depth Data
6.2.2 Single Frame Segmentation: Clustering of Multidimensional Vectors
6.2.3 Single Frame Segmentation: Graph-Based Approaches
6.2.4 Single Frame Segmentation Based on Geometric Clues
6.2.5 Video Segmentation from Color and Depth Data
6.3 Semantic Segmentation from Color and Depth Data
6.4 Conclusions and Further Reading
References
7 3D Scene Reconstruction from Depth Camera Data
7.1 3D Reconstruction from Depth Camera Data
7.2 Pre-processing of the Views
7.3 Rough Pairwise Registration
7.4 Fine Pairwise Registration
7.5 Global Registration
7.6 Fusion of the Registered Views
7.6.1 KinectFusion
7.7 Reconstruction of Dynamic Scenes
7.8 SLAM with Depth Camera Data
7.9 Conclusions and Further Reading
References
8 Human Pose Estimation and Tracking
8.1 Human Body Models
8.1.1 Articulated Objects
8.1.2 Kinematic Skeleton Models
8.1.3 Augmented Skeleton Models
8.2 Human Pose Estimation
8.2.1 Learning Based Approaches and the Kinect™ Pose Estimation Algorithm
8.2.2 Example-Based Approaches
8.2.3 Point of Interest Detection
8.3 Human Pose Tracking
8.3.1 Optimization-Based Approaches
8.3.1.1 Local Optimization Methods
8.3.2 ICP and Ray Casting Approaches
8.3.3 Filtering Approaches
8.3.3.1 Pose Tracking with Particle Filter
8.3.3.2 Pose Tracking with Kalman Filter
8.3.4 Approaches Based on Markov Random Fields
8.4 Conclusions and Further Reading
References
9 Gesture Recognition
9.1 Static Gesture Recognition
9.1.1 Pose-Based Descriptors
9.1.2 Contour Shape-Based Descriptors
9.1.3 Surface Shape Descriptors
9.1.4 Area and Volume Occupancy Descriptors
9.1.5 Depth Image-Based Descriptors
9.1.6 Convex Hull-Based Descriptors
9.1.7 Feature Classification
9.1.8 Feature Selection
9.1.9 Static Gesture Recognition with Deep Learning
9.2 Dynamic Gesture Recognition
9.2.1 Deterministic Recognition Approaches
9.2.2 Stochastic Recognition Approaches
9.2.2.1 Dynamic Gesture Recognition with Hidden Markov Models
9.2.2.2 Dynamic Gesture Recognition with Hierarchical Markov Models
9.2.3 Dynamic Gesture Recognition with Action Graphs
9.2.4 Descriptors for Dynamic Gesture Recognition
9.3 Conclusions and Further Reading
References
10 Conclusions
Index