
Time-of-Flight and Structured Light Depth Cameras.pdf

Preface
Contents
1 Introduction
1.1 Basics of Imaging Systems
1.1.1 Pin-Hole Camera Model
1.1.2 Camera Geometry and Projection Matrix
1.1.3 Lens Distortions
1.2 Stereo Vision Systems
1.2.1 Two-View Stereo Systems
1.2.1.1 Epipolar Geometry
1.2.1.2 Epipolar Rectification
1.2.1.3 The Correspondence Problem
1.2.2 N-View Stereo Systems and Structure from Motion
1.2.3 Calibrated and Uncalibrated 3D Reconstruction
1.3 Basics of Structured Light Depth Cameras
1.4 Basics of ToF Depth Cameras
1.4.1 ToF Operation Principle
1.4.2 Direct ToF Measurement Methods
1.4.2.1 Direct Pulse Modulation
1.4.2.2 Direct CW Modulation
1.4.3 Surface Measurement by Single Point and Matricial ToF Systems
1.4.4 ToF Depth Camera Components
1.4.4.1 Modulation Methods for ToF Depth Cameras
1.4.4.2 ToF Depth Camera Transmitter Basics
1.4.4.3 ToF Depth Camera Receiver Basics
1.5 Book Overview
References
Part I Operating Principles of Depth Cameras
2 Operating Principles of Structured Light Depth Cameras
2.1 Camera Virtualization
2.2 General Characteristics
2.2.1 Depth Resolution
2.3 Illuminator Design Approaches
2.3.1 Implementing Uniqueness by Signal Multiplexing
2.3.1.1 Wavelength Multiplexing
2.3.1.2 Range Multiplexing
2.3.1.3 Temporal Multiplexing
2.3.1.4 Spatial Multiplexing
2.3.2 Structured Light Systems Non-idealities
2.4 Examples of Structured Light Depth Cameras
2.4.1 The Intel RealSense F200
2.4.2 The Intel RealSense R200
2.4.3 The Primesense Camera (AKA Kinect™ v1)
2.5 Conclusions and Further Reading
References
3 Operating Principles of Time-of-Flight Depth Cameras
3.1 AM Modulation Within In-Pixel Photo-Mixing Devices
3.1.1 Sinusoidal Modulation
3.1.2 Square Wave Modulation
3.2 Imaging Characteristics of ToF Depth Cameras
3.3 Practical Implementation Issues of ToF Depth Cameras
3.3.1 Phase Wrapping
3.3.2 Harmonic Distortion
3.3.3 Photon-Shot Noise
3.3.4 Saturation and Motion Blur
3.3.5 Multipath Error
3.3.6 Flying Pixels
3.3.7 Other Noise Sources
3.4 Examples of ToF Depth Cameras
3.4.1 Kinect™ v2
3.4.2 MESA ToF Depth Cameras
3.4.3 PMD Devices
3.4.4 ToF Depth Cameras Based on SoftKinetic Technology
3.5 Conclusions and Further Reading
References
Part II Extraction of 3D Information from Depth Cameras Data
4 Calibration
4.1 Calibration of a Generic Imaging Device
4.1.1 Measurement's Accuracy, Precision and Resolution
4.1.2 General Calibration Procedure
4.1.3 Supervised and Unsupervised Calibration
4.1.4 Calibration Error
4.1.5 Geometric Calibration
4.1.6 Photometric Calibration
4.2 Calibration of Standard Cameras
4.2.1 Calibration of a Single Camera
4.2.2 Calibration of a Stereo Vision System
4.2.3 Extension to N-View Systems
4.3 Calibration of Depth Cameras
4.3.1 Calibration of Structured Light Depth Cameras
4.3.2 Calibration of ToF Depth Cameras
4.4 Calibration of Heterogeneous Imaging Systems
4.4.1 Calibration of a Depth Camera and a Standard Camera
4.4.2 Calibration of a Depth Camera and a Stereo Vision System
4.4.3 Calibration of Multiple Depth Cameras
4.5 Conclusions and Further Reading
References
5 Data Fusion from Depth and Standard Cameras
5.1 Acquisition Setup with Multiple Sensors
5.1.1 Examples of Acquisition Setups
5.1.2 Data Registration
5.2 Fusion of a Depth Camera with a Single Color Camera
5.2.1 Local Filtering and Interpolation Techniques
5.2.2 Global Optimization Based Approaches
5.3 Fusion of a Depth Camera with a Stereo System
5.3.1 Local Fusion Methods
5.3.1.1 Confidence Based Techniques
5.3.1.2 Probabilistic Approaches
5.3.2 Global Optimization Based Approaches
5.3.2.1 MAP-MRF Probabilistic Fusion Framework
5.3.2.2 Other Global Optimization Based Frameworks
5.3.3 Other Approaches
5.4 Conclusions and Further Reading
References
Part III Applications of Depth Camera Data
6 Scene Segmentation Assisted by Depth Data
6.1 Scene Matting with Color and Depth Data
6.1.1 Single Frame Matting with Color and Depth Data
6.1.2 Video Matting with Color and Depth Data
6.2 Scene Segmentation from Color and Depth Data
6.2.1 Single Frame Segmentation from Color and Depth Data
6.2.2 Single Frame Segmentation: Clustering of Multidimensional Vectors
6.2.3 Single Frame Segmentation: Graph-Based Approaches
6.2.4 Single Frame Segmentation Based on Geometric Clues
6.2.5 Video Segmentation from Color and Depth Data
6.3 Semantic Segmentation from Color and Depth Data
6.4 Conclusions and Further Reading
References
7 3D Scene Reconstruction from Depth Camera Data
7.1 3D Reconstruction from Depth Camera Data
7.2 Pre-processing of the Views
7.3 Rough Pairwise Registration
7.4 Fine Pairwise Registration
7.5 Global Registration
7.6 Fusion of the Registered Views
7.6.1 KinectFusion
7.7 Reconstruction of Dynamic Scenes
7.8 SLAM with Depth Camera Data
7.9 Conclusions and Further Reading
References
8 Human Pose Estimation and Tracking
8.1 Human Body Models
8.1.1 Articulated Objects
8.1.2 Kinematic Skeleton Models
8.1.3 Augmented Skeleton Models
8.2 Human Pose Estimation
8.2.1 Learning Based Approaches and the Kinect™ Pose Estimation Algorithm
8.2.2 Example-Based Approaches
8.2.3 Point of Interest Detection
8.3 Human Pose Tracking
8.3.1 Optimization-Based Approaches
8.3.1.1 Local Optimization Methods
8.3.2 ICP and Ray Casting Approaches
8.3.3 Filtering Approaches
8.3.3.1 Pose Tracking with Particle Filter
8.3.3.2 Pose Tracking with Kalman Filter
8.3.4 Approaches Based on Markov Random Fields
8.4 Conclusions and Further Reading
References
9 Gesture Recognition
9.1 Static Gesture Recognition
9.1.1 Pose-Based Descriptors
9.1.2 Contour Shape-Based Descriptors
9.1.3 Surface Shape Descriptors
9.1.4 Area and Volume Occupancy Descriptors
9.1.5 Depth Image-Based Descriptors
9.1.6 Convex Hull-Based Descriptors
9.1.7 Feature Classification
9.1.8 Feature Selection
9.1.9 Static Gesture Recognition with Deep Learning
9.2 Dynamic Gesture Recognition
9.2.1 Deterministic Recognition Approaches
9.2.2 Stochastic Recognition Approaches
9.2.2.1 Dynamic Gesture Recognition with Hidden Markov Models
9.2.2.2 Dynamic Gesture Recognition with Hierarchical Markov Models
9.2.3 Dynamic Gesture Recognition with Action Graphs
9.2.4 Descriptors for Dynamic Gesture Recognition
9.3 Conclusions and Further Reading
References
10 Conclusions
Index
Pietro Zanuttigh · Giulio Marin · Carlo Dal Mutto · Fabio Dominio · Ludovico Minto · Guido Maria Cortelazzo

Time-of-Flight and Structured Light Depth Cameras: Technology and Applications
Pietro Zanuttigh, Department of Information Engineering, University of Padova, Padova, Italy
Giulio Marin, Department of Information Engineering, University of Padova, Padova, Italy
Carlo Dal Mutto, Aquifi Inc., Palo Alto, CA, USA
Fabio Dominio, Department of Information Engineering, University of Padova, Padova, Italy
Ludovico Minto, Department of Information Engineering, University of Padova, Padova, Italy
Guido Maria Cortelazzo, 3D Everywhere s.r.l., Padova, Italy

ISBN 978-3-319-30971-2
ISBN 978-3-319-30973-6 (eBook)
DOI 10.1007/978-3-319-30973-6
Library of Congress Control Number: 2016935940

© Springer International Publishing Switzerland 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer International Publishing AG Switzerland
"Cras ingens iterabimus aequor"
(Horace, Odes, VII)

In memory of Alberto Apostolico (1948–2015), unique scholar and person
Preface

This book originates from three-dimensional data processing research in the Multimedia Technology and Telecommunications Laboratory (LTTM) at the Department of Information Engineering of the University of Padova. The LTTM laboratory has a long history of research activity on consumer depth cameras, starting with Time-of-Flight (ToF) depth cameras in 2008 and continuing since, with a particular focus on recent structured light and ToF depth cameras like the two versions of the Microsoft Kinect™. In the past years, the students and researchers at the LTTM laboratory have extensively explored many topics in 3D data acquisition, processing, and visualization, all fields of great interest for the computer vision and computer graphics communities, as well as for the telecommunications community active in multimedia.

In contrast to a previous book by some of the authors, published as a Springer Brief in Electrical and Computer Engineering and targeted to specialists, this book has been written for a wider audience, including students and practitioners interested in current consumer depth cameras and the data they provide. This book focuses on the system rather than the device and circuit aspects of the acquisition equipment. Processing methods required by the 3D nature of the data are presented within general frameworks kept purposely as independent as possible from the technological characteristics of the measurement instruments used to capture the data. The results are typically presented through practical examples with real data to give the reader a clear and concrete idea of the actual processing possibilities.

This book is organized into three parts: the first is devoted to the working principles of ToF and structured light depth cameras, the second to the extraction of accurate 3D information from depth camera data through proper calibration and data fusion techniques, and the third to the use of 3D data in some challenging computer vision applications.

This book comes from the contributions of a great number of people besides the authors. First, almost every student who worked at the LTTM laboratory in the past years contributed to the know-how at the basis of this book and must be acknowledged. Among them, in particular, Alvise Memo must be thanked for his help with the acquisitions from a number of different depth cameras and for