Learning.OpenCV.3.Computer.Vision.with.Python.2nd.pdf

Cover
Copyright
Credits
About the Authors
About the Reviewers
www.PacktPub.com
Table of Contents
Preface
Chapter 1: Setting Up OpenCV
Choosing and using the right setup tools
Installation on Windows
Using binary installers (no support for depth cameras)
Using CMake and compilers
Installing on OS X
Using MacPorts with ready-made packages
Using MacPorts with your own custom packages
Using Homebrew with ready-made packages (no support for depth cameras)
Using Homebrew with your own custom packages
Installation on Ubuntu and its derivatives
Using the Ubuntu repository (no support for depth cameras)
Building OpenCV from a source
Installation on other Unix-like systems
Installing the Contrib modules
Running samples
Finding documentation, help, and updates
Summary
Chapter 2: Handling Files, Cameras, and GUIs
Basic I/O scripts
Reading/writing an image file
Converting between an image and raw bytes
Accessing image data with numpy.array
Reading/writing a video file
Capturing camera frames
Displaying images in a window
Displaying camera frames in a window
Project Cameo (face tracking and image manipulation)
Cameo – an object-oriented design
Abstracting a video stream with managers.CaptureManager
Abstracting a window and keyboard with managers.WindowManager
Applying everything with cameo.Cameo
Summary
Chapter 3: Processing Images with OpenCV 3
Converting between different color spaces
A quick note on BGR
The Fourier Transform
High pass filter
Low pass filter
Creating modules
Edge detection
Custom kernels – getting convoluted
Modifying the application
Edge detection with Canny
Contour detection
Contours – bounding box, minimum area rectangle, and minimum enclosing circle
Contours – convex contours and the Douglas-Peucker algorithm
Line and circle detection
Line detection
Circle detection
Detecting shapes
Summary
Chapter 4: Depth Estimation and Segmentation
Creating modules
Capturing frames from a depth camera
Creating a mask from a disparity map
Masking a copy operation
Depth estimation with a normal camera
Object segmentation using the Watershed and GrabCut algorithms
Example of foreground detection with GrabCut
Image segmentation with the Watershed algorithm
Summary
Chapter 5: Detecting and Recognizing Faces
Conceptualizing Haar cascades
Getting Haar cascade data
Using OpenCV to perform face detection
Performing face detection on a still image
Performing face detection on a video
Performing face recognition
Generating the data for face recognition
Recognizing faces
Preparing the training data
Loading the data and recognizing faces
Performing an Eigenfaces recognition
Performing face recognition with Fisherfaces
Performing face recognition with LBPH
Discarding results with confidence score
Summary
Chapter 6: Retrieving Images and Searching Using Image Descriptors
Feature detection algorithms
Defining features
Detecting features – corners
Feature extraction and description using DoG and SIFT
Anatomy of a keypoint
Feature extraction and detection using Fast Hessian and SURF
ORB feature detection and feature matching
FAST
BRIEF
Brute-Force matching
Feature matching with ORB
Using K-Nearest Neighbors matching
FLANN-based matching
FLANN matching with homography
A sample application – tattoo forensics
Saving image descriptors to file
Scanning for matches
Summary
Chapter 7: Detecting and Recognizing Objects
Object detection and recognition techniques
HOG descriptors
The scale issue
The location issue
Non-maximum (or non-maxima) suppression
Support vector machines
People detection
Creating and training an object detector
Bag-of-words
BOW in computer vision
Detecting cars
What did we just do?
SVM and sliding windows
Example – car detection in a scene
Dude, where's my car?
Summary
Chapter 8: Tracking Objects
Detecting moving objects
Basic motion detection
Background subtractors – KNN, MOG2, and GMG
Meanshift and CAMShift
Color histograms
The calcHist function
The calcBackProject function
In summary
Back to the code
CAMShift
The Kalman filter
Predict and update
An example
A real-life example – tracking pedestrians
The application workflow
A brief digression – functional versus object-oriented programming
The Pedestrian class
The main program
Where do we go from here?
Summary
Chapter 9: Neural Networks with OpenCV – an Introduction
Artificial neural networks
Neurons and perceptrons
The structure of an ANN
Network layers by example
The input layer
The output layer
The hidden layer
ANNs in OpenCV
ANN-imal classification
Training epochs
Handwritten digit recognition with ANNs
MNIST – the handwritten digit database
Customized training data
The initial parameters
The input layer
The hidden layer
The output layer
Training epochs
Other parameters
Mini-libraries
The main file
Possible improvements and potential applications
Improvements
Potential applications
Summary
To boldly go…
Index
Learning OpenCV 3 Computer Vision with Python
Second Edition

Unleash the power of computer vision with Python using OpenCV

Joe Minichino
Joseph Howse

BIRMINGHAM - MUMBAI

Learning OpenCV 3 Computer Vision with Python
Second Edition

Copyright © 2015 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the authors, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: September 2015

Production reference: 1240915

Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.

ISBN 978-1-78528-384-0

www.packtpub.com

Credits

Authors: Joe Minichino, Joseph Howse
Reviewers: Nandan Banerjee, Tian Cao, Brandon Castellano, Haojian Jin, Adrian Rosebrock
Commissioning Editor: Akram Hussain
Acquisition Editors: Vivek Anantharaman, Prachi Bisht
Content Development Editor: Ritika Singh
Technical Editors: Novina Kewalramani, Shivani Kiran Mistry
Copy Editor: Sonia Cheema
Project Coordinator: Milton Dsouza
Proofreader: Safis Editing
Indexer: Monica Ajmera Mehta
Graphics: Disha Haria
Production Coordinator: Arvindkumar Gupta
Cover Work: Arvindkumar Gupta

About the Authors

Joe Minichino is a computer vision engineer for Hoolux Medical by day and a developer of the NoSQL database LokiJS by night. On weekends, he is a heavy metal singer/songwriter. He is a passionate programmer who is immensely curious about programming languages and technologies and constantly experiments with them. At Hoolux, Joe leads the development of an Android computer vision-based advertising platform for the medical industry.

Born and raised in Varese, Lombardy, Italy, and coming from a humanistic background in philosophy (at Milan's Università Statale), Joe has spent his last 11 years living in Cork, Ireland, which is where he became a computer science graduate at the Cork Institute of Technology.

I am immensely grateful to my partner, Rowena, for always encouraging me, and also my two little daughters for inspiring me. A big thank you to the collaborators and editors of this book, especially Joe Howse, Adrian Rosebrock, Brandon Castellano, the OpenCV community, and the people at Packt Publishing.

Joseph Howse lives in Canada. During the winters, he grows his beard, while his four cats grow their thick coats of fur. He loves combing his cats every day and sometimes, his cats also pull his beard.

He has been writing for Packt Publishing since 2012. His books include OpenCV for Secret Agents, OpenCV Blueprints, Android Application Programming with OpenCV 3, OpenCV Computer Vision with Python, and Python Game Programming by Example.

When he is not writing books or grooming his cats, he provides consulting, training, and software development services through his company, Nummist Media (http://nummist.com).

About the Reviewers

Nandan Banerjee has a bachelor's degree in computer science and a master's in robotics engineering. He started working with Samsung Electronics right after graduation. He worked for a year at its R&D centre in Bangalore. He also worked in the WPI-CMU team on the Boston Dynamics' robot, Atlas, for the DARPA Robotics Challenge. He is currently working as a robotics software engineer in the technology organization at iRobot Corporation. He is an embedded systems and robotics enthusiast with an inclination toward computer vision and motion planning. He has experience in various languages, including C, C++, Python, Java, and Delphi. He also has substantial experience in working with ROS, OpenRAVE, OpenCV, PCL, OpenGL, CUDA, and the Android SDK.

I would like to thank the author and publisher for coming out with this wonderful book.

Tian Cao is pursuing his PhD in computer science at the University of North Carolina in Chapel Hill, USA, and working on projects related to image analysis, computer vision, and machine learning.

I dedicate this work to my parents and girlfriend.

Brandon Castellano is a student from Canada pursuing an MESc in electrical engineering at the University of Western Ontario, City of London, Canada. He received his BESc in the same subject in 2012. The focus of his research is in parallel processing and GPGPU/FPGA optimization for real-time implementations of image processing algorithms. Brandon also works for Eagle Vision Systems Inc., focusing on the use of real-time image processing for robotics applications.

While he has been using OpenCV and C++ for more than 5 years, he has also been advocating the use of Python frequently in his research, most notably, for its rapid speed of development, allowing low-level interfacing with complex systems. This is evident in his open source projects hosted on GitHub, for example, PySceneDetect, which is mostly written in Python. In addition to image/video processing, he has also worked on implementations of three-dimensional displays as well as the software tools to support the development of such displays.

In addition to posting technical articles and tutorials on his website (http://www.bcastell.com), he participates in a variety of both open and closed source projects and contributes to GitHub under the username Breakthrough (http://www.github.com/Breakthrough). He is an active member of the Super User and Stack Overflow communities (under the name Breakthrough), and can be contacted directly via his website.

I would like to thank all my friends and family for their patience during the past few years (especially my parents, Peter and Lori, and my brother, Mitchell). I could not have accomplished everything without their continued love and support. I can't ever thank everyone enough.

I would also like to extend a special thanks to all of the developers that contribute to open source software libraries, specifically OpenCV, which help bring the development of cutting-edge software technology closer to all the software developers around the world, free of cost. I would also like to thank those people who help write documentation, submit bug reports, and write tutorials/books (especially the author of this book!). Their contributions are vital to the success of any open source project, especially one that is as extensive and complex as OpenCV.

Haojian Jin is a software engineer/researcher at Yahoo! Labs, Sunnyvale, CA. He looks primarily at building new systems of what's possible on commodity mobile devices (or with minimum hardware changes). To create things that don't exist today, he spends large chunks of his time playing with signal processing, computer vision, machine learning, and natural language processing and using them in interesting ways. You can find more about him at http://shift-3.com/

Adrian Rosebrock is an author and blogger at http://www.pyimagesearch.com/. He holds a PhD in computer science from the University of Maryland, Baltimore County, USA, with a focus on computer vision and machine learning.

He has consulted for the National Cancer Institute to develop methods that automatically predict breast cancer risk factors using breast histology images. He has also authored a book, Practical Python and OpenCV (http://pyimg.co/x7ed5), on the utilization of Python and OpenCV to build real-world computer vision applications.