A Guide to Convolutional Neural Networks for Computer Vision (original PDF, no watermark)

The document has 209 pages in total; the remaining pages can be viewed after download.
Cover
Copyright
Contents
Preface
1 Introduction
What is Computer Vision?
Applications
Image Processing vs. Computer Vision
What is Machine Learning?
Why Deep Learning?
Book Overview
2 Features and Classifiers
Importance of Features and Classifiers
Features
Classifiers
Traditional Feature Descriptors
Histogram of Oriented Gradients (HOG)
Scale-invariant Feature Transform (SIFT)
Speeded-up Robust Features (SURF)
Limitations of Traditional Hand-engineered Features
Machine Learning Classifiers
Support Vector Machine (SVM)
Random Decision Forest
Conclusion
3 Neural Networks Basics
Introduction
Multi-layer Perceptron
Architecture Basics
Parameter Learning
Recurrent Neural Networks
Architecture Basics
Parameter Learning
Link with Biological Vision
Biological Neuron
Computational Model of a Neuron
Artificial vs. Biological Neuron
4 Convolutional Neural Network
Introduction
Network Layers
Pre-processing
Convolutional Layers
Pooling Layers
Nonlinearity
Fully Connected Layers
Transposed Convolution Layer
Region of Interest Pooling
Spatial Pyramid Pooling Layer
Vector of Locally Aggregated Descriptors Layer
Spatial Transformer Layer
CNN Loss Functions
Cross-entropy Loss
SVM Hinge Loss
Squared Hinge Loss
Euclidean Loss
The ℓ1 Error
Contrastive Loss
Expectation Loss
Structural Similarity Measure
5 CNN Learning
Weight Initialization
Gaussian Random Initialization
Uniform Random Initialization
Orthogonal Random Initialization
Unsupervised Pre-training
Xavier Initialization
ReLU Aware Scaled Initialization
Layer-sequential Unit Variance
Supervised Pre-training
Regularization of CNN
Data Augmentation
Dropout
Drop-connect
Batch Normalization
Ensemble Model Averaging
The ℓ2 Regularization
The ℓ1 Regularization
Elastic Net Regularization
Max-norm Constraints
Early Stopping
Gradient-based CNN Learning
Batch Gradient Descent
Stochastic Gradient Descent
Mini-batch Gradient Descent
Neural Network Optimizers
Momentum
Nesterov Momentum
Adaptive Gradient
Adaptive Delta
RMSprop
Adaptive Moment Estimation
Gradient Computation in CNNs
Analytical Differentiation
Numerical Differentiation
Symbolic Differentiation
Automatic Differentiation
Understanding CNN through Visualization
Visualizing Learned Weights
Visualizing Activations
Visualizations based on Gradients
6 Examples of CNN Architectures
LeNet
AlexNet
Network in Network
VGGnet
GoogleNet
ResNet
ResNeXt
FractalNet
DenseNet
7 Applications of CNNs in Computer Vision
Image Classification
PointNet
Object Detection and Localization
Region-based CNN
Fast R-CNN
Regional Proposal Network (RPN)
Semantic Segmentation
Fully Convolutional Network (FCN)
Deep Deconvolution Network (DDN)
DeepLab
Scene Understanding
DeepContext
Learning Rich Features from RGB-D Images
PointNet for Scene Understanding
Image Generation
Generative Adversarial Networks (GANs)
Deep Convolutional Generative Adversarial Networks (DCGANs)
Super Resolution Generative Adversarial Network (SRGAN)
Video-based Action Recognition
Action Recognition From Still Video Frames
Two-stream CNNs
Long-term Recurrent Convolutional Network (LRCN)
8 Deep Learning Tools and Libraries
Caffe
TensorFlow
MatConvNet
Torch7
Theano
Keras
Lasagne
Marvin
Chainer
PyTorch
Conclusion
Bibliography
Authors' Biographies
Series ISSN: 2153-1056
Series Editors: Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto

A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Data61-CSIRO and Australian National University
Hossein Rahmani, The University of Western Australia
Syed Afaq Ali Shah, The University of Western Australia
Mohammed Bennamoun, The University of Western Australia

Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these state-of-the-art visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision.

This self-contained guide will benefit those who seek to both understand the theory behind CNNs and to gain hands-on experience on the application of CNNs in computer vision. It provides a comprehensive introduction to CNNs starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies that are related to the application of CNN in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation.
This book is ideal for undergraduate and graduate students, as no prior background knowledge in the field is required to follow the material, as well as new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.

About SYNTHESIS
This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and Computer Science. Synthesis books provide concise, original presentations of important research and development topics, published quickly, in digital and print formats.

store.morganclaypool.com
Synthesis Lectures on Computer Vision

Editors
Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto

Synthesis Lectures on Computer Vision is edited by Gérard Medioni of the University of Southern California and Sven Dickinson of the University of Toronto. The series publishes 50–150 page publications on topics pertaining to computer vision and pattern recognition. The scope will largely follow the purview of premier computer science conferences, such as ICCV, CVPR, and ECCV. Potential topics include, but are not limited to:

• Applications and Case Studies for Computer Vision
• Color, Illumination, and Texture
• Computational Photography and Video
• Early and Biologically-inspired Vision
• Face and Gesture Analysis
• Illumination and Reflectance Modeling
• Image-Based Modeling
• Image and Video Retrieval
• Medical Image Analysis
• Motion and Tracking
• Object Detection, Recognition, and Categorization
• Segmentation and Grouping
• Sensors
• Shape-from-X
• Stereo and Structure from Motion
• Shape Representation and Matching
• Statistical Methods and Learning
• Performance Evaluation
• Video Analysis and Event Recognition

A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun
2018

Covariances in Computer Vision and Machine Learning
Hà Quang Minh and Vittorio Murino
2017

Elastic Shape Analysis of Three-Dimensional Objects
Ian H. Jermyn, Sebastian Kurtek, Hamid Laga, and Anuj Srivastava
2017

The Maximum Consensus Problem: Recent Algorithmic Advances
Tat-Jun Chin and David Suter
2017

Extreme Value Theory-Based Methods for Visual Recognition
Walter J. Scheirer
2017

Data Association for Multi-Object Visual Tracking
Margrit Betke and Zheng Wu
2016

Ellipse Fitting for Computer Vision: Implementation and Applications
Kenichi Kanatani, Yasuyuki Sugaya, and Yasushi Kanazawa
2016

Computational Methods for Integrating Vision and Language
Kobus Barnard
2016

Background Subtraction: Theory and Practice
Ahmed Elgammal
2014

Vision-Based Interaction
Matthew Turk and Gang Hua
2013

Camera Networks: The Acquisition and Analysis of Videos over Wide Areas
Amit K. Roy-Chowdhury and Bi Song
2012

Deformable Surface 3D Reconstruction from Monocular Images
Mathieu Salzmann and Pascal Fua
2010

Boosting-Based Face Detection and Adaptation
Cha Zhang and Zhengyou Zhang
2010

Image-Based Modeling of Plants and Trees
Sing Bing Kang and Long Quan
2009
Copyright © 2018 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other) except for brief quotations in printed reviews, without the prior permission of the publisher.

A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun
www.morganclaypool.com

ISBN: 9781681730219 paperback
ISBN: 9781681730226 ebook
ISBN: 9781681732787 hardcover

DOI 10.2200/S00822ED1V01Y201712COV015

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER VISION
Lecture #15
Series Editors: Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto
Series ISSN
Print 2153-1056 Electronic 2153-1064