A Guide to Convolutional Neural Networks for Computer Vision (unwatermarked original PDF)

Published: 2022-06-14 | Uploaded by: admin | Category: Manuals | File size: 6.22M | Format: pdf

Cover

Contents

Preface

1 Introduction

What is Computer Vision?

Applications

Image Processing vs. Computer Vision

What is Machine Learning?

Why Deep Learning?

Book Overview

2 Features and Classifiers

Importance of Features and Classifiers

Features

Classifiers

Traditional Feature Descriptors

Histogram of Oriented Gradients (HOG)

Scale-invariant Feature Transform (SIFT)

Speeded-up Robust Features (SURF)

Limitations of Traditional Hand-engineered Features

Machine Learning Classifiers

Support Vector Machine (SVM)

Random Decision Forest

Conclusion

3 Neural Networks Basics

Introduction

Multi-layer Perceptron

Architecture Basics

Parameter Learning

Recurrent Neural Networks

Architecture Basics

Parameter Learning

Link with Biological Vision

Biological Neuron

Computational Model of a Neuron

Artificial vs. Biological Neuron

4 Convolutional Neural Network

Introduction

Network Layers

Pre-processing

Convolutional Layers

Pooling Layers

Nonlinearity

Fully Connected Layers

Transposed Convolution Layer

Region of Interest Pooling

Spatial Pyramid Pooling Layer

Vector of Locally Aggregated Descriptors Layer

Spatial Transformer Layer

CNN Loss Functions

Cross-entropy Loss

SVM Hinge Loss

Squared Hinge Loss

Euclidean Loss

The ℓ1 Error

Contrastive Loss

Expectation Loss

Structural Similarity Measure

5 CNN Learning

Weight Initialization

Gaussian Random Initialization

Uniform Random Initialization

Orthogonal Random Initialization

Unsupervised Pre-training

Xavier Initialization

ReLU Aware Scaled Initialization

Layer-sequential Unit Variance

Supervised Pre-training

Regularization of CNN

Data Augmentation

Dropout

Drop-connect

Batch Normalization

Ensemble Model Averaging

The ℓ2 Regularization

The ℓ1 Regularization

Elastic Net Regularization

Max-norm Constraints

Early Stopping

Gradient-based CNN Learning

Batch Gradient Descent

Stochastic Gradient Descent

Mini-batch Gradient Descent

Neural Network Optimizers

Momentum

Nesterov Momentum

Adaptive Gradient

Adaptive Delta

RMSprop

Adaptive Moment Estimation

Gradient Computation in CNNs

Analytical Differentiation

Numerical Differentiation

Symbolic Differentiation

Automatic Differentiation

Understanding CNN through Visualization

Visualizing Learned Weights

Visualizing Activations

Visualizations based on Gradients

6 Examples of CNN Architectures

LeNet

AlexNet

Network in Network

VGGnet

GoogleNet

ResNet

ResNeXt

FractalNet

DenseNet

7 Applications of CNNs in Computer Vision

Image Classification

PointNet

Object Detection and Localization

Region-based CNN

Fast R-CNN

Regional Proposal Network (RPN)

Semantic Segmentation

Fully Convolutional Network (FCN)

Deep Deconvolution Network (DDN)

DeepLab

Scene Understanding

DeepContext

Learning Rich Features from RGB-D Images

PointNet for Scene Understanding

Image Generation

Generative Adversarial Networks (GANs)

Deep Convolutional Generative Adversarial Networks (DCGANs)

Super Resolution Generative Adversarial Network (SRGAN)

Video-based Action Recognition

Action Recognition From Still Video Frames

Two-stream CNNs

Long-term Recurrent Convolutional Network (LRCN)

8 Deep Learning Tools and Libraries

Caffe

TensorFlow

MatConvNet

Torch7

Theano

Keras

Lasagne

Marvin

Chainer

PyTorch

Conclusion

Bibliography

Authors' Biographies

Series ISSN: 2153-1056

Series Editors: Gérard Medioni, University of Southern California; Sven Dickinson, University of Toronto

A Guide to Convolutional Neural Networks for Computer Vision

Salman Khan, Data61-CSIRO and Australian National University
Hossein Rahmani, The University of Western Australia
Syed Afaq Ali Shah, The University of Western Australia
Mohammed Bennamoun, The University of Western Australia

Computer vision has become increasingly important and effective in recent years due to its wide-ranging applications in areas as diverse as smart surveillance and monitoring, health and medicine, sports and recreation, robotics, drones, and self-driving cars. Visual recognition tasks, such as image classification, localization, and detection, are the core building blocks of many of these applications, and recent developments in Convolutional Neural Networks (CNNs) have led to outstanding performance in these state-of-the-art visual recognition tasks and systems. As a result, CNNs now form the crux of deep learning algorithms in computer vision.

This self-contained guide will benefit those who seek both to understand the theory behind CNNs and to gain hands-on experience in applying CNNs to computer vision. It provides a comprehensive introduction to CNNs, starting with the essential concepts behind neural networks: training, regularization, and optimization of CNNs. The book also discusses a wide range of loss functions, network layers, and popular CNN architectures, reviews the different techniques for the evaluation of CNNs, and presents some popular CNN tools and libraries that are commonly used in computer vision. Further, this text describes and discusses case studies that are related to the application of CNNs in computer vision, including image classification, object detection, semantic segmentation, scene understanding, and image generation.

This book is ideal for undergraduate and graduate students, as no prior background knowledge in the field is required to follow the material, as well as for new researchers, developers, engineers, and practitioners who are interested in gaining a quick understanding of CNN models.

About SYNTHESIS

This volume is a printed version of a work that appears in the Synthesis Digital Library of Engineering and Computer Science. Synthesis books provide concise, original presentations of important research and development topics, published quickly, in digital and print formats.

store.morganclaypool.com

A Guide to Convolutional Neural Networks for Computer Vision

Synthesis Lectures on Computer Vision

Editors
Gérard Medioni, University of Southern California
Sven Dickinson, University of Toronto

Synthesis Lectures on Computer Vision is edited by Gérard Medioni of the University of Southern California and Sven Dickinson of the University of Toronto. The series publishes 50–150 page publications on topics pertaining to computer vision and pattern recognition. The scope will largely follow the purview of premier computer science conferences, such as ICCV, CVPR, and ECCV. Potential topics include, but are not limited to:

• Applications and Case Studies for Computer Vision
• Color, Illumination, and Texture
• Computational Photography and Video
• Early and Biologically-inspired Vision
• Face and Gesture Analysis
• Illumination and Reflectance Modeling
• Image-Based Modeling
• Image and Video Retrieval
• Medical Image Analysis
• Motion and Tracking
• Object Detection, Recognition, and Categorization
• Segmentation and Grouping
• Sensors
• Shape-from-X
• Stereo and Structure from Motion
• Shape Representation and Matching
• Statistical Methods and Learning
• Performance Evaluation
• Video Analysis and Event Recognition

Titles in the series:

A Guide to Convolutional Neural Networks for Computer Vision. Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun. 2018
Covariances in Computer Vision and Machine Learning. Hà Quang Minh and Vittorio Murino. 2017
Elastic Shape Analysis of Three-Dimensional Objects. Ian H. Jermyn, Sebastian Kurtek, Hamid Laga, and Anuj Srivastava. 2017
The Maximum Consensus Problem: Recent Algorithmic Advances. Tat-Jun Chin and David Suter. 2017
Extreme Value Theory-Based Methods for Visual Recognition. Walter J. Scheirer. 2017
Data Association for Multi-Object Visual Tracking. Margrit Betke and Zheng Wu. 2016
Ellipse Fitting for Computer Vision: Implementation and Applications. Kenichi Kanatani, Yasuyuki Sugaya, and Yasushi Kanazawa. 2016
Computational Methods for Integrating Vision and Language. Kobus Barnard. 2016
Background Subtraction: Theory and Practice. Ahmed Elgammal. 2014
Vision-Based Interaction. Matthew Turk and Gang Hua. 2013
Camera Networks: The Acquisition and Analysis of Videos over Wide Areas. Amit K. Roy-Chowdhury and Bi Song. 2012
Deformable Surface 3D Reconstruction from Monocular Images. Mathieu Salzmann and Pascal Fua. 2010
Boosting-Based Face Detection and Adaptation. Cha Zhang and Zhengyou Zhang. 2010
Image-Based Modeling of Plants and Trees. Sing Bing Kang and Long Quan. 2009

Copyright © 2018 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopy, recording, or any other, except for brief quotations in printed reviews, without the prior permission of the publisher.

A Guide to Convolutional Neural Networks for Computer Vision
Salman Khan, Hossein Rahmani, Syed Afaq Ali Shah, and Mohammed Bennamoun

www.morganclaypool.com

ISBN: 9781681730219 paperback
ISBN: 9781681730226 ebook
ISBN: 9781681732787 hardcover

DOI 10.2200/S00822ED1V01Y201712COV015

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER VISION
Lecture #15
Series Editors: Gérard Medioni, University of Southern California; Sven Dickinson, University of Toronto
Series ISSN: Print 2153-1056, Electronic 2153-1064
