Preface
Contents
Introduction
What is computer vision?
A brief history
Book overview
Sample syllabus
A note on notation
Additional reading
Image formation
Geometric primitives and transformations
Geometric primitives
2D transformations
3D transformations
3D rotations
3D to 2D projections
Lens distortions
Photometric image formation
Lighting
Reflectance and shading
Optics
The digital camera
Sampling and aliasing
Color
Compression
Additional reading
Exercises
Image processing
Point operators
Pixel transforms
Color transforms
Compositing and matting
Histogram equalization
Application: Tonal adjustment
Linear filtering
Separable filtering
Examples of linear filtering
Band-pass and steerable filters
More neighborhood operators
Non-linear filtering
Morphology
Distance transforms
Connected components
Fourier transforms
Fourier transform pairs
Two-dimensional Fourier transforms
Wiener filtering
Application: Sharpening, blur, and noise removal
Pyramids and wavelets
Interpolation
Decimation
Multi-resolution representations
Wavelets
Application: Image blending
Geometric transformations
Parametric transformations
Mesh-based warping
Application: Feature-based morphing
Global optimization
Regularization
Markov random fields
Application: Image restoration
Additional reading
Exercises
Feature detection and matching
Points and patches
Feature detectors
Feature descriptors
Feature matching
Feature tracking
Application: Performance-driven animation
Edges
Edge detection
Edge linking
Application: Edge editing and enhancement
Lines
Successive approximation
Hough transforms
Vanishing points
Application: Rectangle detection
Additional reading
Exercises
Segmentation
Active contours
Snakes
Dynamic snakes and CONDENSATION
Scissors
Level sets
Application: Contour tracking and rotoscoping
Split and merge
Watershed
Region splitting (divisive clustering)
Region merging (agglomerative clustering)
Graph-based segmentation
Probabilistic aggregation
Mean shift and mode finding
K-means and mixtures of Gaussians
Mean shift
Normalized cuts
Graph cuts and energy-based methods
Application: Medical image segmentation
Additional reading
Exercises
Feature-based alignment
2D and 3D feature-based alignment
2D alignment using least squares
Application: Panography
Iterative algorithms
Robust least squares and RANSAC
3D alignment
Pose estimation
Linear algorithms
Iterative algorithms
Application: Augmented reality
Geometric intrinsic calibration
Calibration patterns
Vanishing points
Application: Single view metrology
Rotational motion
Radial distortion
Additional reading
Exercises
Structure from motion
Triangulation
Two-frame structure from motion
Projective (uncalibrated) reconstruction
Self-calibration
Application: View morphing
Factorization
Perspective and projective factorization
Application: Sparse 3D model extraction
Bundle adjustment
Exploiting sparsity
Application: Match move and augmented reality
Uncertainty and ambiguities
Application: Reconstruction from Internet photos
Constrained structure and motion
Line-based techniques
Plane-based techniques
Additional reading
Exercises
Dense motion estimation
Translational alignment
Hierarchical motion estimation
Fourier-based alignment
Incremental refinement
Parametric motion
Application: Video stabilization
Learned motion models
Spline-based motion
Application: Medical image registration
Optical flow
Multi-frame motion estimation
Application: Video denoising
Application: De-interlacing
Layered motion
Application: Frame interpolation
Transparent layers and reflections
Additional reading
Exercises
Image stitching
Motion models
Planar perspective motion
Application: Whiteboard and document scanning
Rotational panoramas
Gap closing
Application: Video summarization and compression
Cylindrical and spherical coordinates
Global alignment
Bundle adjustment
Parallax removal
Recognizing panoramas
Direct vs. feature-based alignment
Compositing
Choosing a compositing surface
Pixel selection and weighting (de-ghosting)
Application: Photomontage
Blending
Additional reading
Exercises
Computational photography
Photometric calibration
Radiometric response function
Noise level estimation
Vignetting
Optical blur (spatial response) estimation
High dynamic range imaging
Tone mapping
Application: Flash photography
Super-resolution and blur removal
Color image demosaicing
Application: Colorization
Image matting and compositing
Blue screen matting
Natural image matting
Optimization-based matting
Smoke, shadow, and flash matting
Video matting
Texture analysis and synthesis
Application: Hole filling and inpainting
Application: Non-photorealistic rendering
Additional reading
Exercises
Stereo correspondence
Epipolar geometry
Rectification
Plane sweep
Sparse correspondence
3D curves and profiles
Dense correspondence
Similarity measures
Local methods
Sub-pixel estimation and uncertainty
Application: Stereo-based head tracking
Global optimization
Dynamic programming
Segmentation-based techniques
Application: Z-keying and background replacement
Multi-view stereo
Volumetric and 3D surface reconstruction
Shape from silhouettes
Additional reading
Exercises
3D reconstruction
Shape from X
Shape from shading and photometric stereo
Shape from texture
Shape from focus
Active rangefinding
Range data merging
Application: Digital heritage
Surface representations
Surface interpolation
Surface simplification
Geometry images
Point-based representations
Volumetric representations
Implicit surfaces and level sets
Model-based reconstruction
Architecture
Heads and faces
Application: Facial animation
Whole body modeling and tracking
Recovering texture maps and albedos
Estimating BRDFs
Application: 3D photography
Additional reading
Exercises
Image-based rendering
View interpolation
View-dependent texture maps
Application: Photo Tourism
Layered depth images
Impostors, sprites, and layers
Light fields and Lumigraphs
Unstructured Lumigraph
Surface light fields
Application: Concentric mosaics
Environment mattes
Higher-dimensional light fields
The modeling to rendering continuum
Video-based rendering
Video-based animation
Video textures
Application: Animating pictures
3D Video
Application: Video-based walkthroughs
Additional reading
Exercises
Recognition
Object detection
Face detection
Pedestrian detection
Face recognition
Eigenfaces
Active appearance and 3D shape models
Application: Personal photo collections
Instance recognition
Geometric alignment
Large databases
Application: Location recognition
Category recognition
Bag of words
Part-based models
Recognition with segmentation
Application: Intelligent photo editing
Context and scene understanding
Learning and large image collections
Application: Image search
Recognition databases and test sets
Additional reading
Exercises
Conclusion
Linear algebra and numerical techniques
Matrix decompositions
Singular value decomposition
Eigenvalue decomposition
QR factorization
Cholesky factorization
Linear least squares
Total least squares
Non-linear least squares
Direct sparse matrix techniques
Variable reordering
Iterative techniques
Conjugate gradient
Preconditioning
Multigrid
Bayesian modeling and inference
Estimation theory
Likelihood for multivariate Gaussian noise
Maximum likelihood estimation and least squares
Robust statistics
Prior models and Bayesian inference
Markov random fields
Gradient descent and simulated annealing
Dynamic programming
Belief propagation
Graph cuts
Linear programming
Uncertainty estimation (error analysis)
Supplementary material
Data sets
Software
Slides and lectures
Bibliography
References
Index
Computer Vision: Algorithms and Applications
Richard Szeliski
September 3, 2010 draft
© 2010 Springer

This electronic draft is for non-commercial personal use only, and may not be posted or re-distributed in any form. Please refer interested readers to the book's Web site at http://szeliski.org/Book/.
This book is dedicated to my parents, Zdzisław and Jadwiga, and my family, Lyn, Anne, and Stephen.
1 Introduction (page 1)
What is computer vision? • A brief history • Book overview • Sample syllabus • Notation

2 Image formation (page 29)
Geometric primitives and transformations • Photometric image formation • The digital camera

3 Image processing (page 99)
Point operators • Linear filtering • More neighborhood operators • Fourier transforms • Pyramids and wavelets • Geometric transformations • Global optimization

4 Feature detection and matching (page 205)
Points and patches • Edges • Lines

5 Segmentation (page 267)
Active contours • Split and merge • Mean shift and mode finding • Normalized cuts • Graph cuts and energy-based methods

6 Feature-based alignment (page 309)
2D and 3D feature-based alignment • Pose estimation • Geometric intrinsic calibration

7 Structure from motion (page 343)
Triangulation • Two-frame structure from motion • Factorization • Bundle adjustment • Constrained structure and motion

8 Dense motion estimation (page 381)
Translational alignment • Parametric motion • Spline-based motion • Optical flow • Layered motion

9 Image stitching (page 427)
Motion models • Global alignment • Compositing

10 Computational photography (page 467)
Photometric calibration • High dynamic range imaging • Super-resolution and blur removal • Image matting and compositing • Texture analysis and synthesis

11 Stereo correspondence (page 533)
Epipolar geometry • Sparse correspondence • Dense correspondence • Local methods • Global optimization • Multi-view stereo

12 3D reconstruction (page 577)
Shape from X • Active rangefinding • Surface representations • Point-based representations • Volumetric representations • Model-based reconstruction • Recovering texture maps and albedos

13 Image-based rendering (page 619)
View interpolation • Layered depth images • Light fields and Lumigraphs • Environment mattes • Video-based rendering

14 Recognition (page 655)
Object detection • Face recognition • Instance recognition • Category recognition • Context and scene understanding • Recognition databases and test sets
Preface

The seeds for this book were first planted in 2001 when Steve Seitz at the University of Washington invited me to co-teach a course called "Computer Vision for Computer Graphics". At that time, computer vision techniques were increasingly being used in computer graphics to create image-based models of real-world objects, to create visual effects, and to merge real-world imagery using computational photography techniques. Our decision to focus on the applications of computer vision to fun problems such as image stitching and photo-based 3D modeling from personal photos seemed to resonate well with our students.

Since that time, a similar syllabus and project-oriented course structure has been used to teach general computer vision courses both at the University of Washington and at Stanford. (The latter was a course I co-taught with David Fleet in 2003.) Similar curricula have been adopted at a number of other universities and also incorporated into more specialized courses on computational photography. (For ideas on how to use this book in your own course, please see Table 1.1 in Section 1.4.)

This book also reflects my 20 years' experience doing computer vision research in corporate research labs, mostly at Digital Equipment Corporation's Cambridge Research Lab and at Microsoft Research. In pursuing my work, I have mostly focused on problems and solution techniques (algorithms) that have practical real-world applications and that work well in practice. Thus, this book has more emphasis on basic techniques that work under real-world conditions and less on more esoteric mathematics that has intrinsic elegance but less practical applicability.

This book is suitable for teaching a senior-level undergraduate course in computer vision to students in both computer science and electrical engineering. I prefer students to have either an image processing or a computer graphics course as a prerequisite so that they can spend less time learning general background mathematics and more time studying computer vision techniques. The book is also suitable for teaching graduate-level courses in computer vision (by delving into the more demanding application and algorithmic areas) and as a general reference to fundamental techniques and the recent research literature. To this end, I have attempted wherever possible to at least cite the newest research in each sub-field, even if the technical details are too complex to cover in the book itself.

In teaching our courses, we have found it useful for the students to attempt a number of small implementation projects, which often build on one another, in order to get them used to working with real-world images and the challenges that these present. The students are then asked to choose an individual topic for each of their small-group final projects. (Sometimes these projects even turn into conference papers!) The exercises at the end of each chapter contain numerous suggestions for smaller mid-term projects, as well as more open-ended problems whose solutions are still active research topics. Wherever possible, I encourage students to try their algorithms on their own personal photographs, since this better motivates them, often leads to creative variants on the problems, and better acquaints them with the variety and complexity of real-world imagery.

In formulating and solving computer vision problems, I have often found it useful to draw inspiration from three high-level approaches:

• Scientific: build detailed models of the image formation process and develop mathematical techniques to invert these in order to recover the quantities of interest (where necessary, making simplifying assumptions to make the mathematics more tractable).

• Statistical: use probabilistic models to quantify the prior likelihood of your unknowns and the noisy measurement processes that produce the input images, then infer the best possible estimates of your desired quantities and analyze their resulting uncertainties. The inference algorithms used are often closely related to the optimization techniques used to invert the (scientific) image formation processes.

• Engineering: develop techniques that are simple to describe and implement but that are also known to work well in practice. Test these techniques to understand their limitations and failure modes, as well as their expected computational costs (run-time performance).

These three approaches build on each other and are used throughout the book.

My personal research and development philosophy (and hence the exercises in the book) has a strong emphasis on testing algorithms. It's too easy in computer vision to develop an algorithm that does something plausible on a few images rather than something correct. The best way to validate your algorithms is to use a three-part strategy. First, test your algorithm on clean synthetic data, for which the exact results are known. Second, add noise to the data and evaluate how the performance degrades as a function of noise level. Finally, test the algorithm on real-world data, preferably drawn from a wide variety of sources, such as photos found on the Web. Only then can you truly know if your algorithm can deal with real-world complexity, i.e., images that do not fit some simplified model or assumptions.
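The three-part testing strategy above lends itself to a small evaluation harness. The Python/NumPy sketch below is one possible illustration of that strategy, not code from the book: it uses a toy least-squares line-fitting estimator as a stand-in for an actual vision algorithm, and the function names, noise levels, and trial counts are assumptions made up for this example.

```python
# A minimal sketch of the three-part validation strategy, assuming a toy
# line-fitting estimator stands in for the algorithm under test.
import numpy as np

def fit_line(points):
    """Fit y = a*x + b to an (N, 2) array of points by linear least squares."""
    x, y = points[:, 0], points[:, 1]
    A = np.column_stack([x, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

def synthetic_points(a, b, n=100, noise=0.0, rng=None):
    """Generate points on y = a*x + b, optionally perturbed by Gaussian noise."""
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.linspace(0.0, 10.0, n)
    y = a * x + b + rng.normal(0.0, noise, size=n)
    return np.column_stack([x, y])

# 1. Clean synthetic data: the exact answer is known, so the estimate
#    should agree with it to numerical precision.
a_true, b_true = 2.0, -1.0
a_est, b_est = fit_line(synthetic_points(a_true, b_true))
assert abs(a_est - a_true) < 1e-9 and abs(b_est - b_true) < 1e-9

# 2. Add noise and measure how the error grows with the noise level.
rng = np.random.default_rng(1)
for sigma in (0.01, 0.1, 0.5, 1.0, 2.0):
    errors = [abs(fit_line(synthetic_points(a_true, b_true, noise=sigma, rng=rng))[0] - a_true)
              for _ in range(50)]  # average over repeated trials
    print(f"noise sigma={sigma:4.2f}  mean slope error={np.mean(errors):.4f}")

# 3. Real-world data: run the same estimator on measurements taken from your
#    own photographs (e.g., points sampled along a straight building edge).
#    There is no ground truth here, so what you are checking is plausibility,
#    behavior across varied images, and the algorithm's failure modes.
```

The same skeleton carries over when the toy estimator is replaced by, say, a feature matcher or an optical flow method; only the synthetic data generator and the error metric need to change.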