Decision Forests for Computer Vision and Medical Image Analysis
Foreword
Preface
Acknowledgements
Contents
Contributors
Chapter 1: Overview and Scope
A Chronological Literature Review
Chapter 2: Notation and Terminology
2.1 Notation
2.2 Common Terms
Part I: The Decision Forest Model
Chapter 3: Introduction: The Abstract Forest Model
3.1 Decision Tree Basics
3.1.1 Tree Data Structure
3.1.2 Decision Tree
3.2 Mathematical Notation and Basic Definitions
3.2.1 Data Point and Features
3.2.2 Test Functions, Split Functions and Weak Learners
3.2.3 Training Points and Training Sets
3.3 Randomly Trained Decision Trees
3.3.1 Tree Testing (On-line Phase)
3.3.2 Tree Training (Off-line Phase)
Tree Structure
3.3.3 Weak Learner Models
Linear Data Separation
Non-linear Data Separation
3.3.4 Energy Models
Entropy and Information Gain
3.3.5 Leaf Prediction Models
3.3.6 The Randomness Model
3.3.6.1 Bagging
3.3.6.2 Randomized Node Optimization (RNO)
3.4 Combining Trees into a Forest Ensemble
3.4.1 Key Model Parameters
3.5 Summary
Chapter 4: Classification Forests
4.1 Classification Algorithms in the Literature
4.2 Specializing the Decision Forest Model for Classification
Problem Statement
4.2.1 The Training Objective Function
4.2.2 Class Re-balancing
4.2.3 Randomness
4.2.4 The Leaf and Ensemble Prediction Models
4.3 Effect of Model Parameters
4.3.1 The Effect of the Forest Size on Generalization
4.3.2 Multiple Classes and Training Noise
4.3.3 The Effect of the Tree Depth
4.3.4 The Effect of the Weak Learner
4.3.5 The Effect of Randomness
4.4 Maximum Margin Classification with Forests
4.4.1 The Effect of Randomness on Optimal Separation
4.4.2 Influence of the Weak Learner Model
4.4.3 Maximum Margin with Multiple Classes
4.4.4 The Effect of the Randomness Model
4.5 Summary
4.6 Exercises and Experiments
Chapter 5: Regression Forests
5.1 Non-linear Regression in the Literature
5.2 Specializing the Decision Forest Model for Regression
Problem Statement
What Are Regression Forests?
5.2.1 The Prediction Model
5.2.2 The Ensemble Model
5.2.3 Randomness Model
5.2.4 The Training Objective Function
5.2.5 The Weak Learner Model
5.3 Effect of Model Parameters
5.3.1 The Effect of the Forest Size
5.3.2 The Effect of the Tree Depth
5.3.3 Spatial Smoothness and Testing Uncertainty
5.4 Summary
5.5 Exercises and Experiments
Chapter 6: Density Forests
6.1 Density Estimation in the Literature
6.2 Specializing the Forest Model for Density Estimation
Problem Statement
What Are Density Forests?
6.2.1 The Training Objective Function
Discussion
6.2.2 The Prediction Model
The Partition Function
6.2.3 The Ensemble Model
6.2.4 Discussion
6.3 Effect of Model Parameters
6.3.1 The Effect of Tree Depth
6.3.2 The Effect of Forest Size
6.3.3 More Complex Examples
6.4 Comparison with Alternative Algorithms
6.4.1 Comparison with Non-parametric Estimators
6.4.2 Comparison with GMM EM
6.4.3 Comparing Computational Complexity
6.5 Sampling from the Generative Model
6.5.1 Efficiency
6.5.2 Results
6.6 Quantitative Analysis
6.7 Summary
6.8 Exercises and Experiments
Chapter 7: Manifold Forests
7.1 Manifold Learning and Dimensionality Reduction in the Literature
7.2 Specializing the Forest Model for Manifold Learning
Problem Statement
What Are Manifold Forests?
7.2.1 The Training Objective Function
7.2.2 The Predictor Model
7.2.3 The Affinity Model
7.2.4 The Ensemble Model
7.2.5 Estimating the Embedding Function
7.2.6 Mapping Previously Unseen Points
7.2.7 Properties and Advantages
7.2.7.1 Ensemble Clustering for Distance Estimation
An Illustrative Example
7.2.7.2 Choosing the Feature Space
7.2.7.3 Computational Efficiency
7.2.7.4 Estimating the Target Intrinsic Dimensionality
7.3 Experiments and the Effect of Model Parameters
7.3.1 The Effect of the Forest Size
7.3.2 Manifold Learning in Higher Dimensions
7.3.3 Discovering the Manifold Intrinsic Dimensionality
7.4 Summary
Chapter 8: Semi-supervised Classification Forests
8.1 Semi-supervised Learning in the Literature
8.2 Specializing the Decision Forest Model for Semi-supervised Classification
Problem Statement
What Are Semi-supervised Forests?
8.2.1 The Training Objective Function
8.3 Transduction Trees for Classifying Already Available Data
8.3.1 Transductive Ensemble
8.4 Induction from Transduction
8.4.1 Inductive Ensemble
8.4.2 Discussion
8.5 Examples, Comparisons and Effect of Model Parameters
8.5.1 The Effect of Additional Supervision, and Active Learning
8.5.2 Comparison with Transductive SVMs
8.5.3 Handling Multiple Classes
8.5.4 The Effect of Tree Depth
8.6 Summary
8.7 Exercises and Experiments
Part II: Applications in Computer Vision and Medical Image Analysis
Chapter 9: Keypoint Recognition Using Random Forests and Random Ferns
9.1 Introduction
9.2 Wide-Baseline Point Matching as a Classification Problem
9.3 Keypoint Recognition with Classification Forests
9.3.1 Random Classification Forests
9.3.2 Node Tests
9.3.3 Building the Trees
9.4 Keypoint Recognition with Random Ferns
9.4.1 Random Ferns
9.4.2 Training the Ferns
9.5 Comparing Random Forests, Random Ferns, and SIFT
9.5.1 Empirical Comparisons of Trees and Ferns
9.5.2 Empirical Comparisons Between SIFT and Ferns
9.6 Discussion
9.7 Application Example
9.8 Conclusion
Chapter 10: Extremely Randomized Trees and Random Subwindows for Image Classification, Annotation, and Retrieval
10.1 Introduction
10.2 Random Subwindow-Based Image Analysis
10.2.1 Training Stage
10.2.2 Prediction Stage
10.2.3 Discussion
10.2.3.1 Subwindows Extraction
10.2.3.2 Subwindows Transformation
10.2.3.3 Feature Extraction
10.3 Extremely Randomized Trees
10.3.1 Extremely and Totally Randomized Trees
10.3.2 Multiple Output Trees
10.3.3 Kernel View of Tree-Based Ensembles
10.4 Applications
10.4.1 Content-Based Image Retrieval
10.4.1.1 Method
10.4.1.2 Illustration
10.4.2 Image Classification
10.4.2.1 Method
10.4.2.2 Illustration
10.4.3 Interest Point Detection
10.4.3.1 Method
10.4.3.2 Illustration
10.4.4 Image Segmentation
10.4.4.1 Method
10.4.4.2 Illustration
10.5 Conclusions and Future Works
Chapter 11: Class-Specific Hough Forests for Object Detection
11.1 Introduction
11.2 Related Work
11.3 Building Hough Forests
11.3.1 Training Data and Leaf Information
11.3.2 Patch Appearance and Binary Tests
11.3.3 Tree Construction
11.4 Object Detection with Hough Forests
11.4.1 Handling Variable Scales and Aspect Ratios
11.4.2 Leaf Pruning
11.5 Experiments
11.5.1 UIUC Cars
11.5.2 TUD Pedestrians, INRIA Pedestrians, Weizmann Horses
11.5.3 Does Offset Supervision Matter?
11.6 Discussion
Chapter 12: Hough-Based Tracking of Deformable Objects
12.1 Introduction
12.2 Related Work
12.3 Online Hough Ferns
12.3.1 Random Ferns
12.3.2 Hough Voting
12.3.3 Incremental Leaf Node Statistics
12.3.4 Online Adaptation
12.3.5 Support
12.4 Closing the Tracking Loop
12.5 Experimental Evaluation
12.5.1 Bounding Box Dataset
12.5.2 Tracking Deformable Objects
12.5.3 Bounding Box vs. Segmentation-Based Tracking
12.5.4 Discussion and Implementation Details
12.6 Conclusion
Chapter 13: Efficient Human Pose Estimation from Single Depth Images
13.1 Introduction
13.2 Data
Body Part Classification Labels
Offset Joint Regression Labels
13.3 Method
13.3.1 Depth Image Features
13.3.2 Weak Learners
13.3.3 Leaf Node Prediction Models
Body Part Classification (bpc)
Offset Joint Regression (ojr)
13.3.4 Aggregating Predictions
13.3.5 Training
13.3.5.1 Tree Structure Training
Classification
Regression
Discussion
13.3.5.2 Leaf Node Prediction Models
13.4 Experiments
13.4.1 Test Data
13.4.2 Qualitative Results
13.4.3 Comparison Between bpc and ojr
13.4.4 Comparison to Ganapathi et al. [121]
13.5 Conclusions
Chapter 14: Anatomy Detection and Localization in 3D Medical Images
14.1 Introduction
Regression Approach
Comparison with Classification-Based Approaches
Comparison with Registration-Based Approaches
14.2 Organ Localization as Regression Task
14.2.1 Parametrization and Regression Formulation
14.3 Regression Forests for Organ Localization
14.3.1 Feature Responses for Application in CT and MR
Visual Features in CT
Visual Features in MR
14.3.2 Regression Forest Learning
Weak Learners
Objective Function
Stopping Criterion
Discussion
14.3.3 Regression Forest Prediction
Forest Testing
Organ Localization
Organ Detection
14.4 Results, Comparisons and Validation
14.4.1 Anatomy Localization in Computed Tomography Scans
The Labeled CT Database
Quantitative Evaluation
Computational Efficiency
Comparison with Affine, Atlas-Based Registration
Automatic Landmark Detection
14.4.2 Anatomy Localization in Magnetic Resonance Scans
The Labeled MR Database
Comparative Experiments
14.5 Conclusion
Chapter 15: Semantic Texton Forests for Image Categorization and Segmentation
15.1 Introduction
15.1.1 Related Work
15.2 Randomized Decision Forests
15.3 Semantic Texton Forests
15.3.1 Learning Invariances
15.3.2 Implementation Details
15.3.3 Bags of Semantic Textons
Implementation Details
15.4 Image Categorization
15.5 Semantic Segmentation
Appearance Context vs. Semantic Context
15.5.1 Image-Level Prior
15.6 Experiments
15.6.1 Learning the Semantic Texton Forest Vocabulary
15.6.2 Image Categorization
15.6.3 Semantic Segmentation
15.6.3.1 Experiments on the MSRC Dataset [341]
15.6.3.2 Experiments on the VOC 2007 Segmentation Dataset [99]
15.7 Conclusions
Chapter 16: Semi-supervised Video Segmentation Using Decision Forests
16.1 Introduction
16.2 Literature Review
16.2.1 Unsupervised Video Segmentation
16.2.2 Classification-Based Segmentation
Unstructured Classification
Structured Classification
16.2.3 Semi-supervised Video Segmentation
16.2.4 Work-Flow-Based Video Segmentation
16.3 Video Model for Semi-supervised Segmentation
16.3.1 Model Construction
16.3.2 Inference
16.3.2.1 From Patches to Pixel Posteriors
16.3.2.2 Forward and Backward Trees
16.3.2.3 Intra-frame Smoothing of Pixel Labels
16.3.3 Learning Pixel Unaries Using a Random Forest
16.4 Experiments and Results
16.5 Advantages and Drawbacks
16.6 Conclusions
Chapter 17: Classification Forests for Semantic Segmentation of Brain Lesions in Multi-channel MRI
17.1 Introduction
Why Classification Forests for Medical Image Segmentation?
17.2 Classification Forests for Segmentation of Brain Tumor Tissues
17.2.1 Context-Aware Voxel Classification
17.2.1.1 Estimating Initial Tissue Probabilities
17.2.1.2 Spatial Visual Features
17.2.2 Evaluation
17.2.2.1 Experiments
17.3 Classification Forests for Segmentation of MS Lesions
17.3.1 Data
17.3.2 Feature Types
17.3.3 Experiments
17.3.4 Discussion
17.3.4.1 Interpretation of Segmentation Results
17.3.4.2 Influence of Forest Parameters
17.3.4.3 Analysis of the Relevance of Feature Types and Channels
17.4 Conclusion
Chapter 18: Manifold Forests for Multi-modality Classification of Alzheimer's Disease
18.1 Introduction
18.2 Background: Biomarkers for Alzheimer's Disease
18.2.1 Cerebrospinal Fluid Measures of Neuropathology
18.2.2 Functional Imaging with Positron Emission Tomography
18.2.3 Structural Imaging with Magnetic Resonance Imaging
18.2.4 Risk Factors for Disease Development
18.3 Multi-modality Classification Framework
18.3.1 Manifold Forests for Multi-modality Classification
18.3.2 Implementation Details
18.4 Neuroimaging and Biological Data for Evaluation
18.4.1 Neuroimaging Features
18.4.2 Biological Features
18.5 Classification Experiments and Results
18.5.1 Single-Modality Classification Results
18.5.2 Single-Modality Similarity-Based Classification Results
18.5.3 Multi-modality Similarity-Based Classification Results
18.6 Conclusions
Chapter 19: Entanglement and Differentiable Information Gain Maximization
19.1 Introduction
19.2 Entangled Decision Forests
19.2.1 Entanglement Feature Design
19.2.1.1 Appearance Features
19.2.1.2 Semantic Context Entanglement Features
MAPClass Entanglement Features
TopNClasses Entanglement Features
NodeDescendant Entanglement Features
AncestorNodePair Entanglement Features
19.2.2 Guiding Feature Selection by Learned Proposal Distributions
19.2.3 Results
Qualitative Results
Quantitative Impact of Each Contribution
Efficiency Considerations
Inspecting the Chosen Features
19.3 Differentiable Information Gain Maximization
19.3.1 Formulating Differentiable Information Gain
Soft Label Decision Forests
19.3.2 Split Function Design and Gradient Ascent Optimization
Hyperplane Partition
Hyper-Ellipsoid Partition
19.3.3 Object Tracking via Information Gain Maximization
GainMax in Feature Space: Updating the Appearance Model
GainMax in Image Space: Tracking the Target
19.3.4 Results
19.3.4.1 Classification of Public Machine Learning Datasets
19.3.4.2 Object Tracking in Videos
19.4 Discussion and Conclusion
Chapter 20: Decision Tree Fields: An Efficient Non-parametric Random Field Model for Image Labeling
20.1 Introduction
20.2 Related Work
20.3 Model
20.3.1 Relation to Other Models
20.4 Learning Decision Tree Fields
20.4.1 Maximum Likelihood Learning
20.4.2 Pseudolikelihood
20.4.3 Learning the Tree Structure
20.4.4 Complexity of Training
20.4.5 Making Training Efficient
20.4.6 Inference
20.5 Experiments
20.5.1 Conditional Interactions: Toy Snake Dataset
20.5.2 Learning Calligraphy: Chinese Characters
20.5.3 Body-Part Detection
20.6 Conclusion
Part III: Implementation and Conclusion
Chapter 21: Efficient Implementation of Decision Forests
21.1 Introduction
21.1.1 Notation
21.2 Depth First and Breadth First Training
21.2.1 Depth First Training
21.2.2 Breadth First Training
21.2.3 Properties
21.3 Weak Learner Implementation
21.3.1 Computing Responses on the Fly
21.3.2 Multiple Candidate Thresholds per Feature
21.3.3 Efficient Threshold Selection
21.4 Data Structures
21.4.1 Tree Memory Organization
21.4.2 Split and Leaf Node Representation
21.5 Parallelization
21.5.1 Multi-core CPU Implementation
21.5.1.1 Parallel-For
21.5.1.2 Parallel Training
21.5.1.3 Parallel Inference
21.5.1.4 Multi-threading Considerations
21.5.2 GPU Implementation
21.5.3 Distributed Computing Implementation
21.6 Parameter Tuning
21.6.1 Maximum Tree Depth, D
21.6.2 Number of Trees in the Forest, T
21.6.3 Number of Candidate Weak Learners
21.6.4 Number of Candidate Response Thresholds per Feature
21.7 Tricks of the Trade
21.7.1 In-place Partitioning of Training Data
21.7.2 Retrospective Tuning of Maximum Tree Depth
21.7.3 Dealing with Unbalanced Datasets
21.7.4 Bagging and Leaf Predictors
21.7.5 Augmenting Your Training Set
21.7.6 Sampling Weak Learner Parameters
21.7.7 Reducing Correlation Between Trees in a Forest
21.8 Conclusion
Chapter 22: The Sherwood Software Library
22.1 Introduction
22.2 Getting Started
22.3 Architectural Overview
22.4 Performance Considerations
22.5 Application Illustration: Supervised Classification
22.5.1 Implementing Interfaces
22.5.2 Training the Forest
22.5.3 Serialization and Deserialization
22.5.4 Testing
22.6 Other Applications
Density Estimation
1D-1D Regression
Semi-supervised Classification
22.7 Summary
Chapter 23: Conclusions
23.1 Summary
23.2 Successes and Challenges
Success Stories
Remaining Challenges
23.3 Conclusion
Further Material
References
Index
Advances in Computer Vision and Pattern Recognition
For further volumes: www.springer.com/series/4205
A. Criminisi · J. Shotton
Editors

Decision Forests for Computer Vision and Medical Image Analysis
Editors
A. Criminisi
Microsoft Research Ltd.
Cambridge, UK

J. Shotton
Microsoft Research Ltd.
Cambridge, UK

Series Editors
Prof. Sameer Singh
Research School of Informatics
Loughborough University
Loughborough, UK

Dr. Sing Bing Kang
Microsoft Research, Microsoft Corporation
Redmond, WA, USA

ISSN 2191-6586    ISSN 2191-6594 (electronic)
Advances in Computer Vision and Pattern Recognition
ISBN 978-1-4471-4928-6    ISBN 978-1-4471-4929-3 (eBook)
DOI 10.1007/978-1-4471-4929-3
Springer London Heidelberg New York Dordrecht

Library of Congress Control Number: 2013930423

© Springer-Verlag London 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
In memory of Josh, a wonderful brother, dearly missed
Foreword

We are grateful to the authors for inviting us to comment on the evolution of random decision forests. We will begin by recounting our participation in the history (and hope that the evoked memories are not randomized), continue with the recent and dramatic advances in both research and scholarship reported in the book, and finish with some speculative remarks.

For us, the story starts in the Spring of 1994 when we were failing to accurately classify handwritten digits using a decision tree. We were pretty confident that the features, which posed questions about the spatial relationship among local patterns, were discriminating. But we could not ask nearly enough of them and the best classification rate was just above 90 %. The NIST dataset had about 6,000 training images per digit, but as anybody knows who has induced a decision tree from data, the sample size goes down fast and estimation errors take over. So we could not ask more than say ten questions along any branch. We tried the recipes for pruning in [45] and elsewhere, and other ways to mitigate over-fitting, but no amount of fiddling made much of a difference.

Although building multiple trees was the “obvious” thing to do, it hadn’t occurred to us until then. The first effort, which involved making ten trees by dividing the pool of questions among the trees, worked splendidly and lifted us well above 95 %. We sensed a breakthrough. Subsequently, we explored different randomization procedures and rather quickly settled on generating a small random sample of the question pool at every node of every tree, eventually exceeding 99 % accuracy with a forest of many trees. The features were binary queries about the relative locations of local features and hence did not require any normalization or preprocessing of the image.

The space of relational graphs is very large and provided a very rich source of linked queries. Each tree probed some aspects of an essentially infinite graph of relationships, with each new question refining the accumulated structure. In this way, growing the trees provided a greedy mechanism to explore this space of representations. We also explored clustering shape space by growing generic trees with unlabeled data [5], as well as initially growing trees with a small number of classes and then reestimating leaf node distributions and possibly deepening the trees with additional data
from new classes. The work on trees for density estimation and manifold learning described in this book develops these ideas in multiple, fascinating directions.

In October 1994 we both went to a workshop on statistics and information theory [169], where we met Leo Breiman. One of us (DG) spoke about “randomized trees” at the workshop; this was the name we used in a technical report [4]. At the workshop, and continuing over dinner, we discussed trees with Leo, who described “bagging” (re-sampling the training set). We were all excited about advancing inductive learning with randomized classifiers. Soon after we became aware of two closely related lines of research. One was boosting, introduced earlier by Freund and Schapire. The other was the work of Ho [165], also published in the mid-1990s, which independently introduced a variation on randomization. In 1999 DG spent time with Leo at another conference [298] discussing the tradeoffs between single-tree accuracy and tree-to-tree correlation as the degree of randomness was varied.

From our perspective the source of success of both boosting and randomization was the ability to create moderately accurate trees that are weakly correlated conditional on the class. Boosting achieves this through the increased weights on the misclassified training samples, and is largely insensitive to the particular re-weighting protocol. Randomization achieves this through sparse sampling from a large space of queries. In fact, both methods effectively sample the trees from some distribution on the space of trees; randomized trees do this explicitly and boosting in the limit cycle of a deterministic tree-generating process (see the discussion [113] of Breiman’s 1997 paper on ARCing). By the way, back in 1994 we had speculated with Leo that bagging might be too stable, i.e. the distribution over trees resulting from mere re-sampling of the data might be too concentrated. With well-separated classes boosting can achieve lower error rates, but there is a risk of over-fitting; as discussed in [3], a combination of boosting and randomization is effective with multiple classes.

Random forests had very strong competition during the nineties from support vector machines. Whereas in practice the results on standard problems are about the same, SVMs had the appeal of a clean optimization framework, well-understood algorithms and an array of theoretical results. More recently, random forests have become popular in computational genomics, probably due to the relative transparency of the decision rules. Random forests have two other important advantages: speed and minimal storage. Nonlinear SVMs require storing large numbers of training samples, and computing large numbers of inner products. In contrast, it is hard to find something more efficient than storing and answering a series of simple questions.

This brings us to recent applications of decision and regression forests. Surely the best known is now the Kinect skeleton tracker; it is impressive how well simple depth-based comparison questions work. The Kinect solution is particularly interesting in that it synthesizes randomization with relational shape queries, the generalized Hough transform and shape context, which has proven very powerful in shape recognition. The main problem with the original formulation of shape context in [24] was the intense computation; exploring the neighborhood of a pixel using randomized trees provides a very efficient alternative.
Also related is the application of regression forests to anatomy detection in 3D computed tomography (CT) images; using a form of generalized Hough transform that is trained with a predictive machine such as random forests opens up many interesting possibilities. The many other applications presented in this book further and convincingly demonstrate that sequential, adaptive learning and testing can provide a unifying framework for many of the major tasks of modern computational learning.

Looking ahead, we hope these efficient and versatile tools will spread into more applications. Many theoretical questions also remain open, such as determining conditions under which randomized forests really provide an advantage, as well as guidelines for selecting parameters such as tree depth, forest size and sampling density. Finally, whereas a decision forest efficiently integrates evidence by asking a great many questions in total, it still does not capture the full discriminating power of “20 questions” because the “line of reasoning” is broken with each new tree. In principle a single tree can develop a long series of linked questions which continue building on each other, progressing systematically from coarse to fine and exploring some aspect of the object under study in depth. For example, in computer vision, one might need 100 questions to accumulate a detailed understanding about a complex event in the scene involving multiple agents and actions. However, this is impossible in practice with current methodology: how would one store such a deep tree even if there was enough time and data to learn it, or a model under which probabilities could be updated and new queries scored? Perhaps the answer is online construction. After all, at run time (e.g. during scene parsing), only one branch of each tree is traversed—the one dictated by the data being processed. Consequently, online learning might be feasible in some cases; a tracking example, which does not scale up to the complex problems treated in this book, was explored in [123]. Going further will require learning how to maintain a probability distribution over states of interest conditional on all the accumulated evidence.

Baltimore, USA
Yali Amit
Donald Geman