Decision Forests for Computer Vision and Medical Image Analysis
Foreword
Preface
Acknowledgements
Contents
Contributors
Chapter 1: Overview and Scope
A Chronological Literature Review
Chapter 2: Notation and Terminology
2.1 Notation
2.2 Common Terms
Part I: The Decision Forest Model
Chapter 3: Introduction: The Abstract Forest Model
3.1 Decision Tree Basics
3.1.1 Tree Data Structure
3.1.2 Decision Tree
3.2 Mathematical Notation and Basic Definitions
3.2.1 Data Point and Features
3.2.2 Test Functions, Split Functions and Weak Learners
3.2.3 Training Points and Training Sets
3.3 Randomly Trained Decision Trees
3.3.1 Tree Testing (On-line Phase)
3.3.2 Tree Training (Off-line Phase)
Tree Structure
3.3.3 Weak Learner Models
Linear Data Separation
Non-linear Data Separation
3.3.4 Energy Models
Entropy and Information Gain
3.3.5 Leaf Prediction Models
3.3.6 The Randomness Model
3.3.6.1 Bagging
3.3.6.2 Randomized Node Optimization (RNO)
3.4 Combining Trees into a Forest Ensemble
3.4.1 Key Model Parameters
3.5 Summary
Chapter 4: Classification Forests
4.1 Classification Algorithms in the Literature
4.2 Specializing the Decision Forest Model for Classification
Problem Statement
4.2.1 The Training Objective Function
4.2.2 Class Re-balancing
4.2.3 Randomness
4.2.4 The Leaf and Ensemble Prediction Models
4.3 Effect of Model Parameters
4.3.1 The Effect of the Forest Size on Generalization
4.3.2 Multiple Classes and Training Noise
4.3.3 The Effect of the Tree Depth
4.3.4 The Effect of the Weak Learner
4.3.5 The Effect of Randomness
4.4 Maximum Margin Classification with Forests
4.4.1 The Effect of Randomness on Optimal Separation
4.4.2 Influence of the Weak Learner Model
4.4.3 Maximum Margin with Multiple Classes
4.4.4 The Effect of the Randomness Model
4.5 Summary
4.6 Exercises and Experiments
Chapter 5: Regression Forests
5.1 Non-linear Regression in the Literature
5.2 Specializing the Decision Forest Model for Regression
Problem Statement
What Are Regression Forests?
5.2.1 The Prediction Model
5.2.2 The Ensemble Model
5.2.3 Randomness Model
5.2.4 The Training Objective Function
5.2.5 The Weak Learner Model
5.3 Effect of Model Parameters
5.3.1 The Effect of the Forest Size
5.3.2 The Effect of the Tree Depth
5.3.3 Spatial Smoothness and Testing Uncertainty
5.4 Summary
5.5 Exercises and Experiments
Chapter 6: Density Forests
6.1 Density Estimation in the Literature
6.2 Specializing the Forest Model for Density Estimation
Problem Statement
What Are Density Forests?
6.2.1 The Training Objective Function
Discussion
6.2.2 The Prediction Model
The Partition Function
6.2.3 The Ensemble Model
6.2.4 Discussion
6.3 Effect of Model Parameters
6.3.1 The Effect of Tree Depth
6.3.2 The Effect of Forest Size
6.3.3 More Complex Examples
6.4 Comparison with Alternative Algorithms
6.4.1 Comparison with Non-parametric Estimators
6.4.2 Comparison with GMM EM
6.4.3 Comparing Computational Complexity
6.5 Sampling from the Generative Model
6.5.1 Efficiency
6.5.2 Results
6.6 Quantitative Analysis
6.7 Summary
6.8 Exercises and Experiments
Chapter 7: Manifold Forests
7.1 Manifold Learning and Dimensionality Reduction in the Literature
7.2 Specializing the Forest Model for Manifold Learning
Problem Statement
What Are Manifold Forests?
7.2.1 The Training Objective Function
7.2.2 The Predictor Model
7.2.3 The Affinity Model
7.2.4 The Ensemble Model
7.2.5 Estimating the Embedding Function
7.2.6 Mapping Previously Unseen Points
7.2.7 Properties and Advantages
7.2.7.1 Ensemble Clustering for Distance Estimation
An Illustrative Example
7.2.7.2 Choosing the Feature Space
7.2.7.3 Computational Efficiency
7.2.7.4 Estimating the Target Intrinsic Dimensionality
7.3 Experiments and the Effect of Model Parameters
7.3.1 The Effect of the Forest Size
7.3.2 Manifold Learning in Higher Dimensions
7.3.3 Discovering the Manifold Intrinsic Dimensionality
7.4 Summary
Chapter 8: Semi-supervised Classification Forests
8.1 Semi-supervised Learning in the Literature
8.2 Specializing the Decision Forest Model for Semi-supervised Classification
Problem Statement
What Are Semi-supervised Forests?
8.2.1 The Training Objective Function
8.3 Transduction Trees for Classifying Already Available Data
8.3.1 Transductive Ensemble
8.4 Induction from Transduction
8.4.1 Inductive Ensemble
8.4.2 Discussion
8.5 Examples, Comparisons and Effect of Model Parameters
8.5.1 The Effect of Additional Supervision, and Active Learning
8.5.2 Comparison with Transductive SVMs
8.5.3 Handling Multiple Classes
8.5.4 The Effect of Tree Depth
8.6 Summary
8.7 Exercises and Experiments
Part II: Applications in Computer Vision and Medical Image Analysis
Chapter 9: Keypoint Recognition Using Random Forests and Random Ferns
9.1 Introduction
9.2 Wide-Baseline Point Matching as a Classification Problem
9.3 Keypoint Recognition with Classification Forests
9.3.1 Random Classification Forests
9.3.2 Node Tests
9.3.3 Building the Trees
9.4 Keypoint Recognition with Random Ferns
9.4.1 Random Ferns
9.4.2 Training the Ferns
9.5 Comparing Random Forests, Random Ferns, and SIFT
9.5.1 Empirical Comparisons of Trees and Ferns
9.5.2 Empirical Comparisons Between SIFT and Ferns
9.6 Discussion
9.7 Application Example
9.8 Conclusion
Chapter 10: Extremely Randomized Trees and Random Subwindows for Image Classification, Annotation, and Retrieval
10.1 Introduction
10.2 Random Subwindow-Based Image Analysis
10.2.1 Training Stage
10.2.2 Prediction Stage
10.2.3 Discussion
10.2.3.1 Subwindows Extraction
10.2.3.2 Subwindows Transformation
10.2.3.3 Feature Extraction
10.3 Extremely Randomized Trees
10.3.1 Extremely and Totally Randomized Trees
10.3.2 Multiple Output Trees
10.3.3 Kernel View of Tree-Based Ensembles
10.4 Applications
10.4.1 Content-Based Image Retrieval
10.4.1.1 Method
10.4.1.2 Illustration
10.4.2 Image Classification
10.4.2.1 Method
10.4.2.2 Illustration
10.4.3 Interest Point Detection
10.4.3.1 Method
10.4.3.2 Illustration
10.4.4 Image Segmentation
10.4.4.1 Method
10.4.4.2 Illustration
10.5 Conclusions and Future Works
Chapter 11: Class-Specific Hough Forests for Object Detection
11.1 Introduction
11.2 Related Work
11.3 Building Hough Forests
11.3.1 Training Data and Leaf Information
11.3.2 Patch Appearance and Binary Tests
11.3.3 Tree Construction
11.4 Object Detection with Hough Forests
11.4.1 Handling Variable Scales and Aspect Ratios
11.4.2 Leaf Pruning
11.5 Experiments
11.5.1 UIUC Cars
11.5.2 TUD Pedestrians, INRIA Pedestrians, Weizmann Horses
11.5.3 Does Offset Supervision Matter?
11.6 Discussion
Chapter 12: Hough-Based Tracking of Deformable Objects
12.1 Introduction
12.2 Related Work
12.3 Online Hough Ferns
12.3.1 Random Ferns
12.3.2 Hough Voting
12.3.3 Incremental Leaf Node Statistics
12.3.4 Online Adaptation
12.3.5 Support
12.4 Closing the Tracking Loop
12.5 Experimental Evaluation
12.5.1 Bounding Box Dataset
12.5.2 Tracking Deformable Objects
12.5.3 Bounding Box vs. Segmentation-Based Tracking
12.5.4 Discussion and Implementation Details
12.6 Conclusion
Chapter 13: Efficient Human Pose Estimation from Single Depth Images
13.1 Introduction
13.2 Data
Body Part Classification Labels
Offset Joint Regression Labels
13.3 Method
13.3.1 Depth Image Features
13.3.2 Weak Learners
13.3.3 Leaf Node Prediction Models
Body Part Classification (bpc)
Offset Joint Regression (ojr)
13.3.4 Aggregating Predictions
13.3.5 Training
13.3.5.1 Tree Structure Training
Classification
Regression
Discussion
13.3.5.2 Leaf Node Prediction Models
13.4 Experiments
13.4.1 Test Data
13.4.2 Qualitative Results
13.4.3 Comparison Between bpc and ojr
13.4.4 Comparison to Ganapathi et al. [121]
13.5 Conclusions
Chapter 14: Anatomy Detection and Localization in 3D Medical Images
14.1 Introduction
Regression Approach
Comparison with Classification-Based Approaches
Comparison with Registration-Based Approaches
14.2 Organ Localization as Regression Task
14.2.1 Parametrization and Regression Formulation
14.3 Regression Forests for Organ Localization
14.3.1 Feature Responses for Application in CT and MR
Visual Features in CT
Visual Features in MR
14.3.2 Regression Forest Learning
Weak Learners
Objective Function
Stopping Criterion
Discussion
14.3.3 Regression Forest Prediction
Forest Testing
Organ Localization
Organ Detection
14.4 Results, Comparisons and Validation
14.4.1 Anatomy Localization in Computed Tomography Scans
The Labeled CT Database
Quantitative Evaluation
Computational Efficiency
Comparison with Affine, Atlas-Based Registration
Automatic Landmark Detection
14.4.2 Anatomy Localization in Magnetic Resonance Scans
The Labeled MR Database
Comparative Experiments
14.5 Conclusion
Chapter 15: Semantic Texton Forests for Image Categorization and Segmentation
15.1 Introduction
15.1.1 Related Work
15.2 Randomized Decision Forests
15.3 Semantic Texton Forests
15.3.1 Learning Invariances
15.3.2 Implementation Details
15.3.3 Bags of Semantic Textons
Implementation Details
15.4 Image Categorization
15.5 Semantic Segmentation
Appearance Context vs. Semantic Context
15.5.1 Image-Level Prior
15.6 Experiments
15.6.1 Learning the Semantic Texton Forest Vocabulary
15.6.2 Image Categorization
15.6.3 Semantic Segmentation
15.6.3.1 Experiments on the MSRC Dataset [341]
15.6.3.2 Experiments on the VOC 2007 Segmentation Dataset [99]
15.7 Conclusions
Chapter 16: Semi-supervised Video Segmentation Using Decision Forests
16.1 Introduction
16.2 Literature Review
16.2.1 Unsupervised Video Segmentation
16.2.2 Classification-Based Segmentation
Unstructured Classification
Structured Classification
16.2.3 Semi-supervised Video Segmentation
16.2.4 Work-Flow-Based Video Segmentation
16.3 Video Model for Semi-supervised Segmentation
16.3.1 Model Construction
16.3.2 Inference
16.3.2.1 From Patches to Pixel Posteriors
16.3.2.2 Forward and Backward Trees
16.3.2.3 Intra-frame Smoothing of Pixel Labels
16.3.3 Learning Pixel Unaries Using a Random Forest
16.4 Experiments and Results
16.5 Advantages and Drawbacks
16.6 Conclusions
Chapter 17: Classification Forests for Semantic Segmentation of Brain Lesions in Multi-channel MRI
17.1 Introduction
Why Classification Forests for Medical Image Segmentation?
17.2 Classification Forests for Segmentation of Brain Tumor Tissues
17.2.1 Context-Aware Voxel Classification
17.2.1.1 Estimating Initial Tissue Probabilities
17.2.1.2 Spatial Visual Features
17.2.2 Evaluation
17.2.2.1 Experiments
17.3 Classification Forests for Segmentation of MS Lesions
17.3.1 Data
17.3.2 Feature Types
17.3.3 Experiments
17.3.4 Discussion
17.3.4.1 Interpretation of Segmentation Results
17.3.4.2 Influence of Forest Parameters
17.3.4.3 Analysis of the Relevance of Feature Types and Channels
17.4 Conclusion
Chapter 18: Manifold Forests for Multi-modality Classification of Alzheimer's Disease
18.1 Introduction
18.2 Background: Biomarkers for Alzheimer's Disease
18.2.1 Cerebrospinal Fluid Measures of Neuropathology
18.2.2 Functional Imaging with Positron Emission Tomography
18.2.3 Structural Imaging with Magnetic Resonance Imaging
18.2.4 Risk Factors for Disease Development
18.3 Multi-modality Classification Framework
18.3.1 Manifold Forests for Multi-modality Classification
18.3.2 Implementation Details
18.4 Neuroimaging and Biological Data for Evaluation
18.4.1 Neuroimaging Features
18.4.2 Biological Features
18.5 Classification Experiments and Results
18.5.1 Single-Modality Classification Results
18.5.2 Single-Modality Similarity-Based Classification Results
18.5.3 Multi-modality Similarity-Based Classification Results
18.6 Conclusions
Chapter 19: Entanglement and Differentiable Information Gain Maximization
19.1 Introduction
19.2 Entangled Decision Forests
19.2.1 Entanglement Feature Design
19.2.1.1 Appearance Features
19.2.1.2 Semantic Context Entanglement Features
MAPClass Entanglement Features
TopNClasses Entanglement Features
NodeDescendant Entanglement Features
AncestorNodePair Entanglement Features
19.2.2 Guiding Feature Selection by Learned Proposal Distributions
19.2.3 Results
Qualitative Results
Quantitative Impact of Each Contribution
Efficiency Considerations
Inspecting the Chosen Features
19.3 Differentiable Information Gain Maximization
19.3.1 Formulating Differentiable Information Gain
Soft Label Decision Forests
19.3.2 Split Function Design and Gradient Ascent Optimization
Hyperplane Partition
Hyper-Ellipsoid Partition
19.3.3 Object Tracking via Information Gain Maximization
GainMax in Feature Space: Updating the Appearance Model
GainMax in Image Space: Tracking the Target
19.3.4 Results
19.3.4.1 Classification of Public Machine Learning Datasets
19.3.4.2 Object Tracking in Videos
19.4 Discussion and Conclusion
Chapter 20: Decision Tree Fields: An Efficient Non-parametric Random Field Model for Image Labeling
20.1 Introduction
20.2 Related Work
20.3 Model
20.3.1 Relation to Other Models
20.4 Learning Decision Tree Fields
20.4.1 Maximum Likelihood Learning
20.4.2 Pseudolikelihood
20.4.3 Learning the Tree Structure
20.4.4 Complexity of Training
20.4.5 Making Training Efficient
20.4.6 Inference
20.5 Experiments
20.5.1 Conditional Interactions: Toy Snake Dataset
20.5.2 Learning Calligraphy: Chinese Characters
20.5.3 Body-Part Detection
20.6 Conclusion
Part III: Implementation and Conclusion
Chapter 21: Efficient Implementation of Decision Forests
21.1 Introduction
21.1.1 Notation
21.2 Depth First and Breadth First Training
21.2.1 Depth First Training
21.2.2 Breadth First Training
21.2.3 Properties
21.3 Weak Learner Implementation
21.3.1 Computing Responses on the Fly
21.3.2 Multiple Candidate Thresholds per Feature
21.3.3 Efficient Threshold Selection
21.4 Data Structures
21.4.1 Tree Memory Organization
21.4.2 Split and Leaf Node Representation
21.5 Parallelization
21.5.1 Multi-core CPU Implementation
21.5.1.1 Parallel-For
21.5.1.2 Parallel Training
21.5.1.3 Parallel Inference
21.5.1.4 Multi-threading Considerations
21.5.2 GPU Implementation
21.5.3 Distributed Computing Implementation
21.6 Parameter Tuning
21.6.1 Maximum Tree Depth, D
21.6.2 Number of Trees in the Forest, T
21.6.3 Number of Candidate Weak Learners
21.6.4 Number of Candidate Response Thresholds per Feature
21.7 Tricks of the Trade
21.7.1 In-place Partitioning of Training Data
21.7.2 Retrospective Tuning of Maximum Tree Depth
21.7.3 Dealing with Unbalanced Datasets
21.7.4 Bagging and Leaf Predictors
21.7.5 Augmenting Your Training Set
21.7.6 Sampling Weak Learner Parameters
21.7.7 Reducing Correlation Between Trees in a Forest
21.8 Conclusion
Chapter 22: The Sherwood Software Library
22.1 Introduction
22.2 Getting Started
22.3 Architectural Overview
22.4 Performance Considerations
22.5 Application Illustration: Supervised Classification
22.5.1 Implementing Interfaces
22.5.2 Training the Forest
22.5.3 Serialization and Deserialization
22.5.4 Testing
22.6 Other Applications
Density Estimation
1D-1D Regression
Semi-supervised Classification
22.7 Summary
Chapter 23: Conclusions
23.1 Summary
23.2 Successes and Challenges
Success Stories
Remaining Challenges
23.3 Conclusion
Further Material
References
Index
Advances in Computer Vision and Pattern Recognition
For further volumes: www.springer.com/series/4205
A. Criminisi · J. Shotton
Editors

Decision Forests for Computer Vision and Medical Image Analysis
Editors
A. Criminisi
Microsoft Research Ltd.
Cambridge, UK

J. Shotton
Microsoft Research Ltd.
Cambridge, UK

Series Editors
Prof. Sameer Singh
Research School of Informatics
Loughborough University
Loughborough, UK

Dr. Sing Bing Kang
Microsoft Research, Microsoft Corporation
Redmond, WA, USA

ISSN 2191-6586    ISSN 2191-6594 (electronic)
Advances in Computer Vision and Pattern Recognition
ISBN 978-1-4471-4928-6    ISBN 978-1-4471-4929-3 (eBook)
DOI 10.1007/978-1-4471-4929-3
Springer London Heidelberg New York Dordrecht

Library of Congress Control Number: 2013930423

© Springer-Verlag London 2013

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
In memory of Josh, a wonderful brother, dearly missed
Foreword

We are grateful to the authors for inviting us to comment on the evolution of random decision forests. We will begin by recounting our participation in the history (and hope that the evoked memories are not randomized), continue with the recent and dramatic advances in both research and scholarship reported in the book, and finish with some speculative remarks.

For us, the story starts in the Spring of 1994 when we were failing to accurately classify handwritten digits using a decision tree. We were pretty confident that the features, which posed questions about the spatial relationship among local patterns, were discriminating. But we could not ask nearly enough of them and the best classification rate was just above 90 %. The NIST dataset had about 6,000 training images per digit, but as anybody knows who has induced a decision tree from data, the sample size goes down fast and estimation errors take over. So we could not ask more than say ten questions along any branch. We tried the recipes for pruning in [45] and elsewhere, and other ways to mitigate over-fitting, but no amount of fiddling made much of a difference.

Although building multiple trees was the “obvious” thing to do, it hadn’t occurred to us until then. The first effort, which involved making ten trees by dividing the pool of questions among the trees, worked splendidly and lifted us well above 95 %. We sensed a breakthrough. Subsequently, we explored different randomization procedures and rather quickly settled on generating a small random sample of the question pool at every node of every tree, eventually exceeding 99 % accuracy with a forest of many trees. The features were binary queries about the relative locations of local features and hence did not require any normalization or preprocessing of the image.

The space of relational graphs is very large and provided a very rich source of linked queries. Each tree probed some aspects of an essentially infinite graph of relationships, with each new question refining the accumulated structure. In this way, growing the trees provided a greedy mechanism to explore this space of representations. We also explored clustering shape space by growing generic trees with unlabeled data [5], as well as initially growing trees with a small number of classes and then reestimating leaf node distributions and possibly deepening the trees with additional data
from new classes. The work on trees for density estimation and manifold learning described in this book develops these ideas in multiple, fascinating directions.

In October 1994 we both went to a workshop on statistics and information theory [169], where we met Leo Breiman. One of us (DG) spoke about “randomized trees” at the workshop; this was the name we used in a technical report [4]. At the workshop, and continuing over dinner, we discussed trees with Leo, who described “bagging” (re-sampling the training set). We were all excited about advancing inductive learning with randomized classifiers. Soon after we became aware of two closely related lines of research. One was boosting, introduced earlier by Freund and Schapire. The other was the work of Ho [165], also published in the mid-1990s, which independently introduced a variation on randomization. In 1999 DG spent time with Leo at another conference [298] discussing the tradeoffs between single-tree accuracy and tree-to-tree correlation as the degree of randomness was varied.

From our perspective the source of success of both boosting and randomization was the ability to create moderately accurate trees that are weakly correlated conditional on the class. Boosting achieves this through the increased weights on the misclassified training samples, and is largely insensitive to the particular re-weighting protocol. Randomization achieves this through sparse sampling from a large space of queries. In fact, both methods effectively sample the trees from some distribution on the space of trees; randomized trees do this explicitly and boosting in the limit cycle of a deterministic tree-generating process (see the discussion [113] of Breiman’s 1997 paper on ARCing). By the way, back in 1994 we had speculated with Leo that bagging might be too stable, i.e. the distribution over trees resulting from mere re-sampling of the data might be too concentrated. With well-separated classes boosting can achieve lower error rates, but there is a risk of over-fitting; as discussed in [3], a combination of boosting and randomization is effective with multiple classes.

Random forests had very strong competition during the nineties from support vector machines. Whereas in practice the results on standard problems are about the same, SVMs had the appeal of a clean optimization framework, well-understood algorithms and an array of theoretical results. More recently, random forests have become popular in computational genomics, probably due to the relative transparency of the decision rules. Random forests have two other important advantages: speed and minimal storage. Nonlinear SVMs require storing large numbers of training samples, and computing large numbers of inner products. In contrast, it is hard to find something more efficient than storing and answering a series of simple questions.

This brings us to recent applications of decision and regression forests. Surely the best known is now the Kinect skeleton tracker; it is impressive how well simple depth-based comparison questions work. The Kinect solution is particularly interesting in that it synthesizes randomization with relational shape queries, the generalized Hough transform and shape context, which has proven very powerful in shape recognition. The main problem with the original formulation of shape context in [24] was the intense computation; exploring the neighborhood of a pixel using randomized trees provides a very efficient alternative.
Also related is the application of regression forests to anatomy detection in 3D computed tomography (CT) images; using a form of generalized Hough transform that is trained with a predictive machine such as random forests opens up many interesting possibilities. The many other applications presented in this book further and convincingly demonstrate that sequential, adaptive learning and testing can provide a unifying framework for many of the major tasks of modern computational learning.

Looking ahead, we hope these efficient and versatile tools will spread into more applications. Many theoretical questions also remain open, such as determining conditions under which randomized forests really provide an advantage, as well as guidelines for selecting parameters such as tree depth, forest size and sampling density. Finally, whereas a decision forest efficiently integrates evidence by asking a great many questions in total, it still does not capture the full discriminating power of “20 questions” because the “line of reasoning” is broken with each new tree. In principle a single tree can develop a long series of linked questions which continue building on each other, progressing systematically from coarse to fine and exploring some aspect of the object under study in depth. For example, in computer vision, one might need 100 questions to accumulate a detailed understanding about a complex event in the scene involving multiple agents and actions. However, this is impossible in practice with current methodology: how would one store such a deep tree even if there was enough time and data to learn it, or a model under which probabilities could be updated and new queries scored? Perhaps the answer is online construction. After all, at run time (e.g. during scene parsing), only one branch of each tree is traversed—the one dictated by the data being processed. Consequently, online learning might be feasible in some cases; a tracking example, which does not scale up to the complex problems treated in this book, was explored in [123]. Going further will require learning how to maintain a probability distribution over states of interest conditional on all the accumulated evidence.

Baltimore, USA
Yali Amit
Donald Geman