
Random Forests and Ferns (An Explanation of Random Trees and Random Ferns)

Random Forests and Ferns
David Capel
The Multi-class Classification Problem (Fergus, Zisserman and Perona)

Training: labelled exemplars representing multiple classes, e.g. handwritten digits, planes, faces, cars, cats.

Classifying: to which class does this new example belong?

[Figure 1. Some sample images from the datasets. Note the large variation in scale in, for example, the cars (rear) database. The datasets are from the Caltech vision archive (http://www.vision.caltech.edu/html-files/archive.html) and the Oxford VGG data page (http://www.robots.ox.ac.uk/~vgg/data/), except for Cars (Side) from the UIUC Cognitive Computation group and Spotted Cats from the Corel Image library.]
The Multi-class Classification Problem

• A classifier is a mapping H from feature vectors f to discrete class labels C:
  f = (f1, f2, ..., fN),   C ∈ {c1, c2, ..., cK},   H : f → C
• Numerous choices of feature space F are possible, e.g. raw pixel values, texton histograms, color histograms, oriented filter banks.
The Multi-class Classification Problem

• We have a (large) database of labelled exemplars:
  Dm = (f^m, C^m) for m = 1...M
• Problem: given such training data, learn the mapping H.
• Even better: learn the posterior distribution over the class label conditioned on the features:
  P(C = ck | f1, f2, ..., fN)
  ...and obtain the classifier H as the mode of the posterior:
  H(f) = argmax_k P(C = ck | f1, f2, ..., fN)
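As a concrete illustration of the argmax rule above, here is a minimal Python sketch (not from the slides); the `posterior` callable and the class labels are hypothetical stand-ins for whatever estimate of P(C = ck | f) a learner provides.

```python
import numpy as np

# Minimal sketch: classify by taking the mode of an estimated class posterior,
# H(f) = argmax_k P(C = c_k | f). `posterior` is a hypothetical callable that
# returns a length-K vector of class probabilities for a feature vector f.
def classify(f, posterior, class_labels):
    p = posterior(f)                         # shape (K,), non-negative, sums to 1
    return class_labels[int(np.argmax(p))]   # label of the most probable class

# Toy usage with a dummy posterior over K = 3 classes.
labels = ["plane", "face", "car"]
dummy_posterior = lambda f: np.array([0.2, 0.1, 0.7])
print(classify(np.zeros(4), dummy_posterior, labels))  # -> "car"
```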
How do we represent and learn the mapping?

In Bishop's book, we saw numerous ways to build multi-class classifiers, e.g.
• Direct, non-parametric learning of the class posterior (histograms)
• K-Nearest Neighbours
• Fisher Linear Discriminants
• Relevance Vector Machines
• Multi-class SVMs

[Figure: a two-class example (Class A, Class B) in a 2-D feature space with f1 = edge pixel intensity and f2 = center pixel intensity; panels (c) direct, non-parametric and (d) k-Nearest Neighbours, alongside an ROC axis (1 - Specificity).]
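To make one of the listed options concrete, here is a minimal k-Nearest Neighbours sketch in Python; the array names, the Euclidean metric, and the toy data are our own assumptions rather than anything specified on the slide.

```python
import numpy as np

# Minimal k-NN sketch, assuming the training exemplars D_m = (f^m, C^m) are
# stored as arrays `train_f` (M x N features) and `train_c` (M labels).
def knn_classify(f, train_f, train_c, k=5):
    dists = np.linalg.norm(train_f - f, axis=1)   # distance to every exemplar
    nearest = np.argsort(dists)[:k]               # indices of the k closest
    classes, counts = np.unique(train_c[nearest], return_counts=True)
    return classes[np.argmax(counts)]             # majority vote among the k

# Toy usage with two well-separated 2-D classes.
rng = np.random.default_rng(0)
train_f = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
train_c = np.array([0] * 20 + [1] * 20)
print(knn_classify(np.array([3.5, 4.0]), train_f, train_c))  # -> 1
```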
Binary Decision Trees

• Decision trees classify features by a series of Yes/No questions.
• At each node, the feature space is split according to the outcome of some binary decision criterion.
• Each leaf is labelled with the class C assigned to features that reach it via that path through the tree.

Tree on the slide:
B1(f) = True?
  No:  B2(f) = True?   No: c1   Yes: c2
  Yes: B3(f) = True?   No: c1   Yes: c1

A feature vector f is classified by dropping it down the tree, answering each node's test in turn until a leaf is reached (a code sketch follows below).
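As referenced above, a minimal Python sketch of classifying with this particular tree; the node tests B1, B2, B3 are hypothetical binary predicates of our own, and the leaf labels follow the slide.

```python
# Minimal sketch of the tree drawn on the slide: root test B1, then B2 on the
# "No" branch or B3 on the "Yes" branch, with leaf labels c1, c2, c1, c1.
def classify_with_tree(f, B1, B2, B3):
    if B1(f):                            # "Yes" branch of the root: test B3
        return "c1" if B3(f) else "c1"   # both leaves under B3 are labelled c1
    else:                                # "No" branch of the root: test B2
        return "c2" if B2(f) else "c1"

# Toy usage with simple threshold tests (our own choice, not the slides').
f = (0.8, 0.3)
B1 = lambda v: v[0] > 0.5
B2 = lambda v: v[1] > 0.5
B3 = lambda v: v[1] > 0.2
print(classify_with_tree(f, B1, B2, B3))  # -> "c1"
```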