Kernel Methods for Pattern Analysis..pdf

发布时间：2022-06-01 发布人：admin 分类：说明书资料大小：3.02M 资料格式：pdf 举报版权申诉

baidu_22087895-8047567-4744300845194742216.pdf-第1页.png

第1页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第2页.png

第2页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第3页.png

第3页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第4页.png

第4页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第5页.png

第5页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第6页.png

第6页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第7页.png

第7页 / 共478页

baidu_22087895-8047567-4744300845194742216.pdf-第8页.png

第8页 / 共478页

Cover

Half-title

Title

Contents

Code fragments

Preface

Part I Basic concepts

1 Pattern analysis

1.1 Patterns in data

1.1.1 Data

1.1.2 Patterns

1.2 Pattern analysis algorithms

1.2.1 Statistical stability of patterns

1.2.2 Detecting patterns by recoding

1.3 Exploiting patterns

1.3.1 The overall strategy

1.3.2 Common pattern analysis tasks

1.4 Summary

1.5 Further reading and advanced topics

2 Kernel methods: an overview

2.1 The overall picture

2.2 Linear regression in a feature space

2.2.1 Primal linear regression

2.2.2 Ridge regression: primal and dual

2.2.3 Kernel-defined nonlinear feature mappings

2.3 Other examples

2.3.1 Algorithms

2.3.2 Kernels

2.4 The modularity of kernel methods

2.5 Roadmap of the book

2.6 Summary

2.7 Further reading and advanced topics

3 Properties of kernels

3.1 Inner products and positive semi-definite matrices

3.1.1 Hilbert spaces

3.1.2 Gram matrix

3.2 Characterisation of kernels

3.3 The kernel matrix

3.4 Kernel construction

3.4.1 Operations on kernel functions

3.4.2 Operations on kernel matrices

3.5 Summary

3.6 Further reading and advanced topics

4 Detecting stable patterns

4.1 Concentration inequalities

4.2 Capacity and regularisation: Rademacher theory

4.3 Pattern stability for kernel-based classes

4.4 A pragmatic approach

4.5 Summary

4.6 Further reading and advanced topics

Part II Pattern analysis algorithms

5 Elementary algorithms in feature space

5.1 Means and distances

5.1.1 A simple algorithm for novelty-detection

5.1.2 A simple algorithm for classification

5.2 Computing projections: Gram–Schmidt, QR and Cholesky

5.3 Measuring the spread of the data

5.4 Fisher discriminant analysis I

5.5 Summary

5.6 Further reading and advanced topics

6 Pattern analysis using eigen-decompositions

6.1 Singular value decomposition

6.2 Principal components analysis

6.2.1 Kernel principal components analysis

6.2.2 Stability of principal components analysis

6.3 Directions of maximum covariance

6.4 The generalised eigenvector problem

6.5 Canonical correlation analysis

6.6 Fisher discriminant analysis II

6.7 Methods for linear regression

6.7.1 Partial least squares

6.7.2 Kernel partial least squares

6.8 Summary

6.9 Further reading and advanced topics

7 Pattern analysis using convex optimisation

7.1 The smallest enclosing hypersphere

7.1.1 The smallest hypersphere containing a set of points

7.1.2 Stability of novelty-detection

7.1.3 Hyperspheres containing most of the points

7.2 Support vector machines for classification

7.2.1 The maximal margin classifier

7.2.2 Soft margin classifiers

7.3 Support vector machines for regression

7.3.1 Stability of regression

7.3.2 Ridge regression

7.3.3 ε-insensitive regression

7.4 On-line classification and regression

7.5 Summary

7.6 Further reading and advanced topics

8 Ranking, clustering and data visualisation

8.1 Discovering rank relations

8.1.1 Batch ranking

8.1.2 On-line ranking

8.2 Discovering cluster structure in a feature space

8.2.1 Measuring cluster quality

8.2.2 Greedy solution: k-means

8.2.3 Relaxed solution: spectral methods

8.3 Data visualisation

8.4 Summary

8.5 Further reading and advanced topics

Part III Constructing kernels

9 Basic kernels and kernel types

9.1 Kernels in closed form

9.2 ANOVA kernels

9.3 Kernels from graphs

9.4 Diffusion kernels on graph nodes

9.5 Kernels on sets

9.6 Kernels on real numbers

9.7 Randomised kernels

9.8 Other kernel types

9.8.1 Kernels from successive embeddings

9.8.2 Kernels over general structures

9.8.3 Kernels from generative information

9.9 Summary

9.10 Further reading and advanced topics

10 Kernels for text

10.1 From bag of words to semantic space

10.1.1 Representing text

10.1.2 Semantic issues

10.2 Vector space kernels

10.2.1 Designing semantic kernels

10.2.2 Designing the proximity matrix

10.3 Summary

10.4 Further reading and advanced topics

11 Kernels for structured data: strings, trees, etc.

11.1 Comparing strings and sequences

11.2 Spectrum kernels

11.3 All-subsequences kernels

11.4 Fixed length subsequences kernels

11.5 Gap-weighted subsequences kernels

11.5.1 Naive implementation

11.5.2 Efficient implementation

11.5.3 Variations on the theme

11.6 Beyond dynamic programming: trie-based kernels

11.6.1 Trie computation of the p-spectrum kernels

11.6.2 Trie-based mismatch kernels

11.6.3 Trie-based restricted gap-weighted kernels

11.7 Kernels for structured data

11.7.1 Comparing trees

11.7.2 Structured data: a framework

11.8 Summary

11.9 Further reading and advanced topics

12 Kernels from generative models

12.1 P-kernels

12.1.1 Conditional-independence (CI) and marginalisation

12.1.2 Representing multivariate distributions

12.1.3 Fixed length strings generated by a hidden binomial model

12.1.4 Fixed length strings generated by a hidden markov model

12.1.5 Pair hidden Markov model kernels

12.1.6 Hidden tree model kernels

12.2 Fisher kernels

12.2.1 From probability to geometry

12.2.2 Fisher kernels for hidden Markov models

12.3 Summary

12.4 Further reading and advanced topics

Appendix A Proofs omitted from the main text

A.1 Proof of McDiarmid’s theorem

A.2 Stability of principal components analysis

A.3 Proofs of diffusion kernels

Appendix B Notational conventions

B.1 List of symbols

B.2 Notation for Tables

Appendix C List of pattern analysis methods

C.1 Pattern analysis computations

C.2 Pattern analysis algorithms

Appendix D List of kernels

D.1 Kernel definitions and computations

D.2 Kernel algorithms

References

Index

This page intentionally left blank

Kernel Methods for Pattern Analysis Pattern Analysis is the process of ﬁnding general relations in a set of data, and forms the core of many disciplines, from neural networks to so-called syn- tactical pattern recognition, from statistical pattern recognition to machine learning and data mining. Applications of pattern analysis range from bioin- formatics to document retrieval. The kernel methodology described here provides a powerful and uniﬁed framework for all of these disciplines, motivating algorithms that can act on general types of data (e.g. strings, vectors, text, etc.) and look for general types of relations (e.g. rankings, classiﬁcations, regressions, clusters, etc.). This book fulﬁls two major roles. Firstly it provides practitioners with a large toolkit of algorithms, kernels and solutions ready to be implemented, many given as Matlab code suitable for many pattern analysis tasks in ﬁelds such as bioinformatics, text analysis, and image analysis. Secondly it furnishes students and researchers with an easy introduction to the rapidly expanding ﬁeld of kernel-based pattern analysis, demonstrating with examples how to handcraft an algorithm or a kernel for a new speciﬁc application, while covering the required conceptual and mathematical tools necessary to do so. The book is in three parts. The ﬁrst provides the conceptual foundations of the ﬁeld, both by giving an extended introductory example and by cov- ering the main theoretical underpinnings of the approach. The second part contains a number of kernel-based algorithms, from the simplest to sophis- ticated systems such as kernel partial least squares, canonical correlation analysis, support vector machines, principal components analysis, etc. The ﬁnal part describes a number of kernel functions, from basic examples to advanced recursive kernels, kernels derived from generative models such as HMMs and string matching kernels based on dynamic programming, as well as special kernels designed to handle text documents. All those involved in pattern recognition, machine learning, neural net- works and their applications, from computational biology to text analysis will welcome this account.

Kernel Methods for Pattern Analysis John Shawe-Taylor University of Southampton Nello Cristianini University of California at Davis

cambridge university press Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo Cambridge University Press The Edinburgh Building, Cambridge cb2 2ru, UK Published in the United States of America by Cambridge University Press, New York www.cambridge.org Information on this title: www.cambridge.org/9780521813976 © Cambridge University Press 2004 This publication is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press. First published in print format 2004 isbn-13 978-0-511-21060-0 isbn-10 0-511-21237-2 eBook (EBL) eBook (EBL) isbn-13 978-0-521-81397-6 isbn-10 0-521-81397-2 hardback hardback Cambridge University Press has no responsibility for the persistence or accuracy of urls for external or third-party internet websites referred to in this publication, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

Contents List of code fragments Preface Part I Basic concepts Pattern analysis Summary Kernel methods: an overview Linear regression in a feature space 1 1.1 Patterns in data 1.2 Pattern analysis algorithms 1.3 Exploiting patterns 1.4 1.5 Further reading and advanced topics 2 2.1 The overall picture 2.2 2.3 Other examples 2.4 The modularity of kernel methods 2.5 Roadmap of the book 2.6 2.7 Further reading and advanced topics 3 3.1 3.2 Characterisation of kernels 3.3 The kernel matrix 3.4 Kernel construction 3.5 3.6 Further reading and advanced topics 4 4.1 Concentration inequalities 4.2 Capacity and regularisation: Rademacher theory Properties of kernels Inner products and positive semi-deﬁnite matrices Detecting stable patterns Summary Summary v page viii xi 1 3 4 12 17 22 23 25 26 27 36 42 43 44 45 47 48 60 68 74 82 82 85 86 93

vi Contents 4.3 Pattern stability for kernel-based classes 4.4 A pragmatic approach 4.5 4.6 Further reading and advanced topics Summary Part II Pattern analysis algorithms Elementary algorithms in feature space Summary 5 5.1 Means and distances 5.2 Computing projections: Gram–Schmidt, QR and Cholesky 5.3 Measuring the spread of the data 5.4 Fisher discriminant analysis I 5.5 5.6 Further reading and advanced topics Pattern analysis using eigen-decompositions 6 6.1 Singular value decomposition 6.2 Principal components analysis 6.3 Directions of maximum covariance 6.4 The generalised eigenvector problem 6.5 Canonical correlation analysis 6.6 Fisher discriminant analysis II 6.7 Methods for linear regression 6.8 6.9 Further reading and advanced topics 7 7.1 The smallest enclosing hypersphere 7.2 7.3 7.4 On-line classiﬁcation and regression 7.5 7.6 Further reading and advanced topics 8 8.1 Discovering rank relations 8.2 Discovering cluster structure in a feature space 8.3 Data visualisation 8.4 8.5 Further reading and advanced topics Support vector machines for classiﬁcation Support vector machines for regression Pattern analysis using convex optimisation Ranking, clustering and data visualisation Summary Summary Summary Part III Constructing kernels Basic kernels and kernel types 9 9.1 Kernels in closed form 97 104 105 106 109 111 112 122 128 132 137 138 140 141 143 155 161 164 176 176 192 193 195 196 211 230 241 249 250 252 253 264 280 286 286 289 291 292

分享到：

赞收藏

资料库

Kernel Methods for Pattern Analysis..pdf

相关推荐

行业

热门标签

最新资料