
Nonlinear Dimensionality Reduction

Notations
Acronyms
High-Dimensional Data
Practical motivations
Fields of application
The goals to be reached
Theoretical motivations
How can we visualize high-dimensional spaces?
Curse of dimensionality and empty space phenomenon
Some directions to be explored
Relevance of the variables
Dependencies between the variables
About topology, spaces, and manifolds
Two benchmark manifolds
Overview of the next chapters
Characteristics of an Analysis Method
Purpose
Expected functionalities
Estimation of the number of latent variables
Embedding for dimensionality reduction
Embedding for latent variable separation
Internal characteristics
Underlying model
Algorithm
Criterion
Example: Principal component analysis
Data model of PCA
Criteria leading to PCA
Functionalities of PCA
Algorithms
Examples and limitations of PCA
Toward a categorization of DR methods
Hard vs. soft dimensionality reduction
Traditional vs. generative model
Linear vs. nonlinear model
Continuous vs. discrete model
Implicit vs. explicit mapping
Integrated vs. external estimation of the dimensionality
Layered vs. standalone embeddings
Single vs. multiple coordinate systems
Optional vs. mandatory vector quantization
Batch vs. online algorithm
Exact vs. approximate optimization
The type of criterion to be optimized
Estimation of the Intrinsic Dimension
Definition of the intrinsic dimension
Fractal dimensions
The q-dimension
Capacity dimension
Information dimension
Correlation dimension
Some inequalities
Practical estimation
Other dimension estimators
Local methods
Trial and error
Comparisons
Data Sets
PCA estimator
Correlation dimension
Local PCA estimator
Trial and error
Concluding remarks
Distance Preservation
State-of-the-art
Spatial distances
Metric space, distances, norms and scalar product
Multidimensional scaling
Sammon's nonlinear mapping
Curvilinear component analysis
Graph distances
Geodesic distance and graph distance
Isomap
Geodesic NLM
Curvilinear distance analysis
Other distances
Kernel PCA
Semidefinite embedding
Topology Preservation
State of the art
Predefined lattice
Self-Organizing Maps
Generative Topographic Mapping
Data-driven lattice
Locally linear embedding
Laplacian eigenmaps
Isotop
Method comparisons
Toy examples
The Swiss roll
Manifolds having essential loops or spheres
Cortex unfolding
Image processing
Artificial faces
Real faces
Conclusions
Summary of the book
The problem
A basic solution
Dimensionality reduction
Latent variable separation
Intrinsic dimensionality estimation
Data flow
Variable Selection
Calibration
Linear dimensionality reduction
Nonlinear dimensionality reduction
Latent variable separation
Further processing
Model complexity
Taxonomy
Distance preservation
Topology preservation
Spectral methods
Nonspectral methods
Tentative methodology
Perspectives
Matrix Calculus
Singular value decomposition
Eigenvalue decomposition
Square root of a square matrix
Gaussian Variables
One-dimensional Gaussian distribution
Multidimensional Gaussian distribution
Uncorrelated Gaussian variables
Isotropic multivariate Gaussian distribution
Linearly mixed Gaussian variables
Optimization
Newton's method
Finding extrema
Multivariate version
Gradient ascent/descent
Stochastic gradient descent
Vector quantization
Classical techniques
Competitive learning
Taxonomy
Initialization and "dead units"
Graph Building
Without vector quantization
K-rule
ε-rule
τ-rule
With vector quantization
Data rule
Histogram rule
Implementation Issues
Dimension estimation
Capacity dimension
Correlation dimension
Computation of the closest point(s)
Graph distances
References
Index
Information Science and Statistics
Series Editors: M. Jordan, J. Kleinberg, B. Schölkopf
Information Science and Statistics
Akaike and Kitagawa: The Practice of Time Series Analysis.
Bishop: Pattern Recognition and Machine Learning.
Cowell, Dawid, Lauritzen, and Spiegelhalter: Probabilistic Networks and Expert Systems.
Doucet, de Freitas, and Gordon: Sequential Monte Carlo Methods in Practice.
Fine: Feedforward Neural Network Methodology.
Hawkins and Olwell: Cumulative Sum Charts and Charting for Quality Improvement.
Jensen and Nielsen: Bayesian Networks and Decision Graphs, Second Edition.
Lee and Verleysen: Nonlinear Dimensionality Reduction.
Marchette: Computer Intrusion Detection and Network Monitoring: A Statistical Viewpoint.
Rissanen: Information and Complexity in Statistical Modeling.
Rubinstein and Kroese: The Cross-Entropy Method: A Unified Approach to Combinatorial Optimization, Monte Carlo Simulation, and Machine Learning.
Studený: Probabilistic Conditional Independence Structures.
Vapnik: The Nature of Statistical Learning Theory, Second Edition.
Wallace: Statistical and Inductive Inference by Minimum Message Length.
John A. Lee · Michel Verleysen
Nonlinear Dimensionality Reduction
John Lee
Molecular Imaging and Experimental Radiotherapy
Université catholique de Louvain
Avenue Hippocrate 54/69
B-1200 Bruxelles, Belgium
john.lee@uclouvain.be

Michel Verleysen
Machine Learning Group – DICE
Université catholique de Louvain
Place du Levant 3
B-1348 Louvain-la-Neuve, Belgium
michel.verleysen@uclouvain.be

Series Editors:
Michael Jordan, Division of Computer Science and Department of Statistics, University of California, Berkeley, Berkeley, CA 94720, USA
Jon Kleinberg, Department of Computer Science, Cornell University, Ithaca, NY 14853, USA
Bernhard Schölkopf, Max Planck Institute for Biological Cybernetics, Spemannstrasse 38, 72076 Tübingen, Germany

Library of Congress Control Number: 2006939149
ISBN-13: 978-0-387-39350-6
e-ISBN-13: 978-0-387-39351-3
Printed on acid-free paper.

© 2007 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden. The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

9 8 7 6 5 4 3 2 1
springer.com
To our families
Preface

Methods of dimensionality reduction are innovative and important tools in the fields of data analysis, data mining, and machine learning. They provide a way to understand and visualize the structure of complex data sets. Traditional methods like principal component analysis and classical metric multidimensional scaling suffer from being based on linear models. Until recently, very few methods were able to reduce the data dimensionality in a nonlinear way. However, since the late 1990s, many new methods have been developed and nonlinear dimensionality reduction, also called manifold learning, has become a hot topic. New advances that account for this rapid growth are, for example, the use of graphs to represent the manifold topology, and the use of new metrics like the geodesic distance. In addition, new optimization schemes, based on kernel techniques and spectral decomposition, have led to spectral embedding, which encompasses many of the recently developed methods.

This book describes existing and advanced methods to reduce the dimensionality of numerical databases. For each method, the description starts from intuitive ideas, develops the necessary mathematical details, and ends by outlining the algorithmic implementation. Methods are compared with each other with the help of different illustrative examples.

The purpose of the book is to summarize clear facts and ideas about well-known methods as well as recent developments in the topic of nonlinear dimensionality reduction. With this goal in mind, methods are all described from a unifying point of view, in order to highlight their respective strengths and shortcomings.

The book is primarily intended for statisticians, computer scientists, and data analysts. It is also accessible to other practitioners having a basic background in statistics and/or computational learning, such as psychologists (in psychometry) and economists.

Louvain-la-Neuve, Belgium
October 2006

John A. Lee
Michel Verleysen
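The contrast the preface draws between linear projections and nonlinear, graph-based embeddings can be made concrete with a small experiment on the Swiss roll, one of the benchmark manifolds used throughout the book. The following sketch is not taken from the book; it assumes the scikit-learn library (make_swiss_roll, PCA, Isomap) and NumPy, and all parameter values are illustrative choices rather than recommendations.

```python
# A minimal sketch, assuming scikit-learn and NumPy are installed.
# It contrasts a linear projection (PCA) with a nonlinear embedding
# based on graph/geodesic distances (Isomap).
import numpy as np
from sklearn.datasets import make_swiss_roll
from sklearn.decomposition import PCA
from sklearn.manifold import Isomap

# Sample points from the Swiss roll; t is the latent "unrolling" parameter.
X, t = make_swiss_roll(n_samples=1000, noise=0.05, random_state=0)

# Linear model: project onto the two leading principal axes.
X_pca = PCA(n_components=2).fit_transform(X)

# Nonlinear model: build a K-nearest-neighbor graph, approximate geodesic
# distances along it, and embed them with classical metric MDS (Isomap).
X_iso = Isomap(n_neighbors=10, n_components=2).fit_transform(X)

# Crude check: how strongly each embedding coordinate correlates with the
# latent parameter t; the geodesic embedding should recover it far better,
# because the linear projection collapses the roll onto itself.
for name, Y in [("PCA", X_pca), ("Isomap", X_iso)]:
    corr = max(abs(np.corrcoef(t, Y[:, 0])[0, 1]),
               abs(np.corrcoef(t, Y[:, 1])[0, 1]))
    print(f"{name}: best |corr| with latent parameter = {corr:.2f}")
```

Isomap is used here only as a convenient stand-in for the geodesic-distance methods discussed later in the book; the same comparison could be run with any of the nonlinear methods listed in the contents below.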
Contents

Notations
Acronyms

1 High-Dimensional Data
  1.1 Practical motivations
    1.1.1 Fields of application
    1.1.2 The goals to be reached
  1.2 Theoretical motivations
    1.2.1 How can we visualize high-dimensional spaces?
    1.2.2 Curse of dimensionality and empty space phenomenon
  1.3 Some directions to be explored
    1.3.1 Relevance of the variables
    1.3.2 Dependencies between the variables
  1.4 About topology, spaces, and manifolds
  1.5 Two benchmark manifolds
  1.6 Overview of the next chapters

2 Characteristics of an Analysis Method
  2.1 Purpose
  2.2 Expected functionalities
    2.2.1 Estimation of the number of latent variables
    2.2.2 Embedding for dimensionality reduction
    2.2.3 Embedding for latent variable separation
  2.3 Internal characteristics
    2.3.1 Underlying model
    2.3.2 Algorithm
    2.3.3 Criterion
  2.4 Example: Principal component analysis
    2.4.1 Data model of PCA
    2.4.2 Criteria leading to PCA