Pattern Classification
To C. A. Rosen and C. W. Stork
Preface

Our purposes in writing this Second Edition, more than a quarter century after the original, remain the same: to give a systematic account of the major topics in pattern recognition, based whenever possible on fundamental principles. We believe that this provides the required foundation for solving problems in more specialized application areas such as speech recognition, optical character recognition, signal analysis, and so on. Since 1973, there has been an immense wealth of effort, and in many cases progress, on the topics we addressed in the First Edition. The pace of progress in algorithms for learning and pattern recognition has been exceeded only by the improvements in computer hardware. Some of the outstanding problems acknowledged in the First Edition have been solved, whereas others remain as frustrating as ever. Taken with the manifest usefulness of pattern recognition, this makes the field extremely vigorous and exciting.

While we wrote then that pattern recognition might appear to be a rather specialized topic, it is now abundantly clear that pattern recognition is an immensely broad subject, with applications in fields as diverse as speech recognition, optical character recognition, handwriting and gesture recognition, lipreading, geological analysis, document searching and the recognition of bubble chamber tracks in elementary particle accelerators; it is central to a host of man-machine interface problems, such as pen-based computing. The size of the current volume (which precluded material on scene analysis, included in the First Edition) is a testament to the body of established theory.

Whereas we expect most of our readers will be interested in developing pattern recognition systems, perhaps a few will be active in understanding existing pattern recognition systems, most notably human and animal nervous systems. To address the biological roots of pattern recognition would of course be beyond the scope of this book. Nevertheless, as neurobiologists and psychologists interested in pattern recognition in the natural world continue to rely on more advanced mathematics and theory, they too may profit from the material presented here.

Despite the existence of a number of excellent books that focus on a small set of specific techniques, we feel there is still a strong need for a book such as ours, which takes a somewhat different approach. Rather than focus on a specific technique such as neural networks, we address a specific class of problems, pattern recognition problems, and consider the wealth of different techniques that can be applied to them. Students and practitioners typically have a particular problem and need to know which technique is best suited. Books focusing on neural networks may not explain decision trees, or nearest-neighbor methods, or many other classifiers to the depth required by the pattern recognition practitioner who must decide among the various
techniques. Such books, organized around a technique first and foremost, generally fail to provide the deeper understanding of any single technique that comes from exploring the relationships among various techniques.

These developments demanded a unified presentation in an updated edition of Part I of the original book. We have tried not only to expand but also to improve the text in a number of ways:

New material: The text has been brought up to date with chapters on pattern recognition topics that have, over the past decade or so, proven to be of value: neural networks, stochastic methods, genetic algorithms, theory of learning, to name a few. While the book continues to stress methods that are statistical at root, for completeness we have included material on syntactic methods as well. "Classical" material has been included, such as Hidden Markov Models, model selection, combining classifiers, useful results from the theory of learning, and so forth.

Examples: Throughout the text we have included worked examples, usually containing data and methods simple enough to avoid tedious calculations yet complex enough to illustrate important points. These are meant to impart intuition, clarify the ideas in the text, and help students solve the homework problems. In addition, in response to popular demand, a Solutions Manual has been prepared to help instructors who use this book in a course.

Algorithms: Some pattern recognition or learning techniques are best explained with the help of algorithms, and thus we have included several throughout the book. These are meant for clarification, of course; they provide only the skeleton of structure needed for a full computer program. We assume every reader is familiar with such pseudocode, or can understand it from context here.

Starred Sections: The starred sections (*) are a bit more specialized, and are generally expansions upon other material. Such material is not needed to understand subsequent un-starred sections, and thus can be skipped on a first reading.

Computer Exercises: These are not specific to any language or system, and thus can be done in the language or style the student finds most comfortable.

Problems: New homework problems have been added, organized by the earliest section where the material is covered.

Chapter Summaries: Chapter summaries are included to highlight the most important concepts covered in the rest of the text.

Graphics: We have gone to great lengths to produce a large number of high-quality figures and graphics to illustrate our points. Some of these required extensive calculations, and selection and reselection of parameters to best illustrate the concepts at hand. Study the figures too! Faculty adopting this book for courses can contact the publisher in order to obtain a master set of the figures for making overhead transparencies. These are invaluable for teaching.

Mathematical Appendixes: It comes as no surprise that students do not have the same mathematical background, and for this reason we have included mathematical appendixes on the foundations needed for the book. We have striven to use clear notation throughout, rich enough to cover the key properties, yet
simple enough for easy readability. The list of symbols in the Appendix should help those readers who dip into an isolated section that uses notation developed much earlier.

This book surely contains enough material to fill a two-semester upper-division or graduate course; alternatively, with careful selection of topics, faculty can fashion a one-semester course. A one-semester course could be based on Chapts. ??-??, ?? & ?? (most of the material from the First Edition, augmented by neural networks and machine learning), with or without the material from the starred sections. Because of the explosion in research developments, our historical remarks at the end of most chapters are necessarily cursory, and somewhat idiosyncratic. Our goal has been to stress references important to the reader, rather than to provide a complete historical record or to acknowledge, praise and cite established researchers.

This book could never have been written without the support and assistance of several institutions. First and foremost is of course Ricoh Silicon Valley (DGS and PEH). Its support of such a long-range and broadly educational project as this book, amidst the rough and tumble world of industry and its never-ending need for products and innovation, is proof positive of a wonderful environment and a rare and enlightened leadership. The enthusiastic support of Morio Onoe, who was Director of Research, Ricoh Company Ltd. when we began our writing efforts, is gratefully acknowledged. Likewise, San Jose State University (ROD), Stanford University (Departments of Electrical Engineering, Statistics and Psychology), The University of California, Berkeley Extension, The International Institute of Advanced Scientific Studies, the Niels Bohr Institute and the Santa Fe Institute (DGS) all provided a temporary home during the writing of this book. Our earnest gratitude goes to all.

Deep thanks go to Stanford graduate students Chuck Lam and Chris Overton, who helped immensely on figure preparation, and to Sudeshna Adak, who helped in solving homework problems. Colleagues at Ricoh aided in numerous ways; Kathryn Berkner, Jonathan Hull and Greg Wolff deserve special thanks, as does research librarian Rowan Fairgrove, who efficiently found obscure references, including the first names of a few authors. The book has been used in manuscript form in several courses at Stanford University, and the feedback from students has been invaluable.

Numerous faculty and scientific colleagues have sent us many suggestions and caught many errors. The following such commentators warrant special mention: David Cooper, Gary Ford, Dennis Kibler, Benny Lautrup, Nick Littlestone, Art Owen, Isabelle Guyon, Rosalind Picard and David Wolpert. Specialist reviewers, Alex Pentland (1), Giovanni Parmigiani (2), Peter Cheeseman (3), Godfried Toussaint (4), Padhraic Smyth (5), Yann Le Cun (6), Emile Aarts (7), Horst Bunke (8), Tom Dietterich (9), Anil Jain (10) and Rao Vemuri (Appendix), focused on single chapters; their perceptive comments were often enlightening and improved the text in numerous ways. (Nevertheless, we are responsible for any errors that remain.) George Telecki, our editor, gave the needed encouragement and support, and refrained from complaining as one manuscript deadline after another passed. He, and indeed all the folk at Wiley, were quite helpful and professional.

Finally, deep thanks go to Nancy, Alex and Olivia Stork, for saintly understanding and patience.

Portola Valley, California
Richard O. Duda
Peter E. Hart
David G. Stork
Contents

1 Introduction
  1.1 Machine Perception
  1.2 An Example
      1.2.1 Related fields
  1.3 The Sub-problems of Pattern Classification
      1.3.1 Feature Extraction
      1.3.2 Noise
      1.3.3 Overfitting
      1.3.4 Model Selection
      1.3.5 Prior Knowledge
      1.3.6 Missing Features
      1.3.7 Mereology
      1.3.8 Segmentation
      1.3.9 Context
      1.3.10 Invariances
      1.3.11 Evidence Pooling
      1.3.12 Costs and Risks
      1.3.13 Computational Complexity
  1.4 Learning and Adaptation
      1.4.1 Supervised Learning
      1.4.2 Unsupervised Learning
      1.4.3 Reinforcement Learning
  1.5 Conclusion
  Summary by Chapters
  Bibliographical and Historical Remarks
  Bibliography
  Index
Chapter 1

Introduction

The ease with which we recognize a face, understand spoken words, read handwritten characters, identify our car keys in our pocket by feel, and decide whether an apple is ripe by its smell belies the astoundingly complex processes that underlie these acts of pattern recognition. Pattern recognition, the act of taking in raw data and taking an action based on the "category" of the pattern, has been crucial for our survival, and over the past tens of millions of years we have evolved highly sophisticated neural and cognitive systems for such tasks.

1.1 Machine Perception

It is natural that we should seek to design and build machines that can recognize patterns. From automated speech recognition, fingerprint identification, optical character recognition, DNA sequence identification and much more, it is clear that reliable, accurate pattern recognition by machine would be immensely useful. Moreover, in solving the myriad problems required to build such systems, we gain deeper understanding and appreciation for pattern recognition systems in the natural world, most particularly in humans. For some applications, such as speech and visual recognition, our design efforts may in fact be influenced by knowledge of how these problems are solved in nature, both in the algorithms we employ and in the design of special-purpose hardware.

1.2 An Example

To illustrate the complexity of some of the types of problems involved, let us consider the following imaginary and somewhat fanciful example. Suppose that a fish packing plant wants to automate the process of sorting incoming fish on a conveyor belt according to species. As a pilot project it is decided to try to separate sea bass from salmon using optical sensing. We set up a camera, take some sample images and begin to note some physical differences between the two types of fish, such as length, lightness, width, number and shape of fins, and position of the mouth, and these suggest features to explore for use in our classifier. We also notice noise or variations in the images.
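To make the flavor of this example concrete, here is a minimal sketch, not from the book, of the kind of decision rule such features lead toward: a classifier that assigns a species label by thresholding a single measured feature. The function name, the threshold value and the sample lengths below are all invented for illustration; the chapter goes on to develop principled ways to choose features and decision rules.

```python
# A minimal sketch (assumed, not from the book): classify a fish as sea bass
# or salmon by thresholding one measured feature, its length. The threshold
# and the sample lengths are hypothetical values chosen for illustration.

def classify_by_length(length_cm: float, threshold_cm: float = 55.0) -> str:
    """Assign a species label from a single feature, assuming (hypothetically)
    that sea bass tend to be longer than salmon."""
    return "sea bass" if length_cm > threshold_cm else "salmon"

if __name__ == "__main__":
    # Hypothetical lengths measured from camera images of fish on the conveyor.
    for length_cm in (42.0, 53.5, 58.0, 71.2):
        print(f"{length_cm:5.1f} cm -> {classify_by_length(length_cm)}")
```

The noise and variation just noted are exactly why such a single fixed threshold is fragile: measured lengths of the two species overlap, so any choice of threshold misclassifies some fish, a tension the chapter returns to repeatedly.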