“00-FM-SA272” 18/9/2008 page iv
Academic Press is an imprint of Elsevier
30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
84 Theobald’s Road, London WC1X 8RR, UK
This book is printed on acid-free paper. ⬁
Copyright © 2009, Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopy, recording, or any information storage and retrieval system, without
permission in writing from the publisher.
Permissions may be sought directly from Elsevier’s Science & Technology Rights Department in
Oxford, UK: phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: permissions@elsevier.com.
You may also complete your request on-line via the Elsevier homepage (http://elsevier.com), by
selecting “Support & Contact” then “Copyright and Permission” and then “Obtaining Permissions.”
Library of Congress Cataloging-in-Publication Data
Application submitted
British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library.
ISBN: 978-1-59749-272-0
For information on all Academic Press publications
visit our Web site at www.books.elsevier.com
Printed in the United States of America
09 10 11 12 13 14 15 16 5 4 3 2 1
“02-Preface-SA272” 17/9/2008 page xv
Preface
This book is the outgrowth of our teaching advanced undergraduate and graduate
courses over the past 20 years. These courses have been taught to different
audiences, including students in electrical and electronics engineering, computer
engineering, computer science, and informatics, as well as to an interdisciplinary
audience of a graduate course on automation. This experience led us to make
the book as self-contained as possible and to address students with different back-
grounds. As prerequisitive knowledge, the reader requires only basic calculus,
elementary linear algebra, and some probability theory basics. A number of mathe-
matical tools, such as probability and statistics as well as constrained optimization,
needed by various chapters,are treated in four Appendices. The book is designed to
serve as a text for advanced undergraduate and graduate students,and it can be used
for either a one- or a two-semester course. Furthermore,it is intended to be used as a
self-study and reference book for research and for the practicing scientist/engineer.
This latter audience was also our second incentive for writing this book, due to the
involvement of our group in a number of projects related to pattern recognition.
SCOPE AND APPROACH
The goal of the book is to present in a unified way the most widely used tech-
niques and methodologies for pattern recognition tasks. Pattern recognition is
in the center of a number of application areas, including image analysis, speech
and audio recognition, biometrics, bioinformatics, data mining, and information
retrieval. Despite their differences, these areas share, to a large extent, a corpus
of techniques that can be used in extracting, from the available data, information
related to data categories, important “hidden”patterns, and trends. The emphasis in
this book is on the most generic of the methods that are currently available. Hav-
ing acquired the basic knowledge and understanding, the reader can subsequently
move on to more specialized application-dependent techniques, which have been
developed and reported in a vast number of research papers.
Each chapter of the book starts with the basics and moves, progressively, to
more advanced topics’and reviews up-to-date techniques. We have made an effort
to keep a balance between mathematical and descriptive presentation. This is not
always an easy task. However, we strongly believe that in a topic such as pattern
recognition, trying to bypass mathematics deprives the reader of understanding the
essentials behind the methods and also the potential of developing new techniques,
which fit the needs of the problem at hand that he or she has to tackle. In pattern
recognition, the final adoption of an appropriate technique and algorithm is very
much a problem-dependent task. Moreover, according to our experience, teaching
pattern recognition is also a good “excuse” for the students to refresh and solidify
xv
“02-Preface-SA272” 17/9/2008 page xvi
xvi
Preface
some of the mathematical basics they have been taught in earlier years. “Repetitio
est mater studiosum.”
NEW TO THIS EDITION
The new features of the fourth edition include the following.
■ MATLAB codes and computer experiments are given at the end of most
chapters.
■ More examples and a number of new figures have been included to enhance
the readability and pedagogic aspects of the book.
■ New sections on some important topics of high current interest have been
added, including:
• Nonlinear dimensionality reduction
• Nonnegative matrix factorization
• Relevance feedback
• Robust regression
• Semi-supervised learning
• Spectral clustering
• Clustering combination techniques
Also, a number of sections have been rewritten in the context of more recent
applications in mind.
SUPPLEMENTS TO THE TEXT
Demonstrations based on MATLAB are available for download from the book Web
site, www.elsevierdirect.com/9781597492720. Also available are electronic figures
from the text and (for instructors only) a solutions manual for the end-of-chapter
problems and exercises. The interested reader can download detailed proofs,
which in the book necessarily, are sometimes, slightly condensed. PowerPoint
presentations are also available covering all chapters of the book.
Our intention is to update the site regularly with more and/or improved versions
of the MATLAB demonstrations. Suggestions are always welcome. Also at this Web
site a page will be available for typos, which are unavoidable, despite frequent
careful reading. The authors would appreciate readers notifying them about any
typos found.
“02-Preface-SA272” 17/9/2008 page xvii
Preface
xvii
ACKNOWLEDGMENTS
This book would have not been written without the constant support and help
from a number of colleagues and students throughout the years. We are espe-
cially indebted to Kostas Berberidis, Velissaris Gezerlis, Xaris Georgion, Kristina
Georgoulakis, Leyteris Kofidis, Thanassis Liavas, Michalis Mavroforakis, Aggelos
Pikrakis,Thanassis Rontogiannis, Margaritis Sdralis, Kostas Slavakis, and Theodoros
Yiannakoponlos. The constant support provided by Yannis Kopsinis and Kostas
Thernelis from the early stages up to the final stage, with those long nights, has
been invaluable. The book improved a great deal after the careful reading and
the serious comments and suggestions of Alexandros Bölnn. Dionissis Cavouras,
Vassilis Digalakis, Vassilis Drakopoulos, Nikos Galatsanos, George Glentis, Spiros
Hatzispyros, Evagelos Karkaletsis, Elias Koutsoupias, Aristides Likas, Gerassimos
Mileounis, George Monstakides, George Paliouras, Stavros Perantonis, Takis Stam-
atoponlos, Nikos Vassilas, Manolis Zervakis, and Vassilis Zissimopoulos.
The book has greatly gained and improved thanks to the comments of a number
of people who provided feedback on the revision plan and/or comments on revised
chapters:
Tulay Adali, University of Maryland; Mehniet Celenk, Ohio University; Rama Chel-
lappa, University of Maryland; Mark Clements, Georgia Institute of Technology;
Robert Duin, Delft University of Technology; Miguel Figneroa,Villanueva University
of Puerto Rico; Dimitris Gunopoulos, University of Athens; Mathias Kolsch, Naval
Postgraduate School;Adam Krzyzak, Concordia University; Baoxiu Li,Arizona State
University; David Miller, Pennsylvania State University; Bernhard Schölkopf, Max
Planck Institute; Hari Sundaram, Arizona State University; Harry Wechsler, George
Mason University; and Alexander Zien, Max Planck Institute.
We are greatly indebted to these colleagues for their time and their constructive
criticisms. Our collaboration and friendship with Nikos Kalouptsidis have been
a source of constant inspiration for all these years. We are both deeply indebted
to him.
Last but not least, K. Koutroumbas would like to thank Sophia, Dimitris-
Marios, and Valentini-Theodora for their tolerance and support and
S.Theodoridis would like to thank Despina, Eva, and Eleni, his joyful and
supportive “harem.”
“03-Ch01-SA272” 17/9/2008 page 1
Introduction
CHAPTER
1
1.1 IS PATTERN RECOGNITION IMPORTANT?
Pattern recognition is the scientific discipline whose goal is the classification of
objects into a number of categories or classes. Depending on the application, these
objects can be images or signal waveforms or any type of measurements that need
to be classified. We will refer to these objects using the generic term patterns.
Pattern recognition has a long history,but before the 1960s it was mostly the output
of theoretical research in the area of statistics. As with everything else, the advent
of computers increased the demand for practical applications of pattern recogni-
tion, which in turn set new demands for further theoretical developments. As our
society evolves from the industrial to its postindustrial phase, automation in indus-
trial production and the need for information handling and retrieval are becoming
increasingly important. This trend has pushed pattern recognition to the high edge
of today’s engineering applications and research. Pattern recognition is an integral
part of most machine intelligence systems built for decision making.
Machine vision is an area in which pattern recognition is of importance.
A machine vision system captures images via a camera and analyzes them to produce
descriptions of what is imaged. A typical application of a machine vision system is
in the manufacturing industry, either for automated visual inspection or for automa-
tion in the assembly line. For example, in inspection, manufactured objects on a
moving conveyor may pass the inspection station, where the camera stands, and it
has to be ascertained whether there is a defect. Thus, images have to be analyzed
online, and a pattern recognition system has to classify the objects into the “defect”
or“nondefect”class. After that,an action has to be taken,such as to reject the offend-
ing parts. In an assembly line, different objects must be located and “recognized,”
that is, classified in one of a number of classes known a priori. Examples are the
“screwdriver class,” the “German key class,” and so forth in a tools’ manufacturing
unit. Then a robot arm can move the objects in the right place.
Character (letter or number) recognition is another important area of pattern
recognition,with major implications in automation and information handling. Opti-
cal character recognition (OCR) systems are already commercially available and
more or less familiar to all of us. An OCR system has a “front-end” device consisting
of a light source,a scan lens,a document transport,and a detector. At the output of
1
“03-Ch01-SA272” 17/9/2008 page 2
2
CHAPTER 1 Introduction
the light-sensitive detector, light-intensity variation is translated into “numbers” and
an image array is formed. In the sequel, a series of image processing techniques are
applied leading to line and character segmentation. The pattern recognition soft-
ware then takes over to recognize the characters—that is, to classify each character
in the correct “letter, number, punctuation”class. Storing the recognized document
has a twofold advantage over storing its scanned image. First, further electronic
processing, if needed, is easy via a word processor, and second, it is much more
efficient to store ASCII characters than a document image. Besides the printed
character recognition systems, there is a great deal of interest invested in systems
that recognize handwriting. A typical commercial application of such a system is
in the machine reading of bank checks. The machine must be able to recognize
the amounts in figures and digits and match them. Furthermore, it could check
whether the payee corresponds to the account to be credited. Even if only half of
the checks are manipulated correctly by such a machine, much labor can be saved
from a tedious job. Another application is in automatic mail-sorting machines for
postal code identification in post offices. Online handwriting recognition systems
are another area of great commercial interest. Such systems will accompany pen
computers, with which the entry of data will be done not via the keyboard but by
writing. This complies with today’s tendency to develop machines and computers
with interfaces acquiring human-like skills.
Computer-aided diagnosis is another important application of pattern recogni-
tion, aiming at assisting doctors in making diagnostic decisions. The final diagnosis
is, of course, made by the doctor. Computer-assisted diagnosis has been applied to
and is of interest for a variety of medical data,such as X-rays,computed tomographic
images, ultrasound images, electrocardiograms (ECGs), and electroencephalograms
(EEGs). The need for a computer-aided diagnosis stems from the fact that medi-
cal data are often not easily interpretable, and the interpretation can depend very
much on the skill of the doctor. Let us take for example X-ray mammography
for the detection of breast cancer. Although mammography is currently the best
method for detecting breast cancer, 10 to 30% of women who have the disease and
undergo mammography have negative mammograms. In approximately two thirds
of these cases with false results the radiologist failed to detect the cancer, which
was evident retrospectively. This may be due to poor image quality, eye fatigue
of the radiologist, or the subtle nature of the findings. The percentage of correct
classifications improves at a second reading by another radiologist. Thus, one can
aim to develop a pattern recognition system in order to assist radiologists with a
“second” opinion. Increasing confidence in the diagnosis based on mammograms
would, in turn, decrease the number of patients with suspected breast cancer who
have to undergo surgical breast biopsy, with its associated complications.
Speech recognition is another area in which a great deal of research and devel-
opment effort has been invested. Speech is the most natural means by which
humans communicate and exchange information. Thus, the goal of building intelli-
gent machines that recognize spoken information has been a long-standing one for
scientists and engineers as well as science fiction writers. Potential applications of
such machines are numerous. They can be used, for example, to improve efficiency
“03-Ch01-SA272” 17/9/2008 page 3
1.1 Is Pattern Recognition Important?
3
in a manufacturing environment, to control machines in hazardous environments
remotely, and to help handicapped people to control machines by talking to them.
A major effort, which has already had considerable success, is to enter data into
a computer via a microphone. Software, built around a pattern (spoken sounds
in this case) recognition system, recognizes the spoken text and translates it into
ASCII characters, which are shown on the screen and can be stored in the memory.
Entering information by “talking” to a computer is twice as fast as entry by a skilled
typist. Furthermore, this can enhance our ability to communicate with deaf and
dumb people.
Data mining and knowledge discovery in databases is another key application
area of pattern recognition. Data mining is of intense interest in a wide range of
applications such as medicine and biology, market and financial analysis, business
management, science exploration, image and music retrieval. Its popularity stems
from the fact that in the age of information and knowledge society there is an ever
increasing demand for retrieving information and turning it into knowledge. More-
over,this information exists in huge amounts of data in various forms including,text,
images, audio and video, stored in different places distributed all over the world.
The traditional way of searching information in databases was the description-based
model where object retrieval was based on keyword description and subsequent
word matching. However, this type of searching presupposes that a manual anno-
tation of the stored information has previously been performed by a human. This
is a very time-consuming job and, although feasible when the size of the stored
information is limited, it is not possible when the amount of the available informa-
tion becomes large. Moreover, the task of manual annotation becomes problematic
when the stored information is widely distributed and shared by a heterogeneous
“mixture”of sites and users. Content-based retrieval systems are becoming more and
more popular where information is sought based on “similarity”between an object,
which is presented into the system, and objects stored in sites all over the world.
In a content-based image retrieval CBIR (system) an image is presented to an input
device (e.g.,scanner). The system returns“similar”images based on a measured“sig-
nature,” which can encode, for example, information related to color, texture and
shape. In a music content-based retrieval system, an example (i.e., an extract from
a music piece), is presented to a microphone input device and the system returns
“similar” music pieces.
In this case, similarity is based on certain (automatically)
measured cues that characterize a music piece, such as the music meter, the music
tempo, and the location of certain repeated patterns.
Mining for biomedical and DNA data analysis has enjoyed an explosive growth
since the mid-1990s. All DNA sequences comprise four basic building elements;
the nucleotides: adenine (A), cytosine (C), guanine (G) and thymine (T). Like the
letters in our alphabets and the seven notes in music, these four nucleotides are
combined to form long sequences in a twisted ladder form. Genes consist of,usually,
hundreds of nucleotides arranged in a particular order. Specific gene-sequence
patterns are related to particular diseases and play an important role in medicine.
To this end, pattern recognition is a key area that offers a wealth of developed tools
for similarity search and comparison between DNA sequences. Such comparisons
“03-Ch01-SA272” 17/9/2008 page 4
4
CHAPTER 1 Introduction
between healthy and diseased tissues are very important in medicine to identify
critical differences between these two classes.
The foregoing are only five examples from a much larger number of possible
applications. Typically, we refer to fingerprint identification, signature authentica-
tion, text retrieval, and face and gesture recognition. The last applications have
recently attracted much research interest and investment in an attempt to facil-
itate human–machine interaction and further enhance the role of computers in
office automation, automatic personalization of environments, and so forth. Just to
provoke imagination, it is worth pointing out that the MPEG-7 standard includes
a provision for content-based video information retrieval from digital libraries of
the type: search and find all video scenes in a digital library showing person “X”
laughing. Of course, to achieve the final goals in all of these applications, pattern
recognition is closely linked with other scientific disciplines, such as linguistics,
computer graphics, machine vision, and database design.
Having aroused the reader’s curiosity about pattern recognition, we will next
sketch the basic philosophy and methodological directions in which the various
pattern recognition approaches have evolved and developed.
1.2 FEATURES, FEATURE VECTORS, AND CLASSIFIERS
Let us first simulate a simplified case “mimicking” a medical image classification
task. Figure 1.1 shows two images, each having a distinct region inside it. The
two regions are also themselves visually different. We could say that the region of
Figure 1.1a results from a benign lesion, class A, and that of Figure 1.1b from a
malignant one (cancer), class B. We will further assume that these are not the only
patterns (images) that are available to us, but we have access to an image database
(a)
(b)
FIGURE 1.1
Examples of image regions corresponding to (a) class A and (b) class B.