Face Recognition: A Literature Survey
W. ZHAO
Sarnoff Corporation
R. CHELLAPPA
University of Maryland
P. J. PHILLIPS
National Institute of Standards and Technology
AND
A. ROSENFELD
University of Maryland
As one of the most successful applications of image analysis and understanding, face
recognition has recently received significant attention, especially during the past
several years. At least two reasons account for this trend: the first is the wide range of
commercial and law enforcement applications, and the second is the availability of
feasible technologies after 30 years of research. Even though current machine
recognition systems have reached a certain level of maturity, their success is limited by
the conditions imposed by many real applications. For example, recognition of face
images acquired in an outdoor environment with changes in illumination and/or pose
remains a largely unsolved problem. In other words, current systems are still far away
from the capability of the human perception system.
This paper provides an up-to-date critical survey of still- and video-based face
recognition research. There are two underlying motivations for us to write this survey
paper: the first is to provide an up-to-date review of the existing literature, and the
second is to offer some insights into the studies of machine recognition of faces. To
provide a comprehensive survey, we not only categorize existing recognition techniques
but also present detailed descriptions of representative methods within each category.
In addition, relevant topics such as psychophysical studies, system evaluation, and
issues of illumination and pose variation are covered.
Categories and Subject Descriptors: I.5.4 [Pattern Recognition]: Applications
General Terms: Algorithms
Additional Key Words and Phrases: Face recognition, person identification
An earlier version of this paper appeared as “Face Recognition: A Literature Survey,” Technical Report CAR-
TR-948, Center for Automation Research, University of Maryland, College Park, MD, 2000.
Authors’ addresses: W. Zhao, Vision Technologies Lab, Sarnoff Corporation, Princeton, NJ 08543-5300;
email: wzhao@sarnoff.com; R. Chellappa and A. Rosenfeld, Center for Automation Research, University of
Maryland, College Park, MD 20742-3275; email: {rama,ar}@cfar.umd.edu; P. J. Phillips, National Institute
of Standards and Technology, Gaithersburg, MD 20899; email: jonathon@nist.gov.
Permission to make digital/hard copy of part or all of this work for personal or classroom use is granted with-
out fee provided that the copies are not made or distributed for profit or commercial advantage, the copyright
notice, the title of the publication, and its date appear, and notice is given that copying is by permission of
ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific
permission and/or a fee.
© 2003 ACM 0360-0300/03/1200-0399 $5.00
ACM Computing Surveys, Vol. 35, No. 4, December 2003, pp. 399–458.
1. INTRODUCTION
As one of the most successful applications
of image analysis and understanding, face
recognition has recently received signifi-
cant attention, especially during the past
few years. This is evidenced by the emer-
gence of face recognition conferences such
as the International Conference on Audio-
and Video-Based Biometric Person Authentication (AVBPA)
since 1997 and the International Con-
ference on Automatic Face and Gesture
Recognition (AFGR) since 1995, system-
atic empirical evaluations of face recog-
nition techniques (FRT),
including the
FERET [Phillips et al. 1998b, 2000; Rizvi
et al. 1998], FRVT 2000 [Blackburn et al.
2001], FRVT 2002 [Phillips et al. 2003],
and XM2VTS [Messer et al. 1999] pro-
tocols, and many commercially available
systems (Table II). There are at least two
reasons for this trend: the first is the wide
range of commercial and law enforcement
applications and the second is the avail-
ability of feasible technologies after 30
years of research. In addition, the prob-
lem of machine recognition of human faces
continues to attract researchers from dis-
ciplines such as image processing, pattern
recognition, neural networks, computer
vision, computer graphics, and psychology.
The strong need for user-friendly sys-
tems that can secure our assets and pro-
tect our privacy without losing our iden-
tity in a sea of numbers is obvious. At
present, one needs a PIN to get cash from
an ATM, a password for a computer, a
dozen others to access the internet, and
so on. Although very reliable methods of
biometric personal identification exist, for
example, fingerprint analysis and retinal
or iris scans, these methods rely on the
cooperation of the participants, whereas
a personal identification system based on
analysis of frontal or profile images of the
face is often effective without the partici-
pant’s cooperation or knowledge. Some of
the advantages/disadvantages of different
biometrics are described in Phillips et al.
[1998]. Table I lists some of the applica-
tions of face recognition.
Commercial and law enforcement ap-
plications of FRT range from static,
controlled-format photographs to uncon-
trolled video images, posing a wide range
of technical challenges and requiring an
equally wide range of techniques from im-
age processing, analysis, understanding,
and pattern recognition. One can broadly
classify FRT systems into two groups de-
pending on whether they make use of
static images or of video. Within these
groups, significant differences exist, de-
pending on the specific application. The
differences are in terms of image qual-
ity, amount of background clutter (posing
challenges to segmentation algorithms),
variability of the images of a particular
individual that must be recognized, avail-
ability of a well-defined recognition or
matching criterion, and the nature, type,
and amount of input from a user. A list
of some commercial systems is given in
Table II.
A general statement of the problem of machine recognition of faces can be
formulated as follows: given still or video images of a scene, identify or
verify one or more persons in the scene using a stored database of faces.
Available collateral information such as race, age, gender, facial
expression, or speech may be used in narrowing the search (enhancing
recognition). The solution to the problem involves segmentation of faces
(face detection) from cluttered scenes, feature extraction from the face
regions, and recognition or verification (Figure 1). In identification
problems, the input to the system is an unknown face, and the system reports
back the determined identity from a database of known individuals, whereas
in verification problems, the system needs to confirm or reject the claimed
identity of the input face.

Fig. 1. Configuration of a generic face recognition system.

Table I. Typical Applications of Face Recognition
Areas                  Specific applications
Entertainment          Video game, virtual reality, training programs
                       Human-robot-interaction, human-computer-interaction
Smart cards            Drivers’ licenses, entitlement programs
                       Immigration, national ID, passports, voter registration
                       Welfare fraud
Information security   TV Parental control, personal device logon, desktop logon
                       Application security, database security, file encryption
                       Intranet security, internet access, medical records
                       Secure trading terminals
Law enforcement        Advanced video surveillance, CCTV control
and surveillance       Portal control, postevent analysis
                       Shoplifting, suspect tracking and investigation

Table II. Available Commercial Face Recognition Systems (Some of these Web sites
may have changed or been removed.) [The identification of any company, commercial
product, or trade name does not imply endorsement or recommendation by the National
Institute of Standards and Technology or any of the authors or their institutions.]
Commercial products         Websites
FaceIt from Visionics       http://www.FaceIt.com
Viisage Technology          http://www.viisage.com
FaceVACS from Plettac       http://www.plettac-electronics.com
FaceKey Corp.               http://www.facekey.com
Cognitec Systems            http://www.cognitec-systems.de
Keyware Technologies        http://www.keywareusa.com/
Passfaces from ID-arts      http://www.id-arts.com/
ImageWare Software          http://www.iwsinc.com/
Eyematic Interfaces Inc.    http://www.eyematic.com/
BioID sensor fusion         http://www.bioid.com
Visionsphere Technologies   http://www.visionspheretech.com/menu.htm
Biometric Systems, Inc.     http://www.biometrica.com/
FaceSnap Recoder            http://www.facesnap.de/htdocs/english/index2.html
SpotIt for face composite   http://spotit.itc.it/SpotIt.html

Face perception is an important part of the capability of the human
perception system and is a routine task for humans, while building a similar
computer system is still an ongoing research area. The earliest work on face
recognition can be traced back at least to the 1950s in psychology [Bruner
and Tagiuri 1954] and to the 1960s in the engineering literature [Bledsoe
1964]. Some of the earliest studies include work on facial expression of
emotions by Darwin [1972] (see also Ekman [1998]) and on facial
profile-based biometrics by Galton [1888]. But research on automatic machine
recognition of faces really started in the 1970s [Kelly 1970] and after the
seminal work of Kanade [1973]. Over the past 30 years extensive research has
been conducted by psychophysicists, neuroscientists, and engineers on
various aspects of face recognition by humans and machines. Psychophysicists
and neuroscientists have been concerned with issues such as whether face
perception is a dedicated process (this issue is still being debated in the
psychology community [Biederman and Kalocsai 1998; Ellis 1986; Gauthier
et al. 1999; Gauthier and Logothetis 2000]) and whether it is done
holistically or by local feature analysis.
Many of the hypotheses and theories put forward by researchers in these
disciplines have been based on rather small sets of images. Nevertheless,
many of the findings have important consequences for engineers who design
algorithms and systems for machine recognition of human faces. Section 2
will present a concise review of these findings.
Barring a few exceptions that use range
data [Gordon 1991], the face recognition
problem has been formulated as recogniz-
ing three-dimensional (3D) objects from
two-dimensional (2D) images.¹ Earlier approaches
treated it as a 2D pattern recognition
problem. As a result, during the
early and mid-1970s, typical pattern clas-
sification techniques, which use measured
attributes of features (e.g., the distances
between important points) in faces or face
profiles, were used [Bledsoe 1964; Kanade
1973; Kelly 1970]. During the 1980s, work
on face recognition remained largely dor-
mant. Since the early 1990s, research in-
terest in FRT has grown significantly. One
can attribute this to several reasons: an in-
crease in interest in commercial opportu-
nities; the availability of real-time hard-
ware; and the increasing importance of
surveillance-related applications.
¹ There have been recent advances on 3D face recognition in situations where
range data acquired through structured light can be matched reliably
[Bronstein et al. 2003].

Over the past 15 years, research has focused on how to make face recognition
systems fully automatic by tackling problems such as localization of a face
in a given image or video clip and extraction of features such as eyes,
mouth, etc. Meanwhile, significant advances have been made in the design of
classifiers for successful face recognition. Among appearance-based holistic
approaches, eigenfaces [Kirby and Sirovich 1990; Turk and Pentland 1991] and
Fisherfaces [Belhumeur et al. 1997; Etemad and Chellappa 1997; Zhao et al.
1998] have proved to be effective in experiments with large databases.
Feature-based graph matching approaches [Wiskott et al. 1997] have also been
quite successful. Compared to holistic approaches, feature-based methods are
less sensitive to variations in illumination and viewpoint and to inaccuracy
in face localization. However, the feature extraction
techniques needed for this type of ap-
proach are still not reliable or accurate
enough [Cox et al. 1996]. For example,
most eye localization techniques assume
some geometric and textural models and
do not work if the eye is closed. Section 3
will present a review of still-image-based
face recognition.
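To make the holistic, appearance-based idea more concrete, the following is a minimal sketch of eigenface-style matching: images are treated as pixel vectors, a few principal directions of the gallery are estimated, and a probe is assigned the label of the nearest gallery image in the projected space. It is an illustration under stated assumptions, not the method of any cited paper: the function names (`principal_directions`, `fit_eigenfaces`, `project`, `identify`), the use of power iteration, the 4-pixel toy "images", and the nearest-neighbor rule are all choices made here for brevity.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def principal_directions(centered, k, iters=100):
    """Leading k principal directions by power iteration with deflation."""
    dim = len(centered[0])
    directions = []
    for c in range(k):
        # Deterministic, slightly uneven start vector (an arbitrary choice).
        v = [1.0 + 0.1 * ((i + c) % 3) for i in range(dim)]
        for _ in range(iters):
            # Deflation: remove what earlier directions already explain.
            for u in directions:
                d = dot(v, u)
                v = [vi - d * ui for vi, ui in zip(v, u)]
            # Apply the scatter matrix X^T X without forming it explicitly.
            scores = [dot(row, v) for row in centered]
            w = [sum(s * row[i] for s, row in zip(scores, centered))
                 for i in range(dim)]
            norm = math.sqrt(dot(w, w))
            if norm < 1e-12:  # no variance left in this direction
                break
            v = [wi / norm for wi in w]
        directions.append(v)
    return directions

def fit_eigenfaces(gallery, k):
    """Return the mean image and k principal directions of the gallery."""
    mean = [sum(col) / len(gallery) for col in zip(*gallery)]
    centered = [[x - m for x, m in zip(img, mean)] for img in gallery]
    return mean, principal_directions(centered, k)

def project(img, mean, directions):
    """Coefficients of a mean-subtracted image along each direction."""
    centered = [x - m for x, m in zip(img, mean)]
    return [dot(centered, u) for u in directions]

def identify(probe, gallery, labels, mean, directions):
    """Label of the gallery image nearest to the probe in the subspace."""
    p = project(probe, mean, directions)
    dists = [math.dist(p, project(g, mean, directions)) for g in gallery]
    return labels[dists.index(min(dists))]

# Toy gallery: two 4-pixel "images" for each of two hypothetical people.
gallery = [[10, 10, 0, 0], [9, 11, 1, 0], [0, 0, 10, 10], [1, 0, 9, 11]]
labels = ["A", "A", "B", "B"]
mean, directions = fit_eigenfaces(gallery, k=2)
print(identify([10, 9, 0, 1], gallery, labels, mean, directions))
```

Real systems work with thousands of pixels per image and would use a numerical linear algebra library; the pure-Python power iteration is used here only to keep the sketch self-contained.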
During the past 5 to 8 years, much re-
search has been concentrated on video-
based face recognition. The still image
problem has several inherent advantages
and disadvantages. For applications such
as drivers’ licenses, due to the controlled
nature of the image acquisition process,
the segmentation problem is rather easy.
However, if only a static picture of an air-
port scene is available, automatic location
and segmentation of a face could pose se-
rious challenges to any segmentation al-
gorithm. On the other hand, if a video
sequence is available, segmentation of a
moving person can be more easily accom-
plished using motion as a cue. But the
small size and low image quality of faces
captured from video can significantly in-
crease the difficulty in recognition. Video-
based face recognition is reviewed in
Section 4.
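The remark that motion can serve as a segmentation cue can be illustrated with simple frame differencing. The sketch below is only one elementary possibility, assuming a static camera and grayscale frames stored as nested lists; the function names `motion_mask` and `bounding_box` and the threshold value are hypothetical and are not drawn from any system surveyed here.

```python
def motion_mask(prev_frame, curr_frame, threshold=20):
    """Mark pixels whose intensity changed by more than the threshold."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_frame, curr_frame)]

def bounding_box(mask):
    """Smallest (top, left, bottom, right) box containing all moving pixels."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, moving in enumerate(row) if moving]
    if not rows:
        return None  # nothing moved
    return (min(rows), min(cols), max(rows), max(cols))

def make_frame(obj_left):
    """5x6 toy frame: background 50, a bright 2x2 object at rows 1-2."""
    frame = [[50] * 6 for _ in range(5)]
    for r in (1, 2):
        for c in (obj_left, obj_left + 1):
            frame[r][c] = 200
    return frame

prev_frame = make_frame(obj_left=1)   # object occupies columns 1-2
curr_frame = make_frame(obj_left=2)   # object moved to columns 2-3
print(bounding_box(motion_mask(prev_frame, curr_frame)))
```

Note that plain differencing marks both the vacated and the newly covered pixels, so the box spans the old and new object positions; the overlapping column, where intensity did not change, is not marked at all.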
As we propose new algorithms and build
more systems, measuring the performance
of new systems and of existing systems
becomes very important. Systematic data
collection and evaluation of face recognition
systems are reviewed in Section 5.
Recognizing a 3D object from its 2D im-
ages poses many challenges. The illumina-
tion and pose problems are two prominent
issues for appearance- or image-based ap-
proaches. Many approaches have been
proposed to handle these issues, with the
majority of them exploring domain knowl-
edge. Details of these approaches are dis-
cussed in Section 6.
In 1995, a review paper [Chellappa et al.
1995] gave a thorough survey of FRT
at that time. (An earlier survey [Samal
and Iyengar 1992] appeared in 1992.) At
that time, video-based face recognition
was still in a nascent stage. During the
past 8 years, face recognition has received
increased attention and has advanced
technically. Many commercial systems for
still face recognition are now available.
Recently, significant research efforts have
been focused on video-based face model-
ing/tracking, recognition, and system in-
tegration. New datasets have been created
and evaluations of recognition techniques
using these databases have been carried
out. It is not an overstatement to say that
face recognition has become one of the
most active applications of pattern recog-
nition, image analysis and understanding.
In this paper we provide a critical review
of current developments in face recogni-
tion. This paper is organized as follows: in
Section 2 we briefly review issues that are
relevant from a psychophysical point of
view. Section 3 provides a detailed review
of recent developments in face recognition
techniques using still images. In Section 4
face recognition techniques based on video
are reviewed. Data collection and perfor-
mance evaluation of face recognition algo-
rithms are addressed in Section 5 with de-
scriptions of representative protocols. In
Section 6 we discuss two important problems
in face recognition that can be studied
mathematically, namely lack of robustness to
illumination and to pose variations, and we
review proposed methods of overcoming
these limitations. Finally, a summary and
conclusions are presented in Section 7.
2. PSYCHOPHYSICS/NEUROSCIENCE ISSUES RELEVANT TO FACE RECOGNITION
Human recognition processes utilize a
broad spectrum of stimuli, obtained from
many, if not all, of the senses (visual,
auditory, olfactory, tactile, etc.). In many
situations, contextual knowledge is also
applied; for example, surroundings play
an important role in recognizing faces in
relation to where they are supposed to
be located. It is futile to even attempt to
develop a system, using existing technology,
that will mimic the remarkable face
recognition ability of humans. However,
the human brain has its limitations in the
total number of persons that it can accu-
rately “remember.” A key advantage of a
computer system is its capacity to handle
large numbers of face images. In most
applications the images are available only
in the form of single or multiple views of
2D intensity data, so that the inputs to
computer face recognition algorithms are
visual only. For this reason, the literature
reviewed in this section is restricted to
studies of human visual perception of
faces.
Many studies in psychology and neuro-
science have direct relevance to engineers
interested in designing algorithms or sys-
tems for machine recognition of faces. For
example, findings in psychology [Bruce
1988; Shepherd et al. 1981] about the rela-
tive importance of different facial features
have been noted in the engineering liter-
ature [Etemad and Chellappa 1997]. On
the other hand, machine systems provide
tools for conducting studies in psychology
and neuroscience [Hancock et al. 1998;
Kalocsai et al. 1998]. For example, a pos-
sible engineering explanation of the bot-
tom lighting effects studied in Johnston
et al. [1992] is as follows: when the actual
lighting direction is opposite to the usually
assumed direction, a shape-from-shading
algorithm recovers incorrect structural in-
formation and hence makes recognition of
faces harder.
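The engineering explanation above can be illustrated with a small numerical sketch. It assumes a Lambertian reflectance model and a one-dimensional height profile along the vertical axis of the face; both are simplifications introduced here and are not taken from Johnston et al. [1992]. Under these assumptions, a bump lit from below produces exactly the same shading as a dent lit from above, so an algorithm that assumes top lighting would recover the wrong structure from a bottom-lit image. The names `slopes` and `shade` are hypothetical.

```python
import math

def slopes(height):
    """Finite-difference slope of a 1D height profile."""
    n = len(height)
    out = []
    for i in range(n):
        lo, hi = max(i - 1, 0), min(i + 1, n - 1)
        out.append((height[hi] - height[lo]) / (hi - lo))
    return out

def shade(height, light_angle_deg):
    """Lambertian intensity of each sample for a distant light.

    The light direction is tilted from the viewing axis by the given angle:
    positive angles mean light from above, negative from below.
    """
    theta = math.radians(light_angle_deg)
    ly, lz = math.sin(theta), math.cos(theta)
    image = []
    for q in slopes(height):
        # Unit normal of the profile is (-q, 1) / sqrt(1 + q^2).
        intensity = (-q * ly + lz) / math.sqrt(1.0 + q * q)
        image.append(max(0.0, intensity))  # no negative light
    return image

bump = [0.0, 1.0, 3.0, 4.0, 3.0, 1.0, 0.0]
dent = [-h for h in bump]

bump_lit_from_below = shade(bump, -30)
dent_lit_from_above = shade(dent, +30)
same = all(math.isclose(a, b)
           for a, b in zip(bump_lit_from_below, dent_lit_from_above))
print(same)
```

The two images coincide because negating the slope and negating the vertical light component cancel in the dot product between the surface normal and the light direction.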
A detailed review of relevant studies in
psychophysics and neuroscience is beyond
the scope of this paper. We only summa-
rize findings that are potentially relevant
to the design of face recognition systems.
For details the reader is referred to the
papers cited below. Issues that are of potential
interest to designers are²:

² Readers should be aware of the existence of diverse opinions on some of
these issues. The opinions given here do not necessarily represent our views.

—Is face recognition a dedicated process?
[Biederman and Kalocsai 1998; Ellis
1986; Gauthier et al. 1999; Gauthier and
Logothetis 2000]: It is traditionally believed
that face recognition is a dedicated
process different from other object
recognition tasks. Evidence for the
existence of a dedicated face processing
system comes from several sources
[Ellis 1986]. (a) Faces are more easily
remembered by humans than other
objects when presented in an upright
orientation. (b) Prosopagnosia patients
are unable to recognize previously fa-
miliar faces, but usually have no other
profound agnosia. They recognize peo-
ple by their voices, hair color, dress, etc.
It should be noted that prosopagnosia
patients recognize whether a given ob-
ject is a face or not, but then have dif-
ficulty in identifying the face. Seven
differences between face recognition
and object recognition can be summa-
rized [Biederman and Kalocsai 1998]
based on empirical evidence: (1) con-
figural effects (related to the choice of
different types of machine recognition
systems), (2) expertise, (3) differences
verbalizable, (4) sensitivity to contrast
polarity and illumination direction (re-
lated to the illumination problem in ma-
chine recognition systems), (5) metric
variation, (6) rotation in depth (related
to the pose variation problem in ma-
chine recognition systems), and (7) ro-
tation in plane/inverted face. Contrary
to the traditionally held belief, some re-
cent findings in human neuropsychol-
ogy and neuroimaging suggest that face
recognition may not be unique. Accord-
ing to [Gauthier and Logothetis 2000],
recent neuroimaging studies in humans
indicate that level of categorization and
expertise interact to produce the specialization
for faces in the middle fusiform
gyrus.³ Hence it is possible that the encoding
scheme used for faces may also
be employed for other classes with similar
properties. (On recognition of familiar
vs. unfamiliar faces see Section 7.)

³ The fusiform gyrus or occipitotemporal gyrus, located on the ventromedial
surface of the temporal and occipital lobes, is thought to be critical for
face recognition.

—Is face perception the result of holistic
or feature analysis? [Bruce 1988; Bruce
et al. 1998]: Both holistic and feature
information are crucial for the perception
and recognition of faces. Studies
suggest the possibility of global descriptions
serving as a front end for finer,
feature-based perception. If dominant
features are present, holistic descriptions
may not be used. For example, in
face recall studies, humans quickly fo-
cus on odd features such as big ears, a
crooked nose, a staring eye, etc. One of
the strongest pieces of evidence to sup-
port the view that face recognition in-
volves more configural/holistic process-
ing than other object recognition has
been the face inversion effect in which
an inverted face is much harder to rec-
ognize than a normal face (first demon-
strated in [Yin 1969]). An excellent ex-
ample is given in [Bartlett and Searcy
1993] using the “Thatcher illusion”
[Thompson 1980]. In this illusion, the
eyes and mouth of an expressing face
are excised and inverted, and the re-
sult looks grotesque in an upright face;
however, when shown inverted, the face
looks fairly normal in appearance, and
the inversion of the internal features is
not readily noticed.
—Ranking of significance of facial features
[Bruce 1988; Shepherd et al. 1981]: Hair,
face outline, eyes, and mouth (not nec-
essarily in this order) have been de-
termined to be important for perceiv-
ing and remembering faces [Shepherd
et al. 1981]. Several studies have shown
that the nose plays an insignificant role;
this may be due to the fact that al-
most all of these studies have been done
using frontal images. In face recogni-
tion using profiles (which may be im-
portant in mugshot matching applica-
tions, where profiles can be extracted
from side views), a distinctive nose
shape could be more important than the
eyes or mouth [Bruce 1988]. Another
outcome of some studies is that both
external and internal features are im-
portant in the recognition of previ-
ously presented but otherwise unfamil-
iar faces, but internal features are more
dominant in the recognition of familiar
faces. It has also been found that the
upper part of the face is more useful
for face recognition than the lower part
[Shepherd et al. 1981]. The role of aes-
thetic attributes such as beauty, attrac-
tiveness, and/or pleasantness has also
been studied, with the conclusion that
the more attractive the faces are, the
better is their recognition rate; the least
attractive faces come next, followed by
the midrange faces, in terms of ease of
being recognized.
—Caricatures [Brennan 1985; Bruce 1988;
Perkins 1975]: A caricature can be for-
mally defined [Perkins 1975] as “a sym-
bol that exaggerates measurements rel-
ative to any measure which varies from
one person to another.” Thus the length
of a nose is a measure that varies from
person to person, and could be useful
as a symbol in caricaturing someone,
but not the number of ears. A stan-
dard caricature algorithm [Brennan
1985] can be applied to different qual-
ities of image data (line drawings and
photographs). Caricatures of line draw-
ings do not contain as much information
as photographs, but they manage to cap-
ture the important characteristics of a
face; experiments based on nonordinary
faces comparing the usefulness of line-
drawing caricatures and unexaggerated
line drawings decidedly favor the former
[Bruce 1988].
—Distinctiveness [Bruce et al. 1994]: Stud-
ies show that distinctive faces are bet-
ter retained in memory and are rec-
ognized better and faster than typical
faces. However, if a decision has to be
made as to whether an object is a face
or not, it takes longer to recognize an
atypical face than a typical face. This
may be explained by different mecha-
nisms being used for detection and for
identification.
—The role of spatial frequency analysis
[Ginsburg 1978; Harmon 1973; Sergent
1986]: Earlier studies [Ginsburg 1978;
Harmon 1973] concluded that informa-
tion in low spatial
frequency bands
plays a dominant role in face recog-
nition. Recent studies [Sergent 1986]
have shown that, depending on the specific
recognition task, the low-frequency, bandpass,
and high-frequency components
may play different roles. For example,
gender classification can be successfully
accomplished using low-frequency com-
ponents only, while identification re-
quires the use of high-frequency com-
ponents [Sergent 1986]. Low-frequency
components contribute to global de-
scription, while high-frequency compo-
nents contribute to the finer details
needed in identification.
—Viewpoint-invariant recognition? [Bie-
derman 1987; Hill et al. 1997; Tarr
and Bulthoff 1995]: Much work in vi-
sual object recognition (e.g. [Biederman
1987]) has been cast within a theo-
retical framework introduced in [Marr
1982] in which different views of ob-
jects are analyzed in a way which
allows access to (largely) viewpoint-
invariant descriptions. Recently, there
has been some debate about whether ob-
ject recognition is viewpoint-invariant
or not [Tarr and Bulthoff 1995]. Some
experiments suggest that memory for
faces is highly viewpoint-dependent.
Generalization even from one profile
viewpoint to another is poor, though
generalization from one three-quarter
view to the other is very good [Hill et al.
1997].
—Effect of lighting change [Bruce et al.
1998; Hill and Bruce 1996; Johnston
et al. 1992]: It has long been informally
observed that photographic negatives
of faces are difficult to recognize. How-
ever, relatively little work has explored
why it is so difficult to recognize nega-
tive images of faces. In [Johnston et al.
1992], experiments were conducted to
explore whether difficulties with nega-
tive images and inverted images of faces
arise because each of these manipula-
tions reverses the apparent direction of
lighting, rendering a top-lit image of a
face apparently lit from below. It was
demonstrated in [Johnston et al. 1992]
that bottom lighting does indeed make it
harder to identify familiar faces. In [Hill
and Bruce 1996], the importance of top
lighting for face recognition was demon-
strated using a different task: match-
ing surface images of faces to determine
whether they were identical.
—Movement and face recognition [O’Toole
et al. 2002; Bruce et al. 1998; Knight and
Johnston 1997]: A recent study [Knight
and Johnston 1997] showed that fa-
mous faces are easier to recognize when
shown in moving sequences than in
still photographs. This observation has
been extended to show that movement
helps in the recognition of familiar faces
shown under a range of different types
of degradations—negated, inverted, or
thresholded [Bruce et al. 1998]. Even
more interesting is the observation that
there seems to be a benefit
due to movement even if the informa-
tion content is equated in the mov-
ing and static comparison conditions.
However, experiments with unfamiliar
faces suggest no additional benefit from
viewing animated rather than static
sequences.
—Facial expressions [Bruce 1988]: Based
on neurophysiological studies, it seems
that analysis of facial expressions is ac-
complished in parallel to face recogni-
tion. Some prosopagnosic patients, who
have difficulties in identifying famil-
iar faces, nevertheless seem to recog-
nize expressions due to emotions. Pa-
tients who suffer from “organic brain
syndrome” suffer from poor expression
analysis but perform face recognition
quite well.⁴ Similarly, separation of face
recognition and “focused visual processing”
tasks (e.g., looking for someone with
a thick mustache) has been claimed.
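Of the findings listed above, the caricature item lends itself to a short worked example. The sketch below follows the definition quoted from [Perkins 1975]: measurements that vary from person to person are exaggerated relative to a norm. It is an illustration in that spirit only and does not reproduce the algorithm of [Brennan 1985]; the function name `caricature`, the three-point landmark layout, and the exaggeration factor are assumptions made here.

```python
def caricature(landmarks, average, exaggeration=1.5):
    """Push each landmark away from the average face by a constant factor.

    A factor of 1.0 returns the face unchanged; larger values exaggerate
    whatever makes this face differ from the average.
    """
    return [(ax + exaggeration * (x - ax), ay + exaggeration * (y - ay))
            for (x, y), (ax, ay) in zip(landmarks, average)]

# Hypothetical landmarks: left eye, right eye, nose tip (x, y).
average_face = [(0.0, 0.0), (4.0, 0.0), (2.0, 3.0)]
long_nosed_face = [(0.0, 0.0), (4.0, 0.0), (2.0, 5.0)]

exaggerated = caricature(long_nosed_face, average_face)
print(exaggerated)
```

In this toy case only the nose tip differs from the average, so only the nose is lengthened; the eye positions, which match the norm, stay where they are.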
⁴ From a machine recognition point of view, dramatic facial expressions may
affect face recognition performance if only one photograph is available.

3. FACE RECOGNITION FROM STILL IMAGES

As illustrated in Figure 1, the problem
of automatic face recognition involves
three key steps/subtasks: (1) detection and
rough normalization of faces, (2) feature
extraction and accurate normalization of
faces, (3) identification and/or verification.
Sometimes, different subtasks are not totally
separated. For example, the facial
features (eyes, nose, mouth) used for face
recognition are often used in face detection.
Face detection and feature extraction
can be achieved simultaneously, as indicated
in Figure 1. Depending on the nature
of the application, for example, the sizes of
the training and testing databases, clutter
and variability of the background, noise,
occlusion, and speed requirements, some
of the subtasks can be very challenging.
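The last two subtasks can be sketched in a few lines once detection is taken for granted. The example below assumes that landmark positions have already been located (subtask 1 is not implemented), builds a scale-normalized feature vector from distances between landmarks, and then shows the difference between identification (search the whole database) and verification (check one claimed identity against a threshold). All names, landmark coordinates, and the threshold are hypothetical and serve only as an illustration.

```python
import math

def geometric_features(landmarks):
    """Distances between landmarks, normalized by the inter-eye distance."""
    le, re = landmarks["left_eye"], landmarks["right_eye"]
    nose, mouth = landmarks["nose"], landmarks["mouth"]
    eye_dist = math.dist(le, re)  # normalization removes image scale
    return [math.dist(le, nose) / eye_dist,
            math.dist(re, nose) / eye_dist,
            math.dist(nose, mouth) / eye_dist,
            math.dist(le, mouth) / eye_dist]

def identify(probe, database):
    """Identification: return the closest known identity."""
    return min(database, key=lambda name: math.dist(probe, database[name]))

def verify(probe, claimed, database, threshold=0.1):
    """Verification: accept or reject a single claimed identity."""
    return math.dist(probe, database[claimed]) <= threshold

# Hypothetical enrolled individuals (pixel coordinates of landmarks).
database = {
    "alice": geometric_features({"left_eye": (30, 40), "right_eye": (70, 40),
                                 "nose": (50, 60), "mouth": (50, 80)}),
    "bob": geometric_features({"left_eye": (30, 40), "right_eye": (70, 40),
                               "nose": (50, 70), "mouth": (50, 95)}),
}

# A probe of the first face photographed at twice the size.
probe = geometric_features({"left_eye": (60, 80), "right_eye": (140, 80),
                            "nose": (100, 120), "mouth": (100, 160)})

print(identify(probe, database))
print(verify(probe, "alice", database), verify(probe, "bob", database))
```

Identification always returns some identity, even for a face that is not enrolled, whereas verification can reject; a practical identification system would therefore also need a rejection threshold.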
Though fully automatic face recognition
systems must perform all three subtasks,
research on each subtask is critical. This
is not only because the techniques used
for the individual subtasks need to be im-
proved, but also because they are critical
in many different applications (Figure 1).
For example, face detection is needed to
initialize face tracking, and extraction of
facial features is needed for recognizing
human emotion, which is in turn essential
in human-computer interaction (HCI) sys-
tems. Isolating the subtasks makes it eas-
ier to assess and advance the state of the
art of the component techniques. Earlier
face detection techniques could only han-
dle single or a few well-separated frontal
faces in images with simple backgrounds,
while state-of-the-art algorithms can de-
tect faces and their poses in cluttered
backgrounds [Gu et al. 2001; Heisele et al.
2001; Schneiderman and Kanade 2000; Vi-
ola and Jones 2001]. Extensive research on
the subtasks has been carried out and rel-
evant surveys have appeared on, for exam-
ple, the subtask of face detection [Hjelmas
and Low 2001; Yang et al. 2002].
In this section we survey the state of the
art of face recognition in the engineering
literature. For the sake of completeness,
in Section 3.1 we provide a highlighted
summary of research on face segmenta-
tion/detection and feature extraction. Sec-
tion 3.2 contains detailed reviews of recent
work on intensity image-based face recog-
nition and categorizes methods of recog-
nition from intensity images. Section 3.3
summarizes the status of face recognition
and discusses open research issues.
3.1. Key Steps Prior to Recognition: Face Detection and Feature Extraction
The first step in any automatic face
recognition system is the detection of
faces in images. Here we only provide a
summary on this topic and highlight a few