LEARNING FROM DATA

The book website AMLbook.com contains supporting material for instructors and readers.

LEARNING FROM DATA
A SHORT COURSE

Yaser S. Abu-Mostafa
California Institute of Technology

Malik Magdon-Ismail
Rensselaer Polytechnic Institute

Hsuan-Tien Lin
National Taiwan University

AMLbook.com

Yaser S. Abu-Mostafa
Departments of Electrical Engineering and Computer Science
California Institute of Technology
Pasadena, CA 91125, USA
yaser@caltech.edu

Malik Magdon-Ismail
Department of Computer Science
Rensselaer Polytechnic Institute
Troy, NY 12180, USA
magdon@cs.rpi.edu

Hsuan-Tien Lin
Department of Computer Science and Information Engineering
National Taiwan University
Taipei, 106, Taiwan
htlin@csie.ntu.edu.tw

ISBN 10: 1-60049-006-9
ISBN 13: 978-1-60049-006-4

©2012 Yaser S. Abu-Mostafa, Malik Magdon-Ismail, Hsuan-Tien Lin.
1.10 

All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the authors. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, scanning, or otherwise, without prior written permission of the authors, except as permitted under Section 107 or 108 of the 1976 United States Copyright Act.

Limit of Liability/Disclaimer of Warranty: While the authors have used their best efforts in preparing this book, they make no representation or warranties with respect to the accuracy or completeness of the contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives or written sales materials. The advice and strategies contained herein may not be suitable for your situation. You should consult with a professional where appropriate. The authors shall not be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

The use in this publication of tradenames, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

This book was typeset by the authors and was printed and bound in the United States of America.

To our teachers,
and to our students

Preface 

This book is designed for a short course on machine learning. It is a short course, not a hurried course. From over a decade of teaching this material, we have distilled what we believe to be the core topics that every student of the subject should know. We chose the title 'learning from data' that faithfully describes what the subject is about, and made it a point to cover the topics in a story-like fashion. Our hope is that the reader can learn all the fundamentals of the subject by reading the book cover to cover.

Learning from data has distinct theoretical and practical tracks. If you read two books that focus on one track or the other, you may feel that you are reading about two different subjects altogether. In this book, we balance the theoretical and the practical, the mathematical and the heuristic. Our criterion for inclusion is relevance. Theory that establishes the conceptual framework for learning is included, and so are heuristics that impact the performance of real learning systems. Strengths and weaknesses of the different parts are spelled out. Our philosophy is to say it like it is: what we know, what we don't know, and what we partially know.

The book can be taught in exactly the order it is presented. The notable exception may be Chapter 2, which is the most theoretical chapter of the book. The theory of generalization that this chapter covers is central to learning from data, and we made an effort to make it accessible to a wide readership. However, instructors who are more interested in the practical side may skim over it, or delay it until after the practical methods of Chapter 3 are taught.

You will notice that we included exercises (in gray boxes) throughout the text. The main purpose of these exercises is to engage the reader and enhance understanding of a particular topic being covered. Our reason for separating the exercises out is that they are not crucial to the logical flow. Nevertheless, they contain useful information, and we strongly encourage you to read them, even if you don't do them to completion. Instructors may find some of the exercises appropriate as 'easy' homework problems, and we also provide additional problems of varying difficulty in the Problems section at the end of each chapter.

To help instructors with preparing their lectures based on the book, we provide supporting material on the book's website (AMLbook.com). There is also a forum that covers additional topics in learning from data. We will