Use R!
Series Editors:
Robert Gentleman Kurt Hornik Giovanni Parmigiani
Albert: Bayesian Computation with R
Bivand/Pebesma/Gómez-Rubio: Applied Spatial Data Analysis with R
Cook/Swayne: Interactive and Dynamic Graphics for Data Analysis: With R and GGobi
Hahne/Huber/Gentleman/Falcon: Bioconductor Case Studies
Paradis: Analysis of Phylogenetics and Evolution with R
Pfaff: Analysis of Integrated and Cointegrated Time Series with R
Sarkar: Lattice: Multivariate Data Visualization with R
Spector: Data Manipulation with R
Roger S. Bivand • Edzer J. Pebesma
Virgilio Gómez-Rubio
Applied Spatial Data
Analysis with R
Roger S. Bivand
Norwegian School of Economics
and Business Administration
Breiviksveien 40
5045 Bergen
Norway
Edzer J. Pebesma
University of Utrecht
Department of Physical Geography
3508 TC Utrecht
Netherlands
Virgilio Gómez-Rubio
Department of Epidemiology
and Public Health
Imperial College London
St. Mary’s Campus
Norfolk Place
London W2 1PG
United Kingdom
Series Editors:
Robert Gentleman
Program in Computational Biology
Division of Public Health Sciences
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N, M2-B876
Seattle, Washington 98109-1024
USA
Kurt Hornik
Department für Statistik und Mathematik
Wirtschaftsuniversität Wien
Augasse 2-6
A-1090 Wien
Austria
Giovanni Parmigiani
The Sidney Kimmel Comprehensive Cancer
Center at Johns Hopkins University
550 North Broadway
Baltimore, MD 21205-2011
USA
ISBN 978-0-387-78170-9
DOI 10.1007/978-0-387-78171-6
e-ISBN 978-0-387-78171-6
Library of Congress Control Number: 2008931196
© 2008 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the written
permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY
10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection
with any form of information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are
not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject
to proprietary rights.
Printed on acid-free paper
springer.com
Ewie
For Ellen, Ulla and Mandus
To my parents, Victorina and Virgilio Benigno
Preface
We began writing this book in parallel with developing software for handling
and analysing spatial data with R (R Development Core Team, 2008). Although
the book is now complete, software development will continue, in the
R community fashion, of rich and satisfying interaction with users around the
world, of rapid releases to resolve problems, and of the usual joys and
frustrations of getting things done. There is little doubt that without pressure
from users, the development of R would not have reached its present scale, and
the same applies to spatial data analysis with R.
It would, however, not be sufficient to describe the development of the
R project mainly in terms of narrowly defined utility. In addition to being a
community project concerned with the development of world-class data analysis
software implementations, it promotes specific choices with regard to how
data analysis is carried out. R is open source not only because open source
software development, including the dynamics of broad and inclusive user and
developer communities, is arguably an attractive and successful development
model.
R is also, or perhaps chiefly, open source because the analysis of empirical
and simulated data in science should be reproducible. As working researchers,
we are all too aware of the possibility of reaching inappropriate conclusions
in good faith because of user error or misjudgement. When the results of
research really matter, as in public health, in climate change, and in many
other fields involving spatial data, good research practice dictates that
someone else should be, at least in principle, able to check the results. Open
source software means that the methods used can, if required, be audited, and
journalling working sessions can ensure that we have a record of what we
actually did, not what we thought we did. Further, using Sweave¹ (a tool that
permits the embedding of R code for complete data analyses in documents)
throughout this book has provided crucial support (Leisch, 2002; Leisch and
Rossini, 2003).
¹ http://www.statistik.lmu.de/~leisch/Sweave/.
We acknowledge our debt to the members of R-core for their continuing
commitment to the R project. In particular, the leadership and example
of Professor Brian Ripley have been important to us, although our admitted
‘muddling through’ contrasts with his peerless attention to detail. His
interested support at the Distributed Statistical Computing conference in Vienna
in 2003 helped us to see that encouraging spatial data analysis in R was a
project worth pursuing. Kurt Hornik’s dedication to keeping the Comprehensive
R Archive Network running smoothly, providing package maintainers with
superb, almost 24/7, service, and his dry humour when we blunder, have
meant that the useR community is provided with contributed software in an
unequalled fashion. We are also grateful to Martin Mächler for his help in
setting up and hosting the R-Sig-Geo mailing list, without which we would
not have had a channel for fostering the R spatial community.
We also owe a great debt to users participating in discussions on the mailing
list, sometimes for specific suggestions, often for fruitful questions, and
occasionally for perceptive bug reports or contributions. Other users contact
us directly, again with valuable input that leads both to a better understanding
on our part of their research realities and to the improvement of the software
involved. Finally, participants at R spatial courses, workshops, and tutorials
have been patient and constructive.
We are also indebted to colleagues who have contributed to improving the
final manuscript by commenting on earlier drafts and pointing out better
procedures to follow in some examples. In particular, we would like to mention
Juanjo Abellán, Nicky Best, Peter J. Diggle, Paul Hiemstra, Rebeca Ramis,
Paulo J. Ribeiro Jr., Barry Rowlingson, and Jon O. Skøien. We are also
grateful to colleagues for agreeing to our use of their data sets. Support from
Luc Anselin has been important over a long period, including a very fruitful
CSISS workshop in Santa Barbara in 2002. Work by colleagues, such as the first
book known to us on using R for spatial data analysis (Kopczewska, 2006),
provided further incentives both to simplify the software and complete its
description. Without John Kimmel’s patient encouragement, it is unlikely that
we would have finished this book.
Even though we have benefitted from the help and advice of so many
people, there are bound to be things we have not yet grasped, so any remaining
mistakes and omissions are our sole responsibility. We would be grateful
for messages pointing out errors in this book; errata will be posted on the
book website (http://www.asdar-book.org).
Bergen     Roger S. Bivand
Münster    Edzer J. Pebesma
London     Virgilio Gómez-Rubio
April 2008