A Beginner's Guide to R (R Language Developer's Guide, Original English Edition).pdf

Use R!
A Beginner’s Guide to R
Preface
The Absolute R Beginner
Datasets Used in This Book
Acknowledgements
Contents
Introduction
1.1 What Is R?
1.2 Downloading and Installing R
1.3 An Initial Impression
1.4 Script Code
1.4.1 The Art of Programming
1.4.2 Documenting Script Code
1.5 Graphing Facilities in R
1.6 Editors
1.7 Help Files and Newsgroups
1.8 Packages
1.8.1 Packages Included with the Base Installation
1.8.2 Packages Not Included with the Base Installation
1.8.2.1 Option 1. Manual Download and Installation
1.8.2.2 Option 2. Download and Install a Package from Within R
1.8.3 Loading the Package
1.8.4 How Good Is a Package?
1.9 General Issues in R
1.9.1 Quitting R and Setting the Working Directory
1.10 A History and a Literature Overview
1.10.1 A Short Historical Overview of R
1.10.2 Books on R and Books Using R
1.10.2.1 The Use R! Series
1.11 Using This Book
1.11.1 If You Are an Instructor
1.11.2 If You Are an Interested Reader with Limited R Experience
1.11.3 If You Are an R Expert
1.11.4 If You Are Afraid of R
1.12 Citing R and Citing Packages
1.13 Which R Functions Did We Learn?
Getting Data into R
2.1 First Steps in R
2.1.1 Typing in Small Datasets
2.1.2 Concatenating Data with the c Function
2.1.3 Combining Variables with the c, cbind, and rbind Functions
2.1.4 Combining Data with the vector Function*
2.1.5 Combining Data Using a Matrix*
2.1.6 Combining Data with the data.frame Function
2.1.7 Combining Data Using the list Function*
2.2 Importing Data
2.2.1 Importing Excel Data
2.2.1.1 Prepare the Data in Excel
2.2.1.2 Export Data to a Tab-Delimited ascii File
2.2.1.3 Using the read.table Function
2.2.2 Accessing Data from Other Statistical Packages**
2.2.3 Accessing a Database***
2.3 Which R Functions Did We Learn?
2.4 Exercises
Accessing Variables and Managing Subsets of Data
3.1 Accessing Variables from a Data Frame
3.1.1 The str Function
3.1.2 The Data Argument in a Function
3.1.3 The $ Sign
3.1.4 The attach Function
3.2 Accessing Subsets of Data
3.2.1 Sorting the Data
3.3 Combining Two Datasets with a Common Identifier
3.4 Exporting Data
3.5 Recoding Categorical Variables
3.6 Which R Functions Did We Learn?
3.7 Exercises
Simple Functions
4.1 The tapply Function
4.1.1 Calculating the Mean Per Transect
4.1.2 Calculating the Mean Per Transect More Efficiently
4.2 The sapply and lapply Functions
4.3 The summary Function
4.4 The table Function
4.5 Which R Functions Did We Learn?
4.6 Exercises
An Introduction to Basic Plotting Tools
5.1 The plot Function
5.2 Symbols, Colours, and Sizes
5.2.1 Changing Plotting Characters
5.2.1.1 Use of a Vector for pch
5.2.2 Changing the Colour of Plotting Symbols
5.2.2.1 Use of a Vector for col
5.2.3 Altering the Size of Plotting Symbols
5.2.3.1 Use of a Vector for cex
5.3 Adding a Smoothing Line
5.4 Which R Functions Did We Learn?
5.5 Exercises
Loops and Functions
6.1 Introduction to Loops
6.2 Loops
6.2.1 Be the Architect of Your Code
6.2.2 Step 1: Importing the Data
6.2.3 Steps 2 and 3: Making the Scatterplot and Adding Labels
6.2.4 Step 4: Designing General Code
6.2.5 Step 5: Saving the Graph
6.2.6 Step 6: Constructing the Loop
6.3 Functions
6.3.1 Zeros and NAs
6.3.2 Technical Information
6.3.3 A Second Example: Zeros and NAs
6.3.4 A Function with Multiple Arguments
6.3.5 Foolproof Functions
6.3.5.1 Default Values for Variables in Function Arguments
6.3.5.2 Misspelling
6.4 More on Functions and the if Statement
6.4.1 Playing the Architect Again
6.4.2 Step 1: Importing and Assessing the Data
6.4.3 Step 2: Total Abundance per Site
6.4.4 Step 3: Richness per Site
6.4.5 Step 4: Shannon Index per Site
6.4.6 Step 5: Combining Code
6.4.7 Step 6: Putting the Code into a Function
6.5 Which R Functions Did We Learn?
6.6 Exercises
Graphing Tools
7.1 The Pie Chart
7.1.1 Pie Chart Showing Avian Influenza Data
7.1.2 The par Function
7.2 The Bar Chart and Strip Chart
7.2.1 The Bar Chart Using the Avian Influenza Data
7.2.2 A Bar Chart Showing Mean Values with Standard Deviations
7.2.3 The Strip Chart for the Benthic Data
7.3 Boxplot
7.3.1 Boxplots Showing the Owl Data
7.3.2 Boxplots Showing the Benthic Data
7.4 Cleveland Dotplots
7.4.1 Adding the Mean to a Cleveland Dotplot
7.5 Revisiting the plot Function
7.5.1 The Generic plot Function
7.5.2 More Options for the plot Function
7.5.3 Adding Extra Points, Text, and Lines
7.5.4 Using type = "n"
7.5.5 Legends
7.5.6 Identifying Points
7.5.7 Changing Fonts and Font Size*
7.5.8 Adding Special Characters
7.5.9 Other Useful Functions
7.6 The Pairplot
7.6.1 Panel Functions
7.7 The Coplot
7.7.1 A Coplot with a Single Conditioning Variable
7.7.2 The Coplot with Two Conditioning Variables
7.7.3 Jazzing Up the Coplot*
7.8 Combining Types of Plots*
7.9 Which R Functions Did We Learn?
7.10 Exercises
An Introduction to the Lattice Package
8.1 High-Level Lattice Functions
8.2 Multipanel Scatterplots: xyplot
8.3 Multipanel Boxplots: bwplot
8.4 Multipanel Cleveland Dotplots: dotplot
8.5 Multipanel Histograms: histogram
8.6 Panel Functions
8.6.1 First Panel Function Example
8.6.2 Second Panel Function Example
8.6.3 Third Panel Function Example*
8.7 3-D Scatterplots and Surface and Contour Plots
8.8 Frequently Asked Questions
8.8.1 How to Change the Panel Order?
8.8.2 How to Change Axes Limits and Tick Marks?
8.8.3 Multiple Graph Lines in a Single Panel
8.8.4 Plotting from Within a Loop*
8.8.5 Updating a Plot
8.9 Where to Go from Here?
8.10 Which R Functions Did We Learn?
8.11 Exercises
Common R Mistakes
9.1 Problems Importing Data
9.1.1 Errors in the Source File
9.1.2 Decimal Point or Comma Separation
9.1.3 Directory Names
9.2 Attach Misery
9.2.1 Entering the Same attach Command Twice
9.2.2 Attaching Two Data Frames Containing the Same Variable Names
9.2.3 Attaching a Data Frame and Demo Data
9.2.4 Making Changes to a Data Frame After Applying the attach Function
9.3 Non-attach Misery
9.4 The Log of Zero
9.5 Miscellaneous Errors
9.5.1 The Difference Between 1 and l
9.5.2 The Colour of 0
9.5.3 Mistakenly Saved the R Workspace
References
Index
Use R! Advisors: Robert Gentleman  Kurt Hornik  Giovanni Parmigiani
Use R!
Series Editors: Robert Gentleman, Kurt Hornik, and Giovanni Parmigiani

Albert: Bayesian Computation with R
Bivand/Pebesma/Gómez-Rubio: Applied Spatial Data Analysis with R
Claude: Morphometrics with R
Cook/Swayne: Interactive and Dynamic Graphics for Data Analysis: With R and GGobi
Hahne/Huber/Gentleman/Falcon: Bioconductor Case Studies
Kleiber/Zeileis: Applied Econometrics with R
Nason: Wavelet Methods in Statistics with R
Paradis: Analysis of Phylogenetics and Evolution with R
Peng/Dominici: Statistical Methods for Environmental Epidemiology with R: A Case Study in Air Pollution and Health
Pfaff: Analysis of Integrated and Cointegrated Time Series with R, 2nd edition
Sarkar: Lattice: Multivariate Data Visualization with R
Spector: Data Manipulation with R
Alain F. Zuur · Elena N. Ieno · Erik H.W.G. Meesters
A Beginner’s Guide to R
Alain F. Zuur
Highland Statistics Ltd.
6 Laverock Road
Newburgh
United Kingdom AB41 6FN
highstat@highstat.com

Elena N. Ieno
Highland Statistics Ltd.
6 Laverock Road
Newburgh
United Kingdom AB41 6FN
bio@highstat.com

Erik H.W.G. Meesters
IMARES, Institute for Marine Resources & Ecosystem Studies
1797 SH ’t Horntje
The Netherlands
erik.meesters@wur.nl

ISBN 978-0-387-93836-3
e-ISBN 978-0-387-93837-0
DOI 10.1007/978-0-387-93837-0
Springer Dordrecht Heidelberg London New York
Library of Congress Control Number: 2009929643

© Springer Science+Business Media, LLC 2009
All rights reserved. This work may not be translated or copied in whole or in part without the written permission of the publisher (Springer Science+Business Media, LLC, 233 Spring Street, New York, NY 10013, USA), except for brief excerpts in connection with reviews or scholarly analysis. Use in connection with any form of information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed is forbidden.
The use in this publication of trade names, trademarks, service marks, and similar terms, even if they are not identified as such, is not to be taken as an expression of opinion as to whether or not they are subject to proprietary rights.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)
To my future niece (who will undoubtedly cost me a lot of money)
Alain F. Zuur

To Juan Carlos and Norma
Elena N. Ieno

For Leontine and Ava, Rick, and Merel
Erik H.W.G. Meesters
Preface

The Absolute R Beginner

For whom was this book written? Since 2000, we have taught statistics to over 5000 life scientists. This sounds a lot, and indeed it is, but with some classes of 200 undergraduate students, numbers accumulate rapidly (although some courses have involved as few as 6 students). Most of our teaching has been done in Europe, but we have also conducted courses in South America, Central America, the Middle East, and New Zealand. Of course teaching at universities and research organisations means that our students may be from almost anywhere in the world. Participants have included undergraduates, but most have been MSc students, postgraduate students, post-docs, or senior scientists, along with some consultants and nonacademics.

This experience has given us an informed awareness of the typical life scientist’s knowledge of statistics. The word “typical” may be misleading, as those scientists enrolling in a statistics course are likely to be those who are unfamiliar with the topic or have become rusty. In general, we have worked with people who, at some stage in their education or career, have completed a statistics course covering such topics as mean, variance, t-test, Chi-square test, and hypothesis testing, and perhaps including half an hour devoted to linear regression.

There are many books available on doing statistics with R. But this book does not deal with statistics, as, in our experience, teaching statistics and R at the same time means two steep learning curves, one for the statistical methodology and one for the R code. This is more than many students are prepared to undertake. This book is intended for people seeking an elementary introduction to R.

Obviously, the term “elementary” is vague; elementary in one person’s view may be advanced in another’s. R contains a high “you need to know what you are doing” content, and its application requires a considerable amount of logical thinking.

As statisticians, it is easy to sit in an ivory tower and expect the life scientist to knock on our door and ask to learn our language. This book aims to make that language as simple as possible. If the phrase “absolute beginner” offends, we apologize, but it answers the question: For whom is this book intended?

All authors of this book are Windows users and have limited experience with Linux and with Mac OS. R is also available for computers with these operating systems, and all the R code we present should run properly on them. However, there may be small differences with saving graphs. Non-Windows users will also need to find an alternative to the text editor Tinn-R (Chapter 1 discusses where you can find information on this).

Datasets Used in This Book

This book uses mainly life science data. Nevertheless, whatever your area of study and whatever your data, the procedures presented will apply. Scientists in all fields need to import data, massage data, make graphs, and, finally, perform analyses. The R commands will be very similar in every case.

A 200-page book does not offer a great deal of scope for presenting a variety of dataset types, and, in our experience, widely divergent examples confuse the reader. The optimal approach may be to use a single dataset to demonstrate all techniques, but this does not make many people happy. Therefore, we have used ecological datasets (e.g., involving plants, marine benthos, fish, birds) and epidemiological datasets. All datasets used in this book are downloadable from www.highstat.com.

Newburgh    Alain F. Zuur
Newburgh    Elena N. Ieno
Den Burg    Erik H.W.G. Meesters