R 语言实现
「医学统计学及SAS 应用」之
UPDATE @ 2014-11-16
PREFACE
关于这份笔记,有几点需要说明:
0. 客套话。感谢您阅读本文,希望能与您共同进步。
1. 为什么是R。以前从来没想过用盗版SPSS 和SAS 有什么问题,直到到了国外,敏感的版
权形势下,R 是最好的选择。
2. 本文所引用数据的版权,归《医学统计学及SAS 应用》原作者所有。本文只探讨使用R 实
现的可能性,并且仅对本文中R 语言代码享有相关权利。此处还想特别感谢该书的编者之
3. 作者保留相关权利。在R 语言的世界里,倡导的精神是自由与分享。所以,在以非商业为
出处。4. 本文的作者学临床医学出身,并非统计专业,对统计知识也是处在一知半解之中,故文中
5. 有任何问题或指正,请随时联系作者邮箱:yefux@med.nagoya-u.ac.jp
一,上海交通大学医学院生物统计学教研室的王筱金老师,正是得益于她的用心教学,本文
作者才学会基本的统计学知识,也因此才有这份笔记。
目的的前提下,您可以自由地阅读、拷贝、使用和修改本文的代码。若您进行转载,请注明
的表述或程序可能存在各种错误和不足,还请读者注意辨别,切勿被作者误导,同时希望您
不吝指正。
Fuxiang Ye
于日本·名古屋
目 录
例3.19..........................................................................1
例3.20..........................................................................2
例3.21..........................................................................2
例3.22..........................................................................3
例3.23..........................................................................3
例4.13..........................................................................4
例4.14..........................................................................5
例4.15..........................................................................5
例4.16..........................................................................5
例4.17..........................................................................6
例4.18..........................................................................6
例5.1.............................................................................7
例5.2.............................................................................8
例5.3.............................................................................9
例5.4..........................................................................10
例5.5..........................................................................11
例5.6..........................................................................12
例5.7..........................................................................13
例6.1..........................................................................14
例6.2..........................................................................15
例6.3..........................................................................16
例6.4..........................................................................17
例6.5..........................................................................18
例6.6..........................................................................18
例7.1..........................................................................20
例7.2..........................................................................21
例7.3..........................................................................22
例7.4..........................................................................24
例8.6(程序8.1)...............................................24
例8.7(程序8.2)...............................................25
例8.8(程序8.3)...............................................26
例8.9(程序8.4)...............................................26
例8.10(程序8.5).............................................27
例8.11(程序8.6).............................................28
例8.12(程序8.7).............................................29
例9.1..........................................................................29
例9.2..........................................................................30
例9.4(程序9.3)...............................................30
例9.5(程序9.4)...............................................31
例9.6(程序9.5)..............................................31
例9.7(程序9.6)..............................................32
例9.8(程序9.7)..............................................32
例9.9(程序9.8)..............................................33
例9.11(程序9.9)...........................................33
例9.12(程序9.10).........................................33
例9.13(程序9.11).........................................34
例9.14(程序9.12).........................................36
例9.15(程序9.13).........................................37
例9.16(程序9.14).........................................37
例9.17(程序9.15).........................................38
例9.18(程序9.16).........................................38
例9.19(程序9.17).........................................39
例10.1-2..................................................................39
例10.3......................................................................41
例11.1......................................................................41
例11.2......................................................................42
例11.4(程序11.3).........................................43
例11.5(程序11.4a)......................................44
例11.5(程序11.4b、c)...............................45
例11.6(程序11.5).........................................46
例11.7(程序11.6a)......................................47
例11.7(程序11.6b)......................................48
例12.1......................................................................49
例12.3(程序12.2).........................................50
例12.4(程序12.3).........................................51
例12.5(程序12.4).........................................53
例13.2......................................................................54
例13.3......................................................................55
例13.4......................................................................55
例15.3(程序15_1)........................................57
例15.4(程序15_2)........................................57
例15.5(程序15_3)........................................58
例15.6(程序15_4)........................................59
例15.7(程序15_5)........................................60
例15.8(程序15_6)........................................61
例15.9(程序15_7)........................................62
附 录..........................................................................63
例 3.19
:直方图绘制
123456789101112131415161718192021
x <- c(108.0, 97.6, 103.4, 101.6, 104.4, 98.5, 110.5, 103.8, 109.7, 109.8,
104.5, 99.5, 104.0, 103.9, 97.2, 106.3, 106.2, 107.6, 108.3, 97.6,
102.7, 103.7, 107.6, 103.2, 103.6, 103.3, 102.8, 102.3, 102.2, 103.3,
101.2, 107.5, 106.3, 109.7, 99.5, 107.4, 103.4, 106.6, 105.7, 107.4,
103.0, 109.6, 106.4, 107.3, 100.6, 112.3, 100.5, 101.9, 98.8, 99.7,
104.3, 110.2, 105.3, 95.2, 105.8, 105.2, 106.1, 103.6, 106.6, 105.1,
105.5, 113.5, 107.7, 106.8, 106.2, 109.8, 99.7, 107.9, 104.8, 103.9,
106.8, 106.4, 108.3, 106.5, 103.3, 107.7, 106.2, 100.4, 102.6, 102.1,
110.6, 112.2, 110.2, 103.7, 102.3, 112.1, 105.4, 104.2, 105.7, 104.4,
102.8, 107.8, 102.5, 102.3, 105.8, 103.7, 103.1, 101.6, 106.5, 100.0,
103.2, 109.3, 105.8, 106.1, 104.9, 105.9, 105.3, 103.7, 99.6, 106.2,
102.5, 108.1, 106.1, 108.3, 99.8, 108.3, 104.0, 100.6, 112.6, 103.7)
hist(x, breaks=seq(95, 115, 2), xaxt="n", yaxt="n", col=gray(0.6), border=0)
# breaks=语句用于划分直方图单元格区间
# seq(95,115,2) 表示构建一组从95 到115,公差为2 的等差数列
# xaxt="n", yaxt="n" 表示不绘制坐标轴
axis(side = 1, at = seq(95, 117, 2))
axis(side = 2, at = seq(0, 30, 10))
# 此2 句代码为R 语言的低级绘图命令,用于人工控制图像
# axis() 函数用于生成坐标轴,side=1 为横坐标,side=2 为纵坐标
# 参数at 定义了坐标轴的数值,一般用向量表示
结果:
1
例 3.20
:统计描述1234567
# 数据见例3.19
mean(x)
sd(x)
cv <- sd(x) / mean(x) * 100 # 计算变异系数
print(cv)
min(x)
max(x)
结果:
例 3.21
:分组统计描述
1234567891011121314
x <- c(1, 1.8, 1.4, 2, 1.7, 1.1, 1, 2.2, 1.5, 3, 1.9, 1.2, 2, 2.5, 1.0, 1, 2.7, 1.6, 2, 2.3, 1.3, 2, 2.8,
0.9, 3, 3.0, 1.1, 1, 2.6, 1.4, 1, 2.4, 1.2, 2, 1.9, 1.3, 3, 2.9, 0.8, 1, 3.2, 1.7, 3, 3.1, 1.5, 2,
2.6, 1.9, 3, 3.5, 1.6, 3, 3.3, 1.5)
group <- x[c(seq(1, length(x), 3))]
VA <- x[c(seq(2, length(x), 3))]
VB1 <- x[c(seq(3, length(x), 3))]
xframe <- data.frame(group=group, VA=VA, VB1= VB1)
# 建立group, VA, VB1 的三列csv 文件,导入后较为简便
# 最后一句为建立共有三列的数据框
# 安装doBy package。更详细说明使用 ?doBy 查询
# install.package(“doBy”)
library(doBy)
summaryBy(VA+ VB1~ group, data= xframe, FUN= c(mean, sd, max, min))
# 以group 为分组变量,FUN=c()指出所求参数。
# 更详细说明使用 ?summaryBy 查询
结果:
2
例 3.22
:以频数计算均值(对数转换)
x <- c(4, 8, 16, 32, 64, 128, 256, 512)
freq <- c(1, 3, 8, 13, 21, 9, 4, 1)
y <- log10(x)
meany <- sum(y * freq) / sum(freq)
# 用2 个向量计算带频数的均值
meanx <- 10 ^ meany
meanymeanx
12345678
12345678910111213
结果:
3
结果:
例 3.23
:频数表,正态性检验,茎叶图,箱线图
# 数据见例3.19
summary(x)
# 数据简单描述
table(x)
# 制作频数表
shapiro.test(x)
# 正态性检验之Shapiro-Wilk 检验
stem(x)
# 画出茎叶图
boxplot(x)
# 画出箱线图
qqnorm(x)
# 画出正态概率图(Q-Q 图)
例 4.13
:单样本t 检验,与0 比较
123 # 数据同例3.19
t.test(x)
# 单样本t 检验的结果中包含95%置信区间
结果:
4
例 4.14
:单样本t 检验
1234
x <- c(74, 73, 68, 75, 75, 82, 80, 69, 72, 74, 83, 72, 71, 74, 76, 79, 67, 73, 81, 70, 67, 70,
d <- x -72
t.test(d)
78, 69, 70, 72, 67, 74, 80, 66)
结果:
例 4.15
:配对t 检验
1234
x <- data.frame(zhch=c(3550, 2000, 3000, 3950, 3800, 3750, 3450, 3050,
3350, 3650), quefa=c(2450, 2400, 1800, 3200, 3250,
2700, 2500, 1750, 2100, 2550))
t.test(x$zhch, x$quefa, paired = TRUE)
结果:
例 4.16
:独立样本(成组)t 检验
y <- c(1, 26, 1, 32, 1, 25, 1, 22, 1, 20, 1, 28, 1, 24, 1, 19, 1, 29, 1, 17, 1, 34, 1, 21, 1, 20, 1,
23, 1, 27, 2, 21, 2, 23, 2, 18, 2, 24, 2, 23, 2, 19, 2, 16, 2, 22, 2, 20, 2, 25, 2, 23, 2,
17, 2, 15, 2, 26, 2, 22)
group <- y[c(seq(1, length(y), 2 ) ) ]
x <- y[c(seq(2, length(y), 2 ) ) ]
t.test(x~group)
123456
5