Author: celestialgod (天)
Board: Statistics
Title: Re: [Program] Several R questions (sapply, ggplot)
Time: Tue Sep 15 23:12:32 2015
※ Quoting tony255034 (5245566):
: [Software category]: R
: [Problem]: the sapply family, and ggplot
: [Familiarity]: beginner
: [Description]:
: I have three questions.
: 1. Is the sapply family really faster than for?
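For whether it is faster, see this post:
#1LNhOYdK (R_Language)
Basically, sapply is mainly faster than a for loop that does not preallocate memory for its result.
In other cases the difference should be small.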
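
A minimal sketch of the preallocation point, using a hypothetical toy function `f` and an arbitrary `n` (neither comes from the original question). It compares a for loop that grows its result, a for loop that writes into a preallocated vector, and `vapply`, which behaves like sapply but with a declared result type. Timings depend on the machine, so none are shown here.

```r
# Hypothetical workload; f and n are illustrative only.
f <- function(i) i^2
n <- 1e4

# 1) for loop that grows the result one element at a time
grow <- c()
for (i in seq_len(n)) grow[i] <- f(i)

# 2) for loop writing into a preallocated vector
pre <- numeric(n)
for (i in seq_len(n)) pre[i] <- f(i)

# 3) vapply: like sapply, but the result type and length are declared
vap <- vapply(seq_len(n), f, numeric(1))

# All three approaches produce the same values
stopifnot(all(grow == pre), all(pre == vap))
```

Wrapping each variant in `system.time()` is one way to compare them on your own data.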
: The data is about 14000 rows by 380 columns.
: Today I rewrote a for loop with sapply, and sapply turned out to be 1.2 seconds slower than the for loop.
: The code is:
: for(x in 1:rd){
: tool[i,1] = grep("1",data[i,3:rd]);
: }
: tool <- sapply(1:rd,function(x) grep("1",data[i,3:rd]))
: The goal is to find, for each row of a matrix (a 0/1 matrix with exactly one 1 per row), the index of the 1.
: Is there a faster way to write this?
# note: the result comes back in column-major order; reorder by row index if row order matters
which(dat[,3:nc] == "1", arr.ind=TRUE)[,2]
Here is a simple benchmark:
nr = 14000
nc = 380
dat = matrix("0",nr, nc)
tool_true = sample(3:nc, nr, TRUE)
dat[cbind(1:nr, tool_true)] = "1"
st = proc.time()
tool = 0
for(i in 1:nr)
tool[i] = grep("1",dat[i,3:nc])
proc.time() - st
# user system elapsed
# 1.84 0.05 1.89
st = proc.time()
tool2 = vector('integer', nr)
for(i in 1:nr)
tool2[i] = grep("1",dat[i,3:nc])
proc.time() - st
# user system elapsed
# 1.53 0.00 1.56
st = proc.time()
tool3 = sapply(1:nr, function(i) grep("1",dat[i,3:nc]))
proc.time() - st
# user system elapsed
# 1.42 0.02 1.44
st = proc.time()
tool4 = which(t(dat[,3:nc] == "1")) %% (nc-2)
tool4 = ifelse(tool4 == 0, nc-2, tool4)
proc.time() - st
# user system elapsed
# 0.52 0.04 0.54
st = proc.time()
tool5 = which(dat[,3:nc] == "1", arr.ind=TRUE)[,2]
proc.time() - st
# user system elapsed
# 0.45 0.05 0.50
all.equal(tool_true-2, tool) # TRUE
all.equal(tool_true-2, tool2) # TRUE
all.equal(tool_true-2, tool3) # TRUE
all.equal(tool_true-2, tool4) # TRUE
# tool5 comes back in column-major order, so sort by row index before comparing
tmp = which(dat[,3:nc] == "1", arr.ind=TRUE)
all.equal(tool_true-2, tmp[order(tmp[,1]), 2])
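
The ordering issue in the last check is easier to see on a small matrix. The sketch below uses made-up toy data (`m` and `true_col` are illustrative names, not from the benchmark). It also shows `max.col`, a base R function that is not used in the benchmark above and is swapped in here as one possible way to get the indices directly in row order; its speed relative to the other variants has not been measured here.

```r
# Toy 0/1 character matrix with exactly one "1" per row (illustrative data)
m <- matrix("0", nrow = 4, ncol = 5)
true_col <- c(3L, 1L, 5L, 1L)
m[cbind(1:4, true_col)] <- "1"

# which(..., arr.ind = TRUE) walks the matrix column by column,
# so the hits are ordered by column, not by row
idx <- which(m == "1", arr.ind = TRUE)
unsorted <- unname(idx[, "col"])   # 1 1 3 5

# Reorder by the row index to line the result up with rows 1..4
by_row <- unname(idx[order(idx[, "row"]), "col"])

# max.col returns one column index per row, already in row order
via_max_col <- max.col(m == "1", ties.method = "first")

stopifnot(all(by_row == true_col), all(via_max_col == true_col))
```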
Additional timings for nr = 20000, nc = 3000 (seconds):
tool   tool2  tool3  tool4  tool5
16.88  15.16  15.55  9.72   8.47
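: 2. I have three plots drawn with ggplot and combine them with multiplot as described on the page below, but parts of the
: combined figure get cut off. Is there a way to shrink the individual subplots, or to enlarge the overall width and height of the multiplot?
: (For now I can only export each subplot with ggsave and then merge the image files with Python.)
: http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/
: 3. Following on from question 2, can R merge image files that have already been produced?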
Just use grid.arrange from the gridExtra package.
This post is also worth a look:
#1LtMYjwo (R_Language)
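
A minimal sketch of the grid.arrange approach, assuming ggplot2 and gridExtra are installed and that the installed ggsave accepts a grob (true for recent ggplot2 releases). The three plots, the built-in `mtcars` data, the output path, and the dimensions are all placeholders, not taken from the original question.

```r
library(ggplot2)
library(gridExtra)

# Three placeholder plots built from the built-in mtcars data
p1 <- ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
p3 <- ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

# Draw on the current device: one column, three rows
grid.arrange(p1, p2, p3, ncol = 1)

# Build the combined layout without drawing it, then save it with
# explicit dimensions (in inches) so each panel has enough room
combined <- arrangeGrob(p1, p2, p3, ncol = 1)
out_file <- file.path(tempdir(), "combined.png")  # hypothetical output path
ggsave(out_file, combined, width = 6, height = 12)
```

Increasing `width` and `height` in the save step enlarges the whole page, which is usually the simplest way to stop the subplots from being clipped.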
sessionInfo()
Revolution R Open 3.2.2
Default CRAN mirror snapshot taken on 2015-08-27
The enhanced R distribution from Revolution Analytics
Visit mran.revolutionanalytics.com/open for information
about additional features.
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950
[2] LC_CTYPE=Chinese (Traditional)_Taiwan.950
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Traditional)_Taiwan.950
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RevoUtilsMath_3.2.2
--
※ Origin: PTT (ptt.cc), From: 123.205.27.107
※ Article URL: https://webptt.com/cn.aspx?n=bbs/Statistics/M.1442329956.A.93F.html
1F: push tony255034: Thank you very much, I'll give it a try. 09/15 23:49
※ Edited: celestialgod (123.205.27.107), 09/16/2015 00:18:18