Author: celestialgod (天)
Board: Statistics
Title: Re: [Code] Several R questions (sapply, ggplot)
Time: Tue Sep 15 23:12:32 2015
※ Quoting tony255034 (5245566):
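: [Software category]: R
: [Programming problem]: sapply-family functions and ggplot
: [Software proficiency]: Beginner
: [Problem description]:
: I have three questions.
: 1. Is the sapply family really faster than a for loop?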
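For a speed comparison, see this post:
#1LNhOYdK (R_Language)
In general, sapply is mainly faster than a for loop that does not preallocate memory for its result.
Otherwise the difference should be small.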
: The data is about 14000 rows by 380 columns.
: Today I rewrote my for loop with sapply, and sapply turned out to be 1.2 seconds slower than the for loop.
: The code is:
: for(x in 1:rd){
: tool[i,1] = grep("1",data[i,3:rd]);
: }
: tool <- sapply(1:rd,function(x) grep("1",data[i,3:rd]))
: The goal is to find, for each row of a 0/1 matrix (each row contains exactly one 1), the index of the 1.
: Is there a faster way to write this?
idx = which(dat[,3:nc] == "1", arr.ind=TRUE)
idx[order(idx[,1]), 2]  # which() returns column-major order, so sort by row
Here is a simple benchmark:
nr = 14000
nc = 380
dat = matrix("0",nr, nc)
tool_true = sample(3:nc, nr, TRUE)
dat[cbind(1:nr, tool_true)] = "1"
st = proc.time()
tool = 0  # not preallocated: the vector grows on every iteration
for(i in 1:nr)
tool[i] = grep("1",dat[i,3:nc])
proc.time() - st
# user system elapsed
# 1.84 0.05 1.89
st = proc.time()
tool2 = vector('integer', nr)  # preallocated result vector
for(i in 1:nr)
tool2[i] = grep("1",dat[i,3:nc])
proc.time() - st
# user system elapsed
# 1.53 0.00 1.56
st = proc.time()
tool3 = sapply(1:nr, function(i) grep("1",dat[i,3:nc]))
proc.time() - st
# user system elapsed
# 1.42 0.02 1.44
st = proc.time()
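# transpose so that which() walks the matrix row by row, then recover the
# column index from the linear index (a remainder of 0 means the last column)
tool4 = which(t(dat[,3:nc] == "1")) %% (nc-2)
tool4 = ifelse(tool4 == 0, nc-2, tool4)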
proc.time() - st
# user system elapsed
# 0.52 0.04 0.54
st = proc.time()
# note: this result comes back in column-major order, not row order
tool5 = which(dat[,3:nc] == "1", arr.ind=TRUE)[,2]
proc.time() - st
# user system elapsed
# 0.45 0.05 0.50
all.equal(tool_true-2, tool) # TRUE
all.equal(tool_true-2, tool2) # TRUE
all.equal(tool_true-2, tool3) # TRUE
all.equal(tool_true-2, tool4) # TRUE
# tool5 is in column-major order, so it must be reordered by row before comparing
tmp = which(dat[,3:nc] == "1", arr.ind=TRUE)
all.equal(tool_true-2, tmp[order(tmp[,1]), 2])
Timings (in seconds) added for nr = 20000, nc = 3000:
 tool  tool2  tool3  tool4  tool5
16.88  15.16  15.55   9.72   8.47
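As a side note on the row-order issue with which(..., arr.ind=TRUE), base R's max.col can also give the answer directly in row order, since it returns for each row the column that holds the maximum. The following is only a sketch on a small made-up matrix: the sizes, the seed, and the names onehot, tool6 and tool5_sorted are assumptions for illustration and are not part of the benchmark above, and it has not been timed against the other methods.

```r
# Small reproducible example (sizes and seed are arbitrary, for illustration only).
set.seed(1)
nr = 6
nc = 8
dat = matrix("0", nr, nc)
tool_true = sample(3:nc, nr, TRUE)
dat[cbind(1:nr, tool_true)] = "1"

# Turn the logical comparison into a numeric 0/1 matrix, then take the column
# of the per-row maximum. With exactly one 1 per row the maximum is unique,
# and the result is already in row order.
onehot = (dat[, 3:nc] == "1") * 1
tool6 = max.col(onehot, ties.method = "first")

# Cross-check against the row-sorted which(..., arr.ind = TRUE) approach.
tmp = which(dat[, 3:nc] == "1", arr.ind = TRUE)
tool5_sorted = unname(tmp[order(tmp[, 1]), 2])
```

Both tool6 and tool5_sorted should equal tool_true - 2, because the column indices are counted from column 3 of dat.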
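: 2. I have three plots drawn with ggplot and combined with multiplot as described on the page below, but parts of
: the plots get cut off in the combined figure. Is there a way to shrink the individual subplots, or to enlarge the overall width and height of the multiplot?
: (For now I can only export each subplot with ggsave and then merge the image files with Python.)
: http://www.cookbook-r.com/Graphs/Multiple_graphs_on_one_page_(ggplot2)/
: 3. Following on from question 2, can R merge image files that have already been created?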
Just use grid.arrange from the gridExtra package.
You can also refer to this post:
#1LtMYjwo (R_Language)
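A minimal sketch of the grid.arrange approach, assuming the ggplot2 and gridExtra packages are installed. The three plots built from the built-in mtcars data, the output file name, and the device size are placeholders, not the original poster's figures. Opening the graphics device with a larger width and height is one way to keep the combined figure from being clipped.

```r
library(ggplot2)
library(gridExtra)

# Three placeholder plots built from the built-in mtcars data set.
p1 = ggplot(mtcars, aes(wt, mpg)) + geom_point()
p2 = ggplot(mtcars, aes(hp, mpg)) + geom_point()
p3 = ggplot(mtcars, aes(factor(cyl), mpg)) + geom_boxplot()

# Hypothetical output file in the session's temporary directory.
out_file = file.path(tempdir(), "combined.pdf")

# Open a device that is tall enough for three stacked panels (sizes in inches),
# draw the plots in a single column, then close the device.
pdf(out_file, width = 6, height = 12)
grid.arrange(p1, p2, p3, ncol = 1)
invisible(dev.off())
```

The same pattern should work with png() or another device in place of pdf(); only the units of width and height differ.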
sessionInfo()
Revolution R Open 3.2.2
Default CRAN mirror snapshot taken on 2015-08-27
The enhanced R distribution from Revolution Analytics
Visit mran.revolutionanalytics.com/open for information
about additional features.
R version 3.2.2 (2015-08-14)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950
[2] LC_CTYPE=Chinese (Traditional)_Taiwan.950
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Traditional)_Taiwan.950
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RevoUtilsMath_3.2.2
--
※ Origin: PTT (批踢踢實業坊, ptt.cc), From: 123.205.27.107
※ Article URL: https://webptt.com/m.aspx?n=bbs/Statistics/M.1442329956.A.93F.html
1F: [Push] tony255034: Thank you very much, I will give it a try. 09/15 23:49
※ Edited: celestialgod (123.205.27.107), 09/16/2015 00:18:18