Author: rlearner (rlearner)
Board: Cloud
Title: [Newbie] R + Hadoop installation problem (video attached)
Time: Thu Jan 12 01:56:17 2017
[Goal]======================================================
Install R and a Hadoop environment on Ubuntu 16,
then run simple examples with rhdfs and rmr2.
[Problem]======================================================
(Issue 1): library(rmr2) prints this error message:
Please review your hadoop settings. See help(hadoop.settings)
(Issue 2): after library(rhdfs),
init.hdfs() prints this error message:
17/01/11 17:20:17 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
My guess is that the hadoop streaming setting is wrong??
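The NativeCodeLoader message above is logged at WARN level and itself says that Hadoop falls back to its built-in Java classes, so it may not be what actually breaks things. A minimal sketch for asking Hadoop which native libraries it can load, assuming Hadoop is installed in ~/hadoop as in the steps in this post (`report_native_libs` is a made-up helper name, not part of Hadoop):

```shell
#!/bin/sh
# Hypothetical helper: run Hadoop's own native-library report if the
# hadoop command can be found, otherwise say so and return non-zero.
report_native_libs() {
    if command -v hadoop >/dev/null 2>&1; then
        # checknative -a lists each native library and whether it loaded
        hadoop checknative -a
    else
        echo "hadoop command not found on PATH" >&2
        return 1
    fi
}

# Assumption: Hadoop lives in ~/hadoop, as in the install steps of this post.
PATH="$PATH:$HOME/hadoop/bin"
report_native_libs || echo "native-library check did not pass"
```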
[Installation steps]====================================================
Start Hadoop
cd ~/hadoop && sbin/start-all.sh
-----------------------------------------------------------------------------
R only needs to be installed on the master
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install r-base
Java settings-----------------------------------------------------------
echo $JAVA_HOME
sudo JAVA_HOME=/usr/lib/jvm/jdk/ R CMD javareconf
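Before running javareconf it may be worth confirming that the JAVA_HOME directory really contains a JDK. A rough sketch, assuming a usable JAVA_HOME is one that has an executable bin/java (`is_java_home` is a hypothetical helper name, and /usr/lib/jvm/jdk is just the path used in this post):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given directory looks like a JDK,
# i.e. it contains an executable bin/java.
is_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Use the current JAVA_HOME if set, else the path from this post.
candidate="${JAVA_HOME:-/usr/lib/jvm/jdk}"
if is_java_home "$candidate"; then
    echo "JAVA_HOME looks valid: $candidate"
else
    echo "no bin/java under: $candidate"
fi
```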
In R-------------------------------------------------------------
Start R
sudo R
install.packages(c("codetools","R","Rcpp","RJSONIO","bitops","digest","functional","stringr","plyr","reshape2","rJava","caTools"))
Download rmr (used for MapReduce and rhbase)
wget --no-check-certificate https://raw.github.com/RevolutionAnalytics/rmr2/3.3.0/build/rmr2_3.3.0.tar.gz
wget --no-check-certificate https://raw.github.com/RevolutionAnalytics/rhdfs/master/build/rhdfs_1.0.8.tar.gz
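Since the downloads skip certificate checking, a quick sanity check that each file really is a gzip tarball (and not, say, a saved error page) may help before installing. A small sketch, assuming the files were saved in the home directory (`is_valid_tarball` is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the file exists and is a readable
# gzip tarball. tar -tzf lists the contents without extracting anything.
is_valid_tarball() {
    [ -f "$1" ] && tar -tzf "$1" >/dev/null 2>&1
}

# File names follow the two downloads in this post; the directory is an assumption.
for pkg in "$HOME/rmr2_3.3.0.tar.gz" "$HOME/rhdfs_1.0.8.tar.gz"; do
    if is_valid_tarball "$pkg"; then
        echo "ok: $pkg"
    else
        echo "missing or not a gzip tarball: $pkg"
    fi
done
```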
In R
---------------------------------------------------------------------------------------
$ sudo R
install.packages("/home/hduser/rhdfs_1.0.8.tar.gz", repos=NULL, type="source")
install.packages("/home/hduser/rmr2_3.3.0.tar.gz", repos=NULL, type="source")
The installation video is here:
https://www.youtube.com/watch?v=w70h_u8qoHM&t=680s
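[Path settings / online references]=====================================================
I went through some online material,
but none of it solved the problem.
I noticed that many of the discussions are about the HADOOP_STREAMING path setting @@
@Reference 1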
http://stackoverflow.com/questions/29682432/r-mapreduce-library-rmr2-shows-a-warning-message-when-loaded
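This one says to reset the paths in R with Sys.setenv, but its paths are completely different from mine,
so it doesn't feel like my problem either.
@Reference 2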
https://github.com/RevolutionAnalytics/RHadoop/issues/122
I haven't started reading this one yet; English is hard going for me.
@Reference 3
https://github.com/RevolutionAnalytics/rmr2/issues/155
This one looks a lot like my problem, but I still don't really understand it, and the paths set there are also different from mine @@
>small.ints = to.dfs(1:10)
>mapr = mapreduce(input = small.ints,
map = function(k,v) cbind(v,v^2))
gives:
Streaming Command Failed!
Error in ...
hadoop streaming failed with error code 5
I don't know what this means >"<
These are the paths I set:
Sys.setenv(HADOOP_HOME="/home/hduser/hadoop")
Sys.setenv(HADOOP_PREFIX="/home/hduser/hadoop")
Sys.setenv(HADOOP_CMD="/home/hduser/hadoop/bin/hadoop")
Sys.setenv(HADOOP_STREAMING="/home/hduser/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar")
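As far as I understand, rmr2 relies on HADOOP_CMD and HADOOP_STREAMING, so one thing to try is checking from the shell that both paths exist, and listing which streaming jars are actually in the install. A minimal sketch using the paths above (`find_streaming_jars` and `check_hadoop_paths` are hypothetical helper names, and the jar version may differ on another install):

```shell
#!/bin/sh
# Hypothetical helper: list every hadoop-streaming jar below a directory.
find_streaming_jars() {
    [ -d "$1" ] || return 0
    find "$1" -type f -name 'hadoop-streaming*.jar'
}

# Hypothetical helper: check HADOOP_CMD and HADOOP_STREAMING against the
# filesystem; returns non-zero if either one is missing.
check_hadoop_paths() {
    rc=0
    if [ -x "$HADOOP_CMD" ]; then
        echo "OK: HADOOP_CMD=$HADOOP_CMD"
    else
        echo "MISSING: HADOOP_CMD=$HADOOP_CMD"
        rc=1
    fi
    if [ -f "$HADOOP_STREAMING" ]; then
        echo "OK: HADOOP_STREAMING=$HADOOP_STREAMING"
    else
        echo "MISSING: HADOOP_STREAMING=$HADOOP_STREAMING"
        rc=1
    fi
    return "$rc"
}

# Values copied from this post; adjust them to the real install.
HADOOP_HOME=/home/hduser/hadoop
HADOOP_CMD="$HADOOP_HOME/bin/hadoop"
HADOOP_STREAMING="$HADOOP_HOME/share/hadoop/tools/lib/hadoop-streaming-2.7.3.jar"

if ! check_hadoop_paths; then
    echo "streaming jars found under $HADOOP_HOME:"
    find_streaming_jars "$HADOOP_HOME"
fi
```

If the jar that `find_streaming_jars` prints has a different name or location, that path would be the one to put into Sys.setenv(HADOOP_STREAMING=...).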
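I don't know which of these is wrong...
I also can't make much sense of the errors in hadoop/logs QAQ
Could someone please check whether my HADOOP_STREAMING setting is wrong, or tell me how to read the errors,
or point out where else things went wrong >"<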
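For reading the logs, one rough approach is to pull out only the lines that mention ERROR, FATAL or Exception, searching the log directory recursively so that files in subdirectories are included. A sketch, assuming the logs are in ~/hadoop/logs (`show_log_errors` is a hypothetical helper name, and the three keywords are only a guess at what is worth looking at first):

```shell
#!/bin/sh
# Hypothetical helper: print lines containing ERROR, FATAL or Exception from
# every file under a log directory, prefixed with file name and line number.
show_log_errors() {
    if [ ! -d "$1" ]; then
        echo "no such log directory: $1" >&2
        return 1
    fi
    grep -r -H -n -E 'ERROR|FATAL|Exception' "$1" \
        || echo "no ERROR/FATAL/Exception lines under $1"
}

# Assumption: Hadoop writes its logs to ~/hadoop/logs, as in this post.
logdir="$HOME/hadoop/logs"
show_log_errors "$logdir" || echo "adjust logdir to the real Hadoop log directory"
```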
--
※ Origin: 批踢踢实业坊 (ptt.cc), From: 140.128.101.143
※ Article URL: https://webptt.com/cn.aspx?n=bbs/Cloud/M.1484157380.A.404.html