I have set up gridgain-hadoop-os-6.6.2.zip and followed the steps described in docs/hadoop_readme.pdf. I started GridGain with bin/ggstart.sh, and I am now trying to run a simple wordcount job in GridGain with hadoop-2.2.0, using the command:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.jar wordcount /input /output 

Steps I tried:

Step 1: Extracted hadoop-2.2.0 and gridgain-hadoop-os-6.6.2.zip into the usr/local folder and renamed the GridGain folder to "gridgain".

Step 2: Set export GRIDGAIN_HOME=/usr/local/gridgain, set JAVA_HOME, and set the hadoop-2.2.0 paths as follows:

    # Set Hadoop-related environment variables 
export HADOOP_PREFIX=/usr/local/hadoop-2.2.0 
export HADOOP_HOME=/usr/local/hadoop-2.2.0 
export HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0 
export HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0 
export HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0 
export YARN_HOME=/usr/local/hadoop-2.2.0 
export HADOOP_CONF_DIR=/usr/local/hadoop-2.2.0/etc/hadoop 
export GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*' 

Step 3:

I then ran bin/setup-hadoop.sh and answered Y to every prompt.

Step 4:

Started GridGain with the command:

bin/ggstart.sh

Step 5:

Created a directory and uploaded a file using:

hadoop fs -mkdir /input 
 
hadoop fs -copyFromLocal $HADOOP_HOME/README.txt /input/WORD_COUNT_ME.txt

Step 6:

Running this command produces an error:

hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/*-mapreduce-examples-*.jar wordcount /input /output

The error is as follows:

15/02/22 12:49:13 INFO Configuration.deprecation: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir 
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_091ebfbd-2993-475f-a506-28280dbbf891_0002 
15/02/22 12:49:13 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hduser/.staging/job_091ebfbd-2993-475f-a506-28280dbbf891_0002 
java.lang.NullPointerException 
    at org.gridgain.client.hadoop.GridHadoopClientProtocol.processStatus(GridHadoopClientProtocol.java:329) 
    at org.gridgain.client.hadoop.GridHadoopClientProtocol.submitJob(GridHadoopClientProtocol.java:115) 
    at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:430) 
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1268) 
    at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1265) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at javax.security.auth.Subject.doAs(Subject.java:415) 
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1491) 
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:1265) 
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1286) 
    at org.apache.hadoop.examples.WordCount.main(WordCount.java:84) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72) 
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144) 
    at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74) 
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) 
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) 
    at java.lang.reflect.Method.invoke(Method.java:606) 
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212) 

The GridGain console shows this error:

sLdrId=a0b8610bb41-091ebfbd-2993-475f-a506-28280dbbf891, userVer=0, loc=true, sampleClsName=java.lang.String, pendingUndeploy=false, undeployed=false, usage=0]], taskClsName=o.g.g.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask, sesId=e129610bb41-091ebfbd-2993-475f-a506-28280dbbf891, startTime=1424589553332, endTime=9223372036854775807, taskNodeId=091ebfbd-2993-475f-a506-28280dbbf891, clsLdr=sun.misc.Launcher$AppClassLoader@1bdcbb2, closed=false, cpSpi=null, failSpi=null, loadSpi=null, usage=1, fullSup=false, subjId=091ebfbd-2993-475f-a506-28280dbbf891], jobId=f129610bb41-091ebfbd-2993-475f-a506-28280dbbf891]] 
java.lang.NoClassDefFoundError: org/apache/hadoop/mapreduce/JobContext 
    at java.lang.Class.getDeclaredConstructors0(Native Method) 
    at java.lang.Class.privateGetDeclaredConstructors(Class.java:2585) 
    at java.lang.Class.getConstructor0(Class.java:2885) 
    at java.lang.Class.getConstructor(Class.java:1723) 
    at org.gridgain.grid.hadoop.GridHadoopDefaultJobInfo.createJob(GridHadoopDefaultJobInfo.java:107) 
    at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.job(GridHadoopJobTracker.java:959) 
    at org.gridgain.grid.kernal.processors.hadoop.jobtracker.GridHadoopJobTracker.submit(GridHadoopJobTracker.java:222) 
    at org.gridgain.grid.kernal.processors.hadoop.GridHadoopProcessor.submit(GridHadoopProcessor.java:188) 
    at org.gridgain.grid.kernal.processors.hadoop.GridHadoopImpl.submit(GridHadoopImpl.java:73) 
    at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:54) 
    at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolSubmitJobTask.run(GridHadoopProtocolSubmitJobTask.java:37) 
    at org.gridgain.grid.kernal.processors.hadoop.proto.GridHadoopProtocolTaskAdapter$Job.execute(GridHadoopProtocolTaskAdapter.java:95) 
    at org.gridgain.grid.kernal.processors.job.GridJobWorker$2.call(GridJobWorker.java:484) 
    at org.gridgain.grid.util.GridUtils.wrapThreadLoader(GridUtils.java:6136) 
    at org.gridgain.grid.kernal.processors.job.GridJobWorker.execute0(GridJobWorker.java:478) 
    at org.gridgain.grid.kernal.processors.job.GridJobWorker.body(GridJobWorker.java:429) 
    at org.gridgain.grid.util.worker.GridWorker.run(GridWorker.java:151) 
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) 
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) 
    at java.lang.Thread.run(Thread.java:745) 
Caused by: java.lang.ClassNotFoundException: Failed to load class: org.apache.hadoop.mapreduce.JobContext 
    at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:125) 
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358) 
    ... 20 more 
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.mapreduce.JobContext 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366) 
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355) 
    at java.security.AccessController.doPrivileged(Native Method) 
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354) 
    at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClassExplicitly(GridHadoopClassLoader.java:196) 
    at org.gridgain.grid.kernal.processors.hadoop.GridHadoopClassLoader.loadClass(GridHadoopClassLoader.java:106) 
    ... 21 more 

Please help.

Edit:

raj@ubuntu:~$ hadoop classpath 
/usr/local/hadoop-2.2.0/etc/hadoop:/usr/local/hadoop-2.2.0/share/hadoop/common/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/common/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/hdfs/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/yarn/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/lib/*:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce/*:/usr/local/hadoop-2.2.0/contrib/capacity-scheduler/*.jar 
raj@ubuntu:~$ jps 
3529 GridCommandLineStartup 
3646 Jps 
raj@ubuntu:~$ echo $GRIDGAIN_HOME 
/usr/local/gridgain 
raj@ubuntu:~$ echo $HADOOP_HOME 
/usr/local/hadoop-2.2.0 
raj@ubuntu:~$ hadoop version 
Hadoop 2.2.0 
Subversion https://svn.apache.org/repos/asf/hadoop/common -r 1529768 
Compiled by hortonmu on 2013-10-07T06:28Z 
Compiled with protoc 2.5.0 
From source with checksum 79e53ce7994d1628b240f09af91e1af4 
This command was run using /usr/local/hadoop-2.2.0/share/hadoop/common/hadoop-common-2.2.0.jar 
raj@ubuntu:~$ cd /usr/local/hadoop-2.2.0/share/hadoop/mapreduce 
raj@ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$ ls 
hadoop-mapreduce-client-app-2.2.0.jar     hadoop-mapreduce-client-hs-2.2.0.jar          hadoop-mapreduce-client-jobclient-2.2.0-tests.jar  lib 
hadoop-mapreduce-client-common-2.2.0.jar  hadoop-mapreduce-client-hs-plugins-2.2.0.jar  hadoop-mapreduce-client-shuffle-2.2.0.jar          lib-examples 
hadoop-mapreduce-client-core-2.2.0.jar    hadoop-mapreduce-client-jobclient-2.2.0.jar   hadoop-mapreduce-examples-2.2.0.jar                sources 
raj@ubuntu:/usr/local/hadoop-2.2.0/share/hadoop/mapreduce$  

Answer:

I set up exactly the versions you mentioned (gridgain-hadoop-os-6.6.2.zip + hadoop-2.2.0), and the "wordcount" example works fine.


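[Update after analyzing the question author's logs:]

Raju, thank you for the detailed logs. The problem is caused by these incorrectly set environment variables: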

export HADOOP_MAPRED_HOME=${HADOOP_HOME} 
export HADOOP_COMMON_HOME=${HADOOP_HOME} 
export HADOOP_HDFS_HOME=${HADOOP_HOME} 

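You explicitly set all of these variables to the ${HADOOP_HOME} value, which is wrong. It causes GG to compose an incorrect Hadoop classpath, as the following GG node log shows: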

+++ HADOOP_PREFIX=/usr/local/hadoop-2.2.0 
+++ [[ -z /usr/local/hadoop-2.2.0 ]] 
+++ '[' -z /usr/local/hadoop-2.2.0 ']' 
+++ HADOOP_COMMON_HOME=/usr/local/hadoop-2.2.0 
+++ HADOOP_HDFS_HOME=/usr/local/hadoop-2.2.0 
+++ HADOOP_MAPRED_HOME=/usr/local/hadoop-2.2.0 
+++ GRIDGAIN_HADOOP_CLASSPATH='/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*' 
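
One way to see why that classpath is a problem is to check whether each wildcard entry actually points at a directory containing jars. The sketch below does that; empty_classpath_entries is a hypothetical helper written for illustration, not part of GridGain or Hadoop. In the question's layout the Hadoop jars sit under share/hadoop/... (see the hadoop classpath output), so the three /usr/local/hadoop-2.2.0/lib/* entries would be expected to show up as empty.

```shell
#!/bin/sh
# Hypothetical helper (not part of GridGain or Hadoop): print every wildcard
# classpath entry of the form dir/* whose directory contains no .jar files.
empty_classpath_entries() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r entry; do
    case "$entry" in
      */\*) dir=${entry%/\*} ;;   # strip the trailing /* to get the directory
      *) continue ;;              # plain directories and jar paths are skipped
    esac
    found=no
    # An unmatched glob stays literal, so test that a match really exists.
    for jar in "$dir"/*.jar; do
      if [ -e "$jar" ]; then found=yes; break; fi
    done
    [ "$found" = yes ] || printf '%s\n' "$entry"
  done
}

# Classpath copied from the GG node log above.
empty_classpath_entries '/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*:/usr/local/hadoop-2.2.0/lib/*'
```

Every entry the function prints contributes no jars to the classpath. Running the same check on the output of hadoop classpath gives a point of comparison, since that list includes share/hadoop/mapreduce/*, where hadoop-mapreduce-client-core-2.2.0.jar is located.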

So, to fix the problem, do not set the unnecessary environment variables. JAVA_HOME and HADOOP_HOME are enough; nothing else is needed.
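
As a minimal sketch of that fix, the snippet below clears the three variables named above in the current shell. clear_extra_hadoop_vars is a hypothetical helper name. It only affects the running session, so the matching export lines also have to be removed from wherever they are defined (the question does not say which file), and the GridGain node has to be restarted from a shell where they are unset.

```shell
#!/bin/sh
# The three variables the answer identifies as the cause.
extra_hadoop_vars="HADOOP_MAPRED_HOME HADOOP_COMMON_HOME HADOOP_HDFS_HOME"

# Hypothetical helper: unset each of the variables above if it is set.
clear_extra_hadoop_vars() {
  for name in $extra_hadoop_vars; do
    # eval is used only to test a variable by name; the names come from the
    # fixed list above, never from user input.
    eval "is_set=\${$name+yes}"
    if [ -n "$is_set" ]; then
      printf 'unsetting %s\n' "$name"
      unset "$name"
    fi
  done
}

clear_extra_hadoop_vars

# JAVA_HOME and HADOOP_HOME stay as they are. Afterwards, restart the node:
#   bin/ggstart.sh
```

The helper leaves HADOOP_HOME and JAVA_HOME untouched, which matches the advice that those two are sufficient.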

