java - How do I build/run this simple Mahout program without getting exceptions?

Reposted. Author: 可可西里. Updated: 2023-11-01 14:13:15

I want to run this code, which I found in Mahout in Action:

package org.help;

import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;
import org.apache.mahout.math.DenseVector;
import org.apache.mahout.math.NamedVector;
import org.apache.mahout.math.VectorWritable;

public class SeqPrep {

    public static void main(String[] args) throws IOException {

        // Build the list of named vectors to store.
        List<NamedVector> apples = new ArrayList<NamedVector>();
        NamedVector apple = new NamedVector(
                new DenseVector(new double[]{0.11, 510, 1}), "small round green apple");
        apples.add(apple);

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("appledata/apples");

        // Write each vector to a SequenceFile, keyed by its name.
        SequenceFile.Writer writer = new SequenceFile.Writer(fs, conf, path, Text.class, VectorWritable.class);
        VectorWritable vec = new VectorWritable();
        for (NamedVector vector : apples) {
            vec.set(vector);
            writer.append(new Text(vector.getName()), vec);
        }
        writer.close();

        // Read the file back and print each key/value pair.
        SequenceFile.Reader reader = new SequenceFile.Reader(fs, path, conf);
        Text key = new Text();
        VectorWritable value = new VectorWritable();
        while (reader.next(key, value)) {
            System.out.println(key.toString() + " , " + value.get().asFormatString());
        }
        reader.close();
    }
}

I compile it:

$ javac -classpath :/usr/local/hadoop-1.0.3/hadoop-core-1.0.3.jar:/home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT.jar:/home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-job.jar:/home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-sources.jar -d myjavac/ SeqPrep.java

I package it into a jar:

$ jar -cvf SeqPrep.jar -C myjavac/ .

Now I want to run it on a local Hadoop node. I tried:

$ hadoop jar SeqPrep.jar org.help.SeqPrep

But I get:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

So I tried using the -libjars option:

$ hadoop jar SeqPrep.jar org.help.SeqPrep -libjars /home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT.jar -libjars /home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-job.jar -libjars /home/hduser/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-sources.jar -libjars /home/hduser/mahout/trunk/math/target/mahout-math-0.8-SNAPSHOT.jar -libjars /home/hduser/mahout/trunk/math/target/mahout-math-0.8-SNAPSHOT-sources.jar

I hit the same problem. I don't know what else to try.

My eventual goal is to read a .csv file from the Hadoop filesystem into a sparse matrix and then multiply it by a random vector.

Edit: It looks like Razvan got it (note: see below for another approach that does not interfere with your Hadoop installation). For reference:

$ find /usr/local/hadoop-1.0.3/. |grep mah
/usr/local/hadoop-1.0.3/./lib/mahout-core-0.8-SNAPSHOT-tests.jar
/usr/local/hadoop-1.0.3/./lib/mahout-core-0.8-SNAPSHOT.jar
/usr/local/hadoop-1.0.3/./lib/mahout-core-0.8-SNAPSHOT-job.jar
/usr/local/hadoop-1.0.3/./lib/mahout-core-0.8-SNAPSHOT-sources.jar
/usr/local/hadoop-1.0.3/./lib/mahout-math-0.8-SNAPSHOT-sources.jar
/usr/local/hadoop-1.0.3/./lib/mahout-math-0.8-SNAPSHOT-tests.jar
/usr/local/hadoop-1.0.3/./lib/mahout-math-0.8-SNAPSHOT.jar

Then:

$ hadoop jar SeqPrep.jar org.help.SeqPrep

small round green apple , small round green apple:{0:0.11,1:510.0,2:1.0}

Edit: I tried to do this without copying the Mahout jars into the Hadoop lib directory.

$ rm /usr/local/hadoop-1.0.3/lib/mahout-*

And, of course:

$ hadoop jar SeqPrep.jar org.help.SeqPrep

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/mahout/math/Vector
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:149)
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.math.Vector
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
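
The trace above says the JVM that launched the program could not find org.apache.mahout.math.Vector on its classpath. A minimal sketch of a diagnostic for that situation follows; it is not part of the original post. The class name ClasspathProbe and its methods are hypothetical, and it assumes Java 7 or later (for the multi-catch syntax). It only asks the current class loader whether a class is visible and, if so, where it was loaded from.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports whether a class is visible to the current class loader. */
final class ClasspathProbe {

    /** Returns true if the named class can be loaded (without initializing it). */
    static boolean isVisible(String className) {
        try {
            Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // LinkageError covers NoClassDefFoundError for missing dependencies.
            return false;
        }
    }

    /** Returns the jar or directory the class came from, or a short note if that is unknown. */
    static String origin(String className) {
        try {
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown source)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Defaults to the class named in the stack trace above.
        String name = args.length > 0 ? args[0] : "org.apache.mahout.math.Vector";
        System.out.println(name + " visible: " + isVisible(name) + ", origin: " + origin(name));
    }
}
```

Run with the same classpath as the failing command, a probe like this would be expected to report the Mahout class as not visible until a jar that contains it is on the classpath.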

When I try the Mahout job file:

$ hadoop jar ~/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-job.jar org.help.SeqPrep

Exception in thread "main" java.lang.ClassNotFoundException: org.help.SeqPrep
    at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:247)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:149)

If I try to include the .jar file I made:

$ hadoop jar ~/mahout/trunk/core/target/mahout-core-0.8-SNAPSHOT-job.jar SeqPrep.jar org.help.SeqPrep

Exception in thread "main" java.lang.ClassNotFoundException: SeqPrep.jar

Edit: Apparently I can only pass one jar to hadoop at a time. That means I need to add the class I created to the Mahout core job file:

~/mahout/trunk/core/target$ cp mahout-core-0.8-SNAPSHOT-job.jar mahout-core-0.8-SNAPSHOT-job.jar_backup

~/mahout/trunk/core/target$ cp ~/workspace/seqprep/bin/org/help/SeqPrep.class .

~/mahout/trunk/core/target$ jar uf mahout-core-0.8-SNAPSHOT-job.jar SeqPrep.class

Then:

~/mahout/trunk/core/target$ hadoop jar mahout-core-0.8-SNAPSHOT-job.jar org.help.SeqPrep

Exception in thread "main" java.lang.ClassNotFoundException: org.help.SeqPrep

Edit: OK, now I can do it without touching my Hadoop installation. I updated the .jar incorrectly in the previous edit. It should be:

~/mahout/trunk/core/target$ jar uf mahout-core-0.8-SNAPSHOT-job.jar org/help/SeqPrep.class

Then:

~/mahout/trunk/core/target$ hadoop jar mahout-core-0.8-SNAPSHOT-job.jar org.help.SeqPrep

small round green apple , small round green apple:{0:0.11,1:510.0,2:1.0}
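
The only difference between the failing and the working jar uf command is the entry path: a class in package org.help has to sit in the jar at org/help/SeqPrep.class, and jar uf stores the file under the path given on the command line. The sketch below is an illustration rather than part of the original post. JarEntryCheck is a hypothetical helper built on the standard java.util.jar API, assuming Java 7 or later; it derives the expected entry path from a class name and checks whether a jar contains it.

```java
import java.io.IOException;
import java.util.jar.JarFile;

/** Hypothetical helper: checks that a jar holds a class at its package-qualified path. */
final class JarEntryCheck {

    /** Maps a fully qualified class name to the jar entry path a class loader looks up. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the jar has an entry at the expected path for the class. */
    static boolean containsClass(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryNameFor(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: JarEntryCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(entryNameFor(args[1]) + " present: " + containsClass(args[0], args[1]));
    }
}
```

For org.help.SeqPrep the expected entry is org/help/SeqPrep.class. A jar that only holds SeqPrep.class at its root, as after the first jar uf attempt, would not satisfy this check, which matches the ClassNotFoundException shown earlier.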

Best answer

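You need to use the "job" JAR file that Mahout provides. It packages all the dependencies. You also need to add your own classes to it. This is how all the Mahout examples work. You should not put the Mahout jars in the Hadoop lib directory, because that "installs" the program too deeply into Hadoop.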

Source: the question "java - How do I build/run this simple Mahout program without getting exceptions?" on Stack Overflow: https://stackoverflow.com/questions/11479600/
