hadoop - Mahout LDA gives FileNotFound exception

Reposted. Author: 行者123. Updated: 2023-12-02 21:58:09

I created the term vectors as shown here, using the following commands:

~/Scripts/Mahout/trunk/bin/mahout seqdirectory --input /home/ben/Scripts/eipi/files --output /home/ben/Scripts/eipi/mahout_out -chunk 1
~/Scripts/Mahout/trunk/bin/mahout seq2sparse -i /home/ben/Scripts/eipi/mahout_out -o /home/ben/Scripts/eipi/termvecs -wt tf -seq

Then I ran:
~/Scripts/Mahout/trunk/bin/mahout lda -i /home/ben/Scripts/eipi/termvecs -o /home/ben/Scripts/eipi/lda_working -k 2 -v 100

and I got:

MAHOUT-JOB: /home/ben/Scripts/Mahout/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar
11/09/04 16:28:59 INFO common.AbstractJob: Command line arguments: {--endPhase=2147483647, --input=/home/ben/Scripts/eipi/termvecs, --maxIter=-1, --numTopics=2, --numWords=100, --output=/home/ben/Scripts/eipi/lda_working, --startPhase=0, --tempDir=temp, --topicSmoothing=-1.0}
11/09/04 16:29:00 INFO lda.LDADriver: LDA Iteration 1
11/09/04 16:29:01 INFO input.FileInputFormat: Total input paths to process : 4
11/09/04 16:29:01 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-ben/mapred/staging/ben692167368/.staging/job_local_0001
Exception in thread "main" java.io.FileNotFoundException: File file:/home/ben/Scripts/eipi/termvecs/tokenized-documents/data does not exist.
    at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:371)
    at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:245)
    at org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:63)
    at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:252)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:902)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:919)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:838)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:791)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:791)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:465)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:494)
    at org.apache.mahout.clustering.lda.LDADriver.runIteration(LDADriver.java:426)
    at org.apache.mahout.clustering.lda.LDADriver.run(LDADriver.java:226)
    at org.apache.mahout.clustering.lda.LDADriver.run(LDADriver.java:174)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.mahout.clustering.lda.LDADriver.main(LDADriver.java:90)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)



And indeed, that file does not exist. How should I create it?

Best answer

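The vectors may be empty because something went wrong when they were created. Check that the vectors were actually written to their folder (the files should not be 0 bytes). This error can occur if your input folder is missing some files. In that case both steps (seqdirectory and seq2sparse) will still run without complaint, even though no valid output is created.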
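One way to do that size check from the shell is sketched below. The function name list_empty_outputs is a hypothetical helper of my own, not part of Mahout or Hadoop, and it assumes the output lives on the local filesystem, as the file: paths in the log suggest. It skips _SUCCESS, the marker file Hadoop may write into a job output directory, since that file is normally empty.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Mahout): print every regular file
# under the given directory whose size is 0 bytes.
list_empty_outputs() {
  local dir="$1"
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 2
  fi
  # _SUCCESS is an empty marker file Hadoop may write into job output
  # directories, so it is not reported as a problem.
  find "$dir" -type f -size 0 ! -name '_SUCCESS'
}
```

Calling list_empty_outputs /home/ben/Scripts/eipi/termvecs prints nothing when every file under the seq2sparse output has content; any path it does print is a file that was created but never filled. Running it on /home/ben/Scripts/eipi/mahout_out checks the seqdirectory step the same way.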

Regarding hadoop - Mahout LDA gives FileNotFound exception, the original question is on Stack Overflow: https://stackoverflow.com/questions/7309630/
