
java - Nutch Crawl error - Input path does not exist

Reposted. Author: 可可西里. Updated: 2023-11-01 15:19:07

I have a Nutch/Hadoop setup with 2 datanode servers. I tried to crawl some URLs, but Nutch fails with this error:

Fetcher: segment: crawl/segments
Fetcher: org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: hdfs://devcluster01:9000/user/nutch/crawl/segments/crawl_generate
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:190)
at org.apache.hadoop.mapred.SequenceFileInputFormat.listStatus(SequenceFileInputFormat.java:44)
at org.apache.nutch.fetcher.Fetcher$InputFormat.getSplits(Fetcher.java:105)
at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:1107)
at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:1145)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:1116)

Can anyone help me? I don't know how to solve this problem. Thanks a lot!

Best answer

Verify that the path nutch/crawl/segments/crawl_generate is correct.

Either the path is wrong or the parsing stage did not complete.
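One way to read the log is that the Fetcher was handed the parent directory crawl/segments rather than a single segment inside it, which is why it looks for crawl/segments/crawl_generate. The sketch below is only an illustration of that check, not part of Nutch: it assumes segment directories carry 14-digit timestamp names (yyyyMMddHHmmss), and the class name, method names, and sample listing are hypothetical. The child names would come from a listing of the segments directory, for example from `hadoop fs -ls crawl/segments`.

```java
import java.util.List;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Nutch or Hadoop.
class SegmentPaths {
    // Assumption: a segment directory is named with a 14-digit timestamp.
    private static final Pattern SEGMENT_NAME = Pattern.compile("\\d{14}");

    // Returns the newest segment name from the children of the segments directory.
    // Fixed-width timestamps sort chronologically when compared as strings.
    static Optional<String> latestSegment(List<String> childNames) {
        return childNames.stream()
                .filter(name -> SEGMENT_NAME.matcher(name).matches())
                .max(String::compareTo);
    }

    // Builds the crawl_generate path that the Fetcher expects inside one segment.
    static String crawlGeneratePath(String segmentsDir, String segment) {
        String base = segmentsDir.endsWith("/") ? segmentsDir : segmentsDir + "/";
        return base + segment + "/crawl_generate";
    }

    public static void main(String[] args) {
        // Hypothetical listing of crawl/segments.
        List<String> listing = List.of("20110910101530", "20110911090000", "_temporary");

        latestSegment(listing).ifPresentOrElse(
                seg -> System.out.println(crawlGeneratePath("crawl/segments", seg)),
                () -> System.out.println("No segment found; the generate step may not have produced one."));
    }
}
```

If the listing contains no timestamped directory at all, the generate step probably did not produce a segment, and that step is the place to look before fetching again.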

Regarding java - Nutch Crawl error - Input path does not exist, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/7371602/
