gpt4 book ai didi

apache - Nutch - 作业失败 - 错误 mapred.FileOutputCommitter - Mkdirs 无法创建文件

转载 作者:行者123 更新时间:2023-12-03 18:53:08 24 4
gpt4 key购买 nike

我正在尝试按照 Nutch tutorial 上的简单步骤进行操作.这是我第一次使用 Nutch。

一切顺利,直到我执行以下命令:

bin/nutch crawl bin/urls -dir crawl -depth 3 -topN 5 -threads 1

这给了我以下错误

    log4j:ERROR setFile(null,true) call failed
java.io.FileNotFoundException: /usr/local/nutch/framework/apache-nutch-1.6/logs/hadoop.log (No such file or directory)
at java.io.FileOutputStream.open(Native Method)
at java.io.FileOutputStream.<init>(FileOutputStream.java:212)
at java.io.FileOutputStream.<init>(FileOutputStream.java:136)
at org.apache.log4j.FileAppender.setFile(FileAppender.java:290)
at org.apache.log4j.FileAppender.activateOptions(FileAppender.java:164)
at org.apache.log4j.DailyRollingFileAppender.activateOptions(DailyRollingFileAppender.java:216)
at org.apache.log4j.config.PropertySetter.activate(PropertySetter.java:257)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:133)
at org.apache.log4j.config.PropertySetter.setProperties(PropertySetter.java:97)
at org.apache.log4j.PropertyConfigurator.parseAppender(PropertyConfigurator.java:689)
at org.apache.log4j.PropertyConfigurator.parseCategory(PropertyConfigurator.java:647)
at org.apache.log4j.PropertyConfigurator.configureRootCategory(PropertyConfigurator.java:544)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:440)
at org.apache.log4j.PropertyConfigurator.doConfigure(PropertyConfigurator.java:476)
at org.apache.log4j.helpers.OptionConverter.selectAndConfigure(OptionConverter.java:471)
at org.apache.log4j.LogManager.<clinit>(LogManager.java:125)
at org.slf4j.impl.Log4jLoggerFactory.getLogger(Log4jLoggerFactory.java:73)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:242)
at org.slf4j.LoggerFactory.getLogger(LoggerFactory.java:254)
at org.apache.nutch.crawl.Crawl.<clinit>(Crawl.java:43)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
solrUrl is not set, indexing will be skipped...
crawl started in: crawl
rootUrlDir = bin/urls
threads = 1
depth = 3
solrUrl=null
topN = 5
Injector: starting at 2013-04-02 19:08:03
Injector: crawlDb: crawl/crawldb
Injector: urlDir: bin/urls
Injector: Converting injected urls to crawl db entries.
Injector: total number of urls rejected by filters: 0
Injector: total number of urls injected after normalization and filtering: 1
Injector: Merging injected urls into crawl db.
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1265)
at org.apache.nutch.crawl.Injector.inject(Injector.java:296)
at org.apache.nutch.crawl.Crawl.run(Crawl.java:127)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)

我的 bin 目录有:

  1. 纳奇

  2. 抓取

  3. urls/seeds.txt

不知道问题出在哪里。

hadoop.log 有以下错误:

2013-04-03 17:33:18,370 ERROR mapred.FileOutputCommitter - Mkdirs failed to create file:/usr/local/nutch/framework/apache-nutch-1.6/bin/crawl/crawldb/1971189408/_temporary

2013-04-03 17:33:21,394 WARN mapred.LocalJobRunner - job_local_0002

java.io.IOException: The temporary job-output directory file:/usr/local/nutch/framework/apache-nutch-1.6/bin/crawl/crawldb/1971189408/_temporary doesn't exist!

最佳答案

问题出在 -dir crawl 上。

您需要提及正确的目录路径/名称

关于apache - Nutch - 作业失败 - 错误 mapred.FileOutputCommitter - Mkdirs 无法创建文件,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15778060/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com