
hadoop - PigLatin can't read file from hdfs

Reposted. Author: 可可西里. Updated: 2023-11-01 14:58:42

I am trying out the Pig demo code by following its online manual.

First, I created a test file named myfile.txt. It contains six integers on two lines:

4 5 3 
1 2 3

I put the file into hdfs using hadoop fs -copyFromLocal myfile.txt /user/myfile.txt

Then I ran
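The setup steps above can be sketched as a small shell script. This is only a sketch under assumptions: it assumes a configured `hadoop` client is on the PATH and that the current user may write to /user, and it skips the HDFS steps otherwise. The `-ls` and `-cat` checks are not part of the original post; they are added here to confirm that the upload worked and to show the owner and permission bits of the HDFS copy.

```shell
#!/bin/sh
# Recreate the two-line test file from the question (12 bytes in total,
# which is consistent with totalInputFileSize=12 in the log).
printf '4 5 3\n1 2 3\n' > myfile.txt

# Assumption: a configured hadoop client is available; otherwise skip.
if command -v hadoop >/dev/null 2>&1; then
    # Source and destination are two separate, space-delimited arguments.
    hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
    # Confirm the file exists in HDFS and show its owner and permissions.
    hadoop fs -ls /user/myfile.txt
    # Print the contents back to verify the upload.
    hadoop fs -cat /user/myfile.txt
else
    echo "hadoop CLI not found; skipped the HDFS steps"
fi
```

If `hadoop fs -cat` prints the two lines, the file is in HDFS and readable by the current user, so a later read failure is more likely to come from the user the job runs as.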

A = LOAD '/user/myfile.txt';
DUMP A;

But I got the following error message:

2014-10-08 14:15:54,259 [main] INFO  org.apache.pig.tools.pigstats.ScriptState - Pig features used in the script: UNKNOWN
2014-10-08 14:15:54,594 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MRCompiler - File concatenation threshold: 100 optimistic? false
2014-10-08 14:15:54,692 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2014-10-08 14:15:54,693 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2014-10-08 14:15:54,909 [main] INFO org.apache.pig.tools.pigstats.ScriptState - Pig script settings are added to the job
2014-10-08 14:15:54,998 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2014-10-08 14:15:55,006 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
2014-10-08 14:15:55,013 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator - BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=12
2014-10-08 14:15:55,015 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting Parallelism to 1
2014-10-08 14:15:55,016 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - creating jar file Job7804857093829884774.jar
2014-10-08 14:15:58,229 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - jar file Job7804857093829884774.jar created
2014-10-08 14:15:58,266 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2014-10-08 14:15:58,304 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2014-10-08 14:15:58,353 [JobControl] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
2014-10-08 14:15:58,806 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2014-10-08 14:15:58,964 [JobControl] WARN org.apache.hadoop.conf.Configuration - fs.default.name is deprecated. Instead, use fs.defaultFS
2014-10-08 14:15:58,968 [JobControl] WARN org.apache.hadoop.conf.Configuration - dfs.https.address is deprecated. Instead, use dfs.namenode.https-address
2014-10-08 14:15:58,969 [JobControl] WARN org.apache.hadoop.conf.Configuration - io.bytes.per.checksum is deprecated. Instead, use dfs.bytes-per-checksum
2014-10-08 14:15:59,024 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2014-10-08 14:15:59,025 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths to process : 1
2014-10-08 14:15:59,051 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2014-10-08 14:16:00,533 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_201410081312_0015
2014-10-08 14:16:00,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases A
2014-10-08 14:16:00,534 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: A[2,4] C: R:
2014-10-08 14:16:05,098 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2014-10-08 14:16:05,098 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_201410081312_0015 has failed! Stop running all dependent jobs
2014-10-08 14:16:05,099 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2014-10-08 14:16:05,109 [main] ERROR org.apache.pig.tools.pigstats.PigStatsUtil - 1 map reduce job(s) failed!
2014-10-08 14:16:05,111 [main] INFO org.apache.pig.tools.pigstats.SimplePigStats - Script Statistics:

HadoopVersion PigVersion UserId StartedAt FinishedAt Features
2.0.0-cdh4.7.0 0.11.0-cdh4.7.0 hdfs 2014-10-08 14:15:54 2014-10-08 14:16:05 UNKNOWN

Failed!

Failed Jobs:
JobId Alias Feature Message Outputs
job_201410081312_0015 A MAP_ONLY Message: Job failed!

Input(s):
Failed to read data from "/user/myfile.txt"

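It seems that Pig is not connecting to hdfs and therefore cannot access the file. Can anyone help me solve this problem?

Best answer

Change the file's permissions. You probably do not have permission to read the file.

On Linux, change the file's permissions with: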

chmod 755 myfile.txt

After that, run the copyFromLocal command again.
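A minimal sketch of the answer's two steps follows, again assuming a configured `hadoop` client and write access to /user (both assumptions, not stated in the post). HDFS keeps its own permission bits, separate from the local file's, so the sketch also lists the HDFS copy. The `hadoop fs -rm` and `hadoop fs -chmod` lines are additions for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Make sure a local copy exists (same contents as in the question).
[ -f myfile.txt ] || printf '4 5 3\n1 2 3\n' > myfile.txt

# The answer's step: rwx for the owner, r-x for group and others.
chmod 755 myfile.txt
ls -l myfile.txt

# Assumption: a configured hadoop client is available; otherwise skip.
if command -v hadoop >/dev/null 2>&1; then
    # Remove the earlier upload so the copy does not fail on an existing
    # destination; this prints an error and continues if it is absent.
    hadoop fs -rm /user/myfile.txt
    hadoop fs -copyFromLocal myfile.txt /user/myfile.txt
    # Inspect the permission bits of the HDFS copy.
    hadoop fs -ls /user/myfile.txt
    # Illustrative addition: make the HDFS copy world-readable.
    hadoop fs -chmod 644 /user/myfile.txt
else
    echo "hadoop CLI not found; skipped the HDFS steps"
fi
```

Comparing the owner shown by `hadoop fs -ls` with the UserId in Pig's script statistics (hdfs in the log above) may help tell whether permissions are really the cause.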

Regarding hadoop - PigLatin can't read file from hdfs, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/26259506/
