
hadoop - Pig unable to read a local CSV, resulting in job failure

Reposted. Author: 行者123. Updated: 2023-12-02 21:13:09

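I'm relatively new to Pig and the Pig ecosystem, and I'm hitting a frustrating problem while trying to run a simple DUMP. I'm trying to run the Pig script below (the file is local, not on HDFS, so I start the Pig shell with pig -x local).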

REGISTER utils.py USING jython AS utils;
events = LOAD '../test/events.csv' USING PigStorage(',') AS (patientid:int, eventid:chararray, eventdesc:chararray, timestamp:chararray, value:float);
events = FOREACH events GENERATE patientid, eventid, ToDate(timestamp, 'yyyy-MM-dd') AS etimestamp, value;
DUMP events;

However, when I do this, I get the following error (the failed job summary comes first, followed by the full Pig stack trace):
Input(s): Failed to read data from "file:///bootcamp/test/events.csv"
Output(s): Failed to produce result in "file/tmp/temp/305054006/tmp-908064458"

Pig stack trace:
ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias events. Backend error : java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.pig.PigServer.openIterator(PigServer.java:925)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:746)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:372)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66)
at org.apache.pig.Main.run(Main.java:558)
at org.apache.pig.Main.main(Main.java:170)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.util.RunJar.run(RunJar.java:221)
at org.apache.hadoop.util.RunJar.main(RunJar.java:136)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 0: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:822)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:452)
at org.apache.pig.backend.hadoop.executionengine.HExecutionEngine.launchPig(HExecutionEngine.java:280)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1390)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1375)
at org.apache.pig.PigServer.storeEx(PigServer.java:1034)
at org.apache.pig.PigServer.store(PigServer.java:997)
at org.apache.pig.PigServer.openIterator(PigServer.java:910)
... 13 more
Caused by: java.lang.IllegalStateException: Job in state DEFINE instead of RUNNING
at org.apache.hadoop.mapreduce.Job.ensureState(Job.java:294)
at org.apache.hadoop.mapreduce.Job.getTaskReports(Job.java:540)
at org.apache.pig.backend.hadoop.executionengine.shims.HadoopShims.getTaskReports(HadoopShims.java:235)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.getStats(MapReduceLauncher.java:801)
...20 more

I have come across similar questions about failed jobs, but unfortunately I have not managed to find a solution so far.

Edit: I should mention that I ran into the same problem when following the Pig tutorial at the link below.

http://www.sunlab.org/teaching/cse8803/fall2016/lab/hadoop-pig/

Best answer

So, I found that I could "copy" the relation by doing the following:

tmp = LIMIT events 100000; -- any int larger than the number of rows
dump tmp;

I also saw a similar question here and was able to resolve the problem by running as the root user.
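The fact that running as root helped suggests, though the source does not confirm it, that the failure may be a file-permission problem in local mode. As a rough diagnostic under that assumption, the Python sketch below reports whether the current user can read the input CSV and write to a temporary directory. The helper name `check_access` and the choice of the system temp directory are hypothetical; Pig's actual temp location depends on its configuration.

```python
import os
import tempfile


def check_access(input_path, tmp_dir=None):
    """Report whether the current user can read input_path and write to tmp_dir.

    This only checks ordinary filesystem permissions; it does not talk to Pig
    or Hadoop. tmp_dir defaults to the system temp directory (an assumption).
    """
    if tmp_dir is None:
        tmp_dir = tempfile.gettempdir()
    return {
        # os.getuid is unavailable on some platforms (e.g. Windows).
        "user_id": os.getuid() if hasattr(os, "getuid") else None,
        "input_exists": os.path.isfile(input_path),
        "input_readable": os.access(input_path, os.R_OK),
        # Creating files in a directory needs both write and execute bits.
        "tmp_writable": os.access(tmp_dir, os.W_OK | os.X_OK),
    }


if __name__ == "__main__":
    # Path taken from the LOAD statement in the question; adjust as needed.
    for key, value in check_access("../test/events.csv").items():
        print(f"{key}: {value}")
```

If `input_readable` or `tmp_writable` comes back False for the normal user, fixing ownership or permissions on those paths may be a safer route than running Pig as root.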

Regarding "hadoop - Pig unable to read a local CSV, resulting in job failure", the original question is on Stack Overflow: https://stackoverflow.com/questions/39643638/
