hadoop - Error when calling Pig from Oozie

Reposted. Author: 行者123. Updated: 2023-12-02 21:44:18

I am trying to read files matching a specific pattern using a Pig action in an Oozie workflow.

Oozie workflow:

<workflow-app>

<fork>
<path start="subWorkflow1" />
<path start="subWorkflow2" />
</fork>

<join />
</workflow-app>

subWorkflow1.xml:
<subworkflow>
<action>
<pig>
<script>load_data_into_tbl.pig</script>
<param>namenode=${nameNode}</param>
<param>inputPath=${inputPath}</param>
</pig>
</action>
</subworkflow>
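
For comparison, a well-formed version of the sub-workflow outlined above might look like the following. This is only a sketch: the schema version, the node names (load-data, fail, end), and the properties ${jobTracker}, ${nameNode} and ${inputPath} are assumptions and do not come from the original post.

```xml
<!-- subWorkflow1.xml: illustrative sketch; node names and properties are assumed -->
<workflow-app name="subWorkflow1" xmlns="uri:oozie:workflow:0.4">
    <start to="load-data"/>

    <action name="load-data">
        <pig>
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>load_data_into_tbl.pig</script>
            <!-- Each param is handed to Pig as NAME=VALUE and can be
                 referenced in the script as ${NAME} -->
            <param>namenode=${nameNode}</param>
            <param>inputPath=${inputPath}</param>
        </pig>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Pig load failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

A sub-workflow file is itself a complete workflow-app, so it needs its own start, kill and end nodes. Because the script stores through HCatStorer, the HCatalog jars would also have to be on the action's classpath; how that is arranged depends on the cluster.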

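Pig script:
data = LOAD '${namenode}/data/filename*.log';  -- this file is in HDFS
...
-- the HCatStorer package name depends on the HCatalog version in use
STORE data INTO '<Table_nm>' USING org.apache.hive.hcatalog.pig.HCatStorer();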

Input source: /data/src_folder/20141029/filename*.log

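First attempt:

The first time I read data from the folder in HDFS, the Pig execution succeeded. Every run after that failed.

Second attempt:

When I re-ran the Oozie workflow with the same source files in the folder (20141029), the execution failed.

Third attempt:

I then renamed the source files in the folder (20141029) and re-ran the workflow. It worked fine.

What could be the reason? Thanks in advance.

Error log: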
Pig Stack Trace
---------------
ERROR 2997: Encountered IOException. org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1413868377323_35233' doesn't exist in RM.
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:288)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

java.io.IOException: org.apache.hadoop.yarn.exceptions.ApplicationNotFoundException: Application with id 'application_1413868377323_35233' doesn't exist in RM.
at org.apache.hadoop.yarn.server.resourcemanager.ClientRMService.getApplicationReport(ClientRMService.java:288)
at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getApplicationReport(ApplicationClientProtocolPBServiceImpl.java:145)
at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:321)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:348)
at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:419)
at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:532)
at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:183)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:578)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:578)
at org.apache.hadoop.mapred.JobClient.getTaskReports(JobClient.java:633)
at org.apache.hadoop.mapred.JobClient.getMapTaskReports(JobClient.java:627)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.Launcher.getStats(Launcher.java:150)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher.launchPig(MapReduceLauncher.java:429)
at org.apache.pig.PigServer.launchPlan(PigServer.java:1324)
at org.apache.pig.PigServer.executeCompiledLogicalPlan(PigServer.java:1309)
at org.apache.pig.PigServer.execute(PigServer.java:1299)
at org.apache.pig.PigServer.executeBatch(PigServer.java:377)
at org.apache.pig.PigServer.executeBatch(PigServer.java:355)
at org.apache.pig.tools.grunt.GruntParser.executeBatch(GruntParser.java:140)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:202)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:173)
at org.apache.pig.tools.grunt.Grunt.exec(Grunt.java:84)
at org.apache.pig.Main.run(Main.java:478)
at org.apache.pig.PigRunner.run(PigRunner.java:49)
at org.apache.oozie.action.hadoop.PigMain.runPigJob(PigMain.java:286)
at org.apache.oozie.action.hadoop.PigMain.run(PigMain.java:226)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:38)
at org.apache.oozie.action.hadoop.PigMain.main(PigMain.java:76)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:225)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.PigMain], exit code [2]

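Best answer

I resolved this, and it turned out not to be a bug in my setup. It is inherent to how Pig behaves when storing through HCatStorer, and there are some open tickets to address it: once data exists in a partition, Pig cannot overwrite it. That is why the load succeeded on my first attempt and failed on every attempt after it. Thanks!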

Useful URL:
https://cwiki.apache.org/confluence/display/Hive/HCatalog+UsingHCat
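
Given that the store fails when the target partition already exists, one possible way to make re-runs for the same date succeed is to drop that partition before the Pig action starts. The fragment below is a hedged sketch using an Oozie Hive action. The action names, the script name drop_partition.q, the properties ${targetTable} and ${loadDate}, and the partition column dt are all hypothetical, because the original post does not show the table definition.

```xml
<!-- Illustrative fragment: runs before the Pig load action.
     drop_partition.q is assumed to contain a single statement:
       ALTER TABLE ${TABLE} DROP IF EXISTS PARTITION (dt='${DT}');
     where dt is a hypothetical partition column. -->
<action name="drop-partition">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <script>drop_partition.q</script>
        <!-- Substituted into the script as ${TABLE} and ${DT} -->
        <param>TABLE=${targetTable}</param>
        <param>DT=${loadDate}</param>
    </hive>
    <!-- Continue to the Pig load only if the drop succeeded -->
    <ok to="load-data"/>
    <error to="fail"/>
</action>
```

For a managed (non-external) Hive table, dropping a partition also deletes its data, so this only makes sense when the re-run is meant to replace the earlier load in full. The Hive action also typically needs the cluster's Hive configuration supplied to it, which is omitted here.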

Regarding "hadoop - Error when calling Pig from Oozie", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26638765/
