gpt4 book ai didi

hadoop - 使用Pig从CSV文件读取数据

转载 作者:行者123 更新时间:2023-12-02 21:34:56 25 4
gpt4 key购买 nike

我正在尝试在Mac上的Pig Shell上读取csv文件。我正在做的只是将文件load到变量中,然后dump变量。这是我的做法:

movies = LOAD '/user/myhome/movies_data.csv' USING PigStorage(',') as (id,name,year,rating,duration);
DUMP movies;

我正在使用的数据是从 here提供的github下载的

我的Mac上本地安装的hdfs中提供了此文件。当我执行 dump时出现错误:

org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias movies

at org.apache.pig.PigServer.openIterator(PigServer.java:935) at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:754) at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:376) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:230) at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:205) at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:66) at org.apache.pig.Main.run(Main.java:565) at org.apache.pig.Main.main(Main.java:177) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:497) at org.apache.hadoop.util.RunJar.run(RunJar.java:221) at org.apache.hadoop.util.RunJar.main(RunJar.java:136) Caused by: java.io.IOException: Job terminated with anomalous status FAILED at org.apache.pig.PigServer.openIterator(PigServer.java:927) ... 13 more



当我运行该作业时点击应用程序集群链接时,出现以下异常:

Diagnostics: Exception from container-launch. Container id: container_1443887668938_0007_02_000001 Exit code: 127 Stack trace: ExitCodeException exitCode=127: at org.apache.hadoop.util.Shell.runCommand(Shell.java:538) at org.apache.hadoop.util.Shell.run(Shell.java:455) at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:715) at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:211) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302) at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Container exited with a non-zero exit code 127 Failing this attempt. Failing the application.



Pig版本是0.15.0,Hadoop版本是2.6.1。我在这里想念什么吗?

最佳答案

您可以使用来自piggybank的CSVLoader。如果没有可用的储钱 jar ,请注册并使用CSVLoader。这样的事情。

register '/your/path/to/piggybank/jar' ;
define CSVLoader org.apache.pig.piggybank.storage.CSVLoader();
movies = LOAD '/user/myhome/movies_data.csv' USING CSVLoader as (id,name,year,rating,duration);

关于hadoop - 使用Pig从CSV文件读取数据,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/32935007/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com