
json - ERROR 1066: Unable to open iterator for alias - PIG SCRIPT


I have been facing this problem for a long time. I have tried to solve it, but I could not. I need some expert advice to resolve it.

I am trying to load a sample tweets JSON file.

sample.json:-

{"filter_level":"low","retweeted":false,"in_reply_to_screen_name":"FilmFan","truncated":false,"lang":"en","in_reply_to_status_id_str":null,"id":689085590822891521,"in_reply_to_user_id_str":"6048122","timestamp_ms":"1453125782100","in_reply_to_status_id":null,"created_at":"Mon Jan 18 14:03:02 +0000 2016","favorite_count":0,"place":null,"coordinates":null,"text":"@filmfan hey its time for you guys follow @acadgild To #AchieveMore and participate in contest Win Rs.500 worth vouchers","contributors":null,"geo":null,"entities":{"symbols":[],"urls":[],"hashtags":[{"text":"AchieveMore","indices":[56,68]}],"user_mentions":[{"id":6048122,"name":"Tanya","indices":[0,8],"screen_name":"FilmFan","id_str":"6048122"},{"id":2649945906,"name":"ACADGILD","indices":[42,51],"screen_name":"acadgild","id_str":"2649945906"}]},"is_quote_status":false,"source":"<a href=\"https://about.twitter.com/products/tweetdeck\" rel=\"nofollow\">TweetDeck<\/a>","favorited":false,"in_reply_to_user_id":6048122,"retweet_count":0,"id_str":"689085590822891521","user":{"location":"India ","default_profile":false,"profile_background_tile":false,"statuses_count":86548,"lang":"en","profile_link_color":"94D487","profile_banner_url":"https://pbs.twimg.com/profile_banners/197865769/1436198000","id":197865769,"following":null,"protected":false,"favourites_count":1002,"profile_text_color":"000000","verified":false,"description":"Proud Indian, Digital Marketing Consultant,Traveler, Foodie, Adventurer, Data Architect, Movie Lover, Namo Fan","contributors_enabled":false,"profile_sidebar_border_color":"000000","name":"Bahubali","profile_background_color":"000000","created_at":"Sat Oct 02 17:41:02 +0000 2010","default_profile_image":false,"followers_count":4467,"profile_image_url_https":"https://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","geo_enabled":true,"profile_background_image_url":"http://abs.twimg.com/images/themes/theme1/bg.png","profile_background_image_url_https":"https://abs.twimg.com/images/themes/theme1/bg.png","follow_request_sent":null,"url":null,"utc_offset":19800,"time_zone":"Chennai","notifications":null,"profile_use_background_image":false,"friends_count":810,"profile_sidebar_fill_color":"000000","screen_name":"Ashok_Uppuluri","id_str":"197865769","profile_image_url":"http://pbs.twimg.com/profile_images/664486535040000000/GOjDUiuK_normal.jpg","listed_count":50,"is_translator":false}}

I tried to load this JSON file using ELEPHANT BIRD.

Script:-
REGISTER json-simple-1.1.1.jar;
REGISTER elephant-bird-2.2.3.jar;
REGISTER guava-11.0.2.jar;
REGISTER avro-1.7.7.jar;
REGISTER piggybank-0.12.0.jar;


twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();

B = foreach twitter generate
        (chararray)$0#'created_at' as created_at,
        (chararray)$0#'id' as id,
        (chararray)$0#'id_str' as id_str,
        (chararray)$0#'text' as text,
        (chararray)$0#'source' as source,
        com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') as entities,
        (boolean)$0#'favorited' as favorited;

describe B;

Output:-
B: {created_at: chararray,id: chararray,id_str: chararray,text: chararray,source: chararray,entities: map[chararray],favorited: boolean}

But when I try to DUMP B, the following error occurs:

ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1066: Unable to open iterator for alias B

I am providing the complete log here:

2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size before optimization: 1
2016-09-11 14:07:57,184 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MultiQueryOptimizer - MR plan size after optimization: 1
2016-09-11 14:07:57,194 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.tools.pigstats.mapreduce.MRScriptState - Pig script settings are added to the job
2016-09-11 14:07:57,194 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.JobControlCompiler - Setting up single store job
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Key [pig.schematuple] is false, will not generate code.
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Starting process to move generated code to distributed cache
2016-09-11 14:07:57,199 [main] INFO org.apache.pig.data.SchemaTupleFrontend - Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /tmp/1473583077199-0
2016-09-11 14:07:57,206 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 1 map-reduce job(s) waiting for submission.
2016-09-11 14:07:57,207 [JobControl] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,208 [JobControl] WARN org.apache.hadoop.mapreduce.JobResourceUploader - No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
2016-09-11 14:07:57,211 [JobControl] INFO org.apache.pig.backend.hadoop.executionengine.util.MapRedUtil - Total input paths (combined) to process : 1
2016-09-11 14:07:57,212 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - number of splits:1
2016-09-11 14:07:57,216 [JobControl] INFO org.apache.hadoop.mapreduce.JobSubmitter - Submitting tokens for job: job_local360376249_0009
2016-09-11 14:07:57,267 [JobControl] INFO org.apache.hadoop.mapreduce.Job - The url to track the job: http://localhost:8080/
2016-09-11 14:07:57,267 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter set in config null
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,270 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - OutputCommitter is org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigOutputCommitter
2016-09-11 14:07:57,271 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
2016-09-11 14:07:57,272 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local360376249_0009_m_000000_0
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,277 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.Task - Using ResourceCalculatorProcessTree : [ ]
2016-09-11 14:07:57,278 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapred.MapTask - Processing split: Number of splits :1 Total Length = 2416 Input split[0]: Length = 2416 ClassName: org.apache.hadoop.mapreduce.lib.input.FileSplit Locations: -----------------------
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader - Current split being processed file:/root/PIG/PIG/sample.json:0+2416
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - File Output Committer Algorithm version is 1
2016-09-11 14:07:57,282 [LocalJobRunner Map Task Executor #0] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
2016-09-11 14:07:57,288 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.data.SchemaTupleBackend - Key [pig.schematuple] was not set... will not generate code.
2016-09-11 14:07:57,290 [LocalJobRunner Map Task Executor #0] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapOnly$Map - Aliases being processed per job phase (AliasName[line,offset]): M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,291 [Thread-214] INFO org.apache.hadoop.mapred.LocalJobRunner - map task executor complete.
2016-09-11 14:07:57,296 [Thread-214] WARN org.apache.hadoop.mapred.LocalJobRunner - job_local360376249_0009
java.lang.Exception: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
    at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.IncompatibleClassChangeError: Found interface org.apache.hadoop.mapreduce.Counter, but class was expected
    at com.twitter.elephantbird.pig.util.PigCounterHelper.incrCounter(PigCounterHelper.java:55)
    at com.twitter.elephantbird.pig.load.LzoBaseLoadFunc.incrCounter(LzoBaseLoadFunc.java:70)
    at com.twitter.elephantbird.pig.load.JsonLoader.getNext(JsonLoader.java:130)
    at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigRecordReader.nextKeyValue(PigRecordReader.java:204)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
    at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
    at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - HadoopJobId: job_local360376249_0009
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Processing aliases B,twitter
2016-09-11 14:07:57,467 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - detailed locations: M: twitter[20,10],B[21,4] C: R:
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 0% complete
2016-09-11 14:07:57,468 [main] WARN org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - Ooops! Some job has failed! Specify -stop_on_failure if you want Pig to stop immediately on failure.
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - job job_local360376249_0009 has failed! Stop running all dependent jobs
2016-09-11 14:07:57,468 [main] INFO org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.MapReduceLauncher - 100% complete
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] INFO org.apache.hadoop.metrics.jvm.JvmMetrics - Cannot initialize JVM Metrics with processName=JobTracker, sessionId= - already initialized
2016-09-11 14:07:57,469 [main] ERROR org.apache.pig.tools.pigstats.mapreduce.MRPigStatsUtil - 1 map reduce job(s) failed!
2016-09-11 14:07:57,470 [main] INFO org.apache.pig.tools.pigstats.mapreduce.SimplePigStats - Script Statistics:

HadoopVersion      PigVersion          UserId  StartedAt            FinishedAt           Features
2.7.1.2.3.4.7-4    0.15.0.2.3.4.7-4    root    2016-09-11 14:07:57  2016-09-11 14:07:57  UNKNOWN

Failed!

Failed Jobs:
JobId                      Alias      Feature   Message               Outputs
job_local360376249_0009    B,twitter  MAP_ONLY  Message: Job failed!  file:/tmp/temp252944192/tmp-470484503,

Input(s):
Failed to read data from "file:///root/PIG/PIG/sample.json"

Output(s):
Failed to produce result in "file:/tmp/temp252944192/tmp-470484503"

Counters:
Total records written : 0
Total bytes written : 0
Spillable Memory Manager spill count : 0
Total bags proactively spilled: 0
Total records proactively spilled: 0

Job DAG:
job_local360376249_0009



Also, please explain which jar files to use and which versions; I am confused about which version to pick.

Some people say to use Elephant Bird, others say to use AVRO, but neither worked for me.

Please help.

Mohan

Best Answer

I figured it out myself.
It was a problem with the jar versions.
Script:-

REGISTER elephant-bird-core-4.1.jar;
REGISTER elephant-bird-pig-4.1.jar;
REGISTER elephant-bird-hadoop-compat-4.1.jar;

And it works fine.
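
For anyone hitting the same error: the IncompatibleClassChangeError ("Found interface org.apache.hadoop.mapreduce.Counter, but class was expected") in the log is the usual sign of a jar built against the old Hadoop 1 MapReduce API running on Hadoop 2, which is why switching to the elephant-bird 4.x jars (which ship a hadoop-compat module) fixes it. Below is a minimal sketch of the full script with the new jars in place; the jar file names and paths are assumptions, so point REGISTER at wherever the jars actually live, and keep json-simple registered since the loader parses JSON with it.

REGISTER json-simple-1.1.1.jar;               -- JSON parser used by elephant-bird's JsonLoader
REGISTER elephant-bird-core-4.1.jar;
REGISTER elephant-bird-pig-4.1.jar;
REGISTER elephant-bird-hadoop-compat-4.1.jar; -- shields against the Hadoop 1/2 Counter API change seen in the stack trace

-- Same logic as in the question, unchanged apart from the jars above.
twitter = LOAD 'sample.json' USING com.twitter.elephantbird.pig.load.JsonLoader();

B = FOREACH twitter GENERATE
        (chararray)$0#'created_at' AS created_at,
        (chararray)$0#'id'         AS id,
        (chararray)$0#'id_str'     AS id_str,
        (chararray)$0#'text'       AS text,
        (chararray)$0#'source'     AS source,
        com.twitter.elephantbird.pig.piggybank.JsonStringToMap($0#'entities') AS entities,
        (boolean)$0#'favorited'    AS favorited;

DESCRIBE B;
DUMP B;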

Regarding "json - ERROR 1066: Unable to open iterator for alias - PIG SCRIPT", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39434764/
