
hadoop - Writing to Hive from MapReduce (initializing HCatOutputFormat)

Repost. Author: 可可西里. Updated: 2023-11-01 16:22:55

I wrote an MR job that is supposed to load data from HBase and dump it into Hive. Connecting to HBase works fine, but when I try to save the data into the Hive table, I get the following error message:

 Failing Oozie Launcher, Main class [org.apache.oozie.action.hadoop.JavaMain], main() threw exception, org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
org.apache.oozie.action.hadoop.JavaMainException: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:58)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:38)
at org.apache.oozie.action.hadoop.JavaMain.main(JavaMain.java:36)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.LauncherMapper.map(LauncherMapper.java:226)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1548)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: org.apache.hive.hcatalog.common.HCatException : 2004 : HCatOutputFormat not initialized, setOutput has to be called
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getJobInfo(HCatBaseOutputFormat.java:118)
at org.apache.hive.hcatalog.mapreduce.HCatBaseOutputFormat.getTableSchema(HCatBaseOutputFormat.java:61)
at com.nrholding.t0_mr.main.DumpProductViewsAggHive.run(DumpProductViewsAggHive.java:254)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at com.nrholding.t0_mr.main.DumpProductViewsAggHive.main(DumpProductViewsAggHive.java:268)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.oozie.action.hadoop.JavaMain.run(JavaMain.java:55)
... 15 more

I have already checked that:

  • the table exists
  • the setOutput method is called before getTableSchema and setSchema

Here is my run method:

@Override
public int run(String[] args) throws Exception {

    // Create configuration
    Configuration conf = this.getConf();
    String databaseName = null;
    String tableName = "test";

    // Parse arguments
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    getParams(otherArgs);

    // It is better to pass the ZooKeeper quorum as a CLI parameter:
    // -D hbase.zookeeper.quorum=<zookeeper servers>
    conf.set("hbase.zookeeper.quorum",
            "cz-dc1-s-132.mall.local,cz-dc1-s-133.mall.local,"
            + "cz-dc1-s-134.mall.local,cz-dc1-s-135.mall.local,"
            + "cz-dc1-s-136.mall.local");

    // Create job
    Job job = Job.getInstance(conf, NAME);
    job.setJarByClass(DumpProductViewsAggHive.class);

    // Setup MapReduce job
    job.setReducerClass(Reducer.class);
    //job.setNumReduceTasks(0); // If no reducer is needed

    // Specify key / value
    job.setOutputKeyClass(Writable.class);
    job.setOutputValueClass(DefaultHCatRecord.class);

    // Input
    getInput(null, dateFrom, dateTo, job, caching, table);

    // Output
    // Ignore the key for the reducer output; emitting an HCatalog record as value
    job.setOutputFormatClass(HCatOutputFormat.class);

    HCatOutputFormat.setOutput(job, OutputJobInfo.create(databaseName, tableName, null));
    HCatSchema s = HCatOutputFormat.getTableSchema(conf);
    System.err.println("INFO: output schema explicitly set for writing: " + s);
    HCatOutputFormat.setSchema(job, s);

    // Execute job and return status
    return job.waitForCompletion(true) ? 0 : 1;
}

Does anyone know how to fix this? Thanks!

Best Answer

HCatOutputFormat.setOutput(job, ...) stores the serialized output job info in the job's configuration, and Job.getInstance(conf, NAME) works on a copy of conf. The original conf object therefore never receives that info, so HCatOutputFormat.getTableSchema(conf) fails with error 2004 ("HCatOutputFormat not initialized"). Read the schema from the job's configuration instead:

HCatSchema s = HCatOutputFormat.getTableSchema(job.getConfiguration());
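The copy semantics behind the fix can be sketched in plain Java without any Hadoop dependencies. This is only an illustration, assuming Job.getInstance makes a defensive copy of the configuration (the class name, method name, and key below are invented for the sketch, not HCatalog API):

```java
import java.util.HashMap;
import java.util.Map;

public class CopySemanticsDemo {

    // Stands in for Job.getInstance(conf, NAME): the job works on a copy of conf
    static Map<String, String> jobCopyOf(Map<String, String> conf) {
        return new HashMap<>(conf);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();

        // The job gets its own copy of the configuration
        Map<String, String> jobConf = jobCopyOf(conf);

        // setOutput(job, ...) writes the serialized job info into the job's copy only
        jobConf.put("hcat.output.info", "serialized-OutputJobInfo");

        // The original conf never sees the update, which is why
        // getTableSchema(conf) cannot find the initialized output info
        System.out.println("conf has key:    " + conf.containsKey("hcat.output.info"));
        System.out.println("jobConf has key: " + jobConf.containsKey("hcat.output.info"));
    }
}
```

This is why every HCat call after setOutput must read from job.getConfiguration() rather than from the Configuration object the job was created with.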

Regarding "hadoop - Writing to Hive from MapReduce (initializing HCatOutputFormat)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24558943/
