I am trying to upload data from a file in HDFS into an HBase table, using HFileOutputFormat2 as the OutputFormat, but I get the following exception:
java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.hbase.client.Put cannot be cast to org.apache.hadoop.hbase.Cell
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.ClassCastException: org.apache.hadoop.hbase.client.Put cannot be cast to org.apache.hadoop.hbase.Cell
at org.apache.hadoop.hbase.mapreduce.HFileOutputFormat2$1.write(HFileOutputFormat2.java:148)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:635)
at org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
at com.xogito.ingestion.mr.hbase.CSEventsHBaseMapper.map(CSEventsHBaseMapper.java:90)
at com.xogito.ingestion.mr.hbase.CSEventsHBaseMapper.map(CSEventsHBaseMapper.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:243)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:662)
Below is the code for the job:
@Override
public int run(String[] args) throws Exception {
    Configuration conf = getConf();
    job = Job.getInstance(conf, "MY_JOB");
    job.setJarByClass(getClass());
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(Put.class);
    job.setSpeculativeExecution(false);
    job.setReduceSpeculativeExecution(false);
    job.setMapperClass(CustomMapper.class); // custom mapper
    job.setInputFormatClass(TextInputFormat.class);
    job.setOutputFormatClass(HFileOutputFormat.class);

    String parentInputPath = args[0];
    String parentOutputPath = args[1];
    FileInputFormat.addInputPaths(job, parentInputPath);
    HFileOutputFormat.setOutputPath(job, new Path(parentOutputPath));

    Configuration hConf = HBaseConfiguration.create(conf);
    hConf.set("hbase.zookeeper.quorum", "x.x.x.x");
    hConf.set("hbase.zookeeper.property.clientPort", "2181");
    HTable hTable = new HTable(hConf, "mytable");
    // hTable.setAutoFlush(false, true);
    // hTable.setWriteBufferSize(1024 * 1024 * 12);

    HFileOutputFormat.configureIncrementalLoad(job, hTable);
    job.setNumReduceTasks(0);
    job.submit();
    return 0;
}
The Mapper code is as follows:
@Override
public void map(WritableComparable key, Writable val, Context context)
        throws IOException, InterruptedException {
    String data = val.toString();
    String[] splitted = data.split("\t");
    String account = splitted[1];
    Matcher match = ACCOUNT.matcher(account);
    int clientid = 0;
    if (match.find()) {
        clientid = Integer.parseInt(match.group(1));
    }
    String userid = splitted[2];
    Long timestamp = 0L;
    try {
        timestamp = Long.valueOf(splitted[10]);
    } catch (Exception e) {
        LOGGER.error(e.getMessage(), e);
    }
    String rowKeyText = "somtext";
    ImmutableBytesWritable rowKey = new ImmutableBytesWritable(Bytes.toBytes(rowKeyText));
    Put put = new Put(Bytes.toBytes(rowKeyText));
    put.add(cf, column, value); // cf, column and value are defined elsewhere in the mapper
    context.write(rowKey, put);
}
Best Answer
HFileOutputFormat, or its newer version HFileOutputFormat2, expects KeyValue as the final class. Probably PutSortReducer, which converts Put into KeyValue instances, is not being applied correctly.
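In the job driver above, a likely culprit (a sketch of the fix, not a confirmed diagnosis) is that job.setNumReduceTasks(0) is called after configureIncrementalLoad. configureIncrementalLoad wires in PutSortReducer when the map output value class is Put, and it is that reducer which turns the Puts into sorted KeyValues; forcing zero reduce tasks makes the job map-only, so the Puts reach the HFile writer directly and fail the cast to Cell:

    HFileOutputFormat.configureIncrementalLoad(job, hTable);
    // job.setNumReduceTasks(0);  // remove: this disables PutSortReducer
    return job.waitForCompletion(true) ? 0 : 1;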
In my case I was not using MapReduce but Spark, so I simply created KeyValue directly instead of Put.
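Carried back to a MapReduce mapper, the same direct-KeyValue approach would look roughly like the sketch below (cf, column and value stand in for the question's own placeholders). The map output value class must then be KeyValue so that configureIncrementalLoad selects KeyValueSortReducer instead of PutSortReducer:

    // emit a KeyValue instead of a Put
    byte[] row = Bytes.toBytes(rowKeyText);
    KeyValue kv = new KeyValue(row, cf, column, value);
    context.write(new ImmutableBytesWritable(row), kv);
    // and in the driver:
    // job.setMapOutputValueClass(KeyValue.class);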
Regarding "java - ClassCastException when using HFileOutputFormat2", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27324569/