
java - Loading data from a file into a table using HBase MapReduce


I need to load data from a file located in HDFS into an HBase table using an HBase MapReduce job. I have a CSV file that contains only the values for the column qualifiers, like this:

Now, how do I load these values into my HBase table from the MapReduce program, and how do I generate the row ID automatically?

    Class:


public class SampleExample {

    private static final String NAME = "SampleExample"; // class name

    static class Uploader extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {

        private long statuspoint = 100;
        private long count = 0;

        @Override
        public void map(LongWritable key, Text line, Context context)
                throws IOException {
            String[] values = line.toString().split(",");
            /* How to read values into columnQualifier and how to generate row id */
            // put function-------------------
            try {
                context.write(new ImmutableBytesWritable(row), put);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            if (++count % statuspoint == 0) {
                context.setStatus("Emitting Put " + count);
            }
        }
    }

    public static Job configureJob(Configuration conf, String[] args)
            throws IOException {

    }
}

Error:

12/09/17 05:23:30 INFO mapred.JobClient: Task Id : attempt_201209041554_0071_m_000000_0, Status : FAILED
java.io.IOException: Type mismatch in value from map: expected org.apache.hadoop.io.Writable, recieved org.apache.hadoop.hbase.client.Put
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:1019)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:691)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at com.hbase.Administration$Uploader.map(HealthAdministration.java:51)
at com.hbase.Administration$Uploader.map(HealthAdministration.java:1)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

Can anyone help me? I can't figure out how to read the values into the column qualifiers.

Best Answer

// family and column1..column4 are assumed to be byte[] constants for the
// column family and the four column qualifiers
String stringLine = line.toString();
StringTokenizer stringTokenizer = new StringTokenizer(stringLine, ",");

// use the line's byte offset (the map input key) as an auto-generated row id
byte[] row = Bytes.toBytes(key.get());
Put put = new Put(row);
put.add(family, column1, stringTokenizer.nextToken().getBytes());
put.add(family, column2, stringTokenizer.nextToken().getBytes());
put.add(family, column3, stringTokenizer.nextToken().getBytes());
put.add(family, column4, stringTokenizer.nextToken().getBytes());

try {
    context.write(new ImmutableBytesWritable(row), put);
} catch (InterruptedException e) {
    e.printStackTrace();
}
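
The "Type mismatch in value from map" error in the stack trace points at the job configuration rather than the mapper: the map output value class the job declares does not match the Put the mapper emits. Below is a minimal sketch of the empty configureJob method, assuming the input path and table name are passed as args[0] and args[1] (these argument positions are assumptions, not from the original post). Running with zero reduce tasks lets TableOutputFormat write the Puts straight to the table and avoids the mismatch; calling job.setMapOutputValueClass(Put.class) instead would be the fix if a reduce phase were needed.

public static Job configureJob(Configuration conf, String[] args) throws IOException {
    Path inputPath = new Path(args[0]);  // assumed: HDFS path of the CSV file
    String tableName = args[1];          // assumed: target HBase table name
    Job job = new Job(conf, NAME + "_" + tableName);
    job.setJarByClass(Uploader.class);
    FileInputFormat.setInputPaths(job, inputPath);
    job.setInputFormatClass(TextInputFormat.class);
    job.setMapperClass(Uploader.class);
    // Sets up TableOutputFormat and the target table; with no reducers the
    // mapper's Puts are written directly to HBase.
    TableMapReduceUtil.initTableReducerJob(tableName, null, job);
    job.setNumReduceTasks(0);
    return job;
}

This follows the pattern of HBase's bundled SampleUploader example; the key point for the error above is that the declared map output value type and the emitted Put have to agree, either by skipping the shuffle entirely (zero reducers) or by declaring Put as the map output value class.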

Regarding "java - Loading data from a file into a table using HBase MapReduce", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/12385868/
