java - Loading data into HBase via HFile is not working

Reposted. Author: 可可西里. Updated: 2023-11-01 16:39:30

I wrote a mapper to load data from disk into HBase via HFiles. The program runs successfully, but no data is loaded into my HBase table. Any ideas?

Here is my Java program:

protected void writeToHBaseViaHFile() throws Exception {
    try {
        System.out.println("In try...");
        Configuration conf = HBaseConfiguration.create();
        conf.set("hbase.zookeeper.quorum", "XXXX");
        Connection connection = ConnectionFactory.createConnection(conf);
        System.out.println("got connection");

        String inputPath = "/tmp/nuggets_from_Hive/part-00000";
        String outputPath = "/tmp/mytemp" + new Random().nextInt(1000);
        final TableName tableName = TableName.valueOf("steve1");
        System.out.println("got table steve1, outputPath = " + outputPath);

        // tag::SETUP[]
        Table table = connection.getTable(tableName);

        Job job = Job.getInstance(conf, "ConvertToHFiles");
        System.out.println("job is setup...");

        HFileOutputFormat2.configureIncrementalLoad(job, table,
                connection.getRegionLocator(tableName)); // <1>
        System.out.println("done configuring incremental load...");

        job.setInputFormatClass(TextInputFormat.class); // <2>

        job.setJarByClass(Importer.class); // <3>

        job.setMapperClass(LoadDataMapper.class); // <4>
        job.setMapOutputKeyClass(ImmutableBytesWritable.class); // <5>
        job.setMapOutputValueClass(KeyValue.class); // <6>

        FileInputFormat.setInputPaths(job, inputPath);
        HFileOutputFormat2.setOutputPath(job, new org.apache.hadoop.fs.Path(outputPath));
        System.out.println("Setup complete...");
        // end::SETUP[]

        if (!job.waitForCompletion(true)) {
            System.out.println("Failure");
        } else {
            System.out.println("Success");
        }
    } catch (Exception e) {
        e.printStackTrace();
    }
}

Here is my mapper class:

public class LoadDataMapper extends Mapper<LongWritable, Text, ImmutableBytesWritable, Cell> {

    public static final byte[] FAMILY = Bytes.toBytes("pd");
    public static final byte[] COL = Bytes.toBytes("bf");
    public static final ImmutableBytesWritable rowKey = new ImmutableBytesWritable();

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String[] line = value.toString().split("\t"); // <1>
        byte[] rowKeyBytes = Bytes.toBytes(line[0]);
        rowKey.set(rowKeyBytes);
        KeyValue kv = new KeyValue(rowKeyBytes, FAMILY, COL, Bytes.toBytes(line[1])); // <6>
        context.write(rowKey, kv); // <7>
        System.out.println("line[0] = " + line[0] + "\tline[1] = " + line[1]);
    }

}

I have already created the table steve1 in my cluster, but after the program runs successfully I get 0 rows:

hbase(main):007:0> count 'steve1'
0 row(s) in 0.0100 seconds

=> 0

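What I have tried:

I tried adding print statements, like the one in the mapper class, to see whether it actually reads the data, but the output never shows up in my console. I don't know how to debug this.

Any ideas are greatly appreciated!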

Best answer

That job only creates the HFiles; you still need to load the HFiles into your table. For example, you need to do something like this:

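// admin, hTable and regionLocator are not defined in the question's code; they are assumed here to come from
// connection.getAdmin(), connection.getTable(tableName) and connection.getRegionLocator(tableName).
LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(outputPath), admin, hTable, regionLocator);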
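
On the debugging side of the question: on a cluster, map tasks normally run in their own JVMs, so System.out output from the mapper ends up in the task logs rather than on the client console. One way to check the parsing step without a cluster is to pull it out into a plain method and run it locally. The following is a minimal sketch under that assumption, not the asker's code: TabLineParser and ParsedLine are hypothetical names, and unlike the original mapper it skips lines that lack a second tab-separated field instead of throwing ArrayIndexOutOfBoundsException.

```java
import java.util.Optional;

// Hypothetical helper (not part of HBase or Hadoop): isolates the
// tab-splitting logic of the mapper so it can be exercised locally.
final class TabLineParser {

    // Row key and cell value taken from one input line.
    static final class ParsedLine {
        final String rowKey;
        final String value;

        ParsedLine(String rowKey, String value) {
            this.rowKey = rowKey;
            this.value = value;
        }
    }

    private TabLineParser() {
    }

    // Returns the first two tab-separated fields, or empty if the line is
    // null, has fewer than two fields, or has an empty row key.
    static Optional<ParsedLine> parse(String line) {
        if (line == null) {
            return Optional.empty();
        }
        // The -1 limit keeps trailing empty fields, so "key\t" still
        // yields two fields (the value is then an empty string).
        String[] fields = line.split("\t", -1);
        if (fields.length < 2 || fields[0].isEmpty()) {
            return Optional.empty();
        }
        return Optional.of(new ParsedLine(fields[0], fields[1]));
    }
}
```

The mapper's map() method could then call TabLineParser.parse(value.toString()) and write a KeyValue only when a result is present.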

Regarding "java - Loading data into HBase via HFile is not working", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44360444/
