
hadoop - Finding the number of rows with empty values in HBase

Reposted. Author: 行者123. Updated: 2023-12-02 20:45:53

I have populated an HBase table with a rowid and a lot of tweet-related information such as plain text, URLs, hashtags, and so on, as shown below:

902221655086211073    column=clean-tweet:clean-text-cta, timestamp=1514793745304, value=democrat mayor order hurricane harvey stand houston

However, while populating the table I noticed that some rows are empty, for example:
902487280543305728    column=clean-tweet:clean-text-cta, timestamp=1514622371008, value=  

How can I find the number of rows that actually have data?

Any help is appreciated.

Best answer

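As of now, the HBase shell has no built-in way to do this. You could use simple code like the following to count the records that have no value for the given column qualifier. Because the scan is restricted to that one column, it only returns rows that contain the column, so the number of rows with data is the number of scanned rows minus the empty ones.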

CountAndFilter [tableName] [columnFamily] [columnQualifier]

import java.io.IOException;

import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class CountAndFilter {

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: CountAndFilter [tableName] [columnFamily] [columnQualifier]");
            System.exit(1);
        }
        // A Connection is heavyweight: open it once and close it when done.
        try (Connection conn = ConnectionFactory.createConnection(HBaseConfiguration.create())) {
            long empty = countEmpty(conn, args[0], args[1], args[2]);
            System.out.println("Records with empty value : " + empty);
        }
    }

    // Scans only the given column and counts the rows whose cell has a zero-length value.
    public static long countEmpty(Connection conn, String tableName, String columnFamily,
            String columnQualifier) throws IOException {
        byte[] family = Bytes.toBytes(columnFamily);
        byte[] qualifier = Bytes.toBytes(columnQualifier);
        long recordsWithoutValue = 0;
        // try-with-resources closes the scanner and the table even if the scan fails.
        try (Table table = conn.getTable(TableName.valueOf(tableName));
             ResultScanner rs = table.getScanner(new Scan().addColumn(family, qualifier))) {
            for (Result res : rs) {
                if (res.containsEmptyColumn(family, qualifier)) {
                    recordsWithoutValue++;
                }
            }
        }
        return recordsWithoutValue;
    }
}
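
One caveat: `containsEmptyColumn` only matches cells whose value has zero length. If the "empty" cells in the table actually hold whitespace (the shell output in the question does not make this clear), they would not be counted. A minimal sketch of a stricter check follows, assuming the values are UTF-8 text. The class `BlankValueCheck` and the method `isBlank` are hypothetical helpers introduced here for illustration; they are not part of the HBase API.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the HBase client library.
final class BlankValueCheck {

    private BlankValueCheck() {}

    // Returns true when the value is missing (null), zero-length, or, once decoded
    // as UTF-8 (an assumption about the stored data), consists only of characters
    // that String.trim() strips (code points up to U+0020, e.g. spaces and tabs).
    static boolean isBlank(byte[] value) {
        if (value == null || value.length == 0) {
            return true;
        }
        return new String(value, StandardCharsets.UTF_8).trim().isEmpty();
    }
}
```

Inside the scan loop, the condition could then be written as `BlankValueCheck.isBlank(res.getValue(family, qualifier))`, where `family` and `qualifier` are the column family and qualifier as byte arrays. `Result.getValue` returns `null` when the row has no such column.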

Source: a similar question about finding the number of rows with empty values in HBase on Stack Overflow: https://stackoverflow.com/questions/48069821/
