
java - Cannot extract an array/list from a DataFrame, AnalysisException: need struct type but got binary

Reposted. Author: 行者123. Updated: 2023-12-02 12:22:13

I have a Dataset of String[] and I am struggling to extract columns from it. Here is the code:

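import static org.apache.spark.sql.functions.col;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

import org.apache.spark.api.java.function.MapFunction;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;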

//Read parquet data
Dataset<Row> readerDF = spark.readStream().format("parquet").

List<String> columns = Arrays.asList("city","country");
//For now, the only field of interest in the data is 'fieldMap', which is a Map<String,String>

Dataset<String[]> stringArrDF = readerDF.map((MapFunction<Row, String[]>) row -> {
    Map<String,String> fields = row.getJavaMap(row.fieldIndex("fieldMap"));
    List<String> columnList = new ArrayList<>();
    //Collect the value of each requested column, defaulting to "" when it is missing
    columns.forEach(columnName ->
    {
        columnList.add(fields.getOrDefault(columnName, ""));
    });
    return columnList.toArray(new String[columns.size()]);
}, Encoders.kryo(String[].class));

//I was expecting to extract city here:
//'city' is the first entry in columns, so it sits at index 0
Dataset<Row> ds = stringArrDF.select(col("value").getItem(0).as("city"));

But it fails with the following exception:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from value#22;

How can I access a String[] or List field from a Dataset?

Best answer

You are getting the following error:

Exception in thread "main" org.apache.spark.sql.AnalysisException: Can't extract value from value#22: need struct type but got binary;

You are using Encoders.kryo(String[].class) to create stringArrDF. The documentation for Encoders.kryo says:

Creates an encoder that serializes objects of type T using Kryo. This encoder maps T into a single byte array (binary) field.

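The value column is therefore binary rather than an array, so getItem has nothing to index into. Use spark.implicits().newStringArrayEncoder() to encode the String[] instead, which gives the column an array type.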
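
As a minimal sketch of that suggestion, assuming Spark 2.x or 3.x with the Java API, a local master, and made-up sample rows, the following encodes String[] data with the array encoder and then indexes into it. The class name StringArrayEncoderSketch and the method extractCities are hypothetical. It uses a batch Dataset built with createDataset rather than the streaming parquet source from the question, purely to stay self-contained.

```java
import static org.apache.spark.sql.functions.col;

import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoder;
import org.apache.spark.sql.Encoders;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class StringArrayEncoderSketch {

    // Hypothetical helper: builds a Dataset<String[]> and returns the first element of each array.
    static List<String> extractCities(SparkSession spark) {
        // Made-up sample data standing in for the {city, country} arrays from the question.
        List<String[]> rows = Arrays.asList(
                new String[] {"Paris", "France"},
                new String[] {"Pune", "India"});

        // Array encoder instead of Encoders.kryo(String[].class):
        // the single "value" column gets an array type, not binary.
        Encoder<String[]> arrayEncoder = spark.implicits().newStringArrayEncoder();
        Dataset<String[]> stringArrDS = spark.createDataset(rows, arrayEncoder);
        stringArrDS.printSchema();

        // getItem can now index into the array column; "city" is at index 0.
        Dataset<Row> cities = stringArrDS.select(col("value").getItem(0).as("city"));
        return cities.as(Encoders.STRING()).collectAsList();
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .master("local[1]")
                .appName("string-array-encoder-sketch")
                .getOrCreate();
        try {
            System.out.println(extractCities(spark));
        } finally {
            spark.stop();
        }
    }
}
```

In the question's code, the equivalent change would be to pass this encoder as the second argument to readerDF.map in place of Encoders.kryo(String[].class).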

Regarding java - Cannot extract an array/list from a DataFrame, AnalysisException: need struct type but got binary, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/45684812/
