
scala - Spark Scala - Nested StructType conversion to Map


I am using Spark 1.6 with Scala.

I created an index in ElasticSearch containing an object. The object "params" was created as a Map[String, Map[String, String]]. Example:

val params: Map[String, Map[String, String]] = Map(
  "p1" -> Map("p1_detail" -> "table1"),
  "p2" -> Map("p2_detail" -> "table2", "p2_filter" -> "filter2"),
  "p3" -> Map("p3_detail" -> "table3")
)
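
For reference, a document of this shape can be written from Spark with the elasticsearch-spark (elasticsearch-hadoop) connector; the sketch below is illustrative only, assuming the index/type "x/1" and the plain fields a, b and c that appear in the record shown next:

import org.apache.spark.{SparkConf, SparkContext}
import org.elasticsearch.spark._ // adds saveToEs to RDDs

val sc = new SparkContext(new SparkConf().setAppName("index-params"))

// one document combining the plain fields with the nested "params" map
val doc: Map[String, Any] = Map(
  "a" -> "toto",
  "b" -> "tata",
  "c" -> "description",
  "params" -> params
)

// write to index "x", type "1" (matching _index / _type in the record below)
sc.makeRDD(Seq(doc)).saveToEs("x/1")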

This gives me records that look like the following:

{
  "_index": "x",
  "_type": "1",
  "_id": "xxxxxxxxxxxx",
  "_score": 1,
  "_timestamp": 1506537199650,
  "_source": {
    "a": "toto",
    "b": "tata",
    "c": "description",
    "params": {
      "p1": {
        "p1_detail": "table1"
      },
      "p2": {
        "p2_detail": "table2",
        "p2_filter": "filter2"
      },
      "p3": {
        "p3_detail": "table3"
      }
    }
  }
}

Then I try to read the Elasticsearch index in order to update the values.
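
The read itself can be done with the elasticsearch-spark SQL data source; a minimal sketch, assuming Spark 1.6's SQLContext and the index/type "x/1" from above:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)

// load the index as a DataFrame through the elasticsearch-spark data source
val df = sqlContext.read
  .format("org.elasticsearch.spark.sql")
  .load("x/1")

df.printSchema()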

Spark reads the index with the following schema:

|-- a: string (nullable = true)
|-- b: string (nullable = true)
|-- c: string (nullable = true)
|-- params: struct (nullable = true)
| |-- p1: struct (nullable = true)
| | |-- p1_detail: string (nullable = true)
| |-- p2: struct (nullable = true)
| | |-- p2_detail: string (nullable = true)
| | |-- p2_filter: string (nullable = true)
| |-- p3: struct (nullable = true)
| | |-- p3_detail: string (nullable = true)

My problem is that the object is read back as a struct. To manage and easily update the fields I would rather have a Map, because I am not very familiar with StructType.

I tried to receive the object as a Map inside a UDF, but I get the following error:

 User class threw exception: org.apache.spark.sql.AnalysisException: cannot resolve 'UDF(params)' due to data type mismatch: argument 1 requires map<string,map<string,string>> type, however, 'params' is of struct<p1:struct<p1_detail:string>,p2:struct<p2_detail:string,p2_filter:string>,p3:struct<p3_detail:string>> type.;

UDF code snippet:

val getSubField: Map[String, Map[String, String]] => String =
  (params: Map[String, Map[String, String]]) => {
    val return_string = params("p1").getOrElse("p1_detail", null.asInstanceOf[String])
    return_string
  }
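
For context, the exception above is typically raised as soon as a Map-typed UDF like this is applied to the params column, for example with something along these lines (the DataFrame name df and the output column name are illustrative):

import org.apache.spark.sql.functions.{col, udf}

// params is a struct column, so Spark cannot pass it to a function expecting
// map<string,map<string,string>> and throws the AnalysisException shown above
val withDetail = df.withColumn("p1_detail", udf(getSubField)(col("params")))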

My question: how can we convert this struct into a Map? I have read about the toMap method in the documentation, but as a Scala beginner I could not figure out how to use it (I am not very familiar with implicit parameters).

Thanks in advance.

Best Answer

In the end I solved it like this:

import org.apache.spark.sql.Row
import org.apache.spark.sql.functions.udf

/* converts a Row into a Map of its non-null fields, keyed by field name */
def convertRowToMap[T](row: Row): Map[String, T] = {
  row.schema.fieldNames
    .filter(field => !row.isNullAt(row.fieldIndex(field)))
    .map(field => field -> row.getAs[T](field))
    .toMap
}

/* udf that converts Row to Map */
val rowToMap: Row => Map[String, Map[String, String]] = (row: Row) => {
  // first level: p1, p2, p3 come back as nested Rows
  val mapTemp = convertRowToMap[Row](row)

  // second level: convert each nested Row into a Map[String, String]
  val mapToReturn = mapTemp.map { case (k, v) => k -> convertRowToMap[String](v) }

  mapToReturn
}
val udfrowToMap = udf(rowToMap)
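
Assuming the DataFrame read from the index is called df (as in the read sketch above), the UDF can then be applied to the struct column, for example:

import org.apache.spark.sql.functions.col

// replace the struct column with its Map[String, Map[String, String]] representation
val dfWithMap = df.withColumn("params", udfrowToMap(col("params")))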

Regarding scala - Spark Scala - nested StructType conversion to Map, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46566374/
