java - Preserve keys with null values when writing JSON in Spark

Reposted. Author: 搜寻专家. Updated: 2023-11-01 02:37:28

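I am trying to write a JSON file with Spark. Some keys have null values. They show up fine in the Dataset, but when I write it to a file the keys are dropped. How can I make sure they are preserved?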

Code that writes the file:

ddp.coalesce(20).write().mode("overwrite").json("hdfs://localhost:9000/user/dedupe_employee");

Part of the JSON data from the source:

"event_header": {
  "accept_language": null,
  "app_id": "App_ID",
  "app_name": null,
  "client_ip_address": "IP",
  "event_id": "ID",
  "event_timestamp": null,
  "offering_id": "Offering",
  "server_ip_address": "IP",
  "server_timestamp": 1492565987565,
  "topic_name": "Topic",
  "version": "1.0"
}

Output:

"event_header": {
  "app_id": "App_ID",
  "client_ip_address": "IP",
  "event_id": "ID",
  "offering_id": "Offering",
  "server_ip_address": "IP",
  "server_timestamp": 1492565987565,
  "topic_name": "Topic",
  "version": "1.0"
}

In the example above, the keys accept_language, app_name, and event_timestamp have been dropped.

Best answer

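Apparently, Spark (at the time this answer was written) does not provide any option for keeping null values when writing JSON. The following custom solution, which serializes each record with Jackson instead, should therefore work.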

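import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
// Assumes a SparkSession named `spark` is in scope (as in spark-shell);
// its implicits provide toDS() and the String encoder used below.
import spark.implicits._

case class EventHeader(
  accept_language: String, app_id: String, app_name: String,
  client_ip_address: String, event_id: String, event_timestamp: String,
  offering_id: String, server_ip_address: String, server_timestamp: Long,
  topic_name: String, version: String)

val ds = Seq(EventHeader(null, "App_ID", null, "IP", "ID", null, "Offering", "IP", 1492565987565L, "Topic", "1.0")).toDS()

// Serialize each record with Jackson, which writes null fields by default.
// One mapper is created per partition rather than per record.
val ds1 = ds.mapPartitions(records => {
  val mapper = new ObjectMapper with ScalaObjectMapper
  mapper.registerModule(DefaultScalaModule)
  records.map(mapper.writeValueAsString(_))
})

// Write the JSON strings as plain text, one JSON object per line.
ds1.coalesce(1).write.text("hdfs://localhost:9000/user/dedupe_employee")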

This produces output like the following:

{"accept_language":null,"app_id":"App_ID","app_name":null,"client_ip_address":"IP","event_id":"ID","event_timestamp":null,"offering_id":"Offering","server_ip_address":"IP","server_timestamp":1492565987565,"topic_name":"Topic","version":"1.0"}
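For readers on Spark 3.0 or later, the JSON writer accepts an `ignoreNullFields` option, so the Jackson workaround may not be needed. The following is a minimal sketch under that assumption, not part of the original answer. It assumes a spark-shell session where `spark` already exists. The trimmed-down `EventHeader` case class and the output path are hypothetical and only for illustration.

```scala
// Assumption: Spark 3.0+ running in spark-shell, so `spark` is already defined.
// The `ignoreNullFields` write option does not exist in earlier Spark releases.
import spark.implicits._

// Hypothetical, shortened version of the record type used in the answer.
case class EventHeader(accept_language: String, app_id: String, app_name: String)

val ds = Seq(EventHeader(null, "App_ID", null)).toDS()

ds.write
  .mode("overwrite")
  // Ask the JSON writer to emit null-valued fields instead of dropping them.
  .option("ignoreNullFields", "false")
  // Hypothetical output path, chosen only for this sketch.
  .json("/tmp/dedupe_employee_with_nulls")
```

With this option set to "false", the written records should keep keys such as accept_language and app_name with a null value. According to the Spark documentation, the default for this option is taken from the `spark.sql.jsonGenerator.ignoreNullFields` configuration.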

Regarding "java - Preserve keys with null values when writing JSON in Spark", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/44271612/
