作者热门文章
- html - 出于某种原因,IE8 对我的 Sass 文件中继承的 html5 CSS 不友好?
- JMeter 在响应断言中使用 span 标签的问题
- html - 在 :hover and :active? 上具有不同效果的 CSS 动画
- html - 相对于居中的 html 内容固定的 CSS 重复背景?
我有一个包含普通列的文件和一个包含 Json 字符串的列,如下所示。还附上图片。每一行实际上属于一个名为 Demo 的列(在 pic 中不可见)。其他列被删除并且在 pic 中不可见,因为它们现在不重要。
[{"key":"device_kind","value":"desktop"},{"key":"country_code","value":"ID"},{"key":"device_platform","value":"windows"}]
val jsonDF1 = spark.range(1).selectExpr(""" '{"key":"device_kind","value":"desktop"}' as jsonString""")
jsonDF1.select(get_json_object(col("jsonString"), "$.value") as "device_kind").show(2)// prints desktop under column named device_kind
val jsonDF2 = spark.range(1).selectExpr(""" '[{"key":"device_kind","value":"desktop"},{"key":"country_code","value":"ID"},{"key":"device_platform","value":"windows"}]' as jsonString""")
jsonDF2.select(get_json_object(col("jsonString"), "$.[0].value") as "device_kind").show(2)// print null but expected is desktop under column named device_kind
+------------------------------------------------------------------------+
|Demographics |
+------------------------------------------------------------------------+
|[[device_kind, desktop], [country_code, ID], [device_platform, windows]]|
|[[device_kind, mobile], [country_code, BE], [device_platform, android]] |
|[[device_kind, mobile], [country_code, QA], [device_platform, android]] |
+------------------------------------------------------------------------+
+------------------------------------------------------------------------+-----------+------------+---------------+
|Demographics |device_kind|country_code|device_platform|
+------------------------------------------------------------------------+-----------+------------+---------------+
|[[device_kind, desktop], [country_code, ID], [device_platform, windows]]|desktop |ID |windows |
|[[device_kind, mobile], [country_code, BE], [device_platform, android]] |mobile |BE |android |
|[[device_kind, mobile], [country_code, QA], [device_platform, android]] |mobile |QA |android |
+------------------------------------------------------------------------+-----------+------------+---------------+
最佳答案
谢谢你的回答。它工作正常。
因为我使用的是 2.3.3 spark,所以我以稍微不同的方式解决了这个问题。
val sch = ArrayType(StructType(Array(
StructField("key", StringType, true),
StructField("value", StringType, true)
)))
val jsonDF3 = mdf.select(from_json(col("jsonString"), sch).alias("Demographics"))
val jsonDF4 = jsonDF3.withColumn("device_kind", expr("Demographics[0].value"))
.withColumn("country_code", expr("Demographics[1].value"))
.withColumn("device_platform", expr("Demographics[2].value"))
关于json - Spark : How to parse a Array of JSON object using Spark,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57970480/
我是一名优秀的程序员,十分优秀!