
scala - How to convert an RDD of Maps to a DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 10:15:19

I have an RDD of Maps and I want to convert it to a DataFrame.
This is the input format of the RDD:

val mapRDD: RDD[Map[String, String]] = sc.parallelize(Seq(
Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

Is there a way to convert it to a DataFrame, something like this?
val df = mapRDD.toDF

df.show
empid,  empName,    depId
12 Rohan 201
13 Ross 201
14 Richard 401
15 Michale 501
16 John 701

Best answer

You can easily convert it to a Spark DataFrame. Here is code that does it:

val mapRDD= sc.parallelize(Seq(
Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

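// toDF needs the implicits of the active SparkSession, assumed here to be named spark
// (spark-shell imports them automatically).
import spark.implicits._

// Column names come from the first map; every map is assumed to have the same three keys.
val columns = mapRDD.take(1).flatMap(a => a.keys)

// Look each value up by column name instead of relying on Map iteration order.
val resultantDF = mapRDD.map { value =>
  (value(columns(0)), value(columns(1)), value(columns(2)))
}.toDF(columns: _*)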

resultantDF.show()

The output is:
+-----+-------+-----+
|empid|empName|depId|
+-----+-------+-----+
|   12|  Rohan|  201|
|   13|   Ross|  201|
|   14|Richard|  401|
|   15|Michale|  501|
|   16|   John|  701|
+-----+-------+-----+
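
The tuple-based approach above is tied to exactly three columns. A possible generalization, sketched here under stated assumptions rather than taken from the original answer, builds an explicit schema and turns each map into a Row, so it works for any number of keys. The object name MapRddToDataFrame, the helper mapsToDF, and the local SparkSession are hypothetical names introduced for illustration; all columns are assumed to be strings, and a key missing from a map becomes null.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

object MapRddToDataFrame {

  // Hypothetical helper: converts an RDD of string maps to a DataFrame
  // with the given column order. Missing keys become null.
  def mapsToDF(spark: SparkSession,
               rdd: RDD[Map[String, String]],
               columns: List[String]): DataFrame = {
    // One nullable string field per requested column.
    val schema = StructType(columns.map(name => StructField(name, StringType, nullable = true)))
    // Look every value up by key so the Row order always matches the schema order.
    val rows = rdd.map(m => Row.fromSeq(columns.map(c => m.getOrElse(c, null))))
    spark.createDataFrame(rows, schema)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder().master("local[*]").appName("map-rdd-to-df").getOrCreate()

    val mapRDD: RDD[Map[String, String]] = spark.sparkContext.parallelize(Seq(
      Map("empid" -> "12", "empName" -> "Rohan", "depId" -> "201"),
      Map("empid" -> "13", "empName" -> "Ross", "depId" -> "201"),
      Map("empid" -> "14", "empName" -> "Richard", "depId" -> "401"),
      Map("empid" -> "15", "empName" -> "Michale", "depId" -> "501"),
      Map("empid" -> "16", "empName" -> "John", "depId" -> "701")))

    // The column order is stated explicitly instead of being read from a map.
    val df = mapsToDF(spark, mapRDD, List("empid", "empName", "depId"))
    df.show()

    spark.stop()
  }
}
```

Passing the column list explicitly keeps the column order stable even if the maps were built with different key orders, at the cost of having to know the keys up front.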

Regarding scala - how to convert an RDD of Maps to a DataFrame, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40780843/
