scala - Convert a column containing a list of values into an array

Reposted. Author: 行者123. Updated: 2023-12-05 01:55:27
I have a Spark DataFrame that looks like this:

+------------------------------------------------------------------------+
| domains |
+------------------------------------------------------------------------+
|["0b3642ab5be98c852890aff03b3f83d8","4d7a5a24426749f3f17dee69e13194a9", |
| "9d0f74269019ad82ae82cc7a7f2b5d1b","0b113db8e20b2985d879a7aaa43cecf6", |
| "d095db19bd909c1deb26e0a902d5ad92","f038deb6ade0f800dfcd3138d82ae9a9", |
| "ab192f73b9db26ec2aca2b776c4398d2","ff9cf0599ae553d227e3f1078957a5d3", |
| "aa717380213450746a656fe4ff4e4072","f3346928db1c6be0682eb9307e2edf38", |
| "806a006b5e0d220c2cf714789828ecf7","9f6f8502e71c325f2a6f332a76d4bebf", |
| "c0cb38016fb603e89b160e921eced896","56ad547c6292c92773963d6e6e7d5e39"] |
+------------------------------------------------------------------------+

It has a single column whose value is a list. I want to convert it to an Array[String]. For example:

Array("0b3642ab5be98c852890aff03b3f83d8","4d7a5a24426749f3f17dee69e13194a9", "9d0f74269019ad82ae82cc7a7f2b5d1b","0b113db8e20b2985d879a7aaa43cecf6", "d095db19bd909c1deb26e0a902d5ad92","f038deb6ade0f800dfcd3138d82ae9a9", 
"ab192f73b9db26ec2aca2b776c4398d2","ff9cf0599ae553d227e3f1078957a5d3",
"aa717380213450746a656fe4ff4e4072","f3346928db1c6be0682eb9307e2edf38",
"806a006b5e0d220c2cf714789828ecf7","9f6f8502e71c325f2a6f332a76d4bebf",
"c0cb38016fb603e89b160e921eced896","56ad547c6292c92773963d6e6e7d5e39")

I tried the following code, but did not get the expected result:

DF.select("domains").as[String].collect()

Instead, I got this:

[Ljava.lang.String;@7535f28 ...

Any ideas on how to achieve this?

Best answer

You can explode your domains column before collecting, as follows:

import org.apache.spark.sql.functions.{col, explode}

val result: Array[String] = DF.select(explode(col("domains"))).as[String].collect()
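To make the explode step easier to try out, here is a minimal, self-contained sketch. It assumes the domains column has type array<string>, which explode requires. The local SparkSession, the object name ExplodeDomains, the helper names, and the two-element sample row are all hypothetical stand-ins for the real DF. As a possible alternative, the second helper collects each row as a Seq[String] and flattens on the driver.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode}

object ExplodeDomains {
  // Explode the array column into one row per element, then collect as strings.
  def viaExplode(df: DataFrame): Array[String] = {
    import df.sparkSession.implicits._
    df.select(explode(col("domains"))).as[String].collect()
  }

  // Alternative sketch: collect each row as Seq[String], then flatten on the driver.
  def viaFlatten(df: DataFrame): Array[String] = {
    import df.sparkSession.implicits._
    df.select("domains").as[Seq[String]].collect().flatten
  }

  def main(args: Array[String]): Unit = {
    // Assumption: a local session purely for illustration.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("explode-domains")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample: a single row holding an array<string> column.
    val df = Seq(
      Seq("0b3642ab5be98c852890aff03b3f83d8", "4d7a5a24426749f3f17dee69e13194a9")
    ).toDF("domains")

    println(viaExplode(df).mkString("[", ", ", "]"))
    println(viaFlatten(df).mkString("[", ", ", "]"))

    spark.stop()
  }
}
```

Both helpers return a flat Array[String]. The explode version does the flattening inside Spark, while the flatten version does it after the rows reach the driver.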

You can then print your result array using the mkString method:

println(result.mkString("[", ", ", "]"))
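For context, output such as [Ljava.lang.String;@7535f28 in the question is what a JVM array's default toString produces: a type tag followed by a hash code, not the elements. The short sketch below uses plain Scala without Spark to contrast that with mkString. The object name PrintArrayDemo and the two sample values are illustrative only.

```scala
object PrintArrayDemo {
  // Hypothetical sample standing in for the collected result.
  val result: Array[String] =
    Array("0b3642ab5be98c852890aff03b3f83d8", "4d7a5a24426749f3f17dee69e13194a9")

  // Arrays inherit toString from java.lang.Object, so this is a type tag plus
  // a hash code, e.g. "[Ljava.lang.String;@..." (the hash varies per run).
  val raw: String = result.toString

  // mkString joins the elements using the given prefix, separator, and suffix.
  val formatted: String = result.mkString("[", ", ", "]")

  def main(args: Array[String]): Unit = {
    println(raw)
    println(formatted)
  }
}
```

So the collected array may well have held the expected values all along, and only the way it was printed hid them.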

For more on scala - converting a column containing a list of values into an array, see the similar question on Stack Overflow: https://stackoverflow.com/questions/70211654/
