
scala - Spark: split is not a member of org.apache.spark.sql.Row

Reposted. Author: 行者123. Updated: 2023-12-04 02:45:44

Below is my code from Spark 1.6. I am trying to convert it to Spark 2.3, but I get an error when calling split.

Spark 1.6 code:

val file = spark.textFile(args(0))
val mapping = file.map(_.split('\t')).map(a => a(1))
mapping.saveAsTextFile(args(1))

Spark 2.3 code:

val file = spark.read.text(args(0))
val mapping = file.map(_.split('\t')).map(a => a(1)) // the error is raised here
mapping.write.text(args(1))

Error message:
value split is not a member of org.apache.spark.sql.Row

Best answer

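Unlike spark.textFile, which returns an RDD[String], spark.read.text returns a DataFrame, which is essentially a collection of Row objects (a Dataset[Row]), and Row has no split method. You can map with a pattern-matching (partial) function that extracts the String from each Row, as in the following example: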

// /path/to/textfile (fields are tab-separated):
// a b c
// d e f

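import org.apache.spark.sql.Row
import spark.implicits._ // encoders for Dataset.map (already in scope in spark-shell)

// Reads each line into a single String column named "value"
val df = spark.read.text("/path/to/textfile")

// Unwrap the String from each Row, split on tabs, keep the second field
df.map{ case Row(s: String) => s.split("\\t") }.map(_(1)).show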
// +-----+
// |value|
// +-----+
// |    b|
// |    e|
// +-----+
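
As an alternative to pattern matching on Row, the following is a minimal sketch of how the question's job might be ported using spark.read.textFile, which returns a Dataset[String] so that split can be called on each line directly. The object name SplitSecondColumn and the helper secondField are hypothetical names introduced for illustration. Skipping lines that have fewer than two fields is also an assumption; the original code would throw on such lines instead.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical object name, used only for this sketch.
object SplitSecondColumn {

  // Hypothetical helper: returns the second tab-separated field, if present.
  // The limit -1 keeps trailing empty fields, so "a\t" yields Some("").
  def secondField(line: String): Option[String] = {
    val fields = line.split("\t", -1)
    if (fields.length > 1) Some(fields(1)) else None
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("SplitSecondColumn").getOrCreate()
    import spark.implicits._ // encoders required by Dataset.flatMap

    // textFile yields Dataset[String] rather than a DataFrame of Row
    val lines: Dataset[String] = spark.read.textFile(args(0))

    // Assumption: lines without a second field are dropped, not treated as errors
    val mapping: Dataset[String] = lines.flatMap(line => secondField(line).toList)

    // write.text expects a single string column, which Dataset[String] provides
    mapping.write.text(args(1))
    spark.stop()
  }
}
```

Keeping the parsing in a plain function means it can be checked without a cluster, for example secondField("a\tb\tc") gives Some("b").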

Regarding scala - Spark: split is not a member of org.apache.spark.sql.Row, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57346978/
