
scala - How to add new columns to a DataFrame when their names are missing?

Repost. Author: 行者123. Updated: 2023-12-04 17:16:15

I want to add selected columns to a DataFrame when they are not already present.

val columns = List("Col1", "Col2", "Col3")
for (i <- columns)
  if (!df.schema.fieldNames.contains(i))
    df.withColumn(i, lit(0))

When I select columns from the DataFrame, only the old columns come back; the new columns are missing.

Best answer

This is more about how to do it in Scala than in Spark, and it is an excellent case for foldLeft. (My favorite!)

import org.apache.spark.sql.functions.lit

// start with an empty DataFrame, but could be anything
val df = spark.emptyDataFrame
val columns = Seq("Col1", "Col2", "Col3")
val columnsAdded = columns.foldLeft(df) { case (d, c) =>
  if (d.columns.contains(c)) {
    // column exists; skip it
    d
  } else {
    // column is not available so add it
    d.withColumn(c, lit(0))
  }
}

scala> columnsAdded.printSchema
root
|-- Col1: integer (nullable = false)
|-- Col2: integer (nullable = false)
|-- Col3: integer (nullable = false)
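
The loop in the question has no visible effect because withColumn returns a new DataFrame and leaves df unchanged, so each result is thrown away. The sketch below illustrates this without Spark: an immutable Map stands in for the DataFrame, and the names Frame, withColumn, addWithLoop and addWithFold are hypothetical helpers invented for this illustration, not Spark APIs.

```scala
// Plain-Scala stand-in: an immutable Map plays the role of the DataFrame.
// All names here are hypothetical helpers, not part of Spark.
object FoldLeftSketch {
  type Frame = Map[String, Int]

  // Returns a NEW Frame; the input is never modified (as with DataFrame.withColumn).
  def withColumn(frame: Frame, name: String, value: Int): Frame =
    frame + (name -> value)

  // Mirrors the loop in the question: every result is discarded.
  def addWithLoop(frame: Frame, wanted: Seq[String]): Frame = {
    for (c <- wanted)
      if (!frame.contains(c)) withColumn(frame, c, 0) // result thrown away
    frame
  }

  // foldLeft passes the updated Frame from one step to the next.
  def addWithFold(frame: Frame, wanted: Seq[String]): Frame =
    wanted.foldLeft(frame) { (acc, c) =>
      if (acc.contains(c)) acc else withColumn(acc, c, 0)
    }

  def main(args: Array[String]): Unit = {
    val start: Frame = Map("Col1" -> 7)
    val wanted = Seq("Col1", "Col2", "Col3")
    println(addWithLoop(start, wanted).keys.toSeq.sorted.mkString(", ")) // prints: Col1
    println(addWithFold(start, wanted).keys.toSeq.sorted.mkString(", ")) // prints: Col1, Col2, Col3
  }
}
```

The same reasoning applies to the Spark version: the value returned by foldLeft (columnsAdded above) is the one to select from, not the original df.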

Regarding "scala - How to add new columns to a DataFrame when their names are missing?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43468515/
