
scala - How do I get the week-end date from a week number in a Spark Scala DataFrame

Reposted. Author: 行者123. Updated: 2023-12-04 17:15:09

Per my business logic, each week starts on Monday and ends on Sunday.

I want to derive the week-end (Sunday) date from the week number. Some years have 53 weeks, and only week 53 fails.

The expected value of dsupp_trans_dt is 2021-01-03,

but with the code below it comes out null.

scala> case class Data(id: Int, weekNumber: String)
defined class Data

scala> var stgDF = Seq(Data(100,"53/2020")).toDF()
stgDF: org.apache.spark.sql.DataFrame = [id: int, weekNumber: string]

scala> val weekNumber = "53/2020"
weekNumber: String = 53/2020

scala> val monthIdNum = "202001"
monthIdNum: String = 202001

scala> val isLeapYearFunc = (year: Int) => (((year % 4) == 0) && !(
| ((year % 100) == 0) &&
| !((year % 400) == 0))
| )
isLeapYearFunc: Int => Boolean = <function1>

scala> val isLeapYear = isLeapYearFunc(monthIdNum.substring(0,4).toInt)
isLeapYear: Boolean = true
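As a plain-Scala aside (not part of the original post), the leap-year lambda above can be sanity-checked against a few known years; it encodes the standard Gregorian rule:

```scala
// Same Gregorian leap-year rule as the REPL lambda above:
// divisible by 4, except century years that are not divisible by 400.
val isLeapYearFunc = (year: Int) =>
  (year % 4 == 0) && !((year % 100 == 0) && !(year % 400 == 0))

// 2020 and 2000 are leap years; 1900 and 2021 are not.
val checks = Seq(2020 -> true, 2000 -> true, 1900 -> false, 2021 -> false)
checks.foreach { case (y, expected) => assert(isLeapYearFunc(y) == expected) }
```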

scala> val kafkaFilePeriod = "2053"
kafkaFilePeriod: String = 2053

scala> stgDF = stgDF.withColumn("year_week_number",lit(weekNumber)).withColumn("is_leap_year",lit(isLeapYear)).withColumn("dsupp_trans_dt",
| when (col("is_leap_year") === true ,date_add(to_date(col("year_week_number"), "w/yyyy"),7)).otherwise(date_add(to_date(col("year_week_number"), "w/yyyy"),14)))
stgDF: org.apache.spark.sql.DataFrame = [id: int, weekNumber: string ... 3 more fields]

scala> stgDF.show(10,false)

+---+----------+----------------+------------+--------------+
|id |weekNumber|year_week_number|is_leap_year|dsupp_trans_dt|
+---+----------+----------------+------------+--------------+
|100|53/2020   |53/2020         |true        |null          |
+---+----------+----------------+------------+--------------+

Whereas the same code works for week 52:

scala> val weekNumber = "52/2020"
weekNumber: String = 52/2020

scala> stgDF = stgDF.withColumn("year_week_number",lit(weekNumber)).withColumn("is_leap_year",lit(isLeapYear)).withColumn("dsupp_trans_dt",
| when (col("is_leap_year") === true ,date_add(to_date(col("year_week_number"), "w/yyyy"),7)).otherwise(date_add(to_date(col("year_week_number"), "w/yyyy"),14)))
stgDF: org.apache.spark.sql.DataFrame = [id: int, weekNumber: string ... 3 more fields]

scala> stgDF.show
+---+----------+----------------+------------+--------------+
| id|weekNumber|year_week_number|is_leap_year|dsupp_trans_dt|
+---+----------+----------------+------------+--------------+
|100|   53/2020|         52/2020|        true|    2020-12-27|
+---+----------+----------------+------------+--------------+

Best Answer

You can use a user-defined function together with the new Java time API.

First, create a function that converts a string representing a week (e.g. 53/2020) into the date of that week's Sunday:

import java.time.LocalDate
import java.time.format.DateTimeFormatter

val toWeekDate = (weekNumber: String) => {
  LocalDate.parse("7/" + weekNumber, DateTimeFormatter.ofPattern("e/w/YYYY"))
}

where the elements of the date pattern are (see DateTimeFormatter's documentation for more details):

  • e stands for the day of the week (1 for Monday, 7 for Sunday)
  • w is the week of the year
  • YYYY is the week-based year: for instance, 01/01/2021 belongs to week-based year 2020, because it falls in week 53 of 2020.
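Note that the localized pattern letters above depend on the JVM's default locale. As an illustrative, locale-independent cross-check (not part of the original answer), the built-in ISO week-date formatter confirms that ISO week 53 of 2020 ends on Sunday 2021-01-03:

```scala
import java.time.LocalDate
import java.time.format.DateTimeFormatter

// ISO_WEEK_DATE parses "YYYY-'W'ww-e" with ISO weeks (Monday = 1 ... Sunday = 7),
// independently of the default locale.
val sunday53 = LocalDate.parse("2020-W53-7", DateTimeFormatter.ISO_WEEK_DATE)
val sunday52 = LocalDate.parse("2020-W52-7", DateTimeFormatter.ISO_WEEK_DATE)
// sunday53 is 2021-01-03; sunday52 is 2020-12-27.
```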

Then wrap it into a user-defined function and register it with your Spark context:

import org.apache.spark.sql.functions.udf

val to_week_date = udf(toWeekDate)
spark.udf.register("to_week_date", to_week_date)

Finally, you can use the user-defined function when creating the new column:

import org.apache.spark.sql.functions.{col, lit}

val weekNumber = "53/2020"

stgDF
  .withColumn("year_week_number", lit(weekNumber))
  .withColumn("dsupp_trans_dt", to_week_date(col("year_week_number")))

Which gives the following result:

+---+----------+----------------+--------------+
|id |weekNumber|year_week_number|dsupp_trans_dt|
+---+----------+----------------+--------------+
|100|53/2020   |53/2020         |2021-01-03    |
+---+----------+----------------+--------------+
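If the locale sensitivity of the e/w/YYYY pattern is a concern, a hypothetical variant of toWeekDate (an assumption sketched here, not part of the original answer) can pin the formatter to ISO week fields explicitly, so Monday-to-Sunday weeks hold regardless of the JVM's default locale:

```scala
import java.time.LocalDate
import java.time.format.{DateTimeFormatterBuilder, SignStyle}
import java.time.temporal.WeekFields

// Build a formatter for "ww/YYYY" over ISO week fields (Monday = 1 ... Sunday = 7),
// defaulting the day of week to 7 (Sunday) so parsing yields the week-end date.
val isoWeekFormatter = new DateTimeFormatterBuilder()
  .appendValue(WeekFields.ISO.weekOfWeekBasedYear(), 1, 2, SignStyle.NOT_NEGATIVE)
  .appendLiteral('/')
  .appendValue(WeekFields.ISO.weekBasedYear(), 4)
  .parseDefaulting(WeekFields.ISO.dayOfWeek(), 7)
  .toFormatter()

val toWeekEndDate = (weekNumber: String) => LocalDate.parse(weekNumber, isoWeekFormatter)
```

This function could then be wrapped in a udf and registered exactly like to_week_date above.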

Regarding "scala - How do I get the week-end date from a week number in a Spark Scala DataFrame", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68859496/
