
How to get a timestamp data type column without the seconds in PySpark?

Reposted. Author: bug小助手. Updated: 2023-10-25 18:33:11



I have a timestamp column, initially stored as a string:


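from pyspark.sql.functions import col, date_format, to_timestamp
from pyspark.sql.types import IntegerType, StringType, StructField, StructType

# Assumes an active SparkSession named `spark`
data = [(1, '2023-01-22 09:00'), (2, '2023-09-11 00:09')]

schema = StructType([StructField("id", IntegerType(), False), StructField("ts", StringType(), True)])

main_df = spark.createDataFrame(data, schema)

main_df.printSchema()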

root
|-- id: integer (nullable = false)
|-- ts: string (nullable = true)

# Do not chain .show() here: it returns None, so main_df2 would not be a DataFrame
main_df2 = main_df.withColumn('ts', date_format(to_timestamp(col('ts'), "yyyy-MM-dd HH:mm"), "yyyy-MM-dd HH:mm").cast("timestamp"))

main_df2.printSchema()

root
|-- id: integer (nullable = false)
|-- ts: timestamp (nullable = true)

main_df2.show()

+---+-------------------+
| id|                 ts|
+---+-------------------+
|  1|2023-01-22 09:00:00|
|  2|2023-09-11 00:09:00|
+---+-------------------+

Is it possible in PySpark to have a timestamp data type column without the seconds, like yyyy-MM-dd HH:mm?


Desired Output


+---+----------------+
| id|              ts|
+---+----------------+
|  1|2023-01-22 09:00|
|  2|2023-09-11 00:09|
+---+----------------+

root
|-- id: integer (nullable = false)
|-- ts: timestamp (nullable = true)


Thanks in advance


More replies

In Spark, a timestamp column is always rendered as yyyy-MM-dd HH:mm:ss (plus fractional seconds when present). Any other format has to be held as a string.

Recommended answer

You don't need .cast("timestamp") after date_format. Remove it and you'll get the desired output, although the column is then a string:


main_df.withColumn('ts', date_format(to_timestamp(col('ts'), "yyyy-MM-dd HH:mm"), "yyyy-MM-dd HH:mm")).show()

+---+----------------+
| id|              ts|
+---+----------------+
|  1|2023-01-22 09:00|
|  2|2023-09-11 00:09|
+---+----------------+

More replies

Thanks for your help. But with your solution the ts column will be of string type, and I want timestamp type.

By definition, the timestamp type includes seconds and fractional seconds (Spark stores them down to microseconds). show() just renders that data.
