gpt4 book ai didi

apache-spark - PySpark:如何在 For 循环中附加数据帧

转载 作者:行者123 更新时间:2023-12-03 18:44:56 24 4
gpt4 key购买 nike

我正在对单个时间序列数据帧执行滚动中值计算,然后我想连接/附加结果。

# UDF for rolling median
median_udf = udf(lambda x: float(np.median(x)), FloatType())

series_list = ['0620', '5914']
SeriesAppend=[]

for item in series_list:
# Filter for select item
series = test_df.where(col("ID").isin([item]))
# Sort time series
series_sorted = series.sort(series.ID,
series.date).persist()
# Calculate rolling median
series_sorted = series_sorted.withColumn("list",
collect_list("metric").over(w)) \
.withColumn("rolling_median", median_udf("list"))

SeriesAppend.append(series_sorted)

SeriesAppend

[DataFrame[ntwrk_genre_cd: string, date: date, mkt_cd: string, syscode: string, ntwrk_cd: string, syscode_ntwrk: string, metric: double, list: array, rolling_median: float], DataFrame[ntwrk_genre_cd: string, date: date, mkt_cd:字符串,syscode:字符串,ntwrk_cd:字符串,syscode_ntwrk:字符串,度量: double ,列表:数组,rolling_median:float]]

当我尝试 .show() 时:
'list' object has no attribute 'show'
Traceback (most recent call last):
AttributeError: 'list' object has no attribute 'show'

我意识到这是说对象是数据帧列表。如何转换为单个数据帧?

我知道以下解决方案适用于明确数量的数据帧,但我希望我的 for 循环与数据帧的数量无关:
from functools import reduce
from pyspark.sql import DataFrame

dfs = [df1,df2,df3]
df = reduce(DataFrame.unionAll, dfs)

有没有办法将其推广到非显式数据帧名称?

最佳答案

谢谢大家!总结一下 - 解决方案使用 Reduce 和 unionAll:

SeriesAppend=[]

for item in series_list:
# Filter for select item
series = test_df.where(col("ID").isin([item]))
# Sort time series
series_sorted = series.sort(series.ID,
series.date).persist()
# Calculate rolling median
series_sorted = series_sorted.withColumn("list",
collect_list("metric").over(w)) \
.withColumn("rolling_median", median_udf("list"))

SeriesAppend.append(series_sorted)

df_series = reduce(DataFrame.unionAll, SeriesAppend)

关于apache-spark - PySpark:如何在 For 循环中附加数据帧,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/56363561/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com