
python - How to stack a DataFrame based on two columns

Reposted. Author: 行者123. Updated: 2023-12-04 02:25:16

I have the following DataFrame (test_df), and I would like to stack the quote.BTC.* and quote.USD.* columns on top of each other and assign a label to each group.

{'time_open': {0: '2021-07-02T00:00:00.000Z',
1: '2021-07-03T00:00:00.000Z',
2: '2021-07-04T00:00:00.000Z'},
'time_close': {0: '2021-07-02T23:59:59.999Z',
1: '2021-07-03T23:59:59.999Z',
2: '2021-07-04T23:59:59.999Z'},
'quote.BTC.open': {0: 0.000531186983930284,
1: 0.0005377264271786289,
2: 0.0005618907288594747},
'quote.BTC.close': {0: 0.0005381508670811756,
1: 0.0005631835764711054,
2: 0.0005886421482917653},
'quote.USD.open': {0: 17.83307192, 1: 18.22733883, 2: 19.47993593},
'quote.USD.close': {0: 18.24172609, 1: 19.52475708, 2: 20.77187449}}

The output should look like this: example_output

I managed to do it with the code below, but it looks very clumsy and is not very general:

import pandas as pd

df_list = []
for asset in ["BTC", "USD"]:
    base_cols = ['time_open', 'time_close']
    # define list of columns I
    asset_cols = [
        f'quote.{asset}.open',
        f'quote.{asset}.close']
    base_cols.extend(asset_cols)

    # define dict of what col names should be renamed to
    col_dict = {
        f'quote.{asset}.open': 'open',
        f'quote.{asset}.close': 'close'}

    # select the shared columns plus this asset's columns, then label the rows
    df_temp = test_df[base_cols].rename(columns=col_dict)
    df_temp["quote_asset"] = asset
    df_list.append(df_temp)
print(pd.concat(df_list))

Is there a better way to do this?

Best answer

A simple pandas solution

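  • Set the index to time_open and time_close
  • Split the column names around the delimiter . and convert them to a MultiIndex by passing the optional parameter expand=True
  • Drop the unused level, then stack on level=0 to reshape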
s = df.set_index(['time_open', 'time_close'])
s.columns = s.columns.str.split('.', expand=True)
s = s.droplevel(0, axis=1).stack(0)

Result

print(s)
                                                           close       open
time_open                time_close
2021-07-02T00:00:00.000Z 2021-07-02T23:59:59.999Z BTC   0.000538   0.000531
                                                  USD  18.241726  17.833072
2021-07-03T00:00:00.000Z 2021-07-03T23:59:59.999Z BTC   0.000563   0.000538
                                                  USD  19.524757  18.227339
2021-07-04T00:00:00.000Z 2021-07-04T23:59:59.999Z BTC   0.000589   0.000562
                                                  USD  20.771874  19.479936
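
The three steps above leave the asset label in an unnamed index level. As a minimal sketch, assuming the flat layout with a quote_asset column from the question is what you want, the snippet below rebuilds the sample data, applies the same steps, then names that level and resets the index. The variable name flat is only illustrative, and the order of the open and close columns may differ between pandas versions.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'time_open': ['2021-07-02T00:00:00.000Z',
                  '2021-07-03T00:00:00.000Z',
                  '2021-07-04T00:00:00.000Z'],
    'time_close': ['2021-07-02T23:59:59.999Z',
                   '2021-07-03T23:59:59.999Z',
                   '2021-07-04T23:59:59.999Z'],
    'quote.BTC.open': [0.000531186983930284,
                       0.0005377264271786289,
                       0.0005618907288594747],
    'quote.BTC.close': [0.0005381508670811756,
                        0.0005631835764711054,
                        0.0005886421482917653],
    'quote.USD.open': [17.83307192, 18.22733883, 19.47993593],
    'quote.USD.close': [18.24172609, 19.52475708, 20.77187449],
})

# Step 1: keep the shared columns out of the reshape by moving them to the index.
s = df.set_index(['time_open', 'time_close'])

# Step 2: 'quote.BTC.open' becomes the tuple ('quote', 'BTC', 'open').
s.columns = s.columns.str.split('.', expand=True)

# Step 3: drop the constant 'quote' level, then move the asset level into the index.
s = s.droplevel(0, axis=1).stack(0)

# Name the new index level (quote_asset is the label used in the question)
# and turn all index levels back into ordinary columns.
flat = (
    s.rename_axis(index=['time_open', 'time_close', 'quote_asset'])
     .reset_index()
)
print(flat)
```

Because the asset names are read from the column labels, this should pick up any additional quote.<asset>.* columns without listing them by hand.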

An alternative approach using pd.wide_to_long

# rename 'quote.BTC.open' to 'open_BTC' etc.; regex=True is required on pandas >= 2.0
df.columns = df.columns.str.replace(r'quote\.(.*?)\.(.*)', r'\2_\1', regex=True)
pd.wide_to_long(df, i=['time_open', 'time_close'], j='quote',
                stubnames=['open', 'close'], sep='_', suffix=r'\w+')

Result

                                                              open      close
time_open                time_close               quote
2021-07-02T00:00:00.000Z 2021-07-02T23:59:59.999Z BTC     0.000531   0.000538
                                                  USD    17.833072  18.241726
2021-07-03T00:00:00.000Z 2021-07-03T23:59:59.999Z BTC     0.000538   0.000563
                                                  USD    18.227339  19.524757
2021-07-04T00:00:00.000Z 2021-07-04T23:59:59.999Z BTC     0.000562   0.000589
                                                  USD    19.479936  20.771874
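
For completeness, here is a self-contained sketch of the pd.wide_to_long route that can be run as is. It works on a copy so the original column names are left untouched, and it uses quote_asset (the label from the question) for j; the names wide and long_df are illustrative only. The suffix pattern matters because the default suffix for pd.wide_to_long matches digits only, so textual suffixes such as BTC need r'\w+'.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'time_open': ['2021-07-02T00:00:00.000Z',
                  '2021-07-03T00:00:00.000Z',
                  '2021-07-04T00:00:00.000Z'],
    'time_close': ['2021-07-02T23:59:59.999Z',
                   '2021-07-03T23:59:59.999Z',
                   '2021-07-04T23:59:59.999Z'],
    'quote.BTC.open': [0.000531186983930284,
                       0.0005377264271786289,
                       0.0005618907288594747],
    'quote.BTC.close': [0.0005381508670811756,
                        0.0005631835764711054,
                        0.0005886421482917653],
    'quote.USD.open': [17.83307192, 18.22733883, 19.47993593],
    'quote.USD.close': [18.24172609, 19.52475708, 20.77187449],
})

# Rename 'quote.<asset>.<field>' to '<field>_<asset>' on a copy,
# e.g. 'quote.BTC.open' -> 'open_BTC'. Other columns are left unchanged.
wide = df.copy()
wide.columns = wide.columns.str.replace(
    r'quote\.(.*?)\.(.*)', r'\2_\1', regex=True
)

# 'open' and 'close' are the stubs; whatever follows the '_' becomes quote_asset.
long_df = pd.wide_to_long(
    wide,
    stubnames=['open', 'close'],
    i=['time_open', 'time_close'],
    j='quote_asset',
    sep='_',
    suffix=r'\w+',
).reset_index()
print(long_df)
```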

Regarding "python - How to stack a DataFrame based on two columns", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68258062/
