python - How do I divide two DataFrames of different shapes with Pandas?

Repost. Author: 行者123. Updated: 2023-12-01 01:45:34

I have two DataFrames with the same index but different shapes, and I cannot divide the columns of df1 by the column in df2.

The expected result is df1 / df2.

df1.head()
volume volume volume volume \
timestamp
2016-07-24 00:00:00+00:00 NaN NaN NaN NaN
2016-07-25 00:00:00+00:00 NaN NaN NaN NaN
2016-07-26 00:00:00+00:00 NaN NaN NaN 102720.829507
2016-07-27 00:00:00+00:00 NaN NaN 3.729644e+05 398346.509801
2016-07-28 00:00:00+00:00 NaN NaN 1.326648e+06 244165.794698

volume volume volume volume
timestamp
2016-07-24 00:00:00+00:00 NaN NaN NaN 1.734943e+07
2016-07-25 00:00:00+00:00 NaN NaN NaN 1.365341e+07
2016-07-26 00:00:00+00:00 NaN NaN NaN 5.199938e+07
2016-07-27 00:00:00+00:00 NaN 2.471076e+06 NaN 2.558753e+07
2016-07-28 00:00:00+00:00 NaN 1.642990e+06 NaN 3.118785e+06

df2.head()

timestamp
2016-07-24 00:00:00+00:00 1.734943e+07
2016-07-25 00:00:00+00:00 1.365341e+07
2016-07-26 00:00:00+00:00 5.210210e+07
2016-07-27 00:00:00+00:00 2.882991e+07
2016-07-28 00:00:00+00:00 6.332589e+06
Freq: D, dtype: float64

df1.shape
Out[2126]: (723, 8)

df2.shape
Out[2127]: (723,)

df1.divide(df2, axis= 'index')
ValueError: operands could not be broadcast together with shapes (5784,) (723,)

The two objects have different structures but the same index.

type(df1)
Out[2143]: pandas.core.frame.DataFrame

type(df2)
Out[2144]: pandas.core.series.Series

I read that I need to reshape one of them, so I tried this:

df1.divide(df2.reshape(723,1), axis= 'index')

But it returns an error:

ValueError: Unable to coerce to DataFrame, shape must be (723, 8): given (723, 1)

When I convert df2 with pd.DataFrame(df2), it throws an error:

TypeError: '<' not supported between instances of 'str' and 'int' 

What am I missing, and how can I do this?

Best answer

Try this approach. I used a simple example; let me know if it doesn't work.

import pandas as pd
import numpy as np
from IPython.display import display, HTML

CSS = """
.output {
flex-direction: row;
}
"""

HTML('<style>{}</style>'.format(CSS))


data1 = {"a":[1.,7.,12.],
"b":[4.,8.,3.],
"c":[5.,45.,67.]}
data2 = {"a":[3.],
"b":[2.],
"c":[8.]}

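df1 = pd.DataFrame(data1)
df2 = pd.DataFrame(data2)
# Turn the single row into a single column (labelled 0) and replace the
# a/b/c row labels with 0, 1, 2 so the rows line up with df1's index.
df2 = df2.T.reset_index(drop=True)
display(df1)
display(df2)
# The part you need: divide each row of df1 by the matching value in df2[0].
display(df1.truediv(df2[0], axis=0))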


a b c
0 1.0 4.0 5.0
1 7.0 8.0 45.0
2 12.0 3.0 67.0

0
0 3.0
1 2.0
2 8.0

a b c
0 0.333333 1.333333 1.666667
1 3.500000 4.000000 22.500000
2 1.500000 0.375000 8.375000
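
The example above uses a plain integer index. A minimal sketch closer to the question's setup follows, assuming a timezone-aware daily index shared by a DataFrame and a Series. The names `idx`, `frame`, `totals`, `ratio` and `ratio_np` and all the values are made up for illustration, and the columns get distinct names here, whereas the question's frame repeats "volume". In the question, 5784 = 723 × 8, so the first error message reports the size of the whole frame against the length of the Series. The sketch does not try to reproduce that error; it only shows two ways to do the row-wise division.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data: a daily, timezone-aware index like the question's.
idx = pd.date_range("2016-07-24", periods=3, freq="D", tz="UTC", name="timestamp")

# Distinct column names are an assumption made for clarity.
frame = pd.DataFrame(
    {"x": [2.0, 6.0, np.nan], "y": [4.0, 9.0, 10.0]},
    index=idx,
)
totals = pd.Series([2.0, 3.0, 5.0], index=idx)

# Label-based: align the Series on the frame's index, then divide each row
# by its matching value.
ratio = frame.div(totals, axis="index")

# Position-based fallback: a (3,) array reshaped to (3, 1) broadcasts across
# the columns. Labels are ignored, so both objects must share the row order.
ratio_np = pd.DataFrame(
    frame.to_numpy() / totals.to_numpy()[:, None],
    index=frame.index,
    columns=frame.columns,
)

print(ratio)
```

The label-based form is usually the safer choice because pandas matches rows by index label. The NumPy form may help when column labels are duplicated or otherwise awkward, at the cost of relying purely on row position.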

Regarding "python - How do I divide two DataFrames of different shapes with Pandas?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51386157/
