
python - Apply binning with different bin sizes to all DataFrame columns

Reposted. Author: 行者123. Updated: 2023-12-04 10:27:32

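I have a trivial question. I have a very large df with many columns, and I am trying to find the most efficient way to bin all of the columns, each with a different bin size, and create a new df from the result. Here is an example that bins only a single column: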

import numpy as np
import pandas as pd
df = pd.DataFrame(np.random.randint(0,20,size=(5, 4)), columns=list('ABCD'))
newDF = pd.cut(df.A, 2, precision=0)
newDF
0    (9.0, 18.0]
1    (-0.0, 9.0]
2    (-0.0, 9.0]
3    (-0.0, 9.0]
4    (9.0, 18.0]
Name: A, dtype: category
Categories (2, interval[float64]): [(-0.0, 9.0] < (9.0, 18.0]]

Best answer

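If you want to bin each column separately, so that each column gets bin edges computed from its own value range, use DataFrame.apply: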

df = pd.DataFrame(np.random.randint(0,20,size=(5, 4)), columns=list('ABCD'))
newDF = df.apply(lambda x: pd.cut(x, 2, precision=0))
print(newDF)
            A            B             C             D
0  (2.0, 4.0]  (8.0, 15.0]   (7.0, 13.0]  (12.0, 18.0]
1  (2.0, 4.0]  (8.0, 15.0]   (7.0, 13.0]  (12.0, 18.0]
2  (4.0, 7.0]  (8.0, 15.0]  (13.0, 19.0]  (12.0, 18.0]
3  (4.0, 7.0]  (8.0, 15.0]   (7.0, 13.0]   (5.0, 12.0]
4  (4.0, 7.0]   (1.0, 8.0]   (7.0, 13.0]   (5.0, 12.0]
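
The example above uses 2 bins for every column. If each column needs its own number of bins, one possible sketch is to look the count up by column name inside the lambda. The bins_per_column mapping and its values are hypothetical and only serve as an illustration; the fixed seed is there only to make the run reproducible.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 20, size=(5, 4)), columns=list("ABCD"))

# Hypothetical mapping: column name -> number of equal-width bins for that column.
bins_per_column = {"A": 2, "B": 3, "C": 4, "D": 5}

# DataFrame.apply passes each column as a Series; its .name selects the bin count.
newDF = df.apply(lambda col: pd.cut(col, bins_per_column[col.name], precision=0))
print(newDF)
```

pd.cut also accepts a sequence of explicit bin edges instead of an integer, so the same mapping could hold edge lists if the bin widths themselves need to differ.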

If you want to bin all columns with the same bins, use DataFrame.stack to get a Series with a MultiIndex, apply cut, and reshape back with Series.unstack:

newDF = pd.cut(df.stack(), 2, precision=0).unstack()
print(newDF)
              A             B             C             D
0  (10.0, 19.0]  (10.0, 19.0]  (10.0, 19.0]  (-0.0, 10.0]
1  (10.0, 19.0]  (10.0, 19.0]  (-0.0, 10.0]  (-0.0, 10.0]
2  (-0.0, 10.0]  (10.0, 19.0]  (-0.0, 10.0]  (-0.0, 10.0]
3  (-0.0, 10.0]  (-0.0, 10.0]  (10.0, 19.0]  (-0.0, 10.0]
4  (10.0, 19.0]  (10.0, 19.0]  (-0.0, 10.0]  (-0.0, 10.0]
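
To see which edges the shared binning actually used, pd.cut can return them via retbins=True. The following is a minimal sketch of the same stack / cut / unstack steps; the names binned and edges are chosen here for illustration, and the seed is arbitrary.

```python
import numpy as np
import pandas as pd

# Fixed seed so the example is reproducible (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 20, size=(5, 4)), columns=list("ABCD"))

# Stack all columns into one long Series, cut it once, and also return
# the bin edges that pd.cut computed from the overall min and max.
binned, edges = pd.cut(df.stack(), 2, retbins=True)

# Reshape back to the original rows x columns layout.
newDF = binned.unstack()

print(edges)   # 3 edges define the 2 bins shared by every column
print(newDF)
```

Because the cut runs once over the stacked values, every column ends up with the same two intervals, unlike the per-column apply version.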

Regarding "python - Apply binning with different bin sizes to all DataFrame columns", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/60566534/
