
python - Replace values with their percentage of the row total

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:31:11

What is the most straightforward way to replace the values in certain columns with their percentage of the row total?

Example:

From this:

(image not preserved)

To this:

(image not preserved)

I tried this code:

cols=['h1', 'h2', 'h3', 'hn']
df[cols]=df[cols]/df['sum']

But this returns an error:

ValueError: Columns must be same length as key.

Besides, I don't think this is the best approach, because I may have far more than 4 columns.

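Best answer

Use DataFrame.div with axis=0, so that the sum column is aligned on the row index and each row is divided by its own total: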

cols=['h1', 'h2', 'h3', 'hn']
df[cols]=df[cols].div(df['sum'], axis=0)
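
To see why the plain `/` in the question fails, here is a minimal sketch on hypothetical toy data (two value columns instead of the question's four; the names `by_columns` and `by_rows` are only for illustration). A DataFrame divided by a Series aligns the Series index with the column labels by default, which is most likely what produces the extra columns behind the ValueError:

```python
import pandas as pd

# Hypothetical toy data, not taken from the question's images.
df = pd.DataFrame({"h1": [1, 2], "h2": [3, 6]})
df["sum"] = df[["h1", "h2"]].sum(axis=1)
cols = ["h1", "h2"]

# Plain "/" matches the index of df["sum"] (0, 1) against the column
# labels (h1, h2). Nothing matches, so the result has the union of both
# label sets as columns and contains only NaN.
by_columns = df[cols] / df["sum"]
print(by_columns.shape)

# div(..., axis=0) matches on the row index instead, so each row is
# divided by its own total.
by_rows = df[cols].div(df["sum"], axis=0)
print(by_rows)
```

Assigning `by_columns` back to `df[cols]` would mean writing more columns than there are keys, which is consistent with the error message in the question.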

If sum is the last column, you can use:

df.iloc[:, :-1]=df.iloc[:, :-1].div(df['sum'], axis=0)
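
If sum is not the last column, or there are too many columns to list by hand, one possible variation is to select every column except sum by label. This is only a sketch on hypothetical data, and `value_cols` is a name introduced here for illustration:

```python
import pandas as pd

# Hypothetical data with the total column in first position.
df = pd.DataFrame({"sum": [4, 8], "h1": [1, 2], "h2": [3, 6]})

# All column labels except "sum", regardless of where "sum" sits.
value_cols = df.columns.drop("sum")

# Same row-wise division as above, applied to the selected columns.
df[value_cols] = df[value_cols].div(df["sum"], axis=0)
print(df)
```

This keeps the code independent of the column count and of the position of the total column.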

Example:

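import pandas as pd

# Sample data: four value columns
df = pd.DataFrame({
    'h1': [4, 5, 4, 5, 5, 4],
    'h2': [7, 8, 9, 4, 2, 3],
    'h3': [1, 3, 5, 7, 1, 0],
    'hn': [4, 5, 4, 5, 5, 4],
})
# Row totals, appended as the last column
df['sum'] = df.sum(axis=1)

# Divide every column except the last one by the row total
df.iloc[:, :-1] = df.iloc[:, :-1].div(df['sum'], axis=0)
print(df)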
         h1        h2        h3        hn  sum
0  0.250000  0.437500  0.062500  0.250000   16
1  0.238095  0.380952  0.142857  0.238095   21
2  0.181818  0.409091  0.227273  0.181818   22
3  0.238095  0.190476  0.333333  0.238095   21
4  0.384615  0.153846  0.076923  0.384615   13
5  0.363636  0.272727  0.000000  0.363636   11
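
The output above holds fractions between 0 and 1. If values on a 0 to 100 scale are wanted, a possible follow-up is sketched below on the same sample data. The check that each row adds up to 1 and the rounding to two decimals are assumptions added here, not part of the original answer:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'h1': [4, 5, 4, 5, 5, 4],
    'h2': [7, 8, 9, 4, 2, 3],
    'h3': [1, 3, 5, 7, 1, 0],
    'hn': [4, 5, 4, 5, 5, 4],
})
df['sum'] = df.sum(axis=1)

cols = df.columns.drop('sum')
shares = df[cols].div(df['sum'], axis=0)

# Sanity check: the shares in every row should add up to 1.
assert np.allclose(shares.sum(axis=1), 1.0)

# Scale to percentages; rounding to 2 decimals is an arbitrary choice.
percent = shares.mul(100).round(2)
print(percent)
```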

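Performance:

import numpy as np
import pandas as pd

np.random.seed(2019)

# 10,000 rows by 20 integer columns named h0 ... h19, plus the row total
N = 10000
df = pd.DataFrame(np.random.randint(100, size=(N, 20))).add_prefix('h')
df['sum'] = df.sum(axis=1)
print(df)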

In [220]: %%timeit
...: df.iloc[:, :-1]=df.iloc[:, :-1].div(df['sum'], axis=0)
...:
8.03 ms ± 1.05 ms per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [221]: %%timeit
...: for col in df.columns[:-1]:
...:     df[col] /= df["sum"]
...:
9.46 ms ± 168 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

In [222]: %%timeit
...: df.iloc[:,:-1] = df.iloc[:,:-1].apply(lambda x: x/sum(x), axis=1)
...:
2.51 s ± 194 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
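
The timings above come from an IPython session and overwrite df on every repetition. For anyone who wants to rerun the comparison as a plain script, here is a rough sketch using the standard timeit module. The helper names `with_div` and `with_loop`, the use of a fresh copy per call, and the repetition count are assumptions made here; the copy adds overhead to both variants, so absolute numbers will differ from the ones quoted:

```python
import timeit

import numpy as np
import pandas as pd

np.random.seed(2019)
N = 10000
base = pd.DataFrame(np.random.randint(100, size=(N, 20))).add_prefix("h")
base["sum"] = base.sum(axis=1)
cols = base.columns[:-1]


def with_div():
    # Vectorised row-wise division on a fresh copy.
    df = base.copy()
    df[cols] = df[cols].div(df["sum"], axis=0)
    return df


def with_loop():
    # Column-by-column division on a fresh copy.
    df = base.copy()
    for col in cols:
        df[col] = df[col] / df["sum"]
    return df


runs = 10
t_div = timeit.timeit(with_div, number=runs)
t_loop = timeit.timeit(with_loop, number=runs)
print(f"div:  {t_div / runs * 1e3:.2f} ms per call")
print(f"loop: {t_loop / runs * 1e3:.2f} ms per call")
```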

Regarding "python - Replace values with their percentage of the row total", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55686637/
