
python - Adding random noise to a DataFrame

Repost. Author: 行者123. Updated: 2023-12-01 06:23:16

I have a DataFrame containing data like this:

      0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17   18   19   ...  309  310  311  312  313  314  315  316  317  318  319  320  321  322  323  324  325  326  327  328
0 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 84 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
2 50 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

The shape of df is (10000, 329).

I want to set a random 5% of the 1s in the DataFrame to 0.

Is this possible?

Best answer

Try this:

import numpy as np

# Get all columns from 1 to 328 and stack them into a temp series
tmp = df.loc[:, 1:].stack()

# Get the 1s
ones = tmp[tmp == 1].values.astype('int8')

# Mix with 5% zeros. You can use ceil or floor here
# as long as it makes an integer
n_zero = np.ceil(ones.shape[0] * .05).astype('int')

# Make the 0s
zeros = np.zeros(n_zero, dtype='int8')

# Replace 5% of the 1s with 0s and shuffle them
noise = np.concatenate((ones[n_zero:], zeros))
np.random.shuffle(noise)

# Assign the noise back to `tmp`
tmp.loc[tmp == 1] = noise

# Assign the noise back to the original frame
df.loc[:, 1:] = tmp.unstack()
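
For comparison, here is a minimal sketch of a different route to the same result. It is not the answer's method: instead of stacking and shuffling, it samples the positions of the 1s directly with NumPy. The helper name `flip_ones`, its parameters, and the synthetic `demo` frame are all made up for illustration, and the sketch assumes every column is numeric, as in the question.

```python
import numpy as np
import pandas as pd


def flip_ones(df, frac=0.05, first_col=1, seed=None):
    """Return a copy of `df` with a random `frac` of its 1s set to 0.

    Only columns at position `first_col` and later are touched.
    Assumes every column is numeric, as in the question.
    """
    rng = np.random.default_rng(seed)
    values = df.to_numpy().copy()          # work on a copy, leave `df` alone
    block = values[:, first_col:]          # basic slice, so a view into `values`
    rows, cols = np.nonzero(block == 1)    # coordinates of every 1
    n_zero = int(np.ceil(rows.size * frac))
    pick = rng.choice(rows.size, size=n_zero, replace=False)
    block[rows[pick], cols[pick]] = 0      # writes through to `values`
    return pd.DataFrame(values, index=df.index, columns=df.columns)


# Small synthetic frame shaped like the one in the question (hypothetical data)
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.integers(0, 2, size=(200, 20)))
demo[0] = rng.integers(18, 90, size=200)   # column 0 holds non-binary values
noisy = flip_ones(demo, frac=0.05, seed=1)
```

Sampling without replacement means exactly `n_zero` distinct cells are changed, and passing a `seed` makes the noise reproducible.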

You can check that 5% of the 1s were replaced with 0s by comparing the sum of the frame before and after:

# Run this before and after the last line above to verify
df.loc[:, 1:].values.sum()
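
The sum only confirms how many 1s disappeared. If you also want to confirm that nothing else changed, a cell-by-cell comparison along these lines might help. This is a rough sketch: `summarize_noise` and the tiny `before`/`after` frames are hypothetical and only there to illustrate the check.

```python
import pandas as pd


def summarize_noise(before, after, first_col=1):
    """Compare two frames cell by cell, from column position `first_col` on."""
    b = before.iloc[:, first_col:].to_numpy()
    a = after.iloc[:, first_col:].to_numpy()
    flipped = (b == 1) & (a == 0)          # the intended 1 -> 0 changes
    changed = b != a                       # every cell that differs
    return {
        "ones_before": int((b == 1).sum()),
        "ones_after": int((a == 1).sum()),
        "flipped_1_to_0": int(flipped.sum()),
        "other_changes": int((changed & ~flipped).sum()),
    }


# Tiny hand-made example: one of the four 1s is turned into a 0
before = pd.DataFrame([[18, 1, 0, 1], [84, 1, 1, 0]])
after = before.copy()
after.iloc[0, 1] = 0
report = summarize_noise(before, after)
```

A correct run should show `other_changes` equal to 0 and `flipped_1_to_0` equal to the difference between the two counts of 1s.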

For more on python - adding random noise to a DataFrame, see the similar question on Stack Overflow: https://stackoverflow.com/questions/60272584/
