
python - A better way to create a new column based on the values of other columns

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:48:05

What is a better way to create the same column as the code below does?

col_new = []
for r1 in df['col_A']:
    if r1 == 1:
        for r2 in df['col_B']:
            if r2 != 'None':
                col_new.append('col_new')

df['col_new'] = col_new

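My DataFrame is large (120k × 22), and running the code above hangs the notebook. Is there a faster, more efficient way to create this column, so that it marks every non-null value of col_B where col_A is 1?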

Best answer

I think you need to create a boolean mask and then assign the value through DataFrame.loc:

mask = (df['col_A'] == 1) & (df['col_B']!='None')

# if the missing values are real None/NaN rather than the string 'None'
# mask = (df['col_A'] == 1) & (df['col_B'].notnull())
df.loc[mask, 'col_new'] = 'col_new'
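
To make the mask easier to follow, here is a minimal sketch that builds it in two steps. The sample data and the names `is_one` and `has_value` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data, not taken from the question
df = pd.DataFrame({
    'col_A': [1, 1, 2, 1],
    'col_B': ['a', 'None', 'b', 'c'],
})

is_one = df['col_A'] == 1          # True where col_A equals 1
has_value = df['col_B'] != 'None'  # True where col_B is not the string 'None'

# & combines the two boolean Series element by element. When the
# comparisons are written inline, each needs its own parentheses,
# because & binds more tightly than == and !=.
mask = is_one & has_value

print(mask.tolist())
```

Because each comparison runs over the whole column at once, no Python-level loop over the rows is needed, which is what makes this approach practical on a large DataFrame.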

Example:

If the column contains the string 'None':

import pandas as pd

df = pd.DataFrame({
    'col_A': [1, 1, 2, 1],
    'col_B': ['a', 'None', 'None', 'a']
})
print(df)
   col_A col_B
0      1     a
1      1  None
2      2  None
3      1     a

mask = (df['col_A'] == 1) & (df['col_B']!='None')
df.loc[mask, 'col_new'] = 'val'
print(df)
   col_A col_B col_new
0      1     a     val
1      1  None     NaN
2      2  None     NaN
3      1     a     val

If the column contains real None values rather than strings, use Series.notna:

df = pd.DataFrame({
    'col_A': [1, 1, 2, 1],
    'col_B': ['a', None, None, 'a']
})
print(df)
   col_A col_B
0      1     a
1      1  None
2      2  None
3      1     a

mask = (df['col_A'] == 1) & (df['col_B'].notna())
# older pandas versions
#mask = (df['col_A'] == 1) & (df['col_B'].notnull())
df.loc[mask, 'col_new'] = 'val'
print(df)
   col_A col_B col_new
0      1     a     val
1      1  None     NaN
2      2  None     NaN
3      1     a     val
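
If a column might mix both kinds of "empty" (the string 'None' and real missing values), the two checks can be combined. This is only a sketch under that assumption. The question does not say the data is mixed, and the sample rows below are invented.

```python
import pandas as pd

# Hypothetical data mixing the string 'None' with real missing values
df = pd.DataFrame({
    'col_A': [1, 1, 2, 1, 1],
    'col_B': ['a', 'None', None, None, 'b'],
})

# A row counts as having a value only if it is neither missing
# nor the literal string 'None'
has_value = df['col_B'].notna() & (df['col_B'] != 'None')
mask = (df['col_A'] == 1) & has_value

# Rows outside the mask are left as NaN in the new column
df.loc[mask, 'col_new'] = 'val'
print(df)
```

Only the first and last rows receive 'val' here. The others fail either the col_A test or one of the two emptiness checks.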

If you want if-else logic, numpy.where is really helpful:

import numpy as np

df['col_new'] = np.where(mask, 'val', 'another_val')
print(df)
   col_A col_B      col_new
0      1     a          val
1      1  None  another_val
2      2  None  another_val
3      1     a          val

Regarding python - a better way to create a new column based on the values of other columns, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52325230/
