
python - Conditionally replace multiple columns based on column values in a pandas DataFrame

Reposted. Author: 太空狗. Updated: 2023-10-30 02:19:32

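I want to replace the values of several columns at once with the corresponding values from other columns, based on the values in the first set of columns (specifically, where one of those columns is blank). Here is an example of what I am trying to do: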

import pandas as pd

df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', '', 'b', '']})

df

#   a1 a2 b1 b2
# 0  m  q
# 1  n  r
# 2  o  s  a  b
# 3  p  t

I want to replace the '' values in b1 and b2 with the corresponding values from a1 and a2 in the rows where b1 is blank:

#   a1 a2 b1 b2
# 0  m  q  m  q
# 1  n  r  n  r
# 2  o  s  a  b
# 3  p  t  p  t

Here is my thought process (I am relatively new to pandas, so I am probably speaking with a heavy R accent here):

missing = (df.b1 == '')

# First thought:
df[missing, ['b1', 'b2']] = df[missing, ['a1', 'a2']]
# TypeError: 'Series' objects are mutable, thus they cannot be hashed

# Fair enough
df[tuple(missing), ('b1', 'b2')] = df[tuple(missing), ('a1', 'a2')]
# KeyError: ((True, True, False, True), ('a1', 'a2'))

# Obviously I'm going about this wrong. Maybe I need to use indexing?
df[['b1', 'b2']].ix[missing,:]
# b1 b2
# 0
# 1
# 3

# That looks right
df[['b1', 'b2']][missing, :] = df[['a1', 'a2']].ix[missing, :]
# TypeError: 'Series' objects are mutable, thus they cannot be hashed
# Deja vu

df[['b1', 'b2']].ix[tuple(missing), :] = df[['a1', 'a2']].ix[tuple(missing), :]
# ValueError: could not convert string to float:
# Uhh...

I can do it column by column:

# .ix was removed in pandas 1.0; .loc with a boolean mask does the same job here
df.loc[missing, 'b1'] = df.loc[missing, 'a1']
df.loc[missing, 'b2'] = df.loc[missing, 'a2']

...but I suspect there is a more idiomatic way to do this. Thoughts?

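Update: To clarify, I specifically want to know whether all the columns can be updated at the same time. For example, a hypothetical modification of Primer's answer (this does not work and produces NaNs, though I am not sure why):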

df.loc[missing, ['b1', 'b2']] = df.loc[missing, ['a1', 'a2']]

#   a1 a2   b1   b2
# 0  m  q  NaN  NaN
# 1  n  r  NaN  NaN
# 2  o  s    a    b
# 3  p  t  NaN  NaN
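
The NaNs most likely come from label alignment: when the right-hand side of a `.loc` assignment is a DataFrame, pandas matches it to the target by column name as well as by index, and a frame whose columns are `a1`/`a2` has nothing to offer for `b1`/`b2`. The sketch below rebuilds the example frame and shows that stripping the labels with `to_numpy()` makes the assignment positional. The names `aligned` and `fixed` are introduced here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', '', 'b', '']})
missing = (df['b1'] == '')

# DataFrame on the right-hand side: pandas aligns on column labels,
# finds no 'b1'/'b2' in the source, and fills the selected cells with NaN.
aligned = df.copy()
aligned.loc[missing, ['b1', 'b2']] = aligned.loc[missing, ['a1', 'a2']]

# NumPy array on the right-hand side: no labels to align,
# so the values are assigned by position.
fixed = df.copy()
fixed.loc[missing, ['b1', 'b2']] = fixed.loc[missing, ['a1', 'a2']].to_numpy()

print(fixed)
```

This keeps the single-statement form asked about in the update; the only change is that the source values are passed as an array rather than as a labelled frame.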

Best answer

How about

df[['b1', 'b2']] = df[['b1', 'b2']].where(df[['b1', 'b2']] != '', df[['a1', 'a2']].values)

which returns

  a1 a2 b1 b2
0  m  q  m  q
1  n  r  n  r
2  o  s  a  b
3  p  t  p  t
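
Note that the `where` call above tests each cell of `b1` and `b2` on its own, whereas the question asks for replacement in the rows where `b1` is blank. The two rules agree on the example data but can differ when `b2` holds a value in a row whose `b1` is blank. The following is a sketch of a row-wise variant, not part of the original answer; it uses `numpy.where` with a broadcast row mask, and the `'x'` entry in `b2` is hypothetical data added only to make the difference visible.

```python
import numpy as np
import pandas as pd

# Same frame as in the question, except for a hypothetical 'x' in b2 (row 1)
df = pd.DataFrame({'a1': ['m', 'n', 'o', 'p'],
                   'a2': ['q', 'r', 's', 't'],
                   'b1': ['', '', 'a', ''],
                   'b2': ['', 'x', 'b', '']})

# Cell-wise rule (as in the answer): each blank cell is filled independently
cellwise = df[['b1', 'b2']].where(df[['b1', 'b2']] != '',
                                  df[['a1', 'a2']].values)

# Row-wise rule: one decision per row, driven by b1 alone.
# The mask has shape (n, 1) and broadcasts across both target columns.
row_blank = df['b1'].eq('').to_numpy()[:, None]
rowwise = np.where(row_blank,
                   df[['a1', 'a2']].to_numpy(),
                   df[['b1', 'b2']].to_numpy())

result = df.copy()
result[['b1', 'b2']] = rowwise

print(cellwise['b2'].tolist())  # ['q', 'x', 'b', 't']
print(result['b2'].tolist())    # ['q', 'r', 'b', 't']
```

Which rule is appropriate depends on whether a non-blank `b2` should survive when `b1` is blank; the question's wording suggests the row-wise reading, but on the original data both give the same result.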

For more on python - Conditionally replace multiple columns based on column values in a pandas DataFrame, see the similar question on Stack Overflow: https://stackoverflow.com/questions/29018638/
