
python - Pandas: Col[C] if value is in Col[A] and Col[B]

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:05:15

I have a dataframe like this:

ColA             ColB                       ColC
"lorem ipsum"    ["lorem", "foo", "bar"]
"lorem ipsum"    NaN
NaN              ["lorem", "foo", "bar"]
NaN              NaN

I am trying to get this output:

ColA             ColB                       ColC
"lorem ipsum"    ["lorem", "foo", "bar"]    "lorem"

I tried using a list comprehension like this:

df["C"] = [elem for elem in df["B"] if elem in df["A"] ]

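But it did not work:

TypeError: unhashable type: 'list' if I format ColB as lists, and ValueError: Length of values does not match length of index if I use tuples.

Any help would be appreciated, thanks.

Edit + Edit 2: There is only one word (or none) shared by the two columns, and I need to capture it and put it in column C. I also forgot to mention that ColA and ColB can have NaN as values.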

Best answer

Use a custom function with try + except and pass the DataFrame through pipe:

import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['lorem ipsum', 'lorem ipsum', np.nan, np.nan],
                   'B': [["lorem", "foo", "bar"], np.nan, ["lorem", "foo", "bar"], np.nan]})
print(df)
             A                  B
0  lorem ipsum  [lorem, foo, bar]
1  lorem ipsum                NaN
2          NaN  [lorem, foo, bar]
3          NaN                NaN

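def test(df):
    out = []
    for a, b in zip(df["A"], df["B"]):
        try:
            # First element of the list b that occurs in the string a
            out.append(next(y for y in b if y in a))
        except Exception:
            # NaN in either column (TypeError) or no match (StopIteration)
            out.append('')
    return out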

df["C"] = df.pipe(test)
print(df)
             A                  B      C
0  lorem ipsum  [lorem, foo, bar]  lorem
1  lorem ipsum                NaN
2          NaN  [lorem, foo, bar]
3          NaN                NaN
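
The same idea can also be written without a blanket `except`, by checking the cell types explicitly. The following is only a minimal sketch under stated assumptions: `first_common` and its `whole_words` flag are hypothetical names introduced here for illustration, not part of the original answer, and it assumes ColB holds lists (or tuples) of strings.

```python
import numpy as np
import pandas as pd


def first_common(a, b, whole_words=False):
    """Return the first item of b found in the string a, else ''.

    Hypothetical helper (not from the original answer).
    """
    # NaN cells are floats, so these checks skip them without try/except.
    if not isinstance(a, str) or not isinstance(b, (list, tuple)):
        return ""
    # `y in a` on a string is a substring test; splitting first
    # restricts the match to whole whitespace-separated words.
    haystack = a.split() if whole_words else a
    return next((y for y in b if isinstance(y, str) and y in haystack), "")


df = pd.DataFrame({
    "A": ["lorem ipsum", "lorem ipsum", np.nan, np.nan],
    "B": [["lorem", "foo", "bar"], np.nan, ["lorem", "foo", "bar"], np.nan],
})
df["C"] = [first_common(a, b) for a, b in zip(df["A"], df["B"])]
print(df["C"].tolist())
```

Note that `y in a` in the answer above is a substring test, so an entry such as "lor" would also match "lorem ipsum". If only whole words should count, passing `whole_words=True` in this sketch is one way to get that behaviour.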

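Another solution, which does not work as well: after fillna, the placeholder string "undefined" is itself iterated character by character, so rows with a missing ColB can pick up a spurious 'u':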

df = df.fillna("undefined")
df["C"] = [next((y for y in b if y in a), '') for a, b in zip(df["A"], df["B"])]
print(df)

             A                  B      C
0  lorem ipsum  [lorem, foo, bar]  lorem
1  lorem ipsum          undefined      u
2    undefined  [lorem, foo, bar]
3    undefined          undefined      u
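
For reference, the comprehension in the question seems to fail before any word matching happens: `elem in df["A"]` tests membership among the index labels of the Series, not among its values, and that lookup hashes the key first, which a list cannot support. The sketch below illustrates this on a small made-up Series `s` (an assumption for illustration, not data from the question).

```python
import pandas as pd

# Hypothetical two-row Series, used only to illustrate `in` on a Series.
s = pd.Series(["lorem ipsum", "foo"])

label_hit = 0 in s                    # True: 0 is an index label
value_hit = "foo" in s                # False: "foo" is a value, not a label
value_in_list = "foo" in s.tolist()   # True: membership among the values

try:
    ["lorem"] in s                    # the label lookup hashes the key first
    list_error = None
except TypeError as exc:
    list_error = str(exc)

print(label_hit, value_hit, value_in_list)
print(list_error)
```

With tuples the key is hashable, but it matches no index label, so the filter most likely drops every row and the resulting list is shorter than the index, which would explain the ValueError about mismatched lengths.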

Regarding python - Pandas: Col[C] if value is in Col[A] and Col[B], we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54366320/
