
python-3.x - Find rows whose value in one column is duplicated across specified groups of another column

Reposted. Author: 行者123. Updated: 2023-12-03 07:54:17

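For a dataset df, I want to split column B into its two groups, foo and bar, and identify the rows whose value in column A appears in both groups. How can I achieve this?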

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 1],
                   'B': ['foo', 'bar', 'foo', 'bar', 'foo', 'foo']})
df = df.sort_values('B')
df
Out[15]:
   A    B
1  2  bar
3  3  bar
0  1  foo
2  2  foo
4  3  foo
5  1  foo

Expected result:

   A    B  Indicator
1  2  bar       True  # value 2 also present in foo, so returns True
3  3  bar       True  # value 3 also present in foo, so returns True
0  1  foo      False  # value 1 only present in foo, so returns False
2  2  foo       True  # value 2 also present in bar, so returns True
4  3  foo       True  # value 3 also present in bar, so returns True
5  1  foo      False  # value 1 only present in foo, so returns False
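
For the two-group case, one possible approach is to collect the A values seen in each group and flag the rows whose value falls in the intersection. This is only a sketch for the sample data above, not the accepted answer; the names in_foo, in_bar and common are introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 1],
                   'B': ['foo', 'bar', 'foo', 'bar', 'foo', 'foo']})
df = df.sort_values('B')

# Distinct A values seen in each of the two groups
in_foo = set(df.loc[df['B'] == 'foo', 'A'])
in_bar = set(df.loc[df['B'] == 'bar', 'A'])

# A row is flagged when its A value appears in both groups
common = in_foo & in_bar
df['Indicator'] = df['A'].isin(common)
print(df)
```

This hard-codes the two group names, so it does not extend to more categories without changes.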

Update:

Suppose column B has more than 2 categories, with the sample data df as follows:

df = pd.DataFrame({'A': [1, 2, 2, 3, 3, 2, 1],
                   'B': ['foo', 'bar', 'foo', 'bar', 'foo', 'baz', 'baz']})
df = df.sort_values('B')
df
Out[30]:
   A    B
1  2  bar
3  3  bar
5  2  baz
6  1  baz
0  1  foo
2  2  foo
4  3  foo

In this case, the expected result is as follows:

   A    B  Indicator
1  2  bar       True  # The value 2 occurs in categories baz, bar, and foo, so returns True.
3  3  bar      False  # The value 3 only occurs in categories bar and foo, so returns False.
5  2  baz       True  # The value 2 occurs in categories baz, bar, and foo, so returns True.
6  1  baz      False  # The value 1 only occurs in categories baz and foo, so returns False.
0  1  foo      False  # The value 1 only occurs in categories baz and foo, so returns False.
2  2  foo       True  # The value 2 occurs in categories baz, bar, and foo, so returns True.
4  3  foo      False  # The value 3 only occurs in categories bar and foo, so returns False.

Best answer

Since you have multiple groups, you can use:

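import pandas as pd

data = {'A': [2, 3, 2, 1, 1, 2, 3],
        'B': ['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo']}
df = pd.DataFrame(data).sort_values('B')

# For each row, count the distinct B categories its A value appears in,
# then compare that count with the total number of categories in B
df['Indicator'] = df.groupby('A')['B'].transform('nunique') == df['B'].nunique()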

Output:

>>> df
   A    B  Indicator
0  2  bar       True
1  3  bar      False
2  2  baz       True
3  1  baz      False
4  1  foo      False
5  2  foo       True
6  3  foo      False
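
The title mentions specified groups, so it may help to see the same counting idea restricted to a chosen subset of categories. The sketch below is an assumption-laden variant rather than part of the answer: the function present_in_all_groups and its parameters value_col, group_col and groups are hypothetical names, and it assumes column B has no missing values.

```python
import pandas as pd

def present_in_all_groups(df, value_col='A', group_col='B', groups=None):
    """Flag rows whose value occurs in every group, or in every listed group."""
    if groups is None:
        groups = df[group_col].unique()
    groups = list(groups)
    # Only rows from the groups of interest count toward the tally
    subset = df[df[group_col].isin(groups)]
    # Number of distinct groups in which each value appears
    groups_per_value = subset.groupby(value_col)[group_col].nunique()
    # Values that appear in all of the groups of interest
    complete = groups_per_value[groups_per_value == len(set(groups))].index
    return df[value_col].isin(complete)

df = pd.DataFrame({'A': [2, 3, 2, 1, 1, 2, 3],
                   'B': ['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'foo']})

all_groups = present_in_all_groups(df)                   # same rule as the answer
foo_bar = present_in_all_groups(df, groups=['foo', 'bar'])
print(df.assign(all_groups=all_groups, foo_bar=foo_bar))
```

With groups left as None this should reproduce the Indicator column above. With groups=['foo', 'bar'], values 2 and 3 qualify, and note that a baz row holding one of those values is flagged as well, because the flag depends only on the value in A, not on the row's own group.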

For more on python-3.x - Find rows whose value in one column is duplicated across specified groups of another column, see the similar question on Stack Overflow: https://stackoverflow.com/questions/76454193/
