
Python/Pandas - Remove rows that are not duplicated

Repost. Author: 太空宇宙. Updated: 2023-11-04 09:52:15

I have a DataFrame like this:

        product_id          dt  stock_qty
226870     2948259  2017-11-11     17.000
233645     2948259  2017-11-12     17.000
240572     2948260  2017-11-13      5.000
247452     2948260  2017-11-14      5.000
233644     2948260  2017-11-12      5.000
226869     2948260  2017-11-11      5.000
247451     2948262  2017-11-14     -2.000
226868     2948262  2017-11-11     -1.000  <- not duplicated
240571     2948262  2017-11-13     -2.000
240570     2948264  2017-11-13      5.488
233643     2948264  2017-11-12      5.488
244543     2948269  2017-11-11      2.500
247450     2948276  2017-11-14      3.250
226867     2948276  2017-11-11      3.250
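
For anyone who wants to reproduce this, the sample can be rebuilt roughly as follows. This is a minimal sketch: the values and index labels are copied from the table above, but parsing dt as datetimes is an assumption, since the question does not show the column dtypes.

```python
import pandas as pd

# Sample rows copied from the question; the original index labels are kept.
# Assumption: dt is parsed as datetime (the question does not state its dtype).
df = pd.DataFrame(
    {
        "product_id": [2948259, 2948259, 2948260, 2948260, 2948260, 2948260,
                       2948262, 2948262, 2948262, 2948264, 2948264,
                       2948269, 2948276, 2948276],
        "dt": pd.to_datetime([
            "2017-11-11", "2017-11-12", "2017-11-13", "2017-11-14",
            "2017-11-12", "2017-11-11", "2017-11-14", "2017-11-11",
            "2017-11-13", "2017-11-13", "2017-11-12", "2017-11-11",
            "2017-11-14", "2017-11-11",
        ]),
        "stock_qty": [17.0, 17.0, 5.0, 5.0, 5.0, 5.0, -2.0, -1.0, -2.0,
                      5.488, 5.488, 2.5, 3.25, 3.25],
    },
    index=[226870, 233645, 240572, 247452, 233644, 226869, 247451, 226868,
           240571, 240570, 233643, 244543, 247450, 226867],
)
print(df)
```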

I need to drop the rows that share a product_id but have different stock_qty values, so I should end up with a DataFrame like this:

        product_id          dt  stock_qty
226870     2948259  2017-11-11     17.000
233645     2948259  2017-11-12     17.000
240572     2948260  2017-11-13      5.000
247452     2948260  2017-11-14      5.000
233644     2948260  2017-11-12      5.000
226869     2948260  2017-11-11      5.000
240570     2948264  2017-11-13      5.488
233643     2948264  2017-11-12      5.488
244543     2948269  2017-11-11      2.500
247450     2948276  2017-11-14      3.250
226867     2948276  2017-11-11      3.250

Thanks for your help!

Best answer

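Use drop_duplicates to collect the product_id values that own a stock_qty occurring only once, flag those products with isin, and chain the second condition (product_id occurs more than once) with xor (^):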

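# m1: every row of a product that owns a stock_qty value occurring only once in the whole frame
m1 = df['product_id'].isin(df.drop_duplicates('stock_qty', keep=False)['product_id'])
# m2: every row whose product_id occurs more than once
m2 = df.duplicated('product_id', keep=False)

# xor keeps a row when exactly one of the two masks is True
df = df[m1 ^ m2]
print(df)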
        product_id          dt  stock_qty
226870     2948259  2017-11-11     17.000
233645     2948259  2017-11-12     17.000
240572     2948260  2017-11-13      5.000
247452     2948260  2017-11-14      5.000
233644     2948260  2017-11-12      5.000
226869     2948260  2017-11-11      5.000
240570     2948264  2017-11-13      5.488
233643     2948264  2017-11-12      5.488
244543     2948269  2017-11-11      2.500
247450     2948276  2017-11-14      3.250
226867     2948276  2017-11-11      3.250

Details:

print (m1)
226870 False
233645 False
240572 False
247452 False
233644 False
226869 False
247451 True
226868 True
240571 True
240570 False
233643 False
244543 True
247450 False
226867 False
Name: product_id, dtype: bool
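
The True values in m1 come from the intermediate frame that drop_duplicates returns, and it can help to look at that frame directly. The sketch below is only an illustration: it rebuilds a trimmed copy of the sample (dt is left out), and the names unique_qty and flagged are made up here rather than taken from the answer.

```python
import pandas as pd

# Trimmed copy of the question's data; the dt column is omitted for brevity.
df = pd.DataFrame(
    {
        "product_id": [2948259, 2948259, 2948260, 2948260, 2948260, 2948260,
                       2948262, 2948262, 2948262, 2948264, 2948264,
                       2948269, 2948276, 2948276],
        "stock_qty": [17.0, 17.0, 5.0, 5.0, 5.0, 5.0, -2.0, -1.0, -2.0,
                      5.488, 5.488, 2.5, 3.25, 3.25],
    },
    index=[226870, 233645, 240572, 247452, 233644, 226869, 247451, 226868,
           240571, 240570, 233643, 244543, 247450, 226867],
)

# Rows whose stock_qty value occurs exactly once in the whole frame.
unique_qty = df.drop_duplicates("stock_qty", keep=False)
print(unique_qty)

# product_id values that own such a row (hypothetical helper name).
flagged = unique_qty["product_id"].tolist()

# m1 is True for every row of a flagged product, not only the unique row.
m1 = df["product_id"].isin(flagged)
print(m1)
```

Only rows 226868 and 244543 survive drop_duplicates, so m1 marks all three rows of product 2948262 plus the single row of product 2948269.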

print (m2)
226870 True
233645 True
240572 True
247452 True
233644 True
226869 True
247451 True
226868 True
240571 True
240570 True
233643 True
244543 False
247450 True
226867 True
dtype: bool
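
One thing worth noting about the masks above: drop_duplicates('stock_qty', keep=False) compares stock_qty across the whole frame rather than within each product, so the result can depend on which values happen to coincide between different products. As a possible alternative that is not part of the original answer, the sketch below counts the distinct stock_qty values per product_id with groupby(...).transform('nunique') and keeps the products that have exactly one. The trimmed sample (without dt) and the names by_masks, n_distinct and by_group are assumptions made for illustration.

```python
import pandas as pd

# Trimmed copy of the question's data; the dt column is omitted for brevity.
df = pd.DataFrame(
    {
        "product_id": [2948259, 2948259, 2948260, 2948260, 2948260, 2948260,
                       2948262, 2948262, 2948262, 2948264, 2948264,
                       2948269, 2948276, 2948276],
        "stock_qty": [17.0, 17.0, 5.0, 5.0, 5.0, 5.0, -2.0, -1.0, -2.0,
                      5.488, 5.488, 2.5, 3.25, 3.25],
    },
    index=[226870, 233645, 240572, 247452, 233644, 226869, 247451, 226868,
           240571, 240570, 233643, 244543, 247450, 226867],
)

# The answer's masks, repeated here for comparison.
m1 = df["product_id"].isin(df.drop_duplicates("stock_qty", keep=False)["product_id"])
m2 = df.duplicated("product_id", keep=False)
by_masks = df[m1 ^ m2]

# Alternative: number of distinct stock_qty values within each product_id,
# broadcast back to one value per row.
n_distinct = df.groupby("product_id")["stock_qty"].transform("nunique")
by_group = df[n_distinct == 1]

print(by_group)
print(by_masks.equals(by_group))
```

On this sample both selections contain the same rows; they may differ on data where two products share the same set of differing stock_qty values.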

Regarding Python/Pandas - Remove rows that are not duplicated, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47312040/
