
Python pandas: remove duplicate rows by condition

Reposted. Author: 行者123. Updated: 2023-12-04 03:59:03

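I want to remove duplicated rows by condition.
Where "type" is the same (duplicated), I want to drop the row whose "number" is "one".
I have this: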

import pandas as pd

data={"col1":[2,3,4,5,9,2,6],
"col2":[4,2,4,6,0,1,5],
"col3":[7,6,0,11,3,6,7],
"col4":[14,11,22,8,6,3,9],
"col5":[0,5,7,3,8,2,9],
"type":["A","A","C","D","B","B","E"],
"number":["one","two","two","one","one","two","two"]}
df=pd.DataFrame.from_dict(data)
I want this:
data={"col1":[3,4,5,2,6],
"col2":[2,4,6,1,5],
"col3":[6,0,11,6,7],
"col4":[11,22,8,3,9],
"col5":[5,7,3,2,9],
"type":["A","C","D","B","E"],
"number":["two","two","one","two","two"]}
df=pd.DataFrame.from_dict(data)

Best answer

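You can chain two conditions with | (OR): keep every row whose "number" is not "one", tested with Series.ne, and every row whose "type" is not duplicated at all, tested with the inverted mask from Series.duplicated(keep=False):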

df1 = df[df['number'].ne('one') | ~df['type'].duplicated(keep=False)]
print (df1)
   col1  col2  col3  col4  col5 type number
1     3     2     6    11     5    A    two
2     4     4     0    22     7    C    two
3     5     6    11     8     3    D    one
5     2     1     6     3     2    B    two
6     6     5     7     9     9    E    two
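
To see why the combined mask works, it can help to look at the two conditions separately. The following is a minimal sketch, not part of the original answer: it rebuilds only the "type" and "number" columns for brevity, and the names not_one, unique_type and keep are made up for illustration.

```python
import pandas as pd

# Only the two columns the condition depends on (shortened for illustration).
df = pd.DataFrame({
    "type": ["A", "A", "C", "D", "B", "B", "E"],
    "number": ["one", "two", "two", "one", "one", "two", "two"],
})

# True where "number" is anything other than "one".
not_one = df["number"].ne("one")

# duplicated(keep=False) flags every member of a duplicated group,
# so the inverted mask is True only for types that occur exactly once.
unique_type = ~df["type"].duplicated(keep=False)

# A row survives if it is not "one", or if its type has no duplicate.
keep = not_one | unique_type
result = df[keep]
print(result)
```

Row 3 (type D, number "one") is kept only because of the second condition: its type is not duplicated, so there is nothing to prefer over it.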
Another idea, using an ordered categorical:
# put 'one' first so it sorts lowest; dict.fromkeys drops repeats and keeps order
cats = list(dict.fromkeys(['one'] + df['number'].tolist()))

df['number'] = pd.Categorical(df['number'], categories=cats, ordered=True)

df2 = df.sort_values('number').drop_duplicates(subset=['type'], keep='last').sort_index()
print (df2)
   col1  col2  col3  col4  col5 type number
1     3     2     6    11     5    A    two
2     4     4     0    22     7    C    two
3     5     6    11     8     3    D    one
5     2     1     6     3     2    B    two
6     6     5     7     9     9    E    two
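
The ordered-categorical approach can also be written without overwriting the original "number" column. This is a sketch under assumptions rather than the answer's own code: it uses only the "type" and "number" columns, hard-codes the category order, and uses a stable sort (kind="mergesort") so that ties keep their original row order. The names number_cat and ordered are illustrative.

```python
import pandas as pd

# Only the two columns the logic depends on (shortened for illustration).
df = pd.DataFrame({
    "type": ["A", "A", "C", "D", "B", "B", "E"],
    "number": ["one", "two", "two", "one", "one", "two", "two"],
})

# "one" is listed first, so it is the smallest value in the ordering.
cats = ["one", "two"]
number_cat = pd.Categorical(df["number"], categories=cats, ordered=True)

# Sort on a temporary categorical copy; df itself is left unchanged.
ordered = df.assign(number=number_cat).sort_values("number", kind="mergesort")

# After sorting, the last row of each type is the one with the highest
# category, so "one" is dropped whenever another value exists for that type.
df2 = ordered.drop_duplicates(subset=["type"], keep="last").sort_index()
print(df2)
```

The final sort_index restores the original row order, which is why the result lines up with the boolean-mask version.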

On removing duplicate rows by condition in python pandas, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/63355136/
