
python - Select rows from the highest-scoring group based on two columns

Reposted. Author: 行者123. Updated: 2023-11-28 22:13:23

Data

        Sentence  Score_Unigram  Score_Bigram  versionId
0      As of Dat              5             1  269004158
1     Date Docum              4             3  269004158
2      As of Dat              4             1  269004158
3     Date Docum              5             3  345973060
4     x Indicate              4             1  372529352
5     Date Docum              5             3  372529352
6    1 Financial              9             1  372529352
7   020 per shar              2             0  372529352
8      Date $ in              8             1  372529352
9      Date $ in              9             4  372529352
10   4 ---------              4             1  372529352
11    Date Begin              1             0  372529352

Desired output

     Sentence  Score_Unigram  Score_Bigram  versionId
0   As of Dat              5             1  269004158
3  Date Docum              5             3  345973060
9   Date $ in              9             4  372529352

Objective

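Group by versionId and get the rows with the maximum Score_Unigram. If that yields more than one row, check the Score_Bigram column and get the rows with the highest value (if there are several such rows, return all of them).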

What I have tried

maximum = 0
index_to_pick = []

for index, row_data in a.iterrows():
    if row_data['Score_Unigram'] > maximum:
        # New best unigram score: remember it and its bigram score
        maximum = row_data['Score_Unigram']
        score_bigram = row_data['Score_Bigram']
        index_to_pick.append(index)

    elif row_data['Score_Unigram'] == maximum:
        if row_data['Score_Bigram'] > score_bigram:
            # Same unigram score but a better bigram score: start over
            maximum = row_data['Score_Unigram']
            score_bigram = row_data['Score_Bigram']
            index_to_pick = []
            index_to_pick.append(index)

        elif row_data['Score_Bigram'] == score_bigram:
            # Tie on both scores: keep this row as well
            index_to_pick.append(index)

a.loc[[index_to_pick[0]]]

Output

    Sentence  Score_Unigram  Score_Bigram  versionId
5  Date $ in              9             4  372529352

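Well, I guess this approach is not very good (the data is large), so I am looking for an efficient way. I tried idxmax, but it returned only the single top row. This may be a duplicate, but I could not find one. Thanks for your help!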
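To illustrate the idxmax behavior mentioned above, here is a minimal sketch on hypothetical toy data (the frame `toy` and its values are assumptions for illustration, not the asker's data). For each group, `idxmax` returns one index label, the first occurrence of the maximum, so tied rows are dropped, whereas comparing against `transform('max')` keeps every tie.

```python
import pandas as pd

# Hypothetical toy data with a tie on the maximum inside group "a".
toy = pd.DataFrame({
    "versionId": ["a", "a", "a", "b"],
    "Score_Unigram": [9, 9, 4, 5],
})

# idxmax yields a single index label per group: the first occurrence of the maximum.
first_only = toy.groupby("versionId")["Score_Unigram"].idxmax()

# Comparing each row against its group's maximum keeps every tied row.
group_max = toy.groupby("versionId")["Score_Unigram"].transform("max")
all_ties = toy[toy["Score_Unigram"] == group_max]

print(first_only.tolist())
print(all_ties.index.tolist())
```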

Best answer

Use boolean indexing to filter twice: first on the maximum of the Score_Unigram column, then on the maximum of Score_Bigram:

# Keep only rows whose Sentence occurs more than once
df = df[df['Sentence'].duplicated(keep=False)]
# Keep rows that reach the maximum Score_Unigram within each Sentence group
df = df[df.groupby('Sentence')['Score_Unigram'].transform('max') == df['Score_Unigram']]
# Among those, keep rows that reach the maximum Score_Bigram
df = df[df.groupby(['Sentence', 'Score_Unigram'])['Score_Bigram'].transform('max') == df['Score_Bigram']]
print(df)
     Sentence  Score_Unigram  Score_Bigram  versionId
0   As of Dat              5             1  269004158
3  Date Docum              5             3  345973060
5  Date Docum              5             3  372529352
9   Date $ in              9             4  372529352
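Note that the snippet above groups by Sentence and returns four rows, while the Objective asks for grouping by versionId and the desired output has three rows. As a sketch only (a variant of the same double-filter idea, not the answerer's code), the same technique keyed on versionId might look like this. The sample frame is rebuilt from the question's data, and the names `top_unigram` and `result` are introduced here for illustration.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Sentence": ["As of Dat", "Date Docum", "As of Dat", "Date Docum",
                 "x Indicate", "Date Docum", "1 Financial", "020 per shar",
                 "Date $ in", "Date $ in", "4 ---------", "Date Begin"],
    "Score_Unigram": [5, 4, 4, 5, 4, 5, 9, 2, 8, 9, 4, 1],
    "Score_Bigram": [1, 3, 1, 3, 1, 3, 1, 0, 1, 4, 1, 0],
    "versionId": [269004158] * 3 + [345973060] + [372529352] * 8,
})

# Step 1: within each versionId, keep rows whose Score_Unigram equals the group maximum.
unigram_max = df.groupby("versionId")["Score_Unigram"].transform("max")
top_unigram = df[df["Score_Unigram"] == unigram_max]

# Step 2: among the survivors, keep rows whose Score_Bigram equals the group maximum.
# Rows tied on both scores all survive, as the Objective requires.
bigram_max = top_unigram.groupby("versionId")["Score_Bigram"].transform("max")
result = top_unigram[top_unigram["Score_Bigram"] == bigram_max]

print(result)
```

On the question's data this keeps the rows with index 0, 3 and 9, which matches the desired output. Both steps are vectorized, so no Python-level loop over rows is needed.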

Regarding python - Select rows from the highest-scoring group based on two columns, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/54007984/
