
python - Matching the elements of a pandas column against a column of another pandas DataFrame

Reposted. Author: 行者123. Updated: 2023-11-28 19:01:48

I have a pandas DataFrame A with a column keywords, where each row holds a list of strings:

 keywords
['loans','mercedez','bugatti','a4']
['trump','usa','election','president']
['galaxy','7s','canon','macbook']
['beiber','spiderman','marvels','ironmen']
.........................................
.........................................
.........................................

I also have another pandas DataFrame B with the columns category and words, where words is a comma-separated string:

category       words
audi           audi a4,audi a6
bugatti        bugatti veyron, bugatti chiron
mercedez       mercedez s-class, mercedez e-class
dslr           canon, nikon
apple          iphone 7s,iphone 6s,iphone 5
finance        sales,loans,sales price
politics       donald trump, election, votes
entertainment  spiderman,captain america, ironmen
music          justin beiber, rihana,drake
........ ..............
......... .........

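I want to map the keywords of DataFrame A to the words of DataFrame B and assign the corresponding category. Each keyword should be compared against every individual word inside the strings of the words column, not against the whole string. For example, the keyword a4 should be checked against both words in the string audi a4 from the words column. The expected result is: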

keywords                                     matched_category
['loans','mercedez','bugatti','a4']          ['finance','mercedez','mercedez','bugatti','bugatti','audi']
['trump','usa','election','president']       ['politics','politics']
['galaxy','7s','canon','macbook']            ['apple','dslr']
['beiber','spiderman','marvels','ironmen']   ['music','entertainment','entertainment','entertainment']
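One possible reading of the matching rule described above can be sketched in plain Python. The helper name matching_phrases is hypothetical, and the rule itself is an assumption: a keyword matches a comma-separated phrase when it equals one of that phrase's whitespace-separated words. Under this reading, each matching phrase contributes its category once, which would explain why mercedez and bugatti appear twice in the first expected row.

```python
def matching_phrases(keyword, words):
    """Hypothetical helper: return the comma-separated phrases in `words`
    that contain `keyword` as a whole whitespace-separated word."""
    phrases = [p.strip() for p in words.split(",")]
    return [p for p in phrases if keyword in p.split()]


# 'a4' is one of the two words in the phrase 'audi a4', so that phrase matches.
print(matching_phrases("a4", "audi a4,audi a6"))
# → ['audi a4']

# 'mercedez' appears in both phrases, so the category would be listed twice.
print(matching_phrases("mercedez", "mercedez s-class, mercedez e-class"))
# → ['mercedez s-class', 'mercedez e-class']

# No phrase contains 'galaxy', so nothing matches.
print(matching_phrases("galaxy", "iphone 7s,iphone 6s,iphone 5"))
# → []
```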

Best answer

One approach is to use the pandas Series.transform method:

import pandas as pd

A = pd.DataFrame({'keywords': [['loans', 'mercedez', 'bugatti', 'a4'],
                               ['trump', 'usa', 'election', 'president']]})
B = pd.DataFrame({'category': ['audi', 'finance'],
                  'words': ['audi a4,audi a6', 'sales,loans,sales price']})

def match_category_to_keywords(kws):
    """Return the sorted, de-duplicated categories matched by any keyword."""
    ret = []
    for kw in kws:
        # Boolean mask over B: True where any comma-separated entry
        # contains kw as a substring.
        m = B['words'].transform(lambda words: any(kw in w for w in words.split(',')))
        ret.extend(B['category'].loc[m].tolist())
    # pd.np was removed in pandas 2.0; sorted(set(...)) yields the same
    # unique, sorted values as np.unique did.
    return sorted(set(ret))

A['matched_category'] = A['keywords'].transform(match_category_to_keywords)
print(A)

Output:

                            keywords matched_category
0 [loans, mercedez, bugatti, a4] [audi, finance]
1 [trump, usa, election, president] []
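Note that the answer above de-duplicates the categories and matches by substring, whereas the expected result in the question keeps duplicates. A minimal sketch of a variant that keeps them follows. It assumes whole-word matching and one category per matching phrase, and it uses DataFrame.explode with ignore_index, which needs pandas 1.1 or later. The names phrases, tokens, lookup and match_categories are illustrative, and B contains only a subset of the question's rows.

```python
import pandas as pd

A = pd.DataFrame({"keywords": [["loans", "mercedez", "bugatti", "a4"],
                               ["trump", "usa", "election", "president"]]})
B = pd.DataFrame({
    "category": ["audi", "bugatti", "mercedez", "finance", "politics"],
    "words": ["audi a4,audi a6",
              "bugatti veyron, bugatti chiron",
              "mercedez s-class, mercedez e-class",
              "sales,loans,sales price",
              "donald trump, election, votes"],
})

# One row per comma-separated phrase.
phrases = (B.assign(phrase=B["words"].str.split(","))
            .explode("phrase", ignore_index=True))
phrases["phrase"] = phrases["phrase"].str.strip()

# One row per whitespace-separated word; a phrase counts once per distinct word.
tokens = (phrases.assign(token=phrases["phrase"].str.split())
                 .explode("token", ignore_index=True)
                 .drop_duplicates(["category", "phrase", "token"]))

# Map each word to the categories of every phrase that contains it.
lookup = tokens.groupby("token", sort=False)["category"].agg(list).to_dict()

def match_categories(kws):
    """Concatenate the categories for each keyword, keeping order and duplicates."""
    out = []
    for kw in kws:
        out.extend(lookup.get(kw, []))
    return out

A["matched_category"] = A["keywords"].apply(match_categories)
print(A["matched_category"].tolist())
```

For these two rows the sketch yields the same lists as the first two rows of the expected result in the question. Building the lookup dictionary once also avoids rescanning B for every keyword.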

Regarding "python - Matching the elements of a pandas column against a column of another pandas DataFrame", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51864822/
