python - Modify elements in a dataframe by checking against a dictionary

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:31:04

I am trying to use a dictionary to categorize the elements in a list:

mydict = {'beach': ['beach', 'sand', 'coast'], 'package': ['package', 'inclusive']}

Given the dataframe:

Keyword                         |cat |
--------------------------------|----|
beach holiday                   |    |
package beach holiday           |    |
inclusive beach holiday         |    |

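I want to check whether a word from each keyword appears in the dictionary and, if it does, write the matching dictionary key to the category column, for example: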

Keyword                         |cat    |
--------------------------------|-------|
beach holiday                   |beach  |
package beach holiday           |package|
inclusive package beach holiday |package|

I tried the following code:

df = get_csv(csv)
mydict = {'beach': ['beach', 'sand', 'coast'], 'package': ['package', 'inclusive']}

for key in mydict.keys():
    item = key
    if item in mydict[key]:
        target_cats = item
find_keywords = lambda kw: [s for s in kw.split() if s in target_cats]

df.loc[:, 'cat_list'] = df['Keyword'].apply(lambda x: find_keywords(x))
for i in range(1, 4):
    df.loc[:, 'cat{0}'.format(i)] = df['cat_list'].apply(lambda x: x[i-1] if len(x) >= i else '')

print(df)
df.to_csv('kuoniTesting.csv')

However, this only gives an empty category list. The code below, which checks against a list, works. How can I modify it to use a dictionary?

import pandas as pd

target_cats = ['cat', 'dog', 'cow']
df = pd.DataFrame({'Keyword': ['cat dog cow', 'cat dog', 'dog sheep']})
# Keep only the words of a keyword that are in target_cats.
find_keywords = lambda kw: [s for s in kw.split() if s in target_cats]

df.loc[:, 'cat_list'] = df['Keyword'].apply(lambda x: find_keywords(x))
# Spread the first three matches into cat1..cat3, padding with ''.
for i in range(1, 4):
    df.loc[:, 'cat{0}'.format(i)] = df['cat_list'].apply(lambda x: x[i-1] if len(x) >= i else '')

       Keyword         cat_list cat1 cat2 cat3
0  cat dog cow  [cat, dog, cow]  cat  dog  cow
1      cat dog       [cat, dog]  cat  dog
2    dog sheep            [dog]  dog

Best answer

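You can swap the dictionary's keys and values so that each word maps to its category, but note that the lookup then returns multiple values for a row that matches more than one word: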

mydict = {'beach': ['beach', 'sand', 'coast'], 'package': ['package', 'inclusive']}
d = {k: oldk for oldk, oldv in mydict.items() for k in oldv}
print(d)
{'sand': 'beach', 'package': 'package', 'beach': 'beach',
'inclusive': 'package', 'coast': 'beach'}

find_keywords = lambda kw: [d[s] for s in kw.split() if s in d]
df['cat_list'] = df['Keyword'].apply(find_keywords)
print(df)
                   Keyword          cat_list
0            beach holiday           [beach]
1    package beach holiday  [package, beach]
2  inclusive beach holiday  [package, beach]
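
The desired output in the question has a single category per row rather than a list. Here is a minimal sketch of one way to get that, assuming the first matching word in the keyword decides the category. The question does not state which rule it wants, so this is an assumption, and `first_category` is a hypothetical helper name that is not part of the original answer:

```python
import pandas as pd

mydict = {'beach': ['beach', 'sand', 'coast'], 'package': ['package', 'inclusive']}
# Invert the mapping so each word points to its category.
d = {k: oldk for oldk, oldv in mydict.items() for k in oldv}

df = pd.DataFrame({'Keyword': ['beach holiday',
                               'package beach holiday',
                               'inclusive beach holiday']})

def first_category(kw):
    """Return the category of the first word found in d, or '' if none match.

    Assumption: "first match wins"; the question does not specify a rule.
    """
    return next((d[s] for s in kw.split() if s in d), '')

df['cat'] = df['Keyword'].apply(first_category)
print(df)
```

A different priority rule (for example, always preferring 'package' over 'beach') would need a different helper; this sketch only follows word order.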

For new columns:

df = df.join(pd.DataFrame(df['cat_list'].values.tolist(), columns=['cat1', 'cat2']))
print(df)
                   Keyword          cat_list     cat1   cat2
0            beach holiday           [beach]    beach   None
1    package beach holiday  [package, beach]  package  beach
2  inclusive beach holiday  [package, beach]  package  beach
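
The hard-coded `columns=['cat1','cat2']` works here because no row matches more than two words. As a hedged sketch, the column names could instead be derived from the widest list, so that a row with more matches does not raise an error. The fourth row below is hypothetical data added only to illustrate a three-match case, and `cats` is an illustrative variable name:

```python
import pandas as pd

mydict = {'beach': ['beach', 'sand', 'coast'], 'package': ['package', 'inclusive']}
d = {k: oldk for oldk, oldv in mydict.items() for k in oldv}

df = pd.DataFrame({'Keyword': ['beach holiday',
                               'package beach holiday',
                               'inclusive beach holiday',
                               'sand coast package']})  # hypothetical extra row

df['cat_list'] = df['Keyword'].apply(lambda kw: [d[s] for s in kw.split() if s in d])

# One column per list position; shorter lists are padded with missing values.
# Passing index=df.index keeps the rows aligned even if the index is not 0..n-1.
cats = pd.DataFrame(df['cat_list'].tolist(), index=df.index)
cats.columns = ['cat{}'.format(i + 1) for i in range(cats.shape[1])]

df = df.join(cats)
print(df)
```

Repeated categories such as `[beach, beach, package]` are kept as they are; deduplicating them would be a separate design choice.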

Regarding "python - Modify elements in a dataframe by checking against a dictionary", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/47289878/
