
python - How to map dictionary values to a DataFrame column that contains lists

Reposted. Author: 行者123. Updated: 2023-12-03 20:17:22

I have a DataFrame:

import pandas as pd

df = pd.DataFrame(
{'title':['a1','a2','a3','a4','a5'],
'genre_name':[
['family', 'animation'],
['action', 'family', 'comedy'],
['family', 'comedy'],
['horror','action'],
['family', 'animation','comedy']]}
)

df
title genre_name
0 a1 ['family', 'animation']
1 a2 ['action', 'family', 'comedy']
2 a3 ['family', 'comedy']
3 a4 ['horror','action']
4 a5 ['family', 'animation','comedy']

My dictionary is:

dict={'1':'family','2':'animation','3':'action','4':'comedy','5':'horror'}

I want to create a new column named "genre_ids" that maps every entry in genre_name to the corresponding key in the dictionary "dict".

The desired df is:

df
title genre_name genre_ids
0 a1 ['family', 'animation'] [1,2]
1 a2 ['action', 'family', 'comedy'] [3,1,4]
2 a3 ['family', 'comedy'] [1,4]
3 a4 ['horror','action'] [5,3]
4 a5 ['family', 'animation','comedy'] [1,2,4]

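How can I do this?

Best answer

Rename the dictionary from dict to something else, because dict is a Python builtin and the assignment would shadow it. Then swap the keys and values, and look up each genre name in a list comprehension: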

d = {'1':'family','2':'animation','3':'action','4':'comedy','5':'horror'}

# swap keys and values so that genre names map to ids
d1 = {v: k for k, v in d.items()}
df['genre_ids'] = df['genre_name'].apply(lambda x: [d1.get(y) for y in x])
# alternative: a nested list comprehension instead of apply
# df['genre_ids'] = [[d1.get(y) for y in x] for x in df['genre_name']]
print(df)
title genre_name genre_ids
0 a1 [family, animation] [1, 2]
1 a2 [action, family, comedy] [3, 1, 4]
2 a3 [family, comedy] [1, 4]
3 a4 [horror, action] [5, 3]
4 a5 [family, animation, comedy] [1, 2, 4]
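
One detail that the printed output hides: the keys of d are strings, so the ids in genre_ids are '1', '2', ... rather than integers (pandas prints strings inside lists without quotes). If integer ids are wanted, a minimal sketch could convert them while inverting the dictionary. The name name_to_id is only illustrative, and the uniqueness check is an added assumption, not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["a1", "a2", "a3", "a4", "a5"],
    "genre_name": [
        ["family", "animation"],
        ["action", "family", "comedy"],
        ["family", "comedy"],
        ["horror", "action"],
        ["family", "animation", "comedy"],
    ],
})

d = {"1": "family", "2": "animation", "3": "action", "4": "comedy", "5": "horror"}

# Inverting is only safe when every genre name occurs once as a value;
# with duplicates, later keys would silently overwrite earlier ones.
if len(set(d.values())) != len(d):
    raise ValueError("genre names are not unique, the mapping cannot be inverted")

# name_to_id is a hypothetical name; int() turns the string keys into integers.
name_to_id = {name: int(key) for key, name in d.items()}

# d[...]-style lookup raises KeyError on an unknown genre instead of returning None.
df["genre_ids"] = [[name_to_id[name] for name in names] for names in df["genre_name"]]
print(df)
```

Using name_to_id[name] here instead of .get() is a deliberate choice for this sketch: an unexpected genre name fails loudly rather than producing None in the result.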

Edit: You can also control what happens when a value has no match. Here crime has been added to the first list:

df = pd.DataFrame({'title':['a1','a2','a3','a4','a5'], 
'genre_name':[['crime', 'animation'],['action', 'family', 'comedy'],
['family', 'comedy'],['horror','action'],
['family', 'animation','comedy']]})

d={'1':'family','2':'animation','3':'action','4':'comedy','5':'horror'}


d1 = {v: k for k, v in d.items()}
# unmatched values become None
df['genre_ids0'] = df['genre_name'].apply(lambda x: [d1.get(y) for y in x])
# unmatched values are replaced by a default value (here 0)
df['genre_ids1'] = df['genre_name'].apply(lambda x: [d1.get(y, 0) for y in x])
# unmatched values are dropped
df['genre_ids2'] = df['genre_name'].apply(lambda x: [d1[y] for y in x if y in d1])
print(df)
title genre_name genre_ids0 genre_ids1 genre_ids2
0 a1 [crime, animation] [None, 2] [0, 2] [2]
1 a2 [action, family, comedy] [3, 1, 4] [3, 1, 4] [3, 1, 4]
2 a3 [family, comedy] [1, 4] [1, 4] [1, 4]
3 a4 [horror, action] [5, 3] [5, 3] [5, 3]
4 a5 [family, animation, comedy] [1, 2, 4] [1, 2, 4] [1, 2, 4]
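
As a possible alternative to the per-row apply calls, the same "drop unmatched values" behavior can be sketched with Series.explode, Series.map and a groupby on the original index. This is not from the original answer; it assumes pandas 0.25 or newer (where explode is available) and reuses the d and d1 names from above:

```python
import pandas as pd

df = pd.DataFrame({
    "title": ["a1", "a2", "a3", "a4", "a5"],
    "genre_name": [
        ["crime", "animation"],
        ["action", "family", "comedy"],
        ["family", "comedy"],
        ["horror", "action"],
        ["family", "animation", "comedy"],
    ],
})

d = {"1": "family", "2": "animation", "3": "action", "4": "comedy", "5": "horror"}
d1 = {v: k for k, v in d.items()}

# One row per genre name; the original row index is repeated for each element.
exploded = df["genre_name"].explode()

# Mapping through a plain dict turns names missing from d1 into NaN.
mapped = exploded.map(d1)

# Drop the unmatched entries, then collect the ids back into one list per row.
# Caveat: a row whose genres are ALL unmatched disappears from the groupby
# result and ends up as NaN (not an empty list) after the assignment.
df["genre_ids"] = mapped.dropna().groupby(level=0).agg(list)
print(df)
```

Whether this is faster than the list comprehension depends on the data; for short lists the comprehension in the answer is usually simpler to read.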

Regarding python - How to map dictionary values to a DataFrame column that contains lists, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/61709168/
