
pandas - How to split a string into multiple rows in Pandas

Reposted. Author: 行者123. Updated: 2023-12-04 01:52:02

We have the following data:

  Name  genres

A Action|Adventure|Science Fiction|Thriller
B Action|Adventure|Science Fiction|Thriller
C Adventure|Science Fiction|Thriller

I want the data in the form below. The DataFrame I am aiming for is:
  Name  genres
A Action
A Adventure
A Science Fiction
A Thriller
B Action
B Adventure
B Science Fiction
B Thriller
C Adventure
C Science Fiction
C Thriller

Here is my code:
gen = df1[df1['genres'].str.contains('|')]
gen1 = gen.copy()
gen2 = gen.copy()
gen3 = gen.copy()
gen4 = gen.copy()
gen1['genres'] = gen1['genres'].apply(lambda x: x.split("|")[0])
gen2['genres'] = gen2['genres'].apply(lambda x: x.split("|")[1])
gen3['genres'] = gen3['genres'].apply(lambda x: x.split("|")[2])
gen4['genres'] = gen4['genres'].apply(lambda x: x.split("|")[3])

I get this error:

IndexError: list index out of range
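
A likely cause, sketched under assumptions: row C has only three genres, so `x.split("|")[3]` has no fourth element to return. In addition, `str.contains('|')` treats the pipe as a regular-expression alternation by default, so the filter keeps every row. The `sample` Series below is illustrative, and the value "Drama" is a hypothetical entry that does not appear in the question.

```python
import pandas as pd

# Illustrative data: two strings from the question plus a hypothetical
# single-genre value ("Drama") that contains no pipe at all.
sample = pd.Series([
    "Action|Adventure|Science Fiction|Thriller",
    "Adventure|Science Fiction|Thriller",
    "Drama",
])

# Default regex matching: '|' means "empty OR empty", which matches any string.
regex_mask = sample.str.contains("|")

# regex=False tests for a literal pipe character instead.
literal_mask = sample.str.contains("|", regex=False)

# Number of parts per row; indexing position 3 needs at least 4 parts.
part_counts = sample.str.split("|").str.len()

print(regex_mask.tolist())
print(literal_mask.tolist())
print(part_counts.tolist())
```

Any row whose count is below 4 raises the IndexError when position 3 is requested, which is why an approach that does not hard-code the number of genres is needed.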

Best answer

Create lists of genres with split, repeat the Name values according to the list lengths from str.len, and finally flatten the lists with chain.from_iterable:

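from itertools import chain

import pandas as pd

# Split each genres string into a list of individual genres
genres = df['genres'].str.split('|')
df = pd.DataFrame({
    # Repeat each Name once per genre in its row
    'Name': df['Name'].values.repeat(genres.str.len()),
    # Flatten the per-row lists into one long list
    'genres': list(chain.from_iterable(genres.tolist()))
})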

print (df)
Name genres
0 A Action
1 A Adventure
2 A Science Fiction
3 A Thriller
4 B Action
5 B Adventure
6 B Science Fiction
7 B Thriller
8 C Adventure
9 C Science Fiction
10 C Thriller
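
As an alternative that is not part of the original answer, newer pandas releases offer `DataFrame.explode`, which turns each list element into its own row. The sketch below assumes pandas 0.25 or later (the release that introduced `explode`) and rebuilds the question's data by hand; the variable name `result` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
    ],
})

# Split each string into a list, then give every list element its own row.
# Assumption: pandas >= 0.25, where DataFrame.explode is available.
result = (
    df.assign(genres=df["genres"].str.split("|"))
      .explode("genres")
      .reset_index(drop=True)  # replace the repeated row labels with 0..n-1
)
print(result)
```

Because `explode` works row by row, it does not depend on how many genres each row holds.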

Edit:

Solution for a dynamic number of columns:
print (df)
Name genres col
0 A Action|Adventure|Science Fiction|Thriller 2
1 B Action|Adventure|Science Fiction|Thriller 3
2 C Adventure|Science Fiction|Thriller 5

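from itertools import chain

# All columns except genres (difference() returns them in sorted order)
cols = df.columns.difference(['genres'])
genres = df['genres'].str.split('|')

# Repeat each row once per genre, then attach the flattened genres
df = (df.loc[df.index.repeat(genres.str.len()), cols]
        .assign(genres=list(chain.from_iterable(genres.tolist()))))
print (df)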
Name col genres
0 A 2 Action
0 A 2 Adventure
0 A 2 Science Fiction
0 A 2 Thriller
1 B 3 Action
1 B 3 Adventure
1 B 3 Science Fiction
1 B 3 Thriller
2 C 5 Adventure
2 C 5 Science Fiction
2 C 5 Thriller
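
The output above keeps the repeated row labels (0, 0, 0, 0, 1, ...) because the rows were selected by a repeated index. If a unique index is preferred, one option is to finish with `reset_index(drop=True)`. The sketch below rebuilds the three-column frame by hand and reuses the same technique; the names `expanded` and `result` are illustrative.

```python
from itertools import chain

import pandas as pd

# Three-column sample data matching the printed frame above.
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "genres": [
        "Action|Adventure|Science Fiction|Thriller",
        "Action|Adventure|Science Fiction|Thriller",
        "Adventure|Science Fiction|Thriller",
    ],
    "col": [2, 3, 5],
})

cols = df.columns.difference(["genres"])
genres = df["genres"].str.split("|")

# Same technique as above: repeat rows, then attach the flattened genres.
expanded = (
    df.loc[df.index.repeat(genres.str.len()), cols]
      .assign(genres=list(chain.from_iterable(genres.tolist())))
)

# Replace the duplicated labels with a fresh 0..n-1 index.
result = expanded.reset_index(drop=True)
print(result)
```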

Regarding "pandas - How to split a string into multiple rows in Pandas", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52575290/
