
python - How do I loop over the list values in a specific Pandas column?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:01:04

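I have a Pandas DataFrame whose first column holds list values. I want to loop over every str value in each list and produce one row per value, carrying along the values of the other columns from the same row.

For example: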

import pandas as pd

tm = pd.DataFrame({
    'author': [['author_a1', 'author_a2', 'author_a3'],
               ['author_b1', 'author_b2'],
               ['author_c1', 'author_c2']],
    'journal': ['journal01', 'journal02', 'journal03'],
    'date': pd.date_range('2015-02-03', periods=3),
})
tm

                              author       date    journal
0  [author_a1, author_a2, author_a3] 2015-02-03  journal01
1             [author_b1, author_b2] 2015-02-04  journal02
2             [author_c1, author_c2] 2015-02-05  journal03

I want this:

      author       date    journal
0  author_a1 2015-02-03  journal01
1  author_a2 2015-02-03  journal01
2  author_a3 2015-02-03  journal01
3  author_b1 2015-02-04  journal02
4  author_b2 2015-02-04  journal02
5  author_c1 2015-02-05  journal03
6  author_c2 2015-02-05  journal03

I solved it with the convoluted approach below. Is there a simpler, more efficient way to do this in pandas?

author_use = []
date_use = []
journal_use = []

# Walk every row, then every author in that row's list,
# copying the row's date and journal for each author.
for i in range(len(tm['author'])):
    for m in range(len(tm['author'][i])):
        author_use.append(tm['author'][i][m])
        date_use.append(tm['date'][i])
        journal_use.append(tm['journal'][i])

df_author = pd.DataFrame({'author': author_use,
                          'date': date_use,
                          'journal': journal_use,
                          })

df_author

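Best answer

I think you can use numpy.repeat to repeat the date and journal values according to the list lengths obtained with str.len, and chain to flatten the nested lists: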

from itertools import chain

import numpy as np
import pandas as pd

# Number of authors in each row's list
lens = tm.author.str.len()

df = pd.DataFrame({
    "date": np.repeat(tm.date.values, lens),
    "journal": np.repeat(tm.journal.values, lens),
    "author": list(chain.from_iterable(tm.author))})

print(df)

      author       date    journal
0  author_a1 2015-02-03  journal01
1  author_a2 2015-02-03  journal01
2  author_a3 2015-02-03  journal01
3  author_b1 2015-02-04  journal02
4  author_b2 2015-02-04  journal02
5  author_c1 2015-02-05  journal03
6  author_c2 2015-02-05  journal03
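
To see why the rows line up, here is a minimal sketch of the two building blocks on their own. The toy inputs `values` and `nested` are made up for illustration and are not part of the question: np.repeat repeats each element by its matching count, and chain.from_iterable flattens the lists in the same row order.

```python
from itertools import chain

import numpy as np

# Hypothetical toy data: one value per row, and one list per row.
values = np.array(["journal01", "journal02"])
nested = [["a1", "a2", "a3"], ["b1", "b2"]]

# One count per row: how many items each list holds.
lens = [len(items) for items in nested]

# Each value is repeated as many times as its list is long ...
repeated = np.repeat(values, lens).tolist()
# ... and the lists are flattened in the same row order,
# so position i of `repeated` belongs to position i of `flat`.
flat = list(chain.from_iterable(nested))

print(lens)
print(repeated)
print(flat)
```

Because both results are built in row order with the same counts, they can be placed side by side as columns of one DataFrame.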

Another numpy solution:

# Repeat the (date, journal) rows by list length, then append the flattened authors
df = pd.DataFrame(
    np.column_stack((
        tm[['date', 'journal']].values.repeat(list(map(len, tm.author)), axis=0),
        np.hstack(tm.author))),
    columns=['date', 'journal', 'author'])

print(df)
                  date    journal     author
0  2015-02-03 00:00:00  journal01  author_a1
1  2015-02-03 00:00:00  journal01  author_a2
2  2015-02-03 00:00:00  journal01  author_a3
3  2015-02-04 00:00:00  journal02  author_b1
4  2015-02-04 00:00:00  journal02  author_b2
5  2015-02-05 00:00:00  journal03  author_c1
6  2015-02-05 00:00:00  journal03  author_c2
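
This is not part of the original answer: pandas 0.25 and later also provide DataFrame.explode, which may be the simplest option if that version is available. The sketch below rebuilds the question's `tm` frame so that it runs on its own.

```python
import pandas as pd

tm = pd.DataFrame({
    "author": [["author_a1", "author_a2", "author_a3"],
               ["author_b1", "author_b2"],
               ["author_c1", "author_c2"]],
    "journal": ["journal01", "journal02", "journal03"],
    "date": pd.date_range("2015-02-03", periods=3),
})

# explode() emits one row per list element and repeats the other columns.
# The original row labels are repeated too, so reset_index(drop=True)
# replaces them with a fresh 0..n-1 index.
df = tm.explode("author").reset_index(drop=True)

print(df)
```

Unlike the column_stack approach, this keeps the date column as datetime values instead of converting everything to generic objects.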

Regarding "python - How do I loop over the list values in a specific Pandas column?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/40888274/
