
Pandas rolling sum on a string column

Reposted. Author: 行者123. Updated: 2023-12-04 12:45:37

I am using Python 3 with Pandas version '0.19.2'.

I have a Pandas df as follows:

chat_id    line
1          'Hi.'
1          'Hi, how are you?.'
1          'I'm well, thanks.'
2          'Is it going to rain?.'
2          'No, I don't think so.'
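
To reproduce the question, the frame above could be built roughly like this. It is a minimal sketch: it assumes the quotes in the listing are display delimiters rather than part of the data, and the variable name df simply follows the question.

```python
import pandas as pd

# Sample data copied from the question; the quotes shown in the listing
# are assumed NOT to be part of the stored strings.
df = pd.DataFrame({
    "chat_id": [1, 1, 1, 2, 2],
    "line": [
        "Hi.",
        "Hi, how are you?.",
        "I'm well, thanks.",
        "Is it going to rain?.",
        "No, I don't think so.",
    ],
})
print(df)
```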

I want to group by 'chat_id' and then do something like a rolling sum on 'line' to get the following result:
chat_id    line                     conversation
1          'Hi.'                    'Hi.'
1          'Hi, how are you?.'      'Hi. Hi, how are you?.'
1          'I'm well, thanks.'      'Hi. Hi, how are you?. I'm well, thanks.'
2          'Is it going to rain?.'  'Is it going to rain?.'
2          'No, I don't think so.'  'Is it going to rain?. No, I don't think so.'

I believe df.groupby('chat_id')['line'].cumsum() only works on numeric columns.

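I also tried df.groupby(by=['chat_id'], as_index=False)['line'].apply(list) to get a list of all the lines in each full conversation, but then I could not work out how to unpack that list to create the 'rolling sum' style conversation column.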
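
For the list-based attempt, one possible way to unpack the per-chat lists is itertools.accumulate from the standard library. This is only a sketch: the helper name running_join is hypothetical, the sample frame is rebuilt from the question, and the flattening step assumes the rows are already sorted by chat_id so that the group order matches the row order.

```python
from itertools import accumulate

import pandas as pd

# Sample data from the question (quotes assumed not to be part of the data).
df = pd.DataFrame({
    "chat_id": [1, 1, 1, 2, 2],
    "line": [
        "Hi.",
        "Hi, how are you?.",
        "I'm well, thanks.",
        "Is it going to rain?.",
        "No, I don't think so.",
    ],
})


def running_join(lines, sep=" "):
    """Hypothetical helper: cumulative joins of a list of strings."""
    return list(accumulate(lines, lambda acc, nxt: acc + sep + nxt))


# One list of lines per chat, as in the attempt above.
lists = df.groupby("chat_id")["line"].apply(list)

# Flatten the running joins back into one value per row.
# ASSUMPTION: df is sorted by chat_id, so this order matches df's rows.
df["conversation"] = [text for lines in lists for text in running_join(lines)]
print(df["conversation"].tolist())
```

If the rows are not sorted by chat_id, this flattening would misalign, and an index-preserving approach would be the safer choice.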

Best answer

Applying Series.cumsum to each group works for me; if a separator is needed, add a space:

df['new'] = df.groupby('chat_id')['line'].apply(lambda x: (x + ' ').cumsum().str.strip())
print(df)
   chat_id  line                   new
0  1        Hi.                    Hi.
1  1        Hi, how are you?.      Hi. Hi, how are you?.
2  1        I'm well, thanks.      Hi. Hi, how are you?. I'm well, thanks.
3  2        Is it going to rain?.  Is it going to rain?.
4  2        No, I don't think so.  Is it going to rain?. No, I don't think so.
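
# If the strings in 'line' carry literal quote characters, strip them first
# and wrap each joined conversation in quotes again:
df['line'] = df['line'].str.strip("'")
df['new'] = df.groupby('chat_id')['line'].apply(lambda x: "'" + (x + ' ').cumsum().str.strip() + "'")
print(df)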
   chat_id  line                   new
0  1        Hi.                    'Hi.'
1  1        Hi, how are you?.      'Hi. Hi, how are you?.'
2  1        I'm well, thanks.      'Hi. Hi, how are you?. I'm well, thanks.'
3  2        Is it going to rain?.  'Is it going to rain?.'
4  2        No, I don't think so.  'Is it going to rain?. No, I don't think so.'
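
The question targets Pandas 0.19.2. On later releases, groupby(...).apply may return a result whose index also carries the group key, in which case assigning it straight back to df would no longer align. A hedged alternative, not part of the original answer, is to use transform, which keeps the original row index. The sample frame below is rebuilt from the question.

```python
import pandas as pd

# Sample data from the question (quotes assumed not to be part of the data).
df = pd.DataFrame({
    "chat_id": [1, 1, 1, 2, 2],
    "line": [
        "Hi.",
        "Hi, how are you?.",
        "I'm well, thanks.",
        "Is it going to rain?.",
        "No, I don't think so.",
    ],
})

# Same idea as the answer: append a separator, take the cumulative "sum"
# (string concatenation) within each chat, then strip the trailing space.
# transform returns a result aligned to df's own index.
df["new"] = df.groupby("chat_id")["line"].transform(
    lambda x: (x + " ").cumsum().str.strip()
)
print(df["new"].tolist())
```

The separator is appended before cumsum because cumsum on strings concatenates them with nothing in between; the final str.strip removes the trailing space that this leaves on every row.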

Regarding a Pandas rolling sum on a string column, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/43569056/
