python-3.x - Applying the pretrained facebook/bart-large-cnn model for text summarization in Python

Reposted. Author: 行者123. Updated: 2023-12-04 08:02:43

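I am getting started with Hugging Face transformers and have gained some understanding of the library. I am using the facebook/bart-large-cnn model to perform text summarization for my project, and I am currently running some tests with the following code: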

text = """
Justin Timberlake and Jessica Biel, welcome to parenthood.
The celebrity couple announced the arrival of their son, Silas Randall Timberlake, in
statements to People."""

from transformers import pipeline
smr_bart = pipeline(task="summarization", model="facebook/bart-large-cnn")
smbart = smr_bart(text, max_length=150)
print(smbart[0]['summary_text'])
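This short piece of code actually gives me a good summary of the text. My question is how to apply the same pretrained model to a DataFrame column. My DataFrame looks like this: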
ID        Lang          Text
1 EN some long text here...
2 EN some long text here...
3 EN some long text here...
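... and so on, for 50K rows
Now I want to apply the pretrained model to the Text column to generate a new column df['summary'], so the resulting DataFrame should look like this: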
ID        Lang         Text                              Summary
1 EN some long text here... Text summary goes here...
2 EN some long text here... Text summary goes here...
3 EN some long text here... Text summary goes here...
How can I achieve this? Any help would be much appreciated.

Best answer

One thing you can always do is use the DataFrame apply function:

import pandas as pd

# Build a small 10-row test frame by repeating the sample text
df = pd.DataFrame([('EN',text)]*10, columns=['Lang','Text'])

df['summary'] = df.apply(lambda x: smr_bart(x['Text'], max_length=150)[0]['summary_text'] , axis=1)

df.head(3)
Output:
    Lang    Text                                                summary
0 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
1 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
2 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
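This is somewhat inefficient because the pipeline is invoked once for every row (execution time: 2 min 16 s). I therefore suggest converting the Text column to a list and passing it directly to the pipeline (execution time: 41 s):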
df = pd.DataFrame([('EN',text)]*10, columns=['Lang','Text'])

df['summary'] = [x['summary_text'] for x in smr_bart(df['Text'].tolist(), max_length=150)]

df.head(3)
Output:
    Lang    Text                                                summary
0 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
1 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
2 EN \nJustin Timberlake and Jessica Biel, welcome ... The celebrity couple announced the arrival of ...
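For the 50K rows mentioned in the question, handing the entire list to the pipeline in a single call may be impractical, since nothing is returned until every row has been processed. A possible middle ground is to feed the list in fixed-size chunks. The following is only a minimal sketch: the helper name summarize_in_chunks and the chunk_size default are hypothetical, and it assumes the summarizer returns one dict with a 'summary_text' key per input string, as smr_bart does in the examples above.

```python
from typing import Callable, Dict, List


def summarize_in_chunks(
    texts: List[str],
    summarizer: Callable[..., List[Dict[str, str]]],
    chunk_size: int = 64,
    **generate_kwargs,
) -> List[str]:
    """Summarize texts chunk by chunk and return summaries in input order.

    summarizer is any callable that accepts a list of strings and returns a
    list of dicts with a 'summary_text' key (for example, smr_bart above).
    Extra keyword arguments such as max_length are forwarded unchanged.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")

    summaries: List[str] = []
    for start in range(0, len(texts), chunk_size):
        # Slice out the next chunk; the last one may be shorter.
        chunk = texts[start:start + chunk_size]
        results = summarizer(chunk, **generate_kwargs)
        summaries.extend(r["summary_text"] for r in results)
    return summaries


# Intended use with the objects from the answer (not executed here):
# df['summary'] = summarize_in_chunks(
#     df['Text'].tolist(), smr_bart, chunk_size=64, max_length=150
# )
```

Because the helper appends results chunk by chunk, it is easy to add progress reporting or intermediate saving inside the loop. If some texts are longer than the model's input limit, passing truncation=True through the keyword arguments may help, but that is an assumption worth checking against the installed transformers version.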

Regarding "python-3.x - Applying the pretrained facebook/bart-large-cnn model for text summarization in Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/66372741/
