python - How can I use a regular expression to tokenize this text into sentences?

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 15:11:00

"You could not possibly have come at a better time, my dear Watson," he said cordially. 'It is not worth your while to wait,' she went on."You can pass through the door; no one hinders." And then, seeing that I smiled and shook my head, she suddenly threw aside her constraint and made a step forward, with her hands wrung together.

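Look at the highlighted region. How can I distinguish the case where a quotation mark (") is followed by a period (.) that ends the sentence from the case where a period (.) is followed by a quotation mark (")?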

I have tried the following pattern for the tokenizer. It works well except for that part.

(([^।\.?!]|[।\.?!](?=[\"\']))+\s*[।\.?!]\s*)

Edit: I do not intend to use any NLP toolkit to solve this problem.

Best answer

Use NLTK here instead of a regular expression:

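from nltk import sent_tokenize

# your_string holds the passage quoted in the question.
# sent_tokenize relies on NLTK's Punkt model data, which has to be
# downloaded once with nltk.download() before the first call.
parts = sent_tokenize(your_string)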
# ['"You could not possibly have come at a better time, my dear Watson," he said cordially.', "'It is not worth your while to wait,' she went on.", '"You can pass through the door; no one hinders."', 'And then, seeing that I smiled and shook my head, she suddenly threw aside her constraint and made a step forward, with her hands wrung together.']
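
Since the question rules out NLP toolkits, here is a rough regex-only sketch for comparison. It is not part of the original answer, and the names SENTENCE_RE and split_sentences are made up for illustration. It rests on one heuristic assumption: a quotation mark that directly follows a terminator closes the sentence only when whitespace or the end of the text comes after it; otherwise it is taken to open the next sentence.

```python
import re

# Heuristic pattern (an assumption, not the asker's or answerer's method).
SENTENCE_RE = re.compile(
    r"""
    .+?                  # sentence body, matched lazily
    [.?!।]               # sentence terminator (the danda is kept from the question)
    (?:["'](?=\s|$))?    # closing quote, only if whitespace or end of text follows
    (?=\s|["']|$)        # then whitespace, an opening quote, or end of text
    """,
    re.VERBOSE | re.DOTALL,
)


def split_sentences(text):
    """Return the sentences found in text, stripped of surrounding whitespace."""
    return [match.strip() for match in SENTENCE_RE.findall(text)]


text = (
    '"You could not possibly have come at a better time, my dear Watson," '
    "he said cordially. 'It is not worth your while to wait,' she went on."
    '"You can pass through the door; no one hinders." And then, seeing that '
    "I smiled and shook my head, she suddenly threw aside her constraint and "
    "made a step forward, with her hands wrung together."
)

sentences = split_sentences(text)
for sentence in sentences:
    print(sentence)
```

On the passage from the question this yields four sentences, with the break falling between the period and the quotation mark in `went on."You`. The heuristic has clear limits: abbreviations such as "Mr." trigger a split, a quoted exclamation followed by a speaker tag (as in `"Stop!" he cried.`) is cut after the quote, and trailing text without a terminator is dropped.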

Regarding "python - How can I use a regular expression to tokenize this text into sentences?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/44209203/
