javascript - Split a paragraph into sentences

Reposted. Author: 行者123. Updated: 2023-11-28 17:33:30
I am using the following Python code (which I found online a while ago) to split a paragraph into sentences.

import re

def splitParagraphIntoSentences(paragraph):
    sentenceEnders = re.compile(r"""
        # Split sentences on whitespace between them.
        (?:               # Group for two positive lookbehinds.
          (?<=[.!?])      # Either an end of sentence punct,
        | (?<=[.!?]['"])  # or end of sentence punct and quote.
        )                 # End group of two positive lookbehinds.
        (?<! Mr\. )       # Don't end sentence on "Mr."
        (?<! Mrs\. )      # Don't end sentence on "Mrs."
        (?<! Jr\. )       # Don't end sentence on "Jr."
        (?<! Dr\. )       # Don't end sentence on "Dr."
        (?<! Prof\. )     # Don't end sentence on "Prof."
        (?<! Sr\. )       # Don't end sentence on "Sr."
        \s+               # Split on whitespace between sentences.
        """,
        re.IGNORECASE | re.VERBOSE)
    sentenceList = sentenceEnders.split(paragraph)
    return sentenceList

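It works fine, but now I need exactly the same regex in JavaScript (to make sure the output is consistent), and I am struggling to convert this Python regex into one that is compatible with JavaScript.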
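For reference, a fairly direct port is possible on JavaScript engines that support lookbehind assertions (added in ES2018). The sketch below is a translation under that assumption, not a verified drop-in equivalent: JavaScript has no verbose flag, so the pattern is assembled from commented string fragments, and the names SENTENCE_BOUNDARY and splitWithLookbehind are made up for illustration.

```javascript
// Assumes an engine with ES2018 lookbehind support.
// Each fragment mirrors one line of the verbose Python pattern.
const SENTENCE_BOUNDARY = new RegExp(
  [
    "(?:(?<=[.!?])|(?<=[.!?]['\"]))", // end-of-sentence punct, optionally followed by a quote
    "(?<!Mr\\.)",   // don't end sentence on "Mr."
    "(?<!Mrs\\.)",  // don't end sentence on "Mrs."
    "(?<!Jr\\.)",   // don't end sentence on "Jr."
    "(?<!Dr\\.)",   // don't end sentence on "Dr."
    "(?<!Prof\\.)", // don't end sentence on "Prof."
    "(?<!Sr\\.)",   // don't end sentence on "Sr."
    "\\s+",         // split on the whitespace between sentences
  ].join(""),
  "i" // counterpart of re.IGNORECASE
);

// Hypothetical helper name; returns an array of sentences.
function splitWithLookbehind(paragraph) {
  return paragraph.split(SENTENCE_BOUNDARY);
}

console.log(splitWithLookbehind("Hello world. How are you? Dr. Smith is here."));
// → [ 'Hello world.', 'How are you?', 'Dr. Smith is here.' ]
```

The pattern contains no capturing groups, so split returns only the sentences. The definition of \s differs slightly between Python and JavaScript for some Unicode whitespace, so identical output on every input should not be taken for granted.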

Best answer

This is not a regex that splits directly, but a workaround:

(?!Mrs?\.|Jr\.|Dr\.|Sr\.|Prof\.)(\b\S+[.?!]["']?)\s

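You can replace each matched fragment with, for example, $1# (or, instead of #, some other character that does not occur in the text) and then split the result on #. This is not a very elegant solution, however.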
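The replace-then-split workaround can be sketched roughly as follows. The function name splitViaMarker and the guard against a marker that already occurs in the text are additions for illustration; only the regex and the $1# idea come from the answer.

```javascript
// Regex from the answer, with the global flag so that every match is replaced.
const SENTENCE_END = /(?!Mrs?\.|Jr\.|Dr\.|Sr\.|Prof\.)(\b\S+[.?!]["']?)\s/g;

// Hypothetical helper: mark each sentence end, then split on the marker.
function splitViaMarker(paragraph, marker = "#") {
  // The marker must not occur in the text, or the split would be wrong.
  if (paragraph.includes(marker)) {
    throw new Error("marker already occurs in the text: " + marker);
  }
  // Equivalent to replacing with "$1#": keep the captured word and
  // punctuation, and swap the trailing whitespace for the marker.
  const marked = paragraph.replace(
    SENTENCE_END,
    (match, sentenceEnd) => sentenceEnd + marker
  );
  return marked.split(marker);
}

console.log(splitViaMarker("Hello world. How are you? Dr. Smith is here."));
// → [ 'Hello world.', 'How are you?', 'Dr. Smith is here.' ]
```

Unlike the Python original, this pattern is case-sensitive and consumes only a single whitespace character per sentence end, so the two versions may disagree on text with lower-case abbreviations or runs of whitespace.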

Regarding "javascript - Split a paragraph into sentences", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/32695810/
