Regex for a phrase repeated n times?

Repost. Author: 行者123. Updated: 2023-12-04 18:25:06

I let users enter a block of text, and I'm trying to prevent them from repeating a phrase more than 5 times. So this is fine:

I like fish very much I like fish very much I like fish very much

So is this:

Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy.

But this would not be:

I like fish very much I like fish very much I like fish very much I like fish very much I like fish very much I like fish very much I like fish very much I like fish very much

Nor would this:

Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy. Marshmallows are yummy.

Ideally, it would also be caught if it were entered like this:

I like fish very much
I like fish very much
I like fish very much
I like fish very much
I like fish very much
I like fish very much

I tried:

\b(\S.*\S)[ ,.]*\b(\1){5}

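But it doesn't always work: the result depends on the length of the phrase, and it only seems to work when every sentence ends with a period.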

Any ideas?

Best answer

Here is one possibility:

(\b\w.{3,49})\1{4}

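It captures 4 to 50 characters in a group (one word character at a word boundary, followed by 3 to 49 further characters) and checks whether that group is immediately followed by four more copies of itself, that is, whether it occurs at least 5 times in a row.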

https://regex101.com/r/tS6kHF/2

If the regex matches, the input contains a repeated phrase.
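
As a minimal sketch, the pattern above could be applied from Python as follows. The function name has_repeated_phrase is hypothetical, and the re.DOTALL flag is an assumption added here so that "." also matches newlines, which lets the one-phrase-per-line input from the question be caught as well.

```python
import re

# Pattern from the answer: a 4-50 character group starting at a word boundary,
# followed by four more consecutive copies of itself (five occurrences total).
# re.DOTALL is an assumption of this sketch: it lets "." match newlines too.
REPEATED_PHRASE = re.compile(r"(\b\w.{3,49})\1{4}", re.DOTALL)


def has_repeated_phrase(text: str) -> bool:
    """Return True if some 4-50 character phrase occurs 5+ times in a row."""
    return REPEATED_PHRASE.search(text) is not None


if __name__ == "__main__":
    print(has_repeated_phrase("I like fish very much " * 3))
    print(has_repeated_phrase("I like fish very much " * 8))
    print(has_repeated_phrase("I like fish very much\n" * 6))
```

Note that the pattern flags five consecutive occurrences. If only six or more should be rejected, as the question's "more than 5 times" suggests, the quantifier {4} would become {5}.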

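That said, this is probably not a good idea, especially for large input strings. As you can see in the link, it takes a very large number of steps: for every starting position in the input (for example, text beginning with "hello"), the engine has to take a candidate substring, check whether what follows repeats it, and then do the same for every other allowed length, up to 50 characters. Then it starts again from the next character, "e", and tries every allowed length from there, and so on. (You do need an upper bound, such as 50 characters; without one the computation grows, for example from about 8k steps to about 74k steps.)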

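Depending on the situation, this can be computationally expensive, so it may be better to use a different approach and find the repeated substrings programmatically.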
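
One way to do that programmatically, sketched here under stated assumptions rather than taken from the original answer, is to compare the text word by word instead of character by character. The names max_consecutive_repeats, repeats_too_often and max_phrase_words are hypothetical, and the sketch assumes that case, punctuation, and line breaks should be ignored when comparing phrases.

```python
import re


def max_consecutive_repeats(text: str, max_phrase_words: int = 12) -> int:
    """Largest number of back-to-back copies of any phrase of
    1..max_phrase_words words found in text (0 for text with no words)."""
    # Assumption: phrases are compared as lowercase words, so punctuation,
    # case, and line breaks do not matter.
    words = re.findall(r"\w+", text.lower())
    best = 1 if words else 0
    for period in range(1, max_phrase_words + 1):
        run = 0  # consecutive positions i where words[i] == words[i + period]
        for i in range(len(words) - period):
            if words[i] == words[i + period]:
                run += 1
                # A run of this length spans run + period words that repeat
                # with the given period, i.e. this many full phrase copies.
                best = max(best, (run + period) // period)
            else:
                run = 0
    return best


def repeats_too_often(text: str, limit: int = 5) -> bool:
    """True if some phrase is repeated more than `limit` times in a row."""
    return max_consecutive_repeats(text) > limit
```

The work here is roughly proportional to the number of words times max_phrase_words, and it does not depend on backtracking, so it stays predictable on large inputs.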

Regarding "Regex for a phrase repeated n times?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/53274214/
