python - How to replace two or more repeated :punct: characters using re in Python?

Reposted by: 行者123 · Updated: 2023-12-04 01:10:07

I need to replace runs of two or more repeated punctuation characters in some strings, turning

"asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww qqqqqq aaaaa"

into

"asdasdasd - adasdasd asda -  asda wadsda +- + wwww qqqqqq aaaaa"

I built this regex using the regex101 app:

https://regex101.com/r/vdR5T1/1/

But when I try it in Python:

import re
texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww qqqqqq aaaaa"
rx = re.compile(r'([[:punct:]])\1{2,}')
texto = rx.sub(' ', texto)
print(texto)

I get this warning:

FutureWarning: Possible nested set at position 2
  rx = re.compile(r'([[:punct:]])\1{2,}')

How can I run this (or a similar) regex in Python?

Best answer

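Python's re module does not support POSIX bracket expressions, so [[:punct:]] is parsed as what looks like a nested character class (hence the warning). You can replace it with a character class that lists all ASCII punctuation characters, for example [!-/:-@[-`{-~]. Note also that your regex requires 3 or more identical characters (the initial capture group plus 2 or more repetitions). To match 2 or more you only need + instead of {2,}, and you need \1 in the replacement to keep one copy of the repeated character in the output. The example below replaces with \1 followed by a space, which reproduces your expected output: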

import re
texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww -- qqqqqq aaaaa"
rx = re.compile(r'([!-/:-@[-`{-~])\1+')
texto = rx.sub(r'\1 ', texto)
print(texto)

Output:

asdasdasd - adasdasd asda -  asda wadsda +- + wwww -  qqqqqq aaaaa
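The hand-written ranges in the answer cover the same 32 ASCII characters as Python's string.punctuation, so the class can also be built from that constant. The following is only a sketch of that variant, not part of the original answer; the names PUNCT_CLASS, PUNCT_RUN, collapse_punct_runs and the suffix parameter are made up for illustration.

```python
import re
import string

# Character class built from string.punctuation (the 32 ASCII punctuation
# characters); re.escape keeps ], \, ^ and - literal inside the brackets.
PUNCT_CLASS = '[' + re.escape(string.punctuation) + ']'

# A captured punctuation character followed by one or more copies of itself,
# i.e. a run of two or more identical characters.
PUNCT_RUN = re.compile('(' + PUNCT_CLASS + r')\1+')


def collapse_punct_runs(text, suffix=' '):
    """Replace each run of a repeated punctuation character with one copy plus suffix."""
    # A function replacement avoids having to escape backslashes in suffix.
    return PUNCT_RUN.sub(lambda m: m.group(1) + suffix, text)


texto = "asdasdasd - adasdasd asda ------- asda wadsda +-----+ wwww -- qqqqqq aaaaa"
print(collapse_punct_runs(texto))
print(collapse_punct_runs("wait!!! what??", suffix=''))
```

With the default suffix the first call prints the same line as the answer's output; passing suffix='' collapses each run to a single character without adding a space, so the second call prints "wait! what?".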

Regarding "python - How to replace two or more repeated :punct: characters using re in Python?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/65136664/
