I'm trying to write a Ruby regular expression that captures quoted phrases, but not ones preceded by a ":". For example:
Obama: "Yes, we can!"
should be ignored.
I wrote some tests:
Best answer
Edit: with a few more tweaks.
Depending on the exact input, this works for ASCII:
(?<! [:\s] ) \s* ( ["'] ) (?: (?! \1 ) . )+ \1
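As a rough sketch of how the ASCII pattern above might be written in Ruby: Ruby's /x flag ignores whitespace in the pattern just as Perl's does, and Ruby's /m flag plays the role of Perl's /s (dot matches newline). The constant and helper names here are made up for illustration.

```ruby
# The ASCII pattern from the answer, as a Ruby Regexp literal.
# /x: whitespace in the pattern is ignored; /m: "." also matches newlines.
ASCII_QUOTED = /(?<! [:\s] ) \s* ( ["'] ) (?: (?! \1 ) . )+ \1/xm

# Hypothetical helper: returns the matched text (including any leading
# whitespace consumed by \s*), or nil when the line has no acceptable quote.
def first_quoted(line)
  m = ASCII_QUOTED.match(line)
  m && m[0]
end

puts first_quoted('"Take off, hoser!"').inspect
puts first_quoted('Obama: "Yes, we can!"').inspect
```

The second call returns nil because the only opening quote is preceded by whitespace that itself follows a colon, so the negative lookbehind rejects every possible starting position.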
For Unicode "matching" quotes, you have to be a bit more specific about your pairings, perhaps along these lines:
(?xs) (?<!:) \s+
(?: ( ["'] ) (?: (?! \1 ) . )+ \1
| “ .*? ” # English etc
| ‘ .*? ’
| « .*? » # French, Spanish, Italian
| ‹ .*? ›
| „ .*? “ # German, Icelandic, Romanian
| ‚ .*? ‘
| „ .*? ” # Hungarian
| ” .*? ” # Swedish
| ’ .*? ’
| » .*? « # Danish, Hungarian
| › .*? ‹
| 「 .*? 」 # Japanese, Chinese
| 『 .*? 』
)
You can read more about the kinds of paired quotation marks that various languages use here.
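One way to keep such a list manageable in Ruby is to build the alternation from a table of opening and closing marks. This is only a sketch: the table repeats the pairs shown above and is not exhaustive, and the names QUOTE_PAIRS, UNICODE_QUOTED and unicode_quoted are invented for the example. Ruby has no s flag, so /m is used to let the dot match newlines.

```ruby
# Opening/closing pairs copied from the pattern above (not an exhaustive list).
QUOTE_PAIRS = [
  ["“", "”"], ["‘", "’"],     # English etc.
  ["«", "»"], ["‹", "›"],     # French, Spanish, Italian
  ["„", "“"], ["‚", "‘"],     # German, Icelandic, Romanian
  ["„", "”"],                 # Hungarian
  ["”", "”"], ["’", "’"],     # Swedish
  ["»", "«"], ["›", "‹"],     # Danish, Hungarian
  ["「", "」"], ["『", "』"],   # Japanese, Chinese
].freeze

# One "open .*? close" branch per pair, joined with "|".
paired = QUOTE_PAIRS.map do |open, close|
  "#{Regexp.escape(open)}.*?#{Regexp.escape(close)}"
end.join("|")

# Same shape as the answer's pattern: whitespace not preceded by ":",
# then either an ASCII-quoted run or one of the paired forms.
UNICODE_QUOTED = /(?<!:)\s+(?:(["'])(?:(?!\1).)+\1|#{paired})/m

# Hypothetical helper: the quoted text without the leading whitespace, or nil.
def unicode_quoted(line)
  m = UNICODE_QUOTED.match(line)
  m && m[0].strip
end

puts unicode_quoted("He said, «Bonjour»").inspect
puts unicode_quoted("Quoth the raven: ‘Nevermore!’").inspect
```

Like the pattern it copies, this requires at least one whitespace character before the opening mark, so a quote at the very start of a line is not matched.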
Here is a test program written in Perl, but the same principles should apply in Ruby:
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use open qw[ :std IO :utf8 ];
while (<DATA>) {
    print if / (?<! [:\s] ) \s* ( ["'] ) (?: (?! \1 ) . )+ \1/sx;
}
__END__
"Take off, hoser!"
Dorothy Parker:Brevity is the soul of lingerie.
Dorothy Parker:"Brevity is the soul of lingerie."
Dorothy Parker: "Brevity is the soul of lingerie."
Dorothy Parker: "Brevity is the soul of lingerie."
Larry Wall: I don't know if it's what you want, but it's what you get. :-)
Larry Wall said, "I don't know if it's what you want, but it's what you get. :-)"
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said, “I don't know if it's what you want, but it's what you get. :-)”
Boss: And what's that "goto" doing there?!?
Hacker: Er, I guess my finger slipped when I was typing "getservbyport"...
‘Nevermore!’ quoth the raven.
Quoth the raven: ‘Nevermore!’
'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c: "I wish I had never come here, and I don't want to see no more magic," he said, and fell silent.
src/perl/mg.c: 'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c => "I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent."
‘I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.’
“I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.”
The output is:
"Take off, hoser!"
Larry Wall: I don't know if it's what you want, but it's what you get. :-)
Larry Wall said, "I don't know if it's what you want, but it's what you get. :-)"
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said, “I don't know if it's what you want, but it's what you get. :-)”
Boss: And what's that "goto" doing there?!?
Hacker: Er, I guess my finger slipped when I was typing "getservbyport"...
'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c: 'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c => "I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent."
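This may look "wrong", but that is because of the embedded quotes: the apostrophes in words like "don't" and "it's" count as ASCII single quotes. Here is a more complete version that illustrates the problem better: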
#!/usr/bin/perl
use strict;
use warnings;
use utf8;
use open qw[ :std IO :utf8 ];
while (<DATA>) {
    chomp;
    my $bingo = m{
        (?<! [:\s] ) \s*
        (?: (?<= ^ )
          | (?<= \s )
        )
        (?: ( ["'] ) (?: (?! \1 ) . )+ \1
          | “ .*? ” # English etc
          | ‘ .*? ’
        )
    }sx;
    if ($bingo) {
        printf("Line %2d, quote 「%s」\n", $., $&);
        printf(" " x 7 . "in line 『%s』\n", $_);
    } else {
        printf("Line %2d IGNORE 『%s』\n", $., $_);
    }
}
__END__
"Take off, hoser!"
Dorothy Parker:Brevity is the soul of lingerie.
Dorothy Parker:"Brevity is the soul of lingerie."
Dorothy Parker: "Brevity is the soul of lingerie."
Dorothy Parker: "Brevity is the soul of lingerie."
Larry Wall: I don't know if it's what you want, but it's what you get. :-)
Larry Wall said, "I don't know if it's what you want, but it's what you get. :-)"
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”
Larry Wall said, “I don't know if it's what you want, but it's what you get. :-)”
Boss: And what's that "goto" doing there?!?
Hacker: Er, I guess my finger slipped when I was typing "getservbyport"...
‘Nevermore!’ quoth the raven.
Quoth the raven: ‘Nevermore!’
'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c: "I wish I had never come here, and I don't want to see no more magic," he said, and fell silent.
src/perl/mg.c: 'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.
src/perl/mg.c => "I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent."
‘I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.’
“I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.”
Its output is:
Line  1, quote 「"Take off, hoser!"」
       in line 『"Take off, hoser!"』
Line  2 IGNORE 『Dorothy Parker:Brevity is the soul of lingerie.』
Line  3 IGNORE 『Dorothy Parker:"Brevity is the soul of lingerie."』
Line  4 IGNORE 『Dorothy Parker: "Brevity is the soul of lingerie."』
Line  5 IGNORE 『Dorothy Parker: "Brevity is the soul of lingerie."』
Line  6 IGNORE 『Larry Wall: I don't know if it's what you want, but it's what you get. :-)』
Line  7, quote 「 "I don't know if it's what you want, but it's what you get. :-)"」
       in line 『Larry Wall said, "I don't know if it's what you want, but it's what you get. :-)"』
Line  8 IGNORE 『Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”』
Line  9 IGNORE 『Larry Wall said: “I don't know if it's what you want, but it’s what you get. :-)”』
Line 10, quote 「 “I don't know if it's what you want, but it's what you get. :-)”」
       in line 『Larry Wall said, “I don't know if it's what you want, but it's what you get. :-)”』
Line 11, quote 「 "goto"」
       in line 『Boss: And what's that "goto" doing there?!?』
Line 12, quote 「 "getservbyport"」
       in line 『Hacker: Er, I guess my finger slipped when I was typing "getservbyport"...』
Line 13, quote 「‘Nevermore!’」
       in line 『‘Nevermore!’ quoth the raven.』
Line 14 IGNORE 『Quoth the raven: ‘Nevermore!’』
Line 15, quote 「'I wish I had never come here, and I don'」
       in line 『'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.』
Line 16 IGNORE 『src/perl/mg.c: "I wish I had never come here, and I don't want to see no more magic," he said, and fell silent.』
Line 17 IGNORE 『src/perl/mg.c: 'I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent.』
Line 18, quote 「 "I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent."」
       in line 『src/perl/mg.c => "I wish I had never come here, and I don't want to see no more magic,' he said, and fell silent."』
Line 19, quote 「‘I wish I had never come here, and I don’」
       in line 『‘I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.’』
Line 20, quote 「“I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.”」
       in line 『“I wish I had never come here, and I don’t want to see no more magic,’ he said, and fell silent.”』
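For comparison, here is a sketch of how the more complete pattern might look in Ruby. It is a port under assumptions, not a tested drop-in: the `(?<= ^ )` branch is written as a plain `^` (which asserts the same thing), /m stands in for Perl's /s, and the constant name and the sample lines chosen from the data above are this example's own choices.

```ruby
QUOTE_AFTER_NON_COLON = /
  (?<! [:\s] ) \s*          # optional whitespace, not preceded by a colon or whitespace
  (?: ^ | (?<= \s ) )       # the quote opens at line start or right after whitespace
  (?: ( ["'] ) (?: (?! \1 ) . )+ \1
    | “ .*? ”
    | ‘ .*? ’
  )
/xm

samples = [
  %q{"Take off, hoser!"},
  %q{Dorothy Parker: "Brevity is the soul of lingerie."},
  %q{Boss: And what's that "goto" doing there?!?},
  %q{'I wish I had never come here, and I don't want to see no more magic,' he said.},
]

# Report the matched quote, or mark the line as ignored.
samples.each do |line|
  m = QUOTE_AFTER_NON_COLON.match(line)
  puts(m ? "quote  #{m[0].inspect}" : "IGNORE #{line.inspect}")
end
```

The last sample shows the same truncation as line 15 of the Perl output: the ASCII branch stops at the apostrophe in "don't", because it cannot tell an apostrophe from a closing single quote.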
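In addition, there is a standard Unicode derived property called \p{Quotation_Mark}, or \p{QMark} for short, but Ruby doesn't support it. You can use the unichars script to list all of the characters that have it: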
$ unichars '\p{qmark}'
" 34 0022 QUOTATION MARK
' 39 0027 APOSTROPHE
« 171 00AB LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
» 187 00BB RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
‘ 8216 2018 LEFT SINGLE QUOTATION MARK
’ 8217 2019 RIGHT SINGLE QUOTATION MARK
‚ 8218 201A SINGLE LOW-9 QUOTATION MARK
‛ 8219 201B SINGLE HIGH-REVERSED-9 QUOTATION MARK
“ 8220 201C LEFT DOUBLE QUOTATION MARK
” 8221 201D RIGHT DOUBLE QUOTATION MARK
„ 8222 201E DOUBLE LOW-9 QUOTATION MARK
‟ 8223 201F DOUBLE HIGH-REVERSED-9 QUOTATION MARK
‹ 8249 2039 SINGLE LEFT-POINTING ANGLE QUOTATION MARK
› 8250 203A SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
「 12300 300C LEFT CORNER BRACKET
」 12301 300D RIGHT CORNER BRACKET
『 12302 300E LEFT WHITE CORNER BRACKET
』 12303 300F RIGHT WHITE CORNER BRACKET
〝 12317 301D REVERSED DOUBLE PRIME QUOTATION MARK
〞 12318 301E DOUBLE PRIME QUOTATION MARK
〟 12319 301F LOW DOUBLE PRIME QUOTATION MARK
﹁ 65089 FE41 PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
﹂ 65090 FE42 PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
﹃ 65091 FE43 PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
﹄ 65092 FE44 PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
＂ 65282 FF02 FULLWIDTH QUOTATION MARK
＇ 65287 FF07 FULLWIDTH APOSTROPHE
｢ 65378 FF62 HALFWIDTH LEFT CORNER BRACKET
｣ 65379 FF63 HALFWIDTH RIGHT CORNER BRACKET
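Where the property is not available in the regex engine, one possible workaround is to spell the set out explicitly. The sketch below simply copies the code points from the listing above into a Ruby array and builds a pattern from them; QMARK_CODEPOINTS and QMARK_CLASS are names invented for this example.

```ruby
# Code points copied from the unichars listing above.
QMARK_CODEPOINTS = [
  0x0022, 0x0027, 0x00AB, 0x00BB,
  0x2018, 0x2019, 0x201A, 0x201B, 0x201C, 0x201D, 0x201E, 0x201F,
  0x2039, 0x203A,
  0x300C, 0x300D, 0x300E, 0x300F, 0x301D, 0x301E, 0x301F,
  0xFE41, 0xFE42, 0xFE43, 0xFE44,
  0xFF02, 0xFF07, 0xFF62, 0xFF63,
].freeze

# Turn each code point into a one-character UTF-8 string and let
# Regexp.union build an alternation that matches any one of them.
QMARK_CLASS = Regexp.union(QMARK_CODEPOINTS.map { |cp| [cp].pack("U") })

# Collect every quotation mark found in a string.
puts "He said “hi” and «bye»".scan(QMARK_CLASS).inspect
```

This only tells you that a character is some kind of quotation mark. It says nothing about which opening mark pairs with which closing mark, so the explicit pairing shown earlier is still needed for matching whole quotations.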
You can use the uniprops script to list all the properties of a code point:
$ uniprops -a 2018
U+2018 ‹‘› \N{ LEFT SINGLE QUOTATION MARK }:
\pP \p{Pi}
All Any Assigned InGeneralPunctuation Case_Ignorable CI Common Zyyy Pi P General_Punctuation Gr_Base Grapheme_Base Graph GrBase Initial_Punctuation Punct Pat_Syn Pattern_Syntax PatSyn Print Punctuation QMark Quotation_Mark X_POSIX_Graph X_POSIX_Print X_POSIX_Punct
Age=1.1 Bidi_Class=ON Bidi_Class=Other_Neutral BC=ON Block=General_Punctuation Canonical_Combining_Class=0 Canonical_Combining_Class=Not_Reordered CCC=NR Canonical_Combining_Class=NR Script=Common Decomposition_Type=None DT=None East_Asian_Width=A East_Asian_Width=Ambiguous EA=A Grapheme_Cluster_Break=Other GCB=XX Grapheme_Cluster_Break=XX Hangul_Syllable_Type=NA Hangul_Syllable_Type=Not_Applicable HST=NA Joining_Group=No_Joining_Group JG=NoJoiningGroup Joining_Type=Non_Joining JT=U Joining_Type=U Line_Break=QU Line_Break=Quotation LB=QU Numeric_Type=None NT=None Numeric_Value=NaN NV=NaN Present_In=1.1 IN=1.1 Present_In=2.0 IN=2.0 Present_In=2.1 IN=2.1 Present_In=3.0 IN=3.0 Present_In=3.1 IN=3.1 Present_In=3.2 IN=3.2 Present_In=4.0 IN=4.0 Present_In=4.1 IN=4.1 Present_In=5.0 IN=5.0 Present_In=5.1 IN=5.1 Present_In=5.2 IN=5.2 Present_In=6.0 IN=6.0 SC=Zyyy Script=Zyyy Sentence_Break=CL Sentence_Break=Close SB=CL Word_Break=MB Word_Break=MidNumLet WB=MB _Case_Ignorable _X_Begin
Regarding "Ruby regular expression: ignore quotes if a colon stands before them", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/4984171/