
php - Matching a character set and optional entities

Reposted. Author: 可可西里. Updated: 2023-11-01 01:09:05

I want to use this pattern to insert a word-break character (a soft hyphen) after every 5 characters of a string.

([^\s-]{5})([^\s-]{5})

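Unfortunately, it also breaks in the middle of character entities (&#xxx;). Can anyone give me an example that does not break the entity codes? The string I am breaking comes from XML, so the actual entities are further escaped (&amp;#xxx;).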

Edit: code example

preg_replace('/([^\s-]{5})([^\s-]{5})/', '$1&shy;$2', $subject)

Given the word "F&#xe5;revejle" (Fårevejle)
Expect "F&#xe5;&shy;revejle" as result
But it outputs "F&#xe&shy;5;revejle" instead
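
The failure can be reproduced with a few lines of PHP. This is a minimal sketch, assuming the subject holds the entity in its unescaped form (&#xe5;); the variable names are illustrative only. Because the character class [^\s-] treats "&", "#" and ";" as ordinary characters, the first group of five ends in the middle of the entity.

```php
<?php
// The pattern from the question, applied to a word containing a numeric entity.
// Assumption: the entity appears unescaped (&#xe5;) in the subject.
$subject = 'F&#xe5;revejle';

// The first group captures "F&#xe", the second "5;rev", so the soft hyphen
// lands inside the entity.
$broken = preg_replace('/([^\s-]{5})([^\s-]{5})/', '$1&shy;$2', $subject);

echo $broken, "\n"; // F&#xe&shy;5;revejle
```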

Best answer

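Assuming you want to split every word after each run of five characters, unless the parts are already separated by a hyphen, and to count an entity as a single character, try this: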

$result = preg_replace(
    '/                  # Start the match
    (?:                 # at one of the following positions:
     (?<=               # Either right after...
      [\s-]             # a space or dash
     )                  # end of lookbehind
    |                   # or...
     \G                 # wherever the last match ended.
    )                   # End of start condition.
    (                   # Now match and capture the following:
     (?>                # Match the following in an atomic group:
      &amp;\#\w+;       # an entity
     |                  # or
      [^\s-]            # a non-space, non-dash character
     ){5}               # exactly 5 times.
    )                   # End of capture
    (?=[^\s-])          # Assert that we\'re not at the end of a "word"/x',
    '\1&shy;', $subject);

This changes

supercalifragilisticexpidon'tremember! 
alrea-dy se-parated
count entity as one character&amp;#345;blahblah
F&amp;#xe5;revejle

into

super&shy;calif&shy;ragil&shy;istic&shy;expid&shy;on'tr&shy;ememb&shy;er! 
alrea-dy se-parat&shy;ed
count entit&shy;y as one chara&shy;cter&amp;#345;&shy;blahb&shy;lah
F&amp;#xe5;rev&shy;ejle
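
To check the sample output above, the pattern can be wrapped in a small function. This is only a sketch: the function name insert_soft_hyphens is hypothetical, and the pattern is the answer's, condensed onto fewer lines. It assumes entities arrive in the escaped form &amp;#...; as described in the question.

```php
<?php
// Hypothetical helper name; the pattern is the one given in the answer.
function insert_soft_hyphens(string $subject): string
{
    $pattern = '/
        (?: (?<=[\s-]) | \G )              # start after a space or dash, or where the last match ended
        ( (?> &amp;\#\w+; | [^\s-] ){5} )  # five units, an escaped entity counting as one
        (?=[^\s-])                         # only if the word continues
    /x';

    // preg_replace returns null on error; fall back to the unchanged input.
    return preg_replace($pattern, '\1&shy;', $subject) ?? $subject;
}

echo insert_soft_hyphens('F&amp;#xe5;revejle'), "\n";  // F&amp;#xe5;rev&shy;ejle
echo insert_soft_hyphens('alrea-dy se-parated'), "\n"; // alrea-dy se-parat&shy;ed
```

The atomic group (?> ... ) matters here: once an entity has been consumed as one unit, the engine cannot backtrack into it and recount its characters individually.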

Regarding php - Matching a character set and optional entities, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/5181403/
