
php - Regex pattern to remove parentheses (and any parentheses inside them)

Reposted. Author: 行者123. Updated: 2023-12-05 05:19:05

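The input is the first paragraph of a Wikipedia page. I want to remove the parentheses themselves and anything between them.

However, sometimes (often) the HTML content inside the parentheses itself contains one or more parentheses, usually in the href="" of a link.

Take the following example: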

<p>
The <b>Sarcopterygii</b> or <b>lobe-finned fish</b> (from Greek σαρξ <i>sarx</i>, flesh, and πτερυξ <i>pteryx</i>, fin) – sometimes considered synonymous with <b>Crossopterygii</b> ("fringe-finned fish", from Greek κροσσός <i>krossos</i>, fringe) – constitute a <a href="/wiki/Clade" title="Clade">clade</a> (traditionally a <a href="/wiki/Class_(biology)" title="Class (biology)">class</a> or subclass) of the <a href="/wiki/Osteichthyes" title="Osteichthyes">bony fish</a>, though a strict <a href="/wiki/Cladistic" class="mw-redirect" title="Cladistic">cladistic</a> view includes the terrestrial <a href="/wiki/Vertebrate" title="Vertebrate">vertebrates</a>.
</p>

I would like the final result to be:

<p>
The <b>Sarcopterygii</b> or <b>lobe-finned fish</b> – sometimes considered synonymous with <b>Crossopterygii</b> – constitute a <a href="/wiki/Clade" title="Clade">clade</a> of the <a href="/wiki/Osteichthyes" title="Osteichthyes">bony fish</a>, though a strict <a href="/wiki/Cladistic" class="mw-redirect" title="Cladistic">cladistic</a> view includes the terrestrial <a href="/wiki/Vertebrate" title="Vertebrate">vertebrates</a>.
</p>

But when I use the preg_replace pattern below, it does not work: it gets confused by the parentheses inside the parentheses.

public function removeParentheses( $content ) {
    $pattern = '@\(.*?\)@';
    $content = preg_replace( $pattern, '', $content );
    $content = str_replace( ' .', '.', $content );
    $content = str_replace( '  ', ' ', $content ); // collapse two spaces into one
    return $content;
}
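
To see why the lazy pattern gets confused, it helps to run it on a small nested input. The sketch below uses a made-up string (`$nested` and `$lazyResult` are illustrative names, not from the question). The lazy quantifier stops at the first closing parenthesis, so the outer group is cut short and a stray `d)` is left behind.

```php
<?php
// Hypothetical sample input with one level of nesting.
$nested = 'a (b (c) d) e';

// Same lazy pattern as in removeParentheses(): it matches from the first "("
// to the nearest ")", that is "(b (c)", not the whole outer group.
$lazyResult = preg_replace('@\(.*?\)@', '', $nested);

// The tail " d) e" survives, including the unmatched ")".
var_dump($lazyResult);
```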

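Second, how can I leave the parentheses in a link's href="" and title="" alone? These (when they are not inside text parentheses) are important.

Best answer

You can replace all links with placeholders, then remove all parentheses, and finally swap the placeholders back for their original values.

This is done with preg_replace_callback(), passing in an occurrence counter and a replacements array to keep track of the links, then using your own removeParentheses() to strip the parentheses, and finally using str_replace() with array_keys() and array_values() to get your links back: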

<?php
$string = '<p>
The <b>Sarcopterygii</b> or <b>lobe-finned fish</b> (from Greek σαρξ <i>sarx</i>, flesh, and πτερυξ <i>pteryx</i>, fin) – sometimes considered synonymous with <b>Crossopterygii</b> ("fringe-finned fish", from Greek κροσσός <i>krossos</i>, fringe) – constitute a <a href="/wiki/Clade" title="Clade">clade</a> (traditionally a <a href="/wiki/Class_(biology)" title="Class (biology)">class</a> or subclass) of the <a href="/wiki/Osteichthyes" title="Osteichthyes">bony fish</a>, though a strict <a href="/wiki/Cladistic" class="mw-redirect" title="Cladistic">cladistic</a> view includes the terrestrial <a href="/wiki/Vertebrate" title="Vertebrate">vertebrates</a>.
</p>';
$occurrences = 0;
$replacements = [];

// Swap every link for a placeholder so parentheses in href/title are hidden.
$replacedString = preg_replace_callback("/<a .*?>.*?<\/a>/i", function($el) use (&$occurrences, &$replacements) {
    // The ||| are just to avoid unwanted matches; the closing ||| keeps
    // "|||1|||" from also matching inside "|||10|||" when restoring.
    $key = "|||".$occurrences++."|||";
    $replacements[$key] = $el[0];
    return $key;
}, $string);

function removeParentheses( $content ) {
    $pattern = '@\(.*?\)@';
    $content = preg_replace( $pattern, '', $content );
    $content = str_replace( ' .', '.', $content );
    $content = str_replace( '  ', ' ', $content ); // collapse two spaces into one
    return $content;
}

$replacedString = removeParentheses($replacedString);
// Get your links back.
$replacedString = str_replace(array_keys($replacements), array_values($replacements), $replacedString);
echo $replacedString;


Result

<p>
The <b>Sarcopterygii</b> or <b>lobe-finned fish</b> – sometimes considered synonymous with <b>Crossopterygii</b> – constitute a <a href="/wiki/Clade" title="Clade">clade</a> of the <a href="/wiki/Osteichthyes" title="Osteichthyes">bony fish</a>, though a strict <a href="/wiki/Cladistic" class="mw-redirect" title="Cladistic">cladistic</a> view includes the terrestrial <a href="/wiki/Vertebrate" title="Vertebrate">vertebrates</a>.
</p>

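However, this seems a bit fragile to me. As others have told you in the comments, you shouldn't parse HTML with regular expressions. A lot can change, and you may get unexpected results. Still, this may get you going in the right direction.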
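
As a concrete illustration of that fragility, here is a small sketch with a hypothetical input that the question does not contain: a `<span>` whose `title` attribute holds parentheses. Only `<a>` elements are swapped for placeholders, so the attribute is treated like ordinary text and gets damaged. The names `$html`, `$masked`, and `$stripped` are made up for this example.

```php
<?php
// Hypothetical input: parentheses inside an attribute of a tag that is NOT a link.
$html = '<span title="Class (biology)">class</span>';

// Placeholder step: only <a ...>...</a> is masked, so nothing changes here.
$masked = preg_replace_callback('/<a .*?>.*?<\/a>/i', function ($m) {
    return '|||0';
}, $html);

// Parenthesis removal then eats part of the title attribute.
$stripped = preg_replace('@\(.*?\)@', '', $masked);

echo $stripped, PHP_EOL; // the "(biology)" part of the title is gone
```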

Edit: Regarding parentheses inside parentheses, you can use a recursive pattern. Take a look at this great answer by Bart Kiers:

function removeParentheses( $content ) {
    $pattern = '@\(([^()]|(?R))*\)@';
    $content = preg_replace( $pattern, '', $content );
    $content = str_replace( ' .', '.', $content );
    $content = str_replace( '  ', ' ', $content ); // collapse two spaces into one
    return $content;
}
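
To make the recursive pattern's behaviour concrete, the sketch below runs it on a made-up nested string. It also shows one possible single-pass variant, which is an assumption of this sketch and not part of the answer: the PCRE verbs `(*SKIP)(*FAIL)` step over whole tags so that parentheses in attributes are never candidates, and tags inside a parenthesised group are consumed as a unit. The names `$nested`, `$html`, and `stripTextParentheses()` are illustrative, and the tag matcher `<[^>]*>` assumes that attribute values never contain `>`.

```php
<?php
// Hypothetical nested input.
$nested = 'a (b (c) d) e';

// (?R) re-enters the whole pattern, so each inner "(...)" is matched as a
// balanced unit before the outer ")" is consumed.
$recursiveResult = preg_replace('@\(([^()]|(?R))*\)@', '', $nested);
var_dump($recursiveResult); // only "a" and "e" remain, with the spaces between them

// Illustrative helper (not from the answer): strip balanced parentheses from
// text while leaving every tag outside them, attributes included, untouched.
function stripTextParentheses(string $html): string
{
    $pattern = '~
        <[^>]*>(*SKIP)(*FAIL)   # a tag outside parentheses: skip over it
        |
        (                       # group 1: one balanced parenthesised group
            \(
            (?: [^()<]++        #   plain text
              | <[^>]*>         #   a whole tag, whatever its attributes hold
              | (?1)            #   a nested balanced group
            )*
            \)
        )
    ~x';
    // preg_replace() returns null on failure; fall back to the input then.
    $result = preg_replace($pattern, '', $html) ?? $html;
    // Collapse runs of spaces left behind by the removed groups.
    return preg_replace('/ {2,}/', ' ', $result);
}

// Parentheses appear in an href outside the group and in one inside it.
$html = 'a <a href="/x_(y)">x</a> (b <a href="/c_(d)">c</a> (e)) f';
echo stripTextParentheses($html), PHP_EOL;
```

The first link keeps its `href` intact, while the second link disappears together with the parenthesised text that contains it. This avoids the placeholder round trip, but it is still regex-based and carries the same caveats about parsing HTML this way.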


Regarding "php - Regex pattern to remove parentheses (and any parentheses inside them)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/46814338/
