
php - How to remove a DOM element's tag but keep its content?

Reposted. Author: 可可西里. Updated: 2023-11-01 00:29:49

I have PHP code that removes every node that has at least one attribute. Here is my code:

<?php

$data = <<<DATA
<div>
<p>These line shall stay</p>
<p class="myclass">Remove this one</p>
<p>But keep this</p>
<div style="color: red">and this</div>
</div>
DATA;

$dom = new DOMDocument();
$dom->loadHTML($data, LIBXML_HTML_NOIMPLIED);
$dom->removeChild($dom->doctype);

$xpath = new DOMXPath($dom);

$lines_to_be_removed = $xpath->query("//*[count(@*)>0]");

foreach ($lines_to_be_removed as $line) {
    $line->parentNode->removeChild($line);
}

// just to check
echo $dom->saveHTML();
?>

As you can see in the fiddle, this is the current output of the code above:

<div>
<p>These line shall stay</p>

<p>But keep this</p>

</div>

whereas this is the desired result:

<div>
<p>These line shall stay</p>
Remove this one
<p>But keep this</p>
and this
</div>

How can I do that?

Best answer

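Before removing each element, pull out its child nodes and insert them into the parent right after the element, so the content stays where the tag used to be.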

Example:

$data = <<<DATA
<div>
<p>These line shall stay</p>
<p class="myclass">Remove this one</p>
<p>But keep this</p>
<div style="color: red">and this</div>
<div style="color: red">and <p>also</p> this</div>
<div style="color: red">and this <div style="color: red">too</div></div>
</div>
DATA;

$dom = new DOMDocument();
$dom->loadHTML($data, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);

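// Select every element that has at least one attribute.
foreach ($xpath->query("//*[@*]") as $node) {
    $parent = $node->parentNode;
    // Move the children out, last child first, placing each directly after $node.
    // Inserting before $node->nextSibling each time keeps the original order.
    while ($node->hasChildNodes()) {
        $parent->insertBefore($node->lastChild, $node->nextSibling);
    }
    // The element is now empty and can be dropped.
    $parent->removeChild($node);
}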

echo $dom->saveHTML();

Output:

<div>
<p>These line shall stay</p>
Remove this one
<p>But keep this</p>
and this
and <p>also</p> this
and this too
</div>

https://3v4l.org/9qHRM

(I added some nested elements to show that this approach handles them safely.)
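
The same idea can be wrapped in a small reusable function. The following is only a sketch: the name unwrapElement is hypothetical and does not come from the answer, and it uses a slight variation of the loop above, moving the children in front of the element in forward order instead of after it in reverse order. The sample markup is made up for illustration.

```php
<?php
// Hypothetical helper (not part of the original answer): replaces an element
// with its own child nodes, keeping their order.
function unwrapElement(DOMNode $node): void
{
    $parent = $node->parentNode;
    if ($parent === null) {
        return; // detached node: there is no parent to move the children into
    }
    // Move each child in front of $node, first child first, so order is preserved.
    while ($node->firstChild !== null) {
        $parent->insertBefore($node->firstChild, $node);
    }
    // $node is empty now; remove the leftover tag.
    $parent->removeChild($node);
}

// Illustrative input with a nested attributed element.
$html = '<div><p class="a">one <b id="x">two</b></p><p>three</p></div>';

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$xpath = new DOMXPath($dom);

// The XPath result is a snapshot, so unwrapping nodes during iteration is fine.
foreach ($xpath->query('//*[@*]') as $node) {
    unwrapElement($node);
}

echo $dom->saveHTML();
```

Like the loop in the answer, this sketch assumes the element's parent is another element. A root element that carries attributes has the document as its parent and would need separate handling.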


A few side notes:

  • If you load the document with the additional LIBXML_HTML_NODEFDTD flag, you do not need $dom->removeChild($dom->doctype)
  • Your XPath expression can be simplified to //*[@*]
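
Both side notes can be checked with a short script. This is a minimal sketch using made-up sample markup: it loads a fragment with both flags, inspects $dom->doctype, and compares the number of nodes matched by the two XPath expressions.

```php
<?php
// Illustrative markup: two of the four elements carry an attribute.
$html = '<div><p>a</p><p class="x">b</p><div style="color: red">c</div></div>';

$dom = new DOMDocument();
// NOIMPLIED: do not wrap the fragment in <html><body>.
// NODEFDTD: do not add a default doctype.
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// With NODEFDTD there should be no doctype node left to remove.
var_dump($dom->doctype === null);

$xpath = new DOMXPath($dom);
$verbose = $xpath->query('//*[count(@*)>0]'); // expression from the question
$short   = $xpath->query('//*[@*]');          // simplified expression

// Both expressions should select the same elements.
var_dump($verbose->length === $short->length);
```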

Regarding "php - How to remove a DOM element's tag but keep its content?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/39322393/
