
ruby - How do I recursively remove empty child elements at a specific XPath location in XML using Nokogiri?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 17:57:10

I have the following XML, in which several child elements have empty text.

doc = <<'XML'
<Book>
<BookId>BK45647</BookId>
<BookName>The Client by John Grisham</BookName>
<BookAuthenticationCode></BookAuthenticationCode>
<BookCategory>Suspense</BookCategory>
<BookSequence></BookSequence>
<BookPublisherInfo>
<PublisherId>PBBK12345</PublisherId>
<PublisherName>Mc.GrawHill</PublisherName>
<PublisherIndex></PublisherIndex>
<PublisherCategoryQuota></PublisherCategoryQuota>
</BookPublisherInfo>
<BookPurchaselist>
<Customer>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
<MiddleName></MiddleName>
<NickName></NickName>
</Customer>
<Customer>
<FirstName>Winston</FirstName>
<LastName>Churchill</LastName>
<MiddleName></MiddleName>
<NickName></NickName>
</Customer>
</BookPurchaselist>
</Book>
XML

I tried the code below, but for some reason it does not work correctly.

cust = doc.at_xpath("//Customer")
cust.each do |cust_obj|
if cust_obj.has_text? == false
cust_obj.delete
end
end

It does not work properly and produces the following output:

<Book>
<BookId>BK45647</BookId>
<BookName>The Client by John Grisham</BookName>
<BookAuthenticationCode></BookAuthenticationCode>
<BookCategory>Suspense</BookCategory>
<BookSequence></BookSequence>
<BookPublisherInfo>
<PublisherId>PBBK12345</PublisherId>
<PublisherName>Mc.GrawHill</PublisherName>
<PublisherIndex></PublisherIndex>
<PublisherCategoryQuota></PublisherCategoryQuota>
</BookPublisherInfo>
<BookPurchaselist>
<Customer>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
<MiddleName></MiddleName>
</Customer>
<Customer>
<FirstName>Winston</FirstName>
<LastName>Churchill</LastName>
<NickName></NickName>
</Customer>
</BookPurchaselist>
</Book>

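Only some of the elements with empty text are removed, and the others are left as they are. How can I recursively remove the elements with empty data under a specific XPath and rewrite the XML?

I'm stuck here and would appreciate suggestions.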

Best answer

doc.xpath('//Customer/child::*[not(text())]').each do |node|
node.remove
end

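The predicate not(text()) selects elements that have no text node as a direct child. If you want to remove only elements that have no child nodes at all, you can use not(node()) instead.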

Edit: a complete working example (using the same code as above)

require 'nokogiri'

xml = <<-XML
<Book>
<BookId>BK45647</BookId>
<BookName>The Client by John Grisham</BookName>
<BookAuthenticationCode></BookAuthenticationCode>
<BookCategory>Suspense</BookCategory>
<BookSequence></BookSequence>
<BookPublisherInfo>
<PublisherId>PBBK12345</PublisherId>
<PublisherName>Mc.GrawHill</PublisherName>
<PublisherIndex></PublisherIndex>
<PublisherCategoryQuota></PublisherCategoryQuota>
</BookPublisherInfo>
<BookPurchaselist>
<Customer>
<FirstName>John</FirstName>
<LastName>Smith</LastName>
<MiddleName></MiddleName>
</Customer>
<Customer>
<FirstName>Winston</FirstName>
<LastName>Churchill</LastName>
<NickName></NickName>
</Customer>
</BookPurchaselist>
</Book>
XML

doc = Nokogiri.parse(xml)

doc.xpath('//Customer/child::*[not(text())]').each do |node|
node.remove
end

puts doc.to_s

The output of this program is:

<?xml version="1.0"?>
<Book>
<BookId>BK45647</BookId>
<BookName>The Client by John Grisham</BookName>
<BookAuthenticationCode/>
<BookCategory>Suspense</BookCategory>
<BookSequence/>
<BookPublisherInfo>
<PublisherId>PBBK12345</PublisherId>
<PublisherName>Mc.GrawHill</PublisherName>
<PublisherIndex/>
<PublisherCategoryQuota/>
</BookPublisherInfo>
<BookPurchaselist>
<Customer>
<FirstName>John</FirstName>
<LastName>Smith</LastName>

</Customer>
<Customer>
<FirstName>Winston</FirstName>
<LastName>Churchill</LastName>

</Customer>
</BookPurchaselist>
</Book>

Regarding "ruby - How do I recursively remove empty child elements at a specific XPath location in XML using Nokogiri?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/11367168/
