
html - Nokogiri: XML to HTML

Reposted. Author: 行者123. Updated: 2023-11-28 02:18:36

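I'm just trying to do a straightforward conversion (almost a plain search and replace), but I'm having trouble getting things to land in the right place: I end up with misplaced links and duplicated content. I'm sure I'm doing something silly in the way I walk the XML :)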

I'm trying:

builder = Nokogiri::HTML::Builder.new do |doc|
  doc.html {
    doc.body {
      doc.div.wrapper! {
        doc.h1 "Short"

        xm.css('paragraph').each do |para|

          doc.h3.para(:id => para['number']) { doc.text para['number'] }

          doc.p.narrativeparagraph {

            xm.css('paragraph inner-section').each do |section|
              doc.span.innersection { doc.text section.content

                xm.css('inner-section xref').each do |xref|
                  doc.a(:href => "#" + xref['number']) { doc.text xref['number'] }
                end

                xm.css('paragraph inner-text').each do |innertext|
                  doc.span.innertext { doc.text innertext.content }
                end

              } end #inner-section

          }

        end#end paragraph
      }#end wrapper
    }#end body
  }#end html
end#end builder

on this input:

<?xml version="1.0"?>

<looseleaf>

<paragraph number="1">
<inner-section> blah one blah <xref number="link1location"></xref>
<inner-text> blah two blah blah </inner-text>
blah three
</inner-section>
</paragraph>

<paragraph number="2">
<inner-section> blah four blah <xref number="link2location"></xref>
<inner-text>blah five blah blah </inner-text>
blah six
</inner-section>
</paragraph>

</looseleaf>

to create:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html>
<body>
<div id="wrapper">
<h1>Short</h1>
<h3 class="para" id="1">1</h3>
<p class="narrativeparagraph">
<span class="innersection"> blah one blah <a href="#link1location">link1location</a>
<span class="innertext"> blah two blah blah </span>
blah three</span>
</p>

<h3 class="para" id="2">2</h3>
<p class="narrativeparagraph">
<span class="innersection"> blah four blah <a href="#link2location">link2location</a>
<span class="innertext">blah five blah blah </span>
blah six</span></p>

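I've been trying all sorts of ways to get this working. The basic HTML structure is fine, but the children of each paragraph are a mess. Any help is much appreciated. Regards, Rich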

Best answer

There are many ways to do this, but if you want to stick with the Builder approach, I would write a function that translates each <paragraph> into a <p>.

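# Rename the XML elements of one <paragraph> in place and return the node.
# Defined before the builder so the method exists when the block runs.
def translate_paragraph(node)
  # Change '<paragraph>' to '<p>'
  node.name = 'p'

  # Change '<inner-section>' to '<span class="innersection">'
  node.css('inner-section').each { |tag|
    tag.name = 'span'
    tag['class'] = 'innersection'
  }

  # ...

  node # return the node itself, not the result of the last `each`
end

builder = Nokogiri::HTML::Builder.new do |doc|
  doc.html {
    doc.body {
      doc.div.wrapper! {
        doc.h1 "Short"
        xm.css('paragraph').each do |para|
          # Builder#<< appends raw markup, so serialize the translated copy
          doc << translate_paragraph(para.dup).to_html
        end #para
      }#end wrapper
    }#end body
  }#end html
end#end builder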

It's not perfect, but it works with Builder.

I would also consider XSLT, or walking the source tree recursively and building the output from there.

This post on html - Nokogiri: XML to HTML is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/1801042/
