gpt4 book ai didi

xslt - 使用 xslt 查找表替换子字符串

转载 作者:行者123 更新时间:2023-12-02 10:10:16 24 4
gpt4 key购买 nike

我有一些包含十六进制字符串变体的字符串(来源是framemaker,如果有人关心的话)。因此,字符串可能看起来像

this is some sentence with some hex code\x27 s , and we need that fixed.

并且需要更改为

this is some sentence with some hex code's , and we need that fixed.

实际上,单个字符串中可能有其中一些,因此我正在寻找遍历文本、捕获所有十六进制代码(看起来像\x## )并将所有这些代码替换为的最佳方法正确的字符。我制作了一个 xml 列表/查找表,其中包含所有字符,如下所示:

<xsl:param name="reflist">
<Code Value="\x27">'</Code>
<Code Value="\x28">(</Code>
<Code Value="\x29">)</Code>
<Code Value="\x2a">*</Code>
<Code Value="\x2b">+</Code>
<!-- much more like these... -->
</xsl:param>

现在我使用了一个简单的替换参数,但字符太多,无法实现。

最好的方法是什么?

最佳答案

人们可以完全避免使用任何“引用表”——就像这样:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:my="my:my" exclude-result-prefixes="my xs">
<xsl:output omit-xml-declaration="yes" indent="yes"/>

<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>

<xsl:template match="text()[matches(., '\\x(\d|[a-f])+')]">
<xsl:analyze-string select="." regex="\\x(\d|[a-f])+" >
<xsl:matching-substring>
<xsl:value-of select=
"codepoints-to-string(my:hex2dec(substring(.,3), 0))"/>
</xsl:matching-substring>
<xsl:non-matching-substring>
<xsl:value-of select="."/>
</xsl:non-matching-substring>
</xsl:analyze-string>
</xsl:template>

<xsl:function name="my:hex2dec" as="xs:integer">
<xsl:param name="pStr" as="xs:string"/>
<xsl:param name="pAccum" as="xs:integer"/>

<xsl:sequence select=
"if(not($pStr))
then $pAccum
else
for $char in substring($pStr, 1, 1),
$code in
if($char ge '0' and $char le '9')
then xs:integer($char)
else
string-to-codepoints($char) - string-to-codepoints('a') +10
return
my:hex2dec(substring($pStr,2), 16*$pAccum + $code)
"/>
</xsl:function>
</xsl:stylesheet>

当此转换应用于以下 XML 文档时:

<t>
<p>this is some sentence with some hex code\x27 s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x28 s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x29 s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2a s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2b s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2c s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2d s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2e s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code\x2f s ,
and we need that fixed.</p>
</t>

产生了想要的正确结果:

<t>
<p>this is some sentence with some hex code' s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code( s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code) s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code* s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code+ s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code, s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code- s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code. s ,
and we need that fixed.</p>
<p>this is some sentence with some hex code/ s ,
and we need that fixed.</p>
</t>

请注意:

此转换是通用的,可以正确处理任何十六进制 unicode 代码。

例如,如果对此 XML 文档应用相同的转换:

<t>
<p>this is some sentence with some hex code\x0428\x0438\x0448 s ,
and we need that fixed.</p>
</t>

生成正确的结果(包含保加利亚语中的西里尔字母“烧烤”):

<t>
<p>this is some sentence with some hex codeШиш s ,
and we need that fixed.</p>
</t>

关于xslt - 使用 xslt 查找表替换子字符串,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/13309899/

24 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com