xslt - Comparing 2 node sets based on attribute sequence

Reposted. Author: 行者123. Updated: 2023-12-02 02:24:14

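I am trying to build a kind of XML library that compares various nodes and combines them for later reuse. The logic should be simple: if the sequence of tag_XX attribute values for one language equals the sequence of tag_YY attribute values for another language, the nodes can be combined. See the XML example below: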

<Book>
<Section>
<GB>
<Para tag_GB="L1">
<Content_GB>string_1</Content_GB>
</Para>
<Para tag_GB="Illanc">
<Content_GB>string_2</Content_GB>
</Para>
<Para tag_GB="|PLB">
<Content_GB>string_3</Content_GB>
</Para>
<Para tag_GB="L1">
<Content_GB>string_4</Content_GB>
</Para>
<Para tag_GB="Sub">
<Content_GB>string_5</Content_GB>
</Para>
<Para tag_GB="L3">
<Content_GB>string_6</Content_GB>
</Para>
<Para tag_GB="Subbull">
<Content_GB>string_7</Content_GB>
</Para>
</GB>
<!-- German translations - OK because same attribute sequence -->
<DE>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag_DE="Illanc">
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag_DE="|PLB">
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag_DE="L1">
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag_DE="Sub">
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag_DE="L3">
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag_DE="Subbull">
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</DE>
<!-- Danish translations - NG because not same attribute sequence -->
<DK>
<Para tag_DK="L1">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="L1_sub">
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag_DK="Illanc">
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag_DK="L1">
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag_DK="|PLB">
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag_DK="L3">
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag_DK="Sub">
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
<Para tag_DK="Subbull">
<Content_DK>Danish_translation_of_string_7</Content_DK>
</Para>
</DK>
</Section>
</Book>

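So:

GB tag_GB value sequence = L1 -> Illanc -> ... -> Subbull

DE tag_DE value sequence = L1 -> Illanc -> ... -> Subbull (same as GB, so OK)

DK tag_DK value sequence = L1 -> L1_sub -> oops, Illanc was expected here, so this sequence differs from GB and the locale can be ignored

Since the German and English node sets have the same attribute sequence, I would like to combine them as follows: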

<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>

The stylesheet I am using is as follows:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" version="1.0" xmlns="http://www.w3.org/1999/xhtml" encoding="UTF-8" indent="yes"/>
<xsl:output omit-xml-declaration="yes" indent="yes"/>
<xsl:template match="/">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="text()">
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
<xsl:template match="Section">
<!-- store reference tag list -->
<xsl:variable name="Ref_tagList" select="GB/Para/attribute()[1]"/>
<Dictionary>
<xsl:for-each select="GB/Para">
<xsl:variable name="pos" select="position()"/>
<Para tag="{@tag_GB}">
<!-- Copy English Master -->
<xsl:apply-templates select="element()[1]"/>
<xsl:for-each select="//Book/Section/element()[not(self::GB)]">
<!-- store current locale tag list -->
<xsl:variable name="Curr_tagList" select="Para/attribute()[1]"/>
<xsl:if test="$Ref_tagList = $Curr_tagList">
<!-- Copy current locale is current tag list equals reference tag list -->
<xsl:apply-templates select="Para[position()=$pos]/element()[1]"/>
</xsl:if>
</xsl:for-each>
</Para>
</xsl:for-each>
</Dictionary>
</xsl:template>
</xsl:stylesheet>

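Apart from probably not being the most efficient way to do this (I am new to the XSLT game...), it also does not work. The logic I had in mind is to take the attribute set of the English master and, if the attribute set of any other locale is equal to it, copy that locale; otherwise ignore it. But for some reason, node sets with a different attribute sequence are happily copied as well (as shown below). Can anyone tell me where my logic conflicts with reality? Thanks in advance!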

The current output includes the Danish content, which should have been ignored:

<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
<Content_DK>Partial_Danish_translation_of_string_1</Content_DK>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
<Content_DK>Danish_translation_of_string_2</Content_DK>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
<Content_DK>Danish_translation_of_string_4</Content_DK>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
<Content_DK>Danish_translation_of_string_3</Content_DK>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
<Content_DK>Danish_translation_of_string_6</Content_DK>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
<Content_DK>Danish_translation_of_string_5</Content_DK>
</Para>
</Dictionary>
</Book>

Best answer

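This might not be the best solution. I used the following XSLT 2.0 features:

  • I compared the attribute sequences using string-join().
  • I took advantage of the possibility of using RTF (result tree fragment) variables, which XSLT 2.0 lets you query directly with XPath.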
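
To make the first point concrete, here is a minimal sketch of how the different comparisons behave on sequences of tag values. The variable names (`gb`, `dk`) and the hard-coded values are illustrative assumptions, not part of the original stylesheets. In XPath 2.0, `=` between two sequences is a general comparison that is true as soon as any item on one side equals any item on the other, which is most likely why the test in the question also lets the Danish locale through. Joining the values first compares them in order, though a separator that also occurs inside a value can produce a false match; deep-equal() compares item by item and avoids that.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Runs against any input document; the sequences are hard-coded for illustration -->
  <xsl:template match="/">
    <xsl:variable name="gb" select="('L1', 'Illanc', '|PLB')"/>
    <xsl:variable name="dk" select="('L1', 'L1_sub', 'Illanc')"/>

    <!-- General comparison: true because 'L1' occurs in both sequences -->
    <xsl:text>general "=": </xsl:text>
    <xsl:value-of select="$gb = $dk"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Joined strings: 'L1-Illanc-|PLB' vs 'L1-L1_sub-Illanc', so false -->
    <xsl:text>string-join: </xsl:text>
    <xsl:value-of select="string-join($gb, '-') = string-join($dk, '-')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Separator collision: both sides join to 'a-b-c', so true -->
    <xsl:text>collision: </xsl:text>
    <xsl:value-of select="string-join(('a-b', 'c'), '-') = string-join(('a', 'b-c'), '-')"/>
    <xsl:text>&#10;</xsl:text>

    <!-- Item-by-item comparison of the same two sequences: false -->
    <xsl:text>deep-equal: </xsl:text>
    <xsl:value-of select="deep-equal(('a-b', 'c'), ('a', 'b-c'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With an XSLT 2.0 processor this should print true, false, true, false, one per line. For the tag values in the question, which contain no hyphen, the string-join() approach is sufficient.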

There are probably more XSLT 2.0 facilities that could solve your problem. But I think the big problem here is your input document.

Sorry for not looking at your current transform; I just implemented one from scratch. Hope it helps:

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- GB is the reference locale; the combined output is built from it -->
  <xsl:template match="GB">
    <Book>
      <Dictionary>

        <!-- Temporary tree: one <match> per following sibling locale whose
             joined tag values equal the joined tag values of GB -->
        <xsl:variable name="matches">
          <xsl:for-each select="following-sibling::*
              [string-join(Para/@*,'-')
              = string-join(current()/Para/@*,'-')]">
            <match><xsl:copy-of select="Para/*"/></match>
          </xsl:for-each>
        </xsl:variable>

        <xsl:apply-templates select="Para">
          <xsl:with-param name="matches" select="$matches"/>
        </xsl:apply-templates>

      </Dictionary>
    </Book>
  </xsl:template>

  <!-- Emit each GB Para plus the content at the same position in every matching locale -->
  <xsl:template match="Para[parent::GB]">
    <xsl:param name="matches"/>
    <xsl:variable name="pos" select="position()"/>
    <Para tag="{@tag_GB}">
      <xsl:copy-of select="Content_GB"/>
      <xsl:copy-of select="$matches/match/*[position()=$pos]"/>
    </Para>
  </xsl:template>

  <!-- Suppress text that the built-in templates would otherwise copy -->
  <xsl:template match="text()"/>

</xsl:stylesheet>

When applied to the input document provided in the question, it produces the following output:

<Book>
<Dictionary>
<Para tag="L1">
<Content_GB>string_1</Content_GB>
<Content_DE>German_translation of_string_1</Content_DE>
</Para>
<Para tag="Illanc">
<Content_GB>string_2</Content_GB>
<Content_DE>German_translation of_string_2</Content_DE>
</Para>
<Para tag="|PLB">
<Content_GB>string_3</Content_GB>
<Content_DE>German_translation of_string_3</Content_DE>
</Para>
<Para tag="L1">
<Content_GB>string_4</Content_GB>
<Content_DE>German_translation of_string_4</Content_DE>
</Para>
<Para tag="Sub">
<Content_GB>string_5</Content_GB>
<Content_DE>German_translation of_string_5</Content_DE>
</Para>
<Para tag="L3">
<Content_GB>string_6</Content_GB>
<Content_DE>German_translation of_string_6</Content_DE>
</Para>
<Para tag="Subbull">
<Content_GB>string_7</Content_GB>
<Content_DE>German_translation of_string_7</Content_DE>
</Para>
</Dictionary>
</Book>
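
For comparison, the approach from the question can probably be kept with only the equality test changed. The sketch below is one possible variant, not the answerer's method: it replaces the general comparison `$Ref_tagList = $Curr_tagList` (true as soon as any single value matches) with deep-equal() over the attribute values converted to strings, so attribute names such as tag_GB and tag_DE do not matter but order and length do. The variable names `refTags` and `matching` are hypothetical, and the sketch assumes each Para carries exactly one attribute, as in the sample input.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="Book">
    <Book>
      <xsl:apply-templates select="Section"/>
    </Book>
  </xsl:template>

  <xsl:template match="Section">
    <!-- Reference tag values of the English master, as a sequence of strings -->
    <xsl:variable name="refTags"
                  select="for $a in GB/Para/@* return string($a)"/>

    <!-- Locales whose tag values equal the reference item by item, in order -->
    <xsl:variable name="matching"
                  select="*[not(self::GB)]
                           [deep-equal($refTags,
                                       (for $a in Para/@* return string($a)))]"/>

    <Dictionary>
      <xsl:for-each select="GB/Para">
        <xsl:variable name="pos" select="position()"/>
        <Para tag="{@tag_GB}">
          <!-- English master content -->
          <xsl:copy-of select="*[1]"/>
          <!-- Content at the same position in each matching locale -->
          <xsl:copy-of select="$matching/Para[$pos]/*[1]"/>
        </Para>
      </xsl:for-each>
    </Dictionary>
  </xsl:template>
</xsl:stylesheet>
```

Computing the matching locales once per Section, rather than re-testing inside every Para, also avoids repeating the sequence comparison for each paragraph.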

Regarding xslt - Comparing 2 node sets based on attribute sequence, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/6694308/
