xml - How to group elements by their content (XSLT 2.0)?

Reposted by: 数据小太阳 · Updated: 2023-10-29 02:32:07


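-- Revised question --

Thanks to everyone who offered potential solutions, but they match what I had already tried, so I think I should be clearer. I have extended the XML slightly to make the problem more transparent.

The XML is actually a compilation of various files containing translated content. The goal is a unified document that contains only unique English strings, each with (after manual review and cleanup) a single translation, so that it can be used for a translation memory. That is why it is currently one large file with a lot of redundant information.

Each Para line contains the English master (which can be repeated dozens of times in the file) and the translation variants. In many cases this is easy, because all translated versions are identical and I end up with a single line, but in other cases it can be more complicated.

So, suppose that today I have 10 Para lines containing the same English content (#1), with 2 different German variants, 3 different French variants, and only one variant for each of the remaining locales. I need to get:

1 Para with: 1 EN / 2 DE (v1 and v2) / 3 FR (v1, v2, and v3) / ...

And this is repeated for every grouped unique English value in my list.

Revised XML: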

<Books>
<!--First English String (#1) with number of potential translations -->
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <FR>French Trans of #1 v2</FR>
    <!-- More locales here -->
</Para>
<!--Second English String (#2) with number of potential translations -->
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v3</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
    <!-- More locales here -->
</Para>
<!-- Loads of additional English strings (#3 ~ #n) with a number of potential translations -->
</Books>

The current solution gives me the following output:

<Books>
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <DE>German Trans of #1 v2</DE>
    <DE>German Trans of #2 v1</DE>
    <DE>German Trans of #2 v3</DE>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v2</FR>
    <FR>French Trans of #2 v1</FR>
</Para>
</Books>

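So only the first EN tag is taken, and all the other tags are then grouped together regardless of which English master string they belong to. My goal, however, is to get the following: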

<Books>
<!-- First Grouped EN string and linked grouped translations -->
<Para>
    <EN>English Content #1</EN>
    <DE>German Trans of #1 v1</DE>
    <DE>German Trans of #1 v2</DE>
    <FR>French Trans of #1 v1</FR>
    <FR>French Trans of #1 v2</FR>
</Para>
<!-- Second Grouped EN string and linked grouped translations -->
<Para>
    <EN>English Content #2</EN>
    <DE>German Trans of #2 v1</DE>
    <DE>German Trans of #2 v3</DE>
    <DE>German Trans of #2 v2</DE>
    <FR>French Trans of #2 v1</FR>
</Para>
<!-- 3rd to nth grouped EN strings and linked grouped translations -->
</Books>

Best answer

Extended XSLT 2.0 answer to address the update in the question:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="Books">
        <xsl:copy>
            <xsl:for-each-group select="*" 
                group-by="EN">
                <xsl:copy>
                   <xsl:copy-of select="EN"/>
                   <xsl:for-each-group select="current-group()/*[not(local-name()='EN')]"
                        group-by=".">
                        <xsl:sort select="local-name()"/>
                        <xsl:copy-of select="."/>
                    </xsl:for-each-group>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
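
One point worth noting about the stylesheet above: the inner `group-by="."` compares only the string value, so a DE and an FR element that happen to carry identical text under the same EN string would fall into one group and only the first would be copied. If that can occur in the data, a possible variation (a sketch, not part of the original answer) groups by element name first and by content second. It assumes the same `Books`/`Para`/`EN` structure as the sample input:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="Books">
        <xsl:copy>
            <!-- Outer level: one output Para per distinct EN string -->
            <xsl:for-each-group select="Para" group-by="EN">
                <xsl:copy>
                    <xsl:copy-of select="EN"/>
                    <!-- Middle level: one group per locale element name (DE, FR, ...) -->
                    <xsl:for-each-group select="current-group()/*[not(self::EN)]"
                        group-by="local-name()">
                        <xsl:sort select="local-name()"/>
                        <!-- Inner level: one copy per distinct translation text -->
                        <xsl:for-each-group select="current-group()" group-by=".">
                            <xsl:copy-of select="."/>
                        </xsl:for-each-group>
                    </xsl:for-each-group>
                </xsl:copy>
            </xsl:for-each-group>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

Within each locale, the variants are emitted in order of first appearance, because `xsl:for-each-group` with `group-by` processes groups in that order unless an `xsl:sort` says otherwise.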

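Extended XSLT 1.0 answer to address the update in the question:

Even though you need two different kinds of keys, you can still use the same kind of solution. This is the first simple solution that came to mind: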

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:key name="main" match="Para" use="EN"/>
    <xsl:key name="locale" match="Para/*[not(self::EN)]" use="concat(../EN,.)"/>

    <xsl:template match="Books">
        <xsl:copy>
            <xsl:apply-templates select="Para[
                generate-id()
                = generate-id(key('main',EN)[1])]" mode="EN"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*" mode="EN">
        <xsl:copy>
            <xsl:copy-of select="EN"/>
            <xsl:apply-templates select="../Para/*[
                generate-id()
                = generate-id(key('locale',concat(current()/EN,.))[1])]" mode="locale">
                <xsl:sort select="local-name()"/>
            </xsl:apply-templates>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*" mode="locale">
        <xsl:copy>
            <xsl:value-of select="."/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

When applied on the new provided input, it produces:

<Books>
    <Para>
        <EN>English Content #1</EN>
        <DE>German Trans of #1 v1</DE>
        <DE>German Trans of #1 v2</DE>
        <FR>French Trans of #1 v1</FR>
        <FR>French Trans of #1 v2</FR>
    </Para>
    <Para>
        <EN>English Content #2</EN>
        <DE>German Trans of #2 v1</DE>
        <DE>German Trans of #2 v3</DE>
        <DE>German Trans of #2 v2</DE>
        <FR>French Trans of #2 v1</FR>
    </Para>
</Books>
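
A caveat on the `locale` key in the XSLT 1.0 stylesheet above: `concat(../EN,.)` joins the two strings with no separator and ignores the element name. In principle, two different EN/translation pairs could therefore produce the same key value, and a DE and an FR element with identical text would be treated as duplicates. A possible hardening, sketched here as an assumption rather than as part of the original answer, adds the element name and a separator to the key. The `|` character is an arbitrary choice and must not occur in the data. The sketch also uses `key('main', EN)` so that only the Para elements of the current group are visited:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <!-- All Para elements that share the same English string -->
    <xsl:key name="main" match="Para" use="EN"/>
    <!-- Composite key: English text, element name, translation text -->
    <xsl:key name="locale" match="Para/*[not(self::EN)]"
        use="concat(../EN, '|', local-name(), '|', .)"/>

    <xsl:template match="Books">
        <xsl:copy>
            <!-- Muenchian step 1: first Para of each EN group -->
            <xsl:apply-templates select="Para[
                generate-id() = generate-id(key('main', EN)[1])]" mode="EN"/>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="Para" mode="EN">
        <xsl:copy>
            <xsl:copy-of select="EN"/>
            <!-- Muenchian step 2: first occurrence of each
                 (EN, element name, text) combination within this group -->
            <xsl:for-each select="key('main', EN)/*[not(self::EN)][
                generate-id() = generate-id(
                    key('locale', concat(../EN, '|', local-name(), '|', .))[1])]">
                <xsl:sort select="local-name()"/>
                <xsl:copy-of select="."/>
            </xsl:for-each>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
```

The sort key is only the element name, and XSLT 1.0 keeps nodes with equal sort keys in document order, so the variants of each locale stay in order of first appearance.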

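This XSLT 1.0 transform does exactly what you asked for, and you can use it as a starting point for creating a more meaningful result tree if you wish: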

<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>


    <xsl:key name="locale" match="Para/*[not(local-name()='EN')]" use="text()"/>

    <xsl:template match="Books">
        <xsl:copy>
            <Para>
                <xsl:copy-of select="Para[1]/EN"/>
                <xsl:apply-templates select="Para/*[
                    generate-id()
                    = generate-id(key('locale',text())[1])]" mode="group">
                    <xsl:sort select="local-name()"/>
                </xsl:apply-templates>
            </Para>
        </xsl:copy>
    </xsl:template>

    <xsl:template match="*" mode="group">
        <xsl:copy>
            <xsl:value-of select="."/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

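Explanation:

An xsl:key is used to group all elements (except EN) by their content
A simple, direct copy of the first Para/EN node
The Muenchian grouping method together with xsl:sort outputs the other elements grouped as required (elements with the same content are reported once)

When applied to the input from the original question (before the revision; not reproduced here), the result tree is: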

<Books>
   <Para>
      <EN>Some English Content</EN>
      <DE>German Trans v1</DE>
      <DE>German Trans v2</DE>
      <FR>French Trans v1</FR>
      <FR>French Trans v2</FR>
   </Para>
</Books>

The same result (and a shorter transform) with XSLT 2.0 xsl:for-each-group:

<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output omit-xml-declaration="yes" indent="yes"/>
    <xsl:strip-space elements="*"/>

    <xsl:template match="Books">
        <xsl:copy>
            <Para>
                <xsl:copy-of select="Para[1]/EN"/>
                <xsl:for-each-group select="Para/*[not(local-name()='EN')]" 
                            group-by=".">
                    <xsl:sort select="local-name()"/>
                    <xsl:copy>
                        <xsl:value-of select="."/>
                    </xsl:copy>
                </xsl:for-each-group>
            </Para>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>

Regarding "xml - How to group elements by their content (XSLT 2.0)?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/6486761/
