
xml - XSLT: removing query strings from all URLs in an XML file

Reposted. Author: 数据小太阳. Updated: 2023-10-29 01:56:47

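I need to perform a regex-style replacement that strips the query string from every URL attribute in an MRSS RSS feed, leaving only the bare URL. I tried a few things based on the suggestions in "XSLT Replace function not found", but to no avail.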

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss" type="application/rss+xml" rel="self" />
<title>How to and instructional videos from Videojug.com</title>
<description>Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
<link>http://www.videojug.com</link>
<item>
<title>How To Calculate Median</title>
<media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4?somequerystring" type="video/mp4" bitrate="1200" height="848" duration="169" width="480">
<media:title>How To Calculate Median</media:title>
..
</media:content>
</item>

Any suggestions would be appreciated.

Best answer

If you are using XSLT 2.0, you can use tokenize():

<xsl:template match="media:content">
  <xsl:value-of select="tokenize(@url,'\?')[1]"/>
</xsl:template>

Here is another example that changes only the url attribute of media:content:

<xsl:template match="media:content">
  <media:content url="{tokenize(@url,'\?')[1]}">
    <xsl:copy-of select="@*[not(name()='url')]"/>
    <xsl:apply-templates/>
  </media:content>
</xsl:template>
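
Since the question asks for a regex-style replacement, the same template can also be sketched with the XSLT 2.0 replace() function in place of tokenize(). This variant is not part of the original answer. The pattern '\?.*$' is an assumption that everything from the first ? onward should be dropped, and the stylesheet wrapper is added only so the snippet is well-formed on its own.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:media="http://search.yahoo.com/mrss/">

  <!-- Rebuild media:content with the query string cut from @url.
       The pattern '\?.*$' matches from the first literal '?' to the end
       of the value, so a URL without '?' is left untouched. -->
  <xsl:template match="media:content">
    <media:content url="{replace(@url, '\?.*$', '')}">
      <!-- Copy every other attribute as-is -->
      <xsl:copy-of select="@*[not(name()='url')]"/>
      <xsl:apply-templates/>
    </media:content>
  </xsl:template>

</xsl:stylesheet>
```

Like tokenize(), replace() exists only in XSLT 2.0 and later. Nodes outside media:content are not matched by this sketch and fall through to the built-in template rules.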

Edit

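To process every url attribute in your instance while leaving everything else unchanged, use an identity transform and override it with a template that matches only @url.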

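Here is a modified version of your sample XML. I added two attributes to description for testing: the attr attribute should stay unchanged, and the url attribute should be processed.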

XML

<rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/" version="2.0">
<channel>
<atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss" type="application/rss+xml" rel="self"/>
<title>How to and instructional videos from Videojug.com</title>
<!-- added some attributes for testing -->
<description attr="don't delete me!" url="http://www.test.com/foo?anotherquerystring">Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
<link>http://www.videojug.com</link>
<item>
<title>How To Calculate Median</title>
<media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4?somequerystring" type="video/mp4" bitrate="1200" height="848"
duration="169" width="480">
<media:title>How To Calculate Median</media:title>
..
</media:content>
</item>
</channel>
</rss>

XSLT

<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
<xsl:output indent="yes"/>
<xsl:strip-space elements="*"/>

<!--Identity Transform-->
<xsl:template match="node()|@*">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
</xsl:copy>
</xsl:template>

<xsl:template match="@url">
<xsl:attribute name="url">
<xsl:value-of select="tokenize(.,'\?')[1]"/>
</xsl:attribute>
</xsl:template>

</xsl:stylesheet>

Output (using Saxon 9.3.0.5)

<rss xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:media="http://search.yahoo.com/mrss/"
version="2.0">
<channel>
<atom:link href="http://www.videojug.com/user/metacafefamilyandeducation/subscriptions.mrss"
type="application/rss+xml"
rel="self"/>
<title>How to and instructional videos from Videojug.com</title>
<!-- added some attributes for testing --><description attr="don't delete me!" url="http://www.test.com/foo">Award-winning Videojug.com has over 50k professionally-made instructional videos.</description>
<link>http://www.videojug.com</link>
<item>
<title>How To Calculate Median</title>
<media:content url="http://direct.someurl.com/54/543178dd-11a7-4b8d-764c-ff0008cd2e95/how-to-calculate-median__VJ480PENG.mp4"
type="video/mp4"
bitrate="1200"
height="848"
duration="169"
width="480">
<media:title>How To Calculate Median</media:title>
..
</media:content>
</item>
</channel>
</rss>
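
tokenize() is an XPath 2.0 function, so the stylesheet above will not run on an XSLT 1.0 processor. A minimal sketch of an XSLT 1.0 equivalent, assuming the same input, swaps tokenize() for substring-before() and appends a ? with concat() so that a URL without a query string is still returned whole. This fallback is not from the original answer.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output indent="yes"/>
  <xsl:strip-space elements="*"/>

  <!-- Identity transform: copy every node and attribute unchanged -->
  <xsl:template match="node()|@*">
    <xsl:copy>
      <xsl:apply-templates select="node()|@*"/>
    </xsl:copy>
  </xsl:template>

  <!-- Override for url attributes. Appending '?' before calling
       substring-before() guarantees a match, so a URL that has no
       query string is kept whole instead of becoming an empty string. -->
  <xsl:template match="@url">
    <xsl:attribute name="url">
      <xsl:value-of select="substring-before(concat(., '?'), '?')"/>
    </xsl:attribute>
  </xsl:template>

</xsl:stylesheet>
```

The concat() step matters because substring-before() returns an empty string when the separator is absent, which would otherwise blank out any url attribute that has no query string.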

Regarding xml - XSLT: removing query strings from all URLs in an XML file, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6142112/
