
python - lxml -- How to change img src to absolute links

Reposted. Author: 搜寻专家. Updated: 2023-10-31 22:12:56

Using lxml, how can I globally replace all src attributes with absolute links?

Best answer

Here is sample code; it also covers <a href>:

from urllib.parse import urljoin

from lxml import etree, html


def fix_links(content, absolute_prefix):
    """
    Rewrite relative links to be absolute links based on a given URL.

    @param content: HTML snippet as a string
    @param absolute_prefix: base URL that relative links are resolved against
    @return: the rewritten snippet as UTF-8 encoded bytes, wrapped in a <div>
    """

    # Accept bytes as well as text (Python 3 form of the old str check)
    if isinstance(content, bytes):
        content = content.decode("utf-8")

    content = content.strip()

    # create_parent=True wraps the snippet in a <div>, so several
    # top-level elements are allowed
    tree = html.fragment_fromstring(content, create_parent=True)

    def join(base, url):
        """
        Join relative URL
        """
        if not (url.startswith("/") or "://" in url):
            return urljoin(base, url)
        else:
            # Root-relative ("/...") or already has a scheme: left unchanged
            return url

    for node in tree.xpath('//*[@src]'):
        url = node.get('src')
        url = join(absolute_prefix, url)
        node.set('src', url)
    for node in tree.xpath('//*[@href]'):
        href = node.get('href')
        url = join(absolute_prefix, href)
        node.set('href', url)

    # With encoding="utf-8", lxml returns bytes rather than str
    data = etree.tostring(tree, pretty_print=False, encoding="utf-8")

    return data
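
Note that the join helper in the answer leaves URLs starting with "/" as they are, so root-relative paths keep no host name. If those should become absolute as well, the standard library's urljoin may already be enough on its own. The sketch below only illustrates how urljoin resolves different kinds of URL; the base URL, the sample paths, and the resolve helper are hypothetical and not part of the original answer.

```python
from urllib.parse import urljoin

# Hypothetical base URL, used for illustration only
BASE = "http://example.com/docs/page.html"


def resolve(url, base=BASE):
    """Resolve url against base using only the standard library."""
    return urljoin(base, url)


samples = [
    "img/logo.png",                  # relative to the directory of the base
    "/img/logo.png",                 # root-relative: gains scheme and host
    "http://cdn.example.org/a.png",  # already absolute: returned unchanged
    "//cdn.example.org/a.png",       # scheme-relative: inherits "http"
]

for sample in samples:
    print(sample, "->", resolve(sample))
```

Because urljoin returns absolute URLs unchanged, the explicit "://" check is probably only needed when root-relative links must be preserved on purpose.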

The full story is available in the Plone developer documentation.
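
As an alternative to walking the attributes by hand, lxml.html elements provide a make_links_absolute method that rewrites every link attribute lxml knows about (src and href among them). This is a different technique from the answer above, shown only as a minimal sketch; the absolutize function, the base URL, and the sample snippet are assumptions made for illustration.

```python
from lxml import html


def absolutize(snippet, base_url):
    """Return snippet, wrapped in a <div>, with all links made absolute."""
    # create_parent=True wraps the fragment in a <div> so that several
    # top-level elements can be parsed together
    root = html.fragment_fromstring(snippet.strip(), create_parent=True)
    # Resolves every link in the tree against base_url
    root.make_links_absolute(base_url)
    # encoding="unicode" makes tostring return str instead of bytes
    return html.tostring(root, encoding="unicode")


# Hypothetical input, for illustration only
sample = '<img src="img/logo.png"><a href="/about">About</a>'
result = absolutize(sample, "http://example.com/docs/")
print(result)
```

Unlike the hand-written join helper, this approach also resolves root-relative links such as "/about" against the host of the base URL, which may or may not be what a given site wants.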

Regarding python - lxml -- How to change img src to absolute links, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26167690/
