
python - How do I get the base of a URL in Python?

Reposted. Author: 太空狗. Updated: 2023-10-29 17:10:54

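I'm trying to determine the base of a URL, that is, everything except the page and its parameters. I tried using split, but is there a better way than breaking the URL into pieces? Is there a way to remove everything after the last "/"?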

Given this: http://127.0.0.1/asdf/login.php

I want: http://127.0.0.1/asdf/

Best answer

The best way is to use urllib.parse.

From the documentation:

The module has been designed to match the Internet RFC on Relative Uniform Resource Locators. It supports the following URL schemes: file, ftp, gopher, hdl, http, https, imap, mailto, mms, news, nntp, prospero, rsync, rtsp, rtspu, sftp, shttp, sip, sips, snews, svn, svn+ssh, telnet, wais, ws, wss.

You want to do something like this with urlsplit and urlunsplit:

from urllib.parse import urlsplit, urlunsplit

split_url = urlsplit('http://127.0.0.1/asdf/login.php?q=abc#stackoverflow')

# You now have:
# split_url.scheme "http"
# split_url.netloc "127.0.0.1"
# split_url.path "/asdf/login.php"
# split_url.query "q=abc"
# split_url.fragment "stackoverflow"

# Use all the path except everything after the last '/'
clean_path = "".join(split_url.path.rpartition("/")[:-1])

# "/asdf/"

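# urlunsplit joins a 5-item (scheme, netloc, path, query, fragment) tuple.
# Passing split_url unchanged would just rebuild the original URL, so swap in
# clean_path and leave the query and fragment empty to get the base URL.
clean_url = urlunsplit((split_url.scheme, split_url.netloc, clean_path, "", ""))

# "http://127.0.0.1/asdf/"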


# A more advanced example
advanced_split_url = urlsplit('http://foo:bar@127.0.0.1:5000/asdf/login.php?q=abc#stackoverflow')

# You now have *in addition* to the above:
# advanced_split_url.username "foo"
# advanced_split_url.password "bar"
# advanced_split_url.hostname "127.0.0.1"
# advanced_split_url.port 5000 (an int, not a string)
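
The steps above can be wrapped in a small helper. This is a minimal sketch rather than part of the original answer: the function name get_base_url is hypothetical, and it assumes the "base" means everything up to and including the last "/" of the path, with the query and fragment dropped.

```python
from urllib.parse import urlsplit, urlunsplit


def get_base_url(url: str) -> str:
    """Return url up to and including the last '/' of its path.

    Hypothetical helper for illustration; not a urllib.parse function.
    """
    parts = urlsplit(url)
    # rpartition yields (head, sep, tail); head + sep is the directory part.
    head, sep, _tail = parts.path.rpartition("/")
    # Empty query and fragment strings are omitted by urlunsplit.
    return urlunsplit((parts.scheme, parts.netloc, head + sep, "", ""))


print(get_base_url("http://127.0.0.1/asdf/login.php?q=abc#stackoverflow"))
# http://127.0.0.1/asdf/
```

Because netloc is passed through unchanged, any credentials and port in the URL are preserved. If the URL has no path at all, the helper returns just the scheme and host without a trailing slash. urllib.parse.urljoin(url, ".") is a standard-library alternative worth comparing against for your inputs.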

Regarding "python - How do I get the base of a URL in Python?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35616434/
