python-3.4 - Python 3 : Why would you use urlparse/urlsplit

Reposted by: 行者123. Updated: 2023-12-02 04:21:31

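I'm not quite sure what these modules are for. I understand that they split a URL into its components, but why would that be useful, and what is an example of when to use urlparse?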

Best answer

Use urlparse only if you need the params component. I explain below why you might need it.

Reference

urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)

This is similar to urlparse(), but does not split the params from the URL. This should generally be used instead of urlparse() if the more recent URL syntax allowing parameters to be applied to each segment of the path portion of the URL (see RFC 2396) is wanted.
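To make the difference concrete, here is a minimal sketch. The URL is made up for illustration (the `;size=m` suffix is my own addition so that there is something to land in the params field):

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL with a ";param" suffix on the last path segment.
url = "http://www.example.com/products/women;size=m?color=green#top"

parsed = urlparse(url)  # 6-tuple: scheme, netloc, path, params, query, fragment
split = urlsplit(url)   # 5-tuple: scheme, netloc, path, query, fragment

# urlparse strips the ";..." suffix off the last path segment into .params
print(parsed.path)    # /products/women
print(parsed.params)  # size=m

# urlsplit leaves the path untouched, semicolon included
print(split.path)     # /products/women;size=m
```

Everything else (scheme, netloc, query, fragment) comes out the same from both functions.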

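The hostname is always useful to store in a variable for later use, for example to append params or a query to it and fetch the page you want when scraping.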
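As a rough sketch of that idea (the port, the `/search` path and the `q=shoes` query are placeholders I made up, not part of any real site):

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

url = "http://www.example.com:8080/products/women?color=green"
parts = urlsplit(url)

# Keep the host details around for later requests.
host = parts.hostname  # 'www.example.com' (lowercased, without the port)
port = parts.port      # 8080 as an int, or None if the URL has no port

# Build another URL on the same host; path and query are hypothetical.
query = urlencode({"q": "shoes"})
search_url = urlunsplit((parts.scheme, parts.netloc, "/search", query, ""))
print(search_url)  # http://www.example.com:8080/search?q=shoes
```

Note that `netloc` still contains the port (and any credentials), while `hostname` is just the host, so pick whichever the later step needs.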

About params:

FYI: params in a URL, according to RFC 2396

Extensive testing of current client applications demonstrated that the majority of deployed systems do not use the ";" character to indicate trailing parameter information, and that the presence of a semicolon in a path segment does not affect the relative parsing of that segment. Therefore, parameters have been removed as a separate component and may now appear in any path segment. Their influence has been removed from the algorithm for resolving a relative URI reference.
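In practice this means urlparse only ever looks at the last path segment, so params on earlier segments stay in the path. A small sketch, with an invented URL and a hypothetical helper loop for per-segment handling:

```python
from urllib.parse import urlparse, urlsplit

# Made-up URL with a ";param" on two different path segments.
url = "http://www.example.com/a;x=1/b;y=2"

parsed = urlparse(url)
print(parsed.path)    # /a;x=1/b   (first segment keeps its ";x=1")
print(parsed.params)  # y=2        (only the last segment is split)

# If per-segment params are wanted, one option is urlsplit plus manual splitting.
segments = []
for segment in urlsplit(url).path.split("/"):
    if segment:
        name, _, params = segment.partition(";")
        segments.append((name, params))
print(segments)  # [('a', 'x=1'), ('b', 'y=2')]
```

The manual loop is just one way to do it; the standard library does not provide a per-segment params parser as far as I know.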

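Splitting a URL into its components is useful when scraping, for example if the URL is http://www.example.com/products/women?color=green

When you use urlparse, you get each component separately. Strictly speaking, color=green here is the query and women is part of the path; the params field is only filled by a ";" suffix on the last path segment. Now you can change women to men, so that it becomes http://www.example.com/products/men?color=green, and likewise for kids, girls, boys, and so on.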
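Something along these lines would generate the variants. This is a sketch under the assumption that the category is always the last path segment; the category list is just the one mentioned above:

```python
from urllib.parse import urlparse, urlsplit, urlunsplit

url = "http://www.example.com/products/women?color=green"

# For this particular URL the params field is empty; "color=green" is the query.
print(repr(urlparse(url).params))  # ''
print(urlparse(url).query)         # color=green

parts = urlsplit(url)
parent = parts.path.rsplit("/", 1)[0]  # '/products'

urls = []
for category in ["men", "kids", "girls", "boys"]:
    # _replace returns a new result tuple with only the path swapped out.
    new_parts = parts._replace(path=f"{parent}/{category}")
    urls.append(urlunsplit(new_parts))

print(urls[0])  # http://www.example.com/products/men?color=green
```

Rebuilding with urlunsplit keeps the scheme, host and query intact, which is less error-prone than doing string replacement on the whole URL.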

Regarding python-3.4 - Python 3 : Why would you use urlparse/urlsplit, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/30091297/