
python - Splitting a URL with Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 06:54:46

I have this URL:

/drive/rayon.productlist.seomenulevel/fh_refpath$003dfacet_1$0026fh_refview$003dlister$0026fh_view_size$003d100$0026fh_reffacet$003dcategories$0026auchan_page_type$003dcatalogue$0026fh_location$003d$00252f$00252f52$00252ffr_FR$00252fdrive_id$00253d993$00252fcategories$00253c$00257b52_3686967$00257d$00252fcategories$00253c$00257b52_3686967_3686326$00257d$00252fcategories$00253c$00257b52_3686967_3686326_3700610$00257d$00252fcategories$00253c$00257b52_3686967_3686326_3700610_3700620$00257d/Capsules$0020$002843$0029/3700620?t:ac=3686967/3700610

I want the last three numbers: item[0] = 3700620, item[1] = 3686967 and item[2] = 3700610.

I tried:

one = url.split('/')[-1]
two = url.split('/')[-2]

The first one gives 3700610.

The second one gives 3700620?t:ac=3686967.

Best answer

A non-regex approach would use urlparse plus a bit of split:

>>> # Python 3; on Python 2 these functions live in the "urlparse" module
>>> from urllib.parse import urlparse, parse_qs
>>> parsed_url = urlparse(url)
>>> number1 = parsed_url.path.split("/")[-1]
>>> number2, number3 = parse_qs(parsed_url.query)["t:ac"][0].split("/")
>>> number1, number2, number3
('3700620', '3686967', '3700610')
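
The urlparse approach above can be wrapped in a small function that returns the numbers in the item order asked for in the question. This is only a sketch for Python 3: the function name extract_ids and the shortened url are illustrative assumptions, not part of the original answer, and raising ValueError when t:ac is missing is a design choice rather than a requirement.

```python
from urllib.parse import urlparse, parse_qs


def extract_ids(url):
    """Return [last path segment] + the '/'-separated parts of the t:ac value."""
    parsed = urlparse(url)
    # Everything after the final "/" of the path, e.g. "3700620".
    last_segment = parsed.path.rsplit("/", 1)[-1]
    # parse_qs maps each key to a list of values; use the first "t:ac" value.
    ac_values = parse_qs(parsed.query).get("t:ac")
    if not ac_values:
        raise ValueError("no t:ac parameter in %r" % url)
    return [last_segment] + ac_values[0].split("/")


# Shortened stand-in for the URL in the question (an assumption for brevity).
url = "/drive/rayon.productlist.seomenulevel/Capsules$0020$002843$0029/3700620?t:ac=3686967/3700610"
item = extract_ids(url)
print(item)  # ['3700620', '3686967', '3700610']
```

Splitting the path and the query separately avoids the problem in the question, where the "/" inside the query string was treated as a path separator.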

A regex approach:

>>> import re
>>> re.search(r"/(\d+)\?t:ac=(\d+)/(\d+)$", url).groups()
('3700620', '3686967', '3700610')

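Here (\d+) is a capturing (saving) group that matches one or more digits, \? matches a literal question mark (it has to be escaped because ? has a special meaning in a regex), and $ matches the end of the string.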
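
The pieces of that pattern can be laid out one per line with re.VERBOSE, which may make the explanation easier to follow. A minimal sketch: the names PATTERN and last_three_numbers and the shortened url are assumptions for illustration, and returning None on a failed match is one possible choice (re.search returns None when nothing matches, so calling .groups() directly would raise AttributeError).

```python
import re

# Same pattern as the one-liner, written with re.VERBOSE so that whitespace
# inside the pattern is ignored and each part can carry a comment.
PATTERN = re.compile(r"""
    /(\d+)       # group 1: digits of the last path segment
    \?t:ac=      # an escaped, literal question mark, then the parameter name
    (\d+)        # group 2: first number after t:ac=
    /(\d+)       # group 3: second number
    $            # end of the string
""", re.VERBOSE)


def last_three_numbers(url):
    """Return the three captured numbers as strings, or None if no match."""
    match = PATTERN.search(url)
    return match.groups() if match else None


# Shortened stand-in for the URL in the question (an assumption for brevity).
url = "/Capsules$0020$002843$0029/3700620?t:ac=3686967/3700610"
print(last_three_numbers(url))                 # ('3700620', '3686967', '3700610')
print(last_three_numbers("/no/numbers/here"))  # None
```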

You can also name the groups and get a dictionary:

>>> re.search(r"/(?P<number1>\d+)\?t:ac=(?P<number2>\d+)/(?P<number3>\d+)", url).groupdict()
{'number1': '3700620', 'number2': '3686967', 'number3': '3700610'}
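
If integers are wanted instead of strings, the named groups can be converted and then arranged in the item order from the question. A sketch under stated assumptions: numbers_by_name and the shortened url are hypothetical, and returning an empty dict when nothing matches is an arbitrary choice.

```python
import re

NAMED = re.compile(
    r"/(?P<number1>\d+)\?t:ac=(?P<number2>\d+)/(?P<number3>\d+)"
)


def numbers_by_name(url):
    """Return {group name: int value}, or an empty dict if there is no match."""
    match = NAMED.search(url)
    if match is None:
        return {}
    # groupdict() yields strings; convert each captured value to int.
    return {name: int(value) for name, value in match.groupdict().items()}


# Shortened stand-in for the URL in the question (an assumption for brevity).
url = "/Capsules$0020$002843$0029/3700620?t:ac=3686967/3700610"
numbers = numbers_by_name(url)
# Rebuild the list in the order requested in the question.
item = [numbers["number1"], numbers["number2"], numbers["number3"]]
print(item)  # [3700620, 3686967, 3700610]
```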

Regarding "python - Splitting a URL with Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/36845976/
