python mechanize find_link: finding the last matching link

Reposted. Author: 行者123. Updated: 2023-11-28 20:53:06

I have a page with one or more links whose text contains "Display charges". I can find the first such link with:

firstLink = br.find_link(text_regex=re.compile("Display charges"),nr=0)

I would like to be able to find the last such link. I hoped this would work:

lastLink = br.find_link(text_regex=re.compile("Display charges"),nr=-1)

But it fails when there is only one matching link.

Please note: I am a Python and mechanize beginner, but I discovered help(mechanize.Browser), which was a major breakthrough :)

Best answer

You can use br.links() to generate all such links, then use list(...)[-1] to pick the last one:

lastLink = list(br.links(text_regex=re.compile("Display charges")))[-1]

For example:

In [29]: import mechanize

In [30]: import re

In [31]: br=mechanize.Browser()

In [32]: br.open('http://www.example.com')
Out[32]: <response_seek_wrapper at 0xa2b59ec whose wrapped object = <closeable_response at 0xa2b554c whose fp = <socket._fileobject object at 0xa3143ac>>>

In [33]: br.links()
Out[33]: <generator object __call__ at 0xa289af4>

In [34]: list(br.links())
Out[34]:
[Link(base_url='http://www.iana.org/domains/example/', url='/', text='Homepage[IMG]', tag='a', attrs=[('href', '/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/domains/', text='Domains', tag='a', attrs=[('href', '/domains/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/numbers/', text='Numbers', tag='a', attrs=[('href', '/numbers/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/protocols/', text='Protocols', tag='a', attrs=[('href', '/protocols/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/about/', text='About IANA', tag='a', attrs=[('href', '/about/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/go/rfc2606', text='RFC 2606', tag='a', attrs=[('href', '/go/rfc2606')]),
Link(base_url='http://www.iana.org/domains/example/', url='/about/', text='About', tag='a', attrs=[('href', '/about/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/domains/', text='Domains', tag='a', attrs=[('href', '/domains/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/protocols/', text='Protocols', tag='a', attrs=[('href', '/protocols/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/numbers/', text='Number Resources', tag='a', attrs=[('href', '/numbers/')]),
Link(base_url='http://www.iana.org/domains/example/', url='http://www.icann.org/', text='Internet Corporation for Assigned Names and Numbers', tag='a', attrs=[('href', 'http://www.icann.org/')]),
Link(base_url='http://www.iana.org/domains/example/', url='mailto:iana@iana.org?subject=General%20website%20feedback', text='iana@iana.org', tag='a', attrs=[('href', 'mailto:iana@iana.org?subject=General%20website%20feedback')])]

In [35]: list(br.links(text_regex=re.compile("About")))
Out[35]:
[Link(base_url='http://www.iana.org/domains/example/', url='/about/', text='About IANA', tag='a', attrs=[('href', '/about/')]),
Link(base_url='http://www.iana.org/domains/example/', url='/about/', text='About', tag='a', attrs=[('href', '/about/')])]
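
One caveat with list(...)[-1]: if no link matches, the list is empty and indexing it raises IndexError. Below is a minimal sketch of a helper that picks the last item from any iterable, including the generator returned by br.links(), and raises a clearer error when nothing matches. The name last_item is hypothetical and not part of mechanize; the helper is written against plain Python iterables so that it does not depend on any mechanize internals.

```python
from collections import deque


def last_item(iterable):
    """Return the last item of an iterable; raise LookupError if it is empty.

    `last_item` is an illustrative helper, not a mechanize function.
    """
    # A deque with maxlen=1 discards everything except the most recent item,
    # so the full sequence is never held in memory at once.
    tail = deque(iterable, maxlen=1)
    if not tail:
        raise LookupError("no matching item found")
    return tail[0]
```

Assuming br is a mechanize.Browser that has already opened the page, it could be used as lastLink = last_item(br.links(text_regex=re.compile("Display charges"))). It behaves the same whether there is one matching link or many.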

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/5513059/
