python - Python bs4: getting stray text

Reposted. Author: 太空宇宙. Updated: 2023-11-04 09:27:02

    <li><a class="atc-group atc-group-active" href="" data-url="/atc-kodlari/1">
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>

<span class="lists-rundown-no">(16)</span>
</a>
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>




<span class="lists-rundown-no">(16)</span>
<a class="atc-group atc-group-active" href="" data-url="/atc-kodlari/1">
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>
HERE!!
<span class="lists-rundown-no">(16)</span>
</a></li>

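I need to extract the part marked HERE!! using Beautiful Soup in Python, but it is stray text, so it has no selector or anything else to target. Is it possible to get it?

What I tried: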

import requests
from bs4 import BeautifulSoup

r = requests.get('anywebsite')
source = BeautifulSoup(r.content,"lxml")

for child in source.select("#atc-wrapper > ul"):
    for child2 in child.findChildren():
        print(child2)

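Best answer

You can use the CSS selector a:last-of-type i to select the <i> element inside the last <a> element. Then call find_next() with the parameter text=True to get the text node that follows it: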

data = '''    <li><a class="atc-group atc-group-active" href="" data-url="/atc-kodlari/1">
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>
A - Gastrointestinal kanal ve metabolizma
<span class="lists-rundown-no">(16)</span>
</a>
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>


A - Gastrointestinal kanal ve metabolizma

<span class="lists-rundown-no">(16)</span>
<a class="atc-group atc-group-active" href="" data-url="/atc-kodlari/1">
<i class="fa fa-lg fa-pulse fa-spinner atc-group-loading" style="margin-right: 5px; display: none;"></i>
HERE!!
<span class="lists-rundown-no">(16)</span>
</a></li>'''

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'lxml')

# select last i
i = soup.select_one('a:last-of-type i')

# select next text
print(i.find_next(text=True).strip())

Prints:

HERE!!
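
As an alternative to find_next(), the stray text can also be read directly from the children of the last <a>, since an unwrapped piece of text is stored as a NavigableString node. The following is only a minimal sketch: the markup is a shortened stand-in for the snippet in the question, the helper name stray_text is hypothetical, and html.parser is used instead of lxml purely to avoid the extra dependency.

```python
from bs4 import BeautifulSoup
from bs4.element import NavigableString

# Shortened stand-in for the markup in the question (an assumption).
html = '''<li><a class="atc-group" href="">
<i class="fa"></i>
A - Gastrointestinal kanal ve metabolizma
<span class="lists-rundown-no">(16)</span>
</a>
<a class="atc-group" href="">
<i class="fa"></i>
HERE!!
<span class="lists-rundown-no">(16)</span>
</a></li>'''

# html.parser is chosen here only so the sketch runs without lxml installed.
soup = BeautifulSoup(html, "html.parser")


def stray_text(tag):
    """Return the non-empty text nodes that are direct children of tag."""
    return [
        child.strip()
        for child in tag.children
        if isinstance(child, NavigableString) and child.strip()
    ]


# Take the last <a> that is a direct child of the <li>.
last_a = soup.select("li > a")[-1]
result = stray_text(last_a)
print(result)  # ['HERE!!']
```

This variant does not depend on an <i> element preceding the text, which may help if the markup varies. Note also that Beautiful Soup 4.4.0 and later accept string=True as the newer name for the text=True filter.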

Further reading:

CSS Selectors Reference

For "python - Python bs4: getting stray text", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57157900/
