
python - How do I find a text match in the "class" of an HTML tag?

Reposted. Author: 太空宇宙. Updated: 2023-11-03 20:37:48

I need to parse HTML and find the elements whose class matches "product-size _product-size" (with no other words such as "disabled _disabled"). So I used BeautifulSoup and cut out the piece of HTML I need:

import requests
from bs4 import BeautifulSoup
import re

URL =.......

headers = {"User-Agent": .......}

page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'html.parser')

div = soup.find("div", class_="size-list")
print("Find size-list \n" + str(div) +'\n')

and got this:

<div class="size-list" tabindex="-1">
<label for="size-10"
class="product-size _product-size disabled _disabled "
data-sku="01122345" data-name="10">
<div>
<input type="radio" value="10" name="size" id="size-10"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="10">10</span>
<span></span>
</label>
<label for="size-11"
class="product-size _product-size disabled _disabled "
data-sku="01122346" data-name="11">
<div>
<input type="radio" value="11" name="size" id="size-11"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="11">11</span>
<span></span>
</label>
<label for="size-12"
class="product-size _product-size "
data-sku="01122347" data-name="12">
<div>
<input type="radio" value="12" name="size" id="size-12"
class="_sizeInput" tabindex="0">
</div>
<span class="size-name" title="12">12</span>
<span></span>
</label>
<label for="size-13"
class="product-size _product-size disabled _disabled "
data-sku="01122348" data-name="13">
<div>
<input type="radio" value="13" name="size" id="size-13"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="13">13</span>
<span></span>
</label>
<label for="size-14"
class="product-size _product-size "
data-sku="01122349" data-name="14">
<div>
<input type="radio" value="14" name="size" id="size-14"
class="_sizeInput" tabindex="0">
</div>
<span class="size-name" title="14">14</span>
<span></span>
</label>
</div>

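Now I need to find the matches whose class contains the string "product-size _product-size" but not "disabled _disabled", and if I find any, check their "size-name". I am just stuck (I have been a Python user for half an hour, sorry). I tried these to find a simple match for the string "product-size _product-size":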

soup.find_all('label', class_="product-size _product-size ")
soup.find(class_="product-size _product-size ")
soup.find_all(text=re.compile(r'product-size _product-size '))
# div.find... or soup.find..., etc., whatever.

But I only get [] or None. What am I doing wrong?

Best answer

Use a CSS selector with :not(class):

from bs4 import BeautifulSoup

data='''<div class="size-list" tabindex="-1">
<label for="size-10"
class="product-size _product-size disabled _disabled "
data-sku="01122345" data-name="10">
<div>
<input type="radio" value="10" name="size" id="size-10"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="10">10</span>
<span></span>
</label>
<label for="size-11"
class="product-size _product-size disabled _disabled "
data-sku="01122346" data-name="11">
<div>
<input type="radio" value="11" name="size" id="size-11"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="11">11</span>
<span></span>
</label>
<label for="size-12"
class="product-size _product-size "
data-sku="01122347" data-name="12">
<div>
<input type="radio" value="12" name="size" id="size-12"
class="_sizeInput" tabindex="0">
</div>
<span class="size-name" title="12">12</span>
<span></span>
</label>
<label for="size-13"
class="product-size _product-size disabled _disabled "
data-sku="01122348" data-name="13">
<div>
<input type="radio" value="13" name="size" id="size-13"
disabled="disabled" class="_sizeInput" tabindex="-1">
</div>
<span class="size-name" title="13">13</span>
<span></span>
</label>
<label for="size-14"
class="product-size _product-size "
data-sku="01122349" data-name="14">
<div>
<input type="radio" value="14" name="size" id="size-14"
class="_sizeInput" tabindex="0">
</div>
<span class="size-name" title="14">14</span>
<span></span>
</label>
</div>'''
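soup = BeautifulSoup(data, 'html.parser')
# Select elements that have both classes but do not have the "disabled" class.
for item in soup.select('.product-size._product-size:not(.disabled)'):
    # Print the text of the size-name span inside each matching label.
    print(item.select_one('.size-name').text)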

Output:

12
14
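
As for why the attempts in the question came back empty: as far as I can tell, Beautiful Soup parses the class attribute into a list of individual class names, so a filter string with a trailing space such as "product-size _product-size " equals neither a single class nor the joined class string. The text=re.compile(...) call searches text nodes, not attribute values, so it never sees the class at all. If you would rather stay with find_all than use a CSS selector, one possible sketch is to match on a single class and filter out the disabled labels in Python. The helper name available_sizes and the shortened markup are only for illustration and are not from the original post.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question: one disabled size, one available.
html = '''<div class="size-list" tabindex="-1">
<label for="size-11"
class="product-size _product-size disabled _disabled "
data-sku="01122346" data-name="11">
<span class="size-name" title="11">11</span>
</label>
<label for="size-12"
class="product-size _product-size "
data-sku="01122347" data-name="12">
<span class="size-name" title="12">12</span>
</label>
</div>'''

soup = BeautifulSoup(html, "html.parser")


def available_sizes(soup):
    """Hypothetical helper: size names of labels with 'product-size' but without 'disabled'."""
    sizes = []
    # A single class name in class_ matches any tag that carries that class.
    for label in soup.find_all("label", class_="product-size"):
        classes = label.get("class", [])  # the class attribute as a list of names
        if "disabled" in classes:
            continue  # skip sizes that are not selectable
        name = label.find("span", class_="size-name")
        if name is not None:
            sizes.append(name.get_text(strip=True))
    return sizes


print(available_sizes(soup))
```

On the full markup from the question, this should pick out the same labels as the selector above, since both approaches simply exclude anything carrying the disabled class.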

Regarding "python - How do I find a text match in the "class" of an HTML tag?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/57047053/
