
Python BeautifulSoup: looping through table data

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:05:33

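I'm new to Python. I'm trying to capture some data from this page: I want the item names and the item types in two lists. I can figure out later how to join them into one table. Any help would be great!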

The individual lines work on their own, but the loop does not. This code successfully prints the first item's name and type:

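import urllib.request  # a bare "import urllib" does not reliably load the request submodule
import bs4 as bs

# Download the page and parse it (the 'lxml' parser is a separate install)
sauce = urllib.request.urlopen('https://us.diablo3.com/en/item/helm/').read()
soup = bs.BeautifulSoup(sauce, 'lxml')

# find() returns only the first matching element
item_details = soup.find('tbody')
print(item_details)

item_name = item_details.find('div', class_='item-details').h3.a.text
print(item_name)

item_type = item_details.find('ul', class_='item-type').span.text
print(item_type)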

This loop repeats the first item's name and type over and over:

for div in soup.find_all('div', class_='item-details'):
    item_name = item_details.find('div', class_='item-details').h3.a.text
    print(item_name)
    item_type = item_details.find('ul', class_='item-type').span.text
    print(item_type)

Here is the output:

Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
Veil of Steel
Magic Helm
...

Best answer

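Your loop never uses its loop variable div: every iteration calls find on item_details, which always returns the first match. You need to use find_all (which returns a list) instead of find (which returns a single element):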

for i, j in zip(item_details.find_all('div', class_='item-details'), item_details.find_all('ul', class_='item-type')):
    print(i.h3.a.text, " - ", j.span.text)

The output is:

Veil of Steel  -  Magic Helm
Leoric's Crown  -  Legendary Helm
Harlequin Crest  -  Magic Helm
The Undead Crown  -  Magic Helm
...
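
As an alternative to zip, the loop from the question can also be repaired by searching inside each row rather than inside the whole table body. The sketch below is only an illustration: SAMPLE_HTML is hypothetical markup that borrows the class names from the question (the real page's structure may differ), extract_items is a made-up helper name, and the built-in 'html.parser' is used so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page; only the class names
# 'item-details' and 'item-type' are taken from the question.
SAMPLE_HTML = """
<table><tbody>
<tr><td>
  <div class="item-details"><h3><a href="#">Veil of Steel</a></h3></div>
  <ul class="item-type"><li><span>Magic Helm</span></li></ul>
</td></tr>
<tr><td>
  <div class="item-details"><h3><a href="#">Leoric's Crown</a></h3></div>
  <ul class="item-type"><li><span>Legendary Helm</span></li></ul>
</td></tr>
</tbody></table>
"""


def extract_items(html):
    """Return a list of (name, type) tuples, one per table row."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for row in soup.find_all("tr"):
        # Search inside the current row, not the whole table body.
        details = row.find("div", class_="item-details")
        type_list = row.find("ul", class_="item-type")
        if details is None or type_list is None:
            continue  # skip rows that lack either piece
        items.append((details.h3.a.text.strip(), type_list.span.text.strip()))
    return items


items = extract_items(SAMPLE_HTML)
for name, item_type in items:
    print(name, "-", item_type)
```

Scoping each find call to the row keeps a name paired with the type from the same row, even if some row is missing one of the two elements.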

Or, in a more readable form:

names = item_details.find_all('div', class_='item-details')
types = item_details.find_all('ul', class_='item-type')

# item_type avoids shadowing the built-in type()
for name, item_type in zip(names, types):
    print(name.h3.a.text, " - ", item_type.span.text)
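
Since the question asks for the names and types in two lists, the same find_all calls can feed list comprehensions. This is a minimal sketch under stated assumptions: SAMPLE_HTML is hypothetical markup reusing the question's class names, and the variable names item_names, item_types, and rows are illustrative rather than taken from the answer.

```python
from bs4 import BeautifulSoup

# Hypothetical markup; only the class names come from the question.
SAMPLE_HTML = """
<table><tbody>
<tr><td>
  <div class="item-details"><h3><a href="#">Veil of Steel</a></h3></div>
  <ul class="item-type"><li><span>Magic Helm</span></li></ul>
</td></tr>
<tr><td>
  <div class="item-details"><h3><a href="#">Harlequin Crest</a></h3></div>
  <ul class="item-type"><li><span>Magic Helm</span></li></ul>
</td></tr>
</tbody></table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table_body = soup.find("tbody")

# One list per column, built from every match rather than the first one.
item_names = [
    div.h3.a.text.strip()
    for div in table_body.find_all("div", class_="item-details")
]
item_types = [
    ul.span.text.strip()
    for ul in table_body.find_all("ul", class_="item-type")
]

# zip silently truncates to the shorter list, so check the lengths first.
if len(item_names) != len(item_types):
    raise ValueError("name and type counts differ; rows would be misaligned")

# Pair the two lists into table-like rows.
rows = list(zip(item_names, item_types))
print(rows)
```

The explicit length check matters because zip stops at the shorter input without warning, which would hide a row that has a name but no type.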

For more on looping through table data with Python BeautifulSoup, see the similar question on Stack Overflow: https://stackoverflow.com/questions/54238239/
