
python - Extracting certain columns from a table using BeautifulSoup

Reposted. Author: 行者123. Updated: 2023-12-01 00:46:59

Hello, I am trying to determine the dates on which an item was purchased on eBay, using the HTML table at: https://offer.ebay.com/ws/eBayISAPI.dll?ViewBidsLogin&item=173653442617&rt=nc&_trksid=p2047675.l2564

My Python code:

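import requests
from bs4 import BeautifulSoup

# URL of the purchase-history page given above
item_link = 'https://offer.ebay.com/ws/eBayISAPI.dll?ViewBidsLogin&item=173653442617&rt=nc&_trksid=p2047675.l2564'

def soup_creator(url):
    # Downloads the eBay page for processing
    res = requests.get(url)
    # Raises an exception if the download failed
    res.raise_for_status()
    # Creates a BeautifulSoup object for HTML parsing
    return BeautifulSoup(res.text, 'lxml')

soup = soup_creator(item_link)
purchases = soup.find('div', attrs={'class': 'BHbidSecBorderGrey'})
purchases = purchases.find_all('tr', attrs={'bgcolor': '#ffffff'})
for purchase in purchases:
    # The third left-aligned cell is expected to hold the purchase date
    date = purchase.find_all("td", {"align": "left"})
    date = date[2].get_text()
    # Print the extracted date rather than the whole row
    print(date)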

When I run the print statement, nothing is returned, which I take to mean that nothing was found. I would like it to print something like this:

Jul-02-19 18:22:28 PDT
Jun-27-19 16:12:59 PDT
Jun-23-19 06:46:23 PDT
...

Best answer

pandas:

With pandas this is very simple: index into the correct table in the list returned by read_html, then slice out the column.

import pandas as pd

table = pd.read_html('https://offer.ebay.com/ws/eBayISAPI.dll?ViewBidsLogin&item=173653442617&rt=nc&_trksid=p2047675.l2564')[4]
table['Date of Purchase']
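
The hard-coded index [4] depends on how many tables the page contains at the time of the request. As a minimal sketch of an alternative, the `match` argument of `pd.read_html` keeps only tables whose text contains a given string. The HTML below is a made-up stand-in for the eBay page (the user IDs, prices, and the extra first table are assumptions for illustration), and it is wrapped in `StringIO` because recent pandas versions expect a file-like object rather than a literal HTML string. As with the call above, `read_html` needs an HTML parser such as lxml to be installed.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the eBay page: two tables, only one of which
# holds the purchase history.
SAMPLE_HTML = """
<table><tr><th>Item</th></tr><tr><td>173653442617</td></tr></table>
<table>
  <tr><th>User ID</th><th>Price</th><th>Date of Purchase</th></tr>
  <tr><td>a***b</td><td>US $10.00</td><td>Jul-02-19 18:22:28 PDT</td></tr>
  <tr><td>c***d</td><td>US $10.00</td><td>Jun-27-19 16:12:59 PDT</td></tr>
</table>
"""

# match= keeps only the tables whose text contains "Date of Purchase",
# so no positional index into the full list of tables is needed.
tables = pd.read_html(StringIO(SAMPLE_HTML), match="Date of Purchase")

# The first (and here only) matching table; pull out the column as a list.
dates = tables[0]["Date of Purchase"].tolist()
print(dates)
```

Against the live page the same call would take the URL in place of the `StringIO` object, provided the column header text is still "Date of Purchase".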

bs4 method 1:

Since you know the column number, you can use nth-of-type on the table of interest.

from bs4 import BeautifulSoup as bs
import requests

r = requests.get('https://offer.ebay.com/ws/eBayISAPI.dll?ViewBidsLogin&item=173653442617&rt=nc&_trksid=p2047675.l2564')
soup = bs(r.content, 'lxml')
#if column # is known
purchases = [item.text for item in soup.select('table[width] td:nth-of-type(5)')]
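
To see what the nth-of-type selector picks up without requesting the live page, here is a small offline sketch. The HTML is a made-up three-column table (so the date sits in column 3 rather than column 5), and it uses the built-in "html.parser" instead of lxml purely to avoid an extra dependency; both choices are assumptions for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical sample table; the real eBay table has more columns.
SAMPLE_HTML = """
<table width="100%">
  <tr><th>User ID</th><th>Price</th><th>Date of Purchase</th></tr>
  <tr><td>a***b</td><td>US $10.00</td><td>Jul-02-19 18:22:28 PDT</td></tr>
  <tr><td>c***d</td><td>US $10.00</td><td>Jun-27-19 16:12:59 PDT</td></tr>
</table>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# td:nth-of-type(3) matches the third <td> in each row (counting is 1-based).
# The header row is skipped automatically because its cells are <th>, not <td>.
third_column = [
    td.get_text(strip=True)
    for td in soup.select("table[width] td:nth-of-type(3)")
]
print(third_column)
```

Using get_text(strip=True) instead of .text trims the whitespace that often surrounds cell contents in real pages.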

bs4 method 2 (less ideal; for when the column number is not known):

from bs4 import BeautifulSoup as bs
import requests

r = requests.get('https://offer.ebay.com/ws/eBayISAPI.dll?ViewBidsLogin&item=173653442617&rt=nc&_trksid=p2047675.l2564')
soup = bs(r.content, 'lxml')
#if column # not known
headers = [item.text.strip() for item in soup.select('table[width] th')]
desired_header = 'Date of Purchase'

if desired_header in headers:
    print([item.text for item in soup.select('table[width] td:nth-of-type(' + str(headers.index(desired_header) + 1) + ')')])
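
One thing to watch with the header lookup is that the header list is collected across every table matching the selector, so if several tables match, the index may be offset by headers from the other tables. The following is a rough sketch of a per-table variant. The helper name `column_by_header`, the sample HTML, and the use of "html.parser" are all hypothetical choices for illustration, not part of the original answer.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: an unrelated table followed by the purchase table.
SAMPLE_HTML = """
<table width="50%"><tr><th>Item</th></tr><tr><td>173653442617</td></tr></table>
<table width="100%">
  <tr><th>User ID</th><th>Price</th><th>Date of Purchase</th></tr>
  <tr><td>a***b</td><td>US $10.00</td><td>Jul-02-19 18:22:28 PDT</td></tr>
  <tr><td>c***d</td><td>US $10.00</td><td>Jun-27-19 16:12:59 PDT</td></tr>
</table>
"""


def column_by_header(html, header):
    """Return the cell texts under `header`, or [] if no table has that header."""
    soup = BeautifulSoup(html, "html.parser")
    for table in soup.find_all("table"):
        # Collect headers per table so the index is relative to this table only.
        headers = [th.get_text(strip=True) for th in table.find_all("th")]
        if header in headers:
            position = headers.index(header) + 1  # nth-of-type is 1-based
            cells = table.select(f"td:nth-of-type({position})")
            return [td.get_text(strip=True) for td in cells]
    return []


print(column_by_header(SAMPLE_HTML, "Date of Purchase"))
print(column_by_header(SAMPLE_HTML, "No Such Column"))
```

The sketch assumes header cells are `<th>` and data cells are `<td>`, with no nested tables; a page that marks up its headers differently would need a different lookup.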

Regarding "python - Extracting certain columns from a table using BeautifulSoup", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56895156/
