
Python, Selenium: Isolate Item From Returned List

Reposted. Author: 行者123. Updated: 2023-12-01 00:46:31

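With help from reading, videos, SO, and the community, I was able to scrape data from Tessco.com using Selenium and Python.

The site requires a username and password. I have included them in the code below; they are throwaway credentials created specifically for asking this question.

My end goal is to loop through an Excel list of part numbers and search for a set of parameters, including price. Before introducing the list to loop over, I would like to separate the information I need from everything that was scraped.

I am not sure how to filter this information.

Here is the code: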

    import time
    # Selenium drives the browser and interacts with web elements
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait
    # numpy/pandas to work with large datasets (the Excel part list, later)
    import numpy as np
    import pandas as pd

    chrome_path = r"C:\Users\James\Documents\Python Scripts\jupyterNoteBooks\ScrapingData\chromedriver_win32\chromedriver.exe"
    # Selenium 4 style; on Selenium 3 this was webdriver.Chrome(chrome_path)
    driver = webdriver.Chrome(service=Service(chrome_path))
    driver.get("https://www.tessco.com/login")

    userName = "FirstName.SurName321123@gmail.com"
    password = "PasswordForThis123"

    # Set up waits so elements have time to load into the DOM
    wait10 = WebDriverWait(driver, 10)
    wait20 = WebDriverWait(driver, 20)
    wait30 = WebDriverWait(driver, 30)

    elem = wait10.until(EC.element_to_be_clickable((By.ID, "userID")))
    elem.send_keys(userName)

    elem = wait10.until(EC.element_to_be_clickable((By.ID, "password")))
    elem.send_keys(password)

    # Press the login button
    driver.find_element(By.XPATH, "/html/body/account-login/div/div[1]/form/div[6]/div/button").click()

    # Expand the search bar
    searchIcon = wait10.until(EC.element_to_be_clickable((By.XPATH, "/html/body/header/div[2]/div/div/ul/li[2]/i")))
    searchIcon.click()

    searchBar = wait10.until(EC.element_to_be_clickable((By.XPATH, '/html/body/header/div[3]/input')))
    searchBar.click()

    # TODO: load manufacturer part numbers from a collection of components, via an Excel file

    # Enter the part number into the search bar and submit
    searchBar.send_keys("HL4RPV-50" + '\n')

    # Wait for the product results to be loaded
    products = wait30.until(EC.presence_of_all_elements_located((By.XPATH, "//div[@class='CoveoResult']")))
    # Dictionary mapping (part number, manufacturer, description) to price
    productInfo = {}
    # Iterate through all products in the search result and add details to the dictionary
    for product in products:
        # Get product info such as OEM, description and part number
        productDescr = product.find_element(By.XPATH, ".//a[@class='productName CoveoResultLink hidden-xs']").text
        mfgPart = product.find_element(By.XPATH, ".//ul[@class='unlisted info']").text.split('\n')[3]
        mfgName = product.find_element(By.TAG_NAME, "img").get_attribute("alt")

        # Get price
        price = product.find_element(By.XPATH, ".//div[@class='price']").text.split('\n')[1]

        # Add details to the dictionary, keyed by a tuple
        productInfo[mfgPart, mfgName, productDescr] = price

    # Print the collected product information
    print(productInfo)

The output is:

{('MFG PART #: HL4RPV-50', 'CommScope', '1/2" Plenum Air Cable, Off White'): '$1.89', ('MFG PART #: HL4RPV-50B', 'CommScope', '1/2" Plenum Air Cable, Blue'): '$1.89', ('MFG PART #: L4HM-D', 'CommScope', '4.3-10 Male for 1/2" AL4RPV-50,LDF4-50A,HL4RPV-50'): '$19.94', ('MFG PART #: L4HR-D', 'CommScope', '4.3-10M RA for 1/2" AL4RPV-50, LDF4-50A, HL4RPV-50'): '$39.26', ('MFG PART #: UPL-4MT-12', 'JMA Wireless', '4.3-10 Male Connector for 1/2” Plenum Cables'): '$32.99', ('MFG PART #: UPL-4F-12', 'JMA Wireless', '4.3-10 Female Connector for 1/2" Plenum'): '$33.33', ('MFG PART #: UPL-4RT-12', 'JMA Wireless', '4.3-10 R/A Male Connector for 1/2" Plenum'): '$42.82', ('MFG PART #: L4HF-D', 'CommScope', '4.3-10 Female for 1/2 in AL4RPV-50, LDF4-50A'): '$20.30'}

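I only want the item referenced in the automated search, so for this example I would be looking for

('MFG PART #: HL4RPV-50', 'CommScope', '1/2" Plenum Air Cable, Off White'): '$1.89'

Eventually I plan to replace the HL4RPV-50 string with a list of items, but for now I think I should filter down to what is needed.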

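I doubt the logic is correct, but I tried printing the product information for any part that matches the search term, as shown below.

    for item in mfgPart:
        if mfgPart == "HL4RPV-50":
            print(productInfo)

But the code above just printed all of the output, same as before.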
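One possible reading of why this attempt cannot filter anything, sketched under the assumption that `mfgPart` still holds the last value assigned in the scraping loop (the string is copied from the final entry of the output above): iterating over a string yields single characters, and the scraped text carries a `MFG PART #: ` prefix, so comparing it to the bare part number is never true.

```python
# Assumption: after the scraping loop ends, mfgPart holds the last result's text.
mfgPart = "MFG PART #: L4HF-D"

# Iterating over a string walks its characters, not the scraped products.
first_items = list(mfgPart)[:3]
print(first_items)

# Even for the matching product, the prefix makes the equality test fail.
scraped = "MFG PART #: HL4RPV-50"
prefixed_equal = (scraped == "HL4RPV-50")
print(prefixed_equal)

# Stripping the prefix first makes an exact comparison possible.
stripped_equal = (scraped.split(": ")[1] == "HL4RPV-50")
print(stripped_equal)
```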

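I then tried importing itertools and running the following:

    print(dict(itertools.islice(productInfo.items(), 1)))

This did return the line item I wanted, but there is no guarantee that the first item returned is the one I am looking for. It would be best if I could filter for an exact match on a given part number.

Is there a way to filter the results based on the input?

Any tips are much appreciated.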

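Best answer

The other answers seem to check whether the part number is contained in the manufacturer part string, but I see that several items can contain the same part number, for example HL4RPV-50 and HL4RPV-50B. If you want to isolate the part number so that you know exactly which part you are looking at, I suggest iterating over the dictionary and splitting the manufacturer part string at the colon to get the ID. You can also unpack the other parts of each item to print the information more cleanly, as in the example below.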

    for (mfg_part, comm_scope, name), price in productInfo.items():
        mfg_id = mfg_part.split(': ')[1]
        if mfg_id == 'HL4RPV-50':
            print('Part #:', mfg_id)
            print('Company:', comm_scope)
            print('Name:', name)
            print('Price:', price)
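The same idea can be wrapped in a small helper so it can later be called once per part number from a list. This is only a sketch: `find_exact_part` and `sample` are hypothetical names introduced here, and the sample entries are copied from the output shown in the question. It also shows why a substring check is not enough.

```python
def find_exact_part(product_info, part_number):
    """Return the entries whose manufacturer part ID equals part_number exactly."""
    matches = {}
    for (mfg_part, mfg_name, descr), price in product_info.items():
        # "MFG PART #: HL4RPV-50" -> "HL4RPV-50"; empty string if no ": " is present
        mfg_id = mfg_part.partition(": ")[2].strip()
        if mfg_id == part_number:
            matches[(mfg_id, mfg_name, descr)] = price
    return matches


# Entries copied from the scraped output in the question
sample = {
    ('MFG PART #: HL4RPV-50', 'CommScope', '1/2" Plenum Air Cable, Off White'): '$1.89',
    ('MFG PART #: HL4RPV-50B', 'CommScope', '1/2" Plenum Air Cable, Blue'): '$1.89',
    ('MFG PART #: L4HM-D', 'CommScope', '4.3-10 Male for 1/2" AL4RPV-50,LDF4-50A,HL4RPV-50'): '$19.94',
}

# A substring test would also match the HL4RPV-50B entry
substring_hits = [key[0] for key in sample if 'HL4RPV-50' in key[0]]
print(len(substring_hits))

# The exact comparison keeps only the requested part
result = find_exact_part(sample, 'HL4RPV-50')
print(result)
```

Once the part numbers come from a list, calling the helper once per entry would keep each search result separate.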

Regarding "Python, Selenium: Isolate Item From Returned List", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56940686/
