
python-3.x - Skipping a page in Python Selenium when an "element not visible" message appears

Reposted. Author: 行者123. Updated: 2023-12-03 07:43:01

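I'm trying to get a text element from a page. To reach this element, my script clicks two filters on the page. I need to scrape 5,000 pages. The script works for collecting the text element, but after a certain number of pages it always returns the message "element not visible". I assume this happens because the page did not load in time: I checked the pages where the script broke, and the text element was present on them. (I have already added time.sleep(3) after each click.) What can I use in my script to skip a page if it does not load in time?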

import json
import os
import re
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options


def yelp_scraper(url):
    driver.get(url)
    # get total number of restaurants
    total_rest_loc = '//span[contains(text(),"Showing 1")]'
    total_rest_raw = driver.find_element_by_xpath(total_rest_loc).text
    total_rest = int(re.sub(r'Showing 1.*of\s', '', total_rest_raw))

    # open the filter panel
    button1 = driver.find_element_by_xpath('//span[@class="filter-label filters-toggle js-all-filters-toggle show-tooltip"]')
    button1.click()
    time.sleep(1)

    # apply the "Walking (1 mi.)" distance filter
    button2 = driver.find_element_by_xpath('//span[contains(text(),"Walking (1 mi.)")]')
    button2.click()
    time.sleep(2)

    # read the restaurant count again after filtering
    rest_num_loc = '//span[contains(text(),"Showing 1")]'
    rest_num_raw = driver.find_element_by_xpath(rest_num_loc).text
    rest_num = int(re.sub(r'Showing 1.*of\s', '', rest_num_raw))

    if total_rest == rest_num:
        # count unchanged after filtering: switch to Biking and back to Walking, then re-read
        button3 = driver.find_element_by_xpath('//span[contains(text(),"Biking (2 mi.)")]')
        button3.click()
        time.sleep(2)

        button4 = driver.find_element_by_xpath('//span[contains(text(),"Walking (1 mi.)")]')
        button4.click()
        time.sleep(2)

        rest_num_loc = '//span[contains(text(),"Showing 1")]'
        rest_num_raw = driver.find_element_by_xpath(rest_num_loc).text
        rest_num = int(re.sub(r'Showing 1.*of\s', '', rest_num_raw))

    return rest_num


chromedriver = "/Applications/chromedriver"  # path to the chromedriver executable
os.environ["webdriver.chrome.driver"] = chromedriver

chrome_options = Options()
# add headless mode
chrome_options.add_argument("--headless")
# turn off image loading
prefs = {"profile.managed_default_content_settings.images": 2}
chrome_options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome(chromedriver, chrome_options=chrome_options)

# url_list and yelp_data are not defined in this excerpt
for url in url_list:
    yelp_data[url] = yelp_scraper(url)

with open('../data/yelp_json/yelp_data.json', 'w') as f:
    json.dump(yelp_data, f, indent="\t")

driver.close()

Best answer

  • Example:
        from selenium.common.exceptions import NoSuchElementException

        # items, a (an object with writerow) and b are not defined in this excerpt
        for item in driver.find_elements_by_class_name('item'):
            try:
                model = item.find_element_by_class_name('product-model')
                price = item.find_element_by_class_name('product-display-price')
                title = item.find_element_by_class_name('product-title')
                url = item.find_element_by_class_name('js-detail-link')

                items.append({'model': model, 'price': price, 'title': title, 'url': url})
                print(model.text, price.text, title.text, url.get_attribute("href"))
                c = (model.text, price.text, title.text, url.get_attribute("href"))
                a.writerow(c)
            except NoSuchElementException:
                # Here you can do whatever you want when an element is not found;
                # the loop then continues with the next item.
                continue
        b.close()
  • Regarding "python-3.x - Skipping a page in Python Selenium when an 'element not visible' message appears", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49970316/
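
The try/except pattern in the answer can be moved up one level to address the original question: wrap each call to the per-page scraper and skip the page when it raises. The following is a minimal sketch under that assumption, not the answerer's code. The names `scrape_all`, `scrape_one` and `skip_on` are hypothetical and introduced here only for illustration; the scraper is passed in as a plain callable so the loop itself does not depend on Selenium.

```python
from typing import Callable, Dict, Iterable, List, Tuple, Type


def scrape_all(
    urls: Iterable[str],
    scrape_one: Callable[[str], int],
    skip_on: Tuple[Type[Exception], ...],
) -> Tuple[Dict[str, int], List[str]]:
    """Scrape every URL, skipping pages whose scraper raises one of skip_on.

    Hypothetical helper: scrape_one stands in for a function like
    yelp_scraper, and skip_on for the exception classes to tolerate.
    """
    results: Dict[str, int] = {}
    skipped: List[str] = []
    for url in urls:
        try:
            results[url] = scrape_one(url)
        except skip_on as exc:
            # The page did not render as expected: record it and move on
            # to the next URL instead of aborting the whole run.
            print(f"skipping {url}: {type(exc).__name__}")
            skipped.append(url)
    return results, skipped
```

In the question's script this might be called as `yelp_data, skipped = scrape_all(url_list, yelp_scraper, (NoSuchElementException, ElementNotVisibleException))`, with both exceptions imported from `selenium.common.exceptions`. The returned `skipped` list can then be logged or retried later; any exception not listed in `skip_on` still propagates, so unrelated bugs are not silently hidden.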
