
python - How to get images from a link?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:17:47

When I try to access the image tags with this code, I get the following output.

import requests
from bs4 import BeautifulSoup as bs

url = 'https://paytm.com/shop/p/pepe-jeans-blue-slim-fit-t-shirts-APPPEPE-JEANS-BSETU2010438B648267'

def soup_maker(url):
    # Fetch the raw HTML; requests does not execute any JavaScript
    r = requests.get(url)
    markup = r.content
    soup = bs(markup, "html.parser")
    return soup

def get_images(url):
    soup = soup_maker(url)
    # Look for the product image containers and for every <img> tag
    divs = soup.find_all('div', {'class': 'fixed-height'})
    print(divs)
    images = soup.find_all('img')
    print(images)

Output

[]
[<img alt="{{::product.text}}" ng-src="{{::product.image_url}}"/>,
<img alt="{{item.title}}" ng-src='{{cart.imgResized(item.image_url,"50x50") }}'/>,
<img ng-src="{{pixelSource}}"/>]

But when I look at the page with Inspect Element, the images are there. I don't know how to save these images.

Update

from bs4 import BeautifulSoup as bs
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def soup_maker(url):
    driver = webdriver.Chrome()
    driver.get(url)
    try:
        # Wait up to 20 seconds for the product image container to be rendered
        element = WebDriverWait(driver, 20).until(
            EC.presence_of_element_located((By.CLASS_NAME, "fixed-height"))
        )
        markup = driver.page_source
        soup = bs(markup, "html.parser")
        return soup
    finally:
        # quit() already closes every window and ends the session,
        # so no separate close() call is needed afterwards
        driver.quit()

The above worked for me.
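One way to confirm that the wait actually produced rendered markup is to check whether any img tag still carries an unrendered {{...}} binding, like the ones in the output above. The following is only a minimal sketch: the names ImgAttrCollector and unrendered_images are made up for illustration, and it uses the standard library's HTMLParser instead of BeautifulSoup so that it runs without extra packages.

```python
import re
from html.parser import HTMLParser

# Matches an AngularJS-style interpolation such as {{::product.image_url}}
BINDING = re.compile(r"\{\{.*?\}\}")

class ImgAttrCollector(HTMLParser):
    """Hypothetical helper: collect the attributes of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <img ... />
        if tag == "img":
            self.images.append(dict(attrs))

def unrendered_images(markup):
    """Return the <img> attribute dicts that still contain {{...}} bindings."""
    collector = ImgAttrCollector()
    collector.feed(markup)
    return [
        attrs for attrs in collector.images
        if any(value and BINDING.search(value) for value in attrs.values())
    ]

# Sample markup: one unrendered template tag, one ordinary image tag
sample = (
    '<img alt="{{::product.text}}" ng-src="{{::product.image_url}}"/>'
    '<img alt="shirt" src="https://example.com/shirt.jpg"/>'
)
print(len(unrendered_images(sample)))  # prints 1
```

If this returns a non-empty list for driver.page_source, the page was probably captured before the browser finished rendering it, and a longer or more specific wait may be needed.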

Best answer

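This looks like an AngularJS template with bindings defined, which means the site needs a real browser with a JavaScript engine to render it. Let's keep the parsing part, but get the page source from selenium instead of requests: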

from bs4 import BeautifulSoup as bs
from selenium import webdriver

def soup_maker(url):
    driver = webdriver.Firefox()  # could also be Chrome(), PhantomJS() or other
    driver.get(url)

    # you might also need an Explicit Wait here to wait for the page to load
    # see http://selenium-python.readthedocs.org/waits.html#explicit-waits

    markup = driver.page_source
    driver.close()
    soup = bs(markup, "html.parser")
    return soup
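The question also asks how to save the images once they are visible in the rendered source. Below is a rough sketch of that last step, not part of the original answer. It assumes the rendered img tags carry a usable src attribute, the helper names image_urls and save_images and the images folder are invented for illustration, and it sticks to the standard library (HTMLParser and urllib) rather than BeautifulSoup and requests so that it has no dependencies.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse
from urllib.request import urlopen

class ImgSrcCollector(HTMLParser):
    """Hypothetical helper: collect the src attribute of every <img> tag."""

    def __init__(self):
        super().__init__()
        self.sources = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src:  # skip tags that have no src, e.g. unrendered ng-src templates
                self.sources.append(src)

def image_urls(markup, page_url):
    """Return absolute image URLs found in markup, resolved against page_url."""
    collector = ImgSrcCollector()
    collector.feed(markup)
    return [urljoin(page_url, src) for src in collector.sources]

def save_images(urls, folder="images"):
    """Download each URL into folder (performs network requests)."""
    os.makedirs(folder, exist_ok=True)
    for index, url in enumerate(urls):
        # Fall back to a numbered name when the URL path has no file name
        name = os.path.basename(urlparse(url).path) or f"image_{index}"
        with urlopen(url) as response, open(os.path.join(folder, name), "wb") as out:
            out.write(response.read())

# Offline demonstration with made-up markup and a made-up page URL
demo = '<img src="/a/b.jpg"><img ng-src="{{x}}"><img src="//cdn.example.com/c.png">'
print(image_urls(demo, "https://example.com/shop/p/item"))
```

With the answer's approach, driver.page_source could be passed as markup and the product URL as page_url, followed by save_images(image_urls(markup, url)). Some sites reject requests that lack browser-like headers, so the download step may need adjusting.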

Regarding "python - How to get images from a link?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/35679169/
