
How to scrape Title and Price (Beautifulsoup)

Reposted — Author: bug小助手 · Updated: 2023-10-22 13:03:31



I'm trying to get all the album names and prices from this website: https://vinilosalvaro.cl/tienda/



But with the following script I'm just getting one of them.



import requests
from bs4 import BeautifulSoup


URL = 'https://vinilosalvaro.cl/tienda/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}
page = requests.get(URL, headers=headers)

soup = BeautifulSoup(page.content, 'html.parser')
listado_productos = soup.find_all('ul', class_='products columns-3')

for listado_productos in listado_productos:
    titulos = listado_productos.find('h2', class_='woocommerce-loop-product__title').text.strip()
    precios = listado_productos.find('span', class_='woocommerce-Price-amount amount').text.strip()
    print(titulos)
    print(precios)

How to get all the album names and prices?



Recommended answers

The main issue is that your selection gives you a ResultSet containing one <ul>, not all of the <li> elements, so your loop iterates only once.


Select your elements more specifically, as @benyamin payandeh also mentioned, or, for example, with CSS selectors:


soup.select('ul.products li')
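To see the difference on a small scale, here is a minimal sketch (the inline HTML is a made-up stand-in for the shop's markup, not the real page): find_all on the <ul> yields a single-element ResultSet, while selecting the <li> children yields one element per product.

```python
from bs4 import BeautifulSoup

# Made-up HTML mimicking the WooCommerce product list structure
html = '<ul class="products columns-3"><li>A</li><li>B</li><li>C</li></ul>'
soup = BeautifulSoup(html, 'html.parser')

# find_all on the <ul> returns a ResultSet with a single element,
# which is why the original loop body only runs once
uls = soup.find_all('ul', class_='products columns-3')
print(len(uls))  # 1

# selecting the <li> children yields one element per product
lis = soup.select('ul.products li')
print(len(lis))  # 3
```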

In addition, consider some further concepts: a while loop for paging, .get_text(strip=True), and storing your results across iterations in a more structured form such as a list of dicts, which you can simply transform into a dataframe or process as you need.


Example

Be aware this will start from page 59, to show how the while loop works and breaks when there is no more page to scrape. Simply set URL to your default value to iterate over all pages.


import requests
from bs4 import BeautifulSoup

URL = 'https://vinilosalvaro.cl/tienda/page/59/'
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.3'}

data = []

while True:
    page = requests.get(URL, headers=headers)
    soup = BeautifulSoup(page.content, 'html.parser')

    for listado_productos in soup.select('ul.products li'):
        data.append({
            'titulos': listado_productos.h2.get_text(strip=True),
            'precios': listado_productos.span.get_text(strip=True)
        })

    if soup.select_one('a.next'):
        URL = soup.select_one('a.next').get('href')
    else:
        break

data
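The "transform into a dataframe" step mentioned above is then a one-liner. A sketch assuming pandas is installed; the rows below are hypothetical sample data in the same list-of-dicts shape, not real scrape results:

```python
import pandas as pd

# Hypothetical rows in the same list-of-dicts shape as `data` above
data = [
    {'titulos': 'Album A', 'precios': '$10.000'},
    {'titulos': 'Album B', 'precios': '$12.000'},
]

df = pd.DataFrame(data)
print(df.shape)  # (2, 2)
```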


Instead of find_all you can use find when searching for one specific ul tag. Go ahead and change line 10 to:


listado_productos = soup.find('ul', class_='products columns-3')

Also, to get the li children, you should use find_all('li'), so change line 12 to:


for listado_productos in listado_productos.find_all('li'):
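Putting both changes together, the corrected loop looks like this. It is run here against a made-up inline HTML snippet instead of the live site, so the album names and prices are hypothetical:

```python
from bs4 import BeautifulSoup

# Made-up HTML in the shape of the shop's product list
html = """
<ul class="products columns-3">
  <li><h2 class="woocommerce-loop-product__title">Album A</h2>
      <span class="woocommerce-Price-amount amount">$10.000</span></li>
  <li><h2 class="woocommerce-loop-product__title">Album B</h2>
      <span class="woocommerce-Price-amount amount">$12.000</span></li>
</ul>
"""
soup = BeautifulSoup(html, 'html.parser')

# find returns the single <ul>; find_all('li') then yields every product
listado_productos = soup.find('ul', class_='products columns-3')
results = []
for producto in listado_productos.find_all('li'):
    titulo = producto.find('h2', class_='woocommerce-loop-product__title').get_text(strip=True)
    precio = producto.find('span', class_='woocommerce-Price-amount amount').get_text(strip=True)
    results.append((titulo, precio))

print(results)  # [('Album A', '$10.000'), ('Album B', '$12.000')]
```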

