python - Soup-ify a GET request

Reposted by: 太空宇宙 | Updated: 2023-11-04 11:07:36

I am trying to soup-ify a GET request:

from bs4 import BeautifulSoup
import requests 
import pandas as pd

html_page = requests.get('"https://www.dataquest.io"')

soup = BeautifulSoup(html_page, "lxml")
soup.find_all('<\a>')

However, this only returns an empty list.

Best answer

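This pulls the table rows and assigns each row to a dictionary, which is then appended to a list. You may need to adjust the selectors slightly for your page.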

from bs4 import BeautifulSoup
import requests
from pprint import pprint

output_data = []  # List of dicts, one per table row

for i in range(1, 453):  # Paginate over pages 1..452
    # The query string is elided in the original; add the site's page parameter using i
    data_page = requests.get(f'https://www.dataquest.io?')
    print(data_page)  # Shows the response object, including its status code

    soup = BeautifulSoup(data_page.text, "lxml")

    # Find all of the table rows. select() returns an empty list when nothing
    # matches, so no try/except is needed around the second selector.
    elements = soup.select('div.head_table_t') + soup.select('div.list_table_subs')
    print(len(elements))

    # Map each row's columns to a dictionary keyed by the column header
    for element in elements:
        link = element.select_one('div.col_1 a')
        if link is None:  # Skip rows that lack the expected link
            continue
        data = {
            'Name': link.text.strip(),
            'Page URL': link.get('href'),
        }
        output_data.append(data)  # Append the row's dictionary to the list
        pprint(data)  # Optional: inspect each row as it is scraped
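
As for the empty list in the question itself, three things in the original snippet look problematic: the URL string contains an extra pair of quotes, the Response object is passed to BeautifulSoup instead of its text, and find_all is given '<\a>' rather than the tag name 'a'. The following is a minimal sketch of the corrected pattern, not the answerer's code. To keep it runnable offline, it parses a small made-up HTML string (an assumption standing in for requests.get("https://www.dataquest.io").text) and uses the built-in "html.parser" instead of "lxml".

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page body. In real use this would be:
#   html = requests.get("https://www.dataquest.io").text
html = """
<html><body>
  <a href="/courses">Courses</a>
  <p>No link here</p>
  <a href="/blog">Blog</a>
</body></html>
"""

# Pass markup text (not a Response object) to BeautifulSoup.
soup = BeautifulSoup(html, "html.parser")

# find_all expects a bare tag name such as "a", without angle brackets.
links = soup.find_all("a")

# Collect the href attribute of each link; .get returns None if it is missing.
hrefs = [link.get("href") for link in links]
print(hrefs)
```

With a live page, the same two calls apply once the response text is passed in; checking the response status before parsing is a reasonable extra step.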

Regarding "python - Soup-ify a GET request", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/59067700/
