
Python BeautifulSoup returns empty text for an element: the browser shows the text, but the page source shows nothing

Reposted. Author: 行者123. Updated: 2023-12-01 07:17:26

If I run this code

import requests
from bs4 import BeautifulSoup

page = requests.get("https://nutritiondata.self.com/facts/nut-and-seed-products/3071/1")
soup = BeautifulSoup(page.content, 'html.parser')

span = soup.find("span", id="NUTRIENT_0")
print(span.text)

nothing is printed.

The span tag contains text when viewed in Chrome, but it is empty in the HTML source.

How can I scrape this text?

Best answer

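The data (JSON-like) is embedded in the HTML itself, inside HTML comments; the code below finds it in a script tag, as the value assigned to foodNutrients. The browser fills the span from that data, which is why the span is empty in the raw source. The other issue is that the keys are not wrapped in double quotes, so json.loads cannot parse the text as is. I used a regular expression to add the double quotes. After that, just read it into a dictionary and you can get whatever data you want from it.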
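The key-quoting step can be tried in isolation before running it against the live page. The sketch below applies the same regular expression as the answer to a small made-up string; the sample values are assumptions modeled on the output shown in this answer, not data copied from the site, and the helper name quote_bare_keys is hypothetical.

```python
import json
import re


def quote_bare_keys(js_object_text):
    """Wrap unquoted object keys in double quotes (same regex as the answer)."""
    return re.sub(r'([{,:])(\w+)([},:])', r'\1"\2"\3', js_object_text)


# Made-up sample: a JavaScript object literal with bare keys and no whitespace.
sample = '{NUTRIENT_0:"565",FOODSERVING_WEIGHT_1:"1ounce(28g)"}'

# json.loads(sample) would raise json.JSONDecodeError because of the bare keys.
valid_json = quote_bare_keys(sample)
data = json.loads(valid_json)

print(valid_json)
print(data["NUTRIENT_0"])
```

This pattern is a pragmatic fix for simple, flat objects like the one here. It is not a general JavaScript parser, and a value that itself contains a word between a comma and a colon could be rewritten by mistake.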

Code:

import requests
from bs4 import BeautifulSoup
import json
import re

page = requests.get("https://nutritiondata.self.com/facts/nut-and-seed-products/3071/1")
soup = BeautifulSoup(page.content, 'html.parser')

scripts = soup.find_all("script")
for script in scripts:
    if 'foodNutrients = ' in script.text:
        # Cut the script text down to the object assigned to foodNutrients
        jsonStr = script.text
        jsonStr = jsonStr.split('foodNutrients =')[-1]
        jsonStr = jsonStr.rsplit('fillSpanValues')[0]
        jsonStr = jsonStr.rsplit(';',1)[0]
        # Strip all whitespace (this also removes spaces inside the values)
        jsonStr = "".join(jsonStr.split())

        # Wrap the bare keys in double quotes so json.loads accepts the text
        valid_json = re.sub(r'([{,:])(\w+)([},:])', r'\1"\2"\3', jsonStr)
        jsonObj = json.loads(valid_json)

        # These are in terms of 100 grams. I also calculated for per serving
        g_per_serv = int(jsonObj['FOODSERVING_WEIGHT_1'].split('(')[-1].split('g')[0])

        for k, v in jsonObj.items():
            if k == 'NUTRIENT_0':
                conv_v = (float(v)*g_per_serv)/100

                print('%s : %s (per 100 grams) | %s (per serving %s' % (k, round(float(v)), round(float(conv_v)), jsonObj['FOODSERVING_WEIGHT_1']))

Output:

NUTRIENT_0 : 565 (per 100 grams)   |   158 (per serving 1ounce(28g)
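The per-serving figure in that output comes from a simple conversion: the value per 100 grams is scaled by the serving weight in grams, which is parsed out of the serving label. A minimal sketch of that arithmetic, using the two values visible in the output above as hard-coded inputs (the function names are hypothetical, introduced only for illustration):

```python
def grams_per_serving(serving_label):
    """Extract the gram weight from a label such as '1ounce(28g)'."""
    # '1ounce(28g)' -> '28g)' -> '28'
    return int(serving_label.split('(')[-1].split('g')[0])


def per_serving(value_per_100g, grams):
    """Scale a per-100-gram value to a serving of the given weight."""
    return value_per_100g * grams / 100


grams = grams_per_serving("1ounce(28g)")
converted = per_serving(565.0, grams)

print(grams, round(converted))
```

The parsing assumes the label always ends with the gram weight in parentheses, as it does in the output shown here; a label in a different shape would need its own handling.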

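This question, "Python BeautifulSoup returns empty text for an element: the browser shows the text, but the page source shows nothing", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/57883356/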
