javascript - Parsing a script tag after BeautifulSoup in Python

Reposted. Author: 行者123. Last updated: 2023-12-05 05:14:53

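Hello everyone. While parsing a page, I ran into a problem extracting data from a script tag. The tag's contents are not a plain JSON object but JavaScript code. Using a web driver returned nothing either. Has anyone run into this? I would appreciate your help.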

Code sample:

<script>window.ShopifyAnalytics = window.ShopifyAnalytics || {};
window.ShopifyAnalytics.meta = window.ShopifyAnalytics.meta || {};
window.ShopifyAnalytics.meta.currency = 'AUD';
var meta = {"product":{"id":8993669708,"vendor":"Womanizer","type":"Vibrators","variants":[{"id":31066737740,"price":14999,"name":"Womanizer - Black","public_title":"Black","sku":"172145678"},{"id":31066737804,"price":14999,"name":"Womanizer - Purple","public_title":"Purple","sku":"172146924"},{"id":31066737868,"price":14999,"name":"Womanizer - Pink","public_title":"Pink","sku":"172150324"},{"id":31066737996,"price":14999,"name":"Womanizer - Tattoo","public_title":"Tattoo","sku":"172205168"},{"id":1509908217881,"price":14999,"name":"Womanizer - Blue","public_title":"Blue","sku":"1725205076"}]},"page":{"pageType":"product","resourceType":"product","resourceId":8993669708}};
for (var attr in meta) {
window.ShopifyAnalytics.meta[attr] = meta[attr];
}</script>

Best answer

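Use a regular expression to capture the object assigned to var meta, then parse the captured text with json.loads.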

Demo:

from bs4 import BeautifulSoup
import json
import re


s = """<script>window.ShopifyAnalytics = window.ShopifyAnalytics || {};
window.ShopifyAnalytics.meta = window.ShopifyAnalytics.meta || {};
window.ShopifyAnalytics.meta.currency = 'AUD';
var meta = {"product":{"id":8993669708,"vendor":"Womanizer","type":"Vibrators","variants":[{"id":31066737740,"price":14999,"name":"Womanizer - Black","public_title":"Black","sku":"172145678"},{"id":31066737804,"price":14999,"name":"Womanizer - Purple","public_title":"Purple","sku":"172146924"},{"id":31066737868,"price":14999,"name":"Womanizer - Pink","public_title":"Pink","sku":"172150324"},{"id":31066737996,"price":14999,"name":"Womanizer - Tattoo","public_title":"Tattoo","sku":"172205168"},{"id":1509908217881,"price":14999,"name":"Womanizer - Blue","public_title":"Blue","sku":"1725205076"}]},"page":{"pageType":"product","resourceType":"product","resourceId":8993669708}};
for (var attr in meta) {
window.ShopifyAnalytics.meta[attr] = meta[attr];
}</script>"""

soup = BeautifulSoup(s, "html.parser")
scr = soup.find("script")
# Capture the object literal assigned to "var meta"; the lazy match stops at the first ";".
m = re.search(r"var meta = (.*?);", scr.string)
if m:
    data = json.loads(m.group(1))
    for sku in data["product"]["variants"]:
        print(sku["sku"])

Output:

172145678
172146924
172150324
172205168
1725205076
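
The lazy pattern in the demo stops at the first semicolon, so it would cut the JSON short if any string value contained a ";". It also only looks at the first script tag on the page. The following is a minimal sketch of one possible way around both limits, not part of the original answer: it scans every script tag and lets json.JSONDecoder.raw_decode find where the object ends. The helper name extract_js_object, its var_name parameter, and the sample HTML are hypothetical and used for illustration only.

```python
import json
import re

from bs4 import BeautifulSoup


def extract_js_object(html, var_name="meta"):
    """Return the JSON-compatible value assigned to `var <var_name>`, or None.

    Hypothetical helper: assumes the assigned value is valid JSON.
    """
    soup = BeautifulSoup(html, "html.parser")
    # Match "var meta = " and consume the whitespace before the value,
    # because raw_decode does not skip leading whitespace itself.
    pattern = re.compile(r"\bvar\s+" + re.escape(var_name) + r"\s*=\s*")
    decoder = json.JSONDecoder()
    for script in soup.find_all("script"):
        text = script.string or ""
        m = pattern.search(text)
        if not m:
            continue
        try:
            # raw_decode parses one JSON value starting at the given index
            # and ignores whatever JavaScript follows it.
            obj, _end = decoder.raw_decode(text, m.end())
        except json.JSONDecodeError:
            continue
        return obj
    return None


# Made-up sample: the first sku contains a semicolon on purpose.
html = """<script>var other = 1;</script>
<script>var meta = {"product":{"variants":[{"sku":"A;1"},{"sku":"B2"}]}};
for (var attr in meta) {}</script>"""

data = extract_js_object(html)
skus = [v["sku"] for v in data["product"]["variants"]]
print(skus)
```

This still assumes the assigned value is valid JSON. A JavaScript object literal with single-quoted strings or unquoted keys would need a different parser.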

Regarding "javascript - Parsing a script tag after BeautifulSoup in Python", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/52166973/