
python - beautifulsoup4 python: processing and parsing data

Reposted. Author: 行者123. Updated: 2023-11-28 21:34:19

import requests
from bs4 import BeautifulSoup as bs

with requests.Session() as s:
    auth_return = s.get('https://urproject.com/?page=com_auth_return')
    soup = bs(auth_return.text, 'html.parser')

What I get back looks like this:

<script type="text/javascript">
document.location = 'https://urproject.com/admin/php/user_id_check.php?EncData=abcdefg1234&EncKey=hijk9876';
</script>

From this, I want to extract EncData and EncKey:

EncData = soup.find_all("EncData")
EncKey = soup.find_all("EncKey")

encdatanenckey = {'EncData':EncData,
'EncKey':EncKey}

print(encdatanenckey)

The result should be:

{'EncData': 'abcdefg1234', 'EncKey': 'hijk9876'}

How can I get this? Do I have to use a regular expression? I'm a complete beginner with regex, so could you give me some examples?

Best answer

First use bs4 to extract the contents of the script tags, then match the specific values with a regular expression:

from bs4 import BeautifulSoup
import re

html = """
<script type="text/javascript" ...></script>
<script type="text/javascript">
document.location = 'https://urproject.com/admin/php/user_id_check.php?EncData=abcdefg1234&EncKey=hijk9876';
</script>
"""
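soup = BeautifulSoup(html, 'lxml')  # the 'lxml' parser requires the lxml package to be installed
# Keep only <script> tags with inline content (string= replaces the deprecated text= argument)
js_ = soup.find_all("script", string=True)
# Lookbehind for "<name>=", then lazily match up to the next "&" or quote
regex = r"(?<={}\=).*?(?=&|\'|\")"
EncData = [re.search(regex.format("EncData"), url.text).group(0) for url in js_]
EncKey = [re.search(regex.format("EncKey"), url.text).group(0) for url in js_]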

encdatanenckey = {'EncData':EncData,
'EncKey':EncKey}

print(encdatanenckey)
# {'EncData': ['abcdefg1234'], 'EncKey': ['hijk9876']}
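
If the goal is the flat dictionary shown in the question rather than lists, one option is to pull the redirect URL out of the script text and let the standard library parse its query string. The following is only a sketch under some assumptions: the helper name `redirect_params` is made up for illustration, the script is assumed to assign a quoted URL to `document.location`, and each parameter is assumed to appear once.

```python
import re
from urllib.parse import urlparse, parse_qs

# Script text as it would come out of a <script> tag (for example tag.string in bs4)
script_text = """
document.location = 'https://urproject.com/admin/php/user_id_check.php?EncData=abcdefg1234&EncKey=hijk9876';
"""

def redirect_params(js_code):
    """Return the query parameters of a document.location redirect, or {} if none is found.

    Hypothetical helper, not part of bs4 or requests.
    """
    # Capture whatever sits between the quotes after "document.location ="
    match = re.search(r"document\.location\s*=\s*['\"]([^'\"]+)['\"]", js_code)
    if match is None:
        return {}
    query = parse_qs(urlparse(match.group(1)).query)
    # parse_qs maps every key to a list of values; keep the first one
    return {key: values[0] for key, values in query.items()}

params = redirect_params(script_text)
print(params)
# {'EncData': 'abcdefg1234', 'EncKey': 'hijk9876'}
```

This could be called on the text of each tag in `js_` from the answer above. Because the helper returns an empty dictionary when nothing matches, a script tag without a redirect does not raise an `AttributeError` the way calling `.group(0)` on a failed `re.search` would.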

This post is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/53474728/
