
python - How to extract data from CDATA with Python and BeautifulSoup?

Reposted. Author: 行者123. Updated: 2023-12-01 01:48:34

I want to extract the post_id from the CDATA section below:

<script type='text/javascript' data-cfasync='false'>
//<![CDATA[
_SHR_SETTINGS = {"endpoints":{"local_recs_url":"https:\/\/klaudynahebda.pl\/wp-admin\/admin-ajax.php?action=shareaholic_permalink_related"},"url_components":{"year":"2018","monthnum":"06","day":"19","post_id":"21132","postname":"letnie-warsztaty-ziolowo-kosmetyczne-7-9lipiec","author":"admin"}};
//]]>
</script>

I can get the whole CDATA block, but I don't know what to do next.

Best answer

If you only need the post_id, try a regular expression.

For example:

import re

# Raw string so the "\/" sequences in the page source are kept verbatim
# (and do not trigger invalid-escape warnings).
s = r"""<script type='text/javascript' data-cfasync='false'>
//<![CDATA[
_SHR_SETTINGS = {"endpoints":{"local_recs_url":"https:\/\/klaudynahebda.pl\/wp-admin\/admin-ajax.php?action=shareaholic_permalink_related"},"url_components":{"year":"2018","monthnum":"06","day":"19","post_id":"21132","postname":"letnie-warsztaty-ziolowo-kosmetyczne-7-9lipiec","author":"admin"}};
//]]>
</script>"""

# Capture everything between the quotes that follow "post_id":
m = re.search(r'"post_id":"(?P<post_id>[^"]*)"', s)
if m:
    print(m.group('post_id'))

Output:

21132
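
If more than one field is needed, the object assigned to _SHR_SETTINGS can also be parsed as JSON instead of matched field by field. The following is a minimal sketch, not part of the original answer. It assumes the script text has already been extracted (for example with BeautifulSoup), that the assignment ends with "};" as in the question, and that the object is valid JSON. The names cdata_text and extract_settings are hypothetical.

```python
import json
import re

# Assumed input: the text content of the <script> element, already extracted.
cdata_text = r"""
//<![CDATA[
_SHR_SETTINGS = {"endpoints":{"local_recs_url":"https:\/\/klaudynahebda.pl\/wp-admin\/admin-ajax.php?action=shareaholic_permalink_related"},"url_components":{"year":"2018","monthnum":"06","day":"19","post_id":"21132","postname":"letnie-warsztaty-ziolowo-kosmetyczne-7-9lipiec","author":"admin"}};
//]]>
"""


def extract_settings(text):
    """Return the object assigned to _SHR_SETTINGS as a dict, or None."""
    # Greedy match from the first "{" to the last "}" that precedes a ";".
    m = re.search(r'_SHR_SETTINGS\s*=\s*(\{.*\})\s*;', text, re.DOTALL)
    if m is None:
        return None
    # json.loads also decodes the escaped "\/" sequences into plain "/".
    return json.loads(m.group(1))


settings = extract_settings(cdata_text)
post_id = settings["url_components"]["post_id"] if settings else None
print(post_id)
```

With the data parsed into a dict, the other values (year, postname, author) are available through the same lookup, which is less fragile than writing one pattern per field.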

This post is based on a similar question found on Stack Overflow, "python - How to extract data from CDATA with Python and BeautifulSoup?": https://stackoverflow.com/questions/50974695/
