
python - How to get the label and ID information from the HTML below using Python Beautiful Soup

Reposted. Author: 太空宇宙. Updated: 2023-11-03 14:28:41

How can I extract the ID and label (10870, 7th Phase JP Nagar) from the HTML below?

<input id="filter_data" type="hidden" value="{&quot;Locality&quot;
:{&quot;Top_Results_Array&quot;
:{&quot;0&quot;
:{&quot;ID&quot;:&quot;10870&quot;,&quot;LABEL&quot;:&quot;7th Phase JP Nagar&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:202.0},&quot;1&quot;
:{&quot;ID&quot;:&quot;2259&quot;,&quot;LABEL&quot;:&quot;Electronic City&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:126.0},&quot;2&quot;
:{&quot;ID&quot;:&quot;2265&quot;,&quot;LABEL&quot;:&quot;Koramangala&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:118.0},&quot;3&quot;
:{&quot;ID&quot;:&quot;11646&quot;,&quot;LABEL&quot;:&quot;BTM 2nd Stage&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:118.0}},&quot;More_Locality_Array&quot;
:{&quot;0&quot;
:{&quot;ID&quot;:&quot;2277&quot;,&quot;LABEL&quot;:&quot;Bellandur&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:102.0},&quot;1&quot;
:{&quot;ID&quot;:&quot;5467&quot;,&quot;LABEL&quot;:&quot;Hulimavu&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:95.0},&quot;2&quot;
:{&quot;ID&quot;:&quot;2261&quot;,&quot;LABEL&quot;:&quot;HSR Layout&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:94.0},&quot;3&quot;
:{&quot;ID&quot;:&quot;2293&quot;,&quot;LABEL&quot;:&quot;Jigani&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:91.0},&quot;4&quot;
:{&quot;ID&quot;:&quot;2249&quot;,&quot;LABEL&quot;:&quot;Bannerghatta Road&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:83.0},&quot;5&quot;
:{&quot;ID&quot;:&quot;2264&quot;,&quot;LABEL&quot;:&quot;Kanakpura Road&quot;,&quot;SELECTED&quot;:&quot;&quot;,&quot;COUNT&quot;:83.0},&quot;6&quot;:

I tried the following Python code, but it only returns the whole value of the input (id=filter_data):

for loc in soup.find_all('input', id='filter_data'):
    print(loc.get('value'))

My output is below:

{"Locality":{"Top_Results_Array":{
"0":{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0}
,"1":{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0}
,"2":{"ID":"2265","LABEL":"Koramangala","SELECTED":"","COUNT":118.0}
,"3":{"ID":"11646","LABEL":"BTM 2nd Stage","SELECTED":"","COUNT":118.0}}
,"More_Locality_Array":{"0":{
"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0}
,"1":{"ID":"5467","LABEL":"Hulimavu","SELECTED":"","COUNT":95.0}
,"2":{"ID":"2261","LABEL":"HSR Layout","SELECTED":"","COUNT":94.0}
,"3":{"ID":"2293","LABEL":"Jigani","SELECTED":"","COUNT":91.0}
,"4":{"ID":"2249","LABEL":"Bannerghatta Road","SELECTED":"","COUNT":83.0}
,"5":{"ID":"2264","LABEL":"Kanakpura Road","SELECTED":"","COUNT":83.0}

But I need the following output:

10870 7th Phase JP Nagar

2259 Electronic City

2265 Koramangala

11646 BTM 2nd Stage

2277 Bellandur

5467 Hulimavu

2261 HSR Layout

...

Can you help me solve this?

Best answer

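One approach I would suggest is to parse the value as JSON and then extract the information you need. The value you are printing is a single JSON string, which is why it comes out as one blob. Once you have that string, try the code below. After parsing, the data is available as ordinary Python dicts and lists, so you can pick out whichever values you want.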

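import json

# `soup` is the BeautifulSoup object from the question
exp = soup.find_all('input', attrs={"id": "filter_data"})
abc = exp[0].get('value')  # len(exp) == 1
# In Python 3 the attribute is already a str, so no .decode('utf-8') is needed
result = json.loads(abc)   # parse the JSON text into nested dicts
result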

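If you want to see the part of the result that holds the localities, check

print(result["Locality"])  # look the key up by name; dict.values() cannot be indexed in Python 3

Look through the dictionary and decide what you want to get.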

dict(result)

Experiment with json and you will get what you want. I hope this helps.
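The answer stops at the parsed dictionary, so here is a minimal sketch of the remaining step: walking the nested dicts and printing each ID with its label. It assumes every group under "Locality" has the same shape as in the question ("0", "1", ... mapping to entries with "ID" and "LABEL"). The helper name `iter_id_labels` and the shortened `sample_value` string are made up for illustration; the data pasted in the question is truncated, so the sample below only reuses a few of its entries.

```python
import json


def iter_id_labels(value):
    """Yield (ID, LABEL) pairs from the JSON text held in the input's value attribute."""
    data = json.loads(value)
    # "Locality" maps group names (e.g. "Top_Results_Array") to dicts keyed "0", "1", ...
    for group in data["Locality"].values():
        # Sort the string keys numerically so that "10" comes after "9"
        for key in sorted(group, key=int):
            entry = group[key]
            yield entry["ID"], entry["LABEL"]


# Hypothetical shortened sample of the attribute value shown in the question
sample_value = (
    '{"Locality":{"Top_Results_Array":{'
    '"0":{"ID":"10870","LABEL":"7th Phase JP Nagar","SELECTED":"","COUNT":202.0},'
    '"1":{"ID":"2259","LABEL":"Electronic City","SELECTED":"","COUNT":126.0}},'
    '"More_Locality_Array":{'
    '"0":{"ID":"2277","LABEL":"Bellandur","SELECTED":"","COUNT":102.0}}}}'
)

for loc_id, label in iter_id_labels(sample_value):
    print(loc_id, label)
```

With the real page, the string passed in would be the attribute value already obtained in the question, for example `soup.find('input', id='filter_data').get('value')`.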

Regarding "python - How to get the label and ID information from the HTML below using Python Beautiful Soup", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/47457836/
