
python - How to get each value in <li> using <span> tags with BeautifulSoup
    Repost. Author: 行者123. Updated: 2023-12-03 19:36:36

    I have an HTML document like the one below, and self.soup is the BeautifulSoup object. I am trying to scrape the data inside the list elements. The list looks like this:

    <ul class="list-group">
      <li class="list-group-item">
        <span class="strong">Name</span>
        <span class="pull-right">Piter</span>
      </li>
      <li class="list-group-item">
        <span class="strong">Year</span>
        <span class="pull-right">2017</span>
      </li>
    </ul>

    Python file scrape.py:

    # person is a dict that collects the scraped values
    # need maps a label on the page to the key wanted in person
    need = {
        'Name': 'name',
        'Year': 'year',
    }

    First attempt

    specs = self.soup.select("ul.list-group li.list-group-item")
    if len(specs) > 0 :
        for data in specs :
            text = data.get_text()
            if need.has_key( data[0].strip()) :
                if need[ data[0].strip() ] not in person or person[ need[ data[0].strip() ] ] == '':
                    person[ need[ text[0].strip() ] ] = text[1].strip()

    First error

    File "scraper.py", line 68, in scrape
      if need.has_key( data[0].strip()) :
    File "build/bdist.linux-x86_64/egg/bs4/element.py", line 1011, in __getitem__
    KeyError: 0
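
The `KeyError: 0` above comes from `data[0]`: square brackets on a bs4 `Tag` look up HTML attributes, not child nodes, and the `<li>` has no attribute called `0`. The sketch below reproduces that on a shortened copy of the question's markup, assuming the `html.parser` backend; the variable names are illustrative only.

```python
from bs4 import BeautifulSoup

html = """<ul class="list-group">
<li class="list-group-item">
<span class="strong">Name</span>
<span class="pull-right">Piter</span>
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")
li = soup.select("ul.list-group li.list-group-item")[0]

# Square brackets on a Tag read HTML attributes.
css_classes = li["class"]

# There is no attribute named 0, so this fails the same way as the traceback.
try:
    li[0]
    raised_key_error = False
except KeyError:
    raised_key_error = True

# Child elements have to be requested explicitly.
spans = li.find_all("span")
label = spans[0].get_text(strip=True)
value = spans[1].get_text(strip=True)
```

Note also that `text = data.get_text()` is a single string, so `text[0]` and `text[1]` would be individual characters rather than the label and the value.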

    Second attempt

    specs = self.soup.select("ul.list-group li.list-group-item")
    if len(specs) > 0 :
        for data in specs :
            text = [ data.contents[0].get_text(), data.contents[1].get_text() ]
            if need.has_key( data[0].strip()) :
                if need[ data[0].strip() ] not in person or person[ need[ data[0].strip() ] ] == '':
                    person[ need[ text[0].strip() ] ] = text[1].strip()

    Second error

      File "site_scrapers/v12software.scraper.py", line 66, in scrape
    text = [ data.contents[0].get_text(), data.contents[1].get_text() ]
    File "build/bdist.linux-x86_64/egg/bs4/element.py", line 737, in __getattr__
    AttributeError: 'NavigableString' object has no attribute 'get_text'
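
The `AttributeError` above is consistent with `.contents` including the whitespace between tags: the first child of each `<li>` is the newline that follows the opening tag, which is a `NavigableString`, not a `<span>`. The sketch below checks only the node type, since whether `NavigableString` offers `get_text` may depend on the bs4 release, and then reads the two texts with `stripped_strings`, which skips whitespace-only nodes. It assumes the `html.parser` backend.

```python
from bs4 import BeautifulSoup, NavigableString

html = """<ul class="list-group">
<li class="list-group-item">
<span class="strong">Name</span>
<span class="pull-right">Piter</span>
</li>
</ul>"""

soup = BeautifulSoup(html, "html.parser")
li = soup.find("li", class_="list-group-item")

# The first child is the newline after <li ...>, not the first <span>.
first_child = li.contents[0]
first_is_string = isinstance(first_child, NavigableString)

# stripped_strings yields only the non-empty texts, with whitespace trimmed.
pair = list(li.stripped_strings)
```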

    I am trying to get the strings from the elements above into the person array.

    I need a result like this:

    print person['Name']
    # output: Piter
    print person['Year']
    # output: 2017

    Best answer

    from bs4 import BeautifulSoup

    html = """<ul class="list-group">
    <li class="list-group-item">
    <span class="strong">Name</span>
    <span class="pull-right">Piter</span>
    </li>
    <li class="list-group-item">
    <span class="strong">Year</span>
    <span class="pull-right">2017</span>
    </li>
    </ul>"""

    soup = BeautifulSoup(html, 'html.parser')

    need = {}

    # Walk each <ul class="list-group">, then each <li> inside it.
    for ul_tag in soup.find_all('ul', {'class':'list-group'}):
        for li_tag in ul_tag.find_all('li', {'class':'list-group-item'}):
            # The "strong" span holds the label, the "pull-right" span the value.
            field = li_tag.find('span', {'class':'strong'}).text
            value = li_tag.find('span', {'class':'pull-right'}).text
            need[field] = value

    print(need)
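
The answer collects every label/value pair into one dict. To keep the question's `need` mapping and fill `person` only where a value is missing or empty, one possible adaptation is sketched below. `fill_person` is a hypothetical helper name, not part of the original answer, and the sketch assumes Python 3 and the `html.parser` backend.

```python
from bs4 import BeautifulSoup

html = """<ul class="list-group">
<li class="list-group-item">
<span class="strong">Name</span>
<span class="pull-right">Piter</span>
</li>
<li class="list-group-item">
<span class="strong">Year</span>
<span class="pull-right">2017</span>
</li>
</ul>"""

# Label shown on the page -> key wanted in person (taken from the question).
need = {"Name": "name", "Year": "year"}


def fill_person(soup, need, person):
    """Copy wanted label/value pairs into person, keeping existing non-empty values."""
    for li in soup.select("ul.list-group li.list-group-item"):
        label_tag = li.find("span", class_="strong")
        value_tag = li.find("span", class_="pull-right")
        if label_tag is None or value_tag is None:
            continue  # skip list items that do not have both spans
        key = need.get(label_tag.get_text(strip=True))
        if key is None:
            continue  # label is not one we were asked to collect
        if not person.get(key):  # missing or empty string
            person[key] = value_tag.get_text(strip=True)
    return person


person = fill_person(BeautifulSoup(html, "html.parser"), need, {"year": ""})
print(person)
```

The result is keyed by the lowercase names from `need` (`person['name']`, `person['year']`), whereas the question prints `person['Name']`; change the mapping values if capitalised keys are wanted.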

    Regarding "python - How to get each value in <li> using <span> tags with BeautifulSoup", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/44816149/
