python - Parser returns "\n" instead of the desired output

Reposted. Author: 太空宇宙. Updated: 2023-11-03 18:23:20

I have to parse the names of the Adelaide Crows players from this link, and for that I wrote a parser like this:

import requests
from bs4 import BeautifulSoup

href_val = requests.get("http://www.afl.com.au/news/teams?round=9")
soup1 = BeautifulSoup(href_val.content)

players_info_adel = soup1.find_all("ul", {"class" : "team1 team-adel"})
for li in players_info_adel:
    player_names_adel = li.find_all("li", {"class" : "player"})
    #print player_names_adel

#print player_names_adel

for span in player_names_adel:
    if span.find(text = True):
        text = ''.join(span.find(text = True))
        text1 = text.encode('ascii')
        print text

But whenever I run this code, it always prints a bunch of "\n" instead of the names. What should I do to get the players' names?

Best answer

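You don't want to search each player <li> element for its first text node with find(text=True); the first node in each one is a text node containing only a newline. It is better to use Tag.get_text() to get all the text from an element.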
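
To make the difference concrete, here is a minimal Python 3 sketch. The markup in SAMPLE_HTML is an assumption about what the page's player list roughly looks like, not a copy of it, and string=True is the current spelling of the text=True argument used in the question (renamed in Beautiful Soup 4.4.0).

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the page's player list.
SAMPLE_HTML = """<ul class="team1 team-adel">
<li class="player">
<span>24</span> Sam Jacobs
</li>
</ul>"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
player = soup.find("li", {"class": "player"})

# find(string=True) returns only the first text node inside the <li>,
# which here is the newline between the opening <li> tag and <span>.
first_text = player.find(string=True)

# get_text() concatenates every text node below the element.
full_text = player.get_text().strip()

print(repr(first_text))
print(full_text)
```

With this sample, first_text is the lone newline and full_text is "24 Sam Jacobs", which matches the behaviour described in the question.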

Use a CSS selector to simplify the code:

for player in soup1.select('ul.team1 li.player'):
    text = player.get_text().strip()
    print text
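
As a rough illustration of what that selector matches, the sketch below runs it against inline markup with two teams. The second list, its team-coll class, and the opponent's name are made up for the example; only the team1 and player class names come from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two team lists, only the first carries "team1".
SAMPLE_HTML = """
<ul class="team1 team-adel">
<li class="player"><span>24</span> Sam Jacobs</li>
<li class="player"><span>32</span> Patrick Dangerfield</li>
</ul>
<ul class="team2 team-coll">
<li class="player"><span>1</span> Example Opponent</li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "ul.team1" matches any <ul> whose class list contains "team1", even when
# other classes are present; the space selects descendants of that <ul>.
players = [li.get_text().strip() for li in soup.select("ul.team1 li.player")]

print(players)
```

Because a class selector only needs one of the element's classes to match, there is no need to spell out "team1 team-adel" in full as the find_all() call in the question does.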

This includes the player number; to separate the number from the player's name, use the following inside the loop instead:

number, name = player.span.get_text().strip(), player.span.next_sibling.strip()

Demo:

>>> import requests
>>> from bs4 import BeautifulSoup
>>> href_val = requests.get("http://www.afl.com.au/news/teams?round=9")
>>> soup1 = BeautifulSoup(href_val.content)
>>> for player in soup1.select('ul.team1 li.player'):
...     text = player.get_text().strip()
...     print text
...
24 Sam Jacobs
32 Patrick Dangerfield
26 Richard Douglas
41 Kyle Hartigan
25 Ben Rutten
16 Luke Brown
33 Brodie Smith
# .. etc ..
>>> for player in soup1.select('ul.team1 li.player'):
...     number, name = player.span.get_text().strip(), player.span.next_sibling.strip()
...     print name
...
Sam Jacobs
Patrick Dangerfield
Richard Douglas
Kyle Hartigan
Ben Rutten
Luke Brown
Brodie Smith
# ... etc ...
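
The session above depends on the live page and uses the Python 2 print statement. For readers on Python 3, the same number and name split can be sketched against inline markup; SAMPLE_HTML and the split_players helper are hypothetical names introduced here, and the markup is only assumed to resemble the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, assumed to resemble the page's player list.
SAMPLE_HTML = """<ul class="team1 team-adel">
<li class="player">
<span>24</span> Sam Jacobs
</li>
<li class="player">
<span>32</span> Patrick Dangerfield
</li>
</ul>"""


def split_players(html):
    """Return a list of (number, name) string pairs for the team1 players."""
    soup = BeautifulSoup(html, "html.parser")
    pairs = []
    for player in soup.select("ul.team1 li.player"):
        # The <span> holds the jumper number.
        number = player.span.get_text().strip()
        # next_sibling is the text node right after </span>: the name.
        name = player.span.next_sibling.strip()
        pairs.append((number, name))
    return pairs


pairs = split_players(SAMPLE_HTML)
for number, name in pairs:
    print(number, name)
```

This sketch assumes every player <li> contains a <span> followed directly by a text node; if the real markup differs, player.span or next_sibling may be None and would need a guard.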

Regarding "python - Parser returns "\n" instead of the desired output", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/23675950/
