
python - How do I parse NHL Team Defense statistics with Python to create a Pandas DataFrame?

Reposted. Author: 行者123. Updated: 2023-12-01 07:01:37

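I have scraped the data but need help parsing it correctly. I am still learning and would appreciate any advice.

I am looking for the data in two columns: TEAM and SA/G.

Here is my code so far: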


# import modules
from selenium import webdriver
from bs4 import BeautifulSoup

# set path for driver (raw string so the backslashes are kept literally)
driver = webdriver.Chrome(r'C:\webdrivers\chromedriver.exe')

# open page
driver.get('http://www.espn.com/nhl/statistics/team/_/stat/scoring/sort/avgGoals')

# parse the rendered page source
soup = BeautifulSoup(driver.page_source, 'lxml')

# close driver
driver.close()

# grab table data
table = soup.find(class_='tablehead')

# parse data (extra data included)
for t in table:
    td_tags = table.find_all('td')
    # print(td_tags)
    for td in td_tags:
        a_tags = table.find('a')
        print(td.text)

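I have scraped the right data, but it comes with extra information that I could use help filtering out. Any suggestions on how to get just the TEAM and SA/G data?

Here is an example of the Pandas DataFrame output I am looking for: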

Team          SA/G
Nashville     30.1
Colorado      33.6
Washington    31.0

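Thanks in advance for any help!

Code update:

The first attempt gets only the team information and includes extra data (for example, "GP").

First attempt at fixing the code: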

# parse data (closer to desired output but missing SA/G data)
for tab in table:
    tr = table.find_all('tr')
    for t in tr:
        td = table.find_all('td')
        print(t.a.text)
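One possible way to get both columns from the soup is to walk the table row by row and pick cells by position, skipping the repeated header rows. The sketch below is only an illustration: `SAMPLE_HTML` is a hypothetical, cut-down stand-in for the ESPN table (values taken from this thread), and `team_sa_per_game`, `team_col`, and `sa_col` are made-up names. In the sample, SA/G sits at index 3; on the full table shown in this thread it appears at index 9, assuming the page layout has not changed.

```python
import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical miniature of the ESPN table (assumption: the real table has more columns).
SAMPLE_HTML = """
<table class="tablehead">
  <tr><td>RK</td><td>TEAM</td><td>GP</td><td>SA/G</td></tr>
  <tr><td>1</td><td><a href="#">Nashville</a></td><td>11</td><td>30.1</td></tr>
  <tr><td>2</td><td><a href="#">Colorado</a></td><td>11</td><td>33.6</td></tr>
  <tr><td>RK</td><td>TEAM</td><td>GP</td><td>SA/G</td></tr>
  <tr><td>3</td><td><a href="#">Washington</a></td><td>13</td><td>31.0</td></tr>
</table>
"""


def team_sa_per_game(html, team_col=1, sa_col=3):
    """Return a DataFrame with Team and SA/G, one row per team."""
    soup = BeautifulSoup(html, "html.parser")
    table = soup.find(class_="tablehead")
    records = []
    for tr in table.find_all("tr"):
        # Collect the text of every cell in this row only.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) <= max(team_col, sa_col):
            continue  # row is too short to hold both columns
        if cells[team_col] == "TEAM":
            continue  # repeated header row
        records.append({"Team": cells[team_col], "SA/G": float(cells[sa_col])})
    return pd.DataFrame(records, columns=["Team", "SA/G"])


df = team_sa_per_game(SAMPLE_HTML)
print(df)
```

With the Selenium code above, the call would presumably look like `team_sa_per_game(driver.page_source, sa_col=9)`, made before the driver is closed.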

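The second attempt gets the team data and SA/G, but also includes extra data (for example, the header text "TEAM" and "SA/G" reappears every 11th row).

Here is the second attempt: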

#parses TEAM and SA/G
import pandas as pd
x = pd.read_html("http://www.espn.com/nhl/statistics/team/_/stat/scoring/sort/avgGoals")[0]

print(x[[1, 9]])

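Best answer

If you want to read a table from a URL, I would use the read_html method from Pandas. Under the hood, Pandas parses the web page for you (with lxml by default, falling back to bs4 with html5lib). You can see an example below: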

In [3]: import pandas as pd 
In [4]: pd.read_html("http://www.espn.com/nhl/statistics/team/_/stat/scoring/sort/avgGoals")[0]
Out[4]:
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14
0 RK TEAM GP G GA GF/G GA/G DIFF SF/G SA/G DIFF SVPCT PIM PIMA DIFF
1 1 Nashville 11 45 33 4.09 3.00 1.09 31.9 30.1 01.8 .900 87 109 -22
2 2 Colorado 11 44 30 4.00 2.73 1.27 31.4 33.6 -02.3 .919 102 140 -38
3 3 Washington 13 49 43 3.77 3.31 0.46 30.3 31.0 -00.7 .893 125 111 14
4 4 Vancouver 11 40 26 3.64 2.36 1.27 32.6 31.3 01.4 .924 103 119 -16
5 NaN Montreal 11 40 35 3.64 3.18 0.45 34.4 31.1 03.3 .898 77 83 -6
6 6 Toronto 13 46 44 3.54 3.38 0.15 32.7 32.8 -00.1 .897 88 82 6
7 7 Florida 12 42 45 3.50 3.75 -0.25 34.0 30.0 04.0 .875 78 86 -8
8 NaN Philadelphia 10 35 30 3.50 3.00 0.50 35.4 27.4 08.0 .891 78 90 -12
9 9 Buffalo 13 43 32 3.31 2.46 0.85 30.2 33.5 -03.2 .926 100 118 -18
10 10 Tampa Bay 10 33 32 3.30 3.20 0.10 31.4 34.5 -03.1 .907 100 88 12
11 RK TEAM GP G GA GF/G GA/G DIFF SF/G SA/G DIFF SVPCT PIM PIMA DIFF
12 11 Boston 11 36 23 3.27 2.09 1.18 33.3 31.5 01.7 .934 82 80 2
13 NaN Carolina 11 36 29 3.27 2.64 0.64 32.9 29.4 03.5 .910 97 87 10
14 13 Pittsburgh 12 39 30 3.25 2.50 0.75 31.9 29.8 02.1 .916 82 84 -2
15 14 NY Rangers 9 29 34 3.22 3.78 -0.56 28.2 36.9 -08.7 .898 90 82 8
16 15 St. Louis 12 37 38 3.08 3.17 -0.08 29.0 30.3 -01.3 .895 87 91 -4
17 16 Vegas 13 40 36 3.08 2.77 0.31 35.3 32.7 02.6 .915 143 143 0
18 17 Edmonton 12 36 32 3.00 2.67 0.33 27.9 30.6 -02.7 .913 80 74 6
19 NaN Arizona 11 33 24 3.00 2.18 0.82 31.5 29.8 01.6 .927 68 74 -6
20 NaN NY Islanders 11 33 27 3.00 2.45 0.55 27.6 31.5 -03.8 .922 95 67 28
21 20 Columbus 11 30 39 2.73 3.55 -0.82 33.6 31.1 02.5 .886 75 81 -6
22 RK TEAM GP G GA GF/G GA/G DIFF SF/G SA/G DIFF SVPCT PIM PIMA DIFF
23 21 Ottawa 11 29 36 2.64 3.27 -0.64 31.1 35.0 -03.9 .906 134 110 24
24 22 Calgary 13 34 39 2.62 3.00 -0.38 30.9 31.2 -00.3 .904 147 122 25
25 23 San Jose 12 31 43 2.58 3.58 -1.00 28.3 31.8 -03.4 .887 128 124 4
26 NaN Los Angeles 12 31 49 2.58 4.08 -1.50 37.3 28.3 08.9 .856 102 116 -14
27 25 Winnipeg 12 30 37 2.50 3.08 -0.58 33.2 33.3 -00.1 .907 52 88 -36
28 NaN Chicago 10 25 30 2.50 3.00 -0.50 31.6 32.9 -01.3 .909 66 68 -2
29 27 Anaheim 13 32 31 2.46 2.38 0.08 27.5 31.5 -04.0 .924 131 99 32
30 28 New Jersey 9 22 34 2.44 3.78 -1.33 29.3 29.0 00.3 .870 99 93 6
31 29 Minnesota 11 26 37 2.36 3.36 -1.00 29.5 30.4 -00.8 .889 87 93 -6
32 30 Detroit 12 27 45 2.25 3.75 -1.50 31.5 33.2 -01.7 .887 105 96 9
33 RK TEAM GP G GA GF/G GA/G DIFF SF/G SA/G DIFF SVPCT PIM PIMA DIFF
34 31 Dallas 13 25 35 1.92 2.69 -0.77 27.8 28.8 -01.1 .907 89 79 10
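To go from this raw frame to just TEAM and SA/G, the repeated header rows need to be dropped and the two columns selected by their integer labels (1 and 9 in the output above). The following is a minimal sketch under assumptions: `raw` is a small hand-built stand-in for `pd.read_html(url)[0]` using values from the output above, truncated to the first ten columns, and `tidy_team_stats` is a hypothetical helper name. If your pandas version already promotes the first row to column names, the selection step would need adjusting.

```python
import pandas as pd

# Stand-in for pd.read_html(url)[0]: integer column labels, header text inside the rows.
raw = pd.DataFrame(
    [
        ["RK", "TEAM", "GP", "G", "GA", "GF/G", "GA/G", "DIFF", "SF/G", "SA/G"],
        ["1", "Nashville", "11", "45", "33", "4.09", "3.00", "1.09", "31.9", "30.1"],
        ["2", "Colorado", "11", "44", "30", "4.00", "2.73", "1.27", "31.4", "33.6"],
        ["RK", "TEAM", "GP", "G", "GA", "GF/G", "GA/G", "DIFF", "SF/G", "SA/G"],
        ["3", "Washington", "13", "49", "43", "3.77", "3.31", "0.46", "30.3", "31.0"],
    ]
)


def tidy_team_stats(raw, team_col=1, sa_col=9):
    """Keep only team rows and the Team / SA/G columns, with SA/G as floats."""
    # Drop every row whose team cell is the header text "TEAM".
    subset = raw.loc[raw[team_col] != "TEAM", [team_col, sa_col]].copy()
    subset.columns = ["Team", "SA/G"]
    # The scraped values are strings, so convert SA/G to numbers.
    subset["SA/G"] = pd.to_numeric(subset["SA/G"])
    return subset.reset_index(drop=True)


tidy = tidy_team_stats(raw)
print(tidy)
```

Filtering on the header text rather than on "every 11th row" avoids depending on how many teams appear between the repeated headers.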

Regarding "python - How do I parse NHL Team Defense statistics with Python to create a Pandas DataFrame?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/58602261/
