python - NoneType object error when converting a scraped table to a DataFrame

Reposted. Author: 行者123. Last updated: 2023-12-01 08:07:41

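I am trying to scrape the list of stock ticker symbols shown in the table at the following link: http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A

I scraped the table with Beautiful Soup, but when I convert it to a pandas DataFrame I get this error: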

TypeError: 'NoneType' object is not callable

I tried the following code:

url = 'http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A'
res = requests.get(url)
soup = BeautifulSoup(res.content,'lxml')
table = soup.find("table",{"class":"market tab1"})
df = pd.read_html(table)

But it does not work. How can I fix this, and why am I getting the error?

Full error log:

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
796 try:
--> 797 tables = p.parse_tables()
798 except Exception as caught:

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in parse_tables(self)
212 def parse_tables(self):
--> 213 tables = self._parse_tables(self._build_doc(), self.match, self.attrs)
214 return (self._build_table(table) for table in tables)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _build_doc(self)
618 # try to parse the input in the simplest way
--> 619 r = parse(self.io, parser=parser)
620 try:

~/anaconda3/lib/python3.7/site-packages/lxml/html/__init__.py in parse(filename_or_url, parser, base_url, **kw)
939 parser = html_parser
--> 940 return etree.parse(filename_or_url, parser, base_url=base_url, **kw)
941

src/lxml/etree.pyx in lxml.etree.parse()

src/lxml/parser.pxi in lxml.etree._parseDocument()

TypeError: 'NoneType' object is not callable

During handling of the above exception, another exception occurred:

TypeError Traceback (most recent call last)
<ipython-input-23-c3e05c494f63> in <module>
5 table = soup.find("table",{"class":"market tab1"})
6 #print(table)
----> 7 df = pd.read_html(table)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding, decimal, converters, na_values, keep_default_na, displayed_only)
985 decimal=decimal, converters=converters, na_values=na_values,
986 keep_default_na=keep_default_na,
--> 987 displayed_only=displayed_only)

~/anaconda3/lib/python3.7/site-packages/pandas/io/html.py in _parse(flavor, io, match, attrs, encoding, displayed_only, **kwargs)
799 # if `io` is an io-like object, check if it's seekable
800 # and try to rewind it before trying the next parser
--> 801 if hasattr(io, 'seekable') and io.seekable():
802 io.seek(0)
803 elif hasattr(io, 'seekable') and not io.seekable():

TypeError: 'NoneType' object is not callable

The requested table:

<table cellpadding="0" cellspacing="1" class="market tab1" width="610">
<colgroup><col/><col/><col class="c"/></colgroup>
<tr><td class="tabh" colspan="3"><b>Companies listed on the NYSE</b></td></tr>
<tr><th>Equity</th><th>Symbol</th><th>Info</th></tr>
<tr class="ts0"><td align="left"><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/stock-price">A K Steel</a></td><td><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/stock-price">AKS</a></td><td><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/chart"><img src="/s/stock-chart.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/news"><img src="/s/stock-news.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/financials"><img src="/s/fundamentals.gif"/></a><a href="http://ih.advfn.com/stock-market/NYSE/a-k-steel-AKS/trades"><img src="/s/stock-trades.gif"/></a></td></tr>

Best answer

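You are passing a <class 'bs4.element.Tag'> element to pandas read_html, which expects a string, a path, or a file-like object. You need to convert the tag to a string first.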
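As for why the message is specifically "'NoneType' object is not callable": a Beautiful Soup Tag treats an unknown attribute name as a search for a child tag with that name and returns None when nothing matches. That appears to be what trips the `hasattr(io, 'seekable') and io.seekable()` line in the traceback. The sketch below reproduces this without pandas or a network request; the inline HTML and the variable names are made up for illustration, and `html.parser` is used only to avoid extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the scraped page (not the real advfn.com markup).
html = (
    "<table class='market tab1'>"
    "<tr><th>Equity</th><th>Symbol</th></tr>"
    "<tr><td>A K Steel</td><td>AKS</td></tr>"
    "</table>"
)
soup = BeautifulSoup(html, "html.parser")
table = soup.find("table", {"class": "market tab1"})

# Attribute access on a Tag falls back to table.find("seekable"),
# so hasattr() succeeds even though no such method exists.
looks_seekable = hasattr(table, "seekable")
seekable_attr = table.seekable  # no <seekable> child tag, so this is None

# Calling the attribute therefore means calling None.
try:
    table.seekable()
    message = ""
except TypeError as exc:
    message = str(exc)

print(looks_seekable, seekable_attr, message)
```

Converting the tag with str() gives read_html plain markup to parse instead of an object that only looks file-like, which is what the corrected code that follows does.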

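from bs4 import BeautifulSoup
import requests
import pandas as pd

url = 'http://www.advfn.com/nyse/newyorkstockexchange.asp?companies=A'
res = requests.get(url)
soup = BeautifulSoup(res.content, 'lxml')

# soup.find returns a bs4.element.Tag, not a string
table = soup.find("table", {"class": "market tab1"})

# read_html needs the markup as text, so convert the Tag with str().
# Note that read_html returns a list of DataFrames, one per table found.
df = pd.read_html(str(table))
print(df)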

Output:

[                                    0       1     2
0 Companies listed on the NYSE NaN NaN
1 Equity Symbol Info
2 A K Steel AKS NaN
3 A M R AMR NaN
4 A M R Cp 7.875 AAR NaN
5 A V X AVX NaN
6 A a R AIR NaN
7 A.h. Belo Corporation AHC NaN
8 Aaron Rents RNT.A NaN
9 Aaron Rents RNT NaN
10 Aarons Cl A AAN.A NaN
11 Aarons Inc. AAN NaN
12 Ab Svensk Cdss Arbmn CBJ NaN
13 Ab Svensk Ekport AXF NaN
14 Ab Svensk Ekportkrdt SQT NaN
15 Ab Svensk Ekportkred DVK NaN
16 Ab Svensk Ekportkred IWK NaN
17 Ab Svensk Ekportkred RCW NaN
18 Ab Svensk Ekportkred EOA NaN
19 Ab Svensk Msci Arn MIS NaN
20 Ab Svensk Russell REU NaN
21 Ab Svensk Sp Arns SAD NaN
22 Ab Svensk Sp Arns MHG NaN
23 Abb ABB NaN
24 Abbott Labs ABT NaN
25 Abercrombie & Fitch ANF NaN
26 Abitibi ABY NaN
27 Abm ABM NaN
28 Acadia AKR NaN
29 Acc Bear Amex Egy IMW NaN
.. ... ... ...
194 Ashland ASH NaN
195 Aspen Insurance AHL NaN
196 Assisted Living Concepts (nevada ALC NaN
197 Associated Estates AEC NaN
198 Assurant AIZ NaN
199 Assured Guaranty AGO NaN
200 Astoria AF NaN
201 Astrazeneca AZN NaN
202 Atlanta Gas Light ATG NaN
203 Atlas Pipeline APL NaN
204 Atlas Pipeline Holdings Lp AHD NaN
205 Atmos ATO NaN
206 Att T NaN
207 Att ATT NaN
208 Atwood Oceanics ATW NaN
209 Au Optronics AUO NaN
210 Autoliv ALV NaN
211 Autonation AN NaN
212 Autozone AZO NaN
213 Av Svensk Ekportkred NEH NaN
214 Avalonbay AVB NaN
215 Aventine Renew Enrgy AVR NaN
216 Avery Dennison AVY NaN
217 Avis Budget Grp. CAR NaN
218 Avista AVA NaN
219 Avnet AVT NaN
220 Avon Products AVP NaN
221 Axa AXA NaN
222 Axis AXS NaN
223 Azz AZZ NaN

[224 rows x 3 columns]]
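The square brackets around the output above show that read_html returned a list holding one DataFrame, and that the real header row (Equity, Symbol, Info) ended up as data row 1. A minimal sketch of one way to tidy this up follows. It assumes a small inline table modeled on the snippet in the question rather than the live page, and the names `tables`, `raw`, and `df` are illustrative. Wrapping the markup in StringIO is an extra precaution, since recent pandas versions warn when literal HTML is passed as a plain string.

```python
from io import StringIO

import pandas as pd

# Hypothetical markup modeled on the table in the question (two data rows only).
html = """<table class="market tab1">
<tr><td colspan="3"><b>Companies listed on the NYSE</b></td></tr>
<tr><th>Equity</th><th>Symbol</th><th>Info</th></tr>
<tr><td>A K Steel</td><td>AKS</td><td></td></tr>
<tr><td>A M R</td><td>AMR</td><td></td></tr>
</table>"""

# read_html returns a list of DataFrames; take the first (and only) table.
tables = pd.read_html(StringIO(html))
raw = tables[0]

# Row 0 is the title banner and row 1 holds the column names,
# so promote row 1 to the header and keep the rows after it.
df = raw.iloc[2:].reset_index(drop=True)
df.columns = raw.iloc[1].tolist()

symbols = df["Symbol"].tolist()
print(symbols)
```

With the scraped page, the same steps would apply to `pd.read_html(StringIO(str(table)))[0]`, provided the page still has the title row followed by the header row.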

This question, "python - NoneType object error when converting a scraped table to a DataFrame", comes from a similar question found on Stack Overflow: https://stackoverflow.com/questions/55462375/
