
python - Pandas read_html causes TypeError

Reposted. Author: 太空宇宙. Updated: 2023-11-04 08:44:47

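I am using bs4 to parse an HTML page and extract a table (a sample is given below), and I am trying to load it into pandas. When I call pddataframe = pd.read_html(LOTable, skiprows=2, flavor=['bs4']) I get the error listed below, even though I can print the table prettified by bs4.

Any suggestions for fixing this without having to fetch each td and read them one by one?

Sample table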

<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr>
<th colspan="2">
Learning Outcomes
</th>
</tr>
<tr>
<td class="info" colspan="2">
On successful completion of this module the learner will be able to:
</td>
</tr>
<tr>
<td style="width:10%;">
LO1
</td>
<td>
Demonstrate an awareness of the important role of Financial Accounting information as an input into the decision making process.
</td>
</tr>
<tr>
<td style="width:10%;">
LO2
</td>
<td>
Display an understanding of the fundamental accounting concepts, principles and conventions that underpin the preparation of Financial statements.
</td>
</tr>
<tr>
<td style="width:10%;">
LO3
</td>
<td>
Understand the various formats in which information in relation to transactions or events is recorded and classified.
</td>
</tr>
<tr>
<td style="width:10%;">
LO4
</td>
<td>
Apply a knowledge of accounting concepts,conventions and techniques such as double entry to the posting of recorded information to the T accounts in the Nominal Ledger.
</td>
</tr>
<tr>
<td style="width:10%;">
LO5
</td>
<td>
Prepare and present the financial statements of a Sole Trader in prescribed format from a Trial Balance accompanies by notes with additional information.
</td>
</tr>
</table>

Error

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-20-12673b1a4bfc> in <module>()
10 #Read table into pandas
11 if first:
---> 12 pddataframe = pd.read_html(LOTable,skiprows=2, flavor=['bs4'])
13 first = False
14 pddataframe

C:\Program Files\Anaconda3\envs\LearningOutcomes\lib\site-packages\pandas\io\html.py in read_html(io, match, flavor, header, index_col, skiprows, attrs, parse_dates, tupleize_cols, thousands, encoding)
872 _validate_header_arg(header)
873 return _parse(flavor, io, match, header, index_col, skiprows,
--> 874 parse_dates, tupleize_cols, thousands, attrs, encoding)

C:\Program Files\Anaconda3\envs\LearningOutcomes\lib\site-packages\pandas\io\html.py in _parse(flavor, io, match, header, index_col, skiprows, parse_dates, tupleize_cols, thousands, attrs, encoding)
734 break
735 else:
--> 736 raise_with_traceback(retained)
737
738 ret = []

C:\Program Files\Anaconda3\envs\LearningOutcomes\lib\site-packages\pandas\compat\__init__.py in raise_with_traceback(exc, traceback)
331 if traceback == Ellipsis:
332 _, _, traceback = sys.exc_info()
--> 333 raise exc.with_traceback(traceback)
334 else:
335 # this version of raise is a syntax error in Python 3

**TypeError: 'NoneType' object is not callable**

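Best answer

Thanks for all the suggested answers and the pointers in the comments. My rookie mistake was that I left the table in the variable after extracting it with bs4, so LOTable was still a bs4 object rather than an HTML string. I was running pd.read_html(LOTable, skiprows=2, flavor='bs4') when I needed to run pd.read_html(LOTable.prettify(), skiprows=2, flavor='bs4').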
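
The fix described above can be sketched as follows. This is a minimal illustration, not the asker's actual script: the HTML string is a shortened stand-in for the real page, and the names html, soup, table_html, frames and lo_frame are assumptions introduced here. A likely reason for the 'NoneType' object is not callable message is that bs4 resolves an unknown attribute such as LOTable.read as a search for a child tag named read, which returns None, and pandas then tries to call it. The sketch leaves flavor at its default so that pandas uses whichever parser is installed; the original call passed flavor='bs4', which additionally requires html5lib.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup

# Shortened stand-in for the page in the question (assumption: the real
# markup is fetched elsewhere and contains the full table).
html = """
<table cellpadding="5" cellspacing="0" class="borders" width="100%">
<tr><th colspan="2">Learning Outcomes</th></tr>
<tr><td class="info" colspan="2">On successful completion of this module the learner will be able to:</td></tr>
<tr><td style="width:10%;">LO1</td><td>First outcome text.</td></tr>
<tr><td style="width:10%;">LO2</td><td>Second outcome text.</td></tr>
<tr><td style="width:10%;">LO3</td><td>Third outcome text.</td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")
LOTable = soup.find("table", class_="borders")  # a bs4 Tag, not a string

# read_html expects markup (or a path, URL, or file-like object), so
# serialize the Tag first. str(LOTable) would also produce markup.
table_html = LOTable.prettify()

# Wrapping the markup in StringIO gives read_html a file-like object, which
# avoids the literal-string deprecation warning in recent pandas releases.
frames = pd.read_html(StringIO(table_html), skiprows=2)

# read_html always returns a list of DataFrames, one per table found.
lo_frame = frames[0]
print(lo_frame)
```

Note that read_html returns a list, so the result has to be indexed to get the DataFrame. How skiprows=2 interacts with the th header row can differ between pandas versions, so it is worth checking which row ends up as the column labels.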

Regarding python - Pandas read_html causes TypeError, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/41651350/
