
python - Getting table values with Python

Reposted. Author: 行者123. Updated: 2023-11-30 22:11:27

I am trying to get values from an HTML table using Python. The HTML looks like this:

<table border=1 width=900>
<tr><td width=50%>
<table>
<tr><td align=right><b>Invoice #</td><td><input type=text value="1624140" size=12></td></tr>
<tr><td align=right>Company</td><td><input type=text value="NZone" size=40></td></tr>
<tr><td align=right>Name:</td><td><input type=text value="John Dot" size=40></td></tr>
<tr><td align=right>Address:</td><td><input type=text value="Posie Row, Moscow Road" size=40></td></tr>
<tr><td align=right>City:</td><td><input type=text value="Co. Dubllin" size=40></td></tr>
<tr><td align=right>Province</td><td><input type=text value="" size=40></td></tr>
<tr><td align=right>Postal Code:</td><td><input type=text value="" size=40></td></tr>
<tr><td align=right>Country:</td><td><input type=text value="IRELAND" size=40></td></tr>
<tr><td align=right>Date:</td><td><input type=text value="24.4.18" size=12></td></tr>
<tr><td align=right>Sub Total:</td><td><input type=text value="93,24" size=40></td></tr>
<tr><td align=right>Combined Weight:</td><td><input type=text value="1,24" size=40></td></tr>
</table>

My code so far is:

from __future__ import print_function
import requests
import re

from bs4 import BeautifulSoup as bs

request = requests.get('url')

content = request.content

soup = bs(content, 'html.parser')

table = soup.findChildren('table')[1]

rows = table.findChildren('tr')

for row in rows:
    cells = row.findChildren('td')
    for cell in cells:
        cell_content = cell.getText()

        print(cell_content)

The output is:

Invoice #
Company
Name:
Address:
City:
Province
Postal Code:
Country:
Date:
Sub Total:
Combined Weight:

I would like the final output to look like this:

Invoice:1624140
Company:NZone
Name:John Dot
Address:Posie Row, Moscow Road
City:Co. Dubllin
Province:
Postal Code:
Country:IRELAND
Date:24.4.18
Sub Total:93,24
Combined Weight:1,24

Best answer

data = """
<table border=1 width=900>
<tr><td width=50%>
<table>
<tr><td align=right><b>Invoice #</td><td><input type=text value="1624140" size=12></td></tr>
<tr><td align=right>Company</td><td><input type=text value="NZone" size=40></td></tr>
<tr><td align=right>Name:</td><td><input type=text value="John Dot" size=40></td></tr>
<tr><td align=right>Address:</td><td><input type=text value="Posie Row, Moscow Road" size=40></td></tr>
<tr><td align=right>City:</td><td><input type=text value="Co. Dubllin" size=40></td></tr>
<tr><td align=right>Province</td><td><input type=text value="" size=40></td></tr>
<tr><td align=right>Postal Code:</td><td><input type=text value="" size=40></td></tr>
<tr><td align=right>Country:</td><td><input type=text value="IRELAND" size=40></td></tr>
<tr><td align=right>Date:</td><td><input type=text value="24.4.18" size=12></td></tr>
<tr><td align=right>Sub Total:</td><td><input type=text value="93,24" size=40></td></tr>
<tr><td align=right>Combined Weight:</td><td><input type=text value="1,24" size=40></td></tr>
</table>
"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'lxml')

for (td, inp) in zip(soup.find_all('td', align="right"), soup.find_all('input')):
    print(td.text, inp['value'])

The output is:

Invoice # 1624140
Company NZone
Name: John Dot
Address: Posie Row, Moscow Road
City: Co. Dubllin
Province
Postal Code:
Country: IRELAND
Date: 24.4.18
Sub Total: 93,24
Combined Weight: 1,24
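
The zip-based pairing above works because labels and inputs appear in the same order and in equal numbers across the whole document. If a row ever lacked an input, the pairs would silently shift. A minimal sketch of a row-by-row variant follows, which also produces the `Label:value` format asked for in the question. The names `extract_fields` and `html` are hypothetical, the markup is a trimmed copy of the question's table, and `html.parser` is used here only to avoid the extra lxml dependency.

```python
from bs4 import BeautifulSoup

# Trimmed copy of the question's markup, including the outer wrapper table.
html = """
<table border=1 width=900>
<tr><td width=50%>
<table>
<tr><td align=right><b>Invoice #</td><td><input type=text value="1624140" size=12></td></tr>
<tr><td align=right>Name:</td><td><input type=text value="John Dot" size=40></td></tr>
<tr><td align=right>Province</td><td><input type=text value="" size=40></td></tr>
</table>
"""


def extract_fields(markup):
    """Return (label, value) pairs, one per row that has a label cell and an <input>."""
    soup = BeautifulSoup(markup, "html.parser")
    fields = []
    for row in soup.find_all("tr"):
        # Only look at the row's own cells, so the outer wrapper row is skipped.
        label_cell = row.find("td", align="right", recursive=False)
        if label_cell is None:
            continue
        value_cell = label_cell.find_next_sibling("td")
        field = value_cell.find("input") if value_cell is not None else None
        if field is None:
            continue
        # Drop the trailing ":" or "#" so the label matches the requested format.
        label = label_cell.get_text(strip=True).rstrip(":#").strip()
        # The value lives in the input's "value" attribute, not in its text.
        fields.append((label, field.get("value", "")))
    return fields


for label, value in extract_fields(html):
    print(f"{label}:{value}")
```

The original code in the question printed only the labels because `getText()` returns text nodes, while the values are stored in each input's `value` attribute. Reading the attribute per row, as sketched here, keeps each label tied to the input in the same row.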

Regarding "python - Getting table values with Python", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/51401638/
