gpt4 book ai didi

python - 用 BeautifulSoup 解析表并写入文本文件

转载 作者:太空狗 更新时间:2023-10-29 20:46:36 25 4
gpt4 key购买 nike

我需要这种格式的文本文件 (output.txt) 中的表中的数据:数据 1;数据 2;数据 3;数据 4;.....

Celkova podlahova plocha bytu;33m;Vytah;Ano;Nadzemne podlazie;Prizemne podlazie;......;Forma vlastnictva;Osobne

全部在“一行”中,分隔符为“;”(稍后导出为csv文件)。

我是初学者..帮助,谢谢。

from BeautifulSoup import BeautifulSoup
import urllib2
import codecs

response = urllib2.urlopen('http://www.reality.sk/zakazka/0747-003578/predaj/1-izb-byt/kosice-mestska-cast-sever-sladkovicova-kosice-sever/art-real-1-izb-byt-sladkovicova-ul-kosice-sever')
html = response.read()
soup = BeautifulSoup(html)

tabulka = soup.find("table", {"class" : "detail-char"})

for row in tabulka.findAll('tr'):
col = row.findAll('td')
prvy = col[0].string.strip()
druhy = col[1].string.strip()
record = ([prvy], [druhy])

fl = codecs.open('output.txt', 'wb', 'utf8')
for rec in record:
line = ''
for val in rec:
line += val + u';'
fl.write(line + u'\r\n')
fl.close()

最佳答案

您在读入时不会保留每条记录。试试这个,它将记录存储在 records 中:

from BeautifulSoup import BeautifulSoup
import urllib2
import codecs

response = urllib2.urlopen('http://www.reality.sk/zakazka/0747-003578/predaj/1-izb-byt/kosice-mestska-cast-sever-sladkovicova-kosice-sever/art-real-1-izb-byt-sladkovicova-ul-kosice-sever')
html = response.read()
soup = BeautifulSoup(html)

tabulka = soup.find("table", {"class" : "detail-char"})

records = [] # store all of the records in this list
for row in tabulka.findAll('tr'):
col = row.findAll('td')
prvy = col[0].string.strip()
druhy = col[1].string.strip()
record = '%s;%s' % (prvy, druhy) # store the record with a ';' between prvy and druhy
records.append(record)

fl = codecs.open('output.txt', 'wb', 'utf8')
line = ';'.join(records)
fl.write(line + u'\r\n')
fl.close()

这可以进一步清理,但我认为这就是您想要的。

关于python - 用 BeautifulSoup 解析表并写入文本文件,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/2224602/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com