gpt4 book ai didi

python - BeautifulSoup 文件保存 .txt 时出错

转载 作者:太空宇宙 更新时间:2023-11-03 16:49:03 27 4
gpt4 key购买 nike

from bs4 import BeautifulSoup
import requests
import os


url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"
r = requests.get(url)
soup = BeautifulSoup(r.content.decode('utf-8', 'ignore'))
data = soup.find_all("article", {"class": "article"})

with open("data1.txt", "wb") as file:
content=‘utf-8’
for item in data:
content+='''{}\n{}\n\n{}\n{}'''.format( item.contents[0].find_all("time", {"datetime": "2016-03-16T09:50:30+0100"})[0].text,
item.contents[0].find_all("a", {"class": "link-grey"})[0].text,
item.contents[0].find_all("img", {"class": "media-full"})[0],
item.contents[1].find_all("div", {"class": "article_textwrap"})[0].text,
)
with open("data1.txt".format(file_name), "wb") as file:
file.write(content)

最近解决了 utf/Unicode 问题,但现在它不将其保存为 .txt 文件,也不保存它。我需要做什么?

最佳答案

如果您想将数据以 UTF-8 格式写入文件,请尝试 codecs.open,例如:

from bs4 import BeautifulSoup
import requests
import os
import codecs


url = "http://nos.nl/artikel/2093082-steeds-meer-nekklachten-bij-kinderen-door-gebruik-tablets.html"
r = requests.get(url)
soup = BeautifulSoup(r.content)
data = soup.find_all("article", {"class": "article"})

with codecs.open("data1.txt", "wb", "utf-8") as filen:
for item in data:
filen.write(item.contents[0].find_all("time", {"datetime": "2016-03-16T09:50:30+0100"})[0].get_text())
filen.write('\n')
filen.write(item.contents[0].find_all("a", {"class": "link-grey"})[0].get_text())
filen.write('\n\n')
filen.write(item.contents[0].find_all("img", {"class": "media-full"})[0].get_text())
filen.write('\n')
filen.write(item.contents[1].find_all("div", {"class": "article_textwrap"})[0].get_text())

我不确定 filen.write(item.contents[0].find_all("img", {"class": "media-full"})[0]) 因为为我返回了一个 Tag 实例。

关于python - BeautifulSoup 文件保存 .txt 时出错,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/36044653/

27 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com