
python - TypeError: must be convertible to a buffer, not ResultSet

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:13:16

I am trying to convert a PDF to a text file using scraperwiki and bs4, and I am getting a TypeError. I am new to Python and would really appreciate any help.

The error occurs here:

  File "scraper_wiki_download.py", line 53, in write_file
    f.write(soup)

Here is my code:

# Get content, regardless of whether it is an HTML, XML or PDF file
def send_Request(url):
    response = http.urlopen('GET', url, preload_content=False)
    return response

# Use this to get the PDF and convert it to XML
def process_PDF(fileLocation):
    pdfToProcess = send_Request(fileLocation)
    pdfToObject = scraperwiki.pdftoxml(pdfToProcess.read())
    return pdfToObject

# Returns a navigable tree, which you can iterate through
def parse_HTML_tree(contentToParse):
    soup = BeautifulSoup(contentToParse, 'lxml')
    return soup

pdf = process_PDF('http://www.sfbos.org/Modules/ShowDocument.aspx?documentid=54790')
pdfToSoup = parse_HTML_tree(pdf)
soupToArray = pdfToSoup.findAll('text')

def write_file(soup_array):
    with open('test.txt', "wb") as f:
        f.write(soup_array)

write_file(soupToArray)

Best answer

I had never used scraperwiki before now, but this gets the text:

import scraperwiki
import requests
from bs4 import BeautifulSoup

pdf_xml = scraperwiki.pdftoxml(requests.get('http://www.sfbos.org/Modules/ShowDocument.aspx?documentid=54790').content)
print(BeautifulSoup(pdf_xml, "lxml").find_all("text"))
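
The snippet above prints the elements but does not write them to a file, which is where the TypeError in the question comes from: find_all() returns a ResultSet (a list of Tag objects), and a file's write() only accepts a string or bytes. A minimal sketch of one way to write the text out follows. It assumes Python 3; SAMPLE_XML is a made-up stand-in for the string that scraperwiki.pdftoxml() returns, extract_lines and the two-argument write_file are hypothetical helpers, and "html.parser" is used only so the sketch runs without lxml.

```python
import os
import tempfile

from bs4 import BeautifulSoup

# Hypothetical stand-in for the XML string returned by scraperwiki.pdftoxml().
SAMPLE_XML = """
<pdf2xml>
  <page number="1">
    <text top="10" left="20">First line</text>
    <text top="30" left="20">Second <b>line</b></text>
  </page>
</pdf2xml>
"""


def extract_lines(xml_string):
    """Return the text of every <text> element as a list of str."""
    soup = BeautifulSoup(xml_string, "html.parser")
    # find_all() returns a ResultSet of Tag objects, not a string,
    # so each Tag is converted with get_text() before it can be written.
    return [tag.get_text() for tag in soup.find_all("text")]


def write_file(lines, path):
    """Write the extracted lines to path, one per row, as UTF-8 text."""
    # Text mode ("w") accepts str; binary mode ("wb") would need bytes.
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


lines = extract_lines(SAMPLE_XML)
out_path = os.path.join(tempfile.gettempdir(), "test.txt")
write_file(lines, out_path)
```

With the real document, the string passed to extract_lines would be the pdf_xml value from the answer instead of SAMPLE_XML.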

Regarding "python - TypeError: must be convertible to a buffer, not ResultSet", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37251362/
