python - How to convert a PDF from a URL to images using pdf2image in Python?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 03:59:46

I can use pdf2image's convert_from_path to convert a PDF file on my drive to images, but when I try the same with a PDF at 'https://example.com/abc.pdf', I end up with multiple errors.

Code:

from urllib.request import urlopen
import pdf2image

url = 'https://example.com/abc.pdf'
scrape = urlopen(url)  # for external files
pil_images = pdf2image.convert_from_bytes(
    scrape.read(), dpi=200,
    output_folder=None, first_page=None, last_page=None,
    thread_count=1, userpw=None, use_cropbox=False, strict=False,
    poppler_path=r"C:\poppler-0.68.0_x86\poppler-0.68.0\bin",
)

Error:

   Unable to get page count. Syntax Error: Document stream is empty

I also followed the link below, but without success:

Python3: Download PDF to memory and convert first page to image

Authentication screenshot: (image not included)

Best answer

First, download the PDF from the URL as described in this blog post: https://dzone.com/articles/simple-examples-of-downloading-files-using-python
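The download step can be sketched roughly as follows. This is a minimal sketch using only the standard library, not the code from the linked blog post; the helper names `looks_like_pdf` and `download_pdf` are hypothetical. It also checks that the response really is a PDF, since an empty or non-PDF body (for example a login page) is one plausible cause of the "Document stream is empty" error quoted in the question.

```python
import urllib.request

PDF_MAGIC = b"%PDF-"


def looks_like_pdf(data: bytes) -> bool:
    """Return True if the PDF signature appears near the start of the data.

    Assumption: the first 1 KiB is searched instead of requiring offset 0,
    because some files carry a few bytes before the header.
    """
    return PDF_MAGIC in data[:1024]


def download_pdf(url: str, dest_path: str, timeout: float = 30.0) -> str:
    """Download url to dest_path, failing early if the body is not a PDF."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    if not looks_like_pdf(data):
        raise ValueError(
            f"{url} did not return a PDF ({len(data)} bytes received)"
        )
    with open(dest_path, "wb") as out:
        out.write(data)
    return dest_path
```

A call such as `download_pdf('https://example.com/abc.pdf', 'abc.pdf')` would then leave a local file that can be passed to a converter.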

If your PDF has multiple pages, use the following to convert it to images (or any other format Ghostscript supports).

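import ghostscript

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    # For a multi-page PDF, put a page counter such as %03d in
    # jpeg_output_path (e.g. "page-%03d.jpg") so each page gets its own file.
    args = ["pdf2jpeg",  # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",  # render at 144 dpi
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]
    ghostscript.Ghostscript(*args)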

Reference: Converting a PDF to a series of images with Python

For authentication, try this.

import os
import requests

from urllib.parse import urlparse  # Python 3 location of urlparse

username = 'foo'
password = 'sekret'

url = 'http://example.com/blueberry/download/somefile.jpg'
filename = os.path.basename(urlparse(url).path)

# HTTP Basic authentication via the auth tuple
r = requests.get(url, auth=(username, password))

if r.status_code == 200:
    with open(filename, 'wb') as out:
        # write in chunks rather than the default of one byte at a time
        for bits in r.iter_content(chunk_size=8192):
            out.write(bits)

Reference: Download a file providing username and password using Python
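Since the question asked about pdf2image specifically, the authenticated download and the conversion can also be combined in memory. The following is a sketch under stated assumptions, not a confirmed fix: it assumes the server uses HTTP Basic authentication, it uses the standard library for the download (in line with the `urlopen` call in the question), and the helper names `basic_auth_header`, `fetch_pdf_bytes` and `pdf_url_to_images` are hypothetical. Only `pdf2image.convert_from_bytes` with its `dpi` and `poppler_path` parameters comes from the library itself.

```python
import base64
import urllib.request


def basic_auth_header(username: str, password: str) -> str:
    """Build the value of an HTTP Basic Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def fetch_pdf_bytes(url: str, username: str, password: str,
                    timeout: float = 30.0) -> bytes:
    """Download the response body, sending Basic credentials up front."""
    request = urllib.request.Request(
        url,
        headers={"Authorization": basic_auth_header(username, password)},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def pdf_url_to_images(url: str, username: str, password: str,
                      dpi: int = 200, poppler_path=None):
    """Return a list of PIL images, one per page of the PDF at url."""
    # Imported here so the helpers above can be used without pdf2image
    # (and Poppler) being installed.
    from pdf2image import convert_from_bytes

    data = fetch_pdf_bytes(url, username, password)
    # An empty body or an HTML login page would make Poppler fail,
    # so check for the PDF signature before converting.
    if not data.startswith(b"%PDF-"):
        raise ValueError(
            f"{url} did not return a PDF ({len(data)} bytes received)"
        )
    return convert_from_bytes(data, dpi=dpi, poppler_path=poppler_path)
```

On Windows, `poppler_path` would be set to the Poppler `bin` directory, as in the question. If the site uses a login form or another scheme instead of Basic authentication, the download step needs to be adapted accordingly.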

Regarding "python - How to convert a PDF from a URL to images using pdf2image in Python?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/58603978/