
python - Convert a PDF to images using Python

Reposted. Author: 行者123. Updated: 2023-12-04 16:27:23

I am trying to convert a PDF file to image files on an Ubuntu server, where I have installed:

  1. python2.7
  2. poppler-utils
  3. pdf2image==1.12.1

My code:

from pdf2image import convert_from_path, convert_from_bytes

images = convert_from_path("/home/user/pdf_file.pdf")

# OR

# Open in binary mode so the raw PDF bytes are passed through unchanged.
with open("/home/user/pdf_file.pdf", "rb") as pdf:
    images = convert_from_bytes(pdf.read())

Output

When I use the convert_from_path function:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

When I use the convert_from_bytes function:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 268, in convert_from_bytes
    paths_only=paths_only,
  File "/usr/local/lib/python2.7/dist-packages/pdf2image/pdf2image.py", line 143, in convert_from_path
    thread_output_file = next(output_file)
TypeError: ThreadSafeGenerator object is not an iterator

I started running into these errors after reinstalling all of the utilities.
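
Both tracebacks fail on the same call, next(output_file). One plausible reading, which the post itself does not confirm, is that the installed pdf2image release defines only the Python 3 iterator method (__next__) on ThreadSafeGenerator, while Python 2.7 looks for a method named next. The sketch below reproduces that kind of error with two hypothetical stand-in classes (OnlyIter and FullIterator are made up for illustration and are not pdf2image code):

```python
class OnlyIter(object):
    """Hypothetical stand-in: iterable, but lacks the method that next() needs."""

    def __init__(self, items):
        self._items = list(items)

    def __iter__(self):
        return self


class FullIterator(OnlyIter):
    """Hypothetical stand-in that satisfies next() on Python 2 and Python 3."""

    def __next__(self):
        if not self._items:
            raise StopIteration
        return self._items.pop(0)

    # Python 2 looks up `next`; Python 3 looks up `__next__`.
    next = __next__


def try_next(obj):
    """Return next(obj), or the TypeError text if obj is not an iterator."""
    try:
        return next(obj)
    except TypeError as exc:
        return "TypeError: %s" % exc


print(try_next(OnlyIter(["page-1"])))      # reports a "not an iterator" TypeError
print(try_next(FullIterator(["page-1"])))  # returns the first item
```

If that reading is right, the usual ways out would be to run the script under Python 3 or to pin a pdf2image release that still supports Python 2.7; which release that is should be checked against the pdf2image changelog.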

Best answer

If you want to convert a PDF to an image, you can try the Python Ghostscript package:

pip install ghostscript

import ghostscript
import locale

def pdf2jpeg(pdf_input_path, jpeg_output_path):
    # Ghostscript command-line arguments; the first entry stands in for argv[0].
    args = ["pdf2jpeg",  # actual value doesn't matter
            "-dNOPAUSE",
            "-sDEVICE=jpeg",
            "-r144",  # render at 144 dpi
            "-sOutputFile=" + jpeg_output_path,
            pdf_input_path]

    # The arguments have to be bytes, so encode them.
    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]

    ghostscript.Ghostscript(*args)

pdf2jpeg(
"...Fixate/ActiveState/pdf/a.pdf",
"...Fixate/ActiveState/pdf/a.jpeg",
)
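
The function above writes to a single output path. For a PDF with several pages, Ghostscript accepts a printf-style page counter such as %03d in -sOutputFile, and -dBATCH makes it exit after the last page instead of waiting at its prompt. The following is a minimal sketch under those assumptions; build_gs_args and pdf2jpeg_pages are hypothetical helper names, not part of the ghostscript package, and the file names in the usage lines are placeholders:

```python
import locale


def build_gs_args(pdf_input_path, jpeg_output_pattern, dpi=144):
    """Build the Ghostscript argument list (hypothetical helper)."""
    return [
        "pdf2jpeg",        # stands in for argv[0]; actual value doesn't matter
        "-dNOPAUSE",       # do not pause between pages
        "-dBATCH",         # exit once the input file has been processed
        "-dSAFER",         # restrict file access from within the document
        "-sDEVICE=jpeg",
        "-r%d" % dpi,      # rendering resolution in dpi
        "-sOutputFile=" + jpeg_output_pattern,  # e.g. "page-%03d.jpeg"
        pdf_input_path,
    ]


def pdf2jpeg_pages(pdf_input_path, jpeg_output_pattern, dpi=144):
    """Render every page of the PDF to its own JPEG (hypothetical helper)."""
    # Imported here so build_gs_args can be used without the package installed.
    import ghostscript

    args = build_gs_args(pdf_input_path, jpeg_output_pattern, dpi)
    # As in the answer above, the arguments are passed as bytes.
    encoding = locale.getpreferredencoding()
    args = [a.encode(encoding) for a in args]
    ghostscript.Ghostscript(*args)


# Inspect the arguments without invoking Ghostscript:
print(build_gs_args("input.pdf", "page-%03d.jpeg", dpi=200))
```

Keeping the argument list in its own function makes it easy to check the flags before any rendering happens.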

This post, "python - Convert a PDF to images using Python", is based on a similar question found on Stack Overflow: https://stackoverflow.com/questions/60701262/
