
python - Splitting a PDF file in Python - ValueError: invalid literal for int() with base 10: ''

Reposted. Author: 太空宇宙. Updated: 2023-11-03 11:35:03

I am trying to split a huge PDF file into several small PDFs using pyPdf, with this oversimplified code:

from pyPdf import PdfFileWriter, PdfFileReader

# Open the source document in binary mode (Python 2 `file` builtin).
inputpdf = PdfFileReader(file("document.pdf", "rb"))

# Write each page to its own single-page PDF.
for i in xrange(inputpdf.numPages):
    output = PdfFileWriter()
    output.addPage(inputpdf.getPage(i))
    outputStream = file("document-page%s.pdf" % i, "wb")
    output.write(outputStream)
    outputStream.close()

But I get the following error message:

Traceback (most recent call last):
  File "./hltShortSummary.py", line 24, in <module>
    for i in xrange(inputpdf.numPages):
  File "/usr/lib/pymodules/python2.7/pyPdf/pdf.py", line 342, in <lambda>
    numPages = property(lambda self: self.getNumPages(), None, None)
  File "/usr/lib/pymodules/python2.7/pyPdf/pdf.py", line 334, in getNumPages
    self._flatten()
  File "/usr/lib/pymodules/python2.7/pyPdf/pdf.py", line 500, in _flatten
    pages = catalog["/Pages"].getObject()
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 466, in __getitem__
    return dict.__getitem__(self, key).getObject()
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 165, in getObject
    return self.pdf.getObject(self).getObject()
  File "/usr/lib/pymodules/python2.7/pyPdf/pdf.py", line 549, in getObject
    retval = readObject(self.stream, self)
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 67, in readObject
    return DictionaryObject.readFromStream(stream, pdf)
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 517, in readFromStream
    value = readObject(stream, pdf)
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 58, in readObject
    return ArrayObject.readFromStream(stream, pdf)
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 153, in readFromStream
    arr.append(readObject(stream, pdf))
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 87, in readObject
    return NumberObject.readFromStream(stream)
  File "/usr/lib/pymodules/python2.7/pyPdf/generic.py", line 232, in readFromStream
    return NumberObject(name)
ValueError: invalid literal for int() with base 10: ''

Any ideas?

Best answer

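I think this is a bug in pyPdf. Look at the source here. NumberObject.readFromStream expects a string that looks like an integer, but it is not getting one (the traceback shows it received an empty string). The PDF in question is probably malformed in some way the library did not anticipate.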
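
To illustrate the failure mode described above, here is a minimal sketch. It is not pyPdf's actual code: read_number_token and parse_int are hypothetical helpers that only imitate the general idea of collecting digit characters from a stream and passing the result to int(). When the stream does not start with a digit, the collected token is empty and int('') raises the same ValueError as in the traceback.

```python
import io


def read_number_token(stream):
    """Collect consecutive ASCII digits from the stream (hypothetical helper)."""
    token = b""
    while True:
        ch = stream.read(1)
        # Stop at end of stream or at the first non-digit byte.
        if ch == b"" or ch not in b"0123456789":
            break
        token += ch
    return token.decode("ascii")


def parse_int(stream):
    """Convert the collected token to int; an empty token makes int() fail."""
    return int(read_number_token(stream))


# Well-formed input: digits are present where a number is expected.
ok = parse_int(io.BytesIO(b"42 0 R"))

# Malformed input: a non-digit byte sits where the number should start,
# so the token is '' and int('') raises ValueError.
try:
    parse_int(io.BytesIO(b"x42"))
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If the cause really is a malformed file, catching ValueError around the code that opens and reads the PDF at least lets the script report which file failed instead of aborting with a bare traceback. That is a workaround, not a fix for the underlying parsing problem.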

Regarding "python - Splitting a PDF file in Python - ValueError: invalid literal for int() with base 10: ''", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/6393800/
