
python - pyPdf IndirectObject in /Rotate

Reposted. Author: 太空宇宙. Updated: 2023-11-04 10:40:54

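We have a simple script that reads incoming PDF files. If a page is landscape, the script rotates it to portrait so that other programs can use it later. Everything worked fine with pyPdf until I hit a file whose page has an IndirectObject as the value of its /Rotate key. The object is resolvable, so I can tell what the /Rotate value is, but calling rotateClockwise or rotateCounterClockwise raises a traceback because pyPdf does not expect an IndirectObject in /Rotate. I have tried quite a few ways of overwriting the IndirectObject with the plain value, but I have gotten nowhere. I even tried passing the same IndirectObject as the angle to rotateClockwise, which fails with the same TypeError, just one line earlier in pdf.pyc.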

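My question, simply put: is there a patch for pyPdf or PyPDF2 that keeps it from choking on this kind of setup, is there a different way I can rotate the page, or is there a different library I have not seen or considered yet? I tried PyPDF2 and it has the same problem. I have looked at PDFMiner as a replacement, but it seems geared more toward extracting information from PDF files than toward manipulating them. Here is the output from playing with the file in ipython using pyPdf; the PyPDF2 output is very similar, though some of the information is formatted slightly differently: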

In [1]: from pyPdf import PdfFileReader

In [2]: mypdf = PdfFileReader(open("RP121613.pdf","rb"))

In [3]: mypdf.getNumPages()
Out[3]: 1

In [4]: mypdf.resolvedObjects
Out[4]:
{0: {1: {'/Pages': IndirectObject(2, 0), '/Type': '/Catalog'},
     2: {'/Count': 1, '/Kids': [IndirectObject(4, 0)], '/Type': '/Pages'},
     4: {'/Count': 1,
         '/Kids': [IndirectObject(5, 0)],
         '/Parent': IndirectObject(2, 0),
         '/Type': '/Pages'},
     5: {'/Contents': IndirectObject(6, 0),
         '/MediaBox': [0, 0, 612, 792],
         '/Parent': IndirectObject(4, 0),
         '/Resources': IndirectObject(7, 0),
         '/Rotate': IndirectObject(8, 0),
         '/Type': '/Page'}}}

In [5]: mypage = mypdf.getPage(0)

In [6]: myrotation = mypage.get("/Rotate")

In [7]: myrotation
Out[7]: IndirectObject(8, 0)

In [8]: mypdf.getObject(myrotation)
Out[8]: 0

In [9]: mypage.rotateCounterClockwise(90)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
   1049     def rotateCounterClockwise(self, angle):
   1050         assert angle % 90 == 0
-> 1051         self._rotate(-angle)
   1052         return self
   1053

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
   1054     def _rotate(self, angle):
   1055         currentAngle = self.get("/Rotate", 0)
-> 1056         self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
   1057
   1058     def _mergeResources(res1, res2, resource):

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'

In [10]: mypage.rotateClockwise(90)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateClockwise(self, angle)
   1039     def rotateClockwise(self, angle):
   1040         assert angle % 90 == 0
-> 1041         self._rotate(angle)
   1042         return self
   1043

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in _rotate(self, angle)
   1054     def _rotate(self, angle):
   1055         currentAngle = self.get("/Rotate", 0)
-> 1056         self[NameObject("/Rotate")] = NumberObject(currentAngle + angle)
   1057
   1058     def _mergeResources(res1, res2, resource):

TypeError: unsupported operand type(s) for +: 'IndirectObject' and 'int'

In [11]: mypage.rotateCounterClockwise(myrotation)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)

/root/<ipython console> in <module>()

/usr/lib/python2.7/site-packages/pyPdf/pdf.pyc in rotateCounterClockwise(self, angle)
   1048     # @param angle Angle to rotate the page. Must be an increment of 90 deg.
   1049     def rotateCounterClockwise(self, angle):
-> 1050         assert angle % 90 == 0
   1051         self._rotate(-angle)
   1052         return self

TypeError: unsupported operand type(s) for %: 'IndirectObject' and 'int'

If anyone wants to dig into it, I am happy to provide the file I am working with.
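
One direction that might be worth trying, sketched here without pyPdf so that it runs on its own: resolve /Rotate to a plain integer first, do the arithmetic on that, and overwrite the key with the direct value instead of calling rotateClockwise. The names `FakeIndirect`, `new_rotation`, `resolve`, and `objects` are hypothetical stand-ins and are not part of pyPdf. The sketch also assumes that a direct /Rotate value behaves like an `int`, which is what the `NumberObject(currentAngle + angle)` line in the traceback suggests.

```python
class FakeIndirect(object):
    """Stand-in for pyPdf's IndirectObject: it only carries an object number."""
    def __init__(self, idnum):
        self.idnum = idnum


def new_rotation(page, angle, resolve):
    """Return the page's rotation after adding `angle`, as a plain int."""
    if angle % 90 != 0:
        raise ValueError("angle must be a multiple of 90")
    raw = page.get("/Rotate", 0)
    # An indirect reference cannot take part in arithmetic, so resolve it first.
    if not isinstance(raw, int):
        raw = resolve(raw)
    # Normalising to 0..359 is a choice made in this sketch, not pyPdf behaviour.
    return (int(raw) + angle) % 360


# Stand-in data shaped like the file in the question:
# object 8 holds the number 0, and the page's /Rotate points at it.
objects = {8: 0}
page = {"/Rotate": FakeIndirect(8)}


def resolve(ref):
    """Hypothetical resolver; plays the role of mypdf.getObject at In [8]."""
    return objects[ref.idnum]


angle = new_rotation(page, 90, resolve)
page["/Rotate"] = angle  # overwrite the reference with a direct value
print(angle)
```

With the real library, `resolve` would presumably be `mypdf.getObject`, as used at In [8], and the result would be written back with the same constructors that `_rotate` uses in the traceback, that is `mypage[NameObject("/Rotate")] = NumberObject(angle)`. This has not been run against the file in question.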
