
pdf - How do I extract the table of contents with PyPDF2?

Reposted. Author: 行者123. Updated: 2023-12-04 13:22:40

Take this PDF as an example. I can extract the table of contents (TOC) with dumppdf.py -T 1707.09725.pdf:

<outlines>
<outline level="1" title="1 Introduction">
<dest>
<list size="5">
<ref id="513"/>
<literal>XYZ</literal>
<number>99.213</number>
<number>742.911</number>
<null/>
</list>
</dest>
<pageno>14</pageno>
</outline>
<outline level="1" title="2 Convolutional Neural Networks">
<dest>
<list size="5">
<ref id="554"/>
<literal>XYZ</literal>
<number>99.213</number>
<number>742.911</number>
<null/>
</list>
</dest>
<pageno>16</pageno>
</outline>
...

Can I do something similar with PyPDF2?

Best answer

Found it:

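from PyPDF2 import PdfFileReader  # PyPDF2 < 3.0; in 3.x the class is PdfReader

# Open the PDF in binary mode and close the file when done
with open("1707.09725.pdf", "rb") as f:
    reader = PdfFileReader(f)
    # Nested list of outline entries; in PyPDF2 3.x use reader.outline
    print(reader.outlines)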

This gives:

[{'/Title': '1 Introduction', '/Left': 99.213, '/Type': '/XYZ', '/Top': 742.911, '/Zoom': ..., '/Page': IndirectObject(513, 0)},
{'/Title': '2 Convolutional Neural Networks', '/Left': 99.213, '/Type': '/XYZ', '/Top': 742.911, '/Zoom': ..., '/Page': IndirectObject(554, 0)},
[{'/Title': '2.1 Linear Image Filters', '/Left': 99.213, '/Type': '/XYZ', '/Top': 486.791, '/Zoom': ..., '/Page': IndirectObject(554, 0)},
{'/Title': '2.2 CNN Layer Types', '/Left': 70.866, '/Type': '/XYZ', '/Top': 316.852, '/Zoom': ..., '/Page': IndirectObject(580, 0)},
[{'/Title': '2.2.1 Convolutional Layers', '/Left': 99.213, '/Type': '/XYZ', '/Top': 562.722, '/Zoom': ..., '/Page': IndirectObject(608, 0)},
{'/Title': '2.2.2 Pooling Layers', '/Left': 99.213, '/Type': '/XYZ', '/Top': 299.817, '/Zoom': ..., '/Page': IndirectObject(654, 0)},
{'/Title': '2.2.3 Dropout', '/Left': 99.213, '/Type': '/XYZ', '/Top': 742.911, '/Zoom': ..., '/Page': IndirectObject(689, 0)},
{'/Title': '2.2.4 Normalization Layers', '/Left': 99.213, '/Type': '/XYZ', '/Top': 193.779, '/Zoom': <PyPDF2.generic.NullObject object at 0x7fbe49d14350>, '/Page': IndirectObject(689, 0)}]
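
In the output above, the children of an entry appear as a nested list right after that entry. A minimal sketch of turning that nesting into a flat, indented TOC might look like the following. The helper name flatten_outline and the sample data are hypothetical, and the sketch assumes each entry can be indexed by '/Title' like a dict, as the printed output suggests:

```python
def flatten_outline(outline, level=1):
    """Yield (level, title) pairs from a nested outline list.

    Assumption: entries behave like dicts with a '/Title' key, and a
    nested list holds the children of the entry that precedes it.
    """
    for item in outline:
        if isinstance(item, list):
            # Descend one level for the children of the previous entry
            yield from flatten_outline(item, level + 1)
        else:
            yield level, item["/Title"]


# Hypothetical sample mirroring the shape of the printed outline
sample = [
    {"/Title": "1 Introduction"},
    {"/Title": "2 Convolutional Neural Networks"},
    [
        {"/Title": "2.1 Linear Image Filters"},
        {"/Title": "2.2 CNN Layer Types"},
        [{"/Title": "2.2.1 Convolutional Layers"}],
    ],
]

for level, title in flatten_outline(sample):
    # Indent each title by two spaces per nesting level
    print("  " * (level - 1) + title)
```

With a real file, the same function could presumably be called on reader.outlines instead of sample.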

Regarding "pdf - How do I extract the table of contents with PyPDF2?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48157194/
