gpt4 book ai didi

puppeteer - Puppeter 的 page.pdf API 中的页眉和页脚打印是如何工作的?

转载 作者:行者123 更新时间:2023-12-04 16:28:28 39 4
gpt4 key购买 nike

我在尝试使用 headerTemplate 时发现了一些不一致的地方。和 footerTemplate page.pdf 的选项:

  • 页眉和页脚的 DPI 似乎较低(我认为主体为 72 对 96)。所以如果我想匹配边距,我必须按比例缩放。
  • 样式不与主体共享,因此我必须将它们包含在模板中。
  • 如果我尝试使用本地存储的字体,即使我在页眉/页脚模板中包含相同的 CSS,它也适用于主体,但不适用于页眉/页脚。

  • 我怀疑这是因为页眉和页脚被视为单独的文档并分别转换为图像/pdf( https://cs.chromium.org/chromium/src/components/printing/resources/print_header_footer_template_page.html 也意味着类似的东西)。熟悉实现的人可以解释它的实际工作原理吗?谢谢!

    最佳答案

    简短的回答:

    Puppeteer 通过 DevTools Protocol 控制 Chrome 或 Chromium .

    Chrome 使用 Skia用于 PDF 生成。

    Skia 分别处理页眉、对象集和页脚。

    详细答案:

    来自 Puppeteer Documentation :

    page.pdf(options)

    • options <Object> Options object which might have the following properties:
      • headerTemplate <string> HTML template for the print header. Should be valid HTML markup with following classes used to inject printing values into them:
        • date formatted print date
        • title document title
        • url document location
        • pageNumber current page number
        • totalPages total pages in the document
      • footerTemplate <string> HTML template for the print footer. Should use the same format as the headerTemplate.
    • returns: <Promise<Buffer>> Promise which resolves with PDF buffer.

    NOTE Generating a pdf is currently only supported in Chrome headless.





    NOTE headerTemplate and footerTemplate markup have the following limitations:

    1. Script tags inside templates are not evaluated.
    2. Page styles are not visible inside templates.



    我们可以借鉴 Puppeteer source code for page.pdf() 那:
  • Chrome DevTools 协议(protocol)方法Page.printToPDF (连同 headerTemplatefooterTemplate 参数)被发送到 page._client .
  • page._clientpage.target().createCDPSession() 的一个实例(Chrome DevTools 协议(protocol) session )。


  • 来自 Chrome DevTools Protocol Viewer ,我们可以看到 Page.printToPDF包含参数 headerTemplatefooterTemplate :

    Page.printToPDF

    Print page as PDF.

    PARAMETERS

    • headerTemplate string (optional)
      • HTML template for the print header. Should be valid HTML markup with following classes used to inject printing values into them:
        • date: formatted print date
        • title: document title
        • url: document location
        • pageNumber: current page number
        • totalPages: total pages in the document
      • For example, <span class=title></span> would generate span containing the title.
    • footerTemplate string (optional)
      • HTML template for the print footer. Should use the same format as the headerTemplate.

    RETURN OBJECT

    • data string
      • Base64-encoded pdf data.


    Chromium source code for Page.printToPDF 向我们展示:
  • Page.printToPDF参数传递给 sendDevToolsMessage 函数,它发出 DevTools 协议(protocol)命令并返回结果的 promise 。


  • 经过进一步挖掘,我们可以看到 Chromium 有一个具体的 implementation of a class called SkDocument 创建 PDF 文件。

    SkDocument 来自 Skia Graphics Library , 其中 Chromium uses for PDF generation .

    Skia PDF Theory of Operation , 在 PDF Objects and Document Structure部分,指出:

    Background: The PDF file format has a header, a set of objects and then a footer that contains a table of contents for all of the objects in the document (the cross-reference table). The table of contents lists the specific byte position for each object. The objects may have references to other objects and the ASCII size of those references is dependent on the object number assigned to the referenced object; therefore we can’t calculate the table of contents until the size of objects is known, which requires assignment of object numbers. The document uses SkWStream::bytesWritten() to query the offsets of each object and build the cross-reference table.



    该文件进一步解释:

    The PDF backend requires all indirect objects used in a PDF to be added to the SkPDFObjNumMap of the SkPDFDocument. The catalog is responsible for assigning object numbers and generating the table of contents required at the end of PDF files. In some sense, generating a PDF is a three step process. In the first step all the objects and references among them are created (mostly done by SkPDFDevice). In the second step, SkPDFObjNumMap assigns and remembers object numbers. Finally, in the third step, the header is printed, each object is printed, and then the table of contents and trailer are printed. SkPDFDocument takes care of collecting all the objects from the various SkPDFDevice instances, adding them to an SkPDFObjNumMap, iterating through the objects once to set their file positions, and iterating again to generate the final PDF.

    关于puppeteer - Puppeter 的 page.pdf API 中的页眉和页脚打印是如何工作的?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51458286/

    39 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com