
python - PDF bleed detection

Reposted. Author: 太空宇宙. Last updated: 2023-11-03 12:13:48

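I am currently writing a small tool (Python + pyPdf) to test PDFs for printer conformity.

Alas, I am already confused by the first task: detecting whether the PDF has at least 3 mm of "bleed" (a border around the pages where nothing is printed). I have already learned that I cannot detect the bleed for the document as a whole, since there does not seem to be a global one. On individual pages, however, I can find a total of five different boxes: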

  • mediaBox
  • bleedBox
  • trimBox
  • cropBox
  • artBox

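I read the pyPdf documentation on those boxes, but the only one I understood is mediaBox, which seems to represent the size of the whole page (i.e. the paper).

The bleedBox pretty obviously should define the bleed, but that does not always seem to be the case.

Another thing I noticed is that in an example PDF, all of these boxes have exactly the same size on every page (which would mean there is no bleed at all), yet when I open it there is plenty of bleed; this makes me think that the individual text elements have their own offsets.

So, obviously, just calculating the bleed from the difference between mediaBox and bleedBox is not a viable option.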

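I would be more than delighted if anyone could shed some light on what these boxes actually mean and what conclusions I can draw from them (e.g. is one box always smaller than another).

Bonus question: can anyone tell me what exactly the "default user space unit" mentioned in the documentation is? I am pretty sure it refers to mm on my machine, but I would like to enforce mm everywhere.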

Best answer

Quoting the PDF specification ISO 32000-1:2008 as published by Adobe:

14.11.2 Page Boundaries

14.11.2.1 General

A PDF page may be prepared either for a finished medium, such as a sheet of paper, or as part of a prepress process in which the content of the page is placed on an intermediate medium, such as film or an imposed reproduction plate. In the latter case, it is important to distinguish between the intermediate page and the finished page. The intermediate page may often include additional production-related content, such as bleeds or printer marks, that falls outside the boundaries of the finished page. To handle such cases, a PDF page may define as many as five separate boundaries to control various aspects of the imaging process:

  • The media box defines the boundaries of the physical medium on which the page is to be printed. It may include any extended area surrounding the finished page for bleed, printing marks, or other such purposes. It may also include areas close to the edges of the medium that cannot be marked because of physical limitations of the output device. Content falling outside this boundary may safely be discarded without affecting the meaning of the PDF file.

  • The crop box defines the region to which the contents of the page shall be clipped (cropped) when displayed or printed. Unlike the other boxes, the crop box has no defined meaning in terms of physical page geometry or intended use; it merely imposes clipping on the page contents. However, in the absence of additional information (such as imposition instructions specified in a JDF or PJTF job ticket), the crop box determines how the page’s contents shall be positioned on the output medium. The default value is the page’s media box.

  • The bleed box (PDF 1.3) defines the region to which the contents of the page shall be clipped when output in a production environment. This may include any extra bleed area needed to accommodate the physical limitations of cutting, folding, and trimming equipment. The actual printed page may include printing marks that fall outside the bleed box. The default value is the page’s crop box.

  • The trim box (PDF 1.3) defines the intended dimensions of the finished page after trimming. It may be smaller than the media box to allow for production-related content, such as printing instructions, cut marks, or colour bars. The default value is the page’s crop box.

  • The art box (PDF 1.3) defines the extent of the page’s meaningful content (including potential white space) as intended by the page’s creator. The default value is the page’s crop box.

The page object dictionary specifies these boundaries in the MediaBox, CropBox, BleedBox, TrimBox, and ArtBox entries, respectively (see Table 30). All of them are rectangles expressed in default user space units. The crop, bleed, trim, and art boxes shall not ordinarily extend beyond the boundaries of the media box. If they do, they are effectively reduced to their intersection with the media box. Figure 86 illustrates the relationships among these boundaries. (The crop box is not shown in the figure because it has no defined relationship with any of the other boundaries.)
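As a rough illustration of the defaults and the intersection rule quoted above, here is a minimal Python sketch. It deliberately does not use pyPdf: `page` is a hypothetical plain dict standing in for a page dictionary, rectangles are assumed to be already normalized as (lower-left x, lower-left y, upper-right x, upper-right y), and inheritance of entries from parent page tree nodes is ignored. The helper names (`intersect`, `effective_boxes`, `bleed_margins`) are made up for this example. Measuring the bleed as the distance between the trim box and the bleed box is one common reading, not something the quoted text mandates.

```python
from typing import Dict, Tuple

# (llx, lly, urx, ury) in default user space units; assumed normalized.
Rect = Tuple[float, float, float, float]


def intersect(box: Rect, media: Rect) -> Rect:
    """Reduce a box to its intersection with the media box."""
    return (
        max(box[0], media[0]),
        max(box[1], media[1]),
        min(box[2], media[2]),
        min(box[3], media[3]),
    )


def effective_boxes(page: Dict[str, Rect]) -> Dict[str, Rect]:
    """Resolve all five boxes using the defaults from the quoted spec text.

    `page` is a hypothetical stand-in for a page dictionary; only
    "MediaBox" is required.
    """
    media = page["MediaBox"]
    # The crop box defaults to the media box.
    crop = intersect(page.get("CropBox", media), media)
    resolved = {"MediaBox": media, "CropBox": crop}
    # Bleed, trim, and art boxes default to the crop box.
    for name in ("BleedBox", "TrimBox", "ArtBox"):
        resolved[name] = intersect(page.get(name, crop), media)
    return resolved


def bleed_margins(page: Dict[str, Rect]) -> Tuple[float, float, float, float]:
    """Distance from trim box to bleed box per side: (left, bottom, right, top)."""
    boxes = effective_boxes(page)
    bleed, trim = boxes["BleedBox"], boxes["TrimBox"]
    return (
        trim[0] - bleed[0],
        trim[1] - bleed[1],
        bleed[2] - trim[2],
        bleed[3] - trim[3],
    )
```

If a page only sets MediaBox, every resolved box equals the media box and all four margins come out as zero. That matches the observation in the question: the boxes then say nothing about how much bleed the content actually has.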

Here is a nice graphic showing how these boxes relate to one another:

PDF boxes illustrated

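The reasons why in many cases only the media box is set are that

  1. for PDFs intended for electronic consumption (i.e. reading on a computer), the other boxes hardly matter; and

  2. even in a prepress context they are no longer as necessary as they used to be, cf. the article Pedro mentioned in his comment.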

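Regarding your "bonus question": the user space unit defaults to 1/72 inch; since PDF 1.6, however, it can be changed to any (not necessarily integer) multiple of that size using the UserUnit entry in the page dictionary. Changing it in an existing PDF essentially scales the page, because the user space unit is the basic unit of the page's device-independent coordinate system. So unless you want to update every command in the page description that references coordinates in order to preserve the page dimensions, you will not want to enforce a millimetre user space unit... ;)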
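Rather than changing the unit inside the PDF, it is usually easier to convert box coordinates to millimetres in the checking tool. The following is a small sketch of that arithmetic, assuming the default unit of 1/72 inch multiplied by the page's UserUnit value (1.0 when the entry is absent). The function names are made up for this example and are not part of pyPdf.

```python
POINTS_PER_INCH = 72.0  # the default user space unit is 1/72 inch
MM_PER_INCH = 25.4


def user_units_to_mm(value: float, user_unit: float = 1.0) -> float:
    """Convert a length in user space units to millimetres.

    `user_unit` is the page's UserUnit entry (a multiple of 1/72 inch).
    """
    return value * user_unit / POINTS_PER_INCH * MM_PER_INCH


def mm_to_user_units(mm: float, user_unit: float = 1.0) -> float:
    """Convert a length in millimetres to user space units."""
    return mm / MM_PER_INCH * POINTS_PER_INCH / user_unit


# A 3 mm bleed requirement expressed in default user space units.
MIN_BLEED_UNITS = mm_to_user_units(3.0)
```

With the default UserUnit, 3 mm corresponds to roughly 8.5 user space units, so a 3 mm check can be done either by converting the measured margins to mm or by comparing them against that threshold.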

Regarding python - PDF bleed detection, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/13236370/
