
javascript - Is there a minimalistic PDF.js sample that supports text selection?

Reposted. Author: 行者123. Updated: 2023-12-03 02:06:38

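I'm trying out PDF.js.

My problem is that the Hello World demo does not support text selection. It draws everything onto a canvas, with no text layer. The official PDF.js demo does support text selection, but its code is too complex. I was wondering if anybody has a minimalistic demo with a text layer.

Best answer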

I have committed the example to Mozilla's pdf.js repository and it is available under the examples directory.

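The original example that I committed to pdf.js no longer exists, but I believe this example showcases text selection. The pdf.js code has since been cleaned up and reorganized, so the text-selection logic is now encapsulated inside the text layer, which can be created using a factory.

Specifically, PDFJS.DefaultTextLayerFactory takes care of setting up the basic text-selection machinery.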
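
As a rough illustration of the factory approach, the sketch below follows the pattern of the page-viewer component example from the era when pdf.js exposed a `PDFJS` global. It assumes the viewer components build is loaded so that `PDFJS.PDFPageView` and `PDFJS.DefaultTextLayerFactory` exist. The helper name `drawPageWithTextLayer` and its parameters are made up for this example, and other pdf.js releases may name or structure these APIs differently.

```javascript
// Hypothetical helper (not part of pdf.js): draws one page with a selectable
// text layer by handing a text-layer factory to the page view.
// `PDFJS` is passed in explicitly so the dependency on the legacy global is visible.
function drawPageWithTextLayer(PDFJS, container, pdfPage, pageNumber, scale) {
  var pageView = new PDFJS.PDFPageView({
    container: container,                        // element that will hold the page
    id: pageNumber,                              // 1-based page number
    scale: scale,
    defaultViewport: pdfPage.getViewport(scale), // legacy getViewport(scale) signature
    // The factory creates the text layer that makes the text selectable.
    textLayerFactory: new PDFJS.DefaultTextLayerFactory()
  });

  // Associate the loaded page with the view, then draw it.
  pageView.setPdfPage(pdfPage);
  return pageView.draw();
}
```

With a document loaded through `PDFJS.getDocument(...)`, this could be called from the `getPage(1)` callback with the container element. The viewer stylesheet still has to be present so that the text layer divs are transparent and positioned over the canvas.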


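The following example is outdated; I am leaving it here for historical reasons only.

I struggled with this problem for 2-3 days, but I finally figured it out. Here is a fiddle that shows you how to load a PDF with text selection enabled.

The difficulty was that the text-selection logic is intertwined with the viewer code (viewer.js, viewer.html, viewer.css). I had to extract the relevant code and CSS to get this working (the extracted JavaScript file is referenced in the fiddle; you can also check it out here). The end result is a minimal demo that should prove helpful. To implement selection properly, the CSS in viewer.css is also extremely important, because it styles the divs that are eventually created and that are what make text selection work.

The heavy lifting is done by the TextLayerBuilder object, which actually handles the creation of the selection divs. You can see calls to this object in viewer.js.

Anyway, here is the code, including the CSS. Keep in mind that you will still need the pdf.js file. My fiddle links to a version that I built from Mozilla's GitHub repository for pdf.js. I did not want to link to the repository's version directly, since it is under constant development and might be broken at any given time.

Without further ado: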
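
To give a feel for what creating a selection div involves, here is a simplified sketch of how one piece of text could be mapped from PDF coordinates to a CSS position over the canvas. This is not the actual TextLayerBuilder code. The helper names `multiplyTransforms` and `textItemToCss` are invented for this example, the text item is assumed to carry a six-element `transform` matrix, and rotation, font ascent and width scaling are ignored.

```javascript
// Multiply two 2D affine matrices, each given as [a, b, c, d, e, f].
function multiplyTransforms(m1, m2) {
  return [
    m1[0] * m2[0] + m1[2] * m2[1],
    m1[1] * m2[0] + m1[3] * m2[1],
    m1[0] * m2[2] + m1[2] * m2[3],
    m1[1] * m2[2] + m1[3] * m2[3],
    m1[0] * m2[4] + m1[2] * m2[5] + m1[4],
    m1[1] * m2[4] + m1[3] * m2[5] + m1[5]
  ];
}

// Hypothetical helper: compute the CSS placement for one text item.
// `viewportTransform` maps PDF user space (origin bottom-left) to canvas
// pixels (origin top-left); `item` is assumed to look like
// { str: "...", transform: [a, b, c, d, e, f] }.
function textItemToCss(viewportTransform, item) {
  var tx = multiplyTransforms(viewportTransform, item.transform);
  var fontHeight = Math.sqrt(tx[2] * tx[2] + tx[3] * tx[3]);
  return {
    left: tx[4] + "px",
    top: (tx[5] - fontHeight) + "px", // crude: treats the full font height as the ascent
    fontSize: fontHeight + "px"
  };
}
```

Each resulting style would be applied to an absolutely positioned div containing the item's string. With the `.textLayer > div` rule setting `color: transparent`, the divs stay invisible while the browser's native selection highlights them on top of the canvas.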

HTML:

<html>
    <head>
        <title>Minimal pdf.js text-selection demo</title>
    </head>

    <body>
        <div id="pdfContainer" class="pdf-content">
        </div>
    </body>
</html>

CSS:

.pdf-content {
    border: 1px solid #000000;
}

/* CSS classes used by TextLayerBuilder to style the text layer divs */

/* This stuff is important! Otherwise when you select the text, the text in the divs will show up! */
::selection { background:rgba(0,0,255,0.3); }
::-moz-selection { background:rgba(0,0,255,0.3); }

.textLayer {
    position: absolute;
    left: 0;
    top: 0;
    right: 0;
    bottom: 0;
    color: #000;
    font-family: sans-serif;
    overflow: hidden;
}

.textLayer > div {
    color: transparent;
    position: absolute;
    line-height: 1;
    white-space: pre;
    cursor: text;
}

.textLayer .highlight {
    margin: -1px;
    padding: 1px;

    background-color: rgba(180, 0, 170, 0.2);
    border-radius: 4px;
}

.textLayer .highlight.begin {
    border-radius: 4px 0px 0px 4px;
}

.textLayer .highlight.end {
    border-radius: 0px 4px 4px 0px;
}

.textLayer .highlight.middle {
    border-radius: 0px;
}

.textLayer .highlight.selected {
    background-color: rgba(0, 100, 0, 0.2);
}

JavaScript:

//Minimal PDF rendering and text-selection example using pdf.js by Vivin Suresh Paliath (http://vivin.net)
//This fiddle uses a built version of pdf.js that contains all modules that it requires.
//
//For demonstration purposes, the PDF data is not going to be obtained from an outside source. I will be
//storing it in a variable. Mozilla's viewer does support PDF uploads but I haven't really gone through
//that code. There are other ways to upload PDF data. For instance, I have a Spring app that accepts a
//PDF for upload and then communicates the binary data back to the page as base64. I then convert this
//into a Uint8Array manually. I will be demonstrating the same technique here. What matters most here is
//how we render the PDF with text-selection enabled. The source of the PDF is not important; just assume
//that we have the data as base64.
//
//The problem with understanding text selection was that the text selection code is heavily intertwined
//with viewer.html and viewer.js. I have extracted the parts I need out of viewer.js into a separate file
//which contains the bare minimum required to implement text selection. The key component is TextLayerBuilder,
//which is the object that handles the creation of text-selection divs. I have added this code as an external
//resource.
//
//This demo uses a PDF that only has one page. You can render other pages if you wish, but the focus here is
//just to show you how you can render a PDF with text selection. Hence the code only loads up one page.
//
//The CSS used here is also very important since it styles the text-layer div overlays that
//you actually end up selecting.
//
//For reference, the actual PDF document that is rendered is available at:
//http://vivin.net/pub/pdfjs/TestDocument.pdf

var pdfBase64 = "..."; //should contain base64 representing the PDF

var scale = 1; //Set this to whatever you want. This is basically the "zoom" factor for the PDF.

/**
 * Converts a base64 string into a Uint8Array
 */
function base64ToUint8Array(base64) {
    var raw = atob(base64); //This is a native function that decodes a base64-encoded string.
    var uint8Array = new Uint8Array(raw.length);
    for (var i = 0; i < raw.length; i++) {
        uint8Array[i] = raw.charCodeAt(i);
    }

    return uint8Array;
}

function loadPdf(pdfData) {
    PDFJS.disableWorker = true; //Not using web workers. Not disabling results in an error. This line is
                                //missing in the example code for rendering a pdf.

    var pdf = PDFJS.getDocument(pdfData);
    pdf.then(renderPdf);
}

function renderPdf(pdf) {
    pdf.getPage(1).then(renderPage);
}

function renderPage(page) {
    var viewport = page.getViewport(scale);
    var $canvas = jQuery("<canvas></canvas>");

    //Set the canvas height and width to the height and width of the viewport
    var canvas = $canvas.get(0);
    var context = canvas.getContext("2d");
    canvas.height = viewport.height;
    canvas.width = viewport.width;

    //Append the canvas to the pdf container div
    jQuery("#pdfContainer").append($canvas);

    //Create the text layer div first, because the HiDPI block below refers to it
    var canvasOffset = $canvas.offset();
    var $textLayerDiv = jQuery("<div />")
        .addClass("textLayer")
        .css("height", viewport.height + "px")
        .css("width", viewport.width + "px")
        .offset({
            top: canvasOffset.top,
            left: canvasOffset.left
        });

    jQuery("#pdfContainer").append($textLayerDiv);

    //The following few lines of code set up scaling on the context if we are on a HiDPI display.
    //getOutputScale and CustomStyle come from the external file extracted from viewer.js.
    var outputScale = getOutputScale();
    if (outputScale.scaled) {
        var cssScale = 'scale(' + (1 / outputScale.sx) + ', ' +
            (1 / outputScale.sy) + ')';
        CustomStyle.setProp('transform', canvas, cssScale);
        CustomStyle.setProp('transformOrigin', canvas, '0% 0%');

        if ($textLayerDiv.get(0)) {
            CustomStyle.setProp('transform', $textLayerDiv.get(0), cssScale);
            CustomStyle.setProp('transformOrigin', $textLayerDiv.get(0), '0% 0%');
        }
    }

    context._scaleX = outputScale.sx;
    context._scaleY = outputScale.sy;
    if (outputScale.scaled) {
        context.scale(outputScale.sx, outputScale.sy);
    }

    page.getTextContent().then(function(textContent) {
        //The second argument (0) is the zero-based index identifying the page, i.e. page.number - 1.
        var textLayer = new TextLayerBuilder($textLayerDiv.get(0), 0);
        textLayer.setTextContent(textContent);

        var renderContext = {
            canvasContext: context,
            viewport: viewport,
            textLayer: textLayer
        };

        page.render(renderContext);
    });
}

var pdfData = base64ToUint8Array(pdfBase64);
loadPdf(pdfData);

Regarding "javascript - Is there a minimalistic PDF.js sample that supports text selection?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/16775907/
