
php - Reading an MS Word document with PHPWord

Reposted. Author: 行者123. Updated: 2023-12-02 20:22:31

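I have installed and set up PHPWord in PhpStorm (IDE). I am trying to use PHPWord to read the line "Learn from yesterday, live for today, hope for tomorrow..." from the Word document below, titled "helloWorld.docx".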

[Image: screenshot of helloWorld.docx]

Here is the code I have so far to load and read the document:

<?php

require_once 'PHPWord/bootstrap.php';

$objReader = \PhpOffice\PhpWord\IOFactory::createReader("Word2007");
$phpWord = $objReader->load("helloWorld.docx");

$sections = $phpWord->getSection(0);

var_dump($sections);

Output:

/usr/bin/php7.2 /home/wade/PhpstormProjects/getWord/readDoc.php
object(PhpOffice\PhpWord\Element\Section)#21 (21) {

["container":protected]=>
string(7) "Section"
["style":"PhpOffice\PhpWord\Element\Section":private]=>
object(PhpOffice\PhpWord\Style\Section)#22 (32) {
["orientation":"PhpOffice\PhpWord\Style\Section":private]=>
string(8) "portrait"
["paper":"PhpOffice\PhpWord\Style\Section":private]=>
object(PhpOffice\PhpWord\Style\Paper)#14 (8) {
["sizes":"PhpOffice\PhpWord\Style\Paper":private]=>
array(7) {
["A3"]=>
array(3) {
[0]=>
int(297)
[1]=>
int(420)
[2]=>
string(2) "mm"
}
["A4"]=>
array(3) {
[0]=>
int(210)
[1]=>
int(297)
[2]=>
string(2) "mm"
}
["A5"]=>
array(3) {
[0]=>
int(148)
[1]=>
int(210)
[2]=>
string(2) "mm"
}
["B5"]=>
array(3) {
[0]=>
int(176)
[1]=>
int(250)
[2]=>
string(2) "mm"
}
["Folio"]=>
array(3) {
[0]=>
float(8.5)
[1]=>
int(13)
[2]=>
string(2) "in"
}
["Legal"]=>
array(3) {
[0]=>
float(8.5)
[1]=>
int(14)
[2]=>
string(2) "in"
}
["Letter"]=>
array(3) {
[0]=>
float(8.5)
[1]=>
int(11)
[2]=>
string(2) "in"
}
}
["size":"PhpOffice\PhpWord\Style\Paper":private]=>
string(2) "A4"
["width":"PhpOffice\PhpWord\Style\Paper":private]=>
float(11905.511811024)
["height":"PhpOffice\PhpWord\Style\Paper":private]=>
float(16837.795275591)
["styleName":protected]=>
NULL
["index":protected]=>
NULL
["aliases":protected]=>
array(0) {
}
["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>
bool(false)
}
["pageSizeW":"PhpOffice\PhpWord\Style\Section":private]=>
string(15) "11905.511811024"
["pageSizeH":"PhpOffice\PhpWord\Style\Section":private]=>
string(15) "16837.795275591"
["marginTop":"PhpOffice\PhpWord\Style\Section":private]=>
string(4) "1440"
["marginLeft":"PhpOffice\PhpWord\Style\Section":private]=>
string(4) "1440"
["marginRight":"PhpOffice\PhpWord\Style\Section":private]=>
string(4) "1440"
["marginBottom":"PhpOffice\PhpWord\Style\Section":private]=>
string(4) "1440"
["gutter":"PhpOffice\PhpWord\Style\Section":private]=>
string(1) "0"
["headerHeight":"PhpOffice\PhpWord\Style\Section":private]=>
string(3) "720"
["footerHeight":"PhpOffice\PhpWord\Style\Section":private]=>
string(3) "720"
["pageNumberingStart":"PhpOffice\PhpWord\Style\Section":private]=>
NULL
["colsNum":"PhpOffice\PhpWord\Style\Section":private]=>
int(1)
["colsSpace":"PhpOffice\PhpWord\Style\Section":private]=>
string(3) "720"
["breakType":"PhpOffice\PhpWord\Style\Section":private]=>
NULL
["lineNumbering":"PhpOffice\PhpWord\Style\Section":private]=>
NULL
["borderTopSize":protected]=>
NULL
["borderTopColor":protected]=>
NULL
["borderTopStyle":protected]=>
NULL
["borderLeftSize":protected]=>
NULL
["borderLeftColor":protected]=>
NULL
["borderLeftStyle":protected]=>
NULL
["borderRightSize":protected]=>
NULL
["borderRightColor":protected]=>
NULL
["borderRightStyle":protected]=>
NULL
["borderBottomSize":protected]=>
NULL
["borderBottomColor":protected]=>
NULL
["borderBottomStyle":protected]=>
NULL
["styleName":protected]=>
NULL
["index":protected]=>
NULL
["aliases":protected]=>
array(0) {
}
["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>
bool(false)
}
["headers":"PhpOffice\PhpWord\Element\Section":private]=>
array(0) {
}
["footers":"PhpOffice\PhpWord\Element\Section":private]=>
array(0) {
}
["footnoteProperties":"PhpOffice\PhpWord\Element\Section":private]=>
NULL
["elements":protected]=>
array(4) {
[0]=>
object(PhpOffice\PhpWord\Element\TextRun)#34 (18) {
["container":protected]=>
string(7) "TextRun"
["paragraphStyle":protected]=>
object(PhpOffice\PhpWord\Style\Paragraph)#35 (34) {
["aliases":protected]=>
array(1) {
["line-height"]=>
string(10) "lineHeight"
}
["basedOn":"PhpOffice\PhpWord\Style\Paragraph":private]=>
string(6) "Normal"
["next":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["alignment":"PhpOffice\PhpWord\Style\Paragraph":private]=>
string(0) ""
["indentation":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["spacing":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["lineHeight":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["widowControl":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(true)
["keepNext":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["keepLines":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["pageBreakBefore":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["numStyle":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["numLevel":"PhpOffice\PhpWord\Style\Paragraph":private]=>
int(0)
["tabs":"PhpOffice\PhpWord\Style\Paragraph":private]=>
array(0) {
}
["shading":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["contextualSpacing":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["bidi":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["textAlignment":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["suppressAutoHyphens":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["borderTopSize":protected]=>
NULL
["borderTopColor":protected]=>
NULL
["borderTopStyle":protected]=>
NULL
["borderLeftSize":protected]=>
NULL
["borderLeftColor":protected]=>
NULL
["borderLeftStyle":protected]=>
NULL
["borderRightSize":protected]=>
NULL
["borderRightColor":protected]=>
NULL
["borderRightStyle":protected]=>
NULL
["borderBottomSize":protected]=>
NULL
["borderBottomColor":protected]=>
NULL
["borderBottomStyle":protected]=>
NULL
["styleName":protected]=>
NULL
["index":protected]=>
NULL
["isAuto":"PhpOffice\PhpWord\Style\AbstractStyle":private]=>
bool(false)
}
["elements":protected]=>
array(1) {
[0]=>
object(PhpOffice\PhpWord\Element\Text)#41 (18) {
["text":protected]=>
string(134) "&quot;Learn from yesterday, live for today, hope for tomorrow. The important thing is not to stop questioning.&quot; (Albert Einstein)"
["fontStyle":protected]=>
object(PhpOffice\PhpWord\Style\Font)#43 (28) {
["aliases":protected]=>
array(1) {
["line-height"]=>
string(10) "lineHeight"
}
["type":"PhpOffice\PhpWord\Style\Font":private]=>
string(4) "text"
["name":"PhpOffice\PhpWord\Style\Font":private]=>
string(15) "Times New Roman"
["hint":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["size":"PhpOffice\PhpWord\Style\Font":private]=>
int(20)
["color":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["bold":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["italic":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["underline":"PhpOffice\PhpWord\Style\Font":private]=>
string(4) "none"
["superScript":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["subScript":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["strikethrough":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["doubleStrikethrough":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["smallCaps":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["allCaps":"PhpOffice\PhpWord\Style\Font":private]=>
bool(false)
["fgColor":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["scale":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["spacing":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["kerning":"PhpOffice\PhpWord\Style\Font":private]=>
NULL
["paragraph":"PhpOffice\PhpWord\Style\Font":private]=>
object(PhpOffice\PhpWord\Style\Paragraph)#42 (34) {
["aliases":protected]=>
array(1) {
["line-height"]=>
string(10) "lineHeight"
}
["basedOn":"PhpOffice\PhpWord\Style\Paragraph":private]=>
string(6) "Normal"
["next":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["alignment":"PhpOffice\PhpWord\Style\Paragraph":private]=>
string(0) ""
["indentation":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["spacing":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["lineHeight":"PhpOffice\PhpWord\Style\Paragraph":private]=>
NULL
["widowControl":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(true)
["keepNext":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["keepLines":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)
["pageBreakBefore":"PhpOffice\PhpWord\Style\Paragraph":private]=>
bool(false)

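The full output is too long to post, but if you scroll down a bit you can see the string I am looking for in this dump.

My main question is: is there a way to find this string without using var_dump and searching through a large amount of output?

Best answer

Here is sample code that retrieves the text content from a docx file.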

<?php

$content = '';

require_once __DIR__ . '/includes/phpoffice/vendor/autoload.php';

// File names are case-sensitive on Linux, so this must match the document on disk.
$phpWord = \PhpOffice\PhpWord\IOFactory::load('helloWorld.docx');

foreach ($phpWord->getSections() as $section) {
    foreach ($section->getElements() as $element) {
        if (method_exists($element, 'getElements')) {
            // Container element (e.g. TextRun): collect text from its children.
            foreach ($element->getElements() as $childElement) {
                if (method_exists($childElement, 'getText')) {
                    $content .= $childElement->getText() . ' ';
                } elseif (method_exists($childElement, 'getContent')) {
                    $content .= $childElement->getContent() . ' ';
                }
            }
        } elseif (method_exists($element, 'getText')) {
            // Text element placed directly inside the section.
            $content .= $element->getText() . ' ';
        }
    }
}

echo $content;
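
The loop above descends only two levels (section, then element, then child element), so text nested more deeply, for example inside table cells, would be skipped. A minimal sketch of a recursive variant follows. The helper name extractText is hypothetical and not part of PHPWord, and the sketch assumes that containers expose getElements(), tables expose getRows(), and rows expose getCells().

```php
<?php

// Hypothetical helper, not part of PHPWord: gathers text from an element
// and, recursively, from everything nested inside it.
function extractText($element): string
{
    // Containers (Section, TextRun, table Cell, ...) expose getElements().
    // Assumed here: tables expose getRows() and rows expose getCells().
    foreach (['getElements', 'getRows', 'getCells'] as $childGetter) {
        if (method_exists($element, $childGetter)) {
            $text = '';
            foreach ($element->$childGetter() as $child) {
                $text .= extractText($child);
            }
            return $text;
        }
    }

    // Leaf elements such as Text expose getText().
    if (method_exists($element, 'getText')) {
        $value = $element->getText();
        if (is_string($value)) {
            return $value . ' ';
        }
        // Assumption: some element types may return a nested element
        // rather than a string, in which case we descend into it.
        if (is_object($value)) {
            return extractText($value);
        }
    }

    // Elements without text (breaks, images, ...) contribute nothing.
    return '';
}

// Usage with PHPWord, assuming the autoloader is set up as in the answer:
// $phpWord = \PhpOffice\PhpWord\IOFactory::load('helloWorld.docx');
// foreach ($phpWord->getSections() as $section) {
//     echo extractText($section);
// }
```

Because the var_dump output in the question shows the stored text containing &quot; entities, it may also be worth passing the result through PHP's html_entity_decode() before displaying it.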

Regarding php - Reading an MS Word document with PHPWord, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/50994146/
