I'm trying to generate a txt file from a docx using the following code:
from subprocess import Popen, PIPE
from docx import opendocx, getdocumenttext
from pdfminer.pdfinterp import PDFResourceManager, PDFPageInterpreter
from pdfminer.converter import TextConverter
from pdfminer.layout import LAParams
from pdfminer.pdfpage import PDFPage
from cStringIO import StringIO
def convert_pdf_to_txt(path):
    ...
def document_to_text(filename, file_path):
    ...
    elif filename[-5:] == ".docx":
        document = opendocx(file_path)
        paratextlist = getdocumenttext(document)
        newparatextlist = []
        for paratext in paratextlist:
            newparatextlist.append(paratext.encode("utf-8"))
        return '\n\n'.join(newparatextlist)
    elif filename[-4:] == ".odt":
        ...
    elif filename[-4:] == ".pdf":
        ...
# Raw string so that backslash sequences such as \1 are not treated as escapes
document_to_text('1.docx', r'D:\Nucho\Python\AntiPlagiat\1.docx')
However, all I get is: ImportError: cannot import name opendocx
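For reference, here is a minimal sketch of one way to get the text without relying on that import at all. It is not the python-docx API: it assumes the file is a regular .docx, which is a ZIP archive whose body is stored in word/document.xml, and reads it with the standard library only. The helper name docx_to_text is made up for illustration.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(file_path):
    """Hypothetical helper: return the paragraph text of a .docx file.

    Paragraphs are joined with a blank line, like the '\\n\\n'.join in the
    question. Nested paragraphs (e.g. inside text boxes) are not handled
    specially in this sketch.
    """
    with zipfile.ZipFile(file_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))

    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # The text of one paragraph is split across several <w:t> runs.
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        if text:  # skip empty paragraphs
            paragraphs.append(text)
    return "\n\n".join(paragraphs)
```

This could stand in for the opendocx/getdocumenttext pair in the ".docx" branch, e.g. `return docx_to_text(file_path)`, though it returns text rather than UTF-8 encoded bytes.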