python - Converting a DOCX file to a text file with Python

Reposted. Author: 太空宇宙. Updated: 2023-11-04 00:10:17

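I wrote the following code to convert my docx files to text files. The output written to the text file is only the last paragraph of the document, not its complete content. The code is as follows: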

from docx import Document
import io
import os
import shutil

def convertDocxToText(path):
    for d in os.listdir(path):
        fileExtension=d.split(".")[-1]
        if fileExtension =="docx":
            docxFilename = path + d
            print(docxFilename)
            document = Document(docxFilename)

            # for printing the complete document
            print('\nThe whole content of the document:->>>\n')
            for para in document.paragraphs:
                textFilename = path + d.split(".")[0] + ".txt"
                with io.open(textFilename,"w", encoding="utf-8") as textFile:
                    #textFile.write(unicode(para.text))
                    x=unicode(para.text)
                    print(x)  # the complete content gets printed by this line
                    textFile.write((x))  # after writing the content to text file only last paragraph is copied.
                    #textFile.write(para.text)

path= "/home/python/resumes/"
convertDocxToText(path)

Best answer

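The code in the question reopens the text file in "w" mode for every paragraph, so each paragraph overwrites the previous one and only the last one remains in the file. Opening the file once, before the paragraph loop, solves the problem: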

from docx import Document
import io
import os

def convertDocxToText(path):
    for d in os.listdir(path):
        fileExtension=d.split(".")[-1]
        if fileExtension =="docx":
            docxFilename = path + d
            print(docxFilename)
            document = Document(docxFilename)
            textFilename = path + d.split(".")[0] + ".txt"
            # Open the output file once per document, outside the paragraph loop,
            # so that "w" mode truncates it only once.
            with io.open(textFilename,"w", encoding="utf-8") as textFile:
                for para in document.paragraphs:
                    # para.text is already a text string, so no unicode() call is needed
                    textFile.write(para.text)

path= "/home/python/resumes/"
convertDocxToText(path)
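
A Python 3 variant of the same idea might look like the sketch below. It is only an illustration under a few assumptions that are not part of the answer: the helper names write_paragraphs and convert_docx_dir are hypothetical, each paragraph is written on its own line, and python-docx is imported inside the function so that the file-writing part can be tried without it installed.

```python
import os


def write_paragraphs(paragraphs, text_filename):
    """Write every paragraph to text_filename, one per line.

    The file is opened once, before the loop, so "w" mode truncates it
    only once and later paragraphs do not overwrite earlier ones.
    """
    with open(text_filename, "w", encoding="utf-8") as text_file:
        for text in paragraphs:
            text_file.write(text + "\n")


def convert_docx_dir(path):
    """Convert each .docx file directly inside path to a .txt file next to it."""
    # Assumption of this sketch: python-docx is imported lazily, so that
    # write_paragraphs can be used on its own without the library.
    from docx import Document

    written = []
    for name in sorted(os.listdir(path)):
        stem, extension = os.path.splitext(name)
        if extension.lower() != ".docx":
            continue
        document = Document(os.path.join(path, name))
        text_filename = os.path.join(path, stem + ".txt")
        write_paragraphs((para.text for para in document.paragraphs), text_filename)
        written.append(text_filename)
    return written
```

It would be called as convert_docx_dir("/home/python/resumes/"). Using os.path.join means the directory does not need a trailing slash, and os.path.splitext keeps a name such as "john.doe.docx" as "john.doe.txt", whereas d.split(".")[0] would shorten it to "john.txt".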

Regarding python - converting a DOCX file to a text file with Python, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/52719258/
