python - Automatically convert docx/pdf files to text files after saving a new model in a Django class-based view

Reposted. Author: 太空宇宙. Updated: 2023-11-03 21:04:24

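I am setting up a Django application for online job applications. My model contains a few character fields and a file field for the résumé. What I want is this: every time a new résumé is saved, it should automatically be converted to .txt format and stored in the media folder. The problem is that the conversion only happens when I restart the server.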

Here is my view:

from django.shortcuts import render
from rest_framework import viewsets, permissions
from rest_framework.parsers import FormParser, MultiPartParser
from .serializers import candidateSerializer
from .models import Candidate
from .conversion import convertPDF, convertDOCX, handle_uploaded_file
#from django.db.models.signals import post_save
#from django.dispatch import receiver
from rest_framework.response import Response
from rest_framework.decorators import action
#from django.core.files import File


class candidateView(viewsets.ModelViewSet):
    permission_classes = [
        permissions.AllowAny,
    ]
    serializer_class = candidateSerializer
    queryset = Candidate.objects.all()
    cv = list(queryset.values('CV'))
    cvName = [el['CV'] for el in cv]
    file = cvName[len(cvName)-1]
    handle_uploaded_file(file)

Here is the handle_uploaded_file function used to convert the uploaded file, together with its helpers:

# Imports assumed for this module; the original post omitted them.
import docx
import docx2txt
import PyPDF2


def handle_uploaded_file(file):
    Dir = 'C:/workspace/backend/media/'
    textDir = 'C:/workspace/backend/media/textResumes/'

    if file.endswith(".pdf"):
        name = file.split(".")[0]
        textfilename = name + '.txt'
        filename = Dir + file
        doc = convertPDF(filename)
        f = open(textDir + textfilename, 'w+', encoding="utf-8")
        for line in doc:
            f.write(line)
        f.close()

    if file.endswith(".DOCX"):
        name = file.split(".")[0]
        textfilename = name + '.txt'
        filename = Dir + file
        doc = docx2txt.process(filename)
        f = open(textDir + textfilename, 'w+', encoding="utf-8")
        for line in doc:
            f.write(line)
        f.close()

    if file.endswith(".docx"):
        name = file.split(".")[0]
        textfilename = name + '.txt'
        filename = Dir + file
        doc = convertDOCX(filename)
        f = open(textDir + textfilename, 'w+', encoding="utf-8")
        for line in doc:
            f.write(line)
        f.close()


def convertPDF(fname):
    # PdfFileReader, numPages, getPage and extractText are the legacy
    # PyPDF2 names used in the question.
    with open(fname, 'rb') as f:
        # Pass the open file object; pages must be read before it closes.
        pdfReader = PyPDF2.PdfFileReader(f)
        content = []
        for i in range(pdfReader.numPages):
            pageObj = pdfReader.getPage(i)
            content.append(pageObj.extractText())
    doc = ''
    for line in content:
        doc = doc + line
    return doc


def convertDOCX(fname):
    doc = docx.Document(fname)
    fullText = []
    for para in doc.paragraphs:
        fullText.append(para.text)
    # Paragraphs are concatenated without separators, as in the original.
    doc = ''
    for line in fullText:
        doc = doc + line
    return doc
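
A side note on the file-name handling above: the three `endswith` checks are case-sensitive (so `.PDF` or `.Docx` would be skipped), and `file.split(".")[0]` truncates names that contain more than one dot. The sketch below shows one way this could be handled with the standard library's `pathlib`. The function name `text_output_name` is hypothetical and not part of the original post.

```python
from pathlib import Path

# Extensions the question's code tries to handle.
SUPPORTED_SUFFIXES = {".pdf", ".docx"}


def text_output_name(file_name):
    """Return the .txt file name for an upload, or None if unsupported."""
    path = Path(file_name)
    # Compare the extension case-insensitively.
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        return None
    # Replace only the last extension, keeping earlier dots in the name.
    return path.with_suffix(".txt").name


print(text_output_name("jane.doe.CV.PDF"))
print(text_output_name("resume.Docx"))
print(text_output_name("photo.png"))
```

With a helper like this, the three nearly identical branches could share one code path that only differs in which converter is called.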

Best answer

A similar question has already been asked. I am also not sure why you are creating the model class in the view file.
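
A likely reason for the restart-only behaviour, offered as a sketch rather than a confirmed diagnosis: in the question, `handle_uploaded_file(file)` sits directly in the class body of `candidateView`, and Python executes class-body statements once, when the module is imported. The example below uses plain stand-in classes to show the difference between a class-body call and a call placed in a per-request hook. `ImportTimeView`, `PerRequestView`, and the recording version of `handle_uploaded_file` are hypothetical and do not use Django.

```python
converted = []  # records every conversion call, for illustration only


def handle_uploaded_file(file_name):
    """Placeholder for the real conversion helper from the question."""
    converted.append(file_name)


class ImportTimeView:
    # Class-body statements run exactly once, when Python first
    # executes the class definition (i.e. when the module is imported).
    handle_uploaded_file("latest_at_startup.pdf")


class PerRequestView:
    def perform_create(self, file_name):
        # Hypothetical stand-in for a create hook: a framework would
        # call this once per request, after the new object is saved.
        handle_uploaded_file(file_name)


# Instantiating the first class does not repeat its class-body call.
ImportTimeView()
ImportTimeView()

# Calling the hook converts each newly saved file.
view = PerRequestView()
view.perform_create("alice.pdf")
view.perform_create("bob.docx")

print(converted)
```

In Django REST framework, the corresponding hook on a `ModelViewSet` is `perform_create(self, serializer)`. Calling `serializer.save()` inside it returns the new instance, whose `CV.name` could then be passed to `handle_uploaded_file`. The commented-out `post_save` imports in the question suggest a signal receiver as another option.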

Regarding "python - Automatically convert docx/pdf files to text files after saving a new model in a Django class-based view", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55520788/
