
python - Wagtail Documents: Large file size (>2GB) upload fails

Reposted. Author: 行者123. Updated: 2023-12-04 17:37:32

I am trying to upload a file using the wagtaildocs app that is built into my Wagtail application. My Ubuntu 16.04 server was set up following the Digital Ocean tutorial for Nginx | Gunicorn | Postgres.

Some initial notes:

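  1. In my Nginx configuration I have set client_max_body_size 10000M;
  2. In my production settings I have the following lines: MAX_UPLOAD_SIZE = "5242880000"
    WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
  3. My file type is .zip
  4. At this point this is a production test. I have only implemented a basic Wagtail application with no additional modules.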

So, as long as my file is under 10 GB, I should be fine from a configuration standpoint, unless I am missing something or am blind to a typo.
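As a quick sanity check on the numbers quoted above, the limits can be converted to a common unit. This is only a sketch: it assumes Nginx's `M` suffix means 1024 * 1024 bytes, and the helper name `effective_limit` is made up for illustration. Note also that `WAGTAILIMAGES_MAX_UPLOAD_SIZE` is the Wagtail setting for image uploads, so it may not influence document uploads at all.

```python
# Convert the limits quoted in the question to bytes and GiB.
MIB = 1024 * 1024
GIB = 1024 * MIB

nginx_limit = 10000 * MIB           # client_max_body_size 10000M (assumed binary megabytes)
settings_limit = int("5242880000")  # MAX_UPLOAD_SIZE from production.py
images_limit = 5000 * 1024 * 1024   # WAGTAILIMAGES_MAX_UPLOAD_SIZE from production.py


def effective_limit(*limits):
    """Hypothetical helper: the smallest configured limit is the one that bites first."""
    return min(limits)


for name, value in [
    ("nginx client_max_body_size", nginx_limit),
    ("MAX_UPLOAD_SIZE", settings_limit),
    ("WAGTAILIMAGES_MAX_UPLOAD_SIZE", images_limit),
]:
    print(f"{name}: {value} bytes = {value / GIB:.2f} GiB")

print("smallest limit:", effective_limit(nginx_limit, settings_limit, images_limit))
```

Under these assumptions the Nginx limit is about 9.77 GiB, while both settings values are about 4.88 GiB, so the 10 GB figure only describes the Nginx side of the configuration.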

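I have already tried raising all the configuration values to unreasonably large numbers. I have also tried other file extensions, but the error does not change.

I think this has to do with a TCP or SSL connection being closed during the session. I have never run into this problem before, so any help would be much appreciated.

Here is my error message: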

Internal Server Error: /admin/documents/multiple/add/
Traceback (most recent call last):
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
psycopg2.DatabaseError: SSL SYSCALL error: Operation timed out


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/exception.py", line 34, in inner
response = get_response(request)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 115, in _get_response
response = self.process_exception_by_middleware(e, request)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/core/handlers/base.py", line 113, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/cache.py", line 44, in _wrapped_view_func
response = view_func(request, *args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/urls/__init__.py", line 102, in wrapper
return view_func(request, *args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/decorators.py", line 34, in decorated_view
return view_func(request, *args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/admin/utils.py", line 151, in wrapped_view_func
return view_func(request, *args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/views/decorators/vary.py", line 20, in inner_func
response = func(*args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/wagtail/documents/views/multiple.py", line 60, in add
doc.save()
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 741, in save
force_update=force_update, update_fields=update_fields)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 779, in save_base
force_update, using, update_fields,
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 870, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/base.py", line 908, in _do_insert
using=using, raw=raw)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/query.py", line 1186, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
cursor.execute(sql, params)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 99, in execute
return super().execute(sql, params)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/Users/wgarlock/Git/wagtaildev/lib/python3.7/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
django.db.utils.DatabaseError: SSL SYSCALL error: Operation timed out
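The traceback ends inside doc.save(), where the INSERT fails with a timed-out database connection, rather than in Nginx rejecting the request body. One possible, unverified mitigation is to enable TCP keepalives on the PostgreSQL connection so that it is less likely to be dropped while idle. The sketch below assumes the standard libpq connection parameters (`keepalives`, `keepalives_idle`, `keepalives_interval`, `keepalives_count`), which Django forwards to psycopg2 through `OPTIONS`; the helper name and the timing values are illustrative only.

```python
def with_keepalives(db_config, idle=30, interval=10, count=5):
    """Return a copy of a Django DATABASES entry with libpq keepalive options.

    Hypothetical helper; the timing values are placeholders, not recommendations.
    """
    config = dict(db_config)
    options = dict(config.get("OPTIONS", {}))
    options.update({
        "keepalives": 1,                  # turn TCP keepalives on
        "keepalives_idle": idle,          # seconds of inactivity before the first probe
        "keepalives_interval": interval,  # seconds between probes
        "keepalives_count": count,        # failed probes before the connection is dropped
    })
    config["OPTIONS"] = options
    return config


# Example with a stand-in for the 'default' entry of DATABASES.
default_db = {
    "ENGINE": "django.db.backends.postgresql_psycopg2",
    "PORT": "5432",
}
print(with_keepalives(default_db)["OPTIONS"])
```

Whether this helps depends on where the connection is actually being closed, which the traceback alone does not reveal.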

Here are my settings

### base.py ###
import os

PROJECT_DIR = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
BASE_DIR = os.path.dirname(PROJECT_DIR)
SECRET_KEY = os.getenv('SECRET_KEY_WAGTAILDEV')

# Quick-start development settings - unsuitable for production
# See https://docs.djangoproject.com/en/2.2/howto/deployment/checklist/


# Application definition

INSTALLED_APPS = [
'home',
'search',

'wagtail.contrib.forms',
'wagtail.contrib.redirects',
'wagtail.embeds',
'wagtail.sites',
'wagtail.users',
'wagtail.snippets',
'wagtail.documents',
'wagtail.images',
'wagtail.search',
'wagtail.admin',
'wagtail.core',

'modelcluster',
'taggit',

'django.contrib.admin',
'django.contrib.auth',
'django.contrib.contenttypes',
'django.contrib.sessions',
'django.contrib.messages',
'django.contrib.staticfiles',
'storages',
]

MIDDLEWARE = [
'django.contrib.sessions.middleware.SessionMiddleware',
'django.middleware.common.CommonMiddleware',
'django.middleware.csrf.CsrfViewMiddleware',
'django.contrib.auth.middleware.AuthenticationMiddleware',
'django.contrib.messages.middleware.MessageMiddleware',
'django.middleware.clickjacking.XFrameOptionsMiddleware',
'django.middleware.security.SecurityMiddleware',

'wagtail.core.middleware.SiteMiddleware',
'wagtail.contrib.redirects.middleware.RedirectMiddleware',
]

ROOT_URLCONF = 'wagtaildev.urls'

TEMPLATES = [
{
'BACKEND': 'django.template.backends.django.DjangoTemplates',
'DIRS': [
os.path.join(PROJECT_DIR, 'templates'),
],
'APP_DIRS': True,
'OPTIONS': {
'context_processors': [
'django.template.context_processors.debug',
'django.template.context_processors.request',
'django.contrib.auth.context_processors.auth',
'django.contrib.messages.context_processors.messages',
],
},
},
]

WSGI_APPLICATION = 'wagtaildev.wsgi.application'


# Database
# https://docs.djangoproject.com/en/2.2/ref/settings/#databases

DATABASES = {
'default': {
'ENGINE': 'django.db.backends.postgresql_psycopg2',
'HOST': os.getenv('DATABASE_HOST_WAGTAILDEV'),
'USER': os.getenv('DATABASE_USER_WAGTAILDEV'),
'PASSWORD': os.getenv('DATABASE_PASSWORD_WAGTAILDEV') ,
'NAME': os.getenv('DATABASE_NAME_WAGTAILDEV'),
'PORT': '5432',
}
}


# Password validation
# https://docs.djangoproject.com/en/2.2/ref/settings/#auth-password-validators

AUTH_PASSWORD_VALIDATORS = [
{
'NAME': 'django.contrib.auth.password_validation.UserAttributeSimilarityValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.MinimumLengthValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
},
{
'NAME': 'django.contrib.auth.password_validation.NumericPasswordValidator',
},
]


# Internationalization
# https://docs.djangoproject.com/en/2.2/topics/i18n/

LANGUAGE_CODE = 'en-us'

TIME_ZONE = 'UTC'

USE_I18N = True

USE_L10N = True

USE_TZ = True


# Static files (CSS, JavaScript, Images)
# https://docs.djangoproject.com/en/2.2/howto/static-files/

STATICFILES_FINDERS = [
'django.contrib.staticfiles.finders.FileSystemFinder',
'django.contrib.staticfiles.finders.AppDirectoriesFinder',
]

STATICFILES_DIRS = [
os.path.join(PROJECT_DIR, 'static'),
]

# ManifestStaticFilesStorage is recommended in production, to prevent outdated
# Javascript / CSS assets being served from cache (e.g. after a Wagtail upgrade).
# See https://docs.djangoproject.com/en/2.2/ref/contrib/staticfiles/#manifeststaticfilesstorage
STATICFILES_STORAGE = 'django.contrib.staticfiles.storage.ManifestStaticFilesStorage'

STATIC_ROOT = os.path.join(BASE_DIR, 'static')
STATIC_URL = '/static/'

MEDIA_ROOT = os.path.join(BASE_DIR, 'media')
MEDIA_URL = '/media/'


# Wagtail settings

WAGTAIL_SITE_NAME = "wagtaildev"

# Base URL to use when referring to full URLs within the Wagtail admin backend -
# e.g. in notification emails. Don't include '/admin' or a trailing slash
BASE_URL = 'http://example.com'

### production.py ###

from .base import *

DEBUG = True

ALLOWED_HOSTS = ['wagtaildev.wesgarlock.com', '127.0.0.1','134.209.230.125']

from wagtaildev.aws.conf import *

EMAIL_BACKEND = 'django.core.mail.backends.console.EmailBackend'

MAX_UPLOAD_SIZE = "5242880000"
WAGTAILIMAGES_MAX_UPLOAD_SIZE = 5000 * 1024 * 1024
FILE_UPLOAD_TEMP_DIR = str(os.path.join(BASE_DIR, 'tmp'))

Here is my Nginx configuration

server {
listen 80;

server_name wagtaildev.wesgarlock.com;
client_max_body_size 10000M;

location = /favicon.ico { access_log off; log_not_found off; }

location / {
include proxy_params;
proxy_pass http://unix:/home/wesgarlock/run/wagtaildev.sock;
}
}

Best answer

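I was never able to solve this problem directly, but I did come up with a hack to get around it.

I am not a Wagtail or Django expert, so I am sure there is a proper solution to this, but here is what I did anyway. If you have any suggestions for improvement, feel free to comment.

Note that this is really documentation to remind me of what I did as well. At this point (05-25-19) there are many redundant lines of code, because I Frankensteined a lot of code together. I will edit it over time.

Below are the tutorials that I Frankensteined together to create this solution.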

  1. https://www.codingforentrepreneurs.com/blog/large-file-uploads-with-amazon-s3-django/
  2. http://docs.wagtail.io/en/v2.1.1/advanced_topics/documents/custom_document_model.html
  3. https://docs.aws.amazon.com/AWSJavaScriptSDK/latest/AWS/S3.html
  4. https://medium.com/faun/summary-667d0fdbcdae
  5. http://docs.aws.amazon.com/sdk-for-javascript/v2/developer-guide/loading-browser-credentials-federated-id.html
  6. https://kite.com/python/examples/454/threading-wait-for-a-thread-to-finish
  7. http://docs.celeryproject.org/en/latest/userguide/daemonizing.html#usage-systemd

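There may be a few others, but these are the principal ones.

Okay, here we go.

I created an app named "files" and then a custom document model in its models.py file. You need to specify WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument' in your settings file. The only reason I did this was to track the behavior I was changing more explicitly. This custom document model just extends the standard document model in Wagtail.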

#models.py

from django.db import models
from wagtail.documents.models import AbstractDocument
from wagtail.admin.edit_handlers import FieldPanel


# Custom document model: extends Wagtail's AbstractDocument and only
# exposes the file field in the admin form.
class LargeDocument(AbstractDocument):

    admin_form_fields = (
        'file',
    )
    panels = [
        FieldPanel('file', classname='fn'),
    ]
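For reference, the settings side of this step might look like the fragment below. It is a sketch, not the author's actual file: it assumes the app label is `files` as described, and it adds `wagtail.contrib.modeladmin`, which has to be installed for the ModelAdmin registration used in this answer.

```python
# settings/base.py (fragment, assumed layout)
INSTALLED_APPS = [
    # ... the existing apps from base.py stay here ...
    'wagtail.contrib.modeladmin',  # needed for ModelAdmin / modeladmin_register
    'files',                       # the app that holds LargeDocument
]

# Tell Wagtail to use the custom document model instead of the default one.
WAGTAILDOCS_DOCUMENT_MODEL = 'files.LargeDocument'
```

After adding the model, the usual `makemigrations` and `migrate` commands are needed before it can be used.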

Next, create a wagtail_hooks.py file with the following content. Wagtail only auto-discovers hooks from a module with exactly this name.

#wagtail_hooks.py
from wagtail.contrib.modeladmin.options import (
    ModelAdmin, modeladmin_register)
from .models import LargeDocument
from .views import LargeDocumentAdminView


class LargeDocumentAdmin(ModelAdmin):
    model = LargeDocument

    menu_label = 'Large Documents'  # ditch this to use verbose_name_plural from model
    menu_icon = 'pilcrow'  # change as required
    menu_order = 200  # will put in 3rd place (000 being 1st, 100 2nd)
    add_to_settings_menu = False  # or True to add your model to the Settings sub-menu
    exclude_from_explorer = False  # or True to exclude pages of this type from Wagtail's explorer view

    # custom template used for the "create" view (the large-upload page)
    create_template_name = 'large_document_index.html'

# Now you just need to register your customised ModelAdmin class with Wagtail
modeladmin_register(LargeDocumentAdmin)

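This lets you do two things:

  1. Create a new menu item for uploading large documents, while keeping the standard Documents menu item with its standard functionality.
  2. Specify a custom HTML file for handling large uploads.

Here is the HTML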

{% extends "wagtailadmin/base.html" %}
{% load staticfiles cache %}
{% load static wagtailuserbar %}
{% load compress %}
{% load underscore_hyphan_to_space %}
{% load url_vars %}
{% load pagination_value %}

{% load static %}
{% load i18n %}

{% block titletag %}{{ view.page_title }}{% endblock %}

{% block content %}

{% include "wagtailadmin/shared/header.html" with title=view.page_title icon=view.header_icon %}
<!-- Google Signin Button -->
<div class="g-signin2" data-onsuccess="onSignIn" data-theme="dark">
</div>
<!-- Select the file to upload -->

<div class="input-group mb-3">
<link rel="stylesheet" href="{% static 'css/input.css'%}"/>
<div class="custom-file">
<input type="file" class="custom-file-input" id="file" name="file">
<label id="file_label" class="custom-file-label" style="width:auto!important;" for="inputGroupFile02" aria-describedby="inputGroupFileAddon02">Choose file</label>
</div>
<div class="input-group-append">
<span class="input-group-text" id="file_submission_button">Upload</span>
</div>
<div id="start_progress"></div>
</div>
<div class="progress-upload">
<div class="progress-upload-bar" role="progressbar" style="width: 100%;" aria-valuenow="100" aria-valuemin="0" aria-valuemax="100"></div>
</div>
{% endblock %}

{% block extra_js %}
{{ block.super }}
{{ form.media.js }}
<script src="https://apis.google.com/js/platform.js" async defer></script>
<script src="https://sdk.amazonaws.com/js/aws-sdk-2.148.0.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/3.2.1/jquery.min.js"></script>
<script src="{% static 'js/awsupload.js' %}"></script>
{% endblock %}

{% block extra_css %}
{{ block.super }}
{{ form.media.css }}
<meta name="google-signin-client_id" content="847336061839-9h651ek1dv7u1i0t4edsk8pd20d0lkf3.apps.googleusercontent.com">
<link rel="stylesheet" href="https://stackpath.bootstrapcdn.com/bootstrap/4.3.1/css/bootstrap.min.css" integrity="sha384-ggOyR0iXCbMQv3Xipma34MD+dH/1fQ784/j6cY/iJTQUOhcWr7x9JvoRxT2MZw1T" crossorigin="anonymous">

{% endblock %}

Then I created a few objects in views.py

#views.py
from django.shortcuts import render

# Create your views here.
import base64
import hashlib
import hmac
import os
import time
from rest_framework import permissions, status, authentication
from rest_framework.response import Response
from rest_framework.views import APIView
from .config_aws import (
AWS_UPLOAD_BUCKET,
AWS_UPLOAD_REGION,
AWS_UPLOAD_ACCESS_KEY_ID,
AWS_UPLOAD_SECRET_KEY
)
from .models import LargeDocument
import datetime
from wagtail.contrib.modeladmin.views import WMABaseView
from django.db.models.fields.files import FieldFile
from django.core.files import File
import urllib.request
from django.core.mail import send_mail
from .tasks import file_creator

class FilePolicyAPI(APIView):
    """
    This view is to get the AWS Upload Policy for our s3 bucket.
    What we do here is first create a LargeDocument object instance in our
    Django backend. This is to include the LargeDocument instance in the path
    we will use within our bucket as you'll see below.
    """
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        """
        The initial post request includes the filename
        and auth credentials. In our case, we'll use
        Session Authentication but any auth should work.
        """
        filename_req = request.data.get('filename')
        if not filename_req:
            return Response({"message": "A filename is required"}, status=status.HTTP_400_BAD_REQUEST)
        policy_expires = int(time.time()+5000)
        user = request.user
        username_str = str(request.user.username)
        """
        Below we create the Django object. We'll use this
        in our upload path to AWS.

        Example:
        To-be-uploaded file's name: Some Random File.mp4
        Eventual Path on S3: <bucket>/username/2312/2312.mp4
        """
        doc_obj = LargeDocument.objects.create(uploaded_by_user=user, )
        doc_obj_id = doc_obj.id
        doc_obj.title=filename_req
        upload_start_path = "{location}".format(
            location = "LargeDocuments/",
        )
        file_extension = os.path.splitext(filename_req)
        filename_final = "{title}".format(
            title= filename_req,
        )
        """
        Eventual file_upload_path includes the renamed file to the
        Django-stored LargeDocument instance ID. Renaming the file is
        done to prevent issues with user generated formatted names.
        """
        final_upload_path = "{upload_start_path}/{filename_final}".format(
            upload_start_path=upload_start_path,
            filename_final=filename_final,
        )
        if filename_req and file_extension:
            """
            Save the eventual path to the Django-stored LargeDocument instance
            """
            policy_document_context = {
                "expire": policy_expires,
                "bucket_name": AWS_UPLOAD_BUCKET,
                "key_name": "",
                "acl_name": "public-read",
                "content_name": "",
                "content_length": 524288000,
                "upload_start_path": upload_start_path,

            }
            # NOTE: the expiration below is hard-coded; "expire" (policy_expires)
            # is computed above but never substituted into the policy.
            policy_document = """
            {"expiration": "2020-01-01T00:00:00Z",
            "conditions": [
            {"bucket": "%(bucket_name)s"},
            ["starts-with", "$key", "%(upload_start_path)s"],
            {"acl": "public-read"},

            ["starts-with", "$Content-Type", "%(content_name)s"],
            ["starts-with", "$filename", ""],
            ["content-length-range", 0, %(content_length)d]
            ]
            }
            """ % policy_document_context
            aws_secret = str.encode(AWS_UPLOAD_SECRET_KEY)
            policy_document_str_encoded = str.encode(policy_document.replace(" ", ""))
            url = 'https://thearchmedia.s3.amazonaws.com/'
            # base64-encode the policy, then sign it with HMAC-SHA1 using the secret key
            policy = base64.b64encode(policy_document_str_encoded)
            signature = base64.b64encode(hmac.new(aws_secret, policy, hashlib.sha1).digest())
            doc_obj.file_hash = signature
            doc_obj.path = final_upload_path

            doc_obj.save()

            data = {
                "policy": policy,
                "signature": signature,
                "key": AWS_UPLOAD_ACCESS_KEY_ID,
                "file_bucket_path": upload_start_path,
                "file_id": doc_obj_id,
                "filename": filename_final,
                "url": url,
                "username": username_str,
            }
            return Response(data, status=status.HTTP_200_OK)

class FileUploadCompleteHandler(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        size = request.POST.get('fileSize')
        data = {}
        type_ = request.POST.get('fileType')
        if file_id:
            # mark the document created by FilePolicyAPI as uploaded
            obj = LargeDocument.objects.get(id=int(file_id))
            obj.size = int(size)
            obj.uploaded = True
            obj.type = type_
            obj.save()
            data['id'] = obj.id
            data['saved'] = True
            data['url']=obj.url
        return Response(data, status=status.HTTP_200_OK)

class ModelFileCompletion(APIView):
    permission_classes = [permissions.IsAuthenticated]
    authentication_classes = [authentication.SessionAuthentication]

    def post(self, request, *args, **kwargs):
        file_id = request.POST.get('file')
        url = request.POST.get('aws_url')
        data = {}
        if file_id:
            obj = LargeDocument.objects.get(id=int(file_id))
            # hand the slow S3-to-server download over to Celery
            file_creator.delay(obj.pk)
            data['test'] = 'process started'
        return Response(data, status=status.HTTP_200_OK)

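def LargeDocumentAdminView(request):
    # Plain function view. The original body called super(WMABaseView, self)
    # here, which cannot work outside a class (self is undefined), and it
    # did not return the rendered response.
    context = {}
    return render(request, 'modeladmin/files/index.html', context)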

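These views work around the standard file handling system. I did not want to abandon the standard file handling system or write a new one. That is why I call this hack a non-ideal solution.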
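The policy in FilePolicyAPI hard-codes its expiration date, while the computed policy_expires value is never used. A minimal sketch of building the same kind of base64 policy and HMAC-SHA1 signature with a computed expiration is shown below. The function name and parameters are illustrative, and it keeps the older signature style used in this answer; AWS's newer Signature Version 4 works differently and is not shown.

```python
import base64
import hashlib
import hmac
import json
from datetime import datetime, timedelta, timezone


def build_signed_policy(secret_key, bucket, key_prefix, max_bytes,
                        lifetime_seconds=5000, now=None):
    """Hypothetical helper: return (policy_b64, signature_b64) as text."""
    now = now or datetime.now(timezone.utc)
    # Expiration is computed relative to "now" instead of being hard-coded.
    expiration = (now + timedelta(seconds=lifetime_seconds)).strftime("%Y-%m-%dT%H:%M:%SZ")
    policy = {
        "expiration": expiration,
        "conditions": [
            {"bucket": bucket},
            ["starts-with", "$key", key_prefix],
            {"acl": "public-read"},
            ["starts-with", "$Content-Type", ""],
            ["content-length-range", 0, max_bytes],
        ],
    }
    # The base64-encoded policy is what gets signed with the secret key.
    policy_b64 = base64.b64encode(json.dumps(policy).encode("utf-8"))
    digest = hmac.new(secret_key.encode("utf-8"), policy_b64, hashlib.sha1).digest()
    signature_b64 = base64.b64encode(digest)
    return policy_b64.decode("ascii"), signature_b64.decode("ascii")


# Example with placeholder values (not real credentials).
policy_text, signature_text = build_signed_policy(
    "not-a-real-secret", "example-bucket", "LargeDocuments/", 524288000,
    now=datetime(2019, 5, 25, tzinfo=timezone.utc),
)
print(json.loads(base64.b64decode(policy_text))["expiration"])
```

Returning text instead of bytes also avoids relying on the response serializer to decode the values.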

// javascript upload file "awsupload.js"
var id_token; //token we get upon Authentication with Web Identiy Provider
function onSignIn(googleUser) {
var profile = googleUser.getBasicProfile();
// The ID token you need to pass to your backend:
id_token = googleUser.getAuthResponse().id_token;
}

$(document).ready(function(){

// setup session cookie data. This is Django-related
function getCookie(name) {
var cookieValue = null;
if (document.cookie && document.cookie !== '') {
var cookies = document.cookie.split(';');
for (var i = 0; i < cookies.length; i++) {
var cookie = jQuery.trim(cookies[i]);
// Does this cookie string begin with the name we want?
if (cookie.substring(0, name.length + 1) === (name + '=')) {
cookieValue = decodeURIComponent(cookie.substring(name.length + 1));
break;
}
}
}
return cookieValue;
}
var csrftoken = getCookie('csrftoken');
function csrfSafeMethod(method) {
// these HTTP methods do not require CSRF protection
return (/^(GET|HEAD|OPTIONS|TRACE)$/.test(method));
}
$.ajaxSetup({
beforeSend: function(xhr, settings) {
if (!csrfSafeMethod(settings.type) && !this.crossDomain) {
xhr.setRequestHeader("X-CSRFToken", csrftoken);
}
}
});
// end session cookie data setup.

// declare an empty array for potential uploaded files
var fileItemList = []

$(document).on('click','#file_submission_button', function(event){
var selectedFiles = $('#file').prop('files');
formItem = $(this).parent()
$.each(selectedFiles, function(index, item){
uploadFile(item)
})
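$(this).val('');
// reset the progress bar to 0%; "progress" is only assigned later by the
// upload progress handler, so reading it here would throw a ReferenceError
var startProgress = 0;
$('.progress-upload-bar').attr('aria-valuenow',startProgress);
$('.progress-upload-bar').attr('width',startProgress.toString()+'%');
$('.progress-upload-bar').attr('style',"width:"+startProgress.toString()+'%');
$('.progress-upload-bar').text(startProgress.toString()+'%');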
})
$(document).on('change','#file', function(event){
var selectedFiles = $('#file').prop('files');
$('#file_label').text(selectedFiles[0].name)
})



function constructFormPolicyData(policyData, fileItem) {
var contentType = fileItem.type != '' ? fileItem.type : 'application/octet-stream'
var url = policyData.url
var filename = policyData.filename
var repsonseUser = policyData.user
// var keyPath = 'www/' + repsonseUser + '/' + filename
var keyPath = policyData.file_bucket_path
var fd = new FormData()
fd.append('key', keyPath + filename);
fd.append('acl','private');
fd.append('Content-Type', contentType);
fd.append("AWSAccessKeyId", policyData.key)
fd.append('Policy', policyData.policy);
fd.append('filename', filename);
fd.append('Signature', policyData.signature);
fd.append('file', fileItem);
return fd
}

function fileUploadComplete(fileItem, policyData){
data = {
uploaded: true,
fileSize: fileItem.size,
file: policyData.file_id,

}
$.ajax({
method:"POST",
data: data,
url: "/api/files/complete/",
success: function(data){
displayItems(fileItemList)
},
error: function(jqXHR, textStatus, errorThrown){
alert("An error occured, please refresh the page.")
}
})
}

function modelComplete(policyData, aws_url){
data = {
file: policyData.file_id,
aws_url: aws_url
}
$.ajax({
method:"POST",
data: data,
url: "/api/files/modelcomplete/",
success: function(data){
// wrapped in a function so it runs on success, not when the request is built
console.log('model complete success')
},
error: function(jqXHR, textStatus, errorThrown){
alert("An error occured, please refresh the page.")
}
})
}

function displayItems(fileItemList){
var itemList = $('.item-loading-queue')
itemList.html("")
$.each(fileItemList, function(index, obj){
var item = obj.file
var id_ = obj.id
var order_ = obj.order
var html_ = "<div class=\"progress\">" +
"<div class=\"progress-bar\" role=\"progressbar\" style='width:" + item.progress + "%' aria-valuenow='" + item.progress + "' aria-valuemin=\"0\" aria-valuemax=\"100\"></div></div>"
itemList.append("<div>" + order_ + ") " + item.name + "<a href='#' class='srvup-item-upload float-right' data-id='" + id_ + ")'>X</a> <br/>" + html_ + "</div><hr/>")

})
}

function uploadFile(fileItem){
var policyData;
var newLoadingItem;
// get AWS upload policy for each file uploaded through the POST method
// Remember we're creating an instance in the backend so using POST is
// needed.
$.ajax({
method:"POST",
data: {
filename: fileItem.name
},
url: "/api/files/policy/",
success: function(data){
policyData = data
},
error: function(data){
alert("An error occured, please try again later")
}
}).done(function(){
// construct the needed data using the policy for AWS
var file = fileItem;
AWS.config.credentials = new AWS.WebIdentityCredentials({
RoleArn: 'arn:aws:iam::120974195102:role/thearchmedia-google-role',
ProviderId: null, // this is null for Google
WebIdentityToken: id_token // Access token from identity provider
});
var bucket = 'thearchmedia'
var key = 'LargeDocuments/'+file.name
var aws_url = 'https://'+bucket+'.s3.amazonaws.com/'+ key
var s3bucket = new AWS.S3({params: {Bucket: bucket}});
var params = {Key: key , ContentType: file.type, Body: file, ACL:'public-read', };
s3bucket.upload(params, function (err, data) {
$('#results').html(err ? 'ERROR!' : 'UPLOADED :' + data.Location);
}).on(
'httpUploadProgress', function(evt) {
progress = parseInt((evt.loaded * 100) / evt.total)
$('.progress-upload-bar').attr('aria-valuenow',progress)
$('.progress-upload-bar').attr('width',progress.toString()+'%')
$('.progress-upload-bar').attr('style',"width:"+progress.toString()+'%')
$('.progress-upload-bar').text(progress.toString()+'%')

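}).send(
function(err, data) {
// only report success and notify the backend if the upload did not fail
if (err) {
alert("An error occurred, please try again later")
return
}
alert("File uploaded successfully.")
fileUploadComplete(fileItem, policyData)
modelComplete(policyData, aws_url)
});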
})
}


})

How the .js file and views.py interact

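First, the initial Ajax call, which carries the file information, creates the Document object; but because the file never touches the server, no "File" object is created in the Document object. That "File" object contains functionality I need, so I had to do more. Next, my JavaScript file uploads the file to my S3 bucket using the AWS JavaScript SDK. The s3bucket.upload() function in the SDK is robust enough to upload files up to 5 GB, and with a few other modifications it can upload up to 5 TB (the AWS limit). After the file has been uploaded to the S3 bucket, I make a final API call. That final API call triggers a Celery task that downloads the file into a temporary directory on my remote server. Once the file exists on my remote server, the File object is created and saved to the document model.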
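The 5 GB and 5 TB figures correspond to S3's documented limits: a single PUT is capped at 5 GB, while a multipart upload can reach 5 TB using at most 10,000 parts of 5 MiB to 5 GiB each. The arithmetic below sketches the smallest part size a given file would need. Treating all limits as binary units is an assumption, and the function name is illustrative.

```python
# S3 multipart limits (binary units assumed for simplicity).
MIB = 1024 ** 2
GIB = 1024 ** 3
TIB = 1024 ** 4

MAX_PARTS = 10_000
MIN_PART_SIZE = 5 * MIB
MAX_PART_SIZE = 5 * GIB
MAX_OBJECT_SIZE = 5 * TIB


def min_part_size(file_size):
    """Smallest part size (bytes) that fits file_size into MAX_PARTS parts."""
    if file_size > MAX_OBJECT_SIZE:
        raise ValueError("file exceeds the maximum S3 object size")
    needed = -(-file_size // MAX_PARTS)  # integer ceiling division
    return max(MIN_PART_SIZE, needed)


for size in (2 * GIB, 100 * GIB, 5 * TIB):
    part = min_part_size(size)
    print(f"{size / GIB:.0f} GiB file -> part size of at least {part / MIB:.1f} MiB")
```

For a 2 GiB file like the one in the question, the minimum part size of 5 MiB is already sufficient.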

The tasks.py file (views.py imports it as .tasks) handles downloading the file from the S3 bucket to the remote server, then creates the File object and saves it to the document's file field.

#tasks.py
from .models import LargeDocument
from celery import shared_task
import urllib.request
from django.core.mail import send_mail
from django.core.files import File
import threading

@shared_task
def file_creator(pk_num):
    obj = LargeDocument.objects.get(pk=pk_num)
    tmp_loc = 'tmp/'+ obj.title

    def downloadit():
        # download the uploaded object from S3 into the temporary directory
        urllib.request.urlretrieve('https://thearchmedia.s3.amazonaws.com/LargeDocuments/' + obj.title, tmp_loc)

    def after_dwn():
        dwn_thread.join() #waits till the download thread has completed executing
        #next chunk of code after download, goes here
        send_mail(
            obj.title + ' has finished downloading to the server',
            obj.title + ' downloaded to server',
            'info@thearchmedia.com',
            ['wes@wesgarlock.com'],
            fail_silently=False,
        )
        # attach the downloaded file to the document and close the handle afterwards
        with open(tmp_loc, 'rb') as reopen:
            django_file = File(reopen)
            obj.file = django_file
            obj.save()
        send_mail(
            obj.title + ' has finished downloading to the server',
            'File Model Created for ' + obj.title,
            'info@thearchmedia.com',
            ['wes@wesgarlock.com'],
            fail_silently=False,
        )

    dwn_thread = threading.Thread(target=downloadit)
    dwn_thread.start()

    metadata_thread = threading.Thread(target=after_dwn)
    metadata_thread.start()

This process needs to run in Celery because downloading a large file takes time, and I do not want to wait with a browser open. Inside tasks.py there is also a Python threading.Thread that forces the process to wait until the file has been downloaded to the remote server. If you are new to Celery, here is the start of their documentation (http://docs.celeryproject.org/en/master/getting-started/introduction.html)
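Because a Celery task already runs outside the request cycle, the download could arguably be done as a plain blocking call without extra threads. The sketch below shows one way to stream a URL to disk in chunks using only the standard library; the function name and chunk size are illustrative, and it is an alternative to the code above rather than what the author used.

```python
import os
import shutil
import urllib.request


def download_to_path(url, dest_path, chunk_size=1024 * 1024):
    """Stream url to dest_path in chunks and return the number of bytes written.

    Hypothetical helper: the call blocks until the download has finished,
    so no thread join is needed afterwards.
    """
    with urllib.request.urlopen(url) as response, open(dest_path, "wb") as out:
        # copyfileobj reads and writes chunk_size bytes at a time,
        # so the whole file is never held in memory.
        shutil.copyfileobj(response, out, chunk_size)
    return os.path.getsize(dest_path)
```

Inside a task, the code that attaches the file to the model could then simply follow this call.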

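I also added some email notifications to confirm that the process has completed.

Final note: I created a /tmp directory in my project and set up a daily deletion of all files in it to give it tmp functionality.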

crontab -e
find ~/thearchmedia/tmp -mtime +1 -delete
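For readers who would rather do this cleanup from Python (for example from a scheduled task), a rough equivalent is sketched below. It is not identical to the find command: it does not recurse into subdirectories and it compares exact modification times rather than whole days. The function name and parameters are illustrative.

```python
import os
import time


def delete_old_files(directory, max_age_days=1, now=None):
    """Delete regular files in directory older than max_age_days.

    Hypothetical helper; returns the sorted names of the files it removed.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for entry in os.scandir(directory):
        # Only plain files directly inside the directory are considered.
        if entry.is_file(follow_symlinks=False):
            if entry.stat(follow_symlinks=False).st_mtime < cutoff:
                os.remove(entry.path)
                removed.append(entry.name)
    return sorted(removed)
```

A call such as delete_old_files(os.path.expanduser('~/thearchmedia/tmp')) would then target the same directory as the cron entry.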

Regarding python - Wagtail Documents: Large file size (>2GB) upload fails, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/56105090/
