Java file upload to S3 - should multipart speed it up?

Reposted. Author: 行者123. Updated: 2023-12-05 04:48:04

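We use Java 8 and the AWS SDK to upload files to AWS S3 programmatically. For large files (>100 MB), we read that multipart upload is the preferred method. We tried it, but it does not seem to speed anything up: the upload time is almost the same as without multipart upload. Worse, we even ran into out-of-memory errors saying that heap space is insufficient.

Questions:

  1. Does multipart upload really speed up the upload? If not, why use it?
  2. Why does multipart upload eat up memory faster than a non-multipart upload? Does it upload all the parts concurrently?

Here is the code we use: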

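    import com.amazonaws.auth.AWSStaticCredentialsProvider;
    import com.amazonaws.auth.BasicAWSCredentials;
    import com.amazonaws.services.s3.AmazonS3;
    import com.amazonaws.services.s3.AmazonS3ClientBuilder;
    import com.amazonaws.services.s3.Headers;
    import com.amazonaws.services.s3.model.ObjectMetadata;
    import com.amazonaws.services.s3.model.PutObjectRequest;
    import com.amazonaws.services.s3.model.SSEAwsKeyManagementParams;
    import com.amazonaws.services.s3.model.StorageClass;
    import com.amazonaws.services.s3.transfer.TransferManager;
    import com.amazonaws.services.s3.transfer.TransferManagerBuilder;
    import com.amazonaws.services.s3.transfer.Upload;
    import java.io.ByteArrayInputStream;
    import java.io.InputStream;

    private static void uploadFileToS3UsingBase64(String bucketName, String region, String accessKey,
            String secretKey, String fileBase64String, String s3ObjectKeyName) {

        // Drop everything up to the first comma, then decode the whole payload into memory.
        byte[] bI = org.apache.commons.codec.binary.Base64.decodeBase64(
                (fileBase64String.substring(fileBase64String.indexOf(",") + 1)).getBytes());
        InputStream fis = new ByteArrayInputStream(bI);

        long start = System.currentTimeMillis();
        AmazonS3 s3Client = null;
        TransferManager tm = null;

        try {
            s3Client = AmazonS3ClientBuilder.standard().withRegion(region)
                    .withCredentials(new AWSStaticCredentialsProvider(
                            new BasicAWSCredentials(accessKey, secretKey)))
                    .build();

            // Threshold kept as originally posted: 50 * 1024 * 1025 bytes (slightly above 50 MiB).
            tm = TransferManagerBuilder.standard()
                    .withS3Client(s3Client)
                    .withMultipartUploadThreshold((long) (50 * 1024 * 1025))
                    .build();

            ObjectMetadata metadata = new ObjectMetadata();
            metadata.setHeader(Headers.STORAGE_CLASS, StorageClass.Standard);
            PutObjectRequest putObjectRequest = new PutObjectRequest(bucketName, s3ObjectKeyName,
                    fis, metadata).withSSEAwsKeyManagementParams(new SSEAwsKeyManagementParams());

            Upload upload = tm.upload(putObjectRequest);

            // Optionally, wait for the upload to finish before continuing.
            upload.waitForCompletion();

            long end = System.currentTimeMillis();
            long duration = (end - start) / 1000; // elapsed time in seconds

            // Log status
            System.out.println("Successful upload in S3 multipart. Duration = " + duration);
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            if (s3Client != null)
                s3Client.shutdown();
            if (tm != null)
                tm.shutdownNow();
        }
    }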

Best answer

Using multipart will only speed up the upload if multiple parts are uploaded at the same time.

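In your code you set withMultipartUploadThreshold. If your upload is larger than that threshold, you should observe concurrent uploads of the different parts. If it is not, only a single upload connection should be used. You say your files are >100 MB, and your code sets the multipart upload threshold to 50 * 1024 * 1025 = 52,480,000 bytes, so the parts of such a file should be uploaded concurrently.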
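The arithmetic behind that threshold can be checked with a short sketch. The class `MultipartThresholdCheck`, its helper methods, the 100 MiB file size, and the 5 MiB part size are illustrative assumptions and not part of the AWS SDK. The strict greater-than comparison is also an assumption about how the threshold is applied. The sketch only reproduces the numbers discussed above and never calls S3.

```java
class MultipartThresholdCheck {
    // Threshold exactly as written in the question's code (note 1025, not 1024).
    static final long THRESHOLD_AS_POSTED = 50L * 1024 * 1025;
    // What 50 MiB would be with 1024 in both places.
    static final long FIFTY_MIB = 50L * 1024 * 1024;

    /** Assumed rule: sizes strictly above the threshold go multipart. */
    static boolean exceedsThreshold(long sizeBytes, long thresholdBytes) {
        return sizeBytes > thresholdBytes;
    }

    /** Number of parts needed for a given part size (ceiling division). */
    static long partCount(long sizeBytes, long partSizeBytes) {
        return (sizeBytes + partSizeBytes - 1) / partSizeBytes;
    }

    public static void main(String[] args) {
        long fileSize = 100L * 1024 * 1024; // hypothetical 100 MiB file
        long partSize = 5L * 1024 * 1024;   // hypothetical 5 MiB part size

        System.out.println("threshold as posted = " + THRESHOLD_AS_POSTED); // 52480000
        System.out.println("50 MiB              = " + FIFTY_MIB);           // 52428800
        System.out.println("exceeds threshold   = "
                + exceedsThreshold(fileSize, THRESHOLD_AS_POSTED));         // true
        System.out.println("parts needed        = " + partCount(fileSize, partSize)); // 20
    }
}
```

The posted value is only 51,200 bytes above 50 MiB, so the `1025` does not change the conclusion for files over 100 MB.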

However, if your upload throughput is limited by your network speed anyway, concurrency will not increase it. That may be why you do not observe any speedup.
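A rough back-of-the-envelope model illustrates this point. It assumes, purely for illustration, that each connection is capped at a fixed per-connection rate and that all connections share a single uplink. The class `UploadTimeModel`, its method, and all numbers are hypothetical and are not measurements of S3 or of the asker's network.

```java
class UploadTimeModel {
    /**
     * Estimated upload time in seconds under a simplified model:
     * effective rate = min(uplink rate, connections * per-connection rate).
     * Sizes are in MB and rates in MB/s.
     */
    static double estimateSeconds(double sizeMb, int connections,
                                  double perConnectionMbPerSec, double uplinkMbPerSec) {
        double effectiveRate = Math.min(uplinkMbPerSec, connections * perConnectionMbPerSec);
        return sizeMb / effectiveRate;
    }

    public static void main(String[] args) {
        // Uplink is the bottleneck: one connection already saturates it,
        // so four concurrent parts finish no sooner.
        System.out.println(estimateSeconds(100, 1, 10, 10)); // 10.0
        System.out.println(estimateSeconds(100, 4, 10, 10)); // 10.0

        // Uplink has headroom: concurrency helps until the uplink is saturated.
        System.out.println(estimateSeconds(100, 1, 10, 100)); // 10.0
        System.out.println(estimateSeconds(100, 4, 10, 100)); // 2.5
    }
}
```

Under this model, adding parallel parts only pays off while the sum of the per-connection rates is still below the uplink rate.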

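There are other reasons to use multipart as well: it is also recommended for fault tolerance, and its maximum object size is larger than that of a single upload.

For details, see the documentation: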

Multipart upload allows you to upload a single object as a set of parts. Each part is a contiguous portion of the object's data. You can upload these object parts independently and in any order. If transmission of any part fails, you can retransmit that part without affecting other parts. After all parts of your object are uploaded, Amazon S3 assembles these parts and creates the object. In general, when your object size reaches 100 MB, you should consider using multipart uploads instead of uploading the object in a single operation.

Using multipart upload provides the following advantages:

  • Improved throughput - You can upload parts in parallel to improve throughput.

  • Quick recovery from any network issues - Smaller part size minimizes the impact of restarting a failed upload due to a network error.

  • Pause and resume object uploads - You can upload object parts over time. After you initiate a multipart upload, there is no expiry; you must explicitly complete or stop the multipart upload.

  • Begin an upload before you know the final object size - You can upload an object as you are creating it.

We recommend that you use multipart upload in the following ways:

  • If you're uploading large objects over a stable high-bandwidth network, use multipart upload to maximize the use of your available bandwidth by uploading object parts in parallel for multi-threaded performance.

  • If you're uploading over a spotty network, use multipart upload to increase resiliency to network errors by avoiding upload restarts. When using multipart upload, you need to retry uploading only parts that are interrupted during the upload. You don't need to restart uploading your object from the beginning.
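The mechanics described in the quoted documentation (contiguous parts, independent parallel uploads, retrying only a failed part, final assembly) can be sketched without touching S3 at all. The following is a minimal in-memory simulation, not the AWS SDK's implementation: `MultipartSimulation`, `FakeStore`, the retry limit of 3, and the simulated failure are all hypothetical names and choices made for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class MultipartSimulation {

    /** Hypothetical in-memory stand-in for the remote store; not an AWS API. */
    static final class FakeStore {
        private final Map<Integer, byte[]> parts = new ConcurrentHashMap<>();
        private final AtomicInteger attempts = new AtomicInteger();
        private final AtomicInteger failuresLeft = new AtomicInteger(1);
        private final int flakyPart;

        FakeStore(int flakyPart) { this.flakyPart = flakyPart; }

        /** Stores one part; the flaky part fails once to simulate a network error. */
        void putPart(int partNumber, byte[] data) throws IOException {
            attempts.incrementAndGet();
            if (partNumber == flakyPart && failuresLeft.getAndDecrement() > 0) {
                throw new IOException("simulated network error on part " + partNumber);
            }
            parts.put(partNumber, data);
        }

        /** Concatenates parts 1..partCount in order, like the final assembly step. */
        byte[] assemble(int partCount) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            for (int i = 1; i <= partCount; i++) {
                byte[] p = parts.get(i);
                out.write(p, 0, p.length);
            }
            return out.toByteArray();
        }

        int totalAttempts() { return attempts.get(); }
    }

    /** Retries a single part up to maxAttempts times; other parts are unaffected. */
    static void uploadWithRetry(FakeStore store, int number, byte[] part, int maxAttempts)
            throws IOException {
        for (int attempt = 1; ; attempt++) {
            try {
                store.putPart(number, part);
                return;
            } catch (IOException e) {
                if (attempt == maxAttempts) throw e;
            }
        }
    }

    /** Splits data into contiguous parts, uploads them in parallel, then assembles. */
    static byte[] upload(byte[] data, int partSize, int threads, FakeStore store)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            int partNumber = 1;
            for (int offset = 0; offset < data.length; offset += partSize, partNumber++) {
                final int number = partNumber;
                final byte[] part =
                        Arrays.copyOfRange(data, offset, Math.min(data.length, offset + partSize));
                futures.add(pool.submit(() -> {
                    uploadWithRetry(store, number, part, 3);
                    return null;
                }));
            }
            for (Future<?> f : futures) f.get(); // wait for every part to finish
            return store.assemble(partNumber - 1);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        byte[] data = new byte[1000];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;

        FakeStore store = new FakeStore(2); // part 2 fails once
        byte[] result = upload(data, 256, 4, store);

        System.out.println("intact = " + Arrays.equals(data, result));
        System.out.println("attempts = " + store.totalAttempts());
    }
}
```

With 1,000 bytes and a 256-byte part size there are four parts, and only the failed part is sent a second time, so the store sees five attempts in total rather than a full restart.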

Regarding "Java file upload to S3 - should multipart speed it up?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68255312/
