
java - Decompress and read gz files from S3 - Scala

Reposted. Author: 行者123. Updated: 2023-12-02 09:24:54

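I have a list of gzip files in an S3 folder and need to read them with Scala: iterate over each file and store its contents in a list of string buffers.

This is the method that reads one file and returns it as a string.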

def getDecompressedData(bucket: String, key: String): String = {
  val getObjectRequest = new GetObjectRequest(bucket, key)
  val s3Object = s3Client.getObject(getObjectRequest)
  val byteArray = IOUtils.toByteArray(s3Object.getObjectContent)
  val inputStream = new GZIPInputStream(new ByteArrayInputStream(byteArray))
  val data = scala.io.Source.fromInputStream(inputStream).mkString
  inputStream.close()
  data
}

I get this error:

Exception in thread "main" java.io.EOFException: Unexpected end of ZLIB input stream
at java.util.zip.InflaterInputStream.fill(InflaterInputStream.java:240)
at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:158)
at java.util.zip.GZIPInputStream.read(GZIPInputStream.java:117)
at java.io.FilterInputStream.read(FilterInputStream.java:107)
at com.amazonaws.util.IOUtils.toByteArray(IOUtils.java:44)
at com.amazonaws.util.IOUtils.toString(IOUtils.java:58)

It is raised at the line val data = scala.io.Source.fromInputStream(inputStream).mkString
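
For context, java.io.EOFException from GZIPInputStream generally means the compressed bytes ended before the deflate stream was complete. The following is a minimal sketch, independent of S3, that reproduces the failure with in-memory data. The object name TruncatedGzipDemo and the helpers gzipBytes and gunzip are hypothetical names introduced for illustration; they are not part of the question's code, and the sketch does not claim to identify why the S3 read itself came up short.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source
import scala.util.Try

// Hypothetical demo object, not part of the original question.
object TruncatedGzipDemo {
  // Compress a string into an in-memory gzip byte array.
  def gzipBytes(text: String): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(buffer)
    out.write(text.getBytes("UTF-8"))
    out.close() // closing finishes the deflate stream and writes the gzip trailer
    buffer.toByteArray
  }

  // Decompress a gzip byte array back into a string (UTF-8 is an assumption).
  def gunzip(bytes: Array[Byte]): String = {
    val in = new GZIPInputStream(new ByteArrayInputStream(bytes))
    try Source.fromInputStream(in, "UTF-8").mkString
    finally in.close()
  }

  def main(args: Array[String]): Unit = {
    val original = "hello from s3\n" * 1000
    val complete = gzipBytes(original)

    // A complete gzip payload round-trips cleanly.
    println(gunzip(complete) == original)

    // Dropping the second half of the compressed bytes makes decompression fail.
    val truncated = complete.take(complete.length / 2)
    println(Try(gunzip(truncated)).failed.map(_.getClass.getName))
  }
}
```

If the same exception appears with real S3 objects, one thing worth checking is whether the full object content actually reached the GZIPInputStream.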

Best answer

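import java.util.Scanner
import java.util.zip.GZIPInputStream
import com.amazonaws.services.s3.model.GetObjectRequest

// s3Client is the AmazonS3 client from the question. gzip was an undefined flag
// in the original snippet; here it is assumed to be passed in by the caller.
def getDecompressedData(bucket: String, key: String, gzip: Boolean): String = {
  val getObjectRequest = new GetObjectRequest(bucket, key)
  val s3Object = s3Client.getObject(getObjectRequest)

  try {
    // If the S3 file is compressed, decompress the content stream as it is read
    val source =
      if (gzip) new Scanner(new GZIPInputStream(s3Object.getObjectContent))
      else new Scanner(s3Object.getObjectContent)

    // The \A delimiter makes the Scanner return the whole stream as one token,
    // so whitespace and line breaks are preserved
    val scanner = source.useDelimiter("\\A")
    if (scanner.hasNext) scanner.next() else ""
  } finally {
    s3Object.close()
  }
}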

I have provided code for both gzip files and plain files.
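
The question also asks for every file in the folder to be read into a list. The following is a minimal sketch of one way to do that, assuming the caller can supply a function that opens a stream for a given key. The names S3TextReader, readText, and readAll are hypothetical, the gzip check relies on the standard gzip magic bytes 0x1f 0x8b rather than on a flag, and the result is a List[String] rather than string buffers. UTF-8 is also an assumption.

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, ByteArrayOutputStream, InputStream}
import java.util.zip.{GZIPInputStream, GZIPOutputStream}
import scala.io.Source

// Hypothetical helper object, not part of the original answer.
object S3TextReader {
  // Read a stream as UTF-8 text, gunzipping first if it starts with the
  // gzip magic bytes 0x1f 0x8b. The stream is closed afterwards.
  def readText(raw: InputStream): String = {
    val buffered = new BufferedInputStream(raw)
    try {
      // Peek at the first two bytes, then rewind so nothing is lost.
      buffered.mark(2)
      val b1 = buffered.read()
      val b2 = buffered.read()
      buffered.reset()
      val isGzip = b1 == 0x1f && b2 == 0x8b
      val in: InputStream = if (isGzip) new GZIPInputStream(buffered) else buffered
      Source.fromInputStream(in, "UTF-8").mkString
    } finally buffered.close()
  }

  // `open` stands in for something like
  //   key => s3Client.getObject(bucket, key).getObjectContent
  // and is passed in so the sketch runs without an S3 connection.
  def readAll(keys: Seq[String], open: String => InputStream): List[String] =
    keys.map(key => readText(open(key))).toList

  def main(args: Array[String]): Unit = {
    // Build one gzip payload and one plain payload in memory.
    val gz = new ByteArrayOutputStream()
    val out = new GZIPOutputStream(gz)
    out.write("compressed line\n".getBytes("UTF-8"))
    out.close()

    val objects = Map(
      "a.gz" -> gz.toByteArray,
      "b.txt" -> "plain line\n".getBytes("UTF-8")
    )
    val contents = readAll(Seq("a.gz", "b.txt"), key => new ByteArrayInputStream(objects(key)))
    contents.foreach(print)
  }
}
```

Passing the opener as a function keeps the decompression logic testable on its own; with a real client the list of keys would come from listing the S3 folder.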

Regarding java - Decompress and read gz files from S3 - Scala, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/58400568/
