
haskell - How to chunk using amazonka, conduit, and lazy bytestring


I wrote the code below to simulate uploading to S3 from a lazy ByteString (which would be received over a network socket; here we simulate it by reading a file of size ~100MB). The problem with the code below is that it seems to force the whole file into memory instead of chunking it (cbytes) - pointers on why the chunking is not working would be appreciated:

import Control.Lens
import Network.AWS
import Network.AWS.S3
import Network.AWS.Data.Body
import System.IO
import Data.Conduit (($$+-))
import Data.Conduit.Binary (sinkLbs,sourceLbs)
import qualified Data.Conduit.List as CL (mapM_)
import Network.HTTP.Conduit (responseBody,RequestBody(..),newManager,tlsManagerSettings)
import qualified Data.ByteString.Lazy as LBS

example :: IO PutObjectResponse
example = do
    -- To specify configuration preferences, newEnv is used to create a new Env. The Region denotes the AWS region requests will be performed against,
    -- and Credentials is used to specify the desired mechanism for supplying or retrieving AuthN/AuthZ information.
    -- In this case, Discover will cause the library to try a number of options such as default environment variables, or an instance's IAM Profile:
    e <- newEnv NorthVirginia Discover

    -- A new Logger to replace the default noop logger is created, with the logger set to print debug information and errors to stdout:
    l <- newLogger Debug stdout

    -- The payload for the S3 object is read from a file that simulates a lazy bytestring received over the network
    inb <- LBS.readFile "out"
    lenb <- System.IO.withFile "out" ReadMode hFileSize -- evaluates to 104857600 (100MB)
    let cbytes = toBody $ ChunkedBody (1024*128) (fromIntegral lenb) (sourceLbs inb)

    -- We now run the AWS computation with the overridden logger, performing the PutObject request:
    runResourceT . runAWS (e & envLogger .~ l) $
        send ((putObject "yourtestenv-change-it-please" "testbucket/test" cbytes) & poContentType .~ Just "text; charset=UTF-8")

main :: IO ()
main = example >> return ()
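(Note: to get the runtime statistics discussed below, the binary presumably has to be compiled with GHC's -rtsopts flag; without it the runtime system refuses most RTS options, including +RTS -s.)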

Running the executable with the RTS -s option shows that the whole file is read into memory (~113MB maximum residency - at one point I saw ~87MB). On the other hand, if I use chunkedFile, it is chunked correctly (~10MB maximum residency).
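For reference, here is a minimal sketch of the chunkedFile variant just mentioned, assuming amazonka's chunkedFile from Network.AWS.Data.Body (which builds a streaming request body directly from a file path); exampleChunked is a hypothetical name, and the bucket/key are the same placeholders as above:

exampleChunked :: IO PutObjectResponse
exampleChunked = do
    e <- newEnv NorthVirginia Discover
    l <- newLogger Debug stdout
    -- chunkedFile opens the file itself and streams it in 128KB pieces,
    -- so no lazy ByteString ever holds the whole 100MB at once:
    cbytes <- chunkedFile (1024*128) "out"
    runResourceT . runAWS (e & envLogger .~ l) $
        send (putObject "yourtestenv-change-it-please" "testbucket/test" cbytes
                & poContentType .~ Just "text; charset=UTF-8")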

Best Answer

This much is clear:

    inb <- LBS.readFile "out"
    lenb <- System.IO.withFile "out" ReadMode hFileSize -- evaluates to 104857600 (100MB)
    let cbytes = toBody $ ChunkedBody (1024*128) (fromIntegral lenb) (sourceLbs inb)

should be rewritten as

    lenb <- System.IO.withFile "out" ReadMode hFileSize -- evaluates to 104857600 (100MB)
    let cbytes = toBody $ ChunkedBody (1024*128) (fromIntegral lenb) (C.sourceFile "out")
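Here C is presumably a qualified import of Data.Conduit.Binary (import qualified Data.Conduit.Binary as C); its sourceFile streams a file as a conduit source, and Data.Conduit.Combinators exports an equivalent sourceFile as well.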

As you have written it, the point of conduit is defeated. The whole file has to be accumulated by LBS.readFile and then broken into chunks as it is fed to sourceLbs. (If lazy IO is working right, this might not happen.) sourceFile, by contrast, reads the file incrementally, chunk by chunk. It might also be that, e.g., toBody accumulates the whole file, in which case the point of conduit would be defeated at a different spot. Glancing over the source of send and so on, though, I don't see anything that would do that.
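For completeness, a sketch of the question's example with the fix applied, assuming the qualified import mentioned above and leaving everything else unchanged:

import qualified Data.Conduit.Binary as C

example :: IO PutObjectResponse
example = do
    e <- newEnv NorthVirginia Discover
    l <- newLogger Debug stdout
    lenb <- System.IO.withFile "out" ReadMode hFileSize
    -- The conduit source opens "out" itself and yields chunks on demand,
    -- so maximum residency stays near the chunk size instead of ~100MB:
    let cbytes = toBody $ ChunkedBody (1024*128) (fromIntegral lenb) (C.sourceFile "out")
    runResourceT . runAWS (e & envLogger .~ l) $
        send (putObject "yourtestenv-change-it-please" "testbucket/test" cbytes
                & poContentType .~ Just "text; charset=UTF-8")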

Regarding haskell - How to chunk using amazonka, conduit, and lazy bytestring, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/37617885/
