
performance - ByteString concatMap performance

Reposted. Author: 行者123. Updated: 2023-12-04 20:42:33

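I have a 37 MB bin file that I am trying to convert into a sequence of PPM images. It works fine, and I am using it as an exercise to learn some profiling and more about lazy bytestrings in Haskell. My program seems to blow up at concatMap, which is used to replicate each byte three times so that I have R, G, and B. The code is fairly simple: every 2048 bytes I write a new header: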

    {-# LANGUAGE OverloadedStrings #-}

    import System.IO
    import System.Environment
    import Control.Monad
    import qualified Data.ByteString.Lazy.Char8 as B


    main :: IO ()
    main = do
      [from, to] <- getArgs
      withFile from ReadMode $ \inH ->
        withFile to WriteMode $ \outH ->
          -- read 2048 bytes at a time until an empty chunk signals end of file
          loop (B.hGet inH 2048) (process outH) B.null


    -- Run inp repeatedly, feeding each result to outp, until done holds.
    loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
    loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)


    -- Write one PPM frame: the header, then every input byte repeated three times.
    process :: Handle -> B.ByteString -> IO ()
    process h bs | B.null bs = return ()
                 | otherwise = B.hPut h header >> B.hPut h bs'
      where header = "P6\n32 64\n255\n" :: B.ByteString
            bs' = B.concatMap (B.replicate 3) bs

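This gets the job done in a little over 5 s. That is not terrible; my only point of comparison is my very naive C implementation, which finishes in a bit under 4 s, so matching that, or ideally beating it, has been my goal.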

Here are the RTS statistics for the code above:
    33,435,345,688 bytes allocated in the heap
        14,963,640 bytes copied during GC
            54,640 bytes maximum residency (77 sample(s))
            21,136 bytes maximum slop
                 2 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max pause
    Gen  0     64604 colls,     0 par    0.20s    0.25s     0.0000s    0.0001s
    Gen  1        77 colls,     0 par    0.00s    0.01s     0.0001s    0.0006s

    INIT    time    0.00s  (  0.00s elapsed)
    MUT     time    5.09s  (  5.27s elapsed)
    GC      time    0.21s  (  0.26s elapsed)
    EXIT    time    0.00s  (  0.00s elapsed)
    Total   time    5.29s  (  5.52s elapsed)

    %GC     time       3.9%  (4.6% elapsed)

    Alloc rate    6,574,783,667 bytes per MUT second

    Productivity  96.1% of total user, 92.1% of total elapsed

Pretty rough results. When I remove the concatMap and simply copy everything over with a header every 2048 bytes, it is practically instant:
        70,983,992 bytes allocated in the heap
            48,912 bytes copied during GC
            54,640 bytes maximum residency (2 sample(s))
            19,744 bytes maximum slop
                 1 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max pause
    Gen  0       204 colls,     0 par    0.00s    0.00s     0.0000s    0.0000s
    Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0001s

    INIT    time    0.00s  (  0.00s elapsed)
    MUT     time    0.01s  (  0.07s elapsed)
    GC      time    0.00s  (  0.00s elapsed)
    EXIT    time    0.00s  (  0.00s elapsed)
    Total   time    0.02s  (  0.07s elapsed)

    %GC     time       9.6%  (2.9% elapsed)

    Alloc rate    5,026,838,892 bytes per MUT second

    Productivity  89.8% of total user, 22.3% of total elapsed
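
The difference between the two runs points at the concatMap step. As a minimal sketch for inspecting that step in isolation, the helpers below reproduce the tripling and count how many strict chunks back the resulting lazy ByteString. The names tripleConcatMap and chunkCount are made up for illustration and are not part of the original program, and the sketch uses the Word8-based Data.ByteString.Lazy module rather than the Char8 one.

```haskell
import qualified Data.ByteString.Lazy as BL

-- Replicate every byte three times, as the question's process function does.
tripleConcatMap :: BL.ByteString -> BL.ByteString
tripleConcatMap = BL.concatMap (BL.replicate 3)

-- Number of strict chunks that make up a lazy ByteString.
chunkCount :: BL.ByteString -> Int
chunkCount = length . BL.toChunks
```

Comparing chunkCount for a 2048-byte input with chunkCount for its tripled output in GHCi may show whether the result is fragmented into many small chunks. That would be consistent with the heavy allocation reported above, but it is a hypothesis to check, not a measured result.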

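So I guess my question is twofold:
  • How can I improve the overall performance?
  • If the bottleneck had not been so obvious, what ways could I have used to track it down?

Thank you.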

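Edit

If anyone is interested, here are the final code and RTS statistics! After reading the Profiling and optimization chapter of Real World Haskell, I was also able to find other bottlenecks by using GHC's profiler with -prof -auto-all -caf-all.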
    {-# LANGUAGE OverloadedStrings #-}

    import System.IO
    import System.Environment
    import Control.Monad
    import Data.Monoid
    import qualified Data.ByteString.Builder as BU
    import qualified Data.ByteString.Lazy.Char8 as BL


    main :: IO ()
    main = do
      [from, to] <- getArgs
      withFile from ReadMode $ \inH ->
        withFile to WriteMode $ \outH ->
          loop (BL.hGet inH 2048) (process outH) BL.null


    loop :: (Monad m) => m a -> (a -> m ()) -> (a -> Bool) -> m ()
    loop inp outp done = inp >>= \x -> unless (done x) (outp x >> loop inp outp done)


    -- Map every character to a monoidal value and concatenate the results.
    upConcatMap :: Monoid c => (Char -> c) -> BL.ByteString -> c
    upConcatMap f bs = mconcat . map f $ BL.unpack bs


    -- Write one PPM frame through a Builder instead of concatMap.
    process :: Handle -> BL.ByteString -> IO ()
    process h bs | BL.null bs = return ()
                 | otherwise = BU.hPutBuilder h frame
      where header = "P6\n32 64\n255\n"
            bs' = BU.toLazyByteString $ upConcatMap trip bs
            frame = BU.lazyByteString $ mappend header bs'
            trip c = let b = BU.char8 c in mconcat [b, b, b]

     6,383,263,640 bytes allocated in the heap
        18,596,984 bytes copied during GC
            54,640 bytes maximum residency (2 sample(s))
            31,056 bytes maximum slop
                 1 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max pause
    Gen  0     11165 colls,     0 par    0.06s    0.06s     0.0000s    0.0001s
    Gen  1         2 colls,     0 par    0.00s    0.00s     0.0001s    0.0002s

    INIT    time    0.00s  (  0.00s elapsed)
    MUT     time    0.69s  (  0.83s elapsed)
    GC      time    0.06s  (  0.06s elapsed)
    EXIT    time    0.00s  (  0.00s elapsed)
    Total   time    0.75s  (  0.89s elapsed)

    %GC     time       7.4%  (7.2% elapsed)

    Alloc rate    9,194,103,284 bytes per MUT second

    Productivity  92.6% of total user, 78.0% of total elapsed
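
On the second question, about finding a bottleneck that is less obvious: besides automatic cost centres, GHC's profiler also accepts manual SCC annotations on individual expressions. The following is a minimal sketch of how the suspicious step could be labelled. The function names and the label "tripleBytes" are hypothetical and not part of the program above.

```haskell
import qualified Data.ByteString.Lazy as BL

-- The expensive step from the question, wrapped in a manual cost centre.
-- The label "tripleBytes" is a made-up name for this sketch; without
-- profiling enabled the pragma has no effect.
tripleBytes :: BL.ByteString -> BL.ByteString
tripleBytes bs = {-# SCC "tripleBytes" #-} BL.concatMap (BL.replicate 3) bs

-- Total output size, used here only to force the whole result.
tripledLength :: BL.ByteString -> Int
tripledLength = fromIntegral . BL.length . tripleBytes
```

When the program is built with -prof and run with +RTS -p, the generated .prof report should then attribute time and allocation to the tripleBytes cost centre, which makes it easier to compare it against the rest of the program.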

Best answer

How about Builder?

This version is 5 times faster for me:

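    -- Assumes Data.ByteString.Builder is imported under the same alias as
    -- Data.ByteString.Lazy.Char8, so that B.hPutBuilder and B.char8 resolve:
    import qualified Data.ByteString.Builder as B

    process :: Handle -> B.ByteString -> IO ()
    process h bs
      | B.null bs = return ()
      | otherwise = B.hPut h header >> B.hPutBuilder h bs'
      where header = "P6\n32 64\n255\n" :: B.ByteString
            bs' = mconcat $ map triple $ B.unpack bs
            triple c = let b = B.char8 c in mconcat [b, b, b]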

It allocates much less garbage.
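
As a quick sanity check that a Builder-based tripling yields the same bytes as the concatMap version from the question, here is a minimal pure sketch. The names tripleBuilder, tripleConcatMap, and sameOutput are introduced only for illustration, and the sketch works on Word8 via Data.ByteString.Lazy and BU.word8 instead of the Char8 functions used above.

```haskell
import qualified Data.ByteString.Builder as BU
import qualified Data.ByteString.Lazy as BL

-- Builder that emits each input byte three times.
tripleBuilder :: BL.ByteString -> BU.Builder
tripleBuilder = mconcat . map triple . BL.unpack
  where triple w = let b = BU.word8 w in mconcat [b, b, b]

-- Reference version from the question, for comparison.
tripleConcatMap :: BL.ByteString -> BL.ByteString
tripleConcatMap = BL.concatMap (BL.replicate 3)

-- True when both approaches produce identical bytes.
sameOutput :: BL.ByteString -> Bool
sameOutput bs = BU.toLazyByteString (tripleBuilder bs) == tripleConcatMap bs
```

Equality on lazy ByteStrings compares contents regardless of how they are split into chunks, so sameOutput only checks the bytes, not the memory layout or the performance of either version.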

ADD: For reference, the runtime statistics:
     4,642,746,104 bytes allocated in the heap
       390,110,640 bytes copied during GC
            63,592 bytes maximum residency (2 sample(s))
            21,648 bytes maximum slop
                 1 MB total memory in use (0 MB lost due to fragmentation)

                                      Tot time (elapsed)  Avg pause  Max pause
    Gen  0      8992 colls,     0 par    0.54s    0.63s     0.0001s    0.0017s
    Gen  1         2 colls,     0 par    0.00s    0.00s     0.0002s    0.0002s

    INIT    time    0.00s  (  0.00s elapsed)
    MUT     time    0.98s  (  1.13s elapsed)
    GC      time    0.54s  (  0.63s elapsed)
    EXIT    time    0.00s  (  0.00s elapsed)
    Total   time    1.52s  (  1.76s elapsed)

    %GC     time      35.4%  (36.0% elapsed)

    Alloc rate    4,718,237,910 bytes per MUT second

    Productivity  64.6% of total user, 55.9% of total elapsed

Regarding performance - ByteString concatMap performance, the original question can be found on Stack Overflow: https://stackoverflow.com/questions/23501384/
