
performance - Optimizing a simple parser which is called many times

Reposted. Author: 行者123. Updated: 2023-12-04 02:54:04

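I wrote a parser for a custom file format using attoparsec.
The profiling report shows that about 67% of the memory allocation happens in a function named tab, which is also the most time-consuming one. The tab function is very simple: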

tab :: Parser Char
tab = char '\t'

The whole profiling report is as follows:
       ASnapshotParser +RTS -p -h -RTS

    total time  =       37.88 secs   (37882 ticks @ 1000 us, 1 processor)
    total alloc = 54,255,105,384 bytes  (excludes profiling overheads)

COST CENTRE    MODULE                 %time %alloc

tab            Main                    83.1   67.7
main           Main                     6.4    4.2
readTextDevice Data.Text.IO.Internal    5.5   24.0
snapshotParser Main                     4.7    4.0


                                                             individual     inherited
COST CENTRE        MODULE                  no.     entries  %time %alloc   %time %alloc

MAIN               MAIN                     75           0    0.0    0.0   100.0  100.0
 CAF               Main                    149           0    0.0    0.0   100.0  100.0
  tab              Main                    156           1    0.0    0.0     0.0    0.0
  snapshotParser   Main                    153           1    0.0    0.0     0.0    0.0
  main             Main                    150           1    6.4    4.2   100.0  100.0
   doStuff         Main                    152     1000398    0.3    0.0    88.1   71.8
    snapshotParser Main                    154           0    4.7    4.0    87.7   71.7
     tab           Main                    157           0   83.1   67.7    83.1   67.7
   readTextDevice  Data.Text.IO.Internal   151       40145    5.5   24.0     5.5   24.0
 CAF               Data.Text.Array         142           0    0.0    0.0     0.0    0.0
 CAF               Data.Text.Internal      140           0    0.0    0.0     0.0    0.0
 CAF               GHC.IO.Handle.FD        122           0    0.0    0.0     0.0    0.0
 CAF               GHC.Conc.Signal         103           0    0.0    0.0     0.0    0.0
 CAF               GHC.IO.Encoding         101           0    0.0    0.0     0.0    0.0
 CAF               GHC.IO.FD               100           0    0.0    0.0     0.0    0.0
 CAF               GHC.IO.Encoding.Iconv    89           0    0.0    0.0     0.0    0.0
  main             Main                    155           0    0.0    0.0     0.0    0.0

How can I optimize this?

The complete code for the parser is here. The file I'm parsing is about 77 MB.

Best answer

tab is a scapegoat. If you define boo :: Parser (); boo = return () and insert a boo before every bind in the definition of snapshotParser, the cost allocation becomes something like this:

 main             Main                    255           0   11.8   13.8   100.0  100.0
  doStuff         Main                    258     2097153    1.1    0.5    86.2   86.2
   snapshotParser Main                    260           0    0.4    0.1    85.1   85.7
    boo           Main                    262           0   71.0   73.2    84.8   85.5
     tab          Main                    265           0   13.8   12.3    13.8   12.3

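So it seems the profiler is shifting the blame for the allocation of the parse results onto whichever cost centre happens to be nearby, probably because of the extensive inlining of attoparsec code, as John L suggested in the comments.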
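The boo experiment can be sketched on a cut-down parser. This is only an illustration: threeFields is a hypothetical three-field stand-in for snapshotParser, and the sketch assumes the program is built with profiling enabled (for example with GHC's -prof -fprof-auto) so that boo and tab appear as separate cost centres.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Attoparsec.Text

tab :: Parser Char
tab = char '\t'

-- Consumes nothing and always succeeds. Its only purpose is to show up
-- as a cost centre of its own in the profiling report.
boo :: Parser ()
boo = return ()

-- Hypothetical three-field stand-in for snapshotParser,
-- with boo inserted before every bind.
threeFields :: Parser (Int, Int, Double)
threeFields = do
    boo
    a <- decimal
    boo
    _ <- tab
    boo
    b <- decimal
    boo
    _ <- tab
    boo
    c <- double
    return (a, b, c)

main :: IO ()
main = print (parseOnly threeFields "1\t2\t3.5")
```

If the blame moves from tab to boo in the new report, as it did in the report above, the cost belongs to the surrounding parsing work rather than to either of these tiny parsers.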

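As for the performance problem, the key point is that, since you are parsing a 77 MB text file to build a list with a million elements, you want the file processing to be lazy rather than strict. Once that is sorted out, it also helps to decouple I/O from parsing in doStuff and to build the list of snapshots without an accumulator. Here is a modified version of your program that takes this into account.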
{-# LANGUAGE BangPatterns #-}
module Main where

import Data.Maybe
import Data.Attoparsec.Text.Lazy
import Control.Applicative
import qualified Data.Text.Lazy.IO as TL
import Data.Text (Text)
import qualified Data.Text.Lazy as TL

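-- Pure and lazy: each Snapshot is handed to the consumer as soon as it
-- has been parsed, without going through an accumulator.
buildStuff :: TL.Text -> [Snapshot]
buildStuff text = case maybeResult (parse endOfInput text) of
    Just _  -> []                         -- no input left
    Nothing -> case parse snapshotParser text of
        Done !i !r -> r : buildStuff i    -- i is the remaining input
        Fail _ _ _ -> []                  -- stop at the first malformed line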

main :: IO ()
main = do
    text <- TL.readFile "./snap.dat"
    let ss = buildStuff text
    print $ listToMaybe ss
        >> Just (fromIntegral (length $ show ss) / fromIntegral (length ss))

newtype VehicleId = VehicleId Int deriving Show
newtype Time = Time Int deriving Show
newtype LinkID = LinkID Int deriving Show
newtype NodeID = NodeID Int deriving Show
newtype LaneID = LaneID Int deriving Show

tab :: Parser Char
tab = char '\t'

-- UNPACK pragmas. GHC 7.8 unboxes small strict fields automatically;
-- however, it seems we still need the pragmas while profiling.
data Snapshot = Snapshot {
    vehicle    :: {-# UNPACK #-} !VehicleId,
    time       :: {-# UNPACK #-} !Time,
    link       :: {-# UNPACK #-} !LinkID,
    node       :: {-# UNPACK #-} !NodeID,
    lane       :: {-# UNPACK #-} !LaneID,
    distance   :: {-# UNPACK #-} !Double,
    velocity   :: {-# UNPACK #-} !Double,
    vehtype    :: {-# UNPACK #-} !Int,
    acceler    :: {-# UNPACK #-} !Double,
    driver     :: {-# UNPACK #-} !Int,
    passengers :: {-# UNPACK #-} !Int,
    easting    :: {-# UNPACK #-} !Double,
    northing   :: {-# UNPACK #-} !Double,
    elevation  :: {-# UNPACK #-} !Double,
    azimuth    :: {-# UNPACK #-} !Double,
    user       :: {-# UNPACK #-} !Int
    } deriving (Show)

-- No need for bang patterns here.
snapshotParser :: Parser Snapshot
snapshotParser = do
    sveh <- decimal
    tab
    stime <- decimal
    tab
    slink <- decimal
    tab
    snode <- decimal
    tab
    slane <- decimal
    tab
    sdistance <- double
    tab
    svelocity <- double
    tab
    svehtype <- decimal
    tab
    sacceler <- double
    tab
    sdriver <- decimal
    tab
    spassengers <- decimal
    tab
    seasting <- double
    tab
    snorthing <- double
    tab
    selevation <- double
    tab
    sazimuth <- double
    tab
    suser <- decimal
    endOfLine <|> endOfInput
    return $ Snapshot
        (VehicleId sveh) (Time stime) (LinkID slink) (NodeID snode)
        (LaneID slane) sdistance svelocity svehtype sacceler sdriver
        spassengers seasting snorthing selevation sazimuth suser

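This version should have acceptable performance even if you force the whole list of snapshots into memory, as I did in main here. To gauge what "acceptable" means, keep in mind that, given the sixteen (small, unboxed) fields in each Snapshot plus the overhead of the Snapshot and list constructors, we are talking about 152 bytes per list cell, which comes to roughly 152 MB for your test data. In any case, this version is as lazy as it can be, as you can see by removing the division in main, or by replacing it with last ss.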
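The difference between building the list with and without an accumulator can be seen in a small sketch that uses only base. The names parseOne, parseLazily and parseWithAcc are hypothetical; parseOne is a toy stand-in for snapshotParser that reads one decimal number per line, not attoparsec code.

```haskell
module Main where

import Data.Char (isDigit)

-- Toy stand-in for snapshotParser: read one decimal number and skip the
-- character after it. Returns the value and the remaining input.
parseOne :: String -> Maybe (Int, String)
parseOne s = case span isDigit s of
    ("", _)        -> Nothing
    (digits, rest) -> Just (read digits, drop 1 rest)

-- Without an accumulator: every element is returned as soon as it has
-- been parsed, so the consumer decides how much input gets processed.
parseLazily :: String -> [Int]
parseLazily s = case parseOne s of
    Just (x, rest) -> x : parseLazily rest
    Nothing        -> []

-- With an accumulator: nothing is returned until the input is exhausted.
parseWithAcc :: String -> [Int]
parseWithAcc = go []
  where
    go acc s = case parseOne s of
        Just (x, rest) -> go (x : acc) rest
        Nothing        -> reverse acc

main :: IO ()
main = do
    -- Endless input: only the accumulator-free version can answer this.
    let endless = unlines (map show [1 :: Int ..])
    print (take 3 (parseLazily endless))
    -- On finite input both versions produce the same list.
    print (parseLazily "10\n20\n30\n" == parseWithAcc "10\n20\n30\n")
```

buildStuff follows the first pattern, which is why a consumer that only looks at the head of the list does not have to wait for the whole file to be parsed.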

Note: my tests were done with attoparsec-0.12.

Regarding "performance - Optimizing a simple parser which is called many times", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/24063219/
