performance - Haskell: making Writer as efficient as plain code when logging is not needed

Reposted. Author: 行者123. Updated: 2023-12-02 02:38:52

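I want to write code that can run in two "modes":

  • in logging mode, i.e. it should record some information (in my case, the number of calls made to certain specific functions at a given time);
  • or in efficient mode, i.e. it records nothing and simply runs as fast as possible.

I tried writing the following code, which creates two Writers: a normal one (for logging mode) and a stupid one (which records nothing, for efficient mode). I then define a new class, LogFunctionCalls, that lets me run my function in either of these two Writers.

However, when I measured the code using the stupid Writer, it turned out to be much slower than plain code without any Writer. Here is the profiling information:

  • code without a Writer: total time = 0.27 s, total alloc = 55,800 bytes
  • code with the stupid Writer StupidLogEntry: total time = 0.74 s, total alloc = 600,060,408 bytes (note: the real time is much larger than 0.74 s...)
  • code with the real Writer LogEntry: total time = 5.03 s, total alloc = 1,920,060,624 bytes

Here is the code (you can comment out parts depending on which run you want to use):
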
    {-# LANGUAGE ScopedTypeVariables #-}
    module Main where

    --- It depends on the transformers, containers, time, and base packages.

    --- You can profile it with:
    --- $ cabal v2-run --enable-profiling debug -- +RTS -p
    --- and a file debug.prof will be created.

    import qualified Data.Map.Strict as MapStrict
    import qualified Data.Map.Merge.Strict as MapMerge

    import qualified Control.Monad as CM
    import Control.Monad.Trans.Writer.Strict (Writer)
    import qualified Control.Monad.Trans.Writer.Strict as Wr
    import qualified Data.Time as Time

    -- Test using writer monad

    -- The actual LogEntry, that should associate a number
    -- to each name
    newtype LogEntry = LogEntry { logMap:: MapStrict.Map String Int }
      deriving (Eq, Show)

    -- A logentry that does not record anything, always empty
    newtype StupidLogEntry = StupidLogEntry { stupidLogMap:: MapStrict.Map String Int }
      deriving (Eq, Show)

    -- Create the Monoid instances
    instance Semigroup LogEntry where
      (LogEntry m1) <> (LogEntry m2) =
        LogEntry $ MapStrict.unionWith (+) m1 m2
    instance Monoid LogEntry where
      mempty = LogEntry MapStrict.empty

    instance Semigroup StupidLogEntry where
      (StupidLogEntry m1) <> (StupidLogEntry m2) =
        StupidLogEntry $ m1
    instance Monoid StupidLogEntry where
      mempty = StupidLogEntry MapStrict.empty

    -- Create a class that allows me to use the function "myTell"
    -- that adds a number in the writer (either the LogEntry
    -- or StupidLogEntry one)
    class (Monoid r) => LogFunctionCalls r where
      myTell :: String -> Int -> Writer r ()

    instance LogFunctionCalls LogEntry where
      myTell namefunction n = do
        Wr.tell $ LogEntry $ MapStrict.singleton namefunction n

    instance LogFunctionCalls StupidLogEntry where
      myTell namefunction n = do
        -- Wr.tell $ StupidLogEntry $ Map.singleton namefunction n
        return ()

    -- Function in itself, with writers
    countNumberCalls :: (LogFunctionCalls r) => Int -> Writer r Int
    countNumberCalls 0 = return 0
    countNumberCalls n = do
      myTell "countNumberCalls" 1
      x <- countNumberCalls $ n - 1
      return $ 1 + x

    --- Without any writer, pretty efficient
    countNumberCallsNoWriter :: Int -> Int
    countNumberCallsNoWriter 0 = 0
    countNumberCallsNoWriter n = 1 + countNumberCallsNoWriter (n-1)

    main :: IO ()
    main = do
      putStrLn $ "Hello"
      -- Version without any writer
      print =<< Time.getZonedTime
      let n = countNumberCallsNoWriter 15000000
      putStrLn $ "Without any writer, the result is " ++ (show n)
      -- Version with Logger
      print =<< Time.getZonedTime
      let (n, log :: LogEntry) = Wr.runWriter $ countNumberCalls 15000000
      putStrLn $ "The result is " ++ (show n)
      putStrLn $ "With the logger, the number of calls is " ++ (show $ (logMap log))
      -- Version with the stupid logger
      print =<< Time.getZonedTime
      let (n, log :: StupidLogEntry) = Wr.runWriter $ countNumberCalls 15000000
      putStrLn $ "The result is " ++ (show n)
      putStrLn $ "With the stupid logger, the number of calls is " ++ (show $ (stupidLogMap log))
      print =<< Time.getZonedTime

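Best answer

The Writer monad is the bottleneck. A better way to generalize your code so that it can run in these two "modes" is to change the interface, i.e. the LogFunctionCalls class, so that it is parameterized by the monad: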

    class Monad m => LogFunctionCalls m where
      myTell :: String -> Int -> m ()

Then we can use an identity monad (or monad transformer) to implement it trivially:

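    -- Needs the DerivingVia extension and Data.Functor.Identity (Identity).
    newtype NoLog a = NoLog { unNoLog :: a }  -- unNoLog unwraps the result
      deriving (Functor, Applicative, Monad) via Identity

    instance LogFunctionCalls NoLog where
      myTell _ _ = pure ()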

Also note that the function under test now has a different type, one that no longer mentions Writer. Explicitly:

    countNumberCalls :: (LogFunctionCalls m) => Int -> m Int
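
To see both modes side by side, here is a minimal self-contained sketch of the monad-parameterized interface. The NoLog part follows the snippets in this answer; the CallCounts newtype and the Writer CallCounts instance are illustrative assumptions standing in for the question's LogEntry, not code from the original answer.

```haskell
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Control.Monad.Trans.Writer.Strict (Writer, runWriter, tell)
import Data.Functor.Identity (Identity (..))
import qualified Data.Map.Strict as Map

-- Interface parameterized by the monad rather than by the log type.
class Monad m => LogFunctionCalls m where
  myTell :: String -> Int -> m ()

-- Efficient mode: a newtype over Identity whose myTell does nothing.
newtype NoLog a = NoLog { unNoLog :: a }
  deriving (Functor, Applicative, Monad) via Identity

instance LogFunctionCalls NoLog where
  myTell _ _ = pure ()

-- Logging mode (hypothetical stand-in for the question's LogEntry):
-- a map from function name to call count, merged by addition.
newtype CallCounts = CallCounts (Map.Map String Int)
  deriving (Eq, Show)

instance Semigroup CallCounts where
  CallCounts a <> CallCounts b = CallCounts (Map.unionWith (+) a b)

instance Monoid CallCounts where
  mempty = CallCounts Map.empty

-- FlexibleInstances is needed because Writer is a type synonym.
instance LogFunctionCalls (Writer CallCounts) where
  myTell name n = tell (CallCounts (Map.singleton name n))

-- The function under test mentions only the class, never Writer.
countNumberCalls :: LogFunctionCalls m => Int -> m Int
countNumberCalls 0 = pure 0
countNumberCalls n = do
  myTell "countNumberCalls" 1
  x <- countNumberCalls (n - 1)
  pure (1 + x)

main :: IO ()
main = do
  -- Efficient mode: nothing is recorded.
  print (unNoLog (countNumberCalls 1000))
  -- Logging mode: the same function, run in the Writer instance.
  let (n, CallCounts counts) = runWriter (countNumberCalls 1000)
  print n
  print (Map.toList counts)
```

Choosing the mode is then only a matter of which monad the call site picks: unNoLog for the efficient run, runWriter for the logging run.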

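Let's put this in a benchmark which, as pointed out in the comments, has all sorts of methodological problems; still, something interesting happens if we compile it with ghc -O: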

    main :: IO ()
    main = do
      let iternumber = 1500000
      putStrLn $ "Hello"
      t0 <- Time.getCurrentTime

      -- Non-monadic version
      let n = countNumberCallsNoWriter iternumber
      putStrLn $ "Without any writer, the result is " ++ (show n)
      t1 <- Time.getCurrentTime
      print (Time.diffUTCTime t1 t0)

      -- NoLog version
      let n = unNoLog $ countNumberCalls iternumber
      putStrLn $ "The result is " ++ (show n)
      t2 <- Time.getCurrentTime
      print (Time.diffUTCTime t2 t1)

Output:

    Hello
    Without any writer, the result is 1500000
    0.022030957s
    The result is 1500000
    0.000081533s

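As we can see, the second version (the one we care about) took essentially zero time. If we remove the first version from the benchmark, the remaining one takes over the former's 0.022 s.

So GHC actually optimized away one of the two benchmarks because it found that they are identical. This achieves what we originally wanted: the "logging" code runs as fast as specialized code without logging, because the two are literally the same, and the benchmark numbers do not matter.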

This can also be confirmed by looking at the generated Core: run ghc -O -ddump-simpl -ddump-to-file -dsuppress-all and make sense of the file Main.dump-simpl. Or use inspection-testing.

Compilable gist: https://gist.github.com/Lysxia/2f98c4a8a61034dcc614de5e95d7d5f8

Source: a similar question on Stack Overflow, "performance - Haskell: making Writer as efficient as plain code when logging is not needed": https://stackoverflow.com/questions/61635717/
