haskell - 如何优化大型 IntMap 生成？-6ren

haskell - 如何优化大型 IntMap 生成？

转载作者：行者123 更新时间：2023-12-01 22:26:27

以下代码的 allCases 大小为 2^20 IntMap，其生成需要大量计算和内存。我不知道这是否是不可避免的成本，以及如何找出效率低下的地方。

import Control.Monad (forM_)
import Control.Monad.ST
import Data.Array
import Data.Array.ST
import Data.Bits
import Data.List
import qualified Data.IntMap as M

type Switch = Int
type Clock = Int

switches :: Array Switch [Clock]
switches = listArray (0, 9) [
  [0, 1, 2],
  [3, 7, 9, 11],
  [4, 10, 14, 15],
  [0, 4, 5, 6, 7],
  [6, 7, 8, 10, 12],
  [0, 2, 14, 15],
  [3, 14, 15],
  [4, 5, 7, 1, 15],
  [1, 2, 3, 4, 5],
  [3, 4, 5, 9, 13]]

type Quads = Int

intsToQuads :: [Int] -> Quads
intsToQuads [] = 0
intsToQuads (x:xs) = x .|. (intsToQuads xs `shiftL` 2)

switchCases :: [[Int]]
switchCases = sequence $ replicate 10 [0..3]

applySwitch :: Int -> STUArray s Int Int -> ST s ()
applySwitch sw clocks = forM_ (switches ! sw) $ \ix -> do
  clock <- readArray clocks ix
  writeArray clocks ix ((clock + 1) `rem` 4)

allCasesST :: Int -> STUArray s Int Int -> Int -> ST s (M.IntMap Int)
allCasesST ix clocks pushs
  | ix > 9 = do
    cs <- getElems clocks
    return (M.singleton (intsToQuads cs) pushs)
  | otherwise = do
    rs <- mapM next [pushs..pushs + 3]
    return (M.unions rs)
  where
    next pu = do
      rs <- allCasesST (ix + 1) clocks pu
      applySwitch ix clocks
      return rs

allCases :: M.IntMap Int
allCases = runST $ do
  st <- newArray (0,15) 0
  allCasesST 0 st 0

main = do
  putStrLn . show $ M.lookup 0 allCases
  return ()

以下是分析结果。

  2,754,242,472 bytes allocated in the heap
  2,095,063,056 bytes copied during GC
    788,992,504 bytes maximum residency (23 sample(s))
      3,087,880 bytes maximum slop
            815 MB total memory in use (0 MB lost due to fragmentation)

                                    Tot time (elapsed)  Avg pause  Max pause
Gen  0      2618 colls,     0 par   10.859s  11.052s     0.0042s    1.1284s
Gen  1        23 colls,     0 par    0.000s   0.011s     0.0005s    0.0009s

INIT    time    0.000s  (  0.001s elapsed)
MUT     time    1.531s  (  3.622s elapsed)
GC      time    8.750s  (  8.916s elapsed)
RP      time    0.000s  (  0.000s elapsed)
PROF    time    2.109s  (  2.146s elapsed)
EXIT    time    0.047s  (  0.072s elapsed)
Total   time   12.438s  ( 12.612s elapsed)

%GC     time      70.4%  (70.7% elapsed)

Alloc rate    1,798,688,961 bytes per MUT second

Productivity  12.7% of total user, 12.3% of total elapsed

.prof 文件(为了整洁进行了一些修改)。

    Mon Dec 25 23:52 2017 Time and Allocation Profiling Report  (Final)

       a.exe +RTS -p -hc -xc -s -c -RTS

    total time  =        1.52 secs   (1518 ticks @ 1000 us, 1 processor)
    total alloc = 1,576,087,408 bytes  (excludes profiling overheads)

COST CENTRE     MODULE       SRC                       %time %alloc

intsToQuads     Main         Main.hs:38:29-42          24.9   17.0
allCasesST      Main         Main.hs:55:13-23          13.7   20.2
applySwitch     Main         Main.hs:44:25-45          12.2    0.0
allCasesST      Main         Main.hs:51:11-25          10.5   44.2
applySwitch     Main         Main.hs:46:3-44            9.5    0.0
intsToQuads     Main         Main.hs:38:22-54           7.2    0.0
allCasesST.next Main         Main.hs:58:13-41           6.2    5.7
intsToQuads     Main         Main.hs:38:29-53           4.2    0.0
applySwitch     Main         Main.hs:45:12-30           2.6    7.1
allCasesST.next Main         Main.hs:59:7-27            2.2    0.0
allCasesST      Main         Main.hs:52:26-39           1.8    1.1
applySwitch     Main         Main.hs:(44,25)-(46,44)    1.5    0.0
allCasesST      Main         Main.hs:52:13-46           1.4    2.7
allCasesST      Main         Main.hs:54:11-38           0.7    2.1


                                                          individual     inherited
COST CENTRE           SRC                       entries  %time %alloc   %time %alloc

MAIN                  <built-in>                   0      0.1    0.0    100.0  100.0
 CAF                  GHC.IO.Encoding.CodePage     0      0.0    0.0      0.0    0.0
 CAF                  GHC.IO.Encoding              0      0.0    0.0      0.0    0.0
 CAF                  GHC.IO.Handle.Text           0      0.0    0.0      0.0    0.0
 CAF                  GHC.Show                     0      0.0    0.0      0.0    0.0
 CAF                  GHC.IO.Handle.FD             0      0.0    0.0      0.0    0.0
 CAF                  <entire-module>              0      0.0    0.0     99.9  100.0
  allCases            Main.hs:(63,12)-(65,19)      0      0.0    0.0     99.9  100.0
   allCases           Main.hs:64:9-25              0      0.0    0.0      0.0    0.0
   allCasesST         Main.hs:55:5-24              0      0.0    0.0      5.5    5.3
    allCasesST        Main.hs:55:13-23             0      5.5    5.3      5.5    5.3
   allCasesST         Main.hs:54:11-38             0      0.7    2.1     94.5   94.7
    allCasesST.next   Main.hs:59:7-27              0      0.1    0.0      0.5    0.0
     applySwitch      Main.hs:(44,25)-(46,44)      0      0.0    0.0      0.3    0.0
      applySwitch     Main.hs:44:25-45             0      0.1    0.0      0.3    0.0
       applySwitch    Main.hs:45:12-30             0      0.0    0.0      0.0    0.0
       applySwitch    Main.hs:44:32-44             0      0.2    0.0      0.2    0.0
    allCasesST.next   Main.hs:58:13-41             0      6.2    5.7     93.3   92.5
     allCasesST       Main.hs:52:5-47              0      0.1    0.0     39.7   20.8
      allCasesST      Main.hs:52:13-46             0      1.4    2.7     39.6   20.8
       allCasesST     Main.hs:52:26-39             0      1.8    1.1     38.2   18.1
        intsToQuads   Main.hs:38:22-54             0      7.2    0.0     36.4   17.0
         intsToQuads  Main.hs:38:29-53             0      4.2    0.0     29.1   17.0
          intsToQuads Main.hs:38:29-42             0     24.9   17.0     24.9   17.0
     allCasesST       Main.hs:55:5-24              0      0.1    0.0      8.3   14.9
      allCasesST      Main.hs:55:13-23             0      8.2   14.9      8.2   14.9
     allCasesST       Main.hs:51:11-25             0     10.5   44.2     10.5   44.2
     allCasesST.next  Main.hs:59:7-27              0      2.1    0.0     28.7    7.1
      applySwitch     Main.hs:(44,25)-(46,44)      0      1.5    0.0     26.5    7.1
       applySwitch    Main.hs:44:25-45             0     12.1    0.0     25.0    7.1
        applySwitch   Main.hs:46:3-44              0      9.5    0.0     10.4    0.0
         applySwitch  Main.hs:46:25-43             0      0.9    0.0      0.9    0.0
        applySwitch   Main.hs:45:12-30             0      2.6    7.1      2.6    7.1
  main                Main.hs:68:3-39              0      0.0    0.0      0.0    0.0
   main               Main.hs:68:21-39             0      0.0    0.0      0.0    0.0
   main               Main.hs:68:3-17              0      0.0    0.0      0.0    0.0
  switches            Main.hs:(22,12)-(32,19)      0      0.0    0.0      0.0    0.0
 main                 Main.hs:68:3-39              0      0.0    0.0      0.0    0.0
  main                Main.hs:68:3-17              0      0.0    0.0      0.0    0.0

最佳答案

你的算法有点太懒了，你在 map 中放入了未评估的重击。

您的原始代码执行如下:

<<ghc: 1185249008 bytes, 1136 GCs, 85006914/288126544 avg/max bytes residency (10 samples), 827M in use, 0.000 INIT (0.002 elapsed), 0.595 MUT (0.605 elapsed), 1.830 GC (2.065 elapsed) :ghc>>
./jeiea +RTS -tstderr  2.43s user 0.32s system 99% cpu 2.773 total

即，288MB 驻留时间，2.4 秒。

但是如果我们强制评估 intsToQuads:

allCasesST :: Int -> STUArray s Int Int -> Int -> ST s (M.IntMap Int)
allCasesST ix clocks pushs
  | ix > 9 = do
    cs <- getElems clocks
    let n = intsToQuads cs
    n `seq` return (M.singleton n pushs)

那么性能是:

<<ghc: 1151694576 bytes, 1104 GCs, 26972457/72834536 avg/max bytes residency (11 samples), 215M in use, 0.000 INIT (0.002 elapsed), 0.500 MUT (0.515 elapsed), 0.735 GC (0.816 elapsed) :ghc>>
./jeiea +RTS -tstderr  1.24s user 0.11s system 98% cpu 1.367 total

最大驻留时间为 72MB，时间为 1.25 秒。

如果我们对 applySwitch 和 next 应用相同的修改，我们将获得以下性能:

<<ghc: 1151694576 bytes, 1104 GCs, 26972457/72834536 avg/max bytes residency (11 samples), 215M in use, 0.000 INIT (0.002 elapsed), 0.389 MUT (0.395 elapsed), 0.517 GC (0.573 elapsed) :ghc>>

这是 72MB 驻留时间，大约 1 秒，但差异很大。

编辑:

完整代码并运行:

import Control.Monad (forM_)
import Control.Monad.ST
import Data.Array
import Data.Array.ST
import Data.Bits
import Data.List
import qualified Data.IntMap as M

type Switch = Int
type Clock = Int

switches :: Array Switch [Clock]
switches = listArray (0, 9) [
  [0, 1, 2],
  [3, 7, 9, 11],
  [4, 10, 14, 15],
  [0, 4, 5, 6, 7],
  [6, 7, 8, 10, 12],
  [0, 2, 14, 15],
  [3, 14, 15],
  [4, 5, 7, 1, 15],
  [1, 2, 3, 4, 5],
  [3, 4, 5, 9, 13]]

type Quads = Int

intsToQuads :: [Int] -> Quads
intsToQuads [] = 0
intsToQuads (x:xs) = x .|. (intsToQuads xs `shiftL` 2)

switchCases :: [[Int]]
switchCases = sequence $ replicate 10 [0..3]

applySwitch :: Int -> STUArray s Int Int -> ST s ()
applySwitch sw clocks = forM_ (switches ! sw) $ \ix -> do
  clock <- readArray clocks ix
  let n = ((clock + 1) `rem` 4)
  n `seq` writeArray clocks ix n

allCasesST :: Int -> STUArray s Int Int -> Int -> ST s (M.IntMap Int)
allCasesST ix clocks pushs
  | ix > 9 = do
    cs <- getElems clocks
    let n = intsToQuads cs
    n `seq` return (M.singleton n pushs)
  | otherwise = do
    rs <- mapM next [pushs..pushs + 3]
    return (M.unions rs)
  where
    next pu = do
      let n = ix + 1
      rs <- n `seq` allCasesST n clocks pu
      applySwitch ix clocks
      return rs

allCases :: M.IntMap Int
allCases = runST $ do
  st <- newArray (0,15) 0
  allCasesST 0 st 0

main = do
  putStrLn . show $ M.lookup 0 allCases
  return ()

与:

% time ./jeiea +RTS -tstderr
Just 0
<<ghc: 1151694576 bytes, 1104 GCs, 26972457/72834536 avg/max bytes residency (11 samples), 215M in use, 0.000 INIT (0.005 elapsed), 0.812 MUT (0.820 elapsed), 0.979 GC (1.093 elapsed) :ghc>>
./jeiea +RTS -tstderr  1.79s user 0.16s system 99% cpu 1.971 total

关于haskell - 如何优化大型 IntMap 生成？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/47969861/

文章推荐： go - 使用UNWIND时如何访问Neo4j结果？

文章推荐： templates - 如何使用结构或变量值的字段作为模板名称？

文章推荐： webpack - Webpack 1 在加载器中支持 'include' 和 'exclude' 吗？

haskell - 如何使用具有不同键类型的 IntMap？
我在使用两个 IntMap 的文档中有一条记录: data Doc = Doc { kernels :: IntMap Kernel, nodes :: IntMap Node } 但是我发现两个 I
haskell - 遍历后如何保持Data.IntMap.Strict严格？
Data.IntMap.Strict docs say : Be aware that the Functor, Traversable and Data instances are the same
haskell - 如何优化大型 IntMap 生成？
以下代码的 allCases 大小为 2^20 IntMap，其生成需要大量计算和内存。我不知道这是否是不可避免的成本，以及如何找出效率低下的地方。 import Control.Monad (for
opa - 循环 db intmap(person)
例子: type person = { name : string ; age : int } db /person : intmap(pe
haskell - Data.Map/Data.IntMap 是否存在 monad 实例？
我有一个在 IntMap 上运行的算法，我觉得最好用命令式的方式表达它。也就是说，我想说的是: 在 map 中查找值 X。如果它符合条件，则从 map 中删除该值。循环直到 map 中不再存在值。
Scala:将 Map[K,V] 转换为 IntMap[V] 的最有效方法是什么？
假设我有一个带有 toInt 方法的类 Point，并且我有一个不可变的 Map[Point,V] ，对于某些类型 V 。Scala 中将其转换为 IntMap[V] 的最有效方法是什么？这里是我目前
Haskell:从具有一百万个值的列表构建 IntMap 时，我应该得到 "Stack space overflow"吗？
我的问题是，在 Haskell 中使用任何 Map 实现时，在处理一百万个值时总是会出现“堆栈空间溢出”。我想要做的是处理一对列表。每对包含两个整数(不是整数，我用它们失败了，所以我尝试了整数)。我

行者123

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

haskell - 如何优化大型 IntMap 生成？