gpt4 book ai didi

haskell - 用组织态实现加泰罗尼亚数的显着开销

转载 作者:行者123 更新时间:2023-12-04 18:11:07 25 4
gpt4 key购买 nike

我正在探索recursion-schemes最近想找一些组织态的用例——我认为Catalan numbers可能是一个有趣的(我知道有更好的方法来实现加泰罗尼亚数字,这不是这个问题的重点)。我想出的是以下内容:


import Control.Comonad.Cofree
import Control.Monad
import Data.Foldable
import Data.Function.Memoize (memoFix)
import Data.Functor.Foldable
import GHC.Natural

type Nat = Natural

-- unrelated lines omitted

catalanHisto :: Nat -> Nat
catalanHisto = histo \case
Nothing ->
1
Just fs ->
let xs = toList fs -- this is line 101 in my original code.
ys = reverse xs
in sum $ zipWith (*) xs ys

catalanMemo :: Integer -> Integer
catalanMemo = memoFix \q n ->
if n == 0
then 1
else
let xs = fmap q [0 .. n -1]
ys = reverse xs
in sum $ zipWith (*) xs ys

main :: IO ()
main = do
-- print $ catalanMemo 1000
print $ catalanHisto 1000
然而,性能会受到影响, catalanHisto :
real    49.36s
user 416.48s
sys 99.38s
catalanMemo 相比,这非常糟糕:
real    0.84s
user 5.09s
sys 2.08s
鉴于至少它终止了, histo版本肯定记住了一些东西,但有一个巨大的开销,我不确定我是否在滥用 histo ,或者这只是以这种方式编写程序的代价。当我继续进行一些基本的分析时:
    Sat Feb 19 22:58 2022 Time and Allocation Profiling Report  (Final)

demo +RTS -N -s -p -RTS

total time = 20.78 secs (52462 ticks @ 1000 us, 24 processors)
total alloc = 122,870,767,920 bytes (excludes profiling overheads)

COST CENTRE MODULE SRC %time %alloc

catalanHisto.\ Catalan src/Catalan.hs:(101,5)-(103,31) 68.0 71.5
foldMap.go Control.Comonad.Cofree src/Control/Comonad/Cofree.hs:301:5-46 28.4 25.0
catalanHisto Catalan src/Catalan.hs:(97,1)-(103,31) 1.7 0.0
catalanHisto.\.ys Catalan src/Catalan.hs:102:9-23 1.3 3.3

不是解释这些结果的专家,我猜除了 Control.Comonad.Cofree 中的一些分配,它花费了大部分时间在 catalanHisto 的非平凡分支中进行分配,可能是由于 toListreverse ,我不确定有多少优化空间。

最佳答案

免责声明:我不太了解范畴论或递归方案背后的深层含义。我只是在重读这篇 excellent blog post并决定将这个问题作为练习。
首先,写histo 非常容易(考虑到背后的理论)从头开始测试 histo 是否存在巨大开销或不
this 启发的代码

-- definition from free package
data Cofree f a = a :< (f (Cofree f a)) deriving (Foldable)
-- derive foldable to use toList in your function
extract (a :< _) = a
newtype Fix f = Fix {unfix :: f (Fix f)}

histo :: Functor f => (f (Cofree f a) -> a) -> Fix f -> a
histo halg = extract . go where
go = (\x -> halg x :< x) . fmap go . unfix
您的功能几乎保持不变
data NatF a  = Z  | S a  deriving (Functor, Foldable)
catalan = histo \case
Z -> 1
(S fs) ->
let xs = toList fs
ys = reverse xs
in sum $ zipWith (*) xs ys

main = print $ catalan (mkNat 1000)
where
mkNat 0 = Fix Z
mkNat n = Fix (S (mkNat (n - 1)))
这是 ghci 的运行时间
(0.60 secs, 788,909,152 bytes)
为了了解发生了什么,正如本文中所指出的 answer你可以稍微作弊并实例化 Functor fNatF简化 histo 的定义
NatF ~~ Maybe
Cofree NatF a ~~ Cofree Maybe a ~~ NonEmpty a
(NatF (Cofree NatF a) -> a) ~~ (Maybe (NonEmptyList a) -> a ) ~~ ([a] -> a)
那里的功能简化为
-- even a simpler definition can be found in answer in the link above
histo2 :: ([a] -> a) -> Fix Maybe -> a
histo2 halg = head . go
where
go = maybe [] (\x -> halg x : x) . fmap go . unfix

catalan2 = histo2 \case
[] -> 1
fs ->
let xs = toList fs
ys = reverse xs
in sum $ zipWith (*) xs ys

-- (0.33 secs, 307,525,608 bytes)
main2 = print $ catalan2 (mkNat 1000)
where
mkNat 0 = Fix Nothing
mkNat n = Fix $ Just (mkNat (n - 1))
因此该功能按预期工作,有线性步数,因此不会浪费任何工作。
好的,那么您的原始功能是怎么回事?为了理解它的执行,我尝试类似地简化上面的库函数
-- original definition
histo0 = gcata distHisto where
distHisto fc = fmap extract fc :< fmap (distHisto' . Cofree.unwrap) fc
gcata k g = g . extract . c
where
c = k . fmap (duplicate . fmap g . c) . project

projectNat 0 = Nothing
projectNat n = Just (n - 1)

-- f instantiated to Maybe with comonad definitions copied
histo1 :: (Maybe (Cofree Maybe Integer) -> Integer) -> Natural -> Integer
histo1 g = g . extract . go
where
go = distHisto . fmap (duplicate . fmap g . go) . projectNat
distHisto :: Maybe (Cofree Maybe a) -> Cofree Maybe (Maybe a)
distHisto Nothing = Nothing :< Nothing
distHisto (Just (a :< xs)) = Just a :< Just (distHisto xs)
duplicate (a :< xs) = (a :< xs) :< fmap duplicate xs
extract (a :< _) = a
对我来说仍然太复杂了,所以我试图进一步简化。
extractNE :: NonEmpty a -> a
extractNE (a :| _) = a
duplicateNE :: NonEmpty a -> NonEmpty (NonEmpty a)
duplicateNE k@(_ :| xs) =
k :| case xs of
[] -> []
(a : as) -> toList (duplicate (a :| as))
-- use cofree Maybe ~ nonempty
histo2 g = g . extractNE . go
where
go = distHisto . fmap (duplicateNE . fmap g . go) . projectNat
distHisto :: Maybe (NonEmpty (NonEmpty a)) -> NonEmpty (Maybe (NonEmpty a))
distHisto Nothing = Nothing :| []
distHisto (Just (a :| [])) = Just a :| [Nothing]
distHisto (Just (a :| as)) = Just a :| toList (distHisto (Just (fromList as)))

-- use Maybe (NonEmpty a) ~ [a]
histo3 g = g . extractNE . go
where
go = distHisto . conv . fmap (duplicateNE . fmap g . go) . projectNat
conv :: Maybe (NonEmpty a) -> [a]
conv Nothing = []
conv ((Just (a :| as))) = a : as
distHisto :: [NonEmpty a] -> NonEmpty [a]
distHisto [] = [] :| []
distHisto ((a : as)) = toList a :| toList (distHisto as)

-- not sure about this step but is probably close enough definition.

histo4 :: ([Integer] -> Integer) -> Natural -> Integer
histo4 g = g . head . go where
go 0 = [[]]
go b = tails $ fmap g $ go (b - 1)
所以它似乎执行 n^2 步骤,这就是为什么它与内存版本相比如此缓慢。如果添加 trace调用你的函数被评估的地方,你可以看到自己浪费的工作。

关于haskell - 用组织态实现加泰罗尼亚数的显着开销,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/71192138/

25 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com