
haskell - Inlining derived typeclass methods

Reposted. Author: 行者123. Updated: 2023-12-04 14:05:59

Haskell lets you derive typeclass instances, for example:

{-# LANGUAGE DeriveFunctor #-}

data Foo a = MakeFoo a a deriving (Functor)

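...but sometimes benchmarks show that performance improves if you implement the typeclass instance by hand and annotate the typeclass method with INLINE, like this: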
data Foo a = MakeFoo a a

instance Functor Foo where
    fmap f (MakeFoo x y) = MakeFoo (f x) (f y)
    {-# INLINE fmap #-}

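Is there a way to get the best of both worlds? In other words, is there a way to derive the typeclass instance and also annotate the derived typeclass method with INLINE?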

Best answer

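Although you cannot "re-open" an instance in Haskell the way you can re-open a class in a dynamic language, there are ways to make GHC inline functions as aggressively as possible by passing it certain flags.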

-fspecialise-aggressively removes the restrictions about which functions are specialisable. Any overloaded function will be specialised with this flag. This can potentially create lots of additional code.

-fexpose-all-unfoldings will include the (optimised) unfoldings of all functions in interface files so that they can be inlined and specialised across modules.

Using these two flags in conjunction will have nearly the same effect as marking every definition as INLINABLE apart from the fact that the unfoldings for INLINABLE definitions are not optimised.



(Source: https://wiki.haskell.org/Inlining_and_Specialisation#Which_flags_can_I_use_to_control_the_simplifier_and_inliner.3F)
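If you would rather scope the two flags to a single module than pass them on the command line, one option is GHC's OPTIONS_GHC file-header pragma. The following is a minimal sketch under that assumption; the module contents are only illustrative, and the extra Show and Eq instances are derived purely so the result can be printed and compared.

```haskell
{-# LANGUAGE DeriveFunctor #-}
-- Apply the two flags quoted above to this module only.
{-# OPTIONS_GHC -fexpose-all-unfoldings -fspecialise-aggressively #-}

-- The Functor instance is still derived; nothing is written by hand.
data Foo a = MakeFoo a a deriving (Show, Eq, Functor)

main :: IO ()
main = print (fmap (+ 1) (MakeFoo 1 2 :: Foo Int))
-- prints: MakeFoo 2 3
```

The flags change how GHC optimises and exports the code, not what the program computes, so the printed result is the same with or without the pragma.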

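These options allow the GHC compiler to inline fmap. The -fexpose-all-unfoldings option, in particular, makes the unfoldings of the functions in the compiled module, including the derived fmap, available to the rest of the program for inlining (and it seems to provide the largest performance benefit). Here is a quick and dumb benchmark I threw together:
functor.hs contains this code: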
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE Strict #-}

data Foo a = MakeFoo a a deriving (Functor)

one_fmap foo = fmap (+1) foo

main = sequence (fmap (\n -> return $ one_fmap $ MakeFoo n n) [1..10000000])

Compiled with no arguments:
$ time ./functor 

real 0m4.036s
user 0m3.550s
sys 0m0.485s

Compiled with -fexpose-all-unfoldings:
$ time ./functor

real 0m3.662s
user 0m3.258s
sys 0m0.404s

Here is the .prof file from that build, to show that the call to fmap is indeed being inlined:
    Sun Oct  7 00:06 2018 Time and Allocation Profiling Report  (Final)

       functor +RTS -p -RTS

    total time  =        1.95 secs   (1952 ticks @ 1000 us, 1 processor)
    total alloc = 4,240,039,224 bytes  (excludes profiling overheads)

COST CENTRE MODULE SRC             %time %alloc

CAF         Main   <entire-module> 100.0  100.0


                                                                individual     inherited
COST CENTRE MODULE                SRC             no. entries  %time %alloc   %time %alloc

MAIN        MAIN                  <built-in>       44       0    0.0    0.0   100.0  100.0
CAF         Main                  <entire-module>  87       0  100.0  100.0   100.0  100.0
CAF         GHC.IO.Handle.FD      <entire-module>  84       0    0.0    0.0     0.0    0.0
CAF         GHC.IO.Encoding       <entire-module>  77       0    0.0    0.0     0.0    0.0
CAF         GHC.Conc.Signal       <entire-module>  71       0    0.0    0.0     0.0    0.0
CAF         GHC.IO.Encoding.Iconv <entire-module>  58       0    0.0    0.0     0.0    0.0

Compiled with -fspecialise-aggressively:
$ time ./functor

real 0m3.761s
user 0m3.300s
sys 0m0.460s

Compiled with both flags:
$ time ./functor

real 0m3.665s
user 0m3.213s
sys 0m0.452s

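These little benchmarks are by no means representative of the performance (or file size) you would see in real code, but they do show that you can force GHC to inline fmap (and that doing so really can have a non-negligible effect on performance).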
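For timings that depend less on laziness, one possible variation (not part of the original benchmark) is to consume every mapped value so that the work done by fmap cannot be left as an unevaluated thunk. The sketch below assumes strict constructor fields and a strict left fold; the helper names oneFmap, step and run are hypothetical.

```haskell
{-# LANGUAGE DeriveFunctor #-}

import Data.List (foldl')

-- Strict fields: constructing a Foo forces both of its components.
data Foo a = MakeFoo !a !a deriving (Functor)

-- Hypothetical helper: one application of the derived fmap.
oneFmap :: Foo Int -> Foo Int
oneFmap = fmap (+ 1)

-- Map over a fresh Foo and add both resulting fields to the accumulator.
step :: Int -> Int -> Int
step acc n = case oneFmap (MakeFoo n n) of
  MakeFoo x y -> acc + x + y

-- Hypothetical driver: strict left fold over 1..limit.
run :: Int -> Int
run limit = foldl' step 0 [1 .. limit]

main :: IO ()
main = print (run 10000000)
```

Printing the final sum makes the program demand every intermediate result, so differences between compiler flags are more likely to reflect the cost of fmap itself.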

Regarding haskell - Inlining derived typeclass methods, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/52601129/
