gpt4 book ai didi

performance - 为什么这个 Monte Carlo Haskell 程序这么慢?

转载 作者:行者123 更新时间:2023-12-04 15:04:21 26 4
gpt4 key购买 nike

更新:

我的编译命令是ghc -O2 Montecarlo.hs。我的随机版本是random-1.1,ghc版本是8.6.4,我的系统是macOS Big Sur 11.1(Intel芯片)。我用来测试速度的命令是time ./Montecarlo 10000000,它返回的结果是real 0m17.463s, user 0m17.176s, sys 0m0.162s


以下是使用蒙特卡洛计算pi的Haskell程序。但是,当输入为1000万时,程序运行了20秒。同样逻辑编写的C程序只用了0.206秒。为什么会这样,我怎样才能加快速度?谢谢大家。

这是 Haskell 版本:

import System.Random
import Data.List
import System.Environment

montecarloCircle :: Int -> Double
montecarloCircle x
= 4*fromIntegral
(foldl' (\x y -> if y <= 1 then x+1 else x) 0
$ zipWith (\x y -> (x**2 + y**2))
(take x $ randomRs (-1.0,1) (mkStdGen 1) :: [Double])
(take x $ randomRs (-1.0,1) (mkStdGen 2) :: [Double]) )
/ fromIntegral x

main = do
num <- getArgs
let n = read (num !! 0)
print $ montecarloCircle n

这是 C 版本:

#include <stdio.h>
#include <math.h>
#include <time.h>
#include <stdlib.h>

#define N 10000000

#define int_t long // the type of N and M

// Rand from 0.0 to 1.0
double rand01()
{
return rand()*1.0/RAND_MAX;
}

int main()
{
srand((unsigned)time(NULL));

double x,y;
int_t M = 0;

for (int_t i = 0;i < N;i++)
{
x = rand01();
y = rand01();
if (x*x+y*y<1) M++;
}
double pi = (double)4*M/N;
printf("%lf\n", pi);
}

最佳答案

在某种程度上,在数值应用程序中,Haskell 代码往往比用经典命令式语言(如 C/C++/Fortran)编写的等效代码慢 2 到 5 倍。然而,Haskell 代码慢了 100 倍是非常意外的!

完整性检查:重现结果

使用带有 Intel Celeron N2840 64 位 cpu、运行 Linux kernel 5.3、GLIBC 2.28、GCC 8.3、GHC 8.2.2、GHC random-1.1 的有点旧的笔记本,我们得到了计时:

C code,        gcc -O3:    950 msecHaskell code,  ghc -O3:    104 sec

So indeed, with these configurations, the Haskell code runs about 100 times slower than the C code.

Why is that ?

A first remark is that the π computing arithmetics atop random number generation looks quite simple, hence the runtimes are probably dominated by the generation of 2*10 millions pseudo-random numbers. Furthermore, we have every reason to expect that Haskell and C are using completely unrelated random number generation algorithms.

So, rather than comparing Haskell with C, we could be comparing the random number generation algorithms these 2 languages happen to be using, as well as their respective implementations.

In C, the language standard does not specify which algorithm function srand() is supposed to use, but that tends to be old, simple and fast algorithms.

On the other hand, in Haskell, we have traditionally seen a lot of complaints about the poor efficiency of StdGen, like here back in 2006:

https://gitlab.haskell.org/ghc/ghc/-/issues/427

where one of the leading Haskell luminaries mentions that StdGen could possibly be made 30 times faster.

Fortunately, help is already on the way. This recent blog post explains how the new Haskell random-1.2 package is solving the problem of StdGen lack of speed, using a completely different algorithm known as splitmix.

The field of pseudo-random number generation (PRNG) is a rather active one. Algorithms routinely get obsoleted by newer and better ones. For the sake of perspective, here is a relatively recent (2018) review paper on the subject.

Moving to more recent, better Haskell components:

Using another, slightly more powerful machine with an Intel Core i5-4440 3GHz 64 bits cpu, running Linux kernel 5.10, GLIBC 2.32, GCC 10.2, and critically Haskell package random-1.2:

C code,        gcc -O3:    164 msecHaskell code,  ghc -O3:    985 msec

So Haskell is now “just” 6 times slower instead of 100 times.

And we still have to address the injustice of Haskell having to use x**2+y**2 versus C getting x*x+y*y, a detail which did not really matter before with random-1.1. This gives us 379 msec ! So we are back into the usual 2X-5X ballpark for Haskell to C speed comparisons.

Note that if we ask the Haskell executable to run with statistics on, we get the following output:

$ time q66441802.x +RTS -s -RTS 100000003.1415616     925,771,512 bytes allocated in the heap     ...  Alloc rate    2,488,684,937 bytes per MUT second  Productivity  99.3% of total user, 99.3% of total elapsedreal   0m0,379suser   0m0,373ssys    0m0,004s

so Haskell is found to allocate close to one gigabyte of memory along the way, which helps to understand the speed difference.

A code fix:

We note that the C code uses a single random serie while the Haskell code is using two, with two calls to mkStdGen with numbers 1 and 2 as seeds. This is not only unfair but also incorrect. Applied mathematicians who design PRNG algorithms take great care to ensure any single serie has the right statistical properties, but give essentially no guarantees about possible correlations between different series. It is not unheard of to even use the seed as an offset into a single global sequence.

This is fixed in the following code, which does not alter performance significantly:

computeNorms :: [Double] -> [Double]
computeNorms [] = []
computeNorms [x] = []
computeNorms (x:y:xs2) = (x*x + y*y) : (computeNorms xs2)

monteCarloCircle :: Int -> Double
monteCarloCircle nn =
let
randomSeed = 1
gen0 = mkStdGen randomSeed
-- use a single random serie:
uxys = (randomRs (-1.0, 1.0) gen0) :: [Double]
norms = take nn (computeNorms uxys)
insiderCount = length $ filter (<= 1.0) norms
in
(4.0::Double) * ((fromIntegral insiderCount) / (fromIntegral nn))

边注:

这个 recent SO question 中提到了新的 Haskell random-1.2 包,尽管是在新的 monadic 接口(interface)的上下文中。

结语:

假设练习的目标是衡量 C 和 Haskell 语言的相对运行速度,基本上有两种可能性:

一个是完全避免使用 PRNG,因为使用的算法不同。

另一种是通过手动为两种语言提供一种相同的算法来控制随机数的生成。例如,可以使用公开可用的 MRG32k3a由 Pierre L'Ecuyer 设计的算法。 Candidate Haskell MRG32k3a implementation here (截至 2021 年 3 月)。留作读者练习。

关于performance - 为什么这个 Monte Carlo Haskell 程序这么慢?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/66441802/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com