
random - Upper bound of random number generator

Reposted. Author: 行者123. Last updated: 2023-12-02 01:24:36

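This is a follow-up to an earlier question: Rounding of double precision to single precision: Forcing an upper bound

After I thought the answers to that question had solved my problem, I ran my program again and found that I still had the same problem.

The Mersenne Twister implementation I am using generates a signed 32-bit random integer. The person who implemented the RNG uses this function to generate a random double-precision float in the range [0,1):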

function genrand_real2()
  double precision genrand_real2,r
  integer genrand_int32
  r=dble(genrand_int32())
  if(r.lt.0.d0)r=r+2.d0**32
  genrand_real2=r/4294967296.d0
  return
end

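It works flawlessly, so following the suggestion in the previous question I used the following function to generate a random single-precision float, in what I believed would be the range [0,1):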

function genrand_real()
  real genrand_real, r
  integer genrand_int32
  r = real(genrand_int32())
  if (r .lt. 0.0) r = r + 2.0**32
  genrand_real = r / 4294967296.0
  return
end

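But I got the same error as before, caused by a value of 1.0. So I wrote a small program to check whether my genrand_real actually generates 1.0, and my guess was right: it does. As a result, the way I generate an integer in the range [1,MAX] (in this example [1,5]) ends up producing the value MAX+1, among other inconveniences in the code I am working on.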

i = 0
do while (.true.)
  r = genrand_real()
  if (r .gt. 0.99999) then
    i = i + 1
    print *, 'number is:', r
    print *, 'conversion is: ', int(5*r)+1
  endif
  if (i .gt. tot_large) exit
enddo

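My question is: why does it work for double precision but not for single precision? I don't see a reason for it to fail, since 2**32 fits in a single-precision float. Also, what should I do to fix it? I thought about dividing the number by 2.0**32+1 instead of 2.0**32, but I'm not sure whether that is theoretically correct or whether the numbers would still be uniform.

Best answer

I wasn't sure whether to post this answer on the old question or here. Anyway, I may have a solution (in the second code block).

The routine I used for the same task about two years ago was this: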

function uniran( )
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)
  real(dp) :: tmp
  real :: uniran
  tmp = 0.5_dp + 0.2328306e-9_dp * genrand_int32( )
  uniran = real(tmp)
end function uniran

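I have forgotten where I got the code from, and although it is simple, it contains a subtle trick that I have only now become aware of. The obvious difference is the multiplication instead of a division, but that is only because multiplying is faster than dividing (0.2328306e-9 = 1/4294967296).
The trick is: that equality is not really true. 1/4294967296 = 0.23283064365386962890625e-9, so the routine uses fewer significant digits than double precision can hold (about 15, whereas only 7 are used). If you increase the number of digits, the resulting number gets closer to 1 and becomes exactly 1 during the later conversion to single precision. You can try it: with just one more digit it already starts failing (= 1.0). Obviously this solution is a bit of a hack, so I also tried a different approach, resampling if the result is exactly 1: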
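The effect of the truncated constant can be checked in isolation. The following is a minimal Python sketch, assuming IEEE arithmetic; it uses the standard `struct` module to emulate the conversion `real(tmp)`, and the helper `to_single` and the constant names are made up for this illustration. It evaluates the scaling for the largest signed 32-bit integer with the 7-digit constant, with one extra digit, and with the exact value of 1/4294967296.

```python
import struct

def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

INT32_MAX = 2**31 - 1          # largest value a signed 32-bit PRNG can return

TRUNCATED = 0.2328306e-9       # 7 significant digits, as in uniran
ONE_MORE = 0.23283064e-9       # the same constant with one extra digit
EXACT = 2.0**-32               # exactly 1/4294967296

def scaled_max(factor):
    """Largest single-precision result of 0.5 + factor * int32."""
    return to_single(0.5 + factor * INT32_MAX)

for name, factor in (("truncated", TRUNCATED),
                     ("one more digit", ONE_MORE),
                     ("exact", EXACT)):
    print(name, repr(scaled_max(factor)))
```

Under these assumptions only the truncated constant stays below 1.0 after rounding to single precision; the other two round up to exactly 1.0.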

recursive function resample_uniran( ) result(res)
  implicit none
  integer, parameter :: dp = selected_real_kind(15, 307)
  real(dp) :: tmp
  real :: res
  tmp = 0.5_dp + 0.23283064365386962890625e-9_dp * genrand_int32( )
  res = real(tmp)
  if (res == 1.0) then
    res = resample_uniran()
  end if
end function resample_uniran
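The same resampling idea can also be written as a loop instead of a recursion. This is only a sketch in Python, not a translation of the routine above: `stand_in_int32` is a hypothetical replacement for genrand_int32 built on Python's `random` module, and `to_single` emulates the conversion to single precision with `struct`.

```python
import random
import struct

def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def stand_in_int32():
    """Hypothetical stand-in for genrand_int32: a uniform signed 32-bit integer."""
    return random.getrandbits(32) - 2**31

def resample_uniform(next_int32=stand_in_int32):
    """Return a single-precision value in [0, 1), redrawing whenever rounding gives 1.0."""
    while True:
        res = to_single(0.5 + 2.0**-32 * next_int32())
        if res != 1.0:
            return res

# Quick check: a large batch of draws never reaches 1.0
print(max(resample_uniform() for _ in range(100000)) < 1.0)
```

A loop avoids any growth of the call stack, although rejections are so rare that the recursive form is unlikely to nest more than one level in practice.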

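I wrote a program to test the functions (the module containing the functions and subroutines is at the end of the post; it is rather long):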

program prng_fail
  use mod_prngtest
  implicit none
  integer(kind=16) :: i, j, k

  ! loop counters
  i = 0
  j = 0
  k = 0

  call init_genrand_int32()

  do
    i = i + 1
    j = j + 1
    k = k + 1
    if (genrand_real() == 1.0) then
      print*, 'genrand_real fails after ', i, ' iterations'
      i = 0
    end if
    if (uniran() == 1.0) then
      print*, 'uniran fails after ', j, ' iterations'
      j = 0
    end if
    if (resample_uniran() == 1.0) then
      print*, 'resample_uniran fails after ', k, ' iterations'
      k = 0
    end if
  end do

end program prng_fail

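The result is that genrand_real fails (returns 1.0) fairly often (we are talking about once every few million numbers), whereas the other two have never failed so far. The recursive version costs you time, but it is technically better because the highest possible number is closer to 1.

I also tested speed and "uniformity", and compared against the intrinsic random_number subroutine, which also gives uniform random numbers in [0,1). (Careful: this creates 3 files of 512 MB each.)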
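Why the single-precision genrand_real reaches 1.0 while the double-precision genrand_real2 does not can be reproduced without running a PRNG at all. The sketch below emulates both routines in Python, under the assumption of IEEE arithmetic with round-to-nearest-even and with every single-precision operation rounded via `struct`; the function names are made up for this illustration. A single-precision float has a 24-bit significand, so just below 2**32 the representable values are 256 apart, and sums close enough to 2**32 are rounded up to it.

```python
import struct

def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def genrand_real_single(i32):
    """Emulate the single-precision genrand_real for one signed 32-bit input."""
    r = to_single(float(i32))            # r = real(genrand_int32())
    if r < 0.0:
        r = to_single(r + 2.0**32)       # the addition is rounded to single precision
    return to_single(r / 4294967296.0)   # dividing by a power of two adds no error

def genrand_real2_double(i32):
    """Emulate the double-precision genrand_real2 for the same input."""
    r = float(i32)
    if r < 0.0:
        r += 2.0**32                     # exact: every 32-bit integer fits in a double
    return r / 4294967296.0

# Only inputs close to -1 are scanned; they map to values just below 2**32.
failing = [i for i in range(-1000, 0) if genrand_real_single(i) == 1.0]
print(len(failing), min(failing), max(failing))
print(genrand_real2_double(-1) < 1.0)
print(int(5 * genrand_real_single(-1)) + 1)   # the [1,5] conversion from the question
```

In this emulation the inputs -128 through -1 all come out as exactly 1.0 in single precision, and the conversion int(5*r)+1 then yields 6, while the double-precision version stays below 1.0 for the same inputs.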

program prng_uniformity
  use mod_prngtest
  implicit none
  integer, parameter :: n = 2**27
  real, dimension(n) :: uniran_array, resamp_array, intrin_array
  integer :: array_recl, i
  real :: start_time, end_time

  call init_genrand_int32()
  call init_random_seed()

  ! first check how long they take to produce PRNs
  call cpu_time(start_time)
  do i=1,n
    uniran_array(i) = uniran()
  end do
  call cpu_time(end_time)
  print*, 'uniran took ', end_time - start_time, ' s to produce ', n, ' PRNs'

  call cpu_time(start_time)
  do i=1,n
    resamp_array(i) = resample_uniran()
  end do
  call cpu_time(end_time)
  print*, 'resamp took ', end_time - start_time, ' s to produce ', n, ' PRNs'

  ! time the intrinsic into its own array (all arrays are refilled below)
  call cpu_time(start_time)
  do i=1,n
    call random_number(intrin_array(i))
  end do
  call cpu_time(end_time)
  print*, 'intrin took ', end_time - start_time, ' s to produce ', n, ' PRNs'

  ! then save PRNs into files. Use both() to have the same random
  ! underlying integers, reducing the difference purely to
  ! the scaling into the interval [0,1)
  inquire(iolength=array_recl) uniran_array
  open(11, file='uniran.out', status='replace', access='direct', action='write', recl=array_recl)
  open(12, file='resamp.out', status='replace', access='direct', action='write', recl=array_recl)
  open(13, file='intrin.out', status='replace', access='direct', action='write', recl=array_recl)
  do i=1,n
    call both(uniran_array(i), resamp_array(i))
    call random_number(intrin_array(i))
  end do
  write(11, rec=1) uniran_array
  write(12, rec=1) resamp_array
  write(13, rec=1) intrin_array

end program prng_uniformity

The results are in principle always the same, even though the timings vary:

uniran took   0.700139999      s to produce    134217728  PRNs
resamp took   0.737253010      s to produce    134217728  PRNs
intrin took   0.773686171      s to produce    134217728  PRNs

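uniran is faster than resample_uniran, which in turn is faster than the intrinsic (although this depends heavily on the PRNG; the Mersenne Twister will be slower than the intrinsic).

I also looked at the output each method produces (using Python):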

import numpy as np
import matplotlib.pyplot as plt

def read1dbinary(fname, xdim):
    with open(fname, 'rb') as fid:
        data = np.fromfile(file=fid, dtype=np.single)
    return data

if __name__ == '__main__':
    n = 2**27
    data_uniran = read1dbinary('uniran.out', n)
    print('uniran:')
    print('{0:.15f}'.format(max(data_uniran)))
    plt.hist(data_uniran, bins=1000)
    plt.show()

    data_resamp = read1dbinary('resamp.out', n)
    print('resample uniran:')
    print('{0:.15f}'.format(max(data_resamp)))
    plt.hist(data_resamp, bins=1000)
    plt.show()

    data_intrin = read1dbinary('intrin.out', n)
    print('intrinsic:')
    print('{0:.15f}'.format(max(data_intrin)))
    plt.hist(data_intrin, bins=1000)
    plt.show()

All three histograms look fine visually, but the highest values reveal a shortcoming of uniran:

uniran:
0.999999880790710
resample uniran:
0.999999940395355
intrinsic:
0.999999940395355
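These two maxima correspond to the two largest single-precision values below 1. As a small check, assuming IEEE single precision and emulating it with Python's `struct` module (the names below are made up for this illustration), they can be reproduced with the same format string as in the script above:

```python
import struct

def to_single(x):
    """Round a Python float (IEEE double) to the nearest IEEE single."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

ONE_BELOW = to_single(1.0 - 2.0**-24)   # largest single-precision value below 1
TWO_BELOW = to_single(1.0 - 2.0**-23)   # the next representable value below that

print('{0:.15f}'.format(TWO_BELOW))     # compare with the uniran maximum
print('{0:.15f}'.format(ONE_BELOW))     # compare with the resample/intrinsic maximum
```

So resample_uniran and the intrinsic reach the last representable value before 1.0, while uniran stops one step earlier.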

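I ran it several times and the results were always the same. resample_uniran and the intrinsic have the same highest value, while uniran's highest value is also always the same, but lower. I would have liked some robust statistical test indicating how uniform the output really is, but when trying the Anderson-Darling test, the Kuiper test and the Kolmogorov-Smirnov test I ran into this problem. Essentially, the more samples you have, the more likely the tests are to find something wrong with the output. Maybe one should do something like this, but I haven't gotten around to it yet.

For completeness, the module: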
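The sensitivity to sample size can be made concrete with the Kolmogorov-Smirnov statistic. The following is a plain-Python sketch, not the procedure behind either of the references mentioned above: the function names are hypothetical, `random.Random` stands in for the generators under test, and the 5% critical value uses the common large-sample approximation 1.36/sqrt(n).

```python
import math
import random

def ks_statistic_uniform(sample):
    """Kolmogorov-Smirnov distance between a sample and the uniform CDF on [0, 1]."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # compare the empirical CDF just after and just before each point with F(x) = x
        d = max(d, i / n - x, x - (i - 1) / n)
    return d

def ks_critical_5pct(n):
    """Approximate 5% critical value for large n (assumed asymptotic formula)."""
    return 1.36 / math.sqrt(n)

rng = random.Random(12345)   # stand-in generator, seeded for repeatability
for n in (1000, 10000):
    sample = [rng.random() for _ in range(n)]
    print(n, ks_statistic_uniform(sample), ks_critical_5pct(n))
```

Because the critical value shrinks like 1/sqrt(n), ever smaller deviations from the ideal distribution count as significant as the sample grows, which matches the behaviour described above.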

module mod_prngtest
  implicit none
  integer :: iseed_i, iseed_j, iseed_k, iseed_n
  integer, dimension(4) :: seed

contains

  function uniran( )
    ! Generate uniformly distributed random numbers in [0, 1) from genrand_int32
    ! New version
    integer, parameter :: dp = selected_real_kind(15, 307)
    real(dp) :: tmp
    real :: uniran
    tmp = 0.5_dp + 0.2328306e-9_dp * genrand_int32( )
    uniran = real(tmp)
  end function uniran

  recursive function resample_uniran( ) result(res)
    ! Generate uniformly distributed random numbers in [0, 1) from genrand_int32
    ! New version, now recursive
    integer, parameter :: dp = selected_real_kind(15, 307)
    real(dp) :: tmp
    real :: res
    tmp = 0.5_dp + 0.23283064365386962890625e-9_dp * genrand_int32( )
    res = real(tmp)
    if (res == 1.0) then
      res = resample_uniran()
    end if
  end function resample_uniran

  recursive subroutine both(uniran, resamp)
    integer, parameter :: dp = selected_real_kind(15, 307)
    real(dp) :: tmp1, tmp2
    integer :: prn
    real :: uniran, resamp

    prn = genrand_int32( )

    tmp1 = 0.5_dp + 0.2328306e-9_dp * prn
    uniran = real(tmp1)

    tmp2 = 0.5_dp + 0.23283064365386962890625e-9_dp * prn
    resamp = real(tmp2)
    if (resamp == 1.0) then
      call both(uniran, resamp)
    end if
  end subroutine both

  function genrand_real()
    ! Generate uniformly distributed random numbers in [0, 1) from genrand_int32
    ! Your version, modified by me earlier
    real genrand_real, r
    r = real(genrand_int32())
    if (r .lt. 0.0) r = r + 2.0**32
    genrand_real = r / 4294967296.0
    return
  end function genrand_real

  subroutine init_genrand_int32()
    ! seed the PRNG, if you don't have /dev/urandom comment out this block ...
    open(11, file='/dev/urandom', form='unformatted', access='stream')
    read(11) seed
    iseed_i=1+abs(seed( 1))
    iseed_j=1+abs(seed( 2))
    iseed_k=1+abs(seed( 3))
    iseed_n=1+abs(seed( 4))

    ! ... and use this block instead (any integer > 0)
    !iseed_i = 1253795357
    !iseed_j = 520466003
    !iseed_k = 68202083
    !iseed_n = 1964789093
  end subroutine init_genrand_int32

  function genrand_int32()
    ! From Marsaglia 1994, return pseudorandom integer over the
    ! whole range. Fortran doesn't have a function like that intrinsically.
    ! Replace this with your Mersenne Twister PRNG
    implicit none
    integer :: genrand_int32
    genrand_int32=iseed_i-iseed_k
    if(genrand_int32.lt.0)genrand_int32=genrand_int32+2147483579
    iseed_i=iseed_j
    iseed_j=iseed_k
    iseed_k=genrand_int32
    iseed_n=69069*iseed_n+1013904243
    genrand_int32=genrand_int32+iseed_n
  end function genrand_int32

  subroutine init_random_seed()
    use iso_fortran_env, only: int64
    implicit none
    integer, allocatable :: seed(:)
    integer :: i, n, un, istat, dt(8), pid
    integer(int64) :: t

    call random_seed(size = n)
    allocate(seed(n))
    ! First try if the OS provides a random number generator
    open(newunit=un, file="/dev/urandom", access="stream", &
         form="unformatted", action="read", status="old", iostat=istat)
    if (istat == 0) then
      read(un) seed
      close(un)
    else
      ! Fallback to XOR:ing the current time and pid. The PID is
      ! useful in case one launches multiple instances of the same
      ! program in parallel.
      call system_clock(t)
      if (t == 0) then
        call date_and_time(values=dt)
        t = (dt(1) - 1970) * 365_int64 * 24 * 60 * 60 * 1000 &
             + dt(2) * 31_int64 * 24 * 60 * 60 * 1000 &
             + dt(3) * 24_int64 * 60 * 60 * 1000 &
             + dt(5) * 60 * 60 * 1000 &
             + dt(6) * 60 * 1000 + dt(7) * 1000 &
             + dt(8)
      end if
      pid = getpid()
      t = ieor(t, int(pid, kind(t)))
      do i = 1, n
        seed(i) = lcg(t)
      end do
    end if
    call random_seed(put=seed)
  contains
    ! This simple PRNG might not be good enough for real work, but is
    ! sufficient for seeding a better PRNG.
    function lcg(s)
      integer :: lcg
      integer(int64) :: s
      if (s == 0) then
        s = 104729
      else
        s = mod(s, 4294967296_int64)
      end if
      s = mod(s * 279470273_int64, 4294967291_int64)
      lcg = int(mod(s, int(huge(0), int64)), kind(0))
    end function lcg
  end subroutine init_random_seed
end module mod_prngtest

Regarding "random - Upper bound of random number generator", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/37859027/
