
c++ - What am I doing wrong with these random numbers?

Reposted. Author: 太空宇宙. Updated: 2023-11-04 13:09:42

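I've been told that rand() mod n produces biased results, so I tried writing this code to check. It generates s numbers ranging from 1 to l, then sorts them by number of occurrences.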

#include <cstdlib>   // rand, srand
#include <ctime>     // time
#include <iostream>
#include <random>
#include <utility>   // swap

using namespace std;

struct vec_struct{
    int num;      // the generated value
    int count;    // how many times it occurred
    double ratio; // share of all samples, in percent
};

// Bubble sort, ascending by value.
void num_sort(vec_struct v[], int n){
    for (int i = 0; i < n-1; i++){
        for (int k = 0; k < n-1-i; k++){
            if (v[k].num > v[k+1].num) swap(v[k], v[k+1]);
        }
    }
}

// Bubble sort, descending by occurrence count.
void count_sort(vec_struct v[], int n){
    for (int i = 0; i < n-1; i++){
        for (int k = 0; k < n-1-i; k++){
            if (v[k].count < v[k+1].count) swap(v[k], v[k+1]);
        }
    }
}

int main(){

    // Seed rand() from the clock; mt19937 is seeded from random_device below.
    srand(time(0));

    random_device rnd;

    int s, l, b, c = 1;

    cout << "How many numbers to generate? ";
    cin >> s;

    cout << "Generate " << s << " numbers ranging from 1 to? ";
    cin >> l;

    cout << "Use rand or mt19937? [1/2] ";
    cin >> b;

    // Reject input that would leave the array empty or uninitialized.
    if (!cin || s < 1 || l < 1 || (b != 1 && b != 2)){
        cout << "Invalid input" << endl;
        return 1;
    }

    vec_struct * vec = new vec_struct[s];

    mt19937 engine(rnd());
    uniform_int_distribution <int> dist(1, l);

    if (b == 1){
        for (int i = 0; i < s; i++){
            vec[i].num = (rand() % l) + 1;
        }
    } else {
        for (int i = 0; i < s; i++){
            vec[i].num = dist(engine);
        }
    }
    num_sort(vec, s);

    // Collapse each run of equal values into one (num, count, ratio) entry
    // at the front of vec. d ends up as the number of distinct values drawn.
    int d = 0;
    for (int i = 0; i < s; i++){
        // Check i+1 < s first so the last element is never compared past the end.
        if (i+1 < s && vec[i].num == vec[i+1].num){
            c++;
        } else {
            vec[d].num = vec[i].num;
            vec[d].count = c;
            vec[d].ratio = ((double)c/s)*100;
            d++;
            c = 1;
        }
    }
    // Sort only the d entries that were filled in; d can be smaller than l.
    count_sort(vec, d);

    if (d >= 20){

        cout << endl << "Showing the 10 most common numbers" << endl;
        for (int i = 0; i < 10; i++){
            cout << vec[i].num << "\t" << vec[i].count << "\t" << vec[i].ratio << "%" << endl;
        }

        cout << endl << "Showing the 10 least common numbers" << endl;
        for (int i = d-10; i < d; i++){
            cout << vec[i].num << "\t" << vec[i].count << "\t" << vec[i].ratio << "%" << endl;
        }
    } else {

        for (int i = 0; i < d; i++){
            cout << vec[i].num << "\t" << vec[i].count << "\t" << vec[i].ratio << "%" << endl;
        }
    }

    delete[] vec;
}

After running this code, I can see the expected bias from rand():

$ ./rnd_test 
How many numbers to generate? 10000
Generate 10000 numbers ranging from 1 to? 50
Use rand or mt19937? [1/2] 1

Showing the 10 most common numbers
17 230 2.3%
32 227 2.27%
26 225 2.25%
25 222 2.22%
3 221 2.21%
10 220 2.2%
35 218 2.18%
5 217 2.17%
13 215 2.15%
12 213 2.13%

Showing the 10 least common numbers
40 187 1.87%
7 186 1.86%
39 185 1.85%
42 184 1.84%
43 184 1.84%
34 182 1.82%
21 175 1.75%
22 175 1.75%
18 173 1.73%
44 164 1.64%

However, I get almost the same results with mt19937 and uniform_int_distribution! What is wrong here? Shouldn't the distribution be uniform, or is the test useless?

Best answer

No, it should not be perfectly uniform, so the output above is not evidence of any error.

The numbers are random, so the counts should be fairly even, but not exactly equal.

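In particular, with 10000 draws over 50 values you would expect each number to occur about 10000/50 = 200 times, with a standard deviation of about sqrt(200), roughly 14. Across 50 numbers you would expect the most extreme counts to deviate by about 2 standard deviations, that is, about ±28.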
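The arithmetic above can be checked with a small sketch. It assumes each draw is independent and every value is equally likely, so the count of any one value follows a binomial distribution; the names `Spread` and `expected_spread` are hypothetical and do not come from the original post.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not from the original post): expected count and
// standard deviation of the count for one particular value, when `samples`
// numbers are drawn uniformly from `range` equally likely values.
struct Spread {
    double mean;    // expected number of occurrences
    double stddev;  // binomial standard deviation of that count
};

Spread expected_spread(int samples, int range) {
    assert(samples > 0 && range > 0);
    double p = 1.0 / range;                              // chance of hitting one value
    double mean = samples * p;                           // n * p
    double stddev = std::sqrt(samples * p * (1.0 - p));  // sqrt(n * p * (1 - p))
    return {mean, stddev};
}
```

For `expected_spread(10000, 50)` this gives a mean of 200 and a standard deviation of 14 (the exact binomial value is sqrt(196); the sqrt(200) in the answer is the usual approximation). Counts between roughly 172 and 228 are therefore unremarkable for either generator.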

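The bias caused by reducing rand() modulo n, whose size depends on RAND_MAX, is smaller than that sampling noise, so you would need a lot more samples to detect it.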
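To put a number on how small the modulo bias is, here is a rough sketch. It assumes rand() returns each of its RAND_MAX + 1 possible values with equal probability, which is an idealization; `ModuloBias` and `modulo_bias` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not from the original post): describes how unevenly
// `source_values` equally likely raw outputs map onto n residues under `% n`.
// For rand(), source_values would be RAND_MAX + 1.
struct ModuloBias {
    std::uint64_t low_count;   // raw values mapping to each less-favored residue
    std::uint64_t high_count;  // raw values mapping to each favored residue
    std::uint64_t favored;     // how many residues receive the extra raw value
};

ModuloBias modulo_bias(std::uint64_t source_values, std::uint64_t n) {
    assert(n > 0 && source_values >= n);
    std::uint64_t q = source_values / n;  // every residue gets at least q raw values
    std::uint64_t r = source_values % n;  // the first r residues get one more
    return {q, r == 0 ? q : q + 1, r};
}
```

Assuming RAND_MAX is 2147483647 (as on glibc; other platforms differ), `modulo_bias(2147483648ULL, 50)` reports 48 residues backed by 42949673 raw values and 2 residues backed by 42949672, a relative difference of about 2.3e-8. Even with RAND_MAX at 32767, the smallest value the C standard allows, the split is 656 versus 655, about 0.15%. Both are far below the roughly 7% (14 out of 200) sampling noise seen with 10000 draws.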

Regarding "c++ - What am I doing wrong with these random numbers?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40481909/
