
c++ - Default random engine for PRNG in C++ generates the same output for every instance of a class - proper seeding?

Reposted. Author: 行者123. Updated: 2023-11-30 03:38:43

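I have no experience with pseudo-random number generation (PRNG), but I have been thinking about it lately because I want to test something, and generating the data by hand is tedious and, frankly, error-prone.

I have the following class: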

#include <QObject>
#include <QList>
#include <QVector3D>
#include <random>
#include <functional>

// TaskCommData is part of a Task instance (a QRunnable).
// It contains all the data required for partially controlling the runnable
// and what it processes inside its run() method
class TaskCommData : public QObject
{
    friend class Task;
    Q_OBJECT
    // Property is used to abort the run() of the Task and also signal the TaskManager that the Task has changed its running status
    Q_PROPERTY(bool running
               READ isRunning
               WRITE setRunningStatus
               NOTIFY signalRunningStatusChanged)
public:
    QString getId() const; // Task ID
    bool isRunning() const;
signals:
    void signalRunningStatusChanged(QString id, bool running);
public slots:
    void slotAbort();
private:
    bool running;
    QList<QVector3D> data; // Some data in the form of a list of 3D vectors
    QString id;

    // PRNG related members
    std::default_random_engine* engine;
    std::uniform_int_distribution<>* distribution;
    std::function<int()> dice;

    // Private constructor (don't allow creation of TaskCommData outside the Task class, which instantiates the class as its class member)
    explicit TaskCommData(QString id, QObject *parent = 0);

    void setRunningStatus(bool running);
    QList<QVector3D>* getData();
    void generateData();
};

This object is created in a Qt 5.7 based application and attached to a set of QRunnables. The important parts are as follows:

#include <QDebug>
#include "TaskCommData.h"

// ...

TaskCommData::TaskCommData(QString _id, QObject *parent)
    : QObject(parent),
      running(false),
      id(_id)
{
    this->engine = new std::default_random_engine();
    this->distribution = new std::uniform_int_distribution<int>(0, 1);
    this->dice = std::bind(*this->distribution, *this->engine);

    generateData();
}

// ...

void TaskCommData::generateData()
{
    QString s;
    s += QString("Task %1: Generated data [").arg(this->id);
    for(int i = 0; i < 10; ++i) {
        this->data.append(QVector3D(dice(), dice(), dice())); // PROBLEM occurs here but it's probably just the aftermath
        s += "[" + QString::number(this->data.at(i).x()) + ","
                 + QString::number(this->data.at(i).y()) + ","
                 + QString::number(this->data.at(i).z()) + "]";
    }
    s += "]";
    qDebug() << s;
}

After initialization I get the following output from qDebug() (I create 10 Task instances, each of which instantiates a TaskCommData, one per task):

"Task task_0: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_0" (sleep: 0)
"Task task_1: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_1" (sleep: 1315)
"Task task_2: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_2" (sleep: 7556)
"Task task_3: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_3" (sleep: 4586)
"Task task_4: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_4" (sleep: 5328)
"Task task_5: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_5" (sleep: 2189)
"Task task_6: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_6" (sleep: 470)
"Task task_7: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_7" (sleep: 6789)
"Task task_8: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_8" (sleep: 6793)
"Task task_9: Generated data [[1,0,0][0,1,0][1,1,0][1,0,1][0,0,1][0,1,1][0,0,0][1,1,1][0,1,1][1,0,1]]"
Added task "task_9" (sleep: 9347)

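As you have probably guessed from the output, I was expecting more variety (obviously there cannot be that much variety, since a single data block (a QVector3D) consists of 3 binary values), so something is clearly wrong here.

You may also have noticed the (sleep: ...) in the output. It comes from my TaskManager class, which creates a bunch of Tasks and their respective TaskCommData: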

void TaskManager::initData()
{
    // Setup PRNG
    std::default_random_engine generator;
    std::uniform_int_distribution<int> distribution(0,10000); // Between 0 and 10000ms
    auto dice = std::bind(distribution, generator);

    this->tasks.reserve(this->taskCount);
    qDebug() << "Adding" << this->taskCount << "tasks...";
    int msPauseBetweenChunks = 0;

    for(int taskIdx = 0; taskIdx < this->taskCount; ++taskIdx) {
        msPauseBetweenChunks = dice();
        Task* task = new Task("task_" + QString::number(taskIdx), msPauseBetweenChunks);
        task->setAutoDelete(false);
        const TaskCommData *taskCommData = task->getCommData();

        // Manage connections
        connect(taskCommData, SIGNAL(signalRunningStatusChanged(QString, bool)),
                this, SLOT(slotRunningStatusChanged(QString, bool)));
        connect(this, SIGNAL(signalAbort()),
                taskCommData, SLOT(slotAbort()));
        this->tasks.insert(task->getCommData()->getId(), task);
        qDebug() << "Added task " << task->getCommData()->getId() << " (sleep: " << msPauseBetweenChunks << ")";
    }

    emit signalCurrentlyRunningTasks(this->tasksRunning, this->taskCount);
}

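Here I have the same thing (although not as class members) and it works (the range is different, but still).

Originally I had the same snippet (the random number generation code from TaskManager::initData()) inside void TaskCommData::generateData(); that is, the engine, the distribution and the dice were all on the stack and were destroyed once they went out of scope. The result was the same: the same set of random numbers repeated over and over.

Then I figured the problem was the seed (or rather the lack of one). So I changed the code to: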

// ...
std::chrono::nanoseconds nanoseed =
    std::chrono::duration_cast<std::chrono::nanoseconds>(
        std::chrono::system_clock::now().time_since_epoch());
qDebug() << "Setting PRNG engine to seed" << nanoseed.count();
this->engine = new std::default_random_engine();
this->engine->seed(nanoseed.count());
this->distribution = new std::uniform_int_distribution<int>(0, 1);
this->dice = std::bind(*this->distribution, *this->engine);

generateData();
// ...

and got slightly better results:

Setting PRNG engine to seed 1473233571281947000
"Task task_0: Generated data [[1,0,0][0,1,1][0,0,0][0,1,1][1,0,0][1,0,0][0,0,1][1,1,1][1,0,0][1,0,0]]"
Added task "task_0" (sleep: 0 )
Setting PRNG engine to seed 1473233571282947700
"Task task_1: Generated data [[1,0,1][1,0,0][1,0,1][0,0,1][1,1,0][0,0,1][0,0,1][0,1,0][0,1,0][0,1,0]]"
Added task "task_1" (sleep: 1315 )
Setting PRNG engine to seed 1473233571282947700
"Task task_2: Generated data [[1,0,1][1,0,0][1,0,1][0,0,1][1,1,0][0,0,1][0,0,1][0,1,0][0,1,0][0,1,0]]"
Added task "task_2" (sleep: 7556 )
Setting PRNG engine to seed 1473233571283948400
"Task task_3: Generated data [[0,0,1][1,0,1][0,1,1][1,1,1][1,0,0][0,0,0][0,0,1][1,1,0][0,1,1][0,0,1]]"
Added task "task_3" (sleep: 4586 )
Setting PRNG engine to seed 1473233571283948400
"Task task_4: Generated data [[0,0,1][1,0,1][0,1,1][1,1,1][1,0,0][0,0,0][0,0,1][1,1,0][0,1,1][0,0,1]]"
Added task "task_4" (sleep: 5328 )
Setting PRNG engine to seed 1473233571284950700
"Task task_5: Generated data [[0,0,0][1,1,0][0,0,1][0,0,1][0,1,1][1,0,0][1,0,0][1,0,1][0,0,0][0,0,0]]"
Added task "task_5" (sleep: 2189 )
Setting PRNG engine to seed 1473233571284950700
"Task task_6: Generated data [[0,0,0][1,1,0][0,0,1][0,0,1][0,1,1][1,0,0][1,0,0][1,0,1][0,0,0][0,0,0]]"
Added task "task_6" (sleep: 470 )
Setting PRNG engine to seed 1473233571285950800
"Task task_7: Generated data [[0,0,0][1,0,0][0,1,1][1,0,0][1,0,1][0,1,0][1,0,1][0,1,0][1,1,0][0,0,1]]"
Added task "task_7" (sleep: 6789 )
Setting PRNG engine to seed 1473233571285950800
"Task task_8: Generated data [[0,0,0][1,0,0][0,1,1][1,0,0][1,0,1][0,1,0][1,0,1][0,1,0][1,1,0][0,0,1]]"
Added task "task_8" (sleep: 6793 )
Setting PRNG engine to seed 1473233571286950900
"Task task_9: Generated data [[1,0,1][1,1,1][1,0,0][1,1,0][0,1,1][0,0,0][1,0,1][1,0,1][0,0,0][1,0,1]]"
Added task "task_9" (sleep: 9347 )

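There is still too much repetition, though (it seems the same data is generated in pairs). This approach also has a big drawback: it depends on how fast the TaskCommData objects are created and on what happens between the creation of two instances. The faster they are created, the smaller the difference measured with std::chrono::system_clock::now(). That does not seem like a good way to generate a seed (though I may of course be wrong :D).

Any idea how to fix this? Even if the problem is the seed, I still do not understand why everything works fine in TaskManager::initData() but not so much here.

Best answer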

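So, yes, the first case behaves correctly: if you seed all your PRNGs with the same (default) seed, they must produce the same sequence of numbers. That is what they are designed to do.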
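To make the first case concrete, here is a minimal sketch, assuming nothing beyond the standard <random> header. The function name drawFromDefaultSeededEngine is hypothetical and does not appear in the question; it builds an engine without a seed argument on every call, which is roughly what each TaskCommData constructor does.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical helper (not from the question): draws `count` values in [0, 1]
// from an engine that is constructed without a seed argument.
std::vector<int> drawFromDefaultSeededEngine(std::size_t count)
{
    assert(count > 0);
    std::default_random_engine engine;             // no seed given: fixed default seed
    std::uniform_int_distribution<int> bit(0, 1);

    std::vector<int> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(bit(engine));             // advances this engine's own state
    }
    return values;
}
```

Two calls to this function return the same vector, because each call starts a fresh engine from the same state. This would also seem to explain why TaskManager::initData() looks fine: there a single engine is created once and the same dice is called repeatedly in the loop, so each call advances one shared sequence instead of restarting it.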

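In the second case you used a time-based seed, and you noticed that this is not great either: you effectively got only a handful of distinct seed values (six for ten tasks in the log above), which, as you also noted, is not surprising because the seeds were generated at roughly the same time. So this is yet another illustration of why time-based seeds are usually bad. Honestly, I don't know why we still teach that. Cases where seeding from the time is a good idea are actually very rare¹, now that I think about it, at least as soon as you need something that is genuinely unpredictable from the outside. If you do not need real unpredictability, any static seed will do.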

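So here is the thing: how about simply using your task number as the seed? That way you are guaranteed to have as many different PRN sequences as you have tasks. If you need different values on different runs, you can still first take a time-based random number (or, better: ask your operating system for a random number!) and add the task number to it, which again guarantees distinct sequences.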
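The suggestion above could be sketched roughly as follows. All three function names (makeTaskEngine, pickBaseSeed, generateBits) are hypothetical and not part of the question's code. The sketch also swaps std::default_random_engine for std::mt19937, which is an assumption made here rather than something the answer prescribes: a linear congruential engine such as std::minstd_rand0 (which some standard libraries use as the default engine) replaces a seed of 0 with 1, so tasks 0 and 1 could end up with the same sequence.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: one engine per task, seeded with "base seed + task index".
// std::mt19937 is an assumption of this sketch (see the note above).
std::mt19937 makeTaskEngine(std::uint32_t baseSeed, std::uint32_t taskIndex)
{
    return std::mt19937(baseSeed + taskIndex);     // unsigned addition simply wraps around
}

// Hypothetical helper: asks the platform for one random number to use as the
// base seed. std::random_device usually reads an OS entropy source, although
// the standard allows a deterministic fallback on some platforms.
std::uint32_t pickBaseSeed()
{
    std::random_device device;
    return static_cast<std::uint32_t>(device());
}

// Hypothetical helper: fills a vector with `count` values in [0, 1].
std::vector<int> generateBits(std::mt19937& engine, std::size_t count)
{
    assert(count > 0);
    std::uniform_int_distribution<int> bit(0, 1);
    std::vector<int> values;
    values.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        values.push_back(bit(engine));
    }
    return values;
}
```

In a setup like the one in the question, the manager would call pickBaseSeed() once (or use a fixed constant for reproducible tests) and pass the base seed together with the task index to each task. One more detail worth knowing: std::bind copies its arguments unless they are wrapped in std::ref, so a bound dice owns its own copy of the engine and never advances the original object.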


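¹ Time-based seeding has been the security flaw behind a large number of unauthorized accesses. Typical example: some networked process control system has a web interface that you need to log in to. You then get a cookie with a secret session ID. The only problem is that this session ID is just a random number run through a known "stringifier", and the RNG was seeded with the time at which the actual user logged in. Because determining the device's time is usually easy, and it is easy to guess the time range in which that login could have happened, the session ID is far from secret and can usually be brute-forced with a very small number of attempts.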
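To illustrate why such a session ID is weak, here is a small sketch under stated assumptions: the scheme below (first output of a std::mt19937 seeded with the login time in whole seconds) is invented for illustration and is not taken from any real product, and both function names are hypothetical. It only shows that the search space is the size of the guessed time window, not the size of the ID.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical session-ID scheme: the ID is the first output of an engine
// seeded with the login time in seconds.
std::uint32_t makeSessionId(std::uint32_t loginTimeSeconds)
{
    std::mt19937 rng(loginTimeSeconds);
    return static_cast<std::uint32_t>(rng());
}

// Tries every second in [windowStart, windowEnd] and returns the first seed
// that reproduces observedId, or -1 if no seed in the window matches.
std::int64_t recoverLoginTime(std::uint32_t observedId,
                              std::uint32_t windowStart,
                              std::uint32_t windowEnd)
{
    assert(windowStart <= windowEnd);
    // 64-bit loop counter so the loop also terminates when windowEnd is UINT32_MAX.
    for (std::uint64_t t = windowStart; t <= windowEnd; ++t) {
        if (makeSessionId(static_cast<std::uint32_t>(t)) == observedId) {
            return static_cast<std::int64_t>(t);
        }
    }
    return -1;
}
```

A two-hour window means at most 7201 candidate seeds, which a loop like this covers almost instantly. A seed taken from an operating system entropy source does not have this property.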

Regarding "c++ - Default random engine for PRNG in C++ generates the same output for every instance of a class - proper seeding?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/39364195/
