gpt4 book ai didi

algorithm - 具有最大连续重复次数的加权随机选择

转载 作者:塔克拉玛干 更新时间:2023-11-03 03:50:30 26 4
gpt4 key购买 nike

假设我有一个项目列表,其中每个项目可以重复一次或多次(1到n)。我试图找到一个算法,从列表中提取随机项,直到它为空,但有一个限制,即一个项不能连续重复超过固定次数(每个项可能不同)。我希望算法具有“正确的概率”(稍后我将对此进行解释)。
例如,假设我有这个列表:

Item | Count | Max. consecutive
-----+-------+-----------------
A | 2 | 1
B | 4 | 2

一些结果可能是:
B A B A B B  
B B A B A B

但以下是不正确的,因为“B”有3个连续的重复,当最大值为2时:
B B B A B A

我已经设法创建了一个有效的算法,但它有一个概率问题。我先放代码,然后再谈问题。这是一个C#类;抱歉,如果你不喜欢格式或goto,让我们成为朋友;P
要使用它,您可以实例化一个提供唯一项数量的对象,使用 SetIndexData()为每个项添加信息(元素数量和最大重复次数),并调用 GetNextRandomIndex()获取下一个随机项(我使用 int实现它,就好像项是索引一样,您必须创建一个索引集合来存储实际项的值)。
您可以使用方法 CheckIndexMaxConsecutiveCorrectness()检查是否有可能在指定的总元素的袋子中具有指定的最大重复项。例如,如果一个包只有一个包含两个元素的项,但我们说它只能重复一次,这显然是不可能的,因此 CheckIndexMaxConsecutiveCorrectness()将返回 false
类内部使用方法“cc>”来检索一个项目的连续元素组的最小和最大数目(例如,在序列 GetIndexMinMaxGroups()中,我们有2个组为“A”,2个为“B”)。实际上,在这个方法中计算的数字不是完全正确的,因为有一些变量是不考虑的,但是它返回的值对两件事有用:做从 B B B A B A的检查(如果组的最小数目大于最大值,那么重复的最大次数不正确),并且知道何时必须将一个确定的条目从cc返回(当最小值等于最大值时)。我没有用任何数学算法来推导这些性质,所以可能出了什么问题,尽管到目前为止,它在我的测试中“神奇地”完美地工作了数百万次……
class ShuffleBag
{
/*** INFORMATION FOR THE INDEXES ***/


// Class for the data:
class IndexData
{
public int Count = 0; // The current number of instances of the index in the bag.
public int MaxConsecutive = int.MaxValue; // The maximum consecutive repetitions allowed for the index.
}


// List of indexes data for this bag:
IndexData[] IndexesDataList;




/*** MEMBERS USED FOR CALCULATIONS ***/


// Random number generator:
Random RandGenerator;


// Remaining elements in the bag:
int _RemainingElementsCount = 0;

public int RemainingElementsCount
{
get { return _RemainingElementsCount; }
}


// Last retrieved index (-1 if no index has been retrieved yet),
// and the last consecutive repetitions of that index:
int LastIndex = -1;
int LastRepetitions = 0;








///==============///
/// Constructor. ///
///==============///


public ShuffleBag(int uniqueIndexesCount)
{
IndexesDataList = new IndexData[uniqueIndexesCount];
for (int i = 0; i < uniqueIndexesCount; i++)
IndexesDataList[i] = new IndexData();


RandGenerator = new Random();
}








///===========================================================///
/// Resets the shuffle bag; must be called before reusing it. ///
/// The number of unique indexes won't be reset. ///
/// Doesn't need to be called just after creating the bag. ///
///===========================================================///


public void Reset()
{
for (int i = 0; i < IndexesDataList.Length; i++)
{
IndexesDataList[i].Count = 0;
IndexesDataList[i].MaxConsecutive = int.MaxValue;
}

_RemainingElementsCount = 0;

LastIndex = -1;
LastRepetitions = 0;
}








///==================================================================================================///
/// Checks if it's possible to honor the max repetitions of an index with the provided data. ///
/// If it was not possible, the behaviour of a shuffle bag with those parameters would be undefined. ///
///==================================================================================================///


public static bool CheckIndexMaxConsecutiveCorrectness(int maxConsecutive, int indexElements, int bagTotalElements)
{
int min, max;
GetIndexMinMaxGroups(indexElements, maxConsecutive, bagTotalElements, out min, out max);
return min <= max;
}








///=====================================================================///
/// Sets the data for the specified index. ///
/// Can be called after starting to use the bag, ///
/// but if any parameters make the max consecutive repetitions invalid, ///
/// the behaviour of the bag will be undefined. ///
///=====================================================================///


public void SetIndexData(int index, int count, int maxConsecutive)
{
IndexData data = IndexesDataList[index];

_RemainingElementsCount += count - data.Count;

data.Count = count;
data.MaxConsecutive = maxConsecutive;
}








///====================================================================================================================///
/// Retrieves the next random index. The caller must check if there are remaining elements in the bag to be retrieved. ///
///====================================================================================================================///


public int GetNextRandomIndex()
{
/*** GET THE INDEX ***/


int index;


// If, for any index, the minimum possible groups equals the maximum, it must be the returned index:
for (index = 0; index < IndexesDataList.Length; index++)
{
IndexData data = IndexesDataList[index];

int minGroups, maxGroups;
GetIndexMinMaxGroups(data.Count, data.MaxConsecutive, _RemainingElementsCount, out minGroups, out maxGroups);

if (minGroups == maxGroups)
goto _INDEX_FOUND_;
}


// Get a random number to choose the index:
int rand = RandGenerator.Next(_RemainingElementsCount);

for (index = 0; index < IndexesDataList.Length; index++)
{
IndexData data = IndexesDataList[index];

// This index corresponds with the random number:
if (rand < data.Count)
{
// Check if the index has reached the maximum consecutive repetitions;
// in that case, get the next available one:
if (index == LastIndex && data.MaxConsecutive == LastRepetitions)
{
for (int k = 1; k <= IndexesDataList.Length - 1; k++)
{
int m = WrapIndexSimple(index + k, IndexesDataList.Length);
if (IndexesDataList[m].Count > 0)
{
index = m;
goto _INDEX_FOUND_;
}
}
}

goto _INDEX_FOUND_;
}

// This index doesn't correspond with the random number; update it to check the next index:
else
{
rand -= data.Count;
}
}




/*** INDEX FOUND, UPDATE AND RETURN ***/
_INDEX_FOUND_:

IndexData resultData = IndexesDataList[index];

resultData.Count--;
_RemainingElementsCount--;

if (LastIndex == index)
{
LastRepetitions++;
}
else
{
LastIndex = index;
LastRepetitions = 1;
}


return index;
}








///===============================================================================///
/// Calculates the minimum and maximum possible groups of consecutive repetitions ///
/// for an index with the specified data. ///
/// If any provided data is invalid, the behaviour and results are undefined. ///
///===============================================================================///


static void GetIndexMinMaxGroups(int indexRemainingElements, int indexMaxConsecutive, int bagRemainingElements, out int min, out int max)
{
int rem;
int div = Math.DivRem(indexRemainingElements, indexMaxConsecutive, out rem);

min = rem == 0 ? div : div + 1;
max = bagRemainingElements - indexRemainingElements + 1;
}









///=======================================================///
/// Converts an index out of bounds to a valid value. ///
/// "length" is the number of elements of the collection. ///
/// Only works for indexes that are less than 2*length. ///
///=======================================================///


static int WrapIndexSimple(int index, int length)
{
if (index >= length)
return index - length;

else if (index < 0)
return length + index;

else
return index;
}
}

问题是:
正如我所说,到目前为止,这门课已经按我的要求进行了,在不超过重复次数的情况下提取了项目问题是,每次调用 CheckIndexMaxConsecutiveCorrectness()时提取项的概率并不是应该的。例如,如果我有这个:
Item | Count | Max. consecutive
-----+-------+-----------------
a | 30 | 3
X | 10 | 2

如果我是对的,可能会有13个不同的序列方法在前3次调用中每次返回“x”应返回“a”12次;在第4次调用中,每3次返回“x”应返回“a”10次;等等。这是不同的序列:
Seq. 1 | Seq. 2 | Seq. 3 | Seq. 4 | Seq. 5 | Seq. 6 | Seq. 7 | Seq. 8 | Seq. 9 | Seq. 10 | Seq. 11 | Seq. 12 | Seq. 13 | Number of “a” | Number of “X”
-------+--------+--------+--------+--------+--------+--------+--------+--------+---------+---------+---------+---------+---------------+--------------
a | a | a | X | a | a | a | a | a | a | a | a | a | 12 | 1
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | X | X | X | X | X | X | X | X | X | 3 | 10
a | a | a | X | X | a | a | a | a | a | a | a | a | 11 | 2
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | X | X | X | X | X | X | X | X | 4 | 9
a | a | a | X | X | X | a | a | a | a | a | a | a | 10 | 3
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | X | X | X | X | X | X | X | 5 | 8
a | a | a | X | X | X | X | a | a | a | a | a | a | 9 | 4
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | X | X | X | X | X | X | 6 | 7
a | a | a | X | X | X | X | X | a | a | a | a | a | 8 | 5
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | X | X | X | X | X | 7 | 6
a | a | a | X | X | X | X | X | X | a | a | a | a | 7 | 6
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | a | X | X | X | X | 8 | 5
a | a | a | X | X | X | X | X | X | X | a | a | a | 6 | 7
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | a | a | X | X | X | 9 | 4
a | a | a | X | X | X | X | X | X | X | X | a | a | 5 | 8
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | a | a | a | X | X | 10 | 3
a | a | a | X | X | X | X | X | X | X | X | X | a | 4 | 9
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | a | a | a | a | X | 11 | 2
a | a | a | X | X | X | X | X | X | X | X | X | X | 3 | 10
a | a | X | a | a | a | a | a | a | a | a | a | a | 12 | 1
a | X | a | a | a | a | a | a | a | a | a | a | a | 12 | 1
X | a | a | a | a | a | a | a | a | a | a | a | a | 12 | 1

这个小控制台应用程序显示了类的使用结果。对于指定的迭代次数,它填充包并提取所有随机项。最后,它显示了每个项目的连续重复的最大数目,以及每个项目的累积发生次数和每次迭代的调用到“cc>”。如您所见,经过100万次迭代后,第一次调用的累积元素“a”约为750000,“X”约为250000,最后一次调用的累积元素“a”约为999950,“X”约为50:
static void Main(string[] args)
{
int MaxIterations = 1000000;
bool ShowResults = true;


string[] items = new string[] { "a", "X" };
int[] count = new int[] { 30, 10 };
int[] maxRepet = new int[] { 3, 2 };


int elementCount = count.Sum();


bool maxRepetOK = true;
for (int i = 0; i < items.Length; i++)
maxRepetOK = maxRepetOK && ShuffleBag.CheckIndexMaxConsecutiveCorrectness(maxRepet[i], count[i], elementCount);
if (! maxRepetOK)
{
Console.WriteLine("*** Bad number of repetitions!! ***\n");
goto _END_;
}


Dictionary<string, Tuple<int,int>> results = new Dictionary<string,Tuple<int,int>>();
for (int i = 0; i < items.Length; i++)
results.Add(items[i], new Tuple<int,int>(0, 0));


int[,] resultsPerCall = new int[elementCount, items.Length];


ShuffleBag bag = new ShuffleBag(items.Length);




int iterations = 0;

for (int x = 0; x < MaxIterations; x++)
{
iterations++;


bag.Reset();


for (int i = 0; i < items.Length; i++)
bag.SetIndexData(i, count[i], maxRepet[i]);


string prevItem = "";
int prevRepetitions = 0;


int row = 0;
while (bag.RemainingElementsCount > 0)
{
int newIndex = bag.GetNextRandomIndex();


if (prevItem == items[newIndex])
{
prevRepetitions++;
}
else
{
prevItem = items[newIndex];
prevRepetitions = 1;
}

var resultAnt = results[items[newIndex]];
results[items[newIndex]] = new Tuple<int,int>(resultAnt.Item1 + 1, Math.Max(prevRepetitions, resultAnt.Item2));

resultsPerCall[row, newIndex]++;


if (ShowResults)
{
Console.WriteLine(items[newIndex]);
}


row++;
}


if (ShowResults && MaxIterations > 1)
{
Console.WriteLine("\nESC:\tEnd\nENTER:\tNext iteration\nTAB:\tAll iterations");

while (true)
{
switch (Console.ReadKey(true).Key)
{
case ConsoleKey.Enter:
goto _CONTINUE_;
case ConsoleKey.Escape:
goto _RESULTS_;
case ConsoleKey.Tab:
ShowResults = false;
goto _CONTINUE_;
}
}

_CONTINUE_:
Console.Clear();
Console.WriteLine("Iterating ...");
}
}




_RESULTS_:
Console.Clear();
Console.WriteLine("RESULTS\n------------------------------------------");

for (int i = 0; i < items.Length; i++)
{
var data = results[items[i]];
double average = (double)data.Item1 / iterations;

Console.WriteLine(items[i] +
": Average = " + average.ToString() + (average != count[i] ? "(!)" : "") +
"\t\tMax. repetitions = " + data.Item2.ToString() + (data.Item2 > maxRepet[i] ? "(!)" : ""));
}


Console.WriteLine();
for (int i = 0; i < elementCount; i++)
{
Console.Write((i+1).ToString("00") + ")");
for (int k = 0; k < items.Length; k++)
{
Console.Write(" \t" + items[k] + ": " + resultsPerCall[i, k].ToString());
}
Console.WriteLine();
}




_END_:
Console.WriteLine("\n\nESC to exit");
while (Console.ReadKey(true).Key != ConsoleKey.Escape);
}

问题是,从 GetNextRandomIndex()开始,我只使用一个项目的剩余元素数作为选择它的概率,因此对于第一个调用,获得“a”的概率是获得“x”的概率的3倍(因为“a”有30个元素,而“x”有10个元素)。有人知道我如何改变我的算法(或使用不同的算法)来获得正确的概率吗?

最佳答案

正如所承诺的,这里有一个马尔可夫链蒙特卡罗抽样。我好像不能
使用propp-wilson技术得到一个精确的版本;这个只是
收敛到在
期望的结果,可能相当缓慢定义a的分数
排列为无效字母的数目。具体来说,如果A
最多应连续出现一次,B最多应出现
连续两次,然后得分

AAABBAABBBB
^^ ^ ^^

5,其中投稿字母用 ^标记。
从随机排列开始,重复选择两个位置
独立均匀随机更换排列得分
在那些位置上的字母互换;这是提议的
置换。计算2的幂(当前分数-
建议得分)如果零到一之间的均匀随机浮点数小于
如果超过此数字,则保留此建议的交换;否则,撤消它做
这只要你能站着,然后采取下一个有效的排列。

关于algorithm - 具有最大连续重复次数的加权随机选择,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/26346903/

26 4 0
Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
广告合作:1813099741@qq.com 6ren.com