gpt4 book ai didi

algorithm - 数组保持不变的概率是多少?

转载 作者:行者123 更新时间:2023-12-01 16:36:51 25 4
gpt4 key购买 nike

这个问题在微软面试中被问到了。非常想知道为什么这些人会问这么奇怪的概率问题?

给定一个 rand(N),一个随机生成器,它生成从 0 到 N-1 的随机数。

int A[N]; // An array of size N
for(i = 0; i < N; i++)
{
int m = rand(N);
int n = rand(N);
swap(A[m],A[n]);
}

编辑:请注意,种子不是固定的。

数组 A 保持不变的概率是多少?
假设数组包含唯一元素。

最佳答案

好吧,我对这个有点乐趣。当我第一次阅读这个问题时,我想到的第一件事是群论(特别是对称群 Sn)。 for 循环通过在每次迭代中组合转置(即交换)来简单地在 Sn 中构建置换 σ。我的数学不是那么出色,而且我有点生疏,所以如果我的符号不合适,请忍受我。


概述

A是我们的数组在排列后不变的事件。我们最终被要求找到事件的概率 A , Pr(A) .

我的解决方案尝试遵循以下过程:

  • 考虑所有可能的排列(即我们数组的重新排序)
  • 根据它们包含的所谓身份转换的数量,将这些排列划分为不相交的集合。这有助于将问题减少到仅偶数排列。
  • 给定排列是偶数(并且具有特定长度),确定获得身份排列的概率。
  • 将这些概率相加以获得数组不变的总体概率。

  • 1) 可能的结果

    请注意,for 循环的每次迭代都会创建一个交换(或转置),导致以下两种情况之一(但不会同时出现):
  • 交换了两个元素。
  • 元素与自身交换。就我们的意图和目的而言,数组没有改变。

  • 我们标记第二种情况。让我们定义一个身份转换如下:

    An identity transposition occurs when a number is swapped with itself. That is, when n == m in the above for loop.



    对于所列代码的任何给定运行,我们编写 N换位。可以有 0, 1, 2, ... , N出现在这个“链”中的身份转换。

    例如,考虑一个 N = 3案例:
    Given our input [0, 1, 2].
    Swap (0 1) and get [1, 0, 2].
    Swap (1 1) and get [1, 0, 2]. ** Here is an identity **
    Swap (2 2) and get [1, 0, 2]. ** And another **

    请注意,有奇数个非同一性转置 (1) 且数组已更改。

    2)基于身份换位次数的分区

    K_i是事件 i身份换位出现在给定的排列中。请注意,这形成了所有可能结果的详尽划分:
  • 任何排列都不能同时具有两个不同数量的身份转换,并且
  • 所有可能的排列必须介于 0 之间和 N身份转换。

  • 因此我们可以应用 Law of Total Probability :
                          

    Now we can finally take advantage of the the partition. Note that when the number of non-identity transpositions is odd, there is no way the array can go unchanged*. Thus:

                            

    *From group theory, a permutation is even or odd but never both. Therefore an odd permutation cannot be the identity permutation (since the identity permutation is even).

    3) Determining Probabilities

    So we now must determine two probabilities for N-i even:

    1. Pr(K_i)
    2. Pr(A|K_i)

    The First Term

    The first term, Pr(K_i), represents the probability of obtaining a permutation with i identity transpositions. This turns out to be binomial since for each iteration of the for loop:

    • The outcome is independent of the results before it, and
    • The probability of creating an identity transposition is the same, namely 1/N.

    Thus for N trials, the probability of obtaining i identity transpositions is:

                          

    The Second Term

    So if you've made it this far, we have reduced the problem to finding Pr(A|K_i) for N - i even. This represents the probability of obtaining an identity permutation given i of the transpositions are identities. I use a naive counting approach to determine the number of ways of achieving the identity permutation over the number of possible permutations.

    First consider the permutations (n, m) and (m, n) equivalent. Then, let M be the number of non-identity permutations possible. We will use this quantity frequently.

                                  

    The goal here is to determine the number of ways a collections of transpositions can be combined to form the identity permutation. I will try to construct the general solution along side an example of N = 4.


    Let's consider the N = 4 case with all identity transpositions (i.e. i = N = 4). Let X represent an identity transposition. For each X, there are N possibilities (they are: n = m = 0, 1, 2, ... , N - 1). Thus there are N^i = 4^4 possibilities for achieving the identity permutation. For completeness, we add the binomial coefficient, C(N, i), to consider ordering of the identity transpositions (here it just equals 1). I've tried to depict this below with the physical layout of elements above and the number of possibilities below:

    I  =  _X_   _X_   _X_   _X_
    N * N * N * N * C(4, 4) => N^N * C(N, N) possibilities

    现在没有明确替换 N = 4i = 4 ,我们可以看看一般情况。结合上面的分母,我们发现:
                              

    This is intuitive. In fact, any other value other than 1 should probably alarm you. Think about it: we are given the situation in which all N transpositions are said to be identities. What's the probably that the array is unchanged in this situation? Clearly, 1.


    Now, again for N = 4, let's consider 2 identity transpositions (i.e. i = N - 2 = 2). As a convention, we will place the two identities at the end (and account for ordering later). We know now that we need to pick two transpositions which, when composed, will become the identity permutation. Let's place any element in the first location, call it t1. As stated above, there are M possibilities supposing t1 is not an identity (it can't be as we have already placed two).

    I  =  _t1_   ___   _X_   _X_
    M * ? * N * N

    唯一可能排在第二位的元素是 t1 的倒数。 ,实际上是 t1 (这是唯一的逆唯一性)。我们再次包含二项式系数:在这种情况下,我们有 4 个开放位置,我们希望放置 2 个身份排列。我们有多少种方法可以做到这一点? 4 选择 2。
    I  =  _t1_   _t1_   _X_   _X_ 
    M * 1 * N * N * C(4, 2) => C(N, N-2) * M * N^(N-2) possibilities

    再看看一般情况,这一切都对应于:
                          

    Finally we do the N = 4 case with no identity transpositions (i.e. i = N - 4 = 0). Since there are a lot of possibilities, it starts to get tricky and we must be careful not to double count. We start similarly by placing a single element in the first spot and working out possible combinations. Take the easiest first: the same transposition 4 times.

    I  =  _t1_   _t1_   _t1_   _t1_ 
    M * 1 * 1 * 1 => M possibilities

    现在让我们考虑两个独特的元素 t1t2 .有 M t1 的可能性并且只有 M-1 t2 的可能性(因为 t2 不能等于 t1 )。如果我们穷尽所有的安排,我们会留下以下模式:
    I  =  _t1_   _t1_   _t2_   _t2_ 
    M * 1 * M-1 * 1 => M * (M - 1) possibilities (1)st

    = _t1_ _t2_ _t1_ _t2_
    M * M-1 * 1 * 1 => M * (M - 1) possibilities (2)nd

    = _t1_ _t2_ _t2_ _t1_
    M * M-1 * 1 * 1 => M * (M - 1) possibilities (3)rd

    现在让我们考虑三个独特的元素, t1 , t2 , t3 .让我们放置 t1先然后 t2 .像往常一样,我们有:
    I  =  _t1_   _t2_   ___   ___ 
    M * ? * ? * ?

    我们还不能说有多少可能 t2 s 可能还有,我们将在一分钟内看到原因。

    我们现在放置 t1在第三个位置。通知, t1必须去那里,因为如果要去最后一个地方,我们只会重新创建 (3)rd上面的安排。重复计算是不好的!这留下了第三个唯一元素 t3到最终位置。
    I  =  _t1_   _t2_   _t1_   _t3_ 
    M * ? * 1 * ?

    那么为什么我们要花一点时间来考虑 t2的数量?更接近?换位 t1t2 不能 是不相交的排列(即它们必须共享其 nm 中的一个(并且只有一个,因为它们也不能相等)。这样做的原因是因为如果它们不相交,我们可以交换排列顺序。这意味着我们将重复计算 (1)st安排。

    t1 = (n, m) . t2必须是形式 (n, x)(y, m)一些 xy为了不相交。请注意 x可能不是 nmy许多不是 nm .因此,可能的排列数 t2可能实际上是 2 * (N - 2) .

    所以,回到我们的布局:
    I  =  _t1_    _t2_    _t1_   _t3_ 
    M * 2(N-2) * 1 * ?

    现在 t3必须是 t1 t2 t1 的组合的倒数.让我们手动完成:
    (n, m)(n, x)(n, m) = (m, x) 

    因此 t3必须是 (m, x) .注意这是 不是 t1 不相交并且不等于 t1t2所以在这种情况下没有重复计算。
    I  =  _t1_    _t2_    _t1_   _t3_ 
    M * 2(N-2) * 1 * 1 => M * 2(N - 2) possibilities

    最后,将所有这些放在一起:




    4) 把它们放在一起

    所以就是这样。向后工作,将我们发现的结果代入步骤 2 中给出的原始总和。我计算了 N = 4 的答案。下面的情况。它与另一个答案中的经验数字非常匹配!

    N = 4
    M = 6 _________ _________ _________
    | Pr(K_i) | Pr(A | K_i) |产品 |
    _________|_________|_____________|_________|
    | | | | |
    |我 = 0 | 0.316 | 120/1296 | 0.029 |
    |_________|_________|_____________|_________|
    | | | | |
    |我 = 2 | 0.211 | 6/36 | 0.035 |
    |_________|_________|_____________|_________|
    | | | | |
    |我 = 4 | 0.004 | 1/1 | 0.004 |
    |_________|_________|_____________|_________|
    | | |
    |总和: | 0.068 |
    |_________|_________|

    正确性

    如果群论中有一个结果可以应用在这里会很酷——也许有!它肯定会有助于使所有这些繁琐的计数完全消失(并将问题缩短为更优雅的问题)。我在 N = 4 停止工作.对于 N > 5 ,给出的只是一个近似值(有多好,我不确定)。如果你仔细想想,很清楚为什么会这样:例如,给定 N = 8换位,显然有四种方法可以用上面没有说明的四个独特元素来创建身份。随着排列变长(据我所知......),方式的数量似乎变得更加难以计算。

    反正我绝对不能在采访范围内做这种事。如果幸运的话,我会走到分母一步。除此之外,它似乎很讨厌。

    关于algorithm - 数组保持不变的概率是多少?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/11872190/

    25 4 0
    Copyright 2021 - 2024 cfsdn All Rights Reserved 蜀ICP备2022000587号
    广告合作:1813099741@qq.com 6ren.com