Algorithm Design Manual (Steven Skiena), page 250, on random samples over ordered pairs


Reposted by 塔克拉玛干. Updated: 2023-11-03 05:57:17

Given the following explanation:

Problem: We need an efficient and unbiased way to generate random pairs of vertices to perform random vertex swaps. Propose an efficient algorithm to generate elements from the (n choose 2) unordered pairs on {1, ..., n} uniformly at random.

Solution: Uniformly generating random structures is a surprisingly subtle problem. Consider the following procedure to generate random unordered pairs: i = random_int(1,n-1); j = random_int(i+1,n);

It is clear that this indeed generates unordered pairs, since i < j. Further, it is clear that all (n choose 2) unordered pairs can indeed be generated, assuming that random_int generates integers uniformly between its two arguments.

But are they uniform? The answer is no. What is the probability that pair (1,2) is generated? There is a 1/(n−1) chance of getting the 1, and then a 1/(n−1) chance of getting the 2, which yields p(1,2) = 1/(n − 1)^2. But what is the probability of getting (n − 1,n)? Again, there is a 1/n chance of getting the first number, but now there is only one possible choice for the second candidate! This pair will occur n times more often than the first! The problem is that fewer pairs start with big numbers than little numbers. We could solve this problem by calculating exactly how unordered pairs start with i (exactly (n − i)) and appropriately bias the probability. The second value could then be selected uniformly at random from i + 1 to n. But instead of working through the math, let’s exploit the fact that randomly generating the n^2 ordered pairs uniformly is easy. Just pick two integers independently of each other. Ignoring the ordering (i.e., permuting the ordered pair to unordered pair (x,y) so that x < y) gives us a 2/n^2 probability of generating each unordered pair of distinct elements. If we happen to generate a pair (x,x), we discard it and try again. We will get unordered pairs uniformly at random in constant expected time using the following algorithm:

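In the paragraph above, "The problem is that fewer pairs start with big numbers than little numbers." Shouldn't this be "more pairs" instead of "fewer pairs"?
In the paragraph above, "We could solve this problem by calculating exactly how unordered pairs start with i (exactly (n − i))". Shouldn't this be "how many unordered pairs" rather than "how unordered pairs"?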

Edit

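In the paragraph above, "Ignoring the ordering (i.e., permuting the ordered pair to unordered pair (x,y) so that x < y) gives us a 2/n^2 probability of generating each unordered pair of distinct elements." How is the probability 2/n^2 derived?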

Thanks

Best answer

In the above paragraph, "The problem is that fewer pairs start with big numbers than little numbers." Shouldn't this be "more pairs" instead of "fewer pairs"?

No, it is fewer:

n - 1 pairs start with 1 (1 2; 1 3; ...; 1 n)
n - 2 pairs start with 2 (2 3; 2 4; ...; 2 n)
n - 3 pairs start with 3
...
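
The counts above can be turned into exact probabilities. What follows is only a minimal sketch in Python, not code from the book: the function name `naive_pair_probabilities` is made up for illustration, and it assumes that random_int is uniform and inclusive on both ends, as the quoted text does.

```python
from fractions import Fraction

def naive_pair_probabilities(n):
    """Exact probability of each pair (i, j) under the naive procedure
    i = random_int(1, n-1); j = random_int(i+1, n)."""
    probs = {}
    for i in range(1, n):                # i is uniform on 1..n-1
        p_i = Fraction(1, n - 1)
        choices_for_j = n - i            # n - i pairs start with i
        for j in range(i + 1, n + 1):    # j is uniform on i+1..n
            probs[(i, j)] = p_i * Fraction(1, choices_for_j)
    return probs

probs = naive_pair_probabilities(5)
print(probs[(1, 2)])   # 1/16
print(probs[(4, 5)])   # 1/4
```

For n = 5, the pair (4, 5) comes out 4 times as likely as (1, 2) under these assumptions. Only one pair starts with 4, so it gets all of the probability of picking i = 4, while the four pairs starting with 1 have to share the probability of picking i = 1.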

In the above paragraph, "We could solve this problem by calculating exactly how unordered pairs start with i (exactly (n − i))". Shouldn't this be "how many unordered pairs" rather than "how unordered pairs"?

Yes, the word "many" is missing there.
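
To make the "how many pairs start with i" idea concrete, here is a rough Python sketch of the biased-first-element approach the quoted text mentions but does not work through. The function name `weighted_pair` and the `rng` parameter are assumptions for illustration. The first element i is drawn with weight n - i, and the second is then drawn uniformly from i + 1 to n, so each pair gets probability (n - i)/(n choose 2) * 1/(n - i) = 1/(n choose 2).

```python
import random

def weighted_pair(n, rng=random):
    """Pick i with probability proportional to n - i (the number of pairs
    that start with i), then pick j uniformly from i+1..n."""
    if n < 2:
        raise ValueError("need at least two elements to form a pair")
    firsts = list(range(1, n))             # possible first elements 1..n-1
    weights = [n - i for i in firsts]      # n - i pairs start with i
    i = rng.choices(firsts, weights=weights, k=1)[0]
    j = rng.randint(i + 1, n)              # randint is inclusive on both ends
    return i, j

print(weighted_pair(5))   # some pair (i, j) with 1 <= i < j <= 5
```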

In the above paragraph, "Ignoring the ordering (i.e., permuting the ordered pair to unordered pair (x,y) so that x < y) gives us a 2/n^2 probability of generating each unordered pair of distinct elements." How is the probability 2/n^2 derived?

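There are n*n ways to generate a pair when order matters (1 2 and 2 1 are different pairs). Since you then go on to ignore the ordering, 1 2 and 2 1 become the same pair, so you have two favorable cases.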

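But this does not account for the fact that you discard the x x pairs. Taking that into account, it would be 2/(n*(n - 1)), because once you have chosen x, only n - 1 possibilities remain for the second pick.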
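
The discard-and-retry procedure described in the quoted text can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the book's own listing: the names `random_unordered_pair` and `exact_probability` and the `rng` parameter are made up here. The second function simply counts the ordered pairs with x != y to confirm the 2/(n*(n - 1)) figure.

```python
import random
from fractions import Fraction
from itertools import product

def random_unordered_pair(n, rng=random):
    """Draw two integers independently, retry on (x, x), and sort the result."""
    if n < 2:
        raise ValueError("need at least two elements to form a pair")
    while True:
        x = rng.randint(1, n)    # randint is inclusive on both ends
        y = rng.randint(1, n)
        if x != y:               # discard pairs of the form (x, x)
            return (min(x, y), max(x, y))

def exact_probability(n):
    """Probability of one given unordered pair once (x, x) draws are discarded."""
    ordered = [(x, y) for x, y in product(range(1, n + 1), repeat=2) if x != y]
    return Fraction(2, len(ordered))   # two ordered pairs map to each unordered pair

print(exact_probability(5))   # 1/10
```

Each round is accepted with probability (n - 1)/n, so the expected number of rounds is n/(n - 1), which is at most 2 for n >= 2.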

Regarding Algorithm Design Manual (Steven Skiena), page 250, on random samples over ordered pairs, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/32579780/
