I searched Stack Overflow for people facing similar issues, and the question "Replicating MATLAB's `randperm` in NumPy" is the closest match.
However, although it is possible to recreate the behavior of Matlab's randperm function in Python using numpy's random permutation, the numbers generated are not the same, even though I use the same seed in both languages. I am a bit confused, since my tests of other random functions gave matching results between Matlab and Python.
Here is what I have tried:
Matlab
rng(42);
randperm(15)
which returns
ans =
11 7 6 5 15 14 1 4 9 10 3 13 8 2 12
Python
import numpy as np
np.random.seed(42)
print(np.random.permutation(range(1,16)))
which returns
[10 12 1 14 6 9 3 2 15 5 8 11 13 4 7]
How can I change my Python code so that it reproduces the same order of random numbers as Matlab?
If you want to guarantee identical results, call Matlab from Python using its API. Otherwise, unless you know that both use the same random generation algorithm, you have no guarantee.
It seems that Matlab and Numpy use the same random number generators by default, and the discrepancy is caused by the inner workings of `randperm` being different in the two languages.
In old Matlab versions, `randperm` worked by generating a random array and outputting the indices that would sort it (using the second output of `sort`). In more modern Matlab versions (I'm using R2017b), `randperm` is a built-in function, so the source code cannot be seen, but it seems to use the same method:
>> rng('default')
>> rng(42)
>> randperm(15)
ans =
11 7 6 5 15 14 1 4 9 10 3 13 8 2 12
>> rng(42)
>> [~, ind] = sort(rand(1,15))
ind =
11 7 6 5 15 14 1 4 9 10 3 13 8 2 12
So, if the random number generators are actually the same in the two languages, which seems to be the case, you can replicate that behaviour in Numpy by defining your own version of `randperm` using `argsort`:
import numpy as np
np.random.seed(42)
# Draw 15 uniform samples (one per element); argsort is 0-based, so add 1 to match Matlab
ind = np.argsort(np.random.random((1,15)))+1
print(ind)
[[11 7 6 5 15 14 1 4 9 10 3 13 8 2 12]]
Note, however, that relying on the random number generators being the same in the two languages is risky, and probably version-dependent.
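The approach above can be wrapped in a small helper. This is a minimal sketch, not Matlab's actual implementation: the function name `randperm` and its `seed` parameter are my own choices, and it only matches Matlab if both sides produce the same stream of uniform numbers for a given seed, which is an assumption. It uses `np.random.RandomState`, NumPy's legacy Mersenne Twister interface, so that the global random state is left untouched.

```python
import numpy as np

def randperm(n, seed=None):
    """Return a permutation of 1..n using the sort-based method.

    Hypothetical helper: draws n uniform samples and returns the
    1-based indices that would sort them.
    """
    rng = np.random.RandomState(seed)  # legacy generator, same stream as np.random.seed
    return np.argsort(rng.random_sample(n)) + 1  # argsort is 0-based, Matlab is 1-based

perm = randperm(15, seed=42)
print(perm)
```

With seed 42 and n = 15 this should give the same order as the Matlab session shown earlier, as long as the generators really do agree.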
I am sometimes deeply surprised to learn that MATLAB implements a weird, sub-optimal algorithm for a standard task that has a standard, efficient solution.
@CrisLuengo Do you mean this? That seems to be linear, and sorting is about n*log(n), so not much more expensive. Also, the linear algorithm requires looping, which was an issue in old Matlab versions. Note that `randperm` was an m-file. And you have to admit that `[~, out] = sort(in);` looks elegant :-)
I guess the sorting method is more efficient if you don't have a JIT and can rely on fast, vectorized `sort` and `rand`. But in a compiled language it's much less efficient: it uses more memory, and sorting is a lot more expensive. O notation describes how the cost scales with array size, not how much it costs. Fisher-Yates doesn't use any comparisons at all, whereas sorting needs O(n log n) comparisons. Sorting also needs to move each array element multiple times, while Fisher-Yates does exactly n-1 swaps.
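To make the comparison concrete, here is a rough sketch of both methods in plain Python. The function names `fisher_yates` and `sort_based` are made up for illustration, and the code uses the standard-library `random.Random` generator, so its output will not match either Matlab or NumPy.

```python
import random

def fisher_yates(n, seed=None):
    """Permutation of 1..n via Fisher-Yates: n-1 swap steps, no comparisons of values."""
    rng = random.Random(seed)
    perm = list(range(1, n + 1))
    for i in range(n - 1, 0, -1):   # i = n-1, ..., 1, i.e. n-1 iterations
        j = rng.randint(0, i)       # uniform index in [0, i], both ends inclusive
        perm[i], perm[j] = perm[j], perm[i]
    return perm

def sort_based(n, seed=None):
    """Permutation of 1..n by sorting random keys: O(n log n) comparisons, extra key array."""
    rng = random.Random(seed)
    keys = [rng.random() for _ in range(n)]
    return sorted(range(1, n + 1), key=lambda k: keys[k - 1])

print(fisher_yates(15, seed=42))
print(sort_based(15, seed=42))
```

Both return a valid permutation. The difference is in the work done: the first needs one random index and one swap per element, while the second allocates a key array and sorts it.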