I am trying to write some code that does the same thing as Python's numpy.random.choice.
The key part is the probabilities parameter, p:
The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
Some test code:
import numpy as np
n = 5
vocab_size = 3
p = np.array( [[ 0.65278451], [ 0.0868038725], [ 0.2604116175]])
print('Sum: ', repr(sum(p)))
for t in range(n):
    x = np.random.choice(range(vocab_size), p=p.ravel())
    print('x: %s x[x]: %s' % (x, p.ravel()[x]))
print(p.ravel())
This gives the output:
Sum: array([ 1.])
x: 0 x[x]: 0.65278451
x: 0 x[x]: 0.65278451
x: 0 x[x]: 0.65278451
x: 0 x[x]: 0.65278451
x: 0 x[x]: 0.65278451
[ 0.65278451 0.08680387 0.26041162]
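at times.
There is a distribution here that is partly random, but there is also a structure to it.
I would like to implement this in C#, and to be honest I am not sure of an efficient way to do it.
About 4 years ago someone asked a good question: Emulate Python's random.choice in .NET
Since that question is now quite old and did not really go into uniform probability distributions, I thought I would ask for more detail.
Times change and code changes, so I think there may be a better way to implement a .NET Random.Choice() method.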
public static int Choice(Vector sequence, int a = 0, int size = 0, bool replace = false)
{
    // F(x)
    var Fx = 1/(b - a);
    var p = (xmax - xmin) * Fx;
    return random.Next(0, sequence.Length);
}
Vector is just a double[].
How would I go about randomly choosing from a vector according to probabilities such as the following:
p = np.array(
[[ 0.01313731], [ 0.01315883], [ 0.01312814], [ 0.01316345], [ 0.01316839],
[ 0.01314225], [ 0.01317578], [ 0.01312916], [ 0.01316344], [ 0.01317046],
[ 0.01314973], [ 0.01314432], [ 0.01317042], [ 0.01314846], [ 0.01315124],
[ 0.01316694], [ 0.0131816 ], [ 0.01315033], [ 0.0131645 ], [ 0.01314199],
[ 0.01315199], [ 0.01314431], [ 0.01314458], [ 0.01314999], [ 0.01315409],
[ 0.01316245], [ 0.01315008], [ 0.01314104], [ 0.01315215], [ 0.01317024],
[ 0.01315993], [ 0.01318789], [ 0.0131677 ], [ 0.01316761], [ 0.01315658],
[ 0.01315902], [ 0.01314266], [ 0.0131637 ], [ 0.01315702], [ 0.01315776],
[ 0.01316194], [ 0.01316246], [ 0.01314769], [ 0.01315608], [ 0.01315487],
[ 0.01316117], [ 0.01315083], [ 0.01315836], [ 0.0131665 ], [ 0.01314706],
[ 0.01314923], [ 0.01317971], [ 0.01316373], [ 0.01314863], [ 0.01315498],
[ 0.01315732], [ 0.01318195], [ 0.01315505], [ 0.01315979], [ 0.01315992],
[ 0.01316072], [ 0.01314744], [ 0.0131638 ], [ 0.01315642], [ 0.01314933],
[ 0.01316188], [ 0.01315458], [ 0.01315551], [ 0.01317907], [ 0.01316296],
[ 0.01317765], [ 0.01316863], [ 0.01316804], [ 0.01314882], [ 0.01316548],
[ 0.01315487]])
The output in Python is:
Sum: array([ 1.])
x: 21 x[x]: 0.01314431
x: 30 x[x]: 0.01315993
x: 54 x[x]: 0.01315498
x: 31 x[x]: 0.01318789
x: 27 x[x]: 0.01314104
at times.
Edit: after coffee and sleep, some more insight. The documentation says:
Generate a non-uniform random sample from np.arange(5) of size 3 without replacement:
>>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
array([2, 3, 0])
The parameter p introduces a non-uniform distribution over the sequence, or choice:
The probabilities associated with each entry in a. If not given the sample assumes a uniform distribution over all entries in a.
So I guess, if:
static int[] a = new int[] { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40,
41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60,
61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75};
static double[] p = new double[] { 0.01313731, 0.01315883, 0.01312814, 0.01316345, 0.01316839, 0.01314225,
0.01317578, 0.01312916, 0.01316344, 0.01317046, 0.01314973, 0.01314432,
0.01317042, 0.01314846, 0.01315124, 0.01316694, 0.0131816, 0.01315033,
0.0131645, 0.01314199, 0.01315199, 0.01314431, 0.01314458, 0.01314999,
0.01315409, 0.01316245, 0.01315008, 0.01314104, 0.01315215, 0.01317024,
0.01315993, 0.01318789, 0.0131677, 0.01316761, 0.01315658, 0.01315902,
0.01314266, 0.0131637, 0.01315702, 0.01315776, 0.01316194, 0.01316246,
0.01314769, 0.01315608, 0.01315487, 0.01316117, 0.01315083, 0.01315836,
0.0131665, 0.01314706, 0.01314923, 0.01317971, 0.01316373, 0.01314863,
0.01315498, 0.01315732, 0.01318195, 0.01315505, 0.01315979, 0.01315992,
0.01316072, 0.01314744, 0.0131638, 0.01315642, 0.01314933, 0.01316188,
0.01315458, 0.01315551, 0.01317907, 0.01316296, 0.01317765, 0.01316863,
0.01316804, 0.01314882, 0.01316548, 0.01315487 };
How would I compute this distribution efficiently?
Edit:
While the p parameter above may not have an obvious distribution, this p parameter does:
p = np.array(
[[ 3.09571694e-03], [ 6.62372261e-04], [ 2.52917874e-04], [ 6.93371978e-04],
[ 2.22301291e-04], [ 3.53796717e-02], [ 2.36204398e-04], [ 2.41100042e-04],
[ 1.59093166e-02], [ 5.17099025e-04], [ 2.72037896e-04], [ 1.29918769e-03],
[ 2.68077696e-02], [ 5.68696611e-04], [ 5.32142704e-04], [ 5.88432463e-05],
[ 2.53700138e-02], [ 2.51216588e-03], [ 4.72895541e-04], [ 4.20276848e-03],
[ 5.65701874e-05], [ 1.84972048e-03], [ 8.46515331e-03], [ 8.02505743e-02],
[ 5.34274983e-04], [ 5.18868535e-04], [ 2.22580377e-04], [ 2.50133462e-02],
[ 3.70997917e-02], [ 5.84941482e-05], [ 6.49978323e-04], [ 4.18675536e-01],
[ 6.16371962e-02], [ 3.82260752e-04], [ 6.09901544e-04], [ 2.54540201e-03],
[ 2.46758824e-04], [ 4.13621365e-04], [ 5.23495532e-04], [ 6.40675685e-03],
[ 1.14165332e-03], [ 1.89148994e-04], [ 8.41715724e-04], [ 8.65699032e-04],
[ 6.71368283e-04], [ 2.14908596e-03], [ 5.80679210e-02], [ 1.11176616e-02],
[ 6.58134137e-05], [ 2.38992622e-02], [ 2.91388753e-04], [ 1.93989753e-03],
[ 1.82157325e-03], [ 3.33691627e-03], [ 5.69157244e-03], [ 1.11033592e-04],
[ 2.42448034e-04], [ 8.42765356e-05], [ 1.31656056e-02], [ 1.68779684e-02],
[ 2.72298244e-02], [ 8.19056613e-04], [ 1.14640462e-02], [ 6.21846308e-05],
[ 9.24618073e-04], [ 3.63659515e-02], [ 7.17286486e-05], [ 6.24008652e-04],
[ 2.59900890e-03], [ 1.57848651e-04], [ 5.71378707e-05], [ 7.62828929e-04],
[ 2.91648042e-04], [ 1.67612579e-04], [ 1.65455262e-04], [ 1.01981563e-02]])
Some left-skewed Gaussian distribution. This video by PoyserMath is excellent: Stats: Finding Probability Using a Normal Distribution Table. It explains why p must sum to 1.0.
Edit: 12.04.17 - I finally found the Python file associated with this!!!
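The sum-to-one requirement can be checked, and raw weights rescaled to satisfy it, in a few lines. This is only a minimal sketch using the Python standard library; the helper name `normalize` is hypothetical and is not part of NumPy.

```python
import math

def normalize(weights):
    """Rescale non-negative weights so they sum to 1 (hypothetical helper)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = math.fsum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]

# Raw weights 2:1:1 become probabilities that sum to 1.
p = normalize([2, 1, 1])
print(p)                                 # [0.5, 0.25, 0.25]
print(math.isclose(math.fsum(p), 1.0))   # True
```

Comparing the sum with a tolerance (`math.isclose`) rather than `== 1.0` avoids rejecting arrays that miss 1.0 only because of floating-point rounding.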
# Author: Hamzeh Alsalhi <ha258@cornell.edu>
#
# License: BSD 3 clause
from __future__ import division
import numpy as np
import scipy.sparse as sp
import operator
import array
from sklearn.utils import check_random_state
from sklearn.utils.fixes import astype
from ._random import sample_without_replacement
__all__ = ['sample_without_replacement', 'choice']
# This is a backport of np.random.choice from numpy 1.7
# The function can be removed when we bump the requirements to >=1.7
def choice(a, size=None, replace=True, p=None, random_state=None):
    """
    choice(a, size=None, replace=True, p=None)
    Generates a random sample from a given 1-D array
    .. versionadded:: 1.7.0
    Parameters
    -----------
    a : 1-D array-like or int
        If an ndarray, a random sample is generated from its elements.
        If an int, the random sample is generated as if a was np.arange(n)
    size : int or tuple of ints, optional
        Output shape. Default is None, in which case a single value is
        returned.
    replace : boolean, optional
        Whether the sample is with or without replacement.
    p : 1-D array-like, optional
        The probabilities associated with each entry in a.
        If not given the sample assumes a uniform distribution over all
        entries in a.
    random_state : int, RandomState instance or None, optional (default=None)
        If int, random_state is the seed used by the random number generator;
        If RandomState instance, random_state is the random number generator;
        If None, the random number generator is the RandomState instance used
        by `np.random`.
    Returns
    --------
    samples : 1-D ndarray, shape (size,)
        The generated random samples
    Raises
    -------
    ValueError
        If a is an int and less than zero, if a or p are not 1-dimensional,
        if a is an array-like of size 0, if p is not a vector of
        probabilities, if a and p have different lengths, or if
        replace=False and the sample size is greater than the population
        size
    See Also
    ---------
    randint, shuffle, permutation
    Examples
    ---------
    Generate a uniform random sample from np.arange(5) of size 3:
    >>> np.random.choice(5, 3) # doctest: +SKIP
    array([0, 3, 4])
    >>> #This is equivalent to np.random.randint(0,5,3)
    Generate a non-uniform random sample from np.arange(5) of size 3:
    >>> np.random.choice(5, 3, p=[0.1, 0, 0.3, 0.6, 0]) # doctest: +SKIP
    array([3, 3, 0])
    Generate a uniform random sample from np.arange(5) of size 3 without
    replacement:
    >>> np.random.choice(5, 3, replace=False) # doctest: +SKIP
    array([3,1,0])
    >>> #This is equivalent to np.random.shuffle(np.arange(5))[:3]
    Generate a non-uniform random sample from np.arange(5) of size
    3 without replacement:
    >>> np.random.choice(5, 3, replace=False, p=[0.1, 0, 0.3, 0.6, 0])
    ... # doctest: +SKIP
    array([2, 3, 0])
    Any of the above can be repeated with an arbitrary array-like
    instead of just integers. For instance:
    >>> aa_milne_arr = ['pooh', 'rabbit', 'piglet', 'Christopher']
    >>> np.random.choice(aa_milne_arr, 5, p=[0.5, 0.1, 0.1, 0.3])
    ... # doctest: +SKIP
    array(['pooh', 'pooh', 'pooh', 'Christopher', 'piglet'],
          dtype='|S11')
    """
    random_state = check_random_state(random_state)
    # Format and Verify input
    a = np.array(a, copy=False)
    if a.ndim == 0:
        try:
            # __index__ must return an integer by python rules.
            pop_size = operator.index(a.item())
        except TypeError:
            raise ValueError("a must be 1-dimensional or an integer")
        if pop_size <= 0:
            raise ValueError("a must be greater than 0")
    elif a.ndim != 1:
        raise ValueError("a must be 1-dimensional")
    else:
        pop_size = a.shape[0]
        if pop_size is 0:
            raise ValueError("a must be non-empty")
    if p is not None:
        p = np.array(p, dtype=np.double, ndmin=1, copy=False)
        if p.ndim != 1:
            raise ValueError("p must be 1-dimensional")
        if p.size != pop_size:
            raise ValueError("a and p must have same size")
        if np.any(p < 0):
            raise ValueError("probabilities are not non-negative")
        if not np.allclose(p.sum(), 1):
            raise ValueError("probabilities do not sum to 1")
    shape = size
    if shape is not None:
        size = np.prod(shape, dtype=np.intp)
    else:
        size = 1
    # Actual sampling
    if replace:
        if p is not None:
            cdf = p.cumsum()
            cdf /= cdf[-1]
            uniform_samples = random_state.random_sample(shape)
            idx = cdf.searchsorted(uniform_samples, side='right')
            # searchsorted returns a scalar
            idx = np.array(idx, copy=False)
        else:
            idx = random_state.randint(0, pop_size, size=shape)
    else:
        if size > pop_size:
            raise ValueError("Cannot take a larger sample than "
                             "population when 'replace=False'")
        if p is not None:
            if np.sum(p > 0) < size:
                raise ValueError("Fewer non-zero entries in p than size")
            n_uniq = 0
            p = p.copy()
            found = np.zeros(shape, dtype=np.int)
            flat_found = found.ravel()
            while n_uniq < size:
                x = random_state.rand(size - n_uniq)
                if n_uniq > 0:
                    p[flat_found[0:n_uniq]] = 0
                cdf = np.cumsum(p)
                cdf /= cdf[-1]
                new = cdf.searchsorted(x, side='right')
                _, unique_indices = np.unique(new, return_index=True)
                unique_indices.sort()
                new = new.take(unique_indices)
                flat_found[n_uniq:n_uniq + new.size] = new
                n_uniq += new.size
            idx = found
        else:
            idx = random_state.permutation(pop_size)[:size]
            if shape is not None:
                idx.shape = shape
    if shape is None and isinstance(idx, np.ndarray):
        # In most cases a scalar will have been made an array
        idx = idx.item(0)
    # Use samples as indices for a if a is array-like
    if a.ndim == 0:
        return idx
    if shape is not None and idx.ndim == 0:
        # If size == () then the user requested a 0-d array as opposed to
        # a scalar object when size is None. However a[idx] is always a
        # scalar and not an array. So this makes sure the result is an
        # array, taking into account that np.array(item) may not work
        # for object arrays.
        res = np.empty((), dtype=a.dtype)
        res[()] = a[idx]
        return res
    return a[idx]
def random_choice_csc(n_samples, classes, class_probability=None,
                      random_state=None):
    """Generate a sparse random matrix given column class distributions
    Parameters
    ----------
    n_samples : int,
        Number of samples to draw in each column.
    classes : list of size n_outputs of arrays of size (n_classes,)
        List of classes for each column.
    class_probability : list of size n_outputs of arrays of size (n_classes,)
        Optional (default=None). Class distribution of each column. If None the
        uniform distribution is assumed.
    random_state : int, RandomState instance or None, optional (default=None)
        If int, random_state is the seed used by the random number generator;
        If RandomState instance, random_state is the random number generator;
        If None, the random number generator is the RandomState instance used
        by `np.random`.
    Returns
    -------
    random_matrix : sparse csc matrix of size (n_samples, n_outputs)
    """
    data = array.array('i')
    indices = array.array('i')
    indptr = array.array('i', [0])
    for j in range(len(classes)):
        classes[j] = np.asarray(classes[j])
        if classes[j].dtype.kind != 'i':
            raise ValueError("class dtype %s is not supported" %
                             classes[j].dtype)
        classes[j] = astype(classes[j], np.int64, copy=False)
        # use uniform distribution if no class_probability is given
        if class_probability is None:
            class_prob_j = np.empty(shape=classes[j].shape[0])
            class_prob_j.fill(1 / classes[j].shape[0])
        else:
            class_prob_j = np.asarray(class_probability[j])
        if np.sum(class_prob_j) != 1.0:
            raise ValueError("Probability array at index {0} does not sum to "
                             "one".format(j))
        if class_prob_j.shape[0] != classes[j].shape[0]:
            raise ValueError("classes[{0}] (length {1}) and "
                             "class_probability[{0}] (length {2}) have "
                             "different length.".format(j,
                                                        classes[j].shape[0],
                                                        class_prob_j.shape[0]))
        # If 0 is not present in the classes insert it with a probability 0.0
        if 0 not in classes[j]:
            classes[j] = np.insert(classes[j], 0, 0)
            class_prob_j = np.insert(class_prob_j, 0, 0.0)
        # If there are nonzero classes choose randomly using class_probability
        rng = check_random_state(random_state)
        if classes[j].shape[0] > 1:
            p_nonzero = 1 - class_prob_j[classes[j] == 0]
            nnz = int(n_samples * p_nonzero)
            ind_sample = sample_without_replacement(n_population=n_samples,
                                                    n_samples=nnz,
                                                    random_state=random_state)
            indices.extend(ind_sample)
            # Normalize probabilites for the nonzero elements
            classes_j_nonzero = classes[j] != 0
            class_probability_nz = class_prob_j[classes_j_nonzero]
            class_probability_nz_norm = (class_probability_nz /
                                         np.sum(class_probability_nz))
            classes_ind = np.searchsorted(class_probability_nz_norm.cumsum(),
                                          rng.rand(nnz))
            data.extend(classes[j][classes_j_nonzero][classes_ind])
        indptr.append(len(indices))
    return sp.csc_matrix((data, indices, indptr),
                         (n_samples, len(classes)),
                         dtype=int)
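Best answer
If I understand correctly, you want to randomly select X elements from a list of Y elements according to a probability distribution given as an array of doubles, where each entry is the probability that the element with the same index is returned. The most straightforward way I can think of is this (see comments):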
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
// Note: the members below must be declared inside a class.
static readonly ThreadLocal<Random> _random = new ThreadLocal<Random>(() => new Random());
static IEnumerable<T> Choice<T>(IList<T> sequence, int size, double[] distribution) {
    double sum = 0;
    // First turn the probability array into a cumulative one, that is:
    // if you have [0.1, 0.2, 0.3, 0.4]
    // we need     [0.1, 0.3, 0.6, 1  ] instead.
    var cumulative = distribution.Select(c => {
        var result = c + sum;
        sum += c;
        return result;
    }).ToList();
    for (int i = 0; i < size; i++) {
        // Generate a random double; NextDouble() returns a value in [0, 1).
        var r = _random.Value.NextDouble();
        // Find the first index in the cumulative list whose value is greater than or equal to r.
        var idx = cumulative.BinarySearch(r);
        // If no exact match is found, List.BinarySearch returns a negative number: the bitwise
        // complement of the index of the first item greater than r. Applying ~ recovers that index.
        if (idx < 0)
            idx = ~idx;
        // Very rare case: the probabilities do not sum to 1 because of double precision issues
        // (the sum is 0.999943 and so on), so clamp to the last index.
        if (idx > cumulative.Count - 1)
            idx = cumulative.Count - 1;
        // Return the item at the chosen index.
        yield return sequence[idx];
    }
}
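This is hard for me to explain in plain language, but I think it should be fairly clear from the code. It is probably easiest to explain with an example. Say we have the distribution [0.1, 0.4, 0.4, 0.1]. The cumulative version (where each item has the sum of all previous items added to it) looks like this: [0.1, 0.5, 0.9, 1]. Now we generate a random number in the range 0 to 1. It is uniformly distributed, so every value is equally likely. What is the probability that it falls in the range 0-0.1? 0.1. In the range 0.1-0.5? 0.4. So a uniformly distributed number between 0 and 1 falls into each range with exactly the probability given in the distribution array.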
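The cumulative-range idea can be sketched in Python with the standard library alone. The function names `pick_index` and `choice` are hypothetical, and `bisect.bisect_right` stands in here for the binary search; this is an illustration of the technique, not the code of either NumPy or the C# answer.

```python
import bisect
import itertools
import random

def pick_index(cumulative, r):
    """Return the first index whose cumulative probability exceeds r."""
    idx = bisect.bisect_right(cumulative, r)
    # Clamp in case rounding leaves the final cumulative value just below 1.
    return min(idx, len(cumulative) - 1)

def choice(sequence, size, distribution, rng=random):
    """Draw `size` items with replacement according to `distribution`."""
    cumulative = list(itertools.accumulate(distribution))
    return [sequence[pick_index(cumulative, rng.random())] for _ in range(size)]

# Distribution [0.1, 0.4, 0.4, 0.1] has cumulative form [0.1, 0.5, 0.9, 1].
cumulative = list(itertools.accumulate([0.1, 0.4, 0.4, 0.1]))
# One value from each range maps to the index that owns the range.
print([pick_index(cumulative, r) for r in (0.05, 0.3, 0.7, 0.95)])  # [0, 1, 2, 3]
```

Building the cumulative list costs O(n) once, and each draw is then an O(log n) binary search.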
Use it like this:
var result = Choice(Enumerable.Range(0, 5).ToArray(), 3, new double[] {0.01, 0.01, 0.48, 0.48, 0.02}).ToArray();
This will produce results such as:
[3,3,3] //
[2,3,2] // most often the result will contain 2 and 3, because they both have 0.48 probability and the remaining elements have just 0.01 or 0.02
[1,3,2] // very rarely, other elements will appear
If you need a version without repetition, this code can be modified slightly to do that as well.
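One possible way to make that modification, sketched in Python with the standard library only: after each draw, set the chosen item's weight to zero so it cannot be picked again. The function name `choice_without_replacement` is hypothetical. Drawing one item at a time is a simplification chosen for clarity; it is an assumption about how the change could look, not the answer author's code.

```python
import bisect
import itertools
import random

def choice_without_replacement(sequence, size, distribution, rng=random):
    """Draw `size` distinct items; weights of picked items are zeroed."""
    weights = list(distribution)
    if sum(1 for w in weights if w > 0) < size:
        raise ValueError("fewer non-zero probabilities than requested size")
    picked = []
    for _ in range(size):
        cumulative = list(itertools.accumulate(weights))
        # Scale the uniform draw by the remaining total instead of renormalising.
        r = rng.random() * cumulative[-1]
        # First index whose cumulative weight exceeds r; it always has weight > 0.
        idx = min(bisect.bisect_right(cumulative, r), len(weights) - 1)
        picked.append(sequence[idx])
        weights[idx] = 0.0  # remove this item from later draws
    return picked

result = choice_without_replacement(list(range(5)), 3, [0.1, 0, 0.3, 0.6, 0])
print(result)  # some ordering of 0, 2 and 3, the only items with non-zero probability
```

Each draw rebuilds the cumulative list, so the cost is O(n) per item; for small sample sizes that is usually acceptable.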
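If you need a single element, call the function above with size = 1, or create an overload for convenience. The same applies if you want to pass a single integer instead of a sequence: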
static T Choice<T>(IList<T> sequence, double[] distribution) {
    return Choice(sequence, 1, distribution).First();
}
static int Choice(int upTo, double[] distribution) {
    return Choice(Enumerable.Range(0, upTo).ToArray(), distribution);
}
Regarding c# - Python numpy.random.choice in C# with non/uniform probability distribution, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/43303538/