python - PyMC3 多元混合模型 : Constraining components to be non-empty-6ren

python - PyMC3 多元混合模型 : Constraining components to be non-empty

转载作者：太空宇宙更新时间：2023-11-04 05:36:27

我正在 pymc3 中实现多元高斯回归的个性化混合，并遇到空组件问题。引用后related PyMC3 mixture model example ，我尝试使用单变量法线来实现模型，但我有 some issues there as well .

我尝试了几种策略来将每个组件限制为非空，但都失败了。这些显示在下面的代码中。我的具体问题是:在使用 pymc3 的多元高斯混合模型中，将所有分量限制为非空的最佳方法是什么？

请注意，下面代码中的尝试 #1 来自 Mixture Model in PyMC3 Example并且在这里不起作用。

您可以使用 function in this gist 复制我正在使用的合成数据.

import pymc3 as pm
import numpy as np
import theano
import theano.tensor as T
from scipy import stats

# Extract problem dimensions.
N = X.shape[0]  # number of samples
F = X.shape[1]  # number of features
pids = I[:, 0].astype(np.int)  # primary entity ids
uniq_pids = np.unique(pids)  # array of unique primary entity ids
n_pe = len(uniq_pids)  # number of primary entities

with pm.Model() as gmreg:
    # Init hyperparameters.
    a0 = 1
    b0 = 1
    mu0 = pm.constant(np.zeros(F))
    alpha = pm.constant(np.ones(K))
    coeff_precisions = pm.constant(1 / X.var(0))

    # Init parameters.
    # Dirichlet shape parameter, prior on indicators.
    pi = pm.Dirichlet(
        'pi', a=alpha, shape=K)

    # ATTEMPT 1: Make probability of membership for each cluter >= 0.1
    # ================================================================
    pi_min_potential = pm.Potential(
        'pi_min_potential', T.switch(T.min(pi) < .1, -np.inf, 0))
    # ================================================================

    # The multinomial (and by extension, the Categorical), is a symmetric
    # distribution. Using this as a prior for the indicator variables Z
    # makes the likelihood invariant under the many possible permutations of
    # the indices. This invariance is inherited in posterior inference.
    # This invariance model implies unidentifiability and induces label
    # switching during inference.

    # Resolve by ordering the components to have increasing weights.
    # This does not deal with the parameter identifiability issue.
    order_pi_potential = pm.Potential(
        'order_pi_potential',
        T.sum([T.switch(pi[k] - pi[k-1] < 0, -np.inf, 0)
               for k in range(1, K)]))

    # Indicators, specifying which cluster each primary entity belongs to.
    # These are draws from Multinomial with 1 trial.
    init_pi = stats.dirichlet.rvs(alpha.eval())[0]
    test_Z = np.random.multinomial(n=1, pvals=init_pi, size=n_pe)
    as_cat = np.nonzero(test_Z)[1]
    Z = pm.Categorical(
        'Z', p=pi, shape=n_pe, testval=as_cat)

    # ATTEMPT 2: Give infinite negative likelihood to the case
    # where any of the clusters have no users assigned.
    # ================================================================
    # sizes = [T.eq(Z, k).nonzero()[0].shape[0] for k in range(K)]
    # nonempty_potential = pm.Potential(
    #     'comp_nonempty_potential',
    #     np.sum([T.switch(sizes[k] < 1, -np.inf, 0) for k in range(K)]))
    # ================================================================

    # ATTEMPT 3: Add same sample to each cluster, each has at least 1.
    # ================================================================
    # shared_X = X.mean(0)[None, :]
    # shared_y = y.mean().reshape(1)
    # X = T.concatenate((shared_X.repeat(K).reshape(K, F), X))
    # y = T.concatenate((shared_y.repeat(K), y))

    # Add range(K) on to the beginning to include shared instance.
    # Z_expanded = Z[pids]
    # Z_with_shared = T.concatenate((range(K), Z_expanded))
    # pid_idx = pm.Deterministic('pid_idx', Z_with_shared)
    # ================================================================

    # Expand user cluster indicators to each observation for each user.
    pid_idx = pm.Deterministic('pid_idx', Z[pids])

    # Construct masks for each component.
    masks = [T.eq(pid_idx, k).nonzero() for k in range(K)]
    comp_sizes = [masks[k][0].shape[0] for k in range(K)]

    # Component regression precision parameters.
    beta = pm.Gamma(
        'beta', alpha=a0, beta=b0, shape=(K,),
        testval=np.random.gamma(a0, b0, size=K))
    # Regression coefficient matrix, with coeffs for each component.
    W = pm.MvNormal(
        'W', mu=mu0, tau=T.diag(coeff_precisions), shape=(K, F),
        testval=np.random.randn(K, F) * std)

    # The mean of the observations is the result of a regression, with
    # coefficients determined by the cluster the sample belongs to.

    # Now we have K different multivariate normal distributions.
    X = T.cast(X, 'float64')
    y = T.cast(y, 'float64')
    comps = []
    for k in range(K):
        mask_k = masks[k]

        X_k = X[mask_k]
        y_k = y[mask_k]

        n_k = comp_sizes[k]
        precision_matrix = beta[k] * T.eye(n_k)

        comp_k = pm.MvNormal(
            'comp_%d' % k,
            mu=T.dot(X_k, W[k]), tau=precision_matrix,
            observed=y_k)
        comps.append(comp_k)

前两种方法无法确保非空簇；尝试在 LinAlgError 中采样结果:

with gmreg:
step1 = pm.Metropolis(vars=[pi, beta, W])
step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
tr = pm.sample(100, step=[step1, step2])
...:     

Failed to compute determinant []
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)
<ipython-input-2-c7df53f4c6a5> in <module>()
      2     step1 = pm.Metropolis(vars=[pi, beta, W])
      3     step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
----> 4     tr = pm.sample(100, step=[step1, step2])
      5 

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in sample(draws, step, start, trace, chain, njobs, tune, progressbar, model, random_seed)
    155         sample_args = [draws, step, start, trace, chain,
    156                        tune, progressbar, model, random_seed]
--> 157     return sample_func(*sample_args)
    158 
    159 

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in _sample(draws, step, start, trace, chain, tune, progressbar, model, random_seed)
    164     progress = progress_bar(draws)
    165     try:
--> 166         for i, strace in enumerate(sampling):
    167             if progressbar:
    168                 progress.update(i)

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/sampling.pyc in _iter_sample(draws, step, start, trace, chain, tune, model, random_seed)
    246         if i == tune:
    247             step = stop_tuning(step)
--> 248         point = step.step(point)
    249         strace.record(point)
    250         yield strace

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/compound.pyc in step(self, point)
     12     def step(self, point):
     13         for method in self.methods:
---> 14             point = method.step(point)
     15         return point

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/arraystep.pyc in step(self, point)
     87             inputs += [point]
     88 
---> 89         apoint = self.astep(bij.map(point), *inputs)
     90         return bij.rmap(apoint)
     91 

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/step_methods/gibbs.pyc in astep(self, q, logp)
     38 
     39     def astep(self, q, logp):
---> 40         p = array([logp(v * self.sh) for v in self.values])
     41         return categorical(p, self.var.dshape)
     42 

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/blocking.pyc in __call__(self, x)
    117 
    118     def __call__(self, x):
--> 119         return self.fa(self.fb(x))

/home/mack/anaconda/lib/python2.7/site-packages/pymc3/model.pyc in __call__(self, *args, **kwargs)
    423     def __call__(self, *args, **kwargs):
    424         point = Point(model=self.model, *args, **kwargs)
--> 425         return self.f(**point)
    426 
    427 compilef = fastfn

/home/mack/anaconda/lib/python2.7/site-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
    604                         self.fn.nodes[self.fn.position_of_error],
    605                         self.fn.thunks[self.fn.position_of_error],
--> 606                         storage_map=self.fn.storage_map)
    607                 else:
    608                     # For the c linker We don't have access from

/home/mack/anaconda/lib/python2.7/site-packages/theano/compile/function_module.pyc in __call__(self, *args, **kwargs)
    593         t0_fn = time.time()
    594         try:
--> 595             outputs = self.fn()
    596         except Exception:
    597             if hasattr(self.fn, 'position_of_error'):

/home/mack/anaconda/lib/python2.7/site-packages/theano/gof/op.pyc in rval(p, i, o, n)
    766             # default arguments are stored in the closure of `rval`
    767             def rval(p=p, i=node_input_storage, o=node_output_storage, n=node):
--> 768                 r = p(n, [x[0] for x in i], o)
    769                 for o in node.outputs:
    770                     compute_map[o][0] = True

/home/mack/anaconda/lib/python2.7/site-packages/theano/tensor/nlinalg.pyc in perform(self, node, (x,), (z,))
    267     def perform(self, node, (x,), (z, )):
    268         try:
--> 269             z[0] = numpy.asarray(numpy.linalg.det(x), dtype=x.dtype)
    270         except Exception:
    271             print 'Failed to compute determinant', x

/home/mack/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in det(a)
   1769     """
   1770     a = asarray(a)
-> 1771     _assertNoEmpty2d(a)
   1772     _assertRankAtLeast2(a)
   1773     _assertNdSquareness(a)

/home/mack/anaconda/lib/python2.7/site-packages/numpy/linalg/linalg.pyc in _assertNoEmpty2d(*arrays)
    220     for a in arrays:
    221         if a.size == 0 and product(a.shape[-2:]) == 0:
--> 222             raise LinAlgError("Arrays cannot be empty")
    223 
    224 

LinAlgError: Arrays cannot be empty
Apply node that caused the error: Det(Elemwise{Mul}[(0, 1)].0)
Inputs types: [TensorType(float64, matrix)]
Inputs shapes: [(0, 0)]
Inputs strides: [(8, 8)]
Inputs values: [array([], shape=(0, 0), dtype=float64)]

Backtrace when the node is created:
  File "/home/mack/anaconda/lib/python2.7/site-packages/pymc3/distributions/multivariate.py", line 66, in logp
    result = k * T.log(2 * np.pi) + T.log(1./det(tau))

HINT: Use the Theano flag 'exception_verbosity=high' for a debugprint and storage map footprint of this apply node.

...表示组件为空，因为精度矩阵的形状为 (0, 0)。

第三种方法实际上解决了空组件问题，但给出了非常奇怪的推理行为。我选择了一个基于轨迹图的老化，并稀释到每 10 个样本。样本仍然高度自相关，但比没有细化要好得多。此时，我对样本中的 Z 值求和，这就是我得到的结果:

In [3]: with gmreg:
    step1 = pm.Metropolis(vars=[pi, beta, W])
    step2 = pm.ElemwiseCategoricalStep(vars=[Z], values=np.arange(K))
    tr = pm.sample(1000, step=[step1, step2])
   ...:     
 [-----------------100%-----------------] 1000 of 1000 complete in 258.8 sec

...

In [24]: zvals = tr[300::10]['Z']

In [25]: np.array([np.bincount(zvals[:, n]) for n in range(nusers)])
Out[25]: 
array([[ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70],
       [ 0,  0, 70]])

因此出于某种原因，所有用户都被分配到每个样本的最后一个集群。

最佳答案

我遇到过类似的问题。这样的事情适用于多元高斯模型的混合。至于它是否是最好的，它肯定是我找到的最好的解决方案。

pm.Potential('pi_min_potential', T.switch(
                                           T.all(
                                                [pi[i, 0] < 0.1 for i in range(K)]), -np.inf, 0))

此处的关键是，您需要考虑每个低于临界值的潜力。此外，您应该调整 pi 分布的 shape，如评论中所述。这将影响您在 T.switch 调用中的索引(在 pi[i,0] 上)。

关于python - PyMC3 多元混合模型 : Constraining components to be non-empty，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/35438967/

文章推荐： python - 使用 numpy 样式格式的 Sphinx 忽略的部分

文章推荐： c - 即使缓冲区大小不同，数据是否在 UDP 中成功传输？

文章推荐： c - pow 函数给出不需要的结果 c

文章推荐： Python urllib2 和动态加载的数据

swift - 嵌套通用约束 : Constrain the T of a generic item inside a generic sequence extension that is constrained to that generic type
在 Swift 中，我们可以对序列等通用项编写扩展: extension Sequence where Iterator.Element : ObservableType { } 这将保证扩展仅适用于
php - 如何使用具有默认值的 foreignId-constrained？
刚刚从视频教程中注意到Laravel有这个foreignId-constrained。所以我尝试将我的代码从 method.1 更改为 method.2，表已迁移，但是当我尝试 db:seed 时，默
odoo - @api.constrains 无法正常工作
在“project.project”模型中，我编写了一个函数来验证“开始日期”和“结束日期”，为此我使用了 onchange 函数。该函数正在工作并发出警告，但正在创建记录。实际上，如果有任何错误意味
完成部分三角剖分(Constrained Triangulation)的算法
给定平面中的一组点和一个不完整的 triangulation of the convex hull of the points (只给出了一些边)，我正在寻找一种算法来完成三角剖分(初始给定的边应该保
flutter - LayoutBuilder : constrains. maxHeight 返回无穷大
我正在尝试获取constrains.maxHeight。这是我使用的代码: LayoutBuilder( builder: (BuildContext context, B
vhdl - variable must be constrained 错误
我收到一个错误，我不明白为什么。我的代码: library ieee; use ieee.std_logic_1164.all; use work.Func_Pack.all; use ieee.
java - 多线程最佳实践: constraining tasks newFixedThreadPool
我想启动很多任务来在 +-42Mio 记录的数据库上运行。我想以 5000 条记录/时间的批处理运行此程序(结果为 850 个任务)。我还想限制 java 开始为我执行此操作的线程数(最多 16 个)
java - 惯用的 Java : constraining data
关闭。这个问题是opinion-based 。目前不接受答案。想要改进这个问题吗？更新问题，以便 editing this post 可以用事实和引文来回答它。 . 已关闭 8 年前。 Improv
vhdl - variable must be constrained 错误
我收到一个错误，我不明白为什么。我的代码: library ieee; use ieee.std_logic_1164.all; use work.Func_Pack.all; use ieee.
odoo-8 - 保护不适用于 api.constrains
如果没有“可用数量”，我想禁止生产产品。但是这段代码不起作用。它仅在我将 @api.constrains 更改为 @api.onchange('move_lines') 时有效，但如果我使用 onc
swift 泛型 : Constraining Argument Types
来自 C++ 背景(模板)，我很难理解为什么以下 Swift 代码(泛型)无法编译: func backwards(array: [T]) -> [T] { let reversedColle
C# 泛型 : What is generic constraining interface?
在 MSDN 上 - C# 编程指南 Constraints on Type Parameters ，它说: where T : interface_name The typeargument mus
typescript : generic type with primitive types constrain
我在 typescript 中有以下通用类 type UserId = number type Primitive = string | number | boolean class ColumnVa
c++ - 为什么arduino对某些 'functions'使用宏，例如min()，max()，constrain()？
尽管我通常是宏的粉丝，但是我不明白为什么Arduino的制造商选择使用宏而不是实际的函数来表示其一些算术“函数”。仅举几个例子: min() max() constrain() 他们的网站通知人们不要
Swift 泛型 : Constrain Type Parameter to Protocol
您如何指定泛型类型参数只能是协议(protocol)(或符合该协议(protocol)的协议(protocol))，而不是符合该协议(protocol)的类？例如: import Foundatio
C#/Salesforce : Must Constrain Generic, 无法约束通用
这个问题由 C# 和 Salesforce 组成，双方可能都有解决方案。欢迎提出建议! 我正在编写一个通用类来读取 Salesforce 数据。签名看起来像这样: public abstract cl
iOS 自动布局 : Is the Size Constrains for UIView is Obligatory?
我正在尝试通过 Interface Builder 使用约束来安排 Storyboard中的控件(“任何宽度，任何高度”的情况)。我在那里添加了 UIView，但是，当我根据给定的约束按下此元素的更新
javascript - 选择2 : Multiple select elements constrained
我正在尝试实现某些目标，但我找不到/决定什么是最好的实现方法，所以我要问以前是否有人这样做过，或者 select2 是否内置了一些东西来实现目标我想要。事情是这样的:我的 DOM 中有许多选择(多个
paypal自适应支付错误: 550001 - You do not have permission to constrain funding sources
我在使用自适应支付的实时 API 凭据时遇到此错误。阅读 intrwebs 和文档它必须对帐户权限做一些事情，但公平地说我不知道是哪一个。接收器、api 持有者或应用程序我已经创建了应用程序，
java - Clojure deftype : how to constrain field types?
我正在尝试编写一个可以从 Java 使用的 Clojure 库，而用户不知道它是用 Clojure 编写的。为此，我需要我的字段具有正确的类型: 我喜欢我能做到这一点: (deftype Point

太空宇宙

个人简介

我是一名优秀的程序员,十分优秀！

作者热门文章

滴滴打车优惠券免费领取

全站热门文章

首页

博学

6Ren·AI

商城

python - PyMC3 多元混合模型 : Constraining components to be non-empty