Modified BPMF in PyMC3 using LKJCorr priors: PositiveDefiniteError using NUTS


Question

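I previously implemented the original Bayesian Probabilistic Matrix Factorization (BPMF) model in pymc3. See my previous question for reference, data source, and problem setup. Following the answer to that question from @twiecki, I've implemented a variation of the model that uses LKJCorr priors for the correlation matrices and uniform priors for the standard deviations. In the original model, the covariance matrices are drawn from Wishart distributions, but due to current limitations of pymc3, the Wishart distribution cannot be sampled from properly. This answer to a loosely related question gives a succinct explanation for the choice of LKJCorr priors. The new model is below.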

import logging

import pymc3 as pm
import numpy as np
import theano.tensor as t


n, m = train.shape
dim = 10  # dimensionality
beta_0 = 1  # scaling factor for lambdas; unclear on its use
alpha = 2  # fixed precision for likelihood function
std = .05  # how much noise to use for model initialization

# We will use separate priors for sigma and correlation matrix.
# In order to convert the upper triangular correlation values to a
# complete correlation matrix, we need to construct an index matrix:
n_elem = dim * (dim - 1) // 2  # integer division: n_elem is used as an array size
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)

logging.info('building the BPMF model')
with pm.Model() as bpmf:
    # Specify user feature matrix
    sigma_u = pm.Uniform('sigma_u', shape=dim)
    corr_triangle_u = pm.LKJCorr(
        'corr_u', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_u = corr_triangle_u[tri_index]
    corr_matrix_u = t.fill_diagonal(corr_matrix_u, 1)
    cov_matrix_u = t.diag(sigma_u).dot(corr_matrix_u.dot(t.diag(sigma_u)))
    lambda_u = t.nlinalg.matrix_inverse(cov_matrix_u)

    mu_u = pm.Normal(
        'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
         testval=np.random.randn(dim) * std)
    U = pm.MvNormal(
        'U', mu=mu_u, tau=lambda_u,
        shape=(n, dim), testval=np.random.randn(n, dim) * std)

    # Specify item feature matrix
    sigma_v = pm.Uniform('sigma_v', shape=dim)
    corr_triangle_v = pm.LKJCorr(
        'corr_v', n=1, p=dim,
        testval=np.random.randn(n_elem) * std)

    corr_matrix_v = corr_triangle_v[tri_index]
    corr_matrix_v = t.fill_diagonal(corr_matrix_v, 1)
    cov_matrix_v = t.diag(sigma_v).dot(corr_matrix_v.dot(t.diag(sigma_v)))
    lambda_v = t.nlinalg.matrix_inverse(cov_matrix_v)

    mu_v = pm.Normal(
        'mu_v', mu=0, tau=beta_0 * lambda_v, shape=dim,
         testval=np.random.randn(dim) * std)
    V = pm.MvNormal(
        'V', mu=mu_v, tau=lambda_v,
        shape=(m, dim), testval=np.random.randn(m, dim) * std)

    # Specify rating likelihood function
    R = pm.Normal(
        'R', mu=t.dot(U, V.T), tau=alpha * np.ones((n, m)),
        observed=train)

# `start` is the start dictionary obtained from running find_MAP for PMF.
# See the previous post for PMF code.
for key in bpmf.test_point:
    if key not in start:
        start[key] = bpmf.test_point[key]

with bpmf:
    step = pm.NUTS(scaling=start)

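The goal of this reimplementation was to produce a model that could be estimated with the NUTS sampler. Unfortunately, I'm still getting the same error at the last line: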

PositiveDefiniteError: Scaling is not positive definite. Simple check failed. Diagonal contains negatives. Check indexes [   0    1    2    3    ...   1030 1031 1032 1033 1034   ]

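I've made all the code for PMF, BPMF, and this modified BPMF available in this gist to make it simple to replicate the error. All you need to do is download the data (which is also referenced in the gist).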
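
For anyone who wants to check the index-matrix step from the model code in isolation, here is a minimal NumPy-only sketch. It uses dim = 3 instead of 10, and the correlation values and standard deviations are made up for illustration (they stand in for draws from LKJCorr and Uniform, they are not model output).

```python
import numpy as np

dim = 3  # small illustrative dimension (the model uses dim = 10)
n_elem = dim * (dim - 1) // 2

# Index matrix: entry (i, j) holds the position of corr(i, j) in the flat vector.
tri_index = np.zeros([dim, dim], dtype=int)
tri_index[np.triu_indices(dim, k=1)] = np.arange(n_elem)
tri_index[np.triu_indices(dim, k=1)[::-1]] = np.arange(n_elem)

# Hypothetical correlation values standing in for an LKJCorr draw.
corr_triangle = np.array([0.1, 0.2, 0.3])

# Expand the flat vector into a symmetric matrix, then set the diagonal to 1.
corr_matrix = corr_triangle[tri_index]
np.fill_diagonal(corr_matrix, 1.0)

# Hypothetical standard deviations standing in for a Uniform draw.
sigma = np.array([0.5, 1.0, 2.0])

# cov = diag(sigma) * corr * diag(sigma), as in the model.
cov_matrix = np.diag(sigma).dot(corr_matrix).dot(np.diag(sigma))
```

The resulting corr_matrix is symmetric with ones on the diagonal, and the diagonal of cov_matrix equals sigma squared, which is what the Theano version in the model is meant to produce symbolically.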


Answer:

It looks like you are passing the complete precision matrix into the normal distribution:

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * lambda_u, shape=dim,
     testval=np.random.randn(dim) * std)

I assume you only want to pass the diagonal values:

mu_u = pm.Normal(
    'mu_u', mu=0, tau=beta_0 * t.diag(lambda_u), shape=dim,
     testval=np.random.randn(dim) * std)

Does making this change for mu_u and mu_v fix it for you?
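
To illustrate why the full matrix could be a problem: a univariate Normal expects a positive tau, but the inverse of a covariance matrix generally has negative off-diagonal entries, while its diagonal stays positive. The NumPy sketch below uses a made-up 2x2 covariance, not anything from your model, and whether this is really what triggers the PositiveDefiniteError in your case is an assumption on my part.

```python
import numpy as np

# Hypothetical 2x2 covariance with positive correlation (illustration only).
cov = np.array([[1.0, 0.5],
                [0.5, 1.0]])
precision = np.linalg.inv(cov)

# The full precision matrix contains negative entries off the diagonal,
# which are not valid precisions for a univariate Normal.
has_negative = bool((precision < 0).any())

# The diagonal of the inverse of a positive-definite matrix is positive,
# so passing only the diagonal gives valid per-dimension precisions.
diag_positive = bool((np.diag(precision) > 0).all())
```

Here has_negative and diag_positive both come out True, which is the reason for taking only the diagonal of lambda_u.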