当每个人都有多个数据点时，我如何拟合pymc3模型？

问题描述

我正在尝试对研究中遇到的各种数据使用pymc3，但是当每个人给我多个数据点并且每个人来自时，我都很难思考如何拟合模型一个不同的组（因此尝试使用层次模型）。

这是我正在使用的练习方案：假设我们有两组人，每组N = 30。全部60人进行了10个问题调查，每个人都可以对每个问题做出回答（“ 1”）或不做出回答（“ 0”）。因此，对于每个人，我都有一个长度为10的数组，带有1和0。

要为这些数据建模，我假设每个人都有一些潜在的特征“ theta”，并且每个项目都有一个“歧视” a和“困难” b（这只是一个基本的项目响应模型），以及响应（“ 1”）表示为：（1 + exp（-a（theta-b）））^（-1）。（逻辑应用于a（theta-b）。）

这是我尝试使用pymc3进行调整的方式：

traces = {}

for grp in range(2):

group = prac_data["Group ID"] == grp
data = prac_data[group]["Response"]

with pm.Model() as irt:
    # Priors
    a_tmp = pm.Normal('a_tmp',mu=0,sd = 1,shape = 10)
    a = pm.Deterministic('a',np.exp(a_tmp))
    # We do this transformation since we must have a >= 0 
    
    b = pm.Normal('b',mu = 0,shape = 10)
    
    # Now for the hyperpriors on the groups:
    theta_mu = pm.Normal('theta_mu',sd = 1)
    theta_sigma = pm.Uniform('theta_sigma',upper = 2,lower = 0)
    
   
    
    theta = pm.Normal('theta',mu = theta_mu,sd = theta_sigma,shape = N)
    
    p = getProbs(Disc,Diff,theta,N)
    
    y = pm.Bernoulli('y',p = p,observed = data)
        
    traces[grp] = pm.sample(1000)

函数“ getProbs”应该为我提供伯努利随机变量的一系列概率，因为每个人在试验/调查问题中回答1的概率都发生了变化。但是此方法给我一个错误，因为它说“指定p或logit_p中的一个”，但是我以为我使用了函数？

如果有帮助，这里是“ getProbs”的代码：

def getProbs(Disc,THETA,Nprt):
# Get a large array of probabilities for the bernoulli random variable
n = len(Disc)
m = Nprt
probs = np.array([])
for th in range(m):
    for t in range(n):
        p = item(Disc[t],Diff[t],THETA[th])
        probs = np.append(probs,p)
return probs

我添加了Nprt参数，因为如果尝试获取THETA的长度，由于它是FreeRV对象，因此会给我一个错误。我知道我可以尝试对“ item”函数进行矢量化处理，而这只是我上面提到的逻辑函数，而不是这样做，但这在我尝试运行它时也出现了错误。

我认为我可以使用pm.Data来解决此问题，但是文档对我来说并不十分清楚。

基本上，我习惯于在JAGS中构建模型，在其中循环遍历每个数据点，但是pymc3似乎不能那样工作。我对如何在模型中建立/索引随机变量感到困惑，以确保概率将我希望的概率从试验变为试验，并确保我估计的参数与在正确的人群中合适的人。

在此先感谢您的帮助。我是pymc3的新手，正试图抓住它的根源，并想尝试一些不同于JAGS的东西。

编辑：我能够通过以下方式解决此问题：首先通过遍历试验来构建所需的阵列，然后使用以下方法转换阵列：

p = theano.tensor.stack(p,axis = 0)

然后我将这个新变量放在Bernoulli实例的“ p”参数中，并且可以正常工作！这是更新的完整模型：（下面，我将theano.tensor导入为T）

group = group.astype('int') 
data = prac_data["Response"]


with pm.Model() as irt:
# Priors

# Item parameters:
a = pm.Gamma('a',alpha = 1,beta = 1,shape = 10) # Discrimination

b = pm.Normal('b',shape = 10) # Difficulty

# Now for the hyperpriors on the groups: shape = 2 as there are 2 groups
theta_mu = pm.Normal('theta_mu',shape = 2)
theta_sigma = pm.Uniform('theta_sigma',lower = 0,shape = 2)


# Individual-level person parameters:
# group is a 2*N array that lets the model know which
# theta_mu to use for each theta to estimate
theta = pm.Normal('theta',mu = theta_mu[group],sd = theta_sigma[group],shape = 2*N)


# Here,we're building an array of the probabilities we need for 
# each trial:
p = np.array([])
for n in range(2*N):
    for t in range(10):
        x = -a[t]*(theta[n] - b[t])
        p = np.append(p,x)
  
# Here,we turn p into a tensor object to put as an argument to the
# Bernoulli random variable
p = T.stack(p,axis = 0)

y = pm.Bernoulli('y',logit_p = p,observed = data)


# On my computer,this took about 5 minutes to run. 
traces = pm.sample(1000,cores = 1)
    
print(az.summary(traces)) # Summary of parameter distributions

解决方法

暂无找到可以解决该程序问题的有效方法，小编努力寻找整理中！

如果你已经找到好的解决方法，欢迎将解决方案带上本链接一起发送给小编。

小编邮箱:dio#foxmail.com (将#修改为@）

hierarchical-bayesian pymc3

当每个人都有多个数据点时，我如何拟合pymc3模型？

问题描述

解决方法

相关问答