python - Standardized Bayesian IRT model in pymc

Tags: python bayesian pymc

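The best example I could find of how to estimate this kind of Bayesian IRT model with MCMC in Python is this example. Below is a reproducible version of the code I ran. My understanding is that, to identify the model, the ability parameter theta is constrained to a normal distribution with mean 0 and standard deviation 1, which I believe is done by this line in the code below: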

# theta (proficiency params) are sampled from a normal distribution
theta = Normal("theta", mu=0, tau=1, value=theta_initial, observed= generating)

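However, when I run the code, the posterior mean of theta is something like 1.85e18, with an even larger standard deviation, i.e. not mean 0 and standard deviation 1. Why am I getting this result, and how can I make sure theta is normalized to mean 0, SD 1 after each iteration?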

#from pylab import * #Pylab will not install with pip so I just loaded numpy itself
from numpy import *
import numpy
from pymc import *
from pymc.Matplot import plot as mplot
import numpy as np

numquestions = 300 # number of test items being simulated
numpeople = 10 # number of participants
numthetas = 1 # number of latent proficiency variables

generating = 0
theta_initial = zeros((numthetas, numpeople))
correctness = np.random.randint(2, size= numquestions * numpeople) == 1 #Produces Error
#correctness = np.random.randint(2, size= numquestions * numpeople) == -1 #all False code runs fine
#correctness = np.random.randint(2, size= numquestions * numpeople) != -1 #all True code throws error message

correctness.shape = (numquestions, numpeople)


# theta (proficiency params) are sampled from a normal distribution
theta = Normal("theta", mu=0, tau=1, value=theta_initial, observed= generating)


# question-parameters (IRT params) are sampled from normal distributions (though others were tried)
a = Normal("a", mu=1, tau=1, value=[[0.0] * numthetas] * numquestions)
# a = Exponential("a", beta=0.01, value=[[0.0] * numthetas] * numquestions)
b = Normal("b", mu=0, tau=1, value=[0.0] * numquestions)

# take vectors theta/a/b, return a vector of probabilities of each person getting each question correct
@deterministic
def sigmoid(theta=theta, a=a, b=b): 
    bs = repeat(reshape(b, (len(b), 1)), numpeople, 1)
    return np.exp(1.0 / (1.0 + np.exp(bs - dot(a, theta))))

# take the probabilities coming out of the sigmoid, and flip weighted coins
correct = Bernoulli('correct', p=sigmoid, value=correctness, observed=not generating)

# create a pymc simulation object, including all the above variables
m = MCMC([a,b,theta,sigmoid,correct])

# run an interactive MCMC sampling session
m.isample(iter=20000, burn=15000)


mydict = m.stats()
print(mydict['theta']['mean']) #Get ability parameters for each student
print(mydict['theta']['mean'].mean()) #Should be zero, but returns something like 1.85e18, i.e. an absurdly large value.
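
On the normalization part of the question: a Normal(mu=0, tau=1) prior (tau is the precision in PyMC 2) only pulls theta toward that scale. It does not force each sampled theta vector to have sample mean exactly 0 and SD exactly 1. One possible approach, sketched below in plain NumPy and not taken from the original code, is to rescale a draw after sampling and adjust a and b so the item response probabilities stay the same. The arrays here are made-up stand-ins for a single posterior draw, and the helper name `logits` is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
num_questions, num_people = 5, 8

# Made-up stand-ins for one posterior draw (NOT output from the model above).
theta = rng.normal(loc=2.0, scale=3.0, size=(1, num_people))   # abilities
a = rng.normal(loc=1.0, scale=0.5, size=(num_questions, 1))    # discriminations
b = rng.normal(size=num_questions)                             # difficulties

def logits(a, b, theta):
    """Linear predictor a . theta - b, matching the sign used in the sigmoid above."""
    return a @ theta - b[:, None]

# Standardize theta, then compensate in a and b so a*theta - b is unchanged:
# a*theta - b = (a*s) * ((theta - m)/s) - (b - a*m)
m, s = theta.mean(), theta.std()
theta_std = (theta - m) / s
a_std = a * s
b_std = b - a[:, 0] * m

print(theta_std.mean(), theta_std.std())
print(np.allclose(logits(a, b, theta), logits(a_std, b_std, theta_std)))
```

Applying the same transformation to every stored draw would put all of them on the mean 0, SD 1 scale without changing the fitted probabilities. Whether that is preferable to relying on the prior alone is a modeling choice.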

Best answer

I think there is an extra np.exp in your sigmoid function. According to Wikipedia, S(t) = 1/(1 + exp(-t)). I replaced line 34 of your code with this alternative version:

return 1.0 / (1.0 + np.exp(bs - dot(a, theta)))

With this change I get a theta mean of 0.08.
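
To see why the extra np.exp matters, here is a small standalone check (plain NumPy, not part of the original model; the function names are made up for illustration). The logistic function stays strictly between 0 and 1, whereas wrapping it in np.exp gives values between 1 and e, which are not valid probabilities for a Bernoulli variable.

```python
import numpy as np

def logistic(t):
    """Standard logistic function S(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def logistic_with_extra_exp(t):
    """The expression from the question: np.exp wrapped around the logistic."""
    return np.exp(logistic(t))

t = np.linspace(-10.0, 10.0, 201)
p_ok = logistic(t)
p_bad = logistic_with_extra_exp(t)

# The logistic output is a valid probability everywhere on this grid.
print(p_ok.min(), p_ok.max())
# With the extra exp, every value is greater than 1.
print(p_bad.min(), p_bad.max())
```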

Regarding python - Standardized Bayesian IRT model in pymc, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/29108002/
