python - R 和 Python 中的广义线性模型给出与 MATLAB 中不同的结果

标签 python r matlab regression statsmodels

我正在尝试将处理广义线性模型的函​​数从 MATLAB 转录到 R 和 Python。目前,R 和 Python 都给出了相同的答案,这与 MATLAB 的答案不同,即使输入相同也是如此。

MATLAB 代码本质上是这样的:

coeffs = glmfit(X, [y ones(length(y),1)], 'binomial', 'link', 'logit');

请注意,matlab 中的 glmfit 已将常数项添加到 X。此函数的文档位于此处:http://www.mathworks.com/help/stats/glmfit.html

在 R 中,我有这样的东西:

model <- glm(cbind(y, array(1, length(y)))~X, family=binomial(link=logit))
coeffs <- model$coefficients

R 中 glm 的文档在这里:http://stat.ethz.ch/R-manual/R-patched/library/stats/html/glm.html

在 Python 中,我有这个:

import statsmodels.api as sm
import numpy as np

est = sm.GLM(sm.add_constant(y, prepend=False), sm.add_constant(X), family=sm.families.Binomial()).fit()
coeffs = est.params

在 python 中,文档在这里:http://statsmodels.sourceforge.net/devel/glm.html#module-reference

根据文档,所有 GLM 函数都使用 logit 参数作为 link。如果非要我猜的话,这个问题可能是由于我以前遇到过的 MATLAB 的精度造成的。 (似乎 MATLAB 不如 R 或 NumPy 精确。)不过,我确实想确定我不只是使用了错误的函数。

我认为这是 MATLAB 的问题是否正确,还是我遗漏了一些参数/做错了什么?

另外——如果它很重要,这些是我正在使用的变量(我正在从一个 csv 文件中读取):

X =

3.00000,9.10000,0.43000,-1.26000,-0.25000,0.05000,-4.01000,7.87000
4.00000,1.70000,-1.10000,0.43000,-1.26000,-17.34000,0.05000,-4.01000
5.00000,3.80000,-0.01000,-1.10000,0.43000,5.35000,-17.34000,0.05000
6.00000,0.70000,0.13000,-0.01000,-1.10000,6.03000,5.35000,-17.34000
7.00000,5.80000,0.38000,0.13000,-0.01000,18.10000,6.03000,5.35000
8.00000,8.90000,0.13000,0.38000,0.13000,-1.88000,18.10000,6.03000
9.00000,7.60000,-0.88000,0.13000,0.38000,-7.59000,-1.88000,18.10000
10.00000,4.50000,0.01000,-0.88000,0.13000,2.67000,-7.59000,-1.88000
11.00000,2.40000,0.37000,0.01000,-0.88000,12.76000,2.67000,-7.59000
12.00000,6.60000,0.60000,0.37000,0.01000,3.25000,12.76000,2.67000
13.00000,2.80000,-1.66000,0.60000,0.37000,18.67000,3.25000,12.76000
14.00000,7.20000,1.91000,-1.66000,0.60000,-0.37000,18.67000,3.25000
15.00000,4.00000,0.78000,1.91000,-1.66000,-8.18000,-0.37000,18.67000
16.00000,7.00000,-0.81000,0.78000,1.91000,8.86000,-8.18000,-0.37000
17.00000,0.50000,0.33000,-0.81000,0.78000,-12.55000,8.86000,-8.18000
18.00000,4.00000,1.17000,0.33000,-0.81000,-10.02000,-12.55000,8.86000
19.00000,7.10000,0.20000,1.17000,0.33000,5.50000,-10.02000,-12.55000
20.00000,3.20000,0.12000,0.20000,1.17000,-20.33000,5.50000,-10.02000
3.00000,0.70000,-1.26000,-0.96000,0.88000,-12.24000,7.18000,0.31000
4.00000,1.60000,-0.70000,-1.26000,-0.96000,-1.32000,-12.24000,7.18000
5.00000,8.00000,1.98000,-0.70000,-1.26000,6.75000,-1.32000,-12.24000
6.00000,3.40000,0.80000,1.98000,-0.70000,17.24000,6.75000,-1.32000
7.00000,8.00000,-1.09000,0.80000,1.98000,-7.76000,17.24000,6.75000
8.00000,7.30000,-0.59000,-1.09000,0.80000,-2.04000,-7.76000,17.24000
9.00000,7.70000,0.69000,-0.59000,-1.09000,7.18000,-2.04000,-7.76000
10.00000,6.60000,0.09000,0.69000,-0.59000,-1.75000,7.18000,-2.04000
11.00000,0.10000,0.10000,0.09000,0.69000,1.37000,-1.75000,7.18000
12.00000,5.80000,0.18000,0.10000,0.09000,14.02000,1.37000,-1.75000
13.00000,9.90000,-1.36000,0.18000,0.10000,-3.16000,14.02000,1.37000
14.00000,1.30000,-0.26000,-1.36000,0.18000,6.47000,-3.16000,14.02000
15.00000,8.40000,-0.52000,-0.26000,-1.36000,-7.12000,6.47000,-3.16000
16.00000,7.40000,-0.66000,-0.52000,-0.26000,-1.73000,-7.12000,6.47000
17.00000,8.30000,-0.63000,-0.66000,-0.52000,11.65000,-1.73000,-7.12000
18.00000,6.70000,-1.04000,-0.63000,-0.66000,13.63000,11.65000,-1.73000
19.00000,5.80000,-1.25000,-1.04000,-0.63000,-14.71000,13.63000,11.65000
20.00000,3.20000,-1.37000,-1.25000,-1.04000,-3.60000,-14.71000,13.63000
3.00000,2.30000,-1.16000,-0.19000,-0.70000,8.41000,0.18000,-4.08000
4.00000,1.90000,1.11000,-1.16000,-0.19000,-0.00000,8.41000,0.18000
5.00000,6.40000,0.12000,1.11000,-1.16000,17.86000,-0.00000,8.41000
6.00000,3.90000,1.30000,0.12000,1.11000,1.12000,17.86000,-0.00000
7.00000,0.80000,-0.63000,1.30000,0.12000,-1.33000,1.12000,17.86000
8.00000,7.60000,-0.21000,-0.63000,1.30000,22.26000,-1.33000,1.12000
9.00000,0.90000,1.47000,-0.21000,-0.63000,-14.10000,22.26000,-1.33000
10.00000,5.10000,0.60000,1.47000,-0.21000,13.36000,-14.10000,22.26000
11.00000,6.40000,-1.41000,0.60000,1.47000,-5.37000,13.36000,-14.10000
12.00000,3.30000,-0.97000,-1.41000,0.60000,6.47000,-5.37000,13.36000
13.00000,8.80000,1.82000,-0.97000,-1.41000,8.82000,6.47000,-5.37000
14.00000,9.90000,1.80000,1.82000,-0.97000,5.63000,8.82000,6.47000
15.00000,3.40000,-0.60000,1.80000,1.82000,5.02000,5.63000,8.82000
16.00000,5.90000,-1.14000,-0.60000,1.80000,10.93000,5.02000,5.63000
17.00000,3.00000,0.77000,-1.14000,-0.60000,-2.12000,10.93000,5.02000
18.00000,1.60000,1.69000,0.77000,-1.14000,-2.89000,-2.12000,10.93000
19.00000,1.10000,-0.07000,1.69000,0.77000,-2.78000,-2.89000,-2.12000
20.00000,0.70000,-0.66000,-0.07000,1.69000,2.82000,-2.78000,-2.89000
3.00000,1.30000,-0.72000,1.12000,1.41000,4.24000,5.35000,-0.62000
4.00000,4.80000,-1.18000,-0.72000,1.12000,8.91000,4.24000,5.35000
5.00000,1.30000,-0.44000,-1.18000,-0.72000,1.15000,8.91000,4.24000
6.00000,8.90000,-0.42000,-0.44000,-1.18000,11.64000,1.15000,8.91000
7.00000,6.20000,0.06000,-0.42000,-0.44000,20.74000,11.64000,1.15000
8.00000,6.60000,1.83000,0.06000,-0.42000,21.14000,20.74000,11.64000
9.00000,2.80000,1.50000,1.83000,0.06000,-16.67000,21.14000,20.74000
10.00000,8.70000,-1.72000,1.50000,1.83000,9.86000,-16.67000,21.14000
11.00000,6.90000,-1.51000,-1.72000,1.50000,14.44000,9.86000,-16.67000
12.00000,1.80000,-1.10000,-1.51000,-1.72000,3.64000,14.44000,9.86000
13.00000,4.30000,-1.34000,-1.10000,-1.51000,13.78000,3.64000,14.44000
14.00000,4.90000,-0.31000,-1.34000,-1.10000,-2.43000,13.78000,3.64000
15.00000,2.50000,1.69000,-0.31000,-1.34000,5.85000,-2.43000,13.78000
16.00000,1.70000,0.67000,1.69000,-0.31000,-0.94000,5.85000,-2.43000
17.00000,8.00000,1.23000,0.67000,1.69000,-2.19000,-0.94000,5.85000
18.00000,1.40000,0.63000,1.23000,0.67000,14.29000,-2.19000,-0.94000
19.00000,5.00000,1.10000,0.63000,1.23000,7.65000,14.29000,-2.19000
20.00000,4.10000,-0.13000,1.10000,0.63000,3.03000,7.65000,14.29000
3.00000,3.70000,-0.79000,-1.32000,-0.62000,-5.73000,0.93000,-0.62000
4.00000,0.90000,1.56000,-0.79000,-1.32000,3.27000,-5.73000,0.93000
5.00000,6.90000,2.39000,1.56000,-0.79000,11.63000,3.27000,-5.73000
6.00000,0.20000,1.17000,2.39000,1.56000,2.61000,11.63000,3.27000
7.00000,2.30000,-0.83000,1.17000,2.39000,18.57000,2.61000,11.63000
8.00000,0.10000,0.08000,-0.83000,1.17000,2.48000,18.57000,2.61000
9.00000,0.80000,-0.52000,0.08000,-0.83000,-2.11000,2.48000,18.57000
10.00000,3.40000,0.47000,-0.52000,0.08000,-5.58000,-2.11000,2.48000
11.00000,3.30000,-1.37000,0.47000,-0.52000,-5.09000,-5.58000,-2.11000
12.00000,7.70000,-0.54000,-1.37000,0.47000,-13.45000,-5.09000,-5.58000
13.00000,7.80000,2.62000,-0.54000,-1.37000,13.67000,-13.45000,-5.09000
14.00000,3.50000,0.91000,2.62000,-0.54000,-10.03000,13.67000,-13.45000
15.00000,4.40000,-0.73000,0.91000,2.62000,7.27000,-10.03000,13.67000
16.00000,0.50000,0.53000,-0.73000,0.91000,-5.44000,7.27000,-10.03000
17.00000,4.50000,-0.80000,0.53000,-0.73000,7.60000,-5.44000,7.27000
18.00000,7.90000,0.01000,-0.80000,0.53000,-11.30000,7.60000,-5.44000
19.00000,6.30000,0.42000,0.01000,-0.80000,9.38000,-11.30000,7.60000
20.00000,2.70000,0.13000,0.42000,0.01000,3.55000,9.38000,-11.30000

和 y =

0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
1.00000
0.00000
1.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
0.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
0.00000
1.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
1.00000
1.00000
0.00000
1.00000
1.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
0.00000
1.00000
0.00000
0.00000
0.00000

感谢您的帮助!

最佳答案

感谢 aosmith,我找到了解决方案 --

我的 R 代码更改为:

y <- cbind(y, 1-y)
model <- glm(y~X, family=binomial(link=logit))
coeffs <- model$coeffs

我的 Python 代码更改为:

est = sm.GLM(np.vstack([y, 1-y]).T, sm.add_constant(X), family=sm.families.Binomial()).fit()
coeffs = est.params

谢谢!

关于python - R 和 Python 中的广义线性模型给出与 MATLAB 中不同的结果,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/24937202/

相关文章:

python - 在Python中将图像3 channel 深度转换为1 channel 的快速方法?

r - 如何在 Seurat (ggplot2) 中的 DoHeatmap 图中重新排序单元格

r - R corrplot 中的标题太不居中且太高

r - 加载 Data.Table 后遮盖的润滑对象

matlab - 杀死在 Matlab 中创建的 Excel 进程

matlab - 是否有可能在 Matlab 中获得对矩阵的引用?

Matlab:将单元格中的矩阵对角线附加到新的零矩阵中

Python 在带有 numpy 数组数据的 for 循环中附加一个列表

python - 在 Python 中使用 Lambda 过滤数据框

python - VTK 不能用 vtkClipClosedSurface 构造一个合适的闭合曲面