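I am running a negative binomial analysis on some count data, available at the following link: https://www.dropbox.com/s/q7fwqicw3ebvwlg/stackquestion.csv?dl=0
When I tried to fit all the independent variables in one model I ran into some problems (error messages), so I decided to look at each independent variable one at a time to find the one causing the trouble. Here is what I found:
For all the other variables, fitting the variable against Y (column A) looks normal: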
m2 <- glm.nb(A~K, data=d)
summary(m2)
Call:
glm.nb(formula = A ~ K, data = d, init.theta = 0.5569971932,
link = log)
Deviance Residuals:
     Min       1Q   Median       3Q      Max
 -2.5070  -1.2538  -0.4360   0.1796   1.9588
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -0.66185    0.84980  -0.779    0.436
K            0.25628    0.03016   8.498   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.557) family taken to be 1)
Null deviance: 113.202 on 56 degrees of freedom
Residual deviance: 70.092 on 55 degrees of freedom
AIC: 834.86
Number of Fisher Scoring iterations: 1
Theta: 0.5570
Std. Err.: 0.0923
2 x log-likelihood: -828.8570
However, variable L behaves differently. When I fit L against Y, I get:
m1 <- glm.nb(A~L, data=d)
There were 50 or more warnings (use warnings() to see the first 50)
summary(m1)
Call:
glm.nb(formula = A ~ L, data = d, init.theta = 5136324.722, link = log)
Deviance Residuals:
    Min      1Q  Median      3Q     Max
 -67.19  -18.93  -12.07   13.25   64.00
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  3.45341    0.01796   192.3   <2e-16 ***
L            0.24254    0.00103   235.5   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(5136325) family taken to be 1)
Null deviance: 97084 on 56 degrees of freedom
Residual deviance: 28529 on 55 degrees of freedom
AIC: 28941
Number of Fisher Scoring iterations: 1
Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, :
invalid 'nsmall' argument
As you can see, init.theta and the AIC are far too large, and there are 50 or more warnings plus one error message. The warning message is:
In theta.ml(Y, mu, sum(w), w, limit = control$maxit, trace = control$trace > ... :
iteration limit reached
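In fact, variables M and L are two observations of the same thing. I did not find anything abnormal about variable L, and out of all the data only column L has this problem. So I would like to know what exactly this error message means: Error in prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid 'nsmall' argument. And how can I resolve it?
Best answer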
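The important information is in warnings(): when L is the independent variable, the default iteration limit used while fitting the GLM is not high enough for the fit to converge.
If you manually set the maxit parameter to a higher value, you can fit A ~ L without the error: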
glm.nb(A ~ L, data = d, control = glm.control(maxit = 500))
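To confirm that raising maxit actually helped, it can be worth inspecting the fitted object instead of relying on the absence of warnings. The following is only a minimal sketch: the data frame sim and its coefficients are simulated stand-ins that I made up for illustration, not the CSV from the question.

```r
library(MASS)  # provides glm.nb

set.seed(1)

# Hypothetical stand-in data: 57 rows, one predictor L, overdispersed counts A.
n <- 57
sim <- data.frame(L = runif(n, min = 0, max = 20))
sim$A <- rnbinom(n, size = 0.8, mu = exp(1.2 + 0.2 * sim$L))

# Same call shape as above, with the iteration limit raised.
fit <- glm.nb(A ~ L, data = sim, control = glm.control(maxit = 500))

fit$converged  # did the GLM fit report convergence?
fit$theta      # estimated theta; a value in the millions would be a red flag
fit$SE.theta   # standard error of theta
```

If theta comes back enormous or its standard error is not finite, the fit has effectively collapsed to a Poisson model and the summary should not be trusted.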
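See the glm.control documentation for more information. Note that you can also set a reasonable value for init.theta, which keeps both theta and the AIC from being fitted to unreasonable values:
m1 <- glm.nb(A ~ L, data = df, control = glm.control(maxit = 500), init.theta = 1.0)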
Output:
Call:
glm.nb(formula = A ~ L, data = df, control = glm.control(maxit = 500),
init.theta = 0.8016681349, link = log)
Deviance Residuals:
     Min       1Q   Median       3Q      Max
 -3.3020  -0.9347  -0.3578   0.1435   2.5420
Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.25962    0.40094   3.142  0.00168 **
L            0.38823    0.02994  12.967  < 2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for Negative Binomial(0.8017) family taken to be 1)
Null deviance: 160.693 on 56 degrees of freedom
Residual deviance: 67.976 on 55 degrees of freedom
AIC: 809.41
Number of Fisher Scoring iterations: 1
Theta: 0.802
Std. Err.: 0.140
2 x log-likelihood: -803.405
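As for the error message itself: format() only accepts an nsmall (minimum number of decimal places) between 0 and 20. My understanding, which is an assumption about MASS internals and not something I have verified against the data here, is that the summary print method derives nsmall from the standard error of theta, so a non-converged fit with a degenerate standard error can hand format() an invalid value. The sketch below reproduces only the format() side of that, using made-up numbers:

```r
# A valid nsmall pads the number to that many decimal places.
ok <- format(0.557, nsmall = 4)
ok

# Out-of-range or missing nsmall values are rejected by format().
# try() captures the error so the script keeps running.
res_big <- try(format(0.557, nsmall = 25), silent = TRUE)
res_na  <- try(format(0.557, nsmall = NA), silent = TRUE)

inherits(res_big, "try-error")
inherits(res_na, "try-error")
```

If that reading is right, the error is a symptom of the failed theta estimation, not a separate problem, which is why fixing convergence also makes it disappear.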
Source: the Stack Overflow question "r - prettyNum(.Internal(format(x, trim, digits, nsmall, width, 3L, : invalid 'nsmall' argument": https://stackoverflow.com/questions/64002936/