我正在尝试将一些模型拟合到一些数据中,结果模型预测了合理的值,并且绘图似乎是正确的。但是当分别提取系数和绘制函数时,它们没有意义!。我显然做错了什么,所以请有人告诉我错误在哪里?
数据:
dput(distcur)
structure(list(id1 = c(1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6, 1.6
), range = c(-39.898125, -21.448125, -11.07, -3.22875, 3.776484375,
12.309609375, 22.399453125, 39.235078125), meanrat = c(20.2496,
17.7504273504274, 12.76875, 2.475, -1.4295652173913, -3.9603305785124,
-14.7008547008547, -19.7366666666667)), .Names = c("id1", "range",
"meanrat"), row.names = 9:16, class = "data.frame")
library(ggplot2)
id = 1.6
degree = 3
press_x <- seq(min(distcur$range), max(distcur$range), length = 500)
moddist3b <- lm(meanrat ~ poly(range, degree), distcur)
valsdist = data.frame(predict(moddist3b, data.frame(range = press_x)))
colnames(valsdist) = "pred"
valsdist$id1 = id
allvals = cbind(valsdist, press_x)
summary(moddist3b)
#test plot
pdf(paste("mod-",measure,id ))
TITLE = paste("Distance ID: ", id, "Model = line, Points = exp1")
p = ggplot(allvals, aes(x=press_x, y=pred, colour=factor(id1))) +
geom_line() +
geom_point(data=distcur, aes(shape=factor(id1), x = range, y = meanrat, colour = factor(id1))) +
ylim(-100, 100) +
labs(title=TITLE) +
ylab("Mean Rating (%)") +
xlab(measure)
print(p)
dev.off()
我知道图像质量很差,但它表明它是正确的。然而,从用于构建函数的模型中获得的系数看起来与该图完全不同:
summary(moddist3b)
Call:
lm(formula = meanrat ~ poly(range, degree), data = distcur)
Residuals:
9 10 11 12 13 14 15 16
-0.20134 0.44939 1.65996 -2.80500 -1.14594 2.98617 -0.92081 -0.02244
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 1.6770 0.8281 2.025 0.1128
poly(range, degree)1 -37.7155 2.3423 -16.102 8.7e-05 ***
poly(range, degree)2 -2.9435 2.3423 -1.257 0.2773
poly(range, degree)3 6.4888 2.3423 2.770 0.0503 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 2.342 on 4 degrees of freedom
Multiple R-squared: 0.9853, Adjusted R-squared: 0.9743
F-statistic: 89.51 on 3 and 4 DF, p-value: 0.0004019
给定函数 y = 6.49x^3 −2.94x^2 − 37.72x + 1.68
在谷歌上绘制清楚地表明该函数与R(来自模型)的图完全不同
https://www.google.com/search?q=6.49x^3+%E2%88%922.94x^2+%E2%88%92+37.72x+%2B+1.68&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:unofficial&client=iceweasel-a&channel=fflb#client=iceweasel-a&rls=org.mozilla:en-US%3Aunofficial&channel=fflb&sclient=psy-ab&q=6.49*x^3+-2.94*x^2+-+37.72*x+%2B+1.68&oq=6.49*x^3+-2.94*x^2+-+37.72*x+%2B+1.68&gs_l=serp.3...3610.3975.1.4155.2.2.0.0.0.0.107.147.1j1.2.0...0.0...1c.1.14.psy-ab.4C6De6gdmtg&pbx=1&bav=on.2,or.r_qf.&bvm=bv.47008514,d.d2k&fp=5e81885614cfda4f&biw=1440&bih=667
最佳答案
您遇到的问题与ggplot
无关.相反,它是您定义线性模型的方式。顺便说一句,我弄清楚发生了什么的方式是在 0 进行预测:
R> (moddist3b <- lm(meanrat ~ poly(range, 3), distcur) )
Coefficients:
(Intercept) poly(range, 3)1 poly(range, 3)2 poly(range, 3)3
1.68 -37.72 -2.94 6.49
R> predict(moddist3b, data.frame(range = 0))
1
2.733
并注意预测已关闭(应该是 1.68)。
无论如何,您需要使用参数
raw=TRUE
来拟合模型(moddist3b <- lm(meanrat ~ poly(range, 3, raw=TRUE), distcur) )
predict(moddist3b, data.frame(range = 0))
这给了你你所期望的。默认情况下,
poly
使用正交多项式。见 this blog post和 poly帮助页面了解更多详情。
关于R-多项式线性模型系数不符合模型的预测值,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/16749926/