My question is about how to handle missing values when fitting a model with caret. A small sample of my data is below:
df <- dput(dat)
structure(list(LagO3 = c(NA, NA, NA, 40, 45, NA), RH = c(69.4087524414062,
79.9608383178711, 64.4592437744141, 66.4207077026367, 66.0899200439453,
91.3353729248047), SR = c(298.928888888889, 300.128888888889,
303.688888888889, 304.521111111111, 303.223333333333, 294.716666666667
), ST = c(317.9917578125, 317.448253038194, 311.039059244792,
312.557927517361, 321.252841796875, 330.512212456597), Tmx = c(294.770359293045,
294.897191864461, 295.674552786042, 296.247345044048, 296.108238352818,
294.594430242372), CWTE = c(0, 1, 0, 0, 0, 0), CWTW = c(0, 0,
0, 0, 0, 0), o3 = c(NA, NA, NA, 52, 55, NA)), .Names = c("LagO3",
"RH", "SR", "ST", "Tmx", "CWTE", "CWTW", "o3"), row.names = c("1",
"2", "3", "4", "5", "6"), class = "data.frame")
The problem is that one of my predictors has NAs in several positions, and the outcome (o3) also has NAs (but in different positions). I first tried:
model <- train(x = na.omit(x.training), y = na.omit(training$o3), method = "lmStepAIC",
direction="backward", trControl = control)
However, that leaves x and y with different lengths. I then tried:
model <- train(x = x.training, y = training$o3,na.action=na.pass,
method = "lmStepAIC",direction="backward",trControl = control)
which gives the following error:
Error in quantile.default(y, probs = seq(0, 1, length = cuts)) : missing values and NaN's not allowed if 'na.rm' is FALSE
Any suggestions would be much appreciated!
Many thanks.
Best answer
You need to use the na.action argument of the train function, set to na.omit. As the documentation for na.action states (type ?train):
A function to specify the action to be taken if NAs are found. The default action is for the procedure to fail. An alternative is na.omit, which leads to rejection of cases with missing values on any required variable. (NOTE: If given, this argument must be named.)
So the following will work:
model <- train(x = x.training, y = training$o3,
method = "lmStepAIC",direction="backward",
trControl = control, na.action=na.omit)
Output:
> model <- train(x = x.training, y = y.training, method = "lmStepAIC",direction="backward",
+ na.action=na.omit)
Start: AIC=-129.7
.outcome ~ LagO3 + RH + SR + ST + Tmx + CWTE + CWTW
Step: AIC=-129.7
.outcome ~ LagO3 + RH + SR + ST + Tmx + CWTE
Step: AIC=-129.7
.outcome ~ LagO3 + RH + SR + ST + Tmx
Step: AIC=-129.7
.outcome ~ LagO3 + RH + SR + ST
Step: AIC=-129.7
.outcome ~ LagO3 + RH + SR
Step: AIC=-129.7
.outcome ~ LagO3 + RH
Step: AIC=-129.7
.outcome ~ LagO3
Step: AIC=-129.7
.outcome ~ 1
...
...
...
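As an alternative to relying on na.action, the rows can be filtered before calling train so that x and y stay aligned. The sketch below uses only base R and a small hypothetical data frame (the values and the names n_x, n_y, keep, x.complete and y.complete are made up for illustration, not taken from the question's full data). It first shows why calling na.omit on x and y separately gives mismatched lengths, then keeps only the rows that are complete in both.

```r
# Hypothetical toy data: the predictor and the outcome have NAs on different rows.
training <- data.frame(
  LagO3 = c(NA, NA, 45, 50, 38),
  RH    = c(69.4, 80.0, 64.5, 66.4, 66.1),
  o3    = c(52, 48, NA, 60, 47)
)
x.training <- training[, c("LagO3", "RH")]

# Dropping NAs separately removes different rows from x and from y,
# so the two objects no longer have the same length.
n_x <- nrow(na.omit(x.training))     # rows 3, 4, 5 survive
n_y <- length(na.omit(training$o3))  # rows 1, 2, 4, 5 survive

# Keep only the rows that are complete in the predictors AND the outcome.
keep <- complete.cases(x.training, training$o3)
x.complete <- x.training[keep, , drop = FALSE]
y.complete <- training$o3[keep]

# x.complete and y.complete could then be passed on, for example:
# model <- train(x = x.complete, y = y.complete, method = "lmStepAIC",
#                direction = "backward", trControl = control)
```

Because the same logical index is applied to both objects, each remaining row of x.complete still corresponds to the matching element of y.complete.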
A similar question on this topic (r - missing values when creating training and test data with caret) can be found on Stack Overflow: https://stackoverflow.com/questions/28831197/