r - caret: glmnet warning - x should be a matrix with 2 or more columns

Tags: r r-caret glmnet

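When I pass a single numeric variable as the independent variable to glmnet through caret, I get an error message saying "x should be a matrix with 2 or more columns". When I pass a single factor variable instead, the train function runs as expected. Adding a factor variable to the single numeric variable also works as expected. Why is this? It has been a real pain so far. I understand that glmnet needs a matrix rather than a data frame, but caret should handle that conversion, just as it does for the factor variable. I also need to be able to run my analysis consistently within the caret framework, and I need to keep my data as a data frame. Here is an example; please ignore the warning messages caused by having too few observations, as they are unrelated to this problem.

Any help would be much appreciated, as this is driving me crazy!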

df <- structure(list(
  Y = structure(c(1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
                  1L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L),
                .Label = c("No", "Yes"), class = "factor"),
  A = c("Yes", "Yes", "No", "No", "No", "No", "No", "No", "No", "Yes",
        "No", "No", "Yes", "Yes", "N", "No", "No", "No", "No", "No"),
  B = c(30, 6, 12, 12, 12, 12, 12, 4, 12, 32,
        12, 12, 4, 24, 8, 12, 15, 6, 12, 12),
  C = structure(c(1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L,
                  2L, 2L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 2L),
                .Label = c("A", "B"), class = "factor")),
  .Names = c("Y", "A", "B", "C"), row.names = c(NA, 20L), class = "data.frame")



library(caret)

# set up the tuning grid
tuneGrid <- expand.grid(.alpha = seq(0, 1, 0.05), .lambda = seq(0, 2, 0.05))

# 10-fold CV
fitControl <- trainControl(method = 'cv', number = 10, classProbs = TRUE,
                           summaryFunction = twoClassSummary)

# works with a single factor variable (ignore warnings caused by the small sample size)
train(Y ~ A, data = df[c("Y", "A")], method = "glmnet",
      family = "binomial", trControl = fitControl, tuneGrid = tuneGrid, metric = "ROC")

# returns an error message when a single numeric independent variable is passed
train(Y ~ B, data = df[c("Y", "B")], method = "glmnet",
      family = "binomial", trControl = fitControl, tuneGrid = tuneGrid, metric = "ROC")

# works when a factor variable is added to the numeric variable (ignore warnings caused by the small sample size)
train(Y ~ B + C, data = df[c("Y", "B", "C")], method = "glmnet",
      family = "binomial", trControl = fitControl, tuneGrid = tuneGrid, metric = "ROC")
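
One way to see what separates the single-factor call from the single-numeric call is to count the predictor columns each formula produces. The sketch below uses base R's `model.matrix` as a stand-in, on the assumption that caret's formula interface expands predictors in a similar way before handing them to glmnet. `n_pred_cols` is a hypothetical helper written for this illustration, not part of caret or glmnet. Note that `A` contains the value "N" in row 15 of the example data, so it has three distinct values rather than two.

```r
# Predictor columns copied from the example data frame above
A <- c("Yes", "Yes", "No", "No", "No", "No", "No", "No", "No", "Yes",
       "No", "No", "Yes", "Yes", "N", "No", "No", "No", "No", "No")
B <- c(30, 6, 12, 12, 12, 12, 12, 4, 12, 32,
       12, 12, 4, 24, 8, 12, 15, 6, 12, 12)
d <- data.frame(A = A, B = B, stringsAsFactors = FALSE)

# Hypothetical helper: number of design-matrix columns left once the
# intercept column is dropped (assumed to mirror what glmnet receives)
n_pred_cols <- function(formula, data) {
  mm <- model.matrix(formula, data = data)
  sum(colnames(mm) != "(Intercept)")
}

cols_A <- n_pred_cols(~ A, d)  # dummy columns for a three-level variable
cols_B <- n_pred_cols(~ B, d)  # a numeric variable stays one column

print(c(A = cols_A, B = cols_B))
```

Under that assumption, `A` expands to two dummy columns while `B` remains a single column, which is the case the "2 or more columns" check rejects.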

Best answer

Try this trick:

df$ones <- rep(1, nrow(df))
train(Y ~ ones+B, data=df[c("Y", "B", "ones")], method="glmnet", 
    family="binomial", trControl = fitControl, tuneGrid = tuneGrid, metric = "ROC")
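
A short sketch of why the trick plausibly works: the constant `ones` column pads the design matrix to two columns, which satisfies the column-count check. It again uses base R's `model.matrix` as an assumed approximation of what caret passes to glmnet; the variable names `d`, `mm` and `x` are illustrative only.

```r
# Numeric predictor copied from the example data frame above
B <- c(30, 6, 12, 12, 12, 12, 12, 4, 12, 32,
       12, 12, 4, 24, 8, 12, 15, 6, 12, 12)
d <- data.frame(B = B)

# Same padding step as in the answer: a column of ones
d$ones <- rep(1, nrow(d))

# Build the design matrix and drop the intercept column
mm <- model.matrix(~ ones + B, data = d)
x <- mm[, colnames(mm) != "(Intercept)", drop = FALSE]

print(dim(x))       # rows and columns of the padded predictor matrix
print(colnames(x))  # names of the remaining predictor columns
```

The padded matrix has two columns, `ones` and `B`. Because `ones` never varies, it carries no information about `Y`, so the fitted model should in effect depend on `B` alone; treat that as an expectation to verify on your own fit rather than a guarantee.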

Source: the question "caret: glmnet warning - x should be a matrix with 2 or more columns" on Stack Overflow: https://stackoverflow.com/questions/46911249/
