r - How do I get every combination from two lists?

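I take the available data and filter it by some criteria (removing rows based on specific values of a column). I then train a model on that data. Afterwards, I fetch the same data again from scratch, but this time I filter it with either the same criteria as before or different criteria, in order to test the model. Then I produce ROC and waterfall plots. My problem is that I want to get every combination from two lists. For example: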

list1 = list(c('a','b','c'),c('A','B','C'))
list2 = list(c('x','y','z'),c('X','Y','Z'))

I want a for loop that runs the analysis for c('a','b','c') and c('x','y','z'), then for c('a','b','c') and c('X','Y','Z'). After that it should move on to c('A','B','C') and c('x','y','z'), and finally c('A','B','C') and c('X','Y','Z').

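Here is my code. I know you might say that use_train and use_test are identical. They will not stay that way; this is only temporary. Working with two lists is easier for me than working with one. Every model and every plot is stored in a list that I create before the for loop. Should I create a for loop inside a for loop?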

use_train = list(c('CR','PR','SD'),c('CR','PR','SD','PD')) # criteria used to train the ML model
use_test = list(c('CR','PR','SD'), c('CR','PR','SD','PD')) # criteria used to test the ML model

xgb_models = auc_test = auc_test_plot = data_list = waterfall = list() 

for(i in 1:length(use_train)){
  
  data_list[[i]] = create_data(mydata,metadata, 
                                  recist.use = use_train[[i]], case = 'CR', use_batch = FALSE, seed=40)
  
  xgb_models[[i]] = train_ici(data_list[[i]])
  #parallelStop()
  
  auc_test[[i]] = evaluate_model(xgb_models[[i]], mydata, metadata, 
                         recist.use = use_test[[i]], case = 'CR' , use_batch = FALSE, seed = 40)
  
  auc_test_plot[[i]] = evaluate_model_plot(xgb_models[[i]], data_list[[i]][[2]])
  
  waterfall[[i]] = waterfall(xgb_models[[i]], metadata, data_list[[i]][[2]], case  = 'CR',
                                train.recist = use_train[[i]], test.recist = use_test[[i]])
}

So in the end I will run 4 rounds:

From use_train: c('CR','PR','SD') and from use_test: c('CR','PR','SD')
From use_train: c('CR','PR','SD') and from use_test: c('CR','PR','SD','PD')
From use_train: c('CR','PR','SD','PD') and from use_test: c('CR','PR','SD')
From use_train: c('CR','PR','SD','PD') and from use_test: c('CR','PR','SD','PD')

Edit:

This sample is the data as it looks after the create_data function. So here the data has already been created and is ready for the train_ici function.

structure(list(`totaldata_new[, "RECIST"]` = c("PD", "SD", "PR", 
"PD", "PD", "PD", "PD", "PR", "SD", "PD", "SD", "PD", "PD", "PD", 
"PR", "CR", "PD", "PR", "SD", "SD", "SD", "PD", "SD", "PR", "PD"
), Gender = c("male", "female", "female", "female", "male", "female", 
"female", "male", "male", "male", "female", "male", "female", 
"female", "male", "female", "female", "male", "male", "male", 
"female", "male", "female", "male", "male"), treatment = c("anti-PD1", 
"anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", 
"anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", 
"anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", 
"anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1", "anti-PD1"
), Cancer_Type = c("Melanoma", "Melanoma", "Melanoma", "Melanoma", 
"Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", 
"Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", 
"Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", "Melanoma", 
"Melanoma", "Melanoma", "Melanoma"), `CD4-T-cells` = c(-0.0741098696855045, 
-0.094401270881699, 0.0410284948786532, -0.163302950330185, -0.0942478217207681, 
-0.167314411991775, -0.118272811489486, -0.0366277340916379, 
-0.0349008907108641, -0.167823357941815, -0.0809646843667242, 
-0.140727850456348, -0.148668434567449, -0.0726825919321525, 
-0.062499826731091, -0.0861178015030313, -0.117687306656149, 
-0.141342090175904, -0.206661192280272, -0.15593285099477, -0.0897617831679252, 
-0.0627645386986058, -0.136416087222329, -0.100351419040291, 
-0.167041995646525)), row.names = c("Pt1", "Pt10", "Pt101", "Pt103", 
"Pt106", "Pt11", "Pt17", "Pt18", "Pt2", "Pt24", "Pt26", "Pt27", 
"Pt28", "Pt29", "Pt3", "Pt30", "Pt31", "Pt34", "Pt36", "Pt37", 
"Pt38", "Pt39", "Pt4", "Pt44", "Pt46"), class = "data.frame")

Best answer

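When a problem is parallelizable, it is common in R to avoid for loops. Instead, you can create a data frame that holds all combinations, built with expand.grid(), and add a column containing the corresponding results: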

library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

list1 <-list(c('a','b','c'),c('A','B','C'))
list2 <- list(c('x','y','z'),c('X','Y','Z'))

# do some stuff with the 2 vars
do_stuff <- function(l1, l2) {
  length(l1) + length(l2) + runif(1)
}

expand.grid(list1, list2) |>
  rowwise() |>
  mutate(result = do_stuff(Var1, Var2))
#> # A tibble: 4 × 3
#> # Rowwise: 
#>   Var1      Var2      result
#>   <list>    <list>     <dbl>
#> 1 <chr [3]> <chr [3]>   6.43
#> 2 <chr [3]> <chr [3]>   6.91
#> 3 <chr [3]> <chr [3]>   6.26
#> 4 <chr [3]> <chr [3]>   6.08

Created on 2023-01-07 by the reprex package (v2.0.1)
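
To connect this back to the lists in the question, here is a minimal base R sketch of the same idea that builds the grid over list indices rather than over the list elements. It is only an illustration: run_analysis is a hypothetical stand-in for the real pipeline (create_data, train_ici, evaluate_model and so on), whose code is not shown in the question.

```r
# Criteria lists copied from the question
use_train <- list(c('CR','PR','SD'), c('CR','PR','SD','PD'))
use_test  <- list(c('CR','PR','SD'), c('CR','PR','SD','PD'))

# One row per (train, test) pair, stored as integer indices into the lists
combos <- expand.grid(train = seq_along(use_train),
                      test  = seq_along(use_test))

# expand.grid varies its first argument fastest; reorder the rows so the
# train index varies slowest, matching the order listed in the question
combos <- combos[order(combos$train, combos$test), ]

# Hypothetical stand-in for the real analysis (an assumption, not the
# asker's code): it just describes which criteria it received
run_analysis <- function(train_crit, test_crit) {
  paste0("train: ", paste(train_crit, collapse = "/"),
         " | test: ", paste(test_crit, collapse = "/"))
}

# A single loop over the rows of the grid replaces a nested for loop
results <- vector("list", nrow(combos))
for (k in seq_len(nrow(combos))) {
  i <- combos$train[k]
  j <- combos$test[k]
  results[[k]] <- run_analysis(use_train[[i]], use_test[[j]])
}

print(unlist(results))
```

Keeping indices in the grid means each result can be traced back to its pair through combos$train[k] and combos$test[k]. A for loop nested inside another for loop would visit the same four pairs. In the question's code the model appears to depend only on the training criteria, so it may be worth training once per use_train entry and reusing that model across the use_test entries.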

For more on "r - How do I get every combination from two lists?", see the corresponding question on Stack Overflow: https://stackoverflow.com/questions/75041051/
