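I created a function, clique_function, that selects variables from two inputs, object_list and clique_list, based on a condition and returns a new data frame, clique_object.
Input:
clique_list holds the cliques (groups of three nodes in a network) as a matrix in which each column is one clique of three nodes.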
clique_list <- structure(c("ND1", "IS1", "IS3", "IS1", "IS3", "IS2", "ND2",
"ND1", "IS1"), .Dim = c(3L, 3L))
object_list is a data frame with nodes as rows and the occurrences of different object types as columns.
object_list <- structure(list(CA1 = c(0.159159159159159, 0.222222222222222,
0.25, 0.115384615384615, 0.311111111111111, 0.1285140562249,
0.214132762312634, 0.413461538461538, 0.183333333333333, 0.4,
0.4375, 0.167778836987607, 0.25, 0.5, 0.166666666666667, 0.181818181818182,
0.21580547112462, 0.0792452830188679, 0.424657534246575, 0, 0
), CA11 = c(0.00600600600600601, 0, 0, 0, 0, 0.00401606425702811,
0.012847965738758, 0, 0.05, 0, 0, 0, 0, 0, 0, 0, 0.00911854103343465,
0.0113207547169811, 0.0410958904109589, 0, 0), CA111 = c(0, 0,
0, 0, 0, 0, 0.00499643112062812, 0, 0.0333333333333333, 0, 0,
0, 0, 0, 0, 0, 0.0060790273556231, 0.00754716981132075, 0.0273972602739726,
0, 0), CA1111 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1113 = c(0, 0, 0, 0,
0, 0, 0.000713775874375446, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0), CA1115 = c(0, 0, 0, 0, 0, 0, 0.000713775874375446,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0), CA1116 = c(0, 0, 0,
0, 0, 0, 0, 0, 0.0166666666666667, 0, 0, 0, 0, 0, 0, 0, 0.00303951367781155,
0.00377358490566038, 0, 0, 0), CA1117 = c(0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0.0136986301369863, 0, 0), CA112 = c(0,
0, 0, 0, 0, 0, 0.00285510349750178, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0), CA1122 = c(0, 0, 0, 0, 0, 0, 0.00142755174875089,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0)), class = "data.frame", row.names = c("ND5",
"ND6", "ND8/ND10", "ND3", "ND7", "ND2", "ND1", "ND4", "KB3/KB4/KB5",
"KB1", "KB2", "IS1", "KB9", "KB7/KB8", "KB6", "IS4", "IS3", "IS2",
"KB12/KB14", "KB13", "IS5"))
The function clique_function should loop over object_list and select the variable type (a column), for example CA1, for the three nodes of each clique in clique_list.
clique_function <- function(clique_list, type, object_list)
{
for (i in 1:ncol(clique_list)) {
clique_object <- subset(object_list, row.names(object_list) %in% clique_list[, i],
colnames(object_list) == type)
}
return(clique_object)
}
The expected output, clique_object, is a subset of object_list showing the occurrences of the selected object type (type) for all cliques in clique_list.
For example:
clique_object <- structure(list(cliques = c("ND1", "IS1", "IS3", "ND2", "IS1",
"IS3", "ND1", "IS4", "IS3", "ND1", "IS1", "IS3", "ND1", "IS1",
"IS8", "ND3", "IS1", "IS3", "ND1", "IS1", "IS3"), CA1 = c(0.0007137759,
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541,
0.0007137759, 0.0047664442, 0.009118541, 0.0007137759, 0.0047664442,
0.009118541, 0.0007137759, 0.0047664442, 0.009118541, 0.0007137759,
0.0047664442, 0.009118541, 0.0007137759, 0.0047664442, 0.009118541
)), class = "data.frame", row.names = c(NA, -21L))
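If I use print(clique_object) instead of return(clique_object), the function works: in that case I get the full list from the function looping over the data frame. With return(clique_object), however, I only get the result for the first clique in clique_list.
I want the function to output the complete result as a data frame.
Thank you.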
Best answer
If you change your function as follows:
clique_function <- function(clique_list, type, object_list)
{
lapply(seq(1,ncol(clique_list)), function(i) {
subset(object_list, row.names(object_list) %in% clique_list[, i],colnames(object_list) == type)
})
}
then it will return a list of data frames, like this:
[[1]]
CA1
ND1 0.2141328
IS1 0.1677788
IS3 0.2158055
[[2]]
CA1
IS1 0.16777884
IS3 0.21580547
IS2 0.07924528
[[3]]
CA1
ND2 0.1285141
ND1 0.2141328
IS1 0.1677788
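The same list can also be built with a plain for loop, as long as each iteration's result is stored in its own list slot instead of overwriting a single variable. The following is only a minimal sketch: the name clique_function_loop is hypothetical, and the inputs are a reduced copy of the question's data (five nodes, two columns) so that the block runs on its own.

```r
# Reduced copy of the question's inputs so the sketch is self-contained.
clique_list <- matrix(c("ND1", "IS1", "IS3",
                        "IS1", "IS3", "IS2",
                        "ND2", "ND1", "IS1"), nrow = 3)
object_list <- data.frame(
  CA1  = c(0.1285140562249, 0.214132762312634, 0.167778836987607,
           0.21580547112462, 0.0792452830188679),
  CA11 = c(0.00401606425702811, 0.012847965738758, 0,
           0.00911854103343465, 0.0113207547169811),
  row.names = c("ND2", "ND1", "IS1", "IS3", "IS2")
)

# Hypothetical name: a for-loop variant that keeps one result per clique.
clique_function_loop <- function(clique_list, type, object_list) {
  out <- vector("list", ncol(clique_list))        # one slot per clique
  for (i in seq_len(ncol(clique_list))) {
    keep <- row.names(object_list) %in% clique_list[, i]
    out[[i]] <- object_list[keep, type, drop = FALSE]  # stays a data frame
  }
  out  # returned once, after every slot has been filled
}

res_loop <- clique_function_loop(clique_list, "CA1", object_list)
```

Here res_loop should have the same shape as the list printed above: one single-column data frame per clique, with rows in the order they appear in object_list.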
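You can then choose how to combine these frames. For example, with the list stored in res and dplyr loaded (it provides bind_rows and tibble), you can combine them like this:
library(dplyr)
res <- clique_function(clique_list, "CA1", object_list)
bind_rows(lapply(seq_along(res), function(x) tibble("clique"=x, "nodes"=rownames(res[[x]]), res[[x]])))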
# A tibble: 9 x 3
clique nodes CA1
<int> <chr> <dbl>
1 1 ND1 0.214
2 1 IS1 0.168
3 1 IS3 0.216
4 2 IS1 0.168
5 2 IS3 0.216
6 2 IS2 0.0792
7 3 ND2 0.129
8 3 ND1 0.214
9 3 IS1 0.168
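But I don't know what output structure you want. In practice, I would adjust clique_function further so that it returns the desired final structure in a single call.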
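One way such a single-call version could look in base R is sketched below. It is an assumption about the desired structure, not the asker's confirmed format: the name clique_function_df is hypothetical, the clique and nodes columns simply mirror the combined table shown earlier, and the inputs are a reduced copy of the question's data (five nodes, two columns).

```r
# Reduced copy of the question's inputs so the sketch is self-contained.
clique_list <- matrix(c("ND1", "IS1", "IS3",
                        "IS1", "IS3", "IS2",
                        "ND2", "ND1", "IS1"), nrow = 3)
object_list <- data.frame(
  CA1  = c(0.1285140562249, 0.214132762312634, 0.167778836987607,
           0.21580547112462, 0.0792452830188679),
  CA11 = c(0.00401606425702811, 0.012847965738758, 0,
           0.00911854103343465, 0.0113207547169811),
  row.names = c("ND2", "ND1", "IS1", "IS3", "IS2")
)

# Hypothetical name: returns one long data frame instead of a list.
clique_function_df <- function(clique_list, type, object_list) {
  pieces <- lapply(seq_len(ncol(clique_list)), function(i) {
    # Exact row lookup; nodes without a row in object_list are dropped.
    idx <- match(clique_list[, i], row.names(object_list))
    idx <- idx[!is.na(idx)]
    data.frame(clique = rep(i, length(idx)),
               nodes  = row.names(object_list)[idx],
               value  = object_list[idx, type],
               stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, pieces)   # stack the per-clique pieces
  names(out)[3] <- type           # name the value column after the type
  out
}

clique_object <- clique_function_df(clique_list, "CA1", object_list)
```

Unlike the subset() version, this sketch lists the nodes in the order they appear within each clique, and it needs no packages beyond base R.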
About r - How to make a looping function output its results as a data frame in R: a similar question was found on Stack Overflow: https://stackoverflow.com/questions/71175340/