r - Loop to process individual values in R

Tags: r loops

Here is my small dataset.

Indvidual <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
Parent1 <- c(NA, NA, "A", "A", "C", "C", "C", "E", "A", NA)
Parent2 <- c(NA, NA, "B", "C", "D", "D", "D", NA, "D", NA)
mydf <- data.frame (Indvidual, Parent1, Parent2)

  Indvidual Parent1 Parent2
1         A    <NA>    <NA>
2         B    <NA>    <NA>
3         C       A       B
4         D       A       C
5         E       C       D
6         F       C       D
7         G       C       D
8         H       E    <NA>
9         I       A       D
10        J    <NA>    <NA>

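Consider only the individuals that have one or two known parents. I need to compute a score for each individual by comparing its parents' scores.

The rule is: if at least one parent (the name in the Parent1 or Parent2 column) is known (not NA), the individual gets 1 extra point plus its parent's score. If both parents are known, the higher of the two parent scores is used.

Here is an example: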

Individual "A" has both parents unknown, so it gets a score of 0.
Individual "C" has both parents known (i.e. A and B), so it
gets 0 (the maximum of its parents' scores)

plus 1 (because at least one parent is known), giving a score of 1.
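
The rule for a single individual can be sketched as a small function. This is only an illustration under the assumption that the parents' scores are already available; the name `score_from_parents` is hypothetical, and an unknown parent is passed as NA.

```r
# Hypothetical helper: score of one individual from its parents' scores.
# An unknown parent is represented by NA.
score_from_parents <- function(p1_score, p2_score) {
  parent_scores <- c(p1_score, p2_score)
  known <- !is.na(parent_scores)
  if (!any(known)) return(0)      # neither parent known
  max(parent_scores[known]) + 1   # best known parent score, plus 1
}

score_from_parents(NA, NA)  # individual A: 0
score_from_parents(0, 0)    # individual C: 1
score_from_parents(3, NA)   # individual H: 4
```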

So the expected output for the data frame above (with explanation) is:

   Indvidual Parent1 Parent2 Scores Explanation
1          A    <NA>    <NA>      0 0 (max of parent scores, NA) + 0 (neither parent known)
2          B    <NA>    <NA>      0 0 (max of parent scores, NA) + 0 (neither parent known)
3          C       A       B      1 0 (max of parent scores) + 1 (either parent known)
4          D       A       C      2 1 (max of parent scores) + 1 (either parent known)
5          E       C       D      3 2 (max of parent scores) + 1 (either parent known)
6          F       C       D      3 2 (max of parent scores) + 1 (either parent known)
7          G       C       D      3 2 (max of parent scores) + 1 (either parent known)
8          H       E    <NA>      4 3 (max of parent scores) + 1 (either parent known)
9          I       A       D      3 2 (max of parent scores) + 1 (either parent known)
10         J    <NA>    <NA>      0 0 (max of parent scores, NA) + 0 (neither parent known)

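Explanation: as the loop progresses, it takes into account the scores that have already been calculated, using the maximum of the parents' scores.

Edit: in response to Chase's question

For example: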
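
A loop of this kind might look like the sketch below. It rests on one assumption that holds for the sample data but is not stated as a general guarantee: every parent appears in an earlier row than its children, so a parent's score is already filled in when it is needed. The `scores` vector and the `Scores` column name are chosen here for illustration.

```r
Indvidual <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
Parent1 <- c(NA, NA, "A", "A", "C", "C", "C", "E", "A", NA)
Parent2 <- c(NA, NA, "B", "C", "D", "D", "D", NA, "D", NA)
mydf <- data.frame(Indvidual, Parent1, Parent2, stringsAsFactors = FALSE)

# Assumption: every parent is listed in an earlier row than its children.
# Scores are stored in a vector named by individual so parents can be looked up.
scores <- setNames(rep(NA_real_, nrow(mydf)), mydf$Indvidual)

for (i in seq_len(nrow(mydf))) {
  parents <- c(mydf$Parent1[i], mydf$Parent2[i])
  parents <- parents[!is.na(parents)]   # keep only the known parents
  # No known parent: 0. Otherwise the best already-computed parent score plus 1.
  scores[i] <- if (length(parents) == 0) 0 else max(scores[parents]) + 1
}

mydf$Scores <- unname(scores)
mydf
```

If the rows are not ordered this way, `scores[parents]` would still contain NA for some parents, so the data would need to be sorted first or handled with a different approach.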

Individual C has two parents, A and B, whose scores were calculated as 0 and 0
(rows 1 and 2 of the Scores column), so max(c(0, 0)) is 0.

Individual E has parents C and D, whose scores in the Scores column (rows 3 and 4)
are 1 and 2, respectively, so max(c(1, 2)) is 2.

Best answer

An example using plyr and recursion

library(plyr)
Indvidual <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
Parent1 <- c(NA, NA, "A", "A", "C", "C", "C", "E", "A", NA)
Parent2 <- c(NA, NA, "B", "C", "D", "D", "D", NA, "D", NA)
mydf <- data.frame (Indvidual, Parent1, Parent2)
# Recursively score one row `x` of `mydf`.
# Returns c(score, Explanation), where Explanation is the max parent score.
scor.fun <- function(x, mydf) {
    P1 <- as.character(x$Parent1)
    P2 <- as.character(x$Parent2)
    # 1 point if at least one parent is known, otherwise 0
    score <- as.numeric(!(is.na(P1) && is.na(P2)))
    # Score of one parent: 0 if unknown, otherwise computed from the parent's own row
    parent.score <- function(p) {
        if (is.na(p)) 0 else scor.fun(subset(mydf, Indvidual == p), mydf)[1]
    }
    Explanation <- max(parent.score(P1), parent.score(P2))
    c(score + Explanation, Explanation)
}

adply(mydf,1,scor.fun,mydf)

Recursion on a large data frame is probably not the best idea.
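
One possible way to limit the cost is to cache each individual's score so it is computed only once. The sketch below is not part of the original answer; the names `cache` and `score_of` are hypothetical, and it assumes every parent named in Parent1 or Parent2 also has its own row in the data frame. It still recurses, so a very deep pedigree could remain a problem.

```r
Indvidual <- c("A", "B", "C", "D", "E", "F", "G", "H", "I", "J")
Parent1 <- c(NA, NA, "A", "A", "C", "C", "C", "E", "A", NA)
Parent2 <- c(NA, NA, "B", "C", "D", "D", "D", NA, "D", NA)
mydf <- data.frame(Indvidual, Parent1, Parent2, stringsAsFactors = FALSE)

# Hypothetical cache: scores already computed, keyed by individual name.
cache <- new.env()

# Assumption: every non-NA parent has exactly one row in mydf.
score_of <- function(id) {
  if (is.na(id)) return(NA_real_)                   # unknown parent
  if (!is.null(cache[[id]])) return(cache[[id]])    # reuse a stored score
  row <- mydf[mydf$Indvidual == id, ]
  parent_scores <- c(score_of(row$Parent1), score_of(row$Parent2))
  known <- !is.na(parent_scores)
  result <- if (any(known)) max(parent_scores[known]) + 1 else 0
  assign(id, result, envir = cache)                 # remember for later calls
  result
}

mydf$Scores <- vapply(mydf$Indvidual, score_of, numeric(1), USE.NAMES = FALSE)
mydf
```

Unlike a plain row-by-row loop, this version does not depend on parents being listed before their children.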

Regarding r - Loop to process individual values in R, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/11504287/
