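r - Find a run of `n1` TRUE wrapped between two runs of `n2` FALSE, the whole thing wrapped between runs of `n3` TRUE, and so on

From a sequence of TRUE and FALSE values, I want a function that returns TRUE if the sequence contains, somewhere, a run of at least n1 consecutive TRUE. Here is the function: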

fun_1 = function(TFvec, n1){
    nbT = 0                      # length of the current run of TRUE
    for (i in seq_along(TFvec)){
        if (TFvec[i]){
            nbT = nbT + 1
            if (nbT == n1){
                return(TRUE)     # found a run of at least n1 TRUE
            }
        } else {
            nbT = 0              # a FALSE ends the current run
        }
    }
    return(FALSE)
}

Test:

x = c(T,F,T,T,F,F,T,T,T,F,F,T,F,F)
fun_1(x,3) # TRUE
fun_1(x,4) # FALSE

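Next, I need a function that returns TRUE if the given boolean vector contains a run of at least n1 TRUE wrapped between two runs (one on each side) of at least n2 FALSE. Here is the function: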

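fun_2 = function(TFvec, n1, n2){
    if (n2 == 0){
        return(fun_1(TFvec, n1))     # no FALSE padding required
    }
    nbFB = 0         # length of the FALSE run before the current TRUE run
    nbFA = 0         # length of the FALSE run after the candidate TRUE run
    nbT = 0          # length of the current TRUE run
    solution = -1    # start index of a candidate TRUE run, -1 if none
    last = FALSE     # was the previous element TRUE?
    for (i in seq_along(TFvec)){
        if (TFvec[i]){
            if (!last){
                solution = -1        # previous candidate was not followed by n2 FALSE
            }
            nbT = nbT + 1
            if (nbT == n1 && nbFB >= n2){
                solution = i-n1+1
            }
            last = TRUE
        } else {
            if (last){
                nbFB = 0
                nbFA = 0
            }
            nbFB = nbFB + 1
            nbFA = nbFA + 1
            nbT = 0
            if (nbFA == n2 && solution != -1){
                return(TRUE)
            }
            last = FALSE
        }
    }
    return(FALSE)
}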

It is probably not a very efficient function, though! I haven't tested it 100 times, but it looks like it works fine!

Test:

x = c(T,F,T,T,F,F,T,T,T,F,F,T,F,F)
fun_2(x, 3, 2) # TRUE
fun_2(x, 3, 3) # FALSE

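Now, believe it or not, I would like to create a function (fun_3) that returns TRUE if the boolean vector contains a run of at least n1 TRUE wrapped between two runs (one on each side) of at least n2 FALSE, where the whole thing (the three runs) is itself wrapped between two runs (one on each side) of at least n3 TRUE. Since I am afraid I will have to take this even further, I am asking here for help creating a function fun_n that takes two arguments, TFvec and list_n, where list_n is a list of n values of any length.

Can you help me create the function fun_n?

Best answer

For convenience, record the number of thresholds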

n = length(list_n)

Represent the vector of TRUE and FALSE as a run-length encoding and, for convenience, remember the length of each run

r = rle(TFvec); l = r$lengths
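To make the encoding concrete, here is a small sketch using the test vector from the question. `rle()` returns a list with components `lengths` and `values`; consecutive runs always alternate between TRUE and FALSE, which is what lets the later steps simply look at neighbouring runs.

```r
# Example vector taken from the question's tests
x = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
      FALSE, FALSE, TRUE, FALSE, FALSE)

r = rle(x)
r$lengths   # 1 1 2 2 3 2 1 2
r$values    # TRUE FALSE TRUE FALSE TRUE FALSE TRUE FALSE
```

The fifth run (three TRUE) is flanked by runs of two FALSE on each side, which is the pattern `fun_2(x, 3, 2)` detects.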

Find the possible starting positions

idx = which(l >= list_n[1] & r$values)

Make sure the starting positions are embedded deeply enough to satisfy all the tests

idx = idx[idx > n - 1 & idx + n - 1 <= length(l)]

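Then check that the lengths of successively more remote runs are consistent with the conditions, keeping only the starting points that satisfy them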

for (i in seq_len(n - 1)) {
    if (length(idx) == 0)
        break     # no solution
    thresh = list_n[i + 1]
    test = (l[idx + i] >= thresh) & (l[idx - i] >= thresh)
    idx = idx[test]
}

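If any values remain in idx, they are the indexes into the rle that satisfy the conditions; the starting points in the original vector are cumsum(l)[idx - 1] + 1.

Putting it together: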
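As a worked sketch of the steps above, the following runs them on the question's test vector with illustrative thresholds `list_n = c(3, 2)`. The `c(0, cumsum(l))[idx] + 1` form is a small variation introduced here, not part of the answer: it gives the same result as `cumsum(l)[idx - 1] + 1` when `idx > 1`, and also covers a match in the very first run (possible when `list_n` has a single element).

```r
x = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
      FALSE, FALSE, TRUE, FALSE, FALSE)
list_n = c(3, 2)              # illustrative thresholds: n1 = 3, n2 = 2
n = length(list_n)
r = rle(x); l = r$lengths     # l is 1 1 2 2 3 2 1 2

# candidate runs: TRUE runs of length >= n1 with enough neighbours on each side
idx = which(l >= list_n[1] & r$values)
idx = idx[idx > n - 1 & idx + n - 1 <= length(l)]

# keep candidates whose i-th neighbours on both sides are long enough
for (i in seq_len(n - 1)) {
    thresh = list_n[i + 1]
    idx = idx[(l[idx + i] >= thresh) & (l[idx - i] >= thresh)]
}
idx                                  # 5: the fifth run qualifies

# position in x where each qualifying run begins
starts = c(0, cumsum(l))[idx] + 1
starts                               # 7
```

The start position 7 matches the value that `fun_2` stores in `solution` for the same input.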

runfun = function(TFvec, list_n) {
    ## setup
    n = length(list_n)
    r = rle(TFvec); l = r$lengths

    ## initial condition
    idx = which(l >= list_n[1] & r$values)
    idx = idx[idx > n - 1 & idx + n - 1 <= length(l)]

    ## adjacent conditions
    for (i in seq_len(n - 1)) {
        if (length(idx) == 0)
            break     # no solution
        thresh = list_n[i + 1]
        test = (l[idx + i] >= thresh) & (l[idx - i] >= thresh)
        idx = idx[test]
    }

    ## starts = cumsum(l)[idx - 1] + 1
    ## any luck?
    length(idx) != 0
}
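One way to sanity-check a function like runfun on small inputs is an independent formulation. The sketch below is not part of the answer: `regex_check` is a hypothetical helper that turns the vector into a string of "T" and "F" and builds a nested regular expression from the thresholds. It assumes every threshold is at least 1 and small enough for PCRE's repetition limit, and it is meant for cross-checking only, not for speed.

```r
# Hypothetical helper for cross-checking; assumes all thresholds are >= 1
regex_check = function(TFvec, list_n) {
    level = seq_along(list_n)
    sym = ifelse(level %% 2 == 1, "T", "F")        # levels alternate T, F, T, ...
    piece = sprintf("%s{%d,}", sym, as.integer(list_n))
    # outermost level first, innermost in the middle, then back out
    pattern = paste(c(rev(piece[-1]), piece), collapse = "")
    s = paste(ifelse(TFvec, "T", "F"), collapse = "")
    grepl(pattern, s, perl = TRUE)
}

x = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
      FALSE, FALSE, TRUE, FALSE, FALSE)
regex_check(x, c(3, 2))      # TRUE,  pattern F{2,}T{3,}F{2,}
regex_check(x, c(3, 3))      # FALSE, no run of three FALSE
regex_check(x, c(3, 2, 2))   # FALSE, the outer TRUE run on the right has length 1
```

The first two results agree with the `fun_2` tests in the question.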

This is fast, and it allows runs >= the threshold, as stipulated in the question; for example

x = sample(c(TRUE, FALSE), 1000000, TRUE)
system.time(runfun(x, rep(2, 5)))

completes in less than 1/5 of a second.

A fun generalization allows for flexible conditions, for instance runs of exactly list_n, as in the rollapply solution

runfun = function(TFvec, list_n, cond=`>=`) {
    ## setup
    n = length(list_n)
    r = rle(TFvec); l = r$lengths

    ## initial condition
    idx = which(cond(l, list_n[1]) & r$values)
    idx = idx[idx > n - 1 & idx + n - 1 <= length(l)]

    ## adjacent conditions
    for (i in seq_len(n - 1)) {
        if (length(idx) == 0)
            break     # no solution
        thresh = list_n[i + 1]
        test = cond(l[idx + i], thresh) & cond(l[idx - i], thresh)
        idx = idx[test]
    }

    ## starts = cumsum(l)[idx - 1] + 1
    ## any luck?
    length(idx) != 0
}
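The `cond` argument works because R operators are ordinary functions that can be referred to by their backtick-quoted names. A minimal sketch of that mechanism on the run lengths of the question's test vector (the names `at_least` and `exactly` are only for illustration):

```r
x = c(TRUE, FALSE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE,
      FALSE, FALSE, TRUE, FALSE, FALSE)
l = rle(x)$lengths      # 1 1 2 2 3 2 1 2

cond = `>=`             # the default: runs of at least the threshold
at_least = cond(l, 2)   # FALSE FALSE TRUE TRUE TRUE TRUE FALSE TRUE

cond = `==`             # runs of exactly the threshold
exactly = cond(l, 2)    # FALSE FALSE TRUE TRUE FALSE TRUE FALSE TRUE
```

Passing cond = `==` to runfun therefore requires every run involved to match its threshold exactly, rather than meet or exceed it.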

Regarding "r - Find a run of `n1` TRUE wrapped between two runs of `n2` FALSE, the whole thing wrapped between runs of `n3` TRUE, and so on", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/28756848/
