R - Generating unique sequences of a list with duplicates

In R, I want to generate the unique sequences (orderings) of the elements in a list where some of the elements are not unique.

sequence <- c(1,0,1,0)

For example:

result<-function(sequence)  
result:
  seq1 seq2 seq3 seq4 seq5 seq6
1    1    1    0    0    0    1
2    0    1    0    1    1    0
3    1    0    1    0    1    0
4    0    0    1    1    0    1

Note that every sequence contains each element of the original sequence, so the sum of each sequence is always 2.

gtools returns "too few different elements":

result <- gtools::permutations(4, 4, sequence)

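I did not find any SO posts that address this problem directly. The ones I found instead allow elements to repeat, for example "Creating combination of sequences", which is achievable with expand.grid and different sequence lengths.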

Edit: The above is a minimal example. Ideally the solution would work for the sequence:

 sequence = c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

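It is fairly important that the solution does not generate duplicates that are removed afterwards, because with longer sequences (e.g. length 20 or 30) generating the duplicates would be computationally very demanding.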

Best answer

There are a couple of packages built specifically for this.

First, the arrangements package:

## sequence is a bad name as it is a base R function so we use s instead
s <- c(1,0,1,0)
arrangements::permutations(unique(s), length(s), freq = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1
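
A side note on the call above, not part of the original answer: `unique(s)` keeps values in order of first appearance, while `table(s)` reports counts in sorted-value order. The two line up here because both counts are 2. The base R sketch below shows a pairing that does not rely on that; the vector `s2` and the names `vals` and `cnts` are made up for illustration.

```r
## hypothetical vector where first-appearance order differs from sorted order
s2 <- c(1, 1, 1, 0)

unique(s2)              # first-appearance order: 1 0
as.vector(table(s2))    # counts in sorted-value order: 1 (for 0), 3 (for 1)

## pair each value with its own count by sorting the unique values
vals <- sort(unique(s2))
cnts <- as.vector(table(s2))   # same sorted order as vals

## sanity check: expanding (vals, cnts) reproduces the multiset
stopifnot(identical(rep(vals, times = cnts), sort(s2)))
```

Under that assumption, `vals` and `cnts` should be usable in place of `unique(s)` and `table(s)` when the counts differ.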

Next, we have RcppAlgos (I am the author):

RcppAlgos::permuteGeneral(unique(s), length(s), freqs = table(s))
     [,1] [,2] [,3] [,4]
[1,]    1    1    0    0
[2,]    1    0    1    0
[3,]    1    0    0    1
[4,]    0    1    1    0
[5,]    0    1    0    1
[6,]    0    0    1    1
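
To illustrate what "not generating duplicates" means, here is a minimal base R sketch. It is not taken from either package and will be far slower than both; the function name `multiset_perms` is hypothetical. The idea is to choose the next element only among the distinct values that still have copies left, so no ordering is ever produced twice.

```r
## Sketch: enumerate the distinct permutations of a multiset.
## vals: distinct values; cnts: remaining copies of each value (sum(cnts) >= 1).
multiset_perms <- function(vals, cnts) {
  if (sum(cnts) == 1) {
    return(matrix(vals[cnts > 0], nrow = 1))   # one element left: a single 1x1 row
  }
  blocks <- lapply(which(cnts > 0), function(i) {
    rest <- cnts
    rest[i] <- rest[i] - 1                     # place one copy of vals[i] first
    cbind(vals[i], multiset_perms(vals, rest)) # then permute what remains
  })
  do.call(rbind, blocks)
}

s <- c(1, 0, 1, 0)
vals <- unique(s)                                        # 1 0
cnts <- tabulate(match(s, vals), nbins = length(vals))   # counts aligned with vals
m <- multiset_perms(vals, cnts)

## one sequence per column, as in the question's layout
res <- t(m)
colnames(res) <- paste0("seq", seq_len(ncol(res)))
```

For `s` this gives a 6 x 4 matrix `m` with one sequence per row, and `res` is the transposed layout with columns `seq1` to `seq6`.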

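Both packages are very efficient. To give you an idea, for the OP's actual need, other approaches will either fail (I think there is a limit on the number of rows of a matrix, 2^31 - 1, but I am not sure) or take a very long time, because they have to generate all 16! ~= 2.092e+13 permutations before any further processing. With these two packages, however, the result comes back instantly: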

## actual example needed by OP
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

system.time(a <- arrangements::permutations(unique(sBig), length(sBig), freq = table(sBig)))
   user  system elapsed 
  0.001   0.001   0.002 

system.time(b <- RcppAlgos::permuteGeneral(unique(sBig), length(sBig), freqs = table(sBig)))
   user  system elapsed 
  0.001   0.001   0.002 

identical(a, b)
[1] TRUE

dim(a)
[1] 11440    16
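
As a cross-check on the row count, the number of distinct orderings of a multiset is the multinomial coefficient n! / (c1! * c2! * ...), where the c's are the counts of each distinct value. A small base R sketch of that arithmetic (the names `cnts` and `n_distinct` are illustrative):

```r
sBig <- c(0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1)

cnts <- as.vector(table(sBig))    # 7 zeros, 9 ones
n_distinct <- factorial(length(sBig)) / prod(factorial(cnts))
n_distinct                        # 11440, the number of rows reported by dim(a)

## with only two distinct values this is a binomial coefficient
choose(length(sBig), cnts[1])     # 11440

## size of the naive approach: all 16! orderings, most of them duplicates
factorial(16)                          # about 2.092e+13
factorial(16) > .Machine$integer.max   # TRUE: more than 2^31 - 1
```

So a generate-then-deduplicate approach would build roughly 1.8 billion times more rows than the 11440 that are actually distinct.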

Regarding "R - Generating unique sequences of a list with duplicates", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55172503/
