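This question is related to "Wide a dataframe and insert missing columns".
Suppose we have a given pattern of 5 elements, in the following order: "A", "B", "C", "D", "E".
This pattern is repeated 10 times, but some elements are sometimes missing (see "my vector" in orange in the picture).
Is it possible in R to identify the repeating pattern and fill in the missing elements (see "my desired output" in the picture)?
My vector: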
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
"D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
"E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
"E", "B")
my.vector
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "B" "C" "D" "E" "A" "B" "C" "D" "E" "B"
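Before filling anything in, it can help to see how often each element of the pattern actually occurs. The following is a minimal base R sketch, assuming the pattern is LETTERS[1:5]; the names `pattern`, `counts` and `incomplete` are introduced here for illustration only and are not part of the original question.

```r
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
               "D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
               "E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
               "E", "B")

# Assumed pattern (taken from the question text)
pattern <- LETTERS[1:5]

# Count occurrences of each pattern element; factor() keeps zero counts visible
counts <- table(factor(my.vector, levels = pattern))
counts

# Elements that occur less often than the most frequent element
incomplete <- names(counts)[as.vector(counts) < max(counts)]
incomplete
```

An element that appears fewer times than the most frequent one is missing from at least one repetition, which is what the answers below fill in.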
Graphical explanation: [image]
Best answer
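Create a grouping variable from the diff of the match index against LETTERS[1:5], split the vector by it (or use any grouping function such as tapply), take the union of each piece with LETTERS[1:5], then unlist the resulting list and unname it.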
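# Start a new group wherever the matched index does not increase by exactly 1,
# pad each group to the full pattern, then flatten and drop the names
unname(unlist(lapply(
  split(my.vector, cumsum(c(TRUE, diff(match(my.vector, LETTERS[1:5])) != 1))),
  function(x) union(LETTERS[1:5], x))))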
-output
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
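The one-liner above can be easier to follow when unrolled into its intermediate steps. This is only a sketch of the same idea in base R; the names `pattern`, `idx`, `starts`, `grp`, `pieces`, `filled` and `result` are introduced here for illustration and do not appear in the answer.

```r
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
               "D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
               "E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
               "E", "B")
pattern <- LETTERS[1:5]

# Step 1: position of each element within the pattern (A = 1, ..., E = 5)
idx <- match(my.vector, pattern)

# Step 2: flag a new repetition wherever the position does not advance by exactly 1
starts <- c(TRUE, diff(idx) != 1)

# Step 3: the running count of flags gives one group id per element
grp <- cumsum(starts)

# Step 4: split by group and pad each piece to the full pattern;
# union() puts the pattern first, so every piece comes out in pattern order
pieces <- split(my.vector, grp)
filled <- lapply(pieces, function(x) union(pattern, x))

# Step 5: flatten the list and drop the names added by split()
result <- unname(unlist(filled))

max(grp)         # number of repetitions detected
lengths(pieces)  # how many elements each repetition originally had
```

Note that the `!= 1` condition also starts a new group when an element is missing in the middle of a repetition (for example "A", "B", "D", "E"), so such a repetition would be padded twice. If middle gaps can occur, one possible variation (an assumption, not part of the answer) is `diff(idx) <= 0`, which starts a group only when the position fails to increase.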
Another option is complete from tidyr:
library(dplyr)
library(tidyr)
library(data.table)  # for rowid()
tibble(col1 = my.vector) %>%
  group_by(rn = rowid(col1)) %>%      # occurrence number of each value
  complete(col1 = LETTERS[1:5]) %>%   # add the missing letters within each group
  ungroup() %>%
  pull(col1)
-output
[1] "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E" "A"
[37] "B" "C" "D" "E" "A" "B" "C" "D" "E" "A" "B" "C" "D" "E"
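To see why grouping by rowid works, it may help to reproduce the grouping in base R. The sketch below assumes that `ave(seq_along(x), x, FUN = seq_along)` is an acceptable stand-in for `data.table::rowid(x)` (both number the occurrences of each distinct value); the names `pattern`, `rn`, `pieces` and `result` are illustrative only.

```r
my.vector <- c("A", "B", "C", "D", "E", "A", "B", "C", "D", "E", "B", "C",
               "D", "E", "B", "C", "D", "E", "B", "C", "D", "E", "B", "C", "D",
               "E", "B", "C", "D", "E", "B", "C", "D", "E", "A", "B", "C", "D",
               "E", "B")
pattern <- LETTERS[1:5]

# Occurrence counter per value: the k-th "A" gets k, the k-th "B" gets k, ...
rn <- ave(seq_along(my.vector), my.vector, FUN = seq_along)

# One group per occurrence number, then pad each group to the full pattern
pieces <- split(my.vector, rn)
result <- unname(unlist(lapply(pieces, function(x) union(pattern, x))))

rn[35]   # occurrence number of the "A" at position 35
max(rn)  # number of groups, driven by the most frequent element
```

Here the groups are formed by occurrence number rather than by position, so the "A" at position 35 ends up in group 3 even though it belongs to a later repetition. The final vector is still the same, because every group is padded to the full pattern and the number of groups equals the count of the most frequent element.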
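On identifying a given pattern in a vector and adding the missing elements to obtain repetitions of that pattern, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/68885775/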