I would like to create new columns from a column that contains ID sequences, such as ID1-5.
Say we have the following tibble:
# A tibble: 3 x 2
  group id_seq
  <chr> <chr>
1 A     ID_Nr61-64
2 A     ID_Nr67-69
3 B     ID_Nr73-75
My desired output is:
# A tibble: 3 x 6
  group id_seq     id1     id2     id3     id4
  <chr> <chr>      <chr>   <chr>   <chr>   <chr>
1 A     ID_Nr61-64 ID_Nr61 ID_Nr62 ID_Nr63 ID_Nr64
2 A     ID_Nr67-69 ID_Nr67 ID_Nr68 ID_Nr69 NA
3 B     ID_Nr73-75 ID_Nr73 ID_Nr74 ID_Nr75 NA
Thanks for your help!
Best answer
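We can extract the first and last numbers from id_seq, go rowwise, build the sequence between them with `:` to expand the data, and then reshape from 'long' to 'wide' with pivot_wider.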
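library(dplyr)
library(tidyr)
library(stringr)
library(data.table)  # only needed for rowid()

df1 %>%
  # s1/s2: first and last number in id_seq (start and end of the range)
  mutate(s1 = as.numeric(str_extract(id_seq, "\\d+")),
         s2 = as.numeric(str_extract(id_seq, "\\d+$"))) %>%
  mutate(rn = row_number()) %>%
  rowwise() %>%
  # one output row per value in s1:s2 (dplyr >= 1.1.0 prefers reframe() here)
  summarise(rn, group, id_seq, se = (s1:s2)) %>%
  # swap the "start-end" suffix for the single number; number the ids per row
  mutate(id_seq2 = str_replace(id_seq, "\\d+-\\d+$", as.character(se)),
         rn2 = str_c("id", rowid(rn)), se = NULL) %>%
  pivot_wider(names_from = rn2, values_from = id_seq2) %>%
  select(-rn)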
-output
# A tibble: 3 × 6
  group id_seq     id1     id2     id3     id4
  <chr> <chr>      <chr>   <chr>   <chr>   <chr>
1 A     ID_Nr61-64 ID_Nr61 ID_Nr62 ID_Nr63 ID_Nr64
2 A     ID_Nr67-69 ID_Nr67 ID_Nr68 ID_Nr69 <NA>
3 B     ID_Nr73-75 ID_Nr73 ID_Nr74 ID_Nr75 <NA>
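To make the extract-and-expand step easier to follow, here is a minimal sketch that applies the same regular expressions to a single string. The helper name `expand_id` is hypothetical and not part of the original answer, and the sketch assumes the only digits before the range belong to the start number (a prefix such as `ID2_Nr` would break the first pattern).

```r
library(stringr)

# Hypothetical helper for illustration: expand one "<prefix><start>-<end>"
# string into its individual IDs.
expand_id <- function(x) {
  start  <- as.numeric(str_extract(x, "\\d+"))   # first run of digits
  end    <- as.numeric(str_extract(x, "\\d+$"))  # run of digits at the end
  prefix <- str_remove(x, "\\d+-\\d+$")          # text before the range
  str_c(prefix, start:end)                       # prefix pasted onto each number
}

expand_id("ID_Nr61-64")
```

The pipeline performs this expansion once per row and then spreads the resulting values across the id1, id2, ... columns.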
Or using base R
# turn "ID_Nr61-64" into the text "61:64", evaluate it to get the sequence,
# and paste the "ID_Nr" prefix back on
lst1 <- lapply(sub("-", ":", sub("ID_Nr", "", df1$id_seq)),
               function(x) paste0("ID_Nr", eval(parse(text = x))))
mx <- max(lengths(lst1))
# pad each vector with NA to the maximum length and bind into a matrix
m1 <- do.call(rbind, lapply(lst1, `length<-`, mx))
df1[paste0("id", seq_len(ncol(m1)))] <- m1
df1

  group     id_seq     id1     id2     id3     id4
1     A ID_Nr61-64 ID_Nr61 ID_Nr62 ID_Nr63 ID_Nr64
2     A ID_Nr67-69 ID_Nr67 ID_Nr68 ID_Nr69    <NA>
3     B ID_Nr73-75 ID_Nr73 ID_Nr74 ID_Nr75    <NA>
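If evaluating parsed text with `eval(parse(...))` is a concern, the same idea can probably be expressed with `strsplit()` and `seq()` instead. This is only a sketch of an alternative, not the original answer's code; it assumes every value has the fixed prefix `ID_Nr` followed by `start-end`, and the names `bounds`, `ids`, `width` and `m` are illustrative.

```r
df1 <- data.frame(group = c("A", "A", "B"),
                  id_seq = c("ID_Nr61-64", "ID_Nr67-69", "ID_Nr73-75"))

# Drop the (assumed fixed) prefix, then split "61-64" into c("61", "64")
bounds <- strsplit(sub("^ID_Nr", "", df1$id_seq), "-", fixed = TRUE)

# Build each row's IDs with seq() rather than evaluating parsed text
ids <- lapply(bounds, function(b) {
  b <- as.integer(b)
  paste0("ID_Nr", seq(b[1], b[2]))
})

# Pad shorter rows with NA and bind everything into a character matrix
width <- max(lengths(ids))
m <- do.call(rbind, lapply(ids, `length<-`, width))

df1[paste0("id", seq_len(width))] <- m
df1
```

Rows with shorter ranges end up with NA in the trailing id columns, as in the desired output.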
data
df1 <- structure(list(group = c("A", "A", "B"), id_seq = c("ID_Nr61-64",
"ID_Nr67-69", "ID_Nr73-75")), class = "data.frame", row.names = c("1",
"2", "3"))
Regarding r - expanding columns based on an id sequence, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/71209630/