I have a data frame in which one column points to the next record. Here is a sample data frame.
OG_Data <- data.frame(
Record = c("aaaa", "NNNN", "rrrr", "tttt", "pppp", "ssss", "bbbb"),
NextRecord = c("pppp", "tttt", "bbbb", "N/A" , "NNNN", "rrrr", "N/A")
)
# Record NextRecord
# aaaa pppp
# NNNN tttt
# rrrr bbbb
# tttt N/A
# pppp NNNN
# ssss rrrr
# bbbb N/A
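I want to sort this data frame by the predefined sequence given in column B (NextRecord), which points to column A (Record) of the next record, so that I get both the position within each sequence and the line group.
Desired output: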
# Record NextRecord Sequence Line
# aaaa pppp 1 1
# pppp NNNN 2 1
# NNNN tttt 3 1
# tttt N/A 4 1
# ssss rrrr 1 2
# rrrr bbbb 2 2
# bbbb N/A 3 2
I was thinking of something like this:
OG_Data[1,] %>%
add_row(OG_Data, filter(OG_Data, OG_Data$Record == NextRecord))
But this doesn't work and isn't scalable. Also, I'm not sure where to begin with finding the start of each line group.
Best answer
I bet there's an easier way, but at least it's fun to treat this as a graph problem.
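library(igraph)
# Build a directed graph Record -> NextRecord, then drop the "N/A" terminator vertex
g = delete_vertices(graph_from_data_frame(OG_Data), "N/A")
# Each weakly connected component is one chain ("Line")
OG_Data$Line = components(g)$membership[OG_Data$Record]
# Sort by Line, then by topological position within the chain
OG_Data[order(OG_Data$Line, factor(OG_Data$Record, levels = names(topo_sort(g)))), ]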
Record NextRecord Line
1 aaaa pppp 1
5 pppp NNNN 1
2 NNNN tttt 1
4 tttt N/A 1
6 ssss rrrr 2
3 rrrr bbbb 2
7 bbbb N/A 2
Then see "Numbering rows within groups in a data frame" to add the Sequence column.
plot(g)
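The group-wise numbering step is not spelled out in the answer. A minimal sketch of one way to do it in base R follows, assuming the sorted result is stored in a data frame called `sorted` (a hypothetical name; here it is rebuilt by hand from the printed output so the snippet runs without igraph):

```r
# Sorted result from the answer, rebuilt by hand (hypothetical name `sorted`).
sorted <- data.frame(
  Record     = c("aaaa", "pppp", "NNNN", "tttt", "ssss", "rrrr", "bbbb"),
  NextRecord = c("pppp", "NNNN", "tttt", "N/A", "rrrr", "bbbb", "N/A"),
  Line       = c(1, 1, 1, 1, 2, 2, 2),
  stringsAsFactors = FALSE
)

# ave() applies seq_along() separately within each Line group,
# so the numbering restarts at 1 whenever Line changes.
sorted$Sequence <- ave(seq_along(sorted$Line), sorted$Line, FUN = seq_along)

print(sorted)
```

Because `ave()` numbers rows in the order they appear, this only gives the intended Sequence if the data frame has already been sorted as above.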
A less efficient attempt, for the record:
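g = graph_from_data_frame(OG_Data)
# From every vertex with no incoming edge, list the simple paths to "N/A"
g2 = sapply(V(g)[degree(g, mode = 'in') == 0], all_simple_paths, graph = g, "N/A")
# Flatten to vertex ids (which match row numbers here) and drop the "N/A" endpoints
d2 = OG_Data[{x = unlist(g2); x[!endsWith(names(x), ".N/A")]},]
# Label the rows of each path with that path's Line number
d2$Line = rep.int(seq_along(g2), lengths(g2) - 1)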
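For comparison, the same result can probably be reached without igraph by following the pointers directly. This is a sketch rather than part of the original answer: `starts`, `next_of`, `walk_chain`, `chains`, and `result` are all hypothetical names. It assumes there are no cycles and that every pointer other than "N/A" names an existing record. It also addresses the question of where each line group begins: a chain starts at any record that no other record points to.

```r
OG_Data <- data.frame(
  Record     = c("aaaa", "NNNN", "rrrr", "tttt", "pppp", "ssss", "bbbb"),
  NextRecord = c("pppp", "tttt", "bbbb", "N/A", "NNNN", "rrrr", "N/A"),
  stringsAsFactors = FALSE
)

# A chain starts at any record that never appears in NextRecord.
starts <- setdiff(OG_Data$Record, OG_Data$NextRecord)

# Lookup table: record name -> name of the record it points to.
next_of <- setNames(OG_Data$NextRecord, OG_Data$Record)

# Follow the pointers from one start until the "N/A" terminator.
# Assumption: no cycles, and every non-"N/A" pointer is a known record.
walk_chain <- function(start) {
  path <- character(0)
  current <- start
  while (current != "N/A") {
    path <- c(path, current)
    current <- next_of[[current]]
  }
  path
}

chains <- lapply(starts, walk_chain)

# Reorder the rows along the chains, then number within and across chains.
result <- OG_Data[match(unlist(chains), OG_Data$Record), ]
result$Sequence <- sequence(lengths(chains))
result$Line <- rep(seq_along(chains), lengths(chains))
rownames(result) <- NULL

print(result)
```

Growing `path` with `c()` is quadratic in the chain length, so this is meant for illustration on small inputs; the graph-based version is likely the better choice for long chains.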
Regarding "r - Sort a data frame based on a column that points to the next record", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/73638783/