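For each value of myvector1, I want the value of mycategory from the previous occurrence of the same value in myvector1, provided that mystatus is ON for that occurrence; otherwise I keep looking back to the next earlier occurrence of the same value until I find one where mystatus is ON.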
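The steps are as follows:
1. Find the previous occurrence of the same value in myvector1.
2. Check whether mystatus is ON for that row.
3. If it is ON, take the value of mycategory. If it is OFF, move to the next earlier occurrence and go back to step 2.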
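The lookup described above can be written out literally as a backward scan in base R. This is only a minimal sketch: the helper name prev_on_category is hypothetical, and the three input vectors are the first eight rows of mydf retyped by hand so that the block runs on its own.

```r
# Hypothetical helper: for each position i, walk back through earlier
# positions and return the category of the nearest one that has the
# same value and status "ON"; NA if no such position exists.
prev_on_category <- function(value, status, category) {
  out <- rep(NA_character_, length(value))
  for (i in seq_along(value)) {
    j <- i - 1
    while (j >= 1) {
      if (value[j] == value[i] && status[j] == "ON") {
        out[i] <- category[j]
        break
      }
      j <- j - 1  # OFF or different value: keep looking further back
    }
  }
  out
}

# First eight rows of mydf (myvector1, mystatus, mycategory)
value    <- c(0L, 1L, 2L, 3L, 4L, 0L, 1L, 3L)
status   <- c("ON", "OFF", "ON", "ON", "OFF", "ON", "ON", "ON")
category <- c("hi", "hi", "stay", "bye", "bye", "bye", "bye", "stay")

result <- prev_on_category(value, status, category)
print(result)
# NA NA NA NA NA "hi" NA "bye"
```

Row 7 shows the condition at work: the only earlier row with value 1 is row 2, which is OFF, so the result is NA. The loop is quadratic in the worst case, so it is meant to illustrate the rule rather than to be used on large data.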
Given the dataset mydf, what I am looking for is DesiredSolution (which I filled in manually).
mydf <- structure(list(myvector1 = structure(c(1L, 2L, 3L, 4L, 5L, 1L,
2L, 4L, 5L, 2L, 3L, 4L, 5L, 2L, 3L, 5L, 1L, 2L, 3L, 4L, 5L, 1L,
2L, 4L, 5L, 1L, 1L, 2L, 3L, 4L, 5L, 3L), .Label = c("0", "1",
"2", "3", "4"), class = "factor"), mystatus = structure(c(2L,
1L, 2L, 2L, 1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L, 1L, 2L, 2L, 2L,
1L, 2L, 2L, 1L, 1L, 2L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 2L), .Label = c("OFF",
"ON"), class = "factor"), mycategory = structure(c(2L, 2L, 3L,
1L, 1L, 1L, 1L, 3L, 3L, 1L, 2L, 2L, 3L, 1L, 1L, 1L, 1L, 1L, 2L,
2L, 3L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L), .Label = c("bye",
"hi", "stay"), class = "factor"), DesiredSolution = structure(c(3L,
3L, 3L, 3L, 3L, 2L, 3L, 1L, 3L, 1L, 4L, 4L, 4L, 1L, 2L, 4L, 1L,
1L, 1L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 1L, 2L), .Label = c("bye",
"hi", "NA", "stay"), class = "factor")), .Names = c("myvector1",
"mystatus", "mycategory", "DesiredSolution"), row.names = c(NA,
-32L), class = "data.frame")
Best answer
With data.table...
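library(data.table)
setDT(mydf)        # convert mydf to a data.table by reference
mydf[, r := .I]    # r = row number
# Non-equi join: for each row, take mycategory from the last earlier ON row with the same myvector1
mydf[, v := mydf[mystatus == "ON"][mydf, on=.(r < r, myvector1), mult="last", x.mycategory]]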
which gives
    myvector1 mystatus mycategory DesiredSolution  r    v
 1:         0       ON         hi              NA  1   NA
 2:         1      OFF         hi              NA  2   NA
 3:         2       ON       stay              NA  3   NA
 4:         3       ON        bye              NA  4   NA
 5:         4      OFF        bye              NA  5   NA
 6:         0       ON        bye              hi  6   hi
 7:         1       ON        bye              NA  7   NA
 8:         3       ON       stay             bye  8  bye
 9:         4       ON       stay              NA  9   NA
10:         1      OFF        bye             bye 10  bye
11:         2       ON         hi            stay 11 stay
12:         3       ON         hi            stay 12 stay
13:         4       ON       stay            stay 13 stay
14:         1      OFF        bye             bye 14  bye
15:         2       ON        bye              hi 15   hi
16:         4       ON        bye            stay 16 stay
17:         0       ON        bye             bye 17  bye
18:         1      OFF        bye             bye 18  bye
19:         2       ON         hi             bye 19  bye
20:         3       ON         hi              hi 20   hi
21:         4      OFF       stay             bye 21  bye
22:         0      OFF        bye             bye 22  bye
23:         1       ON        bye             bye 23  bye
24:         3      OFF        bye              hi 24   hi
25:         4       ON        bye             bye 25  bye
26:         0      OFF        bye             bye 26  bye
27:         0      OFF         hi             bye 27  bye
28:         1      OFF         hi             bye 28  bye
29:         2      OFF         hi              hi 29   hi
30:         3      OFF         hi              hi 30   hi
31:         4      OFF       stay             bye 31  bye
32:         2       ON       stay              hi 32   hi
    myvector1 mystatus mycategory DesiredSolution  r    v
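How it works: for each row of mydf, look up the rows in mydf[mystatus == "ON"] where the row number r is lower and myvector1 matches. Return mycategory, taking the last matching row if there are multiple matches.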
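To make the join easier to follow, here is the same call split into named steps on a small table. This is a sketch under assumptions: dt holds only the first eight rows of mydf, retyped by hand as plain integer and character columns rather than factors, and on_rows is a name introduced here for the filtered lookup table.

```r
library(data.table)

# First eight rows of mydf, retyped so the block runs on its own
dt <- data.table(
  myvector1  = c(0L, 1L, 2L, 3L, 4L, 0L, 1L, 3L),
  mystatus   = c("ON", "OFF", "ON", "ON", "OFF", "ON", "ON", "ON"),
  mycategory = c("hi", "hi", "stay", "bye", "bye", "bye", "bye", "stay")
)
dt[, r := .I]  # row number, used to express "earlier than"

# Lookup table: only ON rows are allowed to supply a category
on_rows <- dt[mystatus == "ON"]

# For every row of dt (the inner "i" table), match rows of on_rows with
# the same myvector1 and a strictly smaller r (on_rows$r < dt$r).
# mult = "last" keeps only the last, i.e. nearest earlier, match.
# The x. prefix selects mycategory from on_rows rather than from dt.
dt[, v := on_rows[dt, on = .(r < r, myvector1), mult = "last", x.mycategory]]

print(dt)
```

Rows without any earlier ON match get NA in v, which corresponds to rows 1 to 5, 7 and 9 of the full output above.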
Source: a similar question on Stack Overflow, "r - Find previous same value in a vector and apply certain conditions": https://stackoverflow.com/questions/47815579/