I want to flag, for each name, the first (minimum) entry that matches a regular expression.
Here is some data:
# Here I define the dates:
dates <- as.Date(as.character(c("2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17",
"2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17",
"2011-01-13",
"2011-01-14",
"2011-01-15",
"2011-01-16",
"2011-01-17")))
# Here I define the Names
Name <-c("Andy","Andy","Andy","Andy","Andy","Jo","Jo","Jo","Jo","Jo","Me","Me","Me","Me",'Me')
# Here I define the status character
status <- c("ID: 10 -> 1","ID: 11 -> 0","ID: 3 -> 5","ID: 20 -> 4","ID: 1 -> 5","ID: 1 -> 1","ID: 3 -> 2","ID: 20 -> 5","ID: 10 -> 5","ID: 11 -> 5","ID: 12 -> 1","ID: 30 -> 2","ID: 30 -> 5","ID: 30 -> 2","ID: 30 -> 5")
# put together
data <- data.frame(Name, dates, status)
# Here is the desired output: the column condition_met is 1 only for the
# first change of ID from something to 5 within each Name, and 0 otherwise
   Name      dates      status condition_met
1  Andy 2011-01-13 ID: 10 -> 1             0
2  Andy 2011-01-14 ID: 11 -> 0             0
3  Andy 2011-01-15  ID: 3 -> 5             1
4  Andy 2011-01-16 ID: 20 -> 4             0
5  Andy 2011-01-17  ID: 1 -> 5             0
6    Jo 2011-01-13  ID: 1 -> 1             0
7    Jo 2011-01-14  ID: 3 -> 2             0
8    Jo 2011-01-15 ID: 20 -> 5             1
9    Jo 2011-01-16 ID: 10 -> 5             0
10   Jo 2011-01-17 ID: 11 -> 5             0
11   Me 2011-01-13 ID: 12 -> 1             0
12   Me 2011-01-14 ID: 30 -> 2             0
13   Me 2011-01-15 ID: 30 -> 5             1
14   Me 2011-01-16 ID: 30 -> 2             0
15   Me 2011-01-17 ID: 30 -> 5             0
I tried:
data$condition_met <- ifelse(grepl("-> 5",data$status),1,0)
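This creates a condition_met column, but unfortunately it flags every "-> 5" rather than only the minimum, that is, the first "-> 5" for each name.
Best answer
We can create a function that flags the first match of the condition. Then use base R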
, dplyr
, or data.table
to call it by group:
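# Flag the first element containing "-> 5"; %in% gives all zeros (not NA) when nothing matches
condition <- function(x) as.integer(seq_along(x) %in% grep("-> 5", x, fixed = TRUE)[1])
# base R (as.character() guards against status being stored as a factor)
data$condition_met <- as.integer(with(data, ave(as.character(status), Name, FUN = condition)))
# data.table (setDT converts data to a data.table by reference)
library(data.table)
setDT(data)[, condition_met := condition(status), by = Name]
# dplyr
library(dplyr)
data %>% group_by(Name) %>% mutate(condition_met = condition(status)) %>% ungroup()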
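To check the "first match per group" idea in isolation, here is a minimal sketch on a reduced version of the data. The helper name `first_match_flag` and the `cumsum` formulation are my own illustration, not part of the answer above. I am assuming that a group with no match should get all zeros.

```r
# Hypothetical helper (not from the original answer): flag only the first
# element of x that contains the literal string `pattern`.
first_match_flag <- function(x, pattern = "-> 5") {
  hit <- grepl(pattern, x, fixed = TRUE)
  # cumsum(hit) equals 1 from the first hit until just before the second one,
  # so combining it with hit keeps the first hit only.
  as.integer(hit & cumsum(hit) == 1)
}

# Reduced example data: two groups, each with more than one "-> 5"
Name   <- c("Andy", "Andy", "Jo", "Jo", "Jo")
status <- c("ID: 3 -> 5", "ID: 1 -> 5", "ID: 1 -> 1", "ID: 20 -> 5", "ID: 10 -> 5")

# Apply per group with base R; ave() runs over row indices so the result
# stays an integer vector regardless of how status is stored.
flags <- ave(seq_along(status), Name,
             FUN = function(i) first_match_flag(status[i]))
flags

# A group without any match yields zeros rather than NA
first_match_flag(c("ID: 1 -> 1", "ID: 3 -> 2"))
```

Note that a fixed match on "-> 5" would also hit a value such as "-> 50". If that can occur in your data, an anchored regular expression such as "-> 5$" (with fixed = FALSE) may be the safer pattern.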
Regarding "regex - How to extract the minimum entry matching a regular expression, by name?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/38590617/