I have the following df:
structure(list(id = c(9L, 10L, 11L, 96L, 97L, 101L, 103L, 248L,
499L, 1044L), leg_activity = c("home, adpt, shop, car_passenger, home, adpt, work, adpt, home pt,, work pt,, outside, outside, outside pt,, outside pt,, pt, home",
"home pt,, pt, outside, outside, outside, outside pt,, pt, home, car, leisure, car, other, car, leisure, car, leisure, car, other, car, leisure, car, other, car, leisure, car, home, adpt, leisure, adpt, home",
"home pt,, work, adpt, home", "home, car, work, car, home pt,, work, adpt, home",
"home, adpt, work, car_passenger, leisure, car_passenger, work, adpt, home, car_passenger, outside, outside, outside, car_passenger, outside, outside, outside, car_passenger, home",
"home, bike, outside, outside, outside, car_passenger, outside, outside, outside, car_passenger, outside, outside, outside, bike, home, adpt, leisure, adpt, home, bike, leisure, bike, home",
"home, adpt, work, adpt, home, walk, other, pt, home", "home, adpt, work, walk, home, adpt, work, walk, home",
"home, adpt, leisure, adpt, home, bike, outside, outside, outside, bike, home",
"home, pt, work, adpt, home, adpt, work, adpt, home")), row.names = c(NA,
10L), class = "data.frame")
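As you can see, the leg_activity column contains strings. What I want is to remove every word associated with outside.
To be more specific, take this hypothetical row as an example: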
"home, bike, outside, outside, outside, car_passenger, outside, outside, bike, home, adpt, bike, leisure, bike, home"
The goal is to remove the word immediately before each outside and the word immediately after it, and finally to remove outside itself as well. Desired output:
"home, home, adpt, bike, leisure, bike, home"
So far I have only managed to remove one specific word:
agents$leg_activity <- gsub(', home', '', agents$leg_activity)
Thank you very much for your help!
Best answer
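We can split each string on commas, use grep to find the positions where "outside" occurs, and drop those values together with the value just before and just after each of them.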
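agents$new_col <- sapply(strsplit(agents$leg_activity, ',{1,}\\s'), function(x) {
  # positions of every token that contains "outside"
  inds <- grep('outside', x)
  # drop those tokens together with their left and right neighbours
  if(length(inds)) toString(x[-unique(c(inds - 1, inds, inds + 1))])
  else toString(x)
})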
agents$new_col
# [1] "home, adpt, shop, car_passenger, home, adpt, work, adpt, home pt, home"
# [2] "home pt, home, car, leisure, car, other, car, leisure, car, leisure, car, other, car, leisure, car, other, car, leisure, car, home, adpt, leisure, adpt, home"
# [3] "home pt, work, adpt, home"
# [4] "home, car, work, car, home pt, work, adpt, home"
# [5] "home, adpt, work, car_passenger, leisure, car_passenger, work, adpt, home, home"
# [6] "home, home, adpt, leisure, adpt, home, bike, leisure, bike, home"
# [7] "home, adpt, work, adpt, home, walk, other, pt, home"
# [8] "home, adpt, work, walk, home, adpt, work, walk, home"
# [9] "home, adpt, leisure, adpt, home, home"
#[10] "home, pt, work, adpt, home, adpt, work, adpt, home"
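To check the approach against the hypothetical row from the question, the same logic can be wrapped in a small function. This is only a sketch: the name drop_around and its word argument are made up for illustration and are not part of the original answer. It also clamps the neighbour indices to the valid range, which the answer's code does not need to do, since R ignores a 0 or an out-of-range position in a negative index.

```r
# Hypothetical helper (not from the original answer): drop every token that
# contains `word`, plus the token directly before and directly after it.
drop_around <- function(s, word = "outside") {
  # Split on one or more commas followed by whitespace, so "pt,, work"
  # yields "pt" and "work" rather than an empty token in between.
  tokens <- strsplit(s, ",+\\s")[[1]]
  hits <- grep(word, tokens, fixed = TRUE)
  if (length(hits) == 0) return(toString(tokens))
  # Each hit, its left neighbour and its right neighbour.
  drop <- unique(c(hits - 1, hits, hits + 1))
  # Keep only positions that actually exist in `tokens`.
  drop <- drop[drop >= 1 & drop <= length(tokens)]
  toString(tokens[-drop])
}

example_row <- "home, bike, outside, outside, outside, car_passenger, outside, outside, bike, home, adpt, bike, leisure, bike, home"
drop_around(example_row)
# [1] "home, home, adpt, bike, leisure, bike, home"
```

To apply it to the whole column, something like vapply(agents$leg_activity, drop_around, character(1), USE.NAMES = FALSE) should give the same result as new_col above.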
Regarding "r - How to remove the words before and after a specific word in a string using R?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62245449/