r - How can I remove the words before and after a specific word in a string using R?

I have the following df:

structure(list(id = c(9L, 10L, 11L, 96L, 97L, 101L, 103L, 248L, 
499L, 1044L), leg_activity = c("home, adpt, shop, car_passenger, home, adpt, work, adpt, home pt,, work pt,, outside, outside, outside pt,, outside pt,, pt, home", 
"home pt,, pt, outside, outside, outside, outside pt,, pt, home, car, leisure, car, other, car, leisure, car, leisure, car, other, car, leisure, car, other, car, leisure, car, home, adpt, leisure, adpt, home", 
"home pt,, work, adpt, home", "home, car, work, car, home pt,, work, adpt, home", 
"home, adpt, work, car_passenger, leisure, car_passenger, work, adpt, home, car_passenger, outside, outside, outside, car_passenger, outside, outside, outside, car_passenger, home", 
"home, bike, outside, outside, outside, car_passenger, outside, outside, outside, car_passenger, outside, outside, outside, bike, home, adpt, leisure, adpt, home, bike, leisure, bike, home", 
"home, adpt, work, adpt, home, walk, other, pt, home", "home, adpt, work, walk, home, adpt, work, walk, home", 
"home, adpt, leisure, adpt, home, bike, outside, outside, outside, bike, home", 
"home, pt, work, adpt, home, adpt, work, adpt, home")), row.names = c(NA, 
10L), class = "data.frame")

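As you can see, the leg_activity column contains strings. What I want is to remove all the words associated with outside.

To be more specific, take this hypothetical row as an example: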

"home, bike, outside, outside, outside, car_passenger, outside, outside,  bike, home, adpt, bike, leisure, bike, home"

The goal is to remove the word before outside and the word after outside, and finally outside itself should be removed as well. Desired output:

"home, home, adpt, bike, leisure, bike, home"

So far I have only managed to remove one specific word:

agents$leg_activity <- gsub(', home', '', agents$leg_activity)

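Thank you very much for your help!

Best answer

We can split each string on commas, use grep to find the positions where "outside" occurs, and drop those values together with the ones immediately before and after them.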

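agents$new_col <- sapply(strsplit(agents$leg_activity, ',{1,}\\s'), function(x) {
              # positions of every token that contains "outside"
              inds <-  grep('outside', x)
              # drop each match together with the token before and after it
              if(length(inds)) toString(x[-unique(c(inds - 1, inds, inds + 1))])
              else toString(x)
})
agents$new_col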

# [1] "home, adpt, shop, car_passenger, home, adpt, work, adpt, home pt, home"                                                                                       
# [2] "home pt, home, car, leisure, car, other, car, leisure, car, leisure, car, other, car, leisure, car, other, car, leisure, car, home, adpt, leisure, adpt, home"
# [3] "home pt, work, adpt, home"                                                                                                                                    
# [4] "home, car, work, car, home pt, work, adpt, home"                                                                                                              
# [5] "home, adpt, work, car_passenger, leisure, car_passenger, work, adpt, home, home"                                                                              
# [6] "home, home, adpt, leisure, adpt, home, bike, leisure, bike, home"                                                                                             
# [7] "home, adpt, work, adpt, home, walk, other, pt, home"                                                                                                          
# [8] "home, adpt, work, walk, home, adpt, work, walk, home"                                                                                                         
# [9] "home, adpt, leisure, adpt, home, home"                                                                                                                        
#[10] "home, pt, work, adpt, home, adpt, work, adpt, home"
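
As a rough check of the same idea against the hypothetical row from the question, here is a minimal sketch. The helper name drop_around and its word argument are made up for illustration and are not part of the original answer. It also differs from the code above in two small ways: it trims stray whitespace from each token, and it filters the indices to the valid range before dropping them.

```r
# Hypothetical helper (not from the original answer): drop every token that
# contains `word`, plus the token immediately before and after each match.
drop_around <- function(s, word = "outside") {
  # Split on one or more commas plus any following whitespace, then trim.
  tokens <- trimws(strsplit(s, ",+\\s*")[[1]])
  # Substring match, so a token such as "outside pt" also counts as a hit.
  hits <- grep(word, tokens, fixed = TRUE)
  if (length(hits) == 0) return(toString(tokens))
  # Matches and their neighbours; keep only positions inside the vector.
  drop <- unique(c(hits - 1, hits, hits + 1))
  drop <- drop[drop >= 1 & drop <= length(tokens)]
  toString(tokens[-drop])
}

example <- "home, bike, outside, outside, outside, car_passenger, outside, outside,  bike, home, adpt, bike, leisure, bike, home"
result <- drop_around(example)
print(result)
# [1] "home, home, adpt, bike, leisure, bike, home"
```

This reproduces the desired output stated in the question. Applying it to the whole column would look like sapply(agents$leg_activity, drop_around, USE.NAMES = FALSE), assuming the data frame is named agents as in the question.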

Regarding "r - How can I remove the words before and after a specific word in a string using R?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/62245449/
