r - Extract the 5 words before and after a specific word

Tags: r

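How can I extract the words (or sentence fragment) surrounding a specific word? Example:

"On June 28, Jane went to the cinema and ate popcorn"

I would like to select 'Jane' and get the window [-2,2], meaning:

"June 28, Jane went to"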

Best answer

We can write a helper function for this, which makes the approach more flexible.

library(tidyverse)

txt <- "On June 28, Jane went to the cinema and ate popcorn"

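grab_text <- function(text, target, before, after){
  # Split the sentence into words once, on runs of whitespace
  words <- str_split(text, "\\s+")[[1]]

  # Position of the first word that matches the target
  pos <- which(str_detect(words, target))[1]
  if (is.na(pos)) return(NA_character_)  # target not found

  # Clamp the window so it never runs past either end of the sentence
  start <- max(1, pos - before)
  end <- min(length(words), pos + after)

  paste(words[start:end], collapse = " ")
}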

grab_text(text = txt, target = "Jane", before = 2, after  = 2)
#> [1] "June 28, Jane went to"

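First we split the sentence into words, then we find the position of the target, then we grab the requested number of words before and after it (the numbers passed to the function), and finally we collapse those words back into a single string.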
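
The same four steps can also be sketched in base R for the case where the target occurs more than once. This is only an illustration, not part of the original answer: `grab_all` and `txt2` are hypothetical names, and the sketch assumes that one window per occurrence is wanted and that windows should be cut off at the sentence boundaries.

```r
# Hypothetical helper (not from the original answer): returns one context
# window per occurrence of `target`, using base R only.
grab_all <- function(text, target, before, after) {
  words <- strsplit(text, "\\s+")[[1]]       # split on runs of whitespace
  hits <- grep(target, words)                # positions of every matching word
  vapply(hits, function(pos) {
    start <- max(1, pos - before)            # do not run past the first word
    end <- min(length(words), pos + after)   # or past the last word
    paste(words[start:end], collapse = " ")  # collapse the window to a string
  }, character(1))
}

txt2 <- "Jane went out and later Jane ate popcorn"
grab_all(txt2, "Jane", before = 1, after = 1)
#> [1] "Jane went"      "later Jane ate"
```

Note that `target` is treated as a regular expression here, as in the function above, so a target such as "Jane" would also match a word like "Janet". If no word matches, the result is an empty character vector.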

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/57980257/
