I'm struggling with regular expressions, so any insight would help. I have a list of strings like this:
[1] "collected 1 hr total. wind >15 mph."    "collected 4 hr total. wind ~15 mph."
[3] "collected 10 hr total. gusts 5-10 mph." "collected 1 hr total. breeze at 1mph,"
[5] "collected 2 hrs." [6]
I want:
[1] > 15 mph
[2] ~15 mph
[3] 5-10 mph
[4] 1mph
[5]
[6]
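I want to extract the wind speed from each line. Can you suggest the right regular expression? As you can see, a) there can be a variable number of spaces between the number and "mph", and b) the number before "mph" can be preceded by different symbols (">", "<" or "~"), or it can be a range written with "-".
Thanks in advance!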
Best answer
One option is str_extract:
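library(stringr)
# Optional "<", ">" or "~", then a run of digits, hyphens and spaces, then "mph".
# trimws() drops the leading space that the character class can pick up.
trimws(str_extract(v1, "[<>~]?[0-9- ]+mph"))
#[1] ">15 mph"  "~15 mph"  "5-10 mph" "1mph"     NA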
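If stringr is not available, roughly the same extraction can be sketched in base R with regexpr() and regmatches(). This is only a sketch, not part of the original answer: it assumes wind speeds are whole numbers, and the names pattern, m and wind are illustrative. The pattern is a little stricter (a number, optionally followed by "-number"), so the match never starts with a space and no trimws() call is needed.

```r
v1 <- c("collected 1 hr total. wind >15 mph.",
        "collected 4 hr total. wind ~15 mph.",
        "collected 10 hr total. gusts 5-10 mph.",
        "collected 1 hr total. breeze at 1mph,",
        "collected 2 hrs.")

# Optional "<", ">" or "~" (possibly followed by spaces), a whole number,
# an optional "-number" range, any amount of whitespace, then "mph".
# Assumption: speeds are integers; decimals would need a wider pattern.
pattern <- "(?:[<>~]\\s*)?[0-9]+(?:\\s*-\\s*[0-9]+)?\\s*mph"

# regexpr() returns -1 for elements without a match, and regmatches()
# returns only the matched pieces, so fill a vector of NAs by position.
m <- regexpr(pattern, v1, perl = TRUE)
wind <- rep(NA_character_, length(v1))
wind[m != -1] <- regmatches(v1, m)
wind
```

Elements without a wind speed, such as the last one, stay NA, which mirrors what str_extract returns.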
Data
v1 <- c("collected 1 hr total. wind >15 mph.",
"collected 4 hr total. wind ~15 mph.",
"collected 10 hr total. gusts 5-10 mph.",
"collected 1 hr total. breeze at 1mph,",
"collected 2 hrs.")
Regarding "r - Extract numbers before a phrase", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/54199736/