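For a long time I have been trying to find a relatively simple command that strips non-alphabetic characters from the beginning and end of a text. It is important, however, that characters such as digits can still appear inside the text.
For example: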
a <- c("1) dog with 4 legs", "- cat with 1 tail", "2./ bird with 2 wings." )
b <- c("07 mouse with 1 tail.", "2.pig with 1 nose,,", "$ cow with 4 spots_")
data <- data.frame(cbind(a, b))
The correct result should look like this:
a <- c("dog with 4 legs", "cat with 1 tail", "bird with 2 wings" )
b <- c("mouse with 1 tail", "pig with 1 nose", "cow with 4 spots")
data_cleaned <- data.frame(cbind(a, b))
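Is there a simple solution?
Best answer
We can do it like this:
First we replace all punctuation characters with a space. Then we remove the leading run of non-space characters together with the first space that follows it: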
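library(dplyr)
library(stringr)
data %>%
  mutate(
    # 1. punctuation -> space; 2. drop the leading token and its space
    across(c(a, b), ~str_replace_all(., "[[:punct:]]", " ")),
    across(c(a, b), ~str_replace(., "^\\S* ", "")),
    # 3. trim and collapse the spaces left behind by steps 1 and 2
    across(c(a, b), ~str_squish(.))
  )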
                  a                 b
1   dog with 4 legs mouse with 1 tail
2   cat with 1 tail   pig with 1 nose
3 bird with 2 wings  cow with 4 spots
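As an alternative that is not part of the original answer, the trimming can also be sketched with a single anchored regular expression in base R. The helper name `trim_to_alpha` is hypothetical. It assumes that everything up to the first letter and everything after the last letter should be dropped, so a text that ends in a digit would lose that digit, while digits and punctuation inside the text are left alone.

```r
# Hypothetical helper (not from the original answer): remove every leading
# and trailing character that is not a letter; the interior is untouched.
trim_to_alpha <- function(x) {
  gsub("^[^[:alpha:]]+|[^[:alpha:]]+$", "", x)
}

a <- c("1) dog with 4 legs", "- cat with 1 tail", "2./ bird with 2 wings.")
b <- c("07 mouse with 1 tail.", "2.pig with 1 nose,,", "$ cow with 4 spots_")
data <- data.frame(a, b)

# Apply the helper to every column while keeping the data frame structure.
data_cleaned <- data
data_cleaned[] <- lapply(data, trim_to_alpha)
print(data_cleaned)
```

Unlike the punctuation-replacement approach, this sketch does not remove the first word of a text that already starts with a letter, because the pattern only matches at the two ends of the string.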
Regarding "r - Trim everything to alphabetic characters (R)", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/73592045/