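I am trying to clean up some text. I have a list of emojis that I do not want removed from the text. Instead, I want to put a space in front of each of these emojis if there is not one already.
emojis = as.character(outer(c(":", ";", ":-", ";-", "="), c(")", "(", "]", "[", "D", "o", "O", "P", "p", "8"), FUN = paste, sep = ""))
Suppose I have a tweet like this:
Tweet = "I am so happy:)"
After the code has run, I want it to look like this:
Tweet = "I am so happy :)"
It is a very simple idea, but I have not found any code that does it.
The full list of emojis that need a space in front of them: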
":)" ";)" ":-)" ";-)" "=)" ":(" ";(" ":-(" ";-(" "=(" ":]" ";]" ":-]" ";-]" "=]" ":[" ";[" ":-[" ";-[" "=[" ":D" ";D" ":-D" ";-D" "=D" ":o" ";o" ":-o" ";-o" "=o" ":O" ";O" ":-O" ";-O" "=O" ":P" ";P" ":-P" ";-P" "=P" ":p" ";p" ":-p" ";-p" "=p" ":8" ";8" ":-8" ";-8" "=8"
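Best answer
Regular expressions can help. Escape the bracket characters in the list so they are matched literally, join the emojis with "|" into one alternation, and capture the word characters in front of the emoji so they can be written back followed by a space.
# Same list as in the question, with ), (, ] and [ escaped for use in a regex
emojis = as.character(outer(c(":", ";", ":-", ";-", "="), c("\\)", "\\(", "\\]", "\\[", "D", "o", "O", "P", "p", "8"), FUN = paste, sep = ""))
# Group 1: the word characters before the emoji; group 2: any one of the emojis
pat <- paste0("(\\w+)(", paste(emojis, collapse = "|"), ")")
Tweet = "I am so happy:)"
# Write both groups back with a space between them
sub(pat, "\\1 \\2", Tweet)
#[1] "I am so happy :)"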
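Two things to watch for with the pattern above: sub() only replaces the first match in each string, and "\\w+" only matches when the emoji comes right after a letter, digit or underscore, so an emoji that follows punctuation (for example "wow!;-)") is left alone. Below is a minimal sketch of one possible variant, not the answerer's code, that uses gsub() and captures a single non-space character instead. The names eyes, mouths, add_space and tweets are made up for illustration.

```r
# Build the emoji list; the bracket characters are escaped so the
# regex engine treats them literally.
eyes   <- c(":", ";", ":-", ";-", "=")
mouths <- c("\\)", "\\(", "\\]", "\\[", "D", "o", "O", "P", "p", "8")
emojis <- as.character(outer(eyes, mouths, FUN = paste, sep = ""))

# Group 1: one non-space character directly before the emoji.
# Group 2: any one of the emojis.
pat <- paste0("(\\S)(", paste(emojis, collapse = "|"), ")")

# Hypothetical helper: gsub() replaces every match and is vectorised over x.
add_space <- function(x) gsub(pat, "\\1 \\2", x)

tweets <- c("I am so happy:) and sad:(", "already spaced :D", "wow!;-)")
add_space(tweets)
# element 1: "I am so happy :) and sad :("
# element 2: "already spaced :D"   (unchanged)
# element 3: "wow! ;-)"
```

Like the original pattern, this one cannot tell an emoji from ordinary text that happens to contain the same characters, so something like "video:play" becomes "video :play" because ":p" is in the list. If that matters for your data, you would need a stricter rule for what may follow the emoji.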
Regarding "r - add a space before emojis", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/40879992/