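I am trying to parse a phrase and exclude common words.
For example, in the phrase "as the world turns", I want to exclude the common words "as" and "the" and return only "world" and "turns".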
(\w+(?!the|as))
doesn't work. Feedback appreciated.
Best answer
The lookahead should come first:
(\b(?!(the|as)\b)\w+\b)
I have also added word boundaries to ensure that it only matches whole words. Without them, it would correctly refuse to match the complete word "as", but it would then go on to match the letter "s" of that word.
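To make the difference concrete, here is a small sketch in Python (my choice of language, the question does not name one; the variable names are only for illustration). It runs the original attempt, the corrected pattern, and a variant without the leading word boundary against the example phrase. I used a non-capturing group so that `findall` returns whole matches.

```python
import re

# The original attempt: the lookahead is checked *after* the word,
# so it only inspects the text that follows the word.
original = re.compile(r"\w+(?!the|as)")

# Lookahead first, with word boundaries around the word.
fixed = re.compile(r"\b(?!(?:the|as)\b)\w+\b")

# Same idea without the leading \b, to show why the boundary matters.
no_boundary = re.compile(r"(?!(?:the|as)\b)\w+")

phrase = "as the world turns"

print(original.findall(phrase))     # ['as', 'the', 'world', 'turns']
print(fixed.findall(phrase))        # ['world', 'turns']
print(no_boundary.findall(phrase))  # ['s', 'he', 'world', 'turns']
```

The `\b` inside the lookahead is what keeps longer words such as "theory" from being rejected just because they start with "the".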
You might also want to consider what \w matches and if that meets your needs. If you are looking for words in English, you are probably interested in letters but not digits, and you may wish to include some punctuation characters that are excluded by \w, such as apostrophes. You could try something like this instead (Rubular):
/(\b(?!(?:the|as)\b)[a-z'-]+\b)/i
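As a rough illustration of what that character class changes, here is the same pattern in Python, with `re.IGNORECASE` standing in for the `/i` flag. The helper name `content_words` and the sample phrases are made up for this example.

```python
import re

# Letters, apostrophes and hyphens only; digits no longer count as word characters.
word_pattern = re.compile(r"\b(?!(?:the|as)\b)[a-z'-]+\b", re.IGNORECASE)

def content_words(phrase):
    """Return the words in phrase that are not in the excluded list."""
    return word_pattern.findall(phrase)

print(content_words("As the world's clock turns"))  # ["world's", 'clock', 'turns']
print(content_words("route 66"))                    # ['route']
```

With plain `\w+`, "world's" would be split into "world" and "s", since the apostrophe is not a word character.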
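To match words in human language more accurately, you could consider using a natural language parsing library instead of regular expressions.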
Source: a similar question about regex negation and word parsing on Stack Overflow: https://stackoverflow.com/questions/3643720/