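I want to split a sentence on any one of a number of characters (listed below). My regex splits on most of them, but not on "[" and "]" (the opening and closing square brackets). If I change the string SPECIAL_CHARACTERS_REGEX to [ :;'=\\()!-\\[\\]], it starts splitting on the integers in the string instead of on the square brackets. How can I make the regex split on the square brackets rather than on the integers (as if '[]' meant all integers)?
A related question: is there a way to also split numbers off from the rest of a token? For example, 9pm should be split into 9 and pm.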
This:
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'=\\()!-]";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
Gives:
Input: let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]
output: [let, s, meet, tomorrow, at, 9, 30p?, 7, 8pm?, i, you, go, , no, Go, , , [to, do, , ]]
Also,
This:
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'=\\()!-\\[\\]]";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
Gives:
let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]
[let, s, meet, tomorrow, at, , , , , p, , , , , pm, , i, you, go, , no, , o, , , , to, do]
Expected output:
{"let", "s", "meet", "tomorrow", "at", "9", "30", "p", "7", "8", "pm", "i", "you", "go", "no", "Go", "to", "do"}
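Best answer
If you leave the dash in the middle of the character class, you need to escape it as well. Otherwise !-\[ is read as a range running from '!' to '[', which covers the digits, '?', and the uppercase letters; that is why the second attempt splits on integers and on the G in Go.
You can avoid the problem by placing the dash at the beginning or end of the character class. Also, you do not need to escape () here, and you may want to use a quantifier, * or +, after the character class. For split, + is the useful one, because it collapses a run of consecutive delimiters into a single match and so avoids empty tokens.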
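To make the point about the dash concrete, here is a minimal sketch that compares three spellings of the character class. The class name and constant names are made up for illustration and are not part of the original question; the patterns drop the unneeded escapes on ().

```java
import java.util.Arrays;

class DashInCharClassDemo {
    // Dash between '!' and '\[': parsed as the range '!'..'[' (code points 33..91),
    // which also covers the digits, '?', and the uppercase letters.
    static final String ACCIDENTAL_RANGE = "[ :;'=()!-\\[\\]]";

    // Dash moved to the end of the class: treated as a literal '-'.
    static final String DASH_LAST = "[ :;'=()!\\[\\]-]";

    // Dash escaped in place: also a literal '-'.
    static final String DASH_ESCAPED = "[ :;'=()!\\-\\[\\]]";

    public static void main(String[] args) {
        // Check which single characters each class accepts.
        for (String ch : new String[] {"7", "G", "-", "[", "]"}) {
            System.out.println("'" + ch + "'"
                    + " accidentalRange=" + ch.matches(ACCIDENTAL_RANGE)
                    + " dashLast=" + ch.matches(DASH_LAST)
                    + " dashEscaped=" + ch.matches(DASH_ESCAPED));
        }
        // '+' collapses runs of delimiters, so " [" yields no empty token.
        System.out.println(Arrays.toString("7-8pm [Go]".split(DASH_LAST + "+")));
        // prints [7, 8pm, Go]
    }
}
```

With the accidental range, '7' and 'G' count as delimiters; with the dash last or escaped, only the intended punctuation does.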
Update: to get the expected result, you can do this.
private static final String SPECIAL_CHARACTERS_REGEX = "[ :;'?=()!\\[\\]-]+|(?<=\\d)(?=\\D)";
String rawMessage = "let's meet tomorrow at 9:30p? 7-8pm? i=you go (no Go!) [to do !]";
String[] tokens = rawMessage.split(SPECIAL_CHARACTERS_REGEX);
System.out.println(Arrays.toString(tokens));
Regular expression:
[ :;'?=()!\[\]-]+    any character of: ' ', ':', ';', ''', '?',
                     '=', '(', ')', '!', '\[', '\]', '-' (1 or more times)
|                    OR
(?<=                 look behind to see if there is:
  \d                   digits (0-9)
)                    end of look-behind
(?=                  look ahead to see if there is:
  \D                   non-digits (all but 0-9)
)                    end of look-ahead
See the working demo.
Output:
[let, s, meet, tomorrow, at, 9, 30, p, 7, 8, pm, i, you, go, no, Go, to, do]
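The look-around half of the pattern is what answers the 9pm part of the question, and it can be tried on its own. The sketch below isolates it; the class and constant names are hypothetical, and the second pattern, which also splits where a non-digit is followed by a digit, is an extension that the original answer does not use.

```java
import java.util.Arrays;

class DigitBoundarySplitDemo {
    // Zero-width match: a digit behind the position and a non-digit ahead of it.
    static final String DIGIT_THEN_NON_DIGIT = "(?<=\\d)(?=\\D)";

    // Assumption beyond the original answer: also split where a non-digit
    // is followed by a digit, so boundaries are found in both directions.
    static final String EITHER_DIRECTION = "(?<=\\d)(?=\\D)|(?<=\\D)(?=\\d)";

    public static void main(String[] args) {
        // No characters are consumed, so nothing is lost from the tokens.
        System.out.println(Arrays.toString("9pm".split(DIGIT_THEN_NON_DIGIT)));
        // prints [9, pm]
        System.out.println(Arrays.toString("room42b".split(EITHER_DIRECTION)));
        // prints [room, 42, b]
    }
}
```

Because the match is zero-width, split cuts the string at the boundary without removing any character, unlike the character-class half, which discards the delimiters it matches.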
Regarding java - splitting a string on multiple characters, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/20625772/