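I need a sentence parser. The parser splits a complete sentence on whitespace characters, except that everything inside a pair of parentheses is treated as a single parsed word.
Input sentence: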
"This is the work (my real job) which is great."
Required output:
This
is
the
work
(my real job)
which
is
great.
Best answer
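I'm not sure there is a good way to parse the words out of a sentence like this with a regular expression. Either way, you will probably need to iterate over the sentence. I don't think String.split() will do this for you. Just write a loop to do it, and then you can decide exactly what happens when the parentheses are unmatched. For example, this version treats everything after an opening parenthesis as one word even if the sentence ends without a closing parenthesis: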
import java.util.ArrayList;
import java.util.Scanner;

String s = "This is the work (my real job) which is great, and (also some stuff";
ArrayList<String> words = new ArrayList<String>();
Scanner sentence = new Scanner(s);
boolean inParen = false;
StringBuilder inParenWord = new StringBuilder();
while (sentence.hasNext()) {
    String word = sentence.next();
    if (inParen) {
        // Inside parentheses: keep appending until a word ends with ")".
        inParenWord.append(" ");
        inParenWord.append(word);
        if (word.endsWith(")")) {
            words.add(inParenWord.toString());
            inParenWord = new StringBuilder();
            inParen = false;
        }
    } else if (word.startsWith("(") && !word.endsWith(")")) {
        // Start of a multi-word parenthesized group.
        inParen = true;
        inParenWord.append(word);
    } else {
        // Ordinary word, or a single-word group such as "(job)".
        words.add(word);
    }
}
sentence.close();
// An unclosed parenthesis at the end of the sentence still yields one word.
if (inParenWord.length() > 0) {
    words.add(inParenWord.toString());
}
for (String word : words) {
    System.out.println(word);
}
This prints:
This
is
the
work
(my real job)
which
is
great,
and
(also some stuff
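The loop above closes a group only when a whitespace-delimited word ends with ")", so nested parentheses or trailing punctuation such as "job)," are not handled. If that matters, one option is to scan character by character and track the nesting depth. The following is only a sketch under that assumption; the class name ParenTokenizer and the method tokenize are hypothetical names chosen for illustration, and whitespace inside parentheses is kept exactly as written.

```java
import java.util.ArrayList;
import java.util.List;

class ParenTokenizer {
    // Hypothetical helper: splits on whitespace that lies outside parentheses.
    static List<String> tokenize(String s) {
        List<String> words = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0; // number of currently open parentheses
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '(') {
                depth++;
            } else if (c == ')' && depth > 0) {
                depth--;
            }
            if (Character.isWhitespace(c) && depth == 0) {
                // Whitespace outside parentheses ends the current word.
                if (current.length() > 0) {
                    words.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        // Flush the last word; this also covers an unclosed parenthesis.
        if (current.length() > 0) {
            words.add(current.toString());
        }
        return words;
    }

    public static void main(String[] args) {
        for (String word : tokenize("This is (my (real) job), which is great")) {
            System.out.println(word);
        }
    }
}
```

As in the loop version, an opening parenthesis that is never closed swallows the rest of the sentence into one word.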
Or use Pattern/Matcher:
import java.util.ArrayList;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

String s = "This is the work (my real job) which is great, and (also some stuff";
ArrayList<String> words = new ArrayList<String>();
// Match either a parenthesized group (the closing ")" is optional)
// or a run of non-whitespace characters.
Pattern p = Pattern.compile("\\([^)]*\\)?|\\S+");
Matcher m = p.matcher(s);
while (m.find()) {
    words.add(m.group());
}
for (String word : words) {
    System.out.println(word);
}
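String.split() can get close if you accept some limits. Splitting on a space that is not followed by a ")" before the next "(" keeps a balanced, non-nested group together. This is a sketch, not a general solution: the class name SplitOutsideParens and the method split are hypothetical, and an unclosed group such as "(also some stuff" is split into separate words, unlike in the loop version.

```java
import java.util.Arrays;
import java.util.List;

class SplitOutsideParens {
    // Hypothetical helper: split on a space unless a ')' appears ahead
    // of it before any '(' (that is, unless the space is inside a group).
    static List<String> split(String s) {
        return Arrays.asList(s.split(" (?![^(]*\\))"));
    }

    public static void main(String[] args) {
        String s = "This is the work (my real job) which is great.";
        for (String word : split(s)) {
            System.out.println(word);
        }
    }
}
```

The negative lookahead scans to the end of the string at every space, so this is best kept to short inputs.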
On the topic of "java - regex for parsing a sentence and skipping the contents of parentheses", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/11916601/