So far, this is my code:
import java.util.*;
import java.io.*;

public class Alice {
    public static void main(String[] args) throws IOException {
        /*
         * To put the text document into an ArrayList
         */
        Scanner newScanner = new Scanner(new File("ALICES ADVENTURES IN WONDERLAND.txt"));
        ArrayList<String> list = new ArrayList<String>();
        while (newScanner.hasNext()) {
            list.add(newScanner.next());
        }
        newScanner.close();
    }
}
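I'm now confused about how to split the document on all punctuation while still being able to perform string operations on the words in the text. Please help.
The input is the entire book Alice's Adventures in Wonderland, and I need the output to look like this:
"This book is for the use of etc."
Basically, all the words are separated and all punctuation is removed from the document.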
Best answer
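import java.io.*;
import java.util.*;
import java.util.regex.*;
// The statements below belong inside a method that declares "throws IOException".
List<String> list = new ArrayList<>();
// \w+ matches runs of letters, digits, and underscores; punctuation and whitespace are skipped.
Pattern wordPattern = Pattern.compile("\\w+");
try (BufferedReader reader = new BufferedReader(new FileReader("ALICES ADVENTURES IN WONDERLAND.txt"))) {
    String line;
    while ((line = reader.readLine()) != null) {
        Matcher matcher = wordPattern.matcher(line);
        // Collect every word-character run found on this line.
        while (matcher.find()) {
            list.add(matcher.group());
        }
    }
}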
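One thing to watch with a plain word-character pattern is that it splits contractions and possessives at the apostrophe ("Alice's" becomes "Alice" and "s"). If that matters, a variation along the following lines may help. This is only a sketch under assumptions: the class name WordSplitter, the method extractWords, and the choice to keep apostrophe-joined letter runs together (and to drop digits) are illustrative and not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class WordSplitter {
    // Assumed rule: a word is a run of Unicode letters, optionally followed by
    // more letter runs joined by a straight or curly apostrophe.
    // Digits and all other punctuation are treated as separators.
    private static final Pattern WORD =
            Pattern.compile("\\p{L}+(?:['\u2019]\\p{L}+)*");

    // Returns the words of the given text in order, with punctuation removed.
    static List<String> extractWords(String text) {
        List<String> words = new ArrayList<>();
        Matcher matcher = WORD.matcher(text);
        while (matcher.find()) {
            words.add(matcher.group());
        }
        return words;
    }

    public static void main(String[] args) {
        String sample = "\"Oh dear!\" said Alice's sister -- it's late, isn't it?";
        // Join with single spaces to get the punctuation-free form of the sample.
        System.out.println(String.join(" ", extractWords(sample)));
    }
}
```

To use this on the whole book, extractWords(line) could be called for each line read in the loop above and the results added to the list with addAll. Whether digits should count as words depends on what the later string operations need, so the pattern is worth adjusting to taste.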
Regarding "java - How to split a text file in an ArrayList by all types of punctuation?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55283992/