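I want to remove punctuation from text in Java. I know there is a pattern that matches all punctuation, \p{Punct}, but using it removes every punctuation character. I want to keep acronyms and hyphenated words. For example, I want to keep "m.i.t.", "state-of-the-art", "9.4", "11:00", "p.m.", and "976-4275" while removing the rest of the punctuation.
I tried \p{Punct}, but it removes all punctuation: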
String text = "There's a string from M.I.T., written by Jason at 11:00 p.m. 976-4275, 9.5, another word is state-of-the-art.";
text = text.replaceAll("\\p{Punct}", ""); // String is immutable: keep the returned value
System.out.println(text);
The result is:
"There s a string from MIT written by Jason at 1100 pm 9764275 95 another word is stateoftheart"
But what I want is:
"There s a string from M.I.T. written by Jason at 11:00 p.m. 976-4275 9.5 another word is state-of-the-art"
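Best answer
Add &&[^.] after \\p{Punct}, inside a character class. The intersection matches every punctuation character except the period, so periods are kept and all other punctuation is removed. Note that this alone still removes hyphens and colons, and it also keeps sentence-ending periods.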
Solution:
text = text.replaceAll("[\\p{Punct}&&[^.]]", "");
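Since excluding only the period does not preserve "11:00" or "state-of-the-art", here is a minimal sketch of one alternative that uses lookarounds instead. It is not part of the original answer. The class name PunctuationCleaner and the method stripPunctuation are hypothetical, and the rules are assumptions: a '.', ':' or '-' is kept when it sits between two ASCII letters or digits, and a '.' is kept when it closes an acronym (that is, when it is preceded by a period and a letter, as in "p.m."). Apostrophes are simply dropped, so "There's" becomes "Theres".

```java
import java.util.regex.Pattern;

class PunctuationCleaner {
    // Matches one punctuation character, unless it is either:
    //  1. '.', ':' or '-' with a letter/digit on both sides (9.5, 11:00, state-of-the-art), or
    //  2. a '.' preceded by ".<letter>", i.e. the closing period of an acronym (M.I.T., p.m.).
    private static final Pattern REMOVABLE = Pattern.compile(
        "(?!(?<=\\p{Alnum})[.:-](?=\\p{Alnum}))"  // rule 1: protected inner punctuation
        + "(?!(?<=\\.\\p{Alpha})\\.)"             // rule 2: protected trailing acronym period
        + "\\p{Punct}");

    // Returns a new string; the input is not modified (String is immutable).
    static String stripPunctuation(String text) {
        return REMOVABLE.matcher(text).replaceAll("");
    }

    public static void main(String[] args) {
        String text = "There's a string from M.I.T., written by Jason at 11:00 p.m. "
            + "976-4275, 9.5, another word is state-of-the-art.";
        System.out.println(stripPunctuation(text));
    }
}
```

The heuristic has limits: a single initial such as "J. Smith" loses its period, and a sentence that ends with an acronym keeps it. \p{Punct}, \p{Alnum} and \p{Alpha} are ASCII-only by default, so non-ASCII punctuation is left untouched.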
Regarding "java - How to remove punctuation in Java but keep acronyms and hyphenated words?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/57423113/