java - Fastest way to search for multiple strings in a single string

Tags: java

Below is my code for counting how many times each of several substrings occurs in a single given string:

import java.util.HashMap;
import java.util.Map;

public static void main(String... args) {
    String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
    String[] severalStringArray = { "one", "two", "three", "four" };
    // Maps each search string to its number of occurrences in fullString
    Map<String, Integer> countMap = countWords(fullString, severalStringArray);
}

public static Map<String, Integer> countWords(String fullString, String[] severalStringArray) {
    Map<String, Integer> countMap = new HashMap<>();

    for (String searchString : severalStringArray) {
        if (countMap.containsKey(searchString)) {
            int searchCount = countMatchesInString(fullString, searchString);
            countMap.put(searchString, countMap.get(searchString) + searchCount);
        } else
            countMap.put(searchString, countMatchesInString(fullString, searchString));
    }

    return countMap;
}

private static int countMatchesInString(String fullString, String subString) {
    int count = 0;
    int pos = fullString.indexOf(subString);
    while (pos > -1) {
        count++;
        // Resume one character past the last hit, so overlapping occurrences are counted too
        pos = fullString.indexOf(subString, pos + 1);
    }
    return count;
}
Assume the full string may be an entire file read in as a string. Is the above an efficient way to search, or is there a better or faster way?
Thanks

Best answer

You can build a regex alternation of the search words and then scan the string once with that single regex:

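import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Returns the total number of matches of regex in fullString
public static int matchesInString(String fullString, String regex) {
    int count = 0;

    Pattern r = Pattern.compile(regex);
    Matcher m = r.matcher(fullString);

    while (m.find())
        ++count;

    return count;
}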

String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
String[] severalStringArray = { "one", "two", "three", "four" };
String regex = "\\b(?:" + String.join("|", severalStringArray) + ")\\b";

int count = matchesInString(fullString, regex);
System.out.println("There were " + count + " matches in the input");
This prints:

There were 8 matches in the input


Note that the regex pattern used in the example above is:
\b(?:one|two|three|four)\b
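The answer's snippet reports one combined total, while the question asks for a count per search string. A minimal sketch of how the same single-pass alternation could fill a map instead is shown here. The class name MultiWordCounter and the use of Pattern.quote (to treat each search string literally) are assumptions added for illustration, not part of the original answer. Because of the \b anchors it counts whole words only, so its results can differ from the indexOf version in the question, which also matches inside longer words.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, named here only for illustration.
final class MultiWordCounter {

    // Counts whole-word occurrences of each search string in a single pass.
    static Map<String, Integer> countWords(String fullString, String[] words) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        StringBuilder alternation = new StringBuilder();

        for (String word : words) {
            // putIfAbsent returns null only the first time a word is seen,
            // so duplicate search strings are added to the pattern once.
            if (counts.putIfAbsent(word, 0) == null) {
                if (alternation.length() > 0) {
                    alternation.append('|');
                }
                // Quote the word so regex metacharacters are matched literally.
                alternation.append(Pattern.quote(word));
            }
        }
        if (counts.isEmpty()) {
            return counts;
        }

        Pattern pattern = Pattern.compile("\\b(?:" + alternation + ")\\b");
        Matcher matcher = pattern.matcher(fullString);
        while (matcher.find()) {
            // The matched text is exactly one of the search strings.
            counts.merge(matcher.group(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        String fullString = "one is a good one. two is ok. three is three. four is four. five is not four";
        String[] words = { "one", "two", "three", "four" };
        System.out.println(countWords(fullString, words));
        // prints {one=2, two=1, three=2, four=3}
    }
}
```

The per-word counts add up to the 8 matches reported above. The \b anchors assume the search strings start and end with word characters; for strings that begin or end with punctuation, the boundaries would need to be adjusted.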

Regarding "java - Fastest way to search for multiple strings in a single string", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/68994401/
