java - Regular expression matching algorithm in Java

Tags: java regex algorithm

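This article says that regular expression matching in Java is slow because regular expressions with "back references" cannot be matched efficiently. The article explains the efficient Thompson NFA-based matching algorithm (invented in 1968), which works for regular expressions without "back references". However, the Pattern javadoc says that Java regular expressions use an NFA-based approach.

Now I am wondering how efficient Java regular expression matching is and which algorithm it uses.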
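To make the "back reference" feature from the question concrete, here is a minimal sketch using `java.util.regex`. The class name `BackreferenceDemo` and the helper `firstDoubledWord` are hypothetical names chosen for illustration. The pattern uses `\1`, which must match exactly the text captured by group 1, and this dependence on earlier input is what a plain Thompson-style NFA simulation does not handle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackreferenceDemo {
    // \b(\w+) \1\b : a whole word, one space, then the same word again.
    // The \1 backreference must equal whatever group 1 captured.
    private static final Pattern DOUBLED_WORD =
            Pattern.compile("\\b(\\w+) \\1\\b");

    /** Returns the first immediately repeated word in text, or null if none. */
    static String firstDoubledWord(String text) {
        Matcher m = DOUBLED_WORD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        System.out.println(firstDoubledWord("it is is slow"));
        System.out.println(firstDoubledWord("it is slow"));
    }
}
```

The sketch only shows what a back reference is. It does not measure performance, and says nothing about how `Pattern` implements the match internally.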

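Best answer

java.util.regex.Pattern uses the Boyer–Moore string search algorithm, at least when the compiled pattern is a literal slice, as this excerpt from its source shows: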

/* Attempts to match a slice in the input using the Boyer-Moore string
 * matching algorithm. The algorithm is based on the idea that the
 * pattern can be shifted farther ahead in the search text if it is
 * matched right to left.
 */

private void compile() {
    // ... (earlier lines of the method elided)

    if (matchRoot instanceof Slice) {
        // A literal slice: try the Boyer-Moore optimization.
        root = BnM.optimize(matchRoot);
        if (root == matchRoot) {
            // optimize() returned the node unchanged, so fall back to a plain start node.
            root = hasSupplementary ? new StartS(matchRoot) : new Start(matchRoot);
        }
    } else if (matchRoot instanceof Begin || matchRoot instanceof First) {
        root = matchRoot;
    } else {
        root = hasSupplementary ? new StartS(matchRoot) : new Start(matchRoot);
    }
}
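The idea in the quoted comment, comparing right to left so the pattern can jump further ahead, can be sketched as follows. This is not the JDK's `BnM` node. It is the simpler Horspool variant of Boyer-Moore, which keeps only a bad-character shift table, and the names `HorspoolSketch` and `indexOf` are hypothetical.

```java
import java.util.Arrays;

final class HorspoolSketch {
    /** Returns the index of the first occurrence of pattern in text, or -1. */
    static int indexOf(String text, String pattern) {
        int m = pattern.length();
        int n = text.length();
        if (m == 0) {
            return 0;
        }
        // shift[c]: how far the window may move when c is the text character
        // aligned with the last position of the pattern.
        int[] shift = new int[Character.MAX_VALUE + 1];
        Arrays.fill(shift, m);
        for (int i = 0; i < m - 1; i++) {
            shift[pattern.charAt(i)] = m - 1 - i;
        }
        int pos = 0;
        while (pos <= n - m) {
            // Compare the window against the pattern from right to left.
            int j = m - 1;
            while (j >= 0 && text.charAt(pos + j) == pattern.charAt(j)) {
                j--;
            }
            if (j < 0) {
                return pos; // every character matched
            }
            // Skip ahead based on the character under the window's last slot.
            pos += shift[text.charAt(pos + m - 1)];
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOf("here is a simple example", "example"));
    }
}
```

The table is indexed by `char` for simplicity, which costs one 65536-entry array per call. A real implementation would build the table once per pattern rather than on every search.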

Regarding "java - Regular expression matching algorithm in Java", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/19251384/
