android - Regex with error checking

I have done a lot of searching, but I am bad at regular expressions and my google-fu is not strong enough for this case.

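Scenario:

In a push notification we receive a URL that contains a 9-digit content ID.

Example URL: http://www.something.com/foo/bar/Some-title-Goes-here-123456789.html (123456789 is the content ID in this scenario)

Current regex for parsing the content ID: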

public String getContentIdFromPathAndQueryString(String path, String queryString) {
        String contentId = null;
        if (StringUtils.isNonEmpty(path)) {
            Pattern p = Pattern.compile("([\\d]{9})(?=.html)");
            Matcher m = p.matcher(path);
            if (m.find()) {
                contentId = m.group();
            } else if (StringUtils.isNonEmpty(queryString)) {
                p = Pattern.compile("(?:contentId=)([\\d]{9})(?=.html)");
                m = p.matcher(queryString);
                if (m.find()) {
                    contentId = m.group();
                }
            }
        }

        Log.d(LOG_TAG, "Content id " + (contentId == null ? "not found" : (" found - " + contentId)));
        if (StringUtils.isEmpty(contentId)) {
            Answers.getInstance().logCustom(new CustomEvent("eid_url")
                    .putCustomAttribute("contentId", "empty")
                    .putCustomAttribute("path", path)
                    .putCustomAttribute("query", queryString));
        }

        return contentId;
    }

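Question: This does the job, but I need to account for one specific error scenario.

Whoever creates the push may enter a content ID of the wrong length, and we need to grab it regardless, so assume it can be any number of digits. The title can also contain digits, which is annoying. The content ID is always followed by ".html".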

Best answer

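Although the basic answer here is just "replace the {9} limiting quantifier, which matches exactly 9 occurrences, with a + quantifier, which matches 1 or more occurrences", both patterns in the code can be improved in a few other ways.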

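The unescaped dot should be escaped in the pattern so that it matches a literal dot.

If you have no overlapping matches, there is no need for a positive lookahead after the capturing group; just keep the capturing group and read the .group(1) value.

A non-capturing group (?:...) is still a consuming pattern, so (?:contentId=) is equivalent to contentId= (you can drop the (?: and the ) ).

There is no need to wrap a single atom in a character class; use \\d instead of [\\d]. [\\d] is actually a source of misunderstanding: some people may take it for a grouping construct and try to put alternative sequences inside the square brackets, whereas [...] matches a single character.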
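The points above can be checked with a small standalone program. This is only a sketch: the class name RegexPointsDemo, the helper first, and the input strings are made up for illustration and are not part of the original code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegexPointsDemo {
    // Returns the requested group of the first match, or null when nothing matches.
    static String first(String regex, String input, int group) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group(group) : null;
    }

    public static void main(String[] args) {
        // {9} vs +: with {9}, a 10-digit ID is silently cut to its last 9 digits.
        String longId = "Some-title-1234567890.html"; // made-up input
        System.out.println(first("(\\d{9})\\.html", longId, 1)); // 234567890
        System.out.println(first("(\\d+)\\.html", longId, 1));   // 1234567890

        // An unescaped dot matches any character, so "xhtml" satisfies ".html".
        System.out.println(first("\\d+(?=.html)", "123xhtml", 0));   // 123
        System.out.println(first("\\d+(?=\\.html)", "123xhtml", 0)); // null

        // group() is the whole match; group(1) is only the captured digits.
        String query = "contentId=123456789.html"; // made-up query string
        System.out.println(first("(?:contentId=)(\\d+)(?=\\.html)", query, 0)); // contentId=123456789
        System.out.println(first("contentId=(\\d+)\\.html", query, 1));         // 123456789
    }
}
```

Note that a non-capturing group still contributes its text to the whole match, which is why reading group() on the query-string pattern returns the contentId= prefix along with the digits.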

So your code could look like this:

        Pattern p = Pattern.compile("(\\d+)\\.html");     // No lookahead, + instead of {9}
        Matcher m = p.matcher(path);
        if (m.find()) {
            contentId = m.group(1);                       // (1) refers to Group 1
        } else if (StringUtils.isNonEmpty(queryString)) {
            p = Pattern.compile("contentId=(\\d+)\\.html");
            m = p.matcher(queryString);
            if (m.find()) {
                contentId = m.group(1);
            }
        }
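For trying the fragment outside an Android project, here is a minimal self-contained sketch of the same logic. It assumes plain null/isEmpty checks in place of StringUtils and leaves out the logging; the class name ContentIdParser, the method name getContentId, and the constant names are hypothetical. The patterns are compiled once as constants, which is a common habit rather than something the answer requires.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ContentIdParser {
    // One or more digits directly before a literal ".html".
    private static final Pattern PATH_ID = Pattern.compile("(\\d+)\\.html");
    private static final Pattern QUERY_ID = Pattern.compile("contentId=(\\d+)\\.html");

    // Returns the content ID, or null when none is found.
    // As in the original method, the query string is only consulted when path is non-empty.
    static String getContentId(String path, String queryString) {
        if (path == null || path.isEmpty()) {
            return null;
        }
        Matcher m = PATH_ID.matcher(path);
        if (m.find()) {
            return m.group(1);
        }
        if (queryString != null && !queryString.isEmpty()) {
            m = QUERY_ID.matcher(queryString);
            if (m.find()) {
                return m.group(1);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The example path from the question.
        System.out.println(getContentId("/foo/bar/Some-title-Goes-here-123456789.html", null)); // 123456789
        // Made-up inputs: digits in the title, and an ID of the "wrong" length.
        System.out.println(getContentId("/foo/bar/Top-10-Tips-12345.html", null)); // 12345
        System.out.println(getContentId("/foo/bar", "contentId=42.html"));         // 42
    }
}
```

Digits in the title are skipped because they are not immediately followed by ".html", so only the run of digits that ends right before the extension is captured.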

Regarding "android - Regex with error checking", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/45359944/
