regex - Match two or more characters that are not the same

Tags: regex regex-negation

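Is it possible to write a regex pattern that matches abc, where each letter is not a literal but stands for some character, so that text like xyz (but not xxy) would match? I got as far as (.)(?!\1) to match the a in ab, but then I got stuck.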

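After getting the answer below, I was able to write a routine that generates such a pattern. Using a raw re pattern is much faster than converting both the pattern and the text to a canonical form and then comparing them.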
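For comparison, the canonical-form approach mentioned above could look roughly like the sketch below. The helper names `canonical` and `find_by_canonical` are hypothetical (they do not appear in the original post), and this version ignores literal and wildcard characters; it only illustrates the idea of comparing normalized windows.

```python
def canonical(s):
    """Map each distinct character to the index of its first appearance."""
    seen = {}
    # setdefault evaluates len(seen) before inserting, so new characters
    # are numbered 0, 1, 2, ... in order of first occurrence
    return tuple(seen.setdefault(c, len(seen)) for c in s)


def find_by_canonical(p, text):
    """Return the index of the first window of `text` whose canonical
    form equals that of pattern `p`, or -1 if there is none."""
    want = canonical(p)
    n = len(p)
    for i in range(len(text) - n + 1):
        if canonical(text[i:i + n]) == want:
            return i
    return -1


print(canonical('abba'))                    # (0, 1, 1, 0)
print(find_by_canonical('abba', 'maccaw'))  # 1, the window 'acca'
print(find_by_canonical('abba', 'busses'))  # -1
```

Every window is re-normalized from scratch here, which is one plausible reason the regex version is faster.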

def pat2re(p, know=None, wild=None):
    r"""return a compiled re pattern that will find pattern `p`
    in which each different character should find a different
    character in a string. Characters to be taken literally
    or that can represent any character should be given as
    `know` and `wild`, respectively.

    EXAMPLES
    ========

    Characters in the pattern denote different characters to
    be matched; characters that are the same in the pattern
    must be the same in the text:

    >>> pat = pat2re('abba')
    >>> assert pat.search('maccaw')
    >>> assert not pat.search('busses')

    The underlying pattern of the re object can be seen
    with the pattern property:

    >>> pat.pattern
    '(.)(?!\\1)(.)\\2\\1'

    If some characters are to be taken literally, list them
    as known; do the same if some characters can stand for
    any character (i.e. are wildcards):

    >>> a_ = pat2re('ab', know='a')
    >>> assert a_.search('ad') and not a_.search('bc')

    >>> ab_ = pat2re('ab*', know='ab', wild='*')
    >>> assert ab_.search('abc') and ab_.search('abd')
    >>> assert not ab_.search('bad')

    """
    import re
    # make a canonical "hash" of the pattern
    # with ints representing pattern elements that
    # must be unique and strings for wild or known
    # values
    m = {}
    j = 1
    know = know or ''
    wild = wild or ''
    for c in p:
        if c in know:
            # escape any regex metacharacter, not just '.'
            m[c] = re.escape(c)
        elif c in wild:
            m[c] = '.'
        elif c not in m:
            m[c] = j
            j += 1
            assert j < 100
    h = tuple(m[i] for i in p)
    # build pattern
    out = []
    last = 0
    for i in h:
        if type(i) is int:
            if i <= last:
                out.append(r'\%s' % i)
            else:
                if last:
                    ors = '|'.join(r'\%s' % i for i in range(1, last + 1))
                    out.append('(?!%s)(.)' % ors)
                else:
                    out.append('(.)')
                last = i
        else:
            out.append(i)
    return re.compile(''.join(out))
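The key step in the routine above is the negative lookahead it emits before each new capture group, listing every earlier group as an alternative. A stripped-down sketch of just that step, for the special case of n pairwise-different characters, is shown below. The name `distinct_run` is hypothetical and is not part of the original routine.

```python
import re


def distinct_run(n):
    """Build a regex source string that matches n consecutive
    characters, each different from all the ones before it."""
    parts = []
    for k in range(1, n + 1):
        if k == 1:
            # first character: nothing to compare against yet
            parts.append('(.)')
        else:
            # forbid a repeat of any earlier group: (?!\1|\2|...)
            refs = '|'.join(r'\%d' % g for g in range(1, k))
            parts.append('(?!%s)(.)' % refs)
    return ''.join(parts)


print(distinct_run(3))                             # (.)(?!\1)(.)(?!\1|\2)(.)
print(re.search(distinct_run(3), 'xxyz').group())  # xyz
print(re.fullmatch(distinct_run(3), 'xyx'))        # None
```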

Best answer

You can try:

^(.)(?!\1)(.)(?!\1|\2).$

Here is an explanation of the regex pattern:

^          from the start of the string
(.)        match and capture any first character (no restrictions so far)
(?!\1)     then assert that the second character is different from the first
(.)        match and capture any (legitimate) second character
(?!\1|\2)  then assert that the third character does not match first or second
.          match any valid third character
$          end of string
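As a quick check, the pattern can be tried against a few three-character strings with Python's `re` module. This is only an illustrative sketch; the variable name `three_distinct` is an assumption made for the example.

```python
import re

# the pattern from the answer: three characters, all different
three_distinct = re.compile(r'^(.)(?!\1)(.)(?!\1|\2).$')

for text in ['xyz', 'xxy', 'xyx', 'xyy', 'abcd']:
    # bool() turns the match object (or None) into True/False
    print(text, bool(three_distinct.match(text)))
# xyz True
# xxy False
# xyx False
# xyy False
# abcd False
```

Because of the `^` and `$` anchors, the pattern matches only whole three-character strings; dropping the anchors and using `search` would find such a run anywhere in a longer text.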

Regarding "regex - Match two or more characters that are not the same", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/60538299/
