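python - Finding substrings that start and end with the same uppercase character

I have a homework problem where I need to use a regular expression to parse substrings out of a large string.

The goal is to select substrings that match the following criteria:

The substring starts and ends with the same uppercase character, and I need to ignore any instance of an uppercase character that is preceded by the digit 0.

For example, ZAp0ZuZAuX0AZA would contain the matches ZAp0ZuZ and AuX0AZA.

I have been messing with this for hours and honestly haven't even come close...

I have tried something like the patterns below, but the first selects everything from the first uppercase letter to the last one: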

[A-Z]{1}[[:alnum:]]*[A-Z]{1} <--- this selects the whole string
[A-Z]{1}[[:alnum:]][A-Z]{1} <--- this gives me strings like ZuZ, AuX

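Any help is greatly appreciated, I am completely stumped on this one.

Best answer

Using a regular expression for this may not be the best idea, since you could simply split the string. However, if you have to or want to, this expression may give you an idea of the problems you could run into as your character list grows: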

(?=.[A-Z])([A-Z])(.*?)\1

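I added (?=.[A-Z]), which requires an uppercase letter to be present. You can remove it and the expression will still work; it is only there as a boundary you can add to your expression to be safe.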
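To see what the backreference `\1` and the lazy `.*?` do on the question's sample, here is a small sketch that runs the expression both with and without the lookahead. The helper name `full_matches` and the extra input `"ZuZ"` are illustrative assumptions, not part of the original answer. Note that, as literally written, `(?=.[A-Z])` requires the character right after the match start to be uppercase, so the two forms can differ on some inputs.

```python
import re

sample = "ZAp0ZuZAuX0AZA"

# The answer's expression as written, and the shorter form used in its tests.
with_lookahead = re.compile(r"(?=.[A-Z])([A-Z])(.*?)\1")
without_lookahead = re.compile(r"([A-Z])(.*?)\1")


def full_matches(pattern, text):
    """Return the whole matched substrings (group 0), not the capture groups."""
    return [m.group(0) for m in pattern.finditer(text)]


# Both forms agree on the question's sample, but neither skips "0Z" or "0A".
print(full_matches(with_lookahead, sample))     # ['ZAp0Z', 'ZAuX0AZ']
print(full_matches(without_lookahead, sample))  # ['ZAp0Z', 'ZAuX0AZ']

# Hypothetical extra input: the second character is lowercase.
print(full_matches(with_lookahead, "ZuZ"))      # []
print(full_matches(without_lookahead, "ZuZ"))   # ['ZuZ']
```

The output on the sample differs from the matches the question expects (ZAp0ZuZ and AuX0AZA), because the expression treats a capital preceded by 0 like any other capital.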

JavaScript test

const regex = /([A-Z])(.*?)\1/gm;
const str = `ZAp0ZuZAuX0AZA
ZApxxZuZAuXxafaAZA
ZApxaf09xZuZAuX090xafaAZA
abcZApxaf09xZuZAuX090xafaAZA`;
let m;

while ((m = regex.exec(str)) !== null) {
    // This is necessary to avoid infinite loops with zero-width matches
    if (m.index === regex.lastIndex) {
        regex.lastIndex++;
    }
    
    // The result can be accessed through the `m`-variable.
    m.forEach((match, groupIndex) => {
        console.log(`Found match, group ${groupIndex}: ${match}`);
    });
}

Python test

# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility

import re

regex = r"([A-Z])(.*?)\1"

test_str = ("ZAp0ZuZAuX0AZA\n"
    "ZApxxZuZAuXxafaAZA\n"
    "ZApxaf09xZuZAuX090xafaAZA\n"
    "abcZApxaf09xZuZAuX090xafaAZA")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(
        matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))

    # Capture groups are numbered from 1; group 0 is the whole match.
    for groupNum in range(1, len(match.groups()) + 1):
        print("Group {groupNum} found at {start}-{end}: {group}".format(
            groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum),
            group=match.group(groupNum)))

# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
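Neither test above excludes capitals preceded by the digit 0, which the question asks for. One possible way to add that rule is a negative lookbehind on both the opening and the closing capital. This is a minimal sketch rather than the answer's own method; it assumes that "ignore" means such a capital may neither start nor end a match, and the names `PATTERN` and `find_bracketed` are made up for illustration.

```python
import re

# Assumption: a capital directly after "0" can neither open nor close a match.
# (?<!0) is a negative lookbehind: the previous character must not be "0".
PATTERN = re.compile(r"(?<!0)([A-Z]).*?(?<!0)\1")


def find_bracketed(text):
    """Return non-overlapping substrings that start and end with the same
    uppercase letter, skipping capitals preceded by the digit 0."""
    # finditer + group(0) is used because re.findall would return only
    # the captured capital letter, not the whole substring.
    return [m.group(0) for m in PATTERN.finditer(text)]


print(find_bracketed("ZAp0ZuZAuX0AZA"))  # ['ZAp0ZuZ', 'AuX0AZA']
```

On the question's sample this yields the two expected substrings. Matches are non-overlapping and scanned left to right, so a different rule would be needed if overlapping substrings should also be reported.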

Source: a similar question on Stack Overflow about finding substrings that start and end with the same uppercase character: https://stackoverflow.com/questions/56085044/
