C++ string diff (à la Python's difflib)

Tags: c++ python algorithm diff

I'm trying to diff two strings to determine whether they differ only in a single numeric field of the string; for example,

varies_in_single_number_field('foo7bar', 'foo123bar')
# Returns True, because 7 != 123, and there's only one varying
# number region between the two strings.

In Python I can use difflib to accomplish this:

import difflib, doctest

def varies_in_single_number_field(str1, str2):
    """
    A typical use case is as follows:
        >>> varies_in_single_number_field('foo7bar00', 'foo123bar00')
        True

    Numerical variation in two dimensions is no good:
        >>> varies_in_single_number_field('foo7bar00', 'foo123bar01')
        False

    Varying in a nonexistent field is okay:
        >>> varies_in_single_number_field('foobar00', 'foo123bar00')
        True

    Identical strings don't *vary* in any number field:
        >>> varies_in_single_number_field('foobar00', 'foobar00')
        False
    """
    in_differing_substring = False
    passed_differing_substring = False # There should be only one.
    differ = difflib.Differ()
    for letter_diff in differ.compare(str1, str2):
        letter = letter_diff[2:]
        if letter_diff.startswith(('-', '+')):
            if passed_differing_substring: # Already saw a varying field.
                return False
            in_differing_substring = True
            if not letter.isdigit(): return False # Non-digit diff character.
        elif in_differing_substring: # Diff character not found - end of diff.
            in_differing_substring = False
            passed_differing_substring = True
    return passed_differing_substring # No variation if no diff was passed.

if __name__ == '__main__': doctest.testmod()

But I have no idea how to find something like difflib for C++. Alternative approaches are welcome. :)

Best answer

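This might work; it at least passes your demonstration tests. EDIT: I've made some modifications to deal with some string indexing issues. I believe it should be good now.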

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
#include <cctype>

bool starts_with(const std::string &s1, const std::string &s2) {
    return (s1.length() <= s2.length()) && (s2.substr(0, s1.length()) == s1);
}

bool ends_with(const std::string &s1, const std::string &s2) {
    return (s1.length() <= s2.length()) && (s2.substr(s2.length() - s1.length()) == s1);
}

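bool is_numeric(const std::string &s) {
    for(std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
        // Cast first: passing a negative char value to std::isdigit is undefined behavior.
        if(!std::isdigit(static_cast<unsigned char>(*it))) {
            return false;
        }
    }
    return true;
}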

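bool varies_in_single_number_field(std::string s1, std::string s2) {

    if(s1 == s2) {
        return false;
    }

    if((s1.empty() && is_numeric(s2)) || (s2.empty() && is_numeric(s1))) {
        return true;
    }

    // Make s1 the longer string. Since s1 != s2, s1 is non-empty from here on.
    if(s1.length() < s2.length()) {
        s1.swap(s2);
    }

    // Compute the indices only after the swap, so they refer to the longer string.
    size_t index1 = 0;
    size_t index2 = s1.length() - 1;

    // After this loop, (index1 - 1) is the length of the common prefix.
    while(index1 < s1.length() && starts_with(s1.substr(0, index1), s2)) { index1++; }

    // Extend the common suffix, but never let it overlap the prefix inside s2.
    while((s1.length() - index2) + (index1 - 1) <= s2.length() &&
          ends_with(s1.substr(index2), s2)) { index2--; }

    // The part of s1 between the common prefix and the common suffix must be all digits.
    return is_numeric(s1.substr(index1 - 1, (index2 + 1) - (index1 - 1)));

}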

int main() {
    std::cout << std::boolalpha << varies_in_single_number_field("foo7bar00", "foo123bar00") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("foo7bar00", "foo123bar01") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("foobar00", "foo123bar00") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("foobar00", "foobar00") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("7aaa", "aaa") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("aaa7", "aaa") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("aaa", "7aaa") << std::endl;
    std::cout << std::boolalpha << varies_in_single_number_field("aaa", "aaa7") << std::endl;
}

Basically, it looks for a string that has 3 parts: string2 begins with part1, string2 ends with part3, and part2 consists only of digits.
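
The same three-part idea can also be written with plain index arithmetic instead of repeated `substr` calls. What follows is only a minimal sketch, not the answer's code: the names `differs_in_one_digit_run` and `all_digits` are made up for illustration, and it assumes that the differing middle part of both strings (not just the longer one) has to consist solely of digits, so that a pair like `fooXbar` / `foo12bar` is rejected.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper: true if every character is a decimal digit
// (vacuously true for an empty string).
bool all_digits(const std::string &s) {
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Hypothetical function: true if a and b share a common prefix and a common
// suffix, and whatever lies between them in BOTH strings is made of digits only.
bool differs_in_one_digit_run(const std::string &a, const std::string &b) {
    if (a == b) {
        return false;  // identical strings do not vary at all
    }
    const std::size_t shorter = std::min(a.size(), b.size());

    // Length of the common prefix.
    std::size_t prefix = 0;
    while (prefix < shorter && a[prefix] == b[prefix]) {
        ++prefix;
    }

    // Length of the common suffix, capped so it cannot overlap the prefix
    // inside the shorter string.
    std::size_t suffix = 0;
    while (suffix < shorter - prefix &&
           a[a.size() - 1 - suffix] == b[b.size() - 1 - suffix]) {
        ++suffix;
    }
    assert(prefix + suffix <= shorter);

    // The differing middle parts; either may be empty (a "nonexistent field").
    const std::string mid_a = a.substr(prefix, a.size() - prefix - suffix);
    const std::string mid_b = b.substr(prefix, b.size() - prefix - suffix);
    return all_digits(mid_a) && all_digits(mid_b);
}
```

Under these assumptions, calling `differs_in_one_digit_run("foo7bar00", "foo123bar00")` returns true, while `differs_in_one_digit_run("foo7bar00", "foo123bar01")` returns false. Because this compares a single prefix/middle/suffix split rather than computing a character-level diff, it may not agree with the difflib-based Python version on every input.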

Source: a similar question on Stack Overflow about C++ string diff (à la Python's difflib): https://stackoverflow.com/questions/269918/
