python - Digram list manipulation with Python

Tags: python algorithm list optimization

I want to add to, or replace items in, a list of digrams that looks similar to this:

[[a,b],[b,c],[c,d],[d,c],[c,b],[b,a]]

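If the list were flattened, with the repeated items merged, the result would be [a,b,c,d,c,b,a], but this is only to describe the structure, not the problem.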

Note that each digram has exactly two items, and neighboring digrams share an item: the second item of one digram is the first item of the next. The only exceptions are the first item of the first digram and the second item of the last digram, which occur only once (see item a).
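
To make the structure concrete, here is a minimal sketch that checks the chain property described above and collapses a chain into its sequence of items. The names is_chain and chain_items are hypothetical helpers introduced only for illustration; they are not part of the question.

```python
def is_chain(digrams):
    """Return True if every digram has two items and each digram's
    second item equals the first item of the digram that follows it."""
    if any(len(d) != 2 for d in digrams):
        return False
    return all(left[1] == right[0]
               for left, right in zip(digrams, digrams[1:]))


def chain_items(digrams):
    """Collapse a chain of digrams into its sequence of items:
    the first item of the first digram, then every second item."""
    if not digrams:
        return []
    return [digrams[0][0]] + [d[1] for d in digrams]


chain = [['a', 'b'], ['b', 'c'], ['c', 'd'], ['d', 'c'], ['c', 'b'], ['b', 'a']]
print(is_chain(chain))     # True
print(chain_items(chain))  # ['a', 'b', 'c', 'd', 'c', 'b', 'a']
```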

My question is how to replace items in the list with a digram so that the following examples produce the results given in the comments:

replace([['d','d']],          1, ['a', 0]) # should return: [['d', 'd']]
replace([['d',1]],            1, ['a', 0]) # should return: [['d', 'a'], ['a', 0]]
replace([[1,'d']],            1, ['a', 0]) # should return: [['a', 0], [0, 'd']]
replace([[1,'d'],['d', 1]],   1, ['a', 0]) # should return: [['a', 0], [0, 'd'], ['d', 'a'], ['a', 0]]
replace([['d',1],[1,'d']],    1, ['a', 0]) # should return: [['d','a'], ['a', 0], [0, 'd']]
replace([[1,1]],              1, ['a', 0]) # should return: [['a', 0], [0, 'a'], ['a', 0]]
replace([[1,1],[1,1]],        1, ['a', 0]) # should return: [['a', 0], [0, 'a'], ['a', 0], [0, 'a'], ['a', 0]]

I have tried the following approach, but it has some problems. In particular, the part under j == 1 has some special cases that do not work.

def replace(t, a, b):
    """
    1. argument is the target list
    2. argument is the index value to be used on replacement
    3. argument is the digram to be inserted
    """
    # copy possibly not needed, im not sure
    t1 = t[:]
    for i, x in enumerate(t1):
        for j, y in enumerate(x):
            # if there is a digram match, lets make replacement / addition
            if y == a:
                if j == 0:
                    c = t1[i:]
                    del t1[i:]
                    t1 += [b] + c
                    c[0][0] = b[1]
                if j == 1:
                    c = t1[i:]
                    del t1[i:]
                    t1 += c + [b]
                    c[len(c)-1][1] = b[0]
                    #c[0][1] = b[0]
                    #t1 += c

    print (t, t1)

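Can you suggest some tips for improving the function, or another approach to accomplish the task?

Addition

Here is my improved version of the function. It gives the correct answers, but there is still an "annoying" part, and perhaps the whole approach could be optimized. This question and its title could be moved to the code optimization area: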

def replace(t, a, b):
    """
    1. argument is the target list
    2. argument is the index value to be used on replacement
    3. argument is the digram to be inserted
    """
    l = len(t)
    i = 0
    while i < l:
        for j, y in enumerate(t[i]):
            # if there is a digram match, lets make replacement / addition
            if y == a:
                if j == 0:
                    c = t[i:]
                    del t[i:]
                    t += [b] + c
                    c[0][0] = b[1]
                    # increase both index and length
                    # this practically jumps over the inserted digram to the next one
                    i += 1
                    l += 1
                elif j == 1:
                    c = t[i:]
                    del t[i:]
                    # this is the annoying part of the algorithm...
                    if len(c) > 1 and c[1][0] == a:
                        t += c
                    else:
                        t += c + [b]
                        c[-1][1] = b[0]
                    t[i][1] = b[0]
        i += 1

    return t

I have also provided a test function that checks the actual outputs against the expected ones:

def test(ins, outs):
    try:
        assert ins == outs
        return (True, 'was', outs)
    except AssertionError:
        return (False, 'was', ins, 'should be', outs)

for i, result in enumerate(
    [result for result in [
[replace([['d','d']],          1, ['a', 0]), [['d', 'd']]],
[replace([['d',1]],            1, ['a', 0]), [['d', 'a'], ['a', 0]]],
[replace([[1,'d']],            1, ['a', 0]), [['a', 0], [0, 'd']]],
[replace([[1,'d'],['d', 1]],   1, ['a', 0]), [['a', 0], [0, 'd'], ['d', 'a'], ['a', 0]]],
[replace([['d',1],[1,'d']],    1, ['a', 0]), [['d','a'], ['a', 0], [0, 'd']]],
[replace([[1,1]],              1, ['a', 0]), [['a', 0], [0, 'a'], ['a', 0]]],
[replace([[1,1],[1,1]],        1, ['a', 0]), [['a', 0], [0, 'a'], ['a', 0], [0, 'a'], ['a', 0]]],
[replace([['d',1],[1,1]],      1, ['a', 0]), [['d', 'a'], ['a', 0], [0, 'a'], ['a', 0]]],
[replace([[1,1],[1,'d']],      1, ['a', 0]), [['a', 0], [0, 'a'], ['a', 0], [0, 'd']]]
]]):
    print (i+1, test(*result))

Best answer

Here's my approach. The explanation follows.

def replace(t, a, b):
    # Flatten the list
    t = [elem for sub in t for elem in sub]
    replaced = []
    # Iterate the elements of the flattened list
    # Keep the elements that do not match a, and replace the ones that
    # do match with the elements of b
    for elem in t:
        if elem == a:  # this element matches, replace with b
            replaced.extend(b)
        else:          # this element does not, add it
            replaced.append(elem) 
    # break up the replaced, flattened list with groups of 2 elements
    return [replaced[x:x+2] for x in range(len(replaced)-1)]

You start with a list of lists, so first we can flatten it.

[[1,'d'],['d', 1]] becomes [1,'d','d', 1]

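Now we can iterate over the flattened list. Wherever we find an element that matches a, we extend our replaced list with the contents of b. If the element does not match a, we simply append it to replaced. We end up with: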

['a', 0, 'd', 'd', 'a', 0]

Now we want to split all of this into groups of 2, moving our index forward by 1 each time.

[['a', 0] ...]
[['a', 0], [0, 'd'], ...]
[['a', 0], [0, 'd'], ['d', 'd'], ...]
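
As an illustration of this grouping step, the overlapping pairs can also be built by zipping the list with a copy of itself shifted by one. This is only a sketch of an equivalent formulation; pairwise_digrams is a hypothetical name, not something used in the answer's code.

```python
def pairwise_digrams(items):
    """Group a flat list into overlapping pairs, advancing one item at a time."""
    return [[left, right] for left, right in zip(items, items[1:])]


replaced = ['a', 0, 'd', 'd', 'a', 0]
print(pairwise_digrams(replaced))
# [['a', 0], [0, 'd'], ['d', 'd'], ['d', 'a'], ['a', 0]]
```

For a list with fewer than two items this returns an empty list, which matches what the slicing expression in the function above produces.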

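If your data is much longer than your examples and you need better performance, you can drop the separate flattening step and walk the values in t with a nested loop instead, so that you make a single pass through t.

Edit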
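
A minimal sketch of that single-pass idea follows. The name replace_single_pass is hypothetical. It is meant to behave like the first replace function in this answer, so it still treats the item shared by two neighboring digrams as two separate elements.

```python
def replace_single_pass(t, a, b):
    """Replace every element equal to a with the items of b while walking
    the nested list directly, without building a flattened copy first."""
    replaced = []
    for digram in t:
        for elem in digram:
            if elem == a:   # match: insert the items of b instead
                replaced.extend(b)
            else:           # no match: keep the element as it is
                replaced.append(elem)
    # Break the result up into overlapping groups of 2 elements
    return [replaced[x:x + 2] for x in range(len(replaced) - 1)]


print(replace_single_pass([['d', 1]], 1, ['a', 0]))
# [['d', 'a'], ['a', 0]]
```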

def replace(t, a, b):
    t = [elem for sub in t for elem in sub]

    inner_a_matches_removed = []
    for i, elem in enumerate(t):
        if not i % 2 or elem != a:
            inner_a_matches_removed.append(elem)
            continue
        if i < len(t) - 1 and t[i+1] == a:
            continue
        inner_a_matches_removed.append(elem)

    replaced = []
    for elem in inner_a_matches_removed:
        if elem == a:
            replaced.extend(b)
        else:
            replaced.append(elem)
    return [replaced[x:x+2] for x in range(len(replaced)-1)]

Here is an addition for testing:

args_groups = [
        ([['d','d']],          1, ['a', 0]),
        ([['d',1]],            1, ['a', 0]),
        ([[1,'d']],            1, ['a', 0]),
        ([[1,'d'],['d', 1]],   1, ['a', 0]),
        ([['d',1],[1,'d']],    1, ['a', 0]),
        ([[1,1]],              1, ['a', 0]),
        ([[1,1],[1,1]],        1, ['a', 0]),
]

for args in args_groups:
    print("replace({}) => {}".format(", ".join(map(str, args)), replace(*args)))

Which outputs:

replace([['d', 'd']],             1,   ['a', 0]) => [['d', 'd']]
replace([['d', 1]],               1,   ['a', 0]) => [['d', 'a'], ['a', 0]]
replace([[1, 'd']],               1,   ['a', 0]) => [['a', 0], [0, 'd']]
replace([[1, 'd'], ['d', 1]],     1,   ['a', 0]) => [['a', 0], [0, 'd'], ['d', 'd'], ['d', 'a'], ['a', 0]]
replace([['d', 1], [1, 'd']],     1,   ['a', 0]) => [['d', 'a'], ['a', 0], [0, 'd']]
replace([[1, 1]],                 1,   ['a', 0]) => [['a', 0], [0, 'a'], ['a', 0]]
replace([[1, 1], [1, 1]],         1,   ['a', 0]) => [['a', 0], [0, 'a'], ['a', 0], [0, 'a'], ['a', 0]]

I guess I still don't understand case #4, but you seem to have solved it yourself, which is great!

Here is your modified code:

def replace(t, a, b):
    # Flatten the list
    t1 = []
    l = len(t)-1
    for items in [t[i][0:(1 if i>-1 and i<l else 2)] for i in range(0,l+1)]:
        t1.extend(items)
    replaced = []
    # Iterate the elements of the flattened list
    # Keep the elements that do not match a, and replace the ones that
    # do match with the elements of b
    for elem in t1:
        if elem == a:  # this element matches, replace with b
            replaced.extend(b)
        else:          # this element does not, add it
            replaced.append(elem) 
    # break up the replaced, flattened list with groups of 2 elements
    return [replaced[x:x+2] for x in range(len(replaced)-1)]
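
For comparison, here is a compact sketch of the same idea, checked against the expected results listed in the question. It assumes the input is a valid chain, meaning each digram's second item equals the next digram's first item, so it keeps the first item of the first digram plus every second item. The name replace_chain is hypothetical.

```python
def replace_chain(t, a, b):
    """Replace each item equal to a in a chain of digrams with the items of b."""
    if not t:
        return []
    # Assumption: t is a valid chain, so shared items are taken only once
    items = [t[0][0]] + [digram[1] for digram in t]
    replaced = []
    for elem in items:
        if elem == a:
            replaced.extend(b)
        else:
            replaced.append(elem)
    # Rebuild the overlapping digrams from the replaced items
    return [[left, right] for left, right in zip(replaced, replaced[1:])]


print(replace_chain([[1, 'd'], ['d', 1]], 1, ['a', 0]))
# [['a', 0], [0, 'd'], ['d', 'a'], ['a', 0]]
```

Unlike the in-place version in the question, this sketch does not modify the list passed in.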

Regarding python - Digram list manipulation with Python, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/35244046/
