python - Showing string alignment

Tags: python

I have a program that tells me the distance between two strings, and it works well.

For example:

word 1 = hello
word 2 = hi

Going from one to the other costs 5 (substituting i with e costs 2, and there are 3 insertions).

Basically, an insertion costs 1, a deletion costs 1, and a substitution costs 2. Words can also be shuffled within the string to lower the cost.
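
To make the cost model concrete, the hello/hi figure can be tallied by hand. This is only a minimal sketch: the helper `script_cost` and the one-letter operation codes (M for match, S for substitute, I for insert, D for delete) are assumptions introduced for illustration and are not part of the program below.

```python
# Cost model from the question: insert 1, delete 1, substitute 2.
# A match (identical characters) is assumed to cost 0, as in subCost().
COSTS = {"M": 0, "S": 2, "I": 1, "D": 1}

def script_cost(ops):
    """Sum the cost of a sequence of operation codes (hypothetical helper)."""
    return sum(COSTS[op] for op in ops)

# hi -> hello: keep 'h', substitute 'i' with 'e', insert 'l', 'l', 'o'.
print(script_cost("MSIII"))  # 5
```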

I need a way to remember which operation happened at which point, so that I can display the alignment.

For example:

wax
S M S (substitute, move, substitute; cost of 4)
and

Any ideas or tips?

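def minEditDist(target, source):
    """Return the minimum edit distance from source to target."""

    # Lengths of the target and source strings
    n = len(target)
    m = len(source)

    # distance[i][j] = cost of turning source[:j] into target[:i]
    distance = [[0 for _ in range(m + 1)] for _ in range(n + 1)]

    # First column: build target[:i] from an empty source by insertions
    for i in range(1, n + 1):
        distance[i][0] = distance[i-1][0] + insertCost(target[i-1])

    # First row: reduce source[:j] to an empty target by deletions
    for j in range(1, m + 1):
        distance[0][j] = distance[0][j-1] + deleteCost(source[j-1])

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            distance[i][j] = min(distance[i-1][j] + insertCost(target[i-1]),
                                 distance[i][j-1] + deleteCost(source[j-1]),
                                 distance[i-1][j-1] + subCost(source[j-1], target[i-1]))

    # Return the bottom-right cell; indexing with n and m (rather than the
    # loop variables) also works when either string is empty
    return distance[n][m]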

def subCost(x,y):
    if x == y:
        return 0
    else:
        return 2

def insertCost(x):
    return 1

def deleteCost(x):
    return 1

# User inputs the strings for comparison
# Commented out here because cloud9 won't take input like this
# word1 = raw_input("Enter A Word: ")
# word2 = raw_input("Enter The Second Word: ")
word1 = "wax"
word2 = "and"

# Strip leading and trailing whitespace from both words
word1x = word1.strip()
word2x = word2.strip()

# Display the minimum distance between the two specified strings
print("The minimum edit distance between S1 and S2 is: {}!".format(minEditDist(word1x, word2x)))
print(word1x)
print(word2x)

Best answer

You could start with something like this.

I have added the correct data for "S".

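# Module-level log of recorded operations; it keeps growing across calls
path = []

def minEditDist(target, source):

    # Lengths of the target and source strings
    n = len(target)
    m = len(source)

    distance = [[0 for _ in range(m + 1)] for _ in range(n + 1)]

    # insertCost() and deleteCost() append "I" / "D" to path as a side effect,
    # so those entries are recorded only while the first column and row are filled
    for i in range(1, n + 1):
        distance[i][0] = distance[i-1][0] + insertCost(target[i-1])

    for j in range(1, m + 1):
        distance[0][j] = distance[0][j-1] + deleteCost(source[j-1])

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sc = subCost(source[j-1], target[i-1])
            distance[i][j] = min(distance[i-1][j] + 1,
                                 distance[i][j-1] + 1,
                                 distance[i-1][j-1] + sc)
            # Record "S" whenever the diagonal step is strictly cheaper than both
            # alternatives. This fires for every such cell (including matches,
            # where sc == 0), not only for cells on the final alignment path.
            if distance[i-1][j] + 1 > distance[i-1][j-1] + sc and distance[i][j-1] + 1 > distance[i-1][j-1] + sc:
                path.append("S")

    print(path)

    # Return the bottom-right cell of the table
    return distance[n][m]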

def subCost(x,y):
    if x == y:
        return 0
    else:
        return 2

def insertCost(x):
    path.append("I")
    return 1

def deleteCost(x):
    path.append("D")
    return 1
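
One way to get from recorded operations to a displayable alignment is to fill the whole table first and then walk back from the bottom-right cell, choosing at each step a neighbouring cell that explains the current value. The sketch below is an illustration under stated assumptions, not the code from the answer above: the function `align` and its parameter names are hypothetical, `M` here stands for a match (cost 0) rather than a move, `-` marks a gap, and the word shuffling mentioned in the question is not modelled. Ties are broken in favour of the diagonal step, so other equally cheap alignments may exist.

```python
def align(target, source, ins_cost=1, del_cost=1, sub_cost=2):
    """Return (distance, source_row, ops, target_row) for one optimal alignment.

    Hypothetical helper: rows of the table follow target, columns follow
    source, matching the layout used by minEditDist.
    """
    n, m = len(target), len(source)
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = dist[i - 1][0] + ins_cost
    for j in range(1, m + 1):
        dist[0][j] = dist[0][j - 1] + del_cost
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sc = 0 if source[j - 1] == target[i - 1] else sub_cost
            dist[i][j] = min(dist[i - 1][j] + ins_cost,
                             dist[i][j - 1] + del_cost,
                             dist[i - 1][j - 1] + sc)

    # Walk back from (n, m) to (0, 0), collecting operations in reverse.
    ops, src_row, tgt_row = [], [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sc = 0 if source[j - 1] == target[i - 1] else sub_cost
            if dist[i][j] == dist[i - 1][j - 1] + sc:
                # Diagonal step: match (M) or substitution (S)
                ops.append("M" if sc == 0 else "S")
                src_row.append(source[j - 1])
                tgt_row.append(target[i - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + ins_cost:
            # Step up: a target character was inserted
            ops.append("I")
            src_row.append("-")
            tgt_row.append(target[i - 1])
            i -= 1
        else:
            # Step left: a source character was deleted
            ops.append("D")
            src_row.append(source[j - 1])
            tgt_row.append("-")
            j -= 1

    ops.reverse()
    src_row.reverse()
    tgt_row.reverse()
    return dist[n][m], "".join(src_row), "".join(ops), "".join(tgt_row)


cost, src_row, ops, tgt_row = align("hello", "hi")
print(cost)     # 5
print(src_row)  # h---i
print(ops)      # MIIIS
print(tgt_row)  # hello
```

Because the walk happens after the table is complete, only the cells on one optimal path contribute an operation, which is what the display needs. For "wax" and "and" the same function reports a cost of 4, the figure quoted in the question.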

Regarding python - Showing string alignment, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/14228884/
