python - Remove only the last occurrence of a word from a string

Tags: python string

I have a string and an array of phrases.

input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'

phrases_to_remove = ['wonderland', 'character', 'no one']

What I want to do is remove from input_string the last occurrence of each phrase in the array phrases_to_remove.

output_string = 'alice is a character from a fairy tale that lived in a. A about whome knows much about'

I have written a method that takes the input string and either an array or a single string to remove, and uses rsplit() to cut out the phrases.

def remove_words_from_end(actual_string: str, to_replace, occurrence: int):
    if isinstance(to_replace, list):
        output_string = actual_string
        for string in to_replace:
            output_string = ' '.join(output_string.rsplit(string, maxsplit=occurrence))
        return output_string.strip()
    elif isinstance(to_replace, str):
        return ' '.join(actual_string.rsplit(to_replace, maxsplit=occurrence)).strip()
    else:
        raise TypeError('the value "to_replace" must be a string or a list of strings')

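The problem with this code is that it cannot remove words whose spacing does not match, for example 'wonder land' versus 'wonderland'.

Is there a way to do this without hurting performance?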
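To make the mismatch concrete, here is a minimal check. The shortened sample string is an assumption for illustration, not the original input. Since rsplit() only splits on an exact substring, a phrase whose spacing differs is never found and the string comes back unchanged:

```python
text = 'lived in a wonder land. A character'

# rsplit looks for the exact substring 'wonderland'. The text contains
# 'wonder land' (with a space), so no split happens.
parts = text.rsplit('wonderland', maxsplit=1)
unchanged = ' '.join(parts)
print(unchanged == text)

# A phrase that matches exactly, such as 'character', is split off as expected.
removed = ' '.join(text.rsplit('character', maxsplit=1))
print(repr(removed.strip()))
```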

Best answer

Using re to handle the possible spaces is one option:

import re

def remove_last(word, string):
    # Allow an optional space between characters; escape each character
    # so regex metacharacters in the word are matched literally.
    pattern = ' ?'.join(re.escape(char) for char in word)
    matches = list(re.finditer(pattern, string))
    if not matches:
        return string
    # Cut the last match out of the string.
    last_m = matches[-1]
    return string[:last_m.start()] + string[last_m.end():]

def remove_words_from_end(words, string):
    # Strip spaces from each phrase so 'no one' and 'noone' build the same pattern.
    words_whole = [word.replace(' ', '') for word in words]
    string_out = string
    for word in words_whole:
        string_out = remove_last(word, string_out)
    return string_out

Running a few tests:

>>> input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'
>>> phrases_to_remove = ['wonderland', 'character', 'no one']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'
>>> phrases_to_remove = ['wonder land', 'character', 'noone']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'

In this example, the regex search pattern is simply the word with an optional space (' ?') allowed between each pair of characters.
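As a rough sketch of how such a pattern is built, the snippet below prints the pattern for one phrase. It also tidies the leftover whitespace so the result matches the output string the question asks for. The helper names spaced_pattern and remove_last_spaced are hypothetical, and the whitespace cleanup is an added assumption rather than part of the answer above. Note that the cleanup collapses repeated spaces across the whole string, not only around the removed phrase.

```python
import re

def spaced_pattern(word):
    # Drop spaces from the phrase, escape each character so regex
    # metacharacters match literally, and allow one optional space
    # between neighbouring characters.
    chars = word.replace(' ', '')
    return ' ?'.join(re.escape(ch) for ch in chars)

def remove_last_spaced(word, string):
    # Hypothetical helper: remove the last match, then tidy spacing.
    matches = list(re.finditer(spaced_pattern(word), string))
    if not matches:
        return string
    last = matches[-1]
    result = string[:last.start()] + string[last.end():]
    # Collapse runs of spaces and drop a space left before punctuation.
    result = re.sub(r' {2,}', ' ', result)
    result = re.sub(r' ([.,;:!?])', r'\1', result)
    return result.strip()

print(spaced_pattern('no one'))  # n ?o ?o ?n ?e
```

Applying remove_last_spaced in turn for 'wonderland', 'character' and 'no one' to the input string from the question should give the output_string shown there, without the doubled spaces.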

For "python - Remove only the last occurrence of a word from a string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48992197/
