python - Remove only the last occurrence of a word from a string

I have a string and an array of phrases.

input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'

phrases_to_remove = ['wonderland', 'character', 'no one']

What I want to do is remove from input_string the last occurrence of each phrase in the phrases_to_remove array.

output_string = 'alice is a character from a fairy tale that lived in a. A about whome knows much about'

I have written a method that takes the input string and either an array or a single string to replace, and uses rsplit() to remove the phrases.

def remove_words_from_end(actual_string: str, to_replace, occurrence: int):
    if isinstance(to_replace, list):
        output_string = actual_string
        for string in to_replace:
            output_string = ' '.join(output_string.rsplit(string, maxsplit=occurrence))
        return output_string.strip()
    elif isinstance(to_replace, str):
        return ' '.join(actual_string.rsplit(to_replace, maxsplit=occurrence)).strip()
    else:
        raise TypeError('the value "to_replace" must be a string or a list of strings')

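The problem with this code is that I cannot remove words whose spacing does not match, for example "wonder land" and "wonderland".

Is there a way to do this without hurting performance?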

Best answer

Using re to handle the possible spaces is one option:

import re

def remove_last(word, string):
    # Allow an optional space between consecutive characters of the word;
    # re.escape keeps characters such as '.' from acting as regex syntax
    pattern = ' ?'.join(re.escape(ch) for ch in word)
    matches = list(re.finditer(pattern, string))
    if not matches:
        return string
    last_m = matches[-1]
    # Keep everything before and after the last match
    return string[:last_m.start()] + string[last_m.end():]

def remove_words_from_end(words, string):
    # Strip spaces from the phrases so that 'no one' and 'noone' behave alike
    words_whole = [word.replace(' ', '') for word in words]
    string_out = string
    for word in words_whole:
        string_out = remove_last(word, string_out)
    return string_out

And running some tests:

>>> input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'
>>> phrases_to_remove = ['wonderland', 'character', 'no one']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'
>>> phrases_to_remove = ['wonder land', 'character', 'noone']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'

In remove_last, the regex search pattern is simply the word with an optional space ' ?' allowed between each pair of adjacent characters.
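To make the pattern concrete, the sketch below prints it for one phrase and then reuses it with a greedy prefix to find the last occurrence without first collecting every match. The names spaced_pattern and remove_last_greedy are hypothetical and not part of the original answer. Whether this variant is actually faster is not established here and would need to be measured on real input.

```python
import re

def spaced_pattern(word):
    # Hypothetical helper: optional single space between consecutive characters,
    # with each character escaped so it is matched literally
    return ' ?'.join(re.escape(ch) for ch in word)

def remove_last_greedy(word, string):
    # Hypothetical variant: the greedy (.*) consumes as much text as possible,
    # so the phrase matched after it is the last occurrence in the string.
    # re.DOTALL lets '.' cross newlines; the lambda returns the kept prefix.
    pattern = '(.*)' + spaced_pattern(word)
    return re.sub(pattern, lambda m: m.group(1), string, count=1, flags=re.DOTALL)

print(spaced_pattern('wonderland'))
# w ?o ?n ?d ?e ?r ?l ?a ?n ?d
print(re.search(spaced_pattern('wonderland'), 'a wonder land.').group())
# wonder land
print(remove_last_greedy('character', 'a character b character c'))
```

If the phrase does not occur, re.sub finds nothing to replace and the string is returned unchanged.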

Regarding "python - Remove only the last occurrence of a word from a string", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48992197/
