I have a string and an array of phrases.
input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'
phrases_to_remove = ['wonderland', 'character', 'no one']
What I want to do is remove from input_string the last occurrence of each phrase in the phrases_to_remove array.
output_string = 'alice is a character from a fairy tale that lived in a. A about whome knows much about'
I have written a method that takes the input string and either an array or a single string to replace, and uses rsplit() to remove the phrases.
def remove_words_from_end(actual_string: str, to_replace, occurrence: int):
    if isinstance(to_replace, list):
        output_string = actual_string
        for string in to_replace:
            output_string = ' '.join(output_string.rsplit(string, maxsplit=occurrence))
        return output_string.strip()
    elif isinstance(to_replace, str):
        return ' '.join(actual_string.rsplit(to_replace, maxsplit=occurrence)).strip()
    else:
        raise TypeError('the value "to_replace" must be a string or a list of strings')
The problem with the code is that I cannot remove phrases whose spacing does not match the text, for example wonder land and wonderland.
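To make the mismatch concrete, here is a minimal sketch. The variable sample is a shortened piece of input_string, introduced here only for illustration.

```python
sample = 'lived in a wonder land. A character'

# str.rsplit only splits on an exact substring, so a phrase whose
# spacing differs from the text leaves the string in one piece.
print(sample.rsplit('wonderland', maxsplit=1))
# ['lived in a wonder land. A character']

# With matching spacing, the split happens around the last occurrence.
print(sample.rsplit('wonder land', maxsplit=1))
# ['lived in a ', '. A character']
```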
Is there a way to do this without hurting performance?
Best answer
Using re to handle the possible spaces is one option:
import re

def remove_last(word, string):
    # Allow one optional space between consecutive characters of the word;
    # re.escape keeps regex metacharacters literal.
    pattern = ' ?'.join(re.escape(char) for char in word)
    matches = list(re.finditer(pattern, string))
    if not matches:
        return string
    # Cut out only the last match and keep everything around it.
    last_m = matches[-1]
    return string[:last_m.start()] + string[last_m.end():]

def remove_words_from_end(words, string):
    # Strip spaces from each phrase so that 'wonder land' and 'wonderland'
    # build the same pattern.
    words_whole = [word.replace(' ', '') for word in words]
    string_out = string
    for word in words_whole:
        string_out = remove_last(word, string_out)
    return string_out
And running some tests:
>>> input_string = 'alice is a character from a fairy tale that lived in a wonder land. A character about whome no one knows much about'
>>> phrases_to_remove = ['wonderland', 'character', 'no one']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'
>>> phrases_to_remove = ['wonder land', 'character', 'noone']
>>> remove_words_from_end(phrases_to_remove, input_string)
'alice is a character from a fairy tale that lived in a . A  about whome  knows much about'
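Because remove_last cuts out only the matched characters, the spaces on either side of a removed phrase stay in the result. If the goal is the output_string from the question, one possible follow-up step is sketched below. The helper tidy_spaces is hypothetical and not part of the answer, and it assumes that collapsing runs of spaces and dropping a space before punctuation is acceptable for the text at hand.

```python
import re

def tidy_spaces(text):
    # Hypothetical clean-up helper, not part of the original answer.
    # Collapse runs of two or more spaces into a single space.
    text = re.sub(r' {2,}', ' ', text)
    # Drop a space left directly before a punctuation mark.
    text = re.sub(r' ([.,;:!?])', r'\1', text)
    return text.strip()

raw = 'lived in a . A  about whome  knows much about'
print(tidy_spaces(raw))  # lived in a. A about whome knows much about
```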
In remove_last, the regex search pattern is simply the word with an optional space ' ?' allowed between each pair of adjacent characters.
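The pattern described above can be sketched on its own as follows. The names build_pattern and remove_last_match are hypothetical, and the sketch makes two assumptions that go beyond the answer: each character is passed through re.escape so that phrases containing characters such as '.' are matched literally, and the pattern is compiled once so it can be reused across many strings. Since the question asks about performance, the second function also keeps only the final match while iterating instead of building a list of every match; whether that matters in practice would need measuring.

```python
import re

def build_pattern(phrase):
    # Hypothetical helper, not part of the answer above.
    # Drop spaces from the phrase, escape each character so regex
    # metacharacters match literally, and allow one optional space
    # between consecutive characters.
    chars = [re.escape(ch) for ch in phrase.replace(' ', '')]
    return re.compile(' ?'.join(chars))

def remove_last_match(pattern, string):
    # Walk the matches and remember only the final one.
    last = None
    for last in pattern.finditer(string):
        pass
    if last is None:
        return string
    return string[:last.start()] + string[last.end():]

pattern = build_pattern('no one')
print(pattern.pattern)  # n ?o ?o ?n ?e
print(repr(remove_last_match(pattern, 'noone saw no one')))  # 'noone saw '
```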
Source: a similar question on Stack Overflow, "python - remove only the last occurrence of a word from a string": https://stackoverflow.com/questions/48992197/