I'm trying to write a function that censors words in a string. It more or less works, but it has a few quirks.
Here is my code:
    def censor(sentence):
        badwords = 'apple orange banana'.split()
        sentence = sentence.split()
        for i in badwords:
            for words in sentence:
                if i in words:
                    pos = sentence.index(words)
                    sentence.remove(words)
                    sentence.insert(pos, '*' * len(i))
        print(" ".join(sentence))

    sentence = "you are an appletini and apple. new sentence: an orange is a banana. orange test."
    censor(sentence)
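Output:
    you are an ***** and ***** new sentence: an ****** is a ****** ****** test.
Some of the punctuation disappears, and the word "appletini" is replaced incorrectly.
How can I fix this?
Also, is there a simpler way to do this kind of thing?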
Best answer
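The specific problems are:
- you don't account for punctuation at all; and
- when inserting the '*' characters, you use the length of the "bad word" rather than the length of the word being replaced.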
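Both problems come from the same expression in the question's code, `'*' * len(i)`. A minimal sketch of that behaviour follows; the helper name `naive_mask` is hypothetical and exists only for illustration:

```python
def naive_mask(badword):
    # Mirrors the question's replacement: the mask is built from the bad
    # word alone, so the token that matched is discarded entirely.
    return "*" * len(badword)

for token in ("appletini", "apple."):
    # Both tokens contain "apple", so both receive the same 5-character mask.
    print(repr(token), "->", repr(naive_mask("apple")))
```

The 9-letter "appletini" shrinks to 5 asterisks, and the trailing period of "apple." is lost because the whole token is swapped out.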
I would switch the order of the loops, so that you only go through the sentence once, and use enumerate rather than remove and insert:
    def censor(sentence):
        badwords = ("test", "word")  # consider making this an argument too
        sentence = sentence.split()
        for index, word in enumerate(sentence):
            if any(badword in word for badword in badwords):
                sentence[index] = "".join(['*' if c.isalpha() else c for c in word])
        return " ".join(sentence)  # return rather than print
Because of the str.isalpha test, only letters are replaced with asterisks, so punctuation is left in place. Demo:
    >>> censor("Censor these testing words, will you? Here's a test-case!")
    "Censor these ******* *****, will you? Here's a ****-****!"
    #             ^ note length                         ^ note punctuation
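Note that the `badword in word` check above matches substrings, so "testing" is censored along with "test". If the intent is instead to leave words such as "appletini" untouched (an assumption about what the asker wants), one possible alternative is a regular expression with word boundaries. The sketch below is not part of the original answer; the name `censor_whole_words` and its `badwords` parameter are hypothetical:

```python
import re

def censor_whole_words(sentence, badwords=("test", "word")):
    """Mask whole-word matches of badwords and leave everything else as is."""
    # \b marks a word boundary, so "test" does not match inside "testing".
    # re.escape guards against bad words that contain regex metacharacters.
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(w) for w in badwords) + r")\b",
        re.IGNORECASE,
    )
    # Replace each match with as many asterisks as it has characters.
    return pattern.sub(lambda m: "*" * len(m.group()), sentence)

print(censor_whole_words("Censor these testing words, will you? Here's a test-case!"))
# Censor these testing words, will you? Here's a ****-case!
```

Because the sentence is never split, the original spacing and punctuation are preserved without any extra handling, and the bad words are passed in as an argument, as the comment in the answer's code suggests.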
Regarding "python - creating a censor function from a list of bad words", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/24738016/