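python - I'm trying to remove all words shorter than 4 characters from a list, but it isn't working

I have this code that is supposed to remove all words shorter than 4 characters from a list, but it only removes some of them (I'm not sure which ones), and definitely not all: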

#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        valid_words = [word_file.read().split()]

    return valid_words


english_words = load_words()
print("loading...")

print(len(english_words[0]))
#remove words under 4 letters
for word in english_words[0]:
    if len(word) < 4:
        english_words[0].remove(word)

print("done")
print(len(english_words[0]))

#save the remaining words to a new text file
new_words = open("english_words_v3.txt","w")
for word in english_words[0]:
    new_words.write(word)
    new_words.write("\n")

new_words.close()

It outputs:

loading...
370103
done
367945

words_alpha.txt contains 67000 English words

Best answer

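You want to iterate over a copy of the word list, which you can get with english_words[0][:]. Right now you are iterating over the same list you are modifying: each call to remove() shifts the remaining items one position to the left, so the loop skips the item that slides into the slot it just processed. That is why only some of the short words are removed. So the for loop would look like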

for word in english_words[0][:]:
    if len(word) < 4:
        english_words[0].remove(word)
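
To see why the original loop misses words, here is a minimal sketch on a tiny list. The sample words and the function names remove_short_buggy and remove_short_fixed are made up for illustration; they are not part of the original code.

```python
def remove_short_buggy(words):
    """Remove short words while iterating over the same list (skips items)."""
    for word in words:
        if len(word) < 4:
            # Removing shifts later items left, so the next item is skipped.
            words.remove(word)
    return words


def remove_short_fixed(words):
    """Remove short words while iterating over a shallow copy of the list."""
    for word in words[:]:
        if len(word) < 4:
            words.remove(word)
    return words


sample = ["a", "be", "cat", "door", "to", "tree"]  # hypothetical sample data

buggy = remove_short_buggy(list(sample))
fixed = remove_short_fixed(list(sample))

print(buggy)  # ['be', 'door', 'tree'] ("be" was skipped)
print(fixed)  # ['door', 'tree']
```

In the buggy version, "be" survives because it slides into index 0 right after "a" is removed, and the loop has already moved on to index 1.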

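Also, you can replace the removal loop with a list comprehension, and there is no need to wrap word_file.read().split() in a list, since it already returns a list.

So your code would look like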

#load in the words from the original text file
def load_words():
    with open('words_alpha.txt') as word_file:
        #No need to wrap this into a list since it already returns a list
        valid_words = word_file.read().split()

    return valid_words

english_words = load_words()

#remove words under 4 letters using list comprehension
english_words = [word for word in english_words if len(word) >= 4]

print("done")
print(len(english_words))

#save the remaining words to a new text file
with open("english_words_v3.txt", "w") as new_words:
    for word in english_words:
        new_words.write(word)
        new_words.write("\n")
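
Note that the list comprehension builds a new list and rebinds the name english_words. If other parts of a program hold a reference to the original list and should see the filtered result too, one possible variation is to assign to a slice instead. This is only a sketch; the helper name keep_long_words, its min_length parameter, and the sample data are assumptions made for illustration.

```python
def keep_long_words(words, min_length=4):
    """Filter `words` in place, keeping words with at least `min_length` characters."""
    # Slice assignment replaces the contents of the existing list object
    # instead of creating a new list bound to a new name.
    words[:] = [word for word in words if len(word) >= min_length]


english_words = ["a", "be", "cat", "door", "to", "tree"]  # hypothetical sample data
alias = english_words  # a second reference to the same list object

keep_long_words(english_words)

print(english_words)  # ['door', 'tree']
print(alias)          # ['door', 'tree'] (same object, so it sees the change)
```

Either form makes a single pass over the list, whereas calling remove() inside a loop rescans the list from the start on every removal.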

Regarding "python - I'm trying to remove all words shorter than 4 characters from a list, but it isn't working", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/56169697/