Python word count program for a txt file

I am trying to write a program that finds the 5 most common words in a txt file.

Here is what I have so far:

file = open('alice.txt')
wordcount = {}

for word in file.read().split():
    if word not in wordcount:
        wordcount[word] = 1
    else:
        wordcount[word] += 1

for k, v in wordcount.items():
    print (k, v)

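As written, the program counts every word in the .txt file and prints all of them.

My question is how to make it report only the 5 most common words in the file, so that each of those words is displayed with its count next to it.

There is one catch: I can't use a dictionary... whatever that means.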

Best answer

This is simple: you just need to find the 5 words with the highest counts.

So you can sort the counts in descending order like this:

wordcount = sorted(wordcount.items(), key=lambda x: x[1], reverse=True)

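The entries are now sorted by value, highest count first. Keep in mind that sorted returns a list, so wordcount is no longer a dictionary but a list of (word, count) tuples.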
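To make the shape of that result concrete, here is a minimal sketch on a small hand-written dictionary. The name sample_counts and its contents are made up for illustration and stand in for the real wordcount built from the file.

```python
# Hypothetical sample counts, standing in for the real wordcount dictionary.
sample_counts = {"the": 3, "alice": 2, "rabbit": 1, "and": 2}

# Sort the (word, count) pairs by count, highest first.
ranked = sorted(sample_counts.items(), key=lambda item: item[1], reverse=True)

print(type(ranked).__name__)  # list
print(ranked[:2])             # [('the', 3), ('alice', 2)]
```

Because the result is a list, it can be sliced, which is what makes taking the first 5 entries possible.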

You can then print the 5 most common words by slicing that list:

for k, v in wordcount[:5]:
    print(k, v)

So the complete code is:

wordcount = {}

with open('alice.txt') as file:  # with closes the file automatically
    for word in file.read().split():
        if word not in wordcount:
            wordcount[word] = 1
        else:
            wordcount[word] += 1

wordcount = sorted(wordcount.items(), key=lambda x: x[1], reverse=True)

for k, v in wordcount[:5]:
    print(k, v)

Alternatively, here is a simpler way to do this using collections.Counter:

from collections import Counter

with open('alice.txt') as file:  # with closes the file automatically
    wordcount = Counter(file.read().split())

for k, v in wordcount.most_common(5):
    print(k, v)

The output is the same as in the first solution.
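One thing to be aware of with both solutions: split() only splits on whitespace, so "Rabbit", "rabbit," and "rabbit" are counted as three different words. If that matters for your file, a rough sketch of a normalized count could look like the following. The helper name top_words, the sample text, and the definition of a word as a run of letters or apostrophes are all assumptions made for this example, not part of the original answer.

```python
import re
from collections import Counter


def top_words(text, n=5):
    """Return the n most common words in text as (word, count) pairs."""
    # Assumption: a "word" is a run of letters or apostrophes,
    # compared case-insensitively.
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Hypothetical sample text instead of alice.txt, so the sketch runs on its own.
sample = "The Rabbit ran. The rabbit, the Queen and Alice ran!"

for word, count in top_words(sample, 1):
    print(word, count)  # the 3
```

To use it on the real file, you would pass file.read() as the text argument inside the with block.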

This question and answer come from a similar question on Stack Overflow: https://stackoverflow.com/questions/33384673/
