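python - The 10 most common words in a Python string

I need to display the 10 most common words in a text file, from most frequent to least frequent, along with the number of times each one is used. I cannot use a dictionary or the Counter function. So far I have this: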

import urllib
cnt = 0
i=0
txtFile = urllib.urlopen("http://textfiles.com/etext/FICTION/alice30.txt")
uniques = []
for line in txtFile:
    words = line.split()
    for word in words:
        if word not in uniques:
            uniques.append(word)
for word in words:
    while i<len(uniques):
        i+=1
        if word in uniques:
             cnt += 1
print cnt

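Now I think I should look up each word in the array "uniques", see how many times it is repeated in the file, and then add that to another array that counts the instances of each word. But this is where I am stuck. I don't know how to proceed.

Any help would be appreciated. Thanks.

Best answer

The problem above can be solved easily with Python's collections module. Here is the solution.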

from collections import Counter

data_set = (
    "Welcome to the world of Geeks "
    "This portal has been created to provide well written well "
    "thought and well explained solutions for selected questions "
    "If you like Geeks for Geeks and would like to contribute "
    "here is your chance You can write article and mail your article "
    " to contribute at geeksforgeeks org See your article appearing on "
    "the Geeks for Geeks main page and help thousands of other Geeks. "
)

# split() returns list of all the words in the string
split_it = data_set.split()

# Pass the split_it list to the Counter constructor.
Counters_found = Counter(split_it)
# print(Counters_found)

# most_common(k) returns the k most frequent words and their
# counts, most frequent first; the question asks for the top 10.
most_occur = Counters_found.most_common(10)
print(most_occur)
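
The question rules out dictionaries and Counter, so the idea it describes (a list of unique words plus a second list holding each word's count) may be closer to what is needed. Below is a minimal sketch of that approach using only lists. The function name top_words, its parameters, and the choice to lowercase words and strip surrounding punctuation are assumptions made for illustration; they do not come from the original post.

```python
import string


def top_words(lines, n=10):
    """Return the n most frequent words as (word, count) pairs, using only lists."""
    uniques = []  # each distinct word, in order of first appearance
    counts = []   # counts[i] is how many times uniques[i] has been seen

    for line in lines:
        for raw in line.split():
            # Assumption: case and surrounding punctuation are ignored,
            # so "The", "the" and "the." count as the same word.
            word = raw.strip(string.punctuation).lower()
            if not word:
                continue
            if word in uniques:
                counts[uniques.index(word)] += 1
            else:
                uniques.append(word)
                counts.append(1)

    # Sort the positions by count, highest first. The sort is stable,
    # so tied words keep their order of first appearance.
    order = sorted(range(len(uniques)), key=lambda i: -counts[i])
    return [(uniques[i], counts[i]) for i in order[:n]]


sample = ["the cat and the hat", "The cat sat."]
print(top_words(sample, 3))  # [('the', 3), ('cat', 2), ('and', 1)]
```

Any iterable of text lines can be passed in. In Python 3, the file from the question could be opened with urllib.request.urlopen, with each line decoded to str before it is passed to the function. Looking words up in a list is linear in the number of unique words, so this is slower than Counter on large texts, but it stays within the stated constraints.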

Regarding python - The 10 most common words in a Python string, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/27327303/
