python - Counting the occurrences of words in a file

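I want to use a dictionary to count how many times each word occurs in a file (all words in the file are lowercase and the file contains no punctuation).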

I'd like to optimize my code, because I know the intermediate list costs unnecessary time.

def create_dictionary(filename):
    d = {}
    flat_list = []
    with open(filename, "r") as fin:
        # First pass: collect every word of the file into one flat list
        for line in fin:
            for word in line.split():
                flat_list.append(word)
        # Second pass: count each word from the list
        for i in flat_list:
            if d.get(i, 0) == 0:
                d[i] = 1
            else:
                d[i] += 1

    return d

For example, given a file containing:

i go to the market to buy some things to 
eat and drink because i want 
to eat and drink

it should return:

{'i': 2, 'go': 1, 'to': 4, 'the': 1, 'market': 1, 'buy': 1, 'some': 1, 'things': 1, 'eat': 2, 'and': 2, 'drink': 2, 'because': 1, 'want': 1}

What can I improve?

Best answer

Just use collections.Counter:

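from collections import Counter

with open(filename, "r") as fin:
    # split() with no arguments splits on any whitespace, including newlines
    print(Counter(fin.read().split()))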
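
If the function should still return a dictionary rather than print, the same idea might be wrapped as in the sketch below. The names count_words and count_words_plain are hypothetical and not part of the original answer. The first variant feeds a Counter one line at a time, so the whole file never has to be held in memory; the second keeps a plain dict, as in the question, but drops the intermediate list.

```python
from collections import Counter


def count_words(filename):
    """Count whitespace-separated words, reading the file one line at a time."""
    counts = Counter()
    with open(filename, "r") as fin:
        for line in fin:
            # Counter.update adds one to the count of each word in the iterable
            counts.update(line.split())
    # Convert to a plain dict to match what create_dictionary returns
    return dict(counts)


def count_words_plain(filename):
    """Same result with a plain dict and no intermediate list."""
    counts = {}
    with open(filename, "r") as fin:
        for line in fin:
            for word in line.split():
                # get() supplies 0 for a word seen for the first time
                counts[word] = counts.get(word, 0) + 1
    return counts
```

Both variants make a single pass over the file, and either should be a drop-in replacement for create_dictionary under the question's assumptions (lowercase words, no punctuation).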

Regarding python - Counting the occurrences of words in a file, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/59661586/
