python - Counting how often each letter occurs in a text file


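In Python, how can I go through a text file and count the number of occurrences of each letter? I realise I could just use a "for x in file" statement to walk through it and then set up 26 or so if/elif statements, but surely there is a better way to do it?

Thanks.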

Best answer

Use collections.Counter():

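from collections import Counter

c = Counter()
with open(file) as f:       # file: path of the text file
    for line in f:          # read line by line to keep memory use low
        c.update(line)      # add the count of every character in this line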

If the file is not too large, you can read it into memory as a single string and convert it into a Counter object in one line:

with open(file) as f:
    c = Counter(f.read())

Example:

>>> c = Counter()
>>> c += Counter('aaabbbcccddd eee fff ggg')
>>> c
Counter({'a': 3, ' ': 3, 'c': 3, 'b': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
>>> c += Counter('aaabbbccc')
>>> c
Counter({'a': 6, 'c': 6, 'b': 6, ' ': 3, 'e': 3, 'd': 3, 'g': 3, 'f': 3})
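
Note that a Counter built this way counts every character, including spaces and newlines, as the ' ' entry in the example shows. If only letters are wanted, one option is to filter the characters before counting. The following is a minimal sketch under a few assumptions: the helper name count_letters is hypothetical, "letter" is taken to mean the 26 ASCII letters, and uppercase and lowercase are counted together.

```python
from collections import Counter
from string import ascii_lowercase

LETTERS = set(ascii_lowercase)  # assumption: only a-z count as letters


def count_letters(lines):
    """Count ASCII letters, ignoring case, in an iterable of strings.

    Hypothetical helper: an open file object works as `lines`,
    and so does a plain list of strings.
    """
    counts = Counter()
    for line in lines:
        # Lowercase the line, then keep only the characters in a-z
        counts.update(ch for ch in line.lower() if ch in LETTERS)
    return counts


counts = count_letters(["Hello, World!\n", "abc"])
print(counts.most_common(2))  # [('l', 3), ('o', 2)]
```

For a real file, the same helper could be called as count_letters(f) inside a with open(...) as f: block.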

Or use the count() method of strings:

from string import ascii_lowercase     # ascii_lowercase == 'abcdefghijklmnopqrstuvwxyz'

with open(file) as f:
    text = f.read()
# text.count() scans the whole text once per letter; uppercase letters are not counted
dic = {x: text.count(x) for x in ascii_lowercase}
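
The two approaches give the same counts for the letters that occur, but they differ in how absent letters are reported. The short comparison below uses a made-up sample string instead of a file; the variable names by_count and by_counter are only for illustration.

```python
from collections import Counter
from string import ascii_lowercase

text = "aaabbbccc ddd"  # sample input, standing in for the file contents

# count() approach: one entry for each of the 26 letters, zeros included
by_count = {letter: text.count(letter) for letter in ascii_lowercase}

# Counter approach: a single pass, storing only the letters that occur
by_counter = Counter(ch for ch in text if ch in ascii_lowercase)

print(by_count["z"])      # 0, stored explicitly in the dict
print("z" in by_counter)  # False, although by_counter["z"] also evaluates to 0

# The non-zero counts are identical
assert {k: v for k, v in by_count.items() if v} == dict(by_counter)
```

The count() version is convenient when a full a-z table is needed, while Counter avoids rescanning the text once per letter.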

This question about counting how often each letter occurs in a text file in Python comes from a similar question on Stack Overflow: https://stackoverflow.com/questions/12342207/
