python-3.x - Trying to find the frequency of words. Is there a way to count a single letter as its own word?

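I'm trying to count how often each word is used. If I enter "hi my name is nick", it gives me a count for every word. I followed the book, but when I try something like "i am high as a kite", I get 3 for both i and a. Is there a way to count i and a only when they appear as words on their own?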

txt = "i am high as a kite"

x = txt.split(" ")

for num_of_instances in x:
    count = txt.count(num_of_instances)
    print(num_of_instances, count)

Best answer

Just use:

x.count(num_of_instances)

instead of:

txt.count(num_of_instances)
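The reason this works is that str.count counts substring matches, while list.count counts only elements equal to its argument. A minimal sketch using the sentence from the question (the variable names words, substring_hits and word_hits are just illustrative):

```python
txt = "i am high as a kite"
words = txt.split(" ")

# str.count looks for the substring anywhere in the text,
# so "i" is also found inside "high" and "kite"
substring_hits = txt.count("i")

# list.count only counts list elements that are equal to "i"
word_hits = words.count("i")

print(substring_hits, word_hits)
```

The first number is the inflated count from the question; the second is the count of i as a standalone word.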

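Even so, the loop still prints a line for every occurrence of a repeated word in a sentence such as "to be or not to be" (be and to are each reported twice). It is better to use a set to remove the duplicates (although you lose the order in which the words appear):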

txt = "to be or not to be"

x = txt.split(" ")

for num_of_instances in set(x):
    count = x.count(num_of_instances)
    print(num_of_instances, count)

Output (the order may change each time the code runs):

be 2
to 2
not 1
or 1
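If the order of first appearance matters, one possible alternative to set is dict.fromkeys, which removes duplicates while keeping insertion order. This is a sketch that assumes Python 3.7 or later, where dicts are guaranteed to preserve insertion order; the names unique_in_order and counts are illustrative only:

```python
txt = "to be or not to be"
words = txt.split(" ")

# dict.fromkeys keeps one key per distinct word, in order of first appearance
unique_in_order = list(dict.fromkeys(words))

# count each distinct word once, against the list of words
counts = [(word, words.count(word)) for word in unique_in_order]

for word, count in counts:
    print(word, count)
```

Here the words come out in the order to, be, or, not on every run.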

Better still, use a Counter object:

from collections import Counter
txt = "to be or not to be"
x = Counter(txt.split(" "))

for word, count in x.items():
    print(word, count)

Output:

to 2
be 2
or 1
not 1
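Counter also offers most_common for ranking words by frequency. The sketch below goes slightly beyond the answer: it assumes you also want to ignore case and punctuation, which it does with a simple regular expression (the sample sentence and the pattern are illustrative choices, not part of the original question):

```python
import re
from collections import Counter

txt = "To be, or not to be: that is the question."

# lower-case the text and keep only runs of letters and apostrophes,
# so "To" and "to" match and "be," is counted as "be"
words = re.findall(r"[a-z']+", txt.lower())

freq = Counter(words)

# the two most frequent words, highest count first
top_two = freq.most_common(2)
print(top_two)
```

Whether this normalisation is appropriate depends on the text; plain split(" ") as in the answer is enough when the input is already clean.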

Regarding "python-3.x - Trying to find the frequency of words. Is there a way to count a single letter as its own word?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56483437/
