Python: counting repeated words

Tags: python python-3.x count duplicates

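I have a problem where I have to count the repeated words in Python (v3.4.1) and put them in a sentence. I used a counter, but I don't know how to get the output in the following order. The input is: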

mysentence = As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality

I put it into a list and sorted it.

The output should look like this:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

This is as far as I have gotten:

x = input('Enter your sentence: ')
y = x.split()
y.sort()
for word in y:
    print(word)

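Best answer

I can see where you are going with the sort: once the words are sorted, you can reliably tell when you have hit a new word and keep a running count for each unique word. However, what you really want is a hash (a dictionary) to keep track of the counts, because dictionary keys are unique. For example: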

words = sentence.split()
counts = {}
for word in words:
    if word not in counts:
        counts[word] = 0
    counts[word] += 1
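As a minimal sketch of the same idea, the loop can be wrapped in a function so it runs on its own. The name `count_words` and the short test sentence are made up for illustration, and `dict.get` is used here in place of the explicit `if word not in counts` check:

```python
def count_words(sentence):
    """Return a plain dict mapping each word to how often it occurs."""
    counts = {}
    for word in sentence.split():
        # dict.get returns 0 for a word that has not been seen yet
        counts[word] = counts.get(word, 0) + 1
    return counts


counts = count_words("as far as they are certain")
print(counts["as"])   # 2
print(counts["far"])  # 1
```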

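That gives you a dictionary where each key is a word and each value is the number of times it appears. You can simplify this with collections.defaultdict(int), which starts every missing key at 0 so you can just add to the value: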

counts = collections.defaultdict(int)
for word in words:
    counts[word] += 1
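A small self-contained sketch of how `defaultdict(int)` behaves may help here. The sample sentence is an assumption chosen only for illustration. One thing to keep in mind is that merely looking up a missing key also creates it:

```python
import collections

counts = collections.defaultdict(int)
for word in "to be or not to be".split():
    counts[word] += 1  # a missing key starts at int() == 0

print(counts["to"])       # 2
print("maybe" in counts)  # False
print(counts["maybe"])    # 0, and this lookup also creates the key
print("maybe" in counts)  # True
```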

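But there is something even better than that: collections.Counter, which takes your list of words and turns it into a dictionary of counts (it is actually a subclass of dict).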

counts = collections.Counter(words)
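To illustrate what "an extension of a dictionary" means in practice, here is a short sketch using an assumed sample sentence. It shows that a `Counter` is a `dict`, that a missing word counts as 0, and that `most_common` is available on top of the usual dictionary methods:

```python
import collections

words = "as far as they are certain as far".split()
counts = collections.Counter(words)

print(isinstance(counts, dict))  # True
print(counts["as"])              # 3
print(counts["missing"])         # 0, no KeyError and no key is added
print(counts.most_common(1))     # [('as', 3)]
```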

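From there you want the words in sorted order together with their counts so you can print them. items() gives you the (word, count) tuples (a list in Python 2, a view in Python 3), and sorted sorts them by the first item of each tuple by default (the word in this case), which is exactly what you want.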
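The default ordering also explains why "As" comes before "are" in the expected output: strings compare by code point, so uppercase letters sort ahead of lowercase ones. The sketch below uses a hand-written list of pairs (an assumption, not the real counts) and shows an optional `key` function for anyone who would rather ignore case, which the question does not ask for:

```python
pairs = [("as", 3), ("As", 1), ("are", 2)]

# Default: tuples compare by their first item, uppercase before lowercase
print(sorted(pairs))
# [('As', 1), ('are', 2), ('as', 3)]

# Optional alternative: order by the lowercased word instead
by_lowercase = sorted(pairs, key=lambda pair: pair[0].lower())
print(by_lowercase)
# [('are', 2), ('as', 3), ('As', 1)]
```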

import collections
sentence = """As far as the laws of mathematics refer to reality they are not certain as far as they are certain they do not refer to reality"""
words = sentence.split()
word_counts = collections.Counter(words)
for word, count in sorted(word_counts.items()):
    print('"%s" is repeated %d time%s.' % (word, count, "s" if count > 1 else ""))

Output:

"As" is repeated 1 time.
"are" is repeated 2 times.
"as" is repeated 3 times.
"certain" is repeated 2 times.
"do" is repeated 1 time.
"far" is repeated 2 times.
"laws" is repeated 1 time.
"mathematics" is repeated 1 time.
"not" is repeated 2 times.
"of" is repeated 1 time.
"reality" is repeated 2 times.
"refer" is repeated 2 times.
"the" is repeated 1 time.
"they" is repeated 3 times.
"to" is repeated 2 times.

Regarding Python repeated words, we found a similar question on Stack Overflow: https://stackoverflow.com/questions/25798674/
