python - Counting the number of unique words in a list

Using the following code from https://stackoverflow.com/a/11899925, I can find out whether a word is unique (by checking whether it is used once or more than once):

helloString = ['hello', 'world', 'world']
count = {}
for word in helloString:
    if word in count:
        count[word] += 1
    else:
        count[word] = 1

However, if I had a string containing hundreds of words, how would I count the number of unique words in that string?

For example, my code has:

uniqueWordCount = 0
helloString = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']
count = {}
# Iterate over helloString; the name `words` was never defined
for word in helloString:
    if word in count:
        count[word] += 1
    else:
        count[word] = 1

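How would I set uniqueWordCount to 6? Usually I'm really good at solving these kinds of algorithmic puzzles, but I haven't managed to figure this one out. I feel like it's right under my nose.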

Best answer

The best way to solve this is to use the set collection type. A set is a collection in which all elements are unique. Therefore:

unique = set(['one', 'two', 'two'])
len(unique)  # is 2

You can use a set from the very beginning, adding words to it as you go:

unique.add('three')

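This discards any duplicates as they are added. Alternatively, you can collect all the elements in a list and pass the list to the set() function, which removes the duplicates at that point. The example I gave above shows this pattern: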

unique = set(['one', 'two', 'two'])
unique.add('three')

# unique now contains {'one', 'two', 'three'}
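
Applied to the list from the question, the set approach could look like the sketch below. Note that a set counts every distinct word, so 'world' is included and the result is 7. The question asks for 6, which matches the number of words that occur exactly once, so the sketch also shows that count using collections.Counter from the standard library. The names words, distinct_count and once_only_count are made up for illustration, and which of the two numbers is wanted depends on what "unique" is taken to mean.

```python
from collections import Counter

words = ['hello', 'world', 'world', 'how', 'are', 'you', 'doing', 'today']

# Distinct words: each word is counted once, no matter how often it repeats.
distinct_count = len(set(words))

# Words that occur exactly once: 'world' is excluded because it appears twice.
counts = Counter(words)
once_only_count = sum(1 for n in counts.values() if n == 1)

print(distinct_count)   # 7
print(once_only_count)  # 6
```

Counter builds the same word-to-count mapping as the loop in the question, so the second count can also be taken from the existing count dictionary.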
