python - numpy - How to count occurrences of items in a nested list by index?

Hi, I would like to be able to count the occurrences of items from a list, by index, within a nested list.

If my list is

keys = ['One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight',
        'Nine', 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen', 'Fifteen']

and my nested list looks like this:

[['Three' 'One' 'Ten']
 ['Three' 'Five' 'Nine']
 ['Two' 'Five' 'Three']
 ['Two' 'Three' 'Eight']
 ['One' 'Three' 'Nine']]

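What I want to know is, for each item, how many times it occurs at each index: for example, how many times 'One' appears at index 0, and so on.

I am building the lists with numpy arrays and creating the output from weighted random numbers. I would like to run a test over 1000 lists and count the occurrences by index, to determine how changes I make elsewhere in my program affect the final result.

I found examples such as https://stackoverflow.com/a/10741692/461887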

import numpy as np
x = np.array([1,1,1,2,2,2,5,25,1,1])
y = np.bincount(x)
ii = np.nonzero(y)[0]
list(zip(ii, y[ii]))  # list() is needed on Python 3, where zip returns an iterator
# [(1, 5), (2, 3), (5, 1), (25, 1)]

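But this doesn't seem to work with nested lists. I also looked through the numpy cookbook (indexing) and at histogram and digitize in the example list, but I can't seem to find a function that does this.

Updated to include example output data:

Assuming a nested list 100 lists deep: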

{'One': 19, 'Two': 16, 'Three': 19, 'Four': 11, 'Five': 7, 'Six': 8, 'Seven': 4, 'Eight': 3,
            'Nine': 5, 'Ten': 1, 'Eleven': 2, 'Twelve': 1, 'Thirteen': 1, 'Fourteen': 3, 'Fifteen': 0}

Or, as in treddy's example:

array([19, 16, 19, 11, 7, 8, 4, 3, 5, 1, 2, 1, 1, 3, 0])

Best answer

It would be better to add the output you want to get to your example, but for now it looks like collections.Counter will do the job:

>>> data = [['Three','One','Ten'],
...  ['Three','Five','Nine'],
...  ['Two','Five','Three'],
...  ['Two','Three','Eight'],
...  ['One','Three','Nine']]
... 
>>> 
>>> from collections import Counter
>>> [Counter(x) for x in data]
[Counter({'Three': 1, 'Ten': 1, 'One': 1}), Counter({'Nine': 1, 'Five': 1, 'Three': 1}), Counter({'Five': 1, 'Two': 1, 'Three': 1}), Counter({'Eight': 1, 'Two': 1, 'Three': 1}), Counter({'Nine': 1, 'Three': 1, 'One': 1})]

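Update:

Now that you have given the desired output, I think the idea is to flatten the list, use Counter to count the occurrences, and then build a dictionary (or an OrderedDict, if order matters to you):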

>>> from collections import Counter, OrderedDict
>>> c = Counter(e for l in data for e in l)
>>> c
Counter({'Three': 5, 'Two': 2, 'Nine': 2, 'Five': 2, 'One': 2, 'Ten': 1, 'Eight': 1})

Or, if you only need the first entry of each list:

>>> c = Counter(l[0] for l in data)
>>> c
Counter({'Three': 2, 'Two': 2, 'One': 1})
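
To extend this from the first entry to every position, one possible sketch is to transpose the rows with zip(*data) and build one Counter per column. The name per_index is only illustrative, and the sketch assumes all inner lists have the same length (zip stops at the shortest one).

```python
from collections import Counter

data = [['Three', 'One', 'Ten'],
        ['Three', 'Five', 'Nine'],
        ['Two', 'Five', 'Three'],
        ['Two', 'Three', 'Eight'],
        ['One', 'Three', 'Nine']]

# zip(*data) transposes the rows, so each tuple holds all items at one index.
# per_index is a hypothetical name: per_index[i] counts the items at index i.
per_index = [Counter(column) for column in zip(*data)]

print(per_index[0]['One'])    # 1: 'One' appears once at index 0
print(per_index[1]['Three'])  # 2: 'Three' appears twice at index 1
```

Because a Counter returns 0 for missing keys, per_index[i][key] can be queried for any key without checking membership first.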

Plain dictionary:

>>> {x:c[x] for x in keys}
{
    'Twelve': 0, 'Seven': 0,
    'Ten': 1, 'Fourteen': 0,
    'Nine': 2, 'Six': 0,
    'Three': 5, 'Two': 2,
    'Four': 0, 'Eleven': 0,
    'Five': 2, 'Thirteen': 0,
    'Eight': 1, 'One': 2, 'Fifteen': 0
}

Or an OrderedDict:

>>> OrderedDict((x, c[x]) for x in keys)
OrderedDict([('One', 2), ('Two', 2), ('Three', 5), ('Four', 0), ('Five', 2), ('Six', 0), ('Seven', 0), ('Eight', 1), ('Nine', 2), ('Ten', 1), ('Eleven', 0), ('Twelve', 0), ('Thirteen', 0), ('Fourteen', 0), ('Fifteen', 0)])
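
If the counts are wanted as a plain sequence in the order of keys, as in the array form shown in the question, a list comprehension over keys is one possible sketch. The name counts_in_key_order is illustrative only. On Python 3.7 and later a regular dict also preserves insertion order, so the plain dictionary would come out in keys order there as well.

```python
from collections import Counter

keys = ['One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven', 'Eight',
        'Nine', 'Ten', 'Eleven', 'Twelve', 'Thirteen', 'Fourteen', 'Fifteen']

data = [['Three', 'One', 'Ten'],
        ['Three', 'Five', 'Nine'],
        ['Two', 'Five', 'Three'],
        ['Two', 'Three', 'Eight'],
        ['One', 'Three', 'Nine']]

# Flatten the nested list and count every item, regardless of position.
c = Counter(item for row in data for item in row)

# Hypothetical name: one count per key, in the same order as keys.
counts_in_key_order = [c[k] for k in keys]
print(counts_in_key_order)
# [2, 2, 5, 0, 2, 0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
```

Wrapping the list in np.array(...) would give the numpy form, if that is what the rest of the program expects.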

And, just in case: if you don't need the explicit zero entries, you can use the Counter directly to look up the number of occurrences:

>>> c['Nine']   # Key is in the Counter, returns number of occurrences
2
>>> c['Four']   # Key is not in the Counter, returns 0
0
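
Since the question builds its lists as numpy arrays, a numpy-only variant may also be useful. This is a sketch rather than part of the original answer: it compares the array against keys using broadcasting and sums the matches, and the names arr, keys_arr and counts are assumptions chosen for illustration.

```python
import numpy as np

keys_arr = np.array(['One', 'Two', 'Three', 'Four', 'Five', 'Six', 'Seven',
                     'Eight', 'Nine', 'Ten', 'Eleven', 'Twelve', 'Thirteen',
                     'Fourteen', 'Fifteen'])

arr = np.array([['Three', 'One', 'Ten'],
                ['Three', 'Five', 'Nine'],
                ['Two', 'Five', 'Three'],
                ['Two', 'Three', 'Eight'],
                ['One', 'Three', 'Nine']])

# arr[:, :, None] has shape (rows, positions, 1); comparing it with keys_arr
# broadcasts to (rows, positions, n_keys). Summing over the rows axis gives
# counts[i, j] = number of rows in which keys_arr[j] sits at index i.
counts = (arr[:, :, None] == keys_arr).sum(axis=0)

print(counts[0].tolist())
# [1, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
```

The intermediate boolean array grows as rows x positions x keys, so for very large inputs the Counter approach may be lighter on memory.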

Regarding "python - numpy - How to count occurrences of items in a nested list by index?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/20158579/
