python - Find duplicates in a list of lists, except under certain conditions

Tags: python python-2.7

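I have a list of task types and the projects associated with them. There are 4 task types in total. I want to generate a list of projects that have more than one task type, unless the project has only certain pairs of task types. I have figured out how to get the list of projects with more than one task, but I don't know how to leave out the excluded combinations.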

Pairs to exclude from the output: (Task Type 1, Task Type 4), (Task Type 3, Task Type 4)

If a project has an excluded pair in addition to other task types, it should still be included in the output.

Input:

my_list = [['Task Type 1', 'Project 1'],['Task Type 2', 'Project 1'],['Task Type 4', 'Project 1'],
          ['Task Type 3', 'Project 2'],['Task Type 4', 'Project 2'],
          ['Task Type 1', 'Project 3'],['Task Type 1', 'Project 3'],
          ['Task Type 4', 'Project 4']]

Starting code:

from collections import Counter
my_project_list = zip(*my_list)[1]
cnt = Counter(my_project_list)
my_duplicate_list = [k for k, v in cnt.iteritems() if v > 1]
print my_duplicate_list
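
The starting code above is Python 2.7 only: `zip(*my_list)[1]`, `iteritems()` and the `print` statement all fail on Python 3. For readers on Python 3, a rough equivalent might look like the sketch below. The generator expression and the `sorted` call are choices made here for illustration and are not part of the original question.

```python
from collections import Counter

my_list = [['Task Type 1', 'Project 1'], ['Task Type 2', 'Project 1'], ['Task Type 4', 'Project 1'],
           ['Task Type 3', 'Project 2'], ['Task Type 4', 'Project 2'],
           ['Task Type 1', 'Project 3'], ['Task Type 1', 'Project 3'],
           ['Task Type 4', 'Project 4']]

# Count how many rows mention each project (the second element of each row).
cnt = Counter(project for _task_type, project in my_list)

# Keep projects that appear more than once; sorted only to get a stable order.
my_duplicate_list = sorted(k for k, v in cnt.items() if v > 1)
print(my_duplicate_list)  # ['Project 1', 'Project 2', 'Project 3']
```

Like the original, this still reports Project 2, whose task types form an excluded pair; removing it is the part the question asks about.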

Expected output:

['Project 1', 'Project 3']

Best answer

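Here is one approach:

First, we create a mapping from each project to the list of its task types.

Then we create a filter that takes a list of rules and returns only the projects that have at least a minimum number of task entries (2 by default) and match none of the rules.

Here is the full code with details (thanks to @DSM for the fix):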

#!/usr/bin/env python
from collections import defaultdict

my_list = [
    ['Task Type 1', 'Project 1'],
    ['Task Type 2', 'Project 1'],
    ['Task Type 4', 'Project 1'],
    ['Task Type 3', 'Project 2'],
    ['Task Type 4', 'Project 2'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 4', 'Project 4']
]

# build the mapping keyed by the value we filter on:
# in our case, project -> its task types
projects_to_types = defaultdict(list)
for x in my_list:
    projects_to_types[x[1]].append(x[0])

# sort each list of types so that two lists holding
# the same types compare equal regardless of the
# order in which the rows appeared
projects_to_types = {k:sorted(v) for k, v in projects_to_types.iteritems()}

# a function to create a filter over a mapping
# like the one we created, the filter is a generator
def rules_filter_generator(original):
    # take a list of rules and filter out keys whose
    # values match any rule
    def filter_restricted(rules, minimum_length=2):
        # a set will give us better, more readable and faster code.
        # convert to tuples since list isn't hashable (mutable).
        rule_set = set(map(lambda x: tuple(sorted(x)), rules))
        for k, v in original.iteritems():
            if len(v) >= minimum_length and tuple(v) not in rule_set:
                yield k
    return filter_restricted

# use the filter specifically on the mapping we've created
generator = rules_filter_generator(projects_to_types)

# test (consume the generator to a list)
print list(generator([
    ['Task Type 3', 'Task Type 4'],
    ['Task Type 3', 'Task Type 3', 'Task Type 4']
]))

# prints a list containing 'Project 1' and 'Project 3'
# (the order may vary, since dict order is arbitrary in Python 2.7)
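
The answer's code targets Python 2.7 and covers repeated task types by adding an extra rule (`['Task Type 3', 'Task Type 3', 'Task Type 4']`). As a possible alternative, the sketch below runs on Python 3 and compares the set of distinct task types against the excluded pairs listed in the question, so no extra rule is needed for repeats. The names `EXCLUDED_PAIRS` and `projects_to_report` are hypothetical, introduced here for illustration. It also assumes that a project whose distinct types are exactly an excluded pair should be dropped even when one of those types is listed more than once.

```python
from collections import defaultdict

my_list = [
    ['Task Type 1', 'Project 1'],
    ['Task Type 2', 'Project 1'],
    ['Task Type 4', 'Project 1'],
    ['Task Type 3', 'Project 2'],
    ['Task Type 4', 'Project 2'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 1', 'Project 3'],
    ['Task Type 4', 'Project 4'],
]

# Hypothetical name: the excluded pairs from the question, as frozensets
# so that order and repetition inside a pair do not matter.
EXCLUDED_PAIRS = {
    frozenset(['Task Type 1', 'Task Type 4']),
    frozenset(['Task Type 3', 'Task Type 4']),
}


def projects_to_report(rows, excluded=EXCLUDED_PAIRS, minimum_length=2):
    """Return projects with enough rows whose distinct types are not an excluded pair."""
    # Map each project to every task type listed for it (repeats are kept).
    types_by_project = defaultdict(list)
    for task_type, project in rows:
        types_by_project[project].append(task_type)

    result = []
    for project, types in types_by_project.items():
        # Fewer rows than the minimum: not a "duplicate" project.
        if len(types) < minimum_length:
            continue
        # Distinct types are exactly one of the excluded pairs: skip it.
        if frozenset(types) in excluded:
            continue
        result.append(project)
    # Sorted only to make the output order deterministic.
    return sorted(result)


print(projects_to_report(my_list))  # ['Project 1', 'Project 3']
```

Project 1 is kept because its types (1, 2 and 4) contain an excluded pair but are not limited to it. Project 3 is kept because it has two rows, even though both are the same task type, which matches the expected output in the question.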

This question and answer come from a similar question found on Stack Overflow: https://stackoverflow.com/questions/28180360/
