python - 带有生成器的 Python 中的 DFS 算法

标签 python algorithm generator depth-first-search

背景:

我正在做一个项目,需要编写一些文本处理规则。在这个项目上工作了几天并实现了一些规则后,我意识到我需要确定规则的顺序。没问题,我们有拓扑排序来帮忙。但后来我意识到我不能指望图表总是满的。所以我想出了这个想法,给定一个具有一组依赖项(或单个依赖项)的规则,我需要检查依赖项的依赖项。听起来很熟悉?是的。这个主题与图的深度优先搜索非常相似。
我不是数学家,也没有学过 C.S. 因此,图论对我来说是一个新领域。尽管如此,我还是实现了一些可行的东西(见下文)(我怀疑效率低下)。

代码:

这是我的搜索和 yield 算法。如果您在下面的示例中运行它,您会看到它多次访问某些节点。因此,推测效率低下。
关于输入的一句话。我写的规则基本上是 python 类,它有一个类属性 depends。我因未使用 inspect.getmro 而受到批评 - 但这会使事情变得非常复杂,因为类需要相互继承 (See example here)

def _yield_name_dep(rules_deps):
    global recursion_counter
    recursion_counter = recursion_counter +1 
    # yield all rules by their named and dependencies
    for rule, dep in rules_deps.items():
        if not dep:
            yield rule, dep
            continue
        else:
            yield rule, dep
            for ii in dep:
                i = getattr(rules, ii)
                instance = i()
                if instance.depends:
                    new_dep={str(instance): instance.depends}
                    for dep in _yield_name_dep(new_dep):
                        yield dep    
                else:
                    yield str(instance), instance.depends

好的,现在您已经查看了代码,下面是您可以测试的一些输入:

demo_class_content ="""
class A(object):
    depends = ('B')
    def __str__(self):
        return self.__class__.__name__

class B(object):
    depends = ('C','F')
    def __str__(self):
        return self.__class__.__name__

class C(object):
    depends = ('D', 'E')
    def __str__(self):
        return self.__class__.__name__

class D(object):
    depends = None
    def __str__(self):
        return self.__class__.__name__   

class F(object):
    depends = ('E')
    def __str__(self):
        return self.__class__.__name__

class E(object):
    depends = None  
    def __str__(self):
        return self.__class__.__name__
"""       

with open('demo_classes.py', 'w') as clsdemo:
    clsdemo.write(demo_class_content)

import demo_classes as rules

rule_start={'A': ('B')}

def _yield_name_dep(rules_deps):
    # yield all rules by their named and dependencies
    for rule, dep in rules_deps.items():
        if not dep:
            yield rule, dep
            continue
        else:
            yield rule, dep
            for ii in dep:
                i = getattr(rules, ii)
                instance = i()
                if instance.depends:
                    new_dep={str(instance): instance.depends}
                    for dep in _yield_name_dep(new_dep):
                        yield dep    
                else:
                    yield str(instance), instance.depends

if __name__ == '__main__':
    # this is yielding nodes visited multiple times, 
    # list(_yield_name_dep(rule_start))
    # hence, my work around was to use set() ...
    rule_dependencies = list(set(_yield_name_dep(rule_start)))
    print rule_dependencies

问题:

  • 我尝试对我的工作进行分类,我认为我所做的与 DFS 类似。真的可以这样分类吗?
  • 如何改进此功能以跳过已访问的节点,并仍然使用生成器?

更新:

为了省去运行代码的麻烦,上面函数的输出是:

>>> print list(_yield_name_dep(rule_wd))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E'), ('E', None)]
>>> print list(set(_yield_name_dep(rule_wd)))
[('B', ('C', 'F')), ('E', None), ('D', None), ('F', 'E'), ('C', ('D', 'E')), ('A', 'B')]

虽然我想出了一个更好的解决方案,但上面的问题仍然存在。所以请随意批评我的解决方案:

visited = []
def _yield_name_dep_wvisited(rules_deps, visited):
    # yield all rules by their name and dependencies
    for rule, dep in rules_deps.items():
        if not dep and rule not in visited:
            yield rule, dep
            visited.append(rule)
            continue
        elif rule not in visited:
            yield rule, dep
            visited.append(rule)
            for ii in dep:
                i = getattr(grules, ii)
                instance = i()
                if instance.depends:
                    new_dep={str(instance): instance.depends}
                    for dep in _yield_name_dep_wvisited(new_dep, visited):
                        if dep not in visited:
                            yield dep    
                    
                elif str(instance) not in visited:
                    visited.append(str(instance))
                    yield str(instance), instance.depends

上面的输出是:

>>>list(_yield_name_dep_wvisited(rule_wd, visited))
[('A', 'B'), ('B', ('C', 'F')), ('C', ('D', 'E')), ('D', None), ('E', None), ('F', 'E')]

所以你现在可以看到,节点 E 只被访问过一次。

最佳答案

根据 Gareth 和其他 Stackoverflow 用户的反馈,我得出了以下结论。它更清晰,也更通用:

def _dfs(start_nodes, rules, visited):
    """
    Depth First Search
    start_nodes - Dictionary of Rule with dependencies (as Tuples):    

        start_nodes = {'A': ('B','C')}

    rules - Dictionary of Rules with dependencies (as Tuples):
    e.g.
    rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'), 
             'D':(), 'E':(), 'F':()}
    The above rules describe the following DAG:

                    A
                   / \
                  B   C
                 / \ / \
                D   E   F
    usage:
    >>> rules = {'A':('B','C'), 'B':('D','E'), 'C':('E','F'), 
                 'D':(), 'E':(), 'F':()}
    >>> visited = []
    >>> list(_dfs({'A': ('B','C')}, rules, visited))
    [('A', ('B', 'C')), ('B', ('D', 'E')), ('D', ()), ('E', ()), 
    ('C', ('E', 'F')), ('F', ())]
    """

    for rule, dep in start_nodes.items():
        if rule not in visited:
            yield rule, dep
            visited.append(rule)
            for ii in dep:
                new_dep={ ii : rules[ii]}
                for dep in _dfs(new_dep, rules, visited):
                    if dep not in visited:
                        yield dep

关于python - 带有生成器的 Python 中的 DFS 算法,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/19358278/

相关文章:

c++ - 无法在迷宫中追踪灰色细胞

java - 如何将公共(public)元素组合到一个数组中?

c++ - 是否有任何信号处理算法可以逆向工程如何通过人类的发声系统产生声波?

Python 生成器的 'yield' 在单独的函数中

iterator - 在 python 中快速迭代可迭代对象(不是列表)的前 n 项

Python:根据另一列和行的条件函数创建新列

c# - 为 Microsoft Natural Ergonomic Desktop 7000 上的缩放按钮编写新功能

python - 如何从 pandas 数据框中的 bool 和 float 列计算新的矢量化列?

python - 取消注释 `if False: yield` 更改 `__iter__` 行为

python - 在Python中延迟解析有状态的、每记录多行的数据流?