algorithm - How to represent a mapping between two trees in Haskell?

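I'm trying to implement a tree-processing algorithm in Haskell, and (because this is my first Haskell program!) I'm struggling with the design of the data structures. Can any of the FP gurus out there lend a hand?

I'll start by describing the important characteristics of the algorithm, sketch out how I would approach it in an imperative language, and finish with the stumbling baby steps I have made so far in Haskell.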

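The problem

I won't describe the full algorithm in detail, but the salient points are as follows:

The algorithm operates on two rose trees, X and Y.

The first phase of the algorithm computes some derived properties for each node, based on its label and attributes and those of its descendants.

These derived properties are used to compute a partial mapping between the two trees, such that a node in X may be associated with a node in Y and vice versa. Because the mapping is partial, any node in X or Y may either be mapped (i.e. have a partner in the other tree) or be unmapped.

The final phase of the algorithm refines these mappings through a sequence of operations that inspect the parent, children and siblings of mapped nodes.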

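The data structures must therefore have the following characteristics:

Given a reference to a node, provide access to the parent of that node, the siblings of that node, and the children of that node.

Given a node in an input tree, permit annotation of that node with additional information (derived properties, and an optional reference to a node in the other tree).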

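Sketch of an imperative solution

If I were to implement this algorithm in an imperative language, the solution would look something like the following.

Let's assume that the starting point is the following definition of the input tree: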

struct node {
    // Identifier for this node, unique within the containing tree
    size_t id;

    // Label of this node
    enum label label;

    // Attributes of this node
    // An attribute can be assumed to be a key-value pair
    // Details of the attributes themselves aren't material to this
    // discussion, so the "attribute" type is left opaque
    struct attribute **attributes;
    size_t n_attributes;

    // Pointer to parent of this node
    // NULL iff this node is root
    struct node *parent;

    // Pointer to first child of this node
    // NULL iff this node is leaf
    struct node *child;

    // Double-linked list of siblings of this node
    struct node *prev;
    struct node *next;
};

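The pointers embedded in each node directly support the up/down/left/right traversals required by the algorithm.

Annotation can be implemented by defining the following structure: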

struct algo_node {
    // Pointer to input node which has been wrapped
    struct node *node;

    // Derived properties computed by first phase of the algorithm
    // Details of the properties themselves aren't material to this
    // discussion, so the "derived" type is left opaque
    struct derived props;

    // Pointer to corresponding node in the other tree
    // NULL iff this node is unmatched
    struct node *match;
};

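The first phase of the algorithm constructs an algo_node for each node in each of the input trees.

Mapping from an algo_node to a node is trivial: follow the embedded *node pointer. Mapping in the other direction can be supported by storing the algo_nodes in an array, indexed by the id of the input node.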

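This is of course just one possible implementation. Many variations are possible, including

Abstracting the linked list of children behind a list or queue interface, rather than storing three raw pointers

Instead of associating the input tree with the algorithm tree via indices, encoding the parent/child/sibling relationships directly in struct algo_node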

Moving to Haskell

Let's start with the following definition of the input tree:

data Tree = Leaf Label Attributes
          | Node Label Attributes [Tree]

Augmenting each node with an id can be achieved as follows:

data AnnotatedTree = Tree Int

addIndex :: Int -> AnnotatedTree -> (AnnotatedTree, Int)

indexedTree = addIndex 0 tree
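As written, the snippet above is only a sketch (for instance, `data AnnotatedTree = Tree Int` declares a constructor named `Tree` rather than wrapping the `Tree` type). One possible runnable shape is shown below, under some assumptions that are not in the question: the tree type is parameterised over its annotation, a leaf is just a node with no children, `Label` and `Attributes` are placeholder type synonyms, and the counter is returned first so that `mapAccumL` can thread it through the children.

```haskell
import Data.List (mapAccumL)

-- Hypothetical stand-ins for the question's opaque types.
type Label = String
type Attributes = [(String, String)]

-- A rose tree whose nodes carry an annotation of type a.
-- A leaf is simply a node with an empty list of children.
data Tree a = Node a Label Attributes [Tree a]
  deriving (Show, Eq)

-- Thread a counter through the tree in pre-order, replacing each
-- annotation with a fresh id. Returns the next unused id and the new tree.
addIndex :: Int -> Tree a -> (Int, Tree Int)
addIndex n (Node _ l as cs) =
  let (n', cs') = mapAccumL addIndex (n + 1) cs
  in (n', Node n l as cs')

-- Collect the ids of a tree in pre-order.
ids :: Tree Int -> [Int]
ids (Node i _ _ cs) = i : concatMap ids cs
```

Calling `addIndex 0` on an unannotated tree (for example a `Tree ()`) numbers the root 0 and its descendants in pre-order, so ids are unique within one tree, as in the C sketch.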

Similarly, we can write a function which computes derived properties:

data AnnotatedTree = Tree DerivedProperties

computeDerived :: DerivedProperties -> AnnotatedTree -> (AnnotatedTree, DerivedProperties)

derivedTree = computeDerived DefaultDerived tree
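Since the derived properties depend on a node and its descendants, one way to read this step is as a bottom-up pass. The sketch below is an illustration only: the question leaves the derived type opaque, so `Derived` here (subtree size and height) is a made-up example, attributes are omitted for brevity, and the signature differs from the one above because no accumulator is needed when each node's result is built from its children.

```haskell
type Label = String

-- Rose tree parameterised over a per-node annotation.
data Tree a = Node a Label [Tree a]
  deriving (Show, Eq)

-- Hypothetical derived properties: subtree size and height.
data Derived = Derived { size :: Int, height :: Int }
  deriving (Show, Eq)

-- Read the annotation stored at the root of a (sub)tree.
annotation :: Tree a -> a
annotation (Node a _ _) = a

-- Bottom-up pass: annotate the children first, then combine their
-- results to obtain the properties of the current node.
computeDerived :: Tree a -> Tree Derived
computeDerived (Node _ l cs) =
  let cs' = map computeDerived cs
      ds  = map annotation cs'
      d   = Derived { size   = 1 + sum (map size ds)
                    , height = 1 + maximum (0 : map height ds) }
  in Node d l cs'
```

With this shape, a leaf gets size 1 and height 1, and every inner node is computed from the already annotated children without revisiting the subtree.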

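The snippets above can be adapted with little work, for example so that AnnotatedTree contains both the index and the derived properties.

However, I don't know where to begin with representing the mapping between the two trees. Based on some reading, I have a few half-baked ideas...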

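Define AnnotatedTree to contain a path from the root of the other tree to the mapped node, encoded as a list of indices into each successive list of children, [Integer]

Use a zipper (of which I currently have a fairly loose understanding) to access the mapped node (or its parent/children/siblings) via the path

Or perhaps use a lens (... of which I have an even less clear understanding!) to do the same

Define AnnotatedTree to contain a reference to the mapped node directly, as a Maybe Tree

But then I don't see a way to walk to the parent/siblings of the mapped node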
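To make the first idea concrete, here is a minimal sketch of path-based access using plain recursion rather than a zipper or lens. All names (`Path`, `follow`, `parentPath`, `at`) are hypothetical, the tree is reduced to labels only, and `Int` indices are used instead of `Integer` for convenience.

```haskell
-- Simplified rose tree: a label and a list of children.
data Tree = Node String [Tree]
  deriving (Show, Eq)

-- A path is the list of child indices to take, starting from the root.
type Path = [Int]

-- Safe list indexing: Nothing when the index is out of range.
at :: [a] -> Int -> Maybe a
at xs i
  | i < 0     = Nothing
  | otherwise = case drop i xs of
      (x : _) -> Just x
      []      -> Nothing

-- Walk from the root along a path to the subtree it designates.
follow :: Tree -> Path -> Maybe Tree
follow t []                = Just t
follow (Node _ cs) (i : is) = at cs i >>= \c -> follow c is

-- The parent of a node is reached by dropping the last index;
-- the root (empty path) has no parent.
parentPath :: Path -> Maybe Path
parentPath [] = Nothing
parentPath p  = Just (init p)

label :: Tree -> String
label (Node l _) = l
```

Because a path is relative to the root, the mapped node's parent and siblings are reachable by editing the path and calling `follow` again, at the cost of re-walking from the root on each access.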

...but I could really do with some guidance on which (if any) of these are worth pursuing.

Any help would be much appreciated!

Best answer

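You could label the tree nodes with Int ids and walk around them with zippers (using Data.Tree and Data.Tree.Zipper is a good idea, because there is no need to reinvent the wheel). You can then attach auxiliary attributes to the nodes using a Data.IntMap that maps node ids to whatever you want. In particular, you can create an IntMap from node ids to TreePos Full Int, so that you can explore the parent, siblings and children of that node.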
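A minimal sketch of the id-plus-IntMap part of this suggestion follows. To keep it dependent only on the containers package, it swaps the zipper for a plain parent map; storing a TreePos from Data.Tree.Zipper in the IntMap, as the answer proposes, would additionally give sibling and child navigation. The names `labelIds`, `parents`, `Matching` and `matchOfParent` are hypothetical.

```haskell
import qualified Data.IntMap.Strict as IM
import Data.Traversable (mapAccumL)
import Data.Tree (Tree (..))

-- Pair every label with a fresh id, numbering nodes in pre-order.
labelIds :: Tree a -> Tree (Int, a)
labelIds = snd . mapAccumL (\n x -> (n + 1, (n, x))) 0

-- Map every non-root node id to the id of its parent.
parents :: Tree (Int, a) -> IM.IntMap Int
parents (Node (i, _) cs) =
  IM.unions (IM.fromList [(j, i) | Node (j, _) _ <- cs] : map parents cs)

-- Partial mapping from node ids in X to node ids in Y.
-- Unmapped nodes are simply absent from the map.
type Matching = IM.IntMap Int

-- Find the partner (in Y) of the parent of a node in X, if both exist.
matchOfParent :: IM.IntMap Int -> Matching -> Int -> Maybe Int
matchOfParent ps m i = IM.lookup i ps >>= \p -> IM.lookup p m
```

Keeping the matching outside the trees means that refining it in the final phase only produces a new IntMap and leaves both input trees untouched.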

Regarding "algorithm - How to represent a mapping between two trees in Haskell?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/55680121/
