algorithm - Finding the last element of a binary heap

Quoting Wikipedia:

It is perfectly acceptable to use a traditional binary tree data structure to implement a binary heap. There is an issue with finding the adjacent element on the last level on the binary heap when adding an element which can be resolved algorithmically...

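Any ideas on how such an algorithm might work?

I was not able to find any information about this issue, because most binary heaps are implemented using arrays.

Any help is appreciated.

I recently registered an OpenID account and am not able to edit my initial post or comment on answers. That is why I am responding via this answer. Sorry for that.

Quoting Mitch Wheat: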

@Yse: is your question "How do I find the last element of a binary heap"?

Yes.
Or, more precisely, my question is: "How do I find the last element of a non-array-based binary heap?"

Quoting Suppressingfire:

Is there some context in which you're asking this question? (i.e., is there some concrete problem you're trying to solve?)

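As stated above, I would like to know a good way to "find the last element of a non-array-based binary heap", which is necessary for inserting and deleting nodes.

Quoting Roy: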

It seems most understandable to me to just use a normal binary tree structure (using a pRoot and Node defined as [data, pLeftChild, pRightChild]) and add two additional pointers (pInsertionNode and pLastNode). pInsertionNode and pLastNode will both be updated during the insertion and deletion subroutines to keep them current when the data within the structure changes. This gives O(1) access to both insertion point and last node of the structure.

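Yes, this should work. If I am not mistaken, it could be a little tricky to find the insertion node and the last node when their location moves to another subtree because of a deletion or insertion. But I will give this a try.

Quoting Zach Scrivena: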

How about performing a depth-first search...

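Yes, that would be a good approach. I will try that out, too.

I am still wondering whether there is a way to "calculate" the position of the last node and of the insertion point. The height of a binary heap with N nodes (counted in levels) can be calculated by taking the base-2 logarithm of the smallest power of 2 that is larger than N. Perhaps the number of nodes on the deepest level can be calculated as well. It might then be possible to determine how the heap has to be traversed to reach the insertion point or the node for deletion.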
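The level arithmetic described in the question can be sketched in a few lines of Python. This is only an illustration, assuming the nodes are numbered 1..N in level order; the function name `heap_shape` is made up for this example.

```python
def heap_shape(n):
    """Return (levels, nodes_on_last_level) for a complete binary tree with n >= 1 nodes."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # The smallest power of two larger than n is 2**levels,
    # and for positive integers that exponent equals n.bit_length().
    levels = n.bit_length()
    # Every level above the last one is full: 2**(levels - 1) - 1 nodes in total.
    nodes_on_last_level = n - (2 ** (levels - 1) - 1)
    return levels, nodes_on_last_level


print(heap_shape(4))  # (3, 1)
print(heap_shape(7))  # (3, 4)
```

With 4 nodes the tree has 3 levels and a single node on the deepest one; with 7 nodes the deepest level is full.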

Best answer

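Basically, the quoted statement refers to the problem of resolving the location at which data elements are inserted into and deleted from the heap. To maintain the "shape property" of a binary heap, the lowest level of the heap must always be filled from left to right, leaving no empty nodes. To maintain the average O(1) insertion and deletion times of the binary heap, you must be able to determine the location of the next insertion and the location of the last node on the lowest level (used when deleting the root node), both in constant time.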

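For a binary heap stored in an array (with its implicit, compacted data structure, as explained in the Wikipedia entry), this is easy. Just insert the newest data member at the end of the array and then "bubble" it up into position (following the heap rules). For a deletion, replace the root with the last element in the array and "bubble" it down. For heaps in array storage, the number of elements in the heap is an implicit pointer to where the next data element is to be inserted and where to find the last element to use for deletion.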
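For comparison, the array case can be sketched as below. This is a minimal illustration, assuming a min-heap kept in a Python list with the root at index 0; the function names `heap_insert` and `heap_pop` are chosen for this example only.

```python
def heap_insert(heap, value):
    """Append value to a min-heap stored in a list, then bubble it up."""
    heap.append(value)              # the insertion point is always the end of the array
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[parent] <= heap[i]:
            break                   # heap rule already holds
        heap[parent], heap[i] = heap[i], heap[parent]
        i = parent


def heap_pop(heap):
    """Remove and return the smallest value of a non-empty heap."""
    root = heap[0]
    last = heap.pop()               # the "last node" is always the final array slot
    if heap:
        heap[0] = last              # move it to the root, then bubble it down
        i, n = 0, len(heap)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and heap[left] < heap[smallest]:
                smallest = left
            if right < n and heap[right] < heap[smallest]:
                smallest = right
            if smallest == i:
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest
    return root


heap = []
for v in [5, 3, 8, 1, 9, 2]:
    heap_insert(heap, v)
print([heap_pop(heap) for _ in range(6)])  # [1, 2, 3, 5, 8, 9]
```

Neither function has to search for the insertion point or the last element; `len(heap)` plays the role of both pointers.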

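For a binary heap stored in a tree structure, this information is not as obvious, but because the tree is a complete binary tree, it can be calculated. For example, in a complete binary tree with 4 elements, the insertion point will always be the right child of the left child of the root node. The node to use for deletion will always be the left child of the left child of the root node. For any given tree size, the tree will always have a specific shape with well-defined insertion and deletion points. Because the tree is a "complete binary tree" with a specific structure for any given size, it is very possible to calculate the location of the insertion/deletion point in O(1) time. The catch, however, is that even when you know where the node is structurally, you have no idea where it is in memory. So you have to traverse the tree to get to that node, which is an O(log n) process, making all insertions and deletions at least O(log n) and breaking the O(1) behavior that is usually desired. Because of this traversal issue, any search ("depth-first" or otherwise) will also be at least O(log n), and generally O(n) because of the random nature of the semi-sorted heap.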
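One common way to turn a position into a route through the tree is to read the binary digits of the 1-based level-order position. The sketch below is an illustration, not part of the original answer; the names `Node`, `path_to` and `node_at` are assumptions made for the example.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.left = None
        self.right = None


def path_to(k):
    """Directions from the root to 1-based level-order position k."""
    # bin(k) looks like '0b101'; skipping '0b' and the leading 1 bit leaves
    # one digit per step down the tree: 0 means left, 1 means right.
    return ["R" if bit == "1" else "L" for bit in bin(k)[3:]]


def node_at(root, k):
    """Walk from the root to position k; takes O(log k) steps."""
    node = root
    for step in path_to(k):
        node = node.left if step == "L" else node.right
    return node


print(path_to(4))  # ['L', 'L']  last node of a 4-element heap
print(path_to(5))  # ['L', 'R']  insertion point of a 4-element heap
```

Computing the path is cheap, but following it still costs one pointer hop per level, which is the O(log n) traversal described above.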

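The trick is to be able to both calculate and reference those insertion/deletion points in constant time, either by augmenting the data structure ("threading" the tree, as mentioned in the Wikipedia article) or by using additional pointers.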

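The implementation that seems most understandable to me is to just use a normal, simple binary tree structure (using a pRoot and a Node defined as [data, pParent, pLeftChild, pRightChild]) and add two additional pointers (pInsert and pLastNode). pInsert and pLastNode will both be updated during the insertion and deletion subroutines to keep them current when the data within the structure changes. This implementation gives O(1) access to both the insertion point and the last node of the structure and should allow the overall O(1) behavior of insertion and deletion to be preserved. The cost of the implementation is two extra pointers and some minor extra code in the insertion/deletion subroutines (in other words, minimal).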

EDIT: added pseudocode for an O(1) insert()

Here is pseudocode for an insert subroutine that is O(1) on average:

define Node = [T data, *pParent, *pLeft, *pRight]

void insert(T data)
{
    do_insertion( data );   // store data in the empty node at pInsert, bubble up, update count of data items in tree

    // assume: pInsert still points at the node location in the tree where the insertion just took place
    //   (i.e., either shuffle only data during the insertion or keep pInsert updated during the bubble process)

    int N = this->CountOfDataItems + 1;     // 1-based level-order position of the next insertion slot
                                            // note: CountOfDataItems will always be > 0 (and pRoot != null) after an insertion

    p = new Node( <null>, null, null, null);        // new empty node for the next insertion

    // update pInsert (three cases to handle)
    if ( (N & (N - 1)) == 0 )
        {// #1 - N is an exact power of two
        // O(log2(N))
        // tree is currently a full complete binary tree ("perfect")
        // ... must start a new lower level
        // traverse from pRoot down the tree through each pLeft until an empty pLeft is found
        q = pRoot;
        while (q->pLeft != null) { q = q->pLeft; }    // log2(N) - 1 iterations
        p->pParent = q;
        q->pLeft = p;
        }
    else if ( isOdd(N) )
        {// #2 - N is odd
        // O(1)
        // the node just filled is a left child; the next slot is its right sibling
        p->pParent = pInsert->pParent;
        pInsert->pParent->pRight = p;
        }
    else
        {// #3 - N is even (and NOT a power of 2)
        // O(1) on average, O(log2(N)) in the worst case
        // the node just filled is a right child; the next slot is in the neighboring subtree
        q = pInsert;
        while (q == q->pParent->pRight) { q = q->pParent; }   // climb until we arrive from a left child
        q = q->pParent->pRight;                               // cross over to the right sibling
        while (q->pLeft != null) { q = q->pLeft; }            // descend to its leftmost node
        p->pParent = q;
        q->pLeft = p;
        }
    pInsert = p;

    // update pLastNode
    // ... [similar process]
}

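So insert(T) is O(1) on average: the pointer update takes constant time on average, and the only full root-to-leaf walk occurs when the tree must grow by one level, which costs O(log N) but happens only when the node count reaches a power of two (about log2 N times over N insertions, assuming no deletions). Adding another pointer (pLeftmostLeaf) would remove that walk and avoid the possible pathological case of alternating insertions and deletions around a full complete binary tree. (Adding pLeftmostLeaf is left as an exercise [it is fairly easy].)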
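A runnable variation of the same bookkeeping, written in Python, might look like the sketch below. It is not the answer's code: it keeps a pointer to the last node instead of an empty placeholder node, finds the next slot by climbing until it arrives from a left child, assumes a min-heap, and omits deletion. The names `TreeHeap`, `last` and `_next_parent` are assumptions made for this example.

```python
class Node:
    def __init__(self, data, parent=None):
        self.data = data
        self.parent = parent
        self.left = None
        self.right = None


class TreeHeap:
    """Min-heap on linked nodes; `last` always points at the final level-order node."""

    def __init__(self):
        self.root = None
        self.last = None
        self.count = 0

    def _next_parent(self):
        """Return the node that will become the parent of the next inserted node."""
        n = self.count + 1                 # 1-based position of the new slot
        if n & (n - 1) == 0:               # power of two: start a new level
            q = self.root
            while q.left is not None:
                q = q.left
            return q
        if n % 2 == 1:                     # odd: right sibling of the last node
            return self.last.parent
        q = self.last                      # even: the slot is in the neighboring subtree
        while q is q.parent.right:         # climb until we arrive from a left child
            q = q.parent
        q = q.parent.right                 # cross over to the right sibling
        while q.left is not None:          # descend to its leftmost node
            q = q.left
        return q

    def insert(self, data):
        if self.root is None:
            self.root = self.last = Node(data)
            self.count = 1
            return
        parent = self._next_parent()
        node = Node(data, parent)
        if parent.left is None:
            parent.left = node
        else:
            parent.right = node
        self.last = node
        self.count += 1
        # Bubble up by swapping data only, so `last` keeps pointing at the same slot.
        while node.parent is not None and node.data < node.parent.data:
            node.data, node.parent.data = node.parent.data, node.data
            node = node.parent


heap = TreeHeap()
for value in [7, 3, 9, 1, 8, 2]:
    heap.insert(value)
print(heap.root.data)  # 1
```

Swapping only the data during the bubble-up is one of the two options named in the pseudocode comments; it keeps the node positions fixed, so the `last` pointer never has to be repaired after a swap.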

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/500057/
