algorithm - B-tree 比 AVL 或 Red Black-Tree 更快？

标签 algorithm data-structures binary-tree

<分区>

我知道性能永远不会是非黑即白的，通常一种实现在 X 情况下更快，在 Y 情况下更慢，等等。但总的来说 - B 树比 AVL 或 RedBlack-Trees 快吗？它们的实现比 AVL 树(甚至可能是 RedBlack 树？)要复杂得多，但是它们更快吗(它们的复杂性是否得到返回)？

编辑:我还想补充一点，如果它们比等效的 AVL/RedBlack 树更快(在节点/内容方面)- 为什么它们更快？

最佳答案

Sean 的帖子(目前被接受的帖子)包含几个不正确的声明。对不起肖恩，我不是有意无礼；我希望我能让你相信我的陈述是基于事实的。

They're totally different in their use cases, so it's not possible to make a comparison.

它们都用于维护一组具有快速查找、插入和删除功能的完全有序的项目。它们具有相同的界面和相同的意图。

RB trees are typically in-memory structures used to provide fast access (ideally O(logN)) to data. [...]

总是 O(log n)

B-trees are typically disk-based structures, and so are inherently slower than in-memory data.

废话。在磁盘上存储搜索树时，通常使用 B 树。这是真的。当您将数据存储在磁盘上时，它的访问速度比内存中的数据慢。但是存储在磁盘上的红黑树也比存储在内存中的红黑树慢。

您在这里比较苹果和橘子。真正有趣的是内存中 B 树和内存中红黑树的比较。

[顺便说一句:与红黑树相反，B 树在 I/O 模型中在理论上是高效的。我已经通过实验测试(并验证)了用于排序的 I/O 模型；我希望它也适用于 B 树。]

B-trees are rarely binary trees, the number of children a node can have is typically a large number.

需要说明的是，B树节点的大小范围是树的一个参数(在C++中，您可能希望使用整数值作为模板参数)。

The management of the B-tree structure can be quite complicated when the data changes.

我记得它们比红黑树更容易理解(和实现)。

B-tree try to minimize the number of disk accesses so that data retrieval is reasonably deterministic.

这是真的。

It's not uncommon to see something like 4 B-tree access necessary to lookup a bit of data in a very database.

有数据吗？

In most cases I'd say that in-memory RB trees are faster.

有数据吗？

Because the lookup is binary it's very easy to find something. B-tree can have multiple children per node, so on each node you have to scan the node to look for the appropriate child. This is an O(N) operation.

每个节点的大小是固定参数，所以即使做线性扫描也是O(1)。如果我们对每个节点的大小进行 big-oh，请注意您通常会保持数组排序，因此它是 O(log n)。

On a RB-tree it'd be O(logN) since you're doing one comparison and then branching.

您正在比较苹果和橙子。 O(log n) 是因为树的高度最多为 O(log n)，就像 B 树一样。

此外，除非你对红黑树玩一些讨厌的分配技巧，否则推测 B 树具有更好的缓存行为似乎是合理的(它访问一个数组，而不是散布在各处的指针，并且分配更少开销甚至更多地增加内存局部性)，这可能有助于它在速度竞赛中。

我可以指出实验证据表明 B 树(大小参数具体为 32 和 64)在小尺寸方面与红黑树非常有竞争力，甚至在 n 值适中时也优于红黑树。参见 http://idlebox.net/2007/stx-btree/stx-btree-0.8.3/doxygen-html/speedtest.html

B 树更快。为什么？我猜想这是由于内存局部性、更好的缓存行为和更少的指针追踪(如果不是同一件事，它们在某种程度上是重叠的)。

关于algorithm - B-tree 比 AVL 或 Red Black-Tree 更快？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/647537/

上一篇：algorithm - 不同语言的阶乘算法

下一篇：algorithm - 哈希查找和二分查找哪个更快？

相关文章：

algorithm - CUDA流压缩算法

c - 使用什么数据结构？ ( HashMap 与特里树与？)

C数据结构如何声明

architecture - 在 MongoDB 中存储单个 "first-class"列表的惯用方法？

c# - 如何对 Excel 范围执行二进制搜索以查找最后一个非空单元格？

algorithm - 如何为 URL 生成唯一哈希值？

c++ - 优化 C++ 模板执行

algorithm - Re : Given a binary search tree and a number, 找到一个路径，其节点的数据添加到给定的数字

c - 用户想要删除一个值，但该值不在二叉树上。该如何治疗呢？

algorithm - 给定n个数字，找到最小周长三角形