我正在尝试实现 this algorithm在 Python 中，但由于我缺乏对树结构的理解，我对分区树的创建过程感到困惑。

简要说明:

被链接的算法，用于将高维特征空间划分为内部节点和叶子节点，以便可以快速执行查询。

它使用特定的随机测试来划分一个大空间，超平面将一个大单元格一分为二。

This answer explains everything much more precisely .

(取自上面的链接)

代码片段:

def random_test(self, main_point):  # Main point is np.ndarray instance
    dimension = main_point.ravel().size
    random_coefficients = self.random_coefficients(dimension)
    scale_values = np.array(sorted([np.inner(random_coefficients, point.ravel())
                                    for point in self.points]))
    percentile = random.choice([np.percentile(scale_values, 100 * self.ratio),  # Just as described on Section 3.1
                                np.percentile(scale_values, 100 * (1 - self.ratio))])
    main_term = np.inner(main_point.ravel(), random_coefficients)
    if self.is_leaf():
        return 0  # Next node is the center leaf child
    else:
        if (main_term - percentile) >= 0:  # Hyper-plane equation defined in the document
            return -1  # Next node is the left child
        else:
            return 1  # Next node is the right child

self.ratio 正如上面链接的算法中提到的，它决定了树的平衡和浅度，在 1/2 时它应该生成最多平衡的浅树。

然后我们进入迭代部分，树不断地划分空间，直到它到达叶节点(注意关键字到达)，问题也就是说，它将永远不会真正到达叶节点。

因为上面链接的文档中叶节点的定义是这样的:

def is_leaf(self):
    return (self.capacity * self.ratio) <= self.cell_count() <= self.capacity

其中 self.cell_count() 是单元格中的点数，self.capacity 是单元格可以拥有的最大点数，self .ratio 是分流比。

My full code基本上应该通过在初始迭代时创建新节点来划分特征空间，直到创建叶节点(但永远不会创建叶节点)。 See the fragment that contains the division process .

(取自上面链接的文档)

tl;博士:

在我们向二叉分区树添加任何点之前是否准备好(用空节点填充)？如果是这样，我们是否需要定义树的级别(深度)？

如果不是，是否在向它们添加点时创建二叉分区树？如果是这样，那么第一个点(从第一次迭代开始)是如何添加到树中的？

最佳答案

它们是随着您的前进而 build 的。第一个节点在第 1 行的右侧或左侧。然后下一个级别询问第 2 行的右侧或左侧...您提供的论文中的插图显示了这一点，其中编号的行与为查找节点而提供的选择相关联。

当然，向右或向左并不准确。一些线被水平切割。但是创建的空间是二进制的。

关于python - 在将点添加到节点之前是否预先制作了二进制分区树？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/53889906/

python - 在将点添加到节点之前是否预先制作了二进制分区树？

tl;博士:

上一篇：arrays - 最小可能数组置换问题

下一篇：algorithm - 如何区分元素是在左子树还是右子树中