java - Building a k-d tree with MapReduce?

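I am trying to build a k-d tree for image features (standalone). I have extracted the image features; each feature vector contains, say, 1000 float values.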

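I use MapReduce to distribute the images across the nodes of the cluster by category (e.g. cat, dog, gun). Each node will hold a set of similar images, and a k-d tree of those images is then built on each node. I am confused about how to build the tree.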

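So how do I build a k-d tree with MapReduce? Each node holds its own tree, right? What is the logic for distributing the images? And when building the k-d tree, on what basis do I add an image feature vector to the tree (i.e. as the left child or the right child)?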

Any help is appreciated. Thanks in advance.

Best answer

I don't think a k-d tree is suitable for your data. Here is what Wikipedia says:

k-d trees are not suitable for efficiently finding the nearest neighbour in high dimensional spaces. As a general rule, if the dimensionality is k, the number of points in the data, N, should be N >> 2^k. Otherwise, when k-d trees are used with high-dimensional data, most of the points in the tree will be evaluated and the efficiency is no better than exhaustive search, and approximate nearest-neighbour methods should be used instead.

Your feature vectors have 1000 dimensions, which means you would need far more than 2^1000 (roughly 10^301) images, and that is unlikely.
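To get a feel for that number, here is a quick check of how large 2^k is for k = 1000. This is just a sketch; the class and method names are made up for illustration.

```java
import java.math.BigInteger;

class KdTreeSizeCheck {
    // Number of decimal digits of 2^k, the threshold that N should greatly exceed.
    static int decimalDigitsOfTwoPow(int k) {
        return BigInteger.valueOf(2).pow(k).toString().length();
    }

    public static void main(String[] args) {
        int k = 1000; // dimensionality of the feature vectors in the question
        System.out.println("2^" + k + " has " + decimalDigitsOfTwoPow(k) + " decimal digits");
    }
}
```

For k = 1000 this reports 302 decimal digits, so 2^1000 is a bit above 10^301.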

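I suggest you take a look at locality-sensitive hashing, which is one of the approximate nearest-neighbour methods for high-dimensional data mentioned above.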
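To make the idea concrete, here is a minimal sketch of one LSH family, random-hyperplane hashing, which assumes cosine similarity is an acceptable measure for your features. The class name, method names and parameters are hypothetical and not taken from any library; a real setup would typically use several hash tables and re-rank the candidates by exact distance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Sketch of random-hyperplane LSH; all names here are illustrative assumptions.
class RandomHyperplaneLsh {
    private final float[][] hyperplanes; // one random normal vector per signature bit
    private final Map<Long, List<Integer>> buckets = new HashMap<>();

    RandomHyperplaneLsh(int dimensions, int bits, long seed) {
        if (bits < 1 || bits > 64) {
            throw new IllegalArgumentException("bits must be in 1..64");
        }
        Random rng = new Random(seed);
        hyperplanes = new float[bits][dimensions];
        for (float[] h : hyperplanes) {
            for (int d = 0; d < dimensions; d++) {
                h[d] = (float) rng.nextGaussian();
            }
        }
    }

    // Bit b is set when the vector lies on the non-negative side of hyperplane b.
    long signature(float[] v) {
        long sig = 0L;
        for (int b = 0; b < hyperplanes.length; b++) {
            double dot = 0.0;
            for (int d = 0; d < v.length; d++) {
                dot += hyperplanes[b][d] * v[d];
            }
            if (dot >= 0) {
                sig |= 1L << b;
            }
        }
        return sig;
    }

    // Store an image id in the bucket selected by its signature.
    void add(int imageId, float[] features) {
        buckets.computeIfAbsent(signature(features), k -> new ArrayList<>()).add(imageId);
    }

    // Ids sharing the query's bucket: candidate (approximate) near neighbours.
    List<Integer> candidates(float[] query) {
        return buckets.getOrDefault(signature(query), new ArrayList<>());
    }
}
```

Vectors separated by a small angle are likely to get the same signature, so they tend to end up in the same bucket. In a MapReduce job, one possible design is to emit the signature as the mapper's output key, so that all images in a bucket reach the same reducer; that is a suggestion on my part, not something the lecture slides prescribe.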

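Since Wikipedia is not always the best place to learn complex topics, I suggest you look at the respective lecture slides of the Data Mining class at ETH Zurich instead. I happen to be taking that class this semester.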

Regarding "java - Building a k-d tree with MapReduce?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/10984168/
