arrays - 使用曼伯迈尔斯算法的后缀数组

标签 arrays algorithm data-structures suffix-array

我尝试过研究论文 http://webglimpse.net/pubs/suffix.pdf 中的理论。

但是当他们说时我有点迷失

Let Ai be the ﬁrst sufﬁx in the ﬁrst bucket (i.e., Pos[0] = i), and consider Ai-h (if i-h < 0, then we ignore Ai and take the sufﬁx of Pos[1], and so on). Since Ai starts with the smallest h-symbol string, Ai-h should be the ﬁrst in its 2h-bucket.

我无法理解这个说法。为什么如果 i-h < 0 可以忽略 Ai-h。在第 1 阶段，当 i-h > 0 时，如何在 const 时间内确定位置？

一个示例 impl 是 http://belbesy.wordpress.com/2012/10/10/spoj-649-distinct-substrings-suffix-arrays-nlgn/

最佳答案

我强烈建议不要尝试理解 C++ 代码，而是浏览一下此 Python implementation of the Manbers-Myers suffix array construction algorithm ，手动，一个简单的 5 个字符示例。

因为Python版本只有大约15行代码，所以很容易理解。

即使你不懂 Python，也可以将其视为伪代码，并通过 Google 搜索你不明白的语法。

就我个人而言，我手动浏览了一个 5 个字符的字符串，这足以帮助我理解该算法是如何工作的..

关于arrays - 使用曼伯迈尔斯算法的后缀数组，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/18495728/