algorithm - 内存算法时间复杂度

我读了这篇文章Retiring a Great Interview Problem ，作者想出了一个“分词”的问题，并给出了三种解决方案。高效的使用 memoization 算法，作者说它的最坏情况时间复杂度是 O(n^2) 因为 “关键的见解是 SegmentString 只被调用在原始输入字符串的后缀上，并且只有 O(n) 个后缀”。

但是，我发现我很难理解为什么是O(n^2)。有人可以给我提示或证据吗？

断字问题:

    Given an input string and a dictionary of words,
    segment the input string into a space-separated
    sequence of dictionary words if possible. For
    example, if the input string is "applepie" and
    dictionary contains a standard set of English words,
    then we would return the string "apple pie" as output.

内存算法来自 Retiring a Great Interview Problem

Map<String, String> memoized;

String SegmentString(String input, Set<String> dict) {
      if (dict.contains(input)) 
          return input;
      if (memoized.containsKey(input) {
          return memoized.get(input);
      }
      int len = input.length();
      for (int i = 1; i < len; i++) {
          String prefix = input.substring(0, i);
          if (dict.contains(prefix)) {
              String suffix = input.substring(i, len);
              String segSuffix = SegmentString(suffix, dict);
              if (segSuffix != null) {
                  return prefix + " " + segSuffix;
              }
          }
      }
      memoized.put(input, null);
      return null;
}

最佳答案

直觉

(很明显，最坏的情况是没有解决方案，所以我专注于此)

由于递归调用是在将值放入内存缓存之前进行的，因此最后(最短)的后缀将首先被缓存。这是因为该函数首先使用长度为 N 的字符串调用，然后使用长度为 N-1 的字符串调用自身，然后......，使用 len 0 的字符串，它被缓存并返回，然后是长度为1的字符串被缓存并返回，...，长度为N的字符串被缓存并返回。

如提示所示，只有后缀被缓存，而且只有 N 个。这意味着当顶级函数获得其第一次递归调用的结果时(即在长度为 N-1 的后缀上)，缓存中已经填充了 N-1 后缀。

现在，假设 N-1 个最后的后缀已经缓存，for 循环需要进行 N-1 次递归调用，每次调用 O(1)(因为答案已经缓存)，总计 O(N)。但是，(预先)构建最后 N-1 个缓存需要 O(N^2)(如下所述)，总计 O(N)+O(N^2) = O(N^2)。

数学归纳法证明

这个解释可以很容易地转化为使用归纳法的正式证明。这是它的要点:

(f(N) 是函数在长度为 N 的输入上完成所需的操作数)

归纳假设 -- 存在一个常数 c英石。 f(N) < c*N^2 .

基本情况 很简单——对于长度为 1 的字符串，您可以找到一个常量 c这样 f(1) < c*N^2 = c

归纳步骤

回顾事情发生的顺序:

第 1 步:函数首先在长度为 N-1 的后缀上调用自身，构建一个缓存，其中包含最后 N-1 个后缀的答案

第 2 步:函数然后调用自身 O(N) 次，每次调用 O(1)(多亏了这个测试:if (memoized.containsKey(input))，事实上缓存已经在第 1 步中填充了 )。

所以我们得到:

f(N+1) = f(N) + a*N <   (by the hypothesis)
       < c*N^2 + a*N <  (for c large enough)
       < c*N^2 + 2*c*N + c =
       = c*(N+1)^2

因此我们得到了f(N+1) < c*(N+1)^2 , 从而完成证明。

关于algorithm - 内存算法时间复杂度，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/21273505/

algorithm - 内存算法时间复杂度

上一篇：algorithm - 确定给定折线与一组现有折线的近似重叠

下一篇：algorithm - 我该如何优化这个索引算法