c++ - 递归二进制搜索字符串 - C++

标签 c++ recursion binary-search

我正在尝试实现 findMatchesInDict 函数,该函数试图查看某个单词是否与预排序词典中的任何单词匹配。下面是我当前的实现:

void findMatchesInDict(string word, int start, const string dict[], int end, string results[], int& totalResults)
{
    // initial start = 0 index
    // initial end = last index of dict array

    int middle = start + (end - start) / 2;
    if (end < start)
        return;

    if (word == dict[middle]) // if we found a match
        storeUniqueMatches(word, 0, results, totalResults); 
    else if (word < dict[middle])
        findMatchesInDict(word, start, dict, middle - 1, results, totalResults);
    else
        findMatchesInDict(word, middle + 1, dict, end, results, totalResults);
}

storeUniqueMatches 函数正常工作(这只是将匹配的单词存储到 results 数组中,确保没有存储重复的单词。

该函数只会匹配从字典中选择的单词,而不匹配其他单词。

关于为什么这可能无法正常工作的任何想法?


作为引用,此实现有效,但效率极低并导致堆栈溢出错误。

void findMatchesInDict(string word, int start, const string dict[], int end, string results[], int& totalResults)
{
    if (start > end) 
        return;
    if (word == dict[start]) // if we found a match
        storeUniqueMatches(word, 0, results, totalResults);

    findMatchesInDict(word, start + 1, dict, size, results, totalResults);
}

最佳答案

我仍然相信 OP 错了 1 个错误。

我强烈怀疑

findMatchesInDict(word, start, dict, middle - 1, results, totalResults);

应该是

findMatchesInDict(word, start, dict, middle, results, totalResults);

我做了自己的小 sample 。 (因此,我对代码进行了一些重新设计,因为我对 OP 的做法感到不走运。)

#include <iostream>
#include <string>

size_t find(const std::string &word, const std::string dict[], size_t i0, size_t size)
{
  if (!size) return (size_t)-1; // bail out with invalid index
  const size_t i = i0 + size / 2;
  return word == dict[i]
    ? i
    : word < dict[i]
      ? find(word, dict, i0, i - i0)
      : find(word, dict, i + 1, i0 + size - (i + 1));
}

int main()
{
  const std::string dict[] = {
    "Ada", "BASIC", "C", "C++",
    "D", "Haskell", "INTERCAL", "Modula2",
    "Oberon", "Pascal", "Scala", "Scratch",
    "Vala"
  };
  const size_t sizeDict = sizeof dict / sizeof *dict;
  unsigned nErrors = 0;
  // brute force tests to find something what is in
  for (size_t n = 1; n <= sizeDict; ++n) {
    for (size_t i = 0; i < n; ++i) {
      if (find(dict[i], dict, 0, n) >= n) {
        std::cerr << "ALERT! Unable to find entry " << i << " in " << n << " entries!\n";
        ++nErrors;
      }
    }
  }
  // brute force tests to find something what is not in
  for (size_t n = 1; n <= sizeDict; ++n) {
    if (find("", dict, 0, n) < n) {
      std::cerr << "ALERT! Able to find entry '' in " << n << " entries!\n";
      ++nErrors;
    }
    for (size_t i = 0; i < n; ++i) {
      if (find(dict[i] + " + Assembler", dict, 0, n) < n) {
        std::cerr << "ALERT! Able to find entry '" << dict[i] << " + Assembler' in " << n << " entries!\n";
        ++nErrors;
      }
    }
  }
  // report
  if (!nErrors) std::cout << "All tests passed OK.\n";
  else std::cerr << nErrors << " tests failed!\n";
  // done
  return nErrors > 0;
}

Live Demo on coliru

这段代码大部分是暴力测试代码:

  1. 测试从 1 到 dict 大小的每个长度。对于每个长度,搜索 dict 的任何条目。

  2. 测试从 1 到 dict 大小的每个长度。对于每个长度,将测试空字符串(在任何其他条目之前)以及任何经过修改的条目。 (修改授权它将在未修改的条目及其后继条目之间或最后一个条目之后。)

输出:

All tests passed OK.

一切顺利。

然后我换了

find(word, dict, i0, i - i0)

find(word, dict, i0, i - i0 > 0 ? i - i0 - 1 : 0)

类似于(在我看来)OP 代码的错误。

输出:

ALERT! Unable to find entry 0 in 2 entries!
ALERT! Unable to find entry 0 in 3 entries!
ALERT! Unable to find entry 1 in 4 entries!
ALERT! Unable to find entry 1 in 5 entries!
ALERT! Unable to find entry 3 in 5 entries!
ALERT! Unable to find entry 0 in 6 entries!
ALERT! Unable to find entry 2 in 6 entries!
ALERT! Unable to find entry 4 in 6 entries!
ALERT! Unable to find entry 0 in 7 entries!
ALERT! Unable to find entry 2 in 7 entries!
ALERT! Unable to find entry 4 in 7 entries!
ALERT! Unable to find entry 0 in 8 entries!
ALERT! Unable to find entry 3 in 8 entries!
ALERT! Unable to find entry 5 in 8 entries!
ALERT! Unable to find entry 0 in 9 entries!
ALERT! Unable to find entry 3 in 9 entries!
ALERT! Unable to find entry 6 in 9 entries!
ALERT! Unable to find entry 1 in 10 entries!
ALERT! Unable to find entry 4 in 10 entries!
ALERT! Unable to find entry 7 in 10 entries!
ALERT! Unable to find entry 1 in 11 entries!
ALERT! Unable to find entry 4 in 11 entries!
ALERT! Unable to find entry 7 in 11 entries!
ALERT! Unable to find entry 9 in 11 entries!
ALERT! Unable to find entry 1 in 12 entries!
ALERT! Unable to find entry 3 in 12 entries!
ALERT! Unable to find entry 5 in 12 entries!
ALERT! Unable to find entry 8 in 12 entries!
ALERT! Unable to find entry 10 in 12 entries!
ALERT! Unable to find entry 1 in 13 entries!
ALERT! Unable to find entry 3 in 13 entries!
ALERT! Unable to find entry 5 in 13 entries!
ALERT! Unable to find entry 7 in 13 entries!
ALERT! Unable to find entry 9 in 13 entries!
ALERT! Unable to find entry 11 in 13 entries!
35 tests failed!

嗯。实际上,这并不能证明任何关于 OP 的代码。

然而,这表明

  1. “减 1”可以从本质上破坏二分查找。

  2. 如何设计暴力测试来发现此类错误。

因此,这有望帮助 OP 自行找出算法中的错误(这对他来说实际上更有值(value))。

关于c++ - 递归二进制搜索字符串 - C++,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/51584543/

相关文章:

C++ IntelliSense 'auto' 功能?它在哪里?如何得到它 'on' ?

c++ - 是否可以同时调试 VB6 和 C++ COM dll?

javascript - 递归算法如何适用于汉诺塔?

java - 卡在 N-Queens java 算法中。 (回溯)

c - 使用字符数组进行二进制搜索?

c++ - 前向声明和 typeid

c++ - 有没有办法强制 C++ 编译器不优化静态库中的特定静态对象?

php - 具有多态关系的多维数组(树结构)

c# - 按 ID 的 BinarySearch 对象数组

c++ - 二叉搜索树中是否需要结构