algorithm - 该算法是否产生均匀随机排列？

我正在研究 CLRS，发现一个关于洗牌算法的问题。这会产生均匀随机排列吗？

1 PERMUTE-WITH-ALL-IDENTITY(A)
2    n = A.length
3    for i = 1 to n
4        swap A[i] with A[RANDOM(1,n)]
5        swap A[i] with A[RANDOM(i+1,n)]

我的主张:不，它没有。因为，由于第 4 行，将有 n^n 种可能的排列。而且，它不能被 n 整除!这是不同排列的数量。

能否请您确认我的推理是否正确？

最佳答案

首先，我认为伪代码中存在错误。在第 4 行中，我认为当 i = n 时存在边界错误，因为这会要求 n+1 和 n 之间的随机数。在接下来的内容中，我通过假设代码如下来更正此问题:

1 PERMUTE-WITH-ALL-IDENTITY(A)
2    n = A.length
3    for i = 1 to n
4        swap A[i] with A[RANDOM(1,n)]
5        swap A[i] with A[RANDOM(i,n)]

如果这是代码，那么我相信答案是不，这不会产生均匀随机的排列。我没有数学证明是这种情况，但我有有一段 C++ 代码，通过 PERMUTE-WITH-ALL-IDENTITY 暴力破解所有可能的路径，并计算每个排列产生的次数。我运行了该代码并测试了长度为 0 到 4 的序列的排列算法，发现并非所有排列的可能性都相同。

代码如下:

#include <iostream>
#include <map>
#include <vector>
#include <algorithm>
using namespace std;

/* Maximum size of a sequence to permute. */
const size_t kMaxSize = 4;

/**
 * Given a frequencies map associating permutations to the number of times
 * that we've seen them, displays a visual report of the permutations
 * and their frequencies.
 *
 * @param frequencies The frequencies map.
 */
void reportResults(const map<vector<int>, size_t>& frequencies, size_t size) {
  cout << "Report for size " << size << endl;
  cout << "===================================================" << endl;  

  /* Print out the map. */
  cout << "  Map entries:" << endl;
  for (const auto& entry: frequencies) {
    cout << "    ";
    for (const auto& num: entry.first) {
      cout << num;
    }
    cout << ": " << entry.second << endl;
  }

  cout << "===================================================" << endl;
  cout << endl << endl;
}

/**
 * Traces through all possible executions of the algorithm, recording
 * the number of times that each outcome occurs. This algorithm uses
 * exhaustive recursion to explore the full space, assuming that the
 * underlying random generator is uniform.
 *
 * @param elems The elements in the sequence. It's assumed that initially
 *    these are in sorted order, but as the algorithm progresses it will
 *    become progressively more permuted.
 * @param frequencies A map from permutations to the number of times that
 *    they appear.
 * @param index The index through the main loop that we are currently in.
 *    This is the variable 'i' in the pseudocode.
 * @param size The length of the vector. This isn't strictly necessary,
 *    but it makes the code a bit cleaner.
 */
void recPopulate(const vector<int>& elems,
                 map<vector<int>, size_t>& frequencies,
                 size_t index, size_t size) {
  /* If we've finished permuting everything, record in the frequency map
   * that we've seen this permutation one more time.
   */
  if (index == size) {
    frequencies[elems]++;
  } else {
    /* For all possible pairs of a first swap and a second swap, try that
     * out and see what happens.
     */
    for (size_t i = 0; i < size; i++) {        // First swap index
      for (size_t j = index; j < size; j++) {  // Second swap index
        /* Make a clean copy of the vector to permute. */
        vector<int> newElems = elems;

        /* Perform the swaps. */
        swap(newElems[i], newElems[index]);
        swap(newElems[j], newElems[index]);

        /* Recursively explore all possible executions of the algorithm
         * from this point forward.
         */
        recPopulate(newElems, frequencies, index + 1, size);
      }
    }
  }
}

/**
 * Traces through all possible executions of the proposed algorithm,
 * building a frequency map associating each permutation with the
 * number of possible executions that arrive there.
 *
 * @param size The number of elements in the initial sequence.
 * @return A frequency map from permutations to the number of times that
 *    permutation can be generated.
 */
map<vector<int>, size_t> populateFrequencies(size_t size) {
  /* Create the sequence 0, 1, 2, ..., size - 1 */
  vector<int> elems(size);
  iota(elems.begin(), elems.end(), 0);

  map<vector<int>, size_t> frequencies;
  recPopulate(elems, frequencies, 0, elems.size());
  return frequencies;
}

int main() {
  for (size_t size = 0; size <= kMaxSize; size++) {
    map<vector<int>, size_t> frequencies = populateFrequencies(size);
    reportResults(frequencies, size);
  }
}

该程序生成的报告显示，对于每个排列，通过产生该排列的算法的可能执行路径的数量。四个元素排列的输出如下所示。鉴于每条执行路径的可能性均等，因为这里的数字不相同，我们可以得出结论，该算法不是均匀随机的:

Report for size 4
===================================================
  Map entries:
    0123: 214
    0132: 268
    0213: 267
    0231: 316
    0312: 242
    0321: 229
    1023: 268
    1032: 322
    1203: 312
    1230: 349
    1302: 287
    1320: 262
    2013: 242
    2031: 283
    2103: 233
    2130: 262
    2301: 252
    2310: 240
    3012: 213
    3021: 208
    3102: 204
    3120: 187
    3201: 248
    3210: 236
===================================================

以上分析是基于这样一个事实

我的代码是正确的，并且
我对伪代码的解释是正确的。

如果其中任何一个有误，我很乐意撤回或编辑此答案。如果我在这里犯了错误，请告诉我!

希望这对您有所帮助!

关于algorithm - 该算法是否产生均匀随机排列？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/30169729/

algorithm - 该算法是否产生均匀随机排列？

上一篇：string - 给定一组单词，你如何识别 "n"组字母，以帮助你从原始列表中生成最大数量的完整单词？

下一篇：algorithm - 数的质因数的乘积