algorithm - 如何随机读取整个数组中散布的所有 1's in an Array of 1' 和 0

我有一个数组，其中 1 和 0 随机分布在数组中。

int arr[N] = {1,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,1,1,1,0,0,0,1....................N}

现在我想尽可能快地检索数组中所有的 1，但条件是我不应该丢失数组的确切位置(基于索引)，因此排序选项无效。所以剩下的唯一选择是线性搜索，即 O(n) ，还有比这更好的吗。

The main problem behind linear scan is , I need to run the scan even for X times. So I feel I need to have some kind of other datastructure which maintains this list once the first linear scan happens, so that I need not to run the linear scan again and again.

Let me be clear about final expectations-

我只需要在数组的某个范围内找到 1 的个数，准确地说我需要在数组中找到 40-100 范围内的 1 的个数。所以这可以是随机范围，我需要在该范围内找到 1 的计数。我不能做 sum 和 all，因为由于不同的范围要求，我需要一遍又一遍地遍历数组

最佳答案

我很惊讶您考虑将排序作为线性搜索的更快替代方法。

如果您不知道它们出现在哪里，那么没有比线性搜索更好的方法了。也许如果您使用位或 char 数据类型，您可以进行一些优化，但这取决于您要如何使用它。

您可以对此进行的最佳优化是克服分支预测。因为每个值都是零或一，所以您可以使用它来推进用于存储一索引的数组的索引。

简单的方法:

int end = 0;
int indices[N];

for( int i = 0; i < N; i++ )
{
    if( arr[i] ) indices[end++] = i;  // Slow due to branch prediction
}

没有分支:

int end = 0;
int indices[N];

for( int i = 0; i < N; i++ )
{
    indices[end] = i;
    end += arr[i];
}

[编辑] 我测试了上面的内容，发现没有分支的版本快了将近 3 倍(4.36 秒对 11.88 秒，在随机填充的 1 亿元素数组上重复 20 次)。

回来发布结果，我看到您已经更新了您的要求。使用动态编程方法，您想要的东西真的很容易...

您所要做的就是创建一个新的数组，它比数组大一个元素，它存储从数组开头到(但不包括)当前索引的 1 的数量。

arr   :   1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 1 1 1 0 0 0 1
count : 0 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 5 6 6 6 6 7

(我已经偏移了上面的arr所以它排列得更好)

现在您可以在 O(1) 时间内计算任意范围内 1 的数量。要计算索引 A 和 B 之间的 1 的数量，您只需执行以下操作:

int num = count[B+1] - count[A];

显然，您仍然可以使用非分支预测版本来最初生成计数。相对于对每个查询求和的幼稚方法，所有这些都应该给你一个很好的加速:

int *count = new int[N+1];
int total = 0;

count[0] = 0;

for( int i = 0; i < N; i++ )
{
    total += arr[i];
    count[i+1] = total;
}


// to compute the ranged sum:
int range_sum( int *count, int a, int b )
{
    if( b < a ) return range_sum(b,a);
    return count[b+1] - count[a];
}

关于algorithm - 如何随机读取整个数组中散布的所有 1's in an Array of 1' 和 0，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/16557144/

algorithm - 如何随机读取整个数组中散布的所有 1's in an Array of 1' 和 0

上一篇：C#算法为单元测试创建假的加拿大社会安全号码(SIN)

下一篇：algorithm - 如何生成表示 'e' 二进制扩展的系列

algorithm - 如何随机读取整个数组中散布的所有 1's in an Array of 1' 和 0

上一篇：C#算法为单元测试创​​建假的加拿大社会安全号码(SIN)

下一篇：algorithm - 如何生成表示 'e' 二进制扩展的系列

上一篇：C#算法为单元测试创建假的加拿大社会安全号码(SIN)