c++ - 在多个查询的子数组中查找不同(唯一)值的数量

标签 c++ arrays algorithm data-structures wavelet

我有一个数组(可以有 2X10^5 个值)。我想对这个数组执行大量查询。每个查询都是 [L,R] 类型,该查询的结果应该是子数组中从索引 L 开始到索引 R 结束的唯一值的数量。

我知道这可以在 O(nrootn) 时间内使用 Mo 算法完成。然而,问题在于 Mo 的算法是一种离线算法。我正在寻找的是一种在线算法,因为在我的情况下,前一个查询的结果决定了下一个查询。

我尝试使用形成一个段树,其中节点将存储范围内的所有不同元素。然而,结果证明这对我的目的来说太慢了。这种方法的预处理花费了太多时间。

最佳答案

这是我使用 here 尝试解决方案的 C++ 尝试(也发布了 Wavelet tree ) ,使用改编自 https://www.geeksforgeeks.org/wavelet-trees-introduction 的代码实现.重新表述问题的想法(如 Photon commented 链接)是首先构造一个数组,该数组列出原始数组中每个对应单元格的右侧下一个重复元素的索引。然后问题就变成了找到区间中有多少单元格具有这样一个超出当前区间的“下一个索引”(那些在区间内显然没有重复的单元格),可以用装饰的小波树查询。请参阅底部的(非零基)查询示例。

// Adapted from https://www.geeksforgeeks.org/wavelet-trees-introduction

#include <iostream>
#include <vector>
#include <map>
#include <algorithm>
#include <climits>
using namespace std;

// wavelet tree class 
class wavelet_tree { 
public: 
    // Range to elements 
    int low, high; 

    // Left and Right child 
    wavelet_tree* l, *r; 

    std::vector<int> freq;

    // Default constructor 
    // Array is in range [x, y] 
    // Indices are in range [from, to] 
    wavelet_tree(int* from, int* to, int x, int y) 
    { 
        // Initialising low and high 
        low = x, high = y; 

        // Array is of 0 length 
        if (from >= to) 
            return; 

        // Array is homogenous 
        // Example : 1 1 1 1 1 
        if (high == low) { 
            // Assigning storage to freq array 
            freq.reserve(to - from + 1); 

            // Initialising the Freq array 
            freq.push_back(0); 

            // Assigning values 
            for (auto it = from; it != to; it++) 

                // freq will be increasing as there'll 
                // be no further sub-tree 
                freq.push_back(freq.back() + 1); 

            return; 
        } 

        // Computing mid 
        int mid = (low + high) / 2; 

        // Lambda function to check if a number 
        // is less than or equal to mid 
        auto lessThanMid = [mid](int x) { 
            return x <= mid; 
        }; 

        // Assigning storage to freq array 
        freq.reserve(to - from + 1); 

        // Initialising the freq array 
        freq.push_back(0); 

        // Assigning value to freq array 
        for (auto it = from; it != to; it++) 

            // If lessThanMid returns 1(true), we add 
            // 1 to previous entry. Otherwise, we add 0 
            // (element goes to right sub-tree) 
            freq.push_back(freq.back() + lessThanMid(*it));      

        // std::stable_partition partitions the array w.r.t Mid 
        auto pivot = std::stable_partition(from, to, lessThanMid); 

        // Left sub-tree's object 
        l = new wavelet_tree(from, pivot, low, mid); 

        // Right sub-tree's object 
        r = new wavelet_tree(pivot, to, mid + 1, high); 
    } 

    // Count of numbers in range[L..R] less than 
    // or equal to k 
    int kOrLess(int l, int r, int k) 
    { 
        // No elements int range is less than k 
        if (l > r or k < low) 
            return 0; 

        // All elements in the range are less than k 
        if (high <= k) 
            return r - l + 1; 

        // Computing LtCount and RtCount 
        int LtCount = freq[l - 1]; 
        int RtCount = freq[r]; 

        // Answer is (no. of element <= k) in 
        // left + (those <= k) in right 
        return (this->l->kOrLess(LtCount + 1, RtCount, k) + 
            this->r->kOrLess(l - LtCount, r - RtCount, k)); 
    } 

    // Count of numbers in range[L..R] greater than 
    // or equal to k 
    int kOrMore(int l, int r, int k) 
    { 
        // No elements int range are greater than k 
        if (l > r or k > high) 
            return 0; 

        // All elements in the range are greater than k 
        if (low >= k) 
            return r - l + 1; 

        // Computing LtCount and RtCount 
        int LtCount = freq[l - 1]; 
        int RtCount = freq[r]; 

        // Answer is (no. of element <= k) in 
        // left + (those <= k) in right 
        return (this->l->kOrMore(LtCount + 1, RtCount, k) + 
            this->r->kOrMore(l - LtCount, r - RtCount, k)); 
    }

}; 


int main() 
{ 
    int size = 7, high = INT_MIN;
    int arr[] = {1, 2, 3, 2, 4, 3, 1};
    int next[size];
    std::map<int, int> next_idx;

    for (int i=size-1; i>=0; i--){
        if (next_idx.find(arr[i]) == next_idx.end())
            next[i] = size + 1;
        else
            next[i] = next_idx[arr[i]];
        next_idx[arr[i]] = i + 1;
        high = max(high, next[i]);
    } 

    // Object of class wavelet tree 
    wavelet_tree obj(next, next + size, 1, high);

    // Queries are NON-zero-based
    //
    //  1  2  3  4  5  6  7
    // {1, 2, 3, 2, 4, 3, 1};
    // query([3, 6]) = 3;
    cout << obj.kOrMore(3, 6, 7) << '\n';
    // query([1, 4]) = 3;
    cout << obj.kOrMore(1, 4, 5) << '\n';
    // query([1, 7]) = 4;
    cout << obj.kOrMore(1, 7, 8) << '\n';

    return 0; 
}

关于c++ - 在多个查询的子数组中查找不同(唯一)值的数量,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/62261464/

相关文章:

c++ - 指针地址的内存在函数返回前后偶尔改变

c++ - 在 C++ 中访问过度分配的内存

c - 如何使用C中的指针将十六进制转换为二进制

algorithm - 处理图中的负循环

c# - 如何按类型排序列表?

c++ - 是否可以使用另一个 cpp 文件中定义的类而不是任何 header ?

c++ - boost::property_tree : 解析复杂的xml结构

c++ - 从 NFQueue 读取数据包时的序列号为零

c++ - 我在 main 下的输出不是我所期望的

algorithm - 计算 3D 缩减凸包