c++ - 对相当大的整数的大集合的操作的快速实现

标签 c++ performance algorithm integer set

描述:

我实现了以下类 LabSetInt64(参见下面的代码)。

这里的目标是尽可能快地操作大量大整数(最多 10M 的值)。 我的主要要求集中在:

  • !关键:尽快获取集合的大小/基数
  • !重要:能够快速迭代集合

那么,从下面的实现开始,还有两点我想和大家讨论一下。

“popcount()”惰性实现

int LabSetInt64::popcount(uint64_t x) {
    int count;
    for (count=0; x; count++)
        x &= x-1;
    return count;
}

我知道我选择的实现是市场上最慢的实现之一,但我不能 自己找出一个更好的。因此,如果您能给我指出一个更好的(当然是 64 位的)...

我听说了一种非常有效的计算基数的算法:“MIT HAKMEM 169”算法>>

// MIT HAKMEM.
// MIT HAKMEM Count is funky. Consider a 3 bit number as being 4a+2b+c. If we
// shift it right 1 bit, we have 2a+b. Subtracting this from the original gives
// 2a+b+c. If we right-shift the original 3-bit number by two bits, we get a,
// and so with another subtraction we have a+b+c, which is the number of bits
// in the original number. How is this insight employed? The first assignment
// statement in the routine computes tmp. Consider the octal representation of
// tmp. Each digit in the octal representation is simply the number of 1¡äs in
// the corresponding three bit positions in n. The last return statement sums
// these octal digits to produce the final answer. The key idea is to add
// adjacent pairs of octal digits together and then compute the remainder
// modulus 63. This is accomplished by right-shifting tmp by three bits, adding
// it to tmp itself and ANDing with a suitable mask. This yields a number in
// which groups of six adjacent bits (starting from the LSB) contain the number
// of 1¡äs among those six positions in n. This number modulo 63 yields the
// final answer. For 64-bit numbers, we would have to add triples of octal
// digits and use modulus 1023. This is HACKMEM 169, as used in X11 sources.
// Source: MIT AI Lab memo, late 1970¡äs.
// http://gurmeet.net/2008/08/05/fast-bit-counting-routines/
int CheckMIT(unsigned int n) 
{
   /* works for 32-bit numbers only    */
   /* fix last line for 64-bit numbers */
   register unsigned int tmp;

   tmp = n - ((n >> 1) & 033333333333)
           - ((n >> 2) & 011111111111);

   // the remainder of 63 for i equals the sum of 7 octal numbers which
   // consititutes the 32-bit integer.
   // For example, 03456 % 63 == 034 + 056
   return ((tmp + (tmp >> 3)) & 030707070707) % 63;
}

我对上述算法的问题是我对它的理解不够 将其转换为“uint64_t”(64 位)版本(尽管执行此操作的指令很少)。

因此,如果你们中的一个人可以帮助我完成这项任务(或者至少帮助我理解),那就太棒了。

这是LabSetInt64.h:

/*
 * LabSetInt64.h
 *
 *  Created on: Feb 20, 2013
 *      Author: golgauth
 */

#ifndef LABSETINT64_H_
#define LABSETINT64_H_

#include <ctype.h>
#include <stdio.h>
#include <string.h>

#include <iostream>
#include <stdint.h>
#include <limits>
#include <algorithm>

#include <LabTimeUtils.h>

using namespace std;


namespace elps {


// Lets assume we need at most 1M distinct indices in our sets
#define DEFAULT_SIZE_IN_BITS 1000000


class LabSetInt64
{
public:

    LabSetInt64();
    LabSetInt64(int sizeinbits);
    LabSetInt64(const LabSetInt64 &);
    ~LabSetInt64();


    void Clear();
    void Resize(int sizeinbits);
    void Set(int i);
    void UnSet(int i);
    void Set(int i, bool b);
    bool Find(int i);
    int  Cardinality();
    int  NextSetBit(int i);

    void Print();

    /**
     * Should have been "=" operator, but this overload is not compatible
     * with Cython, so we'll use "<<" instead.
     * @param
     * @return
     */
    const LabSetInt64 & operator << ( const LabSetInt64 & );
    LabSetInt64         operator ~  () const;
    LabSetInt64         operator +  ( const LabSetInt64 & );
    LabSetInt64         operator *  ( const LabSetInt64 & );
    LabSetInt64         operator -  ( const LabSetInt64 & );
    LabSetInt64         operator ^  ( const LabSetInt64 & );

private:
    uint64_t *data;
    int data_len;
    int bits_size;

    int popcount(uint64_t x);
    int nb_trailing_zeros(uint64_t v);
};


} /* namespace elps */
#endif /* LABSETINT64_H_ */

这里是LabSetInt64.cpp:

/*
 * LabSetInt64.cpp
 *
 *  Created on: Feb 20, 2013
 *      Author: golgauth
 */

#include "LabSetInt64.h"

namespace elps {


/** PUBLICS **/


LabSetInt64::LabSetInt64() : LabSetInt64(DEFAULT_SIZE_IN_BITS) {
}

LabSetInt64::LabSetInt64(int sizeinbits) {
    bits_size = sizeinbits;
    data_len = (bits_size + 63) / 64;
    data = new uint64_t[data_len];
}

LabSetInt64::LabSetInt64(const LabSetInt64& src) {
    bits_size = src.bits_size;
    data_len = src.data_len;
    data = new uint64_t[data_len];
    for( int i = 0; i < data_len; i++ )            // copy into this set
        data[i] = src.data[i];
}

LabSetInt64::~LabSetInt64() {
}


void LabSetInt64::Clear() {
    std::fill_n(data, data_len, 0);
}

void LabSetInt64::Resize(int sizeinbits) {
    bits_size = sizeinbits + 1;
    int size = (bits_size + 63) / 64;
    uint64_t *new_data = new uint64_t[size];

    memcpy( new_data, data, data_len * sizeof(uint64_t) );

    data_len = size;
    delete [] data;
    data = new_data;
}


void LabSetInt64::Set(int i) {
    data[i / 64] |= (1l << (i % 64));
}

void LabSetInt64::UnSet(int i) {
    data[i / 64] &= ~(1l << (i % 64));
}

void LabSetInt64::Set(int i, bool b) {
    if(b) Set(i); else UnSet(i);
}

bool LabSetInt64::Find(int i) {
    return ((data[i / 64]) & (1l << (i % 64)));           // binary AND;
}


int LabSetInt64::Cardinality() {
    int sum = 0;
    for(int i=0; i<data_len; i++)
        sum += popcount(data[i]);
    //sum += __builtin_popcount(data[i]);
    return sum;
}

int LabSetInt64::NextSetBit(int i) {
    int x = i / 64;
    long w = data[x];
    w >>= (i % 64);
    if (w != 0) {
        return i + nb_trailing_zeros(w);
    }
    ++x;
    for (; x < data_len; ++x) {
        if (data[x] != 0) {
            return x * 64 + nb_trailing_zeros(data[x]);
        }
    }
    return -1;
}


void LabSetInt64::Print() {
    int cur_id = this->NextSetBit(0);
    cout << "\nSet size : " << this->Cardinality() << endl;
    cout << "{ ";
    while (cur_id != -1)
    {
        cout << (cur_id) << " ";
        cur_id = this->NextSetBit(cur_id+1);
    }
    cout << "}" << endl;;
}


/** PRIVATES **/

//This is better when most bits in x are 0
//It uses 3 arithmetic operations and one comparison/branch per "1" bit in x.
int LabSetInt64::popcount(uint64_t x) {
    int count;
    for (count=0; x; count++)
        x &= x-1;
    return count;
}

// output: c will count v's trailing zero bits,
// so if v is 1101000 (base 2), then c will be 3
int LabSetInt64::nb_trailing_zeros(uint64_t v) {
    int c;
    if (v)
    {
        v = (v ^ (v - 1)) >> 1;         // Set v's trailing 0s to 1s and zero rest
        for (c = 0; v; c++) {
            v >>= 1;
        }
    }
    else
        c = 8 * sizeof(v);

    return c;
}


/** ***************************************************************************
*********************************   OPERATORS   *******************************
*******************************************************************************/

/** PUBLICS **/


/*******************************************************************************
 * TODO >> May be better to use  "DEFAULT_SIZE_IN_BITS" instead of "data_len"  *
 *         in all the operators !!!                                            *
 *                                                                             *
 * => For now, we assume that all the Sets are created with the default        *
 *    constructor (aka : bit_size = DEFAULT_SIZE_IN_BITS)                      *
 *******************************************************************************/


/**
 *  "operator = " assigns the right hand side to this set.
 *  returns: nothing
 */
const LabSetInt64 &LabSetInt64::operator << ( const LabSetInt64 &rhs )
{
    if( &rhs != this )                              // avoid self assignment
    {
        this->Resize(rhs.bits_size - 1);
        for( int i = 0; i < data_len; i++ )         // copy into this set
            data[i] = rhs.data[i];
    }
    return *this;                                   // enable x << y << z;
}


/**
 * "operator ~ " performs set complement operation (not).
 * returns: pointer to set
 */
LabSetInt64 LabSetInt64::operator ~ () const
{
    LabSetInt64 rv;
    for( int i = 0; i < data_len; i++ )
        rv.data[i] = ~data[i];                      // bitwise complement
    return rv;
}


/**
 * "operator + " performs set union operation (or).
 * returns: pointer to set
 */
LabSetInt64 LabSetInt64::operator + ( const LabSetInt64 &rhs )
{
    LabSetInt64 rv;
    for( int i = 0; i < data_len; i++ )
        rv.data[i] = data[i] | rhs.data[i];         // bitwise OR
    return rv;
}


/**
 * "operator * " performs set intersection operation (and).
 * returns: pointer to set
 */
LabSetInt64 LabSetInt64::operator * ( const LabSetInt64 &rhs )
{
    LabSetInt64 rv;
    for( int i = 0; i < data_len; i++ )
        rv.data[i] = data[i] & rhs.data[i];         // bitwise AND
    return rv;
}


/**
 * "operator - " performs set difference operation.
 * returns: pointer to set
 */
LabSetInt64 LabSetInt64::operator - ( const LabSetInt64 &rhs )
{
    LabSetInt64 rv;
    for( int i = 0; i < data_len; i++ )
        rv.data[i] = data[i] & ( ~rhs.data[i] );    // bitwise a AND ~b
    return rv;
}


/**
 * "operator ^ " performs set symmetric difference operation (xor).
 * returns: pointer to set
 */
LabSetInt64 LabSetInt64::operator ^ ( const LabSetInt64 &rhs )
{
    LabSetInt64 rv;
    for( int i = 0; i < data_len; i++ )
        rv.data[i] = data[i] ^ rhs.data[i];         // bitwise XOR
    return rv;
}


/** *****************************************************************************
*********************************************************************************/


} /* namespace elps */

对于其余的实现,我对表现非常满意,但再一次,任何好的 我们将非常感谢您的评论。

非常感谢您的宝贵时间。

编辑: 好吧,毕竟我并不完全相信“稀疏”解决方案,因为我需要并集、交集、差异特征,我想这比我的“按位”版本要昂贵得多(需要迭代可能有大量的项目),所以我仍然有兴趣获得一些关于如何将上述“MIT HAKMEM”算法转换为 64 位的帮助。非常感谢。

编辑: 这个版本包含一些小缺陷。请更好地查看下面给出的源代码。

最佳答案

Russ Cox 在他的网站上描述了 very simple integer set data structure它具有 O(n) 迭代(相对于 O(U),其中 U 是某个最大值)和 O(1) 插入、删除和计数。它不使用一点摆弄。它也倾向于使用更多空间,但可以扩展为 O(1) 摊销附加(通过简单地使用 std::vector 而不是普通数组)。

(我会粘贴示例代码,但我认为 Cox 的描述足够清晰,我只是介绍错误。)

关于c++ - 对相当大的整数的大集合的操作的快速实现,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/14981143/

相关文章:

迭代计算数字倒数的算法

c++ - 如何以给定的 MIDI 音符/ Octave 音程播放声音?

c++ - 模块化功能背后的逻辑

performance - Lua优化内存

algorithm - Web 表单的 Winkler 算法使用

string - 在多个字符串中查找子序列出现

c++ - 优化光线追踪器中的 BVH 遍历

c++ - Python中低级标准输入的重复重定向

oracle - 简单的 Oracle UPDATE 语句性能异常糟糕

javascript - 使用 JavaScript 访问 CSS 样式太慢