c++ - A reasonable hash for SIMD values?

I want to try out a simple hash map keyed on __m128i, but the compiler complains that the hash function is not compatible:

/Applications/Xcode.app/[...]/c++/v1/__hash_table:880:5: error: static_assert failed due to requirement [...] "the specified hash does not meet the Hash requirements"

    static_assert(__check_hash_requirements<_Key, _Hash>::value,
    ^             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In file included from [...] note: in instantiation of template class [...] requested here
    std::unordered_map<__m128i, std::size_t> hmap;

Now, I can supply a hash function with code similar to the following:

    class hash128i
    {
    public:
        std::size_t operator()(const __m128i &r) const
        {
            return something;
        }
    };

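Here the returned value would be something I make up myself, for example OR-ing the upper and lower 64 bits of the __m128i and then passing the result to std::hash.

However, given how sensitive hash functions are, I don't know whether this approach is sound.

What is a good C++ hash function for __m128i (or other SIMD variables)?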

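Best answer

How good a hash function actually needs to be depends on the properties you require and on how your data is distributed.

If you do not have to defend against malicious input that tries to clog your table with many colliding values, a fairly simple function should be enough.

For short integers, Chris Wellons has done quite a bit of analysis using his hash-prospector program.

A good 64-bit function that he lists is the following: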

uint64_t splittable64(uint64_t x)
{
    x ^= x >> 30;
    x *= UINT64_C(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x *= UINT64_C(0x94d049bb133111eb);
    x ^= x >> 31;
    return x;
}
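To get a rough feel for what "good" means here, one common yardstick is the avalanche effect: flipping a single input bit should flip about half of the output bits. The sketch below measures that for splittable64. The helper name average_flipped_bits and the choice of sample inputs are assumptions made for this illustration, not part of hash-prospector.

```cpp
#include <bitset>
#include <cstdint>

// Same mixer as above, repeated so that this sketch compiles on its own.
inline std::uint64_t splittable64(std::uint64_t x)
{
    x ^= x >> 30;
    x *= UINT64_C(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x *= UINT64_C(0x94d049bb133111eb);
    x ^= x >> 31;
    return x;
}

// Hypothetical helper: average number of output bits that change when a
// single input bit is flipped. The sample inputs (multiples of an odd
// constant) are an arbitrary choice for this sketch.
inline double average_flipped_bits(int samples)
{
    std::uint64_t x = 0;
    std::uint64_t total = 0;
    for (int s = 0; s < samples; ++s) {
        x += UINT64_C(0x9e3779b97f4a7c15);
        const std::uint64_t base = splittable64(x);
        for (int bit = 0; bit < 64; ++bit) {
            const std::uint64_t flipped = splittable64(x ^ (UINT64_C(1) << bit));
            total += std::bitset<64>(base ^ flipped).count();
        }
    }
    return static_cast<double>(total) / (64.0 * samples);
}
```

A result close to 32, half of the 64 output bits, is what one would hope for from a well-mixing function. A simple OR or XOR of raw input words would score far lower on the same measure.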

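You can hash the two halves of the 128-bit value separately and combine them with XOR, rotating one of them if you expect the two halves to be equal often (otherwise equal halves would cancel out to zero). So your solution might look something like this: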

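#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2: __m128i, _mm_storeu_si128

class hash128i
{
public:
    std::size_t operator()(const __m128i &r) const
    {
        // __m128i cannot be cast or shifted like an integer, so copy out its two 64-bit halves.
        uint64_t halves[2];
        _mm_storeu_si128(reinterpret_cast<__m128i *>(halves), r);
        uint64_t lower_hash = splittable64(halves[0]);
        uint64_t upper_hash = splittable64(halves[1]);
        // Rotate left by 31 so that equal halves do not cancel to zero under XOR.
        uint64_t rotated_upper = (upper_hash << 31) | (upper_hash >> 33);
        return static_cast<std::size_t>(lower_hash ^ rotated_upper);
    }
};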
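One thing the hasher alone does not solve: std::unordered_map also needs an equality predicate, and __m128i has no operator== that yields a bool, so the default std::equal_to will most likely not compile either. Below is a minimal, self-contained sketch of how the pieces could fit together on an x86 target with SSE2. The names Hash128, Equal128, Map128 and count_sample are made up for this illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>   // SSE2 intrinsics
#include <unordered_map>

// Mixer from the answer, repeated so that the sketch is self-contained.
inline std::uint64_t splittable64(std::uint64_t x)
{
    x ^= x >> 30;
    x *= UINT64_C(0xbf58476d1ce4e5b9);
    x ^= x >> 27;
    x *= UINT64_C(0x94d049bb133111eb);
    x ^= x >> 31;
    return x;
}

// Hypothetical hasher: mixes each 64-bit half, rotates one, XORs them.
struct Hash128 {
    std::size_t operator()(const __m128i &v) const
    {
        std::uint64_t h[2];
        _mm_storeu_si128(reinterpret_cast<__m128i *>(h), v);
        const std::uint64_t lo = splittable64(h[0]);
        const std::uint64_t hi = splittable64(h[1]);
        return static_cast<std::size_t>(lo ^ ((hi << 31) | (hi >> 33)));
    }
};

// Hypothetical equality predicate: all 16 byte lanes must compare equal.
struct Equal128 {
    bool operator()(const __m128i &a, const __m128i &b) const
    {
        return _mm_movemask_epi8(_mm_cmpeq_epi8(a, b)) == 0xFFFF;
    }
};

using Map128 = std::unordered_map<__m128i, std::size_t, Hash128, Equal128>;

// Counts how often each vector occurs in a small fixed sample.
inline Map128 count_sample()
{
    Map128 counts;
    const __m128i a = _mm_set_epi64x(1, 2);  // upper half 1, lower half 2
    const __m128i b = _mm_set_epi64x(2, 1);  // same halves, swapped
    ++counts[a];
    ++counts[b];
    ++counts[a];
    return counts;
}
```

Because one half is rotated before the XOR, swapping the two halves of a key does not trivially produce the same hash. Some compilers may warn about ignored attributes when __m128i is used as a template argument; that warning appears to be harmless here.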

If your hash table has to withstand malicious input, you probably want a keyed hash function with a random secret key. Have a look at SipHash.
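As a rough idea of what a keyed hash could look like, here is a sketch of SipHash-2-4 specialised to exactly 16 bytes of input, passed as two 64-bit halves (for a __m128i these could be copied out with _mm_storeu_si128). The names rotl64, siphash24_128, random_u64 and KeyedHash128 are made up for this sketch and are not part of any library. For real use, a vetted SipHash implementation checked against the published test vectors would be the safer choice.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>

inline std::uint64_t rotl64(std::uint64_t x, int b)
{
    return (x << b) | (x >> (64 - b));
}

// SipHash-2-4 over a 16-byte message given as two little-endian words.
inline std::uint64_t siphash24_128(std::uint64_t k0, std::uint64_t k1,
                                   std::uint64_t m0, std::uint64_t m1)
{
    std::uint64_t v0 = k0 ^ UINT64_C(0x736f6d6570736575);
    std::uint64_t v1 = k1 ^ UINT64_C(0x646f72616e646f6d);
    std::uint64_t v2 = k0 ^ UINT64_C(0x6c7967656e657261);
    std::uint64_t v3 = k1 ^ UINT64_C(0x7465646279746573);
    auto sipround = [&] {
        v0 += v1; v1 = rotl64(v1, 13); v1 ^= v0; v0 = rotl64(v0, 32);
        v2 += v3; v3 = rotl64(v3, 16); v3 ^= v2;
        v0 += v3; v3 = rotl64(v3, 21); v3 ^= v0;
        v2 += v1; v1 = rotl64(v1, 17); v1 ^= v2; v2 = rotl64(v2, 32);
    };
    // Two message words, then the final word that encodes the length (16).
    const std::uint64_t words[3] = {m0, m1, UINT64_C(16) << 56};
    for (std::uint64_t m : words) {
        v3 ^= m;
        sipround();
        sipround();
        v0 ^= m;
    }
    v2 ^= 0xff;
    sipround(); sipround(); sipround(); sipround();
    return v0 ^ v1 ^ v2 ^ v3;
}

// Hypothetical helper: 64 random bits from the system entropy source.
inline std::uint64_t random_u64()
{
    std::random_device rd;
    return (static_cast<std::uint64_t>(rd()) << 32) | (rd() & 0xffffffffu);
}

// Hypothetical keyed hasher: the secret key is drawn once on construction
// and is preserved when the hasher is copied.
struct KeyedHash128 {
    std::uint64_t k0 = random_u64();
    std::uint64_t k1 = random_u64();
    std::size_t operator()(std::uint64_t lo, std::uint64_t hi) const
    {
        return static_cast<std::size_t>(siphash24_128(k0, k1, lo, hi));
    }
};
```

The point of the secret key is that an attacker who cannot see it cannot precompute inputs that collide, which a fixed, unkeyed mixer like the one above does not protect against.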

A similar question on Stack Overflow: https://stackoverflow.com/questions/54765861/
