c++ - unordered_map 如何工作/优化设计?

标签 c++ performance c++11 hashmap unordered-map

我在另一个论坛上阅读了以下帖子,该帖子来自一个似乎对 C++ 内部机制了解很多的人关于将数千个键插入“字典”:

e) Map and Set look-up is done with Red-Black or Balanced Tree's and each item is allocated "individually", so if you're allocating 500,000 Instruments [by symbol] with a pointer to an instrument Object-class associated, you have 'N' number of bytes [plus overhead] for the string and 4-bytes [plus overhead] for the pointer. And include; one-minute, five-second, one-second price time-series on all instruments and full trade-history on ALL those Instruments in STD Containers. That's a lot of memory and a hell of a lot More Wasted due to small object Allocation overhead!

f) Notoriously, STD Map & Set walk thru all of the keys to FIND using LowerBound [Less Than Compare] which is slow as hell.

g) Some Genius may say "No, they use an Unsorted Map"...well they don't, but even if they did they are STILL doing a String Compare on a discretely allocated element.

What I do in C++ is the following (example);

a) Create a "custom" in-place String Class-object, which has two personalities; a) a Byte array, and b) an Integer array [of Modulus 4 and Aligned on the Native Boundary]. b) Use Custom Map & Set, which are Hash based in 2x Dimensions with Nodes allocated in a Flat Contiguous Memory region [which may & can dynamically re-size]. c) String [in Integer format] Hashing is done by Integer to pipeline the CPU and Key Comparison is done similarly.

With these techniques, which can only be done in C++, C or ASM there are at least 4-5x ORDERS OF MAGNITUDE the performance of the same thing done in .NET, C# or Java.

http://www.elitetrader.com/vb/showthread.php?s=1eb70fb998d8a51d22050ea53d24db21&threadid=204368&perpage=6&pagenumber=3

如果我大致知道我将插入多少个键,我可以使用哪些技术来设计我自己的 unordered_map 实现,它比我的特定用途的标准实现更有效?

(欢迎任何关于设计哈希函数的 101)

最佳答案

使用 unordered_map您只需为您的 key 设计一个散列函数。 C++ 标准库为内置键类型提供了一组哈希函数,例如:hash<int>hash<float> . 如果你声明一个 unordered_map<int,int>它默认使用 hash<int>因为它是哈希函数。 但是如果你想使用你自己的对象作为键,你必须提供你自己的散列函数


优点: 虽然插入时间在unordered_map<T>更大,但散列通常提供 O(1)检索 (key,value) 时的复杂性从容器中取出一对。

关于c++ - unordered_map 如何工作/优化设计?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/18674041/

相关文章:

c++ - 如何组合两个字符串?

java - 修改编辑距离算法以不计算所有距离

performance - 我可以在部署后监控 Azure 应用程序的性能吗(类似于分析)

C++ 打印到终端会显着减慢代码速度吗?

C++11:为什么 std::condition_variable 使用 std::unique_lock?

c++ - 我怎样才能部分特化可变参数模板函数

c++ - 同一台计算机上的 UDP 连接。 1 个服务器。多个客户端

c++ - 列表迭代器和 - 运算符的问题

C# 等效于另一个进程的postthreadmessage

c++11:为什么静态 constexpr 的类内初始化不是定义?