我在另一个论坛上阅读了以下帖子,该帖子来自一个似乎对 C++ 内部机制了解很多的人关于将数千个键插入“字典”:
e) Map and Set look-up is done with Red-Black or Balanced Tree's and each item is allocated "individually", so if you're allocating 500,000 Instruments [by symbol] with a pointer to an instrument Object-class associated, you have 'N' number of bytes [plus overhead] for the string and 4-bytes [plus overhead] for the pointer. And include; one-minute, five-second, one-second price time-series on all instruments and full trade-history on ALL those Instruments in STD Containers. That's a lot of memory and a hell of a lot More Wasted due to small object Allocation overhead!
f) Notoriously, STD Map & Set walk thru all of the keys to FIND using LowerBound [Less Than Compare] which is slow as hell.
g) Some Genius may say "No, they use an Unsorted Map"...well they don't, but even if they did they are STILL doing a String Compare on a discretely allocated element.
What I do in C++ is the following (example);
a) Create a "custom" in-place String Class-object, which has two personalities; a) a Byte array, and b) an Integer array [of Modulus 4 and Aligned on the Native Boundary]. b) Use Custom Map & Set, which are Hash based in 2x Dimensions with Nodes allocated in a Flat Contiguous Memory region [which may & can dynamically re-size]. c) String [in Integer format] Hashing is done by Integer to pipeline the CPU and Key Comparison is done similarly.
With these techniques, which can only be done in C++, C or ASM there are at least 4-5x ORDERS OF MAGNITUDE the performance of the same thing done in .NET, C# or Java.
如果我大致知道我将插入多少个键,我可以使用哪些技术来设计我自己的 unordered_map 实现,它比我的特定用途的标准实现更有效?
(欢迎任何关于设计哈希函数的 101)
最佳答案
使用 unordered_map
您只需为您的 key 设计一个散列函数。 C++ 标准库为内置键类型提供了一组哈希函数,例如:hash<int>
或 hash<float>
.
如果你声明一个 unordered_map<int,int>
它默认使用 hash<int>
因为它是哈希函数。 但是如果你想使用你自己的对象作为键,你必须提供你自己的散列函数。
优点:
虽然插入时间在unordered_map<T>
更大,但散列通常提供 O(1)
检索 (key,value)
时的复杂性从容器中取出一对。
关于c++ - unordered_map 如何工作/优化设计?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/18674041/