Memcached 使用分布式一致性哈希来选择将 key 放在哪个服务器上,但它使用哪种哈希算法将字符串 key 映射到最终哈希,在该哈希上应用 Ketama 算法来选择服务器。该算法在将相似的 key 传播到不同的服务器方面有多好。
最佳答案
根据hash.c中的源代码, memcached 使用以下算法:
The hash function used here is by Bob Jenkins, 1996:
http://burtleburtle.net/bob/hash/doobs.html
"By Bob Jenkins, 1996. bob_jenkins@burtleburtle.net. You may use this code any way you wish, private, educational, or commercial. It's free."
来自 Bob Jenkins 的网站:
I offer you a new hash function for hash table lookup that is faster and more thorough than the one you are using now. I also give you a way to verify that it is more thorough.
另外,他的要求是:
- The keys are unaligned variable-length byte arrays.
- Sometimes keys are several such arrays.
- Sometimes a set of independent hash functions were required.
- Average key lengths ranged from 8 bytes to 200 bytes.
- Keys might be character strings, numbers, bit-arrays, or weirder things.
- Table sizes could be anything, including powers of 2.
- The hash must be faster than the old one.
- The hash must do a good job.
...
The real requirement then is that a good hash function should distribute hash values uniformly for the keys that users actually use.
回到您的其他问题,他测量了算法均匀分布散列值的能力,因此我认为散列在将相似的 key 传播到不同服务器方面做得很好。如果您有顾虑,代码是隔离的,因此您应该能够运行自己的测试。
关于algorithm - memcached 使用什么哈希算法来哈希键?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/10434375/