我写了一个快速的 Canvas 可视化来查看我从 C++ 移植到 JavaScript 的哈希算法的分布。
我看到奇怪的行为,无论我用什么修改散列,0 都是严重有偏见的,因为它被选择的频率是散列函数中大多数其他数字的两倍。
您可以在以下位置查看演示:http://jsfiddle.net/x5L73/2/
和原始的 C++ 算法:http://www.azillionmonkeys.com/qed/hash.html
我所指的代码部分位于 jsFiddle 的底部:
// hash is 0 twice as often as anything else
var hash = app.Hash( word ) % ( 3499 )
, b1 = 0|hash / 59
, b2 = hash % 59;
对我来说奇怪的是 hash
为零的频率是 any 其他值的两倍,无论我选择用什么来修改它。在此示例中,它是零 1/3499
次,而 任何 其他数字被命中 1/6998
次。这是通过暴力测试确定的:
if( hash!==1234 ){ nonZero++; }else{ zero++ } // 1234 is a random number to check
if( Math.random() < .00001 ){ console.log( zero, nonZero, 0|nonZero/zero ); }
我在这里错过了什么???
最佳答案
虽然这是一个非常有趣的事实,在处理整数时可能会派上用场,就像哈希一样,但错误是不是,因为 JavaScript 有负零也是...
OP 报告的原始原因是:
It's because I was accidentally discarding all negative numbers in the visualization.
是的,负数并不是那么微不足道,我们的大脑有时会忽略它们 - 特别是当他们长时间专注于涉及整数的特定难题时,就像试图找出好的哈希方法,然后切换一个看似简单得多的任务:显示结果...
所以真正的答案是:除了负零之外,JavaScript 还有更多的负数......不要忘记将它们计算在内 - 即使在easy 可视化任务。
长话短说
我把它留在这里,因为这可能对将来遇到类似问题的任何人都有用,因为它可能会导致类似的情况。
看这个问题:+0 and -0 in JavaScript (negative zero and positive zero in JavaScript)
引用:
JavaScript uses IEEE 745 standard to represent numbers. From Wikipedia:
Signed zero is zero with an associated sign. In ordinary arithmetic, −0 = +0 = 0. However, in computing, some number representations allow for the existence of two zeros, often denoted by −0 (negative zero) and +0 (positive zero). This occurs in some signed number representations for integers, and in most floating point number representations. The number 0 is usually encoded as +0, but can be represented by either +0 or −0.
The IEEE 754 standard for floating point arithmetic (presently used by most computers and programming languages that support floating point numbers) requires both +0 and −0. The zeroes can be considered as a variant of the extended real number line such that 1/−0 = −∞ and 1/+0 = +∞, division by zero is only undefined for ±0/±0 and ±∞/±∞.
关于javascript - 哈希分布,为什么 0 总是重权重?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/15242983/