floating-point - Size of exponent and fraction in float256

The table below shows what I am after:

╔════════╦════════╦════════════╦════════════╗
║  name  ║  sign  ║  exponent  ║  fraction  ║
╠════════╬════════╬════════════╬════════════╣
║float16 ║    1   ║      5     ║     10     ║
╠════════╬════════╬════════════╬════════════╣
║float32 ║    1   ║      8     ║     23     ║
╠════════╬════════╬════════════╬════════════╣
║float64 ║    1   ║     11     ║     52     ║
╠════════╬════════╬════════════╬════════════╣
║float128║    1   ║     15     ║    112     ║
╠════════╬════════╬════════════╬════════════╣
║float256║    1   ║    ????    ║    ????    ║
╠════════╬════════╬════════════╬════════════╣
║float512║    1   ║    ????    ║    ????    ║
╚════════╩════════╩════════════╩════════════╝

My question is how to calculate the number of exponent and fraction bits for a given total number of bits (for example 256, 512, or 1024).

Best answer

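Early drafts of IEEE-754 (2008) defined guidelines for what the widths of the exponent and significand fields "should" be for floating-point numbers of arbitrary width. This was not a hard requirement, only recommended practice. It was judged to be too much trouble for the minimal benefit it provided, so it was removed from the standard entirely and replaced with: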

Language standards should define mechanisms supporting extendable precision for each supported radix. Language standards supporting extendable precision shall permit users to specify p and emax. Language standards shall also allow the specification of an extendable precision by specifying p alone; in this case emax shall be defined by the language standard to be at least 1000×p when p is ≥ 237 bits in a binary format or p is ≥ 51 digits in a decimal format.

(3.7 Extended and extendable precisions, p. 14).

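That said, the standard still defines (without requiring) an "interchange format" for every size larger than 128 that is a multiple of 32 bits, in the table in section 3.6 (p. 13). Specifically, a binary format of width k has an exponent of round(4*log2(k)) - 13 bits. For the specific case of k=256, this gives: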

exponent: round(4*log2(256)) - 13 = 32 - 13 = 19
significand (stored fraction bits): 256 - 1 - 19 = 236

For a 384-bit wide format that follows this formula, the exponent width would be:

round(4*log2(384)) - 13 = round(34.339850002884624) - 13 = 21 bits
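The same arithmetic can be sketched for any width. The following is a minimal Python sketch, assuming the formula quoted above; the function name `binary_interchange_fields` is made up for illustration and is not part of any library. It restricts k to multiples of 32 from 128 upward, which is where the formula is stated to apply.

```python
import math


def binary_interchange_fields(k):
    """Return (exponent_bits, fraction_bits) for a k-bit binary interchange format.

    Hypothetical helper: applies round(4*log2(k)) - 13 for the exponent and
    gives the remaining bits, minus the sign bit, to the fraction.
    """
    if k < 128 or k % 32 != 0:
        raise ValueError("formula is only applied here for multiples of 32, k >= 128")
    exponent_bits = round(4 * math.log2(k)) - 13
    fraction_bits = k - 1 - exponent_bits  # 1 bit is reserved for the sign
    return exponent_bits, fraction_bits


for width in (128, 256, 384, 512, 1024):
    exponent_bits, fraction_bits = binary_interchange_fields(width)
    print(f"float{width}: sign=1 exponent={exponent_bits} fraction={fraction_bits}")

# float128: sign=1 exponent=15 fraction=112
# float256: sign=1 exponent=19 fraction=236
# float384: sign=1 exponent=21 fraction=362
# float512: sign=1 exponent=23 fraction=488
# float1024: sign=1 exponent=27 fraction=996
```

With the implicit leading bit counted, the precision p is one more than the fraction width, so the 256-bit format has p = 237, which matches the 237-bit threshold in the passage quoted from the standard.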

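Note that there are many software packages for arbitrary-precision floating-point arithmetic that do not follow this guideline. This is only the definition of the "binary256 interchange format", not something any given implementation is required to use.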

Source: the original question on Stack Overflow, https://stackoverflow.com/questions/7059672/
