c++ - How do I use decimal (float) in C++?

According to IEEE 754-2008:

There are three binary floating-point basic formats (which can be encoded using 32, 64 or 128 bits) and two decimal floating-point basic formats (which can be encoded using 64 or 128 bits).

The chart below accompanies that text. In C++, I believe float and double are single and double precision (binary32 and binary64).
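The C++ standard does not require float and double to be binary32 and binary64, so this belief can be checked on a given toolchain. Below is a minimal sketch using std::numeric_limits; the names FormatInfo, describe, is_binary32, and is_binary64 are hypothetical and introduced only for illustration.

```cpp
#include <limits>

// Properties that identify an IEEE 754 binary format.
// (FormatInfo is a hypothetical helper type, not part of any library.)
struct FormatInfo {
    bool iec559;  // type claims IEEE 754 (IEC 60559) conformance
    int  radix;   // base of the representation
    int  digits;  // significand bits, including the implicit leading bit
    int  e_min;   // smallest normal exponent, in the IEEE 754 convention
    int  e_max;   // largest exponent, in the IEEE 754 convention
};

// numeric_limits uses a significand in [0.5, 1), so its min_exponent and
// max_exponent are one higher than the IEEE 754 E min and E max values.
template <typename T>
constexpr FormatInfo describe() {
    using L = std::numeric_limits<T>;
    return FormatInfo{L::is_iec559, L::radix, L::digits,
                      L::min_exponent - 1, L::max_exponent - 1};
}

constexpr bool is_binary32(const FormatInfo& f) {
    return f.iec559 && f.radix == 2 && f.digits == 24 &&
           f.e_min == -126 && f.e_max == 127;
}

constexpr bool is_binary64(const FormatInfo& f) {
    return f.iec559 && f.radix == 2 && f.digits == 53 &&
           f.e_min == -1022 && f.e_max == 1023;
}
```

Calling is_binary32(describe<float>()) and is_binary64(describe<double>()) should both yield true on common x86-64 and ARM64 toolchains, but the result depends on the platform rather than on the language.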

Name        Common name          Base  Digits  E min   E max   Decimal digits  Decimal E max
binary32    Single precision     2     23+1    −126    +127    7.22            38.23
binary64    Double precision     2     52+1    −1022   +1023   15.95           307.95
binary128   Quadruple precision  2     112+1   −16382  +16383  34.02           4931.77
decimal32                        10    7       −95     +96     7               96
decimal64                        10    16      −383    +384    16              384
decimal128                       10    34      −6143   +6144   34              6144
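The last two columns for the binary formats can be reproduced by scaling the binary figures by log10(2), about 0.30103. The sketch below shows that arithmetic; the function names decimal_digits and decimal_e_max are hypothetical.

```cpp
#include <cmath>

// Decimal digits equivalent to a significand of the given width in bits:
// bits * log10(2). For binary32, 24 bits gives about 7.22.
double decimal_digits(int significand_bits) {
    return significand_bits * std::log10(2.0);
}

// Decimal exponent equivalent to a binary E max: e_max * log10(2).
// For binary32, E max = 127 gives about 38.23.
double decimal_e_max(int binary_e_max) {
    return binary_e_max * std::log10(2.0);
}
```

For example, decimal_digits(53) comes out near 15.95 and decimal_e_max(1023) near 307.95, matching the binary64 row.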

What class/struct can I use for decimalX, and what class/struct can I use for binary128? Are these classes/structs standard or nonstandard?

Best answer

In addition to the 32-bit float and the 64-bit double, GCC offers __float80, __float128, _Decimal32, _Decimal64, and _Decimal128; for ARM targets it also offers the half-precision __fp16.
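As a rough illustration of what the extra precision of __float128 buys, the sketch below adds 2^-100 to 1 and checks whether the sum differs from 1. It assumes that the predefined macro __SIZEOF_FLOAT128__ signals that the compiler provides __float128 (true for GCC and Clang on x86-64 as far as I know); the names has_float128, quad, and keeps_tiny_addend are hypothetical. When the type is unavailable, the sketch falls back to long double so that it still compiles.

```cpp
#include <cmath>

// Assumption: __SIZEOF_FLOAT128__ is predefined when __float128 exists.
#if defined(__SIZEOF_FLOAT128__)
constexpr bool has_float128 = true;
using quad = __float128;        // GCC extension type
#else
constexpr bool has_float128 = false;
using quad = long double;       // fallback so the sketch still compiles
#endif

// Returns true when adding 2^-100 to 1 changes the value in type T.
// Representing 1 + 2^-100 exactly needs a 101-bit significand, which
// binary128 (113 bits) has and binary64 (53 bits) does not.
template <typename T>
bool keeps_tiny_addend() {
    const T one = 1;
    const T tiny = static_cast<T>(std::ldexp(1.0, -100));  // 2^-100, exact in double
    const T sum = one + tiny;
    return sum != one;
}
```

With a 64-bit double, keeps_tiny_addend<double>() returns false because the addend is rounded away, whereas keeps_tiny_addend<quad>() returns true when quad really is __float128.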

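Intel CPUs support 80-bit floats in hardware through the legacy scalar x87 FPU instructions (but not through the SSE vector instructions). I am not aware of any mainstream CPU with hardware support for the decimal FP types.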

It looks like current Microsoft compilers use 64 bits for both double and long double, but older versions used 80 bits for long double.
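Because the width of long double differs between compilers, it can help to ask the toolchain directly. The following is a minimal sketch that classifies long double by its significand width; LongDoubleKind, long_double_bits, and classify_long_double are hypothetical names, and the mapping from bit counts to formats is an assumption based on the formats discussed above.

```cpp
#include <limits>

// Significand width of long double, in bits, on the current toolchain.
constexpr int long_double_bits = std::numeric_limits<long double>::digits;

// Hypothetical labels for the layouts mentioned in this answer.
enum class LongDoubleKind { SameAsDouble, X87Extended, Binary128, Other };

constexpr LongDoubleKind classify_long_double() {
    return long_double_bits == std::numeric_limits<double>::digits
               ? LongDoubleKind::SameAsDouble   // e.g. current Microsoft compilers
         : long_double_bits == 64
               ? LongDoubleKind::X87Extended    // 80-bit x87 format
         : long_double_bits == 113
               ? LongDoubleKind::Binary128      // quadruple precision
               : LongDoubleKind::Other;
}
```

On GCC for x86-64 Linux this should report X87Extended, while a compiler that maps long double to double reports SameAsDouble.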

See the documentation here:

Regarding "c++ - How do I use decimal (float) in C++?", a similar question was found on Stack Overflow: https://stackoverflow.com/questions/9316476/
