c++ - Denormalized floats in Objective-C?

Tags: c++ objective-c performance floating-point

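What is the relevance of the Stack Overflow question/answer "Why does changing 0.1f to 0 slow down performance by 10x?" for Objective-C? If there is any relevance, how should this change my coding habits? Is there some way to turn off denormalized floats on Mac OS X?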

It seems that this is completely irrelevant to iOS. Is that correct?

Best answer

As I said in reply to your comment:

it is more of a CPU than a language issue, so it probably has relevance for Objective-C on x86. (iPhone's ARMv7 doesn't seem to support denormalized floats, at least with the default runtime/build settings)

Update

I just tested it. The slowdown is observed on Mac OS X on x86, but not on iOS on ARMv7 (default build settings).

As expected, denormalized floats appear again when running in the iOS simulator (on x86).
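Since the slowdown shows up on x86, one way to sidestep it there is to ask the SSE unit to flush denormals to zero. The following is only a sketch and is not part of the tests above. It assumes an x86 target that does its floating-point math in SSE (as x86-64 does), a CPU that supports the DAZ bit, and a GCC- or Clang-compatible compiler. The helper names `enable_flush_to_zero` and `halve_at_runtime` are made up for illustration. On other architectures the first helper does nothing and reports that.

```c
#include <float.h>

#if defined(__SSE__) || defined(__x86_64__) || defined(_M_X64)
#include <xmmintrin.h>
#define HAVE_SSE_CONTROL 1
#else
#define HAVE_SSE_CONTROL 0
#endif

/* Hypothetical helper: turn on flush-to-zero (FTZ) and denormals-are-zero
 * (DAZ) for the calling thread. Returns 1 if the mode was set, 0 if this
 * build target has no SSE control register to set. */
int enable_flush_to_zero(void)
{
#if HAVE_SSE_CONTROL
    _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);  /* denormal results become 0 */
    _mm_setcsr(_mm_getcsr() | 0x0040u);          /* DAZ: MXCSR bit 6, denormal inputs read as 0 */
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical helper: halve a float at run time. The volatile qualifiers
 * keep the compiler from folding the division at compile time. */
float halve_at_runtime(float value)
{
    volatile float input = value;
    volatile float result = input / 2.0f;
    return result;
}
```

After a successful `enable_flush_to_zero()`, `halve_at_runtime(FLT_MIN)` should come back as zero instead of a denormal. The setting is per thread and changes numerical results, so it is a trade-off rather than a free speedup.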

Interestingly, FLT_MIN and DBL_MIN are each defined as the smallest non-denormalized number (on iOS, Mac OS X and Linux). Strange things happen when you use

DBL_MIN/2.0

in your code; the compiler happily sets up a denormalized constant, but as soon as the (ARM) CPU touches it, it is set to zero:

double test = DBL_MIN/2.0;
printf("test      == 0.0 %d\n",test==0.0);
printf("DBL_MIN/2 == 0.0 %d\n",DBL_MIN/2.0==0.0);

Output:

test      == 0.0 1  // computer says YES
DBL_MIN/2 == 0.0 0  // compiler says NO

So a quick runtime check for whether denormalization is supported could be:

#define SUPPORT_DENORMALIZATION ({volatile double t=DBL_MIN/2.0;t!=0.0;})

("without even the implied warranty of fitness for any purpose")
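The macro above relies on the GNU statement-expression extension `({ ... })`. A minimal sketch of the same idea as a plain C function follows. The name `denormals_supported` is made up for illustration, and unlike the macro this version also performs the division at run time instead of leaving it to the compiler.

```c
#include <float.h>

/* Hypothetical helper: returns 1 if halving DBL_MIN at run time yields a
 * nonzero (denormal) value, 0 if the result is flushed to zero. */
int denormals_supported(void)
{
    volatile double smallest_normal = DBL_MIN;      /* loaded at run time */
    volatile double half = smallest_normal / 2.0;   /* computed by the CPU */
    return half != 0.0;
}
```

The result reflects the floating-point mode of the calling thread at the moment of the call, so it can change if flush-to-zero is switched on or off later.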

Here is what ARM has to say about flush-to-zero mode: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfheche.html

Update<<1

This is how to disable flush-to-zero mode on ARMv7:

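int x;
asm(
    "vmrs %[result],FPSCR \r\n"               // read FPSCR into a register
    "bic %[result],%[result],#16777216 \r\n"  // clear bit 24 (FZ, flush-to-zero); 16777216 == 1 << 24
    "vmsr FPSCR,%[result]"                    // write the modified value back
    :[result] "=r" (x) : :
);
printf("ARM FPSCR: %08x\n",x);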

With the following surprising result.

  • Column 1: a float, divided by 2 on every iteration
  • Column 2: the binary representation of this float
  • Column 3: the time taken to sum this float 1e7 times

You can clearly see that denormalization comes at zero cost. (This is on an iPad 2. On an iPhone 4, it comes at the cost of a 10% slowdown.)

0.000000000000000000000000000000000100000004670110: 10111100001101110010000011100000 110 ms
0.000000000000000000000000000000000050000002335055: 10111100001101110010000101100000 110 ms
0.000000000000000000000000000000000025000001167528: 10111100001101110010000001100000 110 ms
0.000000000000000000000000000000000012500000583764: 10111100001101110010000110100000 110 ms
0.000000000000000000000000000000000006250000291882: 10111100001101110010000010100000 111 ms
0.000000000000000000000000000000000003125000145941: 10111100001101110010000100100000 110 ms
0.000000000000000000000000000000000001562500072970: 10111100001101110010000000100000 110 ms
0.000000000000000000000000000000000000781250036485: 10111100001101110010000111000000 110 ms
0.000000000000000000000000000000000000390625018243: 10111100001101110010000011000000 110 ms
0.000000000000000000000000000000000000195312509121: 10111100001101110010000101000000 110 ms
0.000000000000000000000000000000000000097656254561: 10111100001101110010000001000000 110 ms
0.000000000000000000000000000000000000048828127280: 10111100001101110010000110000000 110 ms
0.000000000000000000000000000000000000024414063640: 10111100001101110010000010000000 110 ms
0.000000000000000000000000000000000000012207031820: 10111100001101110010000100000000 111 ms
0.000000000000000000000000000000000000006103515209: 01111000011011100100001000000000 110 ms
0.000000000000000000000000000000000000003051757605: 11110000110111001000010000000000 110 ms
0.000000000000000000000000000000000000001525879503: 00010001101110010000100000000000 110 ms
0.000000000000000000000000000000000000000762939751: 00100011011100100001000000000000 110 ms
0.000000000000000000000000000000000000000381469876: 01000110111001000010000000000000 112 ms
0.000000000000000000000000000000000000000190734938: 10001101110010000100000000000000 110 ms
0.000000000000000000000000000000000000000095366768: 00011011100100001000000000000000 110 ms
0.000000000000000000000000000000000000000047683384: 00110111001000010000000000000000 110 ms
0.000000000000000000000000000000000000000023841692: 01101110010000100000000000000000 111 ms
0.000000000000000000000000000000000000000011920846: 11011100100001000000000000000000 110 ms
0.000000000000000000000000000000000000000005961124: 01111001000010000000000000000000 110 ms
0.000000000000000000000000000000000000000002980562: 11110010000100000000000000000000 110 ms
0.000000000000000000000000000000000000000001490982: 00010100001000000000000000000000 110 ms
0.000000000000000000000000000000000000000000745491: 00101000010000000000000000000000 110 ms
0.000000000000000000000000000000000000000000372745: 01010000100000000000000000000000 110 ms
0.000000000000000000000000000000000000000000186373: 10100001000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000092486: 01000010000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000046243: 10000100000000000000000000000000 111 ms
0.000000000000000000000000000000000000000000022421: 00001000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000011210: 00010000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000005605: 00100000000000000000000000000000 111 ms
0.000000000000000000000000000000000000000000002803: 01000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000001401: 10000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
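The measurement code itself is not shown here. As a rough sketch of how a table with these three columns could be produced, assuming a 32-bit IEEE 754 `float` and CPU time from `clock()`, something like the following would work. All function names are hypothetical, and the bit string is printed most significant bit first, which may not match the bit order used in the listing above.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Write the 32 bits of a float into out (32 characters plus terminator),
 * most significant bit first. Assumes float is a 32-bit IEEE 754 value. */
void float_to_bits(float value, char out[33])
{
    uint32_t raw;
    memcpy(&raw, &value, sizeof raw);   /* copy the bytes, no pointer casts */
    for (int i = 0; i < 32; i++) {
        out[i] = (raw & (UINT32_C(1) << (31 - i))) ? '1' : '0';
    }
    out[32] = '\0';
}

/* Add value to an accumulator count times and return the elapsed CPU time
 * in milliseconds. The volatile accumulator keeps the loop from being
 * optimized away. */
double time_sum_ms(float value, long count)
{
    volatile float sum = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < count; i++) {
        sum += value;
    }
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Print one row per halving step: value, bit pattern, time for the sums. */
void print_halving_table(float start, int rows, long count)
{
    char bits[33];
    volatile float value = start;       /* volatile: halve at run time */
    for (int row = 0; row < rows; row++) {
        float_to_bits(value, bits);
        printf("%.48f: %s %.0f ms\n", value, bits, time_sum_ms(value, count));
        value = value / 2.0f;
    }
}
```

Calling `print_halving_table(1e-34f, 40, 10000000L)` starts near the first value in the listing and walks down through the denormal range to zero. Timings depend on the device, the compiler flags and the current flush-to-zero setting.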

Regarding "c++ - Denormalized floats in Objective-C?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/9350810/
