c - Implementing fixed-precision numbers in CUDA

Tags: c cuda

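I'm trying to speed up my code by using fixed-precision numbers in CUDA. I need 64 bits. How can I multiply two of them without the product overflowing and losing its top bits? Is there a 128-bit type in CUDA?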

typedef long long fixed;
#define _fxadd(a, b) ((a) + (b))
#define _fxsub(a, b) ((a) - (b))
/* The 64-bit product (a) * (b) overflows before the shift is applied. */
#define _fxmul(a, b) (((a) * (b)) >> 32)

Best answer

No, CUDA has no built-in 128-bit integer data type, but it does provide some integer intrinsics that may help with your own implementation.

For example, you can use __umul64hi to get the high half of the product of two 64-bit (unsigned) integer operands:

Calculate the most significant 64 bits of the 128-bit product x * y, where x and y are 64-bit unsigned integers.
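
As a rough illustration of how that intrinsic could be used for the multiplication in the question, here is a minimal sketch. It assumes the asker's `fixed` type is a signed Q32.32 value (32 integer bits, 32 fractional bits), which the `>> 32` in the question suggests but does not state. The names `FX_HD`, `fx_umulhi` and `fx_mul` are made up for this example. On the device the high half comes from `__umul64hi`; the host branch is a portable fallback added only so the same code can be checked without a GPU.

```cpp
// Sketch only: signed Q32.32 multiply built from the high and low halves
// of the 128-bit product. FX_HD, fx_umulhi and fx_mul are hypothetical names.
#ifdef __CUDACC__
#define FX_HD __host__ __device__
#else
#define FX_HD
#endif

typedef long long fixed;  // assumed layout: Q32.32

// High 64 bits of the unsigned 128-bit product x * y.
FX_HD inline unsigned long long fx_umulhi(unsigned long long x,
                                          unsigned long long y)
{
#ifdef __CUDA_ARCH__
    return __umul64hi(x, y);  // CUDA integer intrinsic (device code only)
#else
    // Host fallback: schoolbook multiplication on 32-bit halves.
    unsigned long long xl = x & 0xFFFFFFFFull, xh = x >> 32;
    unsigned long long yl = y & 0xFFFFFFFFull, yh = y >> 32;
    unsigned long long ll = xl * yl;
    unsigned long long lh = xl * yh;
    unsigned long long hl = xh * yl;
    unsigned long long hh = xh * yh;
    // Carry out of the middle 32-bit column.
    unsigned long long mid = (ll >> 32) + (lh & 0xFFFFFFFFull)
                                        + (hl & 0xFFFFFFFFull);
    return hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
#endif
}

// Q32.32 multiply: returns bits [95:32] of the signed 128-bit product.
FX_HD inline fixed fx_mul(fixed a, fixed b)
{
    unsigned long long ua = (unsigned long long)a;
    unsigned long long ub = (unsigned long long)b;
    unsigned long long lo = ua * ub;            // low 64 bits (wraps, well defined)
    unsigned long long hi = fx_umulhi(ua, ub);  // high 64 bits, unsigned view
    // Convert the unsigned high half to the signed high half (mod 2^64).
    if (a < 0) hi -= ub;
    if (b < 0) hi -= ua;
    // Assumes two's complement conversion back to a signed type.
    return (fixed)((hi << 32) | (lo >> 32));
}
```

With this, `fx_mul(a, b)` would take the place of `_fxmul(a, b)`. The result still wraps silently if the true product does not fit in Q32.32, so range checking remains the caller's job. In device-only code, the signed intrinsic `__mul64hi` could replace the unsigned call plus the two correction lines.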

Source: a similar question about implementing fixed-precision numbers in CUDA on Stack Overflow: https://stackoverflow.com/questions/32141802/
