floating-point - Converting an arbitrary-precision rational number (OCaml, zarith) to an approximate float

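I'm using the Zarith library to do arbitrary-precision rational arithmetic. Suppose I have a rational number q of type Q.t that is the ratio of two big integers (Q is Zarith's arbitrary-precision rational number module). Sometimes I would like to print this number as a floating-point number for readability, and sometimes I need to convert it to floating point for later non-arbitrary-precision calculations. Is there a way to convert q to a floating-point number up to a certain level of precision?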

The way I am converting q to floating point now has no guarantees and can create undefined floating-point numbers (Z is the arbitrary-precision integer module):

let to_float q =
  let n, d = Q.num q, Q.den q in
  (* check if d is zero and raise an error if it is *)
  let nf, df = Z.to_float n, Z.to_float d in
  nf /. df

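Is there a better way to handle this, so that I get the floating-point number that most accurately approximates any q?

Edit

In case anyone is interested, I quickly wrote up Mark Dickinson's answer in OCaml. It can probably (definitely) be improved and cleaned up. I'll edit if I do so or if anyone has advice for improvements. But for now this has solved my problem!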

let to_float q = 
  let n, d = Q.num q, Q.den q in
  let n_sign = Z.sign n in
  let d_sign = Z.sign d in (* always >= 0 *)
  if d_sign = 0 then raise Division_by_zero;
  let n = Z.abs n in
  if n_sign = 0 then 0. else
    let shift = (Z.numbits n) - (Z.numbits d) - 55 in
    let is_subnormal = shift < -1076 in
    let shift = if is_subnormal then -1076 else shift in
    let d = if shift >= 0 then Z.shift_left d shift else d in
    let n = if shift < 0 then Z.shift_left n (-shift)
      else n in
    let quotient, remainder = Z.div_rem n d in
    (* round-to-odd: bump an even quotient only when the division was inexact *)
    let quotient = if (Z.compare remainder (Z.zero)) <> 0 && Z.is_even quotient then
        Z.add Z.one quotient else quotient in
    let quotient = if not is_subnormal then quotient else
        let round_select = Z.to_int @@ Z.rem quotient @@ Z.of_int 8 in
        Z.add quotient [|Z.zero;Z.minus_one;Z.of_int (-2);Z.one;Z.zero
                        ;Z.minus_one;Z.of_int 2;Z.one|].(round_select)
    in
    let unsigned_res = ldexp (Z.to_float quotient) shift in
    if n_sign = 1 then unsigned_res else -.unsigned_res

I will look into writing an interface for GMP's mpq_get_d later, but I'm not entirely sure how to do that. The only way I can see is to convert q : Q.t to a string and pass it to:

int mpq_set_str (mpq_t rop, const char *str, int base)

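Does anyone know how to then pass rop to mpq_get_d in OCaml, or have a reference that describes how to do this? I looked through chapter 19 of RWO (Real World OCaml) and didn't see a situation like this.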

Best answer

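If you have access to

an integer log2 operation, and

the ability to left shift an integer by a given number of bits

then it's relatively easy to roll your own correctly rounded conversion. In a nutshell, the method looks like this: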

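Reduce to the case n > 0, d > 0; filter out obvious underflow/overflow.

Choose an integer shift such that 2^-shift*n/d lies between 2^54 and 2^56.

Use integer arithmetic to compute x = 2^-shift*n/d, rounded to the nearest integer using the round-to-odd rounding method.

Convert x to the nearest IEEE 754 double-precision value dx, using the usual round-ties-to-even rounding mode.

Return ldexp(dx, shift).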
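
The shift in step 2 can be checked with exact rational arithmetic. This is a small sketch only: the helper name choose_shift is made up for illustration, and the formula assumes both inputs are positive. If n has a bits and d has b bits, then n/d lies strictly between 2^(a-b-1) and 2^(a-b+1), so dividing by 2^(a-b-55) lands strictly between 2^54 and 2^56.

```python
from fractions import Fraction


def choose_shift(n: int, d: int) -> int:
    """Hypothetical helper: the shift from step 2, for positive integers n and d."""
    return n.bit_length() - d.bit_length() - 55


# Check the bound exactly, using Fraction so that no floating-point rounding is involved.
for n, d in [(1, 3), (10**40, 7), (7, 10**40), (2**100, 2**100 - 1)]:
    shift = choose_shift(n, d)
    scaled = Fraction(n, d) / Fraction(2) ** shift
    assert 2**54 < scaled < 2**56
```

Because the scaled value has 55 or 56 bits before the binary point, there are always at least two bits beyond the 53 bits of a double's significand, which is what the round-to-odd step relies on.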

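I'm afraid I'm not fluent in OCaml, but the following Python code illustrates the idea for positive inputs. I leave it to you to make the obvious modifications for negative inputs and division by zero. You might also want to return early in cases of extreme overflow and underflow: those are easy to detect by looking for extra large or extra small values of shift below.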

from math import ldexp

def to_float(numerator, denominator):
    """
    Convert numerator / denominator to float, correctly rounded.

    For simplicity, assume both inputs are positive.
    """
    # Shift satisfies 2**54 < (numerator / denominator) / 2**shift < 2**56
    shift = numerator.bit_length() - denominator.bit_length() - 55

    # Divide the fraction by 2**shift.
    if shift >= 0:
        denominator <<= shift
    else:
        numerator <<= -shift

    # Convert to the nearest integer, using round-to-odd.
    q, r = divmod(numerator, denominator)
    if r != 0 and q % 2 == 0:
        q += 1

    # Now convert to the nearest float and shift back.
    return ldexp(float(q), shift)
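
The modifications for negative inputs and division by zero could look something like the following. This is only a sketch: to_float_signed and its positive_core parameter are names invented here, and the demonstration passes Python's own integer true division as a stand-in for the positive-only function. Since round-to-nearest is symmetric about zero, converting the absolute values and negating the result gives the same answer as rounding the signed value directly.

```python
def to_float_signed(numerator: int, denominator: int, positive_core) -> float:
    """Hypothetical wrapper: handle zero and signs, then delegate to
    positive_core(n, d), which is only ever called with n > 0 and d > 0."""
    if denominator == 0:
        raise ZeroDivisionError("denominator is zero")
    if numerator == 0:
        return 0.0
    negative = (numerator < 0) != (denominator < 0)
    magnitude = positive_core(abs(numerator), abs(denominator))
    return -magnitude if negative else magnitude


# Stand-in core for demonstration only: Python's int / int.
stand_in = lambda n, d: n / d

assert to_float_signed(-1, 3, stand_in) == -(1 / 3)
assert to_float_signed(1, -3, stand_in) == -(1 / 3)
assert to_float_signed(-1, -3, stand_in) == 1 / 3
assert to_float_signed(0, 5, stand_in) == 0.0
```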

Some notes:

The bit_length method on a positive integer n gives the number of bits required to represent n, or in other words 1 + floor(log2(n)).
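
A quick check of that identity, as a sketch. The second assertion is the exact integer form of the same statement and is the safer one for very large n, where a floating-point log2 may be off by a rounding error. The OCaml code in the question uses Z.numbits in the same place.

```python
from math import floor, log2

for n in [1, 2, 3, 255, 256, 10**15]:
    # The identity as stated, using a floating-point log2 (fine for small n).
    assert n.bit_length() == 1 + floor(log2(n))
    # The same fact in exact integer arithmetic.
    assert 2 ** (n.bit_length() - 1) <= n < 2 ** n.bit_length()
```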

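divmod is a Python function that computes the quotient and remainder of an integer division at the same time.

The quantity q fits (easily) into a 64-bit integer.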

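We're rounding twice: once when converting the shifted numerator / denominator to the nearest integer, and again when rounding that integer to a float. The first rounding uses the round-to-odd method; this ensures that the second rounding (implicit in the conversion from int to float) gives the same result as if we had rounded the fraction directly to a float.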
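
To see why the first rounding has to be round-to-odd, here is a small illustrative sketch with a hand-picked fraction just above the midpoint between two adjacent doubles. The fixed scale factor of 4 (two extra bits) is chosen for this particular input only; it is not the general shift computation. Python's own int / int is used as the correctly rounded reference.

```python
from math import ldexp

# The value 2**53 + 1 + 1e-30 sits just above the midpoint between
# the adjacent doubles 2**53 and 2**53 + 2, so it must round up.
n = (2**53 + 1) * 10**30 + 1
d = 10**30

reference = n / d  # Python's int / int is correctly rounded

# Naive double rounding: round to the nearest integer, then to the nearest float.
q, r = divmod(n, d)
if 2 * r > d or (2 * r == d and q % 2 == 1):
    q += 1
naive = float(q)  # the tiny excess was lost, so this is now an exact tie

# Round-to-odd with two extra bits, then round to the nearest float.
q2, r2 = divmod(4 * n, d)
if r2 != 0 and q2 % 2 == 0:
    q2 += 1
via_odd = ldexp(float(q2), -2)

assert reference == 2.0**53 + 2
assert naive == 2.0**53  # wrong: the tie went to even
assert via_odd == reference
```

The odd low bit acts as a sticky bit: it records that the true value was not exactly on the integer, so the second rounding can never mistake it for a tie.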

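The above algorithm doesn't correctly handle fractions whose converted float value is subnormal: in that case, the ldexp operation potentially introduces a third rounding. It's possible to deal with this, with some care. See below for some code.

The above is in fact a simplified version of the algorithm that Python uses when dividing one (big) integer by another to get a floating-point result. You can see the source here. The comment at the beginning of the long_true_divide function gives an overview of the method.

For completeness, here's a variant that also handles subnormal results correctly.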

def to_float(numerator, denominator):
    """
    Convert numerator / denominator to float, correctly rounded.

    For simplicity, assume both inputs are positive.
    """
    # Choose shift so that 2**54 < numerator / denominator / 2**shift < 2**56
    shift = numerator.bit_length() - denominator.bit_length() - 55

    # The 'treat_as_subnormal' flag catches all cases of subnormal results,
    # along with some cases where the result is not subnormal but *is* still
    # smaller than 2**-1021. In all these cases, it's sufficient to find the
    # closest integer multiple of 2**-1074. We first round to the nearest
    # multiple of 2**-1076 using round-to-odd.
    treat_as_subnormal = shift < -1076
    if treat_as_subnormal:
        shift = -1076

    # Divide the fraction by 2**shift.
    if shift >= 0:
        denominator <<= shift
    else:
        numerator <<= -shift

    # Convert to the nearest integer, using round-to-odd.
    q, r = divmod(numerator, denominator)
    if r != 0 and q % 2 == 0:
        q += 1

    # Now convert to the nearest float and shift back.
    if treat_as_subnormal:
        # Round to the nearest multiple of 4, rounding ties to
        # the nearest multiple of 8. This avoids double rounding
        # from the ldexp call below.
        q += [0, -1, -2, 1, 0, -1, 2, 1][q%8]

    return ldexp(float(q), shift)
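
The adjustment table in the subnormal branch is easy to misread, so here is a small sketch that checks what it does. The name ADJUST is introduced here only for illustration. With shift fixed at -1076, multiples of 4 correspond to multiples of 2**-1074, the spacing of subnormal doubles, and multiples of 8 correspond to the even ones, so "ties to a multiple of 8" is the usual ties-to-even rule.

```python
ADJUST = [0, -1, -2, 1, 0, -1, 2, 1]  # same table as in the code above

for q in range(64):
    adjusted = q + ADJUST[q % 8]
    # The result is always a multiple of 4 ...
    assert adjusted % 4 == 0
    # ... and never further than 2 away, so it is a nearest multiple of 4.
    assert abs(adjusted - q) <= 2
    # Exact ties (q halfway between two multiples of 4) go to a multiple of 8.
    if q % 4 == 2:
        assert adjusted % 8 == 0
```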

Regarding floating-point - Converting an arbitrary-precision rational number (OCaml, zarith) to an approximate float, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/33623875/
