python - Multiple arrays of digits for working with big integers

I work with very large integers, with 10000 digits, so I split each number into an array with one digit per element.

Small data sample:

import numpy as np

# all combinations of length 3 of the values in list L
N = 3
L = [[1,9,0]]*N
a = np.array(np.meshgrid(*L)).T.reshape(-1,N)
# each row is a number, so drop rows with a leading 0; the last digit is always 0
a = a[(a[:, 0] != 0) & (a[:, -1] == 0)]
print (a)
[[1 1 0]
 [1 9 0]
 [1 0 0]
 [9 1 0]
 [9 9 0]
 [9 0 0]]

Then I need to multiply each number by the scalar 1.1. For better understanding:

# join the digit arrays into numbers
b = np.array([int(''.join(x)) for x in a.astype(str)])[:, None]
print (b)
[[110]
 [190]
 [100]
 [910]
 [990]
 [900]]

# multiply by the constant
c = b * 1.1
print (c)
[[ 121.]
 [ 209.]
 [ 110.]
 [1001.]
 [1089.]
 [ 990.]]

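But with 10000 digits this solution does not work, because of floating-point rounding. So I need a solution that works on the digit arrays directly: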
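To make the rounding problem concrete, here is a minimal sketch with a 20-digit number (the value of `x` and the names `approx` and `exact` are chosen only for illustration). A float64 carries 53 bits of precision, so going through `1.1` loses the low digits, while plain Python integer arithmetic stays exact. Multiplying by 11 and floor-dividing by 10 is equivalent to multiplying by 1.1 here only because the last digit is 0.

```python
# A 20-digit integer, well beyond the 53 bits of precision of a float64
x = 99999999999999999990

approx = int(x * 1.1)   # goes through float64, so the low digits are lost
exact = x * 11 // 10    # pure integer arithmetic; valid because x ends in 0

print(exact)            # 109999999999999999989
print(approx == exact)  # False
```

With the full 10000 digits the float route does not merely round: converting such an integer to a float raises an OverflowError.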

What I tried: move the last "column" (always 0) to the front, then add the result to the original array:

a1 = np.hstack((a[:, [-1]] , a[:, :-1] ))
print (a1)
[[0 1 1]
 [0 1 9]
 [0 1 0]
 [0 9 1]
 [0 9 9]
 [0 9 0]]

print (a1 + a)
[[ 1  2  1]
 [ 1 10  9]
 [ 1  1  0]
 [ 9 10  1]
 [ 9 18  9]
 [ 9  9  0]]

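But the problem is that when a value is greater than 9, the carry has to move to the next digit (like old-school addition on paper). The expected output is: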

c1 = np.array([list(str(x).split('.')[0].zfill(4)) for x in np.ravel(c)]).astype(int)
print (c1)
[[0 1 2 1]
 [0 2 0 9]
 [0 1 1 0]
 [1 0 0 1]
 [1 0 8 9]
 [0 9 9 0]]

Is there a fast vectorized solution that generates the c1 array from the a array?

EDIT: I tested @yatu's solution with other data and it raises an error:

ValueError: cannot convert float NaN to integer

import numpy as np
from itertools import product, zip_longest

def grouper(iterable, n, fillvalue=None):
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)

# real data
# M = 100000
# N = 500
# loop over the combinations in chunks of N rows (5 here)
M = 20
N = 5
v = [0]*M
for i in grouper(product([9, 0], repeat=M), N, v):
    a = np.array(i)
    # each row is a number, so drop rows with a leading 0; the last digit is always 0
    a = a[(a[:, 0] != 0) & (a[:, -1] == 0)]
    print (a)

    s = np.arange(a.shape[1]-1, -1, -1)
    # concat digits along cols, and multiply
    b = (a * 10**s).sum(1)*1.1
    # highest amount of digits in b
    n_cols = int(np.log10(b.max()))
    # broadcast division to reverse
    c = b[:, None] // 10**np.arange(n_cols, -1, -1)
    # keep only last digit
    c1 = (c%10).astype(int)
    print (c1)
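A likely cause of the NaN, offered as a sketch rather than a confirmed diagnosis: with M = 20 the exponent array `s` reaches 19, and 10**19 does not fit in int64, so the fixed-width integer power overflows silently. The concatenated values can then come out negative, `np.log10` of a negative number is NaN, and `int()` of NaN raises the ValueError shown above. The names `powers` and `powers_obj` are illustrative only.

```python
import numpy as np

# Exponents as in the loop above for M = 20 columns: 19, 18, ..., 0
s = np.arange(19, -1, -1, dtype=np.int64)
powers = 10 ** s                  # fixed-width integers, overflow is not reported

# 10**19 exceeds np.iinfo(np.int64).max, so the first entry cannot be correct
print(int(powers[0]) == 10**19)   # False

# The logarithm of a negative value is NaN, which int() refuses to convert
with np.errstate(invalid="ignore"):
    print(np.isnan(np.log10(np.float64(-1.0))))   # True

# With dtype=object the elements are Python ints, which do not overflow
powers_obj = 10 ** np.arange(19, -1, -1, dtype=object)
print(powers_obj[0] == 10**19)    # True
```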

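Best answer

Here is a vectorized approach that starts from a. The idea is to multiply each column by 10**s, where s is a descending range of exponents with one entry per column. Summing along the second axis then acts as a concatenation of the digits in each row. After multiplying by 1.1, the process can be reversed with the same logic: floor-divide by the powers of 10, broadcasting to the result shape, and take the result modulo 10 to keep only the last digit: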

s = np.arange(a.shape[1]-1, -1, -1, dtype=np.float64)
# concat digits along cols, and multiply
b = (a * 10**s).sum(1)*1.1
# highest amount of digits in b
n_cols = int(np.log10(b.max()))
# broadcast division to reverse
c = b[:, None] // 10**np.arange(n_cols, -1, -1, dtype=np.float64)
# keep only last digit
c1 = (c%10).astype(int)

print(c1)

array([[0, 1, 2, 1],
       [0, 2, 0, 9],
       [0, 1, 1, 0],
       [1, 0, 0, 1],
       [1, 0, 8, 9],
       [0, 9, 9, 0]])

Update -

The approach above only works for integers no larger than what int64 supports, i.e.:

np.iinfo(np.int64).max
# 9223372036854775807

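However, what can be done in this case is to keep the array values as Python ints instead of a NumPy dtype. So we can define the np.arange calls with dtype=object, and the code above should work for the shared example: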

s = np.arange(a.shape[1]-1, -1, -1, dtype=object)
# concat digits along cols, and multiply
b = (a * 10**s).sum(1)*1.1
# highest amount of digits in b
n_cols = int(np.log10(b.max()))
# broadcast division to reverse
c = b[:, None] // 10**np.arange(n_cols, -1, -1, dtype=object)
# keep only last digit
c1 = (c%10).astype(int)
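Note that the object-dtype version still multiplies by the float 1.1, so its precision is again bounded by float64. As a possible alternative that never leaves integer digits (a sketch, not part of the original answer), the identity x * 1.1 = x + x / 10 can be applied digit-wise: because the last digit is always 0, x / 10 is just the digits shifted one place to the right. The function name `times_1_1_digits` is hypothetical. It loops over the columns for the carry but is vectorized across rows.

```python
import numpy as np

def times_1_1_digits(a):
    """Multiply rows of decimal digits (most significant first, last digit 0) by 1.1."""
    a = np.asarray(a, dtype=np.int64)
    rows, n = a.shape
    # One spare leading column, since the product can gain a digit
    total = np.zeros((rows, n + 1), dtype=np.int64)
    total[:, 1:] = a             # x
    total[:, 2:] += a[:, :-1]    # x / 10: same digits, shifted one place right
    # Schoolbook carry, from the least significant column to the most
    carry = np.zeros(rows, dtype=np.int64)
    for j in range(n, -1, -1):
        col = total[:, j] + carry
        total[:, j] = col % 10
        carry = col // 10
    return total

a = np.array([[1, 1, 0], [1, 9, 0], [1, 0, 0],
              [9, 1, 0], [9, 9, 0], [9, 0, 0]])
print(times_1_1_digits(a))
```

For the sample array this reproduces the c1 shown in the question. Each step touches only single digits, so the number of digits per row is limited by memory rather than by int64 or float64.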

For more on python - multiple arrays of digits for working with big integers, see the similar question on Stack Overflow: https://stackoverflow.com/questions/60916328/
