python - What is the difference between numpy.save() and joblib.dump() in Python?

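I save a lot of offline models, matrices, and arrays in Python and came across these two functions. Can somebody help me by listing the pros and cons of numpy.save() and joblib.dump()?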

Best answer

Here is the key section of code from joblib, which should shed some light.

def _write_array(self, array, filename):
    if not self.compress:
        self.np.save(filename, array)
        container = NDArrayWrapper(os.path.basename(filename),
                                   type(array))
    else:
        filename += '.z'
        # Efficient compressed storage:
        # The meta data is stored in the container, and the core
        # numerics in a z-file
        _, init_args, state = array.__reduce__()
        # the last entry of 'state' is the data itself
        zfile = open(filename, 'wb')
        write_zfile(zfile, state[-1],
                            compress=self.compress)
        zfile.close()
        state = state[:-1]
        container = ZNDArrayWrapper(os.path.basename(filename),
                                        init_args, state)
    return container, filename
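
The compressed branch above keeps the small metadata separate from the raw numeric buffer and compresses only the buffer. The following is a minimal sketch of that idea using only NumPy and the standard-library zlib module. The helper names `compress_array` and `decompress_array` are hypothetical and are not part of joblib, and the sketch does not reproduce joblib's actual file format.

```python
import zlib

import numpy as np


def compress_array(array, level=3):
    """Split an array into small metadata and a zlib-compressed data buffer.

    Hypothetical helper for illustration; not a joblib function.
    """
    array = np.ascontiguousarray(array)
    # Metadata needed to rebuild the array; small enough to store uncompressed.
    meta = {"shape": array.shape, "dtype": array.dtype.str}
    # The raw numeric bytes are the large part, so only they are compressed.
    payload = zlib.compress(array.tobytes(), level)
    return meta, payload


def decompress_array(meta, payload):
    """Rebuild the array from its metadata and compressed buffer."""
    raw = zlib.decompress(payload)
    # frombuffer returns a read-only view, so copy to get a writable array.
    return np.frombuffer(raw, dtype=meta["dtype"]).reshape(meta["shape"]).copy()


original = np.arange(12, dtype=np.float64).reshape(3, 4)
meta, payload = compress_array(original)
restored = decompress_array(meta, payload)
```

In the joblib excerpt, the metadata travels in the pickled wrapper object and the compressed bytes go to the separate `.z` file. Here both are simply returned to the caller.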

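Basically, joblib.dump can optionally compress an array. Without compression it stores the array to disk with numpy.save; with compression it writes a compressed `.z` file instead. In addition, joblib.dump stores an NDArrayWrapper (or a ZNDArrayWrapper when compressing), a lightweight object that records the name of the saved or compressed file holding the array contents, along with the array's subclass.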
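
As a rough illustration of the practical difference, the sketch below saves a single array with numpy.save and a dictionary containing an array with joblib.dump. The file names, the `model` dictionary, and the compression level are assumptions chosen for the example, and the exact on-disk layout joblib produces depends on the installed joblib version.

```python
import os
import tempfile

import joblib
import numpy as np

data = np.random.rand(200, 200)

with tempfile.TemporaryDirectory() as tmp:
    # numpy.save writes one array to an uncompressed .npy file.
    npy_path = os.path.join(tmp, "data.npy")
    np.save(npy_path, data)
    from_npy = np.load(npy_path)

    # joblib.dump pickles an arbitrary Python object and handles any
    # NumPy arrays inside it; compress=3 is an illustrative level.
    model = {"weights": data, "label": "demo"}
    pkl_path = os.path.join(tmp, "model.pkl")
    joblib.dump(model, pkl_path, compress=3)
    from_joblib = joblib.load(pkl_path)
```

numpy.save is the simpler choice for a single plain array, while joblib.dump is more convenient when the arrays sit inside a larger object such as a fitted model.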

Regarding "python - What is the difference between numpy.save() and joblib.dump() in Python?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/26766599/
