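python - Feeding a large numpy array to tensorflow

Tags: python numpy tensorflow

I have a large numpy array (X) that I can load on the CPU, but it is too big for the GPU/Tensorflow. I want to use Tensorflow to perform array operations on X, so I split the array into batches (using numpy), feed each batch to tensorflow, and finally concatenate the output arrays to get a numpy array Y. I'm new to tensorflow, so I suspect there is a better/faster way to feed in the numpy array.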

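import numpy as np
import tensorflow as tf  # TensorFlow 1.x API (tf.placeholder, tf.Session)

# X is a large numpy array
# batches is an integer which defines the number of batches

X_list = np.array_split(X, batches)

X_tf = tf.placeholder(tf.float32)
Y_tf = some_function(X_tf)
# init was not defined in the original snippet; presumably the usual initializer
init = tf.global_variables_initializer()

Y_list = []  # collects the output of each batch
for batch in range(batches):
    sess = tf.Session()
    sess.run(init)
    Y_list.append(sess.run(Y_tf, feed_dict={X_tf: X_list[batch]}))
    sess.close()

Y = np.hstack(Y_list)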

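Best answer

You should take a look at the tensorflow Dataset class, since it is able to handle large np arrays. As long as the array fits in memory, it can be loaded and batched as needed.

A basic implementation looks like this (more details here)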

import tensorflow as tf  # TensorFlow 1.x API

# load np array X

# make a placeholder for the dataset
X_placeholder = tf.placeholder(dtype=tf.float32, shape=X.shape)

# make a dataset from the placeholder
dataset = tf.data.Dataset.from_tensor_slices(X_placeholder)

# batch (batch_size is the number of rows per batch)
dataset = dataset.batch(batch_size)
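
The snippet above stops once the dataset is batched. As a rough sketch of how the batches might then be pulled through the graph and joined back into a numpy array Y, something like the following could work. It assumes the TF1-style API reached through tensorflow.compat.v1 and an initializable iterator; some_function here is only a hypothetical stand-in (tf.square) for the asker's real computation, and run_in_batches is a helper name made up for this example.

```python
import numpy as np
import tensorflow.compat.v1 as tf  # TF1-style graph API; assumes TensorFlow is installed


def some_function(x):
    # Hypothetical stand-in for the asker's computation.
    return tf.square(x)


def run_in_batches(X, batch_size):
    """Apply some_function to X one batch at a time and return a numpy array."""
    # A private graph keeps the example in graph mode and isolated from other code.
    with tf.Graph().as_default():
        X_placeholder = tf.placeholder(dtype=tf.float32, shape=X.shape)
        dataset = tf.data.Dataset.from_tensor_slices(X_placeholder)
        dataset = dataset.batch(batch_size)

        # An initializable iterator lets X be fed once, at initialization time.
        iterator = tf.data.make_initializable_iterator(dataset)
        Y_tf = some_function(iterator.get_next())

        Y_list = []
        with tf.Session() as sess:
            sess.run(iterator.initializer, feed_dict={X_placeholder: X})
            while True:
                try:
                    Y_list.append(sess.run(Y_tf))
                except tf.errors.OutOfRangeError:
                    break  # the dataset is exhausted

    # Batches were cut along axis 0, so they are joined along axis 0.
    return np.concatenate(Y_list, axis=0)
```

Compared with the loop in the question, this opens a single session and feeds X once when the iterator is initialized, instead of creating a session per batch. The last batch may be smaller than batch_size when the number of rows is not a multiple of it, which np.concatenate handles without extra code.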

Regarding python - Feeding a large numpy array to tensorflow, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/48154925/
