tensorflow - Why don't TensorFlow's MirroredStrategy and OneDeviceStrategy work on Colab?

Tags: tensorflow machine-learning deep-learning artificial-intelligence distributed-computing

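I am trying to learn TensorFlow distributed training using MirroredStrategy and its variants. I already have a simple MNIST script on Colab, but I need to test it on multiple GPUs. Running the code raises the error "ValueError: Distribute argument in compile is not available in TF 2.0 please create the model under the distribution strategy scope". I also tried OneDeviceStrategy, and it fails as well. I want to use the various distributed training methods to compare training time and accuracy. Here are a screenshot of the error and a link to the code on Colab.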

Best answer

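The problem is essentially the way you are using the distribution strategy scope. In TF 2.0, you do not pass the distribution strategy to the compile method. Instead, you build the model and compile it inside the distribution strategy scope. Note that the call to model.fit(...) does not need to be inside the distribution strategy scope. For example, here is an edited version of your Colab code that should resolve the problem: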

...

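# Create the strategy before building the model.
strategy = tf.distribute.MirroredStrategy()

# Build and compile inside the scope so the model's variables are created under the strategy.
with strategy.scope():
  input_img = layers.Input(shape=IMG_SIZE)
  model = layers.Conv2D(32, (3, 3), padding='same')(input_img)
  ... # model definition
  output_img = layers.Activation('softmax')(model)

  model = models.Model(input_img, output_img)

  # No distribute argument is passed to compile; learning_rate replaces the deprecated lr alias.
  model.compile(optimizers.Adam(learning_rate=0.0001), loss='categorical_crossentropy', metrics=["accuracy"])

...

# Training is called outside the scope.
history = model.fit(...)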

See the "Using tf.distribute.Strategy with Keras" section of the "Distributed training with TensorFlow" guide for more information.
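Since the snippet above elides the model definition and the data, here is a minimal self-contained sketch of the same pattern. It is not the original Colab code. The tiny CNN, the random synthetic data standing in for MNIST, the helper names build_and_compile and train_with, and the "/cpu:0" device string are all assumptions made for illustration. It also assumes a TensorFlow 2.x installation whose Keras supports both strategies. It builds and compiles the model inside each strategy's scope, calls fit outside it, and records wall-clock training time so the two strategies can be compared.

```python
import time

import numpy as np
import tensorflow as tf


def build_and_compile():
    """Build and compile a small CNN. Call this inside strategy.scope()."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),  # MNIST-sized input (assumption)
        tf.keras.layers.Conv2D(8, (3, 3), padding="same", activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])
    # The strategy is not passed to compile; the enclosing scope supplies it.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


def train_with(strategy, x, y, epochs=1, batch_size=32):
    """Create the model under the given strategy, then time model.fit."""
    with strategy.scope():
        model = build_and_compile()
    start = time.perf_counter()
    # fit is called outside the scope, as in the answer above.
    history = model.fit(x, y, epochs=epochs, batch_size=batch_size, verbose=0)
    return history, time.perf_counter() - start


# Random synthetic data standing in for MNIST (assumption, not real data).
rng = np.random.default_rng(0)
x = rng.random((64, 28, 28, 1), dtype=np.float32)
y = tf.keras.utils.to_categorical(rng.integers(0, 10, size=64), num_classes=10)

strategies = {
    # Uses all visible GPUs, or falls back to the CPU when none are present.
    "mirrored": tf.distribute.MirroredStrategy(),
    # "/cpu:0" is an assumption; on a GPU runtime "/gpu:0" would be typical.
    "one_device": tf.distribute.OneDeviceStrategy(device="/cpu:0"),
}

results = {}
for name, strategy in strategies.items():
    history, seconds = train_with(strategy, x, y)
    results[name] = (float(history.history["loss"][-1]), seconds)
    print(f"{name}: final loss {results[name][0]:.4f}, {seconds:.2f} s")
```

With so little data, the timings mostly reflect startup overhead. For a meaningful comparison, substitute the real dataset and train for several epochs.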

Source: this question on Stack Overflow, https://stackoverflow.com/questions/59646275/
