python - TensorFlow: What does output_keep_prob in tf.nn.dropout actually mean?

I am trying to understand the concept of output_keep_prob:

Say my example is a simple RNN:

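    # Assumed imports (TF 1.x API); embedding_lookup and sequence_le come from the asker's code, not shown here
    import tensorflow as tf
    from tensorflow.contrib import rnn

    with tf.variable_scope('encoder') as scope:
        cells = rnn.LSTMCell(num_units=500)
        cell = rnn.DropoutWrapper(cell=cells, output_keep_prob=0.5)

        model = tf.nn.bidirectional_dynamic_rnn(cell, cell, inputs=embedding_lookup, sequence_length=sequence_le,
                                                dtype=tf.float32)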

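My confusion is: if I give output_keep_prob=0.5, what does it actually mean? I know that adding dropout makes overfitting less likely (this is called regularization). It randomly turns off neuron activations during training. OK, I get that, but I get confused when I give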

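output_keep_prob=0.5 and my no_of_nodes = 500: does 0.5 mean it will randomly turn off 50% of the nodes at each iteration, or does it mean it will keep only those connections whose probability is greater than or equal to 0.5?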

keep only the nodes whose probability >= 0.5

or

randomly turn off 50% of the node units at each iteration?

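I tried to understand the concept through this Stack Overflow answer, but the same confusion remains: what does 0.5 actually mean? Should it drop 50% of the nodes at each iteration, or keep only the nodes whose probability is greater than or equal to 0.5?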

If the answer is the second one, keeping only the nodes whose probability is greater than or equal to 0.5:

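then does that mean, supposing I have 500 node units and only 30 of them have a probability of 0.5, that it will turn off the remaining 470 nodes and use only those 30 nodes for incoming and outgoing connections?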

Because this answer says:

Suppose you have 10 units in the layer and set the keep_prob to 0.1, Then the activation of 9 randomly chosen units out of 10 will be set to 0, and the remaining one will be scaled by a factor of 10. I think a more precise description is that you only keep the activation of 10 percent of the nodes.

On the other hand, this answer by @mrry says:

it means that each connection between layers (in this case between the last densely connected layer and the readout layer) will only be used with probability 0.5 when training.

Can anyone clearly explain which one is correct and what the value of keep_prob actually represents?

Best answer

keep_prob is the probability that any given neuron's output is kept (as opposed to dropped, i.e. zeroed out). In other words, keep_prob = 1 - drop_prob.

The tf.nn.dropout() description states:

By default, each element is kept or dropped independently.

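So if you think about it, if you have a large number of neurons, say 10,000 in a layer, and keep_prob is, say, 0.3, then 3,000 is the expected number of neurons kept. Saying that a keep_prob of 0.3 means keeping the values of 3,000 randomly chosen neurons out of 10,000 is therefore more or less the same thing. But not exactly, because the actual number may differ slightly from 3,000.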
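The difference between "expected 3,000" and "exactly 3,000" can be seen in a small simulation. This is a plain-Python sketch of independent keep/drop decisions, not TensorFlow's actual implementation; the helper name `keep_mask` and the fixed seed are assumptions made for illustration.

```python
import random

def keep_mask(num_units, keep_prob, rng):
    """Draw one independent keep/drop decision per unit (True = kept).

    Hypothetical helper: the decision depends only on a random draw,
    never on the value of the unit's activation.
    """
    return [rng.random() < keep_prob for _ in range(num_units)]

rng = random.Random(0)  # fixed seed only so the run is repeatable
num_units, keep_prob = 10_000, 0.3

# Count how many units survive in each of several independent draws.
kept_counts = [sum(keep_mask(num_units, keep_prob, rng)) for _ in range(5)]
expected = num_units * keep_prob

print("expected kept:", expected)
print("actual kept per draw:", kept_counts)
```

Each draw should land near 3,000 without usually hitting it exactly, which is what "each element is kept or dropped independently" implies.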

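Scaling comes into the picture because if you drop a certain number of neurons, the expected sum of the layer is reduced. The remaining values are therefore multiplied by 1 / keep_prob, so that the layer feeds forward values of the same expected magnitude as it would without dropout. This is especially important if you load a pretrained network and want to continue training, but now with a different keep_prob value.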
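The effect of the rescaling can be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical layer whose 10,000 activations are all 1.0 so the arithmetic is easy to follow; the `dropout` function here is an illustrative stand-in, not the library's code.

```python
import random

def dropout(values, keep_prob, rng):
    """Zero each value with probability 1 - keep_prob and scale
    the surviving values by 1 / keep_prob (illustrative stand-in)."""
    scale = 1.0 / keep_prob
    return [v * scale if rng.random() < keep_prob else 0.0 for v in values]

rng = random.Random(1)          # fixed seed only so the run is repeatable
activations = [1.0] * 10_000    # hypothetical layer output
keep_prob = 0.3

dropped = dropout(activations, keep_prob, rng)

# Sum the layer would have if survivors were NOT rescaled (each survivor counts 1.0).
unscaled_sum = sum(1.0 for v in dropped if v != 0.0)
# Sum after rescaling the survivors by 1 / keep_prob.
scaled_sum = sum(dropped)

print("original sum:          ", sum(activations))
print("after drop, no scaling:", unscaled_sum)
print("after drop, scaled:    ", round(scaled_sum, 1))
```

Without rescaling the sum falls to roughly 30% of the original; with rescaling it stays close to the original 10,000.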

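(Note that you can choose to introduce non-independence into the drop decisions with the noise_shape argument; see the tf.nn.dropout() description. That is beyond the scope of this question, though.)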

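The random decision of whether to drop a neuron is recomputed on every invocation of the network, so a different set of neurons is dropped on each iteration. The idea behind dropout is that subsequent layers cannot overfit by learning to watch for some arbitrary constellation of certain activations. By constantly changing which of the previous activations are available, you ruin "the secret plan of lazy neurons to overfit".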
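To make the "different set on each iteration" point concrete for the 500-unit, keep_prob = 0.5 case from the question, here is a small plain-Python sketch. The helper `keep_mask` and the two "steps" are hypothetical stand-ins for two training iterations, not TensorFlow code.

```python
import random

def keep_mask(num_units, keep_prob, rng):
    """One independent keep/drop decision per unit (True = kept)."""
    return [rng.random() < keep_prob for _ in range(num_units)]

rng = random.Random(2)  # fixed seed only so the run is repeatable
num_units, keep_prob = 500, 0.5

# A fresh mask is drawn for each step; nothing is remembered between steps.
mask_a = keep_mask(num_units, keep_prob, rng)
mask_b = keep_mask(num_units, keep_prob, rng)

kept_a = {i for i, keep in enumerate(mask_a) if keep}
kept_b = {i for i, keep in enumerate(mask_b) if keep}

print("kept in step A:", len(kept_a))
print("kept in step B:", len(kept_b))
print("kept in both:  ", len(kept_a & kept_b))
```

Roughly half of the units survive in each step, but which units survive changes from step to step, so no fixed subset of 250 units is ever singled out.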

Regarding this question, a similar question was found on Stack Overflow: https://stackoverflow.com/questions/49864214/
