python - What does padding_idx do in nn.Embedding()?

Tags: python deep-learning nlp pytorch recurrent-neural-network

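I am learning PyTorch, and
I would like to know what the padding_idx argument does in torch.nn.Embedding(n1, d1, padding_idx=0).
I have looked everywhere and could not find an explanation I could understand.
Could you show an example that illustrates it?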

Best answer

padding_idx is indeed described rather poorly in the documentation.

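Basically, it specifies which index passed during the call will mean "zero vector" (this is often used in NLP in case some token is missing). By default, no index means "zero vector", as the following example shows: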

import torch

embedding = torch.nn.Embedding(10, 3)
input = torch.LongTensor([[0, 1, 0, 5]])
print(embedding(input))

will give you:

tensor([[[ 0.1280, -1.1390, -2.5007],
         [ 0.3617, -0.9280,  1.2894],
         [ 0.1280, -1.1390, -2.5007],
         [-1.3135, -0.0229,  0.2451]]], grad_fn=<EmbeddingBackward>)
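The default case can also be checked programmatically. The following is a minimal sketch; the variable name all_zero_rows is only illustrative. It assumes the module exposes the constructor argument as the attribute embedding.padding_idx, and it relies on the default random initialization, for which an all-zero row is not expected in practice.

```python
import torch

# No padding_idx given, as in the example above.
embedding = torch.nn.Embedding(10, 3)

# With the default settings, no index is reserved for padding.
print(embedding.padding_idx)  # None

# Count the rows of the weight matrix that are entirely zero.
# The weights are drawn at random, so none is expected here.
all_zero_rows = (embedding.weight == 0).all(dim=1).sum().item()
print(all_zero_rows)
```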

If you specify padding_idx=0, every input whose value equals 0 (so the zeroth and second rows) will be zeroed out like this (code: embedding = torch.nn.Embedding(10, 3, padding_idx=0)):

tensor([[[ 0.0000,  0.0000,  0.0000],
         [-0.4448, -0.2076,  1.1575],
         [ 0.0000,  0.0000,  0.0000],
         [ 1.3602, -0.6299, -0.5809]]], grad_fn=<EmbeddingBackward>)

If you were to specify padding_idx=5, the last row would be filled with zeros, and so on.
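The padding_idx=5 case can be verified with a short sketch like the one below. The variable names (indices, zero_rows, grad) are illustrative. The gradient check at the end goes slightly beyond the answer above: it reflects the documented PyTorch behavior that the padding row does not contribute to the gradient, so it stays fixed during training.

```python
import torch

# Same shapes as above, but index 5 is now the padding index.
embedding = torch.nn.Embedding(10, 3, padding_idx=5)
indices = torch.tensor([[0, 1, 0, 5]])  # integer indices, shape (1, 4)

output = embedding(indices)  # shape (1, 4, 3)

# Only the position holding index 5 is an all-zero vector.
zero_rows = (output == 0).all(dim=-1)
print(zero_rows.tolist())  # [[False, False, False, True]]

# Backpropagate a simple sum to inspect per-row gradients.
output.sum().backward()
grad = embedding.weight.grad
print(grad[5].tolist())  # [0.0, 0.0, 0.0]  padding row gets no gradient
print(grad[0].tolist())  # [2.0, 2.0, 2.0]  index 0 was looked up twice
```

Because the gradient for the padding row is always zero, an optimizer step leaves that row unchanged, which is usually what you want for padded positions in batched sequences.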

Source: the original question, "What does padding_idx do in nn.Embedding()", is on Stack Overflow: https://stackoverflow.com/questions/61172400/
