I want to use BERT word-vector embeddings in the embedding layer of an LSTM instead of the usual default (trainable) Embedding layer. Is there a way to do this?
Best answer
I hope these links help:
https://github.com/abhilash1910/BERTSimilarity
import tensorflow as tf
import transformers

# Wire a pretrained BERT encoder into a Keras functional model
transformer_model = transformers.TFBertModel.from_pretrained('bert-large-uncased')
input_ids = tf.keras.layers.Input(shape=(128,), name='input_token', dtype='int32')
input_masks_ids = tf.keras.layers.Input(shape=(128,), name='masked_token', dtype='int32')
X = transformer_model(input_ids, attention_mask=input_masks_ids)[0]  # last hidden states: (batch, 128, hidden)
X = tf.keras.layers.Dropout(0.2)(X)
X = tf.keras.layers.Dense(6, activation='softmax')(X)  # per-token class probabilities
model = tf.keras.Model(inputs=[input_ids, input_masks_ids], outputs=X)
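To connect this to the original question: the BERT hidden states can stand in for a trainable Embedding layer by feeding them straight into an LSTM. Here is a minimal sketch along those lines, assuming the same 128-token inputs as above; the frozen encoder, the LSTM width of 64, and the two-class head are illustrative choices, not part of the original answer:

import tensorflow as tf
import transformers

# BERT acts as a fixed embedding layer; an LSTM consumes its hidden states
bert = transformers.TFBertModel.from_pretrained('bert-base-uncased')
bert.trainable = False  # keep pretrained weights fixed, like a static embedding table

ids = tf.keras.layers.Input(shape=(128,), dtype='int32', name='input_ids')
mask = tf.keras.layers.Input(shape=(128,), dtype='int32', name='attention_mask')
embeddings = bert(ids, attention_mask=mask)[0]   # (batch, 128, 768) contextual vectors
x = tf.keras.layers.LSTM(64)(embeddings)         # replaces the usual Embedding -> LSTM stack
outputs = tf.keras.layers.Dense(2, activation='softmax')(x)

model = tf.keras.Model(inputs=[ids, mask], outputs=outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')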
Here is a sample:
import numpy as np
from transformers import AutoTokenizer, pipeline, TFDistilBertModel
from scipy.spatial.distance import cosine

def transformer_embedding(name, inp, model_name):
    # model_name is a model class, e.g. TFDistilBertModel
    model = model_name.from_pretrained(name)
    tokenizer = AutoTokenizer.from_pretrained(name)
    # Feature-extraction pipeline returns the final hidden states
    pipe = pipeline('feature-extraction', model=model, tokenizer=tokenizer)
    features = pipe(inp)
    features = np.squeeze(features)  # (seq_len, hidden_size)
    return features
z = ['The brown fox jumped over the dog', 'The ship sank in the Atlantic Ocean']
embedding_features1 = transformer_embedding('distilbert-base-uncased', z[0], TFDistilBertModel)
embedding_features2 = transformer_embedding('distilbert-base-uncased', z[1], TFDistilBertModel)

# scipy's cosine() returns cosine *distance*, so 1 - cosine() is the cosine
# similarity of the two [CLS] token vectors (index 0 of each sequence)
similarity = 1 - cosine(embedding_features1[0], embedding_features2[0])
print(similarity)
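Alternatively, the vectors returned by transformer_embedding can be precomputed offline and fed to an LSTM through a plain Input layer, so BERT never has to run inside the Keras graph. A minimal sketch, assuming sequences padded to 128 tokens and DistilBERT's 768-dimensional hidden states; the layer sizes are again illustrative:

import tensorflow as tf

# Precomputed BERT features replace the output of an Embedding layer
seq_len, hidden_size = 128, 768  # DistilBERT hidden size; pad/truncate features to seq_len

inputs = tf.keras.layers.Input(shape=(seq_len, hidden_size), name='bert_features')
x = tf.keras.layers.LSTM(64)(inputs)
outputs = tf.keras.layers.Dense(2, activation='softmax')(x)

model = tf.keras.Model(inputs=inputs, outputs=outputs)
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# At training time, stack the (padded) transformer_embedding() outputs into a
# (num_examples, seq_len, hidden_size) array and pass it to model.fit().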
Thanks.
Regarding python-3.x - using BERT embeddings in a Keras embedding layer, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/62771845/