Updated

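I am working on a word-embedding model with Tflearn to predict answer-matching scores. I have built a model with the tflearn DNN classifier using sentence vectors, and now I need to add a word-embedding layer to the DNN model. How do I do that? Thanks in advance.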

"JVMdefines": enables a computer to run a Java program

Converted to:

"JVMdefines": [[list([0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

enables a computer to run a Java program : list([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])]

My question: is there any way to get the machine to analyze the following?

enables a "machine" to run a Java program

That is, can it detect that "computer" and "machine" mean the same thing?

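Best answer

I would post a clarifying comment, but I don't have enough reputation to do so, so I'll try to answer based on the information in your original question...

Your question is somewhat unclear, but here is how to do this for a binary classification problem in tflearn.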

Step 1: Preprocessing

The first thing you need to do is tokenize your sentences and convert them to lists of integers:

"What do you like to eat?" ---> [234,64,12,5224,43,96,23]
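To make the tokenize-then-index step concrete, here is a minimal sketch in plain Python. The helper names (tokenize, build_vocab, encode) and the regex-based tokenizer are assumptions for illustration, not part of tflearn, and the ids it produces are small consecutive numbers rather than the ones shown above.

```python
import re

def tokenize(sentence):
    # Split into lowercase word tokens and single punctuation marks.
    return re.findall(r"\w+|[^\w\s]", sentence.lower())

def build_vocab(sentences):
    """Map each distinct token to an integer id, starting at 1.

    Id 0 is left unused so it can serve as the padding value later.
    """
    vocab = {}
    for sentence in sentences:
        for token in tokenize(sentence):
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab

def encode(sentence, vocab, unknown_id=0):
    # Tokens missing from the vocabulary fall back to unknown_id
    # (sharing id 0 with padding is a simplification made here).
    return [vocab.get(token, unknown_id) for token in tokenize(sentence)]

sentences = ["What do you like to eat?", "What do you like to read?"]
vocab = build_vocab(sentences)
ids = encode("What do you like to eat?", vocab)
```

With this scheme the vocabulary size to pass to the embedding layer later would be len(vocab) + 1, since id 0 is reserved.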

Then, most people pad the sequences to the same length, i.e. truncate long sequences and lengthen short ones by padding them with 0s.

[234,64,12,5224,43,96,23] ---> [0,0,0,0....234,64,12,5224,43,96,23]

Hint:

from tflearn.data_utils import pad_sequences
padded = pad_sequences(unpadded, maxlen=max_document_length, value=0.)
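For readers who want to see what the padding step does without installing tflearn, the following is a minimal sketch in plain Python that reproduces the left-padded layout shown above. The function name pad_left is hypothetical. As far as I recall, tflearn's pad_sequences pads and truncates at the end by default (padding='post', truncating='post'), so getting zeros on the left from it would need padding='pre'; it is worth checking this against your installed version.

```python
def pad_left(sequence, maxlen, value=0):
    """Return `sequence` as a list of exactly `maxlen` items.

    Long sequences keep only their last `maxlen` items; short ones get
    `value` prepended until they reach `maxlen`.
    """
    if maxlen <= 0:
        raise ValueError("maxlen must be positive")
    trimmed = list(sequence)[-maxlen:]
    return [value] * (maxlen - len(trimmed)) + trimmed

unpadded = [[234, 64, 12, 5224, 43, 96, 23], list(range(1, 13))]
padded = [pad_left(seq, maxlen=10) for seq in unpadded]
```

Every row of padded now has length 10, which is the role max_document_length plays in the model below.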

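Step 2: Building the model

Once all your text is converted to integer sequences, you can build the model. Note that the input shape is [None, max_document_length]. None leaves that dimension unspecified (which allows a variable batch size), and max_document_length is the length of the sequences we padded earlier.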

#Create our model (tf is needed below for device placement)
import tensorflow as tf
import tflearn
network = tflearn.input_data(shape=[None, max_document_length], name='input')

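Create the embedding matrix. Note that the embedding matrix is pinned to the CPU. The input_dim parameter expects an integer giving the vocabulary size; output_dim is the size of each embedding vector.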

with tf.device('/cpu:0'):
    network = tflearn.embedding(network, input_dim=vocabulary_size, output_dim=128)
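Conceptually, an embedding layer is a lookup table: each integer id selects one row of a [vocabulary_size, output_dim] matrix. The sketch below illustrates that idea in plain Python, not tflearn's actual implementation; the tiny sizes and the random initial values are assumptions chosen so the shapes are easy to follow.

```python
import random

vocabulary_size = 10   # hypothetical; plays the role of input_dim
embedding_dim = 4      # hypothetical; the layer above uses output_dim=128

random.seed(0)
# One row of embedding_dim floats per word id. In a real model these
# values are trainable weights rather than fixed random numbers.
embedding_matrix = [
    [random.uniform(-1.0, 1.0) for _ in range(embedding_dim)]
    for _ in range(vocabulary_size)
]

def embed(batch):
    # Replace every integer id with its row of the embedding matrix.
    return [[embedding_matrix[word_id] for word_id in sequence]
            for sequence in batch]

batch = [[0, 0, 3, 7], [0, 1, 2, 9]]   # shape [2, 4]: two padded sequences
embedded = embed(batch)                # shape [2, 4, embedding_dim]
```

Every id in the input has to be smaller than vocabulary_size, otherwise the lookup falls outside the table. That is why input_dim must cover the largest id produced during preprocessing.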

#Pass embeddings into an lstm layer (handles sequential problems)
network = tflearn.lstm(network, 512, dropout=0.8)

#Squish data into a fully connected layer, with 2 outputs for binary classification
network = tflearn.fully_connected(network, 2, activation='softmax')

#Add the regression layer, which sets the optimizer and loss used for training
network = tflearn.regression(network, optimizer='rmsprop', learning_rate=0.001,
                             loss='categorical_crossentropy')

#Wrap the graph we just created in a tflearn DNN wrapper
model = tflearn.DNN(network)

#Run model.fit to actually train your model
model.fit(x_train, y_train, n_epoch=15, shuffle=True, validation_set=(x_val, y_val), show_metric=True, batch_size=batch_size)
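Because the final layer has two softmax outputs and the loss is categorical_crossentropy, y_train and y_val presumably need to be one-hot rows of width 2 rather than a flat list of 0/1 labels. Here is a minimal sketch in plain Python, assuming the labels start out as the integers 0 and 1; the function name one_hot is hypothetical.

```python
def one_hot(labels, num_classes=2):
    """Turn integer class labels into one-hot rows of length num_classes."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError("label %r is outside 0..%d" % (label, num_classes - 1))
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

# Hypothetical labels: 1 = answer matches, 0 = answer does not match.
y_train = one_hot([0, 1, 1, 0])
```

tflearn.data_utils also has a to_categorical helper that serves the same purpose, though its exact signature may differ between versions.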

Regarding "tensorflow - How to build a word embedding model with Tflearn?", we found a similar question on Stack Overflow: https://stackoverflow.com/questions/49707130/
