python - 如何在 2 列上训练 ML 模型来解决分类问题？

我在数据集中有三列进行情感分析(类 0、1、2):

text    thing    sentiment

但问题是我只能在 text 或 thing 上训练数据并获得预测的情绪。有没有办法在text和thing上训练数据，然后预测情绪？

问题案例(例如):

  |text  thing  sentiment
0 | t1   thing1    0
. |
. |
54| t1   thing2    2

这个例子告诉我们，情绪也取决于事物。如果我尝试将两列连接在另一列下面，然后尝试，但这将是不正确的，因为我们不会向模型提供两列之间的任何关系。

此外，我的测试集包含两列test和thing，我必须根据经过训练的模型来预测情绪两列。

现在我正在使用tokenizer，然后使用下面的模型:

model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

有关如何继续或使用哪种模型或编码操作的任何指示？

最佳答案

您可能希望转向 Keras 功能 API 并训练多输入模型。

据 Keras 的创建者 François CHOLLET 在他的著作《使用 Python 进行深度学习 [Manning，2017](第 7 章第 1 节)中所述:

Some tasks, require multimodal inputs: they merge data coming from different input sources, processing each type of data using different kinds of neural layers. Imagine a deep-learning model trying to predict the most likely market price of a second-hand piece of clothing, using the following inputs: user-provided metadata (such as the item’s brand, age, and so on), a user-provided text description, and a picture of the item. If you had only the metadata available, you could one-hot encode it and use a densely connected network to predict the price. If you had only the text description available, you could use an RNN or a 1D convnet. If you had only the picture, you could use a 2D convnet. But how can you use all three at the same time? A naive approach would be to train three separate models and then do a weighted average of their predictions. But this may be suboptimal, because the information extracted by the models may be redundant. A better way is to jointly learn a more accurate model of the data by using a model that can see all available input modalities simultaneously: a model with three input branches.

关于python - 如何在 2 列上训练 ML 模型来解决分类问题？，我们在Stack Overflow上找到一个类似的问题： https://stackoverflow.com/questions/57488823/

python - 如何在 2 列上训练 ML 模型来解决分类问题？

问题案例(例如):

上一篇：python - Keras 中的 3D 卷积是否适用于 RGB 视频？

下一篇：python - 创建模型后清除Python循环中的内存