python - 如何在 2 列上训练 ML 模型来解决分类问题?

标签 python machine-learning classification logistic-regression training-data

我在数据集中有三列进行情感分析(类 012):

text    thing    sentiment

但问题是我只能在 textthing 上训练数据并获得预测的情绪。有没有办法在textthing上训练数据,然后预测情绪

问题案例(例如):

  |text  thing  sentiment
0 | t1   thing1    0
. |
. |
54| t1   thing2    2

这个例子告诉我们,情绪也取决于事物。如果我尝试将两列连接在另一列下面,然后尝试,但这将是不正确的,因为我们不会向模型提供两列之间的任何关系。

此外,我的测试集包含两列testthing,我必须根据经过训练的模型来预测情绪两列。

现在我正在使用tokenizer,然后使用下面的模型:

model = Sequential()
model.add(Embedding(MAX_NB_WORDS, EMBEDDING_DIM, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.2))
model.add(LSTM(100, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(3, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())

有关如何继续或使用哪种模型或编码操作的任何指示?

最佳答案

您可能希望转向 Keras 功能 API 并训练多输入模型。

据 Keras 的创建者 François CHOLLET 在他的著作《使用 Python 进行深度学习 [Manning,2017](第 7 章第 1 节)中所述:

Some tasks, require multimodal inputs: they merge data coming from different input sources, processing each type of data using different kinds of neural layers. Imagine a deep-learning model trying to predict the most likely market price of a second-hand piece of clothing, using the following inputs: user-provided metadata (such as the item’s brand, age, and so on), a user-provided text description, and a picture of the item. If you had only the metadata available, you could one-hot encode it and use a densely connected network to predict the price. If you had only the text description available, you could use an RNN or a 1D convnet. If you had only the picture, you could use a 2D convnet. But how can you use all three at the same time? A naive approach would be to train three separate models and then do a weighted average of their predictions. But this may be suboptimal, because the information extracted by the models may be redundant. A better way is to jointly learn a more accurate model of the data by using a model that can see all available input modalities simultaneously: a model with three input branches.

关于python - 如何在 2 列上训练 ML 模型来解决分类问题?,我们在Stack Overflow上找到一个类似的问题: https://stackoverflow.com/questions/57488823/

相关文章:

python - PyQt4:为什么使用 QTreeWidgetItem 时 Python 会在关闭时崩溃?

machine-learning - 需要有关实时视频上的对象检测和运动分类的建议

python - 如何为由 tf 操作组成的操作注册自定义梯度

machine-learning - 是否有必要同时运行随机森林和交叉验证

python - 在 knn 算法中计算距离而不是欧氏距离的替代有效方法

python - 在 aiohttp 中使用共享 TCPConnector 时出错

python - 图像中网格图案的鲁棒检测

hadoop - 如何在 Apache mahout 中合并两个相似实例

machine-learning - 分类中的熵

python - Six.u() 取消转义 HTML 字符串