machine-learning - Handling categorical class labels for scikit-learn MLPClassifier

Tags: machine-learning scikit-learn neural-network multilabel-classification

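I have a handwriting dataset for classification whose classes are the letters a-z. If I want to use MLPClassifier, I don't think I can use these categorical classes directly, because I believe the MLP implementation in scikit-learn only handles numeric classes. What would be the appropriate thing to do here? Would it make sense to convert the classes to the numbers 1-28? If not, does scikit-learn provide a special encoding mechanism for class labels to handle this case (I guess one-hot encoding is not an option here)?

Thanks

Best answer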

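You will probably need to preprocess the data, since scikit-learn estimators generally expect numeric values. In this example I want to predict the currency of a transaction. The currency is given as an ISO code, so I use LabelEncoder to convert it into integer categories (i.e. 0, 1, 2, ...):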

# Import the LabelEncoder class
from sklearn.preprocessing import LabelEncoder

# Encode the class column (my_data is a DataFrame with a 'currency' column)
my_encoder = LabelEncoder()
# fit_transform returns a 1-D integer array, the shape scikit-learn expects for y
my_class_currency = my_encoder.fit_transform(my_data['currency'])
# Create a "dictionary" to translate the categories back into the actual values once you have the output
# classes_ holds the sorted unique labels; position i is the label for code i
my_class_decoder = list(my_encoder.classes_)
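Applied to the letter classes from the question, the same encode and decode round trip might look like the minimal sketch below. The `letters` list is a hypothetical sample, not data from the original post, and the "predictions" are simply the encoded values, used to show how `inverse_transform` maps integer codes back to the original labels.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample of letter labels standing in for the handwriting classes
letters = ["c", "a", "z", "a", "b"]

encoder = LabelEncoder()
# Integer codes are assigned in sorted label order, starting at 0
codes = encoder.fit_transform(letters)

# Stand-in for model output: map integer codes back to the original letters
decoded = encoder.inverse_transform(codes)

print(codes.tolist())             # [2, 0, 3, 0, 1]
print(decoded.tolist())           # ['c', 'a', 'z', 'a', 'b']
print(encoder.classes_.tolist())  # ['a', 'b', 'c', 'z']
```

Note that, as far as I know, scikit-learn classifiers (MLPClassifier included) also accept string labels in y directly and encode them internally, so an explicit LabelEncoder step is mostly useful when you want to control or inspect the mapping yourself.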

Regarding machine-learning - Handling categorical class labels for scikit-learn MLPClassifier, a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/50672446/
