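I'm using sklearn's StratifiedKFold to split my data. When I try to split with multi-class targets, I get the error shown below. When I split with binary targets, it works without a problem.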
import numpy as np
import keras
from sklearn.model_selection import StratifiedKFold

num_classes = len(np.unique(y_train))
y_train_categorical = keras.utils.to_categorical(y_train, num_classes)
kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=999)
# splitting data into different folds
# (passing the one-hot y_train_categorical to split() is what raises the error below)
for i, (train_index, val_index) in enumerate(kf.split(x_train, y_train_categorical)):
    x_train_kf, x_val_kf = x_train[train_index], x_train[val_index]
    y_train_kf, y_val_kf = y_train[train_index], y_train[val_index]
ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.
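Best answer

keras.utils.to_categorical produces a one-hot encoded class matrix, i.e. the 'multilabel-indicator' target type named in the error message. StratifiedKFold is not designed to work with such input; from the docs of its split method: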
split(X, y, groups=None)
[...]
y : array-like, shape (n_samples,)
    The target variable for supervised learning problems. Stratification is done based on the y labels.
In other words, your y must be a 1-D array of your class labels.
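To see the difference between the two target shapes, here is a minimal sketch using sklearn.utils.multiclass.type_of_target, which reports how scikit-learn classifies a target array. The toy labels are hypothetical, and the NumPy one-hot encoding is only a stand-in for keras.utils.to_categorical so that the snippet runs without Keras.

```python
import numpy as np
from sklearn.utils.multiclass import type_of_target

# Hypothetical toy labels: three classes stored as a 1-D integer array.
y_labels = np.array([0, 1, 2, 1, 0, 2])

# One-hot encode with NumPy (assumed stand-in for keras.utils.to_categorical).
y_one_hot = np.eye(3)[y_labels]

# Ask scikit-learn how it classifies each target array.
label_type = type_of_target(y_labels)
one_hot_type = type_of_target(y_one_hot)

print(label_type)    # target type of the 1-D label array
print(one_hot_type)  # target type of the 2-D one-hot matrix
```

The 1-D array is reported as 'multiclass', which StratifiedKFold accepts, while the one-hot matrix is reported as 'multilabel-indicator', which it rejects.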
Essentially, all you have to do is reverse the order of the operations: split first (using your original y_train), and apply to_categorical afterwards.
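The reordering can be sketched as follows. This is an illustration under stated assumptions, not the asker's actual pipeline: the toy x_train and y_train are made up, and the one_hot helper is a hypothetical NumPy stand-in for keras.utils.to_categorical so that the snippet runs without Keras.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical toy data: 30 samples, 4 features, 3 balanced classes.
rng = np.random.default_rng(0)
x_train = rng.normal(size=(30, 4))
y_train = np.repeat(np.arange(3), 10)
num_classes = len(np.unique(y_train))


def one_hot(labels, n_classes):
    # Hypothetical NumPy stand-in for keras.utils.to_categorical(labels, n_classes).
    return np.eye(n_classes, dtype="float32")[labels]


kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=999)

fold_shapes = []
# Stratify on the 1-D integer labels, not on the one-hot matrix.
for i, (train_index, val_index) in enumerate(kf.split(x_train, y_train)):
    x_train_kf, x_val_kf = x_train[train_index], x_train[val_index]
    # One-hot encode only after the fold indices have been chosen.
    y_train_kf = one_hot(y_train[train_index], num_classes)
    y_val_kf = one_hot(y_train[val_index], num_classes)
    fold_shapes.append((y_train_kf.shape, y_val_kf.shape))
```

With real Keras code, the two one_hot calls would be replaced by keras.utils.to_categorical with the same arguments. Passing num_classes explicitly keeps the encoded width the same in every fold.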
Source: Stack Overflow, "Sklearn StratifiedKFold: ValueError: Supported target types are: ('binary', 'multiclass'). Got 'multilabel-indicator' instead.": https://stackoverflow.com/questions/48508036/