python - Plotting a confusion matrix with sklearn for a Keras model that uses data generators

Tags: python keras scikit-learn tensorflow2.0 tensorflow-datasets

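scikit-learn clearly documents how to plot a confusion matrix for its own classification models.
But how can it be used with a Keras model that is fed by data generators? Let's look at some example code.
First we need to train the model.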

import numpy as np
from keras.models import Sequential
# In Keras 2.x / TensorFlow 2.x the layers are imported from keras.layers directly
from keras.layers import Dense, Dropout, Activation, Flatten, Convolution2D, MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import classification_report, confusion_matrix

#Start
train_data_path = 'F://data//Train'
test_data_path = 'F://data//Validation'
img_rows = 150
img_cols = 150
epochs = 30
batch_size = 32
num_of_train_samples = 3000
num_of_test_samples = 600

#Image Generator
train_datagen = ImageDataGenerator(rescale=1. / 255,
                                   rotation_range=40,
                                   width_shift_range=0.2,
                                   height_shift_range=0.2,
                                   shear_range=0.2,
                                   zoom_range=0.2,
                                   horizontal_flip=True,
                                   fill_mode='nearest')

test_datagen = ImageDataGenerator(rescale=1. / 255)

train_generator = train_datagen.flow_from_directory(train_data_path,
                                                    target_size=(img_rows, img_cols),
                                                    batch_size=batch_size,
                                                    class_mode='categorical')

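# shuffle=False keeps the batches in directory order, matching validation_generator.classes
validation_generator = test_datagen.flow_from_directory(test_data_path,
                                                        target_size=(img_rows, img_cols),
                                                        batch_size=batch_size,
                                                        class_mode='categorical',
                                                        shuffle=False)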

# Build model
model = Sequential()
model.add(Convolution2D(32, (3, 3), input_shape=(img_rows, img_cols, 3), padding='valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, (3, 3), padding='valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, (3, 3), padding='valid'))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(5))  # one output unit per class; must equal the number of class folders
model.add(Activation('softmax'))

model.compile(loss='categorical_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

#Train
# fit_generator is deprecated in TensorFlow 2.x; fit accepts generators directly
model.fit(train_generator,
          steps_per_epoch=num_of_train_samples // batch_size,
          epochs=epochs,
          validation_data=validation_generator,
          validation_steps=num_of_test_samples // batch_size)
Now that the model is trained, let's build a confusion matrix.
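# Confusion matrix and classification report
# validation_generator.classes is in directory order, so the generator must be
# created with shuffle=False for these labels to line up with y_pred.
# predict_generator is deprecated in TensorFlow 2.x; predict accepts generators directly.
Y_pred = model.predict(validation_generator, steps=len(validation_generator))
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
print('Classification Report')
target_names = ['Cats', 'Dogs', 'Horse']  # must name every class, in class-index order
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))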
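This works fine so far. But how can I save the result as a PNG in the same layout as the scikit-learn example mentioned above?
Any ideas are highly appreciated.
Thanks in advance.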
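
One thing to check before trusting the matrix above: `validation_generator.classes` lists the labels in directory order, while `flow_from_directory` shuffles batches by default, so predictions and labels can end up misaligned unless the generator is built with `shuffle=False`. An order-independent alternative is to collect the true labels from the same batches that are fed to the model. The sketch below illustrates this with NumPy only; `FakeGenerator` and `fake_predict` are hypothetical stand-ins for a Keras generator and `model.predict`, used so the example runs without image data.

```python
import numpy as np


class FakeGenerator:
    """Hypothetical stand-in for a Keras generator: yields (x, one-hot y) batches."""

    def __init__(self, x, y_onehot, batch_size, seed=0):
        # Shuffle once to mimic a generator that does not keep directory order.
        order = np.random.default_rng(seed).permutation(len(x))
        self.x, self.y = x[order], y_onehot[order]
        self.batch_size = batch_size

    def __len__(self):
        # Number of batches, counting the final partial batch.
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, i):
        s = slice(i * self.batch_size, (i + 1) * self.batch_size)
        return self.x[s], self.y[s]


def fake_predict(x_batch):
    """Hypothetical stand-in for model.predict: one row of class scores per sample."""
    return x_batch  # in this toy setup the features already are the class scores


def collect_labels_and_predictions(generator, predict_fn):
    """Pair each batch's true labels with the predictions for that same batch."""
    y_true, y_pred = [], []
    for i in range(len(generator)):
        x_batch, y_batch = generator[i]
        y_true.append(np.argmax(y_batch, axis=1))  # one-hot -> class index
        y_pred.append(np.argmax(predict_fn(x_batch), axis=1))
    return np.concatenate(y_true), np.concatenate(y_pred)


labels = np.array([0, 0, 1, 1, 2, 2, 2])
one_hot = np.eye(3)[labels]
scores = one_hot.copy()
scores[0] = [0.1, 0.8, 0.1]  # make the first sample a deliberate misclassification

gen = FakeGenerator(scores, one_hot, batch_size=3)
y_true, y_pred = collect_labels_and_predictions(gen, fake_predict)
print(len(y_true), int((y_true == y_pred).sum()))
```

With a real Keras generator, `predict_fn` would presumably be `model.predict`, and the two returned arrays could be passed straight to `confusion_matrix`.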

Best answer

Like this (see also ConfusionMatrixDisplay and confusion_matrix):

from sklearn.metrics import ConfusionMatrixDisplay
from sklearn.metrics import confusion_matrix
import matplotlib.pyplot as plt
import numpy as np


y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
y_test = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2])
labels = ["Cats", "Dogs", "Horses"]

cm = confusion_matrix(y_test, y_pred)

disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)

disp.plot(cmap=plt.cm.Blues)
plt.show()
Result:
[Image: confusion matrix plotted without scikit-learn's plot_confusion_matrix method]
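
To get the PNG the question asks for, the figure can be written to disk instead of (or as well as) being shown. The sketch below assumes `y_true` and `y_pred` are integer class arrays, such as `validation_generator.classes` and the `np.argmax` output from the question. The helper name `save_confusion_matrix_png` and the output file name are made up for illustration.

```python
import os

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs without a display
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix


def save_confusion_matrix_png(y_true, y_pred, labels, path, dpi=150):
    """Plot a confusion matrix and write it to `path` as an image file."""
    # Fix the label order explicitly so rows and columns always match `labels`.
    cm = confusion_matrix(y_true, y_pred, labels=np.arange(len(labels)))
    disp = ConfusionMatrixDisplay(confusion_matrix=cm, display_labels=labels)
    disp.plot(cmap=plt.cm.Blues)
    disp.figure_.savefig(path, dpi=dpi, bbox_inches="tight")
    plt.close(disp.figure_)  # release the figure once it is on disk
    return cm


# Same toy data as in the answer above.
y_pred = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2])
y_test = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 2])
cm = save_confusion_matrix_png(y_test, y_pred, ["Cats", "Dogs", "Horses"],
                               "confusion_matrix.png")
print(cm)
```

The file extension passed to `savefig` selects the output format, so a `.png` path produces a PNG. Saving through `disp.figure_` rather than `plt.savefig` is a design choice here: it targets the figure that the display object drew on, even if other figures are open.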

This post is based on a similar question found on Stack Overflow, "python - Plotting a confusion matrix with sklearn for a Keras model that uses data generators": https://stackoverflow.com/questions/67303001/
