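python-3.x - Keras binary classification: squash output to 0/1

Tags: python-3.x machine-learning deep-learning keras theano

I have a feed-forward DNN model with several layers that performs binary classification. The output layer is a single sigmoid unit and the loss function is binary_crossentropy. As predictions I expect a vector of zeros and ones, so I round the predictions and ravel them. I then use the sklearn scoring functions to compute f1_score, roc_auc, precision, recall and mcc. The problem is that the prediction vector I get does not match the 0/1 (one-hot) encoding I intended. If I use an mse loss function instead, it works as intended.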

=> Model creation function:

    def create_DNN_model(self, verbose=True):
        print("Creating DNN model")
        fundamental_parameters = ['dropout', 'output_activation', 'optimization', 'learning_rate',
                              'units_in_input_layer',
                              'units_in_hidden_layers', 'nb_epoch', 'batch_size']
        for param in fundamental_parameters:
            if self.parameters[param] is None:
                print("Parameter not set: " + param)
                return
        self.print_parameter_values()
        model = Sequential()
        # Input layer
        model.add(Dense(self.parameters['units_in_input_layer'], input_dim=self.feature_number, activation='relu'))
        model.add(BatchNormalization())
        model.add(Dropout(self.parameters['dropout']))
        # constructing all hidden layers
        for layer in self.parameters['units_in_hidden_layers']:
            model.add(Dense(layer, activation='relu'))
            model.add(BatchNormalization())
            model.add(Dropout(self.parameters['dropout']))
        # constructing the final layer
        model.add(Dense(1))
        model.add(Activation(self.parameters['output_activation']))
        if self.parameters['optimization'] == 'SGD':
            optim = SGD()
            optim.lr.set_value(self.parameters['learning_rate'])
        elif self.parameters['optimization'] == 'RMSprop':
            optim = RMSprop()
            optim.lr.set_value(self.parameters['learning_rate'])
        elif self.parameters['optimization'] == 'Adam':
            optim = Adam()
        elif self.parameters['optimization'] == 'Adadelta':
            optim = Adadelta()
        model.add(BatchNormalization())
        model.compile(loss='binary_crossentropy', optimizer=optim, metrics=[matthews_correlation])
        if self.verbose == 1: str(model.summary())
        print("DNN model successfully created")
        return model

=> Evaluation function:

    def evaluate_model(self, X_test, y_test):
        print("Evaluating model with hold out test set.")
        y_pred = self.model.predict(X_test)
        y_pred = [float(np.round(x)) for x in y_pred]
        y_pred = np.ravel(y_pred)
        scores = dict()
        scores['roc_auc'] = roc_auc_score(y_test, y_pred)
        scores['accuracy'] = accuracy_score(y_test, y_pred)
        scores['f1_score'] = f1_score(y_test, y_pred)
        scores['mcc'] = matthews_corrcoef(y_test, y_pred)
        scores['precision'] = precision_score(y_test, y_pred)
        scores['recall'] = recall_score(y_test, y_pred)
        scores['log_loss'] = log_loss(y_test, y_pred)
        for metric, score in scores.items():
            print(metric + ': ' + str(score))
        return scores
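
The rounding step in `evaluate_model` can be paired with a range check, since a sigmoid output unit can only produce values in [0, 1]. The following is a minimal sketch using NumPy only. The helper name `to_binary_labels` and the 0.5 threshold are assumptions made for illustration and do not appear in the original code.

```python
import numpy as np

def to_binary_labels(probs, threshold=0.5):
    """Flatten model outputs and threshold them into 0/1 labels.

    Hypothetical helper: raises ValueError if any value lies outside
    [0, 1], which a sigmoid output unit cannot produce.
    """
    probs = np.asarray(probs, dtype=float).ravel()
    if probs.size and (probs.min() < 0.0 or probs.max() > 1.0):
        raise ValueError(
            "outputs outside [0, 1]: min=%g, max=%g" % (probs.min(), probs.max())
        )
    return (probs >= threshold).astype(int)

# Column vector shaped like the (n_samples, 1) output of model.predict
example = np.array([[0.1], [0.8], [0.5], [0.49]])
labels = to_binary_labels(example)
print(labels)
```

With a check like this, a prediction vector containing values such as -1 or 2 would fail early instead of being passed to the scoring functions. Note also that `np.round` rounds halves to the nearest even value, so `np.round(0.5)` is `0.0`, whereas the explicit `>=` comparison maps 0.5 to 1. Metrics such as `roc_auc_score` and `log_loss` are designed to take the unrounded scores or probabilities rather than the thresholded labels.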

=> Prediction vector "y_pred":

[-1. -1.  2. -0.  2. -1. -1. -1.  2. -1. -1.  2. -1.  2. -1.  2. -1. -1.  2. -1.  2. -1. -1.  2. -1.  2.  2.  2. -1. -1.  2.  2.  2.  2. -1. -1. 2.  2.  2. -1.  2.  2. -1.  2. -1. -1. -1.  1. -1. -1. -1.]

Thanks in advance.

Best answer

You are using a linear activation (the default) in the output layer, whereas you should be using a sigmoid.
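
To illustrate why the last operation in the network matters, here is a small NumPy sketch. It is not Keras code: `sigmoid` and `batch_norm` are hand-written stand-ins, and the random logits are made-up data. It shows that values coming straight out of a sigmoid stay in [0, 1] and round to 0 or 1, while anything applied after the sigmoid can move them out of that range. This may be relevant here because the posted `create_DNN_model` adds a `BatchNormalization` layer after the output `Activation`, so the final output would not be bounded even if `output_activation` is set to sigmoid.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, maps any real value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def batch_norm(x, gamma=1.0, beta=0.0, eps=1e-3):
    """Training-mode batch normalization over the batch axis (NumPy stand-in)."""
    return gamma * (x - x.mean(axis=0)) / np.sqrt(x.var(axis=0) + eps) + beta

rng = np.random.default_rng(0)
logits = rng.normal(scale=4.0, size=(1000, 1))  # made-up pre-activation values

probs = sigmoid(logits)       # sigmoid as the final operation
after_bn = batch_norm(probs)  # normalization applied after the sigmoid

print("sigmoid range:", probs.min(), probs.max())
print("rounded sigmoid values:", set(np.round(probs).ravel().tolist()))
print("range after normalization:", after_bn.min(), after_bn.max())
```

In Keras terms, the corresponding change would be to make the sigmoid the very last layer, for example `model.add(Dense(1, activation='sigmoid'))`, with no further layers added before `model.compile`.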

Regarding "python-3.x - Keras binary classification: squash output to 0/1", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/46697007/