python - In scikit-learn's precision_recall_curve, why do the thresholds have a different dimension from recall and precision?

Tags: python python-2.7 scikit-learn precision-recall

I want to see how precision and recall vary with the threshold (not just against each other):

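from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import precision_recall_curve

model = RandomForestClassifier(n_estimators=500, n_jobs=-1)
model.fit(X_train, y_train)
probas = model.predict_proba(X_test)[:, 1]  # probability of the positive class
precision, recall, thresholds = precision_recall_curve(y_test, probas)
print(len(precision))
print(len(thresholds))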

283  
282

So I can't plot them together. Any clues as to why this is the case?

Best answer

For this problem, the last precision and recall values should be ignored. They are always 1 and 0 respectively, and have no corresponding threshold.
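A minimal sketch to check this on toy data. The labels and scores below are made up for illustration and are not from the question; the exact array lengths can vary between scikit-learn versions, but the off-by-one relationship should hold.

```python
import numpy as np
from sklearn.metrics import precision_recall_curve

# Toy labels and scores, invented purely for illustration.
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])

precision, recall, thresholds = precision_recall_curve(y_true, y_scores)

# precision and recall each have one more entry than thresholds.
print(len(precision), len(recall), len(thresholds))

# The extra trailing point is fixed at precision 1, recall 0.
print(precision[-1], recall[-1])

# Dropping that last point lines all three arrays up one-to-one.
aligned = list(zip(thresholds, precision[:-1], recall[:-1]))
```

Each tuple in `aligned` is then a (threshold, precision, recall) triple that can be plotted or tabulated directly.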

For example, here is one solution:

import matplotlib.pyplot as plt

def plot_precision_recall_vs_threshold(precisions, recalls, thresholds):
    # Drop the final (precision=1, recall=0) point, which has no threshold.
    plt.figure(figsize=(8, 5))
    plt.plot(thresholds, precisions[:-1], "b--", label="Precision")
    plt.plot(thresholds, recalls[:-1], "g-", label="Recall")
    plt.xlabel("Threshold")
    plt.legend()

plot_precision_recall_vs_threshold(precision, recall, thresholds)
plt.show()

These values are included so that, when precision is plotted against recall, the curve starts at the y-axis (x=0).
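As a rough sketch of that other use, the full arrays (without slicing) can be plotted against each other. The helper name `plot_precision_vs_recall` and the toy data are hypothetical, and the `Agg` backend is only an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove to view interactively
import matplotlib.pyplot as plt
import numpy as np
from sklearn.metrics import precision_recall_curve

def plot_precision_vs_recall(precision, recall):
    """Plot precision (y) against recall (x) using the full arrays."""
    fig, ax = plt.subplots(figsize=(8, 5))
    ax.plot(recall, precision, "b-")
    ax.set_xlabel("Recall")
    ax.set_ylabel("Precision")
    return ax

# Toy labels and scores, invented purely for illustration.
y_true = np.array([0, 0, 1, 1])
y_scores = np.array([0.1, 0.4, 0.35, 0.8])
precision, recall, _ = precision_recall_curve(y_true, y_scores)

# No [:-1] here: the final (recall=0, precision=1) point anchors the curve at x=0.
ax = plot_precision_vs_recall(precision, recall)
```

Here the thresholds are not needed at all, which is why the length mismatch only matters when plotting against the threshold axis.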

Source: a similar question on Stack Overflow: https://stackoverflow.com/questions/31639016/
