In the sklearn documentation, the weights="distance" parameter of KNeighborsClassifier is explained as follows:
‘distance’ : weight points by the inverse of their distance. in this case, closer neighbors of a query point will have a greater influence than neighbors which are further away.
Weighting the neighbors and then computing the prediction as a weighted average of their values makes sense to me for regression, e.g. with KNeighborsRegressor. However, I don't see how the weights are used in the classification algorithm. According to the book The Elements of Statistical Learning, KNN classification is based on majority voting. Isn't it?
Best Answer
During classification, the weights are used when computing the mode of the neighbors' labels: instead of counting label frequencies, the weights attached to each label are summed, and the label with the largest total weight wins.
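The idea can be sketched in plain Python. This is a hypothetical illustration, not sklearn's actual code: the helper `weighted_vote` and its argument names are made up here, and it assumes the weight of each neighbor is simply the inverse of its distance, as the documentation describes.

```python
# Sketch of an inverse-distance weighted vote (illustrative, not
# sklearn's implementation): each neighbor votes with weight 1/distance
# instead of 1, so a single very close neighbor can outvote several
# distant ones.
from collections import defaultdict

def weighted_vote(neighbor_labels, distances):
    """Return the label whose summed inverse-distance weight is largest."""
    totals = defaultdict(float)
    for label, d in zip(neighbor_labels, distances):
        totals[label] += 1.0 / d  # closer neighbors contribute more
    return max(totals, key=totals.get)

# Three neighbors of class 0 at distance 2, one neighbor of class 1
# very close at distance 0.25:
labels = [0, 0, 0, 1]
dists = [2.0, 2.0, 2.0, 0.25]
print(weighted_vote(labels, dists))  # prints 1: 1/0.25 = 4.0 beats 3 * 1/2 = 1.5
```

An unweighted majority vote on the same neighbors would return class 0, which is exactly the difference weights="distance" makes.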
For more details on the actual implementation, see here.
An example from the documentation:
>>> from sklearn.utils.extmath import weighted_mode
>>> x = [4, 1, 4, 2, 4, 2]
>>> weights = [1, 1, 1, 1, 1, 1]
>>> weighted_mode(x, weights)
(array([4.]), array([3.]))
The value 4 appears three times: with uniform weights, the result is simply the mode of the distribution.
>>>
>>> weights = [1, 3, 0.5, 1.5, 1, 2] # deweight the 4's
>>> weighted_mode(x, weights)
(array([2.]), array([3.5]))
You can view the implementation here.
Regarding "python - How is the parameter "weights" used in KNeighborsClassifier?", a similar question can be found on Stack Overflow: https://stackoverflow.com/questions/56799923/