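I am sorting my rated items using the Wilson Score equation described at http://www.evanmiller.org/how-not-to-sort-by-average-rating.html . However, an item with 1 negative vote (and 0 positive votes) gets the same score as an item with 1000 negative votes (and 0 positive votes), namely 0.
To overcome this shortcoming I would like either to allow the Wilson score to go negative, or to hear suggestions for another solution.
Either way, I don't know how to change this equation/function: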
import math

def ci_lower_bound(pos, n, confidence):
    if n == 0:
        return 0
    # z is hard-coded for 95% confidence; the confidence argument is not used here
    z = 1.96
    phat = 1.0 * pos / n
    score = (phat + z*z/(2*n) - z*math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)
    return score
where pos is the number of positive ratings, n is the total number of ratings, and confidence is the statistical confidence level.
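To make the problem concrete, here is a minimal sketch that reproduces it. It assumes Python 3.8+ for `statistics.NormalDist`; `z_for_confidence` is a hypothetical helper (not part of the original function) that derives z from `confidence` instead of hard-coding 1.96, and the default of 0.95 is likewise an assumption.

```python
import math
from statistics import NormalDist  # standard library, Python 3.8+


def z_for_confidence(confidence):
    # Hypothetical helper: two-sided z-value for a confidence level such as 0.95.
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def ci_lower_bound(pos, n, confidence=0.95):
    # Lower bound of the Wilson score interval, same formula as above.
    if n == 0:
        return 0.0
    z = z_for_confidence(confidence)
    phat = pos / n
    return (phat + z*z/(2*n)
            - z*math.sqrt((phat*(1 - phat) + z*z/(4*n))/n)) / (1 + z*z/n)


one_downvote = ci_lower_bound(0, 1)       # 0 positive votes out of 1
many_downvotes = ci_lower_bound(0, 1000)  # 0 positive votes out of 1000
# Both values are zero up to floating-point rounding.
print(one_downvote, many_downvotes)
```

With no positive votes the lower bound collapses to zero regardless of n, which is why the two items cannot be told apart.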
Best answer
Logically, your rating system has to handle the following cases:
+----------+----------+------------+---------------+
| Positive | Negative | Any Votes? | Wilson Score? |
+----------+----------+------------+---------------+
| N        | N        | N          | Y, = 0        |
| Y        | Y        | Y          | Y             |
| Y        | N        | Y          | Y             |
| N        | Y        | Y          | N             |
+----------+----------+------------+---------------+
The missing case, as you note, is an item with 0 positive votes and more than 0 negative votes.
Since you have both the positive and the negative vote counts at scoring time, why not follow your own idea and create a negative Wilson score for this case? Just remember that you cannot feed a negative count straight into the formula, because the square root of a negative number is complex.
To get around that, treat the negative votes as if they were positive. You then calculate how "liked" a negatively scored item would be and multiply the result by -1 to turn it into how disliked it is.
import math

def ci_lower_bound(pos, n, neg=0):
    # pos: positive votes, n: total votes, neg: negative votes
    # (neg is only used when pos == 0, so pass it explicitly in that case)
    if n == 0:
        return 0

    # Cannot calculate the square-root of a negative number
    if pos == 0:
        votes, use_neg = neg, True
    else:
        votes, use_neg = pos, False

    # Confidence
    z = 1.96
    phat = 1.0 * votes / n

    # Calculate how confident we are that this is bad or good.
    score = (phat + z*z/(2*n) - z * math.sqrt((phat*(1-phat)+z*z/(4*n))/n))/(1+z*z/n)

    # This relationship is defined above.
    # Multiply by -1 to return a negative confidence.
    if use_neg:
        return -1 * score
    return score
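As a rough usage sketch, the function above can be exercised on the two items from the question. It is repeated in condensed form here so the snippet runs on its own; the vote counts are made up for illustration.

```python
import math


def ci_lower_bound(pos, n, neg=0):
    # Condensed copy of the function above, for a self-contained example.
    if n == 0:
        return 0
    # With no positive votes, score the negative votes and flip the sign.
    votes, use_neg = (neg, True) if pos == 0 else (pos, False)
    z = 1.96
    phat = 1.0 * votes / n
    score = (phat + z*z/(2*n)
             - z * math.sqrt((phat*(1 - phat) + z*z/(4*n))/n)) / (1 + z*z/n)
    return -score if use_neg else score


one_down = ci_lower_bound(0, 1, neg=1)          # 1 downvote, 0 upvotes
many_down = ci_lower_bound(0, 1000, neg=1000)   # 1000 downvotes, 0 upvotes
one_up = ci_lower_bound(1, 1)                   # 1 upvote, 0 downvotes

# Sorting by score now places the heavily downvoted item last.
ranked = sorted([("one_up", one_up), ("one_down", one_down), ("many_down", many_down)],
                key=lambda item: item[1], reverse=True)
print(ranked)
```

When pos is 0 and every vote is negative, the score works out to -n / (n + z*z), so it moves toward -1 as downvotes accumulate. Note that neg has to be passed explicitly: with the default neg=0 the function still returns zero for such items.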
This question, "python - How can I improve this equation so that items with more negative than positive votes return a more useful Wilson score", comes from Stack Overflow: https://stackoverflow.com/questions/10258409/