The Smart Fruit Seller - How We Now Rank the Best Perfumes
We are constantly striving to improve the validity and accuracy of our community's ratings and rankings. As a result, we have updated the way we calculate the ratings for our top lists.
What is changing?
Instead of the previous Wilson score, we now use a "weighted Wilson score".
Simple averages vs. ranking
Before we go into detail, you should know that the average scores displayed on the page are not used for ranking in the top lists. The averages give you a first indication of how a fragrance was generally rated, but for a more accurate and meaningful ranking we use a weighted Wilson score.
What does this mean?
Imagine you have two buckets of apples. Bucket A has 3 perfect apples and bucket B has 30 very good apples but also 2 bad ones. Would you say bucket A is better just because all the apples are perfect? Probably not, because bucket B has more very good apples. A "weighted Wilson score" is like a smart fruit seller who takes into account both the quality and the number of apples.
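To make the apple example concrete, here is a minimal sketch of the plain (unweighted) Wilson lower bound. It is not the site's actual formula: the weighting is not described in this post, so the sketch assumes each apple simply counts as "good" or "bad" and uses a 95% confidence level (z = 1.96). The function name wilson_lower_bound is made up for illustration.

```python
import math


def wilson_lower_bound(positive: int, total: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval for a share of positive votes.

    Assumption: votes are binary (good / bad) and z = 1.96 (about 95%).
    """
    if total == 0:
        return 0.0  # no votes means no evidence either way
    p = positive / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt(p * (1 - p) / total + z2 / (4 * total * total))
    return (centre - margin) / (1 + z2 / total)


bucket_a = wilson_lower_bound(3, 3)    # 3 perfect apples, nothing else
bucket_b = wilson_lower_bound(30, 32)  # 30 very good apples, 2 bad ones
print(f"A: {bucket_a:.3f}  B: {bucket_b:.3f}")
```

Under these assumptions bucket A scores roughly 0.44 and bucket B roughly 0.80, so the larger bucket ranks higher even though its raw share of good apples is lower.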
Why is this important?
With this new approach, we are increasing the validity and accuracy of our Top Lists. This means the lists will give a more authentic and well-rounded picture of the perfumes that are actually your favourites.
As always, we look forward to your feedback!
If there’s a calculation going on behind the scenes, in this case the weighted Wilson score, make it apparent in how each entry is displayed. Right now the ranking just seems to be broken, I’m afraid.
How does the number of votes make this a better approximation for this measurement?
For example, why is a perfume with 600 votes and a 9.4 value-for-money rating ranked lower than a perfume with 3,900 votes and a rating of 7.6? I understand why the second one could rank higher in a popularity contest, but not here.
I don’t know what your overall brand strategy is and what kind of audience you want to attract. I do hope your site grows and prospers, because the features you guys have built and the community you have fostered here are amazing. Just be careful not to upset your hardcore fans in the process :-)
I agree that the previous method might not have been the most indicative, since 9.0s with only ~100 ratings were bordering on the top 10, but to me it appears that the minimum size of the bucket is now too high. The discrepancy between Widian's London (8.9, 1,329 ratings) at rank #8 and Nishane's Nefs (8.9, 745 ratings) at rank #50 is not that wide.
If anything, I like the previous method. Under that method, you could've punished the less popular scents with an invisible deduction of points for every 100 ratings fewer than some minimum. For example, say the floor is 1,000 and you take off 0.1 per 100 missing ratings: Nefs, with 745 ratings, is 255 short, so in reality it would be 8.9 - 0.255 = 8.645, and then you rank it based on that (or something like this method).
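The deduction suggested above can be sketched in a few lines. This is only an illustration of the commenter's idea, not anything the site uses; the function name penalised_rating is hypothetical, and the floor (1,000) and the 0.1 per 100 ratings are the example values from the comment.

```python
def penalised_rating(rating, num_ratings, floor=1000, penalty_per_100=0.1):
    """Subtract a proportional penalty for every rating missing below the floor.

    Assumption: the penalty scales linearly, so 255 missing ratings cost 0.255.
    """
    shortfall = max(0, floor - num_ratings)  # no penalty at or above the floor
    return rating - penalty_per_100 * shortfall / 100


nefs = penalised_rating(8.9, 745)     # 255 ratings short of the floor
london = penalised_rating(8.9, 1329)  # above the floor, so unchanged
print(f"Nefs: {nefs:.3f}  London: {london:.3f}")
```

With these values Nefs drops to 8.645 while London stays at 8.9, so the two would still be ordered by popularity, but by a fixed and visible margin.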