DonVanVliet
The Parfumo Developer Blog
11 months ago - 10/06/2023
The Smart Fruit Seller - How We Now Rank the Best Perfumes

We are constantly striving to improve the validity and accuracy of our community's ratings and rankings. As a result, we have updated the way we calculate the ratings for our top lists.

What changes?

Instead of the previous Wilson score, we now use a "weighted Wilson score".

Simple averages vs. ranking

Before we go into detail, you should know that the average scores displayed on the page are not used for ranking in the top lists. The averages give you a first indication of how a fragrance was generally rated, but for a more accurate and meaningful ranking we use a weighted Wilson score.

What does this mean?

Imagine you have two buckets of apples. Bucket A has 3 perfect apples and bucket B has 30 very good apples but also 2 bad ones. Would you say bucket A is better just because all the apples are perfect? Probably not, because bucket B has more very good apples. A "weighted Wilson score" is like a smart fruit seller who takes into account both the quality and the number of apples.
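To make the apple comparison concrete, here is a minimal sketch of the lower bound of a plain Wilson score interval, applied to the two buckets. It is not the formula Parfumo uses: the post does not describe how the weighting works, so no weighting is reproduced here. The apple quality values (1.0 for perfect, 0.9 for very good, 0.1 for bad), the 95% confidence level, and the function names are all assumptions made for illustration.

```python
import math


def wilson_lower_bound(mean_score: float, n: int, z: float = 1.96) -> float:
    """Lower bound of the Wilson score interval.

    mean_score: average quality on a 0..1 scale (assumption: scores are
                treated as fractional "successes").
    n:          number of ratings (apples in the bucket).
    z:          normal quantile; 1.96 corresponds to ~95% confidence.
    """
    if n == 0:
        return 0.0
    z2 = z * z
    centre = mean_score + z2 / (2 * n)
    margin = z * math.sqrt(mean_score * (1 - mean_score) / n + z2 / (4 * n * n))
    return (centre - margin) / (1 + z2 / n)


def bucket_score(apples: list[float]) -> float:
    """Score a bucket from the quality of each apple in it."""
    return wilson_lower_bound(sum(apples) / len(apples), len(apples))


# Hypothetical quality values, chosen only to mirror the example in the text.
bucket_a = [1.0] * 3               # 3 perfect apples
bucket_b = [0.9] * 30 + [0.1] * 2  # 30 very good apples, 2 bad ones

score_a = bucket_score(bucket_a)
score_b = bucket_score(bucket_b)

print(f"Bucket A: {score_a:.2f}")
print(f"Bucket B: {score_b:.2f}")
print("Bucket B ranks higher" if score_b > score_a else "Bucket A ranks higher")
```

Under these assumptions bucket A scores roughly 0.44 and bucket B roughly 0.69: three perfect apples are too few to be confident about, so the larger, slightly less perfect bucket ranks higher.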

Why is this important?

With this new approach, we are increasing the validity and accuracy of our Top Lists. This means the lists will give a more authentic and well-rounded picture of the perfumes that are actually your favourites.

As always, we look forward to your feedback!

11 Comments
Ceesie 2 months ago
I understand you implemented this scoring mechanism for top lists, but it’s very confusing when sorting search results. For example, when I sort by scent, I expect the highest-scoring scents at the top, not somewhere in the middle.
If there’s a calculation going on behind the scenes, in this case the weighted Wilson score, make it apparent in the way each entry is displayed. Right now it just seems to be broken, I’m afraid.
Czuczu 11 months ago
I still don't understand the best value for money ranking.
How does the number of votes make this a better approximation for this measurement?
For example, why is a perfume with 600 votes and a 9.4 value-for-money rating ranked lower than a perfume with 3,900 votes and a 7.6 rating? I understand why it could rank higher in a popularity contest, but not in this one.
Czuczu 11 months ago
It would be ideal if I could just sort fragrances by average value.
Kurai 11 months ago
Big improvement, I would say. Previously there were some new releases skyrocketing into the top-100 based on relatively few votes. That seems to be fixed now. It is true that mainstream and classics rank the highest now and that is how it should be. After all it is a popularity list, so I believe it should represent the majority of the community votes. The release year feature is a great way to filter out those all-time favorites and see how the new perfumes score.
Sopelka 11 months ago
Oh yes, I can see the difference and I like the new system because my tastes are pretty mainstream. I can now recognize all the top fragrances and actually agree that they are great!
I don’t know what your overall brand strategy is and what kind of audience you want to attract. I do hope your site grows and prospers, because the features you guys have built and the community you have fostered here are amazing. Just be careful not to upset your hardcore fans in the process :-)
NicheOnly 11 months ago
I strongly dislike this change because it pushes mainstream products to the top while punishing expensive ones. We know expensive products can be expected to smell better, and as a result the change pushes people away from both rating and buying.
I agree that the previous method might not have been the most indicative, since 9.0s with only ~100 ratings were bordering on the top 10, but to me it appears that the minimum size of the bucket is now too high. The difference between Widian's London (8.9, 1,329 ratings) at rank #8 and Nishane's Nefs (8.9, 745 ratings) at rank #50 is not that wide.
If anything, I prefer the previous method. Under that method, you could have punished the less popular scents with an invisible deduction of points for every 100 ratings fewer than some minimum. E.g. say the floor is 1,000 and you take off 0.1 per 100, so Nefs in reality would be 8.9 - 0.255 = 8.645, and then you rank it based on that (or something like this method).
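The deduction proposed in this comment can be sketched as below. This is only an illustration of the commenter's suggestion, not anything Parfumo implements; the function name and the default parameters (a floor of 1,000 ratings and 0.1 points per 100 missing ratings) are taken from or assumed for the example in the comment.

```python
def penalised_rating(
    average: float,
    num_ratings: int,
    floor: int = 1000,            # assumed minimum number of ratings
    penalty_per_100: float = 0.1  # assumed deduction per 100 missing ratings
) -> float:
    """Subtract a pro-rata penalty for every rating short of the floor."""
    shortfall = max(0, floor - num_ratings)
    return average - penalty_per_100 * shortfall / 100


# Figures quoted in the comment.
nefs = penalised_rating(8.9, 745)     # 255 ratings short of the floor
london = penalised_rating(8.9, 1329)  # above the floor, so no deduction

print(f"Nefs:   {nefs:.3f}")
print(f"London: {london:.3f}")
```

With the figures from the comment, Nefs drops to about 8.645 while London stays at 8.9, so the gap between two fragrances with the same average is a quarter of a point.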
NicheOnly 11 months ago
I wasn't commenting from my personal point of view. I haven't even smelled London. But previously, Nefs was a stable top 4 entry alongside Naxos, Alex2 & London. Those 4 were basically solidified as the top 4 since their average rating was very high and they had a sufficient number of votes to support that idea. But this change of methodology dropped Nefs from 4 to 50, which shows the methodology (1) being very slow to adapt to newer releases catching up; and (2) having a mainstream bias towards household names. The new rankings achieve the exact same thing that anyone can get from the customer support at their local boutique or from a new YT/IG reviewer: they emphasize being already known (quantity/bucket size) over the scent being rated more highly (quality).
DonVanVliet 11 months ago
Our top 100 is generated based on overall ratings and is not a "curated list". It's perfectly fine if your personal favorites don't match that list - it's a community-driven ranking after all. 😉 The new weighted score aims to provide a balanced and reliable ranking by taking into account both the quality and quantity of ratings. It doesn't inherently favour mainstream or niche fragrances. I hope that helps to clarify things a bit.
NicheOnly 11 months ago
Just want to also add that the apples example doesn't really line up with the way the average person ranks fragrances. I believe previously Parfumo was winsorizing the data, meaning the haters and fangirls (at both extremes, rating scents 0s or 10s) got pulled out of the pool. Now, we're doing the polar opposite of that: we're saying that the haters & fangirls do matter, which basically hurts the objectivity of the rankings, because the size of the bucket exerts a significant amount of control over the ranking.
Salahharake 11 months ago
Nothing better than a good, accurate rating! Thank you for the update @DonVanVliet 👏🏼
Telekinec 11 months ago
I'm quite excited to see the difference this will bring to perfume ratings! I'm already checking it out. Thank you for keeping us updated and looking at ways to make the site even better! Cheers :)!