This week on Cold Take, Frost tries to figure out why we’re still using review scores.
Check out more recent episodes of Cold Take, including The Hype Train is Running Out of Steam, Armored Core 6 and the ‘Git Gut’ Mentality, The Problem of Voting With Your Wallets, and Baldur’s Gate 3 Has Caused Quite a Hubbub.
Why Are We Still Using Review Scores? – Transcript
Yahtzee, who I’m sure you’re familiar with, has never taken part in the ambiguous grading of a game in Zero Punctuation, not even when it went by the name Fullyramblomatic (10 out of 10 name, btw). 3 Minute Reviews, our short but sweet look at the world of indies, doesn’t use a numeric rating system either. It didn’t use one when it was called 2 Minute Reviews on Gameumentary, and I imagine it won’t use one even as it gets suspiciously closer to becoming 4 Minute Reviews. Arguably, for the longest time, refusing to use the same metric of measurement the rest of the industry went by was seen as a quirky choice. You’re just trying to stand out. I’m different from the other girls you know. I can’t count. Girl math or otherwise. But the major fall releases of 2023 have gamers, developers, and critics alike wondering once again if we should continue to use review scores. What purpose do they serve? If they are useless, then why do we continue to use them? What should we use instead?
Let’s start off with the painfully obvious. Review scores are a sham of a mockery trying to quantify opinions. There is no standardized process, and anyone claiming to have one is a charlatan and a swindler. Done. Video’s over.
I’d love it if there was a readily understood rubric, a scale, or a unit of measurement that quantified video games. Life would be so much easier. None of us even use the same scale as it is. Steam has its thumbs up and thumbs down. There’s 4-star systems and 5-star systems. 10 point and 100 point scales. Then there’s Metacritic pretending to use advanced algorithms to convert, combine, and average them all out while refusing to show its work. Sure, you can convert fractions and decimals in the same way we convert miles to kilomiles, but the context of each score is different between sites. To some, 5 out of 10 means it’s functional, but does nothing spectacular. Anything under that is fundamentally broken, so you rarely hear about games under a 4 out of 10. What’s the point of all the numbers if you’re not going to use them? No. Just no. I’m all for a standardized review system and I’d start giving all games scores tout de suite if there was ever an accurate system. I don’t even like calibrating my cooking thermometers. I don’t know how we’d calibrate the review score gun.
Who even needs review scores? Review scores are useful to consumers, and I don’t mean that as a slur. Games are a product and people exist who’d like to make informed purchases. And because people who care about how they spend their money tend to care about how their time is spent as well, a number serves as a quick summary or supplements the important parts of a lengthy reading. Traditionally, reviews were given on the functionality of a game because a game is a technological marvel, and if you’re playing on PC it’s not even a standardized marvel at that. Every video game has a technical aspect to it because it is a form of software coordinating with your varying choices of hardware. All manner of bugs, breaks, hiccups, and unintended interactions can exist, and technical reviews frame “how good is it?” as a question of performance. “How well does it work?” “It ran well for an hour then it blew up, 1 out of 10.” Technical game reviews still exist, usually mixed in with the opinionated bits. Off the top of my head, Skill Up makes reviews that do take into account the performance side, and he informs his audience of things like the state of Cyberpunk’s new DLC on last-gen consoles, current-gen consoles, Steam Deck, and multiple PCs with different hardware, while holding a conversation around graphical fidelity and playability. These are objective benchmarks that can be assigned a value. In this manner, technical video game write-ups are closer to what I read when I’m looking for a new blender or an air fryer than when I’m looking for a film or a book, and a review score lets me know to not waste my time with the 4 out of 10s and below because they’re broken in some manner of speaking.
Sellers also like review scores as a badge of honor. Publishers are all too eager to highlight a notable review outlet and slap it across all of their advertisements. I know this sparks the fear and conspiracy that reviewers get paid off to write sweet nothings about a game to boost their Metacritic average, but the keyword I’m using here is “notable.” A 10 out of 10 score from “crusty old man we found rummaging through our company dumpster” does not hold as much weight as a tweet of appreciation from, say, Hideo Kojima. He is trustworthy, and you don’t build up trust by shilling or being wrong more times than you are right. It just so happens that as you gain more attention you attract people with different points of view, so you’re likely to find more people who disagree with you and say you’re being completely irresponsible with your opinion. Thus, you end up with a noxious cloud surrounding the major outlets, and everyone outside of the zone is wondering how they got so big to begin with if they are apparently shills. That noxious cloud is usually composed of people who are addicted to arguing on the internet or who have mistaken their hobby for a personality. That one was a slur.
So we agree review scores are useful as a quick gauge of technical quality, but video games have an artistic nature about them and there is no such thing as objectivity in the arts. 144 frames per second may be objectively better from a performance standpoint, but, from an artistic point of view, maybe the vision of Link adventuring in Hyrule only needed 30 frames per second to be realized in Breath of the Wild. There are people buying Mortal Kombat 1 on the Switch and it looks nothing short of a Walmart bathroom. That’s true love that can look past the surface level and into the real beauty beneath. Unsurprisingly, review scores fail miserably when you want to converse about personal enjoyment with any kind of nuance. And wouldn’t you know it, that is the burning question that dominates the airwaves. “Is this game good? Will I like this game? Will it make me feel whole again?” I don’t know. Go to therapy or to a brothel if you want to feel w(hole) again. The only people who can say for sure what you will like are people who know what you like, you and your close ones. So what do review scores communicate then? They’re a parasocial shortcut. It’s not about the games anymore. It’s about shooting a flare into the air signaling “everyone who feels the way I do come gather around.” Like a school of guppies, it’s easier to fight off dissenting opinions as a collective.
But let me tell ya, just because you feel the same way does not mean you think the same way. Believe me, my 5 ex-wives would agree with me there. I know plenty of friends and peers who love and hate the same games I do, but for different specific reasons. It’s impossible to get game recommendations from them. Inversely, there are plenty of other creators who do not feel the way I do, but think the way I do, so I trust them more than my own immediate circle. For example, Iron Pineapple, one of the most prominent content creators for the soulslike genre, feels Lies of P is the best non-FromSoftware or non-Nioh soulslike. I feel Lies of P is the least worst non-FromSoftware or non-Nioh soulslike. We think the same way, we both