The Numbers Game

Eurogamer's Dan Whitehead on why Amazon's use of Metacritic scores is a terrible idea.

This article first appeared on USgamer, a partner publication of VG247. Some content, such as this article, has been migrated to VG247 for posterity after USgamer's closure - but it has not been edited or further vetted by the VG247 team.

Dan Whitehead, Eurogamer

Amazon has been all over the headlines this week, courtesy of its intriguing $99 Fire TV capsule console.

From hiring talent like Portal's Kim Swift and Far Cry 2's Clint Hocking to buying Killer Instinct studio Double Helix and securing versions of The Walking Dead and Minecraft, as a tech company Amazon has been making all the right noises. It clearly takes gaming seriously.

Yet at the same time, as a retailer, it's just taken a giant leap backwards by quietly starting to roll out the use of Metacritic scores on its games listings. Those ubiquitous traffic-light numbers will now appear at unmissable size, right next to the "Add to Cart" button. If it was bad news for a specialist retail service like Steam, it's even worse when a mainstream retailer the size of Amazon follows suit.

The slow and inexorable acceptance of Metacritic as some sort of ultimate truth is as predictable as it is depressing. We humans are driven to impose uniformity, to find a box for everything, to have things in the right order. This means we're also prone to something called apophenia, the tendency to see patterns and meaning in unconnected data.

Spoiler alert: that's all scores are. Unconnected data. There's no great objective truth behind the number at the end of a review, just a gut feeling expressed in an easy-to-digest format, a convenient summation of the paragraphs that precede it. And those paragraphs are a subjective opinion, hopefully informed by experience and insight, but ultimately no more mathematically quantifiable than whether you prefer chocolate or strawberry ice cream.

Deadly Premonition is one of the most brilliant and divisive games of the last 10 years, but how many potential fans will never pick it up if they see the 68 Metascore at the checkout?

At best, a score should accurately reflect the text and it should adhere to a scale that is consistent across the site or magazine. That doesn't always happen, because it's a fallible human system, not a machine. A score is a clumsy but handy way of illustrating how a particular game succeeds or fails on its own terms, as perceived by the writer, and how it stacks up against its immediate peers. The idea that there should be some wider coherence, that every game ever made can be ranked in order of quality simply by crunching numbers, is nonsensical.

It's here that gaming proves especially susceptible to the false comfort offered by a Metascore. Metacritic doesn't just cover gaming, but film, music, TV - these are media that feel mercurial, alchemical. However they're delivered to us, we can see and hear the human hand in their construction. They spring from a creative place that we accept is impossible to truly pin down.

Metacritic ratings also appear on the Internet Movie Database, but nobody in the film industry really seems to care. Studios are not closed, actors do not have their pay docked, if a certain arbitrary Metascore is not reached. Film has been around for too long, its cultural roots are too deep, for Metacritic to have any real impact. There are still arguments and differences of opinion, of course, but they tend not to be about whether a four-star movie review read like three stars, or whether gruelling documentary The Act of Killing is as good as Disney's Frozen because they both have the same score. Scores for movies are rightly treated as a sideshow, a not entirely accurate thermometer, rather than the main topic of conversation.

Relatively young as a medium and with the neatly ordered rankings of high-score tables and player statistics woven into their electronic DNA, games are much more vulnerable to the pernicious, comforting lie of the Metascore.

It's no surprise that gamers flock to Metacritic for reassurance when handing over their money, seeking the solid confidence of a single "true" score, and in this age of metrics, analytics and data-driven decision making you can hardly blame publishers for using Metacritic's pseudoscience as a guide rail for performance. Like all successful ideas, it's incredibly convenient. It's also incredibly flawed.

Developers at Obsidian famously lost out on pay bonuses because Fallout: New Vegas missed an 85 Metascore by just one point.

As a writer, there's personal ego at play whenever words are sidelined in favor of numbers, of course. It's always depressing to spend days writing a thousand carefully considered words, only for discussion to congeal around whether the figure at the end deviated by one point from the perceived 'correct' score. This isn't, however, an argument against scores in themselves. They serve a purpose, when taken individually and in context, but we've all -- from writers to gamers, PR to retailers -- allowed them to take on an autonomous significance far in excess of what they actually represent.

Crucially, we need to rid ourselves of the notion that the primary purpose of a games review is to act as a buyer's guide. A review can tell you whether you will like a game, but it does so primarily through description rather than simply imparting the writer's preferences. It's then your job to take all those words, along with your personal context, to decide if you think the game is interesting. You don't have to agree with a review in order for it to be useful or enlightening.

So while Metacritic's domination of the critical conversation is an annoyance, its determination to sit in Amazon's shop window and be a part of the commercial process of buying a game is more troubling. The Metascore is a statistic built from the raw fibre of multiple incompatible subjective opinions, softened by the absence of context and eventually broken down by the invisible enzymes of Metacritic's top secret algorithms and statistical weighting. Yet the gaseous digits that squeak out at the end of this patently ridiculous process are too often treated as gospel by shoppers and publishers alike.

Good criticism should always be an invitation to further discussion, the start of the conversation rather than the end. In going from curious statistical experiment to globally accepted league table, Metacritic turns the open invitation of individual reviews into a full stop, drawing a line under critical thought. It even colour-codes the results, reducing its basic 100-point scale to an idiot-proof traffic signal: green means buy.

If just one gamer turns away from a potentially fascinating game on Amazon, purely because the Metascore beacon flashed a cautionary yellow warning, that's one gamer too many.

Pete Davison, USgamer

As someone whose personal tastes in games have deviated significantly from the "mainstream" over the course of the last few years, Dan's words ring particularly true and mean a lot with regard to my own perception of the games industry. Specifically, my immediate reaction to Amazon's introduction of Metascores on games was that those games already perceived as mediocre or bad -- whether fairly or not -- will have their situation compounded by having that score front-and-center on one of the most popular online retail outlets.

There is the argument, of course, that a lot of these 50-60 Metascore titles, many of which are niche interest offerings from specialist publishers, have their own fanbases already -- fanbases who don't really care what the review scores say and who will probably pick up the titles they're interested in regardless of how well-received they are. But the addition of Metascores to Amazon pages potentially precludes new fans -- those who aren't already invested in a particular franchise, or who aren't familiar with the work of specialist publishers -- from taking a risk on these games. And, in my experience, anyway, some of my favorite games of the past few years have been saddled with mediocre Metascores, with perhaps the most egregious example being the incredible Nier. How many people passed that game up because it was a "yellow" rather than a "green?"

Nier: one of the most emotionally harrowing, beautifully written games of the last generation -- 67 Metascore, yellow.

Then, of course, there's the matter of how that Metascore is calculated in the first place. Here at USgamer, as you know, we use a five-star rating system -- and we use the entire spectrum of ratings. A five-star rating system, taken by itself, is somewhat less in-your-face than a percentage- or out-of-ten-based scoring system because there are fewer preconceptions attached to what the numbers "mean." In an out-of-ten system, for example, most people believe that an 8 or higher is something you should definitely consider buying, while a 6, despite being above average, is perceived by many as being not worth bothering with. With a five-star system, however, three stars tends to mean "good" -- and yet Metacritic converts it back to 60%, with all the attached stigma that brings. It's arguably not wrong to do so, given that three-out-of-five is indeed 60% of a "full" score, but it does ignore the varying perceptions that different approaches to scoring carry.

Ultimately it remains to be seen what impact, if any, the addition of Metascores to Amazon's product pages has on sales -- but speaking personally, my own gut feeling is that the sooner we stop depending on Metascore -- or worse, as Dan says, the idiot-proof traffic signal colors -- as some sort of infallible buying advice, the better.
