We’ve helped a number of clients to develop and deliver rankings – of healthy places, of “cosy” areas of the country, of corporate wellbeing, and of universities. Building a ranking sounds like an easy job, since the output is just a list. But building a ranking well and meaningfully is much harder than you’d think, because every individual on the list is keen to be in exactly the right place, or even higher up, and is going to pick apart every choice you made in the process.

which data?

While it’s hard to be sure that you have the correct final ranking, there are many rankings that are definitely incorrect – “What is the best X in the world?” has no right answer, and many wrong answers. One key element is choosing your data sources – what are the factors which make up “a good X” for your purposes? This is always going to be a partly qualitative judgement, but is grounded in the values and priorities of the ranking organisation.

For example, for the Times Higher Education World University Rankings, the inputs included 13 measures from several discrete sources, including research output metrics, scholarly reputation scores, and staff:student ratios. That ranking has several competitors, who emphasise different things – for example, the QS University Ranking thinks that employer reputation should be included in rankings, but puts much less weight and detail on teaching measures than THE.

smoothing things out

Once the data sources are agreed, the second critical step is transforming and weighting them.

Transformation is a critical step: for example, in the case of university reputations, Harvard might get a score of 1000, while the 10th best university got a score of 50. This means that if we include untransformed reputation scores in the ranking, the reputation element overwhelms everything else.

This issue means that you need to use an appropriate probability distribution to transform each source from its raw form into an evenly-distributed 1-100 score before it is allowed to mix with its peers. And with that mixing, there is another value judgement about how much relative weight each source gets in the overall ranking.

data quality and missing-ness

Finally, you need to decide what to do with incomplete data. Do you just remove someone from the rankings if they are missing a data point (is that fair?) Do you try and impute what their data “should” be? (isthat fair?) Or do you impute the data but penalise them for not supplying information (is that fair either?) Even more value judgements – again, founded in data science, but still ultimately a judgement call.

We can’t tell our clients what to do in these cases – but we can get the technology to help us, by running multiple versions of the rankings under different scenarios to check in each case whether the outputs seem sensible and what the implications of each might be.


We are always interested in knotty data science problems like this. If you want us to help you find the best X for your industry, contact us, and read here for a longer case study of the world university rankings.

Share this article