Better Predictions for the 2022 World Cup (pt. 3)

More statistics == more points

5 min read · Nov 18, 2022

In the first blog of this series, we predicted 1–0 for every game. We looked at the score distributions for international football matches and decided we should simply predict the most common score, in favor of the country with the higher FIFA rating.

In the second blog, we simulated the World Cup to figure out the chances of winning the cup for each country.

In this one, we will revisit the predictions for the group stage using some more advanced methods: density functions.

FIFA ranks and ratings

FIFA publishes more than just ranks: it also gives each country a rating, and that rating is what we will use to generate the density functions.

Snapshot of the top 8 by FIFA rankings, showing the ratings in the column PTS. [image by author, taken from FIFA]

We assume that when two countries play a match, there is a relation between their ratings and the outcome of the match. A country with a higher rating is more likely to win, right?

Let’s have a look

We start by finding each instance of a specific score; let’s take 1–0 again. For each match that ended 1–0, we record the rating of the home team and the rating of the away team. We then calculate the difference in rating (home minus away) and put a dot on the chart for this difference.

Example:
19 June 2010, The Netherlands vs. Japan, 1–0

rating Netherlands = 1231
rating Japan = 682
rating difference = 1231 - 682 = 549

We put a dot for x=549 and y=“1–0”.
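The bookkeeping for one score can be sketched in a few lines of Python. This is only an illustration: the `Match` record and the `rating_differences` helper are names made up here, and the second row is filler data, not a real match.

```python
from typing import NamedTuple


class Match(NamedTuple):
    # Hypothetical record layout; the real data set may look different.
    home: str
    away: str
    home_goals: int
    away_goals: int
    home_rating: float
    away_rating: float


def rating_differences(matches, home_goals, away_goals):
    """Return home-minus-away rating differences for matches with this exact score."""
    return [
        m.home_rating - m.away_rating
        for m in matches
        if (m.home_goals, m.away_goals) == (home_goals, away_goals)
    ]


matches = [
    Match("Netherlands", "Japan", 1, 0, 1231, 682),  # the example from the text
    Match("Team A", "Team B", 0, 0, 900, 950),       # made-up filler row
]

# Each value becomes one dot on the x-axis for the row y="1-0".
one_nil = rating_differences(matches, 1, 0)
print(one_nil)  # [549]
```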

When we do this for all international matches we get the chart below.

Occurrences of 1–0 match result for each rating difference. [image by author]

This does not show a lot yet. We can vaguely see that there are more dots on the right side than on the left, but it’s hard to read.

Calculating density

To make it clearer, we could go back to our trusty bar chart, but there is another way: density functions.

For this, we calculate a moving average over a window of rating differences, counting each match as 1 if it ended 1–0 and as 0 otherwise. The result is the share of 1–0 results around each rating difference. A density-based version is shown below.
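A minimal sketch of such a windowed mean, assuming a fixed window of plus or minus 150 rating points. The window width, the `windowed_share` name, and the toy data are all assumptions for illustration; the charts in this post may have been smoothed differently.

```python
def windowed_share(diffs, hits, center, half_width):
    """Share of matches with the target score among the matches whose rating
    difference lies within half_width of center. Returns None for an empty window."""
    in_window = [h for d, h in zip(diffs, hits) if abs(d - center) <= half_width]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


# Made-up toy data: rating difference, and 1 if the match ended 1-0, else 0.
diffs = [-300, -120, -40, 10, 60, 150, 220, 400]
hits = [0, 0, 0, 1, 0, 1, 1, 0]

# Evaluate the mean on a grid of window centers to get one curve.
centers = range(-400, 401, 200)
curve = [(c, windowed_share(diffs, hits, c, half_width=150)) for c in centers]
for center, share in curve:
    print(center, share)
```

A wider window gives a smoother but less detailed line; a narrower one follows the data more closely but gets noisy where matches are scarce.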

Occurrences of 1–0 match result for each rating difference as a density function. [image by author]

Much easier to interpret already! Let’s look at a few more.

Occurrences of 0–0, 1–0, 0–1, and 1–1 match results for each rating difference as a density function. [image by author]

From these four options, it is now easy to see which to pick for each rating difference: you would pick the one with the highest chance given the difference.

For example:
Monday, November 21, Senegal vs Netherlands

rating Senegal = 1584.38
rating Netherlands = 1694.51
rating difference = 1584.38 - 1694.51 = -110.13

When we look at the graph at x=-110.13, we find that 0–1 is the most likely outcome. Nothing has changed yet compared to our earlier predictions, but it feels much better.
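Reading the highest line off the chart is an argmax over the curves at the nearest x value. The sketch below assumes each curve is stored as a list of values on a shared grid; the curve values are invented for illustration and are not read from the real chart.

```python
def most_likely_score(curves, grid, x):
    """Pick the score whose curve is highest at the grid point nearest to x."""
    i = min(range(len(grid)), key=lambda k: abs(grid[k] - x))
    return max(curves, key=lambda score: curves[score][i])


grid = [-200, -100, 0, 100, 200]
curves = {  # made-up values, one list per score, aligned with grid
    "0-0": [0.07, 0.08, 0.09, 0.08, 0.07],
    "1-0": [0.06, 0.08, 0.11, 0.13, 0.14],
    "0-1": [0.13, 0.12, 0.10, 0.07, 0.05],
    "1-1": [0.09, 0.11, 0.12, 0.11, 0.09],
}

senegal, netherlands = 1584.38, 1694.51  # ratings from the example above
diff = senegal - netherlands
print(round(diff, 2), most_likely_score(curves, grid, diff))  # -110.13 0-1
```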

When should we change our prediction?

Let’s look at one more, also adding lines for 2–0 and 0–2.

Occurrences of six different match results for each rating difference as a density function. [image by author]

For readability, the x-axis only shows from -1000 to +1000 now. We can see that the lines for 0–1 and 1–0 stay at the top with a few exceptions:

0–2 is a better prediction when more than 600 rating points down
2–0 is a better prediction when more than 600 rating points up
1–1 is a better prediction when between 30 points down and 30 points up

According to this chart, no other prediction ever makes sense: you would only hurt your chances of a correct prediction.

Can we trust this chart?

Maybe, but we should have another look at the data first.

The total number of international games recorded in our data set by year. [image by author]

There are only very few old matches, and the number of recorded games per year increases gradually over time. Should we be using games from before 1920 at all?

FIFA ratings over time. [image by author]

Here’s another problem: the minimum, mean and maximum ratings change heavily over time.

As a result, a rating difference of 200 means something very different in 2022 than it did in 2005, and in 1995 a difference that large was not even possible. Also, there is no FIFA rating data from before 1992.

Fix it and try again

Looking at the factors above, it would be wise to clean our data a bit more.

Just drop it!

We start by dropping all data from before 1992, so we’re only using ratings when we are sure what the rating was at the time. This leaves us with 24606 out of 40374 matches, meaning we have ~61% of our data left.
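The filter itself is a one-liner. The sketch below assumes each match carries a `year` field (a made-up layout), uses toy rows, and then checks the percentage with the counts quoted above.

```python
def keep_from_year(matches, first_year):
    """Keep only matches played in first_year or later."""
    return [m for m in matches if m["year"] >= first_year]


# Made-up toy rows; only the year matters for this step.
matches = [{"year": 1910}, {"year": 1991}, {"year": 1992}, {"year": 2010}, {"year": 2022}]
kept = keep_from_year(matches, 1992)
print(len(kept), "of", len(matches))  # 3 of 5

# Share of data left, using the counts reported in the text.
share = 24606 / 40374
print(f"{share:.1%}")  # 60.9%
```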

Same mean, same variance

One way to correct changing data over time is to make sure each batch has the same mean and variance. We treat each year as a separate batch and correct the shifting in FIFA ratings to get the following graph.

Normalized FIFA ratings over time. [image by author]

We normalized by mean and variance, and we can clearly see the mean remaining stable through the years. There is some change in the extreme values over time, but that is acceptable.
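A minimal sketch of per-year normalization, assuming the ratings come as (year, team, rating) rows. The function name and the `target_mean` and `target_std` parameters are assumptions: the values on the x-axis of the cleaned chart suggest the ratings were rescaled to some common mean and spread, but the post does not say which, so the defaults here give plain z-scores.

```python
from collections import defaultdict
from statistics import mean, pstdev


def normalize_per_year(rows, target_mean=0.0, target_std=1.0):
    """Rescale ratings so every year has the same mean and standard deviation.

    rows: list of (year, team, rating). Returns a list of (year, team, new_rating).
    """
    by_year = defaultdict(list)
    for year, _team, rating in rows:
        by_year[year].append(rating)
    stats = {y: (mean(r), pstdev(r)) for y, r in by_year.items()}

    out = []
    for year, team, rating in rows:
        mu, sigma = stats[year]
        # Guard against a year where every team has the same rating.
        z = (rating - mu) / sigma if sigma > 0 else 0.0
        out.append((year, team, target_mean + target_std * z))
    return out


# Made-up ratings on two very different scales.
rows = [
    (2005, "A", 600), (2005, "B", 700), (2005, "C", 800),
    (2022, "A", 1400), (2022, "B", 1500), (2022, "C", 1600),
]
normalized = normalize_per_year(rows)
for row in normalized:
    print(row)
```

After this step, team A sits equally far below its year's mean in 2005 and in 2022, even though the raw ratings differ by 800 points.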

The master graph

Here it is, the same graph but now with all clean data. Looks amazing!

Occurrences of six different match results for each rating difference as a density function after data cleaning. [image by author]

The shapes are similar, but it looks much cleaner and the values on the x-axis make a lot more sense.

You should predict the highest line at any point. Looking at the size of the rating difference, with the win going to the higher-rated country, this means:

2–0 or 0–2 when the rating difference is more than 380
1–0 or 0–1 when the rating difference is between 100 and 380
1–1 when the rating difference is less than 100
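These rules fit in one small function. The sketch below assumes the difference is the first country's normalized rating minus the second's, and that a positive difference favors the first country; the function and parameter names are made up here, and the thresholds are the ones listed above.

```python
def predict_score(rating_diff, small=100, large=380):
    """Predict a score from a normalized rating difference (first minus second)."""
    gap = abs(rating_diff)
    if gap < small:
        return "1-1"
    # One goal for a moderate gap, two goals for a large one.
    goals = 2 if gap > large else 1
    return f"{goals}-0" if rating_diff > 0 else f"0-{goals}"


for diff in (50, -200, 500, -381):
    print(diff, predict_score(diff))
```

Note that the thresholds apply to the normalized ratings, so raw FIFA ratings should not be plugged in directly.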

The final result

Below are my updated predictions for the group stage.

Wrap up

In this blog, we calculated the mean occurrence of each outcome over a range of FIFA rating differences. We have been calling the result a density function; more precisely, it is a smoothed estimate of the probability of each score given the rating difference.

We then used these probabilities to figure out the most likely outcome for each match, taking the rating difference for that match into account.

Want to get my best predictions?

Read the next blog with the best predictions here.
