Better Predictions for the 2022 World Cup (pt. 3)
More statistics == more points
In the first blog of this series, we predicted 1–0 for every game. We looked at the score distributions for international football matches and decided we should simply predict the most common score, in favor of the country with the higher FIFA rating.
In the second blog, we simulated the World Cup to figure out the chances of winning the cup for each country.
In this one, we will revisit the predictions for the group stage using some more advanced methods: density functions.
FIFA ranks and ratings
FIFA publishes more than just ranks: it also publishes a rating for each country, and that rating is what we will use to generate the density functions.
We assume that when two countries play a match, there is a relation between their ratings and the outcome of the match. A country with a higher rating is more likely to win, right?
Let’s have a look
We start by finding each instance of a specific score; let’s begin with 1–0 again. For each match that ended 1–0, we record the rating of the home team and the rating of the away team. We then calculate the difference in rating (home minus away) and put a dot on the chart at that difference.
Example:
19 June 2010, The Netherlands vs. Japan, 1–0
rating Netherlands = 1231
rating Japan = 682
rating difference = 1231 - 682 = 549
We put a dot for x=549 and y=“1–0”.
When we do this for all international matches we get the chart below.
This does not show a lot yet. We can vaguely see there are more dots on the right side than on the left, but it’s hard to read.
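The bookkeeping behind those dots can be sketched in a few lines. This is only an illustration: the `Match` record and its field names are hypothetical, and the single row is the Netherlands vs. Japan example from above.

```python
from dataclasses import dataclass


@dataclass
class Match:
    # Hypothetical record layout; the real dataset may look different.
    home: str
    away: str
    home_rating: float
    away_rating: float
    home_goals: int
    away_goals: int


def rating_difference(match: Match) -> float:
    """Home rating minus away rating, as in the example above."""
    return match.home_rating - match.away_rating


def score_label(match: Match) -> str:
    """Score as 'home-away', written with a plain hyphen."""
    return f"{match.home_goals}-{match.away_goals}"


def points_for_score(matches, score):
    """x-values (rating differences) of every match that ended with `score`."""
    return [rating_difference(m) for m in matches if score_label(m) == score]


matches = [Match("Netherlands", "Japan", 1231, 682, 1, 0)]
print(points_for_score(matches, "1-0"))  # [549]
```

Each value returned is one dot on the row for that score.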
Calculating density
To make it more clear, we could go back to our trusty bar chart, but there is another way: density functions.
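For this, we calculate a moving average over the rating difference: every match counts as 1 if it ended 1–0 and as 0 if it did not, and we take the mean of those values within a sliding range of rating differences. The result is the share of matches that ended 1–0 around each rating difference. A density-based version is shown below.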
Much easier to interpret already! Let’s look at a few more.
From these four options, it is now easy to see which to pick for each rating difference: you would pick the one with the highest chance given the difference.
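For the curious, here is a minimal sketch of the "mean over a range" idea behind these curves. The window size (`half_width`) is an assumption, since the width of the range is not stated here, and the toy points are invented purely to exercise the function; they are not real match data.

```python
def local_share(points, score, center, half_width):
    """Share of matches that ended with `score` among all matches whose
    rating difference lies within `half_width` of `center`.

    `points` is a list of (rating_difference, score) pairs.
    Returns None when no match falls inside the window.
    """
    in_window = [s for x, s in points if abs(x - center) <= half_width]
    if not in_window:
        return None
    # Count the score as 1 and everything else as 0, then take the mean.
    return sum(1 for s in in_window if s == score) / len(in_window)


def density_curve(points, score, grid, half_width):
    """Evaluate local_share at every x-value in `grid`."""
    return [local_share(points, score, c, half_width) for c in grid]


# Toy points, made up for illustration only.
toy = [(-300, "0-1"), (-250, "0-1"), (-200, "1-1"),
       (50, "1-0"), (100, "1-1"), (150, "1-0"), (400, "1-0")]

print(round(local_share(toy, "1-0", center=100, half_width=100), 2))  # 0.67
```

A wider window gives a smoother but less detailed line; a narrower one follows the data more closely but gets noisy where matches are sparse.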
For example:
Monday, November 21, Senegal vs Netherlands
rating Senegal = 1584.38
rating Netherlands = 1694.51
rating difference = 1584.38 - 1694.51 = -110.13
When we look at the graph at x=-110.13 we find that 0–1 is the most likely outcome. Nothing changed yet compared to our earlier predictions, but it feels much better.
When should we change our prediction?
Let’s look at one more, also adding lines for 2–0 and 0–2.
For readability, the x-axis only shows from -1000 to +1000 now. We can see that the lines for 0–1 and 1–0 stay at the top with a few exceptions:
- 0–2 is a better prediction when more than 600 rating points down
- 2–0 is a better prediction when more than 600 rating points up
- 1–1 is a better prediction when between 30 points down and 30 points up
According to this chart, no other prediction ever makes sense; you would only hurt your chances of a correct prediction.
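Reading the crossover points off a chart by eye is error-prone, so here is a hedged sketch of how the ranges could be extracted once the curves are sampled on a grid. The `grid` and `curves` values are toy numbers chosen for illustration, not values read from the chart.

```python
def best_score_ranges(grid, curves):
    """Collapse the per-point winner into (start, end, score) ranges.

    `grid` is a sorted list of rating differences; `curves` maps each
    score to a list of values, one per grid point.
    """
    ranges = []
    for i, x in enumerate(grid):
        # The score whose curve is highest at this grid point.
        best = max(curves, key=lambda s: curves[s][i])
        if ranges and ranges[-1][2] == best:
            # Same winner as the previous point: extend the current range.
            ranges[-1] = (ranges[-1][0], x, best)
        else:
            ranges.append((x, x, best))
    return ranges


# Toy curves, invented for illustration only.
grid = [-200, -100, 0, 100, 200]
curves = {
    "0-1": [0.20, 0.16, 0.11, 0.08, 0.05],
    "1-1": [0.10, 0.12, 0.13, 0.12, 0.10],
    "1-0": [0.05, 0.08, 0.11, 0.16, 0.20],
}

for start, end, score in best_score_ranges(grid, curves):
    print(f"{score} is the top line from {start} to {end}")
```

A finer grid gives more precise boundaries between the ranges.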
Can we trust this chart?
Maybe, but we should have another look at the data first.
The data contains only a very small number of old matches, with the number of matches per year gradually increasing over time. Should we be using games from before 1920 at all?
Here’s another problem: the minimum, mean and maximum ratings change heavily over time.
As a result, a rating difference of 200 means something very different in 2022 than it did in 2005, and it was not even possible in 1995. Also, there is no FIFA rating data from before 1992.
Fix it and try again
Looking at the factors above, it would be wise to clean our data a bit more.
Just drop it!
We start by dropping all data from before 1992, so we only use matches for which we know what the rating was at the time. This leaves us with 24606 out of 40374 matches, meaning we have ~61% of our data left.
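The filter itself is a one-liner. In the sketch below, matches are assumed to be dictionaries with a `year` key, which is a hypothetical layout; the two counts are the ones quoted above.

```python
FIRST_RATED_YEAR = 1992  # no FIFA rating data before this year


def keep_rated_era(matches, first_year=FIRST_RATED_YEAR):
    """Drop every match played before `first_year`."""
    return [m for m in matches if m["year"] >= first_year]


# Sanity check on the share of data that survives the cut.
total, kept = 40374, 24606
print(f"{kept / total:.1%}")  # 60.9%
```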
Same mean, same variance
One way to correct for data that changes over time is to make sure each batch has the same mean and variance. We treat each year as a separate batch and correct for the shift in FIFA ratings to get the following graph.
We normalized by mean and variance, and we can clearly see the mean remaining stable through the years. There is some change in the extreme values over time, but that is acceptable.
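A minimal sketch of this per-year correction, under a few assumptions: ratings arrive as `(year, team, rating)` tuples (a hypothetical layout), the population standard deviation is used, and `target_mean` and `target_std` are placeholders, because the common scale used for the graph is not stated here. The toy ratings are invented to show the effect.

```python
from collections import defaultdict
from statistics import mean, pstdev


def standardize_by_year(ratings, target_mean=0.0, target_std=1.0):
    """Rescale ratings so every year has the same mean and spread.

    `ratings` is a list of (year, team, rating) tuples.
    Returns a dict mapping (year, team) to the rescaled rating.
    """
    by_year = defaultdict(list)
    for year, _team, rating in ratings:
        by_year[year].append(rating)

    # Mean and population standard deviation of each yearly batch.
    stats = {y: (mean(v), pstdev(v)) for y, v in by_year.items()}

    scaled = {}
    for year, team, rating in ratings:
        mu, sigma = stats[year]
        z = (rating - mu) / sigma if sigma > 0 else 0.0
        scaled[(year, team)] = z * target_std + target_mean
    return scaled


# Toy ratings on two very different scales, invented for illustration.
toy = [(1995, "A", 40), (1995, "B", 60), (2022, "A", 1400), (2022, "B", 1600)]
scaled = standardize_by_year(toy)
print(scaled[(1995, "B")], scaled[(2022, "B")])  # 1.0 1.0
```

After rescaling, team B sits the same distance above the yearly mean in both years, even though the raw ratings differ by more than a factor of twenty.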
The master graph
Here it is, the same graph but now with all clean data. Looks amazing!
The shapes are similar, but it looks much cleaner and the values on the x-axis make a lot more sense.
You should predict the highest line at any point, always in favor of the country with the higher rating, which means:
- 2–0 or 0–2 when the rating difference is more than 380
- 1–0 or 0–1 when the rating difference is between 100 and 380
- 1–1 when the rating difference is less than 100
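Written out as a function, the rule could look like the sketch below. Two things are assumptions here: both ratings must already be on the cleaned, normalized scale of the master graph, and a difference of exactly 100 or exactly 380 is treated as a one-goal win, since the boundaries are not spelled out.

```python
def predict_score(home_rating, away_rating):
    """Predicted score as 'home-away' (plain hyphen), based on the
    difference between two ratings on the normalized scale."""
    diff = home_rating - away_rating
    gap = abs(diff)
    if gap < 100:
        return "1-1"
    # More than 380 points apart: predict a two-goal win.
    margin = 2 if gap > 380 else 1
    return f"{margin}-0" if diff > 0 else f"0-{margin}"


print(predict_score(200, 0))  # 1-0
print(predict_score(0, 500))  # 0-2
```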
The final result
Below are my updated predictions for the group stage.
Wrap up
In this blog, we calculated the mean occurrence of each outcome over a range of FIFA rating differences. We called these density functions; strictly speaking, each curve estimates the probability of a score given the rating difference rather than a probability density in the textbook sense.
We then used these probabilities to figure out the most likely outcome for each match, taking the rating difference for that match into account.