Who Will Win the World Cup?


  • Updating predictions as the tournament progresses
    • 58% of matches predicted correctly at the end of the group stage
    • 6 out of 8 results correct in the last 16
  • Fixing predictions at the start of the tournament
    • 3 out of 8 quarterfinalists correct
    • 11 out of 16 teams in the last 16 correct
    • 47% of matches predicted correctly in the group stage

Random guessing: 33%. However, only 22% of games end in a draw, so you could go with never predicting a draw, in which case the baseline is 39%.
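As a rough sketch of where those two baseline figures come from (the function name is made up for illustration, and the 22% draw rate is simply the figure quoted above):

```python
def baseline_accuracy(draw_rate):
    """Expected accuracy of two naive guessing strategies."""
    # Picking win / draw / loss uniformly at random is right one time
    # in three, whatever the true outcome frequencies are.
    uniform = 1 / 3
    # Never predicting a draw: a coin flip between the two teams is
    # right half the time in the games that do not end in a draw.
    no_draw = (1 - draw_rate) / 2
    return {"uniform": uniform, "no_draw": no_draw}


print(baseline_accuracy(0.22))
```

With a draw rate of 0.22 the second strategy works out at (1 - 0.22) / 2 = 0.39, which is the 39% above.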

I will update my predictions as the tournament progresses taking into account new data as and when I get access to it. The first predictions made at the start of the tournament can be seen at the bottom and will remain unchanged.

The updated predictions along with the results are in the table below. Note all predictions will be locked in the day before matches.

James : “Hey Pico, I have built this model that will predict who will win the world cup”

Pico : “Portugal”

James : “Close, it’s Br…”

Pico : “I said Portugal”

4504521164 thinks it will be a Germany v Spain final with Germany winning. This guy reckons it’s going to be Brazil. It seems the theme for this World Cup is using Machine Learning or Statistics to predict the result.

There is no shortage of data to use: results of all international football matches, FIFA rankings, the schedule of 7704777203 and betting (832) 858-0894, along with a number of summary metrics, are all available to download via Kaggle.

Rather than repeat the predictions made by others, I thought I would break things down into two questions, applying not just the science but a bit of common sense too.

  1. Does form predict results?
  2. Do attacking teams or defending teams win matches?

Measuring form is simple enough as FIFA maintains a ranking of teams. The ranking is based on a running scoring system that gives 3 points for a win, 1 for a draw and 0 for a loss. Your points are then weighted by the strength of the opposition and importance of the match.

Picking the higher-ranked team to win gives you the correct answer only 67% of the time. Excluding friendly matches improves this a little, to 70%. Let’s give FIFA the benefit of the doubt and say the remaining 30% error is down to teams that are incorrectly ranked because they are still rising through the rankings.
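A minimal sketch of how such a "pick the better-ranked team" baseline could be scored. The dictionary keys and the toy rows are invented for illustration, and counting a draw as a miss is an assumption rather than something stated here:

```python
def higher_ranked_accuracy(matches, include_friendlies=True):
    """Share of matches won by the better-ranked team.

    Each match is a dict with hypothetical keys:
      rank_a, rank_b : FIFA rank of each side (1 = best)
      winner         : "a", "b" or "draw"
      friendly       : True for friendly matches
    """
    if not include_friendlies:
        matches = [m for m in matches if not m["friendly"]]
    hits = 0
    for m in matches:
        predicted = "a" if m["rank_a"] < m["rank_b"] else "b"
        hits += predicted == m["winner"]  # assumption: a draw is a miss
    return hits / len(matches)


# Invented rows for illustration only, not real results.
toy_matches = [
    {"rank_a": 3, "rank_b": 40, "winner": "a", "friendly": False},
    {"rank_a": 25, "rank_b": 8, "winner": "b", "friendly": False},
    {"rank_a": 10, "rank_b": 50, "winner": "draw", "friendly": True},
    {"rank_a": 60, "rank_b": 12, "winner": "a", "friendly": False},
]
print(higher_ranked_accuracy(toy_matches))
print(higher_ranked_accuracy(toy_matches, include_friendlies=False))
```

Run over the real match history, the two calls would correspond to the "all matches" and "competitive matches only" figures.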

Taking a rolling average of a team’s ranking over the last year, and then saying the higher-ranked team wins unless the lower-ranked team is improving, gets you the correct answer 87% of the time in competitive matches. This gives us a pretty clear answer: form, in particular recent form (the last 12 months), is very important in predicting a head-to-head result.
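One way the rule could be written down, as a sketch only. The function names are hypothetical, and "improving" is assumed here to mean that a team's latest rank is better (numerically lower) than its own average over the window; the exact definition is not spelled out above:

```python
from statistics import mean


def is_improving(rank_history, window=12):
    """True if the latest rank beats the average of the last `window` ranks."""
    recent = rank_history[-window:]
    return recent[-1] < mean(recent)  # lower rank number = better team


def predict_winner(team_a, history_a, team_b, history_b, window=12):
    """Pick the better-ranked team unless the other side is improving."""
    if history_a[-1] <= history_b[-1]:
        favourite, underdog, underdog_history = team_a, team_b, history_b
    else:
        favourite, underdog, underdog_history = team_b, team_a, history_a
    return underdog if is_improving(underdog_history, window) else favourite


# Invented monthly rank histories, oldest first.
steady = [5] * 12
rising = [30, 28, 26, 24, 22, 20, 18, 16, 14, 12, 11, 10]
print(predict_winner("Steady", steady, "Rising", rising))
```

Here the lower-ranked side has climbed from 30th to 10th over the year, so the rule picks it despite the worse current rank.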

Our next question: is it better to be an attacking or a defending team? We can answer this by calculating a moving average of goals per game and then weighting each goal by the quality of the team scored against. Repeating the process for goals conceded gives a measure of defensive ability. I called these features Attacking Momentum and Defensive Momentum.
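A sketch of what such a feature might look like. The weighting function, the window of 10 games and the key names are all assumptions made for illustration; the only parts taken from the description are "moving average of goals per game" and "weighted by the quality of the opponent":

```python
def opponent_weight(opp_rank, n_teams=211):
    """Hypothetical weight: 1.0 against the top-ranked side, near 0 against the lowest."""
    return 1 - (opp_rank - 1) / n_teams


def momentum(games, goals_key, window=10):
    """Moving average of opponent-weighted goals over the last `window` games."""
    recent = games[-window:]
    weighted = [g[goals_key] * opponent_weight(g["opp_rank"]) for g in recent]
    return sum(weighted) / len(weighted)


# Invented games, oldest first.
games = [
    {"scored": 2, "conceded": 0, "opp_rank": 1},
    {"scored": 1, "conceded": 1, "opp_rank": 1},
]
attacking_momentum = momentum(games, "scored")
defensive_momentum = momentum(games, "conceded")
print(attacking_momentum, defensive_momentum)
```

Because the weights are fractional, the resulting values are fractions of goals rather than whole numbers.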

Comparing the average Attacking Momentum of winning teams with that of losing teams, we can see there is a gap of 0.79. The gap for Defensive Momentum is 0.012 goals per game. So an average winning performance has a score of 0.8 – 0.012 (weighting by the strength of the opposition means we can have fractions of goals). Let’s round and say 1 – 0, which is hardly an exciting game, leading me to believe defensive teams tend to win more than attacking teams. Not a perfect methodology to answer this, I know.

Putting this all together, we are looking for a strong defensive team that has been moving up FIFA’s rankings over the last 12 months.

Now that we have our sense check, we can apply our Machine Learning solution. I have picked XGBoost as my algorithm to predict head-to-head results. The code is located on (551) 226-8434.

Matches with a win probability between 40% and 60% I have marked as a DRAW or, in the knockout rounds, as going to PENALTIES.
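The thresholding rule can be sketched as below. The function name and the inclusive boundaries are assumptions; the probabilities in the listing are rounded to two decimals, so how a value of exactly 0.60 was treated cannot be read off from it:

```python
def label_prediction(p_win, knockout=False, low=0.40, high=0.60):
    """Turn a win probability for one team into a match label."""
    if low <= p_win <= high:
        # Too close to call: a draw in the group stage, penalties in a knockout tie.
        return "PENALTIES" if knockout else "DRAW"
    return "WIN" if p_win > high else "LOSS"


print(label_prediction(0.55))                 # group-stage match
print(label_prediction(0.58, knockout=True))  # knockout match
print(label_prediction(0.87))
```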

___Starting group A:___
(WRONG) Russia vs. Saudi Arabia: Saudi Arabia wins with 0.66
(WRONG) Russia vs. Egypt: Egypt wins with 0.71
(CORRECT) Russia vs. Uruguay: Uruguay wins with 0.87
(WRONG) Saudi Arabia vs. Egypt: Egypt wins with 0.59 (DRAW)
(CORRECT) Saudi Arabia vs. Uruguay: Uruguay wins with 0.60
(CORRECT) Egypt vs. Uruguay: Uruguay wins with 0.87
___Starting group D:___
(CORRECT) Argentina vs. Iceland: Argentina wins with 0.60 (DRAW)
(WRONG) Argentina vs. Croatia: Draw
(WRONG) Argentina vs. Nigeria: Draw
(CORRECT) Iceland vs. Croatia: Croatia wins with 0.73
(WRONG) Iceland vs. Nigeria: Nigeria wins with 0.59 (DRAW)
(CORRECT) Croatia vs. Nigeria: Croatia wins with 0.76
___Starting group F:___
(CORRECT) Germany vs. Mexico: Mexico wins with 0.80
(WRONG) Germany vs. Sweden: Sweden wins with 0.58 (DRAW)
(WRONG) Germany vs. Korea Republic: Germany wins with 0.85
(CORRECT) Mexico vs. Sweden: Sweden wins with 0.98
(WRONG) Mexico vs. Korea Republic: Korea Republic wins with 0.81
(WRONG) Sweden vs. Korea Republic: Korea Republic wins with 1.00
___Starting group C:___
(CORRECT) France vs. Australia: France wins with 0.88
(WRONG) France vs. Peru: Peru wins with 0.63
(WRONG) France vs. Denmark: Denmark wins with 0.63
(CORRECT) Australia vs. Peru: Peru wins with 0.85
(WRONG) Australia vs. Denmark: Denmark wins with 0.85
(WRONG) Peru vs. Denmark: Denmark wins with 0.55 (DRAW)
___Starting group H:___
(WRONG) Poland vs. Senegal: Senegal wins with 0.68
(CORRECT) Poland vs. Colombia: Colombia wins with 0.83
(WRONG) Poland vs. Japan: Poland wins with 0.79
(CORRECT) Senegal vs. Colombia: Colombia wins with 0.78
(CORRECT) Senegal vs. Japan: Draw
(CORRECT) Colombia vs. Japan: Japan wins with 1.00
___Starting group E:___
(WRONG) Brazil vs. Switzerland: Brazil wins with 0.96
(CORRECT) Brazil vs. Costa Rica: Brazil wins with 0.94
(CORRECT) Brazil vs. Serbia: Brazil wins with 0.90
(CORRECT) Switzerland vs. Costa Rica: Costa Rica wins with 0.55 (DRAW)
(CORRECT) Switzerland vs. Serbia: Switzerland wins with 0.61
(CORRECT) Costa Rica vs. Serbia: Serbia wins with 0.72
___Starting group G:___
(CORRECT) Belgium vs. Panama: Belgium wins with 0.76
(WRONG) Belgium vs. Tunisia: Draw
(WRONG) Belgium vs. England: England wins with 0.74
(WRONG) Panama vs. Tunisia: Tunisia wins with 0.59 (DRAW)
(WRONG) Panama vs. England: Draw
(WRONG) Tunisia vs. England: Tunisia wins with 0.67
___Starting group B:___
(CORRECT) Portugal vs. Spain: Portugal wins with 0.57 (DRAW)
(CORRECT) Portugal vs. Morocco: Portugal wins with 0.80
(WRONG) Portugal vs. Iran: Portugal wins with 0.82
(WRONG) Spain vs. Morocco: Spain wins with 0.96
(CORRECT) Spain vs. Iran: Spain wins with 0.94
(WRONG) Morocco vs. Iran: Morocco wins with 0.65

___Starting of the round_of_16___
Uruguay vs. Spain: Uruguay wins with probability 0.52
Denmark vs. Argentina: Argentina wins with probability 1.00
Brazil vs. Sweden: Brazil wins with probability 0.89
Tunisia vs. Japan: Tunisia wins with probability 0.75
Egypt vs. Portugal: Portugal wins with probability 0.87
Peru vs. Croatia: Croatia wins with probability 0.58 (PENALTIES)
Serbia vs. Korea Republic: Serbia wins with probability 0.72
England vs. Colombia: Colombia wins with probability 0.58 (PENALTIES)

___Starting of the quarterfinal___
Uruguay vs. Argentina: Uruguay wins with probability 0.56 (PENALTIES)
Brazil vs. Tunisia: Brazil wins with probability 0.96
Portugal vs. Croatia: Portugal wins with probability 0.62
Serbia vs. Colombia: Serbia wins with probability 0.70

___Starting of the semifinal___
Uruguay vs. Brazil: Brazil wins with probability 0.54 (PENALTIES)
Portugal vs. Serbia: Portugal wins with probability 0.82

___Starting of the final___
Brazil vs. Portugal: Brazil wins with probability 0.95

So if we look at Brazil, we can see that the probability of them winning the tournament is the probability of Brazil winning each game multiplied together, which is 0.96 * 0.94 * 0.90 * 0.89 * 0.96 * 0.54 * 0.95 ≈ 0.36. We should therefore expect to see Brazil at odds that roughly triple your money, but most bookmakers have it at 1 to 4.
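The multiplication is easy to check with a few lines of Python. The variable names are made up, the seven probabilities are Brazil's three group games followed by the four knockout rounds from the listing above, and multiplying them treats the matches as independent:

```python
from math import prod

# Brazil's predicted win probabilities: three group games, then the
# round of 16, quarterfinal, semifinal and final.
brazil_path = [0.96, 0.94, 0.90, 0.89, 0.96, 0.54, 0.95]

p_title = prod(brazil_path)        # chance of winning every game
fair_decimal_odds = 1 / p_title    # total payout per unit staked at fair odds

print(round(p_title, 3), round(fair_decimal_odds, 2))
```

The product comes out at about 0.356 (0.35 if truncated, 0.36 if rounded), so the fair payout is a little under three times the stake.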

I will leave you to calculate the odds of the other teams and spot when the bookmakers have got it wrong 🙂 have fun.