How the Model Works--A Detailed Example Part 2

This is a continuation of an article that details exactly how my predictions and rankings are derived. You can read part 1 here. To recap, I'm using the Super Bowl match-up between the Steelers and Cardinals as an example. So far, we've used a logistic regression model based on team efficiency stats to estimate the probability each team will win.

We haven't accounted for strength of schedule yet. For example, the Steelers may have the NFL's best run defense, yielding only 3.3 yds per rush. But is that because they're good or because their opponents happened to have poor running games?

To adjust for opponent strength, we'll first need to calculate each team’s generic win probability (GWP), or the probability of winning a game against a notional league-average opponent at a neutral site. This would give us a good estimate of a team’s expected winning percentage based on their stats.

Since we already know each team’s logit components, all we need to know is the NFL-average logit. If we take the average efficiency stats and apply the model coefficients we get Logit (Avg) = -2.52.

Therefore, for the Cardinals, a game against a notional average opponent would look like:

Logit = Logit (ARI) – Logit (Avg)
= 0.07
The odds ratio is e^0.07 = 1.09. (The actual calculations carry more decimal places than the rounded figures shown here, so some numbers won't match exactly if you recompute them from the rounded values.) Arizona’s GWP is 0.52—just barely above average. If we do the same thing for Pittsburgh, we get a GWP of 0.73. And it’s easy enough to do for all 32 teams. In fact, that’s what we need to do for our next step in the process, which is to adjust for average opponent strength.
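The logit-to-GWP conversion can be sketched in a few lines of Python. The logits (-2.45 for Arizona, -1.51 for Pittsburgh, -2.52 for the league average) are the values used in this article; the rest is just the odds-ratio arithmetic:

```python
import math

LOGIT_AVG = -2.52  # league-average logit from the article

def gwp(team_logit, avg_logit=LOGIT_AVG):
    """Generic win probability: chance of beating a league-average
    opponent at a neutral site."""
    x = team_logit - avg_logit   # game logit vs. the average team
    odds = math.exp(x)           # odds ratio
    return odds / (1 + odds)     # odds -> probability

print(round(gwp(-2.45), 2))  # Arizona    -> 0.52
print(round(gwp(-1.51), 2))  # Pittsburgh -> 0.73
```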

The GWPs I calculated for Arizona and Pittsburgh were based on raw efficiency stats, unadjusted for opponent strength. That’s ok if we assume they had roughly the same strength of schedule. But often teams don’t, especially in the earlier weeks of the season.

To adjust for opponent strength, I could adjust each team efficiency stat according to the average opponents’ corresponding stat. In other words, I could adjust the Cardinals’ passing efficiency according to their opponents’ average defensive efficiency. I’d have to do that for all the stats in the model, which would be insanely complex. But I have a simpler method that produces the same results.

For each team, I average its to-date opponents’ GWP to measure strength of schedule. This season Arizona’s average opponent GWP was 0.51—essentially average. I can compute the average logit of Arizona’s opponents by reversing the process I’ve used so far.

The odds ratio for the Cardinals’ average opponent is 0.51/(1-0.51) ≈ 1.03, and the log of the odds ratio, or logit, is about 0.034. (Starting from the rounded 0.51 you’d get 1.04 and 0.04; the figures shown come from the unrounded GWP.) I can add that adjustment into the logit equation we used to get their original GWP.

Logit = Logit(ARI) – Logit(Avg) + 0.034
= 0.10

This makes the odds ratio e^0.10 = 1.11. Their GWP now becomes 0.53. If you think about it intuitively, this makes sense. Their unadjusted GWP was 0.52. They (apparently) had a slightly tougher schedule than average. So their true, underlying team strength should be slightly higher than we originally estimated.
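That single adjustment step looks like this in Python. Note that starting from the rounded opponent GWP of 0.51 gives an adjustment of about 0.04 rather than the 0.034 computed from unrounded values, but the adjusted GWP still rounds to 0.53:

```python
import math

def logit(p):
    """Log of the odds ratio for a probability p."""
    return math.log(p / (1 - p))

opp_adjustment = logit(0.51)            # avg opponent GWP -> opponent logit
adjusted_logit = 0.07 + opp_adjustment  # unadjusted ARI logit + adjustment
adjusted_gwp = 1 / (1 + math.exp(-adjusted_logit))
print(round(adjusted_gwp, 2))  # -> 0.53
```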

I said ‘apparently’ because now that we’ve adjusted each team’s GWP, each team’s average opponent GWP changes too. So we have to repeat the process of averaging each team’s opponent GWP and redoing the logistic adjustment. I iterate this (usually 4 or 5 times) until the adjusted GWPs converge. In other words, they stop changing because each successive adjustment gets smaller as it zeroes in on the true value.
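Here is a sketch of that iteration, using made-up team logits and schedules. The key detail (which comes up in the comments below) is that each pass re-averages opponents' latest GWPs but always applies the adjustment to the raw logits; adjusting the already-adjusted values never converges:

```python
import math

def gwp_from_logit(x):
    """Convert a logit (vs. a league-average team) to a win probability."""
    return 1 / (1 + math.exp(-x))

def iterate_adjustments(raw_logits, schedules, n_iter=5):
    """raw_logits: {team: raw logit}; schedules: {team: list of opponents}.
    Returns opponent-adjusted GWPs after n_iter passes."""
    adjusted = dict(raw_logits)
    for _ in range(n_iter):
        gwps = {t: gwp_from_logit(x) for t, x in adjusted.items()}
        new = {}
        for team in raw_logits:
            opps = schedules[team]
            opp_gwp = sum(gwps[o] for o in opps) / len(opps)
            # adjust the RAW logit by the average opponent's logit
            new[team] = raw_logits[team] + math.log(opp_gwp / (1 - opp_gwp))
        adjusted = new
    return {t: round(gwp_from_logit(x), 3) for t, x in adjusted.items()}

# Toy three-team round robin with hypothetical logits
raw = {'A': 0.5, 'B': 0.0, 'C': -0.5}
sched = {'A': ['B', 'C'], 'B': ['A', 'C'], 'C': ['A', 'B']}
print(iterate_adjustments(raw, sched))
```

In this symmetric toy league, team B's schedule is exactly average at every pass, so its GWP stays at 0.50, while A's and C's get pulled slightly toward the middle because each faced one above- and one below-average opponent.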

Ultimately, Arizona’s opponent GWP is 0.50 and Pittsburgh’s is 0.53. After a full season of 16 games, strength of schedule tends to even out. But earlier in the season one team might have faced a schedule averaging 0.65 while another may have faced one averaging 0.35.

My hunch is that it’s this opponent adjustment technique that gives this model its accuracy. It’s easy enough to look at a team’s record or stats to intuitively assess how good it is, but it’s far more difficult to get a good grasp of how inflated or deflated its reputation may be due to the aggregate strength or weakness of its opponents.

Now that we’ve determined opponent adjustments, we can apply them to the game probability calculations. The full logit now becomes:

Logit = const + home field + (Team A logit + Team A Opp logit) –
(Team B logit + Team B Opp logit)

Pittsburgh’s opponent logit is log(0.53/(1-0.53)) = 0.10 and Arizona’s is log(0.50/(1-0.50)) = 0.01 (essentially zero; the unrounded GWP is just a shade above .50). The game logit including opponent adjustments is now:

Logit = -0.36 + 0.72/2 + (-2.45 + 0.01) - (-1.51 + 0.10)
= -1.02

The odds ratio is therefore e^-1.02 = 0.36, which makes the probability of Arizona winning 0.36/(1 + 0.36) = 0.27. This estimate, based on opponent adjustments, is slightly lower than the unadjusted estimate. This makes sense because Arizona’s strength of schedule was basically average, and Pittsburgh’s was slightly tougher than average.
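The whole game calculation can be put together in Python. The constant, home-field coefficient, team logits, and opponent logits are the rounded values from this article (halving the home-field term for a neutral site follows the equation above), so the result may differ in the last digit from the full-precision calculation:

```python
import math

CONST = -0.36  # regression constant from part 1
HFA = 0.72     # home-field coefficient; halved here for a neutral site

def game_prob(logit_a, opp_a, logit_b, opp_b, neutral=True):
    """Probability that Team A beats Team B, with opponent adjustments."""
    hf = HFA / 2 if neutral else HFA
    x = CONST + hf + (logit_a + opp_a) - (logit_b + opp_b)
    return 1 / (1 + math.exp(-x))

# Super Bowl XLIII: Arizona (team A) vs. Pittsburgh (team B)
p_ari = game_prob(-2.45, 0.01, -1.51, 0.10)
print(round(p_ari, 2))  # probability of an Arizona win from these rounded inputs
```

Because the constant exactly offsets the halved home-field term at a neutral site, swapping the two teams' arguments gives the complementary probability, as it should.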

So there you have it, a complete estimate of Super Bowl XLIII probabilities and a step-by-step method of how I do it.

There are all kinds of variations to play around with. You can choose which weeks of stats to use, to overweight, or to ignore. You can calculate a team’s offensive GWP by holding its own defensive stats at the league average in the calculations, and only adjusting for opponent defensive stats. The resulting OGWP tells us how a team would do on the strength of its offense alone. It’s the generic win probability assuming the team had a league-average defense. DGWP is the reverse: the generic win probability assuming a league-average offense.
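A sketch of the OGWP idea, using made-up stat names and coefficients purely for illustration; the point is just that a team's own defensive stats get swapped for league averages before computing the logit:

```python
def ogwp_inputs(stats, league_avg, defensive_keys):
    """Replace a team's own defensive stats with league averages so the
    resulting GWP reflects offensive strength alone (swap the key set
    for DGWP)."""
    return {k: (league_avg[k] if k in defensive_keys else v)
            for k, v in stats.items()}

def team_logit(coefs, stats):
    """Team logit as a weighted sum of efficiency stats."""
    return sum(coefs[k] * v for k, v in stats.items())

# Hypothetical two-stat model
coefs = {'off_pass_eff': 0.5, 'def_pass_eff': -0.5}
team = {'off_pass_eff': 7.0, 'def_pass_eff': 5.0}
league = {'off_pass_eff': 6.0, 'def_pass_eff': 6.0}
off_only = ogwp_inputs(team, league, {'def_pass_eff'})
print(team_logit(coefs, off_only))  # offense-only logit -> 0.5
```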

One variation I employ is to counter early-season overconfidence by adding a number of dummy weeks of league-average data to each team's stats. This regresses each team's stats to the league mean, which reduces the tendency for team stats to be extreme due to small sample size. For example, it takes about 6 weeks for a team's offensive run efficiency to stabilize near its ultimate season-long average. So at week 3, I'll add 3 games worth of purely average performance into each team's running efficiency stat. No team will sustain either 7.5 yds per rush or 2.2 yds per rush.
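That blending amounts to a weighted average. Here's a minimal sketch, assuming a league-wide rushing average of 4.2 yds per carry (a made-up figure for illustration):

```python
def regress_to_mean(team_avg, league_avg, weeks_played, dummy_weeks):
    """Blend a team's observed per-play average with dummy weeks of
    exactly league-average performance."""
    total = weeks_played + dummy_weeks
    return (team_avg * weeks_played + league_avg * dummy_weeks) / total

# Week 3: a team rushing at 7.5 yds/carry gets pulled well back toward the mean
print(round(regress_to_mean(7.5, 4.2, 3, 3), 2))  # -> 5.85
```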

This entire process might seem ridiculously convoluted, but it’s actually pretty simple. You get the coefficients from the regression. You next calculate each team’s logit with simple arithmetic. Game probabilities and “GWP” are just a logarithm away. Opponent adjustments require a little more effort, but in the end, you just add them into the logit equation.

Voila--a completely objective, highly accurate NFL game prediction and team ranking system.


18 Responses to “How the Model Works--A Detailed Example Part 2”

  1. Anonymous says:

    what is the constant?

  2. Brian Burke says:

    -0.36. All the coefficients are in part 1 if you're interested.

    Maybe I should call it 'f' or something novel. You know, like the speed of light is 'c' or the natural exponent is 'e.' Or a Greek letter maybe, like pi? No, I think that one's taken.

    Essentially, it's the value of home field advantage. e^-0.36 = 0.70, and .70/(1+.70) = .41. So HFA for equal opponents is 0.59 to 0.41. Or, in terms of odds, it's about 1.4 to 1 -- that's pretty big.

  3. Anonymous says:

    The research I've done, which has all the games from 2004 to now, shows that the home team wins about 56.6% of the time. Shouldn't that correspond to the 59% number? Across all the games, team quality differences should even out, right?

  4. Brian Burke says:

    I've found that HFA is not linear for all games. The 56.6% number you found is an average across all types of game match-ups. 59% is the HFA when teams are evenly matched. It ranges from 59% down to 53% for mismatch-type games.

    In case anyone is interested, here is the article on it.

  5. Brian Burke says:

    Also keep in mind the effect of HFA varies randomly from year to year. Some years the average is 58% and some years it's 53%.

  6. bmoore_ucla says:

    Brian, it would be easy enough to test your hunch about the opponent adjustment being the source of your improved accuracy. It would also be interesting to know how big the improvement is.

  7. Brian Burke says:

    You're right. But there are so many things I'd like to work on. I tried hard this year not to obsess over the prediction model's accuracy, and as a consequence all the records I'd need to do that are scattered all over my hard drive.

  8. Brian says:

    So the first time through, you start with Team A's raw GWP and create Team A's adjusted GWP using their opponents' raw GWP. In the second iteration, do you begin with Team A's raw GWP, or do you use their newly calculated adjusted GWP? My guess is that if you keep using the adjusted GWP then the numbers will never converge.

  9. Brian Burke says:

    Yes. If you adjust the adjusted value, the solution never converges. The GWPs spread apart, with lots of .10s and .10s.

  10. Brian Burke says:

    .90s I mean.

  11. Borat says:

    What is the frequency, Kenneth?

  12. Anonymous says:

    I tried to understand how the adjustments work, so I decided to study the example. I noticed several differences between your calculations and mine, so I wonder if I made a mistake.

    1/ When you calculate the odds ratio of ARI you found 1.09, but I found exp(0.07) = 1.07. It's a slight difference, but who's correct?

    2/ Same question for the odds ratio of ARI's opponents.
    0.51/(1-0.51) = 1.04
    log(1.04) = 0.017033339
    you found
    0.51/(1-0.51) = 1.03
    log(1.03) = 0.034

    3/ Logit = const + Logit(ARI) – Logit(Avg) + 0.034 = 0.11
    if the constant is -0.36 so
    Logit = -0.36 + -2.45 – -2.52 + 0.034 = -0.256
    Why do we need the constant since the GWP is on neutral field?

    If you get a chance I would like to know if my example is correct.
    Team GWP = 0.744, so the logit = 0.46; log(0.744/(1-0.744))
    Team Opp GWP = 0.136 so the logit = -0.803
    Adjustment of logit = 0.46 - 0.803 = -0.343
    New GWP = 1 / (1 + exp(-(-0.343))) = 0.415

    The GWP goes from 0.744 to 0.415 which makes sense because the team played against very weak opponents so far. Is this first iteration correct?

  13. Anonymous says:

    When you use log, is it the logarithm with base e, or base 10?

  14. Brian Burke says:

    Sorry, I don't have time to go over your calculations, but I think the difference between 1.09 and 1.07 may be due to rounding errors. I'm using many more significant digits for the actual calculations than in the write up. That might explain it.

    And I'm using base e.

  15. Anonymous says:

    Excel's LOG function defaults to base 10, so that might be the problem.

    When you calculate the adjusted GWP, why do you add the constant to the equation?
    Logit = const + Logit(ARI) – Logit(Avg) + 0.034

  16. Csonka says:

    I have a question related to the previous one.
    Not only why did you add the constant -- I was thinking you put the constant in in case there was a HFA, but in this case it would be zero (neutral field), so wouldn't logit = .07 + .034 = .104?

    Logit = const + Logit(ARI) – Logit(Avg) + 0.034
    = 0.11

    Brian, can you explain where you get 0.11?
    thanks

  17. Brian Burke says:

    I made an error in the write up. The constant should not be in that line. I'll correct it above.

    It should be Logit = Logit(ARI) - Logit(Avg) + 0.034

    which is:

    Logit = 0.07 + 0 + 0.034
    Logit = 0.104
    The odds ratio = Exp(0.104) = 1.11
    The probability is therefore: 1/(1+(1/1.11)) = 0.53

  18. Csonka says:

    ok, thanks
