This is the third and final post in a series discussing the amount of luck in determining outcomes in the NFL. In the first post, I compared the actual distribution of team win-loss records over the past five seasons with an idealized pure-luck distribution. I found that only 78 of the 160 actual season records (49%) differed from what we'd expect if the NFL were determined completely by luck.
In the second post, I compared the actual distribution with an idealized distribution of records in a theoretical league governed by “pure skill.”
In this post, I will unify the three distributions--actual, luck, and skill--into one algorithm. The resulting equation reveals the proportion of NFL games in which the deciding factor is luck rather than the comparative strength of each opponent.
LUCK, SKILL, AND OBSERVED
The chart below is a histogram of the pure binomial distribution, the simulated pure-skill (zero luck) distribution, and the actual distribution of NFL records since the '02 expansion. (Pure luck is blue, actual is yellow, and pure skill is red.)
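The pure-luck curve needs no simulation at all: it is just the binomial distribution for a 16-game season of coin flips. A quick sketch of how those probabilities fall out (standard-library Python; the variable names are my own, not from the article):

```python
from math import comb

GAMES = 16  # regular-season games per team

# P(exactly k wins) for a team whose every game is a 50/50 coin flip:
# the binomial pmf with n = 16, p = 0.5.
pure_luck = [comb(GAMES, k) * 0.5**GAMES for k in range(GAMES + 1)]

for k, p in enumerate(pure_luck):
    print(f"{k:2d} wins: {p:.4f}")
```

Multiplying each probability by the 160 team-seasons in the sample gives the expected counts the blue bars represent.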
When I first examined the three distributions together, I was struck by how the actual distribution appeared to split the difference between the luck and skill distributions. The actual records look like some combination of the two--as if the pure-luck binomial distribution had been pressed into a flatter, wider shape by skill.
It dawned on me to create another simulation, one that blended the pure-luck and pure-skill distributions in varying degrees (10% luck/90% skill, 20% luck/80% skill, etc.). Basically, a %luck variable determined the percentage of games (chosen at random) to be decided by pure luck, essentially a coin flip. The remaining games were credited to the superior team. The simulation algorithm looked like this:
If rand() < %luck, then game outcome = pure luck, else game outcome = win by the better team
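That one-line rule is the whole model. A minimal sketch of how the season simulation might look (this is my own illustrative Python, not the article's actual code; the random ratings standing in for "team strength" and the random per-round pairings are simplifying assumptions):

```python
import random

def simulate_season(luck_pct, n_teams=32, n_games=16, seed=None):
    """Return a list of win totals, one per team, under the mixed rule:
    each game is a coin flip with probability luck_pct, otherwise a
    win for the stronger team."""
    rng = random.Random(seed)
    # Hypothetical stand-in for "true team strength": a fixed random rating.
    strength = [rng.random() for _ in range(n_teams)]
    wins = [0] * n_teams
    for _ in range(n_games):
        # Pair the teams off at random for this round of games.
        order = rng.sample(range(n_teams), n_teams)
        for a, b in zip(order[::2], order[1::2]):
            if rng.random() < luck_pct:
                winner = rng.choice((a, b))  # luck: a coin flip
            else:
                winner = a if strength[a] > strength[b] else b  # skill
            wins[winner] += 1
    return wins

# Example: one simulated season at the best-fit 52.5% luck (see below).
season = simulate_season(0.525, seed=1)
```

Running this many times and histogramming the win totals reproduces one curve per %luck value; at luck_pct=0 it collapses to the pure-skill distribution, at luck_pct=1 to the binomial.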
I varied the %luck value between 0 and 1, re-running the simulation. Here are some representative win distributions (legend is in %luck):
Next, I overlaid the actual distribution.
PROPORTION OF LUCK IN NFL OUTCOMES
Then I varied the %luck value until it maximized the goodness of fit between the synthesized distribution and the actual one. At 52.5% luck, the two are statistically indistinguishable (chi-square goodness of fit, p = 0.94). In other words, the discrepancies between the synthesized simulation and the actual observations are well within what sampling error alone would produce.
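For readers who want to reproduce the fitting step: it amounts to computing a chi-square statistic between the observed counts of records and the counts each candidate %luck simulation predicts, then keeping the %luck that minimizes the statistic. A sketch, with made-up placeholder counts standing in for the real data (these are NOT the article's numbers):

```python
def chi_square_stat(observed, expected):
    """Pearson chi-square statistic: sum of (O - E)^2 / E over bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

# Placeholder counts per win total: 160 observed team-seasons vs. what a
# candidate %luck simulation predicts for the same bins (illustrative only).
observed = [1, 2, 6, 12, 18, 22, 24, 22, 18, 12, 6, 2, 1]
expected = [1.5, 3, 7, 11, 17, 21, 25, 21, 17, 11, 7, 3, 1.5]

stat = chi_square_stat(observed, expected)
# The best-fit %luck is the value that minimizes this statistic
# (equivalently, maximizes the goodness-of-fit p-value).
```

Sparse tail bins (0-win and 16-win seasons) would normally be pooled before running the test, a detail omitted here for brevity.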
THEORETICAL MAXIMUM PREDICTION RATE
I will be very careful in stating what conclusion I draw from this exercise. The actual observed distribution of win-loss records in the NFL is indistinguishable from a league in which 52.5% of the games are decided at random and not by the comparative strength of each opponent.
I admit 52.5% seems very high. But keep in mind that even in the luck-decided games, the better team still wins half the time by chance--just as a monkey picking winners would. If the 52.5% figure is correct, the best any prediction model could do is get all of the skill-decided games right (47.5%) plus half of the luck-decided games:

0.475 + 0.525/2 ≈ 0.74

So roughly 74% correct would be the theoretical ceiling for NFL game prediction models. This is consistent with the various computer models as well as odds makers. It is also consistent with our intuitive experience--upsets seem to happen about a quarter of the time. Sometimes a model (or a person) can predict at better than a 74% rate, but anything above that would be...by luck.
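The ceiling calculation generalizes to any luck share: a perfect handicapper gets every skill-decided game right and half of the luck-decided games. A quick sanity check (the function name is my own):

```python
def prediction_ceiling(luck_pct):
    """Maximum long-run accuracy when a luck_pct share of games are
    coin flips: all skill games right, half of the luck games right."""
    return (1 - luck_pct) + luck_pct / 2

ceiling = prediction_ceiling(0.525)  # about 0.74
```

Note the formula simplifies to 1 - luck_pct/2: at 0% luck the ceiling is 100%, and at 100% luck no model can beat a coin flip.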
I also posted a follow-up to this series of articles here.