In a post from last year I noted how team records tend to regress to the mean from year to year based on how well a team did regarding interceptions. When teams did notably well in either offensive (low) or defensive (high) interceptions, the overwhelming trend was for them to win fewer games the following year. Likewise, teams with poor interception stats tended to win more games the following year.
When we look at team records from year to year, regression to the mean dominates. Good teams win fewer games the next year, and bad teams win more. This tendency is extremely strong, as illustrated by the graph below. The horizontal x-axis represents each team's regular season win total from the prior year. The vertical y-axis is the change in each team's win total from the prior year to the subsequent year. The more wins a team had, the larger the drop in the following year. Likewise, the fewer wins a team had, the larger the improvement. For example, the typical 13-win team will tend to win 4 fewer games the following year, and the typical 4-win team will tend to win 3 more.
I previously attributed the strength of the regression phenomenon to three things: the scheduling system, which matches opponents according to how they placed in their respective divisions; the draft, which allocates draft position in reverse order of win-loss records; and salary cap boom/bust cycles, in which individual teams load up on talented and costly players, then 'purge' their rosters to recover the salary cap room tied up in the dead weight of past signing bonuses.
While those considerations very likely contribute to the churn of team records, I now believe the major cause is the randomness of turnovers. Each team's turnover stats have a random component--think of tipped passes or fumbles bouncing on the turf. To test how strongly turnovers drive the phenomenon of win regression, I calculated the correlation between each turnover stat and the year-to-year change in team win totals (Win Δ). The data cover all 32 teams over the five season-pairs in the 2002-2007 regular seasons (n=160).
|Stat||Win Δ Correlation|
These are very strong correlations, considering we are estimating next year's wins with the previous year's stats. It's important to point out that these are inverse relationships: the better a team does in terms of turnovers one year, the fewer games it is expected to win the following year. To put this in context with other correlations in the NFL, current year TD passes correlate at 0.50 with current year wins.
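For readers who want to reproduce this kind of calculation, here is a minimal sketch of correlating one turnover stat with Win Δ. The field names (`int_taken`, `wins`, `wins_next`) and the four sample rows are made up for illustration; they are not the 2002-2007 data, and the sketch assumes a plain Pearson correlation.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def win_delta_correlations(rows, stat_names):
    """Correlate each named stat with next-year wins minus this-year wins."""
    win_delta = [r["wins_next"] - r["wins"] for r in rows]
    return {name: pearson([r[name] for r in rows], win_delta)
            for name in stat_names}

# Hypothetical season-pairs, one dict per team-season (illustrative values only).
rows = [
    {"int_taken": 22, "wins": 11, "wins_next": 8},
    {"int_taken": 10, "wins": 5, "wins_next": 8},
    {"int_taken": 16, "wins": 8, "wins_next": 9},
    {"int_taken": 19, "wins": 10, "wins_next": 7},
]
result = win_delta_correlations(rows, ["int_taken"])
print(result)
```

A negative value for a "good" stat such as interceptions taken is the pattern described above: a strong turnover year goes with fewer wins the next year.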
Based on each team's 2007 turnover stats we can estimate its improvement or decline for 2008. The estimates come from a linear regression of Win Δ on fumbles lost, fumbles taken, interceptions thrown, and interceptions taken. Those teams that benefited the most from favorable turnover stats would be expected to decline, and vice versa. The table below lists each team and its expected change in wins from 2007 to 2008.
(One caveat--these are not definitive predictions for 2008. They reflect only the overwhelming tendency for teams to regress based on turnovers. Think of them as baseline estimates around which other factors, such as injuries and fundamental improvement or decline, would operate.)
|Team||Int Taken||Fum Taken||Int Thrown||Fum Lost||Net TO||Exp Win Δ|
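As a rough illustration of how estimates like these could be produced, the sketch below fits an ordinary least squares regression of Win Δ on the four turnover stats using only the standard library. The helper names (`fit_ols`, `predict`) and all numbers are hypothetical, and the fitting method is an assumption, since the post does not say how the regression was run. In practice a library routine such as `numpy.linalg.lstsq` would be the more usual choice.

```python
def fit_ols(X, y):
    """Ordinary least squares with an intercept, via the normal equations.

    Returns [intercept, b1, b2, ...], one slope per column of X.
    """
    rows = [[1.0] + [float(v) for v in x] for x in X]  # prepend intercept column
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coefs, x):
    """Expected Win Δ for one team's turnover stats."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], x))

# Hypothetical prior-year stats, in the column order
# (int_taken, fum_taken, int_thrown, fum_lost), with made-up Win Δ values.
X = [
    [20, 12, 15, 10], [12, 8, 22, 14], [18, 15, 10, 9], [9, 10, 19, 16],
    [15, 9, 14, 11], [22, 11, 12, 8], [11, 14, 18, 13],
]
win_delta = [-3, 2, -4, 3, 0, -2, 1]

coefs = fit_ols(X, win_delta)
print(round(predict(coefs, [17, 10, 13, 12]), 2))
```

With real data, each team's 2007 stats would be passed to `predict` after fitting on the earlier season-pairs.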
Why do randomness and regression to the mean appear so strong in the NFL? I think it's due to a combination of a short schedule and team parity. Sixteen games is simply not long enough for "the breaks" to even out. And if the opponents are relatively equal in ability, then random factors will play a large role in determining game outcomes. When randomness is decisively involved, regression to the mean will be a strong force from year to year.
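The short-schedule argument can be checked with a toy simulation. The sketch below is a deliberate simplification, not a model of the actual NFL: each hypothetical team gets a fixed true win probability near 0.5 (the `talent_sd` spread is an assumed value), keeps that same ability in both years, and plays 16 independent games per season. Any year-to-year movement therefore comes from luck alone.

```python
import random

def simulate_season_pairs(n_teams=2000, games=16, talent_sd=0.05, seed=0):
    """Return (year-1 wins, change in wins) for each simulated team."""
    rng = random.Random(seed)
    pairs = []
    for _ in range(n_teams):
        # True ability is the same in both seasons, clipped to a valid probability.
        p = min(max(rng.gauss(0.5, talent_sd), 0.0), 1.0)
        year1 = sum(rng.random() < p for _ in range(games))
        year2 = sum(rng.random() < p for _ in range(games))
        pairs.append((year1, year2 - year1))
    return pairs

def mean_change(pairs, keep):
    """Average change in wins among teams whose year-1 wins satisfy `keep`."""
    changes = [delta for wins, delta in pairs if keep(wins)]
    return sum(changes) / len(changes)

pairs = simulate_season_pairs()
print("12+ wins in year 1:", round(mean_change(pairs, lambda w: w >= 12), 2))
print("4 or fewer wins in year 1:", round(mean_change(pairs, lambda w: w <= 4), 2))
```

Even with no change in underlying ability, the simulated teams with the best records decline on average and the worst improve, which is the pattern a short schedule plus parity would be expected to produce.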