One of the reasons I have always supported and endorsed Brian Burke and his work at Advanced NFL Stats is his recognition of the limitations of statistical analysis. Ever since statistical analysis took baseball by storm and many of its most prominent practitioners were scooped up and employed by MLB franchises, there has been a crush to translate statistical analysis to other sports. And to varying extents, it has worked.
Football remains relatively impenetrable. It doesn’t have the binary pitcher-batter interface of baseball. Apart from maybe touchbacks, there are no true individual stats. One player’s name may appear beside a 20-yard reception, but that reception is the product of one player passing, one player receiving, x number of players running (ultimately) decoy routes, and y number of players blocking.
All those confounding factors tend to test the intuitiveness of advanced stats. Yes, Austin Collie was the second most valuable receiver by EPA/P, but, no, no one thinks Collie is the second most valuable receiver in football. Even aggregate EPA produces some head-scratchers. Was, for instance, Lance Moore truly more valuable than Marques Colston? Probably not, right? But we are short of information to explain exactly why. Keep in mind, a stat like EPA isn’t arguing that Moore is more valuable as a football player than Colston, only that passes targeting Moore were more valuable in toto than passes targeting Colston.
For the time being, advanced stats really must be combined with scouting to create a meaningful whole. However true that is, and however long it stays true, it doesn’t mean stats are maxed out. There remains huge potential, but some of that potential will never be realized unless the NFL itself improves its own stat keeping.
That is the subject of this piece.
Improve accuracy: According to the NFL,
“Gamebooks are prepared on site at each game using data available immediately following the completion of a game. They are intended to provide a snapshot of the game's action and are not updated after stats are made official on Monday mornings. Please note that scoring decisions made on game days are reviewed and frequently adjusted before becoming official.”
Seems like a reasonable standard. Football is messy, gamebooks are imprecise, but the NFL does its due diligence to correct what errors may have occurred during the initial stat keeping.
Though if you have ever compared game film to play-by-play, you will quickly recognize that football is messy and football statistics are often messier still. What determines who is awarded a sack or an assist is often all but impossible to determine. Players not even on the field are awarded tackles. What exactly defines a “pass defensed” is anyone’s guess.
Before we can add new information and new stats, the NFL must do a better job of recording the stats it already has. That means clear definitions and accurate accounting.
Personnel data: Part of the Colston-Moore dilemma is that, though Moore is very much involved in Sean Payton’s offense, he is a reserve/slot receiver, while Colston is the vaguely defined “number one receiver.” Moore had one start to Colston’s 11. But a “start,” which literally means that the player was involved in his team’s first offensive or defensive play, is a shallow, verging-on-ludicrous way to define a player’s importance to his team. Maybe Colston started but Moore played more snaps.
An NFL game typically involves a hundred or so plays. It is not a particularly long or involved process to record the personnel for one game. It is, however, a bit daunting to do that for the 256 games that populate the NFL regular season, at least for one person. It should not be hard for the NFL. Some personnel groupings, like the offensive line, are reasonably static. Others, like wide receivers or defensive linemen, are a little more dynamic.
Personnel data would both illuminate usage and create a kind of plus-minus. First used in hockey, plus-minus has been exported to basketball too. In football, it would help account for receivers that draw double teams and create openings for other receivers; defensive linemen that draw double teams and create space and openings for other linemen; running backs that pass block effectively; etc. I have seen simple attempts at this application. As a Seahawks fan, I knew that Seattle was a much less effective run defense when Marcus Tubbs was inactive. But inactive or active, starting or as a reserve, are at best approximations for on the field or not. And if we want to know whether Tubbs really was essential to the run defense, rather than whether his absence and the corresponding decline in run defense were just a coincidence, we need accurate and comprehensive accounting for participation by all players on all snaps.
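To make the idea concrete, here is a minimal sketch of what snap-level personnel data would allow. Everything in it is hypothetical: the play log, the player labels (DT_A, DT_B, DT_C) and the yardage numbers are invented for illustration, and the on/off split is only the crudest version of a plus-minus, with no adjustment for opponent, down, or distance.

```python
from collections import Counter

# Hypothetical snap log for run plays against one defense.
# Each entry lists the defenders of interest who were on the field
# and the yards the defense allowed. None of this is real NFL data.
PLAYS = [
    {"on_field": {"DT_A", "DT_B"}, "yards_allowed": 2},
    {"on_field": {"DT_A", "DT_C"}, "yards_allowed": 3},
    {"on_field": {"DT_B", "DT_C"}, "yards_allowed": 6},
    {"on_field": {"DT_B", "DT_C"}, "yards_allowed": 5},
    {"on_field": {"DT_A", "DT_B"}, "yards_allowed": 1},
]


def snap_counts(plays):
    """Count how many snaps each player took part in."""
    counts = Counter()
    for play in plays:
        counts.update(play["on_field"])
    return counts


def on_off_split(plays, player):
    """Return (mean yards allowed with player on field, mean with player off)."""
    on = [p["yards_allowed"] for p in plays if player in p["on_field"]]
    off = [p["yards_allowed"] for p in plays if player not in p["on_field"]]

    def mean(values):
        # None signals "no snaps in this bucket" rather than a misleading 0.
        return sum(values) / len(values) if values else None

    return mean(on), mean(off)


if __name__ == "__main__":
    print(snap_counts(PLAYS))
    print(on_off_split(PLAYS, "DT_A"))
```

Snap counts answer the usage question that "starts" cannot, and the on/off split is the raw material for a plus-minus. With only a handful of snaps the gap between the two averages could easily be coincidence, which is why the accounting needs to cover every player on every snap.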
Play action and draws: In its most cardinal form, its most absolute form, football is a game of running or passing and defending the run or defending the pass. But it’s not a neat binary. It is more like a scale or range, extending from a “pure” pass play, like a five-wide-receiver shotgun set, to a pure run play, something like the modern “wildcat” or Wing T.
Statistical analysis has proven that passes are on the whole more effective than runs and thus seemingly underutilized, but that conclusion stems from the initial assumption of run or pass, which is an oversimplification. Personnel data would help bridge that gap by indicating whether, say, a vertical threat like DeSean Jackson truly forces safeties back and thus improves the run game. However, to really understand how the run and pass game interact, we need to account for play fakes. Play fakes are elemental and essential parts of football strategy. Is, for instance, the value of Adrian Peterson (a talent adored by coaches and fans but largely undervalued by advanced statistics) hidden in his ability to improve the Vikings’ play-action offense? Or, conversely, how much does a great quarterback and a great passing offense improve the value of a draw play? We don’t know, and until we do know, statistical analysis will be stuck attempting to evaluate an absolute run and an absolute pass in a game that's really about everything that falls between.
Subdivide yards into feet: This might be a pipe dream but it’s too important to exclude: football is measured in three-foot increments. When a run off right end goes for two feet, it’s recorded as a one-yard gain. When a run off right end goes for four feet, it’s recorded as a one-yard gain. On first and ten or third and ten, etc., anything just short of the first down, be it one inch or four feet, is recorded as a nine-yard gain. On first and inches, anything that converts the first down but does not meet the threshold for a two-yard gain, be it the “inches” in question or four feet, is recorded as a one-yard gain. In isolation, some of this doesn’t matter much, but in aggregate, it makes a mess. For instance, how do we determine what a successful play is when every gain from 3.5 to 4.49 yards is lumped together?
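The arithmetic can be sketched in a few lines. This assumes, as the examples above do, that a gain is simply rounded to the nearest whole yard. That is a simplification, not the official scoring procedure, and the function names are made up for illustration. Gains are kept in inches so the code uses only integer math.

```python
INCHES_PER_FOOT = 12
INCHES_PER_YARD = 36


def recorded_yards(gain_inches):
    """Round a gain in inches to the nearest whole yard, halves rounding up.

    Assumption: the box score behaves like simple rounding.
    """
    return (2 * gain_inches + INCHES_PER_YARD) // (2 * INCHES_PER_YARD)


def to_yards_feet(gain_inches):
    """Express the same gain as (yards, feet), dropping leftover inches."""
    whole_feet = gain_inches // INCHES_PER_FOOT
    return divmod(whole_feet, 3)


if __name__ == "__main__":
    # Two feet, four feet, 3.5 yards, just under 4.5 yards, exactly 4.5 yards.
    for inches in (24, 48, 126, 161, 162):
        print(inches, recorded_yards(inches), to_yards_feet(inches))
```

Under that assumption, a two-foot run and a four-foot run both come out as one yard, and everything from 126 to 161 inches comes out as four. The yards-and-feet version keeps them apart while still reading like a football stat.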
Measuring the gridiron in yards is a holdover from Rugby Union, plus some tinkering by Walter Camp. For obvious reasons, primarily the lack of downs and thus first downs, increased granularity in measuring a rugby field is unnecessary. American Football is one of the few places the unit "yard" is still in wide usage. It’s essentially a matter of tradition, and one I doubt the NFL is keen to change. That said, yards do not have to be discarded entirely, just made more specific. There’s nothing particularly hard to decipher or unorthodox about measuring play length in yards and feet. It would certainly make statistical analysis, now dependent on a wealth of imprecise measurements, more accurate and meaningful.
That is the greater cause. Football may not need a statistical revolution. The NFL is the most successful professional sports league in America. It may not need enthusiasts who dig deep into the data and determine just how the game works, but if we’re out there, and we’re dedicated, and we represent part of the league’s future fanbase, why stubbornly ignore us? Fans, fantasy football fans, gamblers, enthusiasts, are hungry for new information, new ways to appreciate, understand and love the sport of football. Probably not a priority right now in these dark days of the lockout, but why not, NFL? Why not take the best sports product on Earth and make it better?