Trash-time distorts the relationship between true team strength and team statistics, be they conventional or advanced, total stats or per play. To determine true team strength, we need to weed out the random outcomes and discount trash-time performance.
WPA is probably the ultimate explanatory statistic. EPA is less explanatory and more predictive, because it's not subject to the leverage of time and score, but it's still subject to the random outcomes of a bouncing or tipped oblong ellipsoid.
One way to eliminate trash time from the data would be to simply throw out the fourth quarter. As it turns out, there is a lot of baby in that bathwater. A better way might be to throw out data based on Win Probability (WP). A statistic that's based on EPA, but limited to when the game is still in play, could be the answer.
There's still the problem of the bouncing ball. There are sometimes huge EPA plays--James Harrison's 100-yard TD return in the Super Bowl a couple years ago comes to mind. A play like that represents almost a 12-point swing in EP, but it's the kind of event that's so rare that it makes little sense to project future team performance from such a distorting play. Put simply, it does not have the equivalent predictive value of two solid 80-yard offensive drives.
But it's representative of something. We don't want to throw plays like that out. What we can do is limit their statistical impact. We can cap their EPA value at a certain amount, so that no single play counts for more than a chosen value in either direction.
To start, I chose to limit the data to plays in which the offense had a WP between 0.05 and 0.95. That eliminates situations in which a team is significantly ahead or behind, and may be playing in a way that distorts 'normal' football.
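As a rough illustration of that filter, here is a minimal Python sketch. The `Play` record and its field names are hypothetical, and treating the 0.05 and 0.95 bounds as inclusive is my assumption; the text doesn't say either way.

```python
from dataclasses import dataclass

# Thresholds from the text; they are described as arbitrary and open to revision.
WP_MIN = 0.05
WP_MAX = 0.95


@dataclass
class Play:
    """Hypothetical play record; the field names are illustrative only."""
    offense: str
    wp: float   # offense's win probability on the play
    epa: float  # expected points added on the play


def in_play(play: Play, lo: float = WP_MIN, hi: float = WP_MAX) -> bool:
    """True when the game is still competitive from the offense's point of view.

    Assumption: the bounds are inclusive.
    """
    return lo <= play.wp <= hi


def drop_trash_time(plays: list[Play]) -> list[Play]:
    """Keep only the plays whose WP falls inside the chosen window."""
    return [p for p in plays if in_play(p)]
```

Keeping the bounds as parameters makes it easy to try other windows, such as 0.10 to 0.90, without touching the rest of the calculation.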
Then I capped the EPA value of every play at 2.0 points on the positive side and -2.0 points on the negative side. This limits the impact of freak plays--blown coverages, pick sixes, or very long runs. Football is bounded by end zones, so a run that breaks past all defenders from the 50-yard line would have a higher EPA value than the same run from the opponent's 20-yard line. The fact that it happened from midfield rather than from the 20 is not an indication of additional team strength.
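The cap amounts to a simple clamp. A minimal Python sketch follows; the function name `cap_epa` and the sample values are mine, chosen only for illustration, and are not real play data.

```python
EPA_CAP = 2.0  # cap from the text, applied symmetrically


def cap_epa(epa: float, cap: float = EPA_CAP) -> float:
    """Clamp a single play's EPA to the range [-cap, +cap]."""
    return max(-cap, min(cap, epa))


# Illustrative values only: two ordinary plays, one huge gain, one huge loss.
raw = [0.4, -1.1, 6.5, -4.0]
capped = [cap_epa(x) for x in raw]
print(capped)  # [0.4, -1.1, 2.0, -2.0]
```

Ordinary plays pass through untouched; only the extremes are pulled back to the limit.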
Both of these limitations are arbitrary, and are open to revision. For now they represent my own intuitive sense of things. A turnover is typically worth about 4 EPA, so the cap throws away about half the impact of a turnover.
In summary, I've created a new team stat I'll call EPX: Expected Points-Experimental. It starts with EPA. Then it throws out trash time, ignoring it completely. Lastly, it caps EPA values for any single play at 2 points, limiting the statistical impact of very large plays.
The chart below ranks teams in terms of EPX through week 13 of the 2011 season. It includes both offense and defense. To generate the final results, I computed EPX per play, then multiplied that number by the average number of plays per game (about 108). The result is in terms of net point difference per game. GB's defense comes out as 21st in the league.
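The per-game scaling can be sketched in a few lines of Python. This is only my reading of the step: I'm assuming each play's value has already been filtered and capped, and that plays on defense are counted with the sign flipped so everything is from the team's point of view. The helper names are hypothetical.

```python
PLAYS_PER_GAME = 108  # approximate average quoted in the text


def team_play_values(offense_epx: list[float], defense_epx_allowed: list[float]) -> list[float]:
    """Combine both sides of the ball into one list from the team's perspective.

    Assumption: EPX allowed on defense counts against the team, so it is negated.
    """
    return list(offense_epx) + [-x for x in defense_epx_allowed]


def epx_per_game(play_values: list[float], plays_per_game: float = PLAYS_PER_GAME) -> float:
    """Average per-play EPX, scaled to an average game's worth of plays."""
    if not play_values:
        raise ValueError("no qualifying plays")
    per_play = sum(play_values) / len(play_values)
    return per_play * plays_per_game


# Illustrative values only, not real play data.
values = team_play_values([0.5, 0.25], [0.5, -0.25])
print(epx_per_game(values))
```

Scaling by a league-wide average rather than each team's own play count keeps the result comparable across teams with different numbers of qualifying plays.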
EPX isn't intended to replace any other stat here at ANS. It's just another perspective on the numbers surrounding team strength. EPX does correlate better than GWP at the moment with the team rankings derived from point spreads (as computed by Michael Beuoy), 0.80 vs 0.77.
Future improvements would include tweaking the arbitrary limits of 0.05 to 0.95 WP and 2.0 EPA, as well as applying opponent adjustments.