Previously, I made the case that YAC (yards after catch) belongs to a receiver's abilities, and that a quarterback does not contribute to it by virtue of his accuracy.
In a recent exchange at the Football Outsiders site, Alex, a well-informed Brett Favre fan, challenged my conclusion that YAC belongs to the receiver. Although I spent much of the discussion tweaking him about Favre's poor numbers, most of it centered on McNabb vs. Garcia. Alex was suspicious because my rankings considered Garcia the better QB in '06: Garcia's completions were deeper downfield, while McNabb's were shorter completions with lots of YAC. (Ultimately McNabb comes out on top after considering rushing yards and fumbles.)
Alex pointed to an article by Aaron Schatz of Football Outsiders that implied that QBs, rather than receivers, control YAC. Mr. Schatz's research tends to be very heavy on data but very weak on analysis. His article looks at QB YAC in '05 and '06. He found a 0.33 year-to-year correlation in YAC for QBs as a whole, and a 0.41 correlation for QBs on the same team both years. The author concludes, "For the NFL, that's reasonably consistent."
Actually, for the NFL or for anything, that's reasonably meaningless. First, statisticians grit their teeth whenever a couple of correlations by themselves are used to draw a conclusion. Second, for a sample size of, say, the top 32 QBs (n=32), the critical value for a Pearson correlation to be significant at the conventional 0.05 level is about 0.35. So the article's 0.33 correlation for all QBs is probably not even significant.
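For readers who want to check the significance threshold themselves, here is a minimal sketch using only Python's standard library. It relies on the Fisher z approximation rather than an exact t-test, and it assumes a two-tailed test at the 0.05 level; the function names `critical_r` and `p_value` are my own, not anything from the article.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def critical_r(n, alpha=0.05):
    """Approximate smallest |r| that is significant (two-tailed) with n pairs."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Fisher transform: atanh(r) is roughly normal with sd 1 / sqrt(n - 3).
    return tanh(z_crit / sqrt(n - 3))


def p_value(r, n):
    """Approximate two-tailed p-value for a Pearson r computed from n pairs."""
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The n=32 case discussed above, with the article's 0.33 correlation.
print(f"critical r for n=32: {critical_r(32):.2f}")
print(f"p-value for r=0.33, n=32: {p_value(0.33, 32):.3f}")
```

An exact t-based test would give slightly different figures for samples this small, but the approximation is close enough to show which side of the line 0.33 falls on.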
In the article, the author goes on for several paragraphs explaining why QB after QB jumps from the bottom of the YAC rankings to the top or from the top to the middle, and so on. I think if he stood back from what he wrote, he'd see that his data should be leading him to the opposite conclusion. Here is a sample:
“Last year’s top quarterback in YAC was Jake Delhomme, and he’s fallen to the middle of the pack this year…The rest of this year’s top five: Daunte Culpepper, David Garrard, Mark Brunell, and Brett Favre. Brunell was third last year, but Garrard was near the bottom of the YAC rankings last year…Garrard went from 43rd to third, and Leftwich went from 33rd to eighth…Tom Brady was one of last year’s leaders, but he’s middle of the pack this year…The bottom five: Garcia, Matt Hasselbeck, Joey Harrington, Peyton Manning, Steve McNair. All of those guys were middle of the pack in 2005 except Hasselbeck…There are a lot of other guys who are near the bottom in YAC both years, though — they just aren’t bottom FIVE”
Also note that his explanations for the two biggest surprises, Delhomme and Brady, both came down to their receivers. If YAC really were QB-dependent, receivers should be (somewhat) interchangeable.
To be generous, let's assume his numbers are significant. The correlation of 0.41 for QBs who remained on the same team would be based on a smaller sample. I don't have his data, but we'd need a sample of about n=24 to make 0.41 significant at the conventional 0.05 level. It's a reasonable assumption that at least that many QBs were on the same team in both years, which would probably make the 0.41 number statistically significant. Even so, that undermines his case, because QBs on the same team in both years are throwing to mostly the same receivers. This does not support the conclusion that YAC is consistent in QBs.
Remember that the explanatory power of a correlation does not scale linearly with the coefficient. You square the correlation coefficient (r) to get the proportion of variance accounted for (r-squared). So an r of 0.41 means that only about 17% of the variance is accounted for by the various combinations of both QB and receivers from '05 and '06. An r of 0.33 equates to about 11% of the variance. How much could the QB alone account for? Very little, if any.
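The squaring step is easy to check. The short sketch below just squares the two correlations quoted from the article; `variance_explained` is a name I made up for the illustration.

```python
def variance_explained(r):
    """Proportion of variance accounted for by a correlation of r."""
    return r ** 2


for r in (0.33, 0.41):
    print(f"r = {r:.2f} -> r-squared = {variance_explained(r):.3f}")
# r = 0.33 -> r-squared = 0.109
# r = 0.41 -> r-squared = 0.168
```

Because of the squaring, a correlation has to double before it explains four times as much variance, which is why modest-looking differences in r matter more than they first appear.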
I repeated a similar analysis, but this time for receivers from year to year. I studied all receivers with receptions in each season from '02 to '06, for an n=148 (critical r for significance is 0.16). I calculated the correlation between consecutive years and between non-consecutive years (between '02 and '05 for example). This is regardless of who the QB or team was for each receiver, so these numbers should be compared to the 0.33 number calculated by the author. The highest correlation I calculated was 0.57, and the lowest was 0.41. The average correlation among all years was 0.47. [Note: these correlations are actually understated. Here is an updated analysis.]
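The year-to-year comparison can be sketched roughly as follows. The YAC figures here are invented placeholders for six hypothetical receivers over three seasons, not my actual data set, and the use of YAC per reception is an assumption made only for the example. The point is the procedure: correlate every pair of seasons, consecutive or not, then average.

```python
from itertools import combinations
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL YAC-per-reception values. Position i in each list is the
# same receiver in every season.
yac_by_season = {
    2002: [5.1, 3.2, 4.4, 6.0, 2.9, 4.8],
    2003: [4.7, 3.9, 4.1, 5.2, 3.5, 4.0],
    2004: [5.5, 3.0, 3.8, 5.9, 3.1, 4.6],
}

# Correlate every pair of seasons, consecutive and non-consecutive alike.
pair_r = {
    (a, b): pearson_r(yac_by_season[a], yac_by_season[b])
    for a, b in combinations(sorted(yac_by_season), 2)
}
average_r = sum(pair_r.values()) / len(pair_r)

for (a, b), r in pair_r.items():
    print(f"{a} vs {b}: r = {r:.2f}")
print(f"average r = {average_r:.2f}")
```

With real data the lists would hold one entry per receiver who caught passes in every season studied, regardless of who his QB or team was.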
So the stronger correlation by far is for the receiver. But keep in mind this isn't a competition between QB and receiver over who gets credit. The real question at hand is whether the QB should get credit at all. My original thesis was that measures of QB performance are better off not including YAC.
Think of it this way. We're partitioning the total variance of YAC: how much belongs to the QB, and how much belongs to other things. Assuming each contribution is independent, we can conceptualize it this way:
var(YAC) = var(QB) + var(rec) + var(defense) + var(random)
In other words, the total variance in YAC is due to the QB's contribution, the receiver's contribution, what the opposing defenses did or didn't do on each play, plus variance from random error.
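To see why independence lets the variances simply add, here is a small simulation sketch. The standard deviations are arbitrary numbers picked only for illustration, with the QB component deliberately tiny; they are not estimates from any data.

```python
import random
from statistics import pvariance

rng = random.Random(0)  # fixed seed so the run is repeatable
N_PLAYS = 50_000

# HYPOTHETICAL standard deviations (in yards) for each independent source.
component_sd = {"QB": 0.2, "rec": 1.0, "defense": 1.5, "random": 2.0}

# Draw each component independently for every simulated play.
draws = {
    name: [rng.gauss(0.0, sd) for _ in range(N_PLAYS)]
    for name, sd in component_sd.items()
}

# YAC on each play (as a deviation from the average) is the sum of the parts.
yac = [sum(parts) for parts in zip(*draws.values())]

total_var = pvariance(yac)
component_var = {name: pvariance(vals) for name, vals in draws.items()}
sum_of_parts = sum(component_var.values())

print(f"var(YAC) = {total_var:.2f}, sum of component variances = {sum_of_parts:.2f}")
for name, var in component_var.items():
    print(f"{name:8s} share of variance: {var / total_var:.1%}")
```

The two totals agree up to sampling noise. If the components were correlated, for example if certain QBs were systematically paired with certain receivers, covariance terms would appear and the simple sum would no longer hold.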
So far we can be fairly certain about two things. Var(QB) is very small or zero, and var(rec) is significantly greater than var(QB). The rest of the variance, and possibly the vast majority of it, is due to defense and randomness. But no matter how the rest of it is divided up, in the final analysis, it is very unlikely any significant amount belongs to the QB.
Also consider that measures of QB accuracy, namely completion percentage and interception rate, are not statistically significant in a regression of YAC. So while I may have been mistaken to say YAC belongs only to the receiver, I was correct in saying it does not belong to the QB. The more accurate conclusion is that YAC belongs somewhat to the receiver, a lot to the opposing defenses and random variation, and very little, if at all, to the QB.
[Further analysis, which shows that receivers really do own YAC, can be found here.]