Tomlin's 4th Down Call vs. Baltimore

Reader Borat asks: With 2:34 remaining and a 4-point lead, PIT faced a 4th and 5 at the BAL 29. Tomlin is taking a lot of heat from Steelers fans about his handling of the situation. Have you done an analysis?

At that point, Tomlin should have preferred a FG attempt (0.77 total WP) to a punt (0.75 total WP). The league average for that distance is 64%, and a successful kick would have made it a 7-point game, assuring at least overtime. But either decision could be defensible depending on the particulars--most importantly the expected resulting field position of the punt and likelihood of success of a FG attempt.

But Tomlin wavered, unsure of his team's new holder, a recently signed punter. He may have been correct because a drop of only a few percentage points in FG probability makes the decision a wash. The play clock ran out, and PIT took a delay of game penalty.

At this point, Tomlin felt he had no choice but to punt. And the numbers reflect this. At 4th and 10 from the BAL 34, the punt is the better option, 0.75 WP to 0.74 WP.

But, at the risk of sounding like a broken record, I think he should have gone for it. Yes, gone for it even on 4th and 10. Here's why.

A 4th and 10 conversion is typically a 35% proposition, not terribly worse than the 52% shot at a 49-yd field goal. And the payoff of a successful conversion is virtually certain victory, while the risk is only a 25-yd shorter field for BAL's offense. It's actually somewhat safer than a missed FG because then BAL would start at their 41 instead of, most likely, the 34.

In fact, PIT would have needed only a 15% chance of converting to make the gamble worthwhile. For reference, Roethlisberger is 40% successful on 3rd and 10 (in 111 attempts), and 33% successful on 4th and 10 (2 for 6). PIT may not have had a 35% chance vs. the BAL defense, but they almost certainly had better than a 15% shot, especially considering the possibility of a defensive penalty.
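The 15% break-even figure can be checked with a few lines of arithmetic. This is a minimal sketch, not the WP model itself: it assumes the post's rounded win-probability figures (0.98 after a conversion, 0.71 after a failed attempt, 0.75 for the punt), and the function name is made up for illustration.

```python
def break_even_conversion_rate(wp_success, wp_fail, wp_alternative):
    """Conversion probability p at which going for it matches the alternative.

    Solves p * wp_success + (1 - p) * wp_fail = wp_alternative for p.
    """
    return (wp_alternative - wp_fail) / (wp_success - wp_fail)


# Rounded WP figures taken from the post (assumed inputs, not model output).
p = break_even_conversion_rate(wp_success=0.98, wp_fail=0.71, wp_alternative=0.75)
print(f"Break-even conversion rate: {p:.3f}")
```

Any conversion chance above that threshold makes going for it the higher-WP choice under these inputs.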

Here's how it breaks down:

0.35 * 0.98 + (1-0.35) * 0.71 = 0.80 WP

               Go4it   Punt   FG Att
Success Rate:   0.35      -     0.52
WP Success:     0.98   0.75     0.80
WP Fail:        0.71      -     0.66
WP Total:       0.80   0.75     0.74
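The totals in the table are probability-weighted averages of the success and failure outcomes. As a rough sketch, assuming only the rounded inputs shown in the table (the function and dictionary names are illustrative), they can be recomputed like this:

```python
def expected_wp(p_success, wp_success, wp_fail):
    """Expected win probability of an option with two possible outcomes."""
    return p_success * wp_success + (1 - p_success) * wp_fail


# Inputs are the rounded values from the table; the punt is treated as a
# single outcome, so its total is just its WP.
options = {
    "Go4it": expected_wp(0.35, 0.98, 0.71),
    "Punt": 0.75,
    "FG Att": expected_wp(0.52, 0.80, 0.66),
}

for name, wp in options.items():
    print(f"{name:7s} {wp:.4f}")

best = max(options, key=options.get)
print("Highest expected WP:", best)
```

With these rounded inputs the FG attempt works out to roughly 0.73 rather than the 0.74 shown, presumably because the table's inputs were themselves rounded. The ordering of the three options is unchanged either way.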

21 Responses to “Tomlin's 4th Down Call vs. Baltimore”

  1. Anonymous says:

    The real question is how the hell did Tomlin not call time out? Do you have a number for how much that penalty cost them? I.e., how much of Tomlin's salary should be docked.

  2. Sam's Hideout says:

    From your WP calculator, going for it on the 4th and 5 and failing has a WP of 0.74, only slightly worse than punting. Succeeding essentially puts the game away of course.

  3. bob says:

    seems odd that the same column that declares .77 and .75 to be an actionable difference and even to conclude the coach was wrong, yet declares .35 and .52 not terribly different.

  4. Brian Burke says:

    nice version:
    bob-I'm sorry if I wasn't clear when I wrote "either decision could be defensible depending on the particulars."

    other version:
    bob-You're a moron. Try to read the article without skipping sentences next time. What do you not understand about the modifier "terrible"? .35 and .52 are 17 percentage points apart. The point is that the FG isn't the safe bet people think it is, and the 4th and 10 attempt isn't the long shot people think it is.

  5. bob says:

    These numbers seem to be off a bit.

    "a FG attempt (0.77 total WP) to a punt (0.75 total WP). "

    Go4it Punt FG Att
    WP Total: 0.80 0.75 0.74

    and, my conclusion, is that all these numbers are the same, in terms of predictive power.

    The only difference is that going for it has a 15 to 30 % chance of making Tomlin look like he blew it, whereas a FG attempt has a >50% chance of tomlin looking like he made the right call, and the punt had a 75% chance of making tomlin look like he made the right call.

  6. bob says:

    bob-You're a moron.

    sorry for making a comment. The point is right on however. You have concluded that Tomlin is wrong and he should have gone for it. There is no predictive power in this analysis.

    Here is what you are doing: you are measuring a coin flip, and you find it is heads 48% of the time (480 out of 1000), then you take another coin and you find it is heads 51% of the time (510 out of 1000). You then conclude that Tomlin is wrong if he calls heads for coin A, and he should have used coin A.

  7. bob says:

    er, and he should have used coin B

  8. bob says:

    ok, the slightly different numbers were presumably due to the difference before and after the PIT penalty that moved them back 5 yds to the 34.

  9. Anonymous says:

    Bob makes a fair point, what is the margin of error on your WP model? How big are your samples on these situations?

  10. J.R. says:

    I think Tomlin was right to try to pin the Ravens deep. Remember that the Ravens' offense had been very effective from the shotgun, and Flacco is always a threat to get one shot deep down the field, so having that extra forty-yard cushion -- and the Steelers' defense on the field! -- is a pretty solid bet.

  11. Anonymous says:

    Well, Tomlin presumably has a lot more information about his team than shown in these statistics, so it is foolish to say he made the right or the wrong decision, since the amount of information you have affects the stats.

    But I am curious about the difference in the WP from a failed FG and a failed 4th and go. Why a 5pp difference?

  12. Ian Simcox says:

    Bob - “Here is what you are doing, you are measuring a coin flip, and you find it is heads 48% of the time (480 out of 1000), then you take another coin and yo find it is heads 51% of the time (510 out of 1000). You then conclude that Tomlin is wrong if he calls heads for coin A, and he should have used coin A.”

    Bob, that only works if you assume the coins are both fair. Let’s say though that you know that one coin gives a head 48% of the time and the other gives a head 51% of the time. You toss the coins and get your result, A has 480 heads, B has 510 heads. The question is, what is the probability that coin B is the better choice given that I want to maximise my chances of calling a head? (97% in this case)

    Ultimately that’s what these analyses are about. We can never know the exact numbers, but we can see when the data is suggesting overwhelmingly that going for it is more likely a better idea than punting.

  13. bob says:

    Ian, in that example yes I am assuming they are fair.

    and in fact, they were both the same coin, I merely changed the label! lol, a bit of a joke there, but it illustrates the point. Even in the ideal world where we drew samples from a perfect and stationary (i.e. never changing) probability distribution, like a coin, we can never have perfect knowledge of what that underlying distribution is based solely on realizations.

    For the NFL, with different rules, different players, and so many chaotic outcomes (where a small change in circumstances leads to a huge difference in the outcome), then one has to keep that into account when looking at the numbers.

    I directly stated two cases, but was really thinking of all three cases (and should have stated that more clearly). There was 80%, 75%, and 74% chance of winning at that point. In terms of using this information for a gametime decision, they all basically have to be treated as indistinguishable especially in the context of what the pittsburgh steelers will do on this night against the baltimore ravens.

    The conclusion should be, when up 4 points with 2:30 left in the game, you are likely to win no matter what.

    The point is not going for 4th down is a good idea, the point is anything you do in that situation is a good idea. going for 4th down is not 'terribly different' from simply taking a knee on 4th down.

    punting on 3rd down probably would not make much of a difference. nor would a FG on 3rd down. obviously, there are several insane plays that could get a coach fired that would not have significantly changed the win probability.

    That is the context of the question.

    and as I also pointed out, missing on 4th and 10 is the most likely outcome (ranging from 85% to 65 %), and would have been a disaster for the coach.

    either 1) a FG or FG miss, and the win, or 2) a successful FG, but a loss, or 3) a punt and a win, would lead to a favorable outcome for the coach in terms of opinions on how well he did.

  14. bob says:

    also, fyi. in 1000 'ideal coin flips' the number of heads being greater or equal to 510 is 27.4%.

    >= 505 heads, almost 40% of the time.

    It is exceedingly unlikely to get exactly 500 heads, less than about 2% of the time.

    So, when looking at Big Ben and seeing he is 2 for 6, well i'd estimate his range as not 33% but rather 2 +- sqrt(6).

    So, his range of possibility on making that 4th and 10 would be 0% to 70%, 90% of the time.

    for 3rd down numbers (and 10 yds) , he is 44 for 111, so about between 30% and 59% most of the time.

  15. Adam H says:


    But regardless of the uncertainty, your BEST GUESS is that going for it is better.

    If you know that you don't know how a coin is weighted, and all you know is that 490 tails and 510 heads came up, your best guess should be that the coin has a 51% chance of landing heads.

  16. bob says:

    adam. regarding your "best guess", it is extremely likely that it is incorrect.

    for instance, as you say, the "BEST GUESS" is that the coin has a 51% chance of landing heads. That is wrong.

    The next 1000 flips are extremely unlikely to come up exactly 510 heads, and there is a 75% chance of fewer than 510 heads.

    and keep in mind, the NFL is vastly more complicated than a coin: its underlying probability changes week to week, players change, coaches change, rules change, and the number of points you use to derive your best guess is far lower than 1000.

    I doubt there has ever been a case of big ben on 4th and 10 with 2:35 against baltimore in the past.

    and, even if there was, you would need several thousand to get some good estimates on the many varied outcomes.

    and even then, you'd probably come to the conclusion that going for it, fg, or punt were all similarly likely.

  17. Jonathan says:

    Brian has a much larger data set than the hypothetical 1000 coin flips you keep bringing up. He has 10 years of data from every NFL game. That's 2,560 entire games--not including playoffs.

    Besides...if you were told that a certain outcome had a 51% chance of happening, the best guess is that it's probably going to happen. If the true probability is 50.5%, that doesn't mean you're wrong. It means you're correct--you said it probably will happen, and lo, it's probably going to happen.

  18. Anonymous says:

    How damaging was the 3rd down play.
    The incomplete pass stopped the clock.
    A running play would have moved the clock to the 2min warning or more likely forced a timeout and improved the odds of a FG or 4th down attempt.

  19. Brian Burke says:

    The pass incompletion was only marginally consequential. 2 min + 1 timeout is plenty of time in today's NFL to move 90+ yds. The most important thing was to get the conversion. That would have sealed the win for PIT. I think the pass gave them the best chance to do that.

  20. bob says:

    Jonathan, how many 4 and 10s on the 30 yard line with 2:30 left in the game up by 4 points does he have?

    the point i make is pretty simple and straightforward.

    1) the probability ranges overlap

    2) "past performance does not indicate future results".

    3) bonus question . now that the team lost with a punt in that exact situation, how much does that change the probabilities when you recalculate? Punting would logically now have to be the poor choice to make, compared to a fg.

  21. Smerdyakov with a Guitar says:

    So wait, Bob: You mean Brian's model can't perfectly analyse what would have happened under counterfactuals? That's it -- I'm leaving this worthless site for good!
