After an 11-yard sack, Donovan McNabb and the Philadelphia Eagles were backed up to their own 25-yard line. Down 17-14 to the Green Bay Packers with 1:12 remaining in the game, the top-seeded Eagles were on the verge of being eliminated in their first game of the 2003 playoffs. On 4th-and-26, the Eagles called a 25-yard slant. McNabb dropped back and threw a bullet to Freddie Mitchell in stride, converting and laughing in the face of probability. The Eagles then drove down the field and kicked a field goal, sending the game into overtime, where they would eventually win. What are the odds that a drive containing a 4th-and-26 from the 25 would end with a successful field goal? According to the Markov model, a mere 1 in 175.
Now some math jargon: A Markov chain is a type of stochastic process. A stochastic process is any process whose outcome we do not know in advance, but which we can estimate based on the probabilities of different events occurring. A simple example is flipping a coin over and over. We do not know how many heads or tails there will be, but we can make a guess based on the fact that each flip has a 50% chance of landing heads and a 50% chance of landing tails.
A Markov chain is special in that only the most recent event matters in predicting the future of the process. If a team has a 1st-and-10 from their own 20, it does not matter how they got there: a touchback, a converted 1st-and-10 from their own 10, an interception, etc. Based solely on the fact that they currently have a 1st-and-10 from their own 20, we can predict where they will end up next and how they will ultimately finish the drive.
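To make the "only the current state matters" idea concrete, here is a minimal sketch of a drive simulated as a Markov chain. The states and probabilities in `TOY_TRANSITIONS` are made up for illustration and are not taken from the model's data; the real model uses far more detailed states.

```python
import random

# Hypothetical toy chain: these states and probabilities are invented
# for illustration only, not drawn from real play-by-play data.
TOY_TRANSITIONS = {
    "1st down": {"1st down": 0.25, "2nd down": 0.60, "touchdown": 0.05, "turnover": 0.10},
    "2nd down": {"1st down": 0.35, "3rd down": 0.50, "touchdown": 0.05, "turnover": 0.10},
    "3rd down": {"1st down": 0.40, "punt": 0.45, "touchdown": 0.05, "turnover": 0.10},
}
ABSORBING = {"touchdown", "turnover", "punt"}  # drive-ending states


def next_state(current, rng):
    """Draw the next state using only the current state (the Markov property)."""
    options = TOY_TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()), k=1)[0]


def simulate_drive(start, rng):
    """Follow the chain from `start` until a drive-ending state is reached."""
    path = [start]
    while path[-1] not in ABSORBING:
        path.append(next_state(path[-1], rng))
    return path


rng = random.Random(0)  # seeded so the run is repeatable
path = simulate_drive("1st down", rng)
print(" -> ".join(path))
```

Note that `next_state` never looks at `path`: how the drive reached its current state plays no role in where it goes next.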
Now on to the creation of the model. The first step was to divide a drive into all possible situations and label each as a distinct state. The non-drive-ending states, also known as transient states, were defined by down, distance-to-go, and yard line. The field was divided into 20 zones of 5 yards each. Similarly, the distance-to-go was split into 5-yard increments, with all distances of more than 20 yards grouped under a single label, 20+. This grouping ensures a high enough frequency for every state; any state that never occurred in a game over the past 5 years would detract from the accuracy of the model. State frequencies ranged from 6 to 6,624, with an average of about 550 visits per state. In total, there were 340 transient states.
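As a rough sketch of how a game situation could be mapped to one of these states, the helper below buckets the distance-to-go and the field position. The function names, the use of "yards to the opponent's goal line" as the field-position input, and the exact bucket boundaries (1-5, 6-10, and so on) are assumptions for illustration; the article does not specify where the boundaries fall.

```python
def distance_bucket(to_go):
    """Bucket distance-to-go into 5-yard increments, with 20+ as a catch-all.

    Boundaries (1-5, 6-10, 11-15, 16-20, 20+) are an assumption.
    """
    if to_go < 1:
        raise ValueError("distance-to-go must be at least 1 yard")
    if to_go > 20:
        return "20+"
    low = 5 * ((to_go - 1) // 5) + 1
    return f"{low}-{low + 4}"


def yardline_zone(yards_to_goal):
    """Return a zone index 0-19; zone 0 is the 5 yards nearest the opponent's goal."""
    if not 1 <= yards_to_goal <= 99:
        raise ValueError("yards to goal must be between 1 and 99")
    return (yards_to_goal - 1) // 5


def state_label(down, to_go, yards_to_goal):
    """Combine down, distance bucket, and field zone into one state label."""
    if down not in (1, 2, 3, 4):
        raise ValueError("down must be 1 through 4")
    return (down, distance_bucket(to_go), yardline_zone(yards_to_goal))


# 4th-and-26 from a team's own 25 is 75 yards from the opponent's goal line.
print(state_label(4, 26, 75))
# 1st-and-10 from a team's own 20 is 80 yards from the opponent's goal line.
print(state_label(1, 10, 80))
```

Under this scheme, a 4th-and-26 and a 4th-and-30 from the same zone share one state, which is the kind of pooling that keeps every state's frequency high enough to estimate from.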
There are 9 possible drive-ending scenarios, known as absorbing states, which fit into the three categories listed above: scoring, giving the ball back, and the end of the half or game. The absorbing states are as follows: touchdown, field goal, safety, missed field goal, fumble, interception, turnover on downs, punt, and end of half or game.
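The transition probabilities between states have to come from somewhere. One natural approach, assumed here rather than confirmed as the model's actual method, is to use relative frequencies: count how often each state was followed by each other state in past games, then divide by the total number of visits. The function name and the sample observations below are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(observed):
    """Estimate transition probabilities as relative frequencies.

    `observed` is an iterable of (from_state, to_state) pairs, one per play.
    """
    counts = defaultdict(Counter)
    for src, dst in observed:
        counts[src][dst] += 1
    probs = {}
    for src, dsts in counts.items():
        total = sum(dsts.values())
        probs[src] = {dst: n / total for dst, n in dsts.items()}
    return probs


# Hypothetical observations, not real play-by-play data.
plays = [
    ("1st-and-10", "2nd-and-10"),
    ("1st-and-10", "2nd-and-10"),
    ("1st-and-10", "1st-and-10"),
    ("1st-and-10", "interception"),
]
probs = estimate_transitions(plays)
print(probs["1st-and-10"])
```

This also shows why rarely visited states are a problem: a state seen only a handful of times yields probabilities based on very few plays.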
Once we have the transition probabilities, through matrix manipulation, we can calculate the probability of being "absorbed" into any of the 9 drive-ending states. In addition, we can calculate the average remaining length of a drive, and expected points on any given drive. To play around with the model, go here.
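For readers curious about the matrix manipulation, here is a minimal sketch using the standard results for absorbing Markov chains. If Q holds the transient-to-transient probabilities and R the transient-to-absorbing probabilities, the absorption probabilities B solve (I - Q)B = R, and the expected number of remaining plays t solves (I - Q)t = 1. The two-state toy chain, its probabilities, and the point values are all invented for illustration, and valuing a touchdown at 7 assumes a made extra point.

```python
from fractions import Fraction as F


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination (a is n x n, b is n x m)."""
    n = len(a)
    # Build the augmented matrix [a | b]; the inputs are left untouched.
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Swap a row with a nonzero pivot into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical toy chain (made-up numbers, not the model's).
TRANSIENT = ["1st down", "3rd down"]
ABSORBING = ["touchdown", "punt"]
Q = [[F(1, 4), F(1, 2)],   # 1st down -> 1st down, 3rd down
     [F(1, 2), F(0)]]      # 3rd down -> 1st down, 3rd down
R = [[F(1, 4), F(0)],      # 1st down -> touchdown, punt
     [F(1, 8), F(3, 8)]]   # 3rd down -> touchdown, punt
POINTS = [7, 0]            # assumed point value of each absorbing state

n = len(Q)
i_minus_q = [[(1 if i == j else 0) - Q[i][j] for j in range(n)] for i in range(n)]

absorb = solve(i_minus_q, R)  # absorption probabilities per starting state
steps = [row[0] for row in solve(i_minus_q, [[F(1)] for _ in range(n)])]
expected_points = [sum(p * v for p, v in zip(row, POINTS)) for row in absorb]

for name, row, t, pts in zip(TRANSIENT, absorb, steps, expected_points):
    print(name, dict(zip(ABSORBING, row)), "plays:", t, "points:", pts)
```

In this toy chain, a drive starting on 1st down ends in a touchdown with probability 5/8 and lasts 3 plays on average. Exact fractions are used so the arithmetic can be checked by hand; a full-size model with hundreds of states would more typically use a numerical linear algebra library.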
You can see the Markov analysis of Jim Harbaugh's decision to "take the points" here.
Keith Goldner is the creator of Drive-By Football and Chief Analyst at numberFire.com, the leading fantasy sports analytics platform.