The relationship between batting average and BABIP


With the Cardinals and Red Sox having punched their tickets to the World Series this weekend, one particularly memorable Fall Classic moment comes to mind.

Charged with protecting a 2-1 lead, Mariano Rivera retired the side in the eighth inning of Game 7 in the 2001 World Series. We all know what happened next. The Diamondbacks tied the game in the ninth before Luis Gonzalez delivered a bases-loaded, one-out walk-off hit to give Arizona its first-ever World Series title.

It was arguably the most dramatic Fall Classic moment ever – and maybe the luckiest, too.

As our understanding of the game has improved since then, most notably highlighted in the movie Moneyball, sabermetricians have found dozens of new ways to evaluate and quantify baseball performance.

Some sabermetric stats, like the increasingly mainstream WAR (Wins Above Replacement), are valuable but difficult to calculate. Not so with BABIP (Batting Average on Balls In Play).

The formula is simple: (Hits – Home runs)/(At-bats – Strikeouts – Home runs + Sacrifice Flies). It measures what its name implies – how often a player registers a base hit when he puts the ball in play, excluding home runs and including sacrifice flies.
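As a quick illustration, the formula above translates directly into a few lines of Python. This is only a sketch: the function name, argument names and the sample stat line are made up for the example and do not belong to any real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play, per the formula above."""
    # Balls in play: at-bats that did not end in a strikeout or a home run,
    # plus sacrifice flies (which are not counted as at-bats).
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play; BABIP is undefined")
    # Hits on balls in play: every hit except home runs.
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, for illustration only.
example = babip(hits=180, home_runs=25, at_bats=600, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```

Here the hypothetical hitter put 480 balls in play and got 155 non-homer hits on them.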

BABIP aims to quantify, in a way, how lucky a batter is. It accounts for base hits like Gonzalez’s World Series-winning blooper. Theoretically, a high BABIP indicates that a player has been getting “lucky” at the plate and that his batting average will eventually regress to the mean – what it would be without any luck.


The relationship between batting average and BABIP

During the five years from 2008-12, there were 750 player-seasons with at least 500 plate appearances. Across them, batting average and BABIP had a correlation of 0.769 (on a scale from -1 to 1), indicating a predictably strong relationship.
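For anyone who wants to run a similar check on their own data, here is a minimal sketch of a correlation calculation. It assumes the figure above is a standard Pearson correlation, which the article does not specify. The function name and the sample numbers are invented for illustration and are not the 2008-12 data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of the same length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up batting averages and BABIPs, for illustration only.
batting_averages = [0.250, 0.275, 0.300, 0.265, 0.310]
babips = [0.290, 0.310, 0.330, 0.320, 0.345]
print(round(pearson(batting_averages, babips), 3))
```

A result near 1 means the two stats rise and fall together, a result near -1 means they move in opposite directions, and a result near 0 means there is little linear relationship.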

What wasn’t as predictable was the fact that those with higher batting averages actually tended to have lower BABIP-batting average ratios than those with lower batting averages.

The 18 players from 2008-12 who hit at least .330 in a season had an average BABIP-to-batting-average ratio of 1.06, while the 11 players who hit below .220 in a season had an average ratio of 1.19. This bar chart illustrates this trend.


Using BABIP to predict a player’s batting average

Intuitively, a player’s batting average in a given year should be a pretty good indicator of what his batting average will be the following season. And this is true. What may not be as intuitive is the fact that BABIP – supposedly a measure of random luck – greatly improves the ability to predict the next season’s batting average.

Using only a player’s batting average, you come within an average of 40 points when projecting his batting average the next year. Incorporating the player’s BABIP from that same year, though, brings you to within an average of 20 points of next year’s batting average – a twofold improvement in your predictive accuracy.

Like BABIP itself, the formula here is pretty simple. Multiply a player’s BABIP by -1.5, add 1.5, then multiply that number by the player’s batting average from that same season, and you have an improved projection of what his batting average will be the next year.
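Written out, the recipe above is: projected average = (1.5 - 1.5 × BABIP) × batting average. A short Python sketch of it follows. The function and argument names are made up for this example, and the inputs are the Alex Rios 2011 figures (.227 average, .237 BABIP) cited later in this piece.

```python
def project_batting_average(prev_avg, prev_babip):
    """Project next season's batting average from this season's average and BABIP."""
    # Multiply BABIP by -1.5 and add 1.5 to get the adjustment factor.
    factor = 1.5 - 1.5 * prev_babip
    # Scale this season's batting average by that factor.
    return factor * prev_avg


# Alex Rios in 2011: .227 batting average, .237 BABIP.
projection = project_batting_average(prev_avg=0.227, prev_babip=0.237)
print(f"{projection:.3f}")
```

For Rios this works out to roughly .260, a move in the direction of the .304 he actually hit in 2012. Note that the adjustment factor equals exactly 1 when BABIP is one-third (about .333), so the formula projects a rise for any BABIP below that mark and a drop for any BABIP above it.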


BABIP theory proved true – hitters don’t stay lucky forever

Back to what BABIP is supposed to do: measure how lucky a hitter is.

Looking back at those five years (2008-12), you find that the thinking behind BABIP’s purpose is sound. Those with high BABIPs one year tend to have lower batting averages the following year. More to the point, those with especially high BABIPs tend to have much lower batting averages the following season (and vice versa).

Of the 466 players over those five years who had at least 500 plate appearances in consecutive seasons, 14 had a BABIP of at least .370 in the first of those seasons. Their batting averages dipped by an average of 31 points the next year. Another 19 of those 466 players had BABIPs lower than .260, and their batting averages rose by an average of 27 points.

In fact, only seven of the 50 players (14 percent) who had a BABIP of at least .350 one year had higher batting averages the next year. And 17 of the 19 players (89.5 percent) who had a BABIP lower than .260 one year had a higher batting average the following season.

Here is how those with the five highest BABIPs fared from one year to the next. All of their batting averages fell, most of them dramatically.

And here is how those with the five lowest BABIPs from 2008-2012 improved their batting averages from one year to the next, all by at least 29 points. That includes Alex Rios, who went from hitting .227 in 2011 with a minuscule .237 BABIP to batting .304 in 2012. His 77-point leap in batting average was the biggest anyone made from 2008-2012.

In case you were wondering, Braves third baseman Chris Johnson (.394), Twins catcher Joe Mauer (.383) and Rockies outfielder Michael Cuddyer (.382) led the Majors in BABIP this past season, so you might expect less productive seasons from them in 2014.

Meanwhile, look out for Cubs second baseman Darwin Barney (.222), Braves second baseman Dan Uggla (.225) and Orioles catcher Matt Wieters (.247) – those with the three lowest BABIPs among qualified players – to bounce back next year.