Tuesday, May 4, 2010

Fun with BABIP Regression

As I'm sure most of our dutiful readers already know, small sample sizes are perilous and cannot be trusted. There are countless reasons for this, and I doubt anyone has the time or energy to wade through an exhaustive list (I certainly don't want to write one). That said, batting average on balls in play (BABIP) is the quickest, dirtiest illustration of the problem. With that in mind, I thought that I would look at a few Yankees who are bound to see a drastic change in their performance as their respective BABIPs regress to the mean. To do that, I'll compare their BABIPs to their expected BABIPs (xBABIP).

The Risers

Mark Teixeira
  • BABIP - .221
  • xBABIP - .322
Regressing Teixeira's BABIP to the mean would add a staggering 78 points to his batting average, raising it from .189 to .267. While that may not be on par with the Teixeira that we know and love, it's much more palatable. It's also worth noting that his HR/FB is about eleven percent lower than his career norm, meaning that a real coming-out party should be right around the corner.
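For anyone curious about the arithmetic behind an adjusted average like this one, here is a minimal Python sketch. It assumes the standard BABIP definition, (H - HR) / (AB - K - HR + SF), and simply replaces the actual hits on balls in play with xBABIP times balls in play. The function names are hypothetical and the stat line is made up for illustration; it is not Teixeira's actual line.

```python
def balls_in_play(ab, k, hr, sf):
    """Balls in play under the standard BABIP denominator: AB - K - HR + SF."""
    return ab - k - hr + sf


def babip(ab, h, hr, k, sf):
    """Observed BABIP: non-homer hits divided by balls in play."""
    return (h - hr) / balls_in_play(ab, k, hr, sf)


def regressed_avg(ab, hr, k, sf, xbabip):
    """Batting average if hits on balls in play fell at the xBABIP rate.

    Home runs are kept as-is because they are not balls in play.
    """
    expected_hits = xbabip * balls_in_play(ab, k, hr, sf) + hr
    return expected_hits / ab


if __name__ == "__main__":
    # Hypothetical stat line, NOT any real player's numbers.
    ab, h, hr, k, sf = 100, 20, 3, 20, 1
    print(f"actual AVG:    {h / ab:.3f}")
    print(f"actual BABIP:  {babip(ab, h, hr, k, sf):.3f}")
    print(f"regressed AVG: {regressed_avg(ab, hr, k, sf, xbabip=0.322):.3f}")
```

One consequence of this setup is that the change in average equals the change in BABIP scaled by the share of at-bats that end with a ball in play, which is why high-strikeout hitters gain fewer points of average from the same BABIP correction.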

Curtis Granderson
  • BABIP - .262
  • xBABIP - .353
I'm hoping that this will at least temporarily stymie the hindsight patrol with regard to the dealing of Austin Jackson (whose BABIP is around 150 points higher than expected). Regressing Granderson's BABIP to the mean results in his average jumping from .221 to .282. Assuming the added hits are all singles, Granderson's triple-slash line would be .282/.352/.436. While that's still a bit below what we'd all like to see, it lends an iota of hope, at the very least.
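The "all singles" assumption can be sketched in a few lines of Python. Each added single counts as one extra hit and one extra total base, so AVG and SLG rise by the same amount, while OBP rises a little less because its denominator also includes walks, hit-by-pitches, and sacrifice flies. The `BattingLine` class, the `slash` function, and the numbers below are all hypothetical; they are not Granderson's actual stats.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    ab: int   # at-bats
    h: int    # hits
    bb: int   # walks
    hbp: int  # hit by pitch
    sf: int   # sacrifice flies
    tb: int   # total bases


def slash(line, extra_singles=0.0):
    """Return (AVG, OBP, SLG) after adding `extra_singles` hits.

    Outs are converted into singles, so AB and plate appearances are unchanged.
    """
    hits = line.h + extra_singles
    total_bases = line.tb + extra_singles  # a single is worth one base
    avg = hits / line.ab
    obp = (hits + line.bb + line.hbp) / (line.ab + line.bb + line.hbp + line.sf)
    slg = total_bases / line.ab
    return avg, obp, slg


if __name__ == "__main__":
    # Hypothetical line, NOT any real player's numbers.
    line = BattingLine(ab=100, h=22, bb=10, hbp=1, sf=1, tb=38)
    before = slash(line)
    after = slash(line, extra_singles=6)
    print("before: " + "/".join(f"{x:.3f}" for x in before))
    print("after:  " + "/".join(f"{x:.3f}" for x in after))
```

Treating every added hit as a single is the conservative choice: if some of those extra hits went for doubles or triples, the slugging percentage would climb further.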

David Robertson
  • BABIP - .556
  • xBABIP - .227
Robertson is my dark horse for the dreaded 'Heir to Mo' discussion… and he's seemingly becoming darker and darker by the day. This, however, makes me feel a bit better. Robertson's batting average against stands at .408 - regression drops it all the way down to .190. Factor in his improved control and still-dominant strikeout numbers, and there's still a whole lot to like here.

The Fallers

Phil Hughes
  • BABIP - .162
  • xBABIP - .247
Luckily, Hughes' xBABIP is still very low - he simply hasn't been hit too hard too often this year. His batting average against would rise to a still-stellar .157 (it's .124 now) with regression, which leaves me feeling pretty confident. The more pressing concerns, in my mind, are his HR/FB (sitting at 3.3%, where league average tends to be 10%) and his LOB% (87%, where it tends to be closer to 72%).

Marcus Thames
  • BABIP - .556
  • xBABIP - .262
This is the small sample size to end all small sample sizes… but I really don't like Thames, and I'm really worried about him lumbering around left field for the next couple of weeks. A bit of regression with Thames drops his average from .458 to .167. Here's hoping his luck continues.

Robinson Cano
  • BABIP - .365
  • xBABIP - .338
Breathe deep, Yankee fans - it's going to be alright. Cano's swing data doesn't show him making a flukish amount of contact or anything of the sort, so this career year does appear to be for real. His HR/FB is unsustainable, but he's hitting a ton of line drives, which means the hits should keep coming. Regression would drop his average to .355… I'll take it.

I must note that this is far from an exact science - nothing ever goes exactly by the numbers in baseball. Zack Greinke maintained a 4.5% HR/FB in 2009. Joey Votto sustained a .372 BABIP (fifty points higher than his xBABIP). The likelihood of these things happening isn't terribly high, but, with the exception of Thames, nothing here is really an impossibility.
