Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 667: The New Frontier in Pitching Statistics
Episode Date: April 30, 2015
Ben and Sam talk to Jonathan Judge about BP’s new pitching metrics and the challenges and opportunities ahead of statisticians who are still trying to uncover new truths about baseball....
Transcript
You get what you deserve
You are the proud of what it's worth
And you're part of the world I love
Good morning and welcome to episode 667 of Effectively Wild, the daily podcast from Baseball Prospectus
presented by the Play Index at Baseball-Reference.com.
I am Ben Lindbergh of Grantland, joined by Sam Miller of Baseball Prospectus.
Hello, Sam.
Hello, Ben.
We are going to talk about a new pitching statistic today, why it exists, how it works,
how we should judge pitching statistics, what this one includes that others have not previously.
Baseball Prospectus rolled this out yesterday. It's called Deserved Run Average, DRA,
and it's going to be the backbone
of BP's pitching value stat
from just about this point forward.
So we are talking today to Jonathan Judge,
who was an integral member of the team
that produced this stat,
along with Harry Pavlidis and Dan Turkenkopf
and Rob McQuown and others who were involved. But Jonathan is the front man of this effort. So
hello, Jonathan.
Hey, Ben.
So tell us why we needed another pitching stat, because this is the
part of the discussion where the old school guy who doesn't like the new school stats starts reeling
off fake acronyms to mock the actual acronyms that exist and are in most cases almost as ridiculous
sounding. And there's a whole suite of those, and people know FIP and xFIP and SIERA and all of
these existing advanced pitching statistics. So what was the thing that you wanted to do that those stats did not do?
How does deserved run average set itself apart?
Well, the number one thing we wanted to do was to sort of take advantage of some things
we had been recently discovering, which was that the so-called mixed models could add a lot of value to our pitcher metrics,
in which instead of sort of averaging everybody and just sort of assuming that everyone's at-bats were the same,
everyone's home runs were given up to the same people or similar people,
and that it would just sort of even out over time,
we noticed that when you started accounting for the fact that
giving up a home run to Nelson Cruz was a lot more excusable than giving one up to, say, Ben Revere,
that the numbers started reflecting that, and they started getting a lot more accurate.
And so we had sort of been doing this in baby steps with catcher framing. And then I wrote a
piece about it, applying it to FIP, Fielding Independent
Pitching, over at the Hardball Times. And then in March, they basically asked me and said,
would you like to see if we can develop an entire pitcher metric for everything that goes on instead
of just framing or just strikeouts and walks and such? And I said, well, I don't know, but we would be happy to give it a try.
So we've spent the last two months working pretty much every day,
and we found we could not only do it, but that it works really well.
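To make the opponent-quality idea concrete, here is a minimal Python sketch. It is not the mixed model described in the conversation; it is a much simpler adjustment that compares the home runs a pitcher allowed with the number you would expect from the batters he happened to face. The batter labels, the per-plate-appearance home run rates, and the function name are all invented for illustration.

```python
from collections import namedtuple

# One plate appearance: who batted, and whether it ended in a home run (1 or 0).
PlateAppearance = namedtuple("PlateAppearance", ["batter", "home_run"])

# Hypothetical home run rates per plate appearance (made-up numbers).
BATTER_HR_RATE = {"power_hitter": 0.06, "contact_hitter": 0.005}


def home_runs_above_expected(plate_appearances, batter_hr_rate):
    """Actual home runs allowed minus the total expected from the batters faced."""
    actual = sum(pa.home_run for pa in plate_appearances)
    expected = sum(batter_hr_rate[pa.batter] for pa in plate_appearances)
    return actual - expected


# Two pitchers each allow exactly one home run in 50 plate appearances,
# but against very different opposition.
vs_power = [PlateAppearance("power_hitter", 1 if i == 0 else 0) for i in range(50)]
vs_contact = [PlateAppearance("contact_hitter", 1 if i == 0 else 0) for i in range(50)]

power_result = home_runs_above_expected(vs_power, BATTER_HR_RATE)
contact_result = home_runs_above_expected(vs_contact, BATTER_HR_RATE)
print(round(power_result, 2), round(contact_result, 2))
```

The same single home run comes out below expectation against the power hitters and above expectation against the contact hitters, which is the intuition behind treating one as more excusable than the other.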
My head starts spinning when I think about this question of knowing that it works really well,
because the kind of premise of needing a statistic like this is that
runs allowed per nine on its own is not necessarily capturing the whole truth of the situation,
right? But then to figure out whether a stat like this works, you look to see how well it correlates
to runs allowed per nine. And then I start walking around trying to make sense of that. And so can you kind
of explain, like, how do you test the success of a product like this or of a stat like that?
How do you know that it is doing what you want it to do? And I guess, how do you know that what
you want it to do is actually good?
Yeah. So there are kind of two ways we can look at it. One is we can just sort of look at
whether it is kind of following sound methods. So even before we get to the results, we can kind of
say, okay, does what we're doing make sense as we add things like temperature or defense or
catcher framing? Does the model seem to be giving us more information than it was before?
Is it explaining more of what's going on than what was going on before? And that's sort of our,
you know, as we're going along the way, that's how we basically kind of get reassurance that
we're not completely wasting our time. Then what we do is kind of tricky, because you're
basically trying to predict something that's already happening. And that's also a problem, because if you think about it, it's really easy to go back and say, well, here's what happened in a season and here's how everything matched up.
But what happens if you're in the middle of a current season and you're going right along and you're trying to say, OK, how much is this worth when the season isn't even over yet? So the way we tested it and the way we designed it
is we actually tell the model to look at the last three seasons and sort of study everything that
went on there, all of the batting events, the effects of defense, the effect of all of these
things. And then the moment that 2015 kicked off, basically, the model is ready to go. And it just sort of predicts as we go along. And what we found is that when we took DRA, deserved run average, and we would compare it to FIP or some of these other metrics that work reasonably well, but we thought we could improve upon, that it was doing a much
better job. And that whereas FIP could explain, oh, about 50% or so of the runs that a pitcher
was giving up, that deserved run average can explain about 72%. And that's just our first
stab at it before we've started, you know, refining and doing other things. So that's how we test it.
We kind of go back in time and we say, how would this do over time?
Using the past seasons to get it ready, to teach it.
And then we kind of let it loose on the season we're interested in.
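A rough Python sketch of that train-then-test idea: fit on earlier seasons, then score the held-out season by the squared correlation between the prediction and the runs actually allowed. Everything here is an assumption for illustration. The data is made up, strikeout rate stands in for the real model inputs, a one-variable least-squares line stands in for the mixed model, and squared correlation is only one possible reading of "explains about X% of runs".

```python
def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# (season, strikeout rate, runs allowed per nine). Invented pitcher-seasons.
ROWS = [
    (2012, 0.15, 5.1), (2012, 0.25, 3.6),
    (2013, 0.18, 4.7), (2013, 0.28, 3.2),
    (2014, 0.20, 4.4), (2014, 0.30, 3.0),
    (2015, 0.16, 4.9), (2015, 0.22, 4.3),
    (2015, 0.27, 3.1), (2015, 0.31, 3.3),
]
HOLDOUT_SEASON = 2015

# Learn only from the seasons before the one being evaluated.
train = [row for row in ROWS if row[0] < HOLDOUT_SEASON]
test = [row for row in ROWS if row[0] == HOLDOUT_SEASON]

slope, intercept = fit_line([k for _, k, _ in train], [ra for _, _, ra in train])
predicted = [intercept + slope * k for _, k, _ in test]
actual = [ra for _, _, ra in test]

share_explained = pearson_r(predicted, actual) ** 2
print(f"share of held-out run variation explained: {share_explained:.2f}")
```

The point of the split is that the held-out season never influences the fitted line, so the score reflects how the model would have done had it been running as that season unfolded.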
And it was beating, by the measures we were looking at, it was beating FIP every time
and every metric we could look at, which is usually a good sign
that you're onto something.
So FIP, the idea behind FIP is that it might not tell you how
many runs a pitcher allowed, but it can kind of tell you how well a pitcher pitched. And the idea
behind ERA is that it can tell you exactly, almost literally exactly, although not because of the E
in ERA, but almost exactly the amount of runs
a pitcher gave up, but it does a very poor job of telling you how well he pitched.
And ideally, you're basically taking the aspect of FIP that we all like as analysts,
but making it so that it is no longer kind of imaginary and describing events that didn't
happen, but is real, tied to reality,
tied to how many runs actually scored and whether the pitcher's team was likely to win. Is that
a fair assessment of what this brings in terms of value?
I think in general, I think that's kind of what it's trying to do. I mean, the FIP is,
it looks at such a small subset of events, and they're really important events. They're the ones
that have, they not only govern a lot of things in the game, home runs, walks, strikeouts, etc., but, you
know, those have spillover effects that we're all aware of. If you strike out a lot of guys, you also
tend to generate weak contact and other things that help you out. But as you say, it only
looks at a certain number of events, and the fact is that, you know, whether
we like it or not, there are a whole lot of other events other than strikeouts and walks and home
runs that happen on a ball field all the time. And, you know, the pitcher is the one who starts it,
and it never would occur if he hadn't thrown the ball. So we kind of have to find some
way to explain whose fault things are and how we should attribute that while recognizing that a lot of that won't be his fault.
So our idea is that if you essentially take the fact that the pitcher threw the ball and then you start accounting for as many things as you can, that by the time you're done, you have pretty much run out of explanations for
reasons why something happened aside from just pure random chance. And I don't think we're there.
But again, the fact that we've managed to sort of increase the amount of pitcher run scoring
allowed, etc., that we can explain by that much, from 50% up to, you know, over 70%, suggests that there's a lot more out there that was capable of being explained.
So that's what we're trying to do.
And all of this is laid out in the amount of detail that is appropriate to the reader.
There is a hardcore version of this introduction of the stat at Baseball Prospectus, and there's a softcore version, which is still fairly hardcore but very readable, and
there's a list of all the factors that go into this. And people are used to, you know, batted ball
stats being involved in these things and park factors and, you know, maybe the strength of the
defense, but this goes way beyond that. This is bringing in things that I don't know that most people even consider when they're
evaluating a pitcher, let alone know how to factor it into the pitcher's performance.
So can you go through the list of some of the things that this has taken into account
that other stats don't and give us some sense of what's the impact of these things, like temperature or framing or the various things that go into this.
How much better can they make a pitcher look or worse, as the case may be?
Well, it's funny.
I don't have the exact coefficients in front of me, but these are all things that we, one of the things that we kept
doing, as we sat in our Slack room, you know, virtual room, you know, night after night was we
sat there and kept on coming up with more ideas for things that, you know, could possibly be
basically screwing a pitcher in one way or another. And, you know, that's actually kind of a fun
discussion to have in terms of a brainstorming session. And the problem is that you have a whole
lot of ideas that people have,
but the question is, do you actually have data on that? And most of the time, the answer is no,
or it exists, but it's licensed, or there's something else that stops it. And what we
basically just found was that a lot of this stuff actually was having an impact. And some of these
were issues that people had told us, we want you to start accounting for, we're frustrated
that you're not accounting for this. And one of those things was catcher framing,
people kept saying to us, you know, if catcher framing is so important, then why can't you
quantify to some extent what effect it has. And so what we noticed was, you know, and we basically
just sort of kept track of these things. And we said, Is it having an effect or not? And if it's
having an effect, a meaningful effect, we leave it in. And if it doesn't, we would take it out.
And we found that, you know, we actually didn't just do catcher framing. We did catcher framing
and then umpire. And then even the batter can have a kind of a trivial effect. And so we noticed that
was actually having an impact, which makes sense because if it's, if the pitcher is getting more
strikes, they're going to be in more favorable counts, and you would expect some of the results to follow.
So that was kind of a big one.
Some of the other ones, temperature was just kind of a fun thing to add.
I don't think that that's absolutely a make-or-break thing,
but it did improve it, and not surprisingly,
when the temperature is 80 degrees, you get more run scoring
than when it's at 50 degrees.
And in April and August and things like that, that can make a difference. As the season goes on, the
amount of batters that you face, what basically happens is that the model notices that, not
surprisingly, two sort of things happen. One is that people who only pitch a certain number of
innings tend to do so because they're not very good and managers don't want to have them anymore. So it actually penalizes you
if you have sort of a so-so or not so good looking value per plate appearance
and you don't have enough innings or batters faced, because it goes by batters faced. It will
actually penalize you for that. And it will say you probably are not as good as this, as good
as the statistic makes you out to be, because if you were, then why wouldn't your manager have given you more plate appearances?
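One simple way to picture that penalty is shrinkage toward a prior, sketched below in Python. This is an assumed stand-in, not DRA's actual mechanism: the function name, the prior of 0.14 runs per batter faced for short-stint pitchers, and the prior weight of 100 batters are all made-up values chosen only to show the shape of the effect.

```python
def shrunken_rate(observed_rate, batters_faced, prior_rate, prior_weight):
    """Blend an observed runs-per-batter rate with a prior rate.

    prior_weight behaves like that many extra batters faced at prior_rate,
    so a small sample is pulled strongly toward the prior and a large
    sample barely moves.
    """
    total = batters_faced + prior_weight
    return (observed_rate * batters_faced + prior_rate * prior_weight) / total


# Hypothetical prior: pitchers who only get a handful of batters tend to be poor.
PRIOR_RATE = 0.14
PRIOR_WEIGHT = 100

# The same observed rate, over 25 batters faced versus 800.
short_stint = shrunken_rate(0.08, 25, PRIOR_RATE, PRIOR_WEIGHT)
full_season = shrunken_rate(0.08, 800, PRIOR_RATE, PRIOR_WEIGHT)
print(round(short_stint, 3), round(full_season, 3))
```

With identical observed rates, the 25-batter pitcher ends up with a noticeably worse estimate than the 800-batter pitcher, which mirrors the "we bump them down a little bit" behavior described in the conversation.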
Oh, so it's like, it's like you, it's leveraging the manager almost as a, as data himself.
Yes, it is. And of course it isn't doing this consciously. It's just saying, we noticed that
guys who only pitch, you know, face like 20 or 30 batters generally stink. And so if it looks
like they're actually better than that, we bump them down a little bit. And by the same token, from what I've
seen, what it does is it also recognizes that once you get above a certain number of batters faced,
the curve sort of bends a little bit. And because it says, okay, if you're actually pitching to this
many people, you're not going to remain this effective over time, which is, of course, what
is true of starters, they have to face people the second or third time through the order. They wear down. They
have other things that happen. So if there's, I don't know if I would call both of them survival
bias, and I guess in a sense they are, but taking into account the batters faced is one of the
things that DRA really enjoys doing. The other thing that kind of surprised me, we had defense
in there, but one of the sort of late innovations,
and actually one of the things that kind of helped drive Pedro 2000, sounds like a Terminator model,
I guess it actually was when you think about it. But one of the things that, you know,
kept driving his number down were some of the innovations we added at the end. And one of those
was how teams defensively play on the road when they're at
a ballpark they're not at for half the season. And, you know, we kind of sat there and said,
you know, we see a lot of people who go to Coors or some of the other fields, and they look really
silly playing some of these balls off the wall, and plus the usual disadvantages of, you know,
being away and such. And we added that in and that was explaining more that we hadn't
accounted for before. So there's a list of another 10 or 15 things that people can read if they
really want, but those are sort of examples of the things that we decided to finally tackle and
were sort of pleased to find that we could actually insert them in.
Do you recall anything that you were convinced would make a difference that didn't,
that you were surprised to see that it just wasn't that important?
You know, there have been a bunch of them, but I sort of keep rewriting them in my memory when we
have new things. I was sort of amazed when we started looking at things like base stealing,
how important certain actors were. I guess that's sort of the thing that sticks
with me most about the project. People have been saying for a while, people steal off pitchers,
they don't steal off catchers or off anybody else. And, you know, we actually sort of along the way,
we were able to sort of test that or the numbers we get tell us who is in fact the most important.
And it was pretty staggering. Um, I never
would have imagined that pitcher versus catcher versus other factors were actually that important
in stealing, and it really is. It's really kind of amazing.
So is it, pitcher is the most important, or is it,
Absolutely. It's like, um, order of magnitude more important. I mean, the runs
we were seeing, there were some suggesting that the choice of
pitcher on the mound is almost 20 times more relevant than the choice of catcher. It was
crazy. I just kind of, I looked at it and I looked at it again and looked at it a few more times, and
then, you know, you look at a few more seasons and finally you shrug and say, I think just to be
conservative we'll say order of magnitude, 10 times-ish, a whole bunch. People are absolutely stealing off the pitcher. And what's more interesting is that since, as you know, we have a base stealing metric and a base stealing attempts metric, and those are not matching up as well as they should with some of the extremes, which tells us that a lot of managers are not trying to steal off the right pitchers.
So that's something that I think people ought to be looking at a little more closely.
Oh, so basically, like, if they lined up, if managers knew who to steal off of,
they would, basically, we would see more attempts against the pitchers who allow more steals reliably. But since we don't, they're probably not running because it's Yadier
Molina behind the plate, even though the pitcher is more important than Molina. And they are running
when it's Derek Norris behind the plate, even though the pitcher is more important to the
equation than Derek Norris. Is that what you mean?
Exactly. And, you know, for example, the one that
kind of jumped out to me, and I didn't really look at his attempt numbers, but, you know,
one of the, if not the absolute worst, close to it, you know,
last year in base stealing, which we called swipes, was Jake Arrieta. And that, you know, he was
actually one of the top value pitchers last year under DRA. I mean, he was just kind of unstoppable.
And so this is this is not only a big weakness, it may be one of his only weaknesses. And so you would think
that when you got on base, and especially since the Cubs have not exactly been employing a
battalion of, you know, great throwing catchers in general, that someone would try to take advantage
of that. But you know, if they are, they certainly aren't doing it to the extent that they should.
So, you know, more interesting findings, I guess.
Well, Jon Lester has the second worst takeoff rate above average right now, which means he is
allowing the second highest rate of steal attempts. So that seems to validate that stat.
Yeah, or that managers read the papers, I guess. Yes, it does.
Right. Tyson Ross is highest, and he was actually very high in 2014, too, it looks like. So
that is a thing that Tyson Ross is not good at, holding runners.
So this is definitely a do not try this at home stat, I guess you could say. I mean, it's
even, you know, FIP or xFIP, you could calculate if you wanted to. You can plug in the various
batted ball things and the coefficient or whatever,
and you can calculate them.
I don't know that most people do,
and there's usually no need to.
That's why it's nice to have these things on leaderboards
where you can go look them up
and they're pre-calculated for you,
but maybe it gives people comfort
that they can actually plug these numbers in
and get the same numbers out.
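For reference, the commonly published FIP formula really is simple enough to compute by hand, as this Python sketch shows. The weights (13, 3, 2) are the standard ones; the constant is reset each season so that league FIP matches league ERA, and the 3.10 default here is only a typical placeholder, not an official value for any year. The stat line in the example is invented.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings_pitched, fip_constant=3.10):
    """Fielding Independent Pitching from the standard published formula.

    innings_pitched must be a true decimal (one third of an inning is 0.333...,
    not the box-score shorthand ".1"). fip_constant is a placeholder assumption.
    """
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings_pitched + fip_constant


# Invented season line: 20 HR, 50 BB, 5 HBP, 200 K in 200 innings.
example = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=200, innings_pitched=200.0)
print(round(example, 3))
```

Nothing in this calculation knows who the batters were, where the game was played, or who was catching, which is the contrast being drawn with a model-based stat like DRA.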
This is so much more complicated
and in a good way, probably. And yet it does kind of separate the stat from the user of the stat in
a way. And, you know, maybe that would make people uncomfortable in a sense that it is just so beyond
their ability to calculate it or validate it themselves. And yet, I mean, I guess
that's where we're headed, right? Like, if we want to keep iterating and improving on the things that
we have, we have to use these ever more complex methods, right? To get the extra little bit of
signal out of the noise. And so there's probably no way around that. So is that
the way that you would kind of comfort people?
I think so. I mean, I guess I have found that as
sort of my interest in baseball and baseball stats has grown, that I have actually taken
pleasure in watching college basketball. And I make a point of understanding absolutely nothing
statistical that goes on inside of it. And I actually really kind of enjoy that. It's a bit liberating to just be like, oh, he made the shot. That's great.
And so I think that's basically it. I mean, I think you can watch a baseball game for all
sorts of reasons. And you can kind of want to know what's going on for all sorts of reasons.
And if, you know, it's really important to you to sort of understand more of what's going on,
or as much as possible, then I, you know, I think, unfortunately, or fortunately, that this is sort of kind of the
next step of where we're going. You know, it just seems to be where a lot of the developments are.
And I think the main reason to do this is because it allows us to sidestep a lot of the problems
that we're really starting to develop with batted ball data. And I
know Colin Wyers spent a lot of time writing about this, but basically that batted ball
classifications, as you all know, are a bit of a nightmare. The line drives versus the fly balls
and different parks and what type of home run was it and things like that. And so, you know,
that's really become a bit of a problem as people have tried to mine that data and figure out what it means and how that relates to how many runs you should be giving up. And it's pretty much a bit of a mixed bag, I think, at best.
that we basically just put all the people up against each other.
And so to the extent these people have a batted ball profile,
even though we don't know what it is and we can't classify it consistently, we at least know who Nelson Cruz is and we know who Matt Carpenter is
and we know who certain pitchers are and we know how they interact with each other.
And so in a sense, you know, sort of correcting for those things is very complicated to do,
and you can't really do it at home.
On the other hand, it's a really sort of simple and I think understandable way of getting
at who really is better and how they match up with each other.
And so that's one thing that DRA uses to its advantage.
And I think that actually does make a lot of sense to people.
And if more information becomes available, if Statcast goes full public,
is this a model that you could kind of scale to incorporate any new information that becomes
available to us in the future?
Absolutely. I mean, right now, what we're basically doing is
we're taking the quality of the defense on every play by simply sort of, you know, totaling up all defenders on
the field. And, you know, that actually is still providing a lot of signal because over the course
of a season that does add up and it matters. But, you know, let's say once, you know, we were
satisfied with, you know, the actual precise direction of the ball and the velocity of it,
if I'm using that word correctly, I never know if I am,
you know, then we can start saying, okay, we're only going to consider certain defenders in certain regions, and we'll start getting more precise. And, you know, maybe the location of
the ball will, you know, affect whose responsibility it was. And so yeah, I mean, I, the thing I like
about it is that it really is sort of a, a move to a next generation of doing this. And I think there
will be further refinements. I don't think they will produce necessarily the same leap in
explanatory power that we got with this one, because it is sort of going on to a new level
and doing new things. But yeah, I mean, that's kind of the point of it, is that if we want to add
another couple of data points or something, that's really not a big deal. We can add as many within reason as we want. So Statcast and other things, they can
go right in. And I think next offseason, I would be a little surprised if we didn't end up maybe
cutting one or two things out and adding a few more. And I think having a framework that's really
that flexible is frankly kind of important if you want your statistic to continue to be meaningful and give the value that honestly BP readers sort of have come to expect.
You kind of wrote that this is not necessarily the best metric for predictions, for predicting going forward, that you would say use cFIP for that instead of this.
Why is that? Why does this not predict? What is it sucking in that isn't necessarily true
talent level or that isn't necessarily predictive?
Yeah. And that's sort of the hardest thing to kind
of understand. And that was sort of a topic that I was exploring a lot in the Hardball Times paper.
And it was just this issue that I'm sure your kind of common sense fan, if they care about
baseball statistics
at all says, you know, why can't you just give me the one thing that'll tell me what I need to know
instead of, you know, sending me down this chart with all these different things and whatever.
And unfortunately, that's kind of impossible, because the fact is that people, you know,
stuff happens in baseball games all the time that has nothing to do with how
good, I shouldn't say nothing to do with, an awful lot not to do with the quality of the pitcher on
the mound. That's why, you know, runs allowed for a pitcher actually has virtually no correlation,
you know, from one year to the next. It's really poor. And so, you know, that sort of tells us
already that since the number of runs that cross the plate are obviously the most descriptive statistic there is, runs being the currency of the game, there's simply no statistic that can accurately tell you everything that was going on simultaneously and then do it again next year because even the runs crossing the plate can't do that.
So what we kind of have to do instead is focus on a small subset of skills which tend to reproduce themselves, like strikeouts and walks
and home runs. And so when you focus on those, then you can still have some predictive power
because those are skills that the pitcher actually takes with him to the next year. Whereas
things like the extent to which the ball will skip off the infield grass at some place or skid
somewhere else off the turf or the fielder will fall down, those are not things that the pitcher
takes with him to the next season. And so that's why when you're really trying to sort of describe
what happened and give the pitcher as much of a break as he's entitled to, you're doing that.
But those same sort of complex events themselves don't show up the next season because it's a new season with new players and new stadium surfaces and different temperatures and all sorts of other things.
I'm going to give an example of a thing that has nothing to do with this, but it's kind of an analogy for my question. So a few years ago, like Dave Righetti was getting a
lot of credit because the Giants pitchers never allowed any home runs at home. They had like big
fly ball rates, but they weren't allowing home runs at home. And it'd be, you know, it's, and
maybe it was correct, but it'd be very easy to say, oh, well, you know, they're lucky they're
benefiting from their home ballpark. They're allowing a lot of deep fly balls, but in that park, they get bailed out. And so if you
were writing a metric that wanted to show how good they were at pitching, you might knock them down
a lot for that reason, right? You're saying their ballpark is saving them. But another person might
say, well, maybe they're pitching to that ballpark. Maybe they're very good at assessing the context
that they're in, that they know just where to throw the pitch
that it gets caught in Triples Alley. And maybe that's generous, but I'm just giving a hypothetical.
And so maybe in fact, controlling for that ballpark is ignoring how much sort of agency
a pitcher has in what he does. And so one of the things that makes DRA really well engineered and
really complete and comprehensive is that you control
for an awful lot of these sort of contexts. You control for a lot of things that affect how many
runs a pitcher is likely to give up in a situation. I don't know, do you feel like any concern or
is it something that you can, I don't know, that you've managed to control for or something,
the ability of a pitcher to pitch to a, say,
hypothetically, particular umpire's strike zone or a particular catcher's ability to frame
particular pitches or whatever the case may be. Things that you're sort of peeling off the factors
that affect the pitcher when, in fact, the pitcher may be affecting the factors.
I would say we are
confident that we're getting a lot of that. We are not confident that we're getting 100% of it.
Because in order to do that, to give the really sort of basic version of it, basically we have to use something called interactions.
In other words, what you basically say is, I want these two variables to sort of play a game together and sort of see how they interact.
And so that's how you might see how a pitcher is in a particular ballpark or in other things. And doing those is actually
incredibly resource intensive. And it's very difficult to do, even in mixed modeling. So,
you know, to the extent that there was truly a pitcher who absolutely knew exactly where to
send a pitch for a particular umpire and when to do it at,
you know, AT&T Park and knew that he would bring it to the opposite field. I mean, those are things
that I'm sure at some point we are not controlling for. And those are the sorts of things that we
would probably continue to try to, you know, do over future seasons. That said, I think that,
I mean, pitchers can have plans like that all they want.
As you know, where they want the ball to go is not where it's going to go,
at least not as much as they would like most of the time. And I think a lot of that balances out.
And at the end of the day, what really matters is, you know, what is the ballpark like for
hitters who are right-handed versus left-handed, which we account for. And then you account for that, you account for temperature and some of the other interactions.
And there could be, you know, some very specific things that aren't yet accounted for, but I think
we're getting a lot of that. And I will say that one thing that I probably want to explain,
because I don't think it's clear to everybody, is that just because there are things we aren't
accounting for yet doesn't mean there's anything wrong with what we are accounting for.
I think sometimes people say, well, if you're not accounting for this one thing, then the whole thing isn't, you know, if you're doing temperature, but you're not doing humidity, it must not be a valid exercise or something.
And, you know, it's basically we've got a sort of threshold now of explaining about 72% of pitcher run scoring.
And then the only question is,
how do we take it up further? So all of these things like the interactions and the possibility
that, you know, Pirates pitchers are magical at PNC and not just for reasons of the dimensions,
but because their pitching coach is the best. Those are things that we could probably start
figuring out how to control for. And then, you know, maybe we might go up to 73 or 74%. But, you know, those aren't things that I really expect that on
a league wide basis are going to make a huge difference.
I want 80% by the end of the decade.
All right. It'll be our version. It'll be Ben's law.
Yeah, so that is my inspiring goal for the end of this decade. So you mentioned that this is primarily
useful for evaluating past performance, past value. People have kind of gotten into this
habit of using FIP as a predictive tool, because I guess it is one relative to ERA. And yet you
remind people that we still have projections and projections are useful too. So
is there any reason why people should be looking at FIP or looking at DRA and saying, okay, it's
lower than this guy's ERA and therefore he is going to have a lower ERA from this point forward
because it's going to match his FIP or his DRA or whatever,
or should you only look at projections in that context?
I would say I like to kind of think of
what I call the sort of quick and dirty analysis, which is I'm wondering about some guy. I want to
go and just sort of get a sense of whether what he's been doing over the past three weeks makes sense or not in the long term.
You know, if you go to BP, you will find DRA and cFIP right next to each other.
And the easy thing for people to remember is that DRA is for the past.
That's the "your medal's in the mail" version of the leaderboard.
And then cFIP is sort of for the future.
That's what evaluates what I call their true talent, because cFIP actually sort of predicts both. Its descriptive and its predictive power are equal, which is very unusual
for a statistic. So that is actually telling you how good the pitcher is at the things that it
measures and is probably, for just a quick version of events, what you would look at in terms of to
say, okay, this guy's really good and he's having just a tough month, or this guy's actually really bad, and you should enjoy him while he
lasts. Certainly, if you really want to get into things like specific runs allowed, specific
walks, things like that, you would probably want to go to a projection system. But honestly,
I think just DRA for the past, cFIP for the future. And I am honestly a little curious at this point what sort of everyday value FIP and
xFIP and SIERA would still have by comparison.
I think I'm sure some very smart person will point out that they do.
And that's great.
I hope if they do, someone will prove that.
But in the meantime, you know, DRA is explaining more of the past than anything else that I
know of.
And cFIP is explaining more of the future and has a better balance between descriptive
and predictive than any other statistic.
So those are, if you really only want to have two you look at for how good your pitcher
is, those are right now, I think, two that will largely do the trick.
Do you have any favorites?
People, I mean, if you've been looking at results and leaderboards for months now, do you have anyone who looks especially good in DRA or one
of the components of DRA that you've calculated who maybe was underappreciated or overappreciated
or any specific individual who stands out? Well, I think it was Rob Neyer who basically said that, you know, Jason Schmidt
must be making campaign contributions to the DRA fund. DRA really likes Jason Schmidt. It thinks
that his 2003 season, his 2004 season were beyond outstanding. The two most, actually, by terms of
value per plate appearance, they believe that, DRA believes that Jason Schmidt was, you know, the most valuable pitcher in that regard in both seasons.
And that, you know, I honestly never heard of Jason Schmidt.
I guess I should have been a better student of the game back then or paying attention.
But he was just some guy who, you know, seemed to do a good job and was obviously one of the better pitchers.
But DRA thought that he was really the cat's meow. So that was sort of the first name that we've been kind of joking about for
the past week or two, which is, you know, sort of Jason Schmidt for vice president sort of thing
with, you know, Pedro, obviously the top of the ticket. That was the one that kind of jumped out
to me. A lot of the other names, fortunately, we don't want to have too many surprises, right?
Most of the other names like on the top 25 list are people who either are very, very good in general or had one extremely good season we're all aware of.
And so for the most part, I'm pleased to say that there haven't been a whole lot of surprises.
But there have been one or two where you've looked at it and gone, oh, yeah, that guy.
And everyone shares some fond memory about that guy.
So that's kind of the fun about it.
So as crazy as it sounds, there will come a time some years from now when the Internet will actually argue.
It'll probably be a minority arguing this, but will argue about the pronunciation of this.
And you're saying DRA.
I just want to – how much are you emphasizing that it must be D-R-A, and that somewhere down the line,
when somebody is calling it "dre," that it was not the creator's intent?
We had an argument about this today. And this is, I think, actually, for weighted on-base average,
as I recall correctly, I don't think Tango and Mitchel Lichtman can even agree on how it ought
to be pronounced. At least I recall hearing that once; I don't know if it's true. We actually had a two-to-one split in the Slack channel at the time. And Dan and I
thought it should be D-R-A, and someone else really liked the "dra" aspect of it. And then
someone else pointed out it really would be "draw" under, you know, standard pronunciation. And then
that was troubling to people. So I think pressed by virtue of being the person who got the honor of going on the podcast,
it is D-R-A, and in the sense of E-R-A and R-A-9,
I think that's the simplest thing.
And I think also if you say D-R-A,
people know you're talking about some baseball statistic,
and if you say "dra,"
I think people are going to give you funny looks.
So maybe that's your objective,
but I think D-R-A is the way to go.
How did you have time to be a lawyer while you were building this?
Well, I think my firm's asking the same question.
They've actually been very supportive about it.
They think it's a little weird but a little fun.
And no, that's basically what I've been doing is lawyering all day.
In theory, having a wonderful wife and two kids and coming home and
spending a couple hours instead of watching TV for the past couple of months. It's been
hanging out with BP staffers and modeling and asking ideas. So maybe I need more hobbies.
But it has been tremendous fun. It is, BP really, in all seriousness, it is just a wonderful group
of people who are kind of
doing this because they really want to understand more. And you all know this, but I don't know if
everyone else does, but to really sit there with a bunch of really bright people and talk about
baseball and sort of unlocking additional explanations for things for a game as old as it
is, is tremendous fun. So I found the time I'm looking forward to this project being done very
much. But it is it has been an absolute blast. It's not going to be done till you get to 80%.
But you can you can take a break. Do you get to use sabermetrics as a lawyer? That is the one
thing that's missing from The Good Wife, I think, is a sabermetrician in the firm.
Oh, excellent. I do. Actually, it wouldn't be sabermetrics because
I would have to have baseball team clients, I suppose. But I have started integrating
modeling and analytics, to use a very overused term, into lawyering. I actually published an
article a couple months ago using sort of the same modeling techniques that I've used for
Hardball Times and BP and other
things to predict government fines. So if the government says, we want to fine you $15 million,
actually, they only ought to be fining you $2 million because you did this or that. And you
might say, well, that's still a lot of money. But if you're a really big corporation, that's a highly
material difference. So I've published on that. And that seems to be an area
of growing interest, because a lot of lawyers are still making decisions by leaning back in their
chair and deciding what makes sense or what feels right. And, you know, good judgment is really
important to lawyers. But the idea that we're still doing that in the year 2015 is a bit insane.
So I'm actually seeing more and more interest in applying sort of a lot of the
same rigor and discipline of sabermetrics into legal analysis. And I hope it continues to grow
because it makes my job more fun. I'd hope that you have good temperature and humidity data for
that because I would think that the fines would be stricter if the courtroom is hot and humid
and uncomfortable. Yeah, we definitely
hope the air conditioning works, without question. All right, you can go read all of these introductions
to Deserved Run Average, DRA, at Baseball Prospectus now. If you want to ask Jonathan Judge questions
directly, you can find him on Twitter at bachlaw, B-A-C-H-L-A-W. If you're listening on Thursday
the 30th, you can go to Baseball Prospectus and chat with Harry Pavlidis, who's doing a Q&A at
1 p.m. ET. Anyone can participate in that; it's not just subscribers. So lots of different ways to
find out more about this stat and ask questions and get answers.
So thank you for giving us some answers, Jonathan.
You bet.
Okay, so that is it for today.
You know where to find our Facebook group, but I will tell you anyway, facebook.com slash groups slash effectively wild.
You can send us emails for tomorrow's listener email show. We're doing a Friday email show at podcast at baseballprospectus.com.
Rate, review, subscribe on iTunes, and support our sponsor, The Play Index,
by going to baseballreference.com, using the coupon code BP,
and getting the discounted price of $30 on a one-year subscription.
We'll be back tomorrow.