Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 653: The First Sign of Statcast
Episode Date: April 9, 2015
Ben and Sam banter about time of game and Jon Lester, then discuss what to do with the first release of Statcast data.
Transcript
She's stuck in the thought that she was ready to go
She said, I'm a C-point, it's about that incident's over
She said, technology, it needs me
Oh, if you want me, then I won't tell you
Too bad, cause I gotta be by myself
And she said, I'm all your fault, I'm leaving
Good morning, and welcome to episode 653 of Effectively Wild,
the daily podcast from Baseball Prospectus,
presented by The Play Index at BaseballReference.com.
I'm Ben Lindbergh of Grantland, joined by Sam Miller of Baseball Prospectus.
Hello.
Hello, how are you?
Okay.
Should we do a daily time of game update,
or is that not going to be a regular segment?
I don't think it should be a regular segment, but I've got one.
Okay.
For you.
So yesterday's games, eight of them were under three hours.
Five of them, I believe, were over three hours,
and one was exactly three hours, and one had a rain delay.
And so it's hard to know how long it was based on what I'm looking at.
But there was a two hour, 21 minute game, a two hour, 23 minute game.
And generally, if more than half are under three hours, that beats the past.
And so now I've got the median length. This is all by Play Index, by the way,
bonus Play Indexing. The median game length, which is only slightly skewed by weather phenomena, is two hours, 57 and a half minutes.
The median game length for games last year through three games
was three hours and five minutes.
And so seven and a half minutes have been cut.
Also, just to see if there's a distorting effect early
in the season, I've got the median game length for last year for all team games between 60
and 68, and that is three hours and seven minutes. So, A, something's happening here.
But what it is ain't exactly clear.
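The comparison above is a difference of medians. A minimal Python sketch of that arithmetic follows. The two medians are the figures quoted in the episode; the six-game sample is hypothetical and is only there to show how an even number of games can produce a half-minute median.

```python
from statistics import median


def to_minutes(hours, minutes):
    """Convert an hours-and-minutes game time to total minutes."""
    return hours * 60 + minutes


def format_minutes(total):
    """Render total minutes as e.g. '2h 57.5m'."""
    hours = int(total // 60)
    return f"{hours}h {total - hours * 60:g}m"


# Medians as quoted in the episode.
median_2015 = 177.5              # "two hours, 57 and a half minutes"
median_2014 = to_minutes(3, 5)   # "three hours and five minutes"
gap = median_2014 - median_2015  # minutes cut so far

# Hypothetical six-game slate (NOT real data): with an even number of
# games, the median is the average of the two middle values.
sample = [to_minutes(2, 21), to_minutes(2, 23), to_minutes(2, 57),
          to_minutes(2, 58), to_minutes(3, 0), to_minutes(3, 14)]
sample_median = median(sample)

print(format_minutes(median_2015), format_minutes(median_2014), gap)
```

Because the median ignores how extreme the longest game is, a single rain-delayed game shifts it far less than it would shift a mean.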
We know that the first day was all fast and awesome and exceptional.
And so it's conceivable that we are simply regressing
and the gap is closing.
So the gap on day one was 20 minutes between this year and day one last year.
And so probably last year's first days were longer than typical,
and this year's first days were shorter than typical,
and so it was a 20-minute gap.
Just two days later, we're at a 7-minute gap, more or less, basically.
This isn't perfect data, but more or less.
It's evaporating. We're losing it.
We are losing it.
My guess is that if we did this in two weeks,
we would be at about three minutes, and it will stay there all year.
Well, we probably will do this in two weeks.
I think Rob Manfred's goal was like 10 minutes or something.
As I recall, he just wanted a modest improvement,
so three would probably not be satisfactory to him.
No, but modest improvements often start with modest steps toward modest improvements.
So, I mean, satisfactory would imply, like, it probably wouldn't stop him from working.
It might just drive him to work even harder.
But I think that at the end of the year, I mean, that slide on the PowerPoint presentation
would be in a happy font.
Yeah, I think so.
Even just to stop the increase would be a victory of sorts.
Yeah, he would definitely do that thing where he would chart the time of games
in the PowerPoint, chart the time of game over the years,
and it would be up, up, up, and then down.
But he would definitely do that thing where instead of setting the Y axis from zero to 200 minutes or whatever, he would start it at 175 minutes.
And so it looks like it got down to almost zero. It would be a misleading chart.
Yeah. Okay.
A misleading graph, I should say.
So I wrote my Jon Lester article,
my Jon Lester article that we bantered about on Monday,
what would happen if Jon Lester went another season without throwing over to first base
and all of the base runners knew that he wasn't going to
or acted as if he wasn't going to
and just called his bluff on pick-off attempts.
And I went with the simulation route.
It was the only way that I could think to do it well.
And so I contacted all of the best-regarded,
long-lasting baseball simulation games: Baseball Mogul and Out of
the Park Baseball and Strat-O-Matic and Dynasty League Baseball and Diamond Mind Baseball.
And I asked all of them to do whatever the closest thing that they could come up with was.
So all of these games are very sophisticated, and they have, you know,
two or three different ratings that go into the pitcher and his impact on the running game. Like,
they have a hold rating, or they have a pickoff rating, or they have both. It's very complicated.
And so they all ran some sims for me with Lester, not only with his pickoff attempt rating set to zero, so that he would never make a pickoff attempt, but also with the base runner aggressiveness bumped up, so that guys would take the maximum advantage of it. And so Diamond Mind was kind of the centerpiece of this.
And they ran the simulation 500 times for me. And they actually did it for 2014. So they
re-ran his 2014 with these altered Lester settings. And the difference was significant but not season-derailing. He went from 2.46, which was
his actual ERA last year, to 3.22.
Oh, that's a lot.
Yeah.
I mean, he's still a really
good pitcher after that. He was last year. But nobody thinks he's really 2.46.
I mean, that was his career best year.
So let's say, I mean, if his projected ERA this year was like 3.1 or something,
that would bump him to 3.9 or 4,
which would make him essentially not a very good pitcher in this era.
Yeah, right.
That's the difference between if he's getting paid $25 million,
I mean, what does a pitcher who eats innings
and has an ERA of four in this day and age get?
Is that Jeremy Guthrie?
Is that 11 and a half?
Does that take half his value?
Well, he definitely doesn't get $25 million a year.
But yeah, so the other sim games
didn't run as many simulations. They just, you know, did
one or did a handful or whatever, and they can bounce around quite a bit from one simulation to
the next. But the three of those games that simulated his 2015 instead of going back and
redoing 2014: Baseball Mogul, he went from a 3.15 ERA to a 3.89 ERA.
Out of the Park Baseball, he went from 2.76 to 3.43.
And Strat, he actually didn't change much at all.
He got better. Tell me he got better.
He didn't get better, but he only got like a tenth of a run worse,
even though he gave up a bunch more stolen bases.
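The simulation results quoted above can be lined up as a quick Python sketch. The before-and-after ERAs are the ones stated in the episode. The 3.10 baseline echoes the "like 3.1 or something" projection mentioned earlier, and the 200-inning workload is purely an assumption for illustration. Strat-O-Matic is left out because no exact figures were given for it.

```python
# ERA without and with the "never throws to first" settings, as quoted.
sims = {
    "Diamond Mind (2014 replay)": (2.46, 3.22),
    "Baseball Mogul (2015)": (3.15, 3.89),
    "Out of the Park Baseball (2015)": (2.76, 3.43),
}

# ERA penalty in each simulation, rounded to two decimals.
deltas = {name: round(after - before, 2)
          for name, (before, after) in sims.items()}
average_delta = sum(deltas.values()) / len(deltas)


def extra_earned_runs(era_delta, innings):
    """ERA is earned runs per nine innings, so scale by innings / 9."""
    return era_delta * innings / 9


# Hypothetical baseline: apply the Diamond Mind penalty to a 3.10 projection.
adjusted_projection = round(3.10 + deltas["Diamond Mind (2014 replay)"], 2)

# Assumed workload of 200 innings (not from the episode).
runs_over_200_ip = extra_earned_runs(deltas["Diamond Mind (2014 replay)"], 200)

for name, delta in deltas.items():
    print(f"{name}: +{delta:.2f} ERA")
```

On these figures the three penalties land within about a tenth of a run of each other, which is consistent with the "3.9 or 4" estimate for a pitcher projected near 3.1.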
I'm very disappointed, Ben.
I don't think you can...
You can't have gone to all the credible simulators
if you didn't go to Jon Bois.
I should have had Bois build a video game for me.
Yeah, I mean, there's got to be a setting on your MLB 2K14
or whatever they call things these days, right?
I looked into that a little bit.
I don't think so.
It didn't seem to me like MLB: The Show
was rigorous enough with its pitcher hold rating
that it would do as good a job as these sim games.
I think these sim games are pretty sophisticated.
I mean, these are like Diamond Mind is Tom Tippett's game,
and he did the stuff for Diamond Mind that he then did for the Red Sox.
So it's, you know, I'm always
impressed by how much goes into these games. So I kind of buy it.
Yeah. Anyway, that article is
up at Grantland right now if anyone wants to go check it out in detail.
Yeah, I keep
going back and forth on whether I think it would ever get to a point
that runners would feel comfortable taking, like, a 30-foot lead.
Yeah, I mean, there have been some comfortable-looking leads against him,
but not—I mean, I don't know.
It's because I went back as I was writing this,
and I looked at his last pickoff attempts in 2013 and 2012.
Even when he was doing the odd pickoff attempt, it was so weak at that point.
It was just like he was bouncing them, he was lobbing them. You almost could have, like, broken on the first move and gotten back
by the time the throw actually got to first base. So
if you watched all that video and you looked at the
numbers that said he never, ever throws
to first, I don't know, you should
be pretty confident. Although I
don't think this is going to last much longer,
just based on his quotes and Maddon's quotes.
It sounds like they are having him work on this,
and there was some video of him throwing on the side in a pickoff move-like manner.
So I'm guessing that this streak, 66 straight starts now without a pickoff attempt,
I am guessing that its days are numbered.
I would too. It's just
knowing what we know. I mean, if
he had six months, five months to do something
and didn't, then it's
hard to then say that he'll do it
in five days. I agree.
It just seems
overwhelmingly unlikely that
he could keep doing this and
just not do this very simple
thing that has some benefit to him. But like, it would have been impossible for me to accept
that he would come into the season still not throwing to first.
Uh, okay. So you probably want to spend the rest of the show talking about the Josh Harrison extension.
How does the Josh Harrison extension rank on your intrigue, extension intrigue list?
Extremely low.
Probably the lowest.
The lowest.
Okay.
So we won't dwell on that. All right, well, I wanted to talk about the Statcast stuff that is out there unexpectedly, and I wrote about this for Grantland also.
It should be up sometime soon. But I don't know whether you've followed this at all. It's been
kind of this underground story
because it hasn't been announced officially.
It's just this data coming out kind of in dribs and drabs
where there is essentially HITf/x data now that is public.
It's not HITf/x, it's TrackMan,
but it's what teams have had with HITf/x for
the last several years. It's batted ball velocity, it's batted ball angle, both vertical and
horizontal, and batted ball distance. And so this was just very quietly added to the feed of stats
from Major League Baseball Advanced Media on opening day.
There was no warning that this was going to happen.
And it's been kind of spotty.
It seems like they're still working things out.
It's not in every park,
or at least the data is not coming from every park.
And sometimes it will, like, disappear during the day,
and then the data will come back a few hours later.
So it's not totally clear for sure
that we can count on having this all year
or that it's just here to stay
the way that we can count on having PITCHf/x
or whatever the equivalent of that is now.
But it seems promising.
It's encouraging because I wasn't at all sure what we would get Statcast-wise.
There's also some Statcast stuff that's showing up in Gameday and At Bat.
It's a more stripped-down version of the raw data that has been released.
It's just batted ball velocity and distance and I think not the angle stuff.
But that was kind of what I was worried about, that that was all we would get,
that we would get a kind of watered-down version of the uncut Statcast that teams were getting and that we wouldn't
be able to do all the cool analysis that people have done with PITCHf/x because we wouldn't have
the data on that level. We would just kind of have superficial stuff or it would be on broadcasts and
we'd get to see videos with some numbers from time to time, but you wouldn't be able to do real hardcore analysis.
And at least right now, it seems like maybe we can.
There's no indication of base running or defense or anything like that yet,
so that might be a while, that might be never, I don't know.
But the batted ball stuff alone is intriguing.
And so, wait, hang on. So just
clarify again: it's batted ball distance and batted ball velocity but not angle and not...
It is angle.
Angle up or down, or angle left and right?
Both.
Okay.
Yeah, in Gameday it doesn't show the angle.
Oh.
But in
the raw data feed that you can look up, like on Baseball Savant, that is already there. There's
a leaderboard of the hardest-hit balls, and you can look up all the balls that there's
batted ball data for.
And wait, wait, wait, am I right? Did I misstate, or, yes, also distance?
Also distance, yes.
For everything? For all batted balls?
Well, not all batted balls have this yet. It seems like it's still a work in progress.
If you ground out to shortstop, what is the distance for that labeled as?
Is it a foot?
Is it 107 feet?
I will look up an example in my spreadsheet of batted ball data
and find a ground out to shortstop.
Let's see.
So Carlos Gonzalez grounds out sharply to second base,
and the distance is 88 feet.
Interesting.
And so then if it were to get past second base,
then it would be, you know, 200 feet.
Yeah, I don't know how that works.
I don't know whether it just detects when the forward progress stops
or when it goes into a glove or what.
So do you have like a, is there like a line drive double or a line drive triple in there?
There should be.
Let's see.
All right.
So double, Carl Crawford double.
And it is a line drive, and the distance is 290 feet, but I don't know whether it went to the wall or whatever.
Oh, there's a, let's see, there's a play description field also. Carl Crawford doubles on a sharp line drive to center fielder Wil Myers.
So that's probably not to the wall.
Let's look for another one.
It's also probably not 290 in the air, probably.
If it's a sharp liner, to me, sharp liner is low.
Sharp is usually low, right?
Yeah.
It probably didn't travel 290 feet in the air.
So it's probably giving you where the ball is fielded.
Mm-hmm.
Let's see. Freddie Freeman doubles
on a ground ball to right fielder
Giancarlo Stanton.
95 feet. Huh.
That's interesting. That's odd.
So that one is clearly
not where the ball stopped.
Sharp line drive double to Nori Aoki, 306 feet.
And that one is.
Sam Fuld triples on a ground ball to Shin-Soo Choo, 131 feet.
So maybe it's where it lands.
Well, some of those are where they land, but some of them are probably not.
I mean, you don't see anything in there that has a distance of six feet or less, right?
I mean, there's no ground out to third distance, you know, one of them.
It looks like there is.
Let's see if I sort.
Here's one.
Chris Iannetta, line.
Wait, no.
I got that wrong.
Distance, sort smallest to largest:
2.3 feet.
But that could be like, I mean, you would expect
a lot of them if they were counting grounders
as where it hits the ground.
Yeah, right.
Adrian Beltre
grounds into a force out,
3.4 feet.
So I don't know. I mean, they could still be
working the kinks out of this thing for all I know,
but, so I'm not sure.
And tell me, are you sure, did they
get, are you sure that they've got
drabs in there? I mean, I know they've got dribs,
but are there drabs?
Well, it's only
one kind of thing, so
maybe it's only dribs. Only dribs so
far, so we can still look forward to drabs
Yeah, there's definitely more to look forward to. I don't know what exactly, but
Cory Schwartz of MLB Advanced Media has been tweeting lots of cryptic things about how there's
more coming and they're going to be adding more to the feed, and there's going to be more and more as the season goes on.
So I guess that they will incorporate base running and defense
and all those sorts of things.
But anyway.
I've waylaid you, yes.
Tell me where you were going with this.
Well, I wrote about just the things that I'm looking forward to using this for,
or smarter people than I using it for.
And I am wondering what you would like to see it used for.
Other than just, I mean, it's always fun to just sort by,
you know, the guys who hit the ball hardest, that's fun.
Or the guys who hit the ball weakest, and those kinds of
things. But there are things that we should theoretically be able to find out with this
information that we couldn't have found out before.
Well, I mean, this would have been a
more interesting answer like a year ago or two years ago, but I've always wanted to have a, you know, very well
reliable to me.
I've always wanted to have a results independent batting stat.
Right.
And so that seems to be like eminently plausible at this point, like extremely plausible, like
almost too easy to give an answer as an answer.
So that, I mean, I think that is the biggest thing. I mean, it's the
most obvious, but it's also probably the number one way that you would
use this. Yeah, and to some degree it's already been scooped, I guess. Not with this, and maybe it'll get refined, and maybe it'll get better, but Ben Jedlovec unveiled a methodology, I guess, for how to do this already at SABR Analytics.
Right. Well, yeah, I mean, teams were doing this six years ago.
And exactly, teams were doing this six years ago?
Yeah.
Geez, I thought maybe three years ago.
No, they've had this information since they have HITf/x going back to 2008.
I think maybe they got it in 2009 and then it was, like, filled in for 2008 later or something, but they've
had it at least since 2009, which is, I mean, it's sort of sad to think about. I'm all excited about
this new data that we get to play with, and to anyone who's been with a team for the last
several years it would seem like the most uninteresting thing.
All right. So from a fun perspective, this would just be a toy,
not really relevant for data. But what I would do probably if I had a job that,
if they said, take all this and go away for two months and come back with something frivolous,
I would want to have a to the microsecond. Okay, so you know how sometimes you'll see, who's that guy,
Frank Luntz? Is that his name? The consultant for politics?
30 Rock?
No, no, no. He's like a political consultant. Yeah, Frank Luntz. He's a political consultant,
famous political consultant, who does conservative political messaging. And so during debates, he'll be
on cable TV and he'll have these focus groups of people who are like every two seconds,
they're registering their mood as they watch the debate. And you see this moving line and
at any given second, you can see the general mood that they feel toward
the candidate, right?
And I would like to have that as a win expectancy thing where there is a line running throughout
the game in which the win expectancy is to the microsecond being updated during the play
itself.
So if you hit a line drive or a ground ball or a fly ball
at a certain angle, at a certain speed, a certain velocity,
you can therefore use a probabilistic measure
to figure out the likelihood of each outcome of that ball
based on where it is and where the defense is and everything like that.
And so you could see these win probability spikes and valleys in the middle of a play
as the runner, you know, as the fielder might get close to the ball but then boot it,
or as the ball that might get caught or might not is either caught or trapped.
And with every pitch, with every second of the,
maybe even within the pitch, perhaps, but that would be more complicated. For now, we'll just talk about batted balls. But as soon as the ball is off the bat, you'd see that
win expectancy shoot in some direction and then move throughout the play until the play
is resolved. And then at the end of the play, you would have your classic win expectancy
that we all know and see when we click refresh on the page.
But I want to have the moving line win expectancy.
That's what I would do with this.
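The moving line described here can be sketched as a weighted average: at each instant of a play, multiply the probability of each possible outcome by the win expectancy that outcome would leave behind, and add them up. The Python below is a minimal sketch under that assumption. The outcome list, the probabilities, and the win expectancy values are all invented for illustration.

```python
def in_play_win_expectancy(outcome_probs, outcome_we):
    """Probability-weighted win expectancy at one instant of a play."""
    total = sum(outcome_probs.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * outcome_we[name] for name, p in outcome_probs.items())


# Hypothetical win expectancy for the batting team after each outcome.
OUTCOME_WE = {"out": 0.40, "single": 0.55, "double": 0.65}

# Hypothetical snapshots of one play: the outcome odds shift as the
# fielder closes on the ball and then boots it.
timeline = [
    ("off the bat",     {"out": 0.50, "single": 0.30, "double": 0.20}),
    ("fielder closing", {"out": 0.80, "single": 0.15, "double": 0.05}),
    ("ball booted",     {"out": 0.00, "single": 0.70, "double": 0.30}),
]

moving_line = [(label, in_play_win_expectancy(probs, OUTCOME_WE))
               for label, probs in timeline]

for label, we in moving_line:
    print(f"{label}: {we:.3f}")
```

In this made-up play the line dips as the fielder closes in and spikes when he boots it, which is the mid-play movement the idea is after. Once the play is resolved, one outcome has probability 1 and the line settles on the familiar end-of-play number.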
That is my, if anybody wants to buy me out,
I figure that's about a, I don't know,
I'll say an eight to 12 week project.
And upwards of 70 people would use this sometimes on a lark for five
minutes.
Unfortunately, they wouldn't be able to watch the play while they were watching the moving
line, so they'd have to choose one or the other.
The moving line would be on the screen, and
you'd be able to pick up the up-and-down movement as it happened.
So, nailed it. Nailed it. Perfect answer.
Thank you. Thank you. That's a fun one.
It is a fun one. Yeah, you would think
that if Statcast
all becomes
public and there's
a big sample size and it's
a decade down the
line and there's all the processing
power that you could possibly want,
you would be able to have, like, the perfect win expectancy model, just constantly updating, like, anytime a guy moves on the field somewhere.
Although I guess moving on the field would not change it.
But even taking a lead, like taking a one-step larger lead would change your win expectancy after you have like 10 years of Statcast data.
Exactly.
Oh, my gosh.
It would be amazing. Like it would be one thing.
I'm only promising in 8 to 12 weeks.
I'm only promising the batted ball element of this.
However, it would be such a runaway success in the popular culture that then
I think I can incorporate defense in another eight to 12 weeks and then bring it all together
within eight months after that. So you could be looking at this from me as soon as June 2016.
Okay. I'll look forward to that. Anything else come to mind?
Yeah, the other thing I would do is I would ask you what yours is
I would take this data and I would say, Ben, what about you?
What about your answer to the same question?
None of them is as fun as that, probably
I think there are useful things that you could do with it
Like if you have pitch tracking data
and hit tracking data,
then you can pair those things together
in a way that we haven't really been able
to do that effectively so far.
And you could see, you know,
what kind of pitch produces weaker contact
and whether it does so reliably
and then whether there are pitchers who can throw
that type of pitch reliably. Basically, you could not have to have that tiresome debate every time,
like, a starter goes two seasons in a row with a BABIP of like .280 or below, and some
people project him to regress to a league-average BABIP,
and then other people will say,
but he hasn't been a league-average BABIP for the last two or three years.
So he's a guy who can prevent hits or he allows weak contact or something.
And we never know if it's true because just randomly you would expect certain guys
to go a few seasons with below-league-average BABIPs.
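The "just randomly" point can be made concrete with a binomial calculation: treat every ball in play as a coin flip at the league rate and ask how often a perfectly average pitcher would still finish at .280 or below. The Python sketch below assumes a .300 league-average BABIP, 550 balls in play per season, a pool of 100 starters, and independence between seasons. All four are illustrative assumptions, not figures from the episode.

```python
from math import comb


def prob_at_most(max_hits, balls_in_play, true_rate):
    """P(hits <= max_hits) when each ball in play is a hit with true_rate."""
    return sum(
        comb(balls_in_play, k)
        * true_rate ** k
        * (1 - true_rate) ** (balls_in_play - k)
        for k in range(max_hits + 1)
    )


TRUE_BABIP = 0.300     # assumed league-average skill
BALLS_IN_PLAY = 550    # assumed full-season workload
STARTERS = 100         # assumed number of pitchers being watched

# Most hits allowed that still gives a BABIP of .280 or below
# (integer arithmetic avoids floating-point rounding at the boundary).
max_hits = 280 * BALLS_IN_PLAY // 1000

p_one_season = prob_at_most(max_hits, BALLS_IN_PLAY, TRUE_BABIP)
p_two_seasons = p_one_season ** 2  # assumes the seasons are independent
expected_flukes = STARTERS * p_two_seasons

print(f"one season: {p_one_season:.3f}")
print(f"two straight seasons: {p_two_seasons:.3f}")
print(f"expected pitchers out of {STARTERS}: {expected_flukes:.1f}")
```

Under these assumptions a handful of truly average pitchers would be expected to post back-to-back low-BABIP seasons by chance alone, which is why the result by itself cannot settle the debate.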
But this seems like you would finally be able to come closer to an answer of whether there is such a thing as allowing weaker contact,
and if there is, how sustainable or repeatable it is or what the magnitude of it is. And you could, I guess, come up with better pitch value ratings than you currently
have, just because right now you have those pitch value ratings where it's just based on, like,
what the batter did against that pitch. Like, on balls that
he hit, did he get lots of hits on that type of pitch or whatever. And if you had this stuff, then you
could do it on a more granular level, and you could look at, I don't know, what pitches
got hit hardest or fooled guys the least, or you would know exactly what locations in the
strike zone produce the highest expected value for the batter and all that
sort of stuff. It would be a lot of, like, things that we kind of have now, but better. Better and
faster. Like, even the expected value of a plate appearance thing, like if
you just rate guys based on where they hit the ball and how
hard they hit the ball and don't even look at what the outcome of the ball was, whether
it was caught or whatever, even that, like, over two or three seasons, I would think it would
probably be the same for most guys. But the advantage is that you would be able to tell how good a guy is in half
a season or less than half a season or something. So it's all about just being able to
get significant sooner in a smaller sample, I guess. Maybe there would be some injury applications or
fatigue applications if you could tell that a guy was hitting the ball less hard all of a sudden,
then that would support the idea that he's fighting a nagging injury or something.
And then you could actually quantify how much worse he is too.
You wouldn't just have to rely on the projection
that is based on his previous three years or whatever
because you would know that he hits the ball this much less hard now. And you'd be able to look at all the other guys
who hit the ball at that speed and see how good they are. And you'd be able to dock him by the
same amount. So that would be kind of cool. And just like not having to deal with line drive rate
anymore and ground ball rate and fly ball rate.
I mean, you would still describe things as line drives and ground balls and fly balls,
but you wouldn't have to be tied to that kind of arbitrary bucket concept
where you've got borderline batted balls that could go either way.
You could just classify everything by the angle.
He's not a 42% ground ball hitter.
He's a whatever degrees hitter.
So that would be more precise at least.
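The bucket problem is easy to show in a few lines of Python. The cutoffs below (under 10 degrees is a ground ball, under 25 is a line drive) are hypothetical, as are the five batted balls. The point is only that two nearly identical balls can land on opposite sides of a line, while an angle-based summary treats them as the near-twins they are.

```python
def classify(angle_deg, ground_ball_max=10.0, line_drive_max=25.0):
    """Label a batted ball by vertical angle using assumed cutoffs."""
    if angle_deg < ground_ball_max:
        return "ground ball"
    if angle_deg < line_drive_max:
        return "line drive"
    return "fly ball"


# Hypothetical vertical angles, in degrees, for one hitter.
angles = [9.5, 10.5, -4.0, 31.0, 2.0]
labels = [classify(a) for a in angles]

# Bucket-based summary: share of balls labeled as grounders.
ground_ball_rate = labels.count("ground ball") / len(labels)

# Angle-based summary: no cutoffs needed.
mean_angle = sum(angles) / len(angles)

print(labels)
print(f"ground ball rate: {ground_ball_rate:.0%}, mean angle: {mean_angle:.1f} degrees")
```

The balls at 9.5 and 10.5 degrees get different labels despite being one degree apart, and moving the cutoff by a single degree would change this hitter's ground ball rate without changing anything about how he hit the ball.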
And theoretically you could improve defensive stats maybe
if you could tell whether certain pitching staffs are allowing harder batted balls.
And maybe that's not being accounted for appropriately now.
And you could give guys on those teams a defensive boost because they're having harder opportunities.
Like you could adjust a defensive efficiency rating by batted ball strength or something.
So that, you know, the teams that had lousy pitchers that gave up lots of hard contact
would be rated better fielding wise.
Yeah.
Those are things.
Good things.
Yeah.
Rob Arthur suggested that you could also study hot streaks again, and I said that the last thing the world needs is another hot streak study, probably.
But he's right in that I guess it would be a truer test of whether a guy is locked in or not
if you're just basing it on the quality of the contact
as opposed to whether he gets a hit or not,
so that it's not fielding dependent,
and then you could tell whether guys are more likely to hit the ball hard
when they've been hitting the ball hard for a while or something.
But I'm not all that interested in hot streaks anyway.
You've actually
mashed together two characters. Lutz was a character. Frank was a different character.
That's right. Yeah, Frank's the trucker hat guy.
Yeah, Frank Rossitano and John Lutz.
Yeah.
All right. Yeah, it'll be fun to see.
Mm-hmm. Okay.
It'll be nice.
At the very least, it'll be nice to...
I mean, I know that teams will always have other things,
but you figure, like, the tier of information,
there's a big tier between, you know,
what we have and this,
and it is a smaller tier between this
and what teams are going to have beyond this.
And so, like, they can quit being... and I say this in the nicest possible way, so smug.
Yeah, that was, I wrote that in my conclusion that like as excited as I am about the individual things that we could learn from this, I'm sort of more excited about just the precedent of actually getting this stuff
and new technology bringing us closer to teams
instead of widening the gap between front offices and fans, which is nice.
I gather that not all teams are totally thrilled that this stuff is
becoming public now, which I guess means that they still think that they have some sort of
advantage over teams that have not put as much effort into it. And there's still, I mean, there's
still scouting reports and there's still medical records and there's still biomechanical stuff.
And all of that is probably more important than having good batted ball info, I would think.
Does narrow the gap a little bit.
Yeah.
All right.
All right.
So that is it for today.
Support our sponsor, the Play Index at baseballreference.com by going to
baseballreference.com and using the coupon code BP to get the discounted price of $30
on a one-year subscription. We will be back tomorrow.