Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 653: The First Sign of Statcast

Episode Date: April 9, 2015

Ben and Sam banter about time of game and Jon Lester, then discuss what to do with the first release of Statcast data....

Transcript
Starting point is 00:00:00 She's stuck in the thought that she was ready to go She said, I'm a C-point, it's about that incident's over She said, technology, it needs me Oh, if you want me, then I won't tell you Too bad, cause I gotta be by myself And she said, I'm all your fault, I'm leaving. Good morning, and welcome to episode 653 of Effectively Wild, the daily podcast from Baseball Prospectus,
Starting point is 00:00:31 presented by The Play Index at Baseball-Reference.com. I'm Ben Lindbergh of Grantland, joined by Sam Miller of Baseball Prospectus. Hello. Hello, how are you? Okay. Should we do a daily time of game update, or is that not going to be a regular segment? I don't think it should be a regular segment, but I've got one.
Starting point is 00:00:51 Okay. For you. So yesterday's games, eight of them were under three hours. Five of them, I believe, were over three hours, and one was exactly three hours, and one had a rain delay. And so it's hard to know how long it was based on what I'm looking at. But there was a two-hour, 21-minute game, a two-hour, 23-minute game. And generally, if more than half are under three hours, that beats the past.
Starting point is 00:01:21 And so now I've got the median length. This is all by Play Index, by the way, bonus Play Indexing. The median game length, which is only slightly skewed by weather phenomenon, is two hours, 57 and a half minutes. The median game length for games last year through three games was three hours and five minutes. And so seven and a half minutes have been cut. Also, just to see if there's a distorting effect early in the season, I've got the median game length for last year for all team games between 60 and 68, and that is three hours and seven minutes. So, A, something's happening here.
Starting point is 00:02:30 But what it is ain't exactly clear. We know that the first day was all fast and awesome and exceptional. And so it's conceivable that we are simply regressing and the gap is closing. So the gap on day one was 20 minutes between this year and day one last year. And so probably last year's first days were longer than typical, and this year's first days were shorter than typical, and so it was a 20-minute gap.
Starting point is 00:03:08 Just two days later, we're at a 7-minute gap, more or less, basically. This isn't perfect data, but more or less. It's evaporating. We're losing it. We are losing it. My guess is that if we did this in two weeks, we would be at about three minutes, and it will stay there all year. Well, we probably will do this in two weeks. I think Rob Manfred's goal was like 10 minutes or something.
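The gap Sam describes is simple arithmetic on medians, and a short sketch makes the "57 and a half" visible: with an even number of games, the median is the average of the two middle values. The two medians below are the figures quoted in the conversation; the list of individual game lengths is hypothetical and only illustrates where a half-minute median can come from.

```python
from statistics import median

def to_minutes(hours, minutes):
    """Convert an hours/minutes game length to total minutes."""
    return hours * 60 + minutes

# Medians quoted in the episode.
median_2015 = to_minutes(2, 57.5)   # 2015 so far
median_2014 = to_minutes(3, 5)      # 2014 through the same number of games

gap = median_2014 - median_2015
print(f"Median game is {gap} minutes shorter")

# Hypothetical game lengths in minutes (NOT real data). With an even count,
# the median averages the two middle values, hence a half-minute result.
sample_lengths = [141, 143, 165, 172, 177, 178, 181, 186, 190, 204]
print(median(sample_lengths))
```

Using the median rather than the mean is what keeps one rain-delayed or extra-inning game from dominating the comparison.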
Starting point is 00:03:33 As I recall, he just wanted a modest improvement, so three would probably not be satisfactory to him. No, but modest improvements often start with modest steps toward modest improvements. So, I mean, satisfactory would imply, like, it probably wouldn't stop him from working. It might just drive him to work even harder. But I think that at the end of the year, I mean, that slide on the PowerPoint presentation would be in a happy font. Yeah, I think so.
Starting point is 00:04:07 Even just to stop the increase would be a victory of sorts. Yeah, he would definitely do that thing where he would chart the time of games in the PowerPoint, chart the time of game over the years, and it would be up, up, up, and then down. But he would definitely do that thing where instead of setting the Y axis from zero to 200 minutes or whatever, he would set it at 175 minutes. And so it looks like it got down to almost zero. It would be a misleading chart. Yeah. Okay. A misleading graph, I should say.
Starting point is 00:04:40 So I wrote my Jon Lester article, my Jon Lester article that we bantered about on Monday, what would happen if Jon Lester went another season without throwing over to first base and all of the base runners knew that he wasn't going to or acted as if he wasn't going to
Starting point is 00:05:03 and just called his bluff on pickoff attempts. And I went with the simulation route. It was the only way that I could think to do it well. And so I contacted all of the best-regarded, long-lasting baseball simulation games: Baseball Mogul and Out of the Park Baseball and Strat-O-Matic and Dynasty League Baseball and Diamond Mind Baseball. And I asked all of them to do whatever the closest thing that they could come up with was. So all of these games are very sophisticated and they have, you know,
Starting point is 00:05:47 two or three different ratings that go into the pitcher and his impact on the running game. Like, they have a hold rating or they have a pickoff rating or they have both. It's very complicated. And so they all ran some sims for me with Lester, not only with his pickoff attempt rating set to zero so that he would never make a pickoff attempt, but also with the base runner aggressiveness bumped up so that guys would take the maximum advantage of it. And so Diamond Mind was kind of the centerpiece of this. And they ran the simulation 500 times for me. And they actually did it for 2014. So they re-ran his 2014 with these altered Lester settings. And the difference was significant but not season-derailing. He went from 2.46, which was his actual ERA last year, to 3.22. Oh, that's a lot. Yeah. I mean, he's still a really good pitcher, as he was last year, but nobody thinks he's really 2.46. I mean, that was his career-best year.
Starting point is 00:07:10 So let's say, I mean, if his projected ERA this year was like 3.1 or something, that would bump him to 3.9 or 4, which would make him essentially not a very good pitcher in this era. Yeah, right. That's the difference between if he's getting paid $25 million, I mean, what does a pitcher who eats innings and has an ERA of four in this day and age get? Is that Jeremy Guthrie?
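Sam's "3.9 or 4" estimate can be reproduced two ways, and a quick sketch shows that both land in his range. The 2.46 and 3.22 figures are the Diamond Mind numbers quoted above, and 3.10 is Sam's hypothetical projection; whether the penalty should be treated as additive or proportional is an assumption, so both are shown.

```python
actual_era = 2.46      # Lester's actual 2014 ERA, per the episode
sim_era = 3.22         # Diamond Mind average over 500 no-pickoff sims
projected_era = 3.10   # Sam's hypothetical 2015 projection

# Option 1: treat the penalty as a fixed number of runs per nine innings.
additive_penalty = sim_era - actual_era
projected_additive = projected_era + additive_penalty

# Option 2: treat the penalty as a proportional increase in runs allowed.
ratio = sim_era / actual_era
projected_proportional = projected_era * ratio

print(f"Additive:     {projected_additive:.2f}")
print(f"Proportional: {projected_proportional:.2f}")
```

The additive version gives roughly 3.86 and the proportional version roughly 4.06, which brackets the "3.9 or 4" in the conversation.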
Starting point is 00:07:33 Is that 11 and a half? Does that take half his value? Well, he definitely doesn't get $25 million a year. But yeah, so the other sim games didn't run as many simulations. They just, you know, did one or did a handful or whatever, and they can bounce around quite a bit from one simulation to the next. But the three of those games that simulated his 2015 instead of going back and redoing 2014: Baseball Mogul, he went from a 3.15 ERA to a 3.89 ERA.
Starting point is 00:08:08 Out of the Park Baseball, he went from 2.76 to 3.43. And Strat, he actually didn't change much at all. He got better. Tell me he got better. He didn't get better, but he only got like a tenth of a run worse, even though he gave up a bunch more stolen bases. I'm very disappointed, Ben. I don't think you can... You can't have gone to all the credible simulators
Starting point is 00:08:33 if you didn't go to Jon Bois. I should have had Bois build a video game for me. Yeah, I mean, there's got to be a setting on your MLB 2K14 or whatever they call things these days, right? I looked into that a little bit. I don't think so. It didn't seem to me like MLB: The Show was rigorous enough with its pitcher hold rating
Starting point is 00:09:07 that it would do as good a job as these sim games. I think these sim games are pretty sophisticated. I mean, Diamond Mind is Tom Tippett's game, and he did the stuff for Diamond Mind that he then did for the Red Sox. So, you know, I'm always impressed by how much goes into these games, so I kind of buy it. Yeah. Anyway, that article is up at Grantland right now if anyone wants to go check it out in detail. Yeah, I keep going back and forth on whether I think it would ever get to a point
Starting point is 00:09:47 that runners would feel comfortable taking, like, a 30-foot lead. Yeah, I mean, there have been some comfortable-looking leads against him, but not... I mean, I don't know. It's because I went back as I was writing this, and I looked at his last pickoff attempts in 2013 and 2012. Even when he was doing the odd pickoff attempt, it was so weak at that point. It was just like he was bouncing them, he was lobbing them. You almost could have, like, broken on the first move and gotten back by the time the throw
Starting point is 00:10:28 actually got to first base. So if you watched all that video and you looked at the numbers that said he never, ever throws to first, I don't know, you should be pretty confident, although I don't think this is going to last much longer, just based on
Starting point is 00:10:44 his quotes and Maddon's quotes. It sounds like they are having him work on this, and there was some video of him throwing on the side in a pickoff move-like manner. So I'm guessing that this streak of 66 straight starts now without a pickoff attempt, I am guessing that its days are numbered. I would too. It's just knowing what we know. I mean, if he had six months, five months to do something
Starting point is 00:11:12 and didn't, then it's hard to then say that he'll do it in five days. I agree. It just seems overwhelmingly unlikely that he could keep doing this and just not do this very simple thing that has some benefit to him. But like, it would have been impossible for me to accept
Starting point is 00:11:32 that he would come into the season still not throwing to first. Okay. So you probably want to spend the rest of the show talking about the Josh Harrison extension. How does the Josh Harrison extension rank on your intrigue, extension intrigue list? Extremely low. Probably the lowest. The lowest. Okay. So we won't dwell on that. All right, well, I wanted to talk about the Statcast stuff that is out there unexpectedly, and I wrote about this for Grantland also.
Starting point is 00:12:17 Should be up sometime soon, but I don't know whether you've followed this at all. It's been kind of this underground story because it hasn't been announced officially. It's just this data coming out kind of in dribs and drabs, where there is essentially HITf/x data now that is public. It's not HITf/x, it's TrackMan, but it's what teams have had with HITf/x for the last several years. It's batted ball velocity, it's batted ball angle, both vertical and
Starting point is 00:12:55 horizontal, and batted ball distance. And so this was just very quietly added to the feed of stats from Major League Baseball Advanced Media on Opening Day. There was no warning that this was going to happen. And it's been kind of spotty. It seems like they're still working things out. It's not in every park, or at least the data is not coming from every park. And sometimes it will, like, disappear during the day
Starting point is 00:13:26 and then the data will come back a few hours later. So it's not totally clear for sure that we can count on having this all year or that it's just here to stay the way that we can count on having PITCHf/x or whatever the equivalent of that is now. But it seems promising. It's encouraging because I wasn't at all sure what we would get, Statcast-wise.
Starting point is 00:13:54 There's also some Statcast stuff that's showing up in Gameday and At Bat. It's a more stripped-down version of the raw data that has been released. It's just batted ball velocity and distance and I think not the angle stuff. But that was kind of what I was worried about, that that was all we would get, that we would get a kind of watered-down version of the uncut Statcast that teams were getting and that we wouldn't be able to do all the cool analysis that people have done with PITCHf/x because we wouldn't have the data on that level. We would just kind of have superficial stuff or it would be on broadcasts and we'd get to see videos with some numbers from time to time, but you wouldn't be able to do real hardcore analysis.
Starting point is 00:14:47 And at least right now, it seems like maybe we can. There's no indication of base running or defense or anything like that yet, so that might be a while, that might be never, I don't know. But the batted ball stuff alone is intriguing. And so, wait, hang on. So just clarify again. It's batted ball distance and batted ball velocity, but not angle? It is angle. Angle up or down, or angle left and right? Both. Okay. Yeah, in Gameday it doesn't show the angle. Oh. But in the raw data feed that you can look up, like on Baseball Savant, that is already there. There's a leaderboard of the hardest-hit balls, and you can look up all the balls that there's
Starting point is 00:15:39 batted ball data for. And wait, wait, wait. And am I right, did I misstate, or, yes, also distance? Also distance, yes. For everything? For all batted balls? Well, not all batted balls have this yet. It seems like it's still a work in progress. If you ground out to shortstop, what is the distance for that labeled as? Is it a foot? Is it 107 feet? I will look up an example in my spreadsheet of batted ball data
Starting point is 00:16:14 and find a ground out to shortstop. Let's see. So Carlos Gonzalez grounds out sharply to second base, and the distance is 88 feet. Interesting. And so then if it were to get past second base, then it would be, you know, 200 feet. Yeah, I don't know how that works.
Starting point is 00:16:40 I don't know whether it just detects when the forward progress stops or when it goes into a glove or what. So do you have, like, is there like a line drive double or a line drive triple in there? There should be. Let's see. All right. So double, Carl Crawford double, and it is a line drive and the distance is 290 feet. But I don't know whether it... I don't know, to the wall or whatever. Yeah, actually there's
Starting point is 00:17:18 oh, there's a, let's see, there's a play description field also. Carl Crawford doubles on a sharp line drive to center fielder Wil Myers. So that's probably not to the wall. Let's look for another one. It's also probably not 290 in the air, probably. If it's a sharp liner, to me, sharp liner is low. Sharp is usually low, right? Yeah. It probably didn't travel 290 feet in the air.
Starting point is 00:17:46 So it's probably giving you where the ball is fielded. Mm-hmm. Let's see. Freddie Freeman doubles on a ground ball to right fielder Giancarlo Stanton. 95 feet. Huh. That's interesting. That's odd. So that one is clearly
Starting point is 00:18:01 not where the ball stopped. Sharp line drive double to Nori Aoki, 306 feet. And that one is. Sam Fuld triples on a ground ball to Shin-Soo Choo, 131 feet. So maybe it's where it lands. Well, some of those are where they land, but some of them are probably not. I mean, you don't see anything in there that has a distance of six feet or less, right? I mean, there's no ground out to third distance, you know, one of them.
Starting point is 00:18:32 It looks like there is. Let's see if I sort. Here's one. Chris Iannetta, line. Wait, no. I got that wrong. Distance, sort smallest to largest: 2.3 feet.
Starting point is 00:18:47 But that could be like, I mean, you would expect a lot of them if they were counting grounders as where it hits the ground. Yeah, right. Adrian Beltre grounds into a force out, 3.4 feet. So I don't know, I mean, I'm not
Starting point is 00:19:04 They could still be working the kinks out of this thing for all I know, so I'm not sure. And tell me, are you sure that they've got drabs in there? I mean, I know they've got dribs, but are there drabs? Well, it's only
Starting point is 00:19:19 one kind of thing, so maybe it's only dribs. Only dribs so far, so we can still look forward to drabs. Yeah, there's definitely more to look forward to. I don't know what exactly, but Cory Schwartz of MLB Advanced Media has been tweeting lots of cryptic things about how there's more coming and they're going to be adding more to the feed, and there's going to be more and more as the season goes on. So I guess that they will incorporate base running and defense and all those sorts of things.
Starting point is 00:19:50 But anyway. I've waylaid you, yes. Tell me where you were going with this. Well, I wrote about just the things that I'm looking forward to using this for, or smarter people than I using it for. And I am wondering what you would like to see it used for. Other than just, I mean, it's always fun to just sort by, you know, the guys who hit the ball hardest, that's fun.
Starting point is 00:20:22 Or the guys who hit the ball weakest, and those kinds of things. But there are things that we should theoretically be able to find out with this information that we couldn't have found out before. Well, I mean, this would have been a more interesting answer like a year ago or two years ago, but I've always wanted to have a, you know, very well reliable to me. I've always wanted to have a results independent batting stat. Right. And so that seems to be like eminently plausible at this point, like extremely plausible, like
Starting point is 00:21:00 almost too easy to give as an answer. So that is, I mean, I think that is the biggest thing. I mean, it's the most obvious, but it's also probably the number one way that you would use this. Yeah. And to some degree it's already been scooped, I guess, not with this, and maybe it'll get refined, and maybe it'll get better, but Ben Jedlovec unveiled a methodology, I guess, for how to do this already at SABR Analytics. Right. Well, yeah, I mean, teams were doing this six years ago. Exactly, teams were doing this six years ago? Yeah. Geez, I thought maybe three years ago.
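A results-independent batting stat of the kind Sam is asking for can be sketched in a few lines: look up how batted balls with similar velocity and vertical angle have fared across the league, and credit the hitter with that average regardless of what happened on his particular ball. This is only one plausible approach, not the method presented at SABR Analytics or anything a team uses; the 10-unit bucket sizes, the function names, and all of the data below are hypothetical.

```python
from collections import defaultdict

def bucket(velocity_mph, angle_deg):
    """Group a batted ball into a 10 mph by 10 degree cell (arbitrary sizes)."""
    return (int(velocity_mph // 10) * 10, int(angle_deg // 10) * 10)

def build_value_table(league_balls):
    """Average outcome value per cell from (velocity, angle, value) records."""
    totals = defaultdict(lambda: [0.0, 0])
    for velocity, angle, value in league_balls:
        cell = totals[bucket(velocity, angle)]
        cell[0] += value
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}

def expected_value(hitter_balls, table, default=0.0):
    """Mean expected value of a hitter's contact, ignoring actual results."""
    if not hitter_balls:
        return default
    values = [table.get(bucket(v, a), default) for v, a in hitter_balls]
    return sum(values) / len(values)

# Hypothetical league sample: (velocity mph, vertical angle deg, 1 = hit, 0 = out).
league = [(102, 15, 1), (104, 18, 0), (78, -5, 0), (72, -8, 0), (75, -2, 1)]
table = build_value_table(league)

# Two hard line drives get the same credit whether or not they were caught.
print(expected_value([(105, 12), (101, 19)], table))
```

A real version would use run values rather than a hit/out flag and would smooth across neighboring cells, since sparse buckets are noisy.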
Starting point is 00:21:53 No, they've had this information since they've had HITf/x, going back to 2008. I think maybe they got it in 2009 and then it was, like, filled in for 2008 later or something, but they've had it at least since 2009, which is, I mean, it's sort of sad to think about. I'm all excited about this new data that we get to play with, and to anyone who's been with a team for the last several years, it would seem like the most uninteresting thing. All right. So from a fun perspective, this would just be a toy, not really relevant for data. But what I would do probably if I had a job that, if they said, take all this and go away for two months and come back with something frivolous, I would want to have a to-the-microsecond... Okay, so you know how sometimes you'll see, who's that guy,
Starting point is 00:22:48 Frank Luntz? Is that his name? The consultant for politics? 30 Rock? No, no, no. He's like a political consultant. Yeah, Frank Luntz. He's a political consultant, famous political consultant, who does conservative political messaging. And so during debates, he'll be on cable TV and he'll have these focus groups of people who are like every two seconds, they're registering their mood as they watch the debate. And you see this moving line and at any given second, you can see the general mood that they feel toward the candidate, right?
Starting point is 00:23:28 And I would like to have that as a win expectancy thing where there is a line running throughout the game in which the win expectancy is to the microsecond being updated during the play itself. So if you hit a line drive or a ground ball or a fly ball at a certain angle, at a certain speed, a certain velocity, you can therefore use a probabilistic measure to figure out the likelihood of each outcome of that ball based on where it is and where the defense is and everything like that.
Starting point is 00:24:03 And so you could see these win probability spikes and valleys in the middle of a play as the runner, you know, as the fielder might get close to the ball but then boot it, or as the ball that might get caught or might not is either caught or trapped. And with every pitch, with every second of the, maybe even within the pitch, perhaps, but that would be more complicated. For now, we'll just talk about batted balls. But as soon as the ball is off the bat, you'd see that win expectancy shoot in some direction and then move throughout the play until the play is resolved. And then at the end of the play, you would have your classic win expectancy that we all know and see when we click refresh on the page.
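The moving line Sam describes boils down to a weighted average: at each instant, multiply the probability of every way the play could end by the win expectancy that outcome would leave, and add them up. The sketch below assumes a separate model already supplies the mid-play outcome probabilities (that is the hard part and is not shown); every number and name here is hypothetical.

```python
def in_play_win_expectancy(outcome_probs, we_after_outcome):
    """Probability-weighted win expectancy at one instant during a play."""
    total = sum(outcome_probs.values())
    return sum(
        (prob / total) * we_after_outcome[outcome]
        for outcome, prob in outcome_probs.items()
    )

# Hypothetical win expectancy for the batting team after each possible result.
we_after = {"out": 0.40, "single": 0.55, "double": 0.65}

# Hypothetical snapshots of one play: ball off the bat, fielder closing in,
# then the ball is booted and the batter is safe at first.
snapshots = [
    {"out": 0.50, "single": 0.30, "double": 0.20},
    {"out": 0.80, "single": 0.15, "double": 0.05},
    {"out": 0.00, "single": 1.00, "double": 0.00},
]

line = [in_play_win_expectancy(s, we_after) for s in snapshots]
print([round(value, 3) for value in line])
```

The last point on the line equals the classic post-play win expectancy, because once the play is resolved one outcome has probability 1.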
Starting point is 00:24:49 But I want to have the moving line win expectancy. That's what I would do with this. That is my, if anybody wants to buy me out, I figure that's about a, I don't know, I'll say an eight-to-12-week project. And upwards of 70 people would use this sometimes on a lark for five minutes. Unfortunately, they wouldn't be able to watch the play while they were watching the moving line, so they'd have to choose one or the other. The moving lines would be on the screen and you
Starting point is 00:25:19 would be able to pick up the up and down movement as it happened. So nailed it. Nailed it. Perfect answer. Thank you. Thank you. That's a fun one. It is a fun one. Yeah, you would think that if Statcast all becomes public and there's a big sample size and it's
Starting point is 00:25:40 a decade down the line and there's all the processing power that you could possibly want, you would be able to have, like, the perfect win expectancy model, just constantly updating, like, anytime a guy moves on the field somewhere. Although I guess moving on the field would not change it. But even taking a lead, like taking a one-step larger lead, would change your win expectancy after you have, like, 10 years of Statcast data. Exactly.
Starting point is 00:26:13 Oh, my gosh. It would be amazing. Like it would be one thing. I'm only promising in 8 to 12 weeks. I'm only promising the batted ball element of this. However, it would be such a runaway success in the popular culture that then I think I can incorporate defense in another eight to 12 weeks and then bring it all together within eight months after that. So you could be looking at this from me as soon as June 2016. Okay. I'll look forward to that. Anything else come to mind?
Starting point is 00:26:42 Yeah, the other thing I would do is I would ask you what yours is. I would take this data and I would say, Ben, what about you? What about your answer to the same question? None of them is as fun as that, probably. I think there are useful things that you could do with it. Like if you have pitch tracking data and hit tracking data,
Starting point is 00:27:07 then you can pair those things together in a way that we haven't really been able to do that effectively so far. And you could see, you know, what kind of pitch produces weaker contact and whether it does so reliably, and then whether there are pitchers who can throw that type of pitch reliably. Basically, you could not have to have that tiresome debate every time
Starting point is 00:27:34 like, a starter goes two seasons in a row with a BABIP of like .280 or below, and some people project him to regress to a league average BABIP and then other people will say, but he hasn't been a league average BABIP for the last two or three years. So he's a guy who can prevent hits or he allows weak contact or something. And we never know if it's true because just randomly you would expect certain guys to go a few seasons with below league average BABIPs. But this seems like you would finally be able to come closer to an answer of whether there is such a thing as allowing weaker contact
Starting point is 00:28:15 and if there is, how sustainable or repeatable it is or what the magnitude of it is. And you could, I guess, come up with better pitch value ratings than you currently have, just because right now you have those pitch value ratings where it's just based on, like, you know, what the batter did against that pitch. Like, on balls that he hit, did he get lots of hits on that type of pitch or whatever. And if you had this stuff, then you could do it on a more granular level and you could look at, I don't know, what pitches got hit hardest or fooled guys the least, or you would know exactly what locations in the strike zone produce the highest expected value for the batter and all that sort of stuff. It would be a lot of, like, things that we kind of have now but better, better and
Starting point is 00:29:13 faster. Where, like, even the expected value of a plate appearance thing, like, if you just rate guys based on where they hit the ball and how hard they hit the ball and don't even look at what the outcome of the ball was, whether it was caught or whatever, even that, like, over two or three seasons, I would think it would probably be the same for most guys. But the advantage is that you would be able to tell how good a guy is in half a season or less than half a season or something. So it's all about just being able to get significant sooner, in a smaller sample. I guess maybe there would be some injury applications or fatigue applications. If you could tell that a guy was hitting the ball less hard all of a sudden,
Starting point is 00:30:07 then that would support the idea that he's fighting a nagging injury or something. And then you could actually quantify how much worse he is too. You wouldn't just have to rely on the projection that is based on his previous three years or whatever because you would know that he hits the ball this much less hard now. And you'd be able to look at all the other guys who hit the ball at that speed and see how good they are. And you'd be able to dock him by the same amount. So that would be kind of cool. And just like not having to deal with line drive rate anymore and ground ball rate and fly ball rate.
Starting point is 00:30:46 I mean, you would still describe things as line drives and ground balls and fly balls, but you wouldn't have to be tied to that kind of arbitrary bucket concept where you've got borderline batted balls that could go either way. You could just classify everything by the angle. He's not a 42% ground ball hitter. He's a whatever degrees hitter. So that would be more precise at least. And theoretically you could improve defensive stats maybe
Starting point is 00:31:24 if you could tell whether certain pitching staffs are allowing harder batted balls. And maybe that's not being accounted for appropriately now. And you could give guys on those teams a defensive boost because they're having harder opportunities. Like you could adjust a defensive efficiency rating by batted ball strength or something. So that, you know, the teams that had lousy pitchers that gave up lots of hard contact would be rated better fielding-wise. Yeah. Those are things.
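Ben's idea of adjusting defensive efficiency for contact quality can be illustrated with a toy comparison: two defenses convert the same share of balls in play into outs, but one faces much harder-hit balls. The sketch assumes each ball in play comes with an out probability estimated from its velocity and angle; that estimate, the function names, and all of the numbers are hypothetical.

```python
def raw_defensive_efficiency(balls_in_play):
    """Share of balls in play converted into outs, ignoring difficulty."""
    return sum(was_out for _, was_out in balls_in_play) / len(balls_in_play)

def outs_above_expected(balls_in_play):
    """Outs made minus the outs an average defense would be expected to make."""
    return sum(was_out - out_prob for out_prob, was_out in balls_in_play)

# Each record: (estimated out probability from contact quality, 1 if out else 0).
soft_contact_team = [(0.9, 1), (0.9, 1), (0.8, 1), (0.7, 0)]
hard_contact_team = [(0.6, 1), (0.5, 1), (0.4, 1), (0.3, 0)]

for name, balls in [("soft", soft_contact_team), ("hard", hard_contact_team)]:
    print(name, raw_defensive_efficiency(balls), round(outs_above_expected(balls), 2))
```

Both teams show the same raw efficiency of 0.75, but the team behind the hard-contact staff comes out well ahead once the difficulty of its chances is credited.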
Starting point is 00:32:03 Good things. Yeah. Rob Arthur suggested that you could also study hot streaks again, and I said that the last thing the world needs is another hot streak study, probably. But he's right in that I guess it would be a truer test of whether a guy is locked in or not if you're just basing it on the quality of the contact as opposed to whether he gets a hit or not, so that it's not fielding dependent, and then you could tell whether guys are more likely to hit the ball hard
Starting point is 00:32:37 when they've been hitting the ball hard for a while or something. But I'm not all that interested in hot streaks anyway. You've confused, you've actually mashed together two characters. Lutz was a character, Frank was a different character. That's right. Yeah, Frank's the trucker hat guy. Yeah, Frank Rossitano and John Lutz. Yeah. All right. Yeah, it'll be fun to see. Mm-hmm. Okay. It'll be nice. At the very least, it'll be nice to... I mean, I know that teams will always have other things,
Starting point is 00:33:11 but you figure, like, the tier of information, there's a big tier between, you know, what we have and this, and it is a smaller tier between this and what teams are going to have beyond this. And so, like, they can quit being... and I say this in the nicest possible way, so smug. Yeah, that was, I wrote that in my conclusion that like as excited as I am about the individual things that we could learn from this, I'm sort of more excited about just the precedent of actually getting this stuff and new technology bringing us closer to teams
Starting point is 00:33:55 instead of widening the gap between front offices and fans, which is nice. I gather that not all teams are totally thrilled that this stuff is becoming public now, which I guess means that they still think that they have some sort of advantage over teams that have not put as much effort into it. And there's still, I mean, there's still scouting reports and there's still medical records and there's still biomechanical stuff. And all of that is probably more important than having good batted ball info, I would think. Does narrow the gap a little bit. Yeah.
Starting point is 00:34:37 All right. So that is it for today. Support our sponsor, the Play Index at Baseball-Reference.com, by going to Baseball-Reference.com and using the coupon code BP to get the discounted price of $30 on a one-year subscription. We will be back tomorrow.
