Rates & Barrels - The future of baseball research and how to get into a MLB front office with xSTATS' Andrew Perpetua

Episode Date: June 4, 2021

Eno Sarris welcomes Andrew Perpetua, founder of xSTATS, to the show. Andrew explains his unorthodox path into the Mets' front office after an accident kept him out of medical school. Eno and Andrew also discuss why he started xSTATS, what the future of baseball research is, and where he's headed next.

Rundown
2:27 Andrew's education background and journey into baseball
10:50 Realizing that the data is not perfect
17:25 The three types of people that do baseball analytics
21:12 How baseball front offices need to change
31:45 What Andrew learned once he was "inside the game"
34:00 Improving technology for visual tracking
40:20 The next frontier is meshing all the technology together
47:00 What's coming next for Andrew

Follow Eno on Twitter: @enosarris
Follow Perpetua on Twitter: @AndrewPerpetua
E-mail: ratesandbarrels@theathletic.com
Subscribe to the Rates & Barrels YouTube channel: https://www.youtube.com/c/RatesBarrels

Transcript
Starting point is 00:00:00 Welcome to Rates and Barrels, presented by Topps. Check out Topps Project 70, celebrating 70 years of Topps baseball cards. This is Eno Sarris hosting. It's a little bit weird to open it up. But today, we have a guest, a good friend of mine, Andrew Perpetua, who has worked within baseball and without. And I'd like to welcome you to Rates and Barrels, Andrew. Thank you for having me, Eno. It's great to talk to you. You've got an interesting background.
Starting point is 00:00:56 And I wanted to focus on that to open the show. Some of the people listening to this would like to get into baseball. You worked with the Mets for a bit. But before that, how did you get that opportunity? And what's your background educationally? What sort of proficiencies do you have that led to that moment? Well, my educational background is a little weird. I'm a high school dropout. I went to college when I was 17, to a little college named Simon's Rock. And it's for people who don't graduate from high school.
Starting point is 00:01:33 A lot of my peers were like 14, 15 years old. So there I studied psychology and neuroscience and chemistry. I tried to focus on the more hard sciences, so that's sort of my background, working in lab type things and stuff like that. So were you using like SPSS back then? No. No. That's what I used when I was studying psychology. That was the sort of stats.
Starting point is 00:02:13 The type of psychology I did wasn't really stats-based. The psychology I did was more like the clinical type stuff. It was more in the chemistry and the physics that I did a little bit of stats. Honestly, I didn't really even do that much stats in school. Statistics is probably my weakest subject. So, getting into baseball, it's kind of a long story and I'm going to skip all of it, but long story short, I very badly hurt my hand and as a result couldn't go to medical school. Oh, I never knew that about you. Yeah, I accidentally cut off a few of my fingers in my senior year of college, right before finals.
Starting point is 00:03:06 It was pretty bad. Makes it hard to be a doctor, I guess. Yeah, I mean, by the time I could go to medical school, it took me like a year to bounce back, honestly. What did you do to yourself? Like, a large chunk of my hand fell off, you know? Doing what? I got stabbed by a knife.
Starting point is 00:03:26 What? Yeah. Is this not a story we can get all the way into? Okay, I guess we've covered all of it at this point. I got stabbed. In a fight or something? Not really. It was more of just drunken happenstance. Oh, you didn't play the game where you put your hand... I wasn't even the drunk one, you know. I was the sober one. Oh no. Oh God, that's terrible. Anyway, baseball's gain, I guess. Somehow that led to baseball. Yeah, so I missed out on going to medical school, and I didn't really know what to do for a while, so I just started doing baseball stuff for fun. I was listening to Ted Berg and Toby Hyde on their podcast back in the day. And really the way I got into baseball was that I tracked all of the stats for the Mets by hand one season. It was like 2012. I would watch the game and I would mark down like
Starting point is 00:04:46 whether it was a line drive or a fly ball or what happened, and what the count was, and how many pitches were thrown and what pitches were thrown in what order. I just did it all by hand, you know, just the old-fashioned way. And about halfway through the season I said, I have all these stats, I might as well put them in a spreadsheet or something. And I started calculating just basic stuff, like BIP and stuff. And that was fine. But then I started doing, yeah, I started doing like SIERA. And I realized that my SIERA for pitchers was totally different than SIERA on FanGraphs. I was thinking, we watch the same games, how can we be this different? And the reason is because when I saw a ball, I would say this was a line drive, and they would
Starting point is 00:05:40 say it may have been a ground ball or a fly ball. So there's all this ambiguity and subjectivity around what batted balls even are. And it got me really frustrated because I didn't know how to treat the data at that point. Because I thought I was being pretty reasonable. And I'm sure they think they're reasonable. And we come to totally different things. With some of the stats, I would have a pitcher as having an above average season and they would have him as a below average season. So how do I even make sense of that? Because I don't think I made mistakes and I don't think they made mistakes, but we end up with these totally different interpretations of what was going on in the field.
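The disagreement Andrew describes, two people labeling the same plays differently, can be put into numbers. The following is a minimal sketch, not his actual workflow: the function name `agreement_report`, the label codes, and the six sample plays are all made up for illustration.

```python
from collections import Counter


def agreement_report(scorer_a, scorer_b):
    """Compare two scorers' batted-ball labels for the same list of plays.

    Returns the share of plays where the labels match, plus a count of
    each (label_a, label_b) pair where they differ.
    """
    if len(scorer_a) != len(scorer_b) or not scorer_a:
        raise ValueError("both scorers must label the same non-empty set of plays")
    pairs = Counter(zip(scorer_a, scorer_b))
    matches = sum(n for (a, b), n in pairs.items() if a == b)
    disagreements = {pair: n for pair, n in pairs.items() if pair[0] != pair[1]}
    return matches / len(scorer_a), disagreements


# Hypothetical labels for six plays (LD = line drive, FB = fly ball, GB = ground ball).
mine = ["LD", "FB", "GB", "LD", "FB", "GB"]
theirs = ["LD", "LD", "GB", "GB", "FB", "GB"]
rate, diffs = agreement_report(mine, theirs)
```

The `diffs` dictionary shows which label pairs get confused with each other, which is the kind of pattern that would reveal whether the two scorers disagree at random or systematically.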
Starting point is 00:06:30 Did you find that, you know, historically there was some analysis that suggested that stringers, there's a stringer bias that, you know, if it was closer to being a hit, if it found grass, it was a line drive. If it, you know, if it found glove, it was a ground ball. Did you find anything like that when you sort of tried to analyze your own work? It was harder back then because I didn't have access to like the tools we have today. Like today I could go to like what was MLB called like the film box or whatever. I could go and see the play. Whereas back then there like was barely video of the games.
Starting point is 00:07:06 Like the only, like I would have to like watch a game and like go through and not, I didn't have time for that. It's a lot of games, you know, you might've had to like record it. Right. Cause it wasn't like sitting somewhere for you to go look at it again.
Starting point is 00:07:19 It was just more difficult back then. So I was trying to figure out whether there was a bias towards line drives or fly balls, and I don't think there really was a bias. We were just doing it differently. Since then, I know that balls that go down the lines, like just over the first baseman or just over the third baseman, that area of the infield, I think, is where a lot of disagreement comes from. Because I would say, like, if a
Starting point is 00:07:54 ball goes in the air and lands there, I would call it a fly ball or a pop-up, but sometimes they call it a ground ball. And I was like, it went a hundred feet in the air. Wow. So it's just weird. I would have thought line drive. It's just weird. Sometimes, with some of those balls, it gets pretty chaotic. And I think Jeff Zimmerman did an article on that at one point where he took video and asked, what do you think the stringer called this? Right. Yeah.
Starting point is 00:08:26 No, I remember my first sort of run-in with that sort of thing was looking at, I think it maybe is BIS, has like a list of descriptions they can put on a ball, and some of them were like fliner, and one was like fliner fly, and then another one was like fliner ground. It's like, what is that? And I just tried to watch the game. I was like, if I have 20 different ways that I could describe this ball, I could probably see identical balls and call them five different things. Yep. Yeah, honestly. And so when I was doing that, I just got frustrated and I was,
Starting point is 00:09:08 I kept thinking to myself, if there were some objective measure, like if I knew how hard the ball was hit or what the angle was or something. And that was like 2012. So I waited like three years, and then in 2015 Statcast came out. And I remember, I think it was,
Starting point is 00:09:34 Alan Nathan tweeted out something and I saw, wait a second, the data is available, I can just download it. So I just started messing with it, and I started redoing the work again, basically, except with Statcast data. And at first I thought it was going well, but then I realized that Statcast is just as subjective, just because it was missing some ground balls and missing pop-ups and stuff. And there's so much stuff.
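Redoing hand-scored labels with tracked data could look something like the sketch below. It assumes the commonly cited launch-angle buckets (under 10 degrees ground ball, 10 to 25 line drive, 25 to 50 fly ball, above 50 pop-up); the exact handling of the boundaries and the function name `classify_batted_ball` are assumptions for illustration, not something stated in the conversation. A missing reading is returned as "untracked", which reflects the problem Andrew raises: the balls the system fails to measure do not get a label at all.

```python
def classify_batted_ball(launch_angle_deg):
    """Bucket a batted ball by launch angle in degrees.

    Thresholds are the commonly cited ones and are treated here as an
    assumption; None stands for a ball the tracking system missed.
    """
    if launch_angle_deg is None:
        return "untracked"
    if launch_angle_deg < 10:
        return "ground ball"
    if launch_angle_deg < 25:
        return "line drive"
    if launch_angle_deg <= 50:
        return "fly ball"
    return "pop-up"


# Hypothetical launch angles, including one ball with no reading.
labels = [classify_batted_ball(a) for a in (5.0, 15.0, 30.0, 60.0, None)]
```

A rule like this removes the scorer's judgment, but it does not remove the subjectivity entirely: any ball that goes untracked has to be filled in some other way, and that choice is again a human one.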
Starting point is 00:10:03 It's like, there's... Yeah. There's missing balls, and there's so much measurement error. Yeah, I think that's one thing that doesn't come across that well from, you know, league sources or just people that use the data. There's a lot of errors in there, even today, right? Yeah. I mean, honestly, I haven't worked that much with Hawkeye, just because I've been busy with getting my systems up to date. I'm going to talk about that later. But with TrackMan, I think the error on launch angle was like four degrees, like plus
Starting point is 00:10:42 or minus four or something like that. That's a lot more. And with exit velocity, I think it was plus or minus two. So two whole ticks. So just think, you could have a ball that was hit 93 at 25 degrees and it would register anywhere from 91 to 95 and anywhere from like 18 to like
Starting point is 00:11:09 34 degrees. It would just be kind of random. Was it pervasive, or was it more in certain parks, or was it more in certain angles? I mean, it was more in certain angles, right? Yeah, because the radar can only really measure whether it's coming towards you or away from you. It can't measure in any other axis. So the further you get from that perpendicular line, the worse the measurement was. And also, it just wasn't accurate enough at the end of the day. And that's why we have Hawkeye, because Hawkeye is so much better. And honestly, I don't know what the measurement error on Hawkeye is. I remember I heard a theoretical estimate a long time ago, like two plus years ago,
Starting point is 00:11:57 that there was a theoretical estimate of fractions of a centimeter of error. And I don't know if that's true. It may have just been way overly optimistic. I'm not really sure. But if I think, I watch tennis and it's almost never overturned when it comes to serves. And they're showing the ball. They're talking about fractions of centimeters
Starting point is 00:12:22 when the ball just barely grazed the line or whatever. The difficulty in baseball is that, in tennis, the playing field is so constrained that you can see. Just train it on this one box. You can see if something goes wrong and correct it. Whereas in baseball, there's so much space that it's hard to test. That's why we focus so much on pitches, because the pitching is constrained. Generally you're going to throw within the plate. The plate is like the serving box, right? It's a nice little box. And also people tend to care about pitching more. So if you say that this breaking ball has a break of 18 inches and
Starting point is 00:13:12 someone looks at it and says, that's not a breaking ball, it's a fastball, and you're like, wait a second. So you can test for stuff like that. Whereas a fly ball, you may completely mismeasure a fly ball, and who is there to say that you got it wrong? Like, oh yeah, that was 206 feet. No way, that was 201. Distance is one thing you can test. Right. But spin rate, how do you measure spin rate? Without measuring it with a camera or something, how do you see it? You can't tell. Yeah, I've been writing about how ride, I think, is even hard to scout for. You know, I've developed this Stuff metric with Max Bay, and it hates Trevor Rogers'
Starting point is 00:14:02 fastball and loves Julio Urías's. And if you watch both of them, it's very hard to be like, oh yeah, Urías's fastball's way better. Yeah. And with pitching,
Starting point is 00:14:14 however hard you think it is with pitching, with fly balls it's like 10 times harder. And ground balls, how do you treat a ball after it bounces has got to be one of the biggest questions in baseball, because nobody has any clue how to do it. At least
Starting point is 00:14:34 I don't know anybody who even tries. How do you track it? Once it bounces, does it get a new set of vectors? Does it have a new exit velocity? The fields aren't necessarily flat. You don't get perfect bounces. They bounce in any sort of direction, which can change velocity. Some are wetter than others. More importantly, how do you measure defense on a ball that bounced? How do you measure how far the fielder went?
Starting point is 00:15:04 How hard the play was even to get to? Like, it's really hard. Yeah, because there's even batted ball spin, right, that we don't have in the public, which means that someone could hit identical balls towards the third baseman at the same sort of horizontal spray angle, right, at the same sort of vertical angle, but have totally different side spin on it. And that means that once it bounces, it's going to do something totally different.
Starting point is 00:15:29 So those two balls that went in the same sort of area at the same sort of speed might hit the ground and do totally different things. And yet if you treat them the same way, then you're not really getting everything you can out of that defensive stat. Absolutely. And also the
Starting point is 00:15:45 ground isn't flat, so if the ball lands a centimeter to the left, it might bounce differently, you know. Right, like if it bounces in the baseline versus on the grass. Yeah. And so all these sort of uncertainties are kind of what I have tried to focus on in my analysis, mostly because I just kind of started out this way. I just did a little baseball project and it turned out to be a big study in measurement uncertainty. And since then, I've just grown more and more fascinated with just the uncertainty of the measurements. And early on, I was told that there are three types of people who do baseball analytics.
Starting point is 00:16:41 There's one person, they take the data and they just work with it at face value. And I think that's most people. The second group of people are the people who see the data and know that there's errors in it, and they might even go and correct the errors. Or at least acknowledge them, have some sort of error bands around their findings or whatever. And the third group of people are the ones who see the errors and use those errors to their benefit. So they're the ones who... If I can clean these up better than everybody else, then I'm going to have better numbers.
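The "error bands" idea can be illustrated with the numbers quoted earlier in the conversation: a ball hit 93 mph at 25 degrees, measured with roughly plus or minus 2 mph and plus or minus 4 degrees of error. The sketch below is only an illustration under loud assumptions: the error is modeled as uniform noise (the conversation does not say how it is distributed), the launch-angle buckets are the commonly cited ones, and the names `simulate_readings` and `angle_bucket` are invented here.

```python
import random


def simulate_readings(true_ev, true_la, ev_err=2.0, la_err=4.0, n=10_000, seed=0):
    """Draw n noisy (exit velocity, launch angle) readings of one batted ball.

    Noise is uniform within +/- ev_err mph and +/- la_err degrees, which is
    an assumption made for this sketch.
    """
    rng = random.Random(seed)
    return [
        (true_ev + rng.uniform(-ev_err, ev_err), true_la + rng.uniform(-la_err, la_err))
        for _ in range(n)
    ]


def angle_bucket(launch_angle_deg):
    """Commonly cited launch-angle buckets (assumed, not from the source)."""
    if launch_angle_deg < 10:
        return "ground ball"
    if launch_angle_deg < 25:
        return "line drive"
    if launch_angle_deg <= 50:
        return "fly ball"
    return "pop-up"


readings = simulate_readings(93.0, 25.0)
true_label = angle_bucket(25.0)
# Share of noisy readings that land in a different bucket than the true ball.
flipped = sum(angle_bucket(la) != true_label for _, la in readings) / len(readings)
```

Because 25 degrees sits right on the line-drive/fly-ball boundary, a large share of the simulated readings end up with the "wrong" label, which is the practical meaning of taking the data at face value versus carrying the uncertainty along with it.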
Starting point is 00:17:16 Not even so much cleaning. It's just that we know that all this data is fake, but they don't know it's fake. So we can do a trade, or we can go sign an amateur player, or we can do something to take advantage of us knowing that the data is wrong somehow. On a very basic level, the idea could be something like: we know this guy hits a lot of pop-ups, but TrackMan never catches them, so we know this guy is not as good as his TrackMan numbers, so we'll trade him. I think one TrackMan example was that early on, in like the first year or two of TrackMan, it had a lot of trouble tracking sliders.
Starting point is 00:18:02 So if you knew then that it couldn't track a slider, and you know that teams are maybe putting a lot of weight on it... Like the first year, I don't think anybody used TrackMan, but towards the second year maybe they started. There were definitely teams, like the Astros were selecting for spin rate, right? That was definitely something that we heard early on with TrackMan. So if you knew that there's that problem, you can use that to your advantage. You could say, this pitcher, we think he has a bad slider, so when we trade for him, we're going to offer less and they're going to accept it.
Starting point is 00:18:37 Even though we know he has a good slider. Right. There's things like that. Totally. And from that effort, xSTATS was born. Yes. So that was my attempt at xSTATS. It started out as just trying to have a way of digesting the Statcast data in an easier to recognize way. I was trying to
Starting point is 00:19:08 distill it down to like the old stats. And then it... But in a way, that was an important thing that you did. I mean, no matter that it was a personal project that you were working on yourself, and that maybe it was just a labor of love or something that you didn't necessarily think of as a stepping stone towards anything. It was important. It was your version of publishing, right? It was your version of showing your work and establishing yourself as someone that could be hired. Yeah. I ask that because my next question is, what would you tell someone that wanted to follow in your footsteps? You know, I don't know that they're going to necessarily track every game and do all
Starting point is 00:19:46 that stuff, but, but the publishing was important, right? Yeah, definitely. You have to get out there. You have to publish.
Starting point is 00:19:52 And I think xSTATS did two things, where first it got my name out there.
Starting point is 00:20:01 And second, it let me do favors for people, and doing favors for people is probably the best thing you could do. But it's terrible if you think about it. You know, this is something that baseball does, which is, what can you do for me, and can you do it free? Yeah, well, there's that. And there are more hard-liners out there, and I respect the attitude that, you know, I will not work for free, I will not write for free, I will not do data work for you for free. I respect that, but I don't think that
Starting point is 00:20:37 it's just not realistic to me, because the way baseball works is you do favors for people long enough until they pay you. Yeah. That's just the reality of it. And honestly, I don't know what you can do to change it. The way that I was doing it was that most of the favors I was doing were just things that I'd already done anyway. So a lot of it was just kind of like, I already did 90% of the work. I just have to maybe run a query and output the query to a CSV or whatever. So not even ten minutes sometimes. Yeah. So those were a lot of the sort of favors I was doing, and also doing some graphics, and I
Starting point is 00:21:27 just like making basic graphics, so it was kind of fun. So there's some very concrete stuff here, though, because you're talking about queries, you're talking about graphics, you're talking about publishing. Those are things that I tell people, but I think that you have a little bit more insight into the specifics of it, like even what languages are being used in front offices and what languages they should use. Those are two different things, by the way. What do you tell someone? Those are two different things. What languages they use and what they should use.
Starting point is 00:22:01 I've talked to you enough. I know. But actually, so here's my question. We've talked about this before, where, you know, they use R but probably should use Python. Yeah. Am I... Okay, so I would say so. But what would you tell a kid, then? What would you tell somebody who's studying? Would you tell them to learn both, because they have to know the R to get in? Or do you tell them to learn the Python and tell them to tell the teams, hey, Python is really what you need? Well, I think everybody should just start off by learning SQL.
Starting point is 00:22:33 You should just... Because that's all I know, and it's very basic. Everybody should know SQL. It is really the basic stepping stone. And honestly, SQL is way more than baseball. If you know SQL, you can get a job anywhere. It is true, because that was one of the first sort of database languages, so a lot of the
Starting point is 00:22:57 older databases, like, everyone uses SQL. If you want to work for a bank or a hospital or any sort of company like that and you know SQL, you can get the job just for knowing SQL. You don't even need a degree or whatever. Plus, there is some value beyond that, right? Where the structure of SQL tells you a little bit about the structure of more advanced languages, right? The idea of select this, from here, given this, that's the sort of structure of SQL. And I don't know that much about Python and R, but I've seen R, and it's a little bit similar, right?
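The "select this, from here, given this" shape, and the earlier remark about running a query and outputting it to a CSV, can be shown in a few lines. This is a minimal sketch using Python's built-in `sqlite3` and `csv` modules; the table name `batted_balls`, its columns, and the sample rows are hypothetical and exist only for the example.

```python
import csv
import io
import sqlite3

# In-memory database with a small, made-up table of batted balls.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batted_balls (batter TEXT, exit_velocity REAL, launch_angle REAL)"
)
conn.executemany(
    "INSERT INTO batted_balls VALUES (?, ?, ?)",
    [
        ("Player A", 95.0, 15.0),
        ("Player A", 101.0, 20.0),
        ("Player A", 88.0, 40.0),
        ("Player B", 90.0, 12.0),
        ("Player B", 85.0, -5.0),
    ],
)

# SELECT this (average exit velocity) FROM here (batted_balls)
# given this (launch angle between 10 and 25 degrees).
rows = conn.execute(
    """
    SELECT batter, AVG(exit_velocity) AS avg_ev
    FROM batted_balls
    WHERE launch_angle BETWEEN 10 AND 25
    GROUP BY batter
    ORDER BY batter
    """
).fetchall()
conn.close()

# Write the query result out as CSV text (a file object would work the same way).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["batter", "avg_ev"])
writer.writerows(rows)
csv_text = buffer.getvalue()
```

Swapping `io.StringIO()` for `open("some_file.csv", "w", newline="")` would write the same rows to disk.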
Starting point is 00:23:37 Where it's a select this given this sort of idea. Yeah. I mean, R in my opinion, I don't, I don't know R that well. I'm very rudimentary at R. But in my opinion, R is kind of an island language. It's kind of its own thing. On its own. Whereas I think if you were to learn Java or Python or any of those more general languages. I think they're all so similar that if you learn Java,
Starting point is 00:24:10 you can learn Python in a few weeks. Or if you learned Python, you could learn JavaScript. Or if you learn JavaScript, you can learn Python. But if you learn R, it doesn't necessarily help you. If you learn R, you learned R. And that's it. You learned R. And that's it. Okay, like, that's it.
Starting point is 00:24:28 And I'm like, the craziest thing about R is that, like, R is a language that, like, if you want to make your own stuff in R, like, it's generally written in C. So in order to program really efficiently in R, you have to learn a different language to use R. So it's just, it kind of like boggles my mind. And like R has very... Why do you think baseball uses it so much? Just someone somewhere started using it? Oh, that's a simple one. I know exactly why they use it. It's because they all, like a lot of them, they took like a semester or two of stats in college. And then in that stats class, they used R. And then when like they're programming and they realize that like they can't do it in a spreadsheet anymore, like Excel isn't powerful enough.
Starting point is 00:25:18 They say like, what am I going to do? Like, I don't know what to do. Well, I took like a semester of R so that they just do R. And that's what happened. So it's all the school's fault. That's why everybody uses R. Nobody wants to use R. It just happened that way. And I kind of relate to that because in college, when I was doing psychology and neuroscience, I learned Java. So to me, going to JavaScript, which is what I mostly program in now was pretty easy for me because I had already learned Java. But flip side is that knowing Java,
Starting point is 00:25:54 I could have just as easily gone on to Python, because Python and Java, while they're different in a lot of ways, like they're very different, they're basically the same sort of language. So it's kind of like going from French to Italian, you know, it's not that different. Whereas going from Python to R is like going to Chinese, you know, it's different. Okay. But let's say, all right, so we've laid a roadmap down for some people, just so they understand that R is what's going on in there, but Python and SQL might be a more powerful sort of pathway for them. Even if they do that, they sort of then jump into the group of qualified applicants.
Starting point is 00:26:39 But, you know, you got hired for reasons beyond that. So if you were telling someone what to do with their time, I mean, you're talking about publishing and learning these languages. But then, I mean, you have this very interesting background in terms of your education. I heard once that Harvard could completely admit only 1600s on the SAT. They could just fill it up with 1600s, but they don't. They look for other things. Do you think that teams are sophisticated enough to look for other types of backgrounds? What would you tell someone to do with their free time? Are their hobbies important, or the other things that they study?
Starting point is 00:27:29 Like should they also be looking at physics or biomechanics or, you know, is there another thing they could add to their resume that will kind of get them beyond that first group of qualified? Well, first I think right now teams are not sophisticated enough to look outside. And I think that's changing rapidly because the better teams, I think, kind of look for something more. Data in baseball is getting so much more complicated that it really requires specialized skill sets to do some stuff. Like computer vision is becoming more and more important.
Starting point is 00:28:10 And machine learning is more and more important. So computer vision in particular is a very specialized skill set. And a lot of people who are good at it may not even care about baseball. So in that sort of sense... Yeah. And they would get paid a lot less in baseball than they would somewhere else. And the skill sets that are becoming more important in baseball are less common in general, and the places where they are common are like NASA. So it's kind of like, how do we get the kid who was going to work for NASA to work for us, when NASA was going to pay them like 300 grand? Right, exactly. Those
Starting point is 00:28:57 are kind of... And will they work for us for free? Yeah, we'll pay you 30 grand and give you a T-shirt. Would you feel like you're in the movie Moneyball? So in that sense, I think these skill sets are becoming more specialized than baseball. And that, I think, is going to really grab baseball by the ears and drag it in a different direction. I think front offices are going to change a lot in the next few years. And in doing so... What do you think will happen? They'll have to pay more? They're going to have to pay more.
Starting point is 00:29:40 They're going to have to. It's just supply and demand. It's good for everybody. Pay in baseball has to go up. Everybody focuses on the players. I don't really care about the players, they get paid. Well, I mean, the player minimum is 500,000. The front office minimum is like 15. If you made the front office minimum 500,000, I'd be okay with that. So once you got on the other side, and it's funny, actually. I think that there's a parallel here.
Starting point is 00:30:19 You're fascinated with errors. One thing that I hear from people that start to work for a team and get inside is that teams are not as advanced in some ways as you might think they are. They're not. Yeah. So I don't want you to rat on the Mets or anything specifically. But is there a way that you could talk just sort of generally about what you learned when you started working within baseball? I think generally speaking, most teams in baseball, they view... When I was talking before about the three groups of analysts, there's the first group where people just take the data and run with it. That's most of baseball.
Starting point is 00:31:07 Most front offices, a lot of them, they get data from MLB. They don't necessarily gather much of their own data. I mean, it really depends. There's wearable tech and stuff like that where they collect their own stuff. But generally speaking, the majority of their data is coming from MLB. And MLB does a lot of work cleaning and sterilizing the data. So it's pre-chewed and they just spit it into little baby bird mouths. And that's kind of where... I like that analogy, man.
Starting point is 00:31:42 That's kind of how baseball is working right now. Uh-huh. But that means that everyone's sort of at the same... Yeah. A lot of people are getting the same data, and a lot of them are making similar decisions. And there's only a limited amount you can do on top of that, right? Yeah, some people will look at it one way and other people look at it a different way, but that's only a limited amount if you're looking at the same data. Yeah. I mean, you need your own data in order to make more informed decisions, different decisions. And teams do get their own data.
Starting point is 00:32:19 It's just, I'm more talking... Even now the TrackMans are still up, right? Yeah, I know at least in the minors, they are. I'm sure most teams had multi-year contracts. I don't know the specifics. Yeah, I would assume. And that's what happened, is that TrackMan is no longer the official MLB provider, but they're allowed to make contracts with each of the individual teams.
Starting point is 00:32:44 Yeah, absolutely. And if the stuff is still up there and you're a smart team, at the very least you can say, hey, why don't we have Hawkeye and TrackMan? Yeah, and you can bounce them off of each other. There's other technologies too. I know colleges use their own systems for visual tracking. Yeah. And there's technologies like, Dr. Mike Sonne works on one called ProPlayAI that just turns people on the field into lines and angles. And I think there's some stuff like that that scouts use.
Starting point is 00:33:31 I know all the scouts from certain teams will have a contraption they set up at the game that kind of does that for them. Yeah, and I've seen prototypes of LiDAR-based player tracking, which worked pretty well. What's the difference again between LiDAR and radar? LiDAR uses lasers. Whereas radar uses radio waves.
Starting point is 00:33:56 So it's better? Because the lasers are better? It's more accurate. The one thing about LiDAR is that you can add cameras together to either increase your resolution or accuracy or increase your frame rate, or you can add them together in groups. So you can have like two cameras together to increase their resolution and then take two groups of those to increase the frame rate. And then you can... So it's also like, the Astros supposedly have a ton of Edgertronic cameras everywhere, right? Yeah, like Edgertronic. However you can, you can add data to your collection of data that's not in the regular pool. And hopefully your data...
Starting point is 00:34:47 Like with LiDAR, you can track the bat very accurately. So you can get swing paths and contact points. And it's like millimeter accuracy in all three dimensions. Which should maybe come with Hawkeye eventually? Hawkeye will do it. I don't know if it'll ever be as accurate as a LiDAR would be. And it's not ready yet. Or at least not that I know of.
Starting point is 00:35:17 And even still, the more important point is that you have redundancy. And you also have something to measure against. So if your LiDAR says this and Hawkeye says that, and they're different, you can at least see if there's a conflict. So if you only have one or the other,
Starting point is 00:35:35 you might not even know how much you should trust it. So if there's two and... That's really interesting. You know, I just wrote this piece about how there's variability in the bats that players get shipped to them. They get 12 bats that are supposed to be all the same, and there's this one company out there that just found out there's a ton of variability.
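The redundancy point above, that two systems measuring the same thing let you at least see a conflict, can be pictured with a minimal sketch. It assumes two hypothetical tracking systems report the same quantity for the same events; the class, function, event names, and numbers below are all made up for illustration and do not come from any real tracking product.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Reading:
    """One measurement of the same event from two tracking systems."""
    event_id: str
    system_a: float  # e.g. bat speed from a hypothetical LiDAR rig
    system_b: float  # the same quantity from a hypothetical camera system


def cross_check(readings, tolerance):
    """Return the typical offset (A minus B) and the events that conflict.

    An event conflicts when the two systems differ by more than
    `tolerance` once the median offset between them is removed.
    """
    diffs = [r.system_a - r.system_b for r in readings]
    offset = median(diffs)  # median, so one bad reading does not skew it
    conflicts = [
        r.event_id
        for r, d in zip(readings, diffs)
        if abs(d - offset) > tolerance
    ]
    return offset, conflicts


# Illustrative, invented numbers: the third swing is where the systems disagree.
readings = [
    Reading("swing-1", 71.2, 70.9),
    Reading("swing-2", 68.4, 68.0),
    Reading("swing-3", 74.0, 69.1),
    Reading("swing-4", 70.1, 69.8),
]
offset, conflicts = cross_check(readings, tolerance=2.0)
print(f"typical offset: {offset:.2f}")
print("conflicts:", conflicts)
```

With only one system there would be nothing to compare against, so a bad reading like the third one would pass unnoticed. The sketch only flags the disagreement; it does not say which system is the one that is wrong.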
Starting point is 00:35:53 And he kind of strikes me as that third person, you know? He's like, there's a ton of variability here, that's what you're looking for. You're looking for variability. That's an edge. When you got inside, did you figure out that you were wrong about something, or that the general public was totally wrong about something? I don't know. I'm usually wrong about most things. That's what I think when people get mad at me about being wrong about something. Geez, don't you know that baseball is just the process of trying to be less wrong? Yeah, my general strategy is to say things confidently while having absolutely no confidence behind it.
Starting point is 00:36:43 So I just kind of sit... Well, because if you tell people how little confidence you have in each statement, then they just won't ever believe. It's more like a thing for myself. I try to figure out where I am in the moment and I say, this is where I am. And I don't necessarily mean like, this is the only way. It's just like, this is where I am right now. This is my truth. And in five minutes, I might totally change my mind. So, yeah.
Starting point is 00:37:17 So you didn't have a big aha moment like that? I don't think I had an aha moment. There were some technology things. Mostly, I get really interested in the systems of how things relate to each other, more than the details. So I was interested in how we have this technology that does this, and this technology does that. How do they work together? Are they working together? Can we make them work together better? Those are the sort of things that I have a lot of interest in. And can we replace one of them with something that will help the total system? So like, if we have
Starting point is 00:38:06 five different technologies and maybe two of them overlap a little too much, we can get a different one so that, instead of overlapping with two technologies, you'll have three technologies that overlap. So you have more redundancy for the total system instead of having too much redundancy for just one part of it. And one of the variables there is ease. You've talked about how
Starting point is 00:38:32 something has to be easy to set up for it to be valuable. Yeah, and that's the benefit of LiDAR. LiDAR you can set up in two minutes. You just press a little button and they all sync up and it's magic. And writing the code to analyze the game, it's like maybe 10 lines. It's so easy, it's so efficient. Especially... But you know, I've got a little rundown here and I'm legendary for
Starting point is 00:39:02 never reading the rundown when I work with Terry, so I'm going to, as the host. But there's a transition here that's pretty easy. You're talking about different technologies and how they fit together. And there's a kernel in there that sort of leaps out. Isn't that sort of what's next, in broad brushes, for baseball: to figure out how to mesh all this new data and all this new tech in interesting ways?
Starting point is 00:39:34 Isn't that kind of the forefront of the work? Oh yeah. I don't think we're even using full Hawkeye yet.
Starting point is 00:39:43 Like, I think Hawkeye is too big for baseball at this point. We have to take only a little tiny LEGO piece out of Hawkeye, and that's what we're playing with right now. We're leaving the rest of the set for later. And that's just one tech. There's so many other things. There's, you know, there's bat speed stuff.
Starting point is 00:40:07 There's vision-related technologies, which I think is a big part of the future. When I studied neuroscience, my professor was an expert in human attention. So I got lectured a lot about human attention, and he studied like... Gaze? Yeah. He studied how you focus on things and how focusing increases your ability to perform. That was his specialty. And attention, I think, is a thing that baseball has never really delved into. How do we make our players better at
Starting point is 00:40:58 attending to the pitcher, or a fielder attending to the batter? And how do we make them better at pruning what behaviors they could have? Like with a batter, you could prune: is the fastball going to be inside or outside? Is it going to go down four inches or up four inches? That's kind of the pruning process: deciding where the most likely locations of the ball will be, and updating it at release point and as the ball comes in. And also, how do you update... Think about deception too, a little bit. Deception, right. Yeah, just like release point stuff. Well, and your hands during the swing.
Starting point is 00:41:45 You update your hands during the swing, too. That's another type of attention that I don't think baseball has really focused that much on. The check swing, even. That big moment. Check swing analysis is so bad
Starting point is 00:42:02 in baseball. How people look at check swings, I think it's pretty amateur. But attention, I think, is kind of the next step, because we have all of these things that we've done for pitchers to make pitchers better, and I think the missing link for the batter is studying attention and focusing attention, because attention is something that's very plastic. It's easy to learn and change over time. It is, yeah. It's one of the most plastic things in your mind. It strikes me as something that the hitting coach can actually... that's what he works in, right? Like, he gives you information that shapes your anticipation,
Starting point is 00:42:53 right? And he tries to do mechanical drills that put you in a position to have variable strategies and stuff like that. So it seems like, if we were better at knowing about attention, we could empower the hitting coach pretty quickly. Yeah, and if it is effective, you could have a very effective hitting... And there have been research papers recently, in the past year or so, into what's called microbehaviors. And they say that microbehaviors are a result of your state of attention. So at various different loads on your brain, I don't know how else to describe it. But on various energy loads, like energy requirements you need, it'll change the way that you move.
Starting point is 00:43:49 And these micro behaviors can be caught by something like Hawkeye, where you're tracking the player on the field. And you can see that maybe when this player is losing attention, maybe he scratches his left arm or something. It's just little behaviors. Looks in a certain direction.
Starting point is 00:44:09 And these sort of micro behaviors. Mutters to himself. These sort of behaviors, when you identify them, it's the sort of thing that you can feed back to the player and it's easy to change. If you feel yourself doing this, if you start doing this, you're losing it. And do this.
Starting point is 00:44:29 And you can do that. That reminds me of Joey Votto. He says, when he gets out to the plate, he looks out into center field and he makes his eyes wide, and he does just a weird thing where he's looking out into center field
Starting point is 00:44:44 and breathing big and just sort of doing this getting-ready thing. And I asked him why he does that. And he says, the pitcher is trying to get you to focus on them. They're looking at you all weird. They're looking over their glove. They have their arm out a weird way. They're stomping around on the mound. They're trying to make you think about them. I don't want to think about them at all. I want to make them into like a pitching machine.
Starting point is 00:45:08 I want to think about release point, ball movement, and that's it. So I look out into center field and I try to imagine there's no pitcher even. And also just getting a good deep breath is really good. Right. It sends a lot of oxygen to your brain. You think a little bit better, you know? So yeah, that's one area of research that I think is going to be big in baseball.
Starting point is 00:45:31 And it's the sort of thing that's so simple. And a lot of the research is so old. The papers came out in like the sixties and seventies, and they still haven't hit baseball yet. A lot of the fathers of these players were born after some of these papers were published. So anyway... So this is super interesting, and thank you so much, but I also want to know a little bit about what's next for you, what you're up to right now. I know you're updating xStats. What are you studying on the side? What are you reading? Right now I've been focusing a lot on updating my xStats.
Starting point is 00:46:14 I'm rebuilding all of my models for all the stadiums so I can do park effects, which is a ridiculously time-consuming job. I wish I had the LiDAR scans that MLB has. It would make it so much faster. Because the listed distances are not always the correct distances. Oh no, they're not at all. Some of them are wrong by like 10 feet. And also I want to know the whole field, including foul territory. So I want to know the distance between the fielder and the wall, and that plus the fielder speed,
Starting point is 00:46:56 and estimating the fielder location, because I don't quite have that. But you can narrow down defensive ability using that. Yeah, I saw Rob Arthur today was talking about how players are playing deeper and it's turning some doubles into outs. And he was bemoaning the fact that player position wasn't actually spit out by Savant right now. Oh, dude, I wish I had player position. It's one of those things
Starting point is 00:47:26 but that was one of the main things that was going to be good about Trackman, that we could do defensive stats now that we know their player position. I can't believe that it's not spit out. They just don't want us to do it, you know? They want to keep that secret. It's just one of those things. Data costs money. All that stuff they put up costs money. I understand it's a business. But they promised me they would have batted ball location, like the place of impact. They promised me it would be here for this season. Not here, you know. Oh man, I wish I had that. I keep asking for batted ball spin. Batted ball spin. Yeah, maybe. Man, with batted ball spin,
Starting point is 00:48:08 I think one of the reasons they're afraid to publish it is because I don't think the data is that great. Yeah, and that's what I've heard about position player, like, defensive arm: that it doesn't catch it all the time. But that's a little bit of a Trackman thing.
Starting point is 00:48:28 I don't know. Hawkeye seems like it should catch it. So maybe we'll get defensive arm strength. Have you seen the videos of the Hawkeye? Tom Tango
Starting point is 00:48:39 tweets about them like once a day, I think, with the skeleton. Oh, you mean like, yeah, the skeletons. Yeah, I've seen that. Those are cool. Yeah, and so they're capturing all that. Well, hopefully some stuff will start to trickle out. So
Starting point is 00:48:57 you're trying to improve xStats, but what are you reading on the side? What are you studying outside of that? Like, just in life? Yeah. Recently I've been reading about rare orchestra instruments. That's what I've been reading about.
Starting point is 00:49:18 You know, you heard it kids. That's how you get into baseball. Read about rare orchestra instruments. I try to keep my interest diverse so I don't get locked into one thing. So I don't know, but between just doing Bayesian statistics and modeling stadiums and doing player speed analysis and fielder speed analysis. Beyond that, it's mostly just reading about orchestras recently. Nice.
Starting point is 00:49:55 Well, I think that actually is an interesting place to put a pin on it because I do think that when you know, when people talk about being a journalist and, you know, I have a master's in media studies, whatever that is, it's, I think that, like, I would recommend somebody, even if they do go to journalism school, that they have to do something else because at some point you have to report on something, you know what I mean? It's like you have to know something else other than just how to report, I think. And so I do agree with you that keeping up with the different,
Starting point is 00:50:37 I think it's also just healthy for your brain to keep the synapses firing in all sorts of different directions. But thank you so much for coming on today, Andrew. Thank you for having me. And I look forward to G-chatting or texting or DMing you in the near future. All right. Thank you. And that'll be it for us today on Rates and Barrels.
Starting point is 00:50:59 Please like and review us wherever you listen to this episode, or subscribe to The Athletic, which is $3.99 a month right now. You get to see stories like the one I just did with Andrew Baggarly about Tyler Rogers' underhanded rise ball slider, which actually had some of Andrew Perpetua's pitch visualizations in it to describe the beauty of Tyler Rogers' underhanded delivery and other things. We'll be back on Monday with Derek VanRiper taking over. I'll be back on Wednesday. And as always,
Starting point is 00:51:31 thanks for listening. Bye.
