Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1476: Multisport Sabermetrics Exchange (Hockey and Cricket)

Episode Date: December 27, 2019

In the second installment of a special, seven-episode series on the past, present, and future of advanced analysis in non-baseball sports, Ben Lindbergh talks to Evolving-Hockey.com’s Josh and Luke ...Younggren about hockey and then writer, commentator, and team analyst Jarrod Kimber about cricket (57:13), touching on the origins of sabermetrics-style analysis in each sport, the […]

Transcript
Starting point is 00:00:00 Hello and welcome to episode 1476 of Effectively Wild, a baseball podcast from FanGraphs presented by our Patreon supporters. I'm Ben Lindbergh of The Ringer, and today we are on part two of our seven-episode series on the state of advanced analysis in other sports, which we are calling the Multisport Sabermetrics Exchange. For anyone who missed the intro to the first part, we're taking some time during the slow holiday weeks here to talk about the past, present, and future of advanced […] time to talk my favorite non-baseball sport, as a spectator experience at least, hockey. And to do that, we are bringing on the brains behind EvolvingHockey.com, Josh and Luke Younggren. They are twin brothers, and they are among the foremost hockey analysts still out there in the public sphere, at least for now. They also contribute to Hockey Graphs. And I don't know if there's a seniority here. Is one of you one minute older or something? And I should introduce you first. Well, first of all, thank you for having
Starting point is 00:01:35 us. Yeah, thank you so much. Technically, I am Josh and I am six minutes older than my brother. Which, I am Luke, so, you know, he's a little older, but, you know, I don't hold that against him. Yeah, in the spirit of twindom and everything like that, technically, I am the older brother, but it doesn't really apply. Okay. Well, you've already warned me that you sound similar, so you will identify yourselves and we'll do our best. But you're a hive mind for the purposes of the podcast and the site, I suppose, at least from a public-facing perspective. And you were just filling me in that you were both baseball fans before you were hockey fans and got into hockey analysis via baseball analysis.
Starting point is 00:02:12 So what was the pathway for you to switch over from one to the other? Yeah, well, it was kind of interesting. We were always, as we were mentioning before, but yeah, we were longtime baseball fans. And what we also did, kind of an interesting thing, we played baseball growing up, and then we went to college. And surprisingly enough, we were both music majors. So when we were in college, we did music. And there isn't a lot of time to do sports and music in school together. And so we didn't play,
Starting point is 00:02:38 we stopped. And because we were in Wisconsin going to school, I didn't really follow, I guess, our team is the Twins. And we'd been following for years. But when I got out of school, I was, just both of us were pretty bored. And we got out in the fall, because we stayed a little later. And our dad was a huge, you know, he grew up in Minneapolis playing on the rinks in the lakes here. And he, you know, would just have the Wild on, which is, you know, the team, that's our name. And that was kind of like, we had kind of at the same time been getting and reading more about the baseball stuff, and kind of just as something to do after school to pass time, started following hockey. And at the same time, pretty much started reading some of the early, or at that time they had been a couple of years along. And so, right, pretty much our hockey fandom started along with kind
Starting point is 00:03:22 of looking into the stats side, you know, the analytics side, whatever you want to call it for hockey. And it's been basically just like a pure kind of started as a hobby and then just became just like a pure obsession. And now it's we just do it all the time now. Yeah. And I'd say I think like probably the intro for baseball stats was probably just like the Fangraphs glossary. Like, I just think that is just so well made and it's really clear. And it's kind of always something that we've looked at.
Starting point is 00:03:49 Fangraphs is kind of like a model for, I mean, our own website, but especially their glossary was super clear. And there's just a lot of really, with hockey stats, a lot of it happens on Twitter, so it's not very documented. But baseball has a really rich history that's documented in, you know, like Tango's blog and, you know, all types of message boards and stuff that you can go back and look at. So it's just, I don't know, it's just kind of the, you know, I guess the nerds in us just would just sit on my phone and I'd go through just the various discussions that were happening at the time. And I think that, you know, Fangraphs really helped kind of me get an idea for what was going on. And I just, yeah, found it really interesting. Well, you'll be well equipped to
Starting point is 00:04:29 answer this question. Then I think, how would you place hockey on the spectrum of ease of analysis when it comes to other sports? So if we do a one to 10 scale and 10 is baseball, so it's structured in a way that lends itself to analysis, and then one is the least easily analyzed sport imaginable, pretty opaque to analysis. Where would you put hockey on that scale? Ooh, that's a good question. I guess I have, I would say it's below a five, probably. I don't know if I'd go maybe like a three, I guess.
Starting point is 00:05:04 Maybe a little easier. But I think the biggest problem really is the strength states and the goalie. So with baseball, right. And with basketball, too, you have one strength state. I mean, baseball doesn't there's not like you just remove an outfielder. But like, you know, in hockey, you have the different strength states, depending upon if a penalty is taken, then a team gets a power play. In overtime, it's three on three.
Starting point is 00:05:29 So instead of five on five, it's three on three. And then you can have pulling the goalie. So you can have basically 12 basic strength states. And a lot of the time when you're looking at analysis, you have to look at each one of those as basically a separate game. Like, they're all kind of separate, and that makes it really difficult from a data side, because you have so much more data that you have to separate. And just to kind of piggyback on that, I think it also kind of depends what
Starting point is 00:05:56 you're trying to do with the analysis, I suppose. So I would say that for right now, just to get an idea about how good players are, I think that that's, even though from a data side or statistical side, it can be kind of complicated, you can do it fairly well. There are still, I think, limitations with just we are missing certain pieces in order to say, if you want to have a player, you know, like tell a player exactly what to do in this situation. Hockey is a very fast moving sport. You're on ice, you know, it's a very skilled, well, I mean, not to say that any other sport isn't as skilled, but it takes, it's a different kind of, I guess, instruction that you would maybe try and use to help a player, for instance, you know, learn or get better in a
Starting point is 00:06:33 certain situation. And that is very difficult. And I don't know if we don't have the data, at least in the public, and I would say it's still not necessarily there on a proprietary kind of level either. Ice is slippery. It seems hard. So I know hockey analysis goes back a bit. Tom Tango, who most people know from his baseball analysis, started consulting for NHL teams in the mid-2000s. Hockey Prospectus, then called Puck Prospectus, was started in 2009. And Rob Vollman, who now works for the Kings, did his Hockey Abstract and Hockey
Starting point is 00:07:05 Prospectus books and co-wrote the book Stat Shot, which came out a few years ago. But can you give me kind of a brief history of hockey analysis, hockey sabermetrics? So roughly when did the early efforts show up? At what pace has it kind of caught on? And maybe some of the major breakthroughs or advances in the availability of the data? Yeah. So technically, hockey stats or hockey analytics have been around for quite a while. I mean, I believe the league started tracking goals and assists and plus-minus, believe it or not, which is like we talked about a little bit. But in like the 40s, I think, or maybe the 50s. I think plus-minus was like the 60s. I'm not super sure on that.
Starting point is 00:07:45 So just to be clear, points is probably the most well-known hockey stat, which is a combination of goals and assists. So if a player is on the ice or if a goal is if they score a goal, obviously, and then an assist is given out to the prior two skaters who touched the puck or assisted on that goal. Plus minus has been around for a long time, and that's kind of like how many goals did your team, or score and allow when you were on the ice? So it's a differential, but there's some weird caveats and, uh, I don't want to go on a long rant here, but it's not a very good stat. And so that was kind of technically the original stuff, but they, I believe in the, and I, the years are a little bit fuzzy for me, but I want to say in like maybe the mid to late 90s is when they started tracking. So actual time on ice. So how long has a player been on the ice?
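As a rough illustration of the two traditional stats described here, the sketch below tallies points (goals plus assists) and a simplified plus-minus from a list of goals. The `Goal` record, the team codes, and the player labels are hypothetical, and the sketch deliberately ignores the "weird caveats" mentioned above: it counts every goal scored while a player is on the ice, with no exclusions.

```python
from dataclasses import dataclass


@dataclass
class Goal:
    scoring_team: str   # team that scored
    scorer: str         # player credited with the goal
    assists: tuple      # up to two players credited with assists
    on_ice: dict        # team -> set of players on the ice for the goal


def points(goals, player):
    """Points = goals scored + assists credited."""
    scored = sum(1 for g in goals if g.scorer == player)
    assisted = sum(1 for g in goals if player in g.assists)
    return scored + assisted


def plus_minus(goals, player, team):
    """Simplified plus-minus: +1 for each goal the player's team scores while
    the player is on the ice, -1 for each goal allowed while on the ice."""
    total = 0
    for g in goals:
        if player not in g.on_ice.get(team, set()):
            continue  # player was on the bench for this goal
        total += 1 if g.scoring_team == team else -1
    return total


# Hypothetical three-goal game between made-up lineups.
sample_goals = [
    Goal("MIN", "A", ("B",), {"MIN": {"A", "B"}, "CHI": {"X"}}),
    Goal("CHI", "X", (), {"MIN": {"B"}, "CHI": {"X"}}),
    Goal("MIN", "B", ("A", "C"), {"MIN": {"A", "B", "C"}, "CHI": {"X", "Y"}}),
]

print(points(sample_goals, "A"))             # 1 goal + 1 assist = 2
print(plus_minus(sample_goals, "B", "MIN"))  # +1 - 1 + 1 = 1
```

Note that player "B" was on the ice for all three goals, which is why the goal against counts against him even though he was not directly involved in it.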
Starting point is 00:08:29 Because hockey players go on and off the ice. They have shifts and that all adds up. And that was, I think, the late 90s. And then they started doing a little bit more tracking in the early 2000s. But there was some in terms of events. So shots, hits, giveaways, those kind of things. They did a little bit of them, but really, and there actually were some people very early on who did some very good work. One of the very first people who looked at any kind of stuff that might be considered more advanced stats, if you will, quote-unquote, was a guy named Alan Ryder who did a lot of work in the early 2000s,
Starting point is 00:08:58 but really the start of it all was in probably the 2007-2008 season, and that was when the NHL introduced their, I think it's a couple different names that people know it by, but their HITS system, their RTSS system, what is commonly referred to as play-by-play data, where they actually put people in the arenas or watching the game. And they would track play-by-play events and where those events occurred in like an XY pattern, basically. So coordinates for events. And that was kind of the first big innovation in anything that would be described as hockey stats. And so a lot of stuff, that was kind of the big pinnacle of where everything started out. And then after that, there's a lot more that we could get into, too. But that's kind of the trajectory of it.
Starting point is 00:09:43 And then since then, it's kind of just snowballed from there. Yeah. I guess I can just add on, because, yeah, basically modern hockey statistics start in the 2007-2008 season, because, you know, if you watch a hockey game, it's very difficult to keep track of all of the players coming on and off the ice, because skaters take about 25 shifts a game, and that's, you know, 12 skaters on each team, no, sorry, 18. So that's a lot of data to keep track of, and the most important thing is knowing what happened when each skater was on the ice. So lining up the shift data with the events data is kind of like the whole start of the modern statistics. And that's
Starting point is 00:10:25 really like where it started. And then just like, yeah, you get into like different types of like, yeah, there's a lot of different things that happen with that data, but that's basically like the genesis of the original type. Yeah. And what were some of the major insights associated with that movement? If there was sort of an on-base percentage of hockey? Were there things on par with that? Yeah. So I'd say probably the most important kind of advancement at that time and still today is something called Corsi, which is funny.
Starting point is 00:10:58 Hockey developed this funny naming system where they named... It's not very descriptive. Yeah. Corsi and Fenwick are the types of shots. So there are four types of shots that the NHL, or basically in hockey. So you have a goal, which would be a shot that ends up a goal, and a shot that is saved by the goalie. And then you have a missed shot. So a shot that's taken at the net that misses, or not at the net, it misses the net. And then a blocked shot. So that's a shot that is taken and blocked by an opposing player. So there's four. And so Corsi is all four of those.
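The four shot outcomes just listed can be tallied in a few lines. This is a minimal sketch, assuming a made-up event feed in which each event is a plain string label; the labels themselves are hypothetical and are not the NHL's play-by-play codes. Corsi counts all four outcomes, and Fenwick, as the term is commonly used, is the same count with blocked shots left out.

```python
from collections import Counter

# Hypothetical labels for the four shot outcomes described above.
SHOT_OUTCOMES = {"GOAL", "SAVED", "MISSED", "BLOCKED"}


def shot_attempt_counts(events):
    """Count shot attempts from a list of event labels.

    Corsi   = goals + saved shots + missed shots + blocked shots
    Fenwick = Corsi minus blocked shots (unblocked shot attempts)
    """
    counts = Counter(e for e in events if e in SHOT_OUTCOMES)
    corsi = sum(counts.values())
    fenwick = corsi - counts["BLOCKED"]
    shots_on_goal = counts["GOAL"] + counts["SAVED"]
    return {"corsi": corsi, "fenwick": fenwick, "shots_on_goal": shots_on_goal}


# Non-shot events such as hits are ignored by the tally.
sample_events = ["SAVED", "MISSED", "BLOCKED", "GOAL", "HIT", "SAVED", "BLOCKED"]
print(shot_attempt_counts(sample_events))
# {'corsi': 6, 'fenwick': 4, 'shots_on_goal': 3}
```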
Starting point is 00:11:31 So generally it's called, I mean, it should be called shot attempts. So that's any puck directed at the net, regardless of what happened. And then you have unblocked shot attempts, which is Fenwick. And that would be everything but blocked shots. And so the early research essentially found that Corsi, or shot attempts, is a better predictor of future goals than actual goals. So that would probably be the first real advancement of probably that era. And that kind of took the form, just to jump in here, of what generally is called Corsi For percentage. It just looks at a differential, and a lot of times it would be broken down into strength states. So at even strength, five on five, you would look at, say your team had 20 shots for and they let, you know,
Starting point is 00:12:17 15 shots against, they'd have a plus-five differential, or in that percentage form. And using those game by game was very early on, in the late kind of, I guess, aughts. Those were the early innovations, using those, kind of looking at how is a team controlling what was called then possession. So it was kind of this proxy for the idea of, if you're shooting more, you're controlling the ice more, you have more offensive time. You're not allowing the other team to get as many attempts or chances. And the early research looked at that, and teams that had a higher, you know, they led the league in that, there was this understanding that they were probably the better teams and they generally had a better […] continuous-motion sports, soccer, basketball, you have plus-minus or expected goals maybe, and maybe hockey has some variety of win expectancy. Are there stats and concepts that were borrowed from baseball or from other sports that kind of
Starting point is 00:13:19 helped kickstart things? Could you skip a step because it was done somewhere else and you could kind of map it onto hockey? Yeah, there actually was. And that was, I don't know if we want to get like super technical, but it is kind of really interesting. I think that hockey, especially in the history of it, has really looked at other sports for what they've done. Because a lot of other sports got this, their data or their whatever you want to call it, before hockey and developed methods. And one of the ones that came out of that idea of shot attempts or looking at that from a differential is expected goals, as you mentioned, which I believe was started in like kind of came from soccer. But that was then applied to in a similar analysis in a similar way. And what that is, is actually an actual model. So it's a binary
Starting point is 00:14:00 classification. You could do logistic regression. There are several machine learning type methods that do that. But what that ultimately does is it assigns a probability for how likely a given shot will turn into a goal. And so hence the name expected goals. And you can do that at a shot level and assign a probability for every shot and sum those or do various things with those. And that then kind of takes the form of Expected Goals For percentage or something like that, or a differential. And you could use that for team or player analysis. It also is a really good way to kind of separate the goalie out for skater analysis. It also was very early on some of the simpler models, kind of a tangent here, […] actually start to then get closer to what actual goal scoring looks like, which is really noisy and less repeatable than something that's maybe a simpler model that's more like a weighted shot or something, like a band on the ice kind of danger zone type look or high danger stuff.
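To make the shot-level idea concrete, here is a minimal sketch of how an expected goals total could be computed once a logistic model has been fit. Everything numeric in it is an assumption for illustration: the two features (shot distance and angle) and the coefficient values are made up, not taken from any real public model, and a real model would be fit to play-by-play data with many more inputs.

```python
import math

# Hypothetical, hand-picked coefficients for illustration only.
INTERCEPT = -1.0
COEF_DISTANCE_FT = -0.05   # farther shots are assumed less likely to score
COEF_ANGLE_DEG = -0.01     # sharper angles are assumed less likely to score


def goal_probability(distance_ft, angle_deg):
    """Logistic model: probability that a single shot becomes a goal."""
    z = INTERCEPT + COEF_DISTANCE_FT * distance_ft + COEF_ANGLE_DEG * angle_deg
    return 1.0 / (1.0 + math.exp(-z))


def expected_goals(shots):
    """Expected goals = sum of per-shot goal probabilities."""
    return sum(goal_probability(d, a) for d, a in shots)


def expected_goals_for_pct(shots_for, shots_against):
    """Share of total expected goals that belongs to the team, in percent."""
    xgf = expected_goals(shots_for)
    xga = expected_goals(shots_against)
    total = xgf + xga
    return None if total == 0 else 100.0 * xgf / total


# Hypothetical shots as (distance in feet, angle in degrees).
team_shots = [(10, 5), (25, 30), (60, 10)]   # last one is from out near the blue line
opponent_shots = [(55, 20), (58, 40)]

print(round(expected_goals(team_shots), 3))
print(round(expected_goals_for_pct(team_shots, opponent_shots), 1))
```

Under these made-up coefficients, the team with three shots including one from in close ends up with the larger share of expected goals, which is the sense in which the stat rewards shot quality and not only shot volume.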
Starting point is 00:14:55 And so expected goals was one. And then maybe Luke can talk about some of the basketball stuff. Yeah. So I think right now, probably the biggest things that have been taken were in about like 2010, or, well, the NBA, they developed adjusted plus-minus and regularized adjusted plus-minus, or RAPM, which is similar to what ESPN has, Real Plus-Minus, or RPM, which is much more complicated. Similarly, basically, Brian Macdonald, who used to work for the Florida Panthers, he wrote a couple of papers in 2012 and 2013. And then Andrew Thomas, who used to work for the Wild and now works for SMT, which is a company that does player tracking. They developed kind of multiple, they're basically stint-level regression models for analyzing players. And that is directly drawn off the work that was done in basketball by, I think, like Joseph Sill, who in 2010 added regularization to the method that was developed by Dan Rosenbaum in 2004. So a lot of the hockey stats have definitely drawn off of the work that was done in soccer and basketball, but it's a little bit more complicated because of the strength states. So Brian Macdonald developed some special teams or power play and penalty kill type models that
Starting point is 00:16:09 do that. So yeah, a lot of this stuff in hockey right now definitely has drawn on the work that was done in other sports. And was there an early-adopting team or coach? Was there one organization that is most associated with analytics in the way that the A's were in baseball or the Rockets were in basketball? Yeah, I mean, I think early on, and that's the thing is that it's kind of the same in every sport. But especially in hockey, there's this kind of culture of you don't tell anything. You know, injuries are reported as a lower-body injury. There's, like, secrets are kept. And so it's still kind of not clear.
Starting point is 00:16:43 You hear rumblings. But early on, there were a couple of teams. There were the L.A. Kings, kind of right around, I can't even remember the years, but it was like around the turn of the decade, I suppose, and then into the mid, maybe. And the Blackhawks. Those were teams that just were also very good, but they did things that influenced kind of what other teams were doing, and what they did lined up with kind of ways that would lead to maybe better results, which kind of gets back to this idea of really controlling zone time, getting a lot of shots, shot attempts on net, which is more just a symptom of just being a good team. And so what you then got from there is kind of this realization after the fact that maybe it's not all about shots. Because there's kind of a funny term in hockey that is more on the stats side, something that you might call gaming Corsi, is what you started to see some teams do
Starting point is 00:17:36 is they would just shoot anything because they, you know, that would be the thing that. And so there's now kind of a running joke a little bit about how a team's maybe trying to game Corsi in a game because they're just like, but we know from looking at expected goal models that, for instance, shots that come from the blue line, really far away in the offensive zone, are really pretty low probability unless you are good at generating a rebound off of those shots. And so that kind of then led to more of a look at creating good quality chances. So you do see some teams now, I would say right now, the biggest one is probably the Carolina Hurricanes, are the number one team. They currently employ Eric Tulsky, who was one of the early, he really pushed kind of
Starting point is 00:18:10 after a couple of years of the tracking, I guess the RTSS or the play-by-play data, he did a lot of work in the public that proposed a lot of ideas about kind of are still really prevalent in a lot of the research. And you can see some of the stuff they do leads to those kind of results. But again, it's kind of shrouded and there's not a lot that comes out from the teams. And are there major misconceptions that have been overturned? Things that people used to think and say about hockey that have not really been borne out by the numbers? Yeah. I mean, there's probably a lot. I think, in terms of like one that immediately, I don't know why this is just a pretty specific thing,
Starting point is 00:18:48 but there was a kind of the thought of the stay at home defenseman was so the defenseman would be the basically like backs. Although there's been kind of a movement to try to call defensemen and forwards, not call them that, call them defenseman backs because they don't. But there was this thought that, oh, you have this guy who plays a whole bunch of minutes, and he just kind of stands in the defensive zone and blocks shots. I think that has been really disputed as a valuable commodity, I guess, because it's been basically shown that blocking shots by defensemen, that used to be really like that would be tracked, and that was thought of as a very valuable skill,
Starting point is 00:19:25 a defenseman that could block a lot of shots. But what actually ends up happening is that means that they're just stuck in their defensive zone a lot more. And so that would, in actuality, the best defensemen are those that are able to just move the puck out of the zone. So the best in like what they would say transition.
Starting point is 00:19:43 So I think that's a pretty big one. Shot. Yeah. Shot blocking is actually not good. Well, I mean, it's more complicated than that, but there are always caveats. Yeah. There are caveats, but like generally like the, you know, big minute shot blocking defenseman is not really a, you don't really want to try to have your defenseman just stand in the defensive zone and block shots. I think another thing is a lot of people would value face-off skill, and that's basically been shown to be not really that valuable. It can be valuable in very specific situations, but a lot of the time, coaches would just be, or teams would be looking for a player who, really the only skill they had was winning face-offs, and most of the time, those guys were not good.
Starting point is 00:20:27 There's a player like Jarret Stoll who played for the Kings, and he had just ridiculous face-off numbers, but everything else he did was just really bad. And so coaches, they wanted him. So when they're in the third period and there's five minutes left and they are taking a defensive zone draw, they can put their fourth line center who has like a 60% faceoff winning percentage out there. And then he wins a faceoff and he's
Starting point is 00:20:50 going to stand out there for 30 minutes and just get blown up in the defensive zone. So it's just like, those are a lot of like kind of misconceptions. And I mean, there's others, but those are like the two that come to mind for me, I guess. How much better than 50-50 is a faceoff specialist? Oh man. So that's the thing in hockey, a lot of the percentages and anything, even from a betting standpoint, is if you're getting better than 60% at your home win probability, like your actual accuracy, that's basically impossible. Hockey is such a noisy sport. But in face-offs, I actually don't.
Starting point is 00:21:25 I mean, there's a few guys like Paul Gaustad, Jarret Stoll. You could find these types of players who coaches and organizations clearly valued from the tradition of hockey, which is like what Luke was talking about, is blocking shots. They're gritty. They're tough. They throw hits around. They do all this stuff that looks really flashy. They're really big. That's also another thing that we've kind of shown, or not we, but the community has kind of shown, that there's not really any correlation of size with skill, which, you still see it in the draft now, where you get in the third, fourth round or fifth round and teams just take like six-five or six-six defensemen because they're just big. That's the only reason why. And you see some of the best defensemen in the league are under six feet.
Starting point is 00:22:00 I mean, there's several defensemen who are, you know, five-six, five-seven, who just, they are amazing. And, you know, there's not really any correlation there. And I don't know, I'm trying to think what is the face-off, I guess. So I just looked it up. Ryan O'Reilly is a center. He's actually probably the best at face-offs in the last three years. He's got a 58% winning percentage. So yeah, that's about the best you're going to get. And that's like kind of the top for most of the percentages in terms of team evaluation and player evaluation. You're not getting anything really over... I mean, in small samples, small minutes, maybe over 65, 70 or something like that
Starting point is 00:22:32 in any differential percentage. Yeah, it's mostly like... So for reference, in a season, if a team has over 55% Corsi For, so that would be, they took 55% of the total shot attempts in a game, that would be really good. Like that's like a very good number.
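The percentage form discussed here is a simple ratio. As a small sketch (the function names are hypothetical), the example from earlier in the conversation, 20 shot attempts for and 15 against, works out to a plus-five differential and a Corsi For percentage of roughly 57%, which sits above the 55% mark described as very good over a season.

```python
def corsi_differential(attempts_for, attempts_against):
    """Raw shot attempt differential while a team or player is on the ice."""
    return attempts_for - attempts_against


def corsi_for_pct(attempts_for, attempts_against):
    """Corsi For % = attempts for / (attempts for + attempts against) * 100.

    Returns None when no attempts were recorded, to avoid dividing by zero.
    """
    total = attempts_for + attempts_against
    if total == 0:
        return None
    return 100.0 * attempts_for / total


print(corsi_differential(20, 15))        # 5
print(round(corsi_for_pct(20, 15), 1))   # 57.1
```

In practice this would usually be computed separately for each strength state, for example five on five only, since the transcript notes that each strength state is treated as its own kind of game.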
Starting point is 00:22:48 So, yeah, it's pretty small from like, you know, I mean, you'll get teams that like are below 40. Like those are teams that are like tanking. Yeah. There's some historical teams, and I don't want to just ramble here too much. But I got to give a shout out to the 2014-15 teams, it was the Buffalo Sabres, Edmonton Oilers, Arizona Coyotes before. You've probably heard of Connor McDavid. But for anyone who's listening who doesn't know, he is like the best player in the league. And he's just an absolute force on the ice. And he's still really young.
Starting point is 00:23:19 And yeah, people kind of compare him to like the Mike Trout of. Yes. A little bit. Yes. Yeah. But he's not as well-rounded as a player as Mike Trout is. No, he's basically just like an offensive, like, unicorn, but his defense is really bad. But he's like the greatest offensive player since like Gretzky, so
Starting point is 00:23:36 it's like, yeah, I mean, he's not very well-rounded. I wanted to say the Sabres that year are like, at least in the data we have since 2007, I believe they're the lowest team that we can look at in terms of percentages. And they had, it was like, I mean, I think they were under 40% in terms of their Corsi, which is just absolutely atrocious. It's terrible. It's really, really bad. And so, yeah. Anyway, yeah. Well, I was going to ask you, and maybe you just sort of answered it.
Starting point is 00:24:01 I don't know. But have there been sabermetric darlings over the years or punching bags of the sabermetric community, particular players who sort of diverged from the mainstream perception in a dramatic way? Yeah, there have been. There are some interesting, you know, as we mentioned, like being hockey and baseball fans, there are very interesting parallels to the kind of ways that the sabermetrics community in baseball, at least publicly, will look at and kind of hold these certain players up. I mean, being Twins fans, you know, we've followed Willians Astudillo, right, since like he has been in the organization. And I read articles about him early on about how he just like never walked or struck out.
Starting point is 00:24:39 And I was like, what? Like, who is this guy? Right. And there's certain players like that that kind of are brought up like that. But the very so this gets us into a kind of another area of some of the, I guess, philosophical aspects of hockey stats is this idea of prediction or predict, you know, repeatability is something that has been used a lot in hockey is how repeatable year over year as a stat that had been used as what was thought of at the time to be a good way to validate a metric and now it's technically it's not really thought of it that way anymore but prediction or the idea of this stat will lead like this is if a team is good and now in this stat that means they're going to win games that's like a been a pretty big bedrock for validating and using stats um that are more on the side. And there are some players who do those things
Starting point is 00:25:25 really well, but they actually, maybe they shoot the puck a lot, and that's good. They don't allow a lot of attempts, or maybe they're really good at generating higher quality shots, and they don't allow a lot. But that isn't technically, like they're not scoring goals. And so there's this kind of weird disconnect where there are players who you can see they should have a much higher shooting percentage. You know, more of their shots should go in or more of their team's shots should go in when they're on the ice because we know that these things, they go back and forth. But there's some, like there was an example of, I think the Leafs, for instance, the Toronto Maple Leafs. They went after Cody Franson, who was one of the early defensemen who had just great shot attempt numbers. But whenever he's on the ice, they just, for whatever reason, he just, all he did was like lead to shots,
Starting point is 00:26:09 but they didn't really lead to goals. And over enough time, you can see how that kind of goes up against certain things. Another one like Sean Bergenheim was another one that kind of got traded around from a couple of the more friendly teams or teams that had, you know, smaller departments that were looking at this stuff. So it's kind of an interesting dilemma in a way about that. Yeah, I think, well, and kind of to piggyback on that, I think, well, so I guess to get back to the original question, I think probably a player like Jay Bouwmeester, who plays for the Blues, and like this kind of goes back to that, like, stay
Starting point is 00:26:40 at home defenseman. Those are just players that, they just get like lit up in the defensive zone, but like for whatever reason there's not a lot of goals that go in against them. And so like that's a player that would be considered probably like, oh, he just won a Stanley Cup with the Blues, St. Louis Blues, a couple years ago, and he was playing like 25, 26 minutes a night, which is a lot. Like, you don't ever get over 30 a game normally. But a lot of his more kind of advanced, I guess, if you will, metrics were just awful. Like, yeah, I think. And then on the other side, I think, like, you know, I mean, I'll just say, like, a player like Jared Spurgeon, who's a defenseman that plays for the Wild right now. He's just really small.
Starting point is 00:27:19 And he just is pretty unassuming. But he just, like, comes one, like an elite defenseman, like he's like five, nine, and he's like 170 pounds. And, but he just is so good in transition and he never takes any penalties. And that's another thing that people just don't really realize is that defensemen who play a lot of minutes, if they do not take penalties, like that's actually really valuable because when you take a penalty, you're putting your team at a disadvantage from a goal perspective. And so that's something that a lot of people don't really understand. So like Connor McDavid, he draws more penalties than anybody. He's just a
Starting point is 00:27:55 god because he's so fast and every team's just like, they do anything to slow him down. And he draws so many penalties, which is super valuable. And it's the same way with taking penalties. You know, I think that is one of the things that's pretty overlooked. And if I'm trying to think of like other like kind of I think I think probably like a more famous player would be like Patrick Kane for the Blackhawks, who's just atrocious defensively. Well, now he is. He was. He was and he didn't used to be. But now he's just really bad. And they keep putting him out in these just, you know, late game defensive scenarios. And it's just like, what are you doing? Like, that's not that's not like what he's good at.
Starting point is 00:28:31 Like, that's not where you should put him out. But yeah. So tell me about goalies. How hard do they make your life? What's the difference between them in terms of magnitude? How long does it take to tell what their true talent is? Is it, like... Yeah, in the sense that they're so different from every other position, I would say it's harder. I would say goalies cause so much pain and headaches and issues because
Starting point is 00:28:59 they, yes, they require a lot of time and a lot of shots against to kind of start to determine. I would say, at any given time in the league, every team generally has one starting goalie and then a backup goalie. And that's who there is on their roster. And at any given point in the league, you have basically, yeah, what is it, like 60 goalies? Yeah, with two more teams. Yeah, and then you'll have some funny scenarios where you have emergency goalies, whatever. But those goalies, they don't play every game. They play maybe 60 games out of the, you know, the 82-game season or whatever. And so you need
Starting point is 00:29:34 a lot of seasons, and year over year you just see goalies go from the best goalie in the league to, two years later, they're just terrible, you know, and then the next year they're great. There's just so much variability there. And the other thing, too, is that there are certain aspects about looking at how a goalie moves. Like, we don't currently track goalie positioning, you know, where their glove is, or where their stick is, or how their movement is side to side. Those kinds of things are more granular, and we don't have that. So it can be very hard to look at those things. I would say that one of the biggest things with a lot of the more advanced stuff that has come out is just to try to get the goalie out
Starting point is 00:30:14 of analysis, because there's a lot of noise. It's almost entirely noise when it comes to goals against for a skater. Goals against when that skater's on the ice is just basically, there's just nothing there, because it's so random when you look at the entire population. And there's just a lot of things that you have to do. I think xG, or expected goals, does a pretty good job of kind of removing the goalie and looking at how players or skaters do defensively, because we can evaluate, is this player giving up a ton of high-quality chances, just using that model? Are they really good at suppressing those kinds of chances? And then, yeah, tagging in with that, with goalies you can take an expected goal
Starting point is 00:30:53 model and say, how many goals did they allow versus their total expected goals, you know, faced? That's a pretty common thing right now. But, I mean, to be honest, I've heard some theories, because it's actually really hard to show that goalie skill is, like, repeatable. It just looks like noise. If you look at the goalie stats and you try to do repeatability-type measures, it just looks like noise. And I've heard people kind of theorize that goalies' primes are too short, basically, for how much of a sample you need to get a good measurement. That most goalies don't play in the league long enough, which would be, like, you need basically five full seasons
Starting point is 00:31:33 for like every goalie and that just doesn't happen. There's like at any given time, you can be sure that there's probably two goalies who are legitimately good and there are probably two or three goalies who are legitimately bad and then there's everyone else that we don't really know about. Yeah. That's kind of goalies. That's basically goalie analysis. And yeah, there's a thing they say in the hockey world that goalies are voodoo. And it's essentially
Starting point is 00:31:52 true. It's just really, really difficult. That's interesting. If you could discern their true talent, would you guess that there would be a wide spread or not really? Because if there were, then you'd think that the goalies would be super important. They are the ones who let the goals in or not. But is it just sort of that the replacement level is really high, presumably? It's really funny you bring up the idea of, like, replacement level
Starting point is 00:32:16 because we have a WAR model that we built as well. And we also have one for goalies. And it's kind of interesting, because I was kind of curious if we were going to get any baseball people looking at it. And we haven't talked about this yet, but there have been several WAR models that have existed, but a lot of the people who made them now work for or were hired by teams, so they're not available. Ours is still available. But we got into it with Tango, or Tom Tango, about this, because he was just adamant that
Starting point is 00:32:40 we were giving too much value to goalies our Our model said that goalies were worth too much. He was like, they don't get paid for the amount of value you're giving them here. And there were some other things and it was legitimate criticism. We went and looked at it and talked to a lot of our, I don't know, peers or the people that we kind of interact with.
Starting point is 00:32:56 And it's still kind of up for debate. I think that in terms of just looking at like the value they add, I think that a lot of it is team or system dependent. So it can be kind of hard, even with some kind of expected goal model, it still can be hard. If, for instance, a coach deploys a certain type of system, and you know, that they, I don't know, say they dump the puck a lot or something, even, or maybe they allow points from the, a lot of shots from the point, because that's a good, those are lower probability shots.
Starting point is 00:33:26 But the goalie, maybe the team is bad at getting on the way or there's a lot of deflections. There's a lot of stuff that can get in the way of just that. But I would guess that the spread of talent for goalies isn't that wide. Like I would guess there's a couple players on either end. But it is funny because when we've looked at replacement level, kind of looking at multiple methods, like it looks like replacement level goalies are terrible. So it's an interesting kind of, it's hard to show like skill from a season to season or multiple season through multiple season span. But there is a clear replacement level. So it's kind of like, I think that it's kind of a mixed bag there. Like it's hard to really say anything. It's honestly, it's hard to say anything about goaltending at all, I think.
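
The goalie measure described a little earlier, goals allowed versus total expected goals faced, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Evolving-Hockey's actual method: the function name `goals_saved_above_expected` and the per-shot xG values are hypothetical, and a real model would supply each shot's xG.

```python
from typing import Iterable, Tuple


def goals_saved_above_expected(shots: Iterable[Tuple[float, bool]]) -> float:
    """Return expected goals faced minus goals actually allowed.

    Each shot is (xg, was_goal), where xg is some model's estimated
    probability (0 to 1) that the shot becomes a goal. A positive result
    means the goalie allowed fewer goals than the model expected.
    """
    expected = 0.0
    allowed = 0
    for xg, was_goal in shots:
        if not 0.0 <= xg <= 1.0:
            raise ValueError(f"xG must be a probability, got {xg}")
        expected += xg
        allowed += int(was_goal)
    return expected - allowed


# Made-up shots for illustration only (not real NHL data).
sample_shots = [(0.25, False), (0.5, True), (0.125, False), (0.125, False)]
print(goals_saved_above_expected(sample_shots))
```

As the conversation notes, a number like this is very noisy over a single season, so it says little about true talent without a large sample of shots.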
Starting point is 00:34:08 Well, I pity you. We're lucky because we can just say, sorry, we don't do goalie stuff. We give you some stats you can look at, but there are other people. You can look at them, but, like, just don't. We've done some advanced stuff with that, and I'm not going to, like, put my stamp on that, and I can use it. Well, we won't put our stamp on it. Yeah, no, we will. Here's the thing I'll say. All you need to say is, like, Henrik Lundqvist
Starting point is 00:34:30 is the best goalie and he's been the best goalie for 10 years. Yeah, it's just Henrik Lundqvist is the best. John Gibson right now. Those two, and you're fine with those. Yeah. Just go with those two and you'll be fine. That's interesting because obviously, you know, just visually, there's a parallel with catchers in baseball and they are very distinct from all the other positions. And catchers were kind of like that for a long time in that it was really hard to quantify a lot of the things that they did. And recently that's become much easier to do or much
Starting point is 00:35:00 more feasible. And so now we know, oh, this is how good this catcher is and how good that catcher is, at least in some aspects of their performance. And it turns out there is a huge difference between them. And so I wonder whether that will ever be the case for goalies, or obviously it's just a different job description. And I guess unless you had, I don't know, very precise body positioning or stick positioning or something like that, it would just be hard to break it down in the same way. But I will ask you about tracking stats in a moment, so maybe we'll get to that. But you alluded to hockey analysts getting hired, and I know there was sort of a watershed time for that, 2014, the summer of analytics. We had Sean McIndoe on this podcast at the time to talk about that. So what caused that run on hockey analysts at that time, and what have the ramifications
Starting point is 00:35:52 been for the public hockey analysis community? Because obviously this has happened in baseball too. Tons and tons of smart people have gotten hired by teams, but we've been kind of lucky in the sense that we haven't lost Baseball-Reference. We haven't lost FanGraphs. Like, the stats repositories are still out there. But from what I understand, a lot of that data, at least in publicly accessible form, kind of went away for a while. Yeah, that's absolutely true. I think some of the first websites definitely, there were a couple. Like, Extra Skater was one. Darryl Metcalf, is that?
Starting point is 00:36:28 Yeah. There's a lot of names. Darryl Metcalf did a website, and he got hired by the Maple Leafs, and that went down. And then War on Ice was one. So the first real WAR model was on a website called War on Ice. And that was run by three people: Andrew Thomas, Sam Ventura, and Alexandra Mandrycky. And all three of those individuals got hired by teams. And then their website went down. They did leave it on GitHub, but nobody really took up that task to remake it. And then there were a couple like Behind the Net. Gabriel Desjardins, I think,
Starting point is 00:36:59 did. And then he did, he didn't really get hired, but I think maybe he did. I can't remember. There's a lot. But yeah, it's just been a bummer because a lot of the time people who run the sites, it's just one person or a small team of people who do it. It's not like it's kind of maintained by... I'm not exactly sure how Fangraphs was able to stay up or... I don't know the history about it, but... It's David Appelman who just kind of
Starting point is 00:37:25 started it himself and didn't really want to work in baseball or did initially. But after that was just more interested in building his business and creating his own thing, which I think has been fortunate for all of us. So that's the thing that's been super fortunate, I think, in that. And it's just that there are a lot of people got into it. And I don't want to presume I know where a lot of people did. I think early on there were people who just did it as a hobby and then basically teams started to see that this kind of work could lead to results and so they just the same thing in baseball they just kind of they targeted certain blog at the time it was a lot of bloggers and people who wrote for their own site or some of the SB Nation stuff back
Starting point is 00:38:01 then, and a lot of those happened to run websites. So there's kind of this funny joke where a website would go up and everyone was like, this is great, I love it, there's great data, there's great charts. But there was always this constant tension of, eventually this site is going to go down. And that happened over and over and over again for various reasons. And right now there's essentially three or four, or really two, sites that provide really comprehensive data. For skaters. For skaters and goalies. So our site, Evolving Hockey. I'm not trying to self-promote here, but it's kind of our site. And then there's a site called Natural Stat Trick.
Starting point is 00:38:41 That's fantastic. It's been around for a long time in hockey stats. It's like four years, I think. And that's run by Brad Timmins, I think. Timmins, yeah. Yeah, Timmins. It's a really great site. And then Micah Blake McCurdy is probably one of the bigger names.
Starting point is 00:38:56 And he has a site called hockeyviz.com, but it's mostly visualization and charts. And you can't really go there and download tables or CSVs to do stuff with data yourself. And then MoneyPuck would be another, but they're mostly focused on kind of more of the betting side. They don't have, like, a betting model, just in-game win probability type stuff and stuff like that. I mean, I don't want to speak for everyone else, but it seems like right now it's pretty stable. There was corsica.hockey, which was run by Emmanuel Perry, or Manny Perry. And it was really, really great, but he ultimately
Starting point is 00:39:27 took that down. And so, yeah, it's kind of this, I don't know if it's just maybe there are fewer fans. It also doesn't seem like there are the people who have the technical knowledge. I mean, for us, it's kind of almost a miracle that we even have a website. Because, like I said, we were music majors. I mean, we just kind of came into this. We taught ourselves the programming language R, and then luckily we could kind of do a website with that. But it's a lot of work. So has there been much adoption by players and by coaches? Is this still mostly a front office thing in hockey, or is it really penetrating down at the ice where you
Starting point is 00:40:05 have players talking about these things and embracing these ideas? I think it's mostly in, like, a front office type, if that. I mean, I think there's probably about half the teams that have, like, a legitimate analytics department, if you will. You know, more than one person. Maybe about half the teams have more than one person. It's important for us to say that there are still teams right now that do not employ anyone who looks at anything data related. So there's a couple teams that just are like, whatever, we don't care. So that's kind of where the state of the sport is. But then there's maybe like five or six, like the Maple Leafs have a really big team.
Starting point is 00:40:40 Hurricanes, the Colorado Avalanche. You know, I think the New Jersey Devils just hired a couple people this summer. The Minnesota Wild did, and then they had a GM fiasco, and he just basically pissed everybody off, and they all left. But anyway, in terms of getting down to the player level, I think from a public-facing side, we just don't really have the data to make something that I guess you'd say is actionable. It's kind of unfortunate. Like we have, like, I think Josh mentioned earlier that I think we're pretty confident with like being able to assess player talent or skill, like from the methods we have, but that's, we can't really determine why. I think that that would basically, you know, if a player was wondering, I think
Starting point is 00:41:20 there's some kind of general concepts, like basically just knowing that shots closer to the net have a higher probability of becoming a goal. I think there's some work with manual tracking, and that would be Corey Sznajder, who tracks a lot of games manually, and that'll give, like, zone entries as a thing. Zone entry defense, so how well does a player defend against zone entries, because that's kind of been shown to be a thing. You know, if a player is good at entering the zone with possession, that would be a thing that's good. Stuff like that. So I think there's some concepts like that. And then Ryan Stimson
Starting point is 00:41:59 had worked on the passing project, basically showing where the most dangerous passes are coming from. And that was also a manual tracking type thing. But a lot of that stuff is kind of hard to apply. You know, it's very different than in baseball, where you can have a pitcher who can look at, you know, the heat map for each one of the batters that they're facing, and then knowing each batter's spray chart so you can then shift. Stuff like that, we just don't have the data at a granular level like that right now. So it's really pretty hard in general to apply very specific things at an individual skater or goalie level. So have there been changes yet in terms of play style and aesthetics of the
Starting point is 00:42:43 sport? I mean, have these ideas become pervasive enough that they've actually changed how players play in a way that has either improved the sport or hurt it from a spectator perspective? I would say that there probably have been a few things, but it's kind of hard to separate that out and say that that was the direct result of the decisions made based on numbers, right, or data. A lot of it is just that out and say that that was the direct result of the decisions made based on numbers or data.
Starting point is 00:43:06 A lot of it is just that teams have realized that the best players are at their peak earlier than we thought. And so they're starting to bring players up who are younger. And what you see, what used to be thought, I think, and I'm not sure on the recent work on this, but in baseball, what's the kind of the peak is maybe what, like 28, 29? Does that seem right? Yeah, it was sort of that 26 to 29 range. Now it seems to be getting younger also in baseball. And so hockey, I think you look at a lot of aging curve work, and we did some early on when we were looking at this. And you kind of see the player peak is maybe between 22 to 25, like after that is kind of maybe even like 21,
Starting point is 00:43:48 maybe 21. And so what teams have kind of done is they've, they've started to remove. And this is also, we haven't even talked about this yet, but fighting is going is dropping, which is its own thing, but you don't see nearly as many fights as you used to do.
Starting point is 00:43:58 And teams used to just play a guy who would fight. That was their own, like for the most part. And they weren't very good overall, but every team had an enforcer. It was was called and they're just enforcers are kind of just a dying breed they don't really exist so teams have just gotten faster they've gotten generally like smaller so smaller players and more skill oriented players but that might be more of a kind of just younger type of style of game where since fighting is down and it's not like you know
Starting point is 00:44:24 you're not looking at the 1970s and 80s where you'd have, you know, the "I went to a fight and a hockey game broke out" type of mentality. That's just gone, because it's just not really promoted in the lower leagues. But I'm trying to think if there have been, like, I would say, I think, dumping the puck. Yeah. So for people who don't know, dumping the puck is basically a style of play where once you reach the center line, so if you're coming from your defensive zone skating, you have to reach center ice, and then you can throw the puck down to the other side behind the goalie. And basically, the idea was that you would throw the puck back in deep, and then you would just skate in really hard and hit the other players and try to get the
Starting point is 00:45:11 puck back. Or just beat them if you're faster. So generally the system is called dump and chase. And that was kind of, I guess, criticized early on. And it still kind of is, although it can be successful if you have the right players for it. But a lot of the time it's just better to have players who are good at entering the zone with possession. I guess a comparison would be like in football, when you punt. You know, I guess that's not really it, but it kind of is, because you're just giving it away, and then you're just, like, assuming you're going to get it back later. But I think that, and then another thing that's kind of been a big thing is pulling the goalie. So there's been a lot of research, or a decent amount of research, about
Starting point is 00:45:51 showing that if you're down by one goal, you should actually be pulling, like, at the end of the game, if a team is down. Just for people who don't know, the goalie can be substituted for a skater at any point in any hockey game. It's just because they're considered another skater. And it doesn't have to be the end of the game, either. You can just do it whenever you want. Yeah, you could do it if, say, you have a five-on-three power play. You could just pull your goalie and put a sixth skater out there if you wanted to. That's a pretty bold move, but it potentially could pay off. I'm not really sure. But basically, there was some research showing that if you're down by one goal, you should really be pulling the goalie with like six minutes left instead of two or one and a half minutes. And if you're down
Starting point is 00:46:34 by two goals, it was something like nine minutes, just because, like, you're already losing, so you're really not gaining anything by just waiting. So that's kind of a big thing. Now people are just, like, yelling at coaches to pull the goalie earlier. There aren't a lot of teams that have adopted it. There was Patrick Roy, who coached the Avalanche for a couple years, and he would pull the goalie with, like, yeah, 11 minutes left in the third. It almost was like this thing where coaches were just, like, almost offended by it or something. It was so early and it went against so many things.
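
The trade-off behind pulling the goalie earlier can be illustrated with a toy model. This is only a sketch, not the research mentioned in the conversation: it assumes goals arrive as independent Poisson processes at constant rates, it only asks whether the trailing team scores the next goal before time runs out, and every rate below is a hypothetical placeholder rather than a measured NHL value. The function names are also made up for the example.

```python
import math


def next_goal_is_ours(rate_for: float, rate_against: float, minutes: float) -> float:
    """P(a goal is scored within `minutes` AND the trailing team scores it).

    Assumes independent Poisson scoring at constant per-minute rates.
    """
    total = rate_for + rate_against
    if total <= 0.0 or minutes <= 0.0:
        return 0.0
    return (rate_for / total) * (1.0 - math.exp(-total * minutes))


def tie_probability(
    minutes_left: float,
    pull_with: float,
    normal=(0.045, 0.045),  # hypothetical (for, against) goals per minute
    pulled=(0.11, 0.30),    # hypothetical rates with the goalie pulled
) -> float:
    """Chance the trailing team scores next before time expires, playing
    normally until `pull_with` minutes remain and then pulling the goalie."""
    pull_with = min(max(pull_with, 0.0), minutes_left)
    normal_minutes = minutes_left - pull_with
    p_normal = next_goal_is_ours(normal[0], normal[1], normal_minutes)
    # Probability that nobody scored during the normal-strength phase.
    no_goal_yet = math.exp(-(normal[0] + normal[1]) * normal_minutes)
    p_pulled = next_goal_is_ours(pulled[0], pulled[1], pull_with)
    return p_normal + no_goal_yet * p_pulled


for pull_with in (0, 1, 2, 4, 6):
    print(pull_with, round(tie_probability(10.0, pull_with), 4))
```

Because the rates are placeholders, the sketch shows the shape of the trade-off (a higher scoring rate for both sides once the net is empty), not the six-minute or nine-minute figures quoted above.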
Starting point is 00:47:08 One of the most acclaimed goalies was one of the most aggressive at pulling the goalie. Yeah. There's a connection there. He ended up being a pretty potentially not good coach, but that was one thing. Last I read,
Starting point is 00:47:24 there were plans for a puck and player tracking system to debut in the next playoffs, right? So is that happening still? And how do you expect that to change things either internally or externally if any of that data will actually be public? Yeah, so player tracking. It's a thing. We've been hearing about it for a while now and it's uh for those who don't know it's if maybe some are familiar with the NBA's um player tracking system it's a video system that the NHL has been working on to do like it's I mean a number like ton of
Starting point is 00:47:55 captures per second, and they've been working on it for a while. And the last thing that people had heard was the playoffs this season, as you said. The league isn't always the most forthright with information about these things, and we haven't heard a lot about it. It's kind of interesting because, and maybe this is something I can talk about later, it's been rumored for a year or two that we're going to be getting it, and there was this kind of thought that maybe the public would have access to some or all of the data. But really, I think what we're probably going to get... Did they really? Well, no, I don't think that was ever,
Starting point is 00:48:30 yeah, I think really what kind of happened is fans just wanted it. Oh, fans wanted it. So it's probably not unlike Statcast, kind of how that was rolled out. And ultimately what happened with Statcast is that it's not really available to the public outside of a few sources. And I think that's probably what's going to happen from a public-facing standpoint with this: the league may make it available to maybe someone, or a couple of journalists or whoever, and they'll maybe have some third parties who do some stuff with it. Not to mention, I mean, it's very large. We, for instance, probably, I mean, we don't have a big enough computer or server or whatever to work with.
Starting point is 00:49:06 I would guess that if the public had access to a full season of raw player tracking data in the NHL, they would not be able to afford to work with it. Yeah, and that's kind of like the scope that we're looking at. But it does, I think, like I do think the league, I think from kind of what the rumors are and what has been reported, their real incentive is they want to help broadcast. So they want to have a name floating around on a skater or look at how fast is a skater skating. Kind of similar to the NFL with showing passing or like running like receiver routes and stuff like that. Like, I think that's kind of more of what the idea is with it. And so, yeah, that's pretty much like where it's at right now i think a lot of fans are like oh well once we get player tracking data we'll be able to know this it's like you know i'm just gonna say right now like sorry anybody who
Starting point is 00:49:53 thinks you're getting that but you're not gonna get that and also anyone who thinks that we can like even in basketball there's still they're still trying to figure out the best ways to use it is my understanding and it i think there's there's kind of this overhanging cloud over hockey the state of the current like 2019 2020 season is there's a lot of work that that could probably be done but i think there are some people who are just waiting to see what happens with player tracking because there has been i and we don't need to get in this too much but a lot of the pushback against some of like the war models have been the data isn't good enough and we need to wait for player tracking before we can actually figure out how to do this. And so that is kind of we've as model creators have had a lot of pushback of people saying we need to we need player tracking and we're somewhat skeptical.
Starting point is 00:50:46 you is that you were the first or among the first to notice that the NHL shot location data this season was off, that it was misreporting where shots were actually taken, which was screwing up a lot of stats. And I know that the league said that it would look into that in response to those public reports. So what happened there and has that been straightened out? Yeah, actually, that was a really bizarre thing. It was a doozy. Yeah. Yeah, they did fix it, actually. So we, essentially, what happened was I was just looking at early season results with our expected goals model. And generally, with an expected goals model, I know there's some weird things with probabilities that potentially are maybe going against traditional probability theory. but if you just add up every expected goal taken in the league, it should sum to about the actual goal total in the league. And so because we built a new expected goals model this summer,
Starting point is 00:51:38 Well, we both did. In our own ways. Yeah, Josh built the model, actually. I do most of the data side stuff, and he does more of the modeling. Anyway, just to make sure that nothing was wrong, maybe three weeks into the season, I was looking at xG totals on our site, and then on Natural Stat Trick and MoneyPuck, and xG was way below actual goal scoring. I mean, really, really low. I think the goal totals at the time were, like, there were 160 goals and our xG models were at 100 xG. I was like, what is going on? Because I thought we were doing something wrong. Then I went and looked at the other sites. They were basically in line with what our xG model said. And so then we went and started looking at the actual shot locations because, yeah.
Starting point is 00:52:25 Well, I was going to just, and then it was kind of a team, it was like teamwork, because, you know, we had done a lot of work this offseason. We did some new models and we were evaluating a few of them early on. And we do this thread on our Evolving Hockey Twitter account where we take our expected goals model and we look at, you know, the highest xG shots, what we'd call them. So the highest-probability shots that occurred in the prior week or something like that. And I was just noticing when I found those that not only were the values lower than what we maybe thought they should be, the distances were all off. The NHL is generally pretty good with their distances, but these were,
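
The sanity check described here, that league-wide expected goals should sum to roughly the actual goal total, can be sketched as follows. This is a minimal illustration: the function names and the 10% tolerance are assumptions made for the example, and the numbers only mirror the rough "100 xG versus 160 goals" figures quoted in the conversation, not real shot data.

```python
from typing import Sequence


def xg_calibration_ratio(xg_values: Sequence[float], goals_scored: int) -> float:
    """Return total expected goals divided by actual goals.

    A well-calibrated model should give a ratio close to 1.0 over a
    large sample of shots.
    """
    if goals_scored <= 0:
        raise ValueError("need at least one actual goal to compare against")
    return sum(xg_values) / goals_scored


def looks_miscalibrated(
    xg_values: Sequence[float], goals_scored: int, tolerance: float = 0.1
) -> bool:
    """Flag a sample whose xG total differs from actual goals by more than
    `tolerance` (a hypothetical threshold chosen for this sketch)."""
    return abs(xg_calibration_ratio(xg_values, goals_scored) - 1.0) > tolerance


# 400 made-up shots at 0.25 xG each: 100 xG total against 160 actual goals.
early_season_xg = [0.25] * 400
print(xg_calibration_ratio(early_season_xg, 160))
print(looks_miscalibrated(early_season_xg, 160))
```

A ratio that far below 1.0 across several independent models points at the shared input data rather than at any one model, which is the conclusion the speakers reached.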
Starting point is 00:52:59 you know four or five six feet away from the net and so then we both kind of did a little bit it was a sunday night on i won't forget this is that we did a lot of just like went through a couple different ways of looking at how, where the shots were coming from, where they were being tracked, maybe the distance from the net. And we realized basically that all of the shots that were generally right in front of the net or around the net had kind of just all been pushed away from the net. And what, with an expected goals model, generally, distance is a very significant feature in an expected goals model. So the closer the shot is to the net, the higher the value, for the most part. Distance is really king. And what we notice
Starting point is 00:53:34 is that compared to at this point in the last three seasons, the distances were being tracked further away, and the values were lower because of that. so we did a thread on twitter it kind of blew up a little bit it got somewhat picked up uh we had several sources uh and like team and league sources kind of reach out and say they had been somewhat looking in that and then it was kind of confirmed by a few journalists greg waschinski and elliot friedman both picked it up and then kind of were able to confirm the league that it was a problem they identified and then it took them what maybe i think it took them three weeks to go fix it was a problem they identified and then it took them what maybe I think it took him Three weeks to go fix it
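
Since distance is described as the dominant feature, here is a small sketch of how a shot's distance from the net can be derived from tracked coordinates. It assumes a coordinate system in feet with center ice at (0, 0), the attacking net on the positive-x side, and the goal line at x = 89; the function name and the sample coordinates are hypothetical.

```python
import math

# Assumed x-coordinate of the center of the goal line, in feet from center ice.
NET_X = 89.0


def shot_distance(x: float, y: float, net_x: float = NET_X) -> float:
    """Straight-line distance in feet from a shot location to the net,
    for a shot already oriented so the attacking net is at (net_x, 0)."""
    return math.hypot(net_x - x, y)


# A shot from the top of the crease, and the same shot recorded 5 feet farther out.
true_location = (85.0, 0.0)
shifted_location = (80.0, 0.0)
print(shot_distance(*true_location))
print(shot_distance(*shifted_location))
```

A systematic shift of a few feet matters most for shots near the crease, where the recorded distance can more than double, so a distance-driven xG model would undervalue exactly the most dangerous chances.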
Starting point is 00:54:06 Well, they put in an immediate fix, because from what I heard, and I think it was reported, it was a problem with the user interface for the trackers. The way the trackers work is they have a rink and they can just click and point where the events occur, and that automatically plots a coordinate on an XY grid. And my understanding was that the rink layout they had was not translating correctly to the actual coordinates around the net. And so essentially the net was too big, is what ended up coming out. Yeah, so, long story here, they put in a fix basically within a week,
Starting point is 00:54:46 and then they backlogged, or they updated all of the coordinates. It was the first 91 games of the year, and they went back and they fixed all of them. So, you know, props to the NHL because they actually went back and, like, you know, from what I, I mean, it seems like it was a whole bunch of work to get those fixed. So it was a pretty, you know pretty nice reminder that the league is, they have a concern for the data integrity
Starting point is 00:55:10 because it's good data and it's free and it's very nice the league provides it. All right. Well, I was just reading an article from the Seattle Times from August that was headlined, All Eyes on How GM Ron Francis Will Build Out the Analytics Department of Seattle's NHL Team, which is going to be starting in 2021.
Starting point is 00:55:29 And it says in the article, among bloggers not already with teams, Seattle fans might keep an eye on twins Luke and Josh Younggren from Evolving Wild in Minnesota. They have as good a grasp on analytics as any bloggers. So I guess enjoy Luke and Josh while you can, and, yeah, Evolving Wild, while it's still out there. So maybe one of you can work for a team at a time so that the other one can keep the site going, and then you can just rotate or something, just for the good of the hockey analysis community. But yeah. Well, thank you. We are currently enjoying our current status, and we really enjoy it.
Starting point is 00:56:05 We very much enjoy our work in the public right now. Well, I guess we'll just— Never say never, supposedly. I guess never say never, but we really enjoy working in the public, and we have this whole website that we've spent years making. So it's a tough pull, I guess. It's hard to get away from that, I would say. Well, you can find Josh and Luke at Evolving Wild on Twitter.
Starting point is 00:56:28 You can also find their site at Evolving Hockey and at evolving-hockey.com. They are also Patreon people, so you can find them at patreon.com slash evolvinghockey if you want to support their work. And we appreciate your coming on. This has been fun. Yeah, thanks so much. Thanks so much, man, for having us. It's been a great time. Yeah, pleasure. Okay, let's pause for a quick break and we'll be right back to talk cricket. I'm joined now by Jarrod Kimber, who is a cricket jack of all trades.
Starting point is 00:57:18 He has done it all in the cricket world. He describes himself as sort of cricket's John Hollinger, but he has been a writer and an analyst and a commentator. He's written for ESPN's cricket site. He has served as an analyst for cricket teams. And he's probably done many other things, which maybe we will cover at some point in this interview. Jarrod, welcome. Yeah, I mean, I'd like to say a very unsuccessful version of John Hollinger. But yeah, essentially a similar sort of thing, except I suppose I did most of mine in reverse
Starting point is 00:57:46 but yeah, basically I have taken every job in cricket that will pay me, because I like to pay my mortgage. Yeah, that's a good strategy. All right, so just a general big-picture question. If we were to rank cricket on a scale of ease of analysis, if you said that baseball is a 10, it's the sport that lends itself maybe most readily to statistical analysis just because of the structure of the sport and the record keeping and all of that. If baseball is a 10, what would cricket be just in terms of how the game itself flows and is organized? Well, it's obviously a very similar sport to baseball on a sort of basic level. And it's a bit more complicated with the scoring system, which actually helps it more than baseball for analysis. But for hundreds of years, no one ever actually took that information at all, really.
Starting point is 00:58:47 Didn't do anything with all that information. And cricket has one big problem that baseball doesn't have, which is that the pitch changes all the time. And even the ball, we use the same ball for long periods of time. So the ball deteriorates during a game, and the pitch is a turf pitch, so it's made of grass, which means over five days, you know, in a long-form game, you can have an incredible change in the way that the game is played. And you can imagine that the grass in Pakistan is not that similar to, you know, the grass in Cape Town, for instance. So on many levels it's quite easy to quantify a lot of this stuff, but it doesn't always translate to being a predictive measure. We can say which sort of players are good in certain situations, but when we go to the next ground, we can't tell you that that pitch is going to
Starting point is 00:59:35 behave the way that we think it's going to behave, because it's a living, breathing organism. Well, a lot of living, breathing organisms, in fact, all rolled together. And even over a three-hour period things can change. So if baseball's a 10, I'd probably put it somewhere around 5.5, six. One thing is that a lot of it is a bit like baseball: you have the bowler bowling to the batsman. It's not like basketball or football where you have to use computer tracking to work those sorts of things out. We know that a bowler has to bowl that ball to that batsman, a bit like a pitcher has to throw to a batter.
Starting point is 01:00:10 So in some ways we know a lot, but it's just all the variables of which we'll get to throughout the podcast, which make it a little bit more difficult. Yeah, so that sounds like taking baseball park factors to a much more extreme and complicated level, which is already a level that many sports don't have because they have a fairly standardized playing surface or at least the same dimensions and that sort of thing. And that's kind of an extra wrinkle to baseball. But cricket, it just takes it to another higher level that makes it even more difficult to analyze, I guess. Yeah.
Starting point is 01:00:44 And we also have different ground sizes. So we have the normal baseball problem, and on top of that we have the actual surface that the game is played on. And little things that you probably don't, you know, I haven't looked into all the stuff in baseball, but there's obviously a dew factor in certain parts of the cricket world as well. So the ball literally deteriorates at a completely different rate at different grounds, and different kinds of bowlers can't bowl if the dew comes in late at night for day-night games, or sometimes in the morning for day games. So there's a lot of very interesting external factors that do ruin my life. So give me a brief history of cricket analysis, whatever you would call modern sabermetric analysis as it exists in cricket. When did that start? How did it kind of catch on? And maybe
Starting point is 01:01:37 what are some of the major breakthroughs or places or times when it was implemented? Yeah, I suppose cricket basically invented batting average. It went across to baseball, but it's been a cricket staple, I think, since the late 1700s. I'm a cricket historian. I should know this. But I think very early on, they worked out that if you worked out how many times you batted and you divided by how many times you,
Starting point is 01:02:01 sorry, how many runs you made by how many times you were dismissed, you could work out the basic level of a batsman and a bowler. And so that was very early on. And what's incredible about cricket is even more than baseball, it just went, no, we're good. We've worked out this one metric. And to be fair, unlike RBIs and some of the numbers in baseball, cricket batting average is actually not a bad stat. Over 20 Test matches, say, or first-class matches, which are the longer games that go for three, four, and five days, it's actually a
Starting point is 01:02:30 pretty good metric. It's not perfect, because as I said, you might play 10 of those in India, and 10 of those in Australia on completely different pitches against completely different teams. But it gives you a bit of an idea. And so cricket just went, yeah, we don't need to worry about any of this. And then we got to about, I would say, 1970, so we're talking 180 years later, and cricket changed the formats. So we then started having limited-overs games. In a Test match, you can bowl as many overs as you want, essentially; it's five days of cricket. Whereas they wanted a game that was a bit better for TV, so they wanted a one-day game, which was anywhere between 50 and 60 overs per side.
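The batting average described a moment ago, runs made divided by times dismissed, can be sketched as follows. This is a minimal illustration; the function name and the `(runs, was_dismissed)` input format are assumptions made for the example, not any official scoring format.

```python
from typing import Iterable, Optional, Tuple


def batting_average(innings: Iterable[Tuple[int, bool]]) -> Optional[float]:
    """Runs scored divided by times dismissed.

    Each innings is a hypothetical (runs, was_dismissed) pair. Not-out innings
    add runs but no dismissal. Returns None when the batter was never
    dismissed, since the average is undefined in that case.
    """
    total_runs = 0
    dismissals = 0
    for runs, was_dismissed in innings:
        total_runs += runs
        if was_dismissed:
            dismissals += 1
    if dismissals == 0:
        return None
    return total_runs / dismissals


if __name__ == "__main__":
    # 100 runs across three innings, dismissed twice.
    print(batting_average([(50, True), (30, False), (20, True)]))
```

Because only dismissals go in the denominator, a batter who is often not out can average more than they score in a typical innings.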
Starting point is 01:03:08 And then when you finish those overs, the teams switch over. And so about that time, we realized that cricket changed. You know, those limited-format games were quite different than the original game in that efficiency was important for the first time. Instead of taking the big step forward and actually working out what that meant, what cricket came up with was a scoring rate system
Starting point is 01:03:28 so, if you're a bowler and you went for four runs an over, your economy would be four, and if you're a batsman and you scored at 70 runs per 100 balls, your strike rate would be 70. And again, these are not bad metrics, but nothing was added to them. No one really took them that much further forward. And then throughout the 90s, there's a big change, especially in Australian sport. So you might have seen this, I think the baseball teams have started using them, but a lot of American teams have, there's these things called the Catapult system. Yes, the wearable technology that tracks player movement and exertion and strain. Yeah. So that was incredible. One day I'll write one of the most boring long-form pieces in the history
Starting point is 01:04:11 of sports on on catapult system because the catapult system came out of an australian government academy system so you know usually you think of these things in sports as being money driven but this was literally driven by patriotism and trying to make australian athletes better and that was a big change in australian sport so if you look at the australia in the 1980s they were dreadful at the olympics i can say this because i'm an australian so it's okay i lived in the uk for a long time but i'm still australian enough that i can say we were we were pathetic in the in the 70s and 80s at sport. And when you come from a country that basically has Paul Hogan and nothing else, you know, the odd good B house, you know, B grade movie,
Starting point is 01:04:52 you know, we needed to be good at sports again. And we had been really good at sport, obviously, you know, Rod Laver and the golfers and, you know, we've been quite good in the Olympics. So Australia changed and came up with this academy and Catapult came through that but also cricket got a lot more professional in australia only it's really interesting that the other countries didn't cotton on what was happening and australia went on to dominate the game and
Starting point is 01:05:15 that's basically when the first major analytics movement came through. And in perfect cricket style, it happened in a hotel in India. I'm trying to think whether it was Kolkata or Mumbai. But basically the Australian cricket coach was staying in this hotel, and this lovely man called Krishna Tunga, who, I love him to death, but he's a mental guy, just a completely out-there sort of person. I think he'd been a fashion model, a cricket statistician, and there's some other random job, like he sold mobile phones. He's just one of those guys who gets really fascinated with things and goes off and does stuff. And so he watched every game of cricket around the world, because in India they're so obsessed with it that the games are broadcast back to India, whereas in most countries, if you grew up in Australia like I did, you almost never saw India play New Zealand or Sri Lanka play England. You only saw
Starting point is 01:06:03 australia play whoever they were playing. And then this was before, this is the whole cable TV boom as well. When that sort of comes up is, so, you know, America is a lot earlier than the rest of the world. For the rest of the world, cable TV really becomes a huge thing in the 90s.
Starting point is 01:06:16 And so cricket becomes this thing. This guy in India has all these cable TV channels. He gets every game and he literally sits at home and he watches them on vhs and he marks out well that ball has pitched there on the pitch and that person played this kind of shot to that and he got this amount of runs um and he does that for like thousands of cricket games just yeah over and over and over again and there's no automation of his system i've seen his system it's incredibly amateur like he didn't even do it
Starting point is 01:06:45 on an excel spreadsheet i think he did it on word which i mean anyone who's ever used word for more than eight minutes is probably having a panic attack at thinking of having to do thousands of cricket analysis on word and so so this guy takes all this information and he goes to this coach and the coach's name was john buchanan and he was sort of a outlier coach very much most cricket coaches at that point and cricket coaches were quite a new thing before that cricket coach was it was almost like a um ceremonial job because you know in cricket the captain is an incredibly important thing so obviously most sports in the world the captain is it's a it's a figure of you know it doesn't matter it doesn't matter who the captain is in the football they just give a different armband to someone in cricket the captain
Starting point is 01:07:29 it's almost like a combination of what a head coach and a quarterback do i see and so you are on the ground so let's say you're on the ground for two hours during a session of a test match there's no way for the coach to be bringing messages out because play is going on and and cricket actually eventually banned the coaches sending out too many messages, although there's many ways around that. But the coaching staff can't go out on the ground or anything for those two hours. So the captain makes all the decisions
Starting point is 01:07:52 when it comes to the strategy of where to put the fielders, of who to bowl, of where they should bowl and all these sorts of things. And so because of that job, coaching was sort of, a coach was someone who, if there was something going on you know in a training session they would have a say but mostly the captains ran the game and then through
Starting point is 01:08:10 the professional cricket australia change you suddenly had this huge boom of coaches getting really important but they were mostly as you would expect ex-players usually good players you know former stars there was a few sort of what i'd call educational coaches as well the sort of coaches you would say college level in american sports but a lot of coaches were you know just former players who who still didn't have that much to do with it but john buchanan was completely outside the box he ended up running new zealand cricket and getting people from lawn bowls involved in their management structure and he would give all the australian cricketers and i don't like if you want to like google an image of an Australian cricketer,
Starting point is 01:08:45 you'll probably get a big burly guy with a mustache who's smoking and drinking at the same time. And John Buchanan gave them all The Art of War by Sun Tzu. He wrote these incredible dossiers on the opposition players and then pretended to accidentally leave them around in the hotel so the opposition players would find them. He had this sort of very left-brain way of thinking about things.
Starting point is 01:09:08 And so he's in this random Indian hotel with an Australian tour. And this crazy guy who I said was a fashion model and everything else comes up to him and says, I think what you're doing in cricket is great, but I think it could be taken forward with this analysis. And John Buchanan looks at this stuff in front of him and he's smart enough to go, we've never had anything like this before this guy this guy can tell me how many
Starting point is 01:09:29 balls of different lengths have been bowled. And this is, I think it was 2001. Okay. So we're talking about, by this stage baseball is a long way ahead, although a lot of other sports obviously not so much. But Moneyball is already existing, even if the book hasn't come out and titled it yet, whereas cricket hasn't got anything. It literally has a madman going up to a random coach, and they formed this bond. And so for the next few years he did it. But the great thing about this is Cricket Australia was at that stage probably the second or third richest cricket nation on earth, and cricket is basically an international sport, so, you know, it's the international teams that run it, up until very recently.
Starting point is 01:10:05 And yet they didn't pay this guy, Krishna Tunga. And yet he kept giving them incredible analysis of lines and lengths to bowl on different pitches. As I said, you know, you need to bowl a completely different length in England than you do in somewhere like India. And when I talk about length, I'm talking about where the ball bounces.
Starting point is 01:10:23 So, you know, he's giving them all this information that no one had ever had before. Cricket Australia still didn't pay him, but he was so excited to be used that he kept doing that. And then through that, you would have thought there'd be a big boom because people started to talk about Australia using different strategies.
Starting point is 01:10:39 So one of the interesting things is, you know, pitchers don't really do anything other than pitch. Whereas a fast bowler in cricket has to run in for 30 meters before he bowls fast, and then he has to field in the outfield when he's not bowling. And so one thing that Krishna worked out was that some of these bowlers must be doing, you know, 5, 10, 15, 20 kilometers a day in the outfield. A lot of it's walking, but a lot of it's running as well. And he was just like, if that person needs to be the person who has the most energy, why are they doing the most work in the field? Which makes perfect sense. It turned out later on that when the
Starting point is 01:11:13 Australian team tried to put the fast bowlers in positions where they were not having to run around as much, those tend to be reflex positions, and those players weren't very good at catching the ball, so they got moved anyway. But the general thought, you know what I mean, it should have moved cricket forward a long way. But actually the thing that really changed was when Moneyball came out. There's a lot of crossover interest between cricket and baseball because they are quite similar. And also in England, you know, basketball is sort of a non-existent sport, so the NFL and Major League Baseball are sort of the American sports that those guys follow. And so there was a big
Starting point is 01:11:51 crossover. There's been a few guys in cricket try and make it in baseball over time. There's also a lot of, so baseball was a century ahead of cricket when it came to fielding, maybe a century and a half. It's that bad, how bad cricket fielding... I read an article of yours about that, that they basically just weren't tracking any fielding statistics at all, even something as basic as errors, or just the most basic things that we've moved past in baseball long ago, was just not recorded at all. Nothing, nothing at all. So a simple thing, we have a thing in cricket called an overthrow, where you'll throw the ball back to the wicketkeeper, who's essentially
Starting point is 01:12:28 the catcher. And if the ball goes past him, you can continue to run. Now, obviously, if you're a bowler and you've bowled a ball that the batsman can only hit for one run, and then the fielder throws it really poorly past the wicketkeeper, or the wicketkeeper lets it go through his legs, it's a very obvious thing that it shouldn't go against the bowling figures, but cricket has never done that. So around the sort of Moneyball thing was a bit of a mini explosion in cricket. And I do mean mini.
Starting point is 01:12:54 It was at the same time one of the most recent formats of cricket was invented, because we invent a new format of cricket about every five years at this point. There's a format of cricket called Twenty20, which goes for three hours, which is one of the most successful shortenings of a sport that there has ever been and might keep cricket relevant for the next 50 years.
Starting point is 01:13:15 It's had that much of an impact on the sporting countries that have had it, and especially on India. And that's where the money is because that's where all the people are. But Moneyball came out, oh, you're better at this, when did the Lewis book come out? 2003. Okay, so T20 cricket starts at the same time. And so you have English professional clubs, which is maybe the strongest club league that we'd ever had in cricket up to that point. They've invented this new form of cricket that is essentially to bring drunks to the ground after work. Right. And that's essentially most of it. A little bit of kids and families, but mostly drunks. The idea is it starts at six, so if you're in an English city like London
Starting point is 01:13:54 or Manchester or Birmingham, you finish work at 4:30, 5, you hop to the pub, have a quick couple of beers, then you go down for the match. You watch three hours of cricket, whereas obviously a Test match is seven hours, a one-day game goes for about eight hours. So, you know, it was a very clever thing, but it also changed the way we thought about the sport, because up until that point bowlers were the attacking ones. Now you had batsmen being the attacking ones. And as we talked about earlier, it changed cricket. The more you reduce cricket, the more you make it almost like basketball, in that it's an efficiency sport at that point. Each team is going to have 120 possessions, if you will, or 120 balls, or 120 pitches. We know that coming in. We know each team is going to have that. So it completely changed it.
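The efficiency framing can be made concrete with the two rate stats mentioned earlier in the conversation: strike rate (runs per 100 balls) and economy (runs conceded per over). The sketch below is a simplified illustration; the function names are made up for the example, it assumes six-ball overs, and it ignores extras such as wides and no-balls.

```python
BALLS_PER_OVER = 6
T20_BALLS_PER_INNINGS = 120  # 20 overs of six balls


def strike_rate(runs: int, balls_faced: int) -> float:
    """Batting strike rate: runs scored per 100 balls faced."""
    if balls_faced <= 0:
        raise ValueError("balls_faced must be positive")
    return 100.0 * runs / balls_faced


def economy(runs_conceded: int, balls_bowled: int) -> float:
    """Bowling economy: runs conceded per six-ball over."""
    if balls_bowled <= 0:
        raise ValueError("balls_bowled must be positive")
    return runs_conceded / (balls_bowled / BALLS_PER_OVER)


def projected_total(strike_rate_value: float, balls: int = T20_BALLS_PER_INNINGS) -> float:
    """Runs a side would make if it kept one strike rate for a fixed number of balls."""
    return strike_rate_value * balls / 100.0


if __name__ == "__main__":
    print(strike_rate(70, 100))   # the "70 runs per 100 balls" example from the conversation
    print(economy(4, 6))          # four runs conceded in one over
    print(projected_total(70.0))  # that strike rate sustained over all 120 balls
```

With the number of balls fixed at 120, a side's total depends almost entirely on its scoring rate, which is why the shorter format behaves like a possession-based efficiency sport.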
Starting point is 01:14:35 And so Moneyball sort of started to filter into the game at this English county level. But, you know, it wasn't, because there wasn't the, there wasn't 30 or 40. Well, I was going to say 30 or 40, but in the sabermetric community you probably had, what, three, four, 5,000 people throughout the 80s and 90s poring over all this information, looking for new stats and new ways of analyzing teams. And we didn't have any of that in cricket, because the actual ball-by-ball information, which, as I said, we've been using for years to actually generate our score sheets, no one actually collated at all. So what had happened
Starting point is 01:15:11 is this incredible database that cricket should have had from, well, you go back to the 1870s is when the first international, first Test matches started, although the first international sporting event was USA versus Canada in cricket in the 1850s. Well, 1847 or whenever it was. Sorry, fun fact. But yeah, so we'd had all this. And, you know, scorers in cricket are probably, of all the scorers in sport, the most honored, the most feted. Because it's a really hard thing.
Starting point is 01:15:38 I mean, if you're listening to this podcast, it's worth Googling a cricket scorebook. It's incredibly complex. It looks like gobbledygook in fact i i gave i was working with the scottish team recently and i gave a scoring card to one of the players and he handed it back to me as if to say what am i supposed to do with this and so it's a very complicated system and they all that for years all this information went away and went nowhere so even when you had the money ball thinking and you know they were looking for you know inefficiencies and they were looking for advantages and all the sorts of things that the sort of the oakland a's and other
Starting point is 01:16:09 teams of that time were looking for, cricket didn't have the database to go with. So you had teams like Sussex, who were desperate for this information, and Leicester, who came up with some really interesting ideas, but they couldn't really back it up as much as you would want because they didn't have the sort of system behind them. And that's why cricket was largely left behind. So there's no reason why cricket,
Starting point is 01:16:32 because as I said, it's such an enclosed game, batsman versus bowler. If you take out the fielding and cricket has taken out the fielding, sadly, you know, there's heaps that we could have looked at, but we didn't
Starting point is 01:16:42 because we didn't have that database. But so there's a very interesting website, I worked for this website for years, a website called Cricinfo. Cricinfo started before the World Wide Web. It started on chat forums in 1993. So basically all these students from India and Australia and South Africa all go to America to study. And then they get to America and they realize that, shit, there's no cricket at all. Like, you know, there's nothing. Obviously, as you would know, it's slightly better in America now than it
Starting point is 01:17:10 ever has been but it's you know hasn't been a cricket nation since since Bart King played in 1903 in England and uh terrorized the England players he was a baseball pitcher who was so bold and was brilliant um and sadly no one remembers him anymore but he's probably the last time there was a really good american cricketer as well and so cricket obviously died out there so all these students are over there and they start using relay chat and um i forget what it was called it was rsc the original google message boards well sorry well there's usenet which was uh yeah that was kind of the baseball, rec sport baseball was where that all germinated. Exactly.
Starting point is 01:17:47 So rec.sport.cricket was also there as well. Uh-huh. Okay. And so these guys, through rec.sport.cricket and then through relay chat, realized that if one of them has access to the radio or TV, they can start doing ball-by-ball commentary. That started in 1991. By 1993, they'd worked out a bot on how to keep the score, so if
Starting point is 01:18:07 you came into the chat room you wouldn't have to say, what's the score? You would just put in a code and the score would come up automatically, which gave them a database of ball-by-ball, which they obviously weren't thinking about. By 1996 it was one of the biggest websites in the world. Mick Jagger was involved in it. He was trying to stream video in 1996 on Cricinfo, which he did do. It was just terrible. It was one frame every five seconds, off the top of my head. But same again, Mick Jagger's traveling the world
Starting point is 01:18:33 trying to get cricket information, and he's all these bunch of nerds. So a lot happens, and essentially what is a billion-dollar company ends up being all the founders who were just cricket nerds. There's a bunch of rocket scientists and uh there's a professor at columbia who is you know was one of the early guys and they basically invented twitter you know the the automatic scrolling news system and they used of course being cricket nerds they invented an incredible platform and decided to use it for
Starting point is 01:19:00 you know uh getting one day scores between new zealand and Pakistan. So they've invented this. And then over the years, it eventually ends up in ESPN's hands. And in that sort of period between 1996 and 2007, when ESPN buy it, every ball in cricket is published, essentially. Every major ball in cricket is published on this website. And what starts to happen is people start thinking to themselves wait a minute if this company has all this information back to i think 2001 is maybe when it got professional enough that you could scrape data off it
Starting point is 01:19:34 but back to 2001 you can scrape all this data, they could start to work that out. So people would just use a spider to basically go through and collate all this information into databases. And the birth of that is around maybe 2008, 2009, people start doing it. But cricket's such a weird community. So one of my friends is a massive baseball fan and he talks about, you know, Moneyball all the time. And he's also maybe one of the most famous cricket statisticians in the world, and he collates his own database of cricket, which you can purchase off him, with all this ball-by-ball information. So information on literally every
Starting point is 01:20:14 delivery in the world. And I said to him one time, didn't you ever think of using that to analyze the game? And he just sort of looked at me like he hadn't thought of it. And so I think you've got people like Krishna Tunga who are thinking about it, and then you have a lot of other people in cricket who just are not thinking about it. And because there's no publicly accessible data, like, in order to get this Cricinfo information you literally have to, you know, scrape it yourself at this point. And so the first sort of big boom within cricket then becomes England, again massively inspired by Moneyball. And the English team used Moneyball a lot in their Test match cricket as well. They
Starting point is 01:20:52 started purposely trying to tire out bowlers, which has always been a sort of cricket fundamental, but they took it to a sort of, you know, well, a stats-based but also a zealot-type situation. And another thing that came through at the same time is, you know the Hawk-Eye machines that they use in tennis? Yes. So that was also a cricket thing. And I can't remember if it started in cricket or tennis first, but I think it might've been cricket, but it was roughly the same time. They've just started using those in baseball too. Oh, it's interesting. It's taken that long, actually. So it was brought in as a TV gimmick and eventually became part of the umpiring system. It actually became part of the umpiring system well before they ever tested it to see if it
Starting point is 01:21:28 actually worked, which tells you a lot about how cricket does things. But so essentially you've now got this system where there's all this Hawk-Eye data, and what happens is the English cricket team hire a guy called Nathan Leamon, who comes in and he has got a bit of a maths background, but he's also a cricket coach, and he knows a lot of fairly important people within English cricket. And they want to hire a Moneyball guy, and I think that was almost essentially how they pitched the job to him. And so he comes in and one of the first things he says is, all this Hawk-Eye data, of all these balls that are bowled around the world, who uses that? And there's, you know, a big look around the room, and it's
Starting point is 01:22:05 like no one uses that and he literally says well if we buy that can't we just tell so in cricket we have out swing and in swing so if you're a right hand batsman if the ball swings away from you that's out swing the ball swings in obviously it's in swing and he's like can't we just tell every bat what every batsman in the world does when the ball swings away from them could we not tell what they do when the ball spins away off the surface you know could we not tell how they go when the ball's bowled at their head because you're you're allowed to bowl a batsman's head in cricket and so he starts asking all these perfectly sane and reasonable questions and that sort of changes things massively and they buy the data and he now owns a company called crickvis which basically do that
Starting point is 01:22:44 professionally for a lot of different teams around the world. They also do it for broadcasters. And that was sort of the biggest boom. And then weirdly after that was when all these people started realizing that you could scrape this information offline. I would say now there's probably maybe a thousand people around cricket.
Starting point is 01:22:59 And I'm talking from professional to fully amateur people who have access to all this data and are basically coming up with their own metrics of how to improve cricket teams. And so as I said before, you have batting average and bowling average. We added economy and strike rate to that. Now, I've got something that I use called true economy, which is sort of like the true shooting percentage in basketball, which is sort of where I got the idea. There's weighted averages to work out, you know,
Starting point is 01:23:31 because cricket has, you might be playing against Scotland, the team I work for, one week, which is a sort of new professional team. And the next week you might be playing against India, which is, you know, one of the richest sporting teams in the world. So the quality of the players is obviously massively different, you know, for weighted average. As I said, there's also the grounds and the pitches, which play a huge part. So you have to factor that in. So there's a
Starting point is 01:23:54 ground in Guyana where it's basically fucking impossible to hit the ball off the square because the pitch is made of dog shit. And then you've got Nottingham, where you and I could hit sixes, it's so small and the pitch is so flat. So you have to start putting that in. So all those things are sort of coming in, but the reason we haven't had as many big booms is because none of the information's open. So CricViz are the only company that I'm aware of that own the Hawkeye data. And they're obviously a small company anyway. You know, if you compare them to someone like Driveline in baseball, I think they would be a fraction of the size with a fraction of the staff. No one else has access to that. And then when you're
Starting point is 01:24:36 talking about this scraping offline, a few people have that, but they don't always have the proper metrics to put in because obviously those are two different skill sets. And so you might have at this stage, if I'm being optimistic, a thousand people sort of looking at advanced analytics, whereas you probably have more subscribers on your website, on your podcast that can do that in baseball. And so we've had sort of little explosions. England certainly had a massive explosion with
Starting point is 01:25:05 using the Moneyball tactics and then using the Hawkeye data. Some teams in T20 cricket have had some really good success. So the West Indies team, basically, so to give you a very quick history lesson on West Indies cricket: from 1975 to 1994 they were arguably the best sporting team in the world. And they're a collection of Caribbean islands, they're not one nation, and they came together and they worked out that they could bowl really, really, really fast, and that they had a lot of guys who could bowl really, really fast, and they were going to use all those guys at once and basically scare the shit out of everyone. It literally changed the way that players protected themselves with equipment, because these guys were so good and so fast. And then their batsmen also
Starting point is 01:25:45 came on. So they dominated cricket for 19 years, and then for the next 10 to 15 years they were a bit like how Golden State's playing in the NBA at the moment, just an absolute shambles. You couldn't even tell if players wanted to play. They had trouble keeping their best players because they wanted to go and play in club teams, and it became a real problem. Except that when T20 came along, it turned out that the West Indies again had this natural advantage, in that they have a lot of players who like to hit sixes, which is when you hit the ball over the rope, so cricket's version of a home run or a three-pointer, I suppose. And they worked out using analytics that if they basically
Starting point is 01:26:18 played a bit like I suppose you would say well a bit like modern baseball with teams going for home runs more than trying to get the ball in play or a bit like, you know, the Houston Rockets in basketball. They worked out analytically that if they go for more sixes and are smart about it, no other team can hit as many sixes as them and that they will win. So they won two World T20 titles. And if it wasn't for infighting, they probably would have won three in a row.
Starting point is 01:26:42 Then we had a bunch of smaller successes in T20 franchise cricket. So club cricket is now, for the first time ever, sort of almost as big as international cricket. And we've had successes in the Pakistan League, the England League and the Indian League. And maybe you could also suggest in the Australian League, with various different strategies, most of which came from analytic mindsets of, you know, just
Starting point is 01:27:06 thinking about the game differently and, you know, taking new things. So it's still quite slow. I would say off the top of my head there's probably 300 professional cricket teams in the world. Of those, there's probably about 25 that are taking it anywhere near seriously. I think the last three times I've worked for a team in T20 cricket as an analyst, I've been the first analyst they've ever had. And when I say they had no analysis, like, when I walked into the St. Lucia Stars in the Caribbean Premier League, they weren't even capturing video.
Starting point is 01:27:41 So I was like, can you just give me a link to all the video from last year? And there was this, like, blank stare. They didn't even have the scorecards from the last year, so I couldn't even manufacture anything. So, you know, it's certainly got a long way to go, but it's been a very interesting, maybe, last three or four years as it started to explode. Well, that's fascinating. It's a little like the His Dark Materials books, where you have all these different interlocking worlds and they're similar in many ways, but in other ways,
Starting point is 01:28:11 they're just completely unrecognizable. So you have cricket and baseball, which are very closely related as sports. And then there are a lot of common elements in these origin stories. You know, someone just painstakingly recording things manually,
Starting point is 01:28:25 and then it moving to a forum and that kind of being the breeding ground for all this, except then it takes a lot longer to catch on and there are no fielding stats and it's just a much slower moving process. So that's sort of fascinating how it's similar in a lot of ways, but also very different. So I was going to end by asking you then what the dramatic changes in the game have been and whether they've been good or bad, but maybe there just haven't been that many dramatic changes yet. But in baseball, for instance, you get different types of players now than you used to because their skills are more valued, or you get certain strategies falling out of fashion and certain types of pitches are more common. Or, you know, how different
Starting point is 01:29:26 is it? Like, is positioning different the way that it is in baseball, where you have shifts now because you have data on where the ball goes? Or is it just not really used enough to really provoke that kind of sweeping change? Well, weirdly, that was one of the few things that cricket was massively ahead of baseball in, and we've always had shifts, partly because I think the bats are flatter, so you can aim the ball easier. But me as a non-baseball fan, I could never understand why you guys just didn't move the field more. It's quite clear to me that, I mean, having watched any sport,
Starting point is 01:30:00 whether it's golf, cricket, tennis, people have a swing of where they like to swing. So I found that very interesting in baseball. But in cricket, look, you're right, I don't think we've had as many seismic changes, but there's been some really interesting ones, especially of recent times. When T20 came in, so that's when the 20-over format game came in, so I said that was around 2003, there was this thought amongst the game, although I don't think it was thought very much by people who can think, but certainly there was a feeling that spinners, so we have bowlers who bowl at almost 50% the pace of a normal bowler, if you will, and because of that, by flighting the ball, they change their pace a lot. Obviously they spin the ball, sometimes in both directions, usually just in one direction, off the surface.
Starting point is 01:30:46 And everyone thought that they would disappear in cricket because of these limited overs. You know, the fact that if you're trying to hit sixes all the time and one guy's bowling 55 miles an hour and the other guy's bowling 90 miles an hour, you're probably going to try and hit the guy who bowls 55 miles an hour for six a lot more. Also, the guys who bowl 55 miles an hour can't bowl at your
Starting point is 01:31:05 head, so that does change the dynamics of that a little bit. But it turned out that when you're trying to hit lots of sixes, spinners who can spin the ball both ways become incredibly important, especially if you can't pick it out of the hand, which, you know, not all bowlers have that ability. It's a bit like, you know, in baseball, some pitchers are better at hiding their stuff than others. And spin bowling is very much like that, except slowed down to about half the pace. And so there was this sudden surge of these guys who could spin the ball both ways. And I don't know how much you know
Starting point is 01:31:35 about the rise of Afghanistan in cricket. But basically, Afghanistan have never been good at any sport, right? They've never played sport. They've been at war. The whole time sport's been a thing, they've been at war with themselves or someone else, right? And so a bunch of these Afghanistanis come out of refugee camps in Pakistan and they literally walk back across the Hindu Kush mountains with this new sport of cricket, and they nail it almost straight away. And
Starting point is 01:32:00 so one of the top paid players in all of professional cricket is a guy called Rashid Khan, who at 19, although he probably wasn't 19, but to be fair, with most of those Afghanistan guys, they don't know their exact birth date. So it's not like they're faking the system as much as they literally don't know when they were born. But he was quite young anyway and comes from nowhere and suddenly becomes one of the highest paid players in cricket because he can spin the ball both directions and a little bit quicker than other people can. No one saw that coming. And that was very much an analysis thing quite early on, of teams suddenly realizing that if you could find someone who could spin the ball both ways, they were much more valuable than a spinner who could spin it one way, which in Test cricket, the longer format of the game, is not as important because it's a different format. So that was one. We had one recently.
Starting point is 01:32:48 We just had a Cricket World Cup for our 50-over competition. I'm really sorry about all these different formats. So that's the most famous World Cup that we have in cricket. And suddenly everyone turned up at the World Cup and started bowling bouncers to each other, and a bouncer is when you try and hit the batsman in the head intentionally. And you're talking about the shift in baseball, this completely changed where the field was. And basically what they worked out was that in the middle
Starting point is 01:33:13 overs of a one-day game, teams accumulate. So they don't take as many risks, because they're trying to keep as much of their resources, because batsmen can only bat once. So once you're out, you're out. So you don't want too many of your batsmen to take too many chances in the middle overs. And so what teams worked out, and this was 100%, I think it might have been through England, but it was certainly through using Hawkeye data and using analytics in general, is that if you bowl at a batsman's head, there's no real shot he can play to accumulate. He can duck it, but that gives you a dot ball, which in cricket means no run, or you can play an attacking shot and try and hit it off your face
Starting point is 01:33:50 basically for six, which, as you can tell, I mean, it's not a good thing. And, you know, back in the old days many batsmen have died missing that. We actually had an Australian cricketer die five or six years ago missing a bouncer. It hit him on the back of the skull and he died in a professional game, and it used to happen a lot. So it's a dangerous thing as well. But so you're trying to pick this ball up off your face. And so in the middle overs the teams just went for it, and that was a huge change from the old days, when you would bowl your spinners in the middle overs and you'd try to get the ball quite soft, because as I said, we use the same ball. So the softer the ball, the less likely at the end of the game they're going to hit sixes off you, because the ball's not as hard. So that was a change. Also, in
Starting point is 01:34:33 cricket, in these limited-overs games, the 50-over format and the 20-over format, we're seeing a bit of a change. So imagine baseball where you had to use five different pitchers in every game, but all of those pitchers would have to bat as well. You would have to factor in how well those pitchers can bat. Right. And so what cricket basically did was, with four of the pitchers, four of the bowlers, they'd go, do you know what? It doesn't matter how much they bat.
Starting point is 01:35:01 One of them will probably be handy and he'll be okay. The other three might be terrible, but we're willing to take that chance. But in the middle we need that one last batsman who can bat, and we call that the all-rounder position, especially in limited overs. And for years you would basically take a very ordinary bowler who bowled usually part-time spin or part-time medium pace, very slow, no skills really, maybe a clever player, do you know what I mean? Maybe they're a good athlete and so they can still get through a few overs, or they were quite clever and they'd wangle out a couple of overs or a couple of wickets. And we've now seen through analytics that it just doesn't work. I don't know how much
Starting point is 01:35:40 you know about weak link sports and strong link sports. Yeah, we've talked about that. Beautiful. So in cricket, basically batting, especially in the limited overs games, is a strong link sport and bowling is a weak link sport. And we've worked that out through analytics. So now what you have is you have teams who come in usually with one fewer recognized batsman, knowing that they will have to back, you know, their top order a little bit more. And so you've got to use five bowlers in limited overs
Starting point is 01:36:11 and none can bowl more than 20% of the overs. So in a 20-over game, that's four overs per player. And in a 50-over game, that's 10 overs per player. So if you're going to wangle 10 overs out of a part-timer, you're better off, you know, analytics say anyway, you're better off to use a frontline bowler in that position and have reduced batting than the other way around. So that is something that's going on quite a bit at the moment. And I've already talked about, you know, the West Indies and their six
Starting point is 01:36:40 hitting. You know, that's been a huge change. And, you know, weirdly, cricketers got fit before baseballers. You know, I don't know why that was, but for whatever reason, maybe because Australia is such a physical fitness country, you know, we were such an outdoorsy culture, that when we took over cricket we made sure everyone got fit. So there was a huge period there where cricketers were massively more fit than baseballers. And now what we've worked out is, it used to be the bowlers. So if we wanted to hit sixes in cricket in the 1990s, we would send in a bowler, because a bowler would be the big strapping guy
Starting point is 01:37:16 who could bowl really quick. So he would be six foot seven, six foot eight, big shoulders sort of person. So he'd be sent in to hit the sixes. Whereas now the batsmen are a bit more like what you would see, you know, as a designated hitter. Like, some of them are, you know, probably a little bit overweight, which has always been a bit of a cricket thing, but now it's more a muscular thing. So a lot of players are just working in the gym for power. And so you
Starting point is 01:37:39 now have batsmen who are huge and way big you know they look like you know major league hitters which we didn't have before. In fact, it was always thought back in the old days of cricket that short batsmen were better. Whereas now some of the batsmen look like they could eat the old cricketers. So, and again, that comes from, you know, understanding that if you're going to take a risk, and it's very similar, this is where it is very similar to baseball.
Starting point is 01:38:01 If you're going to take a risk of hitting the ball into the field where there are fielders trying to run you out and catch you out. And your other option is to hit the ball over their head. And for a much bigger reward, you're better off analytically to take the much bigger reward and go for the six than you are to keep the ball in play and give the team a chance of either catching you out or running you out. And so that's probably the biggest change
Starting point is 01:38:25 that we've had in cricket is we didn't have range hitting in cricket until the late 90s. And I don't know when it started in baseball, but I'm assuming it was a long way before that. But we didn't have a period where cricketers would go out and see how far they could hit the ball.
Starting point is 01:38:40 We didn't really have a period, we didn't have anything where people were working on their power or hitting more sixes. And also the other thing that came in, and part of this was analytics and part of this was just the natural evolution of the game: I talked about baseball being really silly and not noticing that they should shift the field, right? One of the funny things about cricket is how, for a very long time, different parts of the field weren't used. So even though we have a bigger field than baseball and there's more gaps, cricketers wouldn't hit the ball in certain areas because they thought it was too risky. And to start with, they literally thought it wasn't an act that a gentleman would do. And so you wouldn't hit the ball off your pads. So, you know,
Starting point is 01:39:24 your pads are on your legs, you wouldn't hit the ball off your pads to a position on the field, fine leg, which is sort of toward, past where the dugout would be in baseball, at that sort of angle, because that wasn't a place that you hit the ball. And then as cricket started to go and develop, people started hitting the ball into more places. But the one place that they never hit the ball was over the wicketkeeper's head, so over the catcher's head, which is a perfectly legal move in cricket, and there's nothing to stop you doing it. It's just an unwritten rule. Yeah, no one had invented the shot, essentially, as much as anything. But as you get these sort of developments in the game and the way that the
Starting point is 01:39:58 game was changed and when people started to hit more and more fours and sixes which was again a lot of that was analytically driven through the way that Australia was playing their game at that stage before the West Indies sort of took that over. Teams were like, let's imagine you were like a jobbing cricketer at 28 and you're not a big guy. Well, if you're going to have to score boundaries at the same rate as everyone else, you're going to have to work out how.
Starting point is 01:40:19 So there was a Zimbabwean batsman, Douglas Marillier, who started scooping the ball over his shoulder. And then there was an Australian batsman called Ryan Campbell, who literally started turning around, facing 85, 90 mile an hour bowlers, turning around and trying to hit them backwards. And then we had another player called Tillakaratne Dilshan from Sri Lanka, who would almost, it's worth looking at this. The shot is called the Dilscoop.
Starting point is 01:40:43 Now, he's done this against people who've bowled 95 miles an hour, and it's worth looking at it because he literally, it's like he gets down and prays and then hits the ball directly over the badge on his helmet, straight back over his head. It's a phenomenal shot. And so now we have lots of different versions of those shots. And now that we have more fielding data
Starting point is 01:41:01 and we have more access to what we call wagon wheels in cricket, which is basically where batsmen hit the ball, we now have the ability to go out and say to a batsman, you don't usually score in this area, this bowler bowls this particular length and he bowls it really well, the only shot that you can play is literally over your left shoulder or over your right shoulder. And it was a movement that had sort of come out of necessity, through cricketers trying to catch up with the power movement, that then hit the analytics movement, to be able to say, this bowler will bowl this length, you can now play this stupid shot that you think your friends are crazy for even thinking about. And it was sort of the meshing of the two. Before that, when those shots existed, you can imagine everyone just thought the batsmen were
Starting point is 01:41:43 all crazy and that someone was going to die. And I've seen an English cricketer, Beth Morgan, when she was playing for the women's side, try and play this scoop and literally, off the middle of her bat, scoop the ball into her throat, which is probably, I think, probably better to hit a woman there because she doesn't have an Adam's apple. I'm not an anatomy expert, but I would have thought that that would kill a guy on the Adam's apple. I'm just tapping my Adam's apple to check now. But, you know, that was 10 years ago and people like Beth Morgan were seen as crazy, whereas now it's such a normal shot. And what we're working out in cricket from, you know, this extra analysis is that a lot of batting is either trying to hit the ball over the fence and hit a six, or it's manipulating the field. And batsmen have always known that. It's the same way that, you
Starting point is 01:42:30 know, batters always knew in baseball that a home run was worth a lot and that a walk was quite handy. But there's a difference between kind of knowing it as a general truth and actually being told, look, when you manipulate the field, this is the effect that it has on the opposition, right? And they have to change their tactic and those sorts of things. And all these sorts of things are still in their infancy only because, you know, I think one of the great things about the baseball movement is that everyone was being fact-checked all the time because it was done publicly. A lot of people had access to the information. A lot of people, you know, through Bill James's work and then through online as well, you know, you would put up a theory of, you know, a new stat, and 10 people could have a look at it and have a go. The problem with cricket is that, you
Starting point is 01:43:14 were talking about how similar it is to baseball; in some ways it's like no other sport in the world. I don't think there's another sport in the world that is so international-dependent. Yeah. So when England get an advantage in analytics, right, if you think, you know, the Houston Astros are, you know, good at keeping their information to themselves, England don't tell anyone. They're the only ones who own this. And it's not like you can just go out and buy an Edgertronic camera
Starting point is 01:43:38 and catch up to them. You know, Hawkeye might not even sell this information on to anyone else, because they maybe didn't even realize how useful it was when they first sold it, you know. And so the cricket teams almost become like governments, and they are really almost like governments within the sport. And so they're slow to move on to these things, and then also the information doesn't drip out of them very quickly, because they're trying to hold on to these advantages, and because there isn't this big open thing. So England have been using
Starting point is 01:44:08 weighted averages to work out the quality of their batsmen for a while, but I've never seen anyone in the cricket analytics community sort of have access to the weighted average numbers to see if it even stacks up. So England could be thinking they've got this incredible system, but it's not being fact-checked the way that, you know, I'm more of a basketball fan than a baseball fan, but, you know, John Hollinger's PER, like people always looking at that, you know,
Starting point is 01:44:34 and how to improve it and how do we make it better and, you know, when we can trust it, when we can't trust it. Whereas England is literally using this weighted averages thing and no one outside of maybe three or four people that they employ know what it is and understand the formula behind it. And you find that, you know, even with a lot of my stuff, I put some of the stuff out when I worked for ESPN. But even then, you know, ESPN don't want formulas in their articles. So, you know, no one's really testing these sorts of things.
Starting point is 01:45:04 And because there aren't many of us anyway, usually what happens is you end up with like 100 different versions of roughly the same thing. And so, you know, it's just, it's not going anywhere because of all these different factors. And cricket is such an interesting sport because of that international aspect of it. You know, you've got a country like Sri Lanka who, you know, which every person listening to this podcast
Starting point is 01:45:24 should visit Sri Lanka once in their life. It is one of the great places I've ever been. I'm not at all biased by the fact I married a woman of Sri Lankan descent. It really is. The food is incredible. The country is incredible. The people are incredible. The beaches.
Starting point is 01:45:36 I could go on. But Sri Lanka has 20 million people. They've obviously been involved in civil wars for a very long time. And they can be the world's best cricket team, and they've won two World Cups in their time. Their football team is ranked 200th in the world, and I've seen them play, Ben, they have earned that ranking, you know. And so cricket has these sort of random countries like Afghanistan and Zimbabwe that don't play any other sports, and they're not good at it. So there's no professional sporting infrastructure within Afghanistan to support cricket becoming more professional in
Starting point is 01:46:09 those areas so you've actually seen of recent times the teams that are sort of improving the most outside the major teams are places like scotland and ireland the netherlands usa will be another one because there's a structure of how sports improve in those places and how they get more professional and they will hire analysts because that's what other sports are doing. And cricket is, you know, we can't play cricket in Afghanistan because no one will tour there. As we speak today, Pakistan have just played their first test match in over 10 years at home because the Sri Lankan team was shot on a tour there. Some of their players were shot through the leg. People died from a terrorist attack.
Starting point is 01:46:46 So teams just stopped touring there. You know, it has problems. It has problems that it needs to overcome that sadly for people like me are far more important than the fact that we don't have adequate cricket stats on the TV yet. Yes. But we're getting there. All right.
Starting point is 01:47:01 Well, this has been very educational. It's amazing just how many parallels there are, but also because it's a worldwide sport and there are so many teams in so many countries and so many languages and the information is not out there publicly. to a very concerted way where one team does something and all the other teams notice and pick up on it and start doing it themselves. So it seems like much more of a halting process, but gradually getting there. So yeah, sorry, just to add to that, I know you're trying to wrap up and get me off, but just to show how weird it is. So we have all these big leagues now. So the Indian Premier League is huge, but it's a two-month-long league, right? So players are paid a fortune to play in it on a per-week basis, but it's only two months long. Some of these leagues, I've been hired for leagues and a week before they're supposed to start they just fold, and they don't even exist. And these are with major, huge-name international
Starting point is 01:47:57 players. And most leagues run between four weeks and six weeks, so they're like pop-ups. So I was talking to your co-host, Sam, about this when I took the job with St. Lucia Stars. It's like, so when you guys took over, you know, your baseball team, you had obviously some of the worst professional baseballers in the country and some good guys that you obviously picked with your metric. But, you know, you had the lowest level.
Starting point is 01:48:23 Whereas I came in and my first day on the job, I had David Warner, one of the most famous players on the planet, I had Kieron Pollard, one of the best players on the planet, in the change room with me. And then in four weeks' time we all went our separate ways, and the franchise came back the next year, and literally when it came back the next year it had no owner and a different name. Yeah. And so there's almost no way to, you know, keep the continuity going, because you come up with a great idea with this one team and the next year the league may not exist. Yeah. Right, that's a problem. I will link to a lot of your articles and writing and research about this if people want to dive into it even more, and give you some tips on how baseball
Starting point is 01:49:02 has handled these problems, although it's a different game with different challenges. But there is certainly some overlap there. But you can follow Jarrod on Twitter at @ajarrodkimber. We thank you for coming on and sharing your insight. I have learned a lot about cricket, although I didn't know much to start. And I think I did it in a way that you in no way need to understand anything about cricket to understand this podcast, I hope. I think so. I followed most of what you said, I think, and I knew nothing to start. So, and if that isn't the case, I blame you, not me. You could
Starting point is 01:49:35 have stopped me at any time. But no, I mean, look, realistically, you know, all these different sports, it's, you know, it's fundamental problems of just trying to work out how to play them better, isn't it? That's what your podcast is all about, and that's what my work has been in cricket over the last couple of years. And, you know, I always feel bad for the players, so, you know, you just want to give them the best information that you can, and that's not always been the case. Yeah, well, we wish you luck in the future. I hope it becomes more open and public and peer-reviewed, but you're doing your best, it sounds like. No worries. Thanks, Ben.
Starting point is 01:50:09 All right, that will do it for today. Thank you for listening. Some of our listeners posted a screenshot of a cricket broadcast on Boxing Day. It was the match between Australia and New Zealand, and on the screen there was something called Smash Factor. It showed the bat speed and the launch angle and the quality of contact and the power generated by the swing, sort of like exit velocity and launch angle in baseball, but with a much cooler name, Smash Factor. This is actually a little different from StatCast, which is tracking
Starting point is 01:50:34 the ball. Smash Factor is essentially a swing sensor, which of course exists in baseball also, but not in major league games and not on broadcasts, and so we don't really have good public bat speed measurements. So this is a cool addition to cricket broadcasts. This was a Fox Cricket innovation, and cricket has kind of been a leader when it comes to integrating technology into broadcasts, even though the analytical movement has been behind baseball's. Some of you may know about the Snickometer, or Snicko, which uses sound waves to tell whether a ball touches the bat in cricket. And there's another technology used in cricket for a similar purpose called Hotspot,
Starting point is 01:51:08 which uses infrared cameras to see if there was contact. And some of you may remember that Fox ported that technology over to baseball in the 2011 World Series. And there was this disputed play where Adrian Beltre fouled a ball off his foot, and they had this Hotspot technology so you can see that it did touch his foot and therefore was foul. I will link to some articles about that and a video from the broadcast. That kind of just went away in baseball. It didn't really catch on. It's not essential, but I thought it was kind of cool and it would be helpful every now and then in determining, say, a foul tip or a hit by pitch, maybe some other applications even. So cricket's really leading the way when it comes to bat-and-ball sport broadcast presentation.
Starting point is 01:51:46 Great names too. Smash Factor, Snickometer, Hotspot. Love it. So thanks for listening to the series so far. And next time we will switch over to some individual sports and talk tennis and golf. You can support the podcast on Patreon by going to patreon.com slash effectivelywild.
Starting point is 01:52:01 The following five listeners have already signed up to pledge some small monthly amount and help keep the podcast going and get themselves access to some perks. Thank you. review, and subscribe to Effectively Wild on iTunes and other podcast platforms. Keep your questions and comments coming for me and my regular co-hosts, Meg and Sam, via email at podcast@fangraphs.com or via the Patreon messaging system if you are a supporter. Thanks to Dylan Higgins for his editing assistance. And we'll be back with an extra episode this week. Talk to you soon. Skating away Skating away On the thin ice of a new day Skating away
Starting point is 01:52:57 Skating away Skating away
