Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1477: Multisport Sabermetrics Exchange (Tennis and Golf)

Starting point is 00:00:00 Hello and welcome to episode 1477 of Effectively Wild, a baseball podcast from FanGraphs presented by our Patreon supporters. This is episode three of our seven-episode series on the state of sabermetrics in a dozen different non-baseball sports, which we have dubbed the Multisport Sabermetrics Exchange. If you missed the start of the series, we've already talked about football, basketball, hockey, and cricket, as we bring on experts and talk about the past, present, and future of advanced analysis in their respective sports. Today we are taking a break from team sports to talk about two individual sports, both of which one might find at a country club, and both of which have undergone major evolutions stemming from equipment changes: golf and tennis. Tennis gets first serve and golf will tee off second. To talk about tennis, we are bringing in a multi-sport star, two-way player,

Starting point is 00:01:21 Jeff Sackmann, who has a baseball background also. He is now the founder of Tennis Abstract. He writes about tennis at the Tennis Abstract blog, Heavy Topspin, and also hosts the Tennis Abstract podcast. And he started the Match Charting Project, which I'm sure we will talk about. But before all that, he co-founded College Splits, which aggregates and analyzes college data and provides it to teams. And he used to research and write regularly for the Hardball Times. So, Jeff, I guess I should ask where we lost you. Why did baseball lose you to tennis?

Starting point is 00:01:58 Well, when I founded College Splits, it was really sort of a crash course in college baseball. I had never really been a college baseball fan. But as you probably know, the college baseball season starts in February. And when you're running a business, working with college baseball, your season starts in January or sooner. And that meant that by the time opening day rolled around, I was already getting burned out. So by the time the All-Star game happened, I mean, the college season was wrapped up, I was ready to take a break from baseball before anything in the pennant race got interesting. So, so, I mean, I'm still, I'm still very involved with baseball, but it's just,

Starting point is 00:02:34 I don't have the capacity that I guess a lot of front office guys have to just be immersed in it, you know, nonstop 12 months a year. So I kind of needed an outlet and that's where the tennis came in. Well, I guess there are fewer sabermetric tennis analysts than there are baseball analysts. So this is probably an untapped market that you've tapped into, or at least less tapped. So I've been starting all of these by asking the same question, which I will also pose to you, which is on the scale of ease of analysis, where one is a sport that's just completely opaque and we can't figure out anything, and 10 is baseball, basically, which is structured in a way that lends itself to this sort of analysis, where would you put tennis? It depends what scale you're looking at. I mean,

Starting point is 00:03:22 from one perspective, which is most of the research that's been done so far, it's similar to baseball. It's extremely structured. We have matches that break down into sets, which break down into games and points and so on. So you can analyze those granular bits of matches and seasons and so on. But at the other extreme, you've got something more like, I don't know, is the other extreme something like soccer? Yeah. Where everybody's moving all the time. You have all of these different ways you could measure that. And it sort of gets at the same problem with measuring some of the motion data that you get in baseball, where we can talk about this later. But if we had access to all of the camera data that's recorded in tennis matches, you start looking at some really, really difficult problems. Like, is a serve good because it's fast? Because it has spin? Because it has a certain type of spin? Because it's deceptive somehow? Like, these are really tough questions. And I mean, unfortunately, we don't know exactly how hard they are because the data is pretty tough to come by. But at that scale, I think tennis would have to rank up there among the hardest sports to analyze.

Starting point is 00:04:29 So can you give me a brief history of tennis sabermetric analysis: when it started, how it kind of caught on, if it has caught on, and maybe what some of the major milestones or breakthroughs have been? Well, I'm not sure I could even say it's caught on now, but the early history, especially through the, well, in the sixties, there were a couple of books published. Like think about the guy who did some sabermetrics for Branch Rickey, sort of that kind of stuff. And then starting, and I think this started in the seventies, some academics started getting interested. And that's where you see a lot of people working out win probability, for instance, because tennis is so structured. You can say that if it's, you know, the first set in a best of three match and one guy's winning four to three and he's up 30-15, you can work out what his probability of winning the match is. So some academics did that.
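The win probability calculation described here can be sketched in a few lines. What follows is a minimal illustration, not the method used in any particular paper: it covers only a single game (not a full set or match), it assumes the server wins every point with the same fixed probability `p`, and the function name `game_win_probability` is made up for this example. The same recursion can be stacked to go from games to sets to matches.

```python
from functools import lru_cache


def game_win_probability(p, server_points=0, returner_points=0):
    """Probability that the server wins the current game from a given point score.

    Assumption (not from the transcript): every point is won by the server
    with the same probability p, regardless of the score.
    Points are counted 0, 1, 2, 3, ... so 30-15 is (2, 1).
    """
    q = 1.0 - p
    # From deuce the server either wins two points in a row, loses two in a
    # row, or splits them and returns to deuce, which gives this closed form.
    deuce = p * p / (p * p + q * q)

    @lru_cache(maxsize=None)
    def win(a, b):
        if a >= 4 and a - b >= 2:
            return 1.0  # server has won the game
        if b >= 4 and b - a >= 2:
            return 0.0  # returner has won the game
        if a >= 3 and a == b:
            return deuce
        # Otherwise play one more point and recurse on the two outcomes.
        return p * win(a + 1, b) + q * win(a, b + 1)

    return win(server_points, returner_points)


if __name__ == "__main__":
    # A server who wins 70% of points, at the start of a game and at 30-15.
    print(round(game_win_probability(0.70), 4))
    print(round(game_win_probability(0.70, 2, 1), 4))
```

With a hypothetical 70% server, the function gives a hold probability of roughly 90% from the start of the game, which shows how a modest edge on each point compounds over a structured scoring system.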

Starting point is 00:05:20 And really until the last, let's say, five, maybe seven years, that's what there was. I mean, it was just a few academics who got really into it and published several papers. Mostly it was just sort of drive-bys where somebody who was working on operations research and played tennis in their spare time thought, hey, it would be fun to do a tennis paper. And then they did that and then forgot about it forever. One of the main problems is there wasn't really a data set. I mean, some of the academics who did it got some access from Wimbledon to some data that Wimbledon had put together. But as far as like a Retrosheet or a Lahman database, it didn't exist. So that was really the state of affairs until about five years ago. And I've tried to remedy some of that. And there's something a little closer to a Lahman database now. And I've tried to do some of what I

Starting point is 00:06:10 see is the fundamental type of work in figuring out some of these same win probability questions, some clutch-type questions. And I'm not really sure we're at a point where we can even start talking about milestones yet. And part of what makes that difficult is that, unlike with baseball, this is almost entirely a fan's game. With baseball, you can always benchmark yourself against: are teams taking this up, or is it working? And for the most part, players aren't super interested. There are no teams. So it's kind of up to fans to decide, well, this is cool, or this isn't cool, this is interesting, or it isn't interesting. And more people are interested than they used to be, but it's still very, very fragmented and very much on the sidelines. So your Match Charting Project seems very much like sort of a Project Scoresheet

Starting point is 00:06:58 type endeavor. So did you take direct inspiration from that? And how does it work? And what have you gleaned? Well, it's, yeah, directly inspired by Project Scoresheet. And some of the work I do with College Splits is we're basically just extending Retrosheet-type analysis to college baseball. So I'm pretty immersed in that on a day-to-day basis. So, I mean, even in the structure of the codes I developed, you'll see some similarities there. But what it is, the Match Charting Project is my effort to get more granular data for tennis matches. So I mentioned earlier that at one scale tennis is very easy to understand.

Starting point is 00:07:37 But when you start going to a narrower scale, then it's difficult to understand. So we might have stats on whether somebody wins a lot of first serve points, but we don't know about what happens when they hit first serves wide or if they hit forehand returns and so on and so on. So what the Match Charting Project involves is recording the direction of every serve, every shot, forehand, backhand, volley, whatever, the direction of every shot, the depth of every return, and then how every point ends. It's a winner, unforced error, forced error.

Starting point is 00:08:09 And what you have then is a record of every shot. So I've been able to do some research where you can quantify the value of someone's overhead smash, for instance, or look at how effective someone's forehand or backhand is in contributing to their odds of winning a certain match, or how aggressive a player is, hitting a lot of winners and errors rather than just keeping the ball in play. And that data wasn't publicly available before now. And over the last five years, we've accumulated, well, we're closing in on 7,000 matches, including every Grand Slam final back to 1980, most Grand Slam semifinals, lots of other notable matches. So it's turned into a really substantial data set that is really unmatched in tennis. So have there been any major misconceptions that have been challenged,

Starting point is 00:08:58 overturned, kind of old received wisdom that turns out to be not that true, or maybe players who are better than they're perceived to be. I don't know whether that's a thing because I guess the record is what it is a little more than, say, baseball win-loss records for pitchers, let's say, but over a big enough sample, let's say I know that you do Elo ratings, for instance, and I assume those are pretty telling. But is there kind of a signature insight of tennis analytics yet in the way that on-base percentage or whatever you want to pick from baseball was? I mean, the core thing for me is there's this assumption that people have been referring to in the academic literature for decades called the IID assumption,

Starting point is 00:09:40 which is that points are independent and identically distributed, which basically means if you're saying that Federer serving has a 70% chance of winning a point, then it's always 70%. It doesn't change when he's under pressure. There's no clutch. There's no choking. There's no momentum. It's just, it's 70%. And it's always been acknowledged that this is just an assumption. And no one ever said this is definitely true, but it's just, like in any sport, it's usually a great simplifying assumption to say we're not going to worry about clutch now, maybe we'll look at it later. But if you listen to commentators and fans in tennis, they've got story after story after story for when the IID model is not accurate. So people think that the first point of a game is more important, or they think that tie breaks go the direction of the bigger server. I mean, there's dozens of these things, and they're basically all these specific cases where someone believes that clutch, momentum, something like that is a factor.
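To make the IID assumption concrete, here is a small sketch that compares a pure IID server with one whose chance of winning a point drops on break points. Everything specific in it is an assumption for illustration: the function name `hold_probability`, the idea of modelling "choking" as a fixed penalty on break points only, and the 5-point penalty in the usage example are not from the research mentioned here.

```python
def hold_probability(p, break_point_penalty=0.0):
    """Chance that the server holds serve, starting from 0-0.

    The server wins ordinary points with probability p and break points with
    probability p - break_point_penalty. A penalty of 0.0 is the IID model.
    The penalty is a hypothetical stand-in for "choking", for illustration only.
    """
    p_bp = p - break_point_penalty

    # Deuce, ad-in and ad-out form a loop, so solve for deuce directly:
    #   deuce  = p * ad_in + (1 - p) * ad_out
    #   ad_in  = p + (1 - p) * deuce
    #   ad_out = p_bp * deuce          (ad-out is a break point)
    deuce = p * p / (1.0 - p * (1.0 - p) - (1.0 - p) * p_bp)

    def win(a, b):
        # a, b are points won by server and returner (0..4) before deuce.
        if a == 3 and b == 3:
            return deuce
        if a == 4:
            return 1.0
        if b == 4:
            return 0.0
        # 0-40, 15-40 and 30-40 are break points.
        point = p_bp if b == 3 else p
        return point * win(a + 1, b) + (1.0 - point) * win(a, b + 1)

    return win(0, 0)


if __name__ == "__main__":
    iid = hold_probability(0.70)
    pressured = hold_probability(0.70, break_point_penalty=0.05)
    print(f"IID hold rate:       {iid:.4f}")
    print(f"With penalty on BPs: {pressured:.4f}")
```

Comparing the two printed hold rates gives a feel for how much a score-dependent effect of a given size would move game-level results, which is the kind of check the IID assumption invites.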

Starting point is 00:10:39 And there has been work for a couple of decades now, going back to the academic stuff, that there are cases where it is slightly a factor. But this is an instance where having the baseball background is super valuable because we know from baseball research, clutch is a thing. But I mean, it's so tiny that you're not going to give somebody an extra $5 million a year for it. You're not going to be able to see it with your own eyes. While in tennis, you have these commentators who are constantly telling you that they are seeing it, that it's this huge deal. The numbers do not bear that out. So if you had to

Starting point is 00:11:13 pick yes clutch or no clutch, you're going to do a lot better job forecasting things if you say no clutch, even accepting the fact that you're slightly wrong. But I guess because it's a one-on-one game, you wouldn't have underrated or overrated players the way that you do in baseball, where it's harder to isolate one person's performance or skill. Yeah, that's generally right. There is one researcher named Stephanie Kovalchik who took a stab at doing wins above replacement. She might not have called it that, but something along those lines for tennis.

Starting point is 00:11:45 But in general, Roger Federer gets as many wins as he gets. On the other hand, one thing that I've discovered managing Elo ratings is that the overall, like the official rankings, the ATP computer, the WTA computer, which are based on these pretty arcane systems where you get points based on how far you get in a certain tournament.
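For readers unfamiliar with Elo, the core update is short enough to show. This is the textbook Elo formula, not necessarily the exact system used at Tennis Abstract; the K-factor of 32 and the function names are assumptions chosen for illustration.

```python
def expected_score(rating_a, rating_b):
    """Textbook Elo expectation that player A beats player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(winner_rating, loser_rating, k=32.0):
    """Return (new_winner_rating, new_loser_rating) after one match.

    k controls how quickly ratings react; 32 is an arbitrary illustrative
    value, not a parameter taken from any tennis rating system.
    """
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    return winner_rating + gain, loser_rating - gain


if __name__ == "__main__":
    # An upset moves ratings much more than an expected result does.
    print(update(1500.0, 1900.0))  # lower-rated player wins
    print(update(1900.0, 1500.0))  # higher-rated player wins
```

Because every match result adjusts the rating immediately, and by an amount that depends on the opponent's strength, a rating of this kind reacts to a player's form match by match rather than by tournament round reached.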

Starting point is 00:12:06 They're equally weighted over the course of the last 52 weeks. But if it's 53 or 54 weeks ago, they disappear. There's a lot of problems with that method. And what it means is that the official rankings are kind of a lagging indicator. So for instance, there have been some cases recently where players are entering the top 10 in my Elo ratings six months ahead of when they enter the top 10 in the official rankings. And Elo usually manages to predict that they're on their way, and that is a better predictor. So it's not that the official rankings are wrong. They're just kind of slow to get with the program. Interesting. So I guess that could have implications for tournaments, right? Because if that determines seeding, then maybe there are players whose official rank doesn't totally reflect

Starting point is 00:12:49 their actual performance. Yeah, absolutely. And another thing to that precise point is that players are not the same on all surfaces, but the ranking system is just one official ranking system. So if you have someone who's very strong on clay, but weak otherwise, he might not be seeded at all going into the French Open, but he might be the 18th best player in the world on clay. And my Elo system has an approach to that. There's other ways you can dig into that. It's not that hard to figure out who's good on a surface or not, but that's something that the official rankings just choose not to address at all. Right. So yeah, how huge a factor is that? Because I think even people who don't follow tennis would know about something like Nadal

Starting point is 00:13:29 on clay, let's say, but in terms of comparing tennis park factors to baseball park factors, how huge a thing is this? Because obviously certain skills you would think transfer, it's the same sport, but presumably some of them tend to work much better on certain surfaces, not others. Yeah, I mean, the conventional wisdom is that your clay court player is someone who's good from the baseline, doesn't necessarily hit hard, doesn't necessarily have a big serve, but a hard-court player is going to be a big server, might have a big forehand, focuses on short points rather than long points. And that's generally true. I mean, we can nitpick that a little bit if we wanted. But the other part of the conventional wisdom is that that has changed over time, that surfaces have converged in speed and they've also slowed down a little bit. So there aren't super fast grass courts like there used to be. I have found a lot of mixed results based on that. I've dug into surface speed a lot and you can kind of get the numbers to say whatever you want. But in

Starting point is 00:14:31 general, I don't think the convergence narrative is as strong as people think it is. Part of what's happened is the calendar has shifted so much to hard courts that it's tough to be a clay court specialist anymore. It used to be you could play half your season or more on clay and still be, I think, a top 10 player. But now if you're really just a clay court guy, then you only get to play maybe three months of the year on clay and you have to adapt somehow to hard courts. And that means that some guys who might have been specialists 20 years ago, they, they can't cut it. I mean, they can play, they can play the second tier challenger tour if they wanted, but that's about all they could do. So, so what's happened is I think the, the,

Starting point is 00:15:15 the clay court specialists of previous generations have adapted to that. So a guy, Diego Schwartzman, who's this super short Argentinian guy, the absolute picture of a clay court specialist. He's making finals in hard-court tournaments. I mean, he's got no traditional hard-court weapons. But if you're going to be a competitive player in 2019, or 2020 soon, then you've got to adjust. I mean, Rafael Nadal, it's important to his legacy that he is able to win Grand Slams outside the French Open. And he has. I mean, he's changed his game, gotten more aggressive with the serve,

Starting point is 00:15:47 like adapted to different tactics. But that's a reaction to the surfaces, to the calendar. I don't think it's as much because of anything mechanical about the surface itself or anything about the physical nature of the surfaces. So how does all of that complicate cross-era comparisons? Because that's such a big thing in tennis, particularly when you have maybe the greatest of all time playing right now with Serena and Federer and the other contenders for that title, and yet you have differences in the schedule, you have differences in the rackets, of course, along with all the other

Starting point is 00:16:21 differences in athletes over time. So is there an era adjustment that goes on there in any rigorous way, or is it just sort of eyeballing it kind of thing? I think it really has to end up as an eyeballing thing. I mean, I've tried to do it. I wrote something for Tennis Magazine, I think five years ago now, where I tried to do something kind of like, I remember doing this with baseball and don't remember whether I published it or not, but I compared like basically just comparing people

Starting point is 00:16:49 who are entering the league versus those who are leaving the league and seeing what that does, what that tells you about overall quality. And when you do that with men's tennis, you end up with this really extreme era adjustment where at the time that I did it, the top five players were the big four, Federer, Nadal, Djokovic, Murray, and David Ferrer. And you think of the big four, maybe just the big three as among the best of all time, Ferrer not so much. When I did that analysis, Ferrer was clear-cut fifth best player of all time because of the era adjustment. And what Federer always says when he's asked this question

Starting point is 00:17:25 is you have to, you just have to consider the eras and the fact that they are so different. You really can't do a direct comparison any more than you can really compare Mike Trout and Babe Ruth. It's just, in this case, it's less than half the time span. But like you say, the rackets are different. I mean, the difference between a contemporary racket and the wooden racket that Rod Laver was using 50 years ago. I mean, I think that's a bigger difference than a lot of things in baseball between now and 1870. And it's one third as much time. So it's fun to play around with. It makes for a lot of great debates. But I kind of just throw my arms up and say, you know, do whatever you want. I'm not going to take it that seriously. They're almost

Starting point is 00:18:09 different sports that are difficult to compare. So how does the success and the longevity of the Big Four and Serena, how has that affected analysis or our understanding of the way tennis works or the typical aging curve? Because you do have a lot of sports, it seems like, where things are trending younger. And in baseball right now, players are younger and younger and successful at early ages. And then in tennis, you have this crop of players that just has defied time. So what has that taught tennis analysts? I'm not sure we've figured out what to take away from that. One of the difficulties in all sorts of analytic topic areas, I guess you could say, is that we are working with a fairly short span of time. I mean, there's been Grand Slam tennis since the 1870s, but we only have the professional era since

Starting point is 00:18:57 the late 1960s. We only have good data for matches back to 1991 in many cases. So the Federer era, let's say, starting from when he started winning in 2003, that's like 40% of the decades we have to work with. So if you're going to try to compare aging curves in the 70s and 80s to aging curves today, then I mean, I'm not sure what you can pin down exactly. I mean, you'd have to have more decades worth of data to really know whether this is the outlier or I've, I've seen some decent arguments that it's not that this is the outlier. It was the era kind of before this, when, when metal rackets were introduced, wood rackets were getting phased out that when we did have Boris Becker winning

Starting point is 00:19:45 Wimbledon at 17 or whatever, and Michael Chang competing at such a young age, that was the outlier. We had this time where suddenly 16-year-olds could compete with the best. If you look at the era before that, I mean, it wasn't like today where you had a 37-year-old winning everything, but you did have more early 30 somethings who are really competitive, late 20 somethings competitive and sometimes winning at slams. So it could be that this is not that abnormal. It's just sort of a convergence of getting the aging curve back to normal. Plus the fact that you do have these three, possibly the three greatest players of all time playing at once and then possibly the one greatest player of all time in Serena on the women's tour. It's just, I hate to have to give so many non-answer

Starting point is 00:20:30 answers, but I'm not sure what we can do with that other than just say these guys are really, really, really good unless the next 20 years of evidence gives us some reason to think otherwise. Well, it must be nice to work on these two sports that are both in some ways easy to analyze compared to other sports, but then in baseball, you have so many more answers. Not that there still aren't questions and mysteries, but it's more of a settled science certainly than tennis is. So I guess you get to flex your analytical muscles a lot more with tennis, or at least there's more potential for you to uncover things if fewer tools to use to do that. Yeah. I mean, I think that was one thing that appealed to me in the first place. There's just

Starting point is 00:21:16 sort of an open field to play around with. And you do start to get the feeling with baseball analysis that a lot of it's very interesting, but you have to really dig deep before you're saying anything new. And yeah, with tennis, there's lots of room to do that. And I hope more people dig into it. I mean, I think there's more data available now than there are people digging into the data and finding the answers that can be extracted from it. So it's a nice situation to be in. So you mentioned that advanced analysis hasn't really been widely embraced by players, but are there any examples of players or coaches who have been sort of sabermetric trailblazers when it comes to strategy or training or any aspect of the sport?

Starting point is 00:22:00 There are, you'll see headlines sometimes that a certain coach is using numbers more. The WTA has this partnership with SAP, the consulting firm, and they're putting iPads in the hands of coaches and pretending like that's somehow a breakthrough. I'm not really sure what the benefit is, but somehow the iPads are supposed to make everyone smarter. But there's a guy, Craig O'Shannessy, who's coached a number of players throughout the years. Now he's sort of on the edges of Novak Djokovic's coaching team. And he's also involved with an Italian player named Matteo Berrettini, who's in the top 10 right now.

Starting point is 00:22:35 And he comes from a coaching background, but he does some analysis. And a lot of the analysis he does, I don't think it would pass muster with sort of the hardcore baseball analysts. And I've certainly taken my share of swings at the stuff he's published and the rigor of it. But he does have that kind of mindset that he takes to his players. And we don't know exactly what it is. He doesn't talk about the specific tactics.

Starting point is 00:23:01 But a lot of it seems to be sort of a measured aggression that maybe players like Djokovic who make their money by getting everything back in court and just turning themselves into a backboard, they can be more aggressive. They can exploit their opponent's court position in certain ways, but we don't know exactly what that is. And I think that most players, I mean, they might look at their stats and say, you know, I need to be getting more first serves in or something. But that's basically where we're at. So it's a very small percentage of players or coaches at this point. It's interesting because in golf, for instance, you have players and coaches with TrackMan units or force plates or swing sensors.

Starting point is 00:23:42 And it seems like a lot of that is applied or you see TrackMan data on golf broadcasts, for instance. And it seems like there would be potential for a lot of the same sort of insights and improvements for players to make. So is there any potential for player tracking data and wearables and some things that might yield a different sort of insight?

Starting point is 00:24:05 The potential is definitely there. And what I don't know a lot about, so it could be happening more than I'm letting on, is what's happening at the training stage. So there are smart rackets. There's this company called PlaySight that makes basically single court units that are computers with cameras connected to them where they track a match and they track your stats for you and the speed of every shot and spin and all that stuff. So you can get a lot of data on yourself as you're practicing. And I'm sure many

Starting point is 00:24:36 players and coaches are using that. I mean, if only to monitor technique. I know it's ubiquitous to use video, but I don't know how much they're using video analysis along the lines of TrackMan and that kind of stuff. So that potential is definitely there. The problem with sort of the next step is it's where tennis gets complicated. I mean, with golf, every shot is its own unit. I mean, the ball starts from just sitting on the ground. And with tennis, it's so much more complicated, even just working with the match charting project data, which is

Starting point is 00:25:10 really just two data points per shot. Even that gets complicated. If you're trying to isolate a player's skill in like changing directions and hitting a down the line backhand, then not only are you talking about a specific shot, but you have to be isolating a specific shot that came before, a specific position where that previous shot was hit from. If you start digging into camera data, then you've got an order of magnitude more variables for every one of those things.

Starting point is 00:25:41 So you've got to decide what the buckets are to distinguish certain shots from other shots. It's really complicated. I think it will be a lot of work invested before we start seeing really clear groundbreaking insights from that kind of stuff. And that's complicated because there's tons of camera tracking data being recorded, but very little is finding its way to people who can do anything with it because each individual tournament owns the tracking data from their tournament. Most of them don't do anything with it. In many cases, they don't even have a copy of this data that

Starting point is 00:26:14 they own. So it's all extremely siloed and somebody like me can really only speculate on what to do with it and not even take, you know, an uneducated stab at doing something with it. Yeah, the shot sequencing stuff you were just talking about sounds sort of like pitch sequencing in baseball, which is one of those areas that hasn't really been cracked, at least publicly, you know, is this pitch more effective if it's thrown after or before that pitch, for instance. But it seems like there must be some analogous insights with just shot selection and positioning and usage in the sense that many pitchers, for instance, had a really good breaking ball. And then it turned out they didn't throw that breaking ball as often as they

Starting point is 00:26:57 should have because there was this belief in establishing the fastball. So that's something that's been productive for a lot of pitchers. And I would imagine there have any idea of what that type of shot might be. Like, is there, you know, a sinker of tennis where players will stop hitting that shot the way the pitchers have stopped throwing sinkers and they've started throwing sliders? Or even if it's just on an individual level, this player should stop hitting that shot from there because it's not really helping him or her. Yeah, there's a lot I can comment on from there. I mean, yeah, shot selection is very much sort of an untapped area with a lot of potential. And one of the aspects that's worth talking about there

Starting point is 00:27:57 is that it's one place where the traditional stats can be misleading. Because if, let's say, you had net approach stats for Roger Federer and Rafael Nadal for an entire tournament, you'd almost always find that Federer comes to the net a lot more. That's just the way he plays. Nadal comes to the net sometimes, he's very good at the net, and his rate of winning points at the net is often a lot higher. So maybe, you know, over the course of a tournament, Federer comes forward a hundred times and wins 70 of them. Nadal comes to the net 20 times and wins 18. So who's better? And I think a lot of the first reactions are that it's Nadal. If you think about it more, then if Federer is coming forward more, then of course he's taking more risks. Like

Starting point is 00:28:42 we don't know how Nadal would do in those other 80 opportunities. And I think people have started to recognize that that's the logic you need to apply to net approaches, but that logic applies to every other aspect of the game. I mean, it could be something as simple as running around and hitting forehands instead of hitting backhands. Like if you do it too much, it's not going to work. But if you're only doing it when it's a clear win, you're missing a lot of opportunities. And this is a point that Carl Bialik has made a lot, and he's been the co-host on many of my podcasts: that you should be trying to get those success rates

Starting point is 00:29:19 to converge with the thing you're not doing, like a regular backhand. If you're winning 50% of those points, then running around it to forehand, you should keep doing that until your success rate goes down to about 50%. And I think that's one place where players have some room to improve. And maybe that's something that some of the players who are more statistically aware, where it looks to us like they're being more aggressive, they're really just trying to get those rates to converge. When they have two choices, they're really exploiting the more aggressive choice, even though it means

Starting point is 00:29:50 that more aggressive tactic isn't as successful on a percentage basis. It might be winning them more points in the aggregate over the course of a match. Yeah. So that's what I was going to ask, whether there's any sense of whether when tennis analysis does become more pervasive and sophisticated, whether it might have an impact on the spectator experience, either positively or negatively, in the way that, say, football passing game taking over, three-pointers in basketball, strikeouts in baseball, those can be good things or bad things. Is there any sense of, well, in tennis, it will mean more hitting from the baseline or less serve and volley or less net play or one of those things that people might like or dislike? I think technology drives this more than any kind of statistical analysis is going to. So we've already seen tennis become a much more baseline-oriented game. And one thing that I end up saying a lot is I think this stuff, statistical analysis, is really important, but on the other hand,

Starting point is 00:30:50 we kind of need to trust the players that they know what they're doing in the first place. Our role is really more just tweaking around the edges, at least in the majority of cases. So if, if what we have now is mostly a baseline game, that's aggressive in certain ways, but passive in others, then that's probably about right. So I don't think we'd see a ton of changes. I think maybe it will get a little bit more aggressive, but again, like technology could flip that entirely. And because there are all these forces who like to see the game played in certain ways,

Starting point is 00:31:21 and there's still tons of nostalgia for serve and volley tennis, more aggressive net play. I mean, it's conceivable that some things will change, like the height of the net or even the size of the court, or people talk occasionally about taking away the second serve. So you end up with only one serve, and that changes the pros and cons of a lot of choices players make. But as far as what statistical analysis can do, a really great example of a time when it doesn't matter that much is the serving of Caroline Wozniacki. And I wrote this article maybe about eight months ago now.

Starting point is 00:31:55 I think it was this past spring. One of my friends who was charting matches for the Match Charting Project noticed that she had an exact pattern in every single service game. I think she would serve wide on the first point, down the middle on the second point, down the middle on the third point, and wide on the fourth point. After the fourth point, she would mix things up. But there were matches in which, every single service game, she followed this exact pattern, and opponents knew

Starting point is 00:32:22 about it. The reason that my friend discovered it was because he could understand Dutch and he overheard a coaching conversation when the coach came onto court between games, and the coach told the player to follow the Wozniacki pattern. So, I mean, everybody knows what this is. And as far as I could tell, it doesn't affect her success rate. So you were mentioning earlier, like pitch sequencing, like this seems like it could be rocket science with this huge potential to really optimize the use of pitches or, in the case of tennis, shots or serve directions. And here you have someone doing the most simple, obvious, predictable thing possible and everybody knows

Starting point is 00:33:05 about it and it doesn't matter. So it doesn't make me feel very valuable, but it is a really useful data point about what the potential is or isn't. Yeah. Have players had at least access to fairly detailed scouting reports if they wanted to know, I mean, let's say something as simple as how often does this person serve inside or outside or on first serve and second serve? Like, can they look that up very easily? And have they been able to for a long time? Very easily? No.

Starting point is 00:33:34 For a long time? No. But it is a sort of micro industry that's cropping up. So there are a few companies out there that basically do a variation of what the Match Charting Project is doing and provide scouting reports on the sorts of things you're talking about. So it's not super high tech, but it's something like I remember talking to one player a couple of years ago, or it was at one remove, but the player was saying like, in the first set, you kind of figure out what direction your opponent likes to serve on break point. So by the second set, you know where they're going to go and you can really take advantage.

Starting point is 00:34:09 And of course, every analyst is thinking like, why don't you know that in the beginning? Like these guys have played hundreds of matches in their careers. How do you not know something as simple as whether the guy is going to go left or right under pressure? So I think finally, in the couple of years or so since I had that conversation, we've gone from most players not knowing to at least most top players knowing, so that there is a little more game theory going on. So the access is possible, at least for top players with a budget, but it's still pretty limited. And then the other thing I wanted to ask is whether there's any indication that serving has

Starting point is 00:34:46 become more important over time. The rackets have changed and the players have gotten bigger and stronger. Can you see in the data that, for instance, it's harder to break players now or that having a very good serve is a bigger component of a player's game than it used to be? The technology has actually pushed that in the other direction. So if you go back to the, I guess probably the nineties would be your peak big serving time where not only were players hitting a lot of aces, but they were coming forward behind their serve and trying to finish points quickly. That was before some developments in string technology that made it easier to control the ball if you could get a racket on it. So there haven't been a lot of developments to make it possible to serve much faster.

Starting point is 00:35:34 It's not like we're setting new serve speed records every year. We're definitely not. I think the serve speed records are, for both men and women, 10 or 15 years old now. And maybe on average, players are serving a little faster. Maybe there's some selection going on where players who can't serve that hard can't make it to the top. But everybody, with very few exceptions, is a solid returner. I mean, they get the ball back in court. They have the capacity to control it somewhat. So it's not that the serve isn't important, but I think the difference between like an A-plus serve and an A-minus serve isn't as valuable as it used to be because they're both going to come back most of the time. Not always well hit, not always making him work really hard, but in contrast to an equivalent player 20 or 25 years earlier, he's having to deal with having to, say, hit second or third shots a lot more often. Uh-huh. And then one more question that came to mind as we were having this conversation is whether there's anything akin to similarity scores, whether based on velocity or shot selection

Starting point is 00:36:43 or success rates? Is there anything like that where you could say this person statistically resembles that person or find the outlier, someone who just doesn't do things like everyone else does things? Yeah, it's something that I haven't played around with much. There's a researcher named Jeff McFarland who runs a site called Hidden Game of Tennis, and you can probably guess from the name that he also comes from a baseball background. He's done some work with that and found some interesting stuff. I don't have the details in mind, but it depends a lot on what you pick as what the characteristics are. And because the data is fairly limited,

Starting point is 00:37:22 I mean, Match Charting Project data opens this up somewhat, but we don't have a ton of data on older players. So if you wanted to see, like, who's most comparable to Rod Laver or Ken Rosewall now, then you don't really have anything to go on. You can go back to the 90s for men and not even that far back for women. But you can compare, you know, current players to other current players, and some people have tried to. Just like some of what you're saying, shot selection, you can get from Match Charting Project data. You could look at how much better their first serve is than their second serve. So it's possible. I mean, yes, it's a direction some people have chosen to go.

Starting point is 00:37:58 But as with everything else, there's some limitations in what would make it interesting, at least for now. Yeah. All right. Well, this has been enlightening. It's been nice to talk about a sport that I have a little bit of knowledge about, not a ton, but better than most of these other sports I've been discussing, and I actually play.

Starting point is 00:38:16 So not totally out of my depth, but you can learn much more from Jeff on Twitter at Tennis Abstract. You can listen to the Tennis Abstract Podcast and find his writing and all the data that he has helped collect at tennisabstract.com. Jeff, thank you. And I hope you come back to baseball writing someday,

Starting point is 00:38:34 but I'm glad that you're making this contribution to tennis too. Thanks, man. All right, let's take a quick break and we'll be back in just a moment with Mark Broadie, author of Every Shot Counts, to talk about golf. Under a tree, she kissed me We go for walks in fine weather But all together, on the golf course

Starting point is 00:39:15 We go for walks. So, I am joined now by Dr. Mark Broadie. He is a Columbia Business School professor, but he also doubles as one of the foremost minds in golf analytics. He is the author of Every Shot Counts, and he has been described as the Bill James of golf because every sport has to have a Bill James of that sport. Mark, hello. How are you? I'm fine, Ben. Thanks for having me on. Yeah, happy to. So just to orient the discussion here, I start these off with the same question, which is on the spectrum of ease of analysis, where you have at a 10, you have the most easily analyzed sport, which perhaps could be baseball because of the way it's structured.

Starting point is 00:39:58 And then a one would be just completely impenetrable and opaque to analysis. Where would you put golf on that scale? I would put golf around a nine. I think it's probably one of the easiest after baseball. And baseball's easy because there's a lot of one-on-one matchups. And it's fairly easy to describe the state of the world, how many runners are on base and how many outs and what's the run differential. And of course, you could add a lot more once you have player tracking data. But the basic analysis of baseball is pretty easy. And it's similar to golf in that you need to know for each shot, where does it start and where does it finish? And those are very discrete pieces of information and relatively easy to collect and store in a

Starting point is 00:40:52 database and then analyze. So I think it's the team sports where you have, you know, 10 players running around or more, that then defining the state, how valuable this situation is, is much tougher in football and basketball and in soccer. So golf is definitely on the easy end of things. Okay. So can you give me a brief history of golf analytics? And I don't know if there was one before you or whether I'm asking for your own history essentially, but what have been some of the big breakthroughs and major milestones and when did things get started? I think things got started in a book, a remarkable book that was called The Search for the Perfect Swing by

Starting point is 00:41:37 Alastair Cochran and John Stobbs and was published in 1968. And it was certainly, as far as I know, sort of the forerunner of all things golf analytics. And they had physics models of the trajectory of the ball, and they had models for golfer swings, where they have like a two pendulum model of golfer swings. And they were also the first ones to collect shot data. So they analyzed all sorts of things about, you know, why does the ball fly the way it does with dimples and without dimples. And, closer to what I've done, they had people with pencil and paper recording where shots start and where they finish in a golf tournament, and were the first to analyze shot-level data.

Starting point is 00:42:30 And then at what point did you pick up the ball or carry on that legacy? So I picked up the ball around the early 2000s and I had a program to collect amateur data that was up and running. And I started to analyze it and had the strokes gained notion in 2005. So this time period from when golf started until 2005 or so was pretty much, the statistics were like the original baseball statistics, counting statistics. How many times were you at bat and how many hits did you get? And in golf, it was how many putts did you take? How many greens in regulation did you hit? And how many fairways did you hit or miss? And those are all counting statistics. It's easy to do in the pre-computer days. And that was the state of the art in golf statistics for decades.

Starting point is 00:43:26 Basically until 2003, when the PGA Tour started to collect their ShotLink data, which is shot-level: where did every shot start? Where did every shot finish? So that was the sea change, all sort of driven by the availability of data. And I also collected data for amateur golfers. So I had sort of a range from rank amateurs shooting 120 to PGA Tour pros that are the best in the world. And the shot level data was crucial for doing the analysis. And so can you explain what that analysis looked like and what it yielded and I guess the concept of strokes gained? So trying to measure performance, it's clear that sort of

Starting point is 00:44:12 all other things being equal, hitting it longer is better than hitting it shorter off the tee, hitting it closer to the hole is better than hitting it further away, and hitting the green is better than missing the green. But these are all sort of disparate notions and you need sort of a common benchmark to measure performance. And I came up with an idea that I subsequently realized was very similar to ideas in other sports, baseball, football, and others. But the idea is how do you take yards, which is how far you hit a shot, and whether you hit a fairway or not, whether you hit a green or not,

Starting point is 00:44:52 they're all in different units. How do you put them into a single unit that's comparable across these different shots? And the answer is you want to measure the quality of a shot relative to an average shot. And so the unit is strokes: how many strokes does it take you to get in the hole? And so if you start at a position, which is instead of thinking of it as 420 yards away, you think of it as you're four strokes away from the hole. Then an average shot would put you

Starting point is 00:45:25 at three strokes away from the hole. An above average shot would put you at 2.8 strokes away from the hole. And a below average shot would put you at three and a half strokes away from the hole. So an average shot, you'd get one stroke closer to the hole with one swing of the club. And this is related to a notion of dynamic programming where you're trying to optimize, you know, in this case, minimize the number of strokes to hole out, but you can now measure performance and the quality of every shot in this common unit of strokes. And the key is how much better or worse than an average shot is what you just saw or just recorded. So you need this notion of a benchmark, and the benchmark is how many strokes would it take an average player to get the ball in the hole,

Starting point is 00:46:19 and is the shot that you observe better or worse than average? I see. So as you were saying, there are some similarities here to other sports and metrics that compare performance to some baseline measure of an average performance. So run expectancy or expected points added in baseball and football, which go back to the 60s and 70s, or expected goals in soccer or hockey, for instance. Definitely some shared DNA here. And what major misconceptions, if any, have been overturned by the availability of this data or how has it changed strategy? So the biggest misconception that I never started off trying to prove or disprove, it just sort of, after you do the analysis, then it comes out. It wasn't certainly the goal, but it was the importance of putting. And the most famous expression in golf is you drive for show and putt for dough. Meaning that's, you know, you drive it, it looks good,

Starting point is 00:47:10 but where the money is made, where winning or losing a tournament is due to putting. And it turns out that's not the case. And if you analyze golf through the strokes gained lens, it turns out that the advantage of the better players comes two-thirds from shots outside of a hundred yards and one-third from shots inside a hundred yards. And that was very surprising to many people, but in hindsight, it's just so obvious. You take a look at the top 10 players in the world rankings or the top 10 players on the PGA Tour in any season, and you'll see great ball strikers. You'll see players that have superior iron games that hit the ball, you know, long and straight, and they may or may not be good putters, but there's a lot of,

Starting point is 00:48:05 you know, not spectacular putters among this group of the elite of the elite. Conversely, if you take a look at the best putters, you might not recognize some of the names because putting just isn't enough. And this two-thirds contribution to superior play from approach shots and driving is constant across different skill levels. So it's also true that if you want to go from shooting 90 to shooting 80, about two-thirds of that improvement comes from shots outside a hundred yards and one-third from shots inside 100 yards. And so it's very constant across skill levels. And as you're doing this analysis, to what extent can you or do you need to adjust for environmental conditions? You know, the condition of the course,

Starting point is 00:48:58 the weather, the temperature, precipitation, that sort of thing. Yeah. So there are different ways to make these adjustments, but they are sort of crucial to the analysis. And you don't want to compare a player's birdie rate at an easy, say, desert course with another player's birdie rate at a tough US Open course. They're just not comparable, and it can be course setup or it could be wind and pin placement and height of the rough and other things that you just mentioned. And so what we do is we make an adjustment at the end of each round so that we have basically an average strokes to hole out baseline or benchmark for that round. And that again goes back to this notion of, we want to know how much better is a shot

Starting point is 00:49:53 than average. And we actually mean average at that course in those conditions that day. And so that's how we do it. We just take this benchmark, which is estimated from millions of shots, but then we adjust the benchmark for the conditions that day. And that's really important so you don't get penalized for playing in US Opens, which would happen if you don't make that adjustment. And how much of this data or the output of the data is publicly available and widely disseminated? Unfortunately, it's a proprietary data set that the PGA Tour uses to

Starting point is 00:50:36 do these calculations. So you can find the results of the analysis on PGATour.com. So you can find strokes gained rankings for all the players for any season. So you can see the output of the analysis. But unfortunately, unlike PITCHf/x data or data in other sports, they don't just make it publicly available. How, if at all, has this changed the way the game is played? I know that obviously players hit the ball a lot farther than they used to, and that's partly the equipment and the clubs and maybe partly strength and that sort of thing.
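As a rough illustration of the strokes gained idea described earlier in this conversation, the arithmetic might look like the following minimal Python sketch. The function names and the field numbers are hypothetical, and the round adjustment shown here (re-centering on the field average for that round) is only one assumed way to do what is described, not necessarily the PGA Tour's actual method.

```python
# Minimal sketch of strokes gained; names and data are illustrative assumptions.

def strokes_gained(expected_before, expected_after, strokes_taken=1):
    """Strokes gained by a shot relative to the average-player benchmark.

    expected_before: benchmark strokes to hole out from the starting position
    expected_after:  benchmark strokes to hole out from the finishing position
    """
    return expected_before - expected_after - strokes_taken


def adjust_for_round(player_sg, field_sg):
    """Re-center a player's round total on the field average for that round.

    Assumption: course and weather difficulty show up as the field's
    average strokes gained against the long-run benchmark.
    """
    field_mean = sum(field_sg) / len(field_sg)
    return player_sg - field_mean


# The example from the conversation: start 4.0 strokes from the hole.
average_shot = strokes_gained(4.0, 3.0)  # leaves 3.0 strokes to go
good_shot = strokes_gained(4.0, 2.8)     # leaves 2.8 strokes to go
poor_shot = strokes_gained(4.0, 3.5)     # leaves 3.5 strokes to go
print(round(average_shot, 2), round(good_shot, 2), round(poor_shot, 2))

# Hypothetical tough round: the whole field loses strokes to the benchmark.
field_totals = [-1.5, -2.0, -2.5]
adjusted = adjust_for_round(-1.5, field_totals)
print(round(adjusted, 2))
```

Under these assumptions, a player who loses 1.5 strokes to the long-run benchmark on a day when the field loses 2.0 on average comes out half a stroke better than average for those conditions.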

Starting point is 00:51:15 But has there been a shift in play style because of strokes gained and other analytical insights? I think the answer to that is yes. And you can see some fairly dramatic shifts in strategy and what players are practicing. So for example, there is a widely held notion that on a par five, the best strategy is to lay up to your favorite distance. And by favorite distance, it means, you know, if you had a wedge that went 95 yards and you weren't so good, if it was 90 yards or a hundred yards, you'd try to lay up to 95 yards as opposed to 90 or 85 or 112, whatever. And so the conventional wisdom was to lay up to your favorite yardage. In fact,

Starting point is 00:52:07 the analysis of all these millions of ShotLink shots shows pretty much that the closer you are to the hole, the fewer strokes on average you will need to hole out. There's definitely some qualifications there. But the players that go for the green in two, maybe they don't hit the green, but maybe they end up 30 yards away, maybe even 30 yards away in the rough, have a distinct advantage over the players that lay up to 90 or 100 yards. And that was a surprise to many. But if you take a look from 2003 to 2019, you'll see this, you know, precipitous decline in players that lay up to their favorite yardage. And more and more players just, you know, go for the green in two on par fives if they're in a position to do so. So it has definitely affected strategy in a similar way to how some simple analytics in basketball has led to teams taking more three-point shots. And that's been a consistent trend over the past quite a few seasons. And in golf, is that seen as a positive or a negative for spectators from an entertainment

Starting point is 00:53:23 perspective? Well, there's a whole big debate going on about the distance that the ball travels. And there is certainly one camp that says they want to see the big bombers. It goes way back to Arnold Palmer and Jack Nicklaus, it's not just Dustin Johnson and Rory McIlroy, that fans love to see big bombers hit these majestic, high, long drives. And the same thing with approach shots. And of course, Tiger Woods played a power game for a few decades now. So there's one camp that says, you know, this is good for the game.

Starting point is 00:54:00 But there is a voice that is, you know, saying that the technology and the way players are playing the game has led to courses that are too long, that it takes too long for amateurs to play. And so that's sort of the debate about rolling back the ball. If you go back from the 1980s until the 2000s, the distance that the ball travels on average on the PGA Tour has probably gone up from 255 or 260 yards to 295 yards on average. And of course, the longer players are at 320, 340 on some drives. And yeah, there is definitely a robust debate going on whether this is good for the game or not. And how has this complicated cross-era comparisons, because of course you can just count the major titles or titles period, but has there been more analytical work that's gone

Starting point is 00:54:59 into sort of normalizing for era and trying to adjust for these dramatically changing conditions so that you can put players from one era and another on a relatively level playing field and compare them? Unfortunately, that is one of the big open questions. That's really, really hard to do because of the changes in technology, and it's hard to separate changes in technology from changes in skill. Right. If somebody is hitting it 20 yards further, well, how much of that is due to equipment and how much of that is due to the player? And it's even hard if you don't look at, you know, distance and proximity and other things, if you just look at scores and who wins tournaments. Unfortunately, we don't have good scoring records prior to 1983. So if you want to go back and look

Starting point is 00:55:55 at the Ben Hogans of the world and Arnold Palmers and Jack Nicklauses in the 50s and 60s and 70s, the scoring data isn't even available, except for a few tournaments here and there. And then for some of the data that's available, you might only know scores for the top 10 players. So intergenerational, cross-generational comparisons across eras are really tough to begin with, but the lack of data makes it even harder. So is there a particular player or coach who is most associated with sort of starting this and getting in early and demonstrating that applying some of these principles could help? Is there kind of an Oakland A's of golf?

Starting point is 00:56:43 I don't think so. I know that I worked with Luke Donald and his coach, Pat Goss, and we formed a relationship quite a number of years ago. And he rose to number one in the world. And they gave me a teeny bit of credit for that. But he was also an unusual case in that he was, when he was number one in the world, very good at everything except his driving distance. He was not the longest driver, but he really understood the notion of fractional gains and how you can accumulate fractional gains in different parts of your game. And that can add up to enough to become number one in the world. And so he was a great putter. And people said, yeah, he's number one because he was a great putter. But that wasn't what changed.

Starting point is 00:57:40 In his move up from number 10 to number one, his putting didn't get that much better, but his approach shots did. And his greenside sand play got better. And he picked up a tenth here and another tenth here and another tenth there. And then all of a sudden, he was number one in the world. And so how often are you working with players and how many players have someone like you who is kind of a personal analyst and provides this information to them? So I've worked with quite a few over the years: players, coaches, and caddies. So quite a significant fraction of players that are in the top 10 or top 20 in the world, although not on a, you know, continuous, regular basis. But I think, you know, quite a few players now have, you know, a golf analytics specialist helping them out, but others that don't can find reasonable information on PGATour.com that's now publicly available. When I was working with Luke Donald, this was prior to any of this information being sort of widely disseminated.

Starting point is 00:58:55 But one of the reasons it's sort of been accepted by the community is that the results, at least, are easily accessible to the public and to the players. It's not like the raw data is out there and some bloggers are on some website that just some people know about. But another way that this has impacted the game: this is the week of the Presidents Cup. And there's a golf analytics firm called the 15th Club that's helping out the international team in deciding who to pair together in different formats. The formats being foursomes and four ball competitions, which we sort of know as alternate shot and better ball in the US. And they're getting a lot of credit for bringing analytics to analyzing how do you make pairings in addition to the strategy, both for the Presidents Cup and the Ryder Cup. So

Starting point is 00:59:55 it's certainly had an impact there as well. When I was doing the tennis interview, I learned that the official tennis player rankings are kind of lagging the players' actual performance and that if you do a different type of ranking or an Elo rating, you can kind of get a more accurate representation of how the players actually compare to each other before that's reflected in the official rankings. So is that similar in golf or do the golf rankings actually do a pretty good job of reflecting performance? So there's a philosophical question. What do you want a ranking to do? And the question is, do you want it to measure performance or do you want it to be a good predictor of future performance?
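For readers unfamiliar with the Elo rating mentioned in the question above, a minimal sketch of the standard head-to-head update follows. The K-factor of 32 and the 1500 starting ratings are illustrative assumptions, not the values used by any particular tennis rating, and a golf version would need adapting because players compete against a whole field rather than a single opponent.

```python
# Minimal sketch of a standard Elo update for a head-to-head match.
# The K-factor and starting ratings below are illustrative assumptions.

def elo_expected(rating_a, rating_b):
    """Probability that player A beats player B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, a_won, k=32.0):
    """Return the new (rating_a, rating_b) after one match."""
    expected_a = elo_expected(rating_a, rating_b)
    score_a = 1.0 if a_won else 0.0
    delta = k * (score_a - expected_a)
    # Ratings are zero-sum: what A gains, B loses.
    return rating_a + delta, rating_b - delta


# Two equally rated players: the winner gains half of K.
new_a, new_b = elo_update(1500.0, 1500.0, a_won=True)
print(new_a, new_b)

# An upset moves ratings more than an expected result does.
upset_gain = elo_update(1400.0, 1600.0, a_won=True)[0] - 1400.0
expected_gain = elo_update(1600.0, 1400.0, a_won=True)[0] - 1600.0
print(round(upset_gain, 1), round(expected_gain, 1))
```

Because each update depends on the opponent's rating and moves immediately after every result, a system like this is built to predict the next match, which is the property contrasted here with points-based official rankings.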

Starting point is 01:00:41 And those are quite different. And the official world golf rankings is trying to fairly reward past performance, and it's not intended to be a predictive system. So that means that you can do a better job of predicting who's going to rise or fall in the world rankings because it wasn't designed to be a predictive system. So one easy way to see that is a player wins a major like Patrick Reed winning the Masters. So if a player wins an important tournament that gets a lot of weight and a lot of ranking points by one stroke, they get a disproportionate number of points and the rise up in the rankings is similarly large. Even though if you wanted to predict what's going to happen in the next tournament or even the next major,

Starting point is 01:01:38 you wouldn't change your opinion of that player's skill level all that much. You would change it some, but you wouldn't change it in the way that's reflective of the way that the world rankings work. And it's because fans and media and writers and players really weigh winning a lot. And so if a winner gets 100 points and second place gets 60 and third place gets 40 points, there's this exponential drop-off in points, which means that winning in terms of past performances is hugely rewarded, but it adds a lot of sort of volatility and error if you want to use that in a predictive fashion. And the thing that we haven't talked about is the use of data and technology as a training tool, as a swing tool. And as I discovered when I was working on The MVP Machine and I talked to you about this, golf was really a trailblazer in that area and baseball has actually adopted many ideas from golf. So how did that get started and why has it been adopted so early and so quickly and so readily in golf? So the similarities are many.

Starting point is 01:02:51 So in golf, the use of video goes back to probably the 1980s. And that's sort of when swings started to be analyzed a little bit more scientifically, but there is the availability of technology in several forms. I would say, after video, there were launch monitors, TrackMan radar systems that would measure the clubhead speed, ball speed, launch angle, spin rates. And that, as an adjunct to coaching, just became part of some of the better coaches' toolbox. And then on top of that, then there is 3D motion capture where you can measure the different joints and segments of a player's body and how fast are their hips turning and how fast are their shoulders turning and in what sequence.

Starting point is 01:03:51 And then there's force plates where you can measure how a player is using the ground. And from that, coaches can get information that's not easily gleaned just by the naked eye. And so, in the same way, when you're looking for these incremental gains, if somebody isn't quite shifting their weight properly or using the ground properly, or maybe they should be, you know, doing something slightly different with their hands at the top of the backswing, you can measure this and you can quantify this and you can improve players sort of incrementally. And that sort of, when one coach sees another coach do this or another player becomes good, it's a little bit easier because it's one coach adopting it. And they can

Starting point is 01:04:41 see the success of another player. And then I think that leads to this diffusion of innovation that might be a little bit slower in other sports. Yeah, and just because the ball is stationary and you're not necessarily trying to hit a moving target, whereas in baseball, at least for a hitter, you have to worry about, well, where's the ball even going to be and what type of pitch will it be and how hard will it be traveling? Whereas the ball at least is not moving, although the positioning of it changes from shot to shot.

Starting point is 01:05:14 So that seems like, I mean, if you can control your swing, then that's everything. You aren't going to get fooled by someone else who is trying to slip the ball by you. Absolutely right. So what is the future if there's anything that we haven't touched on, any next frontier of golf analytics? Is there anything that's on the horizon? Well, I think there's work being done, but there's still relatively large room for growth in terms of golf strategy. And certainly golf betting is on the horizon. So I think that's going to drive a lot of innovation and analysis in the golf analytics

Starting point is 01:05:56 space. And there's still quite a bit of room for improvement in terms of measuring and analyzing and using data for golf performance and improving your score through using this kind of data and analytics. So I think this is, in a way, just the beginning. There's a lot more room for additional insights, and there's also going to be new data coming. So for example, the PGA Tour has been piloting and using a ShotLink Plus system where instead of looking at where does a shot start and where does a shot finish, on the green they'll have trajectories of approach shots and the entire trajectory, you know, 10 or 20 times a second as the ball approaches the green.

Starting point is 01:06:50 And you'll be able to see, does it hit and roll forward? Does it hit and take a hop and then spin backwards? You'll be able to see trajectories of putts. From that, you'll be able to sort of quantify the advantage of being able to hit long approach shots with a steep descent angle or short approach shots with a certain amount of spin, be able to quantify the difficulty of downhill left to right putts versus uphill fairly straight putts. So the additional data that's coming will also sort of be this great sandbox for the golf analytics world to play in. And one more thing, I'm always interested in how the availability of new data or new ways of dissecting that data allows us to go back and

Starting point is 01:07:42 look at some of the great players and appreciate them in a different way, perhaps. And so I wonder when you look back at, say, Tiger Woods through the lens of strokes gained or some of the ways that we can look at golf performance now, is it just that he hit the ball farther and more accurately? Or in retrospect, was he also in his prime a very smart or analytically optimized player? So the hardest part of the game to measure with traditional stats was the approach game. Tee shots on par threes, second shots on par fours, and sometimes the second or third shot on par fives, because putts clearly doesn't do that. Greens in regulation sort of

Starting point is 01:08:25 mixes tee shots and second and third shots. And of course, whether you hit the fairway or not is not a direct measure of these approach shots. And it turns out Tiger Woods was one of the best players ever because he was really good at just about every phase of the game, but he was so good with his approach shots that if he was an average driver of the ball, average wedge player, short game player, and average putter, he would still be number one in the world. He was sort of that much beyond everybody else that when you added in how much better of a putter he was and how much better of a wedge player and how much better of a driver of the ball he was, then he became, you know, about a stroke better than the number two player in the world. He would have been in the top 10

Starting point is 01:09:17 just with his approach shots. And if you take a look at his career season after season, his, you know, putting varied a little bit and his short game varied a little bit, but he was never out of the top five in approach shots during the ShotLink era, which is just sort of phenomenal. People, I don't think, realized how critical that was to his success. He was not only one of the greatest players of all time, but he was certainly the greatest at approach shots. And there's one other thing in the strokes gained lens, if I can stick this one in, which is sort of related to baseball, which is what is one of the most phenomenal streaks in sports, but in particular baseball. I'm thinking of the 56

Starting point is 01:10:06 game hitting streak of Joe DiMaggio. So books have been written about this. So I was looking at, we need more records and streaks and fun things like this in golf. So I looked at streaks of beating the field. So what do I mean by beating the field? If a player shoots a 69 in one round and the field average is 71.4, then the player's score was better than the field average. So the player beat the field. So in any round, you can compare a player's score to the field average score, and either the player beat the field or the player didn't. And so if you're an average player, you're going to beat the field half the time. And you can ask, across all of, you know, golf history that we have data for, what's the longest streak of a player beating the field? So round after round after round, how many rounds in a row, you know, did a player beat the field, and how long was that streak?
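The "beating the field" streak described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `beat_field` and `longest_streak` and the sample rounds are hypothetical and are not taken from the speaker's data or method. Only the 69 versus 71.4 round comes from the example in the conversation.

```python
def beat_field(player_score, field_average):
    """A player beats the field when the score is lower than the field average."""
    return player_score < field_average


def longest_streak(rounds):
    """Return the longest run of consecutive rounds in which the player beat the field.

    `rounds` is a chronological list of (player_score, field_average) pairs.
    """
    best = 0
    current = 0
    for player_score, field_average in rounds:
        if beat_field(player_score, field_average):
            current += 1
            best = max(best, current)
        else:
            current = 0  # the streak ends on any round at or above the field average
    return best


# Hypothetical sample data; the first round is the 69 vs. 71.4 example from the talk.
sample_rounds = [
    (69, 71.4),
    (70, 70.2),
    (72, 71.0),
    (68, 70.5),
    (67, 71.1),
    (70, 70.9),
]

print(longest_streak(sample_rounds))
```

In this sketch a round that exactly ties the field average ends the streak; whether the actual analysis treats ties that way is not stated in the conversation.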

Starting point is 01:11:10 And I've asked, you know, many different people, including, you know, PGA Tour pros, but also amateur golfers, golf writers. And most people's guesses were on the order of 20 or 30 rounds has got to be the record. How could you do better than that? And it turns out that Tiger Woods, as you might expect, owns the record for the longest streak of beating the field. And it was 89 rounds in a row. 89. Wow. And when was that? And that was in the, uh, around 2000, which was

Starting point is 01:11:49 his most phenomenal season. 89 rounds in a row. And second on the list, so one other way you can look at how amazing that record is, is what's the gap between first and second. And so for the Joe DiMaggio streak, second is like 44 after 56. In golf, second place was, I believe, 34. So Tiger's at 89 and second place was 34 rounds in a row, consecutive rounds,

Starting point is 01:12:23 beating the field. Well, that raises another question for me, I guess, which is what role does randomness play? Because random variation plays a pretty big role in baseball, but in golf, is there a clutch? Is there evidence of clutchness? Is there a great repeatability from round to round and tournament to tournament? Or is there a lot of fluctuation around whatever the player's true talent level is? So that's one of the things that I'm looking at currently, and I don't have any results to report. But the idea of how do you measure performance under pressure in golf is sort of a current topic of research of mine.

Starting point is 01:13:06 All right. Well, we will look forward to the results then. And for now, you can follow Mark at @MarkBroadie on Twitter. His book is Every Shot Counts. Thank you very much. Thanks for having me, Ben. Pleasure. That will do it for today and for this week.

Starting point is 01:13:21 Thanks to all of you for listening on a holiday week or whenever you listened. This series is speeding along. We will cover soccer and rugby in the next installment. In the meantime, you can support the podcast on Patreon by going to patreon.com slash effectivelywild. The following five listeners have already signed up and pledged some small monthly amount to help keep the podcast going

Starting point is 01:13:41 and get themselves access to some perks. Joel Gillespie, Dustin Palmeteer, Randy Stearns, Eric Smiley, and Nicholas Perry. Thanks to all of you. You can rate, review, and subscribe to Effectively Wild on iTunes and other podcast platforms. You can join our Facebook group at facebook.com slash group slash effectively wild. Keep your questions and comments coming for me and my regular co-hosts, Sam and Meg, via email at podcast at fangraphs.com or via the Patreon messaging system if you are already a supporter. Thanks to Dylan Higgins for his editing assistance, and we will be back with another

Starting point is 01:14:16 episode extremely soon. Talk to you then. Bring a bowl of icing Then a glass of water To a place Setting up the chessboard While Beth rolls out the dice Anyone for tennis Wouldn't that be nice Thank you.
