Effectively Wild: A FanGraphs Baseball Podcast - Effectively Wild Episode 1793: Measuring the Unmeasurable
Episode Date: January 5, 2022
Ben Lindbergh and Meg Rowley banter about Fanatics purchasing Topps and MLB Network reportedly parting ways with Ken Rosenthal because of his criticism of Rob Manfred. Then (27:36) they kick off a series of episodes about measuring difficult-to-quantify aspects of the sport by talking to Cameron Grove about translating his study of astrophysics into baseball […]
Transcript
Hello and welcome to episode 1793 of Effectively Wild, a FanGraphs baseball podcast brought to you by our Patreon supporters.
I'm Meg Rowley of FanGraphs and I'm joined as always by Ben Lindbergh of The Ringer. Ben, how are you?
I'm alright. Happy New Year.
Happy New Year.
How was your New Year's?
You know, I stayed up till midnight. I was very proud of myself.
I am not prone to midnight. It is not an hour I often see on purpose
as a person who is an early riser.
And I don't say that as if there's any moral superiority
to rising early or one way to be.
That is just the way that I am wired.
There's been, you know,
sometimes people get very fussy about their sleeping habits.
I only get fussy about yours
because I don't think you sleep enough,
but now you have a newborn.
So what am I going to do?
Tell you to sleep more?
No, I'm not going to do that.
That'd be crazy.
But so anyway,
I don't normally stay up till midnight
and I did and I was proud of myself.
I survived the sound of fireworks
for many hours after that
and here we are in 2022.
I had to write down the year and I got it right on the
first try. And so I think my year might be downhill from there. I don't know if I can
do better than writing 2022 on purpose correctly the first time around. It might decline.
Yeah. I don't have to write the date all that often these days. Back when I used to,
I would always screw that up for a while. Now, I probably still will, but have fewer opportunities to. But yeah, I was up past midnight, which was not only because it was New Year's, but also that's just probably what I
would have been doing anyway. It turns out that a lot of lockdown pandemic precautions just mirror
what I would normally do under regular circumstances. So I just watched some drunken
Andy and Anderson and then went back to regular life. Yeah. I watched When Harry Met Sally,
which one could argue that that's a New Year's film, right?
Yeah, definitely.
Oh, very much so.
Yeah, it's also a Christmas film and a summer film
and a fall in New York film.
It is a film that reminds us that fashion is cyclical
because Meg Ryan looks fantastic throughout the whole thing,
and I'm like, I would wear most of these outfits now,
again, if I ever left the house.
But yeah, that was quite nice
and satisfying, funny. You know, Nora Ephron, she was a talented gal. Miss her. Wish she were still
around. There you go. Well, happy New Year, everyone. It is a new year, but same slow news week, generally.
No development in the lockout as of yet. No meetings that we have heard of. Just a couple of news items I guess we could touch on briefly before we get to our topics and guests today. So we talked last year, late August, episode 1736, we did an interview about the surprising, some might say shocking, news that Fanatics had swooped in and secured exclusive card licenses with both MLB and the MLBPA, which seemed to leave Topps, which had been making baseball cards for 70 years, in a pretty precarious position.
And I think there was a lot of consternation about what this meant for baseball cards and
for sports cards in general and what it would mean for Topps.
And now we know, at least the latter, what it means for Topps is that Fanatics is reportedly
purchasing Topps.
So Topps has lost some of its valuable licenses, had lost the baseball license, although that was not scheduled to kick in until 2025.
But long term, there were a lot of questions about where Topps would go.
And I suppose it made sense for both parties to combine forces at this point.
And it reportedly came as a surprise to Topps that they were going to lose the license, certainly the one that they've been the longest associated with, and they were maybe a bit miffed about that.
And now they will be working with Fanatics.
And so Fanatics gets to start making baseball cards immediately instead of waiting for a few years and just gets presumably to keep using the Topps name if they want to and bring that tradition and lineage and the back catalog and the experience along with them.
But it sounds like Fanatics is really consolidating control of this industry.
It's not a monopoly, but they certainly seem to have a commanding grasp on sports cards in general and certainly on baseball cards for the foreseeable future now.
Yeah. I mean, it's weird to say you feel for
Topps for, like, only getting 500 million dollars. I don't know. I guess this is making the best
of a bad situation. I will say I am heartened that Topps will remain involved in the card market
like this, because I think on that episode we had raised the concern that, like, you know, some of Fanatics' own products have not been
the best. The quality of them, just as pieces of merchandise, has sometimes been wanting. And
baseball cards are, you know, a lot of things to a lot of people. They're like a
speculative investment vehicle for some folks, but they're also a thing that people really enjoy and
treasure, and collecting them means something to them, and it's a connection that they have to the game. And, you know, I think there's been some fluctuation and variability
in the sort of aesthetic quality of Topps sets over the years, and I know that enthusiasts
probably have their favorites and ones that they think were not quite as up to snuff. But in general,
I think that we would acknowledge that, like, Topps makes good baseball cards. So it's good that they
will keep making them, because people who care about baseball cards want their baseball cards to be good, and
this suggests that they will be familiar in a way that is nice.
So yeah, and one would imagine that
Topps probably had to swallow its pride, lick its wounds a little bit after having Fanatics kind of
come out of nowhere and take away its stranglehold on baseball cards. But
hopefully this means that the people in Topps will be in a better position than they would
have been otherwise. And Topps has a lot of other licenses and businesses as well. They do things
other than make cards and they also have licenses with like MLS and Formula One and UEFA and other
leagues. So now it's just all under the Fanatics umbrella and we will see what happens there.
But I wanted to follow up on that because it was kind of an open question after our
previous discussion.
And now there is a little more clarity about that situation.
Yeah.
The other bit of news is that MLB Commissioner Rob Manfred has stepped in it once again.
He has not proved to be a master of public relations, I would say,
during his stint as commissioner of Major League Baseball. And that run continues this week with a report by Andrew Marchand,
the New York Post media reporter,
who broke the story that apparently Ken Rosenthal,
who had been an MLB Network insider essentially
since the inception of the network or thereabouts for 12 years or more, he got into some troublesome
hot water with the league with Rob Manfred over a column or columns that he wrote last year for
The Athletic. Ken Rosenthal is a man of many jobs.
He, of course, has been on camera at MLB Network a lot and also at Fox,
and he is one of the lead writers at The Athletic.
And at The Athletic, he wrote some things last year that were kind of critical
of Rob Manfred's handling of the lead-up to the start of the pandemic delayed season.
And according to Marchand's reporting, Rob Manfred was not happy with the tenor of those reports.
And again, according to the New York Post report, Rosenthal was sidelined for a few
months at MLB Network.
He continued to be paid, but was not appearing on screen.
And then he was eventually allowed to return to the network, but his contract expired
at the end of the year and was not renewed. And again, according to this report, seemingly that
may have had something to do with this continued bitterness on Rob Manfred's part about his prior
reports. Rosenthal has confirmed on Twitter that he will not be returning to MLB
Network. He said, I'm grateful
for the more than 12 years I spent
there and my enduring friendships with
on-air personalities, producers, and staff.
He did not confirm
or deny all of the details of
Marchand's report, but he did end
one tweet by saying, I always strove
to maintain my journalistic integrity
and my work reflects that.
Yeah, I mean, I don't think that anyone who has watched MLB Network over the years will
be shocked to learn that there is, you know, there's a league and owner sympathetic tone
that often pervades its air.
And I think that there are times when that is more obvious than others.
That isn't to say that there aren't people there who do good work and that there
isn't, you know, sort of straightforward analysis that takes place.
But it is, you know, it's league owned media and league owned media is going to have pitfalls.
And I think that one of them is if you are critical, even in spaces away from MLB Network's
air of the commissioner, and that makes its way back to him, whether it's being directed by him
personally, or as I find more likely sort of is an understood mandate of the network and its senior
leadership, you're probably not going to be asked back. And that's disappointing because they are at the center of a lot of important media
rights and they are in a position to help foster an appreciation for the game.
But the game thrives best when it is assessed critically and honestly, and this is antithetical
to doing that.
So yeah, it's a real bummer.
I mean, Ken's like a pro's pro. I don't think we need to
list Ken's bona fides. You know, I think the part of this that I find the most disconcerting is that
he's a reporter who takes the work seriously and, I think, puts a premium on being measured and
thoughtful and trying to be fair to the subjects of his reporting, even as he is critical
of them, right? He is not someone who is prone to, like, florid language, really. And if you read the
piece that is linked here that supposedly got him benched last year, like, it's critical, but it's fair.
It's even-keeled. You know, it is, I think, an honest assessment of where we were in that stretch where it
looked like despite the ability of baseball to come back, that it might not because, you
know, the league couldn't get out of its way.
And if that's enough to get you benched and then, you know, to, I guess, not get your
contract renewed, like, you know, what kind of hope can we have that they're
going to have honest critical uncomfortable conversations about other failings of the
league or the sport or the people who are involved in the league and the sport? So, not good, as I said
on Twitter and then was worried that people would think I was making, like, a joke about Ken's height. I was very nervous. I
tweeted, and then I got concerned that people would read "extremely large yikes" as a knock
at Ken, and I didn't mean it like that at all. But it's not good. And in the midst of a moment
when I think having sort of clear eyed reporting about the state of the sport is particularly
important. It doesn't send a great message.
Now, I think given the sort of media terms of the lockout, no one was going to MLB Network
expecting that they were going to hear, you know, clear eyed analysis of the, you know,
what ownership should do to meet the players halfway or anything like that.
But it's really too bad.
And I, you know, I think one of the other potential casualties of
something like this is that it does kind of call into question and make readers potentially nervous
about the sort of objectivity of other aspects of that media operation and that's a real shame
because there are a lot of really good writers and reporters who work for MLB network and work for MLB.com. And, you know,
when you put your thumb on the scale like this, you, you're in danger of undermining all of that
good work too. So, you know, part of the job of being the commissioner is to take public criticism.
That is part of your job. It is not the only thing that you have to do, but to demonstrate that you
view yourself as being entitled
to be sort of immune from that kind of criticism,
such that you will bench one of the most popular reporters
to cover the sport,
suggests someone who perhaps needs to think critically
about whether they have the requisite skill set to do the job.
I don't know, just as a thing that we might say very casually on our baseball podcast.
Yeah, I mean, virtually all commissioners of major sports leagues are unpopular.
It goes with the territory very much,
and it does seem as if Manfred is a bit more sensitive to that
than perhaps some of his counterparts in other sports or his predecessors in MLB.
And yeah, in one sense, it's not that
surprising because again, we're all aware this is league-owned media, MLB.com and MLB Network are
not currently mentioning active MLB players while the lockout is going on. So clearly,
these are not just any media outlets. And of course, in any other walk of life, I mean, if many of us were to
say something critical publicly about the people who sign our paychecks, that would probably not
go over well, and perhaps it would do damage to our job security. But I think when it's a media
entity, and when you're bringing on someone like Ken, you are having him appear on your network partly because of the credibility
that he has as a journalist, right? And so what's the use of an MLB insider if you think that all
of these insiders are just puppets of the league and they're only saying the things that they're
allowed to say, right? I mean, you're just not going to put much stock in it at that point.
And so you do have to kind of be skeptical,
I guess, about anything that is produced by someone in that role, because this isn't even
something that Ken said on MLB Network, right? This is in one of his other guises with The
Athletic, and it still had some bearing on his employment status seemingly at MLB Network.
And so if you're kind of taking everything that is said in that vein,
and you're thinking, well, is this person pulling their punch because they think Rob Manfred might
be watching and it might impact them in some way? Well, then you're going to take everything that
you hear with an even bigger grain of salt, right? So I think that kind of undercuts the position. You want to at least maintain some semblance of a divide between the editorial operation and the business.
Even if you know that they can't be completely separate, you still want to have at least some illusion that that is the case.
And that is kind of undercut by this report, really. So I think that's the upshot of it, that if Rob Manfred didn't like
Ken Rosenthal's comments because he thought they made him look bad, well, they probably didn't
make him look as bad as this report that comes out and suggests that he can't handle the
criticism. Yeah, it doesn't speak well of the media apparatus that, you know, undergirds the
commissioner that this is the outcome that you get.
And, you know, I appreciate that, like, these are weird moments because you're like, do I give someone
credit for not getting renewed? But, you know, I do appreciate that I'm sure there was a conversation
about the reason for the benching and that, you know, Ken decided, right, I'm not
gonna change my tune on this stuff.
So that's to his credit and speaks to the failing over at Network.
Yeah.
If anything, this makes him look better.
Not that he necessarily needed a PR boost.
He has a great reputation as it is.
But I saw a lot of people who were maybe less familiar with his contributions across all
media saying, oh, he'll catch on somewhere else.
I think Ken will
be fine. I mean, yeah, I haven't spoken to him about this, but he has only two big jobs now instead
of three. Like, maybe he'll get some sleep every now and then or something. Probably not. But yeah,
I thought that was funny too. I was like, no, no, like, The Athletic didn't fire Ken. He got fired from one
of his many jobs. Yeah. And it actually led to an outpouring of appreciation and affection for Ken by all of his colleagues at The Athletic and others as well.
I mean, just in my limited experience with Ken, I've always found him to be very gracious and considerate and helpful and diligent.
I mean, he's blurbed both of my books and he didn't even really know who I was at the time or I didn't know he knew who I was at the time.
So I've appreciated that, too.
And even though he's not known as, like, I guess one of the leading critics of the league, he's not necessarily like a rabble rouser or a bomb thrower.
He has certainly contributed to a lot of reports and investigations that teams and the league probably would rather not have
surfaced. So he has collaborated with a lot of colleagues at The Athletic on reports of that
nature. I mean, the sign stealing report, of course, which he co-authored or other, you know,
sexual harassment and assault reports. I mean, his name has been all over many of those reports.
So, I mean, he's been doing this for decades on air and in print, and he has made a deserved name for himself. And I think this also sort of illuminates
some of the almost unavoidable conflicts that can come about in the media industry. I mean,
it's hard to find a lot of baseball media members who have been working for a while who haven't had some kind of relationship or connection to the league in some kind of capacity.
I mean, even including us, right?
We have both appeared on MLB Network.
And I've done occasional appearances on MLB Network dating back probably almost a decade at this point and have been paid for some of those
appearances. And I know FanGraphs has some slight, loose relationship with MLB, right, when it comes
to data sharing and, you know, Statcast stats appearing on FanGraphs player pages, etc. So
it's hard to find someone who has no connection to, you know, either MLB or to an MLB rights holder.
And you kind of have to be aware of that.
I mean, my relationship with MLB Network has always been pretty informal.
I've never had a contract or anything there
and I've never had regular
or lucrative enough appearances there
for it to matter all that much to me financially.
It's just the thing to do every now and then
when they invite me to. And generally they're bringing me on to talk about more X's and O's stuff, right? Like
player evaluation or sabermetric stuff more so than like the labor situation. So I think you
kind of have to consider the source and consider the subject matter when you're watching that
stuff. But it can get tricky because there are all
these ties, and that's something that we've talked about with gambling, for instance, and the
fact that there are such big investments by gambling companies and sports betting in media operations.
And you always kind of have to question, well, is this person being told what to say or what not to
say? And in my appearances at MLB Network, I've
certainly never been told what to say or what not to say or been critiqued for something I said. But
when you're one of the most prominent voices and you have a more formal relationship and you're
talking about things that maybe touch on more sensitive areas for the league, it is not
shocking but still somewhat disappointing that things came to a head
there. So even though the Ken Rosenthal rules might not apply to me, an occasional contractor,
and someone who Rob Manfred might not read or listen to as regularly, I'd still have to
reevaluate whether I would want to go on because I wouldn't want to give anyone the impression that
I was compromised in some way and wouldn't want to condone essentially censorship on some level of someone I really respect, legally sanctioned censorship, obviously. This isn't some sort of
First Amendment issue, but that precedent would still give me pause even if it didn't directly
apply to me. Yeah. I think that individual media members have to navigate potential conflict all
the time, as you said, whether it's, do I appear on this show? What is the relationship of my site to the league? You know, we both have, not just Jeff, like, we both have friends who work for teams, and we try to be aware of those conflicts. I think that you are often aided in that pursuit by being able to rely on colleagues to help you assess potential conflicts and understand when disclosure might be necessary or when something is just too close a relationship for you to reasonably, you know, offer
an opinion on it without either actual bias or an unavoidable appearance of bias. I think that
that challenge just automatically gets harder when you're working for the league, which isn't to say
that it can't be done, but it does get harder just on its face because your direct editorial line
goes up to an MLB-owned entity in a way that is not true
for you at The Ringer, for me at FanGraphs, even though we do have, you know, a relationship with
the league around data. So it's just a really tricky thing to navigate. And I think that it's
something that, you know, reporters should be cognizant of. It's something that readers should
try to assess and be mindful of. And I don't think
that those are sort of unresolvable issues, right? People navigate conflict all the time
and can do it, I think, successfully without being compromised. But it's a thing that you
have to work at regularly and be mindful of. So it's a tricky thing. And it's made all the harder
when your editorial line is potentially putting its thumb on the scale or getting a directive from the business side to present things in a particular way. channel, even if it's mostly showing Kevin Costner movies at this point. I think the production values
are typically very good, especially on game broadcasts. And if you're someone who's into
studio shows, then it's nice to have some that are devoted to baseball as opposed to, say,
turning on ESPN and rarely hearing about baseball at times. So you just have to be aware of what
you're watching and where it's coming from. And
if you're looking at it through that lens, I think it can still be good. And hopefully Effectively
Wild is a place where people are getting the straight dope as you see it, and we're not beholden
to a league. We're not beholden to sponsors. Our sponsors are our listeners. So we say what we think,
and we try to be fair and we definitely say some things and have some people on to talk about things that all told probably the league would rather not have people talking about.
Really?
Like this segment for instance.
Anyway.
Anyway.
So what we wanted to do this week because it's a slow news period, we're doing a little theme week here and the theme is
measuring the unmeasurable. And I think one of the things that really got me in deep to baseball
analysis and caring about baseball and working in baseball in some capacity is the idea that there
were things that were misunderstood or still undiscovered about the sport despite its long
history. And it was really intoxicating to find out about things like catcher framing or even
earlier sabermetric innovations and to think that, wow, the things that I thought about
the sport were wrong and no one knew this.
And often it confirmed things we thought, but it was still kind of cool to see it in
an objective, quantified way.
And I think in recent years that has become a bit rarer. I find myself a little
less excited by discoveries and research than I used to be just because, you know, there's less
low-hanging fruit, I suppose, than there once was. And also, there's been a bit of brain drain,
perhaps, and teams have hired a lot of really talented public analysts. And also, there is a
divide in terms of the availability of statistics
and you have certain Statcast stats and other data sources that teams have access to that public
researchers don't. So there was a time where public researchers were way ahead of where teams were,
and then they were kind of neck and neck for a while. And now in some ways, I think that private
proprietary analysis is ahead, but there is still a lot of really
interesting and valuable research being done in the public sphere. And there are several studies
that have caught our attention recently. And so we wanted to devote this week's episodes to talking
about their authors and trying to plumb some of those depths and explore some of those unexplored areas.
So we've got two conversations lined up today.
Later on, we will be talking to Eric Chalek about his Negro Leagues Major League equivalencies,
where he has tried to determine what Negro Leagues players might have done, statistically speaking,
if they had been plopped down into the AL or NL and allowed to play in those leagues at the time,
just trying to come up with some baseline to compare in a more accurate way between players
on one side of the color barrier and those on the other. And in our first segment, we will be
talking to Cameron Grove, who is an astrophysicist in England, who has also been doing a lot of really cutting-edge baseball research on things like stuff and quantifying pitcher repertoires and catcher game calling and tracking pitcher deliveries and a lot of other exciting areas of research.
So we will be back in just a moment with Cameron.
We are joined now by Cameron Grove, who tweets about baseball at pitching underscore bot,
writes about baseball at his site, Ahead in the Count, and has contributed
to Baseball Prospectus. Also, he's an astrophysicist on the side, though maybe in most
contexts he would describe himself as an astrophysicist who studies baseball on the side.
Cameron, welcome to the show. Thank you for having me.
There's a pretty rich tradition of baseball analysis by scientists and thinkers who are
way overqualified to be researching sports,
and you seem to fit right into that lineage. So we're lucky that you've decided to dabble
in baseball while you're not busy trying to answer some existential questions about the universe,
because I think I might be more interested in your day job than I am in baseball.
So let's start there. Where do you work and what do you do?
Yeah, sure. So as you might
be able to tell by my voice, I'm British and I'm a PhD student or a grad student, I guess as
Americans say, at Durham University, which is in the northeast of the UK, kind of near Newcastle,
if any of you know where that is.
But yeah, so I've been there for a few years now, and I have one year left on my PhD,
basically working on simulations of the universe at kind of the very largest scales.
So we have big supercomputers here, and we kind of run universes on them,
see what changes when we tweak the
cosmology and that kind of stuff. So it's interesting work, but very computationally
expensive and heavy. So yeah, the baseball stuff is more interesting to me.
Is it really? Wow. It
sounds simple by comparison. I mean, you're studying, what, the expansion of the universe and dark energy
and how galaxies form? I like to nerd out about that
stuff in my spare time. So is there a particular research interest of yours? I know you've studied
gravitational waves.
So my kind of main project is to do with a telescope called DESI, which is the
Dark Energy Spectroscopic Instrument, or spectrographic, I can never remember. But it's
mounted on a telescope in Arizona, and it started taking data this year. And what it's doing is it's
measuring the positions of millions of galaxies going all the way back to the early universe.
And with that data we can learn a lot about kind of what kind of universe we live in
and kind of what the expansion history of the universe is like. But my specific job,
because this is a collaboration with hundreds of people in it, I run these simulations that allow us
to kind of work out that the instrument is working correctly. So we only have kind of one version of
the sky that we end up seeing
through the telescope, but with simulations we can create hundreds and thousands of kind of fake
catalogues of galaxies that we might see. And so by comparing what we really see to all these
simulated versions, we can kind of match them up and see, okay, which simulated universes have the
most similar properties to the one that we actually see. So yeah, my part is basically doing these simulations
and then making sure that the simulations are doing what we want them to do. Because, you know,
it's computer code and there's bugs everywhere and we don't really know what the right answer is.
So we have to make sure that that all works out okay.
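Editor's sketch: the idea Cameron describes, generating many fake catalogues and asking which simulated universes look most like the one real sky, can be illustrated with a deliberately tiny toy. Everything here is an assumption made for illustration and is not the DESI pipeline or Cameron's code: the one-dimensional "universe", the `clustering` knob, and the function names `make_catalogue`, `count_variance`, and `best_match` are all hypothetical.

```python
import random
import statistics

def make_catalogue(clustering, n_galaxies=2000, seed=0):
    """Toy 1-D 'universe' on [0, 1): galaxies scatter around 5 fixed centres.
    `clustering` is a hypothetical knob: larger values mean tighter clumps."""
    rng = random.Random(seed)
    centres = [0.1, 0.3, 0.5, 0.7, 0.9]
    width = 0.05 / clustering
    return [(rng.choice(centres) + rng.gauss(0, width)) % 1.0
            for _ in range(n_galaxies)]

def count_variance(catalogue, n_bins=50):
    """Summary statistic: variance of galaxy counts across equal-width bins.
    Clumpier catalogues give a larger variance."""
    counts = [0] * n_bins
    for x in catalogue:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return statistics.pvariance(counts)

def best_match(observed, candidates, n_mocks=20):
    """Return the candidate whose mock catalogues, on average,
    have a summary statistic closest to the observed one."""
    target = count_variance(observed)

    def distance(clustering):
        mocks = [count_variance(make_catalogue(clustering, seed=s))
                 for s in range(1, n_mocks + 1)]
        return abs(statistics.mean(mocks) - target)

    return min(candidates, key=distance)

# Pretend this is the one sky we actually get to observe.
observed = make_catalogue(clustering=2.0, seed=999)
best = best_match(observed, candidates=[0.5, 1.0, 2.0, 4.0])
print("best-matching clustering value:", best)
```

A real analysis would use far richer statistics than a single binned variance and would also have to check, as Cameron says, that the simulation code itself is behaving, but the compare-one-sky-to-many-mocks structure is the same.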
So what was the path that brought you to baseball and to baseball research?
What question did you come across where you thought, aha, I answer far harder questions
than this at work all the time. I'll give it a shot to see if I can sort out this baseball
industry instead. So I didn't really follow baseball at all until maybe midway through 2019.
So I'm a relatively recent follower of the sport, but I'd followed
other American sports like football or American football and basketball. But then I actually
started doing baseball statistics research to learn a new programming language. So I was doing
a placement at the Department for Education here in the United Kingdom, and they use a language called R.
And I use something else called Python for all my physics work.
So to force myself to use R and to force myself to learn this new language, I thought, OK, what am I interested in that will, you know, use coding and force me to learn this new language?
And so I tried doing some kind of baseball stats in it.
And then after the project had ended, I just kind of kept on going with it and kind of found more
interesting questions to answer. So the first kind of major question that I kind of had that I thought
was relatively maybe new in the public space was looking at kind of what makes a good pitch, and can we quantitatively
measure that? So often you'll hear announcers say, oh, that was a good pitch, or that was a bad
pitch and he got away with it. And, you know, they're seeing it, so surely there's some way of measuring
that. And that's where the kind of name for my Twitter account came from, Pitching Bot.
It was this attempt to make a kind of a model to evaluate pitches without necessarily looking at what happened after the pitch,
but just looking at the velocity, the movement, where the pitch was in the strike zone, that kind of stuff, and put that into kind of an overall score. So that was
the first kind of major project that I undertook and the main question I wanted to answer, but that
spawned a whole load of different stuff, and yeah, I've done a lot of varied projects since.
Yes, yes,
you have. Your research has touched on a lot of areas that are kind of close to our theme
this week of measuring the unmeasurable and it must be challenging if you're just joining baseball and sabermetrics midstream
decades into that field to figure out, okay, what has been done already and what has been studied
and what's the past research that exists? You know, am I answering a question that has been
answered previously? I guess you're joining at a
time when the availability of data has really skyrocketed recently with StatCast and other
data sources. So some of those questions just weren't answerable really at all for most of
the history of sabermetrics even. But still, you kind of have to familiarize yourself with the
landscape. And I'm sure that given where you grew up, you were probably more inherently familiar with cricket or with non-American
football and various other sports. But I guess even though there are some rich traditions of
analytics in those sports, maybe the availability of data for you to practice on and bring the
methods that you had already mastered in your other work maybe weren't quite as plentiful. I
guess baseball just really lends itself to that, which is why so many people have gravitated toward
baseball analysis, even if they don't initially know that much about baseball. And have you
grown to like baseball as a fan, as a spectator, or is it mainly still sort of as a scientist and as a researcher?
It's a bit of both,
I would say. I try to catch games when they're on, but being in the UK, you know, I don't want to
stay up past 2 a.m. at the latest, so that limits me to kind of daytime East Coast games when
they're on. Sometimes I'll stay a bit later for the playoffs, but that's the only exception.
I don't really have a favorite team, I would say. I like them all. I like a good game, you
know, no favoritism there. But yeah, as you said, the availability of data is something that I
haven't seen in any other sports, and it's mainly why I chose baseball as the sport that I was doing my analysis on, because
with StatCast especially, there's just so much public data that when I found it, I was like, oh my god,
this is amazing. And it feels so underutilized in the public space, I guess. I mean, people do loads
with it, but there's still so much more you can do with it. And because it's so complex and, you know, there's like a hundred different
features you can have on each pitch and there's loads of different variables to look at, there's
a lot of interesting stuff you can do with it, kind of beyond what you could do maybe five years
ago with normal stats.
Yeah. So the way that you started with quantifying the stuff of a pitch and
predicting its results just based on the pitch characteristics, I think maybe the first person
I'm aware of who did work like that is Jeremy Greenhouse years ago at Baseball Analysts,
and he works for the Cubs now. And there were some early models like that, and now some people
will see, you know, Eno Sarris often cite one called Stuff
Plus in his work at The Athletic that he developed with Max Bay, who works for the Astros now. But
I know that you have improved that model and added more variables to it. And the main application
usually is just figuring out how good is this pitcher? How good should he be just based on his
stuff? But I'm also really interested in the applications you found of that model to broader questions, more general questions about
how baseball works. For instance, I've seen you research the diminishment in stuff that pitchers
have after they work a certain amount of innings or maybe a few days of rest, do they have worse stuff? And you've been able to
show that that does seem to be the case. I think one of the most interesting applications of that
that I've seen is that you studied the times through the order effect. And historically,
there's been a bit of disagreement. Why do batters get better against pitchers as they
see them more times within the same game? And there are two schools of thought there. One is that stuff is getting worse. Pitchers are just
running out of gas as the game goes on. The other is that it's a familiarity effect and that the
more times the batter sees the pitcher's pitches, the better he's able to anticipate and predict
and hit them. So you applied your model to trying to answer that
question. What did you find? Yes. So you have the pitch quality on one side and then you have the
results on the other side. And if you kind of group those by the times through the order,
then on the first time through, they're kind of in pretty good agreement with each other.
But then as you go to more and more times through the order,
the pitch quality metric stays roughly equal.
There's no significant decrease in stuff.
Or even in the quality of pitch locations,
it's not like pitchers lose their feel and start, you know,
missing the zone or hitting the centre of it.
That stays very consistent.
But the results that the pitchers get on those pitches show a clear decrease.
So hitters get a lot better, even though the pitchers are throwing the same stuff that they did in the first inning.
So it's pretty clear to me that that was evidence for a familiarity effect.
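The comparison described above, pitch quality on one side and results on the other, grouped by times through the order, can be sketched in a few lines. This is only an illustrative sketch, not the model discussed in the episode: the function name `summarize_by_tto`, the tuple layout, and every number in `pitches` are made up, and run values are assumed to be from the pitcher's point of view (positive is worse for the pitcher).

```python
from collections import defaultdict
from statistics import mean


def summarize_by_tto(pitches):
    """Group pitches by times through the order (TTO) and compare
    model-expected run value with the run value that actually resulted.

    pitches: iterable of (tto, expected_rv, actual_rv) tuples (hypothetical layout).
    """
    groups = defaultdict(list)
    for tto, expected_rv, actual_rv in pitches:
        groups[tto].append((expected_rv, actual_rv))

    summary = {}
    for tto in sorted(groups):
        expected = mean(e for e, _ in groups[tto])
        actual = mean(a for _, a in groups[tto])
        # A growing gap with flat expected values is the pattern that
        # points toward familiarity rather than declining stuff.
        summary[tto] = {"expected": expected, "actual": actual, "gap": actual - expected}
    return summary


# Made-up numbers purely for illustration.
pitches = [
    (1, 0.00, 0.00), (1, -0.02, -0.02),
    (2, -0.01, 0.01), (2, -0.01, 0.03),
    (3, 0.00, 0.04), (3, -0.02, 0.04),
]
summary = summarize_by_tto(pitches)
for tto, row in summary.items():
    print(tto, row)
```

In this toy data the expected value stays flat across groups while the gap widens, which is the shape of result described in the conversation.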
And you can also split it up by pitch type and look at, okay, so if an individual batter has seen, you know, four
changeups from this pitcher in this game, then you can also see an even more stark decrease in
the results while pitch quality remains pretty stable. So it's definitely familiarity, as far as
I can tell.
Yeah, I was excited to see that, because I've been kind of on team familiarity
effect when it comes to that question, because there have been good studies, I think, that have come to different conclusions.
But I've found the ones that seem to indicate that it's familiarity more convincing on the whole.
And if it is familiarity, then there are implications for strategy and pitch selection, and how quickly you get through a plate appearance affects how big the times through the order effect is for that particular
hitter. So I think analytically that's kind of the more interesting answer. Not that that matters, but
I think that makes it more compelling to me. And I guess there are ways that you could extend that
theoretically, and maybe you can look across games instead of within games.
I have looked in the past at what happens when a starting pitcher faces the same team in the playoffs within the same round.
And it seems like as long as you're on regular rest, there's no drop off in performance.
Although pitching on short rest can be bad. One question that always comes up during the playoffs is with relievers seeing the same opponent within the same series, you know, two or three or more times. And I think
there's some suggestion that maybe they might get a little less effective there. So if at some point
you have time, I guess you could potentially apply it to that question too. But are there
any other applications of your stuff model that you are interested in examining?
I try to keep a big list of possible questions to answer somewhere on my computer. I've lost it
at the moment, but I'll add that question about playoff familiarity in there, yes, and maybe you'll
see a tweet about it sometime in the next few weeks. We'll see.
Good.
But I mean, there are so many
other questions you could try to answer with this. Essentially, with these kinds of pitch
quality models, you have a whole alternate universe of possible outcomes that don't actually happen
but are, you know, probabilistic. So you can look at kind of expected strikeouts, expected walks,
instead of what actually happened.
And now you can build a whole new set of statistics from that. So yeah, there's lots
of interesting stuff that can be done. One topic that seems to have been on your
to-do list that you took a pass at answering and which has been a topic of great interest
to Ben and I over the years is measuring catcher game calling, right?
This has been sort of the next frontier in trying to quantify the value that catchers bring to their teams.
And unlike pitch framing, it has sort of stymied researchers in their ability to answer definitively the value that catchers are bringing to their teams by their game calling.
And so I wonder if you might share your approach to trying to answer
this question and then what conclusions you were able to draw, because I thought you took a really
interesting tack when it came to this.
Sure. So the way I tried to look at catcher game calling
was by looking at the run values that pitchers get, depending on who's catching them. So on most teams you have
a catcher and a backup catcher, and pitchers will kind of maybe get slightly better results with one
than the other, but using something like ERA it just takes far too long to stabilize to get any
kind of good pattern out of that. So I essentially looked at something called run values,
which is on every single pitch you can have either like a ball or a strike or a ball in play,
and those can be assigned, you know, positive or negative run values based on whether it's
more likely for the team to score after that. So you can use these and see, okay, with one catcher
does one pitcher have a better set of run values than
with a different catcher? And you can also go beyond that and look at, okay, how about not run values but
expected run values? So the expected run value is kind of, okay, if we do all this modelling and prediction of pitch quality, do some pitchers actually throw better pitches
when one catcher is calling the game for them than another?
And that's something that seemed to have an effect.
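One way to picture the within-pitcher comparison just described is to average expected run value per pitch for each pitcher-catcher pairing, then difference the two catchers for every pitcher who threw to both. This is a rough sketch under stated assumptions, not the actual analysis: `catcher_gap_by_pitcher`, the tuple layout, the pitcher and catcher labels, and all values are hypothetical, and negative expected run value is assumed to be better for the pitcher.

```python
from collections import defaultdict
from statistics import mean


def catcher_gap_by_pitcher(pitches, catcher_a, catcher_b):
    """Mean expected run value with catcher_a minus catcher_b, per pitcher.

    pitches: iterable of (pitcher, catcher, expected_rv) tuples (hypothetical layout).
    Only pitchers who threw to both catchers are compared.
    """
    by_pair = defaultdict(list)
    for pitcher, catcher, expected_rv in pitches:
        by_pair[(pitcher, catcher)].append(expected_rv)

    gaps = {}
    for pitcher in sorted({p for p, _ in by_pair}):
        key_a, key_b = (pitcher, catcher_a), (pitcher, catcher_b)
        if key_a in by_pair and key_b in by_pair:
            gaps[pitcher] = mean(by_pair[key_a]) - mean(by_pair[key_b])
    return gaps


# Made-up values purely for illustration.
pitches = [
    ("P1", "A", -0.02), ("P1", "A", -0.04), ("P1", "B", 0.00), ("P1", "B", -0.02),
    ("P2", "A", -0.01), ("P2", "B", 0.01),
    ("P3", "A", -0.03),  # never threw to catcher B, so excluded
]
gaps = catcher_gap_by_pitcher(pitches, "A", "B")
print(gaps)
```

Comparing within each pitcher keeps a catcher from looking good simply because he caught the better pitchers; a real study would also need far more pitches per pairing than this toy input.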
So the most clear kind of pair that I found it for was on the Red Sox.
Sandy Leon and Christian Vazquez.
Basically, every pitcher on the Red Sox was throwing better pitches when Leon was catching.
And that was kind of surprising to me because, I mean, I know that catchers kind of, they choose
where the pitcher is supposed to be throwing the ball. But it seemed to be that they were locating better.
I can't remember if I looked at the stuff numbers,
if they were throwing better stuff or not,
but certainly they were able to just locate their pitches better
when Leon was catching.
I don't know whether that's because he was calling the pitches in a better place,
or whether he was calling pitches that they were more comfortable with
using on that day and they had more command over. But yeah, so it's really interesting how that effect
arises.
Yeah, and that's probably an encouraging result, because Leon is reputed to be a good game
caller, and also he's been just about the worst hitter in baseball over the past three or four
years. So you'd have to think that there is a reason why teams keep playing him, and whether they have actually studied his game calling or not,
certainly they've talked to pitchers and heard that about him.
And I think the only hitter who's been worse than Leon over that same period is Jeff Mathis, who is also supposed to be a great game caller.
And I think your method found him to be
above average, not nearly as much as Leon. But sometimes you want your results to be surprising,
but sometimes you also want them to match, I guess, the accepted wisdom, because that can kind of lend
credence to your method. So it could be any number of things, I guess, that we're calling game calling here,
and it could be managing pitchers in some way, or it could be pitch sequencing, let's say,
but it's kind of encouraging that you came up with catchers mostly who have that reputation,
I suppose. And pitch sequencing is actually something I wanted to ask you about because
that's something where maybe you could apply findings from both of these methods or
models. Is that something that you've looked into at all, what makes a pitch more effective given
what comes before it or after it, let's say? Because that's still sort of a black box that a lot of
researchers have tried to tackle over the years.
It's something that I've considered, but I've never dove into it just yet.
It's certainly a very complex problem because there's just so many sequences that it gets out
of hand so quickly. Trying to make any sort of general statement about it, there's like,
I don't know, eight different types of pitches that you could throw, and then so many different
orders, different locations. I suppose one way of doing it
would be, as I did for the times through the order effect, to look at, okay, we
have these pitch quality metrics; are there certain sequences that cause pitches to overperform
their pitch quality and some that cause them to underperform? But it's not something that I've
considered yet, just because I think it's a really hard problem to solve. And it's probably very batter specific as well. There's
probably different patterns that different batters can be weak to, that catchers might know about,
and some might not. So lots of interconnected effects there that are hard to isolate and find
which sequences are objectively better. I have sort of a broader question across
all of your inquiries here, which is, you know, you said that there is a lot of publicly available
data and it's somewhat underutilized in terms of people trying to answer interesting baseball
questions. But I wonder if you might share, you know, if you could access anything on the team
side, is there like a particular bit of
information we just don't have on the public side that you're particularly keen to get your hands
on? Is there a data set sitting out there that you think if I could just grab that from across
the divide between the public and the team side, I'd be able to answer this interesting question?
I think player positioning is definitely one of those. So I'm really intrigued by StatCast's Outs Above Average model.
And so having,
being able to kind of make my own version of that,
I think would be a really interesting project,
but that's all kind of behind closed doors.
So I can't quite get to that yet.
And then another thing is also all the minor league data that's there.
And teams obviously keep that under wraps for good reason.
But being able to look at kind of how players develop and at the major league level, we basically see the finished products.
But there's probably so much more kind of interesting tweaking that could be done in the minor leagues if we had, you know, full StatCast
data for all of that. So it does exist for one small set of leagues, and I have done a small
bit of looking at that, kind of feeding that through my pitch quality models. But yeah,
if there was like full minor league data for all players going back through time,
that could be really interesting to see the kind of tweaks
that get made and how it improves players.
Well, it seems like one gap that you're trying to close
is the motion tracking gap and pose tracking, and a lot of your tweets of late have been devoted to
your efforts to capture pitcher deliveries from video footage. So if you work for a team and you subscribe to certain data providers,
whether it's StatCast or others, there are teams that get full feeds of, let's say,
a pitcher's delivery based on many points of articulation and very rapidly captured points
in space and time, and they're able to put together these models. With StatCast, there's
kind of a feed that teams can subscribe to over and above the standard StatCast that gives them
that information. And then there are other third-party providers like Simi Motion and
KinaTrax that have systems set up in ballparks that capture this movement data remotely just as players are playing.
And we don't have any of that.
And if we did, it would be tough to parse because it's an enormous amount of data and
you'd have to figure out what to do with it, which is a challenge that some teams are
facing now.
But you are doing your best to make some version of that publicly available.
So what is your method? How have you captured player poses,
and what do you hope to accomplish with that?
Sure. So the main reason I actually got into looking at
pose detection of players was because I was
making a GIF of some batters and I wanted to align them. But because
the camera angles are slightly different on different days, the players were always just
slightly blurry. And this was annoying me quite a lot. And so eventually I was just like, oh,
I'll find something that can, you know, measure where a person is in a video or a photo. And so
there's something called KAPAO. I'm not sure how it's supposed to be pronounced. It's K-A-P-A-O. But it's essentially a machine learning based pose detection algorithm that works quite fast. 10 seconds of video, which is pretty fast when you consider the complexity of the task it has to do.
And then applying that to pitchers' deliveries, for example, I just thought that's a kind of
a data set that the public doesn't have access to. But the potential value in that for analysis of
players, there's just so much you can do with it. So I've just been trying to kind of build up
a kind of small database of lots of different pitchers' deliveries, and then just open that up
to other people to see if they can find anything interesting to do with it. So one example,
it was one of the first things I looked at, was I was linked to a Twitter
post that suggested that Luis Castillo was tipping his pitches early in the 2021 season.
And so I downloaded some of the videos from his starts where he was supposedly tipping
and isolated his pitching motion and split it by different pitch types.
And you could see really clearly in that data a big delay when he threw his sinker,
as opposed to his other pitches.
So I found that a really interesting conclusion just that you could get from the data itself.
And you could maybe expand that to kind of look for pitchers who are tipping kind of more generally
if you had a bigger database. And you can do a lot more stuff to do with maybe looking at
what correlates with injuries: is there anything that happens kind of on a given start before a
pitcher gets injured? And yeah, there's loads of stuff you can do with it. So I kind of set up a
pipeline to try and get as much of this data as possible, which is quite hard to do because
you've got to navigate the Baseball Savant website to find where the videos are hosted and then
download them and things, and then find out when the pitch is happening and where in the video.
So it's quite a few steps, but it seems to work so far.
Have you gotten any feedback from
the team side on efforts there? I'm curious
if, because I imagine that, you know, while they have access to a lot more information,
they're always looking for something that might be able to optimize pitching. So
have you gotten any feedback from the team side? I've talked to quite a few people from teams,
just more generally, rather than being super specific. I think a lot of them are quite
interested in the work, but they have all the data already. So kind of how I get there isn't
as interesting to them. I think they're just more interested in me as a person rather than the stuff
I've been doing necessarily. Yeah, I would be shocked if some teams weren't doing something
similar with looking for pitch tipping, let's say, and, you know, in a legal way, not in a real time in-game video kind
of way, but between starts, let's say that is allowed and you could look for things that
you could then apply in games.
And I guess there are limitations in the way that you're doing this and that you're getting
it from a 2D image, right?
So are you able to extrapolate that to figure out positions in space?
Or is that too complex?
Or is it just kind of inherently limited because you just don't have access to the same data that teams would?
Yeah, so at the moment, it's all 2D data.
I'm not trying to kind of project it into 3D.
I've thought of some ideas of how to do that, but given the frame rates of the video, it's only 60
FPS. So there's only so much detail you can get from that. There's probably something you could
do looking at, I guess all the limbs stay the same length. So if you know how long someone's limb is,
you can probably project kind of how far it is
in the depth of the video.
But it's not something I've considered yet.
And if I were to try it,
I imagine there'd be some horrific creations
made from that, horribly mutilated pitcher motions.
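The limb-length idea mentioned above has a simple geometric core: if a limb has a fixed true length and the camera sees only its 2D projection, the missing depth component follows from Pythagoras. The sketch below is an illustration under strong assumptions, not anything from the actual project: it assumes an orthographic (far-away) camera, a true limb length already expressed in the same pixel units as the keypoints, and a hypothetical helper name `depth_offset`. It also cannot tell whether the limb points toward or away from the camera.

```python
import math


def depth_offset(joint_a, joint_b, true_length):
    """Magnitude of the depth difference between two joints of one limb.

    joint_a, joint_b: (x, y) keypoints in pixels from a 2D pose detector.
    true_length: the limb's real length in the same pixel units (assumed known,
    for example taken from a frame where the limb is parallel to the image plane).
    """
    projected = math.hypot(joint_b[0] - joint_a[0], joint_b[1] - joint_a[1])
    if projected >= true_length:
        # Keypoint noise can make the limb look longer than it is; treat as flat.
        return 0.0
    return math.sqrt(true_length ** 2 - projected ** 2)


# A limb of length 5 that appears 3 pixels long must extend 4 units in depth.
print(depth_offset((0, 0), (3, 0), 5))
```

The sign ambiguity and the noise clamp are exactly where the "horrific creations" would come from: small keypoint errors near full extension turn into large jumps in estimated depth.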
Great.
You could get some very good Twitter content out of that, though, I suppose.
Yeah, I bet.
Apart from the analytical value, it's just kind of cool to see, like, when you recognize
the wireframe pitcher, you know, when it's Chris Sale or Tyler Rogers or someone like
that, that it's like that little flicker of recognition, because even in the stick figure,
you know who that is.
It's kind of like
when PITCHf/x first came about and we were actually seeing these things represented via data for the
first time, it was like, oh yeah, that's how that pitch moves. And that was like before we figured
out what to do with that information, there was just kind of that little just enjoyment of the
recognition of, yeah, here's the thing we've seen in real life that is now captured in this data. And it seems like there's a lot you could do with this in
theory. I don't know if the data quality will be good enough to do it, but I know that teams are
doing it with what they have. I wrote a feature last year about pitcher deception, which is a
subject that has fascinated me for quite a while. And as you must know from your stuff model,
there are pitchers who just have some ability to repeatedly exceed or underperform
what the stuff says that they should do, right?
Their stuff says they should be this good, and they're actually that good,
whether that is better or worse.
And there are a whole range of reasons why that could
be so. But one of them that really fascinates me is deception. And I know that there are some
researchers and some team people who are trying to quantify that just based on this pose-tracking
information and trying to figure out, okay, how long is the ball actually visible when it first
becomes visible to the batter to the point when it's released?
How long or how good a look do you get at it?
Are you hiding it in some way behind your body that makes it tougher to pick up?
Or is it just the release point is uncommon and therefore players aren't ready to hit it, etc.?
So I don't know whether you've looked at that yet or whether you think you can with the quality of the data that you have.
But I would be very interested to see any research along those lines.
Yeah, that's definitely something
that I can try looking at. I mean, it's a really hard problem to answer.
So I've kind of looked at, okay, who overperforms my kind of stuff models and who underperforms,
and trying to find some kind of coherent set of qualities about
either of those populations is just almost impossible, as far as I can tell so far. So
looking for kind of other data sources that might inform on that is always something that I would be
interested in. I haven't found any evidence yet, but I mean, it's only like a stick figure diagram,
so you can't really see
where in their hand they are holding the ball, or is it actually picking up their hand or is it their
wrist? Or, you know, if they're wearing a glove, then their hand shakes all over the place because
the algorithm doesn't know where to look. So it's definitely a tough problem and not something that
I've come close to solving yet.
I think one of the other tough problems that we are trying to sort out collectively is
how to both assess and potentially improve umpiring in Major League Baseball.
And I know one of the other sort of areas of interest for you, and you published a piece
about this at Baseball Prospectus, which we'll link to in the show notes, but was trying
to understand how some
games are themselves harder to umpire than others, irrespective of the sort of base competency of the
umpire involved. And I wonder if you might take us through some of that research, because
Ben and I have long been of the mind that a human ump back there is probably for the best.
We're a little skeptical of the robo ump revolution,
and I found this piece really interesting because it suggested to me that we are perhaps
thinking about umpire calls a little bit incorrectly from a unit of analysis perspective,
right? That we get fussed about the worst blown calls rather than thinking about umpire
performance sort of over the course of not just
an entire game, but an entire season and sort of understanding that some games are themselves
perhaps a little more difficult to call back there than others because of the pitches that
are being thrown. So tell us about games that are harder to umpire than others.
Sure. So this came about because of the umpire scorecards Twitter page, which often blew up when there'd be some game where the umpire had a massive run favor for one team rather than the other.
Yes, we had the creator of that account on the show, actually.
So it was definitely of interest to a lot of people.
And so I was kind of looking at this and thinking, well, a lot of that run value can sometimes just be from one extremely borderline call.
Like if the bases are loaded and it's a 3-2 count and there's a pitch that grazes the edge of the zone, then it's probably a 50-50 call.
And if the umpire gets it wrong according to where StatCast said the ball was, then that could be a change in run value of plus one run for the team that gets the benefit of the call. And so I was kind of
interested in, okay, are there a lot of games where that happens? And does the game itself
actually tell us more about what the ump scorecard is likely to show than the umpire's quality. And so to do this I built a
model for called strikes just based on where in the zone the ball lands. So for each pitch that
was taken you can assign a probability of an average umpire calling it a strike. And one thing
that's immediately obvious is that the shape of that zone isn't the same as the rulebook zone.
It's a bit fatter, a bit shorter, and more rounded in the corners.
So some bad calls by umpires where they don't call a strike that clips the corner of the zone, that might only get called a strike 10% of the time.
So is it really an inconsistent call if some umpires
don't call that? And you can aggregate this over the course of a game and see that in fact a lot of
what comes out from the umpire scorecards and a lot of the kind of run favour is almost determined
by where the ball lands anyway and the differences between this average umpire zone and the rulebook
zone. So there's definitely quite a few cases. I think it was a Dodgers-Giants game, the one that
was the offender that spurred me to make the model, but I think that was biased towards the
Dodgers by a couple of runs, and I simulated it with fake umpires
assigning ball strike calls according to these probabilities that I'd made. And most of those
favoured the Dodgers as well, so it wasn't the specific umpire's fault that he gave the
Dodgers more runs than the Giants in that case. An average, impartial umpire would have done the same.
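The "fake umpires" simulation described above can be sketched roughly as follows. Each taken pitch carries the probability that an average umpire calls it a strike, whether it was in the rulebook zone, and the run-value swing between a strike call and a ball call. This is a minimal sketch, not the published model: the dictionary keys, the helper names `favor_for` and `simulate_favor`, the team labels, and every number in `pitches` are invented for illustration.

```python
import random
from statistics import mean


def favor_for(team, pitches, calls):
    """Net runs of umpire 'favor' for team, scored against the rulebook zone."""
    total = 0.0
    for pitch, called_strike in zip(pitches, calls):
        if called_strike == pitch["in_zone"]:
            continue  # the call agrees with the rulebook zone, so no favor
        # A missed strike call helps the pitching team; a missed ball call helps the batters.
        beneficiary = pitch["pitching"] if called_strike else pitch["batting"]
        total += pitch["swing"] if beneficiary == team else -pitch["swing"]
    return total


def simulate_favor(team, pitches, n_sims=2000, seed=0):
    """Replay the same pitches with average, impartial simulated umpires."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_sims):
        calls = [rng.random() < p["p_strike"] for p in pitches]
        results.append(favor_for(team, pitches, calls))
    return results


# Made-up game: LAD pitchers live just off the edge (often called strikes),
# while SF pitchers clip corners that average umpires rarely call.
pitches = (
    [{"p_strike": 0.7, "in_zone": False, "pitching": "LAD", "batting": "SF", "swing": 0.3}] * 5
    + [{"p_strike": 0.3, "in_zone": True, "pitching": "SF", "batting": "LAD", "swing": 0.3}] * 5
    + [{"p_strike": 0.9, "in_zone": True, "pitching": "LAD", "batting": "SF", "swing": 0.3}] * 5
)
sims = simulate_favor("LAD", pitches)
share_favoring_lad = mean(1.0 if s > 0 else 0.0 for s in sims)
print(mean(sims), share_favoring_lad)
```

Because the simulated umpires have no bias at all, any consistent favor in the output comes purely from where the pitches were thrown, which is the point being made about reading a single game's scorecard.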
And so yeah, it's just something that I found quite interesting as kind of context for these
scorecards. And some games really are really hard to call accurately. If they have loads of
borderline calls with men on base in deep counts, then the umpire is just going to be wrong in one direction
by quite a lot, for sure. So yeah, that was something that I thought was a really interesting project,
and Craig from Baseball Prospectus saw my tweet on it and said, oh, you should write this up into
an article, and yeah, so that's how that came about.
So last subject before we let you go: if there is one subject about
baseball that has resisted being measured and explained over the past several years, it is the
ball itself. I know you've done some work on predicting the difference in offense after the
supposed deadening of the ball, and you've also tried to do a little work looking into the idea that there were multiple
baseballs in play last year, multiple forms of baseball. So what have you been able to determine,
and what is still something of a mystery?
Sure. So my investigation into the ball was primarily
looking at the distances that fly balls traveled in 2021 versus previous years.
So MLB said they were going to deaden the baseball
so that hopefully there won't be so many home runs
as there have been over the past few years.
And I mean, we have all the StatCast data.
So I was interested to see, well, can we measure the effect of the deadened ball
and see kind of how it might have changed the game
and what that might mean going into future seasons?
You know, what changes might be sticky?
What changes were a bit more kind of random?
So I essentially recreated some work that was done by Alan Nathan, I think,
and that was published at Fangraphs,
looking at predicting the distance of fly balls
and what kind of different factors affect how far the ball travels.
So there are the obvious ones such as launch angle and exit velocity, but then spin on the ball actually has a big effect.
So when the ball spins, you get more drag and so it doesn't travel as far. Unfortunately, I don't have access to batted ball spin data,
but it correlates relatively well with pull angle.
So if you pull the ball, you hit it more squarely, I suppose.
And so it spins less and travels further.
So you can kind of roll all these into a model.
And I also looked at adding the weather.
So the wind direction, the temperature,
how that might affect things as well. So you have all these different factors. And so I built a model to predict how far the ball would travel in 2020 and earlier based on all these factors.
And then I redid all the models, but only using 2021 data. And so by looking at the differences in the model
predictions, you can say kind of what changed about the ball. And the main things that I could
find out were that, well, the ball was traveling less far, which I guess agrees with the fact that it was
deadened, so that's a positive: the effect was in the correct direction. And it was balls that were hit at more extreme launch angles that were being deadened further.
So either kind of flatter, kind of more line drive-y type hits with top spin, they were traveling less far.
And as were, you know, the really skied fly balls as well.
So those are the ones that I think are more likely to have spin on them.
And so, you know, if you have more spin and the drag on the ball in 2021 was increased,
then that might make those balls travel less far.
And I was like, okay, maybe it is that the drag has increased.
And when you compare with the effect of the weather and how that's changed,
it does seem to be that the 2021 ball is kind of much more kind of draggy
and is affected by air resistance and the wind and spin more than the previous balls.
So looking at, say, so I split it by ballpark.
And so Wrigley Field was most affected by wind direction and temperature by far.
It must be super, super exposed there.
But the effect of the wind almost doubled, as far as I could tell, in Wrigley with the new ball compared to the old ball, in terms of the effect of the wind direction and strength on fly ball distances.
So I found that quite surprising, the
magnitude of the effect that was changing there. So yeah, definitely some changes. Yeah, and
one other aspect of it was, so there was the whole controversy a few weeks ago with the
two baseballs, right? And I don't have anything to say on that matter based on this
research. Basically, the kind of errors of the model, which I think were mainly caused by
me not having access to spin data, mean that there isn't really that much I can say about whether
there are maybe two populations of balls or not. I was thinking, so what happened in 2021 that was different to all the previous years
was that the errors on the model got way bigger for some reason. And so I was thinking, okay,
maybe that's because there are these two populations of balls. And so, you know,
when you sample from different populations, you're going to get a larger spread in the effects that come out.
So maybe that is evidence for two baseballs.
But after doing some more research on it, I think that's primarily because the balls are more affected by wind and spin than they used to be.
So essentially that just smears out the distribution a bit more.
And so it effectively means my model is less accurate because the balls are being pushed around
by all these different forces to a greater degree.
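The reasoning above can be illustrated with a toy simulation. This is only a sketch under assumed numbers, not Cameron's model or data: it shows that residuals get wider either when balls come from two populations with different average carry, or when a single population is simply noisier, so a larger error on its own cannot distinguish the two explanations. The function name, the 12-foot `mean_shift`, and both standard deviations are hypothetical.

```python
import random
import statistics

def simulate_residuals(n, sd, mean_shift=0.0, seed=0):
    """Return n simulated fly-ball distance residuals, in feet.

    Half the balls are offset by +mean_shift/2 and half by -mean_shift/2,
    so mean_shift=0 means a single population. All values are hypothetical.
    """
    rng = random.Random(seed)
    residuals = []
    for i in range(n):
        offset = mean_shift / 2 if i % 2 == 0 else -mean_shift / 2
        residuals.append(rng.gauss(offset, sd))
    return residuals

# One population with the "old" amount of scatter.
baseline = simulate_residuals(20000, sd=10.0)
# Two populations whose average carry differs by 12 feet.
two_balls = simulate_residuals(20000, sd=10.0, mean_shift=12.0)
# One population that wind and spin push around more.
noisier_ball = simulate_residuals(20000, sd=11.66)

spread = {
    "baseline": statistics.pstdev(baseline),
    "two_balls": statistics.pstdev(two_balls),
    "noisier_ball": statistics.pstdev(noisier_ball),
}
for name, value in spread.items():
    print(f"{name}: {value:.2f} ft")
```

In this sketch the two-population case and the noisier single-population case end up with nearly the same spread, which is why extra information such as spin data would be needed to tell them apart.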
I see.
So some mysteries remain.
Exactly.
Yes.
Is there anything we haven't touched on yet
that has been an area of interest for you
over the past few years
that you have looked into already
or that you're still hoping and planning to at some point?
I think you've covered most of it, to be honest. I mean, I tweet most of the stuff that I end up
doing research on, so there isn't much unturned as of yet that I've thought about. I guess the
limit kind of comes from the data that's available. So if there's potentially more interesting data sources coming out in the future,
then maybe there'll be new questions that can possibly be answered.
Well, last thing then, Meg asked what you would be most tempted by as far as team databases. And
you noted that some teams have talked to you, which is hardly surprising. The conceit of this
series, or one of them, is that there's still really interesting and
valuable public baseball research being done and that not everyone has been immediately
snapped up by teams.
But multiple people that we've reached out to talk to this week have already mentioned
that they have been contacted by teams, as one would expect.
And often that's the case.
You see some promising researcher appear on the scene
and publish a few things and you get excited about what they'll work on next and then suddenly they
disappear and you are of course happy for them that they got to pursue that if that's something
that they want to do but sorry that everyone else is deprived of their insights. Now, for you, you have a career of your own that's separate
from baseball and you're in England. So presumably you would have to either convince a team to let
you work remotely or move and switch continents. And I don't know how much you want to do that,
but what are some of the considerations for you there? Is working in baseball an ambition for you
and something that you think
you might want to do?
Well, one thing that I do know is that I don't want to stay in academia.
So baseball is definitely an option. I've got one year left on my PhD, so at the end of this year
I'll kind of be writing up my thesis and hopefully that'll go well. But after that, I'll be looking for real jobs.
I would like to stay in the UK.
So that's definitely a big consideration.
Moving to America would be a big change.
I mean, I've been there on holiday, but that's a bit different.
Right.
So, yeah, I'm not sure at the moment, but I guess we'll see what the future holds.
Yes, you remind me a bit of Rob Arthur, whom I've worked with in multiple places. And he came from a genetics background and was kind of studying that and maybe going into academia and decided not to do that and to kind of give himself over to baseball research and other journalism work.
And fortunately for all of us, he has contented himself with consulting for teams so that he has still been able to be a
public researcher to some extent. So selfishly, I kind of hope that that's what happens with you,
but I wish you the best whatever happens and we'll enjoy your work while we have access to it. So
I will link to all the things that we have talked about and
all of the various places you can find Cameron on our show page as usual. But thanks for coming on
and we're glad that you discovered baseball.
Thanks for having me.
Okay, we'll take another quick break now and we'll be back in a moment with Eric Chalek to
talk about developing major league equivalencies for Negro Leagues players and other players from the Black baseball game.
So less than a month ago we had Adam Darowski on the show, Adam from Baseball Reference and from the Hall of Stats, his website that has a purely statistically based version of the Hall of Fame, who is the most deserving of being enshrined purely based on the stats.
And at the time that we talked to him, the Hall of Stats did not yet include Negro Leagues players.
And now it does.
And that is largely
because of the work of our guest today. He has been on Adam's podcast. Now he is on our podcast.
He is Eric Chalek. Hey, Eric, welcome.
Hi, thanks for having me.
So we are talking to people who have tried to measure the unmeasurable this week. And really,
if there's anything that is unmeasurable, it is the hypothetical question of how would Negro Leagues players who were barred from playing in the AL and NL during their careers, how would they have performed if they hadn't been barred? That is a very difficult question to answer and one that we'll never know the answer to for sure, but you have taken a crack at figuring out the answer to that question as
best as we can determine it. So this work relies on the concept of major league equivalencies,
which is an old Bill James idea that has been applied for decades in various ways. So for those
who do not know, can you explain the concept of MLEs and how they have been used historically?
Sure.
So Bill James created the MLEs and wrote about them in the 1985 Baseball Abstract.
And at the time, he was talking about minor league players.
And he used examples of Dick Schofield and Tony Fernandez.
And since those were guys that I grew up following, I feel really old.
Anyway, he showed that if you look at the run context in which they're playing and the park in which they're playing, then with a certain multiplier or discount, you can get a pretty good sense of what their seasons would have looked like in the major leagues.
And Dan Szymborski has done a lot of work with that with ZiPS, and a lot of other systems do too. So people have been evolving that method, as you said, for decades.
The hard part with the Negro Leagues is that not only is it difficult to measure,
but you need multiple measuring sticks and multiple
ways to think about how to use those measuring sticks. And so what I've tried to do is to look
at a Negro Leagues player and begin by asking, what is the outcome that I want to get from this?
And the outcome is, if I dropped, say, Josh Gibson into the National League in 1933,
what would his performance from the
Negro Leagues look like translated to 1933 National League? And the answer is that,
to get to that answer, we have to strip away as much of the context as we can from his documented
play against top teams, and then we need to recontextualize it into the National League of 1933.
So we need to account for his park, and we need to account for the value of a run in
his league.
We need to account for some pretty niche-y things like the standard deviation of performance
in the league, because the Negro Leagues weren't as uniform top to bottom as the
major leagues were. In fact, there's quite a bit more variance. So there's a lot of little nuance
that goes into it. But in the end, we come back with Josh Gibson being able to put out this much
value. And then we can take that value statement and actually use another Bill James tool from the Historical Baseball Abstract,
the new one, new back in 2001, and use the method he outlined in the Willie Davis and
Sam Crawford comments in that book to then project what a stat line would look like,
a traditional stat line would look like.
So we can do an awful lot more than you would have thought based on what looks, you know, Negro League stats look a little eccentric to our sort of, to the eyes that are used to Major League stats because Major League stats are so uniform and there's totals for everything and everybody plays the same number of games, yada, yada, yada.
And in the Negro Leagues, we just have to make some different sorts of decisions about how they work with the numbers, because they didn't all play the same number of
games, and they had different levels of competition. Sometimes they had different
parks during the same year, all kinds of wacky stuff.
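The steps described above (strip out park and league context, account for the spread of performance, then recontextualize into the target league) can be sketched roughly as follows. This is an illustrative assumption, not Eric's actual method: the function name, the order of the adjustments, the choice of runs created per 600 plate appearances as the unit, and every number in the example (including the 20% quality-of-play discount) are hypothetical.

```python
def translate_runs_per_600(
    runs_per_600,      # player's runs created per 600 PA in the source league
    park_factor,       # 1.00 = neutral park; above 1 favors hitters
    src_mean, src_sd,  # source-league average and standard deviation
    dst_mean, dst_sd,  # target-league average and standard deviation
    quality_discount,  # e.g. 0.20 for a hypothetical 20% discount
):
    """Sketch of a major league equivalency for one hitter-season."""
    # 1. Strip out the home park.
    park_neutral = runs_per_600 / park_factor
    # 2. Express the player relative to his own league, in standard deviations.
    z_score = (park_neutral - src_mean) / src_sd
    # 3. Recontextualize into the target league's run environment and spread.
    recontextualized = dst_mean + z_score * dst_sd
    # 4. Apply the quality-of-play discount.
    return recontextualized * (1.0 - quality_discount)

# Hypothetical inputs: a dominant hitter in a high-variance league.
mle_runs = translate_runs_per_600(
    runs_per_600=150.0, park_factor=1.00,
    src_mean=75.0, src_sd=30.0,
    dst_mean=80.0, dst_sd=25.0,
    quality_discount=0.20,
)
print(f"Translated runs per 600 PA: {mle_runs:.1f}")
```

Dividing by the source league's standard deviation is what keeps a wider spread of talent from inflating the translated number: the same raw gap over league average counts for less in a league with more variance.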
I can appreciate why translating to a sort of familiar context is useful to us,
particularly within the context of trying to assess the Hall of Fame cases of these players. I am curious how you thought about what the absence of Negro
Leagues players from the major leagues means in terms of our understanding of the quality of play
there, right? Because obviously there were a great many players who were very good who were playing
in the majors, and the reality is that they did not have to play against the players who were
kept in the Negro Leagues because of the color line.
So how did you think about sort of the question of difficulty and quality of play when it comes to that sort of interchange?
Because I imagine it's quite tricky.
Meg, that's a great question.
I've wrestled with it quite a bit.
And it's actually two questions that you asked in one.
One is, what's the quality of play in the Negro Leagues?
And then the other question is, what does that mean for the quality of play in the baseball universe of the time?
So I'm going to take it in the opposite direction I just said.
In terms of how segregation impacted the major leagues, and how segregation affected the quality of play in the Negro Leagues,
I've done a study on my own that suggests that we're talking an effect of
approximately 10 runs above average over the course of the year. So if Ted Williams was
worth 100 runs in the 1940 AL, and of course he's probably worth more, but 100 runs in the 1940 AL, in reality, if we had had integration at that time, we're looking at more like 90 runs. And the same is true for players in the Negro Leagues. If someone was worth 100 runs there, they'd be worth 90 runs instead when you bring the two leagues together. And it has to be that way, as you suggested, because we bring together the two leagues,
and now we are putting the best of the Negro Leagues in with the Major Leagues,
which was a much larger talent pool.
And we're talking about displacing something like 30,000 to 45,000 plate appearances
and innings pitched.
I should say outs, pitching outs. But we're talking about a lot of playing time. And so it's,
it's actually a pretty pronounced effect. And I did a really basic study where I just replaced
players from the major leagues with MLE players and just directly. And I'm sure there's probably a more scientific way to do it,
but that was the one I had at hand.
And like I said, it suggested about 10 runs a year
per, you know, like 600 plate appearances.
Now, the other question about the quality of play
in the Negro Leagues is thorny too.
There are lots of different ways in which data is suggesting that the Negro Leagues
are around AAA, they're better than AAA, they're worse than AAA. Because Negro Leaguers played in
a lot of different places, Cuba, Puerto Rico, and all these other places. So we have to do a lot of
sort of disentangling. There's a
researcher I know who insists that I've got quality of play pegged completely wrong, that I'm
too low. I come in around a discount rate of 20% for most Negro League seasons. That's where I'm
at right now. I'm doing some research right now that could change that. And it's an ever-evolving process because
there's so much we don't know that we're going to always keep trying new things to see what we can
know. So it may change. But right now, it's around AAA.
Yeah. And so even though multiple leagues are designated as major leagues and are
certainly major league quality, you would not expect them to be exactly the same.
I mean, the AL and the NL sometimes are different quality, right?
And so it would be surprising if the Negro Leagues were at exactly the same caliber of play.
Given the conditions there, the challenges they faced, the smaller player pool probably that was available to them, etc.
Clearly, there were many, many, many players who would have been
more than adequate, would have been stars and among the very best players in the AL and NL at
the time. And we know that based on what happened after the color barrier was broken or began to
crumble, which took some time. But in the decades after Jackie Robinson, so many of the players in
the AL and NL were players who previously would not have been
allowed to play there. So you know that there were many such players prior to the breaking of the
color barrier who also would have fit that description. And as you are making these MLEs,
you are not trying to come up with what would everyone have done in a fully integrated league
at the time, which would have been a very difficult
question to answer, I imagine. I mean, the question you're trying to answer is very difficult to begin
with. But if you were trying to say, okay, everyone is in the same league, then you'd have to make
MLEs for the AL and NL players as well as the Negro Leagues players too. And presumably that
would just be an incredibly complex question,
and maybe one you don't even want to answer because obviously that wasn't the reality
at the time. So you are just kind of projecting what would happen if you dropped in one particular
player into those leagues that otherwise would still be segregated, right?
That's right. I think about it like if you had a rock and you threw it into a pond,
you'd see these nice ripples go all across the pond. But if you had a handful of rocks and you threw them into the pond, you'd have just chaos because all these rocks would hit and you'd have all these ripples going all over the place and nothing would look very pretty.
So I think of the MLE as that one rock, and I'm throwing it into the pond, and the pond
is the major leagues, or specifically the NL or AL. And then I can get a sense of how the player's performing and rippling across the league.
But, like you said, if I have to do that for everybody,
it's going to get real messy real fast.
Right. So I imagine that, you
know, you mentioned that there are a number of challenges that are attendant with this project,
and I think we can anticipate some of them. I'm curious if there were particular bits of
consternation specific to the position players versus the pitchers and trying to get everyone
on sort of ground that you could agree with so that you could figure out what their place would
be in Major League Baseball.
Yeah, you know, that's a really, really neat question. And I think the big
difference is that with pitchers, you start with runs. And when I started this, I thought,
I want to use runs, not batting average, not home runs, not slugging. I want to use runs because
runs translate from place to place, from league to league,
from time to time. So with pitchers, you've got runs and you have runs allowed per nine innings.
It's pretty easy to work with in that way. With the hitters, then you have to go and figure out
what their production is, what their rate of production is. And it's a little more involved.
So what I do for them
is I take their statistical inputs and I turn them into weighted on-base average. And so that's sort
of a proxy for what runs allowed per nine would be for pitchers. But once you're able to sort of
get it down to runs, it becomes simpler. And weighted on-base average translates to runs and to runs above average,
as you both know, so easily that it's a great tool to sort of equalize.
And then once you've got the runs, though,
then it's really about figuring out what the contexts are.
Then it's about the park and the league,
and it's about standard deviation of performance in weighted
on-base average or in runs allowed per nine or what have you. So that's the
place where it has to start, with the runs, and then all the little adjustments can happen.
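For readers who want to see how both sides get onto a runs scale, here is a minimal sketch. The hitter conversion is the widely published weighted runs above average formula, (wOBA minus league wOBA) divided by the wOBA scale, times plate appearances; the pitcher conversion is a simple runs-allowed-per-nine comparison. The function names and all example inputs (league wOBA, the 1.20 scale, league RA9) are hypothetical values chosen for illustration, not figures from Eric's work.

```python
def batting_runs_above_average(woba, league_woba, woba_scale, plate_appearances):
    """Convert a hitter's wOBA into runs above the league-average hitter."""
    return (woba - league_woba) / woba_scale * plate_appearances

def pitching_runs_above_average(ra9, league_ra9, innings):
    """Convert runs allowed per nine innings into runs saved versus average."""
    return (league_ra9 - ra9) / 9.0 * innings

# Hypothetical season lines, not real players.
hitter_runs = batting_runs_above_average(
    woba=0.400, league_woba=0.320, woba_scale=1.20, plate_appearances=600
)
pitcher_runs = pitching_runs_above_average(ra9=3.50, league_ra9=4.50, innings=180)

print(f"Hitter: {hitter_runs:+.1f} runs above average")
print(f"Pitcher: {pitcher_runs:+.1f} runs above average")
```

Once both are expressed in runs above average, the park, league, and standard deviation adjustments can be applied to hitters and pitchers in the same way.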
And you have some very thorough explainers of your methodology on your website, which I will link to
on the show page for anyone who wants to get into the nitty-gritty.
Or anyone who wants to fall asleep.
Right. Yeah, it might not work as well on a podcast, perhaps, but we'll give you the
CliffsNotes here. So you mentioned that MLEs have often been applied to players going from
the minors to the majors, and that is something that actually happens, right? That is not a
hypothetical. Players get promoted from AA to AAA, from AA to the majors, from AAA to the majors. So you can just sort of see
what the exchange rate is in reality. Now, when it comes to Negro Leagues players and players from
the white major leagues at the time, I know there's been a lot of great research that's been
done to look at what happens when those players would play each other in various exhibitions and barnstorming. And often it seems like the Negro Leagues players
held their own or more, although there are always complications with the makeups of those teams and
the effort levels, et cetera. So how do you figure out what the exchange rate is, what the league
quality difference is when you have a situation like
this where obviously there were no players for a time, at least, who were going from
the Negro National League to the National League, let's say.
Yeah, that again is a really great question.
And so what I tried to do was to fit the quality of play discount into a structure that I understand
and that is widely accepted, and that's the minor leagues to major league structure.
You know, that single A and double A and triple A each have their own level of play,
and then the major leagues are sort of one, and everything is discounted from that.
And it's not a perfect match, but we are starting to make some
headway on ways that we can improve that. That said, there's some very strange things that happen
when Negro Leagues players went into organized baseball. For example, especially with pitching,
we know that the hitters had a fairly smooth transition to organized baseball. And if
you look at the National League of the 1950s and 60s, almost all of the best players are
African-Americans who would have or were in the Negro Leagues. So there's no question that the
top end of the Negro Leagues translates easily. The problem is that then the pitchers don't
translate so easily in terms of the success we see or don't see.
So if you look at hitters, there's like 30 or 40 hitters.
And once they made the majors, they averaged careers of roughly average major league value.
You've got guys who are way off the charts, you know, but you also have scrubs, you know, like Curt Roberts, who didn't make a big impact.
With pitchers, though, there were really only four successful pitchers who went from the Negro Leagues to the Major Leagues, and that's Toothpick Sam Jones, Connie Johnson, Satchel
Paige, and Don Newcombe.
And that's it.
There are other pitchers who played in the Major Leagues, but they weren't very good.
So I looked then at the performance of the hitters and the pitchers in AAA and in AA and in single A level leagues.
And the pitchers at the major leagues, other than those four guys, or including those four
guys, were way, like 20 runs below average for a season, career-wise.
Not a season, but for their career.
When you get to AAA, it's not much better, but the hitters are outstanding at AAA,
as befits the fact that they were at least major leaguers on average at the major league level. So when you go down the minor league chains, you see that not until you get to double A-level leagues,
or probably what were then probably called A or B leagues,
do we see that the pitchers are holding their own.
And I've thought about this again and again and again.
And so on one hand, it tells you that there was some kind of difference between the pitching in the big leagues or organized baseball and pitching in the Negro Leagues.
And then there's also this question like, OK, well, if the hitters were really going great guns in organized baseball and the pitchers weren't, what does that say about the quality of play?
And I'm still working through that. But here's what I can tell you. In the Negro Leagues,
it seems like the defensive spectrum was different than we perceive it in the major leagues, and that
teams tended to select for their most athletic players first and put them at either shortstop or
center field depending on their handedness. And then anyone who could catch went to catcher, and
then they selected for pitchers.
And I think that's very different than what we understand the defensive spectrum
to be in the major leagues, where at least I perceive it as there's pitchers
and there's everyone else, and then we start moving down the defensive spectrum.
But that's not how it was in the Negro Leagues,
because the talent pool was smaller and the needs were different.
So the pitchers in the Negro Leagues weren't necessarily the best athletes. And they didn't take the balls out of play as often as they did in the Major Leagues. And there's lots of narrative out there about how by the last three innings of a game, you'd be hitting a mushy sweet potato that was all dark and cut up instead of a baseball.
So a pitcher who can get by on guile and doesn't have a great fastball can make it for the first few innings of a game in the Negro Leagues,
and then they have to start relying on the fact that the ball is defaced, it's mushy,
and using trick pitches like using the cuts in the ball to change the spin on the ball, things like that.
Things they couldn't do in organized baseball.
So then on top of that, you have the fact that on average, I'm pretty sure
that parks in organized baseball were a little bit smaller than those in the Negro Leagues.
And the Negro Leagues tended to play at cities that were very near sea level. And that wasn't
the case in every organized baseball league. So you've got just a whole bunch of cards stacked
against pitchers going from the Negro Leagues to organized baseball. And so it's really hard to say from that
what that really means about quality of play, especially when you've got the hitters doing so
well. And so my compromise so far has been, all right, we've got guys who aren't performing well
in AAA and may not really be better than AA pitchers. Then we've got guys over here in the majors who are
high-quality players. In between there somewhere is around AAA level. You know, it's not perfect
science, unfortunately. I can't get too much closer than that yet.
I imagine that the sort
of increase in proliferation of available data and box scores
around Negro Leaguers made a huge difference in your effort here. I'm curious if there are,
and this perhaps is in one of your 6,500-word explainers, so my apologies if I missed it,
but I'm curious if there are other pieces of information that you are keen to sort of add
to this project that you think might
improve the precision or specificity of some of these equivalencies?
Yes. So the work of Scott Simkus and Kevin Johnson and Gary Ashwill and Larry Lester and
all of the gentlemen and ladies who have contributed to the Negro Leagues database at
Seamheads.com is amazing. And without it, none of this is really possible.
And the reason it's not possible is that until their work went up, we didn't have a lot of
information about league totals.
It's very difficult to find anything that says that, for example, you know, the NNL stole, who knows, like 300 bases in 1926.
You just can't find that stuff.
But you need that stuff in order to compare players against their own league so that you can then translate them into a major league.
And so once that stuff began appearing, those league totals began appearing at Seamheads,
then we had the ingredients where we could really start cooking.
And so that's huge.
It's huge.
And there's holes in the data.
There aren't stolen bases for every season.
There isn't hit by pitch for every season.
Not every box score has been uncovered.
And obviously that's an ongoing task that the guys at Seamheads are still working on. I think that in terms of missing data,
there are individual players where we just don't have anything because they went off to some semi-pro league. There's a guy named Heavy Johnson who was a tremendous hitter. And unfortunately,
he played like six or seven years in the Negro
National League in the 20s, starting around age 26 or 27. The preceding six years, he was in the
25th Infantry Wreckers, which was an army division that just basically played baseball. So we don't
have any stats for that. And we don't have any stats for 1929 and 1930, because he went off to
the Northwest to play semi-pro
ball. And so there's people like that where we're just never going to get a lot of the information.
But the information that could come through, that I'm pretty sure the Seamheads guys
are working on, is a full accounting of the Cuban Winter League. Now, there's about 20 seasons in their database.
And getting a fuller accounting of that will really help
because it's more plate appearances.
It's a bigger sample.
And the bigger the samples get,
the more confidence we can have.
And the same thing is true for the Puerto Rican Winter League.
There's currently no information on that available that's usable. And if that comes online, that'll provide some really good,
it'll really beef up the sample for a lot of latter-day players. And will give us insight
into players like Perucho Cepeda, the Bull, Orlando Cepeda's father, who currently we have
zero stats on. But this guy's a legend,
and I'd love to know more about him. So there's things like that that are likely on the way
someday. Gary and Kevin have mentioned to me that that's a goal of theirs. And really, anything we
can do to increase what we know and what's documented is going to be helpful.
And if you go to Seamheads or you go to Baseball Reference, then you see our best accounting available now of the official league games
that those players played. Of course, they played in many exhibition and barnstorming contests too.
And those schedules, the official league schedules were considerably shorter than what people are
used to with AL and NL schedules of today or of that time.
And of course, you want to preserve that difference because you want to remember the reason why these
players were playing in a different league, why they had these shorter schedules. You don't kind
of want to pretend that this is some alternate happy history where there was no color barrier.
But I think one of the dangers of presenting these stats the way they are without anything
else, which I think it's wonderful to have them available, and I hope that it does lead
to more people discovering these players and their names and their accomplishments.
But you could inadvertently lead to some of these players being underrated because people
might look at their counting stats or their WAR or whatever and
will see lower totals than they expect to or than the greats of the AL and NL of the
time.
And so one thing that I think your work allows us to do and allows Adam to do at the Hall
of Stats is to present these things on somewhat of a comparable playing field here in terms of totals, in terms of playing time.
And they may still be conservative, as Adam explains here, but they at least look a little
more like you would expect these stats to look, like the career WARs to look. So I don't know
if you want to run through an example or two, say take a Josh Gibson or a Satchel Paige or any player you want to pick and [...]
[...] Seamheads with 230-some home runs and, you know, tremendous batting average and all of that. [...] him through the process, what we get is a guy who has about 9,000 plate appearances
and 83 WAR, and that's a pretty damn good player.
And we're talking about 500, almost 600 career batting runs.
And man, that's not chopped liver, you know, that's high-end
stuff. And what that translates to in terms of traditional stats is a .917 OPS, a 160 OPS+,
435 home runs. He's just, you know, he's a monster. He's a great hitter. And it's true that
MLEs are slightly conservative by nature because we have to use a lot of measures of central tendency because we don't have all the
data. And we have to, in some cases, you know, in a season when somebody has fewer than 200 plate
appearances, I try to beef the sample up by using surrounding seasons or using their career averages to increase the sample size so that we're not giving 100 home
runs to a guy who has 10 plate appearances and hits two or three home runs. So it is a little
conservative. So could Josh Gibson have hit 500 home runs in those 9,000 plate appearances?
Absolutely. Absolutely. The fact that I've got
him down for 435 simply means that that's what my math is saying right now. But as more data
comes through, we could see a whole different look. I mean, we could see totals shooting up.
We just don't know. And we're not going to know until the data does come through. I think that it's important
that what you said is really important, that seeing numbers that are familiar looking,
putting this in a familiar context, brings Josh Gibson to life in a different way.
When I can look and say, gee whiz, I mean, you know, the only people who had hit 500 home
runs by the time that Gibson had retired were, you know, Babe Ruth, Mel Ott, and Jimmie Foxx,
and he's at 435. That's one heck of a slugger. It puts some frame of reference around Josh Gibson.
When we say things like, well, he, you know, hit 800 home
runs against all competition. Well, Babe Ruth hit a thousand home runs against all competition in
all likelihood. I don't know the exact, the exact count. So, so we're, you know, how does that all
fit together? And it's hard to say, but when we have a, when we have a solidly, internally consistently derived figure, at least we can say, hey, we estimate that he's around 450 home runs.
Dang good player.
That connects me from the legend of Josh Gibson to what kind of player he really was.
And I happen to have Gibson open because we were talking about it.
Give me a second to pull up Satchel.
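The small-sample handling Eric describes in this answer (seasons under 200 plate appearances get beefed up with surrounding seasons or career averages) can be sketched like this. It is one simple way to do it, assumed for illustration: the short season is padded up to the threshold with plate appearances at the player's career rate. The function name, the padding approach, and the example numbers are hypothetical, not Eric's actual procedure.

```python
def padded_home_run_rate(season_hr, season_pa, career_hr_rate, min_pa=200):
    """Home runs per plate appearance, padding short seasons to min_pa.

    Seasons at or above min_pa are used as-is. Shorter seasons are filled
    out with imaginary plate appearances at the player's career rate.
    """
    if season_pa >= min_pa:
        return season_hr / season_pa
    padding_pa = min_pa - season_pa
    return (season_hr + career_hr_rate * padding_pa) / min_pa

# Hypothetical player: 3 home runs in only 10 documented plate appearances,
# with a career rate of one home run every 20 plate appearances.
raw_rate = 3 / 10
padded_rate = padded_home_run_rate(season_hr=3, season_pa=10, career_hr_rate=0.05)

print(f"Raw pace over 600 PA: {raw_rate * 600:.1f} home runs")
print(f"Padded pace over 600 PA: {padded_rate * 600:.1f} home runs")
```

Padding like this pulls extreme short-season lines back toward the player's established level, which is one reason translated totals tend to come out conservative.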
Sure.
Okay, so with Satchel Paige,
we're actually missing something like 1,000 innings of Satchel Paige's career,
believe it or not.
And I say believe it or not because we already have somewhere around 2,000.
But he pitched in the California Winter League,
and he pitched in Cuba, and he pitched in Puerto Rico,
and he pitched in a whole bunch of places that we just don't have numbers for yet. But 2,000
innings is a heck of a big sample. And so we can have some confidence that we're getting
good numbers, especially because we also have his numbers in the major leagues and in the minor leagues. So we know what kind of a pitcher he was late in his career.
So when we look at him through the MLE lens,
I'm getting a total of about 4,500 career innings.
And I'm seeing him saving about 530 runs more than an average pitcher.
And that comes out to about 95 WAR.
And man, that's a lot.
And when I run that through the traditional stats machine, I get about 300 wins.
Now, those 300 wins are assuming that he's playing for average teams every year.
If he was playing for the Yankees, he'd be somewhere around 350, 375 wins.
There's so much context around wins that we're just sort of saying on an average team, that's what that looks like.
If he had Whitey Ford's Yankees behind him, he'd have had a winning percentage much like Whitey Ford's or better.
He was an amazing pitcher.
He probably would have had somewhere around a 127 ERA plus and probably would have pitched around
almost 900 games. I mean, after all, he pitched from age 20 to 46 nonstop and then in 58 made a
three-inning comeback with Charlie Finley. So this guy's quite a pitcher. And of course, there's no one really
like Satchel Paige. He's an institution unto himself, but he was a darn good pitcher. And he
and Lefty Grove are really the only two pitchers between Walter Johnson and Tom Seaver who have
the kind of career where we can ask, were these the best guys between Walter and Tom?
Do they have a place in the discussion among the all-time great pitchers? I think, you know, the
other obvious move here would be to use those numbers to help us put these guys in context when
it comes to the Hall of Fame. And I know that when the early era baseball committee was meeting, you
did a number of tweets and posts about how the candidates stacked up relative to other major league peers that we might be familiar with.
And so I guess the first question I would ask is, were there any omissions from the gentlemen who
were elected and inducted that you were disappointed by, guys who you hope will
make their way into Cooperstown when the next committee meets?
Yeah, it was kind of bittersweet for me because I was really excited for Minnie Minoso,
and I was really excited for Bud Fowler and for Buck O'Neil, just like everybody.
But I was also really sorry that none of the actual ballplayers from the Negro Leagues era,
from the early committee's deliberations, made it.
I feel like in some ways that they probably couldn't reach consensus on the players,
and so they went with the sort of pioneers and executive types.
And I'm glad for those guys and their families and for all the fans.
I think that one of the places where I wish they had charted a different course
was to put Dick Lundy on the ballot.
Lundy's an incredible shortstop, a great fielding shortstop.
He could hit, and he had all the tools.
And I think it's a mistake not to have had him on that ballot.
They had Grant Johnson on there, and Grant was from an earlier era than Dick Lundy.
He retired a good five to ten years earlier than Lundy did.
Lundy was a very young player when he came up at age 18, and he had a really strong career.
Both of them are Hall of Fame caliber players, in my opinion.
I think not electing Dick Redding might be a mistake. If Smokey Joe Williams is the Walter Johnson of the Negro Leagues, then Dick Redding is something like the Pete Alexander. They occupy the same sort of 1-2 ranking in their era, and they both had long careers and were great pitchers. I don't think Redding was as good as Alexander, but they occupy that same sort of territory within the context that they're pitching in. So I think that that was a missed opportunity for sure. But you know what? I also think that
these are very difficult deliberations, and I don't want to be like,
you know, Mr. Frowny Face about this, because the Negro Leagues got another shot
this time around at getting more people in. And that's important. They've
been sort of treated not as consistently as the major league players have, which is unfortunate.
And I hope that we have more opportunities than once every 10 years to explore who is a good Hall of Fame candidate from the Negro Leagues.
It would be unfortunate if we just left it here until 2031 or 2032.
Right. And we did have two of the founders of the 42 for 21 committee, Sean Gibson and Ted Knorr, on the podcast on episode 1785 to talk about their efforts in that area. And I know that you shared on Twitter your 42 for 21 ballot. So I will link to that on the show page as well, the players that you
selected as the most deserving for potential induction. One last question for me. You mentioned
Grant Johnson as one of the most deserving players, and you have him or you and Adam have him
as more than a deserving Hall of Famer, according to your MLEs at the Hall of Stats. And of course,
the Hall of Stats classifies his career as 1895 to 1914. That predates the organization of
the Negro Leagues. So how do you do your MLEs for players from black baseball from before
the Negro Leagues were founded? Just the same way that I do for post-founding of the NNL. And, you know,
the quality of play was lower. I ramped down from AAA to about AA in 1905, like, you know,
kind of a point at a time. But, you know, I sort of treat the Eastern independent teams and the
Western independent teams as their own quote-unquote league. They played each other a lot.
So I use those league totals in the same way that I would use the NNL or ECL or NAL totals
for latter-day players.
I try to use the same procedures and techniques for all players
because I don't want to treat anybody differently.
I want everybody to go through the same process so that my biases and my favorites or what have you are not influencing
the outcomes. And, you know, like all of you who are writers, you know, your
bias might not come out in the actual words, but in the choice of subject and things like that.
So yes, of course, my bias will be in there.
But I try to make it as objective as I possibly can and have as few decisions to make for each player as possible
so that I'm not inflicting my ideas on everyone to the best that I can.
But it is very much the same process.
The issue with someone like Home Run Johnson
is that the first half of his career,
the information is very scant.
And that's true for all the players from his time,
especially the early part of his career.
Guys like Sol White, Hall of Famer,
and Hall of Famer Frank Grant,
and lots of other guys,
Abe Harrison, Clarence Williams,
and James Seldon, and George Stovey, guys like that from that period who we just don't have a
lot of info on. And I would love to have that information to be able to do more with people
from that era. But right now we don't. And until we do, kind of got to sit on my hands.
All right. Well, we'd encourage everyone
who's been interested in this interview to check out the show page where I will link to a lot of
other resources that you can check out to learn more about Eric's work. You can also find him
on Twitter at his name, Eric Chalek. That is C-H-A-L-E-K. Eric, thanks very much for your work
and for joining us today. Thank you both so much for having me.
It's been a delight to talk with you.
All right, that will do it for today.
Thanks, as always, for listening.
You can support Effectively Wild on Patreon by going to patreon.com slash effectivelywild.
The following five listeners have already signed up and pledged some monthly or yearly amount to help keep the podcast going and help keep us ad-free and also get themselves access to some
perks. Garrett Sutherland, Mark Bailey, Jonathan Goetz, Stephen R. Christensen, and Alex Kobayashi.
Thanks to all of you. And if you sign up like those listeners did, you can get access to the
Effectively Wild Patreon-only Discord group with about 450 members now talking about baseball all the time.
You can also get access to our exclusive Patreon-only bonus podcasts, a couple of which we have
recorded and published already.
You can join our Facebook group at facebook.com slash group slash Effectively Wild.
You can rate, review, and subscribe to Effectively Wild on iTunes and Spotify and other podcast
platforms.
Keep your questions and comments for me and Meg coming via email at podcast at fangraphs.com
or via the Patreon messaging system if you are a supporter.
You can follow Effectively Wild on Twitter at EWPod.
You can join the Effectively Wild subreddit at r slash effectively wild.
Thanks to Dylan Higgins, as always, for his editing and production assistance.
We will be back with another Measuring the Unmeasurable episode a little later this week. Talk to you then.
When you were at the school
Did the factory help you grow
Were you the maker or the tool
Did the place where you were living
Enrich your life and then
Did you reach some understanding
Of all your fellow men
All your fellow men
All your fellow men