Theories of Everything with Curt Jaimungal - AI is Taking Over Mathematics | Yang-Hui He
Episode Date: January 2, 2025. This episode features Yang-Hui He, a mathematical physicist and professor known for his groundbreaking work in string theory, geometry, and AI-driven approaches to mathematical research, as he explains how machine learning is revolutionizing the field and leading to major discoveries. As a listener of TOE you can get a special 20% off discount to The Economist and all it has to offer! Visit https://www.economist.com/toe Timestamps: 00:00 - String Theory & Mathematics 10:54 - How Does One Do Mathematics? 16:28 - Computers & Mathematics 20:04 - Bottom-Up Mathematics 28:44 - Meta-Mathematics 46:17 - Top-Down Mathematics 55:22 - Pattern Recognition 01:01:30 - Platonic Data 01:07:15 - A Classic Problem Since 1736 01:17:38 - Classical Results for Riemann Surfaces 01:22:29 - Manifolds 01:26:52 - Superstring Theory 01:30:45 - When Physics Meets Math 01:43:01 - Progress in String Theory 01:45:45 - Image Processing 01:59:33 - Episode Recap 02:12:50 - Outro Links Mentioned: • The Calabi-Yau Landscape (book): https://amzn.to/41XmUi0 • Machine Learning (book): https://amzn.to/49YQ42t • Topology and Physics (book): https://amzn.to/4gCcjxr • Yang-Hui He's recent physics lecture: https://www.youtube.com/watch?v=AhuZar2C55U • Roger Penrose on TOE: https://www.youtube.com/watch?v=sGm505TFMbU • Edward Frenkel's String Theory discussion on TOE: https://www.youtube.com/watch?v=n_oPMcvHbAc • Edward Frenkel's lecture on TOE: https://www.youtube.com/watch?v=RX1tZv_Nv4Y • Joseph Conlon and Peter Woit on TOE: https://www.youtube.com/watch?v=fAaXk_WoQqQ • A New Lower Bound For Sphere Packing (article): https://arxiv.org/pdf/2312.10026 • Principia Mathematica (book): https://www.amazon.com/Principia-Mathematica-Alfred-North-Whitehead/dp/1603864377/ref=sr_1_5?crid=2ANIKKX6G8KRK&dib=eyJ2IjoiMSJ9.c62w_u2CfXIK6AaEt-QKx6dp22lbkUr17cSyr3O-rltVBjvb8xCrwLWz8CQ6iWjo8rjmeCsSCwPwM_U0T8_InZfz0vEX9UKDWfSa5Oan86o4YwU6F3GdBPz3J2d_hXbLOc-EULawZ47JksUzndhf5q7ydfCMlK9lYKc2XLZQq-6_dHWQSbjYI82e_dcKw9EWp71DPKIZ9v5qvbyP3CnE7gRpN7uPMZpj-lxlo7Wjsl4.iSUZDFr0n-ZlkiADza8yEePerPoxBJRRCLhO0tQm2wU&dib_tag=se&keywords=principia+mathematica&qid=1735580157&s=books&sprefix=principia+ma%2Cstripbooks%2C122&sr=1-5 • Tshitoyan's paper in Nature: https://www.nature.com/articles/s41586-019-1335-8 New Substack! Follow my personal writings and EARLY ACCESS episodes here: https://curtjaimungal.substack.com TOE'S TOP LINKS: - Enjoy TOE on Spotify! https://tinyurl.com/SpotifyTOE - Become a YouTube Member Here: https://www.youtube.com/channel/UCdWIQh9DGG6uhJk8eyIFl1w/join - Support TOE on Patreon: https://patreon.com/curtjaimungal (early access to ad-free audio episodes!) - Twitter: https://twitter.com/TOEwithCurt - Discord Invite: https://discord.com/invite/kBcnfNVwqs - Subreddit r/TheoriesOfEverything: https://reddit.com/r/theoriesofeverything #science #physics #ai #artificialintelligence #mathematics Learn more about your ad choices. Visit megaphone.fm/adchoices
Transcript
Looking for the perfect holiday gift?
Masterclass has been my go-to recommendation for something truly meaningful.
Imagine giving your loved ones the opportunity to learn directly from world-renowned instructors.
With Masterclass, your friends and family can learn from the best to become their best.
It's not just another streaming platform, it's a treasure trove of inspiration and personal growth.
Whether it's learning critical thinking from Noam Chomsky
to gaining practical problem-solving skills with Bill Nye or exploring the richness of history with
Doris Goodwin or my personal favorite which is learning mathematical tricks or techniques from
Terry Tao. There's something for everyone. Another one that I enjoyed was Chris Voss,
a former FBI negotiator who teaches communication strategy. It's been a game changer
to me ever since I read about Chris Voss from his book and it was surprising to see him here on
Masterclass. A pleasant surprise. Masterclass makes learning convenient and accessible whether
you're on your phone or your laptop or tv or just listening in audio mode and the impact is real.
88% of members say Masterclass has made a positive difference in their life.
For me, it's an incredible way that I discover, sometimes even rediscover, learning. Every
class that I take helps me feel more confident, even inspired. And I've received great feedback
from friends that I've recommended it to.
There's no risk. It's backed by a 30-day money-back guarantee. Right now, they're offering an
extremely wonderful holiday deal, with some memberships discounted by up to 50%. Head to masterclass.com slash theories and give
the gift of unlimited learning. That's masterclass.com slash theories. Your gateway to unforgettable
learning experiences.
BetMGM, authorized gaming partner of the NBA, has your back all season long, from tip-off to the final buzzer. You're always taken care of with a sportsbook born in Vegas. That's a feeling you can only get with BetMGM. And no matter your team, your favorite player, or your style, there is something every NBA fan will love about BetMGM. Download the app today and discover why BetMGM is your basketball home for the season. Raise your game to the next level this year with BetMGM, a sportsbook worth a slam dunk and authorized gaming partner of the NBA. BetMGM.com for terms and conditions. Must be 19 years of age or older to wager. Ontario only. Please play responsibly. If you have any questions or concerns about your gambling or someone close to you, please contact Connex Ontario
at 1-866-531-2600 to speak to an advisor free of charge.
BetMGM operates pursuant to an operating agreement with iGaming Ontario.
Yang-Hui He, welcome to the podcast. I'm so excited to speak with you. You have an energetic humility, and your expertise and your passion come across whenever I watch any of your lectures.
So it's an honor.
It's a great pleasure and great honor to be here. In fact, I'm a great admirer of yours.
You've interviewed several of my very distinguished colleagues like, you know, Roger Penrose and Edward Frenkel.
I actually watched some of them.
It's actually really nice.
Wonderful, wonderful.
Well, that's humbling to hear.
So firstly, people should know that we're going to talk about — or you're going to give a presentation on — AI, machine learning, and mathematics, and the relationship between them, as well as the three different levels of what math is in terms of production and understanding: bottom-up, top-down, and then the meta.
But prior to that, what specific math and physics disciplines initially sparked your
interest and how did the collaboration with Roger come about?
So my bread and butter was mathematical physics, especially sort of the interface between algebraic
geometry and string theory.
So that's my background, what I did my PhD on. And so at some point, I was editing the book
with CN Yang, who is an absolute legend, you know, he's 102, he's still alive, and he's the
world's oldest living Nobel laureate. You know, Penrose is a mere 93 or something.
So C. N. Yang of Yang-Mills theory — he's an absolute legend.
He got the Nobel Prize in 1957.
So at some point I got involved in editing a book with him called Topology and Physics.
And you know, with a name like that, you can just invite anybody you want,
and they'll probably say yes.
And that's how my initial friendship with Roger Penrose started, through working together on that editorial.
I mean, I have Roger as a colleague in Oxford, and I've known him on and off for a number of years,
but that's when we really started getting working together.
So when Roger snickers at string theorists, what do you say? What do you do?
How does that make you feel? That's totally fine. I mean, I'm not a diehard string theorist and,
you know, I'm just generally interested in the interface between mathematics and physics.
And, you know, Roger's totally chill with that. So you just happen to study the
mathematics that would be interesting to string theorists, though you're not one?
Exactly, and vice versa.
I just completely chanced on this.
It was kind of interesting.
I recently gave a public lecture in Dublin about the interactions between physics and mathematics.
And I still find that string theory is very much the field that gives the best cross-disciplinary feedback.
And it's been doing that for decades.
It's a fun thing.
I talked to my friends in pure mathematics, especially in algebraic geometry.
100% of them are convinced that string theory is correct, because for them it's inconceivable for a physics theory to give rise to so much interesting mathematics.
Interesting.
And I think that's a story that hasn't been told so much in the media.
If you talk to a physicist, they're like, you know, string theory doesn't predict anything, this and the other thing.
But there's a big chapter of string theory — to me, more than 50% of the backstory of string theory — which is just constantly giving new ideas in mathematics.
And historically, when a physical theory does that,
it's very unlikely for it to be completely wrong.
Yeah. You watched the podcast with Edward Frenkel, and he takes the opposite view, although
and he takes the opposite view, although
he initially took the former view that okay, string theory must be on the correct track
because of the positive externalities.
It's like the opposite of fossil fuels: it doesn't give you what you want for your own field, like physics, but it gives you what you want for other fields, as a serendipitous outgrowth.
But then he's no longer convinced after being at a string conference.
So you still feel like the pure mathematicians that you interact with see string theory as being on the correct track as a physical theory, not just as a mathematical theory?
Yeah — absolutely, he does make a good point.
And I think, you know, Frenkel and algebraic geometers like Richard Thomas and various people, they appreciate what string theory is constantly doing in terms of mathematics.
And the challenge is whether it is a theory of physics. To believe that, based on the fact that it's giving so much mathematics — I guess you've got to be a mystic.
Some of us are mystics. And I actually don't personally have an opinion on that. I just,
you know, some days I'm like, well you know, this is such a cool mathematical structure
and there's so much internal consistency. It's got to be something there. So it's just
a gut feeling. But of course, it being a science, you need the experimental evidence; you need to go through the scientific process. And about that I have absolutely no idea. It could take years and decades.
Wouldn't you also have to weight the field — W-E-I-G-H-T — weight the sub-discipline of string theory by how much IQ power, how much raw talent, has been poured into it versus others?
So you would imagine that if it was the big daddy field, which it happens to be, that
it should produce more and more insights.
And it's unclear to me, at least, whether, if this much time and effort had gone into asymptotic safety or loop quantum gravity or causal set theory or what have you, that would have produced mathematical insights of the same level of quality. We don't have a comparison. I mean, I don't know. I want to know what your thoughts are on that.
I think the reason for that is just that, you know, we follow our own nose as a community,
and the contending theories like, you know, loop quantum gravity and stuff,
you know, there are people who do it. There are communities of people who do it.
And, you know, there's a reason why the top mathematicians go and do string-related stuff: you follow your nose. You feel like it is actually giving the right mathematics.
Things like, you know, mirror symmetry, you know,
or vertex algebras, that's kind of giving
the right ideas constantly, and
it's been doing this since the very beginning.
And people do work on the alternative theories of everything, but so far it hasn't produced new math.
You can certainly prove us wrong, but I think there's a reason why Witten is the one who got the Fields Medal.
Because somehow it's at the right interface of the right ideas in geometry, number theory, representation theory, and algebra that it tends to produce the right mathematics.
Whether it is a theory of physics — that's still the next, mystical level. But it's an exciting time, actually.
Witten didn't get the Fields Medal for string theory, though. It was his work on
the Jones polynomial and Chern-Simons theory and Morse theory with supersymmetry and topological
quantum field theory, but not specifically
string theory.
That's right.
That's right.
But he certainly is a champion for string theory.
And for him — I mean, the Morse theory stuff he was able to get because of his work on supersymmetry; he realized there was a supersymmetric index theorem that generated this idea. And supersymmetry really is a cornerstone for string theory, even though there's no experimental evidence for it. So I think that's one of the reasons guiding him in this direction.
So, what's cool is that the podcast I filmed just prior to yours was with Peter Woit, who, as you know, is a critic of string theory, and Joseph Conlon, who is a defender of string theory — he even has a book called Why String Theory.
That's right.
I think it was the first time that publicly someone like Peter Woit, along with a defender of string theory, were on a podcast of this length, speaking in a technical manner about their likes and dislikes of string theory and the string community. There are three issues: one, string theory as a physical theory; two, string theory as a tool for mathematical insight; and three, string theory as a sociological phenomenon — the overhype, whether it sees itself as the only game in town, the arrogance. Should there be arrogance?
It was an interesting conversation.
Yeah. Well, Joe is a good friend of mine, Joe Conlon.
Yeah, right, right. In Oxford.
And yeah, I value his comments greatly. For me, I've always been kind of slightly orthogonal to the main string theory community.
I'm just happy because it's constantly given me good problems to work on.
Yes.
Including what I'm about to talk about in AI.
Wonderful.
I'll mention a little bit about it, because I got into this precisely because I had a huge database of Calabi-Yau manifolds, and I wouldn't have had that without the string community. It's again one of those accidents: the other theoretical physicists didn't happen to have this, didn't happen to be thinking about this problem. There's this proliferation of Calabi-Yau manifolds — I'll mention that bit in my lecture later, and why this is such an interesting problem, why Calabi-Yau-ness is inherently interesting regardless of whether you're a string theorist.
And that kind of launched me in this direction of AI assisted mathematical discovery.
So this is really nice. And I think, for me, the most exciting thing about this whole community is that science, including theoretical science, has become so compartmentalized.
Everyone is doing their tiny little bit of thing.
String theory has been great in that mode for the last decades.
It's constantly going, let's take a piece of algebraic geometry.
Let's take a bit of number theory here, elliptic curves.
Let's take a bit of quantum information, entanglement, whatever, entropy,
black holes.
And it's the only field that I know where the different expertises are talking to each other.
This doesn't happen in any other field that I know of in mathematics or theoretical physics.
And that just gets me excited, and that's what I really like thinking about.
Well, let's hear more about what you like thinking about and what you're enthusiastic about these days.
Let's get to the presentation.
Sure. Well, thank you very much for having me here. I'm going to talk about stuff I've been thinking about for the last seven years, which is how AI can help us do mathematical discovery in theoretical physics and pure mathematics.
I recently wrote this review for Nature which is trying to summarize a lot of these ideas
that I've been thinking about.
And there's an earlier review that I wrote in 2021 about how machine learning can help
us with understanding mathematics.
So let me just take it away. Oh, by the way, please feel free to interrupt me. I always like to make my lectures interactive, so if you have any questions, just interrupt me anytime, and I'll just pretend there's a big audience out there.
So firstly — you're likely going to get to this — but what's the definition of meta-mathematics?
Okay, great. So, roughly — well, the first question is: how does one actually do mathematics, right?
And in these reviews, I tried to divide it into sort of three directions.
Of course, these three directions are interlaced and it's very hard to pull them apart.
But roughly, you can think about bottom-up mathematics, which treats mathematics as a formal logical system: definition, lemma, proof, theorem, proof.
And that's certainly how mathematics is presented in papers.
And there's another one I would like to call top-down mathematics, where the practitioner looks from above — that's why I say top-down — from a bird's-eye view. You see different ideas and subfields of mathematics, and you try to do this as a sort of intuitive, creative art. You've got some experience, and then you're trying to see: oh, well, maybe I can take a little piece from here and a piece from there, and I'm trying to create a new idea, or maybe a method of proof or attack or derivation.
So these are complementary directions of research.
And the third one, meta — that's just because I was short of any other creative word, since there are words like meta-science and meta-philosophy or metaphysics. I'm thinking about mathematics purely as a language. Whether the person understands what's going on underneath is of secondary importance. So it's kind of like ChatGPT, if you wish: can you do mathematics purely by symbol processing?
So that's what I mean by meta.
So in this talk I'm going to talk a little bit about each of the three directions, focusing mostly on the second direction, top-down, which is what I've been thinking about for the last seven years or so.
Hmm.
Okay, I don't know if you know of this experiment called the Chinese Room Experiment.
Yeah.
Okay, so in that, the person in the room doesn't actually understand Chinese but is just symbol pushing or pattern matching — rule following would be the better way of saying it. Would they be an example of bottom-up or of meta in this?
So I would say that's meta.
As you know, on Theories of Everything we delve into some of the most reality-spiraling concepts, from theoretical physics and consciousness to AI and emerging technologies. To stay informed in an ever-evolving landscape, I see The Economist as a wellspring of insightful analysis and in-depth reporting on the various topics we explore here and beyond.
The Economist's commitment to rigorous journalism means you get a clear picture of the world's
most significant developments.
Whether it's in scientific innovation or the shifting tectonic plates of global politics,
The Economist provides comprehensive coverage that goes beyond the headlines.
What sets The Economist apart is their ability to make complex issues accessible and engaging, much like
we strive to do in this podcast. If you're passionate about expanding your
knowledge and gaining a deeper understanding of the forces that shape
our world, then I highly recommend subscribing to The Economist. It's an
investment into intellectual growth. One that you won't regret.
As a listener of TOE, you get a special 20% off discount. Now you can enjoy The Economist and all it has to offer for less. Head over to their website, www.economist.com slash toe, T-O-E, to get started. Thanks for tuning in, and now back to our explorations of the mysteries of the universe.
So I would say that's meta, in the sense that the person doesn't even have to be a mathematician. You're simply manipulating symbols. Large language modeling for math, if you wish.
Got it.
Of course, you can see there's a little bit of a bottom-up component, because you are taking mathematics as a sequence of symbols. But I would mainly call that meta.
If that's okay, I mean, these definitions are just things that I'm using.
Yes, yes.
But in any case, I would talk mostly about this bit, which is what I've been thinking mostly about.
One point I just want to make, to set the scene: in the 20th century, of course, computers have been playing an
increasingly important role in mathematical discovery, right?
And of course, you know, it speeds up computation, all that stuff
goes without saying, but something that's perhaps not so emphasized and appreciated
is the fact that there are actually fundamental and major results in mathematics that could not have been done without the help of the computer.
So there are famous examples. Even back in 1976, there is the famous Appel, Haken, and Koch proof of the four-color theorem: that every map in a plane can be completely colored with only four colors, so that no two neighboring regions share a color.
And this is a problem that was posed, I think, probably by Euler, right?
And this was finally settled by reducing this whole topology problem to thousands of
cases and then they ran it through a computer and checked it case by case.
And then there are other major things, like the Kepler conjecture: that the best way to stack identical balls is what you see in the supermarket, in this hexagonal arrangement. This was conjectured by Kepler, but the proof that this is actually the best way to do it was only settled in 1998, again by a huge computer check.
And the full acceptance by the math community came only as late as 2017, when proof assistants actually went through Hales's construction and made this into a formal proof.
Yes. Wasn't there a recent breakthrough in the generalized Kepler conjecture?
Absolutely. So this is what Maryna Viazovska got the Fields Medal for.
The Kepler conjecture is in three dimensions, our world.
Viazovska proved the best possible packing in dimension 8, and then, with collaborators, in dimension 24.
And she gave a beautiful proof of that fact.
To my knowledge, I don't think she actually used the computer; there's some optimization method.
Actually, what I'm referring to is that there are some researchers who generalized this to any n, not just 8 or 24, and who used graph-theoretic methods of selecting edges to probabilistically improve the lower bound on packing density for any n, though I don't believe they used machine learning.
Well, thanks for pointing that out. I'll go check that. That's interesting. This one is actually really interesting too — it's something that's closer to me, which is the classification of finite simple groups.
So simple groups are building blocks of all finite groups.
And the proof, you know, took some 200 years, and the final definitive volume, in the program begun by Gorenstein, came around 2008.
And what's really interesting, the lore in the finite group theory community is that nobody's actually read the entire proof.
It's just not possible; it would take longer than a lifetime to actually read the entire proof.
So this is kind of interesting: we have reached a cusp in mathematical research where computers are not just computational tools, but are increasingly becoming an integral part of who we are. So this just sets the scene.
We're now in the early stages of the 21st century, and it is increasingly the case that computers, or AI, can help us in these three different directions. Great.
So let me just begin with this bottom-up, just to summarize. This is probably the oldest attempt at having computers help us. So this is where I'm going to define bottom-up. The modern version of this goes back to the classic book of Russell and Whitehead, the Principia Mathematica, from the 1910s, where they tried to axiomatize mathematics from the very beginning. Famously, it took them about 300 pages to prove that one plus one equals two.
Nobody has read this, sorry.
This is one of those impenetrable books.
But I mean, this tradition goes back to Leibniz, or even to Euclid — the idea that mathematics should be axiomatized, right?
Of course, this program lasted only about 20 years before it was completely killed, in some sense, by the incompleteness and undecidability theorems of Gödel, Church, and Turing: this very idea of trying to axiomatize mathematics by constructing it layer by layer was proven to be logically impossible to complete. But I'd like to quote my very distinguished colleague, Professor Minhyong Kim. He says the practicing mathematician hardly ever worries about Gödel. Because, you know, if you have to worry day to day about whether your axioms are valid — if an algebraic geometer has to worry about this — then you are sunk, right?
You get depressed about everything you do, right?
So the two kind of cancel out. And the reason I mention this is that, because these two negatives cancel each other out, this idea of using computers to check proofs — of computer-aided proofs — really goes back to the 1950s.
So despite what Gödel, Church, and Turing had proved about the foundations, even back in 1956, Newell, Simon, and Shaw devised the Logic Theory Machine. I have no idea how they did it, because these were really very, very primitive computers. And they were actually able to prove certain theorems of Principia by building up from the axioms — take these axioms and use the computer to prove.
And this has become an entire field in itself, with a very distinguished history.
And just to mention that this 1956 is actually a very interesting
year because it's the same year, 56, 57, that the first neural networks emerged from the basement of Penn and MIT.
And that's really interesting, right?
So people in the 50s were really thinking about the beginnings of AI — because neural networks are what now go under the rubric of AI.
And at the same time, they were really thinking
about computers to prove theorems in mathematics.
So it's, 56 was kind of a magical year.
And this neural network really was a neural network, in the sense that they put cadmium sulfide cells in a basement — a wall-sized array of photoreceptors — and they were using flashlights to stimulate the neurons, literally, to try to simulate computation.
That's quite an impressive thing.
And then this thing really developed, right?
And now, half a century later, we have very advanced, very sophisticated computer-aided, automated theorem provers — things like the Coq system and the Lean system. So Coq was used in the full verification of the proof of the four-color theorem. And then there's the Feit-Thompson theorem, which got Thompson the Fields Medal. Again, the proof was fully verified through this system.
Lean is very good.
I do a little bit of Lean, but the true champion of Lean is Kevin Buzzard at Imperial, 30 minutes down the road from this spot. And he's been very much a champion for what he calls the Xena Project, using Lean to formalize all of mathematics. That's the dream.
Kevin tells me that all of the undergraduate-level mathematics at Imperial — which is a non-trivial set of mathematics, but still a very, very tiny bit of actual mathematics — has now been formalized in Lean. And they can check it: everything that we've been taught so far at undergraduate level is good and self-consistent, so nobody needs to cry about that one.
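Just to give a flavor of what such a formalized statement looks like — a toy illustration of my own, not something taken from the Xena Project — here is a one-line check in Lean 4 that addition of natural numbers commutes, using the library lemma Nat.add_comm:

```lean
-- A toy example of formalized mathematics in Lean 4:
-- commutativity of addition on the natural numbers.
-- `example` asks Lean to verify the proof term against the statement.
example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```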
Wonderful.
And so that's all good. And then a more recent breakthrough is the beautiful work of — two Fields Medalists here — Gowers, Green, Manners, and Tao, where they proved this conjecture, which I don't know the details of, but they were actually using Lean to help prove it. And in a public lecture Terry Tao gave recently, in 2024 in Oxford, he calls this whole idea the AI copilot, which is a phrase I very much like.
I was with Tao in August in Barcelona — we were at this conference — and he's very much into this as well. And of course, Terry Tao for us is a godlike figure. So the fact that he's championing this idea of AI copilots for mathematics is very, very encouraging for all of us.
Yes, and for people who are unfamiliar with Terry Tao but are familiar with Ed Witten: Terry Tao is considered the Ed Witten of math, and Ed Witten is considered the Terry Tao of physics.
Yeah, I've never heard that expression.
That's kind of interesting.
At Barcelona, when Terry was being introduced by the organizer, Eva Miranda, she said,
Terry Tao is, this is a very beautiful sentence.
Terry Tao has been described as the...
The Gauss of mathematics.
Or the Mozart.
But I think a more appropriate way to describe him is as the Leonardo da Vinci of mathematics, because he has such a broad impact on all fields of mathematics, and that's a very rare thing.
Yeah, I remember he said something like, topology is my weakest field — and by weakest field, to him, it means: I can only write one or two graduate textbooks off the top of my head on the subject of topology. Yeah,
exactly, exactly. I guess his intuitions are more analytic.
He's very much in that world of analytic number theory and functional analysis.
He's not very pictorial, surprisingly.
Roger Penrose, by contrast, has to do everything in terms of pictures.
But Terry is a symbolic matcher.
He can just look at equations, extremely long, complicated equations, and just see which pieces should go together.
and just see which pieces should go together.
That's very interesting.
Speaking of Eva Miranda,
you and I, we have several lines of connection.
Eva's coming on the podcast in a week or two to talk about geometric quantization.
Awesome. Eva is super fun. She's filled with energy, right?
Yes.
Yeah, she's a good friend of mine. Yeah. I think, you know, in this academic world of, you know, math and physics, I think, you know, we're at most one degree of separation from anyone else.
Yeah. It's a very small community, relatively small community.
Yeah. So, back to this thing — of course, one could get over-optimistic. I was told by my friends at DeepMind that Szegedy, who I think is one of the people on this AI-for-math team, extrapolated like this: computers beat humans at chess in the 90s and beat humans at Go in the late 2010s, so they should beat humans at proving theorems by 2030. I have no idea how he extrapolated these points — there are only three data points. And DeepMind has a product to sell, so it's very good for them to be over-optimistic. But I wouldn't be surprised by that date — well, I'm not sure about beating humans, but it might give ideas that humans have not thought about before. So that's possible.
Just moving on. So that's the bottom-up.
And as I said, this is very much a long, distinguished field of automated theorem provers and the verification and formalization of mathematics, which Tao calls the AI copilot.
Just to come back to your question from a bit earlier about meta-mathematics: I like your analogy — this is like the Chinese room. Can you do mathematics without actually understanding anything?
You know, personally I'm a little biased, because I interacted with so many undergraduate students before I moved to the London Institute, where I don't have to teach undergraduates anymore. I've noticed — maybe one can say — that the vast majority of undergraduates are just pattern matching.
Right.
Whether there's any understanding is another question. I think this is one of the reasons why ChatGPT does things so well. It's not just because LLMs, large language models, are great. It's more that most things that humans do are done without comprehension anyway. So that's why it's kind of this pattern-matching idea. And this is also true for mathematics.
What's funny is that my brother's a professor of math at the University of Toronto, for machine learning, but in finance. And I recall, 10 years ago, he would lament to me about students who came to him wanting to be PhD students, and he would say: okay, Curt, some of them don't have an understanding; they have a pattern-matching understanding. He didn't want that at the time, but now he's into machine learning, which is effectively that times 10 to the power of 10.
Right.
Right, right.
No, no, I completely agree.
I mean, this is not to criticize undergraduate students. It's just part of being human.
We kind of pattern match and then we do it the best we can.
And then of course, if you're Terry Tao,
you actually understand what you're doing.
But you know.
Of course.
But the vast majority of us doing most of the stuff
is just pattern matching.
So that's why, and this is true even for mathematics.
So here, I just want to mention something,
which is a fun project that I did with my friends Vishnu Jejjala and Brent Nelson back in 2018, before all this LLM-for-science thing.
And this is a very fun thing, because what we did was take the arXiv — the preprint server for contemporary research in the theoretical sciences — and we took all the titles. And we built language-model classifiers with Word2Vec — very old-fashioned; this is a neural network, Word2Vec. And, you know, you can classify the titles and do that sort of thing.
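To make the idea concrete, here is a minimal sketch of that kind of pipeline — not the actual code from the paper, and the titles and section labels below are made up for illustration. Embed each title by averaging its Word2Vec word vectors, then fit a linear classifier on the labels:

```python
# Toy sketch: classify paper titles by arXiv section via Word2Vec embeddings.
# Illustrative only -- not the pipeline from the He-Jejjala-Nelson paper.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# (title tokens, section label) -- made-up examples for illustration
data = [
    ("calabi yau manifolds and mirror symmetry".split(), "hep-th"),
    ("quivers gauge theories and mirror symmetry".split(), "hep-th"),
    ("distribution of zeros of l functions".split(), "math.NT"),
    ("ranks of elliptic curves and l functions".split(), "math.NT"),
]

# Train word vectors on the title corpus (tiny here, huge in practice).
model = Word2Vec([tokens for tokens, _ in data],
                 vector_size=32, min_count=1, epochs=200, seed=0)

def embed(tokens):
    # Represent a title by the average of its word vectors.
    return np.mean([model.wv[t] for t in tokens], axis=0)

X = np.array([embed(tokens) for tokens, _ in data])
y = [label for _, label in data]
clf = LogisticRegression().fit(X, y)

# Classify a new title (using words seen in training, to keep the toy simple).
print(clf.predict([embed("mirror symmetry of calabi yau manifolds".split())]))
```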
But what's really interesting — and this is my favorite bit — is that to benchmark the arXiv, we took viXra. viXra is a very interesting repository: it's "arxiv" spelled backwards, and it has all kinds of crazy stuff. I'm not saying everything on viXra is crazy, but it certainly has everything that arXiv rejects because it thinks it's crazy — things like a three-page proof of the Riemann hypothesis, or "Albert Einstein is wrong." It's filled with that.
It's interesting to study the linguistics even at the title level. You could see that, for what they call quantum gravity versus the other categories, viXra has the right words, but the word order is already quite random. The confusion matrix for viXra is certainly not as distinct as for arXiv. So it's kind of interesting: you get all the right buzzwords, but viXra, I think, is a good benchmark showing language that is not as sophisticated as real research articles.
But this idea — and here is something much more serious — there's this very beautiful work of Tshitoyan et al. in Nature, where they actually took all of materials science, trained a language model on it, and were able to actually generate new candidate reactions in materials science. So I think this 2019 paper by Tshitoyan is really the beginning of language models for scientific discovery. And this is quite early — 2019, right?
Yeah, and it's remarkable how we can even say that that's quite early. The field is exploding so quickly.
Absolutely. Five years ago is considered a long time ago. I mean, five years ago — I've evolved a lot in my thinking about this thing.
I would also like to get to your personal use cases for LLMs — ChatGPT, Claude — and what you see as the pros and cons between the different sorts; Gemini 2.0 was just released, and then there's o1, and there's a variety. So at some point I would like to get to how you personally use LLMs, both as a researcher and in your personal use cases.
Okay, I can mention a little bit. So one of the very, very first things — when GPT-3 came out in, what, 2019, something like that...
Oh — you mean GPT-3?
GPT-3, like the really early baby versions.
Yeah, that was just before the pandemic.
Just before the pandemic. So I got into this AI-for-math through these Calabi-Yau manifolds, which I'm going to mention a bit later.
And then this GPT came out when I was just thinking about this large language model.
So this was a great conversation piece. I was typing in problems in calculus, freshman calculus, and it solved them fairly well. I mean, it's really quite impressive what it can do. It's fairly sophisticated: I was typing questions like, take a vector field, blah, blah, blah, on a sphere; find me the grad or the curl. It's first-, second-year stuff, and you have to do a lot of computation. And it was actually doing this kind of thing correctly, partially because there are just so many example sheets of this type out there on the internet. So it's kind of learned all of that.
So I was getting very excited and I was trying to sell this to everybody at lunch.
I was having lunch with my usual team of colleagues in Oxford over this.
And of course, lo and behold, who was at lunch but the great Andrew Wiles. And I felt like I was being a peddler for GPT, for LLMs for mathematics, to perhaps the greatest living legend in mathematics. And he's just super nice, a lovely guy. He instantly asked me: how about you try something much simpler? He tried two problems. The first one was: tell me the rank of a certain elliptic curve — and he just wrote one down — or, sorry, the rational points of a very simple elliptic curve, which is his baby. And I typed it in, and it got it completely wrong. Very quickly it started saying things like, you know, five over seven is an integer. Partially that's because this is a very hard thing to do — you can't really guess rational points — unlike in calculus, where there's a routine of what you need to do, right?
And then very quickly we converged on an even simpler problem.
How about: find the 17th digit in the decimal expansion of 22 divided by 29, or whatever. And that is completely random to a language model, because you can't train for it; you actually have to do the long division. This is primary-school-level stuff, and yet GPT simply could not do it — and it's inconceivable that it could, because no pure language model could possibly do this.
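The point is that the digit comes from an algorithm, not from linguistic pattern-matching. As a sketch, the primary-school long division is only a few lines of Python:

```python
def decimal_digit(p, q, n):
    """Return the n-th digit after the decimal point of p/q,
    computed by the primary-school long-division algorithm."""
    r = p % q                    # remainder carried along the division
    digit = 0
    for _ in range(n):
        r *= 10                  # bring down a zero
        digit, r = divmod(r, q)  # next digit and new remainder
    return digit

# Wiles's challenge: the 17th decimal digit of 22/29
print(decimal_digit(22, 29, 17))
```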
But GPT now — o1 — is already clever enough that when you ask it a question like this, linguistically it knows to go to Wolfram Alpha, and then it's okay; then it's actually doing the computation. But something as basic as this, you just can't train the language model itself to do. You get, you know, one in ten right, and it's just randomly distributed.
Yes.
Whereas seemingly sophisticated things, like solving differential equations or doing very complicated integrals, it can do, because there's somewhat of a routine and there are enough samples out there. So, anyway, those are my use cases — two use cases.
That's also not terribly different
than the way that you and I,
or the average person or people in general think.
So for instance, we're speaking right now in terms of conversation.
And then if we ask each other a math question, we move to a math part of our brain.
We recognize this is a math question.
So there's some modularity in terms of how we think.
It's not like we're going to solve long division using Shakespeare.
Even if we're in a Shakespeare class and someone just jumps in and asks that question, we're like, okay, that's of a different sort of mechanism.
Yeah, that's a good analogy. Yeah.
When you first encountered ChatGPT, or something more sophisticated that could answer even larger mathematical problems, did you get a sense of awe or terror initially? I'll give you an example. There was this meeting with some other math friends
of mine and I was showing them chat GPT when it first came out. And then one of the friends
was like, explain, can you get it to explain some inequality or prove some inequality?
And it did, and it explained it step by step. Then everyone just had their hand over their mouth, like: are you serious? It can do this? And one friend said, this is like speaking to God. And another friend had the thought: what am I even doing? What's the point of even working if this can just do my job for me? So did you ever get that sense? Yes, we're excited about the future and about it as an assistant — but did you ever feel any sense of dread?
I'm by nature a very optimistic person.
So I think it was just all an excitement.
I don't think I've ever felt that I was threatened or the community is being threatened.
I could totally be wrong.
But so far I just think this is such an awesome thing, because it'll save me so much time looking up references and things like that. I was happy. I was just like, wow, this is kind of cool. I mean, I guess if I were an educator I might get a bit of dread, because an undergraduate degree could become basically one ChatGPT being fed to another. A lot of my colleagues have started setting exam questions with ChatGPT, with fully LaTeXed-out equations. This is becoming the standard thing to do. So if you're an educator, you would probably worry.
Okay.
So in that sense, this is going to be certainly an incredible help
because it's got all the knowledge in the background.
Wonderful.
All right, let's move forward.
Yeah, sure.
So 2022 was a great year. I'm surprised this wasn't all over every single newspaper; I don't know why. I was told afterwards, in some obscure outlet I can't even remember, and by some expert friends in the community, that ChatGPT had passed the Turing test. This is a big deal, and I was hoping to see it on the BBC and every major news outlet, but it didn't catch on. But anyhow, I believe that in 2022, ChatGPT passed the Turing test.
And then, in the last two years, there's been huge development in large language models for mathematics. Every major player — OpenAI, Meta AI, Epoch AI — has been doing tremendous work trying to get LLMs to do math. Basically: take the arXiv, which is a great repository for pure mathematics and theoretical physics, learn from it, and try to generate new things. This is very much work in progress.
And of course, AlphaGeometry, AlphaGeometry 2, AlphaProof — this is all DeepMind's success. It's kind of interesting: within a year, you've gone from 53% on Olympiad-level problems to 84%, which is scary, right? Scary in the sense of impressively awesome, that they could do it so quickly. So basically, by 2024, an AI is approximately equal to the 12-year-old Terence Tao, in the sense that it could get a silver medal. But of course, this is very specialized: AlphaGeometry 2 was really just homing in on Euclidean geometry problems.
homing in on Euclidean geometry problems.
Which to be fair are extremely difficult, right?
If you don't know how to add the right line or the right angle, you have no idea how to
attack this problem.
But it's kind of learned how to do this.
So it's kind of nice.
So, you know, this is all within
a couple of years. And there's this very nice benchmark called FrontierMath that Epoch AI has put out. I think there was a white paper, and they got Gowers and Tao, you know, the usual suspects, to help benchmark it. Okay, fine — so we can do 84% on Math Olympiad problems, which is sort of high-school level. What about truly advanced research problems? To my knowledge, as of the beginning of this month, AI was only doing 2%. So, okay, fine, it's still not doing that great. But at the beginning of this week, you learn that OpenAI's o3 is doing 25%. So we've gone up more than 20 percentage points — a fifth of the benchmark — within four weeks.
We've got a fifth up within four weeks of what they can do.
So I said, wow, that's kind of very interesting.
Yeah, such a rapid improvement.
It's so, this is crazy.
I love this, right?
Because it's exciting. It's so, this is crazy. I love this, right, because it's exciting.
It's very rare to be.
I remember back in the day when I was a PhD student doing ADS CFT-related algebra geometry,
because Marithana had just come out with a paper in 97, 98, and that's just when I began my PhD.
I remember that kind of excitement,
the buzz in the string community.
And people are saying there was a paper every couple of days
on the next, that kind of excitement.
And I haven't felt that kind of excitement for a very long
time just because of this.
And then this is like that.
Every week, there's this new benchmark and new breakthrough.
So that's why I find this field of AI system mathematics to be really, really exciting.
Can you explain, perhaps it's just too small on my screen because I have to look over here,
can you explain the graph to the left with Terence Tao?
Oh gosh, I'm not sure I can, because I'm not sure I can read this graph in detail. I think it's the year.
What is it trying to convey?
So it's the ranking of Terence — no, this is just Terence Tao's individual performances over different years, over different problems.
So he's retaking the test every year?
No, no, he's taken it three times,
ages 10, 11, and 12.
And when he was 10, he got the bronze medal,
and then he got the silver medal,
then he got the gold medal within three years.
Okay.
And age of 12 or something.
But I can't, I think-
What are those bars, though?
I think the bars, a good question.
Maybe it's the different questions. You're given 60 questions, and what it would take to get the gold medal, I think, or what it would take to get the silver medal — how many percent you'd have to get, basically. I think.
Okay, so was it a foolish question of mine?
It's actually...
No, no, no, no.
It's a good question. I have no recollection or maybe I never even looked at it.
Somebody told me about this graph at some point. I forgot what it is.
Okay, because it looks to me like Terence Tao is retaking the same test, and this is just showing his score across time, and he's only getting better. But that can't be it — why would he retake the test? He's a professor.
No, I think it goes to 66.
It must be like, this is an open source graph.
Oh, I thought you were going to say this is an open problem in the field.
What does this graph mean?
No, no, no.
It's an open source.
This graph is just something you could take from the Math Olympiad database. Which I shamelessly — see, again, perfect, right? I've just done something that I have absolutely no understanding of and presented it to you, like a language model: I just copied and pasted it because it's got a nice, cute picture of Terry Tao when he was little. So finally, let me go back to the stuff that I've really been thinking about, which is top-down mathematics.
And this is kind of interesting. The way we, as practitioners, do research is completely opposite to the way we write papers. I think that's important to point out.
I think that's important to point that out.
We muck about all the time.
We do all kinds of things.
You look at my board, right, it's just filled with all kinds of stuff.
And most of it is probably just wrong.
And then once we've got a perfectly good story, we write it backwards.
And I think we write math papers backwards — math and, generally, theoretical physics papers backwards. Well, theoretical physics is a bit better; at least sometimes you write out the process. But in pure math papers, everything is written in the style of Bourbaki: this very dry definition-theorem-proof, which is completely not how it's actually done at all. This is why the great Vladimir Arnold says Bourbaki is criminal. He actually used this word — the criminal Bourbakization of mathematics — because it leaves out all human intuition and experience. It just becomes this dry, machine-like presentation, which is exactly how things should not be done. But Bourbaki is extremely important, because that's exactly the language that's most amenable to computers.
So it's one way or another.
But human practitioners certainly don't work like this. We muck about. Sometimes rigor is even sacrificed. If Newton had had to wait for proper analysis — which only came about in the 19th century — before inventing calculus, we wouldn't even know how to compute the area of an ellipse, because we'd have had to wait and formalize all of that first. You just don't go at it backwards.
So the historical progression of mathematics is exactly opposite to the way that it's presented. I mean, that's fine — but the way it's presented is much more amenable to a proof copilot system like Lean than what we actually do.
Even science in general is like that,
where we say it's the scientific method,
where you first come up with a hypothesis
and then you just, you test it against the world,
gather data and so on.
But the way that scientists work — not just in math and physics, but biologists and chemists and so on — is based on hunches and creative intuitions and conversations with colleagues and several dead ends.
And then afterward you formalize it into a paper in terms of step by step, but it was
highly nonlinear.
You don't even have a recollection most of the time of how it came about.
That's right. And I think one of the reasons I got so excited about all this AI for math is this direction. Because this hazy idea of intuition or experience is something that a neural network is actually very, very good at.
Wonderful.
It could help you. So I'm going to give concrete examples later on about how it can guide humans.
But just to give some classical examples — I've said this joke so many times — what's the best neural network of the 18th century? Well, it's clearly the brain of Gauss. I mean, that's a perfectly functioning, perhaps the greatest, neural network of all time.
And I want to use this as an example. What did Gauss do? Gauss plotted the number of primes less than a given positive real number — a real number, just to give a sort of continuity. He plotted this, and it's a really jaggedy curve; it's a step function, because it jumps whenever you hit a prime. But Gauss was able to look at this when he was 16 and say: well, this is clearly x over log x. How did he even do this? He had to compute it by hand, and he did — and he even got some of the primes wrong. By his time, the tables of primes went up into the tens and hundreds of thousands, so he could go up into the hundred-thousand range. And he could just see that this goes as x over log x.
But this is very important, because he was able to raise a conjecture before the method by which it would be proved — namely complex analysis — was even conceived of by Cauchy and Riemann. And that's a very important fact. He just kind of felt that this was x over log x, and you had to wait about 50 years before Hadamard and de la Vallée Poussin proved this fact, because the technique we now take for granted — complex analysis — hadn't been invented yet. You had to wait for that to happen. And that's how it happens in mathematics all the time.
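As a small illustration of what Gauss eyeballed — a sketch using a sieve rather than his hand tables — one can tabulate the prime-counting function pi(x) against x/log x and watch the ratio drift toward 1:

```python
# Sketch of Gauss's observation: pi(x), the number of primes <= x,
# grows like x / log(x). We sieve instead of using hand tables.
import math

N = 100_000
is_prime = [True] * (N + 1)
is_prime[0] = is_prime[1] = False
for p in range(2, int(N**0.5) + 1):
    if is_prime[p]:
        for m in range(p * p, N + 1, p):
            is_prime[m] = False

for x in (1_000, 10_000, 100_000):
    pi_x = sum(is_prime[: x + 1])          # count primes up to x
    approx = x / math.log(x)
    print(f"pi({x}) = {pi_x},  x/log x ~ {approx:.0f},  ratio = {pi_x/approx:.3f}")
```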
Even major things. Of course, this is now called the prime number theorem, which is a cornerstone of all of mathematics, right? This is the first major result since Euclid on the distribution of primes. How did Gauss see that this was x over log x? Because he had a really great neural network. And this happens over and over again — like the Birch and Swinnerton-Dyer conjecture, which I'm going to talk about later, which is one of the Millennium Problems. It's still open, and it's certainly one of the most important problems in mathematics of all time. And this is Birch and Swinnerton-Dyer in a basement in Cambridge in the 1960s: they just plotted ranks and conductors of elliptic curves — I'm going to define those in more detail later — and they said, oh, that's kind of interesting, the rank should be related to the conductor in some strange way. And that's now the BSD conjecture, the Birch-Swinnerton-Dyer conjecture. And what they were doing was computer-aided conjecturing. So it was the eyeballs of Gauss in the 19th century, but the 20th century really had serious computer-aided conjectures.
And of course the proof of this is still open in general. There've been lots of nice progress in
this. And, you know, where we're going to go is very much what
technique do we need to wait to prove something like this?
Now, is there a reason that you chose Gauss and not Euler? Like
is it just because Gauss had this example of data points and
guessing a form of a function?
I'm sure Euler, who is certainly great, had conjectures too, maybe. That's an interesting question.
I'll mention Euler later, but I think there's not an example as striking as this one. In fact, what's interesting, as a byproduct of Gauss inventing this: he was kind of mucking around with statistics, right? This is before statistics even existed as a field, right? This is like the early 1800s. And Gauss, I think, and you can check me on this, Gauss got the idea of statistics and the Gaussian distribution
because he was thinking about this problem.
So it's kind of interesting.
So he was laying foundations to both analytic number theory and modern statistics in one go.
He was doing regression.
So I think he essentially invented regression, the curve fitting, which is like 101 of modern
society.
He was trying to fit a curve.
What was the curve that really fit this?
In the process, he got x over log x, and in addition, he got this idea of regression.
Impressive guy. What can we say? He's a god to us all. And then, so the upshot of this is, like, I love this. Again, there's something I found on the internet. Just to emphasize, you know,
that this idea of-
Speaking of God. Yes, speaking of God, this idea of mucking about with data in pure mathematics is a very
ancient thing.
Right?
You know, once you formulate something like this into a conjecture, you will write your paper. Imagine writing the paper: definition of a prime; definition of pi of x; then the conjecture about pi of x; then evidence. All the failed stuff about inventing regression and mucking about, all that stuff just doesn't get written at all. That intuitive, creative process is not written down anywhere. So here,
it's great, I'm glad I'm chatting to you about it, right, because it's nice
to have an audience with this, right.
So, you know, let's look at pattern recognition: what do we do, right, in terms of pure mathematical data?
If I gave you a sequence like this, you can immediately tell me what the next number is
to some confidence.
Yeah, zero, zero, one.
Zeros: it's just, is this a multiple of three or not. This one, I've tried this with many audiences and, you know, after a few minutes of struggle,
you can get the answer and then this turns out to be the prime characteristic function.
So what I've done here is to mark all the odd integers; with the evens you're obviously going to get zero, so it's kind of pointless. So it's just the sequence of odd integers: 3, 5, 7, 9, 11, and so on and so forth. And it's a 1 if it's a prime, 0 if it's not. So you mark all the prime ones with a 1.
And you can probably, after a while, you can muck about,
and you can see where this is going.
The next sequence is much harder.
So I'm going to give it away, so we won't have to spend a couple of hours staring at it.
So this one is what's called the shifted Möbius function.
What this is: you take an integer and you take the parity of the number of prime factors it has, counted with multiplicity. I think I marked a 1 for 1 just to kick off the sequence. Then it's 0 if it has an odd number of prime factors, and 1 if it has an even number of prime factors, for the whole sequence of integers. So 2 is a prime number: it has only one prime factor, an odd number. 3 has an odd number of prime factors. 4 is 2 squared, so it has an even number of prime factors. 5 is prime, so it has one, an odd number. 6 is 2 times 3, so it has two, an even number of prime factors. And so on and so forth.
It looks kind of harmless. But I've just stared at this for a while, and it's very, very hard to recognize a pattern. And what's really interesting is what it takes to know the parity of the next number: if you have an algorithm that can tell me the parity of this in an efficient way, you will have an equivalent formulation of the Riemann hypothesis. So that's actually an extremely hard sequence to predict. If you can tell me with confidence more than 50% what the next number is, without looking up some table, then you can probably end up cracking every bank in the world.
Interesting.
Because this is equivalent to the Riemann hypothesis.
So I'm just giving three: trivial, kind of okay-ish, and really, really, really hard.
Yes.
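For readers who want to play along, here is a minimal sketch (my reconstruction, not code from the episode; the exact indexing on the slides may differ) that generates the three 0/1 sequences just described.

```python
def is_prime(n):
    """Trial division primality test, fine for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, int(n ** 0.5) + 1))

def big_omega(n):
    """Omega(n): number of prime factors of n counted with multiplicity."""
    count, d = 0, 2
    while d * d <= n:
        while n % d == 0:
            count += 1
            n //= d
        d += 1
    return count + (1 if n > 1 else 0)

# (1) Trivial: is n a multiple of 3?
mult_of_3 = [1 if n % 3 == 0 else 0 for n in range(1, 31)]

# (2) Harder: the prime characteristic function along the odd integers 3, 5, 7, ...
prime_char = [1 if is_prime(n) else 0 for n in range(3, 61, 2)]

# (3) Very hard: the Liouville parity, lambda(n) = (-1)^Omega(n),
#     encoded as 1 when Omega(n) is even and 0 when it is odd.
liouville = [1 if big_omega(n) % 2 == 0 else 0 for n in range(2, 32)]

print(mult_of_3, prime_char, liouville, sep="\n")
```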
So now you can think about a question.
How, if I were to feed sequences like this into some neural network, how would the neural network do? So, one way to do it. And this goes back, way back to the very beginning, to the question of what is mathematics.
And you know, Hardy in his beautiful apology says, you know, what mathematicians do is
essentially we are pattern recognizers.
That's probably the best definition of what mathematics is, is that it's a study of patterns,
finding regularity in patterns.
And, in fact, you know, if there's one thing that AI can do better than us, it's pattern
detection.
Because, you know, we evolved to detect patterns in three dimensions and no more. So in this sense, if you have the right representation of the data, you can be sure that AI can do better than us.
I mean, you know, it generates a lot of stuff, but filtering out what is better is a very interesting problem
in and of itself.
So let's try to do one.
I mean, there are various ways to do this representation.
One way you can do it is to turn it into a problem which is maybe best suited for an AI system, which is binary classification of binary vectors. Sequence prediction is kind of difficult, so one thing you can do is take this infinite sequence and just take, say, a window of 100, or 1,000, some fixed window size. And then label it with the entry immediately outside the window, and then shift, label, shift, label. So you can generate a lot of training data this way.
So for this sequence, I think I've just taken whatever the sequence is, with a fixed window size and with this label. So now you have a perfectly defined, supervised, binary machine learning problem. Then you pass it to your standard AI algorithm, just out-of-the-box ones. You don't even have to tune your particular architecture. Just take your favorite one and then do cross-validation, the standard stuff: take a sample, do the training, and then try to validate on unseen data.
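Here is a minimal sketch of that window-and-label setup, using an out-of-the-box scikit-learn classifier on the easy multiple-of-3 sequence; the window size, model, and hyperparameters are all arbitrary choices, not the ones used in the actual experiments.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, matthews_corrcoef

def windows_and_labels(seq, w):
    """Slide a length-w window along the 0/1 sequence; the label for each
    window is the entry immediately after it."""
    X = [seq[i:i + w] for i in range(len(seq) - w)]
    y = [seq[i + w] for i in range(len(seq) - w)]
    return np.array(X), np.array(y)

seq = [1 if n % 3 == 0 else 0 for n in range(1, 20_000)]  # the trivial sequence
X, y = windows_and_labels(seq, w=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
pred = clf.predict(X_test)
print("accuracy:", accuracy_score(y_test, pred))  # near 1.0 for this sequence
print("MCC:", matthews_corrcoef(y_test, pred))    # Matthews correlation coefficient
```

Swapping in the prime characteristic or Liouville sequences from the earlier sketch reproduces the spirit of the experiments described next.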
So if you do this to the multiple-of-3 problem, this one, you immediately find that any neural network, or whatever basic classifier, would do it with 100% accuracy, as it should; it would be really dumb if it didn't, because this is just a linear transformation. So even a single neuron that's just doing a linear transform is good enough to do it.
For the prime characteristic problem, I did some experiments, oh gosh, like seven years ago, and it got 80% accuracy. And I was like, wow. That was a wow moment. Why is it doing 80%? I don't have a good answer to this. Why is it doing 80% accuracy on this? How is it learning? Maybe it's doing some sieve method, which is kind of interesting somehow.
The second number is just a chi-squared, just to double-check, and then what's called the MCC, the Matthews correlation coefficient. These are just buzzwords in stats. I never learned stats, but now I'm relearning. I took a Coursera course in 2017 so I could relearn all these buzzwords.
It's great.
It's really useful.
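For reference, the MCC he mentions is the standard correlation-style score computed from the binary confusion matrix entries (TP, TN, FP, FN):

```latex
\mathrm{MCC} \;=\; \frac{TP \cdot TN \;-\; FP \cdot FN}
{\sqrt{(TP+FP)\,(TP+FN)\,(TN+FP)\,(TN+FN)}}
```

It runs from +1 (perfect prediction) through 0 (no better than chance) to -1 (total disagreement), which is why an MCC near 0 below is read as a coin toss.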
And then this shifted Liouville lambda function. Sorry, I think I made a mistake: I mistakenly called this the Möbius mu function. It's not. I mean, it's related, but it's not. It's the shifted Liouville lambda function.
Got it.
Sorry, one of my neurons died when I said Möbius mu, but it's Liouville lambda.
You were subject to the one pixel attack.
Yeah, so this one, I couldn't break 50%, right? 0.5 just means it's a coin toss; it's not doing any better than guessing. And the chi-squared is 0.00, which means it's within statistical error. So I couldn't find an AI system which could do better than random guessing. I'm not saying there isn't one. It would be great if there were one. And then, yeah, it's kind of, you know, it's life. And if I do break it, you know, I might actually stand a good chance of breaking every bank in the world.
All right.
But I haven't made it work.
Well, let's remain close friends.
Yeah, that's right.
That's right. So I was very proud of this, because of this experiment, which I'm going to mention a bit later. This Liouville lambda was something I was just trying, way back when. But apparently Peter Sarnak, whom I really admire, he's one of the world's greatest living number theorists, and I got to know him through this murmuration thing that I'm going to talk about later.
And I reminded him that I almost became his undergraduate research student. I was an undergrad at Princeton, where I had two paths I could follow to define my undergrad thesis, right?
So, one was in mathematical physics; that was with Alexander Migdal. And the other one was actually offered by Peter Sarnak, on arithmetic problems. And somehow, because I wanted to understand the nature of space and time,
I went down the Migdal path to do mathematical physics, which led to string theory. And after 20, 30 years, I came full circle, back in Peter Sarnak's world again.
I met him at this conference, I reminded him of this, and he was very happy.
But also, what's really interesting is that he was asking DeepMind the same question a few years ago, about the Liouville lambda: whether DeepMind could do better than 50%. So I was glad that I thought along similar lines as a great expert in number theory.
And somebody who could potentially have been my supervisor. And then I would have gone into number theory instead of string theory, which is, whatever.
It's how life happens.
So perhaps you're going to get to this later on in the talk, but I noticed here you have the word classifier.
And the recent buzz since 2020 or so has been around architectures, the transformer architecture specifically.
So is there anything regarding mathematics, not just LLMs, that has to do with transformer
architecture that's going to come up in your talk?
Not specifically.
I'm actually, it's interesting. One of my colleagues here at the London Institute, Mikhail Burtsev, is our Institute's AI fellow, and he's an expert on transformer architectures. So I've been talking to him, and we're trying to devise a nice transformer architecture to address problems in finite group theory. It's in the works. But nothing so far; even the murmuration stuff is very basic neural networks. We didn't use anything more sophisticated than that.
So whether it will outperform the standard ones is to be determined, which will be kind of interesting.
Got it.
Yeah, so actually now we go way back to the beginning of our conversation.
It's how I got into this stuff.
And that, I don't know, completely coincidentally was through string theory.
So at this point, maybe I'll just give a bit of background on how all this stuff came about. At least personally.
Why was I even thinking about this?
Because I knew nothing about AI, like, seven, eight years ago. Zero, like literally zero. I knew nothing more than what I read in the news.
And this is actually a very interesting story,
which shows again, the kind of ideas
that the string theory community is capable of generating.
Just because you've got all these experts looking at kind of interesting problems.
So let's go way back. And again, I've quoted Gauss; now I have to say something about Euler. So this is a problem. Again, you can see I'm very influenced by the number three. I'm a total numerologist, right? Trinity, the number three; three is something, right?
And there's what's called the trichotomy classification theorem, from Euler.
This dates to 1736.
So if you look at, so I'm going to say the buzzword,
which is connected compact orientable surfaces.
So these are, I mean, the words explain themselves: they have no boundaries, and they're, you know, topological surfaces.
So Euler was able to realize that a single integer characterizes all such surfaces.
So this is the standard thing that people see in topology, right?
So the surface of a ball is the surface of a ball, and you can deform it.
The surface of a football is the same as an American football.
It can deform without cutting or tearing. And then the surface of a donut is the same as your cup, right? Because, as everyone understands, this thing has one handle. And so the surface of a donut is exactly what they call topologically homeomorphic to the cup. And then he got the pretzel. So I think that's a pretzel, or maybe this is like the German pretzel.
And it gets more and more complicated.
But Euler, because, you know, Euler invented the field of topology, he realized this idea of topological equivalence, in the sense that there's a single topological invariant, which we now call the Euler number, which characterizes these things.
An equivalent way to say it is the genus of these surfaces: no handles, one handle, two handles, three handles, and so on and so forth. It turns out that the Euler number is 2 minus twice the genus. So chi is 2 minus 2g. Okay, that's great. So that's the classic Euler theorem.
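In modern notation, for a connected, compact, orientable surface of genus g:

```latex
\chi(\Sigma_g) \;=\; 2 - 2g:
\qquad \chi(S^2) = 2 \;(g=0), \quad
\chi(T^2) = 0 \;(g=1), \quad
\chi(\Sigma_2) = -2 \;(g=2).
```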
And then, you know, comes in Gauss, right?
Once you got these three names next to each other, Euler, Gauss, and Riemann, you know it's gotta be some serious theorem.
So Euler did this in topology.
And then Gauss did this incredible work, which he himself called the Theorema Egregium, the remarkable theorem, which he considered his personal favorite. And this is Gauss.
And Gauss said: you can relate this number, which is purely topological, to metric geometry. So he came up with this concept, which we now call Gaussian curvature. It's just some complicated stuff. You can define this curvature on the surface; this is even before the word manifold existed. And then you can integrate, using calculus, and the integral of this Gaussian curvature divided by 2 pi is exactly equal to this topological number. And that's incredible, right? The fact that you can do an integral and it comes out to be an integer, and that integer is exactly topology. So Gauss related geometry to topology in this one sweep.
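This is the Gauss-Bonnet theorem. For a closed surface of genus g with Gaussian curvature K:

```latex
\int_{\Sigma_g} K \, dA \;=\; 2\pi \,\chi(\Sigma_g) \;=\; 2\pi\,(2 - 2g).
```

Quick check on a round sphere of radius R: K = 1/R^2 everywhere and the area is 4 pi R^2, so the integral is 4 pi; dividing by 2 pi gives exactly 2, the Euler number of the sphere, with all the R's cancelled, just as described later in the conversation.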
And then, what's even the next level: in comes Riemann.
Riemann says, well, what you can do is to complexify.
So these are no longer, you know, real connected compact orientable surfaces,
but you can think about these as complex objects.
So what do we mean by that is, well, if you think about the real Cartesian plane,
that's a two-dimensional object, but you can equally think of that as a one complex dimensional object,
namely the complex plane.
Or the complex line.
Yeah, the complex line, exactly.
So what we call R2, Riemann would call C. And then Riemann realized that you can put a similar
structure on all of these things as well.
So all of a sudden, these things are no longer two-dimensional real orientable surfaces, but one-complex-dimensional objects, what are now called curves. I mean, it's a terrible name: a complex curve is actually a two-real-dimensional surface. And it turns out that all complex curves are orientable, so you already rule out things like Klein bottles or Möbius strips. The complex structure requires orientability, and that's partly because of the Cauchy-Riemann relations. It puts a direction on things; you can't get away from it. But the interesting thing is that all of this should now be thought of as one-complex-dimensional curves. They're called curves because they're one complex dimension, but they're not curves, right? They're surfaces in the real sense.
Yes.
So now, here comes: if you apply this to the Gauss thing, you get this amazing trichotomy theorem.
And the theorem says: if you do this with the curvature, you can see the number here is 2, right? You get the number 2 only for the sphere, which is the positive curvature case. And that's consistent with the fact that the sphere is a positively curved object: locally, everywhere, it has positive curvature.
If you do it for a torus, the surface of a donut, the algebraic donut, you integrate that and you get zero. And this is not a surprise, because you take a sheet of paper, you roll it once, you get a cylinder, and you glue it again and you get this torus, this donut. And this sheet of paper is inherently flat.
Yes.
So if you just take a piece of paper and roll it up, you get a cylinder, and then you do it again, and you get the surface of a donut, like a rubber tire, and that is genuinely zero curvature.
And then you can do this and this is a consequence of what's known as Riemann uniformization
theorem.
If you do anything that has more than one handle, you get zero curvature.
So now you have the trichotomy, right?
Positive curvature, zero curvature, negative curvature.
The one in the middle is obviously the interesting one; it's the boundary case. In complex algebraic geometry, these things are called Fano varieties.
Earlier you said if you have anything that's more than one handle, you have zero curvature.
You meant negative curvature.
Sorry, sorry, I meant negative curvature.
So these fidget spinners on the right, they all have negative curvature.
Everything here has negative curvature.
Got it.
So now, in the world of complex algebraic geometry, these positive curvature things are called Fano varieties, after the Italian geometer Fano. These negative curvature objects, which proliferate, are called varieties of general type. And this boundary case, these zero curvature objects: it just so happens we now call the things in the middle Calabi-Yau.
Yes.
So far, this has got nothing to do with physics.
I mean, it's just the fact of topology, right?
But this is such a beautiful diagram, and it took from 1736 until Riemann; Riemann died in the 1860s, I think, or something like that. So it took about 120 years to really formulate just this table, to relate metric geometry to topology to algebraic geometry.
It's kind of a beautiful thing, right?
So to generalize this table is the centerpiece of what's now called the minimal model program in algebraic geometry, for which there have been all these Fields Medalists: Birkar a couple of years ago, and it started with Mori, who got the Fields Medal, and then Mukai and this whole distinguished line. So basically this minimal model program should just generalize this to higher dimension.
This is complex dimension one, right?
How do you do it? It's very hard.
And once you have it, I won't bore you with the details,
this is very nice, you know, there's topology,
algebraic geometry, differential geometry, index theorem,
they all get unified in this very beautiful way.
And you want to, obviously you want to generalize this
to arbitrary dimension, arbitrary complex dimension.
It would be nice.
It's still an open problem. how do you do it in general?
It's a very nice problem.
But at least for a class of complex manifolds known as Kähler manifolds, I won't bore you with the details, but Kähler manifolds are ones where the metric has very nice behavior: there's a potential whose double derivative gives you the metric.
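In symbols (a standard statement, not a slide from the talk): a Kähler metric comes from a single real potential, often also written K, not to be confused with the curvature above, via

```latex
g_{i\bar{\jmath}} \;=\; \frac{\partial^2 K}{\partial z^i \, \partial \bar{z}^{\bar{\jmath}}},
```

which is the very nice behavior being referred to.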
And then it was conjectured by Calabi in the 50s. Again, you know, '54, '56, '57, it was a great era, right? All these different ideas, in three completely different worlds, now come together, because mathematical physicists have kind of tied up the world of neural networks, the world of the Calabi conjecture, and the world of string theory into one.
I like it, you know, when things get bridged up in this way. But, again, the theorem itself is extremely technical. The idea is that for these Kähler manifolds, there is an analog of this diagram, basically.
I love this slide. I saved this slide for my own private notes.
Oh really? Okay.
I keep a collection of dictionaries in physics and math.
Yeah.
I think this is beautiful.
Yeah, me too. But it took me years to do this table, because, you know, it's not written down anywhere.
And it touches different things.
I think it's not written down anywhere precisely because math textbooks are written in the Bourbaki style. But now it just becomes clear what people have been thinking about for the past 100 years, you know, after Grothendieck; it's just trying to relate these ideas.
You know, this is intersection theory of characteristic classes.
So this is topology.
And, you know, this is over 200 years of work, the central part of the analysis. And mathematicians like Chern, Ricci, Euler, Betti. Everyone involved in this diagram is an absolute legend.
In fact, there is one more column to this diagram.
I think, when I did this, I mean, this was a slide from some time ago, when I was talking to a string audience, there is one more column, which is the relation to L-functions. And that's where number theory comes in. So there is one more column. And to understand this one more column, the behavior of these L-functions, that's the Langlands program.
So it's actually really magical that this table actually extends more.
As far as, I mean, that's just as far as we know now, right?
Of course.
The L functions and its relations to modularity.
And I think this is, of course, obviously, to me, like mathematics is about extending this
table as much as possible to let it go into different fields of mathematics.
So but at least for sure we know there is one because of the Langlands correspondence,
there is one more column and that column should be on number theory and modularity.
And soon there'll be another table on the Yang invariant, the He invariant.
No, I don't think I have enough talent to create something like that, but there could well be something new to do. To me, that's really the most fun part about mathematics. I mean, who is it?
I think maybe it's Arnold who said there are two types of mathematicians: there are the hedgehogs and there are the birds, right? Hedgehogs really, you know, like to...
Specialized.
Specialized.
I mean, you absolutely need it.
I think, you know, who is a great hedgehog? I think Zhang, the guy who made the first major breakthrough on the prime gap. He spent his entire life just trying to think about: can I bound the lim inf of the distance between prime pairs? And the technique he uses is beautifully honed analytic number theory, sieve methods, you know, kind of the Ben Green and James Maynard world of sieves. And then there are the birds, who are like: I'm just going to fly around, and they bump into trees and whatnot, but they're just trying to see what they can do. And people like Robert Langlands are very much in that world: can I see it from a distance? I mean, this is a very coarse-grained view.
And which are you?
I'm 100%
In the bird category.
I mean, once I see something, of course sooner or later you need to dig like a hedgehog, but the biggest thrill I get is when I say: oh wow, this gets connected.
So the results are proven when you dig, but the connections are seen when you get the overview.
Yeah, yeah, absolutely. So, I mean, of course, again, this is a division that's kind of artificial.
In all of us, we do a bit of both.
Yes.
The guy who really did it well, I forgot to mention, and he passed away: John McKay, who was a Canadian, probably the greatest Canadian mathematician since Coxeter. John McKay really saw unbelievable connections in fields where nobody would ever see them. And he passed away. In the last 10 years of his life, he became sort of like a grandfather to me. He saw my kids grow up over Zoom. So the London Math Society asked me to write the obituary. I was very touched by this, so I wrote his obituary for them. And I was just trying to say: well, this guy was the ultimate pattern linker. So John McKay, absolute legend. Great. Moving on. I mean, this is very much a huge digression from what I'm actually going to tell you about, which is, you know, the Birch test for AI. And that's great. Do you have a limit on how long these videos are?
No. Just so you know, some of them are one hour, some of them are four hours, and people listen to all of it.
Yeah, this is great fun.
Great. Yeah, same. I'm loving this. Yeah, me too. Because normally, you know, I'm up against a 55-minute cutoff. I give a talk, then, like, five minutes of questions, and I'm like, oh my god, I haven't said most of the stuff I wanted to say.
Yeah, yeah, exactly.
Because the point of this channel
is to give whoever I'm speaking to enough time
to get through all of their points,
rather than they're rushing and not
covering something in depth.
I want them to be technical and rigorous.
So please continue.
Sure.
Sounds good to me.
So, Calabi. In that magical year of 1957: the magical year of neural networks, of the automated theorem prover world, and of the world of algebraic geometry, three completely different worlds that didn't even know each other's names, let alone the results. Calabi conjectured that, at least for Kähler manifolds, this diagram, this table, is very much well defined.
And Yau proved it 20 years later. Shing-Tung Yau, who is again very much like a mentor to me. And he got the Fields Medal immediately. So you can see why this is so important. He gets the Fields Medal because this idea, following Calabi, is trying to generalize this sequence of ideas of Euler, Gauss, and Riemann.
So it's certainly very important.
So there it is.
We can park this idea.
So Yau showed that there are these Kähler manifolds that have this property, that have the right metric properties. So by metric, I mean distance, you know, something you can integrate over. Because, you know, this integral is messy, right? Even if we do this on a sphere, right? This R has all these cosines and sines, and they've all got to cancel at the end of the day to get 4 pi; then divided by 2 pi you get 2, and that's the only number. Which is kind of amazing stuff. And now you can do this in general. Just as a caveat: Yau showed that this metric exists; he never actually gave you a metric. So the only currently known Ricci-flat metric on these things, in the zero curvature case, is for the torus. Anything above that, we don't know; we just know it exists, and that if you did this integral, you're going to get 2, 5, or whatever the number is. Which is kind of amazing. This is a completely non-constructive proof.
What's interesting is that these automated theorem provers, they seem computational. And it's my understanding that constructivists, people who use intuitionist logic, they don't like... sorry, they like constructive proofs. They don't like non-constructive proofs. In other words, existence proofs without the specific construction.
Right, right.
So it's interesting to me that all of undergraduate math, which has some non-constructive proofs, is included in Lean. So I don't know the relationship between Lean and non-constructive proofs, but that's an aside.
Yeah, that's an aside.
I probably won't have too much to say about it.
Cool. So, back to... I don't know why I went on this digression about string theory, but I just want to say, as a side comment: this is something going back to 1736, which is kind of nice. Oh, by the way, that's actually kind of interesting; I'm going to have to check this again. Just down the street from the Institute is the famous department store, Fortnum & Mason, which I think was established in seventeen-something. It's a great department store. It's not where I usually do my shopping, but it's just a beautiful department store where, you know, Mozart or Haydn might have called in and done their Christmas shopping.
But anyhow, just random thought.
So string theory was just one slide, right? I mean, in some sense, I'm not a string theorist; I don't go quantizing strings. I didn't grow up writing conformal field theories and doing all that stuff. It's just that, for me, it's an input so I can play with more problems in geometry. So string theory is this theory of spacetime that unifies quantum gravity, blah, blah, blah. And it works in 10 dimensions, and we've got to get down to 4 dimensions. So we're missing 6 dimensions. So that's what I'm going to say.
And this amazing paper in 1985 by Candelas, Horowitz, Strominger, and Witten: they were thinking about what the properties of the 6 extra dimensions are. So what is interesting is that by imposing supersymmetry, and this is why supersymmetry is so interesting to me, by imposing supersymmetry and anomaly cancellation, a bunch of not-too-stringent conditions, they hit on the condition that these six extra dimensions have to be Ricci-flat. Ricci-flat you can understand, because these are vacuum Einstein solutions; you want the vacuum string solution. And then there's a condition which you've never seen before, which just happens to be this Kähler condition. They didn't know about this. No physicist until 1985 would know what a Kähler manifold was. And it's complex, and it's complex dimension three. Remember again, I said complex dimension three means real dimension six, right? That's 10 minus 4 is 6, and 6 needs to be complexified into 3.
And again, it's just an amazing fact that in 1985, Strominger, who was a physicist at the Institute for Advanced Study in Princeton, went to Yau and said: can you tell me what this strange condition is, this technical condition I've got? And Yau said: wow, you know, I just got the Fields Medal for this; I think I may know a few things. I was just amazed. It was a complete confluence of ideas that's totally random. And the rest is history.
So in fact, these four guys named this Ricci-flat Kähler manifold Calabi-Yau. So it wasn't the mathematicians who did it. This word Calabi-Yau came from physicists, from string theorists, and now, of course, Calabi-Yau is one of the central pieces. So, Philip Candelas was my mentor at Oxford when I was a junior fellow there, and he tells me this story. He's a very lively guy. He tells me about how this whole story came about, and it's very interesting. And so he and these four guys came up with the word Calabi-Yau. So all of a sudden we now have a name for this boundary case in complex algebraic geometry. So remember we had names before, right? This was the Fano variety, this was varieties of general type, and this boundary case is now called Calabi-Yau.
So what we're seeing with the torus here is a Calabi-Yau onefold?
Exactly, exactly. In fact, the torus is the only Calabi-Yau onefold. It's the only one that's Ricci-flat; I mean, by this classification, it's the only one that's topologically possible. So that's kind of interesting. And then, this is just a comment: I like this title, because I think your series is called TOE. This is a TOE on TOE.
Love it.
I just want to emphasize: this is a nice confluence of ideas between mathematics and physics. But my string theory, what it really is, is this brainchild of interpreting and interpolating between problems in mathematics and physics.
All right, so for example: GR should be phrased in differential geometry. The Standard Model, gauge theory, should be phrased in terms of algebraic geometry and representation theory of Lie groups. And the condensed matter physics of topological insulators should be phrased in terms of algebraic topology.
This idea, I think, is the greatest achievement of 20th-century physics, to me. And I think it's something you would appreciate, since you like tables: here's a dictionary, a list of things in physics, and here's what they are in mathematics. And then you can talk to mathematicians in this language and you can talk to physicists in that language, but they're really the same thing. You know, what's a fermion? It's a spin representation of the Lorentz group. I like that, because it gives a precise definition of what we are seeing around us. Then you have something you can purely play with in this Platonic world. And string theory is really just a brainchild of this translation, this tradition of what's on the left and what's on the right, and let's see what we can do. And sometimes you make progress on the left, and you give insight into stuff on the right; and sometimes you make progress on the right, and you give insight on the left.
Why is it that you call the standard model algebraic geometry?
Because bundles and connections are part of differential geometry, no?
Oh yeah, that's true.
Well, I think, yeah, they're interlinked. And maybe it's because of Atiyah and Hitchin; of course, they are fluent in both. So yeah, they go either way. But algebraic in the sense that you can often work with bundles and connections without actually doing the integrals of differential geometry. So I think that's the part I want to emphasize. You can understand bundles purely as algebraic objects, without ever doing an integral. Like here, for example: this integral is obviously something you would do in differential geometry. But the fact that it comes out to be an integer was explained through the theory of Chern classes. This integral is a pairing, through the Chern class, between homology and cohomology, which is a purely algebraic thing. You know, we all try to avoid doing integrals, because integrals are horrible; they're hard to do. And in this language, it really just becomes polynomial manipulation, and it becomes much simpler.
Okay.
So that's the sense in which I want to put it. Of course, you know, it's a bit of both.
Got it.
So, I like doing this diagram, right? And, you know, if you look at the time lag between
the mathematical idea and the physical realization of that idea, there really is a confluence.
Yeah.
Yeah.
It's getting closer.
I mean, these things going up and down.
I mean, I'm just saying, if you take the last 200 years or so,
last 100 years or so, of the groundbreaking ideas in physics,
there is this...
Interesting.
Right.
It gets shorter and shorter.
Obviously, Einstein took the ideas of Riemann, and, you know, there was a sixty-year gap. Dirac was able to come up with the equation of the electron essentially because of Clifford algebras.
Did he, historically, was he motivated by Clifford algebras, or was it later realized: hey, Dirac, what you're doing is an example of a Clifford algebra?
So I believe the story goes: in order to write down a first-order-in-time version of the Klein-Gordon equation, which is second order, you know, that's the bosonic one, he essentially had to factorize the operator into matrices, in a way that seemed very strange to him. And Dirac said: this really reminded me of something that I've seen before.
And this is one of these moments, right? Today, we could ChatGPT this. But what Dirac did: he was at St. John's in Cambridge at the time, and he said, I have seen this in a textbook before somewhere, you know, this gamma-mu, gamma-nu thing. And then he said, I need to go to the library to check this. So he really knew about this. And unfortunately, the St. John's library was closed that evening. So he waited until the morning, until the library was open, to go to Clifford's book, or a book about Clifford; I can't remember whether it was Clifford's own book or one of those. And then he opened it up, and he knew that this gamma-mu gamma-nu anti-commutation relation really was there. So he knew about Clifford.
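For the record, the relation in question is standard textbook material, not a quote from the episode: factorizing the Klein-Gordon operator as the square of a first-order operator forces the gamma matrices to satisfy the defining relation of a Clifford algebra,

```latex
\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu \;=\; 2\,\eta^{\mu\nu}\,\mathbf{1},
\qquad (i\gamma^\mu \partial_\mu - m)\,\psi = 0,
```

which is exactly what makes the Dirac equation square back to the second-order Klein-Gordon equation.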
Cool.
It's kind of interesting, yeah. Just like Einstein knew about Riemann's work on curvature. But, you know, whether you say Dirac was really inspired by Clifford: well, he certainly did a funky factorization, and then he knew how to justify it immediately by looking at the right source. And then similarly, you know, Yang-Mills theory depended on Seifert's book on topology.
And then, you know, by the time you get to Witten and Borcherds, really there's this... This diagram, for me, is what gets me excited about string theory. Because string theory is a brainchild of this curve, this orange curve. And now it's getting mixed up. I mean, of course, you know, people hear this great quote of Witten's: string theory is a piece of 21st-century mathematics that happened to fall into the 20th century. And I think he means this.
Yes.
You know, that he was using supersymmetry to prove theorems in Morse theory, and vice versa.
Richard Borcherds was using vertex algebras, which are sort of foundational things in conformal field theory, to prove some properties of the monster group. We're at this stage. And of course, you know, that was the turn of the last century, and now we're here, and I have to ask: where are we now? Are we criss-crossed, or are we parallel?
Yeah, it's hard to say. And in a meta manner, you can even interpret this diagram as the pair of pants in string theory, with the worldsheet.
Cute. Very cute. Yeah, why not?
Yeah, it is.
But going back to what you were saying, how I got to...
Oh yeah, so just, yeah, this confluence idea. Of course, you know, everyone quotes these two papers. Wigner, thinking in '59 about why mathematics is so unreasonably effective in physics. And then there's this maybe slightly less known, but certainly equally important, paper by the late, great Atiyah, together with Dijkgraaf and Hitchin, which is the other way around: why is physics so effective in giving ideas in mathematics? So this is a beautiful pair of essays. The latter is very much a summary of how the kind of physics ideas coming from string theory are making such beautiful advances in geometry. So it's a very beautiful pair, one giving into the other, that needs to be, you know, praised more.
And that's why, you were mentioning earlier how I got to know Roger Penrose: it's through this editorial collection we tried to put together with my colleague Mo-Lin Ge, who is a former director of the Chern Institute. Everybody's connected, right? So it just so happens that I grew up in the West, but after so many decades, my parents actually retired and went back to Tianjin, where Nankai University is, where Chern founded what's now called the Chern Institute for Mathematical Sciences. And that's an institute devoted to the dialogue between mathematics and physics. In fact, one third of Chern's ashes is buried outside the Math Institute. There's a great, beautiful marble tomb.
Interesting.
And it's not because of any mathematical reason; it's just that he considered three places his home. So: his hometown in Zhejiang, China; Berkeley, where he spent most of his professional career; and Nankai University, where he retired for the last 20 years of his life. So a third each.
Yes.
The number three comes up again.
It's all about three.
And in fact, I was going to joke. So in Chern-Simons theory, in three dimensions, there's this topological theory, Chern-Simons theory, and there's a crucial factor of one third. I always joke, you know, that's why Chern chose one third for his ashes, but that's of course a complete coincidence. What is actually interesting is that that tomb, that beautiful black marble tomb, for somebody as great as Chern, it mentions nothing about, you know, he's achieved this, done the other thing. It's just one page of his notebook. I mean, think about the poor guy who had to chisel all that; he had no idea what he was chiseling, right? The guy who chiseled this thing. And it's the proof of this, of this theorem. And of course, you can look this up on the internet; just search for the grave of S. S. Chern at Nankai University.
Well, the whole conversation we've had is just about pattern matching without the intuitive
understanding behind it. So this chiseler may have had that.
Yeah, yeah, yeah. That's what I do every day. Love it. So that chiseling is essentially his proof of why this is equal to this: why this intersection product is the same as this integral. So the Gauss-Bonnet theorem is actually a corollary of this result in algebraic geometry, which is his great achievement.
But anyhow, yeah, back to this coincidence. It just so happens that my parents, after drifting all these years abroad, retired back to Tianjin, where the Chern Institute is. So that's why I became an honorary professor at Nankai; I mean, my motivation was purely so that I could spend time with my parents. But it just so happens that I can also pay my homage to Chern, just to see his grave. I mean, it's a mind-blowing experience just to see Chern's grave, and to see the derivation of this in his handwriting, chiseled in stone.
But anyway, so that's how I got involved with C. N. Yang, because he was very deeply involved with Chern. He and Chern were good friends. And imagine: C. N. Yang is 102 today.
Yeah, it's remarkable.
And he was still going; he wrote the preface to this when he was 99. These guys are unstoppable. And, you know, Roger Penrose sent his essay for this one when he was, what, 90, 92.
Yeah.
These guys are... anyhow. You like tables, right? I love tables. So here's just, you know, a speculation of where string theory is going. Here's a list of the annual conferences, the series where string theory has been happening. So 1986 was the first string revolution, and since then, every year, there's been a major Strings conference. I'm going to the first one I've been to in years, in two weeks' time; it happens to be in Abu Dhabi, I guess. And then, you know, there's a series of annual ones, the String Pheno series, and String-Math came in as late as 2011. That's kind of interesting. So that's like, you know, 30 years after the first Strings conference, and the various other ones. What's really interesting is that in 2017 there's the first String Data. This is when AI entered string theory.
And so, I wrote the first paper in 2017 on this AI-assisted stuff, and there were three other groups independently mining different AI aspects and how to apply them to string theory. So the reason I want to mention this is just: why was the string community even thinking about these problems in AI?
Oh, and also just to be clear, briefly speaking,
I'm not a fan of tables per se.
I'm a fan of dictionaries
because they're like Rosetta stones.
So I'm a fan of Rosetta stones
and translating between different languages.
So you mentioned the siloing earlier
and mathematicians call,
even physicists call them dictionaries,
but technically they're thesauruses.
Like a dictionary, you just have a term
and then you define it.
The translations.
Right, right. Like Rosetta stones, yes.
Yeah, absolutely. I guess that's why you like Langlands so much.
Yeah, for sure. Absolutely. In some way, this whole channel is a project of a Rosetta stone between the different fields of math and physics and philosophy.
Yeah, that's fantastic. Love it. Big fan.
Thank you.
OK, so, do you want to just... I noticed it jumped back to number 13. So it seems like, I thought we were at 39 out of 40.
No, no, no, because I've learned this nonlinear structure. Because, you see, I've learned this. This is really dangerous. I've learned the click button in PDF presentations. You click it, it jumps to another slide, and you can have interludes. So you know it's clearly an interlude, and then you jump back to your main presentation. So my actual main presentation is only, like, you know, 30 pages, but it's got all these digressions, which is actually very typical of my personality.
So I gave you this big interlude about string theory and Calabi-Yau manifolds, right? So now we've already got to the point that for the Calabi-Yau onefold, the one-complex-dimensional Calabi-Yau, there's only one example. That's just this one, the torus. And then it turns out that in complex dimension two, there are two of these. There is the four-dimensional torus, and then there's this crazy thing called the K3, which is Ricci-flat and Kähler. So you've got one in complex dimension one, two in complex dimension two. You would think that in complex dimension three, there'd be three of these things that are topologically distinct. And unfortunately, this is one of those sequences in mathematics that goes: one, two, we have absolutely no idea.
And we know at least one billion.
At least.
So it goes one, two, a billion. And starting from complex dimension three, it just goes crazy. It's still a conjecture of Yau that in every dimension this number is finite.
So, remember this positive curvature thing, this Fano case, at the very top. It is a theorem that in every dimension, Fano varieties are finite in topology: there are only a finite number of these that are topologically distinct. It's also known that the negative curvature case is infinite in every dimension. And as the dimension goes higher, it's even uncountably infinite.
Oh, interesting.
But this boundary case: Yau conjectures that, in an ideal world, they're also finite. But we don't know. This is the open conjecture.
Now the billion, are any of them constructed or is it just the existence?
Yeah, yeah. That's it. That's exactly where we're getting. So it goes one, two, and then three is like, you know: how are you even going to list these things?
Right.
And then...
Algebraic geometers never really bothered listing them all out. This is just not something they do. So the physicists took on the challenge. So Philip Candelas and collaborators, and then Harald Skarke and Maximilian Kreuzer, started just listing these. And that's why we have these billions. There are actually databases of these.
And they're presented just as matrices, like this. I won't bore you with the details of these matrices. These algebraic varieties, you can define as intersections of the zero sets of polynomials; that's one way to represent them. And in Kreuzer and Skarke's database, they put in vertices of polytopes, giving toric varieties in which the Calabi-Yaus are embedded. But the upshot is that, you know, there's a database of many, many gigabytes that really got done by the turn of the century, by the year 2000, and these guys were running on Pentium machines.
I mean, this is an absolute feat, especially Kreuzer and Skarke. They were able to get 500 million of these Calabi-Yau manifolds, stored on a hard drive, using a Pentium machine. And they were able to compute topological invariants of these.
So I happened to have this database. I could access it, and that was kind of fun. And I've been playing on and off with it for a number of years.
So a typical calculation is: you have something like a configuration matrix, a tensor of integers, and you have some standard method in algebraic geometry to compute topological invariants. These topological invariants, again, in this dictionary, mean something. For example, h^{2,1}, in some contexts, is the number of generations of fermions in the low-energy world. So that's a concrete problem: computing topological invariants in algebraic geometry. And there are methods to do it. And in these databases, people took 10, 20 years to compile them, and you've got these things in. And they're not easy. It's very complicated to compute these things.
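To pin down the flavor of such a calculation with the most famous example (a standard fact, not one of the slides): the quintic threefold, the vanishing locus of a degree-5 polynomial in projective 4-space, has Hodge numbers and Euler characteristic

```latex
h^{1,1} = 1, \qquad h^{2,1} = 101, \qquad
\chi \;=\; 2\,\bigl(h^{1,1} - h^{2,1}\bigr) \;=\; -200.
```

So the label attached to a configuration is a small set of integers like these.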
So in 2017, I was playing around with this. And the reason why I was playing around with this was very simple. It's because my son was born, and I had infinitely many sleepless nights, and I couldn't do anything, right? You know, he wakes you up at two, and I was bottle-feeding him. And I had a daughter at the time, so my wife was taking care of the daughter. They're passed out, and then I've got this kid; I feed him, put him into bed, and I'm wide awake at this point. It's like 2 a.m. I can't fall asleep anymore, and I can't do real, you know, serious computation anymore, because I'm just too tired. So: let's just play around with data. At least I can let the computer help me do something. And that's when I learned, you know, what's this thing that everybody's talking about? Well, you know, it's machine learning. Right. So that's why I got into this. It's a very simple, very biological reason why I was trying to learn machine learning.
So then, I think I was hallucinating at some point, right? I was like, well, these matrices look a lot like pictures. And we're talking about, you know, 500 million of these things, right?
Yes.
Certainly I wasn't going through all of them. And they're being labeled by topological invariants. How different is it if I just sort of pixelated one of these and labeled it by its invariant? And all of a sudden, this began to look like a problem in handwritten digit recognition, right? How different is this from image recognition? So I just literally started feeding them in. I took, I mean, 500 million is too much, right? So I took like 5,000 of these, 10,000 of these, and then trained a network to look at these and recognize this number. And I was like, this is just going to give crap, obviously. It's going to give 0% accuracy. And to my surprise, it was giving extremely good accuracies. So somehow the neural network that I was training, and I was even using standard MNIST-style handwritten-digit-recognition architectures, was recognizing this. And it was recognizing it to great accuracy.
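Here is a minimal sketch of the shape of that experiment, with stand-in data; the real version pairs the integer configuration matrices from the databases just described with their computed topological invariants, which are not bundled here.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Stand-in data, for shape only: in the real experiment, X holds flattened
# integer configuration matrices and y holds a computed invariant such as a
# Hodge number. Random labels, as here, should score at chance.
X = rng.integers(0, 6, size=(5_000, 12 * 15))  # "pixelated" integer matrices
y = rng.integers(0, 20, size=5_000)            # pretend topological invariants

clf = MLPClassifier(hidden_layer_sizes=(128, 64), max_iter=300, random_state=0)
print("5-fold CV accuracy:", cross_val_score(clf, X, y, cv=5).mean())
```

On the actual databases, accuracies like the roughly 80% and later 99%+ figures mentioned in the conversation are reported; on this random stand-in, the score correctly collapses to chance.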
And now, people have improved this. Like, you know, in Italy, there's a group there that did some serious work on just this problem. But this idea suddenly didn't seem so crazy anymore. The idea had seemed completely crazy to me, because I was hallucinating at 2 a.m.
So what's the upshot of this? The upshot is that somehow the neural network was doing algebraic geometry, this kind of algebraic geometry, really sequence chasing, very complicated, without knowing anything about algebraic geometry. It somehow was just doing pattern recognition, and somehow it's beating us, because if you do this computation seriously, it's of double-exponential complexity. But pattern recognition is bypassing all of that.
So then I became a fanatic, right? Then I said: well, all of algebraic geometry is image processing. And so far the algebraic geometers haven't been shocked, because it's actually true if you really think about it. The point of algebraic geometry, the reason I like algebraic more than differential, is that there's a very nice way to represent manifolds this way. In differential geometry, manifolds are defined in terms of Euclidean patches, and then you do transition functions, which are differentiable, C-infinity, blah, blah, blah.
But in algebraic geometry, they're just vanishing loci of polynomials. And once you have systems of polynomials, you have a very good representation. So for example, here, this is just recording the list of polynomials, the degrees of the polynomials, and the space they're embedded in. And that really is algebraic geometry. So basically any algebraic variety, which is the fancy way of saying this polynomial representation of a manifold, is representable in terms of a matrix or a tensor, sometimes even an integer tensor. And then the computation of invariants, topological invariants, is a recognition problem on such tensors. But once you have a tensor, you can always pixelate it and, you know, picturize it. At the end of the day, it's doing image processing in algebraic geometry.
Now, do you mean to say every problem in algebraic geometry is an image-processing problem?
Oh, almost.
Or just problems involving invariants, or even broader than that?
Well, I think it really is broader. I think, you know, at some level, in my view, I'd say: bottom-up mathematics is language processing, and top-down mathematics is image processing.
Interesting.
Of course, take this with a caveat; it's an extreme thing to say. But at some level, there is truth in what I say.
But, you know, in terms of what mathematical discovery is: you're trying to find a pattern in mathematics. Algebraic geometry is the perfect example: you can pixelate everything, and you can try to see that certain images have certain properties. So there, you're image-processing mathematics. Whereas bottom-up, you're building up mathematics as a language, so it's language processing. And of course, all of this would be useless if you can't actually get human-readable mathematics out of it. So, this is the first surprise.
The fact that it's even doing it at all to a certain degree of accuracy.
Now, the accuracy has been improved to, like, 99.99 percent on these databases. But that's the first level, the first surprise.
The second surprise is that you can actually extract
human understandable mathematics from it.
And I think that's the next level surprise.
So the murmuration conjectures, and this beautiful work at DeepMind that Geordie Williamson is involved in, this human-guided intuition: you can actually get human mathematics out of it.
That's really quite something.
So maybe that's a good point to break for part two. We've gone through many things about what mathematics is and how it developed through this interaction between algebra, geometry, and string theory. And then a second part would be how you can actually extract mathematics, actual conjectures, things to prove, from doing this kind of experimentation, which is summarized in these books.
I keep on advertising my books because I get 50 pounds per year of, what do they call it, royalties,
so I don't have to sell my liver
for my kids.
But it's actually kind of fun.
Academic publishing is a joke, right? You get like a hundred pounds a year; you don't actually make money out of it.
But maybe that's a good place to break.
And then for part two, how we try to formulate what the Birch test is for AI, which is sort of the Turing test plus.
Because the Birch test is how to get actual meaningful human
mathematics out of this kind of playing around with mathematical data.
I see two of your sentences becoming maxims for the future: that machine learning is the 22nd century's mathematics that fell into the 21st, and that bottom-up mathematics is language processing while top-down is image processing. Yeah, I like those two.
Yeah. Anyone who's watching if you have questions for Yang Hui for part two, please leave them
in the comments. Do you want to give just a brief overview?
Oh yeah, sure.
So, I'm going to talk about what the Birch test is and how closely the papers so far have come to passing the Birch test.
And then I'm going to talk about some more experiments in number theory.
And the one that I really enjoyed doing with my collaborators, Lee, Oliver, and Pozdnyakov, which is to actually find something meaningful related to the Birch and Swinnerton-Dyer conjecture just by letting the machine go crazy and finding a new pattern in elliptic curves, which is fundamentally a new pattern in the prime numbers, which is completely amazing. You mentioned Quanta earlier; the Quanta feature that covered this considered it one of the breakthroughs of 2024.
Great. And that word, murmuration, which was used repeatedly throughout, was never defined, but it will be in part two.
Absolutely.
I'm looking forward to it.
Me too.
Okay, thank you so much. Thank you. This has been wonderful. I could continue speaking to you for four hours, but both of us have to get going.
That was so much fun. Pleasure.
Don't go anywhere just yet. Now I have a recap of today's episode brought to you by The Economist.
Just as The Economist brings clarity to complex concepts,
we're doing the same with our new AI-powered episode recap.
Here's a concise summary of the key insights from today's podcast.
All right, let's dive in.
We're talking about Curt Jaimungal and his deep dives into all things mind-bending.
You know this guy puts in the hours, like weeks, prepping to grill guests like Roger
Penrose on some wild topics.
Yeah, it's amazing using his own background to dig in.
Really challenging guests with his knowledge of mathematical physics pushes them beyond
the usual.
Definitely.
And today we're focusing on his chat with mathematician Yang-Hui He.
They're getting into AI, math, where those two worlds collide.
And it's fascinating because it really makes you think differently about how math works,
how we do math, and where AI might fit into the picture.
You might think a mathematician's life is all formulas and proofs, but Yang-Hui He actually started exploring AI-assisted math while dealing with sleepless nights with his newborn son.
It's such a cool example of finding inspiration when you least expect it.
Tired but inspired, he started messing around with
machine learning in those quiet early morning hours.
So let's break down this whole AI and math thing. Yang-Hui He talks about three levels of math: bottom-up, top-down, and meta. Bottom-up is like building with Legos. Very structured,
rigorous proofs. That's the foundation. But here's where things get really interesting.
It has limitations.
Right. And those limitations are highlighted by Gödel's incompleteness theorems. Basically,
Gödel showed us that even in perfectly logical systems, there will always be true statements
that can't be proven within that system. It's mind-blowing.
So if even our most rigorous math has these inherent limitations, it makes you think.
Could AI discover truths that we as humans bound by our formal systems might miss?
Could it explore uncharted territory?
That's a really deep thought. And it's really at the core of what makes this conversation revolutionary.
It's not about AI just helping us with math faster. It's about AI possibly changing how we think about math altogether.
So how is this all playing out? We've had computers in math for ages, from early theorem provers to proof assistants like Lean. But where are we now with AI actually doing math?
Well, AI is already making some big strides. It's tackling Olympiad-level problems and doing it well,
which makes you ask, can AI really unlock the secrets of math?
And that leads us to the big philosophical questions. Is AI really understanding these
mathematical ideas, or is it just incredibly good at spotting patterns? It's like that famous Chinese
room thought experiment. You could follow rules to manipulate Chinese symbols without truly
understanding the language. Yang-Hui He shared a story about Andrew Wiles, the guy who proved
Fermat's last theorem, trying to challenge GPT-3 with some basic math problems.
It highlights how early AI models, while excelling in tasks with clear rules and plenty of examples,
struggled with things that needed real deep understanding.
It seems like AI's strength right now is in pattern recognition. And that ties into what
Yang-Hui He calls top-down mathematics. It's where intuition and seeing connections between
different parts of math are king. Like Gauss. He figured out the prime number
theorem way before we had the tools to prove it. It shows how a knack for
patterns can lead to big breakthroughs even before we have the rigorous
structure. It's like AI is taking that intuitive leap, seeing connections that
might have taken us humans years even decades to figure out. And it's all because AI can deal
with such massive amounts of data.
Which brings us back to Yang-Hui He's sleepless nights. He started thinking about Calabi-Yau manifolds,
super complex mathematical things key to string theory
as image processing problems.
Wait, Calabi-Yau manifolds?
Those sound like something straight out of science fiction.
They're pretty wild.
Think six dimensions all curled up, nearly impossible to picture.
They're vital to string theory, which tries to bring all the forces of nature together.
Now mathematicians typically use these really abstract algebraic geometry techniques for
this, but Yang-Hui He had a different thought.
So instead of equations and formulas, he starts thinking about pixels.
Yeah. Like taking a Calabi-Yau manifold, breaking it down into a pixel grid, like you do with an
image. He's taking abstract geometry and turning it into something a neural network built for image
recognition can handle.
That is a radical change in how we think about this. It's like he's making something incredibly
abstract, tangible, translating it for AI. Did it even work?
The results blew people away.
He fed these pixelated manifolds into a neural network, and it predicted their topological
properties really accurately.
He basically showed AI could do algebraic geometry in a whole new way.
So it's not just speeding up calculations.
It's uncovering hidden patterns and connections that might have stayed hidden, like opening a new way of
seeing math. And that leads us to the big question. If AI can crack open complex
math like this, what other secrets could it unlock? We're back. Last time we were
talking about AI not just helping us with math, but actually coming up with
new mathematical insights, which is where the Birch test comes in.
It's like, can AI go from being a super calculator to actually being a math partner?
Exactly.
And now we'll look at how researchers like Yang-Hui He are trying to answer that.
Remember the Turing test was about a machine being able to hold a conversation like a human.
The Birch test is a whole other level.
It's not about imitation.
It's about creating completely new mathematical ideas.
Think about Brian Birch back in the 60s.
He came up with this bold conjecture about elliptic curves
just from looking at patterns in numbers.
So this test wants AI to do similar leaps,
to go through tons of data, find patterns,
and come up with conjectures that push math forward.
Exactly.
Can AI, like Birch,
show us new mathematical landscapes?
That's asking a lot.
So how are we doing?
Are there any signs AI might be on the right track?
There have been some promising developments,
like in 2021, Davies and his team used AI to explore knot theory.
Knots, like tying your shoelaces,
what's that got to do with advanced math?
It's more complex than you think.
Knot theory is about how you can embed a loop
in three-dimensional space and it actually connects to things like
topology and even quantum physics.
Okay, that's interesting. So how does AI come in?
Well, every knot has certain mathematical properties called invariants.
It's kind of like its fingerprint. Davies' team used machine learning to
analyze a massive amount of these invariants.
So was the AI just crunching numbers?
Or was it doing something more?
What's amazing is the AI didn't just process the data, it actually found hidden relationships
between these invariants, which led to new conjectures that mathematicians hadn't even
considered before, like the AI was pointing the way to new mathematical truths.
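As a rough illustration of that workflow, here is a hedged sketch on entirely synthetic data: train a model to predict one "invariant" from the others, then ask which inputs carry the signal. In the real Davies et al. work the inputs were hyperbolic knot invariants and the target was the signature; here every column is a made-up stand-in.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(1)
n = 3000

# Synthetic "invariants": columns 0-4 are noise; column 5 secretly
# determines the target, mimicking a hidden relationship waiting
# to be discovered.
X = rng.normal(size=(n, 6))
y = 2.0 * X[:, 5] + 0.1 * rng.normal(size=n)

model = RandomForestRegressor(n_estimators=100, random_state=1).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=1)
for i, score in enumerate(result.importances_mean):
    print(f"invariant {i}: importance {score:.3f}")
# A spike at invariant 5 is the machine's hint: look there for a conjecture.
```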
That's wild.
Sounds like AI is becoming a powerful tool
to spot patterns our human minds might miss.
Absolutely.
Another cool example is Lample and Charton's work in 2019.
They trained AI on a massive data set of math formulas.
And what did they find?
Well, this AI could accurately predict
the next formula in a sequence,
even for really complex ones.
It was like the AI was learning the grammar of math and could guess what might come next.
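In that spirit, here is a small sketch of the data-generation step such a system rests on. The published work trained sequence models on integration and differential-equation pairs; the expressions and pairing below are simplified illustrations of the same idea: generate random expressions, differentiate them, and keep (derivative, expression) string pairs that a sequence-to-sequence model could learn to invert.

```python
import random
import sympy as sp

x = sp.symbols("x")
atoms = [x, x**2, sp.sin(x), sp.cos(x), sp.exp(x)]

def random_expression(depth=2):
    # Build a small random expression tree from the atoms above.
    if depth == 0:
        return random.choice(atoms)
    a = random_expression(depth - 1)
    b = random_expression(depth - 1)
    return a + b if random.random() < 0.5 else a * b

random.seed(0)
pairs = []
for _ in range(5):
    f = random_expression()
    # Training pair: read the derivative, emit the original expression,
    # so a sequence model effectively learns integration as translation.
    pairs.append((sp.srepr(sp.diff(f, x)), sp.srepr(f)))

for src, tgt in pairs[:2]:
    print(src, "->", tgt)
```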
So we might not have AI writing full-blown proofs yet, but it's getting
really good at understanding the structure of math and suggesting new
directions. And that brings us back to Yang-Hui He. His work with those Calabi-Yau
manifolds, analyzing them as pixelated forms, that was a huge breakthrough. Showed
that AI could take on algebraic geometry
problems in a totally new way. Like bridging abstract math in the world of data and algorithms.
Exactly. And that bridge leads to some really mind-bending possibilities. Yang-Hui He and his colleagues started exploring something they call murmuration. Murmuration? Like birds?
It's a great analogy. Think of a flock of birds moving together like one.
Each bird reacts to the ones around it
and you get these complex, beautiful patterns.
Right, I get it.
But how does it relate to AI and math?
Well, Yang-Hui He sees a parallel
between how birds navigate together in a murmuration
and how AI can guide mathematicians towards new insights
by sifting through tons of math data.
So the AI is like the flock, exploring math and showing us where things get interesting.
Yeah.
And they've actually used this murmuration idea to look into a famous problem in number theory, the Birch and Swinnerton-Dyer conjecture.
That name sounds a bit intimidating. What's it all about?
Imagine a doughnut shape, but in the world of numbers. These are called elliptic curves.
Mathematicians are obsessed with finding rational points on these curves, points where the coordinates
can be written as fractions.
Okay, I'm following so far.
The Birch and Swinerton-Dyer conjecture basically says there's this deep connection between
how many of these rational points there are and a specific math function, like linking
the geometry of these curves to number theory.
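To see what "rational points" means concretely, here is a tiny brute-force sketch on an arbitrary example curve (not one from the episode): search for small fractions x and y satisfying y^2 = x^3 - x + 1.

```python
from fractions import Fraction

def on_curve(x, y):
    # The (arbitrary) example curve: y^2 = x^3 - x + 1.
    return y * y == x**3 - x + 1

points = set()
for xn in range(-10, 11):
    for xd in range(1, 6):
        x = Fraction(xn, xd)
        for yn in range(-40, 41):
            for yd in range(1, 6):
                y = Fraction(yn, yd)
                if on_curve(x, y):
                    points.add((x, y))

# Prints points like (0, 1), (1, -1), (3, 5): coordinates that are fractions.
print(sorted(points)[:10])
```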
Things are definitely getting complex now.
And it's a big deal in math.
It's actually one of the Clay Mathematics Institute's
Millennium Prize problems.
Solve it, you win a million bucks.
Now that's some serious math street cred.
So how did Yang-Hui He's team use AI for this?
They trained an AI on this massive data set
of elliptic curves and their functions.
The AI didn't actually solve the whole conjecture,
but it found this new pattern, this correlation,
that mathematicians hadn't noticed before.
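A hedged sketch of the kind of computation behind that discovery: compute a_p = p + 1 - #E(F_p) for a small toy family of curves y^2 = x^3 + ax + b, then average a_p over the family at each prime. The real experiments used large curated databases like the LMFDB, with curves sorted by rank and conductor; the family below is only illustrative.

```python
from sympy import primerange

def count_points(a, b, p):
    # Naive count of points on y^2 = x^3 + a*x + b over F_p,
    # including the point at infinity.
    square_counts = {}
    for y in range(p):
        s = (y * y) % p
        square_counts[s] = square_counts.get(s, 0) + 1
    total = 1  # point at infinity
    for x in range(p):
        rhs = (x**3 + a * x + b) % p
        total += square_counts.get(rhs, 0)
    return total

# Toy family of curves with nonzero discriminant.
curves = [(a, b) for a in range(-5, 6) for b in range(-5, 6)
          if 4 * a**3 + 27 * b**2 != 0]

for p in list(primerange(3, 60)):
    avg_ap = sum(p + 1 - count_points(a, b, p) for a, b in curves) / len(curves)
    print(f"p = {p:3d}   average a_p = {avg_ap:+.3f}")
# Plotting these averages against p for a family sorted by rank is what
# revealed the oscillating, flock-like murmuration pattern.
```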
So the AI was like a digital explorer,
mapping out this math territory
and showing mathematicians what to look at more closely.
Exactly.
This discovery, while not a complete proof,
gives more support to the conjecture and opens
up some exciting new areas for research.
It shows how AI can help with even the hardest problems in mathematics.
It feels like we're on the edge of something new in math.
AI is not just a tool, it's a partner in figuring out the truth.
What does all this mean for math in the future?
That's a great question, and it's something we'll dig into in the final part of this deep dive. We'll look at the philosophical and ethical stuff around AI in math.
We'll ask if AI is really understanding the math it's working with, or if it's just
manipulating symbols in a really fancy way. See you there. Welcome back to our Deep Dive. We've
been exploring how AI is changing the game in math, from solving tough problems
to finding hidden patterns in complex structures.
But what does it all mean?
What are the implications of all of this?
We've touched on this question of understanding.
Does AI really understand the math it's dealing with, or is it just a master of pattern
matching?
Yeah, we can get caught up in the cool stuff AI is doing.
But we can't forget about those implications.
If AI is going to
be a real collaborator in mathematics, this whole understanding question is huge.
It goes way back to the Chinese room thought experiment. Imagine someone who doesn't speak
Chinese has this rulebook for moving Chinese symbols around. They can follow the rules
to make grammatically correct sentences, but do they actually get the meaning?
So is AI like that, just manipulating symbols in math
without grasping the deeper concepts?
That's the big question, and there's no easy answer.
Some people say that because AI gets meaningful results,
like we've talked about, it shows some kind of understanding,
even if it's different from how we understand things.
Others say AI doesn't have that intuitive grasp
of math concepts that
we humans have.
It's a debate that's probably going to keep going as AI gets better and better at math.
Makes you wonder how it's going to affect the foundations of mathematics itself.
That's a key point. Traditionally, mathematical proof has been all about logic, building arguments
step by step using established axioms and theorems. But AI brings something new, inductive
reasoning, finding patterns, and
extrapolating from those patterns.
So could we see a change in how mathematicians approach proof?
Could we move toward a way of doing math that's driven by data?
It's possible.
Some mathematicians are already using AI as a partner in the proving process.
AI can help generate potential theorems or find good strategies for tackling conjectures.
But others are more cautious, worried that relying too much on AI could make math less
rigorous, more prone to errors.
It's like with any new tool, there's good and bad.
Finding that balance is important.
We need to be aware of the limitations and not rely on AI too much.
Right.
And as AI becomes more important in math, it's crucial to have open and honest conversations.
We need to talk about what AI means, not just for math, but for everything we do.
It's not just about the tech, it's about how we choose to use it.
We need to make sure AI helps humanity and the benefits are shared. That's everyone's responsibility.
A responsibility that goes way beyond just mathematicians and computer scientists.
We need philosophers, ethicists, social scientists,
and most importantly, the public.
We need all sorts of voices and perspectives
to guide us as we go into this uncharted territory.
This has been an amazing journey
into the world of AI and math.
From sleepless nights to those mind-bending manifolds,
we've seen how AI is pushing the boundaries
of what's possible.
And as we wrap up, we encourage you to keep thinking about these things.
What does it really mean for a machine to understand math?
How will AI change the way we prove things and make discoveries in math?
How can we make sure we're using AI responsibly and ethically in our search for knowledge?
These are tough questions, but they're worth asking.
The future of mathematics is being shaped right now,
and AI is a major player.
Thanks for joining us on this deep dive.
We'll catch you next time,
ready to explore some other fascinating corner
of the universe of knowledge.
New update: I started a Substack.
Writings on there are currently about language
and ill-defined concepts,
as well as some other mathematical details.
Much more being written there. This
is content that isn't anywhere else. It's not on theories of everything. It's not on
Patreon. Also full transcripts will be placed there at some point in the future.
Several people ask me, hey Curt, you've spoken to so many people in the fields of theoretical physics, philosophy, and consciousness. What are your thoughts? While I remain impartial in interviews, this Substack is a way to peer into my present
deliberations on these topics.
Also, thank you to our partner, The Economist.
Firstly, thank you for watching, thank you for listening.
If you haven't subscribed or clicked that like button, now is the time to do so.
Why? Because each subscribe,
each like helps YouTube push this content to more people like yourself, plus it helps
out Curt directly, aka me.
I also found out last year that external links count plenty toward the algorithm, which means that whenever you share on Twitter, on Facebook, or even on Reddit, etc., it shows YouTube, hey, people are talking about this content outside of YouTube, which in turn greatly aids the distribution on YouTube.
Thirdly, there's a remarkably active Discord and subreddit for Theories of Everything, where people explicate TOEs, disagree respectfully about theories, and build as a community our own TOE.
Links to both are in the description.
Fourthly, you should know this podcast is on iTunes, it's on Spotify, it's on all of the
audio platforms. All you have to do is type in theories of everything and you'll find it.
Personally, I gain from rewatching lectures and podcasts. I also read in the comments that hey,
TOE listeners also gain from replaying. So how about instead you re-listen on those platforms
like iTunes, Spotify, Google Podcasts, whichever podcast catcher you use.
And finally, if you'd like to support more conversations like this, more content like
this, then do consider visiting patreon.com slash curtjaimungal and donating with whatever
you like. There's also PayPal, there's also crypto, there's also just joining on YouTube.
Again, keep in mind, it's support from the sponsors and you that allows me to work on TOE full time.
You also get early access to ad free episodes, whether it's audio or video, it's audio in
the case of Patreon, video in the case of YouTube.
For instance, this episode that you're listening to right now was released a few days earlier.
Every dollar helps far more than you think.
Either way, your viewership is
generosity enough. Thank you so much.