Theories of Everything with Curt Jaimungal - AI is Taking Over Mathematics | Yang-Hui He

Episode Date: January 2, 2025

This episode features Yang-Hui He, a mathematical physicist and professor known for his groundbreaking work in string theory, geometry, and AI-driven approaches to mathematical research, as he explain...s how machine learning is revolutionizing the field and leading to major discoveries. As a listener of TOE you can get a special 20% off discount to The Economist and all it has to offer! Visit https://www.economist.com/toe Timestamps: 00:00 - String Theory & Mathematics 10:54 - How Does One Do Mathematics? 16:28 - Computers & Mathematics 20:04 - Bottom-Up Mathematics 28:44 - Meta-Mathematics 46:17 - Top-Down Mathematics 55:22 - Pattern Recognition 01:01:30 - Platonic Data 01:07:15 - A Classic Problem Since 1736 01:17:38 - Classical Results for Reimann Surface 01:22:29 - Manifolds 01:26:52 - Superstring Theory 01:30:45 - When Physics Meets Math 01:43:01 - Progress in String Theory 01:45:45 - Image Processing 01:59:33 - Episode Recap 02:12:50 - Outro Links Mentioned: •⁠ ⁠The Calibi-Yau Landscape (book): https://amzn.to/41XmUi0 •⁠ ⁠Machine Learning (book): https://amzn.to/49YQ42t •⁠ ⁠Topology and Physics (book): https://amzn.to/4gCcjxr •⁠ ⁠Yang-Hui He’s recent physics lecture: https://www.youtube.com/watch?v=AhuZar2C55U •⁠ ⁠Roger Penrose on TOE: https://www.youtube.com/watch?v=sGm505TFMbU •⁠ ⁠Edward Frenkel’s String Theory discussion on TOE: https://www.youtube.com/watch?v=n_oPMcvHbAc •⁠ ⁠Edward Frenkel’s lecture on TOE: https://www.youtube.com/watch?v=RX1tZv_Nv4Y •⁠ ⁠Joseph Conlon and Peter Toit on TOE: https://www.youtube.com/watch?v=fAaXk_WoQqQ •⁠ ⁠A New Lower Bound For Sphere Packing (article): https://arxiv.org/pdf/2312.10026 •⁠ ⁠Principia Mathematica (book): https://www.amazon.com/Principia-Mathematica-Alfred-North-Whitehead/dp/1603864377/ref=sr_1_5?crid=2ANIKKX6G8KRK&dib=eyJ2IjoiMSJ9.c62w_u2CfXIK6AaEt-QKx6dp22lbkUr17cSyr3O-rltVBjvb8xCrwLWz8CQ6iWjo8rjmeCsSCwPwM_U0T8_InZfz0vEX9UKDWfSa5Oan86o4YwU6F3GdBPz3J2d_hXbLOc-EULawZ47JksUzndhf5q7ydfCMlK9lYKc2XLZQq-6_dHWQSbjYI82e_dcKw9EWp71DPKIZ9v5qvbyP3CnE7gRpN7uPMZpj-lxlo7Wjsl4.iSUZDFr0n-ZlkiADza8yEePerPoxBJRRCLhO0tQm2wU&dib_tag=se&keywords=principia+mathematica&qid=1735580157&s=books&sprefix=principia+ma%2Cstripbooks%2C122&sr=1-5 •⁠ ⁠Tshitoyan’s paper on Nature: https://www.nature.com/articles/s41586-019-1335-8 New Substack! Follow my personal writings and EARLY ACCESS episodes here: https://curtjaimungal.substack.com TOE'S TOP LINKS: - Enjoy TOE on Spotify! https://tinyurl.com/SpotifyTOE - Become a YouTube Member Here: https://www.youtube.com/channel/UCdWIQh9DGG6uhJk8eyIFl1w/join - Support TOE on Patreon: https://patreon.com/curtjaimungal (early access to ad-free audio episodes!) - Twitter: https://twitter.com/TOEwithCurt - Discord Invite: https://discord.com/invite/kBcnfNVwqs - Subreddit r/TheoriesOfEverything: https://reddit.com/r/theoriesofeverything #science #physics #ai #artificialintelligence #mathematics Learn more about your ad choices. Visit megaphone.fm/adchoices

Transcript
Discussion (0)
Starting point is 00:00:00 Looking for the perfect holiday gift? Masterclass has been my go-to recommendation for something truly meaningful. Imagine giving your loved ones the opportunity to learn directly from world-renowned instructors. With Masterclass, your friends and family can learn from the best to become their best. It's not just another streaming platform, it's a treasure trove of inspiration and personal growth. Whether it's learning critical thinking from Noam Chomsky to gaining practical problem-solving skills with Bill Nye or exploring the richness of history with Doris Goodwin or my personal favorite which is learning mathematical tricks or techniques from
Starting point is 00:00:36 Terry Tao. There's something for everyone. Another one that I enjoyed was Chris Voss, a former FBI negotiator who teaches communication strategy. It's been a game changer to me ever since I read about Chris Voss from his book and it was surprising to see him here on Masterclass. A pleasant surprise. Masterclass makes learning convenient and accessible whether you're on your phone or your laptop or tv or just listening in audio mode and the impact is real. 88% of members say Masterclass has made a positive difference in their life. For me, it's an incredible way that I discover, sometimes even rediscover, learning. Every class that I take helps me feel more confident, even inspired. And I've received great feedback
Starting point is 00:01:15 from friends that I've recommended it to. There's no risk. It's backed by a 30-day money-back guarantee. Right now, they're offering an extremely wonderful holiday deal, with some memberships discounted by up to 50%. Head to masterclass.com slash theories and give the gift of unlimited learning. That's masterclass.com slash theories. Your gateway to unforgettable learning experiences. BetMGM, authorized gaming partner of the NBA, has your back all season long from tip off to the final buzzer. You're
Starting point is 00:01:49 always taken care of with a sports book born in Vegas. That's a feeling you can only get with bed MGM and no matter your team, your favorite player or your style, there is something every NBA fan will love
Starting point is 00:02:01 about that MGM download the app today and discover why bed MGM is your basketball home for the season. Raise your game to the next level the NBA fan will love about download the app today and discover why bad MGM is your basketball home for the season. Raise your game to the next level this year with bad MGM,
Starting point is 00:02:13 a sports book worth a slam dunk and authorized gaming partner of the NBA bad MGM dot com for terms and conditions must be 19 years of age or older to wager Ontario only please play responsibly. If you have any questions or concerns about your gambling or someone close to you, please contact Connex Ontario at 1-866-531-2600 to speak to an advisor free of charge.
Starting point is 00:02:35 BetMGM operates pursuant to an operating agreement with iGaming Ontario. Yonhui He, welcome to the podcast. I'm so excited to speak with you. You have an energetic humility and your expertise and your passion comes across whenever I watch any of your lectures. So it's an honor. It's a great pleasure and great honor to be here. In fact, I'm a great admirer of yours. You've interviewed several of my very distinguished colleagues like, you know, Roger Penrose and Edith Franco. I actually watched some of them. It's actually really nice. Wonderful, wonderful.
Starting point is 00:03:07 Well, that's humbling to hear. So firstly, people should know that we're going to talk about or you're going to give a presentation on AI and machine learning mathematics and the relationship between them, as well as the three different levels of what math is in terms of production and understanding bottom up, top down down and then the meta. But prior to that, what specific math and physics disciplines initially sparked your interest and how did the collaboration with Roger come about? So my bread and butter was mathematical physics, especially sort of the interface between algebraic
Starting point is 00:03:42 geometry and string theory. So that's my background, what I did my PhD on. And so at some point, I was editing the book with CN Yang, who is an absolute legend, you know, he's 102, he's still alive, and he's the world's oldest living Nobel laureate. You know, Penrose is a mere 93 or something. So CN Yang of the Yang-Mills theory, so it's an absolute legend. He got the Nobel Prize in 1957. So at some point I got involved in editing a book with CN, with CN called Topology in Physics.
Starting point is 00:04:19 And you know, with a name like that, you can just invite anybody you want, and they'll probably say yes. And that was my initial friendship with Roger Penrose started through And you know, with a name like that, you can just invite anybody you want and they'll probably say yes. And that was my initial friendship with Roger Pembroke started through working together on that editorial. I mean, I have Roger as a colleague in Oxford, and I've known him on and off for a number of years, but that's when we really started getting working together. So when Roger snickers at string theorists, what do you say? What do you do? How does that make you feel? That's totally fine. I mean, I'm not a diehard string theorist and,
Starting point is 00:04:52 you know, I'm just generally interested in the interface between mathematics and physics. And, you know, Roger's totally chill with that. So you just happen to study the mathematics that would be interesting to string theorists, though you're not one? Exactly, and vice versa. I just completely chanced on this. It was kind of interesting. I was recently given a public lecture in Dublin about the interactions between physics and mathematics.
Starting point is 00:05:20 And I still find that string theory is still very much a field that gives the best cross-disciplinary kind of feedback. And it's been doing that for decades. It's a fun thing. I talked to my friends in pure mathematics, especially in algebraic geometry. 100% of them are convinced that string theory is correct because for them it's inconceivable for a physics theory to give so much interest in mathematics. Interesting.
Starting point is 00:05:50 And that's kind of a, I think that's a story that hasn't been told so much in media. If you talk to a physicist, they're like, you know, string theory doesn't predict anything, this and the other thing. But there's a big chapter of string theory. To me, more than 50% of the story, back story of string theory, is just constantly giving new ideas in mathematics. And historically, when a physical theory does that, it's very unlikely for it to be completely wrong.
Starting point is 00:06:21 Yeah. You watched the podcast with Edward Frankel, and he takes the opposite view, although he initially took the former view that okay, string theory must be on the correct track because of the positive externalities. It's like the opposite of fossil fuels. It doesn't give you what you want for your field like physics, but it gives you what you want for other fields as serendipitous outgrowth. But then he's no longer convinced after being at a string conference.
Starting point is 00:06:44 So you still feel like the pure mathematicians that you interact with see string theory as on the correct track as a physical theory, not just as a mathematical theory. Yeah, so he yeah, absolutely. He does make a good point. And so, like, I think, you know, Franco and, you know, algebra geometers like Richard Thomas and various people, they appreciate what string theory is constantly doing in terms of mathematics. And the challenge is whether it is a theory of physics based on the fact that it's giving so much mathematics, I guess you've got to be a mystic. Some of them are mystics, some of us are mystics. And I actually I don't personally have an opinion on that. I just, you know, some days I'm like, well you know, this is such a cool mathematical structure
Starting point is 00:07:36 and there's so much internal consistency. It's got to be something there. So it's just a gut feeling. But of course, you know, it being a science, you know, you need the experimental evidence, you know, you need to go through the scientific process. And that I have absolutely no idea. It could take years and decades. Wouldn't you also have to weight the field like W-E-I-G-H-T, weight the whatever field like the sub-discipline of string theory with how much IQ power has been poured into it, how much raw talent has been poured into it versus others. So you would imagine that if it was the big daddy field, which it happens to be, that
Starting point is 00:08:10 it should produce more and more insights. And it's unclear to me, at least, if that this much time and effort went into asymptotic safety or loop quantum gravity or what have you, or causal set theory, if that would produce mathematical insights of the same level of quality, we don't have a comparison. I mean, I don't know. I want to know what your thoughts are on that. I think the reason for that is just that, you know, we follow our own nose as a community, and the contending theories like, you know, loop quantum gravity and stuff, you know, there are people who do it. There are communities of people who do it.
Starting point is 00:08:47 And, you know, there's a reason why the top mathematicians are going to do string-related stuff is because, you know, you follow the right notes. You feel like it is actually giving the kind of, the right mathematics. Things like, you know, mirror symmetry, you know, or vertex algebras, that's kind of giving the right ideas constantly, and
Starting point is 00:09:06 it's been doing this since the very beginning. And people do the alternative theories of everything, but so far it hasn't produced new math. You can certainly prove us wrong, but I think there's know, I follow, you know, there's a reason why Witten is the one who gets the Fields Medal. Because it's just somehow it's at the right interface of the right ideas in geometry, number theory, representation theory, algebra, that this idea tends to produce the right mathematics.
Starting point is 00:09:42 Whether it is a theory of physics, that's still, you know, that's the next mystical level. But, you know, it's kind of, it's an exciting time, actually. Wittenden didn't get the Fields Medal for string theory, though. It was his work on the Jones polynomial and Chern-Simons theory and Morse theory with supersymmetry and topological quantum field theory, but not specifically string theory. That's right. That's right.
Starting point is 00:10:09 But he certainly is a champion for string theory. And for him, I mean, you know, that idea of, he was able to do, you know, the Morse theory stuff, he was able to get because of his work at supersymmetry, he was able to realize this was a supersymmetric index theorem that generated this idea. And that's really a, supersymmetry really is a cornerstone for string theory, even though there's no experimental evidence for it. So I think that's one of the reasons that's guiding him towards this direction. So, what's cool is that just prior, the podcast that I filmed just prior to yours was Peter White, as you know, is a critic of string theory. And Joseph Conlin, who is a defender of string theory, and he has a book even called Why String Theory.
Starting point is 00:10:59 That's right. I think it was the first time that publicly someone like Peter White, along with the Defender of String Theory, were just on a podcast of this length, speaking about in a technical manner, what are both of their likes and dislikes of string theory and in the string community. There's three issues, string theory as a physical theory, string theory as a tool for mathematical insight, and then three, string theory as a sociological phenomenon of overhype and does it see itself as the only game in town is the arrogance. Should there be arrogance? It was an interesting conversation. Yeah. Well, Joe is a good friend of mine, Joe Collin.
Starting point is 00:11:34 Yeah, right, right. In Oxford. And yeah, no, I value his comments greatly. I've always been kind of, you know, for me, it's, you know, I've always been kind of like slightly orthogonal to the main, the main string theory community. I'm just happy because it's constantly given me good problems to work on. Yes. Including what I'm about to talk about in AI. Wonderful. I'll mention a little bit about it because I got into this precisely because I had a
Starting point is 00:12:01 huge database of Clavier manifolds and I wouldn't have done that without the string community. It's again one of those accidents that you know no other you know the the the other theoretical physicists didn't happen to have this didn't happen to be thinking about this problem. There's this proliferation of Clavier manifolds and I'll mention that bit in my lecture later and why this is such an interesting problem, why clavialness is interesting inherently regardless whether you're a string theorist. And that kind of launched me in this direction of AI assisted mathematical discovery. So this is kind of really nice and I think, I mean for me the most exciting thing about this whole community is that science and especially theoretical science, well not especially science, including theoretical science, has become
Starting point is 00:12:55 so compartmentalized. Everyone is doing their tiny little bit of thing. String theory has been great in that mode for the last decades. It's constantly going, let's take a piece of algebraic geometry. Let's take a bit of number theory here, elliptic curves. Let's take a bit of quantum information, entanglement, whatever, entropy, black holes. And it's the only field that I know that different expertise
Starting point is 00:13:22 are talking to each other. I mean, this doesn't happen in any other field that I know of in sort of mathematics, theoretical physics. And that just gets me excited, and that's what I really like thinking about. Well, let's hear more about what you like thinking about and what you're enthusiastic about these days. Let's get to the presentation. Sure. Well, thank you very much for having me here and I'm going to talk about work I've been thinking about, stuff I've been thinking about for the last seven years, which is
Starting point is 00:13:50 how AI can help us do mathematical discovery, you know, in theoretical physics and pure mathematics. I recently wrote this review for Nature which is trying to summarize a lot of these ideas that I've been thinking about. And there's an earlier review that I wrote in 2021 about how machine learning can help us with understanding mathematics. So let me just take it away and think about, oh, by the way, please feel free to interrupt me. I know this is one of these lectures. I always like to make my lectures interactive So please if you have any questions just interrupt me anytime and I'll just pretend there's a big audience out there and I'll just make it
Starting point is 00:14:34 So firstly you're likely going to get to this but what's the definition of meta mathematics? Okay, great. So roughly, I'm of course, you know How does one so the first question is how does one actually do mathematics, right? And so one can think about, of course, these, in these reviews, I tried to divide it into sort of three directions. Of course, these three directions are interlaced and it's very hard to pull them apart. But roughly, you can think about, you know, bottom-up mathematics, which is, you know, mathematics is a formological system,
Starting point is 00:15:07 you know, definition and lemma proof and theorem proof. And that's certainly how mathematics is presented in papers. And there's another one I would like to call top-down mathematics, where the practitioner looks from above, from above that's why I say top-down from like a bird's-eye view you see different ideas and subfields of mathematics and you try to do this as a sort of an intuitive creative art you know you've got some experience and then you're trying to see oh well maybe I can take a little bit of peace from here and a piece from there and I'm trying to create a new idea or maybe a method of proof or attack or derivation. So these are complementary directions of research.
Starting point is 00:15:55 And the third one, meta, that's just because it was short of any other creative words because there are words like meta science and meta philosophy or meta physics. I'm just thinking about mathematics as purely as a language. Whether the person understands what's going on underneath is of secondary importance. So it's kind of like chat GPT, if you wish. can you do mathematics purely by symbol processing? So that's what I mean by meta. So I'm going to talk a little bit about in this talk about each of the three directions and focusing mostly on the second direction of top-down,
Starting point is 00:16:35 which is what I've been thinking about for the last seven years or so. Hmm. Okay, I don't know if you know of this experiment called the Chinese Room Experiment. Yeah. Okay, so in that, the person in the center who doesn't actually understand Chinese, but is just symbol pushing or pattern matching, I don't know if it's exactly pattern, rule following that would be the better way of saying it. They would be an example of bottom up or meta in this.
Starting point is 00:17:01 So I would say that's meta. As you know know on theories of everything we delve into some of the most reality spiraling concepts from theoretical physics and consciousness to AI and emerging technologies to stay informed in an ever-evolving landscape I see The Economist as a wellspring of insightful analysis and in-depth reporting on the various topics we explore here and beyond. The Economist's commitment to rigorous journalism means you get a clear picture of the world's
Starting point is 00:17:32 most significant developments. Whether it's in scientific innovation or the shifting tectonic plates of global politics, The Economist provides comprehensive coverage that goes beyond the headlines. What sets the economists apart is their ability to make complex issues accessible and engaging, much like we strive to do in this podcast. If you're passionate about expanding your knowledge and gaining a deeper understanding of the forces that shape our world, then I highly recommend subscribing to The Economist. It's an
Starting point is 00:18:01 investment into intellectual growth. One that you won't regret. As a listener of Tou, you get a special 20% off discount. Now you can enjoy the Economist and all it has to offer for less. Head over to their website www.economist.com slash Tou, T-O-E, to get started. Thanks for tuning in and now back to our explorations of the mysteries of the universe So I would I would say that's meta In the sense that the person doesn't even have to be a mathematician. You're just simply taking symbols Large language modeling for math if you wish
Starting point is 00:18:37 Got it Of course, you know, there's a bit of component rather You know that you can see there's a little bit of component at the bottom up because you are taking mathematics as a sequence of symbols. But I would mainly call that meta. If that's okay, I mean, these definitions are just things that I'm using. Yes, yes. But in any case, I would talk mostly about this bit, which is what I've been thinking mostly about. One thing I just want to make, just to set the scene, you know, 20th century, of course, you know, computers have been playing an
Starting point is 00:19:12 increasingly important role in mathematical discovery, right? And of course, you know, it speeds up computation, all that stuff goes without saying, but something that's perhaps not so emphasized and appreciated is the fact that there are actually fundamental and major results in mathematics that could no longer have been done without the help of the computer. So there's famous examples, even back in 1976, this is the famous Up-Hack-and, and Cock proof of the four-color theorem.
Starting point is 00:19:46 You know, that every map, it only takes four, every map in a plane, it only takes four colors to completely color it with no neighbors. And this is a problem that was posed, I think, probably by Euler, right? And this was finally settled by reducing this whole topology problem to thousands of cases and then they ran it through a computer and checked it case by case. So and then other major things like, you know, the Kepler conjecture, which is, you know, that stacking balls, identical balls, the best way to stack it is what you see in the supermarket, you know, in this hexagonal thing.
Starting point is 00:20:25 And this was conjectured by Kepler, but to prove that this is actually the best way to do it was settled in 1998, again, by huge computer check. And the full acceptance by the math community was only as late as 2017 when proof co-pilots actually went through House's construction and then made this into a proof. Yes. Wasn't there a recent breakthrough in the generalized Kepler conjecture? Absolutely. So this is what Marina Vyatsovska got the Fields Medal for. So the Kepler conjecture is in three dimensions, our world. Vyatsovska showed in dimensions 8, 16, and 24 with the best possible packing error.
Starting point is 00:21:07 And she gave a beautiful proof of that fact. And to my knowledge, I don't think she actually used the computer. There's some optimization method. Actually what I'm referring to is that there are some researchers who generalize this for any n, not just 8, not just 24, who used methods in graph theory of selecting edges to maximize packing density to solve the sphere packing problem probabilistically for any n, though I don't believe they used machine learning. Well, thanks for turning on. I'll go to check that. That's interesting. Interesting. This
Starting point is 00:21:40 was actually really interesting when it's the classific. I mean, that's something that's closer to me, which is the classification of finite simple groups. So simple groups are building blocks of all finite groups. And the proof is, you know, it took 200 years, and the final definitive volume was by Dittgerenstein 2008. And what's really interesting, the lore in the finite group theory community is that nobody's actually read the entire proof. It's just not possible. It takes longer for people to actually read the entire proof than a lifetime. So this is kind of interesting that we have reached the cusp in mathematical research where mathematics, the computers are not just becoming, you know, computational tools,
Starting point is 00:22:25 but it's increasingly becoming an integral part of who we are. So this is just set the scene. So we're very much in this, you know, we're now in the early stages of the 21st century, and this is increasingly the case where we have this, where computers can help us, or AI can help us in these three different directions. Great. So let me just begin with this bottom up and just sort of to summarize. This is probably the oldest attempt in where computers can help us. So this is where I'm going to define bottom-up, which is, I guess it goes back to the modern version of this is this classic paper, the classic book of Russell Whitehead on the Principia
Starting point is 00:23:14 Mathematica, which is 1910s, where they try to axiomatize mathematics from the very beginning. It took like 300 pages for them to prove that one plus one is good at two famously. Nobody has read this, sorry. This is one of those impenetrable books. But I mean, this tradition goes back to Leibniz or to Euclid even, you know, that the idea that mathematics should be axiomatized, right? Of course, this program took only about 20 years
Starting point is 00:23:43 before he was completely killed in some sense because of Gödel and Church and Tuer's incompleteness theorems. That, you know, this very idea of trying to axiomatize mathematics by constructing, you know, layer by layer is proven to be, you know, logically impossible within every order of logic. But I'd like to quote my very distinguished colleague, Professor Minyong Kim. He says, the practice of mathematician hardly ever worries about good old. Because you know, if you have to worry about whether your axioms are valid to your day to day, you know, if an algebraic geometry has to worry about this, then you are sunk, right?
Starting point is 00:24:23 You get depressed about everything you do, right? So the two parts kind of cancel out. But the reason I mention this is that because of the fact that these two parts cancel each other out, these two negatives cancel each other out, this idea of using computers to check proofs, or to computer-aided proofs, really goes
Starting point is 00:24:44 back to the 1950s. So despite what Gödel and Georgian Turing have proved is foundational, even back in 1956, Noah Simon and Shaw devised this logical theory machine. I have no idea how they did it because this is really very, very, very primitive computers. And they were actually able to prove certain theorems of Principia by building this bottom up.
Starting point is 00:25:08 You know, take these axioms and use the computer to prove. And this is becoming, you know, an entire field of itself with this very distinguished history. And just to mention that this 1956 is actually a very interesting year because it's the same year, 56, 57, that the first neural networks emerged from the basement of Penn and MIT. And that's really interesting, right? So people in the 50s were really thinking about the beginnings of AI, you know, because neural networks is what we now call, you know, goes under the rubric of AI.
Starting point is 00:25:44 And at the same time, they were really thinking about computers to prove theorems in mathematics. So it's, 56 was kind of a magical year. And this neural network really was a neural network in the sense that they put cadmium sulfide cells in a basement. It's a wall size of photoreceptors. And they were using flashlights to try to stimulate neurons,
Starting point is 00:26:08 literally, to try to simulate computation. That's quite an impressive thing. And then this thing really developed, right? And then, and now, you know, a half a century later, we have very advanced and very, very sophisticated computer aided proof automated theorem provers. Things like the Coq system, the Lean system, and they were able to create. So Coq was used in this full color, the full verification of the proof of the four color theorem was through the Cox system.
Starting point is 00:26:45 And then there's the Phy-Thomson theorem, which got Thompson the Fields Medal. Again, they got the proof through this system. Lean is very good. I do a little bit of Lean, but also Lean, the true champion of Lean is Kevin Buzzard at Imperial, 30 minutes down the road from here, from this spot. And he's been very much a champion for what he calls the Zener Project, and using Lean to formulate, to formalize all of mathematics. That's the dream.
Starting point is 00:27:20 What Lean has done now is that it has, Kevin tells me that all of the undergraduate level mathematics at Imperial, which is a non-trivial set of mathematics, but still a very, very tiny bit of actual mathematics. And they can check it and everything that we've been taught so far at undergraduate level is good and self-consistent, so nobody needs to cry about that one. Wonderful. And so that's all good. And then more recent breakthroughs is the beautiful work of, you know, so three Fields Medalists here, also two Fields Medalists, Gowas Green, Manners
Starting point is 00:27:53 and Tao, Manners, I think it's the name, and Tao, where they prove this conjecture, which I don't know the details of, but they were actually using Lean to prove, to help prove in this. And I think Terry Tao in this public lecture, which he gave recently in 2024 in Oxford, he calls this whole idea of AI co-pilot, which I very much like this word. I was with Tao in August in Barcelona, we were at this conference, and he's very much into this very well. And of course, you know, Tao, Terry Tao for us is, you know, is a godlike figure. So, and the fact that he's championing this idea of AI co-pilots for mathematics is very, very encouraging for all of us.
Starting point is 00:28:41 Yes, and for people who are unfamiliar with Terry Tao, but are familiar with Ed Whitten, Terry Tao is considered the Ed Whitten of math and Ed Whitten is considered the Terry Tao of physics. Yeah, I've never heard that expression. That's kind of interesting. At Barcelona, when Terry was being introduced by the organizer, Eva Miranda, she said, Terry Tao is, this is a very beautiful sentence. Terry Tao has been described as the...
Starting point is 00:29:13 Looking for the perfect holiday gift? Masterclass has been my go-to recommendation for something truly meaningful. Imagine giving your loved ones the opportunity to learn directly from world-renowned instructors. With Masterclass, your friends and family can learn from the best to become their best. It's not just another streaming platform, it's a treasure trove of inspiration and personal growth. Whether it's learning critical thinking from Noam Chomsky to gaining practical problem-solving skills with Bill Nye or exploring the richness of history with Doris Goodwin, or my personal
Starting point is 00:29:45 favorite which is learning mathematical tricks or techniques from Terry Tau. There's something for everyone. Another one that I enjoyed was Chris Voss, a former FBI negotiator who teaches communication strategy. It's been a game changer to me ever since I read about Chris Voss from his book, and it was surprising to see him here on Masterclass, a pleasant surprise. Masterclass makes learning convenient and accessible whether you're on your phone or your laptop or TV or just listening in audio mode. And the impact is real. 88% of members
Starting point is 00:30:15 say Masterclass has made a positive difference in their life. For me, it's an incredible way that I discover, sometimes even rediscover learning. Every class that I take helps me feel more confident, even inspired. And I've received great feedback from friends that I've recommended it to. There's no risk. It's backed by a 30-day money-back guarantee. Right now, they're offering an extremely wonderful holiday deal with some memberships discounted by up to 50%. Head to MasterClass.com slash theories and give the gift of unlimited learning. That's masterclass.com slash theories.
Starting point is 00:30:48 Your gateway to unforgettable learning experiences. The Gauss of mathematics. Or the Mozart. But I think a more appropriate thing to describe him is to describe him as the Leonardo da Vinci of mathematics because he has such a broad impact on all fields mathematics and that's very rare thing. Yeah I remember he said something like topology is my weakest field and by weakest field to him it means I can only write one or two graduate textbooks off of the top of my head on the subject of topology. Yeah exactly, exactly. I guess his intuitions are more analytic.
Starting point is 00:31:28 He's very much in that world of analytic number three functional analysis. He's not very pictorial, surprisingly. Like Roger Penrose has to do everything has to be picked in terms of pictures. But Terry is a symbolic matcher. We can just look at equations, extremely long complicated equations, and just see which pieces should go together.
Starting point is 00:31:50 That's very interesting. Speaking of Eva Miranda, you and I, we have several lines of connection. Eva's coming on the podcast in a week or two to talk about geometric quantization. Awesome. Eva is super fun. She's filled with energy. Yes. Yeah, she's super fun, right? She's filled with energy. Yes. Yeah, she's a good friend of mine. Yeah. I think, you know, in this academic world of, you know, math and physics, I think, you know, we're at most one degree of separation from anyone else.
Starting point is 00:32:15 Yeah. It's a very small community, relatively small community. Yeah. So, this back to this thing about, of course, you know, one could one could get over optimistic. I was told by my friends in DeepMind that that Shaggedy, who I think he's one of the one of the on the on this AI math team, he says, you know, he was instructing that computers beat humans in chess in the 90s, beat humans go at 2018. So you should beat humans and beat in proving theorems in 2030. I have no idea how he extrapol idea how he extrapolated these points. There are only three data points. But DeepMind has a product to sell, so it's very good for them to be over-optimistic.
Starting point is 00:32:54 But I wouldn't be surprised that this number, you know, well, I'm not sure to beat humans, but it might give ideas that humans have not thought about before. So that's possible. Just moving before. So that's possible. Just moving on. So that's the bottom up. And I said this is very much a blossoming, or not blossoming, it's very much a long distinguished field of automated theorem computes, of theorem provers, and verifications of formalization mathematics,
Starting point is 00:33:21 which Tao calls the AI copilot. Just to mention a bit with your question a bit earlier about metamathematics. formalization mathematics, which Tao calls the AI copilot. Just to mention a bit with your question a bit earlier about meta-mathematics. So this is just kind of, I like your analogy, this is like the Chinese room. Can you do mathematics without actually understanding anything? You know, personally I'm a little biased because having interacted with so many undergraduate students before I moved to the London Institute so I don't have to teach
Starting point is 00:33:48 anymore or teach undergraduates. I've noticed, you know, maybe one can say the vast majority of undergraduates are just pattern matching. Right. Whether there's any understanding. I think this is one of the reasons why why chat GPT does things so well. It's not just because it's not because oh you know LLMs are great, large language models are great. It's more that most things that humans do are so without comprehension anyway. So that's why it's kind of this pattern matching idea. And this is also true for mathematics.
Starting point is 00:34:25 What's funny is that my brother's a professor of math in the University of Toronto for machine learning but for finance. And I recall 10 years ago, he would lament to me students that came to him who wanted to be PhD students and he would say, okay, but Kurt, some of them, they don't have an understanding, they have a pattern matching understanding. He didn't want that at the time, but now he's into machine learning, which is effectively that times 10 to the power of 10. Right. Right, right.
Starting point is 00:34:52 No, no, I completely agree. I mean, this is not to criticize undergraduate studies. You know, I think in undergraduate students, it's just that, you know, it's just, it's part of being human. We kind of pattern match and then we do it the best we can. And then of course, if you're Terry Tao, you actually understand what you're doing. But you know.
Starting point is 00:35:11 Of course. But the vast majority of us doing most of the stuff is just pattern matching. So that's why, and this is true even for mathematics. So here, I just want to mention something, which is a fun project that I did with my friends Vishnu Jigala and Brent Nelson back in 2018 before LLM, before all this LLM for science thing.
Starting point is 00:35:34 And this is a very fun thing because what we did, we took the archive and we took all the titles of the archive. This is the preprint server for contemporary research in theoretical sciences. And, you know, we would do an LLM classifiers, Wurteweg, very old fashioned, this is a neural network, Wurteweg. And, you know, you can classify this and then do their thing. But what's really interesting, and this is my favorite bit,
Starting point is 00:36:00 we took, to benchmark the archive, we took Vixra. So Vixra is a very interesting repository because it's of archives spelled backwards and it has all kinds of crazy stuff. I'm not saying everything on Vixra is crazy, but certainly it has everything that archives rejects because you think it's crazy. Things like, you know, three page proof of the Riemann hypothesis or Albert Einstein is wrong. It's got filled with that. It's interesting to study the linguistics even at the title level.
Starting point is 00:36:29 You could see that, you know, what they call the distinctions of quantum gravity versus the other things, they have the right words in Vixra, but the word order is already quite random. That, you know, the classification matrix, the confusion matrix for Vixra is certainly not as distinct as archive. So kind of interesting, you get all the right buzzwords. It's like kind of thing, Vixra I think is a good benchmark that linguistically is not as sophisticated as real research articles. But this idea, so this is something much more serious, it's this very beautiful work of Chitoyan et al. in nature,
Starting point is 00:37:09 where they actually took all of material science, and they did a large language model for that, and they were able to actually generate new reactions in material science. So I think this paper in 2019, this paper by Chitoyan, is really the beginnings of LLM LLM for for scientific discovery. This is quite early. This is 2019, right? Yeah, and it's remarkable how we can even say that that's quite early. The field is exploding so quickly Absolutely, five years ago is considered
Starting point is 00:37:40 Time ago. Yeah, absolutely. I mean five years ago I you know I was still very much in a lot of, I've evolved in thinking a lot about this thing. I would also like to get to your personal use cases for LLMs, chatGVT, Claude, and what you see as the pros and cons between the different sorts, like Gemini was just released at 2.0, and then there there's a 1 and there's a variety. So at some point I would like to get to you personally how you use LLMs both as a researcher and then your personal use cases.
Starting point is 00:38:12 Okay, I can mention a little bit. So one of the very, very first things when Chachi BT3 came out in what, 2018, something, 2019, something like that. Oh, three. You mean GPT 3? GPT 3 like the really early baby versions. Yeah that was during just before the pandemic. Just before pandemic. So that was just like so I got into
Starting point is 00:38:34 this AI for math through this Clavier Maniforge which I'm going to mention a bit later. And then this GPT came out when I was just thinking about this large language model. So this is a great conversation. So I was typing problems in calculus, freshman calculus. And it was solved that fairly well. I mean, it's really quite impressive what he can do. So, you know, it's fairly sophisticated because, you know, things like, you know,
Starting point is 00:39:11 I was typing questions like, you know, take vector fields, blah, blah, blah, on a sphere, you know, find me the grad or the curve. I mean, it's like, you know, first, second year stuff and you have to do a lot of computation. And he was actually doing this kind of thing correctly, you know, partially because there's just so many example sheets of this type out there on the internet. And so he's kind of learned all of that. So I was getting very excited and I was trying to sell this to everybody at lunch. I was having lunch with my usual team of colleagues in Oxford over this.
Starting point is 00:39:47 And of course, lo and behold, who was at lunch was the great Andrew Wiles. And I felt like I was being a peddler for GPT, LLM for mathematics, to perhaps the greatest living legend in mathematics and I'm just super nice and he's a lovely guy And he just instantly asked me says how about you try something much simpler? Two problems he tried the first one was Tell me the rank of a certain elliptic curve and he just typed it down a certain elliptic curve or rush rational points Sorry, rational points of a very simple elliptic curve, which is his baby. And I typed it and it got completely wrong. It was just even, it started very quickly started saying things like, you know, five over seven is an integer. Partially because this is a very hard thing to do. You can't really guess integer points,
Starting point is 00:40:41 but unlike in calculus where there's a routine of what you need to do, right? And then very quickly we converge on an even simpler problem. How about find the 17th digit in the decimal expansion of 22 divided by 29, like whatever. And that is completely random because you can't train, you actually have to do long division. This is, you know, this is, you know, primary school level stuff and yet GPT just simply cannot do and it's inconceivable that it could do it because no language model could possibly do this.
Starting point is 00:41:16 But GPT now 001, 02, 01 is already clever enough when you ask him a question like this, linguistically it knows to go to war from alpha and then it's okay. Then he's actually doing them. But so something so basic like this, you just can't train the language model to do. You get, you know, one in 10 right. And it's just a randomly distributed thing. Yes. Whereas sophisticated things, they are seemingly sophisticated thing like solving differential
Starting point is 00:41:44 equations or doing very complicated integrals. It can do because there's somewhat of a routine and there are enough samples out there. So, anyway, so that's my user case, two user cases. That's also not terribly different than the way that you and I, or the average person or people in general think. So for instance, we're speaking right now in terms of conversation.
Starting point is 00:42:08 And then if we ask each other a math question, we move to a math part of our brain. We recognize this is a math question. So there's some modularity in terms of how we think. It's not like we're going to solve long division using Shakespeare. Even if we're in a Shakespeare class and someone just jumps in and then ask that question, we're like, okay, that's a different, that's of a different sort of mechanism. Yeah, that's a good analogy. Yeah. When you first encountered chat GPT or something more sophisticated that could answer even larger mathematical problems, did you get a sense of awe or terror initially? initially. So I'll give you an example. There was this meeting with some other math friends
Starting point is 00:42:45 of mine and I was showing them chat GPT when it first came out. And then one of the friends was like, explain, can you get it to explain some inequality or prove some inequality? And then it did. And then explained it step by step. Then he, then everyone just had their hand over their mouth. Like, are you serious? Can you do this? And then they're like, then one said, one friend said, this is like speaking to God. And another friend said, had the thought like, what am I even doing? What's the point of even working if this can just do my job for me? So did you ever get that sense? Like, yes, we're excited about the future and it as an assistant. But did you ever feel any sense of dread?
Starting point is 00:43:25 I'm by nature a very optimistic person. So I think it was just all an excitement. I don't think I've ever felt that I was threatened or the community is being threatened. I could totally be wrong. But so far I just think it's like this is such an awesome thing because it'll save me so much time Looking up references and stuff like this. I can you know Yeah, I was happy. I was just like wow, this is kind of cool I mean, I guess if I were an educator I might get a bit of a dread because there's like, you know You know undergraduate degrees is what you know, you do an undergraduate degree
Starting point is 00:44:03 It's just basically one chat GPT being fed to another. A lot of my colleagues started setting questions in exams with chart GPT with fully latexed out equation. I mean, this is becoming the standard thing to do. I guess even if you're an educator, you would probably worry. But I was thinking about just long-term discovery of, you know, what new knowledge can we generate? Okay. So in that sense, this is going to be certainly an incredible help because it's got all the knowledge in the background. Wonderful.
Starting point is 00:44:34 All right, let's move forward. Yeah, sure. So 2022 was a great year. I'm surprised this wasn't like over every single newspaper. I don't know why. At least I was told after, there was some obscuring outlet, I can't even remember, some expert friends in the community told me that the Chachi BTS passed the Turing test.
Starting point is 00:44:54 This is a big deal, but I don't know why it hasn't been, I was hoping to see this on BBC and every major newslet, but it didn't catch on. But anyhow, I believe that in 2022, ChargBTS passed the Turing test. And then, you know, where in the last two years, this is obviously where we can, you know, this is a huge development now
Starting point is 00:45:19 for large language models for mathematics. And, you know, every major company, OpenAI, MatterAI, EpochAI, you know everything and they've been doing tremendous work in trying to get LLM for math. Basically, you know, take the archive which is a great repository for mathematics and theoretical physics, pure mathematics and theoretical physics, and then just learn that and try to generate to see how much this is very much working in progress. And of course, AlphaGeo, AlphaGeo 2, AlphaProof, this is all the DeepMind's success. It's kind of interesting, within a year, you've gone from 53% on Olympia level to 84%, which
Starting point is 00:46:04 is part, this is scary, right? Every, this is scary in the sense, like, impressively awesome that they could do so quickly. So basically in 2022, an AI is approximately equal to the 12-year-old Terence Tao, in the sense that it could do a silver medal. But of course, this is a very specialized, you know, the Alpha-Geo 2 was really just homing in on Euclidean geometry problems. Which to be fair are extremely difficult, right?
Starting point is 00:46:34 If you don't know how to add the right line or the right angle, you have no idea how to attack this problem. But it's kind of learned how to do this. So it's kind of nice. So, you know, this is all within a couple of years. And there's this very nice benchmark called Frontier Math that Epoch AI has put out. I think there was a white paper and they got gowers and towels, you know, the usual suspects, just to benchmark. Okay, fine. So we can do 84% on math Olympiad,
Starting point is 00:47:06 which is sort of high school level. What about truly advanced research problems? So to my knowledge, as of the beginning of this month, it was only doing 2%. So that's okay, fine, it's still not doing that great. But the beginning of this week, you learn that OpenAI 03 is doing 25%. So we've gone 20% up. We've got a fifth up within four weeks of what they can do.
Starting point is 00:47:37 So I said, wow, that's kind of very interesting. Yeah, such a rapid improvement. It's so, this is crazy. I love this, right? Because it's exciting. It's so, this is crazy. I love this, right, because it's exciting. It's very rare to be. I remember back in the day when I was a PhD student doing ADS CFT-related algebra geometry, because Marithana had just come out with a paper in 97, 98, and that's just when I began my PhD.
Starting point is 00:48:06 I remember that kind of excitement, the buzz in the string community. And people are saying there was a paper every couple of days on the next, that kind of excitement. And I haven't felt that kind of excitement for a very long time just because of this. And then this is like that. Every week, there's this new benchmark and new breakthrough.
Starting point is 00:48:27 So that's why I find this field of AI system mathematics to be really, really exciting. Can you explain, perhaps it's just too small on my screen because I have to look over here, can you explain the graph to the left with Terence Tao? Oh gosh, I'm not sure I can because I'm sure I can read this graph in detail. I think it's the year. What is it trying to convey? So it's the ranking of Terrence, no this is just Terrence Towers individual performances over different years, over different problems.
Starting point is 00:49:01 So he's retaking the test every year? No, no, he's taken it three times, ages 10, 11, and 12. And when he was 10, he got the bronze medal, and then he got the silver medal, then he got the gold medal within three years. Okay. And age of 12 or something.
Starting point is 00:49:19 But I can't, I think- What are those bars, though? I think the bars, a good question. Maybe it's to the different questions. You're giving 60 questions and what it would take to get the gold medal, I think, or what it would take to get the silver medal. I think. How many percents do you have to be quick?
Starting point is 00:49:40 Okay, so was it a foolish question of mine? It's actually... No, no, no, no. It's a good question. I have no recollection or maybe I never even looked at it. Somebody told me about this graph at some point. I forgot what it is. Okay, because it looks to me like Terrence Tao is retaking the same test and then this is just showing his score across time and he's only getting better. But that can't be it. Why would he retake the test? He's a professor. No, I think it goes to 66. It must be like, this is an open source graph.
Starting point is 00:50:11 Oh, I thought you were going to say this is an open problem in the field. What does this graph mean? No, no, no. It's an open source. This graph is just, you could take it from the Math Olympiad database. Which I shamelessly, see again, perfect, right? I've just done something that I have absolutely no understanding of presented to you like a language model and I just copy and paste it because it's got a nice cute picture of
Starting point is 00:50:34 Terry's style when he was a little, so finally I'll go back to the stuff that I be really me thinking about, which is just sort of top-down mathematics. So and then this is kind of interesting. So the way we do research, you know, practitioners, is completely opposite to the way we write papers. I think that's important to point that out. We muck about all the time. We do all kinds of things.
Starting point is 00:51:00 You look at my board, right, it's just filled with all kinds of stuff. And most of it is probably just wrong. And then once we got a perfectly good story, we write it backwards. And I think writing math papers backwards, and math generally define math and theoretical physics papers backwards. Well, theoretical physics is a bit better.
Starting point is 00:51:19 At least sometimes you write the process. But in pure math papers, everything is written in the style of Bubacki. This very dry definition proof, which is completely not how it's actually done at all. This is why Arnold, the great Vladimir Arnold says, Bubacki is criminal. He actually used this word, the criminal bubacca-ization of mathematics, because it leaves out all human intuition experience.
Starting point is 00:51:50 It just becomes this dry machine-like presentation, which is exactly how things should not be done. But bubacca is extremely important because that's exactly the language that's most amenable to computers. So it's one way or another. But human practitioners certainly don't do this kind of stuff. We muck about. Sometimes even rigorous sacrifice.
Starting point is 00:52:16 If we have to wait for proper analysis in the 19th century to come about before Newton invented calculus, we won't even know how to compute the area of an ellipse because we have to wait and formalize all of that. You don't just go all backwards. So kind of the historical progression of mathematics is exactly opposite to the way that it's represented. I mean, it's fine, but the way it's presented is better, it's much more amenable to approve copilot system like Lean than what we actually do. Even science in general is like that,
Starting point is 00:52:55 where we say it's the scientific method, where you first come up with a hypothesis and then you just, you test it against the world, gather data and so on. But the way that scientists, not just in math and physics, but biologists and chemists and so on work are based on hunches and creative intuitions and conversations with colleagues and several dead ends. And then afterward you formalize it into a paper in terms of step by step, but it was
Starting point is 00:53:19 highly nonlinear. You don't even have a recollection most of the time of how it came about. That's right. And I think one of the reasons I got so excited about all this AI from ARC is this direction. Because this hazy idea of intuition or experience, this is something that a neural network is actually very, very good at. Wonderful. It could help you. So I'm going to give concrete examples later on about how it gives guides humans. But just to give some classical examples, I've given this, I've said this joke so many
Starting point is 00:53:59 times. So what's the best neural network of the 18th century? Well, it's clearly the brain of Gauss. That's a perfectly functioning, perhaps the greatest neural network of all time. And I want to use this as an example because, you know, what did Gauss do? Gauss plotted the number of primes less than a given positive real number, just to give it a sort of continuity.
Starting point is 00:54:28 And he plotted this, and it's a really, really jaggedy curve; it's a step function. It's a step function because it jumps whenever you hit a prime. But Gauss was just able to look at this when he was 16 and say, well, this is clearly x over log x. How did he even do this? I mean, he had to compute this by hand, and he did, and he even got some of the primes wrong. By his time, the tables of primes were up in the tens and hundreds of thousands; he had to go up into the hundred thousand range.
Starting point is 00:55:05 And he could just look at this and see x over log x. But this is very important, because he was able to raise a conjecture before the method by which this conjecture is proved, namely complex analysis, was even conceived of by Cauchy and Riemann. And that's a very important fact. So he just kind of felt that this was x over log x. And you had to wait 50 years before Hadamard and de la Vallée Poussin proved this fact. Because this technique, which we now take for granted, this technique
Starting point is 00:55:36 called complex analysis, wasn't invented yet. You had to wait for that to happen. So that's how it happens in mathematics all the time, even for major things. This is now called the prime number theorem, which is a cornerstone of all of mathematics, right? This is the first major result since Euclid on the distribution of primes. How did Gauss say this was x over log x? Because he had a really great neural network. And
Starting point is 00:56:05 it happens over and over again. Like, you know, the Birch and Swinnerton-Dyer conjecture, which I'm going to talk about later, which is one of the millennium problems and is still open, and it's certainly one of the most important problems in mathematics of all time. This is Birch and Swinnerton-Dyer in a basement, you know, in Cambridge in the 1960s. They just plotted ranks and conductors of elliptic curves. I'm going to define those in more detail later. And they would say, oh, that's kind of interesting, you know, the rank should be related to the conductor in some
Starting point is 00:56:38 strange way. And that's now the BSD conjecture, the Birch and Swinnerton-Dyer conjecture. And what they were doing was computer-aided conjecturing. So there were the eyeballs of Gauss in the 19th century, but the 20th century really had seriously computer-aided conjectures. And of course the proof of this is still open in general; there's been lots of nice progress on it. And, you know, where we're going to go is very much: what technique do we need to wait for to prove something like this?
Starting point is 00:57:18 Now, is there a reason that you chose Gauss and not Euler? Is it just because Gauss had this example of data points and guessing the form of a function? I'm sure Euler, who certainly is great, had conjectures too. Maybe, that's an interesting question. I'll mention Euler later, but I think there's not an example as striking as this one. In fact, what's interesting is a byproduct of Gauss inventing this, because it was kind of mucking around with statistics, right? This is before statistics existed as a field, right? This is like the early 1800s.
Starting point is 00:57:59 And Gauss, I think, and you can check me on this, got the idea of statistics and the Gaussian distribution because he was thinking about this problem. So it's kind of interesting: he was laying foundations for both analytic number theory and modern statistics in one go. He was doing regression. So I think he essentially invented regression, the curve fitting, which is like the 101 of modern society. He was trying to fit a curve.
Starting point is 00:58:36 What was the curve that really fit this? In the process, he got x over log x, and in addition, he got this idea of regression. Impressive guy. What can we say? He's a god to us all. And the upshot of this is, I love this, again, there's something I found on the internet. Just to emphasize, you know, that this idea of... Speaking of God. Yes, speaking of God, this idea of mucking about with data in pure mathematics is a very ancient thing.
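As a reader's aside: what Gauss did, tabulate data by hand and guess a functional form, is exactly what we would now call curve fitting. Here is a minimal sketch in Python, my own illustration rather than anything from the episode (the sieve size, sample points, and one-parameter fit are all assumptions), reproducing the observation that a fit of a * x/log(x) to the prime-counting function gives a coefficient close to 1:

```python
# Tabulate pi(x), the number of primes <= x, and least-squares fit a * x/log(x).
import numpy as np

def primes_up_to(n):
    """Sieve of Eratosthenes: boolean array where entry k is True iff k is prime."""
    sieve = np.ones(n + 1, dtype=bool)
    sieve[:2] = False
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = False
    return sieve

N = 100_000
is_prime = primes_up_to(N)
pi = np.cumsum(is_prime)           # pi[x] = number of primes <= x

xs = np.arange(1000, N, 1000)      # sample points, skipping tiny x
model = xs / np.log(xs)            # Gauss's guess, x / log x
# One-parameter least-squares fit through the origin: pi(x) ~ a * x/log(x).
a = np.dot(model, pi[xs]) / np.dot(model, model)
print(f"best-fit a = {a:.3f}")     # ~1.1 at this range; drifts toward 1 as N grows
```

The fitted coefficient sitting near 1 is exactly the content of the prime number theorem, pi(x) ~ x/log x.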
Starting point is 00:59:12 Right? And, you know, once you formulate something like this into a conjecture, you write your paper. Imagine writing the paper: definition, prime; definition, pi of x; then conjecture about pi of x; evidence. All of the failed stuff about inventing regression and mucking about just doesn't get written at all. That intuitive, creative process is not written down anywhere. So here is a,
Starting point is 00:59:42 it's great, I'm glad I'm chatting with you about it, because it's nice to have an audience for this. So, pattern recognition: what do we do, right, in terms of pure mathematical data? If I gave you a sequence like this, you can immediately tell me what the next number is, to some confidence. Yeah, zero, zero, one. This is just: multiple of three or not. This one, I've tried with many audiences, and, you know, after a few minutes of struggle,
Starting point is 01:00:11 you can get the answer, and it turns out to be the prime characteristic function. So what I've done here is to mark the odd integers; with the evens, obviously, you're going to get zero, so it's kind of pointless. Just the sequence of odd integers. And then it's a 1 if it's a prime, 0 if it's not. So 3, 5, 7, 9, 11, and so on and so forth, and you mark with a 1 all the ones which are prime.
Starting point is 01:00:38 And you can probably, after a while of mucking about, see where this is going. The next sequence is much harder, so I'm going to give it away, so we won't have to spend a couple of hours staring at it. This one is what's called the shifted Möbius function. What this is: you take an integer and you take the parity of the number of prime factors it has, counted with multiplicity. Starting from 2, I think, though I'm not sure, maybe I did start with 1:
Starting point is 01:01:16 it's 0 if it has an odd number of prime factors, 1 if it has an even number of prime factors, for the whole sequence of integers. And I hope now I've gotten this right. Let's see: I mark 1 for 1, just to kick off the sequence. Then 2 is a prime number, it has only one prime factor, which is an odd number; 3 has an odd number of prime factors; 4 is 2 squared, so it has an
Starting point is 01:01:52 even number of prime factors, and so on and so forth. So 5 is prime, it has one prime factor, an odd number; 6 is 2 times 3, so it has two, an even number of prime factors; and so on and so forth. It looks kind of harmless. I've just stared at this for a while; it's very, very hard to recognize a pattern. And what's really interesting is that if you have an algorithm that can tell me the parity of the next number in an efficient way, you will have an equivalent formulation of the Riemann hypothesis. So
Starting point is 01:02:32 that's actually an extremely hard sequence to predict. So if you can tell me with confidence more than 50% what the next number is, without looking it up in some table, then you can probably end up cracking every bank in the world. Interesting. Because this is equivalent to the Riemann hypothesis. So I'm just giving three: trivial, kind of okayish, and really, really, really hard. Yes.
Starting point is 01:03:00 So now you can think about a question. How, if I were to feed sequences like this into some neural network, how would a neural network do? So one way to do it, so this goes a bend, so we go way back to the very beginning, to the question of what is mathematics? And you know, Hardy in his beautiful apology says, you know, what mathematicians do is essentially we are pattern recognizers. That's probably the best definition of what mathematics is, is that it's a study of patterns,
Starting point is 01:03:38 finding regularity in patterns. And, in fact, you know, if there's one thing that AI can do better than us, it's pattern detection. Because, you know, we evolved in being able to detect patterns in three dimensions and no more. So in this sense, if you have the right representation of data, you're sure that AI can do better than that. I mean, you know, it generates a lot of stuff, but filtering out what is better is a very interesting problem
Starting point is 01:04:07 in and of itself. So let's try to do one. I mean, there are various ways to do this representation. One way you can do it is to do a problem which is maybe best fit for an AI system, which is binary classification of binary vectors. So what you do is, sequence prediction is kind of difficult. So one thing you can do is just take this infinite sequence and just take, say, a window of 100, 1000, what, fixed window size.
Starting point is 01:04:37 And then label it with the one immediately outside the window, and then shift, label, shift, label. So then you can generate a lot of training data this way. So for this sequence, I think I've just taken here whatever the sequence is, and I might just with a fixed window size, and with this label. So now you have a perfectly supervised, perfectly defined binary supervised machine learning problem.
Starting point is 01:05:02 Then you pass it to your standard AI algorithm. They're just out of the box ones. You don't even have to tune your particular architecture. Just take your favorite one and then do cross validation, the standard stuff, take sample, do the training, and then try to validate this on unseen data. So if you do this to the MOT3 problem, to this one, you immediately find that, you know, any neural network or whatever base classifies would do it 100% accuracy,
Starting point is 01:05:39 as you should, because you'll be really dumb if you didn't, because this is just a linear transformation. So even if you have a single neuron that's just doing linear transform, that's good enough to do it. The prime Q problem I did some experiment, some, oh gosh, like seven years ago, it got 80% accuracy. And I was like, wow, that's kind of, this was a wow moment. I was like, why, why is it doing 80? I don't have a good answer to this.
Starting point is 01:06:06 Why is it doing 80% accuracy to this? How is it learning? Maybe it's doing some sieve method, which is kind of interesting somehow. The second number is just to chi-square, just to double test that the, what's called MCC, which is Matthew's correlation coefficient. These are just buzzwords and stats.
Starting point is 01:06:24 I never learned stats, but now I'm relearning. I took Coursera in 2017 so I can relearn all these buzzwords. It's great. It's really useful. And then this shift in the over lambda function, it's sorry. I think I made a, yeah, I mistakenly called this, called called this Merbius mu function. It's not. I mean, it's related but it's not. It's the shift in the Leoviel lambda function. Got it.
Starting point is 01:06:52 Sorry, one of my neurons died when I said Merbius mu but it's Leoviel lambda. You were subject to the one pixel attack. Yeah, so this one I couldn't break 50%. right? 0.5 just means it's coin toss. It's not doing any better guessing than whatever. And this chi-square is 0.00. That means I'm up to statistic error. So which means I couldn't find an AI system which could break, which could do better than random guess.
Starting point is 01:07:20 I'm not saying there isn't one. It would be great if there were one. And then, yeah, so it's kind of, you know, it's life. And I couldn't, if I do break it, you know, I might actually stand a good chance breaking every bank in the world. All right. But I haven't made it worse. Well, let's remain close friends. Yeah, that's right
Starting point is 01:07:45 That's right. So I was very proud of this because this experiment I'm gonna mention a bit later This little lambda was suggesting I was just trying like way back when but apparently Peter Sarnak whom whom I really admire He's one of the world's greatest Number theorists currently current number theorists and theorist. And I got to know him through this memorization thing that I'm going to talk about later. And I reminded him that I almost became his undergraduate research student. I ended up doing, I was an undergrad at Princeton where I had two paths I could follow for, you know, to kind of define your undergrad, your thesis, right?
Starting point is 01:08:28 So, one was in mathematical physics, one was, that's with Alexander McDowell. And the other one was with, you know, two problems and the other one was actually offered by Peter Sinek on arithmetic problems. And I somehow just, because I wanted to understand the nature of space and time, I went through the Alexander Mikhailov path to do mathematical physics, which led to do string theory. And after 20, 30 years, I came full back to be in Peter Sarnak's world again. I made him at this conference, I reminded him of this and he was very happy.
Starting point is 01:09:06 But also actually what's really interesting is is that he was asking DeepMind the same question a few years ago about the deluvial lambda, whether DeepMind could do better than 50%. So I was glad that I thought along the similar lines as a great expert in number theory. And somebody who could have potentially have been my supervisor. And then I would have gone into number theory instead of swing theory, which is whatever.
Starting point is 01:09:35 It's how life happens. So perhaps you're going to get to this later on in the talk, but I noticed here you have the word classifier. And the recent buzz since 2020 or so has been with architecture, the transformer architecture in specific. So is there anything regarding mathematics, not just LLMs, that has to do with transformer architecture that's going to come up in your talk? Not specifically. I'm actually, it's interesting.
Starting point is 01:10:00 I'm one of my colleagues here at the London Institute. He's a, uh, uh, Mikhail Berd London Institute, he's Mihail Bertsov. He's an AI, he's our Institute's AI fellow, and he's an expert on transform architecture. So I've been talking to him and we're trying to devise a nice transform architecture to address problems in finite group theory. It's in the works. But nothing so far, even with the memorization stuff, is very basic neural networks that we didn't use anything more sophisticated than that. So to be determined whether it will outperform
Starting point is 01:10:40 the standard ones will be kind of interesting. Got it. Yeah, so actually now we go way back to the beginning of our conversation. It's how I got into this stuff. And that, I don't know, completely coincidentally was through string theory. So at this point, maybe I'll just give a bit of a background of like, you know, how all this stuff came about. At least personally. At least personally. Why was I even thinking about this?
Starting point is 01:11:06 Because I knew nothing about AI, like seven, eight years ago, zero, like literally zero. Like I knew nothing more than to read it on the news, from the news. And this is actually a very interesting story, which shows again, the kind of ideas that the string theory community is capable of generating. Just because you got all these experts looking
Starting point is 01:11:29 on kind of interesting problems. So let's go way back and again, I've quoted Gauss, I've got to cook, I have to say something about Euler. So this is a problem. Again, you can see I'm very influenced by three, the number three. I'm very influenced by three, the number three. I'm a total numerologist, right?
Starting point is 01:11:47 Trinity, name the three, three is something, right? And there is called the trichotomy classification theorem by Euler. This dates to 1736. So if you look at, so I'm going to say the buzzword, which is connected compact orientable surfaces. So these are, you know, I mean the words explain themselves, you know, they have no boundaries and they're, you know, topologically, you know, whatever, the topological surfaces. So Euler was able to realize that a single integer characterizes all such surfaces.
Starting point is 01:12:27 So this is the standard thing that people see in topology, right? So the surface of a ball is the surface of a ball, and you can deform it. The surface of a football is the same as an American football. It can deform without cutting or tearing. and then the surface of a donut is the same as your cup, right? Because it's everything that everyone understands, the thing, you know, this has one handle. And so the surface of a donut is exactly the topologically, what they call topologically homeomorphic to the cup. And then he got the pretzel. So I think that's a pretzel. Or maybe, I think this is like the German pretzel.
Starting point is 01:13:12 And it gets more and more complicated. But Euler's, because you know, Euler invented the field of topology. So he realized this idea of topological equivalence in the sense that there's a single topological invariant which we now call the Euler number which characterizes these things. Another way to an equivalent way to say is the genus of these surfaces is no handles, one handle, two handles, three handles and so on and so forth. It turns out that the Euler number, which we now call the Euler number, is 2 minus twice the genus. So 2, 2 minus 2g.
Starting point is 01:13:52 Okay, that's great. So this is, that's the classic Euler's theorem. And then, you know, comes in Gauss, right? Once you got these three names next to each other, Euler, Gauss, and Riemann, you know it's gotta be some serious theorem. So Euler did this in topology. And then Gauss did this incredible work, which he himself calls him the Theorem Agrigium, the great theorem, which he considers this is his personal favorite,
Starting point is 01:14:22 and this is Gauss. And Gauss said, you can relate this number to, which is, this number is purely topological. You can relate this number to metric geometry. So he came up with this concept, which we now call Gaussian curvature. It's just some complicated stuff. You can characterize this curvature, which you can define on this. This is even before the word manifold existed on the surface. And then you can integrate using calculus, and the integral of this Gaussian curvature
Starting point is 01:15:00 divided by 4 pi is exactly equal to this topological number. And that's incredible, right? The fact that you can do an integral, it comes out to be an integer. And that integer is exactly topology. So this idea, this Gauss-related geometry to topology in this one sweep. And then what's even the next level comes Riemann. Riemann says, well, what you can do is to complexify. So these are no longer, you know, real connected compact orientable surfaces, but you can think about these as complex objects. So what do we mean by that is, well, if you think about the real Cartesian plane,
Starting point is 01:15:52 that's a two-dimensional object, but you can equally think of that as a one complex dimensional object, namely the complex plane. Or the complex line. Yeah, the complex line, exactly. So with R2, Riemann would call C. And then Riemann realized that you can put similar structure on all of these things as well. So all of a sudden, these things are no longer two-dimensional real or interval surfaces, but one complex dimensional what's now called curves.
Starting point is 01:16:23 I mean, it's a terrible name so a complex curve is actually a two real dimensional surface and it turns out that all complex curves are orientable so you already rule out things like you know applying bottles and stuff like that or Möbius strips. So the complex structure requires orientability and that's partly because of Cauchy-Riemann relations. You know, it puts a direction. You can't get away. But the idea is, the interesting thing is all of this now should be thought of as one complex dimensional curves. They're called curves because they're one complex dimension, but
Starting point is 01:17:00 they're not curves, right? They're surfaces in the real sense. Right? Yes. So now here, here comes, so if you apply this to, to, to the Gauss thing, this, you get this amazing trichotomy theorem. And the theorem says, if you do this to the curvature, you can see this, I mean, this, the number here is two, right? You get the only number two, which is a positive curvature thing. Right? And that's consistent with the fact that the sphere is a positively curved object. Locally, everywhere, it has positive curvature.
Starting point is 01:17:32 If you do it to a torus or the surface of a donut, which is just called the algebraic donut, you integrate that, you get zero curvature. And this is not a surprise because you know you have a sheet of paper you Fold it once you get a cylinder and you fold it again. You glue it again. You get this this curtain this torus this donut and This sheet of paper thing is is inherently flat. Yes So if you like if you just take a piece of paper you roll I mean, I like you know you take take a piece of paper, you roll it up, you take this piece of paper
Starting point is 01:18:06 and you roll it up, you get a cylinder and then you do it again and nonetheless you get the surface of a donut, like a rubber tire and that is currently zero curvature. And then you can do this and this is a consequence of what's known as Riemann uniformization theorem. If you do anything that has more than one handle, you get zero curvature. So now you have the trichotomy, right? Positive curvature, zero curvature, negative curvature. The one in the middle is really, obviously is interesting, it's the boundary case.
Starting point is 01:18:39 In complex algebraic geometry, these things are called final varieties. Earlier you said if you have anything that's more than one handle, you have zero curvature. You meant negative curvature. Sorry, sorry, I meant negative curvature. So these fidget spinners on the right, they all have negative curvature. Everything here has negative curvature. Got it. So now in the world of complex algebraic geometry, these positive curvature things are called
Starting point is 01:19:05 final varieties after this Italian guy Fano. These negative curvature objects which proliferate are called varieties of general type. And this boundary case are called zero curvature objects. And it just so happens we now call things in the middle Clavier, these zero curvature objects. And it just so happens, we now call things in the middle clavia, these zero curvature objects. Yes. So far, this has got nothing to do with physics. I mean, it's just the fact of topology, right?
Starting point is 01:19:35 But this is such a beautiful diagram that took from 1736 until Riemann. Riemann died in the 1860s, I think, or something like that. So it took 120 years to really formulate just this table to relate metric geometry to topology to algebraic geometry. It's kind of a beautiful thing, right? So to generalize this table is the central piece of what's now called the minimal model program in algebraic geometry, for which there have been all these fields metalists, you know,
Starting point is 01:20:09 B. Akar a couple of years ago, and then he started with Mori who got the fields metal, and then this whole Mukai and this whole distinct, distinguish idea. So basically this minimal model program should just generalize this to higher dimension. This is dimension complex, dimension one, right? How do you do it? It's very hard. And once you have it, I won't bore you with the details, this is very nice, you know, there's topology, algebraic geometry, differential geometry, index theorem,
Starting point is 01:20:34 they all get unified in this very beautiful way. And you want to, obviously you want to generalize this to arbitrary dimension, arbitrary complex dimension. It would be nice. It's still an open problem. how do you do it in general? It's a very nice problem. But at least for a class of complex manifold known as Kähler manifolds, I won't bore you with the details, but Kähler manifolds on which where the metric has very nice behavior,
Starting point is 01:21:00 there's a potential for which you can have a double derivative that gets on the metric. And then it was conjectured by Kalebi in the 50s. Again, you know, 54, 56, 57, it was a great year, right? All these different ideas, I mean, in three completely different worlds, now come together because mathematical physicists have kind of tied it up, you know, the world of neural networks, the world of Kouby conjecture, the world of string theory to one. I like, you know, when things get bridged up in this way, you know, but, you know, again, this, the theorem itself is extremely technical. But the idea is for this Kähler manifold, there is an analog of this diagram, basically.
Starting point is 01:21:44 I love this slide. I saved this slide for my own private notes. Oh really? Okay. I keep a collection of dictionaries in physics and math. Yeah. I think this is beautiful. Yeah, me too. But it took me like, it took me years to do this table because you, you know, it's not written down anywhere. And it touches different things. I think it's not written down anywhere precisely because math textbooks are written in the
Starting point is 01:22:11 Babaki style. But now it just becomes clear what people have been thinking about for the past 100 years, you know, after Grotendieck, it's just trying to relay these ideas. You know, this is intersection theory of characteristic classes. So this is topology. And you know, this is, I mean, this is over 200 years of work of, you know, the central part of the analytics. And mathematicians like Turing, Ritchie, Euler, Betty.
Starting point is 01:22:36 Yeah, every, everything, every, everyone, everybody was every involved in this diagram is an absolute legend. In fact, there is one more column to this diagram. I think for short of, I think when I did, I mean this was a slide from some time ago, but when I was talking to a string audience, there is one more, one more, which is relations to L function. And that's when number theory comes in. So there is one more column. And to understand this world, to this one more column of its behavior to L functions, that's the Langlands program.
Starting point is 01:23:15 So it's actually really magical that this table actually extends more. As far as, I mean, that's just as far as we know now, right? Of course. The L functions and its relations to modularity. And I think this is, of course, obviously, to me, like mathematics is about extending this table as much as possible to let it go into different fields of mathematics. So but at least for sure we know there is one because of the Langlands correspondence, there is one more column and that column should be on number theory and modularity.
Starting point is 01:23:48 And soon there'll be another table on the Yang invariant, the He invariant. No, I don't think I have enough talent to create something that, but it could well be there should be something new to do. To me that's really the most fun part about mathematics. It's not not so I mean they're like you know, who is it? I think maybe it's Arnold as well, because there's two types of mathematicians. They're the hedgehogs and they're the birds, right? Hedgehogs really like, you know, like, like, and- Specialized. Specialized.
Starting point is 01:24:19 I mean, you absolutely need it. I think, you know, who is a great hedgehog? I think Zhang, the guy who, you know, made this first major breakthrough in the prime gap. I mean, he's been saying his entire life, just trying to think about, can I bound, can I bound the, you know, the, how many, you know, in the, what is the, what's the the limsop of the distance between prime pairs. And the technique he uses is beautifully argued analytic number theory technique,
Starting point is 01:24:56 sieve methods, you know, kind of the Ben Green world of sieFS and James Maynard and then there the The birds who are like, you know, I'm just gonna just fly around and they bump into trees and whatnot But I'm just trying to see whether they can do and and people like Robert Langlands and you know They're very much in that world. Can I see from a distance? I mean, I make it very coarse-grained view and which are you? I'm 100% In the bird category. I mean, I like to go, you know, once I see something, of course sooner or later you need to dig like a hedgehog,
Starting point is 01:25:35 but the most thrill that I get is when I say, oh wow, this gets connected. So the results are proven when you dig, but the connections are seen when you get the overview. Yeah, yeah, absolutely. So, I mean, of course, again, this is a division that's kind of artificial. In all of us, we do a bit of both. Yes. The guy who really does a well is, I forget to mention, of course, it's like it's become like a grand, well, he passed away, John McKay, who was a Canadian, probably the greatest Canadian mathematician since Coxeter. John McKay really saw unbelievable connections in fields that nobody would ever see.
Starting point is 01:26:18 And he passed away. He became sort of in the last 10 years of life, he became sort of like a grandfather to me. He saw my kids grow up over Zoom. So the London Math Society asked me to write obituary. I was very touched by this. So I wrote his obituary for it. And I was just trying to say, well, this guy is the ultimate pattern linker. So John McCain, absolute legend. Great. Moving on, I mean this is very much a huge digression for what I'm actually going to tell you about, which is you know the Birch test for AI and that's great. Do you have a limit on what these videos are? No, just so you know some of them are one hour some of them are four hours and people listen for all of it
Starting point is 01:27:10 Yeah, this is great fun. Great. Yeah same. I'm loving this. Yeah, me too Because normally, you know, I have one up with 55 minute cutoff I could give a talk right then like five minutes questions and I'm like, oh my god I haven't said most of the stuff I wanted to say. Yeah, yeah, exactly. Because the point of this channel is to give whoever I'm speaking to enough time
Starting point is 01:27:31 to get through all of their points, rather than they're rushing and not covering something in depth. I want them to be technical and rigorous. So please continue. Sure. Sounds good to me. So CloudBit, so in that magical year of 1957 of neural networks,
Starting point is 01:27:49 the magical year of the automated theorem prover world and the world of algebra geometry in three complete different worlds, they didn't even know of each other's names, let alone the results. Klab's conjecture that at least for Kailer manifolds, this diagram is very much well defined, this table. And Yao proved it 20 years later, so Shintong Yao, who is again very much like a mentor to me. And he gets the Fields Medal immediately. So you can see why this is so important.
Starting point is 01:28:25 He gets the Fields Medal because this idea of falling through Klabi is trying to generalize this sequence of ideas of Euler, Riemann, and Euler, Gauss, and Riemann. So it's certainly very important. So there it is. We can park this idea. So Yao showed that there are these Taylor manifolds that have this property, that have the right metrical properties.
Starting point is 01:28:51 So by metric, I mean distance, you know, something can integrate over that, because here, you know, you never think you know, this integral is messy, right? Even if we do this on a sphere, right? This R has all these cosines and sines that have got, you know, they've all got to cancel at the end of the day to get 4 pi yes like what the hell and then divided by 2 pi you get 2 and that's the only
Starting point is 01:29:12 number which is kind of amazing stuff and now you can do this in general the the just as a caveat Yao showed that this metric exists he never actually gave you a metric so So the only currently known metric on these things is for the zero curvature case is just the torus. Anything above that we don't know, we just know that exists and if you did this integral you're going to get like 2, 5 or whatever the number is. Which is kind of amazing. This is like a completely non-constructive proof. What's interesting is that these automated theorem provers, they seem computational.
Starting point is 01:29:49 And it's my understanding that computationalists, so people who use intuitionist logic, they don't like constructive proofs. Sorry, they like constructive proofs. They don't like non-constructive proofs. In other words, existence proofs without showing the specific construction. Right, right. So it's interesting to me that all of undergraduate math, which has some non-constructive proofs, are included in Lean. So I don't know the relationship between Lean and non-constructive proofs, but that's an
Starting point is 01:30:17 aside. Yeah, that's an aside. I probably won't have too much to say about it. Cool. So, back to, I don't know why I went on this diatribe on digression on string theory, but I just want to say this is a side comment. So this is something since 1736, which is kind of nice. Which is, oh by the way, that's actually kind of interesting. I'm going to have to check this again. Just down the street from the Institute is the famous department store, Fortman Mason's, which I think is established in 17 something.
Starting point is 01:30:55 It's a great department store. It's not usually, it's not where I usually do my shopping, but it's just a beautiful department store where, you know, Mozart would have, and Haydn might have, you might have called and did their Christmas shopping. But anyhow, just random thought. So string theory was just one slide, right? I mean, I'm not, in some sense, I'm not a string theorist.
Starting point is 01:31:18 In a sense, I don't go quantize strings. The kind of stuff that I'm more interested in is like, I didn't grow up writing conformal field theories and do all that stuff. It's just that for me, it's an input so I can play with a little more problems in geometry. So string theory is this theory of space-time that unifies quantum gravity, blah, blah, blah. And then it works in 10 dimensions and we've got to get down to 4 dimensions. So we're missing 6 dimensions. So that's what I'm going to say. And this amazing paper in 1985 by Candela Horowitz-Strohminger and Whitten, they were
Starting point is 01:31:58 thinking about what are the properties of the 6 extra dimensions. So what is interesting is that by imposing supersymmetry, and this is why supersymmetry is so interesting to me, by imposing supersymmetry and other anomaly cancellation, bunch of, not too stringent conditions, they hit on the condition that this six extra dimensions has to be richy flat. Richy flat is, you can understand,
Starting point is 01:32:25 because it's vacuum-outside solutions. You want the vacuum string solution. And then the condition which you've never seen before, which just happens to be this Cayley condition. They didn't know about this. No physicist until 1985 would know what a Cayley manifold was. And it's complex, it's complex and it's complex dimension three. Remember again, I said complex dimension three means real dimension six, right? That's 10 minus four is six, and six needs to be complexified into three. And again, this is just an amazing fact that in 1985, Strominger, who was a physicist, was visiting Yale at the Institute of Advanced Study in Princeton.
Starting point is 01:33:10 And so he went to Yao and said, can you tell me what this strange condition, this technical condition I got? And Yao says, wow, you know, I just got the Fields Medal for this. I think I made know a few things. I was just amazed. It was again, it was a complete confluence of ideas that's totally random. And the rest is history. So in fact, these four guys named this Richie Flap, Kailor Manifold, Klabi Yao. So it wasn't the mathematicians who did it. This word Klabi Yao came from physicists, from string theorists, which now, of course, Calabi-Yau is now one of the central pieces. So, Philip Candelas
Starting point is 01:33:53 was my mentor at Oxford when I was a junior fellow there, and he tells me this story. He's a very lively guy. He tells me about how this whole story came about, and it's very interesting. And so he and these four guys came up with the word collab. So all of a sudden we now have a name for this boundary case in complex hydrometry. This bounding case is now known as a collab so i remember we had names before right this was the final variety this was varieties of general type and this bounding case is now called club. So what we're seeing with the tourists here is a club one exactly exactly exactly in fact, the the the Taurus is
Starting point is 01:34:52 The only Klabi-Yau one. So it's the only one that's richy-flat. I mean by this classification. It's the only one that's topologically possible So that's kind of interesting right and then this is just a comment I like this title because I think your series is called TOE This is a TOE on TOE. Love it. I just want to emphasize, this is a nice confluence of ideas with mathematics and physics. But my string theory really, what it really is, is this brainchild of interpreting problems between,
Starting point is 01:35:18 interpreting and interpolating between problems in mathematics and physics. All right, so for example, you example, we now, you know, GR should be phrased in differential geometry. The standard model gauge theory should be phrased in terms of algebraic geometry and representation theory of finite groups. And condensed matter physics of topological insulators
Starting point is 01:35:40 should be phrased in terms of algebraic topology. This idea, I think the greatest achievement of the 20th century physics is, to me, and I think something you would appreciate since you like tables, is here's a dictionary of a list of things and then here's what they are in mathematics. And then, you know, you can talk to mathematicians in this language and you can talk to physicists in that language, but they're actually the really same thing, You know, what's a fermion? You know, it's a spin representation of the Lorentz group. You know, I like that because it gives a precise definition of what we are seeing around.
Starting point is 01:36:14 Then you have something you can purely play with in this platonic world. And string theory is really just a brainchild of this translation, this tradition of what's on the left and what's on the right, and let's see what we can do. And sometimes you make progress on the left, you give insight on stuff on the right, and sometimes you make progress on the right and you give insight on the left. Why is it that you call the standard model algebraic geometry? Because bundles and connections are part of differential geometry, no? Oh yeah, that's true.
Starting point is 01:36:42 Well I think that's, yeah, they're interlinked. And I think algebraic, maybe it's because of Atiya and Hichin. Of course, you know, they are fluid in both. So yeah, yeah, they go either way. But algebraic in the sense that you can often work with bundles and connections without actually doing the integral in differential geometry. So I think that's the part I want to emphasize. You can understand bundles purely as algebraic objects without ever doing an integral. You know, like here. Like here, for example, like this integral is obviously something you would do
Starting point is 01:37:33 in differential geometry. But this integral, the fact that it comes to be an integer was explained through the theory of churn classes. This integral is a pairing between the churn class, be, you know, to be, you know, this integral is a pairing between the churn class, between homology and cohomology, which is a purely algebraic thing. You know, we all try to avoid doing integrals because integrals are horrible because it's hard to do. And in this language, it really just becomes polynomial manipulation and it becomes much simpler.
Starting point is 01:38:03 Okay. So, you know, that in that sense, I want to put it. Of course, you know, it's a bit of a both. Got it. So, I like doing this diagram, right? And, you know, if you look at the time lag between the mathematical idea and the physical realization of that idea, there really is a confluence. Yeah. Yeah. It's getting closer.
Starting point is 01:38:26 I mean, these things going up and down. I mean, I'm just saying, if you take the last 200 years or so, last 100 years or so, of the groundbreaking ideas in physics, there is this... Interesting. Right. It gets shorter and shorter. Obviously, Einstein took ideas of Riemann and you know there was a six year gap.
Starting point is 01:38:46 Dirac was able to come up with the equation of electron essentially because of Clifford Algebras. Did historically was he motivated by Clifford Algebras or did it just was it later realized hey Dirac what you're doing is an example of a Clifford Algebra? So I believe the story goes, in order to write down the first time derivative version of the Klein-Gordon equation, which is a second order, you know, that's the bosonic one, he had to do some fact, essentially he factorized the matrix in a way that seemed very strange to him. And Dirac said, this really reminded me of something that I've seen before. And this is one of these moments, right? Today, we can chat GPT this. But what Dirac did was, he said, he was at St. John's in Cambridge at the time. He said, I have seen this in the textbook before somewhere, you know, this gamma mu, gamma nu thing. And then he said, I need to go to the library to check this. So he really knew about this.
Starting point is 01:39:53 And unfortunately, the St. John's library was closed that evening. So he waited until the morning, until the library was open to go to Clifford's book or a book about Clifford. I can't remember whether it was Clifford's book or maybe it was one of these books. And then he opened up and he really knew that this gamma mu gamma nu anti-computation relation really was through the through. So he knew about Clifford. Cool.
Starting point is 01:40:26 It's kind of interesting, yeah. Just like Einstein knew about Riemann's work on curvature. But, you know, whether you say, you know, Dirac was really inspired by Clifford, well, he certainly did a funky factorization and then he knew how to justify it immediately by looking at the right source. And then similarly, you know, Yang-Mills theory depended on this Zybert's book on apology. And then, you know, by the time you get to Witten and Borchett's, really there's this... This diagram for me is like what gets me excited about string theory. Because string theory is a brainchild of this curve, this orange curve.
Starting point is 01:41:05 And now it's getting mixed up. I mean, of course, you know, people hear about this great quote that Witten says, you know, string theory is a piece of 21st century mathematics that happens to fall into 20th century. And I think he means this. Yes. You know, that he was using supersymmetry to prove, you know, theorems in Morse theory and vice versa. Richard Borchers was using vertex algebra, which is sort of foundational thing, conformal
Starting point is 01:41:35 field theory, to prove some properties about the monster group. We're at this stage. I know, of course, you know, this was turn of last turn of the century and now we're here and I Have to where are we now? Are we are we crisscrossed or are we parallel? Yeah, it's hard to it's hard to say and in a meta manner You can even interpret this as the pair of pants in string theory with the world sheet. Yeah, cute Why not yeah, it is it is but going back to with the world sheet. Yeah, cute. Very cute. Yeah, why not? Yeah, it is.
Starting point is 01:42:07 But going back to what you were saying, how I got to... Oh yeah, so just, yeah, this confluence idea, of course, you know, everyone quotes these two papers. You know, when Wigner was thinking about in 59, why mathematics is so effective in physics. And there's this maybe slightly less known paper, but certainly equally important paper by the great Leith Latia and then Dijkgraaf and Hitchin, which is the other way around. Why is mathematics so, why is physics so effective in giving ideas in mathematics?
Starting point is 01:42:46 So this is a beautiful pair of essays. In this, this is like very much a, in the world of a summary of the kind of physics ideas from string theory is making such beautiful advances in geometry. So this is a very beautiful pair of one given in the other that needs to be, you know, sort of praised more. And that's why you were mentioning earlier how I got to know, you know, Roger Sawal is through this editorial, we try to collect with my colleague, Mo Ling Ge, who is a former director of the Chern Institute.
Starting point is 01:43:32 Everybody's connected, right? So it just so happens that I grew up in the West, but after I grew up with my parents, after so many decades, my parents actually retired and went back to Tianjin where Nankai University is, where Chern founded what's now called the Chern Institute for Mathematical Sciences. And that's an institute devoted to the dialogue between mathematics and physics. In fact, one third of Chern's ashes is buried outside of the Math Institute. There's a great, beautiful marble tomb. Interesting.
Starting point is 01:44:06 And once they're not because of any mathematical reason, it's just that he considered three parts of his home. So his hometown in Zhejiang, China, and Berkeley, where he did most of his professional career and then Nankai University where he retired to for the last 20 years of his life. So a third each. Yes. The number three comes up again.
Starting point is 01:44:35 It's all about free. And in fact, I was going to joke. So in Chen Simon's theory in three dimensions, there's this topological theory, the Chen Simon's theory. There's a crucial factor of one third. I always joke, you know, that's why churn chose one third for his ashes, but that's not my complete coincidence. But it's actually, what is actually interesting is the, that tomb, that beautiful black marble tomb, you know, for somebody that's great as churn, it mentions nothing about, you know, for somebody as great as Chern, it mentions
Starting point is 01:45:06 nothing about, you know, his chief done this, done the other thing. It's just one page of his notebook. I mean, think about the poor guy who had to chisel or that, he had no idea what he's chiseling, right? The guy who's chiseling this thing. And it's the proof of this, of this. The fact is such, and of course it's a little, you can look this on the internet, just say the grave of SS Chern at Nankai University. Well, the whole conversation we've had is just about pattern matching without the intuitive understanding behind it. So this chiseler may have had that. Yeah, yeah, yeah. That's what I do every day. Love it. So that chisel is essentially his proof why this is equal to this.
Starting point is 01:45:50 You know, why this intersection product is the same as this integral. So he's actually, it's where the Gaspard theorem is a corollary of this trick in algebraic geometry, which is his great achievement. But anyhow, so it just so, yeah, back to this coincidence, and it just so happens that my parents, after drifting all these years abroad, they retired back to Tianjin, where the Chuan Institute is. So that's why I became an honorary professor at Nankai, because, I mean, my motivation was purely just so that I could spend time with my parents. But it just so happens that it happens there and I can just pay my homage to Chen, just to see his grave.
Starting point is 01:46:34 I mean, it's a great, you know, it's a mind-blowing experience just to see the Chen's grave and to see the derivation of this in his handwriting chiseled in stone. But anyway, so that's how I got involved in with Xianyang because he was very deeply involved with Chen. He and Chen are good friends. I can imagine that Xianyang is 102 today. Yeah, it's remarkable. And that he was still doing, he wrote the preface to this, um, when he was 99. And these guys are unstoppable.
Starting point is 01:47:09 And, and, you know, Roger, Roger Penrose sent he, he sent his essay to this one when he was what, 90, 92. Yeah. These guys are anyhow, it's kind of, I, I, I of, you like tables, right? I love tables. So the tables, here's just like, you know, a speculation of where string theory is going. Here's a list of, you know, the annual conferences, like the series where string theory has been happening. So 1986 was the first string revolution where since then every year there's been a major string conference. I'm going to the one, the first one I'm going to for years in two weeks time.
Starting point is 01:47:55 It happens to be Abu Dhabi, I guess I'm sorry. And then, you know, there's a series of annual ones, the string phenol, and the string math came in as late as in 2011. That's kind of interesting. So that's like, you know, 30 years after the first string conference and the various other ones. What's really interesting one is in 2017 is there's the first string data. This is what AI entered string theory. And so it's kind of so what I read the first paper in 2017 about AI sister stuff, and there were three other groups independently mining different AI aspects and how to apply to string theory. So the reason I want to mention is just why was, you know,
Starting point is 01:48:38 with the string community even thinking about this problems in AI. Oh, and also just to be clear, briefly speaking, I'm not a fan of tables per se. I'm a fan of dictionaries because they're like Rosetta stones. So I'm a fan of Rosetta stones and translating between different languages. So you mentioned the siloing earlier
Starting point is 01:48:56 and mathematicians call, even physicists call them dictionaries, but technically they're thesauruses. Like a dictionary, you just have a term and then you define it. The translations. Right, right, right, right. Like Rosetta stones yes yeah yeah no absolutely I guess that's why you like Langlands so much yeah yeah for sure yeah no yeah absolutely
Starting point is 01:49:14 in some way this whole channel is a project of a Rosetta stone between the different fields of math and physics and philosophy yeah yeah that's fantastic love it big fan thank you okay so do you want to just I noticed it jump back to physics and philosophy. Yeah, that's fantastic. Love it. Big fan. Thank you. OK, so do you want to just, I noticed it jumped back to number 13. So it seems like I thought we were at 39 out of 40. No, no, no, because I've learned this non-linear structure. Because you see, I've learned this. This is really dangerous.
Starting point is 01:49:40 I've learned the click button in PDF presentations. Like you click it, it button in PDF presentations. Like you click it, it jumps to another one and you can have interludes. So you know it's clearly an interlude and you say you jump back to your main. So my actual main presentation is only like you know 30 pages but it's got all these digressions which is actually very typical of my personality. So I gave you this big interlude about string, about string theory and Clavier manifolds, right? So now we've already got to the point that Clavier onefold, the one dimensional
Starting point is 01:50:16 complex Clavier, there's only one example. That's just one of these. Right. And then it turns out that in complex dimension two, there are two of these. There is the four dimensional torus, which is, and then there's this crazy thing called the K3, which is Ricci flat and K-ler. So you got one in complex dimension one, two in complex dimension three. You would think in three dimensions there's three of these things that are topologically distinct.
Starting point is 01:50:50 And unfortunately this is one of the sequences in mathematics that goes as one, two, we have absolutely no idea. And we know at least one billion. At least. So it's kind of, it goes one, two, a billion. And so starting from complex dimension three just goes crazy. It's still a conjecture of Yao that in every dimension this number is finite. So, remember this positive curvature thing, this final thing to the very top.
Starting point is 01:51:18 It is a theorem that in every dimension, final varieties is finite in possibility in topology, that only a finite number of these that are distinct topologically. It's also known that the negative curvatures is infinite in every dimension. And when it goes higher, it's like even uncountably infinite. Oh, interesting. But is this boundary case, Yao conjectures in an ideal world, they're also finite. But we don't know. This is the open conjecture.
Starting point is 01:51:52 Now the billion, are any of them constructed or is it just the existence? Yeah, yeah. That's it. Now that's exactly where we're getting. So it's gotten, it's one, two and three. Three is like, you know, how are you going to list these things? Right. And then.
Starting point is 01:52:09 Algebra geomotors never really bother listing all of them out. This is just not something they do. So it took on, the physicists took on the challenge. So Philip Candelas and Franz and then Harold Scharke and Maximilian Kreutzer started just listing these. And that's why we have these billions. There is actually databases of these. And they're presented in just like matrices like this.
Starting point is 01:52:37 I won't bore you with the details of these matrices. These algebraic varieties, you can define this as intersections of polynomials. That's one way to represent them. And in Croats and Schakka's database, they put vertices, uptoric varieties, hiding in bed. But the upshot is that, you know, there's a database of gigas, of many, many gigabytes, that really got done by the, certainly by the turn of the century, by year 2000, these guys were running on Pentium machines. I mean, this is an absolute feat, especially Kroetser and Shkalka.
Starting point is 01:53:11 They were able to get 500 million of this and stored on a hard drive using a Pentium machine of this car by Almanyfold. And they were able to compute topological invariance of these. So I happen to have this database And they were able to compute topological invariance of these. So I happened to have this database. I could access them and that was kind of fun.
Starting point is 01:53:33 And I've been playing on and off with them for a number of years. So a typical calculation is like you have something like a configuration of tensors of here is even in integers, and you have some standard method in algebraic geometry to compute topological invariance. This topological invariance again, in this dictionary, means something. For example, H21 in some context,
Starting point is 01:53:59 is the number of generations of fermions in the low-energy world. That's a complete problem in this computing of topological invariant in algebraic geometry. And there are methods to do it. And in these databases, people took 10, 20 years to compile this database and you got these things in. And they're not easy.
Starting point is 01:54:18 It's very complicated to compute these things. So in 2017, I was playing around with this. And the reason why I was playing around with this. And the reason why I was playing around with this was very simple. It's because my son was born and I had infinite, sleepless nights and I couldn't do anything. Right? I had like, you know, there's the kid and then, you know, there's the kid, there's the kid. And, you know, and he wakes you up at two, you know, put him to and, you know, and I was bottle feeding him two, you know, put him to and, and, you know, and I was bottle feeding him and I had a daughter at the time. So the, the, my wife's taking
Starting point is 01:54:50 care of the daughter. They they're passed out. And then I got this kid, I passed him out, put them into bed and I'm, I'm wide awake at this point. It's like 2 AM. It's like, I can't fall asleep anymore and I can't do real, you know, serious computation anymore because I'm just too tired. So, let's just play around with data. At least I can let the computer help me to do something. And then that's when I learned, you know, what's this thing that everybody's talking about?
Starting point is 01:55:18 Well, you know, it's machine learning. Right. So that's why I got through this. It's a very simple, very simple biological reason why I was trying to learn machinery. So then I think I was hallucinating at some point, right? I was like, well, if you look at pictures like, you know, matrices a lot like, you know, we're talking about, you know, 500 million of these things, right? Yes.
Starting point is 01:55:41 Certainly I wasn't going through all of them. And they're being labeled by topological invariants. How different is it if I just sort of pixelated one of these and labeled them by this? And all of a sudden this began to look like a problem in hand digit recognition, right? This is like, how different is this or image recognition? So, and I just literally started feeding in.
Starting point is 01:56:06 I took 500, I mean, 500 million is too much, right? So I took like 5,000 of these, 10,000 of these, and then trained them to look and recognize this, to recognize this on number. And I was like, this is going to be, it's just going to give crap, obviously. It's going to give 0% accuracy. And to my surprise, it was giving extremely good accuracies.
Starting point is 01:56:27 So somehow the neural network that I was training, I was even using standard MNIST, the high-dreconitioned MNIST things, recognizing this. And it was recognizing it to great accuracy. And now, people have improved this. Like loads of people, like Fin you know, Finitello, there's a group there that did some serious work on just trying to dis-problem. But this idea suddenly didn't seem so crazy anymore.
Starting point is 01:56:52 The idea seemed completely crazy to me because I was hallucinating at 2 a.m. And it was way so, but what's the upshot of this? The upshot is somehow the neural network was doing algebraic geometry, like this kind of algebraic geometry, really sequence chasing, very complicated, without knowing anything about algebraic geometry. It somehow was just doing pattern recognition and somehow it's beating us because you know if you do this computation seriously, it's double exponential complexity. But it's just now, but pattern recognition is bypassing all of that. So then I became a fanatic, right?
Starting point is 01:57:32 Then I said, well, all of algebraic geometry is image processing. So far I have not been shocked by the algebraic geometries, because it's actually true if you really think about it. You know, any algebraic, the point of algebraic geometry, the reason I like algebraic more than differential is because there's a very nice way to represent manifolds in this way. Manifolds in algebraic geometry,
Starting point is 01:57:56 so in differential geometry, manifolds are defined in terms of Euclidean patches. Then you do, you know, transition functions, you know, which are differentiable, C infinity, blah, blah, blah. But in algebraic geometry, they're just vanishing low-side polynomials. And then once you have systems of polynomials, you have a very good representation. So for example, here, this is just recording the list of polynomials, the degrees of polynomials
Starting point is 01:58:22 that are embedded in some space. And that really is algebraic geometry. So basically any algebraic variety, so that's fancy way of saying this polynomial representation of a manifold, which is called an algebraic variety, this thing is representable in terms of a matrix or a tensor, sometimes even an integer tensor. And then the computation of invariance, a topological invariance, is the recognition problem of such tensor. But once you have a tensor, you can always pixelate it and, you know,
Starting point is 01:58:57 picturize it. You know, at the end of the day, it's doing this because it's just image process in algebraic geometry. Now, do you mean to say every problem in algebraic geometry is an image process? Oh almost. Is an image processing problem or just problems involving invariance or image processing or even broader than that? Well, I think it's I'm thinking it is really more broad I think you know at some level, you know, I think in my view, I try to say, you know, bottom, bottom up mathematics is language processing and top down mathematics is image processing. Interesting. Of course, this is, I mean, take with the caveat, but of course, at some level, this there is truth in what I say. Of course, it's an extreme thing to say.
Starting point is 01:59:46 But, you know, in terms of what mathematical discovery is, is that you're trying to take a pattern in mathematics. So in algebra, if you use perfect example, you can pixelate everything and you can just try to see certain images have certain properties. And so your image processing mathematics, whereas bottom up, you're building up mathematics as a language. So, it's language processing. And of course, all of this will be useless if you can't actually get human readable mathematics out of it. So, this is the first surprise.
Starting point is 02:00:18 The fact that it's even doing it at all to a certain degree of accuracy. Now, we're talking about accuracy like now it's been improved to like 99.99 percent accuracy in these databases. But that's the first level, that's the first surprise. The second surprise is that you can actually extract human understandable mathematics from it. And I think that's the next level surprise. So the murmuration conjectures, this beautiful work in DeepMind
Starting point is 02:00:48 that Jody Williamson's involved in, in this human-guided intuition, you can actually get human mathematics out of it. That's really quite something. So maybe that's a good point to break for part two, which is an advertisement of, you know, here is like we've gone through many things about what mathematics is and how it got this through doing this interaction between algebra, geometry, and string theory. And then a second part would be how you can actually extrapolate and extract mathematics, actual conjectures,
Starting point is 02:01:30 things to prove from doing this kind of experimentation, which are summarized in these books. I keep on advertising my books because I get 50 pounds per year of, what do they call it, royalties, so I don't have to sell my liver for my kids. But it's actually kind of fun. It's a complete, I mean, academic publish is a joke, right? You get like out of like a hundred pound a year because you don't actually make money out of it.
Starting point is 02:01:57 But maybe that's a good place to break. And then for part two, how we try to formulate what the Birch test is for AI, which is sort of the Turing test plus. Because the Birch test is how to get actual meaningful human mathematics out of this kind of playing around with mathematical data. I see two of your sentences that will be these maxims for the future will be that machine learning is the 22nd century's math that fell into the 21st. So this machine learning assisted mathematics or that the bottom up is language processing and then the bottom the top down is image processing. Yeah, I like those two.
Starting point is 02:02:37 Yeah. Anyone who's watching if you have questions for Yang Hui for part two, please leave them in the comments. Do you want to give just a brief overview? Oh yeah, sure. So, I'm going to talk about what the Birch Test is and which papers so far have gotten – how closely they've gone through the Birch Test. And then I'm going to talk about some more experiments in number theory. And the one that I really enjoyed doing with my collaborators, Lee, Oliver and Posniakoff, which is to actually make something meaningful that's related to the Bresch-Snowden-Dyer
Starting point is 02:03:12 conjecture just by just letting machine go crazy and finding a new pattern in elliptic curves, which is fundamentally a new pattern in the prime numbers, which is completely amazing. You mentioned quanta earlier so this quanta feature that featured this one considered this as one of the breakthroughs of 2024. Great and that word murmuration which was used repeatedly throughout it was never defined but it will be in the part two. Absolutely. I'm looking forward to it.
Starting point is 02:03:42 Me too. Okay thank you so much. Thank you. This has been wonderful. I could continue speaking to it. Me too. Me too. Okay. Thank you so much. Thank you. This has been wonderful. I could continue speaking to you for four hours. Both of us have to get going but That's so much fun. Pleasure. Don't go anywhere just yet. Now I have a recap of today's episode brought to you by The Economist. Just as The Economist brings clarity to complex concepts, we're doing the same with our new AI-powered episode recap. Here's a concise summary of the key insights from today's podcast.
Starting point is 02:04:10 All right, let's dive in. We're talking about Kurt J. Mungle and his deep dives into all things mind-bending. You know this guy puts in the hours, like weeks, prepping to grill guests like Roger Penrose on some wild topics. Yeah, it's amazing using his own background to dig in. Really challenging guests with his knowledge of mathematical physics pushes them beyond the usual. Definitely.
Starting point is 02:04:34 And today we're focusing on his chat with mathematician Yang Huihe. They're getting into AI, math, where those two worlds collide. And it's fascinating because it really makes you think differently about how math works, how we do math, and where AI might fit into the picture. You might think a mathematician's life is all formulas and proofs, but Yang Hui, he actually started exploring AI assisted math while dealing with sleepless nights with his newborn son.
Starting point is 02:04:59 It's such a cool example of finding inspiration when you least expect it. Tired but inspired, he started messing around with machine learning in those quiet early morning hours. So let's break down this whole AI and math thing. Yanqui, he talks about three levels of math. Bottom up, top down, and meta. Bottom up is like building with Legos. Very structured, rigorous proofs. That's the foundation. But here's where things get really interesting. It has limitations. Right. And those limitations are highlighted by Gödel's incompleteness theorems. Basically,
Starting point is 02:05:29 Gödel showed us that even in perfectly logical systems, there will always be true statements that can't be proven within that system. It's mind-blowing. So if even our most rigorous math has these inherent limitations, it makes you think. Could AI discover truths that we as humans bound by our formal systems might miss? Could it explore uncharted territory? That's a really deep thought. And it's really at the core of what makes this conversation revolutionary. It's not about AI just helping us with math faster. It's about AI possibly changing how we think about math altogether. So how is this all playing out? We've had computers in math for ages, from early theorem provers to AI assistants like Lean. But where are we now with AI actually doing math?
Starting point is 02:06:10 Well, AI is already making some big strides. It's tackling Olympiad-level problems and doing it well, which makes you ask, can AI really unlock the secrets of math? And that leads us to the big philosophical questions. Is AI really understanding these mathematical ideas, or is it just incredibly good at spotting patterns? It's like that famous Chinese room thought experiment. You could follow rules to manipulate Chinese symbols without truly understanding the language. Yang Hui, he shared a story about Andrew Wiles, the guy who proved Fermat's last theorem, trying to challenge GPT-3 with some basic math problems. It highlights how early AI models, while excelling in tasks with clear rules and plenty of examples,
Starting point is 02:06:52 struggled with things that needed real deep understanding. It seems like AI's strength right now is in pattern recognition. And that ties into what Yang Kui he calls top-down mathematics. It's where intuition and seeing connections between different parts of math are king. Like Gauss. He figured out the prime number theorem way before we had the tools to prove it. It shows how a knack for patterns can lead to big breakthroughs even before we have the rigorous structure. It's like AI is taking that intuitive leap, seeing connections that might have taken us humans years even decades to figure out. And it's all because AI can deal
Starting point is 02:07:26 with such massive amounts of data. Which brings us back to Yang We. He's sleepless nights. He started thinking about Calabiao manifolds, super complex mathematical things key to string theory as image processing problems. Wait, Calabiao manifolds? Those sound like something straight out of science fiction.
Starting point is 02:07:45 They're pretty wild. Think six dimensions all curled up, nearly impossible to picture. They're vital to string theory, which tries to bring all the forces of nature together. Now mathematicians typically use these really abstract algebraic geometry techniques for this, but Yang-Wei? He had a different thought. So instead of equations and formulas, he starts thinking about pixels. Yeah. Like taking a Klabi-Yau manifold, breaking it down into a pixel grid, like you do with an
Starting point is 02:08:10 image. He's taking abstract geometry and turning it into something a neural network built for image recognition can handle. That is a radical change in how we think about this. It's like he's making something incredibly abstract, tangible, translating it for AI. Did it even work? The results blew people away. He fed these pixelated manifolds into a neural network, and it predicted their topological properties really accurately. He basically showed AI could do algebraic geometry in a whole new way.
Starting point is 02:08:40 So it's not just speeding up calculations. It's uncovering hidden patterns and connections that might have stayed hidden, like opening a new way of seeing math. And that leads us to the big question. If AI can crack open complex math like this, what other secrets could it unlock? We're back. Last time we were talking about AI not just helping us with math, but actually coming up with new mathematical insights, which is where the Birch test comes in. It's like, can AI go from being a super calculator to actually being a math partner? Exactly.
Starting point is 02:09:13 And now we'll look at how researchers like Yang Hui-hee are trying to answer that. Remember the Turing test was about a machine being able to hold a conversation like a human. The Birch test is a whole other level. It's not about imitation. It's about creating completely new mathematical ideas. Think about Brian Birch back in the 60s. He came up with this bowl conjecture about elliptic curves just from looking at patterns in numbers.
Starting point is 02:09:34 So this test wants AI to do similar leaps, to go through tons of data, find patterns, and come up with conjectures that push math forward. Exactly. Can AI, like Birch, show us new mathematical landscapes? That's asking a lot. So how are we doing?
Starting point is 02:09:49 Are there any signs AI might be on the right track? There have been some promising developments, like in 2021, Davies and his team used AI to explore knot theory. Knots, like tying your shoelaces, what's that got to do with advanced math? It's more complex than you think. Knot theory is about how you can embed a loop in three-dimensional space and it actually connects to things like
Starting point is 02:10:10 topology and even quantum physics. Okay, that's interesting. So how does AI come in? Well, every knot has certain mathematical properties called invariants. It's kind of like its fingerprint. Davy's team used machine learning to analyze a massive amount of these invariants. So was the AI just crunching numbers? Or was it doing something more? What's amazing is the AI didn't just process the data, it actually found hidden relationships
Starting point is 02:10:33 between these invariants, which led to new conjectures that mathematicians hadn't even considered before, like the AI was pointing the way to new mathematical truths. That's wild. Sounds like AI is becoming a powerful tool to spot patterns our human minds might miss. Absolutely. Another cool example is Lample and Charton's work in 2019. They trained AI on a massive data set of math formulas.
Starting point is 02:10:56 And what did they find? Well, this AI could accurately predict the next formula in a sequence, even for really complex ones. It was like the AI was learning the grammar of math and could guess what might come next. So we might not have AI writing full-blown proofs yet, but it's getting really good at understanding the structure of math and suggesting new directions. And that brings us back to Yang Cuhi. His work with those Calabi-Yau
Starting point is 02:11:18 manifolds, analyzing them as pixelated forms, that was a huge breakthrough. Showed that AI could take on algebraic geometry problems in a totally new way. Like bridging abstract math in the world of data and algorithms. Exactly. And that bridge leads to some really mind-bending possibilities. Yang Hui, he and his colleagues started exploring something they call murmuration. Murmuration? Like birds. It's a great analogy. Think of a flock of birds moving together like one. Each bird reacts to the ones around it and you get these complex, beautiful patterns.
Starting point is 02:11:51 Right, I get it. But how does it relate to AI and math? Well, Yanghui, he sees a parallel between how birds navigate together in a murmuration and how AI can guide mathematicians towards new insights by sifting through tons of math data So the AI is like the flock exploring math and showing us where things get interesting Yeah
Starting point is 02:12:11 And they've actually used this murmuration idea to look into a famous problem in number theory the Birch and Swinerton-Dyer conjecture That name sounds a bit intimidating. What's it all about? Imagine a doughnut shape, but in the world of numbers. These are called elliptic curves. Mathematicians are obsessed with finding rational points on these curves, points where the coordinates can be written as fractions. Okay, I'm following so far. The Birch and Swinerton-Dyer conjecture basically says there's this deep connection between how many of these rational points there are and a specific math function, like linking
Starting point is 02:12:43 the geometry of these curves to number theory. Things are definitely getting complex now. And it's a big deal in math. It's actually one of the Clay Mathematics Institute's Millennium Prize problems. Solve it, you win a million bucks. Now that's some serious math street cred. So how did Yang Hu, his team, use AI for this?
Starting point is 02:13:01 They trained an AI on this massive data set of elliptic curves and their functions. The AI didn't actually solve the whole conjecture, but it found this new pattern, this correlation, that mathematicians hadn't noticed before. So the AI was like a digital explorer, mapping out this math territory and showing mathematicians what to look at more closely.
Starting point is 02:13:20 Exactly. This discovery, while not a complete proof, gives more support to the conjecture and opens up some exciting new areas for research. It shows how AI can help with even the hardest problems in mathematics. It feels like we're on the edge of something new in math. AI is not just a tool, it's a partner in figuring out the truth. What does all this mean for math in the future?
Starting point is 02:13:42 That's a great question, and it's something we'll dig into in the final part of this deep dive. for math in the future? That's a great question, and it's something we'll dig into in the final part of this Deep Dive. We'll look at the philosophical and ethical stuff around AI in math. We'll ask if AI is really understanding the math it's working with, or if it's just manipulating symbols in a really fancy way. See you there. Welcome back to our Deep Dive. We've been exploring how AI is changing the game in math, from solving tough problems to finding hidden patterns in complex structures. But what does it all mean? What are the implications of all of this?
Starting point is 02:14:11 We've touched on this question of understanding. Does AI really understand the math it's dealing with, or is it just a master of pattern matching? Yeah, we can get caught up in the cool stuff AI is doing. But we can't forget about those implications. If AI is going to be a real collaborator in mathematics, this whole understanding question is huge. It goes way back to the Chinese room thought experiment. Imagine someone who doesn't speak
Starting point is 02:14:35 Chinese has this rulebook for moving Chinese symbols around. They can follow the rules to make grammatically correct sentences, but do they actually get the meaning? So is AI like that, just manipulating symbols in math without grasping the deeper concepts? That's the big question, and there's no easy answer. Some people say that because AI gets meaningful results, like we've talked about, it shows some kind of understanding, even if it's different from how we understand things.
Starting point is 02:15:01 Others say AI doesn't have that intuitive grasp of math concepts that we humans have. It's a debate that's probably going to keep going as AI gets better and better at math. Makes you wonder how it's going to affect the foundations of mathematics itself. That's a key point. Traditionally, mathematical proof has been all about logic, building arguments step by step using established axioms and theorems. But AI brings something new, inductive reasoning, finding patterns, and
Starting point is 02:15:26 extrapolating from those patterns. So could we see a change in how mathematicians approach proof? Could we move toward a way of doing math that's driven by data? It's possible. Some mathematicians are already using AI as a partner in the proving process. AI can help generate potential theorems or find good strategies for tackling conjectures. But others are more cautious, worried that relying too much on AI could make math less rigorous, more prone to errors.
Starting point is 02:15:51 It's like with any new tool, there's good and bad. Finding that balance is important. We need to be aware of the limitations and not rely on AI too much. Right. And as AI becomes more important in math, it's crucial to have open and honest conversations. We need to talk about what AI means, not just for math, but for everything we do. It's not just about the tech, it's about how we choose to use it. We need to make sure AI helps humanity and the benefits are shared. That's everyone's responsibility.
Starting point is 02:16:19 A responsibility that goes way beyond just mathematicians and computer scientists. We need philosophers, ethicists, social scientists, and most importantly, the public. We need all sorts of voices and perspectives to guide us as we go into this uncharted territory. This has been an amazing journey into the world of AI and math. From sleepless nights to those mind-bending manifolds,
Starting point is 02:16:39 we've seen how AI is pushing the boundaries of what's possible. And as we wrap up, we encourage you to keep thinking about these things. What does it really mean for a machine to understand math? How will AI change the way we prove things and make discoveries in math? How can we make sure we're using AI responsibly and ethically in our search for knowledge? These are tough questions, but they're worth asking. The future of mathematics is being shaped right now,
Starting point is 02:17:04 and AI is a major player. Thanks for joining us on this deep dive. We'll catch you next time, ready to explore some other fascinating corner of the universe of knowledge. New update, started a sub stack. Writings on there are currently about language and ill-defined concepts,
Starting point is 02:17:20 as well as some other mathematical details. Much more being written there. This is content that isn't anywhere else. It's not on theories of everything. It's not on Patreon. Also full transcripts will be placed there at some point in the future. Several people ask me, hey Kurt, you've spoken to so many people in the fields of theoretical physics, philosophy and consciousness. What are your thoughts? While I remain impartial in interviews, this substack is a way to peer into my present deliberations on these topics.
Starting point is 02:17:50 Also, thank you to our partner, The Economist. Firstly, thank you for watching, thank you for listening. If you haven't subscribed or clicked that like button, now is the time to do so. Why? Because each subscribe, each like helps YouTube push this content to more people like yourself, plus it helps out Kurt directly, aka me. I also found out last year that external links count plenty toward the algorithm, which means that whenever you share on Twitter, say on Facebook or even on Reddit, etc. It shows YouTube, hey, people are talking about this content outside of YouTube, which
Starting point is 02:18:29 in turn greatly aids the distribution on YouTube. Thirdly, there's a remarkably active Discord and subreddit for theories of everything where people explicate toes, they disagree respectfully about theories, and build as a community our own toe. Links to both are in the description. Fourthly, you should know this podcast is on iTunes, it's on Spotify, it's on all of the audio platforms. All you have to do is type in theories of everything and you'll find it. Personally, I gain from rewatching lectures and podcasts. I also read in the comments that hey,
Starting point is 02:18:59 toe listeners also gain from replaying. So how about instead you re-listen on those platforms like iTunes, Spotify, Google Podcasts, whichever podcast catcher you use. And finally, if you'd like to support more conversations like this, more content like this, then do consider visiting patreon.com slash Kurt Jaimungal and donating with whatever you like. There's also PayPal, there's also crypto, there's also just joining on YouTube. Again, keep in mind, it's support from the sponsors and you that allow me to work on toe full time. You also get early access to ad free episodes, whether it's audio or video, it's audio in
Starting point is 02:19:35 the case of Patreon, video in the case of YouTube. For instance, this episode that you're listening to right now was released a few days earlier. Every dollar helps far more than you think. Either way, your viewership is generosity enough. Thank you so much.
