Muscle for Life with Mike Matthews - Cal Newport on the Future of AI and Knowledge Work
Episode Date: June 12, 2024

Is artificial intelligence a quantum leap forward for humanity? The key to world peace, the cure for disease and aging, and the springboard to the abundant, leisure-filled future depicted in science fiction novels? Or is it the death knell for humanity as we know it? Or something in between? In this episode, I talk with Cal Newport, a renowned computer science professor, author, and productivity expert, to delve into the complex landscape of AI. Cal has been studying AI and its ramifications for humanity since long before it was cool, and he has a number of counterintuitive thoughts on the pros and cons of this new technology, how to get the most out of it right now, and what the future of AI will look like. In this interview, you'll learn . . .

How to use existing AI tools like ChatGPT and Claude to be more productive and successful in your work and personal life
The pros and cons of existing AI tools
What the future of AI development may look like
How to use AI to stay competitive in the modern workplace
And more . . .

So, if you're curious about how AI is shaping our world and what you should do right now to get and stay ahead of the curve, click play and join the conversation!

Timestamps:

(3:44) The current and future impact of AI on life and work
(10:52) The limitations and inefficiencies of current LLMs
(15:37) The future of LLMs
(18:56) The benefits of a "multi-agent approach"
(28:15) Will AI lead to massive job losses?
(33:16) How AI will become essential in the modern workplace
(36:51) How will AI change the "rhythm" of work?
(44:27) The future of AI in knowledge work
(50:31) The problems with the "Oracle" model of AI
(58:37) How LLMs will advance AI
(1:07:39) How Cal uses LLMs in his work
(1:09:56) What AI innovations are set to benefit writers the most?
(1:12:38) AI's future role in information gathering
(1:13:52) How Mike uses AI in his work
(1:20:54) Where can people find Cal's work?

Mentioned on the Show:

Cal Newport's Website
Cal Newport's New Yorker Archive
Cal Newport's New Book: Slow Productivity: The Lost Art of Accomplishment Without Burnout
Cal Newport's previous podcast appearance
Cal Newport's Deep Questions Podcast
The Legion Diet Quiz
Buy Legion Whey+
Legion Body Transformation Coaching
Transcript
Language model-based tools like ChatGPT or Claude, again, they're built only on understanding language and generating language based on prompts.
Mainly how that's being applied, and I'm sure this has been your experience, Mike, in using these tools, is that it can speed up things that, you know, we were already doing.
Help me write this faster.
Help me generate more ideas than I'd be able to come up with, you know, on my own.
Help me summarize this document.
It's sort of speeding up tasks. But none of that is "my job doesn't need to exist," right? The Turing test we should care about is when can an AI empty my email
inbox on my behalf, right? And I think that's an important threshold because that's capturing
a lot more of what cognitive scientists call functional intelligence, right? And I think
that's where a lot of the prognostications of big impacts get more interesting.
Hello, and welcome to another episode of Muscle for Life. I'm your host, Mike Matthews. Thank you
for joining me today for something a little bit different than the usual here on the podcast.
Something that may seem a little bit random,
which is AI,
but although I selfishly wanted to have this conversation because I find the topic and the technology interesting
and I find the guest interesting,
I'm a fan of his work.
I also thought that many of my listeners
may like to hear the discussion as well
because if they are not already using AI to improve their
work, to improve their health and fitness, to improve their learning, to improve their
self-development, they should be and almost certainly will be in the near future. And so
that's why I asked Cal Newport to come back on the show and talk about AI. And in case you are
not familiar with Cal, he is a renowned computer science professor, author, and productivity expert.
And he's been studying AI and its ramifications for humanity long before it was cool. And in this
episode, he shares a number of counterintuitive thoughts on the pros and cons of this new technology,
how to get the most out of it right now, and where he thinks it is going to go in the future.
Before we get started, how many calories should you eat to reach your fitness goals faster?
What about your macros? What types of food should you eat? And how many
meals should you eat every day? Well, I created a free 60 second diet quiz that will answer
those questions for you and others, including how much alcohol you should drink, whether you should
eat more fatty fish to get enough omega-3 fatty acids, what supplements are worth taking and why, and more. To take the quiz and
get your free personalized diet plan, go to muscleforlife.show slash diet quiz, muscleforlife.show
slash diet quiz now, answer the questions and learn what you need to do in the kitchen to lose fat,
build muscle, and get healthy.
Hey, Cal, thanks for taking the time to come back on the podcast.
Yeah, no, it's good to be back.
Yeah, I've been looking forward to this, selfishly, because I'm personally very interested
in what's happening with AI. I use it a lot in my work. It's basically my little digital assistant now. And because so much of my work these days is creating content of
different kinds. It's just doing things that require me to create ideas, to think through
things. And I find it very helpful. But of course, it's also, there's a lot of controversy over it.
And I thought that might be a good place to start. So the first question I'd
like to give to you is, so everyone listening has heard about AI and what's happening to some
degree, I'm sure. And there are a few different schools of thought from what I've seen in terms
of where this technology is and where it may go in the future. There are people who think that it
may save humanity, it may usher in a new renaissance, it may dramatically reduce the
cost of producing products and services, new age of abundance, prosperity, all of that.
And then there seems to be the opposite camp who think that it's more likely to destroy everything and possibly even just
eradicate humanity altogether. And then there also seems to be a third philosophy, which is
kind of just a meh, like the most likely outcome is probably going to be disappointment. It's not
going to do either of those things. It's just going to be a technology that is useful for
certain people under certain circumstances.
And it's just going to be another tool, another digital tool that we have.
I'm curious as to your thoughts, where do you fall on that multipolar spectrum?
Well, you know, I tend to take the Aristotelian approach here. When we think about Aristotelian ethics, where he talks about the real right target tends to be between extremes, right? So when you're trying to figure out about particular character traits, Aristotle would say, well, you don't want to be foolhardy, but you also don't want to be a coward. And in the middle is the golden mean, he called it. That's actually where I think we are probably with AI. Yes, we get
reports of it's going to take over everything in a positive way, new utopia. This is sort of an Elon
Musk, I would say, endorsed idea right now. Horowitz as well. Andreessen Horowitz, Mark Andreessen.
Yes, that's true. That's right. But Andreessen Horowitz, you got to take them with a grain of
salt because their goal is they need massive new markets in which to put capital, right? So,
you know, we're like two years out from Andreessen Horowitz really pushing a crypto-driven internet
was going to be the future of all technology because they were looking for plays and that
kind of died down. But yeah, Musk is pushing it too. I don't think we have evidence
right now to support this sort of utopian vision. The other end, you have the p(doom)-equals-one
vision of the Nick Bostrom superintelligence. Look, this is already out of control and it's
going to recursively improve itself until it takes over the world. Again, like most computer
scientists I know aren't sweating that right now either. I would probably go with something, if I'm going to use your scale,
let's call it meh plus, because I don't think it's meh, but I also don't think it's one of
those extremes. If I had to put money down, and it's dangerous to put money down on something
that's so hard to predict, you're probably going to have a change maybe on the scale of something
like the internet, the consumer internet. Let's think about that for a little bit, right? I mean, that is a transformative technological
change, but it doesn't play out with the drasticness that we like to envision or we're
more comfortable categorizing our predictions. Like, when the internet came along, it created
new businesses that didn't exist before. It put some businesses out of business. For the most part, it changed the way, like the business we were already doing, we kept doing it, but it changed what the day-to-day reality of that was.
Professors still profess, car salesmen still sell cars.
But it's like different now, you have to deal with the internet, it kind of changed the day-to-day.
That's probably like the safest bet for how the generative AI revolution, what that's going to lead to, is not necessarily a drastic wholesale redefinition of what we mean by work or what we do for work, but perhaps a pretty drastic change to the day-to-day composition of these efforts.
Just like someone from 25 years ago wouldn't be touching email or Google in a way that a knowledge worker today is going to be constantly touching those tools.
But that job might be the same job that was there 25 years ago. It just feels different how it unfolds.
That's, I think, the safe bet right now. That aligns with something Altman said in a
recent interview I saw where, to paraphrase, he said that he thinks now is the best time to
start a company since the advent of the internet, if not the entire history of technology because of what he thinks people
are going to be able to do with this technology. I also think of, he has a bet with, I forget,
a friend of his on how long it'll take to see the first billion dollar market cap on a
solopreneur's business, basically. Just a one-man business. I mean, obviously it would be in tech.
It'd be some sort of next big app or something that was created, though, by one dude with AI. Billion-dollar-plus
valuation. Yeah. And you know, that's possible because if we think about, for example, Instagram.
Yep. Great example. I think they had 10 employees when they sold, right?
It's 10 or 11 and they sold for right around a billion dollars. Right. So.
And how many of those 10 or 11 were engineers just doing engineering that AI could do?
Yep, that was probably four.
Yeah, and so, right, one AI-enhanced programmer.
I think that's an interesting bet to make.
That's a smarter way, by the way, to think of this from an entrepreneurial angle,
making sure you're leveraging what's newly
made possible by these tools in pursuing whatever business seems like in your sweet spot and seems
like there's a great opportunity, as opposed to what I think is a dangerous play right now,
is trying to build a business around the AI tools themselves in their current form, right? Because
one of a collection of takes I've been developing about where we are right now with consumer-facing AI, and one of the strong takes, is that the existing form factor of generative AI tools, which is essentially a chat interface, where I give carefully engineered prompts to get language model based tools to produce useful text, might be more fleeting than we think. That's a step towards more
intricate tools. So if you're building a startup around using text prompts to an LLM, you may
actually be building around the wrong technology. You're building around not necessarily where
this is going to end up in its widest form. And we know that in part because these chatbot-based
tools have been out for about a year and a half now, November 2022 would be the debut of ChatGPT, and work has not been transformed by the tools as they're designed
right now, which tells us this form factor of copying and pasting text into a chat box
is probably not going to be the form factor that's going to deliver the biggest disruptions.
We sort of need to look down the road a little bit about how we're going to build on top
of this capability.
This is not going to be the way I think like the average knowledge worker ultimately is
going to interact, is not going to be typing into a box at chat.openai.com. I think this is a sort of preliminary stepping stone in this technology's development.
Something I've noticed with other people who also use it is that the quality of its outputs is highly dependent on the quality of the inputs, the person using it. And while it really excels in verbal intelligence, general reasoning, not so much. I saw something recently that Claude 3 scored about 100 or so on a general IQ test,
which was delivered the way you would deliver it to a blind person.
Whereas for GPT on that same test, it was an informal paper of sorts, GPT's
general IQ was maybe 85 or something like that. Verbal IQ though, very high. So GPT,
according to a couple of analyses, scores somewhere in the 150s on verbal IQ. And so
what I've seen is it takes an above average verbal IQ in an individual to get a lot of utility
out of it in its current form factor. And so I've seen that as just a limiting factor.
Even if somebody, if they haven't spent a lot of time dealing with language, they struggle
to get to the outcomes that it is capable of producing.
But you can't just give it something vague, kind of, "This is kind of what I want. Can you just do this for me?"
Like you need to be very particular, very deliberate.
Sometimes you have to break down what you want into multiple steps and walk it through. So it's just echoing what you were saying there is for it to really make major disruptions,
it's going to have to get beyond that because most people are not going to be able to 100x
their productivity with it.
They just won't.
Yeah.
Well, look, I'm working right now.
As we talk, I'm writing a draft of a New Yorker piece on using AI for writing.
One of the just universally agreed-on axioms of people who study this is that a language model can't produce writing that is of higher quality than the person using the language model is already capable of doing.
And with some exceptions, right? Like, if you're not a natural English speaker, if English is not your first language. But otherwise, you have to be the taste function. Like, is this good? Is this not good? Here's what this is missing. In fact, one of the interesting conclusions, preliminary conclusions that's coming from the work I'm doing on this is that, like, for students who are using language models with paper writing, it's not saving them time. I think we have this idea that it's going to be a
plagiarism machine, like write this section for me and I'll lightly edit it. It's not what they're
doing. It's way more interactive, back and forth. What about this? Let me get this idea. It's as
much about relieving the psychological distress of facing the blank page as it is about trying
to speed up or produce or automate part of this effort. There's a bigger point here. I'll make some big takes. Let's take some big swings here. There's a bigger point I want to
underscore, which, as you mentioned, like, Claude is not good at reasoning. You know,
GPT-4 is better than GPT at reasoning, but, you know, not even like a moderate human level of
reasoning. But here's the bigger point I've been making recently. The idea that we want to build large language models,
big enough that just as like an accidental side effect, they get better at reasoning,
is like an incredibly inefficient way to have artificial intelligence do reasoning.
The reasoning we see in something like GPT-4, which there's been some more research on,
it's like a side effect of this language model trying to be very good at producing reasonable text, right?
The whole model is just trained on,
you've given me a prompt,
I want to expand that prompt in a way that makes sense
given the prompt you gave me.
And it does that by generating tokens, right?
Given the text that's in here so far,
what's the best next part of a word or word to output next?
And that's all it does.
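To make that token-by-token loop concrete, here is a minimal sketch of autoregressive generation; the small GPT-2 model is just a stand-in for any language model, and greedy decoding is the simplest possible version of "pick the best next token."

```python
# A minimal sketch of the loop being described: given the text so far, the model
# only scores possible next tokens; we pick one, append it, and repeat.
# (Illustrative only; GPT-2 is a stand-in for any autoregressive language model.)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The key to playing Diplomacy well is", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):                              # generate 40 tokens, one at a time
        logits = model(ids).logits[0, -1]            # scores for every possible next token
        next_id = torch.argmax(logits).view(1, 1)    # greedy: take the single best token
        ids = torch.cat([ids, next_id], dim=1)       # append it and go around again

print(tokenizer.decode(ids[0]))
```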
Now, in winning this game of producing text
that actually makes sense,
it has had to implicitly encode some reasoning
into its wiring
because sometimes to actually expand text,
if that text is capturing
some sort of logical puzzle in it,
to expand that text in a logical way,
it has to do some reasoning.
But this is a very inefficient way of doing reasoning,
to have it be as a side effect
of building a really good token generation machine. Also, you have to make these things huge just to get that as a side
effect. GPT-3.5, which powered the original ChatGPT, and which had probably around 100 billion parameters, maybe 170 billion parameters, could do some of this reasoning, but it wasn't very good. When they went to a trillion-plus parameters for GPT-4, this sort of accidental implicit reasoning that was built into it got a lot better, right? But we're making these things huge. This is
not an efficient way to get reasoning. So what makes more sense? And this is my big take. It's
what I've been arguing recently. I think the role of language models in particular is going to
actually focus more on understanding language.
What is it that someone is saying to me?
What the user is saying?
What does that mean?
Like, you know, what are they looking for?
And then translating these requests into the very precise formats
that other different types of models and programs
can take as input and deal with.
And so like, let's say, for example, you know,
there's mathematical reasoning, right?
And we want to have help from an AI model to solve complicated mathematics.
The goal is not to keep growing
a large language model large enough
that it has seen enough math
that it kind of implicitly gets better and better at it.
Actually, we have really good
computerized math solving programs
like Mathematica, Wolfram's program.
So what we really want is a language
model to recognize you're asking about a math problem, put it into the precise language that
another program can understand, have that program do what it does best. And it's not an emergent
neural network. It's more hard code. Let it solve the math problems. And then you can give the
result back to the language model with a prompt for it to tell you, here's what the answer is.
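As a rough illustration of that division of labor, here is a toy sketch: the language model's only jobs are to translate the question into a formal equation and to phrase the result, while a conventional solver (SymPy here, standing in for something like Mathematica) does the actual math. The llm() function is a hypothetical stand-in for a real chat-completion call.

```python
# A sketch of the "language model as translator" pattern: the LLM only turns a
# natural-language request into a formal expression, a hard-coded solver does the
# actual reasoning, and the LLM could then phrase the result for the user.
import sympy

def llm(prompt: str) -> str:
    # Stand-in for a real language-model call; hard-coded so the sketch runs.
    return "solve: x**2 - 5*x + 6 = 0"

def answer_math_question(question: str) -> str:
    formal = llm(f"Rewrite this as a single equation to solve: {question}")
    equation = formal.split("solve:", 1)[1].strip()           # e.g. "x**2 - 5*x + 6 = 0"
    lhs, rhs = (sympy.sympify(s) for s in equation.split("="))
    roots = sympy.solve(sympy.Eq(lhs, rhs))                   # the hard-coded reasoning step
    return f"The solutions are {roots}."                      # an LLM could phrase this more nicely

print(answer_math_question("Which numbers squared, minus five times themselves, plus six, equal zero?"))
```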
This is the future I think we're going to see is many more different types of models that do different types
of things that we would normally do in the human head. Many of these models not emergent, not just
trained neural networks that we have to just study and see what they can do, but very explicitly
programmed. And then these language models, which are so fantastic at translating between languages
and understanding language, sort of being kind of at the core of this, taking what we're saying as users in natural language, turning it into the language of these
ensembles of programs, getting the results back and transforming it back to what we can understand.
This is a way more efficient way of having much broader intelligences, as opposed to growing a
token generator larger and larger, that it just sort of implicitly gets okay at some of these
things. It's just not an efficient way to do it. The multi-agent approach could get to something that would maybe appear to be an AGI-like experience, even though it still may not be one in the sense of, to come back to something you commented on, understanding the answer as opposed to just regurgitating probabilistically correct text. I think a good example of that is the latest round of Google gaffes, Gemini gaffes,
where it's saying to put glue in the cheese of the pizza, eat rocks, bugs crawl up your penis
hole, that's normal, all these things, right? Where the algorithm says, yeah, here, here's the
text, spit it out, but it doesn't understand what it's saying in the way that a human does,
because it doesn't reflect on that and go, well, wait a minute. No, we definitely don't want to be
putting glue on the pizza. And so to your point, for it to reach that level of human-like awareness,
I don't know where that goes. I don't know enough about the details. You
probably would be able to comment on that a lot better than I would. But the multi-agent approach,
that anyone can understand where if you build that up, you make that robust enough,
it can reach a level where it seems to be highly skilled at basically everything. And it goes beyond the current generalization,
generally not that great at anything other than outputting grammatically perfect text
and knowing a bit of something about basically everything.
Yeah. Well, I mean, let me give you a concrete example, right? I wrote about this in a New
Yorker piece I published in March. And I think it's an important point, right? A team from Meta set out to build an AI
that could do really well at the board game diplomacy.
And I think this is really important
when we think about AGI,
or just more in general,
like human-like intelligence in a very broad way,
because the diplomacy board game,
you know, if you don't know it,
is partially like a Risk-type strategy war game. You know, you move figures on a board. It takes place in
World War I era Europe and you're trying to take over countries or whatever. But the key to
diplomacy is that there's this human negotiation period. At the beginning of every turn, you have
these private one-on-one conversations with each of the other players and you make plans and alliances and you also double cross and you make a fake alliance with this player so that they'll
move their positions out of a defensive position so that this other player that you have a secret
alliance with can come in from behind and take over this country. And so it's really considered
like a game of realpolitik, human-to-human skill. There was this rumor that Henry Kissinger would play diplomacy in the Kennedy White House
just to sharpen his skill
of how do I deal with all these world leaders.
So when we think of AI from a perspective of like,
ooh, this is getting kind of spooky what it can do,
winning at a game like diplomacy is exactly that.
Like it's playing against real players
and pitting them against themselves
and negotiating to figure out how to win.
They built a bot called Cicero
that did really well.
They played it on an online diplomacy
chat-based, text-based chat diplomacy server
called DiplomacyNet.
And it was winning, you know,
two-thirds of its games
by the time they were done.
So I interviewed some of the developers
for this New Yorker piece.
And here's what's interesting about it. Like the first thing they did is they took a language model,
and they trained it on a lot of transcripts of diplomacy games. So it was a general language
model, and then they extra trained it with a lot of data on diplomacy games. Now, you could ask
this model, you could chat with it, like, what do you want to do next? But, you know, it would output, these are reasonable descriptions
of diplomacy moves,
given like what you've told it so far
about what's happening in the game.
And in fact, probably it's learned enough
about seeing enough of these examples
and how to generate reasonable text
to expand a transcript of a diplomacy game.
There'll be moves that like match
where the players actually are.
Like they make sense,
but it was terrible at playing diplomacy.
It was just reasonable stuff.
Here's how they built a bot that could win at diplomacy.
They said, oh, we're going to code a reasoning engine,
a diplomacy reasoning engine.
And what this engine does, if you give it a description
of where all the pieces are on the board and what's going on
and what requests you on and what request
you have from different players like what they want you to do it can just simulate a bunch of
futures like okay let's see what would happen if russia is lying to us but we go along with this
plan what would they do oh you know three or four moves from now we could really get in trouble
well what if we lied to them and then they did that? So you're simulating the future and none of this is like emergent. Yeah, it's like Monte
Carlo type. It's a program. Yeah. Monte Carlo simulations. Exactly. And like, we've just
hardcoded this thing. And so what they did is that a language model talked to the players.
So if you're a player, you're like, okay, hey, Russia, here's what I want to do. The language
model would then translate what they were saying into like a very formalized language
that the reasoning model understands,
a very specific format.
The reasoning model would then
figure out what to do.
It would tell the language model
with a big prompt
and it would add a prompt to it.
Like, okay, we want to like
accept France's proposal,
like generate a message
to try to get it to like
accept the proposal
and let's like deny the proposal
for Italy or whatever.
And then the language model, which had seen a bunch of diplomacy games, is prompted, write this in the style of a diplomacy game. And it would sort of output the text that would get sent to the users.
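A rough sketch of that split, with the language model at the edges and a hand-coded engine in the middle, might look like the toy code below. The function names, the toy scoring, and the Monte-Carlo-style loop are all illustrative assumptions, not Meta's actual Cicero code.

```python
# A toy sketch of the Cicero-style split: the language model only translates
# between free-form player messages and a formal move language, while a
# hand-coded engine decides what to do by simulating possible futures.
import random

def llm_parse(message: str) -> dict:
    # Stand-in for the language model turning chat into a formal proposal.
    return {"from": "FRANCE", "proposal": "support_attack_on_GERMANY"}

def simulate_future(proposal: dict, assume_honest: bool, turns: int = 4) -> float:
    # Stand-in for the hand-coded reasoning engine: roll the game forward a few
    # turns under one assumption and score how good the position looks for us.
    score = 0.0
    for _ in range(turns):
        score += random.uniform(0, 1) if assume_honest else random.uniform(-1, 0.5)
    return score

def decide(proposal: dict) -> str:
    # Monte-Carlo style: average many simulated futures under each assumption.
    honest = sum(simulate_future(proposal, True) for _ in range(200)) / 200
    lying = sum(simulate_future(proposal, False) for _ in range(200)) / 200
    # Hand-coded policy: only accept if it looks good even if they might be lying.
    return "ACCEPT" if min(honest, lying) > 0 else "DECLINE"

def llm_reply(decision: str, proposal: dict) -> str:
    # Stand-in for the language model writing the message "in the style of a
    # diplomacy game" from the engine's decision.
    return f"{proposal['from']}, we {decision.lower()} your proposal to {proposal['proposal']}."

incoming = "Hey, France here. Support my attack on Germany and I'll back you in Burgundy."
proposal = llm_parse(incoming)
print(llm_reply(decide(proposal), proposal))
```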
That did really well. Not only did that do well, none of the users, they surveyed them after
the fact, or I think they looked at the forum discussions. None of them even knew they were
playing against a bot. They thought they're playing against another human. And this thing
did really well, but it was a small language model. It was an off-the-shelf research language model,
nine billion parameters or something like that, and this hand-coded engine, right? That's the
power of the multi-agent approach. But there's also an advantage to this approach. So I call
this intentional AI or IAI. The advantage of this approach is that we're no longer staring at these
systems like an alien
mind and we don't know what it's going to do. Because the reasoning now, we're coding this
thing. We know exactly how this thing is going to decide what moves to make, because we programmed the
diplomacy reasoning engine. And in fact, and here's the interesting part about this example,
they decided they didn't want their bot to lie. That's a big strategy in diplomacy, but they didn't want the bot to lie to human players for various ethical reasons. And because they were hand-coding the
reasoning engine, they could just code it to never lie. So, you know, when you don't try to have all
of the sort of reasoning decision-making happen in this sort of obfuscated, unpredictable,
uninterpretable way within a giant neural network, but you have more of the reasoning just programmed explicitly working with this great language model, now we have a lot
more control over what these things do. Now we can have a diplomacy bot, hey, it can beat human
players. That's scary, but it doesn't lie because actually all the reasoning, there's nothing
mysterious about it. It's just like we do with a chess playing bot. We simulate lots of different
sequences of moves to see which one's going to end up best. It's not obfuscated. It's not unpredictable. And it can't be jailbroken.
There's no jailbreaking. We programmed it. Yeah. So this is the future I see with multi-agent.
It's a mixture of when you have generative AI, so if you're generating text or understanding text
or producing video or producing images, these very large neural network-based models are really, really good at this. And we don't exactly know how they operate,
and that's fine. But when it comes to planning or reasoning or intention or the evaluation of
which of these plans is the right thing to do or of the evaluation of is this thing you're going
to say or do correct or incorrect, that can actually all be super intentional, super transparent, hand-coded.
There's nothing here to escape when we think about this way.
So I think IAI gives us a powerful vision of an AI future, especially in the business context, but also a less scary one.
Because the language models are kind of scary in the way that we just train this thing for $100 million over months. And then
we're like, let's see what it can do. I think that rightly freaks people out. But this multi-agent
model, I don't think it's nearly as sort of Frankenstein's monster as people fear AI sort of
has to be. One of the easiest ways to increase muscle and strength gain is to eat enough protein
and to eat enough high quality
protein. Now you can do that with food, of course, you can get all of the protein you need from food,
but many people supplement with whey protein because it is convenient and it's tasty and that
makes it easier to just eat enough protein. And it's also rich in essential amino acids, which are crucial for muscle building. And it's digested well.
It's absorbed well.
And that's why I created Whey Plus, which is a 100% natural grass-fed whey isolate protein
powder made with milk from small sustainable dairy farms in Ireland.
Now, why whey isolate?
Well, that is the highest quality whey protein you can buy. And that's why
every serving of Whey Plus contains 22 grams of protein with little or no carbs and fat.
Whey Plus is also lactose-free, so that means no indigestion, no stomach aches, no gassiness.
And it's also 100% naturally sweetened and flavored, and it contains no artificial food dyes or other chemical junk.
And why Irish dairies?
Well, research shows that they produce some of the healthiest, cleanest milk in the world.
And we work with farms that are certified by Ireland's Sustainable Dairy Assurance Scheme, SDAS, which ensures that the farmers adhere to best practices in animal welfare,
sustainability, product quality, traceability, and soil and grass management. And all that is why I have sold over 500,000 bottles of Whey Plus and why it has over 6,000 four and five star reviews
on Amazon and on my website. So if you want a mouth-watering, high-protein,
low-calorie whey protein powder
that helps you reach your fitness goals faster,
you want to try Whey Plus today.
Go to buylegion.com slash whey.
Use the coupon code MUSCLE at checkout,
and you will save 20% on your first order.
And if it is not your first order,
you will get double reward points. And that is 6% cash back. And if you don't absolutely love
Whey+, just let us know and we will give you a full refund on the spot. No form, no return
is even necessary. You really can't lose. So go to buylegion.com slash whey now,
use the coupon code MUSCLE at
checkout to save 20% or get double reward points, and then try Whey+ risk-free and see what you
think. Speaking of fears, there's a lot of talk about the potential negative impacts on people's
jobs on economies. Now you've expressed some skepticism about the
claims that AI will lead to massive job losses, at least in the near future. Can you talk a little
bit about that for people who have that concern as well? Because they've read maybe that their job
is on the list that AI is replacing whatever this is in the next X number of years, because you see a lot of that.
Yeah, no, I think those are still largely overblown right now. I don't like the methodologies of
those studies. And in fact, it's kind of ironic, one of the big early studies that gave specific numbers for like what part of the economy is going to be automated. Ironically,
their methodology was to use a language model to categorize
whether each given
job was something that a language model
might one day automate.
So it's this interesting methodology. It was very
circular. So here's where we are now. Where we are
now, language model-based tools like
ChatGPT or Claude, again, they're built only
on understanding language and generating language
based on prompts. Mainly
how that's being applied,
I'm sure this has been your experience, Mike, in using these tools, is that it can speed up things that we were already doing. Help me write this faster, help me generate more ideas than I'd be able to come up with on my own. Help me summarize this document. It's sort of speeding up tasks.
Help me think through this. Here's what I'm dealing with.
Am I missing anything?
I find those types of discussions very helpful.
And that's another aspect that's been helpful.
That's what we're seeing with students as well.
It's interesting.
It's sort of more of a psychological than efficiency advantage.
It's humans are social.
There's something really interesting going on here where there's a rhythm of thinking
where you're going back and forth with another entity that somehow is a kind of a more
comfortable rhythm than just I'm sitting here, white knuckling my brain, trying to come up with
things. But none of that is "my job doesn't need to exist," right? So that's sort of where we are
now. It's speeding up certain things or changing the nature of certain things we're already doing.
I argued recently that the next step, like the Turing test we should care about, is when can
an AI empty my email inbox on my behalf, right? And I think that's an important threshold because
that's capturing a lot more of what cognitive scientists call functional intelligence, right?
So the cognitive scientists would say a language model has very good linguistic intelligence,
understanding producing language.
The human brain does that, but also has these other things called functional intelligences,
simulating other minds, simulating the future, trying to understand the implication of actions
on other actions, building a plan, and then evaluating progress towards the plan.
There's all these other functional intelligences that we break out as cognitive scientists.
Language models can't do that, but to empty an inbox, you need those, right?
For me to answer this email on your behalf,
I have to understand who's involved.
What do they want?
What's the larger objective that they're moving towards?
What information do I have
that's relevant to that objective?
What information or suggestion can I make
that's going to make the best progress
towards that objective?
And then how do I deliver that in a way
that's going to, actually understanding how they think about it and what they care about and what they know about, that's going to like best
fit these other minds? That's a very complicated thing. So that's going to be more interesting,
right? Because that could take more of this sort of administrative overhead off the plate of
knowledge workers, not just speeding up or changing how we do things, but taking things
off our plate, which is where things get interesting. That needs multi-agent models,
right? Because you have to have the equivalent of the diplomacy planning bot doing
sort of business planning. Like, well, what would happen if I suggest this and they do this, what's
going to happen to our project? It needs to have specific objectives programmed in. Like, in this
company, this is what matters. Here's the list of things I can do. And here's things that I, so now when I'm trying to plan what I suggest, I have like a hard-coded list of like, these are the
things I'm authorized to do in my position at this company, right? So we need multi-agent models for
the inbox clearing Turing test to be passed. That's where things start to get more interesting. And I
think that's where like a lot of the prognostications of big impacts get more interesting.
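As a toy illustration of what passing that inbox test might require architecturally, here is a minimal sketch: the language model interprets the email and drafts the reply, while a hand-coded planner only chooses from an explicit list of actions the agent is authorized to take. Every name and rule here is hypothetical.

```python
# A toy sketch of the "inbox Turing test" architecture being described: the
# language model only interprets the email and drafts text, while a hand-coded,
# auditable planner chooses among an explicit list of authorized actions.
AUTHORIZED_ACTIONS = ["schedule_meeting", "send_status_update", "escalate_to_human"]

def llm_extract_request(email: str) -> str:
    # Stand-in for a language model classifying what the sender actually wants.
    return "schedule_meeting" if "meet" in email.lower() else "send_status_update"

def plan(request: str) -> str:
    # Hand-coded policy: never do anything outside the authorized list.
    return request if request in AUTHORIZED_ACTIONS else "escalate_to_human"

def llm_draft_reply(action: str, email: str) -> str:
    # Stand-in for the language model turning the chosen action into polite prose.
    drafts = {
        "schedule_meeting": "Happy to meet. Does Thursday at 2pm work?",
        "send_status_update": "Here's where the project stands as of today...",
        "escalate_to_human": "Looping in a teammate who can help with this.",
    }
    return drafts[action]

email = "Can we meet this week to go over the launch plan?"
print(llm_draft_reply(plan(llm_extract_request(email)), email))
```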
Again, though, I don't know that it's going to eliminate large swaths of the economy, but it might really change the character
of a lot of jobs. Sort of, again, similar to the way the internet or Google or email really change
the character of a lot of jobs versus what they were like before, really changing what the day-to-day
rhythm is. Like we've gotten used to in the last 15 years, work is a lot of sort of unstructured
back and forth communication that sort of our day is built on email, Slack and meetings.
Work five years from now, if we cross the inbox Turing test might feel very different
because a lot of that coordination can be happening between AI agents and it's going
to be a different feel for work. And that could be substantial, but I still don't see
that as, you know, knowledge work goes away, that knowledge work is like building, you know, water-run mills or horse and buggies. It sounds like AI is going to become table stakes if you are a knowledge worker, which would also include, I think, creative work of any kind, and that we could have a scenario where information slash knowledge slash idea workers, whatever, with AI, it's just going to get to a point where they can outproduce quantitatively and qualitatively their peers on average who do not have or who do not use AI, so much so that a lot of the latter group will not have employment in that capacity if they don't adopt the technology and change. Yeah, I mean, I think it's like internet-connected PCs, right?
Like, eventually,
everyone had, in knowledge work,
had to adopt and use these.
Like, you couldn't survive
by, like, the late 90s.
You're like, I'm just at too big
of a disadvantage
if I'm not using
an internet-connected computer, right?
You can't email me.
I'm not using word processors.
We're not using digital graphics and presentations. You had to adopt that technology. We saw a similar
transition, if we want to go back 100 years, to the electric motors and factory manufacturing.
There was like a 20-year period where we weren't quite sure. We were uneven in our integration of
electric motors into
factories that before were run by giant steam engines that would turn an overhead shaft and
all the equipment would be connected to it by belts. But eventually, and there's a really nice
business case written about this, that's sort of often cited, eventually, you had to have small
motors on each piece of equipment because it was just, you're still building the same things.
And like the equipment was functionally the same. You're, whatever, you're sewing shirts or pants,
right? You're still a factory making pants. You still have sewing machines, but you eventually
had to have a small motor on every sewing machine connected by electrical cable to a dynamo because
that was just so much more of an efficient way to do this than to have a giant overhead single speed
crankshaft on which everything was connected
by belts, right? So we saw that in knowledge work already with internet connected computers.
If we get to this sort of functional AI, this functional intelligence AI, I think it's going
to be unavoidable, right? Like, I mean, one way to imagine this technology, I don't exactly know
how it'll be delivered, but one way to imagine it is something like a chief of staff,
right? So like if you're a president or a tech company CEO, you have a chief of staff that sort
of organizes all the stuff so that you can focus on what's important. Like the president of the
United States doesn't check his email inbox. He'd be like, what do I work on next, right?
That sort of Leo McGarry character is like, all right, here's who's coming in next.
Here's what you need to know about it. Here's the information. We got to make a decision on like whether to deploy troops. You do that. Okay, now here's what's happening next. You can imagine a
world in which AIs play something like that role. So now things like email, a lot of what we're
doing in meetings, for example, that gets taken over more by the digital chief of staff, right?
They gather what you need,
they coordinate with other AI agents to get you the information you need, they deal with the
information on your behalf, they deal with the sort of software programs that like make sense
of this information or calculate this information, they sort of do that on your behalf. We could be
heading more towards a future like that, a lot less administrative overhead and a lot more sort of undistracted thinking or that sort of cognitive focus.
That will feel very different.
Now, I think that's actually a much better rhythm of work than what we evolved into over
the last 15 years or so in knowledge work.
But it could have interesting side effects because if I can now produce 3x more output
because I'm not on email all day, well, that changes up the economic nature of my particular sector because technically we only need a third of me now to capture the sort of surplus cognitive capacity.
We just sort of have a lot more raw brain cycles available. We don't have everyone
sending and receiving emails once every four minutes, right? And so we're going to see more,
I think, probably injection of cognitive cycles into other parts of the economy where I might now
have someone hired that like helps me manage a lot of like the paperwork in my household,
like things that just require,
because there's going to be this sort of excess cognitive capacity. So we're going to have sort
of more thinking on our behalf. It's a hard thing to predict, but that's where things get interesting.
I think email is a great example of necessary drudgery. And there's a lot of other necessary
drudgery that will also be able to be offloaded. I mean, an example is the CIO of my sports nutrition company who oversees all of our
tech stuff and has a long list of projects he's always working on.
He is heavily invested now in working alongside AI.
And I think he likes GitHub Copilot the most.
And he's kind of fine-tuned it on how he likes to code and everything. And he said a couple things. One, he estimates that his personal productivity is at least 10 times higher. And he is not a sensationalist. That's like a conservative estimate with his coding. And then he also has commented that something he loves about it is it automates a lot of drudgery.
Code where, typically, okay, you have to kind of reproduce something you've already done before.
And that's fine.
You can take what you did before, but you have to go through it and you have to make changes.
And you know what you're doing, but it's, but it's boring and it can take a lot of
time. And he said, now he spends very little time on that type of work because the AI is great at
that. And so the time that now he gives to his work is more fulfilling and ultimately more
productive. And so I can see that effect occurring in many other types of work.
I mean, just think about writing. Like you say, you don't ever have to deal with the scary blank
page. Not that that is really an excuse to not put words on the page, but that's something that
I've personally enjoyed is, although I don't believe in writer's block per se,
you can't even run into idea block, so to speak, because if you get there and you're not sure
where to go with this thought, or if you're even onto something, if you jump over to GPT and start
a discussion about it, at least in my experience, especially if you get it generating ideas,
and you mentioned this earlier, a lot of the ideas are bad and you just throw them away.
But always, always in my experience, I'll say always I get to something when I'm going through this kind of process, at least one thing, if not multiple things that I genuinely like,
that I have to say, that's a good idea. That gives me a spark. I'm going to take that and
I'm going to work with that.
Yeah, I mean, again, I think this is something
we don't, we didn't fully understand.
We still don't fully understand,
but we're learning more about,
which is like the rhythms of human cognition
and what works and what doesn't.
We've underestimated the degree to which
the way we work now, which is,
it's highly interruptive and solitary at the same time.
It's, I'm just trying to write this thing from
scratch, and that's like a very solitary task, but also like I'm interrupted a lot with like
unrelated things. This is a rhythm that doesn't fit well with the human mind. A focused collaborative
rhythm is something the human mind is very good at, right? So now if my day is unfolding with me
interacting back and forth with an agent, maybe that seems really artificial, but I think the reason why we're seeing this actually be useful to people is
it's probably more of a human rhythm for cognition. It's like I'm going back and forth with someone
else in a social context trying to figure something else, something out, and my mind can
be completely focused on this. You and I, or you as a bot in this case, we're trying to write this
article. And now that is more familiar, and I think that's why it feels like less of a strain
than I'm going to sit here and do this very abstract thing on my own, you know, just like staring
at a blank page. Programming, you know, it's an interesting example.
And I've been wary about trying to extrapolate too much from programming
because I think it's also a special case, right? Because what a language model
does do really well is they can produce text
that well matches the prompt that you gave
for like what type of text you're looking for.
And as far as a model is concerned,
computer code is just another type of text.
So it can produce,
if it's producing sort of like English language,
it's very good at following the rules of grammar.
And it's like, it's grammatically correct language.
If they're producing computer code,
it's very good at following the syntax
of programming languages.
This is actually like correct code that's going to run.
Now, language plays an important role
in a lot of knowledge work jobs, English language,
but it's not the main game.
It sort of supports the main things you're doing.
I have to use language to sort of like request
the information I need for what I'm producing.
I need to use language to like write a summary of the thing, the strategy I figured out.
So the language is a part of it, but it's not the whole activity.
And computer coding is the whole activity.
The code is what I'm trying to do.
Code that like produces something.
We just think of that as text that like matches a prompt.
Like the models are very good at that.
And more importantly, if we look at the knowledge work jobs
where the English text
is the main thing we produce,
like writers,
there typically we have these
incredibly fine-tuned standards
for what makes good writing good.
When I'm writing a New Yorker article,
it's very, very intricate.
It's not enough to be like,
this is grammatically correct language
that covers the relevant points
and these are good points.
It's like the sentence,
everything matters,
the sentence construction, the rhythm.
But in computer code, we don't have that.
The code has to be like reasonably efficient and run.
So like that, it's like a bullseye case
of getting the maximum possible productivity,
knowledge work productivity, out of a language model
is like producing computer code
as like a CIO for a company where it's like,
yeah, we need the right programs to do things. We're not trying to build a program that's going to have 100 million customers
and has to be like the super, like most efficient possible, like something that works and solves the
problem I want to solve. And there's no aesthetic dimension, although I suppose maybe
there'd be some pushback and that there can be elegant code and inelegant code, but it's not
anywhere to the same degree as when you're trying to write something that really resonates with other humans in a deep way and inspires different emotions and images and things.
Yeah, I think that's right. And elegant code is sort of the language equivalent of polished prose, which actually these language models do very well.
This is very polished prose. It doesn't sound amateur. There's no mistakes in it.
Yeah, that's often enough
unless you're trying to do something fantastical and new,
in which case the language models
can't help you with programming, right?
You're like, okay, I'm doing something completely different,
a super elegant algorithm that changes the way
like we compute something.
But most programming's not that, you know.
That's for the 10x coders to do.
So yeah, it's interesting. Programming is interesting. But for most other knowledge
work jobs, I see it more about how AI is going to get the junk out of the way of what the human
is doing, more so than it's going to do the final core thing that matters for the human.
And this is like a lot of my books, a lot of my writing is about digital knowledge work.
We have these modes of working that accidentally got in
the way of the underlying value-producing thing that we're trying to do in the company. The
underlying thing I'm trying to do with my brain is getting interrupted by the communication,
by the meetings, and that this is sort of an accident of the way digital knowledge work
unfolded. AI can unroll that, potentially unroll that accident, but it's not going to be GPT-5 that
does that. It's going to be a multi-agent
model where there's language models and hand-coded models and company-specific bespoke models that
all are going to work together. I really think that's going to be the future.
Maybe that's going to be Google's chance at redemption because they've made a fool of
themselves so far compared to OpenAI.
Even Perplexity, not to get off on a tangent, but by my lights,
Google Gemini should fundamentally work exactly the way that Perplexity works.
I now go to Perplexity just as often, if not more often.
If I want that type of, I have a question and I want an answer and I want sources
cited to that answer and I want more than one line, I go to Perplexity now.
I don't even bother with Google because Gemini is so unreliable with that.
But maybe Google will be the one to bring multi-agent into its own.
Maybe not.
Maybe it'll just be open AI.
They might be.
But yeah, I mean, then we say, okay, you know, I talked
about that bot that won at diplomacy
by doing this multi-agent approach. The lead
designer on that
got hired away from Meta. It was
OpenAI who hired him. So,
that's where he is now, Noam Brown. He's
at OpenAI, working, industry insiders suspect, on building exactly these sorts of
bespoke planning models to
connect the language models and extend the capability. Google Gemini also showed the
problem, too, of just relying on just making language models bigger and just having these
massive models do everything as opposed to the IAI model of, okay, we have specific logic and
these more emergent language understanders. Look what happened, you know, what was this, a couple months ago, with the controversy where they were trying to fine-tune these models to be more inclusive, and then it led to completely unpredictable, unintended results. Like refusing to show... Yeah, the Black Waffen-SS. Exactly. Or refusing to show the founding fathers as white.
The main message of that was kind of misunderstood.
I think that was somehow being understood by sort of political commentators as like
each of those, someone was programming somewhere like, don't show anyone as white or something
like that.
But no, what really happens is these models are very complicated.
So they do these fine tuning things.
You have these giant models
that take hundreds of million dollars to train.
You can't retrain them from scratch.
So now you're like,
we're worried about it being like showing,
defaulting to like showing
maybe like white people too often
when asked about these questions.
So we'll give it some examples
to try to nudge it in the other way.
But these models are so big and dynamic that, you know,
you go in there and just give it a couple examples of like,
show me a doctor and you kind of,
you give it a reinforcement signal to show a non-white doctor
to try to unbias it away from, you know, what's in this data.
But that can then ripple through this model in a way that now you get
the SS officers and the founding fathers, you know,
as American Indians or something like that.
It's because they're huge. And when you're trying to fine-tune a huge thing, you have like a small number of these fine-tuning examples, like 100,000 examples,
that have these massive reinforcement signals that fundamentally rewire the front and last
layers of these models and have these huge unpredictable dynamic effects. It just underscores
the unwieldiness of just trying to have a master
model that is huge, that's going to serve all of these purposes in an emergent manner.
It's an impossible goal. It's also not what any of these companies want. Their hope,
if you're OpenAI, if you're Anthropic, right, if you're Google, you do not want a world in which,
like, you have a massive model that you talk
to through an interface and that's everything. And this model has to satisfy all people in all
things. You don't want that world. You want the world where your AI, complicated combinations of
models, is in all sorts of different stuff that people do, in these much smaller form factors
with much more specific use cases. ChatGPT, it was an accident that that got so big.
It was supposed to be a demo
of the type of applications
you can build on top of a language model.
They didn't mean for ChatGPT
to be used by 100 million people, right?
It's kind of like we're in this,
that's why I say like don't overestimate
this particular,
the importance of this particular form factor for AI.
It was an accident that this is how we got exposed to what language models could do. It's not, people do not want to be in
this business of blank text box. Anyone everywhere can ask it everything. And this is going to be
like an Oracle that answers you. That's not what they want. They want like the GitHub co-pilot
vision. In the particular stuff I already do, AI is there
making this very specific thing better and easier or automating it. So I think they want to get away
from the mother model, the Oracle model that all things go through. This is a temporary step.
It's like accessing mainframes through teletypes before, you know, eventually we got personal
computers. This is not going to be the future
of our interaction with these things.
The Oracle blank text box
to which all requests go.
They're having so much trouble with this
and they don't want this to be.
It's, you know, I see these massive trillion-parameter models as just marketing, like,
look at the cool stuff we can do,
associate that with our brand name
so that when we're then offering
like more of these more bespoke tools
in the future that are all over the place,
you'll remember Anthropic because you remember Claude was really cool during this period where we were all using chatbots.
And we did the Golden Gate experiment. Remember how fun that was?
A good example of what you were just mentioning of how you can't brainwash the bots per se,
but you can hold down certain buttons and produce very strange outcomes.
For anyone listening, if you go check out, I think it's still live now. I don't know how long they're going to keep it up, but check out Anthropic's Claude Golden Gate
Bridge experiment and fiddle around with it. And by the way, think about this objectively.
There's another weird thing going on
with the Oracle model of AI, which again, why they want to get away from it. We're in this weird
moment now where we're conceptualizing these models sort of like important individuals.
And we want to make sure that like these individuals, like the way they express themselves
is proper, right? But if you zoom out, like this doesn't necessarily
make a lot of sense for something
to invest a lot of energy into.
Like you would assume people could understand
this is a language model.
It's this neural network that just like produces text
to expand stuff that you put in there.
You know, hey, it's going to say
all sorts of crazy stuff, right?
Because this is just a text expander,
but here's all these like useful ways you can use it,
but you can make it say crazy
stuff. Yeah. And if you want it to say whatever, nursery rhymes as if written by Hitler, whatever,
it's a language model that can do almost anything. And that's a cool tool. And we want to talk to you
about ways you can build tools on top of it. But we're in this moment where we got obsessed, where
we're treating it like it's an elected official or something. And the things it says somehow
reflects on the character of some sort of entity that actually exists. And so we
don't want this to say something. You know, it used to be, there's a whole interesting field,
an important field in computer science called algorithmic fairness, right? Or algorithmic bias.
And these are similar things where they look for, like, if you're using algorithms for making
decisions, you want to be wary of biases being unintentionally programmed into these algorithms,
right? This makes a lot of sense. The kind of the classic early cases were things like,
hey, you're using an algorithm to make loan approval decisions, right? Like,
I would give it all this information about the applicant and the model maybe is better than a
human at figuring out who to give a loan to
or not. But wait a second, depending on the data you train that model with, it might be actually
biased against people from certain backgrounds or ethnic groups in a way that is just an artifact
of the data. Like we got to be careful about that, right? Or in a way that may actually be
factually accurate and valid, but ethically unacceptable. And so you just make a determination.
Yeah. So, right. There could be, if this was just us as humans doing this, there's these
nuances and determinations we could have. And so we got to be very careful about having a black
box do it. But somehow we shifted that attention over to just chatbots producing text. They're not
at the core decisions.
The chatbot's text doesn't become canon. It doesn't get taught in schools. It's not used to make loan decisions. It's just a toy that you can mess with and it produces text.
But it became really important that the stuff that you get this bot to say has to
meet the standards of what we would have for an individual
human. And it's a huge amount of effort that's going into this, and it's really unclear why,
because, so what if I can make a chatbot say something very disagreeable? I can also
just say something very disagreeable. I can search the internet and find things very disagreeable.
Or you, exactly, you can go poke around on some forums about anything and
go spend some time on 4chan. And there you go. That's enough disagreeability for a lifetime.
So we don't get mad at Google for, hey, I can find websites written by preposterous people
saying terrible things, because we know this is what Google does. It just sort of indexes the web.
So there's a lot of effort going into trying to make this sort of Oracle model thing
kind of behave,
even though like the text
doesn't have impact.
There's like a big scandal
right before ChatGPT came out.
This was, I think it was Meta
had this language model Galactica
that they had trained
on a lot of scientific papers.
And they had this,
I think a really good use case,
which is if you're working
on scientific papers,
it can help speed up
writing sections of the papers. It's hard:
you get the results in science, but then writing the paper is like a pain,
where the real value is in doing the research, typically, right? And so, like, great, we've
trained on a lot of scientific papers, so it kind of knows the language of scientific papers. It can
help you, like, let's write the interpretation section. Let me tell you the main points, you put them
in the right language.
And then people were messing around
with this, like, hey,
we can get this to write
fake scientific papers.
Like a famous example was about,
you know, the history of bears in space.
And they got real spooked,
and they pulled it.
But like, in some sense,
it's like, yes, sure.
This thing that can produce
scientific sounding text
can produce papers
about bears in space.
I could write a fake paper about bears in space.
Like it's not adding some new harm to the world, but this tool would be very useful for like specific uses, right?
Like I want to make this section, help me write this section of my particular paper.
So when we have this like Oracle model of these, this Oracle conception of these machines, I think we anthropomorphize them into like they're an entity
and we want that.
And I created this entity as a company.
It reflects on me,
like what their values are
and the things they say.
And I want this entity to be
like sort of appropriate,
culturally speaking.
You could easily imagine,
and this is the way we thought
about these things pre-ChatGPT.
Hey, we have a model, GPT-3.
You can build applications on it
to do things.
That had been out for a year, like two years. You could build a chatbot on it, but you could build a bot on it
that just like, hey, produce fake scientific papers or whatever. But we saw it as a program,
a language generating program that you could then build things on top of. But somehow when we put it
into this chat interface, we think of these things as entities. And then we really care then about
the beliefs and behavior of the entities. It all seems so wasteful to me. Because we need to move past the chat interface era
anyways and start integrating these things directly into tools. No one's worried about
the political beliefs of GitHub's Copilot because it's focused on producing, filling in computer
code and writing drafts of computer code. Well, anyways, to try to summarize these various points
and sort of bring it to our look at
the future, essentially what I'm saying is that in this current era where the way we
interact with these generative AI technologies is through just like this single chat box
and the model is an oracle that we do everything through.
We're going to keep running into this problem where we're going to begin to treat this thing
as like an entity.
We're going to have to care about what it says and how it expresses itself
and whose team is it on. And a huge amount of resources have to be invested into this.
And it feels like a waste because the inevitable future we're heading towards is not one
of the all-wise oracle that you talk to through a chatbot to do everything,
but it's going to be much more bespoke where these networks of AI agents will be customized for various things we do, just like
GitHub Copilot is very customized at helping me in a programming environment to write computer code.
There'll be something similar happening when I'm working on my spreadsheet, and there'll be
something similar happening with my email inbox. And so right now, to be wasting so many resources on whether, you know, Claude or
Gemini or ChatGPT is politically correct, it's a waste of resources, because the role
of these large chatbots as oracles is going to go away anyway. So that's, you know, I am
excited. I am excited for the future where AI becomes, we splinter it and it becomes more responsive
and bespoke and it's directly working and helping with the specific things we're doing.
That's going to get more interesting for a lot of people because I do think for a lot
of people right now, the copying and pasting, having to make everything linguistic, having
to prompt engineer, that's a big enough stumbling block that it is impeding, I think, sector-wide disruption right now.
That disruption is going to be much more pronounced once we get the form factor of
these tools much more integrated into what we're already doing. And the LLM will probably be the
gateway to that because of how good it is at coding in particular and how much better it's
going to be. That is going to enable a lot of the coding work
of getting to these special-use-case multi-agents, probably to a degree that without it
just wouldn't be possible. It's just too much work.
Yeah, I think it's going to be the gateway. I think the, we're going to have sort of,
if I'm imagining an architecture, the gateway is the LLM. I'm saying something that I want to happen.
And the LLM understands the language and translates it into like a machine,
much more precise language. I imagine there'll be some sort of coordinator program that then takes that
description and it can start figuring out, okay, so now we need to use this program to help do this.
Let me talk to the LLM. Hey, change this to this language. Now let me talk to that. So we'll have
a coordinator program, but the gateway between humans and that program and between that program
and other programs is going to be LLMs. But what this is also going to enable is they don't have to be so big. If we don't need them to do everything, we don't need them to like
play chess games and be able to write in every idiom, we can make them much smaller. If what we
really need them to do is understand, you know, human language that is like relevant to the types
of business tasks that this multi-agent thing is going to run on, that LLM can be much smaller,
which means we can like fit it on a phone. And more importantly, it can be much more responsive.
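To make that architecture concrete, here is a rough, hypothetical sketch in Python of a coordinator program that uses a small language model only as the language gateway and routes the actual work to ordinary, specialized programs. Every function name here, including the mocked call_small_llm helper, is an invented stand-in rather than any real product's API.

import json

def call_small_llm(prompt: str) -> str:
    # Stand-in for a small, fast language model. A real system would send the
    # prompt to a model; here we fake a structured "plan" so the sketch runs.
    return json.dumps({"tool": "fetch_sales_data", "arg": "Q1"})

def fetch_sales_data(quarter: str) -> list:
    # Ordinary, non-LLM program: the kind of specialized tool that does the
    # actual work once the request has been translated.
    return [100, 120, 90]

def build_spreadsheet(rows: list) -> str:
    return "\n".join(str(r) for r in rows)

def coordinator(user_request: str) -> str:
    # 1. Gateway step: the LLM turns loose human language into a precise,
    #    machine-readable step.
    plan = json.loads(call_small_llm(
        "Translate this request into a JSON tool call: " + user_request))
    # 2. Coordinator step: plain code routes that step to the right program.
    if plan["tool"] == "fetch_sales_data":
        return build_spreadsheet(fetch_sales_data(plan["arg"]))
    return "No tool available for that request."

print(coordinator("Get the Q1 sales numbers into a little table for me."))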
Sam Altman's been talking about this recently. It's just too slow right now because these LLMs are so big. Even GPT-4o, when you get it into more esoteric token spaces, I mean, it's fine. I'm not complaining.
It's a fantastic tool.
But I do a fair amount of waiting while it's chewing through everything.
Yeah, well, and because,
like the model is big, right?
And how do you,
the actual computation
behind a transformer-based
language model production of a token,
the actual computation
is a bunch of matrix multiplications, right?
So the weights of the neural networks
in the layers are represented
as big matrices, and you multiply
matrices by matrices. This is what's happening
on GPUs. But the size
of these things is so big, they don't even fit
in the memory of a single
GPU chip. So you might have multiple
GPUs involved just to
produce, working full out, just to produce
a single token because these things are so big.
Massive matrices are being multiplied.
So if you make the model smaller,
they can generate the tokens faster.
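As a back-of-the-envelope illustration of why smaller models generate tokens faster, here is a small Python sketch timing a single layer's matrix multiplication at different hidden sizes. The dimensions are illustrative only and are not the real sizes of any particular model.

import time
import numpy as np

def time_one_layer(hidden_size: int) -> float:
    # One layer's worth of work is roughly "multiply the activation through a
    # hidden_size x hidden_size weight matrix"; bigger models mean bigger matrices.
    rng = np.random.default_rng(0)
    weights = rng.random((hidden_size, hidden_size), dtype=np.float32)
    activation = rng.random(hidden_size, dtype=np.float32)
    start = time.perf_counter()
    _ = weights @ activation
    return time.perf_counter() - start

for hidden in (1024, 4096, 8192):  # illustrative "small" to "large" widths
    print(f"hidden size {hidden:>5}: {time_one_layer(hidden) * 1e3:.2f} ms per layer")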
And what people really want is essentially
real-time response. They want to be able to
say something and
have the text response just boom.
That's the responsiveness where now
this is going to become a natural interface where I
can just talk and not watch it word by word go, but I can talk and boom, it does it.
What's next, right? Or even talks back to you. So now you have a commute or whatever,
but you can actually now use that time maybe to have a discussion with this highly specific
expert about what you are working on.
And it's just a real time
as if you're talking to somebody on the phone.
Oh, it's good.
And I think people underestimate
how cool this is going to be.
So we need very quick latency,
very small latency,
because we imagine,
I want to be at my computer or whatever,
just to be like, okay,
find the data from the,
get the data from the Jorgensen movie.
Let's open up Excel here.
Let's put that into a table.
Do it like the way we did before.
If you're seeing that just happen as you say it, now we're in like the linguistic equivalent of Tom Cruise
in Minority Report, sort of moving the AR windows around with his special gloves. That's when it
gets really important. Sam Altman knows this. He's talking a lot about it. It's not too difficult.
We just need smaller models, but we know small models are fine. Like, as I mentioned in that diplomacy example, the language model was very small, and it was a factor of 100 smaller than
something like GPT-4. And it was fine because it wasn't trying to be this oracle that anyone could
ask everything about and was constantly prodding it and giving it... It was an idiot savant. It was
just really good at diplomacy language.
And it had the reasoning engine.
And it knew it really well.
And it was really small.
It was 9 billion parameters, right?
And so anyways, I'm looking forward to that.
We get these models smaller.
Smaller is going to be more.
It's an interesting mindset shift.
Smaller models hooked up to custom other programs,
deployed in a bespoke environment.
Like that's the startup play you want to be involved in. With a big context window.
Big context window. Yeah. But even that doesn't have to be that big. A lot of the stuff we do
doesn't even need a big context window. You can have another program just find the relevant thing
to what's happening next and paste that into the prompt that you don't even see.
That's true. I'm just thinking selfishly. Like, think about a writing project, right? So you go
through your research phase and you're reading books and articles and transcripts of podcasts,
whatever, and you're making your highlights and you're getting your thoughts together.
And you have this corpus. I mean, if it were fiction, it would be like your
story Bible, as they say, or codex, right? You have all this information now that, and it's time to start
working with this information to be able to, and it might be a lot depending on what you're doing.
And Google's notebook, it's called NotebookLM. This is the concept, and I've started to
tinker with it in my work. I haven't used it enough to have, and this is kind of a segue
into the final question I want to ask you.
I haven't used it enough to pronounce one way or the other on it.
I like the concept though, which is exactly this.
Oh, cool.
You have a bunch of material now that is going to be,
that's related to this project you're working on.
Put it all into this model and it now,
it reads it all. And it can find the little password, for example, or you hide the password
in a million tokens of text or whatever and it can find it. So it, in a sense, quote-unquote knows,
with a high degree of accuracy, everything you put in there. And now you have
this bespoke little assistant on the project that is,
it's not trained on your data per se, but you can have that experience. And so now you have a very
specific assistant that you can use. But of course you need a big context window. Maybe you don't
need it to be 1.5 million or 10 million tokens, but if it were 50,000 tokens, then maybe that's sufficient for an article or something,
but not for a book. It does help, though it's worth knowing, like the architecture,
there's a lot of these sort of third-party tools, like, for example, built on language models,
where you hear people say, like, I built this tool where I can now ask this custom model questions
about all of the quarterly reports
of our company from the last 10 years or something.
This is like, there's a big business now,
consulting firms building these tools for individuals.
But the way these actually work
is there's an intermediary.
So you're like, okay, I want to know about,
you know, how have our sales differed
between the first quarter this year versus 1998?
You don't have, in these tools, 20 years' worth of reports in the context.
What it does is it actually, right, it's search, not the language model,
just old-fashioned program, searches these documents to find relevant text,
and then it builds a prompt around that.
And actually how a lot of these tools work is it stores this text in such
a way that it uses the embeddings of your prompt. So like after they've already been transformed
into the embeddings that the language model neural networks understand, and all your text
has also been stored in this way, and it can find sort of now conceptually similar text. So it's
like more sophisticated than text matching, right?
It's not just looking for keywords.
It can, so it can actually leverage
like a little bit of the language model,
how it embeds these prompts
into a conceptual space
and then find text
that's in a similar conceptual space.
But then it creates a prompt.
Okay, here's my question.
Please use the text below
in answering this question.
And then it has 5,000 tokens
worth of text pasted below.
That actually works pretty well.
So all the OpenAI demos
from last year, like the one about
the plugin demo with UN reports, etc.,
that's the way that worked.
It was finding relevant
text from a giant corpus
and then creating smarter
prompts that you don't see as the user.
But your prompt is not what's going to the language
model. It's a version of your prompt that has cut-and-pasted text found in the documents.
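The pattern being described here is commonly called retrieval-augmented generation. Below is a minimal, self-contained Python sketch of the idea: embed the question, find the most conceptually similar stored text, and build a prompt the user never sees. The embed and ask_llm functions are toy stand-ins, not a real model or any vendor's API.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy "embedding": a normalized character-frequency vector. A real system
    # would use the language model's own embedding space so that similarity is
    # conceptual rather than literal keyword overlap.
    vec = np.zeros(128)
    for ch in text.lower():
        vec[ord(ch) % 128] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

documents = [
    "Q1 2024 revenue grew 12 percent versus the prior quarter.",
    "1998 annual report: sales were roughly flat year over year.",
    "Employee handbook: vacation policy and company holidays.",
]
doc_vectors = [embed(d) for d in documents]

def ask_llm(prompt: str) -> str:
    # Stand-in for the language model call; it just echoes the hidden prompt.
    return "[model would answer using this prompt]\n" + prompt

def answer(question: str, top_k: int = 2) -> str:
    # Retrieve the most similar chunks, then build the prompt the user never sees.
    scores = [float(embed(question) @ v) for v in doc_vectors]
    best = sorted(range(len(documents)), key=lambda i: -scores[i])[:top_k]
    context = "\n".join(documents[i] for i in best)
    return ask_llm("Use only the text below to answer.\n" + context +
                   "\n\nQuestion: " + question)

print(answer("How did sales differ between the first quarter this year and 1998?"))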
Even that works well. Yeah. I'm just parroting actually the
CIO of my sports company who knows a lot more about the AI than I do. He's really into the
research of it. He has just commented to me a couple of times that when I'm doing that type of work, he has recommended
stuffing the context window, because if you just give it big PDFs, you just don't get nearly as
good results as you do when you stuff the context window. That was just a comment,
but we're coming up on time, but I just wanted to ask one more question if you have a few more
minutes. And this is something that you've commented on a number of times, but I wanted to come back to it.
And so in your work now, and obviously a lot of your work, the highest quality work that you do, is deep in nature, aside from maybe the personal interactions in your job.
In many ways, your
career is based on coming up with good ideas.
And so how are you currently using these LLMs?
And specifically, what have you found helpful and useful?
Well, I mean, I'll say right now in their current incarnation, I use them very little
outside of specifically experimenting with things for articles about LLMs, right?
Because as you said, like my main livelihood is trying to produce ideas at a very high level, right?
So for academic articles, New Yorker articles, or books, it's a very precise thing that requires you taking a lot of information, and then your
brain is trained over decades of doing this, sits with it and works on it for months and months
until you kind of slowly coalesce, like, okay, here's the right way to think about this, right?
This is not something that I find to be aided much by sort of generic brainstorming
prompts from like an LLM. It's way too specific and weird
and idiosyncratic for that. Where I imagine, and then what I do is I write about it. But again,
the type of writing I do is incredibly sort of like precise. I have a very specific voice,
the rhythm of the sentences, I have a style. It's just, I just write. And I'm used to it,
and I'm used to the psychology of the blank page and that pain, and I sort of internalize it.
And I'm sure you have, I mean, I'm sure you have to go through multiple drafts.
The first draft, you're just throwing stuff down.
I don't know for you, but for me, I have to fight the urge to fix things.
Just get all the ideas down and then you have to start refining.
Yeah, and I'm very used to it.
My inefficiency is not like if I could speed up that by 20%, somehow that matters. It's, you know, it might take me months to write an article. And it's about getting the ideas right and sitting with it. Where I do see these tools playing a big role, what I'm waiting for is this next generation where they become more customized and bespoke and integrated in the things I'm already using. That's what I'm waiting for. Like, I'll give you an example. I've been experimenting with just a lot of examples with GPT-4 for understanding natural
language-described schedule constraints and understanding, here's a
meeting time that satisfies these constraints. This is going to be imminently built into like
Google Workspace. That's going to be fantastic. Where you can say, we need a meeting,
like, in natural language.
We need a meeting with like Mike
and these other people.
It needs to be the next two weeks.
Here's my constraints.
I really want to try to keep this
in the afternoon if possible,
not on Mondays or Fridays,
but if we really have to do a Friday afternoon,
we can, but no later than this.
And then, you know, the language model
working with these other engines
sends out a scheduling email
to the right people.
People just respond in natural language with the times that might work.
It finds something in the intersection.
It sends out an invitation to everybody.
That's really cool.
That's going to make a big difference for me right away, for example.
These type of things.
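As a hypothetical sketch of the first step in that scheduling workflow, the snippet below shows a language model being used only to turn loose natural-language constraints into structured data that an ordinary scheduling program could then act on. The JSON shape and the call_llm helper are invented for illustration; they do not describe how Google Workspace or any real product actually does this.

import json

def call_llm(prompt: str) -> str:
    # Stand-in for the language model; its structured output is faked here so
    # the sketch runs end to end.
    return json.dumps({
        "attendees": ["Mike", "Cal"],
        "window_days": 14,
        "preferred_time": "afternoon",
        "avoid_days": ["Monday", "Friday"],
        "fallback": "Friday afternoon allowed if nothing else works",
    })

def parse_constraints(request: str) -> dict:
    prompt = ("Extract the scheduling constraints from this request as JSON with "
              "keys attendees, window_days, preferred_time, avoid_days, fallback:\n"
              + request)
    return json.loads(call_llm(prompt))

constraints = parse_constraints(
    "We need a meeting with Mike in the next two weeks, afternoons if possible, "
    "not Mondays or Fridays, though a Friday afternoon is okay as a last resort.")
print(constraints)
# From here, an ordinary (non-LLM) program would email attendees, collect their
# natural-language replies, intersect availabilities, and send the invitation.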
Or integrated into Gmail.
Suddenly, it's able to highlight
a bunch of messages in my inbox
and be like, you know what?
I can handle these for you.
You're like, good.
And it's like, and they disappear.
That's where this is going to start
to enter my world
in the way that like GitHub Copilot
has already entered the world
of computer programmers.
So because the thinking and writing I do
is so highly specialized,
this sort of the impressive but generic ideation
and writing abilities of those models
isn't that relevant to me.
But the administrative overhead
that goes around being any type of knowledge worker
is poison to me.
And so, you know, that is the evolution,
the turn of this sort of product development crank
that I'm really waiting to happen.
And I'm assuming one of the things that we will see probably sometime in the near future
is, think about how Gmail currently, I guess, has some of these predictive text outputs
where if you like what it's suggesting, you can just hit tab or whatever.
And it throws a couple of words in there, but I could see that expanding to it's actually now just suggesting an entire reply. And Hey, if you like it, you just go,
yeah, you know, it sounds great. Next, next, next. Yep. Or you'll train it. And this is where you
need other programs, not just a language model, but you sort of show it examples. Like you just
tell it, like, these are the common types of messages I get. And then, like, you're kind of
telling it which example is what type. And then it sort of learns to categorize these messages. And then
it can have rules for how you deal with these different types of messages.
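A toy sketch of that idea, showing labeled example messages, a stand-in classifier, and per-category handling rules, might look like the following. Nothing here reflects how Gmail or any real assistant is implemented; the keyword-matching classifier simply stands in for a language model given those examples as a few-shot prompt.

# Labeled examples of the "common types of messages I get"; in a real system
# these would seed a few-shot prompt for a language model.
EXAMPLES = [
    ("Can you send me the invoice for May?", "billing"),
    ("Are you free to meet Thursday at 2?", "scheduling"),
    ("Loved your latest podcast episode!", "fan_mail"),
]

# A rule attached to each category: what the assistant should do with it.
RULES = {
    "billing": "forward to the bookkeeper",
    "scheduling": "hand off to the scheduling agent",
    "fan_mail": "draft a short thank-you reply for approval",
}

def classify(message: str) -> str:
    # Stand-in for a language model shown EXAMPLES as a few-shot prompt.
    # Crude keyword matching is used here only so the sketch actually runs.
    text = message.lower()
    if "invoice" in text or "payment" in text:
        return "billing"
    if "meet" in text or "calendar" in text:
        return "scheduling"
    return "fan_mail"

def handle(message: str) -> str:
    category = classify(message)
    return category + ": " + RULES[category]

print(handle("Could we meet next week to go over the draft?"))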
Yeah, it's going to be powerful. Like that's going to, that's going to start to matter,
I think, in an interesting way. I think information gathering, right? So one of the
big purposes, like in an office environment of meetings is there's certain information or opinions I need, and it's kind of
complicated to explain them all. So can we just like all get together in a room? But AI with
control programs, now like I don't necessarily need everyone to get together. I can explain like
this is the information. I need this information, this information, and a decision on this and this.
Like that AI program might be able to talk to your AI program. Like it might be able to gather
most of that information with no humans in the loop. And then there's a few places where what
it has is like questions for people and it gives it to those people's AI agent. And so there's
certain points of the day where you're talking to your agent and it like asks you some questions
and you respond and then it gets back and then all this is gathered together. And then when it
comes time to work on this project, it's all put on my desk, just like a presidential chief of staff
puts the folder on the president's desk. There it is.
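As a loose illustration of that agent-to-agent information gathering, here is a hypothetical Python sketch in which one person's agent queries other agents and queues only the unanswered questions for humans. All class and function names are invented for the example.

class PersonalAgent:
    def __init__(self, owner, known_facts):
        self.owner = owner
        self.known_facts = known_facts          # what this agent can answer itself
        self.questions_for_owner = []           # what only the human can answer

    def ask(self, question):
        # Return an answer if the agent already knows it, otherwise None.
        return self.known_facts.get(question)

    def queue_for_owner(self, question):
        # Saved up so the human can answer a batch of questions at a convenient
        # point in the day, rather than being interrupted for each one.
        self.questions_for_owner.append(question)

def gather(project_questions, agents, fallback_agent):
    # One agent asks the others; humans are only pulled in for the gaps.
    folder = {}
    for question in project_questions:
        for agent in agents:
            reply = agent.ask(question)
            if reply is not None:
                folder[question] = reply
                break
        else:
            fallback_agent.queue_for_owner(question)
            folder[question] = "(waiting on a human)"
    return folder

mike = PersonalAgent("Mike", {"Q2 ad budget?": "$40k"})
cal = PersonalAgent("Cal", {"Draft deadline?": "July 15"})
print(gather(["Q2 ad budget?", "Draft deadline?", "Launch date?"], [mike, cal], mike))
print("Queued for Mike to answer:", mike.questions_for_owner)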
This is where I think people need to be focused in knowledge work
and LLMs and not get too caught up in thinking about, again, a chat window
into an oracle as being the end
all of what this technology could be.
Again, it's when it gets smaller that its impact gets bigger.
That's when things are going to start to get interesting.
Final comment I'll add is in my work, because I've said a number of times that I'm using
it quite a bit.
And just in case anybody's wondering, because that seems to contradict what you said,
because in some ways my work is very specialized. And that is where I use it the most. If I think about my health and fitness-related work,
I found it helpful at a high level for generating overviews. So I want to create some content on a
topic and I want to make sure that I'm being comprehensive. I'm not forgetting about something
that should be in there. And so I find it helpful to take something like if it's just an
outline for an article I want to write and just ask it, does this look right to you? Am I
missing anything? How might you make this better? Those types of simple little
interactions are helpful. Also applying that to specific materials. So again, is there anything here that seems to be incorrect to
you? Or is there anything that you would add to make this better? Sometimes I get utility out of
that. And then where I've found it most useful actually is in a, it's really just hobby work.
My original interest in writing actually was fiction going back to, I don't know, I was
17, 18 years old. And it's kind of been an abiding interest that I put on the back burner to focus
on other things for a while. Now I've brought it back to not a front burner, but maybe I bring it
to a front burner and then I put it back and then bring it and put it back. And so for that,
I found it extremely helpful because that process started with me reading a bunch of books on
storytelling and fiction so I can understand the art and science of storytelling beyond just my
individual judgment or taste. Pulling out highlights, notes, things, I'm like,
that's useful, that's good. Organizing those things into a system of checklists really to go through. So, okay, you want to create characters. There are principles
that go into doing this well. Here they are in a checklist and working with GPT in particular
through that process is, I mean, that is extremely useful because again, as this context builds in this chat, in the exact case of building
a character, it understands quote unquote, the psychology and it understands probably in some
ways more so than any human could because it also understands the, or in a sense can produce
the right answers to questions that are now also given the context of people like this character
that you're building. And so much of putting together a story is actually just logical
problem solving. There are maybe some elements that you could say are more purely creative,
but as you start to put all the scaffolding there, a lot of it now is you've kind of built
constraints of a story world and characters and how things are supposed to work. And it becomes more and more just logical problem
solving. And because these LLMs are so good with language in particular, that has been actually a
lot of fun to see how all these things come together. And it saves a tremendous amount of
time. It's not just about copy and pasting the answers. So much of
the material that it generates is great. And so anyway, just to give context for listeners,
because that's how I've been using it both in my fitness work, but it's been actually more useful
in the fiction hobby. Yeah. And one thing to point out about those examples is they're both focused on like
the production of text under sort of clearly defined constraints, which like language models
are fantastic at. And so for a lot of knowledge work jobs, there is text produced as a part of
those jobs, but either it's not necessarily core, you know, it's like the text that shows up in
emails or something like this, or yeah, they're not getting paid to write the emails.
Yeah. And in that case, the constraints aren't clear, right? So like the issue with like email text is like the text is not complicated text, but the constraints are like very business and
personality specific. Like, okay, well, so-and-so is a little bit nervous about getting out of the
loop and we need to make sure they feel better about that. But there's this other initiative
going on and it's too complicated for people.
I can't get these constraints to my language model.
So that's why,
so I think people who are generating content with clear constraints,
which is like a lot of what you're doing,
these language models are great.
And by the way,
I think most computer programming is that as well,
is producing content under very clear constraints.
It compiles and solves this problem.
And this is
why, to put this in the context of what I'm saying, so for the knowledge workers that don't do that,
this is where we're going to have the impact of these tools come in and say, okay, well,
these other things you're doing, that's not just a production of text and clear constraints.
We can do those things individually or take those off your plate by having to kind of program into
explicit programs the constraints of what this is. Like, oh, this is an email in this type of company. This is a calendar or whatever. So one way or the other,
this is going to get into what most knowledge workers do. But you're in a fantastic position
to sort of see the power of these next generation of models up close because it was already a match
for what you're doing. And as you would describe, you would say, this has really changed the feel of your day. It's opened things up. So I think that's an optimistic look ahead to the future.
And in using what now is just this big, unwieldy model that's kind of good at a lot of things,
not great really at anything, in a more specific manner that you've been talking about in this
interview, where not only is the task specific, I think it's a general tip for anybody listening who can get some utility out of these tools, the more specific
you can be, the better. And so in my case, there are many instances where I want to have a discussion
about something related to this story. And I'm working through this little system that I'm
putting together, but I'm feeding it. I'm like even defining the terms for it. So, okay, we're
going to talk about, we're going to go through a whole checklist related to creating a premise for
a story, but here's specifically, here's what I mean by premise. And that now is me pulling material
from several books that I read and I kind of cobbled together. I think this is the definition
that I like of premise. This is what we're going
for very specifically, feed that into it. And so I've been able to do a lot of that as well,
which is again, creating a very specific context for it to work in. And the more
hyper-specific I get, the better the results. Yep. And more and more in the future,
the bespoke tool will have all that specificity built in. So you can just get to doing the thing
you're already doing, but now suddenly it's much easier. Yep. Well, I've kept you over. I appreciate
the accommodation there. I really enjoyed the discussion and want to thank you again. And
before we wrap up again, let's just let people
know where they can find you, find your work. You have a new book that recently came out.
If people liked listening to you for this hour and 20 minutes or so, I'm sure they'll like the
book, as well as your other books. Yeah. I guess the background on me is that
I'm a computer scientist, but I write a lot about the impact of technologies on our life and work
and what we can do about it in response.
So, you know, you can find out more about me at calnewport.com.
You can find my New Yorker archive at newyorker.com where I write about these issues.
My new book is called Slow Productivity.
It's about how technology, like email, for example, and smartphones and laptops, sped up knowledge work until it was overly frenetic and stressful, and how we can reprogram our thinking of productivity to make
it reasonable. Again, we talked about that when I was on the show before. So definitely check that
out as well. Gives a kind of a, almost a framework that is actually very relevant to this discussion.
Oh yeah. Yeah. And, and, you know, the motivation for that whole book
is technology, too.
Like, again, technology
sort of changed knowledge work.
Now we have to take back
control of the reins,
but also, right,
the vision of knowledge work is one,
the slow productivity vision is one,
and where AI could definitely
play a really good role
is it takes a bunch of this
freneticism off your plate,
potentially, and allows you
to focus more on what matters.
I guess I should mention
I have a podcast as well,
Deep Questions, where I take questions from my audience about all these types of issues and then get into the weeds, get nitty-gritty, give some specific advice. You
can find that. That's also on YouTube as well. Awesome. Well, thanks again, Cal. I appreciate it.
Thanks, Mike. Always a pleasure.
How would you like to know a little secret that will help you get into the best shape of your life?
Here it is.
The business model for my VIP coaching service sucks.
Boom, mic drop.
And what in the fiddly frack am I talking about?
Well, while most coaching businesses try to keep their clients around for as long as possible,
I take a different approach.
You see, my team and I, we don't just help you build your best body ever.
I mean, we do that.
We figure out your calories and macros, and we create custom diet and training plans based
on your goals and your circumstances.
And we make adjustments depending on how your body responds,
and we help you ingrain the right eating and exercise habits so you can develop a healthy
and a sustainable relationship with food and training and more. But then there's the kicker,
because once you are thrilled with your results, we ask you to fire us. Seriously,
you've heard the phrase, give a man a fish and you feed him for a day, teach him to fish and
you feed him for a lifetime. Well, that summarizes how my one-on-one coaching service works. And
that's why it doesn't make nearly as much coin as it could, but I'm okay with that because my
mission is not to just help you gain muscle and lose fat. It's to give you the tools and to give
you the know-how that you need to forge ahead in your fitness without me. So dig this. When you
sign up for my coaching, we don't just take you by the hand and walk you through the entire process of building a
body you can be proud of. We also teach you the all-important whys behind the hows, the key
principles, and the key techniques you need to understand to become your own coach. And the best
part? It only takes 90 days. So instead of going it alone this year,
why not try something different?
Head over to muscleforlife.show slash VIP.
That is muscleforlife.show slash VIP
and schedule your free consultation call now.
And let's see if my one-on-one coaching service
is right for you.
Well, I hope you liked this episode.
I hope you found it helpful.
And if you did, subscribe to the show
because it makes sure that you don't miss new episodes.
And it also helps me
because it increases the rankings of the show a little bit,
which of course then makes it a little bit
more easily found by other people
who may like it just as much as you.
And if you didn't like something about this episode or about the show in general, or if you have ideas or suggestions or
just feedback to share, shoot me an email, mike at muscleforlife.com, muscleforlife.com,
and let me know what I could do better or just what your thoughts are about maybe what you'd
like to see me do in
the future. I read everything myself. I'm always looking for new ideas and constructive feedback.
So thanks again for listening to this episode, and I hope to hear from you soon.