Muscle for Life with Mike Matthews - Cal Newport on the Future of AI and Knowledge Work
Episode Date: June 12, 2024

Is artificial intelligence a quantum leap forward for humanity? The key to world peace, the cure for disease and aging, and the springboard to the abundant, leisure-filled future depicted in science fiction novels? Or is it the death knell for humanity as we know it? Or something in between? In this episode, I talk with Cal Newport, a renowned computer science professor, author, and productivity expert, to delve into the complex landscape of AI. Cal has been studying AI and its ramifications for humanity since long before it was cool, and he has a number of counterintuitive thoughts on the pros and cons of this new technology, how to get the most out of it right now, and what the future of AI will look like. In this interview, you'll learn . . .

How to use existing AI tools like ChatGPT and Claude to be more productive and successful in your work and personal life
The pros and cons of existing AI tools
What the future of AI development may look like
How to use AI to stay competitive in the modern workplace
And more . . .

So, if you're curious about how AI is shaping our world and what you should do right now to get and stay ahead of the curve, click play and join the conversation!

Timestamps:

(3:44) The current and future impact of AI on life and work
(10:52) The limitations and inefficiencies of current LLMs
(15:37) The future of LLMs
(18:56) The benefits of a "multi-agent approach"
(28:15) Will AI lead to massive job losses?
(33:16) How AI will become essential in the modern workplace
(36:51) How will AI change the "rhythm" of work?
(44:27) The future of AI in knowledge work
(50:31) The problems with the "Oracle" model of AI
(58:37) How LLMs will advance AI
(1:07:39) How Cal uses LLMs in his work
(1:09:56) What AI innovations are set to benefit writers the most?
(1:12:38) AI's future role in information gathering
(1:13:52) How Mike uses AI in his work
(1:20:54) Where can people find Cal's work?

Mentioned on the Show:

Cal Newport's Website
Cal Newport's New Yorker Archive
Cal Newport's New Book: Slow Productivity: The Lost Art of Accomplishment Without Burnout
Cal Newport's previous podcast appearance
Cal Newport's Deep Questions Podcast
The Legion Diet Quiz
Buy Legion Whey+
Legion Body Transformation Coaching
Transcript
Language model-based tools like ChatGPT or Claude, again, they're built only on understanding language and generating language based on prompts.
Mainly how that's being applied, and I'm sure this has been your experience, Mike, in using these tools, is that it can speed up things that, you know, we were already doing.
Help me write this faster.
Help me generate more ideas than I'd be able to come up with, you know, on my own.
Help me summarize this document.
It's sort of speeding up tasks. But none of that is "my job doesn't need to exist," right? The Turing test we should care about is when can an AI empty my email
inbox on my behalf, right? And I think that's an important threshold because that's capturing
a lot more of what cognitive scientists call functional intelligence, right? And I think
that's where a lot of the prognostications of big impacts get more interesting.
Hello, and welcome to another episode of Muscle for Life. I'm your host, Mike Matthews. Thank you
for joining me today for something a little bit different than the usual here on the podcast.
Something that may seem a little bit random,
which is AI,
but although I selfishly wanted to have this conversation because I find the topic and the technology interesting
and I find the guest interesting,
I'm a fan of his work.
I also thought that many of my listeners
may like to hear the discussion as well
because if they are not already using AI to improve their
work, to improve their health and fitness, to improve their learning, to improve their
self-development, they should be and almost certainly will be in the near future. And so
that's why I asked Cal Newport to come back on the show and talk about AI. And in case you are
not familiar with Cal, he is a renowned computer science professor, author, and productivity expert.
And he's been studying AI and its ramifications for humanity long before it was cool. And in this
episode, he shares a number of counterintuitive thoughts on the pros and cons of this new technology,
how to get the most out of it right now, and where he thinks it is going to go in the future.
Before we get started, how many calories should you eat to reach your fitness goals faster?
What about your macros? What types of food should you eat? And how many
meals should you eat every day? Well, I created a free 60 second diet quiz that will answer
those questions for you and others, including how much alcohol you should drink, whether you should
eat more fatty fish to get enough omega-3 fatty acids, what supplements are worth taking and why, and more. To take the quiz and
get your free personalized diet plan, go to muscleforlife.show slash diet quiz, muscleforlife.show
slash diet quiz now, answer the questions and learn what you need to do in the kitchen to lose fat,
build muscle, and get healthy.
Hey, Cal, thanks for taking the time to come back on the podcast.
Yeah, no, it's good to be back.
Yeah, I've been looking forward to this, selfishly, because I'm personally very interested
in what's happening with AI. I use it a lot in my work. It's basically my little digital assistant now. And because so much of my work these days is creating content of
different kinds. It's just doing things that require me to create ideas, to think through
things. And I find it very helpful. But of course, it's also, there's a lot of controversy over it.
And I thought that might be a good place to start. So the first question I'd
like to give to you is, so everyone listening has heard about AI and what's happening to some
degree, I'm sure. And there are a few different schools of thought from what I've seen in terms
of where this technology is and where it may go in the future. There are people who think that it
may save humanity, it may usher in a new renaissance, it may dramatically reduce the
cost of producing products and services, new age of abundance, prosperity, all of that.
And then there seems to be the opposite camp who think that it's more likely to destroy everything and possibly even just
eradicate humanity altogether. And then there also seems to be a third philosophy, which is
kind of just a meh, like the most likely outcome is probably going to be disappointment. It's not
going to do either of those things. It's just going to be a technology that is useful for
certain people under certain circumstances.
And it's just going to be another tool, another digital tool that we have.
I'm curious as to your thoughts, where do you fall on that multipolar spectrum?
Well, you know, I tend to take the Aristotelian approach here. When we think about Aristotelian ethics, where he talks about the real right target tends to be between extremes, right? So when you're trying to figure out about particular character traits, Aristotle would say, well, you don't want to be foolhardy, but you also don't want to be a coward. And in the middle is the golden mean, he called it. That's actually where I think we are probably with AI. Yes, we get
reports of it's going to take over everything in a positive way, new utopia. This is sort of an Elon
Musk, I would say, endorsed idea right now. Horowitz as well. Andreessen Horowitz, Mark Andreessen.
Yes, that's true. That's right. But Andreessen Horowitz, you got to take them with a grain of
salt because their goal is they need massive new markets in which to put capital, right? So,
you know, we're like two years out from Andreessen Horowitz really pushing a crypto-driven internet
was going to be the future of all technology because they were looking for plays and that
kind of died down. But yeah, Musk is pushing it too. I don't think we have evidence
right now to support this sort of utopian vision. The other end, you have the p(doom)-equals-one
vision of the Nick Bostrom superintelligence. Look, this is already out of control and it's
going to recursively improve itself until it takes over the world. Again, like most computer
scientists I know aren't sweating that right now either. I would probably go with something, if I'm going to use your scale,
let's call it meh plus, because I don't think it's meh, but I also don't think it's one of
those extremes. If I had to put money down, and it's dangerous to put money down on something
that's so hard to predict, you're probably going to have a change maybe on the scale of something
like the internet, the consumer internet. Let's think about that for a little bit, right? I mean, that is a transformative technological
change, but it doesn't play out with the drasticness that we like to envision or we're
more comfortable categorizing our predictions. Like, when the internet came along, it created
new businesses that didn't exist before. It put some businesses out of business. For the most part, it changed the way, like the business we were already doing, we kept doing it, but it changed what the day-to-day reality of that was.
Professors still profess, car salesmen still sell cars.
But it's like different now, you have to deal with the internet, it kind of changed the day-to-day.
That's probably like the safest bet for how the generative AI revolution, what that's going to lead to, is not necessarily a drastic wholesale redefinition of what we mean by work or what we do for work, but perhaps a pretty drastic change to the day-to-day composition of these efforts.
Just like someone from 25 years ago wouldn't be touching email or Google in a way that a knowledge worker today is going to be constantly touching those tools.
But that job might be the same job that was there 25 years ago. It just feels different how it unfolds.
That's, I think, the safe bet right now. That aligns with something Altman said in a
recent interview I saw where, to paraphrase, he said that he thinks now is the best time to
start a company since the advent of the internet, if not the entire history of technology because of what he thinks people
are going to be able to do with this technology. I also think of, he has a bet with, I forget,
a friend of his on how long it'll take to see the first billion dollar market cap on a
solopreneur's business, basically. Just a one-man business. I mean, obviously it would be in tech.
It'd be some sort of next big app or something that was created, though, by one dude with AI. Billion-dollar-plus
valuation. Yeah. And you know, that's possible because if we think about, for example, Instagram.
Yep. Great example. I think they had 10 employees when they sold, right?
It's 10 or 11 and they sold for right around a billion dollars. Right. So.
And how many of those 10 or 11 were engineers just doing engineering that AI could do?
Yep, that was probably four.
Yeah, and so, right, one AI-enhanced programmer.
I think that's an interesting bet to make.
That's a smarter way, by the way, to think of this from an entrepreneurial angle,
making sure you're leveraging what's newly
made possible by these tools in pursuing whatever business seems like in your sweet spot and seems
like there's a great opportunity, as opposed to what I think is a dangerous play right now,
is trying to build a business around the AI tools themselves in their current form, right? Because
one of a collection of takes I've been developing about where we are right now with consumer-facing AI, and one of the strong takes, is that the existing form factor of generative AI tools, which is essentially a chat interface, where I give carefully engineered prompts to get language model based tools to produce useful text, might be more fleeting than we think. That's a step towards more
intricate tools. So if you're building a startup around using text prompts to an LLM, you may
actually be building around the wrong technology. You're building around not necessarily where
this is going to end up in its widest form. And we know that in part because these chatbot-based
tools have been out for about a year and a half now, November 2022 would be the debut of ChatGPT, and work has not been transformed by the tools as they're designed
right now, which tells us this form factor of copying and pasting text into a chat box
is probably not going to be the form factor that's going to deliver the biggest disruptions.
We sort of need to look down the road a little bit about how we're going to build on top
of this capability.
This is not going to be the way I think like the average knowledge worker ultimately is
going to interact, is not going to be typing into a box at chat.openai.com. I think this is a sort of preliminary stepping stone in this technology's development.
Something I've noticed with other people who also use it is that the quality of its outputs is highly dependent on the quality of the inputs, the person using it. And while it really excels in verbal intelligence, general reasoning, not so much. I saw something recently that Claude 3 scored about 100 or so on a general IQ test,
which was delivered the way you would deliver it to a blind person.
Whereas for GPT on that same test, it was an informal paper of sorts, GPT's
general IQ was maybe 85 or something like that. Verbal IQ though, very high. So GPT,
according to a couple of analyses, scores somewhere in the 150s on verbal IQ. And so
what I've seen is it takes an above average verbal IQ in an individual to get a lot of utility
out of it in its current form factor. And so I've seen that as just a limiting factor.
Even if somebody, if they haven't spent a lot of time dealing with language, they struggle
to get to the outcomes that it is capable of producing.
But you can't just give it something vague, kind of, "This is kind of what I want. Can you just do this for me?"
Like you need to be very particular, very deliberate.
Sometimes you have to break down what you want into multiple steps and walk it through. So it's just echoing what you were saying there is for it to really make major disruptions,
it's going to have to get beyond that because most people are not going to be able to 100x
their productivity with it.
They just won't.
Yeah.
Well, look, I'm working right now.
As we talk, I'm writing a draft of a New Yorker piece on using AI for writing.
One of the just universally agreed-on axioms of people who study this is that a language model can't produce writing that is of higher quality than the person using the language model is already capable of doing.
And with some exceptions, right? Like, if you're not a natural English speaker, if English is not your first language. But otherwise, you have to be the taste function. Like, is this good? Is this not good? Here's what this is missing. In fact, one of the interesting conclusions, preliminary conclusions that's coming from the work I'm doing on this is that, like, for students who are using language models with paper writing, it's not saving them time. I think we have this idea that it's going to be a
plagiarism machine, like write this section for me and I'll lightly edit it. It's not what they're
doing. It's way more interactive, back and forth. What about this? Let me get this idea. It's as
much about relieving the psychological distress of facing the blank page as it is about trying
to speed up or produce or automate part of this effort. There's a bigger point here. I'll make some big takes. Let's take some big swings here. There's a bigger point I want to
underscore, which, as you mentioned, like, Claude is not good at reasoning. You know,
GPT-4 is better than GPT at reasoning, but, you know, not even like a moderate human level of
reasoning. But here's the bigger point I've been making recently. The idea that we want to build large language models,
big enough that just as like an accidental side effect, they get better at reasoning,
is like an incredibly inefficient way to have artificial intelligence do reasoning.
The reasoning we see in something like GPT-4, which there's been some more research on,
it's like a side effect of this language model trying to be very good at producing reasonable text, right?
The whole model is just trained on,
you've given me a prompt,
I want to expand that prompt in a way that makes sense
given the prompt you gave me.
And it does that by generating tokens, right?
Given the text that's in here so far,
what's the best next part of a word or word to output next?
And that's all it does.
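To make that token-by-token loop concrete, here is a minimal sketch of autoregressive generation; the small GPT-2 model is just a stand-in for any language model, and greedy decoding is the simplest possible version of "pick the best next token."

```python
# A minimal sketch of the loop being described: given the text so far, the model
# only scores possible next tokens; we pick one, append it, and repeat.
# (Illustrative only; GPT-2 is a stand-in for any autoregressive language model.)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

ids = tokenizer("The key to playing Diplomacy well is", return_tensors="pt").input_ids

with torch.no_grad():
    for _ in range(40):                              # generate 40 tokens, one at a time
        logits = model(ids).logits[0, -1]            # scores for every possible next token
        next_id = torch.argmax(logits).view(1, 1)    # greedy: take the single best token
        ids = torch.cat([ids, next_id], dim=1)       # append it and go around again

print(tokenizer.decode(ids[0]))
```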
Now, in winning this game of producing text
that actually makes sense,
it has had to implicitly encode some reasoning
into its wiring
because sometimes to actually expand text,
if that text is capturing
some sort of logical puzzle in it,
to expand that text in a logical way,
it has to do some reasoning.
But this is a very inefficient way of doing reasoning,
to have it be as a side effect
of building a really good token generation machine. Also, you have to make these things huge just to get that as a side
effect. GPT-3.5, which powered the original ChatGPT, and which had probably around 100 billion parameters, maybe 170 billion parameters, could do some of this reasoning, but it wasn't very good. When they went to a trillion-plus parameters for GPT-4, this sort of accidental implicit reasoning that was built into it got a lot better, right? But we're making these things huge. This is
not an efficient way to get reasoning. So what makes more sense? And this is my big take. It's
what I've been arguing recently. I think the role of language models in particular is going to
actually focus more on understanding language.
What is it that someone is saying to me?
What the user is saying?
What does that mean?
Like, you know, what are they looking for?
And then translating these requests into the very precise formats
that other different types of models and programs
can take as input and deal with.
And so like, let's say, for example, you know,
there's mathematical reasoning, right?
And we want to have help from an AI model to solve complicated mathematics.
The goal is not to keep growing
a large language model large enough
that it has seen enough math
that it kind of implicitly gets better and better at it.
Actually, we have really good
computerized math solving programs
like Mathematica, Wolfram's program.
So what we really want is a language
model to recognize you're asking about a math problem, put it into the precise language that
another program can understand, have that program do what it does best. And it's not an emergent
neural network. It's more hard code. Let it solve the math problems. And then you can give the
result back to the language model with a prompt for it to tell you, here's what the answer is.
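As a rough illustration of that division of labor, here is a toy sketch: the language model's only jobs are to translate the question into a formal equation and to phrase the result, while a conventional solver (SymPy here, standing in for something like Mathematica) does the actual math. The llm() function is a hypothetical stand-in for a real chat-completion call.

```python
# A sketch of the "language model as translator" pattern: the LLM only turns a
# natural-language request into a formal expression, a hard-coded solver does the
# actual reasoning, and the LLM could then phrase the result for the user.
import sympy

def llm(prompt: str) -> str:
    # Stand-in for a real language-model call; hard-coded so the sketch runs.
    return "solve: x**2 - 5*x + 6 = 0"

def answer_math_question(question: str) -> str:
    formal = llm(f"Rewrite this as a single equation to solve: {question}")
    equation = formal.split("solve:", 1)[1].strip()           # e.g. "x**2 - 5*x + 6 = 0"
    lhs, rhs = (sympy.sympify(s) for s in equation.split("="))
    roots = sympy.solve(sympy.Eq(lhs, rhs))                   # the hard-coded reasoning step
    return f"The solutions are {roots}."                      # an LLM could phrase this more nicely

print(answer_math_question("Which numbers squared, minus five times themselves, plus six, equal zero?"))
```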
This is the future I think we're going to see is many more different types of models that do different types
of things that we would normally do in the human head. Many of these models not emergent, not just
trained neural networks that we have to just study and see what they can do, but very explicitly
programmed. And then these language models, which are so fantastic at translating between languages
and understanding language, sort of being kind of at the core of this, taking what we're saying as users in natural language, turning it into the language of these
ensembles of programs, getting the results back and transforming it back to what we can understand.
This is a way more efficient way of having much broader intelligences, as opposed to growing a
token generator larger and larger, that it just sort of implicitly gets okay at some of these
things. It's just not an efficient way to do it. The multi-agent approach could get to something that would maybe appear to be an AGI-like experience, even though it still may not be one in the sense of, to come back to something you commented on, understanding the answer as opposed to just regurgitating probabilistically correct text. I think a good example of that is the latest round of Google gaffes, Gemini gaffes,
where it's saying to put glue in the cheese of the pizza, eat rocks, bugs crawl up your penis
hole, that's normal, all these things, right? Where the algorithm says, yeah, here, here's the
text, spit it out, but it doesn't understand what it's saying in the way that a human does,
because it doesn't reflect on that and go, well, wait a minute. No, we definitely don't want to be
putting glue on the pizza. And so to your point, for it to reach that level of human-like awareness,
I don't know where that goes. I don't know enough about the details. You
probably would be able to comment on that a lot better than I would. But the multi-agent approach,
that anyone can understand where if you build that up, you make that robust enough,
it can reach a level where it seems to be highly skilled at basically everything. And it goes beyond the current generalization,
generally not that great at anything other than outputting grammatically perfect text
and knowing a bit of something about basically everything.
Yeah. Well, I mean, let me give you a concrete example, right? I wrote about this in a New
Yorker piece I published in March. And I think it's an important point, right? A team from Meta set out to build an AI
that could do really well at the board game diplomacy.
And I think this is really important
when we think about AGI,
or just more in general,
like human-like intelligence in a very broad way,
because the diplomacy board game,
you know, if you don't know it,
is partially like a Risk-type strategy war game. You know, you move figures on a board. It takes place in
World War I era Europe and you're trying to take over countries or whatever. But the key to
diplomacy is that there's this human negotiation period. At the beginning of every turn, you have
these private one-on-one conversations with each of the other players and you make plans and alliances and you also double cross and you make a fake alliance with this player so that they'll
move their positions out of a defensive position so that this other player that you have a secret
alliance with can come in from behind and take over this country. And so it's really considered
like a game of realpolitik, human-to-human skill. There was this rumor that Henry Kissinger would play diplomacy in the Kennedy White House
just to sharpen his skill
of how do I deal with all these world leaders.
So when we think of AI from a perspective of like,
ooh, this is getting kind of spooky what it can do,
winning at a game like diplomacy is exactly that.
Like it's playing against real players
and pitting them against themselves
and negotiating to figure out how to win.
They built a bot called Cicero
that did really well.
They played it on an online diplomacy
chat-based, text-based chat diplomacy server
called DiplomacyNet.
And it was winning, you know,
two-thirds of its games
by the time they were done.
So I interviewed some of the developers
for this New Yorker piece.
And here's what's interesting about it. Like the first thing they did is they took a language model,
and they trained it on a lot of transcripts of diplomacy games. So it was a general language
model, and then they extra trained it with a lot of data on diplomacy games. Now, you could ask
this model, you could chat with it, like, what do you want to do next? But, you know, it would output, these are reasonable descriptions
of diplomacy moves,
given like what you've told it so far
about what's happening in the game.
And in fact, probably it's learned enough
about seeing enough of these examples
and how to generate reasonable text
to expand a transcript of a diplomacy game.
There'll be moves that like match
where the players actually are.
Like they make sense,
but it was terrible at playing diplomacy.
It was just reasonable stuff.
Here's how they built a bot that could win at diplomacy.
They said, oh, we're going to code a reasoning engine,
a diplomacy reasoning engine.
And what this engine does, if you give it a description
of where all the pieces are on the board and what's going on
and what requests you on and what request
you have from different players like what they want you to do it can just simulate a bunch of
futures like okay let's see what would happen if russia is lying to us but we go along with this
plan what would they do oh you know three or four moves from now we could really get in trouble
well what if we lied to them and then they did that? So you're simulating the future and none of this is like emergent. Yeah, it's like Monte
Carlo type. It's a program. Yeah. Monte Carlo simulations. Exactly. And like, we've just
hardcoded this thing. And so what they did is that a language model talked to the players.
So if you're a player, you're like, okay, hey, Russia, here's what I want to do. The language
model would then translate what they were saying into like a very formalized language
that the reasoning model understands,
a very specific format.
The reasoning model would then
figure out what to do.
It would tell the language model
with a big prompt
and it would add a prompt to it.
Like, okay, we want to like
accept France's proposal,
like generate a message
to try to get it to like
accept the proposal
and let's like deny the proposal
for Italy or whatever.
And then the language model, which had seen a bunch of diplomacy games, is prompted, write this in the style of a diplomacy game. And it would sort of output the text that would get sent to the users.
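A rough sketch of that split, with the language model at the edges and a hand-coded engine in the middle, might look like the toy code below. The function names, the toy scoring, and the Monte-Carlo-style loop are all illustrative assumptions, not Meta's actual Cicero code.

```python
# A toy sketch of the Cicero-style split: the language model only translates
# between free-form player messages and a formal move language, while a
# hand-coded engine decides what to do by simulating possible futures.
import random

def llm_parse(message: str) -> dict:
    # Stand-in for the language model turning chat into a formal proposal.
    return {"from": "FRANCE", "proposal": "support_attack_on_GERMANY"}

def simulate_future(proposal: dict, assume_honest: bool, turns: int = 4) -> float:
    # Stand-in for the hand-coded reasoning engine: roll the game forward a few
    # turns under one assumption and score how good the position looks for us.
    score = 0.0
    for _ in range(turns):
        score += random.uniform(0, 1) if assume_honest else random.uniform(-1, 0.5)
    return score

def decide(proposal: dict) -> str:
    # Monte-Carlo style: average many simulated futures under each assumption.
    honest = sum(simulate_future(proposal, True) for _ in range(200)) / 200
    lying = sum(simulate_future(proposal, False) for _ in range(200)) / 200
    # Hand-coded policy: only accept if it looks good even if they might be lying.
    return "ACCEPT" if min(honest, lying) > 0 else "DECLINE"

def llm_reply(decision: str, proposal: dict) -> str:
    # Stand-in for the language model writing the message "in the style of a
    # diplomacy game" from the engine's decision.
    return f"{proposal['from']}, we {decision.lower()} your proposal to {proposal['proposal']}."

incoming = "Hey, France here. Support my attack on Germany and I'll back you in Burgundy."
proposal = llm_parse(incoming)
print(llm_reply(decide(proposal), proposal))
```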
That did really well. Not only did that do well, none of the users, they surveyed them after
the fact, or I think they looked at the forum discussions. None of them even knew they were
playing against a bot. They thought they're playing against another human. And this thing
did really well, but it was a small language model. It was an off-the-shelf research language model,
nine billion parameters or something like that, and this hand-coded engine, right? That's the
power of the multi-agent approach. But there's also an advantage to this approach. So I call
this intentional AI or IAI. The advantage of this approach is that we're no longer staring at these
systems like an alien
mind and we don't know what it's going to do. Because the reasoning now, we're coding this
thing. We know exactly how this thing is going to decide what moves to make, because we programmed the
diplomacy reasoning engine. And in fact, and here's the interesting part about this example,
they decided they didn't want their bot to lie. That's a big strategy in diplomacy, but they didn't want the bot to lie to human players for various ethical reasons. And because they were hand-coding the
reasoning engine, they could just code it to never lie. So, you know, when you don't try to have all
of the sort of reasoning decision-making happen in this sort of obfuscated, unpredictable,
uninterpretable way within a giant neural network, but you have more of the reasoning just programmed explicitly working with this great language model, now we have a lot
more control over what these things do. Now we can have a diplomacy bot, hey, it can beat human
players. That's scary, but it doesn't lie because actually all the reasoning, there's nothing
mysterious about it. It's just like we do with a chess playing bot. We simulate lots of different
sequences of moves to see which one's going to end up best. It's not obfuscated. It's not unpredictable. And it can't be jailbroken.
There's no jailbreaking. We programmed it. Yeah. So this is the future I see with multi-agent.
It's a mixture of when you have generative AI, so if you're generating text or understanding text
or producing video or producing images, these very large neural network-based models are really, really good at this. And we don't exactly know how they operate,
and that's fine. But when it comes to planning or reasoning or intention or the evaluation of
which of these plans is the right thing to do or of the evaluation of is this thing you're going
to say or do correct or incorrect, that can actually all be super intentional, super transparent, hand-coded.
There's nothing here to escape when we think about this way.
So I think IAI gives us a powerful vision of an AI future, especially in the business context, but also a less scary one.
Because the language models are kind of scary in the way that we just train this thing for $100 million over months. And then
we're like, let's see what it can do. I think that rightly freaks people out. But this multi-agent
model, I don't think it's nearly as sort of Frankenstein's monster as people fear AI sort of
has to be. One of the easiest ways to increase muscle and strength gain is to eat enough protein
and to eat enough high quality
protein. Now you can do that with food, of course, you can get all of the protein you need from food,
but many people supplement with whey protein because it is convenient and it's tasty and that
makes it easier to just eat enough protein. And it's also rich in essential amino acids, which are crucial for muscle building. And it's digested well.
It's absorbed well.
And that's why I created Whey Plus, which is a 100% natural grass-fed whey isolate protein
powder made with milk from small sustainable dairy farms in Ireland.
Now, why whey isolate?
Well, that is the highest quality whey protein you can buy. And that's why
every serving of Whey Plus contains 22 grams of protein with little or no carbs and fat.
Whey Plus is also lactose-free, so that means no indigestion, no stomach aches, no gassiness.
And it's also 100% naturally sweetened and flavored, and it contains no artificial food dyes or other chemical junk.
And why Irish dairies?
Well, research shows that they produce some of the healthiest, cleanest milk in the world.
And we work with farms that are certified by Ireland's Sustainable Dairy Assurance Scheme, SDAS, which ensures that the farmers adhere to best practices in animal welfare,
sustainability, product quality, traceability, and soil and grass management. And all that is why I have sold over 500,000 bottles of Whey Plus and why it has over 6,000 four and five star reviews
on Amazon and on my website. So if you want a mouth-watering, high-protein,
low-calorie whey protein powder
that helps you reach your fitness goals faster,
you want to try Whey Plus today.
Go to buylegion.com slash whey.
Use the coupon code MUSCLE at checkout,
and you will save 20% on your first order.
And if it is not your first order,
you will get double reward points. And that is 6% cash back. And if you don't absolutely love
Whey+, just let us know and we will give you a full refund on the spot. No form, no return
is even necessary. You really can't lose. So go to buylegion.com slash whey now,
use the coupon code MUSCLE at
checkout to save 20% or get double reward points, and then try Whey+ risk-free and see what you
think. Speaking of fears, there's a lot of talk about the potential negative impacts on people's
jobs on economies. Now you've expressed some skepticism about the
claims that AI will lead to massive job losses, at least in the near future. Can you talk a little
bit about that for people who have that concern as well? Because they've read maybe that their job
is on the list that AI is replacing whatever this is in the next X number of years, because you see a lot of that.
Yeah, no, I think those are still largely overblown right now. I don't like the methodologies of
those studies. And in fact, it's kind of ironic, one of the big early studies that gave specific numbers for like what part of the economy is going to be automated. Ironically,
their methodology was to use a language model to categorize
whether each given
job was something that a language model
might one day automate.
So it's this interesting methodology. It was very
circular. So here's where we are now. Where we are
now, language model-based tools like
ChatGPT or Claude, again, they're built only
on understanding language and generating language
based on prompts. Mainly
how that's being applied,
I'm sure this has been your experience, Mike, in using these tools, is that it can speed up things that we were already doing. Help me write this faster, help me generate more ideas than I'd be able to come up with on my own. Help me summarize this document. It's sort of speeding up tasks.
Help me think through this. Here's what I'm dealing with.
Am I missing anything?
I find those types of discussions very helpful.
And that's another aspect that's been helpful.
That's what we're seeing with students as well.
It's interesting.
It's sort of more of a psychological than efficiency advantage.
It's humans are social.
There's something really interesting going on here where there's a rhythm of thinking
where you're going back and forth with another entity that somehow is a kind of a more
comfortable rhythm than just I'm sitting here, white knuckling my brain, trying to come up with
things. But none of that is "my job doesn't need to exist," right? So that's sort of where we are
now. It's speeding up certain things or changing the nature of certain things we're already doing.
I argued recently that the next step, like the Turing test we should care about, is when can
an AI empty my email inbox on my behalf, right? And I think that's an important threshold because
that's capturing a lot more of what cognitive scientists call functional intelligence, right?
So the cognitive scientists would say a language model has very good linguistic intelligence,
understanding producing language.
The human brain does that, but also has these other things called functional intelligences,
simulating other minds, simulating the future, trying to understand the implication of actions
on other actions, building a plan, and then evaluating progress towards the plan.
There's all these other functional intelligences that we break out as cognitive scientists.
Language models can't do that, but to empty an inbox, you need those, right?
For me to answer this email on your behalf,
I have to understand who's involved.
What do they want?
What's the larger objective that they're moving towards?
What information do I have
that's relevant to that objective?
What information or suggestion can I make
that's going to make the best progress
towards that objective?
And then how do I deliver that in a way
that's going to, actually understanding how they think about it and what they care about and what they know about, that's going to like best
fit these other minds? That's a very complicated thing. So that's going to be more interesting,
right? Because that could take more of this sort of administrative overhead off the plate of
knowledge workers, not just speeding up or changing how we do things, but taking things
off our plate, which is where things get interesting. That needs multi-agent models,
right? Because you have to have the equivalent of the diplomacy planning bot doing
sort of business planning. Like, well, what would happen if I suggest this and they do this, what's
going to happen to our project? It needs to have specific objectives programmed in. Like, in this
company, this is what matters. Here's the list of things I can do. And here's things that I, so now when I'm trying to plan what I suggest, I have like a hard-coded list of like, these are the
things I'm authorized to do in my position at this company, right? So we need multi-agent models for
the inbox clearing Turing test to be passed. That's where things start to get more interesting. And I
think that's where like a lot of the prognostications of big impacts get more interesting.
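As a toy illustration of what passing that inbox test might require architecturally, here is a minimal sketch: the language model interprets the email and drafts the reply, while a hand-coded planner only chooses from an explicit list of actions the agent is authorized to take. Every name and rule here is hypothetical.

```python
# A toy sketch of the "inbox Turing test" architecture being described: the
# language model only interprets the email and drafts text, while a hand-coded,
# auditable planner chooses among an explicit list of authorized actions.
AUTHORIZED_ACTIONS = ["schedule_meeting", "send_status_update", "escalate_to_human"]

def llm_extract_request(email: str) -> str:
    # Stand-in for a language model classifying what the sender actually wants.
    return "schedule_meeting" if "meet" in email.lower() else "send_status_update"

def plan(request: str) -> str:
    # Hand-coded policy: never do anything outside the authorized list.
    return request if request in AUTHORIZED_ACTIONS else "escalate_to_human"

def llm_draft_reply(action: str, email: str) -> str:
    # Stand-in for the language model turning the chosen action into polite prose.
    drafts = {
        "schedule_meeting": "Happy to meet. Does Thursday at 2pm work?",
        "send_status_update": "Here's where the project stands as of today...",
        "escalate_to_human": "Looping in a teammate who can help with this.",
    }
    return drafts[action]

email = "Can we meet this week to go over the launch plan?"
print(llm_draft_reply(plan(llm_extract_request(email)), email))
```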
Again, though, I don't know that it's going to eliminate large swaths of the economy, but it might really change the character
of a lot of jobs. Sort of, again, similar to the way the internet or Google or email really change
the character of a lot of jobs versus what they were like before, really changing what the day-to-day
rhythm is. Like we've gotten used to in the last 15 years, work is a lot of sort of unstructured
back and forth communication that sort of our day is built on email, Slack and meetings.
Work five years from now, if we cross the inbox Turing test might feel very different
because a lot of that coordination can be happening between AI agents and it's going
to be a different feel for work. And that could be substantial, but I still don't see
that as, you know, knowledge work goes away, that knowledge work is like building, you know, water-run mills or horse and buggies. It sounds like AI is going to become table stakes if you are a knowledge worker, which would also include, I think, creative work of any kind, and that we could have a scenario where information slash knowledge slash idea workers, whatever, with AI, it's just going to get to a point where they can outproduce quantitatively and qualitatively their peers on average who do not have or who do not use AI, so much so that a lot of the latter group will not have employment in that capacity if they don't adopt the technology and change. Yeah, I mean, I think it's like internet-connected PCs, right?
Like, eventually,
everyone had, in knowledge work,
had to adopt and use these.
Like, you couldn't survive
by, like, the late 90s.
You're like, I'm just at too big
of a disadvantage
if I'm not using
an internet-connected computer, right?
You can't email me.
I'm not using word processors.
We're not using digital graphics and presentations. You had to adopt that technology. We saw a similar
transition, if we want to go back 100 years, to the electric motors and factory manufacturing.
There was like a 20-year period where we weren't quite sure. We were uneven in our integration of
electric motors into
factories that before were run by giant steam engines that would turn an overhead shaft and
all the equipment would be connected to it by belts. But eventually, and there's a really nice
business case written about this, that's sort of often cited, eventually, you had to have small
motors on each piece of equipment because it was just, you're still building the same things.
And like the equipment was functionally the same. You're, whatever, you're sewing shirts or pants,
right? You're still a factory making pants. You still have sewing machines, but you eventually
had to have a small motor on every sewing machine connected by electrical cable to a dynamo because
that was just so much more of an efficient way to do this than to have a giant overhead single speed
crankshaft on which everything was connected
by belts, right? So we saw that in knowledge work already with internet connected computers.
If we get to this sort of functional AI, this functional intelligence AI, I think it's going
to be unavoidable, right? Like, I mean, one way to imagine this technology, I don't exactly know
how it'll be delivered, but one way to imagine it is something like a chief of staff,
right? So like if you're a president or a tech company CEO, you have a chief of staff that sort
of organizes all the stuff so that you can focus on what's important. Like the president of the
United States doesn't check his email inbox. He'd be like, what do I work on next, right?
That sort of Leo McGarry character is like, all right, here's who's coming in next.
Here's what you need to know about it. Here's the information. We got to make a decision on like whether to deploy troops. You do that. Okay, now here's what's happening next. You can imagine a
world in which AIs play something like that role. So now things like email, a lot of what we're
doing in meetings, for example, that gets taken over more by the digital chief of staff, right?
They gather what you need,
they coordinate with other AI agents to get you the information you need, they deal with the
information on your behalf, they deal with the sort of software programs that like make sense
of this information or calculate this information, they sort of do that on your behalf. We could be
heading more towards a future like that, a lot less administrative overhead and a lot more sort of undistracted thinking or that sort of cognitive focus.
That will feel very different.
Now, I think that's actually a much better rhythm of work than what we evolved into over
the last 15 years or so in knowledge work.
But it could have interesting side effects because if I can now produce 3x more output
because I'm not on email all day, well, that changes up the economic nature of my particular sector because technically we only need a third of me now to capture the sort of surplus cognitive capacity.
We just sort of have a lot more raw brain cycles available. We don't have everyone
sending and receiving emails once every four minutes, right? And so we're going to see more,
I think, probably injection of cognitive cycles into other parts of the economy where I might now
have someone hired that like helps me manage a lot of like the paperwork in my household,
like things that just require,
because there's going to be this sort of excess cognitive capacity. So we're going to have sort
of more thinking on our behalf. It's a hard thing to predict, but that's where things get interesting.
I think email is a great example of necessary drudgery. And there's a lot of other necessary
drudgery that will also be able to be offloaded. I mean, an example is the CIO of my sports nutrition company who oversees all of our
tech stuff and has a long list of projects he's always working on.
He is heavily invested now in working alongside AI.
And I think he likes GitHub Copilot the most.
And he's kind of fine-tuned it on how he likes to code and everything. And he said a couple things. One, he estimates that his personal productivity is at least 10 times higher. And he is not a sensationalist. That's like a conservative estimate with his coding. And then he also has commented that something he loves about it is it automates a lot of drudgery.
Code where, typically, okay, you have to kind of reproduce something you've already done before.
And that's fine.
You can take what you did before, but you have to go through it and you have to make changes.
And you know what you're doing, but it's, but it's boring and it can take a lot of
time. And he said, now he spends very little time on that type of work because the AI is great at
that. And so the time that now he gives to his work is more fulfilling and ultimately more
productive. And so I can see that effect occurring in many other types of work.
I mean, just think about writing. Like you say, you don't ever have to deal with the scary blank
page. Not that that is really an excuse to not put words on the page, but that's something that
I've personally enjoyed is, although I don't believe in writer's block per se,
you can't even run into idea block, so to speak, because if you get there and you're not sure
where to go with this thought, or if you're even onto something, if you jump over to GPT and start
a discussion about it, at least in my experience, especially if you get it generating ideas,
and you mentioned this earlier, a lot of the ideas are bad and you just throw them away.
But always, always in my experience, I'll say always I get to something when I'm going through this kind of process, at least one thing, if not multiple things that I genuinely like,
that I have to say, that's a good idea. That gives me a spark. I'm going to take that and
I'm going to work with that.
Yeah, I mean, again, I think this is something
we don't, we didn't fully understand.
We still don't fully understand,
but we're learning more about,
which is like the rhythms of human cognition
and what works and what doesn't.
We've underestimated the degree to which
the way we work now, which is,
it's highly interruptive and solitary at the same time.
It's, I'm just trying to write this thing from
scratch, and that's like a very solitary task, but also like I'm interrupted a lot with like
unrelated things. This is a rhythm that doesn't fit well with the human mind. A focused collaborative
rhythm is something the human mind is very good at, right? So now if my day is unfolding with me
interacting back and forth with an agent, maybe that seems really artificial, but I think the reason why we're seeing this actually be useful to people is
it's probably more of a human rhythm for cognition. It's like I'm going back and forth with someone
else in a social context trying to figure something else, something out, and my mind can
be completely focused on this. You and I, or you as a bot in this case, we're trying to write this
article. And now that is more familiar, and I think that's why it feels like less of a strain
than I'm going to sit here and do this very abstract thing on my own, you know, just like staring
at a blank page. Programming, you know, it's an interesting example.
And I've been wary about trying to extrapolate too much from programming
because I think it's also a special case, right? Because what a language model
does do really well is they can produce text
that well matches the prompt that you gave
for like what type of text you're looking for.
And as far as a model is concerned,
computer code is just another type of text.
So it can produce,
if it's producing sort of like English language,
it's very good at following the rules of grammar.
And it's like, it's grammatically correct language.
If they're producing computer code,
it's very good at following the syntax
of programming languages.
This is actually like correct code that's going to run.
Now, language plays an important role
in a lot of knowledge work jobs, English language,
but it's not the main game.
It sort of supports the main things you're doing.
I have to use language to sort of like request
the information I need for what I'm producing.
I need to use language to like write a summary of the thing, the strategy I figured out.
So the language is a part of it, but it's not the whole activity.
And computer coding is the whole activity.
The code is what I'm trying to do.
Code that like produces something.
We just think of that as text that like matches a prompt.
Like the models are very good at that.
And more importantly, if we look at the knowledge work jobs
where the English text
is the main thing we produce,
like writers,
there typically we have these
incredibly fine-tuned standards
for what makes good writing good.
When I'm writing a New Yorker article,
it's very, very intricate.
It's not enough to be like,
this is grammatically correct language
that covers the relevant points
and these are good points.
It's like the sentence,
everything matters,
the sentence construction, the rhythm.
But in computer code, we don't have that.
The code has to be like reasonably efficient and run.
So like that, it's like a bullseye case
of getting the maximum possible productivity,
knowledge work productivity, out of a language model
is like producing computer code
as like a CIO for a company where it's like,
yeah, we need the right programs to do things. We're not trying to build a program that's going to have 100 million customers
and has to be like the super, like most efficient possible, like something that works and solves the
problem I want to solve. And there's no aesthetic dimension, although I suppose maybe
there'd be some pushback and that there can be elegant code and inelegant code, but it's not
anywhere to the same degree as when you're trying to write something that really resonates with other humans in a deep way and inspires different emotions and images and things.
Yeah, I think that's right. And elegant code is sort of the language equivalent of polished prose, which actually these language models do very well.
This is very polished prose. It doesn't sound amateur. There's no mistakes in it.
Yeah, that's often enough
unless you're trying to do something fantastical and new,
in which case the language models
can't help you with programming, right?
You're like, okay, I'm doing something completely different,
a super elegant algorithm that changes the way
like we compute something.
But most programming's not that, you know.
That's for the 10x coders to do.
So yeah, it's interesting. Programming is interesting. But for most other knowledge
work jobs, I see it more about how AI is going to get the junk out of the way of what the human
is doing, more so than it's going to do the final core thing that matters for the human.
And this is like a lot of my books, a lot of my writing is about digital knowledge work.
We have these modes of working that accidentally got in
the way of the underlying value-producing thing that we're trying to do in the company. The
underlying thing I'm trying to do with my brain is getting interrupted by the communication,
by the meetings, and that this is sort of an accident of the way digital knowledge work
unfolded. AI can unroll that, potentially unroll that accident, but it's not going to be GPT-5 that
does that. It's going to be a multi-agent
model where there's language models and hand-coded models and company-specific bespoke models that
all are going to work together. I really think that's going to be the future.
Maybe that's going to be Google's chance at redemption because they've made a fool of
themselves so far compared to OpenAI.
Even Perplexity, not to get off on a tangent, but by my lights,
Google Gemini should fundamentally work exactly the way that Perplexity works.
I now go to Perplexity just as often, if not more often.
If I want that type of, I have a question and I want an answer and I want sources
cited to that answer and I want more than one line, I go to Perplexity now.
I don't even bother with Google because Gemini is so unreliable with that.
But maybe Google will be the one to bring multi-agent into its own.
Maybe not.
Maybe it'll just be open AI.
They might be.
But yeah, I mean, then we say, okay, you know, I talked
about that bot that won at diplomacy
by doing this multi-agent approach. The lead
designer on that
got hired away from Meta. It was
OpenAI who hired him. So,
that's where he is now, Noam Brown. He's
at OpenAI, working, industry insiders suspect, on building exactly these sorts of
bespoke planning models to
connect the language models and extend the capability. Google Gemini also showed the
problem, too, of just relying on just making language models bigger and just having these
massive models do everything as opposed to the IAI model of, okay, we have specific logic and
these more emergent language understanders. Look what happened, you know, what was this, a couple months ago, with the controversy where they were trying to fine-tune these models to be more inclusive, and then it led to completely unpredictable, unintended results. Like refusing to show... Yeah, the Black Waffen-SS. Exactly. Or refusing to show the founding fathers as white.
The main message of that was kind of misunderstood.
I think that was somehow being understood by sort of political commentators as like
each of those, someone was programming somewhere like, don't show anyone as white or something
like that.
But no, what really happens is these models are very complicated.
So they do these fine tuning things.
You have these giant models
that take hundreds of million dollars to train.
You can't retrain them from scratch.
So now you're like,
we're worried about it being like showing,
defaulting to like showing
maybe like white people too often
when asked about these questions.
So we'll give it some examples
to try to nudge it in the other way.
But these models are so big and dynamic that, you know,
you go in there and just give it a couple examples of like,
show me a doctor and you kind of,
you give it a reinforcement signal to show a non-white doctor
to try to unbias it away from, you know, what's in this data.
But that can then ripple through this model in a way that now you get
the SS officers and the founding fathers, you know,
as American Indians or something like that.
It's because they're huge. And when you're trying to fine-tune a huge thing, you have like a small number of these fine-tuning examples, like 100,000 examples,
that have these massive reinforcement signals that fundamentally rewire the front and last
layers of these models and have these huge unpredictable dynamic effects. It just underscores
the unwieldiness of just trying to have a master
model that is huge, that's going to serve all of these purposes in an emergent manner.
It's an impossible goal. It's also not what any of these companies want. Their hope,
if you're OpenAI, if you're Anthropic, right, if you're Google, you do not want a world in which,
like, you have a massive model that you talk
to through an interface and that's everything. And this model has to satisfy all people in all
things. You don't want that world. You want the world where your AI, complicated combinations of
models, is in all sorts of different stuff that people do, in these much smaller form factors
with much more specific use cases. ChatGPT, it was an accident that that got so big.
It was supposed to be a demo
of the type of applications
you can build on top of a language model.
They didn't mean for ChatGPT
to be used by 100 million people, right?
It's kind of like we're in this,
that's why I say like don't overestimate
this particular,
the importance of this particular form factor for AI.
It was an accident that this is how we got exposed to what language models could do. It's not, people do not want to be in
this business of blank text box. Anyone everywhere can ask it everything. And this is going to be
like an Oracle that answers you. That's not what they want. They want like the GitHub co-pilot
vision. In the particular stuff I already do, AI is there
making this very specific thing better and easier or automating it. So I think they want to get away
from the mother model, the Oracle model that all things go through. This is a temporary step.
It's like accessing mainframes through teletypes before, you know, eventually we got personal
computers. This is not going to be the future
of our interaction with these things.
The Oracle blank text box
to which all requests go.
They're having so much trouble with this
and they don't want this to be.
It's, you know, I see these massive trillion-parameter models as just marketing, like,
look at the cool stuff we can do,
associate that with our brand name
so that when we're then offering
like more of these more bespoke tools
in the future that are all over the place,
you'll remember Anthropic because you remember Claude was really cool during this period where we were all using chatbots.
And we did the Golden Gate experiment. Remember how fun that was?
A good example of what you were just mentioning of how you can't brainwash the bots per se,
but you can hold down certain buttons and produce very strange outcomes.
For anyone listening, if you go check out, I think it's still live now. I don't know how long they're going to keep it up, but check out Anthropic's Claude Golden Gate
Bridge experiment and fiddle around with it. And by the way, think about this objectively.
There's another weird thing going on
with the Oracle model of AI, which again, why they want to get away from it. We're in this weird
moment now where we're conceptualizing these models sort of like important individuals.
And we want to make sure that like these individuals, like the way they express themselves
is proper, right? But if you zoom out, like this doesn't necessarily
make a lot of sense for something
to invest a lot of energy into.
Like you would assume people could understand
this is a language model.
It's this neural network that just like produces text
to expand stuff that you put in there.
You know, hey, it's going to say
all sorts of crazy stuff, right?
Because this is just a text expander,
but here's all these like useful ways you can use it,
but you can make it say crazy
stuff. Yeah. And if you want it to say whatever, nursery rhymes as if written by Hitler, whatever,
it's a language model that can do almost anything. And that's a cool tool. And we want to talk to you
about ways you can build tools on top of it. But we're in this moment where we got obsessed, where
we're treating it like it's an elected official or something. And the things it says somehow
reflects on the character of some sort of entity that actually exists. And so we
don't want this to say something. You know, it used to be, there's a whole interesting field,
an important field in computer science called algorithmic fairness, right? Or algorithmic bias.
And these are similar things where they look for, like, if you're using algorithms for making
decisions, you want to be wary of biases being unintentionally programmed into these algorithms,
right? This makes a lot of sense. The kind of the classic early cases were things like,
hey, you're using an algorithm to make loan approval decisions, right? Like,
I would give it all this information about the applicant and the model maybe is better than a
human at figuring out who to give a loan to
or not. But wait a second, depending on the data you train that model with, it might be actually
biased against people from certain backgrounds or ethnic groups in a way that is just an artifact
of the data. Like we got to be careful about that, right? Or in a way that may actually be
factually accurate and valid, but ethically unacceptable. And so you just make a determination.
Yeah. So, right. There could be, if this was just us as humans doing this, there's these
nuances and determinations we could have. And so we got to be very careful about having a black
box do it. But somehow we shifted that attention over to just chatbots producing text. They're not
at the core decisions.
The chatbot's text doesn't become canon. It doesn't get taught in schools. It's not used to make loan decisions. It's just a toy that you can mess with and it produces text.
But it became really important that the stuff that you get this bot to say has to
meet the standards of what we would have for an individual
human. And it's a huge amount of effort that's going into this, and it's really unclear why,
because, so what if I can make a chatbot say something very disagreeable? I can also
just say something very disagreeable. I can search the internet and find things very disagreeable.
Or you, exactly, you can go poke around on some forums about anything and
go spend some time on 4chan. And there you go. That's enough disagreeability for a lifetime.
So we don't get mad at Google for, hey, I can find websites written by preposterous people
saying terrible things, because we know this is what Google does. It just sort of indexes the web.
So there's a lot of effort going into trying to make this sort of Oracle model thing
kind of behave,
even though like the text
doesn't have impact.
There's like a big scandal
right before ChatGPT came out.
This was, I think it was Meta
had this language model Galactica
that they had trained
on a lot of scientific papers.
And they had this,
I think a really good use case,
which is if you're working
on scientific papers,
it can help speed up
writing sections of the papers. It's hard:
you get the results in science, but then writing the paper is like a pain,
where the real value is in doing the research, typically, right? And so, like, great, we've
trained on a lot of scientific papers, so it kind of knows the language of scientific papers. It can
help you, like, let's write the interpretation section. Let me tell you the main points, you put them
in the right language.
And then people were messing around
with this, like, hey,
we can get this to write
fake scientific papers.
Like a famous example was about,
you know, the history of bears in space.
And they got real spooked,
and they pulled it.
But like, in some sense,
it's like, yes, sure.
This thing that can produce
scientific sounding text
can produce papers
about bears in space.
I could write a fake paper about bears in space.
Like it's not adding some new harm to the world, but this tool would be very useful for like specific uses, right?
Like I want to make this section, help me write this section of my particular paper.
So when we have this like Oracle model of these, this Oracle conception of these machines, I think we anthropomorphize them into like they're an entity
and we want that.
And I created this entity as a company.
It reflects on me,
like what their values are
and the things they say.
And I want this entity to be
like sort of appropriate,
culturally speaking.
You could easily imagine,
and this is the way we thought
about these things pre-ChatGPT.
Hey, we have a model, GPT-3.
You can build applications on it
to do things.
That had been out for a year, like two years. You could build a chatbot on it, but you could build a bot on it
that just like, hey, produce fake scientific papers or whatever. But we saw it as a program,
a language generating program that you could then build things on top of. But somehow when we put it
into this chat interface, we think of these things as entities. And then we really care then about
the beliefs and behavior of the entities. It all seems so wasteful to me. Because we need to move past the chat interface era
anyways and start integrating these things directly into tools. No one's worried about
the political beliefs of GitHub's Copilot because it's focused on producing, filling in computer
code and writing drafts of computer code. Well, anyways, to try to summarize these various points
and sort of bring it to our look at
the future, essentially what I'm saying is that in this current era where the way we
interact with these generative AI technologies is through just like this single chat box
and the model is an oracle that we do everything through.
We're going to keep running into this problem where we're going to begin to treat this thing
as like an entity.
We're going to have to care about what it says and how it expresses itself
and whose team is it on. And a huge amount of resources have to be invested into this.
And it feels like a waste because the inevitable future we're heading towards is not one
of the all-wise oracle that you talk to through a chatbot to do everything,
but it's going to be much more bespoke where these networks of AI agents will be customized for various things we do, just like
GitHub Copilot is very customized at helping me in a programming environment to write computer code.
There'll be something similar happening when I'm working on my spreadsheet, and there'll be
something similar happening with my email inbox. And so right now, to be wasting so many resources on whether, you know, Claude or
Gemini or ChatGPT is politically correct, it's a waste of resources, because the role
of these large chatbots as oracles is going to go away anyway. So that's, you know, I am
excited. I am excited for the future where AI becomes, we splinter it and it becomes more responsive
and bespoke and it's directly working and helping with the specific things we're doing.
That's going to get more interesting for a lot of people because I do think for a lot
of people right now, the copying and pasting, having to make everything linguistic, having
to prompt engineer, that's a big enough stumbling block that it is impeding, I think, sector-wide disruption right now.
That disruption is going to be much more pronounced once we get the form factor of
these tools much more integrated into what we're already doing. And the LLM will probably be the
gateway to that because of how good it is at coding in particular and how much better it's
going to be. That is going to enable a lot of the coding work
of getting to these special-use-case multi-agents, probably to a degree that without it
just wouldn't be possible. It's just too much work.
Yeah, I think it's going to be the gateway. I think the, we're going to have sort of,
if I'm imagining an architecture, the gateway is the LLM. I'm saying something that I want to happen.
And the LLM understands the language and translates it into like a machine,
much more precise language. I imagine there'll be some sort of coordinator program that then takes that
description and it can start figuring out, okay, so now we need to use this program to help do this.
Let me talk to the LLM. Hey, change this to this language. Now let me talk to that. So we'll have
a coordinator program, but the gateway between humans and that program and between that program
and other programs is going to be LLMs. But what this is also going to enable is they don't have to be so big. If we don't need them to do everything, we don't need them to like
play chess games and be able to write in every idiom, we can make them much smaller. If what we
really need them to do is understand, you know, human language that is like relevant to the types
of business tasks that this multi-agent thing is going to run on, that LLM can be much smaller,
which means we can like fit it on a phone. And more importantly, it can be much more responsive.
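To make that architecture concrete, here is a rough, hypothetical sketch in Python of a coordinator program that uses a small language model only as the language gateway and routes the actual work to ordinary, specialized programs. Every function name here, including the mocked call_small_llm helper, is an invented stand-in rather than any real product's API.

import json

def call_small_llm(prompt: str) -> str:
    # Stand-in for a small, fast language model. A real system would send the
    # prompt to a model; here we fake a structured "plan" so the sketch runs.
    return json.dumps({"tool": "fetch_sales_data", "arg": "Q1"})

def fetch_sales_data(quarter: str) -> list:
    # Ordinary, non-LLM program: the kind of specialized tool that does the
    # actual work once the request has been translated.
    return [100, 120, 90]

def build_spreadsheet(rows: list) -> str:
    return "\n".join(str(r) for r in rows)

def coordinator(user_request: str) -> str:
    # 1. Gateway step: the LLM turns loose human language into a precise,
    #    machine-readable step.
    plan = json.loads(call_small_llm(
        "Translate this request into a JSON tool call: " + user_request))
    # 2. Coordinator step: plain code routes that step to the right program.
    if plan["tool"] == "fetch_sales_data":
        return build_spreadsheet(fetch_sales_data(plan["arg"]))
    return "No tool available for that request."

print(coordinator("Get the Q1 sales numbers into a little table for me."))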
Sam Altman's been talking about this recently. It's just too slow right now because these LLMs are so big. Even GPT-4o, when you get it into more esoteric token spaces, I mean, it's fine. I'm not complaining.
It's a fantastic tool.
But I do a fair amount of waiting while it's chewing through everything.
Yeah, well, and because,
like the model is big, right?
And how do you,
the actual computation
behind a transformer-based
language model production of a token,
the actual computation
is a bunch of matrix multiplications, right?
So the weights of the neural networks
in the layers are represented
as big matrices, and you multiply
matrices by matrices. This is what's happening
on GPUs. But the size
of these things is so big, they don't even fit
in the memory of a single
GPU chip. So you might have multiple
GPUs involved just to
produce, working full out, just to produce
a single token because these things are so big.
Massive matrices are being multiplied.
So if you make the model smaller,
they can generate the tokens faster.
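As a back-of-the-envelope illustration of why smaller models generate tokens faster, here is a small Python sketch timing a single layer's matrix multiplication at different hidden sizes. The dimensions are illustrative only and are not the real sizes of any particular model.

import time
import numpy as np

def time_one_layer(hidden_size: int) -> float:
    # One layer's worth of work is roughly "multiply the activation through a
    # hidden_size x hidden_size weight matrix"; bigger models mean bigger matrices.
    rng = np.random.default_rng(0)
    weights = rng.random((hidden_size, hidden_size), dtype=np.float32)
    activation = rng.random(hidden_size, dtype=np.float32)
    start = time.perf_counter()
    _ = weights @ activation
    return time.perf_counter() - start

for hidden in (1024, 4096, 8192):  # illustrative "small" to "large" widths
    print(f"hidden size {hidden:>5}: {time_one_layer(hidden) * 1e3:.2f} ms per layer")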
And what people really want is essentially
real-time response. They want to be able to
say something and
have the text response just boom.
That's the responsiveness where now
this is going to become a natural interface where I
can just talk and not watch it word by word go, but I can talk and boom, it does it.
What's next, right? Or even talks back to you. So now you have a commute or whatever,
but you can actually now use that time maybe to have a discussion with this highly specific
expert about what you are working on.
And it's just a real time
as if you're talking to somebody on the phone.
Oh, it's good.
And I think people underestimate
how cool this is going to be.
So we need very quick latency,
very small latency,
because we imagine,
I want to be at my computer or whatever,
just to be like, okay,
find the data from the,
get the data from the Jorgensen movie.
Let's open up Excel here.
Let's put that into a table.
Do it like the way we did before.
If you're seeing that just happen as you say it, now we're in like the linguistic equivalent of Tom Cruise
in Minority Report, sort of moving the AR windows around with his special gloves. That's when it
gets really important. Sam Altman knows this. He's talking a lot about it. It's not too difficult.
We just need smaller models, but we know small models are fine. Like, as I mentioned in that diplomacy example, the language model was very small, and it was a factor of 100 smaller than
something like GPT-4. And it was fine because it wasn't trying to be this oracle that anyone could
ask everything about and was constantly prodding it and giving it... It was an idiot savant. It was
just really good at diplomacy language.
And it had the reasoning engine.
And it knew it really well.
And it was really small.
It was 9 billion parameters, right?
And so anyways, I'm looking forward to that.
We get these models smaller.
Smaller is going to be more.
It's an interesting mindset shift.
Smaller models hooked up to custom other programs,
deployed in a bespoke environment.
Like that's the startup play you want to be involved in. With a big context window.
Big context window. Yeah. But even that doesn't have to be that big. A lot of the stuff we do
doesn't even need a big context window. You can have another program just find the relevant thing
to what's happening next and paste that into the prompt that you don't even see.
That's true. I'm just thinking selfishly. Like, think about a writing project, right? So you go
through your research phase and you're reading books and articles and transcripts of podcasts,
whatever, and you're making your highlights and you're getting your thoughts together.
And you have this corpus. I mean, if it were fiction, it would be like your
story Bible, as they say, or codex, right? You have all this information now that, and it's time to start
working with this information to be able to, and it might be a lot depending on what you're doing.
And Google's notebook, it's called NotebookLM. This is the concept, and I've started to
tinker with it in my work. I haven't used it enough to have, and this is kind of a segue
into the final question I want to ask you.
I haven't used it enough to pronounce one way or the other on it.
I like the concept though, which is exactly this.
Oh, cool.
You have a bunch of material now that is going to be,
that's related to this project you're working on.
Put it all into this model and it now,
it reads it all. And it can find the little password, for example, or you hide the password
in a million tokens of text or whatever and it can find it. So it, in a sense, quote-unquote knows,
with a high degree of accuracy, everything you put in there. And now you have
this bespoke little assistant on the project that is,
it's not trained on your data per se, but you can have that experience. And so now you have a very
specific assistant that you can use. But of course you need a big context window. Maybe you don't
need it to be 1.5 million or 10 million tokens, but if it were 50,000 tokens, then maybe that's sufficient for an article or something,
but not for a book. It does help, though it's worth knowing, like the architecture,
there's a lot of these sort of third-party tools, like, for example, built on language models,
where you hear people say, like, I built this tool where I can now ask this custom model questions
about all of the quarterly reports
of our company from the last 10 years or something.
This is like, there's a big business now,
consulting firms building these tools for individuals.
But the way these actually work
is there's an intermediary.
So you're like, okay, I want to know about,
you know, how have our sales differed
between the first quarter this year versus 1998?
You don't have, in these tools, 20 years' worth of reports in the context.
What it does is it actually, right, it's search, not the language model,
just old-fashioned program, searches these documents to find relevant text,
and then it builds a prompt around that.
And actually how a lot of these tools work is it stores this text in such
a way that it uses the embeddings of your prompt. So like after they've already been transformed
into the embeddings that the language model neural networks understand, and all your text
has also been stored in this way, and it can find sort of now conceptually similar text. So it's
like more sophisticated than text matching, right?
It's not just looking for keywords.
It can, so it can actually leverage
like a little bit of the language model,
how it embeds these prompts
into a conceptual space
and then find text
that's in a similar conceptual space.
But then it creates a prompt.
Okay, here's my question.
Please use the text below
in answering this question.
And then it has 5,000 tokens
worth of text pasted below.
That actually works pretty well.
So all the OpenAI demos
from last year, like the one about
the plugin demo with UN reports, etc.,
that's the way that worked.
It was finding relevant
text from a giant corpus
and then creating smarter
prompts that you don't see as the user.
But your prompt is not what's going to the language
model. It's a version of your prompt that has cut-and-pasted text found in the documents.
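The pattern being described here is commonly called retrieval-augmented generation. Below is a minimal, self-contained Python sketch of the idea: embed the question, find the most conceptually similar stored text, and build a prompt the user never sees. The embed and ask_llm functions are toy stand-ins, not a real model or any vendor's API.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy "embedding": a normalized character-frequency vector. A real system
    # would use the language model's own embedding space so that similarity is
    # conceptual rather than literal keyword overlap.
    vec = np.zeros(128)
    for ch in text.lower():
        vec[ord(ch) % 128] += 1
    return vec / (np.linalg.norm(vec) + 1e-9)

documents = [
    "Q1 2024 revenue grew 12 percent versus the prior quarter.",
    "1998 annual report: sales were roughly flat year over year.",
    "Employee handbook: vacation policy and company holidays.",
]
doc_vectors = [embed(d) for d in documents]

def ask_llm(prompt: str) -> str:
    # Stand-in for the language model call; it just echoes the hidden prompt.
    return "[model would answer using this prompt]\n" + prompt

def answer(question: str, top_k: int = 2) -> str:
    # Retrieve the most similar chunks, then build the prompt the user never sees.
    scores = [float(embed(question) @ v) for v in doc_vectors]
    best = sorted(range(len(documents)), key=lambda i: -scores[i])[:top_k]
    context = "\n".join(documents[i] for i in best)
    return ask_llm("Use only the text below to answer.\n" + context +
                   "\n\nQuestion: " + question)

print(answer("How did sales differ between the first quarter this year and 1998?"))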
Even that works well. Yeah. I'm just parroting actually the
CIO of my sports company who knows a lot more about the AI than I do. He's really into the
research of it. He has just commented to me a couple of times that when I'm doing that type of work, he has recommended
stuffing the context window, because if you just give it big PDFs, you just don't get nearly as
good results as you do when you stuff the context window. That was just a comment,
but we're coming up on time, but I just wanted to ask one more question if you have a few more
minutes. And this is something that you've commented on a number of times, but I wanted to come back to it.
And so in your work now, and obviously a lot of your work, the highest quality work that you do, is deep in nature, aside from maybe the personal interactions in your job.
In many ways, your
career is based on coming up with good ideas.
And so how are you currently using these LLMs?
And specifically, what have you found helpful and useful?
Well, I mean, I'll say right now in their current incarnation, I use them very little
outside of specifically experimenting with things for articles about LLMs, right?
Because as you said, like my main livelihood is trying to produce ideas at a very high level, right?
So for academic articles, New Yorker articles, or books, it's a very precise thing that requires you taking a lot of information, and then your
brain is trained over decades of doing this, sits with it and works on it for months and months
until you kind of slowly coalesce, like, okay, here's the right way to think about this, right?
This is not something that I find to be aided much by sort of generic brainstorming
prompts from like an LLM. It's way too specific and weird
and idiosyncratic for that. Where I imagine, and then what I do is I write about it. But again,
the type of writing I do is incredibly sort of like precise. I have a very specific voice,
the rhythm of the sentences, I have a style. It's just, I just write. And I'm used to it,
and I'm used to the psychology of the blank page and that pain, and I sort of internalize it.
And I'm sure you have, I mean, I'm sure you have to go through multiple drafts.
The first draft, you're just throwing stuff down.
I don't know for you, but for me, I have to fight the urge to fix things.
Just get all the ideas down and then you have to start refining.
Yeah, and I'm very used to it.
My inefficiency is not like if I could speed up that by 20%, somehow that matters. It's, you know, it might take me months to write an article. And it's about getting the ideas right and sitting with it. Where I do see these tools playing a big role, what I'm waiting for is this next generation where they become more customized and bespoke and integrated in the things I'm already using. That's what I'm waiting for. Like, I'll give you an example. I've been experimenting with just a lot of examples with GPT-4 for understanding natural
language-described schedule constraints and understanding, here's a
meeting time that satisfies these constraints. This is going to be imminently built into like
Google Workspace. That's going to be fantastic. Where you can say, we need a meeting,
like, in natural language.
We need a meeting with like Mike
and these other people.
It needs to be the next two weeks.
Here's my constraints.
I really want to try to keep this
in the afternoon if possible,
not on Mondays or Fridays,
but if we really have to do a Friday afternoon,
we can, but no later than this.
And then, you know, the language model
working with these other engines
sends out a scheduling email
to the right people.
People just respond in natural language with the times that might work.
It finds something in the intersection.
It sends out an invitation to everybody.
That's really cool.
That's going to make a big difference for me right away, for example.
These type of things.
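As a hypothetical sketch of the first step in that scheduling workflow, the snippet below shows a language model being used only to turn loose natural-language constraints into structured data that an ordinary scheduling program could then act on. The JSON shape and the call_llm helper are invented for illustration; they do not describe how Google Workspace or any real product actually does this.

import json

def call_llm(prompt: str) -> str:
    # Stand-in for the language model; its structured output is faked here so
    # the sketch runs end to end.
    return json.dumps({
        "attendees": ["Mike", "Cal"],
        "window_days": 14,
        "preferred_time": "afternoon",
        "avoid_days": ["Monday", "Friday"],
        "fallback": "Friday afternoon allowed if nothing else works",
    })

def parse_constraints(request: str) -> dict:
    prompt = ("Extract the scheduling constraints from this request as JSON with "
              "keys attendees, window_days, preferred_time, avoid_days, fallback:\n"
              + request)
    return json.loads(call_llm(prompt))

constraints = parse_constraints(
    "We need a meeting with Mike in the next two weeks, afternoons if possible, "
    "not Mondays or Fridays, though a Friday afternoon is okay as a last resort.")
print(constraints)
# From here, an ordinary (non-LLM) program would email attendees, collect their
# natural-language replies, intersect availabilities, and send the invitation.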
Or integrated into Gmail.
Suddenly, it's able to highlight
a bunch of messages in my inbox
and be like, you know what?
I can handle these for you.
You're like, good.
And it's like, and they disappear.
That's where this is going to start
to enter my world
in the way that like GitHub Copilot
has already entered the world
of computer programmers.
So because the thinking and writing I do
is so highly specialized,
this sort of the impressive but generic ideation
and writing abilities of those models
isn't that relevant to me.
But the administrative overhead
that goes around being any type of knowledge worker
is poison to me.
And so, you know, that is the evolution,
the turn of this sort of product development crank
that I'm really waiting to happen.
And I'm assuming one of the things that we will see probably sometime in the near future
is, think about how Gmail currently, I guess, has some of these predictive text outputs
where if you like what it's suggesting, you can just hit tab or whatever.
And it throws a couple of words in there, but I could see that expanding to it's actually now just suggesting an entire reply. And Hey, if you like it, you just go,
yeah, you know, it sounds great. Next, next, next. Yep. Or you'll train it. And this is where you
need other programs, not just a language model, but you sort of show it examples. Like you just
tell it, like, these are the common types of messages I get. And then, like, you're kind of
telling it which example is what type. And then it sort of learns to categorize these messages. And then
it can have rules for how you deal with these different types of messages.
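A toy sketch of that idea, showing labeled example messages, a stand-in classifier, and per-category handling rules, might look like the following. Nothing here reflects how Gmail or any real assistant is implemented; the keyword-matching classifier simply stands in for a language model given those examples as a few-shot prompt.

# Labeled examples of the "common types of messages I get"; in a real system
# these would seed a few-shot prompt for a language model.
EXAMPLES = [
    ("Can you send me the invoice for May?", "billing"),
    ("Are you free to meet Thursday at 2?", "scheduling"),
    ("Loved your latest podcast episode!", "fan_mail"),
]

# A rule attached to each category: what the assistant should do with it.
RULES = {
    "billing": "forward to the bookkeeper",
    "scheduling": "hand off to the scheduling agent",
    "fan_mail": "draft a short thank-you reply for approval",
}

def classify(message: str) -> str:
    # Stand-in for a language model shown EXAMPLES as a few-shot prompt.
    # Crude keyword matching is used here only so the sketch actually runs.
    text = message.lower()
    if "invoice" in text or "payment" in text:
        return "billing"
    if "meet" in text or "calendar" in text:
        return "scheduling"
    return "fan_mail"

def handle(message: str) -> str:
    category = classify(message)
    return category + ": " + RULES[category]

print(handle("Could we meet next week to go over the draft?"))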
Yeah, it's going to be powerful. Like that's going to, that's going to start to matter,
I think, in an interesting way. I think information gathering, right? So one of the
big purposes, like in an office environment of meetings is there's certain information or opinions I need, and it's kind of
complicated to explain them all. So can we just like all get together in a room? But AI with
control programs, now like I don't necessarily need everyone to get together. I can explain like
this is the information. I need this information, this information, and a decision on this and this.
Like that AI program might be able to talk to your AI program. Like it might be able to gather
most of that information with no humans in the loop. And then there's a few places where what
it has is like questions for people and it gives it to those people's AI agent. And so there's
certain points of the day where you're talking to your agent and it like asks you some questions
and you respond and then it gets back and then all this is gathered together. And then when it
comes time to work on this project, it's all put on my desk, just like a presidential chief of staff
puts the folder on the president's desk. There it is.
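As a loose illustration of that agent-to-agent information gathering, here is a hypothetical Python sketch in which one person's agent queries other agents and queues only the unanswered questions for humans. All class and function names are invented for the example.

class PersonalAgent:
    def __init__(self, owner, known_facts):
        self.owner = owner
        self.known_facts = known_facts          # what this agent can answer itself
        self.questions_for_owner = []           # what only the human can answer

    def ask(self, question):
        # Return an answer if the agent already knows it, otherwise None.
        return self.known_facts.get(question)

    def queue_for_owner(self, question):
        # Saved up so the human can answer a batch of questions at a convenient
        # point in the day, rather than being interrupted for each one.
        self.questions_for_owner.append(question)

def gather(project_questions, agents, fallback_agent):
    # One agent asks the others; humans are only pulled in for the gaps.
    folder = {}
    for question in project_questions:
        for agent in agents:
            reply = agent.ask(question)
            if reply is not None:
                folder[question] = reply
                break
        else:
            fallback_agent.queue_for_owner(question)
            folder[question] = "(waiting on a human)"
    return folder

mike = PersonalAgent("Mike", {"Q2 ad budget?": "$40k"})
cal = PersonalAgent("Cal", {"Draft deadline?": "July 15"})
print(gather(["Q2 ad budget?", "Draft deadline?", "Launch date?"], [mike, cal], mike))
print("Queued for Mike to answer:", mike.questions_for_owner)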
This is where I think people need to be focused in knowledge work
and LLMs and not get too caught up in thinking about, again, a chat window
into an oracle as being the end
all of what this technology could be.
Again, it's when it gets smaller that its impact gets bigger.
That's when things are going to start to get interesting.
Final comment I'll add is in my work, because I've said a number of times that I'm using
it quite a bit.
And just in case anybody's wondering, because that seems to contradict what you said,
because in some ways my work is very specialized. And that is where I use it the most. If I think about my health and fitness-related work,
I found it helpful at a high level for generating overviews. So I want to create some content on a
topic and I want to make sure that I'm being comprehensive. I'm not forgetting about something
that should be in there. And so I find it helpful to take something like if it's just an
outline for an article I want to write and just ask it, does this look right to you? Am I
missing anything? How might you make this better? Those types of simple little
interactions are helpful. Also applying that to specific materials. So again, is there anything here that seems to be incorrect to
you? Or is there anything that you would add to make this better? Sometimes I get utility out of
that. And then where I've found it most useful actually is in a, it's really just hobby work.
My original interest in writing actually was fiction going back to, I don't know, I was
17, 18 years old. And it's kind of been an abiding interest that I put on the back burner to focus
on other things for a while. Now I've brought it back to not a front burner, but maybe I bring it
to a front burner and then I put it back and then bring it and put it back. And so for that,
I found it extremely helpful because that process started with me reading a bunch of books on
storytelling and fiction so I can understand the art and science of storytelling beyond just my
individual judgment or taste. Pulling out highlights, notes, things, I'm like,
that's useful, that's good. Organizing those things into a system of checklists really to go through. So, okay, you want to create characters. There are principles
that go into doing this well. Here they are in a checklist and working with GPT in particular
through that process is, I mean, that is extremely useful because again, as this context builds in this chat, in the exact case of building
a character, it understands quote unquote, the psychology and it understands probably in some
ways more so than any human could because it also understands the, or in a sense can produce
the right answers to questions that are now also given the context of people like this character
that you're building. And so much of putting together a story is actually just logical
problem solving. There are maybe some elements that you could say are more purely creative,
but as you start to put all the scaffolding there, a lot of it now is you've kind of built
constraints of a story world and characters and how things are supposed to work. And it becomes more and more just logical problem
solving. And because these LLMs are so good with language in particular, that has been actually a
lot of fun to see how all these things come together. And it saves a tremendous amount of
time. It's not just about copy and pasting the answers. So much of
the material that it generates is great. And so anyway, just to give context for listeners,
because that's how I've been using it both in my fitness work, but it's been actually more useful
in the fiction hobby. Yeah. And one thing to point out about those examples is they're both focused on like
the production of text under sort of clearly defined constraints, which like language models
are fantastic at. And so for a lot of knowledge work jobs, there is text produced as a part of
those jobs, but either it's not necessarily core, you know, it's like the text that shows up in
emails or something like this, or yeah, they're not getting paid to write the emails.
Yeah. And in that case, the constraints aren't clear, right? So like the issue with like email text is like the text is not complicated text, but the constraints are like very business and
personality specific. Like, okay, well, so-and-so is a little bit nervous about getting out of the
loop and we need to make sure they feel better about that. But there's this other initiative
going on and it's too complicated for people.
I can't get these constraints to my language model.
So that's why,
so I think people who are generating content with clear constraints,
which is like a lot of what you're doing,
these language models are great.
And by the way,
I think most computer programming is that as well,
is producing content under very clear constraints.
It compiles and solves this problem.
And this is
why, to put this in the context of what I'm saying, so for the knowledge workers that don't do that,
this is where we're going to have the impact of these tools come in and say, okay, well,
these other things you're doing, that's not just a production of text and clear constraints.
We can do those things individually or take those off your plate by having to kind of program into
explicit programs the constraints of what this is. Like, oh, this is an email in this type of company. This is a calendar or whatever. So one way or the other,
this is going to get into what most knowledge workers do. But you're in a fantastic position
to sort of see the power of these next generation of models up close because it was already a match
for what you're doing. And as you would describe, you would say, this has really changed the feel of your day. It's opened things up. So I think that's an optimistic look ahead to the future.
And in using what now is just this big, unwieldy model that's kind of good at a lot of things,
not great really at anything, in a more specific manner that you've been talking about in this
interview, where not only is the task specific, I think it's a general tip for anybody listening who can get some utility out of these tools, the more specific
you can be, the better. And so in my case, there are many instances where I want to have a discussion
about something related to this story. And I'm working through this little system that I'm
putting together, but I'm feeding it. I'm like even defining the terms for it. So, okay, we're
going to talk about, we're going to go through a whole checklist related to creating a premise for
a story, but here's specifically, here's what I mean by premise. And that now is me pulling material
from several books that I read and I kind of cobbled together. I think this is the definition
that I like of premise. This is what we're going
for very specifically, feed that into it. And so I've been able to do a lot of that as well,
which is again, creating a very specific context for it to work in. And the more
hyper-specific I get, the better the results. Yep. And more and more in the future,
the bespoke tool will have all that specificity built in. So you can just get to doing the thing
you're already doing, but now suddenly it's much easier. Yep. Well, I've kept you over. I appreciate
the accommodation there. I really enjoyed the discussion and want to thank you again. And
before we wrap up again, let's just let people
know where they can find you, find your work. You have a new book that recently came out.
If people liked listening to you for this hour and 20 minutes or so, I'm sure they'll like the
book, as well as your other books. Yeah. I guess the background on me is that
I'm a computer scientist, but I write a lot about the impact of technologies on our life and work
and what we can do about it in response.
So, you know, you can find out more about me at calnewport.com.
You can find my New Yorker archive at newyorker.com where I write about these issues.
My new book is called Slow Productivity.
It's about how technology, like email, for example, and smartphones and laptops, sped up knowledge work until it was overly frenetic and stressful, and how we can reprogram our thinking of productivity to make
it reasonable. Again, we talked about that when I was on the show before. So definitely check that
out as well. Gives a kind of a, almost a framework that is actually very relevant to this discussion.
Oh yeah. Yeah. And, and, you know, the motivation for that whole book
is technology, too.
Like, again, technology
sort of changed knowledge work.
Now we have to take back
control of the reins,
but also, right,
the vision of knowledge work is one,
the slow productivity vision is one,
and where AI could definitely
play a really good role
is it takes a bunch of this
freneticism off your plate,
potentially, and allows you
to focus more on what matters.
I guess I should mention
I have a podcast as well,
Deep Questions, where I take questions from my audience about all these types of issues and then get into the weeds, get nitty-gritty, give some specific advice. You
can find that. That's also on YouTube as well. Awesome. Well, thanks again, Cal. I appreciate it.
Thanks, Mike. Always a pleasure.
How would you like to know a little secret that will help you get into the best shape of your life?
Here it is.
The business model for my VIP coaching service sucks.
Boom, mic drop.
And what in the fiddly frack am I talking about?
Well, while most coaching businesses try to keep their clients around for as long as possible,
I take a different approach.
You see, my team and I, we don't just help you build your best body ever.
I mean, we do that.
We figure out your calories and macros, and we create custom diet and training plans based
on your goals and your circumstances.
And we make adjustments depending on how your body responds,
and we help you ingrain the right eating and exercise habits so you can develop a healthy
and a sustainable relationship with food and training and more. But then there's the kicker,
because once you are thrilled with your results, we ask you to fire us. Seriously,
you've heard the phrase, give a man a fish and you feed him for a day, teach him to fish and
you feed him for a lifetime. Well, that summarizes how my one-on-one coaching service works. And
that's why it doesn't make nearly as much coin as it could, but I'm okay with that because my
mission is not to just help you gain muscle and lose fat. It's to give you the tools and to give
you the know-how that you need to forge ahead in your fitness without me. So dig this. When you
sign up for my coaching, we don't just take you by the hand and walk you through the entire process of building a
body you can be proud of. We also teach you the all-important whys behind the hows, the key
principles, and the key techniques you need to understand to become your own coach. And the best
part? It only takes 90 days. So instead of going it alone this year,
why not try something different?
Head over to muscleforlife.show slash VIP.
That is muscleforlife.show slash VIP
and schedule your free consultation call now.
And let's see if my one-on-one coaching service
is right for you.
Well, I hope you liked this episode.
I hope you found it helpful.
And if you did, subscribe to the show
because it makes sure that you don't miss new episodes.
And it also helps me
because it increases the rankings of the show a little bit,
which of course then makes it a little bit
more easily found by other people
who may like it just as much as you.
And if you didn't like something about this episode or about the show in general, or if you have ideas or suggestions or
just feedback to share, shoot me an email, mike at muscleforlife.com, muscleforlife.com,
and let me know what I could do better or just what your thoughts are about maybe what you'd
like to see me do in
the future. I read everything myself. I'm always looking for new ideas and constructive feedback.
So thanks again for listening to this episode, and I hope to hear from you soon.