Lex Fridman Podcast - #115 – Dileep George: Brain-Inspired AI

Episode Date: August 15, 2020

Dileep George is a researcher at the intersection of neuroscience and artificial intelligence, co-founder of Vicarious, formerly co-founder of Numenta. From the early work on Hierarchical Temporal Memory to Recursive Cortical Networks to today, Dileep has always sought to engineer intelligence that is closely inspired by the human brain.

Support this channel by supporting our sponsors. Click links, get discount:
- Babbel: https://babbel.com and use code LEX
- MasterClass: https://masterclass.com/lex
- Raycon: https://buyraycon.com/lex

If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on Apple Podcasts, follow on Spotify, or support it on Patreon.

Here's the outline of the episode. On some podcast players you should be able to click the timestamp to jump to that time.

OUTLINE:
0:00 - Introduction
4:50 - Building a model of the brain
17:11 - Visual cortex
27:50 - Probabilistic graphical models
31:35 - Encoding information in the brain
36:56 - Recursive Cortical Network
51:09 - Solving CAPTCHAs algorithmically
1:06:48 - Hype around brain-inspired AI
1:18:21 - How does the brain learn?
1:21:32 - Perception and cognition
1:25:43 - Open problems in brain-inspired AI
1:30:33 - GPT-3
1:40:41 - Memory
1:45:08 - Neuralink
1:51:32 - Consciousness
1:57:59 - Book recommendations
2:06:49 - Meaning of life

Transcript
Starting point is 00:00:00 The following is a conversation with Dileep George, a researcher at the intersection of neuroscience and artificial intelligence, co-founder of Vicarious with Scott Phoenix, and formerly co-founder of Numenta with Jeff Hawkins, who has been on this podcast, and Donna Dubinsky. From his early work on hierarchical temporal memory to recursive cortical networks to today, Dileep has always sought to engineer intelligence that is closely inspired by the human brain. As a side note, I think we understand very little about the fundamental principles underlying the function of the human brain, but the little we do know gives hints that may be more useful for engineering intelligence than any idea in mathematics, computer science, physics, or scientific fields outside of biology.
Starting point is 00:00:50 And so the brain is a kind of existence proof that says it's possible. Keep at it. I should also say that brain-inspired AI is often overhyped and used as fodder, just as quantum computing is, for marketing speak. But I'm not afraid of exploring these sometimes overhyped areas, since where there's smoke, there's sometimes fire. Quick summary of the ads. Three sponsors: Babbel, Raycon earbuds, and MasterClass. Please consider supporting this podcast by clicking the special links in the description to get the discount. It really is the best way to support this podcast. If you enjoy this thing, subscribe on YouTube, review it 5 stars on Apple Podcasts, support
Starting point is 00:01:34 it on Patreon, or connect with me on Twitter at lexfridman. As usual, I'll do a few minutes of ads now and never any ads in the middle that can break the flow of the conversation. This show is sponsored by Babbel, an app and website that gets you speaking in a new language within weeks. Go to babbel.com and use code LEX to get three months free. They offer 14 languages, including Spanish, French, Italian, German, and, yes, Russian. Daily lessons are 10 to 15 minutes, super easy, effective, designed by over 100 language experts. Let me read a few lines from the Russian poem
Starting point is 00:02:14 Noch, Ulitsa, Fonar, Apteka by Alexander Blok, which you'll start to understand if you sign up for Babbel. Noch, ulitsa, fonar, apteka. Bessmyslenny i tuskly svet. Zhivi yeshcho khot chetvert veka. Vsyo budet tak. Iskhoda net. [Night, street, lamp, drugstore. A meaningless and dim light. Live on another quarter century; all will be the same. There is no way out.] Now, I say that you'll only start to understand this poem,
Starting point is 00:02:39 because Russian starts with a language and ends with vodka. Now, the latter part is definitely not endorsed or provided by Babbel, and will probably lose me this sponsorship. But once you graduate from Babbel, you can enroll in my advanced course of late-night Russian conversation over vodka. I have not yet developed that course enough. It's in progress. So get started by visiting babbel.com and use code LEX to get three months free.
Starting point is 00:03:06 This show is sponsored by Raycon earbuds. Get them at buyraycon.com slash Lex. They've become my main method of listening to podcasts, audiobooks, and music when I run, do push-ups and pull-ups, or just live life. In fact, I often listen to brown noise with them when I'm thinking deeply about something; it helps me focus. They're super comfortable, pair easily, great sound, great bass, six hours of playtime. I've been putting in a lot of miles to get ready for a potential ultramarathon and listening
Starting point is 00:03:38 to audiobooks on World War II. The sound is rich and really comes in clear. So again, get them at buyraycon.com slash Lex. This show is sponsored by MasterClass. Sign up at masterclass.com slash Lex to get a discount and to support this podcast. When I first heard about MasterClass, I thought it was too good to be true.
Starting point is 00:04:01 I still think it's too good to be true. For 180 bucks a year, you get an all-access pass to watch courses from, to list some of my favorites: Chris Hadfield on space exploration, Neil deGrasse Tyson on scientific thinking and communication, Will Wright, creator of SimCity and The Sims, on game design. Every time I do this read, I really want to play a city-builder game. Carlos Santana on guitar, Garry Kasparov on chess, Daniel Negreanu on poker, and many more. Chris Hadfield explaining how rockets work and the experience of being launched into space alone is worth the money.
Starting point is 00:04:37 By the way, you can watch it on basically any device. Once again, sign up at masterclass.com slash Lex to get a discount and to support this podcast. And now, here's my conversation with Dileep George. Do you think we need to understand the brain in order to build it? Yes, if you want to build the brain, we definitely need to understand how it works. So Blue Brain, Henry Markram's project, is trying to build a brain without understanding it, just trying to put details of the brain from neuroscience experiments into a giant simulation, by putting in more and more neurons, more and more details. But that is not going to work, because
Starting point is 00:05:39 when it doesn't perform as you expect it to, then what do you do? You just keep adding more details. How do you debug it? So unless you understand, unless you have a theory about how the system is supposed to work, how the pieces are supposed to fit together, what they're going to contribute, you can't build it. Understand it at the functional level. So can you actually linger on it and describe the Blue Brain Project? It's kind of a fascinating
Starting point is 00:06:09 Principle and idea to try to simulate the brain is we're talking about the human brain, right? right human brains and rad brains of cat brains have lots in common that the cortex the neocortex Structure is very similar. So initially they were trying to just simulate a cat brain and to understand the nature of evil. That is the nature of evil. Or as it happens in most of these simulations, you easily get one thing out, which is oscillations. If you simulate a large number of neurons, they oscillate. And you can adjust the parameters and say that
Starting point is 00:06:52 the oscillations match the rhythms that we see in the brain, et cetera. But... Oh, I see. So is the simulation at the level of individual neurons? Yeah. So the Blue Brain Project, the original idea as proposed, was: you put in very detailed biophysical neurons,
Starting point is 00:07:15 biophysical models of neurons, and you interconnect them according to the statistics of connections that we have found from real neuroscience experiments, and then turn it on and see what happens. And these neuron models are incredibly complicated in themselves, right? These neurons are modeled using this idea called Hodgkin-Huxley models, which are about how signals propagate in a cable. And there are active dendrites, all those phenomena, which themselves we don't understand that well. And then we put in connectivity, which is part guesswork, part
Starting point is 00:07:59 observed. And of course, if we do not have any theory about how it is supposed to work, we just have to take whatever comes out of it as, okay, this is something interesting. But in your sense, are these models of the way the signal travels along the axons, all these basic models, are they too crude? Oh, well, actually, they are pretty detailed and pretty sophisticated, and they do replicate the neural dynamics. If you take a single neuron and you try to turn on the different channels, the calcium channels and the different receptors, and see what the effect of turning those channels on or off is on the neuron's spike output, people have built pretty sophisticated models of that,
Starting point is 00:08:53 and they are, I would say, in the regime of correct. Well, see, the correctness, that's interesting, because is the correctness measured by looking at some kind of aggregate statistics? It would be more the spiking dynamics. The spiking dynamics of a single neuron, okay. Yeah. And these models, because they are going to the level of mechanism, right? They are basically looking at, okay, what is the effect of turning on an ion channel? And you can model that using electric circuits. So it is not just function fitting; people are looking at the mechanism underlying it and putting that in terms of electric circuit theory, signal propagation theory, and modeling that.
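(For a concrete sense of the mechanism-level modeling being described here, below is a minimal single-compartment Hodgkin-Huxley simulation. It is a textbook sketch with standard squid-axon parameters, not Blue Brain's actual multi-compartment code.)

```python
import numpy as np

# Minimal single-compartment Hodgkin-Huxley neuron (standard squid-axon
# parameters; voltages in mV, time in ms). Forward-Euler integration.
C_m = 1.0                               # membrane capacitance, uF/cm^2
g_Na, g_K, g_L = 120.0, 36.0, 0.3       # max channel conductances, mS/cm^2
E_Na, E_K, E_L = 50.0, -77.0, -54.387   # reversal potentials, mV

# Voltage-dependent opening/closing rates for the m, h, n gating variables.
def alpha_m(V): return 0.1 * (V + 40.0) / (1.0 - np.exp(-(V + 40.0) / 10.0))
def beta_m(V):  return 4.0 * np.exp(-(V + 65.0) / 18.0)
def alpha_h(V): return 0.07 * np.exp(-(V + 65.0) / 20.0)
def beta_h(V):  return 1.0 / (1.0 + np.exp(-(V + 35.0) / 10.0))
def alpha_n(V): return 0.01 * (V + 55.0) / (1.0 - np.exp(-(V + 55.0) / 10.0))
def beta_n(V):  return 0.125 * np.exp(-(V + 65.0) / 80.0)

dt, T = 0.01, 50.0                      # time step and duration, ms
V, m, h, n = -65.0, 0.05, 0.6, 0.32     # resting initial conditions
spikes = 0
for step in range(int(T / dt)):
    I_ext = 10.0                        # injected current, uA/cm^2
    # Ionic currents through the sodium, potassium, and leak channels.
    I_Na = g_Na * m**3 * h * (V - E_Na)
    I_K  = g_K * n**4 * (V - E_K)
    I_L  = g_L * (V - E_L)
    V_new = V + dt * (I_ext - I_Na - I_K - I_L) / C_m
    # Gating variables relax toward their voltage-dependent steady states.
    m += dt * (alpha_m(V) * (1 - m) - beta_m(V) * m)
    h += dt * (alpha_h(V) * (1 - h) - beta_h(V) * h)
    n += dt * (alpha_n(V) * (1 - n) - beta_n(V) * n)
    if V < 0.0 <= V_new:                # count upward zero crossings as spikes
        spikes += 1
    V = V_new
print(f"spikes in {T:.0f} ms: {spikes}")
```

Detailed brain simulations stack many compartments like this per neuron, which is why, as Dileep says next, getting the single-neuron model right still doesn't tell you what the circuit computes.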
Starting point is 00:09:46 And so those models are sophisticated, but getting a single neuron's model 99% right still does not tell you how to build a brain. It would be the analog of getting a transistor model right and now trying to build a microprocessor. If you did not understand how a microprocessor works, but you say, oh, I can model one transistor well, and now I will just try to interconnect the transistors
Starting point is 00:10:17 according to whatever I could guess from the experiments and try to simulate it, then it is very unlikely that you will produce a functioning microprocessor. When you want to produce a functioning microprocessor, you want to understand Boolean logic, how the gates work, all those things, and then understand how those gates get implemented using transistors. Yeah, this reminds me, there's a paper, maybe you're familiar with it, that I remember going through in a reading group, which approaches the microprocessor from the perspective of a neuroscientist. It basically uses all the tools that we have in neuroscience to try to understand, as if aliens showed up to study computers, whether those tools could be used to get any kind of sense of how the microprocessor works.
Starting point is 00:11:11 I think the final takeaway from at least this initial exploration is that we're screwed. There's no way that the tools in your ass ass would be able to get us to anything, like not even bullying logic. I mean, it's just any aspect of the architecture of the function of the processes involved, the clocks, the timing, all that, you can't figure that out from the tools of neuroscience. Yeah, so I'm very familiar with this particular paper. I think it was called Can a Neuro Scientist understand a microprocessor? Yeah, something like that.
Starting point is 00:11:53 Following the methodology in that paper, even an electrical engineer would not understand a microprocessor. So I don't think it is that bad, in the sense that neuroscientists do find valuable things by observing the brain. They do find good insights. But those insights cannot be put together
Starting point is 00:12:19 into just a simulation. You have to investigate what the computational underpinnings of those findings are. How do all of them fit together from an information-processing perspective? Somebody has to painstakingly put those things together and build hypotheses. So I don't want to diss all of neuroscience and say, oh, they are not finding anything. That paper almost went to the level of saying neuroscientists will never understand. No, that's not true.
Starting point is 00:12:51 I think they do find lots of useful things, but it has to be put together in a computational framework. Yeah, but just the AI systems are building into this podcast a hundred years from now, and there will probably and there's some non-zero probability that'll find your words laughable. That's like I remember humans thought they understood something about the brain that are totally cool is there's a sense about neuroscience that we may be in the very, very early days of understanding the brain. But I mean that's one perspective. In your
Starting point is 00:13:28 understanding the brain. But I mean, that's one perspective. In your perspective, how far are we into understanding any aspect of the brain, so the dynamics of the individual and your communication to the how in a collective sense, how they're able to store information, transfer information, how the intelligence and emerges all that kind of stuff. Where are we on that timeline? Yeah. So, you know, timelines are very, very hard to predict. And you can of course be wrong.
Starting point is 00:13:57 And you can be wrong on either side. We know, when we look back, that the first flight was in 1903. In 1900, there was a New York Times article on flying machines that do not fly, saying humans might not fly for another hundred years. That was what that article stated. But no, they flew three years after that. So it's very hard to... Well, and on that point, one of the Wright brothers, I think two years before, said some number like 50 years, that he had become convinced it was impossible,
Starting point is 00:14:47 Even during that experimentation, yeah, yeah, yeah, I mean, that's distributed to when it's like the entrepreneurial battle of like depression of going through just like thinking there's this is impossible. But there, yeah, there's something even the person that's in it is not able to see us to make correctly. Exactly. But I can, I can tell from the point of, you know, objectively, what are the things that we know about the brain and how that can be used to build AI models, which can then go back and inform how the brain works.
Starting point is 00:15:18 So my way of understanding the brain would be to basically say, look at the insights neuroscientists have found. Understand that from a computational angle, information processing angle, build models using that. And then building that model which functions, which is a functional model, which is doing the task that we want the model to do. It is not just trying to model a phenomena in the brain. It is trying to do what the brain is trying to do on the model to do. It is not just trying to model a phenomena in the brain. It is trying to do what the brain is trying to do on the whole functional level. And building that model will help you fill in the missing pieces that, you know, biology just gives you the hints. And building the
Starting point is 00:15:57 model, you know, fills in the rest of the pieces of the puzzle. And then you can go and connect that back to biology and say, okay, now it makes sense that this part of the brain is doing this or this layer in the cortical circuit is doing this. And then continue this iteratively because now that will inform new experiments in neuroscience. And of course, you course, building the model and verifying that in the real world, will you also tell you more about, does the model actually work? And you can refine the model, find better ways of putting these neuroscience insights
Starting point is 00:16:35 together. So I would say neuroscience alone, just from experimentation, will not be able to build a model of the brain, a functional model of the brain. There are lots of efforts, very impressive efforts, in collecting more
Starting point is 00:16:55 and more connectivity data from the brain. How are the microcircuits of the brain connected with each other? That's beautiful, by the way. Those are beautiful. And at the same time, those do not, by themselves, convey the story of how it works. Somebody has to understand, okay, why are they connected like that?
Starting point is 00:17:19 And what are those things doing? And we do that by building models in AI using hints from neuroscience and repeat the cycle. So what aspect of the brain are useful in this whole endeavor, which by the way I should say you're both in neurosciences and AI person, I guess the dream is to both understand the brain and to build AGI systems. So here, it's like an engineer's perspective of trying to understand the brain. So what aspects of the brain function is speaking, like you said, you find interesting. Yeah, quite a lot of things. So one is, if you look at the visual cortex, and visual cortex is a large part of the brain, I forgot the exact fraction, but it's a huge part of our brain area is occupied by just vision.
Starting point is 00:18:16 So vision, visual cortex is not just a feed-forward cascade of neurons. There are a lot more feedback connections in the brain compared to the feed forward connections. And it is surprising to the level of detail neuroscientists have actually studied this. If you go into neurosensitivity and poke around and ask, have they studied what will be the effect of poking a neuron in level IT in level V1 and have they studied that and you will say,
Starting point is 00:18:50 yes, they have studied that. Every possible combination. It's not a random exploration at all, it's a very hypothesis driven, right? They are very experimental neuroscientists are very, very systematic in how they probe the brain because experiments are very experimental neuroscientists are very, very systematic in how they probe the brain because experiments are very costly to contact. They take a lot of preparation. They need a lot of control. So they are very hypothesis driven in how they probe the brain. And often what I find is that when we have a question in AI about, has anybody probed how lateral connections in the brain works? And when you go and read the literature, yes, people have probed it and people have probed it very systematically.
Starting point is 00:19:33 And they have hypothesis about how those lateral connections are supposedly contributing to visual processing. But of course, they have been built very, very functional detail model software. By the way, how do the in-house studies site interrupt? Do they stimulate like a neuron in one particular area of the visual cortex and then see how the signal travels kind of thing? Fascinating, very, very fascinating experiments. So I can give you one example I was impressed with. So before going to that, let me give you,
Starting point is 00:20:05 or we have how the layers in the cortex are organized. Visual cortex is organized into roughly four hierarchical levels. So V1, V2, V4, IT. And in V1. We're happy to V3. Well, yeah, there's another pathway. OK, so I'm talking about just object recognition pathway. All right. And then in V1 itself, so it's, there is a very detailed microsurkit in V1 itself. That is, there is organization within a level itself. The cortical sheet is organized into multiple layers, and there are columnar structure. And this layer wise and columnar structure is repeated
Starting point is 00:20:50 and V1, V2, V4, IT, all of them. And the connections between these layers within a level, in V1 itself, there are six layers, roughly. And the connections between them, there is a particular structure to them. And now, so one example of an experiment people did is when I, when you present a stimulus, which is, let's say, requires separating the foreground from the background of an object. So it is a textured triangle on a textured background. And you can check, does the surface settle first or does the contour settle first? Settle in the sense that the so when you finally form the percept of the triangle,
Starting point is 00:21:45 you understand where the contours of the triangle are and you also know where the inside of the triangle is, that's when you form the final percept. Now, you can ask what is the dynamics of forming that final percept? Do the neurons first find the edges and converge on where the edges are. And then they find the inner surfaces or does it go the other way around? So what's the answer?
Starting point is 00:22:16 In this case, it turns out that it first settles on the edges. It converges on the edge hypothesis first, and then the surfaces are filled in, from the edges to the inside. That's fascinating. And the detail to which you can study this is amazing: you can actually not only find the temporal dynamics of when this happens,
Starting point is 00:22:40 you can also find which layer in V1 is encoding the edges, which layer is encoding the surfaces, which layer is encoding the feedback, which layer is encoding the feedforward, and what combination of them produces the final percept. And these kinds of experiments stand out when you try to explain illusions. One example, a favorite illusion of mine, is the Kanizsa triangle. I don't know whether you are familiar with this one. This is an example where it's a triangle, but only the corners of the triangle are shown in the stimulus. So they look kind of like Pac-Men. Oh, the black Pac-Man. Exactly.
Starting point is 00:23:26 And then you start to see. Your visual system hallucinates the edges. Yeah. And you can, you know, when you look at it, you will see a faint edge, right? And you can go inside the brain and look, you know, do actually neurons signal the presence of this edge. And if there's signal, how do they do it? Because
Starting point is 00:23:46 they are not receiving anything from the input. In the input is black for those neurons. Right. So how do they signal it? When does the signaling happen? You know, does it, you know, so so if a real contour is present in the input, then the signal, the neurons immediately signal, okay, there is an edge here. When it is an illusory edge, it is clearly not in the input, it is coming from the context. So those neurons fire later, and you can say that, okay, it's the feedback connections that is causing them to fire, and they happen later, and you can find the dynamics of them. So this study is a very impressive and very detailed. So by the way, just to step back,
Starting point is 00:24:34 you said that there may be more feedback connections than the feed forward connections. Yeah. First of all, it's just for like a machine learning folks. Yeah, I mean, that's crazy for like a machine learning folks. Yeah. I mean, that's crazy that there's all these feedback connections. Like, we often think about, I have a thing, thanks to deep learning, you start to think about the human brain as a kind of feed-forward mechanism.
Starting point is 00:25:01 Right. So what the heck are these feedback connections? Yeah. What's the dynamics? What are we supposed to think about them? Yeah. So this is this fits in their very beautiful picture about how the brain works. Right. So the beautiful picture of how the brain works is that our brain is building a model of the world. I know. So our visual system is building a model of how objects behave
Starting point is 00:25:29 in the world. And we are constantly projecting that model back onto the world. So what we are seeing is not just a feed-forward thing that just gets interpreted in an infinite world, but we are constantly projecting our expectations onto the world. And what the final person is a combination of what we project onto the world
Starting point is 00:25:50 combined with what the actual sensory input is. Almost like trying to calculate the difference and then trying to interpret the difference. Yeah, it's I wouldn't put just calculating the difference. It's more like what is the best explanation for the input stimulus based on the model of the world I have? Got it. Got it. And that's where all the illusions come in.
Starting point is 00:26:13 But that's an incredibly efficient process. So the feedback mechanism just helps you constantly. Yeah. So hallucinate how the world should be based on your world model and then just looking at if there's novelty, like trying to explain it. Yeah. Hence that's why movement, we detect movement really well, there's all these kinds of things. And this is like at all different levels of the cortex you're saying that it happens at the lowest level, the highest level.
Starting point is 00:26:46 Yes. Yeah. In fact, feedback connections are more prevalent in everywhere in the cortex. And so one way to think about it, and there's a lot of evidence for this is inference. So basically, if you have a model of the world, and when some evidence comes in,
Starting point is 00:27:04 what you are doing is inference. You are trying to now explain this evidence using your model of the world. And this inference includes projecting your model onto the evidence and taking the evidence back into the model and doing an iterative procedure. And this iterative procedure is what happens using the feed-for-word feedback propagation. And feedback affects what you see in the world, and it also affects feed-for-word propagation.
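(A toy linear version of this projection-and-correction loop, in the spirit of predictive coding; the model and numbers here are invented for illustration, not what the brain or any particular paper implements.)

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 4))   # generative model: hidden causes -> sensory input
x = W @ rng.normal(size=4)     # the evidence arriving at the senses

z = np.zeros(4)                # current belief about the hidden causes
for _ in range(500):
    prediction = W @ z         # top-down: project the model onto the input
    error = x - prediction     # bottom-up: what the model failed to explain
    z += 0.02 * (W.T @ error)  # revise beliefs using the residual error

print(np.abs(x - W @ z).max())  # near zero: the beliefs now explain the evidence
```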
Starting point is 00:27:36 And examples are everywhere. We see these kinds of things everywhere. The idea is that there can be multiple competing hypotheses in our model trying to explain the same evidence, and then you have to make them compete. And one hypothesis will explain away the other hypotheses through this competition process. Wait, wait.
Starting point is 00:28:00 So you have competing models of the world that try to explain... What do you mean by explain away? So this is a classic example in graphical models, probabilistic models. What are those? Okay, I think it's useful to mention, because we'll talk about them more. Yeah.
Starting point is 00:28:22 So neural networks are one class of machine learning models. You have distributed set of nodes, which are called the neurons. Each one is doing a dot product, and you can approximate any function using this multi-level network of neurons. So that's a class of models which are useful for function approximation. There is another class of models which are useful for function approximation.
Starting point is 00:28:45 There is another class of models in machine learning called probabilistic graphical models. And you can think of them as each node in that model is variable, which is talking about something. It can be a variable representing is an edge present in the input or not. And at the top of the network, node can be representing, is there an object present in the world or not? And then, so it is another way of encoding knowledge. And then once you encode the knowledge,
Starting point is 00:29:25 you can do inference in the right way. What is the best way to explain some set of evidence using this model that you encoded? So when you encode a model, you are encoding the relationship between these different variables. How is the edge connected to the model of the object? How is the edge connected to the model of the object? How is the surface connected to the model of the object?
Starting point is 00:29:47 And then, of course, this is a very distributed complicated model. And inference is, how do you explain a piece of evidence? When a set of stimulus comes in, if somebody tells me, there is a 50% probability that there is an edge here in this part of the model. How does that affect my belief on whether I should think that there should be a square percent in the image? So this is the process of inference. So one example of inference is having
Starting point is 00:30:17 this, exploring a way effect between multiple causes. So graphical models can be used to represent causality in the world. So let's say, you know, your alarm at home can be triggered by a burglar getting into your house or it can be triggered by an earthquake. Both can be causes of the alarm going off. So now, you're in your office, you heard burglar alarm going off, you are heading home, thinking that there's a burglar, got it. But while driving home,
Starting point is 00:30:56 if you hear on the radio that there was an earthquake in the vicinity, now your strength of evidence for a burglar getting into your house is diminished. Because now that piece of evidence is explained by the earthquake being present. So if you think about these two causes explaining at lower level variable, which is alarm, now what we're seeing is that increasing the evidence for some cause, there is evidence coming from below for alarm being present and initially it was flowing to a burglar being present but now since somebody
Starting point is 00:31:33 had some this evidence, there is side evidence for this other cause it explains every this evidence and it evidence will now flow to the other cause. This is you know two competing causal things trying to explain the same evidence. And the brain has similar kind of mechanism for doing so. That's kind of interesting. And that, how's that all encoded in the brain? Like, where's the storage of information? Are we talking just maybe to get it a little bit more specific? Is it in the hardware of the actual connections? Is it in chemical communication? Is it electrical communication? Do we know? So this is a paper that we are bringing out soon, which was this. This is the cortical micro circuits paper that I sent you a draft
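(To make explaining away concrete, here is a brute-force enumeration over that alarm network. The probability numbers are invented for illustration.)

```python
from itertools import product

# Priors P(burglar), P(earthquake), and the table P(alarm=on | burglar, earthquake).
p_b, p_e = 0.001, 0.002
p_alarm = {(1, 1): 0.95, (1, 0): 0.94, (0, 1): 0.29, (0, 0): 0.001}

def posterior_burglar(observed_earthquake=None):
    """P(burglar | alarm rang [, earthquake report]) by enumerating the joint."""
    num = den = 0.0
    for b, e in product((0, 1), repeat=2):
        if observed_earthquake is not None and e != observed_earthquake:
            continue  # condition on the earthquake observation
        joint = ((p_b if b else 1 - p_b) *
                 (p_e if e else 1 - p_e) *
                 p_alarm[(b, e)])  # the alarm is observed to be on
        den += joint
        num += joint if b else 0.0
    return num / den

print(posterior_burglar())                       # ~0.37: the alarm suggests a burglar
print(posterior_burglar(observed_earthquake=1))  # ~0.003: the earthquake explains it away
```

Observing the earthquake report collapses the posterior on the burglar from roughly 0.37 to roughly 0.003: side evidence for one cause explains away the evidence for the other.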
Starting point is 00:32:20 of. Of course, this is a lot of it is still hypothesis. One hypothesis is that you can think of a cortical column as encoding a concept. A concept, you know, think of it as a, a, a, a, a, a, a, a, a, a, a, a, an example of an example of a concept is, um, is an edge present or not, or is, is an object present or not. Okay. So you can, you can think of it as a binary variable, a binary random variable, the presence of an edge or not, or the presence of an present or not. Okay, so you can you can think of it as a binary variable, a binary random variable, the presence of an edge or not or the presence of an object or not. So each critical column can be thought of as representing that one concept, one variable. And then the connections between these critical columns are basically encoding the relationship between these random variables. And then there are connections within the cortical column.
Starting point is 00:33:06 There are each cortical column is implemented using multiple layers of neurons with very, very, very rich structure there. There are thousands of neurons in a cortical column. But as structures similar across the different cortical columns. Correct. Correct. And also these cortical columns connect
Starting point is 00:33:24 to a substructure called Thalamus. So all cortical columns pass through this substructure. So our hypothesis is that the connections between the cortical columns implement this, you know, that's where the knowledge is stored about, you know, how these different concepts connect to each other. Then the neurons inside this cortical column and in the Thalamus in combination, implement these actual computations in the data for inference, which includes explaining a way and competing between the different hypothesis. What is amazing is that neurosuroSend is actually done experiments to the tune of showing these things.
Starting point is 00:34:11 They might not be putting it in the overall inference framework, but they will show things like, if I poke this higher level neuron, it will inhibit through this complicated loop through the Thalamus, it will inhibit this other column. So they will do it such experiments. Do they use terminology of concepts? For example, is it something where it's easy to enter, to purmorphize and think about concepts like you start moving into logic-based And think about concepts like you start moving into logic based kind of reasoning systems. So I would you think of concepts in that kind of way or is it Is it a lot messier a lot more gray area? You know even even more gray even more messy than the artificial neural network kinds of abstractions It. It's the easiest way to think of it as a variable, right?
Starting point is 00:35:08 It's a binary variable, which is showing the presence or absence of something. But I guess what I'm asking is, is that something that we suppose I think is something that's human interpretable of that something? It doesn't need to be. It doesn't need to be human interpretable. There is no need for it to be human interpretable of that something. It doesn't need to be. It doesn't need to be human interpretable. There is no need for it to be human interpretable. But it's almost like you will be able to find some interpretation of it because it is connected to the other
Starting point is 00:35:37 things that you know. And the point is that it's useful somehow. Yeah. It's useful as an entity in the graph, in connecting to the other entities that are, let's call them concepts. Right. Okay. So, by the way, what are these the cortical micro circuits?
Starting point is 00:35:57 Correct. These are the cortical micro circuits. You know, that's what neuroscience is used to talk about about the circuits within a level of the cortex. So you can think of, let's think of a neural network, artificial neural network terms. People talk about the architecture of the, so how many layers they build,
Starting point is 00:36:17 what is the fan in fan out, et cetera? That is the macro architecture. So, and then within a layer of the neural network, the cortical neural network is much more structured within a level. There's a lot more intricate structure there. But even within an artificial neural network, you can think of in feature detection plus pooling as one level. And so that is kind of a micro circuit. It's much more complex in the real brain. And so within a level, what about is that circuitry within a column of the cortex
Starting point is 00:36:54 and between the layers of the cortex? That's the micro circuitry. I love that terminology. Machine learning people don't use the circuit terminology. Right. But they should. It's a nice. Okay. So that's the the cortical micrasex. So what's interesting about what can we say? What is the paper that you're working on proposed about the ideas around these cortical micrasex? So this is a fully functional model for the microcircuits of the visual cortex. So the paper focuses on your idea and our discussions now is focusing on vision. The visual cortex. Okay. Yeah. This is a model. This is a four model. This is how vision works.
Starting point is 00:37:40 But this is this is a high model of yeah. Okay, so let me let me step back a bit. So we looked at neuroscience for insights on how to build a vision model. And we synthesized all those insights into a computational model. This is called the recursive critical network model that we used for breaking captures and we are using the same model for robotic picking and tracking of objects. And that again is a vision system. That's a vision system.
Starting point is 00:38:10 Computer vision system. That's a computer vision system. Takes in images and outputs what? On one side it outputs the class of the image and also segments the image. And you can also ask it further queries. Where is the edge of the object? Where is the interior of the can also ask it further queries, where is the edge of the object, where is the interior of the object.
Starting point is 00:38:28 So it's a model that you build to answer multiple questions. So you are not trying to build a model for just classification or just segmentation, etc. It's a joint model that can do multiple things. So that's the model that we built using insights from neuroscience. And some of those insights are what is the role of feedback connections, what is the role of lateral connections. So all those things went into the model. The model actually uses feedback connections. All these ideas from neuroscience. Yeah. So what the heck is a recursive cortical network like what what are the architecture approaches interesting aspects here? Which is essentially a brain inspired approach to a computer vision?
Starting point is 00:39:13 Yeah, so there are multiple layers to this question again go from the very very top and then zoom in okay So one important thing Constraint that went into the model is that you should not think vision, think of vision as something in isolation. We should not think perception as something as a pre-processor for cognition. Perception and cognition are interconnected. And so you should not think of one problem in separation from the other problem. And so that means if you finally want to have a system that understand concepts about the world and can learn a very conceptual model of the world and can reason and connect to language, all of those things, you need to think all the way through and make sure that your perception
Starting point is 00:40:00 system is compatible with your cognition system and language system and all of them. And one aspect of that is top down controllability. What does that mean? So that means you know, so think of, you know, you can close your eyes and think about the details of one object, right. I can I can zoom in further and further. I can, you know, so think of the bottle in front of me. Now, think about what the cap of that bottle looks.
Starting point is 00:40:31 Think about what's the texture on that bottle of the cap. Think about what will happen if something hits that. So you can manipulate your visual knowledge in cognition driven ways. Yes. And so this top-down controllability and being able to simulate scenarios in the world. So you're not just a passive player in this perception game, you can control it. You have imagination. Correct. So basically having a generating network, which is a model, and it is not just
Starting point is 00:41:14 some arbitrary generating network. It has to be built in a way that it is controllable top-down. It is not just trying to generate a whole picture at once. It's not trying to generate photorealistic things of the world. You don't have good photorealistic models of the world. Human brains do not have. If I, for example, ask you the question, what is the color of the letter E in the Google logo? You have no idea. Although, I have seen it millions of times. I've made it to the hundreds of times. So, it's not our model, it's not
Starting point is 00:41:47 photorealistic, but it has other properties that we can manipulate it. And you can think about filling in a different color in that logo. You can think about expanding the letter E. So, you can imagine the know, actions that you have never performed. So, so these are the kind of characteristics the generative model need to have. So this is one constraint that went into our model. Like, you know, so this is when you read the, just the perception side of the paper, it is not obvious that this was a constraint into the, that went into the model, this top downdown controllability of the generality model. So what does top-down controllability in a model look like? It's a really interesting concept, fascinating concept. What does that recursiveness gives you that? Or how do you do it?
Starting point is 00:42:38 Quite a few things. What does the model factorize? And what is the model representing us different pieces in the puzzle? Like, you know, so in the RCN network, it thinks of the world, you know, so far as the background of an image is modeled separately from the foreground of the image.
Starting point is 00:42:58 So the objects are separate from the background. They are different entities. So there's a kind of segmentation that's built in fundamental each of the background. They are different entities. So there's a kind of segmentation that's built in fundamental nature of the story. And then even that object is composed of parts and another one is the shape of the object is differently modeled from the texture of the object. So there's like these, I've been, you know, a French martial A's. Yeah. So there's, he developed this like IQ test type of thing for arc challenge for, and it's kind of cool that there's these concepts, priors, they define that you bring to the table in
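(A toy sketch of that factorization idea, not the actual RCN generative model: a scene rendered from independent shape, texture, and background variables, so any one factor can be swapped top-down without touching the others.)

```python
import numpy as np

def render_scene(shape_mask, fg_texture, bg_texture):
    """Composite a scene from independently chosen factors.

    shape_mask: HxW boolean array, where the object is (its shape/contour).
    fg_texture, bg_texture: HxW float arrays, surface appearance.
    """
    return np.where(shape_mask, fg_texture, bg_texture)

# A toy square "object" on a noisy background.
H = W = 32
ys, xs = np.mgrid[0:H, 0:W]
square = (8 < ys) & (ys < 24) & (8 < xs) & (xs < 24)   # shape factor
stripes = ((xs // 2) % 2).astype(float)                # texture factor
noise = np.random.default_rng(0).uniform(size=(H, W))  # background factor

scene_a = render_scene(square, stripes, noise)         # striped square
scene_b = render_scene(square, 1.0 - stripes, noise)   # same shape, new texture
```

Because the factors are separate variables, top-down control is just swapping one of them: `scene_b` keeps the shape and background and changes only the texture, the way you can imagine the same bottle cap with a different surface.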
Starting point is 00:43:41 order to be able to reason about basic shapes and things in IQ tests. So here you're making it quite explicit that here are the things that you should be, these are like distinct things you should be able to model in this. Keep in mind that you can derive these from much more general principles. It doesn't, you don't need to explicitly put it as, oh, objects versus foreground versus background, the surface versus the structure. No, these are, these are derived from more fundamental principles of how, you know, what's the property of continuity of natural signals? What's the property of continuity of natural signals? Yeah, by the way, that sounds very poetic.
Starting point is 00:44:25 But yeah, so you're saying that's a, there's some low, low properties from which emerges the idea that shapes should be different than, like there should be a part of an object, there should be, I mean, exactly, kind of like Francois Tauke. I mean, there's object-ness,
Starting point is 00:44:40 there's all these things that it's kind of crazy that we humans, I guess, evolved to have because it's useful for us to perceive the world. And it derives mostly from the properties of natural signals. And so natural signals. So natural signals are the kind of things we'll perceive in the natural world. I don't know why that sounds so beautiful, natural signals, yeah.
Starting point is 00:45:06 As opposed to a QR code, which is an artificial signal that we created, humans are not very good at classifying QR codes. We are very good at saying something is a cat or a dog, but not very good at classifying QR codes. So our visual system is tuned for natural signals. And there are fundamental assumptions in the architecture that are derived from natural signals properties. I wonder when you take host and genetic drugs. Does that go into natural or is that closer to QR codes?
Starting point is 00:45:40 It's still natural. Yeah, because it is still operating using our brains. By the way, on that topic, I mean, I haven't been following. I think they're becoming legalized in certain. I can't wait until they become legalized to a degree that vision science researchers could study it. Just like through medical, chemical ways, modified, there could be ethical concerns, but that's another way to study the brain is to be able to chemically modify it. It's probably probably very long a way to figure out how to do it ethically. Yeah, but I think that our studies one died already.
Starting point is 00:46:19 Yeah, I think so. Because it's not unethical to give it to rats. Oh, that's true. That's true. There's a lot of... Struck the rats out there. Okay. Cool. Sorry.
Starting point is 00:46:34 It's okay. So there's these low-level things from natural signals that... From which these properties will emerge. Yes. But it is still a very hard problem on how to encode that. So you don't, there is no, so you mentioned the priors,
Starting point is 00:46:58 Francho wanted to encode in the abstract reasoning challenge. But it is not straightforward how to encode those priors. So some of those challenges, like the object recognition challenges, are things that we purely use our visual system to do. It looks like abstract reasoning, but it is purely an output of the vision system. For example, completing the corners of that concert triangle,
Starting point is 00:47:22 completing the lines of that concert triangle. It's a purely visual system property. There is no abstract reasoning involved. It uses all these priors, but it is stored in our visual system in a particular way that is amenable to inference. And that is one of the things that we tackled in the, basically saying, OK, these are the prior knowledge, which will be derived from the word, but then how is that prior knowledge represented in the model such that inference when when some piece of evidence comes in can be done very efficiently and in a very distributed way. It is very, there are so many ways of representing knowledge, which is not amenable to very quick
Starting point is 00:48:05 inference, you know, quick lookups. And so that's one core part of what we tackled in the RCN model. How do you encode visual knowledge to do very quick inference? And yeah, can you maybe comment on, so folks listening to this in general, maybe familiar with different kinds of architectures of neural networks? What what are we talking about with the RCN? What are what is the architectural look like? What are different components? Is it close to neural networks? Is it far away from neural networks?
Starting point is 00:48:39 What does it look like? Yeah. So so you can think of the delta between the model and a convolutional neural network. If people are familiar with convolutional neural networks, so convolutional neural networks have this feedforward processing cascade, which is called feature detectors and pooling, and that is repeated in the in the hierarchy in a multi-level system. And if you want to and intuitive idea of what is happening feature detectors are you know detecting interesting co-occurrences in the input. It can be a line, a corner,
Starting point is 00:49:14 an eye or a piece of texture etc. And the pooling neurons are doing some local transformation of that and making it invariant to local transformations. So this is what the structure of convolutional neural network is. Recurstic vertical network has a similar structure when you look at just the feed-forward pathway. But in addition to that, it is also structured in a way that it is generative. So that again, it can run it backward and combine the forward with the backward. Another aspect that it has is it has lateral connections. This lateral connections, which is between, so if you have an edge here and an edge here,
Starting point is 00:49:57 it has connections between these edges. It is not just feed forward connections. It is something between these edges, which is the nodes are presenting these edges, which is to enforce compatibility between them. So otherwise, what will happen is that constraints. It's a constraint. It's basically, if you do just feature detection followed by pooling, then your transformations in different parts of the visual field are not coordinated. And so you can, you will create a jagged, when you generate from the model, you will create jagged things and uncoordinated transformations. So these lateral connections are enforcing the transformations. Is the whole thing still different, Sheribu? No. Okay. No. It's not trade-using back prop. Okay, that's really important. So so there's these feed forward, there's feedback mechanisms, there's some interesting connectivity
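(A tiny sketch of what a lateral compatibility constraint buys; the scores are invented for illustration. Two neighboring edge nodes each pick a local shift; joint inference over the lateral connection prefers a coordinated contour where independent pooling picks a jagged one.)

```python
import itertools
import numpy as np

# Two neighboring "pool" nodes, each choosing among 3 local shifts for an edge.
# Unary scores: how well each shift matches the local image evidence.
unary = np.array([[0.9, 0.5, 0.1],    # node A's evidence prefers shift 0
                  [0.3, 0.6, 0.8]])   # node B's evidence prefers shift 2

# Lateral compatibility: neighboring edges should pick similar shifts,
# otherwise the generated contour comes out jagged.
def lateral(a, b):
    return -abs(a - b)                # penalize disagreement between neighbors

# Independent pooling (plain conv-net style) ignores the constraint:
print(unary.argmax(axis=1))           # [0 2] -> a jagged, uncoordinated pair

# Joint inference over the lateral connection picks a coherent pair:
best = max(itertools.product(range(3), repeat=2),
           key=lambda s: unary[0, s[0]] + unary[1, s[1]] + lateral(*s))
print(best)                           # (0, 0) -> coordinated transformations
```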
Starting point is 00:50:54 things. It's still layered. Like multiple layers. Okay, very interesting. And yeah, okay, very, very interesting. Yeah, okay, so the interconnection between adjacent, the connections across service constraints, that keep the thing stable. Got it. Okay, so what else? Then there's this idea of doing inference. A neural network does not do inference on the fly. So an example of why this inference is important is, you know,
Starting point is 00:51:26 so one of the first applications that we showed in the paper was to crack text-based captures. What are captures, by the way? Oh, yep. By the way, one of the most awesome, like the people who use this term anymore is human computation, I think. I love this term. The guy who created captures, I think, came up with this term. Yeah. I love it. Anyway. Yeah. Yeah. What are captures? So, captures are those strings that you fill in when you're, you know, if you're opening a new account in Google, they show you a picture, you know, usually it used to be set of garbled letters that you have to kind of figure out what, what, what is that
Starting point is 00:52:08 string of characters and type in. And the reason cap just exists is because, you know, Google or Twitter do not want automatic creation of accounts. You can use a computer to create millions of accounts and use that for in FADS purposes. So you want to make sure that to the extent possible, the interaction that their system is having is with a human. So it's called a human interaction proof. A capture is a human interaction proof. So this is a capture that's are by design, things that are easy for humans to solve, but hard for computer. Hard for robots.
Starting point is 00:52:49 Yeah. So, and text-based captures was the one which is prevalent around 2014, because at that time text-based capture were hard for computers to crack. Even now, they are actually in the sense of an arbitrary text-based capture will be unsolvable even now. But with the techniques that we have developed, it can be, you know, you can quickly develop a mechanism that solves the capture. They probably got a lot harder too. They've been getting clever and clever generating these text characters. Yeah. Right. Right. So, OK. So, that was one of the things you've tested it on
Starting point is 00:53:26 is these kinds of captures. In 2014, 2015, that kind of stuff. So, why the way, why captures? Yeah. Yeah. Even now, I would say capture is a very, very good challenge problem if you want to understand how human perception works.
Starting point is 00:53:44 And if you want to build systems human perception works and if you want to build systems that work like the human brain. And I wouldn't say capture is a solved problem. We have cracked the fundamental defense of captures, but it is not solved in the way that human solved it. So I can give you an example. I can take a five year old child who has just learned characters and show them any new capture that we create. They will be able to solve it. I can show you pretty much any new capture from any new website. You will be able to solve it without getting any training examples from that particular style of capture. You're assuming I'm human, yeah. Yes, yeah. That's right. So if you are human, otherwise I will be able to figure that out
Starting point is 00:54:31 using this one. But this whole podcast is just a touring test, a long touring test. Anyway, I'm sorry. So yeah, so human humans can figure it out with very few examples. Or not training examples, examples, no training examples from that particular style of capture. And so you can, you know, so even now this is unreachable for the current deep learning system. So basically there is no, I don't think a system exists where you can basically say, train on whatever you want.
Starting point is 00:55:01 And then now say, hey, I will show you a new capture, which I did not show you in the in the training setup. Will the system be able to solve it? Still doesn't exist. So that is the magic of human perception. And Doug Huffstarter put this very beautifully in one of his talks. The central problem in AI is what is the letter A? If you can build a system that reliably can detect all the variations of the letter A, you don't even need to go to the V and C. Yeah, you don't even know to go to the V and the C or the strings of characters. So that is a speedy at which, at which, with which we tackle that problem. What does it mean by that? I mean, is it like without training examples,
Starting point is 00:55:52 try to figure out the fundamental elements that make up the letter A in all of its forms? In all of its forms. It can be A can be made with two human standing leaning against each other, holding the hands., it can be made with two human standing leaning against each other, holding the hands. And it can be made of leaves. It can be.
Starting point is 00:56:10 You might have to understand everything about this world, the Northern Tunisian letter A. Exactly. So that's common sense reasoning essentially. So to finally, to really solve, finally, to say that you have solved Khapp capture. You have to solve the whole problem. Yeah, okay. So how does this kind of the RCN architecture help us to get a better job? Yeah. So, as I mentioned, one of the important things was being able to do inference, being able to dynamically do inference. Can you clarify what you mean? Because you said, like, neon networks don't do inference.
Starting point is 00:56:50 Yeah. So what do you mean by inference in this context? So, okay. So, in captures, what they do to confuse people is to make these characters crowd together. Yes. Okay. And when you make the characters crowd together, what happens is that you will now start seeing
Starting point is 00:57:06 combinations of characters as some other new character or an existing character. So you would you would put an R and N together. It will start looking like an M. And and so locally they are you know there is very strong evidence for it being some incorrect character. But globally, the only explanation that fits together is something that is different from what you can get find locally. Yes. So this is inference.
Starting point is 00:57:36 You are basically taking local evidence and putting it in the global context. And often coming to a conclusion locally, which is conflicting with the local information. So actually, so you mean inference like in the way it's used when you talk about reasoning, for example, is supposed to like inference, which is this, with artificial neural networks,
Starting point is 00:57:59 which is a single path to the network. Correct. Okay. So like you're basically doing some basic forms of reasoning, like integration of how local things fit into the global patient. And things like explaining a way coming into this one, because you are explaining that piece of evidence
Starting point is 00:58:18 as something else because globally, that's so leading that makes sense. So now, you can amortize this inference by, you know, in a neural network, if you want to do this, you can brute force it, you can just show it all combinations of things, that you want to, you want to, you're reasoning to work over,
Starting point is 00:58:40 and you can, you know, like, just train the help out of that neural network, and it will look like it is doing, you know, inference on the fly, but it is, it is really just doing a more tight inference. It is because you, you have to show it a lot of these combinations during training time. So what you want to do is be able to do dynamic inference, rather than just being able to show all those combinations in the training time. And that's something we emphasized in the model. What does it mean dynamic inference?
Starting point is 00:59:11 Is that that has to do with the feedback thing? Yes. Like what is dynamic? I'm trying to visualize what dynamic inference would be in this case. Like what is it doing with the input? It's shown the input the first time. Yeah, and it's like, what's changing over temporarily? Over to, what's the dynamics of this inference process? So, so you can think of it as you have, at the top of the model, the characters that you are trained on, they are the causes. You're trying to explain the pixels
Starting point is 00:59:43 using the characters as the causes. The characters are the things that cost the pixels. Yeah, so there's this causality thing. So the reason you mention causality, I guess, is because there's a temporal aspect of this whole thing. In this particular case, the temporal aspect is not important. It is more like, if I turn the character on, the pixels will turn on. Yeah, it will be after there's a little bit but yeah. So that is causality in the sense of like a logic causality like hence inference. Okay. The dynamics is that even the locally, it will look like,
Starting point is 01:00:19 okay, this is an A. And locally just when I look at just that patch of the image, it looks like an A, but when I look at it in the context of all the other causes, it might not, in an A is not something that makes sense. So that is something you have to recursively figure out. So, okay, so, and this thing performs pretty well on the captures. Correct. And I mean, is there some kind of interesting intuition you can provide why it did well? Like what it looked like? Is there visualizations that could be human interpretable to us humans?
Starting point is 01:00:57 Yes. So the good thing about the model is that it is not just doing a classification, right? It is providing a full explanation for the scene. So when it operates on a scene, it is coming back and saying, look, this part is the A, and these are the pixels that turned on, these are the pixels in the input that make me think it is an A. And also, these are the portions I hallucinated. It provides a complete explanation of that form. And then, this is the contour, this is the interior, and this is in front of this other object. So that's the kind of explanation the inference network provides, and that is useful and interpretable. Then the errors it makes, and I don't want to read too much into it, but the errors the network makes are very similar to the kinds of errors humans would make in a similar situation. So there's something about the structure that feels reminiscent of the way the human visual system works. Well, how hard-coded is this idea to the CAPTCHA problem? Not really hard-coded, because the assumptions, as I mentioned, are general, and those can be applied in many situations which are natural signals.
Starting point is 01:02:34 So it's the foreground-versus-background factorization, and the factorization of surfaces versus contours. These are all generally applicable assumptions in all of vision. So why attack the CAPTCHA problem, which is quite unique in the computer vision context, versus the traditional benchmarks like ImageNet and all those kinds of image classification or even segmentation tasks? What's your thinking about those kinds of benchmarks in this context? I mean, those benchmarks are useful for deep-learning-style algorithms, because the setting deep learning works in is: here is my huge training set and here is my test set, and the training set is almost 100,000x bigger than the test set in many cases. What we wanted to do was invert that: the training set is much smaller than the test set. And CAPTCHA is a problem that is, by definition, hard for computers, and it has these good properties of strong generalization, strong out-of-training-distribution generalization. If you are interested in studying that and having your model have that property, then it's a good data set to tackle. So have you attempted, and I believe there's quite a growing body of work on looking at MNIST and ImageNet without training, where the basic challenge is: what tiny fraction of the training set can we take and still do a reasonable job on the classification task? Have you explored that angle on these classic benchmarks? Yes, so we did do MNIST; it's not just CAPTCHA. There were multiple versions of MNIST, including the standard version, where we inverted the problem, basically saying: rather than train on 60,000 training examples, how quickly can you get to high accuracy with very little training data? Is there some performance number you remember? How well did it do, and how many examples did it need? Yeah, I remember that it was
Starting point is 01:05:09 on the order of tens or hundreds of examples to get to 95% accuracy, and it was definitely better than other systems out there at that time. At that time, yeah, they're really pushing. I think that's a really interesting space, actually. I think there are actual names for MNIST variants with different sizes of training sets; people are attacking this problem. I think it's super interesting. It's funny how MNIST will probably be with us all the way to AGI. It's a clean, simple data set for studying the fundamentals of learning, just like CAPTCHAs. It's interesting. I don't know, maybe you can correct me, but I feel like CAPTCHAs don't show up as often in papers as they probably should.
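As an aside, the inverted protocol described above, tiny training set against the full test set, is easy to reproduce with generic tools. A sketch using scikit-learn stand-ins rather than the RCN model, assuming a recent scikit-learn with the OpenML loader:

```python
import numpy as np
from sklearn.datasets import fetch_openml
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Full MNIST, then deliberately starve the training side.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train, X_test, y_train, y_test = train_test_split(
    X / 255.0, y, test_size=10000, random_state=0)

for n in (10, 100, 1000):  # total training examples, not per class
    idx = np.random.default_rng(0).choice(len(X_train), size=n, replace=False)
    clf = LogisticRegression(max_iter=200).fit(X_train[idx], y_train[idx])
    print(n, round(clf.score(X_test, y_test), 3))  # accuracy vs data budget
```

The interesting comparison is the shape of that curve across models, not the absolute numbers of this simple baseline.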
Starting point is 01:06:05 That's correct. Yeah, because usually these things have a momentum. Once something gets established as a standard benchmark, there is a dynamics to how graduate students operate and how the academic system works that pushes people to chase that benchmark. So, yeah. Nobody wants to think outside the box. Okay. Yeah. Okay, so good performance on the CAPTCHAs. What else is interesting on the RCN side before we talk about the cortical microcircuits? Yeah, so the important part of the model was that it trains very quickly with very little training data, and it's quite robust to out-of-distribution perturbations. And we are using that very fruitfully at Vicarious in many of the robotics problems we are solving. So let me ask you this kind of touchy question. I've spoken with your friend and colleague Jeff Hawkins too, and I have to kind of ask: both of you have brain-inspired stuff, and you make big claims. There are critics, I mean, the machine learning subreddit, don't get me started on those people. They are harsh. Criticism is good, but they're a bit over the top. There is quite a bit of skepticism and criticism: is this work really as good as it promises to be? Do you have thoughts on that kind of skepticism? Do you have comments on the kind of criticism you might have received? Is this approach legit?
Starting point is 01:07:58 Is this a promising approach? Or at least as promising as it seems to be advertised as? Yeah, I can comment on it. Our RCN paper was published in Science, which I would argue is a very high-quality journal, very hard to publish in. Usually, that is indicative of the quality of the work. I am very, very certain that the ideas we brought together in that paper, in terms of the importance of feedback connections, recursive inference, lateral connections, coming to the best explanation of the scene as the problem to solve, trying to solve recognition and segmentation jointly in a way that is compatible with higher-level cognition and top-down attention, all those ideas that we brought together into something coherent and workable in the world, tackling a challenging problem, I think that will stay, and that contribution I stand by. Now, I can tell you a story which is funny in this context. If you read the abstract of the paper, the argument we are making is: look, current deep learning systems take a lot of training data, they don't use these insights, and here is our new model, which is not a deep neural network; it's a graphical model, and it does inference. That is how the paper reads. Now, once the paper was accepted and everything, it went to the press department at Science, the Science press office. We didn't do any press release ourselves when we published; it went to the press department. What was the press release they wrote up? A new deep learning model solves CAPTCHAs. Solves CAPTCHAs. And so you can see what was being hyped in that thing, right? So there is a dynamic in the community. That especially happens when there are lots of new people coming into the field and they get attracted to one thing, and some people are trying to think differently compared to that. I think skepticism in science is important and very much required. But often it's not really skepticism; it's mostly a bandwagon effect that is happening. Well, it's not even that. I'll tell you what they react to, which is
Starting point is 01:10:28 what I'm sensitive to as well. If you look at companies, OpenAI, DeepMind, Vicarious, there's a little bit of a race to the top in hype, right? It doesn't pay off to be humble. And the press is often just irresponsible; don't get me started on the state of journalism today. It seems like the people who write articles about these things have literally not spent an hour on the Wikipedia article about what neural networks are. They haven't invested even in the language; it's laziness. It's "robots beat humans", they write this kind of stuff. And then, of course, the researchers are quite sensitive to that, because it gets a lot of attention. They're like, why did this work get so much attention? That's over the top, and people get really sensitive. The same kind of criticism happened when OpenAI did the Rubik's Cube work with the robot hand, which people criticized; same with GPT-2 and 3; same thing with DeepMind and AlphaZero. I'm sensitive to it. And of course with your work, I mentioned deep learning, but there's something super sexy to the public about "brain-inspired". That immediately grabs people's imagination. Not even just neural networks, but really brain-inspired, brain-like neural networks. That seems really compelling to people, and to me as well, as a narrative. And so people hook onto that, and sometimes the skepticism engine turns on in the research community and they're skeptical. But putting aside the actual performance on CAPTCHAs or performance on any data set, to me, all these data sets are useless. It's nice to have them, but in the grand scheme of things, they're silly toy examples. The point is: is there intuition about the ideas, like you mentioned, bringing the ideas together in a nice way? Is there something there? Is there some value there? And is it going to stand the test of time?
Starting point is 01:13:03 And that's the hope. That's the hope. My confidence there is very high. I don't treat "brain-inspired" as a marketing term. I am looking into the details of biology and puzzling over those things, grappling with those things. So it is not a marketing term at all. You can use it as a marketing term, and people often do, and you can get lumped in with them. And when people don't understand how we are approaching the problem, it is easy to be misunderstood and to think of it as purely marketing. But that's not the way we are. So as a scientist, you believe that if we stick to really understanding the brain, that's the right path; you should constantly meditate on how the brain does this, because that's going to be really helpful for engineering intelligent systems. Yes. I think it is one input, and it is helpful, but you should know when to deviate from it, too. So an example is convolutional neural networks. Convolution is not an operation the brain implements. The visual cortex is not convolutional. The visual cortex has local receptive fields, local connectivity, but there is no translational invariance in the network weights in the visual cortex. Convolution is a computational trick, a very good engineering trick that we use for sharing the training between the different nodes. And that trick will be with us for some time. It will go away when we have robots with eyes and heads that move; then it will not be useful anymore. So the brain doesn't
Starting point is 01:15:07 have translational invariance. It has a focal point, I guess; it focuses. Correct, it has a fovea, and because of the fovea, the receptive fields are not copies of the same weights; the weights in the center are very different from the weights in the periphery. Yes, the periphery. I actually wrote a paper and got a chance to really study peripheral vision, which is a fascinating and very poorly understood thing: what the brain does with the periphery at every level. It does some funky stuff. So it's another kind of trick than convolution. Convolution in neural networks is a trick for efficiency, an efficiency trick, and the brain does a whole other kind of thing, I guess. So you need to understand the principles of the processing, so that you can still apply engineering tricks where you want. You don't want to slavishly copy all the details of the brain. So it should be one input, and I think it is extremely helpful, but the point is to really understand, so that you know when to deviate from it. Okay, that's really cool.
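To make the weight-sharing point concrete, here is a minimal PyTorch sketch contrasting a convolution, where one small filter is copied across all locations, with a locally connected layer, where every location owns its own weights, which is closer in spirit to a cortex with a fovea-to-periphery gradient. The class is illustrative only, not a model of visual cortex.

```python
import torch
import torch.nn as nn

class LocallyConnected2d(nn.Module):
    """Convolution-like layer with untied weights: each output location
    owns a private filter, so nothing forces translational invariance."""
    def __init__(self, in_ch, out_ch, in_size, kernel):
        super().__init__()
        self.out_size = in_size - kernel + 1
        self.unfold = nn.Unfold(kernel)
        # one weight tensor per spatial output location
        self.weight = nn.Parameter(0.01 * torch.randn(
            self.out_size ** 2, out_ch, in_ch * kernel ** 2))

    def forward(self, x):
        patches = self.unfold(x).transpose(1, 2)          # (B, L, C*k*k)
        out = torch.einsum("blf,lof->blo", patches, self.weight)
        return out.transpose(1, 2).reshape(
            x.shape[0], -1, self.out_size, self.out_size)

conv = nn.Conv2d(1, 8, 3)                      # weights shared across space
local = LocallyConnected2d(1, 8, in_size=28, kernel=3)
print(sum(p.numel() for p in conv.parameters()))   # 80 parameters
print(sum(p.numel() for p in local.parameters()))  # 48,672 parameters
```

The 600x parameter gap is the whole trick: weight sharing buys data efficiency at the cost of an assumption, translational invariance, that a moving, foveated eye does not satisfy.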
Starting point is 01:16:30 That's work from a few years ago. So you did work at Numenta, with Jeff Hawkins, on hierarchical temporal memory. If you could give a brief history, how has your view of the models of the brain changed over the past few years, leading up to now? Are there some interesting aspects where there was an adjustment to your understanding of the brain, or is it all just building on top of each other? In terms of the higher-level ideas, especially the ones Jeff wrote about in the book, On Intelligence. Right, On Intelligence. If you blur out the details and just zoom out to the higher-level idea, things are, I would say, consistent with what he wrote about. But many things will be consistent with that, because it's a blur; deep learning systems are also multi-level, hierarchical, all of those things, right? In terms of the details, a lot of things are different, and those details matter a lot. So one point of difference I had with Jeff was how to approach it: how much biological plausibility and realism do you want in the learning algorithms? When I was there, and this was almost 10 years ago now, so I don't know what Jeff thinks today, the difference was that I did not want to be so constrained by saying my learning algorithms need to be biologically plausible based on some filter of biological plausibility available at that time. To me, that is a dangerous cut to make, because we are discovering more and more things about the brain all the time. New biophysical mechanisms, new channels are being discovered all the time. So I don't want to kill off a learning algorithm upfront just because we don't really understand the full biophysics of how the brain learns. Exactly. But let me ask, on that point: what's your sense? What's our best understanding of how the brain learns? So things like backpropagation,
Starting point is 01:18:57 credit assignment. So many of these learning algorithms have things in common, right? Backpropagation is one way of doing credit assignment. There is another algorithm called expectation maximization, which is another weight-adjustment algorithm. But is your sense that the brain does something like this? It has to. There is no way around it, in the sense that you do have to adjust the connections. And by credit assignment, you're saying you have to reward the connections that were useful in making a correct prediction. Yes, though it doesn't have to be differentiable. It doesn't have to be differentiable, but you have to have a model that you start with, data comes in, and you have to have a way of adjusting the model such that it better fits the data. That is all of learning. Some of it can be done using backprop, some of it can be done using very local graph changes. Many of these learning algorithms have similar update properties in terms of what the neurons need to do locally. I wonder if small differences in learning algorithms make huge differences in the actual dynamics: whether credit assignment is like lightning versus a rainstorm, whether there's a looping, local kind of situation with the credit assignment, whether there is regularization, how much noise gets injected into the whole thing, whether it's chemical or electrical or mechanical, all those kinds of things. I feel like those differences could be essential, right? It could be. It's just that we don't know enough on the learning side.
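One way to see the "similar local update properties" point: for a single linear neuron under squared error, the backprop update each connection needs is just presynaptic activity times a broadcast error signal, a local product. A toy sketch with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 8))   # presynaptic activity
w_true = rng.normal(size=8)
y = x @ w_true                  # targets the neuron should predict

w = np.zeros(8)
for _ in range(500):
    err = x @ w - y                        # one broadcast "teaching" signal
    w -= 0.05 * x.T @ err / len(x)         # each w[i] only needs x[:, i] and err
print(np.allclose(w, w_true, atol=1e-2))   # True: connections fit the data
```

Whether the brain implements anything like that product, and with what noise, timing, and chemistry, is exactly the open question being described here.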
Starting point is 01:21:12 We don't know enough to say that is definitely not the way the brain does it. Got it. So you don't want to be stuck to it; you've been open-minded on that side of things. Correct. On the inference side, on the recognition side, I am much more amenable to being constrained, because it's much easier to do experiments: okay, here's a stimulus, how many steps did it take to get to the answer? I can trace it back, I can understand the speed of that computation, and so on, much more readily on the inference side. Got it. And you can't do good experiments on the learning side. Correct. So let's go right into the cortical microcircuits. What are the ideas beyond the Recursive Cortical Network that you're looking at now? So we have made a pass through multiple of the steps. As I mentioned earlier, we were looking at perception from the angle of cognition, right? It was not just perception for perception's sake. How do you connect it to cognition? How do you learn concepts?
Starting point is 01:22:20 And how do you do abstract reasoning, similar to some of the things François talked about? So we have taken one pass through it, basically asking: what is the basic cognitive architecture that you need to have, which has a perceptual system, which has a system that learns the dynamics of the world, and then has something like a program-learning system on top of it to learn concepts? We have built version 0.1 of that system. This was another Science Robotics paper; the title was something like "cognitive programs": how do you build cognitive programs? And the application there was robot manipulation. Think of it like this. Suppose you want to instruct a new person that you've just met, and you don't know the language that person uses, but you want to communicate with them to achieve some task. Say I want to tell them: hey, you need to pick up all the red cups from the kitchen counter and put them here. How do you communicate that? You can show pictures. You can basically say, look, this is the starting state, the things are here, and this is the ending state. What does the person need to understand from that? The person needs to understand what conceptually happened in those pictures from the input
Starting point is 01:23:51 to the output. We are looking at pre-verbal conceptual understanding. Without language, how do you have a set of concepts that you can manipulate in your head? And from a set of input and output images, can you infer what is happening in those images? Got it, with concepts that are pre-language. What does it mean for a concept to be pre-language, and why does that matter here? So I want to make a distinction from concepts that are just learned from text, by brute-force feeding of text. You can start extracting things like, okay, a cow is likely to be on grass. Those kinds of things you can extract purely from text. But that's a simple association, rather than a concept as an abstraction of something that happens in the real world, in a grounded way, such that I can simulate it in my mind and connect it back to the real world. And you think concepts in the visual world are somehow lower level than language? "Lower level" kind of makes it sound less important. I would say the concepts in the visual and motor systems and the concept-learning system, if you cut off the language part, just what we learn by interacting with the world and the abstractions from that, are a prerequisite for any real language understanding. So you disagree with Chomsky? He says language is at the bottom of everything. I disagree with Chomsky on so many levels, from universal grammar onward, yes.
Starting point is 01:25:57 So that was a paper in Science, beyond the Recursive Cortical Network. What other interesting open problems are there in brain-inspired approaches that you're thinking about? I mean, nothing is fully solved, right? I think of perception as the first thing that you have to build, but the last thing that will actually be solved. Because if you do not build the perception system in the right way, you cannot build the concept system in the right way. So you have to build a perception system, however wrong that might be; you still have to build it, learn concepts from there, and then keep iterating. Finally, perception will get solved fully when perception, cognition, language,
Starting point is 01:26:48 all those things work together, finally. So great, we've talked a lot about perception, but on the concept and common-sense side, the general reasoning side, is there some intuition you can draw from the brain about how we can do that? So I have this classic example. Suppose I give you a few sentences and then ask you a question about them. This is a natural language processing problem, right? So here goes. I'm telling you: Sally pounded a nail into the ceiling. Okay, that's the sentence. Now I'm asking you a question: is the nail horizontal or vertical? Vertical. Okay, how did you answer that? Well, I imagined Sally; it was kind of hard to imagine what the hell she was doing, but I had a visual of the whole situation. Exactly. So here, I posed a question in natural language, and you got the answer by actually simulating the scene. Now I can go into more and more detail: okay, was Sally standing on something while doing this? Could she have been standing on a light bulb to do this? I could ask more and more questions about this, and I can make you simulate the scene in more and more detail, right? Where is all that knowledge that you're accessing stored? It is not in your language system. It was not just by reading text that you got that knowledge. It is stored from the experiences that you have had, and by the age of five, you have pretty much all of this. And it is stored in your visual system and motor system, in a way such that it can be accessed through language. Got it. So here, the language is almost just
Starting point is 01:28:51 a query into the whole visual cortex, and that does the whole feedback thing. But is all reasoning kind of connected to the perception system in some way? You can still do a lot of it by quick associations, without having to go into the depths, and most of the time you will be right. You can just do quick associations, but I can easily create tricky situations for you where the quick association is wrong and you have to actually run the simulation. So figuring out how these concepts connect, do you have a good idea of how to do that? That's exactly one of the problems we are working on. And the way we are approaching it is basically saying: the takeaway is that language is simulation control, and your perceptual-plus-motor system is building a simulation of the world. That's basically the way we are approaching it. The first thing we built was a controllable perceptual system. Then we built Schema Networks, which was a controllable dynamics system. Then we built a concept-learning system that puts all these things together into programs, abstractions that you can run and simulate. And now we are taking the step of connecting it to language. It will be very simple examples initially; it will not be GPT-3-like examples, but it will be grounded, simulation-based language. And the querying would be question-answering kinds of things? Correct. It will be in some simple world initially, but it will be about: okay, can the system connect the language, ground it in the right way, and run the right simulations to come up with the answer? And the goal is to try to do things that, for example, GPT-3 couldn't do.
Starting point is 01:30:49 Correct. Speaking of which, if we could talk about GPT-3 a little bit: I think it's an interesting, thought-provoking set of ideas that OpenAI is pushing forward, and I think it's good for us to talk about the limits and the possibilities of neural networks. In general, what are your thoughts about this recently released, very large, 175-billion-parameter language model? So I haven't directly evaluated it yet. From what I have seen on Twitter and from other people evaluating it, it looks very intriguing, and I am very intrigued by some of the properties it is displaying. Of course, the text generation part was already evident in GPT-2: it can generate coherent text over long distances. But the weaknesses are also pretty visible: it is not really carrying a world state around. Sometimes you get sentences like "I went up the hill to reach the valley", completely incompatible statements, or when you're traveling from one place to another, it doesn't take into account the time of travel, things like that. Those things, I think, will happen less in GPT-3, because it is trained on even more data, and it can do even longer-distance coherence. But it will still have the fundamental limitation that it doesn't have a world model, and it can't run simulations in its head to find out whether something is true in the world or not. Do you think, by taking a huge amount of text from the internet and forming a compressed representation, there could emerge something that's an approximation of a world model, which essentially could be used for reasoning? And I'm not talking about GPT-3; I'm talking about GPT-4, 5, and GPT-10. Yeah, I mean, they will look more impressive than GPT-3. Take that to the other extreme: if you read Shannon's work, he has a model of English text based on first-order Markov chains, second-order Markov chains, third-order Markov chains, and he observes that third-order Markov chains look better than first-order Markov chains. So does that mean a third-order Markov chain has a model of the world? In a sense, it has a model of the text. So yes, when you go to higher-order models, or more sophisticated structure in the model, like the Transformer networks have, yes, they have a model of the text world.
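Shannon's toy models are only a few lines of code, which makes this distinction easy to poke at directly. A sketch of an order-n character model of "the text world", with the corpus file name assumed:

```python
import random
from collections import defaultdict

def train(text, order=3):
    """n-th order Markov chain over characters, in the spirit of Shannon."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model

def generate(model, seed, order=3, length=200):
    out = seed
    while len(out) < length:
        options = model.get(out[-order:])
        if not options:
            break
        out += random.choice(options)
    return out

corpus = open("corpus.txt").read()       # any large text file, assumed present
model = train(corpus)
print(generate(model, seed=corpus[:3]))  # higher `order` looks more English
```

Raising the order makes the output look steadily more like language while the model still contains nothing but character statistics, which is the force of the point that follows.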
Starting point is 01:33:48 But that is not a model of the world; it's a model of the text world. It will have interesting properties and it will be useful, but just scaling it up is not going to give us AGI or natural language understanding or meaning. The question is whether being forced to compress a very large amount of text forces you to construct things that are very much like, because the idea of concepts and meaning is a spectrum. Sure. So in order to achieve that kind of compression, maybe it will be forced to figure out abstractions which look an awful lot like the kinds of things that we think about as concepts, as world models, as common sense. Is that possible? No, I don't think it is possible, because the information is not there. The information is there behind the text, right? No, unless somebody has written down all the details about how everything works in the world, to absurd levels of detail: that it is easier to walk forward than backward, that you have to open the door to go out of a room, that doctors wear underwear. Unless all these things have been written down somewhere, or somehow the program found them to be useful for compression from some other text, the information is just not there. So that's an argument that text is much lower fidelity than the experience of our physical world. Right. So "a picture is worth a thousand words", that kind of thing. Well, in this case it's not even just pictures.
Starting point is 01:35:42 It's the interactivity of the world, being able to interact. Maybe I agree with you that a picture is worth a thousand words, but a thousand words, you could say, you can capture with a GPT-X. So I wonder if there's some interactive element where a system could live in the text world, where it could be part of the chat, be part of talking to people. It's interesting. Fundamentally, you're making a statement about the limitations of text. Okay, so let's say we have a text corpus that includes basically every experience we could possibly have, just a very large corpus of text, and also interactive components. I guess the question is whether the neural network architecture, these very simple Transformers, if they had hundreds of trillions of parameters, or whatever comes after a trillion, could store the information needed, architecturally. Do you have thoughts about the limitations on that side of things, on what neural networks can do? I mean, the Transformer is still a feed-forward neural network. It's a very interesting architecture,
Starting point is 01:37:14 which is good for text modeling and probably some aspects of video modeling, but it is still a feed-forward architecture. You believe in the feedback mechanism, recursion. Yes, and also causality: being able to do counterfactual reasoning, being able to do interventions, which are actions in the world. All those things require different kinds of models to be built, and I don't think Transformers capture that family. It is very good at statistical modeling of text, and it will become better and better with more data and bigger models. But think of a model that has read all of quantum mechanics and the theory of relativity, and we are asking it to do text completion, or asking it to solve simple puzzles. That's not what you ask a system to do when you have AGI. We will ask the system to do experiments, to come up with hypotheses, and to revise the hypotheses based on evidence from the experiments, all those things, right? Those are the things that we want the system to do when we have AGI, not solve simple puzzles. Like the impressive demo of somebody generating a red button in HTML.
Starting point is 01:38:41 Right. Which is all useful; I'm not dismissing the usefulness of it. By the way, I'm playing a little bit of devil's advocate here, so calm down, internet. I'm just curious in which ways a dumb but large neural network will surprise us. I completely agree with your intuition; it's just that I don't want to dogmatically, a hundred percent, put all the chips there. Right. We've been surprised so much. Even the current GPT-2 and 3 are so surprising. The self-play mechanisms of AlphaZero are really surprising. The fact that reinforcement learning works at all, to me, is really surprising. The fact that neural networks work at all is quite surprising, given how nonlinear the space is. The fact that they're able to find local minima that are at all reasonable is very surprising. So I wonder sometimes whether us humans just want AGI not to be such a dumb thing. Because exactly what you're saying, the idea of concepts, being able to reason over those concepts, connecting those concepts in hierarchical ways, and then having world models, everything we're describing in human language in this poetic way, seems to make sense as what intelligence and reasoning are like. I wonder if, at the core of it, it could be much dumber. Well, in the end, it is still connections and messages passing
Starting point is 01:40:36 over them, right? Right. So in that way, it's dumb. But I guess the recursion, the feedback mechanism, that does seem to be a fundamental kind of thing. Yeah. The idea of concepts. Also memory. Correct, like having an episodic memory; that seems to be an important thing. So how do we get memory? Yeah, we have another piece of work that came out recently on how you form episodic memories and form abstractions from them. We haven't figured out all the connections of that to the overall cognitive architecture yet. But what are your ideas about how you could have an episodic memory? So, at least it's very clear that you need to have two kinds of memory, that's very, very clear. There are things that happen as statistical patterns in the world, but then there is the one timeline of things that happen only once in your life. This day is not going to happen ever again, and that needs to be stored as just a stream, right? This is my experience. And then the question is: how do you take that experience and connect it to the statistical part of it? How do you now say, okay, I experienced this thing, now I want to be careful about similar situations? You need to be able to index that similarity using the knowledge that is in the model of the world that you have learned: the situation came from the episode, but you need to be able to index into the other one. So episodic memory is implemented as an indexing over the other model that you're building. So the memories remain, and they are indexed into the statistical thing that you formed. Yeah, the statistical, causal, structural model that you built over time. So basically the idea is that the hippocampus is just storing a sequence, a set of pointers, of what happens
Starting point is 01:42:58 over time. And then whenever you want to reconstitute that memory and evaluate different aspects of it, whether it was good or bad, whether I need to encounter this situation again, you need the cortex to reinstantiate, to replay, that memory. So how do you find that memory? Which direction is the important direction? Both; again, it's bidirectional. I guess, how do you retrieve the memory? So this is again a hypothesis, right? When you come to a new situation, your cortex is doing inference in the new situation, and of course, the hippocampus is connected to different parts of the cortex, so you get this déjà vu situation: okay, I have seen this thing before. Then in the hippocampus, you can have an index of, okay, this is when it happened, as a timeline. And then you can use the hippocampus to drive the similar timelines, so that rather than being driven by my current input stimuli, I am going back in time, rewinding my experience, replaying it, and putting it back into the cortex. And putting it back into the cortex, of course, affects what you're going to see next in your current situation. Yeah, so that's the whole thing of having a world model and then just connecting it to the perception.
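The hippocampus-as-index hypothesis maps naturally onto a key-value store. Here is a tiny sketch where the keys are latent codes produced by the statistical world model and the values are just pointers into the one-time experience stream; all of it is illustrative, not Vicarious's actual system:

```python
import numpy as np

class EpisodicIndex:
    """Episodes stored as pointers (timestamps) keyed by latent codes from
    the world model, rather than as raw sensory data."""
    def __init__(self):
        self.keys = []       # latent codes from the cortex-like model
        self.pointers = []   # timestamps into the one-shot experience stream

    def store(self, latent_code, timestamp):
        self.keys.append(np.asarray(latent_code, dtype=float))
        self.pointers.append(timestamp)

    def recall(self, current_code):
        # "Deja vu": the stored episode whose code best matches the current
        # inference wins; replay would then start from that pointer.
        current = np.asarray(current_code, dtype=float)
        sims = [k @ current / (np.linalg.norm(k) * np.linalg.norm(current))
                for k in self.keys]
        return self.pointers[int(np.argmax(sims))]

memory = EpisodicIndex()
memory.store([0.9, 0.1, 0.0], timestamp="monday-9am")
memory.store([0.0, 0.8, 0.6], timestamp="tuesday-3pm")
print(memory.recall([0.1, 0.7, 0.7]))  # -> tuesday-3pm
```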
Starting point is 01:44:29 Yeah, it does seem to be that that's what's happening. On the neural network side, it's interesting to think about how we'd actually do that. Like a knowledge base? Yes. It is possible that you can put many of these structures into neural networks, and we will find ways of combining properties of neural networks and graphical models. It's already started happening. Yes, graph neural networks are kind of a merge between them, and there will be more of that. To me, looking at biology and the evolutionary history of intelligence, it is pretty clear that what we will need is more structure in the models, modeling of the world, and support for dynamic inference. Well, let me ask you: there's a guy named Elon Musk, there's a company called Neuralink, and there's a general field called brain-computer interfaces. It's kind of an interface between your two loves, the brain and intelligence. There are very direct applications of brain-computer interfaces
Starting point is 01:45:46 for people with different conditions, more in the short term. But there are also these sci-fi, futuristic kinds of ideas of AI systems being able to communicate in a high-bandwidth way with the brain, bidirectionally. What are your thoughts about Neuralink, and BCI in general, as a possibility? So I think BCI is a cool research area. In fact, when I first got interested in brains, I was enrolled at Stanford, and it was through a brain-computer interface talk that Krishna Shenoy gave. That's when I even started thinking about the problem. So it is definitely a fascinating research area, and the applications are enormous. There is the science-fiction scenario of brains directly communicating; let's keep that aside for the time being. Even just the intermediate milestones they are pursuing, which are very reasonable as far as I can see, being able to control an external limb using direct connections from the brain, and being able to write things into the brain, those are all good steps to take, and they have enormous applications: people who have lost limbs being able to control prosthetics, quadriplegics being able to control something, and therapeutics. I also know about another company working in this space, called Paradromics. It's based on a different electrode array, but they're trying to attack some of the same problems. Also surgically implanted? Correct, surgically implanted, like that. So yeah, I think of it as a very, very promising field, especially where it is helping people overcome limitations. At some point, of course, it will advance to the level of being able to communicate. How hard is that problem, do you think? Okay, let's say we magically solve what I think is a really hard problem, doing all of this safely, being able to connect not just thousands but millions of electrodes to the brain. I think it's still very, very hard, because you also do not know what will happen to the brain
Starting point is 01:48:14 with that, right? How does the brain adapt to something like that? And as we are learning, the brain is, in terms of neuroplasticity, pretty malleable. Correct. So it's going to adjust. So the machine learning side, the computer side, is going to adjust, and then the brain is going to adjust. Exactly. And then what soup this lands in, the kinds of hallucinations you might get from this, might be pretty intense. Yeah, just connecting to all of Wikipedia. It's interesting whether we need to figure out the basic protocol of the brain's communication schemes in order to get the machine and the brain to talk, because another possibility is that the brain actually just adjusts to whatever the heck the computer is doing. Exactly. I find that to be the more promising way. It's basically saying: okay, attach electrodes to some part of the cortex. Maybe if it is done from birth, the brain will adapt; that part is not damaged, it was just not used for anything else, and the electrodes are attached there. And now you train that part of the brain to do this high-bandwidth communication with something else. If you do it like that, then it is the brain adapting, and of course, your external system is designed to be adaptable, just like we design computers, mice, and keyboards to interact with humans. That feedback system is designed to be human-compatible. But now you are not trying to record from all of the brain, with two systems trying to adapt to each other; it's the brain adapting, in one direction.
Starting point is 01:50:01 So it's fascinating, the brain being connected to the internet. Just imagine connecting it to Twitter and taking in that stream of information. Yeah. But again, if we take a step back, I don't know what your intuition is, but I feel like that is not as hard a problem as doing it safely. There's a huge barrier with surgery. Right, because the biological system is a mush of weird stuff. Correct. So the surgery part of it, the biology part of it, the long-term repercussions part of it. We often find after a long time in biology that, okay, that idea was wrong. People used to cut out this gland called the thymus or something, and then they found, oh no, that actually causes cancer. There are millions of subtle variables involved. But the nice thing about this whole process, just like with Elon and colonizing Mars, is that it seems like a ridiculously difficult idea, but in the process of doing it, we might learn a lot about the neurobiology of the brain, the neuroscience side of things. If you want to learn something, do the most difficult version of it and see what you learn. The intermediate steps that they are taking sounded reasonable to me. It's great. Well, everything with Elon is that the timeline seems insanely fast; that's the only caveat. Well, we've been talking about cognition, reasoning; we haven't mentioned the other C-word, which is consciousness. Do you ever think about that one?
Starting point is 01:52:01 Is that useful at all in this whole context of what it takes to create an intelligent, reasoning being, or is that completely outside of the engineering perspective on intelligence? So it is not outside the realm, but it doesn't inform what we do on a day-to-day basis. In many ways, the company name is connected to this idea of consciousness. What's the company name? Vicarious. And what does vicarious mean? At the first level, it is about modeling the world, internalizing external actions. You interact with the world and learn a lot about it, and after having learned a lot about the world, you can run those things in your mind without actually having to act in the world. So you can run things vicariously, just in your brain. Similarly, you can experience another person's thoughts by having a model of how that person works, putting yourself in some other person's shoes. That is being vicarious. Now, it's the same modeling apparatus that you're using to model the external world or some other person's thoughts; you can turn it on yourself. If that same modeling machinery is applied to your own modeling apparatus, then that is what gives rise to consciousness, I think. Well, that's more like self-awareness. There's the hard problem of consciousness, which is when the model feels like something, when this whole process feels like you really are in it. You feel like an entity in this world. Not just that you know you're an entity, but it feels like something to be that entity. And thereby, it starts to be that something that has consciousness can suffer. You start to have these kinds of things that we can reason about.
Starting point is 01:54:19 Yes, much heavier. It seems like there are much greater costs to your decisions. And mortality is tied up in that, the fact that these things end. First of all, I end at some point, and then other things end. And that somehow seems to be, at least for us humans, a deep motivator. Yes. And that idea of motivation in general: we talk about goals in AI, but goals aren't quite the same thing as our mortality. It feels like, first of all, humans don't have a single goal; we just kind of create goals at different levels. We make up goals because we're terrified by the mystery of the thing that gets us all. So we make these goals up. We're like a goal-generation machine, as opposed to a machine which optimizes a trajectory towards a singular goal. So it feels like that's an important part of cognition, that whole mortality thing. Well, it is a part of human cognition, but mortality becomes a different question for an artificial system, because we can copy the artificial system. The problem with humans is that we can't clone you. Even if I clone you, the hardware, the experience that was stored in your brain, your episodic memory, all of that will not be captured in the new clone. But that's not the same with an AI system, right? But it's also possible that the thing you mentioned about us humans is actually
Starting point is 01:56:21 of fundamental importance for intelligence. So the fact that you can copy an AI system might mean that that AI system is not yet AGI. If you reason based on existence proofs, you could say that it doesn't feel like death is a fundamental property of an intelligent system. Got it. But then give me an example of an immortal intelligent being; we don't have those. It's very possible that a fundamental property of intelligence is being a thing that has a deadline. You can think of it like this: suppose you invent a way to freeze people for a long time. That's not dying, right? You can be frozen and woken up thousands of years from now, so it's not fear of death. Well, no, it's not about time; it's about the knowledge that it's temporary. And that aspect of it, the finiteness of it, I think, creates a kind of urgency. Correct, for us humans. Yes. And that is part of our drives. And that's why I'm not too worried about AI having motivations to kill all humans, those kinds of things. Why would it? Just wait. Why do you need to do that? I've never heard that before; that's a good point. Yeah, murder just seems like a lot of work; you can just wait it out. Let me ask you: people often wonder what kinds of books, technical, fiction, philosophical, had an impact on world-class researchers such as yourself, and which ones you could recommend that others read. Maybe three books that pop into mind. Yeah. So I definitely liked Judea Pearl's book, Probabilistic Reasoning in Intelligent Systems.
Starting point is 01:58:47 It's a very deep, technical book. But what I liked is that, while there are many places where you can learn about probabilistic graphical models, throughout this book Judea Pearl sprinkles in his philosophical observations, and he thinks about how it connects to how the brain thinks, and attention, and resources, all those things. So that whole thing makes it more interesting to read. He emphasizes the importance of causality. That was in his later book. In this first book, Probabilistic Reasoning in Intelligent Systems, he mentions causality, but he hadn't really sunk his teeth into how to actually formalize it. And the second book, Causality, the one from 2000, that one is really hard, so I wouldn't recommend it. Oh, yes. That looks at the mathematics, his do-calculus. Do-calculus, yeah, it's pretty dense mathematically. Right. The Book of Why is definitely more enjoyable. For sure. So I would recommend Probabilistic Reasoning in Intelligent Systems. Another book I liked was one from Doug Hofstadter. This is from a long time ago, a book called The Mind's I; it was Hofstadter and Daniel Dennett together. And I actually bought that book; I haven't read it yet, but I couldn't get an electronic version of it, which is annoying because I read everything on Kindle. I had to actually purchase the physical copy; it's one of the only physical books I have, because a lot of people recommended it highly. And the third one I would definitely recommend is not a technical book; it is history. The name of the book, I think, is The Bishop's Boys. It's about the Wright brothers and their path; there are multiple books on this topic, and all of them are great. It's fascinating how flight was treated as an unsolvable problem, and also what aspects people emphasized. People thought, oh, it is all about just powerful engines.
Starting point is 02:01:11 You just need powerful, lightweight engines. Some people thought of it as: how far can we just throw the thing? Just throw it, catapult it. So it's very fascinating. And even after they made the invention, people did not believe it. And the social aspect, yeah. The social aspect, it's very fascinating. Do you draw any parallels between how birds fly, the natural approach to flight versus the engineered approach, and the brain versus trying to engineer intelligence? Did we learn anything from birds? Look, the saying is that airplanes don't flap wings, right? This is what they say. The funny and ironic thing is that the fact that you don't need to flap to fly is something the Wright brothers found by observing birds. In some of these books, they show the Wright brothers' notebook drawings: they made detailed notes about buzzards just soaring over thermals, and they basically said, look, flapping is not the important thing; propulsion is not the important problem to solve here. We want to solve control. Once you solve control, propulsion will fall into place. All of this they realized by observing birds. Beautiful. That's actually brilliant, because people do use that analogy; I'm going to have to remember that one. Do you have advice for people interested in artificial intelligence, for young folks today?
Starting point is 02:03:14 I talk to undergraduate students all the time, interested in neuroscience, interested in understanding how the brain works. Is there advice you would give them about their career, maybe about their life in general? Sure. I think every piece of advice should be taken with a pinch of salt, of course, because each person is different and their motivations are different. But I can definitely say: if your goal is to understand the brain from the angle of wanting to build one, then being an experimental neuroscientist might not be the way to go about it. A better way to pursue it might be through computer science, electrical engineering, machine learning, and AI; of course, you have to study up on the neuroscience, but that you can do on your own. If you are more attracted by discovering something intriguing about the brain, then of course, it is better to be an experimentalist. So find that motivation: what are you intrigued by? And find your strengths too; some people are very good experimentalists and they enjoy doing that. It's interesting to see which department, if you're picking your education path, whether to go with, at MIT it's brain and cognitive sciences, the BCS
Starting point is 02:04:53 are more and more now embracing of learning TensorFlow and my torch. Right? They see the power of trying to engineer ideas that they get from the brain into and then explore how those could be used to create intelligent systems. So that might be the right department actually.
Starting point is 02:05:16 Yeah. So this was a question in one of the Red Bull Neuro Science Institute workshops that Jeff Hawkins organized almost 10 years ago, this question was put to a panel, right? What should be the undergrad major? You should take if you want to understand the brain. And the majority opinion that one was electrical engineering. Interesting.
Starting point is 02:05:40 Interesting. Because I mean, I'm a doubly undergrad, so I got lucky in that way. But I think it does have some of the right ingredients because you learn about circuits. You learn about how you can construct circuits to approach, you know, do functions. You learn about microprocessors. You learn information theory, you learn signal processing, you learn continuous math. So, so in that way, it's a good step to if you want to go to computer science or neuroscience, you can, it's a good step. The downside, you're more likely to be forcey's MAT lab.
Starting point is 02:06:23 One of the interesting things, I mean, this is changing, the world is changing, but certain departments lagged on the programming side of things, on developing good habits in terms of software engineering. But I think that's more and more changing, and students can take it into their own hands and learn to program. I feel like everybody should learn to program, like everyone in the sciences, because it empowers you, it puts the data at your fingertips. You can organize it, you can find all kinds of things in the data, and then, for the appropriate sciences, you can also build systems based on that, like engineer intelligent systems. We already talked about mortality, so we already hit that
Starting point is 02:07:13 Ridiculous point, but let me ask you the You know one of the things about intelligence is One of the things about intelligence is it's goal driven. And you study the brain. So the question is like, what's the goal that the brain is operating under? What's the meaning of it off us humans in your view? What's the meaning of life? The meaning of life is what about you constrict out of it.
Starting point is 02:07:42 It's completely open. It's open. So there's nothing like you mentioned, you like constraints. So there's what's, it's wide open. Is there some useful aspect that you think about in terms of like the openness of it and just the basic mechanisms of generating goals in studying cognition and the brain that you think about or is it just about, because everything we've talked about
Starting point is 02:08:14 kind of, the perception system is there to understand the environment, to be able to, like, not die. Exactly. Like, not fall over. You don't think we need to think about anything bigger than that? Yeah, I think so, because it's basically being able to understand the machinery of the world such that you can push it toward whatever goals you want, right?
Starting point is 02:08:40 So the machinery of the world is really ultimately we should be striving to understand. The rest is just, the world is really ultimately we should be striving to stand the rest is just the rest is just whatever the I can want to do or whatever whatever I think that's beautifully put I don't think there's a better way to end it I'm so honored that you show up here and waste your time with me. It's been awesome conversation. Thanks so much for talking today. Oh, thank you so much. This was so much more fun than I expected. Thank you. Thanks for listening to this conversation with Delete George. And thank you to our sponsors, Babel, Raycon Airboats, and Masterclass. Please consider supporting this
Starting point is 02:09:26 podcast by going to Babble.com and use Code Lex, going to buy Raycon.com slash Lex, and signing up at masterclass.com slash Lex. Click the links, get the discount, and really is the best way to support this podcast. If you enjoy this thing, subscribe on YouTube, review it, the 5 stars and Apple podcasts, support it on Patreon, connect with me on Twitter, Alex, Friedman, Spelled, Yes, without the e, just F-R-I-D-M-A-N. And now let me leave you with some words from Marcus Aurelius. You have power over your mind, not outside events, realize this and you will find strength. Thank you.
