Theories of Everything with Curt Jaimungal - Karl Friston on What Is Life, Consciousness, the meta-Hard Problem, and the Free Energy Principle

Episode Date: April 26, 2021

YouTube link: https://youtu.be/2v7LBABwZKA

Karl Friston is a British neuroscientist at University College London and an authority on the free energy principle and predictive coding theory.

Patreon for conversations on Theories of Everything, Consciousness, Free Will, and God: https://patreon.com/curtjaimungal
Help support conversations like this via PayPal: https://bit.ly/2EOR0M4
Twitter: https://twitter.com/TOEwithCurt
iTunes: https://podcasts.apple.com/ca/podcast/better-left-unsaid-with-curt-jaimungal/id1521758802
Pandora: https://pdora.co/33b9lfP
Spotify: https://open.spotify.com/show/4gL14b92xAErofYQA7bU4e
Google Podcasts: https://play.google.com/music/listen?u=0#/ps/Id3k7k7mfzahfx2fjqmw3vufb44
Discord Invite Code (as of Mar 04 2021): dmGgQ2dRzS
Subreddit r/TheoriesOfEverything: https://reddit.com/r/theoriesofeverything

LINKS MENTIONED:
Iain McGilchrist's interview: https://www.youtube.com/watch?v=M-SgOwc6Pe4
Donald Hoffman's interview: https://www.youtube.com/watch?v=CmieNQH7Q4w
Anil Seth's interview: https://www.youtube.com/watch?v=_hUEqXhDbVs
Stuart Hameroff's interview: https://www.youtube.com/watch?v=uLo0Zwe579g
Machine Learning Street Talk's interview with Karl Friston (not mentioned but an informative watch): https://www.youtube.com/watch?v=KkR24ieh5Ow

TIMESTAMPS:
00:00:00 Introduction
00:02:45 Balancing work and life with an h-index of 200+
00:04:33 Daily routine
00:05:55 How to choose which problems to work on?
00:10:13 Defining "deflationary" and "self-evidencing"
00:14:33 Equilibrium vs non-equilibrium steady state
00:20:52 Why is it called the Free Energy Principle?
00:27:20 What is the Free Energy Principle? (two answers)
00:41:15 Is causation and time required for the FEP?
00:46:03 FEP incomplete because it doesn't incorporate relativity?
00:57:21 Schrödinger, entropy, gradient flow, and life
00:59:01 Why is the Free Energy Principle infamous for being esoteric?
01:01:00 Is FEP a TOE? A theory of everything?
01:03:20 Do atoms have "beliefs" in a Bayesian sense?
01:04:54 List of different fields FEP has implications for (physics, biology, AI, etc.)
01:17:47 On Occam's Razor and Artificial Intelligence
01:25:38 Newcomb's paradox
01:29:05 FEP - a Law of Nature or a different way of compressing?
01:36:50 Jordan Peterson's "order and chaos" and the FEP
01:50:51 Using brain structure to infer about the world
02:07:12 On hierarchies and modelling different "things" (tools don't behave like apples)
02:13:59 The importance of "shared narratives"
02:18:37 Religion as aiding "shared narratives"
02:22:43 Politics needs 50 / 50 splits on issues
02:25:53 The existence of free will and definition of it
02:28:43 Self-fulfilling prophecies
02:31:37 Schizophrenia, Curt, and Karl
02:45:48 "You are your own existence proof."
02:48:03 Can you think yourself into non-existence?
02:51:33 Definition of generative model
02:54:06 Donald Hoffman's consciousness model
02:54:34 Quantum consciousness and Penrose's Orchestrated OR
02:59:41 What does FEP say about consciousness?
03:09:48 The meta-Hard Problem of Consciousness
03:15:22 Anna Lukomsky: Idealism vs Physicalism
03:28:13 Joanne Dong: Explain the meaning of life via the FEP
03:34:23 Monotheism vs Polytheism in terms of FEP and psychology
03:38:50 Faraz Honarvar: Do our actions matter to the world?
03:47:20 Final words on Schizophrenia and existential anxiety from studying different metaphysics

* * *

Subscribe if you want more conversations on Theories of Everything, Consciousness, Free Will, God, and the mathematics / physics of each.

* * *

I just finished (April 2021) a documentary called Better Left Unsaid (http://betterleftunsaidfilm.com) on the topic of "when does the left go too far?" Visit that site if you'd like to watch it.

Transcript
Starting point is 00:00:00 Alright, hello to all listeners, Curt here. That silence is missed sales. Now, why? It's because you haven't met Shopify, at least until now. Now that's success. As sweet as a solved equation. Join me in trading that silence for success with Shopify. It's like some unified field theory of business.
Starting point is 00:00:20 Whether you're a bedroom inventor or a global game changer, Shopify smooths your path. From a garage-based hobby to a bustling e-store, Shopify navigates all sales channels for you. With Shopify powering 10% of all US e-commerce and fueling your ventures in over 170 countries, your business has global potential. And their stellar support is as dependable as a law of physics. So don't wait. Launch your business with Shopify. This is the longest interview I've ever done because they are such intriguing questions. But yeah, the fact that I'm still here is testament to the fact that I had a great time. I don't fall into our next exchange.
Starting point is 00:01:24 Karl Friston is considered by his peers to be the top neuroscientist that exists, probably that ever existed. Now, obviously, he won't say that. He'll rebuff that because he's a humble British chap who doesn't like to draw undue attention to himself, but I give him that accolade. He has a principle called the free energy principle, which is infamously inscrutable, but we try to make it as simple as possible. Please keep in mind that while there is plenty of technical jargon in this podcast, it's important that you stay with it. Once you're at the end, you'll probably have a better understanding
Starting point is 00:01:53 as to how to interpret what he said at the beginning. This is one of the longest conversations, if not the longest conversation, with Karl Friston that exists. And not that size matters, but you can use the duration or the length of a podcast as a proxy for the interviewee's interest, and that's great for me as an interviewer, because hopefully the rapport comes across and influences whether or not you can absorb the information, or influences the rate at which you can absorb it. And possibly even the comfortableness that me and Karl have
Starting point is 00:02:20 with one another aids your understanding, perhaps the integration of this information, so that you can assemble your own Weltanschauung, that is, your own theory of everything. These podcasts are extremely difficult to produce because they require quite a bit of preparation, and if you're interested in supporting conversations like this, or seeing more like this, or you have other ideas as to who I should interview, and you'd like to support this channel in some way, shape, or form, then please do consider going to patreon.com/curtjaimungal. Literally every dollar, every donation, every patron helps a tremendous amount, not only financially, but it also
Starting point is 00:03:00 helps with encouragement. It helps show me that this is something that people like and are willing to support voluntarily. Thank you to all the existing patrons. And just so you know, the best way to view this podcast, despite, like I said, its inscrutable nature as to what exactly the free energy principle is, the best way to view this podcast or listen to it is by re-listening to it. By the end of it, you'll have a better understanding
Starting point is 00:03:23 as to what Karl Friston means when he says certain terminology, and then when you re-watch or revisit certain parts, you can look at the timestamps, revisit, and hopefully gain a better understanding. For those listening and not watching, there is also a YouTube channel, and for those who are watching and not listening, there are iTunes, Spotify, and Google Play. We're on pretty much every platform. You can look at the description to get the links for that. Thank you so much and enjoy. Sir, it's a pleasure. I have been looking forward to this for quite some time. As I started researching you,
Starting point is 00:03:55 I found out that you're not only one of the top neuroscientists that are living, but you may be the greatest neuroscientist to ever exist. I'm just going by the approbation from your own colleagues, as well as your H-index, which is another reflection of colleagues approving of you, and your citations, which is in the six digits, if I'm not mistaken. So how about first we start off with just how is it that you balance work and life? with just how is it that you balance work and life? Very badly.
Starting point is 00:04:29 I should say that I'm certainly not the greatest neuroscientist. I think that's more a reflection of the fact they invented the H-index a few years ago. So it gives me an unfair advantage. The balance is very poor. Balance is very poor. So most of my life I spend pursuing academic commitments. In fact, in the past year, also commitments I've taken on in relation to the coronavirus outbreak in terms of modelling. So on the other hand, I love what I do.
Starting point is 00:05:08 So for me, life is a bit of a holiday anyway. I think many people who suffer are my family in terms of my improper work-life balance. Does your family get upset that you're constantly thinking about work? Or are you distracted when you're spending time with them because something else is on your mind? I think they've got used to it. And I have to say that they also have their preoccupations
Starting point is 00:05:28 and the things that they invest in. So I think they are as guilty as I am in not paying proper attention to the life side of the work-life balance. Do you have a daily ritual or do you meditate? Do you eat a certain type of meal, go to sleep at a specific time, wake up at a specific time? What's the structure? I am certainly a creature of habit. And I guess we'll come back to why that is possibly the case
Starting point is 00:05:59 or should be the case for most creatures. So, no, I don't meditate. but it's an interesting question because my daily routine does involve a long period of thinking in the morning, and it's alone. So I generally joke, and perhaps not joke, that I don't like talking to other human beings before lunch, before 12 o'clock. So my day typically will start with at least an hour, if not two hours, sitting and contemplating on the problem at hand, usually in my conservatory, smoking my pipe, just drilling down without any aids, without a computer or without any pen and pencil, just to boil down the simplicity of the problem at hand and try and see the architecture of the solution.
Starting point is 00:06:54 And then it's a question of buckling down to emails or then doing coding or mathematical derivations or writing something up for the rest of the day. And when I start talking to other people. You sit without a pen and a paper and you just think with your eyes closed? Well, I actually enjoy looking at the garden whilst thinking, but I do get distracted. Yeah. And if you see me close my eyes, I'm often thinking deeply about what you're saying. I don't want you to think I'm distracted or sleepy, just so you know.
Starting point is 00:07:24 It just sounds extremely British of you to have a pipe. And do you drink tea? Are you sitting with biscuits? Do you have a breakfast? Well, no. In the morning, it's coffee, and then tea would be for the afternoon. Without biscuits, but certainly coffee in the morning, tea in the afternoon. How do you choose which problems to tackle? Because I assume that maybe 80% of the reason for your high H-index is the problem selection. Yeah, absolutely. And I often have heard it said that the mark of a great scientist
Starting point is 00:08:00 is not the answers that they offer, it's really the questions that they ask and i have to say that um most of the questions that i'm obliged to contend with are those that are proffered um either by colleagues internationally and but most of the time by the by the younger people that i work with or supervise so it really, I spend most of my time in response mode, just trying to formulate and resolve questions that my international, usually younger team bring to the table. So 90% of my work is just basically helping other people solve problems. of my work is just basically helping other people solve problems. About 10% of it, when I have time, is sort of focusing on my particular hobby, which in latter years has become the free energy principle. But that's a somewhat rare opportunity. Most of the time, you're working on other,
Starting point is 00:09:02 slightly more practical problems. How has learning mathematics, new mathematics, new physics, whatever it may be, how has that changed in your later years compared to your younger years? Well, practically, an honest answer to that question is it's much easier now because of things like Wikipedia and the electronic age. I mean, certainly I I had the conviction, which I think is a proper conviction, that to make any real formal difference, you had to be able to articulate things mathematically, so you made a particular point of
Starting point is 00:09:42 choosing an early career and education that covered physics and maths and probability theory and the like but probably not not dissimilar to your early career structure and then forgot it all until I returned after a studying as a doctor as a doctor, as a psychiatrist, to maths via data analysis in imaging neuroscience. So it was extremely useful having had that early physics sort of degree level education. Although I'd forgotten everything, at least I had the confidence to relearn those bits that you need to know to make something work. So it was very much one of the situations of sort of see one, do one, teach one, where very quickly you need to solve a problem and then you went to, you know, in latter days at least, you went to Wikipedia, you got the right equations and you put them
Starting point is 00:10:34 together, implemented it in code and then proceed, you know, onto the next problem. So, So I have possibly an undue respect or reverence for the utility of maths, not just in relation to its unreasonable efficacy, but also as a language for communication, as a calculus that every like-minded academic or indeed industrial partner will at some subscribe to it and drill down on and understand. So I think it is the ultimate language, which, you know, effectively most of what I say in words or write textually inherits from and is always guided by the underlying mathematics. Before we get to the free energy principle, which I'm going to ask you to explain as simply as you can, and then you can go as complex as you can later. But before we get to that,
Starting point is 00:11:34 I found that there were a few words that you use in the different interviews that I've seen of you that I'm unsure as to what you mean precisely when you say them. So deflationary is one. What do you mean when you say deflationary? Because I'm sure that will come up over and over here. Right, yes, you're right. That's one of my favorite words at the moment.
Starting point is 00:11:54 Just taking, if you want an honest answer, I use it in the sense of just taking the hot air out of something. So in some instances, of something. So in some instances, one sees an over-interpretation of certain things. So there is always a simpler explanation for various phenomena or behaviours or constructs. So for me, a deflationary explanation is a better thing in the sense that it provides a simpler, more parsimonious account of something than was on offer before that simpler account came along. of what my friend Andy Clark, who's one of the world's more accomplished philosophers, refers to as a sort of Quinian desert landscape.
Starting point is 00:12:52 So this inherits from Quine, who conceived of a set of explanations that was so simple, there was nothing left, hence the desert landscape. And a lot of the aspirations of our theorizing actually points in that direction a little bit like natural selection i mean the natural selection is such a wonderful idea probably the you know the greatest biological idea um certainly in the past few centuries and yet it is inherently tautological. It just almost goes away when you think about the essential tautology of natural selection. So for me, that's a beautiful idea that is
Starting point is 00:13:33 deflationary. It explains so much with hardly anything. When you're saying that natural selection is tautological, do you mean to say that survival of the fittest, well, you're defining fitness by survival anyway? That's exactly what I mean. Yep. That's exactly what I mean. Yep. Okay. Okay. Now self-evidencing. Self-evidencing is a philosophical term, which I think has been around for some time, but I was introduced to it by my other philosophy friend, Jakob Howie, who's currently in Melbourne, as a nice perspective on the one take on what we do or what the imperatives, the existential imperatives behind our behaviour and our dynamics, if you're a physicist, which is to maximise the evidence for your models of the world. So in maximising the evidence, sometimes described perhaps poetically as acting to garner evidence for your own existence, you can be said to be self-evidencing. I have to say it's a slight
Starting point is 00:14:42 play on words from the point of view of a mathematician, because nearly every interesting formulation chemistry through self-assembly. Sometimes in the biological sciences, people refer to autopoiesis, which is again Greek for self-creation. You start off just thinking about the information theory that would underwrite that kind of self-organization. The first thing you come across is self-information. You're following that through and you can sort of spin off all sorts of other selfs in terms of self-serving behaviors that you can write down in terms of information theory. And for me, the simplest and most, the prettiest is this notion of self-evidencing. What about equilibrium state versus a non-equilibrium steady state? So that one's not idiosyncratic to you.
Starting point is 00:15:51 Right, no, no, heavens no. And again, remember I'm a bit of an amateur physicist. I trained as a physicist as a young man, but haven't sort of a young man but haven't really been in that field throughout my career. But my reading of physics as it applies to the kinds of systems that we have to deal with can be read as sort of 20th century physics where you're dealing with closed systems at equilibrium. And then the 21st century, people got much more interested in the physics of open systems that are far from equilibrium or non-equilibrium. And what can you do with these kinds of systems? Well, you can write down their dynamics, their density dynamics.
Starting point is 00:16:42 You can formalize the behavior of the probability distribution over the various states of these systems and how they would unfold from any given initial condition to some steady state distribution. For me though it is the steady state distribution which is interesting because that provides you with a well-defined probability distribution and once you've got that then you with a well-defined probability distribution. And once you've got that, then you can go forward in terms of the information theory and spin off everything that you might want to in terms of inference and the like. So the non-equilibrium part is absolutely crucial in the sense that it defines a set of questions about systems that are in exchange with a world out there. So there is,
Starting point is 00:17:31 you know, from the thermodynamic point of view, an exchange of energy and entropy between the system in and of itself and the rest of the universe. From a mathematical point of view, From a mathematical point of view, the non-equilibrium aspect is inherited from the prevalence of circular flows, solenoidal flows, is sort of the fluctuations that you get in classical mechanics like the orbits of heavenly bodies or the circular fluctuations you get in predator-prey relationships or red queen dynamics wherever you look at sort of interesting systems that persevere over time you see this sort of solenoidal circular um oscillatory like behavior um and it is that which characterizes real dynamical systems that are you know i repeat not isolated from the rest of the universe or are not carefully enshrouded in a heat bath like an idealized gas,
Starting point is 00:18:45 but they actually have to contend with and exchange with the outside world. So that's what's encapsulated by a non-equilibrium system. The steady state means that if you leave it alone for long enough, it will self-organize into some recognizable configuration that you can describe probabilistically with a probability density function or in terms of some attracting set a pullback attractor whatever your preferred kind of calculus or maths would be. So steady state refers to it over time being somewhat stable but then the non-equilibrium means that it can ramify, it can go between here, and then it may jump here, and then it may jump here, but between three only instead of one would
Starting point is 00:19:29 be equilibrium? No, I think that's a very nice description. So yes, absolutely. So the steady state just means that the system has settled down to some attracting set of states but within those that set of states there is exactly what you were you were referring to there's an itinerancy there are different sort of you know regimes or sub manifolds of an attracting set which gives it an itinerancy or a wandering so it moves around from one part of the attracting set to another part of the attracting set so the example i like to use when trying to get this notion in play is you know at every level of self-organization in me you see this phenomena whether it's very very fast fluctuations in electrophysiological potentials in one part of my dendrite of one cell in one part of my brain,
Starting point is 00:20:26 fluctuating at, say, the gamma frequency, or whether it's my heartbeat unfolding during the cardiac cycle, or whether it's me getting up in the morning, having a cup of coffee, do my emails, or whether it's the annual cycles that we all enjoy, sort of Christmas, easter summer holidays every one of these instances of an you know an open system at some kind of non-equilibrium steady state has in common the fact you keep revisiting certain states of being and it's that revisiting that
Starting point is 00:21:01 defines this attracting set and the steady state aspect. So exactly as you are doing with your fingers, you are moving around from sort of one part of the manifold to another part of the manifold. If you were a physicist who did dynamical systems theory, you might think of this in like a heteroclinic cycle where you're moving from one unstable point to the next unstable point,
Starting point is 00:21:26 but you're always ultimately coming back to the same attracting set of unstable points. So it's that non-equilibrium. So we're not at a fixed point. We're not all aspiring to be at thermodynamic equilibrium. There's an itinerancy and a complexity and a richness to the behavior. But on the other hand, it's a behavior that evinces the same kinds of states, the neighborhood of places in state space time and time and time again for the duration of
Starting point is 00:21:59 the existence of that particular system or particle or person in question. Why did you call it the free energy principle? When I first heard, well, I heard about it quite a few times, then I kept dismissing it because I thought it was referring to perpetual motion or the extraction of energy from vacuum fluctuations. You're right. I understand that it has something to do with free energy in the physics term, but what led you to calling it the free energy principle? And have you heard of other people? Have you heard of the case of mistaken identity from other people or is it just me? or gently articulated. So, yeah, for me, in my world, the free energy was the most natural and obvious thing to call it because the kind of free energy
Starting point is 00:22:54 that we're dealing with, or would I say we, people in either based in statistics or latterly what we know now as machine learning, statistics or latterly what we know now as machine learning would would would always refer to this quantity this variational free energy as a computable objective function for any inference or estimation problem so whether you're doing sort of classification with machine learning and you're using a restrictive Belser machine and you're
Starting point is 00:23:25 using, let's make it more current. So you're doing, you're using high-end deep learning, using a variational autoencoder to try and recognise some sequence of images. Then the weights that you are optimising in that deep deep learning setting in that particular variational autoencoder setting are optimized with respect to a variation free energy so this variation free energy plays a central role has done for for more than half a century now in optimization problems its origins actually really really quite interesting um historically again this is not really my field but from what i gather there are two ways that the free energy sort of came um into play the this the american route and this starts with richard feynman as as with most things. So he had the problem of basically wanting to, you know, express it simply,
Starting point is 00:24:33 evaluate the probability distribution over all the paths an electron, say, could take. And he realized that that was an intractable integration problem. He couldn't work out the normalization constant to make sure that probability distribution summed to one. So he was faced with an intractable, effectively, integration problem in quantum electrodynamics. So he solved it by converting that integration problem into an optimization problem.
Starting point is 00:25:05 And the way he did it was to introduce a variational free energy that was always greater than the quantity that he wanted to minimize or was less than the quantity that he wanted to maximize, which in this instance is just the marginal likelihood or the likelihood distribution of various paths say a small particle might take so that's where I learned about free energy essentially a Feynman-esque free energy that was provided a bound approximation to, that thing you want to maximise is the model evidence or the marginal likelihood, bringing us back to self-evidencing, but also the marginal likelihood, sort of indicating that this is a central notion in physics, in Bayesian statistics as well. And that quantity, that variation of free energy has been in play now for not quite a century, but certainly many, many decades, currently known as an ELBO,
Starting point is 00:26:14 E-L-B-O, an evidence lower bound. So that evidence lower bound underwrites all of high-end machine learning. Certain simply reduced cases of it, I would actually suggest probably all of machine learning at some point refer to or can be seen as a special case of this. The other route inherits more from the Russian side. This is notions of algorithmic complexity, underlying universal computation via Kovalev complexity and sort of induction. So the big drive there is, again,
Starting point is 00:26:57 formulating universal computation as basically communicating or articulating or encoding some structure in the simplest way possible by introducing a bound on the thing that you want to you want to optimize or minimize or extremize so you can also find the free energy in the context of sort of a more Shannon-esque take on communication and optimization. that perspective to the table, a perspective that was well understood by people like Wallace in Melbourne, who themselves had taken it from the Russian literature and so on. I think he was an American, but also taking this algorithmic complexity. So it's a very long answer to actually say that the free energy principle or the principle of minimizing free energy has underwritten universal computation, practical approaches to quantum electrodynamics and machine learning since since the 1950s.
Starting point is 00:28:18 So that's what the free energy means. It's not that the energy is costless. So that's what the free energy means. It's not that the energy is costless. It just borrows from the formalism of the thermodynamic free energy, either Gibbs or Helmholtz free energy. Okay, this sounds like a great time to explain what the free energy principle is. I know you touched on it, but for the people who aren't aware. Right.
Starting point is 00:28:50 So they're asked to do this, I'll give you a choice, you can either take the high road or the low road, so the high road would start with a consideration of what it is to exist, to be something, and then unpack that in terms of physics. And then ultimately, you get to a picture of things that exist as things that look as if they are trying to minimize their free energy. And you can interpret that in terms of action and perception, sentient behavior, effectively a physics of sentience, where everything entailed by a particle, it could be a sort of small particle, it could be a person or a plant, anything that exists can be seen as or understood or characterized as trying to optimize or minimize a free energy through changes in internal states or through action upon the world that would be the high road and we can take that if we choose that option the low road would be sort of building up a sense of there being just one imperative
Starting point is 00:30:01 from the the notions that probably can be found in the students of Plato, that throughout philosophy, through Kant, through Helmholtz, through modern-day psychology and subsequently machine learning, would appeal to a story, a narrative about prediction and inference and that we are all effectively prediction machines where in this setting the existential imperative is basically to reduce prediction error. So by prediction error I simply mean that we have in mind a model of the way that the world works, and that we can leverage that model to produce predictions about the sensory evidence at hand, and the disparity or the difference between our predictions and our actual sensory samples can be called prediction error. And it turns out mathematically that the sort of the high road and
Starting point is 00:31:05 the low road converge because the prediction error is just the gradient of the free energy that you're trying to minimize. So I'll come back to you. Which would you prefer, the high road or the low road? Okay. First, when you say sentient, do you mean the capacity to feel pleasure or pain? Or how are you using that word? Yeah, just a sense, just to not so much the affect aspect, not the sort of the pleasure and the pain, just being able to represent and infer. So to have in mind a notion of what caused your sensations. So I'm using it in a deflationary
Starting point is 00:31:49 way. So some systems may or may not have the kind of sentience that you'd associate with, say, a person or a pet. So is a plant sentient um let me ask you do you think a plant is sentient does it does it have can it in an elemental way feel its way around the world or have some internal representation of um of the world that it that it inhabits i would say it's a scary thought to think of plants as extremely sentient because you crush them constantly, even if you're a vegetarian. So I'm unsure if I think that they can feel pleasure or pain. It does seem like it's clear that they respond to the environment
Starting point is 00:32:35 and that they take in sensory information and act on it. Yes, I think that's absolutely true. In fact, interestingly, they have the same sort of electrochemical message passing that we have with our axons in the brain. It unfolds at a much slower timescale. But the physics of the message passing, the internal states of a plant actually comply with very similar kinds of computational architectures that your brain and mind is. I would say a plant is sentient, but not to the extent that it has notions, emotions, or even a sense of self. You know, there are lots of graded, very, you know, sort of ladder, there are lots of steps on the ladder of sentience right through to self-awareness and pleasure and pain but i'm talking about a very elemental sort where there's something going on in the inside
Starting point is 00:33:31 that is caused by, and causes, stuff that's going on on the outside, that you can read as a kind of representation or an inference about the causes of sensations. Okay, so when you ask this question as to which explanation I would prefer, the high road or the low road, I imagine that as top-down versus bottom-up. And when I say that, please let me know if I'm understanding it correctly. The bottom-up is something like looking at what exists and seeing that it does act such that it minimizes some quantity, or that it wants to minimize error in some way; versus the top-down, which looks like: if I were to sit in an armchair with my eyes closed and think, what exists, what would have to be the case for it to exist? What
Starting point is 00:34:17 properties would it have to have? So is that correct? Am I being obtuse? No, that's brilliant. Yeah. Okay, let's start with the top down. Let's just do the top down, because we have quite a few questions. So let's start from the armchair. Excuse me. So, okay, I'll do the top down with a little nod to the bottom up, so we don't miss out on anything. So, you've just said, basically, the strategy: you start off, okay, I want to explain anything, a theory of everything. So what's a thing? Well, the first thing you have to contend with is, how do I differentiate a thing from no thing, or nothing,
Starting point is 00:34:58 or something else? So that immediately implies that you're splitting, or carving, or partitioning the states of some universe into states that belong to the thing and states that don't belong to the thing. And when you think about what that means, you're compelled to consider the statistical dependencies that demarcate something from nothing. And when one drills down on that, you come across this notion of a Markov boundary or Markov blanket, which is effectively a set of states that separate or insulate internal states on the inside from external states on the outside. So you have this picture of transactions mediated by standard dynamics, of the sort you'd see in physics with a Langevin formulation; the kinds of equations of motion that everybody uses to build their favorite physics, whether it's quantum mechanics or classical mechanics or statistical mechanics. They all start with the notion of there being some dynamics out there. So everything that we're interested in, in terms of talking about anything,
Starting point is 00:36:31 applies to systems that have this partition. So as soon as you put this partition into play, which is basically a partition where internal states influence external states in one direction through the blanket states, and then external states influence internal states in the other direction, vicariously through the Markov blanket, you have this notion of some generalized action and perception: the inside influences the outside, so somehow the thing is acting upon the world; but at the same time, through the blanket states, the outside is acting upon the inside. And we normally divide the blanket states into sensory and active states. So in this
Starting point is 00:37:13 instance, the outside external states impress themselves upon the sensory states, and then the sensory states influence the internal states, and so there's a sort of circular causality implied by this partition. Now, all you do then is say, well, look, let's just look at the long-term behavior of any system that you can describe in terms of the dynamics, or the rate of change, of the probability distribution over states, or density over states. And note that because we are interested in systems that have attained a non-equilibrium steady state, the probability density is not itself changing. And when one does that, when one looks at the solution to the density dynamics in the context of the Markov blanket, you get a particular kind of mechanics, and that
Starting point is 00:38:14 mechanics is exactly the same as quantum mechanics or statistical mechanics, but with one key difference: now it applies in the setting of a Markov blanket. And that's where the interpretation of things as self-evidencing comes from, because you can always write down the flows that maintain this steady state as effectively trying to minimize their prediction error, or minimize their self-information, or minimize what's known as surprise in information theory, which mathematically is just the same thing as what statisticians like to call Bayesian model evidence, or the log of the probability of these sensory states given my implicit model encoded by the internal states. So what you end up with is a description of anything, defined stipulatively in terms of possessing
Starting point is 00:39:19 a Markov blanket, defined by its non-equilibrium steady-state density, whose dynamics or flows must have this property: that the internal states, and the active part of the blanket states, must be performing a gradient flow on this self-information. And then you make a further move and say, well, this self-information can be written down as a free energy functional, or the free energy, if I interpret the internal states as encoding Bayesian beliefs, or probabilistic beliefs, about the external states. So now you have a mathematical image, a picture of existential dynamics that inherit from having a Markov blanket, where you can say that the dynamics of the internal states look as if they're trying to maximize their model evidence, or trying to minimize their free energy or prediction error. And that's basically the story. So, a story that rests really upon what physicists would understand
Starting point is 00:40:28 as dynamics, and average flow, at non-equilibrium steady state, and in particular the gradient flows. So perhaps it would demystify this to come back to Helmholtz, who was a key architect of these ideas on the low road to this description. Among the many things that Helmholtz brought to the table was that you can always decompose a flow, for example the flow of a fluid, into two parts. There's one part that's flowing up or down gradients, say concentration gradients,
Starting point is 00:41:08 and then there's another, circular flow, which flows around the iso-probability contours, the isocontours, of any function. So what we're talking about is the gradient flow part of it, not the solenoidal part, which is the other, circular part of the flow that we were previously talking about in terms of defining non-equilibrium as opposed to equilibrium systems. And that gradient flow can now be read as a gradient flow on a variational free energy, which is wonderful from the point of view of people like me in neuroscience, because now what you've got is a description of neuronal dynamics. You've now got a first-principles account of how neuronal activity, or states, change as a function of their state over time.
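As a toy sketch of that gradient flow (made-up numbers, and not anything from the conversation): a single internal state performs gradient descent on the variational free energy of a one-dimensional Gaussian generative model, and settles on the exact Bayesian posterior mean.

```python
# Toy gradient flow on a variational free energy (illustrative only;
# made-up numbers). One internal state, phi, tracks a hidden cause x
# under a Gaussian generative model:
#   prior      p(x)   = N(mu_p, s_p)
#   likelihood p(y|x) = N(x, s_l)
# For a point-mass posterior, the free energy (up to constants) is a
# sum of precision-weighted prediction errors.

mu_p, s_p = 0.0, 1.0    # prior mean and variance over the hidden cause
y, s_l = 2.0, 0.5       # observation and sensory (likelihood) variance

def free_energy(phi):
    return (phi - mu_p) ** 2 / (2 * s_p) + (y - phi) ** 2 / (2 * s_l)

def dF(phi):
    # Gradient of the free energy with respect to the internal state
    return (phi - mu_p) / s_p - (y - phi) / s_l

phi = mu_p                    # start at the prior mean
for _ in range(2000):         # gradient flow: d(phi)/dt = -dF/d(phi)
    phi -= 0.01 * dF(phi)

# The flow settles on the exact Bayesian posterior mean
posterior_mean = (mu_p / s_p + y / s_l) / (1 / s_p + 1 / s_l)
print(phi, posterior_mean)
```

Minimizing this free energy and doing exact Bayesian inference coincide here because the model is Gaussian; in general the free energy is only a bound on the (negative log) evidence.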
Starting point is 00:42:06 And you can write that down now as a gradient flow on this one quantity, this variational free energy, and that endows that gradient flow, that dynamics, with an interpretation: they're trying to maximize the evidence for their internal model of what's going on on the outside. Okay, let's zoom out. We see in your model, most of the time when you're presenting with the PowerPoint slides, nodes, and the nodes are connected with different arrows. And that's what affords a Markov blanket. I have a couple of questions. The
Starting point is 00:42:45 arrows represent influence, which to me is a synonym for causation. So I'm wondering, is causation necessary for this model? Because some people argue that causation is an illusion; Sean Carroll, for example, may say that. Well, first of all, let's tackle that. Does your model presuppose causation? Yeah, in a deflationary sense, yes. Deflationary? Yes. So, I'm not quite sure what Sean has said, but I want to agree with whatever he said. So this is a very trivial causation that is implicit when you write down
Starting point is 00:43:26 any random differential equation or stochastic differential equation or Langevin expression. If you just write down a universe in which there are states that change with time as a function of those states, plus say some random fluctuations to make it a random or a stochastic differential equation, then what you are saying is that the flow of those states, the rate of change of those states is caused by those states. That's the only causality that is in play here.
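A minimal, made-up instance of the kind of equation being described, integrated with the standard Euler-Maruyama scheme; the only causality in it is that the motion of x is a function of x:

```python
# A concrete (made-up) example of the kind of equation described:
# states whose rate of change is caused by the states themselves,
# plus random fluctuations -- dx = f(x) dt + sigma dW, with f(x) = -x.
import random

random.seed(0)
dt, sigma = 0.01, 0.5
x, path = 3.0, []
for _ in range(20_000):
    drift = -x                         # the flow: caused only by the state
    x += drift * dt + sigma * dt ** 0.5 * random.gauss(0.0, 1.0)
    path.append(x)

# Long-run behaviour: fluctuating around 0 with variance ~ sigma**2 / 2
tail = path[10_000:]
mean = sum(tail) / len(tail)
var = sum((v - mean) ** 2 for v in tail) / len(tail)
print(mean, var)
```

Nothing in the update rule refers to anything but the current state and the noise; that is the whole of the "control-theoretic" causality at play.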
Starting point is 00:43:59 So it's causality of a rather elemental and trivial sort, that I often refer to, sorry, not me, but people often refer to, as causality in a control-theoretic sense: the system is causal in the sense that there is just motion, or dynamics, that is caused by a state. So the causality here inherits just from writing down differential equations, which, interestingly, are only expressed as motion over time. So there's a deep link between this trivial causality and time, because you're just writing down a differential equation. After that, there's no causality;
Starting point is 00:44:37 there's nothing else after that. The sort of causal inference of the kind, say, that people like Pearl, with his causal calculus, talk about, I think that's much more a product of inference. But you'd have to work right up through the free energy principle to actually understand what it means just to measure things, let alone have ideas about whether this A caused B. First of all, you've got to work out how you would measure, how you would infer, the existence of A out there and B out there. So the free energy principle starts at a very, very low level, and then in baby steps works
Starting point is 00:45:30 up to higher-order notions of things like causality. Does that answer your question? Yeah. Does time always have to be the parameter, or can it be something else? Because then it presupposes time as well. Yeah. I mean, in our work, it is always time. But of course you can generalize that, you know, just treat time as another coordinate if you wanted to. I personally haven't found that a very useful
Starting point is 00:46:00 move. I have enormous respect for people who can think like that, you know, and completely generalize this, and get gauge-theoretic sorts of understandings of the underlying geometry. For me, I have to say, I just start with a random differential equation, and then everything else follows from that. And in particular, you keep connected to useful aspects of state space; otherwise, it's less easy, in a straightforward way, to connect it to things like entropy production, or even things like information entropy and information length in dynamical systems. So for me personally, time starts off with a privileged position in the formulation, in the sense that it underwrites the meaning of a differential equation. Do you see that as then meaning that the free energy principle, as it's currently formulated,
Starting point is 00:47:24 is incomplete in some manner, given that special relativity, or any relativity, says that time and space must be on some equal footing? And if you're saying that time has some privileged position, then position needs to have an equally privileged position, right? Yeah, that question is above my pay grade. I personally don't know. You can easily get dynamics or mechanics out of a Langevin formulation that has time baked into the very core of it. At the beginning, we just write down a Langevin equation.
Starting point is 00:48:06 You can work the whole density dynamics up at non-equilibrium steady state. You can write down quite a simple potential function that describes general relativity. So I don't think there's anything missing. Oh, I didn't know that. Okay, because when I was looking at some of your notes, I know you call it the Langevin equation
Starting point is 00:48:28 I can never pronounce that, so I always called it the Kolmogorov forward equation; for some reason that's easier for me. But whatever. Langevin, is that correct? Langevin? Yeah, that's how I say it, anyway. Yes. Okay, let's just say it like that for now.
Starting point is 00:48:39 L-A-N-G-E-V-I-N, right. Okay, so the Langevin equation: it can imply Newtonian mechanics. I saw that you were showing that in one of your slides, as well as quantum mechanics, as well as some aspect of thermodynamics. And then I forget the fourth one. But I didn't see special relativity, or relativistic quantum mechanics.
Starting point is 00:49:00 You're saying that you can tweak some of the parameters to derive general relativity? Yep. And it all rests upon the way that you configure the gradient flows, which lend the system a dissipative aspect, versus the solenoidal flows, which characterize more conservative systems. So when we're talking about general relativity, for example, we're almost considering the limiting case where the random fluctuations are attenuated by being averaged away. We're talking about massive bodies moving around.
Starting point is 00:49:40 So it's just a question of looking at the various functional forms you would get out of the solution to the Kolmogorov forward equation, under the limit that gamma, sorry, the amplitude of the random fluctuations, is very, very small, so the gradient flows almost disappear. So then it's just a question of looking at the functional forms, and with a few nonlinearities here and there, you can quite easily write down something that general relativity would be quite comfortable with. So you're putting the random fluctuations into the general solution to the Fokker-Planck equation, or the Kolmogorov forward equation. And sorry, just for the people listening, the Fokker-Planck equation and the Langevin equation and the Kolmogorov forward equation are all synonyms?
Starting point is 00:50:40 Yes, well... Or special cases of one another? Yeah, well, the Langevin equation is just the underlying equation that describes the flow of the system as a function of the states. But if you know that, and you know the amplitude of any random fluctuations on that flow, then you can equivalently write that down
Starting point is 00:51:04 as a Fokker-Planck equation or a Kolmogorov forward equation. I see. You can also, incidentally, write it down equivalently in terms of a path integral formulation. All of these could be thought of as different ways of articulating the same thing. But the density dynamics most transparently inherits from the Fokker-Planck, or Kolmogorov, formulation, which is explicitly about how the probabilities over the states evolve over time.
Starting point is 00:51:35 So one example of that would be the time-independent Schrödinger wave equation, which is one version of a Fokker-Planck equation. What they have in common is describing the dynamics of systems in terms of the probability density over the states, or, if you're into quantum mechanics, say, the wave function over states. So, at steady state, in fact you don't need to go to steady state, but certainly once you've written down the Fokker-Planck equation and solved it for the flow, and then used the Helmholtz decomposition to split the contributions to the flow into solenoidal and gradient-flow parts, that provides a nice way of looking at limiting cases. So the gradient flow is realized by the random fluctuations. For people who are not physicists, I often use the following analogy.
Starting point is 00:52:47 Imagine that I placed a drop of ink in a cup of water. The ink, for dissipative systems, a dissipative ink, would disperse itself throughout the solvent, as random molecular fluctuations cause the ink molecules to diffuse down concentration gradients, until you had a maximum-entropy, dissolved sort of equilibrium within the heat bath, or the boundaries, supplied by the glass. That's not the kind of system that we're interested in; that's a 20th-century sort of equilibrium physics. What we're interested in is special kinds of ink that seem to gather themselves up again into a little globule.
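The first, dissipative case can be sketched numerically; a one-dimensional toy, not a fluid simulation:

```python
# The "dissipative ink": pure diffusion, no gradient flow (1-D toy).
# A drop of ink starts at a point, and its spread (variance) grows
# linearly in time -- ordinary second-law, equilibrium-style behaviour.
import math
import random

random.seed(2)
D, dt = 0.5, 0.01
ink = [0.0] * 5000                 # all ink molecules start together

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((v - m) ** 2 for v in xs) / len(xs)

spread = []
for step in range(1, 1001):        # simulate up to t = 10
    ink = [v + math.sqrt(2 * D * dt) * random.gauss(0.0, 1.0) for v in ink]
    if step in (500, 1000):        # record at t = 5 and t = 10
        spread.append(variance(ink))

print(spread)    # approximately [2*D*5, 2*D*10] = [5, 10]
```

There is nothing to oppose the dispersion here, so the cloud just keeps spreading; the "special ink" of the following passage is what changes that.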
Starting point is 00:53:36 And that gathering up, in the context of you stirring the glass of water, can always be written down, in accord with the Helmholtz decomposition, into two kinds of flow. One kind of flow is up concentration gradients. This is the gradient flow I was talking about. So the molecules are gathering themselves together, paradoxically, if you like, looking as if they're flowing towards each other, towards the highest concentration, the highest probabilities or log probabilities. And it's that gradient flow that is responsible for this self-assembly, the emergence of this attracting set of a small number of states
Starting point is 00:54:27 that I keep visiting. And then the solenoidal flow is this sort of circular flow around the concentration profile, the iso-concentration lines, whereas the gradient flow is going to the maximum concentration, or the maximum probability density. So that's non-equilibrium steady state. Crucially, that gradient flow is exactly balancing the dispersion due to the random fluctuations. To put that another way: the gradient flow that keeps things glued together, as it were, that keeps things pointing towards that attracting set, that pullback attractor, the states that have a high non-equilibrium steady-state density, is exactly offsetting the dispersive, dissipative effects of the random fluctuations. Which means that the random fluctuations realize the gradient flow. So you have to have random
Starting point is 00:55:34 fluctuations before you can have, at non-equilibrium steady state, the gradient flow. So what would happen if you took them away? Well, there would be no gradient flows, and things would just have solenoidal orbits. And then we have basically Lagrangian mechanics: a mechanics apt for describing large bodies, heavenly bodies, moons and earths and the like. All they can do is move around in circles. They've just got solenoidal flow, because the random fluctuations are averaged away, so they're effectively zero, so they can't realize the gradient flows, so things don't fall towards each other, you know, with a suitable potential energy that would define the
Starting point is 00:56:22 gradient flows, or the form of that non-equilibrium density. On the other hand, you could say, well, no, I'm more interested in systems that are hot and fast, really fast, like quantum fast, where the solenoidal bit can be ignored. I'm just going to focus in on the random fluctuations, or the impact of those on the density dynamics, and ignore the solenoidal flow.
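A toy two-dimensional sketch of the two limiting cases being contrasted (illustrative parameters only): with noise, the gradient flow balances the dispersion and you get a stationary cloud; with the fluctuations switched off, only the solenoidal orbit remains.

```python
# Two limiting cases for a 2-D state with stationary density
# p(x, y) proportional to exp(-(x^2 + y^2) / 2)  (made-up parameters):
#   gradient flow:   -D * (x, y)     down the gradient of surprise
#   solenoidal flow:  Q * (-y, x)    around the iso-probability circles
# (a) D > 0 with matching noise: gradient flow balances the dispersion,
#     giving a non-equilibrium steady-state cloud of unit variance.
# (b) D = 0, no noise: a pure solenoidal orbit; the radius is
#     (approximately) conserved, up to Euler integration error.
import math
import random

random.seed(3)
dt, Q = 0.005, 1.0

def step(x, y, D, noisy):
    gx = -D * x - Q * y            # gradient plus solenoidal components
    gy = -D * y + Q * x
    n = math.sqrt(2 * D * dt) if noisy else 0.0
    return (x + gx * dt + n * random.gauss(0.0, 1.0),
            y + gy * dt + n * random.gauss(0.0, 1.0))

# (a) noisy and dissipative: settles into a stationary cloud
pts = [(random.gauss(0, 3), random.gauss(0, 3)) for _ in range(1000)]
for _ in range(4000):
    pts = [step(x, y, 0.5, True) for x, y in pts]
var = sum(x * x + y * y for x, y in pts) / (2 * len(pts))

# (b) fluctuations averaged away: a pure solenoidal orbit
x, y = 2.0, 0.0
for _ in range(4000):
    x, y = step(x, y, 0.0, False)
radius = math.hypot(x, y)
print(var, radius)    # var near 1; radius stays near 2
```

Case (a) is the balance described above: remove the noise and the gradient flow, as in case (b), and all that is left is circulation around the isocontours.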
Starting point is 00:56:48 We return to systems that have detailed balance and we can write down all our favorite thermodynamic laws and derive all the integral fluctuation theorems that underwrite generalizations of those laws. So, from that perspective, classical mechanics, where general relativity might live, for example, lives at one end of the spectrum, very large, cold, no random fluctuations.
Starting point is 00:57:19 And then quantum mechanics lives at the other end, where it's dominated by random fluctuations and very fast gradient flows. And in the middle is where we live, where we've got both. So we're neither committed to just orbiting eternally, you know, orbiting some center, committed to a periodic orbit for the rest of our lives; nor do we dissipate or behave in a quantum way. We're sort of halfway between, at that right sort of scale
Starting point is 00:58:01 where there's lots of solenoidal flow: lots of oscillations that we are exposed to, imposed from a larger scale, such as tides, and that we generate ourselves, such as oscillations in our hearts and our physiology and our brains. But at the same time, we do have to contend with the random fluctuations. So we spend all of our time doing our gradient flows: gathering ourselves up, maintaining homeostasis, actively keeping ourselves within certain bounds, within that attracting set. And this moving up the gradient flow, is this analogous to what Schrödinger talked about when he said that life resists entropy in some way? Yeah, I think that's absolutely right.
Starting point is 00:58:44 You're asking as if you didn't know. That's exactly right. So, at equilibrium, certainly in closed systems that don't have sort of low-entropy pullback attractors, you will get the second law in the usual way, where all the random fluctuations eventually dissipate, and so you get dissipative systems and dissipative dynamics. And yet somehow we, in our existence, are an existence proof of a violation of that kind of behavior. And that's basically where the free energy principle starts. It just says there are things out there that have to be explained that seem to resist the second law.
Starting point is 00:59:35 So Schrödinger framed it in terms of negentropy. Right, right, right. You can frame it in terms of resisting that entropic dissipation by this gathering-up behavior. And it is those gradient flows that are responsible, when you just write down the dynamics in terms of the Helmholtz decomposition, for providing that balance between the dissipative effects of random fluctuations and the self-organizing dynamics that exactly balance those dissipative effects, to give you this non-equilibrium steady
Starting point is 01:00:15 state. Professor, when I was first researching the free energy principle, I kept coming up against cautions about how intellectually formidable it is. And if one were to go just by the warning signs, one would think that it was more abstruse than particle physics or general relativity. And I'm curious why you think it has this reputation. Here's an example. When you search general relativity, or particle physics, or quantum field theory, or the standard model, or any grand unified theory, I don't think I've ever read a single sentence, except on quantum mechanics, that says this is a difficult theory to understand. However, the Wikipedia page for the free energy principle says it is notoriously difficult to understand.
Starting point is 01:01:04 Yeah. So I'm just trying to think of other examples. I'm not saying that it's easy to understand in the least; what I'm wondering is, why do you think it has this reputation? Oh, I see. Yes. I think it's largely because somebody slipped that sentence into Wikipedia, to be quite honest. So it becomes a self-fulfilling prophecy? Absolutely.
Starting point is 01:01:28 It's entertaining. And of course, I know it's not as hard as quantum physics and thermodynamics, because I used to do those, and I can tell you the free energy principle, in the maths, is much simpler than quantum physics or general relativity. It is much simpler, but it is mathematical. And if you're in dialogue with psychologists who don't do
Starting point is 01:01:56 mathematics, then there can be this sort of inflationary reification of the ideas, and then there's a mystery, because you don't see the simplicity that is inherent in the mathematical formalism, in the functional forms that you're dealing with. But I quite like that. I think it's quite funny, really. Keep that in the Wikipedia page. Okay, so the free energy principle: do you see it as being a potential theory of everything? First of all, you have to decide what the desiderata of a theory of everything are. But one of the reasons I'm interested was from seeing that slide
Starting point is 01:02:30 about the Langevin equation, or the Kolmogorov, or whatever that equation is, and seeing that when you tweak the parameters, you can either generate Newtonian mechanics or general relativity, at least you're saying that now,
Starting point is 01:02:40 I haven't seen that, and quantum mechanics and so on. So that to me means it's a more fundamental principle than those, which makes it a candidate for a theory of everything. But then I just heard you say, well, there's the quantum mechanical world, let's say high fluctuations; then there's the low fluctuations of general relativity; and we live in the middle. So then is the free energy principle simply a principle of the middle, and not the one that generates all three? Or do you see it as being a theory of everything itself?
Starting point is 01:03:06 That's a very good question, and you're right; it deserves two completely different answers. I think the thing that the free energy principle brings to the table is only licensed by the presence of a Markov blanket. So basically, the way I look at this is that it's just applying bog-standard variational calculus of the kind that people have been using for centuries, well, not centuries, but in certain instances. Two centuries at the most.
Starting point is 01:03:38 Right. To get at formal formulations of classical mechanics and quantum mechanics and statistical mechanics or, strictly speaking, stochastic mechanics, thermodynamics. So it uses exactly the same maths. The only thing that's special about it is that it applies that maths in the context of a Markov blanket or a particular partition into inside and outside and separated by blanket states.
Starting point is 01:04:11 That's the only thing that it brings to the table. And in that sense, if you allow me to interpret everything as every 'thing', then thingness, as defined stipulatively by a Markov blanket that allows you to demarcate a thing from not-thing, is certainly a theory of everything, by definition, but in a very deflationary way. Right, right, right. Now, when you say deflationary, to me the way that you use it is different from how you described it, unless I'm misunderstanding it. The way that I hear you using the term is: there are concepts whose connotations we understand colloquially, but you use them in such an abstract manner that they coincide with the colloquial only seldomly, and so we shouldn't take them too seriously. For example, when someone
Starting point is 01:04:59 says surprise or belief, it doesn't mean that that atom has a belief. That's the way that I understand deflation. Am I misunderstanding deflationary? No, not at all. You're certainly understanding my use of the word, though it is quite likely that I'm abusing the word, or at least misusing it. But that's exactly what I meant, yes. And so it is getting under the hood, and not reifying, and not associating all the folk-psychological interpretations. So, yeah, I get that a lot. I have to suddenly pause and say, when I say a Bayesian belief,
Starting point is 01:05:34 I, of course, do not mean something you can talk about, that you have a conviction about; these are just conditional probability distributions. So you always have to sort of say, look, we're talking about something much simpler here than you might think. And I think that may in part be an explanation as to why people think that the free energy principle is difficult to understand. It's not; it's just that they think it is, because they're reading it using folk psychology, or,
Starting point is 01:06:03 possibly, from my point of view, over-reifying the mathematics. I see, I see. When I first started learning the free energy principle, what struck me was its connection to disparate fields: biology, physics, machine learning, and so on. And it reminded me of modular forms in mathematics, which keep coming up over and over; there are problems which previously were intractable that, when you recast them in terms of modular forms, become easier to solve. Do you mind listing some of the applications of the free energy principle that, let's say, reformulate previously convoluted, complex problems into something simpler, in the different fields?
Starting point is 01:06:46 Maybe, when I'm editing this, I'll even list, like, physics, and then here's how it helps physics, here's how it helps biology, here's how it helps machine learning. So do you mind going through a couple of examples? You can go through this quickly, because I'm sure you've done this many times, and I'm mainly interested in a list. Well, no, that's a very interesting question; no, I haven't. But you're absolutely right. So, you know, the theory of everythingness, which, you know, I use in a rather cheeky way to interpret everything and its thingness, is also, I think, claimed by the free energy principle,
Starting point is 01:07:28 in the sense you're hinting at: that it provides an explanatory framework, albeit a deflationary one, for many different disciplines. And you see that in terms of the thing that it provides an approximation to, or a bound on, which is the log probability of being in attracting states. You can interpret that in many different ways, and you spin off different kinds of approaches in the life sciences and the physical sciences that suddenly are seen as just different ways of expressing the same underlying mechanics. So a particular instance here would be interpreting the free energy as a value
Starting point is 01:08:33 function, a negative loss function. Yeah. So then you can write down reinforcement learning; you can write down expected utility theory in economics; you can write down optimal control theory in engineering. All of these things have in common the notion that you want to have controlled dynamics that optimize some loss function. But if the loss function is just the negative evidence, or the free energy, you've now got an explanation as to why all of these takes on interesting behaviours, of the kind that economists and behavioural psychologists and control theoreticians study, are looking at exactly the right kind of dynamics, because all of these systems have to possess this fundamental property. And then you can not only, if you like, provide a unifying framework, but you can now
Starting point is 01:09:26 start to understand, you know, well, where does the Bellman optimality principle come from? Well, you can actually cast it as a limiting case of a principle of stationary action, Hamilton's principle of stationary action, when applied to a Markov blanket, which is just the free energy principle. So if you wanted me to say in one sentence what the free energy principle is: it is the application of Hamilton's principle of stationary action to a Markov blanket. That's interesting.
Starting point is 01:09:53 So where do you get the Bellman optimality principle from? Well, if you take away lots of the uncertainty from the probabilistic treatment on offer from the free energy principle, you end up with the Bellman optimality principle. So that provides a unifying take on things like Bayesian decision theory. When you put the uncertainty back in again, you get something which is now a mixture of Bayesian decision theory and optimal Bayesian design, sometimes known as active learning in machine learning. So now you have a principled, first-principles account of the so-called exploration-exploitation trade-off, which just dissolves in the free
Starting point is 01:10:37 energy principle. The free energy has two bits to it. I won't go into the details, but in the same way that a good statistician is always trying to optimize the evidence, the marginal likelihood, or the evidence for, or the variational free energy associated with, their model, they're just trying to provide an account of the data at hand that is accurate and as simple, or minimally complex, as possible. So we come back to algorithmic complexity under the hood here. So in the same way that you've got these dual aspects to a good inference, complying with Occam's principle, possibly even Jaynes's principle of maximum entropy applied to belief structures, or posterior beliefs, or Bayesian beliefs, you've got this dual aspect to what is a good inference, which has this accuracy on the one hand and the complexity on the other hand. And when you apply this in an
Starting point is 01:11:40 inactive setting where you have to have to make moves and decisions and consider not just you know me as a sentient organ absorbing sensory states but I actually also act and choose where to look you get this dual aspect which is basically exploration and exploitation I want to minimize surprise or prediction errors by sampling those things that I predict I should sample, like being warm and happy and befriended. But at the same time, there's also another aspect, which is this epistemic aspect, this resolving uncertainty. So if surprise, and specifically self-information, if expected self-information is entropy,
Starting point is 01:12:21 then expected surprise is uncertainty. So in minimising surprise and in minimising prediction error, in expectation on average under particular decisions or choices I make, I'm also compelled to minimise my uncertainty about the world out there generating my sensations. So there's this epistemic part that comes into play, which, you know, in the same way that the accuracy and the complexity add together to give you the model evidence or the free energy, these two sort of exploration, exploitation things just add together to give you the expected free energy. So that, to my mind, provides another example of the unifying aspect, where you can, from a first principle account, put together two seemingly completely separate strands of Bayesian theorizing. On the one hand, Bayesian decision theory, which is all about making the right decisions under some priors some loss function and on the other hand
Starting point is 01:13:25 good experimental design that you know could be read as a sort of preparing approach to the scientific process or possibly beyond but they're just two sides of the same coin so that's a nice unification that I repeat dissolves things that I see as living in the 20th century, like the exploration-exploitation dilemma. There is no dilemma. It's just how precise are your preferences that drive that sort of, that's technically what is actually a risk. So the complexity, when you take the average complexity of the free energy under some expected outcomes before you've actually made a move on the world, that becomes a KL divergence between what you anticipate will happen
Starting point is 01:14:10 and what a priori you expect to happen. So that's risk in economics and in engineering. Sometimes it's KL control. What does KL stand for in the divergence? Sorry, Kolbach-Liebe. So it's just a particular, well, it is just a relative entropy. It's just a measure of, in fact, it's not a measure, it's a way of quantifying the divergence between two probability distributions. It's a measure in the deflationary sense.
Starting point is 01:14:46 Well, actually, no. That's very clever, but technically wrong. The reason I caught myself was the KL divergence between A and B is not the same as between B and A, so it's not a metric measure, which is why I can't use the word measure.
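A minimal numerical sketch of the asymmetry Friston describes (the two discrete distributions here are invented for illustration):

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions,
    given as lists of probabilities over the same outcomes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.7, 0.2, 0.1]
q = [0.4, 0.4, 0.2]

print(kl(p, q))  # ≈ 0.18
print(kl(q, p))  # ≈ 0.19, a different number: KL is not symmetric
print(kl(p, p))  # 0.0: zero only when the two distributions coincide
```

Because KL(p‖q) and KL(q‖p) differ, KL is not a metric, which is exactly why it is called a divergence rather than a distance.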
Starting point is 01:15:03 It's not a measure, but it's like a measure, so that's why it's called a measure. It's not a measure, but it's like a measure. So that's why it's called a divergence. So it's just a relative entropy. It's a way of quantifying how much you, if you read a Bayesian belief as a state of mind, how much have you changed your mind? And the key thing at the heart of, if you like,
Starting point is 01:15:24 applications of the free energy principle to inference and active inference and Bayesian decision theory, and the like, is the divergence between your beliefs before seeing some sensory evidence and afterwards. So if you're a Bayesian statistician, this would be the KL divergence between the prior and the posterior, literally how much you've changed your mind or moved your beliefs in response to assimilating some new data. Am I supposed to understand surprise as equivalent to a mismatch between
Starting point is 01:15:59 what you predict and what actually happens? So, error? Right, that's an excellent question. Is it more complex than that? Well, if you want to get into the nitty-gritty, surprise read as surprisal is a very simple concept. It's just self-information: the negative log probability of some outcome, usually a sensory state of your Markov blanket. So if something is highly improbable, or if the outcome, say your sensed physiological state, is highly unlikely given what you are, it has a lot of surprisal. The usual example here is a fish out of water. It's just a euphemism for self-information that scores the kinds of
Starting point is 01:16:52 outcomes that um would be characteristic of this self-organizing system so it's just a way of writing down um or scoring the probability of being in this state if you're that kind of system or technically if you're you have that that kind of pullback attractor there's another kind of surprise which is the Bayesian surprise which is often confused with that which is I think closer to the notion that we were just talking about, which is the degree to which something causes me to change my mind, the information gain, which is, in fact, the complexity. So the complexity, basically, when you, if you're looking at this as a statistician, you're given some data and you had some prior beliefs about what caused those data before
Starting point is 01:17:44 you saw the data and then you use bayes rule to combine the likelihood of those of those data given your beliefs about how they're caused with your priors you come up with the posterior so posterior after seeing the data this is now your belief so you start with your prior belief and then you have your posterior belief and then there's a movement and that movement is this KL divergence that we're talking about that scores the amount you've changed your mind that just is the complexity it's the complexity cost that you have paid for providing a more accurate account or an accurate account of the data that you've explained that has moved your beliefs from priors to to postures so it is the the cost of um changing beliefs in order to do belief updating or basing belief updating i'm going on about that because it's
Starting point is 01:18:41 quite interesting though because you know via things like landauer's principle and the Jorzinski equality, that degree of movement costs thermodynamic energy. from the point of view of existential goodness as scored by minimizing free energy or maximizing marginal likelihood or model evidence is the kind of person or computer or artifact that when confronted with some new information will actually process and assimilate that in a really thermodynamically efficient way so they'll do it with a very cool brain they won't use very much electricity or food and very little will change and what that basically means is their prior beliefs were already quite close to the posterior beliefs so they already had a good idea of what's going on so they were not surprised so um and in not being surprised they're trying to minimize that complexity cost in in terms of the price you have to pay for providing an accurate account of some data.
Starting point is 01:19:52 You are also maximizing the efficiency of your belief updating. That's a nice thing, because what it says is, if you're going to build good artifacts, good artificial general intelligence, what you're looking for is a really small thing that can be powered with a battery. That's the kind of thing that'll do the best kind of belief updating; it's really tailor-made for the inference problem at hand. Sorry, we wandered away from your question about what surprise is, but surprise is just basically the implausibility of this happening to me. When you were talking about what makes a good computer or a good person, you were using good in a different manner. I assume you're using it in terms of adaptive, or effective, or accurate, or
Starting point is 01:20:42 at least not using too much energy to accomplish its tasks. That to me sounds like almost a restatement of Occam's razor that you want the minimal assumptions to account for the most data. Is that similar or no? Oh, no, it's exactly the same thing. So that's excellent. Yeah, no, that's exactly. So, you know, the pressure to provide a simple account of the data is exactly getting this, paying the least complexity costs possible for the accurate or the fitting. And that is exactly Occam's principle. important, and that's probably more important than the accurate fitting, then you will end up with machine learning artifacts and devices and schemes that overfit. And if they overfit, they don't try to minimize the complexity of the explanation at hand or effectively their inference, then you will get an inference and learning that does not generalize.
Starting point is 01:21:50 So inevitably, if you allow for overfitting, then you overfit today's data and tomorrow's data are very difficult to explain because you've over-explained today's data. So just practically what this manifests as is a problem of sharp minima and difficult-to-solve optimization problems in machine learning.
Starting point is 01:22:11 You get stuck in local minima that have a very, very difficult... Ah, I see, I see. And that's because you haven't flattened the minima by building into your objective function this simplifying aspect, this complexity suppressing aspect. And of course, once you have an objective function which actually entails that complexity minimising imperative, then the minima by definition, imperative, then the minima, by definition, the free energy minima are always shallow, so that you've got a lot of latitude to wander around your minima. So you're not committing to
Starting point is 01:22:52 a particular sharp explanation of over explaining the data. So it's a really practically really important observation that I should say that when I talk about good, I just mean minimizing free energy or maximizing evidence. So that's the only good for me, which is fit for purpose in this world, having a good model of this particular world. Razor blades are like diving boards. The longer the board, the more the wobble, the more the wobble, the more nicks, cuts, scrapes. A bad shave isn't a blade problem. It's an extension problem. Henson is a family-owned aerospace parts manufacturer that's made parts for the International Space Station and the Mars rover. Now they're bringing that precision engineering
Starting point is 01:23:34 to your shaving experience. By using aerospace-grade CNC machines, Henson makes razors that extend less than the thickness of a human hair. The razor also has built-in channels that evacuates hair and cream, which make clogging virtually impossible. Henson Shaving wants to produce the best razors, not the best razor business. So that means no plastics, no subscriptions, no proprietary blades, and no planned obsolescence.
Starting point is 01:24:01 It's also extremely affordable. The Henson razor works with the standard dual edge blades that give you that old school shave with the benefits of this new school tech. It's time to say no to subscriptions and yes to a razor that'll last you a lifetime. Visit hensonshaving.com slash everything. If you use that code, you'll get two years worth of blades for free. Just make sure to add them to the cart. Plus 100 free blades when you head to h-e-n-s-o-n-s-h-a-v-i-n-g.com slash everything and use the code everything. Let me see if I can explain what I heard in another manner. Let's say you're playing roulette.
Starting point is 01:24:40 What you want to do is you want to have a minimization of your mismatch of prediction versus actuality. But let's say you have a good reason to believe that 33 black is going to be hit. You don't put all your eggs in that one basket because you could be wrong. So you need to have some spread. It's not a direct delta function. It's not just that. And then that spread is equivalent to entropy. Is that correct? Because there's a trade-off between you wanting to be completely accurate, but then leaving one's options open.
Starting point is 01:25:13 Yep. No, that's absolutely right. So that, well, you use that wonderful phrase, leaving one's option open, having that latitude in play is a very important aspect of this expected complexity minimizing imperative that you get from you know sort of trying to minimize the expected free energy for expected following a particular choice or action so just in that particular instance, if you are acting according to expected utility maximization, and you thought that the black 33 was the most likely outcome, then you put all your money as a delta function on the outcome. That not what KL control does that's not what risk
Starting point is 01:26:06 sensitive control it's not what the sort of the exploitative part of minimizing expected free energy does it tries to match your anticipated outcomes with your with your prior beliefs that incorporates exactly that kind of uncertainty and And indeed, there may be situations where your behaviour at the roulette table doesn't look as if it's just about trying to maximise the amount of money that you'll get following the bet. You may be sometimes compelled to engage in epistemic behaviours that resolve uncertainty.
Starting point is 01:26:53 So say that you have the hypothesis that the the roulette wheel was rigged um and it was rigged in a way that depended upon the way that people placed bets then you might actually place a bet just to see what would happen in terms of the you know what the person who's in charge of spinning the wheel does so you know not every move is just about minimising or realising some prior preferences or minimising some loss function. A lot of our moves are actually to resolve uncertainty and disclose things that we don't know, and resolve our uncertainty about the contingencies under which we're operating. So then you can get into some sorts of interesting paradoxes in terms of deceit and regret and
Starting point is 01:27:32 making moves just to reveal what's going on out there and particularly in transactions with other people. So that's another, you know, just beyond the KL, the engineer's perspective on KL control, the matching behavior, which is in psychology, this, you know, you try to match the probability distributions from which you sample your choices to the underlying payoff structure at hand. So that, you know, that would be one way of looking at the utility of a KL control or risk sensitive kind of control. But beyond that, there's also explicit uncertainty resolving moves you can make, such as checking, which is a fair gambling house and a non-fair one on Google before you actually start, commit to going to gamble in this house as opposed to that house. Yeah. Speaking of gambling,
Starting point is 01:28:29 have you heard of Newcombe's paradox? Oh, okay. Well, look it up at some point. I'll say it here, but the paradox is, imagine you go to a circus tent and then someone says, here is a box.
Starting point is 01:28:44 It's clear and it has $1,000 in it. That's box A. Then box B is sealed. Can't see it. But the fortune teller inside there says, I think it's you can choose either the sealed box or both boxes. Those are your only choices. Now what's in the sealed box is $1,000,000. Oh, I'm messing it up. You'll just have to look it up. It's something, please let me spasmodically verbally vomit here for a second as I try and get it correct, because it's an interesting paradox. And I wanted to know what your theory had to say about it. Now, obviously, you can't answer that in real time because you might have to think about it. But something like,
Starting point is 01:29:20 you have a sealed box, you have a million dollars, a million dollars may be in it or it's not. I'm trying to work through what the actual paradox is. The fortune teller says, I have predicted, I'm 100% correct. Maybe there's a sensor. You can say there's a sensor that analyzed your brain and it knows what you're going to do. And she either put $1 million in it or didn't, depending on if you choose. Ah, choose ah right she says if you're going to choose this single box just the box alone the sealed box then I've included a million dollars in it I've already predicted which one you're gonna do by the way but I'm just letting you know if you just pick the sealed box you get a
Starting point is 01:30:00 million dollars I put a million dollars in it if you choose both boxes because you're greedy I put nothing inside the sealed box. So then the question is, what do you do? Now you have to make many assumptions. You have to assume she's telling the truth when she says that she's able to predict you with 100% accuracy. But even if it's 99% accuracy, the paradox still holds. The question is, well, what do you do? You may say, okay, now that I'm there, she made that decision beforehand. The million dollars is either there or not. I may as well take both boxes. So that's one way of reasoning through this problem. But another one is to think, okay, all the people that just selected the sealed box get 1 million because that's the way that she said it's worked. She said also that, hey, I've
Starting point is 01:30:42 done this millions of times and I've always predicted correctly. So then what do you do? Do you risk taking both boxes or do you just take one? And it shows a difference in, I forget what it's called, either causal decision theory and there's another type of decision theory, one where you maximize evidential based decision theory. So usually those two imply the same solution, but here they don't. Anyway, I wanted to know what the free energy principle says one should do, because it's
Starting point is 01:31:13 a famous paradox. Let's forget about that then. I'm going to have to go and Google that one. It sounds even more complicated than the Monty Hall paradox problem. So it sounds intriguing. So the paradox is basically the paradox that is confronted by you in the situation as opposed to a paradoxical behavior that people actually... Right, right, right. You want to maximize the amount of money.
Starting point is 01:31:43 So let's assume you're instrumentally rational. That is, you want to maximize your pleasure, whatever it is. What do you do? Okay, so forget about that. Something I was wondering about the free energy principle is how much of it do you see as a principle, like a law, like a law of physics versus a compression mechanism? So for example, what I mean by that is when you look at Maxwell's original paper for his four laws, there are actually 26 different equations or 24. It's two pages long because he didn't compress it down to the four that we now use. So these four aren't actually four equations. They're just code. And then I always wondered how much of our Lagrangians are, sorry, the minimization of action is an actual principle of nature
Starting point is 01:32:25 versus this compression mechanism that says, here are your complicated equations of motion. What you can do is you can package it into this simplified little Lagrangian that you then put into the Euler-Lagrange equations and you crank out the equations of motions. So it's not actually a principle in and of itself. It's more like a zipping of something that's convoluted. So do you see the free energy principle as a law or do you see it as akin to what I made an analogy about with Lagrangian mechanics and the Lagrangian? That is, it's just compressing, could just be compressing. Yeah, I think that I see it as a law, but I think that distinction is very nicely articulated. And I think it's really in play in my world in a slightly different way. free energy principle is just a variational principle of stationary action cast in terms
Starting point is 01:33:28 of density dynamics of things that have Markov blankets and by implication at steady state. So it is as it plays a role exactly the same role as Hamilton's principles of stationary or at least action so it either applies or it doesn't so it's not a it's not a theory and you're in many senses it's not very useful you know it's just a way of writing down and zipping things up as you nicely put it in an internally coherent way that speaks to sort of you know know, as all, I think these useful principles do some symmetry or some invariance property. For me, the variational principles of stationary action are the simplest and most graceful way of expressing these symmetries. You can do gate theories, or I'm sure there are other formulations, but for me,
Starting point is 01:34:30 I'm sure there are other formulations, but for me, the best way is just to use that principle of least action or stationary action. So that's what the free energy principle does. But as you say, you know, for the free energy principle, what does that actually mean in practice? that actually mean in practice well it means that anything that exists can be understood and simulated or built as a gradient flow on a free energy function a function of what well the free energy is a function of the states the data the states of the Markov blanket and the Bayesian beliefs about the posterior beliefs about the causes of those data that are encoded by the internal states. So where does that, where does the, where do those beliefs come from? Well they're, or the free energy is defined, if you think of the sort of prediction error version or reading of free energy, they are defined by a generative model that predicts the sensations that you would get if that model was right. So you have to have a
Starting point is 01:35:37 generative model. So, you know, there's going to be a universe of generative models you could plug into the underlying gradient flows, the Helmholtz decomposition that we talked about that basically describes all dynamics, you know, for systems that can be cast as a launch valve with Markov blankets. So the question is not so much now the free energy principle or the application of a variation principle least action but really what is the generative model so at this point i think then you move away from the principle and you start now to get into the world of process theories and hypotheses so and you can ask that question or you can have those hypotheses at a number of different levels you can actually take a working system a person say if you're a psychiatrist you know somebody who might have obsessional compulsive disorder and you may say well i want to understand
Starting point is 01:36:31 them now as making decisions under some model of their lived world some generative model i know their neurodynamics and i and i know the principles that underwrite their choices but what i don't know is their generative model and their prior beliefs. So I'm going to now reverse engineer on the basis of their behaviour. What are their prior beliefs? What do they actually believe is going on
Starting point is 01:36:54 to best explain this behaviour? So that will be one application. Another application might be building artificial intelligence machine artefacts, where I now write down the um you know the prior preferences of a generative model the kinds of states that i want this system uh to aspire to and i then just let it you know i just um equip it with um active states and actuators and sensory states and sensors, and make a little robot, and off it will go,
Starting point is 01:37:26 and it will go and epistemically forage, always with a mind to learning about its world, but also under the constraint of its prior preferences, like keeping its battery charged. So that will be another instance. But in both applications, I've had to either reverse engineer or commit to a particular generative model and that's where I think the you know you moved a long way away from principles and if it's a question about does this person with obsessional compulsive disorder
Starting point is 01:37:58 have this kind of generative model on that kind of generative model and that's now a hypothesis that we've falsified with respect to the evidence um if it's a question of building an artifact do i use a discrete state space um generative model or a continuous one which means i'm now committing to different kinds of message passing if it's discrete it could be a variational message passing or belief propagation. It's continuous. I'll be using things like linear quadratic control or Kalman-Busey filters. So again, you're now in the mechanics of the processes that are realizing these gradient flows. And of course, there are no principles or rules that tell you you're right or wrong. These are all process theories that might work and that might not work so you know i think the zipping that you talked about is from my point of view it's basically
Starting point is 01:38:54 um unzipping the principle to to realize that under the hood it is the generative model that supplies the free energy gradients that drive the gradient flows that's the big open question and getting the generative model right is in most instances scientifically or indeed in industry and possibly even in medical translational practice translation of these ideas in medical practice. It's all about getting the right kind of generative model and your hypotheses about that generative model. You said a couple of statements that reminded me of Jordan Peterson. Now, I'm not sure how much you follow Jordan Peterson at all, but he's a proponent of watching what someone does to infer their beliefs. So that is, okay, well, you're not sure. So you understand what that means.
Starting point is 01:39:45 And then number two, he also mentions order and chaos and the balance between them, which is Taoist, that you don't want the elimination of one or the predominance of another. You want a special balance between them, which reminds me of what you said about surprise. And then I believe it's KL divergence. Right, okay, you don't want one to dominate.
Starting point is 01:40:06 There's a delicate balance. Have you heard Jordan Peterson speak on order and chaos? And do you see any correspondence between the free energy principle and what he says about chaos and order? Yes, I think so. I mean, I would put this, I mean, there are a number of different routes one could take to that kind of issue.
Starting point is 01:40:28 You could ask yourself, is there any first principle account, for example, of self-organized criticality? So do you remember some edge of chaos notions from Stuart Kaufman? And self-organized criticality. That was the self I was trying to remember before, but that's another one of these branches of physics, which starts off with self. So SOC, self-organized criticality. So this is a notion um that it is um inevitably the case that any
Starting point is 01:41:08 self-organizing system will organize itself to a regime of critical slowing um and uh dynamical instability um which you know stuart kaufman might might might have articulated in terms of systems moving themselves towards the edge of chaos. So towards separatrices, towards sort of bifurcation into regimes, usually associated with multistability or metastability, depending on the nature of the dynamical system but this tendency to um to to to put yourself um in a state where you have a repertoire of dynamics available to you simply because you are near disorder you're near chaos. So but you never actually go chaotic, you just increase the latitude of all the repertoire
Starting point is 01:42:12 of things, you know, of dynamics that might happen to you. And that, from that perspective, then there are sort of, there are, I think, a number of interesting things that the free energy formulation brings to the table. The first we've actually already touched on, which was this building in to an optimization perspective, shallow minima, to preclude the existence of sharp minima. And just by having effectively Occam's principle baked into your understanding of life or indeed any kind of self-organization as trying to optimize this bound on evidence
Starting point is 01:43:04 or marginal likelihood, then you're necessarily saying that you want to have these low curvature minima, you want to occupy low curvature minima, that if you now ask what would that look like in terms of, you know, coupling between internal and external states in the context that the gradient flows are informed when at free energy minima by very shallow gradients. What do you get? Well, you get exactly this latitude for excursions which have a long correlation length. So you get this, you know, the kind of critical slowing that is associated with self-organized criticality so often so heavy tail distributions in terms of um in terms of um
Starting point is 01:43:55 uh sort of um sort of covariance functions for for example so that i think there's a simple and mathematically and deflationary account of the balance between order and chaos, at least, from a purely dynamical perspective that is on offer via the minimization of free energy with gradient flows, simply because you are dealing with minima that don't trap you. They allow for that latitude so that you avoid a particular solution. Now, sharp would be too much order. Absolutely. Yeah. So you just get locked in.
Starting point is 01:44:43 You literally get stuck in a rut. too much order. Absolutely, yeah. So you just get locked in. You literally get stuck in a rut. You get locked into this particular explanation, and any slight movement away from a very sharp minima incurs an enormous penalty. You just don't make that move. But if you've got
Starting point is 01:44:56 a very shallow basin of attraction, if you like, and I repeat, I think the beautiful thing about the free energy functionals, or this free energy functional, is that it's built to be shallow. Perhaps an analogy would help here. If you think of a free energy landscape that is parodied by a mountain landscape right from the high the mountains mountains in the Scottish Highlands right down to the estuaries at sea level then what you typically tend to get is that
Starting point is 01:45:32 very high terrains of any landscape have lots of high frequency ravines and sharp valleys and sharp minima. So there's a little stream at the top of a mountain will be carving its way through very sharp trajectories and then meeting slower and slower and wider and wider streams as it comes down the mountain. But also what happens is that as you come down towards sea level, as you get to lower and lower free energy levels, the terrain itself becomes smoother. So all the minima now have smaller curvature so that you don't get those sharp little landscape features that you saw high in the free energy or high in the mountains anymore. or high in the mountains anymore what you get are now much much more gentle landscape where everything because it's a lower altitude or a lower free energy has to be smoother it has to
Starting point is 01:46:31 have, by Occam's principle, less of a curvature. And what that affords is the latitude for the river now to start meandering around, sometimes forming oxbow lakes and other features, the kind of meandering that you see as a large river starts to wander and oscillate as it gets towards the estuary and towards the sea. And it's that sort of wandering around which is, if you like, the permissive reflection of what can happen with a low-curvature free energy functional or objective function. In the sense that if the system is trying to minimise its free energy and get to those estuaries, it can do this slow wandering around. You get this critical slowing that you see dynamically, which characterizes all kinds of interesting behaviours, from neuronal avalanches in the brain through to the markets. In the old days this might have been known as catastrophe theory, but it's the same notion: that we're close to the edge of instability,
Starting point is 01:47:45 which means we have the latitude to explore different states on some objective function that we wouldn't have if we were stuck in a rut and committed to a particular solution, having overfitted the data, for example, or got trapped in a local minimum. So that's one thing. There is another aspect, though, which we haven't touched upon, which I think nicely follows on from this notion of getting the
Starting point is 01:48:13 generative model right. So if you remember, the free energy functional, or the model evidence, needs a model, so that the model can have the evidence. And of course, most of the time you're dealing with models that have unknown states, such as the models that underwrite things like Kalman filters, and unknown parameters, such as the generative models that might do weather forecasting. But there's another level, which is the very structure of the model itself. So even if I knew all the states, the time-varying unknowns generated by a particular model, and even if I knew exactly all the particular connection strengths and rate constants and all the other parameters and contingencies you need to know to make this model predict the data perfectly, a perfect explanation for
Starting point is 01:49:10 this observed world, I may not know the structure of the model. So if I was doing, say, deep learning and I wanted to build a deep convolutional network: how many layers do I use? Two, four, eight, 16? These are really important architectural, structural problems that are attributes of the model. But the model can always be scored, given some data, in terms of its free energy or ELBO, the evidence lower bound, which means there's always an answer if you can create a space of models to evaluate.
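As a hedged sketch of what "scoring a space of models" can look like in practice: here polynomial degree stands in for model structure, and BIC serves as a crude stand-in for an ELBO or negative free energy, accuracy minus a complexity penalty. The data, model family, and scoring rule are all illustrative assumptions, not anything specific to Friston's software.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 50)
# Toy "sensory data" generated by a quadratic world plus noise.
y = 1.0 + 2.0 * x - 1.5 * x**2 + rng.normal(0.0, 0.1, x.size)

def score(degree):
    """Score one model structure (polynomial degree) against the data.

    Log-likelihood (accuracy) minus a BIC-style complexity penalty,
    a rough stand-in for an ELBO / negative free energy.
    """
    coeffs = np.polyfit(x, y, degree)
    resid = y - np.polyval(coeffs, x)
    n, k = x.size, degree + 1
    log_lik = -0.5 * n * np.log(np.mean(resid**2))
    return log_lik - 0.5 * k * np.log(n)

scores = {d: score(d) for d in range(8)}  # the "space of models"
best = max(scores, key=scores.get)        # structure with the highest evidence proxy
```

A one-parameter line underfits (poor accuracy), a degree-7 polynomial overfits (the accuracy gain no longer pays for the complexity), and the scoring rule adjudicates between them without any extra machinery.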
Starting point is 01:49:47 So this is known in cognitive neuroscience, by people like Josh Tenenbaum, as structure learning. And it's the problem of basically exploring the right structure of models by adding things in, taking things away, collapsing things together. You talked about modularity before. That's a really important architectural aspect of these models, usually cast in terms of factorization, exploiting conditional independencies to get simpler models, so that you can have a modular architecture in the
Starting point is 01:50:17 generative model. So all of these structural features, the hierarchical depth, the degree of lateral factorization or modularity, the number of parallel streams, the number of non-linearities, all of these things need to be optimized to get the good model, the free-energy-minimizing model. So how would you do that? It's at this point that you come back to disorder and chaos. One really efficient way of exploring a space of models is to actually exploit the chaotic dynamics, or at least the stochastic dynamics, that you get, for example, in natural selection. You could take arguments from evolutionary theory, and again, coming back
Starting point is 01:51:09 to Stuart Kauffman, in terms of his formulation of selection for selectability. Have you come across this as second-order selection? No, actually, the way that I encountered Stuart Kauffman was reading about his expansion of Schrödinger's What is Life? Schrödinger posed three questions, and Stuart Kauffman answers the second two, so two and three. I didn't read much about him afterward. And I'll ask you about Stuart afterwards. But you'll have to tell me what the questions were; that sounds very interesting. Yeah, sure, sure, sure. I'm talking about his work on artificial life that speaks to this edge of chaos and the necessary role of disorder or
Starting point is 01:51:53 chaos in any evolving system that needs to explore different ways of being. So it's interesting because, of course, the degree of chaos that you bring to the table, the degree of mutation, for example: there will be an optimum degree of mutation that depends upon the volatility of the environment. And the degree to which you exploit or leverage chaos now becomes subject to selective pressure. And just as an aside, you can always recast selective pressure as basically a pressure to minimize free energy, if you read free energy as the adaptive fitness of any phenotype in a setting of natural selection. So what he's saying is that there is an optimal degree of changeability and chaotic exploration
Starting point is 01:52:48 of any model space or any way of being that, to my mind, provides a very nice perspective on this tendency for self-organised criticality: that we actually move ourselves to the edge of chaos just to position ourselves so we can explore and make sure there are no better ways of doing things. So I think that aspect of the right mixture of order versus disorder, order versus chaos, comes out at the level of the structure learning itself and the repertoires of different alternative hypotheses
Starting point is 01:53:26 about modeling our world or making decisions in that world. Okay, when we're modeling this world, most of the time, what we do is we look at nature and then we see principles and then sometimes we can infer about the brain. So let's say we understand how atoms assemble,
Starting point is 01:53:41 then we understand how molecules assemble, then we understand cells and so on. So obviously it's not as simple as that. But then you, in one of your interviews, posed another mechanism. That is that we can look at the brain structures, and we can infer about the world. So for example, I believe you said that the fact that an axon is long and reaches out is a reflection of spooky action at a distance, or action at a distance. And then I was thinking, well, what about the hierarchical structures of pyramidal neurons? Do they reflect that our world is hierarchical? And first of all, why would that be the case? Why does it have to be mirrored?
Starting point is 01:54:17 Second of all, there's another structure, that is, the bi-hemispheric structure of the brain. I know there's a fair bit of myth about the brain, left brain versus right brain, but there's also some truth to some parts of that. Now, Iain McGilchrist, I'm sure you're aware of his work, or his name sounds familiar; he explored that as well. So my question is, what can you infer? Why does it have to be the case that the internal structure of your brain would mirror reality in some manner? And second, what about the bi-hemispheric structure? What does that say about our world? Okay, that's a really good question, which could take us in a number of different directions. The notion is that if you are in the game of predicting, minimising surprise through the lens of minimising prediction error, then you want a generative model that is, at some level, recapitulated on the inside in order to generate and afford the right kind of predictions. That is the good regulator theorem, which arose towards the end of the cybernetics movement through people
Starting point is 01:55:48 like Ross Ashby, regarded by some as a father of self-organisation, this notion that every system that controls or regulates its environment must in essence be a model of that environment. So there's an isomorphism between the controller and the controlled. And that certainly is the case. Is it an, sorry to interrupt, I'm so sorry. Is it an isomorphism? Like, is it exactly mirrored? Because I recall when I was speaking
Starting point is 01:56:18 to Bernardo Kastrup, he said we can't model our reality exactly, because if we did, we would dissolve into an entropic soup. And then I said, what do you mean? And then he said the argument was too convoluted for him to state at the time. And I think he was referring to you. So is it isomorphic, or is it just...? No, you're absolutely right. It's certainly not isomorphic, because, as you say, the best model of the world is the world itself. I mean, that's a truism which everyone celebrates. So, no, it's a sufficiently good model or an optimally good
Starting point is 01:56:52 model in the sense that it's the simplest caricature of the system, or the parts of the system, the subspace, you're trying to control. So I still think it's useful to consider the good regulator theorem. But you're right, it's not isomorphic in any sense. And even less so when you consider that a lot of the time we actually build our own sensations when we move. And certainly in terms of sensing our own body, we are in charge of basically creating that sensorium. However, let's just stick to the original question. So no, it's not isomorphic, but it certainly has the right architecture to be able to produce a simplified summary or prediction of what's going on that is conserved every time
Starting point is 01:57:46 around. So you're only interested in predicting things on average, so that you're revisiting particular states; you just need to get the gist of what's going on. So it's a much simpler one, but it still must fundamentally have the same kind of causal architecture, the same conditional dependencies. And you've highlighted a really important one, which is the symmetry between our two hemispheres. You also mentioned something I'd forgotten about, which is the very existence of neuronal processes, long, thin connections that our neurons are equipped with, which you won't find anywhere else in any other organ. The liver doesn't have them.
Starting point is 01:58:31 The heart, to a certain extent, does have fibres, but they're not nice and long. Just reading the structure of the generative model as telling you something about the kind of universe that's generating the data that this model is trying to predict tells you immediately you've got action at a distance. Which is not necessarily a given, but it tells you that this creature must contend with a world where there's some kind of action at a distance; it can see things in the distance, for example. So the contention would be, if you had a worm that could not see things at a distance, then it probably wouldn't have very long axons.
Starting point is 01:59:16 And it would be quite comfortable having lots of short-range axons that were quite sufficient for modeling a world where it's just immediate contact and short-range causality that's generating its sensations. Okay, quick question. So I'm speaking to a camera right now, and its sensor is fairly flat, and yet it can pick up from far away.
Starting point is 01:59:37 It can pick up a mountain. Now, I understand that the camera isn't acting on the world. Are you saying that if it was to have some embodied enactment, then it wouldn't use the sensor in the way that it's formulated right now? It would use axons in some way? It would use long wires in some way?
Starting point is 01:59:53 If you actually... Well, okay, let's pursue that analogy then. So let's assume you're going to be using some kind of CGI, computer graphics, to generate a visual scene that you could use in a movie. What that would necessitate is basically a machine with lots of wires, because it's all action at a distance. So you couldn't do it on a computer that didn't have big buses and the ability to move data around in a way
Starting point is 02:00:36 that would recapitulate the action at a distance that is necessary for CGI. So particularly the ray tracing that's required to basically render a scene. That's massively compute-intensive. That requires a lot of, if you like, hardware; you cannot do it with just local computing, you actually have to do lots and lots of message passing. Yeah. And we're talking now about the sorts of architectures that computer scientists would use. So it's just that there are certain computations you can do without the kinds of architectures you'd find in computer science,
Starting point is 02:01:25 computations that just involve local interactions. But as soon as you have to actually generate virtual worlds, or worlds that entail action at a distance, you get a different kind of connectivity and a different kind of structure. So that's what I was really trying to intimate there. My favourite example, before we turn to the interhemispheric one, is the differentiation between a dorsal stream and a ventral stream, primarily concerned, respectively, with where things are and what they are.
Starting point is 02:02:01 And this speaks to that modularity that we were talking about earlier on: that somehow our brain has found a really simplifying device to create a modular architecture, by leveraging the conditional independences between whatness and whereness in objects in the kind of universe we live in. Put simply: knowing that this is a cup doesn't tell me where it is; knowing something is over there doesn't tell me what it is. So that means that if I'm trying to generate predictions, I can have one part of my brain doing the whatness and the other part of my brain doing the whereness, and then I can bring them together to actually explain the sensory input. And that keeping things apart, keeping these parallel streams apart, means I don't have to have the complexity cost of all the connections between them.
Starting point is 02:02:52 I don't have to represent every object in every position in the world. I can just represent what it is and where it is, and then just bring them together as part of my generative model. So that tells you something quite fundamental. If I find a brain that has this segregation into what and where, I know that they live in a world of objects, where, in their universe, things are conserved when they move around. Ah, aha, aha, interesting.
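A back-of-the-envelope illustration of the complexity saving from factorization (the sizes here are made up for the sketch): a joint code needs a state for every object-location combination, whereas a factorized what/where code only needs the two streams, combined at the point of explaining the sensory input.

```python
# Hypothetical sizes for a toy visual world.
n_objects, n_locations = 1000, 500

# A joint representation codes every (object, location) pair explicitly.
joint_states = n_objects * n_locations

# A factorized representation exploits the conditional independence of
# "what" and "where": separate streams, bound together only when
# explaining the sensory input.
factorized_states = n_objects + n_locations

saving = joint_states / factorized_states  # > 300x fewer states here
```

The multiplicative-versus-additive scaling is the whole point: the saving grows with the size of either stream, which is why exploiting conditional independence keeps the complexity cost of the model down.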
Starting point is 02:03:28 So that brings us to the delicate issue of why we've got symmetrical brains. I think quite simply because we've got symmetrical bodies. But then you may be asking why we've got symmetrical bodies, and sorry, I'm not going to try and answer that. Well, I was actually referring to the asymmetry of the brains. All right. That the left, as, let's say, Iain McGilchrist would say, is more concerned with manipulation and pinpointing, making things definitive.
Starting point is 02:03:53 And the right is more concerned with exploratory motion. So even at nighttime, when you're looking for your watch or your clock, you actually explore with your left hand, because it's controlled by the right brain. Naturally, you make more exploratory movements with your left. I'm not saying anything you probably don't know. And then with your right, most people are right-handed and they like to, well, whatever. So then I was wondering, is that a reflection of what the world is composed of, or comprises, in some manner? And then also, how does one know when one is taking this too far? So for example, just because the brain has a morphological structure of foldedness and gyri and sulci and so on doesn't mean the
Starting point is 02:04:32 world is foldy, or that my experience is hilly. How does one know when to apply it and when not to? So I was just thinking about all your challenging questions one by one. I like the one about the world not being foldy. That's very nice. Mystically, one can make an analogy and say, well, the world is complex and fractal-like and nothing is ever the same, and you can look at it from multiple vantage points.
Starting point is 02:05:04 So it sounds to me more like one is playing a linguistic game, in that example that I just gave, rather than actually picking out a property of the world that's reflected in the morphology of the brain. Yes, yes. So I also just remembered that I saw a presentation by colleagues of mine recently that actually, interestingly, made the foldedness a possible reality. But that's a distraction, and a unique and very exciting observation. But you're right. So when I'm talking about structure, the only structure that matters is the same kind of structure that underwrites a Markov blanket: conditional dependencies. So I'm talking about a
Starting point is 02:05:45 connectivity architecture here. I'm not talking about the physical shape of the brain. I'm just talking about what is connected to what and what is not connected. So all I need is the graph, if you like, if you're a graph theoretician. I just need the adjacency matrix, or the connectivity, usually the directed connectivity matrix. That, for me, is what defines the structure. So at that level, the kinds of structures that can be defined purely in terms of the adjacency or the edges on a graph are things like the number of hierarchies, or the number of parallel streams, or the number of modules or clusters, or the degree of small-worldness, if you like,
Starting point is 02:06:43 or clustering indices; I can't remember all the graph-theoretic terms for them. But crucially, it's just that, defined in terms of connectivity. And I think, within that remit, without going into "the world doesn't have cortical folds", which I agree is a brilliant sanity check on taking this too far, but within the remit of instantiating and biophysically realising causal contingencies and associations in terms of connections between biological systems, in particular
Starting point is 02:07:22 neuronal systems, I think you can actually play this game, and play it for quite a long time, in terms of the hierarchical depth and in terms of this modularity and this factorization that we were just talking about. And in particular the hierarchical depth, I think, is a very important one, because you're asking me, well, why should the world be hierarchically structured?
Starting point is 02:07:51 But it is necessarily hierarchically structured if one just considers a separation of temporal scales. There just has to exist, in terms of a coarse-graining applied to any dynamical system, a progression of slower and slower stuff that has a more coarse-grained aspect to it, which you could elaborate recursively, in principle, for an infinite number of levels. So there does exist causal structure out there in any sparsely connected, in the dynamical sense, world, that I think fully licenses an interpretation of the corresponding architecture, and in particular its hierarchical depth, as somehow mirroring or reflecting
Starting point is 02:08:41 that hierarchical structure out there. The obvious example, of course, is all of that machinery, that aspect of our brains, that is devoted to providing an apt generative model for interpersonal interactions. So one realizes that most of our lived world, or at least the sensations generated by that world, are generated by other creatures like me, namely you. Whether we're driving around in cars, walking in parks, talking on Zoom, reading books,
Starting point is 02:09:26 99.9% of our sensorium is generated by another human being that is like me. So that actually says a lot of the brain has to be devoted to modeling me and people like me, and making inferences about me. And as soon as you start to think about what kinds of generative models would be fit for purpose in that context, then the imperatives to resolve uncertainty are basically the drives for me to understand you. So this epistemic part of free energy minimization, as a consequence of action, translates into an imperative to understand the world. But the lived world now is basically you, in this instance. I need to understand you. How do I do that? We have to have a shared narrative. What does that require? Language. It could be maths, it could be English. But that narrative has many, many different scales to it; it has itself a deep temporal structure.
Starting point is 02:10:22 So there's the concept that we're conveying; there's the temporal scale of the duration of this exchange. There are recursively much finer temporal structures in terms of the structure of the sentence, the phrases that I'm using, the words, the phonemes, all the way down to the millisecond-by-millisecond activation of your sensory epithelia in your ears, or my neuromuscular junctions controlling my articulatory apparatus. So we wouldn't be able to talk to each other, or comply with this imperative
Starting point is 02:11:01 to resolve uncertainty, to explore the world that I have to model, without language; and without a deep generative model with a deep hierarchical structure, there can be no language. So there's a natural, if you like, not pressure, but certainly an easy way to understand why we have deep generative models in our brain. In fact, we do.
Starting point is 02:11:22 I mean, nearly all the interesting architectural aspects of brain connectivity speak to a hierarchical organization at some level. Whether it's cortical, subcortical: the very word "sub" means below, and "below" is only an attribute of a hierarchy; it can't be anything else. The visual hierarchy is the poster child for very well-defined subsumption hierarchies, but beyond that, wherever you look in the brain there is some hierarchical organization, where slow stuff is entraining fast stuff below, all the way down to the sensory inputs and the actuators, or the active outputs, which are the fastest parts and the elemental parts.
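The "slow stuff entraining fast stuff" picture can be sketched with two coupled variables whose time constants are separated by two orders of magnitude; all the constants here are arbitrary illustrative choices, not parameters from any actual brain model.

```python
import numpy as np

# A slow "contextual" state entrains a fast state: a minimal cartoon of
# slow-entrains-fast hierarchical dynamics.
dt, steps = 0.01, 5000
slow, fast = 1.0, 0.0
slow_tau, fast_tau = 10.0, 0.1   # the slow level is 100x slower

slow_trace, fast_trace = [], []
t = 0.0
for _ in range(steps):
    t += dt
    slow += dt * (np.sin(0.1 * t) - slow) / slow_tau  # slow drift toward a slowly varying input
    fast += dt * (slow - fast) / fast_tau             # fast level pulled along by the slow level
    slow_trace.append(slow)
    fast_trace.append(fast)

# After a brief transient, the fast state hugs the slow one.
tracking_error = float(np.mean((np.array(fast_trace[500:]) - np.array(slow_trace[500:])) ** 2))
```

The fast variable follows the slow one with only a small lag, so the slow level effectively sets the context within which the fast level unfolds, which is the sense of "entraining" used here.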
Starting point is 02:12:11 Okay, so hierarchy is another way of saying a difference that can be compared? Say that again. A difference that can be compared. So these two are different, but there's no hierarchy between them; it's like someone may say apples and oranges. But if it's between one apple and two apples, then there's a hierarchy there, because there's a quantity that we can reduce down and use to put a comparison between them. Oh, I see. Yeah, no, absolutely. So the way I'm using hierarchy here is certainly, sort of... so when people talk about, say, deep learning, what they're talking about is
Starting point is 02:12:46 inference and classification under a hierarchical generative model that has a number of hidden layers. So the depth of the learning machine refers to how many layers you build. And so what you do is you put the data in, and then you try and explain it with one layer. And then those explanations are themselves explained by a layer on top of that. And then you keep on going until ultimately you get to some very, very coarse-grained, very abstract explanations that are predicting, if you like, the layer below; and then they're unpacked to predict, hierarchically, right down to the level of, say, pixel elements in a TV screen or some image that's been grabbed. So the hierarchical depth is basically how many subordinate layers you have. I see. There's one higher level getting information from the lower level, but also providing constraints, saying, well,
Starting point is 02:13:47 if this context is in play, then I expect this kind of thing to happen over here in that modality, and that kind of thing to happen over there in that modality. So it brings a simplification and a better way of modeling, provided that you've got this hierarchical structure. What you were talking about, I think, is more the modularity, the sort of, is this an apple or is this an orange? I think that within the hierarchy, within the depth, the lower levels are usually at the level of the sensors and the actuators, and the higher level is usually the top of a pyramid.
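A minimal cartoon of that layered unpacking, with made-up sizes and random weights standing in for a learned model: an abstract top-level cause is elaborated through subordinate layers until it predicts the data.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two subordinate layers unpack an abstract cause into "sensory" data.
top_cause = rng.normal(size=3)          # coarse, abstract explanation
W2 = rng.normal(size=(10, 3))           # unpacks causes into mid-level features
W1 = rng.normal(size=(100, 10))         # unpacks features into pixel-level data

mid_prediction = W2 @ top_cause         # the higher layer predicts the layer below
pixel_prediction = W1 @ mid_prediction  # ...which in turn predicts the data itself

# Hierarchical depth = the number of such subordinate layers an
# explanation passes through before it reaches the data.
depth = 2
```

In a real deep model the weights are learned and the mappings are non-linear, but the top-down cascade, abstract causes unpacked into ever finer predictions, is the structure being described.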
Starting point is 02:14:23 Of course, there isn't literally a pyramid in the brain. You can imagine it more like a circle, the centre being the deeper parts of it. But there's another architectural structure, which is not in the depth of the hierarchy, from the bottom of the circle, say, to the centre, but actually in the different streams you were talking about, the apples and
Starting point is 02:14:45 oranges. And it's that factorization, that what and where, that I was hinting at before, which is a lovely example. Things I know about oranges are not relevant for things I know about apples. I mean, that's probably a silly example, because there are things they have in common; but things I know about fruit are not going to be terribly useful for things I know about tools. So I can actually have my generative model generating a cascade of abstract representations, becoming more and more detailed and committed, right down to a picture of, or a sound of, or me moving, this particular artifact. But of course this artifact can be either a tool or a face, or an apple, a fruit. And because you've got
Starting point is 02:15:34 these conditional dependencies, tools don't behave like fruit. And in particular, I don't actively engage with tools in the same way that I engage with fruit: I eat fruit, but I use tools. So it's likely that you get this separation through this factorization, or modular parallel architecture, in the setting of a hierarchical composition. So these are really important architectural principles if you have to actually build or understand the brain,
Starting point is 02:16:08 but also if you wanted to build a conscious artifact or an internal artifact, you'd have to equip it with both a deep generative model and these sorts of parallel streams, which brings us back to the right brain
Starting point is 02:16:23 versus left brain. I still don't have a neat answer for you, I'm afraid. You're right, language lateralization is a classic and conserved aspect of our brains, but I can't think of a first-principles account as to why that might be the case. You mentioned the word narrative,
Starting point is 02:16:43 that we have to have a shared narrative. Now, word narrative, that we have to have a shared narrative. Now, were you saying that we have to have a shared narrative in order to have peace between us or in order to interact or in order to understand one another? And then also, does that narrative have to be encapsulated in language or can it be embodied?
Starting point is 02:16:57 Because I assume that animals can get along and they don't have language, at least not the way we do. And presumably we didn't have language prior to a million years ago, yet we got along. Yeah. Well, I meant it in exactly the sense that you intimated there. I'm now using "narrative" to describe a generative model that entails sequences of things that happen over time. So it's a model of sequences that usually has a deep temporal structure to it.
Starting point is 02:17:59 So if I want to understand you, and I want to communicate with you in order to understand you and to ask you questions, to understand more about your intentional stance, your knowledge, then I'm going to need to infer what you're thinking. And I can do that if I know what I'm thinking, but only under the assumption that you're using the same kind of model as me. So we're both using the same code, or singing from the same hymn sheet. So when you start to model this, well, we have models of linguistic exchange, which are cast in terms of simple games of 20 questions. So one agent has to ask another agent through linguistic exchange, but it does depend upon them both committing to the same generative model of this linguistic exchange and its meaning. A simpler set of simulations arises when you think about just birds singing to each other so they can recognize
Starting point is 02:18:46 conspecifics. That's much easier to simulate: you just have two dynamical systems talking to each other, and they both become entrained in the synchronization of chaos that we were talking about before in a different context, but here in the service of mutual predictability. So another way of thinking about the best way to minimize surprise, or self-information, when exchanging with another, is to make sure that I only exchange with other people exactly like me, because I can predict exactly what you're going to do next, because I know exactly what I'm going to do next, and I'm doing it. So we could be singing or talking together. Clearly we take turns, but there are situations where we could actually be singing in the choir together.
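The "synchronization of chaos" mentioned here has a classic minimal demonstration, the Pecora-Carroll construction (a textbook illustration, not a model of birdsong): two copies of the chaotic Lorenz system, one driven by the other's x-coordinate, converge onto the same trajectory and so become mutually predictable.

```python
import numpy as np

def lorenz_step(state, dt, drive_x=None):
    # One Euler step of the Lorenz system. If drive_x is given, the
    # x-coordinate is replaced by the driving signal (Pecora-Carroll
    # coupling): the response system is "listening" to the driver.
    x, y, z = state
    if drive_x is not None:
        x = drive_x
    dx = 10.0 * (y - x)
    dy = x * (28.0 - z) - y
    dz = x * y - (8.0 / 3.0) * z
    return np.array([x + dt * dx, y + dt * dy, z + dt * dz])

dt = 0.005
driver = np.array([1.0, 1.0, 1.0])
response = np.array([-3.0, 5.0, 20.0])  # starts far from the driver
for _ in range(20000):
    # The response hears the driver's current x, then both step forward.
    response = lorenz_step(response, dt, drive_x=driver[0])
    driver = lorenz_step(driver, dt)

# After the transient, the response's y and z lock onto the driver's:
sync_error = float(np.abs(driver[1:] - response[1:]).max())
```

Despite both systems being chaotic, and despite the response starting far away, the shared signal entrains them onto the same orbit, which is the dynamical cartoon of two agents becoming mutually predictable by sharing a narrative.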
Starting point is 02:19:32 But I can only sing with people that have a sufficiently similar generative model to me. That licenses the use of my model, of how this kind of creature interacts with the world, as a model of how you are interacting with the world and interacting with me. So that's what I meant by shared narrative: just a shared, or conserved, generative model between creatures that typically act together, in terms of things like joint attention, or familial bonds, or conspecifics in an evolutionary context.
Starting point is 02:20:15 You know, that generalizes in terms of cultural takes on niche construction: the existence of things like signs and traffic lights, and elephant paths and desire paths. These are all manifestations of living in, and sharing, a world with other creatures like myself. After a while, they become very efficient, very simple ways of communicating that possibly, as you say, don't involve spoken language, and non-verbal communication.
Starting point is 02:20:49 It's clearly an incredibly important part of that. But even beyond that, just the fact that we have a shared commitment to stopping when a traffic signal or light goes red: that's a shared narrative we have that enables us to drive around and occupy the same streets, when some of us are driving and some of us are walking, or both of us are driving. So that's what I meant by shared narrative. It's a common generative model of how to live in this world with things like me. Interesting, interesting. Okay, now it can't simply be just
Starting point is 02:21:23 shared; there has to be some other criterion, only because you and I could still share the same model that says you're an enemy, and I'm supposed to kill you, and you're supposed to kill me. So then it's not just sharing that allows us to be peaceful cohabitants. And I was wondering, do you think religion is an attempt to make a hypothesis, or to generate possibilities, as generative models that a large class of people could share, and so minimize the suffering of both the society and the individuals at the same time? Yeah, I haven't thought deeply about this, but that's a very plausible explanation for why religious narratives have emerged and are so successful in maintaining themselves. So just in terms of a simple analysis of the minimizing-complexity arguments that we were rehearsing before: if you want to find a really simple explanation that accommodates a lot of difficult-to-explain stuff very accurately, then a deity that can cause all this stuff is a really simple explanation. And if it accurately explains your sensorium and your world, then it's a beautiful example of a very
Starting point is 02:22:38 parsimonious hypothesis. Now, it may not be sufficiently accurate for a scientist who does not accept a religious explanation for this or that. But if you are not sampling that kind of sensory information, or that kind of scientific data, then that doesn't matter, because you only need to explain what you need to explain. So, just as a broad comment upon religious beliefs, and I think you can generalize that to societal norms, just ways of behaving, the right way to behave: these are simple hypotheses that explain a lot of my behaviour and a lot of your behaviour in a really parsimonious way. And therefore they have high evidence, or low free energy, because they provide simple explanations. But you're bringing something else to the table, which is an interesting one, which I haven't thought about, which is that of course these kinds of hypotheses are easy to share as
Starting point is 02:23:45 well, in virtue of their simplicity. And there's always going to be a benefit in having a shared or a conserved or a common generative model, provided, as you say, that we're all cooperating, that we're all acting as conspecifics. So, in answer to your question, I am sure that if you simulated multiple agents, all free energy minimizing, all trying to predict each other in the most efficient way possible, and then one of them had a simple hypothesis of a religious or ideological or theological sort that made sense of lots of things, and to which everybody could subscribe, you'd suddenly see the consolidation and the emergence of that aspect of the generative model, absolutely. And that would render everybody mutually predictable, so everybody would be
Starting point is 02:24:35 reducing their surprise. And that would mathematically be expressed as a collective decrease in free energy, or an increase in adaptive fitness, in the sense of increasing the marginal likelihood of finding that phenotype around if you simulate over multiple generations. Well, in that case, they would be trivially correct in saying that there's a deity, because you're the simulator. Yes, well, that's an interesting hypothesis. Interestingly, if you do philosophy, then you've got the brain-in-a-vat thought experiment, which indeed has exactly that as an alternative hypothesis, to confound the philosophy of realism versus skepticism. But coming to your interesting
Starting point is 02:25:26 point, that, you know, what happens if my generative model is that you're my enemy, that you're going to cause me surprises? I think that's an interesting one. The level at which we simulate these things really only addresses cooperation, and systems that have different components trying to find their place within a shared narrative. So what one would predict is that if there were other kinds of agents that were not like you, then you would certainly have to represent them in virtue of the fact that they're not like you; you wouldn't be able to communicate with them. So my prediction would be that, whether it's a theological or political or other kind of commitment,
Starting point is 02:26:14 if there's an in-group and an out-group for any one given individual, it is highly unlikely that there will be a shared language, or indeed a big transaction of communication, between those two groups. So there will be a Markov blanket, if you like, between the in-group and the out-group, the blues and the reds, wherever you go. An interesting observation, which is not mine, but which I've heard a number of people make, is that the only stable non-equilibrium for that in-group and out-group is a 50-50 split. So in the language of theoretical biology, the evolutionarily stable strategy for opponents is a 50-50 split, which is borne out time and time again. What do you mean, a 50-50 split of what, population? Of the number of people that would identify with one group versus another group.
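The 50-50 intuition can be sketched with a toy simulation. This is not from the conversation; it assumes a standard Hawk-Dove game, with illustrative values (resource V=1, fight cost C=2) chosen so the mixed evolutionarily stable strategy sits at V/C = 50%, and evolves the population share under replicator dynamics:

```python
# Hawk-Dove payoffs (illustrative: V=1, C=2, so the mixed ESS is V/C = 0.5).
V, C = 1.0, 2.0
PAYOFF = {
    ("hawk", "hawk"): (V - C) / 2,  # hawks split the resource but pay for fighting
    ("hawk", "dove"): V,            # hawk takes everything from a dove
    ("dove", "hawk"): 0.0,          # dove concedes to a hawk
    ("dove", "dove"): V / 2,        # doves share peacefully
}

def replicator(p, steps=20_000, dt=0.01):
    """Evolve the hawk share p under replicator dynamics:
    dp/dt = p * (fitness_hawk - mean_fitness)."""
    for _ in range(steps):
        f_hawk = p * PAYOFF[("hawk", "hawk")] + (1 - p) * PAYOFF[("hawk", "dove")]
        f_dove = p * PAYOFF[("dove", "hawk")] + (1 - p) * PAYOFF[("dove", "dove")]
        mean_f = p * f_hawk + (1 - p) * f_dove
        p += dt * p * (f_hawk - mean_f)
    return p

# Whatever the initial split, the population settles near 50-50.
for start in (0.1, 0.9):
    print(f"start {start:.1f} -> equilibrium hawk share {replicator(start):.3f}")
```

Both runs converge on 0.5: a mixed population that neither strategy can invade, which is the "stable non-equilibrium" flavour of the 50-50 split described above.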
Starting point is 02:27:12 That's interesting. So Brexit versus non-Brexit in the UK, or Trump versus Biden in the most recent American elections. You know, wherever you get something where there is contention, where you commit to one side or the other side, the only, if you like, contentions or dialectics that seem to survive are the ones where there's a roughly 50-50 split between you. So I thought... Okay. So in other words, the fact that we stably differ somewhat down the line on many
Starting point is 02:27:48 important issues is adaptive? For the collective as a whole, it's just maintaining a non-equilibrium steady state. I mean, adaptive in the deflationary sense: that is one way to maintain an extended non-equilibrium, where now we're talking about Markov blankets, and applying blankets to multiple agents. Yeah, there's one quote that I love, that I try to live by, and it's: only the shallowest of minds would think that in great controversy, one side is mere folly. I'm sure you've heard that before. I haven't heard that particular one, but I've heard something similar, which is... Okay, so do you believe in free will? And if so, how do you define it? So, free will. Yeah. I mean, you're asking, do I believe in it?
Starting point is 02:28:43 Yeah. There's certainly space for free will in the realisation of the free energy principle in sentient artefacts, at many levels. When you actually come to write down and simulate, or build, little toy agents, you very quickly realize that the most interesting, in fact the only interesting, behaviours that you can simulate arise when you write down the generative models as containing autonomous dynamics, usually of a chaotic sort. So the reason I use the word autonomous dynamics
Starting point is 02:29:26 is that, mathematically speaking, there's a sort of free will in the autonomy. Even in a deterministic setting, there is an unpredictability, given that the initial conditions cannot be determined. So in that sense, there has to be a mathematical kind of free will at play. The other sort of take on free will, I guess, comes back to what we were talking about before, about making
Starting point is 02:30:02 our own sensations, creating our own sensorium. So if you remember, from the point of view of minimizing prediction error, or surprise, there are two ways I can do that. I can change my mind so that my predictions are more like what I'm sensing, and that would be a minimization of prediction error through perception. But there's another way of minimizing the prediction error: I just change what I'm sampling, to make the sensations more like the predictions. So that's action in the service of minimizing surprise or prediction error. But what that means is that my actions are basically enslaved to fulfilling my predictions; they are in the service of fulfilling prophecies. So,
Starting point is 02:30:57 collectively, action and perception are a self-fulfilling prophecy. And in that sense, I think you can find free will. If we are creating our own worlds and our own sensory inputs, if we're constructing our sensorium, who else is doing it? So on that very simple, I won't say deflationary, but simple account of free will, I can't see how it could be any other way, really. There's something that I've been wanting to study for quite some time, and I've been making extensive notes on, which is self-fulfilling prophecies. I find that to be an extremely interesting area of research. It's like you have a model of the world, and most of the time what you want to do
Starting point is 02:31:45 is make sure that your model comports with reality. But then it's as if there are these blank spots in reality where, whatever you think, if you think there's a chair there, then a chair becomes. I know that that's an extreme example. But what I mean is that if you imagine that your wife loves you, then you're going to act in a manner that makes her love you more. And if you mistrust her, and so on and so on. It's self-fulfilling. So it's as if there are these lacunae in the world, where whatever you think is there becomes there. I wanted to know what your free energy principle had to say about self-fulfilling prophecies. I imagine quite a lot, but I'm not sure if we have enough time. Do you have any quick notes on
Starting point is 02:32:25 well i mean the way that you just articulated that is spot on from the point of view of the 300 Principles. So we often use the phrase, action is there to realize and fulfill the prophecies supplied by the brain. It is literally a way of creating your own sensorium in virtue of self-realizing and self-fulfilling prophecies. The question is, can you keep on doing that
Starting point is 02:32:54 in the face of a particular environment that may or may not allow you to do that? And in particular, if you do it, lots of other people are also trying to pursue and self-fulfill their purposes. Like, what are the limitations of it? Yeah, yeah. So, you know, I think that's absolutely right. And it also speaks to, you know,
Starting point is 02:33:16 seeing stuff which is not there. That brings us into the world of false inference, and delusions and hallucinations, and the kinds of self-fulfilling prophecies of a perceptual sort that people with things like autism or schizophrenia might be subject to. So, as you know, you can go too far. But on the other hand, there are many people who now consider perception basically as a sort of hallucination that is entrained by sensory input: that what we actually see is a product of the hypotheses about what
Starting point is 02:33:59 best explain what's going on and there are occasions when we can get those hypotheses wrong, and then we're then subject to illusions, for example, in psychophysics, or hallucinations, and indeed delusions, you know, if we've taken some psychedelic drugs, or that we have conditions such as, say, schizophrenia. So then is schizophrenia a pathology of free will? I don't think so, no. I think it's a pathology that can be understood as a false inference, but I don't think that there's any aspect that would enable you to sort of isolate free will as the locus of that false inference. There's certainly a lot of work suggesting that people with schizophrenia,
Starting point is 02:34:55 when they are acutely psychotic, may have difficulties inferring who did that. So assignment of agency. So in the sense that, you know, who willed that, was it you or me, there may be a slight confusion. So if I say something to myself, I may infer, actually, you said it, you put that thought into my head. So, you know, that's just, you know, a false attribution of agency. So in that sense, perhaps you could say that that is a form of disorganized free will. But certainly the people with schizophrenia that I have treated or worked with in the past,
Starting point is 02:35:37 I think they would have a sense of self and a sense of free will which would be indistinguishable from yours and mine. You know, I know that we have to wrap up at some point soon, and I wanted to bring this up; I didn't know how. But I'll just tell you quickly. A few months ago, maybe I'll even take this out of the actual clip, but a few months ago, January, February, I woke up in the middle of the night. And then I had a conversation with my wife; it was minor and neutral, wasn't positive or negative. And then, as I was going back to sleep, she either said yes or okay, or she could have not said it. But I was in this hypnagogic, almost sleep-like state. And I've never had a panic attack in my life. But for some reason, I wasn't sure if she said okay or yes or whatever it was, or if it was in my own head. And then I thought, am I psychotic?
Starting point is 02:36:20 Am I getting schizophrenia? I don't know why; it's not like that was a background anxiety of mine before. Maybe it was, maybe it's latent. But then I started to have a panic attack, because I didn't want to hear anything, and I felt like I might be able to will myself to hear something by telling myself, don't hear. Then I'm like, well, what would it sound like if I were to hear? Then it's like, no, no, don't go down there, because you may hear a voice. And I don't want to go. I don't want to be crazy. I don't want to be psychotic. A couple of days later, I had another panic attack, almost based on the worry that I'm going to be psychotic rather than that I am at all. And I've been afraid to search about schizophrenia, because I don't want to read that I have the symptoms of it. And there's
Starting point is 02:37:06 nothing else other than that I may have heard my wife say okay or yes in a state where I was about to sleep and that she actually may have in fact said it or not. I talked to my doctor about this on the phone because it's COVID. And she said, oh, Kurt, well, you can hear many things. You can hear music when you're about to sleep. Your foot can feel like you're about to fall off. So don't worry about that. And I found that when I talk about it, I feel much more at ease. But ever since then, for three months, Carl, for three months, at nighttime, I've had such a difficult time falling asleep because I'm afraid of my own mind.
Starting point is 02:37:41 I can't let myself be alone with my own thoughts because I'm afraid of what I might find. And it's created such anxiety in me that I, well, I can't rely on any benzodiazepines to sleep because that can just put you in a, that'll create way worse problems. So I have to find some way of sleeping without any medication besides maybe melatonin. That's difficult. And I'm just not sure what to do with that issue. I brought this up on one of my podcasts because I feel like there are plenty of people who may be experiencing what I'm experiencing, but people don't talk about it. It turns out when I did talk about it, many people in the comment section said, I can't believe you've been going through this.
Starting point is 02:38:19 I've been going through something similar. When I study about consciousness and so on for this, for these types of interviews, I feel like I don't know what reality is anymore because there's so many different theories. And so that is destabilizing me on top of what I've already been feeling. And I'm unsure what to do about it. And I just want to know what what advice do you have? I know that that's extremely personal question, but what what do you recommend that I do or not do? Well, ironically, you've just done it. So now I'm responding in my role as a psychiatrist, not as the author of the free energy principle,
Starting point is 02:38:58 although much of the free energy principle was actually inspired by exactly the questions that you're contending with. free energy principle was actually inspired by exactly the questions that you're contending with you know uh how do i maintain a sense of self um which is the most important part of my narratives uh explain explain me and what can go wrong when you have um psychosis i think the first thing to say is that there there have been an enormous number of people around the world during lockdown and during the coronavirus crisis with reduced social contacts and changing points of reference so your old narratives your old models are just not fit for purpose and in a sense you've got to be able to grieve for the things what that you know the way that things were and it's quite frightening to actually relearn how to live even for a few
Starting point is 02:39:51 months, in a new COVID world, certainly if there have been changes in social distancing or the way that you interact with people. And I should say also, when and if you have to unlock, that will also be very anxiety-provoking, because of the uncertainty; you've got to relearn how to do it. So lots of my colleagues have had psychotic breaks. And, talking to you, I think it's highly unlikely that you have any form of psychosis, or indeed an acute psychotic episode, just by your composure, your theory of mind, and a lot of non-verbal cues. However, I do have many colleagues who've actually had an acute psychotic episode in response to this. I haven't, but a lot of my close colleagues have, and it's perfectly natural. They usually last about, you know,
Starting point is 02:40:46 sort of a few weeks to a month or two. May respond to medication if the person is becoming sufficiently agitated or they're worrying their close family. It needs containment pharmacologically. That's certainly viable if it gets that far usually it doesn't but the important thing to know is this is perfectly normal you know even if you haven't had a psychotic episode if you had had it's you know lots of people are going through this which brings me to the answer to your question You can certainly talk to your GP and talk to healthcare professionals
Starting point is 02:41:27 and mental healthcare professionals. And there will be the option of medication, either to help with sleep or to if you actually did have a psychotic break to take the edge off that and enable everybody to cope with the know with the consequences that's always an option should you want to go for it and you shouldn't feel ashamed of doing that or in any way hesitant I have to say that the nature of these episodes is such that it won't be the person who has the psychotic episode who knows that they they need containment and looking after of a particular kind it's usually the the next of kin, their children, their parents, or their wife or their husband, who has to take the initiative
Starting point is 02:42:11 to bring in support and help to get that done. And that help is there. The very fact you're worrying about it means that you're almost, by definition, can't be psychotic because you've got insight. So just by worrying about it, I'm afraid, takes you off that list. But it doesn't mean to say the anxiety doesn't go away. It doesn't mean you're not actually psychotic. The other thing which really helps is just, which you've already said,
Starting point is 02:42:39 it's basically sharing in the group. So the angst that these things generate, whether it's a psychotic break or not, or worrying about it or not, everybody's going through this. And being able to share that in a group can be enormously therapeutic for everybody in the group. So if you can set up, by sharing, a safe environment within which to share these experiences and anxieties, and by safe, I mean very clearly bounded. So it starts at a particular time and it finishes exactly one hour later. So you've got this boundary so that anything that comes out in that context can't spill over and affect and can be put back in the box. But you know there's going to be a box next week or tomorrow.
Starting point is 02:43:29 Very, very carefully timed. You know who's going to be there or you know what kind of people are going to be there. So boundaries are very, very important when it comes to group therapy. That, in my experience, that's the most effective kind of talking therapy in this situation how do you get a group together you have to say well this is a thing um how do you do that you just have to declare you have to come out say i'm upset by this i can't get to sleep because of this this
Starting point is 02:43:57 you know you know if i'd told myself these worries a year ago i thought i was stupid but these are real worries they're stopping me thinking i can you know, just saying that out loud means that other people now are aware that this is a thing and it's something that can be talked about. So when I said you've just done it by just going public in this kind of interview is the first move. The next move is then to find, you know, a network or a structure that will support either therapist-led or self-help groups, usually small enough that you actually have the intimacy of a small group dynamic. You know, so not more than 12 people, usually about eight people with or without a therapist you don't need the therapist but it's useful um you know if somebody is just to get used to the importance of boundaries and you know um and group dynamics but it's not necessary if you know the important thing is you're getting together
Starting point is 02:44:56 in a bounded way um to to you know to to share and um and to just in a this in a completely open and nonjudgmental way, tell other people how what they are saying makes you feel and just use them to reflect literally like a mirror. So you come back to the shared narrative again. You can get a long way with that. So that's what I would recommend. Somebody in your position may well feel that perhaps you should start an initiative along these lines or actually start.
Starting point is 02:45:32 You should take the initiative and actually set up a discussion group or a self-help group. But if not, then certainly look around on the web and I'm sure you'll find, you know, somebody over the past few months will have started this. And be mindful that you should tell your wife if you do get a psychotic, she's got to get the doctors in to get you medicated because that'll go away in about three weeks if you just take the right kind of medication. It's not a big thing. I repeat, it's happened to several of my colleagues and friends. They're back in the saddle now, and it's been a useful experience.
Starting point is 02:46:18 Sometimes people get more creative. They go slightly manic. I don't sense that in you. I do notice that when I was having my high anxiety periods right after the initial episode, that I saw many more connections. Almost like I imagine when you're on LSD or psychedelic, that everything is imbued with meaning. That every other sentence someone would say in a movie, I'm like, oh, that's so deep. That makes me want to cry almost.
Starting point is 02:46:46 Now that's not there. I still do see connections, but I tend to have always seen connections to some degree anyway. Yeah. So, I mean, these episodes will either just reveal innate sort of ways of thinking and seeing and making sense of the world, particularly in a pro-social context. But certainly, you know, what I was going to say is that there can be some, if you keep a record of these, if you can, if you just see associations and links and just keep a documentary record of it,
Starting point is 02:47:20 it may actually be something quite useful either to go back to as a, you know, an essay in the way that you saw the world in that particular state at that particular time but sometimes you actually get some quite profound insights you know again a number of my colleagues actually do suffer from manic depressive psychosis and of course if you look at the history of of your creative people in both you know the arts and the sciences you know you're you're on this edge of chaos the instability that is associated with um with psychosis um so it's not an unproductive time um you know if you are if you did get close to the edge then then make sure you keep a very
Starting point is 02:48:00 um if you can keep a very clear document document what happened to you and everything you can. This would be nice for you to go back to next year. Yeah, I definitely take notes on all my thoughts. I've done so for years anyway. I'm always constantly saying, I can't say now because it'll turn on, but OK, and then the name of my device. And then I say, make notes and so on and so on. Yeah, OK, well, thank you so much. I want to say doctor. Thank you so much you are a doctor obviously professor call me car thank
Starting point is 02:48:30 you so much Carl there was a statement that you said before many times actually it's that you are your own existence proof can you explain what that is what that means it is one of those deflationary summaries of the free energy principle that says the very fact you exist tells you a lot about the mechanics and the things that you must do or the properties that you must possess in order to exist so you turn that on its head, and the fact that you exist is proof of your own existence, again, in this tautological sense. You could also read it as one way of encapsulating the notion of self-evidencing. So if you read self-organization as a system trying to minimize its free energy or maximize the evidence for its models of the world, then what you can say is that these kinds of systems change in such a way to increase their evidence for their own existence. So it comes to very close to the
Starting point is 02:49:48 notion of this self-fulfilling prophecy that, you know, I expect to exist and I'm going to act on the world in a way that solicits or secures evidence for that existence, and if I have a good model of that world, then I will be successful in soliciting that evidence. And in so doing, I will persevere, and I will simply exist. And I look as if I'm self-evidencing. And of course, evidence for what? Well, evidence to prove that I exist. So it's a rather sort of cheeky way of celebrating the tautology of existence in the sense that just existing is really all you need to know. And you can spin off everything else, all the attributes of artifacts, particles, people, plants that exist just from their very existence. What happens if you start looking for evidence that you don't exist?
Starting point is 02:50:48 Like, let's say you're on the eastern end of the philosophical spectrum, and you say that the eye is illusory, that maybe this is all a dream, whatever that may be. What happens? Is there a system that can do that, where you look for evidence of your non-existence? Can you think yourself into non-existence? I know that likely not, but please run with that idea. Oh, well, it's a wonderful question. I mean, yes, I think you could certainly think yourself into non-existence. I mean, just a trivial thought experiment would be somebody who is a very committed meditator who forgets to drink or eat and dies through dehydration.
Starting point is 02:51:26 So, no, it is certainly possible. It's a wonderful question because it speaks to something in philosophy called the dark room problem. So it is certainly possible to simulate in silico, in computers, in silico, in computers, agents that deliberately do not believe that they exist and will avoid sampling data that would otherwise provide evidence for their existence. And what that translates into is essentially an artifact or a creature that turns off the lights it you know deliberately avoids any information from the world and sequesters itself from the world but in so doing of course um there are profound implications for its homeostasis uh and you know from a sensory point of view it hides away in dark corners and is now subject to the dark room
Starting point is 02:52:25 problem. But at the same time, it will not be able to maintain its Markov blanket, and ultimately it will dissipate. So if it doesn't comply with the existential principles that maintain its structure and form and its organization technically if it doesn't comply with the principles that underwrite the maintenance of a Markov blanket then that Markov blanket will cease to exist and so will you if you are that Markov blanket so you know it's practically a very very interesting question because it all rests upon having a good generative model of the world so you know there's a sort of dual dependency here that you know it looks as if creatures that or systems more generally that exist are seeking out evidence for their internal model of how the world works
Starting point is 02:53:20 and how the world generates its sensory samples. And that's only an apt explanation for self-organization if the generative model is a fair or a good enough explanation for the way that those sensory samples are generated. Technically, that's an idea which dates back to the inception of cybernetics in the early half of the 20th century known as a good regulator theorem that in order to regulate to control to survive in your environment you have to be effectively a model of that environment so you know whether you're a thermostat or or a person you know your structure and the way that you exchange with the environment under a model of that environment
Starting point is 02:54:11 only works when you are a good model of that environment. When you say generative model, what's the difference between a generative model and just a model? Well, generative model is just a technical term. Strictly speaking, it's just a specification of the probabilistic relationship between causes and consequences. So you literally write down a probability density or a probability distribution over causes and consequences, and that allows you then to evaluate the evidence in some data for that model. So in
Starting point is 02:54:48 this instance, the causes are unobservable, sometimes referred to as latent or hidden states, say, of a world, and the consequences are observable consequences, say sensory samples that we solicit with our sensory epithelial, our eyes, our ears, or indeed our interception. So the genetic model read like that is just a probabilistic specification of how hidden or latent states hidden behind, say, a Markov blanket generate observable consequences. And if you've got that model, then you can assess how likely those sensory samples, those sensations, those sensory impressions were under your model. And if they were very likely, you're not going to be surprised,
Starting point is 02:55:38 you'll have a low for the energy, and you know that you've got a good model of the world. If you're constantly surprised, unable to predict, then you've got a good model of the world. If you're constantly surprised, unable to predict, then you've got a bad model of the world. And remember that surprise can also be used as prediction error. So if you constantly subject yourself actively to lots of prediction errors, you're going to be constantly surprised in a state of high free energy. But think about what that means. These are not just prediction errors. In fact, they are not prediction errors of a sort of personal sort of, you know,
Starting point is 02:56:09 a propositional thing. I could say, oh, I didn't expect that. These are sort of fundamental deviations from your expectations, like your bodily temperature, the amount of blood sugar that your circulation is currently supplying. So when these get out of physiological bounds, you start to literally disintegrate and die, and your Markov blanket dissolves. So it's important to keep these prediction errors within bounds or in the language of the free energy principle, to keep on top of your surprise, self-information, or free energy.
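The passage above (surprise as the negative log-likelihood of your sensations under your generative model, and the two routes for reducing it, perception and action) can be sketched numerically. This is a toy illustration, not Friston's actual formalism: it assumes a single Gaussian prediction with an illustrative noise scale, and shows that either updating the prediction (perception) or changing what is sampled (action) drives surprisal down to the same floor:

```python
import math

SIGMA = 1.0  # assumed (illustrative) noise scale of the prediction

def surprisal(o, mu):
    """Negative log-likelihood of observation o under the prediction N(mu, SIGMA^2)."""
    return 0.5 * math.log(2 * math.pi * SIGMA**2) + (o - mu) ** 2 / (2 * SIGMA**2)

o, mu = 4.0, 0.0  # a surprising observation, given the current prediction
print(f"initial surprisal: {surprisal(o, mu):.3f}")

# Route 1 -- perception: change the prediction to match the sensation.
mu_p = mu
for _ in range(200):
    mu_p += 0.1 * (o - mu_p) / SIGMA**2   # gradient descent on surprisal w.r.t. mu

# Route 2 -- action: change what is sampled to match the prediction.
o_a = o
for _ in range(200):
    o_a -= 0.1 * (o_a - mu) / SIGMA**2    # gradient descent on surprisal w.r.t. o

print(f"after perception: {surprisal(o, mu_p):.3f}")
print(f"after action:     {surprisal(o_a, mu):.3f}")
```

Both routes end at the same residual surprisal (the irreducible 0.5·log(2πσ²) term); which route a creature takes is exactly the perception/action split described in the passage.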
Starting point is 02:56:45 All words for essentially the same thing. Okay. What do you make of Donald Hoffman's ideas on consciousness? I know that's broad. Right, yeah. I had a mentor once called Gerald Edelman. He was a Nobel laureate, famed for understanding the evolutionary basis of our immune systems.
Starting point is 02:57:09 He used to call me an intellectual thug because I didn't know anything outside my own sphere. So if you tell me what Donald says. Yeah, so forget about that. Forget about that. Because in order for me to say that, it would take too much time, and I probably wouldn't do its service. Okay, how about orchestrated objective reduction from Penrose? Roger Penrose, Stuart Hameroff, if you've heard of them. Yes, yes, yes, certainly.
Starting point is 02:57:36 So, yeah, I certainly read Roger Penrose's… Emperor's New Mind. Yes, The Emperor's New Mind. So, what... Sorry, can you repeat? What do you think about it? Does it comport with your theory?
Starting point is 02:57:53 Does it contradict it? Do you think it's too unfalsifiable? Do you like it? Do you think it's creative? What are your broad thoughts about it? I think it's entertaining. So I'm going to lump that in with quantum theories of consciousness, which is entertaining.
Starting point is 02:58:15 I don't see it as a serious contender that offers any explanatory power in terms of sentient behavior. I should say, of course, that my work with the free energy principle and things like active inference, these are not theories of consciousness. They can certainly be deployed to set some parameters on questions that arise in consciousness research, but in and of themselves, they're not theories or principles that would underwrite consciousness. Having said that, they do tell you where the mechanics, the Bayesian mechanics, the physics of sentience, lie in the physical world. And it is not at the quantum level.
Starting point is 02:59:02 So, you know, from the perspective of a theory of everything, there is a range that you can formalize mathematically under things like the renormalization group. You can have very, very small, very, very fast things that have very large amplitude random fluctuations, and you can have very, very big, very slow, very cool things, where the randomness averages away. So one can envisage this as a spectrum between quantum mechanics at the very fast, small end, and classical mechanics at the very, very slow, large end, of the sort that would describe the orbits of heavenly bodies. Which means that you ask, well, where do we
Starting point is 02:59:56 fit, we being biotic systems that show a particular kind of self-organisation that we would associate with life. And it's in that intermediate zone, where you have both the random fluctuations and what underwrites classical mechanics, this solenoidal, circular motion that doesn't actually change the surprise or the marginal likelihood, that's where we live. So we know it's not at the quantum level. We know that sentient behavior, and by implication sentience, cannot be found at the quantum level, in the same sense that it cannot be found in the motion of the heavenly bodies,
Starting point is 03:00:44 simply because they're too cool. They don't have the right kind of itinerancy, the right, if you like, adaptation. The challenges, the existential challenges that might destroy their Markov blankets, are simply not in play. There's effectively no interesting itinerancy of the kind that defines non-equilibrium steady state. So if I was responding, and I know I'm not, but if I was responding to a physicist asking that question, what I would be saying is that if we consider life to be self-organization to some form of
Starting point is 03:01:28 non-equilibrium, far-from-equilibrium steady state, then, stipulatively, I am saying that we have to account for these non-dissipative dynamics that are, mathematically, the solenoidal part. If they are dwarfed by the random fluctuations that you find in quantum physics, then we're no longer talking about non-equilibria. We're talking about equilibria. We're talking about solutions to the Schrödinger wave equation. So, you know, in conclusion, quantum neurobiology is entertaining, it's interesting, but it's just not the right place to create a mechanics of sentient behavior. And by that, I would guess that that precludes it from being an apt description
Starting point is 03:02:21 of conscious artifacts. Is one to understand the free energy principle as giving any indication as to what consciousness is? So for example, the self-evidencing, is that consciousness or is it when it's minimizing the free energy or is it the fact that it has a Markov blanket? Because when I was looking at the nodes,
Starting point is 03:02:38 you can define a Markov blanket around almost any point. So it doesn't seem like the fact that there exists a Markov blanket means it's conscious. So what are the conclusions one is to take from reading your free energy principle that someone can draw for consciousness? Yeah. Right. Yeah. I mean, this is again an excellent question, because it's one being posed around the world as we speak. You know, people like the John Templeton Foundation are earnestly looking at ways to put up, for example, the free energy principle
Starting point is 03:03:13 against integrated information theory, as formalisms that might speak to consciousness research. So it's a great question. The normal answer from people in my world would start off by saying that consciousness is probably a vague notion. And I mean vague in a philosophical sense: you know, at what point does a collection of sand grains become a pile of sand? So it admits the possibility that a thing may not be conscious. Yes, it certainly would, by definition, have to have a Markov blanket, you know, in our formalism. But that does not mean it is conscious. It will be self-evidencing, in the sense of having gradient flows that try to minimize self-information, but that's as far as it goes. So then you ask yourself, well, what kind of systems, first of all, look as if they are living?
Starting point is 03:04:11 And then as we move through this vague landscape, or vague dimension, towards very sophisticated particles like yourself and myself, what would take us from living to systems that had a minimal selfhood? And then even further, what would take us to systems, like philosophers, that puzzle over the fact that they have minimal selfhood? And I think the answer lies really in the generative model. So if you remember, we were talking about a generative model as being a probabilistic specification of causes and consequences that underwrites the free energy, or specifically the free energy gradients, that provide an explanation for our dynamics.
Starting point is 03:05:00 So it all boils down to the generative model. So first of all, what would it mean to be alive? And we have discussed this in terms of just moving, to have an itinerancy to actually act upon the world. So you need to see a particle moving around, and it would look as if it was moving around in the service of sampling evidence for its own existence and maintaining itself in these characteristic ranges, showing a sort of generalized homeostasis,
Starting point is 03:05:34 you know, of the kind of behavior a worm or a single-celled system might show. Does that have any sense of self or any sense of consciousness? Probably not. So you then take it up to, say, insects and small mammals. And at some point in this vague continuum, you come across the next milestone or marker on the way, which is the kind of generative model that would explain the consequences of action. So, to put it very, very simply, there are some creatures, or systems at least, like, say, microbes and indeed thermostats, that will respond reflexively, in a way that seems to keep themselves within a particular range. But crucially, they don't plan. So what would it mean to plan in a deliberative way,
Starting point is 03:06:42 basically selecting among a number of ways forward, a number of courses of action that will lead you to reduce uncertainty, minimise your free energy, or minimise your expected surprise as a consequence of that action? Well, you would require a generative model of what would happen if I did that. Now, because what would happen is in the future, that's a very sophisticated generative model. So now you're actually equipping this generative model with a temporal depth: it can reach into the future.
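A toy reduction of that idea, keeping only the "risk" part of expected free energy (real active-inference planning also scores expected information gain, which is omitted here): the agent has prior preferences over outcomes, each candidate action induces a predicted outcome distribution, and planning is just picking the action whose predicted outcomes are least surprising on average. The action names, outcomes, and probabilities below are all invented for the sketch.

```python
import math

# Prior preferences over outcomes: the states the agent characteristically
# occupies, expressed as a probability distribution (illustrative numbers).
preferences = {"warm": 0.9, "cold": 0.1}   # p(o) under the agent's self-model

# For each candidate action, a predicted distribution over outcomes.
predicted = {
    "stay_inside": {"warm": 0.8, "cold": 0.2},
    "go_outside":  {"warm": 0.2, "cold": 0.8},
}

def expected_surprise(action):
    """E_q[-log p(o)]: the average surprise of the outcomes this action
    is predicted to produce (the 'risk' term of expected free energy)."""
    q = predicted[action]
    return sum(q[o] * -math.log(preferences[o]) for o in q)

# Deliberative planning: choose the action with the lowest expected surprise.
best = min(predicted, key=expected_surprise)
print(best)  # the action whose predicted outcomes best match preferences
```

Here "stay_inside" wins because its predicted outcomes line up with the preferred (characteristic) states; planning with temporal depth is just this comparison, rolled out over sequences of actions rather than a single step.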
Starting point is 03:07:15 And if a creature or a system is endowed with a deep generative model that encompasses the consequences of action, it's now in a position to choose those actions that it thinks will either resolve uncertainty or supply the most evidence for its own existence or, put more simply, evince the characteristic sensory states
Starting point is 03:07:37 that define what it is. So then we're moving on now to creatures that plan. And if creatures that plan now have an extra bit to their generative models which recognizes that they are planning, that have this hypothesis that, hang on, maybe the simplest explanation for all these multimodal sensations is that I am a thing. And that it is me that is prosecuting these reflexes, or moving, or it is me that is looking over there. And now you've got a very sophisticated generative model, where you can certainly imagine that there is a minimal representation
Starting point is 03:08:24 of self, like self-modeling. So now we're moving closer to the notion of conscious artifacts. Would they be fully conscious? Would they have qualitative experiences? Would they have qualia? And then the next item is what would license a qualitative experience, you know, what it would be like to see red, or any other attribute of the sensorium. Usually the argument then is, well, that would require the kind of generative model that was able to essentially infer on its own inference, that would be able to enact a kind of mental or covert action in order to select, in the model itself, in the message passing in the brain, for example,
Starting point is 03:09:19 be able to select certain sources of information in the service of this hierarchical self-evidencing. If you remember, the hierarchy is like the layers of an onion, and the deeper you get into the onion, the higher you are in this hierarchy, with recurrent message passing between each of the layers. But if the deeper levels, the core of the onion, were now able to act upon the intermediate layers or levels of this hierarchical construct, then you might now suppose that this is a kind of mental action that licenses you to say: now this is an agent, and it's an agent that plans because it's acting. Psychologically, this would effectively be attention. So if you can plan what to attend to, or generate attentional selection of the way that you go and secure evidence from your world, then you might now start to develop a theory of
Starting point is 03:10:27 what it is like to actively perceive something. You know, I could go on at length, probably not very usefully, in terms of how you might cast that in terms of phenomenal transparency and opacity. In brief, what we're saying is that there are certain generative models, certain creatures, that can render their percepts experienced by rendering them opaque, by acting upon them, by having this capacity for attentional selection. So now you've got a very sophisticated generative model that not only has the temporal depth that enables the agent to plan; it's also truly an agent that may be self-aware, or at least aware of what it is perceiving, because it can now act upon the perceptual inference that's implicit in this self-evidencing. But we still haven't got to self-awareness yet. So these things might be apt to describe, say, mice. But we don't know whether mice are self-aware. They can probably attend to this or that, and then you're on this, if you like, spectrum of vague levels of different kinds of consciousness. They could be said to be conscious in the sense that they are sentient and have qualitative experiences, but are they self-aware?
Starting point is 03:11:59 And if you want to be self-aware, you've now got to, by definition, have part of your generative model, literally part of your brain, that stands in for and represents selfhood. And then we come back to this notion that the highest forms of life on earth would probably have realized and learned, and inferred, that they are entities, they are agents: that becomes me, and it's me that's doing this overt action and this covert action, this attentional selection. And the final step would be to go from this highest form of life to become a philosopher, when you start to have mental models of how it is that some people are self-aware, and so on ad infinitum in terms of the number of...
Starting point is 03:12:52 Is there an ad infinitum? What's the next step after philosophers? No, it's not. As soon as I said that, I realized that was stupid and you'd stop me. No, it's not. Okay. I think we've probably got as far as we can
Starting point is 03:13:05 in terms of levels of recursion. And I say that because there's a lovely notion, which I'm sure you're aware of. This is the notion of the meta-hard problem. No, I've never heard of that. I should have. No, you'll love this. I think it's something that was introduced
Starting point is 03:13:23 just in a presentation I saw by Andy Clark in about 2001, and then seriously taken up by David Chalmers about two to three years ago, in 2018. So this is not the hard problem of finding an explanation for sentience, you know, what it is like to perceive red. It's more: why do we find ourselves puzzling over it? Why do we think it's such a mystery? Yeah. So this is not the hard problem, but what sits next to the hard problem, which is: why is it that we find it so perplexing and difficult to explain? Why is the hard problem a problem for us?
Starting point is 03:14:11 And, you know, that's, to my mind, an extra level that you were talking about. So, you know, now we're trying to explain the emergence of philosophers. So we've now moved up to the next level. And it's a very interesting problem. And it tells you immediately a number of very interesting things. If I am puzzled about the fact that I am seeing red, or that I can see red,
Starting point is 03:14:40 that tells you immediately that your generative model has to have a model of a counterfactual where I can see red but not experience it. Now, that's quite remarkable, you know, to actually have that kind of generative model, where you can imagine yourself structured in a very, very different way. So I use the word counterfactual there deliberately, because it may be just because we are systems, artifacts, sentient artifacts, that have these very, very deep generative models that entertain counterfactuals. And Anil Seth calls this counterfactual depth. I spoke to Anil Seth for this program. He's a great guy.
Starting point is 03:15:26 Well, I hope he mentioned counterfactual depth. If he doesn't, you'll have to get him back. Yeah, yeah. No, I don't think he did. Or maybe I slipped up and don't remember. Okay, so continue. Sorry. Yeah.
Starting point is 03:15:37 So this notion of counterfactual depth, I think, is very important. I mean, it's absolutely requisite for planning, in the sense that planning is choosing the right course of action, which means you've got a series of options, all of which have these counterfactual outcomes. But of course, if you're given that kind of model, you're now in a position to hypothesize counterfactuals that could never occur, such as seeing red without seeing red, such as looking but not seeing, for example,
Starting point is 03:16:15 such as the thought experiments that philosophers like, like philosophical zombies. The very fact that there is this meme of a philosophical zombie tells you immediately that we are in discourse with creatures, namely philosophers, that have the capacity to represent in their generative models impossible scenarios and counterfactuals, of the kind that you would have to bring to the table to conceive of a philosophical zombie. So, to put it perhaps even more simply, the fact that people can do these kinds of thought experiments tells you a lot about the kinds of generative models they use to explain their world, and that is possibly a sufficient explanation for the meta-hard problem: just because we can.
Starting point is 03:17:24 The best part of the book, to me, was when she was talking about philosophical zombies. That is, for the people who aren't listening: how can there be robots that look like us that aren't conscious, and it produces an equivalent world? And she said something so interesting, which is that the reason the fact that we're talking about consciousness implies that we're conscious is because, in an equivalent zombie-like world, it's hard to imagine why they would come up with that question. So that's what separates us from them. Yeah. That sounds exactly like how Andy Clark would phrase it. The big thing is why we're so puzzled about consciousness and sentience. So that sounds very, very close to the core question that underwrites the meta-hard problem, or the meta-problem. Yeah. Anna Lukomsky asks. She said, that's lovely to hear, when I told her that I was speaking to you.
Starting point is 03:18:16 She says, fantastic news about Karl Friston. I'm curious to know if he's an idealist or a physicalist. Now, I'm biased towards idealism. I'm speaking in terms of her, she's a psychiatrist, by the way, or a psychologist. And she says, well, first of all, are you an idealist or a physicalist? I'm very happy to bat for both sides, depending on who I'm talking to. Okay, so you're uncommitted right now. No, I mean, I'd be happy to commit to either. So I see this through the lens of the people I know in philosophy. So on the one hand, my friend Andy Clark would be somebody who certainly would not adopt a radical idealist position, whereas Jakob Hohwy would celebrate the Markov blanket
Starting point is 03:19:09 that separates you from the external world, and therefore the external world can never be known. It's only ever accessed vicariously via the Markov blanket, and he might then take a more sceptical position. I think the maths of the free energy principle really forces you to bat for both sides. On the one hand, it is absolutely true that our making sense of the world
Starting point is 03:19:41 through the sensory veil supplied by the Markov blanket precludes anything out there ever being known. That is certainly true. So it's certainly consistent with brain-in-a-vat-like thought experiments. There will be no way that you could verify or not what's actually going on out there. Everything is this sort of delusion or hallucination that is entrained and grounded by sensory evidence, but you never know the causes of that sensory evidence. That's the whole premise of inference, you know, active inference encapsulating the take on self-organization as a process of inference. So that would be one side of the fence. The problem is that if you commit to that,
Starting point is 03:20:35 it's very difficult to actually simulate or engineer sentience because in order to do that, you've actually got to generate the sensations, the sensory states of the Markov blanket that the agent can infer. And of course, when I talk about or refer to the good regulator theorem, I'm implicitly assuming that there is a world out there to be regulated. I'm assuming that the quality of a generative model, as measured by its free energy, is in relation to what is being modelled. So I have to commit to some realism in the sense that I'm saying there is a model.
Starting point is 03:21:18 So it's very difficult for me to answer that, because on the one hand, if it's a model of something, then the something must be real. On the other hand, the process of inference under a particular model means I will never know whether there's anything real or not. So, you know, it sort of accommodates both points of view and resolves neither. I guess the psychological reason you're saying... sorry, there's not a psychological reason that you're saying you're both a physicalist and a realist, or neither. It's not because you don't want to offend anyone. It's actually because your model accounts for both. Well, one is ontological, the other is epistemic. So one is, we can't have knowledge of it; but at the same time, you're assuming that there exists a world that's independent.
Starting point is 03:22:09 Yes, that's a beautiful way of putting it. But I also don't want to offend anybody as well. But yes. You're a lovely chap. You want to keep your charming nature. Okay, so this person named Matthew Nahemi says: does Karl Friston know Alfred Korzybski, and what does he think about his theories? That doesn't ring a vague bell, but you're going to have to tell me. Yeah, then forget it, because I actually don't know this person, and I have been told that I need to learn more about this person. Dr. J.T. Velikovsky of the University
Starting point is 03:22:36 of Newcastle from Australia, who has published some work on Holon-Parton theory, the structure of the meme, memes as in Richard Dawkins' memes. He asks, do you expect, like David Bohm suggested, that the scale size of entities in the universe,
Starting point is 03:22:57 e.g. matter, energy, go infinitely smaller, e.g. smaller than quarks, and also infinitely larger, e.g. to multiple universes? Yes, I mean, I think that would be a sort of mathematical take on the application, if one wanted to do that, of the apparatus of the renormalization group when trying to understand the separation of temporal scales between different levels of organization, which is an important part of the free energy formalism when it comes to thinking about Markov blankets of Markov blankets: say, organs that comprise cells, where each cell comprises organelles that themselves comprise molecules,
Starting point is 03:23:44 and so on and so forth. And then the other way: organs constituting phenotypes, phenotypes constituting populations, conspecifics constituting in-groups and out-groups, constituting species, and so on, right up to a sort of cosmological scale. So, you know, I think mathematically that's a very comfortable position to adopt: that there is no lower or upper bound on the application of the apparatus of the renormalization group when you move from one scale to the next. Which you can do very gracefully, certainly when it comes to modeling: for example, applying the free energy principle to single cells in a network of neurons, and then coarse-graining that
Starting point is 03:24:36 to try to model parts of the brain as self-evidencing through physical motion, through to coarse-graining that and then trying to model populations talking to each other, or interacting through the Markov blankets that define an organization, or a family, or a set of conspecifics, in relation to, say, the environment or the econiche. So these are just examples of lifting exactly the same dynamics, the gradient flows that underwrite the free energy principle, from one scale to the next, and in so doing working out what the states are: when you say the states of affairs, or the sensory states, or the internal states, what are they? And when you drill down on that, what you can do is
Starting point is 03:25:32 use the renormalization group formalism: in moving from one scale to the next, you basically take mixtures of blanket states, and then those mixtures of blanket states, technically the eigenfunctions of the blanket states, now become the states of the next level. And then you look at the conditional dependencies of the states at the next level, then you find the Markov blanket, then you take the eigenfunctions of those states, and then they become the states of the next level. At each level of coarse-graining, you get bigger and slower. So before, we were talking about this spectrum from quantum to classical; that inherits simply from this successive coarse-graining.
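That recipe can be caricatured in a few lines of simulation. This is only an illustrative sketch of the "mixtures of states become the slower states of the next level" idea, not the renormalization group formalism itself: four noisy channels share one slow fluctuation, and the leading eigenvector of their covariance (found by power iteration) defines a coarse-grained state that fluctuates more slowly, i.e. has higher lag-one autocorrelation, than any single channel.

```python
import math, random

random.seed(0)

# Four fast, noisy 'blanket' channels sharing one slow mode (all invented).
T, n = 2000, 4
slow, data = 0.0, []
for _ in range(T):
    slow = 0.99 * slow + 0.1 * random.gauss(0, 1)               # slow shared mode
    data.append([slow + random.gauss(0, 1) for _ in range(n)])  # plus fast noise

# Sample covariance of the channels.
mean = [sum(row[i] for row in data) / T for i in range(n)]
cov = [[sum((row[i] - mean[i]) * (row[j] - mean[j]) for row in data) / T
        for j in range(n)] for i in range(n)]

# Leading eigenvector by power iteration: the mixture of states
# that becomes a state at the next level up.
v = [1.0] * n
for _ in range(200):
    w = [sum(cov[i][j] * v[j] for j in range(n)) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in w))
    v = [x / norm for x in w]

def lag1_autocorr(x):
    """Lag-one autocorrelation: closer to 1 means slower fluctuations."""
    m = sum(x) / len(x)
    num = sum((x[t] - m) * (x[t + 1] - m) for t in range(len(x) - 1))
    return num / sum((xt - m) ** 2 for xt in x)

mode = [sum(v[i] * row[i] for i in range(n)) for row in data]   # coarse state
channel = [row[0] for row in data]                              # a fine state
print(lag1_autocorr(mode), lag1_autocorr(channel))  # mode is the slower one
```

The fast, channel-specific noise averages away in the mixture, which is the toy analogue of discarding the fast internal states and keeping the slower blanket dynamics at each step of coarse-graining.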
Starting point is 03:26:21 But at each point, exactly the same free energy, or Lagrangian (the way that you would write down the dynamics and, on one reading, the energetics), is in play: the functional forms are exactly preserved. You're just swapping out microscopic states for mixtures of microscopic states, in particular the eigenfunctions, or mixtures of blanket states, to get to the next level. Noting that you're throwing away the internal states. Remember, the internal states are independent of the external states, given the Markov blanket, and vice versa, which means there's nothing in the internal states of any particular particle that is not, if you like, encoded in the blanket states, if you wanted to move to the next level. So at each level, you're throwing away the internal states, which can have very fast dynamics,
Starting point is 03:27:22 and you're picking out the slower fluctuations in the Markov blanket states themselves. So the states at the next level up now fluctuate much, much more slowly, and they have a larger spatial scale as well. So once one has that sort of recursive apparatus (you know, not a proof by induction, but in the same spirit as a proof by induction), where you move from one level to
Starting point is 03:27:54 the next level, once one has that apparatus in play, there is no need to ask when the process stops or where it began. And in a sense, you dissolve those questions, because you're only interested in one particular scale. If you're interested in how it relates to the physics of the cosmos, then you'll be going, let's say, six scales up towards classical mechanics. If you're interested in string theory and small particle accelerators, then you'll be going, say, six scales down, to very, very small things, towards the quantum level. Is six just a number for the sake of example, or are you specifying something in
Starting point is 03:28:47 particular? Well, yeah, it is a number I pulled out of the hat, but it's probably about that, when one looks at the time constants empirically associated with any particular scale, or any set of Markov blankets at different scales, and then one goes to the next scale. You know, this is not magic. It has been applied, for example, to brain imaging time series, where you look at every little cubic patch of the brain, a few millimetres in size, and then you say, well, that's the scale I'm going to start with. And I'll take the eigenmodes or the eigenfunctions
Starting point is 03:29:33 literally as the activity of each of these little chunks of brain. And then I've got, say, thousands of states. And then I'll look at the dependencies, and I'll work out a Markov blanket. And it could be a Markov blanket around a little chunk of brain the size of your thumbnail. And then you can take the eigenfunctions of the Markov blanket of that little chunk of brain, and then you put the chunks together, you find the Markov blanket at that level, and then you talk about lobes of the brain, and then you talk about the entire brain, and then you think about, well, lots of brains together
Starting point is 03:30:08 at a scientific conference. There's so much here; almost every four sentences that you speak, three ideas occur to me, and I just want to explore them all. So here's an example. We're not going to explore them, I'll just say them. When you mentioned that the internal states are independent of the external, and vice versa, that means there's almost an arbitrariness as to which one you call internal. Which means that you can, at least I'm surmising (I'm pretty much asking, but either way), imagine that the environment models you, in a sense. And then I wondered, well, what's the connection between that and the holographic principle in string theory, which says that the boundary encodes the information. Anyway, okay, that's just what occurs to me. And also multiple dissociative
Starting point is 03:30:55 personality disorder. I also want to know about that. But it's not about me. Joanna Dong has a question. She says, my question to Karl Friston: how do you explain the meaning of life using the free energy principle? That's an easy question. You can answer that in like 10 seconds. I've never been asked that before, the meaning of life. So I'm just going to use my favorite words, tautology and deflation. The meaning is in the existence. The shape of that existence defines the goals, the existential imperatives, to keep yourself in those existential domains. So it's just in the existence. It's just the shape of things that exist that defines their own meaning.
Starting point is 03:31:43 And, you know, in a sense, you could look at self-evidencing as finding meaning for your life, or understanding the model for which you're trying to accrue evidence. And what that basically means is, if you're reading that as a psychiatrist or a philosopher, you're basically trying to understand yourself and your own generative models.
Starting point is 03:32:18 So if you understand your own purpose, your purpose is just to aspire to these attracting sets that define you as the kind of thing you are. And then the meaning would, I think, rest upon some kind of self-model, where now you recognize you're this kind of person. So you can spin it off at a number of different levels. That wasn't very succinct; I'll think of a more succinct answer to that question. Psychologically, an attracting set would be what? Mathematically, that's fine, but for someone
Starting point is 03:32:48 who is just interested in how one lives their life, is this just a recapitulation of know thyself? Yes, or the things that you are attracted to, literally. The things that you see yourself being attracted to, the things that you see yourself aspiring and working towards. So what you find interesting and admirable, is that
Starting point is 03:33:11 what you mean? Yeah, yeah, yeah. That was at a very high level. But remember, this operates at all scales of a deep generative model, where the core is dealing with sort of more abstract things, such as being likable, being loved, being rich. But at the periphery, there's still a model of, basically, body temperature, body pose, not being in pain. So these are also things that make me happy, in a sense. These are attracting states.
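In this reading, "attracting" just means frequently occupied, and surprise is the negative log of that occupation probability. The state labels and probabilities below are made up purely to illustrate the bookkeeping.

```python
import math

# Invented occupation probabilities for a creature's states: attractive,
# characteristic states are the ones it is frequently found in.
p_state = {"hydrated": 0.70, "hungry": 0.25, "overheating": 0.05}

# Surprisal (self-information) of each state: -log p. Read as a loss
# (or negative reward), rare states are costly, characteristic ones cheap.
surprisal = {s: -math.log(p) for s, p in p_state.items()}

for s in sorted(surprisal, key=surprisal.get):
    print(s, round(surprisal[s], 2))   # least to most surprising
```

This is the same bookkeeping a behavioral psychologist would call reward, or a control engineer a negative loss function: the scoring changes its name, not its mathematics.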
Starting point is 03:33:44 It's attractive to be at a particular temperature. It's attractive not to sense pain. It's attractive to be hydrated to a particular extent. At some level, it is also attractive, you know, if you have very deep generative models that have this counterfactual depth, then there are certain counterfactual outcomes that you would find very surprising, that would be unattractive, and there'd be other outcomes which you would have learned: this is the kind of outcome that I aspire to, this is the kind of generative model I am. And at that point, then, I think it would be quite fair to start thinking about this in terms of, you know, I'm going to develop this reputation.
Starting point is 03:34:30 You know, if I say this, or if I donate this, I'm going to be this kind of person if I take these actions. So there are certainly, entailed in the prior beliefs that constitute the generative model, ways of behaving, ways of being, outcomes that you a priori expect to happen. And generally, these are going to be things that are consistent with your self-image, with your self-model.
Starting point is 03:35:02 And interestingly, they also require an optimism bias, because of this self-fulfilling prophecy of all our actions and decisions and choices and behavior: if they're all in the service of realizing our fantasy about the kind of thing that we are, then we're always behaving and selecting that information, and ignoring other kinds of information, in order to find evidence that indeed this is the kind of thing I am. So I'm going to ignore any evidence that runs counter to my self-image, that I'm not liked or not believed, or that I'm very poor or very hungry, unless I can do something about it. So I think it's a really interesting question, because it speaks to both the self-fulfilling
Starting point is 03:35:58 prophecy and this sort of optimistic aspect of active inference, which is not a fallacy. It is absolutely essential in keeping us on track and keeping us within these attractive states. So if you're a behavioral psychologist, this would just be some kind of reward. If you're a control engineer, it would be some kind of negative loss function. So you just score being in a given state in terms of the probability that you would find me in that state. And if it is unattractive and the kind of thing I would avoid, that simply is a statement that has a very low probability. That means that the negative logarithm is very, very high. It just means it's surprising. So it's just another way of saying that the attracting set of states or
Starting point is 03:36:54 the attractive states are those that I am not surprised to be in. They are comfortable, they're familiar. Given I am who I am, this is exactly the kind of state that I think I should, I should plausibly occupy. Again, almost all that you say, there's so many thoughts that occur. I'll say some of them now, not for you to explore, but maybe for people in the comment section to explore. I was thinking when you were talking about different goals that one has, and there's different levels of abstraction. So one is to regulate your body temperature at an extremely low level than at a high level, one might be to be a lovable person or to love someone else. Then I was wondering, well, though, you can go much, much lower to parts that you don't have control over. So it's, it's, you want
Starting point is 03:37:38 to have transcriptase in your cells so that you can have a replication of DNA, but you also want to have cell walls. It's not like you can control that, but that's at the extremely low level. And then at a high level, what I'm wondering is, is there a relationship between the highest ideal and then God? And then is there a relationship to being fragmented there and having what's not monotheistic? So that is polytheistic. And then if you have what's polytheistic and there's a contradiction between them, is that one of the reasons why in monotheism they say you cannot serve two gods and is that related to people who are well all of us are in some degree psychologically not fully integrated but is there a relationship between monotheism and being fully integrated as a person where you
Starting point is 03:38:20 have one goal that you're working toward and they're not contradicted at some higher step so you're living in a in a more congruent fashion you don't have to comment on that you could probably say that's all foolishness but people no no no do i can't resist because when you say i have lots of thoughts as well and so that that sort of can we act in order to make sure that our messenger rna working properly? Yes, you can actually, but you'd have to use this sort of scale, different scales of Markov blankets. But also there is actually a formalism that allows you to look at the way that, say, the central nervous system through neuroendocrinology and through sort of your neuroimmunological mechanisms does actually have an effect on gene expression
Starting point is 03:39:13 and this speaks to the sort of the you know the transactions both between Markov blankets but also if a Markov blanket at one scale exists, that necessarily implies existence at another scale. So a cell can only exist if the organ exists, but in the same spirit and in reverse, the organ only exists in virtue of the cells existing. And all of this has to maintain a free energy minimizing dynamic in order to keep both scales in play at the same time. So, you know, it's a really, I think that's a really interesting example. The God versus or polytheistic versus monotheistic models of the world.
Starting point is 03:39:54 I think that's, you know, a lovely description of sort of, you know, sources of cognitive dissonance, but possibly of a theological or existential sort that might underwrite or not ontological security, that if I do have these counterfactuals about different things of the kind of person that I am or the deities that I commit or subscribe to, the deities that I commit or subscribe to, then that's just another example of the opportunity to resolve uncertainty but in that sense there's also uncertainty there, there's this puzzlement that underwrites the meta hard problem that is it this kind of god or that kind of god, is it the father or the holy son. And by introducing that uncertainty, you may well expose yourself to cognitive dissonance and ambiguity and an existential angst at that level, which you will actively try to resolve. So that's this, again, this imperative for self-evidencing is acting in a way to minimize
Starting point is 03:41:05 expected surprise or uncertainty in the future so it may well be that certain um um clerics certain sorry what clerics clerics clerics yes or people people an ecclesiastical leaning, they may spend a lot of time trying to resolve that kind of uncertainty, that kind of cognitive dissonance. Okay, two quick questions. That's it. Faraz says, in the context of embodied cognition, and by the way, Faraz is someone who graduated from U of T right around here in neuroscience. And I forgot the other one. So he's going to comment in the comment section about what I forgot. In the context of embodied cognition, what is the relationship between perception and mathematics?
Starting point is 03:41:54 And furthermore, if the external world perceives our actions, can we extrapolate from that assertion, the idea that our actions matter to the world as well? from that assertion, the idea that our actions matter to the world as well. So that question you sort of hinted at before when you were... The duality between internal, external? Exactly, yeah. That sort of perfect symmetry, which I thought was, you know, that was insightful. And I think that insight sort of speaks to the second half of this question here. integrate these simulation of physics engines using the free energy principle, you get the kind of learning that you would normally expect to see in the
Starting point is 03:42:58 agents about the environment, but that is exactly mirrored by the environment learning about the denizens that you know it plays host to so i'm sure you know this but you know my favorite example from colleagues who really you know push these arguments as far as they they can go at the present time my favorite their favorite my favorite example is a notion of a desire path or an elephant path, which is the phenomenon. Say I'm going to get my coffee from the cafe on the other side of a green or a park, and I can either walk around on the pavement or I can take a shortcut across the grass. cross the grass and if i so do um i will uh eventually me and all my conspecifics will eventually wear a um a you know a dusty trail uh of grassless signs and cues uh from uh you know my office directly to the cafe so from the point of view of the environment this is basically the environment learning about the behavior of the kinds of agents that you're um that occupies that um that um
Starting point is 03:44:13 that environment so the environment is learning about the behavior by being eroded by being changed so it's literally it is remembering in structure, in its sub-personal structure, it is learning about the conspecifics that are good for occupying that environment. The conspecifics that are occupying that environment. What are the conspecifics? Oh, a conspecific, you and I are conspecifics because we're from the same species. You and I are conspecific because we're from the same species. So it's just a way of describing an ensemble of or a collection of phenotypes, agents that come from the same species. And the environment can learn this is the kind of phenotype that operates well in my environment,
Starting point is 03:45:06 whilst at the same time, the phenotypes are trying to change to operate better in this environment. So there's a circular causality. While the phenotypes or the conspecifics are trying to adapt to the environment, the environment is also adapting to those conspecifics. So it's a dance um that in fact can get quite high order in the sense now that that you and i will now see the path and realize oh that must have this path leads to to somewhere that it must be nice for things like me to go
Starting point is 03:45:40 so i can just infer seeing a desire path that it will lead to something i desire because i know that everybody else who has used that are conspecifics they like the same kinds of things they have the same attracting sets so now you know the environment has learned about the uh the phenotypes and has memorized it in its structure And now the phenotypes are now using that to infer the kinds of things that I should do. So now you get, it's like, you're using the environment to cooperate and to signal and to change and to provide information
Starting point is 03:46:16 for other things, other creatures like me. And you can take that right through to the emergence of roads, of traffic lights, of language. So you move these arguments into a cultural domain. And now you've, you know, so language, is that part of the environment or is it part of Popper's third world? Or is it, you know, is it part of your brain? I don't think you can tie it down like that because of the circular causality that you hinted at, in in fact more than hinted at
Starting point is 03:46:45 when you're when you recognize that a mark off blanket you know it's quite arbitrary what is internal and external this depends upon which point of view you're taking so i think it's a really interesting question which you can take in many many different directions you know i i i'm personally not um not taking it in those directions because I got distracted by other things, but I have lots of young colleagues around the world who are really trying to understand this relationship to evolutionary psychology, evo-devo, the relationship to niche construction, the relationship to what Andy Clark would have promoted, which is the designed environment. So one of the sort of, you know, key parts of the triple, you know, the four E's,
Starting point is 03:47:31 you know, so yes, it is embodied, but there's also, you know, an extension, it's situated. So the environment, not just the physical environment outside my body, but my body as environment as well starts to become you know an important part of this reciprocal loop and this sort of joint free energy minimizing mechanics that rests upon a reciprocal exchange between you know one side of a markoff blanket and the other side of the markoff blanket. I'll just close with the first part of the question,
Starting point is 03:48:05 which is, does that mean the environment perceives? I would say possibly not. It certainly would be the case, mathematically speaking, that the environment is learning, but we generally reserve the notion of perception to refer to perceptual inference or inference about states of the world, as opposed to the parameters of a generative model, which are more attributes of contingencies, laws and the like. So that we infer that the world is in this state or that state at this point in time, but we learn the contingencies and the lawful mappings, the likelihoods, the transition probabilities that underwrite the changes and transitions in the state, the flux of states in the world. So it's probably the case that the environment,
Starting point is 03:49:00 the physical environment, not the body environment, so you know things like roads and paths and buildings trees um they they learn but they probably don't infer and perceive in the sense of having qualitative experience of qualia but they do learn um you know they may update their parameters in a way to model what's on the other side of their mark of blanket which is you and me thank you sir you've given you've been far too generous with your time and i appreciate it immensely i think this conversation will be fruitful for quite a few people especially more on the philosophical end there's not many most of the interviews with you are more technical so i'm glad that we got a bit philosophical even though i love some of the technical questions i have maybe 30 30 more that i didn't get to but we'll save that for the next time well as last time it's been a real pleasure talking to you thank you i look forward to our next exchange thank you man appreciate it honestly i Thank you so much. And you've helped me psychologically as well, because, well, just you telling me that other people are going through what I'm going through, but sometimes even worse and sometimes better, but it's normal.
Starting point is 03:50:22 I'm not at a psychotic break, but that means it's possible. And if Cantor, studying infinity, could go through psychotic breaks, I wonder how much of what I'm doing is self-imposed from my own study of consciousness and investigating the universe. So it gives me a bit of anxiety, because it's like: okay, well, this is a path that may lead to that. But at the same time it gives me some calm, because you're saying it's okay. First of all, if it gets serious, it can be treated, don't worry about
Starting point is 03:50:46 that and and it doesn't seem like i give off any signs of being schizophrenic even though i'm concerned about it so yes ironically the very fact you're concerned means you can't be you can't be psychotic i i did cross my mind when you're talking about the monotheism versus polytheism and resolving that sort of cognitive dissonance and existential angst you get when you, you know, it could be this way or that way. I mean, the very ability to entertain these counterfactuals is very, very close to psychosis.
Starting point is 03:51:21 You know, the kind of angst you see in manic depressive psychosis and study um you know the kind of the kind of angst you see in manic depressive psychosis so it's a natural state for curious creatures so the anxiety you're going to go over the edge it's just the price you pay for being a curious creature um and as you say you'll know when you're going too far and you either get your wife to call him a doctor or you'll just sort of commit to one way of being or another way of being. You will know that. But what I'm saying is you shouldn't be frightened of this. You should celebrate this,
Starting point is 03:51:56 which is why I always encourage you to actually write down everything you think. Trust me, I write down all of it. Here's an extreme sort of loss of ontological security. What I don't do is rewrite, and that's my problem as well, because plenty of the writing is in the rewriting, and plenty of retaining it is in the rewriting as well. But anyway, I've got to work on that.
Starting point is 03:52:19 Thank you, sir. I feel like I've gotten along with you more than almost anyone that I've interviewed, and I've interviewed almost 50 or 50-plus people, so I don't know what to attribute to that, but it's probably your demeanor and there's a caringness that comes across in your voice, and I appreciate that, and a humbleness as well. Thank you. All right. That was very great, Sophia.
Starting point is 03:52:45 Thank you very much
