Lex Fridman Podcast - Regina Barzilay: Deep Learning for Cancer Diagnosis and Treatment

Episode Date: September 23, 2019

Regina Barzilay is a professor at MIT and a world-class researcher in natural language processing and applications of deep learning to chemistry and oncology, or the use of deep learning for early diagnosis, prevention and treatment of cancer. She has also been recognized for her teaching of several successful AI-related courses at MIT, including the popular Introduction to Machine Learning course. This conversation is part of the Artificial Intelligence podcast. If you would like to get more information about this podcast go to https://lexfridman.com/ai or connect with @lexfridman on Twitter, LinkedIn, Facebook, Medium, or YouTube where you can watch the video versions of these conversations. If you enjoy the podcast, please rate it 5 stars on iTunes or support it on Patreon.

Transcript
Starting point is 00:00:00 The following is a conversation with Regina Barzilay. She's a professor at MIT and a world-class researcher in natural language processing and applications of deep learning to chemistry and oncology, or the use of deep learning for early diagnosis, prevention, and treatment of cancer. She has also been recognized for her teaching of several successful AI-related courses at MIT,
Starting point is 00:00:24 including the popular Introduction to Machine Learning course. This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it 5 stars on iTunes, support it on Patreon, or simply connect with me on Twitter at Lex Fridman, spelled F-R-I-D-M-A-N. And now, here's my conversation with Regina Barzilay. In an interview you've mentioned that if there's one course you would take, it would be
Starting point is 00:01:09 a literature course that a friend of yours teaches. Just out of curiosity, because I couldn't find anything on it: are there books or ideas that had a profound impact on your life journey, books and ideas perhaps outside of computer science and the technical fields? I think because I'm spending a lot of my time at MIT, and previously in other institutions where I was a student, I have limited ability to interact with people, so a lot of what I know about the world actually comes from books. And there were quite a number of books that had a profound impact on me and how I view the world. Let me just give you one example of such a book. Maybe a year ago I read a book
Starting point is 00:01:56 called The Emperor of All Maladies. It's kind of a history of science book on how the treatments and drugs for cancer were developed. And that book, despite the fact that I am in the business of science, really opened my eyes to how imprecise and imperfect the discovery process is and how imperfect current solutions are. And what makes science succeed and be implemented, and sometimes it's actually not the strength of the idea, but the devotion of the person who wants to see it implemented. So this is one of the books that, you know, at least for the last year, quite changed the way I'm thinking about the scientific process,
Starting point is 00:02:42 just from the historical perspective. And what do I need to do to get my ideas really implemented? Let me give you an example of a book which is a fiction book, a book called Americanah. And this is a book about a young female student who comes from Africa to study in the United States. And it describes her path, you know, within her studies, and her life transformation in a new country and kind of adaptation to a new culture. And when I read this book, I saw myself in many different points of it, but it also kind of gave me a lens on different events. And some events
Starting point is 00:03:37 that I never actually paid attention to. One of the funny stories in this book is how she arrives at her new college and she starts speaking in English, and she has this beautiful British accent because that's how she was educated in her country. This is not my case. And then she notices that the person who talks to her, you know, talks to her in a very funny way, in a very slow way. And she's thinking that this woman is disabled, and she's also trying to kind of accommodate her. And then after a while, when she finishes her discussion with this officer from her college,
Starting point is 00:04:17 she sees how she interacts with other students, with American students, and she discovers that actually, she talked to her this way because she thought that she doesn't understand English. And I thought, wow, this is a funny experience. And literally within a few weeks, I went to LA to a conference and I asked somebody in an airport, you know, how to find like a cab or something. And then I noticed that this person is talking in a very strange way. And my first thought was that this person has some, you know,
Starting point is 00:04:50 pronunciation issues or something. And I'm trying to talk very slowly to him, and I was with another professor, Ernst Frankel. And he's, like, laughing because it's funny that I don't get that the guy is talking in this way because he thinks that I cannot speak. So it was really kind of a mirroring experience, and it led me to think a lot about my own experiences moving between different countries. So I think that books play a big role in my understanding of the world. On the science question, you mentioned that it made you discover that personalities of human beings are more important than perhaps ideas. Is that what I heard? It's not necessarily that they are more important than ideas, but I think that ideas on their
Starting point is 00:05:35 own are not sufficient. Many times, at least at the local horizon, it's the personalities and their devotion to their ideas that really changes the landscape locally. Now, if you're looking at AI, like, let's say, 30 years ago, you know, the dark ages of AI or whatever, the symbolic times, you can use any word, you know, there were some people, and now we're looking at a lot of that work,
Starting point is 00:06:03 and we're kind of thinking this was not really relevant work, but you can see that some people managed to take it and to make it so shiny and dominate the academic world and make it the standard. If you look at the area of natural language processing, it is a well-known fact that the reason statistics in NLP took such a long time to become mainstream is that there were quite a number of personalities who didn't believe in this idea, and they stopped research progress in this area. So I do not think that, you know, asymptotically personality matters, but I think, locally, it does make quite a bit of impact. And it generally, you know, speeds up the rate of adoption of new ideas. Yeah, and the other interesting question is, in the early days of a particular discipline, I think you mentioned
Starting point is 00:07:06 in that book. Is it ultimately a book about cancer? It's called The Emperor of All Maladies. Yeah, and those maladies, the medicine, what was it centered around? So it was actually centered on how people thought of curing cancer. Like, for me, it was really a discovery what the science of chemistry behind drug development was, that it actually grew out of the dyeing, like coloring, industry, that people who developed chemistry
Starting point is 00:07:39 in the 19th century in Germany and Britain to make, you know, really new dyes, they looked at the molecules and identified that they do certain things to cells. And from there, the process started. And, you know, the story is, yeah, this is fascinating, that they managed to make the connection and look under the microscope and do all this discovery. But as you continue reading about it, and you read about how chemotherapy drugs were actually developed in Boston, and some of them were developed by Farber, Dr. Farber from Dana-Farber, you know, how the experiments were done, that, you know,
Starting point is 00:08:18 there was some miscalculation, let's put it this way, and they tried it on the patients, and those were children with leukemia, and they died. And they tried another modification. You look at the process, how imperfect this process is. And, you know, if we're again looking back like sixty years ago, seventy years ago, you can kind of understand it, but some of the stories in this book which were really shocking to me were really happening, you know, maybe a decade ago. And we still don't have a vehicle to do it much faster and more effectively. And, you know, scientific, in the way I'm thinking of computer science as scientific.
Starting point is 00:08:55 So, from the perspective of computer science, you've gotten a chance to work on the application to cancer and to medicine in general. From the perspective of an engineer and a computer scientist, how far along are we from understanding the human body, biology, of being able to manipulate it in a way we can cure some of the maladies, some of the diseases? So this is a very interesting question. And if you're thinking as a computer scientist about this problem, I think one of the reasons that we succeeded in the areas where computer scientists succeeded is because we are not trying to understand, in some ways. Like, if you're thinking about e-commerce, Amazon doesn't really understand you, and yet it recommends you certain books or certain products, correct?
Starting point is 00:09:47 And traditionally, when people were thinking about marketing, they divided the population into different kinds of subgroups, identified the features of each subgroup, and came up with a strategy which is specific to that subgroup. If you're looking at recommendation systems, they're not claiming that they are understanding somebody; they're just managing, from the patterns of your behavior, to recommend you a product. Now, if you look at traditional biology, obviously, I wouldn't say that I am in any way, you know, educated in this field, but, you know, from what I see, there is really a lot of emphasis
Starting point is 00:10:26 on mechanistic understanding. And it was very surprising to me coming from computer science, how much emphasis is on this understanding. And given the complexity of the system, maybe the deterministic full understanding of this process is, you know, beyond our capacity. And the same way as in computer science, when we do recognition, when we do recommendation in many other areas,
Starting point is 00:10:50 it's just a probabilistic matching process. And in some way, maybe in certain cases, we shouldn't even attempt to understand. Or we can attempt to understand, but in parallel, we can actually do this kind of matching that would help us to find a cure or to do early diagnostics and so on. And I know that in these communities it's really important to understand, but I'm sometimes wondering what exactly it means to understand here.
Starting point is 00:11:20 Well, there's stuff that works, but that can be, like you said, separate from this deep human desire to uncover the mysteries of the universe, of science, of the way the body works, the way the mind works. It's the dream of symbolic AI, of being able to reduce human knowledge into logic and be able to play with that logic in a way that's very explainable and understandable for us humans. I mean, that's a beautiful dream. So I understand it, but it seems that what seems to work today, and we'll talk about it more, is to, as much as possible, reduce stuff into data, reduce whatever problem you're interested in to data, and try to apply statistical methods, apply machine learning to that.
Starting point is 00:12:06 On a personal note, you were diagnosed with breast cancer in 2014. What did facing your mortality make you think about? How did it change you? You know, this is a great question. And I think that I was interviewed many times, and nobody actually asked me this question. I think I was 43 at the time.
Starting point is 00:12:26 And it was the first time in my life I realized that I may die. And I never thought about it before. And there is a long time from when you are diagnosed until you actually know what you have and how severe your disease is. For me, it was like maybe two and a half months. And I didn't know where I was during this time, because I was getting different tests, and one would say it's bad and another would say no, it is not. So until I knew where I was, I really was thinking about all these different possible outcomes.
Starting point is 00:12:55 Were you imagining the worst, or were you trying to be optimistic? Really, I don't remember what my thinking was. It was really a mixture with many components at the time, speaking in our terms. One thing that I remember: every test comes and you think, oh, it could be this, it may not be this, and you're hopeful, and then you're desperate. It's like there is a whole slew of emotions that goes through you. But what I remember is that when I came back to MIT, I was kind of going the whole time through the treatment to MIT, but my brain was not really there. But when I really came back, I finished my treatment and I was here teaching and everything. You know, I looked back at what my group was doing, what other groups were doing,
Starting point is 00:13:45 and I saw these trivialities. It's like people are building their careers on improving some parser by 2-3 percent or whatever. I was like, seriously? I did work on how to decipher Ugaritic, like a language that nobody speaks, and whatever, like, what is the significance? When all of a sudden, you know, I walked out of MIT, which is, you know, where people really do care, you know, what happened to your ICLR paper, you know, what is your next publication to ACL, to the world where you see a lot of suffering that I'm kind of totally shielded from on a daily basis. And it's like the first time I've seen, like, real life and real suffering. And I was thinking, why are we trying to improve the parser?
Starting point is 00:14:30 Or deal with some trivialities when we have the capacity to really make a change? And it was really challenging for me because, on one hand, you know, I have my graduate students who really want to do their papers and their work. And they want to continue to do what they were doing, which was great. And then it was me who really kind of re-evaluated what is important. And also at that point, because I had to take some break, I looked back at my years, and I was thinking, you know, like 10 years ago, this was the biggest thing.
Starting point is 00:15:08 I don't know, topic models. We have millions of papers on topic models and variations of topic models, and it's all sort of, like, irrelevant. And you start looking at this, you know, what you perceive as important at different points in time and how it fades over time. And since we have a limited time, all of us have limited time, it's really important to prioritize things that really matter to you,
Starting point is 00:15:37 maybe matter to you at that particular point, but it's important to take some time and understand what matters to you, which may not necessarily be the same as what matters to the rest of your scientific community, and pursue that vision. So that moment, did it make you cognizant? You mentioned suffering, of just the general amount of suffering in the world. Is that what you're referring to? So, as opposed to topic models and specific
Starting point is 00:16:05 detailed problems in NLP, did you start to think about other people who have been diagnosed with cancer? Is that the way you started to see the world, perhaps? Oh, absolutely. Because, for instance, you know, there are parts of the treatment where you need to go to the hospital every day. And you see, you know, the community of people there, and many of them are much worse off than I was at the time. And you all of a sudden see it all. And people who are happy on some day just because they feel better. And for people who are in our normal reality,
Starting point is 00:16:45 you take it totally for granted that you feel well, that if you decide to go running, you can go running. And, you know, you're pretty much free to do whatever you want with your body. My community became those people. And I remember one of my friends, Dina Katabi, took me to the Prudential to buy me a gift for my birthday.
Starting point is 00:17:07 And it was like the first time in months that I went to kind of see other people. And I was like, wow, first of all, these people, they're happy and they're laughing. And they're very different from these other people of mine. And second of all, I think it's totally crazy. They're, like, laughing and wasting their money on some stupid gifts.
Starting point is 00:17:25 And they may die. They already may have cancer. And they don't understand it. So you can really see how the mind changes. And before that, you can ask, didn't you know that you're going to die? Of course I knew. But it was kind of a theoretical notion.
Starting point is 00:17:45 It wasn't something which was concrete. And at that point, when you really see it and see how little means sometimes the system has to help them, you really feel that we need to take a lot of the brilliance that we have here at MIT and translate it into something useful. Yeah, and useful could have a lot of definitions, but of course alleviating suffering, trying to cure cancer, is a beautiful mission. So I, of course, knew theoretically the notion of cancer, but just reading more and more about it, it's 1.7 million new cancer cases in the United States every year, 600,000 cancer-related deaths every year. So this has a huge impact, in the United States and globally.
Starting point is 00:18:36 Broadly, before we talk about how machine learning, how MIT can help, when do you think we as a civilization will cure cancer? How hard of a problem is it, from everything you've learned about it recently? I cannot really assess it. What I do believe will happen with the advancement in machine learning is that a lot of types of cancer we will be able to predict way earlier and more effectively utilize existing treatments. I think, I hope at least, that with all the advancements in AI and drug discovery, we would be able to much faster find relevant molecules. What I'm not sure about is how long it will take the medical establishment
Starting point is 00:19:27 and regulatory bodies to kind of catch up and to implement it. And I think this is a very big piece of the puzzle that is currently not addressed. That's a really interesting question. So first, a small detail that I think the answer is yes, but is cancer one of the diseases that, when detected earlier, that significantly improves the outcomes? Because we will talk about there's the cure and then there is detection, and I think where machine learning can really help is earlier detection. So does detection help? Detection is crucial.
Starting point is 00:20:06 For instance, the vast majority of pancreatic cancer patients are detected at the stage where they are incurable. That's why they have such a terrible survival rate. It's like just a few percent over five years. It's pretty much, today, a death sentence. But if you can discover this disease early, there are mechanisms to treat it. And in fact, there are a number of people who were diagnosed and saved just because they had food poisoning.
Starting point is 00:20:41 They had terrible food poisoning. They went to the ER and they got a scan, there were early signs on the scan, and that saved their lives. But this was really an accidental case. So as we become better, we would be able to help many more people that are likely to develop diseases. And I just want to say that as I got more into this field, I realized that, you know, cancer is of course a terrible disease, but there's really a whole slew of terrible diseases
Starting point is 00:21:13 out there, like neurodegenerative diseases and others. So of course a lot of us are fixated on cancer just because it's so prevalent in our society and you see these people, but there are a lot of patients with neurodegenerative diseases and the kind of aging diseases that we still don't have a good solution for. And I felt that, as computer scientists, we kind of decided that it's other people's job to treat these diseases, because traditionally people in biology or in chemistry, or MDs, are the ones who are thinking about it. And after I kind of started paying attention, I think that it's really a wrong assumption
Starting point is 00:21:57 and we all need to join the battle. So it seems like in cancer specifically, there are a lot of ways that machine learning can help. So what's the role of machine learning in the diagnosis of cancer? So for many cancers today, we really don't know what your likelihood of getting cancer is. And for the vast majority of patients, especially the young patients, it really comes as a surprise. For instance, for breast cancer, 80% of the patients are first in their families, like me. And I never thought that I had any increased risk because nobody had it in my family
Starting point is 00:22:38 and for some reason, in my head, it was kind of an inherited disease. But even if I had paid attention, the very simplistic statistical models that are currently used in clinical practice really don't give you an answer, so you don't know. And the same is true for pancreatic cancer, the same is true for non-smoking lung cancer and many others. So what machine learning can do here is utilize all this data to tell us early who is likely to be susceptible, using all the information that is already there, be it imaging, be it your other tests,
Starting point is 00:23:17 and eventually liquid biopsies and others, where the signal itself is not sufficiently strong for the human eye to do a good discrimination, because the signal may be weak. But by combining many sources, a machine which is trained on large volumes of data can really detect it early, and that's what we've seen with breast cancer, and people are reporting it in other diseases as well. That really boils down to data, right? And to the different kinds of sources of data.
Starting point is 00:23:48 And you mentioned regulatory challenges. So what are the challenges in gathering large data sets in this space? Again, another great question. So it took me, after I decided that I wanted to work on it, two years to get access to data. And you did, like, any significant amount? Like, right now in this country, there is no publicly available data set of modern mammograms
Starting point is 00:24:15 that you can just go on your computer, sign a document and get it. It just doesn't exist. I mean, obviously every hospital has its own collection of mammograms. There are data that came out of clinical trials. What we are talking about here is a computer scientist who just wants to run his or her model and see how it works. This data, like ImageNet, doesn't exist. And there is a set which is called the Florida dataset,
Starting point is 00:24:45 which is film mammograms from the '90s, which is totally not representative of the current developments. Whatever you're learning on them doesn't scale up. This is the only resource that is available. And today, there are many agencies that govern access to data, like the hospital holds your data, and the hospital decides whether they would give it to the researcher to work with this data.
Starting point is 00:25:09 An individual hospital? Yeah, I mean, the hospital may, you know, assuming that you're doing a research collaboration, you can submit, you know, there is a proper approval process guided by the IRB, and if you go through all the processes, you can eventually get access to the data. But as you yourself know, in our AI community, there are not that many people who actually ever got access to data, because it's a very challenging process. And sorry, just a quick comment: MGH or any kind of hospital, are they scanning the data? Are they digitally storing it? Oh, it is already digitally stored.
Starting point is 00:25:48 You don't need to do any extra processing steps. It's already there in the right format. It's just that right now there are a lot of issues that govern access to the data, because the hospital is legally responsible for the data. And they have a lot to lose if they give the data to the wrong person, but they may not have a lot to gain, as a hospital, as a legal entity, by giving it to you. And the way, you know, what I would imagine happening in the future is the same thing that happens when you're getting your driving license.
Starting point is 00:26:24 You can decide whether you want to donate your organs. You can imagine that whenever a person goes to the hospital, it should be easy for them to donate their data for research, and it can be of different kinds: do they only give you a test result, only imaging data, or the whole medical record? Because at the end, we all will benefit from all this insight. And you cannot just say, I want to keep my data private, but I would really love to get it from other people, because other people are thinking the same way. So if there is a mechanism to do this donation, and the patient has an ability to say how they want their data to be used for research,
Starting point is 00:27:08 It will be really a game changer. People when they think about this problem, it depends on the population, it depends on the demographics, but there's some privacy concerns. Generally, not just medical data, it's just any kind of data. It's what you said, my data, it should belong kind of to me. I'm worried how it's going to be misused. How do we alleviate those concerns? Because that seems like a problem that
Starting point is 00:27:35 needs to be, that problem of trust, of transparency, needs to be solved before we build large data sets that help detect cancer, help save those very people in the future. So I think there are two things that could be done. There are technical solutions and there are societal solutions. So on the technical end, we today have the ability to improve de-identification. Like, for instance, for imaging, you can do it pretty well.
Starting point is 00:28:11 What's de-identification? Removing the identification, removing the names of the people. There are other data, like if it is raw text, where you cannot really achieve 99.9%. But there are all these techniques, and actually some of them are developed at MIT, for how you can do learning on the encoded data, where you locally encode the image, you train a network which only works on encoded images,
Starting point is 00:28:39 and then you send the outcome back to the hospital, and you can open it up. So those are the technical solutions. There are a lot of people who are working in this space where the learning happens in the encoded form. We are still early. But this is an interesting research area where I think we'll make more progress. There is a lot of work in the natural language processing community on how to do de-identification better. But even today, there are already a lot of data which can be de-identified perfectly, like your test data, for instance, correct, where you can just, you know, drop the name of the patient,
Starting point is 00:29:18 you just want to extract the part with the numbers. The big problem here is, again, hospitals don't see much incentive to give this data away on one hand, and then there is general concern. Now, when I'm talking about societal benefits and about education, the public needs to understand, I think, that there is a situation, and I still remember myself, when I really needed an answer. I had to make a choice. There was no information to make a choice. You're just guessing. At that moment you feel that your life is at stake, but you just don't have information to make the choice. Many times when I give talks, I get emails from women who say, you know, I'm in this situation, can you please run statistics and see what the outcomes are?
Starting point is 00:30:14 We get almost every week a mammogram that comes by mail to my office at MIT. I'm serious. People ask us to run it because they need to make, you know, life-changing decisions. And of course, you know, I'm not planning to open a clinic here, but we do run it and give them the results for their doctors. But the point that I'm trying to make is that we all, at some point, or our loved ones, will be in the situation where you need information to make the best choice. And if this information is not available, you would feel vulnerable and unprotected. And then the question is, you know, what do I care about more? Because at the end, everything
Starting point is 00:30:56 is a trade-off, correct? Yeah, exactly. Just out of curiosity, it seems like one possible solution, and I'd like to see what you think of it, based on what you just said, based on wanting to know answers for when you yourself are in that situation: is it possible for patients to own their data, as opposed to the hospitals owning their data? Of course, theoretically, I guess patients own their data, but can you walk out of there with a USB stick containing everything, or upload it to the cloud? Where a company, you know, I remember Microsoft had a service that I tried, I was
Starting point is 00:31:33 really excited about, and Google Health was there. I tried it, I was excited about it. Basically companies helping you upload your data to the cloud so that you can move from hospital to hospital, from doctor to doctor. Do you see a promise in that kind of possibility? I absolutely think this is, you know, the right way to exchange the data. I don't know now who is the biggest player in this field, but I can clearly see that, even for totally selfish health reasons, when you are going to a new facility, and many of us are sent to some specialized treatment, they don't easily have access to your data. And today, you know, women who want to send their mammogram need to go to their hospital, find some small office,
Starting point is 00:32:18 which gives them the CD, and they ship the CD. So you can imagine we're looking at a kind of decades-old mechanism of data exchange. So I definitely think this is an area where hopefully all the right regulatory and technical forces will align and we will see it actually implemented. It's sad because, unfortunately, and I need to research why that happened, but I'm pretty sure Google Health and Microsoft HealthVault, or whatever it's called, both closed down, which means that there was either regulatory pressure or there's not a business case or there are challenges from hospitals, which is very disappointing. So when you say you don't know what the biggest players are, the two biggest that I was aware of closed their doors.
Starting point is 00:33:07 So I'm hoping, I'd love to see why and I'd love to see who else can come up. It seems like one of those Elon Musk style problems that obviously need to be solved, and somebody needs to step up and actually do this large-scale data collection. I know that there is an initiative in Massachusetts, I think led by the governor, to try to create this kind of health exchange system, at least to help people when you show up in an emergency room and there is no information about what your allergies are and other things. So I don't know how far it will go. But another thing that you said, and I'm finding it very interesting, is actually who are
Starting point is 00:33:48 the successful players in this space, and the whole implementation, how does it go? To me, from the anthropological perspective, it's more fascinating than the AI that today goes into healthcare. You know, we've seen so many attempts and so very few successes. And it's interesting to understand, and I by no means have the knowledge to assess, why we are in the position where we are. Yeah, it's interesting because data is really fuel for a lot of successful applications, and when that data requires regulatory approval,
Starting point is 00:34:25 like the FDA or any kind of approval, it seems that the computer scientists are not quite there yet in being able to play the regulatory game, understanding the fundamentals of it. I think that in many cases, even when people do have data, we still don't know what exactly you need to demonstrate to change the standard of care. Let me give you an example related to my breast cancer research. So in traditional breast cancer risk assessment, there is something called density, which determines the likelihood of a woman to get cancer. And this is pretty much how much white you see on the mammogram: the whiter it is,
Starting point is 00:35:13 the more likely the tissue is dense. And the idea behind density, it's not a bad idea. In 1967, a radiologist called Wolfe decided to look back at women who were diagnosed and see what is special in their images. Can we look back and say that they're likely to develop cancer? So he came up with some patterns. It was the best that his human eye could identify. Then it was kind of formalized and coded
Starting point is 00:35:39 into four categories, and that's what we are using today. And today, this density assessment is actually a federal law from 2019, approved by President Trump and the previous FDA commissioner, where women are supposed to be advised by their providers if they have high density, putting them into a high-risk category. And in some states, you can actually get supplementary screening paid for by your insurance because you're in this category. Now, you can say, how much science do we have behind it?
Starting point is 00:36:14 Whatever, biological science or epidemiological evidence. So it turns out that between 40 and 50 percent of women have dense breasts. So about 40 percent of patients are coming out of their screening and somebody tells them, you are at high risk. Now what exactly does it mean if you, as half of the population, are at high risk? You may say, maybe I'm not, you know, or what do I really need to do with it? Because the system doesn't provide me a lot of solutions, because there are so many people like me that we cannot really provide very expensive solutions for them. And the reason this whole density became this big deal, it was actually advocated by the
Starting point is 00:36:57 patients who felt very unprotected, because many women went to mammograms which were normal, and then it turned out that they already had cancer, quite developed cancer. So they didn't have a way to know who is really at risk, and what is the likelihood that when the doctor tells you you're okay, you are not okay. So at the time, and it was 15 years ago, this maybe was the best piece of science that we had.
Starting point is 00:37:24 And it took 15, 16 years to make it federal law. But now this is a standard. Now, with a deep learning model, we can so much more accurately predict who is going to develop breast cancer, just because you're trained on a logical thing. And instead of describing how much white and what kind of white, the machine can systematically identify the patterns, which was the original idea behind the thought of the radiologist.
Starting point is 00:37:50 The machine can do it much more systematically and predict the risk when you're training the machine to look at the image and to say the risk in one, two, five years. Now you can ask me, how long will it take to substitute this density, which is broadly used across the country and really is not helping, and to bring in these new models? And I would say it's not a matter of the algorithm. Algorithms are already orders of magnitude better than what is currently in practice. I think it's really the question,
Starting point is 00:38:19 who do you need to convince? How many hospitals do you need to run the experiment? What, you know, all this mechanism of adoption, and how do you explain to patients and to women across the country that this is really a better measure? And again, I don't think it's an AI question. We can work more and make the algorithm even better, but I don't think that this is the current barrier. The barrier is really this other piece that for some reason is not really explored.
Starting point is 00:38:52 It's like an anthropological piece. And coming back to a question about books, there is a book that I am reading. It's called An American Sickness by Elisabeth Rosenthal. And I got this book from my clinical collaborator, Dr. Connie Lehman. And I should know everything that I need to know about the American health system, but, you know, every page doesn't fail to surprise me. And I think that there are a lot of interesting
Starting point is 00:39:20 and really deep lessons for people like us, from computer science, who are coming into this field, to really understand how complex the system of incentives is, to understand how you really need to play to drive adoption. You just said it's complex, but if we're trying to simplify it, who do you think most likely would be successful if we push on this group of people? Is it the doctors? Is it the hospitals? Is it the government or policymakers? Is it the individual patients, consumers, who need to be inspired to most likely lead to adoption? Or is there no simple answer? There's no simple answer, but I think there are a lot of good people in the medical system who do want to make a change. I think a lot of power will come from us as consumers, because we all are
Starting point is 00:40:19 consumers, or future consumers, of healthcare services. I think we can do so much more in explaining the potential, and not in hype terms, and not saying that we have now cured Alzheimer's. I'm really sick of reading these kinds of articles which make these claims. But really to show, with some examples, what this implementation does and how it changes the care. Because I can imagine, it doesn't matter what kind of politician it is, you know, we're all
Starting point is 00:40:50 susceptible to these diseases. There is no one who is free. And eventually, you know, we all are humans and we are looking for a way to alleviate the suffering. And this is one possible way, which we are currently underutilizing, which I think can help. So it sounds like the biggest problems are outside of AI in terms of the biggest impact
Starting point is 00:41:14 at this point. But are there any open problems in the application of ML to oncology in general, so improving the detection or any other creative methods, whether it's on the detection, segmentation, or the vision perception side, or some other clever inference? Yeah, what, in general, in your view, are the open problems in this space? Yeah, I just want to mention that besides detection, another area where I am kind of quite active, and I think it's really an increasingly important area in
Starting point is 00:41:46 healthcare, is drug design. Because, you know, it's fine if you detect something early, but you still need to get drugs, and new drugs, for these conditions. And today, in all of drug design, ML is nonexistent. We don't have any drug that was developed by an ML model, or even, not developed, but where we at least know that an ML model played some significant role. I think this area, with all the new ability
Starting point is 00:42:20 to generate molecules with desired properties, to do in silico screening, is really a big open area. To be totally honest with you, when we are doing diagnostics and imaging, we are primarily taking the ideas that were developed for other areas and applying them with some adaptation. The area of, you know, drug design is a really technically interesting and exciting area. You need to work a lot with graphs and capture various 3D properties. There are lots and lots of opportunities to be technically creative. And I think there are a lot of open questions in this area. You know, we're already getting a lot of successes,
Starting point is 00:43:06 even with the first generation of these models. But there are many more new, creative things that you can do. And what's very nice to see is that the more powerful and more interesting models actually do do better. So there is a place to innovate in machine learning in this area. And some of these techniques are really unique to, let's say, graph generation and other things.
Starting point is 00:43:36 So, just to interrupt really quick, I'm sorry: graph generation, or graphs, drug discovery in general. How do you discover a drug? Is this chemistry? Is this trying to predict different chemical reactions? Or is it some kind of, what do graphs even represent in this space? Oh, sorry. And what's a drug?
Starting point is 00:44:02 Okay, so there are many different types of drugs, but let's say we're going to talk about small molecules, because I think today the majority of drugs are small molecules. So a small molecule is a graph: the node in the graph is an atom, and then you have the bonds. So it's really a graph representation if you're looking at it in 2D, correct? You can do it in 3D, but let's keep it simple and stick to 2D. So pretty much my understanding today
Starting point is 00:44:31 of how it is done at scale in the companies, without machine learning: you have high-throughput screening. So you know that you are interested in getting a certain biological activity of the compound. So you scan a lot of compounds, like maybe hundreds of thousands, some really big number of compounds, and you identify some compounds which have the right activity. And then at this point, you know, the chemists come in and they are trying to optimize this original hit for the different properties that you
Starting point is 00:45:02 want it to have: maybe solubility, you want to decrease toxicity, you want to decrease the side effects. And those, sorry again to interrupt, can that be done in simulation or just by looking at the molecules, or do you need to actually run reactions in real labs with lab coats? So when you do high-throughput screening, you really do screening. It's in the lab. It's really lab screening. You screen the molecules, correct? I don't know what screening is. The screening is just checking them for a certain property. Like in the physical space, in the physical world, like actually, there's a machine probably that's actually running the reaction. Actually running the reactions, yeah. So there is a process that you can run, and that's why it's
Starting point is 00:45:42 called high-throughput: you know, it becomes cheaper and faster to do it on a very big number of molecules. You run this screening, you identify potential good starts, and then that's where the chemists come in, who have done it many times, and then they can try to look at it and say, how can you change the molecule to get the desired profile in terms of all the other properties? So maybe, how do I make it more bioactive, and so on. And there, the creativity of the chemist really is the one that determines the success of this design, because again, they have a lot of domain knowledge of what works, how do you decrease the toxicity and so on, and that's what they do. So all the drugs that are currently FDA-approved drugs, or drugs that are in clinical trials,
Starting point is 00:46:39 they are designed using these domain experts, who go through this combinatorial space of molecules and graphs and whatever, and find the right one, or adjust it to be the right one. Sounds like the breast density heuristic from '67, the same echoes. It's not necessarily that. It's really driven by deep understanding. It's not like they just observed it. I mean, they do deeply understand chemistry and they do understand different groups
Starting point is 00:47:07 and how they change the properties. So there is a lot of science that gets into it and a lot of kind of simulation, how do you want it to behave? It's very, very complex. So they're quite effective at this design, obviously. Now, effective, yeah, we have drugs. Like, depending on how you measure effective: if you measure it in terms of
Starting point is 00:47:29 cost, it's prohibitive. If you measure it in terms of time, you know, we have lots of diseases for which we don't have any drugs and we don't even know how to approach them, not to mention the few drugs for neurodegenerative diseases that fail. So there are lots of trials that fail in later stages, which is really catastrophic from the financial perspective. So is it the most effective mechanism? Absolutely no, but this is the only one that currently works. And I was closely interacting with people in the pharmaceutical industry. I was really fascinated by how sharp they are and what a deep understanding of the domain they
Starting point is 00:48:11 have. It's not observation-driven. There is really a lot of science behind what they do. But if you ask me, can machine learning change it, I firmly believe yes, because even the most experienced chemists cannot hold in their memory and understanding everything that you can learn from millions of molecules and reactions. And the space of graphs is a totally new space. I mean, it's a really interesting space for machine learning to explore, graph generation.
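A minimal sketch of the "molecule as a graph" idea from this part of the conversation, where each node is an atom and each edge is a bond. The ethanol example, the atom numbering, the tuple layout, and the helper name `build_adjacency` are illustrative assumptions, not something described in the interview; hydrogens are left implicit to keep the 2D picture simple.

```python
from collections import defaultdict

# Ethanol (CH3-CH2-OH), heavy atoms only. Indices are arbitrary labels
# chosen for this sketch.
atoms = {0: "C", 1: "C", 2: "O"}

# Each bond is (atom_i, atom_j, bond_order); both bonds here are single.
bonds = [(0, 1, 1), (1, 2, 1)]


def build_adjacency(bond_list):
    """Turn an undirected bond list into an adjacency map.

    Returns {atom_index: [(neighbor_index, bond_order), ...]}.
    """
    adjacency = defaultdict(list)
    for i, j, order in bond_list:
        adjacency[i].append((j, order))
        adjacency[j].append((i, order))
    return dict(adjacency)


adjacency = build_adjacency(bonds)

# Print each atom together with the elements it is bonded to.
for index, element in atoms.items():
    neighbor_elements = [atoms[j] for j, _ in adjacency.get(index, [])]
    print(index, element, neighbor_elements)
```

A learned model would operate on this kind of structure (nodes, edges, and their labels) rather than on a flat list of hand-picked features.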
Starting point is 00:48:41 Yeah, so there are a lot of things that you can do here. So we do a lot of work. The first tool that we started with was a tool that can predict properties of molecules. So you can just give the molecule and the property. It can be a bioactivity property or it can be some other property. And you train on the molecules, and you can now take a new molecule and predict this property. Now, when people started working in this area, they used something very simple: the kind of existing, you know, fingerprints, which are kind of handcrafted features of the
Starting point is 00:49:17 molecule, where you break the graph into substructures, and then you run, I don't know, a feed-forward neural network. And it was interesting to see that clearly, this was not the most effective way to proceed. And you need to have much more complex models that can induce a representation, which can translate this graph into the embeddings and do these predictions. So this is one direction. Another direction, which is kind of related, is not only to stop
Starting point is 00:49:44 by looking at the embedding itself, but actually modify it to produce better molecules. So you can think about it as machine translation: you can start with a molecule, and then there is an improved version of the molecule, and you can again, with an encoder, translate it into the hidden space and then learn how to modify it to produce an improved, in some ways, version of the molecule. So it's kind of really exciting. We already have seen that the property prediction works pretty well, and now we are generating molecules, and there are actually labs which are manufacturing these molecules. So we'll see where it will get us. Okay, that's really exciting. There's a lot of problems.
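A rough sketch of the "fingerprint" baseline mentioned earlier in this exchange: break the molecular graph into small substructures and turn them into a fixed-length vector of handcrafted features that a feed-forward network could consume. This toy version uses only each atom's immediate neighborhood and a CRC32 hash; the function name `radius1_fingerprint`, the 64-bit length, and the ethanol example are assumptions made for illustration and do not reproduce any particular cheminformatics library or the models discussed in the interview.

```python
import zlib


def radius1_fingerprint(atoms, bonds, n_bits=64):
    """Hash each atom's one-bond neighborhood into a fixed-length bit vector.

    atoms: {index: element symbol}, bonds: [(index_i, index_j), ...].
    """
    neighbors = {index: [] for index in atoms}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    bits = [0] * n_bits
    for index, element in atoms.items():
        # Substructure label, e.g. "C(C,O)" for a carbon bonded to C and O.
        around = ",".join(sorted(atoms[j] for j in neighbors[index]))
        label = f"{element}({around})"
        # crc32 is deterministic across runs, unlike Python's built-in hash().
        bits[zlib.crc32(label.encode("utf-8")) % n_bits] = 1
    return bits


ethanol_atoms = {0: "C", 1: "C", 2: "O"}
ethanol_bonds = [(0, 1), (1, 2)]
fingerprint = radius1_fingerprint(ethanol_atoms, ethanol_bonds)
print(sum(fingerprint), "of", len(fingerprint), "bits set")
```

The resulting vector is fixed in advance by the hand-written substructure rule, which is the limitation the speaker points to: a model that learns its own embedding of the graph can pick up patterns this rule never encodes.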
Starting point is 00:50:26 Speaking of machine translation and embeddings, you have done a lot of really great research in NLP, natural language processing. Can you tell me your journey through NLP, what ideas, problems, approaches you were working on, were fascinated with, did you explore before this magic of deep learning reemerged, and after? So when I started my work in NLP, it was in '97. This was a very interesting time. It was exactly the time that I came to ACL.
Starting point is 00:51:00 And at the time I could barely understand English, but it was exactly like the transition point. Because half of the papers were really, you know, rule-based approaches, where people took more kind of heavy linguistic approaches for small domains and tried to build up from there. And then there was the first generation of papers which were corpus-based papers. And they were very simple in our terms: you collect some statistics and do prediction based on them. But I found it really fascinating that one community can think so very differently about the problem. I remember my first paper that I wrote, it didn't have a single formula, it didn't have evaluation, it just had examples of outputs.
Starting point is 00:51:45 And this was a standard of the field at the time, in some ways. I mean, people maybe just started emphasizing the empirical evaluation, but for many applications like summarization, you just showed some examples of outputs. And then increasingly you can see how the statistical approaches dominated the field. And we've seen increased performance across many basic tasks. The sad part of the story may be that if you look again through this journey, we see that the role of linguistics in some ways greatly diminishes. And I think that you really need
Starting point is 00:52:28 to look through the whole proceedings to find one or two papers which make some interesting linguistic references. Today? Today, definitely. So the thing that seems to have been lost, going back to our conversation about human understanding of language, which I guess is what linguists do, structured, representing language in a way that's human-explainable, understandable, is missing today. I don't know what is explainable and understandable. In the end, we perform functions, and it's okay to have a machine which performs
Starting point is 00:53:06 a function. Like when you're thinking about your calculator, correct? Your calculator can do calculation very differently from how you would do the calculation, but it's very effective at it. And this is fine. If we can achieve certain tasks with high accuracy, it doesn't necessarily mean that it has to understand the same way as we understand. In some ways it's even naive to request, because you have so many other sources of information that are absent when you are training your system. So it's okay. It's a dilemma. And I would tell you one application that is really fascinating. In 1997, when I came to ACL, there were some papers on machine translation. They were like primitive, like people were trying really, really simple.
Starting point is 00:53:48 And my feeling was that, you know, to make a real machine translation system, it's like to fly to the moon and build a house there with a garden and live happily ever after. I mean, it's like impossible. I never could imagine that within, you know, 10 years, we would already see the systems working, and now nobody is even surprised to utilize the system on a daily basis. So this was like huge, huge progress, on things that people for a very long time tried to solve using other mechanisms, and they were unable to solve it. That's why I come back to a question about biology, that in linguistics people try
Starting point is 00:54:27 to go this way and try to write the syntactic trees and try to abstract it and to find the right representation. And, you know, they couldn't get very far with this understanding, while these models, using, you know, other sources, are actually capable of making a lot of progress. Now, I'm not naive enough to think that we are in a paradise in NLP, and I'm sure you know that when we slightly change the domain and when we decrease the amount of training, it can do really bizarre and funny things, but I think it's just a matter of improving generalization capacity, which is just a technical question. Wow.
Starting point is 00:55:09 So that's the question. How much of language understanding can be solved with deep neural networks? In your intuition, I mean, it's unknown, I suppose. But as we start to creep towards romantic notions of the spirit of the Turing test and conversation and dialogue, and something that maybe to me or to us silly humans feels like it needs real understanding, how much can be achieved with these neural networks or statistical methods? So I guess I am very much driven by the outcomes:
Starting point is 00:55:50 can we achieve the performance which would be satisfactory for us for different tasks? Now, if you again look at machine translation systems, which are trained on large amounts of data, they really can do a remarkable job relative to where they were a few years ago. And if you project into the future, if it will be the same speed of improvement,
Starting point is 00:56:16 you know, this is great. Now, does it bother me that it's not doing the same translation as we are doing? Now, if you go to cognitive science, we still don't really understand what we are doing. I mean, there are a lot of theories and obviously a lot of progress in studying it, but our understanding of what exactly goes on, you know, in our brains when we process language is still not so crystal clear and precise that we can translate it into
Starting point is 00:56:41 machines. What does bother me is that, again, machines can be extremely brittle when you go out of your comfort zone, when there is a distributional shift between training and testing. And it has been years and years: every year when I teach an NLP class, I show them some examples of translation from some newspaper in Hebrew, and it was perfect. Then I have a recipe that Tommi Jaakkola's sister sent me a while ago, and it was written in Finnish, for Karelian pies. And it's just a terrible translation.
Starting point is 00:57:16 You cannot understand anything of what it does. It's not like some syntactic mistakes. It's just terrible. And year after year, I try to translate it, and year after year, it does a terrible job, because I guess, you know, recipes are not a big part of the training repertoire. So, but in terms of outcomes, that's a really clean, good way to look at it.
Starting point is 00:57:38 I guess the question I was asking is, do you think, in the imaginable future, the current approaches can pass the Turing test, in the best possible formulation of the Turing test, which is, would you want to have a conversation with a neural network for an hour? Oh, God, no. No, there are not that many people that I would want to talk to for an hour. But there are some people in this world, alive or not, that you would like to talk to for an hour. Could a neural network achieve that outcome? So I think it would be really hard to create a successful training set
Starting point is 00:58:18 which would enable it to have a conversation, an actual conversation, for an hour. You think it's a problem of data, perhaps? I think in some ways it's not a problem of data. It's a problem both of data and of the way we're training our systems. Their ability to truly generalize, to be very compositional, is in some ways limited, you know, in the current capacity at least. You know, we can translate well, we can find information well, we can extract information. So there are many capacities in which it's doing very well.
Starting point is 00:58:52 And you can ask me, would you trust the machine to translate for you and use it as a source? I would say absolutely, especially if we are talking about newspaper data or other data which is in the realm of its own training set. I would say yes. But having conversations with the machine, it's not something that I would choose to do. But I would tell you something, talking about Turing tests and about all these kinds of Eliza conversations. I remember visiting Tencent in China, and they have this chatbot, and they claim that there is a really humongous part of the local population which for hours talks to the chatbot. To me it was, I cannot believe it, but apparently it's documented that there are some people who
Starting point is 00:59:36 enjoy this conversation. And, you know, it brought to me another MIT story, about Eliza and Weizenbaum. I don't know if you're familiar with the story. So Weizenbaum was a professor at MIT. And when he developed this Eliza, which was just doing string matching, very trivial, like restating what you said, with very few rules, no syntax, apparently there were secretaries at MIT
Starting point is 01:00:00 that would sit for hours and converse with this trivial thing. And at the time, there were no beautiful interfaces, so you actually needed to go through the pain of communicating. And Weizenbaum himself was so horrified by this phenomenon, that people can believe enough in the machine, that you just need to give them the hint that the machine understands you and they can complete the rest, that he kind of stopped this research and went into kind of trying to understand what
Starting point is 01:00:26 this artificial intelligence can do to our brains. So my point is, you know, it's not how good the technology is, it's how ready we are to believe that it delivers the goods that we are trying to get. That's a really beautiful way to put it. I, by the way, am not horrified by that possibility, but inspired by it, because, I mean, human connection, whether it's through language or through love, it seems like it's very amenable to machine learning. And the rest is just the challenges of psychology. Like you said, the secretaries who enjoy spending hours: I would describe most of our lives as enjoying spending hours with those we love for very silly reasons. All we're doing is keyword
Starting point is 01:01:19 matching as well. So I'm not sure how much intelligence we exhibit to each other, with the people we love, that we're close with. So it's a very interesting point of what it means to pass the Turing test with language. I think you're right. In terms of conversation, I think machine translation has very clear performance and improvement, right? What it means to have a fulfilling conversation is very, very
Starting point is 01:01:46 person-dependent and context-dependent and so on. That's, yeah, it's very well put. So, but in your view, what's a benchmark in natural language, a test that's just out of reach right now, but we might be able to reach, that's exciting? Is it perfecting machine translation, or is it summarization? What's out there? I think it goes across specific applications. It's more about the ability to learn from few examples for real, what we call few-shot learning and all these cases. Because, you know, the way we publish these papers today, we say, naively we get 55, but now we add a few examples and we can move to 65.
Starting point is 01:02:29 None of these methods is actually realistically doing anything useful. You cannot use them today. And the ability to generalize and to move, or to be autonomous in finding the data that you need to learn, to be able to perfect a new task, a new language: this is an area where I think we really need to move forward. And we are not yet there.
Starting point is 01:03:00 Are you at all excited, curious, by the possibility of creating human-level intelligence? Because you've been very practical in your discussion. So if we're looking at oncology, you're trying to use machine learning to help the world in terms of alleviating suffering. If you look at natural language processing, you're focused on the outcomes of improving practical things like machine translation. But, you know, human-level intelligence is the thing that our civilization is dreaming about creating, super-human-level intelligence. Do you think about this? Do you think it's at all within our reach? As you said yourself earlier, talking about how you perceive our communications with each other, that we are matching key
Starting point is 01:03:47 words and certain behaviors and so on. In the end, whenever one assesses, let's say, relations with another person, you have separate kinds of measurements and outcomes inside your head that determine what the status of the relation is. So one way, this is a classical dilemma, what is intelligence? Is it the fact that now we are going to do it the same way as a human is doing,
Starting point is 01:04:10 when we don't even understand what the human is doing? Or that we now have an ability to deliver these outcomes, not in one area, not just to translate or just to answer questions, but across many, many areas, so that we can achieve the functionalities that humans can achieve with their ability to learn and do other things? I think this is it, and this we can actually measure, how far we are, and that's what makes
Starting point is 01:04:36 me excited, that, you know, in my lifetime, at least so far, what we've seen is tremendous progress across these different functionalities. And I think it will be really exciting to see where we will be. And again, one way is to think about these machines which are improving their functionality; another one is to think about us, with our brains, which are imperfect, and how they can be accelerated by this technology as it becomes stronger and stronger. Coming back to another book that I love, Flowers for Algernon, have you read this book? So there is a point where the patient gets this miracle cure which changes his brain and, all of a sudden,
Starting point is 01:05:26 they see life in a different way and can do certain things better, but certain things much worse. So you can imagine this kind of computer-augmented cognition, where it can bring you that now, in the same way as cars enable us to get to places where we've never been before. Can we think differently? Can we think faster?
Starting point is 01:05:51 And we already see a lot of it happening in how it impacts us, but I think we have a long way to go there. So that's artificial intelligence and technology affecting or augmenting our intelligence as humans. Yesterday, a company called Neuralink announced, they did this whole demonstration. I don't know if you saw it. They demonstrated a brain-computer, brain-machine interface, where there's like a sewing machine for the brain. You know, a lot of that is quite out there in terms of things that some people would say are impossible, but they're dreamers and want to engineer systems like that. Do you see, based on what you just said, a hope
Starting point is 01:06:37 for that, more direct interaction with the brain? I think there are different ways. One is a direct interaction with the brain, and again, there are lots of companies that work in this space, and I think there will be a lot of developments. But I'm just thinking that many times we are not aware of our feelings, of the motivation that drives us. Let me give you a trivial example: our attention. There are a lot of studies that demonstrate that it takes a while for a person to understand that they are not attentive anymore. And we know that there are people who really have a strong capacity to hold attention. At the other end of the spectrum there are people with ADD and other
Starting point is 01:07:14 issues, who have a problem regulating their attention. Imagine to yourself that you have like a cognitive aid that just alerts you, based on your gaze, that your attention is now not on what you are doing, and instead of writing a paper, you're now dreaming of what you're going to do in the evening. So even this kind of simple measurement thing, how it can change us. And I see it in a simple way with myself. I have my Myzone app that I got from the MIT gym. It kind of records, you know, how much you ran, and you have some points, and you can get some status, whatever. And I said, what is this ridiculous thing?
Starting point is 01:07:52 Who will ever care about some status in some app? Guess what? So to maintain the status, you have to get a set number of points every month. And not only have I done it every single month for the last 18 months, it went to the point that I was injured, and when I could run again, in two days I did like some humongous amount of running just to complete the points. It was like really not safe. It's like, I'm not going to lose my status because I want to get there. So you can already see that this direct measurement and the feedback is, you know,
Starting point is 01:08:32 we're looking at video games and see why, you know, the addiction aspect of it, but you can imagine that the same idea can be expanded to many other areas of our life when we really can get feedback. And imagine, in your case, in relations, when we are doing keyword matching, imagine that the person who is generating the keywords, that person gets direct feedback before the whole thing explodes, that maybe at this point we are going in the wrong direction. Maybe it will be a really behavior-modifying moment. So yeah, it's relationship management, too. So yeah, that's a fascinating whole area of
Starting point is 01:09:11 psychology actually as well, of seeing how our behavior has changed, with basically all human relations now having other non-human entities helping us out. So you teach a large, a huge machine learning course here at MIT. I can ask you a million questions, but you've seen a lot of students. What ideas do students struggle with the most as they first enter this world of machine learning? Actually, this year was the first time I started teaching a small machine learning class. It came as a result of what I saw in my big machine learning class that Tommi Jaakkola and I built maybe six years ago. What we've seen is that as this area became more and more popular, more and more people at MIT want to take this class. And while we designed it for computer science majors,
Starting point is 01:10:07 there were a lot of people who really are interested to learn it. But unfortunately, their background was not enabling them to do well in the class. And many of them associated machine learning with the words struggle and failure, primarily the non-majors. And that's why we started a new class,
Starting point is 01:10:25 which we call Machine Learning: From Algorithms to Modeling, which emphasizes more the modeling aspects of it, and it has majors and non-majors. So we kind of try to extract the relevant parts and make it more accessible, because the fact that we're teaching 20 classifiers in a standard machine learning class, it's really a big question, do we really need it. But it was interesting to see this from the first generation
Starting point is 01:10:54 of students, when they came back from their internships and from their jobs, what different and exciting things they can do, things that I would never think you could even apply machine learning to. Some of them are like matching, you know, the relations and other things like that. Everything is amenable to machine learning. You know, that actually brings up an interesting point of computer science in general. It almost seems, maybe I'm crazy, but it almost seems like everybody needs to learn how to
Starting point is 01:11:25 program these days. If you're 20 years old or you're starting school, even if you're an English major, it seems like programming unlocks so much possibility in this world. So when you interact with those non-majors, are there skills that they were simply lacking at the time, that you wish they had and that they had learned in high school and so on? Like how should education change in this computerized world that we live in? It seems that because they knew that there is a Python component in the class,
Starting point is 01:12:01 you know, their Python skills were okay, and the class doesn't really have heavy programming. They primarily kind of add parts to the programs. I think it was more the mathematical barriers. And the class, again, being designed for the majors, was using notation like big O for complexity and others, and people who come from different backgrounds just don't have it in their lexicon. It's not necessarily a very challenging notion, but they were just not aware of it. So I think that, you know, kind of
Starting point is 01:12:31 linear algebra and probability, the basics of calculus, multivariate calculus, are things that can help. What advice would you give to students interested in machine learning? You've talked about detecting and curing cancer, drug design. If they want to get into that field, what should they do? Get into it and succeed as researchers and entrepreneurs. The first good piece of news is that right now, there are lots of resources that are created at different levels, and you can find, online or in your school, classes which are more
Starting point is 01:13:12 mathematical, more applied and so on. So you can find a kind of a preacher who preaches in your own language, where you can enter the field, and you can make many different types of contribution, depending on what your strengths are. And the second point, I think it's really important to find some area which you really care about, and it can motivate your learning, and it can be, for somebody, curing cancer, or doing self-driving cars, or whatever,
Starting point is 01:13:42 but to find an area where there is data, where you believe there are strong patterns and we should be doing it and we're still not doing it, or you can do it better, and just start there and see where it can bring you. So you've been very successful in many directions in life, but you also mentioned Flowers for Algernon. And I think I've read or listened to you mention somewhere that researchers often get lost in the details of their work. This is per our original discussion with cancer and so on. And don't look at the bigger picture, bigger questions of meaning and so on. So let me ask you the impossible question of what's the meaning of this thing, of life, of your life, of research. Why do you think we, descendants of great apes, are here on this
Starting point is 01:14:40 spinning ball? You know, I don't think that I have really a global answer. You know, maybe that's why I didn't go to humanities. I didn't think about it in these classes in my undergrad. But the way I am thinking about it, each one of us has inside of them their own set of, you know, things that we believe are important. And it just happens that we are so busy with achieving various goals, busy listening to others and kind of trying to conform and to be part of the crowd, that we don't listen to that part. And, you know, we all should find some time to understand what are our own individual
Starting point is 01:15:28 missions. And we may have very different missions. And to make sure that while we are running 10,000 things, we are not, you know, missing out, and we are putting all the resources into satisfying our own mission. And if I look over my time, when I was younger, in most of these missions I was primarily driven by the external stimulus, to achieve this, to be that. And now, a lot of what I do is driven by really thinking about what is important for me to achieve, independently of the external recognition. And I don't mind being viewed in certain ways. The most important thing for me is to be true to myself, to what I think is right. How long did it take? How hard was it to find the you that you have to be true to?
Starting point is 01:16:31 So it takes time, and even now sometimes, you know, the vanity and the triviality can take over. At MIT? Yeah, it can be everywhere. It's just that the vanity at MIT is different from the vanity in other places, but we all have our piece of vanity. But I think actually, for me,
Starting point is 01:16:51 many times the place to get back to it is when I'm alone, and also when I read. And I think by selecting the right books, you can get the right questions and learn from what you read. So, but again, it's not perfect. Like, vanity is not as dominant. Well, that's a beautiful way to end.
Starting point is 01:17:22 Thank you so much for talking today. That was fun. That was fun.
