The Peter Attia Drive - #290 ‒ Liquid biopsies for early cancer detection, the role of epigenetics in aging, and the future of aging research | Alex Aravanis, M.D., Ph.D.

Episode Date: February 19, 2024

View the Show Notes Page for This Episode | Become a Member to Receive Exclusive Content | Sign Up to Receive Peter’s Weekly Newsletter

Alex Aravanis is a leader in research and development of technologies and clinical tests utilizing the latest tools in DNA analysis and data science. In this episode, Alex delves into two interconnected topics: liquid biopsies and epigenetics. He begins by tracing the trajectory of genome sequencing and tumor sequencing, setting the stage for a detailed exploration of liquid biopsies as an early cancer detection method. The discussion encompasses key concepts such as cell-free DNA, DNA methylation, sensitivity, specificity, and the predictive values associated with liquid biopsies. Transitioning to epigenetics, Alex examines the intricate interplay of DNA methylation and aging biology and explores the possibility of using cellular reprogramming to reverse epigenetic changes that occur with aging.

We discuss:
Alex’s background in applying engineering to problems in medicine [3:15];
A primer on human genetics, and the history and current landscape of DNA sequencing [11:00];
The advent and evolution of liquid biopsies for early detection of cancer [23:15];
The role of cell-free DNA in cancer detection: how incidental findings in non-invasive prenatal testing led to the development of liquid biopsies [40:15];
The development of a universal blood test for cancer detection and a discussion of specificity of tests [46:00];
Advancements in cell-free DNA analysis and development of a multi-cancer screening test at GRAIL [51:00];
DNA methylation explained [58:15];
Optimizing cancer detection with methylation analysis of cfDNA in small blood samples [1:02:45];
The importance of understanding sensitivity, specificity, positive predictive value, and negative predictive value in cancer screening [1:08:00];
The performance of the GRAIL Galleri test and its ability to detect various types and stages of cancer [1:21:00];
Do early cancer detection methods, like liquid biopsies, translate to improvement in overall survival? [1:27:45];
The role of epigenetics in aging [1:39:30];
How cell-free DNA methylation patterns can help identify a cancer’s tissue of origin [1:45:30];
Cellular and epigenetic reprogramming and other exciting work in the field of aging [1:52:30];
and more.

Connect With Peter on Twitter, Instagram, Facebook and YouTube

Transcript
Discussion (0)
Starting point is 00:00:00 Hey everyone, welcome to the Drive Podcast. I'm your host, Peter Attia. This podcast, my website, and my weekly newsletter all focus on the goal of translating the science of longevity into something accessible for everyone. Our goal is to provide the best content in health and wellness, and we've established a great team of analysts to make this happen. It is extremely important to me to provide all of this content without relying on paid ads. To do this, our work is made entirely possible by our members, and in return, we offer exclusive member-only content and benefits above and beyond what is available for free.
Starting point is 00:00:46 If you want to take your knowledge of this space to the next level, it's our goal to ensure members get back much more than the price of a subscription. If you want to learn more about the benefits of our premium membership, head over to peterattiamd.com forward slash subscribe. My guest this week is Alex Aravanis. Alex is the CEO and co-founder of Moonwalk Biosciences. I should note up front that I am also an investor in and an advisor to Moonwalk Biosciences. Alex and I were colleagues in medical school, so I've known Alex for a little over 25 years
Starting point is 00:01:23 now. Before Moonwalk, Alex was Illumina's chief technology officer, the SVP and head of research and product development. And under his leadership, Illumina launched the industry-leading product for generating and analyzing most of the world's genomic data. He developed large genome-based research and clinical applications, including whole genome sequencing for rare disease diagnoses, comprehensive genomic profiling for cancer, and for selected optimal therapies, and the most advanced AI tools for interpreting genomic information. Alex has been the founder of several biotech and healthcare companies, including Grail
Starting point is 00:01:59 Bio, where he served as the chief science officer and head of R&D. At Grail, he led the development of its multi-cancer early screening test, Gallery, which we'll discuss at length in this podcast. He holds over 30 patents and serves on the scientific advisory board for several biotechnology companies. Alex received his master's and his PhD in electrical engineering and his MD from Stanford University and his undergrad in engineering from Berkeley. In this episode we talk about two related things, liquid biopsies and epigenetics. We cover the evolution of genome sequencing and tumor sequencing. We then speak at length about Alex's work with
Starting point is 00:02:40 grail and liquid biopsies, including an understanding of cell-free DNA, methylation, sensitivity, specificity, along with the positive and negative predictive value of liquid biopsies. We then get into epigenetics, methylation, and the biology of aging. This isn't a specially complicated topic, but truthfully there are few topics in biology today that excite me more than this. And I suspect that my enthusiasm will come across pretty clearly here. So without further delay, please enjoy my conversation with Alex Orovinis. Hey Alex, great to be sitting down with you here today. I kind of wish we were doing this
Starting point is 00:03:23 in person because we haven't seen each other in person in a few months, and even that was sort of a chance meeting. So I guess by way of background, you and I go back over 20 years now, I guess it's 25 years that we both started med school together. It's hard to believe it's been that long, huh? It seems like a million years ago, but it also seems like yesterday. Yeah, those are good times. So Alex, one of the things I remember when we first met was that we pretty much clicked over the fact that we were both engineers coming in and we had a good group of friends that I remember in medical school. And the one thing we had in common is not one of us was a pre-med.
Starting point is 00:03:57 We were all kind of whatever the term was they used to describe as non-traditional path to medical school. So let's talk a little bit about just briefly your background. You came in as an electrical engineer, and then you did a PhD in a lab of a very prominent scientist by the name of Dick Chen. Maybe tell folks a little bit about what you did in that work and what it was that got you excited enough about science
Starting point is 00:04:20 to deviate off the traditional MD path. Yeah. My PhD was in electrical engineering, and Stanford has a cool configuration on the campus where the engineering school is literally across the street from the medical school. And so over time, I became more and more interested in applying signal processing techniques,
Starting point is 00:04:39 circuit design, imaging, AI, things like that, but the problems in medicine that were more interesting to me than some of the traditional engineering products, things like that. But the problems in medicine that were more interesting to me than some of the traditional engineering products and things like that. A world famous neuroscientist, as you mentioned, Dip Chen, who was very interested in fundamental questions about the quantum unit of communication in the brain, which is the individual synaptic vesicle. And there was a question of just what did it look like and how did it operate? And it was the beginning for me of just applying these engineering tools to really important questions in biology and helping answer them. That first story was a great article in Nature where we definitively answered the question of how that quantum is transmitted
Starting point is 00:05:20 between cells. And then went on to do several other projects like that. Can you say a little bit about that? How is that information transmitted? It was really fun to come up with these problems with an engineering and communications background. But if you look at a central neuron on the brain and you look at the rate at which information is transferred, it seemed to be much faster than the number of synaptic vesicles in the terminal. seem to be much faster than the number of synaptic vesicles in the terminal, right? So there's only 30 synaptic vesicles in the terminal by an electron microscope, yet you're
Starting point is 00:05:51 seeing hundreds of transmissions over a few seconds. So how is that possible? And there were various theories. There was an individual vesicle that was fusing and staying fused and pumping neurotransmitter through it without collapsing. And that's how you could get these so much more rapid puffs. We came up with a cute term which was called kiss and run to explain phenomenon and it
Starting point is 00:06:14 again helped answer this fundamental question of how did the brain get so many small neurons yet able to transmit so much information per individual connection. So, Alex, if you think about all the things that you learned during your PhD, I mean, I guess one of the benefits of doing it where you did it and the lab you did it in was you overlapped with some other really thoughtful folks, including a previous guest on the podcast, Carl Deseroth. What do you think were the most important things you learned philosophically, not necessarily technically, that are serving you in the stuff we're going to talk about today? So we're going to talk today a lot about liquid biopsies. We're
Starting point is 00:06:56 going to talk a lot about epigenetics. We're going to talk a lot about certainly technologies that have made those things possible. And when you think back to your background in double E, what were the transferable skills? So I think one of them is just saying in engineering, which is if you can't build it, you don't understand it. So simply understanding a description of something is not the same as you can build it up from scratch. And so you can't always do that in biology, but you can do experiments where you're testing the concept of, can I really make it work? And so I think that
Starting point is 00:07:29 was an engineering concept that served me well a lot. Another, it's not exclusive to engineering, but was being very first principle. Do we really understand how this works? In that particular lab, there's a big emphasis on doing experiments where you always learn something, where regardless of whether or not it confirmed or rejected your hypothesis, you learn something new about the system. Don't do experiments where you may just not learn anything. That was a very powerful way to think about things. So, well, fast forward a bit just for the sake of time. There's obviously an interesting detour where after I go off to residency and after you
Starting point is 00:08:08 finish your PhD, we still find ourselves back together side by side in the same company for four years, which again brought many funny stories, including my favorite is you and I getting lost in the middle of Texas, actually not in the middle of Texas, but just outside of El Paso and nearly running out of gas. I mean, this was no cell signal. We were in trouble, but we somehow made it out of that one together. Yes. Yeah, no, I remember that, that us Californians thought that there must be a Starbucks within, you know, 10, 15 miles out in the west Texas. And it turns out you can go hundreds of miles with that Starbucks.
Starting point is 00:08:49 That's right. Passing a gas station with an eighth of a tank saying, we'll stop at the next one can be a strategic error. There was also the time you bailed me out when I forgot my cuff links because you had some dental floss. Do you remember that? I don't know. I don't know if you remember that. There you go. Yeah. You had some dental floss. That's right. Yeah, total macular removal. But anyway, let's fast forward to all that stuff. So I don't know what year it is. It's got to be circa what 2012? When do you end up at Illumina for the first time? Early 2013. Okay. Talk to me about that role. What was it that you were recruited to Illumina to do? Maybe just tell folks who don't know what Illumina is a little bit about the company as well.
Starting point is 00:09:33 Yeah. So today, Illumina is the largest maker of DNA sequencing technologies. So when you hear about the human genome being sequenced, things like expression data or any seek, most liquid biopsies, most tumor sequencing, finding genetic variants and kids with rare disease, most of that is done with Illumina technology. So they also make the chemistries that process the DNA, the sequencers that generate that information and also the software that helps analyze it. So, I really took that tool from a very niche research technology to standard of care in medicine
Starting point is 00:10:13 and hundreds of thousands of publications and tremendously has been advancing science. So, 11 years ago, you showed up there. What was the role you were cast in? This was earlier on in Illuminous History. What attracted me to the company and why it was recruited was to help develop more clinical applications and more applied applications of the technology. So the technology had a use by certain sequencing aficionados for basic research, but the company and I agreed with the vision felt that, hey, this could be used for a lot more. This could be used to help every cancer patient.
Starting point is 00:10:49 This could be used to help people with genetic diseases. How can we develop the technology and other aspects of it, the assays and softwares, to make that a reality? I was hired to do that. It occurred to me when you even said a little bit of that, Alex, that many of us, you and I would take for granted some of the lingo involved here, sequencing and what's involved. But I still think it might be a bit of a black box to some people listening. And given the topics we're going to cover today, I think just explaining to people,
Starting point is 00:11:23 for example, what was done in the late 90s, early 2000s when quote unquote, the human genome was sequenced. What does that mean? And how had that changed from the very first time it was done by sheer brute force in the most analog way until even when you arrived 10, 11 years ago. So maybe walk us through what it actually means to sequence a genome and feel free to also throw in a little bit of background about some of the basics of DNA and the structure, etc. as it pertains to that. It's really important fundamental stuff.
Starting point is 00:11:57 Quick primer on human genetics. So in most cells of the body, you have 23 pairs of chromosomes. So in most cells of the body, you have 23 pairs of chromosomes. They're very similar except the X and Y chromosome, which are obviously different in men and women. Each one of those chromosomes is actually a lot of DNA packed together in a very orderly way, where the DNA is wrapped around proteins called nucleosomes, which are composed of histones, and then it's packed into something called chromatin, which is this mass of DNA and proteins. And again, packed together, and then you make these units of chromosomes.
Starting point is 00:12:32 Now, if you were to unwind all of those chromosomes, pull the string on the sweater and completely unwind it, and you were to line all of them end to end, you would have three billion individual bases. So the ATCG code, at any given of the one of those 3 billion positions, you would have a string of letters. Each one would either be ATC or G, and it would be 3 billion long. So the sequence, a whole human genome, is to read out that code for an individual. And once you do that, you then know their particular code at each of those physicians.
Starting point is 00:13:10 So at the end of the last century, that was considered a white ad daunting task. But as I think our country has often done, decided that it was a very worthy one to do, along with several other leading countries that believe strongly in science. And so they funded the Human Genome Project. So all over the world at centers, people were trying to sequence bits of this 3 billion basis to comprise the first complete human genome. So it's just quite famous.
Starting point is 00:13:39 There were two efforts. One was a public effort led by the NIH and Francis Collin at the time. They had a particular approach where what they were doing was they were cutting out large sections of the genome and then using an older type of sequencing method called capillary electrophoresis to sequence each of those individual bases. There was a private effort led by Craig Venter and a company called Solera, which took a very different approach, which is they cut up the genome into much, much smaller pieces, pieces that were so small that
Starting point is 00:14:15 you didn't necessarily know a priori what part of the genome they would come from, which is why they were doing this longer, more laborious process through the public effort. But there was a big innovation, which is they realized that if you had enough of these fragments, you could, using a mathematical technique, reconstruct it from these individual pieces. You could take individual pieces, looked at where they overlapped. Again, we're talking about billions of fragments here, and you can imagine mathematically reconstructing that.
Starting point is 00:14:44 Very computationally intensive, very complex. But the benefit of that is that you could generate the data much, much faster. And so in a fraction of the time and for a fraction of the money, they actually caught up to the public effort and then culminated in each having a draft of the human genome around the same time in late 2000, early 2001, and then simultaneously in nature and science, we got the first draft of the human genome, milestone in science. Alex, what were the approximate lengths of the fragments that Solera was breaking DNA down into?
Starting point is 00:15:19 They were taking chunks out in individual megabases, so like a million bases at a time. And then they would isolate that and then deconstruct it even to smaller pieces, which were kilobase fragments, a thousand bases at a time. And again, so they would take a piece of the puzzle, but they would know which piece it was and then break that into smaller and smaller ones. And then after you had the one kilobase sequences, they would put it all back together versus just to contrast that with the private effort, which they called shotgun sequencing,
Starting point is 00:15:51 which is you just took the whole thing, ground it up, brute force sequenced it, and then use the informatics to figure out what went where. And in the shotgun, how small were they broken down into? They got down to kilobase and multihundered base fragments. But the key was all you had to do was just brute force keep sequencing as opposed to this more artisanal approach of trying to take individual pieces and deconstruct them and then reconstruct them.
Starting point is 00:16:20 So it's early 2001, this gets published. By the way, do we know the identity of the individual? I think we do know the identity of the individual who was sequenced, don't we? I can't recall. I think the original one was still anonymous and likely to be a composite of multiple individuals just because of the amount of DNA.
Starting point is 00:16:36 That was needed, yeah. Yeah, soon after there were individuals. Craig Ventner, he may have been the first individual who was named that we had the genome for. Got it. It's often been said, Alex, that that effort costs, at the end of that sequencing, if you decided, I want to now do one more person, it would cost a billion dollars directionally to do that effort. What was the state of the art in transitioning that from where it was, let's just say, order of magnitude, $10 to the $9 per sequence to where it was 10 years later, approximately. What was the
Starting point is 00:17:17 technology introduction or plural version of that question that led to a reduction. And how many logs did it improve by? We went back and did this analysis. So if you literally at the end of the original human genome said, Hey, I want to do one more. And you have the benefit of all the learnings from the previous one. A few hundred million dollars would have been an incremental genome by 2012. It was low tens of thousands of dollars. So let's call that four or five logs of improvement.
Starting point is 00:17:51 And what brought that? So the day you show up at Illumina, and it's if for research purposes or if a very wealthy individual said, I have to know my whole genome sequence and they were willing to pay $25,000 for it or a lab was doing it as part of a clinical trial or for research. What were they buying from Illumina to make that happen? So, it was a series of inventions that allow the sequencing reactions to be miniaturized and then you could do orders of magnitude more sequencing of DNA by miniaturizing it. The older sequencers, they had a small glass tube, and as the DNA went through, you sequenced it.
Starting point is 00:18:30 It got converted into a 2D format, kind of like a glass slide, where you had tiny fragments of DNA stuck to it, hundreds of millions, then ultimately billions, and then you sequenced all of them simultaneously. So there was a huge miniaturization of each individual sequencing reaction, which allowed you to just in one system generate many, many more DNA sequences at the same time. There's a very important chemistry that was developed called sequencing by synthesis by a Cambridge chemist,
Starting point is 00:19:02 who I know well, Shankar Balasupramanian, and he developed a luminous sequence in chemistry, which ultimately went through a company called Selexa, which Illumina required. And that has generated the majority of the world's genomics data of the original chemistry that he developed in Cambridge. And what was it about that chemistry that was such a step forward? It allowed you to miniaturize the sequencing reactions. So, you could have a huge number, ultimately billions, in a very small glass slide.
Starting point is 00:19:31 It also allowed you to do something which is called cyclic sequencing in a very precise and efficient and fast way where you read off one base at a time and you can control it. And so, you imagine you have say a lawn of a billion DNA fragments and you're on base three on every single fragment and you want to know what base four is on every fragment. It allowed you to simultaneously sequence just one more base on all billion fragments, read it out across your whole lawn. And then once you read it out, add one more base, read it all out.
Starting point is 00:20:05 And so this allowed for this huge parallelization. Let's talk a little bit about where we are today. To my recollection the last time I looked, to do a whole genome sequence today is on the order of $1,000, $500 to $1,000. Is that about accurate? Yeah. That's way too expensive, Peter. Today, a couple hundred dollars. Okay. So, a couple hundred dollars today. I feel like I looked at this on a graph
Starting point is 00:20:34 a while ago and it was one of the few things I noticed that was improving faster than Moore's Law. Maybe tell folks what Moore's Law is, why it's often talked about. I think everybody's heard of it and maybe talk about the step function that it's basically, if I'm looking at it correctly, there were two Moore's laws, but there was something in between that became even a bigger improvement. But maybe tell folks what Moore's law is first of all. It's not like a law, like a law of physics or something like that, but it became an industry trend in microprocessors. What it refers to is the density of transistors on a microchip and the cost of the amount of computing power per amount of transistors. And that geometrically decreased kind of in a steady way. She don't remember the exact number
Starting point is 00:21:25 if it's like doubling every two years or something like that. But there was a geometric factor to it that the industry followed for decades. It's not quite following that anymore. I mean, transistors are getting down to like the atomic scale but went way faster than people had envisioned. It basically started in the late 60s. And as you said, it went until it hits the limits of atomic chemistry.
Starting point is 00:21:49 Yeah. And so that relentless push is what made the whole software engineering high-tech industry possible. So back to my question, which is if you just look at the cost of sequencing from 2000 till today, it's sort of like two curves. There's the relentless curve that gets to where we are in 2013. But then there was another big drop in price that occurred after that. I'm guessing that had to do with shotgun sequencing or the commercialization of it. I mean, not the concept of it, which already existed. Does that sound right? Yeah. So when Illumina really started to deliver the higher throughput next generation sequencings,
Starting point is 00:22:29 it brought along a new faster curve because of the miniaturizations. This ability to sequence billions of fragments in a small area, I was privileged to be a big part of this effort. Illumina just continuing to drive the density down, the speed of the chemistry up, all the associated optics, engineering software around it drove that much faster than Moore's Law reduction at cost. Were other companies involved in the culmination of next-gen sequencing? Yeah, many. And some of them are still around, not nearly as successful as Illumina, but also some important players there.
Starting point is 00:23:06 And today that's the industry standard. I assume there's no sequencing that's going on that isn't NextGen. No, the vast majority is NextGen sequencing. There's niche applications where there's other approaches, but in the 99% of the data being generated, some version of NextGeneration sequencing. Got it. percent of the data being generated, some version of next generation sequencing. Got it. So you mentioned a moment ago that part of the effort to bring you to Illumina was presumably
Starting point is 00:23:32 based on not just your innate talents, but also the fact that you came from a somewhat clinical background as well. You're an MD and a PhD. And if their desire is to be able to branch out into clinical applications, that would make for a natural fit. So where in that journey did the idea of liquid biopsies come up and maybe talk a little bit about the history of one of the companies in that space that we're going to talk about today?
Starting point is 00:23:56 So to start with that, I should talk about first tumor sequencing, which predated liquid biopsy. A couple of companies, most notably Foundation Medicine developed using Illumina technology developed tumor sequencing. So there had been some academic work, but they tried to develop it and were the first to do it successfully as a clinical product. What you can imagine is there's these genes that are implicated in cancer that often get mutated. Knowing which mutations a tumor has has big implications for prognosis, but also for treatment. Over time, we have more and more targeted therapies
Starting point is 00:24:31 where if your tumor has a very particular mutation, it's more likely to respond to certain drugs that target that type of tumor. And at the time, as more and more of these mutations were identified that could be important in the treatment of a tumor, it was becoming impractical to say, do a PCR test for every mutation. So imagine there's 100 potential mutations you'd like to know about, if a patient has in their tumor and their lung cancer, doing each of these individually, again, a lot of extents, a lot of false positives. And so what companies like Foundation met is say, hey,
Starting point is 00:25:08 why don't we just sequence all of those positions at once, given next generation sequencing? So they would make a panel to sequence, say, 500 genes or a few hundred genes, the ones that are most important in most solid cancers. And then they would sequence them. And then in one test, they would see the vast majority of the potential mutations that could be relevant
Starting point is 00:25:28 to treatment for that cancer patient. And so that is still a very important tool. In oncology today, a large fraction of tumors are sequenced, and that's what allows people to get access to many types of drugs. Many of the targeted therapies for lung cancer, melanoma, or you hear about things like microsatellite instability or high-mutational burden, that all comes from tumor sequencing.
Starting point is 00:25:55 Once that was established, then a few folks, most notably at Johns Hopkins, but also other places, started to ask the question, could we sequence the tumor from the blood? And you might say, hey, you have a tumor in your lung. Why would sequencing blood be relevant to looking at the tumor? Well, it turns out there is tumor DNA in the blood. And this is interesting. So in the late 40s, it was first identified that there was DNA in the blood outside of cells, so-called cell-free DNA. And then in the 70s, it was noticed that cancer patients had a lot of DNA outside their cells
Starting point is 00:26:32 in the blood, and that some of this was likely from tumors, from the cancer itself. If you know anything about tumor biology, you know that cancer cells are constantly dying. So you think of cancers as growing very quickly, and that's true, but they actually are dying at an incredible rate because it's disordered growth. So many of the cells that divide have all kinds of genomic problems, so they die or they're cut off from vasculature. But the crazy thing about a tumor is, yes, it's growing fast if it's an aggressive tumor, but also the amount of cell death within that tumor is very high.
Starting point is 00:27:09 And every time one of those cells die, some of the DNA has the potential to get into the bloodstream. And so it was this insight along with the tumor sequencing that said, hey, what if we sequenced this cell-free DNA? Could we end up sequencing some of the tumor DNA or the cancer cell DNA that's in circulation? Early results, particularly from this group at Johns Hopkins, began to show that indeed that was possible. And then a few companies, again, using aluminum technology, and then we started doing it at Illumina also, our own liquid biopsy assays and tests and
Starting point is 00:27:45 technologies developed what became liquid biopsy. In this context, it was for late stage cancer. So it was for patients who diagnosed with a cancer. You wanted to know, did their tumor have mutations and you could do it from the blood. There was a big benefit, which was, as you know, for lung cancer, taking a biopsy can be a very dangerous proposition. You can cause a pneumothorax, you can land someone in the ICU. You know, in rare cases, can lead to death in that type or procedure.
Starting point is 00:28:14 And so, the ability to get the mutational profile from the blood was really attractive. And so, that started many companies down the road of developing these liquid biopsies for late stage cancers. So, Alex, let's talk about a couple of things there. Tell me the typical length of a cell-free DNA fragment. How many base pairs or what's the range? Yeah, it depends on the exact context, but around 160 base pairs. So that's 160 letters of the ATCG code. And there's a very particular reason it's that length, which is that if you pull the string on the sweater, you unwind the chromosome, and you keep doing it until you get down to something around 160 base pairs,
Starting point is 00:29:00 what you find is that the DNA, right, it's not just naked, it's wrapped around something called the nucleosome, which is an octamer or eight of these histone proteins in a cube, and the DNA is wrapped around it twice. And that's the smallest unit of chromatin of this larger chromosome structure. And so the reason it's 160 bases is that's more or less the geometry of going around twice. And so DNA can be cleaved by enzymes in the blood, but that nucleosome protects the DNA from being cut to anything smaller than about 160 base pairs. And does that mean that the cell-free DNA that is found in the blood is still wrapped around the nucleosome twice?
Starting point is 00:29:48 Like it's still clinging to that and that's what's protecting it from being cleaved any smaller? That's right. You mentioned that obviously the first application of this was presumably looking for ways to figure out what the mutation was of a person with late-stage cancer without requiring a tissue biopsy. Presumably by this point it was very easy to gather hundreds of 160 base pair fragments and use the same sort of mathematics to reassemble them based on the few overlaps to say this is the
Starting point is 00:30:23 actual sequence because presumably the genes are much longer than 160 base pairs that they're looking at. Dr. Justin Marchegiani That's right. So by this point in 2014, 2015, the informatics was quite sophisticated. So you could take a large number of DNA sequences from fragments and easily determine which gene it was associated with. Dr. Justin Marchegiani At some point, I recall in here, I had a discussion on the podcast maybe a year and a half ago, two years ago with Max Dean, another one of our med school classmates, about looking at
Starting point is 00:30:59 recurrences in patients who were clinically free of disease. So you took a patient who's had a resection, plus or minus, some adjuvant chemotherapy, and to the naked eye and to the radiograph, they appear free of disease. And the question becomes, is that cancer recurring? And the sooner we can find out, the better our chance at treating them systemically again, because it's a pretty well-established fact in oncology that the lower the burden of tumor, the better the response, the lower the mutations, the less escapes, etc. And so did that kind of become the next iteration of this technology, which was, if we know the sequence of the tumor, can we go fishing for
Starting point is 00:31:45 that particular tumor in the cell-free DNA? Yeah, yeah. Broadly speaking, there's kind of three applications for looking at tumor DNA in the blood. One is screening, which we'll talk about later, which is people who don't have cancer or 99% who don't and trying to find the individual who has cancer and invasive cancer if it doesn't know it. There's this application of what we call therapy selection, which is you're a cancer patient trying to decide which targeted therapy would be best for you. And then you're this other one you mentioned is a third application we call often minimal residual disease. We're looking at
Starting point is 00:32:22 monitoring a response, which is your undergoing treatment and you want to know is the amount of tumor DNA in the blood undetectable and also its velocity is it changing because as you mentioned that could tell you is your treatment working the tumor DNA burden or load is going down. Is it undetectable and you're potentially cured that there's no longer that source of tumor DNA in your body? Or is it present even after a treatment with intensive cure and that the presence of that tumor DNA still means basically, and we appreciate this now. Unfortunately, you have not been cured, but that patient hasn't been cured because there is some nitrous tissue somewhere that still harvours these mutations and therefore
Starting point is 00:33:09 is the tumor, even if it's not detectable by any other means. So at what point does this company called Grail that we're going to talk about, at what point does it come into existence and what was the impetus and motivation for that as a distinct entity outside of Illumina? So there were several technological and scientific insights that came together along with as often in this case some really old entrepreneurs and investors. The use of this liquid biopsy technology in late stage cancers, it was clearly possible to sequence tumors from the blood. And it was clearly
Starting point is 00:33:52 actually the tumor DNA, and it was useful for cancer patients. So we knew that there was tumor DNA, we knew it could be done. But what the field didn't know is, could you to see this in early stage cancers, localized cancers that were small, not a lot of data on that, but there was the potential. There was also a really interesting, incidental set of findings in a completely different application called non-invasive prenatal testing.
Starting point is 00:34:19 Again, totally different application, but it was discovered, and so did we by a scientist in Hong Kong named Dennis Lowe, that you could see fetal DNA in the blood, or more specifically placental DNA in the blood. And it was also cell-free DNA. What he developed, actually, along with one of our professors at Stanford, Steve Quake, was a technique to look for trisomies in the blood based on this placental or feval DNA. This is called noninvasive prenatal testing.
Starting point is 00:34:52 What you do is you sequence the cell-free DNA fragments in a pregnant woman, you look at the DNA, and if you see extra DNA, for example, at the position of chromosome 21, well, that indicates that there are tissues in women, presumably the fetus or placenta that's giving off extra chromosome 21. And so this ended up being an incredibly sensitive and specific way to test for the presence of trisomies, chromosome 21, 1813 early in pregnancy. And it's had a tremendous impact. It was also involved in subsequent iterations of the test. In the United States, it decreased amniocentesis by about 80% because the test is so sensitive and specific as a screen that many, many women have now not had to undergo amniocentesis and the risks around them. Again, totally different applications of cell-free DNA, but what happened is during the early
Starting point is 00:35:51 commercialization of about the first few hundred tests, the company's pioneering this, and one of them was a company called Veranada that Illumina acquired, began to see in rare cases very unusual DNA patterns. It wasn't just a chromosome 21 or 18 or 13, but what's often called chromotrypsis, which is many, many abnerations across chromosomes. The two women who really did this analysis and really brought both the aluminum and the world's attention to it were Meredith
Starting point is 00:36:25 Hawks Miller, a pathologist and lab director at this aluminum-owned company, Varanada, and another bioinformatics scientist, Darya Chudovar. What they showed is ultimately that these women actually had cancer. They were young women of childbearing age. They ultimately had healthy children, but they had an invasive cancer and it was being diagnosed in their cell-free DNA by this noninvasive prenatal test. As they began to show these patterns to people, it became clear that they were clearly cancer. If you have many, many chromosomes that are abnormal, that's just not compatible with life or a fetus. If you have many, many chromosomes that are abnormal, that's just not compatible with life, the raffitis. And so when you saw this just genome-wide, chromosomal changes, it was very clear
Starting point is 00:37:12 that we're incidentally finding cancer in these women. Dr. Justin Marchegiani Let's talk a little bit about that actually, because I want to dig into that. It's so interesting. So let's take a step back. So again, whenever you say we're sampling for cell-free DNA, we should all be keeping in the back of our mind. We're looking for these teeny, tiny, little 160 base pair fragments wrapped around little nucleosomes.
Starting point is 00:37:34 Now, let's just go back to the initial use case around trisomy 21. With 160 base pairs, is that sufficient to identify any one chromosome? Presumably you're also sampling maternal blood, so you know what the maternal chromosomes look like, and you're presumably juxtaposing those two as your control. Is that part of it? Not quite.
Starting point is 00:37:57 So it's all mixed together. So in pregnant women's blood, maternal blood, it's a mixture. So you have cell-freeNA. The majority of the self-re-DNA is from her own cells and tissues. And then you have superimposed on that a bit of self-re-DNA from mostly the placenta. And so what you're seeing is this mix of self-re-DNA. And then what you do is you sequence, there's different ways to do it, but most common ways you do shotgun sequencing and you sequence millions of these fragments. Every time you sequence a fragment, you place it in a chromosome based on its sequence.
Starting point is 00:38:34 Your first fragment, you say, hey, when I compare this to the draft human genome, this goes on chromosome two. You sequence your second, third fragment and you say, hey, this sequence looks like chromosome 14. And you keep putting them in the chromosome buckets. And what you expect if every tissue has an even chromosome distribution, you know, or two chromosomes is that that profile would be flat and each bucket would be about the same level. But what you see in a woman carrying a fetus that has a trisomy... You'll see 50% greater in the Tromizome 21 bucket.
Starting point is 00:39:09 You actually see more like 5% or 10% because again, remember 90% of it might be maternal blood, right? So that's all going to be even. But within the 10% fetal, you're going to have an extra 50%. So the total might be an extra 5% or 10%. But that's a whopping big signal and very easy to detect. Isn't it interesting? It just gives a sense of how large the numbers are if a 5% Delta is an off the charts unmistakable increase in significance. I want to make sure again, people understand what you just said because it's very important. Because the majority of the cell-free DNA belongs to the mother and because the fetal slash placental cell-free DNA is a trivial amount, even though by definition a trisomy
Starting point is 00:39:55 means there is 50% more of one chromosome. You've gone from two to three copies. In the fully diluted sample, that might only translate to a few percent, but that's enough given the large numbers that you're testing to be a definitive statistically significant difference that triggers a positive test. Yeah, well, yes. Alex, I want to come back to the story because this is clearly the beginning of the story, but let's come back to just a couple other housekeeping items.
Starting point is 00:40:26 A moment ago, we talked about cell-free DNA in the context of tumor. Someone might be listening to us thinking, wait, guys, you just said that the majority of the cell-free DNA is from this mother. 99.9% of the time, she doesn't have cancer. Where is that cell-free DNA coming from? When cells are destroyed, either through necrosis or apoptosis, there's a lot of cell turnover, right, of cells that replicate, especially epithelial cells, blood cells, and so on. As the natural biochemistry destroys them, some of the DNA from the nucleus ends up in
Starting point is 00:41:00 circulation, again, where they're wrapped around these nucleosomes. So it's essentially cell death and cell turnover is the source of it. And since, again, at any one time, there's millions of cells dying and being turned over, there's always some base-level cell-free DNA in the blood. And again, I don't know if you've ever done the calculation. If not, I don't mean to put you on the spot. But do you have an approximate guess for how many base pairs of Cell-free DNA or floating around your body or my body as we sit here right now?
Starting point is 00:41:30 What I can say is if you took a 10 mil blood tube, which is a lot of what these tests use and you remove all the cellular DNA, remember there's a ton of DNA in the cells and certain. Sure. They're white blood cells, the red blood cells, etc. Get rid of all that. Yep. Huge amount. You probably have on the order of a few thousand cells worth of cell-free DNA in a 10-mil blood tube, which isn't a lot. Just to make sure I understand you, you're saying a few thousand cells worth. Each cell would be three billion base pairs. Yes. Yes. Wow. On the one hand, it doesn't sound like a lot because there are billions of cells. On the other hand, it still sounds like a lot.
Starting point is 00:42:12 That's still a big computational problem. Where it becomes challenging is when we get into early detection, right? Where if you think about it, for any position in the genome, you only have a few thousand representations of it because there's only a few thousand cells. And so that starts to limit your ability to detect events that occur at one in a million or one in a hundred thousand. Dr. Darrell Bock So Alex, do you recall these incident cases of the pregnant mothers? And again, I guess we should probably go back and re-explain that because it's such an important and profound discovery.
Starting point is 00:42:50 There were a handful of cases where in the process of screening for trisomies, they're discovering not that the mother has additional chromosomes that can be attributed to the fetus, but that she has significant mutations across a number of genes that also are probably showing up in relatively small amounts because they're not in all of her cells. Is that correct? Yeah. Yeah. So you might expect a flat pattern, right, in the majority of cases, or when the fetus has a trisomy, you see these very well-known accumulations, mostly in 21, but occasionally in 18 or 13. And instead, what you see is just increases and decreases monosomies and trisomies across many, many chromosomes, which is just not compatible with life even as a fetus.
Starting point is 00:43:46 But there is a biology where you do see these tremendous changes in the chromosomes, and that's often in the case of cancer. Dr. Justin Marchegiani Do you recall what those cancers turned out to be in those young women? I mean, I assume they were breast cancers, but they could have been lung cancers, anything? Dr. David M. N. Merydeth Yeah, so M. So Meredith and Daria, they published a paper in JAMA, which for anyone interested, details these 10 or so cases and what happened in each of them. It was a mix. I think there was a neuroendocrine, uterine, some GI cancers.
Starting point is 00:44:19 It was a smattering of different things. And what was the approximate year of that? We'll try to find that paper and link to it in the show notes. It was 2015 in Jammu. Got it. That seems unbelievable. Of course, one doesn't know the contrapositive. One doesn't know how many women had cancer but weren't captured. But is it safe to assume that the ten who were identified all had cancer? Yes, yes. So there were no false positives. We just don't know how many false negatives there were. Safe to assume that the 10 who were identified all had cancer? Yes.
Starting point is 00:44:45 Yes. So, there were no false positives. We just don't know how many false negatives there were. Right. Yeah. This was one of the things that contributed to the evidence that cancer screening might be possible using cell-free DNA, which is these incidental findings. As I mentioned earlier, we already knew that, yes, tumors do put cell pre-DNA into the bloodstream, but this was a profound demonstration that in actual
Starting point is 00:45:10 clinical practice, you could find undiagnosed cancers and asymptomatic individuals, and that it was highly specific, meaning that when it was found using this method, it almost ... Oh, I think in those initial ones, it was every case, but almost every case turned out to have cancer. Now, to your point, it's not a screening test because even in relatively healthy and women of childbearing age, a population of 100,000, you expect epidemiologically 10 times or so or 50 times that number of cancers over a year or so. So clearly, you're missing the majority of cancer, so it's not a screening test. Right.
Starting point is 00:45:51 It was just a proof of concept though. Yeah. An inadvertent proof of concept that really raised that aluminum, I think in the field, our attention of, hey, using cell-free DNA and sequencing-based methods, it might be possible to develop a very specific test for cancer. So, what was the next step in the process of systematically going after addressing this problem? Myself and some other folks in Illumina, along with the two scientists I mentioned, Meredith
Starting point is 00:46:19 and Daria, and then also in particular, the CMO at the time, Rick Klausner, who had a very long history in cancer research and in cancer screening, he was the previous NCI director under Bill Clinton. So that's the National Cancer Institute at the NIH under Bill Clinton, and he was the CMO at Illumina at the time. And we started to talk more and more about what would it take to develop or determine the feasibility
Starting point is 00:46:46 of a universal blood test for cancer based on this cell-free DNA technology. And being very first principle, I really asked the question, why is it in 50 years of many companies and a tremendous amount of academic research, no one had ever developed a broad based blood test for cancer, not just many cancers, let alone any cancer. Really the only example is PSA, and again the false positive rates there are so high that it's benefit to harm has been questioned many times
Starting point is 00:47:19 and that's why it doesn't have a USPS TF grade A or B anymore. And the fundamental reason is specificity. So there's lots of things that are sensitive, and that's why it doesn't have a USPS, TF grade A or B anymore. And the fundamental reason is specificity. So there's lots of things that are sensitive, meaning that there are proteins that accumulate, biochemistry, metabolites that go up in cancer, but the problem is they go up in a lot of benign conditions. So, you know, a big benign prostate spews out a lot of PSA, and pretty much every other protein or metabolite does that.
Starting point is 00:47:46 The biomarkers to date were all very sensitive, but all had false positive rates of say five or 10%. And so if you're imagining screening the whole population, you can't be working up one of 10 people for a potential cancer. And so the key technological thing to solve was, well, how do you have something that has a 1% false positive rate or a half percent false positive rate? Because that's what you need to get to if you want to do broad-based
Starting point is 00:48:16 cancer screening in relatively healthy asymptomatic people. And this is why we thought it might be possible with self-re-DNA because the tumor DNA could be more specific than proteins and other things that are common in benign disease. And so that was the reason to believe. The things we didn't know is, well, how much DNA does a early-stage tumor pump out? If it doesn't pump out any, well, there's nothing to detect. The other is the heterogeneity. Cancer is not like infectious disease or there's one very identifying antigen or sequence. Every tumor is truly unique, right? So even two lung cancers, they're both the same histological subtype, they can share very few mutations or none.
Starting point is 00:49:02 So you can have two squamous cell lung cancers that honestly don't have a single shared mutation. So now you need to look at hundreds or thousands or even of millions of positions to see enough potential changes. And this is where again, NGS was a really good fit, which is how do you overcome the heterogeneity that you need to now look for a disease that isn't defined. I can't tell you these three mutations are the ones you need to find for this cancer. There's a huge set of different ones for every cancer. And then that got us thinking, well, look, in addition to sequencing many positions
Starting point is 00:49:38 and sequencing very deeply and using cell-free DNA, we were going to need to use AI or machine learning, because we had to learn these complex associations and patterns that no human being could curate thousands of different mutational profiles and try to find the common signals and so on. What emerged over the course of a year is, look, this might be possible, but we're going to have to enroll very large populations just to study and find the signals and develop the technology. And then we're going to need very large studies to actually do interventions and prove it clinically valid that it actually works. We're going to have to use NGS and sequence broadly across the whole genome, and only
Starting point is 00:50:23 then might it be possible. And so the company at the time decided, and this was a board-level decision, that ultimately, this made more sense as an independent company, given the amount of capital that was going to be required, given the scientific and technical risk, given the kind of people that you would need to recruit who are passionate about of people that you would need to recruit who are passionate about this, that it made sense to do as a separate company. And so the CEO at the time, Jay Flatley, in early 2016, announced the founding of the company and then spinning it out of Illumina.
Starting point is 00:50:58 And I had the honor of being one of the co-founders of it. Let's go back to 2016. You guys are now setting up NewShop. You've got this new company. It's called Grail. You've brought over some folks like you from Illumina and presumably you're now also recruiting. What is the sequence of the first two or three problems you immediately get to work on? As I wrote the starting research and development plan. The way I wrote it was we needed to evaluate every potential feature in cell-free DNA, meaning that any known method of looking for cancer in cell-free DNA, we needed to evaluate that if we were going to do this and recruit these cohorts and all these altruistic individuals
Starting point is 00:51:43 and we were going to spend the money to do this. We needed to not look at just one method or someone's favorite method or whatever they thought might work. We needed to look at every single one. That's what we did. We developed an assay and software for mutations and then a bunch of other things, chromosomal changes, changes in the fragment size, and many others. And we said, look, we're going to test each one of these head to head, and we're going to test them in combination. And we're going to figure out the best way to do this. We even had a mantra that
Starting point is 00:52:14 Rick came up with that I thought was very helpful, which is we're either going to figure out how to do this, or we're going to prove it can't be done. I think that was very helpful in thinking about how to do these initial experiments. So it was a lot of building these assays. We needed a massive data sets to train the machine learning algorithm. So we had this study called the CCGA, the Circulating Self-Regino Atlas, where we recruited 15,000 individuals with and without cancer of every major cancer type, in most cases, hundreds. And then we tested all of these different methods, the ones I mentioned, and also importantly, a methylation-based assay. And we did blinded studies to compare them and see, could any of them detect a large fraction of the cancers?
Starting point is 00:52:58 Did any of them have the potential to do it at high specificity? Because that's what we would need if we were going to develop a universal test for cancer that could be used in a broad population. So let's kind of go back and talk about a few of those things there because there was a lot there. So you set up front, look, we're going to make sure that any measurable property of cell-free DNA, we are measuring. We are quantifying it.
Starting point is 00:53:24 We are noting it. We talked about some of them, right? So fragment length, that seems relatively fixed, but presumably at large enough sample size, you're gonna see some variation there, does that matter? The actual genetic sequence, of course, that's your bread and butter, you'd be able to measure that. You also mentioned, of course,
Starting point is 00:53:43 something called methylation, which we haven't really talked about yet, so we should explain what that means. Were there any other properties besides fragment length, sequence, and methylation that I'm missing? There were several others. One was chromosomal changes. So as we mentioned in cancer, the numbers of chromosomes often changes. So many cancers, and this is wild, but often double the number of chromosomes. So you can go from 23 to double or even triple the number, but these chromosomes are not normal. So you'll often have arms or the structures of chromosomes will get
Starting point is 00:54:19 rearranged. And so there's a way to look at that also in the cell-free DNA. Like as we mentioned in the noninvasive prenatal testing, we look at the amount of DNA per chromosome or per part of chromosome. So we looked at what's called these chromosomal abnormalities. We also looked at cell-free RNA. So it turns out there's also RNA from tumors in circulation. How stable is that, Alex? I was under the impression that RNA wouldn't be terribly stable, unlike DNA, which of course is double-stranded and quite stable.
Starting point is 00:54:51 How do you capture cell-free RNA? So naked RNA is not very stable. However, there are proteins, one type is called an Argonaute protein, and if the RNA is bound to one of them, it is protected. I assume this is typically messenger RNA that's been transcribed, but somewhere along the way before translation occurs, there's a disruption to the cell that results in lysis or something. And you're just basically getting the cell-free RNA because
Starting point is 00:55:24 you happen to catch it at that point. It was a replicating cell or something, or it was just translating protein. Yeah. Or during apoptosis, somehow during some kind of programmed cell death, it's being digested or bound. The amount relative to the amount of cell death is low. So presumably, most of the RNA is destroyed, but enough of it does get protected and bound to proteins. Whether or not it's cellular detritus or garbage or it's intentional is kind of a different question, but it is present. There are also vesicular structures, so little bubbles of membrane that the RNA can be contained in.
Starting point is 00:56:02 The most common one is referred to as an exosome, which are these little vesicles in circulation. So, in a variety of different ways, you can have messenger RNA and other types of RNA preserved outside of cells in circulation. And so, we looked at that also. How long did it take to quantify all of these things? And presumably, I think you sort of alluded to this, but we're not just looking at any one of these things. You're also asking the question, can combinations of these factors add to the fidelity of the test, correct?
Starting point is 00:56:34 Yeah. So, this initial research phase took close to three years, cost hundreds of millions of dollars. We had to recruit the largest cohort ever for this type of study, the CCGA study as I alluded to. And there were different phases. There was a discovery and then multiple development and validation phases. We had to make the world's best assays to look at each of these features.
Starting point is 00:56:59 And then we had to process all of those samples and then analyze them. And we did it in a very rigorous way where the final testing was all done blinded and the analysis was all done blinded. So we could be sure that the results were not biased. And then we compared them all and we also compared them in combinations. And we use sophisticated machine learning approaches
Starting point is 00:57:21 to really maximize the use of each individual type of data from each, whether or not it was mutations or the chromosomal changes or methylation. So, you mentioned that the CCGA had 15,000 samples. How many of those samples were cancers versus controls? What was the distribution of those? It was about 60% cancer and 40% controls. You sort of alluded to it, but just to be sure I understood, you're obviously starting first with a biased set where you know what's happening and then you're moving to a blinded unbiased set for confirmation.
Starting point is 00:57:54 Is that effectively the way you did it? Yeah. Yeah. So often referred to as a training set and a test set. Tell us what emerged, Alex, when it was all said and done, when you had first and foremost identified every single thing that was measurable and knowable. Sorry, before we do that, I keep taking us off methylation. Explain methylation; of all the characteristics, that's the one I don't think we've covered
Starting point is 00:58:19 yet. So DNA methylation is a chemical modification of the DNA. So in particular at the C in the ATCG code, the C stands for a cytosine. So that's a particular nucleotide or base in DNA. Mammalian biology can methylate it, meaning it can add a methyl group. A methyl group is just a single carbon atom with three hydrogens, bonded to that cytosine. And so that's what DNA methylation is.
Starting point is 00:58:47 So to say a cytosine is methylated, it means that it has that single methyl group bonded to it. It turns out that there's about 28 million positions in the human genome that can be methylated. It usually occurs at what's called CpG sites, which is where, if you go along one strand of DNA (not the pairing across strands, but a single strand), a G follows a C. So that's what a CpG is. It's a C with a phosphate bond to a G. And so at those positions in the genome, there are enzymes that can methylate the cytosine and demethylate it. And there's again about 28 million of those sites out of the 3 billion overall bases in
Starting point is 00:59:32 the human genome. These chemical modifications are really important because they affect things like gene expression. It's one of the more important classes of something that's called epigenetics, which is changes that are outside of the genetics or outside of the code itself. As you know, the DNA code is the same in most cells of the human body. Obviously the cells are quite different. So a T cell is very different than a neuron.
Starting point is 00:59:55 And other than the T cell receptor, all of the genes are the same. The code is the same. So why are the cells different? Well, it's the epigenetics. So things like which parts of the genome are methylated or which ones are associated with histones that are blocking access to the DNA, that's what ultimately determines which genes are transcribed, which proteins are made, and why cells take on very different morphology and properties. The methylation is a very fundamental code for controlling it. So I call the epigenetics the software of the genome. The genetic code is kind of the hardware, but how you use it, which genes you use when, what combination, that's really the epigenetics. What is the technological delta or difference in reading out the methylation sequence on
Starting point is 01:00:49 those CpG sites relative to the ease with which you simply measure the base pair sequences? You can measure C, G, A, T, C, C, A, T, G, etc. But then in the same readout, do you also acquire which of the C's are methylated or are you doing a separate analysis? There are different technologies to do that. For cell-free DNA, you usually want very accurate sequencing of billions of these or many hundreds of millions of these small fragments. The way it's done, and this adds complexity to the chemistry, is you pre-treat the DNA in a way that encodes the methylation status in the ATCG sequence,
Starting point is 01:01:33 and then you just use a sequencer that can only see ATCG, but because you've encoded the information, you can then deconvolute it and infer which sites were methylated. Just to be a little more specific, there are chemicals that will, for example, deaminate a cytosine that's not methylated. And then that deaminated cytosine effectively turns into a uracil, which is a fifth letter in RNA. And then when you copy the DNA and you amplify it for sequencing, it amplifies as a T because a U, when it's copied by a DNA polymerase, becomes a T. And then you end up with a sequence where you expect to see Cs and you see a T. And if you see a T there, then you know that, aha, this must have been an unmethylated site.
Starting point is 01:02:21 That came from a U and the U is an unmethylated C. Brilliant. Right. And if the C was not changed, then you say that must have been a site that was methylated. Because you'll see Gs opposite them. Oh, sorry. If the C was methylated, you'll see the G opposite because you won't turn it into a uracil. Right. Right. Yeah. Brilliant. That technique is called bisulfite sequencing. There are other ways to do it, but that's the predominant method.
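To make the bisulfite logic just described concrete, here is a minimal sketch in Python. The reference sequence, the read, and the coordinates are invented for illustration; a real pipeline also has to handle alignment, the opposite strand, non-CpG cytosines, and sequencing errors.

```python
def cpg_sites(ref: str):
    """Positions where a C is immediately followed by a G on the same strand."""
    return [i for i in range(len(ref) - 1) if ref[i] == "C" and ref[i + 1] == "G"]

def call_methylation(ref: str, converted_read: str):
    """
    After bisulfite treatment and PCR, an unmethylated C reads out as T
    (C -> U -> T), while a methylated C is protected and still reads as C.
    Comparing the read to the reference at each CpG recovers the state.
    """
    calls = {}
    for pos in cpg_sites(ref):
        base = converted_read[pos]
        if base == "C":
            calls[pos] = "methylated"      # protected from conversion
        elif base == "T":
            calls[pos] = "unmethylated"    # converted to uracil, amplified as T
        else:
            calls[pos] = "ambiguous"       # mismatch or sequencing error
    return calls

# Toy example: CpG sites at positions 1, 5, and 9 of this made-up reference.
reference = "ACGTTCGATCGAA"
read      = "ACGTTTGATCGAA"   # the C at position 5 was unmethylated, so it reads as T
print(call_methylation(reference, read))
# {1: 'methylated', 5: 'unmethylated', 9: 'methylated'}
```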
Starting point is 01:02:46 All right. So now back to the question I started to ask a minute ago, but then realized we hadn't talked about methylation. So you've come up with all these different things that you can do with this tiny amount of blood, because again, you talk about 10 ml, you know, in the grand scheme of things, that's a really small amount of blood. That's two small tubes of blood. Very easy to do.
Starting point is 01:03:07 Presumably, there was an optimization problem in here where you min-max this thing and realize, well, look, this would be easy to do if we could take a liter of blood, but that's clinically impossible. Yeah, it would be nice to Theranos this, quote, unquote, and do this with a finger stick of blood, but you're never going to get the goods. So did you end up at 10 ml? Was it just sort of an optimization problem that got you there as the most blood we could take without being unreasonable, but yet still have high enough fidelity?
Starting point is 01:03:36 And maybe asked another way, can you get better and better at doing this if you were taking eight tubes of blood instead of two? Yeah. There's a couple considerations. One is the practical one. You need a format that works with standard phlebotomy and standard volumes, below the volumes at which you could put someone in jeopardy; that's a big practical issue. But it actually turned out that what ultimately limited the sensitivity of the test was the background biology. So for broad-based cancer screening,
Starting point is 01:04:07 more blood would actually not help you. Now there are other applications, for monitoring or therapy selection, where you're looking for a very particular target. If someone has cancer, you know what kind of cancer, and there you could improve your sensitivity. But just for cancer screening, you're usually not limited by the amount of blood.
Starting point is 01:04:29 And so did methylation turn out to be the most predictive element at giving you that very, very high specificity, or was it some combination of those measurable factors? Yeah. So it was pretty unexpected. I would say going into it, most people thought that the mutations were going to be the most sensitive method. Some of us thought that the chromosomal changes were going to be the most sensitive. I would say the methylation signals were kind of a dark horse. I had to fight several times to keep it in the running. But again, we really took the approach of letting the data
Starting point is 01:05:05 tell us what the right thing to do is, not biases from other experiments; let's do this in a comprehensive, rigorous way. And in the end, the methylation performed by far the best. So it was the most sensitive, so it detected the most cancers. Importantly, it was very specific. It actually had the potential
Starting point is 01:05:23 and ultimately did get to less than 1% false positive rate. Then the methylation had this other feature, which was very unique, which was that it could predict the type of cancer. What was the original, what we call now the cancer site of origin? What organ or tissue did it originate from? Interestingly, adding them all together didn't improve on the methylation. I can explain why, and now in hindsight,
Starting point is 01:05:49 you might have thought, hey, more types of information and signal are better, but it actually didn't. So we ended up with one clear result that the methylation patterns in the cell-free DNA were the most useful information and adding other things was not going to help the performance.
Starting point is 01:06:07 And why do you think that was? Because it is a little counterintuitive. There are clearly examples I could probably think of where you can degrade the signal by adding more information. But I'm curious if you have a biologic, teleologic explanation for why one and only one of these metrics turned out to be the best and any additional information only diluted the signal. It comes down to a good engineering principle, right? If you want to improve your prediction, you need an additional signal that carries information
Starting point is 01:06:41 and is independent from your initial signal. If it's totally correlated, then it doesn't actually add anything. Let's take an analogy. Let's say you're on a freeway overpass, and you're developing an image recognition for Fords. You say, okay, what I'm going to start initially with is an algorithm. It's going to look for a blue oval with the letters F-O-R-D in it.
Starting point is 01:07:03 That's pretty good. Now let's say you say, okay, I know that some Fords also have the number 150 on the side, F-150. So I'm going to add that, right? If you think about it, if your algorithm based on the blue oval is already pretty good, adding the 150 is not gonna help because whenever the 150 occurs, the blue oval is also always there. Now, if the blue oval wasn't always there or there were
Starting point is 01:07:32 Fords that didn't have the blue oval, then some other signal could be helpful. And so that's kind of what ended up happening is that the methylation signal was so much more prevalent and so much more distorted in cancer that everything else didn't really add, because anytime you could see one of the others, you could also see many more abnormal methylation fragments. Yeah. That's really fantastic. I guess I also want to again just go back and make sure people understand the mission
Starting point is 01:08:03 statement you guys brought to this, which was high specificity is a must. So people have heard me on the podcast do this before, but just in case there are people who haven't heard this example or forget it, I sometimes like to use the metal detector analogy in the airport to help explain sensitivity and specificity. So sensitivity is the ability of the metal detector to detect metal that should not go through. And let's be clear, it's not that people in the airport care if your phone is going through or your laptop or your watch
Starting point is 01:08:37 or your belt, they care that you're bringing guns or explosives. That's why we have metal detectors or knives or things of that nature. That's why we have metal detectors or knives or things of that nature. That's why the metal detector exists. It has to be sensitive enough that no one carrying one of those things can get through. On the other hand, specificity would say, so if you're optimizing for sensitivity, you make it such that you will detect any metal that goes through that thing. And by definition, you're going to be stopping a lot of people. You're going to stop everybody from walking through.
Starting point is 01:09:11 If their zipper is made of metal, you'll stop them. For prosthetic or a big belt or boots or anything. You got a little metal on your glasses, you're going to get stopped. So you have to dial the thing in a way so that you have some specificity to this test as well, which is I can't just stop everybody. In an ideal world, I kind of want everyone to make it through who's not carrying one of those really bad things and we're defining bad thing by a certain quantity of metal.
Starting point is 01:09:40 And therefore, your specificity is to kind of say, I don't want my test to be triggered on good guys. All right, I want my test to be triggered on bad guys. Now, when you guys are designing a test like this, like the Grail test, I guess I should just go back and state, anybody who's ever been through two different airports wearing the exact same
Starting point is 01:10:07 clothing and realizes sometimes it triggers, sometimes it doesn't, what you realize is not every machine has the same setting. And that's because the airport, the people at TSA, they turn up or turn down the sensitivity and that changes the specificity as well. How deliberately do you, when you're setting up this assay have the capacity to dial up and down sensitivity and specificity. So while I understand your mandate was a very high specificity test, where was the control or manipulation of that system, if at all? So there's a threshold. It's complex. Conceptually, there's a threshold inside the algorithm, right? So you can imagine that after you have this comprehensive map of all these different types of methylation changes that can occur in the fragments of hundreds of examples of every
Starting point is 01:10:56 cancer type, and then you compare it to all the methylation changes that can occur outside of cancer, which we haven't talked about, which is very important. So most of the methylation patterns are pretty similar in similar cell types across individuals, but there are changes that occur with age or ethnicity or environmental exposure and so on. What you'd like is those two populations to be completely different, but it turns out there is some overlap. So there are fragments that occur in cancer that can occur outside of cancer. The algorithm in a very complex state space is trying to separate these populations. And whether or not you're going to call something as a potential cancer and say a cancer signal
Starting point is 01:11:43 is detected is whether the algorithm thinks it is associated with the cancer group or with the non-cancer group. But again, there's some overlap between these, and so where you set that boundary, the border between individuals who don't have cancer but who for whatever reason have an abnormal level of fragments that look kind of cancerous, that will determine your specificity. So there is a dial to turn where you can increase the stringency and reduce your false positives, but then you will start to miss some of the true positives. Now, what was so great about methylation is that these populations were pretty well separated, better than anything the world had ever seen before, which is why you could get high specificity and still pretty good sensitivity. But yes, there is some overlap, which means you have to make a trade-off and dial it in.
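The "dial" Alex describes is just a threshold on a classifier score. Here is a minimal sketch of that trade-off with an entirely made-up score distribution, not anything resembling the actual Galleri model: you pick the threshold off the non-cancer population to fix the false-positive rate, and the sensitivity is whatever falls out of where the cancer population sits relative to that threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical classifier scores; the overlap between the two groups is the point.
non_cancer = rng.normal(0.0, 1.0, size=200_000)   # people without cancer
cancer     = rng.normal(3.0, 1.5, size=2_000)     # people with cancer (shifted but overlapping)

def operating_point(target_specificity: float):
    """Choose the threshold that yields the target specificity on non-cancer scores,
    then report the sensitivity that choice implies for the cancer scores."""
    threshold = np.quantile(non_cancer, target_specificity)
    sensitivity = float((cancer >= threshold).mean())
    return threshold, sensitivity

for spec in (0.95, 0.99, 0.995):
    thr, sens = operating_point(spec)
    print(f"specificity {spec:.1%} -> threshold {thr:.2f}, sensitivity {sens:.1%}")
# Raising the specificity target pushes the threshold up and trades away sensitivity;
# how painful that trade is depends on how well separated the two populations are.
```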
Starting point is 01:12:39 Inside the company, is there sort of a specific discussion around the trade-offs of it's better to have a false positive than have a false negative? Let's use the example you brought up earlier. Prostate-specific antigen is kind of the mirror image of this. It's a highly, highly sensitive test with very low specificity. It's obviously a protein, so it's a totally different type of assay. It's a far cruder test, of course. But the idea is, in theory, and of course I could give you plenty of examples, someone
Starting point is 01:13:11 with prostate cancer is going to have a high PSA. So you're not going to miss people with cancer. But as you pointed out earlier, you're going to be catching a lot of people who don't have cancer and it's for that reason as you said, there is no longer a formal recommendation around the use of PSA screening. It has now kind of been relegated to the just talk to your doctor about it. And of course, the thinking is look, there are too many men that have undergone an unnecessary prostate biopsy on the basis of an elevated PSA that really should have been attributed
Starting point is 01:13:43 to their BPH or prostatitis or something else. So notwithstanding the fact that we have far better ways to screen for prostate cancer today, that's a test that is highly geared towards never missing a cancer. In its current format under low prevalence populations, which is effectively the population it's being designed for. This is designed as a screening tool. It seems to have better negative predictive value than positive predictive value, correct? It's pretty high in both because negative predictive value also is related to prevalence. We'll just put some numbers out there. In
Starting point is 01:14:21 the CCGA study, but then importantly in an interventional study called Pathfinder, a positive predictive value is around 40%. That's all stages? Yeah. So that's all cancers, all stages. It's a population study. So it's whatever natural set of cancers and stages occur in that group. So that was about 6,500 individuals.
Starting point is 01:14:43 Do you recall, Alex, what the prevalence was in that population? Was it a low risk population? Yeah. So it was a mix of a slightly elevated risk population and then an average risk population. Just in terms of risk, and I think you'll appreciate this, I think of anyone over 50 as high risk. And that's where the majority of these studies are happening. So age is your single biggest risk factor for cancer. The population over 50 is at about a 10x increased risk relative to the population under 50. And age 55 to 65 is the decade where cancer is the number one cause of death. I would say in developed nations, I mean, that's actually increasing, right? I mean,
Starting point is 01:15:27 we're making such incredible progress on metabolic disease and cardiovascular disease. Cancer in the developed world is predicted to surpass cardiovascular disease as the number one killer. Anyway, older populations are at, I wouldn't call them low risk, I'd call them average risk for that age group, which is still relatively high for the overall population. But it was a mix, prevalence a bit less than 1%. Some of these studies do have a healthy volunteer bias. In a 6,500-person cohort with a prevalence of 1%, which is pretty low, the positive predictive value was 40%.
Starting point is 01:16:05 No, that's right. What was the sensitivity for all stages then? It must have been easy to calculate if I had my spreadsheet in front of me, but it's got to be 60% or higher. Sensitivity and specificity has got to be close over 99% at that point, right? Those are the rough numbers. Yeah, that's right. Some of the important statistics point, right? Those are the rough numbers. Yeah, that's right. Some of the important statistics there, right?
Starting point is 01:16:26 So about half of the cancers that manifested over the lifetime of the study were detected by the test. The test actually doubled the number of cancers detected in that interventional study, compared to standard of care screening alone. In the interventional study, the Pathfinder study, the enrollees were getting standard of care screening according to guidelines. So, mammography, importantly cervical cancer screening, and then colonoscopies or stool-based testing based on guidelines. And so, a number of the cancers that the Grail Gallery test detected were also detected by standard of care, which you would expect.
Starting point is 01:17:05 But the total number of cancers found was about doubled with the addition of the Gallery test. And those were predominantly cancers for which there isn't a screening test. But just going back to the positive predictive value, the positive predictive value of most screening tests is low single digits. You probably have the experience more than I have, but many, many times a female colleague, friend, or someone's wife calls and said, you know, I got a mammography, they found something, I'm going to have to go for
Starting point is 01:17:34 a follow up, a biopsy, and so on. And literally 19 times out of 20, it's a false positive. That's one where we've accepted, for better or worse, a huge false positive rate to catch some cancers, right? And that's why there's a fair amount of debate around mammography. But again, that's a positive predictive value of about 4.5%. The vast majority of people who get an initial positive, they're not going to end up having cancer, but still potentially worth it. Now we're talking about something where roughly one in two positive tests will ultimately lead to a cancer diagnosis that's potentially actionable.
Starting point is 01:18:14 So I think sometimes when people hear 40%, they say, gee, that means there's still a fair amount of people who are going to get a positive test, meaning a cancer signal detected and ultimately not, but again, for a screening test, that's incredibly high yield. I think another way to think about that is to go back to the airport analogy. So this is a metal detector that is basically saying, look, we're willing to beep at people who don't have knives to make sure everybody with a knife or gun gets caught. So the negative predictive value is what's giving you the insight about the bad guys. So a 40% positive predictive value means, let's just make the numbers even simpler.
Starting point is 01:19:01 Let's say it's a 25% positive predictive value. It means for every four people you stop, only one is a true bad guy. Think about what it's like in the actual airport. How many times in a day does the metal detector go off and how many times in a day are they catching a bad guy? The answer is it probably goes off 10,000 times in a day and they catch zero bad guys on average. So that gives you a sense of how low the positive predictive value is and how high the sensitivity is and how low the specificity is. So yes, I think that's a great way to look at it, which is if you are screening a population that is of relatively normal risk, a positive predictive value of 20% is very, very good.
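To put numbers on the sensitivity, specificity, and predictive-value relationship being described, here is a small worked sketch. The inputs are illustrative round figures in the ballpark of what was quoted (prevalence around 1%, sensitivity around 50%, specificity above 99%), not the actual Pathfinder or CCGA data, and the PSA-like profile at the end is likewise a caricature for contrast.

```python
def predictive_values(prevalence: float, sensitivity: float, specificity: float, n: int = 100_000):
    """Plain Bayes bookkeeping for a screening test applied to n people."""
    with_cancer    = n * prevalence
    without_cancer = n - with_cancer
    tp = with_cancer * sensitivity            # true positives
    fn = with_cancer - tp                     # missed cancers
    fp = without_cancer * (1 - specificity)   # false alarms
    tn = without_cancer - fp
    return tp / (tp + fp), tn / (tn + fn)     # (PPV, NPV)

ppv, npv = predictive_values(prevalence=0.01, sensitivity=0.50, specificity=0.995)
print(f"high-specificity screen: PPV ~ {ppv:.0%}, NPV ~ {npv:.1%}")   # ~50% and ~99.5%

ppv, npv = predictive_values(prevalence=0.01, sensitivity=0.95, specificity=0.80)
print(f"PSA-like profile:        PPV ~ {ppv:.0%}, NPV ~ {npv:.1%}")   # ~5% and ~99.9%
```

The first case shows why a positive predictive value in the 40 to 50 percent range is remarkable for a population screen at roughly 1 percent prevalence, and the second shows how a very sensitive but unspecific test ends up stopping mostly "good guys," exactly as in the metal detector analogy.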
Starting point is 01:19:51 It also explains, I think, where the burden of responsibility falls to the physician, which is, as a physician, I think you have to be able to talk to your patients about this explicitly prior to any testing. I think patients need to understand that, hey, there's a chance that if I get a positive test here, it's not a real positive. I have to have kind of the emotional constitution to go through with that, and I have to be willing to then engage in follow-up testing. Because if this thing says, oh, you know, Alex, it looks like you have a lung cancer, the next step is I'm going to be getting a chest x-ray or I'm going to be getting a low-dose CT of my chest. And that doesn't only come with
Starting point is 01:20:35 a little bit of risk, in this case radiation, although it's an almost trivial amount. But I think more than anything, it's the risk of the emotional discomfort associated with that. And I think honestly, when you present the data this way to patients, they really understand it and they really can make great informed decisions for themselves. And by the way, for some of those patients, it means thank you, but no thank you. I just don't want to go through with this and that's okay too. Let's talk a little bit about some of the really interesting stuff that emerged in the various histologies and the various stages. And I've had some really interesting discussions
Starting point is 01:21:09 with your colleagues. I guess just for the sake of completing the story, you're no longer a part of the company Grail. Maybe just explain that so that we can kind of get back to the Grail stuff, but just so that people understand kind of your trajectory. We should do that. Yes, I was at Illumina and then I helped spin off Grail as a co-founder, led the R&D and clinical development. I actually went back to Illumina as the chief technology officer, running all of the company's research and development. Really, really fantastic, fun job. Subsequently, Illumina acquired Grail, which became a wholly owned subsidiary of Illumina. That was almost three years ago. Recently, I left Illumina to start a new company, a really interesting biotech company that I'm the CEO of. I'm no longer actively involved in either company. I have great
Starting point is 01:22:00 relations with all my former colleagues, excited to see their progress. I should also say that I am still a shareholder of Illumina, just for full disclosure. Yeah, thank you. You have a number of colleagues as you said who are still at Grail who I've gotten to know. And one of the things that really intrigued me was again some of the histologic differences and the stage differences of cancer. So if you look at the opening data, a few things stood out. So there were certain histologies that if you took them all together by stage, didn't look as good as others. So for example, talk a little bit about prostate cancer detection using the gallery test. So I think what you're referring to is there's a very wide variety of different performances across different cancers.
Starting point is 01:22:53 So they're all highly specific, so very low false positive rate, because there's only one false positive rate for the whole test, which is probably worth spending some time on later. But for example, for sensitivity: for many GI cancers or certain histologies of lung cancer, the test is very good at detecting earlier stage localized cancers. In prostate cancer and hormone receptor positive breast cancer in particular, the detection rate is lower for stage one cancers. But this gets to a very important issue, which is what is it that you want to detect? So do you want to detect everything
Starting point is 01:23:30 that's called a cancer today? Or is it that you want to detect cancers that are going to grow and ultimately cause harm? So the weird thing about cancer screening in general is there's both over and under diagnosis. Most small breast cancers and most DCIS and most even small prostate cancers
Starting point is 01:23:48 will never kill the patient or cause morbidity, but there is a small subset that will. And so for those, we have decided to, again, go for a trade-off where we'll often resect things and go through treatments just to make sure that smaller percentage is removed even though we're removing a ton of other quote cancers that are unlikely to ever proceed into anything dangerous.
Starting point is 01:24:13 On the flip side, 70% of people who die of cancer, they die from an unscreened cancer. So there's huge under diagnosis. You should remember that 70% of people who ultimately die of cancer on their death certificate, they die from a cancer where there is no established screening prior to something like Grail's Gallery. So we have this weird mix of there's a lot of cancers where we know we're overdiagnosing, but we're doing it for a defensible trade-off.
Starting point is 01:24:42 And then there's a huge number of cancer deaths occurring where there's essentially zero diagnosis. But back to the ones where there's under diagnosis, it gets back to what does it mean to have tumor DNA in your blood? So measuring and detecting a cancer from tumor DNA in your blood is a functional assay. To get tumor DNA in your blood, you have to have enough cells, they have to be growing fast enough, dying fast enough, and have blood access. So those are the things that are required. Now, if you have a tumor that's small, encapsulated, not growing, well, guess what? It's not going to have DNA in the blood. So unlike an imaging assay, which is anatomical,
Starting point is 01:25:24 this is really a functional assay. You're querying for whether or not there's a cancer that has the mass, the cell activity and death, and access to the blood to manifest its DNA into the blood. So, it's really stratifying cancers on whether or not they have those activities. Now interestingly, this functional assay is very correlated with ultimate mortality. There's a really nice set of data that Grail put out where you look at Kaplan-Meier curves. So over the course of the CCGA study, which is now going out, I don't know, five plus years, you can say, well, what do survival curves look like if your test was positive, your cancer was detected, versus your test was negative, meaning your cancer was not
Starting point is 01:26:11 detected by the Grail test. And there's a big difference. So basically, if your cancer was undetectable by the Grail test, you have a very good outcome, much, much better than the general population with that cancer. So this suggests two things. One is that those cancers may not have actually been dangerous because there's not a lot of mortality associated with them. And maybe that's also why they couldn't put their tumor DNA in the blood. The other is whatever the existing standard of care is, it's working well. Now, if you look at all the cancers in the Kaplan-Meier curve that were detected, they
Starting point is 01:26:48 have a lot of mortality associated with them. And so what it's showing is that it's the dangerous cancers, the cancers that are accounting for the majority of mortality. Those are the ones that the test is detecting. This biological rationale makes a lot of sense, which is, okay, a tumor that grows fast, you get its DNA in the blood, well, that's probably also a dangerous tumor that is going to become invasive and spread. So again, it's a functional assay. So if your test is detected by one of these tests, like the gallery test, it's saying something about
Starting point is 01:27:21 the tumor that is very profound, which is that it's active enough to get its signal into the blood. It's very likely, if untreated, to ultimately be associated with morbidity and potentially mortality. I think it's an open question, for these tumors that aren't detectable and that are in cancers where we know there's a lot of indolent disease, what it really means that the test has low sensitivity for them. Yeah, I would say that when I went through these data and I went through every single histology by stage, I did this exercise probably 18 months ago, the one that stood out to me more than any other was the sensitivity and specificity discrepancy, well, I should say the sensitivity discrepancy between triple
Starting point is 01:28:15 negative breast cancer and hormone positive breast cancer. You alluded to this, but I want to reiterate the point because I think within the same quote-unquote disease of breast cancer, we clearly understand that there are three diseases. There's estrogen positive, there's HER2/neu positive, there's triple negative. Those are the defining features of three completely unrelated cancers, with the exception of the fact that they all originate from the same mammary gland. But that's about where the similarity ends. Their treatments are different, their prognoses are different. And to take the two most extreme examples, you take a woman who has triple positive breast cancer,
Starting point is 01:28:55 i.e., it's estrogen receptor positive, progesterone receptor positive, HER2/neu positive. You take a woman who has none of those receptors positive. The difference in the gallery test performance on stage one and stage two, so this is cancers that have not even spread to lymph nodes: the hormone positives were about a 20% sensitivity for stage one, stage two, and the triple negative was 75% sensitivity for stage one, stage two. And so this underscores your point, which is that the triple negative cancer is a much, much worse cancer. And that at stage one, stage two,
Starting point is 01:29:36 you're detecting it with 75% sensitivity portends a very bad prognosis. Now, I think the really important question here, and I believe that this is being asked, is does the ability to screen in this way lead to better outcomes? So, I will state my bias because I think it's important to put your biases out there, and I've stated it publicly many times, but I'll state it again. My bias is that yes it will. My bias is that early detection leads to earlier treatment.
Starting point is 01:30:13 And even if the treatments are identical to those that will be used in advanced cancers, the outcomes are better because of the lower rate of tumor burden. And by the way, I would point to two of the most common cancers as examples of that, which are breast and colorectal cancer, where the treatments are virtually indistinguishable in the adjuvant setting versus the metastatic setting, and yet the outcomes are profoundly different.
Starting point is 01:30:41 In other words, when you take a patient with breast or colorectal cancer and you do a surgical resection and they are a stage three or less and you give them adjuvant therapy, they have far, far, far better survival than those patients who undergo a resection but have metastatic disease and receive the same adjuvant therapy. It's not even close. And so that's the reason that I argue that the sooner we know we have cancer and the sooner we can begin treatment, the better we are. But the skeptic will push back at me and say, Peter, the only thing the Grail test is going to do
Starting point is 01:31:16 is tell more people bad news. So we'll concede that people are going to get a better, more relevant diagnosis, that we will not be alerting them to cancers that are irrelevant and overtreating them, and we will alert them to negative or more harmful cancers, but it won't translate to a difference in survival. So what is your take on that and how can that question be definitively answered? It's a very important question.
Starting point is 01:31:48 Over time, it will be definitively answered. So we should talk about some of Grail's Grail studies and how they're going about it. So the statistics are very profound, like you said. So most solid tumors, five-year survival when disease is localized, hasn't spread to another organ, 70 to 80% 5-year survival, less than 20 per metastatic stage 4 disease. That correlation of stage it diagnosis versus 5-year survival is night and day. And obviously, everyone would want them and their loved ones, most people in the localized disease category. Now there's an academic question like you're saying, which is, most people in the localized disease category.
Starting point is 01:32:26 Now, there's an academic question like you're saying, which is, okay, well, that's true, but does that really prove that if you find people at that localized disease through this method, as opposed to all the variety of methods that happen today incidentally, that you will have the same outcome? And sure, I guess you could come up with some very theoretical possibility that somehow that won't, but that doesn't seem very likely. And I think it gets to a fundamental question of, well, are we gonna wait decades to see that?
Starting point is 01:32:59 And in the meantime, give up the possibility, which is probably likely, that finding these cancers early and intervening early will change outcome. I'm all for, and I think everyone is, bigger and more definitive studies over time. But the idea that we're never going to do that study or just take kind of a nihilistic point of view that until it's done, we're not going to find cancers early and intervene, I think that's questionable, especially when the false positive rate is so low. I think there's a few other ways to come at it, which is, if what you said was really
Starting point is 01:33:34 true, I've met, and been called by, some of the folks in whom the Grail test has found a positive. I can think of a former colleague in whom the test found an ovarian cancer. Do you think when she went to her OBGYN and said, look, the test said that I have potentially an ovarian cancer, and they did an ultrasound and they found something, that the OBGYN said, you know what? Since this was found through a new method, let's not intervene. There's a malignancy. It is an ovarian cancer. We know what the natural history is, but we're not going to intervene.
Starting point is 01:34:05 Similarly with cases of pancreatic cancer, head and neck or things like that. I don't understand the logic, because today people do show up, not very often, with early stage versions of these diseases, ovarian, pancreatic, head and neck and things, and we treat them. So why is it you wouldn't treat them if you could find them through this modality? I just don't know of any GI surgeon who says, well, you're one of the lucky people, we found your pancreatic cancer at stage one or two, but we're not going to treat it because there isn't definitive evidence over decades that mortality is better. So I get the academic point, and Grail and others are investing a tremendous amount to increase the data.
Starting point is 01:34:45 The idea that we have this technology and we're going to allow huge numbers of cancers to just progress to late stage before treating, I don't think that's the right balance of potential benefit versus burden of evidence. So, is there now a prospective real-world trial ongoing in Europe? There is. Let's talk a little bit about that. The NHS has been piloting the Grail test in a population of about 140,000. So it involves sequential testing, I think at least two tests, and then they look at
Starting point is 01:35:20 outcomes. It's an interventional study with return of results. And they're looking for a really interesting endpoint here. So mortality takes time. For some cancers, to ultimately see whether or not getting diagnosed at a different stage and the intervention changes that, it could take one or in some cases two decades. But they came up with a really interesting surrogate endpoint, which is reduction in stage four cancers. So here's the logic.
Starting point is 01:35:48 It makes a lot of sense, which is, if people stop getting diagnosed with stage four, and say a big reduction in stage three cancer, then doesn't it stand to reason that ultimately you will reduce mortality? So if you remove the end stage version of cancer, which kills most people, and that you know that you have to pass through,
Starting point is 01:36:09 most people don't die of stage two cancer. They were diagnosed with stage two, they died because it turned out it wasn't stage two and it spread. If you do a study and within a few years, when you're screening people, there's no more, and let's take the extreme, stage four cancer, then you stage shifted the population.
Starting point is 01:36:27 You're eliminating late-stage metastatic cancer. Again, I think while we're waiting for that to read out, my personal belief is the potential benefit of finding cancer is so significant that testing now for many patients makes sense. And then I think there's this endpoint of stage four reduction. Yeah, that's a clever, clever endpoint. One of the things that I know that a lot of the folks who oppose cancer screening tend to cite is that
Starting point is 01:36:59 a number of cancer screening studies do not find an improvement in all-cause mortality even when there's a reduction in cancer-specific mortality. So, hey, we did this colonoscopy study or we did this breast cancer screening study and it indeed reduced breast cancer deaths, but it didn't actually translate to a difference in all-cause mortality. I've explained this on a previous podcast, but it is worth for folks who didn't hear that to understand why.
Starting point is 01:37:25 To me, that's a very, very, oh, how can I say this charitably? That's a very misguided view of the literature because what you fail to appreciate is those studies are never powered for all cause mortality. And if you reduce breast cancer mortality by 40% or 30%, that translates to a trivial reduction in all-cause mortality because breast cancer is still just one of 50 cancers. And even though it's a relatively prevalent cancer, over the period of time of a study, which is typically five to seven years, the actual number of women who were going to die of breast cancer is still relatively small compared to the number of
Starting point is 01:38:09 women, period, who were going to die of anything. And I, in previous podcasts, have discussed that it's very difficult to get that detection within the margin of error. And so if you actually wanted to be able to see how that translates to a reduction in all-cause mortality, you would need to increase the size of these studies considerably, even though really what you're trying to do is detect a reduction in cancer-specific mortality. I say all of that to say that I think one of the interesting things about the NHS study is it is a pan-screening study.
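Peter's powering point can be made concrete with a standard two-proportion sample-size approximation. The event rates and effect size below are made up for illustration; the only thing that matters is that the same absolute benefit gets diluted when it is measured against the much larger all-cause background.

```python
from statistics import NormalDist

def n_per_arm(p_control: float, p_treatment: float, alpha: float = 0.05, power: float = 0.80):
    """Approximate participants needed per arm to distinguish p_control from p_treatment."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta  = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return (z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2

# Hypothetical round numbers over the course of a screening trial:
# 1% of the cohort dies of the screened cancer, 10% die of any cause,
# and screening cuts cancer-specific deaths by 30%.
cancer_death_rate = 0.01
all_cause_rate    = 0.10
averted           = cancer_death_rate * 0.30   # absolute benefit: 0.3 percentage points

n_cancer_specific = n_per_arm(cancer_death_rate, cancer_death_rate - averted)
n_all_cause       = n_per_arm(all_cause_rate,    all_cause_rate - averted)

print(f"powered for cancer-specific mortality: ~{n_cancer_specific:,.0f} per arm")
print(f"powered for all-cause mortality:       ~{n_all_cause:,.0f} per arm")
# The same absolute number of deaths averted needs roughly ten times as many
# participants to detect once it is buried in the all-cause background.
```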
Starting point is 01:38:45 And to my knowledge, it's the first. In other words, it has the potential to detect many cancers, and therefore you have many shots on goal. Potentially, this could show a reduction in all-cause mortality and not just cancer-specific mortality. I would have to see the power analysis, but I wonder if the investigators thought that far ahead, do you know? I mean, they're going to follow these patients long term. They will be able to have the data on mortality. I don't know if it's powered for all cause. I think that's unlikely just for the reasons you said, which is the numbers would be really high. I mean, again, if you're powering it to see
Starting point is 01:39:26 a reduction in stage four over a couple of years, that may not be enough. Interesting. Well, time will tell. Alex, I want to pivot, if we've got a few more minutes, to a topic that you and I spend a lot of time talking about these days. And so, by way of disclosure, you sort of noted that you've left Illumina somewhat recently, you've started another company. I'm involved in that company as both an investor and an advisor and it's an incredibly fascinating subject. But one of the things that we talk about a lot is going back to this role of the epigenome. So I think you did a great job explaining it and putting it in context. So we've got these three billion base pairs and lo and behold, some 28 million of them
Starting point is 01:40:11 also happen to have a methyl group on their C. I'll fill in a few more details that we didn't discuss on the podcast, but just to throw it out there. As a general rule, when we're born, we have kind of our max set of them. And as we age, we tend to lose them. As a person ages, the number of those methylation sites goes down. You obviously explain most importantly what they do, what we believe their most important purpose is, which is to impact gene expression. It's worth also pointing out that there are many hallmarks of aging.
Starting point is 01:40:47 There are many things that are really believed to be at the fundamental level that describes why you and I today look and function entirely different from the way we did when we met 25 years ago, where half the men we used to be. I could make a Laplace Fourier joke there, but I will refrain. So I guess the question is, Alex, where do you think methylation fits in to the biology of aging? That's a macro question, but... Yeah, she talked about the hallmarks of aging. Who's the author? I think it was Hanrahan who came up with that about 10 years ago, this hallmarks of aging. And he recently gave a talk
Starting point is 01:41:31 where he talked about perhaps methylation is the hallmark of hallmarks of aging. And what he's referring to is the mounting data that the epigenetic changes are the most descriptive of aging and are becoming more and more causally linked to aging events. There's lots of data that show that people of comparable age but different health status, for example, smokers versus non-smokers, people who exercise versus people who don't, people who are obese versus people who are not, can have very different methylation patterns. There's also some data that look at centenarians relative to non-centenarians. And obviously that's a complicated analysis because by definition there's a difference
Starting point is 01:42:21 in age. But you get a sense of different patterns of methylation. And clearly, centenarians we've established long ago do not acquire their centenarian status by their behaviors. Just look at Charlie Munger and Henry Kissinger, two people who recently passed away at basically the age of 100, despite no evidence whatsoever that they did anything to take care of themselves. So clearly, their biology and their genes are very protective. As you said, there are a bunch of these hallmarks. I think the original paper talked about nine and that's been somewhat expanded. But you share that view, I suppose, that the epigenome sits at the top and that potentially it's the one that's impacting the other. So when we think about mitochondrial dysfunction, which no one would dispute,
Starting point is 01:43:09 mine and yours are nowhere near as good as they were 25 years ago, our nutrient-sensing pathways, inflammation, all of these things are moving in the wrong direction as we age. How do you think those tie to methylation and to the epigenome and to gene expression by extension? Maybe let's reduce it to like an engineering framework. If we took Peter's epigenome from 25 years ago when I first met you, right? And we knew for every cell type and every cell, what was the methylation status at all 28 million positions? We had recorded that and we took yours today where most of those cells have deviated from that and we could flip all those states back. That's kind of how I think about it is the
Starting point is 01:43:57 sites don't go away; just whether or not they have the methyl group changes. And some places gain it, some places lose it. If we could flip all those back, would that force the cell to behave like it was 25 years ago? The genes it expresses, the fidelity with which it controlled those genes, the interplay between them, would it be reprogrammed back to that state? And so that, I think, is a really provocative hypothesis. We don't know that for sure, but there's more and more evidence that that might be possible. And so to me, that's the burning question is now that we have the ability to characterize that, and we know what it looks like in a higher functioning state, which correlates
Starting point is 01:44:41 with youth, and we are gaining technologies to be able to modulate that and actually change the epigenome as opposed to modifying proteins or gene expressions, but actually go in and remethylate and demethylate certain sites, can we reprogram things back to that earlier state? And if it is the root level at which things are controlled, you then get all of the other features that the cell had and the organism had. That's a really exciting question to answer. Because if the answer is yes or even partially yes, then it gives us a really concrete way
Starting point is 01:45:15 to go about this. And so we talk about the hallmarks, and the hallmarks are complex and interrelated. What I like about the epigenome is we can read it out and we're gaining the ability to modify it directly. So, really, it's the most fundamental level at which all of these other things are controlled. It gives us, again, maybe back to the early discussion, a very straightforward engineering way to go about this. Let's talk a little bit about how that's done. A year ago, you were part of a pretty remarkable effort that culminated in a publication in Nature.
Starting point is 01:45:48 If I recall, it sequenced the entire human epigenome. So if we had the Human Genome Project 24 years ago, roughly, this was the epigenome project. Can you talk a little bit about that and maybe explain technologically how that was done as well? Yeah. So in the development of the Grail Gallery test, there was a key capability that we knew was going to be important for a multi-cancer test. So very different than most cancer screening today, which is done one cancer at a time. So if you have a blood test and it's going to tell you there's a cancer signal present
Starting point is 01:46:26 and this person should be worked up for cancer, you'd really like to know, well, where does that cancer likely reside? Because that's where you should start your work up. You want it to be pretty accurate. So if the algorithm detects a cancer and it's really a head and neck cancer, you'd like the test to also say it's likely head and neck
Starting point is 01:46:44 and then do an endoscopy and not have to do lots of whole body imaging or a whole body PET CT or things like that. So we developed something called a cancer site of origin, and so today the test has that. If you get a signal detected, it also predicts where the cancer is, and it gives like a top two choices. It's about 90% accurate in doing that. But how does that work? The physicians and patients who have gotten that have described it as kind of magic, that it detects the cancer and predicts where it is. And it's based on the methylation patterns. So methylation is what determines cell identity and cell state. So again, DNA code is more or less the same in your cells, but the methylation patterns are strikingly different.
Starting point is 01:47:30 When a cell replicates, why does it continue to be the same type of cell? When an epithelial cell replicates, it has the same DNA as a T cell or a heart cell, but it doesn't become those other states; it's because the methylation pattern, those exact methylation states on the 28 million sites, are also replicated.
Starting point is 01:47:48 So just in the same way DNA is a way of replicating the code, there's an enzyme that looks and copies the pattern to the next cell. And so that exact code determines, again, is it a colonic epithelial cell or a fallopian epithelial cell or whatever it is. And so we knew that the only way to make a predictor in the cell-free DNA data is to have that atlas of all the different methylation patterns.
Starting point is 01:48:16 And so with a collaborator, a guy named Yuval Dor at the Hebrew University of Jerusalem, we laboriously got surgical remnants from healthy individuals, and developed protocols to isolate the individual cell types of most of the cells that get transformed in cancer. And then we got pure methylation patterns where we sequenced, like sequencing the whole genome, sequenced the whole methylome of all those cell types. And we published that a year ago as the first atlas of the human methylome and all of the major cell types. And so for the first time, we could say, hey, this is the code, which makes you a beta islet cell
Starting point is 01:48:56 in the pancreas that makes insulin versus something else. Interestingly, there's only one cell in the body where the insulin promoter is not methylated, and that is the beta islet cell. In every other single cell, that promoter is heavily methylated because it shouldn't be making insulin. It's those kinds of signals that, when you have the cell-free DNA and you look at the methylation
Starting point is 01:49:20 pattern, allow the algorithm to predict, hey, this isn't just a methylation signal that looks like cancer. The patterns of what's methylated and what's not methylated look like colorectal tissue or a colorectal cancer. And that's how the algorithm does it. And so this atlas, again, was a real breakthrough for diagnostics and it made cancer site of origin useful. It's also being used for lots of MRD (minimal residual disease) or cancer monitoring tests too because it's so
Starting point is 01:49:50 sensitive. But it also brought up this interesting possibility which is if you're going to develop therapeutics or you want to say rejuvenate cells or repair them that have changed or become pathologic, what if you compare the methylation pattern in the good state versus the bad state? Does that then tell you the exact positions that need to be fixed? And then with another technology which can go and flip those states, will that reverse or rejuvenate the cell to the original or desired state?
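The site-of-origin idea Alex described a moment ago, matching the methylation pattern on cell-free DNA fragments against the atlas of cell-type-specific patterns, can be caricatured in a few lines. Everything here, the sites, the reference profiles, and the nearest-match scoring, is invented for illustration; the actual Galleri classifier is a far more sophisticated machine-learning model over millions of CpGs.

```python
# Each reference cell type gets a characteristic methylation profile
# (1 = methylated, 0 = unmethylated) at a handful of hypothetical CpG sites.
ATLAS = {
    #                    cpg_1  cpg_2  cpg_3  cpg_4  cpg_5
    "colon_epithelium": (1,     0,     1,     1,     0),
    "lung_epithelium":  (0,     0,     1,     0,     1),
    "pancreatic_beta":  (1,     1,     0,     0,     0),  # cpg_3 unmethylated only here, like the insulin promoter example
    "blood_leukocyte":  (0,     1,     1,     1,     1),
}

def tissue_of_origin(fragment_states):
    """Score each reference cell type by how many CpG states it matches."""
    scores = {
        cell_type: sum(a == b for a, b in zip(profile, fragment_states))
        for cell_type, profile in ATLAS.items()
    }
    return max(scores, key=scores.get), scores

fragment = (1, 0, 1, 1, 1)           # hypothetical methylation calls from one cfDNA fragment
best_match, scores = tissue_of_origin(fragment)
print(best_match, scores)            # colon_epithelium scores highest in this toy example
```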
Starting point is 01:50:23 So, Alex, unlike the genome which doesn't migrate so much as we age, I mean, obviously it accumulates mutations, but with enough people, I guess we can figure that out pretty quickly. Do you need longitudinal analysis of a given individual, i.e., within an individual, to really study the methylome? Do you need to be able to say, boy, in an ideal world, this is what Peter's epigenome looked like when he was one year, you know, at birth, one year old, two, three, four, 50 years old, so that you could also see not just how does the methylation site determine the tissue specificity or differentiation, but how is it changing with normal aging as well?
Starting point is 01:51:11 I think a lot of it is not individual specific. I'll give you an example. So, I've done a fair amount of work in T cells, and if you look at, say, exhausted effector T cells versus naive memory cells where younger individuals tend to have more of those and it gives them more reservoir to do things like fight disease, fight cancer. There's very distinct methylation changes. Certain genes get methylated or demethylated and those changes seem to be, again, very
Starting point is 01:51:42 correlated with this change in T cell function. My belief is that those represent fundamental changes as the T cell population gets aged and you end up with more and more T cells that, relatively speaking, are useless. And so, if you wanted to rejuvenate the T cells, repairing those methylation states is something that would benefit everyone. Now, there are definitely a small percentage of methylation sites that are probably drifting or degrading and those could be specific to individuals. There's some gender-specific sites for sure.
Starting point is 01:52:18 There are some ethnic ones, but the big, big changes seem to happen more with loss of function, big changes with age that are probably common across individuals, or, in the case of cancer, where we also see profound changes. When you think about this space, a term comes up. If folks have been kind of following this, they've probably heard of things called Yamanaka factors. In fact, a Nobel Prize was awarded to Yamanaka for the discovery of these factors. Can you explain what they are and what role they play in everything you're discussing? What Yamanaka and colleagues discovered is that if you take fully differentiated cells, for example fibroblasts, and you expose them to a particular cocktail of four transcription factors, the cell reverts to a stem-cell-like state. And these are called induced pluripotent stem cells. You take a differentiated cell, a mature cell of a particular type. I think
Starting point is 01:53:26 most of their work was in fibroblasts. And when the cell is exposed to these transcription factors, and these are powerful ones at the top of the hierarchy, they unleash a huge number of changes in gene expression: genes get turned on, genes get turned off. And then ultimately, if you keep letting it go, you end up with something that is a type of stem cell. And why this was so exciting is it gave the possibility to create stem cells through a manufactured process. As you know, there's a lot of controversy about getting stem cells from embryos or other sources, so this created a way to create stem cells and use them for medical research by just taking an individual's own cells and kind of
Starting point is 01:54:12 de-differentiating them back to a stem cell. How much did that alter the phenotype of the cell itself? In other words, the fibroblast has a bunch of phenotypic properties. What are the properties of a stem cell, and how much of that is driven by the change in methylation? In other words, I'm trying to understand how these transcription factors are actually exerting their impact throughout this regression, for lack of a better word. We refer to cell-type-specific features as somatic features, like a T cell receptor. That's a feature of a T cell, or a dendrite or an axon would be for a neuron, or an L-type calcium channel for a cardiac myocyte.
Starting point is 01:54:58 So those are very cell-type-specific features. So if you turn on these Yamanaka factors and you go back to a pluripotent stem cell, you lose most of these. And that word pluripotent means the potential to become anything, at least in theory. So you lose most of these cell-type-specific features. So the use of the iPSCs is then to re-differentiate them. And that's what people have been attempting to do, and it opened up the ability to do that, which is you create this stem cell that now potentially has the ability
Starting point is 01:55:32 to be differentiated into something else. You give it a different cocktail, and you try to make it a neuron or a muscle cell, and then use that in a tissue replacement therapy. And there's a lot of research on that, and a lot of groups trying to do that. You also asked about what is the relationship between that
Starting point is 01:55:49 and the epigenetics and methylation state. That has not been well explored. And that's something that I and others are excited to do, because it could be that you're indirectly affecting the epigenome with these Yamanaka factors, and that if you translated that into an epigenetic programming protocol, you could have a lot more control over it. Because one of the challenges with the Yamanaka factors is if you do this for long enough, eventually the stem cell becomes
Starting point is 01:56:18 something much more like a cancer cell and just becomes kind of unregulated growth. And so again, huge breakthrough in learning about this kind of cell reprogramming and de-differentiation, but our ability to use it in a practical way for tissue and cell replacements is not there. My hope is that by converting it to an epigenetic level, it'll be more tractable. You mentioned that this is typically done with fibroblasts.
Starting point is 01:56:45 I assume the experiment has been done where you sprinkle Yamanaka factors on cardiac myocytes, neurons, and things like that. Do they not regress all the way back to pluripotent stem cells? I think to varying extents. I mean, if you truly have a pluripotent stem cell, I guess in theory, it shouldn't matter where it came from, right? Because it's pluripotent. So with developmental factors, where did your first neurons come from? You had a stem cell, and then in the embryo or the fetus, there were factors that then coaxed that stem cell to become these other types of cells and tissues. So if it's truly
Starting point is 01:57:21 pluripotent, you should be able to do that. Now, I think you're getting at something which is different, which is called partial reprogramming. People who have followed his work are trying to do this thing, which is to kind of stop halfway. So what if you took a heart cell or a T cell that's lost a lot of function and you give it these Yamanaka factors, but you stop it before it really loses its cell identity? Will it have gained some properties of its higher-functioning, youthful state without having lost that identity? There are some provocative papers out there on this. There's a guy, Juan Carlos Izpisua Belmonte,
Starting point is 01:58:02 who's done some work on this, with some very provocative results in mice doing these partial reprogramming protocols and rejuvenating, again, it's mice, so all the usual caveats, but getting very striking improvements in function, in eyesight, in cognition, again, in these mouse metrics. So I'm certainly interested in trying to understand how that might be able to translate to humans. Again, the worry there would be that if you don't control it, then you could make essentially a tumor. So it's opened up that whole area of science, that it's possible to do these kinds of dramatic de-differentiations. How to really harness that in the context of human rejuvenation, we don't know how to do that yet,
Starting point is 01:58:45 but there are a lot of people trying to figure that out. If you had to guess with a little bit of optimism, but not pie-in-the-sky optimism, where do you think this field will be in a decade? There was a day when a decade sounded like a long time away. It doesn't sound that long anymore. Decades seem to be going by quicker than I remember. So it's going to be a decade pretty soon, but that's still a sizable amount of time for the field to progress. What do you realistically think can happen with respect to addressing the aging phenotype, vis-a-vis some method of reversal of aging, some truly geroprotective intervention? So I'm optimistic and I'm a believer.
Starting point is 01:59:35 I think for specific organs and tissues and cell types, there will be treatments that rejuvenate them. It's hard to see in a decade that there's just a complete rejuvenation of every single cell and tissue in a human, but for joint tissues, the retina, and immune cells, we're learning so much about the biology related to rejuvenation and their healthier states. And then in combination with that, the tools to manipulate them, which are equally important; you could understand what the biology is but not have a way to intervene. The tools to go in and edit these at a genomic level, to edit them at an epigenetic level to change the state, and the delivery technologies to get them to very specific tissues and organs are also progressing tremendously.
Starting point is 02:00:25 So I definitely see a world 10 years from now where we may have rejuvenation therapies for osteoarthritis, rejuvenation for various retinopathies, where we can rejuvenate full classes of immune cells that make you more resistant to disease, more resistant to cancer. I think we'll see things that will have real benefits in improving healthspan. Alex, this is an area that I think truly excites me more than anything else in all of biology, which is to say I don't think there's anything else in my professional life that grips my fascination more than this question. Namely, if you can revert the epigenome to a version that existed earlier, can you take the phenotype back with you? And that could be at the tissue level, as you say. Could I make my
Starting point is 02:01:22 joints feel the way they did 25 years ago? Could I make my T cells function as they did 25 years ago? And obviously, one can extrapolate from this and think of the entire organism. So anyway, I'm excited by the work that you and others in this field are doing, and grateful that you've taken the time to talk about something that's really no longer your main project, but something for which you provide probably as good a history as anyone, vis-a-vis the liquid biopsies, and then obviously a little bit of a glimpse into the problem that obsesses you today. Awesome. Well, fun chatting with you as always, Peter. Glad I had the opportunity to dive in deep with this. There aren't many places to do this. Thank you. Thanks, Alex.
Starting point is 02:02:07 Thank you for listening to this week's episode of The Drive. It's extremely important to me to provide all of this content without relying on paid ads. To do this, our work is made entirely possible by our members. And in return, we offer exclusive member only content and benefits above and beyond what is available for free. So if you want to take your knowledge of this space to the next level, it's our goal to ensure members get back much more than the price of the subscription.
Starting point is 02:02:30 Premium membership includes several benefits. First, comprehensive podcast show notes that detail every topic, paper, person, and thing that we discuss in each episode. And the word on the street is, nobody's show notes rival ours. Second, monthly Ask Me Anything or AMA episodes. These episodes are comprised of detailed responses to subscriber questions typically focused on a single topic and are designed to offer a great deal of clarity
Starting point is 02:02:58 and detail on topics of special interest to our members. You'll also get access to the show notes for these episodes, of course. Third, delivery of our premium newsletter, which is put together by our dedicated team of research analysts. This newsletter covers a wide range of topics related to longevity and provides much more detail than our free weekly newsletter. Fourth, access to our private podcast feed that provides you with access to every episode, including AMAs, sans the spiel you're listening to now, and in your regular podcast feed. Fifth, the Qualys, an additional member-only podcast we put together that serves as a highlight reel featuring the best excerpts from previous episodes of The Drive. This is a great way to
Starting point is 02:03:42 catch up on previous episodes without having to go back and listen to each one of them. And finally, other benefits that are added along the way. If you want to learn more and access these member-only benefits, you can head over to peteratea.com. You can also find me on YouTube, Instagram, and Twitter, all with the handle PeterateaMD. You can also leave us a review on Apple podcasts or whatever podcast player you use. This podcast is for general informational purposes only and does not constitute the practice of medicine, nursing or other professional health care services including the giving of medical advice. No doctor-patient relationship is formed. The use of this information and the materials linked to
Starting point is 02:04:25 this podcast is at the user's own risk. The content on this podcast is not intended to be a substitute for professional medical advice, diagnosis, or treatment. Users should not disregard or delay in obtaining medical advice for any medical condition they have, and they should seek the assistance of their healthcare professionals for any such conditions. Finally, I take all conflicts of interest very seriously. For all of my disclosures and the companies I invest in or advise, please visit peteratiamd.com. Thank you.
