The Joy of Why - How Does Math Keep Our Secrets?
Episode Date: August 1, 2024
Can you keep a secret? Modern techniques for maintaining the confidentiality of information are based on mathematical problems that are inherently too difficult for anyone to solve without the right hints. Yet what does that mean when quantum computers capable of solving many problems astronomically faster are on the horizon? In this episode, host Janna Levin talks with computer scientist Boaz Barak about the cryptographic techniques that keep information confidential, and why “security through mathematics” beats “security through obscurity.”
Transcript
Hello, I'm Brian Cox. I'm Robin Ince, and this is the Infinite Monkey Cage trailer for our brand new series.
We've got mummies, we've got magic, we've got asteroids.
Mummies, magic and asteroids. What's the link?
That it was an asteroid that magically went over the world that led to Imhotep the mummy coming back to life?
That's correct.
I thought it would be weird, scientific as ever.
But the most important thing to know is that we are going to deal with the biggest scientific question we finally ask: what is better, cats or dogs?
Listen wherever you get your podcasts.
We all have secrets we want to obscure. From childhood notes between friends, to Da Vinci's notebooks, to the wartime messages
famously cracked by Alan Turing and a cohort of English cryptographers. To share secrets with a
friend, an ally, a co-conspirator, there's cryptography. There are codes and ciphers.
Ingenious means to safeguard information against prying eyes. But in lockstep, there are code breakers.
And equally ingenious means to decipher the hidden information.
Cryptography has become crucial to modern life and commerce to protect our emails, our banks, and our national security.
While developing more and more secure encryptions, researchers have recently made some unexpected discoveries that reveal deeper truths about the theoretical limits of secrecy itself.
I'm Janna Levin, and this is The Joy of Why, a podcast from Quanta Magazine, where I take
turns at the mic with my co-host, Steve Strogatz, exploring the biggest questions in math and
science today. In this episode, theoretical computer scientist Boaz Barak demystifies
cryptography as we ask, is it possible to perfectly protect secrets? Boaz is the Gordon
McKay Professor of Computer Science at Harvard University. He's also a member
of the Scientific Advisory Board for Quanta Magazine and the Simons Institute for the Theory
of Computing. His research centers on cryptography, computational complexity, quantum computing,
and the foundations of machine learning. Boaz, welcome to the show.
Thank you. Thank you very much.
So glad to have you. This is quite a challenging subject. And to open, I kind of want to do the
opposite of encrypting this conversation. We're here to share ideas. And so let's start by opening
the dictionary of terms here. What is cryptography? What are ciphers?
So cryptography's meaning has really evolved over the years. I think in the early days,
since humans began writing, they had this notion of secret writing, some ways to obscure
their secrets. And cryptography was kind of synonymous with that, with basically encryption.
But then, more recently, since the 1970s, cryptography has really expanded and evolved.
And it's no longer just encryption.
It's also authentication, like digital signatures, and even fancier things like zero-knowledge proofs, multi-party secure computation,
and many other ways to basically protect not just communication, but also computation.
So it's as though we figured out how to write language, and then we decide sometimes we don't want to share our inner thoughts, and so we write secretly. And that must go back quite a long way. So what are
some of the earliest encryption and decryption techniques?
So a famous example in encryption is Caesar's cipher, which is attributed to Julius Caesar, though
I believe it predates him. It's a very, very simple system of obscuring data or messages where the idea is
just that you shift the letters of the alphabet. So for example, the letter A maps to, say, the letter F,
the letter B maps to G, the letter C maps to H, etc. And this is a very simplistic approach
which is not that hard to break. Generally,
the Caesar cipher is a special case of what's known as a substitution cipher, where you have some kind of a table mapping the letters of your message that you're trying to encrypt,
that is, what we typically call the plaintext, into the encrypted form, which we call the
ciphertext. And these types of substitution ciphers have been very common. One of the famous examples
was used by Mary, Queen of Scots, when she was planning a coup against her cousin Elizabeth. And substitution ciphers are not very secure.
You can typically break them by just looking at the frequency with which each symbol appears in the ciphertext. For example, you can figure out that the most frequent symbol is probably
the encoding of E, because that's the most frequent letter in the English language.
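To make the idea concrete, here is a minimal Python sketch (an editorial illustration, not code discussed in the episode): a Caesar shift, and the letter-frequency count that breaks this kind of substitution scheme. The message and shift value are arbitrary examples.

```python
from collections import Counter
import string

def caesar_encrypt(plaintext, shift):
    """Shift each letter of the alphabet by a fixed amount (A -> F for shift=5)."""
    result = []
    for ch in plaintext.upper():
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            result.append(ch)  # leave spaces and punctuation alone
    return "".join(result)

def frequency_hint(ciphertext):
    """Count symbol frequencies; the most common symbol likely encodes E or T."""
    letters = [ch for ch in ciphertext if ch in string.ascii_uppercase]
    return Counter(letters).most_common(5)

ciphertext = caesar_encrypt("MEET ME AT THE THEATER AT SEVEN", shift=5)
print(ciphertext)                   # RJJY RJ FY YMJ YMJFYJW FY XJAJS
print(frequency_hint(ciphertext))   # the top symbol (J) is a good guess for E
```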
I was going to guess E. Yes.
And using that, Queen Elizabeth's spies managed to crack Mary's cipher, and it didn't end well for Mary, who was executed.
Right.
And that has actually been kind of a typical story with encryption throughout the years
where someone comes up with an encryption scheme.
They believe it is unbreakable. They use it for
life and death applications. And it turns out that it is breakable. And typically,
when you use something for life and death applications and it doesn't work,
it doesn't bode well for you. Yeah, dire consequences. Yes. To speak to that point,
the 20th century really was a critical point
in cryptography. I mean, there were two world wars where encryption and decryption played a
really major role. And at the same time, maybe as a result, the field of cryptography began to become
a very serious and important subject, both for intellectual and scientific reasons, but also for survival,
for the fate of the world, right? And we mentioned one of the central figures like
Alan Turing and his role famously in cracking the German Enigma cipher. What else was really
shifting in the significance of cryptography in the 20th century?
So from a cryptography point of view, I think, yes, the Enigma cipher, which was believed to be unbreakable by the Germans, partly because if you were trying to figure out
how many combinations there were for the secret key of the Enigma, which involved setting
the wires of several rotors.
It was kind of like a typewriter almost.
Yes, it looked exactly like a typewriter. When I teach at Harvard, I always have to bring up
a photo of a typewriter, because these days students don't know what a typewriter looks like.
But it looked like a typewriter. When you hit a key, then something else would come out.
So you hit the letter A, and maybe the letter Z would come out. And it wasn't a simple substitution
cipher, in the sense that if you hit the letter A again, then another letter would come
out. So the state of the system would be constantly evolving,
which made it much harder to break.
And the secret key was actually the setting of the wires of the rotors inside this typewriter.
So there were like several rotors
which were wired in a certain way.
And the number of possibilities for the secret key
was absolutely enormous,
was something like 10 to the 100,
which even with today's computers,
if we were trying to break the Enigma by brute force,
by simply trying out all possibilities,
the sun would die out and collapse before we were done,
let alone with the computing devices
that they had in the 1940s.
So it took a lot of mathematical ingenuity,
from Alan Turing and many other people at Bletchley Park, and even before that, key insights from the Polish intelligence services, to actually break the Enigma.
And one of the lessons that cryptography took from that is that trying to build security by having a very complicated system like the Enigma
is actually not going to succeed.
Cryptography transitioned into relying in some sense on simpler systems,
but with a more sound mathematical analysis.
And a key figure in bringing about the mathematical analysis of cryptography was Claude Shannon, who in
the late 1940s wrote some influential papers, starting with a mathematical theory of encryption.
And then in the 1970s, people like Diffie, Hellman, and Merkle, these are three separate
people, started the mathematical theory of public-key cryptography, which was then really built upon
in the late 70s and early 80s. We started to have a mathematical theory of cryptography that, rather
than being based on obscure systems like the Enigma, was actually based on very simple problems,
like, say, the problem of finding the prime factors of large integers, which has been around for
thousands of years, but which, despite this history, we still don't know an efficient algorithm for.
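To see the asymmetry Barak is pointing at, here is a small illustrative Python sketch (mine, not from the episode): multiplying two primes is instantaneous, while undoing it by the naive method, trial division, already takes about a million steps for seven-digit factors and scales hopelessly for the enormous numbers used in practice. The specific primes are arbitrary demo values.

```python
import time

# Multiplying two primes is easy (these are small illustrative primes;
# real cryptographic moduli have hundreds of digits).
p, q = 999_983, 1_000_003
n = p * q
print(n)  # 999985999949

def factor_by_trial_division(n):
    """Brute-force search for the smallest factor of n."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return n, 1  # n is prime

start = time.perf_counter()
print(factor_by_trial_division(n))             # (999983, 1000003)
print(f"{time.perf_counter() - start:.3f} seconds")
# The loop above runs roughly a million times; double the number of digits
# in the factors and this approach needs millions of times more steps.
```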
Now, it's interesting that Alan Turing comes up not only in these world-changing crises over cracking codes,
but also he is well known as the inventor of the modern concept of computation itself.
You know, Turing thought about mechanizing thought, and it led him to the notion of a computer
that wasn't a human, which is how the term computer had originally been used. Computers
were people who computed. And Alan Turing changed that to the idea of a machine that could compute.
So you're talking about these wonderful theoretical changes.
How did cryptography change with the advent of actual machines that we now call computers?
So indeed, I think some of the earliest mechanical computers were built in Bletchley Park
exactly for the task of automating some of the analysis in breaking the Enigma and other ciphers.
And Alan Turing had a broader vision than that. So specific computing devices have
always been around to mechanize computation to some extent. But Alan Turing had this broader
vision of a general purpose computer. I should say that Charles Babbage had this same vision maybe 70 years earlier, but he never actually got around to building the device.
And Turing had this idea that you could actually build this device that would be general purpose.
And in some sense, there is a notion of a universal computer, or as we call it today,
a universal Turing machine that is the same piece of hardware
that can run arbitrary functionality
by basically being supplied with software.
It is quite an amazing evolution.
So now here we are,
and cryptography plays a part in nearly everything we do,
whether we're fully aware of it or not.
I mean, we use encryption to hide
private messages, of course, but also to secure information, to compact it and secure it. It shows
up in telecommunications, medical records, how photos are stored on our phones. So tell us a
little bit about how cryptography is integrated into all of our everyday lives. Yes. So people don't realize, for example, the conversations such as the one that you and
I are having, and millions of people are having using Zoom or other telecommunication frameworks,
they often use wireless connections, which basically means that the signals that we are
transmitting are going through the air and anyone can pick
them up. The reason our conversations are still secure is because they are encrypted.
Now, also, all of us basically carry around a device that both stores all of our private
information and has a microphone and a camera that can record potentially all of our communication.
And moreover, this device is a fully programmable computer.
All our smartphones are fully programmable computers,
and they can get over-the-air updates to completely change their functionality.
The reason that, say, some hackers don't send us over-the-air updates that can convert our smartphones into
a listening, recording, and surveillance device is because we use cryptography and specifically
digital signatures, so that even if the device gets some piece of software update,
it can verify, using digital signatures, that the software update actually came from the
manufacturer. So fascinating that all of this is really part of our absolutely ordinary everyday
lives, not just life or death stuff. We're going to take a little break and be right back after
this message. Welcome back to The Joy of Why. We're speaking with Boaz Barak about the art and science of
cryptography. Now, you've mentioned some of the ways where today we're trying to protect
information and why we're trying to protect information. So what still makes us vulnerable?
We have these algorithms, we have data that we can encrypt,
we have all kinds of rules about passwords and user privacy. But what makes a system vulnerable?
And how do they typically break? So that's a great question. So there are generally two ways
in which systems can be broken. So first of all, while we have these great encryption schemes, we actually don't have a mathematical proof that they are secure.
And proving that they are secure is actually related to some of the greatest open questions in computer science and science at large, and specifically the P versus NP question.
Can you remind us what P and NP mean? Yes. So P means polynomial time,
and NP means non-deterministic polynomial time. So the P refers to the class of problems that can
be solved efficiently by a computing device, whether that computing device is a standard digital computer or any other computing
device. And NP refers to the class of problems whose solutions can be efficiently verified by a computing device. So the P versus
NP question really asks whether every problem whose solution can be efficiently verified can actually be also efficiently found.
Now, intuitively, we think that P should be different than NP, that there are some problems
where it's much easier to verify a solution once someone gave it to you than to find it
yourself from scratch. But we actually don't have a mathematical proof that this is the case,
even though it is widely conjectured.
And one example of a problem that is of this type is
if someone gave you the secret key that could decrypt
all of the communication between two parties,
then you could easily verify that this secret key actually works.
But finding out the secret key can take time, which, as I said, could potentially be longer
than the time for the sun to die out.
So if P equals NP, then in some sense, we could break every possible encryption scheme.
But it is widely conjectured that they are not equal.
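A toy Python sketch of that asymmetry (an illustration, not from the episode): checking that a given key decrypts a message takes one try, while finding an n-bit key by brute force can take up to 2^n tries. The XOR "cipher" and the 16-bit key size here are deliberately weak demo choices.

```python
import secrets

def xor_encrypt(message: bytes, key: int, nbits: int) -> bytes:
    """Toy cipher: XOR the message with a repeating n-bit key.
    (Illustration only; this is not a secure construction.)"""
    key_bytes = key.to_bytes(nbits // 8, "big")
    keystream = key_bytes * (len(message) // len(key_bytes) + 1)
    return bytes(m ^ k for m, k in zip(message, keystream))

NBITS = 16                 # tiny key so the brute-force search finishes instantly
message = b"attack at dawn"
key = secrets.randbits(NBITS)
ciphertext = xor_encrypt(message, key, NBITS)

# Verifying a candidate key is easy: one decryption and one comparison.
assert xor_encrypt(ciphertext, key, NBITS) == message

# Finding the key from scratch means searching the whole key space:
# up to 2**NBITS tries. At 16 bits that is 65,536 attempts; at 128 bits
# it is about 3.4e38, far beyond any conceivable amount of computation.
for guess in range(2**NBITS):
    if xor_encrypt(ciphertext, guess, NBITS) == message:
        print("recovered the key:", guess == key)
        break
```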
So we don't have a proof that the encryption schemes that we use
are unbreakable. And once in a while, people do manage to break the underlying cryptography.
Although at least the main crypto systems that we are using basically have been unbroken since
the 1970s. But it's far more common to go around the cryptography. And that's how
hackers actually break into our systems. So one metaphor that I like to use is that cryptography,
when properly implemented, is basically like a hundred-ton steel door. But the systems where
we use it are basically like a wooden shack.
So if you're installing a hundred-ton steel door on a wooden shack,
then yes, a thief would not be able to break the door,
but they might find a different way around it.
And the sheer complexity of modern systems means that it's really hard to secure all sides of them.
So hackers typically don't break the cryptography,
at least when it's properly implemented, but go around the cryptography.
Fascinating, because a lot of this also dates back to those very deep concepts in mathematics about what's provable and unprovable. And in a way, this is kind of the computing manifestation
of that. That's a whole other episode. Very deep stuff. So before the 1970s, most cryptography was symmetric in the sense that
the same key would be used to encrypt and decrypt a message. Is that what you're referring to,
that since the 1970s, cryptography is asymmetric, where there's some key used for encryption and a
private key used for decryption?
Yes. So this was one of the major changes that happened in the 1970s.
So before the 1970s, indeed, cryptography was basically what we call private key cryptography,
where the sender and receiver share a secret key that they use for communication.
And this kind of worked okay for the military applications of cryptography,
where you have a spy and maybe before you send them off to a mission, you give them a codebook
that they would use to communicate with the home base. But it doesn't really work with the modern
economic applications of cryptography. So I don't know about you, but I'm a Gmail user and I rely on encrypted communication between me and the Google servers to look at my email securely.
But I've never actually paid a visit to Google headquarters to exchange with them a private key.
And if every user of Google had to physically exchange a key with the Google servers, the modern internet would not exist.
So in public key cryptography,
two parties can communicate over an unsecured channel
and exchange confidential information
by basically having one party, let's say the receiver,
send their public key to the sender,
and then the sender can use that public key
to encrypt their message,
send it over the unsecured channel,
and the receiver can use their secret key to decrypt it.
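As an illustration of that flow, here is textbook RSA with deliberately tiny numbers (my sketch, not something walked through in the episode; real systems use much larger keys plus padding): the receiver publishes (n, e), keeps d secret, and only d can undo an encryption performed with the public key.

```python
# Textbook RSA with toy numbers -- insecure, purely to illustrate the flow.
p, q = 61, 53                  # the receiver's secret primes
n = p * q                      # 3233, part of the public key
phi = (p - 1) * (q - 1)        # 3120
e = 17                         # public encryption exponent
d = pow(e, -1, phi)            # 2753, the private decryption exponent (Python 3.8+)

public_key = (n, e)            # sent to the sender over the open channel
private_key = (n, d)           # never leaves the receiver

# Sender: encrypt a small message m < n using only the public key.
m = 1234
ciphertext = pow(m, e, n)

# Receiver: decrypt with the private key.
recovered = pow(ciphertext, d, n)
assert recovered == m

# An eavesdropper sees (n, e) and the ciphertext, but turning e into d
# requires knowing phi, i.e. factoring n -- easy for 3233, hopeless for
# the multi-thousand-bit moduli used in practice.
```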
Point being that, yes,
people could be listening on this channel
and they could all learn the public key,
but the public key can only be used for encryption. It cannot be used for decryption. It does make me wonder though,
all of that's quite amazing. That's such tremendous progress. We exchanged gmails today
and I feel pretty confident nobody read our gmails, not least because they weren't that
interesting, right? I'll see you there then, you know, I'm running late, whatever. But are there
theoretical limits to cryptography and can things truly be unconditionally secure?
So there are two answers to these questions. First of all, yes, it is known that public-key
cryptography can never be unconditionally secure, in the sense that it could always be broken by basically trying all possible
keys. But trying all possible keys is an effort that scales exponentially with the key size.
So at very moderate key sizes, that already requires, say, spending more cycles than there are atoms in the observable universe,
or similarly astronomical quantities, with which you are probably more familiar than I am.
So that's not really a barrier. So theoretically, we could have a mathematical theorem that tells us
that this crypto system cannot be broken by any attacker that would spend less than, say,
a number of operations that scales like two to the n where n is the length of the key.
We don't have a theorem like that. And the reason is that such a theorem would in particular
also imply that P is different from NP, which is a major problem that we haven't been able to solve. So at the moment, we have to settle for proofs of security that are conditional,
based on certain conjectures. Now, there are also twists on this whole idea. Sometimes I don't
want to completely obscure everything that's going on. Sometimes I want to let you know I know something, but I don't necessarily want to reveal all, right? I believe these are known as
zero knowledge proofs. Can you expand on that? Explain to me why sometimes I want you to know I
know. I want to prove to you that I know. I don't want you to just take my word for it without
revealing the information I'm protecting. Sure. Actually, let me give you an example in the context of nuclear disarmament.
You know, you have this, say, Russia and the U.S. Both have a huge stockpile of nuclear warheads,
far more than necessary to destroy, you know, major parts of the world several times over.
It's very expensive.
It's actually in both countries' interest to reduce this stockpile.
And it's also in the interest, obviously, of world safety, because the fewer warheads out there, the less likely it is that we would have a completely devastating nuclear war.
But part of the problem is that this is an equilibrium, and it's hard to agree on reducing the stockpiles.
Another part is, how do you verify that the other side really did, say, destroy the warheads?
One solution to that is simple.
You know, say, for example, the Russians declare, we are going to destroy these 100
warheads. They invite American inspectors to come to the warehouse where the warheads are stored
and take a look at them, examine them, and then they go into whatever machine that
destroys all of these things. That is great, except that the design of a nuclear warhead is one of the most classified secrets that the country has.
And the Russians have no intention of letting American inspectors anywhere near opening up the warhead and examining it.
So then it becomes a question of, say, I have a barrel. Can I prove to you that there is a nuclear warhead inside the
barrel without giving you any details of the exact design of this warhead? And this is the type of
question that zero-knowledge proofs are designed to address. So you want to say, prove that something
satisfies a certain predicate. So for example, this barrel contains a nuclear warhead, or maybe this number is a composite
number, or it's a composite number where one of the factors has a last digit of seven.
So you have a certain object and you want to prove that it satisfies a certain predicate
without giving away the information, such as the design of the warhead
or the factorization of the number, that really proves why the object satisfies this predicate.
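For a flavor of how such proofs work mathematically, here is a toy Schnorr-style interactive proof in Python (my sketch; the episode does not go into this construction): the prover convinces the verifier that she knows the discrete logarithm x of y = g^x mod p, without revealing x. The parameters are demo-sized.

```python
import secrets

# Toy public parameters: p prime, g a generator of the group mod p.
# Real deployments use groups with roughly 256-bit order; these are demo-sized.
p = 23
g = 5                       # 5 generates the multiplicative group mod 23 (order 22)
order = p - 1

x = 7                       # the prover's secret ("witness")
y = pow(g, x, p)            # public statement: "I know x with g**x = y (mod p)"

def run_one_round():
    # 1. Prover commits to a random nonce r.
    r = secrets.randbelow(order)
    t = pow(g, r, p)
    # 2. Verifier sends a random challenge c.
    c = secrets.randbelow(order)
    # 3. Prover answers with s = r + c*x; on its own s reveals nothing
    #    about x, because r is random.
    s = (r + c * x) % order
    # 4. Verifier checks g**s == t * y**c (mod p); only someone who knows x
    #    can answer arbitrary challenges consistently.
    return pow(g, s, p) == (t * pow(y, c, p)) % p

print(all(run_one_round() for _ in range(20)))   # True
```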
Fascinating example, and unfortunately timely.
Does this relate to the concept of obfuscation that I've been reading about?
So obfuscation is kind of a vast generalization
of a lot of things,
including zero-knowledge proofs and others.
Basically, the idea is,
could you take, say, a computer program
and transform it in such a way
that it will become like a virtual black box,
in the sense that you will be able to run it,
but you will not be able to examine its internals.
So you could have, say, some computer program that potentially takes as input some of the
secret information, but only produces a single zero-or-one bit:
does the secret information satisfy a certain predicate or not?
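Fully general obfuscation of that kind is still out of reach in practice, but one familiar special case gives the flavor (my illustration, not from the episode): a password check can be shipped as a salted hash of the secret, so anyone can run the yes-or-no check without being able to read the secret out of the program.

```python
import hashlib
import secrets

# "Obfuscating" a point function: a program that outputs True only on one
# secret input. Instead of embedding the secret, we embed a salted hash of it.
def make_checker(secret: str):
    salt = secrets.token_bytes(16)
    digest = hashlib.sha256(salt + secret.encode()).digest()

    def check(candidate: str) -> bool:
        # Anyone can run this check, but recovering the secret from
        # (salt, digest) would require inverting SHA-256.
        return hashlib.sha256(salt + candidate.encode()).digest() == digest

    return check

check = make_checker("correct horse battery staple")
print(check("hunter2"))                        # False
print(check("correct horse battery staple"))   # True
```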
So obfuscation can be used to achieve zero-knowledge proofs.
It can be used to achieve a lot of other cryptographic primitives.
And one of them, for example, is the idea of secure multi-party computation, where the
idea is maybe, for example, you have a certain private input.
I have a certain private input.
Maybe you are a hospital and you have your own patient records.
I'm in another hospital.
We have our own patient records. For reasons of patient confidentiality,
we cannot share the patient records with each other. But could we run some computation
such that, at the end of it, we will learn some statistical information about both sets of
records without revealing any of the secret information?
And this falls under secure multi-party computation.
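Here is a minimal Python sketch of the simplest ingredient behind such protocols, additive secret sharing (my illustration; real secure multi-party computation systems are far more sophisticated): three hospitals learn the total of their private counts without any of them revealing its own number. The counts and modulus are arbitrary demo values.

```python
import secrets

M = 2**61 - 1            # shares live modulo a large number, so each looks random

def make_shares(value: int, n: int):
    """Split a private value into n additive shares that sum to it mod M."""
    shares = [secrets.randbelow(M) for _ in range(n - 1)]
    shares.append((value - sum(shares)) % M)
    return shares

# Three hospitals, each with a private patient count they will not reveal.
private_counts = [1042, 879, 1311]

# Each hospital splits its count and sends one share to each of the hospitals.
all_shares = [make_shares(c, 3) for c in private_counts]

# Hospital j only ever sees column j: one random-looking share from each peer.
column_sums = [sum(all_shares[i][j] for i in range(3)) % M for j in range(3)]

# Publishing the three column sums reveals only the total, not the inputs.
total = sum(column_sums) % M
print(total, total == sum(private_counts))   # 3232 True
```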
Now, in particular, I've read the phrase
indistinguishability obfuscation.
Doesn't exactly roll off the tongue,
but I believe I've heard you refer to it as,
you know, the one cryptographic primitive to rule them all.
Yes.
So obfuscation is, in some sense,
if you could have it generically, you could basically
do anything you want in cryptography. And unfortunately, in 2001, I and some colleagues
proved that the most natural notion of obfuscation, virtual black box obfuscation,
which is kind of a mathematical translation of what I said before,
that you take a program and you compile it in such a way that it is virtually a black box.
So we proved that that is impossible to achieve. But then we said there are weaker notions of
obfuscation, in particular, this notion of indistinguishability obfuscation, which our
impossibility proof didn't apply to.
But we had no idea whether that notion of indistinguishability obfuscation is possible to achieve, and if so, whether it would actually be useful for all of the applications.
So then in 2013, there was a breakthrough showing that, yes, indistinguishability obfuscation can be
constructed, and in fact, that it can be useful for many applications.
And since then, there's been a steady stream of works showing more and more applications
of indistinguishability obfuscation, and also works trying to make it not just theoretically
possible, but also practically feasible.
The latter is still very much a work in progress. So right now, the overhead is such that we cannot
use it. But the theoretical feasibility shows that perhaps in the future, we will be able to
improve the efficiency and make it usable. Now, there's this beautiful thing on the horizon that
keeps looming and receives a lot of discussion, and that's quantum computers, of which we have no large-scale examples
yet. But how would these methods fare in a quantum world against quantum technologies?
So this is a fascinating question, and actually a very timely one at the time we were recording
this interview. So first of all, quantum computers are still computers.
They can be mathematically modeled.
And while they appear to give exponential speedups
for some very structured computational problems,
they are not a magic bullet that can speed up any computation whatsoever.
In particular, essentially all of the secret key
encryptions that we currently use will remain secure also against quantum computers. However,
maybe the most important and famous problem for which quantum computers can offer an exponential
speedup is the integer factoring problem,
which lies at the heart of the RSA encryption scheme.
That means that if scalable quantum computers are built, then the public-key encryption
schemes that are based on these number-theoretic objects, such as the RSA system based on integer factoring, Diffie-Hellman based on the discrete
logarithm, and its variant based on elliptic curves, will all be broken. Unfortunately,
it's very hard to construct public-key encryption, so we don't have many candidates.
This is one major family of candidates, and there is one other major family of candidates for public-key encryption, which is based on problems relating to lattices.
So we do have these lattice-based problems that are not known to be broken by quantum computers.
So this forms the basis for what is sometimes known as post-quantum
cryptography, at least in the public-key encryption setting. Now, you've already mentioned a number
of other major techniques used in cryptography, and we don't really have a chance to pick apart
each of these, what a lattice means or what the integer approach is. But I think we get the impression
that they're rooted deeply in fundamental mathematics, which
is an important point that maybe not everyone realizes,
and that the tools are somehow fundamental
to how nature encodes math
and how math encodes information in a deep way,
and that a lot of these complex techniques
go back to fundamental
mathematics and keep relying on that sort of core discipline to progress.
Absolutely. And one of the things I tell students when I lecture on cryptography is that we have
moved from security through obscurity to security through mathematics. And it turns out that security through mathematics is much more reliable.
So even though today attackers have access to much higher amounts of computational resources
that far dwarf the resources that people had in Bletchley Park, we still have cryptographic
schemes that nobody has been able to break.
And as I said, therefore, attackers always go around the cryptography.
And the reason is that rather than trying to build very complicated esoteric systems
like the Enigma, we are relying on simpler principles, but applied through mathematical
understanding.
And that is how we are getting more reliably secure.
Yeah, that's a big change. So I guess it's natural for me to ask here,
maybe you've already implicitly answered that in prompting this question,
but where is cryptography headed next?
So there are maybe four different research strands in cryptography. One is expanding the reach of cryptography.
So going beyond, say, secret writing to public-key encryption, and then to even fancier things like zero-knowledge proofs,
multi-party secure computation, obfuscation.
The second is bringing things from theory to practice.
So taking some theoretical constructions that initially
have far too much overhead to be used in practice and improving them. The third is sharpening our
understanding of the cryptographic schemes that we currently have by attempting to break them.
So cryptanalysis, attempting to break cryptographic schemes, is also an important
area of research. And the fourth is really understanding the basic assumptions that we
are using in cryptography, things like integer factoring and lattice-based assumptions, and maybe
new assumptions we can find, and trying to understand their validity and whether we can
provide rigorous evidence for their correctness or show that they are incorrect.
It's a field that's taken many turns from the kind of natural instinctive urge to have secret notes
to the depths of mathematics, to computing, to quantum computing. It's really a fascinating subject.
And at this point, we have a question we like to ask,
which is what about this research brings you joy?
When I started as a graduate student at the Weizmann Institute of Science,
I didn't actually intend to study cryptography.
I didn't know much about cryptography and I thought it was a very technical field just having to do with number theory, which is nice,
but not something that I was passionate about. The thing that kind of blew my mind was when I
took a cryptography class with Oded Goldreich, who became my advisor.
And I realized that you could actually mathematically define what it means to be
secure. And I kind of just found it fascinating that this intuitive notion, which I thought
had no connection to formal mathematics, could actually be captured by mathematics, and then we can actually prove things about it.
And this is the thing that I still find so fascinating about cryptography,
that it brings math into places where we didn't really think it would hold.
And it reminds me of these very deep ideas of 100 years ago,
that we can prove that there are unknowable facts about math.
Yes. Some of the techniques are actually sometimes similar. Maybe another reason why I find
cryptography fascinating is that I think as human beings and as scientists, we have a long history
of being fascinated by impossibility results. So there was the impossibility of deriving the parallel postulate
from the other axioms of geometry,
impossibility of trisecting an angle
with just a compass and a straight edge,
and of course,
Gödel's theorem on the impossibility
of proving all true facts about mathematics.
But cryptography is about
the practical application of impossibility,
which I kind of find really
fascinating that we take what we think are fundamentally negative results that are just
for intellectual curiosity and would have no practical implications whatsoever.
And we turn them into something that we actually apply and are using every day to do commerce
over the internet.
So compelling.
Thank you so much.
We've been speaking with computer scientist
Boaz Barak about cryptography in the modern era. Boaz, thank you for joining us on The Joy of Why.
Thank you very much.
Thanks for listening. If you're enjoying The Joy of Why, and you're not already subscribed,
hit the subscribe or follow button where you're listening. You can also leave a review for the show. It helps people find this podcast.
The Joy of Why is a podcast from Quanta Magazine, an editorially independent publication supported by the Simons Foundation. Funding decisions by the Simons Foundation have no influence on the selection of topics, guests, or other editorial decisions in this podcast or in Quanta Magazine.
The Joy of Why is produced by PRX Productions.
The production team is Caitlin Faulds, Livia Brock, Genevieve Sponsler, and Merritt Jacob.
The executive producer of PRX Productions is Jocelyn Gonzalez.
Morgan Church and Edwin Ochoa provided additional assistance.
From Quanta Magazine, John Rennie and Thomas Lin provided editorial guidance
with support from Matt Carlstrom, Samuel Velasco, Arlene Santana, and Megan Wilcoxson.
Samir Patel is Quanta's editor-in-chief.
Our theme music is from APM Music. Julian Lin came up with the podcast name. Special thanks to Bert Odom-Reed at the Cornell Broadcast Studios. I'm your host, Janna Levin. If you have any
questions or comments for us, please email us at quanta at simonsfoundation.org. Thanks for listening.