The Infinite Monkey Cage - Big Data

Episode Date: July 16, 2018

Brian Cox and Robin Ince are joined on stage by Danny Wallace, mathematician Hannah Fry and science writer Timandra Harkness. They'll be going big on Big Data, and asking just how big is it? They'll be looking at where Big Data comes from, whether we should be worried about it, and what mysteries are hidden within the seemingly endless amounts of information that are collected about us as we go about our daily lives.

Transcript
Starting point is 00:00:00 In our new podcast, Nature Answers, rural stories from a changing planet, we are traveling with you to Uganda and Ghana to meet the people on the front lines of climate change. We will share stories of how they are thriving using lessons learned from nature. And good news, it is working. Learn more by listening to Nature Answers wherever you get your podcasts. This is the BBC. Hello, I'm Robin Ince. And I'm Brian Cox. Today's show is all about big data and how the information you unwittingly
Starting point is 00:00:40 offer up every minute of your life is being used to shape the environment around you and even to decide what you need. You will know this if you've ever done internet shopping, that suddenly that algorithm... I looked up Brian Cox once, and ever since then I'm inundated with adverts for How To Be A One-Hit Wonder, 1990s Club Classics Volume 17, and Astronomer's Finger Wax, which I don't even really know what that is. It just says, with Astronomer's Finger Wax, you can point to anywhere in the universe with confidence. May not work at zenith.
Starting point is 00:01:13 You're right, you said that the zenith bit would be quite a low laugh. Your prediction was good. A niche joke for astronomers. How many astronomers in the audience? Noms. To be honest, we got more noms than we deserved. Well, I put your name into a search engine anyway, and all I get now is adverts for cardigans.
Starting point is 00:01:31 You would. So, 40 years ago, you would know if someone was collecting data from you, because you would be stopped in the street by a man in a mac who would be saying something like, excuse me, have you got five minutes for me to ask you about your thoughts on Homepride cooking sauce and the education policy of Barbara Castle? To which I would normally reply, you'd have to be quick, Rumbelows closes at five and I've just blown a valve. We really debated about the Rumbelows reference.
Starting point is 00:01:54 But we felt that to say Rumbelows would somehow find some of our core demographic and take them to a place of delight. Every time you use your loyalty card or your Oyster card, search online or even switch on your phone, information about your behaviour is collected. But where does it go and what is it used for? To answer those questions, we are joined by at least 64 petabytes of pure information,
Starting point is 00:02:17 and they are... My name is Dr Hannah Fry. I'm a mathematician from University College London and the author of a new book on algorithms and data called Hello World. And the most interesting thing that I have discovered on the internet is that France was still using the guillotine when the first
Starting point is 00:02:32 Star Wars film came out. Hi, I'm Timandra Harkness. I'm a lapsed comedian. I'm a mediocre mathematician, but I am the author of Big Data. Does size matter? That's the correct response. And the most interesting thing i found on the internet is that the very first ever registrar
Starting point is 00:02:53 general of births marriages and deaths for england and wales who was a man called thomas henry lister also wrote a romantic novel called granby in which the eponymous hero's love for miss german overcomes her parents' opposition when he is revealed to be the secret heir to Lord Malton. Hello, my name's Danny Wallace. I'm a writer and a presenter. And what I find interesting about the internet is when big companies get their sort of web addresses wrong, their websites, you know, they've got to come up with something good and they've got to come up with their URL,
Starting point is 00:03:26 and they just think, that'll do. Which is why the custom pen manufacturer, Pen Island, when it's all typed out... LAUGHTER ..you are way ahead of me... ..becomes Penisland.net. Or the company Internet Protocol Anywhere, which, until they noticed, you could find at ipanywhere.com.
Starting point is 00:03:48 So it's just a way of bringing in new audiences, I think, for their products. And this is our panel. Timandra, we'll start off with a definition, because this is... I mean, big data is quite new to everyone on this panel. When we were born, this was nothing that would be referenced. It is a new idea. So what is big data? Okay, well, obviously it's big. It's actually quite hard to specify how big, because the amount of data in the world proliferates so quickly that if you give it a figure one year, it's out of date.
Starting point is 00:04:24 To give you an idea, I interviewed a brain scientist called Professor John D. Van Horn, and he said that when he got his first post-doctoral research job, the lab sent him out to buy the biggest hard drive they could afford, because they were doing brain scans, which is a lot of data. And he brought it back, and people from the other labs in California came to look at this hard drive, because they'd never seen one so big. It was four gigabytes. And I'm talking to him on the phone thinking, your photo on the internet makes you look quite young, but how long ago was this? Because my phone has eight gigabytes. So it is big, there's lots and lots of data, but is there more to it than that? Yes, and I actually came up with a backronym for big data, which, as you all know, is an acronym where you've
Starting point is 00:05:13 reverse-engineered it to get the word you wanted. That's the only thing anyone's going to remember about what I say tonight, a backronym. So, big data. Big: big data is big, obviously. D: it's got lots of dimensions, so you've got lots of different data sets. So perhaps, you know, you don't just... Well, another brain scientist, Professor Paul Matthews, said, yeah, okay, brain scans are great, but I'm much more excited if you get the brain scans and the medical records and the postcodes where the patients have lived and the weather reports for those postcodes when they lived there, and I put them all together, and then I can study the effects of sunshine on the progression of multiple sclerosis. And he said, that's big data. If I just have brain scans,
Starting point is 00:05:56 that's just large data. So it's got different... Yeah, that was what he said. So it's got different dimensions. It's collected automatically, that's the first A. It's collected pretty much in real time, it's the T, which means you can also then project it forwards in time and use it to make predictions. And the second A is for AI, because you basically use artificial intelligence-type programs to process it. So that's big D-A-T-A. So, Hannah, we've spoken there about, I suppose, collecting data sets in a way that is, well, understandable in a way, so it's weather data or brain scans.
Starting point is 00:06:33 But can you give us an overview of the totality of the amount of data that's being collected? Yeah. Well, in a way, actually, I don't know if I totally agree with everything that Timandra said, because I think that actually we have had big data in the past. I think, you know, the census, for example, is this connected data set that tells us, you know, different things about one person and lots of things about everyone, and, you know, each one of us has contributed to it across the entire country. I mean, that really sort of is big data. But I think what has changed is the types of data that we're now collecting. I mean, you know, you don't need me to tell you how much data your phone can collect just by you walking around. You know, you can have a heart rate monitor on your wrist, you've got, you know, your lights
Starting point is 00:07:16 coming on and off, that's all being recorded, everything. I mean, basically every single thought I ever have I practically type into Google. You know, there's just a catalogue... the range of different stuff that we know about people now, that is different. And you said it's being recorded almost in passing, but is it? Are we now to take for granted the fact that all those things we do with the phone, everywhere that we move, every internet search that we perform, is recorded or archived somewhere? Do we have control over that? No, no, not at the moment. But there are... I think people have in the last couple of years started to wake up to the fact that actually being able to infer this much about individual people isn't necessarily the kind of society they all want to live in. Because, of course, there are things like, you know, what gender you are,
Starting point is 00:08:07 you know, your sexual orientation, but other things as well, very, very personal things, you know, whether you've had an abortion, perhaps, you know, whether you're having problems conceiving. All of these things can be inferred from your searches that you're doing online. And I think that people are starting to wake up to the fact that, actually, it doesn't feel particularly comfortable to have someone be able to know that about you. There was a story, actually, a very big story in America a few years ago. A company called Target.
Starting point is 00:08:35 It's kind of like, sort of like Woolworths, really. So it sells everything that you can imagine. You can get some, you know, grocery stuff in there, but also things for your house. And they have like a club card type system where they can track what an individual is buying. And they were doing something called basket analysis, where you look at one individual and the things that they're buying over time.
Starting point is 00:08:56 And this statistician realised that there were some clever tricks that you could use to work out whether or not someone was pregnant based on what they were buying. So not the obvious stuff, not when they're buying nappies and cotton wool, but when they're buying unscented body lotion when they're in the second trimester. And often that would be preceded by someone buying vitamins and stopping buying alcohol. And you could even kind of predict the exact moment that they were going to give birth by the things they would go on to buy.
Starting point is 00:09:27 So what the company decided to do is they set up this pregnancy predictor, right? So if you went past a certain threshold, they would assume you were pregnant and they would send out a series of coupons to you in the mail. Just to, you know, capture your customer early and lock you in so you're, you know, a target customer.
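A minimal sketch of the kind of threshold-based basket scoring Hannah describes, for readers who want a concrete picture. Every product name, weight and cut-off below is invented for illustration; the real Target model is not described in the episode.

```python
# Toy illustration of a threshold-based "pregnancy predictor" built from
# basket analysis. All signals, weights and the threshold are invented.

PREGNANCY_SIGNALS = {
    "unscented body lotion": 2.5,
    "prenatal vitamins": 3.0,
    "cotton wool (extra large)": 1.5,
    "stopped buying alcohol": 2.0,   # a derived signal, not a product
}

SCORE_THRESHOLD = 5.0  # hypothetical cut-off for mailing coupons


def pregnancy_score(basket_history):
    """Sum the weights of any signal items seen in a customer's recent baskets."""
    return sum(
        weight
        for item, weight in PREGNANCY_SIGNALS.items()
        if item in basket_history
    )


def should_send_coupons(basket_history):
    return pregnancy_score(basket_history) >= SCORE_THRESHOLD


if __name__ == "__main__":
    history = {"unscented body lotion", "prenatal vitamins", "bread"}
    print(pregnancy_score(history))      # 5.5
    print(should_send_coupons(history))  # True
```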
Starting point is 00:09:45 And that was all fine. Don't necessarily have a problem with that. Except that in 2012, a father of a teenage daughter walks into a store in Minneapolis and was outraged that his daughter had been sent this pack of coupons. And he was like, you're normalising teenage pregnancy. This is outrageous. So the store apologised and then called his home later to follow up on that apology.
Starting point is 00:10:08 And by the time they managed to call him, he said, you know what, actually I've had a chance to have a chat with my daughter, and it turns out that she was pregnant. Found out through coupons that were mailed through the post. So I think this is something that people are really kind of waking up to, that actually it does make us feel quite uncomfortable to have people know that much about us. Can I ask you a question on behalf of my mum? Yes. My mum, she doesn't know how this is happening or why, but the universe
Starting point is 00:10:35 seems to be telling her that she needs a new mattress. And, like, wherever she goes, whatever she does, the same mattress is targeting my mother, and she feels victimised by this mattress company, and she says she hasn't been Googling mattresses, and I believe her. What's going on, and should she buy one? Don't know if I can help you on the latter. Right.
Starting point is 00:11:02 But, you know, there are things like maybe she's Googling insomnia if she's having trouble sleeping or back pain. I mean, there are all of these things that are just loosely associated that don't directly say... I mean, not everyone who Googles back pain needs a new mattress. Maybe your mum doesn't either. But there's just enough... If you do it to enough people,
Starting point is 00:11:22 the chances are that you're going to increase your sales. Twitter thinks I'm a man. And I know this not only from all the beard care products that it advertises to me. And the videos of buff guys working out, which is actually okay. But you can go on Twitter, you can see what it thinks about you. It thinks I'm aged between 13 and 55. It's got that right, anyway. And it thinks I'm a man.
Starting point is 00:11:47 And I don't know why, because I've never told it. So, you know, it's not infallible, which I think is... there's some hope. It's not always right. I feel bad that as a man I'm not getting those things. What do you get on Twitter? Well, I never get buff men working out. I don't know what you've been Googling.
Starting point is 00:12:04 But this is... Well, I don't know. Have you been Googling... Have you been Googling buff men working out? That'll be it. I do remember, actually, my mother saying that she'd been looking for... There was a particular scene from a movie, and it was a movie about a prison, and there was this very moving, emotional scene in the shower where, I think, it was Edward Woodward acting, and he was acting really well. And so she Googled prison shower scene, and she said she didn't find Edward Woodward, and a lot of the acting wasn't that good, it was quite blurry, and you couldn't really tell what was going on at all.
Starting point is 00:12:53 and IP anywhere. Hannah, we're talking about human behaviour here. So you gave an example, actually, of a single individual in a case where you can predict something about them. How accurately can we predict individuals' behaviours? And then, I suppose, groups of individuals, does that become easier to predict? Yeah, well, so groups of...
Starting point is 00:13:17 You can certainly look at what a lot of people are doing. And actually, there was... In getting my notes ready for this, I was researching something on my work computer that was slightly regretted, because it's quite an interesting story about Pornhub. So during the Hawaii... We don't know what that is. What is... I mean, it's got a backronym somewhere, I'm not sure. So during the Hawaii missile alerts, Pornhub have released the data of how much they were being used in Hawaii at the time. And as the alert came out, the first text message saying, you know, there's going to be
Starting point is 00:13:58 a missile coming, there was an 80% drop in usage. Still not down to zero. Still 20% of people there. They were too busy to read their text messages. That's why. But as soon as the follow-up message came through to say that everything was all okay, Pornhub then spiked to 50% higher than normal usage
Starting point is 00:14:22 for that time. What a way to celebrate. Exactly, exactly. But I mean, you can certainly observe how people are behaving, especially at the level of a population like that. In terms of predicting what an individual will do, you can do a better job with algorithms and with data than you can just by guessing, just by another person sort of trying to make a prediction.
Starting point is 00:14:42 But you can't get absolute perfection. And that, I think, is one of the slight concerns, really, about all of these algorithms being used. So to give you an example, algorithms based on data of people's pasts are being used to try and predict whether or not they'll go on to commit a crime. Now, this is in a particular scenario, so when a judge is trying to decide whether to give someone bail or not, for instance, but also now, increasingly, if they're sentencing an individual and sending them to jail. And this is something that, I think, you know, the whole sort of big data community has really been
Starting point is 00:15:19 tussling with. Because if you're a judge, you sort of do need to make a prediction. You need to make a prediction of whether letting someone out, and, you know, giving them their freedom before they face trial, is a good thing. You know, you've got to make a prediction of whether they're going to, you know, betray your trust and break the conditions of their bail. So in some sense, actually, an algorithm that makes that prediction better is better. But are they better? And also, how would you feel about being sentenced because people like you in the past went on to re-offend, or didn't go on to re-offend? Because what categories are they using? Durham Police are working on their own version of this, which,
Starting point is 00:16:05 to do them credit, they've looked at the American ones, which have been subject to some controversy and gone, we want something to help us make this decision whether to kind of keep people out of jail and put them onto a rehabilitation scheme instead. But we want it to be transparent. We want everybody to know what's going on, how they're being judged. But the biggest predictor of whether they're going to re-offend turns out to be the postcode. So, I mean, I don't know what kind of area you guys live in, but do you want to go to jail or not on the basis of your postcode? Is that fair?
Starting point is 00:16:36 Danny, how do you feel about this collection of information? I mean, are you careful? Are you canny when you are on the internet, for instance, and, you know, certain things, you know, ticking a box, whatever it might be? Are you methodical, or do you kind of worry about... No, in a weird way, I find it kind of comforting, in a strange way. It's that kind of, you know, the electronic version of the nanny state kind of looking after me, making sure my mother has a mattress, things like that, reminding her.
Starting point is 00:17:06 I worry, though, about the next stage. You know, we've all got all these devices that have microphones in them. And we've got one of those robot ladies in our house. We have a little box, and you can talk to her. And you can just go, you know, hello, what's the weather like? And I don't know why I ask her that, because I've got windows. I'm able to, you know, do that for myself.
Starting point is 00:17:33 You know, is that how more information will be got? Sorry, what is this? Because I... When you say that, I'm confused as well. I was just seeing one of those old-fashioned barometers where either someone comes out with an umbrella or a lady comes out because it's sunny. So this doesn't sound very modern at all.
Starting point is 00:17:49 That's exactly what it is. No, you know, like Siri or Alexa. Now I've come up with two names. I can say them because it's all branded, isn't it? But, you know, you can go, Alexa, do this for me. And she will. She's very obedient. But, I mean, although there was a story not long ago about these
Starting point is 00:18:06 Alexas in particular, or Alexi, seeing as I'm already a four, where every now and again, you know, they'll only really respond if you say their name. But just recently they've started to, just every now and again, when you're having a conversation with someone else, or, terrifyingly, in the middle of the night, they begin to laugh maniacally. Which is not what you want to... You know, we could do with a few in here. But you don't want to hear that
Starting point is 00:18:34 at sort of three in the morning, just hearing a disembodied voice, a woman just laughing downstairs, or just mocking you. So I start to think, you know, there are bugs in this, but could this be the sort of the next step where we just accept that these things are always on and then your smart TV is communicating with your phone and your phone is talking to Alexa
Starting point is 00:18:51 and they're all going, his mum wants a mattress. See, this sounds like it has... In terms of AI, I don't know how you feel about it, Timandra. If we now have these machines which have reached the stage of, you know, mocking us and probably also understanding irony, have we now got... have these machines reached that point of passing the Turing test? It's like that moment where it's not the chess machine winning,
Starting point is 00:19:13 it's when the chess machine gloats as well. Well, that I think... Apparently there is somebody in London actually working on trying to get AI to understand irony. And I'm like, don't do it, don't do it. Because when they do become more intelligent than us, our only hope will be in irony. We'll live underground and have a secret language based on sarcasm.
Starting point is 00:19:33 Because it is the only thing they don't understand. I mean, now they can do stairs. I once talked to a roboticist and I asked him that kind of, you know, that question that everyone sort of ends up asking a roboticist which is, you know, will they rise up against us? And he sort of laughed and said, probably. And I said, well, what
Starting point is 00:19:54 are we doing about this? And he didn't really have many ideas and he just sort of went, well, at the moment we're making sure that the off switch is quite readily accessible. I feel safe. But in a sense, that's the wrong question because I actually had really funny conversations when I was out in California
Starting point is 00:20:14 at Google HQ for an event and really spookily, I basically, I was going out there for this Google event, at the last minute I stuck my radio recording kit in the bag, just on a whim and I landed at the airport and I got this message from the BBC saying, this is a few years ago, saying, yeah, no, get in touch with us urgently, because we want you to co-present this programme about the singularity, like the super intelligent AI that's going to be more intelligent than us.
Starting point is 00:20:39 And I went, that's quite spooky, because I'm basically on my way to where it probably already is. And so I was talking to people in Google and saying, you know, how long do you think we've got before the super intelligent AI? And several of them said, well, we're in it. You know, it's already here. Look, think about it. If you were a super intelligent AI, would you burst on the scene going, now you are mine, humans? Or would you quietly sit in Silicon Valley, attracting really clever people, giving them nice food and drink, getting them to service you and bring you all the data that
Starting point is 00:21:12 you need, and then maybe some robotic cars, and maybe some drones, and who knows what else? You know, that would be what you would do. And I'm sitting there in Google, in the restaurant, talking to this guy, going, but that means it can hear what we're saying. I actually don't think it's already here, but, you know, if it's going to come, it'll come like that. It'll come with clever people in Silicon Valley going, here it is, we've built it, isn't it great? I think we're quite... I think calling it AI... Well, I just think actually I have a slight problem with the label AI, full stop. I think what we've seen recently is a revolution in computational statistics, not a revolution in intelligence.
Starting point is 00:21:50 And I admit that that is nowhere near as sexy. Statisticians of the world unite. Unless you really like statistics. Statisticians of the world unite, you have nothing to lose but your Markov chains. All right? That's one of the things, actually, for both you and Hannah, I imagine. I found a quote about this which said, about where you are with mathematics and
Starting point is 00:22:12 statisticians, it's like an arms race to hire statisticians nowadays, mathematicians are suddenly sexy. I mean, in terms of big data, at that moment, going... it's changed people's conception of both mathematics and mathematicians, hasn't it? Yeah, I think so. I mean, I think suddenly we can use all of these techniques that we were able to use in science for so many years, and now suddenly we can apply them to ourselves. And Timandra's making a face. Should we? That's the question.
Starting point is 00:22:40 Because it's one thing to, like, you know, like Brian does underground in Switzerland, applying big data and a Large Hadron Collider to protons, isn't it? Anyway, those subatomic things. To be honest, he's not as involved as he used to be. Oh, well... Yeah. Telly or great discoveries?
Starting point is 00:22:57 Ooh. I'm resisting. We get on very well, by the way. That's why I said that. Because one little look, as if to go... I'm resisting the urge to deflect the programme from the subject into the... I'll talk about particle physics all night.
Starting point is 00:23:13 But that's the natural world. The natural world is absolutely... Maths is brilliant at studying the natural world, and statistics, but human beings, we are part of the natural world, but we're also not part of it. Like, we don't just behave like particles. And, you know, we have free will and we're awkward and we do things for reasons that we then have to explain
Starting point is 00:23:35 and people don't understand. And I get squeamish about saying, oh, well, you know, you can model how people behave, and we basically behave the same as, I don't know, ball bearings. There's a difference here, though, isn't there? Because we've talked about... it sounded quite sinister, actually, when you talk about predicting whether someone is likely to re-offend. But in terms of groups of people, in terms of movement of people through shopping centres or cities, then I think it sounds less sinister and more sensible, doesn't it? Yeah. Is it easier to take, to say,
Starting point is 00:24:05 well, I suppose to predict how crowds will behave in certain environments rather than individuals? Yeah, it's a perfect example. You know, you take something like a transport system, like the tube network, say. You know, it's actually really important to have a really clear idea through data of how people are using your system,
Starting point is 00:24:25 where they're going, when there's a problem, where they redirect to. You know, it's absolutely integral to getting something that works efficiently. And I think that you can say the same thing about, you know, to a degree, and this is where Timandra and I disagree, but I think to a degree,
Starting point is 00:24:42 you can say the same thing about making, you know, your policing as efficient as possible, and working out where the best places to place your forces are in the city. And I think, you know, actually across the board really, you know, in health care, I think in... yeah, everything, every system that humans are part of, I think that we can learn about ourselves by thinking of ourselves through the eyes of data and make it more efficient. Timandra, do you feel there's a difference between using big data to predict the way that crowds will behave, or large groups of people, and individuals?
Starting point is 00:25:17 Is it because you said you were concerned, but is it really about the individual rather than group behaviour? It really is. I mean, that's the root of it, is that... I mean, Hannah's right. It can be really, really useful to look at how people behave en masse in order to find solutions en masse. And, you know, in crime, for example, if you did find a postcode, which happened to have a lot of criminals, it would be useful to go, that's weird, what's happening there?
Starting point is 00:25:41 Is it particularly deprived? Is there something we can do en masse, on a population level? Where I would get really worried is when you then jump and go, okay, well, if we know that 30% of the people in this postcode will end up unemployed, then you look at an individual and go, you're 30% likely to be unemployed, so we are going to treat you as if you are basically a potential unemployed person, without any regard to you as an individual and what you might think and want and do. So that is part of it.
Starting point is 00:26:14 But I do... I'm also a bit squeamish about the idea that you kind of guide us without consulting us. If you look at, I don't know, wanting us all to live healthier lives and walk more and get the bus less, and you go, okay, well, what we'll do is we'll redesign the system so that you have to actually really put yourself out to get a bus, and we're just going to nudge
Starting point is 00:26:36 you into walking, without ever saying to you, do you want to walk more or not? And that's another thing I just get a bit... You know, I mean, you, Danny, you seem quite happy with the idea of the nanny state doing things for your good. I'm a bit more like, well, ask me. You know, I might want to walk more. But I do think it should be my decision, not just kind of nudged into it. I mean, smart cities... oh, smart cities, and everything works really efficiently and it's a great system. But the problem with a smart city is it seems to kind of assume that we're dumb and the city is smarter than we are.
Starting point is 00:27:11 And I don't like that. I was actually at an event in Hong Kong. Robin was there as well. And it was a trade panel and we were talking about things. And then there was a Hong Kong entrepreneur, a property entrepreneur there. And he sort of woke up. I'd said something, I made some joke about Brexit or something, I can't remember what it was, and he looked up and he said, we're going to beat you, in China, we're going to beat you, and we're going to... And the reason is that data in China is owned by the government,
Starting point is 00:27:40 so you do not have the right to restrict data, for example, about your movement through a city. And he was using that as the example. So a city like Shanghai, for example, his assertion was it will be a better city, it will be more efficient, because the data is freely available to the planners of the city, and therefore you can build a better city,
Starting point is 00:28:01 which is, you can see the point. And I suppose really what we're talking about here is we should separate the two in the discussion, really, from what we can do with big data and then the oversight that government has. Yeah, what we should do with big data. And actually, China is a very interesting example because China has had ID cards for a really long time, which obviously we rejected in the UK several times. But essentially
Starting point is 00:28:27 that means that the database of ID cards means that the government knows what everyone's face looks like, right? It owns that data on everyone's face. So facial recognition software is now widespread across China. There's even an example, actually, in some toilets in Beijing, where the facial recognition system would notice you as you went in, and then, if you came back... It would only release, I think, 60 centimetres of toilet paper, right, every time, and if you came back within nine minutes, it would lock off all of the toilet paper system, because clearly toilet paper theft within this particular toilet in China was so extreme that they needed to register your face. That's a terrifying moment in 2001.
Starting point is 00:29:14 That's what Kubrick did. I'm not going to give you any more toilet paper, Dave. I'm not going to open the toilet door, Dave. You could have just had some dodgy sushi or something. Well, I want to ask Timandra if you have examples of extremely positive outcomes from looking at big data sets and analysing data. Oh, definitely. And, you know, it is true that obviously it can be used to make things more efficient. I would never say we shouldn't use it. I think it really is all about oversight.
Starting point is 00:29:40 And we could, for example, contrast, well, China, where they're using facial recognition everywhere and have no compunction, but even the cities in Europe, which are introducing all sorts of different smart systems where different private companies
Starting point is 00:29:59 are just gathering up a lot of data about individuals and nobody even knows what they are, with the city of Oakland in California, where the citizens basically heard their council had got a federal grant to put in a very integrated surveillance system, with facial recognition and number plate recognition and all sorts of things, and when they asked, can we just ask, where is your privacy policy on this, and when the council went... they had a big campaign.
Starting point is 00:30:36 which includes citizens and civil liberties organisations. And every time they want to bring in new technology, they sit there and go, OK, well, what do you want the data for? What are you going to do with it? How long are you going to keep it? And it has democratic oversight. And I think that's perfect because they still get the benefits of the technology, but they know what it is. But if you want to talk about how great big data is and what it could do, my favorite example, my kind of poster boy of big data, is a professor in Southern California called Eamon Keogh.
Starting point is 00:31:07 And he's a professor of electrical engineering, but what he works with is insects. And I said, well, what's the connection there? And he said, well, you know how when your emails come in, you've got an algorithm that sorts them and gets rid of the spam and can forward emails automatically? I want to do that with insects. I want to be able to delete them
Starting point is 00:31:25 and forward them. And I'm afraid my first thought was, great, so you could forward wasps to somebody else's office. That wasn't what he meant at all. And he's basically using big data to classify insects. He's got, like, a global database of insects based essentially on the sound that the wings make, but to not have background noise, he's got this mad little device using lasers and photodiodes. So you've got this red light falling on a light gate, and it produces an electrical signal which you can turn into sound. And when an insect flies through that, or anything interrupts it, it interrupts the electrical signal, it makes a sound.
Starting point is 00:32:05 So essentially, if an insect flies through this light gate, then the electrical signal is the sound of its wings, but without any background noise. And he's used big data techniques with millions and millions of these recordings of insects all around the world to classify this sound as this species of mosquito. And there are something like 5,236 species of mosquito?
Starting point is 00:32:27 That's right. And it can not only tell the species, but tell whether it's male or female, and whether it has already sucked blood from some creature. And so you could track, like, are the Zika-carrying insects moving across Africa? We can trap all the ones... There's all sorts of things you could do with that to control insects,
Starting point is 00:32:51 to know where they are, to know what diseases they're carrying. And that just made me go, this is it. This is a great use of this technology. We can understand insects. We can do something about the diseases they carry or the crops they eat. So he's my kind of big data hero.
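A rough sketch of the classification idea just described: take the light-gate signal, find the dominant wingbeat frequency, and look it up against species bands. The frequency bands below are invented placeholders; the real system relies on millions of labelled recordings and far richer features than a single frequency.

```python
import numpy as np

# Sketch of wingbeat classification from a light-gate signal.
# The species bands are invented placeholders, not real values.

SPECIES_BANDS_HZ = {
    "mosquito (female, hypothetical species A)": (450, 550),
    "mosquito (male, hypothetical species A)": (650, 750),
    "housefly (hypothetical)": (180, 220),
}


def dominant_frequency(signal, sample_rate):
    """Return the strongest frequency component of the light-gate signal."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    spectrum[0] = 0.0            # ignore the DC component
    return freqs[np.argmax(spectrum)]


def classify(signal, sample_rate):
    f = dominant_frequency(signal, sample_rate)
    for species, (lo, hi) in SPECIES_BANDS_HZ.items():
        if lo <= f <= hi:
            return species, f
    return "unknown", f


if __name__ == "__main__":
    sr = 8000
    t = np.arange(0, 0.5, 1.0 / sr)
    fake_wingbeat = np.sin(2 * np.pi * 500 * t)   # synthetic 500 Hz wingbeat
    print(classify(fake_wingbeat, sr))
```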
Starting point is 00:33:12 What about the individual insect, though? You shouldn't label them in the letter. That's just because they're insects. I don't care about them. I didn't mean that. I suppose there's a real challenge here, because obviously collecting large amounts of data for no apparent reason at first sight can be problematic. You might think someone should define what they want to do with it before it's collected, but of course much of the opportunity is in finding patterns in the data, isn't it? So it's
Starting point is 00:33:38 really... it's not only the data you collect, but what you do with it, and finding new ways of interrogating the data. Yes, no, I totally agree with that. I think people who work with data are, you know, notoriously greedy. You know, give us everything and we'll find something. And things are changing, though. I think that, you know, there's a new bit of European legislation that is now out that changes the sort of ownership of data from the company, and shifts it slightly more into the hands of the individual. It's called GDPR. That should just give us a little bit more control over what companies know about us, and in particular, on that point that you just made there, of we need to know what our data will be used for, a complete list of what it will be used
Starting point is 00:34:27 for, before they're allowed to own it. Is that a good thing, though? Because part of the opportunity, I suppose, is to find patterns. So, just to invent something, it may be that people who engage in a certain activity are more likely to develop heart disease or something, but it might be something very unexpected, like eat too many apples, I don't know what it is, but something that no one ever suspected. So do we close off the possibility of making really important public health discoveries if we restrict the usage of data? It's a very, very difficult question without an easy answer, but I think one thing that I do know... So there was an example of where all health records... She's wondering. Maybe you know this one slightly better than me,
Starting point is 00:35:12 the Royal Free example. Oh yeah, the Royal Free Hospital set up a partnership with Google DeepMind, which is the sort of artificial intelligence, although not to Hannah's standards, but the AI program that beat the world champion at Go. And they set up a partnership so that the patients' health records could be analysed using a new system Google was developing, to basically just move data around within the hospital more efficiently. It was fairly innocuous, what it was doing. I mean, other projects they're doing are about finding
Starting point is 00:35:50 patterns in the data, but this one really was just about moving it around more efficiently and tracking it. But they didn't ask the permission of the patients to do this with the data. The hospital went, well, they've signed up to be our patients, they must be fine with it. And afterwards there were a lot of wrists slapped, because the Information Commissioner's Office said, no, you really should have said to the patients, are you okay with us giving your data to Google DeepMind, because it's an outside organisation. And I think there's a really important trust question here. I'm actually really in favour of us sharing our health data for everyone's benefit. I mean, it is a real example of we could all benefit enormously from sharing health data, and, like you say, Brian, suddenly finding that, oh, actually, there's a relationship
Starting point is 00:36:38 between this and this, and this could be really important. But in order to do that, we have to feel that we trust the organisations that are using it, and that they have our best interests at heart. So, you know, we all love the NHS, and it saves everyone's lives and so on, but there are also cases where the NHS says, well, I'm sorry, we were short of money, so fatties and smokers, you're to the back of the queue when you need an operation. So you can imagine, if there's this great health data sharing research going on, and they suddenly go, oh, well, sorry, Brian, your store card says that you bought pizza every week for the last 15 years, and so you're not getting that operation. My name is not Brian, and I don't know where you got that information from.
Starting point is 00:37:23 And that's why this is terrifying now. You've conflated the two things. Danny, would you... I suppose there are two things here. One is that you can... Huge amounts of data about you could be collected. But if it's anonymised, would you care? Is it really the personalisation,
Starting point is 00:37:43 the identification of you with that data that matters? Oh yeah, no, I think we're all, you know, we're all very protective of who we are and what we get up to, even if what we're getting up to is fairly innocuous. But if there's a greater good... I think the health thing is, as you know, the best example possible. We've all, you know, you have apps on your phone that tell you kind of how many steps you've taken and where you've been, and you can add all this other stuff in. And if you're adding all this information in there, and it can be a central database that can look at these patterns and, you know, track your health and see, you know, what troubles you've come up against, then, like you say, you know, if they can find
Starting point is 00:38:17 new treatments or patterns that have never been spotted before that help the greater good, then absolutely. But yes, we all want to keep that to ourselves instinctively. But the thing that I'm unsure of is, how easy is it to actually remove the anonymity? Because I was reading an article the other day which was about, apparently, the system that was used in New York taxis, and it was just to see about the routes of taxis and, you know, the pay that was given on different routes. And some journalists managed to work out which ones... how much different celebrities...
Starting point is 00:38:51 Yeah. Now, that seems incredible because... That was very clever, actually. Well, the way that they undid all of the data was because there was a very weak encryption. The data was released for all of the yellow cabs in New York across a year, so that people could do these beautiful visualisations, work out efficiency in the city, so on and so on and so on.
Starting point is 00:39:08 But they put a very weak encryption on it, so you could work out what cab was what at what time. And then someone else realised that if you took paparazzi photographs of celebrities getting into cabs, if you could see the registration number of the taxi and know what day it was on, you could work backwards. You put those two data sets together and work out often where celebrities lived, but also exactly how much they tipped. Isn't this also called stalking? Yeah, well, yeah, it totally is.
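A minimal sketch of why that "very weak encryption" failed. In widely reported accounts of this story the taxi identifiers were hashed without a salt, and because the space of possible medallion numbers is small, every hash can simply be enumerated and reversed; the medallion format below is simplified for illustration.

```python
import hashlib
import string
from itertools import product

# Sketch of why "anonymising" a small identifier space with an unsalted hash
# fails. The medallion format is simplified; the point is only that a small,
# enumerable space can be exhaustively hashed in seconds.

def medallion_candidates():
    # Simplified format: one digit, one letter, two digits, e.g. "7B42".
    for d1, letter, d2, d3 in product(string.digits, string.ascii_uppercase,
                                      string.digits, string.digits):
        yield f"{d1}{letter}{d2}{d3}"


def build_lookup():
    """Map every possible hashed medallion back to its plaintext."""
    return {
        hashlib.md5(m.encode()).hexdigest(): m
        for m in medallion_candidates()
    }


if __name__ == "__main__":
    lookup = build_lookup()
    # Pretend this hash came from the "anonymised" public dataset.
    leaked = hashlib.md5(b"7B42").hexdigest()
    print(lookup[leaked])   # -> "7B42": the anonymisation is undone
```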
Starting point is 00:39:38 But I also think that there's this idea about choice, and I think that we're slightly kidding ourselves if we think that we have much choice in this matter. Timandra and I, about a couple of months ago, went to a crypto party. I don't know if you've ever been. We go to the best parties. Now, the audience at home don't know this, but if you could sense the envy going
Starting point is 00:39:58 on in the studio at this moment, it is palpable. Keep it below the surface. Crypto parties... I didn't know what it was until I went to one. Crypto parties are where you go and people teach you how to hide from everything. So it was people showing you how to have an operating system that only exists on a USB key, so that you can take your whole computer with you when you leave and no trace of you will remain. How to use the dark web, how to
Starting point is 00:40:27 change all of the settings on your phone so that no one could track you. And it was very interesting. I went, and I was researching my book. Timandra, I think, a similar story. I couldn't help looking around the room and thinking, what have these people got to hide?
Starting point is 00:40:44 Paranoid parties are the best parties. Come on, wouldn't you want to go to a party with people that have something to hide? Surely. Turn up, it's just an empty room. No, I think that's unfair. When people say, you know, nothing to hide, nothing to fear, I say, nothing to hide, you haven't really lived.
Starting point is 00:41:09 Everyone should have something to hide by the time we're adults but i also think that getting together in a room and going we're going to be really technically technical and and spend ages changing our settings is not really the answer because you're not actually helping everybody do that i think the answer is just to make everything more transparent so we can genuinely choose do we want our day to be part of this and also just to generally say what do we want it to be used for do we want to be like china where every individual can be tracked and everyone can be given a credit rating or a social credit rating based on how well behaved they are how polite they are uh which could affect their chances of getting loans and things or do we want to kind of draw some lines and say okay gchq you can you can hack into my phone in order to save me from being blown
Starting point is 00:41:53 up by terrorists but you can't hack into my phone to check that i'm not letting my dog poop where it shouldn't that's the line is it dog pooping. Two very different Liam Neeson films there. So we asked the audience a question, as usual, and today the question was, what is the strangest question you have ever asked the internet? And I can tell you now, this is the largest number of answers we've had to throw away due to the fact that it's just not suitable for 4.30.
Starting point is 00:42:28 What is the strangest question you've ever asked the internet? Why am I so inexplicably attracted to Brian Cox? Oh, Dominic, it's very explicable. Where is the hippo in Hippocampus? I really did. Danny, you got some as well? Yeah, Katie Adam, this is a great question. What is the capital of space? Brian? Well, it's, there isn't a centre to the universe.
Starting point is 00:42:58 It's the ultimate Copernican principle: at all points, it's invariant, essentially, in every direction. It's all the same. Thank you. That was a lot less exciting than I was hoping for. It's homogeneous and isotropic. And almost exactly the same question here.
Starting point is 00:43:18 Is a fraggle a muppet? If there are an infinity of quantum worlds, why am I stuck in this one? I'll tell you, John, you should see the others. Honestly, it's not as bad as you think. If things can only get better, when? Have you got one? I don't know the answer to this one.
Starting point is 00:43:42 How do whales breastfeed? Why did you Google that? What, not the others, just that? I think I actually know that one. This is brilliant. What a moment for me. Well, the milk is a lot thicker and
Starting point is 00:44:01 you'll like this, more viscous. Meaning it doesn't just dissolve and go away in the water. And that's little bull whales that do that. And also they sleep vertically. Thanks, guys. So, during the show, we've been collating data on those who have been
Starting point is 00:44:24 listening via the BBC's patented Soul Thieving Conundrum Machine, or, er, P-S-I-T-I-C-A-M-M-E. Got to work on the acronym, don't you? So, whilst this show's been on, we've been using P-S-I-T-I-C-A-M-M-E to find out what you have been searching for on the internet during this broadcast, and here are the top three searches. Top three searches were,
Starting point is 00:44:42 why won't my lightsaber cut ham? Jacob Rees-Mogg Mars Base? What time does Rumbelows close today? Thank you very much to our panel, Hannah Fry, Amanda Harkness and Danny Wallace. Goodbye.
Starting point is 00:44:56 APPLAUSE In the infinite monkey cage. Turned out nice again. In our new podcast, Nature Answers, rural stories from a changing planet, we are traveling with you to Uganda and Ghana to meet the people on the front lines of climate change. We will share stories of how they are thriving using lessons learned from nature. And good news, it is working. Learn more by listening to Nature Answers wherever you get your podcasts.
