WEBVTT 1 00:00:00.120 --> 00:00:03.399 Welcome to the deep dive. We are jumping straight into 2 00:00:03.480 --> 00:00:08.400 the digital arena today, exploring the phenomenal, almost unbelievable growth 3 00:00:08.400 --> 00:00:13.320 of social networking and, well, the constant, sophisticated security battle 4 00:00:13.359 --> 00:00:15.119 that's raging just below the surface. 5 00:00:15.320 --> 00:00:18.160 It's a battle that's absolutely necessary because of the sheer 6 00:00:18.239 --> 00:00:21.280 scale we're talking about. I mean, by June twenty twenty, 7 00:00:21.320 --> 00:00:24.120 research showed the Internet had ballooned to over four point 8 00:00:24.120 --> 00:00:25.760 eight billion users. 9 00:00:25.760 --> 00:00:28.199 Four point eight billion. That's, what, sixty two percent of 10 00:00:28.239 --> 00:00:31.359 the entire global population suddenly connected. Exactly. 11 00:00:31.719 --> 00:00:35.520 And if you think about social networking as the central 12 00:00:35.600 --> 00:00:39.320 way modern humans communicate, you start to see why securing 13 00:00:39.359 --> 00:00:42.079 it is maybe the highest priority challenge we have right now. 14 00:00:42.200 --> 00:00:45.119 Right, and that massive expansion, it didn't just happen smoothly, 15 00:00:45.439 --> 00:00:48.039 and it definitely came with a security cost. So our 16 00:00:48.039 --> 00:00:49.880 mission for you today is pretty clear. We're going to 17 00:00:49.880 --> 00:00:53.560 distill the history of this explosion, pinpoint the biggest threats, 18 00:00:53.880 --> 00:00:58.359 everything from psychological tolls to high tech cybercrime. 19 00:00:57.799 --> 00:01:00.000 And then walk through the really cutting edge technical 20 00:01:00.039 --> 00:01:02.799 countermeasures people are deploying. It's this constant arms race. 21 00:01:03.119 --> 00:01:07.400 Yeah, we've got sources covering the psychology, the politics, the 22 00:01:07.439 --> 00:01:10.959 really technical stuff. 
It should be a fascinating synthesis. 23 00:01:10.400 --> 00:01:13.400 It really is. So where do we start? The origins? 24 00:01:13.519 --> 00:01:16.519 Let's do it, because if you think the story starts 25 00:01:16.519 --> 00:01:19.400 with, I don't know, the like button, you're missing a 26 00:01:19.439 --> 00:01:22.400 couple of decades. The foundations were actually laid way back 27 00:01:22.439 --> 00:01:25.719 in nineteen ninety seven. You had Six Degrees dot com. Right, 28 00:01:25.840 --> 00:01:26.480 Six Degrees. 29 00:01:26.519 --> 00:01:29.159 They actually got up to three point five million users, 30 00:01:29.359 --> 00:01:30.879 which was huge. 31 00:01:30.560 --> 00:01:35.760 Then, maybe more memorably for some folks, AOL Instant Messenger, AIM. 32 00:01:35.560 --> 00:01:38.879 Oh yeah, AIM. That was really the precursor, wasn't it? 33 00:01:38.879 --> 00:01:40.959 It brought in things we take for granted now, like 34 00:01:41.439 --> 00:01:43.719 real time chat, persistent friend lists. 35 00:01:43.920 --> 00:01:46.799 That was the blueprint. Basically, those early features paved the 36 00:01:46.840 --> 00:01:50.799 way for everything that came later. That mid two thousands explosion. 37 00:01:50.480 --> 00:01:52.959 Definitely, and two thousand and two is a really pivotal 38 00:01:53.040 --> 00:01:55.159 year in that. What happened then? Well, you got Friendster, 39 00:01:55.400 --> 00:01:57.680 which was one of those early kind of original networks, 40 00:01:57.879 --> 00:02:00.400 maybe leaned a bit into dating. I remember Friendster. 41 00:02:00.560 --> 00:02:03.760 But maybe even more impactful long term was LinkedIn, also 42 00:02:03.840 --> 00:02:04.959 launched in two thousand and two. 43 00:02:05.120 --> 00:02:07.079 Oh, LinkedIn. Okay, that's different. 44 00:02:06.920 --> 00:02:10.479 Totally different. It's the perfect example of a niche platform 45 00:02:10.800 --> 00:02:14.879 that just completely redefined a whole area of connection. 
It's 46 00:02:15.000 --> 00:02:19.360 all about professional networking, and now over seven hundred million users. 47 00:02:19.599 --> 00:02:23.520 It fundamentally changed modern recruitment. I mean, companies use it 48 00:02:23.560 --> 00:02:25.719 constantly to find people, screen candidates. 49 00:02:25.879 --> 00:02:30.080 Yeah. Absolutely, it's indispensable for many jobs now. But okay, 50 00:02:30.159 --> 00:02:33.919 scale brings problems. As these platforms got bigger and bigger, 51 00:02:33.960 --> 00:02:37.080 that initial dark side started to show up. 52 00:02:37.159 --> 00:02:39.719 Right. At first, it was maybe just, you know, data 53 00:02:39.759 --> 00:02:43.120 mining concerns, but then it quickly escalated to, what, organized 54 00:02:43.159 --> 00:02:48.199 phishing attempts, botnet attacks starting to use these platforms, malware spreading 55 00:02:47.879 --> 00:02:51.719 like wildfire. And the problems didn't stay purely technical, did they? 56 00:02:51.800 --> 00:02:54.759 They spilled over into the social realm, sometimes in really 57 00:02:54.800 --> 00:02:55.520 severe ways. 58 00:02:55.680 --> 00:02:58.840 Yeah. The sources talk about this concept called digital dramatization, 59 00:02:59.280 --> 00:03:01.800 which sounds a bit academic, but it points to the 60 00:03:01.879 --> 00:03:06.560 really serious, sometimes unintended consequences of broadcasting your life in real 61 00:03:06.319 --> 00:03:07.879 time. Well, what kind of consequences? 62 00:03:08.159 --> 00:03:12.719 Well, it covers things like cyberbullying, online vengeance, but tragically, 63 00:03:13.000 --> 00:03:17.360 it also includes things like suicides, even murders being broadcast 64 00:03:17.439 --> 00:03:19.280 live over platforms like Facebook Live. 65 00:03:19.479 --> 00:03:23.439 That's, yeah, that's incredibly chilling, just a horrific side effect 66 00:03:23.520 --> 00:03:28.080 of that instant connectivity. 
And speaking of negative consequences, let's 67 00:03:28.080 --> 00:03:31.520 pivot slightly to the psychological toll, because the research on 68 00:03:31.599 --> 00:03:33.960 why people use these networks is fascinating. 69 00:03:34.159 --> 00:03:36.039 It really is. There's a paradox, right? We call 70 00:03:35.960 --> 00:03:39.159 them social networks, but most people aren't primarily using them 71 00:03:39.199 --> 00:03:40.840 to socialize. Exactly. 72 00:03:41.120 --> 00:03:44.479 Studies show the majority use them mainly to consume information: 73 00:03:44.960 --> 00:03:46.919 scrolling, reading, watching. 74 00:03:46.639 --> 00:03:49.039 And what does that passive consumption do to people? 75 00:03:49.240 --> 00:03:52.639 Well, that seems to be the key disconnect. Researchers found 76 00:03:52.639 --> 00:03:56.639 it often leaves users feeling, I guess, unfulfilled and unsatisfied. 77 00:03:56.800 --> 00:04:01.199 You're constantly bombarded with everyone else's highlight reels: the perfect vacations, 78 00:04:01.240 --> 00:04:03.120 the amazing relationship. 79 00:04:02.560 --> 00:04:04.199 Right, the curated perfection. And 80 00:04:04.159 --> 00:04:08.080 it inevitably triggers envy, sometimes depression, and definitely that fear 81 00:04:08.120 --> 00:04:09.800 of missing out, FOMO. 82 00:04:10.159 --> 00:04:14.319 Yeah, FOMO is real. The sources even suggest that social 83 00:04:14.319 --> 00:04:16.879 networking can act almost like a new drug. 84 00:04:17.439 --> 00:04:20.040 It triggers a similar dopamine response in the brain, kind 85 00:04:20.040 --> 00:04:23.519 of like addiction, and the research points out this compulsive 86 00:04:23.639 --> 00:04:27.600 use can lead people to disengage from developing real world skills. 
87 00:04:27.360 --> 00:04:31.439 Which can contribute to bigger societal issues like unemployment, because 88 00:04:31.480 --> 00:04:34.600 you're spending so much time consuming instead of, I don't know, 89 00:04:34.720 --> 00:04:35.639 learning or doing. 90 00:04:35.759 --> 00:04:37.759 That seems to be the argument. It's presented as a 91 00:04:37.839 --> 00:04:39.680 kind of fundamental crisis of attention. 92 00:04:39.959 --> 00:04:44.199 Okay, so if that passive data consumption creates a personal crisis, 93 00:04:44.639 --> 00:04:48.680 the collection of all that data creates a potential political one. Right, 94 00:04:49.079 --> 00:04:51.759 we really have to talk about Cambridge Analytica here. That 95 00:04:51.800 --> 00:04:56.639 feels like the moment data exploitation got undeniably political. 96 00:04:56.360 --> 00:04:59.360 Absolutely, a landmark case. It wasn't just a simple data theft, 97 00:04:59.360 --> 00:05:01.000 which I think is a common misconception. 98 00:05:01.279 --> 00:05:03.480 So how did it work then? What was the mechanism? 99 00:05:03.639 --> 00:05:06.480 It actually started with academic research. Back in twenty thirteen, 100 00:05:07.120 --> 00:05:11.079 researchers at Cambridge University showed you could predict detailed psychographic 101 00:05:11.120 --> 00:05:15.160 profiles pretty accurately just by analyzing someone's social media activity, 102 00:05:15.480 --> 00:05:16.160 like their likes. 103 00:05:16.439 --> 00:05:18.920 Okay, so the potential was known. 104 00:05:19.319 --> 00:05:24.519 Yes, and then a researcher named Aleksandr Kogan weaponized that knowledge. 105 00:05:24.600 --> 00:05:27.480 He created one of those personality quizzes on Facebook. 106 00:05:27.839 --> 00:05:30.160 Yeah, remember those? Oh yeah, which Disney character are you, 107 00:05:30.240 --> 00:05:30.800 that kind of thing? 
108 00:05:30.959 --> 00:05:34.480 Pretty much. But the real trick, the sort of malicious 109 00:05:34.480 --> 00:05:37.079 genius of it, wasn't just getting the data from the 110 00:05:37.120 --> 00:05:37.800 people who took the 111 00:05:37.839 --> 00:05:40.279 quiz. Ah, right, there was more to it. 112 00:05:40.279 --> 00:05:44.199 Way more. Taking the quiz gave Kogan access not only 113 00:05:44.199 --> 00:05:47.600 to that user's personal data, but also the data of 114 00:05:47.639 --> 00:05:48.759 all their friends on 115 00:05:48.680 --> 00:05:51.199 the platform. Without the friends even taking the quiz? 116 00:05:50.920 --> 00:05:53.639 Without them knowing or consenting at all. And Cambridge Analytica 117 00:05:53.800 --> 00:05:56.360 used that massive pool of data, yours and your friends', 118 00:05:56.680 --> 00:05:59.959 to build these incredibly detailed profiles. We're talking over five 119 00:06:00.079 --> 00:06:02.800 thousand data points on something like two hundred and thirty 120 00:06:02.800 --> 00:06:04.079 million US adults. 121 00:06:04.199 --> 00:06:08.399 Wow, all for political ad targeting and manipulation. Exactly. 122 00:06:08.480 --> 00:06:11.720 It just starkly revealed the immense power of this kind 123 00:06:11.720 --> 00:06:13.079 of surveillance capitalism. 124 00:06:13.160 --> 00:06:16.000 Okay, so if we take that idea of granular surveillance, 125 00:06:16.319 --> 00:06:19.199 detailed profiles, and scale it up to a national level, 126 00:06:19.560 --> 00:06:22.639 the sources point towards what they call the ultimate social threat. 127 00:06:22.920 --> 00:06:26.160 You mean the Chinese government's concept of a citizen score. 128 00:06:26.319 --> 00:06:28.240 Yeah, explain that a bit. 129 00:06:28.519 --> 00:06:32.000 Well. 
It involves using technology like facial recognition combined with 130 00:06:32.000 --> 00:06:36.759 analysis of online behavior, social media activity, financial transactions, all 131 00:06:36.800 --> 00:06:39.120 sorts of data. To do what? To create a constantly 132 00:06:39.199 --> 00:06:44.160 updated score evaluating a citizen's behavior. Good behavior gets rewarded, 133 00:06:44.199 --> 00:06:47.399 bad behavior gets punished. Punished how? It could affect anything 134 00:06:47.439 --> 00:06:51.319 from loan applications to travel restrictions, even access to certain 135 00:06:51.399 --> 00:06:56.319 jobs or schools. And the really frightening part, beyond just 136 00:06:56.439 --> 00:07:01.240 the surveillance... What's that? ...is that this score judgment follows you 137 00:07:01.279 --> 00:07:04.959 for life. The sources argue it fundamentally hinders the human 138 00:07:04.959 --> 00:07:08.519 ability to reinvent yourself or move past mistakes, as it 139 00:07:08.560 --> 00:07:09.079 locks you in. 140 00:07:09.279 --> 00:07:12.639 That's dystopian, truly chilling on a societal level. Okay, let's 141 00:07:12.639 --> 00:07:14.680 bring it back down to the individual user though, because 142 00:07:14.720 --> 00:07:18.199 alongside these huge systemic things, we're still facing the everyday 143 00:07:18.240 --> 00:07:19.959 cyber threats like malware. 144 00:07:20.040 --> 00:07:23.920 Oh yeah, malware is still rampant: keyloggers snatching your passwords as 145 00:07:23.959 --> 00:07:28.079 you type, infostealers grabbing files, banking malware trying to empty your 146 00:07:27.959 --> 00:07:31.079 accounts. And how does it usually get onto people's devices? 147 00:07:31.519 --> 00:07:36.839 Often through classic methods: malicious email attachments, maybe dodgy links, 148 00:07:37.079 --> 00:07:39.680 sometimes bundled with pirated software you might download. 149 00:07:39.759 --> 00:07:43.360 And then there's phishing, the old reliable. 
150 00:07:43.120 --> 00:07:47.160 Still incredibly effective. It's low tech social engineering, right? Attackers 151 00:07:47.399 --> 00:07:51.399 pretending to be someone you trust: Microsoft, Amazon, Netflix, your bank. 152 00:07:51.399 --> 00:07:53.519 Trying to trick you into clicking a link and giving 153 00:07:53.560 --> 00:07:55.959 up your login details or credit card number. 154 00:07:55.800 --> 00:07:59.759 Exactly. And it works because, honestly, people are still really 155 00:07:59.759 --> 00:08:00.959 bad with passwords. 156 00:08:01.160 --> 00:08:01.839 Don't say it. 157 00:08:02.000 --> 00:08:04.639 The research confirmed it. Even in twenty nineteen, after all 158 00:08:04.639 --> 00:08:07.480 the major breaches we've heard about, the most common passwords 159 00:08:07.480 --> 00:08:09.600 were still things like one, two, three, four, five, six 160 00:08:09.879 --> 00:08:11.519 and password. Sigh. 161 00:08:11.680 --> 00:08:13.680 We laugh, but it's also kind of sad, isn't it? 162 00:08:13.959 --> 00:08:15.680 We know better, but convenience wins. 163 00:08:15.959 --> 00:08:19.120 It often does, which perfectly leads us into the technical 164 00:08:19.120 --> 00:08:23.439 fight back, because researchers know about these human weaknesses. Given 165 00:08:23.480 --> 00:08:26.720 this huge challenge, how do you share massive social data 166 00:08:26.759 --> 00:08:30.800 sets for research without exposing individuals? What are they building? 167 00:08:30.920 --> 00:08:33.320 Right. How do you get the value without the privacy violation? 168 00:08:33.440 --> 00:08:36.480 That's the core problem for privacy preserving analytics. Precisely. 169 00:08:36.559 --> 00:08:41.639 The constant fear is the identity disclosure attack, someone figuring 170 00:08:41.639 --> 00:08:45.200 out who's who in supposedly anonymous data. 171 00:08:45.360 --> 00:08:48.159 So what was the first big technical defense? 
172 00:08:48.480 --> 00:08:52.200 The foundational technique really was something called k-anonymity, introduced 173 00:08:52.559 --> 00:08:53.879 way back in two thousand 174 00:08:53.559 --> 00:08:56.679 and two. K-anonymity. Okay, what's the principle? 175 00:08:56.759 --> 00:09:00.200 The basic idea is to make any single person's record 176 00:09:00.279 --> 00:09:03.159 in the data set indistinguishable from at least k minus 177 00:09:03.159 --> 00:09:07.679 one other records, usually through generalization, like replacing an exact 178 00:09:07.720 --> 00:09:11.519 age with an age range, or suppression, just removing certain data 179 00:09:11.279 --> 00:09:14.120 points. So you blend into a small crowd of k people. 180 00:09:14.240 --> 00:09:17.320 That's the goal. Yeah, but it had flaws. 181 00:09:17.360 --> 00:09:19.080 If it's been around since two thousand and two, yeah, I 182 00:09:19.080 --> 00:09:20.720 guess it wasn't perfect. What went wrong? 183 00:09:20.799 --> 00:09:24.240 It was vulnerable to what's called a homogeneity attack. Imagine 184 00:09:24.240 --> 00:09:26.559 your group of k people all look similar based on 185 00:09:26.600 --> 00:09:28.879 the generalized data, like they live in the same zip code. 186 00:09:28.919 --> 00:09:31.559 Now what if almost everyone in that group shares the 187 00:09:31.559 --> 00:09:36.039 same sensitive attribute, say they all have a specific medical condition? 188 00:09:36.720 --> 00:09:40.399 Even if one person's data on that condition is suppressed, 189 00:09:40.000 --> 00:09:42.679 you can pretty much guess their status because everyone else 190 00:09:42.720 --> 00:09:44.720 in their anonymous group has it. Exactly. 191 00:09:45.200 --> 00:09:48.080 The lack of diversity within the group broke the anonymity. 192 00:09:48.720 --> 00:09:52.440 It was also vulnerable to background knowledge attacks, where an 193 00:09:52.480 --> 00:09:55.799 attacker uses external info to reidentify someone. 
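NOTE
The k-anonymity idea and the homogeneity attack described in this segment can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production anonymizer: the records, the field names (`age`, `zip`, `condition`) and the helper functions are all made up for the example, and generalization is done by hand (age to a ten-year range, zip code to a three-digit prefix).

```python
from collections import defaultdict

QUASI_IDS = ["age", "zip"]  # attributes an attacker might already know


def generalize(record):
    """Coarsen the quasi-identifiers: age to a decade, zip to a prefix."""
    low = (record["age"] // 10) * 10
    return {
        "age": f"{low}-{low + 9}",
        "zip": record["zip"][:3] + "**",
        "condition": record["condition"],
    }


def group_records(records, quasi_ids):
    """Group records that look identical on the quasi-identifiers."""
    groups = defaultdict(list)
    for record in records:
        groups[tuple(record[q] for q in quasi_ids)].append(record)
    return dict(groups)


def is_k_anonymous(records, quasi_ids, k):
    """True if every group contains at least k records."""
    return all(len(g) >= k for g in group_records(records, quasi_ids).values())


def homogeneous_groups(records, quasi_ids, sensitive):
    """Return groups in which every record shares one sensitive value."""
    return [
        key
        for key, group in group_records(records, quasi_ids).items()
        if len({r[sensitive] for r in group}) == 1
    ]


# Hypothetical raw records (invented for illustration).
raw = [
    {"age": 34, "zip": "47906", "condition": "flu"},
    {"age": 36, "zip": "47902", "condition": "flu"},
    {"age": 38, "zip": "47905", "condition": "flu"},
    {"age": 52, "zip": "47301", "condition": "asthma"},
    {"age": 57, "zip": "47305", "condition": "diabetes"},
    {"age": 55, "zip": "47302", "condition": "flu"},
]
released = [generalize(r) for r in raw]

print(is_k_anonymous(released, QUASI_IDS, k=3))
print(homogeneous_groups(released, QUASI_IDS, "condition"))
```

In this toy data the generalized table passes the k = 3 check, yet the first group still reveals its members' condition because all three share it, which is the weakness the hosts describe.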
194 00:09:56.000 --> 00:09:59.360 So generalization wasn't enough if the group itself was too 195 00:09:59.399 --> 00:09:59.840 similar internally. 196 00:10:00.039 --> 00:10:02.679 Right, so that led to the next step, l-diversity. 197 00:10:02.759 --> 00:10:05.679 Okay, how does l-diversity improve things? 198 00:10:05.759 --> 00:10:08.799 It adds another constraint. It says that within each of 199 00:10:08.799 --> 00:10:11.200 those groups, the ones that look similar, there must be 200 00:10:11.200 --> 00:10:16.279 at least l distinct, well-represented values for the sensitive attribute. 201 00:10:16.399 --> 00:10:17.720 It forces diversity into 202 00:10:17.600 --> 00:10:20.639 the groups, making it harder to infer anything specific about 203 00:10:20.639 --> 00:10:21.360 one individual. 204 00:10:21.480 --> 00:10:23.799 Much harder. But the real cutting edge now, the sort 205 00:10:23.799 --> 00:10:27.000 of gold standard people aim for, is differential privacy. 206 00:10:27.039 --> 00:10:31.200 Differential privacy. I've heard the term; it sounds complex. What's the core idea? 207 00:10:31.360 --> 00:10:35.960 Instead of just generalizing or suppressing, differential privacy involves adding 208 00:10:36.000 --> 00:10:41.080 carefully calculated mathematical noise to the data, or more accurately, 209 00:10:41.159 --> 00:10:44.200 to the queries run on the data. Adding noise? Doesn't 210 00:10:44.240 --> 00:10:46.960 that mess up the results? That's the clever part. The 211 00:10:47.000 --> 00:10:52.039 noise is precisely calibrated. It's enough to make it mathematically impossible, 212 00:10:52.159 --> 00:10:54.799 or at least very difficult, to tell if any single 213 00:10:54.799 --> 00:10:57.200 individual's data was included in the data set. 214 00:10:57.279 --> 00:10:59.519 Okay, protecting the individual. 
215 00:10:59.080 --> 00:11:03.039 But it's not enough to significantly change the overall aggregate 216 00:11:03.080 --> 00:11:07.440 results or statistical trends needed for research. And crucially, it 217 00:11:07.480 --> 00:11:11.480 allows organizations to actually quantify the level of privacy they're providing. 218 00:11:11.799 --> 00:11:14.120 It gives a mathematical guarantee. That sounds 219 00:11:13.879 --> 00:11:16.320 much more robust. Okay, let's shift to a really specific 220 00:11:16.399 --> 00:11:20.159 challenge: location data. Our phones are constantly broadcasting where we 221 00:11:20.200 --> 00:11:22.759 are for location based services, LBS. 222 00:11:22.240 --> 00:11:25.600 Like maps, ride sharing, local recommendations. Yeah, very common. 223 00:11:25.639 --> 00:11:27.039 How do you protect privacy there? 224 00:11:27.200 --> 00:11:30.799 One approach is location k-anonymity. It's similar in spirit 225 00:11:30.840 --> 00:11:32.279 to the original k-anonymity. 226 00:11:32.440 --> 00:11:33.799 How does it work in practice? 227 00:11:33.960 --> 00:11:38.080 Instead of your phone sending your exact GPS coordinates directly 228 00:11:38.120 --> 00:11:41.480 to the map service, you might use a middleman, a 229 00:11:41.519 --> 00:11:46.720 trusted third party sometimes called a Location Trusted Service or LTS. 230 00:11:46.960 --> 00:11:48.360 And what does this LTS do? 231 00:11:48.720 --> 00:11:52.320 Your phone tells the LTS your location. The LTS then 232 00:11:52.360 --> 00:11:55.279 finds an area called a cloaking zone that includes you 233 00:11:55.600 --> 00:11:58.799 and at least k minus one other users nearby. It then sends 234 00:11:58.879 --> 00:12:02.240 that zone, not your specific point, to the map service provider. 235 00:12:02.639 --> 00:12:05.240 So the service knows someone in this block needs directions, 236 00:12:05.240 --> 00:12:07.919 but not exactly who or where within the block. 
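NOTE
The calibrated-noise idea from the differential privacy discussion a little earlier in this segment can be sketched as follows. The hosts do not say which noise distribution the sources use; the Laplace mechanism is one standard choice and is assumed here. The parameter `epsilon` stands for the quantifiable privacy level mentioned above (smaller means more noise and stronger privacy), and the data and function names are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from a Laplace(0, scale) distribution."""
    # The difference of two independent exponentials with mean `scale`
    # follows a Laplace distribution with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer "how many records match?" with noise calibrated to epsilon."""
    true_count = sum(1 for record in records if predicate(record))
    # Adding or removing one person changes a count by at most 1,
    # so the sensitivity is 1 and the noise scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)


def is_over_40(age):
    return age > 40


# Hypothetical data set: ages of 1,000 users (invented for illustration).
ages = [18 + (i * 7) % 60 for i in range(1000)]
true_answer = sum(1 for age in ages if is_over_40(age))

rng = random.Random(42)
noisy_answer = private_count(ages, is_over_40, epsilon=0.5, rng=rng)
print(f"true count: {true_answer}, released count: {noisy_answer:.1f}")
```

A single released count is off by a few units, which hides whether any one person is in the data, while the aggregate trend across a thousand users is barely affected.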
You're 237 00:12:08.000 --> 00:12:09.840 hidden in a small geographic crowd. 238 00:12:09.960 --> 00:12:12.320 That's the idea, blurring your precise identity. 239 00:12:12.360 --> 00:12:16.960 Okay. Another threat vector: the network itself, especially wireless networks. 240 00:12:17.000 --> 00:12:20.399 We hear about rogue access points, RAPs. What's the danger 241 00:12:20.440 --> 00:12:22.399 there for, say, a social media user? 242 00:12:22.519 --> 00:12:26.159 Big danger, potentially. Imagine you're at a coffee shop or airport. 243 00:12:26.639 --> 00:12:29.919 A RAP is basically an unauthorized Wi-Fi hotspot set up 244 00:12:29.919 --> 00:12:33.440 by an attacker, often mimicking the legitimate network name, like 245 00:12:33.759 --> 00:12:34.759 cafe guest Wi-Fi. 246 00:12:34.960 --> 00:12:37.159 The classic evil twin attack. Exactly. If 247 00:12:37.039 --> 00:12:38.879 you connect your phone or laptop to it and then 248 00:12:38.879 --> 00:12:40.559 log into your social media, the 249 00:12:40.519 --> 00:12:44.799 attacker running the RAP can potentially intercept your username, password, 250 00:12:45.080 --> 00:12:48.320 session cookies, basically take over your account. 251 00:12:48.120 --> 00:12:52.639 Yep, or redirect you to fake login pages, install malware. 252 00:12:53.559 --> 00:12:56.279 It's a major vulnerability in public spaces. 253 00:12:56.399 --> 00:12:58.440 So how do places defend against these? How do you 254 00:12:58.480 --> 00:12:59.480 even find them? 255 00:13:00.000 --> 00:13:02.960 There's a system architecture proposed in the research. It 256 00:13:03.039 --> 00:13:06.759 involves having an administrative body, maybe the coffee shop owner 257 00:13:06.840 --> 00:13:09.679 or IT staff, use a dedicated Wi-Fi scanner. 258 00:13:09.799 --> 00:13:10.679 What does the scanner do? 259 00:13:10.960 --> 00:13:13.519 It listens for the beacon frames that all Wi-Fi 260 00:13:13.559 --> 00:13:17.600 access points constantly broadcast. 
These frames contain key info like 261 00:13:17.679 --> 00:13:22.120 the AP's MAC address, its unique hardware ID, the network name, SSID, 262 00:13:22.480 --> 00:13:25.960 security settings, signal strength, RSSI. 263 00:13:25.600 --> 00:13:29.240 Okay, it gathers the data on all nearby APs. Then 264 00:13:29.279 --> 00:13:30.120 what? That 265 00:13:30.000 --> 00:13:33.159 collected data is immediately compared against a preapproved database, a 266 00:13:33.159 --> 00:13:35.759 whitelist of all the legitimate access points that should be 267 00:13:35.799 --> 00:13:36.679 operating in that area. 268 00:13:36.759 --> 00:13:38.799 Ah, so if the scanner picks up an AP whose 269 00:13:38.799 --> 00:13:40.000 details aren't on the white 270 00:13:39.840 --> 00:13:43.120 list... Bingo, that's flagged as a potential rogue access point 271 00:13:43.120 --> 00:13:44.120 that needs investigation. 272 00:13:44.360 --> 00:13:48.279 Makes sense. Okay, shifting gears again into the really modern 273 00:13:48.279 --> 00:13:53.559 stuff: AI and automation in security. Content moderation is a huge 274 00:13:53.279 --> 00:13:58.360 one. Absolutely massive. The scale is just impossible for humans alone. YouTube, 275 00:13:58.399 --> 00:14:01.200 for instance, apparently sees something like five hundred hours of 276 00:14:01.279 --> 00:14:02.519 video uploaded every 277 00:14:02.360 --> 00:14:05.120 single minute. Five hundred hours a minute? You can't possibly 278 00:14:05.120 --> 00:14:07.320 have humans watch all that. No way. 279 00:14:07.320 --> 00:14:11.840 So automation, specifically machine learning, is essential. Facebook, for example, 280 00:14:12.039 --> 00:14:17.480 uses ML pretty heavily to proactively find and remove harmful content. 281 00:14:17.399 --> 00:14:18.960 Like what kind of content? 
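NOTE
The whitelist comparison for rogue access points described above can be sketched as a simple lookup. This is an illustration only: it does not capture real beacon frames (that needs a scanner with suitable hardware), and the `Beacon` type, the whitelist entries and the scan results are all hypothetical values invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beacon:
    """Fields a scanner might read from one Wi-Fi beacon frame."""
    mac: str        # access point hardware (MAC) address
    ssid: str       # advertised network name
    security: str   # advertised security setting, e.g. "WPA2"
    rssi_dbm: int   # received signal strength in dBm


# Hypothetical whitelist of approved access points (values are made up).
WHITELIST = {
    ("aa:bb:cc:00:00:01", "CafeGuestWiFi", "WPA2"),
    ("aa:bb:cc:00:00:02", "CafeGuestWiFi", "WPA2"),
}


def find_suspect_aps(beacons, whitelist):
    """Return beacons whose (MAC, SSID, security) triple is not approved."""
    return [
        b for b in beacons
        if (b.mac.lower(), b.ssid, b.security) not in whitelist
    ]


# Hypothetical scan results; a real scanner would capture these over the air.
observed = [
    Beacon("AA:BB:CC:00:00:01", "CafeGuestWiFi", "WPA2", -48),
    Beacon("de:ad:be:ef:00:99", "CafeGuestWiFi", "OPEN", -35),  # same name, unknown hardware
    Beacon("aa:bb:cc:00:00:02", "CafeGuestWiFi", "WPA2", -61),
]

suspects = find_suspect_aps(observed, WHITELIST)
for ap in suspects:
    print(f"investigate: {ap.ssid} from {ap.mac} ({ap.security}, {ap.rssi_dbm} dBm)")
```

The second beacon advertises the legitimate network name from hardware that is not on the list, so it is the one flagged for investigation. Signal strength is carried along for the investigator but is not part of the comparison in this sketch.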
282 00:14:18.759 --> 00:14:21.879 They reported, for instance, removing something like twenty six million 283 00:14:21.879 --> 00:14:24.919 pieces of content related to global terrorist groups over a 284 00:14:24.960 --> 00:14:27.279 period, and claim that ninety nine percent of it was 285 00:14:27.320 --> 00:14:30.720 removed proactively by their AI systems before any human even 286 00:14:30.799 --> 00:14:31.279 flagged it. 287 00:14:31.399 --> 00:14:33.759 Ninety nine percent. That sounds incredibly effective. 288 00:14:33.879 --> 00:14:37.320 It is, technologically speaking, but that remaining one percent, given 289 00:14:37.320 --> 00:14:39.840 the volumes, can still represent a lot of harmful content 290 00:14:39.879 --> 00:14:42.679 slipping through. And automation still really struggles in some 291 00:14:42.639 --> 00:14:43.919 areas. Where does it fall down? 292 00:14:44.320 --> 00:14:49.039 The big challenges are subjectivity and context. How do you 293 00:14:49.080 --> 00:14:53.480 train an AI to definitively understand vague concepts like terrorism 294 00:14:53.559 --> 00:14:56.159 or obscenity across different cultures and contexts? 295 00:14:56.200 --> 00:14:59.000 It's incredibly hard. Right, context is everything. 296 00:14:59.279 --> 00:15:03.120 Remember the controversy over the historical Napalm Girl photo 297 00:15:03.200 --> 00:15:06.919 from the Vietnam War? It's a famous, important photo depicting 298 00:15:07.039 --> 00:15:11.600 violence and nudity. Some platforms' automated systems initially flagged and 299 00:15:11.639 --> 00:15:15.600 removed it, completely missing the vital historical and newsworthy context. 300 00:15:15.759 --> 00:15:19.200 Because the algorithm just saw nudity and violence. Pretty much. 301 00:15:19.240 --> 00:15:20.879 It lacked the human understanding of nuance. 302 00:15:21.039 --> 00:15:24.080 And you also have people actively trying to fool these systems. 
303 00:15:24.200 --> 00:15:29.639 Right, adversaries. Yes, adversarial attacks are a constant problem. Sophisticated groups, 304 00:15:29.759 --> 00:15:32.960 knowing their content might get flagged, actively try to modify 305 00:15:33.000 --> 00:15:35.919 it to evade detection by the machine learning classifiers. 306 00:15:35.960 --> 00:15:36.600 How do they do that? 307 00:15:36.919 --> 00:15:41.279 For example, research showed ISIS affiliates learned to avoid certain 308 00:15:41.399 --> 00:15:46.159 high risk keywords associated with terrorism. Instead, they started using 309 00:15:46.240 --> 00:15:49.759 more neutral language, like just saying Islamic State group, which 310 00:15:49.759 --> 00:15:53.120 apparently helped their accounts stay active longer before the automated 311 00:15:53.159 --> 00:15:53.879 systems caught on. 312 00:15:54.120 --> 00:15:57.320 They're literally learning how the AI works and adapting their 313 00:15:57.320 --> 00:16:00.600 tactics to bypass it. It's a constant cat and mouse game. 314 00:16:00.480 --> 00:16:04.559 It really is, which brings us to maybe the most intriguing, 315 00:16:04.759 --> 00:16:09.840 almost sci-fi end of this adversarial spectrum: using covert channels. 316 00:16:10.000 --> 00:16:13.399 Covert channels, so this isn't about bypassing moderation, it's about 317 00:16:13.480 --> 00:16:15.799 hiding communication completely. 318 00:16:15.399 --> 00:16:19.480 Exactly, making it invisible to defenders. One fascinating piece of 319 00:16:19.519 --> 00:16:23.240 research explored using steganography, hiding data within other data, to 320 00:16:23.320 --> 00:16:25.200 run a botnet's command and control structure 321 00:16:25.279 --> 00:16:28.000 using Twitter. Hiding botnet commands in tweets? 322 00:16:28.720 --> 00:16:31.159 How? They didn't hide it in the text of the tweet, 323 00:16:31.200 --> 00:16:33.960 which might be detectable. 
Instead, they used the length of 324 00:16:34.000 --> 00:16:36.159 the Twitter post itself as the secret channel. 325 00:16:36.240 --> 00:16:38.039 The number of characters? Yep. 326 00:16:38.200 --> 00:16:40.879 Back when Twitter had that one hundred and forty character limit, 327 00:16:41.399 --> 00:16:43.919 the specific length of the tweet, say one hundred and 328 00:16:43.960 --> 00:16:47.240 twelve characters versus one hundred and thirty one, would correspond 329 00:16:47.279 --> 00:16:50.240 to an encrypted command being sent from the botmaster to 330 00:16:50.320 --> 00:16:52.000 the infected computers in the botnet. 331 00:16:52.039 --> 00:16:55.000 Okay, that's clever, but wait, if a single account just 332 00:16:55.039 --> 00:16:59.000 started posting tweets with weirdly specific repeating lengths, wouldn't that 333 00:16:59.039 --> 00:17:03.120 stick out like a sore thumb? Wouldn't monitoring systems flag that pattern? 334 00:17:03.320 --> 00:17:06.720 Good point. That's the next layer. They needed plausible cover 335 00:17:06.880 --> 00:17:10.160 for the accounts sending these length-encoded messages. They couldn't 336 00:17:10.200 --> 00:17:12.920 look like obvious bots. So what did they do? They 337 00:17:13.000 --> 00:17:15.799 used another bit of AI, a Markov chain model trained 338 00:17:15.799 --> 00:17:19.000 on a massive data set of real Twitter usernames. This 339 00:17:19.079 --> 00:17:22.240 model learned the patterns of typical usernames and could then 340 00:17:22.319 --> 00:17:26.960 generate new, completely artificial usernames that sounded convincingly human. 341 00:17:27.039 --> 00:17:30.119 They generated fake, but real sounding usernames to post the 342 00:17:30.200 --> 00:17:31.839 secret length tweets. 
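NOTE
The Markov chain idea mentioned above, learning which characters tend to follow which and then sampling new names, can be sketched like this. It is a toy under stated assumptions: the research described here trained on a massive set of real Twitter usernames, whereas `SAMPLE_NAMES` below is a tiny invented list, and the order-2 character model and the function names are choices made for the example rather than details from the sources.

```python
import random
from collections import defaultdict

START, END = "^", "$"  # padding symbols marking the start and end of a name

# Tiny made-up training set (a real model would need far more examples).
SAMPLE_NAMES = [
    "sarah_k", "sam_reads", "mike_runs", "mikaela", "karen_b",
    "danny_m", "dana_reads", "maria_k", "marco_runs", "sandra",
]


def train(names, order=2):
    """Map each `order`-character context to the characters seen after it."""
    model = defaultdict(list)
    for name in names:
        padded = START * order + name.lower() + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model


def generate(model, rng, order=2, max_len=15):
    """Sample one name by walking the chain until END or max_len."""
    state = START * order
    chars = []
    while len(chars) < max_len:
        nxt = rng.choice(model[state])
        if nxt == END:
            break
        chars.append(nxt)
        state = state[1:] + nxt
    return "".join(chars)


model = train(SAMPLE_NAMES)
rng = random.Random(7)
for _ in range(5):
    print(generate(model, rng))
```

Every step of a generated name follows a transition that occurred somewhere in the training names, which is why the output tends to look name-like even when the full string never appeared in the input.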
343 00:17:31.480 --> 00:17:35.519 Exactly, and to test if these generated usernames were actually plausible, 344 00:17:35.799 --> 00:17:39.319 they ran an experiment using Amazon Mechanical Turk, asking real 345 00:17:39.400 --> 00:17:42.319 humans to rate the generated names alongside real ones. 346 00:17:42.440 --> 00:17:43.640 And what did the humans say? 347 00:17:43.880 --> 00:17:48.039 They rated the automatically generated, natural sounding usernames as 348 00:17:48.200 --> 00:17:52.359 highly plausible. They couldn't easily distinguish them from real accounts. 349 00:17:52.640 --> 00:17:55.599 It showed they could effectively conceal not just the hidden 350 00:17:55.599 --> 00:17:58.400 message channel, but also the identity of the accounts using it. 351 00:17:58.559 --> 00:18:01.279 Wow, so they built AI not just to carry out 352 00:18:01.359 --> 00:18:06.200 the attack, but specifically to fool other AI detection systems 353 00:18:06.240 --> 00:18:09.039 and even human intuition. That's quite something. 354 00:18:09.119 --> 00:18:11.039 It really shows the sophistication we're up against. 355 00:18:11.119 --> 00:18:14.680 Okay, we've covered a huge range here, from the psychological 356 00:18:14.720 --> 00:18:18.160 toll of just passively scrolling all the way to botnets 357 00:18:18.240 --> 00:18:22.119 hiding commands in tweet lengths using AI generated usernames. The 358 00:18:22.160 --> 00:18:26.240 core tension just seems clearer than ever: this amazing convenience 359 00:18:26.279 --> 00:18:31.440 of being hyper connected versus the constantly evolving, incredibly sophisticated threats. 360 00:18:31.680 --> 00:18:34.000 Absolutely, and it really brings it back to the importance 361 00:18:34.039 --> 00:18:36.319 of individual vigilance. You know, you can't just rely on 362 00:18:36.359 --> 00:18:39.599 the platforms or technology to protect you completely. 
For mobile 363 00:18:39.599 --> 00:18:44.119 security specifically, the sources mentioned a useful acronym, SRP. 364 00:18:45.000 --> 00:18:46.480 SRP. Okay, break that down. 365 00:18:47.039 --> 00:18:50.359 S is for secure networks, being cautious about public Wi-Fi, 366 00:18:50.559 --> 00:18:54.599 maybe using a VPN. R is for risk awareness, just 367 00:18:54.839 --> 00:18:58.039 understanding the kinds of threats we've talked about, like phishing 368 00:18:58.079 --> 00:19:02.279 and malware. And P is for protect personal information. 369 00:19:02.920 --> 00:19:05.559 And this isn't just about passwords. Think about all those 370 00:19:05.599 --> 00:19:08.799 fun quizzes or surveys you fill out online. What was 371 00:19:08.839 --> 00:19:11.359 your first pet's name? What street did you grow up on? 372 00:19:11.640 --> 00:19:14.000 Ah, classic security question answers. 373 00:19:13.880 --> 00:19:17.559 Exactly. Attackers can collect those seemingly harmless bits of info 374 00:19:17.640 --> 00:19:20.519 you share publicly and potentially use them later to bypass 375 00:19:20.519 --> 00:19:23.359 security questions if they manage to compromise part of your 376 00:19:23.400 --> 00:19:26.039 login, like your password. Don't make it easy for them. 377 00:19:26.160 --> 00:19:28.519 That's a really practical point. Be mindful of all the 378 00:19:28.519 --> 00:19:31.359 info you share. All right, as we wrap up, here's 379 00:19:31.359 --> 00:19:34.359 one final, maybe provocative thought for you, the listener. Based 380 00:19:34.359 --> 00:19:37.160 on the sources, even when you think you're being careful, 381 00:19:37.240 --> 00:19:41.279 deleting files, using private browsing modes, that digital footprint, it's 382 00:19:41.440 --> 00:19:43.039 rarely ever truly gone. 383 00:19:43.200 --> 00:19:47.400 That's a stark reality. 
Your network provider, your internet service 384 00:19:47.440 --> 00:19:50.599 provider at home, or your mobile carrier on your phone, 385 00:19:50.799 --> 00:19:54.319 they maintain extensive logs. Logs of the sites you visit, 386 00:19:54.319 --> 00:19:55.119 the connections you 387 00:19:55.079 --> 00:19:58.160 make. And that data isn't necessarily private forever. 388 00:19:58.519 --> 00:20:02.400 No, in many jurisdictions, like the US, that electronic evidence 389 00:20:02.400 --> 00:20:06.400 can be legally requested, obtained, and even admitted into court proceedings. 390 00:20:06.480 --> 00:20:09.480 So the bottom line is what you access online is 391 00:20:09.519 --> 00:20:12.440 almost never truly private from everyone. Pretty 392 00:20:12.240 --> 00:20:16.119 much. And maybe that realization, more than any specific technical 393 00:20:16.119 --> 00:20:18.720 defense we discussed, should be the biggest incentive for all 394 00:20:18.759 --> 00:20:21.240 of us to be thoughtful and vigilant about how we 395 00:20:21.359 --> 00:20:24.680 navigate this incredibly complex, hyperconnected world. 396 00:20:24.920 --> 00:20:27.519 A sobering but essential point to end on. This has 397 00:20:27.559 --> 00:20:31.599 been a fantastic deep dive into the social networking security battleground. 398 00:20:31.839 --> 00:20:34.559 We really hope this synthesized knowledge gives you some greater 399 00:20:34.640 --> 00:20:37.160 insight as you navigate your own digital life. Thanks for 400 00:20:37.279 --> 00:20:37.720 joining us.