WEBVTT

1
00:00:00.120 --> 00:00:03.399
<v Speaker 1>Welcome to the deep dive. We are jumping straight into

2
00:00:03.480 --> 00:00:08.400
<v Speaker 1>the digital arena today exploring the phenomenal, almost unbelievable growth

3
00:00:08.400 --> 00:00:13.320
<v Speaker 1>of social networking and, well, the constant, sophisticated security battle

4
00:00:13.359 --> 00:00:15.119
<v Speaker 1>that's raging just below the surface.

5
00:00:15.320 --> 00:00:18.160
<v Speaker 2>It's a battle that's absolutely necessary because of the sheer

6
00:00:18.239 --> 00:00:21.280
<v Speaker 2>scale we're talking about. I mean, by June twenty twenty,

7
00:00:21.320 --> 00:00:24.120
<v Speaker 2>research showed the Internet had ballooned to over four point

8
00:00:24.120 --> 00:00:25.760
<v Speaker 2>eight billion users.

9
00:00:25.760 --> 00:00:28.199
<v Speaker 1>Four point eight billion. That's, what, sixty two percent of

10
00:00:28.239 --> 00:00:31.359
<v Speaker 1>the entire global population, suddenly connected.

11
00:00:31.719 --> 00:00:35.520
<v Speaker 2>Exactly. And if you think about social networking as the central

12
00:00:35.600 --> 00:00:39.320
<v Speaker 2>way modern humans communicate, you start to see why securing

13
00:00:39.359 --> 00:00:42.079
<v Speaker 2>it is maybe the highest priority challenge we have right now.

14
00:00:42.200 --> 00:00:45.119
<v Speaker 1>Right, and that massive expansion, it didn't just happen smoothly,

15
00:00:45.439 --> 00:00:48.039
<v Speaker 1>and it definitely came with a security cost. So our

16
00:00:48.039 --> 00:00:49.880
<v Speaker 1>mission for you today is pretty clear. We're going to

17
00:00:49.880 --> 00:00:53.560
<v Speaker 1>distill the history of this explosion, pinpoint the biggest threats,

18
00:00:53.880 --> 00:00:58.359
<v Speaker 1>everything from psychological tolls to high tech cybercrime.

19
00:00:57.799 --> 00:01:00.000
<v Speaker 2>And then walk through the really cutting edge technical

20
00:01:00.039 --> 00:01:02.799
<v Speaker 2>countermeasures people are deploying. It's this constant arms race.

21
00:01:03.119 --> 00:01:07.400
<v Speaker 1>Yeah, we've got sources covering the psychology, the politics, the

22
00:01:07.439 --> 00:01:10.959
<v Speaker 1>really technical stuff. It should be a fascinating synthesis.

23
00:01:10.400 --> 00:01:13.400
<v Speaker 2>It really is. So where do we start? The origins?

24
00:01:13.519 --> 00:01:16.519
<v Speaker 1>Let's do it, because if you think the story starts

25
00:01:16.519 --> 00:01:19.400
<v Speaker 1>with, I don't know, the like button, you're missing a

26
00:01:19.439 --> 00:01:22.400
<v Speaker 1>couple of decades. The foundations were actually laid way back

27
00:01:22.439 --> 00:01:25.719
<v Speaker 1>in nineteen ninety seven. You had SixDegrees.com, right?

28
00:01:25.840 --> 00:01:26.480
<v Speaker 1>Six Degrees.

29
00:01:26.519 --> 00:01:29.159
<v Speaker 2>They actually got up to three point five million users,

30
00:01:29.359 --> 00:01:30.879
<v Speaker 2>which was huge.

31
00:01:30.560 --> 00:01:35.760
<v Speaker 1>Then, maybe more memorably for some folks, AOL Instant Messenger, AIM.

32
00:01:35.560 --> 00:01:38.879
<v Speaker 2>Oh yeah, AIM. That was really the precursor, wasn't it?

33
00:01:38.879 --> 00:01:40.959
<v Speaker 2>It brought in things we take for granted now like

34
00:01:41.439 --> 00:01:43.719
<v Speaker 2>real time chat, persistent friend lists.

35
00:01:43.920 --> 00:01:46.799
<v Speaker 1>That was the blueprint, basically. Those early features paved the

36
00:01:46.840 --> 00:01:50.799
<v Speaker 1>way for everything that came later, that mid two thousands explosion.

37
00:01:50.480 --> 00:01:52.959
<v Speaker 2>Definitely, and two thousand and two is a really pivotal

38
00:01:53.040 --> 00:01:55.159
<v Speaker 2>year in that. What happened then? Well, you got Friendster,

39
00:01:55.400 --> 00:01:57.680
<v Speaker 2>which was one of those early kind of original networks,

40
00:01:57.879 --> 00:02:00.400
<v Speaker 2>maybe leaned a bit into dating. I remember Friendster.

41
00:02:00.560 --> 00:02:03.760
<v Speaker 2>But maybe even more impactful long term was LinkedIn, also

42
00:02:03.840 --> 00:02:04.959
<v Speaker 2>launched in two thousand and two.

43
00:02:05.120 --> 00:02:07.079
<v Speaker 1>Oh, LinkedIn. Okay, that's different.

44
00:02:06.920 --> 00:02:10.479
<v Speaker 2>Totally different. It's the perfect example of a niche platform

45
00:02:10.800 --> 00:02:14.879
<v Speaker 2>that just completely redefined a whole area of connection. It's

46
00:02:15.000 --> 00:02:19.360
<v Speaker 2>all about professional networking, and now over seven hundred million users.

47
00:02:19.599 --> 00:02:23.520
<v Speaker 2>It fundamentally changed modern recruitment. I mean, companies use it

48
00:02:23.560 --> 00:02:25.719
<v Speaker 2>constantly to find people, screen candidates.

49
00:02:25.879 --> 00:02:30.080
<v Speaker 1>Yeah. Absolutely, it's indispensable for many jobs now. But okay,

50
00:02:30.159 --> 00:02:33.919
<v Speaker 1>scale brings problems. As these platforms got bigger and bigger,

51
00:02:33.960 --> 00:02:37.080
<v Speaker 1>that initial dark side started to show up.

52
00:02:37.159 --> 00:02:39.719
<v Speaker 2>Right. At first, it was maybe just, you know, data

53
00:02:39.759 --> 00:02:43.120
<v Speaker 2>mining concerns, but then it quickly escalated to, what, organized

54
00:02:43.159 --> 00:02:48.199
<v Speaker 2>phishing attempts, botnet attacks starting to use these platforms, malware spreading.

55
00:02:47.879 --> 00:02:51.719
<v Speaker 1>Like wildfire. And the problems didn't stay purely technical, did they?

56
00:02:51.800 --> 00:02:54.759
<v Speaker 1>They spilled over into the social realm, sometimes in really

57
00:02:54.800 --> 00:02:55.520
<v Speaker 1>severe ways.

58
00:02:55.680 --> 00:02:58.840
<v Speaker 2>Yeah. The sources talk about this concept called digital dramatization,

59
00:02:59.280 --> 00:03:01.800
<v Speaker 2>which sounds a bit academic, but it points to the

60
00:03:01.879 --> 00:03:06.560
<v Speaker 2>really serious, sometimes unintended consequences of broadcasting your life in real time.

61
00:03:06.319 --> 00:03:07.879
<v Speaker 1>What kind of consequences?

62
00:03:08.159 --> 00:03:12.719
<v Speaker 2>Well, it covers things like cyberbullying, online vengeance, but tragically,

63
00:03:13.000 --> 00:03:17.360
<v Speaker 2>it also includes things like suicides, even murders being broadcast

64
00:03:17.439 --> 00:03:19.280
<v Speaker 2>live over platforms like Facebook Live.

65
00:03:19.479 --> 00:03:23.439
<v Speaker 1>That's, yeah, that's incredibly chilling, just a horrific side effect

66
00:03:23.520 --> 00:03:28.080
<v Speaker 1>of that instant connectivity. And speaking of negative consequences, let's

67
00:03:28.080 --> 00:03:31.520
<v Speaker 1>pivot slightly to the psychological toll, because the research on

68
00:03:31.599 --> 00:03:33.960
<v Speaker 1>why people use these networks is fascinating.

69
00:03:34.159 --> 00:03:36.039
<v Speaker 2>It really is. There's a paradox, right?

70
00:03:35.960 --> 00:03:39.159
<v Speaker 1>We call them social networks, but most people aren't primarily using them

71
00:03:39.199 --> 00:03:40.840
<v Speaker 1>to socialize.

72
00:03:41.120 --> 00:03:44.479
<v Speaker 2>Exactly. Studies show the majority use them mainly to consume information,

73
00:03:44.960 --> 00:03:46.919
<v Speaker 2>so, scrolling, reading, watching.

74
00:03:46.639 --> 00:03:49.039
<v Speaker 1>And what does that passive consumption do to people?

75
00:03:49.240 --> 00:03:52.639
<v Speaker 2>Well, that seems to be the key disconnect. Researchers found

76
00:03:52.639 --> 00:03:56.639
<v Speaker 2>it often leaves users feeling, I guess, unfulfilled and unsatisfied.

77
00:03:56.800 --> 00:04:01.199
<v Speaker 2>You're constantly bombarded with everyone else's highlight reels of perfect vacations,

78
00:04:01.240 --> 00:04:03.120
<v Speaker 2>the amazing relationship.

79
00:04:02.560 --> 00:04:04.199
<v Speaker 1>Right, the curated perfection.

80
00:04:04.159 --> 00:04:08.080
<v Speaker 2>And it inevitably triggers envy, sometimes depression, and definitely that fear

81
00:04:08.120 --> 00:04:09.800
<v Speaker 2>of missing out. FOMO.

82
00:04:10.159 --> 00:04:14.319
<v Speaker 1>Yeah, FOMO is real. The sources even suggest that social

83
00:04:14.319 --> 00:04:16.879
<v Speaker 1>networking can act almost like a new drug.

84
00:04:17.439 --> 00:04:20.040
<v Speaker 2>It triggers a similar dopamine response in the brain, kind

85
00:04:20.040 --> 00:04:23.519
<v Speaker 2>of like addiction, and the research points out this compulsive

86
00:04:23.639 --> 00:04:27.600
<v Speaker 2>use can lead people to disengage from developing real world skills.

87
00:04:27.360 --> 00:04:31.439
<v Speaker 1>Which can contribute to bigger societal issues like unemployment, because

88
00:04:31.480 --> 00:04:34.600
<v Speaker 1>you're spending so much time consuming instead of I don't know,

89
00:04:34.720 --> 00:04:35.639
<v Speaker 1>learning or doing.

90
00:04:35.759 --> 00:04:37.759
<v Speaker 2>That seems to be the argument. It's presented as a

91
00:04:37.839 --> 00:04:39.680
<v Speaker 2>kind of fundamental crisis of attention.

92
00:04:39.959 --> 00:04:44.199
<v Speaker 1>Okay, so if that passive data consumption creates a personal crisis,

93
00:04:44.639 --> 00:04:48.680
<v Speaker 1>the collection of all that data creates a potential political one. Right,

94
00:04:49.079 --> 00:04:51.759
<v Speaker 1>we really have to talk about Cambridge Analytica here. That

95
00:04:51.800 --> 00:04:56.639
<v Speaker 1>feels like the moment data exploitation got undeniably political.

96
00:04:56.360 --> 00:04:59.360
<v Speaker 2>Absolutely, a landmark case. It wasn't just a simple data theft,

97
00:04:59.360 --> 00:05:01.000
<v Speaker 2>which I think is a common misconception.

98
00:05:01.279 --> 00:05:03.480
<v Speaker 1>So how did it work then? What was the mechanism?

99
00:05:03.639 --> 00:05:06.480
<v Speaker 2>It actually started with academic research. Back in twenty thirteen,

100
00:05:07.120 --> 00:05:11.079
<v Speaker 2>researchers at Cambridge University showed you could predict detailed psychographic

101
00:05:11.120 --> 00:05:15.160
<v Speaker 2>profiles pretty accurately just by analyzing someone's social media activity,

102
00:05:15.480 --> 00:05:16.160
<v Speaker 2>like their likes.

103
00:05:16.439 --> 00:05:18.920
<v Speaker 1>Okay, so the potential was known.

104
00:05:19.319 --> 00:05:24.519
<v Speaker 2>Yes, and then a researcher named Aleksandr Kogan weaponized that knowledge.

105
00:05:24.600 --> 00:05:27.480
<v Speaker 2>He created one of those personality quizzes on Facebook.

106
00:05:27.839 --> 00:05:30.160
<v Speaker 1>Yeah, remember those? Oh yeah, which Disney character are you,

107
00:05:30.240 --> 00:05:30.800
<v Speaker 1>that kind of thing?

108
00:05:30.959 --> 00:05:34.480
<v Speaker 2>Pretty much. But the real trick, the sort of malicious

109
00:05:34.480 --> 00:05:37.079
<v Speaker 2>genius of it, wasn't just getting the data from the

110
00:05:37.120 --> 00:05:37.800
<v Speaker 2>people who took the quiz.

111
00:05:37.839 --> 00:05:40.279
<v Speaker 1>Ah, right. There was more to it.

112
00:05:40.279 --> 00:05:44.199
<v Speaker 2>Way more. Taking the quiz gave Kogan access not only

113
00:05:44.199 --> 00:05:47.600
<v Speaker 2>to that user's personal data, but also the data of

114
00:05:47.639 --> 00:05:48.759
<v Speaker 2>all their friends on the platform.

115
00:05:48.680 --> 00:05:51.199
<v Speaker 1>Without the friends even taking the quiz?

116
00:05:50.920 --> 00:05:53.639
<v Speaker 2>Without them knowing or consenting at all. And Cambridge Analytica

117
00:05:53.800 --> 00:05:56.360
<v Speaker 2>used that massive pool of data, yours and your friends',

118
00:05:56.680 --> 00:05:59.959
<v Speaker 2>to build these incredibly detailed profiles. We're talking over five

119
00:06:00.079 --> 00:06:02.800
<v Speaker 2>thousand data points on something like two hundred and thirty

120
00:06:02.800 --> 00:06:04.079
<v Speaker 2>million US adults.

121
00:06:04.199 --> 00:06:08.399
<v Speaker 1>Wow, all for political ad targeting and manipulation.

122
00:06:08.480 --> 00:06:11.720
<v Speaker 2>Exactly. It just starkly revealed the immense power of this kind

123
00:06:11.720 --> 00:06:13.079
<v Speaker 2>of surveillance capitalism.

124
00:06:13.160 --> 00:06:16.000
<v Speaker 1>Okay, so if we take that idea, this granular surveillance,

125
00:06:16.319 --> 00:06:19.199
<v Speaker 1>detailed profiles, and scale it up to a national level,

126
00:06:19.560 --> 00:06:22.639
<v Speaker 1>the sources point towards what they call the ultimate social threat.

127
00:06:22.920 --> 00:06:26.160
<v Speaker 2>You mean the Chinese government's concept of a citizen's score.

128
00:06:26.319 --> 00:06:28.240
<v Speaker 1>Yeah, explain that a bit.

129
00:06:28.519 --> 00:06:32.000
<v Speaker 2>Well, it involves using technology like facial recognition combined with

130
00:06:32.000 --> 00:06:36.759
<v Speaker 2>analysis of online behavior, social media activity, financial transactions, all

131
00:06:36.800 --> 00:06:39.120
<v Speaker 2>sorts of data. To do what? To create a constantly

132
00:06:39.199 --> 00:06:44.160
<v Speaker 2>updated score evaluating a citizen's behavior. Good behavior gets rewarded,

133
00:06:44.199 --> 00:06:47.399
<v Speaker 2>bad behavior gets punished. Punished how? It could affect anything

134
00:06:47.439 --> 00:06:51.319
<v Speaker 2>from loan applications to travel restrictions, even access to certain

135
00:06:51.399 --> 00:06:56.319
<v Speaker 2>jobs or schools. And the really frightening part, beyond just

136
00:06:56.439 --> 00:07:01.240
<v Speaker 2>the surveillance? What's that? It's that this score, this judgment, follows you

137
00:07:01.279 --> 00:07:04.959
<v Speaker 2>for life. The sources argue it fundamentally hinders the human

138
00:07:04.959 --> 00:07:08.519
<v Speaker 2>ability to reinvent yourself or move past mistakes as it

139
00:07:08.560 --> 00:07:09.079
<v Speaker 2>locks you in.

140
00:07:09.279 --> 00:07:12.639
<v Speaker 1>That's dystopian, truly chilling on a societal level. Okay, let's

141
00:07:12.639 --> 00:07:14.680
<v Speaker 1>bring it back down to the individual user though, because

142
00:07:14.720 --> 00:07:18.199
<v Speaker 1>alongside these huge systemic things, we're still facing the everyday

143
00:07:18.240 --> 00:07:19.959
<v Speaker 1>cyber threats like malware.

144
00:07:20.040 --> 00:07:23.920
<v Speaker 2>Oh yeah, malware is still rampant: keyloggers snatching your passwords as

145
00:07:23.959 --> 00:07:28.079
<v Speaker 2>you type, infostealers grabbing files, banking malware trying to empty your accounts.

146
00:07:27.959 --> 00:07:31.079
<v Speaker 1>And how does it usually get onto people's devices?

147
00:07:31.519 --> 00:07:36.839
<v Speaker 2>Often through classic methods, malicious email attachments, maybe dodgy links,

148
00:07:37.079 --> 00:07:39.680
<v Speaker 2>sometimes bundled with pirated software you might download.

149
00:07:39.759 --> 00:07:43.360
<v Speaker 1>And then there's phishing, the old reliable.

150
00:07:43.120 --> 00:07:47.160
<v Speaker 2>Still incredibly effective. It's low tech social engineering, right? Attackers

151
00:07:47.399 --> 00:07:51.399
<v Speaker 2>pretending to be someone you trust, Microsoft, Amazon, Netflix, your bank.

152
00:07:51.399 --> 00:07:53.519
<v Speaker 1>Trying to trick you into clicking a link and giving

153
00:07:53.560 --> 00:07:55.959
<v Speaker 1>up your login details or credit card number.

154
00:07:55.800 --> 00:07:59.759
<v Speaker 2>Exactly, And it works because honestly, people are still really

155
00:07:59.759 --> 00:08:00.959
<v Speaker 2>bad with passwords.

156
00:08:01.160 --> 00:08:01.839
<v Speaker 1>Don't say it.

157
00:08:02.000 --> 00:08:04.639
<v Speaker 2>The research confirmed it. Even in twenty nineteen, after all

158
00:08:04.639 --> 00:08:07.480
<v Speaker 2>the major breaches we've heard about, the most common passwords

159
00:08:07.480 --> 00:08:09.600
<v Speaker 2>were still things like one, two, three, four, five, six

160
00:08:09.879 --> 00:08:11.519
<v Speaker 2>and password. Sigh.

161
00:08:11.680 --> 00:08:13.680
<v Speaker 1>We laugh, but it's also kind of sad, isn't it.

162
00:08:13.959 --> 00:08:15.680
<v Speaker 1>We know better, but convenience wins.

163
00:08:15.959 --> 00:08:19.120
<v Speaker 2>It often does, which perfectly leads us into the technical

164
00:08:19.120 --> 00:08:23.439
<v Speaker 2>fight back because researchers know about these human weaknesses. Given

165
00:08:23.480 --> 00:08:26.720
<v Speaker 2>this huge challenge, how do you share massive social data

166
00:08:26.759 --> 00:08:30.800
<v Speaker 2>sets for research without exposing individuals? What are they building?

167
00:08:30.920 --> 00:08:33.320
<v Speaker 1>Right. How do you get the value without the privacy violation?

168
00:08:33.440 --> 00:08:36.480
<v Speaker 1>That's the core problem for privacy preserving analytics.

169
00:08:36.559 --> 00:08:41.639
<v Speaker 2>Precisely. The constant fear is the identity disclosure attack, someone figuring

170
00:08:41.639 --> 00:08:45.200
<v Speaker 2>out who's who in supposedly anonymous data.

171
00:08:45.360 --> 00:08:48.159
<v Speaker 1>So what was the first big technical defense.

172
00:08:48.480 --> 00:08:52.200
<v Speaker 2>The foundational technique really was something called k-anonymity, introduced

173
00:08:52.559 --> 00:08:53.879
<v Speaker 2>way back in two thousand and two.

174
00:08:53.559 --> 00:08:56.679
<v Speaker 1>K-anonymity. Okay, what's the principle?

175
00:08:56.759 --> 00:09:00.200
<v Speaker 2>The basic idea is to make any single person's record

176
00:09:00.279 --> 00:09:03.159
<v Speaker 2>in the data set indistinguishable from at least K minus

177
00:09:03.159 --> 00:09:07.679
<v Speaker 2>one other records, usually through generalization like replacing an exact

178
00:09:07.720 --> 00:09:11.519
<v Speaker 2>age with an age range, or suppression, just removing certain data points.

179
00:09:11.279 --> 00:09:14.120
<v Speaker 1>So you blend into a small crowd of K people.

180
00:09:14.240 --> 00:09:17.320
<v Speaker 2>That's the goal. Yeah, but it had flaws.

181
00:09:17.360 --> 00:09:19.080
<v Speaker 1>If it's been around since two thousand and two, yeah, I

182
00:09:19.080 --> 00:09:20.720
<v Speaker 1>guess it wasn't perfect. What went wrong?

183
00:09:20.799 --> 00:09:24.240
<v Speaker 2>It was vulnerable to what's called a homogeneity attack. Imagine

184
00:09:24.240 --> 00:09:26.559
<v Speaker 2>your group of k people all look similar based on

185
00:09:26.600 --> 00:09:28.879
<v Speaker 2>the generalized data, like they live in the same zip code.

186
00:09:28.919 --> 00:09:31.559
<v Speaker 2>Now what if almost everyone in that group shares the

187
00:09:31.559 --> 00:09:36.039
<v Speaker 2>same sensitive attribute, say they all have a specific medical condition.

188
00:09:36.720 --> 00:09:40.399
<v Speaker 2>Even if one person's data on that condition is suppressed,

189
00:09:40.000 --> 00:09:42.679
<v Speaker 1>you can pretty much guess their status because everyone else

190
00:09:42.720 --> 00:09:44.720
<v Speaker 1>in their anonymous group has it.

191
00:09:45.200 --> 00:09:48.080
<v Speaker 2>Exactly. The lack of diversity within the group broke the anonymity.

192
00:09:48.720 --> 00:09:52.440
<v Speaker 2>It was also vulnerable to background knowledge attacks, where an

193
00:09:52.480 --> 00:09:55.799
<v Speaker 2>attacker uses external info to reidentify someone.

194
00:09:56.000 --> 00:09:59.360
<v Speaker 1>So generalization wasn't enough if the group itself was too

195
00:09:59.399 --> 00:09:59.840
<v Speaker 1>similar internally.

196
00:10:00.039 --> 00:10:02.679
<v Speaker 2>Right, so that led to the next step, l-diversity.

197
00:10:02.759 --> 00:10:05.679
<v Speaker 1>Okay, how does l-diversity improve things?

198
00:10:05.759 --> 00:10:08.799
<v Speaker 2>It adds another constraint. It says that within each of

199
00:10:08.799 --> 00:10:11.200
<v Speaker 2>those groups, the ones that look similar, there must be

200
00:10:11.200 --> 00:10:16.279
<v Speaker 2>at least l distinct, well represented values for the sensitive attribute.

201
00:10:16.399 --> 00:10:17.720
<v Speaker 2>It forces diversity into the groups,

202
00:10:17.600 --> 00:10:20.639
<v Speaker 1>making it harder to infer anything specific about

203
00:10:20.639 --> 00:10:21.360
<v Speaker 1>one individual.
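
NOTE
Editor's sketch, not part of the episode audio: a minimal check for the simplest "distinct" form of l-diversity, where every group must contain at least l different sensitive values. The rows, group labels, and the name is_l_diverse are hypothetical. The "well represented" wording in the episode points at stricter variants (entropy-based, for instance) that this sketch does not implement.
```python
from collections import defaultdict
# Hypothetical rows: (generalized group label, sensitive value).
ROWS = [
    ("20-29/941**", "flu"),
    ("20-29/941**", "flu"),
    ("20-29/941**", "flu"),
    ("40-49/100**", "diabetes"),
    ("40-49/100**", "flu"),
    ("40-49/100**", "asthma"),
]
def is_l_diverse(rows, ell):
    """True if every group contains at least `ell` distinct sensitive values."""
    values_by_group = defaultdict(set)
    for group, sensitive in rows:
        values_by_group[group].add(sensitive)
    return all(len(values) >= ell for values in values_by_group.values())
print(is_l_diverse(ROWS, ell=2))
```
The first group is 3-anonymous yet everyone in it has the same condition, which is exactly the homogeneity problem discussed earlier, so the check fails for ell=2.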

204
00:10:21.480 --> 00:10:23.799
<v Speaker 2>Much harder. But the real cutting edge now, the sort

205
00:10:23.799 --> 00:10:27.000
<v Speaker 2>of gold standard people aim for, is differential privacy.

206
00:10:27.039 --> 00:10:31.200
<v Speaker 1>Differential privacy. I've heard the term. Sounds complex. What's the core idea?

207
00:10:31.360 --> 00:10:35.960
<v Speaker 2>Instead of just generalizing or suppressing, differential privacy involves adding

208
00:10:36.000 --> 00:10:41.080
<v Speaker 2>carefully calculated mathematical noise to the data, or more accurately,

209
00:10:41.159 --> 00:10:44.200
<v Speaker 2>to the queries run on the data. Adding noise? Doesn't

210
00:10:44.240 --> 00:10:46.960
<v Speaker 2>that mess up the results? That's the clever part. The

211
00:10:47.000 --> 00:10:52.039
<v Speaker 2>noise is precisely calibrated. It's enough to make it mathematically impossible,

212
00:10:52.159 --> 00:10:54.799
<v Speaker 2>or at least very difficult, to tell if any single

213
00:10:54.799 --> 00:10:57.200
<v Speaker 2>individual's data was included in the data set.

214
00:10:57.279 --> 00:10:59.519
<v Speaker 1>Okay, protecting the individual.

215
00:10:59.080 --> 00:11:03.039
<v Speaker 2>But it's not enough to significantly change the overall aggregate

216
00:11:03.080 --> 00:11:07.440
<v Speaker 2>results or statistical trends needed for research. And crucially, it

217
00:11:07.480 --> 00:11:11.480
<v Speaker 2>allows organizations to actually quantify the level of privacy they're providing.
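
NOTE
Editor's sketch, not part of the episode audio: one common way to add "carefully calibrated noise" to a query is the Laplace mechanism, shown here for a counting query. The episode does not say which mechanism its sources use, so treat this as an assumption. The parameter name epsilon (the privacy budget) and the helpers laplace_noise and private_count are illustrative, and the ages are made-up data.
```python
import random
def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)
def private_count(values, predicate, epsilon, rng):
    """Noisy count. Adding or removing one person changes a count by at most 1,
    so the sensitivity is 1 and the noise scale is 1 / epsilon."""
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1 / epsilon, rng)
rng = random.Random(0)  # seeded only so the example is repeatable
ages = [23, 27, 25, 41, 46, 44, 35, 52]  # hypothetical data
noisy = private_count(ages, lambda a: a >= 40, epsilon=0.5, rng=rng)
print(round(noisy, 2))
```
A smaller epsilon means more noise and stronger privacy. Averaged over many runs the noisy answer stays close to the true count of 4, which is the sense in which aggregate trends survive.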

218
00:11:11.799 --> 00:11:14.120
<v Speaker 2>It gives a mathematical guarantee.

219
00:11:13.879 --> 00:11:16.320
<v Speaker 1>That sounds much more robust. Okay, let's shift to a really specific

220
00:11:16.399 --> 00:11:20.159
<v Speaker 1>challenge: location data. Our phones are constantly broadcasting where we

221
00:11:20.200 --> 00:11:22.759
<v Speaker 1>are for location based services, LBS.

222
00:11:22.240 --> 00:11:25.600
<v Speaker 2>Like maps, ride sharing, local recommendations. Yeah, very common.

223
00:11:25.639 --> 00:11:27.039
<v Speaker 1>How do you protect privacy there?

224
00:11:27.200 --> 00:11:30.799
<v Speaker 2>One approach is location k-anonymity. It's similar in spirit

225
00:11:30.840 --> 00:11:32.279
<v Speaker 2>to the original k-anonymity.

226
00:11:32.440 --> 00:11:33.799
<v Speaker 1>How does it work in practice?

227
00:11:33.960 --> 00:11:38.080
<v Speaker 2>Instead of your phone sending your exact GPS coordinates directly

228
00:11:38.120 --> 00:11:41.480
<v Speaker 2>to the map service, you might use a middleman, a

229
00:11:41.519 --> 00:11:46.720
<v Speaker 2>trusted third party sometimes called a Location Trusted Service, or LTS.

230
00:11:46.960 --> 00:11:48.360
<v Speaker 1>And what does this LTS do?

231
00:11:48.720 --> 00:11:52.320
<v Speaker 2>Your phone tells the LTS your location. The LTS then

232
00:11:52.360 --> 00:11:55.279
<v Speaker 2>finds an area called a cloaking zone that includes you

233
00:11:55.600 --> 00:11:58.799
<v Speaker 2>and at least k other users nearby. It then sends

234
00:11:58.879 --> 00:12:02.240
<v Speaker 2>that zone, not your specific point, to the map service provider.

235
00:12:02.639 --> 00:12:05.240
<v Speaker 1>So the service knows someone in this block needs directions,

236
00:12:05.240 --> 00:12:07.919
<v Speaker 1>but not exactly who or where within the block. You're

237
00:12:08.000 --> 00:12:09.840
<v Speaker 1>hidden in a small geographic crowd.

238
00:12:09.960 --> 00:12:12.320
<v Speaker 2>That's the idea, blurring your precise identity.
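
NOTE
Editor's sketch, not part of the episode audio: one simple way a trusted middleman could build a cloaking zone is to snap the requester to a grid cell and keep doubling the cell size until the cell holds enough users. The grid approach, the integer microdegree coordinates, and the names cell_of and cloaking_zone are assumptions for illustration. Here k counts the requester too, which is the usual k-anonymity convention (the episode's "k other users" phrasing would mean k + 1).
```python
def cell_of(point, size):
    """Grid cell index of a point; coordinates are integer microdegrees."""
    return (point[0] // size, point[1] // size)
def cloaking_zone(me, others, k, size=1000, max_doublings=20):
    """Return (min_lat, min_lon, max_lat, max_lon) of the smallest grid cell
    holding at least k users including the requester, or None if none is found."""
    for _ in range(max_doublings):
        mine = cell_of(me, size)
        crowd = 1 + sum(1 for p in others if cell_of(p, size) == mine)
        if crowd >= k:
            row, col = mine
            return (row * size, col * size, (row + 1) * size, (col + 1) * size)
        size *= 2
    return None
# Hypothetical positions in microdegrees (latitude, longitude).
me = (37_775_000, -122_419_000)
others = [(37_775_200, -122_419_500), (37_779_000, -122_410_000), (37_800_000, -122_450_000)]
print(cloaking_zone(me, others, k=3))
```
Because the zone is a fixed grid cell rather than a box centred on the requester, the service provider cannot recover the exact position by taking the centre of the zone.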

239
00:12:12.360 --> 00:12:16.960
<v Speaker 1>Okay. Another threat vector: the network itself, especially wireless networks.

240
00:12:17.000 --> 00:12:20.399
<v Speaker 1>We hear about rogue access points, RAPs. What's the danger

241
00:12:20.440 --> 00:12:22.399
<v Speaker 1>there for, say, a social media user?

242
00:12:22.519 --> 00:12:26.159
<v Speaker 2>Big danger, potentially. Imagine you're at a coffee shop or airport.

243
00:12:26.639 --> 00:12:29.919
<v Speaker 2>A RAP is basically an unauthorized Wi Fi hotspot set up

244
00:12:29.919 --> 00:12:33.440
<v Speaker 2>by an attacker, often mimicking the legitimate network name like

245
00:12:33.759 --> 00:12:34.759
<v Speaker 2>cafe guest Wi Fi.

246
00:12:34.960 --> 00:12:37.159
<v Speaker 1>The classic evil twin attack.

247
00:12:37.039 --> 00:12:38.879
<v Speaker 2>Exactly. If you connect your phone or laptop to it and then

248
00:12:38.879 --> 00:12:40.559
<v Speaker 2>log into your social media,

249
00:12:40.519 --> 00:12:44.799
<v Speaker 1>the attacker running the RAP can potentially intercept your username, password,

250
00:12:45.080 --> 00:12:48.320
<v Speaker 1>session cookies, basically take over your account.

251
00:12:48.120 --> 00:12:52.639
<v Speaker 2>Yep, or redirect you to fake login pages, install malware.

252
00:12:53.559 --> 00:12:56.279
<v Speaker 2>It's a major vulnerability in public spaces.

253
00:12:56.399 --> 00:12:58.440
<v Speaker 1>So how do places defend against these? How do you

254
00:12:58.480 --> 00:12:59.480
<v Speaker 1>even find them?

255
00:13:00.000 --> 00:13:02.960
<v Speaker 2>There's a system architecture proposed in the research. It

256
00:13:03.039 --> 00:13:06.759
<v Speaker 2>involves having an administrative body, maybe the coffee shop owner

257
00:13:06.840 --> 00:13:09.679
<v Speaker 2>or IT staff, use a dedicated Wi Fi scanner.

258
00:13:09.799 --> 00:13:10.679
<v Speaker 1>What does the scanner do?

259
00:13:10.960 --> 00:13:13.519
<v Speaker 2>It listens for the beacon frames that all Wi Fi

260
00:13:13.559 --> 00:13:17.600
<v Speaker 2>access points constantly broadcast. These frames contain key info like

261
00:13:17.679 --> 00:13:22.120
<v Speaker 2>the AP's MAC address, its unique hardware ID, the network name, SSID,

262
00:13:22.480 --> 00:13:25.960
<v Speaker 2>security settings, signal strength, RSSI.

263
00:13:25.600 --> 00:13:29.240
<v Speaker 1>Okay, it gathers the data on all nearby APs. Then

264
00:13:29.279 --> 00:13:30.120
<v Speaker 1>what?

265
00:13:30.000 --> 00:13:33.159
<v Speaker 2>That collected data is immediately compared against a preapproved database, a

266
00:13:33.159 --> 00:13:35.759
<v Speaker 2>whitelist of all the legitimate access points that should be

267
00:13:35.799 --> 00:13:36.679
<v Speaker 2>operating in that area.

268
00:13:36.759 --> 00:13:38.799
<v Speaker 1>Ah, so if the scanner picks up an AP whose

269
00:13:38.799 --> 00:13:40.000
<v Speaker 1>details aren't on the whitelist...

270
00:13:39.840 --> 00:13:43.120
<v Speaker 2>Bingo. That's flagged as a potential rogue access point

271
00:13:43.120 --> 00:13:44.120
<v Speaker 2>that needs investigation.
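
NOTE
Editor's sketch, not part of the episode audio: the whitelist comparison described above, with beacon data supplied as plain records rather than captured from a radio. The Beacon structure, the WHITELIST contents, and the function find_rogues are hypothetical. No real Wi Fi scanning library is used or implied.
```python
from dataclasses import dataclass
@dataclass(frozen=True)
class Beacon:
    """Fields a scanner might read from one beacon frame (hypothetical structure)."""
    bssid: str     # the AP's MAC address
    ssid: str      # the network name
    security: str  # for example "WPA2" or "OPEN"
    rssi_dbm: int  # signal strength
# Hypothetical whitelist keyed by MAC address: (expected SSID, expected security).
WHITELIST = {
    "aa:bb:cc:00:00:01": ("CafeGuestWiFi", "WPA2"),
}
def find_rogues(beacons, whitelist):
    """Return beacons whose MAC is unknown, or whose SSID or security setting
    differs from the whitelist entry for that MAC."""
    rogues = []
    for beacon in beacons:
        expected = whitelist.get(beacon.bssid.lower())
        if expected != (beacon.ssid, beacon.security):
            rogues.append(beacon)
    return rogues
scan = [
    Beacon("AA:BB:CC:00:00:01", "CafeGuestWiFi", "WPA2", -48),
    Beacon("de:ad:be:ef:00:02", "CafeGuestWiFi", "OPEN", -40),
]
for rogue in find_rogues(scan, WHITELIST):
    print("possible rogue AP:", rogue.bssid, rogue.ssid)
```
The second beacon copies the legitimate network name but comes from an unlisted MAC address, so it is flagged. MAC addresses can be spoofed, so a match on the whitelist is a first filter rather than proof of legitimacy.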

272
00:13:44.360 --> 00:13:48.279
<v Speaker 1>Makes sense. Okay, shifting gears again into the really modern

273
00:13:48.279 --> 00:13:53.559
<v Speaker 1>stuff: AI and automation in security. Content moderation is a huge one.

274
00:13:53.279 --> 00:13:58.360
<v Speaker 2>Absolutely massive. The scale is just impossible for humans alone. YouTube,

275
00:13:58.399 --> 00:14:01.200
<v Speaker 2>for instance, apparently sees something like five hundred hours of

276
00:14:01.279 --> 00:14:02.519
<v Speaker 2>video uploaded every single minute.

277
00:14:02.360 --> 00:14:05.120
<v Speaker 1>Five hundred hours a minute. You can't possibly

278
00:14:05.120 --> 00:14:07.320
<v Speaker 1>have humans watch all that.

279
00:14:07.320 --> 00:14:11.840
<v Speaker 2>No way. So automation, specifically machine learning, is essential. Facebook, for example,

280
00:14:12.039 --> 00:14:17.480
<v Speaker 2>uses ML pretty heavily to proactively find and remove harmful content.

281
00:14:17.399 --> 00:14:18.960
<v Speaker 1>Like what kind of content?

282
00:14:18.759 --> 00:14:21.879
<v Speaker 2>They reported, for instance, removing something like twenty six million

283
00:14:21.879 --> 00:14:24.919
<v Speaker 2>pieces of content related to global terrorist groups over a

284
00:14:24.960 --> 00:14:27.279
<v Speaker 2>period, and claimed that ninety nine percent of it was

285
00:14:27.320 --> 00:14:30.720
<v Speaker 2>removed proactively by their AI systems before any human even

286
00:14:30.799 --> 00:14:31.279
<v Speaker 2>flagged it.

287
00:14:31.399 --> 00:14:33.759
<v Speaker 1>Ninety nine percent. That sounds incredibly effective.

288
00:14:33.879 --> 00:14:37.320
<v Speaker 2>It is, technologically speaking, but that remaining one percent, given

289
00:14:37.320 --> 00:14:39.840
<v Speaker 2>the volumes, can still represent a lot of harmful content

290
00:14:39.879 --> 00:14:42.679
<v Speaker 2>slipping through. And automation still really struggles in some areas.

291
00:14:42.639 --> 00:14:43.919
<v Speaker 1>Where does it fall down?

292
00:14:44.320 --> 00:14:49.039
<v Speaker 2>The big challenges are subjectivity and context. How do you

293
00:14:49.080 --> 00:14:53.480
<v Speaker 2>train an AI to definitively understand vague concepts like terrorism

294
00:14:53.559 --> 00:14:56.159
<v Speaker 2>or obscenity across different cultures and contexts? It's incredibly hard.

295
00:14:56.200 --> 00:14:59.000
<v Speaker 1>Right, context is everything.

296
00:14:59.279 --> 00:15:03.120
<v Speaker 2>Remember the controversy over the historical Napalm Girl photo

297
00:15:03.200 --> 00:15:06.919
<v Speaker 2>from the Vietnam War. It's a famous, important photo depicting

298
00:15:07.039 --> 00:15:11.600
<v Speaker 2>violence and nudity. Some platforms' automated systems initially flagged and

299
00:15:11.639 --> 00:15:15.600
<v Speaker 2>removed it, completely missing the vital historical and newsworthy context.

300
00:15:15.759 --> 00:15:19.200
<v Speaker 1>Because the algorithm just saw nudity and violence.

301
00:15:19.240 --> 00:15:20.879
<v Speaker 2>Pretty much. It lacked the human understanding of nuance.

302
00:15:21.039 --> 00:15:24.080
<v Speaker 1>And you also have people actively trying to fool these systems, right? Adversaries.

303
00:15:24.200 --> 00:15:29.639
<v Speaker 2>Yes, adversarial attacks are a constant problem. Sophisticated groups,

304
00:15:29.759 --> 00:15:32.960
<v Speaker 2>knowing their content might get flagged, actively try to modify

305
00:15:33.000 --> 00:15:35.919
<v Speaker 2>it to evade detection by the machine learning classifiers.

306
00:15:35.960 --> 00:15:36.600
<v Speaker 1>How do they do that?

307
00:15:36.919 --> 00:15:41.279
<v Speaker 2>For example, research showed ISIS affiliates learned to avoid certain

308
00:15:41.399 --> 00:15:46.159
<v Speaker 2>high risk keywords associated with terrorism. Instead, they started using

309
00:15:46.240 --> 00:15:49.759
<v Speaker 2>more neutral language like just saying Islamic State group, which

310
00:15:49.759 --> 00:15:53.120
<v Speaker 2>apparently helped their accounts stay active longer before the automated

311
00:15:53.159 --> 00:15:53.879
<v Speaker 2>systems caught on.

312
00:15:54.120 --> 00:15:57.320
<v Speaker 1>They're literally learning how the AI works and adapting their

313
00:15:57.320 --> 00:16:00.600
<v Speaker 1>tactics to bypass it. It's a constant cat-and-mouse game.

314
00:16:00.480 --> 00:16:04.559
<v Speaker 2>It really is, which brings us to maybe the most intriguing,

315
00:16:04.759 --> 00:16:09.840
<v Speaker 2>almost sci-fi end of this adversarial spectrum: using covert channels.

316
00:16:10.000 --> 00:16:13.399
<v Speaker 1>Covert channels. So this isn't about bypassing moderation; it's about

317
00:16:13.480 --> 00:16:15.799
<v Speaker 1>hiding communication completely.

318
00:16:15.399 --> 00:16:19.480
<v Speaker 2>Exactly, making it invisible to defenders. One fascinating piece of

319
00:16:19.519 --> 00:16:23.240
<v Speaker 2>research explored using steganography, hiding data within other data, to

320
00:16:23.320 --> 00:16:25.200
<v Speaker 2>run a botnet's command and control structure.

321
00:16:25.279 --> 00:16:28.000
<v Speaker 1>Using Twitter? Hiding botnet commands in tweets? How?

322
00:16:28.720 --> 00:16:31.159
<v Speaker 2>They didn't hide it in the text of the tweet,

323
00:16:31.200 --> 00:16:33.960
<v Speaker 2>which might be detectable. Instead, they used the length of

324
00:16:34.000 --> 00:16:36.159
<v Speaker 2>the Twitter post itself as the secret channel.

325
00:16:36.240 --> 00:16:38.039
<v Speaker 1>The number of characters?

326
00:16:38.200 --> 00:16:40.879
<v Speaker 2>Yep. Back when Twitter had that one hundred and forty character limit,

327
00:16:41.399 --> 00:16:43.919
<v Speaker 2>the specific length of the tweet, say one hundred and

328
00:16:43.960 --> 00:16:47.240
<v Speaker 2>twelve characters versus one hundred and thirty-one, would correspond

329
00:16:47.279 --> 00:16:50.240
<v Speaker 2>to an encrypted command being sent from the botmaster to

330
00:16:50.320 --> 00:16:52.000
<v Speaker 2>the infected computers in the botnet.

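NOTE
A minimal sketch of the length-based covert channel described in the preceding cues, not the researchers' actual implementation.
Assumptions: the command names and lookup table are hypothetical, the values 112 and 131 simply echo the example lengths mentioned by the speaker,
and the encryption step mentioned in the discussion is left out.
```python
# Hypothetical length-to-command table (command names are invented for illustration).
LENGTH_TO_COMMAND = {112: "sleep", 131: "report"}
COMMAND_TO_LENGTH = {cmd: n for n, cmd in LENGTH_TO_COMMAND.items()}
MAX_TWEET_LENGTH = 140  # the old Twitter limit mentioned in the discussion
def encode(command: str, cover_text: str) -> str:
    """Pad or trim harmless cover text so that its length signals the command."""
    if not cover_text:
        raise ValueError("cover_text must be non-empty")
    target = COMMAND_TO_LENGTH[command]
    assert target <= MAX_TWEET_LENGTH
    # Repeat the cover text until it is long enough, then cut to the exact length.
    repeats = target // len(cover_text) + 1
    text = ((cover_text + " ") * repeats)[:target]
    # Avoid a trailing space, which a platform might silently trim.
    return text[:-1] + "." if text.endswith(" ") else text
def decode(tweet: str):
    """Return the command signalled by the tweet length, or None for ordinary tweets."""
    return LENGTH_TO_COMMAND.get(len(tweet))
```
Usage under the same assumptions: decode(encode("report", "lovely weather today")) gives back "report", while a tweet of any other length decodes to None.
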
331
00:16:52.039 --> 00:16:55.000
<v Speaker 1>Okay, that's clever, but wait, if a single account just

332
00:16:55.039 --> 00:16:59.000
<v Speaker 1>started posting tweets with weirdly specific repeating lengths, wouldn't that

333
00:16:59.039 --> 00:17:03.120
<v Speaker 1>stick out like a sore thumb? Wouldn't monitoring systems flag that pattern?

334
00:17:03.320 --> 00:17:06.720
<v Speaker 2>Good point. That's the next layer. They needed plausible cover

335
00:17:06.880 --> 00:17:10.160
<v Speaker 2>for the accounts sending these length-encoded messages. They couldn't

336
00:17:10.200 --> 00:17:12.920
<v Speaker 2>look like obvious bots. So what did they do? They

337
00:17:13.000 --> 00:17:15.799
<v Speaker 2>used another bit of AI, a Markov chain model trained

338
00:17:15.799 --> 00:17:19.000
<v Speaker 2>on a massive data set of real Twitter usernames. This

339
00:17:19.079 --> 00:17:22.240
<v Speaker 2>model learned the patterns of typical usernames and could then

340
00:17:22.319 --> 00:17:26.960
<v Speaker 2>generate new, completely artificial usernames that sounded convincingly human.

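NOTE
A minimal sketch of how a character-level Markov chain can generate plausible-looking usernames, as described in the preceding cues.
Assumptions: the toy corpus is invented, the order of 2 and the helper names train and generate are illustrative choices,
and the research discussed reportedly trained on a far larger set of real usernames with details not given here.
```python
import random
from collections import defaultdict
START, END = "^", "$"  # padding symbols assumed not to occur in usernames
def train(usernames, order=2):
    """Record which character follows each `order`-character context."""
    model = defaultdict(list)
    for name in usernames:
        padded = START * order + name + END
        for i in range(len(padded) - order):
            model[padded[i:i + order]].append(padded[i + order])
    return model
def generate(model, rng, order=2, max_len=15):
    """Walk the chain from the start context until END is drawn or max_len is reached."""
    context, out = START * order, []
    while len(out) < max_len:
        nxt = rng.choice(model[context])
        if nxt == END:
            break
        out.append(nxt)
        context = (context + nxt)[-order:]
    return "".join(out)
# Invented toy corpus, for illustration only.
corpus = ["sarah_jones", "mike87", "jonas_k", "sara_m", "mikejones", "kate_smith"]
model = train(corpus)
name = generate(model, random.Random(0))  # seeded so repeated runs give the same name
```
Each generated name reuses only character transitions seen in the corpus, which is why the output tends to resemble the training names without copying any single one.
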
341
00:17:27.039 --> 00:17:30.119
<v Speaker 1>They generated fake, but real sounding usernames to post the

342
00:17:30.200 --> 00:17:31.839
<v Speaker 1>secret length tweets.

343
00:17:31.480 --> 00:17:35.519
<v Speaker 2>Exactly, and to test if these generated usernames were actually plausible,

344
00:17:35.799 --> 00:17:39.319
<v Speaker 2>they ran an experiment using Amazon Mechanical Turk, asking real

345
00:17:39.400 --> 00:17:42.319
<v Speaker 2>humans to rate the generated names alongside real ones.

346
00:17:42.440 --> 00:17:43.640
<v Speaker 1>And what did the humans say?

347
00:17:43.880 --> 00:17:48.039
<v Speaker 2>They rated the automatically generated, natural-sounding usernames as

348
00:17:48.200 --> 00:17:52.359
<v Speaker 2>highly plausible. They couldn't easily distinguish them from real accounts.

349
00:17:52.640 --> 00:17:55.599
<v Speaker 2>It showed they could effectively conceal not just the hidden

350
00:17:55.599 --> 00:17:58.400
<v Speaker 2>message channel, but also the identity of the accounts using it.

351
00:17:58.559 --> 00:18:01.279
<v Speaker 1>Wow, So they built AI not just to carry out

352
00:18:01.359 --> 00:18:06.200
<v Speaker 1>the attack, but specifically to fool other AI detection systems

353
00:18:06.240 --> 00:18:09.039
<v Speaker 1>and even human intuition. That's quite something.

354
00:18:09.119 --> 00:18:11.039
<v Speaker 2>It really shows the sophistication we're up against.

355
00:18:11.119 --> 00:18:14.680
<v Speaker 1>Okay, we've covered a huge range here, from the psychological

356
00:18:14.720 --> 00:18:18.160
<v Speaker 1>toll of just passively scrolling all the way to botnets

357
00:18:18.240 --> 00:18:22.119
<v Speaker 1>hiding commands in tweet lengths using AI-generated usernames. The

358
00:18:22.160 --> 00:18:26.240
<v Speaker 1>core tension just seems clearer than ever: this amazing convenience

359
00:18:26.279 --> 00:18:31.440
<v Speaker 1>of being hyperconnected versus the constantly evolving, incredibly sophisticated threats.

360
00:18:31.680 --> 00:18:34.000
<v Speaker 2>Absolutely, and it really brings it back to the importance

361
00:18:34.039 --> 00:18:36.319
<v Speaker 2>of individual vigilance. You know, you can't just rely on

362
00:18:36.359 --> 00:18:39.599
<v Speaker 2>the platforms or technology to protect you completely. For mobile

363
00:18:39.599 --> 00:18:44.119
<v Speaker 2>security specifically, the sources mentioned a useful acronym: SRP.

364
00:18:45.000 --> 00:18:46.480
<v Speaker 1>SRP. Okay, break that down.

365
00:18:47.039 --> 00:18:50.359
<v Speaker 2>S is for secure networks, being cautious about public Wi-Fi.

366
00:18:50.559 --> 00:18:54.599
<v Speaker 2>Maybe using a VPN. R is for risks awareness, just

367
00:18:54.839 --> 00:18:58.039
<v Speaker 2>understanding the kinds of threats we've talked about, like phishing

368
00:18:58.079 --> 00:19:02.279
<v Speaker 2>and malware. And P is for protect personal information.

369
00:19:02.920 --> 00:19:05.559
<v Speaker 2>And this isn't just about passwords. Think about all those

370
00:19:05.599 --> 00:19:08.799
<v Speaker 2>fun quizzes or surveys you fill out online. What was

371
00:19:08.839 --> 00:19:11.359
<v Speaker 2>your first pet's name? What street did you grow up on?

372
00:19:11.640 --> 00:19:14.000
<v Speaker 1>Ah, classic security question answers.

373
00:19:13.880 --> 00:19:17.559
<v Speaker 2>Exactly. Attackers can collect those seemingly harmless bits of info

374
00:19:17.640 --> 00:19:20.519
<v Speaker 2>you share publicly and potentially use them later to bypass

375
00:19:20.519 --> 00:19:23.359
<v Speaker 2>security questions if they manage to compromise part of your

376
00:19:23.400 --> 00:19:26.039
<v Speaker 2>login, like your password. Don't make it easy for them.

377
00:19:26.160 --> 00:19:28.519
<v Speaker 1>That's a really practical point. Be mindful of all the

378
00:19:28.519 --> 00:19:31.359
<v Speaker 1>info you share. All right, as we wrap up, here's

379
00:19:31.359 --> 00:19:34.359
<v Speaker 1>one final, maybe provocative thought for you, the listener. Based

380
00:19:34.359 --> 00:19:37.160
<v Speaker 1>on the sources, even when you think you're being careful

381
00:19:37.240 --> 00:19:41.279
<v Speaker 1>deleting files, using private browsing modes, that digital footprint, it's

382
00:19:41.440 --> 00:19:43.039
<v Speaker 1>rarely ever truly gone.

383
00:19:43.200 --> 00:19:47.400
<v Speaker 2>That's a stark reality. Your network provider, your internet service

384
00:19:47.440 --> 00:19:50.599
<v Speaker 2>provider at home, or your mobile carrier on your phone,

385
00:19:50.799 --> 00:19:54.319
<v Speaker 2>they maintain extensive logs. Logs of the sites you visit,

386
00:19:54.319 --> 00:19:55.119
<v Speaker 2>the connections you make.

387
00:19:55.079 --> 00:19:58.160
<v Speaker 1>And that data isn't necessarily private forever.

388
00:19:58.519 --> 00:20:02.400
<v Speaker 2>No. In many jurisdictions, like the US, that electronic evidence

389
00:20:02.400 --> 00:20:06.400
<v Speaker 2>can be legally requested, obtained, and even admitted into court proceedings.

390
00:20:06.480 --> 00:20:09.480
<v Speaker 1>So the bottom line is what you access online is

391
00:20:09.519 --> 00:20:12.440
<v Speaker 1>almost never truly private from everyone.

392
00:20:12.240 --> 00:20:16.119
<v Speaker 2>Pretty much. And maybe that realization, more than any specific technical

393
00:20:16.119 --> 00:20:18.720
<v Speaker 2>defense we discussed, should be the biggest incentive for all

394
00:20:18.759 --> 00:20:21.240
<v Speaker 2>of us to be thoughtful and vigilant about how we

395
00:20:21.359 --> 00:20:24.680
<v Speaker 2>navigate this incredibly complex, hyperconnected world.

396
00:20:24.920 --> 00:20:27.519
<v Speaker 1>A sobering but essential point to end on. This has

397
00:20:27.559 --> 00:20:31.599
<v Speaker 1>been a fantastic deep dive into the social networking security battleground.

398
00:20:31.839 --> 00:20:34.559
<v Speaker 1>We really hope this synthesized knowledge gives you some greater

399
00:20:34.640 --> 00:20:37.160
<v Speaker 1>insight as you navigate your own digital life. Thanks for

400
00:20:37.279 --> 00:20:37.720
<v Speaker 1>joining us.
