WEBVTT

1
00:00:00.160 --> 00:00:03.120
<v Speaker 1>Welcome to the Deep Dive. Today, we're taking a deep

2
00:00:03.160 --> 00:00:06.799
<v Speaker 1>dive into the world of digital security and investigative journalism,

3
00:00:07.440 --> 00:00:09.080
<v Speaker 1>and we're going to be doing that through the lens

4
00:00:09.199 --> 00:00:14.240
<v Speaker 1>of Micah Lee's book, Hacks, Leaks, and Revelations. I've got

5
00:00:14.279 --> 00:00:16.440
<v Speaker 1>to say, you shared some really fascinating excerpts with me,

6
00:00:16.600 --> 00:00:18.679
<v Speaker 1>and I'm really excited to get into it.

7
00:00:18.719 --> 00:00:20.719
<v Speaker 2>Yeah. It's really interesting how Lee kind of pulls back the

8
00:00:20.760 --> 00:00:24.559
<v Speaker 2>curtain on this whole world of data leaks and how

9
00:00:24.719 --> 00:00:28.120
<v Speaker 2>journalists can actually use cutting-edge technology to get to

10
00:00:28.160 --> 00:00:30.480
<v Speaker 2>the truth. You know, it's not just about looking at

11
00:00:30.480 --> 00:00:33.359
<v Speaker 2>the raw data, it's about understanding like the context and

12
00:00:33.479 --> 00:00:36.359
<v Speaker 2>you know, really importantly the potential impact on everyone.

13
00:00:36.759 --> 00:00:38.479
<v Speaker 1>Yeah, it kind of feels like we're stepping into like

14
00:00:38.520 --> 00:00:42.399
<v Speaker 1>a real life digital detective story. Yeah, you know, speaking

15
00:00:42.439 --> 00:00:44.759
<v Speaker 1>of detectives, one of the things that really stood out

16
00:00:44.759 --> 00:00:48.280
<v Speaker 1>to me was Lee's emphasis on data sensitivity. He makes

17
00:00:48.280 --> 00:00:50.439
<v Speaker 1>the point that not all data sets are created equal,

18
00:00:50.600 --> 00:00:54.399
<v Speaker 1>especially when you're dealing with information that could expose really

19
00:00:54.399 --> 00:00:55.280
<v Speaker 1>powerful entities.

20
00:00:55.399 --> 00:00:58.200
<v Speaker 2>Yeah, it definitely makes you think about like where we

21
00:00:58.240 --> 00:01:00.359
<v Speaker 2>store our data. You know, Lee actually argues that

22
00:01:00.439 --> 00:01:05.719
<v Speaker 2>everyday cloud services like Google Drive, which seem really convenient,

23
00:01:05.799 --> 00:01:07.879
<v Speaker 2>might not be the best choice for investigations that are

24
00:01:07.959 --> 00:01:12.319
<v Speaker 2>highly sensitive because they're vulnerable to legal requests. Right. So

25
00:01:12.359 --> 00:01:13.879
<v Speaker 2>that means that you know, a company with a team

26
00:01:13.879 --> 00:01:16.959
<v Speaker 2>of lawyers could potentially access your data.

27
00:01:17.040 --> 00:01:20.359
<v Speaker 1>Okay, so no storing my top secret expose on Google Docs.

28
00:01:21.000 --> 00:01:23.239
<v Speaker 1>Got it. But isn't the cloud supposed to be like

29
00:01:23.319 --> 00:01:26.159
<v Speaker 1>super secure? Why would I need anything more than that?

30
00:01:26.599 --> 00:01:29.120
<v Speaker 2>Well, that's a great question, and it is true that

31
00:01:29.159 --> 00:01:32.840
<v Speaker 2>cloud providers do offer some security measures, but they are

32
00:01:32.879 --> 00:01:36.560
<v Speaker 2>still subject to legal processes. So if you're investigating like

33
00:01:36.560 --> 00:01:39.719
<v Speaker 2>a really powerful entity like a corporation, they might have

34
00:01:39.760 --> 00:01:43.120
<v Speaker 2>the resources to compel those providers to hand over your data.

35
00:01:44.120 --> 00:01:46.799
<v Speaker 2>And so for those cases, Lee really recommends taking extra

36
00:01:46.879 --> 00:01:49.599
<v Speaker 2>precautions and using more robust solutions.

37
00:01:49.799 --> 00:01:51.760
<v Speaker 1>That makes sense. So let's say we have taken those

38
00:01:51.760 --> 00:01:55.120
<v Speaker 1>precautions and we've actually received a leaked data set. How can

39
00:01:55.159 --> 00:01:58.120
<v Speaker 1>we be sure it's authentic? Can't anyone just like fabricate

40
00:01:58.200 --> 00:01:59.400
<v Speaker 1>data these days?

41
00:01:59.480 --> 00:02:03.560
<v Speaker 2>Oh, absolutely. Verifying the authenticity of your sources is absolutely crucial,

42
00:02:03.760 --> 00:02:06.519
<v Speaker 2>and Lee actually uses some pretty clever methods to confirm

43
00:02:06.560 --> 00:02:10.199
<v Speaker 2>the legitimacy of the data sets he works with. For example,

44
00:02:10.199 --> 00:02:12.960
<v Speaker 2>when he received the data set from America's Frontline Doctors,

45
00:02:13.240 --> 00:02:16.520
<v Speaker 2>a group that's been known for promoting misinformation about COVID-19,

46
00:02:17.000 --> 00:02:21.919
<v Speaker 2>he actually cross-referenced patient information with public posts on Gab.

47
00:02:22.240 --> 00:02:24.680
<v Speaker 2>It was almost like a digital fingerprint that helped confirm

48
00:02:24.719 --> 00:02:25.599
<v Speaker 2>the data's origin.

49
00:02:25.840 --> 00:02:28.560
<v Speaker 1>Wow, so he was able to trace the data back

50
00:02:28.599 --> 00:02:32.080
<v Speaker 1>to the source just using public information. That's really impressive.

51
00:02:32.240 --> 00:02:35.560
<v Speaker 2>And another compelling example is how he verified a data

52
00:02:35.560 --> 00:02:39.080
<v Speaker 2>set that was leaked from a private WikiLeaks Twitter group.

53
00:02:39.919 --> 00:02:42.400
<v Speaker 2>He did this by saving the HTML of a Twitter

54
00:02:42.439 --> 00:02:46.080
<v Speaker 2>direct message conversation. So by doing that, by preserving it

55
00:02:46.120 --> 00:02:48.439
<v Speaker 2>in its original form, he could prove that it was genuine.

56
00:02:48.439 --> 00:02:50.800
<v Speaker 1>So it's all about finding those digital breadcrumbs that lead

57
00:02:50.840 --> 00:02:53.439
<v Speaker 1>back to the source. But we can't forget about protecting

58
00:02:53.439 --> 00:02:56.879
<v Speaker 1>ourselves and our sources too, right, What does Lee recommend

59
00:02:56.879 --> 00:02:59.199
<v Speaker 1>for staying safe in this digital landscape?

60
00:02:59.280 --> 00:03:02.960
<v Speaker 2>Well, strong passwords and two-factor authentication. He says those

61
00:03:02.960 --> 00:03:06.599
<v Speaker 2>are non-negotiables. You know, imagine you work on a

62
00:03:06.639 --> 00:03:10.759
<v Speaker 2>sensitive investigation. Maybe you're uncovering corruption, right, you're exposing wrongdoing.

63
00:03:11.000 --> 00:03:12.879
<v Speaker 2>You need to protect yourself and the people who trust

64
00:03:12.960 --> 00:03:16.439
<v Speaker 2>you with information. A password manager like KeePassXC is

65
00:03:16.439 --> 00:03:19.280
<v Speaker 2>a great option. It's like a digital vault for your passwords.

66
00:03:19.639 --> 00:03:22.840
<v Speaker 2>It can generate really strong, unique passwords and keep them

67
00:03:22.840 --> 00:03:25.120
<v Speaker 2>safe even if your device is compromised.
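
NOTE
Editor's sketch, not from the episode or the book: a password generator boils down to drawing characters from a cryptographically secure random source. The function name and the default length are assumptions chosen for illustration, and this is not KeePassXC's actual code.
```python
import secrets
import string
# Hypothetical helper: builds a random password from letters, digits, and punctuation.
def generate_password(length: int = 24) -> str:
    alphabet = string.ascii_letters + string.digits + string.punctuation
    # secrets.choice draws from the operating system's secure random source.
    return "".join(secrets.choice(alphabet) for _ in range(length))
if __name__ == "__main__":
    print(generate_password())
```
A password manager would then store the result in an encrypted database so that each account can get its own unique password.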

68
00:03:25.439 --> 00:03:28.560
<v Speaker 1>Yeah, I can see how that would be essential, especially

69
00:03:28.599 --> 00:03:30.680
<v Speaker 1>when the stakes are so high. What about disk encryption?

70
00:03:30.800 --> 00:03:32.560
<v Speaker 1>Is that something I should be considering?

71
00:03:32.680 --> 00:03:36.400
<v Speaker 2>Disk encryption? Absolutely, it's another really crucial layer of protection.

72
00:03:36.960 --> 00:03:39.680
<v Speaker 2>Think about it. If you lose your laptop or say

73
00:03:39.680 --> 00:03:43.800
<v Speaker 2>it's confiscated, and your disk isn't encrypted, anyone with access

74
00:03:44.039 --> 00:03:47.479
<v Speaker 2>can read all your data. Encryption basically scrambles that data.

75
00:03:47.520 --> 00:03:50.680
<v Speaker 2>It makes it unreadable without the correct key. And Lee

76
00:03:50.759 --> 00:03:54.039
<v Speaker 2>even provides step-by-step instructions for enabling disk

77
00:03:54.120 --> 00:03:56.759
<v Speaker 2>encryption on Windows, so you know it's not as daunting

78
00:03:56.800 --> 00:03:57.520
<v Speaker 2>as it might sound.

79
00:03:57.759 --> 00:04:02.639
<v Speaker 1>Okay, I'm convinced. Disk encryption it is. Now, shifting gears

80
00:04:02.639 --> 00:04:04.520
<v Speaker 1>a little bit. You mentioned the book also delves into

81
00:04:04.719 --> 00:04:07.039
<v Speaker 1>spear phishing and election interference.

82
00:04:07.240 --> 00:04:09.719
<v Speaker 2>Yeah, that's where things get really interesting and honestly a

83
00:04:09.759 --> 00:04:13.680
<v Speaker 2>little unsettling. Lee details a spear-phishing attack that targeted election

84
00:04:13.800 --> 00:04:17.120
<v Speaker 2>workers in North Carolina. And these weren't just your random

85
00:04:17.199 --> 00:04:21.360
<v Speaker 2>like spam emails. They were very carefully crafted messages designed

86
00:04:21.360 --> 00:04:25.560
<v Speaker 2>to trick specific individuals into revealing sensitive information. And this

87
00:04:25.600 --> 00:04:28.600
<v Speaker 2>particular attack was only brought to light because of Reality Winner,

88
00:04:28.920 --> 00:04:31.959
<v Speaker 2>the whistleblower who leaked the classified documents exposing it.

89
00:04:32.279 --> 00:04:35.279
<v Speaker 1>So this leak had major real-world consequences. It wasn't

90
00:04:35.319 --> 00:04:37.920
<v Speaker 1>just about exposing like a security flaw, it was about

91
00:04:38.000 --> 00:04:41.600
<v Speaker 1>uncovering a threat to democracy. I'm curious, does Lee offer

92
00:04:41.639 --> 00:04:44.120
<v Speaker 1>any ways to protect ourselves from these kinds of attacks?

93
00:04:44.399 --> 00:04:47.319
<v Speaker 2>Well, he definitely highlights the importance of being aware of

94
00:04:47.360 --> 00:04:50.680
<v Speaker 2>these phishing tactics and being really cautious about opening emails

95
00:04:50.680 --> 00:04:53.879
<v Speaker 2>that look suspicious or clicking on links you don't recognize.

96
00:04:54.000 --> 00:04:57.439
<v Speaker 2>But on a broader level, this example really underscores the importance

97
00:04:57.480 --> 00:05:03.800
<v Speaker 2>of robust cybersecurity measures, especially for critical infrastructure like election systems.

98
00:05:04.319 --> 00:05:06.279
<v Speaker 1>Yeah, it makes you wonder just how much we don't

99
00:05:06.319 --> 00:05:08.279
<v Speaker 1>know about, right? It's almost like an iceberg: what we

100
00:05:08.319 --> 00:05:11.800
<v Speaker 1>see is just the tip. Now, BitTorrent. Lee explains how

101
00:05:11.839 --> 00:05:15.560
<v Speaker 1>this file-sharing technology can be used to resist data censorship.

102
00:05:15.839 --> 00:05:19.800
<v Speaker 2>Yeah. BitTorrent is fascinating because it's decentralized, so instead of

103
00:05:19.800 --> 00:05:22.560
<v Speaker 2>relying on a single server to host a file, it's

104
00:05:22.600 --> 00:05:25.800
<v Speaker 2>distributed across a whole network of users, and this makes

105
00:05:25.839 --> 00:05:28.360
<v Speaker 2>it really difficult to censor. Even if one user is

106
00:05:28.399 --> 00:05:30.959
<v Speaker 2>shut down, others can still share the data. And it

107
00:05:31.000 --> 00:05:33.360
<v Speaker 2>was actually used to distribute the BlueLeaks data set,

108
00:05:33.399 --> 00:05:36.759
<v Speaker 2>you know, that massive trove of law enforcement documents, making

109
00:05:36.800 --> 00:05:39.079
<v Speaker 2>it accessible even when people tried to censor it.

110
00:05:39.160 --> 00:05:41.639
<v Speaker 1>That's incredible. It's like a digital hydra. You cut off

111
00:05:41.680 --> 00:05:44.920
<v Speaker 1>one head and two more grow back. Sounds like a powerful

112
00:05:44.959 --> 00:05:47.480
<v Speaker 1>tool for whistleblowers and activists who want to bring information

113
00:05:47.519 --> 00:05:51.480
<v Speaker 1>to light. But what about everyday communication with sources. Surely

114
00:05:51.560 --> 00:05:53.399
<v Speaker 1>we're not using BitTorrent for that.

115
00:05:53.800 --> 00:05:57.800
<v Speaker 2>Not exactly. For secure day to day communication, Lee strongly

116
00:05:57.839 --> 00:06:03.279
<v Speaker 2>recommends Signal, an encrypted messaging app that prioritizes user privacy, and

117
00:06:03.399 --> 00:06:06.800
<v Speaker 2>it uses end-to-end encryption, meaning that only the

118
00:06:06.879 --> 00:06:09.240
<v Speaker 2>sender and the recipient can read those messages.

119
00:06:09.399 --> 00:06:11.680
<v Speaker 1>Okay, I've heard of Signal, but can you explain how

120
00:06:11.680 --> 00:06:14.639
<v Speaker 1>this encryption actually works? It sounds really complex.

121
00:06:14.800 --> 00:06:17.480
<v Speaker 2>It's actually a pretty simple concept. Think about like sending

122
00:06:17.480 --> 00:06:20.560
<v Speaker 2>a postcard through the mail, right? Anyone who handles it

123
00:06:20.600 --> 00:06:23.639
<v Speaker 2>can read your message. With end-to-end encryption, it's

124
00:06:23.639 --> 00:06:26.240
<v Speaker 2>like you're sending that postcard in a locked box and

125
00:06:26.279 --> 00:06:28.920
<v Speaker 2>only the recipient has the key to open it. Even

126
00:06:28.959 --> 00:06:31.279
<v Speaker 2>Signal itself can't access your messages.

127
00:06:31.360 --> 00:06:33.800
<v Speaker 1>That's a great analogy. So it's like having a private

128
00:06:33.839 --> 00:06:37.240
<v Speaker 1>conversation in a crowded room, but digitally. Does Lee mention

129
00:06:37.360 --> 00:06:39.040
<v Speaker 1>any other tools for secure communication?

130
00:06:39.360 --> 00:06:42.560
<v Speaker 2>He does. He also talks about Tor, which anonymizes your

131
00:06:42.560 --> 00:06:46.879
<v Speaker 2>internet traffic, OnionShare, which allows for secure file sharing,

132
00:06:47.399 --> 00:06:50.399
<v Speaker 2>and PGP, which is an email encryption standard. It's like

133
00:06:50.439 --> 00:06:53.879
<v Speaker 2>a whole arsenal of tools for protecting communication in the

134
00:06:53.920 --> 00:06:54.639
<v Speaker 2>digital age.

135
00:06:54.920 --> 00:06:57.360
<v Speaker 1>It sounds like staying ahead of surveillance is a full

136
00:06:57.399 --> 00:06:59.959
<v Speaker 1>time job. But what happens when we're actually dealing with

137
00:07:00.199 --> 00:07:03.519
<v Speaker 1>like massive data sets. Does Lee offer any guidance on

138
00:07:03.560 --> 00:07:05.759
<v Speaker 1>how to manage those securely?

139
00:07:06.000 --> 00:07:08.160
<v Speaker 2>Yeah, he suggests a few things, setting up like a

140
00:07:08.199 --> 00:07:11.839
<v Speaker 2>tips page on your website to encourage submissions, using SecureDrop

141
00:07:11.920 --> 00:07:15.319
<v Speaker 2>for anonymous communication, and setting up a cloud server

142
00:07:15.439 --> 00:07:18.360
<v Speaker 2>specifically for data processing. You know, it's about creating a

143
00:07:18.360 --> 00:07:21.519
<v Speaker 2>secure pipeline for receiving and analyzing information.

144
00:07:22.120 --> 00:07:24.040
<v Speaker 1>Okay, so it's not just about getting the data, it's

145
00:07:24.040 --> 00:07:27.199
<v Speaker 1>about having the infrastructure to handle it. And speaking of infrastructure,

146
00:07:27.319 --> 00:07:29.920
<v Speaker 1>Lee's a big advocate for using the command line interface

147
00:07:30.079 --> 00:07:34.279
<v Speaker 1>or CLI for navigating and manipulating this data. Why is that?

148
00:07:34.639 --> 00:07:37.120
<v Speaker 2>The command line might seem kind of intimidating at first,

149
00:07:37.480 --> 00:07:41.279
<v Speaker 2>but Lee argues that it's incredibly powerful and efficient. It's

150
00:07:41.279 --> 00:07:44.279
<v Speaker 2>almost like speaking directly to your computer, giving it these

151
00:07:44.319 --> 00:07:47.920
<v Speaker 2>precise instructions, you know, without all the graphical distractions. So

152
00:07:48.040 --> 00:07:52.439
<v Speaker 2>basic commands like clear to clear your screen, or pwd

153
00:07:52.639 --> 00:07:54.439
<v Speaker 2>to see where you are in the file system,

154
00:07:54.800 --> 00:07:57.399
<v Speaker 2>and ls to list the files in a directory. Those

155
00:07:57.439 --> 00:07:59.920
<v Speaker 2>are kind of the foundation of navigating the command line.

156
00:08:00.079 --> 00:08:02.279
<v Speaker 1>It does sound like learning a new language. Are there

157
00:08:02.319 --> 00:08:04.720
<v Speaker 1>other commands he recommends for working with data?

158
00:08:04.839 --> 00:08:08.639
<v Speaker 2>Oh yeah, definitely. He goes through commands for like creating, deleting, moving,

159
00:08:08.680 --> 00:08:12.120
<v Speaker 2>and copying files and directories, and even viewing the contents

160
00:08:12.120 --> 00:08:14.639
<v Speaker 2>of files. All these commands can be combined to do

161
00:08:14.680 --> 00:08:16.079
<v Speaker 2>these really powerful tasks.

162
00:08:16.279 --> 00:08:18.759
<v Speaker 1>So it's about building up your vocabulary of commands to

163
00:08:18.800 --> 00:08:23.240
<v Speaker 1>do increasingly complex tasks. But what about actually editing those files?

164
00:08:23.279 --> 00:08:25.079
<v Speaker 1>Are we using the command line for that too?

165
00:08:25.399 --> 00:08:28.759
<v Speaker 2>For that, we need text editors. Lee differentiates between what

166
00:08:28.800 --> 00:08:32.840
<v Speaker 2>are called text files, which contain readable characters, and binary files,

167
00:08:32.879 --> 00:08:34.840
<v Speaker 2>which are not meant to be read by humans. He

168
00:08:34.879 --> 00:08:38.519
<v Speaker 2>recommends using Visual Studio Code. It's free and it's really user-friendly.

169
00:08:38.320 --> 00:08:40.240
<v Speaker 1>Makes sense. I mean, a lot of the data

170
00:08:40.240 --> 00:08:42.600
<v Speaker 1>from leaks comes in text based format, so having a

171
00:08:42.639 --> 00:08:44.840
<v Speaker 1>good text editor is essential.

172
00:08:45.159 --> 00:08:47.519
<v Speaker 2>Exactly. And because we often end up doing the same tasks

173
00:08:47.600 --> 00:08:50.360
<v Speaker 2>over and over again when we're analyzing data, Lee introduces

174
00:08:50.399 --> 00:08:52.759
<v Speaker 2>what's called shell scripting. And what this does is it

175
00:08:52.799 --> 00:08:56.240
<v Speaker 2>allows you to automate those repetitive tasks, saving

176
00:08:56.320 --> 00:08:57.399
<v Speaker 2>you time and effort.

177
00:08:57.799 --> 00:09:00.360
<v Speaker 1>It's like having a digital assistant that takes care of

178
00:09:00.399 --> 00:09:02.440
<v Speaker 1>all the boring stuff so you can focus on the

179
00:09:02.480 --> 00:09:05.519
<v Speaker 1>bigger picture. But what about when our own computers just

180
00:09:05.519 --> 00:09:08.720
<v Speaker 1>aren't powerful enough to handle these massive data sets. What

181
00:09:08.759 --> 00:09:09.600
<v Speaker 1>does Lee suggest then?

182
00:09:09.639 --> 00:09:13.639
<v Speaker 2>That's where cloud servers come in, and Lee specifically

183
00:09:13.639 --> 00:09:17.200
<v Speaker 2>recommends a company called DigitalOcean. These servers can be

184
00:09:17.279 --> 00:09:20.360
<v Speaker 2>much more powerful than your average laptop, and they offer

185
00:09:20.360 --> 00:09:24.519
<v Speaker 2>a lot of advantages like scalability, remote access, and better

186
00:09:24.600 --> 00:09:26.639
<v Speaker 2>bandwidth for large file transfers.

187
00:09:26.799 --> 00:09:29.960
<v Speaker 1>So it's like upgrading your computer's brain and its Internet

188
00:09:30.000 --> 00:09:34.120
<v Speaker 1>connection all at once for those heavy duty data crunching sessions.

189
00:09:34.200 --> 00:09:37.200
<v Speaker 2>Yeah exactly. And Lee actually guides us through the process

190
00:09:37.240 --> 00:09:40.200
<v Speaker 2>of setting up a server on DigitalOcean, connecting to

191
00:09:40.279 --> 00:09:45.000
<v Speaker 2>it securely using SSH, and even installing software updates. It's

192
00:09:45.039 --> 00:09:47.360
<v Speaker 2>like a step by step guide that makes this world

193
00:09:47.360 --> 00:09:49.879
<v Speaker 2>of cloud computing much less mysterious.

194
00:09:50.399 --> 00:09:53.399
<v Speaker 1>This is all starting to feel very James Bond. You know,

195
00:09:54.080 --> 00:09:58.240
<v Speaker 1>secret servers, remote access, encrypted files. But how do we

196
00:09:58.320 --> 00:10:00.679
<v Speaker 1>keep those servers organized and secure?

197
00:10:00.840 --> 00:10:03.279
<v Speaker 2>Well, that's where Docker comes in. Docker allows you to

198
00:10:03.360 --> 00:10:07.279
<v Speaker 2>isolate applications and their dependencies within these things called containers.

199
00:10:07.360 --> 00:10:11.960
<v Speaker 2>They're like miniature virtual environments. This ensures that your applications

200
00:10:12.039 --> 00:10:15.240
<v Speaker 2>run consistently no matter what operating system you're using, and

201
00:10:15.279 --> 00:10:17.559
<v Speaker 2>it makes the whole process of setting up software a

202
00:10:17.559 --> 00:10:18.240
<v Speaker 2>lot simpler.

203
00:10:18.399 --> 00:10:21.000
<v Speaker 1>So it's like having tiny little compartments for your applications,

204
00:10:21.080 --> 00:10:23.759
<v Speaker 1>keeping everything nice and tidy and preventing them from interfering

205
00:10:23.799 --> 00:10:24.360
<v Speaker 1>with each other.

206
00:10:24.559 --> 00:10:27.240
<v Speaker 2>Yeah, exactly. And Lee doesn't just tell us what Docker is.

207
00:10:27.279 --> 00:10:29.759
<v Speaker 2>He actually shows us how to use it. He walks

208
00:10:29.799 --> 00:10:34.679
<v Speaker 2>us through installing Docker, running containers, managing storage, even using

209
00:10:34.720 --> 00:10:38.399
<v Speaker 2>Docker Compose, which is for more complex applications that use

210
00:10:38.480 --> 00:10:39.440
<v Speaker 2>multiple containers.

211
00:10:39.679 --> 00:10:42.080
<v Speaker 1>It sounds like we're getting a crash course in Docker.

212
00:10:42.519 --> 00:10:44.639
<v Speaker 1>But once we have our data on that cloud server

213
00:10:44.840 --> 00:10:48.000
<v Speaker 1>and it's safely tucked away in its Docker container. How

214
00:10:48.039 --> 00:10:49.840
<v Speaker 1>do we actually start analyzing it?

215
00:10:50.200 --> 00:10:55.000
<v Speaker 2>Well, for interactive data analysis and visualization, Lee recommends Jupyter Notebook.

216
00:10:55.759 --> 00:10:58.120
<v Speaker 2>It's a very powerful tool that lets you write code,

217
00:10:58.440 --> 00:11:01.240
<v Speaker 2>execute it, and see the results all in one place.

218
00:11:01.399 --> 00:11:05.120
<v Speaker 1>So it's like a digital laboratory for experimenting with data.

219
00:11:05.799 --> 00:11:08.559
<v Speaker 2>Exactly. And Lee explains how to run Jupyter Notebook inside of

220
00:11:08.559 --> 00:11:10.840
<v Speaker 2>a Docker container so you can access it remotely from

221
00:11:10.840 --> 00:11:11.519
<v Speaker 2>your own computer.

222
00:11:11.720 --> 00:11:14.399
<v Speaker 1>It's all about bringing those data analysis tools to where

223
00:11:14.399 --> 00:11:16.919
<v Speaker 1>the data lives. Now, what happens when you have a

224
00:11:17.000 --> 00:11:19.279
<v Speaker 1>really huge data set and you want to make it searchable?

225
00:11:19.320 --> 00:11:20.240
<v Speaker 1>Is there a tool for that?

226
00:11:20.399 --> 00:11:22.679
<v Speaker 2>There is. It's called Aleph, and it's an open

227
00:11:22.720 --> 00:11:27.720
<v Speaker 2>source intelligence platform specifically for searching and analyzing large data sets.

228
00:11:28.360 --> 00:11:30.399
<v Speaker 2>You can think of it like having your own personal Google,

229
00:11:30.480 --> 00:11:31.480
<v Speaker 2>but for your data.

230
00:11:31.879 --> 00:11:34.639
<v Speaker 1>That sounds incredibly useful. But how do we even begin

231
00:11:34.840 --> 00:11:36.120
<v Speaker 1>to set something like that up?

232
00:11:36.200 --> 00:11:39.080
<v Speaker 2>Well, don't worry. Lee walks us through the entire process,

233
00:11:39.240 --> 00:11:42.799
<v Speaker 2>from setting up Aleph using Docker Compose, to importing

234
00:11:42.879 --> 00:11:46.919
<v Speaker 2>data sets and then exploring that data using Aleph's web interface.

235
00:11:46.960 --> 00:11:49.960
<v Speaker 1>Wow, it's amazing how many tools are available for investigative journalists

236
00:11:50.000 --> 00:11:52.320
<v Speaker 1>these days. It seems like you need a whole toolbox

237
00:11:52.480 --> 00:11:55.440
<v Speaker 1>just to get started. But let's not forget that sometimes

238
00:11:55.480 --> 00:11:58.720
<v Speaker 1>the most revealing information comes from the most, I guess

239
00:11:58.759 --> 00:12:03.000
<v Speaker 1>you could say, mundane sources, like email dumps. What does Lee

240
00:12:03.080 --> 00:12:03.759
<v Speaker 1>say about those?

241
00:12:03.879 --> 00:12:06.440
<v Speaker 2>You're absolutely right. Email dumps might seem like a relic

242
00:12:06.480 --> 00:12:09.240
<v Speaker 2>of the past, but they can still be incredibly valuable.

243
00:12:09.799 --> 00:12:12.879
<v Speaker 2>Lee takes us into this world of leaked emails, explaining

244
00:12:12.919 --> 00:12:16.240
<v Speaker 2>the common formats like EML and mbox, the structure of

245
00:12:16.279 --> 00:12:19.320
<v Speaker 2>those messages, and even how to analyze them using tools

246
00:12:19.360 --> 00:12:21.279
<v Speaker 2>like Thunderbird and Microsoft Outlook.
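
NOTE
Editor's sketch, not from the episode: Python's standard email package can read a single EML-style message and pull out its headers and body. The sample message and the summarize helper are made up for illustration.
```python
from email import message_from_string, policy
# Made-up sample message standing in for one .eml file from a leak.
RAW_EML = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: Quarterly numbers\n"
    "Date: Mon, 01 Jan 2024 09:00:00 +0000\n"
    "\n"
    "See the attached spreadsheet.\n"
)
def summarize(raw: str) -> dict:
    # policy.default yields an EmailMessage with convenient header access.
    msg = message_from_string(raw, policy=policy.default)
    return {
        "from": str(msg["From"]),
        "to": str(msg["To"]),
        "subject": str(msg["Subject"]),
        "body": msg.get_content().strip(),
    }
if __name__ == "__main__":
    print(summarize(RAW_EML))
```
For an mbox file, the standard library's mailbox.mbox class can iterate over the stored messages in a similar way.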

247
00:12:21.360 --> 00:12:24.759
<v Speaker 1>Makes you wonder what secrets are hidden within those seemingly

248
00:12:24.919 --> 00:12:26.600
<v Speaker 1>ordinary inboxes, doesn't it?

249
00:12:26.600 --> 00:12:28.960
<v Speaker 2>It really does, And when you're dealing with a really

250
00:12:29.039 --> 00:12:33.080
<v Speaker 2>large volume of emails, traditional email clients might not be enough,

251
00:12:33.600 --> 00:12:35.679
<v Speaker 2>and that's where Python programming can really come in handy.

252
00:12:35.639 --> 00:12:38.159
<v Speaker 1>Ah, Python. I knew it would make an appearance.

253
00:12:38.600 --> 00:12:41.679
<v Speaker 1>Why is Python so important for investigative journalism?

254
00:12:42.159 --> 00:12:45.840
<v Speaker 2>Well, Python's incredibly versatile. It's used in a wide variety

255
00:12:45.840 --> 00:12:49.840
<v Speaker 2>of applications, from web development to data science, and for

256
00:12:49.879 --> 00:12:53.639
<v Speaker 2>investigative journalists, it's a really powerful tool for automating tasks,

257
00:12:54.240 --> 00:12:58.320
<v Speaker 2>analyzing data and just generally extracting insights from these large

258
00:12:58.360 --> 00:12:58.960
<v Speaker 2>data sets.

259
00:12:59.000 --> 00:13:02.519
<v Speaker 1>So it's like our digital Swiss Army knife for working with data.

260
00:13:02.159 --> 00:13:05.200
<v Speaker 2>Precisely, and Lee doesn't assume that you have any prior

261
00:13:05.240 --> 00:13:10.440
<v Speaker 2>programming experience. He introduces the basics like variables, lists, strings,

262
00:13:10.519 --> 00:13:14.639
<v Speaker 2>conditional statements, loops, functions. He even walks you through writing

263
00:13:14.679 --> 00:13:18.960
<v Speaker 2>simple Python scripts for text manipulation, data extraction, and analysis.

264
00:13:19.120 --> 00:13:21.200
<v Speaker 1>So even if you're a complete coding newbie, you can

265
00:13:21.279 --> 00:13:24.240
<v Speaker 1>still follow along and learn the ropes. But what about

266
00:13:24.240 --> 00:13:26.919
<v Speaker 1>people who are already familiar with Python? Is there anything

267
00:13:26.960 --> 00:13:27.840
<v Speaker 1>for them in this book?

268
00:13:28.200 --> 00:13:31.399
<v Speaker 2>Definitely. Lee dives into more advanced techniques too. He talks

269
00:13:31.399 --> 00:13:36.679
<v Speaker 2>about modules, functions, and these data structures called dictionaries. For example,

270
00:13:36.679 --> 00:13:39.879
<v Speaker 2>he covers using the Click module for creating command line interfaces,

271
00:13:40.399 --> 00:13:44.720
<v Speaker 2>the os module for filesystem operations, and the csv module

272
00:13:44.919 --> 00:13:46.519
<v Speaker 2>for processing CSV files.
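
NOTE
Editor's sketch, not the book's code: the csv and os modules mentioned here ship with Python, and csv.DictReader turns each row into one of the dictionaries the speakers describe. The sample data and both helper names are hypothetical.
```python
import csv
import io
import os
# Made-up CSV text standing in for a file from a data set.
SAMPLE_CSV = "name,agency\nAlice,Example PD\nBob,Sample Sheriff\n"
def rows_from_csv(text: str) -> list:
    # DictReader maps each row to a dictionary keyed by the header row.
    return list(csv.DictReader(io.StringIO(text)))
def csv_files_in(folder: str) -> list:
    # os.listdir returns the names in a folder; keep only the .csv ones.
    return sorted(name for name in os.listdir(folder) if name.lower().endswith(".csv"))
if __name__ == "__main__":
    for row in rows_from_csv(SAMPLE_CSV):
        print(row["name"], "works at", row["agency"])
    print(csv_files_in(os.getcwd()))
```
Reading from a string keeps the sketch self-contained; with a real file, the same DictReader would wrap an opened file object instead.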

273
00:13:46.679 --> 00:13:49.039
<v Speaker 1>So it's all about expanding your Python toolkit for more

274
00:13:49.039 --> 00:13:52.080
<v Speaker 1>efficient and sophisticated data analysis. But before we get too

275
00:13:52.159 --> 00:13:55.519
<v Speaker 1>carried away with Python, let's not forget about the humble spreadsheet,

276
00:13:55.679 --> 00:13:57.440
<v Speaker 1>a mainstay of investigative journalism.

277
00:13:57.519 --> 00:14:00.759
<v Speaker 2>Oh absolutely, Lee reminds us that CSV files, which are

278
00:14:00.799 --> 00:14:03.360
<v Speaker 2>often used to store what we call structured data, can

279
00:14:03.399 --> 00:14:07.720
<v Speaker 2>be easily viewed and analyzed using spreadsheet software like LibreOffice Calc

280
00:14:07.919 --> 00:14:09.080
<v Speaker 2>or Microsoft Excel.

281
00:14:09.320 --> 00:14:11.480
<v Speaker 1>So it's all about choosing the right tool for the job,

282
00:14:11.600 --> 00:14:15.159
<v Speaker 1>whether it's command line wizardry, Python scripting, or good old

283
00:14:15.200 --> 00:14:18.879
<v Speaker 1>fashioned spreadsheet analysis. But it seems like Lee takes it

284
00:14:18.879 --> 00:14:22.600
<v Speaker 1>a step further, right? He actually guides readers through building

285
00:14:22.600 --> 00:14:26.240
<v Speaker 1>a BlueLeaks Explorer app using Python and these web

286
00:14:26.279 --> 00:14:29.440
<v Speaker 1>development frameworks like Flask and SQLAlchemy.

287
00:14:28.919 --> 00:14:32.240
<v Speaker 2>You're right, and this app provides a really user friendly

288
00:14:32.279 --> 00:14:36.600
<v Speaker 2>interface for browsing and searching that massive BlueLeaks data set,

289
00:14:37.039 --> 00:14:40.200
<v Speaker 2>even generating reports based on the data. It's a practical

290
00:14:40.240 --> 00:14:42.879
<v Speaker 2>example of how Python can be used to unlock the

291
00:14:42.919 --> 00:14:44.559
<v Speaker 2>secrets hidden within these leaks.

292
00:14:44.720 --> 00:14:47.360
<v Speaker 1>Wow, it's like we're building our own digital command center

293
00:14:47.399 --> 00:14:51.399
<v Speaker 1>for investigative journalism. But speaking of really interesting data sets,

294
00:14:51.519 --> 00:14:55.240
<v Speaker 1>Lee also analyzes the Parler data set, which contained millions

295
00:14:55.279 --> 00:14:58.000
<v Speaker 1>of videos and metadata from that social media platform. What

296
00:14:58.120 --> 00:14:59.960
<v Speaker 1>kind of insights was he able to get from that data.

297
00:15:00.360 --> 00:15:02.039
<v Speaker 2>One of the most interesting things he did was actually

298
00:15:02.159 --> 00:15:06.480
<v Speaker 2>use Python to extract GPS coordinates from that video metadata.

299
00:15:06.559 --> 00:15:10.000
<v Speaker 1>GPS coordinates? That's next-level digital detective work.

300
00:15:10.000 --> 00:15:13.320
<v Speaker 2>It is, and by extracting those coordinates, Lee was actually

301
00:15:13.360 --> 00:15:16.759
<v Speaker 2>able to pinpoint the exact locations where those videos were filmed.

302
00:15:17.240 --> 00:15:20.440
<v Speaker 2>And then he even converted that data into KML files,

303
00:15:20.519 --> 00:15:21.519
<v Speaker 2>which can be viewed in Google Earth.

304
00:15:21.279 --> 00:15:23.360
<v Speaker 1>So he was basically able to put

305
00:15:23.360 --> 00:15:25.799
<v Speaker 1>those Parler videos on a map.

306
00:15:26.639 --> 00:15:30.440
<v Speaker 2>Exactly. This technique is incredibly powerful. Imagine being able to see

307
00:15:30.519 --> 00:15:33.240
<v Speaker 2>where people were gathering during a protest or a rally.

308
00:15:34.039 --> 00:15:36.960
<v Speaker 2>You could potentially even identify individuals who were present.

309
00:15:37.200 --> 00:15:41.120
<v Speaker 1>It's like creating a digital breadcrumb trail. Now, Lee also

310
00:15:41.159 --> 00:15:44.120
<v Speaker 1>investigated the Epik data breach, didn't he? What was so

311
00:15:44.159 --> 00:15:45.440
<v Speaker 1>significant about that one?

312
00:15:45.639 --> 00:15:47.879
<v Speaker 2>Well, the Epik data breach is really important because it

313
00:15:47.960 --> 00:15:51.919
<v Speaker 2>involved SQL databases, which are basically a fundamental part of

314
00:15:51.960 --> 00:15:55.159
<v Speaker 2>how many websites and online services store their data.

315
00:15:55.279 --> 00:15:57.559
<v Speaker 1>SQL databases. Can you remind me what those are?

316
00:15:57.639 --> 00:16:00.600
<v Speaker 2>Sure. Think of a database like a really organized collection

317
00:16:00.639 --> 00:16:04.200
<v Speaker 2>of data. And SQL, which stands for Structured Query Language,

318
00:16:04.399 --> 00:16:07.480
<v Speaker 2>is basically a special language that's used to interact with

319
00:16:07.519 --> 00:16:10.279
<v Speaker 2>those databases. It's how you ask questions and get answers

320
00:16:10.279 --> 00:16:10.919
<v Speaker 2>from that data.

321
00:16:11.000 --> 00:16:14.799
<v Speaker 1>So it's like Python, but specifically for talking to databases.

322
00:16:14.480 --> 00:16:17.759
<v Speaker 2>Right, and there are different types of SQL databases. Epik

323
00:16:17.879 --> 00:16:20.840
<v Speaker 2>used something called MySQL, which is a popular open-source

324
00:16:20.879 --> 00:16:23.559
<v Speaker 2>database system. And Lee explains all of this and even

325
00:16:23.559 --> 00:16:25.519
<v Speaker 2>shows you how to use a free tool called

326
00:16:25.600 --> 00:16:28.600
<v Speaker 2>MariaDB to explore that leaked Epik data.

327
00:16:28.799 --> 00:16:31.039
<v Speaker 1>So it's like he's giving us a crash course in

328
00:16:31.120 --> 00:16:33.039
<v Speaker 1>SQL database management in a way.

329
00:16:33.120 --> 00:16:35.799
<v Speaker 2>Yes, and by doing so, he's kind of empowering us

330
00:16:35.879 --> 00:16:38.679
<v Speaker 2>to investigate these types of leaks ourselves. You know, we're

331
00:16:38.720 --> 00:16:41.559
<v Speaker 2>not just passively reading about his findings. We're learning the

332
00:16:41.559 --> 00:16:43.360
<v Speaker 2>skills to make our own discoveries.

333
00:16:43.559 --> 00:16:46.279
<v Speaker 1>It's amazing how he takes these complex topics and breaks

334
00:16:46.279 --> 00:16:50.480
<v Speaker 1>them down. But Lee also explored the AFLDS healthcare data leak,

335
00:16:50.679 --> 00:16:53.120
<v Speaker 1>and this one actually involved patient records. What did he

336
00:16:53.200 --> 00:16:53.679
<v Speaker 1>find there?

337
00:16:53.840 --> 00:16:57.200
<v Speaker 2>Yeah, this leak was particularly concerning because it contained really

338
00:16:57.240 --> 00:17:02.360
<v Speaker 2>sensitive patient information like medical conditions, treatments, even credit card details.

339
00:17:02.960 --> 00:17:06.200
<v Speaker 2>Lee used Python to analyze this data, look for patterns,

340
00:17:06.279 --> 00:17:09.759
<v Speaker 2>and create visualizations to support investigative reporting.

341
00:17:10.039 --> 00:17:12.839
<v Speaker 1>It sounds like a story with very real world consequences,

342
00:17:12.960 --> 00:17:17.680
<v Speaker 1>especially considering that AFLDS was promoting unproven medical treatments.

343
00:17:18.160 --> 00:17:23.240
<v Speaker 2>Exactly. Lee's analysis really highlights the potential dangers of misinformation, especially

344
00:17:23.240 --> 00:17:24.240
<v Speaker 2>when it comes to healthcare.

345
00:17:24.480 --> 00:17:27.359
<v Speaker 1>It's a stark reminder that data breaches can have really

346
00:17:27.400 --> 00:17:31.480
<v Speaker 1>serious implications that go far beyond just the digital world.

347
00:17:31.759 --> 00:17:33.960
<v Speaker 1>But before we wrap up this part of our deep dive,

348
00:17:34.000 --> 00:17:37.680
<v Speaker 1>I want to talk about one last data set, Discord chat logs.

349
00:17:38.200 --> 00:17:41.000
<v Speaker 1>I imagine those can provide a wealth of information, especially

350
00:17:41.039 --> 00:17:43.759
<v Speaker 1>when it comes to investigating online groups and their activities.

351
00:17:43.880 --> 00:17:48.920
<v Speaker 2>Absolutely. Leaked Discord chat logs can be incredibly revealing, and

352
00:17:49.000 --> 00:17:51.359
<v Speaker 2>Lee describes how these logs are structured. He uses tools

353
00:17:51.400 --> 00:17:53.880
<v Speaker 2>to explore them, and even builds a custom web app

354
00:17:53.880 --> 00:17:54.759
<v Speaker 2>for analyzing them.

355
00:17:54.920 --> 00:17:57.319
<v Speaker 1>Wow, this is all so much to take in, and

356
00:17:57.359 --> 00:17:59.839
<v Speaker 1>I know we're only scratching the surface of Micah Lee's Hacks,

357
00:17:59.880 --> 00:18:02.599
<v Speaker 1>Leaks and Revelations, but we'll have to continue our deep

358
00:18:02.599 --> 00:18:04.160
<v Speaker 1>dive in part two. Stay tuned.

359
00:18:04.440 --> 00:18:06.359
<v Speaker 2>One thing that really stood out to me in Hacks,

360
00:18:06.440 --> 00:18:10.119
<v Speaker 2>Leaks and Revelations was how Micah Lee really emphasizes working

361
00:18:10.119 --> 00:18:13.200
<v Speaker 2>with structured data, especially in spreadsheets.

362
00:18:13.440 --> 00:18:16.039
<v Speaker 1>Yeah, it feels like we're getting like a master class

363
00:18:16.039 --> 00:18:19.519
<v Speaker 1>in data analysis, but you know, without that stuffy classroom setting.

364
00:18:19.880 --> 00:18:22.880
<v Speaker 1>And he even reminds us that sometimes the simplest tools

365
00:18:23.000 --> 00:18:24.200
<v Speaker 1>can be the most effective.

366
00:18:24.400 --> 00:18:29.799
<v Speaker 2>Absolutely, he really highlights how valuable CSV files are. They're

367
00:18:29.880 --> 00:18:33.759
<v Speaker 2>basically spreadsheets without all the fancy formatting.

368
00:18:33.359 --> 00:18:35.759
<v Speaker 1>So they're like the raw ingredients of a spreadsheet, just

369
00:18:35.799 --> 00:18:37.480
<v Speaker 1>waiting to be analyzed.

370
00:18:37.839 --> 00:18:40.519
<v Speaker 2>Exactly. And even though we have these powerful tools like Python

371
00:18:40.559 --> 00:18:44.400
<v Speaker 2>and SQL databases, sometimes good old-fashioned spreadsheet software is

372
00:18:44.400 --> 00:18:47.039
<v Speaker 2>the best way to go for viewing and working with

373
00:18:47.119 --> 00:18:48.039
<v Speaker 2>CSV files.

374
00:18:48.240 --> 00:18:50.200
<v Speaker 1>It's like choosing the right tool for the job, right?

375
00:18:50.279 --> 00:18:52.319
<v Speaker 1>Sometimes you need a blender and sometimes you just need

376
00:18:52.319 --> 00:18:55.759
<v Speaker 1>a good knife. But does he give any specific examples

377
00:18:55.799 --> 00:18:57.559
<v Speaker 1>of how this works in practice?

378
00:18:57.680 --> 00:18:59.680
<v Speaker 2>He does. He uses the BlueLeaks data set as

379
00:18:59.680 --> 00:19:02.720
<v Speaker 2>an example, and specifically he focuses on what are called

380
00:19:03.000 --> 00:19:08.000
<v Speaker 2>Suspicious Activity Reports, or SARs, from the Northern California Regional

381
00:19:08.079 --> 00:19:09.480
<v Speaker 2>Intelligence Center.

382
00:19:09.519 --> 00:19:12.200
<v Speaker 1>SARs? That sounds interesting. What kind of information do they contain?

383
00:19:12.559 --> 00:19:16.880
<v Speaker 2>SARs are basically reports filed by law enforcement agencies about

384
00:19:16.880 --> 00:19:20.400
<v Speaker 2>individuals or activities that they consider suspicious. They can include

385
00:19:20.400 --> 00:19:24.240
<v Speaker 2>a wide range of information, from physical descriptions to observed

386
00:19:24.319 --> 00:19:28.240
<v Speaker 2>behaviors to alleged connections to criminal activity.

387
00:19:28.599 --> 00:19:31.240
<v Speaker 1>Sounds like a window into the world of surveillance. What

388
00:19:31.319 --> 00:19:33.039
<v Speaker 1>did Lee uncover from these reports?

389
00:19:33.240 --> 00:19:36.119
<v Speaker 2>Well, he used Python to extract and format the data

390
00:19:36.359 --> 00:19:39.319
<v Speaker 2>from the BlueLeaks CSV file, making it easier to

391
00:19:39.359 --> 00:19:42.200
<v Speaker 2>read and analyze, and then he showed how to search

392
00:19:42.240 --> 00:19:45.519
<v Speaker 2>for specific keywords and uncover patterns in the data.

393
00:19:45.599 --> 00:19:48.480
<v Speaker 1>So he's not just passively reading these SARs, he's actively

394
00:19:48.519 --> 00:19:50.839
<v Speaker 1>interrogating them using Python.

395
00:19:51.319 --> 00:19:53.240
<v Speaker 2>Exactly. And this technique can be applied to other data sets

396
00:19:53.279 --> 00:19:56.319
<v Speaker 2>as well. It's about using Python to transform that data

397
00:19:56.400 --> 00:19:58.799
<v Speaker 2>into a more useful and insightful format.

398
00:19:59.039 --> 00:20:02.279
<v Speaker 1>This is fascinating stuff. Did Lee analyze any other parts

399
00:20:02.279 --> 00:20:03.640
<v Speaker 1>of the BlueLeaks data set?

400
00:20:03.920 --> 00:20:06.319
<v Speaker 2>He did. He also looked at the bulk emails sent

401
00:20:06.359 --> 00:20:07.559
<v Speaker 2>out by fusion centers.

402
00:20:07.640 --> 00:20:09.519
<v Speaker 1>Fusion centers? Can you remind me what those are again?

403
00:20:09.720 --> 00:20:12.920
<v Speaker 2>Sure. Fusion centers are basically intelligence-sharing hubs that were

404
00:20:12.920 --> 00:20:15.559
<v Speaker 2>created after the 9/11 attacks to try to

405
00:20:15.559 --> 00:20:20.200
<v Speaker 2>improve communication and collaboration between different law enforcement agencies. They

406
00:20:20.240 --> 00:20:24.240
<v Speaker 2>often send out these bulk emails to local police officers

407
00:20:24.559 --> 00:20:29.920
<v Speaker 2>with information about potential threats, suspicious activities, even upcoming training

408
00:20:29.920 --> 00:20:30.839
<v Speaker 2>courses.

409
00:20:30.839 --> 00:20:33.319
<v Speaker 1>I see. So they play a crucial role in keeping everyone informed.

410
00:20:33.319 --> 00:20:35.759
<v Speaker 1>But did Lee find anything noteworthy in these emails?

411
00:20:36.240 --> 00:20:39.319
<v Speaker 2>He focused on a file called EmailBuilder.csv,

412
00:20:39.599 --> 00:20:43.119
<v Speaker 2>which contained the content of these bulk emails, and interestingly,

413
00:20:43.160 --> 00:20:45.960
<v Speaker 2>he points out that the email bodies in the CSV

414
00:20:46.000 --> 00:20:50.319
<v Speaker 2>file are actually HTML templates, meaning they contain these placeholders

415
00:20:50.359 --> 00:20:53.359
<v Speaker 2>that are filled in with specific information when those emails

416
00:20:53.359 --> 00:20:53.799
<v Speaker 2>are sent out.

417
00:20:53.880 --> 00:20:56.759
<v Speaker 1>So it's like Mad Libs for law enforcement communication.

418
00:20:57.000 --> 00:20:59.319
<v Speaker 2>Exactly. And to make these emails easier to read, he actually

419
00:20:59.319 --> 00:21:03.119
<v Speaker 2>wrote a Python script that extracts the relevant information, formats

420
00:21:03.160 --> 00:21:05.880
<v Speaker 2>it properly using HTML tags and then saves it as

421
00:21:05.920 --> 00:21:06.880
<v Speaker 2>an HTML file.

422
00:21:07.079 --> 00:21:10.279
<v Speaker 1>So it's like we're reconstructing those bulk emails, bringing them

423
00:21:10.279 --> 00:21:12.559
<v Speaker 1>back to life in their original format so we can

424
00:21:12.599 --> 00:21:14.920
<v Speaker 1>get a clearer picture of what kind of information was

425
00:21:14.960 --> 00:21:15.519
<v Speaker 1>being shared.

426
00:21:15.640 --> 00:21:19.039
<v Speaker 2>That's right. And to make it even easier to explore

427
00:21:19.079 --> 00:21:23.039
<v Speaker 2>the entire BlueLeaks data set, Lee guides us through

428
00:21:23.079 --> 00:21:25.160
<v Speaker 2>building a web app using Python.

429
00:21:25.519 --> 00:21:28.359
<v Speaker 1>Wait, we're building a web app now? This is getting serious.

430
00:21:28.519 --> 00:21:31.000
<v Speaker 2>It is. He calls it the BlueLeaks Explorer, and

431
00:21:31.039 --> 00:21:35.160
<v Speaker 2>it provides this really user friendly interface for browsing and

432
00:21:35.200 --> 00:21:38.000
<v Speaker 2>searching that massive BlueLeaks data set. It's like having

433
00:21:38.000 --> 00:21:40.680
<v Speaker 2>your own Google, but specifically for this data.

434
00:21:40.880 --> 00:21:43.799
<v Speaker 1>That's incredible. Did he actually make this app available for

435
00:21:43.839 --> 00:21:44.640
<v Speaker 1>others to use?

436
00:21:44.799 --> 00:21:48.039
<v Speaker 2>He doesn't explicitly say that, but he does provide very

437
00:21:48.039 --> 00:21:51.880
<v Speaker 2>detailed instructions, which suggests that he wants to empower others

438
00:21:51.880 --> 00:21:53.839
<v Speaker 2>to build and use it. You know, it speaks to

439
00:21:53.920 --> 00:21:56.119
<v Speaker 2>his commitment to transparency and collaboration.

440
00:21:56.400 --> 00:21:58.400
<v Speaker 1>I love that he's giving people the tools to do

441
00:21:58.440 --> 00:22:01.359
<v Speaker 1>their own investigating. It's so empowering. Now, you mentioned earlier

442
00:22:01.400 --> 00:22:04.599
<v Speaker 1>that Lee also analyzed the Parler data set. That's the

443
00:22:04.680 --> 00:22:07.759
<v Speaker 1>social media platform that's popular among right-wing users, right?

444
00:22:07.799 --> 00:22:11.240
<v Speaker 2>Yes, that's right, and the data set contained millions

445
00:22:11.240 --> 00:22:14.319
<v Speaker 2>of videos and metadata, and Lee was able to extract

446
00:22:14.359 --> 00:22:17.559
<v Speaker 2>some really interesting insights from it. Like what? Well, one

447
00:22:17.599 --> 00:22:19.839
<v Speaker 2>of the most striking things he did was use Python

448
00:22:20.319 --> 00:22:25.440
<v Speaker 2>to extract GPS coordinates from the video metadata.

449
00:22:25.480 --> 00:22:28.240
<v Speaker 1>GPS coordinates? That's some next-level digital detective work.

450
00:22:27.799 --> 00:22:31.960
<v Speaker 2>It is. And by extracting those coordinates, Lee was

451
00:22:32.000 --> 00:22:35.519
<v Speaker 2>able to actually pinpoint the exact locations where those videos

452
00:22:35.559 --> 00:22:39.000
<v Speaker 2>were filmed. And then he converted the data into KML files,

453
00:22:39.039 --> 00:22:40.680
<v Speaker 2>which can be viewed in Google Earth.

454
00:22:40.359 --> 00:22:43.960
<v Speaker 1>So he was basically able to put those Parler videos on a map.

455
00:22:43.640 --> 00:22:48.039
<v Speaker 2>Exactly. This technique is incredibly powerful. Imagine

456
00:22:48.079 --> 00:22:50.279
<v Speaker 2>being able to see where people were gathering during a

457
00:22:50.279 --> 00:22:53.880
<v Speaker 2>protest or a rally. You could even potentially identify individuals

458
00:22:53.920 --> 00:22:54.559
<v Speaker 2>who were present.

459
00:22:54.759 --> 00:22:57.759
<v Speaker 1>It's like creating a digital breadcrumb trail. Now, Lee also

460
00:22:57.799 --> 00:23:00.640
<v Speaker 1>investigated the Epik data breach, didn't he? What was so

461
00:23:00.680 --> 00:23:01.880
<v Speaker 1>significant about that one?

462
00:23:01.920 --> 00:23:04.000
<v Speaker 2>Well, the Epik data breach is important because it involves

463
00:23:04.000 --> 00:23:07.359
<v Speaker 2>SQL databases, which are basically a fundamental part of how

464
00:23:07.359 --> 00:23:10.880
<v Speaker 2>many websites and online services actually store their data.

465
00:23:10.960 --> 00:23:13.279
<v Speaker 1>SQL databases. Can you remind me what those are again?

466
00:23:13.319 --> 00:23:16.440
<v Speaker 2>Sure. Think of a database like a really organized

467
00:23:16.480 --> 00:23:20.680
<v Speaker 2>collection of data, and SQL, which stands for Structured Query Language,

468
00:23:20.920 --> 00:23:23.880
<v Speaker 2>is basically a special language that's used to interact with

469
00:23:23.880 --> 00:23:26.599
<v Speaker 2>those databases. It's how you ask questions and get answers

470
00:23:26.640 --> 00:23:27.359
<v Speaker 2>from your data.

471
00:23:27.440 --> 00:23:31.519
<v Speaker 1>So it's like Python, but specifically for talking to databases.

472
00:23:31.200 --> 00:23:35.039
<v Speaker 2>Right, and there are different types of SQL databases. Epik

473
00:23:35.160 --> 00:23:38.559
<v Speaker 2>used MySQL, which is a very popular open-source database system.

474
00:23:38.839 --> 00:23:41.119
<v Speaker 2>Lee explains all of this and even shows you how

475
00:23:41.119 --> 00:23:44.880
<v Speaker 2>to use a free tool called MariaDB to explore

476
00:23:44.960 --> 00:23:46.079
<v Speaker 2>the leaked Epik data.

477
00:23:46.599 --> 00:23:48.759
<v Speaker 1>So it's like he's giving us a crash course in

478
00:23:48.839 --> 00:23:50.160
<v Speaker 1>SQL database management.

479
00:23:50.480 --> 00:23:53.319
<v Speaker 2>In a way, he is, and by doing so he's

480
00:23:53.480 --> 00:23:56.799
<v Speaker 2>kind of empowering us to investigate these types of leaks ourselves. Yeah,

481
00:23:56.880 --> 00:23:59.440
<v Speaker 2>you know, we're not just passively reading about his findings.

482
00:23:59.599 --> 00:24:01.720
<v Speaker 2>We're learning the skills to make our own discoveries.

483
00:24:01.799 --> 00:24:04.279
<v Speaker 1>It's incredible how he takes these really complex concepts and

484
00:24:04.279 --> 00:24:06.960
<v Speaker 1>breaks them down, makes them so accessible. But Lee also

485
00:24:07.160 --> 00:24:10.519
<v Speaker 1>explored the AFLDS healthcare data leak, and this one actually

486
00:24:10.559 --> 00:24:12.839
<v Speaker 1>involved patient records. What did he find there?

487
00:24:13.000 --> 00:24:17.240
<v Speaker 2>This leak was particularly concerning because it contains sensitive patient

488
00:24:17.240 --> 00:24:22.319
<v Speaker 2>information like medical conditions, treatments, even credit card details. Lee

489
00:24:22.440 --> 00:24:26.359
<v Speaker 2>used Python to analyze this data, identify patterns, and create

490
00:24:26.480 --> 00:24:28.960
<v Speaker 2>visualizations to support investigative reporting.

491
00:24:29.160 --> 00:24:32.319
<v Speaker 1>It sounds like a story with very real world consequences,

492
00:24:32.480 --> 00:24:36.480
<v Speaker 1>especially given that AFLDS was promoting unproven medical treatments.

493
00:24:36.839 --> 00:24:41.319
<v Speaker 2>Exactly. Lee's analysis really highlights the potential dangers of misinformation, especially

494
00:24:41.359 --> 00:24:42.559
<v Speaker 2>in the context of healthcare.

495
00:24:42.759 --> 00:24:45.799
<v Speaker 1>It's a stark reminder that data breaches can have serious

496
00:24:45.799 --> 00:24:49.759
<v Speaker 1>implications that go far beyond just the digital world. But

497
00:24:49.839 --> 00:24:53.240
<v Speaker 1>we've talked a lot about data analysis techniques. What about

498
00:24:53.240 --> 00:24:56.319
<v Speaker 1>the ethical considerations of working with these data sets.

499
00:24:56.440 --> 00:24:59.039
<v Speaker 2>That's a great question, and Lee is very mindful of

500
00:24:59.039 --> 00:25:01.880
<v Speaker 2>the ethical implications of his work. He actually dedicates a

501
00:25:01.880 --> 00:25:04.960
<v Speaker 2>whole chapter to discussing how to protect your sources and

502
00:25:05.039 --> 00:25:06.960
<v Speaker 2>handle sensitive information responsibly.

503
00:25:07.039 --> 00:25:09.240
<v Speaker 1>Can you give us an example of how he addresses

504
00:25:09.279 --> 00:25:10.359
<v Speaker 1>these ethical concerns?

505
00:25:10.640 --> 00:25:13.519
<v Speaker 2>Absolutely. One example that stood out to me was his

506
00:25:13.599 --> 00:25:17.359
<v Speaker 2>investigation of the Pony Power Discord server. This group was

507
00:25:17.359 --> 00:25:20.640
<v Speaker 2>engaged in doxing, which is the malicious act of publicly

508
00:25:20.680 --> 00:25:22.519
<v Speaker 2>revealing someone's personal information online.

509
00:25:22.599 --> 00:25:26.960
<v Speaker 1>Doxing can have devastating consequences. It's a form of

510
00:25:27.079 --> 00:25:30.759
<v Speaker 1>online harassment that can lead to real world harm. What

511
00:25:30.839 --> 00:25:32.599
<v Speaker 1>did Lee discover about this group?

512
00:25:33.279 --> 00:25:35.720
<v Speaker 2>Well, he used his discord analysis app to go through

513
00:25:35.759 --> 00:25:39.519
<v Speaker 2>the chat logs, searching for keywords, identifying key players, and

514
00:25:39.559 --> 00:25:43.480
<v Speaker 2>really tracking their doxing campaigns. He uncovered personal information from

515
00:25:43.480 --> 00:25:48.759
<v Speaker 2>over fifty individuals across fourteen states, including photographs, social media profiles,

516
00:25:48.799 --> 00:25:52.640
<v Speaker 2>home addresses, phone numbers, email addresses, dates of birth, driver's

517
00:25:52.640 --> 00:25:56.079
<v Speaker 2>license numbers, vehicle information, places of employment, and even a

518
00:25:56.079 --> 00:25:57.279
<v Speaker 2>Social Security number.

519
00:25:57.480 --> 00:25:59.960
<v Speaker 1>That's a horrifying amount of sensitive information. It's hard to

520
00:26:00.240 --> 00:26:02.440
<v Speaker 1>imagine the potential damage that could be done with this

521
00:26:02.559 --> 00:26:03.599
<v Speaker 1>data.

522
00:26:03.920 --> 00:26:06.000
<v Speaker 2>Exactly. And Lee's reporting on this group, which was published in

523
00:26:06.039 --> 00:26:09.160
<v Speaker 2>The Intercept, brought much needed attention to the dangers of

524
00:26:09.160 --> 00:26:13.279
<v Speaker 2>online harassment and doxing. It wasn't just about exposing the perpetrators,

525
00:26:13.400 --> 00:26:17.119
<v Speaker 2>but also about raising awareness and advocating for better protections

526
00:26:17.119 --> 00:26:17.720
<v Speaker 2>for victims.

527
00:26:17.799 --> 00:26:20.799
<v Speaker 1>It's inspiring to see how Lee uses his technical skills

528
00:26:21.200 --> 00:26:23.640
<v Speaker 1>not only to uncover secrets, but also to protect those

529
00:26:23.680 --> 00:26:24.000
<v Speaker 1>who are vulnerable.

530
00:26:24.000 --> 00:26:27.400
<v Speaker 2>Absolutely, and his collaboration with Unicorn Riot on the

531
00:26:27.400 --> 00:26:31.000
<v Speaker 2>DiscordLeaks project further demonstrates his commitment to making these

532
00:26:31.039 --> 00:26:32.599
<v Speaker 2>tools and resources available.

533
00:26:32.839 --> 00:26:36.000
<v Speaker 1>DiscordLeaks has become an invaluable resource for journalists and

534
00:26:36.079 --> 00:26:40.440
<v Speaker 1>researchers who are investigating online extremism and hate speech. But

535
00:26:40.559 --> 00:26:43.000
<v Speaker 1>Lee doesn't stop there. He also tackles some of the

536
00:26:43.039 --> 00:26:46.599
<v Speaker 1>technical challenges that Windows users might face when working with

537
00:26:46.640 --> 00:26:50.720
<v Speaker 1>these large data sets. Specifically, he addresses the Windows Subsystem

538
00:26:50.799 --> 00:26:52.480
<v Speaker 1>for Linux or WSL.

539
00:26:52.720 --> 00:26:56.599
<v Speaker 2>That's right. WSL is a fantastic tool that allows Windows

540
00:26:56.680 --> 00:27:00.160
<v Speaker 2>users to run a Linux environment right within Windows, and

541
00:27:00.200 --> 00:27:03.279
<v Speaker 2>this is a huge advantage for data analysis because many

542
00:27:03.279 --> 00:27:06.240
<v Speaker 2>of the powerful tools we've been discussing are actually designed

543
00:27:06.279 --> 00:27:06.839
<v Speaker 2>for Linux.

544
00:27:07.079 --> 00:27:09.480
<v Speaker 1>But it sounds like WSL can be a bit finicky,

545
00:27:10.400 --> 00:27:12.519
<v Speaker 1>especially when dealing with large data sets.

546
00:27:12.599 --> 00:27:17.200
<v Speaker 2>It can be. One common issue is performance. Accessing files

547
00:27:17.200 --> 00:27:20.200
<v Speaker 2>from a Windows drive within the WSL environment can be

548
00:27:20.240 --> 00:27:24.559
<v Speaker 2>significantly slower than accessing files stored on the Linux file system,

549
00:27:25.319 --> 00:27:28.519
<v Speaker 2>and Lee offers some practical solutions for improving that performance.

550
00:27:28.640 --> 00:27:31.359
<v Speaker 1>So he's basically giving us a tune up guide for

551
00:27:31.440 --> 00:27:33.279
<v Speaker 1>our WSL environment.

552
00:27:33.480 --> 00:27:36.920
<v Speaker 2>Exactly. He wants to make sure everyone, regardless of their operating system,

553
00:27:37.400 --> 00:27:39.319
<v Speaker 2>can use these powerful tools for good.

554
00:27:39.480 --> 00:27:41.559
<v Speaker 1>It's amazing how much ground we've covered already and we're

555
00:27:41.599 --> 00:27:44.519
<v Speaker 1>only halfway through the book. Micah Lee really does provide

556
00:27:44.559 --> 00:27:47.720
<v Speaker 1>a wealth of information and practical guidance for anyone who's

557
00:27:47.759 --> 00:27:50.480
<v Speaker 1>interested in data analysis and investigative journalism.

558
00:27:50.599 --> 00:27:53.799
<v Speaker 2>I agree, and what's really inspiring is how he emphasizes

559
00:27:53.880 --> 00:27:56.880
<v Speaker 2>the ethical dimensions of this work, always keeping in mind

560
00:27:56.920 --> 00:28:00.400
<v Speaker 2>the potential impact on individuals and society as a whole.

561
00:28:00.640 --> 00:28:04.599
<v Speaker 1>Well said. We'll continue our deep dive into Hacks, Leaks

562
00:28:04.640 --> 00:28:08.000
<v Speaker 1>and Revelations in Part three. Stay tuned for more insights

563
00:28:08.039 --> 00:28:10.279
<v Speaker 1>into the world of data leaks and the tools that

564
00:28:10.400 --> 00:28:14.519
<v Speaker 1>journalists use to unlock the truth. Welcome back to our

565
00:28:14.559 --> 00:28:18.359
<v Speaker 1>deep dive into Micah Lee's Hacks, Leaks and Revelations. We've

566
00:28:18.400 --> 00:28:22.359
<v Speaker 1>already uncovered so many tools and techniques for investigating data leaks.

567
00:28:22.839 --> 00:28:25.160
<v Speaker 1>What else has stood out to you as we've continued reading?

568
00:28:25.599 --> 00:28:27.920
<v Speaker 2>You know, I was really struck by Lee's deep dive

569
00:28:27.960 --> 00:28:29.240
<v Speaker 2>into email dumps.

570
00:28:29.519 --> 00:28:30.480
<v Speaker 1>Oh yeah, yeah.

571
00:28:30.519 --> 00:28:33.880
<v Speaker 2>They might seem like old news compared to today's world

572
00:28:33.880 --> 00:28:36.480
<v Speaker 2>of instant messaging and social media and all that. Right,

573
00:28:36.640 --> 00:28:38.960
<v Speaker 2>but email dumps can still be an absolute gold mine

574
00:28:38.960 --> 00:28:39.519
<v Speaker 2>of information.

575
00:28:39.759 --> 00:28:41.720
<v Speaker 1>That's true. I mean, emails are often used for more

576
00:28:41.759 --> 00:28:44.039
<v Speaker 1>official communication, so it makes sense that they would contain

577
00:28:44.079 --> 00:28:47.400
<v Speaker 1>a lot of sensitive information. But analyzing those emails it

578
00:28:47.440 --> 00:28:49.559
<v Speaker 1>can feel like trying to find a needle in a haystack,

579
00:28:49.680 --> 00:28:50.759
<v Speaker 1>right?

580
00:28:51.319 --> 00:28:53.920
<v Speaker 2>Exactly. But Lee doesn't shy away from that challenge at all.

581
00:28:54.359 --> 00:28:57.599
<v Speaker 2>He really walks us through those common formats for email

582
00:28:57.680 --> 00:29:01.319
<v Speaker 2>dumps like EML and mbox. He explains how these files

583
00:29:01.319 --> 00:29:03.480
<v Speaker 2>are structured, and he shows us how to use those

584
00:29:03.480 --> 00:29:08.599
<v Speaker 2>familiar tools like Thunderbird and Microsoft Outlook to make

585
00:29:08.640 --> 00:29:09.240
<v Speaker 2>sense of it all.

586
00:29:09.359 --> 00:29:11.680
<v Speaker 1>So it's like he's giving us a crash course in

587
00:29:11.720 --> 00:29:12.839
<v Speaker 1>email archaeology.

588
00:29:13.119 --> 00:29:15.799
<v Speaker 2>I like that analogy. He's teaching us how to sift

589
00:29:15.839 --> 00:29:19.880
<v Speaker 2>through those digital ruins and extract those valuable insights. He

590
00:29:19.920 --> 00:29:22.519
<v Speaker 2>even gives us practical tips for organizing and searching and

591
00:29:22.599 --> 00:29:25.640
<v Speaker 2>analyzing large volumes of emails efficiently.

592
00:29:25.839 --> 00:29:28.559
<v Speaker 1>It is fascinating how you can piece together narratives and

593
00:29:28.680 --> 00:29:33.359
<v Speaker 1>uncover those hidden connections just by carefully examining email threads.

594
00:29:33.079 --> 00:29:35.839
<v Speaker 2>Absolutely. And when those email threads get too complex to

595
00:29:35.880 --> 00:29:38.880
<v Speaker 2>manage with those traditional email clients, Lee shows us how

596
00:29:38.920 --> 00:29:40.799
<v Speaker 2>to bring in our trusty friend Python.

597
00:29:40.960 --> 00:29:43.519
<v Speaker 1>Ah, Python always there to save the day. What kind

598
00:29:43.559 --> 00:29:47.000
<v Speaker 1>of Python magic does Lee use for analyzing these email dumps?

599
00:29:47.160 --> 00:29:49.400
<v Speaker 2>Well, he gives us really clear examples of how to

600
00:29:49.480 --> 00:29:54.759
<v Speaker 2>use Python to extract specific information from emails like sender addresses,

601
00:29:54.960 --> 00:29:59.240
<v Speaker 2>recipient lists, dates, keywords. You know, imagine having a digital

602
00:29:59.279 --> 00:30:03.400
<v Speaker 2>assistant that could scan through mountains of emails and just

603
00:30:03.480 --> 00:30:06.000
<v Speaker 2>pinpoint the most relevant information for your investigation.

604
00:30:06.119 --> 00:30:08.599
<v Speaker 1>Oh, that would be amazing. It would save so much

605
00:30:08.640 --> 00:30:11.440
<v Speaker 1>time and effort, freeing you up to really focus on

606
00:30:11.480 --> 00:30:15.079
<v Speaker 1>the analysis and interpretation of the data. So it really

607
00:30:15.079 --> 00:30:17.759
<v Speaker 1>seems like Lee is giving us a comprehensive toolkit for

608
00:30:17.799 --> 00:30:22.799
<v Speaker 1>investigating data leaks, from understanding basic digital security practices to

609
00:30:23.039 --> 00:30:26.240
<v Speaker 1>mastering those really powerful data analysis techniques.

610
00:30:26.440 --> 00:30:29.079
<v Speaker 2>He really is. But what's even more impressive to me

611
00:30:29.160 --> 00:30:32.559
<v Speaker 2>is his emphasis on the ethical considerations of working with

612
00:30:32.599 --> 00:30:35.680
<v Speaker 2>the sensitive information. He's not just teaching us how to

613
00:30:35.680 --> 00:30:39.400
<v Speaker 2>find the information, but also how to handle it responsibly.

614
00:30:39.480 --> 00:30:41.880
<v Speaker 1>Can you give us a concrete example of how Lee

615
00:30:42.079 --> 00:30:45.119
<v Speaker 1>kind of navigates these ethical considerations.

616
00:30:45.200 --> 00:30:48.119
<v Speaker 2>Absolutely. One example that really resonated with me was his

617
00:30:48.160 --> 00:30:51.799
<v Speaker 2>investigation of the Pony Power Discord server. This group was

618
00:30:51.839 --> 00:30:55.640
<v Speaker 2>engaged in doxing, which is the malicious act of publicly

619
00:30:55.680 --> 00:30:59.680
<v Speaker 2>revealing someone's personal information online. Right, and doxing can have

620
00:30:59.720 --> 00:31:04.599
<v Speaker 2>devastating consequences for the victims, exposing them to harassment, threats,

621
00:31:04.960 --> 00:31:06.640
<v Speaker 2>and even real world violence.

622
00:31:06.920 --> 00:31:09.240
<v Speaker 1>It's a terrifying thought that something as simple as an

623
00:31:09.279 --> 00:31:12.440
<v Speaker 1>online post can lead to such serious harm. How did

624
00:31:12.519 --> 00:31:15.319
<v Speaker 1>Lee approach this investigation while still being mindful of those

625
00:31:15.359 --> 00:31:16.559
<v Speaker 1>ethical implications?

626
00:31:16.599 --> 00:31:20.400
<v Speaker 2>Well, he used his Discord analysis app to carefully examine

627
00:31:20.519 --> 00:31:24.000
<v Speaker 2>those chat logs, and he was very meticulous about protecting

628
00:31:24.000 --> 00:31:27.720
<v Speaker 2>the identities of those doxing victims while still uncovering the

629
00:31:27.759 --> 00:31:30.519
<v Speaker 2>extent of the group's activities. He discovered that they had

630
00:31:30.519 --> 00:31:34.759
<v Speaker 2>targeted over fifty individuals across fourteen states, collecting and sharing

631
00:31:34.920 --> 00:31:37.000
<v Speaker 2>a really shocking amount of personal information.

632
00:31:37.160 --> 00:31:40.799
<v Speaker 1>Wow, that's incredibly disturbing. What happened after Lee uncovered this

633
00:31:40.880 --> 00:31:41.759
<v Speaker 1>information?

634
00:31:41.799 --> 00:31:45.160
<v Speaker 2>Well, his investigation was published in The Intercept, and it brought

635
00:31:45.240 --> 00:31:48.559
<v Speaker 2>much-needed attention to the dangers of online harassment and doxing.

636
00:31:49.240 --> 00:31:52.319
<v Speaker 2>His work didn't just expose the perpetrators, it also sparked

637
00:31:52.319 --> 00:31:55.640
<v Speaker 2>a conversation about the need for better protections for victims

638
00:31:55.680 --> 00:31:56.720
<v Speaker 2>of online abuse.

639
00:31:56.920 --> 00:31:59.720
<v Speaker 1>It's inspiring to see how Lee uses his skills to

640
00:31:59.720 --> 00:32:02.599
<v Speaker 1>fight for justice and to protect those who are vulnerable.

641
00:32:02.960 --> 00:32:05.599
<v Speaker 1>It's not just about uncovering the truth, it's about using

642
00:32:05.640 --> 00:32:08.359
<v Speaker 1>that truth to make a positive impact.

643
00:32:08.440 --> 00:32:10.680
<v Speaker 2>Exactly. And it's not just about his own reporting either. His

644
00:32:10.799 --> 00:32:14.200
<v Speaker 2>collaboration with Unicorn Riot on the Discord Leaks project is

645
00:32:14.240 --> 00:32:17.640
<v Speaker 2>a prime example of his commitment to making these investigative

646
00:32:17.640 --> 00:32:21.079
<v Speaker 2>tools and techniques accessible to everyone.

647
00:32:20.759 --> 00:32:24.319
<v Speaker 1>And Discord Leaks has become an invaluable resource for journalists

648
00:32:24.359 --> 00:32:28.240
<v Speaker 1>and researchers who are investigating online extremism and hate speech.

649
00:32:28.680 --> 00:32:31.359
<v Speaker 1>It's a testament to the power of collaboration and those

650
00:32:31.359 --> 00:32:32.680
<v Speaker 1>open source tools.

651
00:32:32.400 --> 00:32:35.480
<v Speaker 2>It really is, and Lee's willingness to share his knowledge

652
00:32:35.480 --> 00:32:39.079
<v Speaker 2>and expertise is truly commendable. He wants to empower others

653
00:32:39.160 --> 00:32:41.920
<v Speaker 2>to conduct their own investigations and hold those in power

654
00:32:41.960 --> 00:32:43.000
<v Speaker 2>accountable.

655
00:32:43.119 --> 00:32:47.279
<v Speaker 1>Well, Micah Lee's Hacks, Leaks and Revelations has been a fascinating

656
00:32:47.279 --> 00:32:50.000
<v Speaker 1>and thought-provoking read. It's given us a glimpse into

657
00:32:50.000 --> 00:32:53.920
<v Speaker 1>the world of data leaks, investigative journalism, and the power

658
00:32:54.000 --> 00:32:56.119
<v Speaker 1>of these digital tools for uncovering the truth.

659
00:32:56.480 --> 00:32:59.440
<v Speaker 2>I couldn't agree more. It's a must read for anyone

660
00:32:59.440 --> 00:33:01.920
<v Speaker 2>who wants to understand how data is shaping our

661
00:33:01.960 --> 00:33:04.400
<v Speaker 2>world and how we can use it to make a difference.

662
00:33:04.519 --> 00:33:07.200
<v Speaker 1>What a fantastic deep dive. A huge thank you to

663
00:33:07.359 --> 00:33:10.240
<v Speaker 1>our expert for sharing your insights and expertise with

664
00:33:10.319 --> 00:33:12.599
<v Speaker 1>us today, and to our listeners, thank you for joining

665
00:33:12.640 --> 00:33:14.680
<v Speaker 1>us on this journey. We hope you found it as

666
00:33:14.839 --> 00:33:18.200
<v Speaker 1>enlightening as we did. Until next time, stay curious and

667
00:33:18.279 --> 00:33:19.079
<v Speaker 1>keep exploring.
