WEBVTT

1
00:00:00.040 --> 00:00:03.359
<v Speaker 1>Welcome to our deep dive into reverse engineering with IDA Pro.

2
00:00:04.480 --> 00:00:07.599
<v Speaker 1>You sent over some excerpts from the IDA Pro book. Yeah,

3
00:00:07.639 --> 00:00:09.960
<v Speaker 1>so it looks like you're ready to really get into

4
00:00:09.960 --> 00:00:12.000
<v Speaker 1>the nuts and bolts of software.

5
00:00:12.160 --> 00:00:14.960
<v Speaker 2>Absolutely. Yeah, it's a whole world where we kind of

6
00:00:15.000 --> 00:00:16.519
<v Speaker 2>get to look under the hood and see how things

7
00:00:16.559 --> 00:00:17.199
<v Speaker 2>actually work.

8
00:00:17.679 --> 00:00:19.960
<v Speaker 1>I've always wondered, like, who uses IDA Pro?

9
00:00:20.160 --> 00:00:23.440
<v Speaker 2>You know, it's used by like security researchers obviously, you know,

10
00:00:23.559 --> 00:00:27.640
<v Speaker 2>analyzing malware and things like that, but also like software engineers,

11
00:00:27.760 --> 00:00:30.399
<v Speaker 2>Oh wow, who have to deal with legacy code and

12
00:00:30.440 --> 00:00:34.920
<v Speaker 2>they need to update something or figure out how something works.

13
00:00:35.039 --> 00:00:37.159
<v Speaker 1>So when you say like figure out how something works,

14
00:00:37.600 --> 00:00:40.399
<v Speaker 1>what does that actually mean? What's IDA Pro doing?

15
00:00:41.439 --> 00:00:44.240
<v Speaker 2>Well, you know, computers run on machine code, which is

16
00:00:44.320 --> 00:00:48.600
<v Speaker 2>just ones and zeros, right, and that's not very human readable.

17
00:00:48.679 --> 00:00:51.719
<v Speaker 2>So IDA Pro takes those ones and zeros and turns

18
00:00:51.719 --> 00:00:55.280
<v Speaker 2>them into assembly language, which is a little more human readable.

19
00:00:55.320 --> 00:00:57.960
<v Speaker 1>Okay, so it's like a translator, yeah, exactly, between computer

20
00:00:58.039 --> 00:00:59.799
<v Speaker 1>and something we can at least try to understand.

21
00:01:00.119 --> 00:01:03.359
<v Speaker 2>And it stores all that information in a database that

22
00:01:03.439 --> 00:01:05.680
<v Speaker 2>you can work with, which is an .idb file,

23
00:01:06.079 --> 00:01:08.640
<v Speaker 2>and you don't even need the original program to look

24
00:01:08.680 --> 00:01:09.920
<v Speaker 2>at the analysis after that.

25
00:01:10.640 --> 00:01:13.040
<v Speaker 1>So backing up a little bit, how does IDA Pro

26
00:01:13.159 --> 00:01:15.400
<v Speaker 1>even know what kind of file it's looking at? I

27
00:01:15.400 --> 00:01:16.680
<v Speaker 1>mean a lot of times you can't tell by the

28
00:01:16.680 --> 00:01:17.400
<v Speaker 1>file extension.

29
00:01:17.519 --> 00:01:20.120
<v Speaker 2>Yeah, you can't go by file extensions, you know, it's

30
00:01:20.280 --> 00:01:22.680
<v Speaker 2>very surface level. You need to look at the actual

31
00:01:22.760 --> 00:01:23.840
<v Speaker 2>contents of the file.

32
00:01:24.280 --> 00:01:26.000
<v Speaker 1>So it's like more than just judging a book by

33
00:01:26.000 --> 00:01:29.159
<v Speaker 1>its cover. It's actually looking inside and kind of getting

34
00:01:29.159 --> 00:01:29.799
<v Speaker 1>a fingerprint.

35
00:01:29.920 --> 00:01:32.959
<v Speaker 2>It uses things called magic numbers, which are like digital

36
00:01:33.000 --> 00:01:35.599
<v Speaker 2>fingerprints that tell you what type of file it is.

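NOTE
Editor's sketch, not part of the spoken discussion: a minimal Python illustration of
identifying a file by its leading magic bytes instead of its extension. The table and
the name guess_format are hypothetical, and this is not a description of how IDA's own
loaders work; only the byte values themselves (MZ, 7F 45 4C 46, CF FA ED FE) are standard.
```python
# Map well-known leading byte sequences ("magic numbers") to format names.
MAGIC_TABLE = [
    (b"MZ", "DOS/Windows PE executable"),
    (b"\x7fELF", "ELF executable"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
]
def guess_format(header: bytes) -> str:
    """Return a format name based on the first bytes of a file."""
    for magic, name in MAGIC_TABLE:
        if header.startswith(magic):
            return name
    return "unknown"
# A file named notes.txt that starts with 7F 45 4C 46 is still an ELF binary.
print(guess_format(b"\x7fELF\x02\x01\x01"))
```
In practice the header would come from reading the first few bytes of the file in binary mode.
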
37
00:01:36.000 --> 00:01:38.719
<v Speaker 1>So once you know what you're looking at, what are

38
00:01:38.760 --> 00:01:41.159
<v Speaker 1>some of the things that IDA can do?

39
00:01:41.200 --> 00:01:44.879
<v Speaker 2>Well, it can identify, like, standard library routines that a lot

40
00:01:44.879 --> 00:01:48.640
<v Speaker 2>of programs use, okay, and by identifying those, you can

41
00:01:48.719 --> 00:01:50.599
<v Speaker 2>kind of like skip over those parts and look at

42
00:01:50.599 --> 00:01:52.120
<v Speaker 2>the more unique parts of the code.

43
00:01:52.239 --> 00:01:53.760
<v Speaker 1>So it's kind of like if you were reading a book,

44
00:01:53.840 --> 00:01:55.799
<v Speaker 1>you could skip the chapters that are the same in

45
00:01:55.840 --> 00:01:58.599
<v Speaker 1>every book, yeah, exactly, and get to the good stuff.

46
00:01:58.640 --> 00:02:02.400
<v Speaker 2>It can also do things like visualize code as graphs,

47
00:02:02.879 --> 00:02:04.799
<v Speaker 2>so you can see like a map of how the

48
00:02:04.840 --> 00:02:08.599
<v Speaker 2>program flows. Wow, all the different paths it can take.

49
00:02:09.039 --> 00:02:12.120
<v Speaker 1>That's got to be really helpful, especially for like more

50
00:02:12.159 --> 00:02:13.280
<v Speaker 1>complicated code.

51
00:02:13.439 --> 00:02:15.840
<v Speaker 2>Yeah, and you know, you can even extend IDA to

52
00:02:15.879 --> 00:02:19.879
<v Speaker 2>support new processors, processors that haven't even been invented yet.

53
00:02:19.960 --> 00:02:23.159
<v Speaker 1>So okay, we've talked about what it can do, but

54
00:02:23.240 --> 00:02:25.800
<v Speaker 1>what does it actually look like when you're using IDA Pro?

55
00:02:26.120 --> 00:02:27.719
<v Speaker 2>Well, when you open it up, you see like a

56
00:02:27.840 --> 00:02:31.039
<v Speaker 2>multi-windowed interface. Okay, it might seem a little overwhelming

57
00:02:31.039 --> 00:02:33.719
<v Speaker 2>at first, but each window serves a specific purpose.

58
00:02:33.840 --> 00:02:35.439
<v Speaker 1>Okay, So you have like different tools.

59
00:02:35.199 --> 00:02:38.120
<v Speaker 2>In there, yeah, exactly. Like you have the graph view,

60
00:02:38.159 --> 00:02:41.280
<v Speaker 2>which we talked about right, shows the code flow. There's

61
00:02:41.319 --> 00:02:44.520
<v Speaker 2>the text view, which shows the disassembled code line by line.

62
00:02:44.759 --> 00:02:46.719
<v Speaker 1>It's a lot of different ways to look at it. Yeah.

63
00:02:47.120 --> 00:02:50.080
<v Speaker 2>You have the functions window, which lists all the functions

64
00:02:50.120 --> 00:02:51.520
<v Speaker 2>that it's identified in the program.

65
00:02:51.520 --> 00:02:52.759
<v Speaker 1>And then you have the output window.

66
00:02:52.960 --> 00:02:55.719
<v Speaker 2>Yeah, the output window, which shows you messages from IDA,

67
00:02:55.800 --> 00:02:58.840
<v Speaker 2>and the strings window, which is really useful for finding

68
00:02:58.960 --> 00:03:00.879
<v Speaker 2>like human-readable text.

69
00:03:01.240 --> 00:03:03.039
<v Speaker 1>So you have to kind of know how to navigate

70
00:03:03.120 --> 00:03:04.520
<v Speaker 1>all those different windows. Yeah.

71
00:03:04.560 --> 00:03:08.199
<v Speaker 2>Absolutely, it's actually designed to be pretty intuitive. You can

72
00:03:08.319 --> 00:03:11.680
<v Speaker 2>jump to certain addresses in the code, okay. You can

73
00:03:11.800 --> 00:03:14.560
<v Speaker 2>use like the navigation history like a web browser.

74
00:03:14.319 --> 00:03:15.120
<v Speaker 1>So you can go back.

75
00:03:15.400 --> 00:03:18.199
<v Speaker 2>Yeah, and you can follow like cross references, which are

76
00:03:18.240 --> 00:03:20.840
<v Speaker 2>like links between different parts of the code, so.

77
00:03:20.960 --> 00:03:22.599
<v Speaker 1>Like a map and a compass for software.

78
00:03:22.719 --> 00:03:23.280
<v Speaker 2>Exactly.

79
00:03:23.360 --> 00:03:26.159
<v Speaker 1>All right, well let's talk about those instructions. So I imagine

80
00:03:26.080 --> 00:03:28.280
<v Speaker 1>it's like steps in a recipe or something.

81
00:03:28.360 --> 00:03:31.960
<v Speaker 2>Yeah. Some instructions are just executed sequentially, one after the other,

82
00:03:32.039 --> 00:03:36.680
<v Speaker 2>like a recipe, right, But other instructions can introduce like branching.

83
00:03:36.639 --> 00:03:38.719
<v Speaker 1>So the program can go in different directions.

84
00:03:38.800 --> 00:03:40.919
<v Speaker 2>Yeah, exactly. It's called conditional branching.

85
00:03:41.080 --> 00:03:43.400
<v Speaker 1>And IDA Pro can help you look at all

86
00:03:43.439 --> 00:03:44.879
<v Speaker 1>those different paths. Yeah.

87
00:03:44.960 --> 00:03:47.439
<v Speaker 2>It disassembles both paths of a branch, so you can

88
00:03:47.479 --> 00:03:50.400
<v Speaker 2>see like all the potential flows of execution.

89
00:03:50.560 --> 00:03:52.759
<v Speaker 1>And then you have function calls, which are like mini

90
00:03:52.759 --> 00:03:56.039
<v Speaker 1>programs within the bigger program. Yeah exactly, So that can

91
00:03:56.080 --> 00:03:57.400
<v Speaker 1>get really complicated.

92
00:03:57.520 --> 00:04:00.680
<v Speaker 2>You know, understanding how those function calls work is really

93
00:04:00.719 --> 00:04:03.520
<v Speaker 2>crucial for understanding the logic of a program.

94
00:04:03.759 --> 00:04:06.439
<v Speaker 1>So how do these functions communicate with each other?

95
00:04:06.719 --> 00:04:09.400
<v Speaker 2>Well, that's where calling conventions come in. They're like the

96
00:04:09.479 --> 00:04:12.479
<v Speaker 2>rules of etiquette for how functions talk to each other.

97
00:04:13.240 --> 00:04:16.040
<v Speaker 1>So why is it important to understand the calling conventions?

98
00:04:17.000 --> 00:04:19.720
<v Speaker 2>Because if you don't understand the rules of communication, then

99
00:04:19.759 --> 00:04:22.319
<v Speaker 2>you're not going to understand what the code is doing.

100
00:04:22.439 --> 00:04:25.279
<v Speaker 2>It's like trying to understand a foreign language without knowing

101
00:04:25.319 --> 00:04:25.839
<v Speaker 2>the grammar.

102
00:04:26.000 --> 00:04:29.040
<v Speaker 1>Okay. And so there are different types of calling conventions.

103
00:04:29.639 --> 00:04:33.120
<v Speaker 2>Yeah, like common ones are cdecl and stdcall.

104
00:04:32.800 --> 00:04:35.439
<v Speaker 1>And IDA Pro can help you understand which calling

105
00:04:35.480 --> 00:04:36.720
<v Speaker 1>convention is being used.

106
00:04:36.839 --> 00:04:39.199
<v Speaker 2>Yeah, it gives you clues and helps you decipher it.

107
00:04:39.360 --> 00:04:41.560
<v Speaker 1>But it's hard to keep track of all that data

108
00:04:41.639 --> 00:04:42.959
<v Speaker 1>moving between these functions.

109
00:04:43.040 --> 00:04:46.199
<v Speaker 2>Well, that's where stack frames come in. Each function has

110
00:04:46.240 --> 00:04:49.040
<v Speaker 2>like a little workspace in memory where it stores its

111
00:04:49.079 --> 00:04:51.600
<v Speaker 2>local variables and parameters and things like that.

112
00:04:51.680 --> 00:04:54.240
<v Speaker 1>So it's like its own little office exactly.

113
00:04:54.439 --> 00:04:57.160
<v Speaker 2>And IDA Pro lets you view those stack frames and

114
00:04:57.240 --> 00:04:58.360
<v Speaker 2>even modify them.

115
00:04:58.439 --> 00:04:59.959
<v Speaker 1>Wow, this is getting really complicated.

116
00:05:00.360 --> 00:05:02.279
<v Speaker 2>Yeah, but it's pretty powerful.

117
00:05:02.480 --> 00:05:04.120
<v Speaker 1>I'm hooked. I want to learn more.

118
00:05:04.199 --> 00:05:06.920
<v Speaker 2>All right, let's get good. So we've been talking about code,

119
00:05:06.920 --> 00:05:10.199
<v Speaker 2>but what about data? How does IDA Pro handle the data?

120
00:05:10.439 --> 00:05:12.839
<v Speaker 1>Yeah, because it's not just about the instructions, right, it's

121
00:05:12.839 --> 00:05:15.839
<v Speaker 1>about how the program actually uses data exactly.

122
00:05:15.920 --> 00:05:18.839
<v Speaker 2>Yeah, you have the instructions, like the verbs, but you need

123
00:05:18.920 --> 00:05:21.879
<v Speaker 2>the nouns to understand what's going on. So IDA Pro

124
00:05:22.079 --> 00:05:24.480
<v Speaker 2>goes beyond just showing you a bunch of raw bytes.

125
00:05:24.600 --> 00:05:24.959
<v Speaker 1>Oh okay.

126
00:05:25.000 --> 00:05:27.560
<v Speaker 2>It lets you define data types, arrays, structures, all kinds

127
00:05:27.560 --> 00:05:27.920
<v Speaker 2>of things.

128
00:05:28.160 --> 00:05:29.600
<v Speaker 1>Wait, structures, what are those?

129
00:05:29.720 --> 00:05:33.759
<v Speaker 2>Well, imagine you have a program that's storing information about people.

130
00:05:34.480 --> 00:05:37.439
<v Speaker 2>Each person has a name and age, an address.

131
00:05:37.560 --> 00:05:37.800
<v Speaker 1>Okay.

132
00:05:37.879 --> 00:05:40.720
<v Speaker 2>A structure lets you group all that data together under

133
00:05:40.720 --> 00:05:41.160
<v Speaker 2>one name.

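NOTE
Editor's sketch, not part of the spoken discussion: a small Python example of what
applying a structure to raw bytes buys you. The person record layout (16-byte name,
1-byte age, 32-byte address) and the names PERSON and parse_person are assumptions made
up for illustration; they do not come from the book or from IDA itself.
```python
import struct
# Hypothetical record layout: 16-byte name, 1-byte age, 32-byte address string.
PERSON = struct.Struct("<16sB32s")
def parse_person(raw: bytes) -> dict:
    """Turn one raw record into named fields instead of anonymous bytes."""
    name, age, address = PERSON.unpack(raw)
    return {
        "name": name.rstrip(b"\x00").decode("ascii"),
        "age": age,
        "address": address.rstrip(b"\x00").decode("ascii"),
    }
# Build a sample record, then read it back through the structure definition.
record = PERSON.pack(b"Ada", 36, b"12 Example Street")
print(parse_person(record))
```
Without the layout, the same 49 bytes are just an undifferentiated blob; with it, each offset has a name.
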
134
00:05:41.279 --> 00:05:43.519
<v Speaker 1>So instead of seeing just a bunch of variables, you

135
00:05:43.560 --> 00:05:45.720
<v Speaker 1>see like a person with information.

136
00:05:46.040 --> 00:05:49.319
<v Speaker 2>And IDA Pro lets you create and apply these structures

137
00:05:49.360 --> 00:05:51.920
<v Speaker 2>to make the disassembly much more readable.

138
00:05:52.000 --> 00:05:54.160
<v Speaker 1>Oh that's cool. Yeah, especially if you're working with some

139
00:05:54.240 --> 00:05:55.720
<v Speaker 1>kind of complex data format.

140
00:05:56.079 --> 00:05:59.519
<v Speaker 2>Yeah, and remember those type libraries we talked about. Those

141
00:05:59.560 --> 00:06:02.480
<v Speaker 2>come in handy here too, because IDA Pro has

142
00:06:02.519 --> 00:06:05.560
<v Speaker 2>these libraries that define standard data structures.

143
00:06:05.720 --> 00:06:08.920
<v Speaker 1>So it's like a reference guide yeah for a data format.

144
00:06:08.639 --> 00:06:12.040
<v Speaker 2>So it can often automatically recognize and apply these structures

145
00:06:12.040 --> 00:06:12.360
<v Speaker 2>for you.

146
00:06:12.480 --> 00:06:14.720
<v Speaker 1>Well that's awesome. Saves you a lot of time.

147
00:06:14.720 --> 00:06:18.160
<v Speaker 2>And once you have the code and the data well defined,

148
00:06:18.800 --> 00:06:21.000
<v Speaker 2>you can start to see the bigger picture. Okay, this

149
00:06:21.079 --> 00:06:23.680
<v Speaker 2>is where IDA Pro's graphing capabilities come in.

150
00:06:24.000 --> 00:06:26.680
<v Speaker 1>The graphs, right, Why are these so important again?

151
00:06:26.800 --> 00:06:29.759
<v Speaker 2>Well, they give you a visual representation of the relationships

152
00:06:29.759 --> 00:06:32.399
<v Speaker 2>between different parts of the program. They can show you

153
00:06:32.439 --> 00:06:36.519
<v Speaker 2>how code flows within a function, how functions call each other,

154
00:06:36.879 --> 00:06:37.839
<v Speaker 2>how data.

155
00:06:37.680 --> 00:06:40.680
<v Speaker 1>Is accessed, so it's like a map of the software exactly.

156
00:06:41.120 --> 00:06:43.120
<v Speaker 2>And there are two main types of graphs.

157
00:06:43.319 --> 00:06:43.639
<v Speaker 1>Okay.

158
00:06:43.720 --> 00:06:47.680
<v Speaker 2>You have external graphs, which use separate graphing applications, and

159
00:06:47.720 --> 00:06:50.399
<v Speaker 2>then you have integrated graphs, which are built right into

160
00:06:50.439 --> 00:06:51.199
<v Speaker 2>IDA.

161
00:06:51.160 --> 00:06:52.399
<v Speaker 1>So you don't have to leave IDA Pro.

162
00:06:52.560 --> 00:06:55.720
<v Speaker 2>Yeah, exactly. And the integrated graphs are really cool because

163
00:06:55.720 --> 00:06:56.600
<v Speaker 2>they're interactive.

164
00:06:56.800 --> 00:06:57.360
<v Speaker 1>What does that mean.

165
00:06:57.480 --> 00:06:59.480
<v Speaker 2>Well, you can zoom in and out, you can pan around,

166
00:06:59.519 --> 00:07:01.680
<v Speaker 2>you can click on different nodes in the graph to

167
00:07:01.879 --> 00:07:02.319
<v Speaker 2>jump to.

168
00:07:02.279 --> 00:07:04.480
<v Speaker 1>the corresponding code. Oh, that's awesome.

169
00:07:04.560 --> 00:07:07.279
<v Speaker 2>It makes it much more engaging to understand the structure

170
00:07:07.279 --> 00:07:08.000
<v Speaker 2>of the program.

171
00:07:08.040 --> 00:07:09.839
<v Speaker 1>And there were different types of graphs, right, yeah, like

172
00:07:09.879 --> 00:07:12.439
<v Speaker 1>flow chart graphs. Those show the flow of the program.

173
00:07:12.560 --> 00:07:15.839
<v Speaker 2>Yeah, the flow of execution within a function, okay. And

174
00:07:15.879 --> 00:07:18.600
<v Speaker 2>then you have call graphs, which show you the hierarchy

175
00:07:18.639 --> 00:07:21.519
<v Speaker 2>of function calls, like who calls who exactly.

176
00:07:21.759 --> 00:07:23.879
<v Speaker 1>So it seems like there's a lot of customization that

177
00:07:23.959 --> 00:07:25.399
<v Speaker 1>you can do in IDA Pro.

178
00:07:25.639 --> 00:07:26.439
<v Speaker 2>Yeah. Absolutely.

179
00:07:26.639 --> 00:07:27.040
<v Speaker 1>Yeah.

180
00:07:27.319 --> 00:07:28.839
<v Speaker 2>One of the first things you want to look at

181
00:07:28.879 --> 00:07:32.319
<v Speaker 2>is the configuration options. You can customize like the font,

182
00:07:32.519 --> 00:07:35.079
<v Speaker 2>the colors, keyboard shortcuts, all kinds of.

183
00:07:35.040 --> 00:07:36.639
<v Speaker 1>Things, so you can make it work the way you

184
00:07:36.639 --> 00:07:37.839
<v Speaker 1>want it to exactly.

185
00:07:38.000 --> 00:07:40.199
<v Speaker 2>And then if you want to go even further, you

186
00:07:40.199 --> 00:07:43.560
<v Speaker 2>can use IDA Pro's scripting language, IDC.

187
00:07:44.120 --> 00:07:46.959
<v Speaker 1>I've heard about IDC scripting, but like, what can you

188
00:07:47.000 --> 00:07:47.800
<v Speaker 1>actually do with it?

189
00:07:47.879 --> 00:07:50.600
<v Speaker 2>Oh? All kinds of things. Let's say you're analyzing a

190
00:07:50.639 --> 00:07:55.000
<v Speaker 2>program that uses a custom encryption algorithm. You could write

191
00:07:55.079 --> 00:07:58.360
<v Speaker 2>an IDC script to automatically decrypt the data.

192
00:07:58.480 --> 00:07:58.920
<v Speaker 1>Oh wow.

193
00:07:59.079 --> 00:08:00.800
<v Speaker 2>Or you could use it to rename a bunch of

194
00:08:00.839 --> 00:08:03.000
<v Speaker 2>functions that have like generic names.

195
00:08:03.079 --> 00:08:05.319
<v Speaker 1>Oh. So it's like you have these little helpers working

196
00:08:05.360 --> 00:08:07.800
<v Speaker 1>behind the scenes exactly. And you can even go beyond

197
00:08:07.839 --> 00:08:10.360
<v Speaker 1>IDC scripting with, like, external plugins.

198
00:08:10.560 --> 00:08:13.079
<v Speaker 2>Yeah, you can write plugins to add whole new features

199
00:08:13.120 --> 00:08:13.759
<v Speaker 2>to IDA Pro.

200
00:08:14.000 --> 00:08:17.079
<v Speaker 1>So it's like you're expanding on its capability exactly. Wow.

201
00:08:17.120 --> 00:08:17.759
<v Speaker 1>That's a lot.

202
00:08:17.839 --> 00:08:20.000
<v Speaker 2>But even with all this power, sometimes you need a

203
00:08:20.000 --> 00:08:22.720
<v Speaker 2>little extra help, and that's where FLIRT comes in.

204
00:08:23.279 --> 00:08:24.480
<v Speaker 1>FLIRT? What was that again?

205
00:08:24.560 --> 00:08:28.560
<v Speaker 2>It stands for Fast Library Identification and Recognition Technology.

206
00:08:28.680 --> 00:08:30.439
<v Speaker 1>Oh okay, I remember you mentioned that.

207
00:08:30.480 --> 00:08:34.279
<v Speaker 2>It's really helpful because it lets IDA Pro automatically identify

208
00:08:34.320 --> 00:08:35.720
<v Speaker 2>standard library functions.

209
00:08:35.799 --> 00:08:37.360
<v Speaker 1>Why is that so important?

210
00:08:37.279 --> 00:08:40.120
<v Speaker 2>Well, a lot of programs use common libraries for things like

211
00:08:40.320 --> 00:08:42.519
<v Speaker 2>string manipulation, math operations.

212
00:08:42.639 --> 00:08:42.799
<v Speaker 1>Right.

213
00:08:43.159 --> 00:08:47.440
<v Speaker 2>Without FLIRT, you would have to manually analyze those library

214
00:08:47.440 --> 00:08:48.399
<v Speaker 2>functions every time.

215
00:08:48.480 --> 00:08:49.759
<v Speaker 1>Oh, that would be tedious.

216
00:08:49.919 --> 00:08:52.559
<v Speaker 2>Yeah, it would be really time consuming. Yeah, so FLIRT

217
00:08:52.600 --> 00:08:54.679
<v Speaker 2>basically takes care of that for you, so you.

218
00:08:54.600 --> 00:08:57.519
<v Speaker 1>Can focus on the interesting parts of the code. Exactly

219
00:08:57.600 --> 00:08:58.639
<v Speaker 1>how does it even work?

220
00:08:58.840 --> 00:09:01.759
<v Speaker 2>Well, it uses these things called signature files, which are

221
00:09:01.799 --> 00:09:07.240
<v Speaker 2>basically databases of function signatures. Okay, and when IDA Pro loads

222
00:09:07.279 --> 00:09:12.080
<v Speaker 2>a binary, it compares the code to those signature files.

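NOTE
Editor's sketch, not part of the spoken discussion: a toy Python version of matching
code bytes against a small signature table with wildcards. The names toy_strlen and
toy_memset and their byte patterns are invented for illustration, and this is not the
real FLIRT signature file format; only the leading 55 8B EC bytes are a genuine x86
prologue (push ebp; mov ebp, esp).
```python
# Each toy signature is a list of byte values; None is a wildcard for bytes
# that vary between builds (for example, relocated addresses).
SIGNATURES = {
    "toy_strlen": [0x55, 0x8B, 0xEC, 0x8B, 0x45, 0x08, None, None],
    "toy_memset": [0x55, 0x8B, 0xEC, 0x57, 0x8B, 0x7D, 0x08, None],
}
def matches(code: bytes, pattern: list) -> bool:
    """True if the start of code agrees with every non-wildcard pattern byte."""
    if len(code) < len(pattern):
        return False
    return all(p is None or p == b for p, b in zip(pattern, code))
def identify(code: bytes):
    """Return the name of the first matching signature, or None."""
    for name, pattern in SIGNATURES.items():
        if matches(code, pattern):
            return name
    return None
print(identify(bytes([0x55, 0x8B, 0xEC, 0x57, 0x8B, 0x7D, 0x08, 0x33])))
```
A real signature database is far larger and indexed for speed; the linear scan here only shows the idea.
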
223
00:09:11.759 --> 00:09:15.320
<v Speaker 1>So it's like a fingerprint database for library functions.

224
00:09:15.120 --> 00:09:16.879
<v Speaker 2>And if it finds a match, it can tell you

225
00:09:16.919 --> 00:09:20.440
<v Speaker 2>exactly what that function is, what its calling convention is,

226
00:09:20.480 --> 00:09:21.679
<v Speaker 2>what parameters it takes.

227
00:09:22.200 --> 00:09:26.919
<v Speaker 1>That's amazing. So what about when you have custom functions

228
00:09:27.679 --> 00:09:30.639
<v Speaker 1>or functions with non standard calling conventions? How do you

229
00:09:30.639 --> 00:09:31.240
<v Speaker 1>deal with those?

230
00:09:31.399 --> 00:09:34.759
<v Speaker 2>That's where understanding those calling conventions and the stack comes.

231
00:09:34.519 --> 00:09:37.120
<v Speaker 1>In the calling convention. Those were the rules of etiquette

232
00:09:37.200 --> 00:09:39.519
<v Speaker 1>right exactly for how functions talk to each other. Yea,

233
00:09:39.639 --> 00:09:42.600
<v Speaker 1>and the stack was like a temporary workspace exactly.

234
00:09:43.039 --> 00:09:45.919
<v Speaker 2>So different calling conventions use the stack in different ways.

235
00:09:46.320 --> 00:09:48.519
<v Speaker 2>Some require the caller to clean up the stack after

236
00:09:48.519 --> 00:09:51.720
<v Speaker 2>a function call, Others require the function itself to do

237
00:09:51.759 --> 00:09:52.240
<v Speaker 2>the cleaning.

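NOTE
Editor's sketch, not part of the spoken discussion: a tiny Python model of the stack
cleanup difference described here, using the 32-bit x86 cdecl and stdcall conventions
as the two examples. The function call_sequence and the callee name demo are
hypothetical, and 4-byte arguments are assumed.
```python
def call_sequence(func: str, args: list, convention: str) -> list:
    """Return the x86-style steps seen at the call site and in the callee."""
    arg_bytes = 4 * len(args)  # assume 4-byte arguments on 32-bit x86
    # Both conventions push arguments right to left before the call.
    steps = [f"caller: push {a}" for a in reversed(args)]
    steps.append(f"caller: call {func}")
    if convention == "cdecl":
        # Callee returns with a plain ret; the caller removes the arguments.
        steps.append("callee: ret")
        steps.append(f"caller: add esp, {arg_bytes}")
    elif convention == "stdcall":
        # Callee removes the arguments itself as part of the return.
        steps.append(f"callee: ret {arg_bytes}")
    else:
        raise ValueError(f"unknown convention: {convention}")
    return steps
for line in call_sequence("demo", ["1", "2"], "cdecl"):
    print(line)
```
Seeing an add to esp right after a call, versus a ret with an operand inside the callee, is one of the clues a reader can use to tell the two apart.
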
238
00:09:52.519 --> 00:09:55.279
<v Speaker 1>So if you don't know which convention's being used, you

239
00:09:55.279 --> 00:09:57.440
<v Speaker 1>could misinterpret what's happening exactly.

240
00:09:57.879 --> 00:09:59.960
<v Speaker 2>You might think data is being passed when it's not,

241
00:10:00.600 --> 00:10:03.039
<v Speaker 2>or you might miss something important about how the function

242
00:10:03.159 --> 00:10:04.480
<v Speaker 2>is managing its data.

243
00:10:04.519 --> 00:10:08.240
<v Speaker 1>So you have to know the secret handshake, yeah, for

244
00:10:08.360 --> 00:10:09.200
<v Speaker 1>for each different function.

245
00:10:09.000 --> 00:10:11.919
<v Speaker 2>And IDA Pro can actually help you figure out

246
00:10:11.960 --> 00:10:15.120
<v Speaker 2>which handshake is being used. Cool by analyzing the code.

247
00:10:15.440 --> 00:10:18.159
<v Speaker 1>So how can we see what's happening on the stack

248
00:10:18.279 --> 00:10:19.480
<v Speaker 1>as the program is running.

249
00:10:19.919 --> 00:10:21.000
<v Speaker 2>That's where the debugger comes in.

250
00:10:21.240 --> 00:10:22.279
<v Speaker 1>Oh yeah, the debugger.

251
00:10:22.320 --> 00:10:25.559
<v Speaker 2>It lets you step through the code one instruction at

252
00:10:25.600 --> 00:10:28.320
<v Speaker 2>a time, and you can see how data is pushed

253
00:10:28.360 --> 00:10:30.120
<v Speaker 2>onto the stack and popped off the stack.

254
00:10:30.200 --> 00:10:31.919
<v Speaker 1>So you can actually see the stack in action.

255
00:10:32.159 --> 00:10:33.200
<v Speaker 2>Yeah, exactly.

256
00:10:33.480 --> 00:10:34.200
<v Speaker 1>That's incredible.

257
00:10:34.279 --> 00:10:35.759
<v Speaker 2>It's a really powerful tool.

258
00:10:35.960 --> 00:10:36.759
<v Speaker 1>We've covered a lot.

259
00:10:36.840 --> 00:10:39.399
<v Speaker 2>We've only just scratched the surface of what IDA Pro can

260
00:10:39.440 --> 00:10:41.759
<v Speaker 2>do and the techniques that reverse engineers use.

261
00:10:42.000 --> 00:10:44.000
<v Speaker 1>But this is a good start, right, Yeah.

262
00:10:44.039 --> 00:10:46.320
<v Speaker 2>Absolutely, it's a great foundation to build upon.

263
00:10:46.679 --> 00:10:50.080
<v Speaker 1>So we talked about analyzing code and data, you know,

264
00:10:50.240 --> 00:10:52.600
<v Speaker 1>understanding those functions and how they communicate with each other,

265
00:10:53.159 --> 00:10:56.159
<v Speaker 1>even using that debugger to step through the program. But

266
00:10:56.200 --> 00:11:00.440
<v Speaker 1>I imagine like things get even more complicated when developers

267
00:11:00.480 --> 00:11:02.879
<v Speaker 1>try to make their code hard to understand on purpose.

268
00:11:03.360 --> 00:11:07.320
<v Speaker 2>Yeah, you're talking about obfuscated code. It's a common tactic,

269
00:11:07.440 --> 00:11:10.080
<v Speaker 2>especially for like malware authors who want to hide what

270
00:11:10.120 --> 00:11:10.960
<v Speaker 2>their code is doing.

271
00:11:11.200 --> 00:11:12.639
<v Speaker 1>So it's like they're trying to make it look like

272
00:11:12.679 --> 00:11:16.960
<v Speaker 1>a big tangled mess of spaghetti so nobody can figure

273
00:11:16.960 --> 00:11:17.799
<v Speaker 1>out the recipe.

274
00:11:18.000 --> 00:11:22.039
<v Speaker 2>Obfuscation techniques can range from simple things like renaming variables

275
00:11:22.600 --> 00:11:25.440
<v Speaker 2>to more complex things that actually change the structure and

276
00:11:25.519 --> 00:11:26.519
<v Speaker 2>logic of the code.

277
00:11:26.600 --> 00:11:29.840
<v Speaker 1>So how do reverse engineers even begin to make sense

278
00:11:29.879 --> 00:11:31.960
<v Speaker 1>of that? Does IDA Pro help with that at all?

279
00:11:32.200 --> 00:11:35.720
<v Speaker 2>Yeah? It does. The debugger is really helpful here, okay,

280
00:11:35.759 --> 00:11:38.120
<v Speaker 2>because it lets you step through the code and observe

281
00:11:38.200 --> 00:11:42.120
<v Speaker 2>its behavior and look for clues about how the obfuscation

282
00:11:42.240 --> 00:11:43.840
<v Speaker 2>is being applied or removed.

283
00:11:43.879 --> 00:11:46.039
<v Speaker 1>So it's like peeling back the layers of an onion

284
00:11:46.120 --> 00:11:48.039
<v Speaker 1>to get to the real code. Underneath.

285
00:11:48.360 --> 00:11:51.399
<v Speaker 2>And once you've identified those key points in the code

286
00:11:51.399 --> 00:11:55.120
<v Speaker 2>where the obfuscation is being handled, you can often use

287
00:11:55.279 --> 00:11:59.440
<v Speaker 2>IDA Pro's scripting capabilities to automate the deobfuscation process.

288
00:11:59.679 --> 00:12:00.720
<v Speaker 1>Wow, that's pretty cool.

289
00:12:00.840 --> 00:12:04.120
<v Speaker 2>Yeah, you can write scripts to like unpack data, decrypt code,

290
00:12:04.440 --> 00:12:07.559
<v Speaker 2>even emulate the obfuscation routines to get back to the

291
00:12:07.559 --> 00:12:08.480
<v Speaker 2>original code.

292
00:12:08.679 --> 00:12:11.080
<v Speaker 1>So you really need to understand both how the obfuscation

293
00:12:11.200 --> 00:12:13.600
<v Speaker 1>works and how to use IDA Pro's scripting.

294
00:12:13.759 --> 00:12:16.679
<v Speaker 2>Yeah, it can be pretty challenging, but it's often necessary

295
00:12:16.679 --> 00:12:19.519
<v Speaker 2>to understand what the code is really doing, especially if.

296
00:12:19.399 --> 00:12:23.519
<v Speaker 1>You think it might be malicious. Right, speaking of malicious code,

297
00:12:23.559 --> 00:12:27.000
<v Speaker 1>you mentioned that IDA Pro's often used for vulnerability analysis. Yeah,

298
00:12:27.120 --> 00:12:28.200
<v Speaker 1>can you tell me more about that.

299
00:12:28.279 --> 00:12:31.159
<v Speaker 2>So a vulnerability is basically a weakness in a piece

300
00:12:31.159 --> 00:12:36.399
<v Speaker 2>of software that an attacker could exploit to gain unauthorized

301
00:12:36.440 --> 00:12:37.519
<v Speaker 2>access or control.

302
00:12:37.639 --> 00:12:39.480
<v Speaker 1>So they're looking for like security holes.

303
00:12:39.399 --> 00:12:42.200
<v Speaker 2>Exactly, and IDA Pro is a great tool for finding

304
00:12:42.240 --> 00:12:44.639
<v Speaker 2>and analyzing these vulnerabilities.

305
00:12:44.720 --> 00:12:46.240
<v Speaker 1>How do they actually use it to find them?

306
00:12:46.399 --> 00:12:49.000
<v Speaker 2>Well, they use it to dissect the software, looking for

307
00:12:49.200 --> 00:12:54.039
<v Speaker 2>common coding errors that can lead to vulnerabilities, things like

308
00:12:54.080 --> 00:12:58.159
<v Speaker 2>buffer overflows where data is written beyond the allocated space

309
00:12:58.200 --> 00:13:02.720
<v Speaker 2>in memory, or format string vulnerabilities, where an attacker

310
00:13:02.759 --> 00:13:06.080
<v Speaker 2>can manipulate the way data is formatted to gain control

311
00:13:06.159 --> 00:13:07.360
<v Speaker 2>of the program's execution.

312
00:13:07.960 --> 00:13:10.159
<v Speaker 1>That sounds pretty complicated, Yeah it is.

313
00:13:10.200 --> 00:13:13.559
<v Speaker 2>It's a specialized field, but the tools and techniques are

314
00:13:13.600 --> 00:13:16.759
<v Speaker 2>fundamentally the same as those used for general reverse engineering.

315
00:13:17.360 --> 00:13:20.000
<v Speaker 1>And you mentioned earlier that IDA Pro has a

316
00:13:20.120 --> 00:13:23.960
<v Speaker 1>built-in debugger. Is that used for vulnerability analysis as well?

317
00:13:23.960 --> 00:13:27.080
<v Speaker 2>Oh? Absolutely, It's invaluable because it allows you to step

318
00:13:27.120 --> 00:13:31.000
<v Speaker 2>through the code and observe how data is handled and

319
00:13:31.159 --> 00:13:35.759
<v Speaker 2>identify those potential points of weakness where an attacker might

320
00:13:35.759 --> 00:13:39.919
<v Speaker 2>be able to inject malicious code or manipulate the program's behavior.

321
00:13:40.080 --> 00:13:42.679
<v Speaker 1>So again, it's like being a detective looking for clues.

322
00:13:42.759 --> 00:13:45.679
<v Speaker 2>And one of the most effective techniques for vulnerability analysis

323
00:13:45.759 --> 00:13:49.679
<v Speaker 2>is called differential analysis. It involves comparing different versions of

324
00:13:49.679 --> 00:13:53.200
<v Speaker 2>a program, typically a patched version and an unpatched version,

325
00:13:53.279 --> 00:13:53.840
<v Speaker 2>to see.

326
00:13:53.600 --> 00:13:56.039
<v Speaker 1>What's changed, So you're looking for the differences.

327
00:13:55.639 --> 00:13:59.600
<v Speaker 2>Exactly, and by analyzing those differences, security researchers can often

328
00:13:59.639 --> 00:14:03.039
<v Speaker 2>identify the code that was vulnerable and understand how the

329
00:14:03.080 --> 00:14:05.200
<v Speaker 2>patch fixed the issue.

330
00:14:05.000 --> 00:14:07.720
<v Speaker 1>So they can figure out how to protect against attacks. Right,

331
00:14:08.080 --> 00:14:11.480
<v Speaker 1>That's amazing. It sounds like a constant arms race

332
00:14:11.559 --> 00:14:13.919
<v Speaker 1>between the security researchers and the attackers.

333
00:14:14.240 --> 00:14:18.080
<v Speaker 2>You know, as attackers develop new techniques and exploit new vulnerabilities,

334
00:14:18.440 --> 00:14:21.679
<v Speaker 2>the defenders have to adapt and develop new countermeasures.

335
00:14:21.879 --> 00:14:25.559
<v Speaker 1>Well, it's definitely a fascinating field and an important one too. Absolutely,

336
00:14:25.600 --> 00:14:28.480
<v Speaker 1>we've covered so much ground in this deep dive. We

337
00:14:28.639 --> 00:14:32.120
<v Speaker 1>have from the basics of reverse engineering to some pretty

338
00:14:32.159 --> 00:14:35.200
<v Speaker 1>advanced techniques. Yeah, anything else you want to add before

339
00:14:35.200 --> 00:14:35.840
<v Speaker 1>we wrap up.

340
00:14:35.919 --> 00:14:38.759
<v Speaker 2>I think we've touched on the most important aspects, But

341
00:14:38.799 --> 00:14:41.320
<v Speaker 2>I do want to emphasize that this is just the beginning.

342
00:14:41.919 --> 00:14:46.200
<v Speaker 2>Reverse engineering is a vast and constantly evolving field.

343
00:14:46.120 --> 00:14:47.399
<v Speaker 1>So there's always more to learn.

344
00:14:47.559 --> 00:14:51.320
<v Speaker 2>Yeah, always more to learn, new tools and techniques, to

345
00:14:51.440 --> 00:14:53.440
<v Speaker 2>explore new challenges to tackle.

346
00:14:53.879 --> 00:14:56.919
<v Speaker 1>It sounds exciting and maybe a little bit daunting.

347
00:14:57.159 --> 00:15:00.480
<v Speaker 2>It can be both, but IDA Pro is a really powerful tool,

348
00:15:00.759 --> 00:15:04.080
<v Speaker 2>and with dedication and a thirst for knowledge, you can

349
00:15:04.159 --> 00:15:06.600
<v Speaker 2>unlock a lot of the secrets of the digital world.

350
00:15:06.799 --> 00:15:09.879
<v Speaker 1>Well, I'm definitely inspired to learn more. Thanks for taking

351
00:15:09.960 --> 00:15:11.879
<v Speaker 1>us on this deep dive into IDA Pro.

352
00:15:12.080 --> 00:15:12.720
<v Speaker 2>You're welcome.

353
00:15:12.879 --> 00:15:13.480
<v Speaker 1>It's been great.

354
00:15:13.600 --> 00:15:14.559
<v Speaker 2>It's been my pleasure.
