WEBVTT

1
00:00:00.120 --> 00:00:03.879
<v Speaker 1>Have you ever found yourself wondering how computers actually manage

2
00:00:03.919 --> 00:00:08.720
<v Speaker 1>these really complex challenges like optimizing delivery routes, or you know,

3
00:00:08.759 --> 00:00:10.519
<v Speaker 1>recognizing faces in.

4
00:00:10.560 --> 00:00:13.960
<v Speaker 2>Photos, or even just storing huge amounts of data efficiently.

5
00:00:14.359 --> 00:00:17.120
<v Speaker 1>Exactly what if there was sort of a shortcut to

6
00:00:17.239 --> 00:00:22.280
<v Speaker 1>understanding the basic logic behind so much of our modern tech.

7
00:00:22.719 --> 00:00:24.800
<v Speaker 2>Well, in this deep dive, that's what we're doing. We're

8
00:00:24.839 --> 00:00:29.359
<v Speaker 2>pulling back the curtain on classic computer science problems. And

9
00:00:29.440 --> 00:00:32.320
<v Speaker 2>these aren't just you know, dusty academic exercise.

10
00:00:31.920 --> 00:00:34.399
<v Speaker 1>Well is, They're not just for university courses, not at all.

11
00:00:34.479 --> 00:00:38.960
<v Speaker 2>They're the foundational programming challenges that actually solve real world problems.

12
00:00:39.200 --> 00:00:44.000
<v Speaker 1>And we've dug into a really comprehensive book covering these problems.

13
00:00:44.079 --> 00:00:46.960
<v Speaker 1>Our mission today it's basically to pull out the most

14
00:00:46.960 --> 00:00:49.679
<v Speaker 1>important bits of knowledge, the key insights for you.

15
00:00:49.799 --> 00:00:53.479
<v Speaker 2>We'll unpack the core ideas, maybe reveal some surprising facts

16
00:00:53.479 --> 00:00:54.280
<v Speaker 2>along the way, and.

17
00:00:54.240 --> 00:00:57.320
<v Speaker 1>Show you how these these foundational concepts are like woven

18
00:00:57.399 --> 00:01:00.479
<v Speaker 1>right into the fabric of our digital Worldfully, you'll have

19
00:01:00.520 --> 00:01:04.239
<v Speaker 1>a few aha moments. Okay, So to kick things off,

20
00:01:04.680 --> 00:01:07.680
<v Speaker 1>let's look at something that seems pretty simple on the surface.

21
00:01:08.200 --> 00:01:12.000
<v Speaker 1>The Fibonacci sequence. You know, one, one, two, three, five, right,

22
00:01:12.439 --> 00:01:15.200
<v Speaker 1>But I've heard this sequence can actually teach us some

23
00:01:15.319 --> 00:01:21.200
<v Speaker 1>really profound lessons about computational efficiency. What's the story there,

24
00:01:21.239 --> 00:01:22.280
<v Speaker 1>what's the hidden lesson?

25
00:01:22.439 --> 00:01:26.280
<v Speaker 2>Yeah, it's fascinating. It really highlights the stark difference between

26
00:01:26.400 --> 00:01:29.519
<v Speaker 2>just jumping in with a straightforward, what we might call

27
00:01:29.560 --> 00:01:33.640
<v Speaker 2>a naive recursive solution, and more optimized ways of thinking.

28
00:01:33.760 --> 00:01:35.760
<v Speaker 1>Okay, so what happens with that naive way?

29
00:01:35.959 --> 00:01:41.280
<v Speaker 2>Well, get this Calculating just the twentieth Fibonacci number using

30
00:01:41.319 --> 00:01:46.200
<v Speaker 2>that simple recursive function, it ends up making twenty ninety one.

31
00:01:46.239 --> 00:01:49.400
<v Speaker 1>Separate function calls twenty one thousand for the twentieth number.

32
00:01:49.439 --> 00:01:50.840
<v Speaker 1>That seems excessive.

33
00:01:50.879 --> 00:01:54.239
<v Speaker 2>It's incredibly inefficient, yeah, because it just keeps recalculating the

34
00:01:54.280 --> 00:01:56.000
<v Speaker 2>same values over and over and over.

35
00:01:55.920 --> 00:01:58.439
<v Speaker 1>Again, right, recalculating things that already figured out. But there

36
00:01:58.439 --> 00:01:59.920
<v Speaker 1>are better ways. You mentioned optimization.

37
00:02:00.079 --> 00:02:03.319
<v Speaker 2>Absolutely, there's a technique called memoization. It was actually coined

38
00:02:03.359 --> 00:02:04.959
<v Speaker 2>by computer scientist Donald Mitchee.

39
00:02:04.959 --> 00:02:07.200
<v Speaker 1>Memoization like writing a memo to.

40
00:02:07.159 --> 00:02:10.840
<v Speaker 2>Yourself kind of yeah, think it like a human memorization machine.

41
00:02:11.199 --> 00:02:15.080
<v Speaker 2>The function, remember, the results had already calculated, so with memoization,

42
00:02:15.879 --> 00:02:19.719
<v Speaker 2>that same twentieth number, it only takes thirty nine calls.

43
00:02:19.599 --> 00:02:22.759
<v Speaker 1>Thirty nine down from almost twenty two thousand. That's huge.

44
00:02:22.879 --> 00:02:26.560
<v Speaker 2>It's massive. And with this method you can actually calculate say,

45
00:02:26.919 --> 00:02:29.360
<v Speaker 2>FIB fifty without the whole thing grinding to a halt.

46
00:02:29.800 --> 00:02:31.479
<v Speaker 2>The naive version just wouldn't cope.

47
00:02:31.879 --> 00:02:34.639
<v Speaker 1>Okay, So memorization is a big step up that the

48
00:02:34.680 --> 00:02:35.319
<v Speaker 1>best we can do.

49
00:02:35.520 --> 00:02:37.759
<v Speaker 2>Well. There's also what you might call an old fashioned

50
00:02:37.800 --> 00:02:41.199
<v Speaker 2>iterative approach, just using a loop. Yeah, that approach runs

51
00:02:41.240 --> 00:02:44.479
<v Speaker 2>its main loop at most and one times, So for

52
00:02:44.520 --> 00:02:47.759
<v Speaker 2>the twentieth number, that's like nineteen loops, even fewer operations.

53
00:02:47.800 --> 00:02:50.000
<v Speaker 1>Wow, okay, and this really matters.

54
00:02:50.159 --> 00:02:52.319
<v Speaker 2>You know, this kind of difference can have a serious

55
00:02:52.360 --> 00:02:55.120
<v Speaker 2>difference in a real world application. It's not just theory.

56
00:02:55.240 --> 00:02:57.159
<v Speaker 2>It's about thinking efficiently before.

57
00:02:56.960 --> 00:02:59.879
<v Speaker 1>You code, right, It's the difference between something working instantly

58
00:03:00.080 --> 00:03:04.479
<v Speaker 1>and something maybe never finishing this sie So okay that speed.

59
00:03:05.039 --> 00:03:08.000
<v Speaker 1>But efficiency isn't just about speed? Is it saving space?

60
00:03:08.159 --> 00:03:11.960
<v Speaker 1>Virtual storage memory? That's also huge. How did these fundamental

61
00:03:12.000 --> 00:03:13.000
<v Speaker 1>ideas help with that?

62
00:03:13.319 --> 00:03:18.400
<v Speaker 2>Oh? Definitely? Yeah, Saving space often directly translates to saving money, right,

63
00:03:18.919 --> 00:03:22.319
<v Speaker 2>virtual or real makes sense. So think about DNA the

64
00:03:22.400 --> 00:03:27.159
<v Speaker 2>nucleotides ACG or T. That's it, just four possibilities. If

65
00:03:27.199 --> 00:03:29.719
<v Speaker 2>you store a DNA sequence as a normal tech string,

66
00:03:30.360 --> 00:03:33.439
<v Speaker 2>each letter, each character usually takes up eight bits.

67
00:03:33.599 --> 00:03:35.479
<v Speaker 1>Okay, standard tech storage.

68
00:03:35.800 --> 00:03:38.439
<v Speaker 2>But wait, since there are only four possible values, you

69
00:03:38.479 --> 00:03:41.479
<v Speaker 2>don't actually need eight bits. You only need two bits

70
00:03:41.479 --> 00:03:43.719
<v Speaker 2>per nuclear tie. You could say, you know A is zero, zero,

71
00:03:43.919 --> 00:03:46.919
<v Speaker 2>C is zero, one, G is ten, T is eleven.

72
00:03:47.360 --> 00:03:49.759
<v Speaker 1>I see, so you're mapping the four options onto just

73
00:03:49.800 --> 00:03:50.759
<v Speaker 1>two bits. That's clever.

74
00:03:50.960 --> 00:03:54.680
<v Speaker 2>It's simple but effective. That bit string representation. That can

75
00:03:54.719 --> 00:03:58.039
<v Speaker 2>slash the storage needed for DNA data by seventy five percent,

76
00:03:58.479 --> 00:04:00.520
<v Speaker 2>from eight bits down to just two it's per.

77
00:04:00.439 --> 00:04:03.960
<v Speaker 1>Nucleodide seventy five percent. That's enormous when you think about

78
00:04:03.960 --> 00:04:05.800
<v Speaker 1>the scale of genomic data exactly.

79
00:04:05.840 --> 00:04:08.360
<v Speaker 2>We saw an example I think where an original string

80
00:04:08.439 --> 00:04:11.479
<v Speaker 2>was maybe eighty six hundred and forty nine bytes. Compressed

81
00:04:11.479 --> 00:04:13.360
<v Speaker 2>this way, it dropped to just twenty three hundred and

82
00:04:13.439 --> 00:04:14.000
<v Speaker 2>twenty bytes.

83
00:04:14.039 --> 00:04:16.439
<v Speaker 1>That's a massive saving. Now, this brings up something interesting

84
00:04:16.439 --> 00:04:19.079
<v Speaker 1>about choosing how to implement things in code, doesn't it.

85
00:04:19.079 --> 00:04:22.800
<v Speaker 2>It does. Sometimes using a dictionary or a hash map

86
00:04:22.879 --> 00:04:26.519
<v Speaker 2>for lookups, which theoretically should be super fast like one

87
00:04:26.959 --> 00:04:27.839
<v Speaker 2>or constant time.

88
00:04:27.959 --> 00:04:30.079
<v Speaker 1>Right, it's supposed to be instant regardless of size.

89
00:04:30.160 --> 00:04:34.360
<v Speaker 2>Yeah, but in practice it can sometimes be less performant

90
00:04:34.399 --> 00:04:38.319
<v Speaker 2>than a series of ifs. Really, why is that because

91
00:04:38.360 --> 00:04:41.079
<v Speaker 2>of the hidden cost of running the hash function itself.

92
00:04:41.879 --> 00:04:43.959
<v Speaker 2>The calculation to figure out where to look in the

93
00:04:43.959 --> 00:04:47.079
<v Speaker 2>dictionary takes a little bit of time. So in really

94
00:04:47.160 --> 00:04:50.279
<v Speaker 2>performance critical code you might actually need to run performance

95
00:04:50.319 --> 00:04:53.480
<v Speaker 2>tests to see what's really faster in your specific case.

96
00:04:53.800 --> 00:04:54.839
<v Speaker 2>Theory doesn't always win.

97
00:04:55.040 --> 00:04:58.639
<v Speaker 1>That's a great point. Real world testing matters, and just quickly.

98
00:04:58.720 --> 00:05:01.639
<v Speaker 1>Another example of level stuff is the xor operation right,

99
00:05:01.839 --> 00:05:02.600
<v Speaker 1>used in.

100
00:05:02.600 --> 00:05:07.040
<v Speaker 2>Absolutely xor is fundamental for certain types of unbreakable encryption.

101
00:05:07.360 --> 00:05:10.879
<v Speaker 2>Another example of low level bit manipulation having powerful applications.

102
00:05:10.959 --> 00:05:14.120
<v Speaker 1>Okay, so we've seen efficiency and speed efficiency in space

103
00:05:14.240 --> 00:05:16.920
<v Speaker 1>down to the bits. What about problems that are more

104
00:05:16.959 --> 00:05:20.639
<v Speaker 1>about logic rules constraints?

105
00:05:21.000 --> 00:05:24.319
<v Speaker 2>Ah? Yeah, that brings us to a really elegant framework

106
00:05:24.480 --> 00:05:26.800
<v Speaker 2>constraints satisfaction problems or CSPs.

107
00:05:27.079 --> 00:05:28.759
<v Speaker 1>CSPs? What are they?

108
00:05:28.800 --> 00:05:31.240
<v Speaker 2>Basically, there are a way to solve problems you can

109
00:05:31.279 --> 00:05:35.800
<v Speaker 2>define abstractly by variables of limited domains that have constraints

110
00:05:35.800 --> 00:05:36.240
<v Speaker 2>between them.

111
00:05:36.240 --> 00:05:39.240
<v Speaker 1>Okay, break that down. Variables domains constraints.

112
00:05:39.360 --> 00:05:41.319
<v Speaker 2>So variables are the things you need to decide on.

113
00:05:41.680 --> 00:05:45.040
<v Speaker 2>Domains are the possible choices or values for each variable,

114
00:05:45.639 --> 00:05:48.600
<v Speaker 2>and constraints are the rules that say which combinations of

115
00:05:48.759 --> 00:05:49.720
<v Speaker 2>values are allowed.

116
00:05:49.879 --> 00:05:51.399
<v Speaker 1>Gotcha, can you give an example?

117
00:05:51.560 --> 00:05:55.240
<v Speaker 2>Sure? Think about the classic Australian map coloring problem. The

118
00:05:55.279 --> 00:05:58.279
<v Speaker 2>regions or states are the variables. The possible colors are

119
00:05:58.319 --> 00:06:01.560
<v Speaker 2>the domain, say red, green, blue, and the constraint the

120
00:06:01.600 --> 00:06:04.439
<v Speaker 2>extrain is simple, no two neighboring regions can have the

121
00:06:04.439 --> 00:06:08.000
<v Speaker 2>same color or Another famous one is the eight queens problem,

122
00:06:08.560 --> 00:06:10.800
<v Speaker 2>placing eight chess queens on a board so none can

123
00:06:10.839 --> 00:06:11.519
<v Speaker 2>attack each other.

124
00:06:11.720 --> 00:06:16.399
<v Speaker 1>Ah okay, So the queens are variables, their positions are domains.

125
00:06:16.079 --> 00:06:19.639
<v Speaker 2>Exactly, and the constraint is the no attacking rule. The

126
00:06:19.680 --> 00:06:23.040
<v Speaker 2>core of solving these often involves a function, maybe called consistent,

127
00:06:23.199 --> 00:06:27.519
<v Speaker 2>that just checks does putting this value here violate any rules?

128
00:06:27.800 --> 00:06:31.240
<v Speaker 1>Okay, cool puzzles, But where do CSPs show up in

129
00:06:31.279 --> 00:06:34.399
<v Speaker 1>the real world Outside of map coloring and chess.

130
00:06:34.519 --> 00:06:38.160
<v Speaker 2>They're actually commonly used in scheduling. Think about scheduling staff

131
00:06:38.160 --> 00:06:41.839
<v Speaker 2>in a hospital. People are variables. Timeslots are domains, and

132
00:06:41.920 --> 00:06:44.759
<v Speaker 2>constraints are things like nurse A needs two days off

133
00:06:44.920 --> 00:06:47.319
<v Speaker 2>or doctor B must be on call with surgency.

134
00:06:47.519 --> 00:06:50.000
<v Speaker 1>Right, that makes sense complex scheduling anywhere else.

135
00:06:50.160 --> 00:06:54.279
<v Speaker 2>Yeah. In motion planning for robotics, imagine guiding a robot

136
00:06:54.439 --> 00:06:57.199
<v Speaker 2>arm to fit inside a narrow tube without hitting the sides.

137
00:06:57.600 --> 00:07:00.480
<v Speaker 2>The tube walls are the constraints. The angles of the

138
00:07:00.560 --> 00:07:04.079
<v Speaker 2>robot's joints are the variables, the possible movements of the domains.

139
00:07:04.639 --> 00:07:07.560
<v Speaker 2>CSPs help figure out a valid sequence of moves.

140
00:07:07.600 --> 00:07:12.399
<v Speaker 1>Fascinating. Okay, So from logic puzzles to navigating spaces finding

141
00:07:12.399 --> 00:07:14.439
<v Speaker 1>a path from A to B. That's a massive area

142
00:07:14.480 --> 00:07:18.120
<v Speaker 1>in computer science, right from simple lists to complex mazes.

143
00:07:18.199 --> 00:07:21.560
<v Speaker 2>Oh huge. And we have different tools depending on the situation.

144
00:07:21.879 --> 00:07:24.160
<v Speaker 2>For basic lists, if the data isn't sorted, you might

145
00:07:24.240 --> 00:07:26.839
<v Speaker 2>just use a linear search look at everything one by one.

146
00:07:27.000 --> 00:07:30.360
<v Speaker 1>Like scanning a whole bookshelf for one specific book exactly.

147
00:07:30.720 --> 00:07:32.639
<v Speaker 2>But if the data is sorted, you can use the

148
00:07:32.680 --> 00:07:34.160
<v Speaker 2>much faster binary search.

149
00:07:34.519 --> 00:07:36.839
<v Speaker 1>That's like opening a phone book right to the middle

150
00:07:37.040 --> 00:07:38.800
<v Speaker 1>than the middle of the remaining half right.

151
00:07:38.839 --> 00:07:42.639
<v Speaker 2>Much quicker, precisely. But what about something more complex like

152
00:07:42.639 --> 00:07:46.040
<v Speaker 2>a maze. How does a computer navigate that? What does

153
00:07:46.040 --> 00:07:48.839
<v Speaker 2>this all mean for finding your way through complex spaces?

154
00:07:49.000 --> 00:07:49.959
<v Speaker 1>Yeah? How does that work?

155
00:07:50.120 --> 00:07:54.360
<v Speaker 2>Well? Two fundamental approaches are depth first search or DFS

156
00:07:54.519 --> 00:07:56.720
<v Speaker 2>and Brett first search or BFS.

157
00:07:56.839 --> 00:07:58.240
<v Speaker 1>DFS and BFS.

158
00:07:58.319 --> 00:08:01.480
<v Speaker 2>Okay. DFS uses a data strictually call a stack. I

159
00:08:01.519 --> 00:08:03.519
<v Speaker 2>think last and first out, like a stack of plates.

160
00:08:03.600 --> 00:08:06.160
<v Speaker 1>Got it last, one on, first, one off right?

161
00:08:06.240 --> 00:08:09.519
<v Speaker 2>So DFS goes deep down one path, explores as far

162
00:08:09.560 --> 00:08:11.839
<v Speaker 2>as it can. If it hits a dead end, it

163
00:08:11.920 --> 00:08:13.279
<v Speaker 2>backtracks and tries another.

164
00:08:13.079 --> 00:08:15.399
<v Speaker 1>Branch, So it dives deep first.

165
00:08:15.560 --> 00:08:19.160
<v Speaker 2>Yeah. The paths that finds can sometimes seem unnatural, though,

166
00:08:19.279 --> 00:08:20.959
<v Speaker 2>and they're usually not the shortest paths.

167
00:08:21.199 --> 00:08:23.639
<v Speaker 1>It just finds a path okay, so not necessarily the

168
00:08:23.680 --> 00:08:26.160
<v Speaker 1>best path, just one that works. What about BFS.

169
00:08:26.279 --> 00:08:29.000
<v Speaker 2>BFS uses a queue. Think first and first out like

170
00:08:29.040 --> 00:08:29.680
<v Speaker 2>waiting in line.

171
00:08:29.800 --> 00:08:30.439
<v Speaker 1>Okay, fifo.

172
00:08:30.600 --> 00:08:33.600
<v Speaker 2>So DFS explores outwards from the start layer by layer.

173
00:08:33.799 --> 00:08:36.600
<v Speaker 2>It checks all neighbors one step away, then all neighbors

174
00:08:36.639 --> 00:08:38.000
<v Speaker 2>two steps away, and so on.

175
00:08:38.039 --> 00:08:40.519
<v Speaker 1>Ah, spreading out evenly exactly.

176
00:08:41.000 --> 00:08:44.759
<v Speaker 2>And because of that systematic exploration, BFS always finds the

177
00:08:44.799 --> 00:08:47.840
<v Speaker 2>shortest path in terms of the number of steps or edges,

178
00:08:48.320 --> 00:08:50.480
<v Speaker 2>assuming the cost of each step is the same.

179
00:08:50.759 --> 00:08:54.159
<v Speaker 1>Okay, So BFS guarantees the shortest path. DFS might be

180
00:08:54.279 --> 00:08:57.679
<v Speaker 1>faster to find any path, but not the shortest sounds

181
00:08:57.679 --> 00:08:58.600
<v Speaker 1>like a trade off.

182
00:08:58.519 --> 00:09:01.799
<v Speaker 2>It often is. Yeah, choosing between them is sometimes a

183
00:09:01.840 --> 00:09:04.519
<v Speaker 2>trade off between the possibility of finding a solution quickly

184
00:09:04.840 --> 00:09:06.840
<v Speaker 2>and the certainty of finding the shortest pass.

185
00:09:06.919 --> 00:09:08.840
<v Speaker 1>Makes sense. Are there more advanced methods?

186
00:09:08.960 --> 00:09:11.320
<v Speaker 2>Oh? Yes, then you get into things like a search

187
00:09:12.200 --> 00:09:15.799
<v Speaker 2>A is more sophisticated. It uses a priority queue and

188
00:09:15.840 --> 00:09:16.960
<v Speaker 2>something called a heuristic.

189
00:09:17.200 --> 00:09:19.759
<v Speaker 1>Okay, priority queue and a heuristic. What does that mean?

190
00:09:19.840 --> 00:09:23.240
<v Speaker 2>A priority Q is like a queue where items have

191
00:09:23.279 --> 00:09:25.919
<v Speaker 2>different levels of importance, so the most promising options get

192
00:09:25.919 --> 00:09:29.519
<v Speaker 2>looked at first, and a heuristic is basically an educated

193
00:09:29.600 --> 00:09:32.279
<v Speaker 2>gas or a rule of thumb to estimate how close

194
00:09:32.320 --> 00:09:32.960
<v Speaker 2>you are to the goal.

195
00:09:33.159 --> 00:09:34.840
<v Speaker 1>An estimate. How does a use that?

196
00:09:35.039 --> 00:09:38.159
<v Speaker 2>Oh, combines the actual cost to get to a certain point,

197
00:09:38.200 --> 00:09:41.120
<v Speaker 2>we call it GN with the estimated cost from that

198
00:09:41.159 --> 00:09:45.960
<v Speaker 2>point to the finish line. That's hn the heuristic. It

199
00:09:46.039 --> 00:09:49.559
<v Speaker 2>prioritizes exploring paths that seem to have the lowest total cost,

200
00:09:49.919 --> 00:09:51.480
<v Speaker 2>both actual and estimated, So.

201
00:09:51.399 --> 00:09:54.559
<v Speaker 1>It's using an estimate to guide its search more intelligently exactly.

202
00:09:54.840 --> 00:09:58.559
<v Speaker 2>For mazes, a common heuristic is the Manhattan distance, just

203
00:09:58.600 --> 00:10:02.960
<v Speaker 2>the horizontal plus vertical distance to the goal, ignoring walls

204
00:10:02.960 --> 00:10:06.080
<v Speaker 2>for the estimate like distance in city blocks. Okay, and

205
00:10:06.480 --> 00:10:09.480
<v Speaker 2>using a good heuristic like that, A not only finds

206
00:10:09.519 --> 00:10:13.360
<v Speaker 2>the optimal shortest path, but it far outperforms BFS because

207
00:10:13.360 --> 00:10:16.480
<v Speaker 2>it explores way fewer dead ends. It's much more focused.

208
00:10:16.600 --> 00:10:19.159
<v Speaker 1>That sounds really powerful. And the nice thing is if

209
00:10:19.159 --> 00:10:22.440
<v Speaker 1>you write these search algorithms well generically.

210
00:10:21.960 --> 00:10:25.159
<v Speaker 2>Right, they become incredibly versatile. These same search functions can

211
00:10:25.200 --> 00:10:28.080
<v Speaker 2>be easily adapted for solving a diverse set of problems,

212
00:10:28.440 --> 00:10:31.639
<v Speaker 2>like that old brain teaser the missionaries and Cannibal's problem

213
00:10:31.840 --> 00:10:33.919
<v Speaker 2>is just another state space to search.

214
00:10:33.960 --> 00:10:37.720
<v Speaker 1>Right, Okay, Moving beyond mazes and paths, let's talk about graphs.

215
00:10:37.759 --> 00:10:40.480
<v Speaker 1>You mentioned graphs earlier with maps. They seem really fundamental.

216
00:10:40.720 --> 00:10:44.720
<v Speaker 2>They are. The world of graph algorithms is surprisingly broad

217
00:10:44.799 --> 00:10:47.679
<v Speaker 2>in their applicability, so much can be modeled as.

218
00:10:47.559 --> 00:10:51.360
<v Speaker 1>A graph like the map. Example, cities are vertices, connections,

219
00:10:51.440 --> 00:10:52.279
<v Speaker 1>or edges.

220
00:10:52.200 --> 00:10:55.440
<v Speaker 2>Exactly, and graphs can be undirected like a two way

221
00:10:55.519 --> 00:10:57.799
<v Speaker 2>road or direct like a one way street, and the

222
00:10:57.919 --> 00:11:01.320
<v Speaker 2>edges can be unweighted just showing a connection exists, or

223
00:11:01.360 --> 00:11:04.519
<v Speaker 2>weighted showing a cost like distance or travel time.

224
00:11:04.759 --> 00:11:07.559
<v Speaker 1>So if we want the shortest path in an unweighted graph,

225
00:11:08.039 --> 00:11:10.360
<v Speaker 1>the path with the fewest edges, we can just use

226
00:11:10.399 --> 00:11:11.600
<v Speaker 1>BFS again, right.

227
00:11:11.519 --> 00:11:14.759
<v Speaker 2>Yeah, BFS works perfectly for that. Like, if you were

228
00:11:14.759 --> 00:11:18.639
<v Speaker 2>planning a hypothetical hyperloop network, BFS could find the route

229
00:11:18.679 --> 00:11:22.519
<v Speaker 2>from say Boston to Miami with the minimum number of stops.

230
00:11:22.559 --> 00:11:27.080
<v Speaker 2>Maybe it's Boston, Detroit, Washington, Miami, just three segments, three edges.

231
00:11:27.240 --> 00:11:29.759
<v Speaker 1>Okay, but what if the goal isn't just one path,

232
00:11:29.840 --> 00:11:33.799
<v Speaker 1>but connecting everything efficiently, like connecting all the major US

233
00:11:33.840 --> 00:11:35.919
<v Speaker 1>cities with the minimum amount of track.

234
00:11:36.320 --> 00:11:40.440
<v Speaker 2>Now you're talking about a minimum spanning tree or MST problem.

235
00:11:40.519 --> 00:11:42.759
<v Speaker 2>The goal is to minimize the cost of building the

236
00:11:42.759 --> 00:11:44.919
<v Speaker 2>network while ensuring everything is connected.

237
00:11:45.000 --> 00:11:45.799
<v Speaker 1>How do you solve that?

238
00:11:45.960 --> 00:11:49.840
<v Speaker 2>A common way is Jarnick's algorithm, sometimes called Prim's algorithm.

239
00:11:50.240 --> 00:11:53.399
<v Speaker 2>It basically starts somewhere and greedily adds the cheapest edge

240
00:11:53.399 --> 00:11:55.960
<v Speaker 2>that connects a new city to the growing network without

241
00:11:55.960 --> 00:11:56.720
<v Speaker 2>creating a cycle.

242
00:11:56.879 --> 00:11:59.840
<v Speaker 1>So it builds a network piece by piece, always picking

243
00:11:59.840 --> 00:12:02.519
<v Speaker 1>the cheapest next connection pretty much.

244
00:12:02.799 --> 00:12:05.799
<v Speaker 2>We saw a calculation using this for the fifteen largest

245
00:12:05.879 --> 00:12:11.360
<v Speaker 2>US metropolitan statistical areas msas the minimum length of track

246
00:12:11.480 --> 00:12:13.120
<v Speaker 2>needed to connect all of them was found to be

247
00:12:13.440 --> 00:12:17.279
<v Speaker 2>five three hundred and seventy two miles. Crucial for planning

248
00:12:17.279 --> 00:12:19.360
<v Speaker 2>infrastructure like railways or pipelines.

249
00:12:19.519 --> 00:12:22.919
<v Speaker 1>Wow, and what about finding shortest paths in weighted graphs

250
00:12:23.159 --> 00:12:27.120
<v Speaker 1>where edges have different costs like actual road distances. BFS

251
00:12:27.159 --> 00:12:28.360
<v Speaker 1>won't work then, right, right?

252
00:12:28.399 --> 00:12:31.519
<v Speaker 2>BFS only cares about the number of edges. For weighted graphs,

253
00:12:31.519 --> 00:12:33.279
<v Speaker 2>you need something like Dykstra's algorithm.

254
00:12:33.360 --> 00:12:34.720
<v Speaker 1>Dikester has heard of that one.

255
00:12:34.759 --> 00:12:37.000
<v Speaker 2>It's designed to find the lowest cost path from a

256
00:12:37.039 --> 00:12:39.559
<v Speaker 2>starting point to all other points in a weighted graph.

257
00:12:39.879 --> 00:12:41.919
<v Speaker 2>It keeps track of the cheapest known way to reach

258
00:12:41.960 --> 00:12:46.120
<v Speaker 2>each city and systematically explores outwards, always updating paths if

259
00:12:46.159 --> 00:12:48.799
<v Speaker 2>it finds a cheaper wrap. Very important for things like

260
00:12:48.879 --> 00:12:49.879
<v Speaker 2>GPS navigation.

261
00:12:50.240 --> 00:12:53.480
<v Speaker 1>Absolutely, it seems like graphs are everywhere once you start looking.

262
00:12:53.240 --> 00:12:56.120
<v Speaker 2>They really are. A huge amount of our world can

263
00:12:56.159 --> 00:13:00.519
<v Speaker 2>be represented using graphs, and these algorithms are essential for

264
00:13:00.559 --> 00:13:05.120
<v Speaker 2>efficiency in the telecommunications, shipping, transportation, and utility industries.

265
00:13:05.519 --> 00:13:06.879
<v Speaker 1>Can you do some big examples.

266
00:13:06.879 --> 00:13:10.000
<v Speaker 2>Sure, think about Walmart building out an efficient distribution network,

267
00:13:10.120 --> 00:13:14.120
<v Speaker 2>making warehouses and stores. That's a graph problem. Or Google

268
00:13:14.360 --> 00:13:17.840
<v Speaker 2>indexing the web. The entire Internet is a gigantic graph

269
00:13:17.919 --> 00:13:18.840
<v Speaker 2>of linked pages.

270
00:13:19.159 --> 00:13:20.919
<v Speaker 1>Right links are edges.

271
00:13:20.559 --> 00:13:23.879
<v Speaker 2>Exactly or FedEx finding the right set of hubs to

272
00:13:23.960 --> 00:13:27.720
<v Speaker 2>minimize delivery times and costs. It's all graph algorithms working

273
00:13:27.759 --> 00:13:28.480
<v Speaker 2>behind the scenes.

274
00:13:28.639 --> 00:13:31.279
<v Speaker 1>Okay, let's shift gears a bit now away from these

275
00:13:31.320 --> 00:13:36.120
<v Speaker 1>more deterministic algorithms towards things that feel more intelligent or adaptive,

276
00:13:36.639 --> 00:13:39.080
<v Speaker 1>like genetic algorithms. You mention they're less predictable.

277
00:13:39.159 --> 00:13:43.000
<v Speaker 2>Yeah, genetic algorithms are less deterministic than most traditional methods,

278
00:13:43.360 --> 00:13:46.799
<v Speaker 2>but that unpredictability can be a strength. Sometimes they can

279
00:13:46.799 --> 00:13:50.039
<v Speaker 2>solve problems that other approaches cannot solve in a reasonable

280
00:13:50.039 --> 00:13:53.799
<v Speaker 2>amount of time, especially really complex optimization problems.

281
00:13:54.039 --> 00:13:56.639
<v Speaker 1>How do they work? You mentioned they have a biological background.

282
00:13:56.759 --> 00:14:01.039
<v Speaker 2>They do. They simulate natural selection like evolution. You start

283
00:14:01.039 --> 00:14:03.919
<v Speaker 2>with a population of pencil solutions called chromosomes.

284
00:14:04.000 --> 00:14:06.600
<v Speaker 1>Chromosomes okay, like individuals, right, and.

285
00:14:06.559 --> 00:14:10.320
<v Speaker 2>Each chromosome has genes, which are like its specific traits

286
00:14:10.480 --> 00:14:14.480
<v Speaker 2>or parts of the solution. These individuals then compete in

287
00:14:14.519 --> 00:14:18.600
<v Speaker 2>a sense. To solve the problem. Their success is measured

288
00:14:18.639 --> 00:14:21.879
<v Speaker 2>by a fitness function, how well they do.

289
00:14:21.799 --> 00:14:24.559
<v Speaker 1>The job Survival of the fittest basically exactly.

290
00:14:24.840 --> 00:14:28.440
<v Speaker 2>The process involves creating an initial population, often randomly. Then

291
00:14:28.480 --> 00:14:32.240
<v Speaker 2>you measure everyone's fitness. Then you select individuals for reproduction.

292
00:14:32.279 --> 00:14:35.840
<v Speaker 2>Fitter ones are more likely to be chosen. Common methods

293
00:14:35.840 --> 00:14:38.679
<v Speaker 2>are rulette, rule selection or tournament selection.

294
00:14:38.759 --> 00:14:39.759
<v Speaker 1>Okay, select the best.

295
00:14:40.519 --> 00:14:44.000
<v Speaker 2>Then what then comes crossover? You take two parent solutions

296
00:14:44.039 --> 00:14:46.039
<v Speaker 2>and combine parts of them to create one or two

297
00:14:46.120 --> 00:14:49.559
<v Speaker 2>children's solutions, hoping to mix good traits, mixing gen right.

298
00:14:50.000 --> 00:14:54.159
<v Speaker 2>And finally, there's mutation, making small random changes to some individuals.

299
00:14:54.639 --> 00:14:58.080
<v Speaker 2>This helps maintain diversity of the population and prevents getting

300
00:14:58.120 --> 00:15:02.039
<v Speaker 2>stuck too early on a sub optimal solution. You repeat

301
00:15:02.080 --> 00:15:07.080
<v Speaker 2>this cycle fitness, selection, crossover, mutation over many generations.

302
00:15:07.159 --> 00:15:10.440
<v Speaker 1>That's a really cool analogy. Can they solve actual problems?

303
00:15:10.600 --> 00:15:13.279
<v Speaker 2>Oh? Yeah, they can tackle things like that. Send plus

304
00:15:13.279 --> 00:15:16.440
<v Speaker 2>more money. Cryptoithmetic puzzle we mentioned earlier, which is also

305
00:15:16.480 --> 00:15:19.320
<v Speaker 2>a CSP. You can represent the letters and digits and

306
00:15:19.399 --> 00:15:22.080
<v Speaker 2>let the genetic algorithm evolve toward a valid assignment.

307
00:15:22.360 --> 00:15:26.000
<v Speaker 1>Huh. Solving a logic puzzle through simulated evolution. What else?

308
00:15:26.200 --> 00:15:30.480
<v Speaker 2>Something may be more surprising optimizing list compression. It turns

309
00:15:30.480 --> 00:15:32.919
<v Speaker 2>out that for many compression algorithms, the order of the

310
00:15:32.919 --> 00:15:34.799
<v Speaker 2>items will affect the compression ratio.

311
00:15:34.919 --> 00:15:37.120
<v Speaker 1>The order matters really for some algorithms.

312
00:15:37.200 --> 00:15:39.879
<v Speaker 2>Yes, So for a list of say twelve names, maybe

313
00:15:39.919 --> 00:15:42.039
<v Speaker 2>standard compression gets it down to one hundred and sixty

314
00:15:42.039 --> 00:15:45.200
<v Speaker 2>five bytes, a genetic algorithm could try rearranging the order

315
00:15:45.240 --> 00:15:47.639
<v Speaker 2>of those names and found an order that compressed down

316
00:15:47.639 --> 00:15:49.000
<v Speaker 2>to one hundred and fifty nine bytes.

317
00:15:49.120 --> 00:15:52.279
<v Speaker 1>Okay, from one to sixty five seems like a small gain.

318
00:15:52.559 --> 00:15:55.879
<v Speaker 2>It might seem small for twelve names, but imagine optimizing

319
00:15:55.919 --> 00:15:59.360
<v Speaker 2>the order for millions of items. Now. Could you find

320
00:15:59.360 --> 00:16:02.840
<v Speaker 2>the absolute best order for those twelve names by checking

321
00:16:03.279 --> 00:16:05.559
<v Speaker 2>every possibility twelve names?

322
00:16:05.559 --> 00:16:08.720
<v Speaker 1>That's twelve factorial permutations? Right, that's huge.

323
00:16:08.799 --> 00:16:12.039
<v Speaker 2>It's four hundred and seventy nine thousand, one thousand, six

324
00:16:12.159 --> 00:16:16.919
<v Speaker 2>hundred possible orders. Absolutely unfeasible to check them all. Brute

325
00:16:16.919 --> 00:16:20.399
<v Speaker 2>force is impossible, right, No way, And that's where genetic

326
00:16:20.440 --> 00:16:24.440
<v Speaker 2>algorithms shine. They don't guarantee the absolute, perfect optimal solution,

327
00:16:25.279 --> 00:16:28.799
<v Speaker 2>but they're great at finding very good, near optimal solutions

328
00:16:28.799 --> 00:16:32.840
<v Speaker 2>for problems where finding the true optimum is computationally impossible

329
00:16:33.159 --> 00:16:35.600
<v Speaker 2>or would take like the age of the universe.

330
00:16:35.639 --> 00:16:39.039
<v Speaker 1>So they're a practical way to tackle incredibly complex optimization.

331
00:16:39.200 --> 00:16:40.240
<v Speaker 1>Are there downsides?

332
00:16:40.320 --> 00:16:43.000
<v Speaker 2>Well, As we said, they're less deterministic. Run it twice

333
00:16:43.039 --> 00:16:45.039
<v Speaker 2>you might get slightly different results. It could also be

334
00:16:45.279 --> 00:16:47.559
<v Speaker 2>something of a black box. It's not always clear why

335
00:16:47.600 --> 00:16:49.840
<v Speaker 2>the solution they found work so well, and we don't

336
00:16:49.840 --> 00:16:52.240
<v Speaker 2>really know if we found the optimal order just a good.

337
00:16:52.039 --> 00:16:55.000
<v Speaker 1>One, gotcha, but still useful any wild applications.

338
00:16:55.240 --> 00:16:59.159
<v Speaker 2>People have used them for computer generated art, evolving images

339
00:16:59.159 --> 00:17:03.399
<v Speaker 2>made of polygon to resemble photographs, and even genetic programming,

340
00:17:03.720 --> 00:17:07.000
<v Speaker 2>where the things evolving aren't just data but actual pieces

341
00:17:07.079 --> 00:17:10.160
<v Speaker 2>of computer code programs that write programs.

342
00:17:10.359 --> 00:17:14.559
<v Speaker 1>WHOA, Okay, that's mind bending, all right. Another area dealing

343
00:17:14.599 --> 00:17:19.000
<v Speaker 1>with complexity finding hidden structure in data. We have more

344
00:17:19.039 --> 00:17:20.720
<v Speaker 1>data than ever, how do we make sense of it

345
00:17:20.720 --> 00:17:22.200
<v Speaker 1>if we don't even know what we're looking for.

346
00:17:22.599 --> 00:17:25.519
<v Speaker 2>That's where clustering comes in. It's a key technique when

347
00:17:25.559 --> 00:17:27.599
<v Speaker 2>you want to learn about the structure of a data set,

348
00:17:28.160 --> 00:17:30.839
<v Speaker 2>but you do not know ahead of time. It's constituent parts.

349
00:17:31.119 --> 00:17:34.039
<v Speaker 1>Finding groups without knowing the groups beforehand exactly.

350
00:17:34.279 --> 00:17:36.880
<v Speaker 2>K means clustering is a very common type. It's a

351
00:17:36.920 --> 00:17:40.599
<v Speaker 2>form of unsupervised learning. You don't train it with labeled examples.

352
00:17:40.640 --> 00:17:43.039
<v Speaker 2>It finds the inherent groupings in the data itself.

353
00:17:43.279 --> 00:17:45.960
<v Speaker 1>Okay, K means How might that be used?

354
00:17:46.359 --> 00:17:46.599
<v Speaker 2>Say?

355
00:17:46.880 --> 00:17:47.400
<v Speaker 1>In business?

356
00:17:47.519 --> 00:17:49.720
<v Speaker 2>Imagine you run a grocery store. You have tons of

357
00:17:49.759 --> 00:17:52.400
<v Speaker 2>data about who buys what. When you can use K

358
00:17:52.559 --> 00:17:56.799
<v Speaker 2>means to cluster customers based on demographics, purchase history, day

359
00:17:56.799 --> 00:17:57.440
<v Speaker 2>of the week.

360
00:17:57.240 --> 00:17:59.000
<v Speaker 1>They shop and what would that tell you?

361
00:17:58.599 --> 00:18:02.000
<v Speaker 2>You might discover hidden patterns, like maybe there's a distinct

362
00:18:02.079 --> 00:18:06.000
<v Speaker 2>cluster of younger shoppers prefer to shop on Tuesdays. You

363
00:18:06.000 --> 00:18:08.680
<v Speaker 2>didn't know that group existed, but the algorithm found.

364
00:18:08.400 --> 00:18:09.960
<v Speaker 1>It, and then you could use that insight.

365
00:18:10.400 --> 00:18:13.240
<v Speaker 2>Right, you could run an ad specifically targeting them on

366
00:18:13.319 --> 00:18:16.480
<v Speaker 2>Mondays or Tuesdays, making your marketing much more effective.

367
00:18:16.799 --> 00:18:20.440
<v Speaker 1>Makes sense? How does the K means algorithm actually work?

368
00:18:20.880 --> 00:18:21.680
<v Speaker 1>Is it complex?

369
00:18:22.119 --> 00:18:27.079
<v Speaker 2>The algorithm itself is surprisingly simple conceptually. First, you decide

370
00:18:27.079 --> 00:18:28.880
<v Speaker 2>how many clusters you want to find, that's the K.

371
00:18:29.400 --> 00:18:32.319
<v Speaker 2>Then you initialize K centroids, which are just the starting

372
00:18:32.319 --> 00:18:34.880
<v Speaker 2>center points for each cluster, often placed randomly.

373
00:18:35.000 --> 00:18:37.359
<v Speaker 1>Okay, pick k random starting points.

374
00:18:37.559 --> 00:18:41.680
<v Speaker 2>Then you repeat two steps. Step one, assign clusters. You

375
00:18:41.680 --> 00:18:43.640
<v Speaker 2>go through every single data point and assign it to

376
00:18:43.640 --> 00:18:45.440
<v Speaker 2>the cluster whose centroid is closest.

377
00:18:45.640 --> 00:18:47.599
<v Speaker 1>Assign each point to its nearest center.

378
00:18:47.759 --> 00:18:52.400
<v Speaker 2>Step two, generate centroids. For each cluster. You calculate the

379
00:18:52.480 --> 00:18:55.799
<v Speaker 2>mean the average position of all the points currently assigned

380
00:18:55.839 --> 00:18:59.000
<v Speaker 2>to it. That mean becomes the new centroid for that cluster.

381
00:18:59.160 --> 00:19:02.880
<v Speaker 1>Okay, move the center to the average location of its points.

382
00:19:02.920 --> 00:19:05.759
<v Speaker 2>Exactly, and you just keep repeating those two steps. Assigned points,

383
00:19:05.839 --> 00:19:09.119
<v Speaker 2>update centroids until the centroids stop moving much or converge.

384
00:19:09.640 --> 00:19:11.559
<v Speaker 2>The points naturally group around these centers.

385
00:19:11.680 --> 00:19:16.359
<v Speaker 1>Huh, elegant. Where else does k meines use besides customer segmentation?

386
00:19:16.640 --> 00:19:21.359
<v Speaker 2>Oh, lots of places. It helps with pattern recognition in biology,

387
00:19:21.880 --> 00:19:25.119
<v Speaker 2>maybe identifying groups of incongruous cells that might signal a

388
00:19:25.160 --> 00:19:29.680
<v Speaker 2>problem finding anomalies. Yeah. In image recognition, pixels themselves can

389
00:19:29.720 --> 00:19:32.720
<v Speaker 2>be data lights. You could cluster pixels by color to

390
00:19:32.759 --> 00:19:34.359
<v Speaker 2>segment an image into regions.

391
00:19:34.559 --> 00:19:34.880
<v Speaker 1>Okay.

392
00:19:35.079 --> 00:19:37.680
<v Speaker 2>Even in political science, it's used to group voters based

393
00:19:37.680 --> 00:19:42.359
<v Speaker 2>on survey responses or demographics to find voters to target

394
00:19:42.559 --> 00:19:46.079
<v Speaker 2>and understand their underlying concerns without predefining the groups.

395
00:19:46.279 --> 00:19:49.839
<v Speaker 1>Powerful stuff for finding structure in the unknown. Now, let's

396
00:19:49.839 --> 00:19:53.720
<v Speaker 1>talk about the elephant in the AI room, neural networks.

397
00:19:53.880 --> 00:19:57.200
<v Speaker 1>When we hear about deep learning, it's usually neural nets.

398
00:19:57.000 --> 00:20:00.519
<v Speaker 2>Right, almost always, yes, especially the really impressive advances in

399
00:20:00.559 --> 00:20:04.160
<v Speaker 2>AI lately, they often concern neural networks, particularly deep ones

400
00:20:04.200 --> 00:20:05.200
<v Speaker 2>with many layers.

401
00:20:05.240 --> 00:20:07.880
<v Speaker 1>So what are they? Fundamentally, they're inspired by the brain.

402
00:20:08.240 --> 00:20:11.400
<v Speaker 2>That's the core idea they mimic in a very simplified way,

403
00:20:12.000 --> 00:20:17.000
<v Speaker 2>the structure of biological neurons. A basic unit, a neuron

404
00:20:17.559 --> 00:20:21.839
<v Speaker 2>takes several inputs. Each input has a weight associated with it,

405
00:20:21.960 --> 00:20:23.079
<v Speaker 2>representing its importance.

406
00:20:23.319 --> 00:20:24.400
<v Speaker 1>Inputs have weights.

407
00:20:24.640 --> 00:20:27.640
<v Speaker 2>The neurons sums up these weighted inputs and then applies

408
00:20:27.680 --> 00:20:31.480
<v Speaker 2>an activation function to the total. This function determines the

409
00:20:31.480 --> 00:20:35.079
<v Speaker 2>neuron's output, whether it fires and how strongly.

410
00:20:34.839 --> 00:20:39.440
<v Speaker 1>Okay weighted Some than an activation and these neurons are connected.

411
00:20:39.519 --> 00:20:42.319
<v Speaker 2>Yes, they're organized into layers. You have an input layer

412
00:20:42.680 --> 00:20:45.480
<v Speaker 2>that receives the raw data, one or more hidden layers

413
00:20:45.519 --> 00:20:48.359
<v Speaker 2>where the computation happens, and an output layer that gives

414
00:20:48.359 --> 00:20:51.359
<v Speaker 2>a final result. The outputs from neurons in one layer

415
00:20:51.400 --> 00:20:53.640
<v Speaker 2>become the inputs for neurons in the next layer.

416
00:20:53.680 --> 00:20:56.440
<v Speaker 1>A network of connected layers. But how do they learn?

417
00:20:56.519 --> 00:20:59.720
<v Speaker 1>How does it go from random connections to say, recognizing

418
00:20:59.720 --> 00:21:01.119
<v Speaker 1>a case through training.

419
00:21:01.319 --> 00:21:03.559
<v Speaker 2>It's a process of showing the network vast amounts of

420
00:21:03.599 --> 00:21:06.559
<v Speaker 2>labeled data, like millions of images, each tagged as cat

421
00:21:06.640 --> 00:21:07.160
<v Speaker 2>or not cat.

422
00:21:07.319 --> 00:21:09.559
<v Speaker 1>Learning by example, exactly.

423
00:21:09.480 --> 00:21:12.359
<v Speaker 2>When the network makes a prediction, say it calls a

424
00:21:12.359 --> 00:21:15.559
<v Speaker 2>cat picture a dog, you calculate the error, the difference

425
00:21:15.599 --> 00:21:18.960
<v Speaker 2>between its output and the correct answer. This air signal

426
00:21:19.000 --> 00:21:22.359
<v Speaker 2>is then propagated backward through the network using a process

427
00:21:22.359 --> 00:21:23.920
<v Speaker 2>called back propagation.

428
00:21:24.160 --> 00:21:26.359
<v Speaker 1>Backpropagation calculates the error.

429
00:21:26.759 --> 00:21:30.759
<v Speaker 2>Then what backpropagation helps figure out how much each connection's

430
00:21:30.799 --> 00:21:35.160
<v Speaker 2>weight contributed to the error. Then an optimization algorithm, usually

431
00:21:35.240 --> 00:21:39.440
<v Speaker 2>gradient descent, is used to slightly adjust those weights. It

432
00:21:39.519 --> 00:21:42.000
<v Speaker 2>nudges them in a direction that should reduce the error.

433
00:21:42.039 --> 00:21:44.599
<v Speaker 1>Next time adjusting the weights to get closer to the

434
00:21:44.680 --> 00:21:45.920
<v Speaker 1>right answer precisely.

435
00:21:46.000 --> 00:21:48.599
<v Speaker 2>And there's a learning rate parameter that controls how big

436
00:21:48.640 --> 00:21:51.960
<v Speaker 2>those adjustments are. Too big and you might overshoot too small,

437
00:21:52.119 --> 00:21:55.319
<v Speaker 2>and learning takes forever. You repeat this process forward pass

438
00:21:55.680 --> 00:22:00.000
<v Speaker 2>calculate air backpropagate adjust weights millions of times.

439
00:21:59.799 --> 00:22:01.759
<v Speaker 1>It sounds computationally intensive.

440
00:22:01.960 --> 00:22:04.839
<v Speaker 2>Training can be yes, But what's fascinating is that the

441
00:22:04.839 --> 00:22:07.519
<v Speaker 2>network figures out the important features on its own. It

442
00:22:07.559 --> 00:22:10.759
<v Speaker 2>does not care what the various attributes represent in human terms.

443
00:22:10.839 --> 00:22:12.000
<v Speaker 1>It just finds patterns.

444
00:22:12.400 --> 00:22:17.559
<v Speaker 2>Yeah, we saw examples classifying iris flowers based on four measurements,

445
00:22:17.759 --> 00:22:21.519
<v Speaker 2>or different wine cultivars based on thirteen chemical properties. The

446
00:22:21.559 --> 00:22:25.119
<v Speaker 2>network just learns the underlying patterns linking inputs to outputs

447
00:22:25.559 --> 00:22:27.119
<v Speaker 2>without needing explicit rules.

448
00:22:27.279 --> 00:22:30.640
<v Speaker 1>So what does this all mean? They're incredibly powerful, clearly,

449
00:22:31.000 --> 00:22:34.240
<v Speaker 1>but are there drawbacks? You mentioned black box before.

450
00:22:34.559 --> 00:22:38.680
<v Speaker 2>That's a big one. They're often something of a black box.

451
00:22:39.079 --> 00:22:40.799
<v Speaker 2>I can give you the right answer, but it's very

452
00:22:40.839 --> 00:22:43.680
<v Speaker 2>difficult to understand why they gave that answer, what features

453
00:22:43.720 --> 00:22:46.240
<v Speaker 2>they relied on. This lack of interpretability could be a

454
00:22:46.279 --> 00:22:50.359
<v Speaker 2>major issue in sensitive areas like medical diagnosis or loan applications.

455
00:22:50.440 --> 00:22:52.400
<v Speaker 1>Right, you need to be able to explain the decision

456
00:22:52.559 --> 00:22:53.839
<v Speaker 1>any other downsides.

457
00:22:54.279 --> 00:22:57.799
<v Speaker 2>They also often require very large data sets for training.

458
00:22:58.160 --> 00:23:01.160
<v Speaker 2>We're talking potentially millions of training images for a good

459
00:23:01.200 --> 00:23:04.319
<v Speaker 2>image classifier. Gathering and labeling that much data can be

460
00:23:04.319 --> 00:23:05.880
<v Speaker 2>a huge undertaking millions.

461
00:23:05.960 --> 00:23:09.720
<v Speaker 1>Okay, so expensive to train, potentially hard to interpret. But

462
00:23:09.960 --> 00:23:12.000
<v Speaker 1>once trained, that's the good news.

463
00:23:12.240 --> 00:23:15.680
<v Speaker 2>While training is computationally expensive, using a pre trained network

464
00:23:15.720 --> 00:23:17.319
<v Speaker 2>is actually very fast and efficient.

465
00:23:17.680 --> 00:23:20.759
<v Speaker 1>Ah, so the heavy lifting is done beforehand exactly.

466
00:23:21.240 --> 00:23:23.799
<v Speaker 2>That's why you see things like Apple's core mL framework.

467
00:23:24.200 --> 00:23:28.240
<v Speaker 2>It lets developers easily incorporate powerful, pre trained neural network

468
00:23:28.279 --> 00:23:31.759
<v Speaker 2>models directly into their apps. The training was done on

469
00:23:31.799 --> 00:23:34.400
<v Speaker 2>big servers, but the app on your phone can use

470
00:23:34.440 --> 00:23:35.920
<v Speaker 2>the result instantly.

471
00:23:35.759 --> 00:23:37.799
<v Speaker 1>And that enables all sorts of cool stuff we use

472
00:23:37.839 --> 00:23:38.240
<v Speaker 1>every day.

473
00:23:38.319 --> 00:23:43.000
<v Speaker 2>Totally. It's enabling things like practical voice recognition Siri, Alexa,

474
00:23:43.200 --> 00:23:49.039
<v Speaker 2>Google Assistant, and image recognition like Facebook automatically suggesting tags

475
00:23:49.079 --> 00:23:50.839
<v Speaker 2>for faces in your photos.

476
00:23:50.519 --> 00:23:53.039
<v Speaker 1>Or even handwriting recognition. I know my phone can search

477
00:23:53.079 --> 00:23:54.000
<v Speaker 1>my handwritten notes.

478
00:23:54.039 --> 00:23:56.960
<v Speaker 2>Now yep, that's often neural networks too. They're really changing

479
00:23:56.960 --> 00:23:58.839
<v Speaker 2>what's possible in user phasing applications.

480
00:23:58.880 --> 00:24:01.880
<v Speaker 1>Okay, fantastic overview. Now for a quick bonus round a

481
00:24:01.880 --> 00:24:05.480
<v Speaker 1>few more classic problems. First up, the knapsack problem sounds

482
00:24:05.559 --> 00:24:06.079
<v Speaker 1>like a story.

483
00:24:06.279 --> 00:24:07.960
<v Speaker 2>It is often told that way. A thief with a

484
00:24:08.000 --> 00:24:11.440
<v Speaker 2>knapsack of limited capacity trying to steal the most valuable items.

485
00:24:12.200 --> 00:24:15.599
<v Speaker 2>But it represents a really common computational need finding the

486
00:24:15.640 --> 00:24:19.920
<v Speaker 2>best use of limited resources given a finite set of usage.

487
00:24:19.519 --> 00:24:23.680
<v Speaker 1>Options, maximizing value within a budget or capacity exactly.

488
00:24:23.720 --> 00:24:26.599
<v Speaker 2>We're usually talking about the one variant, meaning for each item,

489
00:24:26.960 --> 00:24:28.559
<v Speaker 2>you either take the whole thing or you leave it.

490
00:24:28.599 --> 00:24:29.319
<v Speaker 2>You can't take.

491
00:24:29.160 --> 00:24:31.480
<v Speaker 1>Half, Okay, take it or leave it. How hard is

492
00:24:31.480 --> 00:24:32.000
<v Speaker 1>that to solve?

493
00:24:32.599 --> 00:24:36.079
<v Speaker 2>Well, if you try brute force checking every possible combination

494
00:24:36.160 --> 00:24:39.400
<v Speaker 2>of items, you're looking at two end different possible subsets.

495
00:24:39.440 --> 00:24:41.240
<v Speaker 2>Where n is the number of items.

496
00:24:41.400 --> 00:24:43.279
<v Speaker 1>Exponential growth gets big.

497
00:24:43.119 --> 00:24:46.400
<v Speaker 2>Fast, very fast. For just eleven items, that's two thousan

498
00:24:46.559 --> 00:24:50.799
<v Speaker 2>forty eight combinations. Manageable maybe, but scale that up it

499
00:24:50.839 --> 00:24:53.079
<v Speaker 2>quickly becomes untenable for a large number.

500
00:24:53.240 --> 00:24:56.160
<v Speaker 1>Yeah, so brute forces out for anything non trivial. Is

501
00:24:56.160 --> 00:24:57.279
<v Speaker 1>there a smarter way?

502
00:24:57.640 --> 00:25:00.640
<v Speaker 2>Yes? This is where dynamic programming are often comes in.

503
00:25:00.680 --> 00:25:02.200
<v Speaker 2>We mentioned it briefly before.

504
00:25:02.039 --> 00:25:03.400
<v Speaker 1>Right breaking the problem down.

505
00:25:03.440 --> 00:25:07.319
<v Speaker 2>Instead of solving the big problem directly, dynamic programming solves

506
00:25:07.559 --> 00:25:11.599
<v Speaker 2>smaller sub problems first and uses those solutions to build

507
00:25:11.640 --> 00:25:15.880
<v Speaker 2>up to the final answer, storing intermediate results to avoid recomputation.

508
00:25:16.079 --> 00:25:18.559
<v Speaker 2>How much better is it for that example? With eleven

509
00:25:18.599 --> 00:25:22.359
<v Speaker 2>items and a knapsack capacity of seventy five, dynamic programming

510
00:25:22.359 --> 00:25:24.319
<v Speaker 2>only needs to do about eight hundred and twenty five

511
00:25:24.359 --> 00:25:28.000
<v Speaker 2>calculations roughly end time, ce the capacity down from twenty

512
00:25:28.119 --> 00:25:30.559
<v Speaker 2>forty eight. That's a significant improvement.

513
00:25:30.319 --> 00:25:33.440
<v Speaker 1>Definitely much more scalable. Where else does this idea apply?

514
00:25:33.759 --> 00:25:37.640
<v Speaker 2>The technique is widely applicable. Think about a university deciding

515
00:25:37.640 --> 00:25:41.160
<v Speaker 2>how to allocate its athletic budget. They have a limited budget,

516
00:25:41.200 --> 00:25:45.279
<v Speaker 2>the knapsack various sports programs items, each with a cost

517
00:25:45.359 --> 00:25:48.680
<v Speaker 2>and an expected return, maybe in alumni donations or prestige.

518
00:25:48.960 --> 00:25:51.039
<v Speaker 2>They want to pick the set of programs that gives

519
00:25:51.039 --> 00:25:54.359
<v Speaker 2>the best return within budget. That's a knapsack problem.

520
00:25:54.480 --> 00:25:57.400
<v Speaker 1>Huh, never thought of it that way, Okay, next bonus.

521
00:25:57.599 --> 00:25:59.839
<v Speaker 1>The traveling salesman problem TSP.

522
00:26:00.160 --> 00:26:03.839
<v Speaker 2>Ah, the famous TSP. The goal sounds simple. You have

523
00:26:03.839 --> 00:26:05.599
<v Speaker 2>a list of cities. You need to visit every single

524
00:26:05.599 --> 00:26:08.200
<v Speaker 2>one exactly once and end up back where you started

525
00:26:08.440 --> 00:26:10.559
<v Speaker 2>finding the absolute shortest possible route.

526
00:26:10.759 --> 00:26:14.000
<v Speaker 1>Sounds like something every delivery company needs to solve daily.

527
00:26:14.319 --> 00:26:17.680
<v Speaker 2>They do. But here's the catch. TSP is notoriously difficult.

528
00:26:17.960 --> 00:26:21.200
<v Speaker 2>It's an NP hard problem. N P hard meaning meaning

529
00:26:21.200 --> 00:26:24.960
<v Speaker 2>there's no known polynomial time algorithm they can guarantee finding

530
00:26:24.960 --> 00:26:28.599
<v Speaker 2>the perfect shortest route quickly for any arbitrary number of cities.

531
00:26:29.039 --> 00:26:32.680
<v Speaker 2>The complexity explodes. It becomes essentially impossible to solve the

532
00:26:32.720 --> 00:26:37.640
<v Speaker 2>problem perfectly optimally for millions of cities in any reasonable timeframe.

533
00:26:37.799 --> 00:26:41.599
<v Speaker 1>So companies like UPS or FedEx can't find the absolute,

534
00:26:41.759 --> 00:26:44.920
<v Speaker 1>mathematically perfect route for all their trucks every day.

535
00:26:45.279 --> 00:26:49.400
<v Speaker 2>Not perfectly. No, they use very sophisticated heuristics and approximation

536
00:26:49.480 --> 00:26:53.240
<v Speaker 2>algorithms that find extremely good routes very close to optimal,

537
00:26:53.559 --> 00:26:56.480
<v Speaker 2>but they likely aren't guaranteed to be the single shortest

538
00:26:56.519 --> 00:26:59.559
<v Speaker 2>possible path out of the trillions and trillions of possibilities.

539
00:26:59.799 --> 00:27:02.200
<v Speaker 1>For smaller instances like just a few cities.

540
00:27:02.279 --> 00:27:04.880
<v Speaker 2>Oh yeah, for a small number, like visiting five cities

541
00:27:04.880 --> 00:27:08.240
<v Speaker 2>in Vermont, brute force is feasible. You just generate all

542
00:27:08.279 --> 00:27:12.119
<v Speaker 2>possible orderings permutations of the cities, calculate the total distance

543
00:27:12.160 --> 00:27:14.720
<v Speaker 2>for each, and pick the shortest. We saw an example

544
00:27:14.759 --> 00:27:16.880
<v Speaker 2>where the shortest path was three hundred and eighteen miles.

545
00:27:16.920 --> 00:27:20.440
<v Speaker 1>Okay, so solvable for small N, but relies on approximations

546
00:27:20.480 --> 00:27:24.720
<v Speaker 1>for large N. Got it last? One phone number mnemonics.

547
00:27:24.880 --> 00:27:26.839
<v Speaker 2>Yeah, this is a fun one. Remember those old phone

548
00:27:26.880 --> 00:27:30.319
<v Speaker 2>numbers that spelled out words like one eight hundred flowers?

549
00:27:30.400 --> 00:27:32.640
<v Speaker 1>Sure, how do computers figure those out or help you

550
00:27:32.680 --> 00:27:35.519
<v Speaker 1>find a new one, like maybe one ga zero sts

551
00:27:35.559 --> 00:27:38.079
<v Speaker 1>for one, four, four, zero, seven eight seven.

552
00:27:38.400 --> 00:27:41.559
<v Speaker 2>You can use algorithms that explore the combinations for each

553
00:27:41.599 --> 00:27:45.359
<v Speaker 2>digit on the keypad to abc, three, def, et cetera.

554
00:27:45.720 --> 00:27:48.559
<v Speaker 2>You have multiple letter options, you can use something like

555
00:27:48.599 --> 00:27:52.279
<v Speaker 2>the Cartesian product mathematically or just nested loops in code

556
00:27:52.359 --> 00:27:55.640
<v Speaker 2>to generate all possible letter combinations for the number sequence.

557
00:27:55.720 --> 00:27:57.720
<v Speaker 1>And then you'd check that list against a dictionary to

558
00:27:57.720 --> 00:28:00.240
<v Speaker 1>see if any parts spell actual words exactly.

559
00:28:00.279 --> 00:28:03.440
<v Speaker 2>It's a systematic way to explore the possibilities and find

560
00:28:03.440 --> 00:28:04.720
<v Speaker 2>those memorable combinations.

561
00:28:04.880 --> 00:28:08.920
<v Speaker 1>Wow, Okay, what an incredible journey through these classic computer

562
00:28:08.960 --> 00:28:12.680
<v Speaker 1>science problems. It's amazing how much ground we've covered, from

563
00:28:13.119 --> 00:28:16.319
<v Speaker 1>the subtle efficiency games in fibonacci, right.

564
00:28:16.200 --> 00:28:19.240
<v Speaker 2>Down to the bit level savings in compression, to.

565
00:28:19.319 --> 00:28:23.160
<v Speaker 1>The strategic choices and the knapsack problem, navigating graphs with

566
00:28:23.319 --> 00:28:26.519
<v Speaker 1>Diichstra or A, and then these adaptive methods like genetic

567
00:28:26.559 --> 00:28:30.359
<v Speaker 1>algorithms and neural networks. It really shows how ingenious solutions

568
00:28:30.400 --> 00:28:33.359
<v Speaker 1>can tackle seemingly impossible challenges.

569
00:28:33.400 --> 00:28:36.400
<v Speaker 2>Absolutely, and I think the key takeaway is that knowledge

570
00:28:36.440 --> 00:28:38.559
<v Speaker 2>is most valuable when you understand it and can apply

571
00:28:38.640 --> 00:28:41.039
<v Speaker 2>it right. These aren't just abstract puzzles locked away in

572
00:28:41.039 --> 00:28:42.079
<v Speaker 2>computer science to work.

573
00:28:42.599 --> 00:28:43.799
<v Speaker 1>No, they're really fundamental.

574
00:28:43.920 --> 00:28:46.960
<v Speaker 2>They're the building blocks that have shaped and continued to

575
00:28:47.039 --> 00:28:51.960
<v Speaker 2>shape our technological world. They drive innovation in fields way

576
00:28:52.000 --> 00:28:55.640
<v Speaker 2>beyond just programming. They offer these really powerful ways to

577
00:28:55.720 --> 00:28:59.400
<v Speaker 2>analyze and approach problems in incredibly diverse areas.

578
00:28:59.440 --> 00:29:02.039
<v Speaker 1>It's helping us that makes sense of complexity exactly. So

579
00:29:02.119 --> 00:29:04.720
<v Speaker 1>here's a final thought for everyone listening. As you go

580
00:29:04.759 --> 00:29:08.000
<v Speaker 1>about your day, think about a problem you encounter, maybe

581
00:29:08.039 --> 00:29:10.960
<v Speaker 1>at work, maybe just in daily life. It could be simple,

582
00:29:11.079 --> 00:29:14.519
<v Speaker 1>could be complex. How might you try to abstract it?

583
00:29:15.000 --> 00:29:17.880
<v Speaker 1>Could you define it in terms of variables and constraints,

584
00:29:18.359 --> 00:29:21.359
<v Speaker 1>or maybe nodes and edges in a graph. Could you

585
00:29:21.400 --> 00:29:24.839
<v Speaker 1>even think of it as a population of competing solutions,

586
00:29:25.359 --> 00:29:26.880
<v Speaker 1>like in a genetic algorithm.

587
00:29:27.000 --> 00:29:30.279
<v Speaker 2>That's a great challenge applying these frameworks in unexpected places.

588
00:29:30.440 --> 00:29:33.799
<v Speaker 1>Yeah, could one of these classic computer science approaches like

589
00:29:33.799 --> 00:29:37.039
<v Speaker 1>the ones we've talked about today offer a surprising maybe

590
00:29:37.039 --> 00:29:41.279
<v Speaker 1>even a powerful new way to understand or even solve

591
00:29:41.279 --> 00:29:42.880
<v Speaker 1>that problem. Give it some thought,
