WEBVTT 1 00:00:00.120 --> 00:00:03.879 Have you ever found yourself wondering how computers actually manage 2 00:00:03.919 --> 00:00:08.720 these really complex challenges like optimizing delivery routes, or you know, 3 00:00:08.759 --> 00:00:10.519 recognizing faces in 4 00:00:10.560 --> 00:00:13.960 photos, or even just storing huge amounts of data efficiently? 5 00:00:14.359 --> 00:00:17.120 Exactly. What if there was sort of a shortcut to 6 00:00:17.239 --> 00:00:22.280 understanding the basic logic behind so much of our modern tech? 7 00:00:22.719 --> 00:00:24.800 Well, in this deep dive, that's what we're doing. We're 8 00:00:24.839 --> 00:00:29.359 pulling back the curtain on classic computer science problems. And 9 00:00:29.440 --> 00:00:32.320 these aren't just, you know, dusty academic exercises. 10 00:00:31.920 --> 00:00:34.399 Well, yes, they're not just for university courses, not at all. 11 00:00:34.479 --> 00:00:38.960 They're the foundational programming challenges that actually solve real-world problems. 12 00:00:39.200 --> 00:00:44.000 And we've dug into a really comprehensive book covering these problems. 13 00:00:44.079 --> 00:00:46.960 Our mission today is basically to pull out the most 14 00:00:46.960 --> 00:00:49.679 important bits of knowledge, the key insights for you. 15 00:00:49.799 --> 00:00:53.479 We'll unpack the core ideas, maybe reveal some surprising facts 16 00:00:53.479 --> 00:00:54.280 along the way, and 17 00:00:54.240 --> 00:00:57.320 show you how these foundational concepts are, like, woven 18 00:00:57.399 --> 00:01:00.479 right into the fabric of our digital world. Hopefully, you'll have 19 00:01:00.520 --> 00:01:04.239 a few aha moments. Okay, so to kick things off, 20 00:01:04.680 --> 00:01:07.680 let's look at something that seems pretty simple on the surface: 21 00:01:08.200 --> 00:01:12.000 the Fibonacci sequence. 
You know, one, one, two, three, five, right? 22 00:01:12.439 --> 00:01:15.200 But I've heard this sequence can actually teach us some 23 00:01:15.319 --> 00:01:21.200 really profound lessons about computational efficiency. What's the story there, 24 00:01:21.239 --> 00:01:22.280 what's the hidden lesson? 25 00:01:22.439 --> 00:01:26.280 Yeah, it's fascinating. It really highlights the stark difference between 26 00:01:26.400 --> 00:01:29.519 just jumping in with a straightforward, what we might call 27 00:01:29.560 --> 00:01:33.640 a naive recursive solution, and more optimized ways of thinking. 28 00:01:33.760 --> 00:01:35.760 Okay, so what happens with that naive way? 29 00:01:35.959 --> 00:01:41.280 Well, get this: calculating just the twentieth Fibonacci number using 30 00:01:41.319 --> 00:01:46.200 that simple recursive function ends up making twenty-one thousand eight hundred ninety-one 31 00:01:46.239 --> 00:01:49.400 separate function calls. Twenty-one thousand for the twentieth number? 32 00:01:49.439 --> 00:01:50.840 That seems excessive. 33 00:01:50.879 --> 00:01:54.239 It's incredibly inefficient, yeah, because it just keeps recalculating the 34 00:01:54.280 --> 00:01:56.000 same values over and over and over 35 00:01:55.920 --> 00:01:58.439 again. Right, recalculating things it already figured out. But there 36 00:01:58.439 --> 00:01:59.920 are better ways. You mentioned optimization. 37 00:02:00.079 --> 00:02:03.319 Absolutely, there's a technique called memoization. The term was actually coined 38 00:02:03.359 --> 00:02:04.959 by computer scientist Donald Michie. 39 00:02:04.959 --> 00:02:07.200 Memoization, like writing a memo to 40 00:02:07.159 --> 00:02:10.840 yourself? Kind of, yeah. Think of it like a human memorization machine. 41 00:02:11.199 --> 00:02:15.080 The function remembers the results it has already calculated. So with memoization, 42 00:02:15.879 --> 00:02:19.719 that same twentieth number, it only takes thirty-nine calls. 
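To make the call counts in this exchange concrete, here is a minimal sketch that counts how many times each version of the function is invoked. The function names and the choice of base cases (fib(0) = 0, fib(1) = 1) are assumptions made for illustration; the book's own code may differ in detail.

```python
from typing import Dict, Tuple


def naive_fib_with_count(n: int) -> Tuple[int, int]:
    """Return (fib(n), number of calls) for the plain recursive version."""
    calls = 0

    def fib(k: int) -> int:
        nonlocal calls
        calls += 1
        if k < 2:  # assumed base cases: fib(0) = 0, fib(1) = 1
            return k
        return fib(k - 1) + fib(k - 2)

    return fib(n), calls


def memo_fib_with_count(n: int) -> Tuple[int, int]:
    """Return (fib(n), number of calls) for the memoized version."""
    calls = 0
    memo: Dict[int, int] = {0: 0, 1: 1}  # results we already know

    def fib(k: int) -> int:
        nonlocal calls
        calls += 1
        if k not in memo:  # only compute a value the first time it is needed
            memo[k] = fib(k - 1) + fib(k - 2)
        return memo[k]

    return fib(n), calls


if __name__ == "__main__":
    print(naive_fib_with_count(20))  # value and call count, naive recursion
    print(memo_fib_with_count(20))   # value and call count, with memoization
```

Under these assumptions the naive version makes 21,891 calls for the twentieth number and the memoized version makes 39, which matches the figures quoted in the conversation.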
43 00:02:19.599 --> 00:02:22.759 Thirty-nine, down from almost twenty-two thousand. That's huge. 44 00:02:22.879 --> 00:02:26.560 It's massive. And with this method you can actually calculate, say, 45 00:02:26.919 --> 00:02:29.360 fib fifty without the whole thing grinding to a halt. 46 00:02:29.800 --> 00:02:31.479 The naive version just wouldn't cope. 47 00:02:31.879 --> 00:02:34.639 Okay, so memoization is a big step up. Is that the 48 00:02:34.680 --> 00:02:35.319 best we can do? 49 00:02:35.520 --> 00:02:37.759 Well, there's also what you might call an old-fashioned 50 00:02:37.800 --> 00:02:41.199 iterative approach, just using a loop. Yeah, that approach runs 51 00:02:41.240 --> 00:02:44.479 its main loop at most n minus one times. So for 52 00:02:44.520 --> 00:02:47.759 the twentieth number, that's like nineteen loops, even fewer operations. 53 00:02:47.800 --> 00:02:50.000 Wow, okay, and this really matters. 54 00:02:50.159 --> 00:02:52.319 You know, this kind of difference can make a serious 55 00:02:52.360 --> 00:02:55.120 difference in a real-world application. It's not just theory. 56 00:02:55.240 --> 00:02:57.159 It's about thinking efficiently before 57 00:02:56.960 --> 00:02:59.879 you code. Right, it's the difference between something working instantly 58 00:03:00.080 --> 00:03:04.479 and something maybe never finishing. So, okay, that's speed. 59 00:03:05.039 --> 00:03:08.000 But efficiency isn't just about speed, is it? Saving space, 60 00:03:08.159 --> 00:03:11.960 virtual storage, memory, that's also huge. How do these fundamental 61 00:03:12.000 --> 00:03:13.000 ideas help with that? 62 00:03:13.319 --> 00:03:18.400 Oh, definitely. Yeah, saving space often directly translates to saving money, right, 63 00:03:18.919 --> 00:03:22.319 virtual or real. Makes sense. So think about DNA, the 64 00:03:22.400 --> 00:03:27.159 nucleotides A, C, G, or T. That's it, just four possibilities. 
If 65 00:03:27.199 --> 00:03:29.719 you store a DNA sequence as a normal text string, 66 00:03:30.360 --> 00:03:33.439 each letter, each character usually takes up eight bits. 67 00:03:33.599 --> 00:03:35.479 Okay, standard text storage. 68 00:03:35.800 --> 00:03:38.439 But wait, since there are only four possible values, you 69 00:03:38.479 --> 00:03:41.479 don't actually need eight bits. You only need two bits 70 00:03:41.479 --> 00:03:43.719 per nucleotide. You could say, you know, A is zero zero, 71 00:03:43.919 --> 00:03:46.919 C is zero one, G is one zero, T is one one. 72 00:03:47.360 --> 00:03:49.759 I see, so you're mapping the four options onto just 73 00:03:49.800 --> 00:03:50.759 two bits. That's clever. 74 00:03:50.960 --> 00:03:54.680 It's simple but effective. That bit-string representation can 75 00:03:54.719 --> 00:03:58.039 slash the storage needed for DNA data by seventy-five percent, 76 00:03:58.479 --> 00:04:00.520 from eight bits down to just two bits per 77 00:04:00.439 --> 00:04:03.960 nucleotide. Seventy-five percent. That's enormous when you think about 78 00:04:03.960 --> 00:04:05.800 the scale of genomic data. Exactly. 79 00:04:05.840 --> 00:04:08.360 We saw an example, I think, where an original string 80 00:04:08.439 --> 00:04:11.479 was maybe eight thousand six hundred and forty-nine bytes. Compressed 81 00:04:11.479 --> 00:04:13.360 this way, it dropped to just two thousand three hundred and 82 00:04:13.439 --> 00:04:14.000 twenty bytes. 83 00:04:14.039 --> 00:04:16.439 That's a massive saving. Now, this brings up something interesting 84 00:04:16.439 --> 00:04:19.079 about choosing how to implement things in code, doesn't it? 85 00:04:19.079 --> 00:04:22.800 It does. Sometimes using a dictionary or a hash map 86 00:04:22.879 --> 00:04:26.519 for lookups, which theoretically should be super fast, like O of one, 87 00:04:26.959 --> 00:04:27.839 or constant time. 88 00:04:27.959 --> 00:04:30.079 Right, it's supposed to be instant regardless of size. 
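The two-bit nucleotide encoding described a moment ago can be sketched as follows. This is an illustrative version, not the book's implementation: the names `compress` and `decompress` and the leading sentinel bit are assumptions chosen so the example round-trips cleanly using a plain Python integer as the bit string.

```python
# Two bits per nucleotide, as described in the conversation.
NUCLEOTIDE_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_NUCLEOTIDE = {bits: letter for letter, bits in NUCLEOTIDE_BITS.items()}


def compress(gene: str) -> int:
    """Pack a DNA string into one integer, two bits per letter."""
    bit_string = 1  # sentinel bit so leading 'A's (00) are not lost
    for nucleotide in gene.upper():
        bit_string = (bit_string << 2) | NUCLEOTIDE_BITS[nucleotide]
    return bit_string


def decompress(bit_string: int) -> str:
    """Unpack an integer produced by compress() back into a DNA string."""
    letters = []
    while bit_string > 1:  # stop when only the sentinel is left
        letters.append(BITS_NUCLEOTIDE[bit_string & 0b11])
        bit_string >>= 2
    return "".join(reversed(letters))


if __name__ == "__main__":
    gene = "ACGTTGCA"
    packed = compress(gene)
    print(packed.bit_length(), "bits instead of", 8 * len(gene))
    print(decompress(packed) == gene)
```

The 75 percent figure is simply 2 bits versus 8 bits per letter; the measured byte counts quoted in the conversation also include whatever overhead the language's objects carry.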
89 00:04:30.160 --> 00:04:34.360 Yeah, but in practice it can sometimes be less performant 90 00:04:34.399 --> 00:04:38.319 than a series of ifs. Really? Why is that? Because 91 00:04:38.360 --> 00:04:41.079 of the hidden cost of running the hash function itself. 92 00:04:41.879 --> 00:04:43.959 The calculation to figure out where to look in the 93 00:04:43.959 --> 00:04:47.079 dictionary takes a little bit of time. So in really 94 00:04:47.160 --> 00:04:50.279 performance-critical code you might actually need to run performance 95 00:04:50.319 --> 00:04:53.480 tests to see what's really faster in your specific case. 96 00:04:53.800 --> 00:04:54.839 Theory doesn't always win. 97 00:04:55.040 --> 00:04:58.639 That's a great point. Real-world testing matters. And just quickly, 98 00:04:58.720 --> 00:05:01.639 another example of low-level stuff is the XOR operation, right, 99 00:05:01.839 --> 00:05:02.600 used in 100 00:05:02.600 --> 00:05:07.040 Absolutely. XOR is fundamental for certain types of unbreakable encryption. 101 00:05:07.360 --> 00:05:10.879 Another example of low-level bit manipulation having powerful applications. 102 00:05:10.959 --> 00:05:14.120 Okay, so we've seen efficiency in speed, efficiency in space 103 00:05:14.240 --> 00:05:16.920 down to the bits. What about problems that are more 104 00:05:16.959 --> 00:05:20.639 about logic, rules, constraints? 105 00:05:21.000 --> 00:05:24.319 Ah, yeah, that brings us to a really elegant framework: 106 00:05:24.480 --> 00:05:26.800 constraint-satisfaction problems, or CSPs. 107 00:05:27.079 --> 00:05:28.759 CSPs? What are they? 108 00:05:28.800 --> 00:05:31.240 Basically, they're a way to solve problems you can 109 00:05:31.279 --> 00:05:35.800 define abstractly by variables of limited domains that have constraints 110 00:05:35.800 --> 00:05:36.240 between them. 111 00:05:36.240 --> 00:05:39.240 Okay, break that down. Variables, domains, constraints. 
112 00:05:39.360 --> 00:05:41.319 So variables are the things you need to decide on. 113 00:05:41.680 --> 00:05:45.040 Domains are the possible choices or values for each variable, 114 00:05:45.639 --> 00:05:48.600 and constraints are the rules that say which combinations of 115 00:05:48.759 --> 00:05:49.720 values are allowed. 116 00:05:49.879 --> 00:05:51.399 Gotcha, can you give an example? 117 00:05:51.560 --> 00:05:55.240 Sure. Think about the classic Australian map-coloring problem. The 118 00:05:55.279 --> 00:05:58.279 regions or states are the variables. The possible colors are 119 00:05:58.319 --> 00:06:01.560 the domain, say red, green, blue, and the 120 00:06:01.600 --> 00:06:04.439 constraint is simple: no two neighboring regions can have the 121 00:06:04.439 --> 00:06:08.000 same color. Another famous one is the eight queens problem, 122 00:06:08.560 --> 00:06:10.800 placing eight chess queens on a board so none can 123 00:06:10.839 --> 00:06:11.519 attack each other. 124 00:06:11.720 --> 00:06:16.399 Ah, okay. So the queens are variables, their positions are domains. 125 00:06:16.079 --> 00:06:19.639 Exactly, and the constraint is the no-attacking rule. The 126 00:06:19.680 --> 00:06:23.040 core of solving these often involves a function, maybe called consistent, 127 00:06:23.199 --> 00:06:27.519 that just checks: does putting this value here violate any rules? 128 00:06:27.800 --> 00:06:31.240 Okay, cool puzzles. But where do CSPs show up in 129 00:06:31.279 --> 00:06:34.399 the real world, outside of map coloring and chess? 130 00:06:34.519 --> 00:06:38.160 They're actually commonly used in scheduling. Think about scheduling staff 131 00:06:38.160 --> 00:06:41.839 in a hospital. People are variables, time slots are domains, and 132 00:06:41.920 --> 00:06:44.759 constraints are things like nurse A needs two days off 133 00:06:44.920 --> 00:06:47.319 or doctor B must be on call with surgeon C. 
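The variables, domains, and constraints vocabulary above maps directly onto a small backtracking search. The sketch below is an assumption-laden illustration rather than the book's framework: the adjacency table (with Tasmania treated as having no land neighbors), the `consistent` helper, and the `backtrack` function are all written here for the example.

```python
from typing import Dict, List, Optional

# Variables: the regions. Constraint data: which regions share a border.
# (Assumption for this sketch: Tasmania has no neighbors.)
NEIGHBORS: Dict[str, List[str]] = {
    "Western Australia": ["Northern Territory", "South Australia"],
    "Northern Territory": ["Western Australia", "South Australia", "Queensland"],
    "South Australia": ["Western Australia", "Northern Territory", "Queensland",
                        "New South Wales", "Victoria"],
    "Queensland": ["Northern Territory", "South Australia", "New South Wales"],
    "New South Wales": ["Queensland", "South Australia", "Victoria"],
    "Victoria": ["South Australia", "New South Wales"],
    "Tasmania": [],
}
COLORS = ["red", "green", "blue"]  # the domain of every variable


def consistent(region: str, color: str, assignment: Dict[str, str]) -> bool:
    """True if giving `region` this color clashes with no colored neighbor."""
    return all(assignment.get(n) != color for n in NEIGHBORS[region])


def backtrack(assignment: Dict[str, str]) -> Optional[Dict[str, str]]:
    """Assign colors one region at a time, undoing choices that dead-end."""
    if len(assignment) == len(NEIGHBORS):
        return assignment  # every variable has a value
    region = next(r for r in NEIGHBORS if r not in assignment)
    for color in COLORS:
        if consistent(region, color, assignment):
            result = backtrack({**assignment, region: color})
            if result is not None:
                return result
    return None  # no color works here, so the caller must try something else


if __name__ == "__main__":
    print(backtrack({}))
```

The same skeleton works for eight queens or a staff schedule: only the variables, domains, and the `consistent` check change.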
134 00:06:47.519 --> 00:06:50.000 Right, that makes sense. Complex scheduling. Anywhere else? 135 00:06:50.160 --> 00:06:54.279 Yeah. In motion planning for robotics, imagine guiding a robot 136 00:06:54.439 --> 00:06:57.199 arm to fit inside a narrow tube without hitting the sides. 137 00:06:57.600 --> 00:07:00.480 The tube walls are the constraints. The angles of the 138 00:07:00.560 --> 00:07:04.079 robot's joints are the variables, the possible movements are the domains. 139 00:07:04.639 --> 00:07:07.560 CSPs help figure out a valid sequence of moves. 140 00:07:07.600 --> 00:07:12.399 Fascinating. Okay, so from logic puzzles to navigating spaces, finding 141 00:07:12.399 --> 00:07:14.439 a path from A to B. That's a massive area 142 00:07:14.480 --> 00:07:18.120 in computer science, right, from simple lists to complex mazes? 143 00:07:18.199 --> 00:07:21.560 Oh, huge. And we have different tools depending on the situation. 144 00:07:21.879 --> 00:07:24.160 For basic lists, if the data isn't sorted, you might 145 00:07:24.240 --> 00:07:26.839 just use a linear search: look at everything one by one. 146 00:07:27.000 --> 00:07:30.360 Like scanning a whole bookshelf for one specific book. Exactly. 147 00:07:30.720 --> 00:07:32.639 But if the data is sorted, you can use the 148 00:07:32.680 --> 00:07:34.160 much faster binary search. 149 00:07:34.519 --> 00:07:36.839 That's like opening a phone book right to the middle, 150 00:07:37.040 --> 00:07:38.800 then the middle of the remaining half, right? 151 00:07:38.839 --> 00:07:42.639 Much quicker. Precisely. But what about something more complex, like 152 00:07:42.639 --> 00:07:46.040 a maze? How does a computer navigate that? What does 153 00:07:46.040 --> 00:07:48.839 this all mean for finding your way through complex spaces? 154 00:07:49.000 --> 00:07:49.959 Yeah, how does that work? 155 00:07:50.120 --> 00:07:54.360 Well, 
two fundamental approaches are depth-first search, or DFS, 156 00:07:54.519 --> 00:07:56.720 and breadth-first search, or BFS. 157 00:07:56.839 --> 00:07:58.240 DFS and BFS. 158 00:07:58.319 --> 00:08:01.480 Okay. DFS uses a data structure called a stack. 159 00:08:01.519 --> 00:08:03.519 Think last in, first out, like a stack of plates. 160 00:08:03.600 --> 00:08:06.160 Got it. Last one on, first one off, right? 161 00:08:06.240 --> 00:08:09.519 So DFS goes deep down one path, explores as far 162 00:08:09.560 --> 00:08:11.839 as it can. If it hits a dead end, it 163 00:08:11.920 --> 00:08:13.279 backtracks and tries another 164 00:08:13.079 --> 00:08:15.399 branch. So it dives deep first. 165 00:08:15.560 --> 00:08:19.160 Yeah. The paths it finds can sometimes seem unnatural, though, 166 00:08:19.279 --> 00:08:20.959 and they're usually not the shortest paths. 167 00:08:21.199 --> 00:08:23.639 It just finds a path. Okay, so not necessarily the 168 00:08:23.680 --> 00:08:26.160 best path, just one that works. What about BFS? 169 00:08:26.279 --> 00:08:29.000 BFS uses a queue. Think first in, first out, like 170 00:08:29.040 --> 00:08:29.680 waiting in line. 171 00:08:29.800 --> 00:08:30.439 Okay, FIFO. 172 00:08:30.600 --> 00:08:33.600 So BFS explores outwards from the start, layer by layer. 173 00:08:33.799 --> 00:08:36.600 It checks all neighbors one step away, then all neighbors 174 00:08:36.639 --> 00:08:38.000 two steps away, and so on. 175 00:08:38.039 --> 00:08:40.519 Ah, spreading out evenly. Exactly. 176 00:08:41.000 --> 00:08:44.759 And because of that systematic exploration, BFS always finds the 177 00:08:44.799 --> 00:08:47.840 shortest path in terms of the number of steps or edges, 178 00:08:48.320 --> 00:08:50.480 assuming the cost of each step is the same. 179 00:08:50.759 --> 00:08:54.159 Okay, so BFS guarantees the shortest path. 
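The stack-versus-queue distinction just described is small enough to show in one function. This is a minimal sketch on a made-up three-by-three maze; the maze layout, the `search` function, and its `breadth_first` flag are assumptions for illustration, and the only thing that changes between DFS and BFS is which end of the frontier is popped.

```python
from collections import deque
from typing import Dict, Iterator, List, Optional, Tuple

Cell = Tuple[int, int]

# Hypothetical maze: S = start, G = goal, # = wall, . = open floor.
MAZE = ["S..",
        ".#.",
        "..G"]


def find(symbol: str) -> Cell:
    """Locate a symbol in the maze and return its (row, column)."""
    for r, row in enumerate(MAZE):
        c = row.find(symbol)
        if c != -1:
            return (r, c)
    raise ValueError(f"{symbol!r} not in maze")


def neighbors(cell: Cell) -> Iterator[Cell]:
    """Yield the open cells directly above, below, left, and right."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] != "#":
            yield (nr, nc)


def search(start: Cell, goal: Cell, breadth_first: bool) -> Optional[List[Cell]]:
    """BFS pops the oldest frontier item (queue); DFS pops the newest (stack)."""
    frontier = deque([start])
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    while frontier:
        current = frontier.popleft() if breadth_first else frontier.pop()
        if current == goal:
            path: List[Cell] = []
            node: Optional[Cell] = current
            while node is not None:  # walk back to the start
                path.append(node)
                node = came_from[node]
            return path[::-1]
        for nxt in neighbors(current):
            if nxt not in came_from:
                came_from[nxt] = current
                frontier.append(nxt)
    return None


if __name__ == "__main__":
    start, goal = find("S"), find("G")
    print("BFS:", search(start, goal, breadth_first=True))
    print("DFS:", search(start, goal, breadth_first=False))
```

On larger mazes the DFS path is often longer and more winding than the BFS path, which is the trade-off the speakers go on to discuss.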
DFS might be 180 00:08:54.279 --> 00:08:57.679 faster to find any path, but not the shortest. Sounds 181 00:08:57.679 --> 00:08:58.600 like a trade-off. 182 00:08:58.519 --> 00:09:01.799 It often is. Yeah, choosing between them is sometimes a 183 00:09:01.840 --> 00:09:04.519 trade-off between the possibility of finding a solution quickly 184 00:09:04.840 --> 00:09:06.840 and the certainty of finding the shortest path. 185 00:09:06.919 --> 00:09:08.840 Makes sense. Are there more advanced methods? 186 00:09:08.960 --> 00:09:11.320 Oh, yes, then you get into things like A* search. 187 00:09:12.200 --> 00:09:15.799 A* is more sophisticated. It uses a priority queue and 188 00:09:15.840 --> 00:09:16.960 something called a heuristic. 189 00:09:17.200 --> 00:09:19.759 Okay, priority queue and a heuristic. What does that mean? 190 00:09:19.840 --> 00:09:23.240 A priority queue is like a queue where items have 191 00:09:23.279 --> 00:09:25.919 different levels of importance, so the most promising options get 192 00:09:25.919 --> 00:09:29.519 looked at first, and a heuristic is basically an educated 193 00:09:29.600 --> 00:09:32.279 guess or a rule of thumb to estimate how close 194 00:09:32.320 --> 00:09:32.960 you are to the goal. 195 00:09:33.159 --> 00:09:34.840 An estimate. How does A* use that? 196 00:09:35.039 --> 00:09:38.159 It combines the actual cost to get to a certain point, 197 00:09:38.200 --> 00:09:41.120 we call it g(n), with the estimated cost from that 198 00:09:41.159 --> 00:09:45.960 point to the finish line. That's h(n), the heuristic. It 199 00:09:46.039 --> 00:09:49.559 prioritizes exploring paths that seem to have the lowest total cost, 200 00:09:49.919 --> 00:09:51.480 both actual and estimated. So 201 00:09:51.399 --> 00:09:54.559 it's using an estimate to guide its search more intelligently. Exactly. 
202 00:09:54.840 --> 00:09:58.559 For mazes, a common heuristic is the Manhattan distance, just 203 00:09:58.600 --> 00:10:02.960 the horizontal plus vertical distance to the goal, ignoring walls 204 00:10:02.960 --> 00:10:06.080 for the estimate. Like distance in city blocks. Okay, and 205 00:10:06.480 --> 00:10:09.480 using a good heuristic like that, A* not only finds 206 00:10:09.519 --> 00:10:13.360 the optimal shortest path, but it far outperforms BFS because 207 00:10:13.360 --> 00:10:16.480 it explores way fewer dead ends. It's much more focused. 208 00:10:16.600 --> 00:10:19.159 That sounds really powerful. And the nice thing is if 209 00:10:19.159 --> 00:10:22.440 you write these search algorithms well, generically, 210 00:10:21.960 --> 00:10:25.159 right, they become incredibly versatile. These same search functions can 211 00:10:25.200 --> 00:10:28.080 be easily adapted for solving a diverse set of problems. 212 00:10:28.440 --> 00:10:31.639 Like that old brain teaser, the missionaries and cannibals problem, 213 00:10:31.840 --> 00:10:33.919 is just another state space to search. 214 00:10:33.960 --> 00:10:37.720 Right. Okay, moving beyond mazes and paths, let's talk about graphs. 215 00:10:37.759 --> 00:10:40.480 You mentioned graphs earlier with maps. They seem really fundamental. 216 00:10:40.720 --> 00:10:44.720 They are. The world of graph algorithms is surprisingly broad 217 00:10:44.799 --> 00:10:47.679 in its applicability; so much can be modeled as 218 00:10:47.559 --> 00:10:51.360 a graph. Like the map example: cities are vertices, connections 219 00:10:51.440 --> 00:10:52.279 are edges. 220 00:10:52.200 --> 00:10:55.440 Exactly, and graphs can be undirected, like a two-way 221 00:10:55.519 --> 00:10:57.799 road, or directed, like a one-way street, and the 222 00:10:57.919 --> 00:11:01.320 edges can be unweighted, just showing a connection exists, or 223 00:11:01.360 --> 00:11:04.519 weighted, showing a cost like distance or travel time. 
224 00:11:04.759 --> 00:11:07.559 So if we want the shortest path in an unweighted graph, 225 00:11:08.039 --> 00:11:10.360 the path with the fewest edges, we can just use 226 00:11:10.399 --> 00:11:11.600 BFS again, right? 227 00:11:11.519 --> 00:11:14.759 Yeah, BFS works perfectly for that. Like, if you were 228 00:11:14.759 --> 00:11:18.639 planning a hypothetical hyperloop network, BFS could find the route 229 00:11:18.679 --> 00:11:22.519 from, say, Boston to Miami with the minimum number of stops. 230 00:11:22.559 --> 00:11:27.080 Maybe it's Boston, Detroit, Washington, Miami: just three segments, three edges. 231 00:11:27.240 --> 00:11:29.759 Okay, but what if the goal isn't just one path, 232 00:11:29.840 --> 00:11:33.799 but connecting everything efficiently, like connecting all the major US 233 00:11:33.840 --> 00:11:35.919 cities with the minimum amount of track? 234 00:11:36.320 --> 00:11:40.440 Now you're talking about a minimum spanning tree, or MST, problem. 235 00:11:40.519 --> 00:11:42.759 The goal is to minimize the cost of building the 236 00:11:42.759 --> 00:11:44.919 network while ensuring everything is connected. 237 00:11:45.000 --> 00:11:45.799 How do you solve that? 238 00:11:45.960 --> 00:11:49.840 A common way is Jarník's algorithm, sometimes called Prim's algorithm. 239 00:11:50.240 --> 00:11:53.399 It basically starts somewhere and greedily adds the cheapest edge 240 00:11:53.399 --> 00:11:55.960 that connects a new city to the growing network without 241 00:11:55.960 --> 00:11:56.720 creating a cycle. 242 00:11:56.879 --> 00:11:59.840 So it builds a network piece by piece, always picking 243 00:11:59.840 --> 00:12:02.519 the cheapest next connection. Pretty much. 
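The greedy "cheapest edge to a new city" idea can be sketched with a priority queue. The graph below is a toy example with invented weights, not real distances, and the function name `minimum_spanning_tree` is an assumption for this illustration.

```python
import heapq
from typing import Dict, List, Tuple

# Hypothetical undirected weighted graph: vertex -> {neighbor: edge weight}.
GRAPH: Dict[str, Dict[str, int]] = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}


def minimum_spanning_tree(graph: Dict[str, Dict[str, int]],
                          start: str) -> List[Tuple[str, str, int]]:
    """Jarník/Prim: repeatedly add the cheapest edge that reaches a new vertex."""
    visited = {start}
    # Candidate edges leaving the visited set, cheapest first.
    candidates = [(w, start, v) for v, w in graph[start].items()]
    heapq.heapify(candidates)
    tree: List[Tuple[str, str, int]] = []
    while candidates and len(visited) < len(graph):
        weight, u, v = heapq.heappop(candidates)
        if v in visited:
            continue  # this edge would create a cycle, so skip it
        visited.add(v)
        tree.append((u, v, weight))
        for nxt, w in graph[v].items():
            if nxt not in visited:
                heapq.heappush(candidates, (w, v, nxt))
    return tree


if __name__ == "__main__":
    tree = minimum_spanning_tree(GRAPH, "A")
    print(tree, "total weight:", sum(w for _, _, w in tree))
```

Skipping any edge whose far end is already visited is what keeps cycles out of the growing tree.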
244 00:12:02.799 --> 00:12:05.799 We saw a calculation using this for the fifteen largest 245 00:12:05.879 --> 00:12:11.360 US metropolitan statistical areas, MSAs. The minimum length of track 246 00:12:11.480 --> 00:12:13.120 needed to connect all of them was found to be 247 00:12:13.440 --> 00:12:17.279 five thousand three hundred and seventy-two miles. Crucial for planning 248 00:12:17.279 --> 00:12:19.360 infrastructure like railways or pipelines. 249 00:12:19.519 --> 00:12:22.919 Wow, and what about finding shortest paths in weighted graphs, 250 00:12:23.159 --> 00:12:27.120 where edges have different costs, like actual road distances? BFS 251 00:12:27.159 --> 00:12:28.360 won't work then, right? Right. 252 00:12:28.399 --> 00:12:31.519 BFS only cares about the number of edges. For weighted graphs, 253 00:12:31.519 --> 00:12:33.279 you need something like Dijkstra's algorithm. 254 00:12:33.360 --> 00:12:34.720 Dijkstra's, I've heard of that one. 255 00:12:34.759 --> 00:12:37.000 It's designed to find the lowest-cost path from a 256 00:12:37.039 --> 00:12:39.559 starting point to all other points in a weighted graph. 257 00:12:39.879 --> 00:12:41.919 It keeps track of the cheapest known way to reach 258 00:12:41.960 --> 00:12:46.120 each city and systematically explores outwards, always updating paths if 259 00:12:46.159 --> 00:12:48.799 it finds a cheaper route. Very important for things like 260 00:12:48.879 --> 00:12:49.879 GPS navigation. 261 00:12:50.240 --> 00:12:53.480 Absolutely, it seems like graphs are everywhere once you start looking. 262 00:12:53.240 --> 00:12:56.120 They really are. A huge amount of our world can 263 00:12:56.159 --> 00:13:00.519 be represented using graphs, and these algorithms are essential for 264 00:13:00.559 --> 00:13:05.120 efficiency in the telecommunications, shipping, transportation, and utility industries. 265 00:13:05.519 --> 00:13:06.879 Can you give some big examples? 
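The description of Dijkstra's algorithm above (track the cheapest known cost to each city, explore outward, update when a cheaper route appears) can be sketched like this. The graph and its weights are invented for the example and the function name `dijkstra` is an assumption; this version returns only the costs, not the routes themselves.

```python
import heapq
from typing import Dict

# Hypothetical undirected weighted graph: vertex -> {neighbor: edge weight}.
GRAPH: Dict[str, Dict[str, int]] = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 3},
    "D": {"B": 5, "C": 3},
}


def dijkstra(graph: Dict[str, Dict[str, int]], start: str) -> Dict[str, int]:
    """Return the lowest total cost from `start` to every reachable vertex."""
    best = {start: 0}            # cheapest known cost to each vertex so far
    frontier = [(0, start)]      # priority queue ordered by cost so far
    while frontier:
        cost, u = heapq.heappop(frontier)
        if cost > best.get(u, float("inf")):
            continue  # stale entry: a cheaper route to u was already found
        for v, weight in graph[u].items():
            new_cost = cost + weight
            if new_cost < best.get(v, float("inf")):
                best[v] = new_cost  # found a cheaper route, so update it
                heapq.heappush(frontier, (new_cost, v))
    return best


if __name__ == "__main__":
    print(dijkstra(GRAPH, "A"))
```

In this toy graph the direct edge from A to C costs 4, but going through B costs only 3, which is exactly the kind of update the speakers describe. Weights must be non-negative for the approach to be valid.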
266 00:13:06.879 --> 00:13:10.000 Sure, think about Walmart building out an efficient distribution network, 267 00:13:10.120 --> 00:13:14.120 linking warehouses and stores. That's a graph problem. Or Google 268 00:13:14.360 --> 00:13:17.840 indexing the web. The entire Internet is a gigantic graph 269 00:13:17.919 --> 00:13:18.840 of linked pages. 270 00:13:19.159 --> 00:13:20.919 Right, links are edges. 271 00:13:20.559 --> 00:13:23.879 Exactly. Or FedEx finding the right set of hubs to 272 00:13:23.960 --> 00:13:27.720 minimize delivery times and costs. It's all graph algorithms working 273 00:13:27.759 --> 00:13:28.480 behind the scenes. 274 00:13:28.639 --> 00:13:31.279 Okay, let's shift gears a bit now, away from these 275 00:13:31.320 --> 00:13:36.120 more deterministic algorithms towards things that feel more intelligent or adaptive, 276 00:13:36.639 --> 00:13:39.080 like genetic algorithms. You mentioned they're less predictable. 277 00:13:39.159 --> 00:13:43.000 Yeah, genetic algorithms are less deterministic than most traditional methods, 278 00:13:43.360 --> 00:13:46.799 but that unpredictability can be a strength. Sometimes they can 279 00:13:46.799 --> 00:13:50.039 solve problems that other approaches cannot solve in a reasonable 280 00:13:50.039 --> 00:13:53.799 amount of time, especially really complex optimization problems. 281 00:13:54.039 --> 00:13:56.639 How do they work? You mentioned they have a biological background. 282 00:13:56.759 --> 00:14:01.039 They do. They simulate natural selection, like evolution. You start 283 00:14:01.039 --> 00:14:03.919 with a population of potential solutions called chromosomes. 284 00:14:04.000 --> 00:14:06.600 Chromosomes, okay, like individuals. Right, and 285 00:14:06.559 --> 00:14:10.320 each chromosome has genes, which are like its specific traits 286 00:14:10.480 --> 00:14:14.480 or parts of the solution. These individuals then compete, in 287 00:14:14.519 --> 00:14:18.600 a sense, to solve the problem. 
Their success is measured 288 00:14:18.639 --> 00:14:21.879 by a fitness function, how well they do 289 00:14:21.799 --> 00:14:24.559 the job. Survival of the fittest, basically. Exactly. 290 00:14:24.840 --> 00:14:28.440 The process involves creating an initial population, often randomly. Then 291 00:14:28.480 --> 00:14:32.240 you measure everyone's fitness. Then you select individuals for reproduction. 292 00:14:32.279 --> 00:14:35.840 Fitter ones are more likely to be chosen. Common methods 293 00:14:35.840 --> 00:14:38.679 are roulette-wheel selection or tournament selection. 294 00:14:38.759 --> 00:14:39.759 Okay, select the best. 295 00:14:40.519 --> 00:14:44.000 Then what? Then comes crossover. You take two parent solutions 296 00:14:44.039 --> 00:14:46.039 and combine parts of them to create one or two 297 00:14:46.120 --> 00:14:49.559 child solutions, hoping to mix good traits. Mixing genes. Right. 298 00:14:50.000 --> 00:14:54.159 And finally, there's mutation, making small random changes to some individuals. 299 00:14:54.639 --> 00:14:58.080 This helps maintain diversity in the population and prevents getting 300 00:14:58.120 --> 00:15:02.039 stuck too early on a suboptimal solution. You repeat 301 00:15:02.080 --> 00:15:07.080 this cycle, fitness, selection, crossover, mutation, over many generations. 302 00:15:07.159 --> 00:15:10.440 That's a really cool analogy. Can they solve actual problems? 303 00:15:10.600 --> 00:15:13.279 Oh, yeah, they can tackle things like that SEND plus 304 00:15:13.279 --> 00:15:16.440 MORE equals MONEY cryptarithmetic puzzle we mentioned earlier, which is also 305 00:15:16.480 --> 00:15:19.320 a CSP. You can represent the letters and digits and 306 00:15:19.399 --> 00:15:22.080 let the genetic algorithm evolve toward a valid assignment. 307 00:15:22.360 --> 00:15:26.000 Huh. Solving a logic puzzle through simulated evolution. What else? 308 00:15:26.200 --> 00:15:30.480 Something maybe more surprising: optimizing list compression. 
It turns 309 00:15:30.480 --> 00:15:32.919 out that for many compression algorithms, the order of the 310 00:15:32.919 --> 00:15:34.799 items will affect the compression ratio. 311 00:15:34.919 --> 00:15:37.120 The order matters? Really? For some algorithms, 312 00:15:37.200 --> 00:15:39.879 yes. So for a list of, say, twelve names, maybe 313 00:15:39.919 --> 00:15:42.039 standard compression gets it down to one hundred and sixty- 314 00:15:42.039 --> 00:15:45.200 five bytes. A genetic algorithm could try rearranging the order 315 00:15:45.240 --> 00:15:47.639 of those names and find an order that compresses down 316 00:15:47.639 --> 00:15:49.000 to one hundred and fifty-nine bytes. 317 00:15:49.120 --> 00:15:52.279 Okay, from one sixty-five to one fifty-nine seems like a small gain. 318 00:15:52.559 --> 00:15:55.879 It might seem small for twelve names, but imagine optimizing 319 00:15:55.919 --> 00:15:59.360 the order for millions of items. Now, could you find 320 00:15:59.360 --> 00:16:02.840 the absolute best order for those twelve names by checking 321 00:16:03.279 --> 00:16:05.559 every possibility? Twelve names? 322 00:16:05.559 --> 00:16:08.720 That's twelve factorial permutations, right? That's huge. 323 00:16:08.799 --> 00:16:12.039 It's four hundred and seventy-nine million, one thousand, six 324 00:16:12.159 --> 00:16:16.919 hundred possible orders. Absolutely unfeasible to check them all. Brute 325 00:16:16.919 --> 00:16:20.399 force is impossible, right? No way. And that's where genetic 326 00:16:20.440 --> 00:16:24.440 algorithms shine. They don't guarantee the absolute, perfect optimal solution, 327 00:16:25.279 --> 00:16:28.799 but they're great at finding very good, near-optimal solutions 328 00:16:28.799 --> 00:16:32.840 for problems where finding the true optimum is computationally impossible 329 00:16:33.159 --> 00:16:35.600 or would take, like, the age of the universe. 
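The list-ordering example can be sketched as a small genetic algorithm. Everything here is an assumption made for illustration: the twelve names are invented, `zlib` stands in for whatever compressor the book used, and the selection, crossover, and mutation choices (tournament of three, a simple order-preserving crossover, a 20 percent swap mutation, and keeping the best individual each generation) are just one reasonable configuration. The byte counts it prints will not match the 165 and 159 quoted above.

```python
import math
import random
import zlib
from typing import List

# Hypothetical list of twelve distinct names (not the book's data).
NAMES = ["Ada", "Alan", "Barbara", "Claude", "Donald", "Edsger",
         "Grace", "John", "Ken", "Linus", "Margaret", "Tim"]


def compressed_size(order: List[str]) -> int:
    """Fitness: bytes needed to store this ordering (smaller is better)."""
    return len(zlib.compress(" ".join(order).encode("utf-8")))


def evolve(names: List[str], generations: int = 50,
           pop_size: int = 30, seed: int = 0) -> List[str]:
    rng = random.Random(seed)
    # Initial population: random orderings (chromosomes).
    population = [rng.sample(names, len(names)) for _ in range(pop_size)]
    best = min(population, key=compressed_size)
    for _ in range(generations):
        next_gen = [best]  # keep the best ordering found so far
        while len(next_gen) < pop_size:
            # Tournament selection: the fittest of three random individuals.
            p1 = min(rng.sample(population, 3), key=compressed_size)
            p2 = min(rng.sample(population, 3), key=compressed_size)
            # Crossover: head of one parent, remaining names in the other's order.
            cut = rng.randrange(1, len(names))
            head = p1[:cut]
            child = head + [n for n in p2 if n not in head]
            # Mutation: occasionally swap two positions.
            if rng.random() < 0.2:
                i, j = rng.sample(range(len(child)), 2)
                child[i], child[j] = child[j], child[i]
            next_gen.append(child)
        population = next_gen
        best = min(population, key=compressed_size)
    return best


if __name__ == "__main__":
    print("orderings to brute-force:", math.factorial(len(NAMES)))
    print("original order:", compressed_size(NAMES), "bytes")
    print("evolved order: ", compressed_size(evolve(NAMES)), "bytes")
```

The first printed line is the twelve factorial figure from the conversation, 479,001,600. As the speakers note, the result is a good ordering, with no guarantee that it is the best one.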
330 00:16:35.639 --> 00:16:39.039 So they're a practical way to tackle incredibly complex optimization. 331 00:16:39.200 --> 00:16:40.240 Are there downsides? 332 00:16:40.320 --> 00:16:43.000 Well, as we said, they're less deterministic. Run it twice, 333 00:16:43.039 --> 00:16:45.039 you might get slightly different results. They can also be 334 00:16:45.279 --> 00:16:47.559 something of a black box. It's not always clear why 335 00:16:47.600 --> 00:16:49.840 the solution they found works so well, and we don't 336 00:16:49.840 --> 00:16:52.240 really know if we found the optimal order, just a good 337 00:16:52.039 --> 00:16:55.000 one. Gotcha, but still useful. Any wild applications? 338 00:16:55.240 --> 00:16:59.159 People have used them for computer-generated art, evolving images 339 00:16:59.159 --> 00:17:03.399 made of polygons to resemble photographs, and even genetic programming, 340 00:17:03.720 --> 00:17:07.000 where the things evolving aren't just data but actual pieces 341 00:17:07.079 --> 00:17:10.160 of computer code, programs that write programs. 342 00:17:10.359 --> 00:17:14.559 Whoa, okay, that's mind-bending. All right, another area dealing 343 00:17:14.599 --> 00:17:19.000 with complexity: finding hidden structure in data. We have more 344 00:17:19.039 --> 00:17:20.720 data than ever. How do we make sense of it 345 00:17:20.720 --> 00:17:22.200 if we don't even know what we're looking for? 346 00:17:22.599 --> 00:17:25.519 That's where clustering comes in. It's a key technique when 347 00:17:25.559 --> 00:17:27.599 you want to learn about the structure of a data set, 348 00:17:28.160 --> 00:17:30.839 but you do not know ahead of time its constituent parts. 349 00:17:31.119 --> 00:17:34.039 Finding groups without knowing the groups beforehand. Exactly. 350 00:17:34.279 --> 00:17:36.880 K-means clustering is a very common type. It's a 351 00:17:36.920 --> 00:17:40.599 form of unsupervised learning. You don't train it with labeled examples. 
352 00:17:40.640 --> 00:17:43.039 It finds the inherent groupings in the data itself. 353 00:17:43.279 --> 00:17:45.960 Okay, k-means. How might that be used, 354 00:17:46.359 --> 00:17:46.599 say, 355 00:17:46.880 --> 00:17:47.400 in business? 356 00:17:47.519 --> 00:17:49.720 Imagine you run a grocery store. You have tons of 357 00:17:49.759 --> 00:17:52.400 data about who buys what, and when. You can use k- 358 00:17:52.559 --> 00:17:56.799 means to cluster customers based on demographics, purchase history, day 359 00:17:56.799 --> 00:17:57.440 of the week 360 00:17:57.240 --> 00:17:59.000 they shop. And what would that tell you? 361 00:17:58.599 --> 00:18:02.000 You might discover hidden patterns, like maybe there's a distinct 362 00:18:02.079 --> 00:18:06.000 cluster of younger shoppers who prefer to shop on Tuesdays. You 363 00:18:06.000 --> 00:18:08.680 didn't know that group existed, but the algorithm found 364 00:18:08.400 --> 00:18:09.960 it. And then you could use that insight. 365 00:18:10.400 --> 00:18:13.240 Right, you could run an ad specifically targeting them on 366 00:18:13.319 --> 00:18:16.480 Mondays or Tuesdays, making your marketing much more effective. 367 00:18:16.799 --> 00:18:20.440 Makes sense. How does the k-means algorithm actually work? 368 00:18:20.880 --> 00:18:21.680 Is it complex? 369 00:18:22.119 --> 00:18:27.079 The algorithm itself is surprisingly simple conceptually. First, you decide 370 00:18:27.079 --> 00:18:28.880 how many clusters you want to find; that's the k. 371 00:18:29.400 --> 00:18:32.319 Then you initialize k centroids, which are just the starting 372 00:18:32.319 --> 00:18:34.880 center points for each cluster, often placed randomly. 373 00:18:35.000 --> 00:18:37.359 Okay, pick k random starting points. 374 00:18:37.559 --> 00:18:41.680 Then you repeat two steps. Step one, assign clusters. 
You 375 00:18:41.680 --> 00:18:43.640 go through every single data point and assign it to 376 00:18:43.640 --> 00:18:45.440 the cluster whose centroid is closest. 377 00:18:45.640 --> 00:18:47.599 Assign each point to its nearest center. 378 00:18:47.759 --> 00:18:52.400 Step two, generate centroids. For each cluster, you calculate the 379 00:18:52.480 --> 00:18:55.799 mean, the average position of all the points currently assigned 380 00:18:55.839 --> 00:18:59.000 to it. That mean becomes the new centroid for that cluster. 381 00:18:59.160 --> 00:19:02.880 Okay, move the center to the average location of its points. 382 00:19:02.920 --> 00:19:05.759 Exactly, and you just keep repeating those two steps, assign points, 383 00:19:05.839 --> 00:19:09.119 update centroids, until the centroids stop moving much, or converge. 384 00:19:09.640 --> 00:19:11.559 The points naturally group around these centers. 385 00:19:11.680 --> 00:19:16.359 Huh, elegant. Where else is k-means used besides customer segmentation? 386 00:19:16.640 --> 00:19:21.359 Oh, lots of places. It helps with pattern recognition in biology, 387 00:19:21.880 --> 00:19:25.119 maybe identifying groups of incongruous cells that might signal a 388 00:19:25.160 --> 00:19:29.680 problem. Finding anomalies. Yeah. In image recognition, pixels themselves can 389 00:19:29.720 --> 00:19:32.720 be data points. You could cluster pixels by color to 390 00:19:32.759 --> 00:19:34.359 segment an image into regions. 391 00:19:34.559 --> 00:19:34.880 Okay. 392 00:19:35.079 --> 00:19:37.680 Even in political science, it's used to group voters based 393 00:19:37.680 --> 00:19:42.359 on survey responses or demographics to find voters to target 394 00:19:42.559 --> 00:19:46.079 and understand their underlying concerns without predefining the groups. 395 00:19:46.279 --> 00:19:49.839 Powerful stuff for finding structure in the unknown. 
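The two repeated steps just described (assign each point to its nearest centroid, then move each centroid to the mean of its points) fit in a short function. This is a bare-bones sketch using only the standard library; the function name `k_means`, the sample points, and the choice to start from k randomly picked data points are assumptions for illustration.

```python
import random
from statistics import mean
from typing import List, Tuple

Point = Tuple[float, ...]


def squared_distance(a: Point, b: Point) -> float:
    """Squared Euclidean distance; enough for finding the nearest centroid."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Point], k: int, max_iterations: int = 100,
            seed: int = 0) -> Tuple[List[Point], List[List[Point]]]:
    rng = random.Random(seed)
    centroids: List[Point] = rng.sample(points, k)  # k random starting centers
    clusters: List[List[Point]] = [[] for _ in range(k)]
    for _ in range(max_iterations):
        # Step one: assign each point to the cluster with the closest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Step two: move each centroid to the mean of its assigned points.
        new_centroids = [
            tuple(mean(axis) for axis in zip(*cluster)) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break  # converged: the centroids stopped moving
        centroids = new_centroids
    return centroids, clusters


if __name__ == "__main__":
    # Hypothetical 2-D data with two obvious groups.
    data = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
            (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
    centers, groups = k_means(data, k=2)
    print(centers)
    print(groups)
```

Real data usually needs normalizing first so that no single dimension dominates the distance, and results can depend on where the starting centroids land.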
Now, let's 396 00:19:49.839 --> 00:19:53.720 talk about the elephant in the AI room, neural networks. 397 00:19:53.880 --> 00:19:57.200 When we hear about deep learning, it's usually neural nets. 398 00:19:57.000 --> 00:20:00.519 Right, almost always, yes, especially the really impressive advances in 399 00:20:00.559 --> 00:20:04.160 AI lately, they often concern neural networks, particularly deep ones 400 00:20:04.200 --> 00:20:05.200 with many layers. 401 00:20:05.240 --> 00:20:07.880