WEBVTT 1 00:00:00.120 --> 00:00:04.320 What if the key to building really intelligent AI isn't 2 00:00:04.360 --> 00:00:07.639 about meticulously optimizing every single parameter? What if it's more 3 00:00:07.639 --> 00:00:12.000 about letting it evolve? Imagine maybe a shortcut to sophisticated AI, 4 00:00:12.400 --> 00:00:14.839 one that sidesteps some of the usual complexities, maybe 5 00:00:14.960 --> 00:00:18.440 taking a page straight from nature's playbook. Today, we're embarking 6 00:00:18.440 --> 00:00:21.760 on a deep dive into neuroevolution. It's this fascinating family 7 00:00:21.760 --> 00:00:25.160 of machine learning methods that use evolutionary algorithms to build, well, 8 00:00:25.359 --> 00:00:28.920 high-performing artificial neural networks. Our mission is to unpack 9 00:00:29.000 --> 00:00:32.399 this powerful alternative to conventional deep learning. We want to 10 00:00:32.439 --> 00:00:35.920 reveal how it's used for complex tasks in games, robotics, 11 00:00:36.079 --> 00:00:39.000 and how it delivers sometimes surprisingly energy-efficient, kind of 12 00:00:39.119 --> 00:00:42.280 elegant solutions. You'll get insights from the core concepts right 13 00:00:42.320 --> 00:00:46.039 through to some really surprising real-world applications, all distilled 14 00:00:46.079 --> 00:00:49.600 from Hands-On Neuroevolution with Python. So let's explore this 15 00:00:49.679 --> 00:00:52.079 alternative path to AI. I mean, many of us know 16 00:00:52.119 --> 00:00:56.000 about AI learning from massive data sets, complex calculations, but 17 00:00:56.039 --> 00:00:59.439 neuroevolution offers this really different, almost organic approach. 18 00:00:59.560 --> 00:01:02.840 It's really fascinating, isn't it, how directly it draws inspiration 19 00:01:02.920 --> 00:01:07.000 from natural selection? Instead of, you know, explicitly programming the 20 00:01:07.040 --> 00:01:11.719 perfect network, you're essentially cultivating a population of networks. 
You 21 00:01:11.799 --> 00:01:14.719 let the fittest survive and reproduce, and that leads to 22 00:01:14.840 --> 00:01:18.879 increasingly complex, more optimal solutions over generations. It's a very 23 00:01:18.879 --> 00:01:19.760 different way of thinking. 24 00:01:20.159 --> 00:01:22.079 So it all starts at the brain, doesn't it? Our 25 00:01:22.120 --> 00:01:26.319 own brains, these incredibly complex graphs of nodes and links. 26 00:01:26.760 --> 00:01:31.319 Early AI ambitions were sort of to imitate that directly, right, 27 00:01:31.359 --> 00:01:35.319 hoping for artificial general intelligence. We're still, well, working towards that, 28 00:01:35.359 --> 00:01:38.599 but neuroevolution is helping us build some powerful narrow AI 29 00:01:38.680 --> 00:01:39.519 agents right now. 30 00:01:39.680 --> 00:01:44.519 Indeed, artificial neural networks, ANNs, they're universal approximators. Theoretically 31 00:01:44.560 --> 00:01:46.959 they can approximate any function, but the real challenge 32 00:01:47.000 --> 00:01:48.680 is how you train them. How do you select the 33 00:01:48.719 --> 00:01:51.799 right weight values for all those connections? Do you, like, 34 00:01:51.920 --> 00:01:56.280 meticulously adjust weights with methods like gradient descent, or do 35 00:01:56.319 --> 00:01:59.439 you let them evolve? Neuroevolution takes that second 36 00:01:59.120 --> 00:02:03.319 path, the evolutionary path. So a foundational algorithm here is NEAT, 37 00:02:03.439 --> 00:02:07.799 NeuroEvolution of Augmenting Topologies. What's the core idea there? What 38 00:02:07.920 --> 00:02:11.840 makes it revolutionary, especially in how it deals with complexity? 39 00:02:12.280 --> 00:02:16.120 Well, the big breakthrough is its complexification strategy. It starts simple. 40 00:02:16.360 --> 00:02:19.319 It reduces the huge parameter search space. 
It does that by beginning with 41 00:02:19.400 --> 00:02:24.400 tiny, simple seed genomes, just inputs, outputs, maybe a bias neuron, 42 00:02:24.759 --> 00:02:28.000 no hidden nodes at first. Then generation by generation it 43 00:02:28.000 --> 00:02:31.800 introduces additional genes. It expands the solution space incrementally. This 44 00:02:31.840 --> 00:02:34.639 mirrors natural evolution, you know, where new genes sometimes 45 00:02:34.639 --> 00:02:37.479 add complexity. It's way more efficient than trying to search 46 00:02:37.479 --> 00:02:38.960 a massive space from the get-go. 47 00:02:39.240 --> 00:02:41.680 Okay, so if they're evolving, how do they reproduce and 48 00:02:41.759 --> 00:02:43.800 mutate in a way that lets them get more complex? 49 00:02:43.919 --> 00:02:46.199 Is it like classic genetic algorithms? 50 00:02:46.280 --> 00:02:50.599 It is, yeah. Neuroevolution uses those genetic operators. Mutation can 51 00:02:50.639 --> 00:02:53.840 be simple things: flipping bits, changing values in the genome, 52 00:02:53.879 --> 00:02:57.360 altering existing connections. But where NEAT gets really clever is 53 00:02:57.360 --> 00:03:01.360 with structural mutations, actually adding new connections or even entirely new 54 00:03:01.400 --> 00:03:03.800 nodes to the network's architecture itself. 55 00:03:04.039 --> 00:03:08.159 Huh. But if their structures are constantly changing, growing independently, 56 00:03:09.000 --> 00:03:12.039 how do you combine two different networks during reproduction? Doesn't 57 00:03:12.039 --> 00:03:14.680 that get messy? How do you match things up? 58 00:03:14.960 --> 00:03:18.759 That's a really important question, and NEAT solves it brilliantly 59 00:03:18.800 --> 00:03:22.439 with the innovation number. Every new gene, a connection or 60 00:03:22.479 --> 00:03:26.680 a node, introduced by mutation gets a unique, globally incrementing 61 00:03:26.759 --> 00:03:31.639 number across the whole evolutionary run. That pays off during crossover. 
These numbers 62 00:03:31.639 --> 00:03:35.080 act like genetic IDs. They let the algorithm precisely align 63 00:03:35.199 --> 00:03:38.280 corresponding genes from two parents, even if their structures look 64 00:03:38.360 --> 00:03:41.159 quite different. Any genes that don't match up, the disjoint or 65 00:03:41.199 --> 00:03:43.960 excess ones, are inherited from the fitter parent. 66 00:03:44.080 --> 00:03:46.280 Okay, I think I follow that. But what if a new, 67 00:03:46.479 --> 00:03:50.120 more complex structure is, like, temporarily less fit than a 68 00:03:50.159 --> 00:03:53.439 simpler one that's already pretty optimized? How do these potentially 69 00:03:53.520 --> 00:03:58.159 groundbreaking innovations survive long enough to actually prove their worth? Ah, 70 00:03:58.520 --> 00:04:01.960 that's where speciation comes in. It's directly inspired by how 71 00:04:02.039 --> 00:04:06.400 species form in nature, literally. In NEAT, the population gets 72 00:04:06.439 --> 00:04:09.599 divided into species, or niches, based on how similar their 73 00:04:09.639 --> 00:04:14.120 network structures, their topologies, are. Organisms within the same species 74 00:04:14.400 --> 00:04:17.079 mainly compete and mate with each other. This is crucial. 75 00:04:17.480 --> 00:04:21.800 It shields new, possibly brilliant, but currently underperforming topologies from 76 00:04:21.839 --> 00:04:25.360 immediate negative pressure from the more established networks. It gives 77 00:04:25.360 --> 00:04:27.839 them breathing room, lets them evolve within their niche until 78 00:04:27.879 --> 00:04:31.480 they might become genuinely superior. It's all about cultivating diversity 79 00:04:31.519 --> 00:04:32.439 for long-term gain. 80 00:04:32.800 --> 00:04:36.079 That's pretty neat. 
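A minimal sketch of how innovation numbers could line genes up during crossover. The `ConnectionGene` class, the `crossover` function, and the example weights are illustrative names and values, not taken from any particular library. It assumes the rule from Stanley's original NEAT description: matching genes are picked at random from either parent, while disjoint and excess genes come from the fitter parent.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ConnectionGene:
    innovation: int  # historical marker assigned when the gene first appeared
    src: int         # id of the source node
    dst: int         # id of the target node
    weight: float


def crossover(fitter, weaker, rng):
    """Build a child genome by aligning two parents on innovation number.

    Matching genes are chosen at random from either parent. Genes found
    only in the fitter parent (disjoint or excess) are copied from it;
    genes found only in the weaker parent are dropped.
    """
    weaker_by_id = {gene.innovation: gene for gene in weaker}
    child = []
    for gene in fitter:
        match = weaker_by_id.get(gene.innovation)
        if match is not None and rng.random() < 0.5:
            child.append(match)
        else:
            child.append(gene)
    return child


# Hypothetical parents: they share gene 1 and otherwise diverge.
parent_a = [
    ConnectionGene(1, 0, 2, 0.5),
    ConnectionGene(2, 1, 2, -0.3),
    ConnectionGene(4, 0, 3, 0.9),
]
parent_b = [
    ConnectionGene(1, 0, 2, -0.7),
    ConnectionGene(3, 1, 3, 0.2),
]

child = crossover(parent_a, parent_b, random.Random(0))
child_ids = [gene.innovation for gene in child]
print(child_ids)
```

Because alignment depends only on the innovation number, the two parents never need to have the same shape for this to work.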
Okay, so NEAT sounds powerful, but I 81 00:04:36.120 --> 00:04:38.720 can imagine a problem when you need a really big network, 82 00:04:39.000 --> 00:04:43.240 like millions of connections for complex visual recognition. Directly encoding 83 00:04:43.279 --> 00:04:45.040 every single connection must get unwieldy. 84 00:04:45.120 --> 00:04:48.040 Right, you're absolutely right. That's the big drawback of direct 85 00:04:48.120 --> 00:04:51.120 encoding for large-scale ANNs. As the network grows, 86 00:04:51.160 --> 00:04:54.680 the genome just balloons. It becomes computationally expensive, hard to manage. 87 00:04:55.040 --> 00:04:59.680 So researchers developed indirect encoding schemes, much more efficient. 88 00:05:00.000 --> 00:05:02.399 Okay, and here's where it gets, I think, really ingenious: 89 00:05:03.000 --> 00:05:06.720 HyperNEAT. It uses something called a compositional pattern producing 90 00:05:06.759 --> 00:05:09.360 network, a CPPN. What exactly is that? What does it 91 00:05:09.439 --> 00:05:10.160 let you do? Right, 92 00:05:10.160 --> 00:05:13.720 a CPPN, it's a specialized neural network itself. Its job 93 00:05:13.759 --> 00:05:16.800 is to represent the connectivity patterns of another network, the 94 00:05:16.839 --> 00:05:19.879 main one you want to build, the phenotype ANN, as 95 00:05:19.920 --> 00:05:22.439 a function of its geometry. Think of it like a 96 00:05:22.439 --> 00:05:25.600 master blueprint, a compact set of rules for building a 97 00:05:25.639 --> 00:05:29.560 complex structure. This connectivity pattern is often visualized as a 98 00:05:29.600 --> 00:05:32.680 kind of high-dimensional space, like a grid. Each point 99 00:05:32.680 --> 00:05:35.240 on the grid tells you if and how strongly two 100 00:05:35.279 --> 00:05:39.160 specific nodes in the main ANN should connect. 
The CPPN 101 00:05:39.399 --> 00:05:41.959 takes the coordinates of these nodes as input, and it 102 00:05:42.000 --> 00:05:45.360 outputs the connection weight. If the weight's below a certain threshold, 103 00:05:45.360 --> 00:05:47.040 well, no connection gets made. 104 00:05:47.199 --> 00:05:51.319 Whoa. So one small CPPN can basically act as a 105 00:05:51.360 --> 00:05:55.600 compressed set of instructions, a blueprint, for a potentially massive ANN. 106 00:05:55.879 --> 00:05:57.319 That sounds incredibly efficient. 107 00:05:57.680 --> 00:06:02.439 It allows for remarkable information compression, seriously remarkable. There is 108 00:06:02.480 --> 00:06:06.480 this visual discrimination task, for instance, where a CPPN with 109 00:06:06.519 --> 00:06:10.199 only, like, sixteen connections defined the patterns for a main 110 00:06:10.399 --> 00:06:14.120 ANN with almost fifteen thousand connections. That's a 111 00:06:14.160 --> 00:06:17.000 compression ratio of, what, about point one one percent? 112 00:06:17.120 --> 00:06:17.399 Wow. 113 00:06:17.879 --> 00:06:20.360 And what this practically means for you, the listener, is 114 00:06:20.399 --> 00:06:24.360 potentially much more energy-efficient AI. You can deploy powerful 115 00:06:24.360 --> 00:06:27.040 models where traditional deep learning is just too big or 116 00:06:27.079 --> 00:06:30.759 power hungry. Think edge devices. Plus, it often lets you 117 00:06:30.800 --> 00:06:34.120 generate solutions at different resolutions without retraining. 118 00:06:34.439 --> 00:06:37.560 That's a huge leap. But okay, HyperNEAT sounds powerful, but 119 00:06:37.600 --> 00:06:40.560 if the CPPN is the blueprint, someone still has to 120 00:06:40.600 --> 00:06:42.720 decide where the bricks go, right? Someone has to define 121 00:06:42.720 --> 00:06:44.439 the layout of the nodes in the final network. 122 00:06:44.439 --> 00:06:48.040 You've hit its main limitation exactly. 
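To make the blueprint idea concrete, here is a small sketch of querying a CPPN for connection weights. Everything in it is an assumption for illustration: `toy_cppn` is a hand-written function standing in for an evolved CPPN, the threshold of 0.2 is arbitrary, and the node positions are placed by hand, which is exactly the limitation being discussed at this point. The same function is queried at two resolutions to show why no retraining is needed when the node grid gets finer.

```python
import math
from itertools import product


def toy_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved CPPN.

    Maps the positions of a source node (x1, y1) and a target node
    (x2, y2) to a connection weight that fades with horizontal distance.
    """
    dx = x2 - x1
    dy = y2 - y1
    return math.exp(-dx * dx) * math.tanh(dy)


def build_connections(sources, targets, cppn, threshold=0.2):
    """Query the CPPN for every source/target pair.

    A connection is created only when the magnitude of the returned
    weight exceeds the threshold.
    """
    connections = {}
    for src, dst in product(sources, targets):
        weight = cppn(src[0], src[1], dst[0], dst[1])
        if abs(weight) > threshold:
            connections[(src, dst)] = weight
    return connections


def layer(xs, y):
    """Place one node at each x position on a horizontal line at height y."""
    return [(x, y) for x in xs]


coarse_xs = [-1.0, 0.0, 1.0]
fine_xs = [-1.0, -0.5, 0.0, 0.5, 1.0]

coarse = build_connections(layer(coarse_xs, -1.0), layer(coarse_xs, 1.0), toy_cppn)
fine = build_connections(layer(fine_xs, -1.0), layer(fine_xs, 1.0), toy_cppn)

print(len(coarse), "of", len(coarse_xs) ** 2, "possible connections kept")
print(len(fine), "of", len(fine_xs) ** 2, "possible connections kept")
```

The function that defines the pattern stays the same size no matter how many nodes are queried, which is where the compression comes from.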
The human experimenter still 123 00:06:48.040 --> 00:06:51.480 defines the layout of the phenotype ANN's nodes, the substrate, 124 00:06:51.519 --> 00:06:53.120 we call it, right at the start. If you make 125 00:06:53.120 --> 00:06:55.680 a bad assumption about that layout, performance can suffer. So 126 00:06:56.160 --> 00:06:59.480 ES-HyperNEAT, or evolvable-substrate HyperNEAT, tackles this. It 127 00:06:59.519 --> 00:07:01.639 introduces an evolvable substrate. 128 00:07:01.240 --> 00:07:04.120 Hold on, so the layout of the network itself, that 129 00:07:04.160 --> 00:07:06.680 evolves automatically too? That's really next level. 130 00:07:06.720 --> 00:07:10.199 Precisely. It figures out where information seems to be flowing 131 00:07:10.279 --> 00:07:14.600 most intensely within the potential connection space. It uses techniques 132 00:07:14.680 --> 00:07:19.079 like quadtree information extraction, basically clever ways to divide 133 00:07:19.120 --> 00:07:21.759 up the space and focus effort where needed, and then 134 00:07:21.800 --> 00:07:25.439 it automatically puts more hidden nodes in those high-intensity regions. 135 00:07:25.759 --> 00:07:27.959 So the system learns not just the connections, but where 136 00:07:27.959 --> 00:07:30.560 to put the nodes for the best representation. It allows 137 00:07:30.639 --> 00:07:34.680 automatic hidden node placement, easier modular networks, and it can 138 00:07:34.720 --> 00:07:38.560 elaborate the structure, adding nodes and connections during evolution, which 139 00:07:38.639 --> 00:07:40.439 basic HyperNEAT doesn't really do. 140 00:07:40.639 --> 00:07:44.399 Okay, let's shift gears a bit. Most optimization algorithms, including 141 00:07:44.399 --> 00:07:46.839 a lot of evolutionary ones, they try to get closer 142 00:07:46.839 --> 00:07:49.480 and closer to a goal, right? They reward progress towards 143 00:07:49.480 --> 00:07:52.120 some objective. 
But what happens if the best path to 144 00:07:52.160 --> 00:07:55.079 that goal involves, I don't know, temporarily moving away from it, 145 00:07:55.319 --> 00:07:57.439 or if there are dead ends that look promising? That 146 00:07:57.560 --> 00:07:59.120 sounds like a fundamental problem. 147 00:07:59.279 --> 00:08:03.079 It is. It's the classic local optima trap. Imagine a 148 00:08:03.120 --> 00:08:06.639 maze where the shortest path out actually requires you to walk 149 00:08:06.639 --> 00:08:09.319 away from the exit for a bit first. A simple 150 00:08:09.399 --> 00:08:12.800 goal-oriented search, one that just rewards getting closer, might 151 00:08:12.839 --> 00:08:15.079 walk into a dead end, a cul-de-sac that 152 00:08:15.120 --> 00:08:17.879 seems close to the exit but offers no way forward. 153 00:08:18.040 --> 00:08:21.279 The algorithm gets stuck. It converges to a local champion, 154 00:08:21.439 --> 00:08:22.879 not the true best solution. 155 00:08:23.399 --> 00:08:27.680 Okay, so if that goal-focused approach gets stuck, what's 156 00:08:27.720 --> 00:08:32.080 the alternative? How does neuroevolution break free from these deceptive landscapes? 157 00:08:32.600 --> 00:08:35.799 That's where novelty search, or NS, comes in, and the 158 00:08:35.840 --> 00:08:38.759 core idea is really counterintuitive, almost zen-like. The objective 159 00:08:38.799 --> 00:08:41.600 function isn't proximity to a goal. It's defined by the 160 00:08:41.600 --> 00:08:44.200 novelty of the behavior shown by the agent. It actively 161 00:08:44.240 --> 00:08:47.960 rewards doing something different. It drives evolution towards diversity of behavior. 162 00:08:48.039 --> 00:08:50.960 Wait, you're saying it just wanders around exploring, hoping to 163 00:08:51.000 --> 00:08:53.919 stumble onto the solution by accident? That feels indirect. 164 00:08:54.159 --> 00:08:57.399 It's more sophisticated than just random wandering. There's a novelty metric. 
165 00:08:57.960 --> 00:09:01.080 Often it's measured as, like, the average distance of an 166 00:09:01.120 --> 00:09:05.679 individual's behavior to its k nearest neighbors in some abstract 167 00:09:05.720 --> 00:09:09.360 behavioral space. If you're doing something unique, far from what 168 00:09:09.440 --> 00:09:12.320 others are doing, you get a high novelty score. You're rewarded. 169 00:09:12.840 --> 00:09:16.679 This encourages divergent evolution. It forces the population to spread out, 170 00:09:16.960 --> 00:09:20.039 explore the whole space, not just clump together in one 171 00:09:20.159 --> 00:09:23.519 seemingly good spot. And here's the really wild part. For 172 00:09:23.600 --> 00:09:27.279 certain tricky, deceptive problems, novelty search can actually find solutions 173 00:09:27.600 --> 00:09:31.679 faster than traditional objective-based search. It forces exploration that 174 00:09:31.720 --> 00:09:32.720 goal seeking misses. 175 00:09:33.039 --> 00:09:37.120 Okay, wow, so we've covered the mechanics, these cool complexification strategies, 176 00:09:37.159 --> 00:09:40.480 ways to handle scale, even this idea of rewarding novelty. 177 00:09:40.960 --> 00:09:43.200 Let's see how this all plays out in practice. How 178 00:09:43.240 --> 00:09:46.759 does neuroevolution tackle some real challenges, from classic problems to 179 00:09:47.480 --> 00:09:50.639 complex games, even evolving its own goals? Let's start simple. 180 00:09:50.759 --> 00:09:54.840 Maybe the XOR problem. Sounds basic, but notoriously tricky for 181 00:09:54.879 --> 00:09:58.159 simple networks because it's not linearly separable. How does NEAT 182 00:09:58.200 --> 00:09:58.639 handle that? 183 00:09:58.960 --> 00:10:02.879 Right, XOR. A basic ANN, no hidden layers, just can't 184 00:10:02.919 --> 00:10:06.519 crack it. But NEAT, starting super simple, two inputs, one 185 00:10:06.559 --> 00:10:10.639 output, consistently evolves the necessary structure. 
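The novelty score described a moment ago, the average distance to the k nearest neighbors in behavior space, can be sketched in a few lines. This is an illustration under assumptions: behavior is taken to be an agent's final (x, y) position, Euclidean distance is the behavioral distance, and `novelty_score`, `k=3`, and the sample points are made up for the example.

```python
import math


def novelty_score(behavior, others, k=3):
    """Mean Euclidean distance from `behavior` to its k nearest neighbors.

    `others` holds the behaviors of the rest of the population (and,
    in a fuller setup, an archive of past novel behaviors).
    """
    if not others:
        return 0.0
    distances = sorted(math.dist(behavior, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)


# Hypothetical final positions: four agents clumped together, one far away.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
explorer = (10.0, 10.0)

crowded_score = novelty_score(cluster[0], cluster[1:] + [explorer])
explorer_score = novelty_score(explorer, cluster)

print(round(crowded_score, 3), round(explorer_score, 3))
```

The agent that ended up far from everyone else scores much higher, so selection on this score pushes the population to spread out rather than clump.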
It adds that crucial 186 00:10:10.720 --> 00:10:14.679 hidden node. It perfectly demonstrates NEAT's power to grow the 187 00:10:14.720 --> 00:10:18.759 complexity it needs and avoid those traps that stump fixed networks. 188 00:10:19.360 --> 00:10:22.240 For XOR, fitness is usually calculated based on how close 189 00:10:22.279 --> 00:10:25.000 the output is to the correct zero or one for 190 00:10:25.039 --> 00:10:28.399 all four input patterns. Get close enough, like fifteen point 191 00:10:28.399 --> 00:10:30.080 five out of sixteen, and you've solved it. 192 00:10:30.159 --> 00:10:34.000 Okay, makes sense. Moving to something more dynamic: balancing a 193 00:10:34.039 --> 00:10:36.759 pole on a cart. That's a real classic in reinforcement learning, 194 00:10:36.759 --> 00:10:39.480 isn't it? Absolutely, the single pole balancing task. It's an 195 00:10:39.519 --> 00:10:43.519 avoidance control problem. The ANN gets inputs: cart position, velocity, 196 00:10:43.559 --> 00:10:46.480 pole angle, its angular velocity, all scaled nicely, and then 197 00:10:46.519 --> 00:10:48.960 it just outputs a simple action, push left or push right. 198 00:10:49.240 --> 00:10:51.360 Fitness is just how long it keeps the pole balanced, 199 00:10:51.480 --> 00:10:54.120 often measured in time steps, maybe up to hundreds of thousands, 200 00:10:54.159 --> 00:10:56.759 and the physics underneath are often simulated using something like 201 00:10:56.759 --> 00:10:58.600 a Runge-Kutta method to keep it accurate. 202 00:10:58.679 --> 00:11:01.519 And then you mentioned trying a double pole balancing problem. 203 00:11:01.559 --> 00:11:02.639 That sounds way harder. 204 00:11:02.720 --> 00:11:06.919 Two poles? Oh, much harder. Two poles, often different lengths, 205 00:11:07.000 --> 00:11:10.240 on the same cart, more state variables, much more complex 206 00:11:10.240 --> 00:11:14.600 physics involved. 
That experiment really highlighted how important that speciation 207 00:11:14.720 --> 00:11:18.240 thing is, finding the right balance of species diversity. Too 208 00:11:18.320 --> 00:11:20.919 many species and they become too small. Maybe they don't 209 00:11:20.960 --> 00:11:24.919 evolve fast enough. Too few and you stifle innovation. It also 210 00:11:24.960 --> 00:11:27.480 really showed how sensitive things can be to the initial 211 00:11:27.559 --> 00:11:29.720 random seed. Sometimes you just need a bit of luck 212 00:11:29.720 --> 00:11:31.440 in that initial population setup. Right, 213 00:11:31.480 --> 00:11:35.440 the starting conditions matter. Okay, mazes. They're great test beds 214 00:11:35.480 --> 00:11:40.000 for autonomous agents. How does neuroevolution do with, say, a 215 00:11:40.120 --> 00:11:43.200 robot navigating a maze, avoiding walls, finding an exit? 216 00:11:43.639 --> 00:11:46.720 Mazes are fascinating because they often have those deceptive landscapes 217 00:11:46.759 --> 00:11:49.080 we talked about, cul-de-sacs that look promising but 218 00:11:49.080 --> 00:11:52.120 are dead ends, local optima. If you just use a 219 00:11:52.159 --> 00:11:55.200 goal-oriented fitness function rewarding distance to the exit, agents 220 00:11:55.240 --> 00:11:57.480 often get stuck. We saw this in experiments with a 221 00:11:57.480 --> 00:12:01.559 hard maze configuration. Objective-based search just failed. Agents got 222 00:12:01.559 --> 00:12:03.639 trapped near the start or in those dead ends. 223 00:12:03.960 --> 00:12:06.919 But what about novelty search? Did that make a difference 224 00:12:06.960 --> 00:12:10.240 in the mazes? Could it actually beat the goal-focused approach there? 
225 00:12:10.320 --> 00:12:13.519 That's the key question, right? For a simple maze, NS 226 00:12:13.559 --> 00:12:17.320 often found a solution faster and, interestingly, often with a 227 00:12:17.360 --> 00:12:21.639 simpler network topology, sometimes even needing no hidden nodes at all, 228 00:12:21.759 --> 00:12:25.679 compared to the goal-oriented method. It consistently pushed agents 229 00:12:25.720 --> 00:12:29.559 to explore more varied paths. Even for the really hard maze, 230 00:12:29.720 --> 00:12:33.120 while the specific library implementation we used struggled to find 231 00:12:33.120 --> 00:12:36.360 a perfect, winning solution, the results were far more promising 232 00:12:36.360 --> 00:12:39.720 with novelty search. The exploration was much broader, much more 233 00:12:39.759 --> 00:12:43.559 intelligent looking. It really shows that sometimes not aiming directly 234 00:12:43.600 --> 00:12:45.080 at the goal is the best way to get there. 235 00:12:45.480 --> 00:12:47.960 Okay, this next one, it sounds like pure science fiction: 236 00:12:48.360 --> 00:12:53.039 coevolution. Two AI populations evolving together, influencing each other. 237 00:12:53.120 --> 00:12:56.519 Yeah, it's a really advanced concept inspired by biological ideas 238 00:12:56.519 --> 00:12:59.759 like commensalism, where one species benefits without affecting the other 239 00:12:59.840 --> 00:13:03.679 one much. The method, called SAFE, involves two populations evolving 240 00:13:03.720 --> 00:13:06.759 side by side: one population of maze-solving agents and 241 00:13:06.799 --> 00:13:10.840 another population of, well, objective function candidates. 242 00:13:10.399 --> 00:13:14.360 Wait, objective function candidates? So the maze solver's fitness isn't 243 00:13:14.399 --> 00:13:16.679 just about reaching the exit anymore? Exactly. 244 00:13:16.759 --> 00:13:19.360 That's where it gets really clever. 
The maze solver's fitness 245 00:13:19.399 --> 00:13:22.039 is a combination of two things. One, its distance to 246 00:13:22.080 --> 00:13:25.480 the exit, that's the objective part, and two, the novelty 247 00:13:25.519 --> 00:13:28.799 of its final position, the behavioral novelty part. But here's 248 00:13:28.840 --> 00:13:32.720 the crucial twist: the weights used to combine these two scores, 249 00:13:33.159 --> 00:13:35.759 they come as outputs from an individual in the other 250 00:13:35.919 --> 00:13:40.440 evolving population, the objective function candidates. So the system literally 251 00:13:40.480 --> 00:13:43.759 evolved to find solutions for that hard maze where objective 252 00:13:43.759 --> 00:13:46.639 search alone failed. It's like the AI is learning how 253 00:13:46.639 --> 00:13:50.679 to define its own success criteria, dynamically shifting focus between 254 00:13:50.679 --> 00:13:51.799 the goal and exploration. 255 00:13:51.919 --> 00:13:55.559 That is wild. Okay, from mazes to video games. You mentioned 256 00:13:55.679 --> 00:13:59.120 neuroevolution can train agents for classic Atari games. That usually 257 00:13:59.120 --> 00:14:02.879 involves deep reinforcement learning like DQN, which is known for 258 00:14:02.919 --> 00:14:04.840 being super computationally heavy. 259 00:14:05.039 --> 00:14:09.919 Traditionally, yes. Deep RL methods like DQN use deep neural nets. 260 00:14:10.159 --> 00:14:14.480 Gradient-based backpropagation needs serious GPU power for all those 261 00:14:14.480 --> 00:14:19.039 matrix multiplications. Deep neuroevolution offers a different path. It can 262 00:14:19.080 --> 00:14:22.720 approximate that Q-value function needed for reinforcement learning without 263 00:14:22.759 --> 00:14:24.600 relying on error backpropagation at all. 264 00:14:24.759 --> 00:14:27.600 No backpropagation? How on earth does it train those huge 265 00:14:27.639 --> 00:14:28.600 deep neural networks? 
266 00:14:28.600 --> 00:14:32.279 Instead of backprop, it uses a pretty straightforward genetic algorithm 267 00:14:32.519 --> 00:14:36.519 to evolve a population of potential network controllers. The genome 268 00:14:36.600 --> 00:14:40.240 of each individual encodes all the trainable parameters, the millions 269 00:14:40.240 --> 00:14:43.120 of connection weights of a deep neural network. For the 270 00:14:43.120 --> 00:14:46.159 Frostbite Atari game, for instance, the agent learns just by 271 00:14:46.159 --> 00:14:49.279 looking at the screen pixels. It uses a convolutional neural 272 00:14:49.320 --> 00:14:52.720 network, a CNN, with something like four million parameters. 273 00:14:53.039 --> 00:14:55.960 Four million parameters? How do you encode that efficiently in 274 00:14:56.000 --> 00:14:57.519 a genome? That sounds massive. 275 00:14:57.960 --> 00:15:00.840 This is another really clever bit of encoding. It uses 276 00:15:00.879 --> 00:15:04.799 the seeds of a pseudorandom number generator. The genome isn't 277 00:15:04.799 --> 00:15:08.039 the weights themselves. It's a list of these random seeds. 278 00:15:08.519 --> 00:15:11.519 These seeds are then used sequentially to generate the entire 279 00:15:11.600 --> 00:15:15.480 massive parameter vector for the network. So a relatively compact 280 00:15:15.519 --> 00:15:19.799 list of seeds can define an incredibly complex, high-dimensional network. 281 00:15:20.240 --> 00:15:22.960 GPU acceleration is still vital, mind you, because you have 282 00:15:23.000 --> 00:15:25.919 to evaluate each agent, maybe running the game for twenty 283 00:15:25.960 --> 00:15:29.559 thousand frames or more, but the learning mechanism itself is different. 284 00:15:29.600 --> 00:15:32.720 It potentially avoids some of the complexities and instabilities of 285 00:15:32.799 --> 00:15:35.399 gradient-based methods for these huge RL problems. 286 00:15:35.679 --> 00:15:42.279 Amazing stuff. 
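The seed-list encoding can be sketched with nothing but the standard library. This is a toy under stated assumptions: the first seed generates the initial parameters, each later seed adds Gaussian noise scaled by a mutation strength, and a mutation simply appends one new seed. The names `decode` and `mutate`, the vector size of 8 (standing in for millions), and `SIGMA = 0.05` are all invented for the example.

```python
import random

N_PARAMS = 8   # toy size; the network in the discussion has millions of weights
SIGMA = 0.05   # hypothetical mutation strength


def noise_vector(seed, n):
    """Deterministically expand one integer seed into n Gaussian samples."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def decode(genome, n=N_PARAMS, sigma=SIGMA):
    """Rebuild the full parameter vector from a list of seeds.

    The first seed produces the initial weights; every following seed
    replays one mutation by adding scaled noise on top.
    """
    params = noise_vector(genome[0], n)
    for seed in genome[1:]:
        noise = noise_vector(seed, n)
        params = [p + sigma * e for p, e in zip(params, noise)]
    return params


def mutate(genome, rng):
    """A mutation is recorded as one extra seed appended to the genome."""
    return genome + [rng.randrange(2 ** 32)]


rng = random.Random(123)
parent = [rng.randrange(2 ** 32)]
child = mutate(parent, rng)

parent_params = decode(parent)
child_params = decode(child)
print(len(child), "seeds decode to", len(child_params), "parameters")
```

The genome grows by one integer per generation regardless of how large the network is, at the cost of regenerating the weights whenever an individual has to be evaluated.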
Okay, with all this complexity, evolving topologies, CPPNs, novelty, coevolution, 287 00:15:43.080 --> 00:15:45.360 what are some practical tips for someone listening who actually 288 00:15:45.360 --> 00:15:47.679 wants to build or experiment with these systems? Where should 289 00:15:47.679 --> 00:15:48.679 they start? What's crucial? 290 00:15:49.000 --> 00:15:54.080 Rule number one, always: careful problem analysis and really rigorous 291 00:15:54.159 --> 00:15:58.919 data preprocessing. Neuroevolution is pretty robust, but numerical instability can 292 00:15:58.960 --> 00:16:03.600 totally derail things. Input data needs attention, especially if different 293 00:16:03.600 --> 00:16:07.519 features have vastly different scales, like differing by orders of magnitude. 294 00:16:07.679 --> 00:16:11.240 You absolutely need to standardize it, zero mean, unit variance, 295 00:16:11.320 --> 00:16:14.000 like with scikit-learn's StandardScaler, or scale it to 296 00:16:14.000 --> 00:16:16.720 a specific range, maybe zero to one, using MinMaxScaler, 297 00:16:17.399 --> 00:16:19.799 or normalize it. If you don't, the features with bigger 298 00:16:19.840 --> 00:16:22.360 numbers will just dominate the learning process and you'll miss 299 00:16:22.399 --> 00:16:23.759 subtle but important signals. 300 00:16:23.879 --> 00:16:26.720 Got it, preprocessing first. And once the data is ready, 301 00:16:26.720 --> 00:16:28.799 what about tuning the evolution itself? What are the key 302 00:16:28.840 --> 00:16:30.200 dials we can turn? Right, 303 00:16:30.440 --> 00:16:34.600 tuning the evolutionary process, that's critical. Okay, so if things seem stalled, 304 00:16:34.639 --> 00:16:38.639 if fitness isn't improving, maybe try decreasing the NEAT survival threshold. 305 00:16:38.919 --> 00:16:42.720 This makes selection stricter, only letting higher-quality individuals reproduce. 306 00:16:43.120 --> 00:16:46.679 You could also try increasing max stagnation. 
This gives species 307 00:16:46.720 --> 00:16:51.879 more generations to potentially develop useful mutations before being considered stagnant. 308 00:16:52.080 --> 00:16:56.559 But maybe start lower, like fifteen to twenty generations, for quicker turnover initially. 309 00:16:57.039 --> 00:16:59.159 Keep an eye on the number of species. Usually somewhere 310 00:16:59.200 --> 00:17:01.799 between five and twenty is a decent range. Too many 311 00:17:01.960 --> 00:17:04.640 and they might be too small to evolve effectively. Too 312 00:17:04.720 --> 00:17:08.160 few and you might kill off diversity too quickly. Population 313 00:17:08.279 --> 00:17:11.680 size is a big one. Larger populations mean more initial diversity, 314 00:17:11.759 --> 00:17:15.279 which is good but obviously increases the computational cost per generation. 315 00:17:15.839 --> 00:17:18.160 It's a trade-off. And please, please always set the 316 00:17:18.200 --> 00:17:20.440 random seed value at the start of every run. If 317 00:17:20.480 --> 00:17:22.839 you get an interesting result, you absolutely need that seed 318 00:17:22.839 --> 00:17:26.799 to replicate the exact evolutionary path later for analysis, for debugging. 319 00:17:27.000 --> 00:17:27.720 Super important. 320 00:17:27.799 --> 00:17:30.519 That's a great practical tip. Okay, beyond just looking at 321 00:17:30.559 --> 00:17:33.839 fitness scores going up, are there visual ways to understand 322 00:17:33.839 --> 00:17:36.039 what's happening, how the evolution is progressing? 323 00:17:36.160 --> 00:17:39.400 Oh, absolutely, visualization is crucial. Don't just look at numbers. 324 00:17:39.720 --> 00:17:42.759 Use tools like Matplotlib or seaborn to plot fitness 325 00:17:42.759 --> 00:17:45.960 trends over generations. See how the best and average fitness 326 00:17:45.960 --> 00:17:49.640 are changing, look at species counts. 
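The advice about fixing the random seed can be shown with a toy run. This is only a sketch: `toy_run` is a made-up stand-in for an evolutionary loop (random initial weights plus Gaussian mutation each generation), not a NEAT implementation. It assumes all randomness flows through one generator that is seeded once at the start.

```python
import random


def toy_run(seed, generations=5, pop_size=4):
    """Tiny stand-in for an evolutionary run.

    Every random draw goes through one generator seeded at the start,
    so the whole trajectory is determined by `seed`.
    """
    rng = random.Random(seed)
    population = [rng.uniform(-1.0, 1.0) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # Mutate every individual, then log the best value this generation.
        population = [w + rng.gauss(0.0, 0.1) for w in population]
        history.append(max(population))
    return history


first = toy_run(seed=42)
replay = toy_run(seed=42)
other = toy_run(seed=43)

print("same seed reproduces the run:", first == replay)
print("different seed diverges:", first != other)
```

Logging the seed next to the results is what makes an interesting run replayable later; libraries that draw from other random sources need those seeded as well.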
And it's incredibly valuable 327 00:17:49.640 --> 00:17:53.839 to visually inspect the topology of the final evolved ANNs, 328 00:17:54.359 --> 00:17:56.759 like when tackling that modular retina problem with 329 00:17:56.960 --> 00:18:00.559 ES-HyperNEAT. Actually seeing the evolved modular structures in 330 00:18:00.599 --> 00:18:04.160 the network diagram confirms the algorithm worked as intended. It 331 00:18:04.200 --> 00:18:07.079 gives you intuition you can't get from numbers alone. 332 00:18:06.799 --> 00:18:10.000 Right, seeing is believing sometimes. And finally, how do 333 00:18:10.039 --> 00:18:12.880 you know if your evolved solution is genuinely good? Not 334 00:18:13.000 --> 00:18:15.079 just "it worked," but how well did it work? What 335 00:18:15.119 --> 00:18:16.240 metrics should we look at? 336 00:18:16.400 --> 00:18:18.799 Yeah, don't just rely on one single success metric like 337 00:18:18.880 --> 00:18:22.759 raw fitness or just accuracy, especially for classification tasks. Get 338 00:18:22.759 --> 00:18:25.759 familiar with things like precision, recall, the F1 score, 339 00:18:26.200 --> 00:18:30.079 ROC AUC, that's the receiver operating characteristic area under the curve, 340 00:18:30.519 --> 00:18:32.720 and of course overall accuracy. They paint a much 341 00:18:32.799 --> 00:18:36.039 richer picture of performance. And for actually implementing this stuff, 342 00:18:36.079 --> 00:18:39.519 there are several good Python libraries out there. NEAT-Python 343 00:18:39.599 --> 00:18:42.440 is stable, well documented for standard NEAT. It's in maintenance 344 00:18:42.440 --> 00:18:45.119 mode now, maybe a bit slower. MultiNEAT is probably 345 00:18:45.119 --> 00:18:47.400 the most versatile right now. It does NEAT, HyperNEAT, 346 00:18:47.519 --> 00:18:50.240 ES-HyperNEAT, even novelty search. It has a C++ 347 00:18:50.279 --> 00:18:53.920 core, so it's fast, and it has decent visualization support. 
348 00:18:54.319 --> 00:18:58.480 Then there's Deep Neuroevolution from Uber AI Labs, built on TensorFlow, 349 00:18:58.559 --> 00:19:02.359 specifically for those big deep neural networks on GPUs. Choosing 350 00:19:02.400 --> 00:19:04.599 the right one really depends on your specific problem, what 351 00:19:04.680 --> 00:19:08.400 features you need. And one last tip: always use isolated 352 00:19:08.480 --> 00:19:11.599 virtual Python environments for each project, things like Anaconda or 353 00:19:11.680 --> 00:19:14.640 venv. It saves so many headaches with dependencies. 354 00:19:14.799 --> 00:19:17.799 What an absolutely incredible journey through neuroevolution. I mean, from 355 00:19:17.920 --> 00:19:22.720 mimicking a single neuron to evolving these complex networks that 356 00:19:22.799 --> 00:19:26.480 play Atari, navigate mazes, even figure out their own learning goals. 357 00:19:26.559 --> 00:19:28.880 It's really a testament to the power of looking at 358 00:19:28.880 --> 00:19:32.039 the natural world for inspiration to solve some really tough 359 00:19:32.039 --> 00:19:32.720 AI problems. 360 00:19:32.839 --> 00:19:35.880 It truly does redefine how we think about intelligence emerging, 361 00:19:35.880 --> 00:19:39.240 doesn't it? That core idea that complexity and really optimal 362 00:19:39.279 --> 00:19:42.880 solutions can arise not from meticulous, top-down design, but 363 00:19:42.920 --> 00:19:47.000 from this iterative, messy, nature-inspired evolutionary process. It's just 364 00:19:47.279 --> 00:19:50.119 profoundly powerful and really challenges us to think differently about 365 00:19:50.359 --> 00:19:51.799 building intelligent systems. 366 00:19:51.960 --> 00:19:54.200 So as we keep pushing the boundaries of AI, it 367 00:19:54.240 --> 00:19:58.839 makes you wonder, right, what other unconventional approaches, maybe hiding 368 00:19:58.839 --> 00:20:01.880 in plain sight in biology, might unlock that next level. 
And 369 00:20:01.960 --> 00:20:05.279 how might you, the listener, apply this mindset, this idea 370 00:20:05.359 --> 00:20:09.000 of evolving, adapting, maybe even coevolving solutions, in your 371 00:20:09.039 --> 00:20:12.720 own projects or just in how you approach problem solving generally? 372 00:20:13.319 --> 00:20:16.000 If you are eager to dive deeper, we definitely recommend 373 00:20:16.079 --> 00:20:18.759 exploring the work from Uber AI Labs, checking out the 374 00:20:18.759 --> 00:20:22.759 International Society for Artificial Life, that's alife dot org. There 375 00:20:22.759 --> 00:20:25.200 are great discussions on open-ended evolution on Reddit. The 376 00:20:25.319 --> 00:20:29.039 NEAT Software Catalog lists implementations, arxiv dot org always has 377 00:20:29.079 --> 00:20:30.960 cutting-edge papers, and of course go back to the 378 00:20:30.960 --> 00:20:34.240 source, Kenneth O. Stanley's original PhD dissertation on the NEAT 379 00:20:34.279 --> 00:20:37.240 algorithm itself. There's always, always more to learn.