WEBVTT

1
00:00:00.120 --> 00:00:04.320
<v Speaker 1>What if the key to building really intelligent AI isn't

2
00:00:04.360 --> 00:00:07.639
<v Speaker 1>about meticulously optimizing every single parameter? What if it's more

3
00:00:07.639 --> 00:00:12.000
<v Speaker 1>about letting it evolve? Imagine maybe a shortcut to sophisticated AI,

4
00:00:12.400 --> 00:00:14.839
<v Speaker 1>one that sidesteps some of the usual complexities, maybe

5
00:00:14.960 --> 00:00:18.440
<v Speaker 1>taking a page straight from nature's playbook. Today, we're embarking

6
00:00:18.440 --> 00:00:21.760
<v Speaker 1>on a deep dive into neuroevolution. It's this fascinating family

7
00:00:21.760 --> 00:00:25.160
<v Speaker 1>of machine learning methods that use evolutionary algorithms to build, well,

8
00:00:25.359 --> 00:00:28.920
<v Speaker 1>high-performing artificial neural networks. Our mission is to unpack

9
00:00:29.000 --> 00:00:32.399
<v Speaker 1>this powerful alternative to conventional deep learning. We want to

10
00:00:32.439 --> 00:00:35.920
<v Speaker 1>reveal how it's used for complex tasks like games, robotics,

11
00:00:36.079 --> 00:00:39.000
<v Speaker 1>and how it delivers sometimes surprisingly energy-efficient, kind of

12
00:00:39.119 --> 00:00:42.280
<v Speaker 1>elegant solutions. You'll get insights from the core concepts right

13
00:00:42.320 --> 00:00:46.039
<v Speaker 1>through to some really surprising real world applications, all distilled

14
00:00:46.079 --> 00:00:49.600
<v Speaker 1>from Hands-On Neuroevolution with Python. So let's explore this

15
00:00:49.679 --> 00:00:52.079
<v Speaker 1>alternative path to AI. I mean, many of us know

16
00:00:52.119 --> 00:00:56.000
<v Speaker 1>about AI learning from massive data sets and complex calculations, but

17
00:00:56.039 --> 00:00:59.439
<v Speaker 1>neuroevolution offers this really different, almost organic approach.

18
00:00:59.560 --> 00:01:02.840
<v Speaker 2>It's really fascinating, isn't it, how directly it draws inspiration

19
00:01:02.920 --> 00:01:07.000
<v Speaker 2>from natural selection. Instead of, you know, explicitly programming the

20
00:01:07.040 --> 00:01:11.719
<v Speaker 2>perfect network, you're essentially cultivating a population of networks. You

21
00:01:11.799 --> 00:01:14.719
<v Speaker 2>let the fittest survive and reproduce, and that leads to

22
00:01:14.840 --> 00:01:18.879
<v Speaker 2>increasingly complex, more optimal solutions over generations. It's a very

23
00:01:18.879 --> 00:01:19.760
<v Speaker 2>different way of thinking.

24
00:01:20.159 --> 00:01:22.079
<v Speaker 1>So it all starts at the brain, doesn't it? Our

25
00:01:22.120 --> 00:01:26.319
<v Speaker 1>own brains, these incredibly complex graphs of nodes and links.

26
00:01:26.760 --> 00:01:31.319
<v Speaker 1>Early AI ambitions were sort of to imitate that directly, right,

27
00:01:31.359 --> 00:01:35.319
<v Speaker 1>hoping for artificial general intelligence. We're still, well, working towards that,

28
00:01:35.359 --> 00:01:38.599
<v Speaker 1>but neuroevolution is helping us build some powerful narrow AI

29
00:01:38.680 --> 00:01:39.519
<v Speaker 1>agents right now.

30
00:01:39.680 --> 00:01:44.519
<v Speaker 2>Indeed, artificial neural networks, ANNs, they're universal approximators. Theoretically,

31
00:01:44.560 --> 00:01:46.959
<v Speaker 2>they can approximate any function, but the real challenge

32
00:01:47.000 --> 00:01:48.680
<v Speaker 2>is how you train them. How do you select the

33
00:01:48.719 --> 00:01:51.799
<v Speaker 2>right weight values for all those connections? Do you, like,

34
00:01:51.920 --> 00:01:56.280
<v Speaker 2>meticulously adjust weights with methods like gradient descent, or do

35
00:01:56.319 --> 00:01:59.439
<v Speaker 2>you let them evolve? Neuroevolution takes that second path.

36
00:01:59.120 --> 00:02:03.319
<v Speaker 1>The evolutionary path. So a foundational algorithm here is NEAT,

37
00:02:03.439 --> 00:02:07.799
<v Speaker 1>NeuroEvolution of Augmenting Topologies. What's the core idea there? What

38
00:02:07.920 --> 00:02:11.840
<v Speaker 1>makes it revolutionary, especially in how it deals with complexity?

39
00:02:12.280 --> 00:02:16.120
<v Speaker 2>Well, the big breakthrough is its complexification strategy. It starts simple.

40
00:02:16.360 --> 00:02:19.319
<v Speaker 2>It reduces the huge parameter search space by beginning with

41
00:02:19.400 --> 00:02:24.400
<v Speaker 2>tiny, simple seed genomes: just inputs, outputs, maybe a bias neuron,

42
00:02:24.759 --> 00:02:28.000
<v Speaker 2>no hidden nodes at first. Then generation by generation it

43
00:02:28.000 --> 00:02:31.800
<v Speaker 2>introduces additional genes. It expands the solution space incrementally. This

44
00:02:31.840 --> 00:02:34.639
<v Speaker 2>mirrors natural evolution, you know, where new genes sometimes

45
00:02:34.639 --> 00:02:37.479
<v Speaker 2>add complexity. It's way more efficient than trying to search

46
00:02:37.479 --> 00:02:38.960
<v Speaker 2>a massive space from the get go.

47
00:02:39.240 --> 00:02:41.680
<v Speaker 1>Okay, so if they're evolving, how do they reproduce and

48
00:02:41.759 --> 00:02:43.800
<v Speaker 1>mutate in a way that lets them get more complex?

49
00:02:43.919 --> 00:02:46.199
<v Speaker 1>Is it like classic genetic algorithms?

50
00:02:46.280 --> 00:02:50.599
<v Speaker 2>It is, yeah. Neuroevolution uses those genetic operators. Mutation can

51
00:02:50.639 --> 00:02:53.840
<v Speaker 2>be simple things: flipping bits, changing values in the genome,

52
00:02:53.879 --> 00:02:57.360
<v Speaker 2>altering existing connections. But where NEAT gets really clever is

53
00:02:57.360 --> 00:03:01.360
<v Speaker 2>with structural mutations, actually adding new connections or even entirely new

54
00:03:01.400 --> 00:03:03.800
<v Speaker 2>nodes to the network's architecture itself.

55
00:03:04.039 --> 00:03:08.159
<v Speaker 1>Huh. But if their structures are constantly changing, growing independently,

56
00:03:09.000 --> 00:03:12.039
<v Speaker 1>how do you combine two different networks during reproduction? Doesn't

57
00:03:12.039 --> 00:03:14.680
<v Speaker 1>that get messy? How do you match things up?

58
00:03:14.960 --> 00:03:18.759
<v Speaker 2>That's a really important question, and NEAT solves it brilliantly

59
00:03:18.800 --> 00:03:22.439
<v Speaker 2>with the innovation number. Every new gene, a connection, or

60
00:03:22.479 --> 00:03:26.680
<v Speaker 2>a node introduced by mutation gets a unique, globally incrementing

61
00:03:26.759 --> 00:03:31.639
<v Speaker 2>number across the whole evolutionary run. During crossover, these numbers

62
00:03:31.639 --> 00:03:35.080
<v Speaker 2>act like genetic IDs. They let the algorithm precisely align

63
00:03:35.199 --> 00:03:38.280
<v Speaker 2>corresponding genes from two parents, even if their structures look

64
00:03:38.360 --> 00:03:41.159
<v Speaker 2>quite different. Any genes that don't match up, disjoint or

65
00:03:41.199 --> 00:03:43.960
<v Speaker 2>excess ones, are just added unconditionally to the offspring.

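NOTE
A minimal sketch of the gene alignment just described, in Python. The names
ConnectionGene and crossover are hypothetical, not taken from any library.
It follows the conversation in copying every unmatched gene into the offspring;
the original NEAT paper instead takes disjoint and excess genes from the fitter parent.
```python
import random
from dataclasses import dataclass
@dataclass(frozen=True)
class ConnectionGene:
    innovation: int  # global historical marker, assigned when the gene first appeared
    src: int
    dst: int
    weight: float
def crossover(parent_a, parent_b, rng=random):
    """Align two genomes (lists of ConnectionGene) by innovation number."""
    genes_a = {g.innovation: g for g in parent_a}
    genes_b = {g.innovation: g for g in parent_b}
    child = []
    for innovation in sorted(genes_a.keys() | genes_b.keys()):
        gene_a, gene_b = genes_a.get(innovation), genes_b.get(innovation)
        if gene_a is not None and gene_b is not None:
            child.append(rng.choice((gene_a, gene_b)))  # matching gene: either parent's copy
        else:
            child.append(gene_a if gene_a is not None else gene_b)  # disjoint or excess gene
    return child
```
Passing a seeded random.Random instance as rng makes the choice of matching genes repeatable.
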
66
00:03:44.080 --> 00:03:46.280
<v Speaker 1>Okay, I think I follow that. But what if a new,

67
00:03:46.479 --> 00:03:50.120
<v Speaker 1>more complex structure is like temporarily less fit than a

68
00:03:50.159 --> 00:03:53.439
<v Speaker 1>simpler one that's already pretty optimized. How do these potentially

69
00:03:53.520 --> 00:03:58.159
<v Speaker 1>groundbreaking innovations survive long enough to actually prove their worth?

70
00:03:58.520 --> 00:04:01.960
<v Speaker 2>Ah, that's where speciation comes in. It's directly inspired by how

71
00:04:02.039 --> 00:04:06.400
<v Speaker 2>species form in nature. Literally, in NEAT, the population gets

72
00:04:06.439 --> 00:04:09.599
<v Speaker 2>divided into species or niches based on how similar their

73
00:04:09.639 --> 00:04:14.120
<v Speaker 2>network structures, their topologies, are. Organisms within the same species

74
00:04:14.400 --> 00:04:17.079
<v Speaker 2>mainly compete and mate with each other. This is crucial.

75
00:04:17.480 --> 00:04:21.800
<v Speaker 2>It shields new, possibly brilliant, but currently underperforming topologies from

76
00:04:21.839 --> 00:04:25.360
<v Speaker 2>immediate negative pressure from the more established networks. It gives

77
00:04:25.360 --> 00:04:27.839
<v Speaker 2>them breathing room, lets them evolve within their niche until

78
00:04:27.879 --> 00:04:31.480
<v Speaker 2>they might become genuinely superior. It's all about cultivating diversity

79
00:04:31.519 --> 00:04:32.439
<v Speaker 2>for long term gain.

80
00:04:32.800 --> 00:04:36.079
<v Speaker 1>That's pretty neat. Okay, so NEAT sounds powerful, but I

81
00:04:36.120 --> 00:04:38.720
<v Speaker 1>can imagine a problem when you need a really big network,

82
00:04:39.000 --> 00:04:43.240
<v Speaker 1>like millions of connections for complex visual recognition. Directly encoding

83
00:04:43.279 --> 00:04:45.040
<v Speaker 1>every single connection must get unwieldy, right?

84
00:04:45.120 --> 00:04:48.040
<v Speaker 2>You're absolutely right. That's the big drawback of direct

85
00:04:48.120 --> 00:04:51.120
<v Speaker 2>encoding for large-scale ANNs. As the network grows,

86
00:04:51.160 --> 00:04:54.680
<v Speaker 2>the genome just balloons. It becomes computationally expensive, hard to manage.

87
00:04:55.040 --> 00:04:59.680
<v Speaker 2>So researchers developed indirect encoding schemes, much more efficient.

88
00:05:00.000 --> 00:05:02.399
<v Speaker 1>Okay, and here's where it gets, I think, really ingenious:

89
00:05:03.000 --> 00:05:06.720
<v Speaker 1>HyperNEAT. It uses something called a compositional pattern producing

90
00:05:06.759 --> 00:05:09.360
<v Speaker 1>network, a CPPN. What exactly is that? What does it

91
00:05:09.439 --> 00:05:10.160
<v Speaker 1>let you do?

92
00:05:10.160 --> 00:05:13.720
<v Speaker 2>Right, a CPPN. It's a specialized neural network itself. Its job

93
00:05:13.759 --> 00:05:16.800
<v Speaker 2>is to represent the connectivity patterns of another network, the

94
00:05:16.839 --> 00:05:19.879
<v Speaker 2>main one you want to build, the phenotype ANN, as

95
00:05:19.920 --> 00:05:22.439
<v Speaker 2>a function of its geometry. Think of it like a

96
00:05:22.439 --> 00:05:25.600
<v Speaker 2>master blueprint, a compact set of rules for building a

97
00:05:25.639 --> 00:05:29.560
<v Speaker 2>complex structure. This connectivity pattern is often visualized as a

98
00:05:29.600 --> 00:05:32.680
<v Speaker 2>kind of high-dimensional space, like a grid. Each point

99
00:05:32.680 --> 00:05:35.240
<v Speaker 2>on the grid tells you if and how strongly two

100
00:05:35.279 --> 00:05:39.160
<v Speaker 2>specific nodes in the main ANN should connect. The CPPN

101
00:05:39.399 --> 00:05:41.959
<v Speaker 2>takes the coordinates of these nodes as input, and it

102
00:05:42.000 --> 00:05:45.360
<v Speaker 2>outputs the connection weight. If the weight's below a certain threshold,

103
00:05:45.360 --> 00:05:47.040
<v Speaker 2>well, no connection gets made.

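NOTE
A rough sketch of how a CPPN can be queried to lay out a larger network, in Python.
Here toy_cppn is a hand-written, hypothetical stand-in for an evolved CPPN, and the
threshold of 0.2 and the use of the weight's magnitude are assumptions for illustration.
The same function is queried at two grid resolutions without any retraining.
```python
import math
from itertools import product
def toy_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved CPPN: node coordinates in, weight out."""
    return math.sin(2.0 * (x1 - x2)) * math.cos(y1 - y2)
def build_connections(sources, targets, cppn, threshold=0.2):
    """Query the CPPN for every source/target pair and keep the strong weights."""
    connections = {}
    for src, dst in product(sources, targets):
        weight = cppn(*src, *dst)
        if abs(weight) > threshold:  # a weak output means no connection is made
            connections[(src, dst)] = weight
    return connections
def grid(n, y):
    """Place n evenly spaced nodes on [-1, 1] at height y."""
    return [(-1.0 + 2.0 * i / (n - 1), y) for i in range(n)]
coarse = build_connections(grid(3, -1.0), grid(3, 1.0), toy_cppn)
fine = build_connections(grid(9, -1.0), grid(9, 1.0), toy_cppn)
```
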
104
00:05:47.199 --> 00:05:51.319
<v Speaker 1>Whoa. So one small CPPN can basically act as a

105
00:05:51.360 --> 00:05:55.600
<v Speaker 1>compressed set of instructions, a blueprint, for a potentially massive ANN.

106
00:05:55.879 --> 00:05:57.319
<v Speaker 1>That sounds incredibly efficient.

107
00:05:57.680 --> 00:06:02.439
<v Speaker 2>It allows for remarkable information compression, seriously remarkable. There is

108
00:06:02.480 --> 00:06:06.480
<v Speaker 2>this visual discrimination task, for instance, where a CPPN with

109
00:06:06.519 --> 00:06:10.199
<v Speaker 2>only, like, sixteen connections defined the patterns for a main

110
00:06:10.399 --> 00:06:14.120
<v Speaker 2>ANN with almost fifteen thousand connections. That's a

111
00:06:14.160 --> 00:06:17.000
<v Speaker 2>compression ratio of, what, about 0.11 percent.

112
00:06:17.120 --> 00:06:17.399
<v Speaker 1>Wow.

113
00:06:17.879 --> 00:06:20.360
<v Speaker 2>And what this practically means for you, the listener, is

114
00:06:20.399 --> 00:06:24.360
<v Speaker 2>potentially much more energy-efficient AI. You can deploy powerful

115
00:06:24.360 --> 00:06:27.040
<v Speaker 2>models where traditional deep learning is just too big or

116
00:06:27.079 --> 00:06:30.759
<v Speaker 2>power hungry. Think edge devices. Plus, it often lets you

117
00:06:30.800 --> 00:06:34.120
<v Speaker 2>generate solutions at different resolutions without retraining.

118
00:06:34.439 --> 00:06:37.560
<v Speaker 1>That's a huge leap. But okay, HyperNEAT sounds powerful, but

119
00:06:37.600 --> 00:06:40.560
<v Speaker 1>if the CPPN is the blueprint, someone still has to

120
00:06:40.600 --> 00:06:42.720
<v Speaker 1>decide where the bricks go, right? Someone has to define

121
00:06:42.720 --> 00:06:44.439
<v Speaker 1>the layout of the nodes in the final network.

122
00:06:44.439 --> 00:06:48.040
<v Speaker 2>You've hit its main limitation exactly. The human experimenter still

123
00:06:48.040 --> 00:06:51.480
<v Speaker 2>defines the layout of the phenotype ANN's nodes, the substrate,

124
00:06:51.519 --> 00:06:53.120
<v Speaker 2>we call it, right at the start. If you make

125
00:06:53.120 --> 00:06:55.680
<v Speaker 2>a bad assumption about that layout, performance can suffer. So

126
00:06:56.160 --> 00:06:59.480
<v Speaker 2>ES-HyperNEAT, or evolvable-substrate HyperNEAT, tackles this. It

127
00:06:59.519 --> 00:07:01.639
<v Speaker 2>introduces an evolvable substrate.

128
00:07:01.240 --> 00:07:04.120
<v Speaker 1>Hold on, so the layout of the network itself, that

129
00:07:04.160 --> 00:07:06.680
<v Speaker 1>evolves automatically too? That's really next level.

130
00:07:06.720 --> 00:07:10.199
<v Speaker 2>Precisely, it figures out where information seems to be flowing

131
00:07:10.279 --> 00:07:14.600
<v Speaker 2>most intensely within the potential connection space. It uses techniques

132
00:07:14.680 --> 00:07:19.079
<v Speaker 2>like quadtree information extraction, basically clever ways to divide

133
00:07:19.120 --> 00:07:21.759
<v Speaker 2>up the space and focus effort where needed, and then

134
00:07:21.800 --> 00:07:25.439
<v Speaker 2>it automatically puts more hidden nodes in those high intensity regions,

135
00:07:25.759 --> 00:07:27.959
<v Speaker 2>so the system learns not just the connections, but where

136
00:07:27.959 --> 00:07:30.560
<v Speaker 2>to put the nodes for the best representation. It allows

137
00:07:30.639 --> 00:07:34.680
<v Speaker 2>automatic hidden node placement, easier modular networks, and it can

138
00:07:34.720 --> 00:07:38.560
<v Speaker 2>elaborate the structure, adding nodes and connections during evolution, which

139
00:07:38.639 --> 00:07:40.439
<v Speaker 2>basic HyperNEAT doesn't really do.

140
00:07:40.639 --> 00:07:44.399
<v Speaker 1>Okay, let's shift gears a bit. Most optimization algorithms, including

141
00:07:44.399 --> 00:07:46.839
<v Speaker 1>a lot of evolutionary ones, they try to get closer

142
00:07:46.839 --> 00:07:49.480
<v Speaker 1>and closer to a goal, right? They reward progress towards

143
00:07:49.480 --> 00:07:52.120
<v Speaker 1>some objective. But what happens if the best path to

144
00:07:52.160 --> 00:07:55.079
<v Speaker 1>that goal involves, I don't know, temporarily moving away from it,

145
00:07:55.319 --> 00:07:57.439
<v Speaker 1>or if there are dead ends that look promising. That

146
00:07:57.560 --> 00:07:59.120
<v Speaker 1>sounds like a fundamental problem.

147
00:07:59.279 --> 00:08:03.079
<v Speaker 2>It is. It's the classic local optima trap. Imagine a

148
00:08:03.120 --> 00:08:06.639
<v Speaker 2>maze where the shortest path out actually requires you to walk

149
00:08:06.639 --> 00:08:09.319
<v Speaker 2>away from the exit for a bit first. A simple

150
00:08:09.399 --> 00:08:12.800
<v Speaker 2>goal-oriented search, one that just rewards getting closer, might

151
00:08:12.839 --> 00:08:15.079
<v Speaker 2>walk into a dead end, a cul-de-sac that

152
00:08:15.120 --> 00:08:17.879
<v Speaker 2>seems close to the exit but offers no way forward.

153
00:08:18.040 --> 00:08:21.279
<v Speaker 2>The algorithm gets stuck. It converges to a local champion,

154
00:08:21.439 --> 00:08:22.879
<v Speaker 2>not the true best solution.

155
00:08:23.399 --> 00:08:27.680
<v Speaker 1>Okay, So if that goal focused approach gets stuck, what's

156
00:08:27.720 --> 00:08:32.080
<v Speaker 1>the alternative? How does neuroevolution break free from these deceptive landscapes.

157
00:08:32.600 --> 00:08:35.799
<v Speaker 2>That's where novelty search, or NS, comes in, and the

158
00:08:35.840 --> 00:08:38.759
<v Speaker 2>core idea is really counterintuitive, almost zen-like. The objective

159
00:08:38.799 --> 00:08:41.600
<v Speaker 2>function isn't proximity to a goal. It's defined by the

160
00:08:41.600 --> 00:08:44.200
<v Speaker 2>novelty of the behavior shown by the agent. It actively

161
00:08:44.240 --> 00:08:47.960
<v Speaker 2>rewards doing something different. It drives evolution towards diversity of behavior.

162
00:08:48.039 --> 00:08:50.960
<v Speaker 1>Wait, you're saying it just wanders around exploring, hoping to

163
00:08:51.000 --> 00:08:53.919
<v Speaker 1>stumble onto the solution by accident. That feels indirect.

164
00:08:54.159 --> 00:08:57.399
<v Speaker 2>It's more sophisticated than just random wandering. There's a novelty metric.

165
00:08:57.960 --> 00:09:01.080
<v Speaker 2>Often it's measured as like the average distance of an

166
00:09:01.120 --> 00:09:05.679
<v Speaker 2>individual's behavior to its k nearest neighbors in some abstract

167
00:09:05.720 --> 00:09:09.360
<v Speaker 2>behavioral space. If you're doing something unique, far from what

168
00:09:09.440 --> 00:09:12.320
<v Speaker 2>others are doing, you get a high novelty score. You're rewarded.

169
00:09:12.840 --> 00:09:16.679
<v Speaker 2>This encourages divergent evolution. It forces the population to spread out,

170
00:09:16.960 --> 00:09:20.039
<v Speaker 2>explore the whole space, not just clump together in one

171
00:09:20.159 --> 00:09:23.519
<v Speaker 2>seemingly good spot. And here's the really wild part. For

172
00:09:23.600 --> 00:09:27.279
<v Speaker 2>certain tricky, deceptive problems, novelty search can actually find solutions

173
00:09:27.600 --> 00:09:31.679
<v Speaker 2>faster than traditional objective-based search. It forces exploration that

174
00:09:31.720 --> 00:09:32.720
<v Speaker 2>goal seeking misses.

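NOTE
A small sketch of the novelty metric described here: the average distance from one
behavior to its k nearest neighbors. The function name, the choice of final (x, y)
positions as the behavior, and k=2 are assumptions for illustration. Real novelty
search setups usually also compare against an archive of past behaviors, omitted here.
```python
import math
def novelty_score(behavior, others, k=3):
    """Average Euclidean distance from `behavior` to its k nearest neighbors."""
    distances = sorted(math.dist(behavior, other) for other in others)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)
# Hypothetical final positions of five maze agents; the last one ended up far away.
population = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
scores = [
    novelty_score(b, population[:i] + population[i + 1:], k=2)
    for i, b in enumerate(population)
]
most_novel = population[scores.index(max(scores))]
```
The agent that wandered off on its own gets the highest score, which is what pushes the population to spread out.
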
175
00:09:33.039 --> 00:09:37.120
<v Speaker 1>Okay, wow, so we've covered the mechanics, these cool complexification strategies,

176
00:09:37.159 --> 00:09:40.480
<v Speaker 1>ways to handle scale, even this idea of rewarding novelty.

177
00:09:40.960 --> 00:09:43.200
<v Speaker 1>Let's see how this all plays out in practice. How

178
00:09:43.240 --> 00:09:46.759
<v Speaker 1>does neuroevolution tackle some real challenges, from classic problems to

179
00:09:47.480 --> 00:09:50.639
<v Speaker 1>complex games, even evolving its own goals. Let's start simple.

180
00:09:50.759 --> 00:09:54.840
<v Speaker 1>Maybe the XOR problem. Sounds basic, but it's notoriously tricky for

181
00:09:54.879 --> 00:09:58.159
<v Speaker 1>simple networks because it's not linearly separable. How does NEAT

182
00:09:58.200 --> 00:09:58.639
<v Speaker 1>handle that?

183
00:09:58.960 --> 00:10:02.879
<v Speaker 2>Right, XOR. A basic ANN, no hidden layers, just can't

184
00:10:02.919 --> 00:10:06.519
<v Speaker 2>crack it. But NEAT, starting super simple, two inputs, one

185
00:10:06.559 --> 00:10:10.639
<v Speaker 2>output, consistently evolves the necessary structure. It adds that crucial

186
00:10:10.720 --> 00:10:14.679
<v Speaker 2>hidden node. It perfectly demonstrates NEAT's power to grow the

187
00:10:14.720 --> 00:10:18.759
<v Speaker 2>complexity it needs and avoid those traps that stump fixed networks.

188
00:10:19.360 --> 00:10:22.240
<v Speaker 2>For XOR, fitness is usually calculated based on how close

189
00:10:22.279 --> 00:10:25.000
<v Speaker 2>the output is to the correct zero or one for

190
00:10:25.039 --> 00:10:28.399
<v Speaker 2>all four input patterns. Get close enough, like fifteen point

191
00:10:28.399 --> 00:10:30.080
<v Speaker 2>five out of sixteen, and you've solved it.

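NOTE
One common way to get a score out of sixteen for XOR, sketched in Python. The formula
(4 minus the total absolute error, squared) is an assumption inferred from the maximum
of sixteen mentioned here; the speakers do not spell it out. The predict callables are
hypothetical stand-ins for evolved networks.
```python
XOR_CASES = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
def xor_fitness(predict):
    """Return (4 - total absolute error) squared; a perfect network scores 16."""
    total_error = sum(abs(expected - predict(a, b)) for (a, b), expected in XOR_CASES)
    return (4.0 - total_error) ** 2
def perfect(a, b):
    """Stand-in for a fully evolved solution."""
    return float(a != b)
def constant_half(a, b):
    """Stand-in for a network that always outputs 0.5."""
    return 0.5
SOLVED_THRESHOLD = 15.5  # the "close enough" cutoff mentioned in the conversation
```
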
192
00:10:30.159 --> 00:10:34.000
<v Speaker 1>Okay, makes sense. Moving to something more dynamic: balancing a

193
00:10:34.039 --> 00:10:36.759
<v Speaker 1>pole on a cart. That's a real classic in reinforcement learning, isn't it?

194
00:10:36.759 --> 00:10:39.480
<v Speaker 2>Absolutely. The single-pole balancing task, it's an

195
00:10:39.519 --> 00:10:43.519
<v Speaker 2>avoidance control problem. The ANN gets inputs: cart position, velocity,

196
00:10:43.559 --> 00:10:46.480
<v Speaker 2>pole angle, its angular velocity, all scaled nicely, and then

197
00:10:46.519 --> 00:10:48.960
<v Speaker 2>it just outputs a simple action push left or push right.

198
00:10:49.240 --> 00:10:51.360
<v Speaker 2>Fitness is just how long it keeps the pole balanced,

199
00:10:51.480 --> 00:10:54.120
<v Speaker 2>often measured in time steps, maybe up to hundreds of thousands,

200
00:10:54.159 --> 00:10:56.759
<v Speaker 2>and the physics underneath are often simulated using something like

201
00:10:56.759 --> 00:10:58.600
<v Speaker 2>a Runge-Kutta method to keep it accurate.

202
00:10:58.679 --> 00:11:01.519
<v Speaker 1>And then you mentioned trying a double pole balancing problem.

203
00:11:01.559 --> 00:11:02.639
<v Speaker 1>That sounds way harder. Two poles?

204
00:11:02.720 --> 00:11:06.919
<v Speaker 2>Oh, much harder. Two poles, often different lengths,

205
00:11:07.000 --> 00:11:10.240
<v Speaker 2>on the same cart, more state variables, much more complex

206
00:11:10.240 --> 00:11:14.600
<v Speaker 2>physics involved. That experiment really highlighted how important that speciation

207
00:11:14.720 --> 00:11:18.240
<v Speaker 2>thing is, finding the right balance of species diversity. Too

208
00:11:18.320 --> 00:11:20.919
<v Speaker 2>many species and they become too small. Maybe they don't

209
00:11:20.960 --> 00:11:24.919
<v Speaker 2>evolve fast enough. Too few and you stifle innovation. It also

210
00:11:24.960 --> 00:11:27.480
<v Speaker 2>really showed how sensitive things can be to the initial

211
00:11:27.559 --> 00:11:29.720
<v Speaker 2>random seed. Sometimes you just need a bit of luck

212
00:11:29.720 --> 00:11:31.440
<v Speaker 2>in that initial population setup.

213
00:11:31.480 --> 00:11:35.440
<v Speaker 1>Right, the starting conditions matter. Okay, mazes. They're great test beds

214
00:11:35.480 --> 00:11:40.000
<v Speaker 1>for autonomous agents. How does neuroevolution do with, say, a

215
00:11:40.120 --> 00:11:43.200
<v Speaker 1>robot navigating a maze, avoiding walls, finding an exit.

216
00:11:43.639 --> 00:11:46.720
<v Speaker 2>Mazes are fascinating because they often have those deceptive landscapes

217
00:11:46.759 --> 00:11:49.080
<v Speaker 2>we talked about: cul-de-sacs that look promising but

218
00:11:49.080 --> 00:11:52.120
<v Speaker 2>are dead ends, local optima. If you just use a

219
00:11:52.159 --> 00:11:55.200
<v Speaker 2>goal-oriented fitness function, rewarding distance to the exit, agents

220
00:11:55.240 --> 00:11:57.480
<v Speaker 2>often get stuck. We saw this in experiments with a

221
00:11:57.480 --> 00:12:01.559
<v Speaker 2>hard maze configuration. Objective based search just failed. Agents got

222
00:12:01.559 --> 00:12:03.639
<v Speaker 2>trapped near the start or in those dead ends.

223
00:12:03.960 --> 00:12:06.919
<v Speaker 1>But what about novelty search? Did that make a difference

224
00:12:06.960 --> 00:12:10.240
<v Speaker 1>in the mazes? Could it actually beat the goal-focused approach there?

225
00:12:10.320 --> 00:12:13.519
<v Speaker 2>That's the key question, right? For a simple maze, NS

226
00:12:13.559 --> 00:12:17.320
<v Speaker 2>often found a solution faster and interestingly, often with a

227
00:12:17.360 --> 00:12:21.639
<v Speaker 2>simpler network topology, sometimes even needing no hidden nodes at all,

228
00:12:21.759 --> 00:12:25.679
<v Speaker 2>compared to the goal-oriented method. It consistently pushed agents

229
00:12:25.720 --> 00:12:29.559
<v Speaker 2>to explore more varied paths. Even for the really hard maze,

230
00:12:29.720 --> 00:12:33.120
<v Speaker 2>while the specific library implementation we used struggled to find

231
00:12:33.120 --> 00:12:36.360
<v Speaker 2>a perfect, winning solution, the results were far more promising

232
00:12:36.360 --> 00:12:39.720
<v Speaker 2>with novelty search. The exploration was much broader, much more

233
00:12:39.759 --> 00:12:43.559
<v Speaker 2>intelligent looking. It really shows that sometimes not aiming directly

234
00:12:43.600 --> 00:12:45.080
<v Speaker 2>at the goal is the best way to get there.

235
00:12:45.480 --> 00:12:47.960
<v Speaker 1>Okay, this next one, it sounds like pure science fiction:

236
00:12:48.360 --> 00:12:53.039
<v Speaker 1>coevolution. Two AI populations evolving together, influencing each other.

237
00:12:53.120 --> 00:12:56.519
<v Speaker 2>Yeah, it's a really advanced concept inspired by biological ideas

238
00:12:56.519 --> 00:12:59.759
<v Speaker 2>like commensalism, where one species benefits without affecting the other

239
00:12:59.840 --> 00:13:03.679
<v Speaker 2>one much. The method, called SAFE, involves two populations evolving

240
00:13:03.720 --> 00:13:06.759
<v Speaker 2>side by side, one population of maze-solving agents and

241
00:13:06.799 --> 00:13:10.840
<v Speaker 2>another population of well objective function candidates.

242
00:13:10.399 --> 00:13:14.360
<v Speaker 1>Wait, objective function candidates? So the maze solver's fitness isn't

243
00:13:14.399 --> 00:13:16.679
<v Speaker 1>just about reaching the exit anymore?

244
00:13:16.759 --> 00:13:19.360
<v Speaker 2>Exactly. That's where it gets really clever. The maze solver's fitness

245
00:13:19.399 --> 00:13:22.039
<v Speaker 2>is a combination of two things. One, its distance to

246
00:13:22.080 --> 00:13:25.480
<v Speaker 2>the exit, that's the objective part, and two, the novelty

247
00:13:25.519 --> 00:13:28.799
<v Speaker 2>of its final position, the behavioral novelty part. But here's

248
00:13:28.840 --> 00:13:32.720
<v Speaker 2>the crucial twist. The weights used to combine these two scores,

249
00:13:33.159 --> 00:13:35.759
<v Speaker 2>they come as outputs from an individual in the other

250
00:13:35.919 --> 00:13:40.440
<v Speaker 2>evolving population, the objective function candidates. So the system literally

251
00:13:40.480 --> 00:13:43.759
<v Speaker 2>evolved to find solutions for that hard maze where objective

252
00:13:43.759 --> 00:13:46.639
<v Speaker 2>search alone failed. It's like the AI is learning how

253
00:13:46.639 --> 00:13:50.679
<v Speaker 2>to define its own success criteria, dynamically shifting focus between

254
00:13:50.679 --> 00:13:51.799
<v Speaker 2>the goal and exploration.

255
00:13:51.919 --> 00:13:55.559
<v Speaker 1>That is wild. Okay. From mazes to video games. You mentioned

256
00:13:55.679 --> 00:13:59.120
<v Speaker 1>neuroevolution can train agents for classic Atari games. That usually

257
00:13:59.120 --> 00:14:02.879
<v Speaker 1>involves deep reinforcement learning like DQN, which is known for

258
00:14:02.919 --> 00:14:04.840
<v Speaker 1>being super computationally heavy.

259
00:14:05.039 --> 00:14:09.919
<v Speaker 2>Traditionally, yes. Deep RL methods like DQN use deep neural nets.

260
00:14:10.159 --> 00:14:14.480
<v Speaker 2>Gradient-based backpropagation needs serious GPU power for all those

261
00:14:14.480 --> 00:14:19.039
<v Speaker 2>matrix multiplications. Deep neuroevolution offers a different path. It can

262
00:14:19.080 --> 00:14:22.720
<v Speaker 2>approximate that Q-value function needed for reinforcement learning without

263
00:14:22.759 --> 00:14:24.600
<v Speaker 2>relying on error backpropagation at all.

264
00:14:24.759 --> 00:14:27.600
<v Speaker 1>No backpropagation? How on earth does it train those huge

265
00:14:27.639 --> 00:14:28.600
<v Speaker 1>deep neural networks, then?

266
00:14:28.600 --> 00:14:32.279
<v Speaker 2>Instead of backprop, it uses a pretty straightforward genetic algorithm

267
00:14:32.519 --> 00:14:36.519
<v Speaker 2>to evolve a population of potential network controllers. The genome

268
00:14:36.600 --> 00:14:40.240
<v Speaker 2>of each individual encodes all the trainable parameters, the millions

269
00:14:40.240 --> 00:14:43.120
<v Speaker 2>of connection weights of a deep neural network. For the

270
00:14:43.120 --> 00:14:46.159
<v Speaker 2>Frostbite Atari game, for instance, the agent learns just by

271
00:14:46.159 --> 00:14:49.279
<v Speaker 2>looking at the screen pixels. It uses a convolutional neural

272
00:14:49.320 --> 00:14:52.720
<v Speaker 2>network, a CNN, with something like four million parameters.

273
00:14:53.039 --> 00:14:55.960
<v Speaker 1>Four million parameters? How do you encode that efficiently in

274
00:14:56.000 --> 00:14:57.519
<v Speaker 1>a genome? That sounds massive.

275
00:14:57.960 --> 00:15:00.840
<v Speaker 2>This is another really clever bit of encoding. It uses

276
00:15:00.879 --> 00:15:04.799
<v Speaker 2>the seeds of a pseudorandom number generator. The genome isn't

277
00:15:04.799 --> 00:15:08.039
<v Speaker 2>the weights themselves, it's a list of these random seeds.

278
00:15:08.519 --> 00:15:11.519
<v Speaker 2>These seeds are then used sequentially to generate the entire

279
00:15:11.600 --> 00:15:15.480
<v Speaker 2>massive parameter vector for the network. So a relatively compact

280
00:15:15.519 --> 00:15:19.799
<v Speaker 2>list of seeds can define an incredibly complex high dimensional network.

281
00:15:20.240 --> 00:15:22.960
<v Speaker 2>GPU acceleration is still vital, mind you, because you have

282
00:15:23.000 --> 00:15:25.919
<v Speaker 2>to evaluate each agent, maybe running the game for twenty

283
00:15:25.960 --> 00:15:29.559
<v Speaker 2>thousand frames or more, but the learning mechanism itself is different.

284
00:15:29.600 --> 00:15:32.720
<v Speaker 2>It potentially avoids some of the complexities and instabilities of

285
00:15:32.799 --> 00:15:35.399
<v Speaker 2>gradient based methods for these huge RL problems.

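NOTE
A toy sketch of the seed-list genome idea described above, in plain Python. The names
decode and mutate, the Gaussian noise, and the sigma value are assumptions chosen for
illustration; the real experiments work on millions of parameters and run on GPUs.
The first seed initializes the weights and every later seed replays one mutation.
```python
import random
def decode(seeds, num_params, sigma=0.005):
    """Rebuild the full weight vector from a list of integer seeds."""
    init = random.Random(seeds[0])
    params = [init.gauss(0.0, 1.0) for _ in range(num_params)]  # initial weights
    for seed in seeds[1:]:
        noise = random.Random(seed)
        params = [p + sigma * noise.gauss(0.0, 1.0) for p in params]  # replay one mutation
    return params
def mutate(seeds, rng):
    """A child genome is the parent's seed list plus one freshly drawn seed."""
    return seeds + [rng.randrange(2**31)]
rng = random.Random(42)
parent = [7]
child = mutate(parent, rng)
weights = decode(child, num_params=1000)  # two integers stand in for 1000 weights
```
Because decoding is deterministic, only the short seed list has to be stored or sent between workers.
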
286
00:15:35.679 --> 00:15:42.279
<v Speaker 1>Amazing stuff. Okay, with all this complexity, evolving topologies, CPPNs, novelty, coevolution,

287
00:15:43.080 --> 00:15:45.360
<v Speaker 1>what are some practical tips for someone listening who actually

288
00:15:45.360 --> 00:15:47.679
<v Speaker 1>wants to build or experiment with these systems? Where should

289
00:15:47.679 --> 00:15:48.679
<v Speaker 1>they start? What's crucial?

290
00:15:49.000 --> 00:15:54.080
<v Speaker 2>Rule number one, always: careful problem analysis and really rigorous

291
00:15:54.159 --> 00:15:58.919
<v Speaker 2>data preprocessing. Neuroevolution is pretty robust, but numerical instability can

292
00:15:58.960 --> 00:16:03.600
<v Speaker 2>totally derail things. Input data needs attention, especially if different

293
00:16:03.600 --> 00:16:07.519
<v Speaker 2>features have vastly different scales, like differing by orders of magnitude.

294
00:16:07.679 --> 00:16:11.240
<v Speaker 2>You absolutely need to standardize it, zero mean, unit variance,

295
00:16:11.320 --> 00:16:14.000
<v Speaker 2>like with scikit-learn's StandardScaler, or scale it to

296
00:16:14.000 --> 00:16:16.720
<v Speaker 2>a specific range, maybe zero to one, using MinMaxScaler,

297
00:16:17.399 --> 00:16:19.799
<v Speaker 2>or normalize it. If you don't, the features with bigger

298
00:16:19.840 --> 00:16:22.360
<v Speaker 2>numbers will just dominate the learning process and you'll miss

299
00:16:22.399 --> 00:16:23.759
<v Speaker 2>subtle but important signals.

300
00:16:23.879 --> 00:16:26.720
<v Speaker 1>Got it, preprocessing first. And once the data is ready,

301
00:16:26.720 --> 00:16:28.799
<v Speaker 1>what about tuning the evolution itself? What are the key

302
00:16:28.840 --> 00:16:30.200
<v Speaker 1>dials we can turn right?

303
00:16:30.440 --> 00:16:34.600
<v Speaker 2>Tuning the evolutionary process, that's critical. Okay, so things seem stalled.

304
00:16:34.639 --> 00:16:38.639
<v Speaker 2>If fitness isn't improving, maybe try decreasing the NEAT survival threshold.

305
00:16:38.919 --> 00:16:42.720
<v Speaker 2>This makes selection stricter, only letting higher quality individuals reproduce.

306
00:16:43.120 --> 00:16:46.679
<v Speaker 2>You could also try increasing max stagnation. This gives species

307
00:16:46.720 --> 00:16:51.879
<v Speaker 2>more generations to potentially develop useful mutations before being considered stagnant.

308
00:16:52.080 --> 00:16:56.559
<v Speaker 2>But maybe start lower, like fifteen to twenty generations, for quicker turnover initially.

309
00:16:57.039 --> 00:16:59.159
<v Speaker 2>Keep an eye on the number of species. Usually somewhere

310
00:16:59.200 --> 00:17:01.799
<v Speaker 2>between five and twenty is a decent range. Too many

311
00:17:01.960 --> 00:17:04.640
<v Speaker 2>and they might be too small to evolve effectively. Too

312
00:17:04.720 --> 00:17:08.160
<v Speaker 2>few and you might kill off diversity too quickly. Population

313
00:17:08.279 --> 00:17:11.680
<v Speaker 2>size is a big one. Larger populations mean more initial diversity,

314
00:17:11.759 --> 00:17:15.279
<v Speaker 2>which is good, but obviously increases the computational cost per generation.

315
00:17:15.839 --> 00:17:18.160
<v Speaker 2>It's a trade-off. And please, please always put the

316
00:17:18.200 --> 00:17:20.440
<v Speaker 2>random seed value at the start of every run. If

317
00:17:20.480 --> 00:17:22.839
<v Speaker 2>you get an interesting result, you absolutely need that seed

318
00:17:22.839 --> 00:17:26.799
<v Speaker 2>to replicate the exact evolutionary path later for analysis, for debugging.

319
00:17:27.000 --> 00:17:27.720
<v Speaker 2>Super important.

320
00:17:27.799 --> 00:17:30.519
<v Speaker 1>That's a great practical tip. Okay, beyond just looking at

321
00:17:30.559 --> 00:17:33.839
<v Speaker 1>fitness scores going up, are there visual ways to understand

322
00:17:33.839 --> 00:17:36.039
<v Speaker 1>what's happening, how the evolution is progressing?

323
00:17:36.160 --> 00:17:39.400
<v Speaker 2>Oh, absolutely, visualization is crucial. Don't just look at numbers.

324
00:17:39.720 --> 00:17:42.759
<v Speaker 2>Use tools like Matplotlib or Seaborn to plot fitness

325
00:17:42.759 --> 00:17:45.960
<v Speaker 2>trends over generations. See how the best and average fitness

326
00:17:45.960 --> 00:17:49.640
<v Speaker 2>are changing, look at species counts. And it's incredibly valuable

327
00:17:49.640 --> 00:17:53.839
<v Speaker 2>to visually inspect the topology of the final evolved ANNs,

328
00:17:54.359 --> 00:17:56.759
<v Speaker 2>like when tackling that modular retina problem with the

329
00:17:56.960 --> 00:18:00.559
<v Speaker 2>ES-HyperNEAT. Actually seeing the evolved modular structures in

330
00:18:00.599 --> 00:18:04.160
<v Speaker 2>the network diagram confirms the algorithm worked as intended. It

331
00:18:04.200 --> 00:18:07.079
<v Speaker 2>gives you intuition you can't get from numbers alone.

332
00:18:06.799 --> 00:18:10.000
<v Speaker 1>Right, seeing is believing sometimes. And finally, how do

333
00:18:10.039 --> 00:18:12.880
<v Speaker 1>you know if your evolved solution is genuinely good? Not

334
00:18:13.000 --> 00:18:15.079
<v Speaker 1>just it worked, but how well did it work? What

335
00:18:15.119 --> 00:18:16.240
<v Speaker 1>metrics should we look at?

336
00:18:16.400 --> 00:18:18.799
<v Speaker 2>Yeah, don't just rely on one single success metric like

337
00:18:18.880 --> 00:18:22.759
<v Speaker 2>raw fitness or just accuracy, especially for classification tasks. Get

338
00:18:22.759 --> 00:18:25.759
<v Speaker 2>familiar with things like precision, recall, the F1 score,

339
00:18:26.200 --> 00:18:30.079
<v Speaker 2>ROC AUC, that's the receiver operating characteristic area under the curve,

340
00:18:30.519 --> 00:18:32.720
<v Speaker 2>and of course overall accuracy. They paint a much

341
00:18:32.799 --> 00:18:36.039
<v Speaker 2>richer picture of performance. And for actually implementing this stuff,

342
00:18:36.079 --> 00:18:39.519
<v Speaker 2>there are several good Python libraries out there. NEAT-Python

343
00:18:39.599 --> 00:18:42.440
<v Speaker 2>is stable and well documented for standard NEAT. It's in maintenance

344
00:18:42.440 --> 00:18:45.119
<v Speaker 2>mode now, maybe a bit slower. MultiNEAT is probably

345
00:18:45.119 --> 00:18:47.400
<v Speaker 2>the most versatile right now. It does NEAT, HyperNEAT,

346
00:18:47.519 --> 00:18:50.240
<v Speaker 2>ES-HyperNEAT, even novelty search. It has a C++

347
00:18:50.279 --> 00:18:53.920
<v Speaker 2>core, so it's fast, and has decent visualization support.

348
00:18:54.319 --> 00:18:58.480
<v Speaker 2>Then there's Deep Neuroevolution from Uber AI Labs, built on TensorFlow,

349
00:18:58.559 --> 00:19:02.359
<v Speaker 2>specifically for those big deep neural networks on GPUs. Choosing

350
00:19:02.400 --> 00:19:04.599
<v Speaker 2>the right one really depends on your specific problem, what

351
00:19:04.680 --> 00:19:08.400
<v Speaker 2>features you need. And one last tip: always use isolated

352
00:19:08.480 --> 00:19:11.599
<v Speaker 2>virtual Python environments for each project, things like Anaconda or

353
00:19:11.680 --> 00:19:14.640
<v Speaker 2>venv. It saves so many headaches with dependencies.

354
00:19:14.799 --> 00:19:17.799
<v Speaker 1>What an absolutely incredible journey through neuroevolution. I mean, from

355
00:19:17.920 --> 00:19:22.720
<v Speaker 1>mimicking a single neuron to evolving these complex networks that

356
00:19:22.799 --> 00:19:26.480
<v Speaker 1>play Atari, navigate mazes, even figure out their own learning goals.

357
00:19:26.559 --> 00:19:28.880
<v Speaker 1>It's really a testament to the power of looking at

358
00:19:28.880 --> 00:19:32.039
<v Speaker 1>the natural world for inspiration to solve some really tough

359
00:19:32.039 --> 00:19:32.720
<v Speaker 1>AI problems.

360
00:19:32.839 --> 00:19:35.880
<v Speaker 2>It truly does redefine how we think about intelligence emerging,

361
00:19:35.880 --> 00:19:39.240
<v Speaker 2>doesn't it? That core idea that complexity and really optimal

362
00:19:39.279 --> 00:19:42.880
<v Speaker 2>solutions can arise not from meticulous, top down design, but

363
00:19:42.920 --> 00:19:47.000
<v Speaker 2>from this iterative, messy, nature inspired evolutionary process. It's just

364
00:19:47.279 --> 00:19:50.119
<v Speaker 2>profoundly powerful and really challenges us to think differently about

365
00:19:50.359 --> 00:19:51.799
<v Speaker 2>building intelligent systems.

366
00:19:51.960 --> 00:19:54.200
<v Speaker 1>So as we keep pushing the boundaries of AI, it

367
00:19:54.240 --> 00:19:58.839
<v Speaker 1>makes you wonder, right, what other unconventional approaches, maybe hiding

368
00:19:58.839 --> 00:20:01.880
<v Speaker 1>in plain sight in biology, might unlock that next level? And

369
00:20:01.960 --> 00:20:05.279
<v Speaker 1>how might you, the listener, apply this mindset, this idea

370
00:20:05.359 --> 00:20:09.000
<v Speaker 1>of evolving, adapting, maybe even co evolving solutions in your

371
00:20:09.039 --> 00:20:12.720
<v Speaker 1>own projects, or just in how you approach problem solving generally?

372
00:20:13.319 --> 00:20:16.000
<v Speaker 1>If you are eager to dive deeper, we definitely recommend

373
00:20:16.079 --> 00:20:18.759
<v Speaker 1>exploring the work from Uber AI Labs, checking out the

374
00:20:18.759 --> 00:20:22.759
<v Speaker 1>International Society for Artificial Life, that's alife.org. There

375
00:20:22.759 --> 00:20:25.200
<v Speaker 1>are great discussions on open-ended evolution on Reddit. The

376
00:20:25.319 --> 00:20:29.039
<v Speaker 1>NEAT Software Catalog lists implementations, arXiv.org always has

377
00:20:29.079 --> 00:20:30.960
<v Speaker 1>cutting edge papers, and of course go back to the

378
00:20:30.960 --> 00:20:34.240
<v Speaker 1>source: Kenneth O. Stanley's original PhD dissertation on the NEAT

379
00:20:34.279 --> 00:20:37.240
<v Speaker 1>algorithm itself. There's always, always more to learn.
