WEBVTT

1
00:00:00.080 --> 00:00:03.680
<v Speaker 1>Imagine really getting a handle on artificial intelligence, not just

2
00:00:04.160 --> 00:00:08.919
<v Speaker 1>the headlines, but the actual mechanics, the tools, what makes

3
00:00:08.960 --> 00:00:11.279
<v Speaker 1>it tick. That's exactly what we're doing today. This is

4
00:00:11.279 --> 00:00:14.480
<v Speaker 1>our deep dive into AI with Python. We've gathered a

5
00:00:14.519 --> 00:00:18.120
<v Speaker 1>stack of sources, articles, research papers, tech notes, and our

6
00:00:18.160 --> 00:00:21.719
<v Speaker 1>mission really is to pull out the absolute key insights,

7
00:00:22.120 --> 00:00:26.160
<v Speaker 1>the surprising bits and give you a solid but easy

8
00:00:26.199 --> 00:00:29.839
<v Speaker 1>to grasp understanding, from what intelligence even means here to

9
00:00:29.920 --> 00:00:35.000
<v Speaker 1>how machines learn, see, hear, even play games. So let's

10
00:00:35.079 --> 00:00:37.119
<v Speaker 1>unpack this right from the start. What does it really

11
00:00:37.159 --> 00:00:39.960
<v Speaker 1>mean when we talk about artificial intelligence? Right.

12
00:00:40.000 --> 00:00:43.640
<v Speaker 2>Well, the foundational idea, going back to John McCarthy, people

13
00:00:43.719 --> 00:00:46.159
<v Speaker 2>often call him the father of AI. He defined it

14
00:00:46.200 --> 00:00:49.200
<v Speaker 2>as the science and engineering of making intelligent machines, okay,

15
00:00:49.280 --> 00:00:52.960
<v Speaker 2>especially intelligent computer programs. So fundamentally it's about trying to

16
00:00:52.960 --> 00:00:55.439
<v Speaker 2>build machines or software that can do things we normally

17
00:00:55.520 --> 00:01:00.200
<v Speaker 2>associate with, well, human intelligence: learning, problem solving, making decisions.

18
00:01:00.159 --> 00:01:02.320
<v Speaker 1>That kind of thing. Things humans do. Exactly.

19
00:01:02.479 --> 00:01:05.159
<v Speaker 2>And you know what's really interesting isn't just the definition,

20
00:01:05.200 --> 00:01:07.120
<v Speaker 2>but why we need it, what it lets us do.

21
00:01:07.799 --> 00:01:11.280
<v Speaker 2>Think about learning from just massive amounts of data. Yeah,

22
00:01:11.319 --> 00:01:16.400
<v Speaker 2>impossible for one person. Totally. Or doing repetitive tasks accurately, tirelessly.

23
00:01:16.920 --> 00:01:21.200
<v Speaker 2>And crucially, AI can sort of teach itself as new

24
00:01:21.200 --> 00:01:24.879
<v Speaker 2>information comes in. That's huge because you know, the world

25
00:01:24.879 --> 00:01:28.000
<v Speaker 2>doesn't stand still. Plus, it can react in real time,

26
00:01:28.200 --> 00:01:31.680
<v Speaker 2>organize huge data sets efficiently, things that give us much

27
00:01:31.719 --> 00:01:34.480
<v Speaker 2>better results than we could manage alone, especially at scale.

28
00:01:35.280 --> 00:01:38.719
<v Speaker 2>And when we think about what makes a system intelligent

29
00:01:38.840 --> 00:01:41.920
<v Speaker 2>in this context, it's useful to think about Howard Gardner's

30
00:01:41.920 --> 00:01:43.680
<v Speaker 2>idea of multiple intelligences.

31
00:01:43.719 --> 00:01:48.159
<v Speaker 1>Ah, yeah, I remember that, like musical, logical. Exactly: linguistic, logical-mathematical, spatial,

32
00:01:48.200 --> 00:01:48.519
<v Speaker 1>and so on.

33
00:01:49.280 --> 00:01:52.040
<v Speaker 2>The idea is if a machine shows capability in even

34
00:01:52.079 --> 00:01:54.840
<v Speaker 2>one or maybe several of these areas, we consider it

35
00:01:54.959 --> 00:01:58.680
<v Speaker 2>artificially intelligent. It's not always about perfectly mimicking humans, but

36
00:01:58.760 --> 00:02:01.000
<v Speaker 2>about having the right kind of intelligence for a task.

37
00:02:01.200 --> 00:02:04.319
<v Speaker 1>Okay, So if that's the big picture of AI, then

38
00:02:05.680 --> 00:02:08.240
<v Speaker 1>here's the key question I think for anyone wanting to

39
00:02:08.240 --> 00:02:12.560
<v Speaker 1>build this stuff, why Python? Why is that language so central?

40
00:02:12.639 --> 00:02:15.479
<v Speaker 2>Yeah, that's a great question, and it's not really an accident.

41
00:02:15.879 --> 00:02:19.039
<v Speaker 2>Python's dominance in AI comes down to a few really

42
00:02:19.080 --> 00:02:23.159
<v Speaker 2>key things. First, its syntax is well, simple and consistent,

43
00:02:23.240 --> 00:02:26.439
<v Speaker 2>easier to learn, much easier to learn, write, and importantly

44
00:02:26.479 --> 00:02:29.840
<v Speaker 2>to read. This means you could prototype ideas really fast,

45
00:02:30.080 --> 00:02:32.479
<v Speaker 2>try something out in hours, maybe not days or weeks.

46
00:02:32.879 --> 00:02:36.120
<v Speaker 2>That speed is vital in AI research. Then there's the community.

47
00:02:36.120 --> 00:02:39.360
<v Speaker 2>It's huge, it's active, it's open source, so much support,

48
00:02:39.439 --> 00:02:44.000
<v Speaker 2>so many people contributing tools. But maybe the killer feature

49
00:02:44.039 --> 00:02:47.639
<v Speaker 2>for AI is its libraries. Python has this incredible ecosystem

50
00:02:47.879 --> 00:02:51.000
<v Speaker 2>of prebuilt libraries specifically for AI tasks.

51
00:02:51.000 --> 00:02:52.520
<v Speaker 1>I mean like toolkits.

52
00:02:52.080 --> 00:02:55.120
<v Speaker 2>Exactly, things like NumPy for handling numbers and arrays efficiently,

53
00:02:55.280 --> 00:02:58.400
<v Speaker 2>SciPy for more scientific stuff, Matplotlib for plotting data,

54
00:02:58.520 --> 00:03:02.120
<v Speaker 2>NLTK for natural language processing, yeah, and

55
00:03:02.199 --> 00:03:06.479
<v Speaker 2>OpenCV for computer vision, seeing things, and pandas for manipulating data,

56
00:03:06.759 --> 00:03:10.680
<v Speaker 2>OpenAI Gym for reinforcement learning experiments. The list goes on.

57
00:03:10.759 --> 00:03:12.719
<v Speaker 2>These aren't just bits of code, they're like whole

58
00:03:12.759 --> 00:03:15.280
<v Speaker 2>workbenches that save developers tons of time.

59
00:03:15.879 --> 00:03:18.919
<v Speaker 1>So for you, if you're looking to actually build AI applications,

60
00:03:19.000 --> 00:03:22.639
<v Speaker 1>starting with Python means you've got a language built for clarity, speed,

61
00:03:23.199 --> 00:03:26.360
<v Speaker 1>and you get this massive head start with all these

62
00:03:26.479 --> 00:03:29.560
<v Speaker 1>powerful tools ready to go. Okay, So we've got the

63
00:03:29.599 --> 00:03:32.639
<v Speaker 1>what and the why Python. Now the next big piece

64
00:03:32.719 --> 00:03:36.120
<v Speaker 1>is how these machines actually learn, and that really brings

65
00:03:36.199 --> 00:03:39.560
<v Speaker 1>us to machine learning, right, which is essentially giving computers

66
00:03:39.560 --> 00:03:42.479
<v Speaker 1>the ability to learn from data, find patterns, and get

67
00:03:42.479 --> 00:03:46.039
<v Speaker 1>better with experience, but without programming every single step explicitly.

68
00:03:46.520 --> 00:03:48.639
<v Speaker 1>And there seem to be three main ways they do this.

69
00:03:48.560 --> 00:03:51.400
<v Speaker 2>That's right, three main paradigms. The first and probably the most

70
00:03:51.439 --> 00:03:54.240
<v Speaker 2>common you'll encounter is supervised machine learning.

71
00:03:54.439 --> 00:03:57.159
<v Speaker 1>Supervised, like with a teacher? Exactly like that.

72
00:03:57.159 --> 00:03:59.080
<v Speaker 2>It's like the algorithm is a student and you give

73
00:03:59.080 --> 00:04:01.680
<v Speaker 2>it a textbook with problems and the answers. Yeah, So

74
00:04:01.759 --> 00:04:04.360
<v Speaker 2>the training data is labeled. You tell it this input

75
00:04:04.360 --> 00:04:08.039
<v Speaker 2>corresponds to this correct output. The goal is for the

76
00:04:08.080 --> 00:04:10.719
<v Speaker 2>algorithm to learn that mapping so it can predict the

77
00:04:10.759 --> 00:04:15.280
<v Speaker 2>output for new unseen inputs. Okay, and supervised learning usually

78
00:04:15.319 --> 00:04:19.639
<v Speaker 2>handles two kinds of problems. There's classification, where the output

79
00:04:19.680 --> 00:04:23.120
<v Speaker 2>is a category like is this email spam or not spam?

80
00:04:23.319 --> 00:04:26.040
<v Speaker 2>Is this picture a cat or a dog? Or, in medicine,

81
00:04:26.079 --> 00:04:27.920
<v Speaker 2>is this tumor malignant or benign?

82
00:04:28.040 --> 00:04:28.720
<v Speaker 1>Categories?

83
00:04:29.079 --> 00:04:32.639
<v Speaker 2>And the other is regression. Here the output is a continuous

84
00:04:32.720 --> 00:04:35.279
<v Speaker 2>value, a number, like predicting the price of a house,

85
00:04:35.439 --> 00:04:39.360
<v Speaker 2>or forecasting temperature or estimating say someone's age from a photo.

86
00:04:39.560 --> 00:04:41.319
<v Speaker 1>So numbers, not labels.

87
00:04:40.959 --> 00:04:44.480
<v Speaker 2>Exactly, and you see algorithms like decision trees, random forests,

88
00:04:44.560 --> 00:04:48.160
<v Speaker 2>KNN, logistic regression used a lot here. The

89
00:04:48.839 --> 00:04:51.839
<v Speaker 2>main challenge, often especially with big projects, is just getting

90
00:04:51.920 --> 00:04:54.720
<v Speaker 2>enough high quality labeled data. That can be expensive and

91
00:04:54.759 --> 00:04:55.399
<v Speaker 2>time consuming.

92
00:04:55.519 --> 00:04:57.639
<v Speaker 1>Right, someone has to label it all precisely.

93
00:04:58.199 --> 00:05:01.800
<v Speaker 2>Now, moving on. If supervised learning is like learning with

94
00:05:01.839 --> 00:05:07.600
<v Speaker 2>an answer key, unsupervised machine learning is, well, it's more

95
00:05:07.680 --> 00:05:10.120
<v Speaker 2>like being thrown into a library without a catalog and

96
00:05:10.199 --> 00:05:12.839
<v Speaker 2>asked to find interesting connections. Some might even say it's

97
00:05:12.839 --> 00:05:15.160
<v Speaker 2>closer to true AI.

98
00:05:15.040 --> 00:05:17.079
<v Speaker 1>In some ways because there's no teacher.

99
00:05:16.839 --> 00:05:19.639
<v Speaker 2>Exactly, no supervisor, no pre labeled answers. You give the

100
00:05:19.680 --> 00:05:23.560
<v Speaker 2>algorithm raw unlabeled data, and its job is to discover

101
00:05:23.680 --> 00:05:26.240
<v Speaker 2>hidden structures or patterns all by itself.

102
00:05:26.360 --> 00:05:28.240
<v Speaker 1>Okay, like what kind of patterns? Well?

103
00:05:28.279 --> 00:05:31.319
<v Speaker 2>There are two main types again. Clustering is about finding natural

104
00:05:31.319 --> 00:05:34.680
<v Speaker 2>groupings in the data. Imagine grouping customers based on their

105
00:05:34.680 --> 00:05:37.680
<v Speaker 2>buying habits without knowing beforehand what those groups might be.

106
00:05:38.040 --> 00:05:40.199
<v Speaker 2>The algorithm figures out the clusters.
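
NOTE
Editor's sketch, not from the sources: a minimal k-means loop on one made-up
feature (yearly spend per customer), assuming two starting centers chosen by hand.
The function name kmeans_1d and all numbers are illustrative assumptions.
```python
# Toy k-means on one feature (hypothetical yearly spend per customer).
def kmeans_1d(values, centers, rounds=10):
    """Alternate assignment and update steps; returns (centers, clusters)."""
    for _ in range(rounds):
        clusters = [[] for _ in centers]
        for v in values:
            # Assignment step: attach each value to its nearest center.
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters
spend = [12, 15, 14, 90, 95, 88]  # made-up data with two obvious groups
centers, clusters = kmeans_1d(spend, centers=[0.0, 100.0])
print(centers, clusters)
```
Nobody tells the loop which customers belong together; the two groups fall out of the distances alone.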

107
00:05:39.920 --> 00:05:41.720
<v Speaker 1>Ah, finding similar things together.

108
00:05:41.839 --> 00:05:44.920
<v Speaker 2>Yeah, and then there's association. This is about discovering rules

109
00:05:44.920 --> 00:05:48.079
<v Speaker 2>that describe large parts of the data. The classic example

110
00:05:48.160 --> 00:05:53.000
<v Speaker 2>is market basket analysis: finding that customers who buy, say, diapers,

111
00:05:53.439 --> 00:05:54.480
<v Speaker 2>often also buy beer.

112
00:05:54.600 --> 00:05:55.959
<v Speaker 1>Right, the surprising connections.

113
00:05:56.199 --> 00:05:59.800
<v Speaker 2>Those kinds of rules. Algorithms like k-means are popular

114
00:05:59.839 --> 00:06:03.839
<v Speaker 2>for clustering and Apriori for association rules. And the

115
00:06:03.879 --> 00:06:07.439
<v Speaker 2>third type, which is maybe less common but really powerful,

116
00:06:07.839 --> 00:06:11.319
<v Speaker 2>is reinforcement machine learning. Reinforcement? This is where the machine

117
00:06:11.519 --> 00:06:15.319
<v Speaker 2>or the agent learns by doing. It interacts with an

118
00:06:15.399 --> 00:06:18.040
<v Speaker 2>environment, could be a game, could be the real world

119
00:06:18.040 --> 00:06:20.600
<v Speaker 2>for a robot, and it takes actions. Trial and error?

120
00:06:20.600 --> 00:06:23.759
<v Speaker 2>Exactly, trial and error. Based on its actions, it gets feedback,

121
00:06:23.839 --> 00:06:26.759
<v Speaker 2>usually as rewards or penalties, and over time it learns

122
00:06:26.759 --> 00:06:31.399
<v Speaker 2>a strategy, or policy, to maximize its total reward. Think

123
00:06:31.439 --> 00:06:33.199
<v Speaker 2>of training a dog with treats.

124
00:06:33.040 --> 00:06:35.439
<v Speaker 1>Okay, So it learns from consequences.

125
00:06:35.800 --> 00:06:39.160
<v Speaker 2>Precisely, it learns from experience to make better decisions towards

126
00:06:39.199 --> 00:06:41.800
<v Speaker 2>a goal. This is how AI gets really good at

127
00:06:41.839 --> 00:06:44.639
<v Speaker 2>games or controlling robots in complex situations.
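
NOTE
Editor's sketch, not from the sources: tabular Q-learning on a made-up 5-cell
corridor, as one possible instance of the reward-driven learning described above.
Real agents usually explore at random; this version tries every action in turn so
the result is repeatable. GOAL, GAMMA, ALPHA and the corridor are assumptions.
```python
# Toy Q-learning on a 5-cell corridor; all numbers here are illustrative assumptions.
GOAL = 4                 # reaching cell 4 ends the episode with reward 1
ACTIONS = (-1, 1)        # step left or step right
GAMMA, ALPHA = 0.9, 0.5  # discount factor and learning rate
Q = {s: {a: 0.0 for a in ACTIONS} for s in range(GOAL)}
for _ in range(50):
    # Deterministic trial and error: try every action from every cell.
    for s in range(GOAL):
        for a in ACTIONS:
            s2 = max(0, s + a)  # the wall at cell 0 blocks moves further left
            reward = 1.0 if s2 == GOAL else 0.0
            future = 0.0 if s2 == GOAL else max(Q[s2].values())
            # Nudge the estimate toward reward plus discounted future value.
            Q[s][a] += ALPHA * (reward + GAMMA * future - Q[s][a])
# The learned policy picks, in each cell, the action with the highest estimate.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(GOAL)]
print(policy)
```
The only feedback is the reward at the goal, yet the value estimates spread backward until every cell prefers stepping right.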

128
00:06:44.720 --> 00:06:46.560
<v Speaker 1>So those are learning styles, but how do you actually

129
00:06:46.600 --> 00:06:49.240
<v Speaker 1>build an AI using these? What's the first step with,

130
00:06:49.480 --> 00:06:51.759
<v Speaker 1>you know, just raw data? Because I imagine you can't

131
00:06:51.800 --> 00:06:54.519
<v Speaker 1>just dump raw data into these algorithms, right? It probably

132
00:06:54.600 --> 00:06:56.199
<v Speaker 1>needs cleaning up, shaping.

133
00:06:56.319 --> 00:07:00.759
<v Speaker 2>Oh, absolutely, that's a critical, often overlooked step: data preparation,

134
00:07:01.240 --> 00:07:05.839
<v Speaker 2>specifically data preprocessing. You've hit on a key point. Garbage in,

135
00:07:05.879 --> 00:07:09.720
<v Speaker 2>garbage out. So preprocessing is all about taking that raw, messy,

136
00:07:10.120 --> 00:07:13.920
<v Speaker 2>maybe incomplete data and transforming it into a clean, structured

137
00:07:13.959 --> 00:07:17.360
<v Speaker 2>format that the machine learning algorithms can actually understand and

138
00:07:17.399 --> 00:07:18.399
<v Speaker 2>work with effectively.

139
00:07:18.680 --> 00:07:21.519
<v Speaker 1>Okay, so what does that involve like specific techniques?

140
00:07:21.720 --> 00:07:25.720
<v Speaker 2>Yeah, there are several standard techniques. One is binarization. That's

141
00:07:25.759 --> 00:07:29.680
<v Speaker 2>basically converting numerical values into simple Boolean values, zero or

142
00:07:29.720 --> 00:07:32.480
<v Speaker 2>one based on some threshold, like if a temperature is

143
00:07:32.480 --> 00:07:35.879
<v Speaker 2>above thirty degrees, it's one, hot; otherwise zero, not hot.

144
00:07:36.160 --> 00:07:39.319
<v Speaker 2>Simple but useful sometimes. Turning things into yes/no, kind of? Yeah.

145
00:07:39.399 --> 00:07:43.040
<v Speaker 2>Then there's mean removal. This involves subtracting the average value,

146
00:07:43.120 --> 00:07:46.639
<v Speaker 2>the mean, from each feature across all samples. This centers

147
00:07:46.639 --> 00:07:50.279
<v Speaker 2>the data around zero, which can help some algorithms perform better.

148
00:07:50.519 --> 00:07:54.279
<v Speaker 2>And another really common one is scaling. Data often comes

149
00:07:54.279 --> 00:07:57.000
<v Speaker 2>in with features on vastly different scales, like age in

150
00:07:57.120 --> 00:08:00.879
<v Speaker 2>years and income in thousands of dollars. Scaling brings all

151
00:08:00.920 --> 00:08:04.800
<v Speaker 2>features to a comparable range, maybe between zero and one,

152
00:08:05.199 --> 00:08:08.000
<v Speaker 2>or maybe so they have a standard deviation of one.

153
00:08:08.040 --> 00:08:11.800
<v Speaker 2>This prevents features with larger values from unfairly dominating the

154
00:08:11.879 --> 00:08:12.680
<v Speaker 2>learning process.
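
NOTE
Editor's sketch, not from the sources: the preprocessing steps just described
(binarization, mean removal, scaling), written with the standard library only.
Function names and the temperature values are illustrative assumptions.
```python
from statistics import mean, pstdev
def binarize(values, threshold):
    """1 if the value is above the threshold, else 0."""
    return [1 if v > threshold else 0 for v in values]
def remove_mean(values):
    """Center the feature around zero."""
    m = mean(values)
    return [v - m for v in values]
def min_max_scale(values):
    """Rescale the feature to the range [0, 1]; assumes the values are not all equal."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]
def standardize(values):
    """Zero mean and unit (population) standard deviation; assumes some spread."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]
temps = [22.0, 35.0, 30.0, 41.0]  # made-up temperatures in degrees
print(binarize(temps, 30))
print(remove_mean(temps))
print(min_max_scale(temps))
print(standardize(temps))
```
In practice these transforms are applied per feature, and the mean, minimum, and maximum are usually computed on the training data only.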

155
00:08:12.920 --> 00:08:16.040
<v Speaker 1>Makes sense, so everyone gets a fair say data wise.

156
00:08:15.879 --> 00:08:18.439
<v Speaker 2>Exactly, It levels the playing field for the features.

157
00:08:18.839 --> 00:08:21.199
<v Speaker 1>So, okay, the data is prepped, it's clean and scaled.

158
00:08:21.480 --> 00:08:24.319
<v Speaker 1>Now what? What are some of those workhorse algorithms that

159
00:08:24.399 --> 00:08:26.079
<v Speaker 1>actually start finding the patterns?

160
00:08:26.319 --> 00:08:28.279
<v Speaker 2>Right. So, once the data is ready, you can apply

161
00:08:28.399 --> 00:08:31.079
<v Speaker 2>various algorithms. Let's touch on a couple of common ones

162
00:08:31.120 --> 00:08:34.320
<v Speaker 2>mentioned in the sources. One is naive Bayes. Naive Bayes?

163
00:08:34.519 --> 00:08:39.519
<v Speaker 2>it's a classification technique based on Bayes' theorem from probability.

164
00:08:39.639 --> 00:08:43.320
<v Speaker 2>The naive part comes from its core assumption. It assumes

165
00:08:43.360 --> 00:08:46.279
<v Speaker 2>that all the features, all the input variables, are independent of

166
00:08:46.200 --> 00:08:48.519
<v Speaker 1>each other. Which isn't always true in real life.

167
00:08:48.559 --> 00:08:51.679
<v Speaker 2>Often not. No, that's why it's naive. But surprisingly it

168
00:08:51.679 --> 00:08:55.080
<v Speaker 2>works really well in many situations, especially for text classification

169
00:08:55.159 --> 00:08:58.879
<v Speaker 2>like spam filtering, and it's computationally very efficient, easy to build,

170
00:08:59.200 --> 00:09:01.840
<v Speaker 2>and good with large data sets. Then you have support

171
00:09:01.919 --> 00:09:04.000
<v Speaker 2>vector machines or SVMs.

172
00:09:04.240 --> 00:09:05.840
<v Speaker 1>SVM, heard of that one.

173
00:09:05.919 --> 00:09:09.919
<v Speaker 2>Yes, a powerful supervised learning algorithm used for both classification

174
00:09:10.039 --> 00:09:14.000
<v Speaker 2>and regression, though maybe more famous for classification. The basic

175
00:09:14.080 --> 00:09:16.919
<v Speaker 2>idea is to represent the data points as vectors in

176
00:09:16.960 --> 00:09:20.759
<v Speaker 2>space and find the optimal boundary, the hyperplane, that best

177
00:09:20.799 --> 00:09:22.559
<v Speaker 2>separates the different classes.

178
00:09:22.279 --> 00:09:24.000
<v Speaker 1>Like drawing a line between the dots.

179
00:09:24.240 --> 00:09:27.639
<v Speaker 2>Essentially, yes, but in potentially very high dimensional spaces. And

180
00:09:27.679 --> 00:09:30.320
<v Speaker 2>it tries to find the line that has the maximum margin,

181
00:09:30.360 --> 00:09:33.399
<v Speaker 2>the biggest possible gap between the classes, which often leads

182
00:09:33.440 --> 00:09:36.519
<v Speaker 2>to good generalization. And you know, the sources actually give

183
00:09:36.519 --> 00:09:39.399
<v Speaker 2>a concrete example using naive Bayes. They applied it to

184
00:09:39.440 --> 00:09:42.000
<v Speaker 2>the Breast Cancer Wisconsin (Diagnostic) database.
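
NOTE
Editor's sketch, not the sources' breast cancer experiment: a tiny word-count
naive Bayes spam filter, the text classification use mentioned earlier. The
training messages, labels, and function names are made up; add-one smoothing is
an assumption of this sketch.
```python
import math
from collections import Counter
# Hypothetical training messages; the labels are made up for illustration.
TRAIN = [("win cash now", "spam"), ("cheap cash prize", "spam"),
         ("meeting at noon", "ham"), ("lunch meeting today", "ham")]
def train(examples):
    """Count words per class and how often each class occurs."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter()
    for text, label in examples:
        class_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab
def predict(text, model):
    """Pick the class with the highest log posterior, treating words as independent."""
    word_counts, class_counts, vocab = model
    scores = {}
    for label in class_counts:
        total = sum(word_counts[label].values())
        score = math.log(class_counts[label] / sum(class_counts.values()))
        for word in text.split():
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((word_counts[label][word] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)
model = train(TRAIN)
print(predict("cash prize now", model), predict("meeting today", model))
```
The "naive" part is the inner loop: each word adds its own log probability, as if no word influenced any other.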

185
00:09:42.159 --> 00:09:43.679
<v Speaker 1>Oh wow, real medical data.

186
00:09:43.840 --> 00:09:46.559
<v Speaker 2>Yeah. And the goal was to classify tumors as either

187
00:09:46.639 --> 00:09:50.200
<v Speaker 2>malignant or benign based on certain features from diagnostic tests.

188
00:09:50.960 --> 00:09:54.320
<v Speaker 2>And the naive Bayes classifier achieved an accuracy of ninety

189
00:09:54.399 --> 00:09:56.120
<v Speaker 2>five point one seven percent.

190
00:09:56.279 --> 00:09:58.600
<v Speaker 1>That's really high. Impressive.

191
00:09:58.679 --> 00:10:00.960
<v Speaker 2>It is impressive, but it also highlights something crucial you

192
00:10:01.039 --> 00:10:04.600
<v Speaker 2>mentioned earlier: accuracy alone isn't always the whole story, especially

193
00:10:04.639 --> 00:10:07.159
<v Speaker 2>in medicine. You need to understand what kind of

194
00:10:07.200 --> 00:10:08.519
<v Speaker 2>mistakes the model makes.

195
00:10:08.840 --> 00:10:11.360
<v Speaker 1>Right, like telling someone they don't have cancer when they

196
00:10:11.360 --> 00:10:13.840
<v Speaker 1>do is way worse than the other way around.

197
00:10:13.960 --> 00:10:17.240
<v Speaker 2>Exactly, that's a false negative, and it's often much more

198
00:10:17.240 --> 00:10:20.480
<v Speaker 2>critical to minimize than a false positive a false alarm.

199
00:10:20.559 --> 00:10:21.960
<v Speaker 2>This is where the confusion matrix comes

200
00:10:21.960 --> 00:10:23.759
<v Speaker 1>in. The confusion matrix? It's just

201
00:10:23.720 --> 00:10:26.559
<v Speaker 2>a simple table, really, that summarizes the performance of a

202
00:10:26.600 --> 00:10:30.159
<v Speaker 2>classification model. It breaks down the predictions into four categories.

203
00:10:30.720 --> 00:10:34.159
<v Speaker 2>True positives, TP: correctly identified as positive.

204
00:10:33.840 --> 00:10:35.039
<v Speaker 1>Got the cancer right. Yep.

205
00:10:35.200 --> 00:10:37.519
<v Speaker 2>True negatives, TN: correctly identified as

206
00:10:37.120 --> 00:10:38.840
<v Speaker 1>negative. You correctly said no cancer. Right.

207
00:10:39.440 --> 00:10:43.480
<v Speaker 2>False positives, FP: incorrectly identified as positive, a false alarm. And

208
00:10:43.519 --> 00:10:47.039
<v Speaker 2>false negatives, FN: incorrectly identified as negative.

209
00:10:46.840 --> 00:10:48.679
<v Speaker 1>The dangerous miss. That's the one you

210
00:10:48.639 --> 00:10:51.879
<v Speaker 2>often worry about most. And from these four numbers in

211
00:10:51.919 --> 00:10:55.799
<v Speaker 2>the matrix you calculate other important metrics. Besides overall accuracy,

212
00:10:56.279 --> 00:10:59.600
<v Speaker 2>there's precision. Out of all the times the model said positive,

213
00:10:59.600 --> 00:11:01.039
<v Speaker 2>how many are actually positive?

214
00:11:01.120 --> 00:11:03.679
<v Speaker 1>How trustworthy are the positive predictions? Exactly.

215
00:11:03.799 --> 00:11:06.720
<v Speaker 2>Then there's recall, also called sensitivity. Out of all the

216
00:11:06.720 --> 00:11:09.360
<v Speaker 2>actual positive cases, how many did the model

217
00:11:09.120 --> 00:11:12.000
<v Speaker 1>find? How good is it at catching the positives? Yep.

218
00:11:12.360 --> 00:11:16.320
<v Speaker 2>And specificity: out of all the actual negative cases, how

219
00:11:16.320 --> 00:11:20.000
<v Speaker 2>many did the model correctly identify as negative? So depending

220
00:11:20.000 --> 00:11:24.120
<v Speaker 2>on the problem, medical diagnosis, spam filtering, fraud detection, you

221
00:11:24.200 --> 00:11:28.080
<v Speaker 2>might care more about maximizing recall or precision or specificity,

222
00:11:28.360 --> 00:11:29.919
<v Speaker 2>not just the overall accuracy.
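
NOTE
Editor's sketch, not from the sources: the four confusion matrix counts and the
metrics derived from them, on made-up labels where 1 stands for malignant.
Function names and data are illustrative assumptions.
```python
def confusion_counts(actual, predicted, positive=1):
    """Return (TP, TN, FP, FN) for a binary problem."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == positive and p == positive for a, p in pairs)
    tn = sum(a != positive and p != positive for a, p in pairs)
    fp = sum(a != positive and p == positive for a, p in pairs)
    fn = sum(a == positive and p != positive for a, p in pairs)
    return tp, tn, fp, fn
def metrics(tp, tn, fp, fn):
    """Derive the summary metrics; assumes none of the denominators is zero."""
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp),    # of predicted positives, how many were real
        "recall": tp / (tp + fn),       # of real positives, how many were found
        "specificity": tn / (tn + fp),  # of real negatives, how many were cleared
    }
actual = [1, 1, 1, 0, 0, 0, 0, 1]      # made-up ground truth, 1 = malignant
predicted = [1, 1, 0, 0, 0, 1, 0, 1]   # made-up model output
counts = confusion_counts(actual, predicted)
print(counts, metrics(*counts))
```
Here one malignant case is missed (a false negative) and one benign case is flagged (a false positive); recall and specificity report those two errors separately, which accuracy alone does not.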

223
00:11:30.000 --> 00:11:31.759
<v Speaker 1>Okay, that makes a lot of sense. It's about choosing

224
00:11:31.799 --> 00:11:34.639
<v Speaker 1>the right measure for what matters most. All right, let's

225
00:11:34.679 --> 00:11:37.480
<v Speaker 1>shift gears a bit. We've talked about learning from data,

226
00:11:37.480 --> 00:11:41.080
<v Speaker 1>classifying things. But what about how AI interacts with something

227
00:11:41.159 --> 00:11:44.799
<v Speaker 1>uniquely human like language? How do machines understand us?

228
00:11:45.200 --> 00:11:48.840
<v Speaker 2>Ah, yeah, that's the fascinating world of natural language processing

229
00:11:49.159 --> 00:11:52.960
<v Speaker 2>or NLP. It's basically the field of AI focused on

230
00:11:53.159 --> 00:11:57.759
<v Speaker 2>enabling computers to understand, interpret, and even generate human language,

231
00:11:57.960 --> 00:11:58.799
<v Speaker 2>both spoken and

232
00:11:58.759 --> 00:12:02.080
<v Speaker 1>written. Understanding and talking back, essentially? Pretty much.

233
00:12:02.080 --> 00:12:04.519
<v Speaker 2>It usually breaks down into two main parts. There's natural

234
00:12:04.559 --> 00:12:08.279
<v Speaker 2>language understanding, NLU, which is about figuring out the meaning

235
00:12:08.320 --> 00:12:11.320
<v Speaker 2>behind the words, analyzing the structure,

236
00:12:10.840 --> 00:12:13.159
<v Speaker 1>the intent. Making sense of the input? Right.

237
00:12:13.360 --> 00:12:16.320
<v Speaker 2>And then there's natural language generation, NLG, which is the

238
00:12:16.320 --> 00:12:21.559
<v Speaker 2>flip side, taking some internal computer representation or data and

239
00:12:21.639 --> 00:12:25.080
<v Speaker 2>producing natural sounding language output, like writing a summary or

240
00:12:25.080 --> 00:12:29.279
<v Speaker 2>answering a question coherently. Now, the NLU part, understanding language,

241
00:12:29.320 --> 00:12:32.799
<v Speaker 2>is notoriously difficult because human language is just full of ambiguity.

242
00:12:32.879 --> 00:12:36.039
<v Speaker 1>Ambiguity? How so? Well, on multiple levels.

243
00:12:36.080 --> 00:12:39.639
<v Speaker 2>There's lexical ambiguity, single words having multiple meanings. Think of

244
00:12:39.679 --> 00:12:44.480
<v Speaker 2>the word "bank": riverbank, financial thing. Okay, yeah. Then syntactic ambiguity,

245
00:12:44.639 --> 00:12:47.840
<v Speaker 2>where the sentence structure is unclear. The classic example is

246
00:12:47.960 --> 00:12:50.279
<v Speaker 2>I saw the man on the hill with a telescope.

247
00:12:50.879 --> 00:12:53.840
<v Speaker 2>Who has the telescope? Me? Yeah. The man? Is it the

248
00:12:53.840 --> 00:12:54.759
<v Speaker 2>man on the hill that has a

249
00:12:54.759 --> 00:12:57.720
<v Speaker 1>telescope? Ah, right, grammar puzzles. Exactly.

250
00:12:58.159 --> 00:13:01.919
<v Speaker 2>And then there's referential ambiguity, especially with pronouns. "The cat

251
00:13:02.000 --> 00:13:04.919
<v Speaker 2>chased the mouse and it was fast." What does "it"

252
00:13:05.360 --> 00:13:09.120
<v Speaker 2>refer to, the cat or the mouse? Context usually tells us,

253
00:13:09.159 --> 00:13:10.519
<v Speaker 2>but for a computer that's tricky.

254
00:13:10.720 --> 00:13:13.919
<v Speaker 1>Wow, okay, so how does AI even begin to untangle

255
00:13:13.960 --> 00:13:14.279
<v Speaker 1>all that?

256
00:13:14.600 --> 00:13:17.639
<v Speaker 2>It has to break it down systematically. NLP typically involves

257
00:13:17.639 --> 00:13:21.639
<v Speaker 2>a pipeline of steps. First is lexical analysis, just identifying

258
00:13:21.639 --> 00:13:25.720
<v Speaker 2>words and their structures. Then syntactic analysis or parsing, which

259
00:13:25.759 --> 00:13:28.639
<v Speaker 2>figures out the grammatical structure of the sentence, how words

260
00:13:28.399 --> 00:13:30.720
<v Speaker 1>relate. Like diagramming sentences in school?

261
00:13:31.000 --> 00:13:34.120
<v Speaker 2>Kind of like that, yeah, but automated. Then semantic analysis

262
00:13:34.159 --> 00:13:36.600
<v Speaker 2>tries to figure out the actual meaning based on that structure.

263
00:13:36.799 --> 00:13:39.120
<v Speaker 2>Discourse integration looks at how the meaning of a sentence

264
00:13:39.159 --> 00:13:42.360
<v Speaker 2>depends on the sentences that came before it. Context matters hugely,

265
00:13:42.840 --> 00:13:46.159
<v Speaker 2>and finally, pragmatic analysis tries to understand the meaning in

266
00:13:46.200 --> 00:13:50.000
<v Speaker 2>the broader context of the situation, the speaker's intent, real

267
00:13:50.039 --> 00:13:53.360
<v Speaker 2>world knowledge. It's about understanding not just what was said,

268
00:13:53.440 --> 00:13:56.679
<v Speaker 2>but why. Now, to actually do this analysis, NLP uses

269
00:13:56.759 --> 00:14:01.039
<v Speaker 2>various techniques, many available in Python's NLTK library. A fundamental one

270
00:14:01.080 --> 00:14:05.279
<v Speaker 2>is tokenization. Tokenization? Just breaking text down into smaller units

271
00:14:05.360 --> 00:14:09.440
<v Speaker 2>or tokens, usually words, sometimes sentences, or even characters. It's

272
00:14:09.440 --> 00:14:12.399
<v Speaker 2>a first step in processing text. Chopping it up? Pretty much.

273
00:14:12.960 --> 00:14:16.240
<v Speaker 2>Then you have things like stemming and lemmatization. Both try

274
00:14:16.240 --> 00:14:19.240
<v Speaker 2>to reduce words to their root or base form. Stemming

275
00:14:19.279 --> 00:14:21.240
<v Speaker 2>is simpler, kind of a blunt tool, just chops off

276
00:14:21.320 --> 00:14:25.159
<v Speaker 2>endings based on rules. So "writing", "writes", "written" might all

277
00:14:25.200 --> 00:14:28.480
<v Speaker 2>become "writ" or "write", depending on the stemmer. It's fast,

278
00:14:28.600 --> 00:14:32.840
<v Speaker 2>but can be crude. Lemmatization is smarter. It uses vocabulary

279
00:14:32.919 --> 00:14:36.279
<v Speaker 2>and morphological analysis, understanding word structure and parts of speech

280
00:14:36.480 --> 00:14:40.799
<v Speaker 2>to get the actual dictionary form, the lemma. So "writing",

281
00:14:41.360 --> 00:14:45.759
<v Speaker 2>"writes", "written" would likely all become "write". And importantly, it

282
00:14:45.799 --> 00:14:49.080
<v Speaker 2>can distinguish based on context, like the word "saw" could

283
00:14:49.120 --> 00:14:52.159
<v Speaker 2>become "see" (verb) or stay "saw" (noun).

284
00:14:52.279 --> 00:14:54.120
<v Speaker 1>More accurate, but probably slower.
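
NOTE
Editor's sketch, not NLTK's implementation: a toy tokenizer, a crude suffix
stemmer, and a lookup-table stand-in for a lemmatizer, to make the contrast
above concrete. SUFFIXES and LEMMAS are hand-written assumptions, not a real
stemming algorithm or dictionary.
```python
import re
SUFFIXES = ("ing", "es", "ed", "s")  # a tiny illustrative rule set, not the Porter algorithm
def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())
def crude_stem(word):
    """Chop the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word
# A lemmatizer needs a vocabulary; this hand-written table stands in for one.
LEMMAS = {"writing": "write", "writes": "write", "written": "write"}
def lemmatize(word):
    """Look the word up in the table, falling back to the word itself."""
    return LEMMAS.get(word, word)
tokens = tokenize("Writing, writes, written.")
print(tokens)
print([crude_stem(t) for t in tokens])
print([lemmatize(t) for t in tokens])
```
The rule-based stemmer leaves "written" untouched because no suffix rule matches, while the vocabulary lookup maps all three forms to "write".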

285
00:14:54.320 --> 00:14:58.440
<v Speaker 2>Generally, yes. Lemmatization is usually more linguistically correct. And then

286
00:14:58.480 --> 00:15:00.360
<v Speaker 2>for machine learning on text, you need to convert words

287
00:15:00.399 --> 00:15:03.279
<v Speaker 2>into numbers, into features. Two common ways are the bag

288
00:15:03.320 --> 00:15:05.080
<v Speaker 2>of words, or BoW, model.

289
00:15:05.240 --> 00:15:07.519
<v Speaker 1>Bag of words? Sounds messy.

290
00:15:07.360 --> 00:15:10.200
<v Speaker 2>Ha, it kind of is. It basically treats a document

291
00:15:10.240 --> 00:15:12.600
<v Speaker 2>as just a collection, a bag, of its words, ignoring

292
00:15:12.639 --> 00:15:14.960
<v Speaker 2>grammar and word order, and just counts how many times

293
00:15:14.960 --> 00:15:17.759
<v Speaker 2>each word appears. Simple but often effective for things like

294
00:15:17.799 --> 00:15:21.279
<v Speaker 2>topic classification. Just the counts matter? Mostly, yes. And

295
00:15:21.320 --> 00:15:25.000
<v Speaker 2>a more sophisticated approach is TF-IDF, which stands for term

296
00:15:25.039 --> 00:15:29.480
<v Speaker 2>frequency-inverse document frequency. TF-IDF? This tries to figure out

297
00:15:29.519 --> 00:15:31.639
<v Speaker 2>how important a word is to a document within a

298
00:15:31.679 --> 00:15:35.000
<v Speaker 2>larger collection of documents. It weights words higher if they

299
00:15:35.039 --> 00:15:39.120
<v Speaker 2>appear frequently in one document (term frequency) but rarely in

300
00:15:39.159 --> 00:15:43.200
<v Speaker 2>other documents (inverse document frequency). This helps filter out common

301
00:15:43.200 --> 00:15:46.759
<v Speaker 2>words like "the" or "is" and highlight the truly meaningful

302
00:15:46.440 --> 00:15:50.159
<v Speaker 1>terms. So it finds the keywords, essentially? In a statistical way,

303
00:15:50.240 --> 00:15:54.840
<v Speaker 2>yes. And these techniques, BoW and TF-IDF, are really useful

304
00:15:54.840 --> 00:15:58.639
<v Speaker 2>for things like automatically predicting the category of a news article, or,

305
00:15:58.679 --> 00:16:01.840
<v Speaker 2>as the sources mention, even something like predicting gender based

306
00:16:01.840 --> 00:16:02.440
<v Speaker 2>on names.
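
NOTE
Editor's sketch, not from the sources: bag-of-words counts and one common
unsmoothed TF-IDF variant on a made-up three-document corpus. Libraries differ
in the exact weighting formula; function names and DOCS are assumptions.
```python
import math
from collections import Counter
DOCS = ["the cat sat on the mat", "the dog chased the cat", "stocks fell on the news"]  # made-up corpus
def bag_of_words(doc):
    """Word counts only; grammar and word order are discarded."""
    return Counter(doc.split())
def tf_idf(term, doc, docs):
    """Term frequency times inverse document frequency (one common unsmoothed variant)."""
    words = doc.split()
    tf = words.count(term) / len(words)
    containing = sum(term in d.split() for d in docs)
    idf = math.log(len(docs) / containing)  # assumes the term occurs in at least one document
    return tf * idf
print(bag_of_words(DOCS[0]))
print(tf_idf("the", DOCS[0], DOCS), tf_idf("mat", DOCS[0], DOCS))
```
"the" occurs in every document, so its inverse document frequency is log(1) = 0 and its weight vanishes; "mat" occurs in only one document and keeps a positive weight.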

307
00:16:02.720 --> 00:16:05.960
<v Speaker 1>Interesting. Okay, so that's text. What about sound? How does

308
00:16:06.000 --> 00:16:08.840
<v Speaker 1>AI hear and understand speech? Right.

309
00:16:08.919 --> 00:16:12.879
<v Speaker 2>Speech recognition. It's about getting a machine to understand spoken language,

310
00:16:13.320 --> 00:16:16.240
<v Speaker 2>and the difficulty really varies. A big factor is the

311
00:16:16.320 --> 00:16:20.240
<v Speaker 2>vocabulary size. A system designed to understand just digits zero

312
00:16:20.320 --> 00:16:23.039
<v Speaker 2>through nine is much simpler than one trying to handle

313
00:16:23.200 --> 00:16:25.519
<v Speaker 2>general dictation with tens of thousands

314
00:16:25.120 --> 00:16:27.840
<v Speaker 1>of words. Makes sense, fewer options. Exactly.

315
00:16:28.159 --> 00:16:31.639
<v Speaker 2>Then you have channel characteristics. Is it a clean recording or

316
00:16:31.720 --> 00:16:34.799
<v Speaker 2>is there lots of background noise? Signal-to-noise ratio

317
00:16:34.919 --> 00:16:38.159
<v Speaker 2>is key, and the microphone quality and placement matter too.

318
00:16:38.440 --> 00:16:41.039
<v Speaker 1>Okay, so how does it process the sound itself?

319
00:16:41.480 --> 00:16:45.519
<v Speaker 2>Well, the first steps are usually recording, which digitizes the

320
00:16:45.559 --> 00:16:49.759
<v Speaker 2>analog sound wave, and sampling, which converts that continuous signal

321
00:16:49.759 --> 00:16:52.840
<v Speaker 2>into a series of discrete numerical values at a certain rate,

322
00:16:53.080 --> 00:16:54.000
<v Speaker 2>the sampling

323
00:16:53.639 --> 00:16:56.120
<v Speaker 1>frequency. Turning sound into numbers. Precisely.
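
NOTE
Editor's sketch of the sampling step described above: a continuous wave is read off at evenly
spaced instants, giving a list of numbers. The 440 Hz tone, the 8000 Hz sampling frequency and
the function name sample_tone are assumptions chosen for illustration, not values from the sources.
```python
import math
def sample_tone(freq_hz, sample_rate_hz, duration_s):
    # Evaluate sin(2*pi*f*t) at the discrete times t = n / sample_rate_hz.
    n_samples = round(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz) for n in range(n_samples)]
# Ten milliseconds of a pure tone sampled 8000 times per second gives 80 numbers.
samples = sample_tone(freq_hz=440, sample_rate_hz=8000, duration_s=0.01)
```
A higher sampling frequency yields more numbers per second of audio and a closer
approximation of the original wave.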

324
00:16:56.639 --> 00:16:59.039
<v Speaker 2>Then to make sense of those numbers, AI needs to

325
00:16:59.080 --> 00:17:02.840
<v Speaker 2>extract meaningful features from the speech signal. A very common

326
00:17:02.879 --> 00:17:09.640
<v Speaker 2>technique here is using MFCCs, mel-frequency cepstral coefficients. MFCCs. Catchy, huh?

327
00:17:09.880 --> 00:17:12.640
<v Speaker 1>Yeah. They're basically a way to represent the short term

328
00:17:12.680 --> 00:17:15.720
<v Speaker 1>power spectrum of the sound, but transformed onto a scale,

329
00:17:15.759 --> 00:17:19.359
<v Speaker 1>the mel scale, that better reflects human hearing perception. It

330
00:17:19.400 --> 00:17:22.640
<v Speaker 1>helps capture the unique characteristics of different sounds, like vowels

331
00:17:22.640 --> 00:17:25.319
<v Speaker 1>and consonants, in a compact form that the AI can

332
00:17:25.400 --> 00:17:25.960
<v Speaker 1>learn from.
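
NOTE
Editor's sketch of the mel scale mentioned above. The formula used here,
mel = 2595 * log10(1 + f / 700), is one common convention and is an assumption of this sketch;
the sources do not say which variant they use. A full MFCC pipeline has further steps
(framing, spectrum, filter bank, cosine transform) that are not shown.
```python
import math
def hz_to_mel(freq_hz):
    # Map a frequency in hertz onto the perceptually motivated mel scale.
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)
def mel_to_hz(mel):
    # Inverse mapping, from mel back to hertz.
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
# Equal steps in hertz shrink on the mel scale as frequency rises.
low_step = hz_to_mel(2000) - hz_to_mel(1000)
high_step = hz_to_mel(8000) - hz_to_mel(7000)
```
The shrinking steps reflect the point made above: human hearing tells low frequencies apart
more finely than high ones.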

333
00:17:26.039 --> 00:17:28.759
<v Speaker 2>So it's finding the fingerprints of speech sounds. That's a

334
00:17:28.759 --> 00:17:30.839
<v Speaker 2>good way to put it, and the sources point out

335
00:17:30.880 --> 00:17:33.599
<v Speaker 2>how practical this has become, mentioning things like using the

336
00:17:33.640 --> 00:17:37.400
<v Speaker 2>Google Speech API, readily available tools that let developers incorporate

337
00:17:37.440 --> 00:17:38.960
<v Speaker 2>speech recognition into their apps.

338
00:17:39.440 --> 00:17:43.240
<v Speaker 1>Very cool. All right, so we've covered language and hearing.

339
00:17:43.680 --> 00:17:46.119
<v Speaker 1>What about sight? How does AI see?

340
00:17:46.440 --> 00:17:49.640
<v Speaker 2>That brings us to computer vision, or CV. This is

341
00:17:49.680 --> 00:17:52.000
<v Speaker 2>the field that tries to enable machines to see and

342
00:17:52.079 --> 00:17:55.799
<v Speaker 2>interpret the visual world, usually from digital images or videos.

343
00:17:56.279 --> 00:17:59.759
<v Speaker 2>The goal is often to reconstruct, understand, or interpret a

344
00:17:59.759 --> 00:18:01.799
<v Speaker 2>3D scene from its 2D images.

345
00:18:02.079 --> 00:18:04.480
<v Speaker 1>So it's more than just processing an image.

346
00:18:04.559 --> 00:18:07.640
<v Speaker 2>Yeah, that's a key distinction. Image processing usually takes an

347
00:18:07.680 --> 00:18:11.240
<v Speaker 2>image as input and produces another image as output, maybe enhanced

348
00:18:11.440 --> 00:18:14.400
<v Speaker 2>or filtered. Computer vision, on the other hand, takes an

349
00:18:14.440 --> 00:18:17.079
<v Speaker 2>image as input but aims to produce some kind of

350
00:18:17.160 --> 00:18:21.119
<v Speaker 2>understanding or description as output. What objects are in the image,

351
00:18:21.119 --> 00:18:22.319
<v Speaker 2>where are they, what's happening?

352
00:18:22.480 --> 00:18:25.039
<v Speaker 1>Got it. Understanding, not just tweaking.

353
00:18:24.759 --> 00:18:27.880
<v Speaker 2>Exactly. And the applications are just huge. Think robotics: helping

354
00:18:27.960 --> 00:18:32.720
<v Speaker 2>robots navigate, identify objects, avoid obstacles, even understand human gestures.

355
00:18:32.960 --> 00:18:36.759
<v Speaker 1>Self-driving cars must use this heavily. Absolutely, a prime example.

356
00:18:36.799 --> 00:18:41.000
<v Speaker 2>And in medicine it's revolutionizing things like analyzing medical scans

357
00:18:41.039 --> 00:18:44.839
<v Speaker 2>to detect tumors or anomalies, reconstructing 3D models of organs,

358
00:18:45.480 --> 00:18:49.160
<v Speaker 2>really powerful stuff. A cornerstone library for doing CV in

359
00:18:49.200 --> 00:18:53.400
<v Speaker 2>Python is OpenCV, the Open Source Computer Vision Library. OpenCV.

360
00:18:53.640 --> 00:18:54.000
<v Speaker 1>Okay.

361
00:18:54.160 --> 00:18:56.680
<v Speaker 2>It's incredibly powerful and versatile. Lets you do all sorts

362
00:18:56.720 --> 00:19:00.680
<v Speaker 2>of things: read, write, and display images and video, convert between color

363
00:19:00.720 --> 00:19:04.920
<v Speaker 2>spaces, like from standard BGR color to grayscale, detect edges

364
00:19:05.039 --> 00:19:06.680
<v Speaker 2>using algorithms like the Canny edge.

365
00:19:06.559 --> 00:19:08.839
<v Speaker 1>detector. Finding the outlines of things. Yep.

366
00:19:08.920 --> 00:19:12.799
<v Speaker 2>And even more complex tasks like object detection. OpenCV includes

367
00:19:12.839 --> 00:19:16.119
<v Speaker 2>pre-trained models like Haar cascade classifiers that are quite

368
00:19:16.119 --> 00:19:19.319
<v Speaker 2>effective at detecting specific objects like faces or even eyes

369
00:19:19.359 --> 00:19:20.799
<v Speaker 2>within an image in real time.
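
NOTE
Editor's sketch of the BGR-to-grayscale conversion mentioned above, written in plain Python
rather than with OpenCV so that it runs without extra packages. The weights are the widely used
luma weights (0.299 red, 0.587 green, 0.114 blue); the function names and the 2x2 test image are
hypothetical and exist only for this illustration.
```python
def bgr_to_gray(pixel):
    # A pixel is a (blue, green, red) tuple of 0..255 values, in the BGR channel order.
    blue, green, red = pixel
    return round(0.114 * blue + 0.587 * green + 0.299 * red)
def image_to_gray(image):
    # image: list of rows, each row a list of BGR pixels.
    return [[bgr_to_gray(p) for p in row] for row in image]
# Pure blue, pure green, pure red and white pixels.
image = [[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]]
gray = image_to_gray(image)
```
Green contributes most to the gray value and blue least, which roughly tracks how bright
each color looks to the eye.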

370
00:19:20.920 --> 00:19:24.000
<v Speaker 1>Wow, okay, so AI can learn. It can understand language,

371
00:19:24.039 --> 00:19:26.680
<v Speaker 1>process audio, see the world. Yeah. How does it build

372
00:19:26.720 --> 00:19:30.240
<v Speaker 1>the brains behind all this? The complex structures that enable

373
00:19:30.279 --> 00:19:31.160
<v Speaker 1>this advanced stuff?

374
00:19:31.279 --> 00:19:33.640
<v Speaker 2>Right. That takes us into the realm of neural networks

375
00:19:33.680 --> 00:19:37.799
<v Speaker 2>and deep learning. At a basic level, artificial neural networks

376
00:19:37.839 --> 00:19:42.079
<v Speaker 2>(ANNs) are computing systems inspired by the structure and function

377
00:19:42.359 --> 00:19:45.000
<v Speaker 2>of the biological neural networks that make up animal brains.

378
00:19:45.440 --> 00:19:47.400
<v Speaker 1>Like modeling the brain. Loosely.

379
00:19:47.119 --> 00:19:51.640
<v Speaker 2>Yes. They consist of interconnected nodes, or neurons, organized in layers.

380
00:19:52.160 --> 00:19:55.039
<v Speaker 2>Each connection has a weight associated with it, and these

381
00:19:55.039 --> 00:19:58.799
<v Speaker 2>weights are adjusted during the learning process. They essentially learn

382
00:19:58.880 --> 00:20:03.319
<v Speaker 2>to recognize patterns by strengthening or weakening these connections based

383
00:20:03.359 --> 00:20:06.200
<v Speaker 2>on the input data. Deep learning is essentially a type

384
00:20:06.200 --> 00:20:09.880
<v Speaker 2>of machine learning that uses ANNs with many layers,

385
00:20:10.000 --> 00:20:13.039
<v Speaker 2>hence "deep." What makes deep learning special is that these

386
00:20:13.119 --> 00:20:16.640
<v Speaker 2>layered structures allow the model to learn hierarchies of features

387
00:20:16.680 --> 00:20:17.680
<v Speaker 2>directly from the data.
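
NOTE
Editor's sketch of "weights are adjusted during the learning process", using the simplest
possible case: a single artificial neuron (a perceptron) learning the logical AND function.
This is a one-neuron illustration, not deep learning; the AND task, the integer learning rate
and the function names are assumptions made for the example.
```python
def predict(weights, bias, inputs):
    # Fire (output 1) when the weighted sum of the inputs exceeds zero.
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0
def train_perceptron(samples, epochs=10, learning_rate=1):
    weights, bias = [0, 0], 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            # Strengthen or weaken each connection in proportion to its input and the error.
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
outputs = [predict(weights, bias, inputs) for inputs, _ in and_gate]
```
After a few passes over the four examples the neuron reproduces the AND truth table. Deep
networks stack many such units in layers and adjust the weights with gradient-based methods.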

388
00:20:17.759 --> 00:20:18.440
<v Speaker 1>Hierarchies?

389
00:20:18.519 --> 00:20:21.119
<v Speaker 2>Yeah, so, for image recognition, the first layer might learn

390
00:20:21.160 --> 00:20:23.799
<v Speaker 2>to detect simple edges. The next layer might combine edges

391
00:20:23.839 --> 00:20:26.400
<v Speaker 2>to detect shapes. The layer after that might combine shapes

392
00:20:26.440 --> 00:20:28.519
<v Speaker 2>to detect parts of objects like an eye or a nose,

393
00:20:28.880 --> 00:20:32.039
<v Speaker 2>and later layers combine those parts to recognize whole objects

394
00:20:32.279 --> 00:20:36.119
<v Speaker 2>like a face. It learns these representations automatically.

395
00:20:35.519 --> 00:20:38.759
<v Speaker 1>So it builds understanding layer by layer. Exactly.

396
00:20:38.680 --> 00:20:41.599
<v Speaker 2>And that's a key difference from traditional machine learning. In

397
00:20:41.680 --> 00:20:45.880
<v Speaker 2>traditional ML, you often need significant human effort to design

398
00:20:45.920 --> 00:20:49.160
<v Speaker 2>and select the right features from the data first. Deep

399
00:20:49.240 --> 00:20:52.200
<v Speaker 2>learning aims to learn those features automatically as part of

400
00:20:52.240 --> 00:20:52.799
<v Speaker 2>the process.

401
00:20:53.000 --> 00:20:56.640
<v Speaker 1>Okay, so what are the trade offs then, between traditional

402
00:20:56.799 --> 00:20:57.960
<v Speaker 1>ML and deep learning?

403
00:20:58.160 --> 00:21:01.599
<v Speaker 2>Good question. Deep learning generally needs a lot more data

404
00:21:01.599 --> 00:21:05.200
<v Speaker 2>to perform really well. With smaller data sets, traditional ML

405
00:21:05.279 --> 00:21:09.319
<v Speaker 2>might actually be better. Deep learning also typically requires more

406
00:21:09.359 --> 00:21:13.119
<v Speaker 2>powerful hardware. GPUs are almost essential because training these deep

407
00:21:13.160 --> 00:21:15.759
<v Speaker 2>networks is computationally very intensive.

408
00:21:15.920 --> 00:21:17.400
<v Speaker 1>More data, more power needed.

409
00:21:17.720 --> 00:21:20.559
<v Speaker 2>Right. Feature extraction, as we said, is largely automatic in

410
00:21:20.599 --> 00:21:25.440
<v Speaker 2>deep learning versus manual in much of traditional ML. Training

411
00:21:25.480 --> 00:21:28.319
<v Speaker 2>time for deep learning can be much longer, but testing

412
00:21:28.359 --> 00:21:32.480
<v Speaker 2>time, making predictions once trained, can sometimes be faster, and

413
00:21:32.519 --> 00:21:35.559
<v Speaker 2>deep learning often tackles problems end to end, taking raw input

414
00:21:35.680 --> 00:21:39.359
<v Speaker 2>like pixels and producing the final output like a classification,

415
00:21:39.519 --> 00:21:43.440
<v Speaker 2>whereas traditional ML might break the problem into several distinct steps.

416
00:21:44.000 --> 00:21:47.000
<v Speaker 2>One particularly important type of deep network, especially for images,

417
00:21:47.480 --> 00:21:50.160
<v Speaker 2>is the convolutional neural network or CNN.

418
00:21:50.400 --> 00:21:52.720
<v Speaker 1>CNN. Heard that acronym a lot. Yeah.

419
00:21:52.559 --> 00:21:55.920
<v Speaker 2>They're ubiquitous in computer vision now. Unlike standard neural networks

420
00:21:56.000 --> 00:21:57.960
<v Speaker 2>that might just treat an image as a long list

421
00:21:58.000 --> 00:22:01.680
<v Speaker 2>of pixels, CNNs are specifically designed to process data that

422
00:22:01.759 --> 00:22:05.039
<v Speaker 2>has a grid-like topology, like an image. They use special layers,

423
00:22:05.119 --> 00:22:06.480
<v Speaker 2>particularly convolutional layers.

424
00:22:06.519 --> 00:22:07.039
<v Speaker 1>What do those do?

425
00:22:07.319 --> 00:22:11.039
<v Speaker 2>They apply learnable filters across the input image, essentially sliding

426
00:22:11.039 --> 00:22:13.519
<v Speaker 2>a small window over the image and looking for specific

427
00:22:13.559 --> 00:22:17.960
<v Speaker 2>patterns like edges, corners, textures. Other key layers include ReLU

428
00:22:18.119 --> 00:22:22.079
<v Speaker 2>layers for introducing nonlinearity, pooling layers to reduce the spatial

429
00:22:22.119 --> 00:22:26.319
<v Speaker 2>size and computational load, and finally, fully connected layers often

430
00:22:26.359 --> 00:22:29.720
<v Speaker 2>at the end to perform the actual classification based on

431
00:22:29.759 --> 00:22:30.640
<v Speaker 2>the learned features.
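
NOTE
Editor's sketch of the three CNN building blocks named above (convolutional layer, ReLU,
pooling), in plain Python on a tiny image. In a real CNN the filter values are learned; here a
hand-written vertical-edge filter is assumed so the result can be checked by eye. The image,
the filter and the function names are hypothetical.
```python
def conv2d(image, kernel):
    # Slide the kernel over the image (no padding, stride 1) and sum the elementwise products.
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw)) for j in range(out_w)] for i in range(out_h)]
def relu(feature_map):
    # Nonlinearity: negative responses are clipped to zero.
    return [[max(0, v) for v in row] for row in feature_map]
def max_pool(feature_map, size=2):
    # Keep the strongest response in each non-overlapping size x size window.
    return [[max(feature_map[i + a][j + b] for a in range(size) for b in range(size)) for j in range(0, len(feature_map[0]) - size + 1, size)] for i in range(0, len(feature_map) - size + 1, size)]
# 5x5 image: dark on the left, bright on the right, so there is one vertical edge.
image = [[0, 0, 0, 1, 1] for _ in range(5)]
edge_filter = [[-1, 1], [-1, 1]]
feature_map = relu(conv2d(image, edge_filter))
pooled = max_pool(feature_map)
```
The 4x4 feature map is nonzero only in the column where the brightness jumps, and pooling
shrinks it to 2x2 while keeping that response.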

432
00:22:30.880 --> 00:22:34.559
<v Speaker 1>So they're built specifically to understand the structure of images.

433
00:22:34.119 --> 00:22:38.920
<v Speaker 2>Precisely, and they are incredibly effective for image classification, object detection,

434
00:22:39.160 --> 00:22:40.559
<v Speaker 2>and many other vision tasks.

435
00:22:40.880 --> 00:22:44.319
<v Speaker 1>Okay, so that covers learning and perception, but AI isn't

436
00:22:44.400 --> 00:22:47.920
<v Speaker 1>just about processing data, right? It's also about strategy, about

437
00:22:47.960 --> 00:22:51.480
<v Speaker 1>making decisions, especially in complex situations like games.

438
00:22:51.599 --> 00:22:55.400
<v Speaker 2>Absolutely. That involves different kinds of AI techniques, often related

439
00:22:55.440 --> 00:22:58.920
<v Speaker 2>to search and planning. One key area is heuristic search.

440
00:22:59.039 --> 00:23:01.799
<v Speaker 1>Heuristic search? Yeah, think of it as informed search.

441
00:23:02.400 --> 00:23:05.480
<v Speaker 2>When you have a really complex problem with many possible

442
00:23:05.519 --> 00:23:08.319
<v Speaker 2>paths or solutions, like finding the best route on a map,

443
00:23:08.400 --> 00:23:10.559
<v Speaker 2>or figuring out the next move in chess, searching through

444
00:23:10.559 --> 00:23:15.319
<v Speaker 2>every possibility is often impossible. Heuristic search uses a heuristic,

445
00:23:15.559 --> 00:23:17.920
<v Speaker 2>which is like an educated guess or a rule of thumb,

446
00:23:18.200 --> 00:23:21.119
<v Speaker 2>to estimate how close a particular state is to the goal.

447
00:23:21.799 --> 00:23:24.640
<v Speaker 2>This helps guide the search towards promising areas and prune

448
00:23:24.680 --> 00:23:26.200
<v Speaker 2>away less likely paths.
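
NOTE
Editor's sketch of a heuristic (informed) search. The sources do not name a specific algorithm;
A* with a Manhattan-distance heuristic on a small grid is assumed here as a standard example.
The grid layout, the '#' wall marker and the function name a_star are hypothetical.
```python
import heapq
def a_star(grid, start, goal):
    # grid: list of strings, '#' is a wall. Returns the length of a shortest path, or None.
    def heuristic(cell):
        # Educated guess of the remaining distance: steps needed if there were no walls.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])
    frontier = [(heuristic(start), 0, start)]
    best_cost = {start: 0}
    while frontier:
        _, cost, cell = heapq.heappop(frontier)
        if cell == goal:
            return cost
        if cost > best_cost[cell]:
            continue  # stale queue entry; a cheaper route to this cell was found later
        row, col = cell
        for nxt in ((row + 1, col), (row - 1, col), (row, col + 1), (row, col - 1)):
            if 0 <= nxt[0] < len(grid) and 0 <= nxt[1] < len(grid[0]) and grid[nxt[0]][nxt[1]] != "#":
                new_cost = cost + 1
                if new_cost < best_cost.get(nxt, float("inf")):
                    best_cost[nxt] = new_cost
                    # Priority = cost so far + estimated cost to go.
                    heapq.heappush(frontier, (new_cost + heuristic(nxt), new_cost, nxt))
    return None
grid = ["..#.", "..#.", "...."]
steps = a_star(grid, (0, 0), (0, 3))
```
The heuristic pulls the search toward the goal, so cells that look far from it are expanded
late or not at all, while the path found is still a shortest one on this grid.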

449
00:23:26.559 --> 00:23:28.799
<v Speaker 1>A smart shortcut for searching. Kind of, yeah.

450
00:23:28.920 --> 00:23:31.559
<v Speaker 2>It's used in solving lots of problems, including things called

451
00:23:31.599 --> 00:23:35.440
<v Speaker 2>constraint satisfaction problems, CSPs, where you need to find a

452
00:23:35.480 --> 00:23:38.359
<v Speaker 2>solution that meets a set of specific rules or constraints.

453
00:23:38.799 --> 00:23:42.960
<v Speaker 2>And search algorithms are absolutely fundamental in games. AI in

454
00:23:43.000 --> 00:23:46.279
<v Speaker 2>games often works by thinking ahead, exploring possible future moves

455
00:23:46.319 --> 00:23:49.519
<v Speaker 2>and counter moves. You can visualize this as a game tree.

456
00:23:49.400 --> 00:23:51.839
<v Speaker 1>Like mapping out all the possibilities. Exactly.

457
00:23:52.200 --> 00:23:55.319
<v Speaker 2>A famous algorithm for this in two player games is minimax.

458
00:23:55.920 --> 00:23:59.720
<v Speaker 2>It assumes both players play optimally. The AI tries to

459
00:23:59.759 --> 00:24:03.240
<v Speaker 2>choose the move that maximizes its own potential score while

460
00:24:03.240 --> 00:24:06.599
<v Speaker 2>minimizing the maximum score the opponent can achieve. It looks

461
00:24:06.640 --> 00:24:09.799
<v Speaker 2>ahead and plays defensively, assuming the worst from the opponent.
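
NOTE
Editor's sketch of minimax on a tiny coin game. The rules are an assumption made for this
example and are not taken from the sources: players alternately remove 1, 2 or 3 coins from a
pile, and whoever takes the last coin wins. Scores are from the maximizer's point of view.
```python
def minimax(coins, maximizing):
    # Returns +1 if the maximizer wins with best play from here, -1 if it loses.
    if coins == 0:
        # The player who just moved took the last coin and won.
        return -1 if maximizing else 1
    scores = [minimax(coins - take, not maximizing) for take in (1, 2, 3) if take <= coins]
    # The maximizer picks the best outcome; the opponent is assumed to pick the worst for us.
    return max(scores) if maximizing else min(scores)
def best_move(coins):
    # Try each legal move and keep the one whose resulting position scores highest.
    return max((t for t in (1, 2, 3) if t <= coins), key=lambda t: minimax(coins - t, False))
```
For example, best_move(5) returns 1: it leaves 4 coins, a position from which the opponent
loses whatever they do.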

462
00:24:10.000 --> 00:24:13.000
<v Speaker 1>Planning for the worst case. Right, but exploring

463
00:24:12.680 --> 00:24:15.920
<v Speaker 2>the entire game tree is often still too much, so

464
00:24:16.119 --> 00:24:18.640
<v Speaker 2>a crucial optimization is alpha-beta

465
00:24:18.440 --> 00:24:20.200
<v Speaker 1>pruning. Alpha-beta pruning. Okay.

466
00:24:20.240 --> 00:24:22.920
<v Speaker 2>This is a clever technique that dramatically reduces the number

467
00:24:23.000 --> 00:24:25.960
<v Speaker 2>of nodes the minimax algorithm needs to examine in the

468
00:24:26.000 --> 00:24:28.559
<v Speaker 2>game tree. It keeps track of the best score each

469
00:24:28.559 --> 00:24:32.079
<v Speaker 2>player can guarantee themselves, alpha for the maximizer, beta for

470
00:24:32.119 --> 00:24:35.400
<v Speaker 2>the minimizer, and stops exploring a branch as soon as

471
00:24:35.440 --> 00:24:37.920
<v Speaker 2>it realizes that branch definitely won't lead to a better

472
00:24:37.960 --> 00:24:39.200
<v Speaker 2>outcome than one already

473
00:24:38.960 --> 00:24:41.960
<v Speaker 1>found. So it avoids wasting time on definitely

474
00:24:41.640 --> 00:24:45.599
<v Speaker 2>bad moves. Exactly. It makes searching much, much faster. These

475
00:24:45.680 --> 00:24:48.839
<v Speaker 2>kinds of algorithms, minimax with alpha-beta, are how AI

476
00:24:48.920 --> 00:24:51.359
<v Speaker 2>got good at games like tic-tac-toe, checkers, and even

477
00:24:51.440 --> 00:24:54.480
<v Speaker 2>chess, as mentioned in the sources, with examples like Last

478
00:24:54.240 --> 00:24:59.160
<v Speaker 1>Coin Standing. Fascinating. And for really complex problems where maybe

479
00:24:59.160 --> 00:25:03.960
<v Speaker 1>the rules aren't perfectly defined, AI can even evolve solutions.
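
NOTE
Editor's sketch of the alpha-beta pruning idea from the game-search discussion above, before
the conversation moves on to evolved solutions. It uses an assumed coin game (remove 1, 2 or 3
coins; taking the last coin wins), which is not the rule set from the sources. The counter
argument and full_tree_size helper are hypothetical additions for comparing node counts.
```python
def alphabeta(coins, maximizing, alpha=-1, beta=1, counter=None):
    # alpha: best score the maximizer can already guarantee; beta: same for the minimizer.
    # Scores are only ever -1 or +1, so the starting window is [-1, 1].
    if counter is not None:
        counter[0] += 1  # count every node that is actually examined
    if coins == 0:
        return -1 if maximizing else 1
    value = -2 if maximizing else 2
    for take in (1, 2, 3):
        if take > coins:
            break
        score = alphabeta(coins - take, not maximizing, alpha, beta, counter)
        if maximizing:
            value = max(value, score)
            alpha = max(alpha, value)
        else:
            value = min(value, score)
            beta = min(beta, value)
        if alpha >= beta:
            break  # the remaining moves cannot change the decision higher up the tree
    return value
def full_tree_size(coins):
    # Number of nodes plain minimax would examine from this position.
    return 1 + sum(full_tree_size(coins - take) for take in (1, 2, 3) if take <= coins)
visited = [0]
result = alphabeta(10, True, counter=visited)
```
On this toy game the pruned search reaches the same verdict as plain minimax while examining
fewer nodes than full_tree_size(10) reports for the complete tree.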

480
00:25:04.079 --> 00:25:08.319
<v Speaker 2>Yeah, that's another really interesting approach. Genetic algorithms, GAs. These

481
00:25:08.319 --> 00:25:14.920
<v Speaker 2>are search techniques inspired directly by biological evolution: natural selection, crossover, mutation. Like

482
00:25:14.920 --> 00:25:17.359
<v Speaker 1>Survival of the fittest for computer programs.

483
00:25:17.480 --> 00:25:20.480
<v Speaker 2>That's the core idea. You start with a population of

484
00:25:20.519 --> 00:25:24.960
<v Speaker 2>potential solutions to your problem, then you iteratively apply these steps.

485
00:25:25.640 --> 00:25:28.400
<v Speaker 2>Selection, where the better-performing solutions are more likely to

486
00:25:28.480 --> 00:25:32.240
<v Speaker 2>be chosen to reproduce. Crossover, where you combine parts of

487
00:25:32.279 --> 00:25:35.759
<v Speaker 2>two parent solutions to create new offspring solutions, hoping to

488
00:25:35.799 --> 00:25:39.759
<v Speaker 2>blend good characteristics. And mutation, where you introduce small random

489
00:25:39.839 --> 00:25:43.119
<v Speaker 2>changes to maintain diversity and potentially discover new improvements.

490
00:25:43.240 --> 00:25:47.559
<v Speaker 1>So it tries out combinations and random tweaks over generations.

491
00:25:47.319 --> 00:25:50.440
<v Speaker 2>Right. Over many cycles, the population tends to evolve towards

492
00:25:50.440 --> 00:25:54.680
<v Speaker 2>better and better solutions. GAs are particularly good for optimization

493
00:25:54.759 --> 00:25:58.200
<v Speaker 2>problems where the search space is huge or complex and

494
00:25:58.240 --> 00:26:01.920
<v Speaker 2>traditional methods might get stuck. The sources give examples like

495
00:26:02.000 --> 00:26:05.119
<v Speaker 2>evolving bit strings to maximize the number of ones, or even

496
00:26:05.160 --> 00:26:09.319
<v Speaker 2>complex tasks like symbolic regression, evolving mathematical formulas to fit data.
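
NOTE
Editor's sketch of a genetic algorithm for the "maximize the number of ones in a bit string"
example mentioned above. Tournament selection, single-point crossover, per-bit mutation and
keeping the best individual each generation (elitism) are choices assumed here for brevity;
the sources may use different operators or a library. All parameter values are illustrative.
```python
import random
def evolve_one_max(length=20, pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    fitness = sum  # the fitness of a bit string is simply its number of ones
    population = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    history = []
    def select():
        # Selection: the fitter of two randomly drawn individuals gets to reproduce.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best[:]]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            mother, father = select(), select()
            cut = rng.randint(1, length - 1)
            child = mother[:cut] + father[cut:]  # crossover: splice two parents together
            child = [1 - bit if rng.random() < mutation_rate else bit for bit in child]  # mutation
            children.append(child)
        population = children
    history.append(fitness(max(population, key=fitness)))
    return history
history = evolve_one_max()
```
history holds the best fitness per generation. Because the best individual is always carried
over, that value never decreases from one generation to the next.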

497
00:26:09.960 --> 00:26:12.839
<v Speaker 1>Wow. Okay, we have definitely covered a lot of ground here,

498
00:26:13.319 --> 00:26:16.359
<v Speaker 1>from the basic definition of AI and Python's role, through

499
00:26:16.400 --> 00:26:19.079
<v Speaker 1>all the ways machines learn and perceive to how they

500
00:26:19.119 --> 00:26:21.160
<v Speaker 1>strategize and even evolve solutions.

501
00:26:21.319 --> 00:26:24.400
<v Speaker 2>We really have. It's a huge field, but hopefully breaking

502
00:26:24.400 --> 00:26:26.559
<v Speaker 2>it down like this helps, you know, seeing how we

503
00:26:26.599 --> 00:26:30.000
<v Speaker 2>get from asking what is intelligence to building systems that

504
00:26:30.039 --> 00:26:34.000
<v Speaker 2>use supervised, unsupervised, or reinforcement learning, then equipping them with

505
00:26:34.079 --> 00:26:37.880
<v Speaker 2>NLP for language or CV for sight, and underpinning

506
00:26:37.880 --> 00:26:40.200
<v Speaker 2>it all with things like neural networks or clever search

507
00:26:40.240 --> 00:26:43.680
<v Speaker 2>algorithms like minimax or GAs. It connects data to

508
00:26:43.759 --> 00:26:45.799
<v Speaker 2>decisions in really sophisticated ways.

509
00:26:46.200 --> 00:26:49.640
<v Speaker 1>Yeah, our mission was definitely to cut through that complexity,

510
00:26:49.640 --> 00:26:52.920
<v Speaker 1>pull out those core ideas, those nuggets of knowledge. We

511
00:26:53.000 --> 00:26:57.599
<v Speaker 1>really hope you, listening, have had some aha moments and feel,

512
00:26:57.920 --> 00:27:00.240
<v Speaker 1>you know, much better equipped to understand what's happening at

513
00:27:00.279 --> 00:27:02.720
<v Speaker 1>the cutting edge of AI. So thinking about all this

514
00:27:02.960 --> 00:27:06.599
<v Speaker 1>machines that learn, see, hear, talk, strategize, it really makes

515
00:27:06.599 --> 00:27:09.880
<v Speaker 1>you wonder: what new frontiers do you think artificial intelligence

516
00:27:09.880 --> 00:27:12.599
<v Speaker 1>will conquer next? What kinds of amazing or maybe even

517
00:27:12.680 --> 00:27:15.680
<v Speaker 1>challenging problems will it help us tackle down the road? Something

518
00:27:15.680 --> 00:27:16.240
<v Speaker 1>to think about.
