WEBVTT

1
00:00:00.120 --> 00:00:02.640
<v Speaker 1>Welcome back to the deep dive, where we unpack complex

2
00:00:02.759 --> 00:00:07.280
<v Speaker 1>topics and bring you the essential insights. Today we're navigating

3
00:00:07.320 --> 00:00:10.359
<v Speaker 1>the exciting world of machine learning on Amazon Web Services.

4
00:00:10.880 --> 00:00:13.039
<v Speaker 1>Our mission for this deep dive is really to distill

5
00:00:13.080 --> 00:00:15.759
<v Speaker 1>the core concepts of machine learning, walk you through its

6
00:00:15.839 --> 00:00:19.280
<v Speaker 1>end to end life cycle, the whole process, and also

7
00:00:19.519 --> 00:00:24.000
<v Speaker 1>highlight how AWS services provide a, well, powerful toolkit for

8
00:00:24.079 --> 00:00:27.760
<v Speaker 1>every single stage. We're drawing our insights from the AWS

9
00:00:27.800 --> 00:00:32.359
<v Speaker 1>Certified Machine Learning Specialty (MLS-C01) certification guide, trying

10
00:00:32.359 --> 00:00:35.640
<v Speaker 1>to pull out those aha moments and strategic takeaways. The

11
00:00:35.679 --> 00:00:38.079
<v Speaker 1>goal is to make you feel truly well informed, whether

12
00:00:38.159 --> 00:00:41.359
<v Speaker 1>you're strategizing for a meeting or just really curious

13
00:00:41.359 --> 00:00:42.039
<v Speaker 1>about this field.

14
00:00:42.399 --> 00:00:44.280
<v Speaker 2>Yeah, and what's truly valuable in this guide and what

15
00:00:44.399 --> 00:00:46.920
<v Speaker 2>we'll focus on today is its ability to break down

16
00:00:46.960 --> 00:00:50.439
<v Speaker 2>these complex ML ideas into actionable knowledge. We're going to

17
00:00:50.479 --> 00:00:53.399
<v Speaker 2>try and give you a clear roadmap basically from foundational

18
00:00:53.439 --> 00:00:57.280
<v Speaker 2>definitions right through to practical AWS applications, showing you how

19
00:00:57.280 --> 00:00:59.079
<v Speaker 2>you might build truly intelligent solutions.

20
00:00:59.240 --> 00:01:00.280
<v Speaker 1>Okay, let's dig in to this.

21
00:01:00.320 --> 00:01:00.520
<v Speaker 2>Then.

22
00:01:00.920 --> 00:01:04.599
<v Speaker 1>The guide starts by clarifying the relationship between artificial intelligence,

23
00:01:04.640 --> 00:01:08.040
<v Speaker 1>machine learning, and deep learning. I like the analogy they use.

24
00:01:08.079 --> 00:01:11.400
<v Speaker 1>Think of it like a set of nested Russian dolls.

25
00:01:11.840 --> 00:01:15.719
<v Speaker 2>Exactly. So at the outermost layer you've got artificial intelligence, AI.

26
00:01:16.000 --> 00:01:19.079
<v Speaker 2>That's the really broad field, aiming to create machines that

27
00:01:19.159 --> 00:01:24.079
<v Speaker 2>can do tasks mimicking human intelligence. Then moving inward, machine

28
00:01:24.159 --> 00:01:27.519
<v Speaker 2>learning, ML, is a key subset of AI. This is

29
00:01:27.560 --> 00:01:31.359
<v Speaker 2>where systems learn from data. They identify patterns and make

30
00:01:31.400 --> 00:01:35.719
<v Speaker 2>predictions without being explicitly programmed. It's about learning from experience,

31
00:01:35.760 --> 00:01:37.400
<v Speaker 2>observing, adapting.

32
00:01:36.959 --> 00:01:39.319
<v Speaker 1>Okay, learning from data, not rules.

33
00:01:39.519 --> 00:01:42.319
<v Speaker 2>Precisely. And then at the very core you have deep learning, DL.

34
00:01:42.680 --> 00:01:45.799
<v Speaker 2>That's an even more specialized subset of ML. Deep learning

35
00:01:45.920 --> 00:01:48.439
<v Speaker 2>uses these multi-layered structures, you've probably heard of them,

36
00:01:48.560 --> 00:01:52.439
<v Speaker 2>deep neural networks. They solve highly complex problems. They're powering

37
00:01:52.480 --> 00:01:53.840
<v Speaker 2>a lot of the state of the art stuff we

38
00:01:53.879 --> 00:01:57.319
<v Speaker 2>see today, like language translation or facial recognition.

39
00:01:57.680 --> 00:02:00.920
<v Speaker 1>So what this hierarchy really means for us, for you listening,

40
00:02:00.959 --> 00:02:04.519
<v Speaker 1>is that we're witnessing this incredible evolution. It's fueled by

41
00:02:04.760 --> 00:02:07.920
<v Speaker 1>well, more computing power and just vast amounts of data

42
00:02:07.920 --> 00:02:11.319
<v Speaker 1>being available now, and AI applications are becoming more powerful,

43
00:02:11.360 --> 00:02:14.680
<v Speaker 1>more accessible and really applicable across almost every industry.

44
00:02:15.840 --> 00:02:18.560
<v Speaker 2>And when these systems learn, they generally fall into three

45
00:02:18.599 --> 00:02:22.159
<v Speaker 2>main approaches, three ways of learning. The first is supervised learning.

46
00:02:22.960 --> 00:02:26.000
<v Speaker 2>This relies on labeled data. So imagine you have a

47
00:02:26.080 --> 00:02:29.159
<v Speaker 2>data set where every example has an answer already attached.

48
00:02:29.439 --> 00:02:30.479
<v Speaker 2>That's your labeled data.

49
00:02:30.240 --> 00:02:33.479
<v Speaker 1>Right, like inputs and the correct outputs.

50
00:02:33.840 --> 00:02:37.680
<v Speaker 2>Exactly. So one common use for supervised learning is classification. Here

51
00:02:37.759 --> 00:02:40.879
<v Speaker 2>the model predicts a category or class. For instance, the

52
00:02:40.919 --> 00:02:45.840
<v Speaker 2>guide talks about classifying financial transactions. Is this fraudulent or legitimate,

53
00:02:46.039 --> 00:02:49.039
<v Speaker 2>based on features like amount, time of day, that sort of thing?

54
00:02:48.840 --> 00:02:50.439
<v Speaker 1>Okay, putting things into buckets.

55
00:02:50.520 --> 00:02:53.639
<v Speaker 2>Yeah. And the other key type is regression. The goal

56
00:02:53.719 --> 00:02:56.840
<v Speaker 2>here is to predict a continuous numerical value. This could

57
00:02:56.840 --> 00:02:59.840
<v Speaker 2>be forecasting sales figures for the next quarter maybe, or

58
00:03:00.039 --> 00:03:01.280
<v Speaker 2>predicting the optimal price for a product.

59
00:03:01.280 --> 00:03:03.520
<v Speaker 1>Got it, predicting a number.

60
00:03:03.960 --> 00:03:08.280
<v Speaker 2>Then there's unsupervised learning. This works with unlabeled data, so

61
00:03:08.360 --> 00:03:12.120
<v Speaker 2>no answers are provided beforehand. Here, the system tries to

62
00:03:12.120 --> 00:03:14.280
<v Speaker 2>find hidden patterns or structures on its own.

63
00:03:14.520 --> 00:03:16.520
<v Speaker 1>Ah, okay, so finding patterns we didn't know were there.

64
00:03:16.479 --> 00:03:19.639
<v Speaker 2>Exactly. A great example of this is clustering:

65
00:03:19.919 --> 00:03:22.879
<v Speaker 2>you group similar data points together. Think about segmenting your

66
00:03:22.879 --> 00:03:26.560
<v Speaker 2>customer base based on their purchasing behavior. You know, to

67
00:03:26.599 --> 00:03:27.680
<v Speaker 2>understand different market segments.

68
00:03:27.360 --> 00:03:29.560
<v Speaker 1>Right, finding natural groupings.

69
00:03:30.599 --> 00:03:34.319
<v Speaker 2>And finally, we have reinforcement learning. This is where a

70
00:03:34.360 --> 00:03:38.199
<v Speaker 2>system learns by interacting with an environment. It gets rewards

71
00:03:38.199 --> 00:03:42.479
<v Speaker 2>for good decisions and, well, penalties for poor ones. It's

72
00:03:42.520 --> 00:03:44.479
<v Speaker 2>a bit like how we learn through trial and error.

73
00:03:44.560 --> 00:03:47.599
<v Speaker 2>The guide mentions an example like an automated call center

74
00:03:47.639 --> 00:03:51.120
<v Speaker 2>agent learning the best path to resolve customer queries by

75
00:03:51.120 --> 00:03:52.879
<v Speaker 2>getting rewarded for good recommendations.

76
00:03:53.240 --> 00:03:57.319
<v Speaker 1>Interesting, learning by doing, essentially. So this next point seems

77
00:03:57.360 --> 00:04:00.400
<v Speaker 1>crucial because it's about how we actually use these. The

78
00:04:00.479 --> 00:04:05.479
<v Speaker 1>approach you choose, supervised, unsupervised, or reinforcement, it totally depends

79
00:04:05.520 --> 00:04:07.080
<v Speaker 1>on your data and the problem you're trying to solve, right?

80
00:04:07.000 --> 00:04:11.719
<v Speaker 2>Absolutely, it's fundamental. Do you have clearly labeled examples?

81
00:04:12.400 --> 00:04:15.719
<v Speaker 2>Supervised is likely your path. Are you looking for hidden

82
00:04:15.759 --> 00:04:19.639
<v Speaker 2>groups in just raw data? Unsupervised. Is it about learning

83
00:04:19.639 --> 00:04:23.800
<v Speaker 2>through interaction and feedback? Reinforcement. The data and the goal

84
00:04:23.920 --> 00:04:24.800
<v Speaker 2>dictate the method.

85
00:04:25.079 --> 00:04:28.360
<v Speaker 1>Makes sense. Now, building effective ML models isn't just about

86
00:04:28.360 --> 00:04:31.680
<v Speaker 1>picking one of those algorithms. It's a structured process. The

87
00:04:31.720 --> 00:04:35.480
<v Speaker 1>guide highlights something called CRISP-DM, the Cross-Industry Standard

88
00:04:35.480 --> 00:04:38.199
<v Speaker 1>Process for Data Mining, as a blueprint for this.

89
00:04:38.439 --> 00:04:41.560
<v Speaker 2>Yeah, CRISP-DM is really widely used. It provides a clear,

90
00:04:41.680 --> 00:04:46.040
<v Speaker 2>iterative framework with six key phases. It starts with business understanding.

91
00:04:46.399 --> 00:04:49.120
<v Speaker 2>This is all about clearly defining your project objectives, your

92
00:04:49.120 --> 00:04:52.600
<v Speaker 2>success criteria, potential risks. It sounds obvious, but honestly, this

93
00:04:52.639 --> 00:04:54.720
<v Speaker 2>is where many projects can go wrong if the problem

94
00:04:54.720 --> 00:04:55.639
<v Speaker 2>isn't nailed down precisely.

95
00:04:55.480 --> 00:04:58.079
<v Speaker 1>Right, knowing what you're actually trying to achieve.

96
00:04:58.439 --> 00:05:04.399
<v Speaker 2>Then data understanding. This involves collecting, describing, exploring, checking the

97
00:05:04.480 --> 00:05:07.680
<v Speaker 2>quality of your raw data. Data scientists need to be

98
00:05:07.839 --> 00:05:11.279
<v Speaker 2>well, super skeptical here, look for every nuance. Then comes

99
00:05:11.360 --> 00:05:15.480
<v Speaker 2>data preparation, and this is often the most time-consuming phase, really.

100
00:05:15.519 --> 00:05:20.519
<v Speaker 2>It involves selecting, cleaning, transforming, formatting the data for your chosen algorithm.

101
00:05:20.279 --> 00:05:21.959
<v Speaker 1>Okay, getting the data ready.

102
00:05:22.160 --> 00:05:25.639
<v Speaker 2>Following that is modeling. Here you select the appropriate algorithm,

103
00:05:25.959 --> 00:05:28.839
<v Speaker 2>design your testing approach, and train the model. You need

104
00:05:28.839 --> 00:05:32.120
<v Speaker 2>to distinguish between parameters, which are learned from the data itself,

105
00:05:32.360 --> 00:05:34.920
<v Speaker 2>and hyperparameters, which are like knobs you turn to

106
00:05:34.959 --> 00:05:38.839
<v Speaker 2>control the learning process. The fifth phase is evaluation. You

107
00:05:38.920 --> 00:05:42.920
<v Speaker 2>review the model's performance against those initial business success criteria you defined.

108
00:05:42.959 --> 00:05:44.720
<v Speaker 1>And if it's not good enough?

109
00:05:45.040 --> 00:05:48.399
<v Speaker 2>That's the key. ML is iterative. It's a scientific process.

110
00:05:48.600 --> 00:05:51.000
<v Speaker 2>If your model isn't cutting it, you loop back, maybe

111
00:05:51.000 --> 00:05:53.439
<v Speaker 2>you tune those hyperparameters, maybe you need more data,

112
00:05:53.560 --> 00:05:58.399
<v Speaker 2>maybe you even need to rethink the business problem itself. Finally, deployment,

113
00:05:59.000 --> 00:06:02.519
<v Speaker 2>getting your model into production. This involves creating pipelines for

114
00:06:02.560 --> 00:06:06.000
<v Speaker 2>continuous training and inference and setting up monitoring to catch

115
00:06:06.560 --> 00:06:09.920
<v Speaker 2>model drift. Model drift, yeah, that's when a model's performance

116
00:06:09.959 --> 00:06:13.040
<v Speaker 2>degrades over time because the real world data or patterns change,

117
00:06:13.319 --> 00:06:16.920
<v Speaker 2>So you need to monitor and potentially retrain. And you know,

118
00:06:16.959 --> 00:06:20.000
<v Speaker 2>if we connect this back to the AWS certification, the

119
00:06:20.079 --> 00:06:24.560
<v Speaker 2>four domains covered in the exam, data engineering, exploratory data analysis, modeling,

120
00:06:24.680 --> 00:06:27.879
<v Speaker 2>and MLOps, right, they really map quite directly to

121
00:06:27.920 --> 00:06:30.279
<v Speaker 2>these CRISP-DM stages. It's the complete life cycle.

122
00:06:30.360 --> 00:06:32.279
<v Speaker 1>Okay, that framework makes a lot of sense. Now, you

123
00:06:32.319 --> 00:06:35.000
<v Speaker 1>mentioned data preparation is often the most time consuming part.

124
00:06:35.079 --> 00:06:37.839
<v Speaker 1>The guide really stresses this too. It's the absolute foundation

125
00:06:38.000 --> 00:06:41.279
<v Speaker 1>for any good model. Get the data wrong, and, well,

126
00:06:41.319 --> 00:06:41.959
<v Speaker 1>nothing else matters much.

127
00:06:41.800 --> 00:06:45.000
<v Speaker 2>Absolutely. Garbage in, garbage out, as they say.

128
00:06:45.560 --> 00:06:48.600
<v Speaker 2>A critical first step is understanding your feature types, the

129
00:06:48.680 --> 00:06:51.040
<v Speaker 2>kind of data you have. So you've got numerical data.

130
00:06:51.360 --> 00:06:54.800
<v Speaker 2>This could be discrete like countable items, number of clicks maybe,

131
00:06:55.120 --> 00:06:59.199
<v Speaker 2>or continuous measurements with potentially infinite values like temperature or price.

132
00:06:59.160 --> 00:07:01.079
<v Speaker 1>Numbers, discrete or continuous. Got it.

133
00:07:01.160 --> 00:07:04.319
<v Speaker 2>Then you have categorical data. This describes qualities

134
00:07:04.399 --> 00:07:07.920
<v Speaker 2>or labels. It can be nominal labels without any inherent order,

135
00:07:08.480 --> 00:07:11.720
<v Speaker 2>like colors or types of products, or ordinal labels that

136
00:07:11.839 --> 00:07:15.079
<v Speaker 2>do have a meaningful order like low, medium, high, or

137
00:07:15.240 --> 00:07:16.040
<v Speaker 2>education levels.

138
00:07:16.079 --> 00:07:18.439
<v Speaker 1>Okay, categories with or without an order.

139
00:07:18.360 --> 00:07:22.600
<v Speaker 2>Right, and categorical data, especially nominal, usually can't be fed

140
00:07:22.600 --> 00:07:26.680
<v Speaker 2>directly into most algorithms. It needs transforming into numbers. For example,

141
00:07:26.759 --> 00:07:29.560
<v Speaker 2>for that nominal data without order, like countries, we often

142
00:07:29.680 --> 00:07:32.360
<v Speaker 2>use one-hot encoding. This creates a new binary column,

143
00:07:32.399 --> 00:07:35.720
<v Speaker 2>a zero or one, for each category. It avoids accidentally

144
00:07:35.759 --> 00:07:39.040
<v Speaker 2>implying that, say, country three is somehow greater than country two.

145
00:07:39.279 --> 00:07:42.680
<v Speaker 1>Ah, avoids creating a false order. Exactly, whereas for ordinal

146
00:07:42.759 --> 00:07:47.439
<v Speaker 1>data like those education levels, ordinal encoding preserves that inherent sequence.

147
00:07:48.839 --> 00:07:51.639
<v Speaker 2>Now the crucial rule here, and this trips people up

148
00:07:51.720 --> 00:07:55.000
<v Speaker 2>sometimes, is that any encoder you create must be fitted

149
00:07:55.120 --> 00:07:58.279
<v Speaker 2>only on your training data. Then you use that same

150
00:07:58.360 --> 00:08:01.279
<v Speaker 2>fitted encoder to transform your test data and any new

151
00:08:01.279 --> 00:08:04.959
<v Speaker 2>production data. You never refit on test data; that introduces bias.

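NOTE
Editor's sketch of the "fit on train, transform on test" rule discussed above. This is a minimal illustration in plain Python, not code from the guide. SimpleOneHotEncoder and the sample country codes are hypothetical names chosen for the example.
```python
class SimpleOneHotEncoder:
    def fit(self, values):
        # Learn the category vocabulary from the training values only.
        self.categories = sorted(set(values))
        return self
    def transform(self, values):
        # One binary column per known category; unseen categories give an all-zero row.
        return [[1 if v == c else 0 for c in self.categories] for v in values]
train_countries = ["BR", "US", "IN", "US"]  # hypothetical training column
test_countries = ["IN", "FR"]  # "FR" never appeared in the training data
encoder = SimpleOneHotEncoder().fit(train_countries)  # fit once, on training data
train_encoded = encoder.transform(train_countries)
test_encoded = encoder.transform(test_countries)  # reuse the fitted encoder, never refit
```
Because the vocabulary is frozen at fit time, the unseen test category becomes an all-zero row instead of silently adding a new column.
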
152
00:08:05.040 --> 00:08:06.959
<v Speaker 1>Okay, fit on train, transform on test. Got it.

153
00:08:07.000 --> 00:08:09.439
<v Speaker 2>Now, for numerical features, you often need to

154
00:08:09.439 --> 00:08:13.199
<v Speaker 2>adjust their scale. Data normalization, for instance, might scale data

155
00:08:13.199 --> 00:08:16.279
<v Speaker 2>to a range between zero and one. This is really vital

156
00:08:16.319 --> 00:08:19.120
<v Speaker 2>for algorithms that are sensitive to the magnitude of numbers,

157
00:08:19.160 --> 00:08:21.480
<v Speaker 2>like neural networks or k-nearest neighbors.

158
00:08:21.040 --> 00:08:23.879
<v Speaker 1>So they don't overweight big numbers.

159
00:08:24.199 --> 00:08:28.399
<v Speaker 2>Precisely. Alternatively, data standardization transforms data to have a mean of

160
00:08:28.480 --> 00:08:31.720
<v Speaker 2>zero and a standard deviation of one. This is fantastic

161
00:08:31.759 --> 00:08:35.039
<v Speaker 2>for identifying outliers, for example. And for features that are

162
00:08:35.039 --> 00:08:38.480
<v Speaker 2>skewed, think income distributions, often bunched up at one end,

163
00:08:38.960 --> 00:08:42.559
<v Speaker 2>logarithmic and power transformations like the Box-Cox method can

164
00:08:42.559 --> 00:08:45.559
<v Speaker 2>make them more symmetrical, more like a bell curve, and

165
00:08:45.600 --> 00:08:49.360
<v Speaker 2>that often significantly improves the performance of many algorithms like

166
00:08:49.480 --> 00:08:50.440
<v Speaker 2>linear regression.

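NOTE
Editor's sketch of the three numeric transformations mentioned above: normalization to the zero-to-one range, standardization to mean zero and standard deviation one, and a log transform for a skewed feature. The function names and the incomes list are assumptions made up for illustration. In practice the min, max, mean and standard deviation would be computed on training data only.
```python
import math
import statistics
def min_max_scale(values, lo, hi):
    # Normalization: map values into [0, 1] using a previously learned min and max.
    return [(v - lo) / (hi - lo) for v in values]
def standardize(values, mean, stdev):
    # Standardization: subtract the mean, then divide by the standard deviation.
    return [(v - mean) / stdev for v in values]
incomes = [20.0, 30.0, 40.0, 50.0, 260.0]  # hypothetical right-skewed feature
normalized = min_max_scale(incomes, min(incomes), max(incomes))
standardized = standardize(incomes, statistics.mean(incomes), statistics.pstdev(incomes))
log_incomes = [math.log(v) for v in incomes]  # log compresses the long right tail
```
The log transform is the simplest member of the power-transform family; Box-Cox, named in the episode, additionally tunes an exponent and is not shown here.
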
167
00:08:50.639 --> 00:08:53.200
<v Speaker 1>Wow, lots of ways to wrangle the data. What about

168
00:08:53.240 --> 00:08:55.440
<v Speaker 1>problems like missing values?

169
00:08:55.519 --> 00:08:57.440
<v Speaker 2>Yeah, that's a common one. First, you have to try

170
00:08:57.480 --> 00:09:00.600
<v Speaker 2>and understand why they're missing. Is it random, or

171
00:09:00.679 --> 00:09:03.960
<v Speaker 2>is there a pattern? Options range from listwise deletion,

172
00:09:04.080 --> 00:09:07.240
<v Speaker 2>discarding rows or columns with missing data, but be careful,

173
00:09:07.279 --> 00:09:10.720
<v Speaker 2>you might lose valuable information, to imputation, where you replace

174
00:09:10.799 --> 00:09:14.200
<v Speaker 2>missing values. Simple imputation might use the mean or the median,

175
00:09:14.200 --> 00:09:16.840
<v Speaker 2>which is less sensitive to outliers, or the mode for

176
00:09:16.919 --> 00:09:20.519
<v Speaker 2>categorical data, but you can get more sophisticated even using

177
00:09:20.519 --> 00:09:22.960
<v Speaker 2>other ML models to predict what the missing values should be.

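NOTE
Editor's sketch of simple imputation as described above: median or mean for numeric columns, mode for categorical ones. fit_imputer, impute and the sample columns are hypothetical names for illustration, with None standing in for a missing value.
```python
import statistics
def fit_imputer(train_values, strategy="median"):
    # Compute the fill value from the observed (non-missing) training values only.
    observed = [v for v in train_values if v is not None]
    if strategy == "mean":
        return statistics.mean(observed)
    if strategy == "median":
        return statistics.median(observed)
    if strategy == "mode":
        return statistics.mode(observed)
    raise ValueError(f"unknown strategy: {strategy}")
def impute(values, fill_value):
    # Replace each missing entry with the fitted fill value.
    return [fill_value if v is None else v for v in values]
ages = [25, None, 31, 40, None, 95]  # hypothetical numeric column with gaps
filled_ages = impute(ages, fit_imputer(ages, "median"))
colors = ["red", None, "blue", "red"]  # hypothetical categorical column
filled_colors = impute(colors, fit_imputer(colors, "mode"))
```
The median is used for the numeric column because the value 95 would pull a mean upward, which matches the point about the median being less sensitive to outliers.
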
178
00:09:23.120 --> 00:09:26.600
<v Speaker 1>Okay, and outliers, those weird data points?

179
00:09:26.399 --> 00:09:30.759
<v Speaker 2>So another common hurdle. Outliers are data points significantly different

180
00:09:30.799 --> 00:09:34.639
<v Speaker 2>from the rest. They can dramatically skew your model's understanding,

181
00:09:35.080 --> 00:09:38.480
<v Speaker 2>like pulling a regression line way off course. Tools like

182
00:09:38.600 --> 00:09:42.000
<v Speaker 2>z scores or visualizing with box plots help detect them.

183
00:09:42.519 --> 00:09:45.120
<v Speaker 2>Once found, you might remove them or maybe just flag

184
00:09:45.159 --> 00:09:46.759
<v Speaker 2>them so your model knows they're unusual.

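NOTE
Editor's sketch of z-score outlier detection as mentioned above. zscore_outliers, the threshold of 2.5 and the amounts list are assumptions chosen for illustration; the sketch also assumes the values are not all identical, since that would make the standard deviation zero.
```python
import statistics
def zscore_outliers(values, threshold=3.0):
    # Return (index, z-score) pairs whose absolute z-score exceeds the threshold.
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    flagged = []
    for i, v in enumerate(values):
        z = (v - mean) / stdev
        if abs(z) > threshold:
            flagged.append((i, z))
    return flagged
amounts = [12.0, 15.0, 11.0, 14.0, 13.0, 12.0, 16.0, 13.0, 14.0, 500.0]  # hypothetical
outliers = zscore_outliers(amounts, threshold=2.5)
```
A lower threshold is used here because in a ten-point sample a single extreme value cannot reach a z-score above three when the population standard deviation is used.
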
185
00:09:46.919 --> 00:09:50.039
<v Speaker 1>Makes sense. And what if the data is, like, really unbalanced?

186
00:09:50.039 --> 00:09:51.639
<v Speaker 1>You mentioned fraud detection earlier.

187
00:09:51.440 --> 00:09:54.840
<v Speaker 2>Right, unbalanced data sets, very common. Say only one percent

188
00:09:54.879 --> 00:09:57.960
<v Speaker 2>of your transactions are actually fraudulent. Your model might just

189
00:09:58.039 --> 00:10:01.159
<v Speaker 2>learn to always predict not fraud, because that's accurate ninety

190
00:10:01.200 --> 00:10:04.039
<v Speaker 2>nine percent of the time, but it misses the important cases.

191
00:10:04.559 --> 00:10:07.080
<v Speaker 2>So to address this, you can tune your algorithm, maybe

192
00:10:07.120 --> 00:10:09.399
<v Speaker 2>tell it to pay more attention to the rare class using

193
00:10:09.480 --> 00:10:12.720
<v Speaker 2>something like a class weight hyperparameter. Or you can resample

194
00:10:12.759 --> 00:10:16.279
<v Speaker 2>your data. Either undersample the majority class just use fewer

195
00:10:16.320 --> 00:10:20.120
<v Speaker 2>examples of not fraud, or oversample the minority class. A

196
00:10:20.159 --> 00:10:23.960
<v Speaker 2>popular technique for oversampling is SMOTE, the synthetic minority

197
00:10:24.000 --> 00:10:28.159
<v Speaker 2>oversampling technique. It intelligently creates new synthetic examples of

198
00:10:28.200 --> 00:10:29.759
<v Speaker 2>the rare class to help balance things out.

199
00:10:29.639 --> 00:10:34.759
<v Speaker 1>SMOTE, okay, creating fake but plausible examples, kind of?

200
00:10:34.519 --> 00:10:38.080
<v Speaker 2>Yeah, based on the characteristics of the existing minority examples.

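NOTE
Editor's sketch of two of the imbalance remedies discussed above: per-class weights and oversampling of the minority class. Note that this uses plain random oversampling (duplicating rows), not SMOTE, which instead interpolates new points between neighboring minority examples. The function names, labels and rows are hypothetical, and the oversampler assumes a two-class problem.
```python
import random
from collections import Counter
def balanced_class_weights(labels):
    # Weight each class by n_samples / (n_classes * class_count), so rare classes weigh more.
    counts = Counter(labels)
    return {cls: len(labels) / (len(counts) * n) for cls, n in counts.items()}
def random_oversample(rows, labels, minority, seed=0):
    # Duplicate randomly chosen minority rows until both classes have the same count.
    rng = random.Random(seed)
    minority_rows = [r for r, y in zip(rows, labels) if y == minority]
    deficit = len(labels) - 2 * len(minority_rows)
    extra = [rng.choice(minority_rows) for _ in range(deficit)]
    return rows + extra, labels + [minority] * len(extra)
labels = ["ok"] * 99 + ["fraud"]  # hypothetical one percent fraud rate
rows = [[float(i)] for i in range(100)]  # hypothetical single-feature rows
weights = balanced_class_weights(labels)
balanced_rows, balanced_labels = random_oversample(rows, labels, "fraud")
```
With one fraud case in a hundred, the fraud class gets a weight of 50 while the majority class gets roughly 0.5, which is the sense in which the algorithm is told to pay more attention to the rare class.
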
201
00:10:38.879 --> 00:10:42.799
<v Speaker 2>And finally, preparing text data for ML, or natural language

202
00:10:42.799 --> 00:10:47.120
<v Speaker 2>processing, NLP. This has evolved a lot. Older methods like

203
00:10:47.200 --> 00:10:51.679
<v Speaker 2>bag of words, BoW, just count how often words appear. Simple,

204
00:10:51.720 --> 00:10:56.159
<v Speaker 2>but it loses context. More advanced techniques like word embeddings, used

205
00:10:56.159 --> 00:10:59.480
<v Speaker 2>in models like Word2Vec or GloVe, represent words

206
00:10:59.519 --> 00:11:03.320
<v Speaker 2>as dense numerical vectors. What's fascinating here is these vectors

207
00:11:03.360 --> 00:11:06.960
<v Speaker 2>capture semantic meaning. Words with similar meanings end up closer

208
00:11:07.000 --> 00:11:09.240
<v Speaker 2>together in this multi-dimensional space.

209
00:11:09.159 --> 00:11:12.000
<v Speaker 1>So the model understands relationships between words?

210
00:11:11.919 --> 00:11:14.759
<v Speaker 2>In a mathematical sense, yes. It captures context and meaning much

211
00:11:14.799 --> 00:11:15.799
<v Speaker 2>better than just counting.

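NOTE
Editor's sketch of why bag of words "loses context", as described above. bag_of_words and the two sample sentences are hypothetical; the tokenizer is a bare lowercase-and-split, which is an assumption and far simpler than a real NLP pipeline.
```python
from collections import Counter
def bag_of_words(documents):
    # Build a sorted vocabulary, then count each vocabulary word per document.
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors
docs = ["the dog bit the man", "the man bit the dog"]  # hypothetical documents
vocabulary, vectors = bag_of_words(docs)
```
Both sentences produce the same count vector even though they mean opposite things, which is the gap that word embeddings are meant to narrow.
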
212
00:11:16.080 --> 00:11:18.559
<v Speaker 1>That's a really thorough look at data prep. It's clear

213
00:11:18.600 --> 00:11:22.600
<v Speaker 1>it's critical and, well, often complex. But all this meticulously

214
00:11:22.679 --> 00:11:25.559
<v Speaker 1>prepared information needs a robust place to live. You need

215
00:11:25.600 --> 00:11:28.480
<v Speaker 1>to store it somewhere, and on AWS, that journey often

216
00:11:28.480 --> 00:11:31.399
<v Speaker 1>begins with S3, right? Our digital warehouse. Where do

217
00:11:31.440 --> 00:11:32.879
<v Speaker 1>we store all this data for ML?

218
00:11:33.000 --> 00:11:36.480
<v Speaker 2>You're absolutely right. The storage choice is fundamental. Amazon S3,

219
00:11:36.559 --> 00:11:40.519
<v Speaker 2>Simple Storage Service, is very often the starting point

220
00:11:40.799 --> 00:11:45.200
<v Speaker 2>and the core. It's object storage, known for its incredible durability,

221
00:11:45.720 --> 00:11:49.200
<v Speaker 2>designed for eleven nines of durability, which is just astronomical protection

222
00:11:49.279 --> 00:11:53.200
<v Speaker 2>against data loss. It's highly scalable. You store objects, your

223
00:11:53.200 --> 00:11:56.200
<v Speaker 2>files basically, within these things called buckets, which are specific

224
00:11:56.200 --> 00:12:00.159
<v Speaker 2>to an AWS region, and S3 offers different storage classes.

225
00:12:00.480 --> 00:12:03.159
<v Speaker 2>This lets you optimize costs based on how frequently you

226
00:12:03.200 --> 00:12:06.240
<v Speaker 2>need to access the data. Data you access rarely can

227
00:12:06.279 --> 00:12:09.919
<v Speaker 2>go into cheaper, colder storage. Plus, it has robust access

228
00:12:09.960 --> 00:12:12.480
<v Speaker 2>control and encryption options to keep everything secure.

229
00:12:12.720 --> 00:12:15.639
<v Speaker 1>Okay, S3 for scalable, durable object storage. What about more

230
00:12:15.679 --> 00:12:17.840
<v Speaker 1>structured data like traditional databases?

231
00:12:18.000 --> 00:12:21.799
<v Speaker 2>For that, Amazon Relational Database Service, RDS, is the managed service.

232
00:12:21.840 --> 00:12:25.120
<v Speaker 2>It supports popular engines like MySQL, PostgreSQL, Oracle, etc.

233
00:12:25.879 --> 00:12:29.480
<v Speaker 2>A key feature for reliability is Multi-AZ deployments. This

234
00:12:29.559 --> 00:12:32.360
<v Speaker 2>automatically creates a synchronous standby copy of your database in

235
00:12:32.399 --> 00:12:35.440
<v Speaker 2>a different availability zone, so if one AZ has an issue,

236
00:12:35.480 --> 00:12:36.759
<v Speaker 2>it fails over automatically.

237
00:12:36.919 --> 00:12:39.440
<v Speaker 1>Great for high availability, so it keeps running even if

238
00:12:39.440 --> 00:12:41.519
<v Speaker 1>there's an outage in one place.

239
00:12:41.559 --> 00:12:44.279
<v Speaker 2>Exactly. And for scaling read performance, especially for applications that do

240
00:12:44.320 --> 00:12:47.720
<v Speaker 2>a lot of reading, you can use read replicas. These

241
00:12:47.720 --> 00:12:50.960
<v Speaker 2>are asynchronously replicated copies of your main database. You can

242
00:12:50.960 --> 00:12:53.240
<v Speaker 2>point your read heavy traffic to them. You can even

243
00:12:53.240 --> 00:12:56.480
<v Speaker 2>place them in different regions for global reach. This directly

244
00:12:56.519 --> 00:12:59.759
<v Speaker 2>impacts your RPO, recovery point objective, how much data you

245
00:12:59.799 --> 00:13:03.559
<v Speaker 2>might lose, and RTO, recovery time objective, how fast you recover.

246
00:13:04.440 --> 00:13:07.679
<v Speaker 2>Multi-AZ and read replicas help you achieve low RPO

247
00:13:07.759 --> 00:13:08.679
<v Speaker 2>and RTO.

248
00:13:08.440 --> 00:13:10.840
<v Speaker 1>Makes sense, availability and read scaling.

249
00:13:10.879 --> 00:13:14.519
<v Speaker 2>Hmm, and beyond S3 and RDS, AWS has specialized

250
00:13:14.519 --> 00:13:18.159
<v Speaker 2>stores too. Amazon Redshift is a data warehouse optimized for

251
00:13:18.200 --> 00:13:22.240
<v Speaker 2>analyzing massive data sets using SQL, and Amazon DynamoDB is

252
00:13:22.279 --> 00:13:25.519
<v Speaker 2>a fully managed NoSQL database for key-value and

253
00:13:25.600 --> 00:13:29.039
<v Speaker 2>document data where you need super fast, flexible access at

254
00:13:29.039 --> 00:13:29.919
<v Speaker 2>really any scale.

255
00:13:30.120 --> 00:13:32.200
<v Speaker 1>Okay, so a whole range of options. The key takeaway

256
00:13:32.240 --> 00:13:34.279
<v Speaker 1>here seems to be it's not just about storing data,

257
00:13:34.279 --> 00:13:36.440
<v Speaker 1>it's about choosing the right storage for the right kind

258
00:13:36.480 --> 00:13:40.559
<v Speaker 1>of data, getting that optimal balance of availability, performance, security,

259
00:13:40.919 --> 00:13:44.240
<v Speaker 1>and cost for your specific ML use case.

260
00:13:44.080 --> 00:13:46.840
<v Speaker 2>Precisely, matching the tool to the job.

261
00:13:47.279 --> 00:13:50.120
<v Speaker 1>So once our data is carefully stored and prepped, we

262
00:13:50.200 --> 00:13:52.759
<v Speaker 1>often need to process it further, maybe transform it in

263
00:13:52.799 --> 00:13:55.519
<v Speaker 1>bulk or analyze streams of it. The guide walks us

264
00:13:55.559 --> 00:13:58.840
<v Speaker 1>through AWS services for both batch processing and real

265
00:13:58.879 --> 00:13:59.440
<v Speaker 1>time stuff.

266
00:13:59.559 --> 00:14:04.399
<v Speaker 2>Yeah, for large-scale data transformation and movement, like ETL, extract, transform,

267
00:14:04.519 --> 00:14:09.159
<v Speaker 2>load, AWS Glue is a really powerful, fully managed service.

268
00:14:09.200 --> 00:14:11.759
<v Speaker 2>Its secret sauce is the Data Catalog. You can

269
00:14:11.799 --> 00:14:15.759
<v Speaker 2>automatically crawl your data sources, figure out the schema, detect changes,

270
00:14:15.799 --> 00:14:19.240
<v Speaker 2>and make it all queryable. Then Glue's ETL jobs, which

271
00:14:19.320 --> 00:14:22.000
<v Speaker 2>usually run on Apache Spark, do the heavy lifting

272
00:14:22.039 --> 00:14:25.480
<v Speaker 2>of the actual data transformation, maybe copying and cleaning data

273
00:14:25.480 --> 00:14:27.360
<v Speaker 2>from S3 into Redshift, for example.

274
00:14:27.480 --> 00:14:30.039
<v Speaker 1>So Glue handles the whole ETL pipeline?

275
00:14:29.600 --> 00:14:32.240
<v Speaker 2>Pretty much, in a serverless way. Now, if you just

276
00:14:32.279 --> 00:14:34.279
<v Speaker 2>want to query data that's already sitting in S3

277
00:14:34.600 --> 00:14:37.879
<v Speaker 2>without moving or transforming it first, Amazon Athena is amazing

278
00:14:37.919 --> 00:14:41.240
<v Speaker 2>for this. It's serverless and interactive. You use standard SQL to query

279
00:14:41.279 --> 00:14:46.080
<v Speaker 2>data directly in S3 across various formats: CSV, JSON, Parquet, ORC.

280
00:14:46.600 --> 00:14:49.200
<v Speaker 2>No infrastructure to manage. It's incredibly fast for ad hoc

281
00:14:49.240 --> 00:14:50.440
<v Speaker 2>analysis or quick exploration.

282
00:14:50.240 --> 00:14:53.720
<v Speaker 1>Schema-on-read, right? You define the structure as you query it.

283
00:14:53.600 --> 00:14:57.639
<v Speaker 2>Exactly. Now, for processing real-time streaming data,

284
00:14:57.960 --> 00:15:01.639
<v Speaker 2>we turn to Amazon Kinesis. Kinesis Data Streams can capture

285
00:15:01.679 --> 00:15:04.639
<v Speaker 2>and store huge amounts of data per second from loads

286
00:15:04.639 --> 00:15:09.200
<v Speaker 2>of sources: website clicks, IoT sensors, financial transactions. You can

287
00:15:09.200 --> 00:15:11.919
<v Speaker 2>then build applications to process this stream in real time.

288
00:15:12.320 --> 00:15:14.960
<v Speaker 2>Then there's Kinesis Data Firehose. This is a fully

289
00:15:14.960 --> 00:15:17.759
<v Speaker 2>managed service that takes that streaming data and automatically loads

290
00:15:17.799 --> 00:15:21.279
<v Speaker 2>it into destinations like S3, Redshift, or analytics services.

291
00:15:21.519 --> 00:15:23.679
<v Speaker 2>It can even transform the data on the fly using

292
00:15:23.720 --> 00:15:25.799
<v Speaker 2>AWS Lambda before delivering it.

293
00:15:25.879 --> 00:15:28.399
<v Speaker 1>So Firehose is more about getting the stream into

294
00:15:28.399 --> 00:15:30.159
<v Speaker 1>storage or other services easily.

295
00:15:30.480 --> 00:15:33.639
<v Speaker 2>Yeah, simplifies the delivery part. And what about getting data

296
00:15:33.639 --> 00:15:38.039
<v Speaker 2>from your own data centers into AWS. AWS Storage Gateway

297
00:15:38.039 --> 00:15:41.360
<v Speaker 2>connects your on-premises software appliances to cloud storage using

298
00:15:41.399 --> 00:15:45.360
<v Speaker 2>standard file or block protocols. For really massive data transfers

299
00:15:45.440 --> 00:15:48.240
<v Speaker 2>where the Internet is too slow, you have the AWS

300
00:15:48.279 --> 00:15:51.639
<v Speaker 2>Snow Family. These are physical devices like Snowball Edge, which

301
00:15:51.679 --> 00:15:54.679
<v Speaker 2>is like a ruggedized suitcase computer, or even Snowmobile, a

302
00:15:54.720 --> 00:15:57.799
<v Speaker 2>whole shipping container. You load data onto them locally, ship

303
00:15:57.840 --> 00:16:00.679
<v Speaker 2>them to AWS and they upload it securely, much faster

304
00:16:00.759 --> 00:16:02.039
<v Speaker 2>for petabytes.

305
00:16:01.840 --> 00:16:03.879
<v Speaker 1>A truck full of data, literally.

306
00:16:04.159 --> 00:16:07.519
<v Speaker 2>Pretty much. And AWS DataSync is great for ongoing online data

307
00:16:07.519 --> 00:16:11.360
<v Speaker 2>transfer between your on-premises storage and AWS services like

308
00:16:11.440 --> 00:16:15.360
<v Speaker 2>S3 or EFS. Finally, for those really big computation

309
00:16:15.480 --> 00:16:17.919
<v Speaker 2>heavy batch jobs, things that might take hours or days

310
00:16:18.240 --> 00:16:21.799
<v Speaker 2>or need massive resources beyond what Lambda offers, AWS Batch

311
00:16:21.840 --> 00:16:24.240
<v Speaker 2>lets you schedule and run these efficiently. It manages the

312
00:16:24.320 --> 00:16:28.480
<v Speaker 2>job queues, provisions the right compute resources like EC2 instances,

313
00:16:28.720 --> 00:16:29.919
<v Speaker 2>and scales automatically.

314
00:16:30.120 --> 00:16:33.480
<v Speaker 1>Okay. This really covers the spectrum, from analyzing static data

315
00:16:33.519 --> 00:16:37.080
<v Speaker 1>with Athena and Glue to handling real-time streams with Kinesis,

316
00:16:37.399 --> 00:16:41.159
<v Speaker 1>and even moving massive data sets physically. AWS seems to

317
00:16:41.159 --> 00:16:43.559
<v Speaker 1>have a tool for almost every data processing need.

318
00:16:43.720 --> 00:16:45.639
<v Speaker 2>It's a very comprehensive set of services.

319
00:16:45.759 --> 00:16:49.120
<v Speaker 1>Now, before we dive headfirst into coding raw algorithms, the

320
00:16:49.200 --> 00:16:51.879
<v Speaker 1>guide makes a point of highlighting AWS's out of the

321
00:16:51.919 --> 00:16:55.879
<v Speaker 1>box AI services. These seem designed to make advanced ML

322
00:16:55.960 --> 00:16:58.679
<v Speaker 1>accessible even if you're not a deep learning expert, right?

323
00:16:58.919 --> 00:17:00.960
<v Speaker 1>No model building required.

324
00:17:01.080 --> 00:17:04.599
<v Speaker 2>Exactly. These are pre-trained managed services. You use them via

325
00:17:04.680 --> 00:17:08.799
<v Speaker 2>simple API calls. They bring sophisticated AI capabilities directly into

326
00:17:08.799 --> 00:17:12.799
<v Speaker 2>your applications with minimal fuss. For example, Amazon Rekognition provides

327
00:17:12.839 --> 00:17:16.759
<v Speaker 2>powerful visual analysis. It can detect objects, people, faces, and text

328
00:17:16.759 --> 00:17:20.240
<v Speaker 2>in images and videos, even sentiment analysis on faces. Amazon

329
00:17:20.240 --> 00:17:24.759
<v Speaker 2>Polly converts text into remarkably lifelike speech, with loads of voices and languages,

330
00:17:24.880 --> 00:17:27.240
<v Speaker 2>great for accessibility or creating voice interfaces.

331
00:17:27.319 --> 00:17:29.799
<v Speaker 1>Polly speaks and Rekognition sees.

332
00:17:29.880 --> 00:17:32.319
<v Speaker 2>Right. And Amazon Transcribe is the opposite of Polly. It converts

333
00:17:32.319 --> 00:17:36.839
<v Speaker 2>speech into text, excellent for transcribing audio, video calls, generating captions.

334
00:17:37.039 --> 00:17:41.160
<v Speaker 2>It supports custom vocabularies too, for better accuracy in specific domains.

335
00:17:41.480 --> 00:17:46.000
<v Speaker 2>Amazon Comprehend digs into unstructured text: think customer reviews, emails,

336
00:17:46.079 --> 00:17:50.319
<v Speaker 2>social media feeds. It pulls out insights like sentiment (positive, negative, neutral),

337
00:17:50.519 --> 00:17:53.200
<v Speaker 2>key phrases, entities, even topics.

338
00:17:52.880 --> 00:17:54.799
<v Speaker 1>So Comprehend understands text.

339
00:17:55.480 --> 00:18:00.839
<v Speaker 2>Amazon Translate provides high quality, real time language translation between languages.

340
00:18:01.119 --> 00:18:04.519
<v Speaker 2>Amazon Textract is really interesting. It goes beyond basic OCR,

341
00:18:04.920 --> 00:18:08.480
<v Speaker 2>optical character recognition. It understands the structure of documents, so

342
00:18:08.519 --> 00:18:11.000
<v Speaker 2>it can extract data not just as raw text, but

343
00:18:11.039 --> 00:18:15.319
<v Speaker 2>specifically from forms and tables, preserving their layout and relationships.

344
00:18:15.359 --> 00:18:17.039
<v Speaker 2>Super useful for document processing.

345
00:18:16.680 --> 00:18:19.880
<v Speaker 1>Wow, understanding forms and tables, not just text.

346
00:18:20.160 --> 00:18:23.720
<v Speaker 2>Yeah, and finally, Amazon Lex. This is the engine that

347
00:18:23.759 --> 00:18:28.880
<v Speaker 2>powers Amazon Alexa. It lets you build sophisticated conversational interfaces, chatbots and

348
00:18:29.160 --> 00:18:34.759
<v Speaker 2>voice bots, using natural language understanding (NLU) and automatic speech recognition (ASR).

349
00:18:35.640 --> 00:18:40.039
<v Speaker 2>You define the user's goals (intents), the information needed (slots),

350
00:18:40.079 --> 00:18:44.440
<v Speaker 2>and sample phrases (utterances), and Lex handles the complex conversation flow.

351
00:18:45.039 --> 00:18:47.599
<v Speaker 1>Okay, that's an incredible menu of ready-to-use AI.

352
00:18:47.839 --> 00:18:50.839
<v Speaker 1>Really lowers the barrier to entry. But it begs the

353
00:18:50.920 --> 00:18:56.039
<v Speaker 1>question for you listening: how do you decide? When should

354
00:18:56.079 --> 00:18:58.960
<v Speaker 1>you use these powerful pre-built tools versus actually diving

355
00:18:59.000 --> 00:19:01.400
<v Speaker 1>in and building a custom ML model from scratch?

356
00:19:01.759 --> 00:19:04.839
<v Speaker 2>That's a really important strategic decision and the answer often

357
00:19:04.880 --> 00:19:08.400
<v Speaker 2>comes down to specificity and control. For common, well defined

358
00:19:08.400 --> 00:19:13.279
<v Speaker 2>tasks like general translation, sentiment analysis, standard object recognition in images,

359
00:19:13.599 --> 00:19:16.440
<v Speaker 2>these managed services are often the fastest, easiest, and most

360
00:19:16.440 --> 00:19:19.880
<v Speaker 2>cost-effective path. They're pre-trained by AWS on massive

361
00:19:19.960 --> 00:19:22.880
<v Speaker 2>data sets, so you benefit from that expertise with minimal

362
00:19:22.880 --> 00:19:26.039
<v Speaker 2>development effort. You don't need deep ML knowledge to integrate

363
00:19:26.079 --> 00:19:27.519
<v Speaker 2>them via APIs.

364
00:19:27.160 --> 00:19:28.680
<v Speaker 1>So use them for the standard stuff.

365
00:19:28.960 --> 00:19:32.839
<v Speaker 2>Generally, yes. However, if your problem is highly specialized, maybe

366
00:19:32.839 --> 00:19:36.240
<v Speaker 2>involves unique data types not covered by the services, or

367
00:19:36.279 --> 00:19:38.759
<v Speaker 2>if you need fine grained control over the model architecture

368
00:19:38.799 --> 00:19:41.759
<v Speaker 2>or the training process or the specific performance trade-offs,

369
00:19:42.240 --> 00:19:45.240
<v Speaker 2>that's when building a custom model, probably using a platform

370
00:19:45.279 --> 00:19:48.720
<v Speaker 2>like Amazon SageMaker, becomes the better choice. It gives

371
00:19:48.759 --> 00:19:52.319
<v Speaker 2>you full flexibility, but requires more ML expertise and effort.

372
00:19:52.480 --> 00:19:55.640
<v Speaker 1>Got it. Use managed services for speed and common tasks,

373
00:19:55.720 --> 00:19:59.359
<v Speaker 1>build custom for unique needs and control. Okay, now let's

374
00:19:59.359 --> 00:20:01.519
<v Speaker 1>go deeper into custom model building, into the heart

375
00:20:01.559 --> 00:20:05.359
<v Speaker 1>of ML, the algorithms themselves. The guide outlines AWS's built

376
00:20:05.359 --> 00:20:08.359
<v Speaker 1>in algorithms available in SageMaker, which are often optimized

377
00:20:08.359 --> 00:20:10.880
<v Speaker 1>for the AWS environment. But first, maybe a quick word

378
00:20:10.920 --> 00:20:13.559
<v Speaker 1>on ensemble models. The guide mentions these are pretty powerful.

379
00:20:13.759 --> 00:20:17.000
<v Speaker 2>Yeah. Ensemble methods are a really important concept. The idea

380
00:20:17.039 --> 00:20:20.319
<v Speaker 2>is to combine multiple individual ML models to get better

381
00:20:20.359 --> 00:20:23.599
<v Speaker 2>predictive performance than any single model could achieve on its own.

382
00:20:24.119 --> 00:20:28.640
<v Speaker 2>Two main types. One is bagging, think bootstrap aggregating, like in

383
00:20:28.680 --> 00:20:32.640
<v Speaker 2>a random forest algorithm. You train many models, usually decision trees,

384
00:20:32.920 --> 00:20:35.960
<v Speaker 2>independently on different random samples of your data, and then

385
00:20:36.000 --> 00:20:39.680
<v Speaker 2>you average their predictions for regression or take a majority

386
00:20:39.759 --> 00:20:42.799
<v Speaker 2>vote for classification. It helps reduce variance.

387
00:20:43.039 --> 00:20:45.400
<v Speaker 1>So wisdom of the crowd applied to models.

388
00:20:45.680 --> 00:20:49.440
<v Speaker 2>Kinda, yeah. The other main type is boosting. Here models

389
00:20:49.440 --> 00:20:52.240
<v Speaker 2>are trained sequentially. Each new model focuses on correcting the

390
00:20:52.319 --> 00:20:54.960
<v Speaker 2>errors made by the previous ones. It builds a strong

391
00:20:55.039 --> 00:20:58.880
<v Speaker 2>predictor iteratively. Algorithms like AdaBoost or the very popular

392
00:20:59.000 --> 00:21:02.920
<v Speaker 2>XGBoost use this approach. Boosting often leads to very high accuracy,

393
00:21:03.160 --> 00:21:04.880
<v Speaker 2>but you need to be careful about overfitting.

394
00:21:05.079 --> 00:21:08.319
<v Speaker 1>Okay, bagging is parallel, boosting is sequential. Makes sense. So

395
00:21:08.400 --> 00:21:10.920
<v Speaker 1>what are some of the key built-in algorithms

396
00:21:10.960 --> 00:21:13.720
<v Speaker 1>SageMaker offers for, say, supervised learning?

397
00:21:13.799 --> 00:21:17.119
<v Speaker 2>Right. For supervised tasks with labeled data, SageMaker has

398
00:21:17.160 --> 00:21:20.680
<v Speaker 2>several optimized algorithms. The Linear Learner algorithm is a good

399
00:21:20.680 --> 00:21:25.400
<v Speaker 2>starting point. It's versatile, handling both regression (predicting numbers) and

400
00:21:25.480 --> 00:21:30.599
<v Speaker 2>classification (predicting categories). It's great for understanding linear relationships and

401
00:21:30.720 --> 00:21:33.680
<v Speaker 2>includes options like L1 and L2 regularization to

402
00:21:33.680 --> 00:21:38.160
<v Speaker 2>prevent overfitting and even perform some automatic feature selection. Then

403
00:21:38.200 --> 00:21:40.880
<v Speaker 2>there's XGBoost. As we mentioned, it's a gradient

404
00:21:40.880 --> 00:21:44.960
<v Speaker 2>boosting algorithm, incredibly popular and often wins data science competitions,

405
00:21:45.240 --> 00:21:48.559
<v Speaker 2>especially with structured tabular data. SageMaker has a highly

406
00:21:48.599 --> 00:21:49.480
<v Speaker 2>optimized version.

407
00:21:49.759 --> 00:21:52.759
<v Speaker 1>XGBoost seems like a go-to for many problems.

408
00:21:52.799 --> 00:21:56.759
<v Speaker 2>It often is. For unsupervised learning, finding patterns in unlabeled data,

409
00:21:56.880 --> 00:21:59.279
<v Speaker 2>K-means is a classic clustering algorithm. You tell it how

410
00:21:59.319 --> 00:22:01.400
<v Speaker 2>many clusters you want to find, and it groups your

411
00:22:01.440 --> 00:22:05.480
<v Speaker 2>data points based on similarity, typically distance. Great for customer

412
00:22:05.519 --> 00:22:10.200
<v Speaker 2>segmentation or finding archetypes. Random Cut Forest (RCF) is specifically

413
00:22:10.240 --> 00:22:13.200
<v Speaker 2>designed for anomaly detection. It builds a collection of random

414
00:22:13.200 --> 00:22:16.319
<v Speaker 2>trees and identifies data points that are easily isolated. These

415
00:22:16.359 --> 00:22:19.160
<v Speaker 2>are likely anomalies, good for fraud or outlier detection.

416
00:22:18.839 --> 00:22:21.720
<v Speaker 1>Finding the odd ones out.

417
00:22:21.440 --> 00:22:25.279
<v Speaker 2>Exactly. And principal component analysis, PCA. This is a fundamental technique

418
00:22:25.279 --> 00:22:28.440
<v Speaker 2>for dimensionality reduction. If you have lots and lots of features,

419
00:22:28.480 --> 00:22:31.480
<v Speaker 2>PCA can transform them into a smaller set of uncorrelated

420
00:22:31.480 --> 00:22:35.440
<v Speaker 2>principal components that capture most of the original information. This

421
00:22:35.519 --> 00:22:40.119
<v Speaker 2>helps simplify models, reduce noise, sometimes improve performance, and even

422
00:22:40.160 --> 00:22:43.319
<v Speaker 2>makes high dimensional data easier to visualize.

423
00:22:42.880 --> 00:22:46.039
<v Speaker 1>Reducing complexity while keeping the important info.

424
00:22:46.000 --> 00:22:49.759
<v Speaker 2>That's the goal. SageMaker also has specialized algorithms like

425
00:22:49.839 --> 00:22:53.799
<v Speaker 2>DeepAR for time series forecasting using sophisticated recurrent neural networks.

426
00:22:54.279 --> 00:22:57.640
<v Speaker 2>For text analysis, there's BlazingText, which is optimized for

427
00:22:57.680 --> 00:23:00.920
<v Speaker 2>both text classification and generating word embeddings like

428
00:23:00.960 --> 00:23:03.920
<v Speaker 2>Word2Vec very quickly on large data sets. And of course

429
00:23:03.920 --> 00:23:07.480
<v Speaker 2>a suite of algorithms for image processing: image classification (what's

430
00:23:07.480 --> 00:23:10.960
<v Speaker 2>the main object?), object detection (find multiple objects and draw boxes),

431
00:23:11.240 --> 00:23:14.519
<v Speaker 2>and semantic segmentation (classify every pixel in the image).

432
00:23:14.599 --> 00:23:17.319
<v Speaker 1>Wow, so a really broad set of tools. What about

433
00:23:17.400 --> 00:23:20.440
<v Speaker 1>data formats? Do these algorithms just take CSV files?

434
00:23:20.920 --> 00:23:25.240
<v Speaker 2>Many can take text CSV, yes. For supervised learning, the

435
00:23:25.319 --> 00:23:27.880
<v Speaker 2>convention is usually the target variable in the first column,

436
00:23:27.880 --> 00:23:30.960
<v Speaker 2>no header row. However, for peak performance and efficiency,

437
00:23:31.200 --> 00:23:34.559
<v Speaker 2>especially with large data sets, many SageMaker built-in

438
00:23:34.599 --> 00:23:39.359
<v Speaker 2>algorithms prefer an optimized binary format called RecordIO-protobuf. This

439
00:23:39.480 --> 00:23:42.440
<v Speaker 2>format allows for something called Pipe mode, where data is

440
00:23:42.480 --> 00:23:45.680
<v Speaker 2>streamed directly from S3 to the training instance without

441
00:23:45.720 --> 00:23:48.480
<v Speaker 2>needing to download it all first. It saves time and

442
00:23:48.559 --> 00:23:50.160
<v Speaker 2>disk space.

443
00:23:50.200 --> 00:23:54.359
<v Speaker 1>Ah, RecordIO-protobuf for speed and streaming. This is a really

444
00:23:54.400 --> 00:23:58.400
<v Speaker 1>comprehensive toolkit. It's clear AWS provides these highly optimized tools

445
00:23:58.400 --> 00:24:01.799
<v Speaker 1>for almost any ML task. It lets you, the user,

446
00:24:02.119 --> 00:24:05.440
<v Speaker 1>focus more on framing the problem and interpreting results, rather

447
00:24:05.440 --> 00:24:08.279
<v Speaker 1>than getting totally bogged down in the low level infrastructure

448
00:24:08.359 --> 00:24:09.519
<v Speaker 1>or algorithm implementation.

449
00:24:09.680 --> 00:24:12.119
<v Speaker 2>That's definitely the aim of a managed service like SageMaker.

450
00:24:12.200 --> 00:24:15.920
<v Speaker 1>Okay, so we've built these potentially incredible models using these algorithms,

451
00:24:15.960 --> 00:24:18.440
<v Speaker 1>but how do we actually know if they're any good?

452
00:24:18.440 --> 00:24:20.559
<v Speaker 1>How do we evaluate them? It's not just about hitting

453
00:24:20.599 --> 00:24:21.960
<v Speaker 1>run and hoping for the best, right?

454
00:24:21.759 --> 00:24:26.079
<v Speaker 2>Absolutely not. Evaluation is critical. It's not just about

455
00:24:26.079 --> 00:24:29.599
<v Speaker 2>getting a single accuracy number. It's about understanding how your

456
00:24:29.599 --> 00:24:33.359
<v Speaker 2>model performs, its strengths, its weaknesses, and whether it actually

457
00:24:33.359 --> 00:24:38.279
<v Speaker 2>meets the business need. Evaluation metrics are crucial for documenting performance,

458
00:24:38.599 --> 00:24:41.359
<v Speaker 2>comparing different models or different versions of the same model,

459
00:24:41.720 --> 00:24:45.720
<v Speaker 2>tracking them over time in production, and importantly for detecting

460
00:24:45.720 --> 00:24:48.680
<v Speaker 2>that model drift we talked about earlier. When performance degrades,

461
00:24:48.880 --> 00:24:51.519
<v Speaker 2>metrics tell you it's time to retrain or investigate.

462
00:24:51.799 --> 00:24:53.799
<v Speaker 1>So it's about ongoing quality control too.

463
00:24:54.039 --> 00:24:57.920
<v Speaker 2>Definitely. For classification models, the ones predicting categories like fraud

464
00:24:57.960 --> 00:25:01.599
<v Speaker 2>or not fraud, spam or not spam, the confusion matrix is fundamental.

465
00:25:01.839 --> 00:25:04.839
<v Speaker 2>It's a simple table that breaks down predictions versus actual outcomes.

466
00:25:05.079 --> 00:25:08.680
<v Speaker 2>You get four key numbers. True positives (TP): correctly predicted positive,

467
00:25:08.680 --> 00:25:11.839
<v Speaker 2>said fraud, was fraud. True negatives (TN): correctly predicted negative,

468
00:25:12.039 --> 00:25:15.640
<v Speaker 2>said not fraud, wasn't fraud. False positives (FP): incorrectly predicted positive,

469
00:25:15.640 --> 00:25:17.799
<v Speaker 2>said fraud, but wasn't. This is a type I error,

470
00:25:17.839 --> 00:25:21.119
<v Speaker 2>a false alarm. False negatives (FN): incorrectly predicted negative, said

471
00:25:21.160 --> 00:25:23.400
<v Speaker 2>not fraud, but was fraud. This is a type II

472
00:25:23.519 --> 00:25:24.799
<v Speaker 2>error, a missed detection.

473
00:25:24.559 --> 00:25:28.039
<v Speaker 1>TP, TN, FP, FN. Okay, the four outcomes.

474
00:25:27.640 --> 00:25:30.000
<v Speaker 2>Right, and from this matrix we derive the most common

475
00:25:30.000 --> 00:25:34.279
<v Speaker 2>classification metrics. Accuracy: TP plus TN divided by the total,

476
00:25:34.440 --> 00:25:38.599
<v Speaker 2>overall correctness. But careful: it can be really misleading if

477
00:25:38.640 --> 00:25:41.680
<v Speaker 2>your data set is imbalanced, like that ninety-nine percent

478
00:25:41.839 --> 00:25:46.039
<v Speaker 2>not-fraud example. Recall, or sensitivity: TP over TP plus FN.

479
00:25:46.720 --> 00:25:49.680
<v Speaker 2>This measures how well the model finds all the positive cases.

480
00:25:50.079 --> 00:25:52.519
<v Speaker 2>High recall is crucial when missing a positive is bad,

481
00:25:52.759 --> 00:25:57.000
<v Speaker 2>e.g., missing a disease diagnosis. Precision, or positive predictive value:

482
00:25:57.119 --> 00:26:00.640
<v Speaker 2>TP over TP plus FP. This measures how often the model

483
00:26:00.680 --> 00:26:03.359
<v Speaker 2>is correct when it does predict positive. High precision is

484
00:26:03.440 --> 00:26:06.559
<v Speaker 2>key when false alarms are costly, for example, marking important

485
00:26:06.559 --> 00:26:07.640
<v Speaker 2>emails as spam.

486
00:26:07.839 --> 00:26:10.880
<v Speaker 1>Recall finds them all. Precision avoids false alarms. A trade-off?

487
00:26:11.000 --> 00:26:13.880
<v Speaker 2>Often, yes. There's usually a trade-off between precision and recall.

488
00:26:14.279 --> 00:26:16.359
<v Speaker 2>The F1 score is the harmonic mean of the two,

489
00:26:16.599 --> 00:26:19.680
<v Speaker 2>providing a single score that balances both. Useful when both

490
00:26:19.720 --> 00:26:21.119
<v Speaker 2>precision and recall are important.

491
00:26:21.160 --> 00:26:24.079
<v Speaker 1>Okay, and what about those curves you see, like ROC?

492
00:26:23.960 --> 00:26:26.920
<v Speaker 2>Right. Evaluation curves help visualize that trade-off across different

493
00:26:26.920 --> 00:26:32.000
<v Speaker 2>decision thresholds. The precision-recall (PR) curve plots precision versus recall.

494
00:26:32.519 --> 00:26:35.920
<v Speaker 2>It's particularly useful for imbalanced data sets as it focuses

495
00:26:35.920 --> 00:26:39.200
<v Speaker 2>directly on the performance on the minority class. The ROC

496
00:26:39.319 --> 00:26:43.200
<v Speaker 2>curve (receiver operating characteristic) plots the true positive rate, which

497
00:26:43.240 --> 00:26:46.240
<v Speaker 2>is just recall, against the false positive rate: FP over FP

498
00:26:46.359 --> 00:26:49.279
<v Speaker 2>plus TN. It's commonly used for more balanced data sets.

499
00:26:49.519 --> 00:26:52.119
<v Speaker 2>The area under the curve (AUC) summarizes the curve into

500
00:26:52.119 --> 00:26:52.720
<v Speaker 2>a single number.

501
00:26:52.559 --> 00:26:55.720
<v Speaker 1>PR curve for imbalanced, ROC for balanced. Good tip.

502
00:26:55.920 --> 00:26:58.960
<v Speaker 1>What about regression models, the ones predicting numbers?

503
00:26:58.799 --> 00:27:01.759
<v Speaker 2>Different metrics there, since we're not dealing with classes. Common

504
00:27:01.799 --> 00:27:06.119
<v Speaker 2>ones include MAE, mean absolute error: the average of the

505
00:27:06.160 --> 00:27:10.680
<v Speaker 2>absolute differences between predictions and actual values. Simple, intuitive units.

506
00:27:11.200 --> 00:27:15.000
<v Speaker 2>MSE, mean squared error: the average of the squared differences.

507
00:27:15.440 --> 00:27:18.799
<v Speaker 2>This penalizes larger errors much more heavily than smaller ones.

508
00:27:19.400 --> 00:27:22.559
<v Speaker 2>RMSE, root mean squared error: the square root of MSE.

509
00:27:23.079 --> 00:27:25.119
<v Speaker 2>This brings the metric back into the same units as

510
00:27:25.160 --> 00:27:28.119
<v Speaker 2>your target variable, making it easier to interpret while still

511
00:27:28.119 --> 00:27:32.160
<v Speaker 2>penalizing large errors. RMSE is probably the most common regression metric.

512
00:27:32.720 --> 00:27:36.559
<v Speaker 2>And MAPE, mean absolute percentage error, calculates the error as

513
00:27:36.599 --> 00:27:39.240
<v Speaker 2>an average percentage of the actual values. Very intuitive for

514
00:27:39.279 --> 00:27:40.559
<v Speaker 2>things like sales forecasting.

515
00:27:40.799 --> 00:27:45.480
<v Speaker 1>Okay, MAE, RMSE, MAPE for regression. That's a lot of metrics.

516
00:27:45.519 --> 00:27:47.279
<v Speaker 1>If you're listening and looking at these,

517
00:27:47.319 --> 00:27:50.000
<v Speaker 1>what's maybe one piece of advice you'd give about picking

518
00:27:50.000 --> 00:27:52.119
<v Speaker 1>the right metric for your specific project?

519
00:27:52.359 --> 00:27:55.160
<v Speaker 2>That's a great question. The single most important thing is

520
00:27:55.200 --> 00:27:57.759
<v Speaker 2>to deeply understand your business goal and the cost of

521
00:27:57.799 --> 00:28:01.000
<v Speaker 2>different types of errors. Don't just default to accuracy because

522
00:28:01.039 --> 00:28:05.559
<v Speaker 2>it sounds good. Ask yourself: what's worse, a false positive

523
00:28:05.640 --> 00:28:09.359
<v Speaker 2>or a false negative? In medical diagnosis, missing a disease,

524
00:28:09.519 --> 00:28:13.160
<v Speaker 2>a false negative, could be catastrophic, so you'd optimize for recall.

525
00:28:13.920 --> 00:28:17.160
<v Speaker 2>In filtering spam, marking a crucial email as spam, a

526
00:28:17.240 --> 00:28:21.240
<v Speaker 2>false positive, is highly annoying, so you'd prioritize precision. The

527
00:28:21.319 --> 00:28:24.079
<v Speaker 2>context dictates the metric. Always tie it back to the

528
00:28:24.119 --> 00:28:25.039
<v Speaker 2>real world impact.

529
00:28:25.160 --> 00:28:28.039
<v Speaker 1>Connect the metric to the business impact. Excellent advice.

530
00:28:28.480 --> 00:28:31.359
<v Speaker 2>So once we can measure our models effectively using these metrics,

531
00:28:31.400 --> 00:28:36.240
<v Speaker 2>the next logical step is optimization, specifically hyperparameter tuning. Remember

532
00:28:36.279 --> 00:28:39.599
<v Speaker 2>those knobs that control the learning process? Finding the best

533
00:28:39.640 --> 00:28:42.640
<v Speaker 2>combination of those settings for your specific data is crucial.

534
00:28:43.160 --> 00:28:45.079
<v Speaker 2>The goal isn't just to get a model that performs

535
00:28:45.119 --> 00:28:47.000
<v Speaker 2>well on the data it was trained on. It's to

536
00:28:47.000 --> 00:28:50.720
<v Speaker 2>get a model that generalizes well to new, unseen data.

537
00:28:51.279 --> 00:28:56.359
<v Speaker 2>We want to minimize both bias (oversimplification) and variance (overfitting).

538
00:28:55.920 --> 00:28:58.599
<v Speaker 1>Finding that sweet spot between too simple and too complex.

539
00:28:58.960 --> 00:29:02.960
<v Speaker 2>Exactly. There are several techniques for this hyperparameter search. Grid

540
00:29:03.000 --> 00:29:06.079
<v Speaker 2>search is the most basic. You define a grid of

541
00:29:06.160 --> 00:29:09.440
<v Speaker 2>possible values for each hyperparameter, and it literally tests

542
00:29:09.519 --> 00:29:13.759
<v Speaker 2>every single combination. Very thorough, but can be incredibly slow

543
00:29:13.799 --> 00:29:17.440
<v Speaker 2>and computationally expensive, especially if you have many hyperparameters

544
00:29:17.519 --> 00:29:20.680
<v Speaker 2>or wide ranges. A more efficient approach is random search.

545
00:29:21.279 --> 00:29:24.960
<v Speaker 2>Instead of trying every combination, it randomly samples combinations from

546
00:29:24.960 --> 00:29:28.960
<v Speaker 2>your defined search space. Surprisingly, it often finds very good

547
00:29:29.039 --> 00:29:32.079
<v Speaker 2>or even optimal hyperparameters much faster than grid search.

548
00:29:32.559 --> 00:29:34.880
<v Speaker 1>Randomly trying things can be faster?

549
00:29:34.720 --> 00:29:38.279
<v Speaker 2>Often, yes, because not all hyperparameters are equally important.

550
00:29:38.559 --> 00:29:42.240
<v Speaker 2>Random search explores the space more broadly, quicker. For even

551
00:29:42.279 --> 00:29:46.920
<v Speaker 2>more intelligence, there's Bayesian optimization. This method learns from past evaluations.

552
00:29:47.160 --> 00:29:49.720
<v Speaker 2>It builds a probability model of how hyperparameters relate

553
00:29:49.759 --> 00:29:52.559
<v Speaker 2>to performance, and uses it to intelligently choose the next

554
00:29:52.559 --> 00:29:55.920
<v Speaker 2>set of hyperparameters to try, focusing on promising regions

555
00:29:55.920 --> 00:29:58.880
<v Speaker 2>of the search space. It can converge on optimal settings

556
00:29:59.160 --> 00:30:01.400
<v Speaker 2>much much faster, especially for complex models.

557
00:30:01.480 --> 00:30:02.799
<v Speaker 1>So Bayesian learns as it goes.

558
00:30:02.839 --> 00:30:04.279
<v Speaker 2>Precisely. It's a smarter search.

559
00:30:04.480 --> 00:30:06.960
<v Speaker 1>The core idea here, then, is that tuning isn't just

560
00:30:06.960 --> 00:30:10.680
<v Speaker 1>a one shot deal. It's an empirical process. You combine

561
00:30:10.680 --> 00:30:14.839
<v Speaker 1>these smart search strategies with solid evaluation techniques, often using

562
00:30:14.880 --> 00:30:19.279
<v Speaker 1>things like cross-validation to build robust models, models that

563
00:30:19.319 --> 00:30:22.640
<v Speaker 1>don't just memorize the training data, but actually perform well

564
00:30:22.680 --> 00:30:23.839
<v Speaker 1>out there in the real world.

565
00:30:24.000 --> 00:30:26.119
<v Speaker 2>That's the name of the game: generalization.

566
00:30:26.240 --> 00:30:28.599
<v Speaker 1>Okay, bringing this all together. Now we've talked about the

567
00:30:28.640 --> 00:30:33.119
<v Speaker 1>life cycle, data, algorithms, evaluation, optimization. The guide clearly points

568
00:30:33.160 --> 00:30:36.000
<v Speaker 1>to Amazon SageMaker as the central workbench, the main

569
00:30:36.119 --> 00:30:37.799
<v Speaker 1>hub for doing all this on AWS.

570
00:30:38.079 --> 00:30:41.079
<v Speaker 2>Yes, SageMaker is designed to be that integrated environment

571
00:30:41.119 --> 00:30:44.279
<v Speaker 2>for the entire ML workflow. It's a fully managed service

572
00:30:44.759 --> 00:30:49.119
<v Speaker 2>aiming to simplify each step. It provides notebook instances. These

573
00:30:49.119 --> 00:30:52.480
<v Speaker 2>are basically managed Jupyter notebooks running on EC2 instances.

574
00:30:52.880 --> 00:30:57.319
<v Speaker 2>They're great for data exploration, cleaning, preprocessing, and generally orchestrating

575
00:30:57.319 --> 00:31:00.440
<v Speaker 2>your ML pipeline. For the heavy lifting of training,

576
00:31:00.480 --> 00:31:04.960
<v Speaker 2>SageMaker provides dedicated, optimized training instances. You choose the instance

577
00:31:05.000 --> 00:31:07.680
<v Speaker 2>type based on your needs, submit your training code, and

578
00:31:07.680 --> 00:31:12.079
<v Speaker 2>SageMaker handles provisioning, execution, and tearing down the resources. And

579
00:31:12.160 --> 00:31:15.440
<v Speaker 2>once your model is trained, SageMaker offers endpoint instances

580
00:31:15.480 --> 00:31:18.200
<v Speaker 2>for deploying it and getting real time predictions via a

581
00:31:18.279 --> 00:31:21.720
<v Speaker 2>simple API call. It handles scaling and availability for you.

582
00:31:21.680 --> 00:31:25.799
<v Speaker 1>Notebooks for exploring, training instances for building, endpoints for predicting.

583
00:31:26.039 --> 00:31:28.480
<v Speaker 2>That's a good summary, and SageMaker also has managed

584
00:31:28.519 --> 00:31:31.920
<v Speaker 2>services specifically for hyperparameter tuning jobs. You define

585
00:31:31.920 --> 00:31:34.839
<v Speaker 2>your hyperparameter ranges, the metric you want to optimize,

586
00:31:34.880 --> 00:31:38.559
<v Speaker 2>and SageMaker automatically runs the search using strategies like Bayesian optimization,

587
00:31:38.720 --> 00:31:42.079
<v Speaker 2>random search, or grid search, keeping track of the best-performing job.

588
00:31:41.799 --> 00:31:44.880
<v Speaker 1>Automating that tuning process we just discussed.

589
00:31:44.519 --> 00:31:48.240
<v Speaker 2>Exactly. When you choose instance types for these SageMaker components,

590
00:31:48.640 --> 00:31:52.079
<v Speaker 2>they're all based on EC2 instances, often with ML prefixes.

591
00:31:52.119 --> 00:31:54.880
<v Speaker 2>You need to consider your workload. There are general purpose

592
00:31:55.160 --> 00:31:59.359
<v Speaker 2>(M family), compute optimized (C family), memory optimized (R family),

593
00:31:59.640 --> 00:32:03.720
<v Speaker 2>and, importantly, GPU-enabled instances like the P and G families.

594
00:32:04.519 --> 00:32:08.680
<v Speaker 2>GPUs are essential for accelerating deep learning training, which involves

595
00:32:08.759 --> 00:32:12.559
<v Speaker 2>massive matrix calculations. The choice really depends on your data size,

596
00:32:12.599 --> 00:32:15.480
<v Speaker 2>algorithm complexity, budget, and how fast you need results.

597
00:32:15.400 --> 00:32:18.799
<v Speaker 1>Matching the hardware to the ML task. What about security?

598
00:32:19.039 --> 00:32:21.039
<v Speaker 1>Keeping notebooks and data private?

599
00:32:20.839 --> 00:32:23.720
<v Speaker 2>Security is built in. You can launch SageMaker components

600
00:32:23.920 --> 00:32:27.000
<v Speaker 2>like notebook instances or training jobs within your own private

601
00:32:27.119 --> 00:32:30.880
<v Speaker 2>VPC, your virtual private cloud. This gives you fine-grained control

602
00:32:30.920 --> 00:32:34.400
<v Speaker 2>over network access. You can restrict internet access, connect securely

603
00:32:34.480 --> 00:32:37.319
<v Speaker 2>to your on-premises data sources, use security groups and

604
00:32:37.400 --> 00:32:41.519
<v Speaker 2>network ACLs. SageMaker also supports network isolation for training and

605
00:32:41.559 --> 00:32:45.599
<v Speaker 2>inference containers, preventing them from making unauthorized outbound network calls.

606
00:32:45.720 --> 00:32:47.400
<v Speaker 1>So you can lock it down pretty tightly.
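
NOTE
Editorial sketch of the network settings just described, expressed as the fields a training job request carries.
VpcConfig and EnableNetworkIsolation are the request field names as best understood here.
The helper name and the subnet and security group IDs are hypothetical placeholders, and nothing is sent to AWS.
```python
def secure_training_settings(subnet_ids, security_group_ids, isolate=True):
    """Return only the network-related fields of a training job request (hypothetical helper)."""
    if not subnet_ids or not security_group_ids:
        raise ValueError("at least one subnet and one security group are required")
    return {
        # Run the job's containers inside the caller's own VPC subnets.
        "VpcConfig": {"Subnets": list(subnet_ids), "SecurityGroupIds": list(security_group_ids)},
        # When True, the containers are blocked from making outbound network calls.
        "EnableNetworkIsolation": isolate,
    }
settings = secure_training_settings(["subnet-EXAMPLE"], ["sg-EXAMPLE"])
print(settings)
```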

607
00:32:47.400 --> 00:32:51.279
<v Speaker 2>Absolutely. And while SageMaker provides that integrated environment, you

608
00:32:51.279 --> 00:32:56.079
<v Speaker 2>can also orchestrate ML workflows using other AWS services, sometimes

609
00:32:56.119 --> 00:33:01.440
<v Speaker 2>in combination. AWS Lambda functions, serverless, event-driven compute

610
00:33:01.440 --> 00:33:04.839
<v Speaker 2>functions are great for automating parts of the pipeline. For example,

611
00:33:04.839 --> 00:33:06.799
<v Speaker 2>an S3 upload event could trigger a Lambda

612
00:33:06.880 --> 00:33:10.680
<v Speaker 2>function to do some initial data preprocessing or validation. For

613
00:33:10.799 --> 00:33:15.400
<v Speaker 2>more complex multi-step workflows, AWS Step Functions is fantastic.

614
00:33:15.599 --> 00:33:18.480
<v Speaker 2>It lets you define your workflow as a visual state machine.

615
00:33:18.640 --> 00:33:22.640
<v Speaker 2>You can sequence and coordinate calls to Lambda functions, Glue jobs, SageMaker

616
00:33:22.680 --> 00:33:26.279
<v Speaker 2>training jobs, manual approval steps, pretty much any AWS service.

617
00:33:26.680 --> 00:33:29.839
<v Speaker 2>It's great for managing long-running distributed processes with built-in

618
00:33:29.839 --> 00:33:30.640
<v Speaker 2>error handling and retries.

619
00:33:30.640 --> 00:33:33.359
<v Speaker 1>Step Functions for orchestrating the whole flow.
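
NOTE
Editorial sketch of the orchestration idea above: a two-step state machine written as a Python dict in Amazon States Language form.
A Lambda preprocessing task with retries is followed by a SageMaker training task that catches failures.
The state names and the function name preprocess-data are hypothetical, and the training request parameters are deliberately omitted.
The small checker is an illustration only; it verifies that every transition points at a defined state.
```python
import json
definition = {
    "Comment": "Hypothetical ML pipeline sketch",
    "StartAt": "Preprocess",
    "States": {
        "Preprocess": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "preprocess-data", "Payload.$": "$"},
            # Retry the task up to 3 times, doubling the wait after each attempt.
            "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 5, "MaxAttempts": 3, "BackoffRate": 2.0}],
            "Next": "Train",
        },
        "Train": {
            "Type": "Task",
            # The .sync suffix makes the workflow wait for the training job to finish.
            # Parameters omitted: a real request needs the algorithm, role, data, and output settings.
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
            "End": True,
        },
        "NotifyFailure": {"Type": "Fail", "Error": "TrainingFailed", "Cause": "Training job did not complete"},
    },
}
def dangling_targets(defn):
    """Return transition targets that do not name a defined state."""
    states = defn["States"]
    targets = {defn["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return sorted(t for t in targets if t not in states)
print(dangling_targets(definition))
print(json.dumps(definition, indent=2))
```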

620
00:33:33.559 --> 00:33:35.759
<v Speaker 2>Yeah, and if we connect this back to that bigger picture,

621
00:33:36.039 --> 00:33:39.200
<v Speaker 2>SageMaker, often combined with services like Glue, Lambda, and

622
00:33:39.240 --> 00:33:43.279
<v Speaker 2>Step Functions, really aims to provide a managed, scalable, and

623
00:33:43.319 --> 00:33:47.279
<v Speaker 2>secure environment for that entire CRISP-DM life cycle we started

624
00:33:47.319 --> 00:33:50.400
<v Speaker 2>with, from understanding the business need and preparing data in

625
00:33:50.440 --> 00:33:54.480
<v Speaker 2>notebooks all the way through training, tuning, deploying, and monitoring

626
00:33:54.480 --> 00:33:57.319
<v Speaker 2>models in production. It's designed to be end to end.

627
00:33:57.599 --> 00:34:00.319
<v Speaker 1>And there you have it. Wow, what an insightful deep

628
00:34:00.359 --> 00:34:04.200
<v Speaker 1>dive that was. We started with the very foundations: AI, ML, and DL.

629
00:34:04.519 --> 00:34:07.880
<v Speaker 1>We explored that critical CRISP-DM life cycle. We understood the

630
00:34:08.199 --> 00:34:11.760
<v Speaker 1>frankly immense importance of data preparation and the variety of

631
00:34:11.760 --> 00:34:15.960
<v Speaker 1>AWS storage options. Then we delved into the specific AWS AI

632
00:34:16.000 --> 00:34:20.039
<v Speaker 1>application services, those ready made tools, and also the powerful

633
00:34:20.079 --> 00:34:23.199
<v Speaker 1>built-in algorithms within SageMaker. And finally we saw

634
00:34:23.239 --> 00:34:26.679
<v Speaker 1>how crucial evaluation and optimization are and how SageMaker

635
00:34:26.679 --> 00:34:30.239
<v Speaker 1>and other services help operationalize it all. You've just gained,

636
00:34:30.280 --> 00:34:33.199
<v Speaker 1>I think, a really valuable shortcut to being well informed

637
00:34:33.199 --> 00:34:36.599
<v Speaker 1>about the whole landscape of machine learning on AWS, extracting

638
00:34:36.599 --> 00:34:39.920
<v Speaker 1>hopefully incredible value from what's normally a pretty dense technical guide.

639
00:34:40.000 --> 00:34:42.719
<v Speaker 2>And maybe this raises an important final question for you,

640
00:34:42.960 --> 00:34:45.960
<v Speaker 2>the listener, to consider. Given the power you've just heard

641
00:34:45.960 --> 00:34:50.280
<v Speaker 2>about in these integrated AWS services, from intelligent data processing

642
00:34:50.320 --> 00:34:53.920
<v Speaker 2>and diverse storage to that comprehensive suite of AI and

643
00:34:54.079 --> 00:34:58.480
<v Speaker 2>ML tools, how might you reimagine a current data analysis task,

644
00:34:58.960 --> 00:35:01.920
<v Speaker 2>or maybe an automation challenge in your own work? How

645
00:35:01.920 --> 00:35:05.519
<v Speaker 2>could you potentially transform it into an intelligent, scalable ML

646
00:35:05.639 --> 00:35:07.719
<v Speaker 2>solution using some of these capabilities?

647
00:35:08.079 --> 00:35:10.039
<v Speaker 1>That's a great thought to leave everyone with. How can

648
00:35:10.079 --> 00:35:12.559
<v Speaker 1>you apply this? Thanks for diving deep with us today.

649
00:35:12.639 --> 00:35:16.000
<v Speaker 1>Until next time, keep exploring, keep learning, and stay curious.
