WEBVTT

1
00:00:00.160 --> 00:00:04.320
<v Speaker 1>In a world absolutely flooded with data, mastering complex tech

2
00:00:04.440 --> 00:00:07.759
<v Speaker 1>like deep learning and cloud infrastructure can often feel like

3
00:00:08.919 --> 00:00:11.119
<v Speaker 1>trying to drink from a fire hose. Totally. But what

4
00:00:11.160 --> 00:00:13.400
<v Speaker 1>if there was a shortcut, you know, a way to

5
00:00:13.480 --> 00:00:16.559
<v Speaker 1>truly understand what matters, cut through the noise and get

6
00:00:16.600 --> 00:00:19.079
<v Speaker 1>straight to the impactful insights?

7
00:00:19.399 --> 00:00:22.480
<v Speaker 2>That's precisely our mission with this deep dive. Yeah, tailor

8
00:00:22.559 --> 00:00:25.280
<v Speaker 2>made for you. Yeah, I mean, imagine a startup, let's

9
00:00:25.280 --> 00:00:28.640
<v Speaker 2>call them Precision Analytics. Yeah, they want to revolutionize healthcare,

10
00:00:28.760 --> 00:00:32.640
<v Speaker 2>predict patient outcomes at scale. Okay, big goal, huge goal, and

11
00:00:32.719 --> 00:00:36.960
<v Speaker 2>their challenge: moving beyond manually crunching data to building a

12
00:00:37.039 --> 00:00:40.880
<v Speaker 2>really robust automated system, something that could handle, like, petabytes

13
00:00:40.920 --> 00:00:43.759
<v Speaker 2>of health records and train these cutting-edge neural networks.

14
00:00:44.399 --> 00:00:46.719
<v Speaker 2>So today we're going to unpack the most important nuggets of

15
00:00:46.759 --> 00:00:49.719
<v Speaker 2>knowledge from our sources. We'll reveal how they, and how

16
00:00:49.759 --> 00:00:52.520
<v Speaker 2>you, can actually conquer this using tools like PySpark,

17
00:00:52.679 --> 00:00:57.240
<v Speaker 2>PyTorch, TensorFlow, and Apache Airflow, all on Amazon Web Services.

18
00:00:57.479 --> 00:01:02.200
<v Speaker 1>Absolutely. From predicting stock prices to classifying medical conditions, this

19
00:01:02.320 --> 00:01:05.799
<v Speaker 1>deep dive is your personalized guide. We found some really

20
00:01:06.000 --> 00:01:08.920
<v Speaker 1>surprising facts and insights that should give you some serious

21
00:01:08.920 --> 00:01:13.799
<v Speaker 1>aha moments. So let's trace that journey and really unpack

22
00:01:13.920 --> 00:01:14.719
<v Speaker 1>this whole thing.

23
00:01:15.000 --> 00:01:15.519
<v Speaker 2>Let's do it.

24
00:01:15.680 --> 00:01:20.439
<v Speaker 1>So why the cloud, specifically AWS? Why is it the

25
00:01:20.799 --> 00:01:24.959
<v Speaker 1>sort of undisputed champion for these deep learning pipelines? Our

26
00:01:25.000 --> 00:01:29.120
<v Speaker 1>research really hammered home the why. Traditional on-premises infrastructure,

27
00:01:29.200 --> 00:01:31.400
<v Speaker 1>it often just hits a wall. I mean, with the

28
00:01:31.439 --> 00:01:34.200
<v Speaker 1>exponential growth of data we're seeing, that makes perfect sense,

29
00:01:34.239 --> 00:01:34.719
<v Speaker 1>doesn't it?

30
00:01:34.719 --> 00:01:35.319
<v Speaker 2>It really does.

31
00:01:35.400 --> 00:01:37.239
<v Speaker 1>Simply can't keep buying hardware fast enough.

32
00:01:37.079 --> 00:01:41.519
<v Speaker 2>Exactly. The computational muscle and, frankly, the sheer scalability

33
00:01:41.560 --> 00:01:44.799
<v Speaker 2>needed for modern deep learning workflows are just immense, and

34
00:01:44.879 --> 00:01:48.280
<v Speaker 2>an on-premises setup just can't offer that elastic capacity. Like,

35
00:01:48.280 --> 00:01:51.000
<v Speaker 2>what happens when you suddenly get ten times the data?

36
00:01:50.799 --> 00:01:51.560
<v Speaker 1>Right, you're stuck.

37
00:01:51.840 --> 00:01:55.319
<v Speaker 2>You're stuck. And this is where cloud-based deep learning

38
00:01:55.319 --> 00:02:00.480
<v Speaker 2>steps in. It offers incredible flexibility, true scalability, and often

39
00:02:00.680 --> 00:02:05.519
<v Speaker 2>it's surprisingly cost effective. It fundamentally changes how organizations like

40
00:02:05.560 --> 00:02:09.159
<v Speaker 2>our Precision Analytics example can rapidly develop and deploy these

41
00:02:09.199 --> 00:02:10.439
<v Speaker 2>advanced ML algorithms.

42
00:02:10.479 --> 00:02:12.680
<v Speaker 1>And when we talk about cloud, the data we pulled

43
00:02:12.680 --> 00:02:16.560
<v Speaker 1>it consistently points to AWS as the clear leader, with the biggest

44
00:02:16.599 --> 00:02:21.639
<v Speaker 1>market share, so their infrastructure becomes this incredibly robust foundation

45
00:02:21.919 --> 00:02:23.159
<v Speaker 1>for these kinds of tasks.

46
00:02:23.280 --> 00:02:28.039
<v Speaker 2>Indeed, AWS provides a really comprehensive suite of services and

47
00:02:28.080 --> 00:02:32.000
<v Speaker 2>they're pretty well tuned for orchestrating these complex, data-intensive pipelines.

48
00:02:32.120 --> 00:02:34.319
<v Speaker 1>Okay, all right, we've framed the challenge. We get why

49
00:02:34.319 --> 00:02:38.199
<v Speaker 1>the cloud is necessary. Now let's visualize this whole operation.

50
00:02:38.360 --> 00:02:41.000
<v Speaker 1>What does this end-to-end deep learning workflow on

51
00:02:41.039 --> 00:02:45.120
<v Speaker 1>AWS actually look like, from the initial data intake all

52
00:02:45.199 --> 00:02:47.280
<v Speaker 1>the way to a model actually making predictions?

53
00:02:47.319 --> 00:02:50.479
<v Speaker 2>Okay, think of it like a meticulously engineered assembly line. Yeah,

54
00:02:50.520 --> 00:02:52.759
<v Speaker 2>you know, for your data and your models. It starts

55
00:02:52.759 --> 00:02:56.680
<v Speaker 2>with raw data ingestion and it ends with automated model

56
00:02:56.719 --> 00:03:00.919
<v Speaker 2>runs hopefully in production. Our sources detailed the critical components

57
00:03:00.919 --> 00:03:03.840
<v Speaker 2>that make this journey, well, not just possible, but actually

58
00:03:03.879 --> 00:03:04.560
<v Speaker 2>pretty efficient.

59
00:03:04.599 --> 00:03:08.639
<v Speaker 1>Okay. First up, the backbone: data storage. Amazon S3,

60
00:03:09.159 --> 00:03:12.719
<v Speaker 1>Simple Storage Service. This is like the central nervous system

61
00:03:12.759 --> 00:03:13.840
<v Speaker 1>for all your data, isn't it?

62
00:03:13.919 --> 00:03:16.360
<v Speaker 2>That's a great analogy. Yeah. S3 acts as the

63
00:03:16.439 --> 00:03:21.360
<v Speaker 2>centralized data lake. Basically, it stores everything: your raw datasets,

64
00:03:21.719 --> 00:03:26.039
<v Speaker 2>the carefully preprocessed data, even the final model artifacts. Everything. Everything.

65
00:03:26.120 --> 00:03:30.599
<v Speaker 2>It offers virtually limitless storage capacity and ensures incredibly easy,

66
00:03:30.879 --> 00:03:32.080
<v Speaker 2>highly available data retrieval.

67
00:03:31.840 --> 00:03:34.719
<v Speaker 1>Which is non-negotiable when you're dealing with terabytes

68
00:03:34.800 --> 00:03:38.000
<v Speaker 1>or even petabytes of patient records, like Precision Analytics would be.

69
00:03:37.800 --> 00:03:40.879
<v Speaker 2>Absolutely. And once that massive amount of data is

70
00:03:40.919 --> 00:03:43.639
<v Speaker 2>safely sitting in S3, the next challenge is actually

71
00:03:43.680 --> 00:03:47.639
<v Speaker 2>processing it, transforming it into something usable. That's where PySpark

72
00:03:47.719 --> 00:03:51.520
<v Speaker 2>really shines, right? Absolutely. PySpark is the engine for large

73
00:03:51.560 --> 00:03:56.599
<v Speaker 2>scale distributed data processing. It's essential for efficient preprocessing and

74
00:03:56.639 --> 00:04:01.479
<v Speaker 2>transformation of those massive data sets. Without its parallel processing power,

75
00:04:01.680 --> 00:04:04.759
<v Speaker 2>preparing data for deep learning would be agonizingly slow, a

76
00:04:04.840 --> 00:04:08.439
<v Speaker 2>huge bottleneck. A major bottleneck, yeah. And resource intensive too.

77
00:04:08.680 --> 00:04:10.240
<v Speaker 2>Bad for any high-volume operation.

78
00:04:10.400 --> 00:04:13.879
<v Speaker 1>Okay, so data is prepped. Now you need serious computational

79
00:04:13.879 --> 00:04:15.680
<v Speaker 1>horsepower for the actual model training.

80
00:04:16.480 --> 00:04:20.639
<v Speaker 2>Enter Amazon EC2. Correct. Amazon EC2, or Elastic

81
00:04:20.639 --> 00:04:24.800
<v Speaker 2>Compute Cloud. It provides the necessary virtual servers with powerful

82
00:04:24.839 --> 00:04:29.120
<v Speaker 2>CPUs and, importantly, GPUs for model training. It ensures efficient

83
00:04:29.199 --> 00:04:32.360
<v Speaker 2>utilization of cloud resources. You can quickly spin up or

84
00:04:32.360 --> 00:04:36.959
<v Speaker 2>spin down instances based on your specific training needs. Saves time, saves cost.

85
00:04:36.600 --> 00:04:40.240
<v Speaker 1>Very elastic. And then the actual brain of the

86
00:04:40.319 --> 00:04:44.480
<v Speaker 1>operation: PyTorch and TensorFlow. These are the deep learning frameworks themselves,

87
00:04:44.480 --> 00:04:46.720
<v Speaker 1>the tools you use to actually build, train, and evaluate

88
00:04:46.720 --> 00:04:47.199
<v Speaker 1>your models.

89
00:04:47.279 --> 00:04:50.079
<v Speaker 2>Yes, they are the real powerhouses of the deep learning world.

90
00:04:50.319 --> 00:04:52.839
<v Speaker 2>And finally, to kind of glue it all together, to

91
00:04:53.040 --> 00:04:56.199
<v Speaker 2>automate and streamline the entire process, we have Apache

92
00:04:56.199 --> 00:05:02.879
<v Speaker 2>Airflow, or its fully managed AWS counterpart, Amazon MWAA. Ah, okay.

93
00:05:03.040 --> 00:05:06.199
<v Speaker 2>This is your orchestrator. It ensures every step, from data

94
00:05:06.240 --> 00:05:10.879
<v Speaker 2>prep all the way to model deployment, runs seamlessly, like clockwork.

95
00:05:11.040 --> 00:05:11.439
<v Speaker 2>Got it.

96
00:05:11.839 --> 00:05:15.199
<v Speaker 1>Okay, so if you're like our hypothetical company, Precision Analytics,

97
00:05:15.240 --> 00:05:17.360
<v Speaker 1>and you want to get your hands dirty, what are

98
00:05:17.399 --> 00:05:20.120
<v Speaker 1>the foundational steps for setting up this environment from scratch?

99
00:05:20.399 --> 00:05:23.040
<v Speaker 2>Yeah, the foundation is absolutely critical. First, you need an

100
00:05:23.040 --> 00:05:27.199
<v Speaker 2>AWS account, obviously. Then you provision your EC2 instances,

101
00:05:27.600 --> 00:05:30.439
<v Speaker 2>your virtual servers. Right. You can do that manually or,

102
00:05:30.480 --> 00:05:34.000
<v Speaker 2>for more complex, repeatable setups, you'd probably use automation tools

103
00:05:34.000 --> 00:05:36.399
<v Speaker 2>like AWS CloudFormation. Makes life easier.

104
00:05:36.480 --> 00:05:38.399
<v Speaker 1>And getting S3 ready, what does that involve?

105
00:05:38.399 --> 00:05:41.720
<v Speaker 2>That involves creating your S3 buckets, carefully configuring the

106
00:05:41.759 --> 00:05:46.519
<v Speaker 2>appropriate access permissions, super important to keep sensitive data secure. Crucial. Yeah,

107
00:05:46.600 --> 00:05:49.199
<v Speaker 2>and then uploading your initial datasets. But, you know,

108
00:05:49.319 --> 00:05:52.480
<v Speaker 2>beyond the raw AWS services, one really crucial insight from

109
00:05:52.480 --> 00:05:55.920
<v Speaker 2>our sources was just the importance of organization. How so?

110
00:05:56.240 --> 00:06:01.000
<v Speaker 2>Well, having a well-designed project directory structure with distinct

111
00:06:01.040 --> 00:06:05.639
<v Speaker 2>folders for data, logs, output, src for your code, visualizations,

112
00:06:05.680 --> 00:06:10.199
<v Speaker 2>plus those key files like README.md, requirements.txt,

113
00:06:10.600 --> 00:06:12.040
<v Speaker 2>and maybe a config.yaml.

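NOTE Editor's sketch, not part of the spoken audio. The layout described above
could be scaffolded roughly as follows. The folder and file names come from the
episode; the helper name scaffold_project and the choice to create empty
placeholder files are assumptions made for illustration.
```python
from pathlib import Path
# Folder and file names as listed in the episode.
FOLDERS = ["data", "logs", "output", "src", "visualizations"]
FILES = ["README.md", "requirements.txt", "config.yaml"]
def scaffold_project(root):
    """Create the folders and empty key files under root; return the root Path."""
    root_path = Path(root)
    for folder in FOLDERS:
        # parents=True also creates the project root on first use.
        (root_path / folder).mkdir(parents=True, exist_ok=True)
    for name in FILES:
        # touch() leaves existing files untouched, so re-running is safe.
        (root_path / name).touch(exist_ok=True)
    return root_path
```
Calling scaffold_project("my_project") twice is harmless, since existing
folders and files are kept as they are.
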
114
00:06:12.319 --> 00:06:13.199
<v Speaker 1>Oh, okay.

115
00:06:13.199 --> 00:06:17.759
<v Speaker 2>It sounds basic, but it's paramount for collaboration, for reproducibility, and

116
00:06:17.879 --> 00:06:21.519
<v Speaker 2>just clear documentation. It's often overlooked, but honestly it's

117
00:06:21.560 --> 00:06:22.800
<v Speaker 2>a huge timesaver down the line.

118
00:06:22.720 --> 00:06:25.199
<v Speaker 1>I can see that. And for ensuring everything runs

119
00:06:25.240 --> 00:06:29.759
<v Speaker 1>smoothly without conflicts, isolation is key, with Python virtual environments, right?

120
00:06:29.800 --> 00:06:33.839
<v Speaker 2>Yes, absolutely. Creating a Python virtual environment, like maybe myenv,

121
00:06:33.959 --> 00:06:37.079
<v Speaker 2>as we saw in the sources, is paramount. It neatly

122
00:06:37.120 --> 00:06:40.800
<v Speaker 2>manages all your project dependencies, okay, and it ensures reproducibility

123
00:06:40.800 --> 00:06:44.680
<v Speaker 2>across different systems by preventing those pesky conflicts between different

124
00:06:44.720 --> 00:06:47.399
<v Speaker 2>Python versions or library versions. Think of it like a

125
00:06:47.439 --> 00:06:49.920
<v Speaker 2>clean custom sandbox for each project.

126
00:06:50.199 --> 00:06:52.600
<v Speaker 1>Nice, and where does all this coding actually happen? What's

127
00:06:52.639 --> 00:06:53.600
<v Speaker 1>a typical workspace?

128
00:06:54.199 --> 00:06:57.879
<v Speaker 2>Development environments like JupyterLab are really commonly used for

129
00:06:58.040 --> 00:07:01.759
<v Speaker 2>writing and developing the machine learning models within this whole setup.

130
00:07:02.000 --> 00:07:06.040
<v Speaker 2>They provide that interactive, iterative workspace that's so crucial for

131
00:07:06.120 --> 00:07:06.720
<v Speaker 2>data science.

132
00:07:06.920 --> 00:07:10.959
<v Speaker 1>Makes sense. Okay, environment's provisioned, organized. Let's talk about the data.

133
00:07:11.000 --> 00:07:13.920
<v Speaker 1>It truly is the foundation. You mentioned PySpark as

134
00:07:13.920 --> 00:07:18.079
<v Speaker 1>the powerhouse for data prep. How does it supercharge this process,

135
00:07:18.160 --> 00:07:20.160
<v Speaker 1>especially with massive data sets?

136
00:07:20.240 --> 00:07:23.600
<v Speaker 2>Right. PySpark's secret weapon is its parallel processing. It

137
00:07:23.680 --> 00:07:27.279
<v Speaker 2>dramatically enhances efficiency and speed. Instead of one computer

138
00:07:27.399 --> 00:07:31.560
<v Speaker 2>just slogging through everything sequentially, yeah, it intelligently breaks down

139
00:07:31.600 --> 00:07:36.040
<v Speaker 2>these large data tasks into independent subtasks that run concurrently

140
00:07:36.199 --> 00:07:39.959
<v Speaker 2>across a whole cluster of machines. Distributed power. Exactly. And

141
00:07:40.000 --> 00:07:43.279
<v Speaker 2>we discovered several key optimization techniques in our sources that

142
00:07:43.319 --> 00:07:44.759
<v Speaker 2>can really transform performance.

143
00:07:44.920 --> 00:07:46.800
<v Speaker 1>Oh yeah? Like what? Give us an example.

144
00:07:46.959 --> 00:07:51.120
<v Speaker 2>Okay, take repartitioning. It intelligently redistributes your data across a

145
00:07:51.160 --> 00:07:55.680
<v Speaker 2>specified number of partitions, say ten partitions, to really improve parallelism,

146
00:07:55.879 --> 00:07:58.800
<v Speaker 2>get more work done at once. Or caching. This keeps

147
00:07:58.879 --> 00:08:03.000
<v Speaker 2>DataFrames in memory for lightning-fast access during repeated operations,

148
00:08:03.319 --> 00:08:06.959
<v Speaker 2>so you avoid costly recomputations. Ah. And what was fascinating

149
00:08:07.079 --> 00:08:12.199
<v Speaker 2>was how a seemingly minor PySpark optimization like broadcasting...

150
00:08:11.639 --> 00:08:13.240
<v Speaker 1>Ah, I remember reading about that.

151
00:08:13.439 --> 00:08:17.160
<v Speaker 2>Yeah, it dramatically reduced processing time for a multi-terabyte

152
00:08:17.199 --> 00:08:20.480
<v Speaker 2>dataset from hours down to minutes in a specific

153
00:08:20.560 --> 00:08:23.399
<v Speaker 2>real-world case study we found. Wow. It's a common

154
00:08:23.439 --> 00:08:27.160
<v Speaker 2>pitfall teams overlook when they're scaling up. And also, saving

155
00:08:27.240 --> 00:08:30.759
<v Speaker 2>large datasets in Parquet format, which supports compression

156
00:08:30.800 --> 00:08:34.080
<v Speaker 2>and optimized read operations, another crucial performance gain.

157
00:08:34.480 --> 00:08:37.360
<v Speaker 1>So these aren't just minor tweaks, they can have huge impacts.

158
00:08:37.080 --> 00:08:38.840
<v Speaker 2>Huge impacts, exactly.

159
00:08:38.919 --> 00:08:40.879
<v Speaker 1>We saw a real world example of this in the

160
00:08:40.919 --> 00:08:45.000
<v Speaker 1>sources, looking at historical Tesla stock prices. How exactly was

161
00:08:45.039 --> 00:08:46.080
<v Speaker 1>PySpark used there?

162
00:08:46.279 --> 00:08:49.159
<v Speaker 2>Right, in that Tesla stock example, PySpark was used to

163
00:08:49.200 --> 00:08:52.159
<v Speaker 2>swiftly explore the dataset. It efficiently checked for null

164
00:08:52.240 --> 00:08:56.639
<v Speaker 2>values. Luckily, the source showed none, which simplified things. Very handy.

165
00:08:56.759 --> 00:08:59.600
<v Speaker 2>And then visualizing closing prices over time. It was just

166
00:08:59.639 --> 00:09:03.320
<v Speaker 2>the perfect tool for that initial large scale data exploration.

167
00:09:03.639 --> 00:09:07.120
<v Speaker 1>Okay, and feature engineering, that crucial step that can really

168
00:09:08.000 --> 00:09:09.759
<v Speaker 1>elevate a model's predictive power?

169
00:09:09.919 --> 00:09:13.399
<v Speaker 2>Yes, feature engineering is where you get creative. You create new,

170
00:09:13.960 --> 00:09:17.320
<v Speaker 2>hopefully more informative features from your raw data. For the

171
00:09:17.360 --> 00:09:20.799
<v Speaker 2>Tesla stock, this included calculating things like price range, so

172
00:09:21.120 --> 00:09:24.320
<v Speaker 2>high minus low, okay, price change, close minus open, and

173
00:09:24.399 --> 00:09:28.360
<v Speaker 2>even volume-price interaction, volume multiplied by close, trying to

174
00:09:28.399 --> 00:09:29.960
<v Speaker 2>capture more dynamics.

175
00:09:29.559 --> 00:09:31.200
<v Speaker 1>Right, creating signals. Exactly.

176
00:09:31.559 --> 00:09:34.320
<v Speaker 2>And then tools like VectorAssembler and StandardScaler in

177
00:09:34.320 --> 00:09:38.240
<v Speaker 2>PySpark prepare these newly engineered features. They transform them

178
00:09:38.279 --> 00:09:40.759
<v Speaker 2>into the right format and scale for the deep learning

179
00:09:40.759 --> 00:09:41.679
<v Speaker 2>models down the line.

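NOTE Editor's sketch, not part of the spoken audio. The three engineered
features named above (high minus low, close minus open, volume times close)
can be illustrated in plain Python. This is a stand-in for the PySpark
pipeline, not the sources' code: the column names, the sample rows and the
standardize helper are assumptions, and the scaling here does not try to
reproduce StandardScaler's exact defaults.
```python
from statistics import mean, pstdev
def add_features(row):
    """Return a copy of one daily record with the three engineered features."""
    out = dict(row)
    out["price_range"] = row["high"] - row["low"]
    out["price_change"] = row["close"] - row["open"]
    out["volume_price"] = row["volume"] * row["close"]
    return out
def standardize(values):
    """Scale values to zero mean and unit variance (population standard deviation)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no signal; map it to zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]
# Made-up rows for illustration only; these are not real Tesla prices.
rows = [
    {"open": 10.0, "high": 12.0, "low": 9.0, "close": 11.0, "volume": 100},
    {"open": 11.0, "high": 13.0, "low": 10.0, "close": 10.0, "volume": 200},
    {"open": 10.0, "high": 15.0, "low": 10.0, "close": 14.0, "volume": 300},
]
engineered = [add_features(r) for r in rows]
scaled_range = standardize([r["price_range"] for r in engineered])
```
Scaling matters because volume_price is orders of magnitude larger than
price_range, and unscaled inputs of that kind tend to dominate training.
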
180
00:09:41.720 --> 00:09:45.200
<v Speaker 1>Got it. Now for the brain of the operation, the

181
00:09:45.240 --> 00:09:49.519
<v Speaker 1>deep learning models themselves, powered by PyTorch and TensorFlow. These

182
00:09:49.519 --> 00:09:52.440
<v Speaker 1>are the two big titans dominating the deep learning landscape,

183
00:09:52.480 --> 00:09:53.320
<v Speaker 1>right? Absolutely.

184
00:09:53.679 --> 00:09:58.039
<v Speaker 2>Both PyTorch and TensorFlow are incredibly powerful frameworks. They build

185
00:09:58.080 --> 00:10:02.399
<v Speaker 2>deep learning models capable of tackling really diverse tasks, from regression,

186
00:10:02.879 --> 00:10:06.720
<v Speaker 2>like predicting continuous values, say future stock prices like the

187
00:10:06.759 --> 00:10:10.240
<v Speaker 2>Tesla example, exactly, to classification, like predicting the presence

188
00:10:10.240 --> 00:10:12.919
<v Speaker 2>of diabetes, which was the other main example in our sources.

189
00:10:13.039 --> 00:10:16.080
<v Speaker 1>How do these two heavyweights stack up against each other?

190
00:10:16.240 --> 00:10:18.799
<v Speaker 1>The materials provided a pretty clear showdown.

191
00:10:18.919 --> 00:10:22.200
<v Speaker 2>They certainly did. It's interesting. PyTorch typically uses what are

192
00:10:22.200 --> 00:10:25.840
<v Speaker 2>called dynamic computational graphs. They're defined during runtime.

193
00:10:26.000 --> 00:10:27.600
<v Speaker 1>Okay, what does that mean practically?

194
00:10:28.000 --> 00:10:31.080
<v Speaker 2>Think of it like building Legos one piece at a time.

195
00:10:31.559 --> 00:10:34.519
<v Speaker 2>You can easily adjust things and see the immediate impact.

196
00:10:34.919 --> 00:10:39.519
<v Speaker 2>It's incredibly flexible, really ideal for research and rapid prototyping.

197
00:10:39.720 --> 00:10:41.399
<v Speaker 1>More interactive.

198
00:10:41.240 --> 00:10:44.559
<v Speaker 2>Yeah, more interactive, more Pythonic, some would say. TensorFlow, on the

199
00:10:44.559 --> 00:10:49.320
<v Speaker 2>other hand, traditionally used static graphs, defined before execution. This

200
00:10:49.399 --> 00:10:52.120
<v Speaker 2>is more like following a detailed blueprint, right, which is

201
00:10:52.200 --> 00:10:56.639
<v Speaker 2>incredibly efficient for optimization and deployment, especially with its seamless

202
00:10:56.679 --> 00:10:57.480
<v Speaker 2>Keras integration.

203
00:10:57.720 --> 00:11:00.600
<v Speaker 1>So maybe one for research, one for production. Is that

204
00:11:00.639 --> 00:11:01.200
<v Speaker 1>too simple?

205
00:11:01.320 --> 00:11:04.480
<v Speaker 2>It's a common pattern. Our sources did indicate teams often

206
00:11:04.519 --> 00:11:08.120
<v Speaker 2>gravitate towards PyTorch for that initial experimental phase because it's

207
00:11:08.120 --> 00:11:12.120
<v Speaker 2>so flexible. Then they might potentially transition to TensorFlow for

208
00:11:12.200 --> 00:11:16.679
<v Speaker 2>more robust production scaling. But TensorFlow is becoming more dynamic too,

209
00:11:16.720 --> 00:11:18.080
<v Speaker 2>so the lines are blurring a bit.

210
00:11:17.799 --> 00:11:21.480
<v Speaker 1>Interesting. What about the training loops themselves, any

211
00:11:21.480 --> 00:11:23.639
<v Speaker 1>differences there in how you actually train the model?

212
00:11:23.879 --> 00:11:28.320
<v Speaker 2>Yes, PyTorch often requires a bit more manual implementation of

213
00:11:28.360 --> 00:11:30.600
<v Speaker 2>the training loop, because you get really fine-grained control, which

214
00:11:30.679 --> 00:11:31.639
<v Speaker 2>researchers often like.

215
00:11:31.759 --> 00:11:32.080
<v Speaker 1>Okay.

216
00:11:32.159 --> 00:11:36.120
<v Speaker 2>TensorFlow's Keras API, however, provides a higher-level

217
00:11:36.159 --> 00:11:38.759
<v Speaker 2>model.fit method. It automates a lot of that process,

218
00:11:38.840 --> 00:11:41.639
<v Speaker 2>makes it very accessible, maybe easier to get started with

219
00:11:41.679 --> 00:11:41.919
<v Speaker 2>for some.

220
00:11:42.159 --> 00:11:45.360
<v Speaker 1>And how did they actually perform on that Tesla stock

221
00:11:45.399 --> 00:11:47.080
<v Speaker 1>price prediction task? Did one win?

222
00:11:47.559 --> 00:11:51.000
<v Speaker 2>Well, both models achieved an exceptionally high R-squared score,

223
00:11:51.320 --> 00:11:55.200
<v Speaker 2>like point nine to nine eight, which indicates excellent predictive accuracy.

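NOTE Editor's sketch, not part of the spoken audio. For reference, R-squared
compares a model's squared error against that of always predicting the mean:
R2 = 1 - SS_res / SS_tot. The function name and the toy numbers below are
assumptions for illustration; they are not the Tesla results.
```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
y_true = [1.0, 2.0, 3.0, 4.0]
perfect = r_squared(y_true, y_true)
baseline = r_squared(y_true, [2.5, 2.5, 2.5, 2.5])
```
A perfect fit scores 1.0 and the predict-the-mean baseline scores 0.0, so a
value close to 1 means most of the variance in the target is explained.
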
224
00:11:55.279 --> 00:11:56.559
<v Speaker 1>Wow. Okay, so both very good.

225
00:11:56.600 --> 00:11:59.559
<v Speaker 2>Both very good. What was particularly interesting, though, was that

226
00:11:59.600 --> 00:12:02.440
<v Speaker 2>the TensorFlow model had a slightly lower test loss,

227
00:12:02.480 --> 00:12:05.519
<v Speaker 2>twelve point one one, compared to PyTorch's twenty point five

228
00:12:05.559 --> 00:12:08.320
<v Speaker 2>to four. Now, this difference might seem small, but in

229
00:12:08.360 --> 00:12:12.039
<v Speaker 2>a financial context like stock prediction, even marginal improvements in

230
00:12:12.159 --> 00:12:16.960
<v Speaker 2>loss can translate to significant real world financial impact and

231
00:12:17.000 --> 00:12:19.559
<v Speaker 2>potentially better generalization to unseen data.

232
00:12:19.639 --> 00:12:22.879
<v Speaker 1>Good point. And for the diabetes classification example?

233
00:12:23.039 --> 00:12:26.759
<v Speaker 2>For diabetes, both models showed pretty comparable accuracy, TensorFlow's at

234
00:12:26.799 --> 00:12:30.320
<v Speaker 2>point seven six ninety two, PyTorch at point seven six

235
00:12:30.440 --> 00:12:34.879
<v Speaker 2>zero seven. Very close. A key insight from analyzing that

236
00:12:35.000 --> 00:12:37.799
<v Speaker 2>data was that the glucose level had the strongest correlation

237
00:12:37.879 --> 00:12:40.720
<v Speaker 2>with the outcome, the diagnosis, about point four to eighty

238
00:12:40.720 --> 00:12:43.759
<v Speaker 2>eight. Interesting. But the sources also importantly noted the

239
00:12:43.799 --> 00:12:48.639
<v Speaker 2>presence of skewed data in several features, things like pregnancies, BMI, diabetes

240
00:12:48.679 --> 00:12:49.679
<v Speaker 2>pedigree function, and age.

241
00:12:49.759 --> 00:12:51.000
<v Speaker 1>Why does that matter?

242
00:12:51.279 --> 00:12:53.519
<v Speaker 2>Well, skewed data isn't just a technical detail. It can

243
00:12:53.559 --> 00:12:57.600
<v Speaker 2>profoundly impact model bias and learning. It really emphasizes why

244
00:12:57.639 --> 00:13:01.080
<v Speaker 2>appropriate metrics like precision, recall, and the F1

245
00:13:01.080 --> 00:13:05.480
<v Speaker 2>score are absolutely crucial for evaluating performance on imbalanced classification

246
00:13:05.559 --> 00:13:08.679
<v Speaker 2>tasks like this, where just looking at overall accuracy can

247
00:13:08.759 --> 00:13:10.240
<v Speaker 2>be really misleading.

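NOTE Editor's sketch, not part of the spoken audio. A small plain-Python
illustration of why accuracy alone can mislead on imbalanced labels. The
labels are made up (8 negatives, 2 positives) and the function name is an
assumption; nothing here comes from the diabetes dataset itself.
```python
def precision_recall_f1(y_true, y_pred):
    """Binary precision, recall and F1 with label 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Fall back to 0.0 when a denominator is empty (no predicted or no actual positives).
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
# A lazy model that predicts "negative" for everyone.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```
Here accuracy comes out at 0.8 even though the model finds none of the
positive cases, which is exactly what recall and F1 expose.
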
248
00:13:09.879 --> 00:13:12.720
<v Speaker 1>Right, you might be mispredicting the rarer cases. So once you've

249
00:13:12.720 --> 00:13:15.320
<v Speaker 1>got your basic model built, how do you really boost

250
00:13:15.320 --> 00:13:20.360
<v Speaker 1>its performance, tackle those common challenges like overfitting or underfitting?

251
00:13:20.480 --> 00:13:23.399
<v Speaker 2>Ah, yeah, that's where the advanced techniques come in. Yeah,

252
00:13:23.440 --> 00:13:26.360
<v Speaker 2>and our sources gave us some fascinating practical insights here.

253
00:13:26.600 --> 00:13:30.919
<v Speaker 2>Overfitting and underfitting are like ubiquitous challenges in deep learning,

254
00:13:31.080 --> 00:13:34.840
<v Speaker 2>always fighting them. Always. For instance, early stopping. It doesn't

255
00:13:34.879 --> 00:13:39.360
<v Speaker 2>just prevent overfitting by halting training when your validation performance,

256
00:13:39.440 --> 00:13:44.120
<v Speaker 2>maybe the loss, stops improving. For the Tesla stock example,

257
00:13:44.480 --> 00:13:50.200
<v Speaker 2>it explicitly demonstrated significant cost savings by preventing unnecessary compute cycles.

258
00:13:50.879 --> 00:13:53.759
<v Speaker 2>Training stopped at epoch eighty seven. But crucially, it

259
00:13:53.799 --> 00:13:56.240
<v Speaker 2>restored the weights from the best epoch, which was actually

260
00:13:56.240 --> 00:13:58.480
<v Speaker 2>epoch seventy seven. So you get the best model and save compute.

261
00:13:58.440 --> 00:14:01.320
<v Speaker 1>Smart. Dropout, I hear that's a powerful

262
00:14:01.320 --> 00:14:01.639
<v Speaker 1>one too.

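NOTE Editor's sketch, not part of the spoken audio. The early stopping
behaviour described a moment ago (stop at epoch 87, keep the weights from
epoch 77) can be mimicked with a toy loop. The loss curve is synthetic and
the patience of 10 is an assumption chosen so the toy run reproduces those
two epoch numbers; the sources' actual settings are not stated here.
```python
def train_with_early_stopping(val_losses, patience):
    """Walk through per-epoch validation losses and stop after `patience`
    epochs without improvement. Returns (stopped_epoch, best_epoch), 1-based."""
    best_loss = float("inf")
    best_epoch = 0
    stopped_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        stopped_epoch = epoch
        if loss < best_loss:
            best_loss = loss
            best_epoch = epoch  # a real loop would also snapshot the weights here
        elif epoch - best_epoch >= patience:
            break  # a real loop would now restore the snapshot from best_epoch
    return stopped_epoch, best_epoch
# Synthetic curve: loss falls until epoch 77, then creeps upward.
losses = [abs(77 - e) + 1.0 for e in range(1, 201)]
stopped, best = train_with_early_stopping(losses, patience=10)
```
In this toy run the remaining 113 of the 200 scheduled epochs are never
computed, which is where the compute saving comes from.
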
263
00:14:01.759 --> 00:14:05.399
<v Speaker 2>It is. Dropout randomly drops out a certain percentage of neurons,

264
00:14:05.440 --> 00:14:09.399
<v Speaker 2>maybe fifty percent, during each training step. Turns them off temporarily. Yeah.

265
00:14:09.679 --> 00:14:13.720
<v Speaker 2>This prevents complex co-adaptations between neurons, sort of forces the

266
00:14:13.759 --> 00:14:17.799
<v Speaker 2>network to learn more robust features. It significantly improves the

267
00:14:17.840 --> 00:14:22.279
<v Speaker 2>model's ability to generalize to new, unseen data. Our sources

268
00:14:22.279 --> 00:14:24.080
<v Speaker 2>kind of likened it to the model learning from

269
00:14:24.120 --> 00:14:26.159
<v Speaker 2>multiple perspectives to become more robust.

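NOTE Editor's sketch, not part of the spoken audio. A minimal plain-Python
picture of dropout at a 50 percent rate. It uses the common "inverted
dropout" convention (survivors are scaled up so the expected activation is
unchanged), which is an assumption here; the episode does not say which
variant the sources used. Function and variable names are illustrative.
```python
import random
def dropout(activations, rate, rng):
    """Inverted dropout: zero each activation with probability `rate`
    and rescale the survivors by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    # rng.random() is uniform on [0, 1), so a unit survives with probability `keep`.
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]
rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 1000
dropped = dropout(activations, rate=0.5, rng=rng)
kept_fraction = sum(1 for a in dropped if a != 0.0) / len(dropped)
```
Dropout is applied only during training; at prediction time every neuron is
kept, which is why the survivors are rescaled while training.
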
270
00:14:26.320 --> 00:14:30.320
<v Speaker 1>Interesting analogy. There's also L one and L two regularization,

271
00:14:30.919 --> 00:14:33.120
<v Speaker 1>which sounds a bit like putting your model on a diet.

272
00:14:33.320 --> 00:14:34.919
<v Speaker 2>That's a great way to put it. Yeah, think of

273
00:14:35.039 --> 00:14:38.039
<v Speaker 2>L one regularization as a strict diet for your model's weights.

274
00:14:38.600 --> 00:14:41.240
<v Speaker 2>It actually forces some weights to go completely to zero,

275
00:14:41.399 --> 00:14:44.639
<v Speaker 2>oh, okay, which makes the model simpler, promotes sparsity, meaning

276
00:14:44.679 --> 00:14:48.240
<v Speaker 2>it uses fewer features. L two regularization is more like

277
00:14:48.240 --> 00:14:51.080
<v Speaker 2>a gentle nudge. It makes all weights smaller but keeps

278
00:14:51.120 --> 00:14:54.360
<v Speaker 2>them present. It helps prevent any one feature from dominating

279
00:14:54.360 --> 00:14:58.679
<v Speaker 2>the prediction. They're both powerful tools for reining in that overfitting.

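NOTE Editor's sketch, not part of the spoken audio. One way to see the
"strict diet" versus "gentle nudge" contrast: an L1-style update can clip
small weights to exactly zero, while an L2-style update only scales weights
down. The soft-thresholding step used for L1 is a standard textbook device
chosen for illustration, not something the episode describes; all names and
numbers are assumptions.
```python
def l1_penalty(weights, lam):
    """L1 term added to the loss: lam * sum of absolute weights."""
    return lam * sum(abs(w) for w in weights)
def l2_penalty(weights, lam):
    """L2 term added to the loss: lam * sum of squared weights."""
    return lam * sum(w * w for w in weights)
def soft_threshold(w, step):
    """L1-style update: shrink toward zero and clip small weights to exactly zero."""
    if w > step:
        return w - step
    if w < -step:
        return w + step
    return 0.0
def l2_shrink(w, step):
    """L2-style (weight decay) update: scale the weight by (1 - step), never zeroing it."""
    return w * (1.0 - step)
weights = [0.05, -0.5, 2.0]
after_l1 = [soft_threshold(w, 0.1) for w in weights]
after_l2 = [l2_shrink(w, 0.1) for w in weights]
```
After one update the smallest weight is exactly zero under the L1-style
step, while all three weights are merely smaller under the L2-style step.
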
280
00:14:58.279 --> 00:15:02.000
<v Speaker 1>Got it. And adjusting the learning rate, that seems fundamental but tricky.

281
00:15:02.039 --> 00:15:05.919
<v Speaker 2>Oh, absolutely critical. Learning rate tuning, basically adjusting the

282
00:15:05.960 --> 00:15:09.960
<v Speaker 2>step size for optimization, can profoundly impact how fast and

283
00:15:10.000 --> 00:15:14.440
<v Speaker 2>effectively your model converges and performs. Our sources showed clear

284
00:15:14.480 --> 00:15:18.679
<v Speaker 2>examples where different learning rates like point zero one versus

285
00:15:18.679 --> 00:15:21.919
<v Speaker 2>point zero zero zero one led to widely varied test

286
00:15:21.960 --> 00:15:25.960
<v Speaker 2>loss and R squared scores. It really underscores the importance

287
00:15:25.960 --> 00:15:30.200
<v Speaker 2>of finding that Goldilocks zone, not too fast, not too slow, right.

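NOTE Editor's sketch, not part of the spoken audio. A toy gradient descent
on f(w) = (w - 3)**2 shows how the step size changes the outcome for a fixed
budget of 50 steps. The rates 0.01 and 0.0001 are the ones mentioned in the
episode; the function, the step budget and the oversized rate 1.1 are
assumptions added for illustration.
```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimise f(w) = (w - 3)**2 and return the final loss."""
    w = start
    for _ in range(steps):
        grad = 2.0 * (w - 3.0)  # derivative of (w - 3)**2
        w -= learning_rate * grad
    return (w - 3.0) ** 2
loss_fast = gradient_descent(0.01)
loss_slow = gradient_descent(0.0001)
loss_too_big = gradient_descent(1.1)
```
With the same budget, 0.0001 barely moves from the starting loss of 9.0,
0.01 gets much closer to the minimum, and 1.1 overshoots further on every
step and diverges.
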
288
00:15:30.600 --> 00:15:33.240
<v Speaker 1>What about the actual structure of the model itself, like

289
00:15:33.440 --> 00:15:36.000
<v Speaker 1>the number of layers, the number of neurons in each layer?

290
00:15:36.159 --> 00:15:39.679
<v Speaker 2>That's model capacity. And a kind of counterintuitive finding from

291
00:15:39.679 --> 00:15:42.919
<v Speaker 2>our sources was that sometimes deeper models, meaning more layers

292
00:15:42.919 --> 00:15:46.600
<v Speaker 2>but maybe fewer neurons per layer, can outperform wider models

293
00:15:46.639 --> 00:15:49.360
<v Speaker 2>which have fewer layers but more neurons. Yeah. For the

294
00:15:49.399 --> 00:15:52.360
<v Speaker 2>Tesla stock example, a deeper model with five hidden layers

295
00:15:52.600 --> 00:15:55.879
<v Speaker 2>actually achieved lower test loss and higher R-squared compared

296
00:15:55.879 --> 00:15:57.759
<v Speaker 2>to a wider one that only had two hidden layers.

297
00:15:58.159 --> 00:16:00.759
<v Speaker 2>It suggests that for some problems depth really matters

298
00:16:00.759 --> 00:16:04.440
<v Speaker 2>more than just width. Adding layers can capture more complex patterns.

299
00:16:04.480 --> 00:16:07.279
<v Speaker 1>Fascinating. All this tuning, though, it can feel like searching

300
00:16:07.279 --> 00:16:08.320
<v Speaker 1>for a needle in a haystack sometimes.

301
00:16:08.399 --> 00:16:10.320
<v Speaker 2>It definitely can.

302
00:16:10.480 --> 00:16:14.200
<v Speaker 1>That's where hyperparameter optimization tools like Keras Tuner that

303
00:16:14.279 --> 00:16:15.639
<v Speaker 1>was mentioned come into play, I guess.

304
00:16:15.679 --> 00:16:19.080
<v Speaker 2>Precisely. Tools like Keras Tuner automate that search

305
00:16:19.120 --> 00:16:22.759
<v Speaker 2>for optimal hyper parameter combinations things like the number of

306
00:16:22.840 --> 00:16:26.919
<v Speaker 2>units in a layer, the learning rate itself dropout rates.

307
00:16:26.720 --> 00:16:28.200
<v Speaker 1>Takes the guesswork out.

308
00:16:28.360 --> 00:16:31.960
<v Speaker 2>Well, it makes the search systematic. It can yield significantly better

309
00:16:32.000 --> 00:16:36.000
<v Speaker 2>performance than just manual tuning alone, and potentially save countless

310
00:16:36.039 --> 00:16:37.240
<v Speaker 2>hours of trial and error.

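NOTE
Editor's sketch of the search loop that a tuner automates, assuming plain
random search over the three knobs named above (units, learning rate, dropout).
This is not the Keras Tuner API. The search_space values and the random_search
helper are hypothetical, and objective stands in for "train a model with this
configuration and return its validation loss".
```python
import random
# Hypothetical candidate values for each hyperparameter.
search_space = {
    "units": [32, 64, 128],
    "learning_rate": [1e-2, 1e-3, 1e-4],
    "dropout": [0.0, 0.2, 0.5],
}
def random_search(objective, space, n_trials=20, seed=0):
    """Sample n_trials configurations and keep the one with the lowest loss."""
    rng = random.Random(seed)
    best_config, best_loss = None, float("inf")
    for _ in range(n_trials):
        # Draw one value per hyperparameter to form a candidate configuration.
        config = {name: rng.choice(options) for name, options in space.items()}
        loss = objective(config)
        if loss < best_loss:
            best_config, best_loss = config, loss
    return best_config, best_loss
```
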
311
00:16:37.440 --> 00:16:41.519
<v Speaker 1>Makes sense. And finally, k-fold cross-validation. Why is

312
00:16:41.519 --> 00:16:42.159
<v Speaker 1>that important?

313
00:16:42.480 --> 00:16:46.720
<v Speaker 2>This technique is essential for getting truly reliable model performance estimates,

314
00:16:47.559 --> 00:16:49.240
<v Speaker 2>especially when you have smaller datasets.

315
00:16:49.039 --> 00:16:50.639
<v Speaker 1>Like the diabetes one, maybe.

316
00:16:50.679 --> 00:16:54.120
<v Speaker 2>Exactly. It involves splitting your data into k folds,

317
00:16:54.360 --> 00:16:57.200
<v Speaker 2>say five folds. Then you train and test the model

318
00:16:57.279 --> 00:17:00.240
<v Speaker 2>k times, using a different fold for testing each time,

319
00:17:00.519 --> 00:17:03.000
<v Speaker 2>and training on the rest. Then you average the results

320
00:17:03.000 --> 00:17:06.000
<v Speaker 2>across all the folds. For the diabetes classification, we saw

321
00:17:06.000 --> 00:17:09.200
<v Speaker 2>an average accuracy of around 0.7569

322
00:17:09.279 --> 00:17:12.279
<v Speaker 2>across five folds. That gives you a far more

323
00:17:12.400 --> 00:17:15.960
<v Speaker 2>robust and trustworthy performance estimate than just a single train

324
00:17:16.079 --> 00:17:18.400
<v Speaker 2>test split, which could be lucky or unlucky.

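NOTE
Editor's sketch of the k-fold procedure just described, in plain Python:
split the indices into k folds, hold each fold out once, train on the rest,
then average the k scores. The helper names k_fold_indices and cross_validate
are hypothetical, and train_and_score stands in for whatever model training and
evaluation you actually run. A real project would more likely use a library
splitter.
```python
import random
def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle the sample indices and deal them into k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]
def cross_validate(n_samples, train_and_score, k=5, seed=0):
    """Hold out each fold once, train on the rest, and average the k scores."""
    folds = k_fold_indices(n_samples, k, seed)
    scores = []
    for held_out in range(k):
        test_idx = folds[held_out]
        # Every index that is not in the held-out fold is used for training.
        train_idx = [i for j, fold in enumerate(folds) if j != held_out for i in fold]
        scores.append(train_and_score(train_idx, test_idx))
    return sum(scores) / k, scores
```
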
325
00:17:18.559 --> 00:17:22.000
<v Speaker 1>Right, reduces the chance factor. Okay, wow, it sounds incredibly

326
00:17:22.000 --> 00:17:24.720
<v Speaker 1>complex to manage all of this manually, especially for a

327
00:17:24.720 --> 00:17:27.640
<v Speaker 1>company like our Precision Analytics trying to scale up. It really is.

328
00:17:28.000 --> 00:17:31.640
<v Speaker 1>So what's the grand orchestrator? What brings this entire pipeline

329
00:17:31.680 --> 00:17:34.759
<v Speaker 1>together from the data ingestion right through to deploying and

330
00:17:34.799 --> 00:17:38.400
<v Speaker 1>running the model? You mentioned Apache Airflow and Amazon MWAA.

331
00:17:38.680 --> 00:17:42.079
<v Speaker 2>Yeah, you've highlighted the crucial next step. Manually running complex

332
00:17:42.160 --> 00:17:46.039
<v Speaker 2>deep learning workflows, maybe just executing a Python script, a

333
00:17:46.119 --> 00:17:51.240
<v Speaker 2>main function, utterly lacks automation, it lacks robust monitoring,

334
00:17:51.519 --> 00:17:54.400
<v Speaker 2>and it lacks the reproducibility you absolutely need for any

335
00:17:54.440 --> 00:17:58.119
<v Speaker 2>real world application. It's simply not a scalable or reliable solution.

336
00:17:58.480 --> 00:18:01.119
<v Speaker 1>So Airflow rides in to save the day. How does

337
00:18:01.160 --> 00:18:04.000
<v Speaker 1>it tackle these automation and monitoring challenges?

338
00:18:04.240 --> 00:18:08.759
<v Speaker 2>Well, Apache Airflow facilitates automated execution. You define your workflow

339
00:18:08.960 --> 00:18:11.519
<v Speaker 2>and it runs based on predefined schedules or triggers.

340
00:18:11.960 --> 00:18:16.319
<v Speaker 2>It virtually eliminates that need for manual intervention. Nice. And critically,

341
00:18:16.640 --> 00:18:21.079
<v Speaker 2>it offers comprehensive monitoring and logging capabilities. These are absolutely

342
00:18:21.160 --> 00:18:23.960
<v Speaker 2>vital for tracking the health and progress of your complex

343
00:18:24.200 --> 00:18:28.240
<v Speaker 2>deep learning pipelines. It ensures every step runs predictably and

344
00:18:28.279 --> 00:18:30.839
<v Speaker 2>if something fails, you know exactly where and why.

345
00:18:31.039 --> 00:18:33.319
<v Speaker 1>And I've heard the term DAGs a lot when people

346
00:18:33.359 --> 00:18:35.480
<v Speaker 1>talk about Airflow. What exactly are those?

347
00:18:35.559 --> 00:18:38.039
<v Speaker 2>Right, DAGs. They stand for directed acyclic graphs.

348
00:18:38.119 --> 00:18:38.480
<v Speaker 1>Okay.

349
00:18:38.759 --> 00:18:42.200
<v Speaker 2>In Airflow, your workflows are visually defined as these DAGs.

350
00:18:42.720 --> 00:18:45.880
<v Speaker 2>They're composed of individual tasks. Think of them as building

351
00:18:45.880 --> 00:18:49.519
<v Speaker 2>blocks, like run PySpark job, train model, evaluate model,

352
00:18:49.960 --> 00:18:52.799
<v Speaker 2>and you define the dependencies between them. This task runs

353
00:18:52.799 --> 00:18:55.599
<v Speaker 2>only after that one succeeds. Like a flow chart? Exactly

354
00:18:55.680 --> 00:18:58.480
<v Speaker 2>like a flow chart, but one that enforces dependencies and

355
00:18:58.519 --> 00:19:01.240
<v Speaker 2>doesn't loop back on itself. That's the acyclic part.

356
00:19:01.440 --> 00:19:05.680
<v Speaker 2>This modular design greatly enhances reusability and scalability for your workflows,

357
00:19:06.039 --> 00:19:08.720
<v Speaker 2>makes them much easier to visualize, manage and debug.

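NOTE
Editor's sketch of the dependency idea behind a DAG, using only the Python
standard library (graphlib, Python 3.9+). This is not Airflow's API. It only
illustrates "this task runs only after that one succeeds" and why a loop is
rejected. The task names mirror the examples mentioned above, and the
execution_order helper is hypothetical.
```python
from graphlib import TopologicalSorter
# Map each task to the set of tasks that must succeed before it can start.
dependencies = {
    "run_pyspark_job": set(),
    "train_model": {"run_pyspark_job"},
    "evaluate_model": {"train_model"},
}
def execution_order(deps):
    """Return one valid run order; raises graphlib.CycleError if the graph loops."""
    return list(TopologicalSorter(deps).static_order())
order = execution_order(dependencies)
```
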
358
00:19:08.920 --> 00:19:13.720
<v Speaker 1>Okay, And for AWS users there's Amazon MWAA. What's the

359
00:19:13.759 --> 00:19:16.640
<v Speaker 1>big advantage there over just running Airflow yourself.

360
00:19:16.759 --> 00:19:21.599
<v Speaker 2>Ah, Amazon MWAA, Managed Workflows for Apache Airflow. It's a

361
00:19:21.599 --> 00:19:23.599
<v Speaker 2>bit of a game changer because it's a fully managed

362
00:19:23.599 --> 00:19:28.400
<v Speaker 2>service from AWS, meaning it radically simplifies setting up, managing,

363
00:19:28.440 --> 00:19:31.799
<v Speaker 2>and scaling Apache Airflow environments. It basically slashes all the

364
00:19:31.839 --> 00:19:35.480
<v Speaker 2>manual installation, configuration, patching, and maintenance overhead you'd face if

365
00:19:35.480 --> 00:19:38.319
<v Speaker 2>you try to run Airflow yourself on EC2 instances

366
00:19:38.640 --> 00:19:39.839
<v Speaker 2>or using Docker.

367
00:19:39.599 --> 00:19:42.440
<v Speaker 1>So AWS handles the infrastructure part.

368
00:19:42.480 --> 00:19:44.920
<v Speaker 2>Exactly. It's like having a dedicated team of experts managing your

369
00:19:44.920 --> 00:19:48.240
<v Speaker 2>Airflow infrastructure for you, letting you focus just on building

370
00:19:48.240 --> 00:19:49.319
<v Speaker 2>your workflows, your DAGs.

371
00:19:49.880 --> 00:19:52.920
<v Speaker 1>That sounds pretty appealing. How does that deployment process actually

372
00:19:53.039 --> 00:19:55.480
<v Speaker 1>work with MWAA? Is it simpler?

373
00:19:55.160 --> 00:19:58.440
<v Speaker 2>It's remarkably streamlined. Yeah. First, you set up the MWAA

374
00:19:58.599 --> 00:20:02.519
<v Speaker 2>environment itself in the AWS console. That involves configuring things

375
00:20:02.559 --> 00:20:05.240
<v Speaker 2>like an S3 bucket where your DAG files will live,

376
00:20:05.640 --> 00:20:09.680
<v Speaker 2>setting up the networking, ensuring proper security roles. Then you

377
00:20:09.680 --> 00:20:12.440
<v Speaker 2>simply upload your DAG files. Often you'll zip them up

378
00:20:12.640 --> 00:20:15.920
<v Speaker 2>with any custom Python dependencies they need into that designated

379
00:20:16.000 --> 00:20:19.640
<v Speaker 2>S3 bucket, configure any environment variables your DAGs need,

380
00:20:20.000 --> 00:20:22.960
<v Speaker 2>and then you can trigger the DAG execution either manually

381
00:20:23.000 --> 00:20:26.880
<v Speaker 2>through the Airflow UI that MWAA provides, or set up

382
00:20:26.880 --> 00:20:27.799
<v Speaker 2>a preset schedule.

383
00:20:27.960 --> 00:20:31.079
<v Speaker 1>Seems much less hassle. Okay, so once everything's deployed and running,

384
00:20:31.079 --> 00:20:35.119
<v Speaker 1>maybe on a schedule, continuous monitoring is critical. Why is

385
00:20:35.160 --> 00:20:38.720
<v Speaker 1>that so important? Specifically for deep learning models after they're deployed.

386
00:20:38.920 --> 00:20:42.440
<v Speaker 2>Yeah, continuous monitoring post-deployment is absolutely crucial. You need

387
00:20:42.480 --> 00:20:46.680
<v Speaker 2>to detect issues like data drift. That's where the statistical

388
00:20:46.680 --> 00:20:49.799
<v Speaker 2>properties of the input data change over time compared to

389
00:20:49.839 --> 00:20:53.160
<v Speaker 2>the training data. So the world changes. Exactly. Or concept drift,

390
00:20:53.200 --> 00:20:56.160
<v Speaker 2>which is even trickier. That's where the relationship between the

391
00:20:56.160 --> 00:20:59.480
<v Speaker 2>input features and the target variable actually shifts. The underlying

392
00:20:59.559 --> 00:21:03.720
<v Speaker 2>patterns learned might no longer hold true. Yeah. Monitoring also

393
00:21:03.839 --> 00:21:07.000
<v Speaker 2>helps you spot critical resource bottlenecks like is your prediction

394
00:21:07.119 --> 00:21:10.920
<v Speaker 2>service running out of CPU, GPU or memory, and track

395
00:21:11.000 --> 00:21:14.920
<v Speaker 2>latency problems that could impact real time applications, especially for

396
00:21:15.000 --> 00:21:18.920
<v Speaker 2>something like patient diagnosis where speed and accuracy are paramount.

397
00:21:19.279 --> 00:21:22.599
<v Speaker 2>You can't have your model suddenly getting slow or inaccurate.

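NOTE
Editor's sketch of one very simple check for the first kind of drift described
above, where the input data's statistics move away from the training data.
It measures how far the live mean of a single feature has shifted, in units of
the training standard deviation. The function names and the 0.5 threshold are
assumptions for illustration, not from the sources. It says nothing about
concept drift, which needs labeled outcomes to detect.
```python
from statistics import mean, stdev
def mean_shift_score(training_values, live_values):
    """How many training standard deviations the live mean has moved."""
    spread = stdev(training_values)
    if spread == 0:
        # A constant training feature: any change at all counts as an unbounded shift.
        return 0.0 if mean(live_values) == mean(training_values) else float("inf")
    return abs(mean(live_values) - mean(training_values)) / spread
def has_drifted(training_values, live_values, threshold=0.5):
    """Flag drift when the shift exceeds the (assumed, tunable) threshold."""
    return mean_shift_score(training_values, live_values) > threshold
```
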
398
00:21:22.720 --> 00:21:25.720
<v Speaker 1>Definitely not. What tools do you use for that kind

399
00:21:25.720 --> 00:21:26.440
<v Speaker 1>of monitoring?

400
00:21:26.880 --> 00:21:31.200
<v Speaker 2>Well, the MWAA console itself and the standard Apache Airflow

401
00:21:31.319 --> 00:21:34.279
<v Speaker 2>UI provide direct monitoring of your DAG runs. Did they

402
00:21:34.279 --> 00:21:37.400
<v Speaker 2>succeed or fail? How long did they take? Okay, but for

403
00:21:37.519 --> 00:21:41.519
<v Speaker 2>even more comprehensive insights into the models and the infrastructure, Amazon

404
00:21:41.519 --> 00:21:44.000
<v Speaker 2>CloudWatch is really powerful. It offers a huge suite

405
00:21:44.039 --> 00:21:47.160
<v Speaker 2>of metrics and logs tracking. You can create custom dashboards

406
00:21:47.160 --> 00:21:50.440
<v Speaker 2>to visualize performance over time, set up alarms for critical

407
00:21:50.480 --> 00:21:53.519
<v Speaker 2>events like if prediction latency spikes, or accuracy drops, and

408
00:21:53.799 --> 00:21:57.640
<v Speaker 2>receive notifications. It really helps ensure your models remain reliable

409
00:21:57.680 --> 00:22:00.799
<v Speaker 2>and performant in production long after that initial deployment.

410
00:22:00.960 --> 00:22:04.079
<v Speaker 1>Wow, okay, you've just taken us on quite a deep

411
00:22:04.119 --> 00:22:08.039
<v Speaker 1>dive here into the architecture, the key technologies behind building

412
00:22:08.039 --> 00:22:12.079
<v Speaker 1>these scalable deep learning pipelines on AWS. From that efficient

413
00:22:12.160 --> 00:22:15.240
<v Speaker 1>data prep with PySpark, to the intelligent model training

414
00:22:15.240 --> 00:22:19.240
<v Speaker 1>with PyTorch and TensorFlow, and then finally that robust orchestration and

415
00:22:19.319 --> 00:22:23.839
<v Speaker 1>monitoring with Airflow and MWAA. You really now have a

416
00:22:23.880 --> 00:22:27.599
<v Speaker 1>comprehensive understanding of how all these pieces fit together, how

417
00:22:27.599 --> 00:22:30.759
<v Speaker 1>they unlock the immense power of AI at scale. It's

418
00:22:30.759 --> 00:22:33.640
<v Speaker 1>an incredible journey, isn't it, From just raw data to

419
00:22:33.759 --> 00:22:35.440
<v Speaker 1>actual actionable insight.

420
00:22:35.599 --> 00:22:39.000
<v Speaker 2>It truly is, and this powerful combination of tools and services,

421
00:22:39.079 --> 00:22:42.400
<v Speaker 2>it really has the potential to transform how organizations like

422
00:22:42.440 --> 00:22:46.559
<v Speaker 2>our Precision Analytics example, leverage their data for advanced analytics,

423
00:22:46.599 --> 00:22:50.079
<v Speaker 2>for truly impactful predictive modeling. It really pushes the boundaries

424
00:22:50.119 --> 00:22:50.920
<v Speaker 2>of what's possible.

425
00:22:51.000 --> 00:22:53.640
<v Speaker 1>Absolutely. So the final thought for you, the listener: with

426
00:22:53.880 --> 00:22:57.960
<v Speaker 1>this new perspective, seeing how these pieces connect, what complex,

427
00:22:58.119 --> 00:23:00.839
<v Speaker 1>data-rich challenge will you choose to tackle by designing

428
00:23:00.880 --> 00:23:03.559
<v Speaker 1>your own scalable deep learning pipeline on the cloud?
