WEBVTT

1
00:00:00.080 --> 00:00:03.319
<v Speaker 1>Welcome to the deep dive. We're taking on a really

2
00:00:03.399 --> 00:00:08.000
<v Speaker 1>crucial area today, deep learning specifically tailored for you, the

3
00:00:08.080 --> 00:00:11.439
<v Speaker 1>data architect. You shared some great material and our focus

4
00:00:11.519 --> 00:00:15.880
<v Speaker 1>is this book Deep Learning for Data Architects by Shekhar Khandelwal.

5
00:00:16.199 --> 00:00:18.760
<v Speaker 2>Yeah, it looks like a really solid resource, good for

6
00:00:18.879 --> 00:00:22.320
<v Speaker 2>understanding how these advanced techniques fit into data infrastructure.

7
00:00:22.440 --> 00:00:25.800
<v Speaker 1>It's pretty current too, right? BPB Online, twenty twenty four.

8
00:00:26.000 --> 00:00:28.600
<v Speaker 2>That's right, ISBN nine seven eight nine three five five

9
00:00:28.679 --> 00:00:31.079
<v Speaker 2>five one five three nine one. So yeah, very up

10
00:00:31.079 --> 00:00:31.359
<v Speaker 2>to date.

11
00:00:31.519 --> 00:00:35.679
<v Speaker 1>Absolutely, and the author, Shekhar Khandelwal, he seems to really

12
00:00:35.719 --> 00:00:38.439
<v Speaker 1>know his stuff. Senior AI and data scientist in Hamburg,

13
00:00:38.840 --> 00:00:41.600
<v Speaker 1>got a master's in data science specialized in computer vision

14
00:00:41.719 --> 00:00:45.520
<v Speaker 1>and wow, over fifteen years in AI and machine learning.

15
00:00:45.399 --> 00:00:48.359
<v Speaker 2>And experience with all the big cloud platforms: AWS, Google

16
00:00:48.359 --> 00:00:50.960
<v Speaker 2>Cloud, Azure, IBM Cloud. That's important.

17
00:00:51.039 --> 00:00:53.880
<v Speaker 1>Definitely brings that practical, real world angle. And apparently he's

18
00:00:53.880 --> 00:00:56.880
<v Speaker 1>also into marathons and CrossFit, huh.

19
00:00:57.200 --> 00:01:00.439
<v Speaker 2>Yeah, saw that because that drive carries over and seriously,

20
00:01:00.520 --> 00:01:05.599
<v Speaker 2>that real world experience is invaluable, especially for bridging that

21
00:01:05.719 --> 00:01:11.040
<v Speaker 2>gap between complex AI theory and the practical stuff data

22
00:01:11.120 --> 00:01:15.120
<v Speaker 2>architects deal with. Yeah, the book's dedication is quite nice too,

23
00:01:15.319 --> 00:01:19.000
<v Speaker 2>to his wife, daughter, his uncle Danesh, a bookseller.

24
00:01:18.519 --> 00:01:20.359
<v Speaker 1>A bookseller uncle, that's a nice touch.

25
00:01:20.439 --> 00:01:25.239
<v Speaker 2>Yeah, and parents, BPB Publications, colleagues, readers. Gives it a

26
00:01:25.319 --> 00:01:25.920
<v Speaker 2>human feel.

27
00:01:26.120 --> 00:01:28.799
<v Speaker 1>Okay, so let's unpack this for you, our listener. The

28
00:01:28.920 --> 00:01:31.480
<v Speaker 1>mission here is to pull out the key knowledge from

29
00:01:31.480 --> 00:01:34.040
<v Speaker 1>this book that's, well, directly relevant to your work as

30
00:01:34.079 --> 00:01:37.719
<v Speaker 1>a data architect. We'll look at how deep learning concepts

31
00:01:37.719 --> 00:01:41.879
<v Speaker 1>get implemented using Python, focusing on the practical side without

32
00:01:41.959 --> 00:01:43.680
<v Speaker 1>getting too bogged down in dense theory.

33
00:01:43.840 --> 00:01:46.959
<v Speaker 2>Think of it as like your fast track to understanding

34
00:01:46.959 --> 00:01:49.359
<v Speaker 2>the deep learning bits that matter for your field. Exactly.

35
00:01:49.439 --> 00:01:50.879
<v Speaker 1>We want to give you a clear picture of the

36
00:01:50.920 --> 00:01:53.439
<v Speaker 1>main principles, the tools, so you can see how deep

37
00:01:53.519 --> 00:01:56.959
<v Speaker 1>learning could maybe be integrated strategically into the architectures you're

38
00:01:56.959 --> 00:01:58.000
<v Speaker 1>designing or managing.

39
00:01:58.200 --> 00:01:59.879
<v Speaker 2>Sounds good. Where does the book start?

40
00:02:00.079 --> 00:02:03.439
<v Speaker 1>Right at the foundation: Python for data science, chapter one.

41
00:02:04.959 --> 00:02:07.519
<v Speaker 1>There's a quote there that kind of sets the stage.

42
00:02:08.039 --> 00:02:10.319
<v Speaker 1>You can't build a great building on a weak foundation,

43
00:02:10.800 --> 00:02:11.560
<v Speaker 1>you know the one.

44
00:02:11.800 --> 00:02:14.719
<v Speaker 2>You must have a solid foundation if you are going

45
00:02:14.759 --> 00:02:16.319
<v Speaker 2>to have a super strong structure.

46
00:02:16.840 --> 00:02:19.800
<v Speaker 1>Yeah, it's the perfect analogy, isn't it? Totally. Just like

47
00:02:19.840 --> 00:02:23.520
<v Speaker 1>a building needs that solid base. Any serious deep learning

48
00:02:23.560 --> 00:02:28.840
<v Speaker 1>setup relies heavily on Python and its whole library ecosystem.

49
00:02:28.400 --> 00:02:30.759
<v Speaker 2>So the book jumps right into the essential libraries.

50
00:02:30.879 --> 00:02:34.080
<v Speaker 1>Yep, kicks off with pandas for data handling, NumPy for

51
00:02:34.120 --> 00:02:36.560
<v Speaker 1>all the numerical stuff, the real

52
00:02:36.319 --> 00:02:38.439
<v Speaker 2>workhorses, you know. Right, basics.

53
00:02:38.520 --> 00:02:42.039
<v Speaker 1>Then Matplotlib and Seaborn for visualization, super important

54
00:02:42.080 --> 00:02:45.240
<v Speaker 1>for understanding data flows or how models are doing. You

55
00:02:45.240 --> 00:02:47.639
<v Speaker 2>can't really see inside the black box otherwise. Exactly.

56
00:02:47.719 --> 00:02:50.439
<v Speaker 1>Then there's scikit-learn, the big toolkit for general machine

57
00:02:50.520 --> 00:02:53.840
<v Speaker 1>learning tasks, and of course TensorFlow and Keras.

58
00:02:53.719 --> 00:02:55.879
<v Speaker 2>The deep learning powerhouses. Definitely.

59
00:02:55.960 --> 00:02:58.879
<v Speaker 1>And for images it mentions scikit-image and OpenCV too.

60
00:02:59.039 --> 00:02:59.240
<v Speaker 1>And it

61
00:02:59.240 --> 00:03:02.439
<v Speaker 2>gives you the install commands, like pip install? Yeah, it

62
00:03:02.479 --> 00:03:06.159
<v Speaker 2>provides the basic pip install commands and actually a useful

63
00:03:06.199 --> 00:03:10.159
<v Speaker 2>tip for notebook users: using sys.executable -m

64
00:03:10.840 --> 00:03:13.800
<v Speaker 2>pip install an su library name. Oh,

65
00:03:13.560 --> 00:03:15.120
<v Speaker 1>right, to make sure it installs in the

66
00:03:15.159 --> 00:03:17.599
<v Speaker 2>right place. Exactly, avoids some common headaches.

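NOTE
Editor's sketch, not taken from the book: the notebook tip above amounts to
running pip through the same interpreter that is executing the notebook, so
the package lands in that environment. The helper name pip_install_command is
hypothetical, and the command is only built and printed here, not run.
```python
import sys
def pip_install_command(package):
    """Build a pip command bound to the interpreter running this notebook or script."""
    return [sys.executable, "-m", "pip", "install", package]
# Only the command is assembled; nothing is installed by this sketch.
command = pip_install_command("pandas")
print(" ".join(command))
```
Passing the list to subprocess.check_call would perform the install in that same environment.
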
67
00:03:17.759 --> 00:03:20.800
<v Speaker 1>Okay, so beyond just installing, what about using them? Pandas

68
00:03:20.800 --> 00:03:21.800
<v Speaker 1>for data I/O?

69
00:03:21.919 --> 00:03:24.960
<v Speaker 2>Yeah, it covers reading and writing data extensively. You know,

70
00:03:25.000 --> 00:03:29.800
<v Speaker 2>the usual suspects: CSV with to_csv and read_csv, standard

71
00:03:29.960 --> 00:03:33.960
<v Speaker 2>Excel using to_excel and read_excel, JSON with to_json and read_json.

72
00:03:34.759 --> 00:03:37.840
<v Speaker 2>Pretty comprehensive for a data architect needing to connect different systems.

73
00:03:37.879 --> 00:03:40.319
<v Speaker 1>That broad format support is key. And it gets

74
00:03:40.199 --> 00:03:43.039
<v Speaker 2>into some interesting ones too, like read_clipboard for quick

75
00:03:43.039 --> 00:03:43.919
<v Speaker 2>copy-pasting

76
00:03:43.759 --> 00:03:45.879
<v Speaker 1>data. Handy for testing small things. And

77
00:03:45.879 --> 00:03:48.879
<v Speaker 2>read_html to pull tables straight from websites, which is

78
00:03:48.919 --> 00:03:49.439
<v Speaker 2>pretty cool.

79
00:03:49.479 --> 00:03:51.039
<v Speaker 1>Wait, multiple tables on a page?

80
00:03:51.120 --> 00:03:53.400
<v Speaker 2>Yeah. It mentions the match parameter. You can tell it

81
00:03:53.439 --> 00:03:55.520
<v Speaker 2>like look for a table with specific text in it

82
00:03:55.639 --> 00:03:59.159
<v Speaker 2>or near it. Yeah. Really useful for automating data scraping pipelines.

83
00:03:59.199 --> 00:04:00.680
<v Speaker 1>Okay, that is useful. And it

84
00:04:00.599 --> 00:04:03.039
<v Speaker 2>even points to a blog post by the author about

85
00:04:03.039 --> 00:04:06.319
<v Speaker 2>other formats like Parquet and pickle. So pandas really positions

86
00:04:06.360 --> 00:04:08.479
<v Speaker 2>itself as a central hub for data movement.

87
00:04:08.639 --> 00:04:11.680
<v Speaker 1>But it's not just about reading data, right? Efficiency matters,

88
00:04:11.800 --> 00:04:13.240
<v Speaker 1>especially with big data sets.

89
00:04:13.319 --> 00:04:17.639
<v Speaker 2>Oh, absolutely. The book stresses optimizing pandas.read_csv,

90
00:04:18.639 --> 00:04:23.519
<v Speaker 2>huge for data architects worried about resources, performance, cost. How?

91
00:04:23.759 --> 00:04:26.519
<v Speaker 2>Well, the dtype parameter first. You can tell pandas

92
00:04:26.600 --> 00:04:29.279
<v Speaker 2>exactly what data type each column should be, like

93
00:04:29.319 --> 00:04:31.680
<v Speaker 2>this one's an integer, this one's a smaller float.

94
00:04:31.480 --> 00:04:34.120
<v Speaker 1>Instead of letting pandas guess and maybe use more memory

95
00:04:34.160 --> 00:04:35.560
<v Speaker 1>than needed. Precisely.

96
00:04:36.199 --> 00:04:39.079
<v Speaker 2>The book gives an example with housing data setting rooms

97
00:04:39.439 --> 00:04:42.199
<v Speaker 2>to np.int32, distance to

98
00:04:42.279 --> 00:04:45.399
<v Speaker 2>np.float16. Stuff like that saves a noticeable chunk

99
00:04:45.399 --> 00:04:45.839
<v Speaker 2>of memory.

100
00:04:46.040 --> 00:04:47.240
<v Speaker 1>Okay, I see sense.

101
00:04:47.240 --> 00:04:49.680
<v Speaker 2>Then there's usecols. Just tell it which columns you actually need.

102
00:04:49.800 --> 00:04:52.439
<v Speaker 1>Ah, so don't even load the rest into memory.

103
00:04:52.199 --> 00:04:54.240
<v Speaker 2>Right, if you only need five columns out of fifty,

104
00:04:54.279 --> 00:04:57.199
<v Speaker 2>why load all fifty? Again, the example shows a big

105
00:04:57.240 --> 00:04:59.519
<v Speaker 2>memory drop just by specifying

106
00:04:58.920 --> 00:05:01.199
<v Speaker 1>the columns. And you can combine those, dtype and

107
00:05:01.319 --> 00:05:01.800
<v Speaker 1>usecols?

108
00:05:01.920 --> 00:05:05.120
<v Speaker 2>Yeah, that's where the real magic happens for optimization. The

109
00:05:05.160 --> 00:05:08.759
<v Speaker 2>book shows combining them can bring memory usage way down,

110
00:05:08.879 --> 00:05:12.639
<v Speaker 2>sometimes to just, like, a few thousand KB. That's significant

111
00:05:12.639 --> 00:05:14.519
<v Speaker 2>for pipeline performance. Definitely.

112
00:05:14.560 --> 00:05:18.120
<v Speaker 1>What about data sets that are just too big, like

113
00:05:18.600 --> 00:05:19.959
<v Speaker 1>won't fit in memory at all.

114
00:05:20.120 --> 00:05:23.240
<v Speaker 2>That's where chunksize comes in. Critical for architects designing

115
00:05:23.240 --> 00:05:26.399
<v Speaker 2>for massive data volumes. It lets you process the file piece

116
00:05:26.439 --> 00:05:28.639
<v Speaker 2>by piece in manageable chunks.

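NOTE
Editor's sketch, not the book's listing: the three read_csv options discussed
above (dtype, usecols, chunksize) applied to a tiny in-memory CSV. The column
names and values are made up for illustration; int32 and float16 mirror the
types mentioned in the conversation.
```python
import io
import numpy as np
import pandas as pd
# Hypothetical housing data; real files would be read from a path instead.
CSV_TEXT = "rooms,distance,price,notes\n3,1.5,250000,a\n4,2.25,320000,b\n2,0.75,180000,c\n5,3.0,410000,d\n"
def load(**kwargs):
    """Read the sample CSV with whatever read_csv options are passed in."""
    return pd.read_csv(io.StringIO(CSV_TEXT), **kwargs)
full = load()
# dtype pins column types up front; usecols skips columns that are never needed.
slim = load(usecols=["rooms", "distance"], dtype={"rooms": np.int32, "distance": np.float16})
print(full.memory_usage(deep=True).sum(), "bytes with defaults")
print(slim.memory_usage(deep=True).sum(), "bytes with usecols and dtype")
# chunksize yields DataFrames piece by piece, so the whole file is never in memory at once.
total_rooms = 0
for chunk in load(usecols=["rooms"], dtype={"rooms": np.int32}, chunksize=2):
    total_rooms += int(chunk["rooms"].sum())
print(total_rooms)
```
On a four-row sample the savings are tiny; the point is only the shape of the calls.
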
117
00:05:28.759 --> 00:05:33.199
<v Speaker 1>Okay, so solid foundation with Python, efficient data handling. Where

118
00:05:33.199 --> 00:05:35.480
<v Speaker 1>does it go next? It seems like chapter two tackles

119
00:05:35.600 --> 00:05:37.079
<v Speaker 1>real world data challenges.

120
00:05:37.160 --> 00:05:40.240
<v Speaker 2>Yeah, it shifts focus to, you know, the practical side

121
00:05:40.240 --> 00:05:43.720
<v Speaker 2>of turning that raw data into something useful, into insights,

122
00:05:44.120 --> 00:05:46.759
<v Speaker 2>because just having the data stored isn't the endgame.

123
00:05:46.560 --> 00:05:47.000
<v Speaker 1>Right right.

124
00:05:47.040 --> 00:05:49.920
<v Speaker 2>You need to understand it, prepare it. Exactly, and especially

125
00:05:49.920 --> 00:05:53.639
<v Speaker 2>before feeding it into say a deep learning model. So

126
00:05:53.759 --> 00:05:57.800
<v Speaker 2>the book introduces some automated tools for exploratory data analysis, EDA.

127
00:05:57.920 --> 00:06:01.480
<v Speaker 1>Ah, tools to speed up that initial data understanding phase.

128
00:06:01.879 --> 00:06:04.319
<v Speaker 1>Useful for an architect looking at a new source. Definitely.

129
00:06:04.360 --> 00:06:05.519
<v Speaker 2>First one mentioned is pandas

130
00:06:05.600 --> 00:06:06.959
<v Speaker 1>profiling. Heard of that one?

131
00:06:07.040 --> 00:06:12.399
<v Speaker 2>Yeah. Generates reports, right? Yep, pretty comprehensive reports, gives you stats, distributions,

132
00:06:12.680 --> 00:06:15.560
<v Speaker 2>potential issues like missing values, correlations all in one go.

133
00:06:16.279 --> 00:06:18.399
<v Speaker 2>Great for getting a fast assessment of a data set

134
00:06:18.439 --> 00:06:19.199
<v Speaker 2>you need to integrate.

135
00:06:19.319 --> 00:06:21.800
<v Speaker 1>Saves a lot of manual plotting and checking. For

136
00:06:21.759 --> 00:06:25.600
<v Speaker 2>sure. And it can generate interactive widgets in Jupyter notebooks

137
00:06:25.720 --> 00:06:30.480
<v Speaker 2>using df_profile.to_widgets(). Good for collaboration. It also

138
00:06:30.519 --> 00:06:32.920
<v Speaker 2>mentions a minimal mode for really huge data sets so

139
00:06:32.959 --> 00:06:36.079
<v Speaker 2>it doesn't choke. Practical. What else? Next is Sweetviz.

140
00:06:36.639 --> 00:06:39.680
<v Speaker 2>Similar goal, EDA, but it really focuses on creating nice

141
00:06:39.839 --> 00:06:41.639
<v Speaker 2>interactive HTML reports.

142
00:06:41.839 --> 00:06:44.519
<v Speaker 1>HTML reports, so easy to share. Exactly, you can

143
00:06:44.399 --> 00:06:46.839
<v Speaker 2>share them with people who aren't running Python. And a

144
00:06:46.920 --> 00:06:49.360
<v Speaker 2>key feature is comparing two data sets side by side.

145
00:06:49.439 --> 00:06:53.279
<v Speaker 1>Ooh, that sounds useful, like for data migration validation or

146
00:06:53.560 --> 00:06:54.720
<v Speaker 1>A/B test results.

147
00:06:54.720 --> 00:06:57.759
<v Speaker 2>Precisely. The commands are simple: sv.analyze(df)

148
00:06:57.759 --> 00:07:01.199
<v Speaker 2>.show_html() for one data set, sv.compare(df1, df2)

149
00:07:01.240 --> 00:07:04.879
<v Speaker 2>.show_html("compare_report.html") for two. The

150
00:07:05.000 --> 00:07:06.879
<v Speaker 2>visual comparison could be really powerful.

151
00:07:07.000 --> 00:07:09.759
<v Speaker 1>Okay, cool, what's next? AutoViz sounds fast.

152
00:07:09.639 --> 00:07:12.759
<v Speaker 2>That's the idea. One line of code for visualizations: AV

153
00:07:12.879 --> 00:07:16.800
<v Speaker 2>.AutoViz("your_data.csv", depVar="target_column").

154
00:07:16.920 --> 00:07:18.040
<v Speaker 1>One line. What does it show?

155
00:07:18.240 --> 00:07:21.639
<v Speaker 2>It tries to automatically figure out the important relationships and

156
00:07:21.720 --> 00:07:25.800
<v Speaker 2>generates a bunch of relevant plots, scatterplots, distributions, box plots,

157
00:07:25.800 --> 00:07:29.360
<v Speaker 2>heat maps, whatever seems appropriate for the variables. Quick visual

158
00:07:29.399 --> 00:07:30.759
<v Speaker 2>scans for patterns or problems.

159
00:07:30.839 --> 00:07:35.240
<v Speaker 1>Efficient. Then there's Lux. That integrates with Jupyter widgets?

160
00:07:35.439 --> 00:07:37.920
<v Speaker 2>Yeah. Lux is interesting because it works within the notebook.

161
00:07:37.959 --> 00:07:41.199
<v Speaker 2>You get a Toggle Pandas/Lux button on your data frame.

162
00:07:41.120 --> 00:07:43.639
<v Speaker 1>So you switch between table and visuals?

163
00:07:43.879 --> 00:07:47.160
<v Speaker 2>Kind of. When you display a data frame, Lux adds

164
00:07:47.199 --> 00:07:51.040
<v Speaker 2>recommendations for visualizations based on the current data, or you

165
00:07:51.040 --> 00:07:55.279
<v Speaker 2>can state an intent, like df.intent = ["age", "fare"],

166
00:07:56.079 --> 00:07:58.160
<v Speaker 2>and it generates plots relevant to

167
00:07:58.079 --> 00:07:59.480
<v Speaker 1>those columns. Intent based.

168
00:07:59.519 --> 00:08:01.920
<v Speaker 2>I like that. It tries to be smarter about what

169
00:08:01.959 --> 00:08:04.000
<v Speaker 2>you might want to see. You can save reports to

170
00:08:04.079 --> 00:08:07.920
<v Speaker 2>HTML too, df.save_as_html(), clear the intent, even get

171
00:08:07.920 --> 00:08:10.240
<v Speaker 2>more advanced with VisList. It's quite interactive.

172
00:08:10.319 --> 00:08:13.079
<v Speaker 1>Lots of EDA tools. What about modeling? Good transition.

173
00:08:13.480 --> 00:08:16.759
<v Speaker 2>The next tool is Lazy Predict. It shifts gears towards

174
00:08:16.839 --> 00:08:19.279
<v Speaker 2>quickly trying out lots of different machine learning models.

175
00:08:19.399 --> 00:08:22.079
<v Speaker 1>Lazy, like it does the work for you? Pretty much.

176
00:08:22.680 --> 00:08:25.439
<v Speaker 2>If you're thinking about adding ML but aren't sure which

177
00:08:25.480 --> 00:08:29.240
<v Speaker 2>algorithm might work best, Lazy Predict runs your data through

178
00:08:29.279 --> 00:08:31.959
<v Speaker 2>a whole bunch of standard classifiers or regressors.

179
00:08:32.080 --> 00:08:32.840
<v Speaker 1>How does that work?

180
00:08:33.000 --> 00:08:36.120
<v Speaker 2>You use LazyClassifier or LazyRegressor, give it your

181
00:08:36.159 --> 00:08:40.000
<v Speaker 2>training and testing data and call fit. It then spits

182
00:08:40.000 --> 00:08:43.320
<v Speaker 2>out a table comparing the performance metrics, accuracy, F1

183
00:08:43.360 --> 00:08:46.799
<v Speaker 2>score, R-squared, whatever, for dozens of models.

184
00:08:46.960 --> 00:08:49.720
<v Speaker 1>Wow. Okay, so a quick benchmark to see what directions

185
00:08:49.720 --> 00:08:53.720
<v Speaker 2>look promising. Exactly. Helps you understand potential complexity and performance

186
00:08:53.759 --> 00:08:56.440
<v Speaker 2>trade-offs early on before you commit architecturally.

187
00:08:56.720 --> 00:08:58.879
<v Speaker 1>And the last one in this chapter, PyCaret?

188
00:08:59.159 --> 00:09:02.000
<v Speaker 2>PyCaret is another automated ML library, maybe

189
00:09:02.039 --> 00:09:04.840
<v Speaker 2>a bit more comprehensive than Lazy Predict. It covers more

190
00:09:04.840 --> 00:09:09.639
<v Speaker 2>of the workflow: model building, tuning, evaluation, even some deployment

191
00:09:09.240 --> 00:09:11.080
<v Speaker 1>aspects. So more end to end.

192
00:09:11.600 --> 00:09:13.639
<v Speaker 2>Yeah, you can do things like compare_models to get

193
00:09:13.639 --> 00:09:16.120
<v Speaker 2>a leaderboard, and create_model to build, say, an

194
00:09:16.120 --> 00:09:19.639
<v Speaker 2>extra trees model, and plot_model to visualize confusion matrices,

195
00:09:19.759 --> 00:09:21.480
<v Speaker 2>feature importance, learning curves.

196
00:09:21.519 --> 00:09:23.200
<v Speaker 1>So it helps understand the whole life cycle.

197
00:09:23.440 --> 00:09:27.240
<v Speaker 2>Right. For an architect, understanding that full process helps inform

198
00:09:27.279 --> 00:09:32.279
<v Speaker 2>decisions about deployment, monitoring, scaling the ML components within the

199
00:09:32.360 --> 00:09:33.039
<v Speaker 2>larger system.

200
00:09:33.159 --> 00:09:36.320
<v Speaker 1>Okay, that's a powerful set of tools for understanding and

201
00:09:36.360 --> 00:09:39.799
<v Speaker 1>initially modeling data. What's next? Chapter three gets into the

202
00:09:39.799 --> 00:09:42.000
<v Speaker 1>core, right? Building neural networks.

203
00:09:42.080 --> 00:09:45.200
<v Speaker 2>Yes, Chapter three dives into artificial neural networks, ANNs.

204
00:09:45.759 --> 00:09:50.200
<v Speaker 2>It starts conceptually, explaining the inspiration from biology, our

205
00:09:50.200 --> 00:09:53.200
<v Speaker 1>brains. The whole neurons and connections idea. Exactly.

206
00:09:53.200 --> 00:09:55.799
<v Speaker 2>It provides that helpful mental model. Then it breaks down

207
00:09:55.799 --> 00:09:59.399
<v Speaker 2>the basic building blocks, the artificial neurons, the weights connecting them,

208
00:09:59.600 --> 00:10:03.039
<v Speaker 2>the bias, and importantly, activation functions.

209
00:10:02.639 --> 00:10:06.000
<v Speaker 1>Crucial for an architect to understand the components if they're

210
00:10:06.039 --> 00:10:07.360
<v Speaker 1>supporting the infrastructure.

211
00:10:07.519 --> 00:10:10.720
<v Speaker 2>Absolutely. It briefly covers the feed forward process, how data

212
00:10:10.759 --> 00:10:14.080
<v Speaker 2>flows through and the basic neuron math. Then yeah, activation functions,

213
00:10:14.080 --> 00:10:14.799
<v Speaker 2>big emphasis there.

214
00:10:14.840 --> 00:10:15.879
<v Speaker 1>Why are they so important?

215
00:10:16.039 --> 00:10:19.240
<v Speaker 2>They introduce nonlinearity. Without them, the network could only learn

216
00:10:19.360 --> 00:10:22.600
<v Speaker 2>linear relationships no matter how many layers you stack. The

217
00:10:22.639 --> 00:10:25.360
<v Speaker 2>book mentions the common ones: sigmoid,

218
00:10:25.080 --> 00:10:28.279
<v Speaker 1>ReLU. ReLU seems everywhere.

219
00:10:27.879 --> 00:10:33.240
<v Speaker 2>It's computationally efficient. Also tanh, softmax for multi-class outputs,

220
00:10:33.480 --> 00:10:38.080
<v Speaker 2>and variations like Leaky ReLU, ELU. That nonlinearity is

221
00:10:38.279 --> 00:10:40.480
<v Speaker 2>key for learning complex patterns.

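NOTE
Editor's sketch, not from the book: NumPy versions of the activation functions
named above, plus a quick check of the claim that stacked layers without an
activation collapse into a single linear map. The alpha defaults and the
sample matrices are illustrative assumptions.
```python
import numpy as np
def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))
def relu(x):
    return np.maximum(0.0, x)
def leaky_relu(x, alpha=0.01):
    return np.where(x > 0, x, alpha * x)
def elu(x, alpha=1.0):
    return np.where(x > 0, x, alpha * (np.exp(x) - 1.0))
def softmax(x):
    shifted = x - np.max(x)  # subtract the max for numerical stability
    exps = np.exp(shifted)
    return exps / exps.sum()
x = np.array([-2.0, 0.0, 3.0])
print(relu(x))
print(np.tanh(x))
print(softmax(x))
# Without an activation, two stacked linear layers equal one combined linear layer.
w1 = np.array([[1.0, 2.0], [0.0, 1.0]])
w2 = np.array([[2.0, 0.0], [1.0, 1.0]])
v = np.array([1.0, -1.0])
collapses = bool(np.allclose(w2 @ (w1 @ v), (w2 @ w1) @ v))
print(collapses)
```
Inserting relu between the two matrix products breaks that equality in general, which is the nonlinearity the speakers refer to.
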
222
00:10:40.559 --> 00:10:44.720
<v Speaker 1>Got it. So data goes forward, nonlinearity is added, how

223
00:10:44.759 --> 00:10:45.440
<v Speaker 1>does it learn?

224
00:10:45.799 --> 00:10:48.440
<v Speaker 2>That's where the loss function comes in. It measures how

225
00:10:48.519 --> 00:10:52.159
<v Speaker 2>wrong the network's predictions are compared to the actual answers.

226
00:10:52.639 --> 00:10:56.000
<v Speaker 2>Quantifies the error. A performance metric, basically. Right. And once

227
00:10:56.000 --> 00:10:59.240
<v Speaker 2>you can measure the error, you use backward propagation, backprop,

228
00:10:59.279 --> 00:11:01.480
<v Speaker 2>to figure out how to adjust the weights and biases

229
00:11:01.480 --> 00:11:03.919
<v Speaker 2>to reduce that error. That's the learning part.

230
00:11:04.080 --> 00:11:08.639
<v Speaker 1>Okay, and the book defines the training jargon? Epochs, batches?

231
00:11:08.360 --> 00:11:12.279
<v Speaker 2>Yep, defines epoch, one full pass through the training data; batch,

232
00:11:12.519 --> 00:11:15.240
<v Speaker 2>a subset of data used in one update step; iteration,

233
00:11:15.559 --> 00:11:18.480
<v Speaker 2>one update step. Also optimizers, the algorithms that do the

234
00:11:18.480 --> 00:11:22.679
<v Speaker 2>weight adjustments, like SGD, RMSprop, Adam. Adam's another common one.

235
00:11:22.799 --> 00:11:25.159
<v Speaker 2>Very common. And the learning rate, which controls how big

236
00:11:25.200 --> 00:11:29.080
<v Speaker 2>those adjustments are. Understanding these helps estimate resource needs, training

237
00:11:29.080 --> 00:11:30.919
<v Speaker 2>times within an architecture. Makes sense.

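NOTE
Editor's sketch, not from the book: the epoch, batch, and iteration
definitions above turned into arithmetic, which is roughly how training cost
gets estimated. The sample sizes and function names are made up.
```python
import math
def iterations_per_epoch(num_samples, batch_size):
    """One iteration is one weight update on one batch; the last batch may be partial."""
    return math.ceil(num_samples / batch_size)
def total_iterations(num_samples, batch_size, epochs):
    """One epoch is a full pass over the training data."""
    return iterations_per_epoch(num_samples, batch_size) * epochs
# Hypothetical run: 455 training rows, batches of 32, 50 epochs.
print(iterations_per_epoch(455, 32))
print(total_iterations(455, 32, 50))
```
Multiplying the total by a measured time per update step gives a first rough estimate of training time.
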
238
00:11:31.039 --> 00:11:32.240
<v Speaker 1>Does it show how to build one?

239
00:11:32.480 --> 00:11:37.200
<v Speaker 2>Yes, it uses Keras for practical examples. First, a binary

240
00:11:37.200 --> 00:11:40.720
<v Speaker 2>classification model for breast cancer prediction using a standard data

241
00:11:40.720 --> 00:11:41.679
<v Speaker 2>set from scikit-learn.

242
00:11:41.519 --> 00:11:45.240
<v Speaker 1>So predicting one of two outcomes?

243
00:11:44.799 --> 00:11:48.120
<v Speaker 2>Exactly. Walks through loading libraries, looking at the data, then

244
00:11:48.159 --> 00:11:51.679
<v Speaker 2>building the Keras model layer by layer: input layer, a

245
00:11:51.759 --> 00:11:52.639
<v Speaker 2>hidden layer with

246
00:11:52.679 --> 00:11:54.879
<v Speaker 1>ReLU. How many neurons?

247
00:11:54.519 --> 00:11:57.200
<v Speaker 2>The example uses, I think, sixteen in the hidden layer,

248
00:11:57.519 --> 00:12:00.480
<v Speaker 2>then an output layer with one neuron and a sigmoid

249
00:12:00.519 --> 00:12:04.960
<v Speaker 2>activation because it's binary, then compiling it, choosing the Adam

250
00:12:05.000 --> 00:12:08.559
<v Speaker 2>optimizer, the right loss function, like sparse categorical cross-entropy,

251
00:12:09.159 --> 00:12:10.480
<v Speaker 2>then training with model.fit.

252
00:12:10.240 --> 00:12:11.519
<v Speaker 1>And evaluating?

253
00:12:11.600 --> 00:12:14.080
<v Speaker 2>Yeah, looks at loss and accuracy. Shows how to plot

254
00:12:14.080 --> 00:12:16.759
<v Speaker 2>a confusion matrix, get a classification report. Gives you the

255
00:12:16.759 --> 00:12:17.799
<v Speaker 2>whole assessment

256
00:12:17.399 --> 00:12:20.279
<v Speaker 1>picture. Which is vital if you're monitoring these models in production.

257
00:12:20.480 --> 00:12:24.399
<v Speaker 2>Absolutely. Then interestingly it builds a deeper network for the

258
00:12:24.399 --> 00:12:26.240
<v Speaker 2>same problem, adds another hidden.

259
00:12:26.039 --> 00:12:28.600
<v Speaker 1>layer. To see if it improves? Right, and

260
00:12:28.559 --> 00:12:32.600
<v Speaker 2>the book notes that the deeper network performed better. Highlights

261
00:12:32.639 --> 00:12:37.320
<v Speaker 2>that trade-off: more complexity, potentially better results, but also

262
00:12:37.399 --> 00:12:40.039
<v Speaker 2>more computation. An architectural consideration.

263
00:12:40.159 --> 00:12:42.919
<v Speaker 1>Good point. What about other types of problems?

264
00:12:43.240 --> 00:12:46.960
<v Speaker 2>It follows up with a regression example, predicting Boston housing prices,

265
00:12:47.720 --> 00:12:48.960
<v Speaker 2>again using a built-in

266
00:12:48.679 --> 00:12:52.279
<v Speaker 1>data set. So predicting a number, not a category? Correct.

267
00:12:52.279 --> 00:12:56.399
<v Speaker 2>And here it emphasizes preprocessing more: train/test split, of course,

268
00:12:56.679 --> 00:13:00.440
<v Speaker 2>but also feature scaling using StandardScaler, often crucial for

269
00:13:00.519 --> 00:13:01.799
<v Speaker 2>regression with neural nets.

270
00:13:01.879 --> 00:13:03.639
<v Speaker 1>Why? Scaling helps

271
00:13:03.399 --> 00:13:06.200
<v Speaker 2>the optimizer converge better when features have very different ranges.

272
00:13:06.759 --> 00:13:09.960
<v Speaker 2>The model itself is similar: input layer matching the number

273
00:13:10.000 --> 00:13:12.240
<v Speaker 2>of features, a hidden layer, maybe one hundred and twenty

274
00:13:12.240 --> 00:13:14.759
<v Speaker 2>eight neurons with ReLU, and then the output layer is

275
00:13:14.799 --> 00:13:16.600
<v Speaker 2>just one neuron with a linear.

276
00:13:16.240 --> 00:13:19.399
<v Speaker 1>activation. Linear because the output is a continuous price?

277
00:13:19.240 --> 00:13:22.320
<v Speaker 2>Exactly, and the loss function changes too. Uses mean squared

278
00:13:22.399 --> 00:13:25.440
<v Speaker 2>error, standard for regression, then trains and evaluates based on

279
00:13:25.480 --> 00:13:26.720
<v Speaker 2>the MSE on the test set.

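NOTE
Editor's sketch, not from the book: the two kinds of loss that come up in
these examples, written in NumPy. Mean squared error is the regression loss
named above. Binary cross-entropy is an editor's addition, shown because it is
a common pairing with a single sigmoid output; it is not the loss named in the
conversation. All numbers are made up.
```python
import numpy as np
def mean_squared_error(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))
def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    y_prob = np.clip(y_prob, eps, 1.0 - eps)  # keep log() away from zero
    return float(-np.mean(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob)))
prices = np.array([3.0, 5.0, 2.0])
predicted = np.array([2.5, 5.0, 4.0])
mse = mean_squared_error(prices, predicted)
print(mse)
labels = np.array([1.0, 0.0, 1.0])
confident_and_right = binary_cross_entropy(labels, np.array([0.9, 0.1, 0.8]))
confident_and_wrong = binary_cross_entropy(labels, np.array([0.2, 0.9, 0.3]))
print(confident_and_right, confident_and_wrong)
```
Either way the loss is one number that shrinks as predictions improve, which is what backprop needs.
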
280
00:13:26.759 --> 00:13:30.600
<v Speaker 1>So two clear examples, classification and regression. Covers the basics

281
00:13:30.600 --> 00:13:33.559
<v Speaker 2>well. Yeah, provides a solid Keras foundation for building these

282
00:13:33.559 --> 00:13:37.759
<v Speaker 2>fundamental network types. Essential knowledge for architects dealing with different

283
00:13:37.879 --> 00:13:38.960
<v Speaker 2>ML model types.

284
00:13:39.080 --> 00:13:42.879
<v Speaker 1>Okay, moving on to chapter four, convolutional neural networks, CNNs,

285
00:13:43.039 --> 00:13:44.279
<v Speaker 1>big topic for images.

286
00:13:44.440 --> 00:13:48.159
<v Speaker 2>Huge. The book starts by explaining why you need CNNs

287
00:13:48.279 --> 00:13:51.720
<v Speaker 2>for images. Why just flattening the pixels and feeding them

288
00:13:51.720 --> 00:13:54.360
<v Speaker 2>into a standard network isn't ideal?

289
00:13:54.559 --> 00:13:57.279
<v Speaker 1>Right, you mentioned losing spatial info. Exactly.

290
00:13:57.320 --> 00:14:01.600
<v Speaker 2>It's sensitive to shifts, distortions. CNNs are designed to handle

291
00:14:01.600 --> 00:14:05.200
<v Speaker 2>that spatial hierarchy in images. It introduces the core idea:

292
00:14:05.879 --> 00:14:06.840
<v Speaker 2>kernels or

293
00:14:06.720 --> 00:14:09.120
<v Speaker 1>filters. The little squares that slide over

294
00:14:09.000 --> 00:14:12.360
<v Speaker 2>the image? Yep, and the convolution operation itself. How the

295
00:14:12.360 --> 00:14:15.759
<v Speaker 2>filter multiplies and sums pixel values to create a feature map,

296
00:14:15.960 --> 00:14:20.039
<v Speaker 2>highlighting specific patterns like edges or textures. Understanding this helps

297
00:14:20.039 --> 00:14:23.240
<v Speaker 2>think about how image data needs to be handled architecturally.

298
00:14:23.320 --> 00:14:24.960
<v Speaker 1>It also covers stride and padding.

299
00:14:25.240 --> 00:14:29.279
<v Speaker 2>Right. Stride is how many pixels the filter jumps each time.

300
00:14:29.679 --> 00:14:32.840
<v Speaker 2>Padding is adding borders to control the output size. Then

301
00:14:32.919 --> 00:14:36.720
<v Speaker 2>it explains how convolution works on color images, RGB, multiple channels,

302
00:14:36.919 --> 00:14:40.039
<v Speaker 2>and how using multiple filters lets the network learn different

303
00:14:40.039 --> 00:14:41.200
<v Speaker 2>features simultaneously.

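NOTE
Editor's sketch, not from the book: a single-channel version of the
convolution described above, with stride and zero padding, in plain NumPy.
Like most deep learning layers it does not flip the kernel, so strictly it is
cross-correlation. The image and the edge filter are illustrative.
```python
import numpy as np
def conv2d(image, kernel, stride=1, padding=0):
    """Slide the kernel over the image, multiplying and summing at each position."""
    padded = np.pad(image, padding, mode="constant")
    k_h, k_w = kernel.shape
    out_h = (padded.shape[0] - k_h) // stride + 1
    out_w = (padded.shape[1] - k_w) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = padded[top:top + k_h, left:left + k_w]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map
image = np.arange(25, dtype=float).reshape(5, 5)
vertical_edge = np.array([[1.0, 0.0, -1.0]] * 3)  # responds to left-to-right intensity change
print(conv2d(image, vertical_edge).shape)
print(conv2d(image, vertical_edge, padding=1).shape)
print(conv2d(image, vertical_edge, stride=2).shape)
```
Padding of 1 keeps the 5 by 5 size, while a stride of 2 shrinks the output; a color image would add a channel axis and one kernel slice per channel.
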
304
00:14:41.399 --> 00:14:44.879
<v Speaker 1>Okay, so convolution extracts features. What else is in a CNN?

305
00:14:45.240 --> 00:14:48.919
<v Speaker 2>Pooling layers, usually max pooling. They downsample the feature maps,

306
00:14:48.960 --> 00:14:52.440
<v Speaker 2>make the network more robust to variations, reduce computation. Brings

307
00:14:52.480 --> 00:14:56.440
<v Speaker 2>things down, basically. Yeah. Then flattening, taking the final 2D

308
00:14:56.480 --> 00:14:58.440
<v Speaker 2>feature maps and turning them into a 1D

309
00:14:58.480 --> 00:15:01.720
<v Speaker 1>vector to feed into a regular dense layer? Exactly.

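NOTE
Editor's sketch, not from the book: max pooling over non-overlapping 2 by 2
windows followed by flattening, on a made-up 4 by 4 feature map. The function
name and the reshape trick are the editor's choices.
```python
import numpy as np
def max_pool2d(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size  # drop edge rows/columns that do not fill a window
    windows = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return windows.max(axis=(1, 3))
feature_map = np.array([[1, 3, 2, 0], [4, 2, 1, 5], [0, 1, 9, 8], [2, 6, 7, 3]], dtype=float)
pooled = max_pool2d(feature_map)
flat = pooled.flatten()  # 2D map to 1D vector, ready for a dense layer
print(pooled)
print(flat)
```
Sixteen values become four, which is the downsampling the speakers describe.
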
310
00:15:01.840 --> 00:15:04.320
<v Speaker 2>The final part is usually one or more dense layers

311
00:15:04.320 --> 00:15:06.639
<v Speaker 2>for the actual classification, just like in the

312
00:15:06.720 --> 00:15:07.519
<v Speaker 2>ANNs we discussed.

313
00:15:07.519 --> 00:15:09.559
<v Speaker 1>Does it show an example? Of course.

314
00:15:09.600 --> 00:15:13.480
<v Speaker 2>The classic MNIST data set, handwritten digits. Walks through loading

315
00:15:13.480 --> 00:15:16.120
<v Speaker 2>it via Keras, preprocessing

316
00:15:15.480 --> 00:15:17.480
<v Speaker 1>like reshaping for the color channel?

317
00:15:17.279 --> 00:15:19.799
<v Speaker 2>Yep, reshaping to add that channel dimension, even though it's

318
00:15:19.799 --> 00:15:23.720
<v Speaker 2>grayscale, and one-hot encoding the labels zero to nine.

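NOTE
Editor's sketch, not from the book: the two preprocessing steps just
mentioned, done in NumPy on placeholder arrays rather than the real MNIST
data. The 28 by 28 image size is the standard MNIST shape; the helper name
one_hot is hypothetical.
```python
import numpy as np
def one_hot(labels, num_classes):
    """Turn integer class labels into rows with a single 1.0 in the label's column."""
    encoded = np.zeros((len(labels), num_classes))
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded
# Placeholder batch of two blank grayscale images.
images = np.zeros((2, 28, 28))
with_channel = images.reshape(-1, 28, 28, 1)  # add the trailing channel dimension
print(with_channel.shape)
digits = np.array([3, 0, 9])
encoded = one_hot(digits, 10)
print(encoded.shape)
print(encoded[0])
```
Each encoded row sums to one, matching a ten-way softmax output.
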
319
00:15:23.320 --> 00:15:25.000
<v Speaker 1>And the CNN architecture?

320
00:15:25.159 --> 00:15:28.720
<v Speaker 2>It shows building a typical CNN: Conv2D layers

321
00:15:28.759 --> 00:15:32.519
<v Speaker 2>with ReLU, maybe BatchNormalization for stability, MaxPooling2D

322
00:15:32.559 --> 00:15:35.559
<v Speaker 2>layers, the Flatten layer, and Dense layers with softmax

323
00:15:35.600 --> 00:15:37.440
<v Speaker 2>at the end for the ten digit

324
00:15:37.200 --> 00:15:40.519
<v Speaker 1>classes. Compiled with categorical cross-entropy?

325
00:15:40.399 --> 00:15:45.200
<v Speaker 2>Right, and the Adam optimizer, usually. Then training, plotting the accuracy and loss curves,

326
00:15:45.200 --> 00:15:49.000
<v Speaker 2>making predictions, showing the confusion matrix. The whole workflow for

327
00:15:49.039 --> 00:15:50.799
<v Speaker 2>image classification. Very practical.

328
00:15:50.879 --> 00:15:53.960
<v Speaker 1>What about tuning? CNNs have lots of knobs.

329
00:15:53.600 --> 00:15:57.320
<v Speaker 2>to turn. Good point. The chapter introduces hyperparameter tuning

330
00:15:57.519 --> 00:16:00.200
<v Speaker 2>using Keras Tuner, but switches to the Fashion-MNIST

331
00:16:00.320 --> 00:16:03.879
<v Speaker 2>dataset. Similar idea, but images of clothing items. Why

332
00:16:03.919 --> 00:16:06.519
<v Speaker 2>tuning? Because just picking the number of layers or filter

333
00:16:06.600 --> 00:16:10.159
<v Speaker 2>sizes by guesswork is suboptimal. Things like learning rate,

334
00:16:10.360 --> 00:16:14.279
<v Speaker 2>activation functions, number of units, they all impact performance. Tuning

335
00:16:14.320 --> 00:16:15.039
<v Speaker 2>finds the best.

336
00:16:14.840 --> 00:16:18.120
<v Speaker 1>combo, and Keras Tuner helps automate that search. Exactly.

337
00:16:18.240 --> 00:16:20.919
<v Speaker 2>The book shows how to install it, pip install

338
00:16:21.000 --> 00:16:24.320
<v Speaker 2>keras-tuner. Then define a model-building function. Inside that function,

339
00:16:24.559 --> 00:16:27.279
<v Speaker 2>you define the search space for your hyperparameters, like

340
00:16:27.360 --> 00:16:30.000
<v Speaker 1>try learning rates of either 0.01 or

341
00:16:30.120 --> 00:16:31.080
<v Speaker 1>0.001.

342
00:16:31.159 --> 00:16:35.000
<v Speaker 2>Precisely, or try one versus two dense layers or different

343
00:16:35.039 --> 00:16:38.440
<v Speaker 2>numbers of units. You tell Keras Tuner the ranges or choices.

344
00:16:38.879 --> 00:16:43.639
<v Speaker 2>Then you create a tuner object like kt.Hyperband. Hyperband,

345
00:16:43.720 --> 00:16:46.559
<v Speaker 2>it's one of the search algorithms Keras Tuner offers. Then

346
00:16:46.559 --> 00:16:49.120
<v Speaker 2>you run tuner.search and it trains lots of

347
00:16:49.120 --> 00:16:51.840
<v Speaker 2>model variations to find the best hyperparameters.

348
00:16:51.320 --> 00:16:53.679
<v Speaker 1>And you can get the best ones out? Yep,

349
00:16:53.639 --> 00:16:56.759
<v Speaker 2>tuner.get_best_hyperparameters gives you the optimal settings it found.

350
00:16:57.360 --> 00:16:59.679
<v Speaker 2>Then you build the final model with those best settings

351
00:17:00.039 --> 00:17:01.279
<v Speaker 2>and train it properly.

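NOTE Editor's sketch, not from the book or the episode. It shows the general
shape of an automated hyperparameter search: define a search space, try
configurations, keep the best. It uses plain random search, not Hyperband, and
does not call Keras Tuner. The search space and the toy_score function are
made-up stand-ins for "build, train, and validate a model".
```python
import random
# Hypothetical search space, echoing the choices mentioned in the discussion.
SEARCH_SPACE = {
    "learning_rate": [0.01, 0.001],
    "dense_layers": [1, 2],
    "units": [32, 64, 128],
}
def toy_score(config):
    """Stand-in for training a model and returning validation accuracy (made up)."""
    score = 0.80
    score += 0.05 if config["learning_rate"] == 0.001 else 0.0
    score += 0.02 * config["dense_layers"]
    score += config["units"] / 6400
    return score
def random_search(space, score_fn, trials, seed=0):
    """Sample configurations at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = {name: rng.choice(options) for name, options in space.items()}
        score = score_fn(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score
best_config, best_score = random_search(SEARCH_SPACE, toy_score, trials=20)
```
With a real tuner, the scoring step is an actual training run per trial.
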
352
00:17:01.039 --> 00:17:04.119
<v Speaker 1>And evaluate that best model shows the real benefit.

353
00:17:04.200 --> 00:17:07.400
<v Speaker 2>Right, shows how tuning can push performance higher. Important for

354
00:17:07.519 --> 00:17:09.839
<v Speaker 2>architects thinking about optimizing training pipelines.

355
00:17:09.920 --> 00:17:14.079
<v Speaker 1>Okay. Chapter five moves to a specific application, Optical Character

356
00:17:14.160 --> 00:17:15.880
<v Speaker 1>Recognition, OCR.

357
00:17:16.039 --> 00:17:19.880
<v Speaker 2>Yeah, turning text in images into actual usable text data.

358
00:17:20.519 --> 00:17:25.400
<v Speaker 2>Super important for digitizing documents: invoices, bank statements, even reading

359
00:17:25.480 --> 00:17:26.960
<v Speaker 2>road signs for autonomous cars.

360
00:17:27.119 --> 00:17:28.920
<v Speaker 1>Lots of applications. What tools does it cover?

361
00:17:29.079 --> 00:17:31.279
<v Speaker 2>It introduces several Python OCR libraries.

362
00:17:31.880 --> 00:17:35.240
<v Speaker 1>Starts with Tesseract, the classic open source one from HP,

363
00:17:35.359 --> 00:17:36.839
<v Speaker 1>then Google. That's the one.

364
00:17:37.000 --> 00:17:40.640
<v Speaker 2>Mentions installation, setting the path, and a basic demo using

365
00:17:40.680 --> 00:17:44.480
<v Speaker 2>pytesseract's image_to_string. Straightforward for simple cases.

366
00:17:44.559 --> 00:17:46.119
<v Speaker 1>What about more modern approaches?

367
00:17:46.440 --> 00:17:49.599
<v Speaker 2>It covers Keras-OCR. This uses deep learning models under

368
00:17:49.599 --> 00:17:52.799
<v Speaker 2>the hood. Shows installing it, creating an OCR pipeline, and

369
00:17:52.920 --> 00:17:54.680
<v Speaker 2>using pipeline.recognize on an

370
00:17:54.640 --> 00:17:57.880
<v Speaker 1>image. So leveraging pre-trained models. Exactly.

371
00:17:57.920 --> 00:18:00.920
<v Speaker 2>It often handles more varied images better than traditional methods.

372
00:18:01.079 --> 00:18:02.000
<v Speaker 1>Okay, any others?

373
00:18:02.039 --> 00:18:04.680
<v Speaker 2>EasyOCR. The name says it all, right? It highlights

374
00:18:04.680 --> 00:18:08.480
<v Speaker 2>its simplicity and really good multi-language support out of

375
00:18:08.480 --> 00:18:08.839
<v Speaker 2>the box.

376
00:18:08.960 --> 00:18:10.880
<v Speaker 1>Multi-language, that's a big plus.

377
00:18:11.000 --> 00:18:15.359
<v Speaker 2>Definitely. Shows installation, pip install easyocr, initializing a Reader with

378
00:18:15.440 --> 00:18:19.160
<v Speaker 2>language codes like en, fr, de, and then just

379
00:18:19.200 --> 00:18:20.720
<v Speaker 2>reader.readtext. Pretty simple

380
00:18:20.799 --> 00:18:23.960
<v Speaker 1>API. Nice. Does it handle PDFs? That's common.

381
00:18:24.039 --> 00:18:27.000
<v Speaker 2>Mentions that, yeah. Needs helper tools like Poppler utils and

382
00:18:27.039 --> 00:18:30.440
<v Speaker 2>pdf2image to first convert PDF pages to images.

383
00:18:30.839 --> 00:18:33.880
<v Speaker 2>Then you run EasyOCR on the images. A common workflow.

384
00:18:34.000 --> 00:18:35.960
<v Speaker 1>Good practical tip. One more?

385
00:18:36.000 --> 00:18:40.160
<v Speaker 2>TrOCR. Stands for transformer OCR. It's described as more of

386
00:18:40.160 --> 00:18:44.079
<v Speaker 2>a research project using transformer models, like from NLP, adapted

387
00:18:44.119 --> 00:18:46.319
<v Speaker 2>for OCR on challenging natural

388
00:18:46.119 --> 00:18:49.119
<v Speaker 1>images. Transformers for OCR, interesting.

389
00:18:49.200 --> 00:18:52.039
<v Speaker 2>Yeah, shows how cutting edge NLP architectures are crossing over.

390
00:18:52.359 --> 00:18:57.119
<v Speaker 2>Shows installation, pip install transformers, loading the pre-trained

391
00:18:57.160 --> 00:19:00.880
<v Speaker 2>TrOCR model and processor, and running it. Represents the

392
00:19:00.880 --> 00:19:01.519
<v Speaker 2>state of the art.

393
00:19:01.880 --> 00:19:05.319
<v Speaker 1>Okay, so several OCR options depending on the need. Chapter

394
00:19:05.359 --> 00:19:06.720
<v Speaker 1>six, object detection.

395
00:19:07.000 --> 00:19:10.279
<v Speaker 2>Right, moving beyond just what's in an image, classification, or

396
00:19:10.319 --> 00:19:13.799
<v Speaker 2>where text is, OCR, to finding multiple objects and drawing

397
00:19:13.880 --> 00:19:14.799
<v Speaker 2>boxes around them.

398
00:19:14.839 --> 00:19:17.279
<v Speaker 1>It distinguishes that from classification and localization

399
00:19:17.440 --> 00:19:21.079
<v Speaker 2>first? Yeah, clearly defines classification, one label per image;

400
00:19:21.160 --> 00:19:25.599
<v Speaker 2>localization, one object with a box; detection, multiple objects, multiple boxes.

401
00:19:25.640 --> 00:19:29.200
<v Speaker 1>Then it lists some key algorithms like R-CNN, Faster R-CNN.

402
00:19:28.839 --> 00:19:31.759
<v Speaker 2>Yep, the R-CNN family, which are influential but slower, and

403
00:19:31.759 --> 00:19:34.440
<v Speaker 2>then the faster ones, SSD, Single Shot Detector, and YOLO,

404
00:19:34.519 --> 00:19:37.720
<v Speaker 2>You Only Look Once. Mentions that accuracy versus speed trade-off,

405
00:19:37.759 --> 00:19:38.680
<v Speaker 2>which is always a factor.

406
00:19:38.720 --> 00:19:40.519
<v Speaker 1>How does it show implementing them? SSD

407
00:19:40.640 --> 00:19:46.119
<v Speaker 2>first? Yeah, demonstrates SSD using PyTorch Hub. Steps include installing prerequisites,

408
00:19:46.480 --> 00:19:49.359
<v Speaker 2>loading a pre-trained SSD model from NVIDIA via the

409
00:19:49.400 --> 00:19:52.079
<v Speaker 1>Hub. Leveraging pre-trained again, smart.

410
00:19:51.960 --> 00:19:56.640
<v Speaker 2>Very common. Loading utilities, formatting the input image, running detection,

411
00:19:57.119 --> 00:20:00.200
<v Speaker 2>filtering results by confidence score, and then drawing the boxes

412
00:20:00.319 --> 00:20:03.920
<v Speaker 2>and labels on the image. Shows a practical pipeline using PyTorch.

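NOTE Editor's sketch, not from the book or the episode. It illustrates only
the "filter results by confidence score" step of the detection pipeline just
described. The detection records, field names, and 0.5 threshold are
hypothetical and do not reflect the output format of any particular model.
```python
def filter_detections(detections, threshold=0.5):
    """Keep only detections whose confidence score meets the threshold."""
    return [d for d in detections if d["score"] >= threshold]
# Hypothetical raw detector output: box as (left, top, right, bottom) in pixels.
raw = [
    {"label": "dog", "score": 0.92, "box": (34, 50, 210, 300)},
    {"label": "cat", "score": 0.31, "box": (220, 80, 330, 240)},
    {"label": "person", "score": 0.67, "box": (5, 10, 120, 380)},
]
kept = filter_detections(raw, threshold=0.5)
labels = [d["label"] for d in kept]  # only these boxes would be drawn
```
Raising the threshold trades missed objects for fewer false positives.
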
413
00:20:04.000 --> 00:20:04.720
<v Speaker 1>What about YOLO?

414
00:20:05.039 --> 00:20:08.319
<v Speaker 2>For YOLO, it shows the Darknet approach: cloning the original

415
00:20:08.359 --> 00:20:12.240
<v Speaker 2>Darknet C code repository, compiling it with make, downloading the

416
00:20:12.279 --> 00:20:13.400
<v Speaker 2>pre-trained YOLO

417
00:20:13.200 --> 00:20:15.359
<v Speaker 1>weights. So a different ecosystem. Right, and

418
00:20:15.279 --> 00:20:18.119
<v Speaker 2>then running detection from the command line using the compiled

419
00:20:18.200 --> 00:20:21.759
<v Speaker 2>Darknet executable. Shows a different but also very popular

420
00:20:21.799 --> 00:20:25.000
<v Speaker 2>way to use a leading object detection model, especially known

421
00:20:25.039 --> 00:20:25.599
<v Speaker 2>for its speed.

422
00:20:25.799 --> 00:20:28.920
<v Speaker 1>Good to see different implementation styles. Next up, chapter seven,

423
00:20:29.079 --> 00:20:31.960
<v Speaker 1>image segmentation, getting even more detailed.

424
00:20:31.559 --> 00:20:35.880
<v Speaker 2>Exactly. Pixel-level classification, assigning a category label to every

425
00:20:35.880 --> 00:20:38.839
<v Speaker 2>single pixel in the image. The book contrasts it clearly

426
00:20:38.880 --> 00:20:42.759
<v Speaker 2>with classification, whole image, and detection, bounding boxes.

427
00:20:42.599 --> 00:20:45.839
<v Speaker 1>So understanding the exact shape of objects. Precisely.

428
00:20:46.200 --> 00:20:49.720
<v Speaker 2>It lists quite a few segmentation architectures data architects might

429
00:20:49.720 --> 00:20:55.119
<v Speaker 2>hear about: U-Net, FCN-8, Mask R-CNN, DeepLab, lots

430
00:20:55.119 --> 00:20:55.400
<v Speaker 2>of them.

431
00:20:55.440 --> 00:20:56.519
<v Speaker 1>Does it implement all of them?

432
00:20:56.759 --> 00:20:58.880
<v Speaker 2>Well, that would be a lot. It focuses on providing

433
00:20:58.880 --> 00:21:03.000
<v Speaker 2>Python implementations for three significant ones: U-Net, FCN-8, and

434
00:21:03.079 --> 00:21:03.960
<v Speaker 2>Mask R-CNN.

435
00:21:04.119 --> 00:21:05.519
<v Speaker 1>How does it show U-Net?

436
00:21:05.480 --> 00:21:10.000
<v Speaker 2>Using TensorFlow Keras. Outlines prerequisites, data loading and prep using

437
00:21:10.039 --> 00:21:11.440
<v Speaker 2>the Oxford-IIIT Pet

438
00:21:11.279 --> 00:21:14.079
<v Speaker 1>dataset. So segmenting cats and dogs. Right.

439
00:21:14.240 --> 00:21:17.440
<v Speaker 2>Building the U-Net model architecture, which has that characteristic U

440
00:21:17.440 --> 00:21:21.160
<v Speaker 2>shape with skip connections, compiling, training, and then visualizing the

441
00:21:21.160 --> 00:21:24.160
<v Speaker 2>output segmentation masks on top of the pet images. U-Net

442
00:21:24.279 --> 00:21:25.720
<v Speaker 2>is huge in medical imaging too.

443
00:21:25.880 --> 00:21:27.480
<v Speaker 1>Okay, what about FCN-8?

444
00:21:27.759 --> 00:21:30.400
<v Speaker 2>FCN stands for fully convolutional network. It was a key

445
00:21:30.400 --> 00:21:35.039
<v Speaker 2>step towards modern segmentation. The implementation shows importing libraries, defining

446
00:21:35.039 --> 00:21:38.240
<v Speaker 2>the FCN-8 model, replacing dense layers with convolutional ones,

447
00:21:38.400 --> 00:21:42.960
<v Speaker 2>generating sample data, compiling, training, and evaluation. More foundational.

448
00:21:42.519 --> 00:21:45.640
<v Speaker 1>And Mask R-CNN sounds related to object

449
00:21:45.400 --> 00:21:49.880
<v Speaker 2>detection. It is. It extends Faster R-CNN. It detects objects

450
00:21:49.880 --> 00:21:52.759
<v Speaker 2>with bounding boxes and generates a pixel level mask for

451
00:21:52.799 --> 00:21:57.039
<v Speaker 2>each detected object instance. So instance segmentation, best of both

452
00:21:57.079 --> 00:21:59.400
<v Speaker 2>worlds in a way. The book shows how to implement

453
00:21:59.440 --> 00:22:03.319
<v Speaker 2>it using TensorFlow, often leveraging models from their TPU repository.

454
00:22:04.079 --> 00:22:07.599
<v Speaker 2>Steps include loading libraries, setting up category labels, loading an

455
00:22:07.640 --> 00:22:10.960
<v Speaker 2>image, using a pre-trained Mask R-CNN model for inference,

456
00:22:11.200 --> 00:22:14.559
<v Speaker 2>and then visualizing both the boxes and the masks. Very powerful.

457
00:22:14.599 --> 00:22:16.839
<v Speaker 1>Okay, that covers a lot on image analysis. Chapter eight

458
00:22:16.839 --> 00:22:19.920
<v Speaker 1>shifts to sequences: RNNs.

459
00:22:19.359 --> 00:22:23.680
<v Speaker 2>Yes, recurrent neural networks for data where order matters: natural language,

460
00:22:23.720 --> 00:22:27.359
<v Speaker 2>time series, audio. Their key feature is that internal memory

461
00:22:27.440 --> 00:22:30.279
<v Speaker 2>or hidden state that remembers past information.

462
00:22:30.039 --> 00:22:31.839
<v Speaker 1>Right, the recurrent part. Exactly.

463
00:22:32.039 --> 00:22:36.160
<v Speaker 2>The book briefly mentions the training algorithms like BPTT, backpropagation

464
00:22:36.240 --> 00:22:38.640
<v Speaker 2>through time, and the more advanced architectures designed to

465
00:22:38.640 --> 00:22:42.640
<v Speaker 2>handle long sequences better: LSTM, long short-term memory, and

466
00:22:42.759 --> 00:22:44.519
<v Speaker 2>GRU, gated recurrent unit.

467
00:22:44.599 --> 00:22:46.319
<v Speaker 1>They fix problems with basic RNNs?

468
00:22:46.519 --> 00:22:49.319
<v Speaker 2>Yeah, they have internal gates that control the flow of information,

469
00:22:49.799 --> 00:22:52.759
<v Speaker 2>helping them remember relevant stuff from further back in the

470
00:22:52.799 --> 00:22:55.559
<v Speaker 2>sequence and avoid the vanishing gradient problem.

471
00:22:55.680 --> 00:22:57.000
<v Speaker 1>Does it show how to use them?

472
00:22:57.160 --> 00:23:00.160
<v Speaker 2>It starts with a very simple RNN in Keras for

473
00:23:00.200 --> 00:23:03.720
<v Speaker 2>basic sequence prediction. Then it dives into a practical LSTM

474
00:23:03.799 --> 00:23:07.680
<v Speaker 2>example for time series forecasting, predicting airline passenger numbers.

475
00:23:07.680 --> 00:23:09.440
<v Speaker 1>Classic dataset. Yep.

476
00:23:09.559 --> 00:23:13.200
<v Speaker 2>Covers loading, normalizing the data, splitting, creating the sequences correctly

477
00:23:13.240 --> 00:23:16.839
<v Speaker 2>for the LSTM input, building the Keras LSTM model, training it,

478
00:23:16.920 --> 00:23:21.759
<v Speaker 2>making predictions, and evaluating. Very relevant for predictive analytics architecture.

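NOTE Editor's sketch, not from the book or the episode. It illustrates the
"normalize, then create sequences for the LSTM input" step in plain Python:
each training example is a window of past values and the target is the next
value. The function names, the look_back of 3, and the short sample series are
assumptions for illustration.
```python
def min_max_scale(series):
    """Scale values to the 0..1 range (assumes the series is not constant)."""
    low, high = min(series), max(series)
    return [(value - low) / (high - low) for value in series]
def make_windows(series, look_back):
    """Split a series into (input window, next value) pairs for supervised training."""
    inputs, targets = [], []
    for start in range(len(series) - look_back):
        inputs.append(series[start:start + look_back])
        targets.append(series[start + look_back])
    return inputs, targets
passengers = [112, 118, 132, 129, 121, 135]  # illustrative monthly counts
scaled = min_max_scale(passengers)
X, y = make_windows(passengers, look_back=3)
```
In practice the windows are built from the scaled series and reshaped to
(samples, time steps, features) before being fed to an LSTM layer.
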
479
00:23:21.799 --> 00:23:22.759
<v Speaker 1>What about GRUs?

480
00:23:23.119 --> 00:23:26.119
<v Speaker 2>It shows a GRU example for sentiment analysis on

481
00:23:26.160 --> 00:23:28.839
<v Speaker 2>the IMDb movie review dataset.

482
00:23:28.920 --> 00:23:29.720
<v Speaker 1>Text data now.

483
00:23:29.799 --> 00:23:33.319
<v Speaker 2>Right. Shows loading the data, reviews labeled positive or negative, and the

484
00:23:33.319 --> 00:23:35.640
<v Speaker 2>crucial step of padding the sequences so they are all

485
00:23:35.599 --> 00:23:39.599
<v Speaker 1>the same length. Because RNNs expect fixed-length inputs, usually.

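NOTE Editor's sketch, not from the book or the episode. It illustrates the
padding step in plain Python: short sequences get leading zeros and long ones
keep only their last max_len tokens, so every sequence ends up the same
length. The function name, pad value, and token IDs are hypothetical, and
max_len is assumed to be positive.
```python
def pad_to_length(sequences, max_len, pad_value=0):
    """Left-pad (or left-truncate) each sequence to exactly max_len items."""
    padded = []
    for seq in sequences:
        trimmed = seq[-max_len:]  # keep the last max_len tokens
        padded.append([pad_value] * (max_len - len(trimmed)) + trimmed)
    return padded
reviews = [[12, 7, 45], [3], [9, 8, 7, 6, 5]]  # hypothetical token IDs
batch = pad_to_length(reviews, max_len=4)
```
Framework padding utilities usually let you choose front or back padding.
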
486
00:23:39.480 --> 00:23:42.920
<v Speaker 2>Right. Then building the GRU model in Keras, training it to

487
00:23:42.960 --> 00:23:46.839
<v Speaker 2>classify sentiment, and evaluating its accuracy. GRUs are often a

488
00:23:46.839 --> 00:23:51.599
<v Speaker 2>bit simpler and faster to train than LSTMs, and sometimes perform similarly.

489
00:23:51.319 --> 00:23:57.359
<v Speaker 1>So RNNs, LSTMs, GRUs for sequences. Chapter nine gets creative.

490
00:23:57.839 --> 00:24:01.839
<v Speaker 2>GANs, generative adversarial networks. Yeah, really fascinating area. The book

491
00:24:01.839 --> 00:24:05.200
<v Speaker 2>explains the core idea: two networks, a generator and a

492
00:24:05.240 --> 00:24:06.839
<v Speaker 2>discriminator battling it out.

493
00:24:06.920 --> 00:24:09.839
<v Speaker 1>The generator makes fake data, the discriminator tries to spot

494
00:24:09.839 --> 00:24:10.799
<v Speaker 1>the fakes. Exactly.

495
00:24:11.000 --> 00:24:13.319
<v Speaker 2>Through this competition, the generator gets better and better at

496
00:24:13.400 --> 00:24:16.839
<v Speaker 2>making realistic synthetic data, images, text, whatever. The book lists

497
00:24:16.839 --> 00:24:19.680
<v Speaker 2>some types: vanilla GAN, conditional GAN,

498
00:24:19.839 --> 00:24:23.200
<v Speaker 1>DCGAN. DCGAN, deep convolutional?

499
00:24:23.039 --> 00:24:26.240
<v Speaker 2>Uses CNNs, yeah. Also Wasserstein GAN, StyleGAN,

500
00:24:26.759 --> 00:24:27.680
<v Speaker 2>lots of variations.

501
00:24:27.680 --> 00:24:28.640
<v Speaker 1>How does it implement them?

502
00:24:28.720 --> 00:24:33.160
<v Speaker 2>Vanilla first. Yeah, provides a vanilla GAN implementation using PyTorch. Shows

503
00:24:33.200 --> 00:24:37.160
<v Speaker 2>defining the generator and discriminator networks, the loss functions, the optimizers,

504
00:24:37.359 --> 00:24:40.319
<v Speaker 2>the training loop where they alternate updates, and visualizing the

505
00:24:40.359 --> 00:24:42.279
<v Speaker 2>generated images. Gives you the basic

506
00:24:42.039 --> 00:24:45.079
<v Speaker 1>mechanics. Then DCGAN. What's the key difference? The

507
00:24:45.160 --> 00:24:50.079
<v Speaker 2>architecture. DCGAN uses convolutional layers in both the generator, using transposed

508
00:24:50.079 --> 00:24:53.599
<v Speaker 2>convolutions to upsample, and the discriminator. This works much better for

509
00:24:53.640 --> 00:24:56.599
<v Speaker 2>images, producing more stable training and realistic results.

510
00:24:56.880 --> 00:24:59.799
<v Speaker 1>And it shows a DCGAN implementation?

511
00:24:59.480 --> 00:25:04.079
<v Speaker 2>Yes, using TensorFlow this time. Covers imports, defining the convolutional

512
00:25:04.119 --> 00:25:08.440
<v Speaker 2>generator and discriminator models, the specific loss functions and optimizers

513
00:25:08.440 --> 00:25:12.519
<v Speaker 2>for GANs, the training function, and how to periodically save

514
00:25:12.599 --> 00:25:14.960
<v Speaker 2>generated images to see the progress.

515
00:25:14.759 --> 00:25:16.960
<v Speaker 1>So you can watch it learn to make faces or

516
00:25:17.000 --> 00:25:18.519
<v Speaker 1>digits or whatever. Exactly.

517
00:25:19.039 --> 00:25:21.599
<v Speaker 2>And it also briefly mentions StyleGAN as a more advanced

518
00:25:21.599 --> 00:25:24.160
<v Speaker 2>technique for controlling the style of the generated output.

519
00:25:24.319 --> 00:25:29.519
<v Speaker 1>Okay, GANs for generation. What's the final chapter? Chapter ten, Transformers.

520
00:25:28.920 --> 00:25:32.359
<v Speaker 2>The current giants of NLP. Yeah, Transformers. Focuses on their

521
00:25:32.400 --> 00:25:35.359
<v Speaker 2>key innovation, the self-attention mechanism.

522
00:25:34.920 --> 00:25:37.359
<v Speaker 1>Allows them to weigh word importance.

523
00:25:37.079 --> 00:25:40.119
<v Speaker 2>Right, understand which words in a sentence are most relevant

524
00:25:40.160 --> 00:25:43.519
<v Speaker 2>to understanding a specific word's meaning in that context. That's

525
00:25:43.519 --> 00:25:46.039
<v Speaker 2>how they handle long-range dependencies so well. It lists

526
00:25:46.079 --> 00:25:48.759
<v Speaker 2>the famous ones: BERT, GPT, RoBERTa, et

527
00:25:48.680 --> 00:25:50.759
<v Speaker 1>cetera. Huge impact on language tasks.

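NOTE Editor's sketch, not from the book or the episode. It illustrates the
self-attention idea mentioned above in plain Python: each token scores every
token, the scores become weights via softmax, and the output is a weighted mix
of the token vectors. It is deliberately simplified, using the raw inputs as
queries, keys, and values, whereas real transformers use learned projections
and multiple heads. The 2D embeddings are made up.
```python
import math
def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values = inputs."""
    dim = len(vectors[0])
    outputs, all_weights = [], []
    for query in vectors:
        scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
                  for key in vectors]
        weights = softmax(scores)
        outputs.append([sum(w * v[i] for w, v in zip(weights, vectors))
                        for i in range(dim)])
        all_weights.append(weights)
    return outputs, all_weights
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # made-up 2D embeddings
outputs, weights = self_attention(tokens)
```
Each row of weights shows how much one token attends to every other token.
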
528
00:25:51.200 --> 00:25:54.000
<v Speaker 2>Absolutely. The book does a good job explaining the difference

529
00:25:54.039 --> 00:25:58.079
<v Speaker 2>between older non-contextual embeddings like word2vec, where

530
00:25:58.079 --> 00:26:01.160
<v Speaker 2>bank always has the same vector, and the contextual embeddings

531
00:26:01.160 --> 00:26:04.279
<v Speaker 2>from models like BERT, where river bank and savings bank give

532
00:26:04.319 --> 00:26:05.079
<v Speaker 2>different vectors

533
00:26:05.079 --> 00:26:08.200
<v Speaker 1>for bank. That context is crucial. Does it show code?

534
00:26:08.480 --> 00:26:12.160
<v Speaker 2>Yeah, uses the transformers library from Hugging Face. Shows getting

535
00:26:12.160 --> 00:26:16.240
<v Speaker 2>embeddings from word2vec versus BERT to illustrate that contextual difference.

536
00:26:16.400 --> 00:26:17.720
<v Speaker 1>And using BERT and GPT?

537
00:26:18.000 --> 00:26:22.119
<v Speaker 2>Yes. Shows a BERT implementation for text understanding, loading the

538
00:26:22.119 --> 00:26:26.720
<v Speaker 2>tokenizer and model, preparing inputs, token IDs, attention masks. Then

539
00:26:26.759 --> 00:26:31.279
<v Speaker 2>a GPT implementation for text generation: loading tokenizer and model, prepping

540
00:26:31.319 --> 00:26:35.279
<v Speaker 2>input, generating a text continuation. Briefly mentions fine-tuning too.

541
00:26:35.440 --> 00:26:38.720
<v Speaker 1>So practical examples of using these powerful pre-trained

542
00:26:38.359 --> 00:26:42.559
<v Speaker 2>models. Exactly. Shows how accessible they become via libraries like transformers.

543
00:26:42.640 --> 00:26:45.400
<v Speaker 1>Wow. Okay, we've covered a massive amount of ground in

544
00:26:45.440 --> 00:26:49.160
<v Speaker 1>this deep dive, mirroring the book's journey from Python basics

545
00:26:49.160 --> 00:26:50.720
<v Speaker 1>and efficient data handling.

546
00:26:50.480 --> 00:26:53.000
<v Speaker 2>Right through those EDA and rapid modeling.

547
00:26:52.680 --> 00:26:57.240
<v Speaker 1>tools to the core neural network types: ANNs, CNNs for images,

548
00:26:57.440 --> 00:26:59.640
<v Speaker 1>RNNs for sequences.

549
00:26:59.119 --> 00:27:02.720
<v Speaker 2>Then GANs for generation, and finally transformers for language,

550
00:27:02.920 --> 00:27:07.160
<v Speaker 2>plus those specific applications like OCR and object detection.

551
00:27:07.680 --> 00:27:10.799
<v Speaker 1>It really is a broad overview of deep learning relevant

552
00:27:10.880 --> 00:27:11.960
<v Speaker 1>to data architects.

553
00:27:12.000 --> 00:27:16.000
<v Speaker 2>It definitely provides that comprehensive sweep. And while we've stayed

554
00:27:16.000 --> 00:27:19.000
<v Speaker 2>at a fairly high level here, hopefully this gives you, our

555
00:27:19.079 --> 00:27:22.359
<v Speaker 2>listener, a solid map of the key concepts and tools

556
00:27:22.359 --> 00:27:24.839
<v Speaker 2>discussed in Deep Learning for Data Architects.

557
00:27:24.920 --> 00:27:27.519
<v Speaker 1>Absolutely, and we definitely encourage you to check out the

558
00:27:27.519 --> 00:27:31.119
<v Speaker 1>book itself for all the code details, the deeper explanations,

559
00:27:31.519 --> 00:27:33.680
<v Speaker 1>and more guidance on actually applying this stuff to your

560
00:27:33.759 --> 00:27:34.839
<v Speaker 1>architectural challenges.

561
00:27:34.920 --> 00:27:37.359
<v Speaker 2>For sure. You know, this whole exploration really brings up

562
00:27:37.359 --> 00:27:40.200
<v Speaker 2>a fundamental thought, doesn't it? What's that? Just how profoundly

563
00:27:40.240 --> 00:27:44.480
<v Speaker 2>our ability to process and understand complex data is changing,

564
00:27:44.880 --> 00:27:48.279
<v Speaker 2>whether it's through these smarter analysis tools or these sophisticated

565
00:27:48.319 --> 00:27:52.160
<v Speaker 2>deep learning models. How is that fundamentally reshaping the role,

566
00:27:52.319 --> 00:27:55.839
<v Speaker 2>the capabilities, maybe even the strategic importance of the data

567
00:27:55.920 --> 00:27:57.160
<v Speaker 2>architect in today's world.

568
00:27:57.440 --> 00:28:00.480
<v Speaker 1>That's a great point. Something for everyone listening to really

569
00:28:00.519 --> 00:28:03.279
<v Speaker 1>consider as they think about the future of data systems

570
00:28:03.359 --> 00:28:04.759
<v Speaker 1>and their place within them.
