WEBVTT 1 00:00:00.080 --> 00:00:03.319 Welcome to the deep dive. We're taking on a really 2 00:00:03.399 --> 00:00:08.000 crucial area today, deep learning specifically tailored for you, the 3 00:00:08.080 --> 00:00:11.439 data architect. You shared some great material and our focus 4 00:00:11.519 --> 00:00:15.880 is this book, Deep Learning for Data Architects by Shekhar Khandelwal. 5 00:00:16.199 --> 00:00:18.760 Yeah, it looks like a really solid resource, good for 6 00:00:18.879 --> 00:00:22.320 understanding how these advanced techniques fit into data infrastructure. 7 00:00:22.440 --> 00:00:25.800 It's pretty current too, right? BPB Online, twenty twenty four. 8 00:00:26.000 --> 00:00:28.600 That's right. ISBN nine seven eight nine three five five 9 00:00:28.679 --> 00:00:31.079 five one five three nine one. So yeah, very up 10 00:00:31.079 --> 00:00:31.359 to date. 11 00:00:31.519 --> 00:00:35.679 Absolutely, and the author, Shekhar Khandelwal, he seems to really 12 00:00:35.719 --> 00:00:38.439 know his stuff. Senior AI and data scientist in Hamburg, 13 00:00:38.840 --> 00:00:41.600 got a master's in data science, specialized in computer vision, 14 00:00:41.719 --> 00:00:45.520 and wow, over fifteen years in AI and machine learning. 15 00:00:45.399 --> 00:00:48.359 And experience with all the big cloud platforms: AWS, Google 16 00:00:48.359 --> 00:00:50.960 Cloud, Azure, IBM Cloud. That's important. 17 00:00:51.039 --> 00:00:53.880 Definitely brings that practical, real world angle. And apparently he's 18 00:00:53.880 --> 00:00:56.880 also into marathons and CrossFit, huh. 19 00:00:57.200 --> 00:01:00.439 Yeah, saw that. Because that drive carries over. And seriously, 20 00:01:00.520 --> 00:01:05.599 that real world experience is invaluable, especially for bridging that 21 00:01:05.719 --> 00:01:11.040 gap between complex AI theory and the practical stuff data 22 00:01:11.120 --> 00:01:15.120 architects deal with. 
Yeah, the book's dedication is quite nice too, 23 00:01:15.319 --> 00:01:19.000 to his wife, daughter, his uncle Danesh, a bookseller. 24 00:01:18.519 --> 00:01:20.359 A bookseller uncle, that's a nice touch. 25 00:01:20.439 --> 00:01:25.239 Yeah, and parents, BPB Publications, colleagues, readers. Gives it a 26 00:01:25.319 --> 00:01:25.920 human feel. 27 00:01:26.120 --> 00:01:28.799 Okay, so let's unpack this for you, our listener. The 28 00:01:28.920 --> 00:01:31.480 mission here is to pull out the key knowledge from 29 00:01:31.480 --> 00:01:34.040 this book that's, well, directly relevant to your work as 30 00:01:34.079 --> 00:01:37.719 a data architect. We'll look at how deep learning concepts 31 00:01:37.719 --> 00:01:41.879 get implemented using Python, focusing on the practical side without 32 00:01:41.959 --> 00:01:43.680 getting too bogged down in dense theory. 33 00:01:43.840 --> 00:01:46.959 Think of it as, like, your fast track to understanding 34 00:01:46.959 --> 00:01:49.359 the deep learning bits that matter for your field. Exactly. 35 00:01:49.439 --> 00:01:50.879 We want to give you a clear picture of the 36 00:01:50.920 --> 00:01:53.439 main principles, the tools, so you can see how deep 37 00:01:53.519 --> 00:01:56.959 learning could maybe be integrated strategically into the architectures you're 38 00:01:56.959 --> 00:01:58.000 designing or managing. 39 00:01:58.200 --> 00:01:59.879 Sounds good. Where does the book start? 40 00:02:00.079 --> 00:02:03.439 Right at the foundation: Python for data science, chapter one. 41 00:02:04.959 --> 00:02:07.519 There's a quote there that kind of sets the stage. 42 00:02:08.039 --> 00:02:10.319 You can't build a great building on a weak foundation. 43 00:02:10.800 --> 00:02:11.560 You know the one. 44 00:02:11.800 --> 00:02:14.719 You must have a solid foundation if you are going 45 00:02:14.759 --> 00:02:16.319 to have a super strong structure. 
46 00:02:16.840 --> 00:02:19.800 Yeah, it's the perfect analogy, isn't it? Totally. Just like 47 00:02:19.840 --> 00:02:23.520 a building needs that solid base, any serious deep learning 48 00:02:23.560 --> 00:02:28.840 setup relies heavily on Python and its whole library ecosystem. 49 00:02:28.400 --> 00:02:30.759 So the book jumps right into the essential libraries. 50 00:02:30.879 --> 00:02:34.080 Yep, kicks off with pandas for data handling, NumPy for 51 00:02:34.120 --> 00:02:36.560 all the numerical stuff, the real 52 00:02:36.319 --> 00:02:38.439 workhorses, you know. Right, basics. 53 00:02:38.520 --> 00:02:42.039 Then Matplotlib and Seaborn for visualization, super important 54 00:02:42.080 --> 00:02:45.240 for understanding data flows or how models are doing. You 55 00:02:45.240 --> 00:02:47.639 can't really see inside the black box otherwise. Exactly. 56 00:02:47.719 --> 00:02:50.439 Then there's scikit-learn, the big toolkit for general machine 57 00:02:50.520 --> 00:02:53.840 learning tasks, and of course TensorFlow and Keras. 58 00:02:53.719 --> 00:02:55.879 The deep learning powerhouses. Definitely. 59 00:02:55.960 --> 00:02:58.879 And for images it mentions scikit-image and OpenCV too. 60 00:02:59.039 --> 00:02:59.240 And it 61 00:02:59.240 --> 00:03:02.439 gives you the install commands, like pip install? Yeah, it 62 00:03:02.479 --> 00:03:06.159 provides the basic pip install commands and actually a useful 63 00:03:06.199 --> 00:03:10.159 tip for notebook users: using sys.executable -m 64 00:03:10.840 --> 00:03:13.800 pip install and then the library name. Oh, 65 00:03:13.560 --> 00:03:15.120 right, to make sure it installs in the 66 00:03:15.159 --> 00:03:17.599 right place. Exactly, avoid some common headaches. 67 00:03:17.759 --> 00:03:20.800 Okay, so beyond just installing, what about using them? Pandas 68 00:03:20.800 --> 00:03:21.800 for data I/O? 69 00:03:21.919 --> 00:03:24.960 Yeah, it covers reading and writing data extensively. 
You know 70 00:03:25.000 --> 00:03:29.800 the usual suspects: CSV with to_csv and read_csv, standard. 71 00:03:29.960 --> 00:03:33.960 Excel using to_excel, read_excel. JSON: to_json, read_json. 72 00:03:34.759 --> 00:03:37.840 Pretty comprehensive. For a data architect needing to connect different systems, 73 00:03:37.879 --> 00:03:40.319 that broad format support is key. And it gets 74 00:03:40.199 --> 00:03:43.039 into some interesting ones too, like read_clipboard for quick 75 00:03:43.039 --> 00:03:43.919 copy-pasting 76 00:03:43.759 --> 00:03:45.879 data. Handy for testing small things. And 77 00:03:45.879 --> 00:03:48.879 read_html to pull tables straight from websites, which is 78 00:03:48.919 --> 00:03:49.439 pretty cool. 79 00:03:49.479 --> 00:03:51.039 Wait, multiple tables on a page? 80 00:03:51.120 --> 00:03:53.400 Yeah. It mentions the match parameter. You can tell it, 81 00:03:53.439 --> 00:03:55.520 like, look for a table with specific text in it 82 00:03:55.639 --> 00:03:59.159 or near it. Yeah. Really useful for automating data scraping pipelines. 83 00:03:59.199 --> 00:04:00.680 Okay, that is useful. And it 84 00:04:00.599 --> 00:04:03.039 even points to a blog post by the author about 85 00:04:03.039 --> 00:04:06.319 other formats like Parquet and pickle. So pandas really positions 86 00:04:06.360 --> 00:04:08.479 itself as a central hub for data movement. 87 00:04:08.639 --> 00:04:11.680 But it's not just about reading data, right? Efficiency matters, 88 00:04:11.800 --> 00:04:13.240 especially with big data sets. 89 00:04:13.319 --> 00:04:17.639 Oh, absolutely. The book stresses optimizing pandas.read_csv, 90 00:04:18.639 --> 00:04:23.519 huge for data architects worried about resources, performance, cost. How? 91 00:04:23.759 --> 00:04:26.519 Well, the dtype parameter. 
First, you can tell pandas 92 00:04:26.600 --> 00:04:29.279 exactly what data type each column should be, like 93 00:04:29.319 --> 00:04:31.680 this one's an integer, this one's a smaller float. 94 00:04:31.480 --> 00:04:34.120 Instead of letting pandas guess and maybe use more memory 95 00:04:34.160 --> 00:04:35.560 than needed. Precisely. 96 00:04:36.199 --> 00:04:39.079 The book gives an example with housing data, setting rooms 97 00:04:39.439 --> 00:04:42.199 to np.int32, distance to 98 00:04:42.279 --> 00:04:45.399 np.float16. Stuff like that saves a noticeable chunk 99 00:04:45.399 --> 00:04:45.839 of memory. 100 00:04:46.040 --> 00:04:47.240 Okay, I see. Makes sense. 101 00:04:47.240 --> 00:04:49.680 Then there's usecols. Just tell it which columns you actually need. 102 00:04:49.800 --> 00:04:52.439 Ah, so don't even load the rest into memory. 103 00:04:52.199 --> 00:04:54.240 Right. If you only need five columns out of fifty, 104 00:04:54.279 --> 00:04:57.199 why load all fifty? Again, the example shows a big 105 00:04:57.240 --> 00:04:59.519 memory drop just by specifying 106 00:04:58.920 --> 00:05:01.199 the columns. And you can combine those, dtype and 107 00:05:01.319 --> 00:05:01.800 usecols? 108 00:05:01.920 --> 00:05:05.120 Yeah, that's where the real magic happens for optimization. The 109 00:05:05.160 --> 00:05:08.759 book shows combining them can bring memory usage way down, 110 00:05:08.879 --> 00:05:12.639 sometimes to just, like, a few thousand KB. That's significant 111 00:05:12.639 --> 00:05:14.519 for pipeline performance. Definitely. 112 00:05:14.560 --> 00:05:18.120 What about data sets that are just too big, like, 113 00:05:18.600 --> 00:05:19.959 won't fit in memory at all? 
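A minimal sketch of the dtype and usecols savings discussed above, assuming pandas and NumPy are installed. The housing-style data is generated in memory, and the column names (`Rooms`, `Distance`, `Price`, `Suburb`) are illustrative stand-ins rather than the book's actual dataset.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical housing-style data; column names are illustrative only.
rng = np.random.default_rng(0)
n_rows = 1_000
raw = pd.DataFrame({
    "Rooms": rng.integers(1, 8, n_rows),
    "Distance": rng.uniform(0.5, 40.0, n_rows).round(1),
    "Price": rng.integers(200_000, 2_000_000, n_rows),
    "Suburb": rng.choice(["North", "South", "East", "West"], n_rows),
})
csv_text = raw.to_csv(index=False)


def memory_bytes(df: pd.DataFrame) -> int:
    """Total memory used by the frame, including string contents."""
    return int(df.memory_usage(deep=True).sum())


# 1) Default load: pandas infers wide types (64-bit numbers, generic strings).
default_df = pd.read_csv(io.StringIO(csv_text))

# 2) Explicit, narrower dtypes for two numeric columns.
typed_df = pd.read_csv(
    io.StringIO(csv_text),
    dtype={"Rooms": np.int32, "Distance": np.float16},
)

# 3) Narrow dtypes plus only the columns that are actually needed.
slim_df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["Rooms", "Distance"],
    dtype={"Rooms": np.int32, "Distance": np.float16},
)

for label, df in [
    ("default", default_df),
    ("dtype", typed_df),
    ("dtype + usecols", slim_df),
]:
    print(f"{label:>16}: {memory_bytes(df):>8} bytes")
```

The exact byte counts depend on the pandas version, so none are quoted here; the ordering (default largest, dtype plus usecols smallest) is the point of the comparison.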
114 00:05:20.120 --> 00:05:23.240 That's where chunksize comes in. Critical for architects designing 115 00:05:23.240 --> 00:05:26.399 for massive data volumes. It lets you process the file piece 116 00:05:26.439 --> 00:05:28.639 by piece in manageable chunks. 117 00:05:28.759 --> 00:05:33.199 Okay. So solid foundation with Python, efficient data handling. Where 118 00:05:33.199 --> 00:05:35.480 does it go next? It seems like chapter two tackles 119 00:05:35.600 --> 00:05:37.079 real world data challenges. 120 00:05:37.160 --> 00:05:40.240 Yeah, it shifts focus to, you know, the practical side 121 00:05:40.240 --> 00:05:43.720 of turning that raw data into something useful, into insights, 122 00:05:44.120 --> 00:05:46.759 because just having the data stored isn't the endgame. 123 00:05:46.560 --> 00:05:47.000 Right, right. 124 00:05:47.040 --> 00:05:49.920 You need to understand it, prepare it. Exactly, and especially 125 00:05:49.920 --> 00:05:53.639 before feeding it into, say, a deep learning model. So 126 00:05:53.759 --> 00:05:57.800 the book introduces some automated tools for exploratory data analysis, EDA. 127 00:05:57.920 --> 00:06:01.480 Ah, tools to speed up that initial data understanding phase. 128 00:06:01.879 --> 00:06:04.319 Useful for an architect looking at a new source. Definitely. 129 00:06:04.360 --> 00:06:05.519 First one mentioned is pandas 130 00:06:05.600 --> 00:06:06.959 profiling. Heard of that one? 131 00:06:07.040 --> 00:06:12.399 Yeah. Generates reports, right? Yep, pretty comprehensive reports, gives you stats, distributions, 132 00:06:12.680 --> 00:06:15.560 potential issues like missing values, correlations, all in one go. 133 00:06:16.279 --> 00:06:18.399 Great for getting a fast assessment of a data set 134 00:06:18.439 --> 00:06:19.199 you need to integrate. 135 00:06:19.319 --> 00:06:21.800 Saves a lot of manual plotting and checking. For
136 00:06:21.759 --> 00:06:25.600 sure, and it can generate interactive widgets in Jupyter notebooks 137 00:06:25.720 --> 00:06:30.480 using profile.to_widgets(). Good for collaboration. It also 138 00:06:30.519 --> 00:06:32.920 mentions a minimal mode for really huge data sets so 139 00:06:32.959 --> 00:06:36.079 it doesn't choke. Practical. What else? Next is Sweetviz, 140 00:06:36.639 --> 00:06:39.680 similar goal, EDA, but it really focuses on creating nice 141 00:06:39.839 --> 00:06:41.639 interactive HTML reports. 142 00:06:41.839 --> 00:06:44.519 HTML reports, so easy to share. Exactly, you can 143 00:06:44.399 --> 00:06:46.839 share them with people who aren't running Python. And a 144 00:06:46.920 --> 00:06:49.360 key feature is comparing two data sets side by side. 145 00:06:49.439 --> 00:06:53.279 Ooh, that sounds useful, like for data migration, validation or 146 00:06:53.560 --> 00:06:54.720 A/B test results. 147 00:06:54.720 --> 00:06:57.759 Precisely. The commands are simple: sv.analyze(df). 148 00:06:57.759 --> 00:07:01.199 show_html() for one data set, sv.compare(df1, 149 00:07:01.240 --> 00:07:04.879 df2).show_html('compare_report.html') for two. The 150 00:07:05.000 --> 00:07:06.879 visual comparison could be really powerful. 151 00:07:07.000 --> 00:07:09.759 Okay, cool, what's next? AutoViz sounds fast. 152 00:07:09.639 --> 00:07:12.759 That's the idea. One line of code for visualizations: AV 153 00:07:12.879 --> 00:07:16.800 .AutoViz('your_data.csv', depVar='target_column'). 154 00:07:16.920 --> 00:07:18.040 One line. What does it show? 155 00:07:18.240 --> 00:07:21.639 It tries to automatically figure out the important relationships and 156 00:07:21.720 --> 00:07:25.800 generates a bunch of relevant plots: scatterplots, distributions, box plots, 157 00:07:25.800 --> 00:07:29.360 heat maps, whatever seems appropriate for the variables. Quick visual 158 00:07:29.399 --> 00:07:30.759 scans for patterns or problems. 
159 00:07:30.839 --> 00:07:35.240 Efficient. Then there's Lux, that integrates with Jupyter widgets. 160 00:07:35.439 --> 00:07:37.920 Yeah. Lux is interesting because it works within the notebook. 161 00:07:37.959 --> 00:07:41.199 You get a Toggle Pandas/Lux button on your data frame. 162 00:07:41.120 --> 00:07:43.639 So you switch between table and visuals. 163 00:07:43.879 --> 00:07:47.160 Kind of. When you display a data frame, Lux adds 164 00:07:47.199 --> 00:07:51.040 recommendations for visualizations based on the current data, or you 165 00:07:51.040 --> 00:07:55.279 can state an intent, like df.intent = ["Age", "Fare"], 166 00:07:56.079 --> 00:07:58.160 and it generates plots relevant to 167 00:07:58.079 --> 00:07:59.480 those columns. Intent based. 168 00:07:59.519 --> 00:08:01.920 I like that. It tries to be smarter about what 169 00:08:01.959 --> 00:08:04.000 you might want to see. You can save reports to 170 00:08:04.079 --> 00:08:07.920 HTML too, df.save_as_html, clear the intent, even get 171 00:08:07.920 --> 00:08:10.240 more advanced with VisList. It's quite interactive. 172 00:08:10.319 --> 00:08:13.079 Lots of EDA tools. What about modeling? Good transition. 173 00:08:13.480 --> 00:08:16.759 The next tool is LazyPredict. It shifts gears towards 174 00:08:16.839 --> 00:08:19.279 quickly trying out lots of different machine learning models. 175 00:08:19.399 --> 00:08:22.079 Lazy, like it does the work for you? Pretty much. 176 00:08:22.680 --> 00:08:25.439 If you're thinking about adding ML but aren't sure which 177 00:08:25.480 --> 00:08:29.240 algorithm might work best, LazyPredict runs your data through 178 00:08:29.279 --> 00:08:31.959 a whole bunch of standard classifiers or regressors. 179 00:08:32.080 --> 00:08:32.840 How does that work? 180 00:08:33.000 --> 00:08:36.120 You use LazyClassifier or LazyRegressor, give it your 181 00:08:36.159 --> 00:08:40.000 training and testing data, and call fit. 
It then spits 182 00:08:40.000 --> 00:08:43.320 out a table comparing the performance metrics, accuracy, F1 183 00:08:43.360 --> 00:08:46.799 score, R-squared, whatever, for dozens of models. 184 00:08:46.960 --> 00:08:49.720 Wow. Okay, so a quick benchmark to see what directions 185 00:08:49.720 --> 00:08:53.720 look promising. Exactly. Helps you understand potential complexity and performance 186 00:08:53.759 --> 00:08:56.440 trade-offs early on, before you commit architecturally. 187 00:08:56.720 --> 00:08:58.879 And the last one in this chapter, PyCaret? 188 00:08:59.159 --> 00:09:02.000 PyCaret is another automated ML library, maybe 189 00:09:02.039 --> 00:09:04.840 a bit more comprehensive than LazyPredict. It covers more 190 00:09:04.840 --> 00:09:09.639 of the workflow: model building, tuning, evaluation, even some deployment 191 00:09:09.240 --> 00:09:11.080 aspects. So more end to end. 192 00:09:11.600 --> 00:09:13.639 Yeah, you can do things like compare_models to get 193 00:09:13.639 --> 00:09:16.120 a leaderboard, and create_model to build, say, an 194 00:09:16.120 --> 00:09:19.639 Extra Trees model, and plot_model to visualize confusion matrices, 195 00:09:19.759 --> 00:09:21.480 feature importance, learning curves. 196 00:09:21.519 --> 00:09:23.200 So it helps understand the whole life cycle. 197 00:09:23.440 --> 00:09:27.240 Right. For an architect, understanding that full process helps inform 198 00:09:27.279 --> 00:09:32.279 decisions about deployment, monitoring, scaling the ML components within the 199 00:09:32.360 --> 00:09:33.039 larger system. 200 00:09:33.159 --> 00:09:36.320 Okay, that's a powerful set of tools for understanding and 201 00:09:36.360 --> 00:09:39.799 initially modeling data. What's next? Chapter three gets into the 202 00:09:39.799 --> 00:09:42.000 core, right, building neural networks? 203 00:09:42.080 --> 00:09:45.200 Yes, chapter three dives into artificial neural networks, ANNs. 
204 00:09:45.759 --> 00:09:50.200 It starts conceptually, explaining the inspiration from biology, our 205 00:09:50.200 --> 00:09:53.200 brains. The whole neurons and connections idea. Exactly. 206 00:09:53.200 --> 00:09:55.799 It provides that helpful mental model. Then it breaks down 207 00:09:55.799 --> 00:09:59.399 the basic building blocks: the artificial neurons, the weights connecting them, 208 00:09:59.600 --> 00:10:03.039 the bias, and importantly, activation functions. 209 00:10:02.639 --> 00:10:06.000 Crucial for an architect to understand the components if they're 210 00:10:06.039 --> 00:10:07.360 supporting the infrastructure. 211 00:10:07.519 --> 00:10:10.720 Absolutely. It briefly covers the feed-forward process, how data 212 00:10:10.759 --> 00:10:14.080 flows through, and the basic neuron math. Then, yeah, activation functions, 213 00:10:14.080 --> 00:10:14.799 big emphasis there. 214 00:10:14.840 --> 00:10:15.879 Why are they so important? 215 00:10:16.039 --> 00:10:19.240 They introduce nonlinearity. Without them, the network could only learn 216 00:10:19.360 --> 00:10:22.600 linear relationships no matter how many layers you stack. The 217 00:10:22.639 --> 00:10:25.360 book mentions the common ones: sigmoid, 218 00:10:25.080 --> 00:10:28.279 ReLU. ReLU seems everywhere. 219 00:10:27.879 --> 00:10:33.240 It's computationally efficient. Also tanh, softmax for multi-class outputs, 220 00:10:33.480 --> 00:10:38.080 and variations like Leaky ReLU, ELU. That nonlinearity is 221 00:10:38.279 --> 00:10:40.480 key for learning complex patterns. 222 00:10:40.559 --> 00:10:44.720 Got it. So data goes forward, nonlinearity is added, how 223 00:10:44.759 --> 00:10:45.440 does it learn? 224 00:10:45.799 --> 00:10:48.440 That's where the loss function comes in. It measures how 225 00:10:48.519 --> 00:10:52.159 wrong the network's predictions are compared to the actual answers. 
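The neuron math, activation functions, and loss mentioned above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the book's code: the input values, weights, and helper names are made up for the example. It also shows why stacked layers without an activation collapse into a single linear map.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def relu(z):
    return np.maximum(0.0, z)


def leaky_relu(z, alpha=0.01):
    return np.where(z > 0, z, alpha * z)


def softmax(z):
    # Subtract the max before exponentiating for numerical stability.
    exp = np.exp(z - np.max(z))
    return exp / exp.sum()


def neuron(x, w, b, activation):
    """One artificial neuron: weighted sum plus bias, then activation."""
    return activation(np.dot(w, x) + b)


def binary_cross_entropy(y_true, y_pred):
    """Loss for one prediction: larger when the prediction is more wrong."""
    return -(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))


# Made-up input, weights, and bias for a single neuron.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
b = 0.1

for name, fn in [("sigmoid", sigmoid), ("relu", relu),
                 ("tanh", np.tanh), ("leaky_relu", leaky_relu)]:
    print(name, neuron(x, w, b, fn))

# Two stacked linear layers without an activation equal one linear layer.
W1 = np.array([[1.0, -2.0, 0.5], [0.0, 1.0, -1.0]])
W2 = np.array([[2.0, -1.0]])
stacked_linear = W2 @ (W1 @ x)
collapsed = (W2 @ W1) @ x

# Putting ReLU between the layers breaks that equivalence.
stacked_relu = W2 @ relu(W1 @ x)

print("linear stack:", stacked_linear, "collapsed:", collapsed)
print("with ReLU:", stacked_relu)
print("softmax:", softmax(np.array([2.0, 1.0, 0.1])))
print("loss, good vs bad guess:",
      binary_cross_entropy(1.0, 0.9), binary_cross_entropy(1.0, 0.1))
```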
226 00:10:52.639 --> 00:10:56.000 Quantifies the error. A performance metric, basically. Right. And once 227 00:10:56.000 --> 00:10:59.240 you can measure the error, you use backward propagation, backprop, 228 00:10:59.279 --> 00:11:01.480 to figure out how to adjust the weights and biases 229 00:11:01.480 --> 00:11:03.919 to reduce that error. That's the learning part. 230 00:11:04.080 --> 00:11:08.639 Okay, and the book defines the training jargon? Epochs, batches? 231 00:11:08.360 --> 00:11:12.279 Yep. Defines epoch, one full pass through the training data; batch, 232 00:11:12.519 --> 00:11:15.240 a subset of data used in one update step; iteration, 233 00:11:15.559 --> 00:11:18.480 one update step. Also optimizers, the algorithms that do the 234 00:11:18.480 --> 00:11:22.679 weight adjustments, like SGD, RMSprop, Adam. Adam's another common one. 235 00:11:22.799 --> 00:11:25.159 Very common. And the learning rate, which controls how big 236 00:11:25.200 --> 00:11:29.080 those adjustments are. Understanding these helps estimate resource needs, training 237 00:11:29.080 --> 00:11:30.919 times within an architecture. Makes sense. 238 00:11:31.039 --> 00:11:32.240 Does it show how to build one? 239 00:11:32.480 --> 00:11:37.200 Yes. It uses Keras for practical examples. First, a binary 240 00:11:37.200 --> 00:11:40.720 classification model for breast cancer prediction using a standard data 241 00:11:40.720 --> 00:11:41.679 set from scikit- 242 00:11:41.519 --> 00:11:45.240 learn. So predicting one of two outcomes. 243 00:11:44.799 --> 00:11:48.120 Exactly. Walks through loading libraries, looking at the data, then 244 00:11:48.159 --> 00:11:51.679 building the Keras model layer by layer: input layer, a 245 00:11:51.759 --> 00:11:52.639 hidden layer with 246 00:11:52.679 --> 00:11:54.879 ReLU. How many neurons?
247 00:11:54.519 --> 00:11:57.200 The example uses, I think, sixteen in the hidden layer, 248 00:11:57.519 --> 00:12:00.480 then an output layer with one neuron and a sigmoid 249 00:12:00.519 --> 00:12:04.960 activation because it's binary. Then compiling it, choosing the Adam 250 00:12:05.000 --> 00:12:08.559 optimizer, the right loss function, like sparse categorical cross-entropy, 251 00:12:09.159 --> 00:12:10.480 then training with model.fit 252 00:12:10.240 --> 00:12:11.519 and evaluating. 253 00:12:11.600 --> 00:12:14.080 Yeah, looks at loss and accuracy. Yeah, shows how to plot 254 00:12:14.080 --> 00:12:16.759 a confusion matrix, get a classification report. Gives you the 255 00:12:16.759 --> 00:12:17.799 whole assessment 256 00:12:17.399 --> 00:12:20.279 picture, which is vital if you're monitoring these models in production. 257 00:12:20.480 --> 00:12:24.399 Absolutely. Then, interestingly, it builds a deeper network for the 258 00:12:24.399 --> 00:12:26.240 same problem, adds another hidden 259 00:12:26.039 --> 00:12:28.600 layer to see if it improves, right? And 260 00:12:28.559 --> 00:12:32.600 the book notes that the deeper network performed better. Highlights 261 00:12:32.639 --> 00:12:37.320 that trade-off: more complexity, potentially better results, but also 262 00:12:37.399 --> 00:12:40.039 more computation. An architectural consideration. 263 00:12:40.159 --> 00:12:42.919 Good point. What about other types of problems? 264 00:12:43.240 --> 00:12:46.960 It follows up with a regression example, predicting Boston housing prices, 265 00:12:47.720 --> 00:12:48.960 again using a built- 266 00:12:48.679 --> 00:12:52.279 in data set. So predicting a number, not a category. Correct. 267 00:12:52.279 --> 00:12:56.399 And here it emphasizes preprocessing more. Train-test split, of course, 268 00:12:56.679 --> 00:13:00.440 but also feature scaling using StandardScaler, often crucial for 269 00:13:00.519 --> 00:13:01.799 regression with neural nets.
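To make the training vocabulary from earlier (epoch, batch, iteration, learning rate) concrete, here is a from-scratch NumPy sketch of the binary classifier shape just described: one hidden layer of sixteen ReLU units and a single sigmoid output. It is deliberately not Keras, the data is synthetic rather than the breast cancer dataset, the optimizer is plain mini-batch SGD rather than Adam, and it uses binary cross-entropy, the usual pairing for a single sigmoid output. Features are standardized first, which is the same idea as StandardScaler.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic binary data (a stand-in for a real dataset):
# two features on very different scales.
n_samples = 400
X = np.column_stack([
    rng.normal(0.0, 1.0, n_samples),
    rng.normal(500.0, 100.0, n_samples),
])
y = ((X[:, 0] + (X[:, 1] - 500.0) / 100.0) > 0).astype(float).reshape(-1, 1)

# Standardize each feature to zero mean and unit variance.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# One hidden layer (16 ReLU units) and one sigmoid output unit.
W1 = rng.normal(0.0, 0.5, (2, 16))
b1 = np.zeros((1, 16))
W2 = rng.normal(0.0, 0.5, (16, 1))
b2 = np.zeros((1, 1))


def forward(inputs):
    """Feed-forward pass; returns intermediate values needed for backprop."""
    hidden_pre = inputs @ W1 + b1
    hidden = np.maximum(0.0, hidden_pre)
    output = 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))
    return hidden_pre, hidden, output


def bce(pred, target):
    """Mean binary cross-entropy, clipped to avoid log(0)."""
    pred = np.clip(pred, 1e-7, 1 - 1e-7)
    return float(-np.mean(target * np.log(pred)
                          + (1 - target) * np.log(1 - pred)))


learning_rate = 0.1   # size of each weight adjustment
epochs = 20           # full passes over the training data
batch_size = 32       # samples per update step
iterations = 0        # counts individual update steps
initial_loss = bce(forward(X)[2], y)

for epoch in range(epochs):
    order = rng.permutation(n_samples)
    for start in range(0, n_samples, batch_size):
        idx = order[start:start + batch_size]
        xb, yb = X[idx], y[idx]
        hidden_pre, hidden, pred = forward(xb)

        # Backpropagation: gradient of the mean loss for this batch.
        d_out = (pred - yb) / len(xb)
        grad_W2 = hidden.T @ d_out
        grad_b2 = d_out.sum(axis=0, keepdims=True)
        d_hidden = (d_out @ W2.T) * (hidden_pre > 0)
        grad_W1 = xb.T @ d_hidden
        grad_b1 = d_hidden.sum(axis=0, keepdims=True)

        # Plain SGD update, scaled by the learning rate.
        W1 -= learning_rate * grad_W1
        b1 -= learning_rate * grad_b1
        W2 -= learning_rate * grad_W2
        b2 -= learning_rate * grad_b2
        iterations += 1

final_loss = bce(forward(X)[2], y)
accuracy = float(np.mean((forward(X)[2] > 0.5) == (y > 0.5)))
print(f"iterations: {iterations}")
print(f"loss: {initial_loss:.3f} -> {final_loss:.3f}, accuracy: {accuracy:.3f}")
```

With 400 samples and a batch size of 32 there are 13 batches per epoch, so 20 epochs give 260 iterations; that multiplication is what drives training-time estimates.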
270 00:13:01.879 --> 00:13:03.639 Why? Scaling helps 271 00:13:03.399 --> 00:13:06.200 the optimizer converge better when features have very different ranges. 272 00:13:06.759 --> 00:13:09.960 The model itself is similar: input layer matching the number 273 00:13:10.000 --> 00:13:12.240 of features, a hidden layer, maybe one hundred and twenty 274 00:13:12.240 --> 00:13:14.759 eight neurons with ReLU, and then the output layer is 275 00:13:14.799 --> 00:13:16.600 just one neuron with a linear 276 00:13:16.240 --> 00:13:19.399 activation. Linear because the output is a continuous price. 277 00:13:19.240 --> 00:13:22.320 Exactly. And the loss function changes too. Uses mean squared 278 00:13:22.399 --> 00:13:25.440 error, standard for regression, then trains and evaluates based on 279 00:13:25.480 --> 00:13:26.720 the MSE on the test set. 280 00:13:26.759 --> 00:13:30.600 So two clear examples. Classification and regression. Covers the basics 281 00:13:30.600 --> 00:13:33.559 well. Yeah, provides a solid Keras foundation for building these 282 00:13:33.559 --> 00:13:37.759 fundamental network types. Essential knowledge for architects dealing with different 283 00:13:37.879 --> 00:13:38.960 ML model types. 284 00:13:39.080 --> 00:13:42.879 Okay, moving on to chapter four, convolutional neural networks, CNNs, 285 00:13:43.039 --> 00:13:44.279 big topic for images. 286 00:13:44.440 --> 00:13:48.159 Huge. The book starts by explaining why you need CNNs 287 00:13:48.279 --> 00:13:51.720 for images, why just flattening the pixels and feeding them 288 00:13:51.720 --> 00:13:54.360 into a standard network isn't ideal. 289 00:13:54.559 --> 00:13:57.279 Right. You mentioned losing spatial info. Exactly. 290 00:13:57.320 --> 00:14:01.600 It's sensitive to shifts, distortions. CNNs are designed to handle 291 00:14:01.600 --> 00:14:05.200 that spatial hierarchy in images. It introduces the core idea: 292 00:14:05.879 --> 00:14:06.840 kernels or
293 00:14:06.720 --> 00:14:09.120 filters, the little squares that slide over 294 00:14:09.000 --> 00:14:12.360 the image. Yep, and the convolution operation itself, how the 295 00:14:12.360 --> 00:14:15.759 filter multiplies and sums pixel values to create a feature map, 296 00:14:15.960 --> 00:14:20.039 highlighting specific patterns like edges or textures. Understanding this helps 297 00:14:20.039 --> 00:14:23.240 think about how image data needs to be handled architecturally. 298 00:14:23.320 --> 00:14:24.960 It also covers stride and padding? 299 00:14:25.240 --> 00:14:29.279 Right. Stride is how many pixels the filter jumps each time. 300 00:14:29.679 --> 00:14:32.840 Padding is adding borders to control the output size. Then 301 00:14:32.919 --> 00:14:36.720 it explains how convolution works on color images, RGB, multiple channels, 302 00:14:36.919 --> 00:14:40.039 and how using multiple filters lets the network learn different 303 00:14:40.039 --> 00:14:41.200 features simultaneously. 304 00:14:41.399 --> 00:14:44.879 Okay, so convolution extracts features. What else is in a CNN? 305 00:14:45.240 --> 00:14:48.919 Pooling layers, usually max pooling. They downsample the feature maps, 306 00:14:48.960 --> 00:14:52.440 make the network more robust to variations, reduce computation. Brings 307 00:14:52.480 --> 00:14:56.440 things down, basically. Yeah. Then flattening, taking the final two- 308 00:14:56.480 --> 00:14:58.440 D feature maps and turning them into a one-D 309 00:14:58.480 --> 00:15:01.720 vector to feed into a regular dense layer. Exactly. 310 00:15:01.840 --> 00:15:04.320 The final part is usually one or more dense layers 311 00:15:04.320 --> 00:15:06.639 for the actual classification, just like in the ANNs 312 00:15:06.720 --> 00:15:07.519 we discussed. 313 00:15:07.519 --> 00:15:09.559 Does it show an example? Of course.
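The convolution, stride, padding, pooling, and flattening steps described above can be sketched on a tiny single-channel image with plain NumPy. This is an illustrative sketch with made-up helper names, not the book's code; like most deep learning frameworks, it slides the kernel without flipping it (strictly a cross-correlation).

```python
import numpy as np


def conv2d(image, kernel, stride=1, padding=0):
    """Slide a kernel over a 2-D image; multiply and sum at each position."""
    if padding > 0:
        image = np.pad(image, padding, mode="constant")  # zero border
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    feature_map = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            top, left = i * stride, j * stride
            patch = image[top:top + kh, left:left + kw]
            feature_map[i, j] = np.sum(patch * kernel)
    return feature_map


def max_pool2d(feature_map, size=2):
    """Downsample by keeping the maximum of each size x size block."""
    out_h = feature_map.shape[0] // size
    out_w = feature_map.shape[1] // size
    trimmed = feature_map[:out_h * size, :out_w * size]
    return trimmed.reshape(out_h, size, out_w, size).max(axis=(1, 3))


# 6x6 image: dark left half, bright right half (a vertical edge).
image = np.zeros((6, 6))
image[:, 3:] = 1.0

# 3x3 kernel that responds to dark-to-bright transitions, left to right.
kernel = np.array([
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
    [-1.0, 0.0, 1.0],
])

features = conv2d(image, kernel)                # no padding: 6x6 -> 4x4
same_size = conv2d(image, kernel, padding=1)    # padding keeps 6x6
strided = conv2d(image, kernel, stride=2)       # stride 2: 6x6 -> 2x2
pooled = max_pool2d(features)                   # 4x4 -> 2x2
flat = pooled.flatten()                         # 2-D map -> 1-D vector

print(features)
print(same_size.shape, strided.shape, pooled.shape, flat.shape)
```

The feature map is strongest in the columns that straddle the edge and zero elsewhere, which is the "highlighting specific patterns" behavior described in the discussion.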
314 00:15:09.600 --> 00:15:13.480 The classic MNIST data set, handwritten digits. Walks through loading 315 00:15:13.480 --> 00:15:16.120 it via Keras, preprocessing. 316 00:15:15.480 --> 00:15:17.480 Like reshaping for the color channel? 317 00:15:17.279 --> 00:15:19.799 Yep, reshaping to add that channel dimension even though it's 318 00:15:19.799 --> 00:15:23.720 greyscale, and one-hot encoding the labels zero to nine. 319 00:15:23.320 --> 00:15:25.000 And the CNN architecture? 320 00:15:25.159 --> 00:15:28.720 It shows building a typical CNN: Conv2D layers 321 00:15:28.759 --> 00:15:32.519 with ReLU, maybe batch normalization for stability, MaxPooling2D 322 00:15:32.559 --> 00:15:35.559 layers, the Flatten layer, and Dense layers with softmax 323 00:15:35.600 --> 00:15:37.440 at the end for the ten digit 324 00:15:37.200 --> 00:15:40.519 classes. Compiled with categorical cross-entropy? 325 00:15:40.399 --> 00:15:45.200 Right, and Adam optimizer usually. Then training, plotting the accuracy and loss curves, 326 00:15:45.200 --> 00:15:49.000 making predictions, showing the confusion matrix. The whole workflow for 327 00:15:49.039 --> 00:15:50.799 image classification. Very practical. 328 00:15:50.879 --> 00:15:53.960 What about tuning? CNNs have lots of knobs 329 00:15:53.600 --> 00:15:57.320 to turn. Good point. The chapter introduces hyperparameter tuning 330 00:15:57.519 --> 00:16:00.200 using Keras Tuner, but switches to the Fashion MNIST 331 00:16:00.320 --> 00:16:03.879 data set. Similar idea, but images of clothing items. Why 332 00:16:03.919 --> 00:16:06.519 tuning? Because just picking the number of layers or filter 333 00:16:06.600 --> 00:16:10.159 sizes by guesswork isn't optimal. Things like learning rate, 334 00:16:10.360 --> 00:16:14.279 activation functions, number of units, they all impact performance. Tuning 335 00:16:14.320 --> 00:16:15.039 finds the best 336 00:16:14.840 --> 00:16:18.120 combo. And Keras Tuner helps automate that search. Exactly.
337 00:16:18.240 --> 00:16:20.919 The book shows how to install it, pip install 338 00:16:21.000 --> 00:16:24.320 keras-tuner. Then define a model-building function. Inside that function, 339 00:16:24.559 --> 00:16:27.279 you define the search space for your hyperparameters, like 340 00:16:27.360 --> 00:16:30.000 try learning rates of either point zero one or point 341 00:16:30.120 --> 00:16:31.080 zero zero one? 342 00:16:31.159 --> 00:16:35.000 Precisely, or try one versus two dense layers or different 343 00:16:35.039 --> 00:16:38.440 numbers of units. You tell Keras Tuner the ranges or choices. 344 00:16:38.879 --> 00:16:43.639 Then you create a tuner object, like kt.Hyperband. Hyperband, 345 00:16:43.720 --> 00:16:46.559 it's one of the search algorithms Keras Tuner offers. Then 346 00:16:46.559 --> 00:16:49.120 you run tuner.search and it trains lots of 347 00:16:49.120 --> 00:16:51.840 model variations to find the best hyperparameters. 348 00:16:51.320 --> 00:16:53.679 And you can get the best ones out? Yep, tuner. 349 00:16:53.639 --> 00:16:56.759 get_best_hyperparameters gives you the optimal settings it found. 350 00:16:57.360 --> 00:16:59.679 Then you build the final model with those best settings 351 00:17:00.039 --> 00:17:01.279 and train it properly. 352 00:17:01.039 --> 00:17:04.119 And evaluate that best model. Shows the real benefit. 353 00:17:04.200 --> 00:17:07.400 Right, shows how tuning can push performance higher. Important for 354 00:17:07.519 --> 00:17:09.839 architects thinking about optimizing training pipelines. 355 00:17:09.920 --> 00:17:14.079 Okay. Chapter five moves to a specific application, optical character 356 00:17:14.160 --> 00:17:15.880 recognition, OCR. 357 00:17:16.039 --> 00:17:19.880 Yeah, turning text in images into actual usable text data.
358 00:17:20.519 --> 00:17:25.400 Super important for digitizing documents, invoices, bank statements, even reading 359 00:17:25.480 --> 00:17:26.960 road signs for autonomous cars. 360 00:17:27.119 --> 00:17:28.920 Lots of applications. What tools does it cover? 361 00:17:29.079 --> 00:17:31.279 It introduces several Python OCR libraries. 362 00:17:31.880 --> 00:17:35.240 Starts with Tesseract, the classic open source one from HP, 363 00:17:35.359 --> 00:17:36.839 then Google? That's the one. 364 00:17:37.000 --> 00:17:40.640 Mentions installation, setting the path, and a basic demo using 365 00:17:40.680 --> 00:17:44.480 pytesseract.image_to_string. Straightforward for simple cases. 366 00:17:44.559 --> 00:17:46.119 What about more modern approaches? 367 00:17:46.440 --> 00:17:49.599 It covers Keras-OCR. This uses deep learning models under 368 00:17:49.599 --> 00:17:52.799 the hood. Shows installing it, creating an OCR pipeline, and 369 00:17:52.920 --> 00:17:54.680 using pipeline.recognize on an 370 00:17:54.640 --> 00:17:57.880 image. So leveraging pre-trained models? Exactly. 371 00:17:57.920 --> 00:18:00.920 It often handles more varied images better than traditional methods. 372 00:18:01.079 --> 00:18:02.000 Okay, any others? 373 00:18:02.039 --> 00:18:04.680 EasyOCR. The name says it all, right? It highlights 374 00:18:04.680 --> 00:18:08.480 its simplicity and really good multi-language support out of 375 00:18:08.480 --> 00:18:08.839 the box. 376 00:18:08.960 --> 00:18:10.880 Multi-language, that's a big plus. 377 00:18:11.000 --> 00:18:15.359 Definitely. Shows installation, pip install easyocr, initializing a Reader with 378 00:18:15.440 --> 00:18:19.160 language codes like en, fr, de, and then just reader 379 00:18:19.200 --> 00:18:20.720 .readtext. Pretty simple 380 00:18:20.799 --> 00:18:23.960 API. Nice. Does it handle PDFs? That's common.
381 00:18:24.039 --> 00:18:27.000 Mentions that, yeah. Needs helper tools like poppler-utils and 382 00:18:27.039 --> 00:18:30.440 pdf2image to first convert PDF pages to images. 383 00:18:30.839 --> 00:18:33.880 Then you run EasyOCR on the images. A common workflow. 384 00:18:34.000 --> 00:18:35.960 Good practical tip. One more? 385 00:18:36.000 --> 00:18:40.160 TrOCR. Stands for transformer OCR. It's described as more of 386 00:18:40.160 --> 00:18:44.079 a research project using transformer models, like from NLP, adapted 387 00:18:44.119 --> 00:18:46.319 for OCR on challenging natural 388 00:18:46.119 --> 00:18:49.119 images. Transformers for OCR, interesting. 389 00:18:49.200 --> 00:18:52.039 Yeah, shows how cutting-edge NLP architectures are crossing over. 390 00:18:52.359 --> 00:18:57.119 Shows installation, pip install transformers, loading the pre-trained 391 00:18:57.160 --> 00:19:00.880 TrOCR model and processor, and running it. Represents the 392 00:19:00.880 --> 00:19:01.519 state of the art. 393 00:19:01.880 --> 00:19:05.319 Okay, so several OCR options depending on the need. Chapter 394 00:19:05.359 --> 00:19:06.720 six, object detection. 395 00:19:07.000 --> 00:19:10.279 Right. Moving beyond just what's in an image, classification, or 396 00:19:10.319 --> 00:19:13.799 where text is, OCR, to finding multiple objects and drawing 397 00:19:13.880 --> 00:19:14.799 boxes around them. 398 00:19:14.839 --> 00:19:17.279 It distinguishes that from classification and localization 399 00:19:17.440 --> 00:19:21.079 first? Yeah, clearly defined: classification with one label per image; 400 00:19:21.160 --> 00:19:25.599 localization, one object with a box; detection, multiple objects, multiple boxes. 401 00:19:25.640 --> 00:19:29.200