WEBVTT 1 00:00:00.080 --> 00:00:04.160 Okay, imagine this scenario. You launch your amazing new application 2 00:00:04.400 --> 00:00:07.960 and boom, overnight, it just goes completely viral. The dream, right? 3 00:00:08.000 --> 00:00:11.800 Absolutely, every developer's dream. But then that little bit of 4 00:00:11.800 --> 00:00:15.160 panic starts creeping in. Is your system actually ready? Can 5 00:00:15.199 --> 00:00:18.239 it handle, say, a one hundred x traffic spike without just 6 00:00:18.320 --> 00:00:18.879 falling over? 7 00:00:19.280 --> 00:00:21.679 Yeah, that's the kind of challenge that definitely keeps people 8 00:00:21.760 --> 00:00:23.160 up at night. Exactly. 9 00:00:23.600 --> 00:00:26.039 So this deep dive, this is your shortcut really to 10 00:00:26.199 --> 00:00:31.879 understanding how Kubernetes and containers can turn that potential nightmare 11 00:00:31.920 --> 00:00:36.320 scenario into a, well, a robust, scalable reality. 12 00:00:36.359 --> 00:00:38.560 We're basically going to pull apart the key ideas from 13 00:00:38.560 --> 00:00:42.359 a really great resource, William Denniss's Kubernetes for Developers. 14 00:00:42.439 --> 00:00:45.359 Yeah, we'll unpack the essentials, starting from the real basics 15 00:00:45.399 --> 00:00:46.840 like why you'd even use containers in 16 00:00:46.799 --> 00:00:48.240 the first place, all the way up to the more 17 00:00:48.280 --> 00:00:51.000 advanced stuff, you know, strategies for running your applications in 18 00:00:51.039 --> 00:00:53.159 a proper, professional production environment. 19 00:00:53.320 --> 00:00:55.799 Our mission here is simple. We want to equip you 20 00:00:55.840 --> 00:00:59.359 with the knowledge you need to confidently deploy, manage, and 21 00:00:59.479 --> 00:01:03.320 crucially scale your applications, 22 00:01:02.240 --> 00:01:05.480 making you feel well informed and hopefully ready for pretty 23 00:01:05.519 --> 00:01:06.040 much anything. 
24 00:01:06.120 --> 00:01:08.159 Okay, so let's dive right in. Let's start right at 25 00:01:08.200 --> 00:01:12.719 the beginning. Why are containers and Kubernetes suddenly such 26 00:01:12.719 --> 00:01:16.079 a huge deal? What problem were they actually designed to solve? 27 00:01:16.920 --> 00:01:18.879 Well, it's really been an evolution, hasn't it? I mean, 28 00:01:18.959 --> 00:01:21.760 you think back. We went from just throwing multiple apps 29 00:01:21.799 --> 00:01:22.840 onto a single server. 30 00:01:23.000 --> 00:01:24.920 Oh yeah, the bad old days. Totally. 31 00:01:25.040 --> 00:01:30.159 Then we moved to isolating them in these heavyweight virtual machines, 32 00:01:30.719 --> 00:01:34.319 and now finally we've landed on containers. 33 00:01:34.000 --> 00:01:35.200 And the key difference? 34 00:01:35.439 --> 00:01:39.040 It's all about that lightweight isolation. See, a VM, it 35 00:01:39.200 --> 00:01:43.239 duplicates an entire operating system, kernel and all. Super heavy, right? 36 00:01:43.560 --> 00:01:46.719 But a container, it just packages your application and only 37 00:01:46.760 --> 00:01:48.599 its direct dependencies, nothing else. 38 00:01:48.920 --> 00:01:51.439 Okay, so the benefits there? Huge benefits. 39 00:01:52.040 --> 00:01:55.319 Language flexibility, for one. You can run, like, different versions 40 00:01:55.359 --> 00:01:57.920 of Java or Python on the very same host machine 41 00:01:57.959 --> 00:01:58.760 without them clashing. 42 00:01:58.879 --> 00:01:59.959 That alone is pretty big. 43 00:02:00.079 --> 00:02:03.560 Yeah, it is, and maybe even bigger is true reproducibility. 44 00:02:04.239 --> 00:02:07.000 The Dockerfile, that's like the recipe for your container. 45 00:02:07.120 --> 00:02:12.000 It records every single step needed to create that exact environment. 46 00:02:11.599 --> 00:02:14.199 So no more "well, it worked on my machine." Exactly. 
47 00:02:14.240 --> 00:02:18.639 That whole headache just goes away. Consistency every single time. 48 00:02:18.800 --> 00:02:23.000 Okay, that reproducibility sounds like a lifesaver, personally. So containers 49 00:02:23.039 --> 00:02:27.000 give us this perfectly packaged, portable app. Where does Kubernetes 50 00:02:27.000 --> 00:02:29.159 fit into this picture? What's its role? 51 00:02:29.520 --> 00:02:33.039 So if containers are the perfectly packed boxes, Kubernetes is 52 00:02:33.080 --> 00:02:36.719 kind of like the super smart automated warehouse management system 53 00:02:36.759 --> 00:02:37.800 for all those boxes. 54 00:02:37.840 --> 00:02:38.680 I like that analogy. 55 00:02:38.759 --> 00:02:41.840 It orchestrates them, but at a workload level. It hits 56 00:02:41.919 --> 00:02:44.400 this really nice sweet spot. You know, you're not drowning 57 00:02:44.400 --> 00:02:47.120 in the details of every single machine, but you're also 58 00:02:47.199 --> 00:02:50.719 not stuck in a really high level, sometimes limiting platform 59 00:02:50.759 --> 00:02:52.560 as a service, or PaaS, environment. 60 00:02:52.680 --> 00:02:54.879 So how does it do that? What are the components 61 00:02:55.240 --> 00:02:55.719 it uses? 62 00:02:55.759 --> 00:03:00.680 These really clever, composable building blocks. You've got pods. They're 63 00:03:00.719 --> 00:03:03.599 the smallest thing you can deploy, often just one container, 64 00:03:03.639 --> 00:03:06.159 but could be a few tightly coupled ones working together. 65 00:03:06.680 --> 00:03:09.680 Then you have deployments. They manage your stateless applications, the 66 00:03:09.719 --> 00:03:11.719 ones that don't hold onto data long term. They make 67 00:03:11.800 --> 00:03:14.280 sure you always have the right number of replicas running. 68 00:03:14.400 --> 00:03:16.719 Got it. And how do they talk to each other, 69 00:03:16.759 --> 00:03:17.719 or the outside world? 70 00:03:18.039 --> 00:03:23.039 Ah. 
That's services. Services expose your applications either internally or externally, 71 00:03:23.199 --> 00:03:25.919 and they handle all the load balancing between your pods. 72 00:03:26.319 --> 00:03:28.599 The real magic, though, sounds like the automation. 73 00:03:28.879 --> 00:03:31.840 Oh, absolutely, that's the core of it. It delivers automation. 74 00:03:32.080 --> 00:03:35.639 You get self healing, so if a container crashes, Kubernetes 75 00:03:35.680 --> 00:03:39.120 just restarts it. Nice. If a whole server, a node, 76 00:03:39.199 --> 00:03:42.479 goes down, it reschedules your pods onto healthy nodes automatically. 77 00:03:42.560 --> 00:03:42.919 Wow. 78 00:03:43.080 --> 00:03:46.520 And massive scalability. I mean, think about projects like Niantic's 79 00:03:46.560 --> 00:03:49.680 Pokemon Go that ran on tens of thousands of CPU cores, 80 00:03:49.719 --> 00:03:52.400 scaling up and down like crazy. Kubernetes made that possible. 81 00:03:52.759 --> 00:03:56.919 That self healing, that automatic rescheduling, that's a game changer 82 00:03:56.919 --> 00:03:57.800 for reliability. 83 00:03:59.039 --> 00:04:01.560 It makes you think about those traditional PaaS platforms, though. 84 00:04:01.599 --> 00:04:03.159 People often say they're quicker to start with. 85 00:04:03.360 --> 00:04:05.599 They can be, yeah, initially. 86 00:04:05.240 --> 00:04:08.199 So what's the fundamental difference? What does Kubernetes offer that 87 00:04:08.280 --> 00:04:11.400 makes it worth maybe a steeper initial learning curve? 88 00:04:11.680 --> 00:04:14.719 That's a great question. With a traditional PaaS you can 89 00:04:14.719 --> 00:04:17.279 get going fast, sure, but a lot of teams, they 90 00:04:17.319 --> 00:04:20.680 eventually hit this, well, people call the clone of despair. 
91 00:04:20.959 --> 00:04:25.160 Huh. Okay. Seriously, their needs just outgrow what the PaaS 92 00:04:25.199 --> 00:04:28.560 can easily do, and then they're stuck facing a complete, 93 00:04:29.000 --> 00:04:33.920 often really painful re-architecture from scratch. Ouch. Yeah. Kubernetes, 94 00:04:33.959 --> 00:04:36.279 on the other hand, while maybe taking a bit longer 95 00:04:36.279 --> 00:04:39.879 to learn upfront, gives you just expansive possibilities as you grow. 96 00:04:40.160 --> 00:04:43.439 The key is flexibility. You get deep control, but without 97 00:04:43.480 --> 00:04:45.519 sacrificing that automation we talked 98 00:04:45.279 --> 00:04:48.360 about. So you don't paint yourself into a corner. Exactly. 99 00:04:48.639 --> 00:04:50.920 You won't need to just rip everything out and start 100 00:04:50.920 --> 00:04:53.839 over if your application evolves. So you can run simple, 101 00:04:53.959 --> 00:04:58.120 stateless web apps, sure, but you can also migrate complex, 102 00:04:58.360 --> 00:05:01.519 stateful legacy apps, or even run your own 103 00:05:01.439 --> 00:05:03.800 databases, all on the same platform. All 104 00:05:03.639 --> 00:05:07.959 on the same unified platform. That initial investment in learning Kubernetes, 105 00:05:08.000 --> 00:05:10.800 it really pays off long term as your system matures. 106 00:05:10.920 --> 00:05:13.879 All right, I'm definitely sold on the why. This makes 107 00:05:13.920 --> 00:05:16.279 a lot of sense. Now let's get practical. How do 108 00:05:16.319 --> 00:05:19.199 you actually get an application ready for Kubernetes? What's the 109 00:05:19.319 --> 00:05:20.040 very first step? 110 00:05:20.199 --> 00:05:23.680 The essential first step is getting your app into a container. 111 00:05:23.720 --> 00:05:26.480 We call it containerizing it, and the main tool for 112 00:05:26.519 --> 00:05:27.079 that is the 113 00:05:27.040 --> 00:05:29.000 Dockerfile, the recipe you mentioned earlier. 
114 00:05:29.079 --> 00:05:31.279 Exactly, it's just a text file with a set of 115 00:05:31.279 --> 00:05:33.920 instructions, step by step, telling Docker how to build your 116 00:05:33.920 --> 00:05:36.680 container image. You start with a base image, like, say, 117 00:05:36.759 --> 00:05:40.199 Python 3, instead of installing Python yourself on Ubuntu. Right, 118 00:05:40.399 --> 00:05:45.079 saves you that step. Totally. Then you add your application code, install its dependencies, 119 00:05:45.199 --> 00:05:48.800 configure things. For apps that need compiling, like Java or Go, 120 00:05:49.120 --> 00:05:51.639 there's a really neat trick called multi-stage builds. 121 00:05:51.959 --> 00:05:53.800 Oh yeah, how does that work? 122 00:05:53.959 --> 00:05:59.040 You basically use one stage with all the heavy build tools, compilers, SDKs, whatever. 123 00:05:59.399 --> 00:06:03.279 Then in a final stage, you copy only the compiled 124 00:06:03.319 --> 00:06:06.120 application into a clean, minimal base image. 125 00:06:06.240 --> 00:06:08.480 Ah, so you ditch all the build stuff. Exactly. 126 00:06:08.480 --> 00:06:11.519 You end up with these tiny, production ready images that 127 00:06:11.600 --> 00:06:15.560 only have what's strictly needed to run. Much smaller, more secure. 128 00:06:15.319 --> 00:06:18.439 Makes sense. And for server applications, what else do we 129 00:06:18.480 --> 00:06:19.199 need to keep in mind? 130 00:06:19.279 --> 00:06:21.920 Well, the main thing is your container needs to keep running, right? 131 00:06:22.040 --> 00:06:24.240 It's not just a script that finishes, it's 132 00:06:24.079 --> 00:06:25.079 a long running process. 133 00:06:25.199 --> 00:06:27.560 Yeah, and you'll need to know how to map ports, 134 00:06:27.879 --> 00:06:30.560 like tell Docker to connect port eighty eighty on your 135 00:06:30.600 --> 00:06:34.000 local machine to port eighty inside the container. 
Using something 136 00:06:34.079 --> 00:06:36.800 like the -p 8080:80 flag. 137 00:06:36.879 --> 00:06:39.079 Okay, standard stuff for web servers. So we've got our 138 00:06:39.079 --> 00:06:43.600 container image built, but surely we test this locally before 139 00:06:43.759 --> 00:06:45.079 throwing it onto a big cluster? 140 00:06:45.279 --> 00:06:48.240 Oh absolutely, you have to. And for local testing and development, 141 00:06:48.319 --> 00:06:51.600 Docker Compose is just invaluable. It's really your secret weapon. 142 00:06:51.920 --> 00:06:55.360 It lets you define and run multi-container applications easily, 143 00:06:55.759 --> 00:06:57.759 so you can spin up your app container, maybe a 144 00:06:57.839 --> 00:07:01.240 database container or Redis container, all linked together just like 145 00:07:01.279 --> 00:07:03.319 it would be in production, but on your laptop. 146 00:07:03.519 --> 00:07:03.839 Nice. 147 00:07:03.959 --> 00:07:06.720 It preserves all your runtime settings, makes port mapping 148 00:07:06.759 --> 00:07:09.720 easy, and a really cool feature: you can map local 149 00:07:09.720 --> 00:07:11.800 folders directly into the container. 150 00:07:11.920 --> 00:07:14.639 Oh, so you can edit code locally and see the changes 151 00:07:14.360 --> 00:07:18.079 live? Instantly. Make a code change, refresh your browser, it's reflected. 152 00:07:18.240 --> 00:07:20.879 Super fast iteration. Plus you can even use it to 153 00:07:20.920 --> 00:07:24.560 sort of fake external cloud services. Like how? Say your 154 00:07:24.600 --> 00:07:29.160 app needs to talk to AWS S3 for object storage. Locally, 155 00:07:29.360 --> 00:07:32.399 you can just spin up a MinIO container using Docker Compose. 156 00:07:33.040 --> 00:07:35.959 It's S3-compatible, so your app works 157 00:07:35.959 --> 00:07:39.759 without needing real cloud credentials or connectivity during development. 
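
NOTE
The Compose setup described above might look roughly like the sketch below. It is an illustration only: the service names, the ./src folder, and the S3_* variable names are assumptions, not details from the episode.
```yaml
# docker-compose.yml: a local development sketch (all names are placeholders)
services:
  app:
    build: .                      # build the image from the Dockerfile in this folder
    ports:
      - "8080:80"                 # host port 8080 maps to container port 80
    volumes:
      - ./src:/app/src            # map a local folder in, so edits show up without a rebuild
    environment:
      S3_ENDPOINT: http://storage:9000   # hypothetical variable the app would read
      S3_ACCESS_KEY: devuser
      S3_SECRET_KEY: devpassword
    depends_on:
      - storage
  storage:
    image: minio/minio            # S3-compatible object store standing in for AWS S3
    command: server /data
    environment:
      MINIO_ROOT_USER: devuser
      MINIO_ROOT_PASSWORD: devpassword
    ports:
      - "9000:9000"
```
With a file like this in the project folder, running docker compose up should start both containers together.
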
158 00:07:39.920 --> 00:07:43.240 That's incredibly useful for local dev loops. Okay, so we've 159 00:07:43.240 --> 00:07:47.319 built the image, tested it locally with Compose. Now the 160 00:07:47.360 --> 00:07:50.959 moment of truth: deploying to a real Kubernetes cluster. What 161 00:07:51.000 --> 00:07:52.079 does that actually involve? 162 00:07:52.480 --> 00:07:55.199 Right. Your main interaction point is going to be the 163 00:07:55.199 --> 00:07:59.040 command line tool kubectl, and you tell Kubernetes what you 164 00:07:59.079 --> 00:08:01.879 want by writing these declarative YAML configuration files. 165 00:08:02.120 --> 00:08:03.920 YAML, got it. Everyone's favorite. 166 00:08:04.040 --> 00:08:07.279 Huh. Well, it gets the job done. First, though, you 167 00:08:07.360 --> 00:08:09.920 need a cluster. A really great option, especially when you're 168 00:08:09.920 --> 00:08:14.319 starting out, is a managed service. Think Google Kubernetes Engine, GKE, 169 00:08:15.240 --> 00:08:17.680 AWS EKS, Azure AKS. 170 00:08:17.839 --> 00:08:20.600 Takes away some of the underlying management pain. Exactly. 171 00:08:21.000 --> 00:08:24.639 GKE's Autopilot mode is particularly neat because it handles the 172 00:08:24.680 --> 00:08:27.680 node management for you. You just focus on your workloads. 173 00:08:27.720 --> 00:08:31.319 Okay, so cluster ready, kubectl is set up and authenticated. Then 174 00:08:31.360 --> 00:08:31.920 what? Then 175 00:08:32.000 --> 00:08:34.240 you need to get your container image somewhere Kubernetes can 176 00:08:34.240 --> 00:08:36.919 pull it from. That's a container registry like Docker Hub, 177 00:08:36.919 --> 00:08:40.759 Google Container Registry, AWS ECR, et cetera. 
You push your image there, 178 00:08:41.039 --> 00:08:44.480 then you take your deployment YAML file, the one describing 179 00:08:44.480 --> 00:08:47.360 your application, how many replicas you want, which image to use, 180 00:08:47.879 --> 00:08:50.039 and you just run kubectl apply -f 181 00:08:50.080 --> 00:08:51.600 your-deployment.yaml. 182 00:08:51.360 --> 00:08:54.360 And Kubernetes just makes that happen? Pretty much. 183 00:08:54.440 --> 00:08:56.679 It reads your desired state from the YAML and then 184 00:08:56.840 --> 00:08:59.639 works tirelessly in the background to make the cluster's actual 185 00:08:59.679 --> 00:09:01.919 state match that. It's constantly reconciling. 186 00:09:02.080 --> 00:09:06.480 Okay, but what if things go wrong? Because, you know, they 187 00:09:06.320 --> 00:09:09.399 sometimes do. Oh, for sure, troubleshooting is key. You'll see 188 00:09:09.399 --> 00:09:12.240 some common errors. ErrImagePull, that usually means a 189 00:09:12.279 --> 00:09:15.639 typo in the image name in your YAML, or maybe 190 00:09:15.720 --> 00:09:18.919 Kubernetes doesn't have permission to pull from your registry. 191 00:09:18.559 --> 00:09:20.039 Right, authentication. Exactly. 192 00:09:20.559 --> 00:09:24.000 Or you might see a pod stuck in Pending. That means Kubernetes 193 00:09:24.039 --> 00:09:26.519 wants to schedule your pod, but it can't find a 194 00:09:26.559 --> 00:09:29.720 node with enough resources, CPU, memory, available 195 00:09:29.320 --> 00:09:31.279 for it. Ah, capacity issue. 196 00:09:31.399 --> 00:09:34.519 Yeah, and then there's the classic CrashLoopBackOff. 197 00:09:34.720 --> 00:09:38.360 That tells you your container is starting, crashing, restarting, crashing 198 00:09:38.120 --> 00:09:39.759 again. Usually an application bug, 199 00:09:39.799 --> 00:09:43.240 then? Often, yeah, or maybe it's missing a configuration or 200 00:09:43.320 --> 00:09:46.519 can't connect to a database. It means something inside 
It depends on something inside 201 00:09:46.519 --> 00:09:48.480 the container is failing repeatedly. 202 00:09:48.679 --> 00:09:49.639 So how do you debug that? 203 00:09:49.720 --> 00:09:52.759 Peek inside YEP, quebec dol exec is your friend. There. 204 00:09:52.960 --> 00:09:55.440 It lets you get a shell right inside the running containers. 205 00:09:55.480 --> 00:09:59.080 You can poke around, check files, run commands okay. 206 00:09:58.840 --> 00:10:02.240 And Quebeca logs FP is essential. It streams the logs 207 00:10:02.240 --> 00:10:04.720 from your container in real time, so you can see 208 00:10:04.759 --> 00:10:08.240 exactly what error messages your application is spitting out. 209 00:10:08.360 --> 00:10:12.399 Got it? So we deploy, maybe debug a bit. How 210 00:10:12.399 --> 00:10:14.960 do we actually make the application accessible? Like, give it 211 00:10:15.000 --> 00:10:15.799 an IP address. 212 00:10:15.919 --> 00:10:18.399 That's where the service object comes in. You create a 213 00:10:18.440 --> 00:10:21.440 service yammal. If you want external access from the Internet, 214 00:10:21.559 --> 00:10:25.000 you typically use a load balancer type service. The cloud 215 00:10:25.000 --> 00:10:27.919 provider will automatically provision a load balancer and give you 216 00:10:28.000 --> 00:10:29.559 an external IP okay. 217 00:10:29.600 --> 00:10:32.159 And for internal stuff like micro service is talking to 218 00:10:32.200 --> 00:10:32.600 each other. 219 00:10:32.799 --> 00:10:35.759 For that, you'd use a cluster IP type service. It 220 00:10:35.759 --> 00:10:38.679 gives you a stable internal IP address that other pods 221 00:10:38.679 --> 00:10:42.080 within the cluster can use to reach your application. The 222 00:10:42.120 --> 00:10:45.200 service uses labels, simple key value pairs you put on 223 00:10:45.240 --> 00:10:47.480 your pods to know which pods to send traffic to. 
224 00:10:47.759 --> 00:10:50.080 Labels and selectors, right. Exactly. 225 00:10:49.919 --> 00:10:53.320 And the beauty of this declarative approach? Updating your application 226 00:10:53.440 --> 00:10:55.960 is super simple. How so? You just update the image 227 00:10:55.960 --> 00:10:58.080 tag in your deployment YAML file to point to your 228 00:10:58.120 --> 00:11:01.080 new container image version. Then you run kubectl apply 229 00:11:01.120 --> 00:11:04.159 -f again with the updated file. And Kubernetes handles 230 00:11:04.159 --> 00:11:05.039 the rollout? Yep. 231 00:11:05.320 --> 00:11:08.960 It performs a rolling update by default, gradually replacing old 232 00:11:09.000 --> 00:11:11.759 pods with new ones. You can even watch the progress 233 00:11:11.879 --> 00:11:16.679 live using watch -d kubectl get deploy. It's remarkably smooth. 234 00:11:17.000 --> 00:11:21.080 That's incredibly powerful, that automated reconciliation and rollout. But just 235 00:11:21.080 --> 00:11:24.399 getting it deployed isn't the whole story. How does Kubernetes 236 00:11:24.399 --> 00:11:27.360 make sure our apps actually stay up and running even 237 00:11:27.399 --> 00:11:29.759 if things go wrong, or during those updates? Right. 238 00:11:29.840 --> 00:11:34.519 This is where automated operations and health checks become really critical. 239 00:11:34.879 --> 00:11:37.639 Kubernetes already gives you that basic self healing we talked 240 00:11:37.639 --> 00:11:41.600 about: restarting crashed containers, rescheduling pods from dead nodes. Yeah. 241 00:11:41.799 --> 00:11:45.480 But to be smarter than just basic restarts, Kubernetes needs 242 00:11:45.480 --> 00:11:48.840 signals from your application about its actual health. And that's 243 00:11:48.840 --> 00:11:49.840 where probes come in. 244 00:11:49.960 --> 00:11:51.440 Probes. Okay, what kinds are there? 245 00:11:51.519 --> 00:11:54.360 There 
are two main types you'll use constantly: liveness probes 246 00:11:54.399 --> 00:11:55.320 and readiness probes. 247 00:11:55.360 --> 00:11:57.000 Liveness and readiness, what's the difference? 248 00:11:57.200 --> 00:12:01.120 A liveness probe tells Kubernetes, is this application still alive 249 00:12:01.200 --> 00:12:05.960 and functioning? If the liveness probe fails, maybe your app 250 00:12:06.000 --> 00:12:09.600 is deadlocked or hung, Kubernetes knows it needs to restart 251 00:12:09.679 --> 00:12:11.320 that container to try and recover it. 252 00:12:11.440 --> 00:12:15.200 Okay, so liveness: restart if broken. Exactly. Now, 253 00:12:15.200 --> 00:12:18.240 a readiness probe tells Kubernetes, is this application ready to 254 00:12:18.360 --> 00:12:20.080 actually serve traffic right now? 255 00:12:20.559 --> 00:12:23.559 Ah, so it might be alive but not quite ready, 256 00:12:23.600 --> 00:12:25.120 like still starting up? Precisely. 257 00:12:25.559 --> 00:12:28.919 Think about an application that needs to load data or 258 00:12:28.960 --> 00:12:31.919 warm up caches when it starts. It might be running, 259 00:12:32.159 --> 00:12:34.799 so the liveness probe passes, but it's not ready to 260 00:12:34.840 --> 00:12:37.480 handle user requests yet. Right, okay. If the readiness probe 261 00:12:37.480 --> 00:12:40.639 isn't passing, Kubernetes won't send any traffic to that pod. 262 00:12:41.120 --> 00:12:44.919 This is absolutely crucial for achieving zero downtime updates, 263 00:12:44.559 --> 00:12:46.879 because it waits until the new pod is actually ready 264 00:12:46.879 --> 00:12:48.559 before sending users to it. 265 00:12:48.639 --> 00:12:51.000 You got it. No user ever hits an application that's 266 00:12:51.000 --> 00:12:53.279 still in the middle of booting up. Get your readiness 267 00:12:53.320 --> 00:12:56.879 probes right, and you unlock those seamless, zero downtime deployments. 
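
NOTE
The two probes discussed above are declared per container in the pod template. The fragment below is a sketch only: the /readyz and /healthz paths, the port, and every timing value are assumptions, and the application would need to actually serve those endpoints.
```yaml
# Fragment of a Deployment's pod template (spec.template.spec); values are placeholders
containers:
  - name: web
    image: registry.example.com/web:1.0.0
    readinessProbe:               # failing: the pod is taken out of Service traffic
      httpGet:
        path: /readyz
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    livenessProbe:                # failing repeatedly: the container is restarted
      httpGet:
        path: /healthz
        port: 80
      initialDelaySeconds: 15
      periodSeconds: 20
      failureThreshold: 3
```
Keeping the liveness check cheaper and more tolerant than the readiness check is a common choice, so that a slow start is treated as not ready rather than as dead.
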
268 00:12:56.919 --> 00:12:58.919 It's a massive win for user experience. 269 00:12:59.159 --> 00:13:03.320 That makes perfect sense. So assuming we've got those readiness 270 00:13:03.399 --> 00:13:05.960 checks nailed, ensuring no one sees a half baked app, 271 00:13:06.480 --> 00:13:09.039 what are the different ways we can actually roll out updates? 272 00:13:09.360 --> 00:13:11.600 You mentioned rolling updates, but are there other strategies? 273 00:13:11.919 --> 00:13:14.440 Yeah, there are a few main ones. Rolling update is 274 00:13:14.480 --> 00:13:17.240 the default, like I said, and honestly it's the best 275 00:13:17.320 --> 00:13:19.919 choice for most typical web services. 276 00:13:20.039 --> 00:13:20.879 How does it work again? 277 00:13:21.480 --> 00:13:25.600 It updates pods incrementally, in batches. It ensures a certain 278 00:13:25.639 --> 00:13:28.440 number are always available, brings up new ones, waits for them 279 00:13:28.480 --> 00:13:32.279 to be ready, thanks to readiness probes, then terminates old ones. 280 00:13:33.080 --> 00:13:36.000 It's smooth, continuous. 281 00:13:35.399 --> 00:13:37.200 Okay, keeps the service up the whole time. 282 00:13:37.279 --> 00:13:40.080 What else? Well, there's the blue green strategy. This is 283 00:13:40.159 --> 00:13:43.039 quite different. You deploy the entire new version, the green 284 00:13:43.159 --> 00:13:46.480 version, completely separate from the currently running blue version. 285 00:13:46.200 --> 00:13:47.759 So both are running at the same time? 286 00:13:47.639 --> 00:13:50.559 For a period, yes. Once you're happy the green version 287 00:13:50.600 --> 00:13:53.559 is working perfectly, you just switch the load balancer or 288 00:13:53.559 --> 00:13:56.120 a router to point all traffic from blue to green. 289 00:13:56.240 --> 00:13:59.279 Instantly. Instant cutover. Zero downtime 290 00:13:59.279 --> 00:14:03.320 there too, I presume? Yep, zero downtime. 
The big advantage is 291 00:14:03.440 --> 00:14:06.320 instant rollback. If green has issues, just flip the switch 292 00:14:06.399 --> 00:14:10.559 back to blue. The downside? You need roughly double the 293 00:14:10.600 --> 00:14:14.039 infrastructure resources during the transition period, which can be costly. 294 00:14:14.320 --> 00:14:18.000 Right, running two full copies. Makes sense. Any other options? 295 00:14:18.080 --> 00:14:21.639 There's also the simpler recreate strategy. This one just, well, 296 00:14:21.759 --> 00:14:23.720 it kills all the old pods first and then it 297 00:14:23.759 --> 00:14:24.759 creates all the new ones. 298 00:14:24.879 --> 00:14:26.639 Whoa, okay, so that definitely means 299 00:14:26.399 --> 00:14:29.879 downtime? Guaranteed downtime, yes. Okay. But it's simple, and sometimes 300 00:14:29.879 --> 00:14:32.919 it's necessary if the old and new versions are incompatible 301 00:14:32.960 --> 00:14:36.600 and can't run side by side for some reason. But generally, 302 00:14:36.840 --> 00:14:39.360 rolling update is your go-to workhorse. 303 00:14:38.960 --> 00:14:44.039 Fantastic breakdown. Okay. Beyond keeping things running, another huge concern 304 00:14:44.200 --> 00:14:47.840 is efficiency. How do we stop our apps from hogging resources, 305 00:14:47.919 --> 00:14:51.480 wasting money, or maybe even impacting other applications running on 306 00:14:51.519 --> 00:14:52.240 the same cluster? 307 00:14:52.519 --> 00:14:56.799 Resource management. Absolutely fundamental in Kubernetes, and you configure 308 00:14:56.799 --> 00:14:59.840 this right in your pod specification using requests and limits 309 00:14:59.840 --> 00:15:01.559 for CPU and memory. 310 00:15:01.360 --> 00:15:03.879 Requests and limits, what do they each do? Think of 
311 00:15:03.879 --> 00:15:07.759 requests as telling the Kubernetes scheduler, okay, to run reliably, 312 00:15:07.960 --> 00:15:10.399 this pod needs at least this much CPU and this 313 00:15:10.559 --> 00:15:14.679 much memory. It's a guarantee. The scheduler uses this info 314 00:15:14.799 --> 00:15:16.720 to decide where to place your pod, on a node 315 00:15:16.720 --> 00:15:19.080 that actually has those resources available. So 316 00:15:19.039 --> 00:15:21.759 it ensures the pod gets what it needs to start. Exactly. 317 00:15:22.399 --> 00:15:26.399 Limits, on the other hand, define the absolute maximum resources 318 00:15:26.440 --> 00:15:29.080 a pod is allowed to consume. If a pod tries 319 00:15:29.120 --> 00:15:31.919 to use more memory than its limit, Kubernetes will likely 320 00:15:32.000 --> 00:15:34.639 kill it. OOM killed, out of memory 321 00:15:34.679 --> 00:15:37.080 killed. Okay, hard stop for memory. What about CPU? 322 00:15:37.360 --> 00:15:40.600 If it exceeds its CPU limit, it just gets throttled. 323 00:15:40.720 --> 00:15:43.039 It won't be killed, but its performance will be capped. 324 00:15:43.039 --> 00:15:44.759 So setting both gives you predictability. 325 00:15:44.799 --> 00:15:48.919 Precisely. It defines a quality of service, or QoS, class 326 00:15:48.919 --> 00:15:52.759 for your pod. Kubernetes uses these values constantly to manage 327 00:15:52.799 --> 00:15:56.000 resource contention on the nodes. It might even evict lower 328 00:15:56.000 --> 00:15:58.960 priority pods that are exceeding their requests just to protect 329 00:15:59.039 --> 00:15:59.960 higher priority ones. 330 00:16:00.120 --> 00:16:02.120 You can set priorities? Yep, using something 331 00:16:01.919 --> 00:16:04.960 called PriorityClass. So how do you figure out the 332 00:16:05.039 --> 00:16:08.679 right values? Best practice is usually to start a bit generous with 333 00:16:08.720 --> 00:16:10.240 your requests and limits. 
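
NOTE
Requests and limits as described above are set per container. The numbers in this sketch are arbitrary starting points chosen for illustration, not recommendations from the episode, and the container name and image are placeholders.
```yaml
# Fragment of a pod template (spec.template.spec); all values are illustrative
containers:
  - name: web
    image: registry.example.com/web:1.0.0
    resources:
      requests:                   # what the scheduler reserves when placing the pod
        cpu: 250m                 # a quarter of one CPU core
        memory: 256Mi
      limits:                     # hard ceilings enforced at runtime
        cpu: "1"                  # CPU use above this is throttled, not killed
        memory: 256Mi             # memory use above this gets the container OOM killed
```
Here the CPU limit sits above the CPU request while the memory values match, which is one common pattern; measured usage under load should drive the final numbers.
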
334 00:16:10.519 --> 00:16:14.679 Okay. Then use monitoring tools, absolutely essential, to observe how 335 00:16:14.759 --> 00:16:18.320 much CPU and memory your application actually uses under load. 336 00:16:18.519 --> 00:16:22.120 And then tune them down? Exactly. Fine tune those requests 337 00:16:22.120 --> 00:16:25.639 and limits to match reality, avoiding waste but still giving 338 00:16:25.679 --> 00:16:28.399 your app what it needs. Quick tip for web apps: 339 00:16:29.039 --> 00:16:31.399 you can often set the CPU limit higher than the 340 00:16:31.399 --> 00:16:32.200 CPU request. 341 00:16:32.399 --> 00:16:33.720 Oh, why is that? 342 00:16:33.840 --> 00:16:36.960 It allows your app to burst, to temporarily use more 343 00:16:37.039 --> 00:16:40.600 CPU if the node has spare capacity available. Good for 344 00:16:40.639 --> 00:16:44.320 handling short spikes in traffic without over provisioning the guaranteed 345 00:16:44.320 --> 00:16:45.200 request all the time. 346 00:16:45.279 --> 00:16:49.600 That's a really smart optimization, balancing guarantees with burst potential. Okay, 347 00:16:49.600 --> 00:16:52.399 speaking of capacity and spikes, how do we handle that 348 00:16:52.480 --> 00:16:57.039 viral moment we talked about, where traffic just explodes? Or conversely, 349 00:16:57.120 --> 00:16:59.360 how do we save money when things are quiet? How 350 00:16:59.360 --> 00:17:01.200 does Kubernetes automate the scaling part? 351 00:17:01.440 --> 00:17:04.759 Right, automatic scaling. Kubernetes gives you two main tools here. 352 00:17:05.160 --> 00:17:08.559 For scaling your application pods horizontally, there's the Horizontal Pod 353 00:17:08.799 --> 00:17:10.440 Autoscaler, or HPA. 354 00:17:10.720 --> 00:17:12.880 HPA, what does it look at? The most common metric 355 00:17:12.960 --> 00:17:16.759 is CPU utilization. You can say, okay, if the average 356 00:17:16.759 --> 00:17:19.880 CPU across all my pods goes above sixty percent, 
357 00:17:20.160 --> 00:17:22.160 Add more pods. Simple enough. 358 00:17:22.000 --> 00:17:24.559 But it's way more flexible than just CPU. You can 359 00:17:24.559 --> 00:17:28.799 configure HPAs to scale based on memory usage or even 360 00:17:28.880 --> 00:17:32.599 custom metrics, like requests per second hitting your service, or 361 00:17:32.839 --> 00:17:35.119 maybe the number of messages sitting in a queue that 362 00:17:35.160 --> 00:17:36.880 your workers need to process. So 363 00:17:36.799 --> 00:17:40.599 you can tie scaling directly to your application's actual load drivers. 364 00:17:40.319 --> 00:17:44.279 Exactly. Makes it much more responsive and efficient. Now, that's scaling 365 00:17:44.319 --> 00:17:47.680 your pods. What about the underlying machines, the nodes? 366 00:17:47.799 --> 00:17:49.599 Yeah, if you add more pods, you might run out 367 00:17:49.640 --> 00:17:52.079 of node capacity, right? Back to that Pending state. 368 00:17:52.200 --> 00:17:55.839 Precisely. That's where the cluster autoscaler comes in. This is 369 00:17:55.920 --> 00:17:59.400 usually a component provided by your cloud provider or installed separately. 370 00:18:00.039 --> 00:18:02.200 It watches for pods that are stuck in Pending because of 371 00:18:02.240 --> 00:18:05.799 resource constraints, and it automatically adds more nodes to the 372 00:18:05.799 --> 00:18:09.000 cluster to accommodate them. Nice. And just as importantly, if 373 00:18:09.079 --> 00:18:12.359 nodes become underutilized for a while, it will consolidate the 374 00:18:12.359 --> 00:18:15.359 pods onto fewer nodes and then terminate the unnecessary ones 375 00:18:15.359 --> 00:18:15.880 to save 376 00:18:15.759 --> 00:18:18.519 costs. So it scales the infrastructure up and down too. 377 00:18:18.960 --> 00:18:19.480 Very cool. 378 00:18:19.720 --> 00:18:23.279 What about really fast scaling needs, like almost instant? 379 00:18:23.480 --> 00:18:27.240 Ah. 
Yeah, for super rapid scaling, there's a clever technique 380 00:18:27.319 --> 00:18:30.960 using placeholder pods. You deploy these special pods with a 381 00:18:31.079 --> 00:18:35.160 very low priority. They basically just sit there reserving resources 382 00:18:35.240 --> 00:18:39.160 and occupying space. Okay, why? Because when your real application 383 00:18:39.319 --> 00:18:41.839 needs to scale up quickly due to a sudden spike, 384 00:18:42.240 --> 00:18:46.160 its new higher priority pods can immediately preempt those low 385 00:18:46.160 --> 00:18:50.119 priority placeholder pods. They get kicked out, freeing up resources 386 00:18:50.160 --> 00:18:53.319 instantly for your critical application. It gives you immediate headroom 387 00:18:53.480 --> 00:18:55.440 without waiting for new nodes to spin up. 388 00:18:55.559 --> 00:18:58.400 That's a neat trick. So underlying all this scaling tech, 389 00:18:58.480 --> 00:19:01.400 what's the most important design principle for building an application 390 00:19:01.440 --> 00:19:02.599 that can actually scale like this? 391 00:19:02.960 --> 00:19:05.920 Oh, hands down, the single most important thing is avoiding 392 00:19:05.960 --> 00:19:10.640 local state. Your application replicas need to be stateless, 393 00:19:10.400 --> 00:19:14.720 meaning any replica can handle any incoming request without needing 394 00:19:14.759 --> 00:19:18.359 specific data stored only on that particular instance, or data 395 00:19:18.440 --> 00:19:22.039 from another specific instance. All the necessary state should be 396 00:19:22.119 --> 00:19:26.200 external, in a database, a cache, object storage, whatever. 397 00:19:26.319 --> 00:19:29.799 Because if any replica can handle any request, Kubernetes can 398 00:19:29.839 --> 00:19:32.880 just add or remove replicas without worrying about losing data 399 00:19:32.960 --> 00:19:34.559 or breaking user sessions. Exactly. 400 00:19:34.559 --> 00:19:38.400 It makes scaling effortless. 
Statelessness is foundational. 401 00:19:37.920 --> 00:19:41.680