WEBVTT 1 00:00:00.160 --> 00:00:03.160 Welcome to the deep dive. Now you gave us a 2 00:00:03.200 --> 00:00:06.599 really interesting stack of sources this time, a blueprint really 3 00:00:06.639 --> 00:00:11.519 for modern enterprise tech. And our mission today is, well, 4 00:00:11.640 --> 00:00:13.880 let's cut through the jargon. We want to make you 5 00:00:14.000 --> 00:00:18.120 instantly well informed about cloud native application development. This isn't 6 00:00:18.160 --> 00:00:22.480 about just dry definitions. It's about understanding the core ideas, 7 00:00:23.000 --> 00:00:27.719 the philosophy, and the tools that really changed software 8 00:00:27.359 --> 00:00:28.800 development. Fundamentally changed it. 9 00:00:28.879 --> 00:00:31.239 We're talking about that shift, you know, from apps that 10 00:00:31.280 --> 00:00:34.920 took months, maybe years to roll out, oh, to systems 11 00:00:34.960 --> 00:00:38.399 that demand new features in, well, days. 12 00:00:38.000 --> 00:00:41.719 And that need for speed, for constant evolution, that's the 13 00:00:41.759 --> 00:00:44.600 core driver here. If you look at the sources, the 14 00:00:44.640 --> 00:00:48.799 big move is crystal clear. It's leaving behind the monolithic 15 00:00:48.920 --> 00:00:50.719 architecture. Monoliths. 16 00:00:51.079 --> 00:00:54.119 We all remember that latency pain, don't we? These massive 17 00:00:54.159 --> 00:00:58.479 applications built as just one single giant unit, indivisible. They 18 00:00:58.520 --> 00:01:00.320 were hard to scale. We know that. But what was 19 00:01:00.320 --> 00:01:03.759 the real headache, the biggest pain point, organizationally maybe? 20 00:01:03.920 --> 00:01:07.879 Well, yeah, scaling was tough. If one tiny part, like 21 00:01:08.319 --> 00:01:11.239 search on an e-commerce site, got hammered with traffic, right, 22 00:01:11.760 --> 00:01:15.799 you had to clone the entire huge application. Massive resource waste.
23 00:01:15.959 --> 00:01:21.920 But honestly, maybe worse was deployment. Developer fixes one tiny 24 00:01:22.000 --> 00:01:26.159 bug deep in some module. They couldn't just deploy that fix. 25 00:01:26.959 --> 00:01:29.959 Oh no, you had to test everything. 26 00:01:29.599 --> 00:01:33.319 Everything, full regression testing on the whole beast. So releases 27 00:01:33.359 --> 00:01:36.879 were slow, risky, took weeks sometimes to coordinate. 28 00:01:37.000 --> 00:01:39.640 Okay, so microservices come in to fix that, breaking 29 00:01:39.719 --> 00:01:44.000 the app down into the smallest logical level, loosely coupled services. 30 00:01:44.319 --> 00:01:48.000 The key advantage is that granular control, selective scaling. Search 31 00:01:48.040 --> 00:01:50.840 service overloaded? Fine, just scale that, not the whole thing. 32 00:01:51.000 --> 00:01:54.200 The order service, accounts, yeah, they just sit there doing 33 00:01:54.200 --> 00:01:58.560 their job. And there's another cool benefit the sources highlight: polyglot 34 00:01:58.079 --> 00:01:59.959 programming. Meaning using different languages? 35 00:02:00.120 --> 00:02:03.000 Right, use the best tool for the specific job. Maybe 36 00:02:03.040 --> 00:02:05.519 Python for, I don't know, a reporting microservice because 37 00:02:05.519 --> 00:02:09.039 its data libraries are great, but use Java for your 38 00:02:09.080 --> 00:02:12.199 high performance transaction service. Get that flexibility. 39 00:02:12.400 --> 00:02:15.599 So the tech enables breaking things up. But let's be honest, 40 00:02:15.759 --> 00:02:18.680 this whole shift wouldn't have happened without the money side 41 00:02:18.759 --> 00:02:19.280 changing too.
42 00:02:19.360 --> 00:02:21.719 Right, that pay as you go model. Oh, absolutely, that's 43 00:02:21.719 --> 00:02:25.639 the economic engine that really drove cloud adoption, paying only 44 00:02:25.680 --> 00:02:29.400 for what you actually use. And what's really interesting is 45 00:02:29.759 --> 00:02:33.840 how it lowered the barrier to just trying things out, experimentation. 46 00:02:34.159 --> 00:02:37.439 How so? Well, think back. If you needed a powerful 47 00:02:37.479 --> 00:02:41.560 server just to, say, run some sample code and validate fixes, yeah, 48 00:02:41.680 --> 00:02:44.400 you had to buy the hardware first, a big capital expense. 49 00:02:44.919 --> 00:02:47.400 Now you rent a virtual machine, maybe a quad core, 50 00:02:47.840 --> 00:02:49.400 use it for twenty minutes, shut 51 00:02:49.199 --> 00:02:51.919 it down, and pay for twenty minutes. Exactly, 52 00:02:51.560 --> 00:02:54.280 just the time you use it. Completely changes the economics 53 00:02:54.280 --> 00:02:55.319 of development and testing. 54 00:02:55.840 --> 00:02:59.639 That ability to just grab resources when needed, it must 55 00:02:59.639 --> 00:03:02.759 have slashed overheads. Okay, so we know why the shift happened. 56 00:03:03.639 --> 00:03:07.520 Let's talk about where: the cloud environment itself. The sources 57 00:03:07.599 --> 00:03:10.280 use this great layer-by-layer metaphor for the XaaS 58 00:03:10.319 --> 00:03:14.680 categories: IaaS, PaaS, SaaS. It gets confusing. So where does my 59 00:03:14.800 --> 00:03:17.000 responsibility end and the cloud provider's begin? 60 00:03:17.199 --> 00:03:20.080 Right. This hierarchy is key because it defines your cost, 61 00:03:20.240 --> 00:03:23.039 your effort, and how much control you have. Let's start 62 00:03:23.039 --> 00:03:26.439 at the bottom: traditional on-premise. Okay.
Here you manage 63 00:03:26.479 --> 00:03:29.560 everything: the building, the power, the cooling, the servers, the network, 64 00:03:29.599 --> 00:03:33.199 the OS, middleware, the app, the data, all of it. 65 00:03:33.800 --> 00:03:34.479 Your problem. 66 00:03:34.560 --> 00:03:38.639 Total control, total responsibility. So climbing one step up, IaaS, 67 00:03:38.840 --> 00:03:40.199 infrastructure as a service. 68 00:03:40.280 --> 00:03:43.159 What changes with IaaS? Think of something like an AWS 69 00:03:43.240 --> 00:03:47.039 EC2 instance, a basic virtual machine. The cloud provider 70 00:03:47.080 --> 00:03:52.319 handles the fundamental infrastructure: the servers, storage, networking, the physical stuff. 71 00:03:52.360 --> 00:03:55.120 But I still manage... You're still responsible for the operating system, 72 00:03:55.159 --> 00:03:58.840 any middleware, the runtime environment, your application and your data. 73 00:03:59.240 --> 00:04:00.560 Still quite a bit to look after. 74 00:04:00.840 --> 00:04:04.080 Okay, so PaaS, platform as a service, must take more 75 00:04:04.120 --> 00:04:04.639 off my plate, 76 00:04:04.680 --> 00:04:08.879 then. It does, significantly more. With PaaS, the cloud provider 77 00:04:08.960 --> 00:04:13.319 manages the OS, the middleware, and the runtime environments. Ah. So, 78 00:04:13.439 --> 00:04:15.599 as the user, you really only need to focus on 79 00:04:15.639 --> 00:04:19.480 your application code and its associated data. Think Amazon 80 00:04:19.560 --> 00:04:21.439 Elastic Beanstalk. You just upload your 81 00:04:21.360 --> 00:04:26.000 code and the platform handles the rest, provisioning, scaling. Pretty much. 82 00:04:26.079 --> 00:04:29.199 Yeah. It abstracts away a lot of the operational burden. 83 00:04:28.959 --> 00:04:31.439 And then SaaS, software as a service, is just 84 00:04:31.600 --> 00:04:34.439 the finished product, like logging into Gmail or Office 85 00:04:34.560 --> 00:04:35.079 365.
86 00:04:35.240 --> 00:04:38.480 Exactly, you're purely a consumer: log in, use the software. 87 00:04:38.519 --> 00:04:39.839 The provider handles everything else. 88 00:04:39.879 --> 00:04:42.000 But the cloud didn't stop there, did it? There's FaaS, 89 00:04:42.079 --> 00:04:43.399 functions as a service. 90 00:04:43.199 --> 00:04:47.079 Right, FaaS, like AWS Lambda or Azure Functions. This is 91 00:04:47.319 --> 00:04:50.160 like the ultimate level of abstraction. The smallest unit. 92 00:04:50.240 --> 00:04:50.920 Your smallest unit. 93 00:04:51.000 --> 00:04:53.879 Yeah, you only manage the function itself, the actual snippet 94 00:04:53.879 --> 00:04:58.319 of code. The cloud handles everything else: provisioning servers, scaling up, 95 00:04:58.560 --> 00:05:02.600 scaling down, even scaling to zero when it's not being used, 96 00:05:02.639 --> 00:05:06.279 and the billing reflects that precisely. You're charged only for 97 00:05:06.360 --> 00:05:08.959 the exact time your code is running. If it runs 98 00:05:08.959 --> 00:05:12.519 for, say, thirty milliseconds, you pay for thirty milliseconds. That 99 00:05:12.600 --> 00:05:16.480 efficiency is just game changing. 100 00:05:16.639 --> 00:05:19.439 Okay, that clarifies the layers, the what. But here's the 101 00:05:19.519 --> 00:05:23.720 crucial part: the mindset shift. Just lifting and shifting your 102 00:05:23.759 --> 00:05:28.079 old monolith onto an IaaS VM, that doesn't really unlock the 103 00:05:28.079 --> 00:05:30.120 cloud's power, does it? Not at all. 104 00:05:30.279 --> 00:05:32.439 That's just running your old problems in a new location. 105 00:05:32.759 --> 00:05:36.040 Right. The sources really stress this. Building true cloud native 106 00:05:36.079 --> 00:05:39.800 apps means unlearning old habits. You have to design for, well, 107 00:05:40.120 --> 00:05:41.360 volatility. Absolutely. 108 00:05:41.439 --> 00:05:45.439 Cloud native design assumes failure is normal.
The old monolith 109 00:05:45.519 --> 00:05:47.839 mindset was, you know, the server's precious, keep it running 110 00:05:47.839 --> 00:05:50.680 at all costs. In the cloud, servers are cattle, not pets. 111 00:05:50.680 --> 00:05:52.759 They're disposable. Components will fail. 112 00:05:52.519 --> 00:05:56.199 Which brings us right to design factor one: embrace failure. Specifically, 113 00:05:56.480 --> 00:05:57.839 no single point of failure. 114 00:05:58.000 --> 00:06:01.160 Correct, and the cornerstone of building resilient systems like this 115 00:06:01.439 --> 00:06:02.319 is statelessness. 116 00:06:02.360 --> 00:06:05.279 Okay, statelessness. What does that mean in practice? Give us 117 00:06:05.279 --> 00:06:05.879 an analogy. 118 00:06:06.040 --> 00:06:10.120 Okay, think about ordering food. Right, a stateful restaurant: only 119 00:06:10.199 --> 00:06:13.120 the waiter who took your order knows what you ordered 120 00:06:13.199 --> 00:06:15.519 or where you are in your meal. If that specific 121 00:06:15.519 --> 00:06:19.199 waiter goes home, you're stuck. Your state is lost with them. Right, I 122 00:06:19.160 --> 00:06:21.279 can see that. Annoying. 123 00:06:21.160 --> 00:06:25.120 Very. Now, a stateless restaurant: your order is written down, 124 00:06:25.319 --> 00:06:29.439 maybe put into a central system. Any available waiter, any 125 00:06:29.480 --> 00:06:32.600 server instance, can look up your order details and continue 126 00:06:32.680 --> 00:06:33.839 serving you seamlessly. 127 00:06:34.279 --> 00:06:37.879 Ah, the system knows, not the individual server. Exactly. 128 00:06:38.079 --> 00:06:41.399 The service itself doesn't hold onto session memory between requests. 129 00:06:41.920 --> 00:06:45.160 The state, the data, is stored elsewhere, maybe a database 130 00:06:45.240 --> 00:06:47.959 or cache accessible to all instances. That's what lets you 131 00:06:48.000 --> 00:06:50.680 easily add or remove servers: horizontal scaling.
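A minimal sketch of the stateless restaurant described above, in Python. The names OrderStore and Waiter are hypothetical, and the in-memory dictionary only stands in for the shared database or cache; none of this comes from the sources themselves.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OrderStore:
    """Hypothetical shared store, standing in for a database or cache."""
    _orders: dict[str, list[str]] = field(default_factory=dict)

    def add_item(self, table: str, item: str) -> None:
        self._orders.setdefault(table, []).append(item)

    def items_for(self, table: str) -> list[str]:
        # Return a copy so callers cannot mutate the stored order.
        return list(self._orders.get(table, []))


class Waiter:
    """A stateless server instance: it keeps no per-customer memory."""

    def __init__(self, name: str, store: OrderStore) -> None:
        self.name = name
        self.store = store  # all state lives in the shared store

    def take_order(self, table: str, item: str) -> None:
        self.store.add_item(table, item)

    def serve(self, table: str) -> list[str]:
        return self.store.items_for(table)


store = OrderStore()
alice = Waiter("alice", store)
bob = Waiter("bob", store)

alice.take_order("table-7", "soup")
del alice  # the first waiter "goes home"
served = bob.serve("table-7")  # any other waiter can pick the order up
```

Because neither waiter holds the order, instances can be added or removed freely, which is the property that makes horizontal scaling straightforward.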
You can have 132 00:06:50.720 --> 00:06:52.680 one hundred identical interchangeable servers. 133 00:06:52.759 --> 00:06:55.160 That makes total sense. Okay, so failure is inevitable. The 134 00:06:55.240 --> 00:06:59.000 system needs to not just survive it, but handle it gracefully. 135 00:06:59.279 --> 00:07:03.560 Yeah, yes, fail fast. That's crucial, often done using the 136 00:07:03.600 --> 00:07:04.600 circuit breaker pattern. 137 00:07:04.720 --> 00:07:07.000 Circuit breaker, like in my house? Sort of. 138 00:07:07.399 --> 00:07:10.360 You don't want a struggling service to just hang, silently 139 00:07:10.439 --> 00:07:13.959 failing for minutes, causing backups everywhere else. If a downstream 140 00:07:14.000 --> 00:07:17.040 service is clearly having trouble... Cut it off. Exactly, the 141 00:07:17.079 --> 00:07:20.879 circuit breaker trips. It stops sending requests to that failing service 142 00:07:20.920 --> 00:07:23.680 for a short period, preventing overload and giving it a 143 00:07:23.720 --> 00:07:28.000 chance to recover or be replaced. Then, importantly, the calling 144 00:07:28.040 --> 00:07:32.040 service needs to handle that failure gracefully, meaning don't just 145 00:07:32.040 --> 00:07:35.279 show an error page. If live search is down, maybe pull 146 00:07:35.319 --> 00:07:39.680 results from a cache or show top-selling products, something useful, 147 00:07:39.759 --> 00:07:40.600 not just a dead end. 148 00:07:40.759 --> 00:07:44.879 Okay, resilience is built on statelessness and failing fast. But 149 00:07:45.000 --> 00:07:48.839 if we have potentially hundreds of these small services, managing 150 00:07:48.839 --> 00:07:52.680 that manually sounds impossible, which leads to design factor two: 151 00:07:53.319 --> 00:07:55.959 automation is king. Absolutely, non-negotiable. 152 00:07:56.000 --> 00:07:58.839 With potentially hundreds of microservices, manual management is a 153 00:07:58.839 --> 00:08:02.720 recipe for disaster.
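The circuit breaker and graceful fallback described a moment ago can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not a production library: the CircuitBreaker class, its thresholds, and the live_search and top_sellers functions are all hypothetical names invented for the example.

```python
import time


class CircuitBreaker:
    """Hypothetical breaker: trips after repeated failures, then fails fast."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after  # seconds to wait before a trial call
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()  # fail fast, do not touch the sick service
            # Cool-down is over: allow one trial call (half-open state).
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # trip the breaker
            return fallback()  # degrade gracefully instead of erroring out
        self.failures = 0
        return result


calls = {"live": 0}


def live_search():
    """Stand-in for a downstream search service that is currently failing."""
    calls["live"] += 1
    raise TimeoutError("search service not responding")


def top_sellers():
    """Stand-in for cached results or a list of top-selling products."""
    return ["top-seller-1", "top-seller-2"]


breaker = CircuitBreaker(max_failures=2, reset_after=60.0)
results = [breaker.call(live_search, top_sellers) for _ in range(5)]
```

In this run only the first two calls reach the failing service; once the breaker trips, the remaining calls return the fallback immediately, which is the "fail fast" behavior the pattern is meant to give.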
You must automate testing, deployment, that's your 154 00:08:02.800 --> 00:08:04.560 CI/CD pipeline, 155 00:08:04.040 --> 00:08:07.439 and monitoring. To eliminate human error? That, and just 156 00:08:07.399 --> 00:08:10.279 to cope with the scale. This automation is also key 157 00:08:10.319 --> 00:08:15.160 to building a self-healing system. Self-healing, like Wolverine? Huh, well, 158 00:08:15.759 --> 00:08:18.519 maybe not quite that fast, but the system needs to 159 00:08:18.600 --> 00:08:23.079 automatically detect and recover from failures without needing a human 160 00:08:23.120 --> 00:08:27.240 to step in. That could mean Kubernetes automatically restarting a 161 00:08:27.319 --> 00:08:28.959 failed container instance, 162 00:08:28.720 --> 00:08:30.199 or redirecting traffic, 163 00:08:29.959 --> 00:08:33.399 or spinning up more instances as the load increases. The 164 00:08:33.440 --> 00:08:34.799 system manages itself. 165 00:08:35.159 --> 00:08:38.240 Ideally. And for developers actually building these things, the sources 166 00:08:38.320 --> 00:08:40.799 kept mentioning the twelve-factor app philosophy. Is that like 167 00:08:40.840 --> 00:08:41.840 a checklist? It's more a 168 00:08:41.799 --> 00:08:45.480 set of principles, yeah. A widely accepted guide for building robust, 169 00:08:45.799 --> 00:08:49.039 scalable services for the cloud. It covers things like how 170 00:08:49.039 --> 00:08:51.919 to handle configuration, logs, dependencies. 171 00:08:52.240 --> 00:08:54.559 What's a key benefit of following those rules? 172 00:08:54.720 --> 00:08:59.440 Standardization and predictability. Take configuration, for example. Factor three says 173 00:08:59.440 --> 00:09:01.679 store config in the environment, not in the code. 174 00:09:01.840 --> 00:09:02.960 Why is that so important?
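As a rough illustration of factor three, here is one way "config in the environment, not in the code" might look in Python. The Settings class, the load_settings function, and the variable names DATABASE_URL, API_KEY and DEBUG are assumptions made up for this sketch; the sources do not prescribe them.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    database_url: str
    api_key: str
    debug: bool


def load_settings(env=os.environ) -> Settings:
    """Build settings from environment variables (hypothetical names)."""
    return Settings(
        database_url=env["DATABASE_URL"],  # required: KeyError if missing
        api_key=env["API_KEY"],            # required: KeyError if missing
        debug=env.get("DEBUG", "false").lower() == "true",  # optional flag
    )


# The same code runs everywhere; only the environment differs.
# Plain dicts stand in for each environment's variables here.
staging = load_settings({
    "DATABASE_URL": "postgres://staging-db/app",
    "API_KEY": "staging-key",
    "DEBUG": "true",
})
production = load_settings({
    "DATABASE_URL": "postgres://prod-db/app",
    "API_KEY": "prod-key",
})
```

In a real deployment load_settings() would be called with no argument so that it reads os.environ, and nothing environment-specific would be baked into the code itself.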
175 00:09:03.080 --> 00:09:05.639 Because then the exact same compiled code artifact can be 176 00:09:05.679 --> 00:09:10.360 deployed unchanged across development, test, staging, production. You just change 177 00:09:10.399 --> 00:09:14.240 the environment variables for database connections, API keys, et cetera. It 178 00:09:14.279 --> 00:09:17.559 makes deployments much faster and safer, eliminates a huge source 179 00:09:17.559 --> 00:09:18.080 of errors. 180 00:09:18.600 --> 00:09:22.679 Got it. Okay, so we have the design mindset: embrace failure, 181 00:09:22.799 --> 00:09:26.440 automate everything. Now let's connect that to the toolkit. What 182 00:09:26.639 --> 00:09:29.799 technologies actually make this happen? How do we deploy and 183 00:09:29.879 --> 00:09:30.559 run these things? 184 00:09:30.600 --> 00:09:34.240 Right, the implementation. It really starts with containers. We mentioned 185 00:09:34.559 --> 00:09:35.799 VMs being heavy 186 00:09:35.519 --> 00:09:37.320 because they include a full OS. 187 00:09:37.159 --> 00:09:41.039 Right, containers are way lighter. They allow for much greater density. 188 00:09:41.120 --> 00:09:44.000 Remind us why they're lighter again. What's the core difference? 189 00:09:44.240 --> 00:09:47.480 They share the host operating system's kernel, so instead of 190 00:09:47.519 --> 00:09:50.919 each app needing its own complete OS like a VM does, 191 00:09:50.879 --> 00:09:53.080 like separate cars with engines, yeah, 192 00:09:52.960 --> 00:09:55.799 containers are more like everyone sharing the car's engine, the 193 00:09:55.840 --> 00:09:58.519 host OS kernel, but each having their own secure 194 00:09:58.639 --> 00:10:02.080 passenger cabin built around. You're just packaging the application and 195 00:10:02.240 --> 00:10:04.440 its dependencies, not the whole OS stack. 196 00:10:04.639 --> 00:10:07.039 So you can pack way more onto the same hardware, 197 00:10:07.240 --> 00:10:08.440 more efficient, faster.
198 00:10:08.240 --> 00:10:11.600 to start up. Exactly, huge boost in agility. But then 199 00:10:11.639 --> 00:10:16.360 if you've got tens, maybe hundreds or thousands of these containers, 200 00:10:17.279 --> 00:10:20.360 you need management, air traffic control, 201 00:10:20.080 --> 00:10:21.600 basically. And that's Kubernetes. 202 00:10:22.000 --> 00:10:25.399 That's Kubernetes, or K8s, as it's often called. It's 203 00:10:25.440 --> 00:10:29.159 become the de facto standard for container orchestration. Orchestration meaning 204 00:10:29.360 --> 00:10:33.360 it automates the deployment, scaling, load balancing, and crucially, the 205 00:10:33.440 --> 00:10:37.720 healing of containerized applications across a cluster of machines. If 206 00:10:37.720 --> 00:10:40.440 a container running your service crashes, 207 00:10:40.240 --> 00:10:42.240 K8s notices and starts a new one. 208 00:10:42.159 --> 00:10:45.519 Yep, automatically. It handles that complexity so developers can focus 209 00:10:45.559 --> 00:10:48.080 on code, not infrastructure babysitting. 210 00:10:48.120 --> 00:10:51.360 Okay, containers managed by K8s. We mentioned speed earlier. That 211 00:10:51.360 --> 00:10:55.159 comes from continuous integration and continuous delivery, CI/CD. Right. 212 00:10:55.279 --> 00:10:58.720 CI/CD pipelines ensure that your code is constantly being built, 213 00:10:59.200 --> 00:11:03.200 tested and made ready for deployment. This enables those frequent, small, 214 00:11:03.240 --> 00:11:06.320 low-risk releases we talked about. Instead of massive, scary 215 00:11:06.360 --> 00:11:07.480 deployments once a quarter, 216 00:11:07.639 --> 00:11:10.120 you deploy small changes, maybe multiple times a day. 217 00:11:10.399 --> 00:11:13.000 That's the goal.
But how do you make sure the 218 00:11:13.080 --> 00:11:17.679 underlying infrastructure, the Kubernetes cluster itself, the networking, the databases, 219 00:11:18.159 --> 00:11:21.039 is set up correctly and consistently every single time? 220 00:11:21.240 --> 00:11:24.720 Good question. Manual setup seems error prone. 221 00:11:24.480 --> 00:11:28.759 Highly error prone. That's where infrastructure as code, or IaC, 222 00:11:29.120 --> 00:11:30.039 is absolutely vital. 223 00:11:30.159 --> 00:11:32.000 Infrastructure as code? Yeah. 224 00:11:32.279 --> 00:11:34.799 Instead of clicking around in a cloud provider's web console 225 00:11:35.080 --> 00:11:37.120 to set things up, which is slow and impossible to 226 00:11:37.159 --> 00:11:39.440 replicate perfectly, you write 227 00:11:39.240 --> 00:11:42.080 scripts. Scripts that define the infrastructure? 228 00:11:41.480 --> 00:11:46.000 Exactly, using tools like AWS CloudFormation, Azure ARM 229 00:11:46.080 --> 00:11:51.679 templates, or Terraform. These scripts define your servers, networks, load balancers, everything. 230 00:11:52.159 --> 00:11:54.639 You check this code into version control just like your 231 00:11:54.639 --> 00:11:55.519 application code. 232 00:11:55.559 --> 00:11:58.679 Ah. So it's repeatable, auditable, and testable. 233 00:11:58.879 --> 00:12:01.960 You can spin up an entire identical environment for development, testing, 234 00:12:02.039 --> 00:12:05.679 or production just by running the script. It eliminates configuration 235 00:12:05.879 --> 00:12:09.080 drift and the classic "well, it worked on my machine" problem. 236 00:12:09.120 --> 00:12:12.279 Okay, that consistency is key. Now we have hundreds of 237 00:12:12.279 --> 00:12:16.200 services, automated deployment. How do we avoid the pay as 238 00:12:16.240 --> 00:12:19.320 you go model becoming pay way too much? How do 239 00:12:19.360 --> 00:12:21.519 we manage costs and spot problems?
240 00:12:21.639 --> 00:12:24.720 That comes down to proactive monitoring and alerting. It's not optional, 241 00:12:24.759 --> 00:12:25.399 it's essential. 242 00:12:25.519 --> 00:12:27.159 Not just for finding bugs? 243 00:12:27.080 --> 00:12:30.679 No, it's critical for cost management too. You need visibility 244 00:12:30.720 --> 00:12:34.519 into how all these services are behaving. Are resources being underutilized? 245 00:12:34.879 --> 00:12:39.480 That's wasted money. Are they being overutilized? That risks performance 246 00:12:39.519 --> 00:12:40.399 issues or failures. 247 00:12:40.480 --> 00:12:43.480 So you need centralized logging and metrics. Absolutely. 248 00:12:43.759 --> 00:12:47.360 Tools like AWS CloudWatch or open source stacks like 249 00:12:47.399 --> 00:12:51.559 the ELK stack (Elasticsearch, Logstash, Kibana) are common. 250 00:12:52.039 --> 00:12:54.960 They gather logs and metrics from all your microservices 251 00:12:55.000 --> 00:12:55.559 into one 252 00:12:55.399 --> 00:12:57.679 place, so you can see the whole picture and set up 253 00:12:57.559 --> 00:13:01.399 alerts for anomalies: errors, high latency, unusual resource consumption, so 254 00:13:01.440 --> 00:13:03.879 you can react before it impacts users or your bill. 255 00:13:03.960 --> 00:13:07.360 Makes sense. One last piece: security. Moving from one big 256 00:13:07.399 --> 00:13:11.480 monolith to hundreds of distributed services must change the security 257 00:13:11.480 --> 00:13:12.240 game completely. 258 00:13:12.360 --> 00:13:14.519 It absolutely does. You can't just put a big firewall 259 00:13:14.600 --> 00:13:17.360 around the monolith anymore. Cloud native security relies heavily 260 00:13:17.399 --> 00:13:21.320 on two things: fine-grained access control and network segmentation. 261 00:13:21.399 --> 00:13:22.360 Okay, break those down.
262 00:13:22.519 --> 00:13:26.960 Role-based access control, or RBAC, IAM in AWS, is 263 00:13:26.960 --> 00:13:31.279 about who can do what. Principle of least privilege: users 264 00:13:31.600 --> 00:13:35.279 or even services themselves only get the absolute minimum permissions 265 00:13:35.279 --> 00:13:39.159 they need to function. No more generic admin keys floating around. 266 00:13:39.200 --> 00:13:41.320 And network segmentation, that's about 267 00:13:41.159 --> 00:13:45.320 controlling what can talk to what. You use virtual networks, subnets, 268 00:13:45.480 --> 00:13:50.080 security groups, essentially internal firewalls, to isolate services. The payment 269 00:13:50.159 --> 00:13:52.240 service should only be allowed to talk to the specific 270 00:13:52.320 --> 00:13:55.159 database it needs and maybe the order service. It shouldn't 271 00:13:55.159 --> 00:13:56.879 be able to reach the user profile service, for 272 00:13:56.840 --> 00:14:00.440 example. Containing the blast radius if something gets compromised. 273 00:14:00.279 --> 00:14:03.000 Exactly. Zero trust principles become much more important. 274 00:14:03.159 --> 00:14:06.679 Wow. Okay, so looking back, it's quite a journey. We 275 00:14:06.720 --> 00:14:10.240 went from the monolithic bottleneck to these nimble microservices. 276 00:14:10.720 --> 00:14:14.240 We navigated the XaaS models, the pay as you go economics, 277 00:14:14.600 --> 00:14:20.639 and critically adopted this design mindset focused on resilience, statelessness, automation, 278 00:14:20.960 --> 00:14:21.919 assuming failure. 279 00:14:22.200 --> 00:14:24.679 And I think the biggest takeaway really, the thing that 280 00:14:24.720 --> 00:14:28.159 should guide anyone starting this journey or refining it, is 281 00:14:28.159 --> 00:14:32.000 that proactive planning right at the design phase.
That's what 282 00:14:32.080 --> 00:14:35.480 saves you those thousands of dollars and man hours down 283 00:14:35.559 --> 00:14:36.000 the line. 284 00:14:36.080 --> 00:14:38.360 You can't just tack this stuff on later. 285 00:14:38.279 --> 00:14:41.759 No, you really can't. Trying to retrofit cloud native ideas 286 00:14:41.759 --> 00:14:46.759 onto an old monolithic design, it's usually painful and often fails. 287 00:14:47.240 --> 00:14:48.679 You have to build it in from day 288 00:14:48.559 --> 00:14:51.080 one. Right, which brings us nicely to our final thought 289 00:14:51.159 --> 00:14:53.960 for you, the listener, to ponder. We talked a lot 290 00:14:53.960 --> 00:14:57.720 about designing for failure, and the sources mentioned this powerful concept, 291 00:14:57.879 --> 00:14:59.240 the bulkhead pattern. 292 00:14:59.360 --> 00:15:01.639 Ah, yes, the ship analogy. Exactly. 293 00:15:01.879 --> 00:15:05.159 A bulkhead is that watertight wall inside a ship's hull. 294 00:15:05.399 --> 00:15:08.120 If there's a leak or a fire in one compartment, 295 00:15:07.679 --> 00:15:10.919 the bulkhead contains it. It stops the disaster from spreading 296 00:15:10.919 --> 00:15:13.039 and sinking the whole ship. The rest of the vessel 297 00:15:13.200 --> 00:15:14.120 stays operational. 298 00:15:14.720 --> 00:15:18.000 So the question for you is: where in your business, 299 00:15:18.039 --> 00:15:21.240 in your systems, in your organization can you deliberately apply 300 00:15:21.279 --> 00:15:22.559 that bulkhead pattern? 301 00:15:22.440 --> 00:15:26.799 If one part fails, a key service, a critical process, 302 00:15:27.080 --> 00:15:30.440 maybe even a team, have you built the walls? Have 303 00:15:30.559 --> 00:15:34.799 you designed the isolation points, the statelessness, the network segmentation, 304 00:15:34.919 --> 00:15:39.039 the circuit breakers, the automation to ensure that failure is contained.
305 00:15:38.919 --> 00:15:42.320 So that the core mission, the essential operations, can continue 306 00:15:42.320 --> 00:15:45.000 moving forward even when something inevitably breaks. 307 00:15:45.120 --> 00:15:48.279 That's the challenge: thinking about resilience, not just in code, 308 00:15:48.320 --> 00:15:49.440 but across the whole system. 309 00:15:49.480 --> 00:15:51.840 Definitely something to mull over as you apply these cloud 310 00:15:51.879 --> 00:15:54.360 native concepts. Thank you for joining us for this deep dive.