WEBVTT

1
00:00:00.080 --> 00:00:03.160
<v Speaker 1>Welcome back to the deep dive. Today we're pulling back

2
00:00:03.200 --> 00:00:07.080
<v Speaker 1>the curtain on something, well, fundamental yet often kind of unseen:

3
00:00:07.919 --> 00:00:10.919
<v Speaker 1>the intricate world of cloud networking on AWS.

4
00:00:11.119 --> 00:00:12.119
<v Speaker 2>That's right.

5
00:00:12.160 --> 00:00:14.279
<v Speaker 1>You've sent us a stack of sources. Looks like excerpts from an

6
00:00:14.279 --> 00:00:19.879
<v Speaker 1>AWS certification book, and our mission is pretty clear: distill

7
00:00:19.920 --> 00:00:23.120
<v Speaker 1>those crucial insights. We give you a shortcut, really, to

8
00:00:23.199 --> 00:00:27.879
<v Speaker 1>being genuinely well informed about the hidden physics of the cloud.

9
00:00:28.320 --> 00:00:29.000
<v Speaker 1>Let's unpack this.

10
00:00:29.199 --> 00:00:32.439
<v Speaker 2>It is fascinating, isn't it? Because, you know, cloud services

11
00:00:32.560 --> 00:00:36.600
<v Speaker 2>can feel almost like magic, but underneath they're built on

12
00:00:36.679 --> 00:00:41.719
<v Speaker 2>these deeply rooted networking principles, just with a dynamic global twist.

13
00:00:42.079 --> 00:00:44.159
<v Speaker 2>So today, yeah, we're going to try and uncover those

14
00:00:44.200 --> 00:00:47.320
<v Speaker 2>aha moments. We'll go from the virtual network interfaces powering

15
00:00:47.399 --> 00:00:50.159
<v Speaker 2>your instances all the way to how global traffic is managed,

16
00:00:50.159 --> 00:00:53.399
<v Speaker 2>how it's secured, revealing the key bits and importantly the

17
00:00:53.479 --> 00:00:56.399
<v Speaker 2>unseen challenges that keep everything running.

18
00:00:56.520 --> 00:00:58.679
<v Speaker 1>Okay, so where do we even begin with something as

19
00:00:58.679 --> 00:01:01.200
<v Speaker 1>big as cloud networking? Maybe start right at the foundation?

20
00:01:01.560 --> 00:01:02.039
<v Speaker 1>Makes sense.

21
00:01:02.119 --> 00:01:04.359
<v Speaker 2>Let's talk about the Elastic Network Interface, the ENI.

22
00:01:04.560 --> 00:01:05.400
<v Speaker 2>The ENI.

23
00:01:05.319 --> 00:01:07.959
<v Speaker 1>Right, So think of it like the cloud's virtual network

24
00:01:08.000 --> 00:01:11.400
<v Speaker 1>card for your EC2 instances. Every single bit of

25
00:01:11.480 --> 00:01:13.640
<v Speaker 1>network traffic, in and out, flows through an

26
00:01:13.719 --> 00:01:16.519
<v Speaker 1>ENI. Exactly. But the key thing about ENIs,

27
00:01:16.560 --> 00:01:20.799
<v Speaker 1>it seems, isn't just that they're virtual NICs. It's their flexibility, right? Absolutely,

28
00:01:20.840 --> 00:01:21.799
<v Speaker 1>the flexibility is huge.

29
00:01:21.799 --> 00:01:24.959
<v Speaker 2>You can attach them, detach them in different states too,

30
00:01:25.280 --> 00:01:25.560
<v Speaker 2>Like, you

31
00:01:25.480 --> 00:01:29.480
<v Speaker 1>can do a hot attachment while the instance is actually running?

32
00:01:29.280 --> 00:01:32.120
<v Speaker 2>Yep: hot attachment while running, warm attachment if it's stopped,

33
00:01:32.200 --> 00:01:34.120
<v Speaker 2>or even cold attachment right when you launch it.

34
00:01:34.239 --> 00:01:37.319
<v Speaker 1>Wow. And an instance can have more than one, like,

35
00:01:37.439 --> 00:01:39.079
<v Speaker 1>connected to different parts of your network.

36
00:01:39.200 --> 00:01:41.079
<v Speaker 2>Yeah, definitely, you can have multiple ENIs on

37
00:01:41.120 --> 00:01:44.040
<v Speaker 2>a single EC2 instance, each connected to a different

38
00:01:44.120 --> 00:01:47.680
<v Speaker 2>VPC subnet, maybe for different security zones or traffic types.

39
00:01:48.120 --> 00:01:51.920
<v Speaker 2>That detachment flexibility really changes how you think about high availability.

40
00:01:52.079 --> 00:01:55.120
<v Speaker 1>Okay, so that's the interface. What about the addresses themselves

41
00:01:55.560 --> 00:01:56.760
<v Speaker 1>inside AWS?

42
00:01:56.840 --> 00:02:01.159
<v Speaker 2>Good question. Let's dive into IP addressing. So mostly you'll

43
00:02:01.200 --> 00:02:04.680
<v Speaker 2>see VPCs using private IP ranges, you know, the standard

44
00:02:04.840 --> 00:02:06.680
<v Speaker 2>RFC 1918 stuff.

45
00:02:06.400 --> 00:02:10.960
<v Speaker 1>Right, like 10.x or 172.16.x. Exactly.

46
00:02:11.599 --> 00:02:14.919
<v Speaker 2>But subnets can also allow for the auto assignment of

47
00:02:15.039 --> 00:02:18.159
<v Speaker 2>public IPv4 addresses when you launch an instance.

48
00:02:18.360 --> 00:02:22.360
<v Speaker 1>Ah, okay, but wait, if those auto-assigned public IPs

49
00:02:22.400 --> 00:02:26.039
<v Speaker 1>can just change, say if you stop and start the instance. Yeah,

50
00:02:26.120 --> 00:02:29.879
<v Speaker 1>how do you deal with services that absolutely need a fixed,

51
00:02:30.120 --> 00:02:33.840
<v Speaker 1>unchanging external address, like a web server or something.

52
00:02:34.199 --> 00:02:37.639
<v Speaker 2>That's a fantastic question, and that's precisely where Elastic IP

53
00:02:37.759 --> 00:02:41.560
<v Speaker 2>addresses, or EIPs, come in. They're indispensable for that.

54
00:02:41.800 --> 00:02:45.240
<v Speaker 2>EIPs are static public IPv4 addresses. You basically allocate

55
00:02:45.240 --> 00:02:48.879
<v Speaker 2>them to your AWS account, not directly to an instance initially.

56
00:02:49.000 --> 00:02:51.639
<v Speaker 1>Ah, okay, so they belong to the account.

57
00:02:51.280 --> 00:02:53.680
<v Speaker 2>Right, and then you can associate an EIP with an

58
00:02:53.680 --> 00:02:56.199
<v Speaker 2>ENI or directly with an EC2 instance.

59
00:02:56.560 --> 00:02:59.199
<v Speaker 2>The crucial flexibility here is that the EIP isn't permanently

60
00:02:59.240 --> 00:03:02.080
<v Speaker 2>tied to that specific piece of hardware or virtual

61
00:03:01.759 --> 00:03:03.719
<v Speaker 1>hardware. So you can move it around? Exactly.

62
00:03:03.759 --> 00:03:06.000
<v Speaker 2>If an instance fails or you need to swap something out,

63
00:03:06.360 --> 00:03:09.560
<v Speaker 2>you just reassociate that same EIP with a different instance

64
00:03:09.639 --> 00:03:11.680
<v Speaker 2>or ENI. Gives you that stable public face

65
00:03:11.680 --> 00:03:14.919
<v Speaker 2>for your applications, regardless of the underlying instance churn.

66
00:03:15.120 --> 00:03:18.879
<v Speaker 1>That makes a lot of sense. Okay, building on managing IPs efficiently,

67
00:03:20.560 --> 00:03:24.120
<v Speaker 1>the sources mentioned something called prefix lists. What are those about?

68
00:03:24.159 --> 00:03:25.439
<v Speaker 1>How do they make life simpler?

69
00:03:25.520 --> 00:03:29.759
<v Speaker 2>Prefix lists are actually quite clever. They're basically custom managed

70
00:03:29.759 --> 00:03:32.960
<v Speaker 2>lists of IP address ranges or prefixes. You maintain these

71
00:03:33.000 --> 00:03:36.319
<v Speaker 2>lists and then you can reference them consistently in your network

72
00:03:36.000 --> 00:03:39.960
<v Speaker 1>config, like in security groups or route tables? Precisely.

73
00:03:40.400 --> 00:03:43.960
<v Speaker 2>Instead of typing out or copying and pasting potentially huge

74
00:03:44.080 --> 00:03:46.479
<v Speaker 2>lists of IP addresses over and over again, you just

75
00:03:46.520 --> 00:03:49.520
<v Speaker 2>refer to the prefix list by its name. It simplifies

76
00:03:49.599 --> 00:03:50.919
<v Speaker 2>policy creation immensely.

77
00:03:51.039 --> 00:03:53.800
<v Speaker 1>Okay, so you define it once, use it many times. Exactly.

78
00:03:53.960 --> 00:03:56.840
<v Speaker 2>And there are two types. You've got AWS-managed prefix lists,

79
00:03:56.919 --> 00:03:59.599
<v Speaker 2>which AWS maintains for their own services, makes it super

80
00:03:59.599 --> 00:04:03.360
<v Speaker 2>easy to allow traffic to S3 or DynamoDB, for example.

81
00:04:03.520 --> 00:04:03.960
<v Speaker 1>Oh nice.

82
00:04:04.000 --> 00:04:06.439
<v Speaker 2>And then you have customer managed prefix lists where you

83
00:04:06.479 --> 00:04:08.639
<v Speaker 2>define your own groups of IPs. Maybe you create one

84
00:04:08.680 --> 00:04:12.319
<v Speaker 2>called "all dev resources" that includes all the CIDR blocks

85
00:04:12.319 --> 00:04:16.120
<v Speaker 2>for your development VPCs. Makes managing access way easier.

86
00:04:16.040 --> 00:04:18.120
<v Speaker 1>Right, I can see how that would tidy things up.

87
00:04:18.600 --> 00:04:23.399
<v Speaker 1>Now for something that sounds a bit more mysterious: the Hyperplane.

88
00:04:23.560 --> 00:04:24.680
<v Speaker 1>What on earth is that?

89
00:04:24.839 --> 00:04:26.399
<v Speaker 2>Yeah, it does sound a bit sci-fi, doesn't it?

90
00:04:26.600 --> 00:04:30.360
<v Speaker 2>Think of the Hyperplane as, like, the virtual network engine

91
00:04:30.360 --> 00:04:35.680
<v Speaker 2>of AWS. It's the massively distributed underlying infrastructure that takes

92
00:04:35.680 --> 00:04:39.360
<v Speaker 2>the physical network and slices it up virtually for every customer.

93
00:04:39.560 --> 00:04:42.399
<v Speaker 2>That's what makes VPCs and all these services actually work.

94
00:04:42.720 --> 00:04:45.800
<v Speaker 1>Okay, the engine behind the scenes. But what's surprising about it?

95
00:04:46.160 --> 00:04:48.879
<v Speaker 2>What's surprising or at least really important to understand, is

96
00:04:48.920 --> 00:04:52.680
<v Speaker 2>how it operates with these artificial limits. AWS puts these

97
00:04:52.680 --> 00:04:55.959
<v Speaker 2>in place to ensure fair resource allocation across all tenants.

98
00:04:56.120 --> 00:04:57.040
<v Speaker 1>Limits like bandwidth?

99
00:04:57.160 --> 00:04:59.399
<v Speaker 2>Yeah, bandwidth or throughput limits are part of it, and

100
00:04:59.439 --> 00:05:01.959
<v Speaker 2>they vary hugely depending on the service. You know, a

101
00:05:02.000 --> 00:05:05.800
<v Speaker 2>Transit Gateway VPC attachment might go up to 50 gigabits

102
00:05:05.800 --> 00:05:09.639
<v Speaker 2>per second, whereas a single VPN tunnel might top out

103
00:05:09.680 --> 00:05:13.439
<v Speaker 2>at, say, 1.25 Gbps, and Direct Connect,

104
00:05:13.439 --> 00:05:15.360
<v Speaker 2>depending on the port, maybe 1, 10, or even

105
00:05:15.399 --> 00:05:18.000
<v Speaker 2>100 Gbps. Now, these numbers kind of show the different

106
00:05:18.000 --> 00:05:18.959
<v Speaker 2>scales you're working at.

107
00:05:19.680 --> 00:05:24.399
<v Speaker 1>That huge difference between TGW and a VPN tunnel is striking.

108
00:05:25.079 --> 00:05:28.879
<v Speaker 1>But you mentioned something else, something trickier than just bandwidth.

109
00:05:29.040 --> 00:05:31.279
<v Speaker 2>Yes, and this is the one that catches people out,

110
00:05:31.319 --> 00:05:36.120
<v Speaker 2>even experienced folks. It's the packets-per-second, or PPS, limitations.

111
00:05:35.639 --> 00:05:37.800
<v Speaker 1>Packets per second. Why is that trickier?

112
00:05:37.959 --> 00:05:40.680
<v Speaker 2>Because you can often hit the PPS limit before you

113
00:05:40.759 --> 00:05:43.560
<v Speaker 2>hit the bandwidth limit, especially with lots of small packets,

114
00:05:43.600 --> 00:05:46.160
<v Speaker 2>like certain types of application traffic, or maybe even a

115
00:05:46.240 --> 00:05:48.240
<v Speaker 2>DDoS attack using small packets.

116
00:05:48.240 --> 00:05:48.800
<v Speaker 1>And what happens then?

117
00:05:48.800 --> 00:05:52.639
<v Speaker 2>You start dropping packets silently. Your bandwidth monitors might

118
00:05:52.680 --> 00:05:55.800
<v Speaker 2>look totally fine, nowhere near saturated, but packets are just

119
00:05:55.879 --> 00:05:59.480
<v Speaker 2>vanishing into the ether because the Hyperplane component handling your

120
00:05:59.480 --> 00:06:01.839
<v Speaker 2>traffic can't process them fast enough.

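NOTE
Editor's sketch, not from the speakers: the arithmetic behind hitting a PPS
ceiling before a bandwidth ceiling. The 1.25 Gbps figure is the VPN tunnel
number quoted above. The 140,000 packets-per-second ceiling and the packet
sizes are hypothetical values picked for illustration, not published AWS limits.
```python
def throughput_at_pps_cap(pps_cap, packet_bytes):
    """Bits per second carried when every packet is packet_bytes long and the PPS cap is reached."""
    return pps_cap * packet_bytes * 8
def first_limit_hit(bandwidth_bps, pps_cap, packet_bytes):
    """Return 'pps' if the packet-rate cap is reached before the bandwidth cap, else 'bandwidth'."""
    return "pps" if throughput_at_pps_cap(pps_cap, packet_bytes) < bandwidth_bps else "bandwidth"
BANDWIDTH_BPS = 1_250_000_000  # 1.25 Gbps, the VPN tunnel figure from the discussion
PPS_CAP = 140_000              # hypothetical packet-rate ceiling, for illustration only
for size in (64, 512, 1500):   # packet sizes in bytes, chosen for illustration
    mbps = throughput_at_pps_cap(PPS_CAP, size) / 1_000_000
    print(size, "bytes:", round(mbps, 1), "Mbps at the PPS cap, first limit hit:", first_limit_hit(BANDWIDTH_BPS, PPS_CAP, size))
```
With these assumed numbers, 64-byte packets reach the packet-rate cap at
roughly 72 Mbps, far below the bandwidth limit, which is why a throughput
graph alone can look healthy while packets are being dropped.
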
121
00:06:01.879 --> 00:06:04.560
<v Speaker 1>Ouch. So how do you even spot that if there are,

122
00:06:04.680 --> 00:06:06.879
<v Speaker 1>as you said, no obvious signs?

123
00:06:06.959 --> 00:06:09.240
<v Speaker 2>That is the challenge. It feels like a ghost in the machine.

124
00:06:09.720 --> 00:06:13.639
<v Speaker 2>Diagnosing it usually means you need to look beyond just throughput.

125
00:06:13.839 --> 00:06:16.879
<v Speaker 2>You need metrics on packet counts, maybe packet drop counters

126
00:06:16.920 --> 00:06:19.879
<v Speaker 2>if the service exposes them, or you have to use

127
00:06:19.920 --> 00:06:24.120
<v Speaker 2>tools like VPC Flow Logs or even VPC Traffic Mirroring,

128
00:06:24.160 --> 00:06:26.079
<v Speaker 2>which we can get into later, to try and see

129
00:06:26.079 --> 00:06:28.720
<v Speaker 2>what's actually happening at the packet level. It definitely defies

130
00:06:28.759 --> 00:06:30.360
<v Speaker 2>traditional bandwidth troubleshooting.

131
00:06:30.439 --> 00:06:33.600
<v Speaker 1>Okay, that's a really crucial, subtle point. So we've got

132
00:06:33.600 --> 00:06:37.560
<v Speaker 1>the building blocks: ENIs, IPs, limits. But clouds

133
00:06:37.560 --> 00:06:40.680
<v Speaker 1>aren't usually isolated islands, right? You need them to talk

134
00:06:40.720 --> 00:06:41.279
<v Speaker 1>to each other.

135
00:06:41.439 --> 00:06:43.040
<v Speaker 2>Absolutely. Connecting things is key.

136
00:06:43.160 --> 00:06:46.639
<v Speaker 1>So moving beyond individual instances, how do we connect these

137
00:06:46.720 --> 00:06:50.199
<v Speaker 1>cloud components? What's the simplest way? VPC peering?

138
00:06:50.639 --> 00:06:53.639
<v Speaker 2>Yeah, VPC peering is often the starting point. It creates

139
00:06:53.680 --> 00:06:58.439
<v Speaker 2>a direct private connection between two VPCs using AWS's backbone.

140
00:06:58.600 --> 00:07:01.360
<v Speaker 2>It's pretty straightforward to set up. And actually there aren't

141
00:07:01.439 --> 00:07:05.600
<v Speaker 2>explicit throughput limits imposed by the peering connection itself, beyond

142
00:07:05.720 --> 00:07:07.120
<v Speaker 2>instance or other limits.

143
00:07:07.319 --> 00:07:09.560
<v Speaker 1>Sounds good, but there have to be catches.

144
00:07:09.319 --> 00:07:13.720
<v Speaker 2>Right. Oh, definitely, key considerations. First, it's non-transitive, meaning if

145
00:07:13.800 --> 00:07:18.279
<v Speaker 2>VPC A is peered with VPC B and VPC B is peered with VPC C,

146
00:07:19.240 --> 00:07:23.160
<v Speaker 2>VPC A cannot automatically talk to VPC C just by going through B.

147
00:07:23.600 --> 00:07:26.480
<v Speaker 2>There's no implicit routing pass through. You'd need a separate

148
00:07:26.480 --> 00:07:29.000
<v Speaker 2>peering connection directly between A and C.

149
00:07:29.040 --> 00:07:31.600
<v Speaker 1>Ah, okay, so no hub-and-spoke just using peering. Exactly.

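NOTE
Editor's sketch, not from the speakers: a toy model of the non-transitive
rule described above. The VPC names are hypothetical, and this only models
the rule "traffic flows between directly peered VPCs", nothing about real
route tables.
```python
# Peering connections as unordered pairs; the VPC names are hypothetical.
PEERINGS = {frozenset({"vpc-a", "vpc-b"}), frozenset({"vpc-b", "vpc-c"})}
def can_route(src, dst, peerings=PEERINGS):
    """Peering is non-transitive: two VPCs can talk only if they are directly peered."""
    return frozenset({src, dst}) in peerings
print(can_route("vpc-a", "vpc-b"))  # True: directly peered
print(can_route("vpc-a", "vpc-c"))  # False: only linked via B, which peering does not allow
```
Adding frozenset({"vpc-a", "vpc-c"}) to the set is the model's equivalent of
creating the separate A-to-C peering connection mentioned in the discussion.
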
150
00:07:31.720 --> 00:07:34.920
<v Speaker 2>And the second big one, maybe even bigger. It absolutely

151
00:07:34.920 --> 00:07:38.279
<v Speaker 2>cannot be used if the VPCs have overlapping CIDR ranges.

152
00:07:38.360 --> 00:07:41.120
<v Speaker 1>Right, if both VPCs use 10.0.0.0/16,

153
00:07:41.120 --> 00:07:42.759
<v Speaker 1>for example.

154
00:07:42.519 --> 00:07:46.199
<v Speaker 2>Yep, peering just won't work. Routing across peered VPCs relies

155
00:07:46.240 --> 00:07:48.319
<v Speaker 2>on static routes you have to manually add to the

156
00:07:48.360 --> 00:07:51.399
<v Speaker 2>route tables in both VPCs for traffic to flow back

157
00:07:51.399 --> 00:07:51.839
<v Speaker 2>and forth.

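NOTE
Editor's sketch, not from the speakers: checking two VPC CIDR blocks for
overlap before attempting to peer them, using Python's standard ipaddress
module. The helper name can_peer and the sample ranges are illustrative
assumptions; the check covers only the overlap rule discussed above.
```python
import ipaddress
def can_peer(cidr_a, cidr_b):
    """Peering needs non-overlapping ranges; overlaps() is True if the two networks share any address."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))
print(can_peer("10.0.0.0/16", "10.0.0.0/16"))    # False: identical ranges
print(can_peer("10.0.0.0/16", "10.0.128.0/20"))  # False: second range sits inside the first
print(can_peer("10.0.0.0/16", "172.16.0.0/16"))  # True: disjoint RFC 1918 ranges
```
Running a check like this across all planned VPC ranges early is one way to
catch the merger scenario described next before any connection is attempted.
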
158
00:07:51.920 --> 00:07:55.240
<v Speaker 1>That IP overlap thing, that sounds like a potential nightmare.

159
00:07:55.480 --> 00:07:58.959
<v Speaker 1>You mentioned company mergers earlier. Imagine trying to connect two

160
00:07:59.000 --> 00:08:02.120
<v Speaker 1>company networks that both picked, say, 10.100.0.0/16

161
00:08:02.120 --> 00:08:05.160
<v Speaker 1>independently. Peering is out.

162
00:08:05.360 --> 00:08:08.720
<v Speaker 1>Is there any way to make workloads with overlapping IPs talk?

163
00:08:08.800 --> 00:08:11.040
<v Speaker 2>You're right, it's a huge challenge in mergers or large

164
00:08:11.120 --> 00:08:15.040
<v Speaker 2>organizations. VPC peering, it's a wall there. There is a solution,

165
00:08:15.480 --> 00:08:18.839
<v Speaker 2>though it's not perfect. Yeah, using private NAT gateways.

166
00:08:18.399 --> 00:08:20.879
<v Speaker 1>NAT gateways? But usually those are for getting out to the internet.

167
00:08:21.040 --> 00:08:23.800
<v Speaker 2>Correct, those are public NAT gateways, but you can also

168
00:08:23.839 --> 00:08:27.000
<v Speaker 2>set up private NAT gateways. They allow workloads in one

169
00:08:27.079 --> 00:08:31.040
<v Speaker 2>VPC to initiate connections to workloads in another VPC, even

170
00:08:31.040 --> 00:08:34.360
<v Speaker 2>if they have overlapping IPs, because the NAT gateway handles

171
00:08:34.399 --> 00:08:36.320
<v Speaker 2>the address translation on the way out.

172
00:08:36.399 --> 00:08:39.320
<v Speaker 1>Ah, clever. But you said "initiate."

173
00:08:39.039 --> 00:08:41.879
<v Speaker 2>Yeah, that's the caveat. The communication generally has to be

174
00:08:41.919 --> 00:08:45.080
<v Speaker 2>initiated from the side using the NAT gateway. It's not

175
00:08:45.159 --> 00:08:48.679
<v Speaker 2>a truly transparent bidirectional connection like peering would be if

176
00:08:48.679 --> 00:08:52.200
<v Speaker 2>the IPs didn't overlap. Solves a specific problem, but it's

177
00:08:52.240 --> 00:08:54.840
<v Speaker 2>not a universal fix for overlapping CIDRs.

178
00:08:55.159 --> 00:08:58.279
<v Speaker 1>Okay, so peering is simple but limited, especially by non-transitivity

179
00:08:58.320 --> 00:09:01.720
<v Speaker 1>and IP overlap. How did AWS address the need for larger,

180
00:09:01.879 --> 00:09:04.399
<v Speaker 1>more complex, maybe hub and spoke networks in the cloud.

181
00:09:04.720 --> 00:09:07.039
<v Speaker 2>Well, the community first came up with solutions like the

182
00:09:07.080 --> 00:09:10.759
<v Speaker 2>Transit VPC. This usually involves setting up dedicated EC2

183
00:09:10.840 --> 00:09:15.440
<v Speaker 2>instances running routing software, network virtual appliances, or NVAs, in

184
00:09:15.519 --> 00:09:17.360
<v Speaker 2>the central VPC to act as a hub.

185
00:09:17.559 --> 00:09:19.879
<v Speaker 1>So building your own router in the cloud, basically.

186
00:09:19.600 --> 00:09:23.159
<v Speaker 2>Pretty much. It worked, but managing those NVAs, worrying about

187
00:09:23.159 --> 00:09:28.039
<v Speaker 2>their scaling, high availability, it's complex. So AWS eventually released

188
00:09:28.039 --> 00:09:30.759
<v Speaker 2>a managed service to solve this much more elegantly, the

189
00:09:30.840 --> 00:09:34.519
<v Speaker 2>AWS Transit Gateway, or TGW.

190
00:09:34.559 --> 00:09:35.440
<v Speaker 1>Transit Gateway. Okay, how's that different?

191
00:09:35.519 --> 00:09:39.600
<v Speaker 2>TGW acts as a fully managed, highly scalable central cloud

192
00:09:39.679 --> 00:09:43.000
<v Speaker 2>router or hub. You attach your vpcs, your VPN connections,

193
00:09:43.000 --> 00:09:47.559
<v Speaker 2>your Direct Connect connections, all to the TGW. It simplifies inter-VPC

194
00:09:47.639 --> 00:09:51.440
<v Speaker 2>connectivity massively and also makes hybrid networking, connecting to

195
00:09:51.519 --> 00:09:54.720
<v Speaker 2>on-premises, much cleaner. It takes the routing burden off you

196
00:09:54.960 --> 00:09:57.799
<v Speaker 2>and puts it into a managed AWS service. A true

197
00:09:57.879 --> 00:09:59.399
<v Speaker 2>hub and spoke model becomes easy.

198
00:09:59.480 --> 00:10:04.360
<v Speaker 1>Got it. So TGW is the modern way for complex connectivity. Now,

199
00:10:04.440 --> 00:10:07.320
<v Speaker 1>speaking of hybrid, what about that dedicated link you mentioned,

200
00:10:07.360 --> 00:10:10.600
<v Speaker 1>Direct Connect, or DX? Why would someone go for DX

201
00:10:10.600 --> 00:10:12.360
<v Speaker 1>instead of just setting up a VPN over the Internet?

202
00:10:12.399 --> 00:10:13.840
<v Speaker 1>It seems like VPNs are pretty common.

203
00:10:13.960 --> 00:10:17.080
<v Speaker 2>They are common and often sufficient, but Direct Connect offers

204
00:10:17.120 --> 00:10:21.960
<v Speaker 2>several really critical advantages, especially for larger enterprises or sensitive workloads.

205
00:10:22.159 --> 00:10:26.320
<v Speaker 2>Like what? First, privacy and security. DX provides a dedicated

206
00:10:26.399 --> 00:10:29.519
<v Speaker 2>private circuit. Your traffic isn't going over the public Internet,

207
00:10:29.600 --> 00:10:34.120
<v Speaker 2>so it can't be snooped on easily. Second, reliability. DX

208
00:10:34.159 --> 00:10:38.200
<v Speaker 2>comes with service level agreements, SLAs, promising certain levels of uptime.

209
00:10:38.440 --> 00:10:41.080
<v Speaker 2>The public Internet is inherently best effort.

210
00:10:40.840 --> 00:10:43.679
<v Speaker 1>Okay, so more secure, more reliable, and?

211
00:10:43.639 --> 00:10:48.519
<v Speaker 2>Third, performance. Significantly higher bandwidth is possible. DX connections

212
00:10:48.559 --> 00:10:51.480
<v Speaker 2>come in 1 Gbps, 10 Gbps, and now even

213
00:10:51.600 --> 00:10:55.559
<v Speaker 2>100 Gbps flavors. Plus, you generally get lower and more

214
00:10:55.600 --> 00:10:57.320
<v Speaker 2>consistent latency compared

215
00:10:56.960 --> 00:10:59.759
<v Speaker 1>to the Internet. One hundred gigs, wow. And it's literally

216
00:10:59.799 --> 00:11:01.840
<v Speaker 1>a physical connection, right? Like a cable?

217
00:11:02.000 --> 00:11:05.360
<v Speaker 2>Yes. Fundamentally, you work with AWS or a partner to

218
00:11:05.360 --> 00:11:07.519
<v Speaker 2>get a physical cross-connect cable run in a shared

219
00:11:07.559 --> 00:11:10.559
<v Speaker 2>data center, a Direct Connect location, between your networking equipment

220
00:11:10.639 --> 00:11:13.799
<v Speaker 2>and AWS's equipment. There's even a document involved, the Letter

221
00:11:13.840 --> 00:11:17.799
<v Speaker 2>of Authorization and Connecting Facility Assignment, or LOA-CFA, that you

222
00:11:17.879 --> 00:11:20.159
<v Speaker 2>use to authorize the data center technicians to make that

223
00:11:20.200 --> 00:11:23.279
<v Speaker 2>physical link. It's a tangible piece of your cloud connection.

224
00:11:23.320 --> 00:11:26.320
<v Speaker 1>A physical manifestation of the cloud. Okay, that's cool. So

225
00:11:26.480 --> 00:11:30.120
<v Speaker 1>DX sounds robust. What if one link isn't enough bandwidth

226
00:11:30.279 --> 00:11:32.919
<v Speaker 1>or you need more redundancy, and how do you actually

227
00:11:33.360 --> 00:11:36.440
<v Speaker 1>get that physical pipe connected into your virtual network, your

228
00:11:36.399 --> 00:11:40.559
<v Speaker 2>VPCs? Great questions. For more bandwidth or redundancy, AWS offers

229
00:11:40.600 --> 00:11:44.399
<v Speaker 2>link aggregation groups, or LAGs. This is pretty neat. It

230
00:11:44.480 --> 00:11:47.799
<v Speaker 2>lets you bundle multiple physical DX connections together, say four

231
00:11:48.039 --> 00:11:51.240
<v Speaker 2>1 Gbps links, and treat them as a single logical

232
00:11:51.279 --> 00:11:55.799
<v Speaker 2>connection with combined bandwidth, like 4 Gbps. It simplifies management too.

233
00:11:55.840 --> 00:11:58.240
<v Speaker 1>Ah, like bonding network interfaces.

234
00:11:57.679 --> 00:12:00.720
<v Speaker 2>Exactly like that, and to extend that physical connectivity into

235
00:12:00.759 --> 00:12:05.679
<v Speaker 2>your actual AWS resources, you use virtual interfaces, or VIFs.

236
00:12:05.799 --> 00:12:09.200
<v Speaker 2>VIFs essentially carve up that physical DX connection or LAG

237
00:12:09.240 --> 00:12:12.639
<v Speaker 2>into logical pathways using VLAN tagging, standard

238
00:12:12.679 --> 00:12:15.720
<v Speaker 2>802.1Q VLAN tags. This lets you run different types

239
00:12:15.759 --> 00:12:17.639
<v Speaker 2>of network traffic over the same physical

240
00:12:17.320 --> 00:12:18.200
<v Speaker 1>link. Different types?

241
00:12:18.279 --> 00:12:21.879
<v Speaker 2>Yeah, we mainly distinguish between three types. Private VIFs, which

242
00:12:21.879 --> 00:12:24.759
<v Speaker 2>are used to connect directly to your VPCs, usually via

243
00:12:24.799 --> 00:12:27.759
<v Speaker 2>a component called a virtual private gateway, VGW, or more

244
00:12:27.799 --> 00:12:31.720
<v Speaker 2>commonly now a Transit Gateway. Then there are public VIFs,

245
00:12:31.879 --> 00:12:34.960
<v Speaker 2>which let you access public AWS services like S3

246
00:12:35.080 --> 00:12:39.240
<v Speaker 2>or EC2 APIs over your private DX link, bypassing

247
00:12:39.279 --> 00:12:39.759
<v Speaker 2>the Internet.

248
00:12:39.840 --> 00:12:43.480
<v Speaker 1>Okay, private for VPCs, public for AWS services. What's the third?

249
00:12:43.600 --> 00:12:46.480
<v Speaker 2>The third is the transit VIF. This one is specifically

250
00:12:46.480 --> 00:12:49.919
<v Speaker 2>designed to connect your Direct Connect link to a Transit Gateway.

251
00:12:49.799 --> 00:12:53.240
<v Speaker 1>Right, for that hub-and-spoke model with TGW. Precisely.

252
00:12:53.480 --> 00:12:56.480
<v Speaker 2>And there's a crucial point here about transit VIFs and TGW.

253
00:12:56.720 --> 00:12:59.639
<v Speaker 2>I've heard about something called hairpinning that can get really

254
00:12:59.639 --> 00:13:01.080
<v Speaker 2>expensive if you're not careful.

255
00:13:01.159 --> 00:13:03.919
<v Speaker 1>Ah, yes, hairpinning. You're absolutely right to bring that up.

256
00:13:03.919 --> 00:13:07.360
<v Speaker 1>It's a potentially very costly mistake. If you're using Transit

257
00:13:07.399 --> 00:13:09.919
<v Speaker 1>Gateway with Direct Connect, it's critical that you use a

258
00:13:09.960 --> 00:13:14.039
<v Speaker 1>single transit VIF per TGW connection to your on premises site.

259
00:13:14.120 --> 00:13:17.600
<v Speaker 2>Why just one? Because if you have multiple, or misconfigure routing,

260
00:13:17.840 --> 00:13:20.080
<v Speaker 2>you can end up with hairpinning. This is where

261
00:13:20.120 --> 00:13:22.799
<v Speaker 2>traffic comes in from your on prem network via DX,

262
00:13:23.000 --> 00:13:25.480
<v Speaker 2>goes to the TGW maybe needs to get to another VPC,

263
00:13:25.919 --> 00:13:29.159
<v Speaker 2>but instead of routing directly, the TGW routes it back

264
00:13:29.159 --> 00:13:31.720
<v Speaker 2>out the same DX connection towards your on prem router,

265
00:13:32.080 --> 00:13:34.480
<v Speaker 2>only for your router to send it immediately back into

266
00:13:34.519 --> 00:13:37.519
<v Speaker 2>AWS over DX again to reach the intended VPC.

267
00:13:37.799 --> 00:13:41.399
<v Speaker 1>So it makes a U turn back through your own network.

268
00:13:41.159 --> 00:13:44.600
<v Speaker 2>Exactly, a totally unnecessary round trip out of AWS and

269
00:13:44.639 --> 00:13:47.519
<v Speaker 2>back in. And since you pay for data egress

270
00:13:47.519 --> 00:13:51.799
<v Speaker 2>from AWS, that double egress gets incredibly expensive really fast.

271
00:13:52.240 --> 00:13:55.480
<v Speaker 2>A single transit VIF per TGW connection point, along with

272
00:13:55.559 --> 00:13:59.080
<v Speaker 2>proper route propagation and filtering, prevents this costly detour.

273
00:13:59.279 --> 00:14:03.360
<v Speaker 1>Wow. Okay, definitely noted: avoid the hairpin. So, last piece

274
00:14:03.399 --> 00:14:06.120
<v Speaker 1>on connectivity: what if you need your VPCs to talk

275
00:14:06.279 --> 00:14:09.840
<v Speaker 1>privately to services, could be AWS's own services or maybe

276
00:14:09.879 --> 00:14:13.000
<v Speaker 1>third-party SaaS providers you use, but you don't want

277
00:14:13.000 --> 00:14:14.440
<v Speaker 1>to go out to the Internet, and you don't want

278
00:14:14.440 --> 00:14:16.519
<v Speaker 1>to route through NAT gateways if you can avoid it,

279
00:14:16.559 --> 00:14:17.399
<v Speaker 1>what's the play there?

280
00:14:17.480 --> 00:14:20.320
<v Speaker 2>That's the perfect use case for AWS PrivateLink.

281
00:14:20.360 --> 00:14:24.080
<v Speaker 2>PrivateLink? PrivateLink uses a component called a VPC endpoint.

282
00:14:24.759 --> 00:14:28.399
<v Speaker 2>It essentially creates a secure private connection directly from your

283
00:14:28.480 --> 00:14:32.240
<v Speaker 2>VPC to the service. The service endpoint effectively gets a

284
00:14:32.279 --> 00:14:36.399
<v Speaker 2>private IP address within your VPC's address range, making the

285
00:14:36.440 --> 00:14:39.639
<v Speaker 2>external service appear as if it's running right there inside

286
00:14:39.639 --> 00:14:40.200
<v Speaker 2>your network.

287
00:14:40.240 --> 00:14:43.679
<v Speaker 1>Ah, so no Internet gateway, no NAT, no public IPs

288
00:14:43.720 --> 00:14:45.360
<v Speaker 1>involved for that service connection.

289
00:14:45.519 --> 00:14:49.879
<v Speaker 2>Correct. Traffic stays entirely within the AWS network backbone. It

290
00:14:50.000 --> 00:14:53.600
<v Speaker 2>massively improves security by keeping sensitive data off the public Internet,

291
00:14:53.840 --> 00:14:57.039
<v Speaker 2>and it simplifies your network architecture because you don't need

292
00:14:57.159 --> 00:15:00.399
<v Speaker 2>complex firewall rules or NAT setups just to reach those

293
00:15:00.399 --> 00:15:03.279
<v Speaker 2>services privately. It's very powerful for secure service consumption.

294
00:15:03.440 --> 00:15:05.399
<v Speaker 1>Okay, that covers a lot of ground on how to

295
00:15:05.440 --> 00:15:08.440
<v Speaker 1>connect things, but even with the best designs, things go wrong.

296
00:15:08.559 --> 00:15:11.159
<v Speaker 1>Cloud networks, maybe even more than traditional ones, can have

297
00:15:11.200 --> 00:15:14.519
<v Speaker 1>these elusive problems. Because so much is abstracted. How do

298
00:15:14.559 --> 00:15:18.360
<v Speaker 1>you start peeling back those layers when inevitably something breaks.

299
00:15:18.399 --> 00:15:21.279
<v Speaker 1>Let's talk about potential problems first. What kind of things

300
00:15:21.320 --> 00:15:23.320
<v Speaker 1>typically bite you in cloud networking?

301
00:15:23.600 --> 00:15:26.519
<v Speaker 2>Oh, there's a whole list. We definitely see IP address

302
00:15:26.519 --> 00:15:29.600
<v Speaker 2>allocation issues pretty often, like a subnet just runs out

303
00:15:29.639 --> 00:15:33.840
<v Speaker 2>of available IPs. IP exhaustion? Yeah. Or worse, those overlapping

304
00:15:33.879 --> 00:15:37.639
<v Speaker 2>CIDR ranges we talked about causing weird routing conflicts if

305
00:15:37.639 --> 00:15:40.759
<v Speaker 2>someone tries to connect things that shouldn't be connected. Then

306
00:15:40.799 --> 00:15:44.519
<v Speaker 2>there are route scale limitations. AWS services have limits on

307
00:15:44.519 --> 00:15:47.240
<v Speaker 2>the number of routes they can handle. Exceed those and

308
00:15:47.320 --> 00:15:50.159
<v Speaker 2>routes might just disappear, or BGP sessions with your on

309
00:15:50.279 --> 00:15:51.679
<v Speaker 2>prem gear might tear down.

310
00:15:51.840 --> 00:15:53.480
<v Speaker 1>Okay, limits again. What else?

311
00:15:53.759 --> 00:15:57.799
<v Speaker 2>Packet size mismatches. This is a subtle one. Issues with

312
00:15:57.840 --> 00:16:02.600
<v Speaker 2>maximum transmission unit (MTU) or maximum segment size (MSS) can

313
00:16:02.679 --> 00:16:06.799
<v Speaker 2>cause fragmentation. This often doesn't look like a network down problem,

314
00:16:06.879 --> 00:16:09.279
<v Speaker 2>but it hits applications. You might see really slow file

315
00:16:09.320 --> 00:16:13.399
<v Speaker 2>transfers or some web apps timing out without obvious network errors.

316
00:16:13.240 --> 00:16:16.159
<v Speaker 1>Right, because the network itself is passing packets, just fragmented

317
00:16:16.200 --> 00:16:18.240
<v Speaker 1>ones the application struggles with.

318
00:16:18.919 --> 00:16:21.240
<v Speaker 2>Exactly. Then we have the hard limits we discussed: bandwidth

319
00:16:21.320 --> 00:16:24.759
<v Speaker 2>throughput limitations which are usually pretty core quotas, and those

320
00:16:24.799 --> 00:16:28.600
<v Speaker 2>tricky PPS limitations causing those silent packet drops that are

321
00:16:28.600 --> 00:16:29.679
<v Speaker 2>so hard to diagnose.

322
00:16:29.960 --> 00:16:30.639
<v Speaker 1>Still scary.

323
00:16:30.759 --> 00:16:34.159
<v Speaker 2>Yeah, and related to that just general packet loss, maybe

324
00:16:34.279 --> 00:16:38.159
<v Speaker 2>due to unreliable transit somewhere between regions, or maybe the

325
00:16:38.279 --> 00:16:42.039
<v Speaker 2>end hosts themselves are just overwhelmed and dropping packets. And finally,

326
00:16:42.519 --> 00:16:47.240
<v Speaker 2>never underestimate plain old security misconfiguration, a wrong rule in

327
00:16:47.279 --> 00:16:51.200
<v Speaker 2>a security group or, more often, a network access control

328
00:16:51.240 --> 00:16:54.879
<v Speaker 2>list (NACL) is a super frequent cause of "it just

329
00:16:54.960 --> 00:16:56.320
<v Speaker 2>doesn't connect" problems.

330
00:16:56.440 --> 00:16:59.240
<v Speaker 1>That's quite a list. Sounds like troubleshooting could be finding

331
00:16:59.279 --> 00:17:03.200
<v Speaker 1>a needle in a haystack. What tools does AWS actually

332
00:17:03.200 --> 00:17:07.240
<v Speaker 1>give you to get visibility to see inside this sometimes

333
00:17:07.240 --> 00:17:08.160
<v Speaker 1>opaque window?

334
00:17:08.519 --> 00:17:11.920
<v Speaker 2>Well, the cornerstone of observability in AWS is definitely Amazon

335
00:17:11.960 --> 00:17:12.759
<v Speaker 2>CloudWatch.

336
00:17:12.599 --> 00:17:15.000
<v Speaker 1>CloudWatch, right, that's for metrics and logs for pretty

337
00:17:15.079 --> 00:17:16.079
<v Speaker 1>much everything.

338
00:17:16.079 --> 00:17:18.640
<v Speaker 2>Exactly. It's the central hub. You need to understand its core components.

339
00:17:18.640 --> 00:17:21.880
<v Speaker 2>There are namespaces, which are basically containers for metrics

340
00:17:21.920 --> 00:17:25.039
<v Speaker 2>from a specific service like EC2 or ELB. Then

341
00:17:25.079 --> 00:17:27.640
<v Speaker 2>the metrics themselves. Those are the actual time series data

342
00:17:27.640 --> 00:17:31.359
<v Speaker 2>points like CPUUtilization or NetworkIn. Then you have dimensions,

343
00:17:31.519 --> 00:17:34.160
<v Speaker 2>which are key-value pairs that help you filter and

344
00:17:34.200 --> 00:17:38.839
<v Speaker 2>group metrics, like InstanceId or AutoScalingGroupName, and

345
00:17:38.920 --> 00:17:41.960
<v Speaker 2>finally periods which define the time interval over which the

346
00:17:42.039 --> 00:17:45.319
<v Speaker 2>data is aggregated, like one minute or five minutes.

347
00:17:45.319 --> 00:17:49.160
<v Speaker 2>CloudWatch is your main dashboard for performance, health, and setting alarms.

348
00:17:49.359 --> 00:17:52.079
<v Speaker 1>So CloudWatch gives you the high-level metrics. But

349
00:17:52.160 --> 00:17:56.319
<v Speaker 1>what about seeing the actual traffic flows, like which connections

350
00:17:56.319 --> 00:17:59.279
<v Speaker 1>are being allowed or denied. That sounds more like VPC

351
00:17:59.400 --> 00:18:00.799
<v Speaker 1>flow logs.

352
00:18:01.119 --> 00:18:04.559
<v Speaker 2>Precisely. VPC flow logs give you metadata about the IP traffic flowing

353
00:18:04.599 --> 00:18:08.480
<v Speaker 2>through your VPC. They capture information for each flow like

354
00:18:08.599 --> 00:18:12.680
<v Speaker 2>source and destination IP, ports, protocol, the number of packets

355
00:18:12.680 --> 00:18:15.640
<v Speaker 2>and bytes, and crucially, the forwarding decision made by the

356
00:18:15.720 --> 00:18:18.839
<v Speaker 2>VPC router, whether the traffic was accepted or rejected.

357
00:18:19.440 --> 00:18:22.200
<v Speaker 1>That accept/reject status seems key for troubleshooting.

358
00:18:22.279 --> 00:18:25.079
<v Speaker 2>It is, but remember, flow logs are not full packet captures.

359
00:18:25.119 --> 00:18:27.000
<v Speaker 2>They don't show you the payload, but they give you

360
00:18:27.079 --> 00:18:30.079
<v Speaker 2>really valuable insight into network level decisions, and you can

361
00:18:30.119 --> 00:18:33.319
<v Speaker 2>even set up custom formats for flow logs now.

362
00:18:33.319 --> 00:18:34.039
<v Speaker 1>Custom formats? How would you use that?

363
00:18:34.200 --> 00:18:37.400
<v Speaker 2>Well, for instance, you could include TCP flags in your logs.

364
00:18:37.880 --> 00:18:43.079
<v Speaker 2>That might help you troubleshoot specific issues like TCP handshake problems.

365
00:18:43.119 --> 00:18:46.880
<v Speaker 2>Are you seeing SYN packets but no SYN-ACKs back?

366
00:18:47.119 --> 00:18:49.799
<v Speaker 2>Things like that. It lets you tailor the logs to

367
00:18:49.880 --> 00:18:51.079
<v Speaker 2>the problem you're investigating.

368
00:18:51.440 --> 00:18:55.279
<v Speaker 1>That's handy. Now to make this concrete, the source material

369
00:18:55.359 --> 00:18:59.319
<v Speaker 1>had this Trailcats troubleshooting example. Can you walk us through that?

370
00:18:59.359 --> 00:19:02.440
<v Speaker 1>It seemed like a good illustration of using these tools systematically.

371
00:19:02.720 --> 00:19:05.640
<v Speaker 2>Yeah, the Trailcats example's classic. They had a website and

372
00:19:05.720 --> 00:19:09.440
<v Speaker 2>it was having these mysterious connectivity problems between two of

373
00:19:09.480 --> 00:19:12.039
<v Speaker 2>its back end servers. So the first thing they did

374
00:19:12.400 --> 00:19:15.279
<v Speaker 2>was enable VPC flow logs, but they did it at

375
00:19:15.279 --> 00:19:17.480
<v Speaker 2>the ENI level for the servers involved.

376
00:19:17.519 --> 00:19:20.440
<v Speaker 1>Okay, looking right at the servers' network interfaces.

377
00:19:20.400 --> 00:19:24.799
<v Speaker 2>Right, and those logs showed nothing rejected, all ACCEPT. So the initial

378
00:19:24.839 --> 00:19:26.599
<v Speaker 2>thought might be, okay, the network's fine, must be an

379
00:19:26.640 --> 00:19:27.839
<v Speaker 2>application problem.

380
00:19:27.519 --> 00:19:29.480
<v Speaker 1>A dead end, potentially.

381
00:19:29.720 --> 00:19:31.759
<v Speaker 2>Exactly. But they didn't stop there. They widened the scope. They

382
00:19:31.880 --> 00:19:34.279
<v Speaker 2>enabled flow logs, but this time at the subnet level.

383
00:19:34.000 --> 00:19:37.519
<v Speaker 1>Ah, one level up from the instance ENI.

384
00:19:37.680 --> 00:19:38.799
<v Speaker 2>Yep, and boom.

385
00:19:39.039 --> 00:19:42.519
<v Speaker 2>The subnet level logs immediately showed rejected traffic between those

386
00:19:42.559 --> 00:19:43.240
<v Speaker 2>two servers.

387
00:19:43.319 --> 00:19:44.400
<v Speaker 1>So what did that point to?

388
00:19:44.759 --> 00:19:48.960
<v Speaker 2>It pointed directly to a network access control list, or NACL.

389
00:19:49.680 --> 00:19:52.960
<v Speaker 2>Because NACLs operate at the subnet boundary, they were blocking

390
00:19:52.960 --> 00:19:55.720
<v Speaker 2>the traffic before it even got to the instance's

391
00:19:55.839 --> 00:19:58.720
<v Speaker 2>ENI. The ENI-level logs never saw the rejected

392
00:19:58.759 --> 00:20:00.920
<v Speaker 2>packets because they never reached the ENI.

393
00:20:01.480 --> 00:20:04.680
<v Speaker 1>That's a brilliant example of how changing your observation point

394
00:20:04.759 --> 00:20:07.759
<v Speaker 1>widening the scope is critical in cloud troubleshooting.

395
00:20:07.960 --> 00:20:09.880
<v Speaker 2>Absolutely, you have to look at the different layers.

396
00:20:09.960 --> 00:20:13.599
<v Speaker 1>Okay, so flow logs give metadata, accept/reject. But what

397
00:20:13.720 --> 00:20:16.400
<v Speaker 1>if you do need to see the actual packet contents,

398
00:20:16.519 --> 00:20:19.200
<v Speaker 1>like you suspect something weird in the payload or you

399
00:20:19.240 --> 00:20:22.480
<v Speaker 1>need deep protocol analysis. Is there an equivalent to plugging

400
00:20:22.480 --> 00:20:25.279
<v Speaker 1>in Wireshark via a SPAN port, like in a

401
00:20:25.279 --> 00:20:26.279
<v Speaker 1>physical data center?

402
00:20:26.359 --> 00:20:30.119
<v Speaker 2>Yes, there is. That's VPC Traffic Mirroring. It essentially provides

403
00:20:30.160 --> 00:20:32.039
<v Speaker 2>that SPAN port capability in the cloud.

404
00:20:32.160 --> 00:20:33.200
<v Speaker 1>Okay, how does it work?

405
00:20:33.319 --> 00:20:35.920
<v Speaker 2>It lets you capture network traffic from a specific source,

406
00:20:36.279 --> 00:20:39.119
<v Speaker 2>usually an EC2 instance's ENI, and mirror

407
00:20:39.160 --> 00:20:41.720
<v Speaker 2>it, send a copy, to a designated target.

408
00:20:41.839 --> 00:20:42.640
<v Speaker 1>What kind of target?

409
00:20:42.839 --> 00:20:45.359
<v Speaker 2>The target could be another ENI, maybe on

410
00:20:45.440 --> 00:20:48.559
<v Speaker 2>an instance running Wireshark or some security tool. Or

411
00:20:48.640 --> 00:20:51.039
<v Speaker 2>it could be a network load balancer or even a

412
00:20:51.079 --> 00:20:54.319
<v Speaker 2>gateway load balancer which might front a whole fleet of

413
00:20:54.400 --> 00:20:56.079
<v Speaker 2>monitoring appliances.

414
00:20:55.559 --> 00:20:58.200
<v Speaker 1>And you can control what traffic gets mirrored? You don't

415
00:20:58.200 --> 00:20:59.920
<v Speaker 1>want to flood your monitoring tool, right?

416
00:21:00.119 --> 00:21:02.880
<v Speaker 2>You use a filter, which is basically an access control

417
00:21:02.920 --> 00:21:06.240
<v Speaker 2>list using the standard five-tuple format (source/destination IP,

418
00:21:07.119 --> 00:21:11.599
<v Speaker 2>port, protocol) to specify exactly which flows you want to mirror,

419
00:21:12.039 --> 00:21:15.160
<v Speaker 2>and the whole thing, source, target, filter, is tied together

420
00:21:15.200 --> 00:21:15.799
<v Speaker 2>in a session.

421
00:21:16.319 --> 00:21:18.920
<v Speaker 1>So source ENI, filter the traffic, send it

422
00:21:18.960 --> 00:21:20.920
<v Speaker 1>to a target for analysis.

423
00:21:20.960 --> 00:21:25.279
<v Speaker 2>Exactly. It's incredibly powerful for deep packet inspection, security threat analysis,

424
00:21:25.720 --> 00:21:29.960
<v Speaker 2>compliance monitoring, and just advanced troubleshooting where flow logs aren't enough.

425
00:21:30.039 --> 00:21:31.000
<v Speaker 2>You get the full packet.

426
00:21:31.039 --> 00:21:34.880
<v Speaker 1>Got it. And just quickly, you mentioned visibility for larger networks.

427
00:21:34.920 --> 00:21:37.640
<v Speaker 1>Transit Gateway Network Manager (TGNM). What's its role?

428
00:21:37.839 --> 00:21:42.279
<v Speaker 2>Think of TGNM primarily as a unified dashboard and visualization tool,

429
00:21:42.839 --> 00:21:47.359
<v Speaker 2>especially if you have a complex network involving multiple transit gateways, VPCs,

430
00:21:47.519 --> 00:21:51.759
<v Speaker 2>VPNs, Direct Connect, maybe even reaching into different AWS regions

431
00:21:51.839 --> 00:21:54.960
<v Speaker 2>or connecting to on premises sites. TGNM helps you see

432
00:21:54.960 --> 00:21:57.599
<v Speaker 2>it all. So it draws you a map? Pretty much. It

433
00:21:57.599 --> 00:22:00.160
<v Speaker 2>gives you a logical and often a geographical view of

434
00:22:00.200 --> 00:22:03.640
<v Speaker 2>your global network topology. You can register your on prem

435
00:22:03.680 --> 00:22:06.640
<v Speaker 2>devices and sites too. It helps with monitoring the health

436
00:22:06.640 --> 00:22:10.319
<v Speaker 2>and status of your TGW attachments and routes all in

437
00:22:10.319 --> 00:22:13.480
<v Speaker 2>one place. It brings that bird's eye view which is

438
00:22:13.559 --> 00:22:14.839
<v Speaker 2>vital as networks scale.

439
00:22:15.000 --> 00:22:19.000
<v Speaker 1>Okay, that makes sense, essential for managing complexity. Now, let's

440
00:22:19.000 --> 00:22:23.039
<v Speaker 1>shift gears slightly, but stay related: security. It's non-negotiable, obviously.

441
00:22:23.279 --> 00:22:26.119
<v Speaker 1>AWS talks about a layered approach to traffic control,

442
00:22:26.200 --> 00:22:27.839
<v Speaker 1>right from the edge all the way down to the instance.

443
00:22:27.880 --> 00:22:30.759
<v Speaker 1>How does that layering start way out at the global edge?

444
00:22:30.640 --> 00:22:34.480
<v Speaker 2>Right. Before traffic even gets close to your VPC, AWS

445
00:22:34.519 --> 00:22:38.480
<v Speaker 2>offers protection. The first line is often AWS Shield. This

446
00:22:38.519 --> 00:22:41.079
<v Speaker 2>is primarily for distributed denial of service, or

447
00:22:41.200 --> 00:22:42.079
<v Speaker 2>DDoS, protection.

448
00:22:42.200 --> 00:22:43.640
<v Speaker 1>Shield, just protection?

449
00:22:43.720 --> 00:22:47.079
<v Speaker 2>Well, Shield Standard is automatically enabled and protects against common network

450
00:22:47.160 --> 00:22:50.200
<v Speaker 2>level DDoS attacks, but Shield Advanced is a paid service

451
00:22:50.200 --> 00:22:52.720
<v Speaker 2>that gives you much more: 24/7 access to

452
00:22:52.799 --> 00:22:58.240
<v Speaker 2>AWS's DDoS Response Team (DRT), detailed attack diagnostics, and importantly,

453
00:22:58.319 --> 00:23:02.640
<v Speaker 2>economic protection. AWS can help cover costs incurred due

454
00:23:02.640 --> 00:23:05.119
<v Speaker 2>to DDoS-driven spikes in usage on services

455
00:23:05.119 --> 00:23:07.000
<v Speaker 2>like CloudFront or load balancers.

456
00:23:07.119 --> 00:23:10.240
<v Speaker 1>Oh, okay, insurance against DDoS costs too. What else is

457
00:23:10.279 --> 00:23:11.079
<v Speaker 1>out at the edge?

458
00:23:11.200 --> 00:23:14.640
<v Speaker 2>Then you have AWS WAF, the Web Application Firewall. This

459
00:23:14.680 --> 00:23:17.799
<v Speaker 2>operates at layer seven, the application layer. It helps protect

460
00:23:17.799 --> 00:23:20.880
<v Speaker 2>your web applications from common exploits like SQL injection, cross-

461
00:23:20.920 --> 00:23:25.200
<v Speaker 2>site scripting (XSS), file inclusion, things that target vulnerabilities in

462
00:23:25.240 --> 00:23:28.559
<v Speaker 2>your application code itself. You apply WAF rules typically to

463
00:23:28.599 --> 00:23:31.279
<v Speaker 2>CloudFront distributions or Application Load Balancers.

464
00:23:31.519 --> 00:23:35.680
<v Speaker 1>So Shield handles the flood, WAF handles the malicious application requests.

465
00:23:35.839 --> 00:23:39.480
<v Speaker 1>Got it. Now, bringing that firewall capability inside your VPCs,

466
00:23:39.680 --> 00:23:43.319
<v Speaker 1>what about AWS Network Firewall? How does that work?

467
00:23:43.400 --> 00:23:47.519
<v Speaker 2>AWS Network Firewall is a managed stateful firewall service that you

468
00:23:47.599 --> 00:23:51.279
<v Speaker 2>deploy within your VPCs. It gives you fine-grained control

469
00:23:51.319 --> 00:23:54.920
<v Speaker 2>over traffic flowing between subnets, between VPCs, or in and

470
00:23:54.960 --> 00:23:57.519
<v Speaker 2>out to the internet or on prem networks. You deploy

471
00:23:57.640 --> 00:24:00.640
<v Speaker 2>firewall endpoints into specific subnets.

472
00:24:01.160 --> 00:24:03.960
<v Speaker 1>And how do people typically design with it? The sources

473
00:24:04.039 --> 00:24:05.319
<v Speaker 1>mention different patterns.

474
00:24:05.440 --> 00:24:07.759
<v Speaker 2>Yeah, there are a few common architectural patterns. One is the

475
00:24:07.799 --> 00:24:11.160
<v Speaker 2>distributed design. You put a Network Firewall endpoint in

476
00:24:11.240 --> 00:24:14.119
<v Speaker 2>basically every VPC that needs protection. Pros and cons? Pro:

477
00:24:14.680 --> 00:24:18.880
<v Speaker 2>granular policy control per VPC, potentially lower latency as traffic

478
00:24:18.920 --> 00:24:21.680
<v Speaker 2>doesn't have to leave the VPC for inspection. Con: it can

479
00:24:21.720 --> 00:24:24.720
<v Speaker 2>get expensive and complex to manage policies across many firewalls.

480
00:24:24.720 --> 00:24:25.839
<v Speaker 1>Okay, what's the alternative?

481
00:24:25.960 --> 00:24:29.759
<v Speaker 2>The centralized design. Here you create a dedicated security VPC

482
00:24:30.039 --> 00:24:35.920
<v Speaker 2>or inspection VPC. All traffic, inter-VPC, internet ingress/egress, VPN/DX

483
00:24:35.960 --> 00:24:39.920
<v Speaker 2>traffic, gets routed through Network Firewall endpoints in this central VPC.

484
00:24:39.680 --> 00:24:43.640
<v Speaker 1>Ah, a single choke point for inspection.

485
00:24:43.319 --> 00:24:47.599
<v Speaker 2>Exactly. Pro: cost savings, fewer endpoints, centralized management and policy

486
00:24:47.680 --> 00:24:51.640
<v Speaker 2>enforcement. Con: it potentially adds latency as traffic has to detour

487
00:24:51.680 --> 00:24:55.599
<v Speaker 2>through the inspection VPC and that VPC becomes a critical dependency.

488
00:24:55.880 --> 00:24:57.640
<v Speaker 1>Makes sense. Any other patterns?

489
00:24:58.039 --> 00:25:01.079
<v Speaker 2>There's also often a combination design trying to get the

490
00:25:01.079 --> 00:25:04.240
<v Speaker 2>best of both worlds. Maybe you centralize inspection for east-

491
00:25:04.279 --> 00:25:07.839
<v Speaker 2>west inter-VPC traffic, but you distribute the firewalls for

492
00:25:07.960 --> 00:25:11.720
<v Speaker 2>internet ingress/egress traffic within each VPC to reduce latency

493
00:25:11.759 --> 00:25:16.319
<v Speaker 2>for external connections. It's about balancing cost, latency, and manageability.

494
00:25:16.519 --> 00:25:19.680
<v Speaker 1>Okay, choices depending on your needs. Now let's drill down

495
00:25:19.680 --> 00:25:23.200
<v Speaker 1>to the absolute fundamentals inside a VPC. Remind us again

496
00:25:23.240 --> 00:25:28.160
<v Speaker 1>about security groups (SGs) versus network access control lists (NACLs).

497
00:25:28.440 --> 00:25:31.000
<v Speaker 1>They both filter traffic. But how are they different and

498
00:25:31.039 --> 00:25:32.319
<v Speaker 1>where do people get tripped up?

499
00:25:32.400 --> 00:25:35.240
<v Speaker 2>This is super important and yeah, confusion here causes a

500
00:25:35.279 --> 00:25:38.240
<v Speaker 2>lot of issues. Okay, security groups, or SGs: think of

501
00:25:38.279 --> 00:25:41.480
<v Speaker 2>them as stateful firewalls operating at the instance level, really

502
00:25:41.519 --> 00:25:42.440
<v Speaker 2>the ENI level.

503
00:25:42.559 --> 00:25:44.559
<v Speaker 1>Stateful, meaning?

504
00:25:44.079 --> 00:25:46.720
<v Speaker 2>Stateful means if you allow an outbound connection from your instance,

505
00:25:47.200 --> 00:25:50.920
<v Speaker 2>say on port 80, the SG automatically allows the return

506
00:25:51.000 --> 00:25:54.480
<v Speaker 2>traffic back to the instance for that specific connection without

507
00:25:54.480 --> 00:25:57.640
<v Speaker 2>needing a separate inbound rule. It understands the connection state.

508
00:25:58.240 --> 00:26:00.440
<v Speaker 2>Also, with SGs, the order of the rules doesn't matter.

509
00:26:00.599 --> 00:26:04.640
<v Speaker 2>All allow rules are evaluated, and importantly, SGs only support

510
00:26:05.079 --> 00:26:08.559
<v Speaker 2>allow rules. There's an implicit deny at the end.

511
00:26:08.599 --> 00:26:12.759
<v Speaker 1>Okay: stateful, instance level, allow rules only, order doesn't matter.

512
00:26:12.799 --> 00:26:14.480
<v Speaker 1>What about NACLs?

513
00:26:14.079 --> 00:26:18.000
<v Speaker 2>Network access control lists, or NACLs. These are stateless firewalls

514
00:26:18.039 --> 00:26:19.480
<v Speaker 2>operating at the subnet level.

515
00:26:19.160 --> 00:26:21.319
<v Speaker 1>Stateless, meaning?

516
00:26:20.839 --> 00:26:24.240
<v Speaker 2>Stateless means they don't track connection state. If you allow outbound

517
00:26:24.279 --> 00:26:26.799
<v Speaker 2>traffic on port 80, you must also have an explicit

518
00:26:26.839 --> 00:26:30.000
<v Speaker 2>inbound rule allowing traffic back on the ephemeral ports, typically

519
00:26:30.039 --> 00:26:32.839
<v Speaker 2>high-numbered ports, for the response to get through. You

520
00:26:32.920 --> 00:26:36.319
<v Speaker 2>need rules for both directions. Ah, more work. Yep, and

521
00:26:36.519 --> 00:26:39.480
<v Speaker 2>NACL rules are processed in order, from the lowest numbered

522
00:26:39.519 --> 00:26:42.079
<v Speaker 2>rule to the highest. The first rule that matches the

523
00:26:42.079 --> 00:26:46.559
<v Speaker 2>traffic is applied, and that's it. Crucially, NACLs support both

524
00:26:46.599 --> 00:26:50.000
<v Speaker 2>allow and deny rules, so you can create explicit blocks.

525
00:26:50.279 --> 00:26:54.079
<v Speaker 2>There's also an implicit deny at the very end, rule number asterisk.

526
00:26:53.880 --> 00:26:59.000
<v Speaker 1>Okay: stateless, subnet level, order matters, allow and deny rules.

527
00:26:59.039 --> 00:27:00.599
<v Speaker 1>So where's the common mistake?

528
00:27:01.279 --> 00:27:04.559
<v Speaker 2>Often people forget the stateless nature of NACLs and don't

529
00:27:04.559 --> 00:27:07.279
<v Speaker 2>add the return traffic rules, or they mess up the

530
00:27:07.359 --> 00:27:09.519
<v Speaker 2>rule order, having a deny rule that

531
00:27:09.559 --> 00:27:12.720
<v Speaker 2>accidentally blocks traffic they intended to allow because it comes

532
00:27:12.720 --> 00:27:15.440
<v Speaker 2>before the allow rule. Because they're at the subnet level.

533
00:27:15.640 --> 00:27:18.759
<v Speaker 2>A misconfigured NACL can cut off a whole group of instances.

534
00:27:18.880 --> 00:27:21.519
<v Speaker 1>Got it. Be careful with NACLs. Now, this raises

535
00:27:21.519 --> 00:27:24.480
<v Speaker 1>an important question: load balancers. We usually think of them

536
00:27:24.519 --> 00:27:27.000
<v Speaker 1>for performance and availability, but how do they fit into

537
00:27:27.039 --> 00:27:30.480
<v Speaker 1>the security and traffic flow picture, especially the different types?

538
00:27:30.599 --> 00:27:33.599
<v Speaker 2>Absolutely, a critical role for both. Okay, let's break it down. First,

539
00:27:33.799 --> 00:27:37.400
<v Speaker 2>Network Load Balancers, or NLBs. These operate down at layer

540
00:27:37.480 --> 00:27:40.480
<v Speaker 2>3/4, the network and transport layers. They look at

541
00:27:40.519 --> 00:27:43.759
<v Speaker 2>IP addresses and ports, typically using a five-tuple hash

542
00:27:44.279 --> 00:27:47.880
<v Speaker 2>(source/destination IP, source/destination port, protocol) to distribute connections.

543
00:27:48.079 --> 00:27:51.440
<v Speaker 1>Okay, lower level. What's key about them for security?

544
00:27:51.799 --> 00:27:54.400
<v Speaker 2>A key feature of NLBs is that they preserve the

545
00:27:54.440 --> 00:27:57.759
<v Speaker 2>original client source IP address when forwarding traffic to the

546
00:27:57.799 --> 00:28:01.920
<v Speaker 2>back end instances. This is super important for logging, security analysis,

547
00:28:01.960 --> 00:28:04.599
<v Speaker 2>or applying IP based rules on the back end. Also,

548
00:28:04.640 --> 00:28:07.000
<v Speaker 2>because they operate at layer four, they're often used for

549
00:28:07.039 --> 00:28:11.119
<v Speaker 2>inserting third-party network virtual appliances (NVAs), like firewalls

550
00:28:11.200 --> 00:28:14.559
<v Speaker 2>or intrusion detection systems into the traffic path non disruptively.

551
00:28:14.640 --> 00:28:18.559
<v Speaker 1>Okay. NLB: layer 4, keeps client IP, good for NVAs.

552
00:28:18.599 --> 00:28:19.599
<v Speaker 1>What about ALBs?

553
00:28:19.880 --> 00:28:23.559
<v Speaker 2>Application Load Balancers, or ALBs. These are smarter, operating up

554
00:28:23.559 --> 00:28:27.119
<v Speaker 2>at layer 7, the application layer: HTTP/HTTPS. They can

555
00:28:27.160 --> 00:28:30.279
<v Speaker 2>make routing decisions based on things like the requested URL path,

556
00:28:30.400 --> 00:28:34.599
<v Speaker 2>like /images or /api, host headers, query string parameters, even the HTTP method.

557
00:28:34.279 --> 00:28:36.359
<v Speaker 1>Much more granular routing.

558
00:28:36.240 --> 00:28:39.240
<v Speaker 2>Exactly, and a big function of ALBs is TLS termination.

559
00:28:39.519 --> 00:28:43.319
<v Speaker 2>They handle the HTTPS decryption/encryption, offloading that compute-intensive

560
00:28:43.319 --> 00:28:45.200
<v Speaker 2>work from your back end web servers.

561
00:28:45.359 --> 00:28:48.079
<v Speaker 1>That sounds good for performance. Any security implications?

562
00:28:48.160 --> 00:28:51.279
<v Speaker 2>Yes, while offloading TLS is efficient, it does mean the

563
00:28:51.279 --> 00:28:54.279
<v Speaker 2>connection between the ALB and your back end instance is

564
00:28:54.319 --> 00:29:00.720
<v Speaker 2>typically unencrypted HTTP unless you specifically configure re-encryption. Technically, it

565
00:29:00.759 --> 00:29:04.119
<v Speaker 2>breaks end-to-end encryption within your VPC boundary there. Also,

566
00:29:04.200 --> 00:29:07.559
<v Speaker 2>because the ALB terminates the connection, the back end instance

567
00:29:07.599 --> 00:29:10.519
<v Speaker 2>doesn't see the original client IP directly. It sees the

568
00:29:10.559 --> 00:29:11.759
<v Speaker 2>ALB's IP.

569
00:29:11.720 --> 00:29:13.880
<v Speaker 1>Ah, so you lose the client IP.

570
00:29:14.119 --> 00:29:16.920
<v Speaker 2>You do, unless the ALB adds the X-Forwarded-For

571
00:29:17.039 --> 00:29:19.960
<v Speaker 2>HTTP header and your application is configured to read and trust

572
00:29:20.000 --> 00:29:22.519
<v Speaker 2>that header to find the original client IP. Many web

573
00:29:22.559 --> 00:29:24.240
<v Speaker 2>frameworks do this, but it's an extra step.

574
00:29:24.319 --> 00:29:28.400
<v Speaker 1>Okay. ALB: layer 7, smart routing, TLS termination with caveats,

575
00:29:28.480 --> 00:29:30.720
<v Speaker 1>needs X-Forwarded-For for client IP. What's

576
00:29:30.759 --> 00:29:32.079
<v Speaker 1>the third type, GLB?

577
00:29:32.000 --> 00:29:35.920
<v Speaker 2>Gateway Load Balancers, or GLBs. These are a bit different.

578
00:29:36.440 --> 00:29:41.119
<v Speaker 2>They are specialized, built on NLB technology, but designed specifically

579
00:29:41.160 --> 00:29:45.599
<v Speaker 2>for simplified service insertion of NVAs, particularly security appliances

580
00:29:45.640 --> 00:29:47.559
<v Speaker 2>like firewalls, IPS/IDS, et cetera.

581
00:29:47.759 --> 00:29:50.599
<v Speaker 1>How do they simplify it compared to using an NLB

582
00:29:50.759 --> 00:29:51.559
<v Speaker 1>for NVAs?

583
00:29:51.920 --> 00:29:56.319
<v Speaker 2>GLBs use a special tunneling protocol called Geneve encapsulation. Essentially,

584
00:29:56.319 --> 00:29:59.400
<v Speaker 2>when traffic hits the GLB, it wraps the original network

585
00:29:59.400 --> 00:30:02.880
<v Speaker 2>packet in another packet, the Geneve packet, and sends it

586
00:30:02.920 --> 00:30:04.839
<v Speaker 2>to one of the registered security appliances, the NVAs.

587
00:30:05.039 --> 00:30:07.039
<v Speaker 1>Okay, it puts the packet in a package. Why?

588
00:30:07.079 --> 00:30:10.039
<v Speaker 2>Because this preserves the entire original packet, headers and

589
00:30:10.079 --> 00:30:12.680
<v Speaker 2>all. The security appliance can inspect it fully, see the

590
00:30:12.680 --> 00:30:16.799
<v Speaker 2>original source and destination everything. Then, after inspection, the appliance

591
00:30:16.839 --> 00:30:20.000
<v Speaker 2>sends the potentially modified or approved packet back to

592
00:30:20.039 --> 00:30:23.279
<v Speaker 2>the GLB, still encapsulated. The GLB unwraps it and sends

593
00:30:23.319 --> 00:30:24.599
<v Speaker 2>the original packet on its way.

594
00:30:24.720 --> 00:30:27.079
<v Speaker 1>Ah, so the NVA doesn't even need to know about

595
00:30:27.079 --> 00:30:30.759
<v Speaker 1>the network routing?

596
00:30:30.799 --> 00:30:34.400
<v Speaker 2>Exactly. The GLB handles all the routing complexities. The NVA just receives packets, inspects them, and

597
00:30:34.480 --> 00:30:37.400
<v Speaker 2>sends them back. It makes the security appliance fleet function

598
00:30:37.519 --> 00:30:40.440
<v Speaker 2>like a transparent bump in the wire, super elegant for

599
00:30:40.440 --> 00:30:42.240
<v Speaker 2>deploying security services scalably.

600
00:30:42.319 --> 00:30:45.759
<v Speaker 1>That is clever. Shifting from network traffic security to

601
00:30:45.880 --> 00:30:50.079
<v Speaker 1>DNS security, equally vital: how does AWS Route 53

602
00:30:50.440 --> 00:30:53.720
<v Speaker 1>help lock down DNS, both for the internet and internally?

603
00:30:53.759 --> 00:30:56.640
<v Speaker 2>Route 53 is AWS's DNS service and it plays

604
00:30:56.680 --> 00:30:59.640
<v Speaker 2>a big role. First, you have public hosted zones. These

605
00:30:59.640 --> 00:31:02.599
<v Speaker 2>manage DNS records for your internet-routable domain names, like

606
00:31:02.599 --> 00:31:05.839
<v Speaker 2>your company's website. Standard DNS stuff, right? But then you

607
00:31:05.880 --> 00:31:08.839
<v Speaker 2>also have private hosted zones. These are associated with one

608
00:31:08.920 --> 00:31:11.200
<v Speaker 2>or more of your VPCs and manage DNS records for

609
00:31:11.240 --> 00:31:13.920
<v Speaker 2>internal domain names that should only be resolvable from within

610
00:31:13.960 --> 00:31:16.200
<v Speaker 2>those VPCs, like service.internal.corp.

611
00:31:15.839 --> 00:31:19.039
<v Speaker 2>Exactly. This allows you to have

612
00:31:19.079 --> 00:31:23.920
<v Speaker 2>a split DNS or split-horizon DNS architecture. Your internal

613
00:31:23.960 --> 00:31:27.880
<v Speaker 2>servers or instances within the VPC can resolve both internal

614
00:31:27.960 --> 00:31:31.440
<v Speaker 2>names from the private hosted zone and external Internet names

615
00:31:31.599 --> 00:31:35.200
<v Speaker 2>via standard DNS resolution. It keeps your internal namespace private

616
00:31:35.240 --> 00:31:38.599
<v Speaker 2>and secure, preventing internal host names or service names from

617
00:31:38.680 --> 00:31:40.440
<v Speaker 2>leaking or being resolved externally.

618
00:31:41.039 --> 00:31:43.960
<v Speaker 1>That makes sense for separating internal and external views. What

619
00:31:44.079 --> 00:31:47.720
<v Speaker 1>about those advanced Route 53 routing policies? The sources

620
00:31:47.759 --> 00:31:53.279
<v Speaker 1>listed simple, failover, latency, weighted, geolocation. Can you quickly unpack

621
00:31:53.319 --> 00:31:54.240
<v Speaker 1>what each is for?

622
00:31:54.200 --> 00:31:58.119
<v Speaker 2>Sure offer powerful traffic managing capabilities. Simple is just basic

623
00:31:58.240 --> 00:32:01.680
<v Speaker 2>round robin DNS, no health checks. Standard failover is for

624
00:32:01.759 --> 00:32:04.400
<v Speaker 2>active passive setups. You define a primary record and a

625
00:32:04.440 --> 00:32:08.000
<v Speaker 2>secondary record. If the primary resource becomes unhealthy based on

626
00:32:08.079 --> 00:32:11.039
<v Speaker 2>Route 53 health checks, Route 53 automatically starts

627
00:32:11.039 --> 00:32:14.599
<v Speaker 2>returning the secondary record's IP. Think the old Twitter fail whale page.

628
00:32:14.160 --> 00:32:16.240
<v Speaker 1>Okay, high availability.

629
00:32:16.960 --> 00:32:19.880
<v Speaker 2>Yep. Latency-based routing is cool. Route 53 has data

630
00:32:19.880 --> 00:32:22.319
<v Speaker 2>on network latency from different parts of the Internet to

631
00:32:22.400 --> 00:32:26.480
<v Speaker 2>AWS regions. It directs users to the AWS endpoint, like

632
00:32:26.519 --> 00:32:29.319
<v Speaker 2>a load balancer or instance in a specific region that

633
00:32:29.400 --> 00:32:30.880
<v Speaker 2>provides the lowest latency for them.

634
00:32:30.839 --> 00:32:34.599
<v Speaker 1>Directing users to the closest healthy server, latency-wise.

635
00:32:34.759 --> 00:32:38.640
<v Speaker 2>Exactly. Weighted routing lets you distribute traffic across multiple resources

636
00:32:38.680 --> 00:32:41.359
<v Speaker 2>based on percentages you define. You could send ninety percent

637
00:32:41.440 --> 00:32:43.680
<v Speaker 2>of traffic to your stable version and ten percent to

638
00:32:43.720 --> 00:32:46.480
<v Speaker 2>a new version for A/B testing, or just balance load

639
00:32:46.519 --> 00:32:50.960
<v Speaker 2>across endpoints unevenly if needed. Very useful for rollouts. And

640
00:32:51.200 --> 00:32:53.960
<v Speaker 2>geolocation routing lets you route traffic based on the user's

641
00:32:53.960 --> 00:32:57.480
<v Speaker 2>actual geographic location, like sending all European users to your

642
00:32:57.519 --> 00:33:00.720
<v Speaker 2>Frankfurt region servers and all US users to your Virginia

643
00:33:00.759 --> 00:33:03.720
<v Speaker 2>region servers. Good for localization or data sovereignty.
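
NOTE
Editor's sketch, not part of the spoken audio. The failover and weighted
policies described above can be modeled in a few lines of Python. This is only
an illustration of the selection logic under simplifying assumptions: the
function names and the (ip, weight) record shape are made up here and are not
a Route 53 API, and the IP addresses are documentation-range placeholders.
```python
import random
def failover(primary_ip, secondary_ip, primary_healthy):
    """Active-passive: answer with the primary unless its health check fails."""
    return primary_ip if primary_healthy else secondary_ip
def weighted(records, rng=random):
    """Pick one IP with probability proportional to its weight.
    records is a list of (ip, weight) pairs; weights need not sum to 100."""
    ips = [ip for ip, _ in records]
    weights = [weight for _, weight in records]
    return rng.choices(ips, weights=weights, k=1)[0]
# Example: roughly 90% of answers go to the stable version, 10% to the new one.
records = [("192.0.2.10", 90), ("192.0.2.20", 10)]
print(failover("192.0.2.10", "198.51.100.10", primary_healthy=False))
print(weighted(records))
```
Passing a seeded random.Random instance as rng makes the weighted choice
repeatable, which is handy when checking the split in a test.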

644
00:33:03.880 --> 00:33:07.400
<v Speaker 1>Got it. And finally, what about DNSSEC? What problem does

645
00:33:07.440 --> 00:33:08.880
<v Speaker 1>that solve? Is it about encryption?

646
00:33:09.240 --> 00:33:13.359
<v Speaker 2>Good question. DNSSEC is not about encrypting DNS queries

647
00:33:13.440 --> 00:33:17.839
<v Speaker 2>or responses. Its purpose is authentication and integrity. It uses

648
00:33:17.880 --> 00:33:21.480
<v Speaker 2>digital signatures and public key cryptography to allow a DNS

649
00:33:21.559 --> 00:33:25.160
<v Speaker 2>resolver, like your ISP's resolver or a public one like Google's,

650
00:33:25.400 --> 00:33:28.359
<v Speaker 2>to verify that the DNS records it receives actually came

651
00:33:28.359 --> 00:33:31.279
<v Speaker 2>from the authoritative DNS server and haven't been tampered with

652
00:33:31.359 --> 00:33:31.920
<v Speaker 2>in transit.

653
00:33:32.119 --> 00:33:35.559
<v Speaker 1>So it prevents DNS spoofing or cache poisoning.

654
00:33:35.599 --> 00:33:37.480
<v Speaker 2>Exactly. It builds a chain of trust from the root DNS

655
00:33:37.519 --> 00:33:40.480
<v Speaker 2>servers down to your domain, ensuring the IP address you

656
00:33:40.480 --> 00:33:43.279
<v Speaker 2>get back for a host name is authentic. It's about trust,

657
00:33:43.480 --> 00:33:44.519
<v Speaker 2>not confidentiality.

658
00:33:44.559 --> 00:33:48.400
<v Speaker 1>Okay, authentication and integrity for DNS makes sense. Yeah, now

659
00:33:48.640 --> 00:33:51.359
<v Speaker 1>let's zoom out again. We've talked about the components, the connections,

660
00:33:51.359 --> 00:33:54.680
<v Speaker 1>the security. How is all this modern cloud infrastructure actually

661
00:33:54.720 --> 00:33:57.480
<v Speaker 1>built and managed, especially at scale? This brings us to

662
00:33:57.559 --> 00:34:01.960
<v Speaker 1>ideas like DevOps and, crucially, infrastructure as code, or IaC.

663
00:34:02.160 --> 00:34:03.960
<v Speaker 1>What's the core philosophy here?

664
00:34:04.319 --> 00:34:07.839
<v Speaker 2>Right. DevOps, broadly, is about breaking down silos between development and

665
00:34:07.880 --> 00:34:13.079
<v Speaker 2>operations teams, focusing on automation, collaboration, and faster, more reliable

666
00:34:13.119 --> 00:34:17.760
<v Speaker 2>software delivery. Infrastructure as code, or IaC, is a key

667
00:34:17.800 --> 00:34:19.960
<v Speaker 2>practice enabling DevOps for infrastructure.

668
00:34:20.199 --> 00:34:21.960
<v Speaker 1>And what's the central idea of IaC?

669
00:34:22.280 --> 00:34:28.079
<v Speaker 2>The core idea is managing and provisioning your infrastructure, your servers, networks, databases,

670
00:34:28.119 --> 00:34:32.760
<v Speaker 2>load balancers, everything through machine readable definition files using code

671
00:34:33.079 --> 00:34:35.920
<v Speaker 2>rather than manual configuration or clicking around in a console.

672
00:34:36.199 --> 00:34:38.840
<v Speaker 2>It's about making your infrastructure ephemeral and malleable.

673
00:34:39.039 --> 00:34:43.559
<v Speaker 1>Ephemeral and malleable, meaning easily created, destroyed, changed?

674
00:34:44.000 --> 00:34:47.320
<v Speaker 2>Precisely. You treat your infrastructure definitions like application code, and this

675
00:34:47.400 --> 00:34:50.840
<v Speaker 2>leads to arguably the most game changing insight of IaC,

676
00:34:51.360 --> 00:34:52.840
<v Speaker 2>the concept of immutability.

677
00:34:52.960 --> 00:34:55.840
<v Speaker 1>Immutability. That sounds important. What does that actually mean when

678
00:34:55.840 --> 00:34:58.119
<v Speaker 1>you're managing cloud resources? Why is it such a big deal

679
00:34:58.159 --> 00:34:59.440
<v Speaker 1>for reliability and speed?

680
00:35:00.000 --> 00:35:04.679
<v Speaker 2>It's a fundamental paradigm shift. Immutability means that once infrastructure is deployed,

681
00:35:04.960 --> 00:35:08.159
<v Speaker 2>say a server or a cluster, you don't make changes

682
00:35:08.159 --> 00:35:10.760
<v Speaker 2>to it directly. You don't log in and patch it

683
00:35:10.880 --> 00:35:11.840
<v Speaker 2>or reconfigure it in place.

684
00:35:11.719 --> 00:35:14.920
<v Speaker 1>So no patching? How does that work?

685
00:35:15.199 --> 00:35:18.000
<v Speaker 2>If you need to update or change something, apply a patch,

686
00:35:18.440 --> 00:35:22.280
<v Speaker 2>deploy new code, change a config, you don't modify the

687
00:35:22.320 --> 00:35:26.559
<v Speaker 2>existing infrastructure. Instead, you build a new instance or set

688
00:35:26.559 --> 00:35:30.239
<v Speaker 2>of instances from your updated IAC definition which includes the

689
00:35:30.280 --> 00:35:33.199
<v Speaker 2>patch or new code. Deploy the new set, test it,

690
00:35:33.360 --> 00:35:36.320
<v Speaker 2>switch traffic over, and then you simply destroy the old

691
00:35:36.760 --> 00:35:38.840
<v Speaker 2>unchanged infrastructure.

692
00:35:38.239 --> 00:35:40.480
<v Speaker 1>The whole cattle, not pets analogy.

693
00:35:40.480 --> 00:35:43.159
<v Speaker 2>Exactly. You don't nurse sick pet servers back to health. You

694
00:35:43.239 --> 00:35:46.480
<v Speaker 2>replace the disposable cattle units. Yeah. You treat your infrastructure

695
00:35:46.519 --> 00:35:50.039
<v Speaker 2>definitions like source code, store them in version control repositories

696
00:35:50.079 --> 00:35:54.880
<v Speaker 2>like Git, use declarative specification documents like YAML for CloudFormation,

697
00:35:55.320 --> 00:35:57.599
<v Speaker 2>and rely on automation to deploy consistently.

698
00:35:57.679 --> 00:35:59.840
<v Speaker 1>And the benefit isn't just avoiding manual work.

699
00:36:00.159 --> 00:36:02.920
<v Speaker 2>No, the core benefit isn't primarily cost savings, although that

700
00:36:02.960 --> 00:36:07.280
<v Speaker 2>can happen. It's about speed, consistency, and safety. Deployments become

701
00:36:07.320 --> 00:36:11.639
<v Speaker 2>repeatable and predictable. Rollbacks are easier. Just redeploy the previous

702
00:36:11.719 --> 00:36:15.039
<v Speaker 2>version of the code. Drift between environments is minimized. If

703
00:36:15.039 --> 00:36:18.199
<v Speaker 2>there's a problem, you don't spend hours troubleshooting a potentially

704
00:36:18.199 --> 00:36:21.719
<v Speaker 2>broken server. You just redeploy a known good state from

705
00:36:21.719 --> 00:36:25.320
<v Speaker 2>code in minutes. It fundamentally changes how you approach operations.

706
00:36:25.519 --> 00:36:29.280
<v Speaker 1>That's a really powerful concept. Okay, so focusing on AWS's

707
00:36:29.280 --> 00:36:34.559
<v Speaker 1>main IaC tool, AWS CloudFormation, how do its templates enable

708
00:36:34.679 --> 00:36:36.360
<v Speaker 1>this immutable approach?

709
00:36:36.519 --> 00:36:39.960
<v Speaker 2>CloudFormation templates are where you define your AWS resources declaratively,

710
00:36:40.079 --> 00:36:42.480
<v Speaker 2>usually in YAML or JSON. You state what you want,

711
00:36:42.519 --> 00:36:44.679
<v Speaker 2>like, I want a VPC with this CIDR block,

712
00:36:44.760 --> 00:36:46.960
<v Speaker 2>I want two subnets inside it. I want an

713
00:36:47.079 --> 00:36:49.719
<v Speaker 2>EC2 instance with this AMI in one subnet. You don't

714
00:36:49.719 --> 00:36:53.079
<v Speaker 2>specify the how, like the API calls to make. CloudFormation

715
00:36:53.159 --> 00:36:53.760
<v Speaker 2>figures that out.

716
00:36:53.840 --> 00:36:56.320
<v Speaker 1>So you declare the desired state.

717
00:36:56.079 --> 00:36:59.039
<v Speaker 2>Exactly. And the power comes from features within the templates. You

718
00:36:59.039 --> 00:37:02.360
<v Speaker 2>have parameters, which let you pass in values at deployment time,

719
00:37:02.760 --> 00:37:07.239
<v Speaker 2>like the VPC CIDR block, or the instance type, or

720
00:37:07.239 --> 00:37:12.000
<v Speaker 2>maybe an environment name: dev, staging, prod. This makes templates reusable.
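
NOTE
Editor's sketch, not part of the spoken audio. A minimal example of what a
parameterized template might look like, written as a Python dict and printed
as JSON. The resource type and property names follow CloudFormation's
documented schema, but the logical IDs (VpcCidr, EnvName, AppVpc) and the
default CIDR block are assumptions chosen for illustration. Ref is the
intrinsic function that reads a parameter's value.
```python
import json
template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    # Values supplied at deployment time, so one template serves many environments.
    "Parameters": {
        "VpcCidr": {"Type": "String", "Default": "10.0.0.0/16"},
        "EnvName": {"Type": "String", "AllowedValues": ["dev", "staging", "prod"]},
    },
    # The desired state: what should exist, not the API calls that create it.
    "Resources": {
        "AppVpc": {
            "Type": "AWS::EC2::VPC",
            "Properties": {
                "CidrBlock": {"Ref": "VpcCidr"},
                "Tags": [{"Key": "Environment", "Value": {"Ref": "EnvName"}}],
            },
        },
    },
}
print(json.dumps(template, indent=2))
```
Deploying the same file with EnvName set to dev, staging, or prod is the
reuse the speakers are describing.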

721
00:37:11.559 --> 00:37:13.880
<v Speaker 1>Okay, parameters for reusability. What else?

722
00:37:13.880 --> 00:37:16.400
<v Speaker 2>And you have intrinsic functions. These are special functions you

723
00:37:16.400 --> 00:37:19.360
<v Speaker 2>can use right inside the template code. For example, Ref

724
00:37:19.719 --> 00:37:22.480
<v Speaker 2>lets you reference the ID or attribute of another resource

725
00:37:22.519 --> 00:37:24.960
<v Speaker 2>defined in the same template, like getting the ID of

726
00:37:25.000 --> 00:37:28.039
<v Speaker 2>the VPC to create a subnet in it. GetAtt

727
00:37:28.039 --> 00:37:32.440
<v Speaker 2>fetches an attribute. Cidr can calculate subnet CIDR blocks based

728
00:37:32.480 --> 00:37:35.400
<v Speaker 2>on a main VPC block. Select can pick an item

729
00:37:35.440 --> 00:37:38.360
<v Speaker 2>from a list, maybe an availability zone. These functions allow

730
00:37:38.400 --> 00:37:42.639
<v Speaker 2>you to create dynamic, interconnected infrastructure definitions without hard coding everything.
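
NOTE
Editor's sketch, not part of the spoken audio. To make the Cidr and Select
functions concrete, the snippet below mimics the arithmetic locally with
Python's standard ipaddress module. It is a rough stand-in for IPv4 only, not
CloudFormation's implementation; the helper name cidr and the logical ID
AppVpc are assumptions. The third argument is the number of host bits, so 8
yields /24 subnets.
```python
import ipaddress
import itertools
def cidr(ip_block, count, cidr_bits):
    """Return the first `count` IPv4 subnets of ip_block, each with cidr_bits host bits."""
    network = ipaddress.ip_network(ip_block)
    subnets = network.subnets(new_prefix=32 - cidr_bits)
    return [str(subnet) for subnet in itertools.islice(subnets, count)]
# The equivalent template expression: pick item 0 from the generated list.
subnet_cidr_expr = {
    "Fn::Select": [0, {"Fn::Cidr": [{"Fn::GetAtt": ["AppVpc", "CidrBlock"]}, 2, 8]}]
}
blocks = cidr("10.0.0.0/16", 2, 8)
print(blocks)     # ['10.0.0.0/24', '10.0.1.0/24']
print(blocks[0])  # what Select with index 0 would pick
```
Nothing here is hard coded except the VPC block itself, which in a real
template would usually come from a parameter.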

731
00:37:42.760 --> 00:37:45.559
<v Speaker 1>So templates define the what, parameters make them reusable, and

732
00:37:45.599 --> 00:37:47.440
<v Speaker 1>functions add dynamic capabilities.

733
00:37:47.920 --> 00:37:51.199
<v Speaker 2>Got it. And quickly, for developers who might prefer Python

734
00:37:51.400 --> 00:37:56.519
<v Speaker 2>or Java over writing YAML, what's the AWS Cloud Development Kit,

735
00:37:56.639 --> 00:37:59.440
<v Speaker 2>or CDK? CDK is another way to do IaC on

736
00:37:59.480 --> 00:38:03.679
<v Speaker 2>AWS. It lets developers define their cloud infrastructure using familiar

737
00:38:03.719 --> 00:38:08.719
<v Speaker 2>programming languages: Python, TypeScript, Java, C#, Go. You write

738
00:38:08.840 --> 00:38:13.039
<v Speaker 2>code using CDK constructs, which represent AWS resources.

739
00:38:13.280 --> 00:38:17.440
<v Speaker 1>So you write Python code to define a VPC?

740
00:38:17.400 --> 00:38:20.239
<v Speaker 2>Yep, and then when you run the CDK toolkit, it synthesizes your

741
00:38:20.239 --> 00:38:23.159
<v Speaker 2>Python code or whatever language you used into a standard

742
00:38:23.199 --> 00:38:24.960
<v Speaker 2>AWS CloudFormation template.

743
00:38:25.199 --> 00:38:28.840
<v Speaker 1>Ah, so it generates the CloudFormation for you.

744
00:38:29.239 --> 00:38:31.039
<v Speaker 2>Exactly. The benefit is you can use the power of your

745
00:38:31.039 --> 00:38:36.320
<v Speaker 2>programming language: loops, conditionals, object oriented programming, existing libraries, code

746
00:38:36.320 --> 00:38:39.599
<v Speaker 2>completion in your IDE, to define infrastructure. You can create

747
00:38:39.639 --> 00:38:43.440
<v Speaker 2>reusable components and patterns more easily than with raw CloudFormation templates, sometimes.

748
00:38:43.480 --> 00:38:47.760
<v Speaker 2>It's particularly good for complex applications or for teams

749
00:38:47.800 --> 00:38:51.079
<v Speaker 2>already comfortable with those programming languages, allowing them to define

750
00:38:51.079 --> 00:38:54.480
<v Speaker 2>both their app and its infrastructure in the same language ecosystem.
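
NOTE
Editor's sketch, not part of the spoken audio. The idea of synthesizing a
template from ordinary code can be shown without the CDK itself. The function
below is plain Python and deliberately does not use the real CDK library or
its construct classes; synthesize and the logical IDs are hypothetical names.
It only illustrates how a loop can stamp out repeated resources that would
otherwise be written by hand in YAML or JSON.
```python
import json
def synthesize(vpc_cidr, subnet_cidrs):
    """Build a CloudFormation-style template dict from ordinary Python values."""
    resources = {
        "Vpc": {"Type": "AWS::EC2::VPC", "Properties": {"CidrBlock": vpc_cidr}}
    }
    # A plain loop adds one subnet resource per CIDR block.
    for index, block in enumerate(subnet_cidrs):
        resources[f"Subnet{index}"] = {
            "Type": "AWS::EC2::Subnet",
            "Properties": {"VpcId": {"Ref": "Vpc"}, "CidrBlock": block},
        }
    return {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}
template = synthesize("10.0.0.0/16", ["10.0.0.0/24", "10.0.1.0/24"])
print(json.dumps(template, indent=2))
```
Adding a third subnet is a one-item change to the list, which is the kind of
leverage the speakers attribute to using a general-purpose language.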

751
00:38:54.840 --> 00:38:59.559
<v Speaker 1>Very cool, leveraging programming skills for infrastructure. Okay, huh, that

752
00:38:59.599 --> 00:39:01.519
<v Speaker 1>was quite the journey. So there you have it, a

753
00:39:01.559 --> 00:39:04.679
<v Speaker 1>real deep dive pulling back the layers on AWS networking.

754
00:39:04.760 --> 00:39:07.599
<v Speaker 1>We went from the basic virtual network cards, the

755
00:39:07.760 --> 00:39:10.840
<v Speaker 1>ENIs, and wrestled with Hyperplane's hidden limits.

756
00:39:10.280 --> 00:39:12.280
<v Speaker 2>Those PPS limits, right.

757
00:39:12.480 --> 00:39:15.599
<v Speaker 1>Then we looked at all the ways to connect things: peering, TGW,

758
00:39:15.719 --> 00:39:19.239
<v Speaker 1>Direct Connect, and the security layers from Shield and WAF

759
00:39:19.320 --> 00:39:23.679
<v Speaker 1>down to SGs and NACLs, plus the nuances of different load balancers,

760
00:39:24.239 --> 00:39:27.679
<v Speaker 1>and finally how modern teams actually build and manage all

761
00:39:27.719 --> 00:39:30.360
<v Speaker 1>this using infrastructure as code.

762
00:39:30.840 --> 00:39:34.400
<v Speaker 2>It really shows how AWS provides this incredibly rich toolkit

763
00:39:34.519 --> 00:39:37.880
<v Speaker 2>right, for control, for monitoring, for automation. It turns what

764
00:39:37.920 --> 00:39:41.239
<v Speaker 2>could be overwhelming complexity into manageable patterns, which is what

765
00:39:41.360 --> 00:39:44.679
<v Speaker 2>ultimately enables all that innovation we see happening in the cloud. Absolutely,

766
00:39:44.760 --> 00:39:47.719
<v Speaker 2>and I think understanding these layers, you know, from the

767
00:39:47.760 --> 00:39:51.039
<v Speaker 2>foundational ENIs, the strategic use of transit gateways, all

768
00:39:51.079 --> 00:39:54.119
<v Speaker 2>the way to that philosophical shift towards infrastructure as code,

769
00:39:54.400 --> 00:39:57.679
<v Speaker 2>it means you're not just like using cloud services passively,

770
00:39:57.920 --> 00:40:00.000
<v Speaker 2>you're truly starting to harness their underlying power.

771
00:40:00.440 --> 00:40:03.639
<v Speaker 1>Yeah, understanding the physics behind the magic.

772
00:40:03.679 --> 00:40:06.920
<v Speaker 2>Exactly. It encourages us, encourages you listening, to look beyond the

773
00:40:07.000 --> 00:40:12.119
<v Speaker 2>console surface and constantly ask, Okay, what hidden network physics

774
00:40:12.239 --> 00:40:14.199
<v Speaker 2>might be at play here, and how can we leverage

775
00:40:14.239 --> 00:40:17.559
<v Speaker 2>that or maybe mitigate its effects to build something even better,

776
00:40:17.719 --> 00:40:18.840
<v Speaker 2>even more innovative.

777
00:40:19.039 --> 00:40:21.800
<v Speaker 1>That is a powerful thought to leave everyone with, isn't it?

778
00:40:21.960 --> 00:40:26.400
<v Speaker 1>How can you leverage or mitigate those hidden physics? Fantastic. Well,

779
00:40:26.400 --> 00:40:28.840
<v Speaker 1>thank you for joining us on this deep dive. Until

780
00:40:28.880 --> 00:40:31.559
<v Speaker 1>next time, keep that curiosity well fed.
