WEBVTT

1
00:00:00.160 --> 00:00:04.160
<v Speaker 1>Have you ever wondered why pressing one for customer service

2
00:00:04.160 --> 00:00:07.200
<v Speaker 1>on a modern VoIP call doesn't just, you know, send

3
00:00:07.240 --> 00:00:10.400
<v Speaker 1>the actual physical beep of the button over the internet, right?

4
00:00:10.480 --> 00:00:13.119
<v Speaker 2>Yeah, people assume it's just raw audio going across the wire.

5
00:00:12.839 --> 00:00:15.560
<v Speaker 1>Exactly. But if you fed that raw, that

6
00:00:15.759 --> 00:00:21.320
<v Speaker 1>dual-tone multi-frequency sound into standard low-bitrate network compression,

7
00:00:21.839 --> 00:00:25.239
<v Speaker 1>the algorithms would just, well, completely shred the frequency.

8
00:00:25.039 --> 00:00:26.719
<v Speaker 2>Absolutely destroy it.

9
00:00:26.879 --> 00:00:29.640
<v Speaker 1>Yeah. The automated system on the other end wouldn't hear

10
00:00:29.640 --> 00:00:33.520
<v Speaker 1>a clean one. It would hear this garbled, unrecognizable noise,

11
00:00:33.560 --> 00:00:35.679
<v Speaker 1>and you'd just be stuck in menu purgatory forever.

12
00:00:36.280 --> 00:00:39.920
<v Speaker 2>It is a perfect example of that invisible friction

13
00:00:40.039 --> 00:00:45.320
<v Speaker 2>between legacy analog technology and, you know, modern packet-switched networks.

14
00:00:45.759 --> 00:00:49.320
<v Speaker 2>We expect a seamless experience, but beneath the surface, there

15
00:00:49.359 --> 00:00:52.840
<v Speaker 2>is an incredibly complex choreography required to just make a

16
00:00:52.840 --> 00:00:53.600
<v Speaker 2>simple phone ring.

17
00:00:53.840 --> 00:00:57.000
<v Speaker 1>Welcome to the Deep Dive. For you listening, whether you

18
00:00:57.000 --> 00:00:59.679
<v Speaker 1>are managing an enterprise network or just like intensely curious

19
00:00:59.679 --> 00:01:02.439
<v Speaker 1>about the hidden infrastructure of daily life. We have a

20
00:01:02.479 --> 00:01:04.319
<v Speaker 1>really fascinating mission today too.

21
00:01:04.359 --> 00:01:06.519
<v Speaker 2>We're looking at some heavy source material today.

22
00:01:06.799 --> 00:01:10.319
<v Speaker 1>We are. We are pulling apart a highly detailed Cisco

23
00:01:10.400 --> 00:01:14.719
<v Speaker 1>student guide. It's titled Implementing Cisco Voice Gateways and Gatekeepers,

24
00:01:15.280 --> 00:01:19.239
<v Speaker 1>and our goal is to decode the specific architectural mechanisms

25
00:01:19.280 --> 00:01:24.079
<v Speaker 1>companies use to migrate from traditional physical PBX systems and

26
00:01:24.280 --> 00:01:26.560
<v Speaker 1>the PSTN to IP telephony.

27
00:01:26.680 --> 00:01:29.519
<v Speaker 2>Right, and doing all of that without dropping a single

28
00:01:29.560 --> 00:01:31.480
<v Speaker 2>syllable of your conversation.

29
00:01:31.560 --> 00:01:34.000
<v Speaker 1>Exactly. Okay, let's unpack this. We should probably start at the

30
00:01:34.040 --> 00:01:36.480
<v Speaker 1>exact border, right where the two networks meet.

31
00:01:36.680 --> 00:01:39.359
<v Speaker 2>Yeah, because the transition from circuit-switched networks to

32
00:01:39.400 --> 00:01:41.680
<v Speaker 2>packet-switched networks isn't just a matter of, you know,

33
00:01:41.719 --> 00:01:45.799
<v Speaker 2>swapping out the desk phones. The PSTN speaks a fundamentally different

34
00:01:45.879 --> 00:01:50.439
<v Speaker 2>language than an IP WAN, so bridging that gap requires

35
00:01:50.519 --> 00:01:54.519
<v Speaker 2>highly specialized hardware acting as translators at the very edge

36
00:01:54.560 --> 00:01:55.120
<v Speaker 2>of the network.

37
00:01:55.319 --> 00:01:58.400
<v Speaker 1>So the device sitting at that demarcation point where the

38
00:01:58.400 --> 00:02:01.840
<v Speaker 1>analog world literally collides with the digital IP world is

39
00:02:01.879 --> 00:02:04.879
<v Speaker 1>the voice gateway. Now reading through the source material, I

40
00:02:04.879 --> 00:02:07.200
<v Speaker 1>initially thought of the gateway as a sort of I

41
00:02:07.200 --> 00:02:10.240
<v Speaker 1>don't know, universal travel adapter. A travel adapter? Yeah, like

42
00:02:10.280 --> 00:02:12.759
<v Speaker 1>you plug a legacy PBX into it, it changes the

43
00:02:12.800 --> 00:02:15.000
<v Speaker 1>shape of the signal and then just spits it out

44
00:02:15.039 --> 00:02:16.159
<v Speaker 1>onto the IP network.

45
00:02:16.280 --> 00:02:18.800
<v Speaker 2>I see what you mean, But a travel adapter is

46
00:02:18.840 --> 00:02:22.800
<v Speaker 2>a completely passive, dumb device. A voice gateway is much

47
00:02:22.800 --> 00:02:26.240
<v Speaker 2>closer to, like, a bilingual diplomatic envoy.

48
00:02:25.840 --> 00:02:27.479
<v Speaker 1>A diplomat. Okay, yeah.

49
00:02:27.479 --> 00:02:31.439
<v Speaker 2>It doesn't just pass signals blindly. It actively negotiates the connection.

50
00:02:31.879 --> 00:02:35.520
<v Speaker 2>Its primary function is to physically convert IP packets into

51
00:02:35.560 --> 00:02:38.360
<v Speaker 2>analog or digital voice signals and vice versa.

52
00:02:38.479 --> 00:02:41.759
<v Speaker 1>So it connects the IP network to the PSTN or

53
00:02:41.800 --> 00:02:43.599
<v Speaker 1>a legacy PBX.

54
00:02:44.159 --> 00:02:47.479
<v Speaker 2>Exactly. But it also has to manage signaling protocols, coordinate supplementary

55
00:02:47.479 --> 00:02:51.560
<v Speaker 2>services like hold and transfer, and provide local processing power.

56
00:02:51.120 --> 00:02:53.800
<v Speaker 1>Which brings us back to that pressing-one problem

57
00:02:53.800 --> 00:02:58.120
<v Speaker 1>from earlier. The text highlights this mechanism called DTMF relay.

58
00:02:57.919 --> 00:03:00.000
<v Speaker 2>Right, the dual-tone multi-frequency relay.

59
00:03:00.120 --> 00:03:03.520
<v Speaker 1>Yeah, because those dial tones get destroyed by voice compression codecs,

60
00:03:03.879 --> 00:03:07.039
<v Speaker 1>the gateway actually has to actively intervene. It intercepts your

61
00:03:07.039 --> 00:03:09.599
<v Speaker 1>button press, pulls that specific tone out of your main

62
00:03:09.719 --> 00:03:12.800
<v Speaker 1>voice audio, the bearer stream, and translates it.

63
00:03:12.800 --> 00:03:16.120
<v Speaker 2>It translates it into a pure mathematical digital message.

64
00:03:15.800 --> 00:03:18.960
<v Speaker 1>Exactly, and then it routes that message out of

65
00:03:19.000 --> 00:03:23.240
<v Speaker 1>band, through a completely separate signaling channel, just bypassing the

66
00:03:23.280 --> 00:03:24.680
<v Speaker 1>audio compression entirely.

67
00:03:24.800 --> 00:03:27.719
<v Speaker 2>The gateway essentially says I won't send the sound of

68
00:03:27.719 --> 00:03:31.039
<v Speaker 2>the button, I will send a digital packet, explicitly stating

69
00:03:31.080 --> 00:03:32.240
<v Speaker 2>that the user pressed one.

70
00:03:32.599 --> 00:03:34.439
<v Speaker 1>That's so smart.

71
00:03:34.680 --> 00:03:38.639
<v Speaker 2>It really is. The receiving gateway or call manager then synthesizes that exact

72
00:03:38.639 --> 00:03:41.479
<v Speaker 2>tone on the other side. It is a brilliant workaround

73
00:03:41.520 --> 00:03:45.400
<v Speaker 2>to preserve the integrity of the signaling without sacrificing the

74
00:03:45.439 --> 00:03:47.360
<v Speaker 2>bandwidth savings of compressed audio.

75
00:03:47.599 --> 00:03:50.360
<v Speaker 1>Now the text grounds all this theory in a massive

76
00:03:50.479 --> 00:03:53.400
<v Speaker 1>real-world case study: Span Engineering.

77
00:03:53.560 --> 00:03:56.039
<v Speaker 2>Ah, yes, the Span Engineering migration.

78
00:03:56.039 --> 00:04:00.000
<v Speaker 1>Right. They're executing this phased migration. Phase one is a straightforward

79
00:04:00.080 --> 00:04:03.360
<v Speaker 1>toll bypass between their Chicago and Dallas offices. They are

80
00:04:03.360 --> 00:04:07.400
<v Speaker 1>basically routing interoffice calls over their existing IP data network

81
00:04:07.439 --> 00:04:10.479
<v Speaker 1>to completely eliminate long-distance PSTN charges.

82
00:04:10.000 --> 00:04:12.400
<v Speaker 2>Which is a huge cost savings.

83
00:04:11.919 --> 00:04:15.639
<v Speaker 1>Massive. But phase two is an incredibly complex hybrid environment.

84
00:04:15.680 --> 00:04:18.360
<v Speaker 1>They are integrating their new Cisco CallManager clusters with

85
00:04:18.439 --> 00:04:22.759
<v Speaker 1>existing legacy PBXs across San Francisco, Chicago, Dallas, and São Paulo.

86
00:04:22.839 --> 00:04:26.480
<v Speaker 2>Span Engineering is dealing with the messy reality of enterprise IT.

87
00:04:27.360 --> 00:04:32.360
<v Speaker 2>I mean, you cannot just forklift a global communications infrastructure overnight.

88
00:04:32.399 --> 00:04:33.360
<v Speaker 1>No, you'd break everything.

89
00:04:33.480 --> 00:04:36.319
<v Speaker 2>Exactly. You have to maintain parity between the legacy copper

90
00:04:36.360 --> 00:04:38.759
<v Speaker 2>wire in São Paulo and the brand-new IP phones

91
00:04:38.800 --> 00:04:41.920
<v Speaker 2>in San Francisco. The gateways are literally the only things

92
00:04:42.000 --> 00:04:43.480
<v Speaker 2>keeping those offices communicating.

93
00:04:43.600 --> 00:04:45.120
<v Speaker 1>But here's where I want to push back on the

94
00:04:45.199 --> 00:04:48.600
<v Speaker 1>architecture a bit. Okay, if the voice gateway is essentially

95
00:04:48.600 --> 00:04:52.879
<v Speaker 1>translating between the IP world and the legacy PBX, why

96
00:04:52.920 --> 00:04:55.639
<v Speaker 1>does it need built-in survival instincts?

97
00:04:55.680 --> 00:04:56.680
<v Speaker 2>Survival instincts?

98
00:04:56.759 --> 00:05:00.120
<v Speaker 1>Yeah, the text mentions it needs to support call manager redundancy.

99
00:05:00.639 --> 00:05:03.639
<v Speaker 1>If the call manager, like the brain of the network

100
00:05:03.639 --> 00:05:06.920
<v Speaker 1>goes down, shouldn't the call just drop? Why is the

101
00:05:07.040 --> 00:05:09.199
<v Speaker 1>edge device responsible for keeping the lights on?

102
00:05:09.680 --> 00:05:13.680
<v Speaker 2>What's fascinating here is how IP voice architecture handles statefulness.

103
00:05:14.279 --> 00:05:19.240
<v Speaker 2>In a traditional, say, centralized data application, if the server crashes,

104
00:05:19.759 --> 00:05:22.199
<v Speaker 2>the client session dies. That's just how it works, right?

105
00:05:22.240 --> 00:05:22.519
<v Speaker 2>You just

106
00:05:22.439 --> 00:05:25.000
<v Speaker 1>get a 404 error or whatever.

107
00:05:25.800 --> 00:05:29.560
<v Speaker 2>Exactly. But voice is real time and it's mission critical. The

108
00:05:29.600 --> 00:05:34.079
<v Speaker 2>gateway is specifically engineered to preserve the RTP bearer stream,

109
00:05:34.600 --> 00:05:37.920
<v Speaker 2>the actual packets carrying your voice, independently of the call

110
00:05:37.959 --> 00:05:42.360
<v Speaker 2>control signaling. Wait, really? Independently? Yes. If the primary call

111
00:05:42.399 --> 00:05:45.800
<v Speaker 2>manager fails mid conversation, the gateway is smart enough to

112
00:05:45.920 --> 00:05:50.079
<v Speaker 2>immediately rehome its signaling to a secondary call manager. Wow.

113
00:05:50.120 --> 00:05:54.160
<v Speaker 2>And more importantly, it keeps routing the active RTP voice

114
00:05:54.160 --> 00:05:57.600
<v Speaker 2>packets between the two endpoints. The brain of the network

115
00:05:57.639 --> 00:06:00.800
<v Speaker 2>can completely crash and the two people talking will never

116
00:06:00.879 --> 00:06:01.720
<v Speaker 2>even hear a blip.

117
00:06:01.800 --> 00:06:04.480
<v Speaker 1>That active intelligence at the edge is incredible. It really

118
00:06:04.560 --> 00:06:06.959
<v Speaker 1>is like a diplomat ensuring the treaty holds even if

119
00:06:06.959 --> 00:06:08.160
<v Speaker 1>the capital city goes dark.

120
00:06:08.319 --> 00:06:09.319
<v Speaker 2>That's a great way to put it.

121
00:06:09.519 --> 00:06:12.160
<v Speaker 1>But looking at Span Engineering's roadmap as they scale up

122
00:06:12.199 --> 00:06:15.759
<v Speaker 1>Phase two, they are going to have dozens, maybe hundreds

123
00:06:15.759 --> 00:06:17.879
<v Speaker 1>of these gateways globally. If I'm doing the math on

124
00:06:17.920 --> 00:06:21.360
<v Speaker 1>a full-mesh topology, configuring every single gateway to know

125
00:06:21.399 --> 00:06:24.439
<v Speaker 1>the explicit IP address of every other gateway, oh, that

126
00:06:24.480 --> 00:06:27.879
<v Speaker 1>means maintaining thousands of individual peer-to-peer connections. That

127
00:06:27.959 --> 00:06:30.680
<v Speaker 1>is an absolute scaling and management nightmare.

128
00:06:30.759 --> 00:06:35.319
<v Speaker 2>It is mathematically unsustainable. A full mesh network of gateways

129
00:06:35.360 --> 00:06:40.120
<v Speaker 2>just becomes impossible to troubleshoot at that scale. To solve that,

130
00:06:40.399 --> 00:06:43.600
<v Speaker 2>the architecture introduces a new layer of abstraction and control,

131
00:06:44.439 --> 00:06:45.000
<v Speaker 2>the gatekeeper.

132
00:06:45.079 --> 00:06:47.519
<v Speaker 1>Yes, the gatekeeper. The source defines it as an

133
00:06:47.639 --> 00:06:50.360
<v Speaker 1>H.323 LAN device, right? Yes, that's correct.

134
00:06:50.680 --> 00:06:53.920
<v Speaker 1>Its job is to group these individual gateways into logical

135
00:06:54.040 --> 00:06:58.160
<v Speaker 1>zones and provide centralized dial plan administration. So if the

136
00:06:58.160 --> 00:07:01.240
<v Speaker 1>gateways are the diplomats at the border, the gatekeeper is

137
00:07:01.240 --> 00:07:03.240
<v Speaker 1>sort of like the central routing command.

138
00:07:03.360 --> 00:07:05.839
<v Speaker 2>It functions very much like air traffic control. The text

139
00:07:05.920 --> 00:07:10.120
<v Speaker 2>actually separates the gatekeeper's duties into mandatory and optional functions. Right.

140
00:07:10.240 --> 00:07:13.959
<v Speaker 2>The most critical mandatory function is address translation. When a

141
00:07:14.040 --> 00:07:16.680
<v Speaker 2>user in Chicago dials a standard E.164

142
00:07:16.720 --> 00:07:18.720
<v Speaker 2>phone number, just a regular phone number, for the

143
00:07:18.759 --> 00:07:21.879
<v Speaker 2>São Paulo office, the originating gateway doesn't know the IP

144
00:07:21.959 --> 00:07:22.519
<v Speaker 2>address of the São Paulo gateway.

145
00:07:22.519 --> 00:07:24.759
<v Speaker 1>It has no idea where to send the packets.

146
00:07:24.680 --> 00:07:28.360
<v Speaker 2>Exactly, so it queries the gatekeeper. The gatekeeper resolves

147
00:07:28.399 --> 00:07:30.600
<v Speaker 2>that E.164 number, maps it to

148
00:07:30.680 --> 00:07:33.279
<v Speaker 2>the correct destination IP address, and hands it back so

149
00:07:33.319 --> 00:07:34.240
<v Speaker 2>the call can connect.

150
00:07:34.399 --> 00:07:37.399
<v Speaker 1>It's kind of like if gateways are individual cell towers.

151
00:07:37.480 --> 00:07:40.480
<v Speaker 1>The gatekeeper is the GPS routing system telling the data

152
00:07:40.600 --> 00:07:43.279
<v Speaker 1>which tower to use, so there isn't a massive traffic jam.

153
00:07:43.399 --> 00:07:44.959
<v Speaker 2>That's a very solid analogy. Yeah.

154
00:07:45.120 --> 00:07:47.279
<v Speaker 1>The other mandatory function is where we get into the

155
00:07:47.279 --> 00:07:52.720
<v Speaker 1>physics of the network: bandwidth control, specifically call admission control,

156
00:07:53.160 --> 00:07:57.079
<v Speaker 1>or CAC. This is fundamentally about protecting the network from itself,

157
00:07:57.120 --> 00:07:57.800
<v Speaker 1>isn't it.

158
00:07:57.800 --> 00:08:01.319
<v Speaker 2>It is entirely about protecting the strict quality of service required

159
00:08:01.560 --> 00:08:05.600
<v Speaker 2>for real-time voice. In a distributed enterprise like Span Engineering,

160
00:08:05.920 --> 00:08:09.279
<v Speaker 2>the IP WAN links connecting those cities have finite bandwidth.

161
00:08:08.879 --> 00:08:11.000
<v Speaker 1>Right, they aren't infinite pipes.

162
00:08:11.160 --> 00:08:13.959
<v Speaker 2>No. Let's say the link between Chicago and Dallas is

163
00:08:13.959 --> 00:08:16.680
<v Speaker 2>configured to allow exactly five hundred and twelve kilobits per

164
00:08:16.680 --> 00:08:19.399
<v Speaker 2>second for voice traffic. If too many users try to

165
00:08:19.399 --> 00:08:23.839
<v Speaker 2>initiate calls simultaneously and the traffic exceeds that threshold, packet

166
00:08:23.879 --> 00:08:25.759
<v Speaker 2>delay and jitter just spike.

167
00:08:25.560 --> 00:08:29.600
<v Speaker 1>And you immediately get that terrifying robotic underwater audio distortion.

168
00:08:29.800 --> 00:08:31.480
<v Speaker 2>Yes, the underwater robot voice.

169
00:08:31.480 --> 00:08:34.799
<v Speaker 1>Nobody wants that, So the gatekeeper prevents the overload before

170
00:08:34.840 --> 00:08:38.480
<v Speaker 1>it even starts. Before a gateway is allowed to establish

171
00:08:38.480 --> 00:08:42.000
<v Speaker 1>a session, it must send an admission request to the gatekeeper. Right,

172
00:08:42.279 --> 00:08:45.440
<v Speaker 1>The gatekeeper looks at the active zone bandwidth. If that

173
00:08:45.559 --> 00:08:49.879
<v Speaker 1>five hundred and twelve kilobit threshold is reached, it mathematically

174
00:08:49.919 --> 00:08:53.200
<v Speaker 1>rejects the new call over the IP WAN. But it

175
00:08:53.240 --> 00:08:55.679
<v Speaker 1>doesn't just give the user a busy signal, does it.

176
00:08:55.840 --> 00:09:00.200
<v Speaker 2>Usually, no. A properly configured network will pivot that rejected

177
00:09:00.200 --> 00:09:03.960
<v Speaker 2>IP call and seamlessly route it out over the traditional PSTN

178
00:09:04.000 --> 00:09:04.879
<v Speaker 2>toll lines instead.

179
00:09:05.039 --> 00:09:08.120
<v Speaker 1>Ah, so it falls back to the old reliable copper.

180
00:09:08.559 --> 00:09:11.200
<v Speaker 2>Exactly. It costs the company a few cents in long-distance charges,

181
00:09:11.480 --> 00:09:14.399
<v Speaker 2>but it preserves the pristine audio quality for the calls

182
00:09:14.440 --> 00:09:18.120
<v Speaker 2>already utilizing the WAN and ensures the new caller still connects.

183
00:09:18.360 --> 00:09:20.919
<v Speaker 1>That's brilliant. And as Span grows, they can link these

184
00:09:20.960 --> 00:09:25.519
<v Speaker 1>zones together using intercluster trunks, which are H.323

185
00:09:25.600 --> 00:09:29.240
<v Speaker 1>connections bridging entire Cisco CallManager clusters over the

186
00:09:29.240 --> 00:09:29.919
<v Speaker 1>WAN.

187
00:09:30.039 --> 00:09:31.960
<v Speaker 2>Right. And if we connect this to the bigger picture, they

188
00:09:32.000 --> 00:09:33.840
<v Speaker 2>can even deploy a directory gatekeeper.

189
00:09:33.919 --> 00:09:35.759
<v Speaker 1>A directory gatekeeper?

190
00:09:35.559 --> 00:09:39.039
<v Speaker 2>Yeah, it's essentially a master air traffic controller that just manages

191
00:09:39.080 --> 00:09:42.960
<v Speaker 2>the regional gatekeepers. It is a highly modular hierarchy that

192
00:09:43.080 --> 00:09:46.960
<v Speaker 2>scales from a medium network up to a massive global one.

193
00:09:47.000 --> 00:09:50.600
<v Speaker 1>That modularity really dictates the blueprint of the physical network.

194
00:09:50.879 --> 00:09:54.519
<v Speaker 1>The source material outlines three distinct deployment models, and the

195
00:09:54.559 --> 00:09:58.279
<v Speaker 1>physical layout changes the entire routing logic and hardware requirements.

196
00:09:58.360 --> 00:09:59.519
<v Speaker 2>Let's walk through those blueprints.

197
00:09:59.600 --> 00:10:01.759
<v Speaker 1>Okay, First is the single site model. This is the

198
00:10:01.759 --> 00:10:07.159
<v Speaker 1>simplest topology, one physical location like Span's São Paulo office pre-migration.

199
00:10:07.840 --> 00:10:10.559
<v Speaker 1>All the call processing happens locally on a single call

200
00:10:10.639 --> 00:10:11.759
<v Speaker 1>manager cluster.

201
00:10:11.639 --> 00:10:14.320
<v Speaker 2>Right, and the external calls route straight out of a

202
00:10:14.320 --> 00:10:15.879
<v Speaker 2>gateway to the local PSTN.

203
00:10:16.080 --> 00:10:19.279
<v Speaker 1>The crucial technical detail here is that they exclusively use

204
00:10:19.399 --> 00:10:22.639
<v Speaker 1>G.711 codecs. Because G.711

205
00:10:22.639 --> 00:10:25.679
<v Speaker 1>is uncompressed, high-quality audio, the gateway doesn't

206
00:10:25.720 --> 00:10:29.000
<v Speaker 1>need to be populated with digital signal processors or DSPs

207
00:10:29.279 --> 00:10:34.679
<v Speaker 1>to handle complex transcoding. And importantly, there is no gatekeeper required.

208
00:10:34.480 --> 00:10:37.360
<v Speaker 2>Because there is no IP WAN traffic to manage, no

209
00:10:37.480 --> 00:10:41.279
<v Speaker 2>zones to route between, and no bandwidth constraints to strictly

210
00:10:41.320 --> 00:10:44.559
<v Speaker 2>police with call admission control. The local LAN runs at

211
00:10:44.600 --> 00:10:48.759
<v Speaker 2>gigabit speeds, so uncompressed G.711 is perfectly fine.

212
00:10:48.960 --> 00:10:51.679
<v Speaker 1>Makes sense. Then we scale up to the multi site

213
00:10:51.720 --> 00:10:55.440
<v Speaker 1>centralized model. This is where a company places one massive

214
00:10:55.519 --> 00:10:59.080
<v Speaker 1>call manager cluster at their headquarters and it serves dozens

215
00:10:59.080 --> 00:11:01.600
<v Speaker 1>of remote branch offices over the IP WAN.

216
00:11:01.799 --> 00:11:02.960
<v Speaker 2>A very common setup.

217
00:11:03.080 --> 00:11:06.480
<v Speaker 1>The branch offices don't have local call processing brains. Their

218
00:11:06.559 --> 00:11:09.840
<v Speaker 1>IP phones register directly with the headquarters over the Internet.

219
00:11:10.279 --> 00:11:13.639
<v Speaker 1>But this introduces a massive single point of failure. If

220
00:11:13.639 --> 00:11:16.159
<v Speaker 1>the WAN link to a branch office goes down, those

221
00:11:16.200 --> 00:11:19.240
<v Speaker 1>remote IP phones lose their connection to the call manager.

222
00:11:19.399 --> 00:11:21.360
<v Speaker 1>They just become plastic bricks on a desk.

223
00:11:21.639 --> 00:11:24.759
<v Speaker 2>Historically, yes, a WAN outage in the centralized model meant

224
00:11:24.799 --> 00:11:28.879
<v Speaker 2>total telecommunications blackout for the branch, but the Cisco architecture

225
00:11:28.960 --> 00:11:32.399
<v Speaker 2>mitigates this with SRST, Survivable Remote Site Telephony.

226
00:11:32.679 --> 00:11:35.360
<v Speaker 1>I love the mechanics of this feature. When the remote

227
00:11:35.399 --> 00:11:38.360
<v Speaker 1>IP phones lose their keep alive heartbeat with the central

228
00:11:38.360 --> 00:11:42.440
<v Speaker 1>call manager, SRST triggers. The local Cisco router sitting in

229
00:11:42.440 --> 00:11:46.200
<v Speaker 1>the branch office's network closet detects the failure and instantly

230
00:11:46.240 --> 00:11:47.039
<v Speaker 1>shifts its state.

231
00:11:47.240 --> 00:11:50.320
<v Speaker 2>It temporarily transforms into a lightweight call processor.

232
00:11:50.480 --> 00:11:53.879
<v Speaker 1>Right. The local IP phones re-register to this local

233
00:11:53.919 --> 00:11:57.159
<v Speaker 1>router and it routes all of their outbound calls directly

234
00:11:57.279 --> 00:12:01.960
<v Speaker 1>through the branch's local analog PSTN line. The Internet is down,

235
00:12:02.399 --> 00:12:05.440
<v Speaker 1>but the phones seamlessly transition to the copper backups.

236
00:12:05.679 --> 00:12:09.519
<v Speaker 2>It provides robust business continuity without the massive capital expenditure

237
00:12:09.559 --> 00:12:12.720
<v Speaker 2>of buying a dedicated call manager server for every ten

238
00:12:12.799 --> 00:12:14.480
<v Speaker 2>person branch office.

239
00:12:14.080 --> 00:12:17.039
<v Speaker 1>Which brings us to the third blueprint, the multi site

240
00:12:17.039 --> 00:12:21.519
<v Speaker 1>distributed model. This is Span Engineering's master plan. Every major

241
00:12:21.600 --> 00:12:25.600
<v Speaker 1>site, Chicago, Dallas, San Francisco, gets its own dedicated call

242
00:12:25.679 --> 00:12:26.759
<v Speaker 1>manager cluster.

243
00:12:26.559 --> 00:12:28.480
<v Speaker 2>Right independent brains at every site.

244
00:12:28.559 --> 00:12:31.159
<v Speaker 1>The IP WAN is used to carry the actual voice

245
00:12:31.200 --> 00:12:34.399
<v Speaker 1>bearer traffic between the cities, but the WAN does not

246
00:12:34.600 --> 00:12:40.039
<v Speaker 1>carry call control signaling for intrasite calls. Chicago handles Chicago's signaling,

247
00:12:40.559 --> 00:12:45.120
<v Speaker 1>Dallas handles Dallas's signaling. Gatekeepers are strictly required here to

248
00:12:45.159 --> 00:12:49.080
<v Speaker 1>maintain the hub-and-spoke routing logic between the independent clusters.

249
00:12:49.080 --> 00:12:50.919
<v Speaker 2>Because it isolates the fault domains.

250
00:12:51.200 --> 00:12:53.120
<v Speaker 1>So what does this all mean? I really have to

251
00:12:53.200 --> 00:12:56.519
<v Speaker 1>push back on the ROI of this distributed model. If

252
00:12:56.559 --> 00:12:59.960
<v Speaker 1>every site is completely independent, buying and maintaining its own

253
00:13:00.000 --> 00:13:03.639
<v Speaker 1>heavy call processing servers, why go through the immense

254
00:13:03.679 --> 00:13:06.879
<v Speaker 1>configuration hassle of connecting them over the IP WAN at all?

255
00:13:07.159 --> 00:13:08.039
<v Speaker 2>That's a fair question.

256
00:13:08.240 --> 00:13:11.080
<v Speaker 1>I mean, if the WAN isn't centralizing the signaling to

257
00:13:11.120 --> 00:13:13.159
<v Speaker 1>save on hardware, aren't we just building a bunch of

258
00:13:13.200 --> 00:13:15.519
<v Speaker 1>expensive, isolated single sites?

259
00:13:15.399 --> 00:13:18.759
<v Speaker 2>Well, you are building isolated control planes, but a unified data plane.

260
00:13:19.080 --> 00:13:21.600
<v Speaker 2>The primary driver here is the pure economics of toll

261
00:13:21.600 --> 00:13:24.960
<v Speaker 2>bypass combined with bulletproof quality of service. Okay, the economics.

262
00:13:25.120 --> 00:13:28.360
<v Speaker 2>Right. Span Engineering is paying for a massive data pipe

263
00:13:28.440 --> 00:13:32.039
<v Speaker 2>between Chicago and Dallas anyway. By routing the interoffice

264
00:13:32.120 --> 00:13:35.840
<v Speaker 2>RTP streams over that WAN, they eliminate hundreds of thousands

265
00:13:35.879 --> 00:13:38.360
<v Speaker 2>of dollars in recurring PSTN charges.

266
00:13:38.399 --> 00:13:40.600
<v Speaker 1>Oh, I see. But they avoid the fragility of the

267
00:13:40.639 --> 00:13:42.120
<v Speaker 1>centralized model.

268
00:13:42.600 --> 00:13:45.639
<v Speaker 2>Precisely. If a backhoe severs the fiber optic line connecting Chicago

269
00:13:45.639 --> 00:13:49.159
<v Speaker 2>and Dallas, the WAN drops. In a centralized model, Dallas

270
00:13:49.240 --> 00:13:53.440
<v Speaker 2>might go completely offline. In the distributed model, Dallas employees

271
00:13:53.440 --> 00:13:57.039
<v Speaker 2>can still call other Dallas employees without interruption because their

272
00:13:57.120 --> 00:14:01.519
<v Speaker 2>local call manager is handling the intrasite signaling. Makes total sense. Furthermore,

273
00:14:01.519 --> 00:14:04.840
<v Speaker 2>the text heavily emphasizes that for this distributed model, you

274
00:14:04.919 --> 00:14:07.200
<v Speaker 2>must use the G.729 codec for

275
00:14:07.240 --> 00:14:08.240
<v Speaker 2>the WAN links.

276
00:14:08.000 --> 00:14:10.279
<v Speaker 1>Because G.729 is highly compressed.

277
00:14:10.360 --> 00:14:13.360
<v Speaker 2>Exactly, it compresses the voice payload down to eight kilobits

278
00:14:13.360 --> 00:14:16.279
<v Speaker 2>per second. Now it requires DSPs on the gateways to

279
00:14:16.320 --> 00:14:19.000
<v Speaker 2>handle the intensive math of transcoding the audio from the

280
00:14:19.000 --> 00:14:21.159
<v Speaker 2>local G.711 network to the

281
00:14:21.279 --> 00:14:24.240
<v Speaker 2>G.729 WAN link. But it saves massive

282
00:14:24.240 --> 00:14:26.960
<v Speaker 2>amounts of bandwidth on those expensive long haul connections.

283
00:14:27.200 --> 00:14:30.399
<v Speaker 1>Okay, the architecture makes logical sense. We have the physical hardware,

284
00:14:30.639 --> 00:14:32.799
<v Speaker 1>we know where it lives, and we know how it scales.

285
00:14:33.120 --> 00:14:36.120
<v Speaker 1>But the deeper layer of the source material explores the actual rule books.

286
00:14:36.080 --> 00:14:37.919
<v Speaker 2>The protocols.

287
00:14:38.039 --> 00:14:42.039
<v Speaker 1>Right. What are the specific signaling languages these devices use

288
00:14:42.080 --> 00:14:45.759
<v Speaker 1>to orchestrate all of this? The text contrasts two dominant

289
00:14:45.759 --> 00:14:49.360
<v Speaker 1>protocols: H.323 and MGCP.

290
00:14:49.159 --> 00:14:53.200
<v Speaker 2>And choosing between them requires a fundamental philosophical decision about

291
00:14:53.200 --> 00:14:55.360
<v Speaker 2>where intelligence should reside in your network.

292
00:14:55.679 --> 00:14:57.559
<v Speaker 1>Let's start with H.323. I

293
00:14:57.720 --> 00:15:00.679
<v Speaker 1>look at this protocol as the independent contractor. It is

294
00:15:00.720 --> 00:15:03.879
<v Speaker 1>a massive umbrella standard originally developed by the ITU for

295
00:15:04.039 --> 00:15:08.559
<v Speaker 1>multimedia, handling voice, video, and data over unreliable networks. It

296
00:15:08.639 --> 00:15:12.240
<v Speaker 1>is heavily based on the legacy ISDN Q.931

297
00:15:12.360 --> 00:15:13.320
<v Speaker 1>standard.

298
00:15:13.519 --> 00:15:14.799
<v Speaker 2>Very robust, very complex.

299
00:15:14.919 --> 00:15:17.480
<v Speaker 1>Yeah, the mechanics rely on a suite of sub protocols.

300
00:15:17.679 --> 00:15:19.639
<v Speaker 1>It uses H.225 for the initial call

301
00:15:19.720 --> 00:15:22.559
<v Speaker 1>setup and routing, actually ringing the phone on the other side,

302
00:15:22.720 --> 00:15:25.200
<v Speaker 1>but then it uses H.245 to negotiate the media capabilities.

303
00:15:25.159 --> 00:15:26.600
<v Speaker 2>Right, the handshake.

304
00:15:26.679 --> 00:15:28.440
<v Speaker 1>Yeah. H.245 is the handshake where the two

305
00:15:28.559 --> 00:15:31.279
<v Speaker 1>endpoints say, I support G.711

306
00:15:31.440 --> 00:15:34.000
<v Speaker 1>and G.729, what do you support, before

307
00:15:34.039 --> 00:15:35.960
<v Speaker 1>they actually open the logical audio channel.

308
00:15:36.200 --> 00:15:38.399
<v Speaker 2>And because H.323 does all

309
00:15:38.399 --> 00:15:42.000
<v Speaker 2>of this complex negotiation directly on the edge gateway, it

310
00:15:42.200 --> 00:15:46.000
<v Speaker 2>historically suffered from latency. The back and forth round trips

311
00:15:46.000 --> 00:15:48.600
<v Speaker 2>of establishing H.225 and then negotiating

312
00:15:48.759 --> 00:15:52.279
<v Speaker 2>H.245 took time. It could lead to

313
00:15:52.320 --> 00:15:54.960
<v Speaker 2>the first syllable of a conversation getting clipped.

314
00:15:54.639 --> 00:15:56.960
<v Speaker 1>Which is super annoying, and that is why the protocol

315
00:15:57.000 --> 00:15:59.840
<v Speaker 1>evolved to include Fast Connect, compressing those setup and

316
00:15:59.840 --> 00:16:02.879
<v Speaker 1>negotiation messages into a single exchange to open the

317
00:16:02.879 --> 00:16:06.919
<v Speaker 1>media channel immediately. But still the gateway is doing heavy lifting.

318
00:16:07.120 --> 00:16:10.200
<v Speaker 1>It is. Now contrast that with MGCP, the Media Gateway

319
00:16:10.240 --> 00:16:13.159
<v Speaker 1>Control Protocol. Here's where it gets really interesting for me.

320
00:16:13.559 --> 00:16:16.200
<v Speaker 1>If H.323 is the independent contractor managing

321
00:16:16.240 --> 00:16:20.200
<v Speaker 1>its own tool set, MGCP is like a remote control drone.

322
00:16:20.360 --> 00:16:24.200
<v Speaker 2>MGCP strips the intelligence entirely out of the edge device. Completely.

323
00:16:24.240 --> 00:16:27.559
<v Speaker 1>It takes all that complex routing, signal processing and capability

324
00:16:27.679 --> 00:16:30.480
<v Speaker 1>negotiation and moves it to a central call agent, in

325
00:16:30.519 --> 00:16:33.720
<v Speaker 1>this case the Cisco Call Manager. The gateway literally becomes

326
00:16:33.720 --> 00:16:36.360
<v Speaker 1>a slave device. It just sits there listening on UDP

327
00:16:36.480 --> 00:16:39.480
<v Speaker 1>port 2427, waiting for plaintext commands.

328
00:16:39.639 --> 00:16:44.440
<v Speaker 2>It utilizes a strict master-slave architecture built around endpoints,

329
00:16:44.919 --> 00:16:48.480
<v Speaker 2>which can be physical analog ports or virtual digital interfaces

330
00:16:48.480 --> 00:16:52.039
<v Speaker 2>and connections, which are the actual media sessions between endpoints.

331
00:16:52.200 --> 00:16:55.039
<v Speaker 1>And we should note, while MGCP uses UDP for speed,

332
00:16:55.279 --> 00:16:57.360
<v Speaker 1>because in real time voice, you know you don't have

333
00:16:57.440 --> 00:17:00.720
<v Speaker 1>time for TCP's three-way handshakes or packet retransmissions,

334
00:17:01.039 --> 00:17:03.120
<v Speaker 1>it relies on the call agent to manage the state

335
00:17:03.159 --> 00:17:03.720
<v Speaker 1>of the network.

336
00:17:03.840 --> 00:17:08.480
<v Speaker 2>Right, and the source briefly mentions SIP, the Session Initiation Protocol.

337
00:17:08.920 --> 00:17:12.279
<v Speaker 2>But the structural debate here is really between the distributed

338
00:17:12.319 --> 00:17:15.640
<v Speaker 2>intelligence of H.323 and the centralized control

339
00:17:15.759 --> 00:17:16.599
<v Speaker 2>of MGCP.

340
00:17:16.920 --> 00:17:20.079
<v Speaker 1>But why would a network architect intentionally choose to deploy

341
00:17:20.240 --> 00:17:24.319
<v Speaker 1>dumb drones. If you have hardware capable of independent routing,

342
00:17:24.720 --> 00:17:27.480
<v Speaker 1>why strip its autonomy and force all the processing onto

343
00:17:27.519 --> 00:17:28.279
<v Speaker 1>the call manager.

344
00:17:28.559 --> 00:17:32.799
<v Speaker 2>This raises an important question about operational overhead. Centralized control

345
00:17:32.960 --> 00:17:36.799
<v Speaker 2>drastically simplifies dial plan management. If you manage a distributed

346
00:17:36.880 --> 00:17:39.599
<v Speaker 2>network of fifty H.323 gateways and you

347
00:17:39.640 --> 00:17:42.440
<v Speaker 2>need to implement a new company wide routing rule or

348
00:17:42.599 --> 00:17:44.079
<v Speaker 2>change a dial prefix,

349
00:17:43.720 --> 00:17:46.519
<v Speaker 1>you have to log into and configure fifty separate independent

350
00:17:46.519 --> 00:17:47.960
<v Speaker 1>devices. Exactly.

351
00:17:48.000 --> 00:17:51.200
<v Speaker 2>It's a nightmare. With MGCP, you update the dial plan

352
00:17:51.319 --> 00:17:54.160
<v Speaker 2>exactly once in the central call manager. The call manager

353
00:17:54.160 --> 00:17:57.680
<v Speaker 2>simply pushes the execution commands down to the dumb gateways.

354
00:17:58.000 --> 00:18:00.480
<v Speaker 2>It severely lowers the barrier for administration.

355
00:18:00.480 --> 00:18:03.920
<v Speaker 1>But the source text is clear. MGCP isn't a silver bullet.

356
00:18:04.119 --> 00:18:06.599
<v Speaker 1>You still need the intelligence of H.323

357
00:18:06.640 --> 00:18:11.000
<v Speaker 1>for specific complex integrations. For example, if your gateway needs

358
00:18:11.000 --> 00:18:14.559
<v Speaker 1>to interface directly with Signaling System 7, SS7, the core

359
00:18:14.599 --> 00:18:18.119
<v Speaker 1>protocol of the global public telephone network, a dumb MGCP

360
00:18:18.240 --> 00:18:20.079
<v Speaker 1>gateway cannot handle that translation.

361
00:18:20.200 --> 00:18:23.160
<v Speaker 2>No, it can't. You need the robust independent state machine

362
00:18:23.160 --> 00:18:26.640
<v Speaker 2>of H.323 sitting at that edge. Architecture

363
00:18:26.680 --> 00:18:29.359
<v Speaker 2>is always a compromise between centralized ease of management and

364
00:18:29.400 --> 00:18:32.759
<v Speaker 2>distributed resilience. You apply the protocol that fits the specific

365
00:18:32.759 --> 00:18:33.839
<v Speaker 2>demarcation point.

366
00:18:33.880 --> 00:18:36.680
<v Speaker 1>Now there is one final specialized piece of hardware the

367
00:18:36.680 --> 00:18:39.480
<v Speaker 1>text pivots to at the very end, the IP to

368
00:18:39.559 --> 00:18:42.720
<v Speaker 1>IP gateway. Everything we have discussed so far is about

369
00:18:42.759 --> 00:18:45.759
<v Speaker 1>bridging the new IP world to the old legacy PBX

370
00:18:45.880 --> 00:18:47.640
<v Speaker 1>or analog PSTN.

371
00:18:47.039 --> 00:18:49.200
<v Speaker 2>world. Right, analog to digital. But an

372
00:18:49.200 --> 00:18:52.920
<v Speaker 1>IP to IP gateway explicitly joins two digital IP call

373
00:18:53.000 --> 00:18:53.599
<v Speaker 1>legs together.

374
00:18:53.880 --> 00:18:57.680
<v Speaker 2>It acts as a purely digital demarcation point, for instance,

375
00:18:57.799 --> 00:19:01.599
<v Speaker 2>connecting your internal Cisco IP network directly to an Internet

376
00:19:01.720 --> 00:19:04.920
<v Speaker 2>telephony service provider via a SIP trunk.

377
00:19:05.039 --> 00:19:08.319
<v Speaker 1>It strictly routes packets. It supports features like T.38

378
00:19:08.440 --> 00:19:11.480
<v Speaker 1>fax relay over IP, but the text

379
00:19:11.480 --> 00:19:15.480
<v Speaker 1>gives us a stark hardware warning: do not install physical voice

380
00:19:15.480 --> 00:19:19.200
<v Speaker 1>network modules, analog FXS or FXO ports, into a router

381
00:19:19.400 --> 00:19:22.839
<v Speaker 1>operating as an IP to IP gateway. Its entire purpose

382
00:19:22.920 --> 00:19:24.759
<v Speaker 1>is to maintain a purely digital chain.

383
00:19:25.039 --> 00:19:27.839
<v Speaker 2>It never touches a copper analog line. It is the

384
00:19:27.839 --> 00:19:31.160
<v Speaker 2>evolution of the gateway concept, moving away from legacy translation

385
00:19:31.440 --> 00:19:33.359
<v Speaker 2>and toward purely digital border control.

386
00:19:33.480 --> 00:19:35.960
<v Speaker 1>So to bring all of these architectural blueprints together for

387
00:19:36.000 --> 00:19:39.440
<v Speaker 1>you listening, we've unpacked a massive hidden infrastructure today. We

388
00:19:39.519 --> 00:19:43.160
<v Speaker 1>started with voice gateways acting as active bilingual diplomats, preserving

389
00:19:43.160 --> 00:19:46.440
<v Speaker 1>your RTP bearer streams and intercepting DTMF dial tones so

390
00:19:46.480 --> 00:19:47.480
<v Speaker 1>they survive compression.

391
00:19:47.599 --> 00:19:50.839
<v Speaker 2>We scaled up to gatekeepers, the air traffic controllers, managing

392
00:19:50.920 --> 00:19:54.640
<v Speaker 2>E.164 address translation and enforcing call

393
00:19:54.680 --> 00:19:56.240
<v Speaker 2>admission control bandwidth limits.

394
00:19:56.359 --> 00:20:00.839
<v Speaker 1>We explored how distributed deployment models isolate fault domains, utilizing

395
00:20:01.039 --> 00:20:04.920
<v Speaker 1>SRST to magically fail over to local copper lines when the

396
00:20:04.960 --> 00:20:08.920
<v Speaker 1>WAN drops, and we decoded the heavy rule books, contrasting

397
00:20:08.920 --> 00:20:11.960
<v Speaker 1>the independent processing of H.323 with the

398
00:20:12.000 --> 00:20:14.640
<v Speaker 1>centralized, drone-like obedience of MGCP.

399
00:20:14.839 --> 00:20:17.839
<v Speaker 2>It is a phenomenal amount of logic executing in milliseconds

400
00:20:17.920 --> 00:20:20.200
<v Speaker 2>just to give us the illusion of a simple phone call.

401
00:20:20.319 --> 00:20:22.039
<v Speaker 1>The next time you press a button on a conference

402
00:20:22.039 --> 00:20:25.240
<v Speaker 1>call menu, or you hear the audio perfectly clear on

403
00:20:25.279 --> 00:20:27.599
<v Speaker 1>a call halfway across the globe, you will know exactly

404
00:20:27.640 --> 00:20:31.279
<v Speaker 1>why it isn't magic. It is DSP transcoding, out of

405
00:20:31.319 --> 00:20:35.279
<v Speaker 1>band signaling and a carefully orchestrated dance of network protocols.

406
00:20:35.480 --> 00:20:37.799
<v Speaker 2>But as we look at that final concept, the IP to

407
00:20:37.880 --> 00:20:41.160
<v Speaker 2>IP gateway, it forces us to consider the implications of

408
00:20:41.160 --> 00:20:44.119
<v Speaker 2>where this architecture is heading. Historically, the voice gateway was

409
00:20:44.160 --> 00:20:47.599
<v Speaker 2>a physical bridge to the PSTN. That physical copper connection

410
00:20:47.720 --> 00:20:51.119
<v Speaker 2>acted as a natural air gap, isolating enterprise voice networks

411
00:20:51.119 --> 00:20:54.079
<v Speaker 2>from the broader Internet. But as we transition entirely to

412
00:20:54.400 --> 00:20:58.480
<v Speaker 2>IP to IP gateways and SIP trunking, bridging digital networks

413
00:20:58.480 --> 00:21:01.960
<v Speaker 2>directly to digital service providers, we are removing that physical

414
00:21:01.960 --> 00:21:05.359
<v Speaker 2>air gap. It opens up an entirely new frontier. If

415
00:21:05.359 --> 00:21:08.400
<v Speaker 2>our core voice networks are now just another IP data stream,

416
00:21:08.680 --> 00:21:12.079
<v Speaker 2>are they suddenly vulnerable to the same volumetric DDoS attacks

417
00:21:12.279 --> 00:21:15.880
<v Speaker 2>and packet level exploits that target standard web servers. Securing

418
00:21:15.880 --> 00:21:18.480
<v Speaker 2>the perimeter of a purely digital global nervous system might

419
00:21:18.480 --> 00:21:20.839
<v Speaker 2>be a much harder problem than translating a dial tone.

420
00:21:20.880 --> 00:21:22.559
<v Speaker 1>Now, that is a deep dive for another day.
