WEBVTT

1
00:00:00.160 --> 00:00:03.680
<v Speaker 1>Welcome to the Deep Dive. Today we're diving into, well,

2
00:00:03.720 --> 00:00:07.240
<v Speaker 1>a really fundamental challenge in computing these days, handling that

3
00:00:07.519 --> 00:00:10.080
<v Speaker 1>absolute flood of data, you know, the high speed stuff.

4
00:00:10.160 --> 00:00:14.119
<v Speaker 2>Yeah, it's relentless. Think social media, IoT sensors everywhere,

5
00:00:14.199 --> 00:00:17.280
<v Speaker 2>web clicks, it just keeps coming.

6
00:00:17.760 --> 00:00:19.839
<v Speaker 1>And it's not just the sheer volume, right? It's how

7
00:00:19.879 --> 00:00:22.480
<v Speaker 1>bursty it is. You get these sudden peaks.

8
00:00:22.519 --> 00:00:26.000
<v Speaker 2>Exactly. That burstiness is the real killer. Our sources show

9
00:00:26.039 --> 00:00:29.399
<v Speaker 2>that building systems that don't just fall over when you

10
00:00:29.440 --> 00:00:32.039
<v Speaker 2>get a sudden rush, like say everyone trying to check

11
00:00:32.079 --> 00:00:34.280
<v Speaker 2>out a bike in a smart city right at five pm,

12
00:00:34.600 --> 00:00:37.799
<v Speaker 2>that needs special thinking. Traditional approaches just won't cut it.

13
00:00:37.600 --> 00:00:40.119
<v Speaker 1>Right. So our mission today is basically to

14
00:00:40.119 --> 00:00:43.159
<v Speaker 1>give you a solid shortcut to understanding these kinds of architectures.

15
00:00:43.640 --> 00:00:47.079
<v Speaker 1>We'll use the Amazon Kinesis family, you know, Data Streams,

16
00:00:47.119 --> 00:00:49.799
<v Speaker 1>Firehose, Data Analytics, Video Streams, kind of as our

17
00:00:49.840 --> 00:00:50.359
<v Speaker 1>case study.

18
00:00:50.560 --> 00:00:52.679
<v Speaker 2>Yeah. The goal is by the end you'll really get

19
00:00:52.679 --> 00:00:55.640
<v Speaker 2>the difference between these tools, and more importantly, the core

20
00:00:55.719 --> 00:00:59.920
<v Speaker 2>ideas behind making these systems scalable and, crucially, fast.

21
00:01:00.039 --> 00:01:02.280
<v Speaker 1>Okay, let's kick off with the big why. Why is

22
00:01:02.399 --> 00:01:05.400
<v Speaker 1>speed so important in data analysis now? Why the big rush?

23
00:01:05.560 --> 00:01:09.000
<v Speaker 2>Well, it's become a strategic must-have, really. We've moved

24
00:01:09.000 --> 00:01:12.319
<v Speaker 2>way beyond just looking backwards. Yeah, like those old batch

25
00:01:12.400 --> 00:01:14.599
<v Speaker 2>jobs running overnight giving you a report on yesterday.

26
00:01:14.239 --> 00:01:17.120
<v Speaker 1>Right, a snapshot of the past.

27
00:01:17.640 --> 00:01:21.599
<v Speaker 2>Exactly. Now it's all about immediate insights, the freshest possible data.

28
00:01:21.840 --> 00:01:25.760
<v Speaker 2>Lets you make the best decisions right now. Perishable insights, basically.

29
00:01:25.760 --> 00:01:31.319
<v Speaker 1>I've heard about this thing, the OODA loop: observe, orient, decide, act.

30
00:01:31.760 --> 00:01:34.719
<v Speaker 1>It comes from military strategy, I think. How does real

31
00:01:34.760 --> 00:01:35.959
<v Speaker 1>time data fit in there?

32
00:01:36.079 --> 00:01:40.359
<v Speaker 2>It fits perfectly. Real time analytics fundamentally shrinks that observe window.

33
00:01:40.719 --> 00:01:42.640
<v Speaker 2>Something happens, you know about it instantly.

34
00:01:42.400 --> 00:01:45.280
<v Speaker 1>Which means you can orient, decide, and act

35
00:01:45.359 --> 00:01:45.920
<v Speaker 1>much faster.

36
00:01:46.079 --> 00:01:49.719
<v Speaker 2>Precisely. It gives you a huge edge. Think about fraud detection

37
00:01:50.000 --> 00:01:53.760
<v Speaker 2>or optimizing ad placements on the fly, or in that

38
00:01:53.799 --> 00:01:58.200
<v Speaker 2>smart city example, rerouting maintenance crews instantly based on real

39
00:01:58.280 --> 00:02:00.000
<v Speaker 2>time bike usage patterns.

40
00:02:00.040 --> 00:02:03.000
<v Speaker 1>Okay, so that's the why. Now the how? A single

41
00:02:03.000 --> 00:02:05.840
<v Speaker 1>server obviously can't cope, so we use distributed systems. Lots

42
00:02:05.840 --> 00:02:08.280
<v Speaker 1>of servers networked together. But doesn't that just explode the

43
00:02:08.280 --> 00:02:10.599
<v Speaker 1>complexity? Failure modes everywhere.

44
00:02:10.759 --> 00:02:14.400
<v Speaker 2>Oh, absolutely. It gets complex fast, but engineers figured

45
00:02:14.400 --> 00:02:17.919
<v Speaker 2>out ways to manage it. Two main patterns really help

46
00:02:18.039 --> 00:02:20.439
<v Speaker 2>tame that initial complexity beast.

47
00:02:19.919 --> 00:02:21.479
<v Speaker 1>Okay, what are they?

48
00:02:21.520 --> 00:02:26.599
<v Speaker 2>First, standardized interfaces, things like APIs. This really paved the

49
00:02:26.639 --> 00:02:28.479
<v Speaker 2>way for microservice architectures.

50
00:02:28.719 --> 00:02:32.199
<v Speaker 1>Ah, so breaking the big system down into smaller independent

51
00:02:32.240 --> 00:02:33.639
<v Speaker 1>services.

52
00:02:33.840 --> 00:02:36.400
<v Speaker 2>Exactly. It lets different teams own their piece and, you know,

53
00:02:36.639 --> 00:02:39.360
<v Speaker 2>iterate faster without stepping on each other's toes.

54
00:02:39.560 --> 00:02:42.439
<v Speaker 1>It sounds a bit like Conway's law in action, the

55
00:02:42.479 --> 00:02:44.919
<v Speaker 1>system design mirroring the org structure.

56
00:02:45.240 --> 00:02:47.879
<v Speaker 2>That's a great way to put it. Yeah, small independent

57
00:02:47.919 --> 00:02:52.159
<v Speaker 2>teams tend to build small independent services. Those standardized interfaces

58
00:02:52.159 --> 00:02:53.319
<v Speaker 2>are what make it work smoothly.

59
00:02:53.439 --> 00:02:57.439
<v Speaker 1>Okay, interfaces help services talk. But what about failures? If

60
00:02:57.439 --> 00:03:00.639
<v Speaker 1>one microservice crashes while it's processing data, how do

61
00:03:00.680 --> 00:03:03.240
<v Speaker 1>you stop it from dragging down everything else that depends

62
00:03:03.280 --> 00:03:03.560
<v Speaker 1>on it?

63
00:03:03.639 --> 00:03:06.400
<v Speaker 2>Right, that's the cascading failure problem. And that's where the

64
00:03:06.439 --> 00:03:10.759
<v Speaker 2>second pattern comes in. Decoupling. You use an asynchronous message broker.

65
00:03:10.879 --> 00:03:14.479
<v Speaker 1>A message broker? Like a middleman for data?

66
00:03:14.599 --> 00:03:17.719
<v Speaker 2>Sort of, yeah. It acts as the stable buffer, an invariant in

67
00:03:17.759 --> 00:03:21.560
<v Speaker 2>the system. If a downstream service, a consumer, fails, the

68
00:03:21.639 --> 00:03:25.120
<v Speaker 2>broker just holds onto the incoming messages. It builds up

69
00:03:25.120 --> 00:03:26.240
<v Speaker 2>a backlog.

70
00:03:25.840 --> 00:03:29.039
<v Speaker 1>So the producer can keep sending data unaware of the

71
00:03:29.080 --> 00:03:30.120
<v Speaker 1>downstream problem.

72
00:03:30.280 --> 00:03:33.479
<v Speaker 2>Exactly, the failure is contained. Once the consumer service comes

73
00:03:33.479 --> 00:03:36.800
<v Speaker 2>back online, it just starts processing the backlog from the broker.

74
00:03:37.080 --> 00:03:39.560
<v Speaker 2>No data lost, no cascading crash.

75
00:03:39.680 --> 00:03:43.159
<v Speaker 1>That makes sense. The broker isolates the fault. But if

76
00:03:43.199 --> 00:03:46.199
<v Speaker 1>that backlog keeps growing, that sounds like trouble brewing. What

77
00:03:46.319 --> 00:03:47.360
<v Speaker 1>metrics do we watch?

78
00:03:47.560 --> 00:03:51.439
<v Speaker 2>The key capacity metric is transactions per second, TPS. That's

79
00:03:51.520 --> 00:03:54.000
<v Speaker 2>usually limited by either the number of records per second

80
00:03:54.159 --> 00:03:56.560
<v Speaker 2>or the total data size like megabytes per second.

81
00:03:56.599 --> 00:03:59.960
<v Speaker 1>Okay, TPS. But what's the danger signal?

82
00:04:00.080 --> 00:04:02.919
<v Speaker 2>The real danger signal is back pressure. That's when your producers are

83
00:04:02.960 --> 00:04:06.000
<v Speaker 2>sending data faster than your consumers can process it. Input

84
00:04:06.000 --> 00:04:08.520
<v Speaker 2>TPS is consistently higher than output TPS.
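
NOTE
Editor's sketch, not part of the recording: a rough illustration of the back pressure condition described above, where input TPS stays above output TPS and the backlog keeps growing.
The function names and the sample rates (1200 in, 1000 out) are made up for illustration and are not taken from any Kinesis API.
```python
def backlog_over_time(input_tps, output_tps, seconds, start_backlog=0):
    """Return the backlog size after each second for constant rates."""
    backlog = start_backlog
    history = []
    for _ in range(seconds):
        # Backlog grows by the surplus of writes over reads and never drops below zero.
        backlog = max(0, backlog + input_tps - output_tps)
        history.append(backlog)
    return history
def is_back_pressure(history):
    """Flag back pressure when the backlog rises at every step."""
    return all(later > earlier for earlier, later in zip(history, history[1:]))
rush_hour = backlog_over_time(input_tps=1200, output_tps=1000, seconds=5)
quiet = backlog_over_time(input_tps=800, output_tps=1000, seconds=5)
print(rush_hour, is_back_pressure(rush_hour))
print(quiet, is_back_pressure(quiet))
```
In a real system the same check would run on broker metrics rather than simulated rates.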

85
00:04:08.759 --> 00:04:11.080
<v Speaker 1>So back to the bike sharing. Rush hour hits, every

86
00:04:11.159 --> 00:04:13.879
<v Speaker 1>bike starts pinging its location like crazy. How do you

87
00:04:13.919 --> 00:04:17.759
<v Speaker 1>handle that surge, that back pressure without the system just drowning.

88
00:04:18.040 --> 00:04:21.120
<v Speaker 2>You've got a few strategies ranging from let's say gentle

89
00:04:21.560 --> 00:04:25.120
<v Speaker 2>to pretty aggressive. Okay, you can throttle the producers, tell

90
00:04:25.160 --> 00:04:28.560
<v Speaker 2>the bikes, hey, maybe only report your location every minute

91
00:04:28.600 --> 00:04:30.439
<v Speaker 2>instead of every second during peak times.

92
00:04:30.639 --> 00:04:32.600
<v Speaker 1>Reduce the input rate. Makes sense.

93
00:04:32.439 --> 00:04:35.000
<v Speaker 2>Or you can scale the consumers. If you're in

94
00:04:35.040 --> 00:04:38.480
<v Speaker 2>the cloud, you might automatically spin up more instances of

95
00:04:38.519 --> 00:04:40.680
<v Speaker 2>your processing application to handle the load.

96
00:04:40.800 --> 00:04:42.000
<v Speaker 1>Add more workers, right.

97
00:04:42.079 --> 00:04:44.279
<v Speaker 2>You can also use bigger buffers in the message broker

98
00:04:44.279 --> 00:04:48.800
<v Speaker 2>itself to just absorb temporary spikes. But the most drastic

99
00:04:48.839 --> 00:04:52.000
<v Speaker 2>option is to just drop messages. Drop data.

100
00:04:52.279 --> 00:04:53.199
<v Speaker 1>That sounds risky.

101
00:04:53.360 --> 00:04:56.120
<v Speaker 2>It is. You'd only ever do that for non critical data.

102
00:04:56.160 --> 00:04:59.120
<v Speaker 2>Like maybe dropping some routine sensor readings is okay if

103
00:04:59.120 --> 00:05:02.120
<v Speaker 2>the system's overloaded, but you'd never drop something like a

104
00:05:02.279 --> 00:05:04.360
<v Speaker 2>customer order completion message. Never.

105
00:05:04.519 --> 00:05:08.160
<v Speaker 1>Okay, got it. Critical versus non-critical. Let's dig into

106
00:05:08.160 --> 00:05:11.560
<v Speaker 1>the stream itself. What are the absolute basic parts of

107
00:05:11.639 --> 00:05:13.800
<v Speaker 1>any streaming system? Four main components?

108
00:05:14.079 --> 00:05:16.000
<v Speaker 2>Yeah, you can break it down pretty simply. First you

109
00:05:16.000 --> 00:05:18.639
<v Speaker 2>have the producers. These are the apps sending the data,

110
00:05:19.120 --> 00:05:22.439
<v Speaker 2>our bike sensors, the user's mobile app checking out a bike,

111
00:05:22.480 --> 00:05:23.040
<v Speaker 2>that kind of thing.

112
00:05:23.079 --> 00:05:24.639
<v Speaker 1>Okay, producers send data.

113
00:05:24.680 --> 00:05:27.720
<v Speaker 2>Then you have the messages or records. That's the actual

114
00:05:27.800 --> 00:05:30.839
<v Speaker 2>data payload, usually small, maybe a few kilobytes,

115
00:05:31.000 --> 00:05:34.319
<v Speaker 2>often capped around a megabyte. Each record has the data

116
00:05:34.360 --> 00:05:37.399
<v Speaker 2>itself and a header, usually with a unique ID the

117
00:05:37.439 --> 00:05:38.920
<v Speaker 2>broker assigns.

118
00:05:38.399 --> 00:05:40.160
<v Speaker 1>Got it. Producers, messages.

119
00:05:40.319 --> 00:05:43.279
<v Speaker 2>Third is the stream or the broker. That's the component

120
00:05:43.360 --> 00:05:45.600
<v Speaker 2>doing the buffering, like Kinesis itself

121
00:05:45.360 --> 00:05:48.000
<v Speaker 3>or Kafka, the fault isolator we talked about. Right. And

122
00:05:48.040 --> 00:05:51.079
<v Speaker 3>finally you have the consumers, the applications that pull data

123
00:05:51.120 --> 00:05:55.959
<v Speaker 3>from the stream and actually do something with it: analysis, storage, triggering alerts, whatever.

124
00:05:55.519 --> 00:06:02.399
<v Speaker 1>Producers, messages, stream, consumers. Okay, now latency. You mentioned

125
00:06:02.399 --> 00:06:05.279
<v Speaker 1>speed is critical. The sources stress we need to

126
00:06:05.279 --> 00:06:08.199
<v Speaker 1>be really precise here. It's not just latency. There are two

127
00:06:08.240 --> 00:06:09.720
<v Speaker 1>specific measures.

128
00:06:09.600 --> 00:06:14.000
<v Speaker 2>Yes, absolutely critical distinction. There's propagation delay. That's simply the

129
00:06:14.079 --> 00:06:17.279
<v Speaker 2>time it takes from the moment a producer writes a

130
00:06:17.279 --> 00:06:21.920
<v Speaker 2>message to the moment a consumer reads it. Raw transmission speed, basically.

131
00:06:21.759 --> 00:06:24.600
<v Speaker 1>Okay, write-to-read time. What's the other one?

132
00:06:24.720 --> 00:06:28.480
<v Speaker 2>The other and often more telling metric is the age

133
00:06:28.480 --> 00:06:31.800
<v Speaker 2>of the message. This measures how long a message has

134
00:06:31.800 --> 00:06:34.720
<v Speaker 2>actually been sitting in the stream before a consumer picked

135
00:06:34.720 --> 00:06:35.000
<v Speaker 2>it up.

136
00:06:35.120 --> 00:06:38.079
<v Speaker 1>Ah, so propagation delay could be low, but the message

137
00:06:38.160 --> 00:06:41.040
<v Speaker 1>might still be old if the consumers are lagging.

138
00:06:41.079 --> 00:06:43.879
<v Speaker 2>Exactly. If that average message age starts creeping up, that's your

139
00:06:43.879 --> 00:06:46.680
<v Speaker 2>big warning sign. It means your consumers can't keep up.

140
00:06:46.839 --> 00:06:49.759
<v Speaker 2>The backlog is growing, and performance problems are right around

141
00:06:49.759 --> 00:06:52.319
<v Speaker 2>the corner, even if the network itself is fast.

142
00:06:52.759 --> 00:06:56.839
<v Speaker 1>That's a really important distinction. Now, failure modes. Distributed systems,

143
00:06:56.920 --> 00:07:00.319
<v Speaker 1>retries, network glitches. It means data streams often have

144
00:07:00.399 --> 00:07:03.560
<v Speaker 1>this at-least-once delivery guarantee, right? A message might

145
00:07:03.600 --> 00:07:04.519
<v Speaker 1>show up more than once.

146
00:07:04.800 --> 00:07:09.519
<v Speaker 2>That's the standard reality, yes. Because ensuring exactly-once delivery

147
00:07:09.560 --> 00:07:14.279
<v Speaker 2>across a distributed system is incredibly hard. Most brokers guarantee

148
00:07:14.279 --> 00:07:17.279
<v Speaker 2>at least once. The producer might retry sending if it

149
00:07:17.279 --> 00:07:20.680
<v Speaker 2>doesn't get confirmation. The consumer might process and then crash

150
00:07:20.759 --> 00:07:23.160
<v Speaker 2>before confirming. Duplicates happen.

151
00:07:23.319 --> 00:07:28.279
<v Speaker 1>And that sounds potentially disastrous if you're, say, processing payments or updating

152
00:07:28.279 --> 00:07:30.319
<v Speaker 1>inventory counts. Double counting.

153
00:07:30.040 --> 00:07:33.199
<v Speaker 2>It would be disastrous, which is why the responsibility for

154
00:07:33.240 --> 00:07:37.720
<v Speaker 2>handling duplicates shifts downstream to the consumer or the final

155
00:07:37.759 --> 00:07:42.199
<v Speaker 2>destination system. Those systems must be designed to be idempotent,

156
00:07:42.439 --> 00:07:46.120
<v Speaker 2>meaning processing the same message multiple times has the

157
00:07:46.160 --> 00:07:48.680
<v Speaker 2>exact same effect as processing it just once.

158
00:07:48.800 --> 00:07:50.680
<v Speaker 1>How do you achieve that?

159
00:07:50.800 --> 00:07:53.560
<v Speaker 2>Two main ways. Either the operation itself is naturally idempotent, like setting a

160
00:07:53.639 --> 00:07:55.560
<v Speaker 2>value, which is idempotent, while adding to a count is not,

161
00:07:56.360 --> 00:07:59.079
<v Speaker 2>or you use that unique ID, typically in the

162
00:07:59.079 --> 00:08:02.399
<v Speaker 2>message payload to explicitly check if you've already processed this

163
00:08:02.439 --> 00:08:04.839
<v Speaker 2>specific message before. Deduplication, basically.

164
00:08:04.959 --> 00:08:08.560
<v Speaker 1>Okay, so the consuming application needs to be smart

165
00:08:08.560 --> 00:08:12.399
<v Speaker 1>about duplicates. What about errors within a message? If a

166
00:08:12.439 --> 00:08:15.720
<v Speaker 1>consumer gets a record it just can't process.
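
NOTE
Editor's sketch, not part of the recording: one way a consumer could deduplicate by message ID so that at-least-once delivery does not double count, as discussed just before this point.
InventoryConsumer and the record fields (id, item, delta) are hypothetical. The seen-ID set lives in memory here, whereas a production consumer would likely need durable storage updated together with the stock change.
```python
class InventoryConsumer:
    """Hypothetical consumer that stays correct under at-least-once delivery."""
    def __init__(self):
        self.stock = {}
        self.seen_ids = set()
    def handle(self, record):
        # Skip any record whose unique ID has already been applied.
        if record["id"] in self.seen_ids:
            return False
        item = record["item"]
        self.stock[item] = self.stock.get(item, 0) + record["delta"]
        self.seen_ids.add(record["id"])
        return True
consumer = InventoryConsumer()
delivery = [
    {"id": "m1", "item": "bike", "delta": -1},
    {"id": "m1", "item": "bike", "delta": -1},
    {"id": "m2", "item": "bike", "delta": 3},
]
applied = [consumer.handle(r) for r in delivery]
print(applied, consumer.stock)
```
The second copy of m1 is ignored, so the count reflects each message once even though one arrived twice.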

167
00:08:15.600 --> 00:08:18.959
<v Speaker 2>That leads to a particularly nasty problem called the poison pill.

168
00:08:19.120 --> 00:08:21.800
<v Speaker 1>Poison pill? Sounds bad.

169
00:08:22.160 --> 00:08:25.879
<v Speaker 2>It is. Because message streams usually guarantee the order of messages, at

170
00:08:25.959 --> 00:08:29.040
<v Speaker 2>least within a certain context, like all messages for a

171
00:08:29.079 --> 00:08:32.960
<v Speaker 2>specific bike. You need to process the unlock event before

172
00:08:33.000 --> 00:08:34.559
<v Speaker 2>the return event.

173
00:08:34.759 --> 00:08:35.480
<v Speaker 1>Makes sense. Order matters.

174
00:08:35.559 --> 00:08:38.480
<v Speaker 2>But if one message in that sequence causes an error

175
00:08:38.519 --> 00:08:42.159
<v Speaker 2>in the consumer, maybe it's malformed bad data, and the

176
00:08:42.200 --> 00:08:46.200
<v Speaker 2>consumer keeps retrying and failing on that specific message, it

177
00:08:46.200 --> 00:08:49.600
<v Speaker 2>gets stuck. It gets completely stuck. That single poison pill

178
00:08:49.639 --> 00:08:52.919
<v Speaker 2>record blocks the processing of all the perfectly valid records

179
00:08:52.919 --> 00:08:54.879
<v Speaker 2>sitting behind it in the stream for that same bike

180
00:08:55.000 --> 00:08:58.679
<v Speaker 2>or whatever the ordering context is. The whole sequence grinds

181
00:08:58.720 --> 00:09:00.879
<v Speaker 2>to a halt because of one bad message.

182
00:09:00.960 --> 00:09:03.879
<v Speaker 1>Wow, so a single byte of bad data could effectively

183
00:09:03.919 --> 00:09:06.799
<v Speaker 1>block a whole partition of the stream, causing latency

184
00:09:06.879 --> 00:09:09.440
<v Speaker 1>for everything behind it to skyrocket.

185
00:09:09.559 --> 00:09:13.519
<v Speaker 2>Exactly. It really highlights why robust error handling and potentially dead

186
00:09:13.559 --> 00:09:18.080
<v Speaker 2>letter queues for problematic messages are so crucial in consumer design.
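
NOTE
Editor's sketch, not part of the recording: a minimal illustration of the dead letter queue idea mentioned above, where a record that keeps failing is set aside so the records behind it can proceed.
consume, parse_event, and the 'bike_id:event' record format are hypothetical. Parking a record trades strict ordering for progress, which may or may not be acceptable for a given stream.
```python
def consume(records, process, max_attempts=3):
    """Process records in order; park records that keep failing in a dead-letter list."""
    done, dead_letters = [], []
    for record in records:
        for attempt in range(1, max_attempts + 1):
            try:
                done.append(process(record))
                break
            except ValueError:
                # Give up after the last attempt so later records are not blocked.
                if attempt == max_attempts:
                    dead_letters.append(record)
    return done, dead_letters
def parse_event(raw):
    """Hypothetical handler: expects 'bike_id:event' and raises ValueError otherwise."""
    bike_id, event = raw.split(":")
    return (bike_id, event)
done, dead = consume(["123:unlock", "garbage", "123:return"], parse_event)
print(done, dead)
```
Without the attempt cap, the malformed middle record would be retried forever and the return event would never be processed.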

187
00:09:18.080 --> 00:09:20.720
<v Speaker 1>Okay, that sets the stage really well. Let's move on

188
00:09:20.759 --> 00:09:23.759
<v Speaker 1>to the specific tools Amazon offers with Kinesis to handle

189
00:09:23.759 --> 00:09:27.480
<v Speaker 1>all this at scale. Starting with the foundation, Kinesis Data

190
00:09:27.519 --> 00:09:30.240
<v Speaker 1>Streams, or KDS. When do you reach for this?

191
00:09:30.720 --> 00:09:34.600
<v Speaker 2>KDS is your go-to when you need maximum control,

192
00:09:34.919 --> 00:09:39.679
<v Speaker 2>high durability, and really low latency. We're talking subsecond processing

193
00:09:39.720 --> 00:09:42.519
<v Speaker 2>for your own custom applications that read directly from the stream.

194
00:09:42.600 --> 00:09:45.279
<v Speaker 1>And the core unit of KDS is the shard, right,

195
00:09:45.360 --> 00:09:46.360
<v Speaker 1>what does that actually do?

196
00:09:46.559 --> 00:09:49.840
<v Speaker 2>The shard is the fundamental unit of capacity and parallelism

197
00:09:49.879 --> 00:09:53.159
<v Speaker 2>in KDS. Each shard has fixed limits on how much

198
00:09:53.240 --> 00:09:55.879
<v Speaker 2>data it can ingest typically one megabyte per second or

199
00:09:55.879 --> 00:09:58.480
<v Speaker 2>one thousand records per second, whichever comes first, and also

200
00:09:58.519 --> 00:09:59.720
<v Speaker 2>how much data can be read out.

201
00:10:00.000 --> 00:10:02.120
<v Speaker 1>If I need more capacity, I just add more shards

202
00:10:02.320 --> 00:10:02.879
<v Speaker 1>pretty much.

203
00:10:03.000 --> 00:10:05.799
<v Speaker 2>Yeah, you scale the stream by scaling the number of shards.
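
NOTE
Editor's sketch, not part of the recording: back-of-the-envelope shard sizing using the per-shard write limits quoted above (one megabyte per second or one thousand records per second, whichever comes first).
shards_needed and the sample workloads are made up for illustration. Read-side limits and headroom for bursts are ignored here.
```python
import math
# Per-shard write limits as quoted in the discussion above.
MB_PER_SHARD = 1.0
RECORDS_PER_SHARD = 1000
def shards_needed(mb_per_second, records_per_second):
    """Return the smallest shard count that satisfies both write limits."""
    by_bytes = math.ceil(mb_per_second / MB_PER_SHARD)
    by_records = math.ceil(records_per_second / RECORDS_PER_SHARD)
    return max(1, by_bytes, by_records)
print(shards_needed(mb_per_second=2.5, records_per_second=12000))
print(shards_needed(mb_per_second=6.0, records_per_second=500))
```
Many small records hit the record-count limit first, while fewer large records hit the byte limit first.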

204
00:10:05.879 --> 00:10:08.720
<v Speaker 1>Okay, and if we're tracking millions of bike journeys, how

205
00:10:08.720 --> 00:10:11.960
<v Speaker 1>does KDS decide which shard a specific bike's data goes

206
00:10:12.000 --> 00:10:13.279
<v Speaker 1>to and keep it in order.

207
00:10:13.480 --> 00:10:17.200
<v Speaker 2>That's determined by the partition key. When your producer application

208
00:10:17.320 --> 00:10:20.399
<v Speaker 2>sends a record to KDS, it includes a partition key.

209
00:10:20.840 --> 00:10:22.879
<v Speaker 2>This could be the bike ID for instance.

210
00:10:22.759 --> 00:10:25.360
<v Speaker 1>And KDS uses that key to route the data?

211
00:10:25.080 --> 00:10:29.360
<v Speaker 2>Exactly. KDS hashes the partition key and uses

212
00:10:29.399 --> 00:10:31.960
<v Speaker 2>the result to assign the record to a specific shard.
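
NOTE
Editor's sketch, not part of the recording: how hashing a partition key can route a record to a shard. Kinesis Data Streams maps keys to 128-bit integers with MD5.
shard_for is a hypothetical helper, and the equal contiguous split of the hash space is an assumption for illustration. Real streams can have uneven hash key ranges after resharding.
```python
import hashlib
HASH_SPACE = 2 ** 128
def shard_for(partition_key, shard_count):
    """Map a partition key to a shard index via its 128-bit MD5 hash."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    # Assumption: each shard owns an equal, contiguous slice of the hash space.
    return hash_key * shard_count // HASH_SPACE
first = shard_for("bike-123", 8)
second = shard_for("bike-123", 8)
print(first, second)
```
The same key always yields the same index, which is what keeps one bike's records together and in order on a single shard.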

213
00:10:32.679 --> 00:10:35.240
<v Speaker 2>All records with the same partition key always go to

214
00:10:35.279 --> 00:10:35.759
<v Speaker 2>the same shard.

215
00:10:35.639 --> 00:10:38.559
<v Speaker 1>Ah, and that's how it guarantees order within that key.

216
00:10:38.919 --> 00:10:41.399
<v Speaker 1>All data for bike one twenty three lands on, say,

217
00:10:41.799 --> 00:10:44.559
<v Speaker 1>shard five, in the correct sequence.

218
00:10:45.159 --> 00:10:48.679
<v Speaker 2>Precisely. But this also highlights a potential pitfall, the hot partition

219
00:10:48.759 --> 00:10:52.039
<v Speaker 2>key or hot shard. If one partition key, like a

220
00:10:52.080 --> 00:10:55.240
<v Speaker 2>super popular bike or a single central sensor, sends way

221
00:10:55.240 --> 00:10:58.000
<v Speaker 2>more data than others, its shard can get overwhelmed while

222
00:10:58.000 --> 00:10:58.720
<v Speaker 2>others are idle.

223
00:10:58.879 --> 00:11:01.399
<v Speaker 1>So choosing a good partition key that distributes data

224
00:11:01.440 --> 00:11:04.600
<v Speaker 1>evenly is critical for performance. Maybe not just the bike ID,

225
00:11:04.600 --> 00:11:06.759
<v Speaker 1>if usage is very uneven.

226
00:11:06.679 --> 00:11:10.200
<v Speaker 2>Right, sometimes using a more random key or combining fields

227
00:11:10.440 --> 00:11:12.240
<v Speaker 2>is necessary to spread the load effectively.

228
00:11:12.480 --> 00:11:14.799
<v Speaker 1>Now, reading the data. Let's say we have multiple apps

229
00:11:14.840 --> 00:11:17.039
<v Speaker 1>wanting that bike data, one for maintenance alerts, one for

230
00:11:17.120 --> 00:11:20.279
<v Speaker 1>route analysis. How do they read efficiently? You mentioned Enhanced

231
00:11:20.279 --> 00:11:21.080
<v Speaker 1>Fan-Out, EFO.

232
00:11:21.440 --> 00:11:24.279
<v Speaker 2>Yes, so, standard KDS consumers work on a pull model.

233
00:11:24.480 --> 00:11:27.200
<v Speaker 2>They poll the shard for data, but each shard has

234
00:11:27.240 --> 00:11:30.200
<v Speaker 2>a total read capacity limit about two megabytes per second

235
00:11:30.440 --> 00:11:32.000
<v Speaker 2>that all standard consumers share.

236
00:11:32.200 --> 00:11:34.519
<v Speaker 1>So if you have lots of consumers polling the same shard,

237
00:11:34.559 --> 00:11:36.879
<v Speaker 1>they start slowing each other down.

238
00:11:36.519 --> 00:11:40.519
<v Speaker 2>Exactly. They compete for that shared bandwidth. Enhanced Fan-Out, EFO,

239
00:11:41.200 --> 00:11:44.840
<v Speaker 2>solves this. With EFO, each registered consumer gets its own

240
00:11:44.919 --> 00:11:49.480
<v Speaker 2>dedicated throughput limit per shard, delivered via a push model from Kinesis.

241
00:11:49.960 --> 00:11:52.639
<v Speaker 1>So EFO consumers don't interfere with each other.

242
00:11:52.759 --> 00:11:55.799
<v Speaker 2>Correct. It allows multiple real time applications to consume from

243
00:11:55.799 --> 00:11:58.720
<v Speaker 2>the same stream at high speed independently. If you need

244
00:11:58.720 --> 00:12:01.840
<v Speaker 2>scale on the consumer side, EFO is often essential.

245
00:12:02.240 --> 00:12:06.519
<v Speaker 1>Okay, KDS for flexible, low latency custom apps. What about

246
00:12:06.600 --> 00:12:09.799
<v Speaker 1>Kinesis Data Firehose, KDF? People often call it the

247
00:12:09.799 --> 00:12:10.440
<v Speaker 1>easy loader. Why?

248
00:12:10.720 --> 00:12:14.720
<v Speaker 2>Because it is. KDF is fully managed, totally serverless,

249
00:12:14.960 --> 00:12:17.080
<v Speaker 2>and its whole purpose is to make it super simple

250
00:12:17.120 --> 00:12:20.440
<v Speaker 2>to capture streaming data and load it into specific destinations.

251
00:12:20.639 --> 00:12:24.279
<v Speaker 2>Think S3 data lakes, Redshift data warehouses, Elasticsearch

252
00:12:24.320 --> 00:12:25.360
<v Speaker 2>for search, that kind of thing.

253
00:12:25.399 --> 00:12:26.440
<v Speaker 1>How easy is easy?

254
00:12:26.759 --> 00:12:29.440
<v Speaker 2>Often zero code is needed for the delivery part. You

255
00:12:29.480 --> 00:12:31.759
<v Speaker 2>configure Firehose in the console, point it at

256
00:12:31.799 --> 00:12:34.679
<v Speaker 2>your stream source which could even be KDS, tell it

257
00:12:34.720 --> 00:12:38.840
<v Speaker 2>where to send the data, and it handles the scaling, buffering, delivery,

258
00:12:38.960 --> 00:12:40.360
<v Speaker 2>and retries automatically.

259
00:12:40.519 --> 00:12:42.799
<v Speaker 1>Does it do anything else like transform the data?

260
00:12:43.039 --> 00:12:45.679
<v Speaker 2>Yes, it can. It has built in capabilities for data

261
00:12:45.720 --> 00:12:48.639
<v Speaker 2>format conversion, like JSON to Parquet, and it can also

262
00:12:48.679 --> 00:12:53.039
<v Speaker 2>invoke an AWS Lambda function to perform custom inline transformations,

263
00:12:53.039 --> 00:12:55.759
<v Speaker 2>basic ETL, before the data lands in the destination.

264
00:12:56.039 --> 00:12:59.840
<v Speaker 1>Okay, KDS gives us speed and flexibility. KDF gives us

265
00:13:00.039 --> 00:13:03.240
<v Speaker 1>ease of use for loading data sinks. What's the catch

266
00:13:03.320 --> 00:13:04.879
<v Speaker 1>with KDF? What's the tradeoff?

267
00:13:05.080 --> 00:13:08.279
<v Speaker 2>The main trade-off is latency, because Firehose buffers

268
00:13:08.360 --> 00:13:11.799
<v Speaker 2>data internally, often for several seconds or even minutes to

269
00:13:11.840 --> 00:13:13.919
<v Speaker 2>batch it efficiently for delivery to the destination.

270
00:13:14.080 --> 00:13:16.960
<v Speaker 1>Ah, so it's not truly real time like KDS can be.

271
00:13:16.519 --> 00:13:19.559
<v Speaker 2>Exactly. It optimizes for delivery throughput and cost

272
00:13:19.559 --> 00:13:23.480
<v Speaker 2>effectiveness by batching, not for subsecond end to end latency,

273
00:13:23.919 --> 00:13:26.840
<v Speaker 2>So if you need that immediate subsecond processing, KDS is

274
00:13:26.879 --> 00:13:30.039
<v Speaker 2>the choice. If your goal is reliable, easy loading into

275
00:13:30.120 --> 00:13:32.639
<v Speaker 2>S3 or Redshift and near real time is good enough,

276
00:13:32.799 --> 00:13:34.720
<v Speaker 2>KDF is often way simpler.

277
00:13:34.519 --> 00:13:41.240
<v Speaker 1>Makes sense. Speed versus simplicity. Now, analysis: Kinesis Data Analytics, KDA.

278
00:13:41.600 --> 00:13:43.960
<v Speaker 1>How does this fit in? Analyzing the stream as it flows?

279
00:13:44.159 --> 00:13:47.320
<v Speaker 2>Precisely. KDA is also serverless, and it lets you run

280
00:13:47.360 --> 00:13:50.679
<v Speaker 2>continuous queries or applications against your streaming data, either from

281
00:13:50.759 --> 00:13:53.320
<v Speaker 2>KDS or KDF. It offers two different engines for this.

282
00:13:53.039 --> 00:13:55.120
<v Speaker 1>Okay, what are the engine choices?

283
00:13:55.279 --> 00:13:58.879
<v Speaker 2>First is a SQL engine. If you're familiar with standard SQL,

284
00:13:59.000 --> 00:14:02.879
<v Speaker 2>you can use ANSI SQL to query the stream. KDA

285
00:14:02.960 --> 00:14:06.240
<v Speaker 2>presents the stream data within windows, like tumbling windows of the

286
00:14:06.279 --> 00:14:08.799
<v Speaker 2>last five minutes or sliding windows.

287
00:14:08.879 --> 00:14:11.840
<v Speaker 1>So you could write a query like SELECT COUNT from bike

288
00:14:11.919 --> 00:14:15.480
<v Speaker 1>stream where status is in use, group by location district, over

289
00:14:15.559 --> 00:14:16.440
<v Speaker 1>a rolling time window?

290
00:14:16.279 --> 00:14:19.200
<v Speaker 2>Exactly, that kind of thing. It treats the incoming

291
00:14:19.240 --> 00:14:22.519
<v Speaker 2>stream like a continuously updating table you can query. Very

292
00:14:22.559 --> 00:14:26.399
<v Speaker 2>powerful for many common analytics tasks and accessible if you

293
00:14:26.440 --> 00:14:27.320
<v Speaker 2>know SQL.
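
NOTE
Editor's sketch, not part of the recording: the windowed count described above, written as plain Python rather than streaming SQL, to show what a five minute tumbling window groups together.
tumbling_counts, the event tuple layout (timestamp in seconds, district, status), and the IN_USE label are hypothetical and do not reflect the service's query syntax.
```python
from collections import Counter
def tumbling_counts(events, window_seconds=300):
    """Count in-use events per (window start, district) for fixed, non-overlapping windows."""
    counts = Counter()
    for timestamp, district, status in events:
        if status != "IN_USE":
            continue
        window_start = timestamp - (timestamp % window_seconds)
        counts[(window_start, district)] += 1
    return dict(counts)
events = [
    (10, "north", "IN_USE"),
    (200, "north", "IN_USE"),
    (250, "south", "DOCKED"),
    (310, "north", "IN_USE"),
]
print(tumbling_counts(events))
```
A sliding window would instead let each event contribute to every overlapping window it falls into.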

294
00:14:27.240 --> 00:14:28.799
<v Speaker 1>And the second engine? You mentioned two.

295
00:14:29.039 --> 00:14:31.279
<v Speaker 2>The second is based on Apache Flink. This gives

296
00:14:31.279 --> 00:14:33.840
<v Speaker 2>you much more power and flexibility. You typically write your

297
00:14:33.879 --> 00:14:35.519
<v Speaker 2>applications in Java or Scala.

298
00:14:35.600 --> 00:14:37.960
<v Speaker 1>What can Flink do that the SQL engine can't?

299
00:14:38.200 --> 00:14:41.600
<v Speaker 2>Well, Flink apps can handle much more complex logic, maintain

300
00:14:41.679 --> 00:14:45.639
<v Speaker 2>sophisticated state across events, and importantly can connect to data

301
00:14:45.639 --> 00:14:49.360
<v Speaker 2>sources and sinks outside of just Kinesis and AWS, like

302
00:14:49.399 --> 00:14:52.799
<v Speaker 2>maybe an on premises Kafka cluster or a custom database.

303
00:14:53.000 --> 00:14:56.000
<v Speaker 1>And I remember reading Flink is key for achieving exactly

304
00:14:56.039 --> 00:14:58.759
<v Speaker 1>once processing semantics. How does it manage that?

305
00:14:59.159 --> 00:15:03.519
<v Speaker 2>Yes, that's a major advantage. Flink achieves exactly once by

306
00:15:03.559 --> 00:15:08.200
<v Speaker 2>carefully managing the application's internal state and periodically saving checkpoints

307
00:15:08.200 --> 00:15:11.159
<v Speaker 2>of that state to durable storage, typically S3 or

308
00:15:11.240 --> 00:15:14.480
<v Speaker 2>something like RocksDB. If there's a failure, it can

309
00:15:14.480 --> 00:15:17.840
<v Speaker 2>restore from the last successful checkpoint and resume processing without

310
00:15:17.879 --> 00:15:19.759
<v Speaker 2>missing data or processing duplicates.

311
00:15:19.840 --> 00:15:23.360
<v Speaker 1>So SQL for simpler windowed queries, Flink for complex, stateful,

312
00:15:23.440 --> 00:15:26.159
<v Speaker 1>potentially exactly-once processing and connecting anywhere.

313
00:15:26.200 --> 00:15:26.879
<v Speaker 2>That's a good summary, yeah.

314
00:15:26.960 --> 00:15:30.519
<v Speaker 1>Okay, one more service: Kinesis Video Streams, KVS. This

315
00:15:30.559 --> 00:15:32.559
<v Speaker 1>seems more specialized. Video and audio.

316
00:15:32.720 --> 00:15:36.240
<v Speaker 2>Yeah, KVS is specifically designed for handling time encoded data streams,

317
00:15:36.440 --> 00:15:39.759
<v Speaker 2>primarily video and audio, but potentially other time series data too,

318
00:15:39.919 --> 00:15:41.919
<v Speaker 2>like radar or liar feeds.

319
00:15:42.159 --> 00:15:44.600
<v Speaker 1>How does play out in our smart city bike scenario?

320
00:15:44.960 --> 00:15:49.279
<v Speaker 2>KVS actually addresses two pretty distinct use cases. The first

321
00:15:49.399 --> 00:15:53.279
<v Speaker 2>is real-time, low-latency interaction. Imagine a user wanting

322
00:15:53.320 --> 00:15:56.240
<v Speaker 2>to see a live video feed of a bike stand

323
00:15:56.320 --> 00:15:58.440
<v Speaker 2>on their phone to check if bikes are actually available.

324
00:15:58.159 --> 00:16:00.000
<v Speaker 1>Okay, live streaming.

325
00:16:00.080 --> 00:16:04.000
<v Speaker 2>KVS provides capabilities, often using WebRTC standards, for that

326
00:16:04.080 --> 00:16:06.440
<v Speaker 2>kind of low-latency peer-to-peer or small-group

327
00:16:06.559 --> 00:16:10.159
<v Speaker 2>video streaming. It manages the signaling channels and things like

328
00:16:10.200 --> 00:16:13.840
<v Speaker 2>STUN and TURN servers needed to establish the connections. So

329
00:16:14.080 --> 00:16:15.799
<v Speaker 2>KVS WebRTC for live use.

330
00:16:15.360 --> 00:16:18.159
<v Speaker 1>Got it. And the second use case?

331
00:16:18.000 --> 00:16:20.799
<v Speaker 2>The second is more about storage, playback and analysis. This

332
00:16:20.840 --> 00:16:24.720
<v Speaker 2>is KVS storage. You stream video from say security cameras

333
00:16:24.720 --> 00:16:28.000
<v Speaker 2>at the bike stands into KVS for durable storage.

334
00:16:27.559 --> 00:16:28.919
<v Speaker 1>And then what? Just store it?

335
00:16:29.039 --> 00:16:31.200
<v Speaker 2>You can store it for compliance or later review, but

336
00:16:31.240 --> 00:16:33.639
<v Speaker 2>the real power comes from integrating it with AI and

337
00:16:33.720 --> 00:16:36.480
<v Speaker 2>ML services. For instance, you could feed that stored video

338
00:16:36.480 --> 00:16:38.679
<v Speaker 2>stream into Amazon Rekognition.

339
00:16:38.399 --> 00:16:39.759
<v Speaker 1>Ah, for analysis.

340
00:16:39.919 --> 00:16:43.679
<v Speaker 2>What like running facial recognition to detect known vandals near

341
00:16:43.720 --> 00:16:46.519
<v Speaker 2>the bike stands in real time, or maybe object detection

342
00:16:46.559 --> 00:16:50.519
<v Speaker 2>to count bikes automatically or spot obstructions. Rekognition analyzes the

343
00:16:50.559 --> 00:16:54.279
<v Speaker 2>frames ingested via KVS and can trigger alerts or other actions.

344
00:16:54.440 --> 00:16:57.399
<v Speaker 1>So KVS handles both the live interaction piece and the

345
00:16:57.600 --> 00:16:59.759
<v Speaker 1>ingest-for-later-analysis piece for video data.

346
00:16:59.559 --> 00:17:02.039
<v Speaker 2>Correct. Two sides of the video coin.

347
00:17:02.200 --> 00:17:04.759
<v Speaker 1>That's a really helpful tour of the whole Kinesis family.

348
00:17:04.839 --> 00:17:06.960
<v Speaker 1>Let's try to quickly recap the core job of each

349
00:17:07.000 --> 00:17:07.759
<v Speaker 1>one for the listener.

350
00:17:07.839 --> 00:17:13.079
<v Speaker 2>Okay, KDS, that's your foundational low-latency, high-throughput stream

351
00:17:13.559 --> 00:17:17.480
<v Speaker 2>for custom apps needing maximum flexibility. Think subsecond speed. Right.

352
00:17:17.599 --> 00:17:21.599
<v Speaker 2>KDF, the easy serverless loader, great for getting streaming

353
00:17:21.640 --> 00:17:24.759
<v Speaker 2>data into S3, Redshift, Elasticsearch without much code,

354
00:17:25.279 --> 00:17:29.519
<v Speaker 2>but sacrifices that subsecond latency for batching efficiency. KDA, the

355
00:17:29.599 --> 00:17:32.799
<v Speaker 2>real-time analytics engine. Use its SQL interface for simpler

356
00:17:32.799 --> 00:17:36.039
<v Speaker 2>windowed queries, or the Apache Flink engine for complex stateful

357
00:17:36.039 --> 00:17:39.200
<v Speaker 2>processing, exactly-once semantics, and broader connectivity.

358
00:17:39.240 --> 00:17:42.079
<v Speaker 1>And finally, KVS.

359
00:17:41.559 --> 00:17:44.920
<v Speaker 2>The specialist for video and time-encoded data. Handles both

360
00:17:44.960 --> 00:17:48.559
<v Speaker 2>low-latency WebRTC streaming for live interaction and durable

361
00:17:48.720 --> 00:17:53.119
<v Speaker 2>ingestion for storage, playback, and AI/ML analysis like Rekognition.

362
00:17:53.359 --> 00:17:56.640
<v Speaker 1>Perfect summary. Now, before we wrap up, there's a crucial

363
00:17:56.680 --> 00:18:00.599
<v Speaker 1>security point that underpins all of this distributed complexity. How

364
00:18:00.640 --> 00:18:04.160
<v Speaker 1>to secure all these producers and consumers talking to Kinesis?

365
00:18:04.160 --> 00:18:08.440
<v Speaker 2>Yeah, this is super important. The absolute standard best practice is: use

366
00:18:08.519 --> 00:18:11.960
<v Speaker 2>IAM roles. Do not embed long-term access keys or

367
00:18:12.000 --> 00:18:16.319
<v Speaker 2>secrets directly into your producer or consumer applications. Why roles specifically?

368
00:18:16.559 --> 00:18:20.240
<v Speaker 2>Because IAM roles provide temporary, short-lived security credentials that

369
00:18:20.279 --> 00:18:24.759
<v Speaker 2>are automatically generated and rotated by AWS. Your application running

370
00:18:24.759 --> 00:18:27.440
<v Speaker 2>on EC2 or Lambda or wherever assumes a role,

371
00:18:27.599 --> 00:18:31.000
<v Speaker 2>gets temporary permissions, does its work, and those credentials expire.

372
00:18:31.119 --> 00:18:34.440
<v Speaker 1>So even if an application instance gets compromised somehow, the

373
00:18:34.480 --> 00:18:36.359
<v Speaker 1>attacker doesn't get permanent keys.

374
00:18:36.480 --> 00:18:42.079
<v Speaker 2>Exactly, the potential blast radius is dramatically reduced compared to static,

375
00:18:42.160 --> 00:18:46.200
<v Speaker 2>long-lived keys being compromised. It's all about implementing least

376
00:18:46.200 --> 00:18:51.640
<v Speaker 2>privilege access using temporary role-based credentials. It's fundamental to

377
00:18:51.680 --> 00:18:53.039
<v Speaker 2>secure cloud architecture.

378
00:18:53.359 --> 00:18:56.440
<v Speaker 1>That's a critical takeaway. Okay, let's finish with that provocative

379
00:18:56.480 --> 00:18:59.480
<v Speaker 1>thought. We talked about at-least-once delivery and the

380
00:18:59.519 --> 00:19:02.519
<v Speaker 1>need for idempotency. How should listeners think about that?

381
00:19:02.920 --> 00:19:05.119
<v Speaker 2>Right, we established that getting duplicates is just a fact

382
00:19:05.160 --> 00:19:08.440
<v Speaker 2>of life in most high performance, resilient streaming systems like

383
00:19:08.559 --> 00:19:10.960
<v Speaker 2>those built on Kinesis Data Streams. You have to design

384
00:19:11.000 --> 00:19:11.240
<v Speaker 2>for it.

385
00:19:11.559 --> 00:19:12.920
<v Speaker 1>So the provocative thought is?

386
00:19:12.680 --> 00:19:15.599
<v Speaker 2>Think about how profoundly that changes how you have

387
00:19:15.680 --> 00:19:18.559
<v Speaker 2>to design your applications. Yeah, you can never assume an

388
00:19:18.559 --> 00:19:22.200
<v Speaker 2>input, an event, a message will arrive only once. It might

389
00:19:22.279 --> 00:19:25.079
<v Speaker 2>arrive twice or even more times, though that's rare. So

390
00:19:25.119 --> 00:19:28.240
<v Speaker 2>how does that constraint, the absolute certainty that you might

391
00:19:28.279 --> 00:19:31.680
<v Speaker 2>get duplicates, ripple through your entire design? How does it

392
00:19:31.680 --> 00:19:36.240
<v Speaker 2>affect your database schemas, your transaction logic, your state management?

393
00:19:36.640 --> 00:19:39.920
<v Speaker 2>Realizing that you must build idempotent consumers isn't just a detail,

394
00:19:40.039 --> 00:19:43.240
<v Speaker 2>it's a fundamental shift in mindset needed to truly master

395
00:19:43.359 --> 00:19:44.480
<v Speaker 2>streaming architectures.

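NOTE
A minimal sketch of the idempotent-consumer idea discussed above, in plain Python.
It assumes each event carries a unique event_id set by the producer; the class,
the field names and the in-memory set are all hypothetical stand-ins. A real
consumer would likely record seen IDs in a durable store with a unique key.
```python
class IdempotentConsumer:
    """Applies each event at most once, keyed by a unique event ID."""
    def __init__(self):
        self.seen_ids = set()   # stand-in for a durable table with a unique key
        self.bikes_out = {}     # station_id -> number of bikes checked out
    def handle(self, event):
        """Return True if the event was applied, False if it was a duplicate."""
        if event["event_id"] in self.seen_ids:
            return False        # duplicate delivery: skip the side effect
        station = event["station_id"]
        self.bikes_out[station] = self.bikes_out.get(station, 0) + 1
        self.seen_ids.add(event["event_id"])
        return True
consumer = IdempotentConsumer()
events = [
    {"event_id": "e1", "station_id": "s1"},
    {"event_id": "e2", "station_id": "s1"},
    {"event_id": "e1", "station_id": "s1"},  # redelivered after a retry
]
results = [consumer.handle(e) for e in events]
```
The redelivered e1 is recognized and skipped, so the checkout count for s1 stays
at 2 even though three records arrived.
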
396
00:19:44.720 --> 00:19:48.279
<v Speaker 1>A great point to ponder: designing for repeats, not just requests.

397
00:19:48.400 --> 00:19:49.880
<v Speaker 1>Thanks for breaking all that down.

398
00:19:49.720 --> 00:19:52.839
<v Speaker 2>My pleasure. It's complex, but hopefully a bit clearer now.
