WEBVTT

1
00:00:05.040 --> 00:00:06.320
<v Speaker 1>What's up, Warren? How you doing?

2
00:00:07.160 --> 00:00:10.599
<v Speaker 2>I'm doing I'm doing pretty well. Thanks, Thanks, well cool.

3
00:00:10.800 --> 00:00:14.039
<v Speaker 1>I cut my intro a little shorter this week because

4
00:00:14.759 --> 00:00:17.079
<v Speaker 1>I just kind of assume, and maybe this is a

5
00:00:17.079 --> 00:00:20.760
<v Speaker 1>bad assumption on my part, that people know which podcasts

6
00:00:20.800 --> 00:00:22.800
<v Speaker 1>they're listening to because they had to click on it,

7
00:00:23.079 --> 00:00:25.800
<v Speaker 1>and so I felt like it was kind of redundant

8
00:00:25.800 --> 00:00:29.600
<v Speaker 1>to say, welcome to the Adventures in DevOps podcast.

9
00:00:30.280 --> 00:00:32.240
<v Speaker 2>You just you just did it right.

10
00:00:32.320 --> 00:00:35.359
<v Speaker 1>That was subtle though, right, that was clever. I'm gonna

11
00:00:35.479 --> 00:00:37.439
<v Speaker 1>just pat myself on the back for that one.

12
00:00:37.600 --> 00:00:41.119
<v Speaker 3>Yeah, I've got I've got actually an interesting fact that

13
00:00:41.159 --> 00:00:42.679
<v Speaker 3>I can share, so you know, we can jump into

14
00:00:42.679 --> 00:00:47.159
<v Speaker 3>that there was an OTP provider that actually changed hands

15
00:00:47.200 --> 00:00:51.240
<v Speaker 3>similar to the XZ vulnerability in compression on Linux not

16
00:00:51.280 --> 00:00:54.520
<v Speaker 3>too long ago. And for those of you that don't know OTPs,

17
00:00:54.520 --> 00:00:56.240
<v Speaker 3>that's one-time passwords, so you can think

18
00:00:56.119 --> 00:00:57.759
<v Speaker 2>of like an auth app that you've got installed on

19
00:00:57.799 --> 00:00:58.240
<v Speaker 2>your phone.

20
00:00:58.880 --> 00:01:02.119
<v Speaker 3>And it's sort of ridiculous that this even happened, because

21
00:01:02.200 --> 00:01:04.519
<v Speaker 3>if you think about how bad it is for a

22
00:01:04.640 --> 00:01:08.200
<v Speaker 3>open source library to get co-opted by a malicious attacker,

23
00:01:08.239 --> 00:01:11.319
<v Speaker 3>to have an application on your phone that is also

24
00:01:11.400 --> 00:01:15.200
<v Speaker 3>responsible for security, for two-factor auth codes, to change hands.

25
00:01:15.480 --> 00:01:19.560
<v Speaker 3>That provider now has access to every single one of

26
00:01:19.599 --> 00:01:22.519
<v Speaker 3>those users' two-factor auth, and it could be even

27
00:01:22.560 --> 00:01:25.799
<v Speaker 3>primary factor if it comes to password resets and whatnot.

28
00:01:25.959 --> 00:01:29.480
<v Speaker 3>So that's a great way of stealing credentials. And I

29
00:01:29.480 --> 00:01:31.000
<v Speaker 3>don't think it's an attack vector that a lot of

30
00:01:31.040 --> 00:01:33.040
<v Speaker 3>people think about. I think they, you know, it's like whatever,

31
00:01:33.079 --> 00:01:35.359
<v Speaker 3>it just stores my two factor codes, doesn't really matter.

32
00:01:35.959 --> 00:01:38.879
<v Speaker 2>But now there's actually a huge problem that could come

33
00:01:38.920 --> 00:01:39.599
<v Speaker 2>up because of that.

34
00:01:40.359 --> 00:01:43.879
<v Speaker 1>Nice I look forward to it. I'm just excited about that.

35
00:01:46.439 --> 00:01:48.359
<v Speaker 3>I think everyone really has to switch over to WebAuthn,

36
00:01:48.400 --> 00:01:50.680
<v Speaker 3>and that's the true secret. And if you don't know

37
00:01:50.719 --> 00:01:52.359
<v Speaker 3>what that is, come and talk to me after the show.

38
00:01:52.879 --> 00:01:55.439
<v Speaker 3>I'm happy to give everyone an earful about that.

39
00:01:56.359 --> 00:01:57.840
<v Speaker 1>Or can we just give up on it and just

40
00:01:57.879 --> 00:02:01.280
<v Speaker 1>everyone uses 'password', all lowercase letters, for their password.

41
00:02:01.640 --> 00:02:06.200
<v Speaker 1>I mean there's some credibility to that approach.

42
00:02:05.879 --> 00:02:10.080
<v Speaker 2>Right, there's a whole episode there.

43
00:02:11.840 --> 00:02:15.479
<v Speaker 1>Maybe maybe, But today's episode, we're talking about one of

44
00:02:15.520 --> 00:02:19.960
<v Speaker 1>my favorite topics, incident response and on call management. And

45
00:02:20.560 --> 00:02:25.800
<v Speaker 1>to chat through that topic with us, we've got Falit Jain,

46
00:02:25.960 --> 00:02:29.919
<v Speaker 1>the CEO of Pagerly, joining us today. Falit, welcome.

47
00:02:31.560 --> 00:02:31.800
<v Speaker 4>Thanks.

48
00:02:31.800 --> 00:02:35.080
<v Speaker 5>So, hi guys, happy to join on this and, you know,

49
00:02:35.280 --> 00:02:41.120
<v Speaker 5>have a big discussion on incident response and on-call. Right on.

50
00:02:41.319 --> 00:02:41.599
<v Speaker 4>Cool.

51
00:02:41.680 --> 00:02:47.280
<v Speaker 1>I feel like incident response is a learned skill, you know,

52
00:02:47.439 --> 00:02:51.800
<v Speaker 1>and it's learned on the job under pressure when everything's

53
00:02:51.840 --> 00:02:56.599
<v Speaker 1>going to hell, And prior to your first incident, you

54
00:02:56.719 --> 00:02:59.680
<v Speaker 1>never even thought through that this is where your life

55
00:02:59.719 --> 00:03:02.400
<v Speaker 1>was going to lead. So how did you end up

56
00:03:03.039 --> 00:03:07.759
<v Speaker 1>starting a company dedicated to this incident response?

57
00:03:09.639 --> 00:03:12.560
<v Speaker 5>It began with like when I started out my first

58
00:03:13.240 --> 00:03:16.639
<v Speaker 5>job basically, and I was part of Amazon and in

59
00:03:16.680 --> 00:03:19.360
<v Speaker 5>the detail page team, which is frankly one of the

60
00:03:19.479 --> 00:03:25.520
<v Speaker 5>highest sort of pages in terms of traffic, if you consider. And

61
00:03:26.879 --> 00:03:29.199
<v Speaker 5>that was like the first hand experience. I like, my

62
00:03:29.199 --> 00:03:32.520
<v Speaker 5>manager put me on on-call within sort of months of joining,

63
00:03:33.000 --> 00:03:34.719
<v Speaker 5>and he told me like, hey, this is like the

64
00:03:34.719 --> 00:03:38.199
<v Speaker 5>best way to learn about things. And I can admit,

65
00:03:38.319 --> 00:03:42.120
<v Speaker 4>That was some serious pressure, say in.

66
00:03:42.120 --> 00:03:48.000
<v Speaker 5>The initial initial stuff and I and since then I've

67
00:03:48.039 --> 00:03:49.919
<v Speaker 5>always known, yeah, this is the best way to sort

68
00:03:49.960 --> 00:03:54.599
<v Speaker 5>of learn things because you are absolutely learning each and

69
00:03:54.759 --> 00:03:55.960
<v Speaker 5>every bit of things.

70
00:03:55.719 --> 00:03:57.599
<v Speaker 4>In the shortest possible amount of time.

71
00:03:58.120 --> 00:04:01.599
<v Speaker 5>So I think that that was my first I would

72
00:04:01.599 --> 00:04:04.439
<v Speaker 5>say, interaction with the incidents and the on-call world.

73
00:04:05.159 --> 00:04:09.039
<v Speaker 1>Well, let's be realistic there. Your manager's thought process was, actually,

74
00:04:09.560 --> 00:04:11.319
<v Speaker 1>if this guy's going to quit, I'm going to make

75
00:04:11.400 --> 00:04:13.719
<v Speaker 1>him quit sooner rather than later, So I'm putting him

76
00:04:13.759 --> 00:04:14.199
<v Speaker 1>on call.

77
00:04:14.319 --> 00:04:18.040
<v Speaker 2>Right at the beginning, it could be.

78
00:04:18.000 --> 00:04:20.240
<v Speaker 4>I think it was probably, you know, just testing.

79
00:04:20.759 --> 00:04:24.600
<v Speaker 5>This is how you test the waters. And yeah, it was

80
00:04:24.680 --> 00:04:27.439
<v Speaker 5>pretty pretty sort of a new world. Before that, I

81
00:04:27.519 --> 00:04:31.480
<v Speaker 5>always thought like, yeah, software development is more about building

82
00:04:31.480 --> 00:04:36.000
<v Speaker 5>stuff and you know, maybe designing it. This part is

83
00:04:36.240 --> 00:04:40.000
<v Speaker 5>truly what you'd call management, or managing or maintaining

84
00:04:40.040 --> 00:04:40.879
<v Speaker 5>your product.

85
00:04:41.240 --> 00:04:44.399
<v Speaker 4>And that was like the first kind of interaction I had.

86
00:04:47.199 --> 00:04:50.519
<v Speaker 1>Right and cool. So then you went from Amazon. After

87
00:04:50.639 --> 00:04:53.839
<v Speaker 1>that you went to Disney, right yeah.

88
00:04:54.000 --> 00:04:56.959
<v Speaker 4>Yeah, so in Amazon it was pretty interesting.

89
00:04:57.279 --> 00:05:00.240
<v Speaker 5>The detail page, and like, there are... it was

90
00:05:00.279 --> 00:05:03.600
<v Speaker 5>sort of quite a few days, especially during Prime Days

91
00:05:03.600 --> 00:05:06.600
<v Speaker 5>and Cyber Mondays, and the Christmas week is always like

92
00:05:06.839 --> 00:05:11.160
<v Speaker 5>pretty high pressure stuff and I remember each and every

93
00:05:11.279 --> 00:05:14.360
<v Speaker 5>one of them, like in terms of the events, like

94
00:05:14.519 --> 00:05:17.800
<v Speaker 5>even if there's like a small blip, there's like, I

95
00:05:17.839 --> 00:05:22.519
<v Speaker 5>think so many teams on a single bridge in even

96
00:05:22.560 --> 00:05:25.160
<v Speaker 5>in different locations, and everyone just giving their status up

97
00:05:25.240 --> 00:05:28.360
<v Speaker 5>dates each and after time and it was like it

98
00:05:28.480 --> 00:05:30.399
<v Speaker 5>was almost like, I'd say, like a war room.

99
00:05:30.879 --> 00:05:31.000
<v Speaker 4>Uh.

100
00:05:31.240 --> 00:05:35.319
<v Speaker 5>People, I think, have now started to tag on-call

101
00:05:35.439 --> 00:05:38.680
<v Speaker 5>rooms as war rooms. Like everyone giving their status updates

102
00:05:38.720 --> 00:05:40.959
<v Speaker 5>and see update and go.

103
00:05:40.959 --> 00:05:44.759
<v Speaker 4>On for quite a few nights. So that was I

104
00:05:44.759 --> 00:05:47.800
<v Speaker 4>think like pretty interesting in the Disney.

105
00:05:47.920 --> 00:05:51.120
<v Speaker 5>It was on the other way around, like our major

106
00:05:51.160 --> 00:05:55.360
<v Speaker 5>events in Disney were the live streams. So in India

107
00:05:55.399 --> 00:05:58.879
<v Speaker 5>we have cricket as the major sport, and we

108
00:05:59.000 --> 00:06:02.040
<v Speaker 5>used to live stream cricket and we had around twenty

109
00:06:02.160 --> 00:06:03.920
<v Speaker 5>million concurrent

110
00:06:03.680 --> 00:06:04.839
<v Speaker 4>Viewers also at some point.

111
00:06:05.759 --> 00:06:10.040
<v Speaker 5>And with that scale, each and every bit of system

112
00:06:10.279 --> 00:06:14.839
<v Speaker 5>you know, starting from WAF to CDN, even to

113
00:06:14.879 --> 00:06:19.040
<v Speaker 5>a load balancer, even to a small Kubernetes file, to

114
00:06:19.079 --> 00:06:22.959
<v Speaker 5>even to maybe our caching systems, everything gets tested a lot.

115
00:06:23.680 --> 00:06:28.720
<v Speaker 5>So for us in Disney, uh, that was our major priority.

116
00:06:28.959 --> 00:06:33.120
<v Speaker 5>How to you know, manage our on calls response instead

117
00:06:33.160 --> 00:06:37.160
<v Speaker 5>of responding on calls during the live streams because we

118
00:06:37.199 --> 00:06:40.639
<v Speaker 5>cannot even afford to go for a minute down because

119
00:06:40.680 --> 00:06:43.240
<v Speaker 5>we know how high the stakes of the live events

120
00:06:42.839 --> 00:06:44.439
<v Speaker 4>are for the company.

121
00:06:44.839 --> 00:06:48.759
<v Speaker 1>Well, especially when you're talking about like live streaming cricket

122
00:06:48.800 --> 00:06:51.279
<v Speaker 1>to Indians, because y'all take that seriously.

123
00:06:52.120 --> 00:06:55.720
<v Speaker 5>Yeah, yeah, yeah, we you know the you know, the

124
00:06:55.759 --> 00:06:59.319
<v Speaker 5>broadcast contracts are for you know, billions of dollars and

125
00:06:59.639 --> 00:07:02.120
<v Speaker 5>let's say we go down for a couple of minutes.

126
00:07:02.319 --> 00:07:04.920
<v Speaker 5>We're really losing money at every point in time,

127
00:07:05.519 --> 00:07:07.600
<v Speaker 5>and we can see the Twitter. You know, we're just

128
00:07:07.639 --> 00:07:09.639
<v Speaker 5>trending on Twitter, like, your app is down and

129
00:07:09.759 --> 00:07:12.600
<v Speaker 5>what's happening and what's not. So you need to be

130
00:07:12.759 --> 00:07:17.959
<v Speaker 5>very very careful you know, how to respond publicly also,

131
00:07:18.040 --> 00:07:20.920
<v Speaker 5>and how to you know, quickly bring things up in

132
00:07:20.959 --> 00:07:23.680
<v Speaker 5>a way that it can last at least during the live.

133
00:07:24.839 --> 00:07:29.240
<v Speaker 5>So so those were like I'd say, the most you know,

134
00:07:29.360 --> 00:07:33.279
<v Speaker 5>the closest I can be to a customer at that point

135
00:07:33.319 --> 00:07:37.040
<v Speaker 5>of time, and the most an engineer can be at the

136
00:07:37.160 --> 00:07:42.759
<v Speaker 5>highest pressure point. So yeah, so that was my Disney

137
00:07:42.800 --> 00:07:47.079
<v Speaker 5>stint, and post that I started out with Pagerly.

138
00:07:47.680 --> 00:07:50.680
<v Speaker 1>Right. Yeah, So I think after that you're kind of

139
00:07:50.800 --> 00:07:55.279
<v Speaker 1>committed at this point, you yeah, you're just your career

140
00:07:55.319 --> 00:07:59.600
<v Speaker 1>path is now incident response after those two stints.

141
00:07:59.800 --> 00:08:00.600
<v Speaker 4>Yeah yeah, yeah.

142
00:08:00.959 --> 00:08:06.199
<v Speaker 5>So interestingly, I think, uh, the two companies had slightly

143
00:08:06.240 --> 00:08:10.720
<v Speaker 5>a different way of handling response, maybe because.

144
00:08:10.480 --> 00:08:13.399
<v Speaker 4>Of the company size or team sizes.

145
00:08:13.920 --> 00:08:19.079
<v Speaker 5>But overall, I think that the concept was like everyone

146
00:08:19.160 --> 00:08:22.000
<v Speaker 5>was leading to a single sort of a goal where

147
00:08:22.040 --> 00:08:26.600
<v Speaker 5>we wanted to reduce the same incidents again and at

148
00:08:26.680 --> 00:08:27.879
<v Speaker 5>least that that's.

149
00:08:27.720 --> 00:08:28.839
<v Speaker 4>our primary goal.

150
00:08:29.279 --> 00:08:32.840
<v Speaker 5>And I see at what I saw in the two things,

151
00:08:32.879 --> 00:08:36.639
<v Speaker 5>like the primary part of the incident response is the process,

152
00:08:36.759 --> 00:08:38.519
<v Speaker 5>Like how do you sort of you know, set up

153
00:08:38.559 --> 00:08:43.000
<v Speaker 5>the processes, how do you enable your engineers to you know,

154
00:08:43.120 --> 00:08:44.399
<v Speaker 5>follow these processes?

155
00:08:44.799 --> 00:08:45.240
<v Speaker 4>Uh?

156
00:08:45.320 --> 00:08:48.759
<v Speaker 5>And essentially, I think, like as an engineer, as

157
00:08:48.759 --> 00:08:53.639
<v Speaker 5>a developer, ah, that's why we call on call as

158
00:08:53.679 --> 00:08:57.039
<v Speaker 5>an operational part. Nobody wants to you know, spend a

159
00:08:57.080 --> 00:08:59.759
<v Speaker 5>lot of time on it. Like everyone wants to maybe code,

160
00:09:00.159 --> 00:09:05.399
<v Speaker 5>develop features, maybe even design, architect even blog nowadays, but

161
00:09:05.879 --> 00:09:08.679
<v Speaker 5>on-call is the last part, like, anyone wants to

162
00:09:08.720 --> 00:09:11.720
<v Speaker 5>spend time on, and it's mostly grunt work, especially if there's

163
00:09:11.799 --> 00:09:16.039
<v Speaker 5>like, you know, work after work related to bugs or incidents.

164
00:09:16.440 --> 00:09:20.919
<v Speaker 5>So that's where I saw a lot of common patterns

165
00:09:21.840 --> 00:09:24.799
<v Speaker 5>across you know, incident response, even.

166
00:09:24.879 --> 00:09:25.799
<v Speaker 4>On call management.

167
00:09:26.200 --> 00:09:29.320
<v Speaker 5>UH is like like we there can be a lot

168
00:09:29.360 --> 00:09:33.759
<v Speaker 5>of tools or automations or even.

169
00:09:33.600 --> 00:09:36.000
<v Speaker 4>assistant agents which can help the engineers.

170
00:09:36.120 --> 00:09:39.919
<v Speaker 5>So that's why I kind of you know started started

171
00:09:39.919 --> 00:09:42.799
<v Speaker 5>with Pagerly, which is helping the teams to assist the

172
00:09:42.840 --> 00:09:44.200
<v Speaker 5>incident management.

173
00:09:44.559 --> 00:09:47.360
<v Speaker 1>Yeah, I think that's a solid point that's often overlooked.

174
00:09:47.840 --> 00:09:51.279
<v Speaker 1>I work a lot with early stage startups, and it's

175
00:09:51.320 --> 00:09:54.360
<v Speaker 1>a pattern I've seen over my career, like the biggest

176
00:09:54.440 --> 00:09:59.000
<v Speaker 1>part of incident response happens before you ever have your

177
00:09:59.039 --> 00:10:01.679
<v Speaker 1>first incident, because you have to talk through, like,

178
00:10:01.879 --> 00:10:03.919
<v Speaker 1>what are we going to do when this actually happens,

179
00:10:03.960 --> 00:10:06.639
<v Speaker 1>who are we gonna bring on, how are we going

180
00:10:06.720 --> 00:10:11.159
<v Speaker 1>to carry out communications? And so Yeah, I think that's

181
00:10:11.159 --> 00:10:13.080
<v Speaker 1>a good solid point. I like the fact that you

182
00:10:13.159 --> 00:10:16.519
<v Speaker 1>mentioned that there's multiple ways of doing that. You know,

183
00:10:16.559 --> 00:10:18.080
<v Speaker 1>there's not one right way.

184
00:10:18.559 --> 00:10:22.039
<v Speaker 5>Right right, So there, I would say, like the first

185
00:10:22.039 --> 00:10:26.080
<v Speaker 5>part is like you need to sort of realize, yeah,

186
00:10:26.320 --> 00:10:29.159
<v Speaker 5>the time has come in my organization that we need

187
00:10:29.200 --> 00:10:33.120
<v Speaker 5>to have this set up. I what I've usually seen

188
00:10:33.159 --> 00:10:37.240
<v Speaker 5>is, almost ninety percent of the time, it comes down from the

189
00:10:37.320 --> 00:10:41.279
<v Speaker 5>top leadership, if you have like CTOs or VPs

190
00:10:41.480 --> 00:10:45.240
<v Speaker 5>depending on the organization size. If those folks have

191
00:10:45.320 --> 00:10:50.200
<v Speaker 5>come from a place where incident response or on-call processes had

192
00:10:50.039 --> 00:10:52.559
<v Speaker 4>Been in place, they bring that culture into.

193
00:10:52.360 --> 00:10:55.080
<v Speaker 5>The company because they they have realized value over the

194
00:10:55.120 --> 00:10:58.320
<v Speaker 5>time of these processes in companies where they have not

195
00:10:58.559 --> 00:11:02.559
<v Speaker 5>like usually or they take a longer time to realize, yeah,

196
00:11:02.600 --> 00:11:06.120
<v Speaker 5>we need such processes. So that's the first part is

197
00:11:06.159 --> 00:11:09.879
<v Speaker 5>to realize like, hey, these are the processes we need

198
00:11:09.960 --> 00:11:12.679
<v Speaker 5>so that we can at least, you know, reduce our

199
00:11:12.799 --> 00:11:15.159
<v Speaker 5>on-call issues in a longer

200
00:11:14.879 --> 00:11:15.519
<v Speaker 4>Frame of time.

201
00:11:16.200 --> 00:11:18.720
<v Speaker 5>So that's the first part. The second part is then

202
00:11:18.799 --> 00:11:21.759
<v Speaker 5>to set up that on-call roster. So on-call

203
00:11:21.919 --> 00:11:25.480
<v Speaker 5>rosters is something like uh, now that is something that

204
00:11:25.519 --> 00:11:29.279
<v Speaker 5>which is very dependent org to org. Some organizations

205
00:11:29.399 --> 00:11:33.120
<v Speaker 5>want to have a centralized on-call team which kind

206
00:11:33.159 --> 00:11:36.840
<v Speaker 5>of handles everything like let's say, even if it's like

207
00:11:36.879 --> 00:11:39.000
<v Speaker 5>an SRE issue, they do it, even if it's like

208
00:11:39.039 --> 00:11:43.279
<v Speaker 5>an infra situation, they handle it. And people have you know,

209
00:11:43.320 --> 00:11:45.879
<v Speaker 5>different ways of setting up maybe like one one person

210
00:11:46.480 --> 00:11:49.480
<v Speaker 5>from each team or just one person every week, and

211
00:11:49.600 --> 00:11:54.440
<v Speaker 5>they do. Some organizations have a dedicated on-call team

212
00:11:54.519 --> 00:11:58.559
<v Speaker 5>for each of their separate teams, so uh, and they

213
00:11:58.679 --> 00:12:01.000
<v Speaker 5>kind of rotate it weekly, biweekly, monthly, maybe.

214
00:12:01.480 --> 00:12:02.519
<v Speaker 4>So that's the other part.

215
00:12:02.639 --> 00:12:05.480
<v Speaker 5>The second part is to set the on-call roster

216
00:12:06.080 --> 00:12:11.080
<v Speaker 5>and I think then organizations take some time until the

217
00:12:11.360 --> 00:12:13.519
<v Speaker 5>on-call rosters get settled and

218
00:12:13.440 --> 00:12:15.960
<v Speaker 4>They start you know, debugging tickets, and.

219
00:12:16.039 --> 00:12:18.559
<v Speaker 5>After some certain amount of time then they go into

220
00:12:18.600 --> 00:12:22.559
<v Speaker 5>the setting up the incident response part, which is the

221
00:12:22.639 --> 00:12:25.600
<v Speaker 5>post mortems as well as you know, figuring out Hey, like,

222
00:12:25.720 --> 00:12:29.919
<v Speaker 5>these are the sort of our general kind of you know,

223
00:12:29.960 --> 00:12:32.399
<v Speaker 5>steps we take to solve an issue. These are the

224
00:12:32.559 --> 00:12:36.320
<v Speaker 5>certain workflows that we do. Now let's try to streamline

225
00:12:36.320 --> 00:12:38.879
<v Speaker 5>this both in the response as well as in the

226
00:12:38.960 --> 00:12:41.679
<v Speaker 5>post mortem process, as in the post mortem analysis.

227
00:12:42.080 --> 00:12:44.279
<v Speaker 4>So that's how that's what.

228
00:12:44.120 --> 00:12:49.320
<v Speaker 5>we have seen organizations go from step A to step B

229
00:12:49.679 --> 00:12:51.399
<v Speaker 5>to the last part, which is post mortems.

230
00:12:51.600 --> 00:12:55.039
<v Speaker 3>You said something really interesting I think, which is I

231
00:12:55.039 --> 00:12:57.519
<v Speaker 3>haven't worked at any company so far, and even my

232
00:12:57.639 --> 00:13:01.879
<v Speaker 3>own, Authress, here, we have process. It isn't like we

233
00:13:01.919 --> 00:13:04.279
<v Speaker 3>don't do anything when that happens. Maybe it's because we're

234
00:13:04.320 --> 00:13:06.639
<v Speaker 3>a tech focused or a software focused company, and that's

235
00:13:06.639 --> 00:13:10.039
<v Speaker 3>pretty much where I've worked. But the part that you

236
00:13:10.039 --> 00:13:12.080
<v Speaker 3>said that was really interesting for me is that software

237
00:13:12.080 --> 00:13:15.879
<v Speaker 3>engineers don't like on call and you know, I have

238
00:13:17.200 --> 00:13:19.200
<v Speaker 3>I want to challenge that or like, you know, I want.

239
00:13:19.080 --> 00:13:20.559
<v Speaker 2>To live in the world where it's not a problem.

240
00:13:20.600 --> 00:13:22.720
<v Speaker 3>It's like, why why do people not like it so much,

241
00:13:22.720 --> 00:13:23.799
<v Speaker 3>any thoughts about that.

242
00:13:24.240 --> 00:13:28.519
<v Speaker 5>Because people as engineers or developers, they don't consider as

243
00:13:28.559 --> 00:13:32.320
<v Speaker 5>part of the building process. We love to build, we

244
00:13:32.440 --> 00:13:35.480
<v Speaker 5>love to you know, architect things, but once we have

245
00:13:35.559 --> 00:13:37.600
<v Speaker 5>done that, then we don want to sort of you know,

246
00:13:37.720 --> 00:13:41.039
<v Speaker 5>going and you know, fix out just one tiny part

247
00:13:41.080 --> 00:13:43.440
<v Speaker 5>of it which is actually causing the major issue, but

248
00:13:43.679 --> 00:13:47.120
<v Speaker 5>going there fixing part of it and probably you know,

249
00:13:47.279 --> 00:13:50.679
<v Speaker 5>just taking a blame or so people kind of have

250
00:13:50.840 --> 00:13:54.440
<v Speaker 5>that kind of bias. Also, Hey, my my product is

251
00:13:54.480 --> 00:13:55.360
<v Speaker 5>like, bug free.

252
00:13:55.559 --> 00:13:58.399
<v Speaker 4>My product is, you know, super good, so.

253
00:13:58.840 --> 00:14:01.600
<v Speaker 5>And going there fixing this as well as you know,

254
00:14:01.679 --> 00:14:04.639
<v Speaker 5>you already have a lot of other work going on

255
00:14:04.759 --> 00:14:08.559
<v Speaker 5>the sprints in this agile world. So that's where people

256
00:14:08.799 --> 00:14:11.120
<v Speaker 5>definitely, sort of, don't want to spend a

257
00:14:11.120 --> 00:14:11.879
<v Speaker 5>lot of time on it.

258
00:14:12.399 --> 00:14:15.799
<v Speaker 4>So that's what you think.

259
00:14:15.799 --> 00:14:18.840
<v Speaker 3>It's not like well prioritized or rewarded. Like if you

260
00:14:18.879 --> 00:14:20.879
<v Speaker 3>do on call work, you're not rewarded for it. If

261
00:14:20.919 --> 00:14:23.360
<v Speaker 3>you write bugless code, you're not rewarded for it. So

262
00:14:23.480 --> 00:14:26.559
<v Speaker 3>you know, whatever I don't want to do it, it's

263
00:14:26.559 --> 00:14:28.120
<v Speaker 3>going to happen, and then I have to pay the

264
00:14:28.279 --> 00:14:29.919
<v Speaker 3>I have to pay the fine because of it, and

265
00:14:30.000 --> 00:14:31.240
<v Speaker 3>I don't I don't get the benefit.

266
00:14:32.440 --> 00:14:37.279
<v Speaker 5>Yeah, I think like benefits and all, like, can probably

267
00:14:37.279 --> 00:14:40.559
<v Speaker 5>sort of be defined by the engineering managers or

268
00:14:40.600 --> 00:14:42.919
<v Speaker 5>team leads if they want to sort of reward or

269
00:14:42.960 --> 00:14:45.799
<v Speaker 5>they want to highlight maybe you know, like if the

270
00:14:45.840 --> 00:14:48.759
<v Speaker 5>person has solved this many tickets and these incidents, or

271
00:14:49.120 --> 00:14:54.279
<v Speaker 5>maybe find a better way of you know, rewarding rewarding uh,

272
00:14:54.600 --> 00:14:57.320
<v Speaker 5>developers who actually solve a lot of incidents.

273
00:14:58.080 --> 00:14:59.120
<v Speaker 4>But yeah, I think like in.

274
00:14:59.159 --> 00:15:02.000
<v Speaker 5>General sense, like it's not part of the building, it's

275
00:15:02.159 --> 00:15:05.320
<v Speaker 5>only maintenance, but that's the major.

276
00:15:08.000 --> 00:15:09.919
<v Speaker 2>Like, no, no, I totally got it.

277
00:15:10.399 --> 00:15:12.720
<v Speaker 1>Yeah, I don't like on call because it's never my code,

278
00:15:12.759 --> 00:15:13.879
<v Speaker 1>it's always something else.

279
00:15:14.159 --> 00:15:18.039
<v Speaker 3>Yeah, but I mean that makes me think there's something Yeah, no,

280
00:15:18.080 --> 00:15:19.600
<v Speaker 3>I totally get I mean I feel like there's something

281
00:15:19.600 --> 00:15:20.919
<v Speaker 3>fundamentally broken there.

282
00:15:20.960 --> 00:15:23.679
<v Speaker 2>Like I've seen that where I worked at one.

283
00:15:23.600 --> 00:15:27.039
<v Speaker 3>Of the previous previous jobs, that was twenty fifty engineers

284
00:15:27.080 --> 00:15:30.519
<v Speaker 3>that were all rotating through all the same on call schedule,

285
00:15:30.960 --> 00:15:33.720
<v Speaker 3>as if somehow code just because it was all in

286
00:15:33.720 --> 00:15:37.039
<v Speaker 3>the monolith, if if something was broken, I somehow would

287
00:15:37.039 --> 00:15:39.320
<v Speaker 3>magically know what was going on in, say, I don't

288
00:15:39.360 --> 00:15:42.919
<v Speaker 3>know products or logistics code when I had nothing to

289
00:15:42.960 --> 00:15:45.440
<v Speaker 3>do with the development of it, Like I like it

290
00:15:45.519 --> 00:15:47.600
<v Speaker 3>might as well be like some foreign, like, I

291
00:15:47.639 --> 00:15:51.120
<v Speaker 3>don't know, Aramaic or, you know, cuneiform to me, like

292
00:15:51.159 --> 00:15:53.759
<v Speaker 3>I have no idea what that was going on there

293
00:15:53.840 --> 00:15:56.240
<v Speaker 3>at all, and yet somehow I have to come in

294
00:15:56.320 --> 00:15:59.080
<v Speaker 3>and debug or find out what the problem.

295
00:15:58.879 --> 00:16:02.279
<v Speaker 5>Was, right, Yeah, I think in Amazon this was a

296
00:16:02.360 --> 00:16:06.480
<v Speaker 5>case, like the entire, the detail page was

297
00:16:07.000 --> 00:16:10.720
<v Speaker 5>kind of at that point of time like that, and

298
00:16:10.240 --> 00:16:14.240
<v Speaker 5>and that particular sort of service the page is like

299
00:16:14.320 --> 00:16:18.679
<v Speaker 5>maintained by more than one fifty software engineers. So like

300
00:16:19.000 --> 00:16:21.080
<v Speaker 5>most of the times you are debugging something that you

301
00:16:21.120 --> 00:16:24.320
<v Speaker 5>have not built, so, and even then, it's three

302
00:16:24.480 --> 00:16:27.799
<v Speaker 5>in the night, and you're already frustrated, like you don't

303
00:16:27.799 --> 00:16:31.000
<v Speaker 5>know what's happening. So that's why I think some bit

304
00:16:31.039 --> 00:16:33.919
<v Speaker 5>of uscision do come from. And that's why I think

305
00:16:34.000 --> 00:16:37.519
<v Speaker 5>like good processes can sort of somehow mitigate some of

306
00:16:37.600 --> 00:16:41.600
<v Speaker 5>the pain points, especially in such situations.

307
00:16:41.039 --> 00:16:45.399
<v Speaker 3>Like was the expectation at three am that software engineers

308
00:16:45.399 --> 00:16:47.679
<v Speaker 3>should be able to log on and identify the problem

309
00:16:47.720 --> 00:16:50.399
<v Speaker 3>and push out a fix like that seems like there's

310
00:16:50.440 --> 00:16:54.080
<v Speaker 3>something that would never actually have like actually worked out

311
00:16:54.120 --> 00:16:55.679
<v Speaker 3>in practice.

312
00:16:55.200 --> 00:16:58.360
<v Speaker 5>It did, like, sometimes. So like it kind of depends

313
00:16:58.440 --> 00:17:01.440
<v Speaker 5>on what kind of process you have in place. So

314
00:17:01.600 --> 00:17:04.440
<v Speaker 5>one part is to maybe mitigate. Sometimes mitigation

315
00:17:04.519 --> 00:17:07.079
<v Speaker 5>can be done just by a revert to the last running version.

316
00:17:07.279 --> 00:17:10.359
<v Speaker 5>That's one mitigation people do have. But if even that

317
00:17:10.519 --> 00:17:14.400
<v Speaker 5>is not done. You probably need to sort of page

318
00:17:14.759 --> 00:17:17.880
<v Speaker 5>the person who has probably added that line of code

319
00:17:17.960 --> 00:17:20.839
<v Speaker 5>and take help from him or her, or you maybe

320
00:17:20.920 --> 00:17:23.000
<v Speaker 5>have certain like a time of team or something like

321
00:17:23.079 --> 00:17:26.160
<v Speaker 5>that which can sort of, you know, uh, orchestrate

322
00:17:26.240 --> 00:17:29.880
<v Speaker 5>and collaborate with a lot of different on calls for

323
00:17:30.000 --> 00:17:32.279
<v Speaker 5>different developers.

324
00:17:31.839 --> 00:17:34.559
<v Speaker 4>To sort of mitigate and put up put a patch

325
00:17:34.640 --> 00:17:35.359
<v Speaker 4>or fix the issues.

326
00:17:35.759 --> 00:17:39.079
<v Speaker 5>Right, So this kind of like depends on you know,

327
00:17:39.200 --> 00:17:43.240
<v Speaker 5>how you have set up that incident response process itself.

328
00:17:43.519 --> 00:17:46.759
<v Speaker 1>Yeah, I think that's a really good distinction to make

329
00:17:46.839 --> 00:17:52.279
<v Speaker 1>there that like during an incident, oftentimes the primary goal

330
00:17:52.519 --> 00:17:56.440
<v Speaker 1>is to mitigate the problem, which is different than solving

331
00:17:56.559 --> 00:17:59.799
<v Speaker 1>the root cause of the problem. So like if if

332
00:17:59.799 --> 00:18:03.000
<v Speaker 1>we're run an outage or an incident, like we might

333
00:18:03.160 --> 00:18:09.680
<v Speaker 1>mitigate the problem by launching you know, fifteen more Kubernetes

334
00:18:09.720 --> 00:18:13.400
<v Speaker 1>pods with just insane amounts of memory, just so we

335
00:18:13.480 --> 00:18:16.000
<v Speaker 1>can ride through the problem until we're able to figure

336
00:18:16.039 --> 00:18:18.960
<v Speaker 1>out the root cause and test that theory and then

337
00:18:19.000 --> 00:18:20.039
<v Speaker 1>deploy a fix for it.

338
00:18:20.480 --> 00:18:23.000
<v Speaker 3>So just so I got you right, Will, your strategy

339
00:18:23.119 --> 00:18:25.519
<v Speaker 3>for incident management is turning it off and back on again.

340
00:18:25.599 --> 00:18:29.640
<v Speaker 1>Absolutely, three times. Always reboot three times.

341
00:18:31.559 --> 00:18:34.000
<v Speaker 4>Yeah, I think like legging ice stream.

342
00:18:34.720 --> 00:18:38.480
<v Speaker 5>You know, one of the major kind of our last

343
00:18:38.559 --> 00:18:41.160
<v Speaker 5>resort was to just put the live stream on and

344
00:18:41.720 --> 00:18:44.480
<v Speaker 5>maybe don't have a paywall on or something like that.

345
00:18:44.680 --> 00:18:46.920
<v Speaker 4>So that can be one. Even in

346
00:18:48.839 --> 00:18:51.920
<v Speaker 5>different scenarios, you can say, like, okay, even if

347
00:18:52.000 --> 00:18:53.160
<v Speaker 5>fixing is taking time,

348
00:18:53.039 --> 00:18:58.279
<v Speaker 4>immediately, maybe do something. Maybe just put that I took there.

349
00:18:58.839 --> 00:19:02.519
<v Speaker 5>There was a certain uh forcessiarity had in peace, so

350
00:19:02.839 --> 00:19:06.160
<v Speaker 5>I just said, like, just increase the Kubernetes pods and

351
00:19:06.799 --> 00:19:09.680
<v Speaker 5>at least that your customer is not facing that issue

352
00:19:09.680 --> 00:19:12.799
<v Speaker 5>for the moment, then use that time to

353
00:19:13.759 --> 00:19:14.880
<v Speaker 5>to actually fix the issue.

354
00:19:15.200 --> 00:19:18.640
<v Speaker 3>How do you decide what mitigation strategy makes the most sense, Like,

355
00:19:19.000 --> 00:19:20.960
<v Speaker 3>if you like, I feel like we're in the case

356
00:19:21.000 --> 00:19:23.279
<v Speaker 3>of the world now where we're going to automate whatever

357
00:19:23.400 --> 00:19:26.279
<v Speaker 3>it is. So if we have some number of failures,

358
00:19:26.359 --> 00:19:28.799
<v Speaker 3>Do we just immediately start deploying extra pods? Do we

359
00:19:28.960 --> 00:19:32.160
<v Speaker 3>immediately try to roll back to a previous code version?

360
00:19:32.240 --> 00:19:35.240
<v Speaker 3>Like can we even know upfront what the right approach

361
00:19:35.400 --> 00:19:37.200
<v Speaker 3>is there to automate? Because the last thing I want,

362
00:19:37.240 --> 00:19:39.519
<v Speaker 3>I feel like, is someone to get online and after

363
00:19:39.680 --> 00:19:41.799
<v Speaker 3>half an hour be like, you know what, maybe we

364
00:19:41.920 --> 00:19:45.160
<v Speaker 3>should deploy some pods with an insane amount of memory

365
00:19:45.200 --> 00:19:46.480
<v Speaker 3>that will solve all of our problems.

366
00:19:47.359 --> 00:19:50.240
<v Speaker 4>Yeah, yeah, it's interesting.

367
00:19:50.319 --> 00:19:55.319
<v Speaker 5>It was like like as an engineer, like even if

368
00:19:55.400 --> 00:19:59.319
<v Speaker 5>you have some familiarity, you would know this is the issue.

369
00:19:59.519 --> 00:20:02.079
<v Speaker 4>Let's say, if you're seeing a memory issue, then a pod increase

370
00:20:02.359 --> 00:20:03.000
<v Speaker 4>makes a sense.

371
00:20:03.440 --> 00:20:07.119
<v Speaker 5>But let's say if there's a something else you're seeing

372
00:20:07.119 --> 00:20:10.319
<v Speaker 5>a lot of 5xx's, probably the pod increase might

373
00:20:10.359 --> 00:20:11.000
<v Speaker 5>not make a sense.

374
00:20:11.000 --> 00:20:14.400
<v Speaker 4>There, probably reverting to a previous version might make sense.
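
NOTE
A minimal sketch of the rule of thumb in this exchange, in Python: memory
trouble suggests adding pods, a burst of 5xx errors suggests reverting, and
anything else goes to a person who knows the system. The signal names, the
5% threshold, and the three action labels are assumptions made up for
illustration; the speakers do not specify any of them.
```python
from dataclasses import dataclass
@dataclass
class Signals:
    memory_pressure: bool   # hypothetical: pods are close to their memory limit
    error_5xx_rate: float   # hypothetical: fraction of requests returning 5xx
def choose_mitigation(s: Signals, error_threshold: float = 0.05) -> str:
    """Pick a first mitigation to try; the threshold is illustrative only."""
    if s.error_5xx_rate >= error_threshold:
        return "rollback"     # revert to the previous version
    if s.memory_pressure:
        return "scale_out"    # add pods to ride through the problem
    return "page_owner"       # no clear signal: bring in someone familiar
print(choose_mitigation(Signals(memory_pressure=False, error_5xx_rate=0.2)))  # rollback
```
Checking the error rate before memory is a design choice here, not something
stated in the conversation; the fallback to a human reflects the point that
some familiarity with the system is still needed.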

375
00:20:15.079 --> 00:20:18.799
<v Speaker 5>So I think you need to have some bit of familiarity,

376
00:20:18.960 --> 00:20:21.680
<v Speaker 5>you know, with whatever system that you are hiding it

377
00:20:22.279 --> 00:20:23.759
<v Speaker 5>in case you're not, then it becomes.

378
00:20:23.559 --> 00:20:27.079
<v Speaker 4>a huge challenge. Then you probably don't know, you know,

379
00:20:27.200 --> 00:20:31.279
<v Speaker 4>what to do. So I think, like I think some

380
00:20:31.839 --> 00:20:35.440
<v Speaker 4>familiarity is needed, at least to have that first mitigation.

381
00:20:35.559 --> 00:20:41.799
<v Speaker 1>Still, I think that alludes to an entire skill set

382
00:20:42.000 --> 00:20:44.480
<v Speaker 1>in software engineering of how to troubleshoot.

383
00:20:45.039 --> 00:20:45.160
<v Speaker 4>You know.

384
00:20:45.240 --> 00:20:48.200
<v Speaker 1>It's because like debugging when you're writing code is completely

385
00:20:48.319 --> 00:20:53.839
<v Speaker 1>different I think than troubleshooting a live system and all

386
00:20:53.880 --> 00:20:56.839
<v Speaker 1>of its different dependencies and trying to figure out where

387
00:20:56.920 --> 00:20:59.599
<v Speaker 1>the potential problem might be and how do you how

388
00:20:59.640 --> 00:21:02.519
<v Speaker 1>do you get some faith in that theory and then

389
00:21:03.319 --> 00:21:04.440
<v Speaker 1>do something to mitigate it.

390
00:21:04.799 --> 00:21:05.000
<v Speaker 4>Yeah.

391
00:21:05.200 --> 00:21:07.599
<v Speaker 5>Yeah, I think like a couple of techniques that we

392
00:21:07.799 --> 00:21:10.079
<v Speaker 5>have seen and we have kind of deployed. One

393
00:21:10.240 --> 00:21:14.119
<v Speaker 5>is maybe do, like, a shadow on-call, like you

394
00:21:14.279 --> 00:21:17.000
<v Speaker 5>can do on-calls, like someone is handling the

395
00:21:17.079 --> 00:21:19.279
<v Speaker 5>on-call, but let's say if they have major incidents,

396
00:21:19.720 --> 00:21:22.759
<v Speaker 5>you shadow them. So you know, these are tools like

397
00:21:22.920 --> 00:21:25.839
<v Speaker 5>even even I've seen sometimes people don't even know like

398
00:21:26.000 --> 00:21:28.240
<v Speaker 5>these are dashboards that you can refer to and which

399
00:21:28.240 --> 00:21:31.720
<v Speaker 5>will probably help you more so, uh, reaching out to

400
00:21:32.079 --> 00:21:35.920
<v Speaker 5>people, shadowing them definitely helps, especially whenever there's an

401
00:21:36.000 --> 00:21:38.640
<v Speaker 5>incident or an issue in a system which you have

402
00:21:38.880 --> 00:21:42.599
<v Speaker 5>not yet touched. So that is one. The other bit is

403
00:21:43.079 --> 00:21:45.559
<v Speaker 5>what we had done was we had done chaos monkey

404
00:21:45.680 --> 00:21:50.200
<v Speaker 5>a lot. So Chaos Monkey is, like, a concept I

405
00:21:50.279 --> 00:21:53.559
<v Speaker 5>think probably originated in Netflix engineering.

406
00:21:53.519 --> 00:21:57.920
<v Speaker 4>That uh you kind of have like game days, uh

407
00:21:58.519 --> 00:22:02.559
<v Speaker 4>where what you do is, like, certain infrastructure, you

408
00:22:02.680 --> 00:22:03.359
<v Speaker 4>just put it down.

409
00:22:03.880 --> 00:22:06.359
<v Speaker 5>Let's say you put it down like a replica of

410
00:22:06.680 --> 00:22:10.119
<v Speaker 5>let's say PostgreSQL, and see how your

411
00:22:10.160 --> 00:22:11.599
<v Speaker 5>engineering team is performing after that.

412
00:22:11.759 --> 00:22:13.240
<v Speaker 4>What's how much time they are just.

413
00:22:13.279 --> 00:22:17.160
<v Speaker 5>Taken to mitigate or at least at least mitigating what's

414
00:22:17.240 --> 00:22:20.359
<v Speaker 5>their MTTR, how they're fixing it, how they're communicating,

415
00:22:20.400 --> 00:22:22.759
<v Speaker 5>how they're collaborating, They're putting.

416
00:22:22.519 --> 00:22:25.319
<v Speaker 4>The right communication to the right stakeholders at the right time.

417
00:22:25.839 --> 00:22:31.319
<v Speaker 5>So those kind of events, those kinds of practices have helped,

418
00:22:31.440 --> 00:22:33.559
<v Speaker 5>especially when you have not done.

419
00:22:33.400 --> 00:22:36.960
<v Speaker 4>for a long time. So Chaos Monkey helps a

420
00:22:37.000 --> 00:22:43.119
<v Speaker 5>lot, especially prepping for large events, and especially having

421
00:22:43.279 --> 00:22:47.720
<v Speaker 5>like a proper collaboration sync with other other teams, because

422
00:22:47.759 --> 00:22:50.559
<v Speaker 5>that's also what is needed. You do it within your team,

423
00:22:50.920 --> 00:22:53.039
<v Speaker 5>but you also have to do it with let's say

424
00:22:53.519 --> 00:22:56.160
<v Speaker 5>your, maybe, front-end team or maybe your DevOps team.

425
00:22:56.240 --> 00:22:58.519
<v Speaker 5>You need to be in sync to mitigate the issue. So

426
00:23:00.359 --> 00:23:02.640
<v Speaker 5>there are techniques to do that. So that's like right

427
00:23:03.000 --> 00:23:06.680
<v Speaker 5>takes and right information also is like like the right

428
00:23:06.839 --> 00:23:10.640
<v Speaker 5>education is also done for the for whoever is coming on.

429
00:23:12.279 --> 00:23:16.079
<v Speaker 3>I often wonder how how much these things provide value.

430
00:23:16.160 --> 00:23:19.039
<v Speaker 3>Like where along the spectrum is the right time to

431
00:23:19.079 --> 00:23:21.839
<v Speaker 3>start implementing, say a game day where you're taking your

432
00:23:21.880 --> 00:23:24.599
<v Speaker 3>own stuff down, or the Simian Army to inject faults

433
00:23:24.599 --> 00:23:27.640
<v Speaker 3>into your architecture or infrastructure. Like I see a lot

434
00:23:27.680 --> 00:23:30.160
<v Speaker 3>of companies that I'm I could say, hey, you know what,

435
00:23:30.440 --> 00:23:34.279
<v Speaker 3>that's probably not the highest value day. Like they're like,

436
00:23:34.799 --> 00:23:37.160
<v Speaker 3>they have so many other problems that I think they

437
00:23:37.160 --> 00:23:41.119
<v Speaker 3>should tackle first before they're ready to do that. But

438
00:23:41.279 --> 00:23:44.359
<v Speaker 3>then on the opposite side, I'm thinking, wait, like if

439
00:23:44.400 --> 00:23:47.960
<v Speaker 3>they did this, they may actually identify critical problems within

440
00:23:48.000 --> 00:23:51.480
<v Speaker 3>their infrastructure that could cause them multi day downtimes or

441
00:23:51.599 --> 00:23:55.240
<v Speaker 3>multi week downtimes, which would have more catastrophic impacts

442
00:23:55.279 --> 00:23:58.440
<v Speaker 3>in the long term. Uh, Like, I don't know, is

443
00:23:58.480 --> 00:24:00.279
<v Speaker 3>that interesting? Do you have any thoughts on that?

444
00:24:01.240 --> 00:24:04.880
<v Speaker 5>And like like, of course, like your company at a

445
00:24:04.920 --> 00:24:07.240
<v Speaker 5>startup stage or initial stages.

446
00:24:06.839 --> 00:24:09.480
<v Speaker 4>Where they maybe don't have a lot of customers or.

447
00:24:09.799 --> 00:24:12.839
<v Speaker 5>That's uh, they don't, they won't be doing this even

448
00:24:12.920 --> 00:24:17.839
<v Speaker 5>if I'll say, I think once you have started having

449
00:24:17.960 --> 00:24:22.160
<v Speaker 5>multiple teams, multiple engineering teams with say different different powers,

450
00:24:22.240 --> 00:24:25.400
<v Speaker 5>kind of a system where sometimes the information is scattered

451
00:24:26.359 --> 00:24:29.279
<v Speaker 5>between teams and you don't know, you know, like when

452
00:24:29.319 --> 00:24:31.119
<v Speaker 5>a when a fire is there, you don't know who

453
00:24:31.200 --> 00:24:33.200
<v Speaker 5>to who to say, like who put that?

454
00:24:33.759 --> 00:24:35.200
<v Speaker 4>And that's the beginning of it.

455
00:24:35.599 --> 00:24:38.559
<v Speaker 5>And as slowly, I think, the teams start to mature,

456
00:24:38.640 --> 00:24:41.720
<v Speaker 5>and the mature I think, I think that's the right

457
00:24:41.799 --> 00:24:45.039
<v Speaker 5>time to sort of sort of start these processes.

458
00:24:45.400 --> 00:24:48.400
<v Speaker 1>Yeah. I think maturity is really the key word there

459
00:24:48.720 --> 00:24:52.039
<v Speaker 1>because it takes you know, you have to have multiple

460
00:24:52.119 --> 00:24:54.279
<v Speaker 1>layers of maturity there. You have to have a product

461
00:24:54.319 --> 00:24:59.359
<v Speaker 1>that's mature enough to be tested, but you also also

462
00:24:59.440 --> 00:25:02.799
<v Speaker 1>have to have maturity in your leadership team where they

463
00:25:03.000 --> 00:25:06.319
<v Speaker 1>recognize and understand the value of saying, hey, we're not

464
00:25:06.440 --> 00:25:10.440
<v Speaker 1>shipping new features this week, We're not shipping shiny new buttons.

465
00:25:10.480 --> 00:25:12.519
<v Speaker 1>We're actually going to take the time and effort to

466
00:25:12.559 --> 00:25:14.359
<v Speaker 1>see what it takes to break our system.

467
00:25:14.720 --> 00:25:18.839
<v Speaker 5>Yeah, I think probably, But a company having a launch,

468
00:25:19.440 --> 00:25:22.200
<v Speaker 5>launching new things or launching a new product, and maybe

469
00:25:22.240 --> 00:25:25.720
<v Speaker 5>a week, so I think people do dogfooding. They

470
00:25:25.759 --> 00:25:28.960
<v Speaker 5>can add this, maybe incident response, as a

471
00:25:29.039 --> 00:25:31.839
<v Speaker 5>part of it, so that you know how your team

472
00:25:31.839 --> 00:25:34.400
<v Speaker 5>would be reacting on day one days ago one in two.

473
00:25:34.960 --> 00:25:40.720
<v Speaker 5>So I think I think it like generally sometimes even

474
00:25:40.799 --> 00:25:45.000
<v Speaker 5>the managers or the management kind of start realizing,

475
00:25:45.400 --> 00:25:47.119
<v Speaker 5>now we are spending a lot of time on these

476
00:25:47.279 --> 00:25:51.319
<v Speaker 5>incidents itself, like our delivery for other important stuff is

477
00:25:51.400 --> 00:25:53.519
<v Speaker 5>also getting impacted, so now we.

478
00:25:53.519 --> 00:25:57.519
<v Speaker 4>Should find time to set some process time for.

479
00:25:57.599 --> 00:26:00.480
<v Speaker 5>This or that, like we get you know, these incident

480
00:26:00.759 --> 00:26:04.200
<v Speaker 5>it is so we can have a longer time for

481
00:26:04.279 --> 00:26:05.160
<v Speaker 5>our whole features.

482
00:26:05.400 --> 00:26:08.720
<v Speaker 4>So it's always, you know about finding that right balance.

483
00:26:09.279 --> 00:26:12.200
<v Speaker 5>Even engineering managers or I believe have a tough time

484
00:26:12.240 --> 00:26:15.079
<v Speaker 5>to sort of justify spending a

485
00:26:15.119 --> 00:26:16.680
<v Speaker 5>lot of time for these kind of things.

486
00:26:16.799 --> 00:26:17.920
<v Speaker 4>Like it's always I think.

487
00:26:17.799 --> 00:26:22.000
<v Speaker 5>I think that's probably the conundrum they are always in,

488
00:26:22.559 --> 00:26:24.440
<v Speaker 5>you know, which which part to spend time?

489
00:26:24.839 --> 00:26:30.000
<v Speaker 3>So uh, they have to take you I'm like stifling

490
00:26:30.160 --> 00:26:32.160
<v Speaker 3>maybe laughing here because I feel like I have so

491
00:26:32.240 --> 00:26:36.079
<v Speaker 3>many previous traumatic experiences of some sort of on call event.

492
00:26:36.480 --> 00:26:37.759
<v Speaker 3>You know that's on one side the other and the

493
00:26:37.799 --> 00:26:39.400
<v Speaker 3>other side. You said, it's like, oh, well, you know,

494
00:26:39.519 --> 00:26:43.119
<v Speaker 3>the product manager needs to prioritize the factor, but like

495
00:26:43.319 --> 00:26:45.440
<v Speaker 3>I want to hire that PM that actually is like

496
00:26:45.519 --> 00:26:48.279
<v Speaker 3>you know what, our incidents are impacting

497
00:26:48.400 --> 00:26:52.039
<v Speaker 3>our you know, future profitability, so we should actually take

498
00:26:52.039 --> 00:26:53.079
<v Speaker 3>a look at improving ourselves.

499
00:26:53.079 --> 00:26:54.599
<v Speaker 2>Like I've never heard that. I've never heard.

500
00:26:54.440 --> 00:26:57.079
<v Speaker 3>Anything like anyone on that side say that, you know,

501
00:26:57.400 --> 00:26:59.400
<v Speaker 3>like it's always the other way, like, oh, we don't

502
00:26:59.400 --> 00:27:00.839
<v Speaker 3>need to worry about that, it's done, right?

503
00:27:00.880 --> 00:27:02.400
<v Speaker 2>We didn't we finish that coverage to push it.

504
00:27:02.559 --> 00:27:02.839
<v Speaker 4>We don't.

505
00:27:02.920 --> 00:27:06.119
<v Speaker 2>We don't need to improve it anymore, and think like

506
00:27:06.519 --> 00:27:06.839
<v Speaker 2>I think the.

507
00:27:06.839 --> 00:27:10.079
<v Speaker 5>First part is always about you know, having that right

508
00:27:10.200 --> 00:27:13.799
<v Speaker 5>report or having some sort of information so that you

509
00:27:13.880 --> 00:27:16.400
<v Speaker 5>could add like maybe you know if these are the

510
00:27:16.799 --> 00:27:20.079
<v Speaker 5>these are incidents, there are recreatable incidents, these are the

511
00:27:20.400 --> 00:27:23.559
<v Speaker 5>probably if you have some sort of a business impact

512
00:27:23.599 --> 00:27:25.759
<v Speaker 5>to it, we show them the numbers and say, like,

513
00:27:26.240 --> 00:27:27.839
<v Speaker 5>this is an impact and if you.

514
00:27:27.920 --> 00:27:30.000
<v Speaker 4>Want to sort of reduce.

515
00:27:29.720 --> 00:27:32.240
<v Speaker 5>That numbers of business impact, then we need this to

516
00:27:32.960 --> 00:27:35.079
<v Speaker 5>I like, I think, I just think it's always a

517
00:27:35.079 --> 00:27:37.920
<v Speaker 5>hard time to justify spending time on the incidents.

518
00:27:38.039 --> 00:27:40.240
<v Speaker 4>But if you have that data, that data would be

519
00:27:40.720 --> 00:27:41.279
<v Speaker 4>any use.

520
00:27:41.960 --> 00:27:44.279
<v Speaker 3>I mean, this is where, like, DORA is super successful,

521
00:27:44.319 --> 00:27:47.000
<v Speaker 3>where we come in with mean time to resolution and change

522
00:27:47.039 --> 00:27:49.640
<v Speaker 3>failure rate and so falling back on those statistics can

523
00:27:50.000 --> 00:27:53.279
<v Speaker 3>be really helpful in the conversation to convince people that

524
00:27:53.400 --> 00:27:56.640
<v Speaker 3>these aren't industry standard, that we have every single pull

525
00:27:56.720 --> 00:27:59.240
<v Speaker 3>request we push out results in a bug in production.
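
NOTE
A small sketch of the two statistics named here, mean time to resolution
and change failure rate, in Python. The function names and input shapes are
assumptions for illustration only; they are not taken from DORA tooling or
from anything the speakers describe. Both functions assume non-empty input.
```python
from datetime import datetime, timedelta
def mean_time_to_resolution(incidents: list[tuple[datetime, datetime]]) -> timedelta:
    """Average of (resolved - opened) over (opened, resolved) pairs."""
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)
def change_failure_rate(deploys: int, failed_deploys: int) -> float:
    """Fraction of deployments that led to a failure in production."""
    return failed_deploys / deploys
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 30)),
    (datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 13, 30)),
]
print(mean_time_to_resolution(incidents))  # 1:00:00
print(change_failure_rate(20, 3))
```
Numbers like these are the kind of data the conversation suggests bringing
to a prioritization discussion; how an incident or a failed change is
defined is left to the team.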

526
00:28:01.759 --> 00:28:06.640
<v Speaker 4>That's right, right right.

527
00:28:07.920 --> 00:28:14.039
<v Speaker 1>I would imagine that most companies' onboarding experience to incident

528
00:28:14.119 --> 00:28:18.880
<v Speaker 1>response is a result of hitting a breaking point where

529
00:28:18.880 --> 00:28:21.799
<v Speaker 1>they've had just outage after outage after outage, and finally

530
00:28:21.839 --> 00:28:25.480
<v Speaker 1>they're like, Okay, we have to do something different, which

531
00:28:25.519 --> 00:28:29.599
<v Speaker 1>is probably what leads them to you would that be

532
00:28:29.880 --> 00:28:30.720
<v Speaker 1>a fair statement.

533
00:28:31.920 --> 00:28:35.400
<v Speaker 5>Yeah, yeah, I think like one is what you say,

534
00:28:35.559 --> 00:28:40.119
<v Speaker 5>like having outages outages, and the other part is even

535
00:28:40.200 --> 00:28:43.960
<v Speaker 5>if let's say, if they want to sort of streamline

536
00:28:44.039 --> 00:28:47.519
<v Speaker 5>some process, usually they see like maybe on-call is

537
00:28:47.559 --> 00:28:51.039
<v Speaker 5>confused what to do, or maybe the on-call

538
00:28:51.160 --> 00:28:55.160
<v Speaker 5>is need to react, or the manager doesn't know what's happening,

539
00:28:55.400 --> 00:28:56.920
<v Speaker 5>or someone someone doesn't even know how.

540
00:28:56.839 --> 00:29:00.160
<v Speaker 4>to report an incident, for example. So there are different

541
00:29:00.039 --> 00:29:04.640
<v Speaker 5>Different aspects to it. I obviously like like the entire

542
00:29:04.720 --> 00:29:08.960
<v Speaker 5>incident response is in two parts: one is, you know,

543
00:29:09.079 --> 00:29:12.079
<v Speaker 5>the trigger or you know, how are you creating the incident?

544
00:29:12.240 --> 00:29:13.880
<v Speaker 5>Like what's the trigger for that? And then how are

545
00:29:13.880 --> 00:29:17.519
<v Speaker 5>you responding to that, which is like debugging, communicating, and then.

546
00:29:17.480 --> 00:29:18.480
<v Speaker 4>the postmortem.

547
00:29:19.039 --> 00:29:22.039
<v Speaker 5>So so that's where we kind of try to come in,

548
00:29:22.160 --> 00:29:24.920
<v Speaker 5>like, sort of, you can streamline the entire pipeline

549
00:29:24.960 --> 00:29:27.799
<v Speaker 5>of it, like make it as quick as possible, make

550
00:29:27.880 --> 00:29:34.119
<v Speaker 5>it visible across maybe stakeholders, maybe support across engineering teams

551
00:29:34.160 --> 00:29:36.640
<v Speaker 5>and having the postmortem analysis processes in place.

552
00:29:37.160 --> 00:29:40.680
<v Speaker 5>So it like I think, like like we come in

553
00:29:41.039 --> 00:29:45.960
<v Speaker 5>when people, when teams recognize too many repeated incidents or

554
00:29:46.000 --> 00:29:47.000
<v Speaker 5>too many of these.

555
00:29:46.880 --> 00:29:51.279
<v Speaker 4>Stuff, and whoever is the on call is kind of feeling.

556
00:29:51.079 --> 00:29:55.200
<v Speaker 5>very confused about the state of things. So that's where

557
00:29:55.480 --> 00:29:57.720
<v Speaker 5>we have seen a lot of competition onto this.

558
00:29:58.319 --> 00:30:00.640
<v Speaker 1>Yeah, you mentioned the stakeholders there, and I think

559
00:30:00.640 --> 00:30:04.160
<v Speaker 1>that's a really cool thing to dive into for a minute,

560
00:30:04.200 --> 00:30:09.440
<v Speaker 1>because communication is one of the key things of incident response,

561
00:30:09.480 --> 00:30:12.039
<v Speaker 1>and it's the one I always hated the most early

562
00:30:12.119 --> 00:30:14.960
<v Speaker 1>on in my career because I would be in an

563
00:30:15.079 --> 00:30:18.839
<v Speaker 1>incident and then everyone wants to know what's going on. Well,

564
00:30:18.920 --> 00:30:21.680
<v Speaker 1>I'm working on it, damn it. But I can't work

565
00:30:21.759 --> 00:30:23.880
<v Speaker 1>on it if you're sitting here hounding me with questions.

566
00:30:23.920 --> 00:30:25.720
<v Speaker 1>And so I think a key part of a solid

567
00:30:26.519 --> 00:30:31.000
<v Speaker 1>incident response plan is having a communication plan so that

568
00:30:31.839 --> 00:30:34.720
<v Speaker 1>you can relay that information out and free up the

569
00:30:34.799 --> 00:30:38.240
<v Speaker 1>people actually working on the incident to continue working on it.

570
00:30:38.359 --> 00:30:40.559
<v Speaker 1>How do you recommend addressing that?

571
00:30:41.240 --> 00:30:43.799
<v Speaker 5>Like I say, like an on call is a person

572
00:30:43.880 --> 00:30:48.440
<v Speaker 5>who's always on fire who has to you know, mitigate

573
00:30:48.519 --> 00:30:49.000
<v Speaker 5>the issue.

574
00:30:49.039 --> 00:30:54.079
<v Speaker 4>I think that's the namone everyone Connie, but because of the.

575
00:30:54.559 --> 00:30:57.720
<v Speaker 5>Environment, he needs to do a lot of things also

576
00:30:58.319 --> 00:31:01.759
<v Speaker 5>communicated to them support also, you know what message of

577
00:31:02.160 --> 00:31:06.920
<v Speaker 5>what's the estimated time of resolution, or communicate to the manager, hey,

578
00:31:06.960 --> 00:31:09.839
<v Speaker 5>this is probably the impact these many users, this much

579
00:31:09.960 --> 00:31:15.359
<v Speaker 5>subscriptions are being impacted right now. So so I that's

580
00:31:15.400 --> 00:31:18.519
<v Speaker 5>a major pain point the on call person has, Uh,

581
00:31:19.000 --> 00:31:21.160
<v Speaker 5>What's the way? The best way is to, you know,

582
00:31:21.920 --> 00:31:26.039
<v Speaker 5>delegate a lot of these stuff or maybe have a

583
00:31:26.519 --> 00:31:29.279
<v Speaker 5>have a system which is you know, like which is

584
00:31:29.359 --> 00:31:32.319
<v Speaker 5>visible to the stakeholders so that they don't ping.

585
00:31:32.279 --> 00:31:33.960
<v Speaker 4>The on call or they don't kind of you know,

586
00:31:34.160 --> 00:31:35.039
<v Speaker 4>ask them again and again.

587
00:31:35.200 --> 00:31:38.559
<v Speaker 5>One of the ways we do it via Pagerly

588
00:31:38.839 --> 00:31:42.000
<v Speaker 5>is we live with Slack as a major part of

589
00:31:42.079 --> 00:31:46.400
<v Speaker 5>the incident response. So let's say we create a Slack

590
00:31:46.480 --> 00:31:50.839
<v Speaker 5>channel for each incident and in the Slack channel, you

591
00:31:50.920 --> 00:31:55.119
<v Speaker 5>can see, you know, what's the ETA, uh, what's

592
00:31:54.960 --> 00:31:55.839
<v Speaker 4>The business impact?

593
00:31:56.240 --> 00:31:58.920
<v Speaker 5>Or maybe some bit of information is like something some

594
00:31:59.039 --> 00:32:01.640
<v Speaker 5>bit of it is added by the on-call, but

595
00:32:01.960 --> 00:32:04.160
<v Speaker 5>nobody is, like, asking the on-call again and

596
00:32:04.200 --> 00:32:07.200
<v Speaker 4>again, you know, what's the ETA? What's the impact? Let's

597
00:32:07.200 --> 00:32:09.000
<v Speaker 4>say aug much wants to see they can go to

598
00:32:09.039 --> 00:32:09.839
<v Speaker 4>the channel and see it.

599
00:32:10.279 --> 00:32:14.599
<v Speaker 5>Customer support can see it. So, like, whatever,

600
00:32:14.960 --> 00:32:17.200
<v Speaker 5>Let's say if someone wants to send an email, no

601
00:32:17.240 --> 00:32:21.279
<v Speaker 5>one likely they can just send all that information to emails.

602
00:32:21.519 --> 00:32:26.279
<v Speaker 5>What if the stakeholders that the company has so whatever

603
00:32:27.359 --> 00:32:31.160
<v Speaker 5>kind of, you know, actions the on-call has to

604
00:32:31.240 --> 00:32:35.880
<v Speaker 5>do apart from mitigation is an additional effort. And whatever

605
00:32:36.079 --> 00:32:38.519
<v Speaker 5>tools and resources they can utilize to sort of you know,

606
00:32:38.599 --> 00:32:42.200
<v Speaker 5>delegate and automate would be much more helpful. Uh, and

607
00:32:42.799 --> 00:32:46.200
<v Speaker 5>so that they so that his major sort of brain

608
00:32:46.279 --> 00:32:49.759
<v Speaker 5>focuses always on mitigating the issue as quickly as possible.

609
00:32:50.440 --> 00:32:52.480
<v Speaker 3>Yeah, I mean, I think having those additional things in

610
00:32:52.599 --> 00:32:55.480
<v Speaker 3>place once you identify them to help streamline the process

611
00:32:55.519 --> 00:32:59.759
<v Speaker 3>are super important. Like we've got uh status dashboards that

612
00:32:59.880 --> 00:33:02.240
<v Speaker 3>we can point customers to immediately, so rather than trying

613
00:33:02.319 --> 00:33:05.200
<v Speaker 3>to explain where the updates are going to be or

614
00:33:05.240 --> 00:33:06.599
<v Speaker 3>how they're going to happen, and you just go to

615
00:33:06.680 --> 00:33:08.839
<v Speaker 3>the URL and stuff is there. But I mean, I

616
00:33:08.839 --> 00:33:12.359
<v Speaker 3>think also as customers of SaaS solutions, we have like

617
00:33:12.559 --> 00:33:16.839
<v Speaker 3>an opportunity to even be nicer to companies that are

618
00:33:16.880 --> 00:33:19.559
<v Speaker 3>having incidents. I mean, I think there's an emoji dedicated

619
00:33:19.559 --> 00:33:22.359
<v Speaker 3>to this, HugOps, right? You know, when something's happening,

620
00:33:22.519 --> 00:33:25.720
<v Speaker 3>you know, pass pass on the empathy a little bit.

621
00:33:25.839 --> 00:33:28.200
<v Speaker 3>Like I care way more about as a customer that

622
00:33:28.319 --> 00:33:30.680
<v Speaker 3>you tell me that you know that there's a problem

623
00:33:30.759 --> 00:33:33.240
<v Speaker 3>that someone's looking at rather than being like, oh, we

624
00:33:33.359 --> 00:33:35.039
<v Speaker 3>don't know, I don't I don't know what's going on,

625
00:33:35.359 --> 00:33:37.319
<v Speaker 3>or you know, even someone's looking at it, Like I

626
00:33:37.440 --> 00:33:39.480
<v Speaker 3>much prefer to be told oh, yeah, like we'll have

627
00:33:39.599 --> 00:33:42.519
<v Speaker 3>an update in an hour, than, oh, this is exactly

628
00:33:42.519 --> 00:33:43.200
<v Speaker 3>what's happening at.

629
00:33:43.160 --> 00:33:44.400
<v Speaker 2>This moment, Like I don't care about that.

630
00:33:44.519 --> 00:33:46.559
<v Speaker 3>I want to know, you know, when's the next update

631
00:33:46.640 --> 00:33:50.319
<v Speaker 3>going to be happening, more so than the play-by-play.

632
00:33:50.839 --> 00:33:54.640
<v Speaker 5>Right, like, there are always two types

633
00:33:54.680 --> 00:33:56.920
<v Speaker 5>of communication, one internally and externally.

634
00:33:57.160 --> 00:33:58.480
<v Speaker 4>Both has to be I think that.

635
00:34:00.000 --> 00:34:02.640
<v Speaker 5>Suddenly more because you have the state like you have

636
00:34:02.759 --> 00:34:07.160
<v Speaker 5>the ultimate stakeholders, but, like, both need to be

637
00:34:07.559 --> 00:34:09.119
<v Speaker 5>you know, always updated.

638
00:34:09.239 --> 00:34:11.400
<v Speaker 4>Both needs to be you know, always to the point

639
00:34:11.800 --> 00:34:13.679
<v Speaker 4>so that like because.

640
00:34:13.480 --> 00:34:16.920
<v Speaker 5>in any of these conditions, if there's any miscommunication happening, then

641
00:34:16.960 --> 00:34:20.360
<v Speaker 5>it will, you know, just prolong the incident much longer.

642
00:34:21.000 --> 00:34:27.800
<v Speaker 1>So it reminds me of the AWS status pages early

643
00:34:27.960 --> 00:34:31.880
<v Speaker 1>in the days of AWS, like was always green, Like

644
00:34:32.360 --> 00:34:35.039
<v Speaker 1>I would I would have put money that it was

645
00:34:35.280 --> 00:34:37.400
<v Speaker 1>just a green icon there and there were no other

646
00:34:37.519 --> 00:34:40.119
<v Speaker 1>options available because it was always green.

647
00:34:41.239 --> 00:34:44.079
<v Speaker 5>Right, I remember at that time, I think like, uh,

648
00:34:45.039 --> 00:34:48.320
<v Speaker 5>I usually didn't sort of have a lot of confidence

649
00:34:48.400 --> 00:34:51.480
<v Speaker 5>on that, I like down these some other or even

650
00:34:51.639 --> 00:34:54.440
<v Speaker 5>Twitter was much more sort of a better way to

651
00:34:54.559 --> 00:34:58.440
<v Speaker 5>know there's, like, actually a major outage, and those status

652
00:34:58.519 --> 00:34:59.920
<v Speaker 5>pages were like not at all.

653
00:35:01.800 --> 00:35:03.639
<v Speaker 4>I think, I think things have changed.

654
00:35:03.679 --> 00:35:07.239
<v Speaker 1>But you bring up Twitter though, and that's a really

655
00:35:07.320 --> 00:35:09.480
<v Speaker 1>good point. I mean, I think for a lot of

656
00:35:09.920 --> 00:35:15.280
<v Speaker 1>tech oriented companies that's a primary communication channel, you know,

657
00:35:15.519 --> 00:35:21.280
<v Speaker 1>sending out notifications on Twitter or x and relaying information

658
00:35:21.360 --> 00:35:25.639
<v Speaker 1>that way. And also like it's kind of sad to say,

659
00:35:25.679 --> 00:35:29.360
<v Speaker 1>but that's also a good notification method of whenever your

660
00:35:29.400 --> 00:35:31.599
<v Speaker 1>customers think something's going wrong.

661
00:35:32.320 --> 00:35:37.800
<v Speaker 3>Was I mean enough enough that I saw some products

662
00:35:37.840 --> 00:35:40.400
<v Speaker 3>that specifically like we go around to social media and

663
00:35:40.519 --> 00:35:44.599
<v Speaker 3>get the real-time status from potential users complaining

664
00:35:44.639 --> 00:35:47.800
<v Speaker 3>because it's another source that you're not tapping into to

665
00:35:47.880 --> 00:35:50.199
<v Speaker 3>actually let you know if customers have a problem, you know,

666
00:35:50.280 --> 00:35:52.000
<v Speaker 3>they're not necessarily reporting it back to you.

667
00:35:52.199 --> 00:35:54.360
<v Speaker 2>This is the report mechanism, right.

668
00:35:55.159 --> 00:35:57.800
<v Speaker 4>I think these two kind of work.

669
00:35:59.239 --> 00:35:59.800
<v Speaker 2>That's a good point.

670
00:36:00.400 --> 00:36:03.400
<v Speaker 5>Yeah, and I think the companies, I think, have started

671
00:36:03.440 --> 00:36:06.639
<v Speaker 5>to put RSS feeds also, like for a longer time,

672
00:36:07.199 --> 00:36:11.519
<v Speaker 5>and they have integrations with those feeds to their Twitter

673
00:36:11.679 --> 00:36:16.960
<v Speaker 5>accounts or maybe some of their community Discord if they're

674
00:36:17.000 --> 00:36:18.920
<v Speaker 5>doing a SaaS kind of a product or something like that,

675
00:36:19.079 --> 00:36:22.840
<v Speaker 5>so that their customers are also updated by these platforms.

676
00:36:23.800 --> 00:36:25.880
<v Speaker 3>I mean, accuracy, though, is what you're bringing up, Will.

677
00:36:25.920 --> 00:36:28.719
<v Speaker 3>And I feel like there's a huge challenge there realistically

678
00:36:28.880 --> 00:36:30.719
<v Speaker 3>to like what do you what like what makes sense

679
00:36:30.800 --> 00:36:34.360
<v Speaker 3>to even talk about and what should be intermittent hidden

680
00:36:34.440 --> 00:36:37.400
<v Speaker 3>failures from an internal company standpoint, Like I don't want

681
00:36:37.440 --> 00:36:40.400
<v Speaker 3>to see Amazon just being red all the time because

682
00:36:41.000 --> 00:36:44.599
<v Speaker 3>some node in CloudFront failed one request because the

683
00:36:44.760 --> 00:36:46.320
<v Speaker 3>connection didn't go through.

684
00:36:46.280 --> 00:36:47.519
<v Speaker 2>Like how does that help?

685
00:36:47.599 --> 00:36:49.880
<v Speaker 3>So I mean I feel like or yellow all the

686
00:36:49.960 --> 00:36:53.360
<v Speaker 3>time because there's always something that's possibly a problem. I

687
00:36:53.440 --> 00:36:56.239
<v Speaker 3>think a single color there is is always wrong.

688
00:36:57.119 --> 00:36:59.480
<v Speaker 4>Right, and and that's why I didn't.

689
00:37:00.000 --> 00:37:02.679
<v Speaker 5>I think if you see AWS, they have, over the

690
00:37:03.280 --> 00:37:06.679
<v Speaker 5>period of time they have evolved their status page earlier

691
00:37:07.039 --> 00:37:10.079
<v Speaker 5>like now, they have actually region wise. Also I think

692
00:37:10.119 --> 00:37:12.519
<v Speaker 5>they have also started to do for so for some

693
00:37:12.559 --> 00:37:15.480
<v Speaker 5>of the services, they have started to do more granular

694
00:37:15.559 --> 00:37:20.800
<v Speaker 5>scoping as much as possible, so like, uh, that's even

695
00:37:20.840 --> 00:37:23.719
<v Speaker 5>for Slack. Also, like earlier they used to do only

696
00:37:23.840 --> 00:37:27.400
<v Speaker 5>for messages, you know, if you have workspaces working fine.

697
00:37:27.480 --> 00:37:30.159
<v Speaker 5>They have now started to do for APIs. And you know,

698
00:37:30.519 --> 00:37:34.079
<v Speaker 5>like even logging has been different, so every bit of

699
00:37:34.199 --> 00:37:36.440
<v Speaker 5>different they have started to do so that like you

700
00:37:36.559 --> 00:37:39.239
<v Speaker 5>don't have like a yellow for maybe just a small

701
00:37:39.320 --> 00:37:41.559
<v Speaker 5>issue in or maybe just a small service in a

702
00:37:41.639 --> 00:37:44.920
<v Speaker 5>small region. So if that status page is more,

703
00:37:45.440 --> 00:37:48.000
<v Speaker 5>if that status page is more sort of detailed, then

704
00:37:48.159 --> 00:37:50.519
<v Speaker 5>I think it probably helps to sort of give the

705
00:37:50.599 --> 00:37:51.280
<v Speaker 5>right information.

706
00:37:52.960 --> 00:37:55.519
<v Speaker 3>I mean, I actually think AWS went a lot further here.

707
00:37:55.599 --> 00:37:58.760
<v Speaker 3>They have something called the Health Dashboard, which figures out

708
00:37:58.800 --> 00:38:00.480
<v Speaker 3>what services you actually use, I think, and how that

709
00:38:00.519 --> 00:38:03.320
<v Speaker 3>could impact you, and then actually has messages there,

710
00:38:03.360 --> 00:38:07.039
<v Speaker 3>which I mean is really what we all actually care about, right,

711
00:38:07.079 --> 00:38:09.360
<v Speaker 3>you know, is there something happening at this moment which

712
00:38:09.440 --> 00:38:13.480
<v Speaker 3>actually affects us that could be interesting realistically if we

713
00:38:13.559 --> 00:38:15.760
<v Speaker 3>saw a problem, does this explain it?

714
00:38:16.719 --> 00:38:18.159
<v Speaker 4>Right? Right? Absolutely?

715
00:38:18.679 --> 00:38:20.440
<v Speaker 1>So. One thing we haven't talked about a lot is

716
00:38:20.559 --> 00:38:24.880
<v Speaker 1>the post mortem, and I feel like that's all just

717
00:38:25.000 --> 00:38:29.880
<v Speaker 1>like that is as much work as doing the incident

718
00:38:30.519 --> 00:38:34.960
<v Speaker 1>response itself, but sometimes it gets overlooked because it's no

719
00:38:35.079 --> 00:38:37.400
<v Speaker 1>longer a priority. Like once the incident is no longer

720
00:38:37.480 --> 00:38:40.480
<v Speaker 1>an incident, you have to just be disciplined enough to

721
00:38:40.559 --> 00:38:43.519
<v Speaker 1>run through the post mortem process. How do you how

722
00:38:43.559 --> 00:38:44.239
<v Speaker 1>do you approach that?

723
00:38:44.599 --> 00:38:47.880
<v Speaker 4>I think the post mortem is like I'll say, like

724
00:38:48.000 --> 00:38:50.800
<v Speaker 4>a chain your top in terms of the look like

725
00:38:50.960 --> 00:38:51.239
<v Speaker 4>you have.

726
00:38:51.760 --> 00:38:53.840
<v Speaker 5>You know, you're doing us keep less right, You're going

727
00:38:53.920 --> 00:38:56.840
<v Speaker 5>to the incident and maybe fix it also, but now

728
00:38:57.280 --> 00:39:01.760
<v Speaker 5>you need it like a Disney we used to have

729
00:39:01.920 --> 00:39:04.599
<v Speaker 5>like a day Maximat's idea or even there's a need

730
00:39:04.760 --> 00:39:08.119
<v Speaker 5>to, you know, come up with that postmortem document because

731
00:39:08.199 --> 00:39:10.639
<v Speaker 5>they were kind of very bullish on that, like we

732
00:39:10.920 --> 00:39:14.320
<v Speaker 5>want to know what caused the issue, so that we can fix

733
00:39:14.360 --> 00:39:18.719
<v Speaker 5>it tomorrow itself. So that like, like I would say,

734
00:39:18.800 --> 00:39:22.199
<v Speaker 5>like that's where that's where the gap is, Like, that's

735
00:39:22.199 --> 00:39:24.880
<v Speaker 5>where a lot of people drop where they don't want

736
00:39:24.920 --> 00:39:29.800
<v Speaker 5>to do that work, especially after a grueling period

737
00:39:29.880 --> 00:39:32.400
<v Speaker 5>of you know, incident resolving process.

738
00:39:33.039 --> 00:39:36.599
<v Speaker 4>So but I think it's just about.

739
00:39:38.199 --> 00:39:40.800
<v Speaker 5>More of an education part or more of you know,

740
00:39:41.679 --> 00:39:47.199
<v Speaker 5>realizing what you have learned from your incident resolving partly,

741
00:39:47.880 --> 00:39:51.239
<v Speaker 5>you have probably a team that has resolved a lot

742
00:39:51.280 --> 00:39:54.039
<v Speaker 5>of incidents, but if they have not learned anything from them,

743
00:39:54.440 --> 00:39:58.760
<v Speaker 5>then it's pretty much wasteful, because tomorrow a similar or probably the

744
00:39:58.840 --> 00:40:01.199
<v Speaker 5>same incident would occur, probably a different team of.

745
00:40:01.400 --> 00:40:04.760
<v Speaker 4>You know, in your team itself. So like, I think

746
00:40:04.960 --> 00:40:08.840
<v Speaker 4>the value of the postmortem needs to be told pretty

747
00:40:09.000 --> 00:40:10.599
<v Speaker 4>you know, clearly, and it's a very clear proposition.

748
00:40:10.679 --> 00:40:13.039
<v Speaker 5>I always feel like if you tell the engineer if

749
00:40:13.039 --> 00:40:15.159
<v Speaker 5>you don't like, you know, hey, what we don't want

750
00:40:15.280 --> 00:40:18.239
<v Speaker 5>is to you know, you spending this much amount of

751
00:40:18.280 --> 00:40:21.239
<v Speaker 5>time again on a similar thing, you know, next week.

752
00:40:21.719 --> 00:40:25.400
<v Speaker 5>So that's where the postmortem can help. So I think

753
00:40:25.480 --> 00:40:28.400
<v Speaker 5>that value is pretty much it's I think it's it's important.

754
00:40:28.639 --> 00:40:31.159
<v Speaker 1>Yeah, And I think that's one of the really big

755
00:40:31.360 --> 00:40:35.000
<v Speaker 1>values of using an incident response tool is it it

756
00:40:35.119 --> 00:40:37.599
<v Speaker 1>will collect all of those data points and help you

757
00:40:38.400 --> 00:40:42.719
<v Speaker 1>more easily see that you're having this common failure.

758
00:40:43.840 --> 00:40:45.280
<v Speaker 2>Yeah, over and over again.

759
00:40:45.519 --> 00:40:47.519
<v Speaker 1>That otherwise, if you're just tracking this in like Google

760
00:40:47.599 --> 00:40:50.239
<v Speaker 1>Docs or whatever, you wouldn't actually see that correlation.

761
00:40:51.039 --> 00:40:54.280
<v Speaker 5>Yeah, I think, like I think it needs to start

762
00:40:54.360 --> 00:40:58.679
<v Speaker 5>with what information you are filling in. So like, like even visually,

763
00:40:58.719 --> 00:41:01.079
<v Speaker 5>what we kind of help with is to do the five whys.

764
00:41:01.239 --> 00:41:04.599
<v Speaker 5>You know, what what went well? You know what's first

765
00:41:04.599 --> 00:41:06.679
<v Speaker 5>of all, what happened, then what went well?

766
00:41:06.920 --> 00:41:07.920
<v Speaker 4>What can go with?

767
00:41:08.400 --> 00:41:11.880
<v Speaker 5>What we can do to, you know, remediate in future.

768
00:41:12.440 --> 00:41:16.679
<v Speaker 5>So, like, having that information in place is pretty much

769
00:41:16.760 --> 00:41:19.599
<v Speaker 5>the first step. So do you have the right way

770
00:41:19.679 --> 00:41:23.679
<v Speaker 5>to analyze stuff in the timeline part, so you know

771
00:41:23.760 --> 00:41:25.440
<v Speaker 5>if you have you know, if you want to do

772
00:41:25.559 --> 00:41:28.000
<v Speaker 5>the slack conversations or if you want to do you know,

773
00:41:28.760 --> 00:41:32.639
<v Speaker 5>want to see what happened from when the incident was

774
00:41:32.639 --> 00:41:36.719
<v Speaker 5>triggered till it was resolved. That timeline also helps you a lot,

775
00:41:37.119 --> 00:41:39.480
<v Speaker 5>so that you know where a lot of time was

776
00:41:39.519 --> 00:41:42.320
<v Speaker 5>being spent or if there's a miss there is like

777
00:41:42.440 --> 00:41:45.480
<v Speaker 5>a gap in the communication process. That is also that

778
00:41:45.639 --> 00:41:48.599
<v Speaker 5>is also kind of visible from them or you know,

779
00:41:48.719 --> 00:41:51.559
<v Speaker 5>what are the tickets or the action items that you

780
00:41:51.639 --> 00:41:54.480
<v Speaker 5>have created out of it. So there is, like, I'd

781
00:41:54.519 --> 00:41:59.119
<v Speaker 5>say, a lot of information in that postmortem document that can

782
00:41:59.159 --> 00:42:01.519
<v Speaker 4>Help you to you know, analyze a lot of things.

783
00:42:02.039 --> 00:42:04.960
<v Speaker 5>And most of the times we have seen it are

784
00:42:05.079 --> 00:42:10.280
<v Speaker 5>usually a communication you know, uh communication error that is happening. Generally,

785
00:42:10.400 --> 00:42:13.599
<v Speaker 5>let's say you didn't sort of you didn't tell the

786
00:42:13.679 --> 00:42:16.119
<v Speaker 5>team, you know, hey, the API version is deprecated, so you

787
00:42:16.199 --> 00:42:18.960
<v Speaker 5>need to update. Things like that are the most common issues.

788
00:42:19.400 --> 00:42:22.400
<v Speaker 5>So from you set up a process around that too.

789
00:42:22.880 --> 00:42:25.079
<v Speaker 5>You know, next time, if you upgrade a version,

790
00:42:25.559 --> 00:42:30.000
<v Speaker 5>that notification is sent to different teams. So, but every

791
00:42:30.119 --> 00:42:33.880
<v Speaker 5>bit of this is always and always, you know, you

792
00:42:34.079 --> 00:42:37.400
<v Speaker 5>get these results or these jwills only after you have.

793
00:42:37.519 --> 00:42:40.679
<v Speaker 4>that document which has that entire information in place.

794
00:42:41.280 --> 00:42:43.519
<v Speaker 5>So so and you can sort of you know, add

795
00:42:43.639 --> 00:42:46.679
<v Speaker 5>those action items also, maybe like a short-term action item,

796
00:42:46.719 --> 00:42:48.719
<v Speaker 5>a long term action item, and that.

797
00:42:49.039 --> 00:42:52.079
<v Speaker 4>That really helps. And the other part is to follow

798
00:42:52.119 --> 00:42:52.440
<v Speaker 4>this up.

799
00:42:52.800 --> 00:42:58.039
<v Speaker 5>So like you create these documents, you probably have meetings also,

800
00:42:58.599 --> 00:43:00.960
<v Speaker 5>but what after that, we need to sort of follow

801
00:43:01.079 --> 00:43:05.239
<v Speaker 5>those action items to the last brick until those tickets

802
00:43:05.239 --> 00:43:08.320
<v Speaker 5>are closed. You need to follow that up because otherwise

803
00:43:08.800 --> 00:43:14.119
<v Speaker 5>this entire process becomes useless. So following that part to

804
00:43:14.199 --> 00:43:15.880
<v Speaker 5>the very end is also pretty much important.

805
00:43:16.199 --> 00:43:19.079
<v Speaker 3>I heard a spicy take recently, and I want to

806
00:43:19.159 --> 00:43:24.280
<v Speaker 3>I want to lay this on you. Every incident could

807
00:43:24.320 --> 00:43:27.199
<v Speaker 3>have been prevented if you just had the right test.

808
00:43:30.360 --> 00:43:35.480
<v Speaker 5>Like I think every instance, as I'll say, like most

809
00:43:35.519 --> 00:43:38.039
<v Speaker 5>of the cases that we have seen is usually the

810
00:43:38.119 --> 00:43:41.920
<v Speaker 5>communication part, Like that's the most common thing that we

811
00:43:42.000 --> 00:43:46.559
<v Speaker 5>are also, uh like like I think like like I

812
00:43:46.719 --> 00:43:50.400
<v Speaker 5>remember a case, like, Terraform has a lot of issues. Everyone

813
00:43:51.280 --> 00:43:53.079
<v Speaker 5>kind of has a different story.

814
00:43:52.880 --> 00:43:55.239
<v Speaker 2>To it, but no argument here.

815
00:43:56.800 --> 00:43:59.280
<v Speaker 5>So we like one time what we saw is like

816
00:44:00.119 --> 00:44:03.639
<v Speaker 5>if we update the security groups via Terraform, what

817
00:44:04.039 --> 00:44:04.960
<v Speaker 5>AWS was

818
00:44:05.039 --> 00:44:08.039
<v Speaker 4>Doing was like it removes the security groups first. Like

819
00:44:08.159 --> 00:44:10.239
<v Speaker 4>let's say, if I want to add a security group, it.

820
00:44:10.320 --> 00:44:14.000
<v Speaker 5>Removes the security group first, the existing one, and then

821
00:44:14.039 --> 00:44:16.719
<v Speaker 5>it adds the list, even though I'm adding just

822
00:44:16.920 --> 00:44:20.960
<v Speaker 5>one extra. Now, what happened was, in the meantime, while

823
00:44:21.039 --> 00:44:23.880
<v Speaker 5>it's removing those security groups. So we was down, like

824
00:44:24.119 --> 00:44:26.599
<v Speaker 5>the service was down for let's say two three minutes.

825
00:44:27.079 --> 00:44:29.159
<v Speaker 4>Now, this is something that we kind of.

826
00:44:30.800 --> 00:44:34.280
<v Speaker 5>Like that happened to our two and that's a potentially

827
00:44:35.079 --> 00:44:38.199
<v Speaker 5>like you know, ticking bomb, which can actually you know,

828
00:44:38.679 --> 00:44:42.920
<v Speaker 5>happen any time across any teams. So even if

829
00:44:42.960 --> 00:44:46.000
<v Speaker 5>we just communicate like, hey, this is what we have seen,

830
00:44:46.159 --> 00:44:48.679
<v Speaker 5>this is what we had in the experience, and if

831
00:44:48.719 --> 00:44:51.920
<v Speaker 5>we just relay it to all the engineering teams, that

832
00:44:52.079 --> 00:44:56.079
<v Speaker 5>issue would not occur. So that's usually the case since

833
00:44:56.199 --> 00:44:59.599
<v Speaker 5>we have seen Like if even if the communication is proper,

834
00:45:00.840 --> 00:45:04.239
<v Speaker 5>i'd stay like excu s differicental times of incidents wander.

835
00:45:05.800 --> 00:45:08.079
<v Speaker 3>I still I still can't get over the fact that

836
00:45:08.199 --> 00:45:11.079
<v Speaker 3>Terraform does that by default. Like, it seems like something

837
00:45:11.159 --> 00:45:14.519
<v Speaker 3>that no one in their right mind would have designed

838
00:45:14.800 --> 00:45:18.119
<v Speaker 3>to have the default be first delete all of the

839
00:45:18.199 --> 00:45:21.920
<v Speaker 3>resources and then recreate them. I mean that just seems

840
00:45:21.960 --> 00:45:24.079
<v Speaker 3>like it just backwards to be like, isn't isn't the

841
00:45:24.159 --> 00:45:27.519
<v Speaker 3>common wisdom in in operations to okay, first we'll create

842
00:45:27.599 --> 00:45:29.719
<v Speaker 3>the new things, make sure that it works, and then

843
00:45:29.800 --> 00:45:32.239
<v Speaker 3>switch over to it. Why is the default delete?

844
00:45:32.320 --> 00:45:32.679
<v Speaker 4>I don't know.

845
00:45:32.840 --> 00:45:35.119
<v Speaker 2>I I maybe maybe I just need enough.

846
00:45:35.119 --> 00:45:37.639
<v Speaker 3>Coffee or something and someone and it will just magically

847
00:45:37.800 --> 00:45:38.760
<v Speaker 3>insight will come to me.

848
00:45:40.320 --> 00:45:42.239
<v Speaker 4>Oh, maybe that's an AWS thing. We

849
00:45:42.320 --> 00:45:45.840
<v Speaker 4>don't know. I don't remember, no, because.

850
00:45:45.639 --> 00:45:48.800
<v Speaker 3>Like CloudFormation and CDK and everything like that,

851
00:45:49.239 --> 00:45:51.159
<v Speaker 3>it's not, it's just the order in which the

852
00:45:51.320 --> 00:45:53.719
<v Speaker 3>SDK is being executed. There's no fundamental reason why it

853
00:45:53.800 --> 00:45:54.440
<v Speaker 3>has to be that way.

854
00:45:55.559 --> 00:45:57.880
<v Speaker 1>I want to come back to your your spicy take Warren.

855
00:45:59.679 --> 00:46:03.800
<v Speaker 1>Every outage could be eliminated or avoided with the right test.

856
00:46:03.960 --> 00:46:06.599
<v Speaker 1>I mean, I think in theory that's true, but the

857
00:46:07.199 --> 00:46:12.239
<v Speaker 1>like the practical steps of executing that make it not

858
00:46:13.400 --> 00:46:14.960
<v Speaker 1>the right answer for a lot of people. But I

859
00:46:15.000 --> 00:46:17.079
<v Speaker 1>think it does highlight something that I don't think I've

860
00:46:17.119 --> 00:46:21.199
<v Speaker 1>ever talked about in terms of incident response with anyone before,

861
00:46:21.599 --> 00:46:25.360
<v Speaker 1>and that's identifying what your risk tolerance is. Because for

862
00:46:25.440 --> 00:46:29.159
<v Speaker 1>a lot of companies, having some downtime is really not

863
00:46:29.360 --> 00:46:32.599
<v Speaker 1>a big deal. In other companies, it is a big deal.

864
00:46:32.719 --> 00:46:36.840
<v Speaker 1>Like I worked for a while in a medical company

865
00:46:37.320 --> 00:46:40.679
<v Speaker 1>where downtime for us meant that patients could potentially die,

866
00:46:40.920 --> 00:46:44.440
<v Speaker 1>So we were kind of risk averse there. But in

867
00:46:44.519 --> 00:46:46.880
<v Speaker 1>other places, you know, I worked for a company that

868
00:46:46.920 --> 00:46:48.760
<v Speaker 1>built a fitness app. You know, if we were down,

869
00:46:49.360 --> 00:46:51.400
<v Speaker 1>somebody had to figure out how to use the treadmill

870
00:46:51.480 --> 00:46:53.880
<v Speaker 1>on their own, I think they're going to be okay, But.

871
00:46:54.079 --> 00:46:55.159
<v Speaker 4>Like in those.

872
00:47:00.320 --> 00:47:02.719
<v Speaker 1>Yeah, but in those two extremes, you know, there's like

873
00:47:03.400 --> 00:47:05.719
<v Speaker 1>there's a different risk tolerance for how much downtime you're

874
00:47:05.760 --> 00:47:08.679
<v Speaker 1>willing to take. And I think that is probably something

875
00:47:08.760 --> 00:47:11.159
<v Speaker 1>that maybe needs to be talked about more by companies

876
00:47:11.239 --> 00:47:16.239
<v Speaker 1>when deciding how much downtime we want, Yeah, is down

877
00:47:16.280 --> 00:47:16.880
<v Speaker 1>every weekend?

878
00:47:19.320 --> 00:47:23.400
<v Speaker 5>Right, I think both I think even in the downtime

879
00:47:23.559 --> 00:47:25.559
<v Speaker 5>as well as I think it brings to a point

880
00:47:25.719 --> 00:47:31.320
<v Speaker 5>also about alert fatigue or on-call fatigue. Like, people

881
00:47:31.480 --> 00:47:35.519
<v Speaker 5>kind of have very lower thresholds for a lot of

882
00:47:35.599 --> 00:47:37.800
<v Speaker 5>things, and over time they realize, like, probably we

883
00:47:37.880 --> 00:47:41.760
<v Speaker 5>don't need a lot of lower thresholds, so uh, Like,

884
00:47:42.079 --> 00:47:45.880
<v Speaker 5>I think alert generally happens when people, when teams are

885
00:47:45.920 --> 00:47:48.719
<v Speaker 5>starting to have their on call process in place, they

886
00:47:48.800 --> 00:47:51.599
<v Speaker 5>put alerts on a bunch of things and over the

887
00:47:51.639 --> 00:47:54.599
<v Speaker 5>time for we don't need this alert or that is

888
00:47:54.679 --> 00:47:57.239
<v Speaker 5>probably what we or we can raise the thresholds, so

889
00:47:57.880 --> 00:48:01.880
<v Speaker 5>like like like click really be also provide these values

890
00:48:02.000 --> 00:48:05.679
<v Speaker 5>like you can innotate alerts to sort of analyze and

891
00:48:05.920 --> 00:48:08.800
<v Speaker 5>probably reduce some of the alerts that you don't even.

892
00:48:08.719 --> 00:48:10.840
<v Speaker 4>Need or you can probably increase trasuments.

893
00:48:11.320 --> 00:48:15.199
<v Speaker 5>So similarly for incidents also, you can define or

894
00:48:15.360 --> 00:48:17.760
<v Speaker 5>change your, you know, threshold

895
00:48:17.599 --> 00:48:19.480
<v Speaker 4>Values over the period of time.

896
00:48:20.000 --> 00:48:23.679
<v Speaker 5>Uh some some some companies can afford to have incident

897
00:48:24.360 --> 00:48:28.480
<v Speaker 5>response only during let's say business hours, they don't probably

898
00:48:28.760 --> 00:48:31.559
<v Speaker 5>they can afford to maybe don't do it during weekends

899
00:48:31.639 --> 00:48:34.079
<v Speaker 5>or night times. But some companies can't afford for even

900
00:48:34.079 --> 00:48:38.639
<v Speaker 5>a few minutes. So it absolutely depends completely on the company, type

901
00:48:38.639 --> 00:48:40.360
<v Speaker 5>of product or type of service detment.

902
00:48:41.199 --> 00:48:42.840
<v Speaker 3>Yeah, I mean, I think the same thing with like

903
00:48:42.880 --> 00:48:46.320
<v Speaker 3>the dependent on alerts from a security standpoint, which in

904
00:48:46.480 --> 00:48:47.280
<v Speaker 3>my domain we

905
00:48:47.400 --> 00:48:49.840
<v Speaker 2>Talk about a lot like how how much do you want?

906
00:48:49.960 --> 00:48:52.559
<v Speaker 2>Right like how much is important? How much is relevant

907
00:48:52.559 --> 00:48:52.719
<v Speaker 2>for you?

908
00:48:53.119 --> 00:48:56.039
<v Speaker 3>And maybe the, you know, PagerDuty and all,

909
00:48:56.400 --> 00:48:58.639
<v Speaker 3>you know, help you actually identify after the fact how

910
00:48:58.719 --> 00:49:01.920
<v Speaker 3>much you have. And then point the ROI is super

911
00:49:02.000 --> 00:49:05.440
<v Speaker 3>critical to actually evaluate because you know, trying to actually

912
00:49:05.800 --> 00:49:08.880
<v Speaker 3>sort of duplicate production in a way to actually test

913
00:49:08.920 --> 00:49:11.360
<v Speaker 3>to see what happens at that scale at that moment,

914
00:49:11.440 --> 00:49:15.440
<v Speaker 3>and there's no way with cloud providers to uh, well

915
00:49:15.679 --> 00:49:19.440
<v Speaker 3>practice what does capacity constrained look like? And then if

916
00:49:19.480 --> 00:49:21.800
<v Speaker 3>I mean you're capacity constrained because there isn't another bare

917
00:49:21.840 --> 00:49:25.639
<v Speaker 3>metal device available. There's no there's no alternative. Oh well,

918
00:49:25.679 --> 00:49:29.440
<v Speaker 3>you know, well we should be you know, multi cloud provider.

919
00:49:29.559 --> 00:49:30.639
<v Speaker 3>Like it is never the answer.

920
00:49:33.719 --> 00:49:37.159
<v Speaker 5>I mean you can have you know, backup das also,

921
00:49:37.280 --> 00:49:40.559
<v Speaker 5>you can have as much as possible, but like there's

922
00:49:40.599 --> 00:49:41.599
<v Speaker 5>no sort of answer.

923
00:49:41.880 --> 00:49:46.000
<v Speaker 3>Yeah, there's some things, like, actually pick what your

924
00:49:46.280 --> 00:49:49.000
<v Speaker 3>SLO is going to be, what your objective is going

925
00:49:49.039 --> 00:49:52.320
<v Speaker 3>to be for uptime or incidents, and then make sure

926
00:49:52.400 --> 00:49:55.440
<v Speaker 3>your strategy actually includes that and handles it, and then

927
00:49:55.519 --> 00:49:57.679
<v Speaker 3>measure it based off the number of incidents you get

928
00:49:57.760 --> 00:49:59.639
<v Speaker 3>rather than saying, oh, yeah, we should know when the

929
00:49:59.679 --> 00:50:02.840
<v Speaker 3>memory goes above ninety because then it's bad.

930
00:50:03.039 --> 00:50:03.559
<v Speaker 2>Apparently.

931
00:50:04.480 --> 00:50:09.599
<v Speaker 5>Yeah, I mean it always you know, gets updated, it

932
00:50:10.119 --> 00:50:13.400
<v Speaker 5>always gets you know, with the time it's probably from

933
00:50:13.440 --> 00:50:15.760
<v Speaker 5>your your port is growing, your customers are doing to

934
00:50:16.119 --> 00:50:18.360
<v Speaker 5>kind of get every passage.

935
00:50:17.960 --> 00:50:23.519
<v Speaker 1>What are your top tips for someone who is not

936
00:50:23.760 --> 00:50:29.199
<v Speaker 1>satisfied with their current incident response uh program or or software,

937
00:50:29.880 --> 00:50:30.400
<v Speaker 1>I would.

938
00:50:30.159 --> 00:50:33.480
<v Speaker 5>Say, like, like I think the entire fighting I think

939
00:50:33.679 --> 00:50:37.400
<v Speaker 5>like you can always see different parts to it. The

940
00:50:37.559 --> 00:50:41.119
<v Speaker 5>first part is the you know how easily your team

941
00:50:41.480 --> 00:50:44.639
<v Speaker 5>or anyone is able to report the incident. So, yes,

942
00:50:44.920 --> 00:50:50.800
<v Speaker 5>you have automated alerts on TV's or on from easy

943
00:50:50.880 --> 00:50:55.000
<v Speaker 5>tools and all those parts, but they'd say not everything

944
00:50:55.079 --> 00:50:55.880
<v Speaker 5>can be automated.

945
00:50:56.960 --> 00:50:59.639
<v Speaker 4>You need to have, you know, a correct way of identifying,

946
00:50:59.800 --> 00:51:01.360
<v Speaker 4>of you know, reporting issues.

947
00:51:01.400 --> 00:51:04.840
<v Speaker 5>So if you have customer support or if you have

948
00:51:05.000 --> 00:51:07.719
<v Speaker 5>a product team of let's say someone wants even if

949
00:51:07.840 --> 00:51:10.599
<v Speaker 5>even if let's say, if you have you know their

950
00:51:10.760 --> 00:51:13.960
<v Speaker 5>environments or three fraud environments you want to report issues there,

951
00:51:14.639 --> 00:51:17.280
<v Speaker 5>you need to have a good, good way of reporting

952
00:51:17.360 --> 00:51:20.920
<v Speaker 5>that UH and hope and have a process that the

953
00:51:21.119 --> 00:51:25.000
<v Speaker 5>correct uh the issue is reported to the correct team

954
00:51:25.159 --> 00:51:29.920
<v Speaker 5>as quickly as possible. So time to trigger the incident

955
00:51:30.079 --> 00:51:33.639
<v Speaker 5>or time to you know, you know you call that

956
00:51:33.840 --> 00:51:38.480
<v Speaker 5>on call should be as quickly as possible. So identify

957
00:51:38.760 --> 00:51:42.400
<v Speaker 5>the blockages in that there there can be blockage is

958
00:51:42.639 --> 00:51:45.039
<v Speaker 5>in, you know, in setting up this process also.

959
00:51:45.840 --> 00:51:48.599
<v Speaker 4>And the other part is let's say, once the incident

960
00:51:48.679 --> 00:51:52.679
<v Speaker 4>has been triggered or created, what to do? So if

961
00:51:52.719 --> 00:51:54.199
<v Speaker 4>you if if if you.

962
00:51:54.280 --> 00:51:56.559
<v Speaker 5>Feel like if the calls feel like a lot of

963
00:51:56.800 --> 00:52:01.800
<v Speaker 5>work outside mitigation, uh, if they like you know, if

964
00:52:02.119 --> 00:52:05.039
<v Speaker 5>every time you are having an incident, you if you're

965
00:52:05.119 --> 00:52:08.639
<v Speaker 5>just running around and probably adding you know, calling people

966
00:52:09.639 --> 00:52:13.079
<v Speaker 5>and just figuring out, you know, what's the stackers.

967
00:52:12.719 --> 00:52:16.239
<v Speaker 4>Or adding other team on calls all the time.

968
00:52:17.039 --> 00:52:21.440
<v Speaker 5>Figure out these kind of blockages in your processes and

969
00:52:22.239 --> 00:52:25.719
<v Speaker 5>try to streamline as as much as possible so that

970
00:52:25.960 --> 00:52:30.400
<v Speaker 5>like on call or whoever is other stakeholders, can you know,

971
00:52:30.559 --> 00:52:33.400
<v Speaker 5>focus on solving as much as possible if you want

972
00:52:33.440 --> 00:52:33.639
<v Speaker 5>to have.

973
00:52:33.800 --> 00:52:36.159
<v Speaker 4>Like ah, if that is still like.

974
00:52:36.320 --> 00:52:38.320
<v Speaker 5>Taking a lot of time, maybe set up like a

975
00:52:38.400 --> 00:52:42.400
<v Speaker 5>team of an eyework person who is actually handling all

976
00:52:42.480 --> 00:52:45.760
<v Speaker 5>the incidents and he's he or she's actually dispatching the

977
00:52:45.840 --> 00:52:47.239
<v Speaker 5>incidents to a correct team.

978
00:52:47.039 --> 00:52:50.000
<v Speaker 4>And doing like a supervision of the entire process.

979
00:52:50.679 --> 00:52:54.679
<v Speaker 5>And the last one is to see whether are you

980
00:52:54.840 --> 00:52:57.800
<v Speaker 5>doing the postmortems correctly, like uh, you know, are

981
00:52:57.840 --> 00:53:00.800
<v Speaker 5>you doing it all or not? Are you actually learning

982
00:53:00.920 --> 00:53:04.920
<v Speaker 5>the you know, uh learning from your incidents? Have your

983
00:53:05.079 --> 00:53:10.320
<v Speaker 5>repeated incidents reduced over time? I'd say,

984
00:53:10.400 --> 00:53:12.559
<v Speaker 5>like that's the most between a lot of a lot

985
00:53:12.639 --> 00:53:15.599
<v Speaker 5>of companies kind of focus on MTTR. I think that's

986
00:53:15.800 --> 00:53:18.679
<v Speaker 5>not probably the right metric. The right metric is to

987
00:53:18.840 --> 00:53:22.639
<v Speaker 5>see how many unique incidents you're getting. I think if

988
00:53:22.760 --> 00:53:25.920
<v Speaker 5>if if if that's if that is fine, that's fine.

989
00:53:25.960 --> 00:53:29.239
<v Speaker 5>But if you're getting the repeated incidents time over time,

990
00:53:29.840 --> 00:53:33.440
<v Speaker 5>something is probably off, then, in your incident response

991
00:53:33.440 --> 00:53:35.360
<v Speaker 5>process, like the postmortem process.

992
00:53:36.119 --> 00:53:38.480
<v Speaker 4>I think that that's interesting.

993
00:53:39.760 --> 00:53:40.440
<v Speaker 1>That's a good point.

994
00:53:41.719 --> 00:53:45.199
<v Speaker 3>I mean, I really wonder how many, like are people

995
00:53:45.559 --> 00:53:48.840
<v Speaker 3>hitting the same incident over and over again? Like I

996
00:53:49.280 --> 00:53:53.000
<v Speaker 3>my my guess would be probably not exactly, but maybe

997
00:53:53.119 --> 00:53:56.679
<v Speaker 3>correct categorization would really help spill it. Like, you know,

998
00:53:56.920 --> 00:53:59.159
<v Speaker 3>is it is it the same part of your framework,

999
00:53:59.239 --> 00:54:02.159
<v Speaker 3>codebase or component? You know, if you have

1000
00:54:02.400 --> 00:54:04.880
<v Speaker 3>even a monolith and not microservices, you still have

1001
00:54:04.960 --> 00:54:06.880
<v Speaker 3>broken out components. You can at least target it down

1002
00:54:06.920 --> 00:54:08.960
<v Speaker 3>to is it the same component that's causing the problem

1003
00:54:09.320 --> 00:54:11.559
<v Speaker 3>all the time? Uh, as far as a place to

1004
00:54:11.639 --> 00:54:14.159
<v Speaker 3>look and invest in rather than oh, you know, it's

1005
00:54:14.239 --> 00:54:16.480
<v Speaker 3>just something happening within our whole system.

1006
00:54:17.360 --> 00:54:20.440
<v Speaker 5>Right, Like I think, so maybe you know that the

1007
00:54:20.519 --> 00:54:23.239
<v Speaker 5>on-call have the runbook for when they were

1008
00:54:23.960 --> 00:54:27.079
<v Speaker 5>solving an incident? Did they have the right tools

1009
00:54:27.159 --> 00:54:31.360
<v Speaker 5>for you know, just seeing or for the service the

1010
00:54:31.400 --> 00:54:34.599
<v Speaker 5>particular service, it doesn't even have a dashboard, or doesn't

1011
00:54:34.639 --> 00:54:39.800
<v Speaker 5>have any playbook, uh, which can help the on-call. So there

1012
00:54:39.840 --> 00:54:42.960
<v Speaker 5>are like I think a lot of learnings, uh, which

1013
00:54:43.679 --> 00:54:47.440
<v Speaker 5>any org or any engineering team can do. Uh, and see,

1014
00:54:47.599 --> 00:54:50.400
<v Speaker 5>you know how much like I think at the end,

1015
00:54:50.480 --> 00:54:52.480
<v Speaker 5>it's all about how much we can help the

1016
00:54:52.639 --> 00:54:54.119
<v Speaker 5>on-calls as much as possible.

1017
00:54:54.639 --> 00:54:59.800
<v Speaker 1>So yeah, I mean, if we're seeing the same incident

1018
00:55:00.079 --> 00:55:02.000
<v Speaker 1>over and over again, we should at least be able

1019
00:55:02.079 --> 00:55:06.079
<v Speaker 1>to brag about our mean time to resolution decreasing because everybody

1020
00:55:06.079 --> 00:55:08.119
<v Speaker 1>knows instantly which service to restart.

1021
00:55:08.559 --> 00:55:10.239
<v Speaker 2>But maybe that's a good point though, right maybe that

1022
00:55:10.280 --> 00:55:11.280
<v Speaker 2>maybe that's the whole point.

1023
00:55:11.159 --> 00:55:14.239
<v Speaker 3>Right, like you you don't want to have that actually

1024
00:55:14.320 --> 00:55:17.440
<v Speaker 3>decreasing because right then then there's like, you know, it

1025
00:55:17.480 --> 00:55:19.800
<v Speaker 3>really points to a different problem. It's like if you

1026
00:55:19.880 --> 00:55:22.559
<v Speaker 3>have runbooks, it must be because you hit the

1027
00:55:22.599 --> 00:55:25.239
<v Speaker 3>same problem over and over again. And so rather than

1028
00:55:25.320 --> 00:55:27.480
<v Speaker 3>having the run book, it'd be better to eliminate where

1029
00:55:27.599 --> 00:55:29.519
<v Speaker 3>the source of the problem is coming from.

1030
00:55:30.800 --> 00:55:32.159
<v Speaker 4>And that's why I didn't.

1031
00:55:34.480 --> 00:55:37.320
<v Speaker 5>Like, I think opinion is always divided

1032
00:55:37.440 --> 00:55:40.679
<v Speaker 5>on, you know, how good the MTTR metric is, whether you

1033
00:55:41.639 --> 00:55:46.159
<v Speaker 5>actually want to, like, trust it. Like, because it's

1034
00:55:46.280 --> 00:55:49.840
<v Speaker 5>usually counter intuitive if you see, like as as you

1035
00:55:50.000 --> 00:55:53.119
<v Speaker 5>rightly said, let's say a company has

1036
00:55:54.039 --> 00:55:56.800
<v Speaker 5>resolved most of the incidents, like they have resolved them

1037
00:55:56.840 --> 00:56:00.760
<v Speaker 5>to the correct points in six months of time. In

1038
00:56:00.840 --> 00:56:04.159
<v Speaker 5>the seventh month, the engineering team

1039
00:56:04.239 --> 00:56:07.000
<v Speaker 5>doesn't have the same incidents or similar incidents, but they

1040
00:56:07.880 --> 00:56:10.159
<v Speaker 5>have only one new incident which kind of took a

1041
00:56:10.239 --> 00:56:12.840
<v Speaker 5>long time because that was like a unique case. So

1042
00:56:13.079 --> 00:56:17.239
<v Speaker 5>in that case, the MTTR is too big because a

1043
00:56:17.320 --> 00:56:21.840
<v Speaker 5>new incident came with a lower frequency, lower number of times,

1044
00:56:22.079 --> 00:56:25.000
<v Speaker 5>but it took a longer time. But can we say

1045
00:56:25.039 --> 00:56:29.079
<v Speaker 5>that that the engineering team had a you know, bad

1046
00:56:29.159 --> 00:56:32.400
<v Speaker 5>state of incident hygiene? No, because they had kind of

1047
00:56:32.519 --> 00:56:35.320
<v Speaker 5>resolved most of the incidents that have occurred in the

1048
00:56:35.360 --> 00:56:37.719
<v Speaker 5>past and those are not occurring now this is like

1049
00:56:37.800 --> 00:56:41.079
<v Speaker 5>a new one. So that's why I think, like, MTTR

1050
00:56:41.199 --> 00:56:45.920
<v Speaker 5>is not always the right metric to look at in terms

1051
00:56:45.960 --> 00:56:47.039
<v Speaker 5>of incident hygiene.
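
NOTE
Editor's sketch, not from the episode: the argument above can be illustrated with a few lines of Python. The Incident record, its fingerprint field, and the sample numbers are all hypothetical. The sketch only contrasts mean time to resolution with a count of repeated incidents, assuming each incident carries some label for its root cause or component.
```python
from collections import Counter
from dataclasses import dataclass
from statistics import mean
#
@dataclass(frozen=True)
class Incident:
    fingerprint: str  # hypothetical root-cause or component label
    minutes_to_resolve: float
#
def mttr(incidents):
    """Mean time to resolution, in minutes."""
    return mean(i.minutes_to_resolve for i in incidents)
#
def repeat_count(incidents):
    """Incidents whose fingerprint was already seen in the same window."""
    counts = Counter(i.fingerprint for i in incidents)
    return sum(n - 1 for n in counts.values())
#
# Made-up data: one month of quick restarts for a recurring cause,
# then one month with a single slow, never-before-seen incident.
month_6 = [Incident("cache-oom", 10.0) for _ in range(5)]
month_7 = [Incident("cert-expiry", 180.0)]
for name, month in (("month 6", month_6), ("month 7", month_7)):
    print(name, "MTTR:", mttr(month), "repeats:", repeat_count(month))
```
In this made-up data, the month with five quick repeats has the lower MTTR, while the month with one slow, novel incident has zero repeats, which is the counterintuitive case described in the conversation.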

1052
00:56:47.599 --> 00:56:49.920
<v Speaker 3>Yeah, I think the error budget, according to your SLO,

1053
00:56:50.159 --> 00:56:53.920
<v Speaker 3>is a much better one in this regard unfortunately. Yeah,

1054
00:56:53.960 --> 00:56:55.360
<v Speaker 3>but I'm I'm totally I mean, I think that all

1055
00:56:55.400 --> 00:56:57.199
<v Speaker 3>the DORA metrics sort of have that problem in a

1056
00:56:57.239 --> 00:57:00.760
<v Speaker 3>way if you measure them purely or just people in general,

1057
00:57:00.960 --> 00:57:03.599
<v Speaker 3>rather than how they're actually relevant. Like I I remember

1058
00:57:03.639 --> 00:57:05.679
<v Speaker 3>working with one company that they were measuring even in

1059
00:57:06.639 --> 00:57:09.719
<v Speaker 3>cycle time, but they were using feature flags and not

1060
00:57:09.920 --> 00:57:12.239
<v Speaker 3>including that in the cycle time. So I'm like, yeah,

1061
00:57:12.280 --> 00:57:14.239
<v Speaker 3>your code is going to production, but no one's using it,

1062
00:57:14.480 --> 00:57:15.760
<v Speaker 3>So what's the point.

1063
00:57:15.800 --> 00:57:16.440
<v Speaker 2>What's the point?

1064
00:57:16.880 --> 00:57:19.559
<v Speaker 3>I mean, yeah, I mean measure then also measure the

1065
00:57:19.639 --> 00:57:22.400
<v Speaker 3>cycle like the cycle time on feature flag removal. That's

1066
00:57:22.480 --> 00:57:25.000
<v Speaker 3>going to tell you a lot more about your success.

1067
00:57:25.199 --> 00:57:27.119
<v Speaker 5>Right, I think, like you know, we have seen a

1068
00:57:27.199 --> 00:57:30.239
<v Speaker 5>lot of tools on so we have seen a lot

1069
00:57:30.280 --> 00:57:33.199
<v Speaker 5>of tools maybe just you know how many commits you

1070
00:57:33.239 --> 00:57:36.760
<v Speaker 5>have pushed, So I think everything has to be you know,

1071
00:57:37.639 --> 00:57:40.679
<v Speaker 5>read with a lot of context, with a lot of corns.

1072
00:57:40.800 --> 00:57:44.840
<v Speaker 5>Is also not just because it can change based on

1073
00:57:44.880 --> 00:57:47.880
<v Speaker 5>a different kind of uh things that are happening.

1074
00:57:48.119 --> 00:57:52.280
<v Speaker 3>Yeah, I know we're getting close to the limit on

1075
00:57:52.360 --> 00:57:55.199
<v Speaker 3>the time that you've got with us today. Uh maybe

1076
00:57:55.280 --> 00:57:57.400
<v Speaker 3>there's some last words and then we can move over

1077
00:57:57.480 --> 00:57:59.880
<v Speaker 3>to picks. Anything you want to share.

1078
00:58:01.079 --> 00:58:03.800
<v Speaker 4>No, I think like this was pretty cool. Uh.

1079
00:58:04.559 --> 00:58:06.840
<v Speaker 5>Like, I think, like, with Pagerly also we

1080
00:58:07.000 --> 00:58:10.920
<v Speaker 5>have we have seen a lot of different and unique

1081
00:58:10.960 --> 00:58:14.480
<v Speaker 5>cases h and it's good, like this is something that

1082
00:58:14.599 --> 00:58:17.199
<v Speaker 5>which is very close to you know what we have

1083
00:58:17.400 --> 00:58:21.920
<v Speaker 5>seen, like, what we have felt, and, I mean, we'd

1084
00:58:22.000 --> 00:58:25.400
<v Speaker 5>like to help companies to sort of streamline this

1085
00:58:25.760 --> 00:58:27.679
<v Speaker 5>entire process as much as possible.

1086
00:58:28.360 --> 00:58:30.719
<v Speaker 3>It's got to be super interesting too to see how

1087
00:58:30.880 --> 00:58:36.199
<v Speaker 3>companies' incidents actually look. Oh, for sure. Yeah, I

1088
00:58:36.239 --> 00:58:38.760
<v Speaker 5>Think like what we have realized is like every company

1089
00:58:38.880 --> 00:58:42.360
<v Speaker 5>needs their kind of like every company has different processes,

1090
00:58:42.519 --> 00:58:44.719
<v Speaker 5>probably because of the state of their product, the state

1091
00:58:44.800 --> 00:58:49.360
<v Speaker 5>of all size. And what's what we have always ensured

1092
00:58:49.480 --> 00:58:52.239
<v Speaker 5>is like whatever your process or whatever you feel like

1093
00:58:52.360 --> 00:58:55.480
<v Speaker 5>is the most apt, we'll not force a tool on

1094
00:58:55.559 --> 00:58:58.639
<v Speaker 5>that; we'll adapt to your kind of processes. We'll just

1095
00:58:58.800 --> 00:59:02.840
<v Speaker 5>make it more automate in most because we know, like

1096
00:59:03.079 --> 00:59:08.239
<v Speaker 5>you have set up some big even your set and

1097
00:59:08.360 --> 00:59:10.880
<v Speaker 5>we know that you know that uh sort of some

1098
00:59:11.000 --> 00:59:14.559
<v Speaker 5>of the big spieces of it. So there's no one

1099
00:59:14.679 --> 00:59:18.440
<v Speaker 5>particular way of doing things, but whatever it may will.

1100
00:59:18.360 --> 00:59:25.480
<v Speaker 2>Well said, well said. So what do you think, Will,

1101
00:59:25.519 --> 00:59:27.239
<v Speaker 2>should we should we do the picks?

1102
00:59:27.639 --> 00:59:28.519
<v Speaker 1>Let's do some picks.

1103
00:59:29.000 --> 00:59:31.320
<v Speaker 2>Okay, I know you put me on the spot anyway,

1104
00:59:31.320 --> 00:59:32.119
<v Speaker 2>so I'll just go first.

1105
00:59:32.320 --> 00:59:36.840
<v Speaker 3>Uh, my pick for today's session is a

1106
00:59:36.920 --> 00:59:40.039
<v Speaker 3>book called Radical Focus by I think it's a Christina

1107
00:59:40.960 --> 00:59:44.719
<v Speaker 3>Wodtke. It's actually fantastic. It's a hypothetical story

1108
00:59:44.800 --> 00:59:49.119
<v Speaker 3>about how to actually, uh, set priorities using OKRs

1109
00:59:49.239 --> 00:59:52.719
<v Speaker 3>or KPIs or whatever, MBI, whatever you want to

1110
00:59:52.760 --> 00:59:55.280
<v Speaker 3>call them, honestly, and how not to do it and

1111
00:59:55.440 --> 00:59:58.559
<v Speaker 3>lessons learned from that. It's it's super relevant no matter

1112
00:59:58.639 --> 01:00:01.719
<v Speaker 3>what level you're at, realistically, like even at the team level,

1113
01:00:01.800 --> 01:00:03.960
<v Speaker 3>it's super interesting to think about, like how many priorities

1114
01:00:04.000 --> 01:00:06.039
<v Speaker 3>and what should our focus be on? How to think

1115
01:00:06.039 --> 01:00:08.320
<v Speaker 3>about that because I've seen so many teams, so many

1116
01:00:08.400 --> 01:00:11.159
<v Speaker 3>companies have like, oh, yeah, we have ten initiatives for

1117
01:00:11.239 --> 01:00:13.360
<v Speaker 3>this quarter, and I'm like, you can't. I bet your

1118
01:00:13.400 --> 01:00:16.840
<v Speaker 3>engineers couldn't even tell you five of them, Like it's

1119
01:00:16.920 --> 01:00:19.239
<v Speaker 3>just too many. And I think it's a great story

1120
01:00:19.239 --> 01:00:21.199
<v Speaker 3>about how to actually think about this and what's relevant.

1121
01:00:21.800 --> 01:00:22.719
<v Speaker 2>So highly recommended.

1122
01:00:23.360 --> 01:00:26.119
<v Speaker 1>Dude, your picks are always so relevant. I feel like

1123
01:00:26.199 --> 01:00:29.320
<v Speaker 1>mine are just like better, but yours are like, oh wow,

1124
01:00:29.400 --> 01:00:31.519
<v Speaker 1>that could actually work and be helpful.

1125
01:00:33.519 --> 01:00:36.760
<v Speaker 3>I mean, I don't know, maybe I'm being lazy

1126
01:00:36.840 --> 01:00:40.559
<v Speaker 3>by picking easy things. Well, you know, when I'm

1127
01:00:40.639 --> 01:00:42.880
<v Speaker 3>in year two as a host here, maybe I'll have run

1128
01:00:42.960 --> 01:00:43.199
<v Speaker 3>out of.

1129
01:00:43.239 --> 01:00:46.280
<v Speaker 2>Things and then I'll be onto the I don't know

1130
01:00:46.719 --> 01:00:48.559
<v Speaker 2>my weights that I've got in my other room that

1131
01:00:48.599 --> 01:00:49.079
<v Speaker 2>I'm using.

1132
01:00:54.360 --> 01:00:56.119
<v Speaker 1>All right, Fali, what you bring for us for a

1133
01:00:56.159 --> 01:00:56.679
<v Speaker 1>pick today?

1134
01:00:58.400 --> 01:01:01.719
<v Speaker 4>I that's a little thing is one. I think.

1135
01:01:02.719 --> 01:01:07.119
<v Speaker 5>One I've seen like a documentary recently which was how

1136
01:01:07.480 --> 01:01:12.719
<v Speaker 5>Toyota builds stuff, and a lot of things were very interesting.

1137
01:01:12.920 --> 01:01:15.840
<v Speaker 5>I'm forgetting what what they call it, but essentially what

1138
01:01:16.000 --> 01:01:20.480
<v Speaker 5>they they is. And the third is like known for

1139
01:01:20.639 --> 01:01:23.480
<v Speaker 5>it's like you know, building bug.

1140
01:01:23.320 --> 01:01:28.280
<v Speaker 4>Free products manufacturer. Yeah for sure, yeah. Yeah.

1141
01:01:28.760 --> 01:01:30.639
<v Speaker 5>And one of the one of the things that they

1142
01:01:31.599 --> 01:01:35.880
<v Speaker 5>have always is like no matter where their manufacturing units are,

1143
01:01:36.199 --> 01:01:40.559
<v Speaker 5>no matter where the what they if there's an issue,

1144
01:01:40.840 --> 01:01:44.960
<v Speaker 5>it gets reported to the topmost tier, like, immediately, like

1145
01:01:45.280 --> 01:01:49.159
<v Speaker 5>with with proper clarity and and and that's how I think,

1146
01:01:49.199 --> 01:01:52.920
<v Speaker 5>like communication becomes so much important and like that kind

1147
01:01:52.920 --> 01:01:54.119
<v Speaker 5>of solves a lot of things.

1148
01:01:54.599 --> 01:01:58.920
<v Speaker 4>So like it, I think, and it's just very fascinating

1149
01:01:59.000 --> 01:01:59.719
<v Speaker 4>how kind of.

1150
01:01:59.760 --> 01:02:03.719
<v Speaker 3>They actually they actually have these like cords on the

1151
01:02:03.840 --> 01:02:07.440
<v Speaker 3>manufacturing floor called andon lines that help. Yeah, they stop

1152
01:02:07.599 --> 01:02:10.679
<v Speaker 3>the whole manufacturing line at once. You know, hey, you know,

1153
01:02:11.000 --> 01:02:12.840
<v Speaker 3>no more pull requests at this moment. For the whole

1154
01:02:12.880 --> 01:02:15.639
<v Speaker 3>company because there's something critically wrong going on, Like could

1155
01:02:15.679 --> 01:02:16.239
<v Speaker 3>you imagine.

1156
01:02:17.119 --> 01:02:20.199
<v Speaker 4>Yeah, yeah, I think the andon line was, I

1157
01:02:20.360 --> 01:02:23.280
<v Speaker 4>was different to him. That's super super.

1158
01:02:23.840 --> 01:02:27.239
<v Speaker 5>I think that's such a simple kind of technique that

1159
01:02:27.519 --> 01:02:32.119
<v Speaker 5>any company can sort of have, Like I make such

1160
01:02:32.119 --> 01:02:33.800
<v Speaker 5>a simple process, but very...

1161
01:02:36.360 --> 01:02:40.480
<v Speaker 1>Yeah. One of my customers was a company that provided

1162
01:02:41.880 --> 01:02:46.679
<v Speaker 1>seats to Toyota, and it was wild because they would

1163
01:02:46.719 --> 01:02:49.719
<v Speaker 1>get orders like Okay, we need three hundred and seventeen

1164
01:02:50.000 --> 01:02:53.840
<v Speaker 1>seats that are beige delivered at ten twenty seven am.

1165
01:02:54.320 --> 01:02:56.920
<v Speaker 1>You know, like the level of specificity because they have

1166
01:02:57.039 --> 01:02:59.800
<v Speaker 1>that that just in time manufacturing, like we need these

1167
01:02:59.840 --> 01:03:03.199
<v Speaker 1>at ten twenty seven am. And and this was a

1168
01:03:03.280 --> 01:03:05.920
<v Speaker 1>smaller company, so like the level of pressure for them

1169
01:03:06.079 --> 01:03:09.679
<v Speaker 1>to meet those requirements was just through the roof.

1170
01:03:10.440 --> 01:03:11.920
<v Speaker 3>I mean it makes a lot of sense too if

1171
01:03:11.960 --> 01:03:15.000
<v Speaker 3>you think about it, because they see inventory storage as

1172
01:03:15.039 --> 01:03:17.800
<v Speaker 3>a waste, as a cost to them, and so they

1173
01:03:17.880 --> 01:03:20.599
<v Speaker 3>don't want to have it stored at You're the inventory

1174
01:03:20.719 --> 01:03:23.119
<v Speaker 3>for them. You know, they're they're going to the shelf

1175
01:03:23.199 --> 01:03:25.679
<v Speaker 3>and they're pulling it and you are that shelf for them.

1176
01:03:26.119 --> 01:03:27.199
<v Speaker 2>Yeah. No, it's awesome.

1177
01:03:27.840 --> 01:03:30.320
<v Speaker 1>Yeah, I feel like, you know, we were talking about

1178
01:03:30.360 --> 01:03:33.199
<v Speaker 1>this before we started recording, about how my picks are

1179
01:03:33.559 --> 01:03:36.039
<v Speaker 1>just kind of out there. I feel like this one

1180
01:03:36.159 --> 01:03:40.519
<v Speaker 1>is going to be, on, like, the crazy scale, this

1181
01:03:40.639 --> 01:03:47.639
<v Speaker 1>one's going to be hard to top. And yeah, I'll

1182
01:03:47.679 --> 01:03:49.079
<v Speaker 1>just get to it. So I read this book. I

1183
01:03:49.199 --> 01:03:51.559
<v Speaker 1>just finished it up a couple of days ago, called

1184
01:03:51.880 --> 01:03:55.519
<v Speaker 1>The Sacred Mushroom and the Cross by a guy named

1185
01:03:55.599 --> 01:04:00.320
<v Speaker 1>John Marco Allegro, and I would be tempted to call

1186
01:04:00.440 --> 01:04:04.079
<v Speaker 1>bullshit on the book right away, except for the fact

1187
01:04:04.159 --> 01:04:08.599
<v Speaker 1>that this guy spent fifteen years deciphering the Dead Sea Scrolls.

1188
01:04:09.480 --> 01:04:11.559
<v Speaker 1>And so if you're not familiar with the Dead Sea Scrolls,

1189
01:04:12.079 --> 01:04:15.960
<v Speaker 1>it's a set of scrolls that were found in Egypt,

1190
01:04:16.039 --> 01:04:19.480
<v Speaker 1>I believe in the nineteen forties that were thousands of

1191
01:04:19.599 --> 01:04:24.119
<v Speaker 1>years old, and they contained some parts of the New

1192
01:04:24.199 --> 01:04:27.440
<v Speaker 1>Testament Bible, but they also had other stories in there

1193
01:04:27.440 --> 01:04:30.480
<v Speaker 1>as well that weren't included in there, and so he

1194
01:04:30.639 --> 01:04:33.440
<v Speaker 1>deciphered them. But this book, The Sacred Mushroom and the Cross,

1195
01:04:35.920 --> 01:04:39.320
<v Speaker 1>he basically goes through this book showing or arguing that

1196
01:04:40.760 --> 01:04:44.880
<v Speaker 1>a lot of the stuff written in the Old Testament

1197
01:04:44.960 --> 01:04:48.320
<v Speaker 1>and the New Testament and some other religious books as

1198
01:04:48.400 --> 01:04:54.280
<v Speaker 1>well were not factually based, but they were actually like

1199
01:04:54.840 --> 01:04:59.800
<v Speaker 1>a play on words referencing psychedelic mushrooms, and that the

1200
01:05:00.079 --> 01:05:04.400
<v Speaker 1>whole religion is based on an ancient cult or ancient

1201
01:05:04.519 --> 01:05:11.199
<v Speaker 1>culture that worshiped psychedelic mushrooms. And it's a wild read. Man,

1202
01:05:11.440 --> 01:05:14.639
<v Speaker 1>It's very hard to read because of all the references

1203
01:05:14.719 --> 01:05:17.880
<v Speaker 1>he makes to like the Aramaic and the Semitic languages.

1204
01:05:18.440 --> 01:05:22.159
<v Speaker 1>But the big takeaway for me was, you know, you're

1205
01:05:22.159 --> 01:05:24.400
<v Speaker 1>reading through this and he's like, oh, so, well, they

1206
01:05:24.480 --> 01:05:28.719
<v Speaker 1>said this thing and that's actually the you know, the

1207
01:05:28.760 --> 01:05:33.480
<v Speaker 1>ancient Sumerian word for this psychedelic mushroom, and like everything

1208
01:05:33.559 --> 01:05:36.679
<v Speaker 1>points back to being the name of a psychedelic mushroom.

1209
01:05:36.679 --> 01:05:38.960
<v Speaker 1>And I was like, dude, how is it that we

1210
01:05:39.239 --> 01:05:42.880
<v Speaker 1>know so little about the ancient Sumerians, but you know

1211
01:05:43.079 --> 01:05:47.440
<v Speaker 1>the four hundred different words they had for psychedelic mushrooms.

1212
01:05:48.800 --> 01:05:50.920
<v Speaker 1>But then at the end of this book there's like

1213
01:05:50.960 --> 01:05:53.280
<v Speaker 1>a chapter. I can't figure out who wrote this last

1214
01:05:53.400 --> 01:05:58.400
<v Speaker 1>chapter because it wasn't John Allegro, it was someone else. But

1215
01:05:58.599 --> 01:06:03.840
<v Speaker 1>the guy was talking with his wife, she was from Russia,

1216
01:06:04.719 --> 01:06:07.360
<v Speaker 1>and they came across a field with a bunch of mushrooms,

1217
01:06:07.400 --> 01:06:09.960
<v Speaker 1>you know, and he's like he was an American and

1218
01:06:10.000 --> 01:06:12.159
<v Speaker 1>he's like, no, don't eat those, they're all poisonous and stuff.

1219
01:06:12.199 --> 01:06:13.840
<v Speaker 1>And she's like, no, this one is, this, this and this,

1220
01:06:14.000 --> 01:06:16.079
<v Speaker 1>and so it turns out in Russian they have like

1221
01:06:16.159 --> 01:06:19.760
<v Speaker 1>an endless number of words for mushroom, but in the

1222
01:06:20.000 --> 01:06:23.400
<v Speaker 1>US and in Western cultures, we have, you know, like

1223
01:06:23.840 --> 01:06:28.519
<v Speaker 1>three like toadstools, mushrooms and and whatever else.

1224
01:06:28.760 --> 01:06:31.199
<v Speaker 3>So it's a yeah, that's a European thing because people

1225
01:06:31.239 --> 01:06:33.880
<v Speaker 3>actually go out and pick mushrooms here, and so knowing

1226
01:06:34.000 --> 01:06:35.480
<v Speaker 3>which ones are poisonous the same thing.

1227
01:06:35.519 --> 01:06:37.559
<v Speaker 2>You know, once I've moved here, I learned all about that.

1228
01:06:37.679 --> 01:06:39.880
<v Speaker 3>But I think you've memed yourself, Will, because now I

1229
01:06:40.000 --> 01:06:43.599
<v Speaker 3>see instead of you know, just this, but instead of aliens,

1230
01:06:43.679 --> 01:06:45.599
<v Speaker 3>now it's just it's mushrooms.

1231
01:06:45.159 --> 01:06:54.800
<v Speaker 4>Right, aliens? The last one, well, no, it's the the

1232
01:06:54.880 --> 01:06:55.440
<v Speaker 4>guy from.

1233
01:06:55.519 --> 01:06:58.400
<v Speaker 1>The Ancient Aliens TV show with the big hair. If

1234
01:06:58.440 --> 01:06:59.480
<v Speaker 1>you've ever seen that meme.

1235
01:07:00.199 --> 01:07:02.440
<v Speaker 3>It was the History Channel, there was Ancient Aliens and yeah,

1236
01:07:02.480 --> 01:07:05.280
<v Speaker 3>it's like the pyramids are landing platforms for aliens, and

1237
01:07:05.519 --> 01:07:07.000
<v Speaker 3>you know, Will's here, you know, trying to

1238
01:07:07.000 --> 01:07:11.079
<v Speaker 2>Sell us on the fact that the religious cult is

1239
01:07:11.159 --> 01:07:12.000
<v Speaker 2>of mushrooms.

1240
01:07:12.320 --> 01:07:13.800
<v Speaker 1>I mean, yeah, good reading.

1241
01:07:14.239 --> 01:07:16.000
<v Speaker 2>I added it to my list, so thank you for that.

1242
01:07:17.159 --> 01:07:19.400
<v Speaker 1>Yeah, let me know when you when you get through it,

1243
01:07:19.440 --> 01:07:21.599
<v Speaker 1>I'd be interested to talk through that with you, because

1244
01:07:22.199 --> 01:07:24.440
<v Speaker 1>there's like some parts of which you're like, okay, I

1245
01:07:24.559 --> 01:07:26.400
<v Speaker 1>see how you can get to that conclusion, and there's

1246
01:07:26.480 --> 01:07:28.320
<v Speaker 1>other parts where you're like, come

1247
01:07:28.199 --> 01:07:36.199
<v Speaker 4>On, is it like pretty famous kind of book it

1248
01:07:36.320 --> 01:07:36.840
<v Speaker 4>was written?

1249
01:07:37.800 --> 01:07:41.599
<v Speaker 1>I think it's recently. It was recently. It was published

1250
01:07:41.599 --> 01:07:46.760
<v Speaker 1>in nineteen seventy, so it's an older book. And then

1251
01:07:46.880 --> 01:07:49.639
<v Speaker 1>he got a lot of hatred and supposedly it was

1252
01:07:49.800 --> 01:07:55.800
<v Speaker 1>very detrimental to his career. Go figure, who would have

1253
01:07:55.920 --> 01:07:59.960
<v Speaker 1>thought that, you know, claiming Jesus was a psychedelic mushroom

1254
01:08:00.039 --> 01:08:03.320
<v Speaker 1>would be detrimental to your career. But anyway, I think

1255
01:08:03.360 --> 01:08:06.480
<v Speaker 1>it's gained in popularity over the last couple of years

1256
01:08:06.599 --> 01:08:10.840
<v Speaker 1>just because of the shift in the things that we're seeing,

1257
01:08:11.519 --> 01:08:14.360
<v Speaker 1>at least here in the Western world, where people are

1258
01:08:16.159 --> 01:08:18.760
<v Speaker 1>kind of changing their opinion and approach to things like

1259
01:08:19.000 --> 01:08:22.680
<v Speaker 1>to natural medicines like psychedelic mushrooms, and you know, the

1260
01:08:22.800 --> 01:08:26.680
<v Speaker 1>legalization of pot, and now in Oregon and Colorado there's

1261
01:08:26.760 --> 01:08:34.319
<v Speaker 1>actually decriminalized centers for using mushrooms to treat like PTSD

1262
01:08:34.800 --> 01:08:36.600
<v Speaker 1>and memory issues and things like that.

1263
01:08:37.439 --> 01:08:39.920
<v Speaker 3>You're really close to Canada where that's been a huge

1264
01:08:40.039 --> 01:08:42.279
<v Speaker 3>topic in the last years.

1265
01:08:42.800 --> 01:08:43.039
<v Speaker 4>Yeah.

1266
01:08:43.840 --> 01:08:46.960
<v Speaker 1>Yeah, so I think that's been a key to the

1267
01:08:47.039 --> 01:08:53.600
<v Speaker 1>book gaining new popularity. Yeah all right, so there you go.

1268
01:08:53.840 --> 01:08:56.760
<v Speaker 1>So now the challenge is on next week? What am

1269
01:08:56.800 --> 01:08:58.560
<v Speaker 1>I going to come with a pick that tops Jesus

1270
01:08:58.600 --> 01:08:59.319
<v Speaker 1>as a mushroom?

1271
01:09:02.479 --> 01:09:05.920
<v Speaker 4>I actually saw one movie which is on my mind

1272
01:09:06.079 --> 01:09:08.279
<v Speaker 4>is Inside Out 2. I saw it, I think,

1273
01:09:08.279 --> 01:09:13.840
<v Speaker 5>Last night, and I think it, Uh, if someone wants

1274
01:09:13.880 --> 01:09:17.560
<v Speaker 5>to watch it, highly recommended. I think you can sort of

1275
01:09:17.600 --> 01:09:20.560
<v Speaker 5>feel a lot of emotions, uh for for.

1276
01:09:23.119 --> 01:09:25.199
<v Speaker 4>That's that is something which is like just all my

1277
01:09:25.319 --> 01:09:29.239
<v Speaker 4>mind and what was the name of that one? Inside

1278
01:09:29.279 --> 01:09:30.479
<v Speaker 4>Out the second part?

1279
01:09:31.600 --> 01:09:35.039
<v Speaker 1>Yeah, awesome, And with that done, I think we have

1280
01:09:35.199 --> 01:09:37.359
<v Speaker 1>an episode. Thanks for joining us, man. This has

1281
01:09:37.399 --> 01:09:39.760
<v Speaker 1>been a blast. Really appreciate having you on the show.

1282
01:09:40.359 --> 01:09:40.840
<v Speaker 4>Great, thank you.

1283
01:09:41.439 --> 01:09:45.600
<v Speaker 5>Yeah, I think I think, uh, it's been really great,

1284
01:09:46.399 --> 01:09:49.880
<v Speaker 5>had you know, wonderful time just to chat about incidents

1285
01:09:50.319 --> 01:09:53.560
<v Speaker 5>and a lot of other things and sharing each other's

1286
01:09:53.560 --> 01:09:56.239
<v Speaker 5>you know, personal experiences. I think this is something like

1287
01:09:56.920 --> 01:10:00.680
<v Speaker 5>every on-call and every org, or even every developer, has their

1288
01:10:00.760 --> 01:10:03.239
<v Speaker 5>own personal experience what they want to share.

1289
01:10:03.399 --> 01:10:05.000
<v Speaker 4>So it's been a really good.

1290
01:10:04.880 --> 01:10:09.119
<v Speaker 1>chat. Awesome, cool. Thank you again, and to all the listeners,

1291
01:10:09.199 --> 01:10:12.079
<v Speaker 1>thank you for listening. Appreciate y'all and be sure and

1292
01:10:12.279 --> 01:10:14.920
<v Speaker 1>hit us up if there's anything we can do for you,

1293
01:10:15.319 --> 01:10:16.439
<v Speaker 1>and we'll see y'all next week
