1
00:00:07,879 --> 00:00:11,240
Speaker 1: And welcome back to another episode of Adventures in DevOps.

2
00:00:11,359 --> 00:00:13,119
One of the things that I've talked with many of

3
00:00:13,119 --> 00:00:15,679
my colleagues about is just how it seems like there's

4
00:00:15,720 --> 00:00:19,480
a ramp up in the number of incidents, production or otherwise.

5
00:00:19,039 --> 00:00:19,920
Speaker 2: They've had to deal with.

6
00:00:20,280 --> 00:00:22,879
Speaker 1: So today I've brought in an expert in the industry,

7
00:00:23,039 --> 00:00:26,559
Lawrence Jones, founding engineer at incident.io, and before that,

8
00:00:26,640 --> 00:00:29,839
if I'm right, principal site reliability engineer at GoCardless,

9
00:00:30,120 --> 00:00:33,320
which probably explains a lot about how you end up

10
00:00:33,479 --> 00:00:35,399
building incident.io in the first place.

11
00:00:35,679 --> 00:00:38,640
Speaker 3: Yeah, no, that's absolutely it. Yeah, so I think we

12
00:00:38,679 --> 00:00:41,600
always, we always joke at Incident about just how many

13
00:00:41,600 --> 00:00:45,039
of our early founding team were sourced from fintech. What

14
00:00:45,079 --> 00:00:46,920
that might mean about the experience of working at a

15
00:00:46,960 --> 00:00:49,560
fintech company, but no, I think.

16
00:00:49,960 --> 00:00:50,200
Speaker 4: Yeah.

17
00:00:50,280 --> 00:00:53,840
Speaker 3: I joined Incident about four years ago. I was the

18
00:00:54,039 --> 00:00:57,280
first hire and joined alongside Pete, Stephen and Chris who

19
00:00:57,280 --> 00:00:59,359
are the founders. Between the five of us, we've seen

20
00:00:59,799 --> 00:01:03,520
a large variety of incidents, both big and small, in financial

21
00:01:03,560 --> 00:01:07,719
regulation and then also infrastructure and normal technical incidents, and

22
00:01:07,760 --> 00:01:09,920
that serves you very well, it turns out, when you're

23
00:01:09,920 --> 00:01:11,799
trying to build an incident response platform.

24
00:01:11,840 --> 00:01:14,319
Speaker 1: Do you think there's something specific about the financial industry

25
00:01:14,319 --> 00:01:18,359
that lends itself more to having incidents that cause bigger problems?

26
00:01:18,519 --> 00:01:22,560
Or is it everyone experiences incidents, no matter what vertical

27
00:01:22,719 --> 00:01:25,560
or area they're in, and it's just something that stuck

28
00:01:25,640 --> 00:01:27,799
for you as a problem that needed to be solved.

29
00:01:27,840 --> 00:01:32,519
Speaker 3: So I think in fintech you have an extremely pressing concern, right,

30
00:01:32,560 --> 00:01:36,280
which is the money, and managing finances is something that

31
00:01:36,359 --> 00:01:40,120
is both regulatorily governed, so you have laws that tell

32
00:01:40,200 --> 00:01:42,599
you how you should respond in certain situations, and you

33
00:01:42,599 --> 00:01:45,799
have obligations to certain regulatory bodies that you respond in

34
00:01:45,799 --> 00:01:48,680
that way, and that brings a level of rigor and

35
00:01:48,719 --> 00:01:50,799
discipline that you need for your incident response that you

36
00:01:50,840 --> 00:01:53,519
might not find in other companies. I remember the running

37
00:01:53,560 --> 00:01:57,040
joke with my SRE team at the time, many years

38
00:01:57,079 --> 00:01:59,400
ago now at GoCardless, was the next job that

39
00:01:59,439 --> 00:02:02,079
we would be doing is a social media network for cats,

40
00:02:02,200 --> 00:02:05,959
because they presumably wouldn't care quite as much whenever we

41
00:02:06,040 --> 00:02:08,000
had downtime. When you contrast that with, like, a

42
00:02:08,000 --> 00:02:11,240
payments gateway where every minute of downtime is literally all

43
00:02:11,240 --> 00:02:14,840
of your customers can't take payments from their customers. The

44
00:02:14,879 --> 00:02:17,479
type of stress that goes into those incidents, it can

45
00:02:17,879 --> 00:02:18,479
get quite big.

46
00:02:18,560 --> 00:02:19,280
Speaker 2: From experience.

47
00:02:19,360 --> 00:02:22,360
Speaker 1: The types of people that maybe on the cats and

48
00:02:22,400 --> 00:02:26,080
would report to your support, you know, could be much

49
00:02:26,080 --> 00:02:29,759
more angry, like, magnitudes angrier than your customers

50
00:02:29,759 --> 00:02:31,639
who aren't able to, you know, bill a couple of

51
00:02:31,639 --> 00:02:32,319
their customers.

52
00:02:32,439 --> 00:02:34,599
Speaker 3: I mean, given that I ended up going straight from

53
00:02:34,719 --> 00:02:37,000
GoCardless into incident.io, and now I'm on

54
00:02:37,080 --> 00:02:39,800
the pager for our on call system, which is just

55
00:02:39,879 --> 00:02:43,759
as, if not more, critically availability-sensitive than a payments gateway,

56
00:02:43,759 --> 00:02:46,520
I think maybe the cat social media network can be

57
00:02:46,520 --> 00:02:47,159
my next venture.

58
00:02:47,240 --> 00:02:49,919
Speaker 1: Yeah, a couple more years, and this is you'll do

59
00:02:49,960 --> 00:02:52,400
that venture and hopefully retire from there, more of a

60
00:02:52,400 --> 00:02:53,479
hobby project, right.

61
00:02:53,520 --> 00:02:56,560
Speaker 3: I think fintech is just one of those areas where

62
00:02:56,599 --> 00:03:00,039
not just like incidents in another environment, and

63
00:03:00,159 --> 00:03:02,759
maybe the website has gone down and after it's resolved

64
00:03:02,800 --> 00:03:05,560
and come back up, the incident is broadly okay, but

65
00:03:05,599 --> 00:03:09,120
there's a huge amount of background and other work that

66
00:03:09,159 --> 00:03:11,759
comes out of an incident that happens in fintech, and

67
00:03:11,800 --> 00:03:13,840
I think that requires you to build up a lot

68
00:03:13,879 --> 00:03:15,599
more in terms of your incident process, so that you

69
00:03:15,639 --> 00:03:18,759
can track work even after the initial impact is over,

70
00:03:19,280 --> 00:03:21,159
after you've stopped the bleeding. Yeah, you need to go

71
00:03:21,240 --> 00:03:23,280
chase all your customers, you need to go inform people.

72
00:03:23,319 --> 00:03:25,280
You need to do all of this, and the penalties

73
00:03:25,319 --> 00:03:26,879
if you don't do it are quite harsh. So you

74
00:03:27,000 --> 00:03:29,639
end up building good muscles around how to run incidents

75
00:03:29,680 --> 00:03:30,240
as a result.

76
00:03:30,360 --> 00:03:31,560
Speaker 2: So I have to ask this: incident.io,

77
00:03:31,639 --> 00:03:34,840
Speaker 1: like, you're not focused on only financial customers though, right,

78
00:03:35,039 --> 00:03:38,280
I assume there's no branding around that, but maybe I'm no. No.

79
00:03:38,360 --> 00:03:41,479
Speaker 3: So incident is a generic incident response platform, so you

80
00:03:41,479 --> 00:03:45,000
can think of us as, so our, like, most

81
00:03:45,080 --> 00:03:46,840
probably recognizable customers are people.

82
00:03:46,719 --> 00:03:48,800
Speaker 4: like Netflix, Etsy, Skyscanner.

83
00:03:49,039 --> 00:03:52,360
Speaker 3: They use our tool so that they get paged when

84
00:03:52,439 --> 00:03:54,639
something goes wrong, and then when something goes wrong, they

85
00:03:54,719 --> 00:03:57,360
end up running their incident through our system. So the

86
00:03:57,400 --> 00:04:00,599
whole value of incident is that we're allowing our customers

87
00:04:00,639 --> 00:04:03,960
to encode their incident response process into the tool so

88
00:04:04,039 --> 00:04:06,159
that we can help them run the process without skipping

89
00:04:06,199 --> 00:04:08,919
a beat, basically. And then, at least most recently, over

90
00:04:08,960 --> 00:04:11,039
the last year and a bit, we've been looking at

91
00:04:11,080 --> 00:04:14,039
how we can use AI to help our customers actually

92
00:04:14,360 --> 00:04:17,399
solve the incident for them, or debugging what's actually gone on.

93
00:04:17,560 --> 00:04:20,560
When you receive an alert that says, hey, you've got

94
00:04:20,639 --> 00:04:23,600
some network failure out over here, but we can appropriately

95
00:04:23,759 --> 00:04:26,399
explore all of the different parts of your infrastructure and go, hey,

96
00:04:26,519 --> 00:04:28,879
it's not actually this service that's going wrong. Maybe the

97
00:04:28,959 --> 00:04:31,720
database over here is out of resource due to a

98
00:04:31,800 --> 00:04:33,879
bad query plan, and that's the reason that you're seeing

99
00:04:33,959 --> 00:04:36,360
requests go wrong over here. That's the journey that we've taken.

100
00:04:36,439 --> 00:04:39,519
And it's not just that there are financial companies using us,

101
00:04:39,519 --> 00:04:42,079
so we do have a lot. It's mostly people who

102
00:04:42,439 --> 00:04:45,360
care a lot about what happens when an incident happens,

103
00:04:45,399 --> 00:04:47,439
so making sure that they cut their downtime as

104
00:04:47,519 --> 00:04:50,560
much as possible, and people who have like concerns such

105
00:04:50,560 --> 00:04:53,240
as we need to be really openly transparent with our

106
00:04:53,279 --> 00:04:55,600
customers and make sure that we communicate with them promptly

107
00:04:55,680 --> 00:04:57,560
and run the incident in a way that they would

108
00:04:57,759 --> 00:05:00,800
enjoy if they, or maybe enjoy is the wrong word, a way

109
00:05:00,839 --> 00:05:03,920
that they would appreciate if they were themselves their customers.

110
00:05:04,000 --> 00:05:07,000
Speaker 1: The reason I asked about the customer distribution is because

111
00:05:07,040 --> 00:05:09,800
I'm curious whether or not you see patterns by industry,

112
00:05:10,160 --> 00:05:13,399
or whether or not, say, handling incidents is something that

113
00:05:13,480 --> 00:05:17,639
becomes more undifferentiated work, sort of like doing software development

114
00:05:17,680 --> 00:05:20,600
at companies, or whether or not the amount of regulations

115
00:05:20,720 --> 00:05:23,120
or the complexity of the vertical segment, or even the

116
00:05:23,279 --> 00:05:25,160
types of customers that a business has to deal with

117
00:05:25,600 --> 00:05:27,319
changes the way in which they handle incidents.

118
00:05:27,480 --> 00:05:29,959
Speaker 3: Yeah, so I think the answer is both yes and no,

119
00:05:30,079 --> 00:05:33,360
perhaps unsurprisingly. I think as you become more mature as

120
00:05:33,360 --> 00:05:35,920
a software engineer, you learn how to engage with an

121
00:05:35,959 --> 00:05:40,120
incident response process more effectively. But each organization has very

122
00:05:40,199 --> 00:05:43,120
different requirements or needs when it comes to incidents. If

123
00:05:43,120 --> 00:05:46,480
you work in fintech or a similar area, then you're

124
00:05:46,519 --> 00:05:49,120
often engaging with your legal team to try and figure

125
00:05:49,120 --> 00:05:50,879
out what your obligations are if you ever have a

126
00:05:50,959 --> 00:05:53,920
financial breach or something like that. But equally, as you

127
00:05:54,000 --> 00:05:58,120
start scaling into like really really large large enterprises, so

128
00:05:58,240 --> 00:06:00,519
people who have hundreds of thousands of customers, some of

129
00:06:00,600 --> 00:06:03,720
which themselves are large enterprises, and they have SLAs with

130
00:06:03,759 --> 00:06:06,639
them that are really stringent, and those customers care a

131
00:06:06,680 --> 00:06:08,800
lot about them. At that point, you're engaging with different

132
00:06:08,839 --> 00:06:11,000
parts of the business again, so maybe your GTM team

133
00:06:11,040 --> 00:06:13,560
and your customer success team, and just like in fintech,

134
00:06:13,600 --> 00:06:15,720
you end up with financial penalties if things go wrong.

135
00:06:15,759 --> 00:06:18,199
So it depends on your context as a company, but

136
00:06:18,360 --> 00:06:21,319
definitely as you scale upwards, you get much more serious

137
00:06:21,360 --> 00:06:23,879
in terms of how you approach these problems, and even

138
00:06:23,920 --> 00:06:27,360
across industries you see kind of broad patterns play out, right?

139
00:06:27,519 --> 00:06:29,920
So I think everyone, as an example, I think, starts

140
00:06:29,959 --> 00:06:34,160
off with a status page where just transparently talking about

141
00:06:34,519 --> 00:06:36,639
every incident as soon as it happens on a status

142
00:06:36,680 --> 00:06:39,800
page is really appreciated by your customers and something good

143
00:06:39,839 --> 00:06:42,959
to do. Gradually, as you broaden your customer base and

144
00:06:43,000 --> 00:06:46,000
hopefully as you build resilience into the system, every incident

145
00:06:46,120 --> 00:06:48,560
does not affect every customer and that can complicate your

146
00:06:48,600 --> 00:06:51,040
relationship if you're publishing everything, so you look for more

147
00:06:51,040 --> 00:06:53,839
sophistication in how you're running your response. That is equally

148
00:06:54,000 --> 00:06:55,920
the same type of challenge that you have when you

149
00:06:56,079 --> 00:06:58,240
have a regulatory breach and you need to inform a

150
00:06:58,319 --> 00:06:59,879
certain subset of customers.

151
00:07:00,120 --> 00:07:04,120
Speaker 1: A challenge to find the balanced area of transparency and

152
00:07:04,240 --> 00:07:07,279
also solving the problem effectively. I know you're not specifically

153
00:07:07,319 --> 00:07:11,319
in this space, but for us, the challenge is, obviously, if

154
00:07:11,360 --> 00:07:13,360
there is a problem with one of our services, and

155
00:07:13,360 --> 00:07:14,959
we're, I don't want to say, we're in the security space.

156
00:07:15,000 --> 00:07:18,319
We're providing login and access control. So if there is

157
00:07:18,360 --> 00:07:22,040
a problem, then making sure that we're rolling out solutions

158
00:07:22,120 --> 00:07:25,000
or fixes or communication to just the customers that are

159
00:07:25,000 --> 00:07:28,199
trustworthy is more important than just rolling it across, you know,

160
00:07:28,720 --> 00:07:32,879
exposing that problem and then potentially having malicious customers of ours,

161
00:07:32,959 --> 00:07:36,000
you know, attempting to do something with that malicious data.

162
00:07:36,040 --> 00:07:38,040
And I know you probably don't have that particular scenario,

163
00:07:38,079 --> 00:07:40,680
but it's an easy way of seeing that. You want

164
00:07:40,720 --> 00:07:43,199
to focus the communication to the audience that makes the

165
00:07:43,240 --> 00:07:45,720
most sense, and at scale, it just doesn't make sense

166
00:07:45,759 --> 00:07:47,240
to send the same message to everyone.

167
00:07:47,399 --> 00:07:49,240
Speaker 3: Well, if you just want to inform people as soon

168
00:07:49,319 --> 00:07:52,279
as you possibly can, if you're not considerate with who

169
00:07:52,360 --> 00:07:55,560
you're informing, as your customer list grows much much larger,

170
00:07:55,759 --> 00:07:58,759
informing everyone about an incident where only ten percent of

171
00:07:58,800 --> 00:08:01,720
the people you're informing may be impacted, arguably you can cause

172
00:08:01,879 --> 00:08:04,759
far more harm than good, like literally looking at like

173
00:08:04,879 --> 00:08:07,879
the social calculus of worrying ten times as many people

174
00:08:07,879 --> 00:08:09,879
as you need about something that's not relevant to them

175
00:08:10,000 --> 00:08:13,040
is also not good. So yeah, it gets

176
00:08:13,040 --> 00:08:14,639
a lot more complicated as you scale up.

177
00:08:14,959 --> 00:08:17,439
Speaker 2: As you say, are you in AWS, GCP, Azure?

178
00:08:17,720 --> 00:08:19,839
Speaker 3: So we run across a couple of different providers, but

179
00:08:19,879 --> 00:08:21,160
primarily inside of GCP.

180
00:08:21,480 --> 00:08:23,319
Speaker 1: So I actually don't have a lot of experience

181
00:08:23,360 --> 00:08:27,360
with the operational support of running technology in GCP. But

182
00:08:27,439 --> 00:08:29,560
in AWS we get a lot of emails saying, like,

183
00:08:29,720 --> 00:08:32,080
there is a required action to be executed, and you

184
00:08:32,159 --> 00:08:34,600
go into an email and the requirement is, oh, we're

185
00:08:34,679 --> 00:08:37,960
sunsetting this product which you have never used in

186
00:08:38,039 --> 00:08:41,159
any account in your entire organization, and so it definitely

187
00:08:41,200 --> 00:08:44,360
has, like, a negative impact on the brand even, where

188
00:08:44,559 --> 00:08:47,120
you just send out these communications which aren't valuable and

189
00:08:47,279 --> 00:08:51,120
they waste cycles on having someone have to even filter them,

190
00:08:51,200 --> 00:08:53,080
but read the email, understand that it has nothing to

191
00:08:53,159 --> 00:08:55,120
do with them before even jumping into it.

192
00:08:55,279 --> 00:08:57,320
Speaker 3: They can be hit and miss on this. I mean, famously,

193
00:08:57,440 --> 00:09:00,559
Google love deprecating things. You can argue whether or not that's good

194
00:09:00,639 --> 00:09:02,759
or bad. But one thing that they do do in

195
00:09:02,840 --> 00:09:05,759
their communication is, they're usually quite good at going through

196
00:09:05,799 --> 00:09:08,639
all the different services that you're using and they'll only

197
00:09:08,679 --> 00:09:11,360
notify you if they're pretty confident that they've seen activity

198
00:09:11,480 --> 00:09:13,039
on the API that they think that they're going to

199
00:09:13,080 --> 00:09:16,399
be changing. So often, actually, my experience with GCP is

200
00:09:16,440 --> 00:09:19,519
it's usually very actionable. They can go, if I'm telling

201
00:09:19,559 --> 00:09:21,279
you about this, it's because I've seen that you used

202
00:09:21,279 --> 00:09:24,000
it in the last seven or thirty days, and that usually

203
00:09:24,080 --> 00:09:26,159
clears up a lot of these communications. But I have

204
00:09:26,320 --> 00:09:30,720
much less experience running in AWS. We were GCP primarily

205
00:09:30,759 --> 00:09:33,759
inside of GoCardless, and we've been primarily GCP in incident.io,

206
00:09:33,799 --> 00:09:35,720
so they're my Yeah.

207
00:09:35,639 --> 00:09:38,279
Speaker 1: I assume that since you brought that over and you're

208
00:09:38,320 --> 00:09:40,200
the founding engineer, that was pretty much your decision.

209
00:09:40,279 --> 00:09:42,840
Speaker 3: When I originally turned up, it was Stephen and Chris

210
00:09:42,960 --> 00:09:46,600
have been piecing together the MVP of what incident.io was. Honestly,

211
00:09:46,720 --> 00:09:49,720
our architecture hasn't changed so much in that time. It's

212
00:09:49,759 --> 00:09:52,360
evolved and become a lot more robust and mature. But

213
00:09:52,519 --> 00:09:56,000
even now we are one big Go monolithic binary that

214
00:09:56,120 --> 00:09:58,840
we end up deploying in what you'd term a modular

215
00:09:59,000 --> 00:10:01,799
monolith setup. But we're doing that inside of Kubernetes

216
00:10:01,840 --> 00:10:03,519
now where when I first turned up it was just

217
00:10:03,879 --> 00:10:07,279
one Go binary but running as a single process with

218
00:10:07,440 --> 00:10:09,720
all of the workers, all of the web stuff inside

219
00:10:09,759 --> 00:10:12,679
of Heroku, which was quite uncomfortable when, like, we had

220
00:10:12,720 --> 00:10:17,080
our first obviously very predictable incident of like a worker.

221
00:10:16,919 --> 00:10:18,759
Speaker 4: Going wrong that would bring down the entire.

222
00:10:18,679 --> 00:10:21,639
Speaker 3: Service, and we were very quick to adjust as we went.

223
00:10:21,720 --> 00:10:23,879
But I guess this is actually one of the benefits

224
00:10:23,879 --> 00:10:26,879
of having seen the growth of a company like GoCardless

225
00:10:26,919 --> 00:10:29,320
before and then coming in being the person kind of

226
00:10:29,320 --> 00:10:32,600
responsible for evolving the infrastructure inside of incident.io. It's

227
00:10:32,639 --> 00:10:34,120
that you can go, well, I know that this thing

228
00:10:34,200 --> 00:10:35,600
is going to happen at a certain point, and you

229
00:10:35,639 --> 00:10:38,080
can be very ready for those adaptations. And I think

230
00:10:38,159 --> 00:10:40,480
that's part of what helped us scale very effectively in

231
00:10:40,519 --> 00:10:42,519
the first two years. It was making sure that we

232
00:10:42,600 --> 00:10:45,519
only applied the right level of process and the right

233
00:10:45,600 --> 00:10:48,679
level of sophistication when we thought that we needed it.

234
00:10:48,879 --> 00:10:50,559
Speaker 1: I want to dive into that, if at some point your tech

235
00:10:50,600 --> 00:10:53,159
stack was using Heroku. My experience in the past has

236
00:10:53,200 --> 00:10:56,240
been at like deep levels. It's always surprised me how

237
00:10:56,960 --> 00:11:00,519
more successful companies can buy in a utilization going forward,

238
00:11:00,600 --> 00:11:03,759
because often it's missing some of those edge case services

239
00:11:03,799 --> 00:11:07,159
which then become critical. I see these things like secrets management,

240
00:11:07,399 --> 00:11:11,399
especially like anything related to cryptography. I just haven't seen

241
00:11:11,759 --> 00:11:14,840
Heroku be able to support that well. So you were

242
00:11:14,879 --> 00:11:17,200
on there, how did you actually decide, like, how did

243
00:11:17,320 --> 00:11:19,279
what was the turning point to actually make the decision

244
00:11:19,399 --> 00:11:23,120
to switch over and really go for one of the hyperscalers?

245
00:11:23,240 --> 00:11:26,480
Speaker 3: I think exactly what you've mentioned is honestly

246
00:11:26,600 --> 00:11:27,840
the thing that pushed us the most.

247
00:11:28,759 --> 00:11:28,960
Speaker 4: Well.

248
00:11:29,000 --> 00:11:32,000
Speaker 3: We were running in Heroku. The application at the time was

249
00:11:32,039 --> 00:11:35,559
this Go monolith and it was running on a Heroku

250
00:11:35,919 --> 00:11:41,039
Service Postgres instance. And that hasn't, I mean, that architecture

251
00:11:41,039 --> 00:11:44,600
hasn't changed. We still have one big Postgres database that

252
00:11:44,720 --> 00:11:46,600
we look after in a much more mature way now,

253
00:11:46,919 --> 00:11:49,600
but that's in Cloud SQL. But even when we first

254
00:11:49,639 --> 00:11:52,120
started, Pub/Sub was the way that the application did

255
00:11:52,200 --> 00:11:55,120
asynchronous message processing, and that actually

256
00:11:54,879 --> 00:11:55,679
Speaker 4: Worked really really well.

257
00:11:55,720 --> 00:11:57,600
Speaker 3: But it meant that we were already kind of half

258
00:11:57,679 --> 00:12:00,279
and half in both Heroku and in GCP. And I

259
00:12:00,360 --> 00:12:02,720
think that the thing for me that made it very

260
00:12:02,720 --> 00:12:05,000
obvious that we were going to move was what you

261
00:12:05,039 --> 00:12:07,960
said about Heroku lacking various primitives that you want when

262
00:12:08,000 --> 00:12:10,759
you want to become more mature, I mean, especially when

263
00:12:10,759 --> 00:12:14,759
you consider we have some horrible percentage of the world's

264
00:12:14,879 --> 00:12:18,000
like GDP is indirectly locked up in the companies that

265
00:12:18,120 --> 00:12:20,000
use us. At the moment, we've got some really really

266
00:12:20,039 --> 00:12:23,240
big customers and they're running their incidents in our platform,

267
00:12:23,320 --> 00:12:25,240
so we have a huge obligation to make sure that

268
00:12:25,320 --> 00:12:27,559
those things are secure. And running in Heroku, where you

269
00:12:27,679 --> 00:12:31,159
have just standard, like, two-factor auth and, like,

270
00:12:31,399 --> 00:12:33,960
you're setting up environment variables, it's not the level of

271
00:12:34,039 --> 00:12:37,240
sophistication and security that you would want. So gradually, I

272
00:12:37,279 --> 00:12:40,879
think we were piecemeal moving things into what I had

273
00:12:40,919 --> 00:12:42,919
spent the first couple of weeks at incident.io setting up

274
00:12:43,120 --> 00:12:46,440
as a very sensible GCP environment with all the right

275
00:12:46,519 --> 00:12:48,759
security primitives and things like that. But we ended up

276
00:12:48,799 --> 00:12:52,600
gradually moving to GCP's Secret Manager, where even though we

277
00:12:52,679 --> 00:12:55,440
were running in Heroku, we would have our app using

278
00:12:55,480 --> 00:12:58,600
a security perimeter and service accounts that were authorized just

279
00:12:58,720 --> 00:13:01,360
from the Heroku environment to decrypt the secrets on the fly

280
00:13:01,799 --> 00:13:03,559
as and when they wanted to use them, which was

281
00:13:03,559 --> 00:13:06,080
a nice way of us grafting ourselves to a much

282
00:13:06,120 --> 00:13:08,879
more secure GCP primitive without abandoning all.

283
00:13:08,840 --> 00:13:10,960
Speaker 4: the stuff that you would normally get inside of Heroku.

284
00:13:11,159 --> 00:13:13,440
Speaker 3: And we did the same, gradually moving our workloads

285
00:13:13,480 --> 00:13:18,120
over from Heroku into Kubernetes inside of GCP and then

286
00:13:18,200 --> 00:13:20,559
finding a way to split the traffic across and moving

287
00:13:20,639 --> 00:13:23,200
over the workers and then eventually moving over the database too.

288
00:13:23,360 --> 00:13:25,759
But yeah, I think there's always a push if you're

289
00:13:25,759 --> 00:13:27,960
ever in a Heroku environment or something like that, where

290
00:13:28,399 --> 00:13:30,759
you're just going I want a bit more control over this,

291
00:13:31,039 --> 00:13:33,679
or I want a bit more visibility, and I'm willing

292
00:13:33,679 --> 00:13:35,759
to pay the cost now because we're much larger, having

293
00:13:36,000 --> 00:13:38,720
dedicated infrastructure engineers, so you can look after this

294
00:13:38,840 --> 00:13:41,440
and make a nice like paved path to production so

295
00:13:41,519 --> 00:13:43,039
that people can deploy things effectively.

296
00:13:43,240 --> 00:13:46,320
Speaker 1: I actually see it as a cost reduction mechanism. When

297
00:13:46,399 --> 00:13:48,679
you are given the primitives that work out of the box,

298
00:13:48,759 --> 00:13:51,879
you don't have to build complicated technology to say cross clouds,

299
00:13:51,879 --> 00:13:56,200
because even in your example using the GCP Secret Manager

300
00:13:56,519 --> 00:13:59,120
and having service clients that have direct access there, you

301
00:13:59,120 --> 00:14:01,080
still have to deploy those secrets, where I think in

302
00:14:01,200 --> 00:14:04,360
GCP they're JWTs, or they're JSON blobs that have

303
00:14:04,440 --> 00:14:07,639
the certificate embedded in order to actually generate JWTs to

304
00:14:07,720 --> 00:14:09,879
be sent to the server. But how do you even

305
00:14:09,919 --> 00:14:12,799
secure those and get those deployed into the Heroku environment,

306
00:14:12,799 --> 00:14:16,480
because if I recall and still true, maybe recently, there's

307
00:14:16,519 --> 00:14:20,120
no workload entity identifiers for the individual workloads that are

308
00:14:20,159 --> 00:14:22,279
running in Heroku, so there's no way to really secure

309
00:14:22,600 --> 00:14:26,240
those payloads locally to just the machine that's actually relevant.

310
00:14:26,399 --> 00:14:28,240
Speaker 3: Yeah, absolutely, and I think that this was part of

311
00:14:28,360 --> 00:14:30,759
the migration path that we came up with. So the

312
00:14:30,799 --> 00:14:33,559
first thing that we did was we created a security

313
00:14:33,639 --> 00:14:37,600
perimeter in GCP that allowed Secret Manager API access only

314
00:14:37,639 --> 00:14:40,360
from a specific group of IPs in Heroku. Interesting, that

315
00:14:40,480 --> 00:14:44,360
is not anywhere near as secure as workload identity, for example,

316
00:14:44,480 --> 00:14:47,360
But if you're looking at a halfway house, where you go, actually,

317
00:14:47,399 --> 00:14:49,320
I want to be able to pull this stuff into

318
00:14:49,360 --> 00:14:52,200
a different environment that's more secure and kind of like

319
00:14:52,279 --> 00:14:55,200
ease my way gradually into more of the sophistication that

320
00:14:55,200 --> 00:14:58,399
you get from a hyperscaler, that's a legitimate pathway to

321
00:14:58,480 --> 00:15:00,639
getting there. And now we're in an environment

322
00:15:00,720 --> 00:15:03,120
that is properly locked down and like, as you say,

323
00:15:03,240 --> 00:15:05,720
using all the primitives that allow you to best leverage

324
00:15:05,759 --> 00:15:07,759
the security tools that you get from the provider.

325
00:15:07,919 --> 00:15:10,000
Speaker 1: There was a little hesitation I think early on when

326
00:15:10,039 --> 00:15:13,159
you were answering which tech stack you were using. And I

327
00:15:13,200 --> 00:15:15,559
don't know if that was because you have workloads running

328
00:15:15,639 --> 00:15:18,840
in the other providers. Is that as a backup strategy

329
00:15:19,039 --> 00:15:19,759
or something.

330
00:15:19,600 --> 00:15:22,879
Speaker 3: Like that, we want to prioritize on using few tools

331
00:15:22,919 --> 00:15:25,519
and knowing them extremely well. And for us, that means

332
00:15:25,799 --> 00:15:28,799
our disaster recovery plan has like a couple of different

333
00:15:28,799 --> 00:15:30,639
phases to it. But what we do is we end

334
00:15:30,679 --> 00:15:34,799
up running all of our infrastructure inside of primarily one region,

335
00:15:34,879 --> 00:15:38,320
which involves several different data centers inside of Google,

336
00:15:38,519 --> 00:15:42,080
So all of our workloads in like the hot primary cluster,

337
00:15:42,879 --> 00:15:45,200
they're spread across three different data centers in one region.

338
00:15:45,840 --> 00:15:48,639
We then have a disaster recovery plan that has another

339
00:15:48,759 --> 00:15:50,879
region that we will fall back to if that ever

340
00:15:50,960 --> 00:15:53,759
goes wrong. That region is also within GCP. But we

341
00:15:53,799 --> 00:15:57,799
also have various different other requirements. For example, some of

342
00:15:57,840 --> 00:16:00,759
our work with customers in China require us to use

343
00:16:00,759 --> 00:16:03,679
different types of infrastructure to send title of communications over

344
00:16:03,720 --> 00:16:06,159
to them, So we have a ton of different constraints

345
00:16:06,200 --> 00:16:09,320
that mean that we want to be running with backups

346
00:16:09,440 --> 00:16:12,759
that span outside of just GCP. Now, directionally, I think, over

347
00:16:12,799 --> 00:16:15,080
the next year, we'll end up running our workloads

348
00:16:15,120 --> 00:16:17,679
across a variety of different providers. Again, I think that

349
00:16:17,759 --> 00:16:22,159
we'll be prioritizing GCP and a multi regional backup inside

350
00:16:22,159 --> 00:16:22,600
of GCP.

351
00:16:22,799 --> 00:16:24,279
Speaker 4: Is the way that we want to run things.

352
00:16:24,200 --> 00:16:27,519
Speaker 2: Now, based on the name of the company and the product.

353
00:16:27,759 --> 00:16:30,080
Speaker 1: I'm guessing you're and what you've said so far, it's

354
00:16:30,120 --> 00:16:34,039
really a focus on the incident management and I'll call

355
00:16:34,240 --> 00:16:36,840
the standard run books organizational run books for.

356
00:16:36,919 --> 00:16:37,759
Speaker 2: How to deal with that.

357
00:16:37,879 --> 00:16:40,679
Speaker 1: But I feel like you've expanded outside of that, and

358
00:16:41,320 --> 00:16:44,960
now I would see it as maybe else just for context,

359
00:16:45,440 --> 00:16:48,919
I say things like fire Hose and PagerDuty. Are

360
00:16:49,000 --> 00:16:51,559
you would you say you're like directly in competition with

361
00:16:51,600 --> 00:16:54,000
those because I know early on companies like PagerDuty,

362
00:16:54,039 --> 00:16:58,000
they really focused on handling the event and alerting or

363
00:16:58,039 --> 00:17:01,039
when there was an incident, but not necessarily handling the flow.

364
00:17:01,240 --> 00:17:03,559
And I don't remember ever coming up with a run

365
00:17:03,600 --> 00:17:06,400
book and throwing it in, you know, third-party other tools.

366
00:17:06,559 --> 00:17:07,480
Speaker 2: All our run books.

367
00:17:07,279 --> 00:17:09,000
Speaker 1: Would be in a and I'm sure I'm going to

368
00:17:09,039 --> 00:17:11,039
get some angry emails for this and questioning of my

369
00:17:11,160 --> 00:17:13,079
life choices, but we were using Confluence.

370
00:17:13,440 --> 00:17:16,359
Speaker 2: I think I think what what I'll think about that is,

371
00:17:16,599 --> 00:17:17,759
have you had to find a better tool?

372
00:17:18,920 --> 00:17:21,160
Speaker 1: It's the least bad one, but maybe you can add

373
00:17:21,279 --> 00:17:23,319
some clarity to sort of how you're thinking about it,

374
00:17:23,400 --> 00:17:26,000
because I know for bigger companies, as you said, if

375
00:17:26,039 --> 00:17:28,519
you're trying to apply a policy because of either compliance

376
00:17:28,640 --> 00:17:31,720
or because of regulation, then following the run books that

377
00:17:31,759 --> 00:17:34,680
you're creating at the organizational level is critical for

378
00:17:34,880 --> 00:17:38,160
resolving the incident, which is of course separate from whether

379
00:17:38,279 --> 00:17:40,880
or not how you know there's an incident and what

380
00:17:40,960 --> 00:17:42,279
you communicate to your customers.

381
00:17:42,480 --> 00:17:45,039
Speaker 3: The full focus of the company and why Incident came

382
00:17:45,079 --> 00:17:47,599
to be was it felt to us that the providers

383
00:17:47,680 --> 00:17:51,599
that already existed to provide incident response and tooling were

384
00:17:51,640 --> 00:17:54,839
focused kind of almost to the exclusion of anything else,

385
00:17:54,880 --> 00:17:57,319
on just how to tell you there is something wrong.

386
00:17:57,440 --> 00:17:59,960
And it's kind of this horrible sense that the effort

387
00:18:00,000 --> 00:18:02,000
stopped at the point that your phone went off. And

388
00:18:02,119 --> 00:18:05,319
for people who've been paged for hundreds, maybe thousands of

389
00:18:05,359 --> 00:18:07,920
incidents at this point, the reverse could not be more true,

390
00:18:07,960 --> 00:18:10,240
which is that all the work starts at the point

391
00:18:10,319 --> 00:18:13,200
that your phone actually starts ringing. So incident.io was

392
00:18:13,319 --> 00:18:16,559
focused on initially focused on the attempt to help you

393
00:18:16,960 --> 00:18:19,680
beyond the point where you initially got paged, and at

394
00:18:19,680 --> 00:18:22,519
the start of the company was to plug into providers

395
00:18:22,599 --> 00:18:25,359
like PagerDuty or Opsgenie, so that at the point

396
00:18:25,400 --> 00:18:27,160
where you got paged, we pull you into a Slack

397
00:18:27,240 --> 00:18:29,400
channel and it was at that point that we take

398
00:18:29,519 --> 00:18:31,480
you through the journey of actually trying to resolve it

399
00:18:31,559 --> 00:18:34,119
and caring about things like tracking all the actions that

400
00:18:34,160 --> 00:18:36,319
are going on on the incident, making sure that when

401
00:18:36,319 --> 00:18:38,559
you're coordinating with people there's just one place to go

402
00:18:38,759 --> 00:18:40,720
to talk about what's going on in the incident. As

403
00:18:40,759 --> 00:18:43,400
the company matured, we've kind of built out various different

404
00:18:43,440 --> 00:18:46,400
other kind of adjacent services that we think are core

405
00:18:46,480 --> 00:18:49,200
to incident response but weren't always packaged with those providers

406
00:18:49,240 --> 00:18:52,519
out there. So things that we've built so status pages.

407
00:18:53,720 --> 00:18:56,240
One of the biggest customers using our status page

408
00:18:56,240 --> 00:18:57,880
at the moment is OpenAI. So if they ever

409
00:18:57,920 --> 00:19:01,200
have an outage, OpenAI is going to post an incident

410
00:19:01,279 --> 00:19:03,720
on their status page, which ends up informing their customers

411
00:19:03,759 --> 00:19:06,880
about we've got some issues with this model, and then

412
00:19:06,920 --> 00:19:08,599
they can prioritize what they're doing. So that's a lot

413
00:19:08,640 --> 00:19:12,000
about helping walk your customers through the coordination of an incident.

414
00:19:12,119 --> 00:19:14,200
But we've also built out the on call aspect of

415
00:19:14,240 --> 00:19:16,400
incident response too, so it's now the case that you

416
00:19:16,440 --> 00:19:17,640
don't need a PagerDuty, you

417
00:19:17,599 --> 00:19:18,559
Speaker 4: Don't need an Opsgenie.

418
00:19:18,640 --> 00:19:21,119
Speaker 3: We have the mobile app, we have all the telecommunications,

419
00:19:21,200 --> 00:19:23,599
so we'll help you set up your schedules and make

420
00:19:23,640 --> 00:19:25,599
sure that you're doing all the things that we think

421
00:19:25,640 --> 00:19:27,759
are really healthy, such as if you've had a really

422
00:19:27,799 --> 00:19:31,319
busy night on the pager, we'll propose to people on

423
00:19:31,359 --> 00:19:33,119
the schedule that maybe they should take your shift the

424
00:19:33,160 --> 00:19:35,119
next day so that you can get some rest and

425
00:19:35,200 --> 00:19:37,920
recover and recuperate, rather than just keeping the same person

426
00:19:38,000 --> 00:19:41,240
on throughout the week until they're run ragged. Most recently,

427
00:19:41,319 --> 00:19:43,279
the thing that has been interesting in this area for

428
00:19:43,400 --> 00:19:47,039
me is, so I lead AI at Incident. So we have

429
00:19:47,119 --> 00:19:50,079
a team working on how to leverage AI to best

430
00:19:50,160 --> 00:19:52,799
help people resolve their incidents, and we have a product

431
00:19:52,839 --> 00:19:56,240
called AI SRE, which is aiming to help in that space.

432
00:19:56,319 --> 00:19:58,000
And so what we do now whenever you have an

433
00:19:58,039 --> 00:20:01,319
incident is we will look at the alert and we'll set

434
00:20:01,359 --> 00:20:03,880
our system off to go crawl a bunch of different things,

435
00:20:03,920 --> 00:20:06,359
say looking at all of your GitHub pull requests to say,

436
00:20:06,880 --> 00:20:08,240
I've found this alert, does anything

437
00:20:08,160 --> 00:20:08,720
Speaker 4: look like it might

438
00:20:08,799 --> 00:20:11,000
Speaker 3: have caused this alert, based on the diffs of the

439
00:20:11,079 --> 00:20:12,880
code that we can see and the timing around when

440
00:20:12,920 --> 00:20:14,960
it was deployed, but we'll also check stuff in your

441
00:20:14,960 --> 00:20:16,720
Slack workspace, and we'll also look at all of

442
00:20:16,799 --> 00:20:19,200
your past incident data. And one of the most useful

443
00:20:19,240 --> 00:20:22,279
things of this product is that we actually pull together

444
00:20:22,440 --> 00:20:25,440
organically from the history of all your old incidents, how

445
00:20:25,559 --> 00:20:28,200
you've responded to issues like this in the past, and

446
00:20:28,440 --> 00:20:30,200
what we do is we end up combining this ephemeral

447
00:20:30,319 --> 00:20:32,359
run book that says, actually, given what I can

448
00:20:32,400 --> 00:20:34,160
see that you've done in the past and what worked

449
00:20:34,200 --> 00:20:36,799
and what didn't, and even looking at your post mortems

450
00:20:36,839 --> 00:20:39,400
on what you said worked and what you missed, this

451
00:20:39,559 --> 00:20:41,160
is actually what we think that you should do right now,

452
00:20:41,160 --> 00:20:43,440
which can include things like, oh, this looks like a

453
00:20:43,559 --> 00:20:45,720
data breach. You should be contacting your DPA, do you

454
00:20:45,759 --> 00:20:47,839
want us to page them? Or maybe it's just by

455
00:20:47,880 --> 00:20:50,119
the way, this service is really flaky. I'm pretty sure

456
00:20:50,160 --> 00:20:52,519
this alert is actually not legit. You can run this

457
00:20:52,640 --> 00:20:54,599
script to try and verify whether or not the thing

458
00:20:54,680 --> 00:20:56,839
is actually true, and you can maybe even just ignore

459
00:20:56,880 --> 00:20:57,119
this and.

460
00:20:57,119 --> 00:20:57,720
Speaker 4: Go back to bed.

461
00:20:57,880 --> 00:20:59,480
Speaker 3: But yeah, I thought that the run book stuff is

462
00:20:59,519 --> 00:21:02,400
interesting is we have people who source their run books

463
00:21:02,440 --> 00:21:04,519
in all sorts of places, and while they may give

464
00:21:04,519 --> 00:21:06,440
a different answer on where they put their run books,

465
00:21:06,440 --> 00:21:08,440
there is one consistent message with them, which is always

466
00:21:08,480 --> 00:21:10,480
that the run books are always out of date, no matter what

467
00:21:10,599 --> 00:21:10,759
you do.

468
00:21:11,359 --> 00:21:14,319
Speaker 1: I think we've seen a lot of companies get wrong

469
00:21:14,559 --> 00:21:17,319
the reason why they're creating a run book, and I'll

470
00:21:17,400 --> 00:21:20,559
say that that reason is often I want to tell

471
00:21:20,640 --> 00:21:22,759
someone else what to do when there is an incident.

472
00:21:23,039 --> 00:21:23,359
Speaker 4: Gotcha.

473
00:21:23,640 --> 00:21:26,160
Speaker 1: I think the problem there is the people with the

474
00:21:26,279 --> 00:21:29,920
knowledge are trying to explain to someone who doesn't have

475
00:21:30,000 --> 00:21:33,000
the knowledge in a critical or emergency situation. But the

476
00:21:33,039 --> 00:21:35,599
correct thing to do is that's where the run book

477
00:21:35,599 --> 00:21:38,000
has zero value. And the value you get out of

478
00:21:38,160 --> 00:21:40,200
making a run book is making the run book and

479
00:21:40,400 --> 00:21:43,240
understanding how to actually communicate. You're really onto something there,

480
00:21:43,319 --> 00:21:46,880
really critical, which is the fact that it's not about

481
00:21:47,160 --> 00:21:48,799
what you think you should do.

482
00:21:48,880 --> 00:21:49,599
Speaker 2: In that incident.

483
00:21:49,799 --> 00:21:52,599
Speaker 1: It's learning more about the system, and you're doing that

484
00:21:52,799 --> 00:21:56,799
programmatically through pulling in information from any number of sources

485
00:21:56,880 --> 00:22:00,200
you mentioned. I'm really curious about the technical challenges in

486
00:22:00,319 --> 00:22:03,400
actually achieving that. How do you actually go and pull

487
00:22:03,480 --> 00:22:05,279
the information out of there?

488
00:22:05,559 --> 00:22:08,160
Speaker 3: No, so, no, I mean, they're all really good questions.

489
00:22:08,200 --> 00:22:11,039
So what we do is we connect to a variety

490
00:22:11,079 --> 00:22:14,440
of different systems. That's kind of the prerogative of an

491
00:22:14,519 --> 00:22:17,680
incident response tool, because everything in the company kind of

492
00:22:18,079 --> 00:22:21,599
falls downstream into an incident, so we already have access

493
00:22:21,640 --> 00:22:25,200
into a lot of different places. Slack is just one example.

494
00:22:25,319 --> 00:22:26,880
And what we do is we allow you to connect

495
00:22:27,000 --> 00:22:29,680
various channels into our system that say, hey, by the way,

496
00:22:29,759 --> 00:22:33,039
here's a channel that will often contain interesting things that

497
00:22:33,160 --> 00:22:35,319
might be relevant to you in an incident. So the

498
00:22:35,400 --> 00:22:38,000
one that is often a really good example is like

499
00:22:38,079 --> 00:22:41,200
an engineering channel. If you just have a shared public

500
00:22:41,200 --> 00:22:43,799
Slack channel called engineering where you post, hey, by

501
00:22:43,759 --> 00:22:46,279
Speaker 4: The way, we've changed how we do things about deployments.

502
00:22:46,440 --> 00:22:48,119
Speaker 3: We'll be watching that channel if you add it into

503
00:22:48,160 --> 00:22:51,000
the system to go see, there's the thing I've learned.

504
00:22:51,359 --> 00:22:53,720
So if I start seeing an incident that looks like

505
00:22:53,759 --> 00:22:56,119
it's something to do with deployments, and it's happened just

506
00:22:56,200 --> 00:22:57,880
a couple of hours after someone's gone and posted

507
00:22:57,920 --> 00:23:00,359
into engineering, then that's actually very relevant. So I'm going

508
00:23:00,440 --> 00:23:02,200
to incorporate that into my findings and use it to

509
00:23:02,240 --> 00:23:03,839
guide me. But there's a ton of other tools that

510
00:23:03,880 --> 00:23:05,759
we connect to, too. So we connect to GitHub,

511
00:23:05,799 --> 00:23:08,079
like you said, so we'll look at recent code changes.

512
00:23:08,200 --> 00:23:10,839
We'll also connect into telemetry, so we'll connect via Grafana

513
00:23:11,000 --> 00:23:13,160
into your metrics logs and traces, and what we end

514
00:23:13,240 --> 00:23:15,160
up doing is through a history of all the incidents

515
00:23:15,240 --> 00:23:18,079
that you've had before, and also through a variety of

516
00:23:18,160 --> 00:23:21,359
like background processing, we're continually trying to learn more about

517
00:23:21,400 --> 00:23:24,039
how you as an organization have built your kind of

518
00:23:24,200 --> 00:23:27,079
incident immune response, Like what are the things that you

519
00:23:27,279 --> 00:23:29,079
do when you have an incident like this, so that

520
00:23:29,200 --> 00:23:31,920
we can quickly guide an investigation to find all the

521
00:23:32,000 --> 00:23:34,240
dashboards that you normally look at, surface all the relevant

522
00:23:34,240 --> 00:23:37,519
information, inform you of anything someone said had changed recently.

523
00:23:37,680 --> 00:23:40,519
And we do that through a combination of like honestly

524
00:23:40,680 --> 00:23:43,039
a lot of quite advanced, like, AI RAG.

525
00:23:43,200 --> 00:23:46,759
Speaker 2: So yeah, no, absolutely, I just even want to go deeper.

526
00:23:47,079 --> 00:23:49,880
Speaker 1: So the data is in my slack channels or in

527
00:23:50,039 --> 00:23:52,680
my I don't know previous incidents wherever I'm recording that,

528
00:23:53,480 --> 00:23:55,720
How does it get into a place that you can

529
00:23:55,759 --> 00:23:59,200
actually utilize to identify what the new ephemeral run book

530
00:23:59,240 --> 00:24:01,519
should be? Like, how do you decide what's relevant? Are

531
00:24:01,599 --> 00:24:04,359
you a canonical thing as you said RAG? Are you

532
00:24:04,640 --> 00:24:08,160
copying first level like what data seems like it could

533
00:24:08,160 --> 00:24:11,400
be relevant into a set of RAG databases and then

534
00:24:11,680 --> 00:24:15,119
using that at say runtime, whenever an incident happens to

535
00:24:15,799 --> 00:24:18,319
query it for those pieces of data and use that

536
00:24:18,480 --> 00:24:21,000
to then power some sort of search mechanism back on

537
00:24:21,039 --> 00:24:22,079
the original source of data.

538
00:24:23,039 --> 00:24:24,720
Speaker 4: Yeah, so it is exactly that.

539
00:24:24,920 --> 00:24:28,480
Speaker 3: So we have this product called Catalog, which is your

540
00:24:28,759 --> 00:24:32,319
service catalog and a load of other different organizational resources,

541
00:24:32,400 --> 00:24:35,160
and so people use that to model their organization. So

542
00:24:35,279 --> 00:24:37,519
we have a picture of all of your teams, all

543
00:24:37,559 --> 00:24:40,000
of your services, all their dependencies. We also have a

544
00:24:40,039 --> 00:24:41,880
picture for many people, like if you're B to B

545
00:24:42,039 --> 00:24:45,000
maybe they'll they'll connect their CRM, so we'll know about

546
00:24:45,039 --> 00:24:47,160
all the customers that you may have in your system.

547
00:24:47,720 --> 00:24:49,960
So what this means is we can use that as

548
00:24:50,000 --> 00:24:52,039
a knowledge graph to try and assemble all of the

549
00:24:52,079 --> 00:24:55,039
resources that we know about your organization and then use

550
00:24:55,079 --> 00:24:57,880
that to guide the investigation, where like you say, we're

551
00:24:57,920 --> 00:25:00,759
continually background indexing the resources that we might need.

552
00:25:00,920 --> 00:25:04,279
Speaker 1: Are you utilizing the APIs that are provided by a GitHub,

553
00:25:04,400 --> 00:25:07,039
GitLab, whatever it is, to funnel their data in

554
00:25:07,319 --> 00:25:09,880
or are you getting data from customers via I don't

555
00:25:09,880 --> 00:25:13,640
know any number of xyz pipelines and are utilizing that.

556
00:25:14,000 --> 00:25:16,319
For instance, I can imagine a lot of customers, especially

557
00:25:16,359 --> 00:25:19,200
at scale, may already have all of that data having

558
00:25:19,279 --> 00:25:23,119
been replicated into your redshifts or your snowflakes of the world.

559
00:25:23,240 --> 00:25:25,799
Speaker 3: We have a variety of both native connections, so we

560
00:25:25,839 --> 00:25:28,799
can connect directly to Salesforce or something like that, but

561
00:25:28,920 --> 00:25:31,720
we also have this kind of universal adapter that I

562
00:25:31,839 --> 00:25:34,119
wrote a while ago and has since been evolved a

563
00:25:34,160 --> 00:25:36,880
lot that we call the Catalog Importer, and people can

564
00:25:37,000 --> 00:25:40,519
run this and connect to their custom service catalog, maybe

565
00:25:40,519 --> 00:25:43,000
if they've built it themselves, and they run it periodically

566
00:25:43,039 --> 00:25:44,440
on a cron and it will pull down all their data

567
00:25:44,519 --> 00:25:46,160
and sync it across. And that's how we end up

568
00:25:46,279 --> 00:25:49,799
keeping our platform in sync with whatever your internal in

569
00:25:49,960 --> 00:25:52,319
house solution might be. But when it comes to something

570
00:25:52,359 --> 00:25:53,960
like GitHub and all the pull requests that you might

571
00:25:54,000 --> 00:25:57,079
be making, that's us connecting over a GitHub integration, listening

572
00:25:57,119 --> 00:25:59,559
for webhooks and going, cool, we have a PR here.

573
00:26:00,039 --> 00:26:00,799
Speaker 4: I'm going to pull that in.

574
00:26:00,960 --> 00:26:02,160
Speaker 3: I'm going to have a look at the diff. I'm

575
00:26:02,160 --> 00:26:04,079
going to analyze it and figure out what are the

576
00:26:04,160 --> 00:26:05,720
key things that have kind of changed. I'm going to

577
00:26:05,759 --> 00:26:07,880
tag it with keywords so that I can quickly find it

578
00:26:08,079 --> 00:26:10,079
if something goes wrong. And we end up doing that

579
00:26:10,240 --> 00:26:12,599
for all of the different types of data that we index.

580
00:26:12,799 --> 00:26:14,799
But the cool thing is, I guess like this is

581
00:26:14,839 --> 00:26:17,400
a bit that is very interesting to our platform is

582
00:26:17,440 --> 00:26:21,440
that incidents themselves are amazing resources of this information. They're

583
00:26:21,480 --> 00:26:24,400
probably one of the best logs that you have when

584
00:26:24,440 --> 00:26:26,839
you end up taking the post mortems that people are writing,

585
00:26:27,000 --> 00:26:29,359
and you take all of the activity that happened during

586
00:26:29,440 --> 00:26:31,799
the incident channel and you try and join that up

587
00:26:31,839 --> 00:26:34,079
with everything that you can then see in maybe Jira

588
00:26:34,200 --> 00:26:36,599
or Linear when you're opening follow-up actions and things

589
00:26:36,680 --> 00:26:39,119
like that. You get a really complete picture of everything

590
00:26:39,160 --> 00:26:41,559
that's happened before and the state of your organization.

591
00:26:41,839 --> 00:26:45,240
Speaker 1: There must be such a challenge to identify every single

592
00:26:45,359 --> 00:26:48,640
source of data that could be utilized and correctly understand

593
00:26:48,680 --> 00:26:51,079
the semantics of the data that's there so that it

594
00:26:51,160 --> 00:26:54,160
can be utilized in the right way by I assume

595
00:26:54,200 --> 00:26:57,480
you're using some sort of LLM. How do you overcome

596
00:26:57,599 --> 00:26:59,920
that challenge? Or are you just throwing all the data blobs,

597
00:27:00,799 --> 00:27:04,440
you know, through your LLM embedding model and just putting

598
00:27:04,440 --> 00:27:06,519
in a RAG database and calling it done.

599
00:27:06,920 --> 00:27:08,839
Speaker 3: So the answer to that is, like, yes, it is

600
00:27:08,880 --> 00:27:11,880
a challenge, and if you run the naive approach, then

601
00:27:12,559 --> 00:27:16,680
it isn't good enough to provide the level of accuracy

602
00:27:16,960 --> 00:27:20,440
and actionable feedback that you want for an incident response

603
00:27:20,559 --> 00:27:23,079
product like this. So the way that we look at

604
00:27:23,079 --> 00:27:26,079
it is it's extremely harmful for you to turn up

605
00:27:26,079 --> 00:27:28,279
at the start of an incident response channel and claim

606
00:27:28,599 --> 00:27:30,359
that this is due to a problem that it is

607
00:27:30,440 --> 00:27:34,599
then not. The idea of, like, actual, like, inaccurate assertions

608
00:27:34,880 --> 00:27:37,880
are really really distracting. It breaks trust in the system

609
00:27:38,039 --> 00:27:40,240
that you've built. And it can also just be highly

610
00:27:40,720 --> 00:27:44,759
negatively impactful, send someone off on a real, like, wild

611
00:27:44,839 --> 00:27:47,599
goose chase that doesn't return anything of value. Whilst if

612
00:27:47,640 --> 00:27:50,200
we are who is improving this system for and where has

613
00:27:50,240 --> 00:27:52,480
it got worse because often you can't really change this

614
00:27:52,559 --> 00:27:55,160
without at least some things getting worse at the same time,

615
00:27:55,720 --> 00:27:58,000
there's a lot of other things getting better. So this

616
00:27:58,160 --> 00:28:01,359
is everything from, like, having decent eval suites to

617
00:28:02,319 --> 00:28:05,359
building a system of data sets of old incidents so

618
00:28:05,400 --> 00:28:08,759
that we can rerun investigations on them, which also comes

619
00:28:08,799 --> 00:28:11,359
with a bunch of challenges, like how do you run

620
00:28:11,839 --> 00:28:14,559
an investigation as if it happened at a particular time

621
00:28:14,920 --> 00:28:17,599
with only the information that you had back then, so

622
00:28:17,680 --> 00:28:19,160
that you can then grade it to see if the

623
00:28:19,279 --> 00:28:22,279
new system is better, but without you kind of accidentally

624
00:28:22,400 --> 00:28:26,559
leaking information that happened after the incident back into the

625
00:28:26,680 --> 00:28:27,160
back test.

626
00:28:27,640 --> 00:28:31,279
Speaker 1: It's really interesting you're bringing that up, because actually I

627
00:28:31,400 --> 00:28:33,519
don't It wasn't the most recent episode, but we had

628
00:28:33,839 --> 00:28:36,559
Andrew Moreland on from Chalk AI and he was exactly

629
00:28:36,799 --> 00:28:39,720
talking about what they call time traveling, where, for

630
00:28:39,920 --> 00:28:43,119
fraud detection actually, and it's a really interesting episode on

631
00:28:43,200 --> 00:28:45,319
that topic, So we don't need to dive into it here,

632
00:28:45,400 --> 00:28:49,079
but yeah, I totally get you have a really useful

633
00:28:49,200 --> 00:28:51,079
set of information there, and I think they were doing

634
00:28:51,160 --> 00:28:54,920
something similar basically in the fraud space in their database.

635
00:28:55,319 --> 00:28:56,799
Speaker 2: What are you doing with this data that's coming in?

636
00:28:57,160 --> 00:29:01,599
Speaker 1: Is it going into some proprietary RAG database or using

637
00:29:01,640 --> 00:29:03,759
something off the shelf or a third party provider.

638
00:29:03,519 --> 00:29:03,960
Speaker 4: Or something like that.

639
00:29:04,599 --> 00:29:07,480
Speaker 3: So we're mostly indexing this into our primary Postgres database.

640
00:29:07,599 --> 00:29:09,759
And what you can do is you can get a

641
00:29:09,880 --> 00:29:13,440
really long way by just indexing this data with

642
00:29:13,640 --> 00:29:16,519
pre-processed attributes that you can either use, so you can

643
00:29:16,720 --> 00:29:20,559
use vector embeddings if you want. They have several advantages

644
00:29:20,599 --> 00:29:24,000
and disadvantages. Postgres has pgvector that allows you to

645
00:29:24,079 --> 00:29:26,480
look for vector similarity in your result set. Actually, we

646
00:29:26,599 --> 00:29:30,480
found the embedding vectors aren't as easy to use and

647
00:29:30,559 --> 00:29:34,559
aren't as reliably consistent as us just using tags. But yeah,

648
00:29:34,599 --> 00:29:37,880
we do a ton of indexing things continuously, tagging them,

649
00:29:38,519 --> 00:29:41,319
some vector embeddings, and then when we fetch all the

650
00:29:41,359 --> 00:29:44,319
information back at the start of an investigation, we end

651
00:29:44,440 --> 00:29:46,559
up unpacking all of that information and passing it back

652
00:29:46,599 --> 00:29:49,279
through what we call a reranker, which is a concept

653
00:29:49,319 --> 00:29:52,319
that's quite familiar for people working with AI, but you

654
00:29:52,440 --> 00:29:56,680
essentially unpack all of your long list of results and

655
00:29:56,759 --> 00:29:59,519
then you pass them back into an LLM to gradually

656
00:29:59,640 --> 00:30:02,720
shortlist, to find the most compelling results from the

657
00:30:02,759 --> 00:30:03,920
Speaker 4: Longer list that you produced.

658
00:30:04,680 --> 00:30:06,960
Speaker 3: And it's through doing this that we're able to search everything,

659
00:30:07,039 --> 00:30:10,319
and return a pretty decent aggregate of all the targeted

660
00:30:10,400 --> 00:30:13,519
resources within about a minute after the alert's fired, even

661
00:30:13,559 --> 00:30:15,920
though that might be hundreds of thousands of requests we're

662
00:30:15,960 --> 00:30:16,559
searching across.

663
00:30:16,640 --> 00:30:20,319
Speaker 1: But you're solving a critical emergency moment problem for people.

664
00:30:20,400 --> 00:30:23,319
But you're actually able to spend longer on that analysis

665
00:30:23,440 --> 00:30:26,880
and almost generate something similar to the existing reasoning models

666
00:30:26,880 --> 00:30:29,640
from the providers that are out there. They're running LLMs

667
00:30:29,680 --> 00:30:32,720
that may take a longer period of time to generate stuff,

668
00:30:32,720 --> 00:30:34,240
and I can imagine maybe you have some sort of

669
00:30:34,279 --> 00:30:38,240
strategy for generating a first pass answer and then spending

670
00:30:38,319 --> 00:30:42,319
more time in the background compiling longer challenged answers that

671
00:30:42,480 --> 00:30:45,519
come up with using more tokens or pulling more data.

672
00:30:45,839 --> 00:30:46,720
Speaker 4: Yeah, it's exactly that.

673
00:30:46,880 --> 00:30:49,039
Speaker 3: So I think we put a pretty high price on

674
00:30:49,079 --> 00:30:51,319
this idea that we want you to turn up in

675
00:30:51,400 --> 00:30:54,240
this incident channel. Having been paged and we've got a

676
00:30:54,480 --> 00:30:59,839
fairly substantial preliminary estimation of what's going on in the incident. Now,

677
00:31:00,480 --> 00:31:03,799
we immediately after go back another iteration and we go

678
00:31:03,960 --> 00:31:07,160
target like other resources, and we go work our way

679
00:31:07,200 --> 00:31:08,480
through it and go do we really think that this

680
00:31:08,680 --> 00:31:10,480
actually means the thing that we've gone and claimed, So

681
00:31:10,599 --> 00:31:13,079
we might have a preliminary message that goes low confidence.

682
00:31:13,119 --> 00:31:15,359
It might be this PR. But then actually we're

683
00:31:15,400 --> 00:31:17,759
cloning down the codebase in the background, and we're double

684
00:31:17,839 --> 00:31:20,799
checking all of the assumptions that were built into us

685
00:31:20,880 --> 00:31:23,079
thinking that this was the cause. So yeah, over time,

686
00:31:23,400 --> 00:31:25,640
I think we go maybe anywhere up to five or

687
00:31:25,680 --> 00:31:28,480
ten turns of that cycle. Most people will create an incident,

688
00:31:28,720 --> 00:31:31,680
especially a customer facing one. Someone else might create it

689
00:31:31,759 --> 00:31:34,599
and it will go, hey, website's broken, and for us,

690
00:31:34,720 --> 00:31:36,920
that's like, it's not very useful, right, how are you

691
00:31:36,960 --> 00:31:37,200
going to.

692
00:31:37,200 --> 00:31:38,599
Speaker 4: Find something that's going to help you with that?

693
00:31:39,119 --> 00:31:41,440
Speaker 3: Arguably anything that you have merged to the website or

694
00:31:41,480 --> 00:31:44,319
anything could cause that to be broken. So what we

695
00:31:44,400 --> 00:31:47,240
do is we end up pausing until someone provides enough

696
00:31:47,279 --> 00:31:49,960
information in the channel. Maybe they take a screenshot of

697
00:31:50,000 --> 00:31:51,480
the page that's broken and they drop it in.

698
00:31:51,599 --> 00:31:52,480
Speaker 4: So then we process the.

699
00:31:52,519 --> 00:31:55,640
Speaker 3: Image with some multimodal models, and at this point, now

700
00:31:55,680 --> 00:31:57,720
we know exactly what the website is and exactly what

701
00:31:57,839 --> 00:31:59,480
the path is because we can see it in the browser.

702
00:31:59,680 --> 00:32:01,799
We've got enough information now, so that's when we hit

703
00:32:01,960 --> 00:32:04,720
like all the heavy-duty searches and then we start

704
00:32:04,759 --> 00:32:05,359
putting things in.

705
00:32:05,680 --> 00:32:07,920
Speaker 1: I mean, that's that's pretty amazing what we're doing there,

706
00:32:08,039 --> 00:32:11,160
especially even parsing the images for pulling out the

707
00:32:11,599 --> 00:32:14,079
URL text, the add-in information.

708
00:32:13,839 --> 00:32:14,440
Speaker 2: Of the pipeline.

709
00:32:14,839 --> 00:32:14,960
Speaker 4: You know.

710
00:32:15,000 --> 00:32:16,720
Speaker 1: I wonder if there's any sort of stat that's like

711
00:32:16,880 --> 00:32:19,279
when someone says, oh, the website is down, they usually

712
00:32:19,400 --> 00:32:22,240
mean this particular website in the past. The type of

713
00:32:22,279 --> 00:32:25,279
people that report that often are you know, in a

714
00:32:25,359 --> 00:32:28,079
particular area, And so there is still some sort of

715
00:32:28,160 --> 00:32:30,279
corollary that an LLM would be able to, you know,

716
00:32:30,359 --> 00:32:32,079
pull out automatically via vector search.

717
00:32:32,319 --> 00:32:34,519
Speaker 4: So absolutely, and that is that is what we will do.

718
00:32:34,839 --> 00:32:37,119
Speaker 3: So even if we don't have enough information to go

719
00:32:37,200 --> 00:32:39,400
searching through your GitHub PRs, what we can do

720
00:32:39,559 --> 00:32:41,759
is we can look for other incidents that look like

721
00:32:41,880 --> 00:32:44,119
this one. So if there's another incident in the past

722
00:32:44,160 --> 00:32:47,240
that went website is broken, we'll collate together all of

723
00:32:47,279 --> 00:32:49,759
the information that we think was relevant from those incidents

724
00:32:49,799 --> 00:32:52,039
and then we'll use that to guide our search. So

725
00:32:52,119 --> 00:32:54,039
it might be that we can bootstrap ourselves to a

726
00:32:54,079 --> 00:32:55,839
position where we go we're pretty sure we know what

727
00:32:55,920 --> 00:32:58,319
this problem is, and then that allows us to engage

728
00:32:58,359 --> 00:33:00,720
the other searches, but otherwise we'll wait until we have

729
00:33:00,799 --> 00:33:02,839
a bit more clarity, because again, what we don't want

730
00:33:02,920 --> 00:33:06,240
to do is send out a very generic query, get

731
00:33:06,359 --> 00:33:08,640
tons of data back, and then misguide people.

732
00:33:09,000 --> 00:33:12,279
Speaker 1: It's sobering almost that you said, you know, vector embedding

733
00:33:12,599 --> 00:33:15,240
or using an embedding model and storing in a vector database

734
00:33:15,319 --> 00:33:18,640
isn't the end all of optimized search results.

735
00:33:18,680 --> 00:33:18,799
Speaker 4: Here.

736
00:33:19,400 --> 00:33:22,119
Speaker 1: There was another episode in the recent past where we

737
00:33:22,200 --> 00:33:24,839
were actually talking about how whether or not semantic search

738
00:33:24,960 --> 00:33:28,160
for everything should just move to using an embedding

739
00:33:28,200 --> 00:33:32,519
model, and the staff developer relations expert from Pinecone said, actually,

740
00:33:32,599 --> 00:33:36,400
well maybe, but the keyword search is also still incredibly valuable.

741
00:33:36,400 --> 00:33:37,839
Speaker 2: We can't get rid of that. And you know, you're

742
00:33:37,839 --> 00:33:39,039
basically saying, yeah.

743
00:33:38,920 --> 00:33:43,440
Speaker 1: Actually tagging human tagging on data is still the most

744
00:33:43,519 --> 00:33:45,920
valuable thing that we could be doing in some regard.

745
00:33:46,000 --> 00:33:48,519
I mean, of course you combine it with the appropriate format,

746
00:33:48,559 --> 00:33:51,640
but it's good. I feel like that multiple experts from

747
00:33:51,680 --> 00:33:55,000
you know, different domains are sort of, you know, really

748
00:33:55,039 --> 00:33:56,839
focused on getting down to the point here, which is

749
00:33:57,000 --> 00:33:59,559
actually we still need things that we've developed in the

750
00:33:59,599 --> 00:34:01,759
past to be successful it's not just like the new

751
00:34:01,839 --> 00:34:03,240
model is better than the one before.

752
00:34:03,559 --> 00:34:06,240
Speaker 3: Yeah, And I think this applies to so much in

753
00:34:06,440 --> 00:34:10,000
the AI engineering discipline at the moment. I think when

754
00:34:10,280 --> 00:34:12,159
we were first starting in earnest to do this maybe

755
00:34:12,199 --> 00:34:14,000
a year and a half ago, there were a ton

756
00:34:14,079 --> 00:34:16,599
of people whispering about how fine tuning was the way

757
00:34:16,760 --> 00:34:19,599
to get yourself to a system that is like the

758
00:34:19,760 --> 00:34:21,280
one that's so much better.

759
00:34:21,440 --> 00:34:24,559
Speaker 1: You really nailed it there, specifically, I've seen it similarly,

760
00:34:24,679 --> 00:34:28,079
the fine tuning is just prohibitively expensive, and so I'm

761
00:34:28,119 --> 00:34:29,480
not going to say, you know, rule out all the

762
00:34:29,760 --> 00:34:32,800
new technology options for sure, but you really have to

763
00:34:32,800 --> 00:34:36,199
be doing something special where it perfectly matches the model

764
00:34:36,239 --> 00:34:38,920
that you've got for it to really work effectively when

765
00:34:38,960 --> 00:34:41,039
you're in that situation to actually be able to utilize it.

766
00:34:41,039 --> 00:34:42,559
And I want to ask you a little bit about

767
00:34:42,559 --> 00:34:46,239
your evolution here because the company started at I feel

768
00:34:46,239 --> 00:34:50,079
like a really unique time for solving these things. Basically,

769
00:34:50,119 --> 00:34:54,639
when you started, the models available were just basically being born,

770
00:34:55,239 --> 00:34:58,039
and you've grown at the same time which the models

771
00:34:58,079 --> 00:35:01,840
have been really significantly refined and curious what the impact

772
00:35:01,880 --> 00:35:06,199
has been both internally in incident.io but as well as

773
00:35:06,320 --> 00:35:08,559
the product that you thought you were building and the

774
00:35:08,599 --> 00:35:09,719
one that you ended up building.

775
00:35:10,079 --> 00:35:12,159
Speaker 4: So I think you are absolutely right.

776
00:35:12,239 --> 00:35:15,199
Speaker 3: And actually the models that we had maybe a couple

777
00:35:15,239 --> 00:35:18,960
of years ago, they were so so so much different

778
00:35:19,000 --> 00:35:20,320
than the ones that we have right now, and I

779
00:35:20,400 --> 00:35:24,320
think that ends up impacting both the scope and ambition

780
00:35:24,440 --> 00:35:26,480
that you have of the product that you want to build.

781
00:35:26,679 --> 00:35:30,239
But like it's entirely changed, honestly the direction that we

782
00:35:30,280 --> 00:35:33,039
would like to take the company as well. So I

783
00:35:33,079 --> 00:35:35,639
think we first started thinking about how we were going

784
00:35:35,719 --> 00:35:38,679
to use AI to really push the product forward, and

785
00:35:38,760 --> 00:35:40,559
maybe about two and a half years ago, and we

786
00:35:40,639 --> 00:35:43,519
started with very basic things. So we started with how

787
00:35:43,559 --> 00:35:47,719
would we automatically summarize incidents or how would we generate

788
00:35:48,079 --> 00:35:51,000
incident updates so that people who were responding to incidents

789
00:35:51,000 --> 00:35:53,400
wouldn't have to actually type them out themselves. But gradually,

790
00:35:53,440 --> 00:35:56,199
as we started becoming more AI literate, I guess, and

791
00:35:56,639 --> 00:35:59,880
the models started improving, and we saw agentic tools

792
00:36:00,159 --> 00:36:01,920
being released that seem to do a lot more than

793
00:36:01,920 --> 00:36:04,000
we've ever seen before, the scope of what we thought

794
00:36:04,159 --> 00:36:06,440
was possible to try and help our customers with the

795
00:36:06,440 --> 00:36:08,760
product that we were going to build has just increased

796
00:36:09,039 --> 00:36:12,199
probably every three months. And I think the biggest thing

797
00:36:12,280 --> 00:36:14,639
that kind of changed my mind is the journey that

798
00:36:14,679 --> 00:36:16,599
I've been through over the last year and a bit,

799
00:36:16,679 --> 00:36:18,679
where we've gone from you know what, we can just

800
00:36:18,719 --> 00:36:21,199
give someone a really great summary of everything that's going

801
00:36:21,239 --> 00:36:23,079
on and examine all the dashboards and just go these

802
00:36:23,119 --> 00:36:25,800
are the useful things, to eventually going, well, actually, like,

803
00:36:26,000 --> 00:36:29,360
we're pretty sure we can find a narrative about exactly

804
00:36:29,400 --> 00:36:31,800
how this stuff has worked, and we can reason about

805
00:36:31,840 --> 00:36:34,199
it and rule out the things that we think are

806
00:36:34,400 --> 00:36:36,320
irrelevant and the things that we think are really relevant,

807
00:36:36,800 --> 00:36:38,800
and we can propose next steps to the people to

808
00:36:38,920 --> 00:36:41,239
eventually where we've got to now, which is we're able

809
00:36:41,320 --> 00:36:43,639
to in some cases, if we can identify the part

810
00:36:43,719 --> 00:36:46,280
of the code that's gone wrong, we'll actually create a

811
00:36:46,440 --> 00:36:48,199
code change that will try and fix it for you.

812
00:36:48,400 --> 00:36:51,480
So we have a virtual machine that sits there or

813
00:36:51,679 --> 00:36:53,800
several of them, and we end up communicating from our

814
00:36:53,800 --> 00:36:56,320
app and going, hey, please load up the codebase over there,

815
00:36:56,320 --> 00:36:57,960
and then we have an agent that sits inside the

816
00:36:58,000 --> 00:37:01,360
code base and tries actually debugging what's going on, and

817
00:37:01,440 --> 00:37:04,719
then our investigation system is using a combination of that

818
00:37:04,840 --> 00:37:08,360
coding agent, along with our telemetry agent and everything else

819
00:37:08,360 --> 00:37:10,559
that we've got in our system to poke and prod

820
00:37:10,719 --> 00:37:12,480
and build up its understanding of what's going on, and

821
00:37:12,559 --> 00:37:14,880
so eventually we go, hey, can you actually fix this?

822
00:37:15,400 --> 00:37:17,000
And then we'll try and use the coding agent to

823
00:37:17,280 --> 00:37:19,360
end up building a fix which we end up pushing into,

824
00:37:19,599 --> 00:37:20,880
pushing into GitHub. Wow.

825
00:37:21,000 --> 00:37:23,760
Speaker 2: Okay, that's pretty intriguing.

826
00:37:23,960 --> 00:37:25,599
Speaker 4: Really quite crazy security challenge.

827
00:37:25,719 --> 00:37:30,079
Speaker 3: If you've I remember coming to Ben who's like lead,

828
00:37:30,559 --> 00:37:33,800
and going, hey, how do you feel about us pulling

829
00:37:33,840 --> 00:37:36,960
down other people's code and then running an agentic piece

830
00:37:37,000 --> 00:37:39,920
of software in there? And like the agent has to

831
00:37:39,960 --> 00:37:42,280
be able to run arbitrary shell commands and stuff like that.

832
00:37:42,639 --> 00:37:44,719
The amount of work that we did on trying to

833
00:37:44,920 --> 00:37:47,519
just secure that and make it a secure platform for

834
00:37:48,159 --> 00:37:49,880
us to run was kind of nuts.

835
00:37:50,119 --> 00:37:52,760
Speaker 1: Were you able to take inspiration from like AWS's Firecracker

836
00:37:53,079 --> 00:37:56,559
or associated technologies or did you really have to spin

837
00:37:56,599 --> 00:37:58,800
this up yourself? I mean, I ask because we're in

838
00:37:58,840 --> 00:38:02,519
a similar situation where sometimes not anything close to as

839
00:38:02,599 --> 00:38:05,360
ridiculous as you're doing and trying to run basically the

840
00:38:05,360 --> 00:38:09,639
customers code itself, but realistically, we let customers configure stuff

841
00:38:09,679 --> 00:38:13,320
with maybe a programmatic extension point and write some code,

842
00:38:13,400 --> 00:38:15,880
and one of the best security mechanisms I've found is

843
00:38:16,239 --> 00:38:19,039
to just give it to a cloud provider to execute,

844
00:38:19,079 --> 00:38:23,119
because their whole business is based on like isolation, and

845
00:38:23,239 --> 00:38:26,519
so if there's an isolation problem there, we're not going

846
00:38:26,599 --> 00:38:27,800
to be the only ones with an issue.

847
00:38:28,239 --> 00:38:31,320
Speaker 3: So I think, like absolutely, yes, the problem that we

848
00:38:31,400 --> 00:38:35,400
have is the latency, so being able to connect

849
00:38:35,559 --> 00:38:38,599
really quickly and issue arbitrary code queries. Some of the

850
00:38:38,800 --> 00:38:41,519
repos that our customers have can be like many gigabytes

851
00:38:42,039 --> 00:38:44,400
large or small, depending on how you feel. And being

852
00:38:44,440 --> 00:38:46,440
able to pull that down and have that code base

853
00:38:46,519 --> 00:38:48,880
available so that you can then run queries against it,

854
00:38:49,320 --> 00:38:52,239
and trying to do that in ephemeral kind of isolated

855
00:38:52,280 --> 00:38:54,039
environments is quite difficult.

856
00:38:54,119 --> 00:38:54,880
Speaker 2: So complex.

857
00:38:54,920 --> 00:38:57,400
Speaker 1: I just would never imagine anyone trying to actually make

858
00:38:57,480 --> 00:39:01,159
that happen programmatically. So hats off to you on that

859
00:39:01,480 --> 00:39:04,519
particular challenge, overcome and execute it, and it would be

860
00:39:04,559 --> 00:39:06,239
really interesting to see how that evolves over time.

861
00:39:07,119 --> 00:39:09,480
Speaker 3: Yeah, no, I think yeah. I think Ben has plans

862
00:39:09,519 --> 00:39:11,440
to share a lot more about it soon. It's been

863
00:39:11,480 --> 00:39:15,079
a lot of adventures in isolation, and well, I think

864
00:39:15,079 --> 00:39:17,719
the key thing for us is this actually allows us

865
00:39:17,760 --> 00:39:21,360
to power some really interesting product experiences, and that's always

866
00:39:21,400 --> 00:39:22,440
the motivation for us here.

867
00:39:22,760 --> 00:39:25,559
Speaker 1: I think when you've identified something that is so novel

868
00:39:25,599 --> 00:39:27,920
in this way, it does create its own sort of

869
00:39:27,960 --> 00:39:30,519
competitive advantage, which I can see is like not something

870
00:39:30,519 --> 00:39:32,880
you ever necessarily wanted to start. Like, you didn't start out saying,

871
00:39:32,880 --> 00:39:33,519
Speaker 2: You know what we're going to do.

872
00:39:33,800 --> 00:39:36,519
Speaker 1: We're going to figure out that like all this infrastructure

873
00:39:36,519 --> 00:39:40,119
as code, you know, DevOps world stuff, where you build it,

874
00:39:40,119 --> 00:39:41,760
you run it. Now we're going to just figure out

875
00:39:41,760 --> 00:39:44,440
how to run other people's code programmatically on the fly.

876
00:39:45,079 --> 00:39:47,719
Speaker 2: And but once you've done that, which.

877
00:39:47,639 --> 00:39:49,239
Speaker 1: Arguably is the hard part, there's a lot of like

878
00:39:49,360 --> 00:39:51,280
that empower a lot of things, Like in one of

879
00:39:51,320 --> 00:39:55,280
these previous episodes, we were talking about how the company

880
00:39:55,360 --> 00:39:58,440
had to figure out how to dynamically run customer code.

881
00:39:58,920 --> 00:40:02,320
And they're not as they're not gigabyte repositories. But if

882
00:40:02,400 --> 00:40:04,760
you know what the code is and you convert it

883
00:40:04,840 --> 00:40:08,360
into an AST, you can then prove that you understand

884
00:40:08,400 --> 00:40:09,800
what the code actually does, and then run it in

885
00:40:09,880 --> 00:40:13,360
whatever infrastructure you want, which is really interesting because it

886
00:40:13,599 --> 00:40:16,840
puts all the effort on building the perfect AST generator

887
00:40:16,880 --> 00:40:20,079
from programming in one particular language and then being able

888
00:40:20,079 --> 00:40:21,800
to execute it. But you get around a lot of

889
00:40:21,800 --> 00:40:24,559
the security concerns because you've proven that you understand how

890
00:40:24,559 --> 00:40:26,719
the code is supposed to work, and if it violates

891
00:40:26,800 --> 00:40:29,880
your whatever invariant you have on your code, they always know.

892
00:40:29,960 --> 00:40:32,119
This actually is like an extra instruction to an LLM

893
00:40:32,239 --> 00:40:33,960
to do something we don't actually want to run that.

894
00:40:34,559 --> 00:40:36,360
Speaker 3: Even if you go into how do you get a

895
00:40:36,559 --> 00:40:40,239
code query to execute over a very large code base

896
00:40:40,320 --> 00:40:43,519
and give back responses that are both correct and also

897
00:40:43,719 --> 00:40:45,360
do it quite quickly, there are a couple of ways

898
00:40:45,400 --> 00:40:47,159
that you can do. You can either try and shave

899
00:40:47,280 --> 00:40:49,159
time off how fast you're going to get that code

900
00:40:49,159 --> 00:40:49,679
base locally.

901
00:40:49,760 --> 00:40:51,039
Speaker 4: That's one thing that we've done.

902
00:40:51,199 --> 00:40:53,880
Speaker 3: But there's actually a way more impactful thing that you

903
00:40:53,920 --> 00:40:57,320
can do, which is pre analyzing a code base so

904
00:40:57,360 --> 00:40:59,159
that you have a map of what that codebase looks like.

905
00:40:59,360 --> 00:41:02,360
So for us, that's like actually crawling codebases and building

906
00:41:02,440 --> 00:41:03,880
up a bit of a map and an understanding of

907
00:41:03,960 --> 00:41:06,519
what they are so that when someone has an issue

908
00:41:06,639 --> 00:41:08,960
in an incident, we boost up our LLM with some

909
00:41:09,159 --> 00:41:11,360
like cheat sheet notes about this is how you should

910
00:41:11,400 --> 00:41:13,800
browse this thing, and that was probably the most significant

911
00:41:13,840 --> 00:41:15,840
impact. On a very large codebase, it would go

912
00:41:15,920 --> 00:41:18,159
from being like four or five minutes to answer a

913
00:41:18,280 --> 00:41:20,559
question to if you could seed it with this analysis,

914
00:41:20,880 --> 00:41:22,880
suddenly it's like thirty seconds because it goes, oh, well

915
00:41:22,920 --> 00:41:23,719
that bit's over there.

916
00:41:23,960 --> 00:41:24,119
Speaker 4: Yeah.

917
00:41:24,159 --> 00:41:26,119
Speaker 1: And that's the most ridiculous thing is that in order

918
00:41:26,119 --> 00:41:28,960
to solve your problem, you designed this technology which basically

919
00:41:29,039 --> 00:41:31,400
puts a whole fleet of other products like out of

920
00:41:31,440 --> 00:41:33,960
business because you had to do the thing that they've

921
00:41:34,000 --> 00:41:36,079
been trying to do, which is it like usually look

922
00:41:36,119 --> 00:41:38,480
at it and index all of your source code to be

923
00:41:38,519 --> 00:41:40,320
able to just find things that you're looking for in

924
00:41:40,440 --> 00:41:43,199
that source code with poorer class, et cetera. That's often

925
00:41:43,239 --> 00:41:46,400
a challenge that large companies have tried to approach, and

926
00:41:46,480 --> 00:41:49,159
in the past they've gone for tools that specifically offer

927
00:41:49,239 --> 00:41:52,199
that functionality. But in order to deliver your solution, you

928
00:41:52,320 --> 00:41:55,039
had to design a solution for that from the ground up.

929
00:41:55,400 --> 00:41:56,760
Speaker 2: And that means you can actually.

930
00:41:56,599 --> 00:41:59,920
Speaker 1: Target code searching for humans, not just for your agent,

931
00:42:00,239 --> 00:42:03,599
to be able to identify what has changed and what

932
00:42:03,679 --> 00:42:05,679
could it be impacting or creating the incident in the

933
00:42:05,719 --> 00:42:06,199
first place.

934
00:42:06,559 --> 00:42:08,760
Speaker 3: Yeah, absolutely, And I think the same goes for honestly

935
00:42:08,880 --> 00:42:11,119
understanding people's telemetry architectures.

936
00:42:11,239 --> 00:42:11,760
Speaker 4: Is the same deal.

937
00:42:12,320 --> 00:42:14,760
Speaker 3: If you want to understand how to browse someone's logs, like,

938
00:42:14,880 --> 00:42:17,119
you need to first understand what logs there are and

939
00:42:17,599 --> 00:42:19,239
try and build up a map of like what logs

940
00:42:19,280 --> 00:42:22,079
are even useful to you, and then you Whilst you

941
00:42:22,119 --> 00:42:25,440
get these LLMs, and they're huge hammers for many different tools,

942
00:42:25,719 --> 00:42:28,920
they also come with some pretty obvious and kind of

943
00:42:28,960 --> 00:42:32,000
like crippling limitations, context windows being one of them.

944
00:42:32,119 --> 00:42:34,400
Speaker 1: Yeah, I don't want to undersell that in any way.

945
00:42:34,760 --> 00:42:37,639
The context window problem is never going away. We found

946
00:42:37,639 --> 00:42:39,880
out that just you will never be able to fit

947
00:42:39,920 --> 00:42:41,079
the whole context in your window.

948
00:42:41,119 --> 00:42:42,719
Speaker 2: And actually the larger the more.

949
00:42:42,639 --> 00:42:44,840
Speaker 1: Tokens you add in, the less value each of those

950
00:42:44,840 --> 00:42:47,039
individual tokens has. So you will always have this problem

951
00:42:47,159 --> 00:42:49,719
of I have too much data to utilize how do

952
00:42:49,800 --> 00:42:50,320
I deal with it?

953
00:42:50,440 --> 00:42:52,280
Speaker 2: And the interesting thing, and I think this may be a.

954
00:42:52,320 --> 00:42:55,079
Speaker 1: Pattern for the episode, is that it turns out we've

955
00:42:55,119 --> 00:42:57,559
sort of solved this problem many times before without using

956
00:42:57,559 --> 00:42:59,320
an LLM, and if you first, you know, take that

957
00:42:59,440 --> 00:43:01,559
step to I don't want to say, like sanitize the

958
00:43:01,639 --> 00:43:04,320
data or clean it in some way before passing it in.

959
00:43:04,800 --> 00:43:06,840
Speaker 2: Then you're going to end up with a.

960
00:43:06,880 --> 00:43:09,880
Speaker 1: Much better result in a much faster time period, rather

961
00:43:09,920 --> 00:43:12,119
than trying to wait for Gemini to push out the

962
00:43:12,159 --> 00:43:15,280
next you know multimillion token context window.

963
00:43:15,400 --> 00:43:19,039
Speaker 4: Yeah, co-signed. That has absolutely been our experience with it.

964
00:43:19,239 --> 00:43:21,960
Speaker 1: Well, I like it that it's like a logical argument

965
00:43:22,039 --> 00:43:24,760
against: will LLMs get better? Yeah, probably, but increasing the

966
00:43:24,800 --> 00:43:27,760
context window isn't isn't really going to be one of them.

967
00:43:27,960 --> 00:43:29,880
Speaker 3: I think the thing that stands out to me that

968
00:43:30,159 --> 00:43:33,079
I think was not true two years ago, maybe was

969
00:43:33,159 --> 00:43:35,039
becoming true a year ago, but it's definitely true now

970
00:43:35,119 --> 00:43:36,519
is I think that the models that we have out

971
00:43:36,559 --> 00:43:39,920
there are no longer the limiting factor to us building

972
00:43:39,960 --> 00:43:43,639
these systems. So when I think about the AI SRE

973
00:43:43,960 --> 00:43:46,440
product that we have and getting it from where it

974
00:43:46,519 --> 00:43:49,360
is at the moment where it is like delivering value

975
00:43:49,400 --> 00:43:51,639
to customers, but the scope of the incidents that it

976
00:43:51,719 --> 00:43:54,800
can deal with is really like small to moderate, and we

977
00:43:54,880 --> 00:43:56,639
want to get it up to even the highest level

978
00:43:56,719 --> 00:43:59,440
of very complicated incidents and dealing with this so that

979
00:43:59,480 --> 00:44:01,719
it's active almost all the time. I do not think

980
00:44:01,800 --> 00:44:03,639
that it will be an upgrade of the frontier models

981
00:44:03,679 --> 00:44:05,719
that allows us to get from that moderate to high.

982
00:44:06,119 --> 00:44:08,199
Almost all of the value that we've managed to deliver,

983
00:44:08,440 --> 00:44:11,239
all the improvements, have been down to being more structured

984
00:44:11,280 --> 00:44:14,000
in how we think about the problem, teaching us or

985
00:44:14,239 --> 00:44:16,280
breaking our system up so that it's more modular so

986
00:44:16,400 --> 00:44:18,679
that prompts can be more focused, and then figuring out

987
00:44:18,719 --> 00:44:21,039
how we can actually make better use of the models

988
00:44:21,079 --> 00:44:21,719
that we already have.

989
00:44:21,960 --> 00:44:23,119
Speaker 4: Rather than waiting for the next.

990
00:44:23,039 --> 00:44:25,079
Speaker 1: Upgrade. Maybe we will schedule a follow-up for this

991
00:44:25,199 --> 00:44:27,000
episode a year from now. We can see whether or

992
00:44:27,000 --> 00:44:28,880
not that promise has held true.

993
00:44:29,159 --> 00:44:30,599
Speaker 4: I think you won't be the only one holding me

994
00:44:30,679 --> 00:44:32,559
to that, so that seems okay to me.

995
00:44:33,199 --> 00:44:36,599
Speaker 2: He's repeated this promise on multiple podcasts.

996
00:44:36,119 --> 00:44:39,280
Speaker 3: So if anyone is exploring this space, I think the

997
00:44:39,400 --> 00:44:42,840
key things to take away are figuring out how to actually

998
00:44:42,880 --> 00:44:45,519
objectively track whether or not what you're building is doing

999
00:44:45,719 --> 00:44:47,480
a good job, and then hopefully once you have that

1000
00:44:47,599 --> 00:44:50,199
objective measure, you can focus on trying to keep things

1001
00:44:50,239 --> 00:44:52,400
as simple as you possibly need them to be up

1002
00:44:52,480 --> 00:44:55,639
until the objective measure proves that you actually need the

1003
00:44:55,760 --> 00:44:57,880
additional complexity. And if you're doing that, then I think

1004
00:44:57,920 --> 00:44:58,960
you can't really go far wrong.

1005
00:44:59,119 --> 00:45:01,239
Speaker 1: What no one wants to listen to is that you still

1006
00:45:01,280 --> 00:45:03,400
need to do the actual hard work like that that

1007
00:45:03,480 --> 00:45:05,519
hasn't gone anywhere. You still need to you know, know

1008
00:45:05,679 --> 00:45:07,480
exactly what you're doing and do it the right way

1009
00:45:07,519 --> 00:45:09,199
and think through that problem. And it's not going to

1010
00:45:09,280 --> 00:45:11,400
magically land on your lap somewhere and you'll be able

1011
00:45:11,440 --> 00:45:11,920
to push it out.

1012
00:45:12,119 --> 00:45:13,920
Speaker 3: Oh yeah, And like there is, there is just a

1013
00:45:14,039 --> 00:45:16,159
huge amount of time that you need to spend using

1014
00:45:16,199 --> 00:45:18,440
these models and building systems like these to build the

1015
00:45:18,559 --> 00:45:21,960
intuition that allows you to see, actually there's this evolution

1016
00:45:22,079 --> 00:45:23,519
to what we're doing at the moment that can get

1017
00:45:23,599 --> 00:45:26,039
us to the next step. But yeah, there is no

1018
00:45:26,199 --> 00:45:28,360
shortcut that I know of yet to get yourself to

1019
00:45:28,440 --> 00:45:30,199
the place where you have that intuition other than just

1020
00:45:30,599 --> 00:45:32,360
sheer hard work and time thinking about it.

1021
00:45:32,480 --> 00:45:34,599
Speaker 2: And here I was thinking I could retire soon. Okay,

1022
00:45:34,920 --> 00:45:38,159
So with that, that's when we move to picks.

1023
00:45:38,599 --> 00:45:40,480
Speaker 1: Well, I'll go first. What I brought today, and

1024
00:45:40,679 --> 00:45:42,440
obviously you have to be watching the YouTube channel in

1025
00:45:42,519 --> 00:45:42,920
order to see this.

1026
00:45:43,039 --> 00:45:45,800
Speaker 2: It's an Anker PowerPort Atom III.

1027
00:45:46,199 --> 00:45:50,199
Speaker 1: It's got some USB, USB-A ports, and it's got this

1028
00:45:50,400 --> 00:45:53,119
nice little adapter in the back for our two pronged

1029
00:45:53,559 --> 00:45:56,400
wall socket I don't carry. The reason I really like

1030
00:45:56,519 --> 00:45:59,639
this is besides it's durable, it's light, and it's small.

1031
00:46:00,000 --> 00:46:02,440
I travel a lot, and I don't like bringing around like

1032
00:46:02,719 --> 00:46:05,320
the plugs to stick in to charge my USB stuff

1033
00:46:05,920 --> 00:46:08,039
and having to plug them in, and after doing that

1034
00:46:08,119 --> 00:46:09,960
for a lot of years, they're always like, really crap,

1035
00:46:10,000 --> 00:46:14,119
or really expensive, like it's fifty, one hundred dollars, francs,

1036
00:46:14,239 --> 00:46:17,480
whatever for actually good quality ones that are now able

1037
00:46:17,519 --> 00:46:18,440
to charge my laptop.

1038
00:46:18,760 --> 00:46:20,239
Speaker 2: And it's just a real waste. And I found like

1039
00:46:20,239 --> 00:46:21,960
the cheaper ones they just break all the time. They're

1040
00:46:22,000 --> 00:46:22,599
so unreliable.

1041
00:46:22,599 --> 00:46:24,320
Speaker 1: And I don't want to be on vacation or traveling

1042
00:46:24,400 --> 00:46:28,239
for work and have my USB power AC adapter break

1043
00:46:28,320 --> 00:46:30,280
on me. So what I started doing is I just

1044
00:46:30,320 --> 00:46:34,360
carry this thing around and I buy the cord that

1045
00:46:34,480 --> 00:46:36,559
actually fits for the country that I'm going to, so

1046
00:46:36,639 --> 00:46:38,280
I have like a whole bunch of cords that match,

1047
00:46:38,400 --> 00:46:41,079
rather than having to play around with like wall socket

1048
00:46:41,159 --> 00:46:44,679
adapter AC adapters for the different sizes and it you know,

1049
00:46:44,719 --> 00:46:46,599
there's a transformer in here, so it's like really just

1050
00:46:46,639 --> 00:46:49,360
a cord. And like I was in it was in

1051
00:46:49,400 --> 00:46:51,119
Thailand a couple of years ago and we were on

1052
00:46:51,280 --> 00:46:56,239
vacation and the power socket didn't hold the plug in

1053
00:46:56,840 --> 00:46:59,519
like, the adapter was just falling out of the wall.

1054
00:46:59,559 --> 00:47:01,320
We had to keep it to it in order to

1055
00:47:01,639 --> 00:47:04,159
actually make it charge anything. And after that, I'm just like,

1056
00:47:04,280 --> 00:47:06,360
I'm done. You have a cord that comes out of it.

1057
00:47:06,400 --> 00:47:08,440
It's like really light and it stays in no problem.

1058
00:47:08,519 --> 00:47:10,000
So I absolutely love this thing.

1059
00:47:10,119 --> 00:47:11,280
Speaker 2: It's fantastic great.

1060
00:47:11,400 --> 00:47:15,119
Speaker 4: I mean, I'm going to buy one. So I came.

1061
00:47:15,360 --> 00:47:17,719
Speaker 3: I came. I came fairly unprepared for this when I

1062
00:47:17,840 --> 00:47:20,679
first jumped into the podcast. So I think I've got

1063
00:47:20,760 --> 00:47:23,880
I've got two picks. One of them is mostly silly,

1064
00:47:24,599 --> 00:47:26,679
which is that we have a team here of people

1065
00:47:26,719 --> 00:47:29,480
who love or like fidget toys, and one of the

1066
00:47:29,519 --> 00:47:32,159
most popular has been this Roctopus, which is a three

1067
00:47:32,239 --> 00:47:36,119
D printed model of the Rock with octopus legs, which

1068
00:47:36,159 --> 00:47:39,159
you can get on Etsy. That is endlessly entertaining for half

1069
00:47:39,199 --> 00:47:41,840
the team going through this, So you check that out.

1070
00:47:42,119 --> 00:47:43,320
And the other pick that I have is a lot

1071
00:47:43,400 --> 00:47:45,800
more serious or a lot more relevant to the discussion

1072
00:47:45,840 --> 00:47:48,000
that we had, which is if you haven't ever read it,

1073
00:47:48,280 --> 00:47:51,559
it's a book called The Checklist Manifesto, which is about

1074
00:47:51,599 --> 00:47:54,880
how checklists have been adopted in areas of medicine and

1075
00:47:55,039 --> 00:47:58,039
also other areas such as aviation, and kind of the

1076
00:47:58,079 --> 00:48:00,800
gradual realization of how to build a good checklist and

1077
00:48:01,519 --> 00:48:04,639
what's required to make it good, which has some direct

1078
00:48:05,079 --> 00:48:07,360
relevance to everything that we were talking about with runbooks,

1079
00:48:07,559 --> 00:48:10,199
particularly around the idea that until you have written the

1080
00:48:10,280 --> 00:48:13,079
checklist and then run it yourself, the checklist is probably wrong.

1081
00:48:13,280 --> 00:48:16,480
And yeah, I got recommended this by VP of Engineering

1082
00:48:16,639 --> 00:48:19,239
and Niberte, who I've worked with for many years, who used

1083
00:48:19,239 --> 00:48:21,800
to be an SRE at Blizzard and loved this book just

1084
00:48:21,840 --> 00:48:24,760
because of how relevant it was to SRE and everything

1085
00:48:24,800 --> 00:48:27,000
else that you might do in DevOps. So you can

1086
00:48:27,039 --> 00:48:28,880
get that on Amazon. I would check it out, The

1087
00:48:28,960 --> 00:48:29,920
Checklist Manifesto.

1088
00:48:30,400 --> 00:48:32,840
Speaker 2: Yeah, the links will be down below.

1089
00:48:32,920 --> 00:48:36,760
Speaker 1: In the episode description. I will ask about the three D printed octopus.

1090
00:48:36,840 --> 00:48:37,440
Speaker 3: Is it is it like?

1091
00:48:37,719 --> 00:48:39,239
Speaker 2: Would it didn't look like Maybe.

1092
00:48:39,000 --> 00:48:42,119
Speaker 4: It's just some sort of it's actually it's just it's

1093
00:48:42,239 --> 00:48:44,960
just plastic. I see. And it's the Roctopus

1094
00:48:45,000 --> 00:48:48,039
because it is the Rock's head on top. It's

1095
00:48:48,119 --> 00:48:51,360
just endlessly entertaining. You'd be really surprised.

1096
00:48:51,880 --> 00:48:54,559
Speaker 1: Is there like a particular model that you can go

1097
00:48:54,920 --> 00:48:58,320
and download that from the internet to go and run

1098
00:48:58,400 --> 00:49:00,679
with it, or there's like it's gone and there's like

1099
00:49:00,760 --> 00:49:02,119
lots of different versions available.

1100
00:49:02,639 --> 00:49:04,360
Speaker 3: I would imagine it's gone viral at this point, but

1101
00:49:04,400 --> 00:49:05,880
I will find out and make sure it goes in

1102
00:49:05,960 --> 00:49:06,280
the links.

1103
00:49:06,719 --> 00:49:09,559
Speaker 1: That's a plug for if you are designing swag and

1104
00:49:09,639 --> 00:49:12,360
you want to hand stuff out at conferences, don't go

1105
00:49:12,519 --> 00:49:14,840
to a third party manufacturer. Just go and buy a

1106
00:49:14,880 --> 00:49:17,519
three D printer and your employees will just go make

1107
00:49:17,599 --> 00:49:20,199
stuff and you can go. It will be really unique, right,

1108
00:49:20,280 --> 00:49:22,719
That's unique swag to give away or just use

1109
00:49:22,719 --> 00:49:23,440
around the office.

1110
00:49:24,400 --> 00:49:26,639
Speaker 4: It really is unique. Whether or not you see

1111
00:49:26,679 --> 00:49:28,960
that as a plus or a minus is really up to you.

1112
00:49:29,320 --> 00:49:32,280
Speaker 1: Well, thank you, Lawrence for this awesome episode. It's been

1113
00:49:32,320 --> 00:49:35,079
great so far, and I hope that we'll see you

1114
00:49:35,159 --> 00:49:38,719
again back on this show maybe in the years to come.

1115
00:49:38,800 --> 00:49:42,719
Speaker 2: With the updates on agentic evolution in the industry.

1116
00:49:42,960 --> 00:49:43,159
Speaker 4: Yeah.

1117
00:49:43,239 --> 00:49:45,119
Speaker 3: Well, thank you very much for having me on and

1118
00:49:45,199 --> 00:49:46,960
hopefully it was interesting to anyone listening.

1119
00:49:47,480 --> 00:49:51,880
Speaker 2: Yeah, and to all our listeners, we'll see you again next week.

