WEBVTT

1
00:00:02.279 --> 00:00:05.919
<v Speaker 1>Welcome back, everyone, to another episode of Adventures in DevOps.

2
00:00:06.000 --> 00:00:09.240
<v Speaker 1>And I am really excited today because we're going to

3
00:00:09.320 --> 00:00:12.199
<v Speaker 1>jump into one of the areas that I find personally

4
00:00:12.199 --> 00:00:15.800
<v Speaker 1>really interesting, but also our guest has worked at a

5
00:00:15.880 --> 00:00:19.120
<v Speaker 1>number of companies in areas that I feel like lots

6
00:00:19.160 --> 00:00:21.120
<v Speaker 1>of companies get wrong. So I just want to welcome

7
00:00:21.120 --> 00:00:24.480
<v Speaker 1>to the show Sylvain from Rootly, who is the

8
00:00:24.519 --> 00:00:26.800
<v Speaker 1>Head of Developer Relations. Hey, how are you doing?

9
00:00:27.760 --> 00:00:29.640
<v Speaker 2>Thank you for having me. We're all good.

10
00:00:30.839 --> 00:00:33.840
<v Speaker 1>Good to hear. So, I said Head of Developer Relations,

11
00:00:33.880 --> 00:00:35.560
<v Speaker 1>and I've got to be honest, I feel like a

12
00:00:35.560 --> 00:00:38.759
<v Speaker 1>lot of companies have started to overutilize this term

13
00:00:38.799 --> 00:00:43.200
<v Speaker 1>to mean a wide variety of different roles and responsibilities.

14
00:00:43.560 --> 00:00:45.200
<v Speaker 1>Can you give me like a breakdown like what it

15
00:00:45.240 --> 00:00:46.479
<v Speaker 1>means for you today?

16
00:00:47.359 --> 00:00:51.759
<v Speaker 2>Yeah, absolutely. Actually, it's a great question, because I have

17
00:00:52.560 --> 00:00:56.479
<v Speaker 2>just posted about it on LinkedIn two days ago. When I

18
00:00:56.640 --> 00:01:02.520
<v Speaker 2>was hired, and historically, the role of developer relations is

19
00:01:02.600 --> 00:01:11.319
<v Speaker 2>to empower developers to use a product, a developer tool, by

20
00:01:11.359 --> 00:01:16.560
<v Speaker 2>providing educational resources, by answering any questions they may have,

21
00:01:17.439 --> 00:01:24.120
<v Speaker 2>and then just overall marketing, you know, the product in

22
00:01:24.159 --> 00:01:28.079
<v Speaker 2>a way that fits engineers, which is not product marketing, right,

23
00:01:28.120 --> 00:01:31.359
<v Speaker 2>it's all about what you get out of it. Don't

24
00:01:31.439 --> 00:01:36.640
<v Speaker 2>tell, show me. So it's tutorials, talks, you know,

25
00:01:36.640 --> 00:01:39.560
<v Speaker 2>in the form of articles, YouTube videos, and so on and

26
00:01:39.560 --> 00:01:44.000
<v Speaker 2>so forth. But as I joined Rootly, it was clear that

27
00:01:45.200 --> 00:01:50.200
<v Speaker 2>AI, and more specifically here LLMs, are the future of

28
00:01:50.719 --> 00:01:55.040
<v Speaker 2>incident management. For those who don't know Rootly, Rootly is

29
00:01:55.480 --> 00:01:59.719
<v Speaker 2>an on-call and incident management platform. So when something

30
00:01:59.760 --> 00:02:03.040
<v Speaker 2>breaks and you have people on call who need to

31
00:02:03.079 --> 00:02:06.920
<v Speaker 2>respond to the incident, that's where they go to manage

32
00:02:06.959 --> 00:02:11.560
<v Speaker 2>the incident up to it being solved. So AI plays

33
00:02:11.560 --> 00:02:13.280
<v Speaker 2>a big role in that, and we'll speak about it

34
00:02:13.360 --> 00:02:17.360
<v Speaker 2>during this episode. And most of my time actually at

35
00:02:17.400 --> 00:02:20.520
<v Speaker 2>Rootly, most of my energy, I would say like maybe

36
00:02:20.759 --> 00:02:25.639
<v Speaker 2>seventy-five percent, has been dedicated to agent, AI agent

37
00:02:25.719 --> 00:02:34.319
<v Speaker 2>relations, because we can see an AI agent as just another

38
00:02:34.479 --> 00:02:38.599
<v Speaker 2>member of the team, and this agent also needs to

39
00:02:38.639 --> 00:02:42.000
<v Speaker 2>be taught and onboarded, just in a way that's different

40
00:02:42.039 --> 00:02:45.759
<v Speaker 2>from humans. So I would say, while I'm the head

41
00:02:45.759 --> 00:02:49.479
<v Speaker 2>of developer relations, I can also say I'm the head of

42
00:02:50.080 --> 00:02:51.400
<v Speaker 2>AI agent relations.

43
00:02:52.560 --> 00:02:54.240
<v Speaker 1>Well, that's definitely a wide area.

44
00:02:54.280 --> 00:02:54.439
<v Speaker 2>You know.

45
00:02:54.599 --> 00:02:57.840
<v Speaker 1>I'm also interested because, you mentioned this, that

46
00:02:57.879 --> 00:03:00.719
<v Speaker 1>you're not doing product marketing, you're still marketing in some

47
00:03:00.800 --> 00:03:06.120
<v Speaker 1>way to engineers. Uh, they're notoriously the biggest challenge to

48
00:03:06.319 --> 00:03:08.840
<v Speaker 1>get engineers on board with whatever you're trying to sell.

49
00:03:08.879 --> 00:03:11.800
<v Speaker 1>I mean I found of all groups of people, uh

50
00:03:12.479 --> 00:03:15.599
<v Speaker 1>even ones in the technology space, I feel like engineers

51
00:03:15.599 --> 00:03:19.120
<v Speaker 1>always want to do things themselves, right?

52
00:03:19.000 --> 00:03:23.240
<v Speaker 2>It is. And with Rootly we are targeting SREs, site

53
00:03:23.240 --> 00:03:28.319
<v Speaker 2>reliability engineers, who are even more skeptical and hard

54
00:03:28.360 --> 00:03:30.919
<v Speaker 2>to convince, and for good reason, right? Their job

55
00:03:30.960 --> 00:03:36.120
<v Speaker 2>is to ensure that the infrastructure is rolling smoothly and

56
00:03:36.159 --> 00:03:39.039
<v Speaker 2>in an optimized fashion. And so you want to be careful

57
00:03:39.439 --> 00:03:42.680
<v Speaker 2>with the tool and the framework, a new framework or

58
00:03:42.800 --> 00:03:47.479
<v Speaker 2>new tool or methodology that might bring chaos

59
00:03:47.560 --> 00:03:52.599
<v Speaker 2>or instability. And uh, you know, so I used to

60
00:03:52.639 --> 00:03:58.159
<v Speaker 2>be an SRE myself. Back then SRE was not truly

61
00:03:58.199 --> 00:04:00.680
<v Speaker 2>a thing yet. I was working for SlideShare as

62
00:04:00.719 --> 00:04:04.840
<v Speaker 2>a DevOps engineer. We were in the top fifteen most

63
00:04:04.919 --> 00:04:10.400
<v Speaker 2>visited websites in the world back then, displaying about one

64
00:04:10.439 --> 00:04:14.039
<v Speaker 2>point five billion slides a day, which you know was

65
00:04:14.080 --> 00:04:17.839
<v Speaker 2>definitely a large volume. And we got acquired by LinkedIn, where

66
00:04:17.879 --> 00:04:20.439
<v Speaker 2>I worked as a senior SRE, this time for three

67
00:04:20.519 --> 00:04:23.120
<v Speaker 2>years, and you can imagine the volume. So I've been

68
00:04:23.279 --> 00:04:26.040
<v Speaker 2>on the side, on the engineering side, and so I

69
00:04:26.079 --> 00:04:29.920
<v Speaker 2>completely, you know, get the persona. And I think at

70
00:04:29.959 --> 00:04:32.000
<v Speaker 2>the end of the day, it's just that these people

71
00:04:32.040 --> 00:04:35.240
<v Speaker 2>they don't want to waste time with marketing copy. They

72
00:04:35.240 --> 00:04:39.240
<v Speaker 2>want to understand what's in it for them and their job, right?

73
00:04:40.079 --> 00:04:43.279
<v Speaker 2>So for me, it comes, you know, very naturally. I

74
00:04:43.279 --> 00:04:46.000
<v Speaker 2>don't think it's a challenge, but I think for someone

75
00:04:46.040 --> 00:04:49.360
<v Speaker 2>who does not come from an engineering background and doesn't

76
00:04:49.399 --> 00:04:53.759
<v Speaker 2>have as good of an understanding as you know, an

77
00:04:53.759 --> 00:04:56.120
<v Speaker 2>engineer may have, it may be hard to communicate to

78
00:04:56.160 --> 00:04:59.560
<v Speaker 2>this audience. That's where, you know, I think this thing might

79
00:04:59.560 --> 00:05:01.560
<v Speaker 2>come from.

80
00:05:01.639 --> 00:05:05.560
<v Speaker 1>Yeah, no, I totally got it. I know I'm gonna

81
00:05:05.639 --> 00:05:07.800
<v Speaker 1>I really want to dive into the core topic, which

82
00:05:07.839 --> 00:05:10.639
<v Speaker 1>is self-healing systems. And we were talking a little

83
00:05:10.639 --> 00:05:12.680
<v Speaker 1>bit before the episode started about like this has been

84
00:05:12.720 --> 00:05:16.800
<v Speaker 1>your, like, lifelong project area. How did you get

85
00:05:16.839 --> 00:05:18.800
<v Speaker 1>into this? I would say it's the first thing, like

86
00:05:18.879 --> 00:05:21.560
<v Speaker 1>was this, like, you always knew this was

87
00:05:21.600 --> 00:05:23.959
<v Speaker 1>the thing you're going to go into, and like how

88
00:05:23.959 --> 00:05:25.519
<v Speaker 1>long you've been doing this and what does that really

89
00:05:25.560 --> 00:05:27.600
<v Speaker 1>mean to work in self-healing systems?

90
00:05:28.360 --> 00:05:31.399
<v Speaker 2>Yeah, so it goes back to actually I started the

91
00:05:31.439 --> 00:05:34.920
<v Speaker 2>project when I was working for SlideShare as

92
00:05:34.959 --> 00:05:40.120
<v Speaker 2>a DevOps engineer slash site reliability engineer, and

93
00:05:40.319 --> 00:05:42.759
<v Speaker 2>I was on call and I had to you know,

94
00:05:42.839 --> 00:05:47.199
<v Speaker 2>manage outages. Back then we were using Puppet as a

95
00:05:47.240 --> 00:05:51.800
<v Speaker 2>way to ensure that the infrastructure was as it should be. And

96
00:05:51.839 --> 00:05:55.879
<v Speaker 2>you know, ultimately there were a lot of repeat incidents

97
00:05:56.040 --> 00:06:00.920
<v Speaker 2>or a lot of outages or, you know, issues that

98
00:06:01.959 --> 00:06:05.720
<v Speaker 2>were coming from the same type of problems. I think,

99
00:06:06.560 --> 00:06:09.639
<v Speaker 2>you know, as engineers grow in their careers,

100
00:06:09.639 --> 00:06:13.439
<v Speaker 2>they kind of know what are the main failure types

101
00:06:14.639 --> 00:06:18.360
<v Speaker 2>and so, you know, like, I think it

102
00:06:18.360 --> 00:06:21.000
<v Speaker 2>gets repetitive, and I think a great engineer wants

103
00:06:21.079 --> 00:06:25.040
<v Speaker 2>to automate itself and not do the same thing over

104
00:06:25.079 --> 00:06:30.000
<v Speaker 2>and over again. And so yeah, that's where the idea

105
00:06:29.279 --> 00:06:35.879
<v Speaker 2>of building a self-healing system came about. So maybe just

106
00:06:35.839 --> 00:06:37.839
<v Speaker 1>to start off, like you said, you would see the same

107
00:06:37.879 --> 00:06:40.399
<v Speaker 1>sort of regressions over and over again. Was there like

108
00:06:40.920 --> 00:06:43.360
<v Speaker 1>one in your mind that was just like the most

109
00:06:43.360 --> 00:06:45.839
<v Speaker 1>common where it was like every single time it happened. It

110
00:06:45.879 --> 00:06:47.680
<v Speaker 1>was like the driver for you to make a real

111
00:06:47.800 --> 00:06:49.040
<v Speaker 1>change in an organization.

112
00:06:50.160 --> 00:06:52.480
<v Speaker 2>Yeah, you know, I think. I mean, I know it's

113
00:06:52.519 --> 00:06:55.959
<v Speaker 2>been like more than a decade, so you know, I

114
00:06:56.000 --> 00:06:59.199
<v Speaker 2>won't have, like, a super sharp example, but I think the

115
00:06:59.519 --> 00:07:03.600
<v Speaker 2>classic ones are, you know, issues with lack of resources,

116
00:07:03.600 --> 00:07:08.160
<v Speaker 2>whether it's you know, storage or CPU or memory, and

117
00:07:08.759 --> 00:07:13.319
<v Speaker 2>you need either to decrease the load somehow or

118
00:07:13.879 --> 00:07:18.360
<v Speaker 2>distribute or scale. Could be like a service that's misbehaving

119
00:07:18.439 --> 00:07:22.800
<v Speaker 2>and you need to restart. Could be a lot of

120
00:07:22.839 --> 00:07:27.079
<v Speaker 2>things, I think. So I think the industry took

121
00:07:27.160 --> 00:07:30.480
<v Speaker 2>a different route. I think now with Kubernetes, what we do

122
00:07:30.519 --> 00:07:35.079
<v Speaker 2>is that if this is misbehaving, we just shut

123
00:07:35.079 --> 00:07:37.160
<v Speaker 2>it down, right, we get rid of it and we

124
00:07:37.480 --> 00:07:41.839
<v Speaker 2>start a new one. And obviously Kubernetes is great at scaling,

125
00:07:43.000 --> 00:07:48.040
<v Speaker 2>so I think this, too, like the self-healing system,

126
00:07:48.160 --> 00:07:50.680
<v Speaker 2>it still works in this way, right? Like, by nuking

127
00:07:50.759 --> 00:07:54.480
<v Speaker 2>things or scaling things, you can heal a system.

128
00:07:55.199 --> 00:07:57.680
<v Speaker 2>I think in my mind I wanted to take a

129
00:07:57.680 --> 00:08:03.040
<v Speaker 2>different approach where I was trying, I was envisioning a

130
00:08:03.079 --> 00:08:08.480
<v Speaker 2>system that would actually address the root cause, I think,

131
00:08:08.519 --> 00:08:10.639
<v Speaker 2>in some of the cases, you know, instead of just

132
00:08:10.720 --> 00:08:14.120
<v Speaker 2>nuking this thing, like, really try to mimic what

133
00:08:15.360 --> 00:08:19.240
<v Speaker 2>a human engineer would do. So, Yeah, that's that's kind

134
00:08:19.240 --> 00:08:21.759
<v Speaker 2>of the philosophy that I had back then.

135
00:08:23.360 --> 00:08:26.279
<v Speaker 1>I mean, there's definitely a huge population of engineers who

136
00:08:26.360 --> 00:08:29.360
<v Speaker 1>think that what they would do in those examples

137
00:08:29.399 --> 00:08:32.360
<v Speaker 1>would be for sure to restart the machine or the

138
00:08:32.399 --> 00:08:34.639
<v Speaker 1>container or the node if it started to run out

139
00:08:34.679 --> 00:08:38.600
<v Speaker 1>of memory or processing power. And I feel like that's

140
00:08:38.639 --> 00:08:40.200
<v Speaker 1>sort of the crux of one of the issues that

141
00:08:40.240 --> 00:08:42.840
<v Speaker 1>I've seen over and over again is that we do

142
00:08:42.919 --> 00:08:47.039
<v Speaker 1>build those systems, and I say we as collective humanity

143
00:08:47.159 --> 00:08:52.399
<v Speaker 1>and not at my current company, that automatically restart or

144
00:08:52.600 --> 00:08:55.320
<v Speaker 1>you know, allocate more memory or processing power. And I

145
00:08:55.320 --> 00:08:58.840
<v Speaker 1>feel like the automatic scale-out or scale-up

146
00:09:00.080 --> 00:09:02.799
<v Speaker 1>for resources can make sense if it doesn't create a

147
00:09:02.840 --> 00:09:06.279
<v Speaker 1>negative impact on the feedback loop that you have to

148
00:09:06.320 --> 00:09:07.639
<v Speaker 1>solve the problem. And I feel like this is one

149
00:09:07.679 --> 00:09:10.120
<v Speaker 1>of the problems with automatic restarts is that it doesn't

150
00:09:10.159 --> 00:09:13.559
<v Speaker 1>really solve the problem. It still is persistent there, it's

151
00:09:13.559 --> 00:09:17.360
<v Speaker 1>going to keep happening, and also you're delaying actually doing

152
00:09:17.399 --> 00:09:20.240
<v Speaker 1>the investigation, and you're also eliminating some of the evidence

153
00:09:20.559 --> 00:09:23.200
<v Speaker 1>that would allow you to identify the problem there. So

154
00:09:23.240 --> 00:09:25.720
<v Speaker 1>it's really great to hear that you know, you thought

155
00:09:25.799 --> 00:09:28.480
<v Speaker 1>that the appropriate process was, you know, go find out, like,

156
00:09:28.519 --> 00:09:31.279
<v Speaker 1>why is there extra memory usage? Why is you know,

157
00:09:31.279 --> 00:09:33.879
<v Speaker 1>the machine getting stuck, et cetera, et cetera. And I

158
00:09:33.879 --> 00:09:36.840
<v Speaker 1>feel like that's sort of the thing that sets apart

159
00:09:36.919 --> 00:09:40.480
<v Speaker 1>the best SREs from the ones that are just coming

160
00:09:40.480 --> 00:09:43.320
<v Speaker 1>into, quote unquote, do the job.

161
00:09:43.519 --> 00:09:45.879
<v Speaker 2>It is. I think then you need to strike the right balance

162
00:09:45.919 --> 00:09:50.559
<v Speaker 2>between achieving the end results, which is capability, stability, and

163
00:09:50.679 --> 00:09:53.279
<v Speaker 2>if restarting is the way to go, and you know,

164
00:09:53.320 --> 00:09:56.879
<v Speaker 2>you don't need to spend engineering resources. And obviously, I

165
00:09:56.919 --> 00:10:01.840
<v Speaker 2>mean, it's working, so, you know, I don't think it's

166
00:10:01.879 --> 00:10:04.080
<v Speaker 2>a valid issue. But yeah, as you said, you

167
00:10:04.120 --> 00:10:08.200
<v Speaker 2>need to strike the right balance between just

168
00:10:09.639 --> 00:10:13.639
<v Speaker 2>doing this over and over, and if it's a repeat, like,

169
00:10:13.759 --> 00:10:19.559
<v Speaker 2>outage or issue, you know, it might need investigation. And so

170
00:10:21.000 --> 00:10:25.879
<v Speaker 2>the idea that I had back in two thousand,

171
00:10:26.039 --> 00:10:31.120
<v Speaker 2>I think I started in twelve, thirteen, was to really

172
00:10:31.919 --> 00:10:37.879
<v Speaker 2>ingest as much data as you can from a distributed system,

173
00:10:38.120 --> 00:10:45.320
<v Speaker 2>you know, whether it's any logs, any metrics, application logs, traces,

174
00:10:46.159 --> 00:10:48.600
<v Speaker 2>and ingest all of this in the database. Back then

175
00:10:48.639 --> 00:10:52.879
<v Speaker 2>we were using Fluentd, which is an open source message,

176
00:10:52.879 --> 00:10:55.559
<v Speaker 2>but it still exists, and actually it's very popular. Back

177
00:10:55.559 --> 00:10:58.120
<v Speaker 2>then we were, with SlideShare, one of the

178
00:10:58.240 --> 00:11:02.519
<v Speaker 2>first main users, big users there. Actually, shout out to

179
00:11:02.519 --> 00:11:04.879
<v Speaker 2>the team if they are listening. And now they went

180
00:11:05.000 --> 00:11:07.840
<v Speaker 2>very far with this technology. And store all of this

181
00:11:07.960 --> 00:11:13.240
<v Speaker 2>in an unstructured database. Back then I bet on

182
00:11:13.360 --> 00:11:15.679
<v Speaker 2>MongoDB. It doesn't really matter the technology, but that's

183
00:11:15.799 --> 00:11:18.679
<v Speaker 2>what we used for the prototype. And then based on

184
00:11:18.759 --> 00:11:24.120
<v Speaker 2>that come up with like a state of a system

185
00:11:24.320 --> 00:11:29.919
<v Speaker 2>and then, like, try to resolve this

186
00:11:30.080 --> 00:11:34.799
<v Speaker 2>issue by throwing at the system a bunch of actions

187
00:11:34.799 --> 00:11:38.279
<v Speaker 2>that would be safe. You know, I'm speaking like any

188
00:11:38.320 --> 00:11:41.360
<v Speaker 2>action like, like an rm action or something like, you know,

189
00:11:41.440 --> 00:11:45.200
<v Speaker 2>drop database, like, you need to be careful. But

190
00:11:45.200 --> 00:11:50.519
<v Speaker 2>a set of safe actions, and then use machine

191
00:11:50.600 --> 00:11:55.720
<v Speaker 2>learning as an engine to learn basically what, depending on

192
00:11:55.799 --> 00:11:58.960
<v Speaker 2>the state of the system, what could solve the issue.

193
00:12:00.679 --> 00:12:05.320
<v Speaker 2>And so we designed this for a web distributed infrastructure.

194
00:12:05.440 --> 00:12:10.200
<v Speaker 2>I actually continued this work at LinkedIn and they asked me

195
00:12:10.279 --> 00:12:16.480
<v Speaker 2>to write a patent, which eventually was accepted. But yeah,

196
00:12:16.480 --> 00:12:19.759
<v Speaker 2>I never got a chance to build it, unfortunately, because

197
00:12:19.759 --> 00:12:23.039
<v Speaker 2>then I left to become an entrepreneur. But that's why,

198
00:12:23.360 --> 00:12:26.559
<v Speaker 2>you know, I was telling you I have been swimming in

199
00:12:26.639 --> 00:12:28.360
<v Speaker 2>this topic for a little while.

200
00:12:28.799 --> 00:12:31.000
<v Speaker 1>Yeah, I mean I still want to go further into that.

201
00:12:31.120 --> 00:12:34.759
<v Speaker 1>So just to summarize a little bit, the strategy is,

202
00:12:35.320 --> 00:12:38.519
<v Speaker 1>we're collecting tons of logs, maybe metrics, et cetera. We

203
00:12:38.639 --> 00:12:41.919
<v Speaker 1>maybe have access to the source code, and we train

204
00:12:42.080 --> 00:12:45.240
<v Speaker 1>on that data to identify based off what sort of

205
00:12:45.320 --> 00:12:49.039
<v Speaker 1>errors we're actually seeing, how to pinpoint potentially is it

206
00:12:49.080 --> 00:12:51.799
<v Speaker 1>a part of the source code or the infrastructure which

207
00:12:51.799 --> 00:12:55.559
<v Speaker 1>could be problematic, and then utilizing that on errors that

208
00:12:55.600 --> 00:12:58.679
<v Speaker 1>actually do come out of the system to help dive

209
00:12:58.720 --> 00:13:01.159
<v Speaker 1>in to identify the cause or does it go further

210
00:13:01.200 --> 00:13:01.440
<v Speaker 1>than that?

211
00:13:02.000 --> 00:13:05.039
<v Speaker 2>Yes, I think that's a good point if we speak

212
00:13:05.039 --> 00:13:09.000
<v Speaker 2>about it. I think back then I

213
00:13:09.120 --> 00:13:15.519
<v Speaker 2>directly dove into resolution. I was a young engineer, you know,

214
00:13:15.559 --> 00:13:19.200
<v Speaker 2>I was like twenty five, twenty six, maybe even younger

215
00:13:19.240 --> 00:13:23.399
<v Speaker 2>than this, so I was not really mature. But I

216
00:13:23.759 --> 00:13:29.519
<v Speaker 2>think starting with the root cause analysis is

217
00:13:29.559 --> 00:13:33.679
<v Speaker 2>the right approach. Obviously, you know, now with hindsight it

218
00:13:33.759 --> 00:13:37.360
<v Speaker 2>makes sense, but my goal was really resolution, which

219
00:13:37.440 --> 00:13:40.919
<v Speaker 2>is ultimately, you know, where you want to go. So yeah,

220
00:13:40.960 --> 00:13:44.960
<v Speaker 2>I was really, really focusing on direct resolution, and

221
00:13:45.159 --> 00:13:49.320
<v Speaker 2>it would be a mix of kind of runbooks

222
00:13:49.840 --> 00:13:52.799
<v Speaker 2>that, you know, we could feed, but

223
00:13:52.960 --> 00:14:02.039
<v Speaker 2>then, more interestingly, a set of safe actions, safe commands, that

224
00:14:02.080 --> 00:14:06.080
<v Speaker 2>the system could run and could see if this solved

225
00:14:06.080 --> 00:14:10.320
<v Speaker 2>the issue, and then kind of do like a learn

226
00:14:10.440 --> 00:14:13.039
<v Speaker 2>from it, you know, maybe try something. It doesn't work,

227
00:14:13.080 --> 00:14:16.559
<v Speaker 2>it's fine, you know, we just ditch this kind of

228
00:14:17.039 --> 00:14:20.200
<v Speaker 2>path, this instruction set, as an option, but sometimes it

229
00:14:20.240 --> 00:14:23.720
<v Speaker 2>will succeed and it will use this for the next incident.

230
00:14:24.120 --> 00:14:27.200
<v Speaker 2>And all of this would be based on machine learning obviously,

231
00:14:27.240 --> 00:14:30.399
<v Speaker 2>like you know, like the more success you have with

232
00:14:30.440 --> 00:14:32.840
<v Speaker 2>an instruction, the more you are likely to use it

233
00:14:32.919 --> 00:14:35.879
<v Speaker 2>next time. Back then, machine learning was not nearly as

234
00:14:35.960 --> 00:14:41.519
<v Speaker 2>advanced as it is today, you know, so it

235
00:14:41.600 --> 00:14:44.039
<v Speaker 2>was hard like to achieve this goal. And I think

236
00:14:45.080 --> 00:14:48.799
<v Speaker 2>the industry, I don't think anyone built this type of

237
00:14:48.879 --> 00:14:55.159
<v Speaker 2>system until, like, now. You had one player that did

238
00:14:55.200 --> 00:14:59.639
<v Speaker 2>something similar, which is Facebook. They built a system that's

239
00:14:59.679 --> 00:15:05.679
<v Speaker 2>called FBAR, F-B-A-R, that they

240
00:15:05.759 --> 00:15:10.519
<v Speaker 2>define as self-healing. It was to manage Facebook data center racks,

241
00:15:10.759 --> 00:15:14.840
<v Speaker 2>so it was not systems but racks, where it would

242
00:15:15.519 --> 00:15:20.960
<v Speaker 2>automatically perform actions to solve some production issues. But

243
00:15:21.039 --> 00:15:25.720
<v Speaker 2>it was deterministic, so it was not, well, no,

244
00:15:25.720 --> 00:15:30.240
<v Speaker 2>there was no machine learning used in it. And then Dropbox

245
00:15:30.320 --> 00:15:38.679
<v Speaker 2>in twenty sixteen presented Naoru at SREcon, and this was

246
00:15:38.799 --> 00:15:44.200
<v Speaker 2>a self-healing system, but for distributed systems, this

247
00:15:44.279 --> 00:15:48.039
<v Speaker 2>time for web infrastructure, but same here, it was deterministic.

248
00:15:48.120 --> 00:15:50.679
<v Speaker 2>So, you know, I think these systems have been

249
00:15:50.720 --> 00:15:53.919
<v Speaker 2>around for like more than a decade, and have been

250
00:15:53.960 --> 00:15:58.120
<v Speaker 2>producing value. I know they've been in production, and

251
00:15:58.480 --> 00:16:03.120
<v Speaker 2>I think SREs are generally not up for having mechanisms

252
00:16:03.759 --> 00:16:06.480
<v Speaker 2>or systems working on their behalf and kind of in

253
00:16:06.519 --> 00:16:08.360
<v Speaker 2>the shadow. But the truth is that they've been around

254
00:16:08.399 --> 00:16:11.399
<v Speaker 2>for a while. I think. Now the main difference, the

255
00:16:11.440 --> 00:16:14.320
<v Speaker 2>big difference, which is a huge difference, is that we

256
00:16:14.480 --> 00:16:20.279
<v Speaker 2>are including this machine learning, LLM part, which is non-deterministic.

257
00:16:20.399 --> 00:16:21.600
<v Speaker 2>And that's a big deal.

258
00:16:22.200 --> 00:16:24.159
<v Speaker 1>So let's just dive into that for a second. When

259
00:16:24.159 --> 00:16:27.360
<v Speaker 1>we say deterministic in the history of self healing systems,

260
00:16:27.360 --> 00:16:31.600
<v Speaker 1>we're talking about like auto scaling groups or identifying specifically

261
00:16:31.639 --> 00:16:34.919
<v Speaker 1>based off of rules that some engineer wrote, what they're

262
00:16:34.960 --> 00:16:39.360
<v Speaker 1>seeing and then how to handle the situation very concretely.

263
00:16:39.480 --> 00:16:40.080
<v Speaker 1>Is that accurate?

264
00:16:40.480 --> 00:16:43.559
<v Speaker 2>It's accurate, and there was also one that was called

265
00:16:43.639 --> 00:16:46.720
<v Speaker 2>earths and it's exactly how you describe it, like a

266
00:16:46.759 --> 00:16:49.759
<v Speaker 2>runbook that's written by a human, and then this

267
00:16:49.919 --> 00:16:53.000
<v Speaker 2>runbook is triggered based on a specific signal.

268
00:16:53.080 --> 00:16:54.919
<v Speaker 2>But it's absolutely deterministic.

269
00:16:55.320 --> 00:16:58.679
<v Speaker 1>Yeah, And I think the interesting thing with the deterministic

270
00:16:58.759 --> 00:17:01.559
<v Speaker 1>systems is that they really require you to do the root

271
00:17:01.559 --> 00:17:04.119
<v Speaker 1>cause analysis so that you could write like if no

272
00:17:04.240 --> 00:17:06.160
<v Speaker 1>runbook applied, or a runbook

273
00:17:06.160 --> 00:17:09.720
<v Speaker 1>applied either by a human or through automation, didn't have

274
00:17:10.519 --> 00:17:13.920
<v Speaker 1>an effect that actually resolved whatever incident you had, you

275
00:17:13.960 --> 00:17:15.839
<v Speaker 1>had to actually still do the root cause analysis. And

276
00:17:15.920 --> 00:17:18.440
<v Speaker 1>now I feel like we're getting into you know, I

277
00:17:18.440 --> 00:17:20.680
<v Speaker 1>think everyone's waiting for us to talk about this how

278
00:17:20.680 --> 00:17:25.480
<v Speaker 1>to apply, uh, I hate to say AI, to this concept.

279
00:17:25.599 --> 00:17:28.640
<v Speaker 1>So now that it's twenty twenty-five,

280
00:17:29.839 --> 00:17:33.880
<v Speaker 1>what does it mean to deploy an LLM to be able

281
00:17:33.920 --> 00:17:36.359
<v Speaker 1>to self-heal a system? What does that actually look

282
00:17:36.400 --> 00:17:37.039
<v Speaker 1>like in practice?

283
00:17:38.279 --> 00:17:41.440
<v Speaker 2>Yeah? So, you know, I think you hit the nail

284
00:17:41.440 --> 00:17:44.079
<v Speaker 2>on the head with speaking about the root cause analysis.

285
00:17:44.079 --> 00:17:47.920
<v Speaker 2>That's obviously the first step that this system needs to

286
00:17:47.960 --> 00:17:53.119
<v Speaker 2>do is to understand what's up. I think in the

287
00:17:53.160 --> 00:17:55.839
<v Speaker 2>past it was a mix for these self-healing

288
00:17:55.880 --> 00:18:02.440
<v Speaker 2>systems. I think part of these runbooks were

289
00:18:03.160 --> 00:18:09.519
<v Speaker 2>automatic, based on a very specific, maybe, a log or, you know,

290
00:18:09.559 --> 00:18:13.519
<v Speaker 2>something like very trivial, and then it would be automatically applied.

291
00:18:13.599 --> 00:18:16.400
<v Speaker 2>In other cases, a human would need to do the

292
00:18:16.480 --> 00:18:19.599
<v Speaker 2>root cause analysis and then you would push this runbook,

293
00:18:19.720 --> 00:18:22.000
<v Speaker 2>which would still save a ton of time, right because

294
00:18:23.000 --> 00:18:25.680
<v Speaker 2>this system would orchestrate all the actions that need to

295
00:18:25.680 --> 00:18:29.720
<v Speaker 2>be done. And when we are speaking about very complicated

296
00:18:29.920 --> 00:18:33.960
<v Speaker 2>infrastructure, like at Google, Facebook or LinkedIn, you know, it

297
00:18:33.960 --> 00:18:37.240
<v Speaker 2>can be a lot of work. But here the idea

298
00:18:37.240 --> 00:18:41.640
<v Speaker 2>is that we can throw whatever broken system to this

299
00:18:41.839 --> 00:18:47.400
<v Speaker 2>LLM and it should understand what's up or at least

300
00:18:48.000 --> 00:18:54.160
<v Speaker 2>come up with hypotheses. And I think the hypothesis is

301
00:18:54.960 --> 00:18:58.759
<v Speaker 2>very trivial, very important. Sorry, I don't think we need

302
00:18:58.880 --> 00:19:02.240
<v Speaker 2>We should not consider these LLMs to be God or

303
00:19:02.279 --> 00:19:06.319
<v Speaker 2>to be the silver bullet. We should consider them just

304
00:19:06.359 --> 00:19:10.480
<v Speaker 2>as another human which can make mistakes. We make a

305
00:19:10.519 --> 00:19:13.799
<v Speaker 2>lot of mistakes, and so I think one key element

306
00:19:13.960 --> 00:19:18.039
<v Speaker 2>is to understand that this hypothesis also has kind of

307
00:19:18.200 --> 00:19:22.480
<v Speaker 2>a degree of certainty, and so I think a great

308
00:19:22.559 --> 00:19:28.880
<v Speaker 2>AI SRE will provide, as part of the diagnosis, what

309
00:19:29.079 --> 00:19:33.240
<v Speaker 2>the degree of certainty that it has about the diagnosis,

310
00:19:33.400 --> 00:19:36.960
<v Speaker 2>so a human can say, hey, if it's fifty percent, maybe

311
00:19:37.000 --> 00:19:39.000
<v Speaker 2>I should not pay too much attention to it.

312
00:19:39.000 --> 00:19:41.759
<v Speaker 2>If it's ninety five percent, okay, maybe I should really

313
00:19:41.839 --> 00:19:42.440
<v Speaker 2>look into it.

314
00:19:43.680 --> 00:19:48.880
<v Speaker 1>So what sorts of source data are you utilizing in

315
00:19:49.000 --> 00:19:51.680
<v Speaker 1>order to feed into the LLM? I'm going to ask

316
00:19:51.720 --> 00:19:55.559
<v Speaker 1>you questions about that afterwards, but specifically right now,

317
00:19:55.640 --> 00:19:58.240
<v Speaker 1>like is it you know a list of like source

318
00:19:58.319 --> 00:20:01.000
<v Speaker 1>code and some other things, like what does the source

319
00:20:01.000 --> 00:20:01.640
<v Speaker 1>set look like?

320
00:20:02.079 --> 00:20:07.240
<v Speaker 2>Mm. So I think, as with many tools in

321
00:20:07.279 --> 00:20:13.519
<v Speaker 2>the LLM space today, context is king, and I'm

322
00:20:13.519 --> 00:20:16.440
<v Speaker 2>going to speak about what we are building at Rootly,

323
00:20:17.480 --> 00:20:21.279
<v Speaker 2>which again is an incident management platform. And why I

324
00:20:21.319 --> 00:20:24.680
<v Speaker 2>repeat this is because it really matters, in the sense that

325
00:20:26.519 --> 00:20:30.000
<v Speaker 2>for engineering teams, all the signals that are associated with

326
00:20:30.039 --> 00:20:34.079
<v Speaker 2>an incident will flow through their incident management platform. It

327
00:20:34.480 --> 00:20:38.200
<v Speaker 2>just makes sense, right? And so these platforms, such as

328
00:20:38.319 --> 00:20:43.559
<v Speaker 2>Rootly, have pretty much all the context that is available

329
00:20:43.640 --> 00:20:49.720
<v Speaker 2>generally to solve this incident. So it can be monitoring,

330
00:20:50.720 --> 00:20:55.079
<v Speaker 2>logging, traces. It can be more than this. It can

331
00:20:55.160 --> 00:20:59.680
<v Speaker 2>be Slack conversations, right? It can be a Zoom call.

332
00:21:00.079 --> 00:21:01.960
<v Speaker 2>You know, you can do a transcript of a

333
00:21:02.039 --> 00:21:03.640
<v Speaker 2>Zoom call, so you can feed this into the

334
00:21:03.720 --> 00:21:07.200
<v Speaker 2>LLM. But I think there are also very important

335
00:21:07.279 --> 00:21:12.079
<v Speaker 2>data, such as the history of postmortems, incident resolution

336
00:21:12.319 --> 00:21:17.720
<v Speaker 2>reports, right, where everything is documented: what happened, how

337
00:21:17.759 --> 00:21:21.960
<v Speaker 2>it happened, how we solved the incident, how this

338
00:21:22.079 --> 00:21:25.400
<v Speaker 2>incident... like, how do I solve it? Who solved it?

339
00:21:26.440 --> 00:21:31.079
<v Speaker 2>And all this data is, like, super important for

340
00:21:31.240 --> 00:21:35.599
<v Speaker 2>the AI agent to find a root cause.

341
00:21:37.519 --> 00:21:39.759
<v Speaker 2>And the last one I forgot, which you mentioned,

342
00:21:39.960 --> 00:21:44.559
<v Speaker 2>is obviously anything that's linked to changes, which you know

343
00:21:44.599 --> 00:21:47.400
<v Speaker 2>takes the form of code. So the list of commits

344
00:21:48.119 --> 00:21:51.920
<v Speaker 2>is often you know where you'll find the issue.

345
00:21:53.480 --> 00:21:55.319
<v Speaker 1>So you've actually got a system that ingests all

346
00:21:55.319 --> 00:21:58.960
<v Speaker 1>this information and spits out, you know, here's

347
00:21:59.160 --> 00:22:01.440
<v Speaker 1>some action items to take. And then I imagine some companies

348
00:22:01.440 --> 00:22:05.279
<v Speaker 1>are actually automating based off of that to remediate the

349
00:22:05.319 --> 00:22:07.599
<v Speaker 1>problems in production. Or is this like you need a

350
00:22:07.640 --> 00:22:10.880
<v Speaker 1>human to review this before you do anything else.

351
00:22:13.079 --> 00:22:18.880
<v Speaker 2>Yeah, I think the space, this field, is extremely... like,

352
00:22:19.039 --> 00:22:23.440
<v Speaker 2>the auto-healing mechanism around LLMs is extremely young.

353
00:22:24.440 --> 00:22:27.279
<v Speaker 2>You know, the oldest company in the field is maybe two

354
00:22:27.359 --> 00:22:30.720
<v Speaker 2>years old, perhaps even less. There is a lot of

355
00:22:30.720 --> 00:22:36.559
<v Speaker 2>competition in the space. I've seen at least twenty-five products,

356
00:22:37.319 --> 00:22:40.480
<v Speaker 2>and I've spoken to a lot of engineers who are

357
00:22:40.480 --> 00:22:46.160
<v Speaker 2>building this internally at large companies, so you know, I

358
00:22:46.200 --> 00:22:50.000
<v Speaker 2>think everybody's doing it differently. The maturity of the product is

359
00:22:50.279 --> 00:22:54.039
<v Speaker 2>also very different. But I think for SREs,

360
00:22:54.079 --> 00:23:00.799
<v Speaker 2>obviously reliability is ultimately the goal, and not experimentation,

361
00:23:01.000 --> 00:23:05.960
<v Speaker 2>right? It's a secondary goal, definitely. Right, starting with just investigation

362
00:23:06.920 --> 00:23:10.680
<v Speaker 2>is the right way to go. And then I think,

363
00:23:10.759 --> 00:23:16.000
<v Speaker 2>as the space matures and perhaps the models... what works

364
00:23:16.039 --> 00:23:19.119
<v Speaker 2>great with this technology is you can teach it, right? It

365
00:23:19.200 --> 00:23:23.680
<v Speaker 2>becomes better over time, so you can train models on

366
00:23:24.480 --> 00:23:28.160
<v Speaker 2>your data, you can tune it right. So as the

367
00:23:28.200 --> 00:23:31.599
<v Speaker 2>technology matures and the model is learning, and perhaps we

368
00:23:31.640 --> 00:23:38.839
<v Speaker 2>are learning as humans, we can go more towards allowing

369
00:23:38.960 --> 00:23:40.720
<v Speaker 2>this tool to do the resolution. But I think the

370
00:23:40.759 --> 00:23:44.240
<v Speaker 2>first step is just doing root cause analysis.

371
00:23:45.079 --> 00:23:48.279
<v Speaker 1>Yeah, I mean, at least from my personal standpoint, I'm

372
00:23:48.359 --> 00:23:53.200
<v Speaker 1>scared to hand over the tools to make changes to

373
00:23:53.240 --> 00:23:59.599
<v Speaker 1>production infrastructure automatically without involving some sort of review process.

374
00:23:59.799 --> 00:24:01.799
<v Speaker 1>And I mean, I guess, I guess it's fine to

375
00:24:01.839 --> 00:24:05.200
<v Speaker 1>have, like, another LLM review the first LLM's work in

376
00:24:05.240 --> 00:24:07.960
<v Speaker 1>some way, but I don't know if the direction matters.

377
00:24:08.000 --> 00:24:10.759
<v Speaker 1>Like I think you need someone to review the context

378
00:24:10.759 --> 00:24:13.359
<v Speaker 1>of what's happening, just like you probably want multiple engineers

379
00:24:13.920 --> 00:24:17.519
<v Speaker 1>on call to actually validate any sort of code changes

380
00:24:17.559 --> 00:24:19.880
<v Speaker 1>that would have to go into production, because I mean,

381
00:24:19.920 --> 00:24:22.240
<v Speaker 1>otherwise you're in a situation where there's a critical event.

382
00:24:22.599 --> 00:24:25.720
<v Speaker 1>It's 3 a.m., something's going wrong with the database. You log

383
00:24:25.759 --> 00:24:30.039
<v Speaker 1>in and accidentally drop the production DB. I mean, I'll

384
00:24:30.039 --> 00:24:32.119
<v Speaker 1>pause there because this has actually happened to more

385
00:24:32.119 --> 00:24:35.400
<v Speaker 1>than one company, But I think there's one in our history.

386
00:24:36.000 --> 00:24:38.839
<v Speaker 1>I think it's almost ten years old now, a major

387
00:24:39.440 --> 00:24:44.359
<v Speaker 1>source code Git server company had a production incident, very

388
00:24:44.359 --> 00:24:46.440
<v Speaker 1>famous, with their... I think it was Postgres at the time.

389
00:24:47.799 --> 00:24:51.640
<v Speaker 1>So engineers are definitely not infallible when it comes to remediation.

390
00:24:52.480 --> 00:24:54.759
<v Speaker 1>But I guess my question is going to be do

391
00:24:54.799 --> 00:24:58.039
<v Speaker 1>you find with all the data that you're collecting that

392
00:24:58.119 --> 00:25:02.519
<v Speaker 1>the set of incidents all point back to some like

393
00:25:02.559 --> 00:25:06.000
<v Speaker 1>as far as you're concerned, repeatable or already seen problems

394
00:25:06.039 --> 00:25:10.680
<v Speaker 1>like oh yeah, this sort of software development issue or

395
00:25:10.680 --> 00:25:15.559
<v Speaker 1>some syntax problem like null reference exception or dynamic module

396
00:25:15.640 --> 00:25:18.440
<v Speaker 1>loading or you know, memory exhaustion or something like that,

397
00:25:19.160 --> 00:25:22.359
<v Speaker 1>or are there like minor differences as time goes on,

398
00:25:22.440 --> 00:25:24.559
<v Speaker 1>like, oh well, it used to be this set, but

399
00:25:24.960 --> 00:25:27.119
<v Speaker 1>the next thing is sort of something that we haven't

400
00:25:27.119 --> 00:25:30.440
<v Speaker 1>discovered yet, and so you're still discovering sort of new

401
00:25:30.440 --> 00:25:31.160
<v Speaker 1>failure modes.

402
00:25:32.759 --> 00:25:36.079
<v Speaker 2>Yes, just picking like very briefly on what you said before.

403
00:25:36.079 --> 00:25:40.000
<v Speaker 2>I totally agree with you that LLMs should be

404
00:25:40.559 --> 00:25:45.119
<v Speaker 2>considered as another human. So code review, doing canary deploys,

405
00:25:46.240 --> 00:25:50.319
<v Speaker 2>you know, passing the change through the CD basically making

406
00:25:50.400 --> 00:25:53.519
<v Speaker 2>sure that the change is safe is just a must do, right.

407
00:25:53.599 --> 00:25:56.079
<v Speaker 2>I don't think we should treat what the AI

408
00:25:56.240 --> 00:25:59.720
<v Speaker 2>says is the resolution path any differently than what a

409
00:25:59.799 --> 00:26:01.200
<v Speaker 2>human would say.

410
00:26:01.960 --> 00:26:04.079
<v Speaker 1>I mean it goes further than that though, right, Because

411
00:26:04.519 --> 00:26:08.279
<v Speaker 1>if we were able to confidently take the output from

412
00:26:08.440 --> 00:26:11.119
<v Speaker 1>LLMs and feed it back in, LLMs should be able

413
00:26:11.160 --> 00:26:16.440
<v Speaker 1>to develop increasingly large solutions of any size. And we

414
00:26:16.519 --> 00:26:20.720
<v Speaker 1>see that no company has a software an automated software

415
00:26:20.720 --> 00:26:24.720
<v Speaker 1>development or agent engineer that can just continually push out code.

416
00:26:24.839 --> 00:26:28.400
<v Speaker 1>Even ones operating on very small scopes have utterly failed

417
00:26:28.559 --> 00:26:31.480
<v Speaker 1>in their release and their push out of their products,

418
00:26:31.920 --> 00:26:34.039
<v Speaker 1>let alone larger companies that have been trying to build

419
00:26:34.079 --> 00:26:40.119
<v Speaker 1>stuff up. And the recent craze on vibe coding. Yeah,

420
00:26:40.160 --> 00:26:42.400
<v Speaker 1>I mean, and for anyone who's not aware, it's this

421
00:26:42.440 --> 00:26:44.359
<v Speaker 1>idea where you just you don't even look at the code.

422
00:26:44.400 --> 00:26:46.359
<v Speaker 1>You just have the LLM produce all the output, and

423
00:26:46.359 --> 00:26:48.880
<v Speaker 1>whenever there's a problem, you just say, hey, here's the issue,

424
00:26:48.960 --> 00:26:51.519
<v Speaker 1>try to fix it. The problem is that the context

425
00:26:51.519 --> 00:26:54.400
<v Speaker 1>window will have to keep growing indefinitely. Every new feature

426
00:26:54.440 --> 00:26:56.559
<v Speaker 1>you add will continue to grow. And so as long

427
00:26:56.599 --> 00:26:59.200
<v Speaker 1>as we have these two failure modes, A, the LLM's

428
00:26:59.240 --> 00:27:02.880
<v Speaker 1>finite context window, and B, companies who have made it

429
00:27:02.880 --> 00:27:05.680
<v Speaker 1>their sole goal to make money off of automated software

430
00:27:05.680 --> 00:27:09.240
<v Speaker 1>development aren't making money off of that, you know, aren't

431
00:27:09.279 --> 00:27:12.119
<v Speaker 1>like wildly successful. The likelihood of you being able to

432
00:27:12.160 --> 00:27:15.400
<v Speaker 1>do it, let alone being able to trust them, fundamentally tells

433
00:27:15.480 --> 00:27:17.400
<v Speaker 1>us that, you know, we're not at that point yet.

434
00:27:18.880 --> 00:27:24.359
<v Speaker 2>For sure. I think this vibe coding is valuable

435
00:27:24.440 --> 00:27:29.920
<v Speaker 2>in many situations. If you want to prototype, maybe if

436
00:27:29.920 --> 00:27:32.880
<v Speaker 2>you are a very young startup, you know, I think it

437
00:27:32.920 --> 00:27:35.319
<v Speaker 2>makes a lot of sense. But when you get to

438
00:27:36.200 --> 00:27:39.319
<v Speaker 2>the stage where you hire an SRE, or you need

439
00:27:39.319 --> 00:27:42.680
<v Speaker 2>stability in your product, or you are pushing a product

440
00:27:43.200 --> 00:27:46.759
<v Speaker 2>that is crucial for your customer system. You know, I

441
00:27:46.759 --> 00:27:52.079
<v Speaker 2>don't think this type of engineering practice, if we can

442
00:27:52.119 --> 00:27:57.279
<v Speaker 2>call it this way, makes sense, but I think this

443
00:27:57.359 --> 00:28:00.400
<v Speaker 2>technology can bring a lot of value. You mentioned do

444
00:28:00.440 --> 00:28:05.000
<v Speaker 2>we find patterns in the type of incidents that we

445
00:28:05.119 --> 00:28:07.920
<v Speaker 2>see through the system? And that's a really great question.

446
00:28:08.039 --> 00:28:12.680
<v Speaker 2>So one of the initiatives I'm leading at Rootly is the

447
00:28:13.160 --> 00:28:18.599
<v Speaker 2>Rootly AI Labs. It's a community-driven initiative where we

448
00:28:19.359 --> 00:28:23.960
<v Speaker 2>hire software engineers. We have the head of platform engineering

449
00:28:24.000 --> 00:28:27.240
<v Speaker 2>at Venmo and the former head of AI at Video

450
00:28:27.440 --> 00:28:33.079
<v Speaker 2>and other very smart students, PhDs from Stanford and whatnot,

451
00:28:34.119 --> 00:28:40.920
<v Speaker 2>and we pay them to create open source prototypes leveraging

452
00:28:41.319 --> 00:28:45.599
<v Speaker 2>the latest AI innovations to see how this can be

453
00:28:45.680 --> 00:28:49.519
<v Speaker 2>applied to the world of reliability and system operations.

454
00:28:49.640 --> 00:28:53.720
<v Speaker 2>And one of the projects that we're working on is

455
00:28:53.759 --> 00:28:57.319
<v Speaker 2>exactly what you mentioned, is to create a graph of

456
00:28:58.720 --> 00:29:03.640
<v Speaker 2>the incidents and see if we can find patterns. So

457
00:29:03.640 --> 00:29:08.039
<v Speaker 2>it could be an area of your infrastructure, or a

458
00:29:08.079 --> 00:29:11.880
<v Speaker 2>part of your application, or perhaps a type of failure.

459
00:29:12.160 --> 00:29:14.680
<v Speaker 2>You know, let's say... we spoke

460
00:29:14.720 --> 00:29:18.839
<v Speaker 2>about resources. Are resources often something, you know, that's

461
00:29:18.880 --> 00:29:21.920
<v Speaker 2>failing our system, and maybe because our scaling rules

462
00:29:21.960 --> 00:29:26.920
<v Speaker 2>are not aggressive enough? And LLMs are helping

463
00:29:27.000 --> 00:29:32.319
<v Speaker 2>us to create this graph because they are, you know,

464
00:29:32.359 --> 00:29:36.200
<v Speaker 2>they are great at ingesting unstructured data and making sense of it.

465
00:29:37.240 --> 00:29:41.000
<v Speaker 2>And so then we can create this graph that can

466
00:29:42.039 --> 00:29:45.920
<v Speaker 2>empower an SRE team to understand where instability

467
00:29:46.279 --> 00:29:46.680
<v Speaker 2>comes from.

468
00:29:47.759 --> 00:29:50.640
<v Speaker 1>I mean, that's something I'd be super interested to find out,

469
00:29:50.640 --> 00:29:54.400
<v Speaker 1>like where statistically are the most problems coming from, and

470
00:29:54.440 --> 00:29:57.880
<v Speaker 1>how that maps, or like what the confounding variables

471
00:29:57.920 --> 00:30:00.440
<v Speaker 1>are between maybe the culture of the company or software

472
00:30:00.519 --> 00:30:04.160
<v Speaker 1>languages that they're utilizing, or the frameworks or the industries. Right,

473
00:30:04.200 --> 00:30:07.039
<v Speaker 1>you know, maybe these industries have these common incidents. Like

474
00:30:07.079 --> 00:30:09.680
<v Speaker 1>I think that'd be super interesting to see.

475
00:30:10.440 --> 00:30:13.720
<v Speaker 2>Well, yeah, so we're building it. You can check

476
00:30:13.759 --> 00:30:19.279
<v Speaker 2>it out. We have a GitHub space if

477
00:30:19.319 --> 00:30:22.440
<v Speaker 2>you look for Rootly AI Labs. Everything is open source

478
00:30:23.559 --> 00:30:28.000
<v Speaker 2>and we're always welcoming people to join, just giving

479
00:30:28.039 --> 00:30:32.319
<v Speaker 2>ideas or contributing. Again, we're paying people to do that,

480
00:30:32.559 --> 00:30:35.559
<v Speaker 2>and so it's kind of a side job. But yeah,

481
00:30:35.559 --> 00:30:38.920
<v Speaker 2>I think AI is breaking like and and you know,

482
00:30:39.160 --> 00:30:41.440
<v Speaker 2>I would say it's kind of a side thing. You know,

483
00:30:41.799 --> 00:30:47.359
<v Speaker 2>it's not as ambitious a goal as

484
00:30:47.440 --> 00:30:51.880
<v Speaker 2>like self-healing systems. But I do think that's where

485
00:30:51.880 --> 00:30:54.000
<v Speaker 2>you see that LLMs can allow you to

486
00:30:54.079 --> 00:30:56.079
<v Speaker 2>do other things that are interesting. I know there are

487
00:30:56.319 --> 00:30:59.400
<v Speaker 2>prototypes that I think might be interesting, two

488
00:30:59.440 --> 00:31:02.720
<v Speaker 2>other prototypes, but I'll speak about them very briefly. One of

489
00:31:02.720 --> 00:31:08.960
<v Speaker 2>them is to create a diagram out of a

490
00:31:09.039 --> 00:31:15.839
<v Speaker 2>postmortem showing where things went wrong. And

491
00:31:15.880 --> 00:31:19.559
<v Speaker 2>postmortems are actually kind of painful for engineers to write,

492
00:31:19.559 --> 00:31:22.079
<v Speaker 2>like you know, no one wants to do that. You

493
00:31:22.119 --> 00:31:26.039
<v Speaker 2>need to remember what happened and bring all of this together. Actually,

494
00:31:26.200 --> 00:31:28.960
<v Speaker 2>that's what's good with LLMs, and that's something we have

495
00:31:28.960 --> 00:31:33.720
<v Speaker 2>in Rootly. Rootly will draft a postmortem for you

496
00:31:33.839 --> 00:31:36.359
<v Speaker 2>and then you just have to review it and chances

497
00:31:36.400 --> 00:31:38.920
<v Speaker 2>are that the postmortem is going to be great.

498
00:31:39.359 --> 00:31:42.440
<v Speaker 2>And then the next step that we tried with

499
00:31:42.519 --> 00:31:47.440
<v Speaker 2>the Rootly AI Labs is: how about trying to offer

500
00:31:47.480 --> 00:31:52.480
<v Speaker 2>another way to consume a postmortem? And a visual

501
00:31:52.559 --> 00:31:58.599
<v Speaker 2>way may help, especially I think non engineering audiences to

502
00:31:58.799 --> 00:32:03.440
<v Speaker 2>understand where the failure happened and why the other service

503
00:32:03.960 --> 00:32:06.680
<v Speaker 2>that may seem totally unrelated was down as well. So

504
00:32:06.920 --> 00:32:10.519
<v Speaker 2>the way it works is that it will ingest the

505
00:32:10.559 --> 00:32:13.880
<v Speaker 2>postmortem, make sense of it as JSON, and

506
00:32:13.920 --> 00:32:18.119
<v Speaker 2>then ingest, you know, your code base, infrastructure as

507
00:32:18.200 --> 00:32:20.359
<v Speaker 2>code, and make a JSON out of it, and then

508
00:32:20.599 --> 00:32:27.160
<v Speaker 2>merge these two and create a knockdown graph. So yeah,

509
00:32:27.200 --> 00:32:30.640
<v Speaker 2>that's, you know, another way to leverage LLMs, which

510
00:32:30.759 --> 00:32:34.000
<v Speaker 2>can ultimately help an SRE team to do their job

511
00:32:34.240 --> 00:32:34.960
<v Speaker 2>more efficiently.

512
00:32:35.519 --> 00:32:38.359
<v Speaker 1>I like how you called out that after you are

513
00:32:39.400 --> 00:32:44.319
<v Speaker 1>you've pushed out a post mortem that someone actually has

514
00:32:44.319 --> 00:32:48.039
<v Speaker 1>to review what you've created. Like, no, don't just...

515
00:32:48.079 --> 00:32:49.960
<v Speaker 1>don't just take that and you know, start sending it

516
00:32:50.000 --> 00:32:52.960
<v Speaker 1>to people as the official thing. Like, if you

517
00:32:53.079 --> 00:32:55.839
<v Speaker 1>take an LLM-generated postmortem and you put that

518
00:32:55.960 --> 00:33:00.400
<v Speaker 1>up publicly, you will for sure get harassed on

519
00:33:00.440 --> 00:33:04.000
<v Speaker 1>Bluesky and Mastodon very quickly about how you spent

520
00:33:04.079 --> 00:33:06.720
<v Speaker 1>zero effort in making sure that that was accurate.

521
00:33:07.119 --> 00:33:11.319
<v Speaker 1>It's very easy to identify LLM-generated stuff like that.

522
00:33:12.880 --> 00:33:15.119
<v Speaker 2>And the second thing that we've built that may be of

523
00:33:15.240 --> 00:33:19.559
<v Speaker 2>interest to the audience is an on-call burnout detector.

524
00:33:20.599 --> 00:33:24.039
<v Speaker 2>I think that's particularly interesting for companies that are distributed,

525
00:33:25.039 --> 00:33:30.440
<v Speaker 2>where managers may not be in touch as much with

526
00:33:30.640 --> 00:33:33.440
<v Speaker 2>what the team is doing, and especially for large companies.

527
00:33:33.839 --> 00:33:37.440
<v Speaker 2>So what we do is that we feed all the

528
00:33:37.559 --> 00:33:43.880
<v Speaker 2>associated data about incident responders. It can be how long was

529
00:33:43.920 --> 00:33:46.279
<v Speaker 2>their shift over the last week, how many incidents they

530
00:33:46.319 --> 00:33:49.599
<v Speaker 2>had to troubleshoot, what was the severity of these incidents,

531
00:33:51.079 --> 00:33:55.119
<v Speaker 2>how long were they working during the night, and so

532
00:33:55.200 --> 00:33:58.440
<v Speaker 2>many things, you know, that are all unstructured data. Right?

533
00:33:59.640 --> 00:34:02.119
<v Speaker 2>So, again, LLMs are great at this. And

534
00:34:02.160 --> 00:34:05.039
<v Speaker 2>then from this an elelant can come up with kind

535
00:34:05.079 --> 00:34:08.920
<v Speaker 2>of a burnout level, you know, and see, hey, like

536
00:34:09.079 --> 00:34:12.239
<v Speaker 2>you know, this person was like smashed very hard with

537
00:34:12.360 --> 00:34:16.159
<v Speaker 2>a bunch of hard incidents; you may consider giving

538
00:34:16.199 --> 00:34:21.840
<v Speaker 2>them a break. I'm sorry.

539
00:34:21.679 --> 00:34:25.719
<v Speaker 1>So as long as it doesn't also suggest the therapy that should

540
00:34:25.719 --> 00:34:28.320
<v Speaker 1>be necessary and try to provide that. I think you're

541
00:34:28.360 --> 00:34:31.719
<v Speaker 1>on the right track there. Yeah, I mean, yeah, it

542
00:34:31.760 --> 00:34:35.519
<v Speaker 1>can be. It can be difficult to see the differences

543
00:34:35.559 --> 00:34:38.840
<v Speaker 1>between individuals. Like some of them are way more interested

544
00:34:38.880 --> 00:34:41.480
<v Speaker 1>in actually jumping in and you know, diving in and

545
00:34:41.519 --> 00:34:44.960
<v Speaker 1>trying to identify those problems and solve them, and others

546
00:34:45.000 --> 00:34:48.400
<v Speaker 1>are you know, care more about the routine. But I

547
00:34:49.280 --> 00:34:52.840
<v Speaker 1>don't think in my in the history of my engineering career,

548
00:34:52.880 --> 00:34:55.119
<v Speaker 1>I ever saw someone jump up and down and say yes,

549
00:34:55.239 --> 00:34:57.159
<v Speaker 1>I would love to be woken up at three a m.

550
00:34:57.559 --> 00:35:00.440
<v Speaker 1>And jump on a call with other people and try

551
00:35:00.480 --> 00:35:03.039
<v Speaker 1>to justify what was happening. So, you know,

552
00:35:03.039 --> 00:35:04.840
<v Speaker 1>I think you're definitely onto something interesting there.

553
00:35:05.840 --> 00:35:09.480
<v Speaker 2>Yeah. I was when I was young because I wanted

554
00:35:09.519 --> 00:35:15.760
<v Speaker 2>to learn. But I'm definitely not into that, but go

555
00:35:15.840 --> 00:35:16.079
<v Speaker 2>for it.

556
00:35:16.519 --> 00:35:18.960
<v Speaker 1>Yeah, I mean, I know. I think that's an interesting

557
00:35:19.000 --> 00:35:22.480
<v Speaker 1>point because you know your career, things change for you

558
00:35:22.519 --> 00:35:24.480
<v Speaker 1>over time, and maybe at some point you are willing

559
00:35:24.519 --> 00:35:27.760
<v Speaker 1>to make some sacrifices, you know, But I don't know

560
00:35:27.760 --> 00:35:29.280
<v Speaker 1>if it was the case for me, Like I remember

561
00:35:29.360 --> 00:35:32.760
<v Speaker 1>my first job out of university, there would be incidents

562
00:35:32.760 --> 00:35:35.119
<v Speaker 1>in the middle of the night, and I never had

563
00:35:35.119 --> 00:35:37.400
<v Speaker 1>to deal with that sort of thing in my life

564
00:35:37.480 --> 00:35:39.199
<v Speaker 1>up until that point. Like I didn't run my own

565
00:35:39.280 --> 00:35:41.000
<v Speaker 1>data center in my home, and even if I did,

566
00:35:41.039 --> 00:35:42.760
<v Speaker 1>I don't. It was not at the point where you'd

567
00:35:42.760 --> 00:35:44.760
<v Speaker 1>be like getting alerts to be woken up to deal

568
00:35:44.800 --> 00:35:49.159
<v Speaker 1>with one of your virtual machines failing. And at university

569
00:35:49.280 --> 00:35:53.280
<v Speaker 1>it wasn't a thing. You're definitely awake while you're causing problems, right?

570
00:35:53.559 --> 00:35:56.280
<v Speaker 1>things aren't happening while you're sleeping. And so my first

571
00:35:56.360 --> 00:35:58.719
<v Speaker 1>job like this would happen, and I definitely came away

572
00:35:58.760 --> 00:36:01.320
<v Speaker 1>from that with the idea: this is wrong.

573
00:36:01.519 --> 00:36:04.000
<v Speaker 1>Like I don't ever want to be woken up in

574
00:36:04.039 --> 00:36:05.760
<v Speaker 1>the middle of the night, Like you don't have to

575
00:36:05.800 --> 00:36:08.480
<v Speaker 1>be it's not a requirement. And since then, like I've

576
00:36:08.559 --> 00:36:11.599
<v Speaker 1>really been on the path of highly reliable systems. And

577
00:36:11.639 --> 00:36:14.639
<v Speaker 1>I think the part that really stumps a lot of

578
00:36:14.639 --> 00:36:17.599
<v Speaker 1>people is they focus a lot on the preventative nature

579
00:36:17.880 --> 00:36:20.199
<v Speaker 1>that they can try to prevent every problem. Oh, get

580
00:36:20.239 --> 00:36:22.639
<v Speaker 1>one hundred percent test coverage, or you know, have a

581
00:36:22.719 --> 00:36:26.760
<v Speaker 1>highly reliable solution by duplicating the infrastructure in multiple regions.

582
00:36:26.800 --> 00:36:29.440
<v Speaker 1>And I mean the thing I think you said at

583
00:36:29.480 --> 00:36:32.639
<v Speaker 1>the beginning of the episode, which is that it will

584
00:36:32.679 --> 00:36:36.079
<v Speaker 1>go down, Like you cannot have one hundred percent reliable

585
00:36:36.079 --> 00:36:38.440
<v Speaker 1>system and so at some point you have to optimize

586
00:36:38.480 --> 00:36:42.159
<v Speaker 1>for recovery and not just prevention. And this is where

587
00:36:42.199 --> 00:36:44.280
<v Speaker 1>I think a lot of people get stuck because like

588
00:36:44.440 --> 00:36:47.519
<v Speaker 1>at our company, we have a five nines reliability SLA,

589
00:36:47.920 --> 00:36:49.760
<v Speaker 1>and that means that by the time someone gets

590
00:36:49.800 --> 00:36:54.760
<v Speaker 1>alerted and they get online, we've already violated the SLA,

591
00:36:54.880 --> 00:36:57.920
<v Speaker 1>let alone identified and fixed the problem.

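NOTE
A minimal sketch of the arithmetic behind the five-nines remark above; it is not from the speakers.
The 30-day and 365-day windows are assumptions, since the SLA measurement period was not stated.
The function name downtime_budget_seconds is hypothetical.
```python
def downtime_budget_seconds(availability: float, period_days: float) -> float:
    """Seconds of allowed downtime for a given availability over a period."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_days * 24 * 60 * 60
five_nines = 0.99999
per_month = downtime_budget_seconds(five_nines, 30)   # roughly 26 seconds
per_year = downtime_budget_seconds(five_nines, 365)   # roughly 5.3 minutes
print(f"30 days: {per_month:.1f} s, 365 days: {per_year / 60:.2f} min")
```
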
592
00:36:58.480 --> 00:37:00.960
<v Speaker 2>That's a great point you bring up. I think this

593
00:37:01.119 --> 00:37:05.159
<v Speaker 2>system... well, first of all, getting woken up at

594
00:37:05.239 --> 00:37:09.239
<v Speaker 2>three am is never a pleasant experience, and it takes

595
00:37:09.280 --> 00:37:12.280
<v Speaker 2>time for your brain to get into it. And you know,

596
00:37:12.320 --> 00:37:15.519
<v Speaker 2>maybe you were in some deep sleep and you are

597
00:37:15.559 --> 00:37:18.639
<v Speaker 2>waking up and kind of having like little panic attack

598
00:37:18.800 --> 00:37:21.280
<v Speaker 2>or you know, something like tough on your body, and

599
00:37:21.320 --> 00:37:26.320
<v Speaker 2>then you need to you know, get time to ingest

600
00:37:26.400 --> 00:37:27.840
<v Speaker 2>the data and so on and so forth. So we

601
00:37:27.880 --> 00:37:30.800
<v Speaker 2>know how hard it is for your body and your mind.

602
00:37:31.239 --> 00:37:35.079
<v Speaker 2>I think that's where AI SREs, you know, which are

603
00:37:35.159 --> 00:37:39.920
<v Speaker 2>like self-healing systems or tools that can lead to

604
00:37:39.960 --> 00:37:43.639
<v Speaker 2>that, can help. It's like, hey, this tool can ingest

605
00:37:43.920 --> 00:37:47.360
<v Speaker 2>so much data in such a small amount of time

606
00:37:47.920 --> 00:37:50.400
<v Speaker 2>and give you something to get started, like an initial

607
00:37:50.519 --> 00:37:53.239
<v Speaker 2>root cause analysis. Then by the time you get to

608
00:37:53.280 --> 00:37:57.440
<v Speaker 2>your computer, you already have something ready to look at.

609
00:37:57.480 --> 00:38:01.280
<v Speaker 2>Hopefully it's ninety five percent confidence and you know you

610
00:38:01.440 --> 00:38:03.719
<v Speaker 2>just have to push the fix that it suggests. I

611
00:38:03.719 --> 00:38:06.719
<v Speaker 2>think it's such a great tool. I think that will

612
00:38:06.760 --> 00:38:10.639
<v Speaker 2>have a great positive impact on the health

613
00:38:10.800 --> 00:38:17.079
<v Speaker 2>of our people. The second thing I think that's

614
00:38:18.079 --> 00:38:21.400
<v Speaker 2>interesting with this tool is that you mentioned you know

615
00:38:21.480 --> 00:38:25.280
<v Speaker 2>five nines, and you know, we know that it's possible

616
00:38:25.320 --> 00:38:28.760
<v Speaker 2>to get five nines or six nines. But the companies

617
00:38:28.760 --> 00:38:31.559
<v Speaker 2>that are achieving that, like the Googles of the world,

618
00:38:32.159 --> 00:38:36.440
<v Speaker 2>are investing a huge amount of resources, you know, human

619
00:38:36.760 --> 00:38:40.599
<v Speaker 2>and financial to reach this level, and for the rest

620
00:38:40.599 --> 00:38:43.480
<v Speaker 2>of us, the rest of the businesses, it is simply not

621
00:38:43.599 --> 00:38:49.119
<v Speaker 2>possible until today. I believe that these self healing tools

622
00:38:49.440 --> 00:38:54.480
<v Speaker 2>will allow companies to reach this type of you know

623
00:38:54.639 --> 00:38:59.320
<v Speaker 2>SLA without spending the budget that Google does. And

624
00:38:59.360 --> 00:39:02.679
<v Speaker 2>I think that's truly... I think that's going to

625
00:39:02.719 --> 00:39:04.239
<v Speaker 2>redefine the SRE space.

626
00:39:05.519 --> 00:39:08.000
<v Speaker 1>Yeah, I mean, I will say that one of the

627
00:39:08.000 --> 00:39:14.039
<v Speaker 1>biggest struggles we have is actually customer perspective alignment. Like

628
00:39:14.599 --> 00:39:17.280
<v Speaker 1>it's a challenge for us to know what the status

629
00:39:17.280 --> 00:39:19.480
<v Speaker 1>of our system is like it's subjective. Is it up

630
00:39:19.599 --> 00:39:21.719
<v Speaker 1>or down? Is not like you can look at some

631
00:39:22.000 --> 00:39:25.280
<v Speaker 1>chart and have the answer there. And what's even more

632
00:39:25.320 --> 00:39:27.760
<v Speaker 1>important is that if we believe that our system is up,

633
00:39:27.920 --> 00:39:30.239
<v Speaker 1>that our customers also believe that our system is up,

634
00:39:30.800 --> 00:39:33.039
<v Speaker 1>because this mismatch is really what you're trying to solve

635
00:39:33.079 --> 00:39:37.239
<v Speaker 1>for. If customers always... like, one hundred percent reliability is

636
00:39:37.239 --> 00:39:39.599
<v Speaker 1>not whether you think it's up, it's whether or

637
00:39:39.639 --> 00:39:41.880
<v Speaker 1>not you know the people that are paying you money

638
00:39:41.920 --> 00:39:45.119
<v Speaker 1>to you know, run some system believe it is and

639
00:39:45.559 --> 00:39:49.159
<v Speaker 1>the customer expectation alignment, like, that's actually a really... that's

640
00:39:49.199 --> 00:39:54.440
<v Speaker 1>a huge challenge and I'm not sure you can fundamentally

641
00:39:54.559 --> 00:39:57.400
<v Speaker 1>solve that problem. But yeah, I do think there's a

642
00:39:57.440 --> 00:39:59.639
<v Speaker 1>huge gap with a lot of companies being able to

643
00:39:59.639 --> 00:40:03.320
<v Speaker 1>get from where they're at, which is like their

644
00:40:03.320 --> 00:40:05.599
<v Speaker 1>software is going down like at least once a week,

645
00:40:05.800 --> 00:40:07.920
<v Speaker 1>to something much further than that.

646
00:40:09.360 --> 00:40:12.360
<v Speaker 2>Yeah, yeah, so, I you know, I do think the

647
00:40:12.519 --> 00:40:16.320
<v Speaker 2>LLM can help with that in some capacity. Maybe

648
00:40:16.360 --> 00:40:19.000
<v Speaker 2>I can jump also and share a little bit about

649
00:40:19.599 --> 00:40:22.719
<v Speaker 2>what the challenges are of building these types of tools.

650
00:40:22.800 --> 00:40:25.559
<v Speaker 1>Yeah, please, I'm dying to know. Right.

651
00:40:25.599 --> 00:40:29.159
<v Speaker 2>So, I think one of the hardest things, which you know,

652
00:40:29.400 --> 00:40:32.519
<v Speaker 2>I think is a big weakness, is obviously the

653
00:40:32.679 --> 00:40:37.960
<v Speaker 2>non deterministic part of the system. And here I think,

654
00:40:38.480 --> 00:40:42.880
<v Speaker 2>you know, the old adage you cannot improve or fix

655
00:40:43.679 --> 00:40:47.400
<v Speaker 2>what you cannot measure is you know, works very well,

656
00:40:47.519 --> 00:40:52.199
<v Speaker 2>right? Like for LLMs, even if you provide the

657
00:40:52.239 --> 00:40:58.079
<v Speaker 2>same input, the output will be different. And so it's

658
00:40:58.159 --> 00:41:03.960
<v Speaker 2>very hard for engineering teams to ensure that, one, you know,

659
00:41:04.119 --> 00:41:08.440
<v Speaker 2>is my system running well as you say, it's subjective,

660
00:41:08.480 --> 00:41:11.639
<v Speaker 2>and I think here it's even more subjective because it's

661
00:41:11.639 --> 00:41:14.239
<v Speaker 2>not a matter of just hey, am I getting a

662
00:41:14.800 --> 00:41:17.360
<v Speaker 2>two hundred or five hundred or maybe it's a two hundred,

663
00:41:17.440 --> 00:41:22.639
<v Speaker 2>but with too much, you know, latency. We're speaking

664
00:41:22.679 --> 00:41:28.719
<v Speaker 2>about an output which is natural language. And second is

665
00:41:29.559 --> 00:41:35.800
<v Speaker 2>my output better or worse? So that's that's like a

666
00:41:35.920 --> 00:41:41.599
<v Speaker 2>big challenge in in in building this system. And and

667
00:41:41.639 --> 00:41:45.039
<v Speaker 2>another point is that these systems don't have skin in

668
00:41:45.079 --> 00:41:48.559
<v Speaker 2>the game. LLMs are like dream machines. They are

669
00:41:48.679 --> 00:41:54.360
<v Speaker 2>designed to put together chain of tokens that are, you know,

670
00:41:54.519 --> 00:41:57.920
<v Speaker 2>using statistics, the more likely to be pleasing, you know,

671
00:41:59.239 --> 00:42:03.199
<v Speaker 2>and sometimes what they assemble is not rooted in reality.

672
00:42:03.320 --> 00:42:07.320
<v Speaker 2>But they still did their job as they should. And

673
00:42:07.400 --> 00:42:10.159
<v Speaker 2>so if we compare this to a human when you know,

674
00:42:10.840 --> 00:42:14.480
<v Speaker 2>let's say you're my manager and I'm working

675
00:42:14.519 --> 00:42:16.599
<v Speaker 2>on troubleshooting this incident, and I'm like, hey, I

676
00:42:16.639 --> 00:42:20.239
<v Speaker 2>think that's the issue. I think this is where we

677
00:42:20.280 --> 00:42:24.239
<v Speaker 2>should you know, Look, I have skin in the game, right, like,

678
00:42:24.239 --> 00:42:30.519
<v Speaker 2>like I'm putting my skills on the line, and so

679
00:42:30.719 --> 00:42:33.440
<v Speaker 2>you know, when I share this with you, I have

680
00:42:33.519 --> 00:42:37.199
<v Speaker 2>a certain degree of certainty that this is a probable

681
00:42:37.239 --> 00:42:42.039
<v Speaker 2>cause. For LLMs, there is none of that, right? So here,

682
00:42:42.679 --> 00:42:47.159
<v Speaker 2>what we've done at Rootly is that we have

683
00:42:47.159 --> 00:42:51.599
<v Speaker 2>two types of agents. We have the master agent, which

684
00:42:51.679 --> 00:42:56.559
<v Speaker 2>is orchestrating sub agents which are in charge of doing

685
00:42:57.639 --> 00:43:01.960
<v Speaker 2>the work of gathering data, trying to understand, like doing

686
00:43:02.000 --> 00:43:04.599
<v Speaker 2>the grunt work and then coming up with an answer.

687
00:43:05.039 --> 00:43:09.119
<v Speaker 2>And the master agent will make sure that the overall

688
00:43:09.199 --> 00:43:11.760
<v Speaker 2>narrative makes sense, and that there is no agent that's like,

689
00:43:12.320 --> 00:43:14.400
<v Speaker 2>you know, coming up with something that doesn't make sense,

690
00:43:14.519 --> 00:43:18.280
<v Speaker 2>like a manager would do. So what's funny with with

691
00:43:18.559 --> 00:43:21.679
<v Speaker 2>LLMs is that it kind of mimics a human narrative,

692
00:43:21.760 --> 00:43:24.679
<v Speaker 2>a human dynamic. Yeah.

693
00:43:24.760 --> 00:43:27.039
<v Speaker 1>No, I mean I feel like the most common questions

694
00:43:27.119 --> 00:43:29.800
<v Speaker 1>I end up asking are how do you know? And

695
00:43:29.880 --> 00:43:35.719
<v Speaker 1>why now? And LLMs are not so good at solving that one,

696
00:43:36.440 --> 00:43:39.519
<v Speaker 1>especially when like a bunch of changes all stack together

697
00:43:39.719 --> 00:43:43.159
<v Speaker 1>to then cause the problem. Right, you know, you look

698
00:43:43.199 --> 00:43:45.599
<v Speaker 1>at individual changes and they all seem fine, and then

699
00:43:46.360 --> 00:43:48.280
<v Speaker 1>only together do they cause the issue. So I mean,

700
00:43:48.400 --> 00:43:51.360
<v Speaker 1>I do see this sort of interaction is necessary. I

701
00:43:51.400 --> 00:43:54.320
<v Speaker 1>do want to ask you about your models, though, so

702
00:43:54.760 --> 00:43:57.800
<v Speaker 1>are you taking some fundamental like some foundational model out

703
00:43:57.800 --> 00:44:00.440
<v Speaker 1>there that's available open source and fine-tuning it? Are

704
00:44:00.440 --> 00:44:03.920
<v Speaker 1>you building it up from scratch? Is there like one

705
00:44:03.960 --> 00:44:06.400
<v Speaker 1>particular company's models that you like more than others? What

706
00:44:06.440 --> 00:44:07.400
<v Speaker 1>does this look like for you?

707
00:44:09.639 --> 00:44:13.400
<v Speaker 2>Yeah, So I think the assumption that I think the

708
00:44:13.800 --> 00:44:18.519
<v Speaker 2>you know, anyone not not like deep in the space

709
00:44:18.559 --> 00:44:21.519
<v Speaker 2>would assume is that you need you need to train models,

710
00:44:22.159 --> 00:44:25.800
<v Speaker 2>you need to tune it using in our case, you know,

711
00:44:26.719 --> 00:44:30.119
<v Speaker 2>like your customer data or like you know, if you

712
00:44:30.159 --> 00:44:33.800
<v Speaker 2>are building this internally, your specific data. What we found

713
00:44:34.000 --> 00:44:37.199
<v Speaker 2>is that this is actually not needed for most of

714
00:44:37.239 --> 00:44:43.559
<v Speaker 2>the incidents. Like, out of, off-the-shelf, like,

715
00:44:43.800 --> 00:44:48.320
<v Speaker 2>models, like, work perfectly fine, and will find most of

716
00:44:48.360 --> 00:44:53.199
<v Speaker 2>the issues. Training models is actually really hard, really costly,

717
00:44:54.079 --> 00:44:58.239
<v Speaker 2>and we haven't found so far. You know, we're still

718
00:44:58.280 --> 00:45:00.519
<v Speaker 2>early you know in the space. So I think we'll

719
00:45:00.559 --> 00:45:02.960
<v Speaker 2>get to this eventually. But I think for now we're

720
00:45:03.440 --> 00:45:08.639
<v Speaker 2>finding the most value by not doing it. I think again, like,

721
00:45:08.920 --> 00:45:12.239
<v Speaker 2>it's it's difficult, it's expensive. And then there is a

722
00:45:12.239 --> 00:45:15.519
<v Speaker 2>lot of skepticism and I think issue with privacy and

723
00:45:15.559 --> 00:45:19.800
<v Speaker 2>security. Companies don't want their data going into LLMs, even,

724
00:45:19.880 --> 00:45:22.280
<v Speaker 2>you know, if we would do this only for their LLM.

725
00:45:23.880 --> 00:45:28.239
<v Speaker 2>So what we found matters the most is really

726
00:45:29.280 --> 00:45:33.480
<v Speaker 2>the context that you provide. And I think what we've

727
00:45:33.480 --> 00:45:39.400
<v Speaker 2>found is the most valuable is the non technical stuff.

728
00:45:39.719 --> 00:45:41.760
<v Speaker 2>But what I mean by this is the human-generated

729
00:45:42.239 --> 00:45:48.000
<v Speaker 2>context. And when we link this to Rootly, it's

730
00:45:48.039 --> 00:45:53.000
<v Speaker 2>two things. The first one is the former postmortems.

731
00:45:53.039 --> 00:45:56.760
<v Speaker 2>like this is a gold mine of information. Most of

732
00:45:56.800 --> 00:46:02.760
<v Speaker 2>the time your system is unstable and it's gonna you know,

733
00:46:03.000 --> 00:46:05.880
<v Speaker 2>this area will remain unstable at least for some period

734
00:46:05.920 --> 00:46:08.760
<v Speaker 2>of time. Generally you have action items that your team

735
00:46:08.800 --> 00:46:12.920
<v Speaker 2>is supposed to implement. Sometimes these action items are done,

736
00:46:12.960 --> 00:46:17.440
<v Speaker 2>sometimes not. You know, there is always a priority issue

737
00:46:17.599 --> 00:46:21.039
<v Speaker 2>with: do we need to release this feature, or just fix this

738
00:46:21.079 --> 00:46:28.599
<v Speaker 2>potential bug. And the second thing is all the communications

739
00:46:28.639 --> 00:46:33.639
<v Speaker 2>that's happening on Slack or Teams or Zoom or Google Meet,

740
00:46:33.920 --> 00:46:40.519
<v Speaker 2>you know, and whatnot, where, that's how incidents are solved, right,

741
00:46:40.559 --> 00:46:43.320
<v Speaker 2>it's humans communicating with each other and sharing so much

742
00:46:43.360 --> 00:46:49.119
<v Speaker 2>information that's business specific, right? Like LLMs are trained on

743
00:46:49.199 --> 00:46:54.760
<v Speaker 2>a ton of data that's online, but it's not specific

744
00:46:54.920 --> 00:46:58.320
<v Speaker 2>to a company obviously, right, And and and so we

745
00:46:58.440 --> 00:47:02.440
<v Speaker 2>found that this data really boosts the results

746
00:47:02.440 --> 00:47:04.039
<v Speaker 2>that we get out of these tools.

747
00:47:04.199 --> 00:47:06.280
<v Speaker 1>Yeah, I mean, I think you you said it a

748
00:47:06.320 --> 00:47:08.920
<v Speaker 1>different way. I like the context of you have to

749
00:47:08.960 --> 00:47:13.039
<v Speaker 1>pull in the business criteria, understanding and context in order

750
00:47:13.079 --> 00:47:16.039
<v Speaker 1>to have a valuable output. And I think it's you know,

751
00:47:16.039 --> 00:47:18.760
<v Speaker 1>it can even be more than that. It's the fundamental

752
00:47:18.840 --> 00:47:20.960
<v Speaker 1>nature of LLMs that we have today, Like it's

753
00:47:21.000 --> 00:47:24.960
<v Speaker 1>not a it's transformer architecture, which you know, is fundamentally

754
00:47:25.039 --> 00:47:28.199
<v Speaker 1>lacking the reasoning piece, Like they'll never be able to reason,

755
00:47:28.239 --> 00:47:30.000
<v Speaker 1>which means they'll never be able to make a decision

756
00:47:30.400 --> 00:47:33.559
<v Speaker 1>based off of the business context. But they'll be able

757
00:47:33.559 --> 00:47:35.880
<v Speaker 1>to do a little bit better of pulling that in

758
00:47:35.920 --> 00:47:40.000
<v Speaker 1>and combining with the output that it would normally get. So,

759
00:47:40.559 --> 00:47:42.719
<v Speaker 1>you know, my one of my questions is here, Okay,

760
00:47:42.719 --> 00:47:44.719
<v Speaker 1>so building up a foundational model, and I think we've

761
00:47:44.719 --> 00:47:49.159
<v Speaker 1>heard this before on Adventures in DevOps, and that's that it's

762
00:47:49.199 --> 00:47:53.960
<v Speaker 1>incredibly expensive. Also, the industry is moving so quickly that the

763
00:47:54.000 --> 00:47:58.280
<v Speaker 1>new foundational models are just as good, So spending

764
00:47:58.320 --> 00:48:00.679
<v Speaker 1>money on building a new one doesn't make sense. Actually,

765
00:48:00.840 --> 00:48:03.400
<v Speaker 1>I think we heard one time that even fine tuning

766
00:48:03.400 --> 00:48:07.079
<v Speaker 1>models doesn't make sense because the next generation while like

767
00:48:07.159 --> 00:48:09.920
<v Speaker 1>say Anthropic's Claude three point seven versus three point five,

768
00:48:10.159 --> 00:48:13.000
<v Speaker 1>it's not it's not really that much of an improvement,

769
00:48:13.400 --> 00:48:16.559
<v Speaker 1>but you are getting up to date data. If anything. Rather,

770
00:48:16.639 --> 00:48:19.159
<v Speaker 1>you know the time stamp has changed, and if you

771
00:48:19.199 --> 00:48:23.000
<v Speaker 1>spend time training it, or fine-tuning it, rather, then by

772
00:48:23.039 --> 00:48:25.119
<v Speaker 1>the time the next one comes out, all your fine-tuning,

773
00:48:25.119 --> 00:48:27.239
<v Speaker 1>well, first of all, is a waste, second of all,

774
00:48:27.360 --> 00:48:29.719
<v Speaker 1>is expensive, and third of all, like you may be

775
00:48:29.719 --> 00:48:32.320
<v Speaker 1>able to throw your queries, or your prompts, at the

776
00:48:32.360 --> 00:48:34.800
<v Speaker 1>new model and get the right answer out anyway, So

777
00:48:35.039 --> 00:48:36.760
<v Speaker 1>it's good to hear that. Does that mean you're using

778
00:48:36.760 --> 00:48:42.320
<v Speaker 1>some sort, you're using like something from Ollama or

779
00:48:42.360 --> 00:48:43.239
<v Speaker 1>DeepSeek or something like that.

780
00:48:44.599 --> 00:48:48.880
<v Speaker 2>We found that what Anthropic provides is generally, you know,

781
00:48:48.920 --> 00:48:55.239
<v Speaker 2>the best performing. Ultimately, we are integrating a number of

782
00:48:55.280 --> 00:49:01.079
<v Speaker 2>different model providers and we use different models at different

783
00:49:01.119 --> 00:49:07.239
<v Speaker 2>steps of the process. You know, I cannot explain in

784
00:49:07.280 --> 00:49:10.000
<v Speaker 2>detail because it would be too long, but you know,

785
00:49:10.039 --> 00:49:14.360
<v Speaker 2>when we're basically like the the agent will come up

786
00:49:14.400 --> 00:49:18.440
<v Speaker 2>with an initial probe that you know, we compose and

787
00:49:19.119 --> 00:49:22.280
<v Speaker 2>I would say, like different model will for instance, coming

788
00:49:22.360 --> 00:49:25.519
<v Speaker 2>up with the let's say the master thesis of what

789
00:49:26.400 --> 00:49:32.000
<v Speaker 2>you know we need to look for might be better created

790
00:49:32.039 --> 00:49:35.480
<v Speaker 2>by some model, and then you know, the actual technical

791
00:49:35.920 --> 00:49:39.559
<v Speaker 2>part may be better done by another model. So it's and

792
00:49:39.599 --> 00:49:41.960
<v Speaker 2>it's a moving target, as you said, like the industry

793
00:49:42.000 --> 00:49:44.960
<v Speaker 2>is moving fast. There is a constant flow of new

794
00:49:45.000 --> 00:49:49.159
<v Speaker 2>models and so so you know, I don't think it's

795
00:49:49.159 --> 00:49:52.719
<v Speaker 2>something that's really set in stone.

796
00:49:53.280 --> 00:49:56.599
<v Speaker 1>Do you do something to validate model changes? So for instance,

797
00:49:56.599 --> 00:49:58.840
<v Speaker 1>when three point seven came out, are you still using

798
00:49:58.840 --> 00:50:00.800
<v Speaker 1>three point five before? Or you can, like you have

799
00:50:00.840 --> 00:50:03.880
<v Speaker 1>some sort of ender templates or system prompts that you

800
00:50:03.880 --> 00:50:06.960
<v Speaker 1>can throw out and validate that the answers still make sense,

801
00:50:07.039 --> 00:50:11.079
<v Speaker 1>that the RCAs and post mortems that you're doing still

802
00:50:11.400 --> 00:50:15.199
<v Speaker 1>are understandable and match, and somehow validating the outputs? Like

803
00:50:15.320 --> 00:50:16.559
<v Speaker 1>how does this process work for you?

804
00:50:17.239 --> 00:50:21.079
<v Speaker 2>Yeah, LangChain has a bunch of open source tools

805
00:50:21.119 --> 00:50:23.800
<v Speaker 2>that can allow you to do this. So we are

806
00:50:23.840 --> 00:50:31.000
<v Speaker 2>constantly you know, tracing every like all the different you know,

807
00:50:31.000 --> 00:50:33.480
<v Speaker 2>it's like kind of a tree, right, with different nodes

808
00:50:33.519 --> 00:50:38.239
<v Speaker 2>and paths, and we keep track of everything that's being done,

809
00:50:38.239 --> 00:50:41.960
<v Speaker 2>the reasoning, the output, and we constantly measure you know,

810
00:50:42.360 --> 00:50:47.519
<v Speaker 2>the performance. So that's definitely something that we do. That

811
00:50:47.639 --> 00:50:52.679
<v Speaker 2>being said, I think it's a challenge. It's still a challenge,

812
00:50:52.679 --> 00:50:58.320
<v Speaker 2>you know to really understand the the quality, how the

813
00:50:58.400 --> 00:51:03.679
<v Speaker 2>quality is shifting. There was a talk at SREcon in

814
00:51:03.760 --> 00:51:07.000
<v Speaker 2>Santa Clara a few weeks ago. I think it was

815
00:51:07.199 --> 00:51:13.719
<v Speaker 2>the AI director at ASIA who was speaking about one

816
00:51:13.719 --> 00:51:18.440
<v Speaker 2>of the model-based products that they are using, and

817
00:51:18.599 --> 00:51:21.039
<v Speaker 2>it was saying that it's very hard for them to

818
00:51:21.119 --> 00:51:30.119
<v Speaker 2>understand how to measure that, and they are relying on NPS.

819
00:51:30.639 --> 00:51:36.679
<v Speaker 2>So it's a Net Promoter Score, which is

820
00:51:36.719 --> 00:51:42.320
<v Speaker 2>basically an industry standard rating, which is like would you

821
00:51:42.360 --> 00:51:45.440
<v Speaker 2>recommend this to your friends and family, to assess how

822
00:51:45.480 --> 00:51:47.280
<v Speaker 2>their models are doing. And I think that was like

823
00:51:47.360 --> 00:51:48.800
<v Speaker 2>really shocking to the audience.

824
00:51:49.719 --> 00:51:52.119
<v Speaker 1>I mean, we do not We do know from experience

825
00:51:52.159 --> 00:51:56.320
<v Speaker 1>that like NPS is like totally wrong from the net standpoint,

826
00:51:56.519 --> 00:51:59.280
<v Speaker 1>because you should never ask from a human psychology standpoint,

827
00:51:59.280 --> 00:52:01.880
<v Speaker 1>you should never ask one what they would do, but

828
00:52:01.960 --> 00:52:05.679
<v Speaker 1>metrics on what they have done. And I think that's

829
00:52:05.719 --> 00:52:07.079
<v Speaker 1>often the problem. But I mean, I think it really

830
00:52:07.079 --> 00:52:09.320
<v Speaker 1>goes to show that there is no good way of

831
00:52:09.800 --> 00:52:12.599
<v Speaker 1>adequately measuring these things. You have to do it within

832
00:52:12.639 --> 00:52:14.559
<v Speaker 1>context of like what your business is doing, you know,

833
00:52:14.639 --> 00:52:17.760
<v Speaker 1>for instance, really being able to do the incident management.

834
00:52:18.119 --> 00:52:20.880
<v Speaker 1>And I do think that at least I know I

835
00:52:20.880 --> 00:52:23.159
<v Speaker 1>have this question, so I'm sure someone else does that.

836
00:52:23.199 --> 00:52:25.199
<v Speaker 1>There you're getting to the point where you don't want

837
00:52:25.239 --> 00:52:27.599
<v Speaker 1>to have to make the code changes to like go

838
00:52:27.719 --> 00:52:31.840
<v Speaker 1>into GitHub or GitLab or, heaven forbid, Bitbucket

839
00:52:31.920 --> 00:52:34.519
<v Speaker 1>or one of the other ones to actually put up

840
00:52:34.559 --> 00:52:36.599
<v Speaker 1>a pull request to fix the problem. Wouldn't it be

841
00:52:36.639 --> 00:52:39.159
<v Speaker 1>great if there was another LLM out there that had

842
00:52:39.159 --> 00:52:41.280
<v Speaker 1>the context of the source code and everything, and you

843
00:52:41.320 --> 00:52:43.719
<v Speaker 1>could just give it the output from Rootly and have

844
00:52:43.800 --> 00:52:46.039
<v Speaker 1>a different agent do that. And that for me means

845
00:52:46.360 --> 00:52:49.440
<v Speaker 1>you need to somehow integrate with other agents. And I

846
00:52:49.440 --> 00:52:54.719
<v Speaker 1>can't believe I'm saying this, but MCP, Model Context Protocol. Like,

847
00:52:55.239 --> 00:52:56.360
<v Speaker 1>how do you feel about that?

848
00:52:57.360 --> 00:52:59.760
<v Speaker 2>I'm a huge fan of this. I think you are, Like,

849
00:53:00.119 --> 00:53:03.079
<v Speaker 2>that's exactly the architectures that you need to have in mind.

850
00:53:04.119 --> 00:53:09.360
<v Speaker 2>It's not one agent, it's a collection of agents. And

851
00:53:09.480 --> 00:53:11.119
<v Speaker 2>it can go as deep as like, let's say, like

852
00:53:12.440 --> 00:53:15.800
<v Speaker 2>you are doing work with GitHub. You can have an

853
00:53:15.800 --> 00:53:20.639
<v Speaker 2>agent working on commits, one on pull requests, one, you know,

854
00:53:20.719 --> 00:53:24.239
<v Speaker 2>like on each type of different resources that you may

855
00:53:24.280 --> 00:53:26.920
<v Speaker 2>have with GitHub. You really need to tailor the agents

856
00:53:27.000 --> 00:53:31.800
<v Speaker 2>for them to do their best job. Because again,

857
00:53:31.880 --> 00:53:36.360
<v Speaker 2>like they don't always have the business logic and understanding

858
00:53:36.440 --> 00:53:40.480
<v Speaker 2>that we do as a human and bringing this context

859
00:53:41.639 --> 00:53:44.280
<v Speaker 2>into each of the small, like, subsets of agents

860
00:53:44.880 --> 00:53:48.559
<v Speaker 2>is critical. For MCP, I, you know, I'm a big

861
00:53:48.599 --> 00:53:53.159
<v Speaker 2>fan of MCP. I've been wearing an MCP badge at

862
00:53:53.199 --> 00:53:57.039
<v Speaker 2>SREcon and KubeCon because I truly think it's it's the

863
00:53:57.280 --> 00:54:02.199
<v Speaker 2>technology itself is nothing, you know, amazing, Like at the

864
00:54:02.280 --> 00:54:03.920
<v Speaker 2>end of the day, it's just a gateway to

865
00:54:03.960 --> 00:54:04.559
<v Speaker 2>an API.

866
00:54:05.159 --> 00:54:06.960
<v Speaker 1>I mean, I really want, I really want to stress

867
00:54:07.000 --> 00:54:09.039
<v Speaker 1>that enough, Like anyone who's not caught up on this,

868
00:54:09.199 --> 00:54:12.320
<v Speaker 1>like it's nothing special, Like just imagine you deploy a

869
00:54:12.480 --> 00:54:15.719
<v Speaker 1>new API or reverse proxy or CDN in front of your

870
00:54:15.719 --> 00:54:19.559
<v Speaker 1>existing software and you're mapping from one protocol to another

871
00:54:19.599 --> 00:54:22.519
<v Speaker 1>one like from TCP to UDP, from HTTP to

872
00:54:23.760 --> 00:54:27.440
<v Speaker 1>or from REST to gRPC. It's really just another one.

873
00:54:27.519 --> 00:54:29.840
<v Speaker 1>And I think the joke right now is that the

874
00:54:30.000 --> 00:54:32.519
<v Speaker 1>S in MCP stands for security.

875
00:54:33.119 --> 00:54:35.960
<v Speaker 2>Yeah, you know, I think I mean that. Yeah, I

876
00:54:36.000 --> 00:54:37.679
<v Speaker 2>think we go back to the you know what we

877
00:54:37.760 --> 00:54:40.199
<v Speaker 2>discussed at the beginning of the conversation. You have the engineer,

878
00:54:40.280 --> 00:54:42.840
<v Speaker 2>who will be a hater, and obviously there is a

879
00:54:42.920 --> 00:54:46.440
<v Speaker 2>lot of things that are wrong with this, with this protocol,

880
00:54:46.880 --> 00:54:50.440
<v Speaker 2>it's not stable, it's full of bugs, security is absolutely

881
00:54:50.559 --> 00:54:53.679
<v Speaker 2>a top concern. But I think what's interesting is the concept

882
00:54:53.760 --> 00:55:00.119
<v Speaker 2>of bridging this world between, uh, these AI agents and

883
00:55:00.880 --> 00:55:04.679
<v Speaker 2>all the resources that are out there, whether it's data

884
00:55:04.800 --> 00:55:10.119
<v Speaker 2>or systems. And MCP, just think of it as USB, right?

885
00:55:10.239 --> 00:55:16.840
<v Speaker 2>it's facilitating the conversation between between these two entities. It's

886
00:55:16.840 --> 00:55:23.440
<v Speaker 2>it's open source and it really unleashes a lot

887
00:55:23.440 --> 00:55:29.079
<v Speaker 2>of power for AI agents and data sources, which

888
00:55:29.159 --> 00:55:30.639
<v Speaker 2>as you say, most of the time is going to

889
00:55:30.639 --> 00:55:33.639
<v Speaker 2>be REST APIs, to communicate in a way that's very

890
00:55:33.679 --> 00:55:38.480
<v Speaker 2>optimized, because, one, there are many issues with agents

891
00:55:38.559 --> 00:55:42.440
<v Speaker 2>consuming, most of the time, REST APIs. As

892
00:55:42.440 --> 00:55:44.639
<v Speaker 2>we say, like they may not have the business logic.

893
00:55:45.159 --> 00:55:48.920
<v Speaker 2>So getting information from an API may require you

894
00:55:48.960 --> 00:55:54.519
<v Speaker 2>to do multiple calls to multiple routes, and you know,

895
00:55:54.679 --> 00:55:57.880
<v Speaker 2>like the LLM may not know if it's feasible,

896
00:55:58.360 --> 00:56:00.320
<v Speaker 2>they may not use the best way to do that

897
00:56:00.400 --> 00:56:04.639
<v Speaker 2>may get lost. And so MCP is really enabling this,

898
00:56:04.840 --> 00:56:09.599
<v Speaker 2>removing all of all of this complexity. And for instance,

899
00:56:09.639 --> 00:56:12.639
<v Speaker 2>I built an MCP server for Rootly and what

900
00:56:13.199 --> 00:56:16.519
<v Speaker 2>it allows the developer to do is to when they

901
00:56:16.519 --> 00:56:20.320
<v Speaker 2>get paged, instead of opening, you know, the web app,

902
00:56:20.320 --> 00:56:23.400
<v Speaker 2>going to Rootly, looking at the incident, and, you know, it

903
00:56:23.480 --> 00:56:27.159
<v Speaker 2>takes time, context switching, which we know is bad for developers.

904
00:56:27.599 --> 00:56:32.639
<v Speaker 2>They can just ask into their favorite power, I get

905
00:56:32.679 --> 00:56:35.039
<v Speaker 2>me the latest incident. It's going to pop up

906
00:56:35.039 --> 00:56:40.599
<v Speaker 2>in their chat, and assuming it's simple enough and

907
00:56:40.719 --> 00:56:44.639
<v Speaker 2>there is enough data in the payload of the incident,

908
00:56:44.719 --> 00:56:47.559
<v Speaker 2>you can ask in this case, I use yourself to

909
00:56:47.599 --> 00:56:51.760
<v Speaker 2>fix the incident. And so you go from production incident

910
00:56:51.880 --> 00:56:56.159
<v Speaker 2>to resolution in a matter of a minute. Again, it's

911
00:56:56.280 --> 00:56:58.760
<v Speaker 2>as you said, it's you know, some people were like, yeah,

912
00:56:58.800 --> 00:57:02.039
<v Speaker 2>it's a joke, it's it's not truly, it's not revolutionary,

913
00:57:02.039 --> 00:57:04.440
<v Speaker 2>it's not. But I think what's great is that it

914
00:57:04.840 --> 00:57:09.800
<v Speaker 2>allows workflows to be done and it reduces a lot

915
00:57:09.840 --> 00:57:14.119
<v Speaker 2>of friction. And we see a lot of companies and

916
00:57:14.119 --> 00:57:19.480
<v Speaker 2>customers like Canvas and Bricks. They are huge engineering organizations

917
00:57:19.599 --> 00:57:23.920
<v Speaker 2>and they're, like, investing so much into MCP because they

918
00:57:24.000 --> 00:57:27.320
<v Speaker 2>want their developers to remain where they produce the most value,

919
00:57:27.679 --> 00:57:30.119
<v Speaker 2>which is in their IDE, and so they are trying

920
00:57:30.159 --> 00:57:34.280
<v Speaker 2>to bring as many, you know, MCP servers. And then

921
00:57:34.320 --> 00:57:38.000
<v Speaker 2>it doesn't matter if it's MCP. Actually, IBM released

922
00:57:38.039 --> 00:57:43.519
<v Speaker 2>a competing protocol which is called ACP, which does the

923
00:57:43.559 --> 00:57:46.480
<v Speaker 2>same thing. But you know, they're trying to bring all

924
00:57:46.519 --> 00:57:50.159
<v Speaker 2>the contexts and the context that engineers need to do

925
00:57:50.239 --> 00:57:54.079
<v Speaker 2>their work into the IDE, and MCP is allowing just this.

926
00:57:55.000 --> 00:57:57.480
<v Speaker 1>I think I'd be remiss if I didn't bring up Randall

927
00:57:57.480 --> 00:58:02.039
<v Speaker 1>Munroe's comic on the we have fourteen competing standards for this,

928
00:58:02.480 --> 00:58:05.800
<v Speaker 1>you know what, we need one universal standard to do this,

929
00:58:05.960 --> 00:58:08.440
<v Speaker 1>you know, and then time later we have fifteen competing

930
00:58:08.480 --> 00:58:11.239
<v Speaker 1>standards for this. I mean, because there really are. There's

931
00:58:11.360 --> 00:58:13.920
<v Speaker 1>like, AWS came out with, not long ago, Smithy

932
00:58:14.079 --> 00:58:20.320
<v Speaker 1>for, uh, HTTP services, a design pattern for documenting their APIs.

933
00:58:20.599 --> 00:58:24.119
<v Speaker 1>We had OpenAPI Specification. It's on version three point

934
00:58:24.119 --> 00:58:26.679
<v Speaker 1>one right now, so that's you know, three versions later,

935
00:58:27.119 --> 00:58:29.480
<v Speaker 1>and there's a whole bunch of these that different companies use,

936
00:58:29.639 --> 00:58:32.199
<v Speaker 1>and I think the biggest trouble a lot of them

937
00:58:32.239 --> 00:58:36.079
<v Speaker 1>have is that, like we have OpenAPI Specification for authors,

938
00:58:36.400 --> 00:58:39.719
<v Speaker 1>is that even getting a human to understand what

939
00:58:39.880 --> 00:58:43.519
<v Speaker 1>was written there is quite challenging, and so like feeding

940
00:58:43.559 --> 00:58:46.239
<v Speaker 1>that into a model is you know, nonsensical, Like it's

941
00:58:46.280 --> 00:58:47.480
<v Speaker 1>just not going to get you the word. As you

942
00:58:47.559 --> 00:58:50.280
<v Speaker 1>pointed out, often the pattern is multiple things. I mean,

943
00:58:50.320 --> 00:58:51.840
<v Speaker 1>we have things like GraphQL, which you know has

944
00:58:51.880 --> 00:58:54.719
<v Speaker 1>its own problems and whatnot. So I think we're just

945
00:58:54.719 --> 00:58:56.719
<v Speaker 1>going to keep seeing more of these and I don't

946
00:58:56.760 --> 00:58:58.239
<v Speaker 1>think we're ever going to really be able to settle

947
00:58:58.280 --> 00:58:59.920
<v Speaker 1>on one. It would be nice if we could have one.

948
00:59:00.199 --> 00:59:03.239
<v Speaker 1>The thing about MCP is, even if we pretend for

949
00:59:03.280 --> 00:59:05.159
<v Speaker 1>one moment, it's the worst thing in the world. As

950
00:59:05.199 --> 00:59:08.639
<v Speaker 1>you pointed out, like I think Azure, GCP, and AWS

951
00:59:08.679 --> 00:59:16.880
<v Speaker 1>all released MCP servers for their built-in, like, AI products,

952
00:59:17.480 --> 00:59:19.639
<v Speaker 1>so you can interact with AWS Bedrock through an

953
00:59:19.719 --> 00:59:22.800
<v Speaker 1>MCP server and like, so irrelevant what you think about that.

954
00:59:23.039 --> 00:59:26.719
<v Speaker 1>It now exists and large companies have put some effort

955
00:59:26.800 --> 00:59:28.960
<v Speaker 1>behind it, and maybe they're just trying to capture some

956
00:59:29.000 --> 00:59:32.159
<v Speaker 1>of the market share and later things can evolve. I

957
00:59:32.199 --> 00:59:35.039
<v Speaker 1>do think that, especially if a lot of companies are

958
00:59:35.079 --> 00:59:38.960
<v Speaker 1>going for speed over quality, that we may not get

959
00:59:39.079 --> 00:59:43.360
<v Speaker 1>for like that many more iterations of a protocol to work.

960
00:59:43.400 --> 00:59:47.599
<v Speaker 1>I mean, any I'll take this over the using sound

961
00:59:48.119 --> 00:59:52.039
<v Speaker 1>high frequencies to communicate between devices that have, you know, LLMs,

962
00:59:52.039 --> 00:59:54.480
<v Speaker 1>like I don't need that. It can go over the internet, please, Like,

963
00:59:54.800 --> 00:59:57.920
<v Speaker 1>that's where I'm comfortable with my security. I'm not comfortable

964
00:59:57.960 --> 01:00:00.800
<v Speaker 1>with things going through the airwaves because otherwise it's going

965
01:00:00.880 --> 01:00:04.280
<v Speaker 1>to be the Alexa. Please order me, you know, another

966
01:00:04.599 --> 01:00:07.599
<v Speaker 1>twenty four roll of toilet paper from an advertisement running

967
01:00:07.599 --> 01:00:10.599
<v Speaker 1>on my television and actually have it happen and like

968
01:00:10.639 --> 01:00:12.719
<v Speaker 1>this is recorded, that doesn't happen. Like, so I I

969
01:00:13.719 --> 01:00:16.119
<v Speaker 1>don't need that to happen. People will have this happening.

970
01:00:16.199 --> 01:00:18.159
<v Speaker 1>So I think MCP is still a little bit more

971
01:00:18.199 --> 01:00:20.400
<v Speaker 1>secure than some of these other protocols that are out there.

972
01:00:20.800 --> 01:00:23.599
<v Speaker 2>It is. Yeah, you know again, I'm not you know,

973
01:00:24.239 --> 01:00:28.320
<v Speaker 2>I'm not an MCP evangelist. I think I'm not vouching

974
01:00:28.360 --> 01:00:31.920
<v Speaker 2>for the technology, but not the concept. I think there's

975
01:00:32.199 --> 01:00:36.039
<v Speaker 2>some serious limitation, a lot of issue with it. I

976
01:00:36.079 --> 01:00:38.199
<v Speaker 2>think one is security, which I think we've already discussed.

977
01:00:38.280 --> 01:00:40.400
<v Speaker 2>We won't let that. But I think one issue is,

978
01:00:40.440 --> 01:00:44.519
<v Speaker 2>for instance, you spoke about OpenAPI, so you can

979
01:00:44.599 --> 01:00:48.920
<v Speaker 2>actually feed in your OpenAPI and MCP can use this

980
01:00:48.960 --> 01:00:51.599
<v Speaker 2>as a reference, which is great because you know, if

981
01:00:51.639 --> 01:00:55.519
<v Speaker 2>you if your API is constantly updated with the latest

982
01:00:56.639 --> 01:00:59.000
<v Speaker 2>you know state and translated into an API, then you

983
01:00:59.000 --> 01:01:01.000
<v Speaker 2>make sure your MCP server is always up to date.

984
01:01:01.400 --> 01:01:05.519
<v Speaker 2>What we found out at Rootly is that because we

985
01:01:05.559 --> 01:01:10.320
<v Speaker 2>work with large corporations like LinkedIn, Canva and Cisco and

986
01:01:10.360 --> 01:01:13.119
<v Speaker 2>so on and so forth, they have like very specific

987
01:01:14.079 --> 01:01:17.599
<v Speaker 2>requests in how they want to run their incident management.

988
01:01:17.639 --> 01:01:20.400
<v Speaker 2>So our API is very verbose. We have a lot

989
01:01:20.440 --> 01:01:24.840
<v Speaker 2>of routes to please our customers, and if you expose

990
01:01:24.880 --> 01:01:27.519
<v Speaker 2>all of this to MCP, it's going to get lost

991
01:01:27.559 --> 01:01:30.800
<v Speaker 2>in it, even though it's supposed you know to do this,

992
01:01:30.880 --> 01:01:34.159
<v Speaker 2>So you need to restrict the number of routes that

993
01:01:34.519 --> 01:01:39.719
<v Speaker 2>you expose. And the second thing is even at the

994
01:01:39.800 --> 01:01:43.480
<v Speaker 2>next level in the MCP server chain is within the

995
01:01:43.559 --> 01:01:49.440
<v Speaker 2>client in the editor like what people recommend, you can

996
01:01:49.519 --> 01:01:53.800
<v Speaker 2>have up to five to ten MCP servers. After that

997
01:01:54.519 --> 01:01:57.119
<v Speaker 2>your local agent is going to get lost because again

998
01:01:57.199 --> 01:02:01.199
<v Speaker 2>too much context. So you know that this technology is

999
01:02:01.840 --> 01:02:03.400
<v Speaker 2>I don't know if it's going to mature or something

1000
01:02:03.519 --> 01:02:06.280
<v Speaker 2>is going to replace it. You know, then you need

1001
01:02:06.320 --> 01:02:10.639
<v Speaker 2>to envision maybe something that centralizes these MCP servers into

1002
01:02:10.840 --> 01:02:12.920
<v Speaker 2>a central hub so you don't have to configure like

1003
01:02:12.960 --> 01:02:17.320
<v Speaker 2>fifty of them. But I think it's on the right

1004
01:02:17.400 --> 01:02:21.920
<v Speaker 2>track and I think we see adoption and but but yeah,

1005
01:02:22.320 --> 01:02:26.760
<v Speaker 2>we will see where this moves. OpenAI recently

1006
01:02:26.880 --> 01:02:29.800
<v Speaker 2>announced that they are supporting MCP, which you know is

1007
01:02:30.360 --> 01:02:35.239
<v Speaker 2>is interesting because they're competing with Anthropic. So yeah, I

1008
01:02:35.280 --> 01:02:37.840
<v Speaker 2>think there will be more of this for sure.

1009
01:02:38.159 --> 01:02:40.639
<v Speaker 1>But actually my pick at the end of the

1010
01:02:40.639 --> 01:02:42.679
<v Speaker 1>episode will actually relate to that. So I think it's

1011
01:02:42.719 --> 01:02:45.360
<v Speaker 1>really interesting that you brought that up. Yeah, I mean

1012
01:02:45.400 --> 01:02:48.639
<v Speaker 1>there's a lot. There's a lot there realistically, and I

1013
01:02:48.880 --> 01:02:50.719
<v Speaker 1>don't like, unless you need it, you probably don't need

1014
01:02:50.719 --> 01:02:53.559
<v Speaker 1>to spend any time looking at MCP, uh, you know,

1015
01:02:53.679 --> 01:02:57.000
<v Speaker 1>it's highly specific here for agents talking, communicating with

1016
01:02:57.039 --> 01:02:59.440
<v Speaker 1>each other. I think the hard problem that we'll get

1017
01:02:59.440 --> 01:03:04.239
<v Speaker 1>to very quickly is at scale, being concise and meaningful

1018
01:03:04.440 --> 01:03:07.840
<v Speaker 1>and focused on what the business value is is going

1019
01:03:07.880 --> 01:03:10.719
<v Speaker 1>to be even more important. And arguably it has always

1020
01:03:10.719 --> 01:03:13.480
<v Speaker 1>been important, but it's very easy to add another route

1021
01:03:13.559 --> 01:03:17.719
<v Speaker 1>to your OpenAPI specification or your, you know, your

1022
01:03:17.760 --> 01:03:20.719
<v Speaker 1>your web service or whatever you're running and having users

1023
01:03:20.719 --> 01:03:23.119
<v Speaker 1>should be like, oh, they'll deal with it, right, they'll

1024
01:03:23.119 --> 01:03:25.719
<v Speaker 1>deal with the problem. And I think realistically, you know,

1025
01:03:25.800 --> 01:03:28.880
<v Speaker 1>you want to be as clear and concise about what

1026
01:03:28.920 --> 01:03:31.079
<v Speaker 1>you're offering and what your business is and what the

1027
01:03:31.079 --> 01:03:34.559
<v Speaker 1>product is offering, but still give your customers freedom to

1028
01:03:34.760 --> 01:03:38.239
<v Speaker 1>utilize your product how you want. And now you are

1029
01:03:38.239 --> 01:03:40.960
<v Speaker 1>almost required to make it happen because of limited context

1030
01:03:40.960 --> 01:03:45.199
<v Speaker 1>windows for LLMs, for agents. For MCP it's going

1031
01:03:45.239 --> 01:03:46.960
<v Speaker 1>to be even more a problem. I mean, you scared

1032
01:03:47.000 --> 01:03:49.159
<v Speaker 1>me by saying two to five. I feel like if

1033
01:03:49.159 --> 01:03:51.000
<v Speaker 1>you have any more than one, I think you really

1034
01:03:51.079 --> 01:03:52.679
<v Speaker 1>have to question, you know, what the thing is that

1035
01:03:52.719 --> 01:03:55.440
<v Speaker 1>you're fundamentally offering. I mean I do see platforms like

1036
01:03:56.400 --> 01:03:58.440
<v Speaker 1>Atlassian's, where like you may have one

1037
01:03:58.480 --> 01:04:01.239
<v Speaker 1>for Jira and one for Confluence, you know, because that's

1038
01:04:01.239 --> 01:04:03.599
<v Speaker 1>like a knowledge base, and there's like the day issues

1039
01:04:03.719 --> 01:04:06.719
<v Speaker 1>and one for maybe the Git server. Each one of

1040
01:04:06.760 --> 01:04:10.119
<v Speaker 1>those could potentially be a different server. You say you're

1041
01:04:10.159 --> 01:04:12.199
<v Speaker 1>not an evangelist, but you are the first person on

1042
01:04:12.239 --> 01:04:15.760
<v Speaker 1>this episode, on this podcast to come on and say MCP,

1043
01:04:16.239 --> 01:04:20.599
<v Speaker 1>so that I think, by definition makes you the evangelist.

1044
01:04:22.159 --> 01:04:25.079
<v Speaker 1>And I think there may be a good a good

1045
01:04:25.159 --> 01:04:27.880
<v Speaker 1>moment to switch over to picks. But before we do that,

1046
01:04:27.880 --> 01:04:29.559
<v Speaker 1>I'll ask you, you know, is there any one last thing

1047
01:04:29.599 --> 01:04:30.440
<v Speaker 1>that you want to share?

1048
01:04:30.960 --> 01:04:35.159
<v Speaker 2>Yeah, if you are curious about MCP, and you know

1049
01:04:35.159 --> 01:04:37.599
<v Speaker 2>I've been to KubeCon, SREcon, and the vast majority

1050
01:04:37.599 --> 01:04:41.000
<v Speaker 2>of people still don't know about it. We are organizing

1051
01:04:41.119 --> 01:04:46.280
<v Speaker 2>an event on April twenty fourth at GitHub in San Francisco.

1052
01:04:46.360 --> 01:04:53.079
<v Speaker 2>We'll have speakers from Browserbase, Anthropic, OpenAI, GitHub,

1053
01:04:53.880 --> 01:04:56.719
<v Speaker 2>Factory AI, and a lot of other companies. We have

1054
01:04:56.920 --> 01:05:00.360
<v Speaker 2>demos and a panel. We'll go over what the

1055
01:05:00.400 --> 01:05:03.920
<v Speaker 2>heck is MCP and I think more broadly, you know,

1056
01:05:03.960 --> 01:05:06.400
<v Speaker 2>as we would chart it, like where what does this

1057
01:05:06.480 --> 01:05:10.679
<v Speaker 2>mean for the industry and where is this going? So yeah,

1058
01:05:10.719 --> 01:05:13.719
<v Speaker 2>if you type MCP Rootly event GitHub on

1059
01:05:13.800 --> 01:05:16.400
<v Speaker 2>Google or perhaps we can share this in the description

1060
01:05:16.480 --> 01:05:17.159
<v Speaker 2>of the episode.

1061
01:05:17.440 --> 01:05:20.840
<v Speaker 1>Yeah, for sure, there'll be a link. Okay, then I

1062
01:05:20.840 --> 01:05:22.880
<v Speaker 1>think it's a great point to move on to our

1063
01:05:23.079 --> 01:05:27.719
<v Speaker 1>our picks. Uh. So I'll go first. My pick is

1064
01:05:27.760 --> 01:05:32.360
<v Speaker 1>this short article online by Ed Zitron. He has a

1065
01:05:32.400 --> 01:05:36.960
<v Speaker 1>blog and it's called where the Money. Uh. It's he's

1066
01:05:37.039 --> 01:05:40.039
<v Speaker 1>arguing that there's no AI revolution. Uh. You know, if

1067
01:05:40.079 --> 01:05:44.760
<v Speaker 1>you look at companies like Anthropic and OpenAI, they're

1068
01:05:45.119 --> 01:05:47.760
<v Speaker 1>funneling tons of money into it and they're not

1069
01:05:47.840 --> 01:05:50.920
<v Speaker 1>getting the value out and so in a way they're

1070
01:05:50.920 --> 01:05:53.760
<v Speaker 1>doing the nice thing of subsidizing all our great AI usage,

1071
01:05:53.800 --> 01:05:55.559
<v Speaker 1>you know, so get it while the fountain is going.

1072
01:05:56.199 --> 01:05:58.840
<v Speaker 1>Rootly's got a great one, it seems. You know, there

1073
01:05:58.880 --> 01:06:02.039
<v Speaker 1>are ones out there. Uh, it's just it's a really

1074
01:06:02.039 --> 01:06:05.199
<v Speaker 1>great breakdown of you know, how companies are supposed to work,

1075
01:06:05.559 --> 01:06:07.719
<v Speaker 1>where the money is coming from, you know, where

1076
01:06:07.719 --> 01:06:10.119
<v Speaker 1>it's being spent, and challenging some of those assumptions. So

1077
01:06:10.519 --> 01:06:14.119
<v Speaker 1>if you are only optimistic about everything related to AI,

1078
01:06:14.239 --> 01:06:17.519
<v Speaker 1>I highly recommend reading the article because there's there's a

1079
01:06:17.519 --> 01:06:19.840
<v Speaker 1>bunch of really good points that are made that are

1080
01:06:19.880 --> 01:06:21.199
<v Speaker 1>are hard to argue against.

1081
01:06:23.079 --> 01:06:28.159
<v Speaker 2>Love it. Yeah, that's an interesting question. Yeah, I think

1082
01:06:28.199 --> 01:06:32.679
<v Speaker 2>the, you know, AGI and, you know, the

1083
01:06:32.760 --> 01:06:35.960
<v Speaker 2>goal of getting to this great intelligence, so you

1084
01:06:36.119 --> 01:06:38.800
<v Speaker 2>see that, you know, that's why the money is just bringing.

1085
01:06:39.559 --> 01:06:42.920
<v Speaker 1>Yeah, I mean there is this theory that basically we

1086
01:06:43.079 --> 01:06:46.880
<v Speaker 1>can spend literally all of humanity's resources to achieve this

1087
01:06:46.960 --> 01:06:49.480
<v Speaker 1>because once we have it, it will produce so much value.

1088
01:06:50.239 --> 01:06:53.639
<v Speaker 1>That's you know, that theory hasn't been proven yet, but

1089
01:06:54.119 --> 01:06:56.199
<v Speaker 1>I'll leave it to people to read the article.

1090
01:06:56.800 --> 01:06:59.639
<v Speaker 1>He's articulated this much better than I have. Okay, so

1091
01:07:00.079 --> 01:07:01.320
<v Speaker 1>what have you got for us today?

1092
01:07:01.440 --> 01:07:04.480
<v Speaker 2>Well, it's going to be my pick that

1093
01:07:04.559 --> 01:07:07.480
<v Speaker 2>I wrote, and I know it's going to be controversial,

1094
01:07:07.519 --> 01:07:10.320
<v Speaker 2>which is why I want to share it even better.

1095
01:07:12.320 --> 01:07:14.440
<v Speaker 2>You know, we spoke a lot about this. We didn't

1096
01:07:14.480 --> 01:07:17.000
<v Speaker 2>speak a lot about this episode, but online everybody is

1097
01:07:17.000 --> 01:07:22.159
<v Speaker 2>speaking about vibe coding, and so I think what's coming

1098
01:07:22.199 --> 01:07:28.639
<v Speaker 2>for us SREs is incident vibing, because the number

1099
01:07:28.920 --> 01:07:31.960
<v Speaker 2>of incidents that are going to come our way is

1100
01:07:32.000 --> 01:07:35.039
<v Speaker 2>probably going to increase. And more importantly, I

1101
01:07:35.039 --> 01:07:38.960
<v Speaker 2>think a lot of the fundamentals that make an engineering

1102
01:07:39.039 --> 01:07:43.519
<v Speaker 2>organization solid are going away. A few things. For instance,

1103
01:07:43.800 --> 01:07:47.239
<v Speaker 2>I think a team that knows their code base very well,

1104
01:07:47.920 --> 01:07:50.800
<v Speaker 2>it's kind of going away because humans are not doing

1105
01:07:51.199 --> 01:07:55.079
<v Speaker 2>the coding anymore, right, they are merely like reading it

1106
01:07:55.159 --> 01:07:58.519
<v Speaker 2>doing code review. Perhaps they will, you know, use another LLM,

1107
01:07:58.920 --> 01:08:01.679
<v Speaker 2>another model to do the code review of another model.

1108
01:08:02.519 --> 01:08:05.280
<v Speaker 2>But anyway, I think in general we know that the

1109
01:08:05.320 --> 01:08:06.920
<v Speaker 2>knowledge of the code base is going to go down.

1110
01:08:07.400 --> 01:08:12.000
<v Speaker 2>The other one is having subject matter experts in some fields,

1111
01:08:12.599 --> 01:08:15.480
<v Speaker 2>especially as your company grows. You know, let's say maybe

1112
01:08:16.000 --> 01:08:18.520
<v Speaker 2>you want someone with like very sharp on database or

1113
01:08:18.600 --> 01:08:21.840
<v Speaker 2>website or whatever it is. And this again is going

1114
01:08:21.840 --> 01:08:25.319
<v Speaker 2>away because of what I've just mentioned, but also because

1115
01:08:26.239 --> 01:08:28.600
<v Speaker 2>I think it's going to be increasingly harder for young

1116
01:08:29.039 --> 01:08:34.840
<v Speaker 2>professionals to gain this experience and this flair that senior

1117
01:08:34.880 --> 01:08:38.520
<v Speaker 2>engineers have. And so what's the solution? I think it's

1118
01:08:38.560 --> 01:08:42.359
<v Speaker 2>incident vibing. And I think it's one of those stories

1119
01:08:42.439 --> 01:08:45.039
<v Speaker 2>where if you cannot beat them, you should join them.

1120
01:08:46.800 --> 01:08:49.039
<v Speaker 2>And so in this article I speak about some

1121
01:08:49.159 --> 01:08:51.840
<v Speaker 2>of the ways that companies can get ready

1122
01:08:51.920 --> 01:08:53.359
<v Speaker 2>for incident vibing.

1123
01:08:54.319 --> 01:08:56.760
<v Speaker 1>I love it. Well, we'll share that link in the picks

1124
01:08:56.840 --> 01:09:00.880
<v Speaker 1>section of the episode. I mean, I both love

1125
01:09:00.920 --> 01:09:03.399
<v Speaker 1>and hate your pick honestly, because, like, I'm

1126
01:09:03.439 --> 01:09:06.199
<v Speaker 1>so with you that vibe coding is terrible. And if

1127
01:09:06.239 --> 01:09:08.319
<v Speaker 1>we look at the DORA report or the episode we

1128
01:09:08.319 --> 01:09:09.960
<v Speaker 1>did on the DORA report from twenty twenty four, we

1129
01:09:10.000 --> 01:09:14.399
<v Speaker 1>see that the LLMs sacrifice speed for quality. We also

1130
01:09:14.439 --> 01:09:16.640
<v Speaker 1>know that there's a huge problem coming and companies are

1131
01:09:16.680 --> 01:09:19.079
<v Speaker 1>still adopting it. So you have to live with the outcome,

1132
01:09:19.119 --> 01:09:21.239
<v Speaker 1>Like even if you are using LLMs as best as

1133
01:09:21.279 --> 01:09:23.920
<v Speaker 1>you can, that means you're gonna get more incidents.

1134
01:09:24.039 --> 01:09:28.600
<v Speaker 1>And so I'm totally with you. I hate that this

1135
01:09:28.680 --> 01:09:31.680
<v Speaker 1>is happening, but, uh, there's no avoiding it. And so

1136
01:09:31.720 --> 01:09:36.000
<v Speaker 1>the next level is also vibing the incident resolution. Okay,

1137
01:09:36.319 --> 01:09:36.680
<v Speaker 1>it is.

1138
01:09:36.720 --> 01:09:42.000
<v Speaker 2>And we've seen companies, you know, hiring engineers

1139
01:09:42.239 --> 01:09:45.119
<v Speaker 2>and they cannot code, they can

1140
01:09:45.159 --> 01:09:48.680
<v Speaker 2>only prompt. And yeah, whether you like it or not,

1141
01:09:48.760 --> 01:09:51.279
<v Speaker 2>it's happening, it's coming. It's the future of software engineering

1142
01:09:51.359 --> 01:09:54.199
<v Speaker 2>in some capacity. And so I, you know, I just

1143
01:09:54.279 --> 01:09:56.279
<v Speaker 2>think we need to get ready for it. That's the

1144
01:09:56.399 --> 01:09:57.399
<v Speaker 2>only thing you can do.

1145
01:09:58.359 --> 01:10:00.960
<v Speaker 1>I mean, I love the perspective. You know, it

1146
01:10:01.199 --> 01:10:04.399
<v Speaker 1>doesn't matter if you agree or disagree with utilizing it.

1147
01:10:04.399 --> 01:10:08.159
<v Speaker 1>It's happening. Uh, and with that I'll say, thank you

1148
01:10:08.159 --> 01:10:10.920
<v Speaker 1>Sylvain so much for coming on this episode and sharing

1149
01:10:10.960 --> 01:10:13.319
<v Speaker 1>your perspective and what Rootly has been doing.

1150
01:10:13.720 --> 01:10:14.840
<v Speaker 2>Thank you very much for having me.

1151
01:10:15.359 --> 01:10:18.000
<v Speaker 1>Yeah, and thanks to all the listeners and viewers of

1152
01:10:18.039 --> 01:10:18.600
<v Speaker 1>this podcast.
