WEBVTT

1
00:00:14.519 --> 00:00:19.000
What's going on, everybody? Welcome
to another episode of Adventures in DevOps.

2
00:00:19.039 --> 00:00:22.719
I'm your host, Will Button, and joining
me today, this is going to

3
00:00:22.760 --> 00:00:27.960
be exciting. I have the founder
of pganalyze and a member of the

4
00:00:28.000 --> 00:00:32.079
founding team at Product Hunt. I've
got Lukas Fittl. Welcome, Lukas.

5
00:00:32.719 --> 00:00:35.560
Thank you, happy to be here.
Right on. I'm excited to have you.

6
00:00:35.719 --> 00:00:43.719
And this follows on nicely to an
episode we had two episodes ago talking

7
00:00:43.840 --> 00:00:49.920
about the role of DBAs in our
environment, you know, with

8
00:00:49.960 --> 00:00:53.439
the movement that's happened over the last
decade. You know, whenever I started,

9
00:00:53.520 --> 00:00:58.799
DBA was like the premier role to
have in technology. You know,

10
00:00:58.880 --> 00:01:03.840
being a database administrator was like the
ultimate role because you could

11
00:01:03.840 --> 00:01:07.480
just say no and shut everyone down, and you know, the company was

12
00:01:07.519 --> 00:01:11.200
almost powerless to do anything about it. Not that that was your goal,

13
00:01:11.319 --> 00:01:15.120
but like that was the prestige that
the DBA had. And these days we

14
00:01:15.239 --> 00:01:22.480
just don't see as many DBAs around, but those tasks that the DBAs do

15
00:01:22.799 --> 00:01:26.079
still exist, and so I'm interested
to hear your perspective on that. But

16
00:01:26.319 --> 00:01:30.640
before we jump into that, give
us a little bit about your background.

17
00:01:30.719 --> 00:01:33.719
How you got to this point.
Sure, yeah, and you know,

18
00:01:34.640 --> 00:01:37.400
I'll try to relate this also to
Postgres, which is the database I care

19
00:01:37.439 --> 00:01:42.959
a lot about. So over the
years, you know, I eventually started

20
00:01:42.959 --> 00:01:47.560
out in two thousand six, two thousand seven,
putting servers in racks, back when

21
00:01:47.560 --> 00:01:49.920
we still had racks. You know, we still do have racks. It's

22
00:01:49.959 --> 00:01:53.480
just you don't think about them anymore;
it's a logical concept more than a physical thing you think

23
00:01:53.519 --> 00:01:57.359
about. And so, you know, pretty much,
you know, early, like right after that,

24
00:01:57.400 --> 00:02:00.599
you know, kind of the first
job I had, I you know,

25
00:02:00.640 --> 00:02:02.000
kind of started a startup with some
friends of mine, and Postgres was the

26
00:02:02.079 --> 00:02:06.640
database of choice, right, and
so you know back then, you know,

27
00:02:06.680 --> 00:02:09.080
we kind of just had everything stored
in Postgres, and as, you know,

28
00:02:09.280 --> 00:02:13.080
my personal career went on, I
kept, you know, seeing Postgres

29
00:02:13.080 --> 00:02:15.960
over the years, right? And
Postgres is an old database. Like, it's

30
00:02:15.960 --> 00:02:17.360
not as old as I am,
but it's close to it. Right,

31
00:02:17.520 --> 00:02:23.520
So, you know, like, they
turned, I think, twenty-seven years old last year,

32
00:02:23.919 --> 00:02:30.240
post was it not life? So
uh, the really the story here,

33
00:02:30.280 --> 00:02:34.800
right is that I think what I've
seen over the years in my career,

34
00:02:34.879 --> 00:02:38.080
right, is like, so like, Product
Hunt, for example, used Postgres as its database,

35
00:02:38.159 --> 00:02:42.800
right? Like Product Hunt, the community. The
database was, you know, reasonable, not

36
00:02:42.840 --> 00:02:45.800
the biggest database, but it just
worked right like you just had the system

37
00:02:45.840 --> 00:02:50.719
of record and that was Postgres.
And then after that I joined a company

38
00:02:50.719 --> 00:02:53.120
called Citus Data. And Citus
Data is essentially a way to scale out

39
00:02:53.159 --> 00:02:57.280
Postgres, and that ended up being
acquired by Microsoft. So, you know,

40
00:02:57.319 --> 00:03:00.439
I had some fun in the corporate
world for a little bit. And then

41
00:03:00.680 --> 00:03:04.680
you know, really what this led
me to is where I'm at currently,

42
00:03:05.400 --> 00:03:07.800
which is a company called pganalyze. And so pganalyze is a small

43
00:03:07.800 --> 00:03:14.080
startup, bootstrapped very intentionally, so
no VC funding. And really the idea

44
00:03:14.159 --> 00:03:16.479
is that we want to make Postgres
work better, right? Postgres performance in

45
00:03:16.479 --> 00:03:23.039
particular work better for the application engineers, the DBAs, the data platform engineers, help

46
00:03:23.120 --> 00:03:27.560
you optimize slow queries better. I've
actually run this as a side project before,

47
00:03:27.800 --> 00:03:30.199
you know, kind of I went
full time. So I actually started

48
00:03:30.360 --> 00:03:34.919
technically like almost eleven years ago.
Wow, but you know, only I

49
00:03:34.919 --> 00:03:37.800
would say the last four or five
years has been really a serious project.

50
00:03:37.840 --> 00:03:39.560
Right before then, it was more
of a this was, you know,

51
00:03:39.879 --> 00:03:43.159
solving a problem. Some people are
paying for it, but really, you

52
00:03:43.159 --> 00:03:45.479
know, as a business. It's
really been you know, the last couple

53
00:03:45.520 --> 00:03:49.439
of years where it has taken off
and we've really seen great success. Right on.

54
00:03:49.719 --> 00:03:52.680
So it sounds like it's one of

55
00:03:52.719 --> 00:03:58.759
those stories where you had a specific
problem personally and you built this tool to

56
00:03:58.840 --> 00:04:01.919
solve it and then realized, hey, other people kind of have the same

57
00:04:01.960 --> 00:04:05.199
problem. That's right. Yeah.
And I actually, so I had a friend

58
00:04:05.240 --> 00:04:09.319
who he was involved early on with
the project but is no longer. But

59
00:04:10.039 --> 00:04:13.840
I had a friend who was essentially
my go-to Postgres person. Right,

60
00:04:13.840 --> 00:04:15.919
So if you asked me ten years
ago, I probably wouldn't have been

61
00:04:15.919 --> 00:04:18.439
that smart about Postgres as I am these days. Definitely not the smartest. There's definitely

62
00:04:18.439 --> 00:04:23.560
smarter people out there. But I've
learned a thing or two. But you know,

63
00:04:23.600 --> 00:04:26.000
back then, essentially he was a
person I would ask, you know,

64
00:04:26.000 --> 00:04:28.560
how to optimize this query? How
do I, you know, set this thing up so

65
00:04:28.639 --> 00:04:30.959
it works well? How do I
direct the replication? And so really,

66
00:04:31.000 --> 00:04:35.360
part of scratching my own itch
initially was, this is a person that's great,

67
00:04:35.480 --> 00:04:40.000
but there are some tasks that just
don't seem necessary for a person to

68
00:04:40.079 --> 00:04:45.240
do manually, right, for sure. Yeah, and I think that's one

69
00:04:45.279 --> 00:04:47.879
of the things I've encountered as well
as like I have in my own network,

70
00:04:48.279 --> 00:04:51.680
like, oh, this dude is
my go to guy for that.

71
00:04:53.439 --> 00:04:58.160
But you know, it's a personal
relationship, you know, and I'm always

72
00:04:58.680 --> 00:05:02.600
concerned with, you know, making
sure that that relationship goes two ways and

73
00:05:02.600 --> 00:05:09.079
that I'm not just beating these people
up with questions, you know. So

74
00:05:09.279 --> 00:05:15.199
yeah, I see the need there
to build that. So what what were

75
00:05:15.199 --> 00:05:21.000
some of the, like, big things that
you were trying to accomplish with Postgres or

76
00:05:21.000 --> 00:05:25.199
doing with Postgres, or how did you
even know that these were things that you

77
00:05:25.279 --> 00:05:29.519
needed to be worried about? I
mean, the simplest thing is just a

78
00:05:29.560 --> 00:05:33.319
slow query, right, which I
would say basically everybody who runs a Postgres

79
00:05:33.399 --> 00:05:35.839
database will run into a slow query
at some point. And this is not,

80
00:05:36.000 --> 00:05:39.319
you know, just Postgres, right, it's just any database. Really,

81
00:05:40.000 --> 00:05:43.279
Queries are a thing, right,
Like that's your basic unit of work essentially,

82
00:05:43.720 --> 00:05:47.600
And so I think the most fundamental
thing is the database doesn't behave as

83
00:05:47.600 --> 00:05:49.800
I expected it to, right? Like,
it's not as fast as I expected,

84
00:05:49.879 --> 00:05:54.279
it's returning less data than I expected,
like, whatever is happening.

85
00:05:55.199 --> 00:06:00.560
And so really I think that was
the most fundamental thing, is understanding which queries

86
00:06:00.600 --> 00:06:03.360
are slow. Right. Sometimes you
have a visibility problem, because sometimes on the application

87
00:06:03.439 --> 00:06:06.399
side, you're actually calling a function,
like in your ORM; you're not really

88
00:06:06.399 --> 00:06:10.199
thinking of the database as queries,
right? You're not actually running SQL.

89
00:06:11.040 --> 00:06:14.680
Sometimes it's just a visibility problem.
But then once you once you know which

90
00:06:14.759 --> 00:06:16.279
query is slow, there's a question
of you know, why is it slow?

91
00:06:16.360 --> 00:06:18.959
Right? How do I, how do
I go from knowing it's slow, that

92
00:06:18.959 --> 00:06:24.120
it's kind of this magic box of
activity, to, you know, actually the database

93
00:06:24.240 --> 00:06:26.920
is doing this index scan, it's
doing a sequential scan, it's doing this

94
00:06:27.040 --> 00:06:30.279
join, and maybe there's a better way
to join things, right? So there's

95
00:06:30.319 --> 00:06:33.639
there's different choices of how the database
implements what you tell it to do.

96
00:06:34.600 --> 00:06:39.240
And so this is I think really
the most fundamental thing that I've seen myself

97
00:06:39.759 --> 00:06:44.560
need to know about databases is you
know why things are slow and can I

98
00:06:44.600 --> 00:06:46.920
do something to improve them? And
then you know, the more you have

99
00:06:46.959 --> 00:06:49.920
an expert at hand, Really,
what the expert I think can do is

100
00:06:50.720 --> 00:06:55.279
help you understand which action to take
right, and the action might be something

101
00:06:55.360 --> 00:07:00.439
like rewrite the query to kind of get the
database to use a different query plan. There's

102
00:07:00.480 --> 00:07:03.920
like different ways to influence what's going
on. But understanding that you have a

103
00:07:04.160 --> 00:07:06.759
slow query problem in the first place, I think that's really the most fundamental

104
00:07:06.800 --> 00:07:11.040
thing to start with for sure.
Yeah, and I'm glad you brought up

105
00:07:11.040 --> 00:07:15.319
ORMs because they are, like,
I think anymore, that's like the default

106
00:07:15.879 --> 00:07:20.120
interaction for a lot of engineers
with the database, is just through the ORM,

107
00:07:20.959 --> 00:07:26.079
and you don't see the queries that
they create a lot of times.

108
00:07:26.120 --> 00:07:31.920
And I've seen quite a few instances
where when the database performance slows, the

109
00:07:31.959 --> 00:07:38.079
first response is always to increase the
size of the database server. And then

110
00:07:38.839 --> 00:07:41.720
and then you get that bill, you know, you get the AWS

111
00:07:41.839 --> 00:07:46.240
or the GCP bill, And that's
whenever the finance team comes down and says,

112
00:07:46.279 --> 00:07:50.079
hey, maybe we should take a
different approach to this. And so

113
00:07:50.120 --> 00:07:55.360
that's where some you know, that's
whenever you've got to break open the hood

114
00:07:55.399 --> 00:08:00.120
on the ORM and really start
understanding what queries it's running and is there

115
00:08:00.120 --> 00:08:01.839
a way to optimize those?
That's right, yeah. And since you

116
00:08:01.879 --> 00:08:05.199
talk about costs, right, I
think like these days everybody's like optimizing for

117
00:08:05.240 --> 00:08:07.639
costs, right because like things are
expensive, and companies are you know,

118
00:08:07.720 --> 00:08:13.800
laying people off and all the bad
stuff. And databases are so interesting, right, because

119
00:08:13.800 --> 00:08:16.120
they're quite expensive, but they're also
hard to change, right, And so

120
00:08:16.160 --> 00:08:20.000
I think that's that's you know something
else I see. Often it's really like

121
00:08:20.040 --> 00:08:22.439
as you mentioned, right, like
people reach kind of a performance

122
00:08:22.480 --> 00:08:24.680
bottleneck, but then they don't know
how to solve it, and so they

123
00:08:24.800 --> 00:08:30.000
just scale up the database. What's worse is,
usually in the cloud, your databases go

124
00:08:30.120 --> 00:08:33.919
up in, you know, sizes of two,
essentially multiples or power-of-two type things,

125
00:08:33.960 --> 00:08:35.480
right? So like, I have sixteen
gigabytes of RAM, next you've got to

126
00:08:35.480 --> 00:08:39.279
have thirty-two gigabytes of RAM, then sixty-four,
right, and so you just

127
00:08:39.399 --> 00:08:43.159
keep essentially like doubling your costs,
which is horrible, right. And it's

128
00:08:43.159 --> 00:08:46.799
really hard to not do that because
you can't just, like, have multiple database servers;

129
00:08:46.840 --> 00:08:50.799
like that requires architecture, like you
actually have to think about you know,

130
00:08:50.879 --> 00:08:54.399
can I use read replicas and like
if you like split up the read workload

131
00:08:54.960 --> 00:08:56.639
and so stuff like that becomes a
challenging problem. Right. So this is

132
00:08:56.639 --> 00:09:01.879
where if you can figure out a
query performance issue without scaling up the database

133
00:09:01.879 --> 00:09:05.799
server, right, by adding an
index, for example, then

134
00:09:05.879 --> 00:09:09.039
that, you know, can actually save you a
lot of money, which is why people

135
00:09:09.039 --> 00:09:13.399
care about it. Oh for sure. Yeah. And I'm not a DBA

136
00:09:13.679 --> 00:09:16.720
by any stretch of the imagination.
But the power of indexes, whenever I

137
00:09:16.799 --> 00:09:22.360
discovered that early on in my career, was just mind blowing at how

138
00:09:22.480 --> 00:09:26.039
much of a difference that would make
in the database performance. Yeah for sure.

139
00:09:26.120 --> 00:09:28.600
Yeah, and it's I mean,
the good news is, I think

140
00:09:28.919 --> 00:09:33.320
people have gotten better over the years
of understanding the basics of indexes. I

141
00:09:33.360 --> 00:09:37.039
think where it's hard, like what's
hard to understand with indexes sometimes, is that

142
00:09:37.320 --> 00:09:39.679
there's multiple different index types. Right, so Postgres, like the default is

143
00:09:39.679 --> 00:09:43.200
a B-tree, which most people
roughly know conceptually what a B-tree is.

144
00:09:43.919 --> 00:09:48.000
But Postgres has all these different index
types like GiST, GIN, hash, BRIN,

145
00:09:48.600 --> 00:09:52.919
all these like specialized things. If you're
talking AI and ML, there's,

146
00:09:54.000 --> 00:09:58.879
you know, kind of, pgvector has
like IVFFlat and HNSW index types, and

147
00:09:58.960 --> 00:10:03.840
so really the challenge there is understanding
when to use these specialized types, right?

148
00:10:03.679 --> 00:10:07.960
But like that's essentially what's even
worse than the regular indexing problem.

149
00:10:07.960 --> 00:10:09.960
It's not just, you know, oh,
I forgot to add an index; it's

150
00:10:09.159 --> 00:10:16.200
which index type to add. Right.
Yeah, just information overload there. Yeah.

151
00:10:16.519 --> 00:10:26.039
So everyone's talking about AI these days, and I think one of the

152
00:10:26.200 --> 00:10:31.679
one of the stories that we're trying
to sell really hard is that AI is

153
00:10:31.759 --> 00:10:35.879
not the tool to replace your job. AI is a tool that helps you

154
00:10:35.960 --> 00:10:41.080
be more proficient at your job because
as an engineer today, like you've got

155
00:10:41.080 --> 00:10:46.960
to be versed in so many different
things from infrastructure to databases, to the

156
00:10:48.200 --> 00:10:52.759
actual application code that you're writing,
to the business knowledge. So what kind

157
00:10:52.840 --> 00:10:56.039
of what kind of assistance can we
get from AI in helping understand things like

158
00:10:56.080 --> 00:11:01.919
that, like where we need an
index and what type of index? Yeah.

159
00:11:03.120 --> 00:11:05.080
And I think it's it's a complex
topic, right, So I think

160
00:11:05.120 --> 00:11:09.759
the answer is I think there are
things we should be we should be thinking

161
00:11:09.759 --> 00:11:13.480
about: is it important that a
human is doing an activity, or can we be

162
00:11:13.639 --> 00:11:16.440
machine assisted at the very least?
Right, I don't necessarily know if we

163
00:11:16.480 --> 00:11:18.519
want to be machine driven, right, I don't know if we want to

164
00:11:18.559 --> 00:11:22.200
have the AI kind of take control
of our database tuning, for example.

165
00:11:22.480 --> 00:11:24.919
I think you know very much the
same way that if you think of the

166
00:11:24.919 --> 00:11:26.159
depth hoops world, right, do
we just want to have you know,

167
00:11:26.200 --> 00:11:31.759
AI orchestrate our servers automatically and just
provision things? Probably not? Right,

168
00:11:31.840 --> 00:11:33.320
Like things like Terraform and stuff
are a good thing. Like, you want

169
00:11:33.320 --> 00:11:37.639
to have that level of control. And
so I think with, I mean, there's a

170
00:11:37.679 --> 00:11:41.360
couple of things going on in that field, right? I think at the, like,

171
00:11:41.480 --> 00:11:45.000
just looking backward the last couple of
years, looking outside of what we ourselves

172
00:11:45.000 --> 00:11:46.919
have done, and maybe we just
start with what other people have done.

173
00:11:46.720 --> 00:11:50.679
So just in terms of research,
there used to be a project called

174
00:11:50.759 --> 00:11:56.399
OtterTune. They're a company now. They
essentially, I'm hazy on the details personally, but they

175
00:11:56.440 --> 00:12:01.960
essentially do, like, Bayesian, statistics-based
optimization, right? So they do not

176
00:12:03.240 --> 00:12:05.960
use generative AI, right? Like,
it's not like you're asking ChatGPT how

177
00:12:05.960 --> 00:12:09.000
to optimize my database. What they're essentially
doing is saying, for this particular parameter

178
00:12:09.080 --> 00:12:13.159
value, can we run you know, a model that essentially comes up with

179
00:12:13.200 --> 00:12:16.360
the best possible parameter? Right.
And it's pretty, I feel like it's

180
00:12:16.360 --> 00:12:20.679
a cool idea, it's a cool system.
There's another startup like that,

181
00:12:20.759 --> 00:12:24.120
DBtune, which also tries to do
something very similar, more recent, but a

182
00:12:24.200 --> 00:12:28.399
similar idea, right, which is
how can we get the best parameter values?

183
00:12:28.480 --> 00:12:31.279
Essentially, so I think that's great. You should take a look at

184
00:12:31.279 --> 00:12:35.480
these if you're interested in tuning parameters. What we have done at pganalyze

185
00:12:35.519 --> 00:12:41.440
is we've focused on essentially the
things that, you know, are a little

186
00:12:41.440 --> 00:12:43.559
bit more fuzzy in terms of who
owns them, right? So you

187
00:12:43.919 --> 00:12:46.679
just mentioned indexes earlier, and so
what we've actually come up with is a system

188
00:12:46.720 --> 00:12:52.720
called pganalyze Index Advisor, which
is essentially a recommendation system for which indexes

189
00:12:52.720 --> 00:12:56.480
to create, right, and so
different than parameters, right, the parameter

190
00:12:56.519 --> 00:12:58.159
is sometimes a choice, like you
kind of want to fine-tune it,

191
00:12:58.200 --> 00:13:01.240
but you don't necessarily need to do
that all the time, or your application

192
00:13:01.279 --> 00:13:05.639
engineers don't need to know how you
tune your shared_buffers parameter in Postgres.

193
00:13:05.679 --> 00:13:09.919
Like that stuff isn't that important essentially
to the application engineer. But indexes are

194
00:13:09.960 --> 00:13:13.519
interesting because they they have like this
fuzzy ownership, right, Like is it

195
00:13:13.559 --> 00:13:16.919
the application engineer owning them? Is
it the kind of DBA or the platform

196
00:13:16.919 --> 00:13:20.679
engineer owning them? And so we've
essentially built this system that, you know,

197
00:13:20.440 --> 00:13:22.639
imagine it's kind of like your safety
harness, right, Like, it doesn't

198
00:13:22.639 --> 00:13:26.840
necessarily, you know, you've got to drive
your decisions, but if you forget to

199
00:13:26.840 --> 00:13:28.840
add an index, it will tell
you. And so is it, you know,

200
00:13:28.919 --> 00:13:31.200
what I'd call AI? I
don't know, right? AI is a

201
00:13:31.200 --> 00:13:33.960
complex term. It's not gen AI, right? Like, it's not generative

202
00:13:35.000 --> 00:13:37.679
AI, the same way as OtterTune
or DBtune are not generative AI. But

203
00:13:37.759 --> 00:13:41.360
it is a way to essentially have
a recommendation, uh, kind of of

204
00:13:41.519 --> 00:13:46.279
which things to create based on your
query workload, right? And

205
00:13:46.320 --> 00:13:48.919
so in a similar spirit, what we've also
done is what we call Vacuum Advisor.

206
00:13:50.159 --> 00:13:52.600
And so vacuum in Postgres is a very
particular concept. You may not be familiar

207
00:13:52.639 --> 00:13:58.240
with it, but essentially it's
the dead row cleanup in Postgres,

208
00:13:58.279 --> 00:14:01.600
right? Like when you do an
update or delete, Postgres, you know,

209
00:14:01.639 --> 00:14:07.080
will create essentially a record that kind
of just marks the deleted data essentially,

210
00:14:07.559 --> 00:14:09.399
and then vacuum comes and cleans that
up. And so sometimes you need to

211
00:14:09.399 --> 00:14:13.120
fine-tune the schedule of that,
right, you need to understand how often

212
00:14:13.159 --> 00:14:15.519
is it running? Is it running too
often or not often enough? Is it,

213
00:14:15.559 --> 00:14:18.679
you know, kind of like running
at the wrong time of day, right?

214
00:14:18.720 --> 00:14:20.679
Like, I have my business hours and
suddenly the database is busy with a vacuum.

215
00:14:20.960 --> 00:14:24.639
And so we've essentially done something where
we looked at all the time series

216
00:14:24.720 --> 00:14:28.000
data that we have in our system
and we said, you know, can

217
00:14:28.039 --> 00:14:31.399
we make you recommendations for which config
settings to change. This kind of goes

218
00:14:31.440 --> 00:14:33.799
a little bit into that parameter tuning
area. What we then do is, we

219
00:14:33.840 --> 00:14:37.440
do this on a per-table basis, and so this is then again where

220
00:14:37.759 --> 00:14:41.039
if you as a human did that, it just becomes very complicated, right

221
00:14:41.039 --> 00:14:43.919
because you've got to think of like
imagine our own database. For example,

222
00:14:43.919 --> 00:14:46.759
we have a thousand tables. If
you have a thousand tables, looking at

223
00:14:46.799 --> 00:14:48.960
each and every one of them and
then looking at you know, a graph

224
00:14:50.039 --> 00:14:52.679
over time that describes to you how,
you know, autovacuum works. It's

225
00:14:52.720 --> 00:14:56.879
just very tedious. And so being
able to have that automatically kind of looked

226
00:14:56.879 --> 00:15:01.159
at and analyzed is really what I've
found quite useful over the years. And

227
00:15:01.480 --> 00:15:03.919
what we've done. Oh, for sure, that's, to me, that's like pure

228
00:15:05.000 --> 00:15:09.759
gold right there, because you know, like I know that you have to

229
00:15:09.159 --> 00:15:13.120
vacuum Postgres databases, and when I
first encountered that, I was like,

230
00:15:15.399 --> 00:15:18.919
what the hell I have to vacuum
this thing? Like can I hire a

231
00:15:20.000 --> 00:15:22.679
housekeeper? Or how does this work? And then you want to do it?

232
00:15:22.919 --> 00:15:26.759
Yeah? Yeah, and then so
then, yeah, that just led

233
00:15:26.799 --> 00:15:28.320
to more questions for me, like, well, should I trust this built

234
00:15:28.320 --> 00:15:31.360
in housekeeper? When's it going to
do it? How do I know if

235
00:15:31.360 --> 00:15:35.320
it's doing it at the right time? And so to have a tool that

236
00:15:35.519 --> 00:15:39.559
looks at what's actually going on in
my database and then makes recommendations based

237
00:15:39.600 --> 00:15:46.159
on that activity, I think it
is just like an amazing resource to have

238
00:15:46.440 --> 00:15:50.960
because it answers so many questions for
me that I didn't even know were

239
00:15:52.000 --> 00:15:56.879
supposed to be questions, right exactly, Yeah, And I think especially if

240
00:15:56.919 --> 00:15:58.840
you, if you, like, are not
a Postgres expert, right? Like,

241
00:15:58.840 --> 00:16:03.559
this is what I've seen over the years,
is like, people migrate from Oracle to Postgres

242
00:16:03.600 --> 00:16:06.679
or from SQL Server to Postgres,
and these databases, they do this differently,

243
00:16:06.759 --> 00:16:08.639
right, so they don't have vacuum
as a concept, And so people when

244
00:16:08.639 --> 00:16:11.519
they first go to Postgres and then
they have this big production system, suddenly

245
00:16:11.559 --> 00:16:15.200
these vacuum problems start happening, and
they're like, what's going on? Like

246
00:16:15.480 --> 00:16:17.720
what should I do? Right,
Like I don't have anybody in house who

247
00:16:17.720 --> 00:16:22.679
knows about this. Yeah, yeah, and you can always post on stack

248
00:16:22.720 --> 00:16:26.440
overflow and hope for the best.
There are some good people on stack

249
00:16:26.480 --> 00:16:30.039
overflow. Let me tell you, like,
it's really, you know, I've learned

250
00:16:30.039 --> 00:16:34.360
a thing or two or ten from
Stack Overflow posts for sure. Yeah,

251
00:16:34.480 --> 00:16:41.799
it's it's funny because like there's the
learning curve of stack overflow, like

252
00:16:41.840 --> 00:16:47.519
it's it's this tremendous resource, but
you have to learn how to interact with

253
00:16:47.639 --> 00:16:51.720
you know, how to here's the
minimum criteria I have to have to open

254
00:16:52.000 --> 00:16:55.480
the question, you know, and
here's what I've got to include if I'm

255
00:16:55.480 --> 00:17:00.000
going to get a legitimately valuable response. Right. And I think what's interesting

256
00:17:00.440 --> 00:17:03.960
on the topic of Stack Overflow, that
actually makes me think, in terms of

257
00:17:04.039 --> 00:17:07.920
AI, that makes me think of
gen AI, right, Like the whole

258
00:17:07.599 --> 00:17:11.000
you know, kind of give me
a response that resembles a Stack Overflow answer is, I

259
00:17:11.000 --> 00:17:15.240
think, what generative AI and LLMs
are good at. And so somebody in the

260
00:17:15.240 --> 00:17:21.480
Postgres community, Nikolay from Postgres.ai, he has a recent project where he's essentially

261
00:17:21.480 --> 00:17:26.480
connected ChatGPT with kind of
a knowledge base of different Postgres articles, and

262
00:17:26.519 --> 00:17:29.359
it's actually a really interesting experiment.
But he's essentially saying, what if we

263
00:17:29.400 --> 00:17:32.160
had a chatbot, you know,
that was powered by ChatGPT, or

264
00:17:32.279 --> 00:17:36.200
GPT-4, I think, and then
you would essentially ask a question, just like,

265
00:17:36.599 --> 00:17:38.200
you know, how do I set
this parameter in Postgres, and then it will

266
00:17:38.240 --> 00:17:41.559
just spit out something that essentially looks
like a Stack Overflow answer, which I think,

267
00:17:42.039 --> 00:17:45.160
like, that has its place as
well, right? Because people, I think

268
00:17:45.200 --> 00:17:49.240
what people do alternatively to asking a
chatbot is that they'll go on Google

269
00:17:49.240 --> 00:17:52.720
and they'll search how to tune this
parameter. And then, unfortunately, with

270
00:17:52.839 --> 00:17:56.039
you know, the advent of AI, what has happened is a lot

271
00:17:56.079 --> 00:18:00.799
more articles are just not that useful
because it's become pretty cheap to

272
00:18:00.839 --> 00:18:03.960
generate, you know, kind of
that content, in a sense. And so what he's

273
00:18:03.960 --> 00:18:04.920
done is really, you know,
kind of I think, built a better

274
00:18:04.920 --> 00:18:08.200
solution to, you know, how do
I get good Postgres knowledge, by

275
00:18:08.279 --> 00:18:12.480
essentially building off of GPT-4 and
having kind of this more trusted, I

276
00:18:12.480 --> 00:18:18.240
think data like database of good articles, which I found interesting. Yeah.

277
00:18:18.279 --> 00:18:23.440
I don't think it's going to be
truly successful though until chat GPT actually sends

278
00:18:23.440 --> 00:18:29.200
out insulting answers at random just to
insult your intelligence, and tell you to come

279
00:18:29.200 --> 00:18:33.680
back when you actually know how to
ask an intelligent question. Yeah. Yeah,

280
00:18:33.680 --> 00:18:34.960
And it has some weird issues, like
earlier this week, I just, I'm

281
00:18:34.960 --> 00:18:37.880
not an active ChatGPT user
personally, but, you know, there was

282
00:18:37.920 --> 00:18:42.640
like this issue where it just, like,
started spewing out random garbage. It's certainly,

283
00:18:42.920 --> 00:18:45.880
you know sometimes I mean it's it's
interesting in a sense, right, Like

284
00:18:45.920 --> 00:18:48.720
it feels like I'm reading a science
fiction novel and I'm not quite sure

285
00:18:48.720 --> 00:19:02.519
where it's going. Right. Cool, So let's go back to to someone

286
00:19:02.599 --> 00:19:08.599
who's working with Postgres, you
know, has multiple other roles, and

287
00:19:08.640 --> 00:19:12.680
we talked about, you know,
the need for indexes and query optimization.

288
00:19:14.400 --> 00:19:21.480
But that's like pretty pretty specific,
like you're you're narrowing in on the actual

289
00:19:21.559 --> 00:19:26.759
problem there. What are the indicators
before that that tell them that this is

290
00:19:26.759 --> 00:19:29.880
the area that you want to focus
on? Like, you're running Postgres. What

291
00:19:29.960 --> 00:19:33.279
kind of things do you see that
tell you, hey, this is a

292
00:19:34.319 --> 00:19:37.960
something that Postgres might help with,
versus this is something that looks like it's

293
00:19:38.000 --> 00:19:44.559
application code or infrastructure related. Yeah. No, it's a good question.

294
00:19:44.680 --> 00:19:45.799
I think it's it's actually not an
easy problem to solve, right, because

295
00:19:45.839 --> 00:19:49.759
you you're kind of looking at two
different worlds. Right. So I think

296
00:19:49.799 --> 00:19:53.559
that what people most commonly would have
is they would have like an APM tool

297
00:19:53.559 --> 00:20:00.599
of sorts or tracing tool, right? They might
be using New Relic, Datadog, like a long

298
00:20:00.680 --> 00:20:04.039
list of APM tools, right? Yeah. And so what you would see in these

299
00:20:04.079 --> 00:20:07.599
tools is you would see a trace,
for example, right, what you

300
00:20:07.599 --> 00:20:11.200
would call a trace, and you
would say, here's a slow request,

301
00:20:11.200 --> 00:20:14.759
and you would see different spans essentially
in the trace that would say you know

302
00:20:14.759 --> 00:20:18.400
which part of the request is slow.
And so usually with most of these instrumentations,

303
00:20:18.480 --> 00:20:22.920
from the application side, you
will know that the query is the

304
00:20:22.960 --> 00:20:25.079
slow part. Right, So you
would I would expect, let's say,

305
00:20:25.279 --> 00:20:27.240
with a request that takes ten seconds, we would see that you know,

306
00:20:27.279 --> 00:20:30.000
out of these ten seconds, nine
seconds are spent in the database. Great,

307
00:20:30.039 --> 00:20:33.440
right, So this is this is
actually just something that you will get

308
00:20:33.480 --> 00:20:37.759
most of the time. Now, the
hard part is understanding: is this something I

309
00:20:37.759 --> 00:20:40.720
could do something about, right,
because it essentially will say, here's a

310
00:20:40.720 --> 00:20:42.519
SQL query and here is, you know, this nine-second span, but it

311
00:20:42.519 --> 00:20:48.319
doesn't tell you anything more about it. And so what I've found quite interesting

312
00:20:48.759 --> 00:20:52.799
we've uh, we've kind of launched
this feature also last year, is we

313
00:20:52.160 --> 00:20:59.400
essentially use the fact that OpenTelemetry
allows you to connect, essentially, different services.

314
00:20:59.440 --> 00:21:02.680
Right. So the idea in a
microservice architecture is you can have different

315
00:21:02.960 --> 00:21:07.799
systems send into the same trace, essentially. And so imagine that if your database

316
00:21:07.839 --> 00:21:11.119
could tell you, hey, you
know these nine seconds of time, Actually

317
00:21:11.160 --> 00:21:15.119
I spent five seconds scanning this index, four seconds joining this data, right,

318
00:21:15.319 --> 00:21:18.480
Suddenly your trace becomes much more usable, right because you're not looking at

319
00:21:18.519 --> 00:21:22.079
this you know, nine second time
span. You're actually looking at individual operations

320
00:21:22.079 --> 00:21:25.480
that you maybe understand better, right
because you're doing a big query on this

321
00:21:25.519 --> 00:21:30.240
table. And so what we've actually
done essentially is piggyback on this idea of

322
00:21:30.279 --> 00:21:33.440
different services sending into the you know, same system, the same observability system.

323
00:21:33.759 --> 00:21:38.000
And so we've built an integration where
we pull the kind of a slow

324
00:21:38.079 --> 00:21:42.519
query log or auto_explain log in
Postgres, which essentially says, here is

325
00:21:42.559 --> 00:21:45.319
you know, a slow query execution,
and here's the plan for that execution,

326
00:21:45.640 --> 00:21:48.680
and so we pull that as part
of our kind of agent, and then

327
00:21:48.720 --> 00:21:53.839
we send essentially a reference to
that information into the tracing system. Right.
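
NOTE
A minimal sketch of the breakdown described above, using the numbers from the
conversation: a ten-second request, nine seconds of it in the database, split
into a five-second index scan and a four-second join. The Span class and the
span names are hypothetical and only illustrate the idea of nested spans; they
are not the data model of any particular tracing product.
```python
from dataclasses import dataclass, field
@dataclass
class Span:
    """One timed operation in a trace; children are nested operations."""
    name: str
    seconds: float
    children: list = field(default_factory=list)
def flatten(span, depth=0):
    """Yield (depth, name, seconds) for a span and all of its descendants."""
    yield depth, span.name, span.seconds
    for child in span.children:
        yield from flatten(child, depth + 1)
def share_of(parent, child):
    """Fraction of the parent's time that is spent inside the child span."""
    return child.seconds / parent.seconds
# Without database-side spans the trace stops at "SQL query: 9.0s".
# With them, the same nine seconds are split into plan-level operations.
db = Span("SQL query", 9.0, [Span("Index Scan", 5.0), Span("Join", 4.0)])
request = Span("HTTP request", 10.0, [db])
for depth, name, seconds in flatten(request):
    print("  " * depth + f"{name}: {seconds:.1f}s")
print(f"database share of request: {share_of(request, db):.0%}")
```
The point of the nesting is that the two child spans account for the whole
nine-second database span, so the slow part can be read off the trace directly.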

328
00:21:55.319 --> 00:21:56.920
And so what this solves is what
you would do otherwise. Right.

329
00:21:56.960 --> 00:22:00.279
So let's imagine if you don't have
this type of solution, is you look

330
00:22:00.279 --> 00:22:03.200
at your trace and you have that
nine second span, and then you have

331
00:22:03.279 --> 00:22:07.359
to correlate that with the database activity. Right, So you would essentially either

332
00:22:07.359 --> 00:22:11.359
look at your database logs or like
in Postgres, tools like pg_stat_statements, which

333
00:22:11.400 --> 00:22:14.519
tracks the query performance over time.
And so what you would do is you

334
00:22:14.640 --> 00:22:17.400
essentially say, well, I know
roughly the query shape, right, formatting

335
00:22:17.480 --> 00:22:19.359
sometimes differs slightly, but maybe you
know, I'll like copy and paste a

336
00:22:19.359 --> 00:22:22.000
part of the query. I'll go
over to the database and I'll kind of

337
00:22:22.240 --> 00:22:26.039
put it in and I'll try to
find the thing that's matching that looks like

338
00:22:26.079 --> 00:22:29.599
the right query. But it's really
hard to do that precisely. And so

339
00:22:29.680 --> 00:22:33.720
this is I think where OpenTelemetry
really has an interesting way of solving this,

340
00:22:33.880 --> 00:22:37.599
right, because, just to expand on that briefly, what it essentially does is, in

341
00:22:37.680 --> 00:22:41.200
tracing, you have these IDs, right,
that you're trying to propagate across systems.

342
00:22:41.759 --> 00:22:45.599
And so the idea is when you
run a SQL query, you add a

343
00:22:45.640 --> 00:22:48.799
comment, and it's a comment that says
what's the trace ID and the parent span

344
00:22:48.839 --> 00:22:52.240
ID of that query is. And
so when the query arrives on the database

345
00:22:52.279 --> 00:22:56.359
side, the database can output a
log for example, that says, hey,

346
00:22:56.400 --> 00:22:57.759
you know, this query was slow. Here's the query plan. And

347
00:22:57.799 --> 00:23:02.200
also, by the way, here's
the little comment at the start that tells

348
00:23:02.240 --> 00:23:04.960
you that tells the tracing system essentially
how to kind of stitch that back together

349
00:23:06.039 --> 00:23:11.319
into one unified view of the world, right, Yeah, And that's the

350
00:23:11.480 --> 00:23:15.880
OpenTelemetry has just made some huge
strides in tying all of that together.
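
NOTE
A minimal sketch of the query-tagging idea described above, assuming the W3C
Trace Context "traceparent" layout (version, 32 hex digit trace ID, 16 hex
digit parent span ID, flags) and a sqlcommenter-style key='value' comment. The
function names and the IDs are hypothetical. This sketch appends the comment at
the end of the statement; a comment at the start would be matched the same way
by the regular expression on the log side.
```python
import re
from urllib.parse import quote
def traceparent(trace_id, span_id, sampled=True):
    """Format a W3C Trace Context traceparent value (version 00)."""
    flags = "01" if sampled else "00"
    return f"00-{trace_id}-{span_id}-{flags}"
def tag_query(sql, trace_id, span_id):
    """Append a comment carrying the trace context to a SQL statement."""
    value = quote(traceparent(trace_id, span_id), safe="")
    return f"{sql.rstrip().rstrip(';')} /*traceparent='{value}'*/"
TRACEPARENT_RE = re.compile(r"traceparent='00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}'")
def extract_ids(logged_sql):
    """Recover (trace_id, parent_span_id) from query text found in a database log."""
    match = TRACEPARENT_RE.search(logged_sql)
    return match.groups() if match else None
# Application side: tag the statement before sending it to the database.
tagged = tag_query(
    "SELECT * FROM orders WHERE user_id = 42;",
    "0af7651916cd43dd8448eb211c80319c",
    "b7ad6b7169203331",
)
print(tagged)
# Database side: whatever reads the slow query log can pull the IDs back out.
print(extract_ids(tagged))
```
Because the IDs travel inside the query text itself, the log reader does not
have to guess which logged statement belongs to which span.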

351
00:23:17.000 --> 00:23:22.640
We had Andy Grabner from Dynatrace
on the show yesterday and we ended up

352
00:23:22.680 --> 00:23:30.359
talking about OpenTelemetry there as well, and just the like the whole,

353
00:23:33.359 --> 00:23:37.920
you know, the whole community effort
from all of these different players to agree

354
00:23:37.079 --> 00:23:41.880
like, hey, we can all
use this standard and create this system that

355
00:23:41.960 --> 00:23:45.440
allows you to tie those kinds of
things together. It's pretty cool and something

356
00:23:45.440 --> 00:23:49.160
that I don't think we could
have seen in our industry up until the

357
00:23:49.240 --> 00:23:53.359
last few years. Yeah, for
sure. And I think OpenTelemetry is

358
00:23:53.400 --> 00:23:57.559
interesting because it has that like cross
company collaboration. Right, so, like,

359
00:23:57.599 --> 00:24:02.880
Dynatrace is involved in the standard,
there's Microsoft, there are, like, big players involved.

360
00:24:02.920 --> 00:24:06.920
Like it just seems like I'm like, I'm positively surprised essentially that they

361
00:24:06.960 --> 00:24:08.000
were all able to come to the
table, and like some of them are

362
00:24:08.039 --> 00:24:11.960
vendors, some of them are you
know, big customers of observability data,

363
00:24:11.200 --> 00:24:14.519
and even if some of them are
selling their own solutions, right, they

364
00:24:14.519 --> 00:24:17.359
are essentially still at the table discussing, hey, how can we make the standard

365
00:24:17.359 --> 00:24:18.960
work? Yeah, which as an
end user is actually great, right because

366
00:24:19.000 --> 00:24:23.000
you kind of go out of this
vendor specific instrumentation in your code and you're

367
00:24:23.039 --> 00:24:27.000
moving more towards the standardized instrumentation and
you can choose different vendors. So I

368
00:24:27.039 --> 00:24:30.359
think that's that's how things should be. Yeah. Yeah, And then the

369
00:24:32.559 --> 00:24:36.279
skeptical side of me sees that and
it's like, hmm, okay, what's

370
00:24:36.319 --> 00:24:41.319
the catch. I mean, I
think the catch is, it's sometimes the

371
00:24:41.319 --> 00:24:44.240
story is a little bit oversold,
right, So I've spent a lot of

372
00:24:44.319 --> 00:24:47.359
time in some of, you know,
some of the maybe more niche use cases,

373
00:24:47.440 --> 00:24:49.240
right, and the whole thing we
just like I've just talked about in

374
00:24:49.359 --> 00:24:52.720
terms of commenting and such like.
For example, there's a project called

375
00:24:52.720 --> 00:24:56.440
sqlcommenter, which is supposed to add these
query tags at the beginning. That's

376
00:24:56.440 --> 00:25:02.200
a project that Google actually donated to
the OpenTelemetry project. And it's great

377
00:25:02.240 --> 00:25:04.119
they've donated it, but also it's
kind of stalled since then, so it's

378
00:25:04.119 --> 00:25:07.519
not like they've actually pushed forward and
really said, hey, you know,

379
00:25:07.559 --> 00:25:08.640
how do we make this, you
know, a proper standard, how do

380
00:25:08.640 --> 00:25:11.960
we document this as part of the
official project? And so sometimes I think

381
00:25:12.440 --> 00:25:18.359
the risk essentially of these standardization projects
is that sometimes you know, there is

382
00:25:18.400 --> 00:25:19.960
like this you know, Alpha spec
to do something, or like this thing

383
00:25:21.000 --> 00:25:23.359
that was contributed by one of the
players, but there's not enough momentum around

384
00:25:23.599 --> 00:25:27.119
actually developing the standard further. And
so I think, for example, if

385
00:25:27.119 --> 00:25:30.200
I was doing tracing right, definitely
do the OpenTelemetry tracing stuff. But

386
00:25:30.240 --> 00:25:33.880
if you're doing logs even right,
like logs are like they're pretty stable in

387
00:25:33.920 --> 00:25:37.839
OpenTelemetry, as I understand it, but they're still less, you know, kind of commonly,

388
00:25:37.960 --> 00:25:42.759
kind of fully supported across all different
languages and whatnot. Right, Yeah,

389
00:25:44.000 --> 00:25:52.720
so you've got a ton of experience
with Postgres. There's quite a

390
00:25:52.759 --> 00:25:56.480
few database choices available. I mean
really in my mind, there's like there's

391
00:25:56.720 --> 00:26:04.000
Postgres, MySQL, and Mongo. I consider those to be like the

392
00:26:04.000 --> 00:26:08.200
big players. And there's different variants
of that, you know, and you

393
00:26:08.240 --> 00:26:11.599
know when you get off into the
NoSQL stuff, you know, there's

394
00:26:11.599 --> 00:26:15.079
different use cases there. But I
really see it as being like those three

395
00:26:15.400 --> 00:26:21.440
as being the most prevalent. What
are the things? And I've used Postgres

396
00:26:21.480 --> 00:26:25.400
a ton because it just seems to
fit any use case you throw at it.

397
00:26:25.440 --> 00:26:30.599
But what's your
take on why Postgres versus some of

398
00:26:30.640 --> 00:26:34.839
the other database offerings. Yeah,
and I would add, like, just

399
00:26:34.839 --> 00:26:37.160
just to add the fourth to the
list, I would say these days I'd

400
00:26:37.200 --> 00:26:41.039
add ClickHouse to the list as
well, in terms of open source databases.

401
00:26:41.039 --> 00:26:45.640
It's not relational, or, like, it's
a column store essentially. But I've definitely

402
00:26:45.640 --> 00:26:48.960
seen a lot of companies use Postgres
and ClickHouse for different workloads, different parts

403
00:26:48.960 --> 00:26:56.079
of workloads. But yeah, I
would say you know the it's so so

404
00:26:56.119 --> 00:27:00.000
I think that like there's there's different
reasons why you'd use Postgres, why you'd use

405
00:27:00.039 --> 00:27:03.440
other ones. One of the things,
since we just talked about OpenTelemetry and

406
00:27:03.440 --> 00:27:06.400
different companies coming together, right,
and it's from a community perspective. The

407
00:27:06.480 --> 00:27:08.400
one thing, like, what keeps me in the
Postgres community, right, which is not

408
00:27:08.519 --> 00:27:11.359
exactly answering your question, but I
think it is an interesting aspect of this

409
00:27:11.720 --> 00:27:15.400
is Postgres is not a project by
one company, right? Postgres is

410
00:27:15.400 --> 00:27:18.279
a community project. Like it has
you know, people from AWS working

411
00:27:18.319 --> 00:27:22.160
on it, people from Microsoft working on it,
from EDB working on it, from Google

412
00:27:22.200 --> 00:27:25.519
working on it, people from small
companies working on it. And it is

413
00:27:25.559 --> 00:27:27.720
a true community project, kind of
like the Linux kernel is. And I think

414
00:27:27.759 --> 00:27:30.960
this this is what fascinates me about
it and what makes me, you know,

415
00:27:32.079 --> 00:27:36.200
just the longevity of it is like
so much. Essentially I trust it

416
00:27:36.240 --> 00:27:37.920
to have that longevity even
in you know, ten years, twenty

417
00:27:38.039 --> 00:27:42.400
fifty years from now, just because
it's been able to kind of

418
00:27:42.480 --> 00:27:45.000
advance over the years, but it's
also been able to do that as a

419
00:27:45.000 --> 00:27:48.519
community project, not as a commercial
player trying to you know, kind of

420
00:27:49.359 --> 00:27:52.720
get both benefits essentially, right,
because if you look at MongoDB,

421
00:27:52.839 --> 00:27:55.880
like, MongoDB has, you know,
their Atlas service. And I don't

422
00:27:55.920 --> 00:27:57.920
know how it differs between you know
what MongoDB open source gets you

423
00:27:59.039 --> 00:28:00.839
versus, you know, the Atlas-based
MongoDB. But certainly, you

424
00:28:00.839 --> 00:28:03.440
know, I would imagine they have
that conflict, right They're constantly thinking what

425
00:28:03.480 --> 00:28:07.000
should I put there? What should
it? But there are people just going

426
00:28:07.039 --> 00:28:10.160
to deploy their own, right,
and so just that that is really I

427
00:28:10.160 --> 00:28:11.440
think, you know, why Postgres is a
great building block, right, because it

428
00:28:11.440 --> 00:28:17.519
doesn't have that that fundamental conflict that
these companies usually have. I think when

429
00:28:17.519 --> 00:28:21.400
you're trying to make a decision which
one to use, I think if you're

430
00:28:21.480 --> 00:28:23.559
using MySQL, there's, you know, some reasons to migrate to Postgres,

431
00:28:23.559 --> 00:28:26.000
but they're usually not that strong, right? So I think when

432
00:28:26.039 --> 00:28:30.079
I see people migrating between the databases, it's really more from old school, like,

433
00:28:30.119 --> 00:28:33.920
Oracle or SQL Server to Postgres, right? And for example there Postgres

434
00:28:33.960 --> 00:28:37.039
is a very good target for these
migrations because it has like a lot of

435
00:28:37.039 --> 00:28:41.400
similarities in some sense in query language
and such with Oracle, for example, and

436
00:28:41.400 --> 00:28:45.119
stuff like that. It's
easier to migrate essentially to Postgres than to

437
00:28:45.119 --> 00:28:48.240
MySQL. Also, if you're
using Oracle, why would you go to

438
00:28:48.240 --> 00:28:52.559
MySQL, because that's owned by Oracle again? Kind of doesn't really make

439
00:28:52.599 --> 00:29:00.079
sense. And I think MongoDB,
I don't, you know, I

440
00:29:00.079 --> 00:29:03.759
don't have much experience personally, but
I think it's really

441
00:29:03.799 --> 00:29:07.160
the kind of thing where you make
a choice. Sometimes companies, you know,

442
00:29:07.240 --> 00:29:11.440
run multiples of those, like oftentimes
I see companies just using Postgres.

443
00:29:11.720 --> 00:29:14.319
I think you can scale out all
of these, right? Like, Postgres has Citus.

444
00:29:14.559 --> 00:29:17.720
MySQL has Vitess and
PlanetScale. MongoDB

445
00:29:17.799 --> 00:29:19.559
kind of has its built-in way
of scaling out. So you

446
00:29:19.599 --> 00:29:22.400
don't really run into a
bottleneck these days anymore with not being

447
00:29:22.400 --> 00:29:27.000
able to scale beyond a certain data
point. Yeah, for sure. Yeah.

448
00:29:27.039 --> 00:29:33.880
My introduction to MongoDB was, gosh, it's been I think it's been

449
00:29:33.920 --> 00:29:41.279
over ten years ago now, but
where we used it was with mobile applications,

450
00:29:41.920 --> 00:29:48.279
uh, and user attribution because you
have this application that ties into all

451
00:29:48.359 --> 00:29:52.279
these different services, you know,
like ties into Facebook and ties into Amazon

452
00:29:52.400 --> 00:29:56.519
and different things like that, and
you're trying to attribute your marketing campaigns to

453
00:29:56.640 --> 00:30:03.200
those platforms, but with each one
and you get different data. And so

454
00:30:03.359 --> 00:30:07.799
Mongo for us turned out to be
a really good way to just say,

455
00:30:07.799 --> 00:30:11.759
Okay, here's the attribution data we
got for this user, and we're just

456
00:30:11.799 --> 00:30:15.880
gonna put it in this Mongo field. And then in some cases we know

457
00:30:15.039 --> 00:30:22.440
that there's one key value pair that's
consistent across all of those platforms, and

458
00:30:22.480 --> 00:30:25.559
then we're gonna keep the rest of
the stuff around because we might end up

459
00:30:25.640 --> 00:30:29.079
needing it later too, And so
that was a really strong use case for

460
00:30:29.200 --> 00:30:36.079
Mongo. Now Postgres also, though,
supports JSON data types, so you could

461
00:30:36.160 --> 00:30:38.960
just as easily have done it with
Postgres. I don't know if Postgres actually

462
00:30:40.000 --> 00:30:41.480
had it back when we did this, but I know that it is

463
00:30:41.519 --> 00:30:45.759
there now. Yeah, and
it probably had it. So there's, you know,

464
00:30:45.799 --> 00:30:48.039
I would say, there's three things
worth thinking about and knowing about it,

465
00:30:48.079 --> 00:30:49.880
right, So, Like the simplest
thing is you could always just store

466
00:30:49.960 --> 00:30:53.240
text, like JSON as text in your database, right? Right. But what you're

467
00:30:53.279 --> 00:30:56.599
kind of losing there is like any
sense of validation, right, Like if

468
00:30:56.599 --> 00:30:59.839
you're missing a parenthesis, the
database isn't going to tell you,

469
00:30:59.920 --> 00:31:03.920
right? And so in its simplest
form, Postgres has a JSON data type

470
00:31:03.200 --> 00:31:07.480
and JSON in Postgres is really just
a validation step right where it says,

471
00:31:07.480 --> 00:31:11.160
okay, well, let me confirm that
this is actually correct JSON, and then you

472
00:31:11.200 --> 00:31:14.000
know it can do some operations on
top of it. But really, where

473
00:31:14.160 --> 00:31:18.440
I think Postgres has become more of
a competitor to MongoDB is the

474
00:31:18.519 --> 00:31:22.119
JSONB data type. The B stands for
binary, and the idea behind that is

475
00:31:22.200 --> 00:31:26.279
that it lets you, amongst other
things, index a JSONB kind of

476
00:31:26.319 --> 00:31:29.640
field, similar to how you could
with indexes in Mongo. Right. So,

477
00:31:29.720 --> 00:31:33.599
like the idea is that if you
know, I have this like schema less

478
00:31:33.640 --> 00:31:34.799
data of sorts, like I don't
know exactly the shape of it or what

479
00:31:34.880 --> 00:31:37.559
comes, what gets thrown into it,
but I usually want to query for some

480
00:31:37.680 --> 00:31:40.519
of the keys, right, and
I want to search for things in that

481
00:31:40.680 --> 00:31:45.079
dataset, essentially, then the way
that you can do this nowadays with

482
00:31:45.119 --> 00:31:47.839
Postgres is, you have a JSONB, you have a GIN index, or in

483
00:31:47.920 --> 00:31:52.480
some cases GiST indexes, on the
JSONB column, and you could just

484
00:31:52.559 --> 00:31:55.440
do queries on it. But you
don't have to declare upfront what you're going

485
00:31:55.480 --> 00:31:56.960
to query, right, You don't
have to say, I'm always going to

486
00:31:57.079 --> 00:32:00.839
query for, you know, this campaign ID field or just, you know,

487
00:32:00.960 --> 00:32:04.920
other kind of whatever the fields are. But you could actually have a kind

488
00:32:04.920 --> 00:32:08.079
of more generic index. And so
that's really I think what you know these

489
00:32:08.160 --> 00:32:12.359
days, if you're storing JSON in
Postgres, definitely use JSONB, because there's no reason

490
00:32:12.440 --> 00:32:15.440
not to, or almost no reason
not to. And so that I think,

491
00:32:15.480 --> 00:32:19.119
you know, gives you gets you
ninety five percent of the use cases

492
00:32:19.160 --> 00:32:22.000
that MongoDB usually would
have. Yeah, and you're still in

493
00:32:23.160 --> 00:32:30.319
a database platform that supports relational data
as well, and that's I think to

494
00:32:30.400 --> 00:32:35.720
me, that's huge because every application, every business does have relational data,

495
00:32:36.079 --> 00:32:40.160
and so now you're able to accomplish
that in a single database platform versus juggling

496
00:32:40.319 --> 00:32:45.079
multiple database connections and trying to remember
which one has which data. That's right.
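
NOTE
A minimal sketch of the JSONB approach discussed above, assuming a table named
"attribution" with a "payload" column (both names are hypothetical). The SQL
uses the jsonb type, a GIN index, and the @> containment operator, which the
default GIN operator class for jsonb supports. The %s placeholder follows the
style of common Python Postgres drivers and is an assumption; the code only
builds the statements and does not connect to a database.
```python
import json
# One generic index over the whole document, no per-key declarations up front.
CREATE_TABLE = "CREATE TABLE attribution (user_id bigint, payload jsonb)"
CREATE_INDEX = "CREATE INDEX attribution_payload_gin ON attribution USING gin (payload)"
def containment_query(filters):
    """Build (sql, params) that finds rows whose payload contains all given key/value pairs."""
    sql = "SELECT user_id, payload FROM attribution WHERE payload @> %s::jsonb"
    return sql, (json.dumps(filters, sort_keys=True),)
# Any key can be searched later, even one nobody planned for when the table was created.
sql, params = containment_query({"campaign_id": "spring-sale"})
print(CREATE_TABLE)
print(CREATE_INDEX)
print(sql)
print(params)
```
The same index serves a filter on a different key, which is the "more generic
index" behaviour described in the conversation.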

497
00:32:45.160 --> 00:32:49.559
Yeah, And on that note,
more recently, also so talking about AI,

498
00:32:50.799 --> 00:32:54.839
we, like, in the Postgres community,
somebody called Andrew Kane created

499
00:32:54.880 --> 00:33:00.839
a project called pgvector. Excuse
me, pgvector. And what pgvector essentially does is

500
00:33:00.880 --> 00:33:04.480
it stores vector embeddings inside Postgres so
that you don't have to use a dedicated

501
00:33:04.559 --> 00:33:07.680
vector database instead. It's essentially the
same idea, right, Like you have

502
00:33:07.119 --> 00:33:13.559
essentially a special like column like data
type that has special indexes, and then

503
00:33:13.640 --> 00:33:15.920
you can store you know, like
if you're trying to you know, build

504
00:33:15.920 --> 00:33:20.119
all these AI applications, instead of
now using specialized databases, you can just

505
00:33:20.240 --> 00:33:22.319
use your Postgres. And really the big
benefit that people are seeing these days,

506
00:33:22.359 --> 00:33:25.079
right is that they can then also
keep the rest of the data in Postgres.
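
NOTE
A minimal sketch of the pgvector idea described above, assuming a "documents"
table with a three-dimensional "embedding" column (hypothetical names; real
embeddings are much larger). The SQL uses pgvector's vector type, its "<->"
Euclidean distance operator, and an HNSW index, which is assumed to be
available in the installed pgvector version. The code only builds statements
and parameters in a common driver placeholder style; it does not connect to a
database.
```python
def vector_literal(values):
    """Render a Python sequence in pgvector's text input format, e.g. [1.0,2.0,3.0]."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"
# Embeddings live next to ordinary relational columns in the same table.
SETUP = [
    "CREATE EXTENSION IF NOT EXISTS vector",
    "CREATE TABLE documents (id bigserial PRIMARY KEY, body text, embedding vector(3))",
    "CREATE INDEX ON documents USING hnsw (embedding vector_l2_ops)",
]
def nearest_query(embedding, limit=5):
    """Build (sql, params) for a nearest-neighbour search by Euclidean distance."""
    sql = "SELECT id, body FROM documents ORDER BY embedding <-> %s::vector LIMIT %s"
    return sql, (vector_literal(embedding), limit)
for statement in SETUP:
    print(statement)
print(nearest_query([0.1, 0.2, 0.3], limit=3))
```
Since the result is a normal query, it can be joined or filtered against the
rest of the data, which is the benefit of staying inside one system.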

507
00:33:25.119 --> 00:33:28.960
Right, So you're building your cool, fancy AI startup, you can

508
00:33:29.039 --> 00:33:31.799
now, you know, use Postgres for
everything up to a certain limit. Right.

509
00:33:31.839 --> 00:33:35.640
So there are still benefits to using
a specialized database, but especially when

510
00:33:35.640 --> 00:33:38.440
you're starting out, it's just so
much simpler to stay within one system and

511
00:33:38.519 --> 00:33:43.279
then later on, you know,
you kind of scale down. And I

512
00:33:43.319 --> 00:33:45.000
do want to mention at that point, right, since we talked about

513
00:33:45.039 --> 00:33:50.079
different database technologies. One of the
reasons why Postgres is very good at,

514
00:33:50.160 --> 00:33:53.920
you know, kind of supporting these
newer things like the embeddings is because it

515
00:33:53.960 --> 00:33:58.279
has a very good extension system.
So compared to, for example, MySQL,

516
00:33:58.519 --> 00:34:00.720
it's much easier in Postgres to create
an extension that changes some of the

517
00:34:00.799 --> 00:34:06.240
core functionality in Postgres, that hooks into
different parts. There's more of a community

518
00:34:06.279 --> 00:34:09.440
around that also that kind of releases
these extensions. And so the fact that

519
00:34:09.480 --> 00:34:12.960
you know, somebody like three years
ago Andrew Kaine was like, yes,

520
00:34:13.039 --> 00:34:15.039
I, you know, built this
library called pgvector and I'll just publish

521
00:34:15.079 --> 00:34:17.719
this. You know, it's an
extension of Postgres. And then you know,

522
00:34:17.840 --> 00:34:22.519
all the big cloud companies, like
AWS, GCP, Azure, they're all like,

523
00:34:22.800 --> 00:34:24.039
yes, AI, you know,
it's the best thing, and by

524
00:34:24.079 --> 00:34:27.360
the way, we support AI.
And then you look at it and it's

525
00:34:27.400 --> 00:34:30.639
just pgvector kind of, you know, sitting under there in the database, actually,

526
00:34:30.159 --> 00:34:32.159
like they just went and used the
extension. I mean, it's cool,

527
00:34:32.159 --> 00:34:36.760
they did it right, like they
they made it accessible and usable. But

528
00:34:37.000 --> 00:34:43.280
it's it's fascinating how you know that
extension system allows this adaptability of Postgres essentially,

529
00:34:43.840 --> 00:34:49.000
yeah, for sure. And I
think that you know, that sets

530
00:34:49.079 --> 00:34:53.119
up the stage so that you can
empower your engineers so that they can focus

531
00:34:53.360 --> 00:34:59.679
on application performance, right, Yeah, that's right. And I think also

532
00:35:00.039 --> 00:35:02.880
not learning about a completely new system, right, because they would have to be

533
00:35:04.719 --> 00:35:07.519
installing the driver probably, like, on the
application side, or the ORM probably wouldn't support

534
00:35:07.559 --> 00:35:10.480
it, right, Like there's all
this extra steps that they would have to

535
00:35:10.519 --> 00:35:16.679
take if they were to use something completely different. So yeah, and you know,

536
00:35:16.800 --> 00:35:22.000
going back again to talking about the
number of technologies and fields we have to

537
00:35:22.079 --> 00:35:29.719
have expertise in. Really that's like
the overall objective for all of those is

538
00:35:29.800 --> 00:35:35.440
to create a performant app that moves
our business forward. Because most of us

539
00:35:35.599 --> 00:35:43.079
are not in the business of running
Postgres databases or building AWS infrastructure. That's

540
00:35:43.199 --> 00:35:46.239
just the means to an end for
whatever our company is actually trying to do.

541
00:35:47.199 --> 00:35:53.400
Agree, where do you see?
Where do you see Postgres and

542
00:35:53.559 --> 00:35:58.079
pganalyze going from here? Because we're
still in the early stages of AI,

543
00:35:58.159 --> 00:36:00.840
still trying to figure out what it
means. And I think you might agree

544
00:36:00.880 --> 00:36:07.039
with me that AI is not a
job killer but a job enabler. I

545
00:36:07.079 --> 00:36:08.519
always say it's complicated, right,
I think for some people it's definitely job

546
00:36:08.599 --> 00:36:12.559
killer. Right. So if you
are in the creative industries, and let's

547
00:36:12.559 --> 00:36:15.079
say you used to like, let's
see you're an artist, but you used

548
00:36:15.079 --> 00:36:19.000
to get money from you know,
working on advertising campaigns. Well, turns out,

549
00:36:19.039 --> 00:36:21.440
you know, OpenAI just
released Sora, which you know makes your

550
00:36:21.440 --> 00:36:24.519
whole video production pipeline much easier to
you know, just automate with AI.

551
00:36:24.920 --> 00:36:28.679
And so maybe you're out of a
job, right because like suddenly your creative

552
00:36:28.679 --> 00:36:31.960
industry job is just no longer paid. Like, you can still keep being an artist,

553
00:36:32.079 --> 00:36:36.559
right, there's just no money to
be made. So I would disagree

554
00:36:36.639 --> 00:36:38.079
saying, you know, it's uh, it's definitely taking jobs, right,

555
00:36:38.119 --> 00:36:45.000
Like that is very clear. I
think it's that's in a sense how when

556
00:36:45.079 --> 00:36:46.159
change is happening, that happens,
but it's also shitty, right, like

557
00:36:46.239 --> 00:36:50.199
it causes all kinds of problems.
So yeah, I think you have to

558
00:36:50.239 --> 00:36:53.199
accept that too. But I think
when I think about, you know,

559
00:36:53.280 --> 00:36:58.079
more personally, in terms of what
I see right like happening in my like

560
00:36:58.719 --> 00:37:00.760
niches of the world, right,
which is like engineering and database optimization.

561
00:37:01.119 --> 00:37:04.639
I think, as mentioned, with database optimization, I'm not sure how much this

562
00:37:04.760 --> 00:37:07.320
is a generative problem, right,
So there to me, the natural language

563
00:37:07.360 --> 00:37:13.440
aspect of it is more a maybe
I you know, there's there's a semi

564
00:37:13.480 --> 00:37:16.000
automated system that can solve some of
these problems for me. But instead of

565
00:37:16.119 --> 00:37:20.239
me having to interact with the user
interface where I click around or like maybe

566
00:37:20.239 --> 00:37:22.519
I should write some code, instead
of that, I can just talk to

567
00:37:22.159 --> 00:37:25.719
my quasi you know, admin of
sorts and it just happens to be,

568
00:37:27.039 --> 00:37:30.119
you know, to be a large
language model type interface. But behind the

569
00:37:30.159 --> 00:37:34.960
scenes, what's actually doing is something
much more deterministic essentially, right, Like

570
00:37:35.039 --> 00:37:40.159
it's driving another system that really makes
makes the smartness essentially work. I think,

571
00:37:40.480 --> 00:37:44.559
you know, in the context of
engineering, obviously, you know, GitHub

572
00:37:44.599 --> 00:37:49.400
Copilot, for example, I think
does have some interesting aspects to it,

573
00:37:49.519 --> 00:37:52.599
right, and so I have not
seen it fully you know, solved,

574
00:37:52.719 --> 00:37:54.960
Like I would wish it could write
my tests for me, right? Like,

575
00:37:55.159 --> 00:37:58.639
when I write code, I hate
writing tests, Like it's just like I'm

576
00:37:58.719 --> 00:38:00.920
lazy, and so what if I
could just have Copilot write it? But

577
00:38:01.000 --> 00:38:05.159
then the problem is intent, right
because like it doesn't replace the intent,

578
00:38:05.280 --> 00:38:07.960
like you have to still tell it
do this, do that, create that,

579
00:38:07.480 --> 00:38:10.480
But it does kind of take a
little bit of the activation energy away,

580
00:38:10.519 --> 00:38:14.280
right? Sometimes when you have
a tedious task, where there's a lot

581
00:38:14.320 --> 00:38:16.280
of things to do. It's just
hard to get started. And so what

582
00:38:16.320 --> 00:38:19.599
I've heard from at least other people, and I've seen it a little bit

583
00:38:19.599 --> 00:38:22.280
myself, is that that's
today already a value that you know,

584
00:38:22.360 --> 00:38:25.320
something like Copilot can deliver. And
so I think if we you know,

585
00:38:25.400 --> 00:38:30.920
project forward, right, I don't
think the I don't think the thinking goes

586
00:38:30.960 --> 00:38:32.360
away, right, I don't think
like none of the large language models are

587
00:38:32.400 --> 00:38:37.079
thinking, right, They're not reasoning, they're just like generating. And so

588
00:38:37.320 --> 00:38:44.280
I think really it's a question of
how do we, like it's a question

589
00:38:44.360 --> 00:38:45.360
of like how do we drive these
systems? Right, Like, how do

590
00:38:45.480 --> 00:38:52.480
we as operators as engineers drive a
system in a way that makes predictable outcomes

591
00:38:52.960 --> 00:38:57.679
but automates the things that are you
know, either hard to like it's just

592
00:38:57.960 --> 00:39:00.840
like take a lot of time.
or, you know, take a lot

593
00:39:00.840 --> 00:39:02.320
of expertise, right, So I
think there's a lot of opportunities there.

594
00:39:02.599 --> 00:39:07.719
But I don't think the human goes
away that drives essentially this is how the

595
00:39:07.760 --> 00:39:13.320
system should work for sure. Gotcha. So there's it's a way to provide

596
00:39:13.360 --> 00:39:20.039
the technical expertise, but you still
have to apply like the the is this

597
00:39:20.199 --> 00:39:23.320
a good idea, yeah, or
like the bounds of it, right,

598
00:39:23.599 --> 00:39:27.559
Like maybe it gives you suggestions.
Like, it's kind of like we do with

599
00:39:27.599 --> 00:39:29.559
indexing, right? So like
today if I look at our solution for

600
00:39:29.639 --> 00:39:30.599
indexing, it's okay,
like, it could definitely be better, right?

601
00:39:30.639 --> 00:39:32.960
Like, it gives you a reasonably
good recommendation. But I think over the

602
00:39:34.079 --> 00:39:36.280
years, what I'll see, you know,
us do, right, is that at

603
00:39:36.320 --> 00:39:39.000
pganalyze, our index recommendations will actually
get to the point where ninety percent of

604
00:39:39.039 --> 00:39:42.960
the time they are just good, right? Like, they're something that an

605
00:39:43.000 --> 00:39:45.960
expert would give you as well,
and you know you can just apply them.

606
00:39:45.239 --> 00:39:50.920
But still I think the decision of
when to apply them, how often

607
00:39:50.960 --> 00:39:52.320
to apply them, you know,
which level of testing to do, that

608
00:39:52.480 --> 00:39:59.000
is something that a human operator should
essentially take into account and decide, because I

609
00:39:59.079 --> 00:40:01.519
think otherwise, you know, otherwise, you know, you don't really have

610
00:40:01.599 --> 00:40:04.840
control over the system, I guess, right, like you'll have you know,

611
00:40:04.920 --> 00:40:07.480
things be unnecessarily expensive when it could
be cheaper, or you maybe have, you know,

612
00:40:07.599 --> 00:40:10.639
things be slow because the system optimized
for the wrong thing. And so

613
00:40:10.679 --> 00:40:15.679
I think that there's this level of
control that I anticipate us needing ever so

614
00:40:15.960 --> 00:40:19.079
you know, ever more so in
the next couple of years. Yeah,

615
00:40:19.119 --> 00:40:22.119
it makes me think of the Jurassic
Park meme with, I think it's Jeff Goldblum,

616
00:40:22.119 --> 00:40:25.800
where it's like you were so focused
on whether or not you could, you

617
00:40:25.920 --> 00:40:30.079
never stopped to think whether or not
you should. Yes, Yes, that's

618
00:40:30.119 --> 00:40:34.360
definitely something I think of when I
see some of these things AI does.

619
00:40:35.280 --> 00:40:45.760
Right. So what are the, what
are the big problems beyond index and

620
00:40:45.920 --> 00:40:51.199
query tuning. What are some of
the big problems you see with Postgres that

621
00:40:52.719 --> 00:40:59.360
fall into the I wish I'd known
that sooner category? Yeah, good question.

622
00:41:01.920 --> 00:41:08.320
I think it's, I mean, there
are some things, like, let me think

623
00:41:08.360 --> 00:41:12.519
about this. So if you have
a lot of data, like terabytes of

624
00:41:12.599 --> 00:41:15.000
data, you can definitely store it
in Postgres, right? So like

625
00:41:15.199 --> 00:41:17.840
Postgres doesn't really have a limit per
se. I think the one thing that

626
00:41:17.920 --> 00:41:22.719
I have definitely seen is if
you anticipate, you know, scaling hugely

627
00:41:22.800 --> 00:41:25.039
to the point that you know you'll
have one hundred terabytes of data, it

628
00:41:25.159 --> 00:41:28.960
really does help to think a little
bit about how you may be able to

629
00:41:29.000 --> 00:41:31.719
split up your data potentially or at
least how it should be structured in the database.

630
00:41:32.519 --> 00:41:37.519
So I'll give you two different examples
of, essentially, data modeling of some sort

631
00:41:37.960 --> 00:41:40.760
is what I'm going for. So
one example is back when I was at Citus

632
00:41:40.800 --> 00:41:45.239
Data, right? Citus is
a sharding system, so you have different

633
00:41:45.400 --> 00:41:49.360
database servers essentially, and you scale
your workload by adding servers, which generally

634
00:41:49.440 --> 00:41:52.960
is a very good pattern. Now,
a problem is if you're looking for,

635
00:41:52.400 --> 00:41:54.679
you know, a record, right, So let's say we have users.

636
00:41:54.800 --> 00:41:57.960
Well, let's see, yeah,
let's say we have, just, we

637
00:41:58.039 --> 00:42:01.440
have billions of users, right,
we have all these servers, right,

638
00:42:01.519 --> 00:42:05.840
and like a certain subset of our
users are in each server, and so

639
00:42:05.960 --> 00:42:08.760
if you're looking for a particular let's
say email, what we'd have to do,

640
00:42:08.960 --> 00:42:12.599
right is we essentially have to run
a query across all these servers,

641
00:42:12.719 --> 00:42:15.559
right, which, like, the
more servers you have, the more complicated

642
00:42:15.639 --> 00:42:16.599
that becomes. And you can parallelize
a lot of that, right? But

643
00:42:16.679 --> 00:42:21.119
like, essentially, it's mainly a problem with, you
know, too many connections, and so

644
00:42:21.599 --> 00:42:24.400
like generally speaking, it's not good
if you have a lot of queries that

645
00:42:24.519 --> 00:42:28.800
go across all the servers. And
so when you have this type of scaling

646
00:42:28.840 --> 00:42:34.000
out anticipation. One thing that I
always do these days is I make sure

647
00:42:34.440 --> 00:42:37.480
that I think about how could my
data potentially be sharded in the future,

648
00:42:37.559 --> 00:42:42.639
right, split up like that?
And oftentimes, let's say you build a

649
00:42:42.639 --> 00:42:45.719
sales tool, right, and a
sales tool has, like, B2B customers,

650
00:42:45.800 --> 00:42:47.239
right, so you have all these
customer IDs, right, And so

651
00:42:49.239 --> 00:42:52.039
in the most naive implementation, you
have, you know, maybe a customers

652
00:42:52.079 --> 00:42:54.079
table, and then you have a
CRM records table and that has a customer

653
00:42:54.199 --> 00:42:57.960
ID. But then maybe each record
has a comment, and so the comment just,

654
00:42:58.119 --> 00:43:00.920
you know, includes the record ID, but not the actual customer ID, because

655
00:43:00.920 --> 00:43:02.559
you're like, that seems duplicated, right? Why would I copy this, you

656
00:43:02.639 --> 00:43:06.199
know, same value all over the
place. And so the thing I would

657
00:43:06.199 --> 00:43:09.440
actually do if I anticipate the scale
is I would actually include my customer ID

658
00:43:09.559 --> 00:43:13.880
in this case, in all my
tables. Because that's the one thing that

659
00:43:14.079 --> 00:43:16.639
makes it a lot easier to shard
out your data is if it's very easy

660
00:43:16.719 --> 00:43:20.880
for you to say, if I
have, you know, my one-terabyte

661
00:43:20.920 --> 00:43:24.159
table, which subset of the
table belongs to each customer, and then

662
00:43:24.199 --> 00:43:27.840
when I'm going to move you know, like ten percent of my customers to

663
00:43:28.000 --> 00:43:30.320
this other server, I can
just you know, select you know,

664
00:43:30.440 --> 00:43:34.360
essentially by customer ID versus doing
a very complicated join that actually becomes quite

665
00:43:34.400 --> 00:43:37.960
expensive once your data set is large.
And so that's the one thing I would

666
00:43:37.000 --> 00:43:44.599
do to anticipate scaling is
to really, you know, I think

667
00:43:44.679 --> 00:43:46.440
think of that, you know,
what, what's your unit of subdivision essentially

668
00:43:46.480 --> 00:43:51.480
in your data? Is there a
way to like potentially just add that ahead

669
00:43:51.480 --> 00:43:54.159
of time so that you have a
better chance of scaling in the future, essentially

670
00:43:54.239 --> 00:44:00.880
without too much effort. And then the other
thing, again data related, that I've found recently
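
NOTE
A minimal sketch, not part of the spoken episode, of the "put the customer ID on every table" idea discussed above.
The table and column names (customers, crm_records, comments, customer_id) are hypothetical,
and SQLite stands in for Postgres only so the snippet runs with the Python standard library.
```python
import sqlite3
# Hypothetical schema; customer_id is deliberately duplicated onto comments as the shard key.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE crm_records (id INTEGER PRIMARY KEY, customer_id INTEGER, title TEXT);
CREATE TABLE comments (id INTEGER PRIMARY KEY, customer_id INTEGER, record_id INTEGER, body TEXT);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "acme"), (2, "globex")])
conn.executemany("INSERT INTO crm_records VALUES (?, ?, ?)", [(10, 1, "lead"), (20, 2, "deal")])
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [(100, 1, 10, "call back"), (101, 2, 20, "sent quote"), (102, 1, 10, "follow up")],
)
def rows_via_join(customer_id):
    # Without the duplicated key, finding a customer's comments needs a join through crm_records.
    sql = (
        "SELECT c.id FROM comments c JOIN crm_records r ON r.id = c.record_id "
        "WHERE r.customer_id = ? ORDER BY c.id"
    )
    return [row[0] for row in conn.execute(sql, (customer_id,))]
def rows_via_shard_key(customer_id):
    # With customer_id on the table itself, the same subset is a plain filter.
    sql = "SELECT id FROM comments WHERE customer_id = ? ORDER BY id"
    return [row[0] for row in conn.execute(sql, (customer_id,))]
print(rows_via_join(1), rows_via_shard_key(1))
```
Both functions return the same rows; the second is the shape of query you would want when moving a subset of customers to another server.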

671
00:44:01.000 --> 00:44:05.480
is just there are some tips and
tricks. This is really kind of Postgres

672
00:44:05.480 --> 00:44:09.079
specific. It probably applies to
MySQL too, which is: sometimes it makes sense

673
00:44:09.199 --> 00:44:13.960
to store data in a what seems
like a less than optimal format, but

674
00:44:14.000 --> 00:44:16.159
it's actually more efficient. So the
example is we store a lot of time

675
00:44:16.199 --> 00:44:20.000
series data, and we don't use
a specialized time series database. We use

676
00:44:20.039 --> 00:44:23.920
Postgres for storing Postgres time series
data, of course, right? And so

677
00:44:24.280 --> 00:44:29.000
one of the things we've recently done
is we've started using arrays to store

678
00:44:29.039 --> 00:44:31.559
some of our data. And so
imagine you have like a data point and

679
00:44:31.639 --> 00:44:35.280
you have you know, let's say
you know, there's five different values,

680
00:44:35.320 --> 00:44:37.000
they're all at the same timestamp.
And so the simplest implementation is you

681
00:44:37.079 --> 00:44:39.920
just have, you know, a timestamp column,
data point one, data point two, data

682
00:44:39.920 --> 00:44:45.079
point three and four, all at the
same timestamp. So what we've done

683
00:44:45.119 --> 00:44:51.000
recently, and this has, like, yielded a
tremendous performance, like, well, disk space efficiency,

684
00:44:51.000 --> 00:44:54.519
but also performance benefits, is we've
essentially put more data points for the same

685
00:44:54.639 --> 00:45:00.880
customer in one row. Right,
so the row essentially becomes like an array

686
00:45:00.920 --> 00:45:02.639
of timestamps, array of data
point one, array of data point

687
00:45:02.639 --> 00:45:06.119
two, array of data point
three and such. And what that does

688
00:45:06.360 --> 00:45:09.480
is Postgres has various mechanisms for how
it, you know, kind of will reduce

689
00:45:09.559 --> 00:45:13.519
overhead in this case, like it
will move some things to secondary

690
00:45:13.599 --> 00:45:16.360
storage in some cases, and
you avoid what's called the tuple

691
00:45:16.400 --> 00:45:20.039
header, which is, like, each
row has like a little header that takes

692
00:45:20.119 --> 00:45:23.440
extra space. And so if for
some reason you have a problem that looks

693
00:45:23.480 --> 00:45:27.599
like ours, right, then you
may want to think about arrays as

694
00:45:27.639 --> 00:45:30.159
a way to kind of build
a, not column storage of sorts, but,

695
00:45:30.239 --> 00:45:34.000
like, it's the kind of thing
that column storage is good at.

696
00:45:34.679 --> 00:45:37.119
But if you are in a row
based store like Postgres, arrays can

697
00:45:37.159 --> 00:45:40.480
be an interesting hack there to optimize
things. And then you can just add

698
00:45:40.760 --> 00:45:45.000
time series database to the list of
things that Postgres can do. Exactly.
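
NOTE
A rough sketch, not part of the spoken episode, of the array-packing idea described above: many data points for one customer go into a single row of arrays.
The function names and sample values are hypothetical, two value columns stand in for the five mentioned,
and the per-row overhead (about 24 bytes of tuple header after alignment plus a 4-byte line pointer) is an approximation.
```python
from collections import defaultdict
# Assumed approximate per-row cost in a Postgres heap page; treat as a rough figure.
ROW_OVERHEAD_BYTES = 24 + 4
def pack_rows(rows):
    # rows: iterable of (customer_id, timestamp, v1, v2); returns one packed row per customer.
    packed = defaultdict(lambda: {"ts": [], "v1": [], "v2": []})
    for customer_id, ts, v1, v2 in rows:
        packed[customer_id]["ts"].append(ts)
        packed[customer_id]["v1"].append(v1)
        packed[customer_id]["v2"].append(v2)
    return dict(packed)
def row_overhead(row_count):
    # Per-row overhead only; ignores array headers, secondary (TOAST) storage, and compression.
    return row_count * ROW_OVERHEAD_BYTES
rows = [(1, 1000, 0.5, 7), (1, 1060, 0.6, 8), (2, 1000, 0.1, 3), (1, 1120, 0.7, 9)]
packed = pack_rows(rows)
print(packed[1]["ts"])
print(row_overhead(len(rows)), "bytes before,", row_overhead(len(packed)), "bytes after")
```
The saving grows with the number of data points packed per row, since the header cost is paid once per row rather than once per data point.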

699
00:45:45.280 --> 00:45:47.280
Yeah, I guess I should, you
know, for completeness' sake, I

700
00:45:47.320 --> 00:45:51.320
should probably say that partitioning is what
you should do first. So forget about

701
00:45:51.360 --> 00:45:52.800
the array stuff. You should partition
your tables if you haven't, right,

702
00:45:52.840 --> 00:45:54.639
So, like, that's the other thing: if you have an append-only

703
00:45:54.719 --> 00:45:58.920
workload, which time series data usually
is, just make sure you use partitions,

704
00:45:58.960 --> 00:46:02.000
because the big anti-pattern in Postgres
is if you're, like, doing inserts

705
00:46:02.000 --> 00:46:06.920
and you're doing deletes, and so
what essentially happens is that you create a lot

706
00:46:06.920 --> 00:46:08.840
of these dead rows versus if you
have partitions, right, Like, let's

707
00:46:08.840 --> 00:46:13.559
imagine you want to keep thirty days
of data, and so you insert each

708
00:46:13.639 --> 00:46:16.159
day's data into its own partition, and then
on the thirtieth day you drop that partition.

709
00:46:16.559 --> 00:46:21.039
And so dropping a partition like a
table partition is much cheaper than doing

710
00:46:21.119 --> 00:46:24.599
a DELETE statement across, you know,
millions of records in a table. Gotcha.

711
00:46:24.719 --> 00:46:28.800
And then that not only is that
more efficient to do, but it

712
00:46:28.960 --> 00:46:31.800
also saves you overhead when it comes
time to vacuum the database. Is that

713
00:46:31.880 --> 00:46:35.760
correct? That's right exactly, because
the vacuum doesn't have to do the work

714
00:46:35.800 --> 00:46:37.679
right because like, yeah, they're
not dead rows, they just are,

715
00:46:38.119 --> 00:46:43.199
you just dropped the table, so.
Right on. Excellent. Cool. Well,
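
NOTE
A toy model, not part of the spoken episode, of the retention pattern described above: one partition per day, and expiry by dropping whole partitions.
The PartitionedTable class and its methods are hypothetical; in Postgres itself this would be declarative partitioning plus dropping the old partition.
```python
RETENTION_DAYS = 30  # from the discussion: keep thirty days of data
class PartitionedTable:
    # One list of rows per day stands in for one Postgres partition per day.
    def __init__(self):
        self.partitions = {}
    def insert(self, day, row):
        self.partitions.setdefault(day, []).append(row)
    def expire(self, today):
        # Dropping a partition is one operation regardless of how many rows it holds,
        # and it leaves no dead rows behind for vacuum to clean up.
        dropped = [d for d in self.partitions if d <= today - RETENTION_DAYS]
        for d in dropped:
            del self.partitions[d]
        return dropped
table = PartitionedTable()
for day in range(1, 33):
    for i in range(1000):
        table.insert(day, (day, i))
print(table.expire(today=32), len(table.partitions))
```
The row-by-row alternative would have to visit every expired record, which is the work the partitioned layout avoids.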

716
00:46:43.719 --> 00:46:45.719
we could continue digging in on this, but I know you've got a

717
00:46:46.480 --> 00:46:50.960
meeting coming up here, so it
feels like we're at a good stopping point and

718
00:46:51.039 --> 00:46:53.719
then we'll get you off to your
next meeting on time. But thanks for

719
00:46:53.960 --> 00:46:57.400
thanks for coming and talking about this. This has been cool, and I

720
00:46:58.000 --> 00:47:00.559
it's been insightful. I learned a
lot of things about Postgres that I

721
00:47:00.599 --> 00:47:05.400
didn't know despite having used it for
a long time. Perfect. Yeah,

722
00:47:05.400 --> 00:47:08.760
And if anybody who's listening is interested
to learn even more about Postgres, I

723
00:47:09.039 --> 00:47:13.800
host a weekly video series on YouTube
called Five Minutes of Postgres. So if

724
00:47:13.840 --> 00:47:15.519
you want to get you know,
a little like snippet of what's new with

725
00:47:15.559 --> 00:47:19.840
Postgres each week, feel free to
subscribe to that, and I try

726
00:47:19.880 --> 00:47:22.960
to make that as useful as I
can to the community. Right on.

727
00:47:22.119 --> 00:47:25.559
What's the name of that YouTube channel? It's just on the pganalyze channel,

728
00:47:25.599 --> 00:47:29.840
but, okay, the name of the
series is Five Minutes of Postgres. Awesome.

729
00:47:30.000 --> 00:47:35.480
Right on. And then anywhere else
do you hang out online that people

730
00:47:35.519 --> 00:47:39.480
can interact with you? Yeah,
I mean, I'm on Mastodon, I'm still

731
00:47:39.519 --> 00:47:45.199
on Twitter, aka X, and LinkedIn,
so feel free to you know, find

732
00:47:45.199 --> 00:47:47.239
me online. I'll send you a
few links you can include in the show notes.

733
00:47:47.559 --> 00:47:51.239
Perfect. But generally, you know, if you want to hear more about

734
00:47:51.239 --> 00:47:53.880
pganalyze, just go to our
website, pganalyze.com. We host webinars

735
00:47:53.920 --> 00:47:57.960
every now and then where we talk
about, you know, things like how

736
00:47:58.000 --> 00:48:00.960
to run your Postgres database and, of
course, how pganalyze can help. But

737
00:48:00.079 --> 00:48:04.159
we try to, you know,
make that generally useful. And then, you

738
00:48:04.199 --> 00:48:07.559
know, also, I'm in a
bunch of the Postgres community spaces and Postgres

739
00:48:07.639 --> 00:48:10.280
conferences, so if you're at a
Postgres conference later this year, maybe I'll be

740
00:48:10.360 --> 00:48:14.519
there. Awesome. Right on. Well, thank you so much for joining me

741
00:48:14.599 --> 00:48:16.440
on the show. Perfect. Thank
you so much for having me. All right,

742
00:48:16.679 --> 00:48:20.480
see you, see everyone else next week. Bye, everyone.

