1
00:00:14,519 --> 00:00:19,000
What's going on, everybody? Welcome
to another episode of Adventures in DevOps.

2
00:00:19,039 --> 00:00:22,719
I'm your host, Will Button, and joining
me today, this is going to

3
00:00:22,760 --> 00:00:27,960
be exciting. I have the founder
of pganalyze and a member of the

4
00:00:28,000 --> 00:00:32,079
founding team at Product Hunt. I've
got Lukas Fittl. Welcome, Lukas.

5
00:00:32,719 --> 00:00:35,560
Thank you, happy to be here.
Right on. I'm excited to have you.

6
00:00:35,719 --> 00:00:43,719
And this follows on nicely to an
episode we had two episodes ago talking

7
00:00:43,840 --> 00:00:49,920
about the role of DBAs in our
environment, you know, with

8
00:00:49,960 --> 00:00:53,439
the movement that's happened over the last
decade. You know, whenever I started,

9
00:00:53,520 --> 00:00:58,799
DBA was like the premier role to
have in technology. You know,

10
00:00:58,880 --> 00:01:03,840
being a database administrator was like the
ultimate role because you could just you could

11
00:01:03,840 --> 00:01:07,480
just say no and shut everyone down, and you know, the company was

12
00:01:07,519 --> 00:01:11,200
almost powerless to do anything about it. Not that that was your goal,

13
00:01:11,319 --> 00:01:15,120
but like that was the prestige that
the DBA had. And these days we

14
00:01:15,239 --> 00:01:22,480
just don't see as many DBAs around, but those tasks that the DBAs do

15
00:01:22,799 --> 00:01:26,079
still exist, and so I'm interested
to hear your perspective on that. But

16
00:01:26,319 --> 00:01:30,640
before we jump into that, give
us a little bit about your background.

17
00:01:30,719 --> 00:01:33,719
How you got to this point.
Sure, yeah, and you know,

18
00:01:34,640 --> 00:01:37,400
I'll try to relate this also to
Postgres, which is the database I care

19
00:01:37,439 --> 00:01:42,959
a lot about. So over the
years, you know, I eventually started

20
00:01:42,959 --> 00:01:47,560
out in two thousand six and seven,
putting servers in racks. But back when

21
00:01:47,560 --> 00:01:49,920
we still had racks. You know, we still do have racks. It's

22
00:01:49,959 --> 00:01:53,480
just you don't think about them anymore; it's
a logical concept more than a physical thing you think

23
00:01:53,519 --> 00:01:57,359
about. And so, you know, pretty much,
you know, early like right after that,

24
00:01:57,400 --> 00:02:00,599
you know, kind of the first
job I had, I you know,

25
00:02:00,640 --> 00:02:02,000
kind of started a startup with some
friends of mine, and Postgres was the

26
00:02:02,079 --> 00:02:06,640
database of choice, right, and
so you know back then, you know,

27
00:02:06,680 --> 00:02:09,080
we kind of just had everything stored
in Postgres, and as, you know,

28
00:02:09,280 --> 00:02:13,080
my personal career went on, I
kept, you know, seeing Postgres

29
00:02:13,080 --> 00:02:15,960
over the years, right, and
Postgres is an old database, like it's

30
00:02:15,960 --> 00:02:17,360
not as old as I am,
but it's close to it. Right,

31
00:02:17,520 --> 00:02:23,520
So the you know, like they
turned I think twenty seven years last year

32
00:02:23,919 --> 00:02:30,240
Postgres, was it not? So
uh, the really the story here,

33
00:02:30,280 --> 00:02:34,800
right is that I think what I've
seen over the years in my career,

34
00:02:34,879 --> 00:02:38,080
right, is, like, so Product
Hunt, for example, used Postgres as the database,

35
00:02:38,159 --> 00:02:42,800
right, like Product Hunt, the community. The
database was, you know, reasonable, not

36
00:02:42,840 --> 00:02:45,800
the biggest database, but it just
worked right like you just had the system

37
00:02:45,840 --> 00:02:50,719
of record, and that was Postgres.
And then after that I joined a company

38
00:02:50,719 --> 00:02:53,120
called Citus Data. And Citus
Data is essentially a way to scale out

39
00:02:53,159 --> 00:02:57,280
Postgres, and that ended up being
acquired by Microsoft. So, you know,

40
00:02:57,319 --> 00:03:00,439
I had some fun in the corporate
world for a little bit. And then

41
00:03:00,680 --> 00:03:04,680
you know, really what this led
me to is where I'm at currently,

42
00:03:05,400 --> 00:03:07,800
which is a company called pganalyze. And so pganalyze is a small

43
00:03:07,800 --> 00:03:14,080
startup, bootstrapped very intentionally, so
no VC funding. And really the idea

44
00:03:14,159 --> 00:03:16,479
is that we want to make Postgres
work better, right, Postgres performance in

45
00:03:16,479 --> 00:03:23,039
particular, work better for the application engineers, the DBAs, the data platform engineers; help

46
00:03:23,120 --> 00:03:27,560
you optimize slow queries better. I've
actually run this as a side project before,

47
00:03:27,800 --> 00:03:30,199
you know, kind of I went
full time. So I actually started

48
00:03:30,360 --> 00:03:34,919
technically like almost eleven years ago.
Wow, but you know, only I

49
00:03:34,919 --> 00:03:37,800
would say the last four or five
years has been really a serious project.

50
00:03:37,840 --> 00:03:39,560
Right before then, it was more
of a this was, you know,

51
00:03:39,879 --> 00:03:43,159
solving a problem. Some people are
paying for it, but really, you

52
00:03:43,159 --> 00:03:45,479
know, as a business. It's
really been you know, the last couple

53
00:03:45,520 --> 00:03:49,439
of years where it has taken off
and we've really seen great success. Right on.

54
00:03:49,719 --> 00:03:52,680
So you kind of, it's one
of those sounds like it's one of

55
00:03:52,719 --> 00:03:58,759
those stories where you had a specific
problem personally and you built this tool to

56
00:03:58,840 --> 00:04:01,919
solve it and then realized, hey, other people kind of have the same

57
00:04:01,960 --> 00:04:05,199
problem. That's right. Yeah.
And I actually so I had a friend

58
00:04:05,240 --> 00:04:09,319
who he was involved early on with
the project but is no longer. But

59
00:04:10,039 --> 00:04:13,840
I had a friend who was essentially
my go-to Postgres person. Right?

60
00:04:13,840 --> 00:04:15,919
So if you asked me ten years
ago, I probably wouldn't have been

61
00:04:15,919 --> 00:04:18,439
that smart about Postgres as I am these days. Definitely not the smartest. There's definitely

62
00:04:18,439 --> 00:04:23,560
smarter people out there. But I
learned a thing or two. But, you know,

63
00:04:23,600 --> 00:04:26,000
back then, essentially he was a
person I would ask, you know,

64
00:04:26,000 --> 00:04:28,560
how to optimize this query? How
do I, you know, set this thing up so

65
00:04:28,639 --> 00:04:30,959
it works well? How do I
direct the replication? And so really,

66
00:04:31,000 --> 00:04:35,360
part of scratching my own itch
initially was: this is a person that's great,

67
00:04:35,480 --> 00:04:40,000
but there's some tasks that just
don't seem necessary for a person to

68
00:04:40,079 --> 00:04:45,240
do manually, right, for sure. Yeah, and I think that's one

69
00:04:45,279 --> 00:04:47,879
of the things I've encountered as well
as like I have in my own network,

70
00:04:48,279 --> 00:04:51,680
like, oh, this dude is
my go to guy for that.

71
00:04:53,439 --> 00:04:58,160
But you know, it's a personal
relationship, you know, and I'm always

72
00:04:58,680 --> 00:05:02,600
concerned with, you know, making
sure that that relationship goes two ways and

73
00:05:02,600 --> 00:05:09,079
that I'm just not beating these people
up with questions, you know. So

74
00:05:09,279 --> 00:05:15,199
yeah, I see the need there
to build that. So what what were

75
00:05:15,199 --> 00:05:21,000
some of the like big things that
you were trying to accomplish with Postgres or

76
00:05:21,000 --> 00:05:25,199
doing with Postgres, or how did you
even know that these were things that you

77
00:05:25,279 --> 00:05:29,519
needed to be worried about? I
mean, the simplest thing is just a

78
00:05:29,560 --> 00:05:33,319
slow query, right, which I
would say basically everybody who runs a Postgres

79
00:05:33,399 --> 00:05:35,839
database will run into a slow query
at some point. And this is not

80
00:05:36,000 --> 00:05:39,319
you know, just Postgres, right? It's just any database, really.

81
00:05:40,000 --> 00:05:43,279
Queries are a thing, right,
Like that's your basic unit of work essentially,

82
00:05:43,720 --> 00:05:47,600
And so I think the most fundamental
thing is the database doesn't behave as

83
00:05:47,600 --> 00:05:49,800
I expected it to, right? Like,
it's not as fast as I expected it to be,

84
00:05:49,879 --> 00:05:54,279
it's returning less data than I expected
it to, like, whatever is happening.

85
00:05:55,199 --> 00:06:00,560
And so really I think that was
the most fundamental thing is understanding which queries

86
00:06:00,600 --> 00:06:03,360
are slow. Right. Sometimes you
have a visibility problem, because sometimes on the application

87
00:06:03,439 --> 00:06:06,399
side, you're actually calling a function
like in your ORM, you're not really

88
00:06:06,399 --> 00:06:10,199
thinking of the database as queries,
right? You're not actually running SQL.

89
00:06:11,040 --> 00:06:14,680
Sometimes it's just a visibility problem.
But then once you once you know which

90
00:06:14,759 --> 00:06:16,279
query is slow, there's a question
of you know, why is it slow?

91
00:06:16,360 --> 00:06:18,959
Right? How do I how do
I go from knowing it's slow, that

92
00:06:18,959 --> 00:06:24,120
it's kind of this magic box of
activity to you know, actually the database

93
00:06:24,240 --> 00:06:26,920
is doing this index scan, it's
doing a sequential scan, it's doing this

94
00:06:27,040 --> 00:06:30,279
join, and maybe there's a better way
to join things, right, So there's

95
00:06:30,319 --> 00:06:33,639
there's different choices of how the database
implements what you tell it to do.

96
00:06:34,600 --> 00:06:39,240
And so this is I think really
the most fundamental thing that I've seen myself

97
00:06:39,759 --> 00:06:44,560
need to know about databases is you
know why things are slow and can I

98
00:06:44,600 --> 00:06:46,920
do something to improve them? And
then you know, the more you have

99
00:06:46,959 --> 00:06:49,920
an expert at hand, Really,
what the expert I think can do is

100
00:06:50,720 --> 00:06:55,279
help you understand which action to take
right, and the action might be something

101
00:06:55,360 --> 00:07:00,439
like rewrite the query, kind of get the
database to a different query plan. There's

102
00:07:00,480 --> 00:07:03,920
like different ways to influence what's going
on. But understanding that you have a

103
00:07:04,160 --> 00:07:06,759
slow query problem in the first place, I think that's really the most fundamental

104
00:07:06,800 --> 00:07:11,040
thing to start with. For sure.
Yeah, and I'm glad you brought up

105
00:07:11,040 --> 00:07:15,319
ORMs because they are, like,
I think anymore, that's like the default

106
00:07:15,879 --> 00:07:20,120
interaction for a lot of engineers
with the database, just through the ORM,

107
00:07:20,959 --> 00:07:26,079
and you don't see the queries that
they create a lot of times.

108
00:07:26,120 --> 00:07:31,920
And I've seen quite a few instances
where when the database performance slows, the

109
00:07:31,959 --> 00:07:38,079
first response is always to increase the
size of the database server. And then

110
00:07:38,839 --> 00:07:41,720
and then you get that that bill, you know, you get the AWS

111
00:07:41,839 --> 00:07:46,240
or the GCP bill, And that's
whenever the finance team comes down and says,

112
00:07:46,279 --> 00:07:50,079
hey, maybe we should take a
different approach to this. And so

113
00:07:50,120 --> 00:07:55,360
that's where some you know, that's
whenever you've got to break open the hood

114
00:07:55,399 --> 00:08:00,120
on the ORM and really start
understanding what queries it's running and is there

115
00:08:00,120 --> 00:08:01,839
a way to optimize those?
That's right, Yeah, And since you

116
00:08:01,879 --> 00:08:05,199
talk about costs, right, I
think like these days everybody's like optimizing for

117
00:08:05,240 --> 00:08:07,639
costs, right because like things are
expensive, and companies are you know,

118
00:08:07,720 --> 00:08:13,800
laying people off and all the bad
stuff. And databases are so interesting, right, because

119
00:08:13,800 --> 00:08:16,120
they're quite expensive, but they're also
hard to change, right, And so

120
00:08:16,160 --> 00:08:20,000
I think that's that's you know something
else I see. Often it's really like

121
00:08:20,040 --> 00:08:22,439
as you mentioned, right, like
this like people reach kind of a performance

122
00:08:22,480 --> 00:08:24,680
bottleneck, but then they don't know
how to solve it, and so they

123
00:08:24,800 --> 00:08:30,000
just upscale the database. What's worse is,
usually in the cloud, your databases go

124
00:08:30,120 --> 00:08:33,919
up in, you know, sizes of two,
essentially multiples, power-of-two type things,

125
00:08:33,960 --> 00:08:35,480
right, So like I have sixteen
gigabytes of RAM, next you've got to

126
00:08:35,480 --> 00:08:39,279
have thirty-two gigabytes of RAM, then sixty-
four, right, and so you just

127
00:08:39,399 --> 00:08:43,159
keep essentially like doubling your costs,
which is horrible, right. And it's

128
00:08:43,159 --> 00:08:46,799
really hard to not do that because
you can't just, like, have multiple database servers,

129
00:08:46,840 --> 00:08:50,799
like that requires architecture, like you
actually have to think about you know,

130
00:08:50,879 --> 00:08:54,399
can I use read replicas and like
if you like split up the read workload

131
00:08:54,960 --> 00:08:56,639
and so stuff like that becomes a
challenging problem. Right. So this is

132
00:08:56,639 --> 00:09:01,879
where if you can figure out a
query performance issue without scaling up the database

133
00:09:01,879 --> 00:09:05,799
server, right, by adding
an index, for example, then

134
00:09:05,879 --> 00:09:09,039
that, you know, can actually save you a
lot of money, which is why people

135
00:09:09,039 --> 00:09:13,399
care about it. Oh for sure. Yeah. And I'm not a DBA

136
00:09:13,679 --> 00:09:16,720
by any stretch of the imagination.
But the power of indexes, whenever I

137
00:09:16,799 --> 00:09:22,360
discovered that early on in my career, was just mind-blowing at how

138
00:09:22,480 --> 00:09:26,039
much of a difference that would make
in the database performance. Yeah for sure.

139
00:09:26,120 --> 00:09:28,600
Yeah, and it's I mean,
the good news is, I think

140
00:09:28,919 --> 00:09:33,320
people have gotten better over the years
of understanding the basics of indexes. I

141
00:09:33,360 --> 00:09:37,039
think where it's hard, like what's
hard to understand about indexes sometimes, is that

142
00:09:37,320 --> 00:09:39,679
there's multiple different index types. Right, so Postgres, like, the default is

143
00:09:39,679 --> 00:09:43,200
a B-tree, which most people
roughly know conceptually what a B-tree is.

144
00:09:43,919 --> 00:09:48,000
But Postgres has all these different index
types like GiST, GIN, hash, BRIN,

145
00:09:48,600 --> 00:09:52,919
all these like specialized things if you're
talking AI and ML, there's,

146
00:09:54,000 --> 00:09:58,879
you know, kind of, pgvector has,
like, IVFFlat and HNSW index types, and

147
00:09:58,960 --> 00:10:03,840
so really the challenge there is understanding
when you use these specialized types, right,

148
00:10:03,679 --> 00:10:07,960
But, like, that's essentially what's even
worse than the regular indexing problem.

149
00:10:07,960 --> 00:10:09,960
It's just you know, oh,
I forgot to add an index,

150
00:10:09,159 --> 00:10:16,200
which index to add, right?
Yeah, just information overload there. Yeah.

151
00:10:16,519 --> 00:10:26,039
So everyone's talking about AI these days, and I think one of the

152
00:10:26,200 --> 00:10:31,679
one of the stories that we're trying
to sell really hard is that AI is

153
00:10:31,759 --> 00:10:35,879
not the tool to replace your job. AI is a tool that helps you

154
00:10:35,960 --> 00:10:41,080
be more proficient at your job because
as an engineer today, like you've got

155
00:10:41,080 --> 00:10:46,960
to be versed in so many different
things from infrastructure to databases, to the

156
00:10:48,200 --> 00:10:52,759
actual application code that you're writing,
to the business knowledge. So what kind

157
00:10:52,840 --> 00:10:56,039
of what kind of assistance can we
get from AI in helping understand things like

158
00:10:56,080 --> 00:11:01,919
that, like where we need an
index and what type of index? Yeah?

159
00:11:03,120 --> 00:11:05,080
And I think it's it's a complex
topic, right, So I think

160
00:11:05,120 --> 00:11:09,759
the answer is I think there are
things we should be we should be thinking

161
00:11:09,759 --> 00:11:13,480
about. Is it important that a
human is doing activity or can we be

162
00:11:13,639 --> 00:11:16,440
machine assisted at the very least?
Right, I don't necessarily know if we

163
00:11:16,480 --> 00:11:18,519
want to be machine driven, right, I don't know if we want to

164
00:11:18,559 --> 00:11:22,200
have the AI kind of take control
of our database tuning, for example.

165
00:11:22,480 --> 00:11:24,919
I think you know very much the
same way that if you think of the

166
00:11:24,919 --> 00:11:26,159
DevOps world, right, do
we just want to have you know,

167
00:11:26,200 --> 00:11:31,759
AI orchestrate our servers automatically and just
provision things? Probably not? Right,

168
00:11:31,840 --> 00:11:33,320
Like, things like Terraform and stuff
are a good thing, Like you want

169
00:11:33,320 --> 00:11:37,639
to have that level of control. And
so I think with I mean there's a

170
00:11:37,679 --> 00:11:41,360
couple of things going on in that field, right? I think, like,

171
00:11:41,480 --> 00:11:45,000
just looking backward the last couple of
years, looking outside of what we ourselves

172
00:11:45,000 --> 00:11:46,919
have done, and maybe we just
start with what other people have done.

173
00:11:46,720 --> 00:11:50,679
So just in terms of research,
there used to be a project called

174
00:11:50,759 --> 00:11:56,399
OtterTune. They're a company now. They
essentially do, I don't know the details personally, but they

175
00:11:56,440 --> 00:12:01,960
essentially do, like, Bayesian statistics-based
optimization, right, so they do not

176
00:12:03,240 --> 00:12:05,960
use generative AI, right, Like
it's not like you're asking ChatGPT how

177
00:12:05,960 --> 00:12:09,000
to optimize the database. What they're essentially
doing is saying, for this particular parameter

178
00:12:09,080 --> 00:12:13,159
value, can we run you know, a model that essentially comes up with

179
00:12:13,200 --> 00:12:16,360
the best possible parameter? Right,
And it's pretty, I feel like it's

180
00:12:16,360 --> 00:12:20,679
a cool idea, it's a cool system.
There's other like there's another startup like that,

181
00:12:20,759 --> 00:12:24,120
DBtune, which also tries to do
something very similar, more recent,

182
00:12:24,200 --> 00:12:28,399
similar idea, right, which is
how can we get the best parameter values?

183
00:12:28,480 --> 00:12:31,279
Essentially, so I think that's great. You should take a look at

184
00:12:31,279 --> 00:12:35,480
these if you're interested in tuning parameters. What we have done at pganalyze

185
00:12:35,519 --> 00:12:41,440
is we've focused on essentially the
things that you know, they're a little

186
00:12:41,440 --> 00:12:43,559
bit more fuzzy in terms of who
owns them, right? So you

187
00:12:43,919 --> 00:12:46,679
just mentioned indexes earlier, and so
what we've actually come up with is a system

188
00:12:46,720 --> 00:12:52,720
called pganalyze Index Advisor, which
is essentially a recommendation system for which indexes

189
00:12:52,720 --> 00:12:56,480
to create, right, and so
different than parameters, right, the parameter

190
00:12:56,519 --> 00:12:58,159
is sometimes a choice, like you
kind of want to fine-tune it,

191
00:12:58,200 --> 00:13:01,240
but you don't necessarily need to do
that all the time, or your application

192
00:13:01,279 --> 00:13:05,639
engineers don't need to know how you
tune your shared_buffers parameter in Postgres.

193
00:13:05,679 --> 00:13:09,919
Like that stuff isn't that important essentially
to the application engineer. But indexes are

194
00:13:09,960 --> 00:13:13,519
interesting because they they have like this
fuzzy ownership, right, Like is it

195
00:13:13,559 --> 00:13:16,919
the application engineer owning them? Is
it the kind of DBA or the platform

196
00:13:16,919 --> 00:13:20,679
engineer owning them? And so we've
essentially built this system that, you know,

197
00:13:20,440 --> 00:13:22,639
imagine it's kind of like your safety
harness, right, Like, it doesn't

198
00:13:22,639 --> 00:13:26,840
necessarily say you you've got to drive
your decisions, but if you forget to

199
00:13:26,840 --> 00:13:28,840
add an index, it will tell
you. And so it's not you know

200
00:13:28,919 --> 00:13:31,200
would I call it AI? I
don't know, right, AI is a

201
00:13:31,200 --> 00:13:33,960
complex term. It's not gen AI, right? Like, it's not generative

202
00:13:35,000 --> 00:13:37,679
AI, the same way as OtterTune
or DBtune are not generative AI. But

203
00:13:37,759 --> 00:13:41,360
it is a way to essentially have
a recommendation, uh, kind of of

204
00:13:41,519 --> 00:13:46,279
which things to create based on your
query workload, right, and

205
00:13:46,320 --> 00:13:48,919
so in a similar spirit, what we've also
done is what we call vacuum Advisor.

206
00:13:50,159 --> 00:13:52,600
And so vacuum in Postgres is a very
particular concept. You may not be familiar

207
00:13:52,639 --> 00:13:58,240
with it, but essentially it's it's
essentially the dead row cleanup in Postgres,

208
00:13:58,279 --> 00:14:01,600
right? Like, when you have, like,
an update or delete, Postgres, you know,

209
00:14:01,639 --> 00:14:07,080
will create essentially a record that kind
of just marks the deleted data essentially,

210
00:14:07,559 --> 00:14:09,399
and then vacuum comes and cleans that
up. And so sometimes you need to

211
00:14:09,399 --> 00:14:13,120
fine-tune the schedule of that,
right, you need to understand how often

212
00:14:13,159 --> 00:14:15,519
is it running? Is it running too
often or not often enough? Is it?

213
00:14:15,559 --> 00:14:18,679
You know, kind of like running
at the wrong time of day, right,

214
00:14:18,720 --> 00:14:20,679
like I have my business hours and
suddenly the database is busy with a vacuum.

215
00:14:20,960 --> 00:14:24,639
And so we've essentially done something where
we looked at all the time series

216
00:14:24,720 --> 00:14:28,000
data that we have in our system
and we said, you know, can

217
00:14:28,039 --> 00:14:31,399
we make you recommendations for which config
settings to change. This kind of goes

218
00:14:31,440 --> 00:14:33,799
a little bit into that parameter tuning
area. What we then do is we

219
00:14:33,840 --> 00:14:37,440
do this on a per-table basis, and so this is then again where,

220
00:14:37,759 --> 00:14:41,039
if you as a human did that, it just becomes very complicated, right

221
00:14:41,039 --> 00:14:43,919
because you've got to think of like
imagine our own database. For example,

222
00:14:43,919 --> 00:14:46,759
we have a thousand tables. If
you have a thousand tables, looking at

223
00:14:46,799 --> 00:14:48,960
each and every one of them and
then looking at you know, a graph

224
00:14:50,039 --> 00:14:52,679
over time that describes to you how
you know, the autovacuum works. It's

225
00:14:52,720 --> 00:14:56,879
just very tedious. And so being
able to have that automatically kind of looked

226
00:14:56,879 --> 00:15:01,159
at and analyzed is really what I've
found quite useful over the years. And

227
00:15:01,480 --> 00:15:03,919
what we've done. Oh, for sure. To me, that's like pure

228
00:15:05,000 --> 00:15:09,759
gold right there, because you know, like I know that you have to

229
00:15:09,159 --> 00:15:13,120
vacuum Postgres databases, and when I
first encountered that, I was like,

230
00:15:15,399 --> 00:15:18,919
what the hell I have to vacuum
this thing? Like can I hire a

231
00:15:20,000 --> 00:15:22,679
housekeeper? Or how does this work? And then you want to do it?

232
00:15:22,919 --> 00:15:26,759
Yeah? Yeah, and then so
then, yeah, that just led

233
00:15:26,799 --> 00:15:28,320
to more questions for me, like, well, should I trust this built

234
00:15:28,320 --> 00:15:31,360
in housekeeper? When's it going to
do it? How do I know if

235
00:15:31,360 --> 00:15:35,320
it's doing it at the right time? And so to have a tool that

236
00:15:35,519 --> 00:15:39,559
looks at what's actually going on in
my database and then makes recommendations based

237
00:15:39,600 --> 00:15:46,159
on that activity, I think it
is just like an amazing resource to have

238
00:15:46,440 --> 00:15:50,960
because it answers so many questions for
me that I didn't even know were

239
00:15:52,000 --> 00:15:56,879
supposed to be questions, right exactly, Yeah, And I think especially if

240
00:15:56,919 --> 00:15:58,840
you, if you like are not
a Postgres expert, right, like,

241
00:15:58,840 --> 00:16:03,559
this is what I've seen over the years,
is like people migrate from Oracle to Postgres

242
00:16:03,600 --> 00:16:06,679
or from SQL Server to Postgres,
and these databases they do this differently,

243
00:16:06,759 --> 00:16:08,639
right, so they don't have vacuum
as a concept, And so people when

244
00:16:08,639 --> 00:16:11,519
they first go to Postgres and then
they have this big production system. Suddenly

245
00:16:11,559 --> 00:16:15,200
these vacuum problems start happening, and
they're like, what's going on? Like

246
00:16:15,480 --> 00:16:17,720
what should I do? Right,
Like I don't have anybody in house who

247
00:16:17,720 --> 00:16:22,679
knows about this. Yeah, yeah, and you can always post on Stack

248
00:16:22,720 --> 00:16:26,440
Overflow and hope for the best.
There are some good people on Stack

249
00:16:26,480 --> 00:16:30,039
Overflow. Let me talk, like,
it's really you know, I've I've learned

250
00:16:30,039 --> 00:16:34,360
a thing or two or ten from
Stack Overflow posts, for sure. Yeah,

251
00:16:34,480 --> 00:16:41,799
it's it's funny because like there's the
learning curve of Stack Overflow, like

252
00:16:41,840 --> 00:16:47,519
it's it's this tremendous resource, but
you have to learn how to interact with

253
00:16:47,639 --> 00:16:51,720
you know, how to here's the
minimum criteria I have to have to open

254
00:16:52,000 --> 00:16:55,480
the question, you know, and
here's what I've got to include if I'm

255
00:16:55,480 --> 00:17:00,000
going to get a legitimate, valuable response. Right. And I think what's interesting

256
00:17:00,440 --> 00:17:03,960
on the topic of Stack Overflow, that
that actually makes me think in terms of

257
00:17:04,039 --> 00:17:07,920
AI, that makes me think of
gen AI, right, Like the whole

258
00:17:07,599 --> 00:17:11,000
you know, kind of give me
a response that resembles a Stack Overflow answer is, I

259
00:17:11,000 --> 00:17:15,240
think, what generative AI and LLMs
are good at and so somebody in the

260
00:17:15,240 --> 00:17:21,480
Postgres community, Nikolay from Postgres.ai, he has a recent project where he's essentially

261
00:17:21,480 --> 00:17:26,480
connected ChatGPT with kind of
a knowledge base of different Postgres articles, and

262
00:17:26,519 --> 00:17:29,359
it's actually a really interesting experiment.
But he's essentially saying, what if we

263
00:17:29,400 --> 00:17:32,160
had a chatbot, you know,
that was powered by ChatGPT, or

264
00:17:32,279 --> 00:17:36,200
GPT-4, I think, and then
you would essentially ask a question just like,

265
00:17:36,599 --> 00:17:38,200
you know, how do I set
this parameter in Postgres, and then it will

266
00:17:38,240 --> 00:17:41,559
just spit out something that essentially looks
like a Stack Overflow answer, which I think,

267
00:17:42,039 --> 00:17:45,160
like, that has its place as
well, right because people, I think

268
00:17:45,200 --> 00:17:49,240
what people do alternatively to asking a
chatbot is that they'll go on Google

269
00:17:49,240 --> 00:17:52,720
and they'll search how to tune this
parameter. And then, unfortunately, with

270
00:17:52,839 --> 00:17:56,039
you know, the advance of AI, what has happened is a lot of

271
00:17:56,079 --> 00:18:00,799
more articles are just not that useful
because it's become pretty cheap to

272
00:18:00,839 --> 00:18:03,960
generate, you know, kind of
that content, in a sense. And so what he's

273
00:18:03,960 --> 00:18:04,920
done is really, you know,
kind of I think, built a better

274
00:18:04,920 --> 00:18:08,200
solution to you know, how do
I get good Postgres knowledge, by

275
00:18:08,279 --> 00:18:12,480
essentially building off of GPT-4 and
having kind of this this more trusted I

276
00:18:12,480 --> 00:18:18,240
think data like database of good articles, which I found interesting. Yeah.

277
00:18:18,279 --> 00:18:23,440
I don't think it's going to be
truly successful though until chat GPT actually sends

278
00:18:23,440 --> 00:18:29,200
out insulting answers at random just to
insult your intelligence, and tell you to come

279
00:18:29,200 --> 00:18:33,680
back when you actually know how to
ask an intelligent question. Yeah. Yeah,

280
00:18:33,680 --> 00:18:34,960
And it has some weird issues like
earlier this week, I just I'm

281
00:18:34,960 --> 00:18:37,880
not an active ChatGPT user
personally, but you know that there was

282
00:18:37,920 --> 00:18:42,640
like this issue where it just like
started spewing out garbage. It's certainly,

283
00:18:42,920 --> 00:18:45,880
you know sometimes I mean it's it's
interesting in a sense, right, Like

284
00:18:45,920 --> 00:18:48,720
it feels like I'm reading a science
fiction novel and I'm not quite sure

285
00:18:48,720 --> 00:19:02,519
where it's going. Right. Cool, So let's go back to to someone

286
00:19:02,599 --> 00:19:08,599
who's working with Postgres, you
know, has multiple other roles, and

287
00:19:08,640 --> 00:19:12,680
we talked about, you know,
the need for indexes and query optimization.

288
00:19:14,400 --> 00:19:21,480
But that's like pretty pretty specific,
like you're you're narrowing in on the actual

289
00:19:21,559 --> 00:19:26,759
problem there. What are the indicators
before that that tell them that this is

290
00:19:26,759 --> 00:19:29,880
the area that you want to focus
on? Like, you're running Postgres. What

291
00:19:29,960 --> 00:19:33,279
kind of things do you see that
tell you, hey, this is a

292
00:19:34,319 --> 00:19:37,960
something that Postgres might help with,
versus this is something that looks like it's

293
00:19:38,000 --> 00:19:44,559
application code or infrastructure related. Yeah. No, it's a good question.

294
00:19:44,680 --> 00:19:45,799
I think it's it's actually not an
easy problem to solve, right, because

295
00:19:45,839 --> 00:19:49,759
you you're kind of looking at two
different worlds. Right. So I think

296
00:19:49,799 --> 00:19:53,559
that what people most commonly would have
is they would have like an APM tool

297
00:19:53,559 --> 00:20:00,599
of sorts, or a tracing tool, right? They might
be using New Relic, Datadog, like, a long

298
00:20:00,680 --> 00:20:04,039
list of APM tools, right? Yeah. And so what you would see in these

299
00:20:04,079 --> 00:20:07,599
tools is you would see a trace
for example, right, this is what you

300
00:20:07,599 --> 00:20:11,200
would call a trace, and you
would say, here's a slow request,

301
00:20:11,200 --> 00:20:14,759
and you would see different spans essentially
in the trace that would say you know

302
00:20:14,759 --> 00:20:18,400
which part of the request is slow.
And so usually with most of these instrumentations,

303
00:20:18,480 --> 00:20:22,920
from the application side, you
will know that the query is the

304
00:20:22,960 --> 00:20:25,079
slow part, right? So
I would expect, let's say,

305
00:20:25,279 --> 00:20:27,240
with a request that takes ten seconds, we would see that you know,

306
00:20:27,279 --> 00:20:30,000
out of these ten seconds, nine
seconds are spent in the database. Great,

307
00:20:30,039 --> 00:20:33,440
right, So this is this is
actually just something that you will get

308
00:20:33,480 --> 00:20:37,759
most of the time. Now, the
hard part is understanding: is this something I

309
00:20:37,759 --> 00:20:40,720
could do something about, right,
because it essentially will say, here's a

310
00:20:40,720 --> 00:20:42,519
SQL query and here is, you know, this nine-second span, but it

311
00:20:42,519 --> 00:20:48,319
doesn't tell you anything more about it. And so what I've found quite interesting

312
00:20:48,759 --> 00:20:52,799
we've kind of launched
this feature also last year, is we

313
00:20:52,160 --> 00:20:59,400
essentially use the fact that OpenTelemetry
allows you to connect essentially different services.

314
00:20:59,440 --> 00:21:02,680
Right. So the idea in a
microservice architecture is you can have different

315
00:21:02,960 --> 00:21:07,799
systems send into the same trace, essentially. And so imagine if your database

316
00:21:07,839 --> 00:21:11,119
could tell you, hey, you
know these nine seconds of time, Actually

317
00:21:11,160 --> 00:21:15,119
I spent five seconds scanning this index, four seconds joining this data, right,

318
00:21:15,319 --> 00:21:18,480
Suddenly your trace becomes much more usable, right because you're not looking at

319
00:21:18,519 --> 00:21:22,079
this you know, nine second time
span. You're actually looking at individual operations

320
00:21:22,079 --> 00:21:25,480
that you maybe understand better, right
because you're doing a big query on this

321
00:21:25,519 --> 00:21:30,240
table. And so what we've actually
done essentially is piggyback on this idea of

322
00:21:30,279 --> 00:21:33,440
different services sending into the you know, same system, the same observability system.

323
00:21:33,759 --> 00:21:38,000
And so we've built an integration where
we pull the kind of slow

324
00:21:38,079 --> 00:21:42,519
query log or auto_explain log in
Postgres, which essentially says, here is

325
00:21:42,559 --> 00:21:45,319
you know, a slow query execution,
and here's the plan for that execution,

326
00:21:45,640 --> 00:21:48,680
and so we pull that as part
of our kind of agent, and then

327
00:21:48,720 --> 00:21:53,839
we send essentially a reference to
that information into the tracing system. Right.

328
00:21:55,319 --> 00:21:56,920
And so what this solves is what
you would do otherwise. Right.

329
00:21:56,960 --> 00:22:00,279
So let's imagine if you don't have
this type of solution, is you look

330
00:22:00,279 --> 00:22:03,200
at your trace and you have that
nine second span, and then you have

331
00:22:03,279 --> 00:22:07,359
to correlate that with the database activity. Right, So you would essentially either

332
00:22:07,359 --> 00:22:11,359
look at your database logs or, like,
in Postgres, a tool like pg_stat_statements, which

333
00:22:11,400 --> 00:22:14,519
tracks the query performance over time.
And so what you would do is you

334
00:22:14,640 --> 00:22:17,400
essentially say, well, I know
roughly the query shape, right, formatting

335
00:22:17,480 --> 00:22:19,359
sometimes differs slightly, but maybe you
know, I'll like copy and paste a

336
00:22:19,359 --> 00:22:22,000
part of the query. I'll go
over to the database and I'll kind of

337
00:22:22,240 --> 00:22:26,039
put it in and I'll try to
find the thing that's matching that looks like

338
00:22:26,079 --> 00:22:29,599
the right query. But it's really
hard to do that precisely. And so

339
00:22:29,680 --> 00:22:33,720
this is, I think, where OpenTelemetry
really has an interesting way of solving this,

340
00:22:33,880 --> 00:22:37,599
right, because, just to expand on that briefly, what essentially it does is, in

341
00:22:37,680 --> 00:22:41,200
tracing, you have these IDs, right,
that you're trying to propagate across systems.

342
00:22:41,759 --> 00:22:45,599
And so the idea is when you
run a SQL query, you add a

343
00:22:45,640 --> 00:22:48,799
comment, a comment that says
what's the trace ID and the parent span

344
00:22:48,839 --> 00:22:52,240
ID of that query is. And
so when the query arrives on the database

345
00:22:52,279 --> 00:22:56,359
side, the database can output a
log for example, that says, hey,

346
00:22:56,400 --> 00:22:57,759
you know, this query was slow. Here's the query plan. And

347
00:22:57,799 --> 00:23:02,200
also, by the way, here's
the little comment at the start that tells

348
00:23:02,240 --> 00:23:04,960
you, that tells the tracing system essentially
how to kind of stitch that back together

349
00:23:06,039 --> 00:23:11,319
into one unified view of the world, right, Yeah, And that's the

350
00:23:11,480 --> 00:23:15,880
OpenTelemetry has just made some huge
strides in tying all of that together.
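[Editor's sketch] The query-comment idea described just before this point can be illustrated roughly as follows. This is a minimal sketch under assumptions, not the implementation used by pganalyze or by any tracing library: the helper names `tag_query` and `extract_trace_context` are hypothetical, and the comment layout is only modeled on the W3C traceparent shape (version, trace ID, parent span ID, flags). Placing the comment at the start follows the description in the conversation; other tools may place it elsewhere.

```python
import re

# Assumed comment layout, modeled on W3C traceparent:
#   /*traceparent='00-<32 hex trace id>-<16 hex span id>-<2 hex flags>'*/
TRACEPARENT_RE = re.compile(
    r"/\*\s*traceparent='00-([0-9a-f]{32})-([0-9a-f]{16})-[0-9a-f]{2}'\s*\*/"
)


def tag_query(sql: str, trace_id: str, span_id: str) -> str:
    """Application side (hypothetical helper): prefix the query with a
    comment carrying the trace ID and the parent span ID."""
    comment = f"/*traceparent='00-{trace_id}-{span_id}-01'*/"
    return f"{comment} {sql}"


def extract_trace_context(logged_sql: str):
    """Log-processing side (hypothetical helper): recover
    (trace_id, parent_span_id) from a logged query, or None if untagged."""
    match = TRACEPARENT_RE.search(logged_sql)
    return (match.group(1), match.group(2)) if match else None


tagged = tag_query(
    "SELECT * FROM orders WHERE customer_id = 42",
    trace_id="0af7651916cd43dd8448eb211c80319c",
    span_id="b7ad6b7169203331",
)
print(tagged)
print(extract_trace_context(tagged))
```

Because the IDs travel inside the query text, whatever reads the slow query log can attach the plan to the right trace without guessing from the query shape.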

351
00:23:17,000 --> 00:23:22,640
We had Andy Grabner from Dynatrace
on the show yesterday and we ended up

352
00:23:22,680 --> 00:23:30,359
talking about OpenTelemetry there as well, and just the, like, the whole,

353
00:23:33,359 --> 00:23:37,920
you know, the whole community effort
from all of these different players to agree

354
00:23:37,079 --> 00:23:41,880
like, hey, we can all
use this standard and create this system that

355
00:23:41,960 --> 00:23:45,440
allows you to tie those kinds of
things together. It's pretty cool and something

356
00:23:45,440 --> 00:23:49,160
that I don't think we we could
have seen in our industry up until the

357
00:23:49,240 --> 00:23:53,359
last few years. Yeah, for
sure. And I think OpenTelemetry is

358
00:23:53,400 --> 00:23:57,559
interesting because it has that like cross
company collaboration, right? So, like,

359
00:23:57,599 --> 00:24:02,880
Dynatrace is involved in the standard,
there's Microsoft, there are, like, big players involved.

360
00:24:02,920 --> 00:24:06,920
Like, it just seems like, I'm positively surprised, essentially, that they

361
00:24:06,960 --> 00:24:08,000
were all able to come to the
table, and like some of them are

362
00:24:08,039 --> 00:24:11,960
vendors, some of them are you
know, big customers of observability data,

363
00:24:11,200 --> 00:24:14,519
and even if some of them are
selling their own solutions, right, they

364
00:24:14,519 --> 00:24:17,359
are essentially still at the table discussing, hey, how can we make the standard

365
00:24:17,359 --> 00:24:18,960
work? Yeah. Which as an
end user is actually great, right because

366
00:24:19,000 --> 00:24:23,000
you kind of go out of this
vendor specific instrumentation in your code and you're

367
00:24:23,039 --> 00:24:27,000
moving more towards the standardized instrumentation and
you can choose different vendors. So I

368
00:24:27,039 --> 00:24:30,359
think that's how things should be. Yeah. Yeah, And then the

369
00:24:32,559 --> 00:24:36,279
skeptical side of me sees that and
it's like, hmm, okay, what's

370
00:24:36,319 --> 00:24:41,319
the catch? I mean, I
think the catch is that sometimes the

371
00:24:41,319 --> 00:24:44,240
story is a little bit oversold,
right, So I've spent a lot of

372
00:24:44,319 --> 00:24:47,359
time in some of, you know,
some of the maybe more niche use cases,

373
00:24:47,440 --> 00:24:49,240
right, and the whole thing
I've just talked about in

374
00:24:49,359 --> 00:24:52,720
terms of commenting and such like.
For example, there's a project called

375
00:24:52,720 --> 00:24:56,440
sqlcommenter, which is supposed to add these
query tags. In the beginning, that's

376
00:24:56,440 --> 00:25:02,200
a project that Google actually donated to
the OpenTelemetry project. And it's great

377
00:25:02,240 --> 00:25:04,119
they've donated it, but also it's
kind of stalled since then, so it's

378
00:25:04,119 --> 00:25:07,519
not like they've actually pushed forward and
really said, hey, you know,

379
00:25:07,559 --> 00:25:08,640
how do we make this, you
know, a proper standard, how do

380
00:25:08,640 --> 00:25:11,960
we document this as part of the
official project? And so sometimes I think

381
00:25:12,440 --> 00:25:18,359
the risk essentially of these standardization projects
is that sometimes you know, there is

382
00:25:18,400 --> 00:25:19,960
like this, you know, alpha spec
to do something, or like this thing

383
00:25:21,000 --> 00:25:23,359
that was contributed by one of the
players, but there's not enough momentum around

384
00:25:23,599 --> 00:25:27,119
actually developing the standard further. And
so I think, for example, if

385
00:25:27,119 --> 00:25:30,200
I was doing tracing right, definitely
do the OpenTelemetry tracing stuff. But

386
00:25:30,240 --> 00:25:33,880
if you're doing logs even right,
like logs are like they're pretty stable in

387
00:25:33,920 --> 00:25:37,839
OpenTelemetry, as I understand it, but they're still less, you know, kind of commonly,

388
00:25:37,960 --> 00:25:42,759
kind of fully supported across all different
languages and whatnot. Right, Yeah,

389
00:25:44,000 --> 00:25:52,720
so you've got a ton of experience
with Postgres. There's quite a

390
00:25:52,759 --> 00:25:56,480
few database choices available. I mean
really in my mind, there's like there's

391
00:25:56,720 --> 00:26:04,000
Postgres, MySQL, and Mongo. I consider those to be like the

392
00:26:04,000 --> 00:26:08,200
big players. And there's different variants
of that, you know, and you

393
00:26:08,240 --> 00:26:11,599
know when you get off into the
NoSQL stuff, you know, there's

394
00:26:11,599 --> 00:26:15,079
different use cases there. But I
really see it as being like those three

395
00:26:15,400 --> 00:26:21,440
as being the most prevalent. What
are the things? And I've used Postgres

396
00:26:21,480 --> 00:26:25,400
a ton because it just seems to
fit any use case you throw at it.

397
00:26:25,440 --> 00:26:30,599
But what's your
take on why Postgres versus some of

398
00:26:30,640 --> 00:26:34,839
the other database offerings. Yeah,
and I would add, like, just

399
00:26:34,839 --> 00:26:37,160
just to add a fourth to the
list, I would say these days I'd

400
00:26:37,200 --> 00:26:41,039
add ClickHouse to the list as
well, in terms of open source databases.

401
00:26:41,039 --> 00:26:45,640
It's not relational, like, it's
a column store, essentially. But I've definitely

402
00:26:45,640 --> 00:26:48,960
seen a lot of companies use Postgres
and ClickHouse for different workloads, different parts

403
00:26:48,960 --> 00:26:56,079
of workloads. But yeah, I
would say, you know, so

404
00:26:56,119 --> 00:27:00,000
I think that there are different
reasons why you'd use Postgres, why you'd use

405
00:27:00,039 --> 00:27:03,440
other ones. One of the things,
we just talked about OpenTelemetry and

406
00:27:03,440 --> 00:27:06,400
different companies coming together, right,
and from a community perspective, the

407
00:27:06,480 --> 00:27:08,400
one thing that keeps me in the
Postgres community, right, which is not

408
00:27:08,519 --> 00:27:11,359
exactly answering your question, but I
think it is an interesting aspect of this

409
00:27:11,720 --> 00:27:15,400
is Postgres is not a project by
one company, right? Postgres is

410
00:27:15,400 --> 00:27:18,279
a community project. Like it has
you know, people from AWS working

411
00:27:18,319 --> 00:27:22,160
on it, people from Microsoft working on it,
from EDB working on it, from Google

412
00:27:22,200 --> 00:27:25,519
working on it, people from small
companies working on it. And it is

413
00:27:25,559 --> 00:27:27,720
a true community project, kind of
like Linux currently is. And I think

414
00:27:27,759 --> 00:27:30,960
this is what fascinates me about
it and what makes me, you know,

415
00:27:32,079 --> 00:27:36,200
just the longevity of it is like
so much. Essentially I trust it

416
00:27:36,240 --> 00:27:37,920
to have that longevity even
in, you know, ten years, twenty,

417
00:27:38,039 --> 00:27:42,400
fifty years from now, just because
it's been able to kind of

418
00:27:42,480 --> 00:27:45,000
advance over the years, but it's
also been able to do that as a

419
00:27:45,000 --> 00:27:48,519
community project, not as a commercial
player trying to you know, kind of

420
00:27:49,359 --> 00:27:52,720
get both benefits essentially, right,
because if you look at MongoDB,

421
00:27:52,839 --> 00:27:55,880
like, MongoDB has, you know,
their Atlas service. And I don't

422
00:27:55,920 --> 00:27:57,920
know how it differs between you know
what MongoDB open source gets you

423
00:27:59,039 --> 00:28:00,839
versus, you know, the Atlas-based
MongoDB. But certainly, you

424
00:28:00,839 --> 00:28:03,440
know, I would imagine they have
that conflict, right They're constantly thinking what

425
00:28:03,480 --> 00:28:07,000
should I put there? What shouldn't
I? Because there are people just going

426
00:28:07,039 --> 00:28:10,160
to deploy their own, right,
and so that is really, I

427
00:28:10,160 --> 00:28:11,440
think, you know, why Postgres is a
great building block, right, because it

428
00:28:11,440 --> 00:28:17,519
doesn't have that fundamental conflict that
these companies usually have. I think when

429
00:28:17,519 --> 00:28:21,400
you're trying to make a decision which
one to use, I think if you're

430
00:28:21,480 --> 00:28:23,559
using MySQL, there's, you know, some reasons to migrate to Postgres,

431
00:28:23,559 --> 00:28:26,000
but they're usually not that strong, right? So I think when

432
00:28:26,039 --> 00:28:30,079
I see people migrating between databases, it's really more from old school, like

433
00:28:30,119 --> 00:28:33,920
Oracle or SQL Server to Postgres, right? And for example, there Postgres

434
00:28:33,960 --> 00:28:37,039
is a very good target for these
migrations because it has like a lot of

435
00:28:37,039 --> 00:28:41,400
similarities in some sense in query language
and such with Oracle, for example, and

436
00:28:41,400 --> 00:28:45,119
stuff like that. It's
easier to migrate, essentially, to Postgres than to

437
00:28:45,119 --> 00:28:48,240
MySQL. Also, if you're
using Oracle, why would you go to

438
00:28:48,240 --> 00:28:52,559
MySQL, because that's owned by Oracle again? Kind of doesn't really make

439
00:28:52,599 --> 00:29:00,079
sense. And I think MongoDB,
I don't, you know, I

440
00:29:00,079 --> 00:29:03,759
don't have much experience personally, but
I think it's really

441
00:29:03,799 --> 00:29:07,160
the kind of thing where you make
a choice. Sometimes companies, you know,

442
00:29:07,240 --> 00:29:11,440
run multiples of those, like oftentimes
I see companies just using Postgres.

443
00:29:11,720 --> 00:29:14,319
I think you can scale out all
of these, right? Like, Postgres has Citus.

444
00:29:14,559 --> 00:29:17,720
MySQL has Vitess and
PlanetScale. MongoDB

445
00:29:17,799 --> 00:29:19,559
kind of has its built-in way
of scaling out. So you

446
00:29:19,599 --> 00:29:22,400
don't really run into a
bottleneck these days anymore with not being

447
00:29:22,400 --> 00:29:27,000
able to scale beyond a certain data
point. Yeah, for sure. Yeah.

448
00:29:27,039 --> 00:29:33,880
My introduction to MongoDB was, gosh, I think it's been

449
00:29:33,920 --> 00:29:41,279
over ten years ago now, but
where we used it was with mobile applications,

450
00:29:41,920 --> 00:29:48,279
and user attribution, because you
have this application that ties into all

451
00:29:48,359 --> 00:29:52,279
these different services, you know,
like ties into Facebook and ties into Amazon

452
00:29:52,400 --> 00:29:56,519
and different things like that, and
you're trying to attribute your marketing campaigns to

453
00:29:56,640 --> 00:30:03,200
those platforms, but with each one
you get different data. And so

454
00:30:03,359 --> 00:30:07,799
Mongo for us turned out to be
a really good way to just say,

455
00:30:07,799 --> 00:30:11,759
Okay, here's the attribution data we
got for this user, and we're just

456
00:30:11,799 --> 00:30:15,880
gonna put it in this Mongo field. And then in some cases we know

457
00:30:15,039 --> 00:30:22,440
that there's one key value pair that's
consistent across all of those platforms, and

458
00:30:22,480 --> 00:30:25,559
then we're gonna keep the rest of
the stuff around because we might end up

459
00:30:25,640 --> 00:30:29,079
needing it later too, And so
that was a really strong use case for

460
00:30:29,200 --> 00:30:36,079
Mongo. Now Postgres also, though,
supports JSON data types, so you could

461
00:30:36,160 --> 00:30:38,960
just as easily have done it with
Postgres. I don't know if Postgres actually

462
00:30:40,000 --> 00:30:41,480
had it back when we did this, but I know that it is

463
00:30:41,519 --> 00:30:45,759
there now. Yeah, and
it probably had. So there's you know,

464
00:30:45,799 --> 00:30:48,039
I would say, there's three things
worth thinking about and knowing about it,

465
00:30:48,079 --> 00:30:49,880
right, So, Like the simplest
thing is you could always just store

466
00:30:49,960 --> 00:30:53,240
text, like JSON as text in your database, right? Right. But what you're

467
00:30:53,279 --> 00:30:56,599
kind of losing there is like any
sense of validation, right, Like if

468
00:30:56,599 --> 00:30:59,839
you're missing a parenthesis, the
database isn't going to tell you,

469
00:30:59,920 --> 00:31:03,920
right? And so in the simplest
form, Postgres has a JSON data type

470
00:31:03,200 --> 00:31:07,480
and JSON in Postgres is really just
a validation step right where it says,

471
00:31:07,480 --> 00:31:11,160
okay, well, let me confirm that
this is actually correct JSON, and then you

472
00:31:11,200 --> 00:31:14,000
know it can do some operations on
top of it. But really, where

473
00:31:14,160 --> 00:31:18,440
I think Postgres has become more of
a competitor to MongoDB is the JSONB

474
00:31:18,519 --> 00:31:22,119
data type. B stands for
binary, and the idea behind that is

475
00:31:22,200 --> 00:31:26,279
that it lets you, amongst other
things, index a JSONB kind of

476
00:31:26,319 --> 00:31:29,640
field, similar to how you could
use indexes with Mongo. Right. So,

477
00:31:29,720 --> 00:31:33,599
like the idea is that if you
know, I have this, like, schemaless

478
00:31:33,640 --> 00:31:34,799
data of sorts, like I don't
know exactly the shape of it or what

479
00:31:34,880 --> 00:31:37,559
comes, what gets thrown into it,
but I usually want to query for some

480
00:31:37,680 --> 00:31:40,519
of the keys, right, and
I want to search for things in that

481
00:31:40,680 --> 00:31:45,079
data, essentially, then the way
that you can do this nowadays with

482
00:31:45,119 --> 00:31:47,839
Postgres is, you have a JSONB, you have a GIN index, or in

483
00:31:47,920 --> 00:31:52,480
some cases just indexes on the
JSONB column, and you could just

484
00:31:52,559 --> 00:31:55,440
do queries on it. But you
don't have to declare up front what you're going

485
00:31:55,480 --> 00:31:56,960
to query, right? You don't
have to say, I'm always going to

486
00:31:57,079 --> 00:32:00,839
query for, you know, this campaign ID field or, you know,

487
00:32:00,960 --> 00:32:04,920
other kind of whatever the fields are. But you could actually have a kind

488
00:32:04,920 --> 00:32:08,079
of more generic index. And so
that's really I think what you know these

489
00:32:08,160 --> 00:32:12,359
days, if you're storing JSON in
Postgres, definitely use JSONB, because there's no reason

490
00:32:12,440 --> 00:32:15,440
not to, or almost no reason
not to. And so that I think,

491
00:32:15,480 --> 00:32:19,119
you know, gets you
ninety-five percent of the use cases

492
00:32:19,160 --> 00:32:22,000
that MongoDB usually would
have. Yeah, and you're still in

493
00:32:23,160 --> 00:32:30,319
a database platform that supports relational data
as well, and that's I think to

494
00:32:30,400 --> 00:32:35,720
me, that's huge because every application, every business does have relational data,

495
00:32:36,079 --> 00:32:40,160
and so now you're able to accomplish
that in a single database platform versus juggling

496
00:32:40,319 --> 00:32:45,079
multiple database connections and trying to remember
which one has which data. That's right.
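[Editor's sketch] To make the "generic index" idea from this exchange concrete, here is a toy model in plain Python. It is not how Postgres implements GIN; it only illustrates the two properties discussed: documents are validated as JSON on the way in, and any top-level key/value pair can be looked up without declaring the queried fields up front. In Postgres itself the rough equivalent would be a GIN index created with `CREATE INDEX ... USING gin (...)` on a JSONB column, queried with the `@>` containment operator. The class name `ToyJsonIndex` and the sample field names are assumptions for illustration.

```python
import json
from collections import defaultdict


class ToyJsonIndex:
    """Toy stand-in for a generic index over schemaless documents."""

    def __init__(self):
        self.rows = {}                    # row id -> parsed document
        self.postings = defaultdict(set)  # (key, value) -> set of row ids

    def insert(self, row_id, raw_json):
        doc = json.loads(raw_json)        # malformed JSON is rejected here
        self.rows[row_id] = doc
        for key, value in doc.items():
            # Only top-level string/int values are indexed in this sketch.
            if isinstance(value, (str, int)):
                self.postings[(key, value)].add(row_id)

    def containing(self, **pairs):
        """Row ids whose document contains every given key/value pair."""
        sets = [self.postings.get(item, set()) for item in pairs.items()]
        return sorted(set.intersection(*sets)) if sets else sorted(self.rows)


idx = ToyJsonIndex()
idx.insert(1, '{"campaign_id": "spring", "platform": "facebook", "clicks": 3}')
idx.insert(2, '{"campaign_id": "spring", "platform": "amazon"}')
idx.insert(3, '{"campaign_id": "fall", "platform": "facebook", "referrer": "ad"}')

print(idx.containing(campaign_id="spring"))                       # [1, 2]
print(idx.containing(campaign_id="spring", platform="facebook"))  # [1]
```

Each platform can send differently shaped attribution data, and the same lookup still works for whichever keys happen to be present.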

497
00:32:45,160 --> 00:32:49,559
Yeah, And on that note,
more recently, also, talking about AI,

498
00:32:50,799 --> 00:32:54,839
in the Postgres community,
somebody called Andrew Kane created

499
00:32:54,880 --> 00:33:00,839
a product called pgvector. Excuse
me, pgvector. And what pgvector essentially does is

500
00:33:00,880 --> 00:33:04,480
it stores vector embeddings inside Postgres so
that you don't have to use a dedicated

501
00:33:04,559 --> 00:33:07,680
vector database instead. It's essentially the
same idea, right, Like you have

502
00:33:07,119 --> 00:33:13,559
essentially a special like column like data
type that has special indexes, and then

503
00:33:13,640 --> 00:33:15,920
you can store you know, like
if you're trying to you know, build

504
00:33:15,920 --> 00:33:20,119
all these AI applications, instead of
now using specialized databases, you can just

505
00:33:20,240 --> 00:33:22,319
use your Postgres. And really the big
benefit that people are seeing these days,

506
00:33:22,359 --> 00:33:25,079
right is that they can then also
keep the rest of the data in Postgres.

507
00:33:25,119 --> 00:33:28,960
Right, So you're building your cool, fancy AI startup, you can

508
00:33:29,039 --> 00:33:31,799
now, you know, use Postgres for
everything up to a certain limit. Right.
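[Editor's sketch] The benefit described here, keeping embeddings next to ordinary relational columns, can be pictured with a small pure-Python model. It is a brute-force illustration only, not pgvector's implementation: in Postgres with pgvector, a nearest-neighbor query is typically written by ordering on a distance operator such as `<->` (Euclidean distance) with a `LIMIT`, optionally combined with a regular `WHERE` filter. The function names, the sample rows, and the three-dimensional embeddings below are made up for the example.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(rows, query, k, customer_id=None):
    """Return the k rows closest to `query`, optionally filtered by an
    ordinary relational column first (exact, brute-force search)."""
    candidates = [
        row for row in rows
        if customer_id is None or row["customer_id"] == customer_id
    ]
    return sorted(candidates, key=lambda r: l2_distance(r["embedding"], query))[:k]


# Relational columns and the embedding live in the same row.
rows = [
    {"id": 1, "customer_id": 7, "title": "reset password", "embedding": [0.9, 0.1, 0.0]},
    {"id": 2, "customer_id": 7, "title": "billing cycle", "embedding": [0.1, 0.8, 0.1]},
    {"id": 3, "customer_id": 9, "title": "forgot login", "embedding": [0.8, 0.2, 0.1]},
]
query = [1.0, 0.0, 0.0]

print([r["id"] for r in nearest(rows, query, k=2)])                 # [1, 3]
print([r["id"] for r in nearest(rows, query, k=2, customer_id=7)])  # [1, 2]
```

The second call is the point of the single-system argument: the similarity search and the customer filter run against the same rows, with no second datastore to keep in sync.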

509
00:33:31,839 --> 00:33:35,640
So there are still benefits to using
a specialized database, but especially when

510
00:33:35,640 --> 00:33:38,440
you're starting out, it's just so
much simpler to stay within one system and

511
00:33:38,519 --> 00:33:43,279
then later on, you know,
you kind of scale down. And I

512
00:33:43,319 --> 00:33:45,000
do want to mention at that point, right, since we talked about

513
00:33:45,039 --> 00:33:50,079
different database technologies: one of the
reasons why Postgres is very good at,

514
00:33:50,160 --> 00:33:53,920
you know, kind of supporting these
newer things like the embeddings is because it

515
00:33:53,960 --> 00:33:58,279
has a very good extension system.
So compared to, for example, MySQL,

516
00:33:58,519 --> 00:34:00,720
it's much easier in Postgres to create
an extension that changes some of the

517
00:34:00,799 --> 00:34:06,240
core functionality in Postgres, that hooks into
different parts. There's more of a community

518
00:34:06,279 --> 00:34:09,440
around that also that kind of releases
these extensions. And so the fact that

519
00:34:09,480 --> 00:34:12,960
you know, somebody like three years
ago Andrew Kane was like, yes,

520
00:34:13,039 --> 00:34:15,039
I, you know, built this
library called pgvector and I'll just publish

521
00:34:15,079 --> 00:34:17,719
this. You know, it's an
extension for Postgres. And then you know,

522
00:34:17,840 --> 00:34:22,519
all the big cloud companies like
AWS, GCP, Azure, they're all like,

523
00:34:22,800 --> 00:34:24,039
yes, AI, you know,
it's the best thing, and by

524
00:34:24,079 --> 00:34:27,360
the way, we support AI.
And then you look at it and it's

525
00:34:27,400 --> 00:34:30,639
just pgvector kind of, you know, sitting in the database, actually,

526
00:34:30,159 --> 00:34:32,159
like they just went and used the
extension. I mean, it's cool,

527
00:34:32,159 --> 00:34:36,760
they did it right, like they
they made it accessible and usable. But

528
00:34:37,000 --> 00:34:43,280
it's it's fascinating how you know that
extension system allows this adaptability of Postgres, essentially,

529
00:34:43,840 --> 00:34:49,000
yeah, for sure. And I
think that you know, that sets

530
00:34:49,079 --> 00:34:53,119
up the stage so that you can
empower your engineers so that they can focus

531
00:34:53,360 --> 00:34:59,679
on application performance, right, Yeah, that's right. And I think also

532
00:35:00,039 --> 00:35:02,880
not learning about a completely new system, right? Because they would have to be

533
00:35:04,719 --> 00:35:07,519
installing the driver probably, like, on the
application side, their ORM probably wouldn't support

534
00:35:07,559 --> 00:35:10,480
it, right? Like, there's all
these extra steps that they would have to

535
00:35:10,519 --> 00:35:16,679
take to use something completely different. So yeah, and you know,

536
00:35:16,800 --> 00:35:22,000
going back again to talking about the
number of technologies and fields we have to

537
00:35:22,079 --> 00:35:29,719
have expertise in. Really that's like
the overall objective for all of those is

538
00:35:29,800 --> 00:35:35,440
to create a performant app that moves
our business forward. Because most of us

539
00:35:35,599 --> 00:35:43,079
are not in the business of running
Postgres databases or building AWS infrastructure. That's

540
00:35:43,199 --> 00:35:46,239
just the means to an end for
whatever our company is actually trying to do.

541
00:35:47,199 --> 00:35:53,400
Agreed. Where do you see,
where do you see Postgres and

542
00:35:53,559 --> 00:35:58,079
pganalyze going from here? Because we're
still in the early stages of AI,

543
00:35:58,159 --> 00:36:00,840
still trying to figure out what it
means. And I think you might agree

544
00:36:00,880 --> 00:36:07,039
with me that AI is not a
job killer but a job enabler. I

545
00:36:07,079 --> 00:36:08,519
always say it's complicated, right,
I think for some people it's definitely a job

546
00:36:08,599 --> 00:36:12,559
killer. Right. So if you
are in the creative industries, and let's

547
00:36:12,559 --> 00:36:15,079
say you used to like, let's
say you're an artist, but you used

548
00:36:15,079 --> 00:36:19,000
to get money from you know,
working advertising campaigns. All turns outs,

549
00:36:19,039 --> 00:36:21,440
you know, Open the Eye just
released Sora, which you know makes your

550
00:36:21,440 --> 00:36:24,519
whole video production pipeline much easier to
you know, just automate with AI.

551
00:36:24,920 --> 00:36:28,679
And so maybe you're out of a
job, right because like suddenly your creative

552
00:36:28,679 --> 00:36:31,960
industry job is just no longer paid. Like, you can still keep being an artist,

553
00:36:32,079 --> 00:36:36,559
right, there's just no money to
be made. So I would disagree

554
00:36:36,639 --> 00:36:38,079
saying, you know, it's uh, it's definitely taking jobs, right,

555
00:36:38,119 --> 00:36:45,000
Like that is very clear. I
think it's that's in a sense how when

556
00:36:45,079 --> 00:36:46,159
change is happening, that happens,
but it's also shitty, right, like

557
00:36:46,239 --> 00:36:50,199
it causes all kinds of problems.
So yeah, I think you have to

558
00:36:50,239 --> 00:36:53,199
accept that too. But I think
when I think about, you know,

559
00:36:53,280 --> 00:36:58,079
more personally, in terms of what
I see right like happening in my like

560
00:36:58,719 --> 00:37:00,760
niches of the world, right,
which is, like, engineering and database optimization.

561
00:37:01,119 --> 00:37:04,639
I think, as mentioned, for database optimization, I'm not sure how much this

562
00:37:04,760 --> 00:37:07,320
is a generative problem, right,
So there to me, the natural language

563
00:37:07,360 --> 00:37:13,440
aspect of it is more a maybe
I you know, there's there's a semi

564
00:37:13,480 --> 00:37:16,000
automated system that can solve some of
these problems for me. But instead of

565
00:37:16,119 --> 00:37:20,239
me having to interact with the user
interface where I click around or like maybe

566
00:37:20,239 --> 00:37:22,519
I should write some code, instead
of that, I can just talk to

567
00:37:22,159 --> 00:37:25,719
my quasi you know, admin of
sorts and it just happens to be,

568
00:37:27,039 --> 00:37:30,119
you know, to be a large
language model type interface. But behind the

569
00:37:30,159 --> 00:37:34,960
scenes, what it's actually doing is something
much more deterministic essentially, right, Like

570
00:37:35,039 --> 00:37:40,159
it's driving another system that really makes
the smartness essentially work. I think,

571
00:37:40,480 --> 00:37:44,559
you know, in the context of
engineering obviously, you know, GitHub

572
00:37:44,599 --> 00:37:49,400
Copilot, for example, I think
does have some interesting aspects to it,

573
00:37:49,519 --> 00:37:52,599
right, and so I have not
seen it fully you know, solved,

574
00:37:52,719 --> 00:37:54,960
Like I would wish it could write
my tests for me, right? Like,

575
00:37:55,159 --> 00:37:58,639
when I write code, I hate
writing tests, Like it's just like I'm

576
00:37:58,719 --> 00:38:00,920
lazy, and so what if I
could just have Copilot write it? But

577
00:38:01,000 --> 00:38:05,159
then the problem is intent, right
because like it doesn't replace the intent,

578
00:38:05,280 --> 00:38:07,960
like you have to still tell it
do this, do that, create that,

579
00:38:07,480 --> 00:38:10,480
But it does kind of take a
little bit of the activation energy away,

580
00:38:10,519 --> 00:38:14,280
right? Sometimes when you have
a tedious task, where there's a lot

581
00:38:14,320 --> 00:38:16,280
of things to do. It's just
hard to get started. And so what

582
00:38:16,320 --> 00:38:19,599
I've heard from at least other people, and I have seen it a little

583
00:38:19,599 --> 00:38:22,280
bit myself, is that that's
today already a value that you know,

584
00:38:22,360 --> 00:38:25,320
something like Copilot can deliver. And
so I think if we you know,

585
00:38:25,400 --> 00:38:30,920
project forward, right, I don't
think the I don't think the thinking goes

586
00:38:30,960 --> 00:38:32,360
away, right, I don't think
like none of the large language models are

587
00:38:32,400 --> 00:38:37,079
thinking, right, They're not reasoning, they're just like generating. And so

588
00:38:37,320 --> 00:38:44,280
I think really it's a question of
how do we, like it's a question

589
00:38:44,360 --> 00:38:45,360
of like how do we drive these
systems? Right, Like, how do

590
00:38:45,480 --> 00:38:52,480
we as operators as engineers drive a
system in a way that makes predictable outcomes

591
00:38:52,960 --> 00:38:57,679
but automates the things that are you
know, either hard to do, or just,

592
00:38:57,960 --> 00:39:00,840
like, take a lot of time,
or, you know, take a lot

593
00:39:00,840 --> 00:39:02,320
of expertise, right, So I
think there's a lot of opportunities there.

594
00:39:02,599 --> 00:39:07,719
But I don't think the human goes
away that drives essentially this is how the

595
00:39:07,760 --> 00:39:13,320
system should work. For sure. Gotcha. So it's a way to provide

596
00:39:13,360 --> 00:39:20,039
the technical expertise, but you still
have to apply, like, the, is this

597
00:39:20,199 --> 00:39:23,320
a good idea, yeah, or
like the bounds of it, right,

598
00:39:23,599 --> 00:39:27,559
Like, maybe it gives you suggestions.
Like, it's kind of like what we do with

599
00:39:27,599 --> 00:39:29,559
indexing, right? So, like,
today, if I look at our solution for

600
00:39:29,639 --> 00:39:30,599
indexing, it's okay,
like, it could definitely be better, right?

601
00:39:30,639 --> 00:39:32,960
Like, it gives you a reasonably
good recommendation. But I think over the

602
00:39:34,079 --> 00:39:36,280
years, what I'll see, you know,
us do, right, is that at

603
00:39:36,320 --> 00:39:39,000
pganalyze, our index recommendations will actually
get to the point where ninety percent of

604
00:39:39,039 --> 00:39:42,960
the time they are just good, right? Like, they're something that an

605
00:39:43,000 --> 00:39:45,960
expert would give you as well,
and, you know, you can just apply them.

606
00:39:45,239 --> 00:39:50,920
But still, I think the decision of
when to apply them, how often

607
00:39:50,960 --> 00:39:52,320
to apply them, you know,
which level of testing to do, that

608
00:39:52,480 --> 00:39:59,000
is something that a human operator should
essentially take into account and decide, because I

609
00:39:59,079 --> 00:40:01,519
think otherwise, you know, you don't really have

610
00:40:01,599 --> 00:40:04,840
control over the system, I guess, right? Like, you'll have, you know,

611
00:40:04,920 --> 00:40:07,480
things be unnecessarily expensive when they could
be cheaper, or you'll maybe have, you know,

612
00:40:07,599 --> 00:40:10,639
things be slow because the system optimized
for the wrong thing. And so

613
00:40:10,679 --> 00:40:15,679
I think that there's this level of
control that I anticipate us needing,

614
00:40:15,960 --> 00:40:19,079
you know, ever more so in
the next couple of years. Yeah,

615
00:40:19,119 --> 00:40:22,119
it makes me think of the Jurassic
Park meme with, I think it's Jeff Goldblum,

616
00:40:22,119 --> 00:40:25,800
where it's like, you were so focused
on whether or not you could, you

617
00:40:25,920 --> 00:40:30,079
never stopped to think whether or not
you should. Yes, yes, that's

618
00:40:30,119 --> 00:40:34,360
definitely something I think of when I
see some of these things AI does.

619
00:40:35,280 --> 00:40:45,760
Right. So what are the, what
are the big problems, beyond index and

620
00:40:45,920 --> 00:40:51,199
query tuning, what are some of
the big problems you see with Postgres that

621
00:40:52,719 --> 00:40:59,360
fall into the "I wish I'd known
that sooner" category? Yeah, good question.

622
00:41:01,920 --> 00:41:08,320
I think it's I mean, there
are some things like most the other

623
00:41:08,360 --> 00:41:12,519
about this. So if you have
a lot of data, like terabytes of

624
00:41:12,599 --> 00:41:15,000
data, you can definitely store it
in Postgres, right? So, like,

625
00:41:15,199 --> 00:41:17,840
Postgres doesn't really have a limit per
se. I think the one thing that

626
00:41:17,920 --> 00:41:22,719
I have definitely seen is, if
you anticipate, you know, scaling hugely

627
00:41:22,800 --> 00:41:25,039
to the point that you know you'll
have one hundred terabytes of data, it

628
00:41:25,159 --> 00:41:28,960
really does help to think a little
bit about how you may be able to

629
00:41:29,000 --> 00:41:31,719
split up your data potentially or at
least how it should be structured in the database.

630
00:41:32,519 --> 00:41:37,519
So I'll give you two different examples
of, essentially, data modeling of some sort

631
00:41:37,960 --> 00:41:40,760
is what I'm going for. So
one example is back when I was at Citus

632
00:41:40,800 --> 00:41:45,239
Data, right? Citus is
a sharding system, so you have different

633
00:41:45,400 --> 00:41:49,360
database servers essentially, and you scale
your workload by adding servers, which generally

634
00:41:49,440 --> 00:41:52,960
is a very good pattern. Now,
the problem is if you're looking for,

635
00:41:52,400 --> 00:41:54,679
you know, a record, right? So let's say we have users.

636
00:41:54,800 --> 00:41:57,960
Well, let's see, yeah,
let's say we have, just, we

637
00:41:58,039 --> 00:42:01,440
have billions of users, right,
we have all these servers, right,

638
00:42:01,519 --> 00:42:05,840
and like a certain subset of our
users are in each server, and so

639
00:42:05,960 --> 00:42:08,760
if you're looking for a particular, let's
say, email, what we'd have to do,

640
00:42:08,960 --> 00:42:12,599
right, is we essentially have to run
a query across all these servers,

641
00:42:12,719 --> 00:42:15,559
right, which, like, the
more servers you have, the more complicated

642
00:42:15,639 --> 00:42:16,599
that becomes. And you can parallelize
a lot of that, right? But

643
00:42:16,679 --> 00:42:21,119
like, essentially, there may be a problem with, you
know, too many connections, and so

644
00:42:21,599 --> 00:42:24,400
like generally speaking, it's not good
if you have a lot of queries that

645
00:42:24,519 --> 00:42:28,800
go across all the servers. And
so when you have this type of scaling

646
00:42:28,840 --> 00:42:34,000
out anticipation, one thing that I
always do these days is I make sure

647
00:42:34,440 --> 00:42:37,480
that I think about how could my
data potentially be sharded in the future,

648
00:42:37,559 --> 00:42:42,639
right, split up like that?
And oftentimes, let's say you build a

649
00:42:42,639 --> 00:42:45,719
sales tool, right, and a
sales tool has, like, B2B customers,

650
00:42:45,800 --> 00:42:47,239
right, so you have all these
customer IDs, right, And so

651
00:42:49,239 --> 00:42:52,039
in the most naive implementation, you
have, you know, maybe a customers

652
00:42:52,079 --> 00:42:54,079
table, and then you have a
CRM records table and that has a customer

653
00:42:54,199 --> 00:42:57,960
ID. But then maybe each record
has a comment, and so the comment just

654
00:42:58,119 --> 00:43:00,920
you know, includes the record ID, but not the actual customer ID, because

655
00:43:00,920 --> 00:43:02,559
you're like, that seems duplicated, right? Why would I copy this, you

656
00:43:02,639 --> 00:43:06,199
know, same value all over the
place? And so the thing I would

657
00:43:06,199 --> 00:43:09,440
actually do if I anticipate the scale
is I would actually include my customer ID

658
00:43:09,559 --> 00:43:13,880
in this case, in all my
tables. Because that's the one thing that

659
00:43:14,079 --> 00:43:16,639
makes it a lot easier to shard
out your data is if it's very easy

660
00:43:16,719 --> 00:43:20,880
for you to say, if I
have, you know, my one-terabyte

661
00:43:20,920 --> 00:43:24,159
table, which subset of the
table belongs to each customer, and then

662
00:43:24,199 --> 00:43:27,840
when I'm going to move you know, like ten percent of my customers to

663
00:43:28,000 --> 00:43:30,320
this other server, I can
just, you know, select, you know,

664
00:43:30,440 --> 00:43:34,360
essentially by customer ID versus doing
a very complicated join that actually becomes quite

665
00:43:34,400 --> 00:43:37,960
expensive once your data set is large.
And so that's the one thing I would

666
00:43:37,000 --> 00:43:44,599
do to anticipate scaling, is to
really, you know,

667
00:43:44,679 --> 00:43:46,440
think about, you know,
what's your unit of subdivision, essentially,

668
00:43:46,480 --> 00:43:51,480
in your data? Is there a
way to, like, potentially just add that ahead

669
00:43:51,480 --> 00:43:54,159
of time so that you have a
better chance of scaling in the future, essentially

670
00:43:54,239 --> 00:44:00,880
without too much effort. And then the other
thing, again data related, that I've found recently

671
00:44:01,000 --> 00:44:05,480
is just, there are some tips and
tricks. This is really kind of Postgres

672
00:44:05,480 --> 00:44:09,079
specific. It probably applies to
MySQL too, which is, sometimes it makes sense

673
00:44:09,199 --> 00:44:13,960
to store data in a what seems
like a less than optimal format, but

674
00:44:14,000 --> 00:44:16,159
it's actually more efficient. So the
example is we store a lot of time

675
00:44:16,199 --> 00:44:20,000
series data, and we don't use
a specialized time series database. We use

676
00:44:20,039 --> 00:44:23,920
Postgres for storing Postgres time series
data, of course, right? And so

677
00:44:24,280 --> 00:44:29,000
one of the things we've recently done
is we've started using arrays to store

678
00:44:29,039 --> 00:44:31,559
some of our data. And so
imagine you have like a data point and

679
00:44:31,639 --> 00:44:35,280
you have you know, let's say
you know, there's five different values,

680
00:44:35,320 --> 00:44:37,000
they're all at the same timestamp.
And so the simplest implementation is you

681
00:44:37,079 --> 00:44:39,920
just have, you know, a timestamp column,
data point one, data point two, data

682
00:44:39,920 --> 00:44:45,079
point three, and four, all at the
same timestamp. So what we've done

683
00:44:45,119 --> 00:44:51,000
recently, and this has, like, yielded a
tremendous performance, like, well, disk space efficiency,

684
00:44:51,000 --> 00:44:54,519
but also performance benefits, is we've
essentially put more data points for the same

685
00:44:54,639 --> 00:45:00,880
customer in one row. Right,
so the row essentially becomes like an array

686
00:45:00,920 --> 00:45:02,639
of timestamps, array of data
point one, array of data point

687
00:45:02,639 --> 00:45:06,119
two, array of data point
three and such. And what that does

688
00:45:06,360 --> 00:45:09,480
is, Postgres has various mechanisms for how
it, you know, kind of will reduce

689
00:45:09,559 --> 00:45:13,519
overhead in this case, like it
will move some things to secondary

690
00:45:13,599 --> 00:45:16,360
storage in some cases,
and you avoid what's called the tuple

691
00:45:16,400 --> 00:45:20,039
header, which is, like, each
row has like a little header that takes

692
00:45:20,119 --> 00:45:23,440
extra space. And so if for
some reason you have a problem that looks

693
00:45:23,480 --> 00:45:27,599
like ours, right, then you
may want to think about arrays as

694
00:45:27,639 --> 00:45:30,159
a way to kind of build,
not column storage of sorts, but

695
00:45:30,239 --> 00:45:34,000
like, it's the kind of thing
that column storage is good at.

696
00:45:34,679 --> 00:45:37,119
But if you are in a
row-based store, like Postgres, arrays can

697
00:45:37,159 --> 00:45:40,480
be an interesting hack there to optimize
things. And then you can just add

698
00:45:40,760 --> 00:45:45,000
time series database to the list of
things that Postgres can do. Exactly,

699
00:45:45,280 --> 00:45:47,280
Yeah, I guess I should, you
know, for completeness' sake, I

700
00:45:47,320 --> 00:45:51,320
should probably say that partitioning is what
you should do first. So forget about

701
00:45:51,360 --> 00:45:52,800
the array stuff. You should partition
your tables if you haven't, right,

702
00:45:52,840 --> 00:45:54,639
So, like, that's the other thing: if you have an append-only

703
00:45:54,719 --> 00:45:58,920
workload, which time series data usually
is, just make sure you use partitions,

704
00:45:58,960 --> 00:46:02,000
because the big anti-pattern in Postgres
is if you're, like, doing inserts

705
00:46:02,000 --> 00:46:06,920
and you're doing deletes, and so
what essentially happens is that you create a lot

706
00:46:06,920 --> 00:46:08,840
of these dead rows. Versus if you
have partitions, right? Like, let's

707
00:46:08,840 --> 00:46:13,559
imagine you want to keep thirty days
of data, and so you insert each

708
00:46:13,639 --> 00:46:16,159
day's data into its own partition, and then
on the thirtieth day you drop that partition.

709
00:46:16,559 --> 00:46:21,039
And so dropping a partition, like a
table partition, is much cheaper than doing

710
00:46:21,119 --> 00:46:24,599
a DELETE statement across, you know,
millions of records in a table. Gotcha.

711
00:46:24,719 --> 00:46:28,800
And then, not only is that
more efficient to do, but it

712
00:46:28,960 --> 00:46:31,800
also saves you overhead when it comes
time to vacuum the database. Is that

713
00:46:31,880 --> 00:46:35,760
correct? That's right, exactly, because
vacuum doesn't have to do the work,

714
00:46:35,800 --> 00:46:37,679
right, because, like, yeah, there are
no dead rows, they're just gone.

715
00:46:38,119 --> 00:46:43,199
You just dropped the table, so.
Right on. Excellent. Cool. Well,

716
00:46:43,719 --> 00:46:45,719
we could continue digging in on this, but I know you've got a

717
00:46:46,480 --> 00:46:50,960
meeting coming up here, so it
feels like we're at a good stopping point and

718
00:46:51,039 --> 00:46:53,719
then we'll get you off to your
next meeting on time. But thanks for

719
00:46:53,960 --> 00:46:57,400
coming and talking about this. This has been cool, and

720
00:46:58,000 --> 00:47:00,559
it's been insightful. I learned a
lot of things about Postgres that I

721
00:47:00,599 --> 00:47:05,400
didn't know despite having used it for
a long time. Perfect. Yeah,

722
00:47:05,400 --> 00:47:08,760
And if anybody who's listening is interested
to learn even more about Postgres, I

723
00:47:09,039 --> 00:47:13,800
host a weekly video series on YouTube
called Five Minutes of Postgres. So if

724
00:47:13,840 --> 00:47:15,519
you want to get, you know,
a little, like, snippet of what's new with

725
00:47:15,559 --> 00:47:19,840
Postgres each week, feel free to
subscribe to that, and I try

726
00:47:19,880 --> 00:47:22,960
to make that as useful as I
can to the community. Right on.

727
00:47:22,119 --> 00:47:25,559
What's the name of that YouTube channel? It's just the pganalyze channel,

728
00:47:25,599 --> 00:47:29,840
but, okay, the name of the
series is Five Minutes of Postgres. Awesome.

729
00:47:30,000 --> 00:47:35,480
Right on. And then anywhere else
do you hang out online that people

730
00:47:35,519 --> 00:47:39,480
can interact with you? Yeah,
I mean, I'm on Mastodon, I'm still

731
00:47:39,519 --> 00:47:45,199
on Twitter, aka X, and LinkedIn,
so feel free to, you know, find

732
00:47:45,199 --> 00:47:47,239
me online. I'll send you a
few links you can include in the show notes.

733
00:47:47,559 --> 00:47:51,239
Perfect. But generally, you know, if you want to hear more about

734
00:47:51,239 --> 00:47:53,880
pganalyze, just go to our
website, pganalyze dot com. We host webinars

735
00:47:53,920 --> 00:47:57,960
every now and then where we talk
about, you know, things like how

736
00:47:58,000 --> 00:48:00,960
to run your Postgres database and, of
course, how pganalyze can help. But

737
00:48:00,079 --> 00:48:04,159
we try to, you know,
make that generally useful. And then, you

738
00:48:04,199 --> 00:48:07,559
know, also, I'm in a
bunch of the Postgres community spaces and Postgres

739
00:48:07,639 --> 00:48:10,280
conferences, so if you're at a
Postgres conference later this year, maybe I'll see you

740
00:48:10,360 --> 00:48:14,519
there. Awesome, right on. Well, thank you so much for joining me

741
00:48:14,599 --> 00:48:16,440
on the show. Perfect, thank
you so much for having me. All right,

742
00:48:16,679 --> 00:48:20,480
see you, see everyone else next week. Bye, everyone.
