30:00
You know, I love that music, I have to say. Welcome, everybody. Thank you, Isaac on my team, who helped select it. My name is Claire Giordano, and this is...
30:12
I am Robert Treat.
30:13
And I'm a Citus open source champion on the Postgres team at Microsoft. And you, Robert?
30:18
So I'm a former speaker at Citus Con, this year doing the co-hosting; otherwise, a long-time Postgres community advocate.
30:30
Let's dive in. We've got a welcome deck to walk you through before we go to our first speaker, our keynote speaker. So let's see if we can take a look at those slides.
30:41
I have a bunch of things to tell you about. The first is that this is the Americas livestream today, and we have an EMEA livestream coming up at nine o'clock Central European Summer Time on Wednesday. And then there are 25 new on-demand talks that were pre-recorded in the last couple of weeks, and those just dropped on YouTube and are available for your watching pleasure now. So you can find everything you want about Citus Con, these two livestreams and on-demand, at aka.ms/cituscon.
31:13
Also, I'm so nervous that I jumped right over the most important thing, which is to say that I am really excited about all the content, all the talks, all the Postgres learning. I mean, conferences are such a great way to help new people learn and spin up on Postgres, and just give back to the community.
31:32
So there's so much going on with Citus Con this year. You know what would be helpful is if we had some kind of guide or something to sort of walk us through all the stuff that we have to look at.
31:41
Yeah, well, in fact, there is a guide. It's a blog post, so check that out; the short URL is showing on the screen. There are 37 talks, and it categorizes them into different buckets so that you can figure out which talks are most useful to your work and the kinds of things that you want to do with Postgres and/or with Citus. Also, I just need to say thank you to all 40 of these amazing speakers.
32:09
Yeah, absolutely. Speaking at conferences, it's fun; it's also a lot of work. I know it was more work than I could possibly do a second time, so I'm super happy, and I say thank you to all these folks who have helped out. It's been nice working with the ones that I've gotten to talk to so far, and I'm definitely looking forward to seeing a bunch of talks.
32:30
There's a code of conduct, as you might expect, even for a virtual event like this one, and it's all the things you might expect: be respectful, be inclusive, be friendly and welcoming. If you want to see the full code of conduct, or you need to report an issue, there's a URL showing on the screen: aka.ms/cituscon-conduct. You can also get to the code of conduct from the footer of any one of the Citus Con web pages online.
32:57
If you're watching this on YouTube, there are live captions in English for the livestream events. And then for all the on-demand talks that are already published, as well as these live talks once they are published within the next couple of weeks, they're not only available with English captions, but we're going to have those captions translated into a whole bunch of different languages.
33:24
There you go. Okay, Discord: the virtual hallway track. What do we want to say, Rob?
33:28
Well, I would say, I don't know how many folks joined us for the Path to Citus Con; for the ones that did, at those events we had a great time talking on Discord. And generally, you know, in a normal, in-person conference there's always that hallway track, there's always some back channel going on where you get to talk about what is in the talks but also what is going on in general. And I think we're gonna have a good time in there today. I will certainly be watching, and I hope to talk to a bunch of you all there today.
34:00
Yeah, I'm going to be focused on the livestream during the livestream as co-host, but as soon as this is over I'll be popping into the Discord, and I can't wait to be part of that conversation. So it's aka.ms slash open source Discord, and it's the #cituscon channel once you're in there. So please, please join.
34:18
Yeah, and also, depending on where you're watching and how you're watching, that's probably the best way to ask questions to the speakers. So we'll also be watching in there to see what we can funnel to the speakers who are in the livestream today. So definitely check it out.
34:32
So here, we're going to have six talks in today's livestream, starting with our keynote speaker, Simon Willison, and then there's just a fabulous collection of people with different Postgres expertise. So that's what we're here for. But I just wanted to make sure, jumping to the EMEA livestream speakers, that you all know that there is another completely different set of talks coming. And the URL that's showing in the upper right-hand side of the screen, aka.ms slash cituscon EMEA, that's a calendar invite. So if you want to drop it on your calendar, block the time, and make sure that you're not double-booked or something, that's an easy way to do it. And these folks are great, but I know we need to get to Simon's talk, so we probably can't talk about each of these amazing things right now. Oh, and there's more.
35:22
Oh yeah, we've got a ton of talks today, and like we said, definitely worth checking out. Check out the guide, and we'll come back and talk more about this later as the day goes on.
35:34
All right, if you are posting on social, I love it when people take photos or live-tweet quotes or inspirational things that they learn from people's presentations. Use the hashtag #CitusCon on whatever social platform you're using. And then, for those of you who care about swag, there is an opportunity to win these really cool swag bags. There are 75 of them being given away per livestream. Use the URL that's showing on the screen right now, aka.ms slash cituscon swag, and the codes will get shared in the banners during each of the talks, so you'll need a code to enter. And the socks look pretty darn cool. There are also sticker packs, and you can enter for both the swag bag and the sticker packs; there are 200 sticker packs with this collection being given away per livestream. I don't know about you, but I love stickers, and I've got them all over my laptop. Rob?
36:30
You know, I have to admit, I'm one of those clean-laptop people, so I don't typically put stickers on it. But I'm starting a new thing, which is putting stickers on my luggage, so that it's easier to spot when I travel. You know, there's not as many Postgres things out in the world on people's luggage, but I think that's an easy way also to find people, when you're traveling, who might be into the things you're into.
36:50
Pretty good idea. Very retro vibe too, I like that. Okay, I need to get myself some of these stickers too.
36:59
There is going to be an attendee survey. This is the second annual Citus Con: An Event for Postgres. We're gonna want your feedback, if you're willing to give it to us, so that we can make the event even better in the future. So there's a QR code, and there's a short URL on the screen. You can also give feedback on each and every talk. And if you watch some talks today and some talks tomorrow, you can come back and fill out the survey again and again before it closes next Friday, April 28th, end of day, Anywhere on Earth time zone.
37:32
So that's important. And finally, did we talk about Discord already?
37:35
I don't think we did, but we could talk about it more in the Discord if we were on the Discord, and then we could talk about it with everybody who has joined the Discord. Definitely join the back channel. It's a nice way to not just be watching the livestream but participating in the conversation too.
37:53
So without further ado, I think that we should introduce our keynote speaker, Simon Willison.
38:01
Absolutely. Welcome, Simon.
38:04
Hi, it's great to be here.
38:05
I'm so glad you're here. Okay, so for those of you who don't know Simon, he's the co-creator of Django, and he's the creator of Datasette, which you'll hear a bit more about today. I've been following Simon for years on Twitter. Simon does a lot of his work in public, which means that people like me can learn from his successes, his failures, his learnings. I've learned so much from you, Simon, these last couple of years.
38:34
I got to tell you, I lost several hours the other day, after the first Path to Citus Con, no doubt, reading his "things I learned" blog, which is kind of like a microblog of sorts. It was fantastic, and, you know, it just sort of reminded me I should really get back to doing more blogging. It's really good stuff.
38:52
Yes, I should point out that Simon was a guest on the first episode of Path to Citus Con, which is this live audio show on Discord, kind of like a podcast but with a text chat. And yeah, we talked about working in public, which is a whole fascinating conversation in and of itself. But you're here to talk about big opportunities in small data. So without further ado, I think we should pass the floor to you. You've got the stage, Simon.
39:18
Fantastic. So yeah, today I want to talk about small data, and the reason I'm talking about that here is I feel like, as an industry, we've got big data pretty much figured out at this point. You know, you've got tools like Citus that let you horizontally scale your Postgres database. It feels like every data warehouse product out there that I care about is adding Postgres protocol support and so forth. Like, we know what to do if somebody has petabytes of data now; that's not a difficult problem for us to solve. But I'm really interested in the very other end of the scale. The last five years I've been exploring the issue of small data, where I define small as too big for Excel but small enough that it fits on a thumb drive or fits on your telephone. My telephone's got a half terabyte of space on it right now, so small data can get pretty big.
40:05
The way I've been exploring that is through an open source project that I've been working on called Datasette. Rather than talk about what Datasette is, I'm going to dive straight into a demo and show you what it is, what it can do, and how it works. This right here is the city of San Francisco's open data portal. This is a trend from the past decade which I'm very excited about, where local and national governments all around the world have been launching these data portals where they publish data about the places where people live.
40:37
Some of these things are updated on a daily basis. It's really interesting seeing quite how much information is flowing through all of these things. The problem, of course, is that just because you've published data doesn't mean that people can actually use it.
40:49
So in this case I'm going to do a demo using the city of San Francisco's city facilities data, and this is definitely small data: it's 1,700 rows, it's absolutely tiny. What I can do is copy and paste that URL, and I'm going to load it into Datasette.
41:04
Datasette is a Python web application. It supports plugins, which are similar to Postgres extensions, and I've got plugins installed here for things like loading data from an open data portal. So I'm going to fire this up here and paste in that URL I just copied, hit that button, and Datasette will go and fetch that data and extract things like the metadata around it, so what the columns are, and it'll also pull in that table of 1,700 rows. Now, a table's not particularly exciting, but one of the Datasette plugins I'm running is called datasette-cluster-map, and it looks for data with a longitude and latitude column and sticks that data on a map.
41:46
So now I can click "load all" down at the bottom, and now I've got all 1,700 facilities loaded onto this map. Already we're starting to see stories in the data, like why does the city of San Francisco have 21 buildings down here? It turns out this is a juvenile detention facility that the city owns and operates down in La Honda. If you scroll to the side you'll see that there are things like these: this is infrastructure for the Hetch Hetchy reservoir, this is a Hetch Hetchy substation, and then over here we've got another cluster of markers, which is a camp operated by the Parks and Recreation Department.
42:24
An important feature of Datasette is what I call faceting. It's essentially a GROUP BY count. So I can say, you know what, let's facet by city, and see that most of them are in San Francisco but you've got some in La Honda, some in Groveland. We're going to facet by jurisdiction. Now this is telling us which jurisdictions within the San Francisco government own the most property: Parks and Recreation at the top, the Port at 213, the airport's quite big, the school district, and so on. So just with a few clicks, by importing that data into a new tool, we're already starting to learn things about the data at a sort of aggregate level.
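A minimal sketch of the "GROUP BY count" idea behind faceting, written in Python with the standard-library sqlite3 module. The table, the sample rows and the `facet` helper are hypothetical stand-ins for illustration; this is not Datasette's own implementation.

```python
import sqlite3

# Hypothetical stand-in for the city facilities table; the rows are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facilities (name TEXT, city TEXT, jurisdiction TEXT)")
conn.executemany(
    "INSERT INTO facilities VALUES (?, ?, ?)",
    [
        ("Depot A", "San Francisco", "Port"),
        ("Pier B", "San Francisco", "Port"),
        ("Camp C", "Groveland", "Recreation and Parks"),
        ("Ranch D", "La Honda", "Juvenile Probation"),
    ],
)

# Column names cannot be passed as SQL parameters, so check them against
# a fixed allow-list before building the query string.
ALLOWED_COLUMNS = {"city", "jurisdiction"}


def facet(conn, column):
    """Return (value, count) pairs for one column, most common first."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"cannot facet by {column!r}")
    sql = (
        f"SELECT {column}, COUNT(*) AS n FROM facilities "
        f"GROUP BY {column} ORDER BY n DESC, {column}"
    )
    return conn.execute(sql).fetchall()


print(facet(conn, "city"))
print(facet(conn, "jurisdiction"))
```

Each facet is one grouped query per column, which is why it stays cheap on a table of this size.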
42:59
This is something I'm really passionate about: this idea that you can take data from an open data portal in basically any shape or size and quickly start diving into it and figuring out, okay, what are the interesting trends and patterns? What does this look like if we visualize it?
43:13
Another feature of Datasette I should demonstrate quickly is that all of this is running on top of a SQL database. So when I filter by, you know, 38 for public health and then 37 in San Francisco, that's building up a SQL WHERE clause for me. When I click "View and edit SQL" it actually shows me that SQL query and it lets me edit it. So I can say, you know what, I don't need any of this stuff, I'm going to keep the rest in there. I can hit "Run SQL" and now I'm getting back just the results of that particular query.
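One way to picture how clicked filters can turn into a SQL WHERE clause is the sketch below. It uses Python's standard sqlite3 module; the `build_query` helper, the table and the column names are assumptions made up for this illustration, not taken from the demo.

```python
import sqlite3


def build_query(table, filters):
    """Turn {column: value} filters into a parameterized SELECT statement."""
    for name in (table, *filters):
        # Keep identifiers simple so they are safe to place in the SQL text.
        if not name.isidentifier():
            raise ValueError(f"unsupported identifier: {name!r}")
    sql = f'SELECT * FROM "{table}"'
    if filters:
        # Values travel as named parameters; they are never pasted into the SQL.
        clauses = [f'"{col}" = :{col}' for col in filters]
        sql += " WHERE " + " AND ".join(clauses)
    return sql, dict(filters)


# Hypothetical sample data to run the generated query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE facilities (name TEXT, department TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO facilities VALUES (?, ?, ?)",
    [
        ("Clinic 1", "Public Health", "San Francisco"),
        ("Clinic 2", "Public Health", "Groveland"),
        ("Pier 3", "Port", "San Francisco"),
    ],
)

sql, params = build_query(
    "facilities", {"department": "Public Health", "city": "San Francisco"}
)
print(sql)
print(conn.execute(sql, params).fetchall())
```

Because the result is an ordinary SQL string plus parameters, it can be shown to the user and edited by hand, which is the step the demo performs next.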
43:46
And that's bookmarkable. I can copy and paste this URL and send it to somebody else and they will see that same query that I'm seeing. Even more importantly, you can get it back out as CSV or you can get it out as JSON. This means that Datasette can serve as an integration layer for anything that can speak JSON, which is effectively everything these days. So there's a lot of power in being able to run these arbitrary queries against a read-only database. That's safe: there's no damage that you can do with this. But it gives you a very flexible and powerful way of remixing data and exporting it out and doing interesting things with it afterwards.
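A query becomes bookmarkable when the whole SQL string lives in the URL. The sketch below shows that idea in Python. The host, the database path and the `query_url` helper are hypothetical; the `.json` and `.csv` suffixes with a `sql` query-string parameter are modelled on the export options described above and should be treated as an assumption about the URL layout. No network request is made.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://example.com/facilities"  # hypothetical host and database path


def query_url(sql, fmt=None):
    """Build a shareable URL for a query; fmt may be None, 'json' or 'csv'."""
    suffix = f".{fmt}" if fmt else ""
    query_string = urlencode({"sql": sql})
    return f"{BASE}{suffix}?{query_string}"


sql = "select city, count(*) from facilities group by city"
html_url = query_url(sql)          # page a person can open and bookmark
json_url = query_url(sql, "json")  # same query for a program to consume
csv_url = query_url(sql, "csv")    # same query as a spreadsheet export

print(html_url)
print(json_url)

# The query can always be recovered from the URL, so a shared link
# shows the recipient exactly the query that the sender ran.
recovered = parse_qs(urlsplit(json_url).query)["sql"][0]
print(recovered == sql)
```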
44:26
We'll switch back to the slides. Where's my slide control gone? Sorry about this. There we are. So the reason I got interested in this in the first place was that I started out in the realm of data journalism. I worked for the Guardian newspaper in London, and we realized that our journalists were collecting all sorts of fascinating facts about the world in order to create infographics and maps for the newspaper. Every time this happened we'd collect all of this data and then it would sit on a hard drive under somebody's desk. We decided that we'd start sharing that data with the wider world, to try and publish the data behind the stories any time we put out one of these infographics. So we started a thing called the Datablog, and it was a blog of data that we had collected to support the reporting that we were doing.
45:13
At the time we ended up using Google Sheets as the mechanism for publishing the data, because it was free, it already existed, and we didn't have to build anything custom. But it always frustrated me. I always felt like there should be a better way of publishing data, something that was open source and more powerful and more flexible. The question I asked myself was: what's the best possible way of publishing this structured data? That's where the idea for Datasette came from. Datasette was basically my attempt at answering that question about the best possible way of publishing data in a way that lets people both explore it and also do integrations and automate it and things like that.
45:52
I hinted at this a moment ago: this idea of being able to do read-only SQL queries via an API, which I find amusing because for most web applications this would be considered a SQL injection attack. This is a security hole that must be prevented at all costs. With Datasette it's a documented feature, and I get away with that because Datasette treats data as read-only, so you can't damage it. It sets a time limit on the queries, so you can't go too long with them.
46:22
And it gives you that API access so that you can do interesting things on top of it. I'll show you a problem that I solved with this just the other day. This is my blog. I've been blogging for nearly 21 years about all manner of different topics. I've got a tag cloud which goes on forever with all of the different things that I've written about.
46:44
But I've decided that it's 2023, and the cutting edge of publication right now is an email newsletter. Email newsletters are very much back again, and I wanted to start doing that, but I didn't want to have to do any additional work for it. So since I've got this data in a database, and I've got a Datasette instance that gives me the ability to query my own blog, I figured I'd try and automate the process of constructing that newsletter. So I built this utterly terrifying SQL query, which pulls from all of my different content types and arranges them together into a format like this.
47:18
Then I built this Observable notebook. If you haven't played with Observable yet, it's a really interesting tool. It's basically kind of like Jupyter notebooks in Python, but in JavaScript, and it makes it very easy to build custom applications that pull in data from different sources, remix it and reformat it in different ways. So what I did is I took that terrifying SQL query and I dropped it into the notebook right here, where's my... skipped implementation. So I dropped that in, I wrote a bunch of JavaScript to glue it together into HTML and all of that sort of stuff, and then I had it output the HTML for my newsletter.
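The talk does this step in JavaScript inside an Observable notebook and does not show the query itself. As a rough illustration of the same idea, here is a much smaller sketch in Python: one SQL query that pulls from several content types with UNION ALL, followed by code that glues the rows into HTML. The table names, columns and sample rows are all hypothetical.

```python
import html
import sqlite3

# Made-up stand-ins for two blog content types.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE entries (title TEXT, url TEXT, created TEXT);
    CREATE TABLE links (title TEXT, url TEXT, created TEXT);
    INSERT INTO entries VALUES
        ('A long post', 'https://example.com/post', '2023-04-02');
    INSERT INTO links VALUES
        ('A <link>', 'https://example.com/link', '2023-04-03');
    """
)

# One query over every content type, newest item first.
NEWSLETTER_SQL = """
SELECT 'entry' AS type, title, url, created FROM entries
UNION ALL
SELECT 'link' AS type, title, url, created FROM links
ORDER BY created DESC
"""


def render_newsletter(conn):
    """Return an HTML list with one item per row of the combined query."""
    items = []
    for kind, title, url, created in conn.execute(NEWSLETTER_SQL):
        # Escape everything that came from the database before embedding it.
        items.append(
            f'<li>[{kind}] <a href="{html.escape(url, quote=True)}">'
            f"{html.escape(title)}</a> ({created})</li>"
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(render_newsletter(conn))
```

The generated HTML is plain text, so it can be copied to the clipboard and pasted into any editor that accepts rich content.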
47:54
And actually I can click this to get rid of the things I've already sent out, so right now if I was to send a newsletter it would have this content in. I added a button that copies it to the clipboard, so now I can pop into Substack and hit paste and my newsletter is ready to send. This is also kind of a hack against Substack, because Substack don't offer an API. There's no official automated way to create content for Substack, but it turns out copy and paste is kind of the universal integration method. I've done quite a few things where something doesn't have an API, but if you fiddle around with copy and paste just enough you can actually do a lot of automation on top of it. So a sort of key hack here is to abuse Substack's ability to paste content to automate the creation of this.
48:39
And this works. Now I've got Simon Willison's Substack newsletter. You can subscribe to it. I send it out once or twice a week and it's been working out pretty well. I've been running it for a month, it's picked up a bunch of subscribers, and it's been almost no work. I probably spent a lot more time fiddling with my notebook than I did actually writing any custom content for that newsletter.
49:00
That really goes to illustrate how useful it is to have data with an API that speaks SQL. SQL is a very powerful sort of domain-specific language for remixing and combining data. While some people use things like GraphQL to give themselves an API for manipulating data in custom ways, SQL is GraphQL from the 70s, right? It's worked for a very long time. It's a very robust way of doing this sort of semi-safe custom integration, and I think there's a huge amount of power just in using SQL like that: making things bookmarkable, giving things JSON
endpoints that accept SQL there's a lot
49:39
endpoints that accept SQL there's a lot of really cool stuff that you can do
49:41
of really cool stuff that you can do
49:41
of really cool stuff that you can do with that
49:46
But crucially, Datasette is not built on top of Postgres. The database that I'm using under the hood for all of this is SQLite. It's my second favorite database after Postgres, and there are a few different reasons that I'm using SQLite for this.
50:02
SQLite claims to be one of the most widely installed pieces of software in the world, and absolutely the most widely deployed database engine. It's in every iPhone and every Android phone, and I'm pretty sure it's running on my Apple Watch and counting my steps. That's because SQLite is designed as an embedded library: it's a very small, very tightly written C library that you can drop into literally anything on any platform, and it gives you the full power of a relational database.
50:29
A crucial characteristic is that a SQLite database is just a file. It's a data.db file that sits there on your disk, and this makes them super easy to work with. You can back them up by creating a copy. You can share them by emailing them to people. Creating a new one is only as expensive as creating any file on disk, and if you've lost interest in it you can just throw it away again. So it's a very quick and agile way of working with data.
50:54
But despite the fact that it's so tiny and flexible, it's also a very stable file format. The Library of Congress lists SQLite as one of their approved archival mechanisms for structured data, and that's because the SQLite team are meticulous about not breaking backwards compatibility when they release new versions. So once your data is in SQLite, it's basically going to be safe forever as something that you can query and use in the future. As somebody who cares about newspapers, that's a really big deal.
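Since a SQLite database is a single file, the "back it up by creating a copy" point can be shown in a few lines. This is a sketch under assumptions: the `steps` table and its one row are invented for the example, and the connection is closed before copying so the file on disk is complete.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
original = os.path.join(workdir, "data.db")
backup = os.path.join(workdir, "data-backup.db")

# Create a small database file (hypothetical table and data).
conn = sqlite3.connect(original)
conn.execute("CREATE TABLE steps (day TEXT, count INTEGER)")
conn.execute("INSERT INTO steps VALUES ('2023-04-18', 8042)")
conn.commit()
conn.close()  # close first so nothing is mid-write when the file is copied

# The backup is an ordinary file copy.
shutil.copyfile(original, backup)

# The copy is a complete, independently queryable database.
copy_conn = sqlite3.connect(backup)
rows = copy_conn.execute("SELECT day, count FROM steps").fetchall()
copy_conn.close()
print(rows)

# Throwing the data away again is just deleting the files.
shutil.rmtree(workdir)
```

For a database that is being written to while the backup runs, a plain file copy is not safe; Python's `sqlite3.Connection.backup` method is the usual alternative.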
51:24
The other idea that Datasette illustrates, and the one that inspired me to use SQLite, is something that I call the baked data architectural pattern.
51:35
These days, deploying a web application has never been cheaper in terms of running it somewhere in the cloud. You've got technology like Kubernetes and Docker, and there are hosting providers like AWS Lambda, Google Cloud Run, Vercel, or Fly who will all run a stateless web application for cents a month if it's not getting much traffic. They're incredibly inexpensive.
51:58
There is a catch, and the catch is that you can't run a database in them. The reason these things are so cheap is that they're essentially read-only, stateless web applications: you don't get to write to a disk, or if you do write to a disk it doesn't get persisted anywhere. That makes them very inexpensive to run, but if you want to run a relational database you're kind of out of luck. You have to pay a lot of extra money for a hosted database somewhere.
52:22
The thing I realized is that if your data is read-only, none of this matters. In my case it's SQLite database files, so it's literally a blob of bytes on a disk. If it's read-only, the fact that you can't write back to the disk isn't actually an issue. And in the world that I care about, coming from things like journalism, a lot of the datasets that we're dealing with don't ever get updated. It's a snapshot of San Francisco city facilities that might be updated a few times a month, but it's not something that's going to accept constant writes.
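If the data really is read-only, the database file can be opened in a mode that rejects writes, which matches a platform where the disk cannot be written anyway. The following is a minimal sketch, not how any particular tool does it: the `facilities` table and its rows are invented, and the only SQLite feature relied on is the documented `mode=ro` URI parameter.

```python
import os
import shutil
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "facilities.db")

# Build step: write the data once (hypothetical table and rows).
build = sqlite3.connect(path)
build.execute("CREATE TABLE facilities (name TEXT, district INTEGER)")
build.executemany(
    "INSERT INTO facilities VALUES (?, ?)", [("Library", 3), ("Pool", 5)]
)
build.commit()
build.close()

# Serve step: open the finished file read-only via a SQLite URI filename.
readonly = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
count = readonly.execute("SELECT count(*) FROM facilities").fetchone()[0]

# Reads work as usual; any attempt to write is refused by SQLite.
try:
    readonly.execute("INSERT INTO facilities VALUES ('Park', 1)")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True

readonly.close()
shutil.rmtree(workdir)
print(count, write_rejected)
```

Splitting the work into a build step and a serve step is the point: everything that needs a writable disk happens before deployment.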
52:54
So when you've got a read-only database, there's a trick you can pull: you can effectively bundle that database up as part of the application that you're deploying onto these platforms. Now you can deploy a fully capable relational database interface that costs you a few cents a month in hosting.
53:13
A lot of these platforms have a feature called scale to zero, which means that if there's no traffic coming into your application it doesn't run, and it doesn't cost you any money at all. The application is essentially static, and they start it up when the first HTTP request comes in. With scale to zero, your costs for these low-traffic projects are effectively nothing. Many of these providers have a free tier, and it's very common for projects that get little traffic to fit entirely within that tier. So effectively we have free hosting for these applications, if you can get away with just having read-only data.
53:53
The flip side of that is that they scale up really well too. If you want to serve millions of requests a second using this baked data pattern, you can do that by running multiple copies. Spin up as many copies of the entire application as you need, with a full copy of the database in each one, stick them behind a load balancer, and you can handle as much traffic as you can throw at it. Again, for journalism, when you have things like election results days, this is a really useful ability.
54:19
I touched on this a little bit, but I want to dive a bit more into the fact that SQL plus HTTP is a fantastic integration tool. I'll show you another example of a project that I run: a website called niche-museums.com. It's the website I run for my main hobby, which is seeking out and exploring tiny museums.
54:41
The idea behind this is that if you go to a big museum, sure, it'll be interesting. But if you go to a really small museum, it doesn't matter what it's about: the chances are that the person at the desk is the person who set it up. So now you get to have a conversation with somebody who collects Pez dispensers or runs the Bigfoot Discovery Museum. In this case, this is the Misalignment Museum in San Francisco, which is an apology from AI for destroying humanity in the future. It's only open for a few more weeks, and it's worth popping in if you're in the area.
55:13
But anyway, this website is actually just Datasette. This is my Datasette web application with a custom index.html template so that it looks like a website. All of the features on it, like "use my location", search, and the Atom feed up here, are just SQL queries that are baked into Datasette.
55:30
This right here is the query for the Atom feed. If we highlight these parts here, you can see how it works: it selects things and aliases them as atom_id, atom_title, atom_updated, and atom_link. Then there's a Datasette plugin called datasette-atom which looks for those column names, and if it finds them it produces a link to an Atom feed. So this single SQL query defines the Atom feed for the website, purely by reshaping the data in SQL to that specific set of columns. That terrifying coalesce chunk at the bottom shows how in-depth you can get with this if you want to construct, in this case, the HTML for the blog entry.
56:08
And this works. This is NetNewsWire, the feed reader, subscribed to the feed of content from my Niche Museums website. There are a lot of different moving parts involved here, but fundamentally I've used a SQL query to transform data from whatever format it's stored in on that website into a format that's compatible with feed readers, and then I've got a little plugin that turns that into XML.
56:33
So, the baked data pattern. I've done a lot of exploration of this using SQLite. An open question for me is: could you do this with Postgres itself? I'm very confident that this would work. I think there's no reason at all you couldn't put together a Docker container that contains a full installation of Postgres, with data volumes and all of the stuff that Postgres needs to serve a read-only database, bundle that up with an application like a Django app on top, and deploy the entire thing in exactly the same way as I'm doing with SQLite.
57:03
This would give you all of those same benefits. You could get scale to zero, where your website costs nothing at all if nobody's visiting it. You could scale it to handle a million requests a second if you need to, just by deploying multiple copies of that same thing. So I think that's something very interesting that could be explored around this, and with many other databases as well. Elasticsearch, for example, is something that you could absolutely bake into a container with its index data and get a very powerful search feature that costs you nothing at all if nobody's using that particular application.
57:37
application I've got one last demo
57:39
I've got one last demo
57:39
I've got one last demo um this again very much fits the theme
57:41
um this again very much fits the theme
57:41
um this again very much fits the theme of small data it turns out you can run
57:44
of small data it turns out you can run
57:44
of small data it turns out you can run things like data set entirely in your
57:46
things like data set entirely in your
57:46
things like data set entirely in your browser these days thanks to webassembly
57:48
browser these days thanks to webassembly
57:48
browser these days thanks to webassembly so I'm going to switch over to My Demo
57:50
so I'm going to switch over to My Demo
57:50
so I'm going to switch over to My Demo again
57:51
again and show you a thing that I built called
57:54
and show you a thing that I built called
57:54
and show you a thing that I built called data set light could we switch to the
57:57
data set light could we switch to the
57:57
data set light could we switch to the demo feed
57:59
demo feed um oh my fault uh my my screen sharing
58:02
um oh my fault uh my my screen sharing
58:02
um oh my fault uh my my screen sharing stopped I'll just fix that now window
58:05
stopped I'll just fix that now window
58:05
stopped I'll just fix that now window uh
58:09
Here we go. OK, we should be there. So Datasette Lite is Datasette running entirely in WebAssembly. I've opened up the network pane here so you can see what it does. On the left-hand side, it's loading up Datasette and installing packages and so forth. On the right-hand side, you can see it downloaded something called pyodide.asm.js and pyodide.asm.data.
58:31
This is a full Python interpreter and the Python standard library, bundled up for WebAssembly so that it runs in the browser. So right here I have a server-side web application (Datasette is a traditional server-side app) that's running entirely client-side, and most of the functionality is working: I can facet, I can run SQL queries, and all of those kinds of things.
58:54
And I built this actually as a joke. I thought it would be amusing to show that a server-side web app can run client-side now, but I didn't think it would be useful for anything. Since I built it, though, I've found that I'm using it for all sorts of different purposes, mainly because it's 100% robust.
59:08
There is no server; this is all just HTML and JavaScript. There is nothing that can break with this, which means that I can build things on it and share them with people and not have to worry about hosting bills, or that the thing might break in 10 years' time, or anything like that.
59:24
Just yesterday, a project called RedPajama released a training data set for building ChatGPT-style language models. It's 1.2 trillion tokens.
59:37
And this is how they released it: it's a file called urls.txt with 2048 URLs in it. Each of these is like a gigabyte-large file which you can then download, and to download all of it you'd need 2.7 terabytes of disk space.
59:52
I do not have 2.7 terabytes of disk space, but I still wanted to get a feel for this data. So I wrote a little Python script that did a HEAD request against each URL to get back just the length of that file, and I turned that into a JSON array.
1:00:09
So here we go: this is saying that file there is this big, and I've got megabytes and gigabytes as well, and its top folder. This is very, very simple.
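A script like the one described could be sketched roughly as follows. This is a minimal sketch, not the script used in the talk: the function names, the output field names (`url`, `size`, `size_mb`, `size_gb`, `top_folder`), and the choice to treat the directory that directly contains each file as its "top folder" are all assumptions made for illustration. It sends one HEAD request per URL and reads the `Content-Length` header, so no file bodies are downloaded.

```python
import json
import urllib.request
from urllib.parse import urlparse


def head_content_length(url, opener=urllib.request.urlopen):
    """Return the Content-Length reported for a HEAD request, or None if absent."""
    request = urllib.request.Request(url, method="HEAD")
    with opener(request) as response:
        length = response.headers.get("Content-Length")
    return int(length) if length is not None else None


def summarize(urls, opener=urllib.request.urlopen):
    """Build one dict per URL holding its size and containing folder."""
    rows = []
    for url in urls:
        size = head_content_length(url, opener)
        parts = [p for p in urlparse(url).path.split("/") if p]
        rows.append({
            "url": url,
            "size": size,
            # Hypothetical convenience columns, rounded for readability.
            "size_mb": None if size is None else round(size / 1024 ** 2, 2),
            "size_gb": None if size is None else round(size / 1024 ** 3, 2),
            # Assumption: the "top folder" is the directory holding the file.
            "top_folder": parts[-2] if len(parts) >= 2 else None,
        })
    return rows


def to_json(rows):
    """Serialize the rows as a JSON array of objects."""
    return json.dumps(rows, indent=2)
```

To use it against a real list, read the lines of urls.txt into a Python list, pass that to `summarize`, and write the result of `to_json` to a file that can be hosted somewhere public. The `opener` parameter exists only so the network call can be swapped out when testing.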
1:00:16
What I can do now is click "Load JSON" in Datasette Lite, paste that URL in, and it'll import that JSON document and turn it into a table.
1:00:27
So now I can start answering questions like: what are the top folders? It turns out that for Wikipedia there's only one file. If I sort by size, I can see that Wikipedia is the largest file: they have a 111 gigabyte file you can download with all of English Wikipedia in it.
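The two questions asked of the table here (which folders are there, and which files are largest) map onto ordinary SQL. The sketch below uses Python's built-in `sqlite3` module rather than Datasette Lite itself, and it assumes a hypothetical `files` table with `url`, `size`, and `top_folder` columns; the table and function names are illustrative, not taken from the talk.

```python
import sqlite3


def load_rows(rows):
    """Load dicts with url, size and top_folder keys into an in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE files (url TEXT, size INTEGER, top_folder TEXT)")
    conn.executemany(
        "INSERT INTO files (url, size, top_folder) VALUES (:url, :size, :top_folder)",
        rows,
    )
    return conn


def top_folders(conn):
    """Count files and total bytes per folder, biggest folder first."""
    return conn.execute(
        "SELECT top_folder, COUNT(*) AS files, SUM(size) AS total_size "
        "FROM files GROUP BY top_folder ORDER BY total_size DESC"
    ).fetchall()


def largest_files(conn, limit=5):
    """Return the largest files, equivalent to sorting the table by size."""
    return conn.execute(
        "SELECT url, size FROM files ORDER BY size DESC LIMIT ?", (limit,)
    ).fetchall()
```

Grouping by folder is roughly what a facet does in the interface, and the second query is the "sort by size" step.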
1:00:44
But this project here took me five minutes to knock together. I've got an article about it on my blog. I literally did the whole thing, copied and pasted it into a Gist, assembled this, and then started sending people this link. So running a database in WebAssembly is actually a really interesting trick.
1:01:02
Excitingly, you can do this with Postgres as well. This right here is a project that Crunchy Data built to provide an interactive tutorial for learning SQL, and this is a full Postgres running in your browser. This was actually inspired by my work on Datasette Lite.
1:01:22
There's one catch: it's a 50 megabyte download to get this working, because they ended up having to do a full virtualized Linux operating system with Postgres running on top in order to build this. I'm very confident that with a lot of extra work you could shrink this down and just run Postgres itself.
1:01:39
And that brings me to my sort of call... where are we? Oh, could we switch back to the slides?
1:01:51
And so the reason I use SQLite for so many of my projects is that SQLite is a genuinely serverless database: it doesn't need a server. This is the original meaning of serverless, before it meant whatever it means today in the cloud. There is no server with SQLite. It's a file on disk and a C library that opens that file. Everything is done without network connections, and that's really powerful and useful, and something I'm seeing more of these days as well.
1:02:17
DuckDB has become popular over the past few years, and it's exactly the same idea as SQLite: it's a library you can install, and it gives you a columnar, analytical database, so it's like SQLite but optimized for analytical queries. It's really cool; it's a very exciting piece of software.
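The "file on disk plus a library" model is easy to see from Python, whose standard library ships a `sqlite3` module. The following is a small sketch of that idea; the `notes` table and the `write_and_read` helper are made up for illustration. Nothing here starts a process or opens a socket: connecting just opens (or creates) a file.

```python
import sqlite3


def write_and_read(db_path):
    """Create a table in the file at db_path, insert a row, then reopen and read it."""
    conn = sqlite3.connect(db_path)  # opens or creates the file; no server involved
    try:
        with conn:  # commits the transaction on success
            conn.execute(
                "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
            )
            conn.execute("INSERT INTO notes (body) VALUES (?)", ("small data",))
    finally:
        conn.close()

    # A second, independent connection sees the data because it lives in the file.
    reopened = sqlite3.connect(db_path)
    try:
        return [row[0] for row in reopened.execute("SELECT body FROM notes ORDER BY id")]
    finally:
        reopened.close()
```

Copying the resulting file to another machine is enough to move the whole database, which is part of what makes this model convenient for small data.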
1:02:35
My question is: could Postgres do the same thing? What if there was a sort of bundled version of Postgres that existed without the network stack, without the server side of it, but gave you access to the core Postgres SQL and data structures and so forth, in a way that meant that I, as a Python programmer, could pip install it?
1:03:01
I would love to be able to say "pip install postgres lib", or npm install or whatever, import it into my Python code, and then use it like I do SQLite and DuckDB today.
1:03:10
This would give me the best of all worlds. It would give me the feature I care most about from SQLite, but it would also give me Postgres's far richer and far more feature-complete version of SQL, and it would give me access to PostGIS and extensions like that as well.
1:03:23
So this is kind of my dream. If this happened, I would drop SQLite in a heartbeat and I would be able to build all of my stuff on Postgres again.
1:03:33
So, the ideas to take away. Small data is really cool and it deserves more tooling. It's worth thinking about how you can build tools for the small end as well as for the big end of the scale.
1:03:43
The baked data pattern is a way of scaling read-only data both up and down, which I think is really powerful and gives you a lot of options for interesting deployment strategies.
1:03:52
Read-only SQL APIs are a great idea. Forget about SQL injection, it's fine: if it's public data, let people SQL inject all that they like.
1:04:03
And yeah, you can run databases in the browser now. I never thought I'd see the day.
1:04:06
And if you're looking for a very complicated side project, please build me Postgres as a library. I would really appreciate that.
1:04:14
And that's all I've got for you today. I will be on Discord for the rest of the conference and very keen on talking about this. And yeah, I've got time for questions.
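The takeaway about read-only SQL APIs depends on the database being opened so that arbitrary queries cannot change it. One way to get that property with SQLite from Python is to open the file through a URI with `mode=ro`. The sketch below assumes that approach and is not a description of how any particular tool implements it; `open_read_only` and `run_query` are hypothetical helper names.

```python
import sqlite3
from pathlib import Path


def open_read_only(db_path):
    """Open an existing SQLite file so that any write attempt is rejected."""
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    return sqlite3.connect(uri, uri=True)


def run_query(conn, sql):
    """Run caller-supplied SQL and return (column names, rows)."""
    cursor = conn.execute(sql)
    columns = [d[0] for d in cursor.description] if cursor.description else []
    return columns, cursor.fetchall()
```

With a connection opened this way, a `SELECT` works as usual, while statements such as `INSERT` or `DROP TABLE` raise `sqlite3.OperationalError`. A public-facing service would still want limits on query time and result size, which this sketch leaves out.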
1:04:24
Awesome. There's a number of different things I loved about that. Somebody mentioned in the Discord the original start, looking at Datasette, and how it really opens things up to the idea of "let's go explore the data", and not "here's some pre-compiled thing that you should look at in this data". It really is about exploring it, which I think is really fantastic, so that was pretty interesting.
1:04:46
And now I feel like there's a bit of a call to arms. I wanted to ask a little bit about how you integrate this with Postgres. How much have you looked at that? I see that you've got some calls to arms in there, like "here's what you could do with more Postgres in there".
1:05:01
So I've actually built that a little bit. A couple of years ago I built this thing called Django SQL Dashboard, because I had a Django app that I wanted to do Datasette-like things against. This is a little Django application I built which works against a read-only Postgres connection and gives you some of that Datasette functionality.
1:05:18
Longer term, I would love Datasette to have pluggable backends. If Datasette could talk directly to Postgres, I think that would be phenomenal. I've shied away from doing that because then I'd have to build every feature for SQLite and for Postgres, but I'm beginning to grow a hunch that that's not as difficult as I thought it would be. The SQL dialects are very similar between the two, so it might happen.
1:05:39
Yeah, absolutely. I've had the opportunity to talk to Richard Hipp from SQLite a number of times, and he's always stated that he looks at the Postgres syntax and language as a sort of first implementation to base what he does on. So I think it would actually be pretty easy to put that all together.
1:05:57
Sounds awesome. What I like about your talk, Simon, is just that you're looking at Postgres from right next door, right? You're not actively spending all of your days working on Postgres, but you are solving similar sorts of problems with other tools. I think you and I talked about this as a "Postgres-adjacent" talk (absolutely) when we were first conceptualizing it. And I think it's great, because you see things in a way that's different from those of us who are in this circle all the time.
1:06:28
So I would actually love to chat more. I think we are... Discord [unclear]. Yeah, so there are questions on the Discord; I will be checking that out as well. But I think we actually need to move on to our next speaker. Simon, thank you so much for being here.
1:06:42
Absolutely, 100%. That was awesome, thanks very much [unclear]. Yeah, I'll be on Discord in just a few seconds.
And thank you for your Path To Citus Con episode one, which I hope people catch: a really, really interesting conversation on working in public too.
Cool, have a great day.
1:07:00
So our next speaker is going to be Jelte Fennema, and he is speaking, ironically, because of the little niche museum in Simon's talk. I love that, you know: here's the apology from AI for destroying the world. So of course up next, and this is not planned, we need to have a talk about ChatGPT and Postgres and Rust and all that. So that's fantastic. How are you doing, Jelte?
1:07:25
I'm doing well, yeah. I'm very excited.
You're joining us from the Netherlands?
I am, yes.
1:07:32
Excellent. So just for those who have not met Jelte before: he's a senior software engineer at Microsoft. You work on Citus and on Postgres, and I believe you're a maintainer on PgBouncer?
Yes, all of those.
1:07:47
So we all owe you a few beers, I imagine, for that work across the board. Excellent.
Or coffee, coffee works too.
Or a coffee, yeah.
1:07:58
Okay, well, Jelte, take us away.
1:08:03
All right. So I'm going to talk a bit about this whole thing with AI that, if you've been on the internet for the past few months, you saw happening, and that's sort of taking over the world. Everyone's like: oh, you can now do everything with that. It can create pictures, so you don't need to paint anymore. You can do anything. You don't need to Google, you can just ask AI and it knows.
1:08:30
So I'm here a bit for: how can you use this AI for Postgres? What can you do to integrate it with Postgres, and how can you use it in useful ways? That's going to be my talk.
1:08:48
So the reason I'm doing this talk is because I have one secret problem. I've been working with Postgres, and on Postgres, for quite a few years now,
1:09:01
and I still have a very difficult time writing working Postgres queries. I don't know what it is. I forget a comma, or I swap around ORDER BY and GROUP BY, and then it's the wrong order and it doesn't work at all. I forget how many parentheses I need for VALUES statements. All of those things can happen, and my queries never work first try. I don't think I'm alone, at least I hope so.
1:09:29
In the past that was always just sort of a given: okay, I'll just write the query again. But now, with the AI thing happening, I thought: maybe I can ask ChatGPT to help me. So that's what I did, and ChatGPT was like: yes, of course I can help you, what do you need? And I was like: okay, well, what do you need me to tell you for you to help me? And it wanted to know some table and column names, and the things I wanted to do, of course.
1:09:59
And then I thought: that's a bit annoying. Then I have to copy-paste my schema from my database, or from some file, into ChatGPT so it knows all the tables and columns I have. And that's another problem I have, a not-so-secret one: I don't like doing boring things, and copy-pasting schemas into ChatGPT sounds extremely boring to me. So I thought: how can I avoid this boring stuff?
1:10:30
And because Postgres has all this information, maybe I can get Postgres to communicate with ChatGPT for me, instead of doing it myself directly. That would need some functionality in Postgres to be able to do that.
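[Editor's sketch] The information ChatGPT asked for (table names, column names, and the thing you want to do) can be assembled into a prompt mechanically. The Rust below is a minimal illustration of that one step, using only the standard library. The names Column, Table and build_prompt, and the wording of the prompt, are assumptions made up for this example; they are not taken from the talk or from any extension's real code.

```rust
/// One column of a table: its name and SQL type.
/// Hypothetical helper type, invented for this sketch.
pub struct Column {
    pub name: String,
    pub sql_type: String,
}

/// One table and its columns. Also hypothetical.
pub struct Table {
    pub name: String,
    pub columns: Vec<Column>,
}

/// Render each table as `name(col type, ...)` on its own line,
/// then append the natural-language request.
pub fn build_prompt(tables: &[Table], request: &str) -> String {
    let mut prompt = String::from("Given these PostgreSQL tables:\n");
    for table in tables {
        let cols: Vec<String> = table
            .columns
            .iter()
            .map(|c| format!("{} {}", c.name, c.sql_type))
            .collect();
        prompt.push_str(&format!("{}({})\n", table.name, cols.join(", ")));
    }
    prompt.push_str("Write a SQL query that does the following: ");
    prompt.push_str(request);
    prompt
}
```

In a real extension the table list would come from the database catalog rather than being typed in by hand, which is exactly the boring step the speaker wants to avoid.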
1:10:50
And that's actually quite possible, because Postgres has extensions. It's one of the things that make Postgres quite special with respect to other relational databases. It's actually in the abstract of the original Postgres paper that one of its goals is to be extensible.
1:11:12
The way it does that is that you can define your own types and your own functions, and you do that by having two parts. You have a shared library, which contains all the native code. That can do anything, basically: like any program running on your computer, it can make web requests, it can do whatever. And then, to call that native code from a SQL query, you define some SQL functions. Those SQL functions are very simple. They say: this is the name of the SQL function, these are the things it returns and the arguments it takes, and this is the name of the native function that it will actually call internally.
1:11:57
Citus is built like that, and lots of other things are built like that. So I thought: let's write a GPT Postgres extension. How hard can it be?
1:12:07
And there's one other thing about me: I like Rust. It's this fancy, sort of new language. I mean, it's not super new anymore, but it still sort of feels new; all the things about it feel new. Everything is an expression, so even your if/else returns a value. And you can do pattern matching, which is sort of destructuring of data structures, structs and objects, in a way that just makes the code look nice. It's hard to explain if you haven't seen it; just go look at it.
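[Editor's sketch] For readers who have not seen it, here is a small standard-library-only Rust example of the two features just mentioned: if/else used as an expression, and match destructuring a data structure. The QueryResult type and the function names are invented for this illustration.

```rust
/// A tiny model of a query outcome, invented for this example.
pub enum QueryResult {
    Rows { count: usize },
    Error { code: u32, message: String },
}

/// `if`/`else` is an expression: the whole construct evaluates to a value,
/// which here becomes the function's return value.
pub fn plural(count: usize) -> &'static str {
    if count == 1 { "row" } else { "rows" }
}

/// `match` takes the enum apart and binds its fields to names in one step.
pub fn describe(result: &QueryResult) -> String {
    match result {
        // A literal inside the pattern matches only that exact value.
        QueryResult::Rows { count: 0 } => String::from("no rows"),
        QueryResult::Rows { count } => format!("{} {}", count, plural(*count)),
        QueryResult::Error { code, message } => format!("error {}: {}", code, message),
    }
}
```

The compiler checks that the match covers every variant, so adding a new variant to QueryResult later would make describe fail to compile until it is handled.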
1:12:47
And the type system is also fancy. It's like Haskell, but I can actually understand what it does. And it will give you things like not having null pointer exceptions.
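[Editor's sketch] The usual way Rust avoids null pointer exceptions is the standard library's Option type: a value that may be absent is an Option, and the caller has to handle the None case before using it. The function names and the column list below are made up for this illustration.

```rust
/// Look up a column's SQL type in a list of (name, type) pairs.
/// Returns `None` instead of a null pointer when the column is missing.
pub fn column_type<'a>(columns: &[(&'a str, &'a str)], name: &str) -> Option<&'a str> {
    columns
        .iter()
        .find(|(col, _)| *col == name)
        .map(|(_, ty)| *ty)
}

/// The caller cannot use the result without deciding what `None` means.
pub fn type_or_unknown(columns: &[(&str, &str)], name: &str) -> String {
    match column_type(columns, name) {
        Some(ty) => ty.to_string(),
        None => String::from("unknown"),
    }
}
```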
1:13:00
On top of that, it's also extremely secure: there are no memory leaks, no double frees, no data races, none of those things that you're used to from C or C++. Data races can happen in pretty much any language, but if you're doing things sort of in the correct way, they don't happen in Rust.
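[Editor's sketch] As a small illustration of the data-race point: in safe Rust, several threads can only mutate shared state through a synchronization type. The example below uses Arc and Mutex from the standard library; the function name parallel_count is invented for this sketch.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Spawn `threads` threads that each add 1 to a shared counter
/// `increments` times, then return the final total.
pub fn parallel_count(threads: usize, increments: usize) -> usize {
    // Arc shares ownership across threads; Mutex guards the mutation.
    let counter = Arc::new(Mutex::new(0usize));
    let handles: Vec<_> = (0..threads)
        .map(|_| {
            let counter = Arc::clone(&counter);
            thread::spawn(move || {
                for _ in 0..increments {
                    *counter.lock().unwrap() += 1;
                }
            })
        })
        .collect();
    // Wait for every thread to finish before reading the result.
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *counter.lock().unwrap();
    total
}
```

Handing the threads a plain mutable reference to the counter instead would be rejected at compile time.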
1:13:21
And even with all those security things, it's just as fast as C++ and C. Although it doesn't really matter for our use case, because I just want to talk to ChatGPT, and ChatGPT is probably the slow thing in this whole setup. But we can at least send questions to ChatGPT very, very fast; how long ChatGPT takes to respond is another matter.
1:13:48
And finally, not unimportant: it has a package manager. C and C++ famously don't, and managing dependencies there is really horrible. So package management in Rust is good. It's like any other package manager for most languages, but C and C++ don't have one.
1:14:11
And finally, sort of a bonus thing: it has this really cute little crab as a mascot. He's called Ferris. So bonus points for cute mascots.
1:14:24
So I've actually been playing with Rust for quite a long time. I have a fairly popular open source library called derive_more, and it automates writing boring boilerplate in Rust. So you might see some commonalities in the things I work on, where I'm trying to automate all the boring things away. That's kind of what I do.
1:14:50
I actually started with it seven years ago, and it grew way more than I expected, because apparently other developers also don't like doing boring things. So now it's actually used by more than 100,000 repositories on GitHub. That's definitely my most used library, for sure.
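[Editor's sketch] To show the kind of boilerplate meant here, the example below writes out by hand the trait implementations that a wrapper type typically needs. It uses only the standard library so it compiles on its own. The Millis type is invented for this illustration; derive_more's derive macros (for example From, Add and Display) are meant to generate code roughly like this from a single derive attribute.

```rust
use std::fmt;
use std::ops::Add;

/// A newtype wrapper around a number of milliseconds (hypothetical example type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Millis(pub u64);

// Boilerplate 1: construct the wrapper from the inner value.
impl From<u64> for Millis {
    fn from(value: u64) -> Self {
        Millis(value)
    }
}

// Boilerplate 2: add two wrappers by adding their inner values.
impl Add for Millis {
    type Output = Millis;

    fn add(self, other: Millis) -> Millis {
        Millis(self.0 + other.0)
    }
}

// Boilerplate 3: print the wrapper the same way as the inner value.
impl fmt::Display for Millis {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}", self.0)
    }
}
```

None of these three implementations contains a real decision, which is why they are good candidates for being generated.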
1:15:13
So I never used Rust professionally. I started doing this at the end of university, and then I sort of did it outside of work; I tinkered a bit with it. That's something I kind of want to change. So how do I combine Postgres and Rust, and maybe even Citus? Because I'm working on Citus much of the time, so I could even use Citus to make it more relatable to work, and get Microsoft to pay you for doing stuff with Rust.
1:15:49
And actually, I'm not the only one who wants to combine Postgres and Rust. There's this great pgx library, and I just learned today that in the last 24 hours they changed their name to pgrx. I guess the "r" stands for Rust. So anything that says pgx in the next slides will probably be pgrx tomorrow, but I couldn't change my slides in time anymore.
1:16:24
This is a Rust library to build Postgres extensions with Rust, and that's just what I want to do. And it's actually really easy to use; it's a few commands. You install cargo-pgx. Cargo is the Rust package manager, and you can install plugins for it. So after running cargo install cargo-pgx, you can run cargo pgx init to set up the Postgres extension tooling.
1:16:57
And then you can create a new extension: you do cargo pgx new my_extension, you go to the directory of the extension, and you can play around with it. That's in the README of the project.
1:17:09
But that sort of brings us to the next issue.
1:17:13
I need a name for this cool new GPT extension. The most obvious one is pg_gpt, but it turns out that I wasn't the only one with this nice idea of combining things: someone else did sort of exactly what I wanted to do. But I had already submitted the talk, and I had already started working on it too. And of course it's different; mine is obviously better. But I should at least not take the same name, that's just confusing for everyone.
1:17:45
So maybe flip it around and go for gptpg. It's a nice palindrome, so that's kind of cool, but it's kind of hard to pronounce. The only reason it rolls off my tongue so easily is that I've practiced it a lot.
1:17:59
So I ended up with the final choice of pg_human. It doesn't contain GPT, but it sort of brings the idea across: we want to humanize Postgres, so it's not such a machine anymore. It becomes a bit more human; it understands you better.
1:18:18
So I double-checked with ChatGPT whether it thought that was a good name for the extension, and it was like: "Yes, it's a great name. Short, catchy, easy to remember." So I was like, okay, if GPT agrees, I will use this name.
1:18:34
So I continued and renamed my extension to pg_human. Now we have a directory, and in that directory you run cargo pgx run. That actually spawns psql, the Postgres shell, automatically, so you can play around. Inside that shell you create the extension, and it even has a built-in hello world.
1:19:00
So you have an extension that works. It doesn't do very useful things yet, but it works.
1:19:07
So then I wanted to make some changes, because I wanted to make it do useful things. I made some small changes, recompiled, and it took seven seconds to compile. That's not super slow, but it's slow enough that it annoys me.
1:19:20
So I looked into why it is so slow. It turns out that one thing that usually makes Rust compilation, and compilation in general, slow is linking. Compilers create multiple files and then merge them together into a final binary that you can run. That step of combining those files is done by the linker, and the GNU linker, which is sort of the default one that you get on Linux, is a bit slow.
1:19:55
It's very easy, though: you can swap it out for a faster one. mold is fast, lld is fast. lld is from the Clang people and mold is from someone else. I tried mold: it was three lines in some file in my home directory, and then it took only one second to compile instead of seven.
1:20:15
One second I think is acceptable; that's fine. My change-compile cycle is not impacted so much by that.
1:20:27
So now we can actually start. What do we actually need in this extension?
First, we need to get the schema definition. That was sort of the whole point: we wanted to avoid copy-pasting that schema definition.
Then we need to send it to GPT, because that's sort of the base part.
1:20:42
Then we get some query back that we might execute, and we don't want to copy-paste that either, so we just want to execute it. I'm not going to read it anyway; that's way too much work. I'll just trust that it sort of does what I want and execute it right away.
1:21:03
And if that query is a SELECT where you want to fetch data, it should also return that data, not just run and then throw everything away. So we need to return arbitrary SQL results, because you don't know what kind of query it is: it can have many columns or just one.
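The four requirements just listed can be sketched as a single function. This is only a rough outline in Rust, not the extension's actual code: `ask_database`, its closure parameters and the `String` error type are hypothetical names chosen for illustration, and the closures stand in for the real schema lookup, the GPT request and the SQL execution.

```rust
/// Hypothetical outline of the four steps: get the schema, ask the model,
/// execute the returned SQL, and hand back whatever the query produced.
/// The three closures are stand-ins for the real implementations.
pub fn ask_database<S, C, E>(
    request: &str,
    get_schema: S,
    complete: C,
    execute: E,
) -> Result<String, String>
where
    S: Fn() -> Result<String, String>,
    C: Fn(&str, &str) -> Result<String, String>,
    E: Fn(&str) -> Result<String, String>,
{
    // 1. Collect the schema so nobody has to copy-paste it.
    let schema = get_schema()?;
    // 2. Send schema and request to the model; it answers with SQL.
    let sql = complete(&schema, request)?;
    // 3 and 4. Run the SQL right away and return its result.
    execute(&sql)
}
```

Passing the steps in as closures keeps the outline testable without a database or a network connection; a failure in any step simply stops the chain.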
1:21:30
So let's start. Getting the schema definition is actually not super easy. Postgres somehow doesn't have a built-in way to do this, which is very surprising to me.
1:21:43
But it's also not the worst: there are internal tables in Postgres that you can look at to see what tables exist. I just did that: I looked at the column names and concatenated strings together to turn them into something that looked like CREATE TABLE statements. And then I had a schema description.
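The string concatenation described here might look roughly like the following. It is a minimal sketch under assumptions: the `Column` type and `render_create_table` are hypothetical names, the column list is assumed to have been fetched already (for example from `information_schema.columns`, which is a guess at the source, not something the talk states), and identifiers are not quoted.

```rust
/// One column as it might come back from a catalog lookup.
/// (Hypothetical type; the talk does not show the real one.)
pub struct Column {
    pub name: String,
    pub data_type: String,
}

/// Concatenate column names and types into something that looks like
/// a CREATE TABLE statement, for use as context in a prompt.
pub fn render_create_table(table: &str, columns: &[Column]) -> String {
    let body = columns
        .iter()
        .map(|c| format!("    {} {}", c.name, c.data_type))
        .collect::<Vec<_>>()
        .join(",\n");
    format!("CREATE TABLE {table} (\n{body}\n);")
}
```

The output only has to be readable by the model, not executable by Postgres, so constraints and defaults are left out in this sketch.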
1:22:07
Then I need to send that to ChatGPT, and it's very easy to do that. You can install the OpenAI package for Rust; cargo is the package manager again, so it's a one-line command, and you have a ChatGPT client that you can send questions to.
1:22:26
Then you need to make those questions. We first start by giving the AI a little bit of confidence. You want it to really give you good answers, and it only does that if you tell it that it's an expert, in this case a Postgres expert. Otherwise maybe it just says dumb things, but if it's an expert it will never do that.
1:22:45
So that's one thing. Then you want to tell it what you have: this is my database schema. That's the second thing you send it. And then you tell it what you need: I have this schema, and I want you to give me a query that does this thing, described in normal human language.
1:23:06
And then finally, also really important, you tell it what you don't want. You don't want it to fluff around the answer with things that you don't care about, like descriptions. You just want the raw SQL query; that's all you need.
1:23:19
And you don't want it to make up tables that don't exist. The tables you're sending it are all the tables that are there, so if it thinks there are other tables, that query is never going to work anyway. So you try to suggest to it that it shouldn't do that.
1:23:34
And that's the important part: it never really completely listens. Most of the time it does, but sometimes it just ignores your advice. But yeah, people also do that.
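The four parts of the prompt (confidence, schema, request, and what not to do) could be assembled along these lines. This is a sketch, not the wording used in the talk: the `Message` type and `build_prompt` are hypothetical, the message texts are made up for illustration, and a real client library would supply its own message type.

```rust
/// A chat message with a role, in the general shape chat-style APIs use.
/// (Hypothetical type for illustration, not the OpenAI crate's own.)
#[derive(Debug, PartialEq)]
pub struct Message {
    pub role: &'static str,
    pub content: String,
}

/// Build the prompt in the order described: confidence, schema, request,
/// and finally what the model should not do.
pub fn build_prompt(schema: &str, request: &str) -> Vec<Message> {
    vec![
        Message {
            role: "system",
            content: "You are a PostgreSQL expert.".to_string(),
        },
        Message {
            role: "user",
            content: format!("This is my database schema:\n{schema}"),
        },
        Message {
            role: "user",
            content: format!("Give me a query that does the following: {request}"),
        },
        Message {
            role: "user",
            content: "Reply with the raw SQL query only, without any explanation. \
                      Only use tables and columns from the schema above."
                .to_string(),
        },
    ]
}
```

Since the model does not always follow the last instruction, code that consumes the reply should still be prepared for extra text around the SQL.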
1:23:48
But there's one issue with sending these requests, and that's that Rust loves async and threads to make everything go fast, and async support in PGX doesn't really exist. I opened an issue on GitHub to discuss how we could make that better, but at the moment there's not really a nice solution.
1:24:09
And Postgres and PGX really don't like threads. Nothing in Postgres is thread-safe, so if you spawn multiple threads that touch Postgres things, everything is going to go horribly wrong.
1:24:23
But luckily there's a really easy workaround: you can just tell the Rust async library to run everything on the current thread, and then you don't use threads. It won't go as fast as it could, but that's fine. We're only sending one request at a time anyway, so we might as well do it on the current thread.
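To illustrate what "run everything on the current thread" means, here is a minimal executor that uses only the Rust standard library. It is a teaching sketch, not what the extension does: the talk does not name the async library, and in practice one would more likely switch on a runtime's single-threaded mode (Tokio's current-thread runtime is one example). `block_on_current_thread` and `ThreadWaker` are hypothetical names.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Waker that unparks the thread that is driving the future.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Drive a future to completion on the calling thread.
/// No worker threads are spawned, so nothing else touches Postgres state.
pub fn block_on_current_thread<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            // Sleep until the future signals that it can make progress.
            Poll::Pending => thread::park(),
        }
    }
}
```

The future is polled only by the thread that called the function, which is the property that matters inside a Postgres backend.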
1:24:43
And then we get some query back that we can execute, so we need to execute it. Postgres has SPI, the Server Programming Interface, which is the C API for running SQL in an extension, and PGX has really nice Rust bindings for that C API. So it's as simple as: you get a string of SQL, you say "connect to SPI", and then "execute this SQL". That's all you need to do.
1:25:13
So this was actually very simple, but returning arbitrary results is where it gets a bit more complicated. All you can do with a Postgres extension is create functions, and functions need to have a known set of columns. You can't change those columns on the fly while running the function; Postgres needs to know up front what columns that function is going to return.
1:25:39
So that seems like a bit of a problem, but actually it's not so much. We're not writing any SQL anymore anyway, so we're sort of doing NoSQL kinds of things already. We might as well go full NoSQL and start returning JSON blobs, so that's what we're doing.
1:25:54
Postgres can transform a query so that it returns one JSON blob, so we have a function that returns only JSON: a single column with only JSON. You wrap the actual query that you get from ChatGPT in a little bit of a wrapper that turns it into a query returning JSON, and then you have something that works.
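One way such a wrapper could be written is shown here. It is a sketch under assumptions: `wrap_as_json` is a hypothetical name, and using the built-in `json_agg` aggregate is a guess at the technique, since the talk does not show the actual wrapper. It also assumes the generated query is a single row-returning statement such as a SELECT.

```rust
/// Wrap an arbitrary query so that it returns a single JSON value.
/// Assumes `query` is one statement that produces rows (e.g. a SELECT).
pub fn wrap_as_json(query: &str) -> String {
    // A trailing semicolon would break the subquery, so strip it.
    let inner = query.trim().trim_end_matches(';').trim_end();
    format!("SELECT coalesce(json_agg(t), '[]'::json) FROM ({inner}) AS t")
}
```

Whatever columns the inner query has, the outer query always has exactly one JSON column, which gives the function the fixed return type Postgres requires. Statements that return no rows, such as CREATE TABLE, cannot be placed in a subquery like this and would need separate handling.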
1:26:20
So now it's time to show that all this sort of does what I told you it does, so we're going to hope that the AI does the right thing. But I didn't really know what kind of demo to do, so I asked ChatGPT again. It said a to-do list is a good demo app, and I was like, okay, that seems reasonable, so let's do that.
1:26:45
I will start typing now. The first thing I do is create some tables. As you can see, I'm calling this function "I'm feeling lucky". This is changing your database with something returned from an AI that might or might not do exactly what you want, so I think it's a well-named function.
1:27:10
What we're going to do is create tables for a to-do app with multiple users, and when I run that
1:27:21
it's going to think for a little bit. If we scroll up a bit, you can see it created a table with users and a table with tasks, with user ID, description, due date and "is complete" for the tasks, and names and emails for the users. So that looks pretty reasonable, but the due date seems a bit unnecessary.
1:27:42
So let's remove that column; we don't really care about it. And GPT happily obliges: it runs ALTER TABLE ... DROP COLUMN due_date, even though we didn't specify which table it was, or that there was an underscore between "due" and "date". It sort of understands what we mean. And if I look at the tables that we have, we do have the tasks and the users tables.
1:28:10
So now we need some data to show some nice queries. Creating dummy data is always quite annoying, because you have to think, and ChatGPT can think for us. So let's
1:28:31
um let's create let's add three famous football players
1:28:33
let's add three famous football players
1:28:33
let's add three famous football players and have three tasks for each of them
1:28:35
and have three tasks for each of them
1:28:35
and have three tasks for each of them based on the training schedule to have
1:28:37
based on the training schedule to have
1:28:37
based on the training schedule to have some nice relatable data
1:28:40
some nice relatable data
1:28:40
some nice relatable data and we wait a bit this usually takes a
1:28:42
and we wait a bit this usually takes a
1:28:42
and we wait a bit this usually takes a bit longer because it actually
1:28:43
bit longer because it actually
1:28:43
bit longer because it actually it's it's
1:28:47
creates tokens and those each of those Dockers
1:28:49
tokens and those each of those Dockers
1:28:49
tokens and those each of those Dockers cost a bit of time but this time it was
1:28:51
cost a bit of time but this time it was
1:28:51
cost a bit of time but this time it was fairly fast only seven seconds uh but
1:28:53
fairly fast only seven seconds uh but
1:28:53
fairly fast only seven seconds uh but because this is I mean this it's
1:28:55
because this is I mean this it's
1:28:55
because this is I mean this it's actually creating quite a bit of data so
1:28:57
actually creating quite a bit of data so
1:28:57
actually creating quite a bit of data so it takes more time to generate that than
1:28:59
it takes more time to generate that than
1:28:59
it takes more time to generate that than a simple query
1:29:01
And as you can see, it adds Messi, Ronaldo, and Neymar. I'm not a huge football fan or anything, but even for me those are names that I know. It even knows the football clubs that they're from and uses those in the email, so that's a really nice touch. The tasks also sort of make sense: they look like football things that you could train.
1:29:28
So let's see if anyone completed some of their tasks.
1:29:36
If there are any completed tasks, you can check that, and it will create a query that filters on is complete being true. But currently no one has completed any tasks, so let's change that.
1:29:53
Let's complete some tasks randomly: we randomly complete 50% of the tasks. It takes a bit of time, but then we get a query back: update tasks, set is complete to true, where id in... something, I don't know. So it's some query with some random parts, and that seems reasonable. There's a 0.5 in there, so that's probably the chance of it happening. I don't know the exact details, but it looks about right.
1:30:27
So let's show the completed tasks again. Oh yeah, you can see there are five completed tasks now. They might not be entirely random, because it's five, six, seven, eight, nine, but maybe, you know, it could be random.
1:30:45
But now we also want to know who completed these tasks, and we only see user IDs two and three, so that's a bit annoying. So let's show the number of completed tasks by user and include the username, just so that we know which football players are actually doing their training and which one is not doing anything.
1:31:12
So we wait a bit again, and we can see that Neymar is clearly the one that's actually training a lot, and Ronaldo is behind, but Messi... I mean, he's probably old and thinks he's done enough.
1:31:28
And the query is actually not a super easy query. It has a join on the columns that are actually there, it does a group by, and it does all the things in a way that makes sense. So that's quite impressive to me.
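A query of the kind described here, a join plus a group by that counts completed tasks per user, might look like the sketch below. It is an assumption-laden illustration and not the query from the demo: SQLite (via Python's sqlite3 module) stands in for Postgres, and the table names, column names, usernames, and the choice of which tasks are complete are all hypothetical, picked only to mirror the outcome mentioned in the talk.

```python
import sqlite3

# In-memory SQLite database standing in for Postgres (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT NOT NULL);
    CREATE TABLE tasks (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL REFERENCES users (id),
        is_complete INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO users (id, username)
    VALUES (1, 'messi'), (2, 'ronaldo'), (3, 'neymar');
    """
)
# Tasks 1-3 belong to user 1, 4-6 to user 2, 7-9 to user 3.
# Tasks 5..9 are marked complete, as in the run shown in the talk.
rows = [(i, (i - 1) // 3 + 1, 1 if i >= 5 else 0) for i in range(1, 10)]
conn.executemany(
    "INSERT INTO tasks (id, user_id, is_complete) VALUES (?, ?, ?)", rows
)

# LEFT JOIN keeps users with zero completed tasks in the result.
query = """
SELECT u.username, COUNT(t.id) AS completed_tasks
FROM users AS u
LEFT JOIN tasks AS t ON t.user_id = u.id AND t.is_complete = 1
GROUP BY u.id, u.username
ORDER BY completed_tasks DESC, u.username
"""
result = conn.execute(query).fetchall()
print(result)
```

The LEFT JOIN is a design choice of this sketch: with an inner join, a user without any completed tasks would simply be missing from the output instead of showing up with a count of zero.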
1:31:47
And that's kind of the demo, but one final thing: after a demo I usually like to clean up, because I don't like leaving tables around so they mess up other tests that I do.
1:31:59
So let's just delete all the tables at the end. We drop tables, and it deletes the tables very happily. This is one of the reasons why you probably shouldn't do this all the time on your production database: whatever you tell ChatGPT, it will execute, and it's totally fine with it. So that was the demo.
1:32:33
So where can you find this amazing Postgres extension? It's open source; I just published it about five minutes before my talk. Definitely use it at your own risk: it's more a toy than a production project at this moment. But it's fun to look at, and the code is quite easily understandable, I think.
1:32:54
And there are a few things for future improvements, or future fun things to maybe add to it: explaining EXPLAIN plans, because those are usually hard to understand; index suggestions, like what columns to index for your queries; and also, for Citus, distribution column suggestions. That's even harder than index suggestions, because you can only have one distribution column, and it's not always the primary key. You have to think a bit about your schema and your queries to do something intelligent.
1:33:33
That was my talk. If there are any questions, I'm happy to answer some.
1:33:43
Okay, so that was a delight. Oh my goodness. Okay, so we want to know: did you actually write the code, or did you make ChatGPT write it all for the extension?
1:33:53
I wrote it all. There were some examples. I know one of my colleagues, Marco, did use ChatGPT with pgx, to tell it to create some other extension, sort of a template. So it definitely knows about it and how it's used.
1:34:13
And you just mentioned Marco Slot, who's giving the keynote talk in the EMEA livestream, that's tomorrow, Wednesday. Just throwing in that shout-out there.
1:34:21
All right, Rob.
1:34:25
Yeah, I loved at the end where you're like, be careful what you tell GPT to do. Earlier, when you said "let's remove the column" and it dropped the column, I thought, wait, was he removing that from his select statement, or did he mean from the table? So I'm just waiting for the day when we get our first delete without a WHERE clause because of ChatGPT. That will eventually be a thing that somebody has to tell their boss, I'm sure.
1:34:47
Yes, for sure, that's going to happen.
1:34:50
On a slightly more serious note: now that you've played around with this Rust extension (and by the way, I learned that they just changed their name within the last 24 hours, so I think that's good long term, but interesting timing), I'm curious when you would recommend folks use Rust for an extension versus trying to do it in C.
1:35:14
So I actually think it's easier in Rust for everyone, honestly. Unless you're really a Postgres core developer who works on Postgres a lot, I think it's easier to learn Rust. Even the setup is so much simpler. Creating an extension in C means you have to do lots of things, you have to look at Makefiles, and it's all very annoying, and that's hard if you don't have experience with it.
1:35:46
With Rust it was a few commands and you had something set up, and there are some examples that you can look at in the repo. So I think it was easier to write an extension in Rust than it was in C.
1:35:59
So I know you work on the Citus extension to Postgres. Are you saying that if you were to start the Citus project today, you would be proposing to implement it in Rust?
1:36:09
I think Citus might be a special case, because it interacts so much with all the Postgres things. So it's kind of useful to be able to copy-paste code sometimes, code that is not really exposed in Postgres but that we still need. And if you have Rust, you can't simply copy-paste C, since it's a different language.
1:36:34
So the team isn't secretly about to just rewrite the whole Citus extension in Rust over the summer or something?
1:36:41
I don't expect that to happen anytime soon, but we're definitely using Rust for some other extensions.
1:36:51
Well, thank you so much, Jelte. I know Rob wanted to ask a question about the pronunciation of Postgres.
1:36:59
Yeah, well, I just wonder: can you get ChatGPT to actually call it Postgres instead of PostgreSQL?
1:37:04
I have not tried, but I think if you ask it sweetly enough, it might.
1:37:15
Sometimes it has its own free will, and that's something you don't want. That's what I've learned: even if you tell it you only want the SQL, the raw SQL, sometimes it still adds "This is the SQL:" right in front of it, and then when you execute it, it just doesn't work, because that's not the SQL that you want to execute.
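The problem described here, a chatty prefix in front of the SQL, could be handled with a small cleanup step before execution. The following is a minimal sketch under assumptions: the function name extract_sql and the list of recognized keywords are hypothetical, and the source does not say how (or whether) the extension from the talk handles this.

```python
import re

# A statement is assumed to start with one of these keywords, either at the
# beginning of a line or right after a colon such as "This is the SQL:".
SQL_START = re.compile(
    r"(?:^|:\s*)(SELECT|INSERT|UPDATE|DELETE|CREATE|DROP|ALTER|WITH)\b",
    re.IGNORECASE | re.MULTILINE,
)


def extract_sql(response: str) -> str:
    """Drop any leading prose and return the text from the first SQL keyword on."""
    match = SQL_START.search(response)
    if match is None:
        raise ValueError("no SQL statement found in response")
    return response[match.start(1):].strip()


print(extract_sql("This is the SQL: SELECT * FROM tasks;"))
print(extract_sql("Here you go:\nUPDATE tasks SET is_complete = true;"))
```

This heuristic only removes text before the statement. Anything the model appends after the SQL would still be passed through, so a stricter approach would be needed for that case.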
1:37:35
Well, Rob, when we look at our checklist, we can check off "discussion about the pronunciation of Postgres", because you can't have a Postgres conference without having that discussion.
1:37:44
So somebody is working on their bingo card right now.
1:37:49
You are going to co-host the EMEA livestream, so for people who come to that, we will see you again tomorrow. For those of you who want to mark your calendar, that's happening Wednesday at 9 CEST, and there's a calendar invite URL showing on the screen right now. So thank you, this is awesome. See you tomorrow.
1:38:11
Interstitial video next, is that what's going on?
1:38:13
I believe we do. Before we jump to the next speaker, we wanted to highlight some of the on-demand talks that are available. There are 25 of them in total; this will be just a preview of a few of them, but it hopefully gives people who haven't gone and read all of the guide yet an idea of what else is out there. Here's a little preview.
1:38:36
We are so happy that so many of you are joining us for Citus Con: An Event for Postgres 2023, now in its second year. I want to make sure you know there are 37 talks in this year's lineup: not just the six talks in the two livestreams, but there are 25 more brand new talks. We call them on-demand talks, and they're going to publish on YouTube at the very start of Citus Con.
1:39:00
Lucas Borges is giving a demo about how to autoscale Azure Cosmos DB for Postgres using Grafana and Azure serverless. Haki Benita is giving a wonderful talk on unconventional ways to index UUIDs in Postgres. Hettie Dombrovskaya is presenting on temporal features and time travel. Adithya Kumaranchath is going to cover partitioning strategies for Oracle to Postgres migrations. Bruce Momjian is giving one of his many talks, on artificial intelligence and Postgres. Paolo Melchiorre is going to teach you how to build a web map using Django and PostGIS. And Chelsea Dole has a great talk on Postgres table bloat.
1:39:45
And there's even more. You can find all 25 of the on-demand talks at aka.ms slash cituscon hyphen on demand.
1:39:57
All right, awesome stuff. Let's dive in. I know that we spent too much time asking questions of Jelte, so I think we should bring our next speaker on. Yeah, let's go ahead and bring in Ivan Vyazmitinov, who is a technical lead and accidental DBA. Hello. How you doing? Perfectly fine, thanks. Awesome. He is also a Citus open source user and is going to be talking today about JSON and analytics and putting all that together. So let's turn it over to Ivan and we'll get started.
1:40:33
Yes, hello everyone. This talk will show you a successful example of real-time analytics based on semi-structured data, JSON, and how Citus helps with the shortcomings of that. It will be especially useful for those who are considering JSONB for such a workload, to know the caveats, tips and tricks, and, again, how Citus helps you with its distribution capabilities.
1:41:04
Now, here is my short info and my socials. We are a company that makes social and mobile games. You might have heard about Klondike Adventures; that's us. There's also a short list of our achievements. We are also big believers in open source, and we encourage every user of open source to join the contribution community.
1:41:33
Now, with that out of the way, let's get started. First, I'd like to discuss the division of labor in our company. Here we have my department, internal tools, which is essentially a data engineering department: we extract data from various sources, prepare it, clean it, and store it in a data warehouse. On the other side, there is the analytical department, which uses the data we prepared for reports, machine learning tasks, etc. This talk will mostly focus on the data engineering stuff.
1:42:10
Now to our analytical solution. The general vibe of our solution can be described as: ingest everything, every piece of data, as fast as possible, and with any computational power available.
1:42:28
This comes from the following set of initial requirements. From the start, we were required to have a solution capable of storing semi-structured data, because we, as you may have noticed, have a lot of games, and each game has its own unique analytical events with unique content per user. So the solution should accept that data in real time. It should be scalable to support increased demand, if any. It should store that semi-structured user data efficiently and allow us to query it efficiently via SQL. And now Citus comes into play. But you may wonder: why exactly Citus, and not TimescaleDB or ClickHouse or YugabyteDB, CockroachDB, or any other available technology on the market? Well, I'll answer you with a list of pros and cons, but it will be a little bit dated, because we were making our choice six years ago. So: Citus is an open source technology, especially now. It supports JSONB natively, because it is an extension of Postgres. These JSONs are conveniently queried with SQL, which is familiar to our users. And Citus itself is an OLAP-centric extension, so it was built with the OLAP use case in mind.
1:44:04
On the other hand, there was no columnar storage back then, which is typical for OLAP, and there was no convenient way to scale the cluster in the open source edition; it was behind the paywall. Luckily, today is a new day, and both of those issues are resolved: they are now open sourced. Thanks to Microsoft, and especially [inaudible].
1:44:36
now let's see the hardware that we have
1:44:36
now let's see the hardware that we have for that
1:44:37
for that we use an on-premise installation due to
1:44:41
we use an on-premise installation due to
1:44:41
we use an on-premise installation due to some historical reasons here you can see
1:44:44
some historical reasons here you can see
1:44:44
some historical reasons here you can see our coordinator machine specs and here
1:44:48
our coordinator machine specs and here
1:44:48
our coordinator machine specs and here are working machines spec so that that
1:44:51
are working machines spec so that that
1:44:51
are working machines spec so that that we have for the off now we rely hugely
1:44:56
we have for the off now we rely hugely
1:44:56
we have for the off now we rely hugely on huge pages one gigabyte in size we
1:45:00
on huge pages one gigabyte in size we
1:45:00
on huge pages one gigabyte in size we disabled swap
1:45:04
the hardware that we have
1:45:08
the hardware that we have
1:45:08
We also picked Btrfs as our file system. It's a great file system. Being copy-on-write, it provides us with the ability to make file-system-level snapshots, which are a perfect solution for quick backups. But more importantly, it is capable of compressing the data that it manages. So, using Btrfs, we gain a compression ratio of about six to one, as reported by the compsize tool.
1:45:43
And that's essentially how we solve the data issue with JSON, because, as you may know, semi-structured data is very space demanding. Six to one is a great ratio for us, but it could have been better if we had played with the size settings at compilation from the beginning. But yeah, the time has passed. So now let's see the described solution itself. Here's an example of the data that we accept in our system. The first three fields are essentially a primary key of an event, and the data field can contain any JSON that is required for the customer.
1:46:40
in order to store that uh in postgres we
1:46:40
in order to store that uh in postgres we dynamically create tables that
1:46:42
dynamically create tables that
1:46:42
dynamically create tables that correspond to the name and other fields
1:46:45
correspond to the name and other fields
1:46:45
correspond to the name and other fields go as is essentially but there is also a
1:46:50
go as is essentially but there is also a
1:46:50
go as is essentially but there is also a special profile
1:46:52
special profile events designed to store Common data for
1:46:57
events designed to store Common data for
1:46:57
events designed to store Common data for events so it's like normalization
1:47:00
events so it's like normalization
1:47:00
There is a permanent profile, which is actually an updatable entity; it is designed to store data like register date or country, something that doesn't change over time, for all events. And there is a volatile profile, for things like user level or something. It is similar to events: it is not updated, it is stored, but upon insertion we calculate a range of activity for the profile. So the profile is considered active from the update time, when the profile came into our system, up to the next profile for that user.
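The activity range described here could be computed along the following lines. This is a minimal Python sketch, not the speaker's implementation: the function name `activity_ranges`, the `(user_id, update_time)` tuple layout, and the use of `None` for a still-open range are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, List, Optional, Tuple

# Hypothetical shapes: a volatile profile snapshot is (user_id, update_time);
# its activity range is the half-open interval [active_from, active_to).
Snapshot = Tuple[int, datetime]
Range = Tuple[datetime, Optional[datetime]]


def activity_ranges(snapshots: List[Snapshot]) -> Dict[Snapshot, Range]:
    """Give each snapshot a range that ends where the user's next one starts."""
    by_user: Dict[int, List[datetime]] = defaultdict(list)
    for user_id, update_time in snapshots:
        by_user[user_id].append(update_time)

    ranges: Dict[Snapshot, Range] = {}
    for user_id, times in by_user.items():
        times.sort()
        for i, start in enumerate(times):
            # The latest snapshot has no successor yet, so its range stays open.
            end = times[i + 1] if i + 1 < len(times) else None
            ranges[(user_id, start)] = (start, end)
    return ranges
```

In Postgres such a range might be kept in a range-typed column such as `tstzrange`, though the talk does not say which type is used.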
1:47:37
Combining all this together, we get the base for most of our analytical queries. This join represents like 95 percent of the queries that are performed in our system. And since this is per-user analytics, all of our tables presented here are distributed by the user ID column.
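To make the dynamic table creation and the distribution by user ID concrete, here is a hedged Python sketch that builds the DDL for one per-event table. The table prefix `event_`, the column names `user_id`, `event_time`, and `data`, and the primary key layout are assumptions for illustration; the talk does not show the real schema. `create_distributed_table` is the Citus function for sharding a table on a column.

```python
import re
from typing import List

# Accept only plain lowercase identifiers: the event name becomes part of
# the DDL text and cannot be sent as a bind parameter.
_IDENTIFIER = re.compile(r"[a-z_][a-z0-9_]*")


def event_table_ddl(event_name: str) -> List[str]:
    """Build DDL for one per-event table, distributed by user_id."""
    if not _IDENTIFIER.fullmatch(event_name):
        raise ValueError(f"unsafe event name: {event_name!r}")
    table = f"event_{event_name}"  # hypothetical naming scheme
    return [
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "user_id bigint NOT NULL, "
        "event_time timestamptz NOT NULL, "
        "data jsonb NOT NULL, "
        # Citus requires unique constraints to include the distribution column.
        "PRIMARY KEY (user_id, event_time))",
        # Shard the table on user_id so that rows of one user live together.
        f"SELECT create_distributed_table('{table}', 'user_id')",
    ]
```

Distributing every table on the same column is what lets a per-user join between events and profiles run on co-located shards rather than moving data between workers.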
1:48:04
Now, I also should mention that we have more tables to support integration with partners like AppsFlyer, Google, etc. It is a big part of our analytics, but it is not a part of this talk. Maybe next time, next session. So let's see some actual results of this. Here you see the statistics that we gathered from the pg_stat_statements extension in about two months, and as you can see, the vast majority of our analytical queries run under 30 seconds. Here is a quick plot to demonstrate it visually. And now let's discuss how we gained it using JSONB as our main type of data.
1:48:58
So, there are two caveats that you should be aware of when you're using JSONB. First, JSONBs can get TOASTed, because they are unrestricted in size, and essentially, if they get TOASTed, it will hurt performance. But secondly, and most importantly, Postgres still does not gather planner statistics for JSONB content. Essentially, what it leads to: if you have some query that uses JSONB in the WHERE clause, your rows get extremely underestimated and you will get a suboptimal plan. So what can we do about that? Well, to battle TOAST, we advise our clients to use thinner events and bigger profiles, so the vast majority of events won't get TOASTed and profiles will be kept in [inaudible]. You can go even further with restrictions, via check constraints or something on the middleware side, but we do not recommend it; it's too restrictive. And you can play around with table settings or even compilation settings for Postgres, but once again, we didn't need it so far.
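The "thinner events" advice can be checked on the client side before an event is sent. The sketch below is an assumption-laden illustration in Python, not something shown in the talk: the function name `likely_toasted` and the 2000-byte budget are made up, and the JSON text length is only a rough proxy for the stored jsonb size.

```python
import json

# Postgres starts compressing or moving values out of line once a row
# exceeds roughly 2 kB (TOAST_TUPLE_THRESHOLD with the default 8 kB page).
TOAST_BUDGET_BYTES = 2000  # rounded down on purpose to leave headroom


def likely_toasted(event_data: dict, budget: int = TOAST_BUDGET_BYTES) -> bool:
    """Rough check: is this payload big enough to risk being TOASTed?

    The UTF-8 length of the compact JSON text is only a proxy; the jsonb
    on-disk form and the other columns of the row change the real number.
    """
    encoded = json.dumps(event_data, separators=(",", ":")).encode("utf-8")
    return len(encoded) > budget
```

The per-table setting mentioned in passing may refer to the `toast_tuple_target` storage parameter, but that mapping is a guess; the talk does not name it.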
1:50:23
Now, statistics. First, what you can do with statistics is just accept the fact that some of your plans will be suboptimal, and maybe add more computational power to it. But if you need accurate estimation, you can spend some space on disk and create a GIN index. This is a special type of index specifically designed for composite values, and most importantly for JSONs, in that it supports JSON path expressions.
1:50:59
Here's an example of a GIN index and a JSON path expression, and as you can see, row estimations got much better. It's a great solution: it covers all fields of the JSON, or columns within it. But unfortunately, it is much slower than a B-tree index, because it uses bitmap index scans. And JSON path itself, while being pretty powerful in capabilities, is not very readable and reminds me of regular expressions. It can get very ugly, very fast.
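The slide itself is not in the transcript, so the following Python sketch only illustrates the general shape of a GIN index plus a JSON path predicate. The table name `event_level_up`, the column name `data`, the choice of the `jsonb_path_ops` operator class, and the `%s` placeholder style (as used by psycopg) are assumptions, not the speaker's code.

```python
import json


def gin_index_ddl(table: str, column: str = "data") -> str:
    """DDL for a GIN index over a whole jsonb column."""
    # jsonb_path_ops is the more compact jsonb operator class; it supports
    # the containment (@>) and jsonpath (@?, @@) operators.
    return (
        f"CREATE INDEX {table}_{column}_gin "
        f"ON {table} USING gin ({column} jsonb_path_ops)"
    )


def equals_jsonpath(field: str, value) -> str:
    """jsonpath predicate matching documents where a top-level field == value.

    Assumes `field` is a simple key name. GIN can only help with equality
    conditions inside a jsonpath, which is why this sketch uses `==`.
    """
    # json.dumps quotes strings and formats numbers the way jsonpath expects.
    return f"$.{field} ? (@ == {json.dumps(value)})"


# Hypothetical query text; the jsonpath string is passed as a parameter.
QUERY = "SELECT count(*) FROM event_level_up WHERE data @? %s::jsonpath"
```

For example, `equals_jsonpath("country", "DE")` produces a path that could be bound to the placeholder in `QUERY`.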
1:51:42
So now let's talk about those B-tree indexes. Well, that's quite simple. You can create them on the arrow operator, and with a B-tree index present, the performance is the best, the estimation is the most accurate, etc. But you'll have to create it on each column of the JSON separately.
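A B-tree index on the arrow operator is an expression index, one per JSON field. The Python sketch below builds such a statement; the function name, the index naming scheme, and the `data` column name are illustrative assumptions rather than the speaker's setup.

```python
def btree_field_index_ddl(table: str, field: str, cast: str = "") -> str:
    """DDL for a B-tree expression index on one field of a jsonb column."""
    # ->> extracts the field as text; an index expression must be wrapped
    # in its own pair of parentheses.
    expr = f"(data ->> '{field}')"
    if cast:
        # Optional cast, e.g. "int", so comparisons are numeric, not textual.
        expr = f"({expr}::{cast})"
    return f"CREATE INDEX {table}_{field}_idx ON {table} ({expr})"
```

Postgres collects statistics for indexed expressions when the table is analyzed, which is one plausible reason the row estimates improve with such an index. Queries need to use the same expression as the index for the planner to match it.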
1:52:06
Now, there is also an ultimate universal answer for all our performance issues, statistics, TOAST, etc.: just add raw computational power. I mean, optimal plan or suboptimal, if you have enough computational power, your queries will perform adequately. Of course, it's mostly a joke; it won't solve all your problems. But again, with Citus, it's a very great option to have sometimes.
1:52:37
Now, we also have some default indexes that we create for all games to support our base join. But we also sometimes need to resort to denormalization, because our data can get very big. For example, our biggest table is 28 terabytes in size, and even with indexes, this join with profiles will not perform well. So what we do instead is just store that value in the event table itself. In order to do that, we just create a corresponding column and write a simple update query that fills that column using the select you saw on the slide. This update is wrapped up into a nice callable procedure, and this procedure is scheduled when it is appropriate for us. Using this approach, we eliminate
1:53:33
for us using this approach we eliminate the needs of this slow join and it gets
1:53:38
the needs of this slow join and it gets
1:53:38
the needs of this slow join and it gets great results but from the other hand uh
1:53:42
great results but from the other hand uh
1:53:42
great results but from the other hand uh uh now we have some historical databases
1:53:47
uh now we have some historical databases
1:53:47
uh now we have some historical databases fill this column field and fresh data
1:53:50
fill this column field and fresh data
1:53:50
fill this column field and fresh data where it is not filled no problem if you
1:53:54
where it is not filled no problem if you
1:53:54
where it is not filled no problem if you know the threshold of filling you can
1:53:56
know the threshold of filling you can
1:53:56
know the threshold of filling you can just Union fresh data and historical
1:53:58
just Union fresh data and historical
1:53:58
just Union fresh data and historical data
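A minimal sketch of the denormalization pattern described above, using SQLite from the Python standard library as a stand-in for Postgres. The schema (`events`, `profiles`, `country`, `event_day`) and the threshold value are hypothetical; the talk does not show the actual tables or queries.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE profiles (user_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE events (
        event_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        event_day INTEGER,
        country TEXT          -- denormalized copy of profiles.country
    );
    INSERT INTO profiles VALUES (1, 'DE'), (2, 'US');
    INSERT INTO events (event_id, user_id, event_day) VALUES
        (10, 1, 1), (11, 2, 2), (12, 1, 3), (13, 2, 4);
""")


def backfill_country(conn, threshold_day):
    """Fill the denormalized column for every event older than threshold_day."""
    with conn:  # one transaction: commit on success, roll back on error
        conn.execute(
            """UPDATE events
               SET country = (SELECT p.country FROM profiles p
                              WHERE p.user_id = events.user_id)
               WHERE event_day < ? AND country IS NULL""",
            (threshold_day,),
        )


def events_with_country(conn, threshold_day):
    """Historical rows read the stored value; fresh rows still use the join."""
    return conn.execute(
        """SELECT e.event_id, e.country FROM events e WHERE e.event_day < ?
           UNION ALL
           SELECT e.event_id, p.country
           FROM events e JOIN profiles p ON p.user_id = e.user_id
           WHERE e.event_day >= ?
           ORDER BY 1""",
        (threshold_day, threshold_day),
    ).fetchall()


THRESHOLD_DAY = 3  # the "threshold of filling": days before it are backfilled
backfill_country(conn, THRESHOLD_DAY)
rows = events_with_country(conn, THRESHOLD_DAY)
print(rows)
```

The point of the split is that only the small, fresh slice of the event table pays for the join, while the large historical slice reads the stored column directly.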
1:54:00
It is similar to what matviews might be, so let's see how materialized views work with Citus.
1:54:10
So here is a typical analytical query, and when we want to store its result on disk we just create a matview and schedule its refresh, right? So simple. OK, now with Citus: you just create a matview, invoke create_distributed_table on that matview and schedule its refresh, and now it's joinable with all the other distributed data in our cluster, right? Nope.
1:54:35
Unfortunately, there is no way to create a distributed matview in Citus, and it is by design, because the developers suggest using the INSERT ... SELECT functionality instead. No problem: in order to do that, just create a regular table, distribute it, and write a truncate-and-insert procedure that essentially mimics materialized view functionality.
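The truncate-and-insert procedure can be sketched as follows. This is an illustration only: it runs against SQLite so that it is self-contained, it uses `DELETE` where Postgres would use `TRUNCATE`, and the table names (`events`, `revenue_by_game`) are hypothetical. On Citus the target table would additionally be distributed with `create_distributed_table` before the first refresh.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (game_id INTEGER, revenue REAL);
    -- A regular table standing in for the materialized view.
    CREATE TABLE revenue_by_game (game_id INTEGER PRIMARY KEY, total REAL);
    INSERT INTO events VALUES (1, 2.0), (1, 3.0), (2, 10.0);
""")


def refresh_revenue_by_game(conn):
    """Empty the table and refill it from the defining query in one transaction."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DELETE FROM revenue_by_game")  # TRUNCATE in Postgres
        conn.execute(
            """INSERT INTO revenue_by_game (game_id, total)
               SELECT game_id, SUM(revenue) FROM events GROUP BY game_id"""
        )


refresh_revenue_by_game(conn)                       # initial fill
conn.execute("INSERT INTO events VALUES (2, 5.0)")  # new data arrives
refresh_revenue_by_game(conn)                       # scheduled refresh
totals = conn.execute(
    "SELECT game_id, total FROM revenue_by_game ORDER BY game_id"
).fetchall()
print(totals)
```

Keeping both statements in one transaction means readers never see the table half-empty, which is the behavior a matview refresh would give.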
1:55:08
All right, that's the gist of our solution. Now let's see how we perform maintenance of that beautiful beast.
1:55:17
Our maintenance is essentially split. On one side are the developers who write the middleware, because some tables are essentially a part of that middleware. That is wrapped into DBA maintenance, which creates the infrastructure for that middleware (databases, roles, etc.) and, on the other side, creates views and indexes on the tables that are managed by the middleware. Either way, both parts are managed by Flyway, a database migration tool that lets us store changes to the database as a sequence of versioned migration scripts.
1:56:03
Now, our DBAs follow the GitOps approach. It means that we have a single repository with automated delivery configured, and every change to the database is performed as a commit to the repository with a migration script. It's very convenient: there is a change log, etc.
1:56:26
Now let's see how we handle multi-tenancy, because we have a lot of games and each game is essentially a tenant. There are preferred ways to do multi-tenancy in Citus. There is the way where you have a tenant column in all distributed tables; it is the way described in the documentation. You can also have a schema per tenant in your database.
1:56:54
We went another way and created database-based multi-tenancy, so we create a database per tenant, because we like the strong isolation between tenants. We also like that we can have a custom configuration of each database per tenant, especially the Citus configuration, so we can have different workers for different databases. It was not so easy to do, but that was five or six years ago. And, surprisingly to us, it is much easier to maintain multiple tenants this way, because we can, for example, invoke parallel migrations for common or similar objects across all tenants.
1:57:35
On the other hand, as you may know, there are no cross-database queries in Postgres itself. I'm more than sure that every Postgres user has seen this error. We work around it using the dblink extension; usual story here.
1:57:54
But more importantly, this approach is poorly supported by Citus itself, because for every database Citus requires its daemon process on the machines. It requires N connections for N workers in the cluster for deadlock detection, and two connections for transaction recovery purposes.
1:58:19
Now let's see. Let's assume that we have a cluster with 40 machines and 30 databases on these machines. Doing a little math, we get 1,260 connections per worker only for maintenance purposes. 60 of them are required each minute by default, and 1,200 just stay idle all the time.
1:58:44
What did we do to battle that? We disabled automatic recovery and deadlock detection and scheduled their sequential invocation on all databases ourselves. Luckily, Citus provides us with functions to invoke both of them manually. But it will not solve all your problems; there are still issues with that.
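The arithmetic behind those numbers can be reproduced with a small helper. This is a sketch under one assumption drawn from the figures in the talk: each database needs one deadlock-detection connection per machine plus two transaction-recovery connections. The function and parameter names are hypothetical.

```python
def maintenance_connections(machines: int, databases: int, recovery_per_db: int = 2) -> dict:
    """Estimate maintenance-only connections for a database-per-tenant cluster."""
    deadlock_detection = machines * databases           # one per machine, per database
    transaction_recovery = recovery_per_db * databases  # two per database
    return {
        "deadlock_detection": deadlock_detection,
        "transaction_recovery": transaction_recovery,
        "total": deadlock_detection + transaction_recovery,
    }


estimate = maintenance_connections(machines=40, databases=30)
print(estimate)
```

With 40 machines and 30 databases this gives 1,200 deadlock-detection connections and 60 recovery connections, 1,260 in total, matching the figures quoted above. In Citus, to my knowledge, the manual counterparts are the `recover_prepared_transactions()` and `check_distributed_deadlocks()` functions; the talk does not name them, so treat that mapping as an assumption.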
1:59:07
Another point of pain in our maintenance process is multi-level matviews. Consider a scenario where you have a matview that depends on other matviews, and you obviously have to schedule the refreshes in strict succession. What's wrong with that? Well, you'll have to pick the time of each refresh carefully and manually, and you are at risk of getting stale data if there is a delay of a refresh somewhere in the chain.
1:59:38
So what did we do to control that? We implemented a simple event-based solution, neatly integrated with our ETL middleware, to essentially configure triggers. Now our views are refreshed on triggers (not database triggers, but conceptually triggers) when their dependencies are ready.
2:00:03
What we got is that we got rid of all the cron expressions. Now all the objects are updated automatically and we do not get stale data. Refreshes may be delayed, but data will always be fresh within those chains. And as a nice side effect, we get data to create a fancy actualization dashboard to see the state of refreshes for our views.
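The idea of refreshing a view only when its dependencies are ready can be sketched with the standard-library `graphlib` module (Python 3.9+). This is not the speaker's implementation, which is not shown in the talk; the view names and the `refresh` callback are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each view lists the objects it reads from.
DEPENDENCIES = {
    "daily_sessions": {"events"},
    "daily_revenue": {"events"},
    "weekly_report": {"daily_sessions", "daily_revenue"},
}


def refresh_on_events(dependencies, loaded_sources, refresh):
    """Refresh each view as soon as everything it depends on is ready."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    refreshed = []
    queue = list(sorter.get_ready())
    while queue:
        node = queue.pop(0)
        if node in dependencies:
            refresh(node)               # a view whose inputs are all ready
            refreshed.append(node)
        elif node not in loaded_sources:
            continue                    # source not loaded yet: dependents stay blocked
        sorter.done(node)               # the "ready" event for this object
        queue.extend(sorter.get_ready())
    return refreshed


order = refresh_on_events(DEPENDENCIES, {"events"}, lambda view: print("refresh", view))
```

No refresh time is configured anywhere: if the load of `events` is late, every downstream refresh is simply late too, and no view is rebuilt from stale inputs.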
2:00:35
Now let's talk rebalancing, a very important part for a distributed database, of course. I once again remind you that it got much easier in recent years (hooray, and thank you, Microsoft and Citus), but there are some minor issues with that. Well, minor issues.
2:00:53
The first one is that not all edge cases are covered by that process; there may be some issues that you will have to deal with. But more importantly, default rebalancing leads to underutilization of the cluster.
2:01:09
Let's consider the situation when we have two workers with four CPUs and four shards per worker. Now we add two new machines to that cluster and invoke a rebalance. After that we will get two shards per machine. As you may see, two is less than four, so this clearly leads to underutilization of our newly rebalanced cluster.
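The example can be worked through in a few lines. This is a deliberately rough model that assumes a single query keeps at most one CPU busy per shard; the function names are hypothetical.

```python
def shards_per_worker(total_shards: int, workers: int) -> float:
    """Average number of shards each worker holds after an even rebalance."""
    return total_shards / workers


def idle_cpus_per_worker(total_shards: int, workers: int, cpus_per_worker: int) -> float:
    """CPUs left without a shard to work on, under the one-shard-per-CPU assumption."""
    return max(0.0, cpus_per_worker - shards_per_worker(total_shards, workers))


TOTAL_SHARDS = 2 * 4  # two workers with four shards each, as in the talk

before = idle_cpus_per_worker(TOTAL_SHARDS, workers=2, cpus_per_worker=4)
after = idle_cpus_per_worker(TOTAL_SHARDS, workers=4, cpus_per_worker=4)
print(f"idle CPUs per worker: before={before}, after={after}")
```

Rebalancing moves shards but does not create new ones, so doubling the machines halves the shards per machine and, in this model, leaves half of each machine's CPUs idle for a single query.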
2:01:38
Instead of rebalancing, you can use alter_distributed_table and change the shard count of the table in question. That way it will work as expected: you'll get a nice layout of shards according to the number of CPUs. But there are some considerable drawbacks with this approach.
2:01:57
First, there is no convenient start function like the one for default rebalancing, which was open sourced. You will have to alter each table that you want to alter manually, by name.
2:02:12
And what's more important, when a table's shard count is altered, the table is recreated, so you will get a new table with a new OID after the alteration. As you may guess, it means that you will have to recreate all dependent objects: views, indexes, etc. Citus tries to do that, but unfortunately there are issues with that as well.
2:02:39
Now, if you want to know more about the issues we faced and about our practices and conventions, there's a series of posts on DZone, where I am the author, in which I describe that in much more detail.
2:02:57
Now let's talk about the user experience that our users have with our system. Let's talk expectations. We expected our users to run frequent queries with reasonably small results, so that all the substantial aggregation would be performed at the DB level. Unfortunately, in reality our users perform one query per day, a slow query, which loads gigabytes of data.
2:03:25
Why so? Well, essentially because BI tools encourage this. For example, Tableau, a common BI tool, encourages the use of extracts. Data science users are used to the pandas library or NumPy; this is their statistical API. So as a general rule, it is common practice for data science users to perform their data science tasks on a CSV file or something like that.
2:04:01
So what we're getting as a result is that users don't care about slow queries (and why would they, if they run them once a day?), and they write inefficient SQL that harms them and harms other users of the system.
2:04:16
What can you do about that? Educate users: tell them how to write efficient SQL, especially in Citus. Impose some restrictions: there are statement timeouts, obviously, and there is also a neat extension to Postgres that prevents users from overriding these statement timeouts. But you can also go even further and implement some sophisticated procedure to kill such problematic queries, for those who try to work around the limits. And you should obviously monitor how users use your analytical warehouse.
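As an illustration of what such a kill procedure might decide, here is a small policy function over rows shaped like the `pg_stat_activity` view (`pid`, `usename`, `state`, `query_start`). The runtime limit, the exempt user list and the sample rows are hypothetical, and the talk does not describe the actual procedure.

```python
from datetime import datetime, timedelta


def pids_to_terminate(activity_rows, now, max_runtime, exempt_users=frozenset()):
    """Pick backends whose active query has run longer than max_runtime."""
    victims = []
    for row in activity_rows:
        if row["state"] != "active" or row["usename"] in exempt_users:
            continue  # idle sessions and service accounts are left alone
        if now - row["query_start"] > max_runtime:
            victims.append(row["pid"])
    return victims


now = datetime(2023, 4, 18, 12, 0, 0)  # arbitrary fixed clock for the example
rows = [
    {"pid": 101, "usename": "analyst", "state": "active", "query_start": now - timedelta(hours=2)},
    {"pid": 102, "usename": "analyst", "state": "idle", "query_start": now - timedelta(hours=5)},
    {"pid": 103, "usename": "etl", "state": "active", "query_start": now - timedelta(hours=3)},
    {"pid": 104, "usename": "analyst", "state": "active", "query_start": now - timedelta(minutes=5)},
]
victims = pids_to_terminate(rows, now, timedelta(hours=1), exempt_users={"etl"})
print(victims)
```

In a real deployment each selected PID would then be passed to `pg_terminate_backend(pid)`, or to `pg_cancel_backend(pid)` for a gentler cancel of just the running query.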
2:05:01
From our experience, there are two primary metrics to monitor. The first one is memory consumption, of course.
2:05:12
Here you can see our Grafana dashboard of memory consumption per PID and per query within the PID. We collect it using just pg_stat_activity and ps; nothing unusual here. It is a little bit outdated, because it does not use global PIDs yet, but we are working on it.
2:05:37
The second is query execution times. You have already seen that nice graph. It is based on the pg_stat_statements extension, but we also collect slow queries directly from the server logs via the COPY statement. Sometimes it is a killer feature.
2:06:01
Now, what optimizations do we have to suggest to users? Citus users should look out for distributed subplans in their queries. They happen when you try to join distributed tables without using the distribution columns, using a CTE, for example. It will lead to a distributed subplan in the query plan, and essentially what it means is that Citus will have to pull all the rows from the CTE to the coordinator and then redistribute them back to the workers. This is called an intermediate result, and by default it is limited to one gigabyte in size, but as you can guess it still harms the performance tremendously. So if you see this in your query plans, you will need to either rewrite the query or redistribute your data.
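A small sketch of how one might scan EXPLAIN output for the distributed subplans mentioned here. It assumes the plan text has already been fetched as a string; the function name and the abbreviated sample plans are illustrative and do not come from the talk.

```python
import re

# Citus labels each pulled-up CTE or subquery in EXPLAIN output with a line
# such as "->  Distributed Subplan 1_1".
SUBPLAN_PATTERN = re.compile(r"Distributed Subplan\s+(\S+)")


def find_distributed_subplans(explain_text: str) -> list[str]:
    """Return the identifiers of all distributed subplans in an EXPLAIN output."""
    return SUBPLAN_PATTERN.findall(explain_text)


# Abbreviated, illustrative plan text (not taken from the talk).
plan_with_subplan = """\
Custom Scan (Citus Adaptive)
  ->  Distributed Subplan 1_1
        ->  Custom Scan (Citus Adaptive)
              Task Count: 32
  Task Count: 1
"""
plan_without_subplan = """\
Custom Scan (Citus Adaptive)
  Task Count: 32
"""

for name, plan in [("with", plan_with_subplan), ("without", plan_without_subplan)]:
    found = find_distributed_subplans(plan)
    print(name, "->", found if found else "no distributed subplans")
```

A check like this could run over collected slow-query plans to flag candidates for a rewrite or for a different distribution column.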
2:06:55
Now, I haven't mentioned it yet, but our data is partitioned, and this is also a source of errors, because we found out that if you apply some transformation to the partition column, Postgres will not use partition pruning and your query performance will suffer. So just leave the partition column alone. And indexes: well, the default advice is, if you have some slow queries, just create indexes if they can be useful.
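To make the "leave the partition column alone" advice concrete, here is a hedged sketch that builds a range predicate on the bare column instead of wrapping the column in a function. The column name `created_at` and the helper name are hypothetical; the talk does not show the actual queries.

```python
from datetime import date, timedelta


def day_range_predicate(column: str, day: date) -> str:
    """Build a WHERE fragment that selects one day without transforming the column.

    The column stays bare on the left-hand side, so the planner can compare it
    against partition bounds. The column name is interpolated only for
    illustration; it must come from trusted code, never from user input.
    """
    next_day = day + timedelta(days=1)
    return (
        f"{column} >= DATE '{day.isoformat()}' "
        f"AND {column} < DATE '{next_day.isoformat()}'"
    )


# Anti-pattern: the partition column is wrapped in a function call, which
# typically prevents pruning unless the table is partitioned by that expression:
#   WHERE date_trunc('day', created_at) = DATE '2023-04-18'
predicate = day_range_predicate("created_at", date(2023, 4, 18))
print("WHERE " + predicate)
```

Comparing the EXPLAIN output of both forms should show whether all partitions or only the matching one are scanned.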
2:07:33
OK, now to the conclusion. Here are the top wanted features to make our solution near ideal. Obviously, it would be [unclear] statistics. Obviously, I would like to have multi-level schemas instead of databases, so that cross-database references would be resolved, essentially.
2:07:50
From Citus, I would like to have sub-partitions, because currently it is not possible to create a partition of a partition, unfortunately, and distributed materialized views. I assume I already described in colors why I would like to have that.
2:08:06
And as final thoughts: Citus is a great extension for Postgres. It delivers exactly what is promised in the docs and on the learning site. But it is an extension, so you should be prepared that there are some edge cases uncovered, there are some integration issues, etc. So, every common issue for an extension that is not a core product.
2:08:30
You can find this presentation here on Speaker Deck, and I am ready for your questions here and on Discord after the talk.
2:08:43
Hooray, hooray! That was pretty interesting, a flashback to so many times I've written things like monitoring and custom implementations like that. It's really good.
2:08:56
I was really excited when you submitted the proposal, when you said you were able to give a livestream version of the talk, not just an on-demand talk, because it's always fun for me to hear how people are using Citus in the real world, in the wild.
2:09:15
Oh yeah, it was my pleasure, because I wanted to let it out from my chest for so many years. This is just the tip of the iceberg, actually, because there is so much more.
2:09:24
Yeah, I was going to say, I noticed on your slides that you said 2017. I mean, that's the before times of so many things. So what was the first version of Citus that you deployed this on?
2:09:38
I assume 7.0.
2:09:42
7? OK, so that's way back. Citus 7.0?
2:09:45
Yeah, that's, yes, 7.0.
2:09:48
I mean, that's way back. You must have been ecstatic when Microsoft fully open sourced Citus, you know, after the purchase and all that.
2:09:56
You have no idea. For us it was like our local celebration.
2:10:03
Yeah, and I imagine, because I've run local Citus before for folks, the shard rebalancer to me was also one of the key pieces. Once that was put out there, it was kind of a game changer for people who wanted to do Citus.
2:10:16
Well, once again, as I said in the presentation, it's not a silver bullet, it has its own issues, but yeah, still, it is so much easier now.
2:10:28
So, for people who are just joining, I just want to clarify: Citus the extension did go open source back in 2016, when it was first unforked and became a Postgres extension, but there were a few enterprise features that were not open source at that point. So the going fully open source, that was, oh gosh, 2021.
2:10:51
I remember that day.
2:10:54
There you go, put it on the calendar. All right, well, thank you so much. I think there are some more questions in the Discord, as those things tend to do, but once again, thanks for speaking. It was awesome.
2:11:08
Thank you. Thank you so much, Ivan. Ciao! We'll see you on the Discord.
2:11:12
Yeah, obviously. Bye.
2:11:15
All right, and who do we have up next?
2:11:19
Pamela Fox, welcome!
2:11:22
Hello.
2:11:26
Hi. Pamela is a cloud advocate on the Python team at Microsoft, and you're here to give us a talk today about using Postgres on Azure using Bicep.
2:11:39
So for those of us who are not at Microsoft, the title Cloud Advocate sounds a little ephemeral or something. Can you explain what that actually does? What do you actually do?
2:11:50
Yeah, so my job is to advocate on behalf of Python developers for the cloud. You could also think of it as advocating on behalf of Microsoft; I guess we kind of advocate on both fronts. But I think it is trying to help Python developers be successful with Azure, whether that's using the Azure SDK, or hosting things on Azure, or other things you can do with Azure. So trying to make it easier for Python developers, because I'm a Python developer and I want it to be better for them.
2:12:14
Well, and as I've been getting to know you, Pamela, it seems like you've been involved in helping people understand technology and understand how to solve problems for a while. Like, you worked at Khan Academy, didn't you?
2:12:27
Yeah, I created the computer programming curriculum for Khan Academy.
2:12:29
Wow.
2:12:31
Which was everything but Python: JavaScript, HTML, CSS, and SQL. That was actually the first time I taught SQL, for Khan Academy, and it was very fun to get into SQL.
2:12:45
Awesome. Well, without further ado, I think you should dive into your talk. Let's go.
2:12:55
All right. Well, hi everyone. Since my job is to make Azure better for Python devs, and Python devs love Postgres, I've been deploying hundreds of Postgres servers for the last year or so that I've been at Microsoft, and I fell in love with using Bicep to deploy them. So I want to share that love with all of you. My talk is for anyone who deploys Postgres servers to Azure, or anyone who's just generally interested in using infrastructure as code as a deployment mechanism for any platform.
2:13:26
So let's get into it. We're going to talk about using Bicep muscles to deploy Postgres servers to Azure.
2:13:40
I just want to start off with a clarification, because there are multiple managed services for Postgres on Azure. There's Azure Database for Postgres Single Server. That was Microsoft's original offering. It's no longer recommended for new apps, so I basically just never look at it.
2:13:57
What they introduced since then is Azure Database for Postgres Flexible Server. This one is another fully managed service, but it offers vertical scaling, and it also stays up to date with the latest Postgres versions.
2:14:10
And there is also Azure Cosmos DB for Postgres, which is Citus. So it's a distributed database using Postgres and the Citus extension, and that can scale horizontally, as you may know if you're watching Citus Con.
2:14:25
I'm going to be focusing today on the Flexible Server offering. Part of that is just due to what was available when I started at Microsoft, but it's also just a good fit for what Python developers tend to use. Everything I'm talking about today should be fairly applicable to the Cosmos DB for Postgres offering as well. And if you're interested in trying to figure out the difference between them, there are some links there where you can read more.
2:14:51
So there are many ways to deploy a Postgres server to Azure, and I have experimented with all those ways, and that's how I came to realize which I like and don't like. So let's look at those ways.
2:15:02
One way is to use the Azure portal. That's where you go to portal.azure.com, you create your free Azure account, create a subscription, and then you say, "OK, I want to create a flexible server." That's going to pop up this nice user interface with a form. One thing that I do really like about the portal is that it does this estimated cost. That is really cool. I would love to have that from the command line, and I've been asking the product teams about that. To me, that is one of the best reasons for using the portal.
2:15:32
So the nice thing about using the portal is that you've got this UI here, it's really easy to get started, you just have to fill in some blanks, and you get this cost estimate as well.
2:15:46
But it's also difficult to replicate. Once I've made something in the portal and then I want to make another similar thing, I have to remember, "Wait, what did I enter in that box?" I can't really remember what I put in there. So once I've used it once, I don't tend to use it again.
2:16:05
Another option is using the CLIs. There's the Azure CLI and then Azure PowerShell. I have used the Azure CLI quite a bit, and it is nice because it can do pretty much everything you might want to do on Azure.
2:16:20
So here we have a command to create a flexible server in a given location, with a given name, with this particular SKU and an admin user and password, and that will go ahead and create that Postgres server.
2:16:33
The nice thing about command-line options is that we can replicate them. So I can say, "Hey, colleague, here's the command that I used to create the server. If you want to create a similar server, use this command."
2:16:49
However, one thing that's not as nice is updating a server after the fact: I actually have to use a different command for updating. So if I start to make updates, now I have to cobble together the sequence of commands that I used, or fold them back into the original command. So it's definitely much better than the portal in terms of repeatability, but it is still difficult to maintain. I'm looking for what I can maintain: what can I come back to six months from now and it'll just work?
2:17:23
So that brings me to ARM templates, and this is something that Azure has had for a while. These are JSON files that declaratively describe the resources that we want to create. This is just a snippet of the file and not the whole file, but it's using JSON to say: okay, the type of resource we're making is a flexible server, we're using this version of the API, this is going to be the name of it, this is the location, this is the SKU. And then we continue on with specifying the properties of that Postgres server.
2:17:56
The nice thing about this is that everything is declared there, and it is something we can repeat. We can share ARM templates with other people, so if you say, "Oh, you want to deploy the same server? Here you go." And if I come back to my project later, I could just use that ARM template and I'd be able to deploy it.
2:18:18
The annoying thing is it's JSON. I love JSON as much as the next former JavaScript programmer, but it is JSON, and JSON is not a full programming language; it's just a JavaScript object. It can be hard to use JSON to describe everything we might want to do. It's not a programming language, so we can't conditionally create some properties based on some aspect of the environment.
2:18:49
A very common thing is to have a different staging environment than your production environment. Your production environment might use a general purpose Postgres server, and the staging and dev environments might use a burstable server, which is much cheaper. Based on those parameters, you want to be able to say: okay, we're going to deploy this mostly the same, except change a few things based on the environment. And that can be difficult to do in just a JSON file.
2:19:19
So that brings us to Bicep. Bicep is the solution to this. Bicep is a declarative language, or we also call it a domain-specific language: it's specifically for deploying resources to Azure. You can also call it an infrastructure-as-code language, so the idea is that you're describing your infrastructure as code.
2:19:46
It's very similar to Terraform, and Terraform's the one that most people are familiar with, because you can use Terraform across all cloud providers. You can use Terraform on Azure as well, so if you do prefer Terraform, you can probably use some of the learnings from this talk but use them for Terraform instead. I do prefer to use Bicep because it is targeted at Azure, so it just tends to be a bit easier, we know it's going to be up to date with Azure, and it's got a lot of tooling that really helps you when you're an Azure developer. But Terraform is also a possibility and would have a lot of the benefits here.
2:20:24
So this is an infrastructure-as-code language: we're going to declare what it is that we're creating and then deploy based on that. Here you can say: okay, we're creating a server, this is the name of the server, the location, the SKU, and here are the properties. This is the entire Bicep for the Postgres server, and this is kind of the minimal Bicep that you would need to deploy a Postgres server.
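The slide itself is not reproduced in the transcript, so the following is only a rough sketch of what a minimal Bicep file for a flexible server can look like. The server name, location, SKU, Postgres version, storage size, admin login, and API version are all assumed placeholder values, not the speaker's.

```bicep
// Minimal sketch: one Azure Database for PostgreSQL flexible server.
// Every literal value below is an illustrative assumption.

// The password is taken as a parameter rather than hard-coded;
// the @secure() decorator is covered a little later in the talk.
@secure()
param adminPassword string

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: 'my-postgres-server' // hypothetical name
  location: 'eastus'
  sku: {
    name: 'Standard_B1ms'
    tier: 'Burstable'
  }
  properties: {
    version: '14'
    administratorLogin: 'pgadmin' // hypothetical admin user
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
  }
}
```

The shape mirrors what the speaker lists: a resource type with an API version, then name, location, SKU, and properties.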
2:20:48
So how do we actually use this Bicep? Oh, well, first let's talk about the fact that Bicep is actually almost a full-featured language. It's not just like JSON: it also has parameters, types, logic, loops, functions, and modules, and that's where you get a really nice ability to refactor things and pass things into different modules. We're going to see examples of most of these throughout the talk, so you can see how it really helps that Bicep does have all these features of a language.
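To make the "logic" feature concrete, here is a small sketch of the dev-versus-prod scenario described earlier: a cheaper burstable SKU everywhere except production. The parameter name, the allowed values, and the specific SKU names are assumptions chosen for illustration.

```bicep
// Sketch: pick the server SKU based on the target environment.
// Parameter name, allowed values, and SKU names are assumptions.
@allowed([
  'dev'
  'staging'
  'prod'
])
param environmentName string = 'dev'

// Ternary expression: general purpose in prod, burstable elsewhere.
var postgresSku = environmentName == 'prod' ? {
  name: 'Standard_D2s_v3'
  tier: 'GeneralPurpose'
} : {
  name: 'Standard_B1ms'
  tier: 'Burstable'
}

// Exposed as an output here only so the choice is visible after deployment;
// in a real file the variable would be assigned to the server's sku property.
output chosenSku object = postgresSku
```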
2:21:26
All right, so let's look at a parameterized Bicep file. This is one of the features I was talking about: parameters. This is really nice because, as I was saying, you might have multiple environments, a dev environment and a prod environment, and you could use parameters for values that vary between those environments. Here I have a parameter for the server name, because we're probably going to want a different server name for those two environments, and a location. You might use different locations: you might have a production location that's closer to your customers and a dev environment that's closer to your devs.
2:21:57
Here I also have a param for the admin password, and you'll notice I also use something called a decorator. There are these decorators that you can add to parameters in order to constrain them, and here I'm using the @secure decorator. That way Bicep knows: hey, this is supposed to be a secure parameter, so don't be logging this, and make sure that this is actually protected to some extent. You should definitely be using secure parameters for things like passwords, but you also should go above and beyond for things like passwords, so we'll talk more about that. So there we go, using params. This is a nice starter file.
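A parameterized starter file along the lines described might look like the sketch below. The three parameters named in the talk are the server name, the location, and the admin password; the default location, the admin login, the SKU, and the other literal values are assumptions.

```bicep
// Sketch of a parameterized file; defaults and literals are assumptions.
param serverName string
param location string = resourceGroup().location

// @secure() marks the value as sensitive so it is kept out of
// deployment logs and deployment history.
@secure()
param adminPassword string

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: serverName
  location: location
  sku: {
    name: 'Standard_B1ms'
    tier: 'Burstable'
  }
  properties: {
    version: '14'
    administratorLogin: 'pgadmin' // hypothetical admin user
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
  }
}
```

With this layout, the dev and prod deployments can share one file and differ only in the parameter values passed in.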
2:22:39
And now, how do I actually use this file? There is a command in the Azure CLI, which is az deployment group create. I tell it what resource group it's going to deploy the Bicep resource to, and I give it the template file. So I say: okay, you're going to deploy to this resource group, and here's the file, and I give it that Bicep file that we just saw. It prompts for any parameters that haven't been specified, so here, since I didn't specify the admin password, it's prompting me for it. I'm just going to type it in. You can also store passwords in a file, or you could fetch them from a Key Vault, so you have various options for how to pass parameters in, but here I'm just going to type it in and not show you.
2:23:23
Then it'll go off and create that Postgres server. It does take a few minutes, because it's literally allocating some space for you in the cloud and carving that out for you, and I do find that Postgres servers take a little longer to create than some of the other Azure resources I create. So it could take a few minutes, and then it comes back, and what it comes back with is actually a bunch of JSON. So we're back to JSON again, right? It shows you a JSON output of everything that got created, and you can look through that output to see properties of it, but typically what I do is just go to the portal and look at it.
2:23:59
You can also specify parameters on the command line; that's another way of specifying parameters.
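The deployment step can be sketched as follows, with the CLI invocations shown as comments at the top of the Bicep file they apply to. The file name, resource group name, and parameter values are hypothetical. The `output` line is an addition not mentioned in the talk: it is one way to surface a single value in the JSON that the command returns, rather than reading through the whole result.

```bicep
// main.bicep (hypothetical file name)
//
// Deploy into an existing resource group. The CLI prompts for any
// parameter that has no value, such as adminPassword here:
//   az deployment group create --resource-group my-rg --template-file main.bicep
//
// Or pass values on the command line instead of being prompted:
//   az deployment group create --resource-group my-rg --template-file main.bicep --parameters serverName=my-dev-pg

param serverName string
param location string = resourceGroup().location

@secure()
param adminPassword string

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: serverName
  location: location
  sku: {
    name: 'Standard_B1ms' // assumed SKU
    tier: 'Burstable'
  }
  properties: {
    version: '14'
    administratorLogin: 'pgadmin' // hypothetical admin user
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
  }
}

// Appears under properties.outputs in the JSON result of the deployment.
output serverHost string = postgresServer.properties.fullyQualifiedDomainName
```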
2:24:05
Once it deploys, I like to go to the portal, just check everything out, and make sure it's looking good. I can go in the portal and see: okay, yeah, it's here, it's got the right server name, it's got everything as expected. I mean, it works, so we don't really have to double-check in the portal, but it's nice to see that it's there, and you can poke around at it from the portal.
2:24:30
Okay, so now let's go more into what we can do with Bicep so that we can customize this Postgres server more, because that was basically the minimal Postgres server, and there's a lot more that we might want to do with it. For the rest of what we want to do, I'm going to use child resources. These are resources that only exist in the scope of a parent resource in a Bicep file. For Postgres, we can make child resources for declaring the administrators, configurations, databases, firewall rules, and migrations.
2:25:06
So let's take a look at making a database. By default, Azure Postgres flexible servers have a database called postgres, and they have system databases named azure_maintenance and azure_sys. Every flexible server will have those databases, but you may want to have a database with a different name. What you can do is create a child resource inside your Postgres server resource in the Bicep file. You can see that here we start with the Postgres server, and then nested inside that we have a database resource, and we give that the name webapp. That's all we need to do for a database child resource. Now when we deploy this, it's actually going to create that database as well as the default ones.
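As a sketch of the nesting just described, a database child resource can be declared inside the server resource like this. The database name `webapp` follows the talk (the exact spelling on the slide is assumed); the server's own values are placeholders.

```bicep
// Sketch: a database declared as a nested child of the server.
// Server values are illustrative assumptions.
@secure()
param adminPassword string

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: 'my-postgres-server' // hypothetical name
  location: resourceGroup().location
  sku: {
    name: 'Standard_B1ms'
    tier: 'Burstable'
  }
  properties: {
    version: '14'
    administratorLogin: 'pgadmin' // hypothetical admin user
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
  }

  // Nested child resource. Only the last type segment is written here;
  // the full type is Microsoft.DBforPostgreSQL/flexibleServers/databases,
  // and the parent and API version come from the enclosing resource.
  resource database 'databases' = {
    name: 'webapp'
  }
}
```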
2:25:55
We can get a little fancier if we want to have multiple databases. I'm going to make a param that's an array, and the array is going to have the names for each of the databases, so we have webapp and analytics. Then I can use a for loop to create multiple child resources. It has this syntax you see here, where you say equals, then the brackets, and the for loop there, and that way I'm actually creating multiple child resources. This is a really common use of arrays and for loops, to make multiple child resources, and we'll see this come up again soon.
2:26:38
this come up again soon so hopefully you're starting to see some of the
2:26:39
you're starting to see some of the
2:26:39
you're starting to see some of the advantages of really having more
2:26:41
advantages of really having more
2:26:41
advantages of really having more language features inside bicep
2:26:45
language features inside bicep
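The loop described here might be sketched as follows. The array default mirrors the two names mentioned in the talk (spelled `webapp` and `analytics` as an assumption); the parameter names, SKU, and API version are hypothetical:

```bicep
// Hypothetical parameters; names are illustrative, not taken from the talk.
param serverName string
param location string = resourceGroup().location
param adminUsername string
@secure()
param adminPassword string
param databaseNames array = [
  'webapp'
  'analytics'
]

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: serverName
  location: location
  sku: {
    name: 'Standard_B1ms' // assumed SKU
    tier: 'Burstable'
  }
  properties: {
    version: '14' // assumed version
    administratorLogin: adminUsername
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
  }

  // "= [for ... in ...: { ... }]" declares one child resource per array element.
  resource database 'databases' = [for name in databaseNames: {
    name: name
  }]
}
```

Adding a third database then only means adding a string to `databaseNames`, for example by overriding the parameter at deployment time.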
2:26:45
So once we've deployed this Postgres server that has databases, we could go to the portal and click on Databases, and now we see that the additional two databases are in there, analytics and web app.
2:26:57
We also see we can connect to those databases. That's something cool you can do: you can actually connect to the databases inside the portal and tinker with them there.
2:27:08
Next we can make firewalls. When you're deploying a Postgres server, you need to figure out what your networking story is going to be. The default is that it's publicly accessible, but it's not accessible to any particular IPs, so it's not really publicly accessible.
2:27:23
So you need to explicitly tell it what IPs you're going to allow it to be accessible from, or explicitly tell it not to be accessible to any IPs.
2:27:31
In this case we're going to create a child resource of firewall rules and tell it to allow Azure internal IPs. The way we do that is to say the start IP address is 0.0.0.0, and then the end IP address is the same, and this will allow any Azure resource to contact that Postgres server.
2:27:56
You have to keep in mind that's going to mean any of your Azure resources and also anybody else's Azure resources. So there's a bit of a security hole here: if there's someone, God forbid, doing something nefarious on Azure who manages to figure out your username and password, they are going to be able to access it from their Azure resource.
2:28:17
So we will talk about using a VNet, which is a better approach.
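A sketch of the firewall rule described above, assuming the server already exists and is referenced by name. The rule name and parameter name are hypothetical; the 0.0.0.0 start and end addresses are the values given in the talk:

```bicep
// Hypothetical parameter; refers to a server deployed elsewhere.
param serverName string

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' existing = {
  name: serverName
}

// Start and end of 0.0.0.0 is the special range that admits traffic
// from resources running inside Azure, including other customers' resources.
resource allowAzureIps 'Microsoft.DBforPostgreSQL/flexibleServers/firewallRules@2022-12-01' = {
  parent: postgresServer
  name: 'allow-all-azure-internal-ips' // assumed rule name
  properties: {
    startIpAddress: '0.0.0.0'
    endIpAddress: '0.0.0.0'
  }
}
```

This version uses the `parent` property instead of nesting, which is an equivalent way to declare a child resource when the parent is defined elsewhere.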
2:28:21
But this is a decent approach to start off with, and we can also allow individual IPs. If we wanted our dev machine to be able to access the server, we can do a firewall rule just for that IP.
2:28:35
So here I've got an array with my IP and my colleague's IP, and I say, okay, let all these IPs in.
2:28:43
So this is a common setup, to allow Azure IPs plus some individual IPs, and you do really need to keep in mind the security implications of doing that.
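The per-IP rules could be sketched with the same loop syntax used for databases. The addresses below are placeholders from the 203.0.113.0/24 documentation range, and the parameter and rule names are hypothetical:

```bicep
// Hypothetical parameters; the addresses are documentation placeholders.
param serverName string
param allowedIps array = [
  '203.0.113.10'
  '203.0.113.25'
]

resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' existing = {
  name: serverName
}

// One firewall rule per address; start and end are equal, so each rule admits a single IP.
resource allowSingleIps 'Microsoft.DBforPostgreSQL/flexibleServers/firewallRules@2022-12-01' = [for ip in allowedIps: {
  parent: postgresServer
  name: 'allow-single-${replace(ip, '.', '-')}'
  properties: {
    startIpAddress: ip
    endIpAddress: ip
  }
}]
```

The `replace` call only exists to build a readable rule name from each address; any unique name per rule would work.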
2:28:53
Once we do that, we can check the portal. What it looks like in the Networking tab is that it says the connectivity method is public access, and that is going to allow public access from any Azure service within Azure to the server, and then it's also going to allow those individual IPs.
2:29:09
So that's what it looks like after deploying that Bicep file.
2:29:15
All right, let's talk about making it more secure by injecting the Postgres server inside a virtual network, together with whatever wants to communicate with it, like a Python web app. The Python web app and the Postgres server can chill in the VNet.
2:29:30
So how do we do that in Bicep? This is where we're going to get a lot bigger with our Bicep, because we do need to create multiple resources. We're going to have to make an Azure virtual network and an Azure private DNS zone, and then also whatever resource actually needs to access the Postgres server.
2:29:51
In this example I'm doing an App Service web app, but you could have an Azure container app, an Azure function, a VM. Whatever it is that needs access is going to want to live inside that VNet as well.
2:30:05
So the first thing we do is create that VNet. We have to tell it its possible private address space, so I gave it a range there, and I'm using the CIDR notation.
2:30:17
Then next I create subnets inside that VNet, and I delegate ranges within that original address space to particular types of Azure services. Here I'm saying that this range gets delegated to flexible servers, so they're going to live in that part of the VNet.
2:30:43
Then I also need a subnet for the other resource. In that case I said server farms, which is actually App Service web apps, so this other, non-overlapping range is going to be dedicated to these server farms.
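A sketch of a virtual network with an address space in CIDR notation and two delegated, non-overlapping subnets. The resource names, address ranges, and API version are assumptions; the two delegation service names correspond to the flexible servers and server farms mentioned in the talk:

```bicep
// Hypothetical names and ranges; any non-overlapping private ranges would do.
param location string = resourceGroup().location
param vnetName string = 'example-vnet'

resource virtualNetwork 'Microsoft.Network/virtualNetworks@2022-07-01' = {
  name: vnetName
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.0.0.0/16' // overall private address space of the VNet
      ]
    }
    subnets: [
      {
        name: 'database-subnet'
        properties: {
          addressPrefix: '10.0.0.0/24'
          delegations: [
            {
              name: 'database-delegation'
              properties: {
                serviceName: 'Microsoft.DBforPostgreSQL/flexibleServers'
              }
            }
          ]
        }
      }
      {
        name: 'webapp-subnet'
        properties: {
          addressPrefix: '10.0.1.0/24' // does not overlap the database subnet
          delegations: [
            {
              name: 'webapp-delegation'
              properties: {
                serviceName: 'Microsoft.Web/serverFarms'
              }
            }
          ]
        }
      }
    ]
  }
}
```

Both subnet prefixes fall inside the VNet's 10.0.0.0/16 space, and each is reserved for exactly one service type through its delegation.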
2:31:00
Now, my App Service web app is going to be publicly accessible, so it doesn't need a private DNS, because I'm going to let anyone hit that up, no problem. But I don't want my Postgres server to be publicly accessible, and I do want to be able to access it via a URL instead of an IP.
2:31:19
So what I do is create a private DNS zone. There are some naming rules here: the name has to end in postgres.database.azure.com, which you see on the second line, but it can't be the exact same as the database. So here I have .private.postgres.database.azure.com, and that satisfies the naming rules for this private DNS zone.
2:31:43
There's more information in the links on this slide, because it is a little tricky.
2:31:49
So I create this private DNS zone and then I link it to the virtual network, and that is what's going to allow the web app to use the URL to actually interact with the server. Despite the fact that the Postgres server isn't technically accessible at that URL, Azure will be able to look it up and realize that there's a private DNS that's doing a mapping to a VNet instead.
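A sketch of the private DNS zone and its link to the virtual network, assuming the VNet already exists. Using the server name as the prefix in front of `.private.postgres.database.azure.com` is an assumption, as are the parameter names, link name, and API versions:

```bicep
// Hypothetical parameters; both refer to resources defined elsewhere.
param serverName string
param vnetName string

resource virtualNetwork 'Microsoft.Network/virtualNetworks@2022-07-01' existing = {
  name: vnetName
}

resource privateDnsZone 'Microsoft.Network/privateDnsZones@2020-06-01' = {
  // Ends in postgres.database.azure.com, but is not identical to the server's own host name.
  name: '${serverName}.private.postgres.database.azure.com'
  location: 'global'

  // Linking the zone to the VNet lets resources inside the VNet resolve names in the zone.
  resource vnetLink 'virtualNetworkLinks' = {
    name: '${serverName}-link'
    location: 'global'
    properties: {
      registrationEnabled: false
      virtualNetwork: {
        id: virtualNetwork.id
      }
    }
  }
}
```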
2:32:15
Now I need to configure both the resources so that they know about the VNet, so they get injected into the VNet, as it's called.
2:32:20
On the Postgres server I add a network property, and I add there the delegated subnet resource ID, which I get from the virtual network resource that was made before. Then I also have the private DNS zone ARM resource ID, and that's what's linking it to the private DNS zone. So that Postgres server has to get connected to both these things, the VNet and the private DNS.
2:32:51
But the web app only needs to be connected to the VNet, so it gets connected with the subnet resource ID as well, in this case using a child resource.
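A sketch of how the two resources might be wired to the VNet. It assumes the VNet, its two subnets, the private DNS zone, and the web app already exist under the hypothetical names shown; the SKU, version, and API versions are also assumptions. The property names `delegatedSubnetResourceId`, `privateDnsZoneArmResourceId`, and `subnetResourceId` are the ones referred to in the talk:

```bicep
// Hypothetical parameters; all names are illustrative.
param serverName string
param location string = resourceGroup().location
param adminUsername string
@secure()
param adminPassword string
param vnetName string
param webAppName string

resource virtualNetwork 'Microsoft.Network/virtualNetworks@2022-07-01' existing = {
  name: vnetName
  resource databaseSubnet 'subnets' existing = {
    name: 'database-subnet' // assumed subnet name
  }
  resource webappSubnet 'subnets' existing = {
    name: 'webapp-subnet' // assumed subnet name
  }
}

resource privateDnsZone 'Microsoft.Network/privateDnsZones@2020-06-01' existing = {
  name: '${serverName}.private.postgres.database.azure.com'
}

resource web 'Microsoft.Web/sites@2022-03-01' existing = {
  name: webAppName
}

// The server is connected to both the delegated subnet and the private DNS zone.
resource postgresServer 'Microsoft.DBforPostgreSQL/flexibleServers@2022-12-01' = {
  name: serverName
  location: location
  sku: {
    name: 'Standard_B1ms' // assumed SKU
    tier: 'Burstable'
  }
  properties: {
    version: '14' // assumed version
    administratorLogin: adminUsername
    administratorLoginPassword: adminPassword
    storage: {
      storageSizeGB: 32
    }
    network: {
      delegatedSubnetResourceId: virtualNetwork::databaseSubnet.id
      privateDnsZoneArmResourceId: privateDnsZone.id
    }
  }
}

// The web app only needs the subnet, attached through a child resource.
resource webappVnetConfig 'Microsoft.Web/sites/networkConfig@2022-03-01' = {
  parent: web
  name: 'virtualNetwork'
  properties: {
    subnetResourceId: virtualNetwork::webappSubnet.id
  }
}
```

The `::` operator reaches a nested resource from outside its parent, which is how the subnet IDs are obtained from the virtual network resource.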
2:33:04
All right, so now we have all the pieces, and there are a lot of pieces here. All this code will be accessible afterwards, so you're welcome to just take it if this is the thing you're trying to do.
2:33:21
All together you end up with a fairly long file that has all these resources in it, and you'll be able to follow the links in these slides when I share them at the end. You're going to have the file with all those resources and also have the web app declared in there, and then we can deploy that using the same command as before.
2:33:43
Once we deploy it, we can check the portal, and now we see we've got four resources deployed. The Postgres server, oh, which I left out of that screenshot, but we'll have a Postgres server, we'll have the App Service, we'll have the virtual network, we'll have a private DNS zone. We also have an App Service plan for the App Service, so you end up having a bunch of resources.
2:34:03
So a VNet is a best practice from a security perspective. It is a little more complicated to set up. Luckily I've done the work for you, so you can take that work.
2:34:15
It also can cost a little bit more, so you can look at the cost for private DNS zone and virtual network, just to see if it's reasonable for you to use that as a security mechanism, but it is definitely recommended from a security perspective.
2:34:32
Okay, so let's talk about tips and tricks for writing Bicep. I've been writing quite a lot of Bicep over the last eight months, and I guess I've gotten a reputation as the Bicep person on my team, or at least the one that loves Bicep. So let's talk about some tips here.
2:34:51
With PostgreSQL, I showed username and password, and I showed VNet. There are other things to consider for security.
2:34:59
One thing that's a more serious approach than username and password is to use managed identity. With managed identity, you give an identity to your Azure server and you say it's a system-assigned identity, and then it kind of creates username and passwords on demand: it generates these tokens and uses that instead. It's nice because you don't have to do secret rotation; it's kind of doing that for you.
2:35:25
So managed identity is generally a best practice. It can be a little trickier to do in Bicep. I did recently work on a sample that uses both Bicep and the Azure CLI in order to do a Postgres with managed identity, so if you're interested in that, check out that sample.
2:35:42
The ultimate is managed identity inside of a VNet, and I think my colleague Aaron is going to be working on that.
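A sketch of only the first step mentioned here, turning on a system-assigned identity, shown on an App Service web app. This is not the sample referred to in the talk: the plan SKU, runtime, names, and API versions are assumptions, and granting the identity access to Postgres (the part the talk says also involves the Azure CLI) is not covered:

```bicep
// Hypothetical parameters; names are illustrative, not taken from the talk.
param location string = resourceGroup().location
param webAppName string
param planName string

resource appServicePlan 'Microsoft.Web/serverfarms@2022-03-01' = {
  name: planName
  location: location
  kind: 'linux'
  sku: {
    name: 'B1' // assumed plan size
  }
  properties: {
    reserved: true // required for Linux plans
  }
}

resource web 'Microsoft.Web/sites@2022-03-01' = {
  name: webAppName
  location: location
  // Azure creates and manages the identity; no password is stored in the template.
  identity: {
    type: 'SystemAssigned'
  }
  properties: {
    serverFarmId: appServicePlan.id
    siteConfig: {
      linuxFxVersion: 'PYTHON|3.11' // assumed runtime
    }
  }
}

// The principal ID is what later steps would use to grant database access.
output webPrincipalId string = web.identity.principalId
```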
2:35:50
Location, location, location. The biggest problem I've had with deploying Postgres servers is location constraints, and this is actually mostly because I'm a Microsoft employee. Microsoft employees do have different location constraints than public folks, so hopefully you all won't run into this. But as a Microsoft employee, if you do run into an error, it might be because you're not permitted to deploy Postgres in that location, and in that case you should change the location. For example, East US 2: Microsoft employees can't provision Postgres in that region.
2:36:22
Hopefully, if you're not a Microsoft employee, you're just not going to run into that. The thing is, we're keeping things more free for the non-Microsoft employees out there, so it's better for you.
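If the template takes its region from a parameter, working around a location error is a one-argument change at deploy time. A minimal sketch using the Azure CLI, assuming a main.bicep that declares a location parameter (the function name, resource group, and region are hypothetical placeholders, not from the talk):

```shell
# Deploy main.bicep into a resource group, overriding its "location" parameter.
# Assumption: the template declares a parameter named "location".
deploy_postgres() {
  resource_group="$1"
  location="$2"
  az deployment group create \
    --resource-group "$resource_group" \
    --template-file main.bicep \
    --parameters location="$location"
}
```

For example, calling deploy_postgres my-postgres-rg westus3 would retry the same template in a different region.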
2:36:34
Another good thing to keep in mind: one of the cool things with Bicep is you think, "Oh, okay, I need to change my server configuration, I'm just going to change my Bicep and redeploy it." You can't always change everything. Postgres actually has a number of constraints in terms of what can't be changed. You can't change the admin username: once it's set, it's set. You can change the password, which is good, but you can't change the admin username.
2:37:00
You can't change the PostgreSQL version, so if you did want to update to Postgres 14, you would need to go through the process of doing a database backup and a restore to a new server. The nice thing is that you could just copy and paste your Bicep and now you've got a very similar server. You don't even need to copy and paste, just change the parameter. You can't change the location. You can't change the networking option, so you should decide from the get-go: do you want to go VNet or do you want to go public?
2:37:28
So, in theory, with most resources that I use with Bicep, you can change things after the fact, but Postgres does have a number of constraints about what can't be changed. So do some experimentation and be confident about your setup beforehand.
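Since the version cannot be changed in place, moving to a newer version means creating a second server (for instance by redeploying the same Bicep with a different name and version parameter) and copying the data across. One way that copy might look with the standard pg_dump and pg_restore tools; the function name, host names, user, and database name are placeholders, and the target database is assumed to already exist on the new server:

```shell
# Copy one database from the old server to the new one.
# Arguments (all hypothetical values you supply): old host, new host,
# admin user, database name. Passwords are prompted for or read from PGPASSWORD.
copy_database() {
  old_host="$1"; new_host="$2"; admin_user="$3"; database="$4"
  # Dump in custom format from the old server, then restore into the
  # same database name on the new server only if the dump succeeded.
  pg_dump --format=custom --host "$old_host" --username "$admin_user" \
    --dbname "$database" --file "$database.dump" &&
  pg_restore --host "$new_host" --username "$admin_user" \
    --dbname "$database" --no-owner "$database.dump"
}
```

The --no-owner flag is a design choice here: it avoids restore errors when role names differ between the two servers.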
2:37:49
All right, other tips. The Bicep team really wants to make it a nice experience to write Bicep, so they've developed an extension for VS Code, which is a very popular IDE that comes from Microsoft. This extension has syntax highlighting, it has snippets, it has autocomplete, it has linting. I really like to use it for the linting, personally. I know other people really like to use it for the snippets; I'm just not a snippets girl. But it's super helpful, so definitely, if you are a VS Code user, install that extension.
2:38:22
You can also do linting at the command line. So you can get linting in the extension, or you can do it at the command line: you can use az bicep build -f and give it the path to the file to lint, and it will go and check it for errors. A really good practice is to use a CI/CD workflow in order to always check for errors. In all of my repos where I have Bicep, I have this GitHub Actions workflow that attempts to run the build command, and if there are errors, it will fail and I'll get told.
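As a sketch of what such a CI step could run: a small shell function that builds every Bicep file in a folder and reports failure if any of them has errors. The infra folder default and the function name are assumptions, not taken from the speaker's workflow:

```shell
# Build (and therefore lint) every .bicep file in a directory.
# Returns non-zero if any file fails to build, so a CI step that
# calls this function fails as well.
lint_bicep_files() {
  dir="${1:-infra}"   # assumed default folder name
  status=0
  for file in "$dir"/*.bicep; do
    [ -e "$file" ] || continue   # skip the literal pattern when nothing matches
    echo "Linting $file"
    # --stdout prints the compiled JSON instead of writing a file; discard it.
    az bicep build --file "$file" --stdout > /dev/null || status=1
  done
  return "$status"
}
```

Errors and linter warnings are still printed to the terminal, because only the compiled template on standard output is discarded.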
2:38:59
Another good practice is to not just look for errors but look for security issues. You could have a Bicep file that's technically valid Bicep, but maybe it's using a username and password when a better approach is to use managed identity. So there's this new action, the Microsoft Security DevOps action, that can analyze templates, and it will let you know about all these potential security errors in your templates.
2:39:27
It's very new, so we have started to add this to our repos. I have it in a few of them, and we're just working through some issues because we have very complicated Bicep in some of ours. But the cool thing is, once you do it, you're going to get this workflow, and it can error out if you want it to error out, and it'll say, "Okay, on this line you're using a password; it would be better if you used managed identity instead."
2:39:58
So you can either have the workflow break when that happens, or you can upload those errors to your code quality tab in GitHub, or in Azure DevOps or whatever, and then you'll be able to monitor them as security issues for your repo. This is another really nice thing to do, because we don't just want to have correct Bicep, we want to have secure Bicep. Well, you want to have secure Azure resources.
2:40:24
So, in the age of Copilot and ChatGPT, I love linters. Any sort of linting I can do, I will do it, because I want to have the confidence that what I'm creating is going to be really successful.
2:40:40
Another thing you could do: here I've showed Postgres, I used Bicep just to make Postgres, but of course you can use Bicep to create all of your Azure resources, and most of the time that's what you're going to do, right? You're going to be making your VMs and App Service and all that stuff. When I do that, I like to use the Azure Developer CLI. This is a new CLI tool to take care of the entire deployment workflow. It requires Bicep files, it also requires a YAML file and a few other things, and it will make it super easy to both create all your resources and also bundle up your code and send it to the server.
2:41:16
That's the part I love about it: it's going to create all my resources in Azure, and then it'll bundle up my container app, do a docker build, and send that to the cloud, or it'll zip up my app code and send that to the cloud. So I love to use the Azure Developer CLI to take care of everything.
2:41:38
It also does CI/CD, it does monitoring, it does a lot. So this is a new CLI, different from the Azure CLI; it's the Azure Developer CLI. It's really, really cool, and I've made 15 templates for it in the last four months or so because I love it so much. You can find app templates that use Postgres in the azd templates gallery. There's a link here, and I've already filtered it by Postgres for you. So if you want to find examples of full applications with their entire Bicep files written, you can look in that azd templates gallery and just start from those existing ones.
2:42:16
I have written mostly Python ones, because that's my thing. But the infrastructure shouldn't be that different between Python and another language like Rust or Go or whatever, so you should be able to take the infra files that I've already written for a particular resource architecture and then use them for whatever language you're using.
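For orientation, a sketch of the basic azd flow when starting from a gallery template. The wrapper function is hypothetical, and the template name is whatever you pick from the gallery:

```shell
# Start a project from an existing azd template, then provision and deploy.
# Argument: the template name chosen from the azd templates gallery.
start_from_template() {
  template="$1"
  # init copies the template (app code plus Bicep files) into the current folder;
  # up then creates the Azure resources and deploys the app code.
  azd init --template "$template" &&
    azd up
}
```

azd also has pipeline config and monitor subcommands for the CI/CD and monitoring pieces.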
2:42:42
Then finally, some more Bicep writing tips, because I do hear a lot that people don't like to write Bicep. It's a new language, and it's always hard when you're writing a new language. I always like to keep the Bicep reference open, so I'll look up whatever I'm currently working on. The PostgreSQL flexible servers Bicep reference has all the properties listed out, so I'll just always have that open in a tab to look at.
2:43:09
It also has examples at the bottom of each page, and it has examples in both ARM and Terraform. So if you watch this but determine you still want to use Terraform, the same reference has Terraform as well.
2:43:24
You can search GitHub for examples using lang:bicep. I use GitHub code search all the time to help me figure out how to do something new, and I love to search for examples using lang:bicep. So I'll do lang:bicep and then whatever particular thing I'm looking for. I've searched for VNet before, just the name of the resource, and I can look at other examples and see whether somebody is doing something similar to what I'm looking for. I think that's a great thing to do for whatever sort of new code you're trying to write.
2:43:54
Another thing you can do is export ARM templates from the portal. If you did use the Portal UI to create a resource, and it exists in the portal and you're trying to figure out how to use Bicep for it, you can go to the portal, go on the bottom to, like, templates, and click "Export template", and you'll get the ARM JSON. Then you can use az bicep decompile to turn that into Bicep, and you can start from there.
2:44:18
The thing to know about that is that it creates a ton: the ARM JSON there will have every possible configuration, and you don't need all of it. A lot of times I'll look at it to see what is in there and then cut it down, because you just need the minimal override configuration. I also have a blog post that talks about this more, so you can read that blog post for more Bicep writing tips.
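The decompile step can be sketched as follows, assuming the exported ARM JSON was saved as template.json (an assumed file name; the function name is also illustrative):

```shell
# Convert an exported ARM JSON template into Bicep.
# Argument: path to the ARM JSON file (defaults to the assumed name template.json).
arm_to_bicep() {
  arm_json="${1:-template.json}"
  az bicep decompile --file "$arm_json"
}
```

The command writes a .bicep file next to the JSON, which you can then trim down to the settings you actually need.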
2:44:48
So this is what I wanted to share with you today. Thank you for watching. I hope you've learned some Bicep and that maybe you want to try it out yourself. You can grab the slides at this aka.ms slash postgres bicep dot slides link, and you can follow all the links in the slides so that you can take all of the Bicep code and hopefully make it easier for you to get started with Bicep.
2:45:15
You can toot at me or tweet at me or GitHub at me or Discord at me, whatever; I'll be in the Discord after as well. Let me know about your experience with Bicep and if you're looking for any particular examples. I'm always looking for excuses to try something out. So yeah, thank you for coming today. That is my talk.
2:45:35
Excellent. I have to say, that was really great. I've learned quite a bit about Bicep. I did not really know that much about Bicep. It seems like it is one of these next-generation tools, like you said, like Terraform or Pulumi, so it falls in that category. So it's really good.
2:45:52
I know we're a bit up on the clock, but I did want to ask you a quick question. If I'm deploying Postgres, I'm probably going to need to deploy PgBouncer with it. Is that easy or is that difficult if I want to do that with Bicep?
2:46:05
Well, PgBouncer is enabled by default on the Postgres server, so you've just got it.
2:46:11
On Azure Database for Postgres, you mean, on Flex Server?
2:46:15
Yeah, sorry, to be very specific, the one that I showed is enabled by default.
2:46:19
Do you know about with Citus?
2:46:23
I imagine it must work pretty similarly then. So it sounds like it'd be pretty easy.
2:46:27
Yeah, so actually, for the ones I showed, it's just already enabled, and you can go into server parameters in the portal to look at all those settings, but it's already enabled.
2:46:34
Okay. And I know Jelte is on the Discord chat, and he's both on the Citus project and a PgBouncer maintainer, so if anybody has any questions about that, they can definitely pop them in on the Discord.
2:46:47
Pamela, thank you so much for being part of Citus Con today. I also learned something about the naming of Bicep, and that there's a connection between Bicep and ARM. I didn't actually connect those dots beforehand.
2:47:02
Yeah, I literally only connected them this morning.
2:47:07
All right, well, thank you so much for being here, appreciate it. And you're going to go to the Discord to answer questions and chit chat? All right, awesome.
2:47:14
Excellent. All right, and with that, let us then bring in Melanie Plageman. She is helping us round out our third period of the show today. So, excellent. How are you doing?
2:47:29
Good, how are you?
2:47:31
I'm doing well, it's been good so far. For people who don't know Melanie, Melanie is a senior software engineer at Microsoft, and she's a Postgres hacker and Postgres contributor.
2:47:44
And just for those who don't pay attention to the Postgres commits, which maybe is a larger number of people than you might expect, she recently contributed pg_stat_io into Postgres 16. Fingers crossed, that will be available later this year. And I guess that's a big part of what you're going to talk about today.
2:48:00
Yep, that is.
2:48:02
Observability and Postgres, and I know people are really excited about pg_stat_io, so let's find out why. Take it away, Melanie.
2:48:13
Yeah, so this talk is going to be great for users that are already comfortable doing some amount of tuning and want to take it to the next level, so that's really who it's targeted at.
2:48:27
So I'm going to talk about the difference between the existing I/O statistics that are available in Postgres and what pg_stat_io can offer in terms of observability.
2:48:40
Just about me: I'm a Postgres hacker working at Microsoft on the open source Postgres team. The last two years I've been very focused on I/O performance and I/O benchmarking, but before that I worked on a lot of the other subsystems in Postgres, like the planner and executor and things like that.
2:49:01
So users have pretty much two main goals for their transactional workload I/O performance: they care about throughput and they care about latency. You want high TPS and low latency, including low tail latency. When you're thinking about performance, those are the things that really matter. You can boil everything down to that.
2:49:21
As for the impediments to that, the most obvious one is if your data is not in shared buffers. If your data is either too large for shared buffers and you don't expect it to fit, or you've misconfigured shared buffers, then you're going to see more I/O, and that can be the cause of a common performance issue.
2:49:47
Another one is issues around WAL I/O and checkpointer misconfiguration. That's a pretty common reason that people are seeing I/O bottlenecks.
2:49:57
And then autovacuum tuning. Autovacuum is notoriously finicky, it really depends on your workload, and the out-of-the-box settings are going to be good for no one, basically. So, for example, your autovacuum is not running frequently enough, or the workers aren't being aggressive enough, that kind of thing.
2:50:18
So as we think about tuning, these are bucketed into different areas. You can tune shared buffers, and there are different Postgres GUCs around the background writer, around checkpointer and WAL tuning, and around autovacuum.
2:50:34
I'm not going to get too much into checkpointer and autovacuum. I'm primarily going to focus on what sort of data we can use to indicate whether we need to make a change to shared buffers or to background writer configuration.
2:50:50
And as we think about tuning, the ideal is that you're data driven, so you want to base tuning decisions on what's actually happening in the database. So how do you get that information?
2:51:05
One option is to use operating system statistics and utilities like iostat and others. There are also different extensions in Postgres that will give you access to some operating system information about I/O.
2:51:20
But the first, most obvious place that you would go is the existing I/O statistics in Postgres. pg_stat_database is going to give you per-database statistics on read and write time, hits and reads. There are the pg_statio views for tables and indexes, there's pg_stat_bgwriter, and many users are using the pg_stat_statements extension, which is going to give you a fair amount of information.
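As a rough illustration of how the per-database hits and reads mentioned here are commonly used, the sketch below computes a shared buffers hit ratio. The argument names mirror the `blks_hit` and `blks_read` columns of `pg_stat_database`; the function name and the counter values are invented for the example, and a read that misses shared buffers may still be served from the operating system cache, so this is only a coarse signal.

```python
from typing import Optional


def shared_buffer_hit_ratio(blks_hit: int, blks_read: int) -> Optional[float]:
    """Fraction of block requests that were found in shared buffers.

    blks_hit and blks_read are cumulative counters; the helper itself is
    hypothetical and not part of Postgres.
    """
    total = blks_hit + blks_read
    if total == 0:
        return None  # no block accesses recorded yet
    return blks_hit / total


# Invented counter values, not taken from the talk.
ratio = shared_buffer_hit_ratio(blks_hit=9_500, blks_read=500)
print(f"{ratio:.2%}")
```

A single ratio like this says nothing about which backend did the I/O or why, which is the kind of gap the talk goes on to describe.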
2:51:49
But we wanted to add pg_stat_io because there are actually substantial gaps in the existing I/O statistics that are in Postgres, and even in those that are in extensions. So what we really needed to do was make changes to where we're collecting statistics: not just what we're revealing and bubbling up, but what we are actually collecting.
2:52:14
I think you can think of three areas, or three gaps, that pg_stat_io is addressing, and we can break those down. I think one of the most important ones is that in all of the existing I/O statistics, writes include flushes and extends. I'm going to talk a little bit more about what that is and why it matters.
2:52:41
Another one is granularity, which is one of the main problems with the existing I/O statistics. All backend types, whether that's checkpointer or background writer or whatever type of backend it is, are all going to be included in the same statistics. You're not going to see it broken down. There are some exceptions to that, but largely you're going to see it all together.
2:53:07
And then you're also going to see I/O for all different contexts together. By context, what I mean is that vacuuming is different from normal transactional workload I/O, and doing a bulk read or doing bulk data loading is a different I/O pattern, a different reason for doing I/O. So you're going to want to address that differently when it comes to tuning.
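To make the granularity point concrete, here is a small sketch comparing one lumped write counter with a breakdown by backend type and by context. The tuples are hypothetical rows loosely shaped like `pg_stat_io` output (backend type, context, writes, extends); the numbers and the helper names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical rows: (backend_type, context, writes, extends).
# All counter values are made up for this example.
rows = [
    ("client backend", "normal", 120, 40),
    ("client backend", "bulkwrite", 300, 900),
    ("autovacuum worker", "vacuum", 75, 0),
    ("checkpointer", "normal", 2000, 0),
    ("background writer", "normal", 150, 0),
]


def combined_writes(rows):
    """One lumped number: writes plus extends, all backends, all contexts."""
    return sum(writes + extends for _, _, writes, extends in rows)


def writes_by(rows, key_index):
    """Sum writes per backend type (key_index=0) or per context (key_index=1)."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


print(combined_writes(rows))
print(writes_by(rows, 0))
print(writes_by(rows, 1))
```

The lumped total cannot tell you whether the writes came from the checkpointer, from client backends, or from a bulk load, while the two breakdowns can.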
2:53:35
So this is just a snapshot, probably from a little bit after I did an initdb, so you can see a lot of it is zero, but this is what pg_stat_io looks like. You can see that we actually have read, write and fsync time, so that's cool, we can see the timing.
2:53:53
And then we have some specific columns that are relevant for buffer access strategies, which I'll talk about a little bit later, like reuses. So that's just to give you a little picture of it.
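One way to use the timing columns mentioned here is to turn a cumulative count and a cumulative time into an average per operation. This is a minimal sketch that assumes the time values are cumulative milliseconds and that timing collection is enabled on the server; the helper name and the sample numbers are hypothetical.

```python
from typing import Optional


def avg_ms_per_op(op_count: int, op_time_ms: float) -> Optional[float]:
    """Average milliseconds per operation from two cumulative counters."""
    if op_count == 0:
        return None  # nothing happened yet, for example right after a reset
    return op_time_ms / op_count


# Invented values standing in for a reads / read time pair from one row.
print(avg_ms_per_op(op_count=200, op_time_ms=50.0))
```

Because the counters are cumulative, comparing two snapshots taken some time apart gives a better picture than a single reading.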
2:54:10
picture of it um so let's go back to what I said about
2:54:12
um so let's go back to what I said about
2:54:12
um so let's go back to what I said about what are the gaps right so why do we
2:54:15
what are the gaps right so why do we
2:54:15
what are the gaps right so why do we care about counting flushes and extends
2:54:17
care about counting flushes and extends
2:54:17
care about counting flushes and extends separately what is a flush so in PG
2:54:20
separately what is a flush so in PG
2:54:20
separately what is a flush so in PG statio we call flushes rights because
2:54:22
statio we call flushes rights because
2:54:22
statio we call flushes rights because we're able to distinguish rights from
2:54:24
we're able to distinguish rights from
2:54:24
we're able to distinguish rights from extends
2:54:25
extends um and the what in order to kind of talk
2:54:29
um and the what in order to kind of talk
2:54:29
um and the what in order to kind of talk about this what I want to do is walk you
2:54:32
about this what I want to do is walk you
2:54:32
about this what I want to do is walk you through from an internals perspective
2:54:34
through from an internals perspective
2:54:34
through from an internals perspective the update or insert the some of the
2:54:37
the update or insert the some of the
2:54:37
the update or insert the some of the steps are pretty similar workflow in
2:54:40
steps are pretty similar workflow in
2:54:40
steps are pretty similar workflow in postgres to give you an idea of like
2:54:42
postgres to give you an idea of like
2:54:42
postgres to give you an idea of like what is different about an extend and
2:54:44
what is different about an extend and
2:54:44
what is different about an extend and why does it matter if it's separate and
2:54:46
why does it matter if it's separate and
2:54:46
why does it matter if it's separate and what does that tell us so when you are
2:54:49
what does that tell us so when you are
2:54:49
what does that tell us so when you are let's say you're going to just do an
2:54:50
insert, to insert some data. First you have to find a place to put it. The file for your relation is going to have blocks in it, so you need to find a block that has some space available. That could be in the middle, it could be at the end; you're just looking for a block that has space for the data you're going to add. If there isn't a block that already has space, you need to extend the file to add another block to it. And if you think about it, this is inevitable. You can't avoid it: if you have more data, eventually you're going to have to make the file bigger, right?
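As a rough illustration of that first step (find a block with room, otherwise extend the file), here is a toy Python sketch. It is not how PostgreSQL implements it; the `Relation` class, its fields, and the 8192-byte block size are assumptions made up for the example.

```python
from dataclasses import dataclass, field

BLOCK_SIZE = 8192  # assumed block size in bytes, for illustration only


@dataclass
class Relation:
    """Toy relation file: just the free bytes left in each block."""
    free_bytes: list = field(default_factory=list)
    extends: int = 0  # how many times the file had to grow

    def block_for_insert(self, tuple_size: int) -> int:
        """Return a block number with room for the tuple, extending if needed."""
        for block_no, free in enumerate(self.free_bytes):
            if free >= tuple_size:
                break
        else:
            # No existing block has room: add a new block at the end of the file.
            self.free_bytes.append(BLOCK_SIZE)
            self.extends += 1
            block_no = len(self.free_bytes) - 1
        self.free_bytes[block_no] -= tuple_size
        return block_no


rel = Relation()
placements = [rel.block_for_insert(3000) for _ in range(5)]
print(placements, rel.extends)
```

With five 3000-byte tuples, two fit in each block, so the toy file is extended three times. More data eventually means a bigger file, whatever the buffer settings are.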
2:55:31
Now that you've identified which block in the file you want to add your data to, you have to get it into shared buffers. First you check: is it already in shared buffers? If it is, that's a cache hit, a shared buffers hit, and we're done. That counts as a hit; we don't need to do the read, and we can just add our data.
2:55:55
However, if the block is not already in shared buffers, then we need to find a shared buffer to put it in. In looking for one, we might find a buffer that is actually dirty. It has unrelated data in it, not necessarily data from our table, but we can't just throw it away, because then we would lose that data. So before we can use that buffer, we have to write that data out. That's a flush, and it will count as a write in pg_stat_io, and you can see that here. If that buffer had already been free or clean, if it hadn't had dirty data in it, then we wouldn't have needed to do this write. So it's an avoidable write.
2:56:43
Finally, we read our block into the shared buffer, and now we have it and can do our insert. This write is not a write out to kernel buffers or to disk; it is basically just copying our dirty data, our tuple, into that buffer.
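The hit, read, and flush steps described above can be sketched as a toy model. This is a simplified illustration, not PostgreSQL's buffer manager: the `ToyBufferPool` class is hypothetical, the victim is simply the oldest entry, and every buffer is dirty because the toy workload only inserts.

```python
from collections import Counter


class ToyBufferPool:
    """Tiny model of shared buffers: block id -> dirty flag, in insertion order."""

    def __init__(self, n_buffers: int):
        self.n_buffers = n_buffers
        self.buffers = {}        # block id -> True if dirty
        self.stats = Counter()   # hits, reads, writes

    def insert_into(self, block) -> None:
        if block in self.buffers:
            self.stats["hits"] += 1            # already cached: no read needed
        else:
            if len(self.buffers) >= self.n_buffers:
                # Pick a victim (oldest entry here; the real policy differs).
                victim = next(iter(self.buffers))
                if self.buffers.pop(victim):
                    self.stats["writes"] += 1  # dirty victim is flushed first
            self.stats["reads"] += 1           # bring our block into the buffer
        self.buffers[block] = True             # adding the tuple dirties the buffer


pool = ToyBufferPool(n_buffers=2)
for blk in ["a", "a", "b", "c", "a"]:
    pool.insert_into(blk)
print(dict(pool.stats))
```

With only two buffers, the toy run ends with one hit, four reads, and two writes. Both writes are flushes of someone else's dirty block, the avoidable kind.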
2:57:02
So what does this tell us about why it's important to count flushes and extends separately? What we saw was that the flush of the dirty data from an unrelated source was actually avoidable: if we had had a clean buffer, we wouldn't have needed to do it. The extend, on the other hand, is unavoidable. We're going to have to do extends as the file gets bigger; it's a consequence of our workload, not cleaning up someone else's mess, basically. So by separating them, we allow ourselves to understand whether or not we actually have to tune something. We could see a lot of writes and say, oh, we probably need to increase shared buffers, when actually we're just doing a lot of extends. This is something you might see, for example, if you're doing a bulk write. If you're doing a COPY FROM, you might see a ton of writes in existing I/O statistics, but actually these are just COPY FROM doing a bunch of extends, and this is not even data that's necessarily part of the working set for your transactional workload. So tuning for that is going to be disadvantageous.
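To make the flush-versus-extend distinction concrete, here is a small sketch that splits write-like operations into the two kinds. The helper name `extend_share` and the counter values are made up for illustration; they are not taken from the talk's slides.

```python
def extend_share(writes: int, extends: int) -> float:
    """Fraction of write-like operations that were extends rather than flushes."""
    total = writes + extends
    return extends / total if total else 0.0


# Hypothetical counters for a bulk load; the numbers are invented.
bulk_load = {"writes": 50, "extends": 950}
share = extend_share(**bulk_load)
print(f"{share:.0%} of write-like operations were extends")
```

If the two kinds were lumped together, this load would look like 1000 writes. Split apart, it is mostly file growth, which a larger shared buffers setting would not avoid.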
2:58:12
I also mentioned earlier that we want to separate out I/O by the context in which it's done and by backend type. To talk about why that's important, I'm going to use the autovacuum I/O workflow, roughly, as an example. Autovacuum (and vacuum, but let's talk specifically about autovacuum) identifies the blocks that it needs to vacuum and then goes through and does each one. That might not be all the blocks in the relation, of course. So it identifies the next block to vacuum, and then it has to see if that block is in shared buffers. If it is in shared buffers, autovacuum can vacuum it, and that's a cache hit, right? We didn't have to actually do I/O.
2:59:01
Now this is where it gets different from the regular update and insert workflow we were talking about. If we actually need to do I/O, to go out and read in the block that we're going to vacuum, we need to find a shared buffer for it. That's the same as before. The difference here is that vacuum uses a buffer access strategy. What that means is that we use a ring buffer with a certain number of buffers in it, and each time we need a new buffer, we go around that ring and reuse buffers we've already used to vacuum previous blocks. This keeps us from using up all of shared buffers just to vacuum, because that would wash all of our working set out of shared buffers, and we don't want that. The important thing is that when we're initially finding buffers to use, we do this lazily, on demand. We get a shared buffer and evict it, which counts as an eviction in pg_stat_io, and then we add it to the ring.
3:00:06
Once we've filled up the ring, the next time we need a shared buffer we reuse a buffer that we've already used, and that counts as a reuse in pg_stat_io. When we reuse that buffer, we need to flush the dirty data that was already there, and we need to write out the associated WAL up to the LSN associated with the data that we vacuumed. What's important here is that even if there are clean shared buffers, we're not going to go and use them for the data we're vacuuming, because we're using this buffer access strategy. Then we read the block that we were trying to vacuum to begin with into the buffer we've selected, vacuum it, and mark the buffer dirty.
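The ring behaviour described above can be sketched with a toy model. It is a simplification under stated assumptions, not the server's code: the `ToyVacuumRing` class is hypothetical, every block is assumed to miss shared buffers, every vacuumed block is assumed to end up dirty, and WAL flushing is left out.

```python
from collections import Counter


class ToyVacuumRing:
    """Toy buffer access strategy: a fixed-size ring filled lazily, then reused."""

    def __init__(self, ring_size: int):
        self.ring_size = ring_size
        self.ring = []           # blocks currently held by ring slots
        self.next_slot = 0
        self.stats = Counter()   # evictions, reuses, reads, writes

    def vacuum_block(self, block) -> None:
        if len(self.ring) < self.ring_size:
            # Ring not full yet: take a buffer from shared buffers and evict it.
            self.stats["evictions"] += 1
            self.ring.append(block)
        else:
            # Ring is full: go around it and reuse a buffer we used earlier.
            self.stats["reuses"] += 1
            self.stats["writes"] += 1   # the earlier vacuumed block is dirty
            self.ring[self.next_slot] = block
            self.next_slot = (self.next_slot + 1) % self.ring_size
        self.stats["reads"] += 1        # read the block to vacuum into the buffer


ring = ToyVacuumRing(ring_size=4)
for block_no in range(10):
    ring.vacuum_block(block_no)
print(dict(ring.stats))
```

Vacuuming ten blocks through a four-slot ring gives four evictions while the ring fills, then six reuses, each paired with a write. The rest of the pool is never touched.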
3:00:59
So what does this mean about why pg_stat_io is useful? Autovacuum is done by autovacuum workers, and an autovacuum worker is a different backend type than a client backend. So we're able to see the I/O that's being done by autovacuum, and its pattern, separately in pg_stat_io. What you saw was that we're using this buffer access strategy, which is a separate pattern and is going to show up separately from the normal transactional workload's working set. We may be vacuuming relations that are not part of our working set; it's not the hottest data, and that's okay. We don't necessarily want to tune our database to accommodate what we need to vacuum. It's okay to vacuum older data, and we don't necessarily need to resize shared buffers so that it all fits in shared buffers.
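A small sketch of why the split by backend type and context helps when reading the numbers. The rows below are hypothetical and only shaped like a few pg_stat_io columns; the values are invented for the example.

```python
from collections import defaultdict

# Hypothetical rows shaped like a few pg_stat_io columns; all numbers are made up.
rows = [
    {"backend_type": "client backend", "context": "normal", "reads": 1200, "writes": 30},
    {"backend_type": "autovacuum worker", "context": "vacuum", "reads": 4000, "writes": 3500},
    {"backend_type": "autovacuum worker", "context": "normal", "reads": 20, "writes": 0},
]

reads_by_backend = defaultdict(int)
for row in rows:
    reads_by_backend[row["backend_type"]] += row["reads"]

# Rows for client backends in the normal context are roughly the ones that
# describe the transactional working set.
transactional = [
    r for r in rows
    if r["backend_type"] == "client backend" and r["context"] == "normal"
]
print(dict(reads_by_backend))
print(transactional[0]["reads"])
```

Summed together, these rows would suggest a read-heavy system. Split apart, most of the reads belong to autovacuum, which is not something to size shared buffers around.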
3:02:01
Similarly, context matters. Vacuum is also an I/O context, so we can see that separated out. But another example of that is a bulk read. For example, if you do a large SELECT, which is basically one on a relation where the number of blocks is greater than shared buffers divided by four, we're going to be doing a lot of reads, and that data is not necessarily part of our working set. So we don't necessarily want to resize shared buffers so that that giant table can fit.
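The size rule mentioned above (more blocks than shared buffers divided by four) can be written down as a one-line check. The function name and the 128 MB example setting are assumptions for illustration only.

```python
def uses_bulk_read(relation_blocks: int, shared_buffers_blocks: int) -> bool:
    """Rule from the talk: a scan counts as a bulk read when the relation has
    more blocks than shared buffers divided by four."""
    return relation_blocks > shared_buffers_blocks // 4


# Assumed setup: 128 MB of shared buffers with 8 kB blocks is 16384 buffers.
shared_buffers_blocks = 128 * 1024 // 8
threshold = shared_buffers_blocks // 4
print(threshold)
print(uses_bulk_read(5000, shared_buffers_blocks))
print(uses_bulk_read(3000, shared_buffers_blocks))
```

Under that assumed setting the threshold is 4096 blocks, so a 5000-block relation is treated as a bulk read and a 3000-block one is not.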
3:02:40
that giant table can fit it you might
3:02:40
that giant table can fit it you might have a mixed workload with transactional
3:02:41
have a mixed workload with transactional
3:02:41
have a mixed workload with transactional queries alongside some type of
3:02:44
queries alongside some type of
3:02:44
queries alongside some type of analytical work and so what we're trying
3:02:47
analytical work and so what we're trying
3:02:47
analytical work and so what we're trying to do is let's tune specifically for our
3:02:49
to do is let's tune specifically for our
3:02:49
to do is let's tune specifically for our transactional workload for that um that
3:02:53
transactional workload for that um that
3:02:53
transactional workload for that um that working set to fit in memory and to make
3:02:54
working set to fit in memory and to make
3:02:54
working set to fit in memory and to make that workload efficient and then you can
3:02:58
that workload efficient and then you can
3:02:58
that workload efficient and then you can separately think about tuning for your
3:02:59
separately think about tuning for your
3:03:00
separately think about tuning for your analytical workload and make different
3:03:01
analytical workload and make different
3:03:01
analytical workload and make different considerations if you know that a large
3:03:04
considerations if you know that a large
3:03:04
considerations if you know that a large part of your workload is not going to
3:03:05
part of your workload is not going to
3:03:05
part of your workload is not going to fit in shared buffers you might want to
3:03:08
fit in shared buffers you might want to
3:03:08
fit in shared buffers you might want to actually decrease sharp buffers because
3:03:10
actually decrease sharp buffers because
3:03:10
actually decrease sharp buffers because there are certain advantages to that and
3:03:12
there are certain advantages to that and
3:03:12
there are certain advantages to that and I won't go into that more now but
3:03:16
I won't go into that more now but
3:03:16
I won't go into that more now but now given what I just talked about I
3:03:18
now given what I just talked about I
3:03:18
now given what I just talked about I think it's useful to kind of take it
3:03:20
think it's useful to kind of take it
3:03:20
think it's useful to kind of take it back to PG statio and look at a few
3:03:24
back to PG statio and look at a few
3:03:24
back to PG statio and look at a few concrete examples and how it looks in
3:03:26
concrete examples and how it looks in
3:03:26
concrete examples and how it looks in The View
3:03:31
So I just mentioned that sometimes your workload is not going to fit in shared buffers. That's just a reality; it might not even fit in memory. You have a large database, and that's pretty normal. In that case it might not be an option for you to increase shared buffers so that everything fits. But one thing you do want to watch out for is that you still want to avoid client backends doing their own writes as much as possible. In this example you can see that the number of writes for client backend, normal context, relation is decently high, and we actually want that number to basically be zero. This is not a write in the sense of an insert, update, or whatever. This is actually the client looking for a buffer, not being able to find a clean one, and having to flush the data that's in that buffer. So we want client backends to be doing zero writes. We want the checkpointer and the background writer to be taking care of this for them, so that whenever you do a read, you're not doing a write first.
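One way to read that guidance is to look at who is doing the writes. The sketch below uses invented write counts per backend type; it is a reading aid, not output from a real system.

```python
# Hypothetical write counts per backend type for relation I/O; numbers are made up.
writes = {
    "client backend": 4200,
    "background writer": 900,
    "checkpointer": 2500,
}

total = sum(writes.values())
client_share = writes["client backend"] / total if total else 0.0
print(f"client backends did {client_share:.0%} of the writes")
if writes["client backend"] > 0:
    print("client backends are flushing dirty buffers themselves")
```

In this made-up case client backends do more than half of the flushing themselves, which is the situation the talk says to avoid.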
3:04:44
In this example we're seeing that the background writer is doing some writes, but maybe it could be doing more, so we have some options there. The background writer has historically had a throughput cap that's relatively low, so if you have a really fast, high-throughput workload, then maybe the background writer is not going to be able to help you. But you can decrease bgwriter_delay and you can increase bgwriter_lru_maxpages. I won't get into the specific mechanics of those GUCs, but there are articles online you can look at on background writer tuning, that kind of thing.
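A back-of-the-envelope sketch of why that cap is low. It assumes the simplest possible model: one round per `bgwriter_delay`, at most `bgwriter_lru_maxpages` blocks per round, and 8 kB blocks. The first call uses PostgreSQL's default values (200 ms and 100 pages); the second set of values is arbitrary and not a recommendation.

```python
def bgwriter_max_bytes_per_second(delay_ms: int, lru_maxpages: int,
                                  block_size: int = 8192) -> float:
    """Rough upper bound: at most lru_maxpages blocks per round, one round per delay."""
    rounds_per_second = 1000 / delay_ms
    return rounds_per_second * lru_maxpages * block_size


MIB = 1024 * 1024
default_cap = bgwriter_max_bytes_per_second(delay_ms=200, lru_maxpages=100)
tuned_cap = bgwriter_max_bytes_per_second(delay_ms=50, lru_maxpages=400)
print(f"default: {default_cap / MIB:.1f} MiB/s, tuned: {tuned_cap / MIB:.1f} MiB/s")
```

Under this model the defaults allow roughly 4 MiB per second of background writes, which a busy workload can easily outrun. Real behaviour also depends on other settings and on how long each round takes.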
3:05:25
And then a lot of times, though, let's take this case: shared buffers is too small. If you have the ability to change shared buffers (of course, you need to restart and everything), a lot of times that's going to help with your I/O problem. In this example, in contrast, if you look at the number of reads for client backend in the normal context and the number of writes for client backend in the normal context, there's almost one write for every read. That's a bad sign. It means that most likely we are evicting and re-reading the same blocks over and over. And you see evictions are high. After the first time through, once we've gotten everything off the free list and used every buffer at least once, everything after that is going to be an eviction. But in this case, because we're seeing reads and writes be very proportional, that's a hint, or a signal, that it may be necessary to increase shared buffers: our primary working set just isn't fitting. You can see that with our cache hit ratio too. It's around 60 when we look at the client backend normal hits and reads.
3:06:42
back in normal hits and reads and just a simple cachet ratio query for
3:06:47
and just a simple cachet ratio query for
3:06:47
and just a simple cachet ratio query for client Norm backends uh doing things in
3:06:50
client Norm backends uh doing things in
3:06:50
client Norm backends uh doing things in the normal context with permanent
3:06:52
the normal context with permanent
3:06:52
the normal context with permanent relations you can calculate the cachet
3:06:54
relations you can calculate the cachet
3:06:54
relations you can calculate the cachet ratio this way
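The cache hit ratio described here can be sketched as follows. This is a minimal illustration, not the query from the slides: the column and value names (`hits`, `reads`, `backend_type`, `object`, `context`) are those of the PostgreSQL 16 `pg_stat_io` view, while `HIT_RATIO_SQL` and `hit_ratio` are names made up for this example, and the counts passed in are illustrative only.

```python
# Sketch only: HIT_RATIO_SQL and hit_ratio are hypothetical names.
# The SQL targets the pg_stat_io view introduced in PostgreSQL 16.
HIT_RATIO_SQL = """
SELECT hits, reads
FROM pg_stat_io
WHERE backend_type = 'client backend'
  AND object = 'relation'   -- permanent relations, not 'temp relation'
  AND context = 'normal';
"""


def hit_ratio(hits: int, reads: int) -> float:
    """Fraction of block requests served from shared buffers."""
    total = hits + reads
    # No traffic yet: report 0.0 instead of dividing by zero.
    return hits / total if total else 0.0


# Illustrative counts, chosen to land near the ratio mentioned in the talk.
print(f"{hit_ratio(hits=600, reads=400):.0%}")
```

Keeping the arithmetic in a small function makes it easy to apply the same formula to any row of the view, which is what the filtered-versus-unfiltered comparison later in the talk relies on.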
3:06:56
On the other hand, you want to avoid premature optimization. One of the things that was really hard before pg_stat_io was actually calculating your cache hit ratio if you have any types of I/O going on other than your very standard transactional workload. In this example, you can see that client backend normal-context reads are pretty low: it looks like we read in our working set, and then the hits are really high, so we just keep reusing those blocks, and everything seems like it's going okay in terms of that. But you can tell that the number of client backend reads in the bulkread context is really high, and we also have a fair number of autovacuum reads. If those were all lumped together, not separated by backend type and context, we would actually calculate a pretty incorrect cache hit ratio, and I'll show you what that would look like.
3:07:57
If we use pg_stat_database to calculate the cache hit ratio here, we get about 45%, and that's similar to what you get if you use pg_stat_io without a WHERE clause. But what we care about is tuning for our regular workload. Once we add the WHERE clause (the backend type is a client backend, it's in the normal context, and it's permanent relations, no temp relations), the cache hit ratio is basically 99%. We definitely don't need to increase shared_buffers for this. So I think pg_stat_io gives you the additional information that you need to avoid these kinds of premature optimizations.
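To see why the unfiltered number misleads, here is a small sketch with made-up counts. The rows mimic the shape of `pg_stat_io` (backend type, object, context, hits, reads); the figures are invented so that the aggregate lands near 45% and the filtered ratio near 99%, as in the talk, and `IoRow`, `ratio`, and `is_regular_workload` are hypothetical helper names rather than anything from Postgres.

```python
from typing import Callable, NamedTuple


class IoRow(NamedTuple):
    backend_type: str
    object: str
    context: str
    hits: int
    reads: int


# Invented counts shaped like pg_stat_io rows; not real measurements.
ROWS = [
    IoRow("client backend", "relation", "normal", hits=9_900, reads=100),
    IoRow("client backend", "relation", "bulkread", hits=0, reads=10_000),
    IoRow("autovacuum worker", "relation", "vacuum", hits=100, reads=2_000),
]


def ratio(rows: list[IoRow], keep: Callable[[IoRow], bool] = lambda r: True) -> float:
    """Cache hit ratio over the rows accepted by `keep`."""
    kept = [r for r in rows if keep(r)]
    hits = sum(r.hits for r in kept)
    reads = sum(r.reads for r in kept)
    total = hits + reads
    return hits / total if total else 0.0


def is_regular_workload(r: IoRow) -> bool:
    """Stand-in for the WHERE clause: client backend, normal context, permanent relations."""
    return (r.backend_type, r.object, r.context) == ("client backend", "relation", "normal")


unfiltered = ratio(ROWS)                     # everything lumped together
filtered = ratio(ROWS, is_regular_workload)  # regular workload only
print(f"unfiltered={unfiltered:.1%} filtered={filtered:.1%}")
```

The bulk reads and autovacuum reads add many reads and few hits, which drags the aggregate down even though the regular workload is served almost entirely from shared buffers.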
3:08:42
So what's next for pg_stat_io? Hopefully, in Postgres 17, we're going to add a category of missing I/O that I think would be really good to have, which is all the I/O that's outside of shared buffers and outside of local buffers. There are certain operations, like CREATE INDEX, that do a lot of I/O sort of directly; we're calling it bypass I/O. Also per-connection I/O stats, so taking pg_stat_io and separating it out per connection. And then integrating the information that's in pg_stat_wal into pg_stat_io and streamlining it, so there's a single source of truth for your I/O information.
3:09:26
I will be answering questions on Discord, so I'd love to hear about your use cases and your questions. I'm just really excited for people to use the view in Postgres 16 and tell me what they think.
3:09:42
That was wonderful. Thank you, Melanie, for being here. I have a question for you: for someone who's really familiar with pg_stat_statements, when should they think about using pg_stat_io versus pg_stat_statements?
3:09:55
Right, so pg_stat_statements is still going to be your go-to when you care about per-statement, per-query information. If you already know that a particular query is really slow, then you want to use pg_stat_statements for sure.
3:10:13
Okay, so the other thing I think people always wonder about with I/O-related tooling in Postgres, and specifically in this case: once this is out there, presumably I'll see something if I'm on a cloud provider? I'll be able to actually get some data? Because the underlying stuff is often hard to understand, right?
3:10:30
Yeah, so that's the great thing about pg_stat_io: on any cloud provider, unless they restrict it for some reason, you'll be able to see it. We have put all the collection points in a place where it doesn't matter what the underlying storage type is or anything like that, so it'll just work on any cloud provider.
3:10:51
Awesome.
3:10:53
So I know you just committed pg_stat_io to Postgres 16. I'm just curious what's next for you and Postgres. Are you going to continue to work in observability, or...?
3:11:02
I think that one of the things that I'm really excited about is that Postgres 15 actually changed the underlying statistics system to use shared memory. The way that it is now, it's much more reliable, and adding new statistics is a lot easier and sort of straightforward. So what I'd really love to see is users who are not necessarily full-time Postgres contributors, but who know what their use cases are, adding new statistics. I'm excited to review more patches around observability. I know some of the folks at Dalibo proposed some exciting new statistics around parallel query for Postgres 17. Bilal, one of my co-workers here, who did a Citus Con talk on CI, is thinking about working on some of the open items that I listed. So I think the goal is to have people who are using Postgres every single day, in the trenches, adding the new observability statistics.
3:12:14
Awesome. And people can reach you, obviously, today during the event on Discord if they have questions. More generally, if someone watches this talk on YouTube in three months and they have an idea for some statistics or feedback on observability, how do they reach you?
3:12:25
Well, you can tweet at me; I'm not that prolific, so I will probably answer. But I think the best thing to do would be the hackers mailing list (pgsql-hackers): you can either email me directly or send an email to the hackers mailing list and propose your idea. Don't be afraid to email the hackers mailing list, because you have a higher chance that people respond. If you want help composing your email, because I know it can be intimidating, I'm happy to help people with formatting and wording.
3:13:00
Awesome. Well, I'm so glad that you had this time on your schedule and the ability to come and give the talk today. Thank you very much, Melanie.
3:13:08
Thank you.
3:13:10
I also wanted to say, for people who have feedback on your talk or any of the other talks: we've been popping a banner up with the aka.ms URL to fill out the attendee survey. You can fill it out multiple times, for the talks you've seen in today's livestream, tomorrow's EMEA livestream, and the on-demand talks. It's open until next Friday, the 28th or something.
3:13:29
Cool.
3:13:33
All right, so I've also been popping sticker and swag bag codes in the banners during the livestream. Those of you who care about swag, definitely pay attention to those banners as they pop across on each talk, so you can make sure to try to get your swag bag or your sticker pack. I think it's time for our sixth speaker today.
3:13:56
I believe it is.
3:13:59
Hi. Welcome.
3:14:01
[crosstalk] Yes, correctly.
3:14:06
Absolutely. For those who don't know Andrey, he's actually, I guess, a longtime Postgres hacker; I guess we'll put you in that category now. He has worked on a whole bunch of different parts of the Postgres project. His most recent patch (I had a blog post on this a couple of years ago saying there should be a better way, so now there is) was adding iteration counts for the \watch command in psql, which sounds like a small thing, but it's a thing people need to do and there was really no good way to do it.
3:14:30
Yeah, and this feature, actually: we had a live session discussing which features we could add, and he proposed, let's do this, and we hacked it online and submitted it to the Postgres hackers list, and it was committed. It was cool.
3:14:55
Awesome. A couple of other things: you've done a bunch of work on indexes over the years, and you are one of the maintainers of WAL-G, is that correct?
3:15:01
Yes. It was started by Dan Farina, but I later joined the team and help organize all the stuff and keep it moving.
3:15:13
So both of those areas deal with compression, there are a lot of other parts of compression in Postgres, and I think that's our topic for today, so let's...
3:15:22
I just want to chime in: when WAL-G was first created, Dan Farina was absolutely the mentor and the advisor, but it was actually written by a summer intern at Citus Data named Katie Li, who I believe was a UC Berkeley college student at the time. So, yeah, I just wanted to throw that shout-out to her. All right, Andrey, over to you.
3:15:46
So, hi everyone, my name is Andrey. I'm doing Postgres every day (this sounds funny, yeah), and I'm going to continue Melanie's talk with some more Postgres internals. This talk will be interesting to DBAs and software developers who want to enable new features that are already in Postgres core. But this talk could also be of interest to scientists and to anyone seeking new ways to improve Postgres in a classical algorithmic way, with compression, which is kind of an old field of study. So let's do it: compression of everything, everything that I could imagine.
3:16:45
What is the basic idea of compression? We have a function that reduces the size of some input vector, and the output vector must be smaller than the input. But no, things just don't work that way. In fact, in so many cases the output is bigger than the input. So what's the point, then?
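A small sketch of the point above, assuming Python's standard zlib module as a stand-in codec (the talk does not tie this remark to a particular codec). No lossless function can shrink every input: there are 2^n bit strings of length n but only 2^n - 1 strictly shorter ones. The two sample inputs are made up for illustration.

```python
import os
import zlib


def compressed_size(data: bytes) -> int:
    """Return the size in bytes of `data` after zlib compression."""
    return len(zlib.compress(data))


# Illustrative inputs: one with obvious redundancy, one with none.
repetitive = b"postgres " * 1000      # 9000 bytes of repeated text
random_bytes = os.urandom(9000)       # 9000 bytes with no structure to exploit

print(len(repetitive), "->", compressed_size(repetitive))
print(len(random_bytes), "->", compressed_size(random_bytes))
```

The redundant input shrinks a lot, while the random input comes out a few bytes larger because the codec still has to add its own framing.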
3:17:14
Let's look at compression, or rather at the process of decompression, as a kind of decoding tree. We have a decision tree, and each decision represents one bit of our input, or rather of the compressed input that we want to decompress. If this decision tree is balanced, then the ratio is exactly one: the size of the compressed input equals the size of the decompressed input. There is no point in having a balanced decoding tree.
3:17:56
So we want an unbalanced decoding tree, where some paths are short but the vast majority of paths through the decoding tree are longer than a path through the balanced decoding tree. That's the whole idea of compression: we have a trade-off in which some frequent data becomes shorter, but the vast majority of rare input vectors become much longer, or at least slightly longer, than the average input. That's why we have to make this trade-off wisely; we cannot just apply compression everywhere. But the idea is sensible, and it works.
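One classical way to build such an unbalanced decoding tree is Huffman coding. The speaker does not name it, so treat the sketch below as an assumption chosen for illustration: it computes only the code length (tree depth) of each symbol, using a made-up sample input.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data: bytes) -> dict[int, int]:
    """Return {symbol: code length in bits} for a Huffman code built from `data`."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit per occurrence.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, unique tie-breaker, {symbol: depth in subtree}).
    heap = [(w, i, {sym: 0}) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {sym: depth + 1 for sym, depth in {**left, **right}.items()}
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


# Illustrative input: 16 symbols, five distinct values, skewed frequencies.
sample = b"a" * 8 + b"b" * 4 + b"c" * 2 + b"d" + b"e"
lengths = huffman_code_lengths(sample)
total_bits = sum(lengths[sym] for sym in sample)
print(lengths, total_bits)
```

A balanced tree over five symbols needs 3 bits per symbol, 48 bits for this input. Here the frequent `a` gets a 1-bit path and the rare `d` and `e` get 4-bit paths, longer than the balanced 3 bits, yet the total is 30 bits.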
3:18:46
Recently the author of another database, ClickHouse, said at a conference that every byte that goes through I/O, to the disk or to the network, deserves some compression, and I largely agree with Alexey. So let's talk about compression in more depth: what do we have in Postgres for compressing stuff?
3:19:14
Let's talk about codecs, that is, coders and decoders. We always have these two functions: one takes a vector of bits and produces another vector of bits, and the other does the reverse.
3:19:29
Usually we use the so-called Pareto frontier to describe useful codecs. The Pareto frontier is a chart where one axis shows the compression ratio and the other shows time, typically the CPU time taken by compression. If a codec has a data point that is not superseded on both axes by some other codec, we say that the codec is on the Pareto frontier. Here we can see that at higher timings and better compression ratios we usually have LZMA compression, which is the Lempel-Ziv-Markov chain algorithm, and on the right side we have faster codecs, which consume little CPU and have a modest compression ratio.
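The definition above can be sketched as a small filter. The codec names and numbers below are invented placeholders, not measurements from the talk or from any benchmark.

```python
from typing import NamedTuple


class CodecPoint(NamedTuple):
    name: str
    ratio: float        # compression ratio, higher is better
    cpu_seconds: float  # CPU time spent compressing, lower is better


def pareto_frontier(points: list[CodecPoint]) -> list[CodecPoint]:
    """Keep the codecs that no other codec matches or beats on both axes at once."""

    def dominated(p: CodecPoint) -> bool:
        return any(
            q.ratio >= p.ratio
            and q.cpu_seconds <= p.cpu_seconds
            and (q.ratio > p.ratio or q.cpu_seconds < p.cpu_seconds)
            for q in points
        )

    return [p for p in points if not dominated(p)]


# Hypothetical data points, for illustration only.
candidates = [
    CodecPoint("fast", 1.8, 1.0),
    CodecPoint("balanced", 2.8, 3.0),
    CodecPoint("strong", 4.0, 20.0),
    CodecPoint("outclassed", 2.5, 5.0),  # worse than "balanced" on both axes
]
frontier = pareto_frontier(candidates)
print([p.name for p in frontier])
```

The `outclassed` entry drops out because `balanced` compresses better and costs less CPU; the other three each win on at least one axis, so they stay on the frontier.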
3:20:33
So for our database we have some valuable resources: memory, which is not affected much by compression, CPU time, and I/O time, or I/O throughput.
3:20:55
Almost from the beginning of Postgres's popularity we have had a codec called pglz. This codec was implemented by Jan Wieck back in 1997, and the algorithm was proposed in a scientific paper as a simple compression algorithm back in 1993. So it's old and it's not super efficient, but the main advantage of this codec was that it wasn't covered by any kind of patents, so it could be used in open source, and I think this is relatively important for software.
3:21:47
Later we had a new codec called zlib. It was standardized by the IETF with an RFC document, and it had a good implementation by the beginning of the 2000s. Postgres is using zlib too. It is slightly better both in compression ratio and in compression performance, but it was not so significantly better than pglz as to replace pglz everywhere.
3:22:29
Later another codec emerged, LZ4, which is compression optimized for performance. It has a very modest compression ratio: it usually compresses average data twofold, so the compression ratio is around two. But the main advantage of this codec is that decompression runs at almost the speed of memory. We see here that just copying bytes runs at 14 gigabytes per second per CPU core, and LZ4 decompression runs at 5 gigabytes per second per CPU core, which is a very impressive result.
3:23:21
Later the author of LZ4, Yann Collet, tried to supersede LZ4 with the Zstandard codec, which was expected to be better on the Pareto frontier than LZ4. But it actually wasn't: LZ4 is still super fast, and Zstandard, even at a lower compression level, is not always faster than LZ4. Still, Zstandard is the main frontier for database applications and database systems, and it is widely used. All these codecs are currently used in Postgres. Let's see how they are used.
3:24:19
Oh no, sorry, one more thing. Another thing that emerged relatively recently with codecs: typically we need to give a codec some runway to unfold effective compression, but in a database we have to compress small datums from time to time, like strings of a few hundred bytes. For example, Zstandard and LZ4 are almost indistinguishable in compression ratio on sizes of a few kilobytes. To help with compression of small datums, we can use compression dictionaries. A dictionary is like a virtual prefix of every datum; it must be precomputed on some corpus, prepared before we compress anything, and then it helps the codec compress frequent but small byte sequences.
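The "virtual prefix" idea can be tried with the preset-dictionary support in Python's standard zlib module. This is only a stand-in: the talk discusses dictionaries in general, and the dictionary and datum below are hand-written examples rather than something trained on a real corpus.

```python
import zlib


def compress_with_dict(data: bytes, zdict: bytes) -> bytes:
    """Compress `data` as if `zdict` had already been seen just before it."""
    compressor = zlib.compressobj(zdict=zdict)
    return compressor.compress(data) + compressor.flush()


def decompress_with_dict(blob: bytes, zdict: bytes) -> bytes:
    """Decompress `blob`; the same dictionary must be supplied again."""
    decompressor = zlib.decompressobj(zdict=zdict)
    return decompressor.decompress(blob) + decompressor.flush()


# Hypothetical dictionary holding the byte sequences that recur across small datums.
shared_dict = b'{"user_id": , "status": "active", "country": "'
datum = b'{"user_id": 42, "status": "active", "country": "NL"}'

plain = zlib.compress(datum)
with_dict = compress_with_dict(datum, shared_dict)
print(len(datum), len(plain), len(with_dict))
```

On its own, the short datum gives the codec almost nothing to reference, so the plain output is no smaller than the input. With the dictionary, most of the datum becomes back-references into the shared prefix. The cost is that the dictionary has to be stored and kept available for every later decompression.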
3:25:29
So all in all, we see that codecs have become much better in recent times. Here is a quote from Michael Paquier on pgsql-hackers saying that modern compression algorithms have become really useful for databases. I'd say that in the last three or four years we have made a huge advancement in Postgres core toward using better compression algorithms. Let's see what's there and what is still to do.
3:26:00
Use cases in Postgres. First of all, in Postgres 14 you can use better compression for TOAST, the oversized-attribute storage. This is just a joke: no, don't use varchar, use text everywhere. Text is just a fine data type. So the most useful application of compression is compression of TOAST values. Previously they were compressed with pglz; now you can use LZ4 for text columns. When you create a column, you can say that this column must be compressed with LZ4. Unfortunately that's not the default, so you have to alter the table and set this compression yourself.
3:27:04
The default is still pglz, but this setting will make reading TOAST values much, much faster. On average, LZ4 decompression is approximately five to ten times faster than pglz decompression.
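As a sketch of the setting described above: PostgreSQL 14 accepts `ALTER TABLE ... ALTER COLUMN ... SET COMPRESSION`, and the helper below only builds that statement as a string. The helper names and the table and column names are hypothetical, and running the statement against a server is left out.

```python
TOAST_COMPRESSION_METHODS = ("pglz", "lz4")


def quote_ident(name: str) -> str:
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def set_column_compression_sql(table: str, column: str, method: str = "lz4") -> str:
    """Build the statement that switches one column's TOAST compression method."""
    if method not in TOAST_COMPRESSION_METHODS:
        raise ValueError(f"unsupported TOAST compression method: {method!r}")
    return (
        f"ALTER TABLE {quote_ident(table)} "
        f"ALTER COLUMN {quote_ident(column)} SET COMPRESSION {method};"
    )


# Hypothetical table and column names, for illustration only.
statement = set_column_compression_sql("documents", "body")
print(statement)
```

Two caveats: the server has to be built with LZ4 support, and changing the setting does not rewrite the table, so values stored earlier stay compressed with the old method until they are written again.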
3:27:26
Another application of compression is base backup, as in pg_basebackup. Recently a new compression method was added, and you can say that compression must be done on the server. What's the difference, you may ask? The actual difference is that network bandwidth is a super important resource. When you are downloading a backup from a running OLTP installation, you're consuming the network, which is used for queries, for replication, for WAL archiving, for lots of very important stuff. In many cases the limiting resource is not storage, not a storage quota or empty space on disks, but rather the network throughput of a highly loaded installation.
3:28:39
But actually there are so many other backup tools, like WAL-G or pgBackRest or Barman or, I don't know, pg_probackup, pg_rman and many, many others. These are very good backup tools that can be used for backups, and they were already doing compression of base backups. Anyway, it's good to see a good architectural decision made in core and not in external tools.
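A sketch of how the server-side compression of base backups described above might be requested. Since PostgreSQL 15, pg_basebackup's `--compress` option accepts a `client-` or `server-` prefix before the method name. The helper name and the target directory are hypothetical, and the function only assembles the argument list without running anything.

```python
BASEBACKUP_METHODS = ("gzip", "lz4", "zstd")


def basebackup_command(target_dir: str, method: str = "lz4", where: str = "server") -> list[str]:
    """Build a pg_basebackup argument list that compresses on the chosen side."""
    if method not in BASEBACKUP_METHODS:
        raise ValueError(f"unsupported compression method: {method!r}")
    if where not in ("client", "server"):
        raise ValueError(f"compression side must be 'client' or 'server': {where!r}")
    return [
        "pg_basebackup",
        f"--pgdata={target_dir}",
        "--format=tar",                  # keep the backup as compressed tar archives
        f"--compress={where}-{method}",  # e.g. server-lz4: bytes cross the network already compressed
    ]


# Hypothetical target directory, for illustration only.
command = basebackup_command("/backups/base")
print(" ".join(command))
```

With `server-lz4` the CPU cost lands on the database host and the saving is in network traffic, which is the resource the talk singles out as the scarce one.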
3:29:17
Another interesting idea is compression of WAL, the write-ahead log. Let's see what resources WAL can consume. We see here that a typical highly available Postgres installation is sending WAL to the archive for point-in-time recovery purposes. It is reading WAL from disk, it's writing WAL for crash recovery on the primary, it's sending WAL to each replica, to each standby instance, and it's actually writing this WAL to disk on each standby instance. So if we reduce the amount of WAL written even a little, for every single byte we actually save a lot of resources in different places of the whole installation. So it kind of makes sense, but compressing WAL before writing it to disk is a very tricky thing.
3:30:32
Here are some problems of WAL compression. For example, WAL records are actually relatively small. They're typically on a scale of tens of bytes, smaller than 100 bytes, and such chunks are not easily compressible even with modern algorithms. Still, they are compressible, especially when they go one after another.
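The effect of record size can be seen with a rough experiment, again using Python's zlib as a stand-in codec. The records below are made-up text lines of about 40 bytes; real WAL records are binary and look nothing like this, so only the trend is meaningful.

```python
import zlib


def total_compressed_separately(records: list[bytes]) -> int:
    """Compress every record on its own and add up the output sizes."""
    return sum(len(zlib.compress(record)) for record in records)


def total_compressed_as_stream(records: list[bytes]) -> int:
    """Feed all records through one compressor so later ones can reference earlier ones."""
    compressor = zlib.compressobj()
    size = sum(len(compressor.compress(record)) for record in records)
    return size + len(compressor.flush())


# Hypothetical stand-ins for small log records.
records = [
    b"INSERT rel=16384 blk=%05d off=%03d len=48" % (i, i % 200)
    for i in range(1000)
]
raw = sum(len(record) for record in records)
separate = total_compressed_separately(records)
stream = total_compressed_as_stream(records)
print(raw, separate, stream)
```

Compressed one by one, each record has no history to draw on and pays the per-stream framing again. Compressed as one stream, the shared structure between neighbouring records is what gets squeezed out.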
3:31:04
Another problem is that if we compress WAL before writing it to segments on disk, each WAL sender will have to decompress it, and the WAL receiver will have to compress it again.
3:31:21
And finally, crash recovery will be even more complicated than it is now. In fact, the code of Postgres crash recovery is very difficult for a newcomer to read. It's very well documented, but it's simply complicated. With compressed WAL it will be much more complicated, because we have to continue writing to the same WAL segment, and modern algorithms simply do not allow you to add data to already compressed data: you have to extract everything and compress it back with the new suffix.
3:32:09
Also, a very small compressed chunk can constitute a very large decompressed chunk, and you cannot even predict how much memory you will need to decompress a few bytes, because every few bytes can be like a bomb of decompressed stuff.
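One common defence against the "bomb" problem is to cap the output while decompressing. The sketch below uses the `max_length` argument of Python's zlib decompressor as an illustration; it is not how Postgres handles this, and the helper name and the limits are arbitrary choices.

```python
import zlib


def decompress_bounded(blob: bytes, limit: int) -> bytes:
    """Decompress `blob`, refusing to produce more than `limit` bytes of output."""
    decompressor = zlib.decompressobj()
    # Ask for at most one byte more than allowed, so overflow is detectable.
    out = decompressor.decompress(blob, limit + 1)
    if len(out) > limit:
        raise ValueError("decompressed size exceeds the allowed limit")
    if not decompressor.eof:
        raise ValueError("compressed stream is truncated or corrupt")
    return out


# Ten megabytes of zeros shrink to a few kilobytes: small input, huge output.
bomb = zlib.compress(b"\x00" * 10_000_000)
print(len(bomb))

try:
    decompress_bounded(bomb, limit=1_000_000)
except ValueError as exc:
    print("rejected:", exc)
```

The cap keeps memory use predictable at the price of rejecting anything larger than the caller was prepared for.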
3:32:32
anyway I think at some point we will
3:32:32
anyway I think at some point we will have a wall compression because it's
3:32:35
have a wall compression because it's
3:32:35
have a wall compression because it's saves us a lot of
3:32:37
saves us a lot of computational resources a lot of these
3:32:40
computational resources a lot of these
3:32:40
computational resources a lot of these bandwidths and network bandwidth with a
3:32:43
bandwidths and network bandwidth with a
3:32:43
bandwidths and network bandwidth with a very tiny fraction of CPU time
3:32:47
very tiny fraction of CPU time
3:32:47
very tiny fraction of CPU time but already but but now we are already
3:32:50
but already but but now we are already
3:32:50
but already but but now we are already compressing full page images and
3:32:53
compressing full page images and
3:32:53
compressing full page images and starting from uh postgres 15 I as far as
3:32:58
starting from uh postgres 15 I as far as
3:32:58
starting from uh postgres 15 I as far as I remember uh
3:33:01
you already can use lz4 to compress full
3:33:05
you already can use lz4 to compress full
3:33:05
you already can use lz4 to compress full page images
3:33:06
page images I contributed
3:33:08
I contributed this patch together with Justin prisby
3:33:12
this patch together with Justin prisby
3:33:12
this patch together with Justin prisby and of course reviewers and help of
3:33:14
and of course reviewers and help of
3:33:14
and of course reviewers and help of community uh
3:33:17
According to my measurements, it makes an average installation about 10 percent faster under pgbench.
3:33:30
You can also use Zstandard for full-page image compression, but on the scale of a single page image I did not observe a real difference between these two codecs. Both are much better than pglz,
3:33:52
but we still need to do something more clever with FPIs to make Zstandard really show off.
3:34:02
I think a good idea would be to compress all page images together: many page images that go one after another could be compressed together to make compression more efficient. Or we could try to keep the compression context between different compression cycles,
3:34:25
but that would complicate crash recovery. So it is still a trade-off of complexity and efficiency, not just of compression ratio and compression time.
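The two ideas above (compressing page images together, or keeping a compression context between cycles) can be compared with a rough sketch. It uses Python's `zlib` rather than lz4 or Zstandard, and the workload is an assumption: every 8 KiB page shares half of its content with the others.

```python
import random
import zlib

PAGE_SIZE = 8192
rng = random.Random(0)


def random_bytes(n: int) -> bytes:
    return bytes(rng.getrandbits(8) for _ in range(n))


# Assumed workload: each page carries the same 4 KiB block
# plus 4 KiB that is unique to that page.
shared = random_bytes(PAGE_SIZE // 2)
pages = [shared + random_bytes(PAGE_SIZE // 2) for _ in range(16)]

# 1. Each page image compressed on its own (no shared context).
separate = sum(len(zlib.compress(p)) for p in pages)

# 2. All page images compressed together as one stream.
together = len(zlib.compress(b"".join(pages)))

# 3. One compression context kept across cycles, flushed after every page.
streamer = zlib.compressobj()
streamed = sum(
    len(streamer.compress(p) + streamer.flush(zlib.Z_SYNC_FLUSH)) for p in pages
)

print(separate, together, streamed)
```

In the third variant a page can only be decoded after all earlier output has been replayed through the same context, which hints at why a shared context makes crash recovery harder.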
3:34:41
Also, the community is working on protocol compression, that is, compression of the libpq protocol.
3:34:46
The thread is sometimes resurrected and sometimes returned with feedback, and currently the main concern is that compression in fact defeats TLS encryption.
3:35:04
What's the problem here? Let's look at the CRIME attack on compression before encryption. Suppose we compress some secret data together with some user-controlled data.
3:35:18
The whole idea of compression is to make more frequent data sequences shorter. But how do we know that some sequence is more frequent?
3:35:36
The answer is simple: if the byte sequence resembles itself, if it repeats itself, then we consider it more frequent. So any compression works better if the data resembles itself.
3:35:55
And if we compress secret data together with user-controlled data, the user can extract information about the resemblance between the secret data and the user data.
3:36:14
If the user observes the size of the encrypted data on a public channel, they can judge what the secret data actually is.
3:36:29
And if an attacker can repeat the same query again and again, they can eventually extract the secret data from the encrypted channel.
3:36:39
It's a very theoretical attack, but plausible. That's why we don't have compression in the OpenSSL implementation and other TLS implementations.
3:36:50
Still, protocol compression would save us a lot of resources, so at some point we will have to have it, despite some security risks.
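The length side channel behind this concern can be illustrated with a small sketch. It is a toy model, not an attack on libpq or TLS: the secret value, the helper name `observed_length`, and the use of `zlib` are all assumptions for the example, and encryption is left out because it hides content but not size.

```python
import zlib

# Hypothetical secret that shares a compressed stream with user input.
SECRET = b"session_token=Xk9mQ2vLp7"


def observed_length(user_data: bytes) -> int:
    """Size an eavesdropper would see for secret + user-controlled data."""
    return len(zlib.compress(SECRET + b";" + user_data))


# A guess that matches the secret repeats it, so it compresses better.
right = observed_length(b"session_token=Xk9mQ2vLp7")
wrong = observed_length(b"session_token=Zw4nR8tYb1")
print(right, wrong)
```

A real attacker would not know the whole secret up front; they would extend a guess one byte at a time and keep the candidates that produce the shortest output, which is why being able to repeat the query matters.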
3:37:02
Another wild idea for using compression is compression of temporary files. Temporary files go through I/O, and that's why they deserve compression. I have hacked up a patch; here is just a screenshot of the pgsql-hackers message about this patch.
3:37:24
It somehow works and reduces the number of bytes written during a CREATE INDEX operation or a hash join operation by a factor of four.
3:37:37
But the implementation is just too complicated and usually not worth it. It's always a trade-off between maintainability and profit for end users, and so far the patch is not in shape.
3:37:59
What's the basic problem of this patch? It's based on the idea of a random-access compressed file, and each page has a different size after compression: it can be larger than it was, more than eight kilobytes, or it can be smaller than it was.
3:38:20
And at some point pages will be deleted. Defragmenting a temporary file sounds like nonsense, but still, when we write something in the middle of a temporary file, we will have to defragment it somehow.
3:38:40
This complexity currently prevents me from going further with this technology.
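To see why fixed 8 KiB slots stop working once pages are compressed, here is a small sketch with `zlib`. The three page contents are made up for illustration; the point is only that compressed sizes differ widely and that an incompressible page can come out larger than the original.

```python
import random
import zlib

PAGE_SIZE = 8192
rng = random.Random(0)

# Three assumed page contents of exactly PAGE_SIZE bytes each.
zero_page = bytes(PAGE_SIZE)
text_page = (b"id=42,name=alice,status=active\n" * 300)[:PAGE_SIZE]
noisy_page = bytes(rng.getrandbits(8) for _ in range(PAGE_SIZE))

sizes = {
    "zeros": len(zlib.compress(zero_page)),
    "text": len(zlib.compress(text_page)),
    "noise": len(zlib.compress(noisy_page)),
}
print(sizes)
```

If a page in the middle of the file is rewritten and its compressed image no longer fits the space it used to occupy, the file has to be rearranged, which is the defragmentation problem described above.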
3:38:50
There are some alternatives available. For example, Postgres Pro is a proprietary, closed-source fork, but they have something like a compressed file system.
3:39:06
I am not sure about the exact name, but they call this technology CFS, and it is effectively per-page compression. I know that it works very well on some workloads.
3:39:22
Another option is just to use a compressing file system like ZFS, btrfs, or some others.
3:39:32
Another development direction is to go on with defragmentation of the random-access compressed file. But the problem is that we actually have to do it durably, and if we want durable defragmentation of a temporary file, we will have to log the temporary file. I'm hesitant to go on with this.
3:40:03
Another interesting idea is the approach taken by the Greenplum team. They identified that hash join does not require random access: it's always sequential.
3:40:14
They extended the buffer file API with a "pledge sequential" method, which says that this temporary file will never be read from the middle and never written in the middle, so just compress it as a sequential stream.
3:40:31
It is usable, but I don't think that patch, as it is in Greenplum, is anywhere close to committable in Postgres right now.
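The sequential-only idea can be sketched as a tiny append-then-scan file. This is not the Greenplum patch or the Postgres buffer file API; the class `SequentialCompressedFile` and its methods are hypothetical names, and `zlib` stands in for whatever codec a real implementation would use.

```python
import io
import zlib


class SequentialCompressedFile:
    """Hypothetical append-then-scan temp file; no random access at all."""

    def __init__(self) -> None:
        self._storage = io.BytesIO()  # stands in for the on-disk file
        self._compressor = zlib.compressobj()
        self._sealed = False

    def write(self, data: bytes) -> None:
        if self._sealed:
            raise RuntimeError("file is sealed; sequential writes only")
        self._storage.write(self._compressor.compress(data))

    def seal(self) -> None:
        """Finish writing; afterwards the file is only read front to back."""
        if not self._sealed:
            self._storage.write(self._compressor.flush())
            self._sealed = True

    def read_all(self, chunk_size: int = 4096):
        """Yield decompressed data sequentially from the start."""
        self.seal()
        self._storage.seek(0)
        decompressor = zlib.decompressobj()
        while chunk := self._storage.read(chunk_size):
            out = decompressor.decompress(chunk)
            if out:
                yield out
        tail = decompressor.flush()
        if tail:
            yield tail


spill = SequentialCompressedFile()
for i in range(1000):
    spill.write(f"tuple-{i:06d};".encode())
restored = b"".join(spill.read_all())
```

Because nothing is ever written or read in the middle, a single streaming compressor is enough and no defragmentation is needed.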
3:40:43
So, if you want to remember something from this talk: the first thing is that we're trying to compress frequent data, and rare data can become longer than it was before compression.
3:41:00
The other idea is that any bytes that go to disk or over the network deserve compression.
3:41:07
If you have some good ideas on compression in databases, I'd be happy to discuss them with you in Discord.
3:41:18
Also, maybe you have some scientific ideas on how to do recompression in popular algorithms. And if you are willing to work on lz4 and zstd, which don't have this API, that would be great.
3:41:36
And basically that's it. I'll be happy to discuss this stuff with you in Discord. Thank you so much for listening.
3:41:45
Yeah, all right, thank you, Andrey. That was pretty good. A lot of information packed into a short period of time, so nicely done.
3:41:55
Is that a pun? Is that a pun, Rob? Come on.
3:41:59
Maybe. It could be; late in the day, it could be a pun. I did want to ask a couple of quick questions, if you don't mind.
3:42:08
One for sure: you touched on some research that is not from academia. I always feel like most of it comes from academia, and some of it is even papers that were written, you know, 20 years ago that we just never looked at for some reason.
3:42:24
I'm curious how you feel about the state of taking that information and transforming it into something that's useful for production systems. There's a technical issue there, I'm sure, but also legal ones around patents and that kind of stuff. Any thoughts on that?
3:42:41
Yeah, well, a lot of very good research was done in the times when patents were a thing, and we cannot just use anything that is patented, because it's a project policy.
3:43:01
Another thing is that academia is doing its job well on a broad scale. They see us database developers as just one application, and they are producing general-purpose algorithms. Yet I feel that we could have algorithms that better suit our needs in databases.
3:43:23
If some researchers are watching us, I'm asking you to think more about how we can adapt the API, because in databases we usually have pages, we usually have something like tuples, and we usually have to add some data to already compressed data.
3:43:48
We have so many ways to test compression algorithms that suit our needs better, and we have standard ways to measure the performance of the overall system, not just the compression system. If your algorithm is stressing the CPU in some extraordinary way, we will see it in the system working as a whole, not just in one isolated part.
3:44:16
So yeah, it would be great if some researchers would join together with database scientists and database developers.
3:44:27
The code is there, they should join us. I agree totally, a call to action. Absolutely, a call to research.
3:44:36
So we'd love to talk to you more. I'm sure people have more questions; there are probably some on the Discord, if you're going to be popping over there.
3:44:46
Yeah, sure. All right, and are you going to be at the EMEA livestream, which is happening at 9 A.M. Central European Summer Time tomorrow?
3:44:56
Yeah, I will try. I'm planning to watch Marco Slot's keynote, and I will probably stay for some other talks too.
3:45:07
Awesome. Well, thank you again, Andrey, for being part of this. Your talk will be available for people who want to watch the replay of the Americas livestream; if they got here late, they'll be able to watch it there. Then we'll be publishing it as its own independent talk on YouTube within the next couple of weeks.
3:45:25
Cool, thank you. All right, thanks, Andrey. Thank you.
3:45:30
So now it's time to go to the wrap-up, and it's you and me.
3:45:37
Our wrap-up is supposed to be like 25 minutes, but we're gonna have to go faster.
3:45:40
Yeah, you know, I was gonna say, I can't tell if it was a long day or a short day, because it seemed like it went by pretty quick. But that's good.
3:45:48
That's good. It did go by really quick, especially paying attention to the chat and trying to pay attention to the talks. And I forgot about live tweeting. Thank goodness that other people are out there doing that.
3:46:01
So we have a couple of things that we wanted to cover in the beginning. Let's see, the first is just that reminder that there's a survey, and you can quickly scan that QR code.
3:46:12
You can enter the survey feedback multiple times, so if you haven't watched all the talks yet and you want to watch some of the on-demand ones or whatever, you can do it multiple times until it closes next Friday.
3:46:22
So yeah, vote early, vote often. That works.
3:46:29
I feel like it's a broken record. I've mentioned the Discord umpteen times.
3:46:34
So there's a word for when you say a word so many times that it loses its meaning inside your head.
3:46:41
No, and actually that word was on a Ted Lasso episode recently, but that's another story, and I just can't recall what it was.
3:46:47
But anyway, I have said the word Discord so many times today. There is kind of a fun back channel there for those who want to join, and it's going to continue throughout the rest of the day.
3:47:00
Wow, big thank you to all of these speakers. This is pretty cool.
3:47:03
And I have the URL showing for the live stream playlist, the Citus Con live stream playlist, right now, and that's why I popped this up there.
3:47:14
And then, if you want to mark your calendar, I want to make it easy for you if you want to go to all or part of the EMEA live stream.
3:47:20
Because who knows where you are in the world. You might be watching this in replay in a couple hours, and maybe you live in Europe.
3:47:28
So the link in the upper right corner, aka.ms/cituscon-emea, gives you an easy way to drop something in your calendar for the EMEA live stream tomorrow.
3:47:39
You'll be asleep, Rob. I hope so. I mean, I do want to watch this. I will definitely watch the replay of it, but I don't think I will watch it live. I think it will have gone really wrong if that happens.
3:47:52
So, on-demand talks: there's 25 of them. That's a fair number of talks, but they're all really, well, I don't know, I assume they're all really good, because I've looked through the list.
3:48:01
I've been trying to also peruse that along with these other things we have going on today. During the live stream I was looking through that list of talks and seeing what was there, and I'm really excited. Actually, I will be watching some of those tonight, before things happen tomorrow. So, for my tonight.
3:48:18
Well, and you gave a really interesting on-demand talk last year, and I know that those views don't all happen on the day of the event, right? They happen over time. There's a whole long tail, and so the good news is people can consume it at a time that's convenient to them.
3:48:35
But obviously, for anyone who is interested in a particular topic, if they hop in there today or tomorrow, then they can ask questions on the Discord, and actually it's a little bit easier to reach the speakers, unless those speakers are at KubeCon. A few people are distracted with KubeCon.
3:48:51
There are other conferences going on.
3:48:53
There are other conferences. I know there's a Microsoft MVP Summit too, for people who are involved in that.
3:48:59
So we have a couple of promo videos, and I wanted to play that first one, which is all about Postgres performance and security. It goes through super, super fast, like 10 seconds each for some of these on-demand talks. Does that sound good?
3:49:13
That sounds awesome. Okay, let's go.
3:49:24
Thank you. But in fact I'm going to show you different, hopefully exciting ways that you don't know about to index UUIDs in PostgreSQL.
3:49:34
So if you are a traveling Postgres consultant like me, you will see that often.
3:49:42
A lot of performance can be gained by outsmarting people, right?
3:49:48
Today I'm going to be talking about Postgres table bloat, but the overall idea of this talk is discussing the transition from doing Postgres at a smaller scale to Postgres at scale.
3:49:59
I'm going to talk to you today about PostgreSQL privileges, roles, and security, and how you can take better advantage of them within your Postgres installation.
3:50:08
The thing I wanted to talk about: explain the concepts behind tuning high-write workloads for Postgres, and then dive deeper into the specific configurations.
3:50:19
and DevOps data technical lead of JFrog, and I want to share my lecture about troubleshooting high CPU utilization for Postgres database.
3:50:30
We will talk about external attacks, about authentication security, about data protection, and also I will share some tips and my recommendations with you.
3:50:43
[Music]
3:50:52
I got a chance to look at hockey slides a little bit and just started peeking at that, and that will be one I'm definitely watching tonight, because he definitely goes into things. I felt like I've dealt with UUIDs a lot in Postgres, and he does talk about some stuff in there that I don't think I've seen anyone else actually talk about before. Maybe I missed it, but I'm super excited on that one.
3:51:12
And I would also say, Ryan, I know, has been doing a lot of advocacy work around roles and grants, so I'm pretty excited to check that one out as well. And I mean, they all, I think, look interesting, at least on first glance, so that's pretty good.
3:51:25
Yeah, I think Ryan gave a talk at SCaLE, and you were there as well. I don't know if you saw that one. And then Chelsea Dole: I saw her give a lightning talk at PGConf EU in Berlin a couple of months ago.
3:51:39
Yeah, she was fabulous. I went up, introduced myself, told her all about Citus Con, and said, please submit a talk proposal.
3:51:47
Although, I mean, getting talks accepted into Citus Con is pretty competitive. I think we have like a 27% acceptance rate.
3:51:57
So, two invited keynotes, obviously, but 35 talks accepted, so you can do the quick math in your head and figure out how many submissions there were. There were a lot of phenomenal submissions, and it was really hard for the talk selection team.
3:52:12
And in fact, I think I have a slide here, if I click ahead. Yeah, there it was, just to say thank you to everybody who was on the talk selection team.
3:52:22
Aaron Wislang, Alicia, Marco, and Charles spent a lot of time reviewing all those proposals. And this conference, you know, obviously I say every great event starts with great speakers, but the talk selection team does a lot of work to get us there.
3:52:38
Absolutely. And, you know, again, they made sure not to invite me back, so I think the level this year has gone up even more considerably.
3:52:44
Ouch, don't be self-deprecating.
3:52:49
Those of you who care about swag: the code is on screen right now, as well as the URL. There are 75 of these being given away in this live stream time period, if you haven't gone to do it yet.
3:53:01
The activity book is fun not only for kids. I remember when I first met Melanie Plageman, it was because she had gotten the previous version of the activity book at a Postgres conference and she was enjoying coloring in it.
3:53:18
Yeah, nothing better, for when you are waiting on an index rebuild, than to be able to whip out the activity book and pass some time that way.
3:53:26
And socks are always cool, and there's the sticker packs too, for those of you who use stickers either on your laptops, like me, or on your luggage. I feel like I'm in an infomercial, but I know how much people care about swag. It's a thing.
3:53:37
And so, speakers. Yeah, they do. Okay.
3:53:42
We have another quick video trailer. This one is focused more on the two Azure database services for Postgres: Azure Cosmos DB for Postgres, or what I happen to call Citus on Azure, because I'm such a Citus open source person, and then also Azure Database for Postgres Flex Server. So let's roll this promo.
3:54:11
It's number two, unless you want me to do...
3:54:17
[Music] Hey, Lucas here, and I'll present to you how I auto-scaled a Cosmos DB for Postgres cluster using Citus, of course, Grafana, and Azure Serverless.
3:54:28
This talk is for multi-tenant SaaS companies who are growing fast, that is, they're onboarding thousands of customers and they are expecting or already running into scalability issues with the database, and want to scale out with distributed PostgreSQL.
3:54:43
So today we are going to talk about multi-tenant software-as-a-service applications on Azure Cosmos DB for PostgreSQL.
3:54:54
It's built upon the community version of Postgres, so the open source Postgres, and also adds the Citus extension, which turns Postgres into a distributed database.
3:55:09
In this session I would like to talk about partitioning in PostgreSQL and how it is similar to and different from Oracle.
3:55:18
So today's focus of the talk is to discuss some tools, recommendations, and best practices of how to reduce cost on Azure Database for PostgreSQL Flexible Server.
3:55:25
My talk is about Azure AD authentication with flexible servers. If you use flexible servers or plan to try it out, this feature will help make your application connections more secure. It will save you time managing credentials and roles, so please check it out.
3:55:50
[Music] Awesome. I'm not gonna be like, every one of these is awesome, but I will pick one in particular that I have to say. I've seen a lot of chatter about multi-tenancy deployments in Postgres lately. Super hot topic. It's come back into fashion, I guess.
3:56:07
So that one I'm super interested to see, because to me, I don't know if people always think of Citus as a multi-tenancy solution, but clearly it can be used for that.
3:56:16
um but clearly it can be used for that
3:56:16
um but clearly it can be used for that and so that means that means I wasn't
3:56:18
and so that means that means I wasn't
3:56:18
and so that means that means I wasn't doing my job I mean before the
3:56:20
doing my job I mean before the
3:56:20
doing my job I mean before the acquisition
3:56:21
acquisition um when we came out with the worry-free
3:56:23
um when we came out with the worry-free
3:56:23
um when we came out with the worry-free postgres tagline because we decided that
3:56:25
postgres tagline because we decided that
3:56:25
postgres tagline because we decided that most people only had 30 000 days in
3:56:28
most people only had 30 000 days in
3:56:28
most people only had 30 000 days in their life and time is our most precious
3:56:30
their life and time is our most precious
3:56:30
their life and time is our most precious Resource as developers as Engineers as
3:56:33
Resource as developers as Engineers as
3:56:33
Resource as developers as Engineers as people who manage databases and so we
3:56:36
people who manage databases and so we
3:56:36
people who manage databases and so we were like you know what people don't
3:56:37
were like you know what people don't
3:56:37
were like you know what people don't want to wake up at three in the morning
3:56:38
want to wake up at three in the morning
3:56:38
want to wake up at three in the morning to deal with postgres database issues
3:56:40
to deal with postgres database issues
3:56:40
to deal with postgres database issues and so let's focus on worry-free
3:56:43
and so let's focus on worry-free
3:56:43
and so let's focus on worry-free postgres but the other alternative at
3:56:45
postgres but the other alternative at
3:56:45
postgres but the other alternative at the time was to kind of really focus on
3:56:47
the time was to kind of really focus on
3:56:47
the time was to kind of really focus on driving home the point that cytus was a
3:56:50
driving home the point that cytus was a
3:56:50
driving home the point that cytus was a multi-tenant data space so maybe you
3:56:53
multi-tenant data space so maybe you
3:56:53
multi-tenant data space so maybe you don't connect those dots because we've
3:56:55
don't connect those dots because we've
3:56:55
don't connect those dots because we've focused on that worry-free angle but for
3:56:58
focused on that worry-free angle but for
3:57:00
me yeah so the worry-free was definitely
3:57:02
a big part but also I think using it for
3:57:05
like uh statistics monitoring OLAP like
3:57:07
that type of stuff like we saw in the
3:57:09
earlier talk with JSON and you're right
3:57:11
and doing that monitoring piece to me I
3:57:12
think a lot of people connected with
3:57:14
that type of a workload
3:57:16
um you know and so that that's the
3:57:17
reason why I say that I don't know if
3:57:18
everyone connects it with both sides of
3:57:20
that it's good for both of those really
3:57:23
yeah oh and and it is good for both and
3:57:25
we have tried to serve both of those use
3:57:26
cases both multi-tenant SaaS and
3:57:29
real-time analytics but I'm not here to
3:57:31
to promo the project right now although
3:57:33
I did just pop up the GitHub the GitHub
3:57:35
URL on screen
3:57:40
um okay so let me see I've got a promo
3:57:42
while you're prepping the video uh that
3:57:44
you know go on GitHub uh it's all open
3:57:46
source again this is one of the honestly
3:57:48
one of the reasons I'm here it's all
3:57:51
open source it's on GitHub uh you know
3:57:54
star and and watch the repo all that
3:57:56
good stuff uh would recommend anyone
3:57:58
who's interested in Citus uh there's a
3:57:59
lot of interesting stuff that goes on
3:58:01
over here so you know well you've been a
3:58:03
supporter of the project from far away
3:58:05
for a long time and I I really
3:58:07
appreciate it because a rising tide
3:58:09
floats all boats like um I mean I
3:58:10
obviously I'm a big fan of other
3:58:13
projects I love the fact that Lucas did
3:58:15
a demo with Grafana which I'm a huge fan
3:58:18
of and and I think it's good when we we
3:58:19
support each other or recommend the
3:58:22
whole word of mouth thing is useful
3:58:24
um in the developer world we have
3:58:26
um a couple of other I know we're tight
3:58:28
on time because we have to end in just a
3:58:29
couple minutes and we have two more
3:58:31
videos we want to roll so let's go ahead and roll video number three please
3:58:39
[Music]
3:58:42
so this talk we're gonna discuss a
3:58:43
little bit why you want to run a
3:58:46
database on Kubernetes how this was done
3:58:49
to run Citus to create Citus clusters
3:58:52
before Patroni 3.0 appeared and the most
3:58:53
important thing how you can run
3:58:57
Citus on Kubernetes today with open
3:58:59
source software in a very very easy
3:59:02
manner in my opinion cyber security
3:59:05
applications have a certain unique set
3:59:08
of features and requirements that are
3:59:10
different from other applications I also
3:59:14
think that the Citus database is
3:59:17
very uniquely positioned to solve many
3:59:19
of the challenges that cyber security
3:59:21
applications present I'm a principal
3:59:23
engineer working on a platform here at
3:59:25
Jellyfish I've been here since the early
3:59:27
days and through a lot of our of our
3:59:30
growth and our infra choices including
3:59:32
the the move to a sharded database that
3:59:35
we're going to talk with you here today about
3:59:39
ahead of some of the gotchas that come
3:59:43
up when trying to move to a self-hosted
3:59:46
Citus implementation we went through
3:59:50
this journey and we'd like to share some
3:59:51
of the interesting technical challenges
3:59:53
that we had to solve
3:59:55
because I I frequently get questions
3:59:58
about foreign key support in Citus in
4:00:00
this talk I want to clarify some of the
4:00:03
concepts regarding Citus
4:00:05
hello I'm here today to share with you
4:00:07
some lessons and observations from
4:00:09
SafetyCulture's migration from a managed
4:00:10
PostgreSQL service to our own
4:00:11
self-hosted Postgres
4:00:16
[Music]
4:00:19
also another fine slate of talks uh I
4:00:22
especially like the one moving to the
4:00:25
self-hosted uh from the managed that's
4:00:27
really complicated and uh I think will
4:00:29
explain to a lot of people what they're
4:00:30
getting into
4:00:33
so yeah it was a pleasure pleasure for
4:00:34
me to meet
4:00:37
and Delaney McKenzie from Jellyfish as
4:00:39
they told that story although also Paul
4:00:41
diadny told a similar story
4:00:45
um about moving to a self-hosted Citus
4:00:46
environment so
4:00:48
um obviously you know I work at
4:00:51
Microsoft and uh for people who want
4:00:53
that managed service offering we we
4:00:56
offer that but it's I love I love seeing
4:00:58
people be successful with Citus open
4:01:00
source too
4:01:03
um okay so I think we have let's see
4:01:07
another video here that some more
4:01:09
Postgres community talks so let's take a look at these
4:01:16
[Music]
4:01:19
I am Dimitri Fontaine and I've been
4:01:20
contributing to PostgreSQL for a very
4:01:23
long time now we're going to talk
4:01:26
about how to copy your Postgres database
4:01:29
so from one server to the other one
4:01:33
the concept of storing time with
4:01:36
databases have been popular for very
4:01:38
long time pretty much since relational
4:01:41
databases come to life
4:01:43
exactly the goal this talk is to talk
4:01:46
about how Postgres can give you an ideal
4:01:49
platform to use artificial intelligence
4:01:52
uh in your applications
4:01:54
we'll set the scene by imagining that
4:01:57
you're the DBA in an organization that
4:02:00
needs to implement multi-tenancy
4:02:02
what I'm going to talk about today is
4:02:05
how we've taken PostgreSQL an insane
4:02:08
relational database and how we've
4:02:10
twisted it and molded it and created it
4:02:12
what we think is very robust document
4:02:16
database and event store solution
4:02:18
today I will talk about in-depth guide
4:02:22
to Postgres's new CI and I will
4:02:25
explain how to use that new CI
4:02:27
in this talk we see together how to
4:02:30
build web maps from scratch using Django
4:02:32
and PostGIS if you are asking yourself
4:02:35
what type of maps we can build with
4:02:38
Django let's see an example right away [Music]
4:02:46
there's a playlist well actually just go
4:02:54
to aka.ms slash on demand and um you can
4:02:56
get access just just go to the schedule
4:02:58
page and all the on-demand talks have
4:03:00
YouTube embeds there on that on-demand
4:03:03
tab and so you can track down any of
4:03:04
these and start watching
4:03:06
so much goodness um
4:03:09
I actually promised somebody I would pop
4:03:11
up a slide I know we're almost out of
4:03:12
time and we got to talk about what's
4:03:14
next but I am just going to pop up this
4:03:17
slide like I promised um for people who
4:03:19
do want to learn more about Citus on
4:03:22
Azure this new Azure Cosmos DB for
4:03:24
Postgres training guide came out a couple
4:03:26
of months ago I think it was maybe
4:03:28
October uh Joe Nelson on my team was
4:03:30
involved in the creation of it not on my
4:03:34
team but on the broader team and uh and
4:03:36
it's it's everybody's recommending it so
4:03:38
if you're new and you want to try this
4:03:40
out and get started
4:03:42
um and you want to try it on Azure
4:03:43
versus you know just downloading the
4:03:45
open source packages this is a really
4:03:48
good place to get started I thought it
4:03:49
was going to be a free ketchup slide but
4:03:52
I guess this is uh much more useful and
4:03:54
much easier for people who want to get
4:03:55
started and check out some of this stuff
4:03:59
so free ketchup okay
4:04:01
a few minutes over time I just wanted to
4:04:03
flag a couple things the first thing is
4:04:04
if you're going to be awake during the
4:04:07
EMEA live stream which is you know not
4:04:09
that far away now
4:04:12
um it's happening on Wednesday at 9 a.m.
4:04:14
Central European Summer Time you can add
4:04:16
it to your calendar with that URL in the top left
4:04:19
um let's see no is that right
4:04:26
yes attend EMEA and then the link on the
4:04:28
replay the Americas live stream is
4:04:30
wrong so
4:04:31
um the link on the replay of the
4:04:33
Americas live stream is one I'm
4:04:34
popping on the bottom of the screen now
4:04:38
in a banner so uh sorry about that folks
4:04:42
um and yeah the Discord Discord what does Discord really mean
4:04:46
um I think we'll have to save that
4:04:50
discussion for another day okay uh Treat
4:04:52
thank you so much for being my co-host
4:04:53
and being here I know you don't work for
4:04:55
Microsoft you're not part of the Citus
4:04:57
open source project but you are part of
4:04:59
the Postgres community and it has been a
4:05:01
delight sure it's been my honor thank
4:05:03
you so much for having me and thanks for
4:05:05
everyone who joined us and I hope you
4:05:07
check out the rest of the Citus Con talks and uh have a