Learn how to insert free minutely crypto price data into PostgreSQL using SQLAlchemy, Python, and Pandas for future crypto algo trading and strategy backtesting purposes.
š Subscribe for more: https://bit.ly/3lLybeP
š Follow along: hhttps://analyzingalpha.com/insert-crypto-price-data-sqlalchemy/
0:00 Introduction
0:30 Clone Notebook
2:00 Create Ticker List
2:21 Create Active Tickers List
3:52 Create Symbols DataFrame
5:26 Create Minute Bar DataFrame
7:22 Insert DataFrames Into PostgreSQL
11:45 Verify Price Data Insert
#crypto
#algotrading
#python
---
Videos are intended strictly for information, education, and entertainment purposes. Do not base any investment decisions based upon these materials. Please read the disclaimer: https://analyzingalpha.com/disclaimer
Show More Show Less View Video Transcript
0:00
hello world today we're going to insert
0:02
crypto data into our postgres database
0:05
using sql command we're going to use the
0:07
free crypto data that we got from kaggle
0:10
i think we created that data set and
0:12
about two about two episodes ago
0:15
and we're going to use the database that
0:17
we created in the last video that we're
0:19
going to use for future crypto stocks
0:22
futures and forex algo trading purposes
0:24
so with that being said let's dig into
0:26
it
0:29
let's create some code
0:31
we'll start by opening up the jupiter
0:33
notebook that we use whenever we got
0:34
that free crypto data from kaggle i'm
0:37
going to scroll to the bottom
0:39
and you'll notice you know we're
0:40
exporting out the minute bars as a csv
0:43
file well we just want the minute bars
0:45
because we're going to insert that into
0:46
the database so i'm going to delete that
0:49
but before i do i'm going to make a copy
0:52
of this
0:53
i'll insert data i'll rename it whenever
0:56
i upload it to github and now that we
0:58
have a clone of that prior notebook i'm
1:00
going to delete this
1:03
create a new heading
1:05
let's call it insert data
1:08
and i'm going to run this and it's going
1:10
to take a little bit of time so through
1:11
the magic of the internet i will see you
1:13
after this has been
1:15
run
1:17
now that we have the minute bar ready
1:19
let's think about what we need to do
1:21
probably the easiest way to think about
1:23
this is to look at this visually and
1:26
look at the erd
1:27
we have two tables that we need to
1:29
populate we have the symbol table which
1:31
again is that parent table and then the
1:33
child
1:34
minute bar table the minute bar is just
1:36
the o h l c v data and the symbol
1:41
includes ticker name market and active
1:43
so let's get the tickers which will also
1:46
be the name because the capital data
1:48
doesn't provide to a name it just
1:50
provides a ticker
1:52
and then whether or not uh it's active
1:54
the market we know is crypto so that's
1:57
pretty easy
1:58
let's go ahead and do that now
2:00
the first we'll get all of the tickers
2:02
so tickers equals bars 1m
2:05
we'll grab the index
2:07
and it's level two so we'll do get a
2:09
level values of the index to level one i
2:12
should say
2:13
and then unique
2:15
and tickers so this should give us a
2:17
list of all of the tickers
2:20
perfect
2:21
now we have to decide
2:22
what do we consider being active
2:25
i'll say that
2:26
anything that we've got in the last day
2:29
so there's data for the prior day
2:31
we'll call this active and we know that
2:34
the last day was 12 31 so we'll just say
2:37
anything that we got from you know 12 30
2:40
and on will be considered active
2:43
so we'll do active
2:45
tickers equals bars 1m
2:48
we'll use um you know slicing here we'll
2:51
do 12
2:52
30.
2:56
okay so that'll give us just the bars
2:58
that start on the 30th
3:01
and then index
3:03
get
3:04
level values get level one again
3:08
and unique
3:09
spell
3:11
unique
3:12
into active
3:14
stickers
3:17
we had 132 last time and now we got 113.
3:20
so it looks like there are some that
3:23
are not active or maybe they just have
3:24
low volume but for our purposes this
3:27
works
3:28
okay great so the next thing that we
3:30
need to do is
3:32
create the data frame this is pretty
3:34
easy
3:35
so actually you know what i'm going to
3:37
create some column headings here
3:40
get symbols
3:43
and now let's
3:46
create
3:47
symbols
3:48
data brain
3:52
okay and let's think about what we need
3:53
to do here right simply just
3:56
populate
3:57
these fields and ids auto populated
4:00
we're auto generated we'll do symbols
4:03
equal pd
4:05
data frame
4:06
and then we'll pass in the tickers which
4:08
are all of the tickers up here
4:10
right
4:11
and then we'll set the name equal to the
4:14
ticker
4:15
symbols
4:16
name
4:17
equal to
4:19
symbols vector so essentially they're
4:22
duplicated
4:24
obviously
4:25
if we had a more robust data source we'd
4:28
put in the name there
4:30
but again we're working with free date
4:31
pre-data
4:33
then symbols
4:34
market this is an easy one we'll just
4:37
pass it the crypto strings here ypto
4:41
and now whether or not it's active
4:43
through symbols
4:45
active and we'll just use np.where so
4:47
np.where
4:49
acid the condition so symbols
4:52
thicker
4:54
is in and then we pass it the active
4:56
ticker list active pickers
4:59
and then if it is found we'll pass
5:02
through so it's active will be true
5:03
otherwise it's false
5:06
and now let's sort
5:07
by ticker
5:09
in alphabetical order so symbols equal
5:11
symbols
5:12
but sort values by
5:15
picker
5:16
then we will output to c
5:19
we did
5:21
any typos nope all right so here are our
5:24
tickers right
5:26
pretty easy so the next thing we need to
5:28
do now that we have our
5:31
symbol data frame again right here is to
5:34
get the minute bar data frame and the
5:36
only thing that's you know even remotely
5:38
complex here is to link
5:40
the id well have a relationship from
5:43
minute bar to symbol we'll actually
5:45
create a symbol id field i just use it
5:48
convention
5:49
so anytime there's an id i just name it
5:52
the symbol underscore id so i understand
5:54
that's the case so really this id
5:57
you know so it's a
5:59
many-to-one relationship so that's
6:00
actually a different field if that makes
6:02
sense but you'll see what i mean in a
6:04
split second so let's create that
6:06
title
6:10
create
6:11
minute bar
6:13
data frame
6:21
yes i'm
6:22
being extra particular here but that's
6:24
how pandas has it
6:26
perfect
6:27
now we'll just grab those minute bars
6:30
but we need to
6:31
uh order them differently because think
6:33
about
6:34
the data right now it's by date we want
6:36
it by ticker and then by date because
6:38
we're going to insert by as no by ticker
6:41
and not by
6:42
date if that makes sense so minute bars
6:45
equals bars 1m we reset the index
6:49
we then sort values by
6:53
method
6:54
list ticker and then date
6:58
and then set
6:59
index
7:01
picker
7:04
date
7:06
and then let's make sure we got what we
7:07
think we have or we should have
7:10
would be by an alphabetical order by
7:13
ticker and then by date
7:15
okay and this looks good
7:18
perfect so now what we need to do
7:21
is we need to insert the data
7:24
okay so we'll create another column
7:26
heading here
7:28
insert data
7:32
and let's go ahead and use some of the
7:35
code that we created previously
7:37
from psql import db and session this is
7:42
so we can create a session to our
7:45
postgres database but again we're using
7:47
sql so you theoretically could be using
7:50
any database of your choice
7:52
and then we also need to grab the model
7:54
classes which
7:55
you know again that's these right here
7:59
so do from
8:00
models
8:01
import
8:02
symbol
8:05
and minute bar
8:07
okay
8:08
and now let's go ahead and think about
8:11
what we need to do right so what we want
8:13
to do is we want to loop through
8:17
every symbol
8:19
create the
8:21
symbol record right that's that parent
8:25
table record and then we want to link up
8:28
the minute bar
8:29
data which is that child table data to
8:32
that parent record
8:34
okay
8:35
so what we'll do is we'll just loop
8:37
through every ticker create the symbol
8:41
and then link this symbol to the minute
8:42
bar
8:43
hopefully that makes sense but i'm sure
8:46
we'll even more once we write the code
8:48
so 4 i
8:50
r so that's the index and then the r is
8:52
for the row
8:53
in symbols
8:55
but it arrows so we're just going to
8:57
iterate through all of the rows
8:59
a symbol equals
9:01
symbol so we pass it that symbol class
9:05
sticker and then for each row we just
9:07
essentially
9:08
populate
9:09
all of the fields right so ticker you've
9:12
got name so we'll just pass it the name
9:16
you've got market
9:21
it
9:22
and then our active boolean
9:24
active
9:26
active perfect so then once we create
9:30
the symbol
9:31
we add it to the session so this
9:34
adds it to the
9:35
sq alchemy session and now we need to
9:37
commit this right because if we don't
9:39
commit
9:40
this to uh
9:42
you know the database it won't have
9:45
an id for us to link to so we'll do
9:47
session
9:49
dot commit
9:51
and now that we have
9:53
the ability to reference our symbol with
9:55
its id we now can add the minute bar
9:58
data so we'll do bars equal minute bars
10:02
we'll take a cross section or xs
10:05
we're going to take a cross section by
10:07
grabbing the ticker we'll do r ticker
10:11
so then that will give us
10:14
um
10:15
you know so let's say we take our first
10:16
one would be a uh ada usd
10:20
you know essentially we remove this and
10:22
then our new cross section would be
10:26
the
10:27
date as the index and then the open high
10:30
low close volume data
10:33
so
10:34
go ahead and get that and we'll want to
10:36
reset the index because we want to be
10:38
able to get date we don't want it as the
10:40
index
10:42
okay and then we can do this this is the
10:43
magic so bars
10:46
id
10:47
is equal to
10:49
symbol
10:51
id right because we committed this to
10:53
the database and now we can reference
10:54
that id
10:55
and now we're going to do a bulk insert
10:57
we don't want to iterate through each
10:59
one of these that would each minute bar
11:01
row we can do a bulk insert which is a
11:05
lot faster it does bypass certain things
11:08
but we're not using that using that in
11:10
this instance
11:12
or certain features or functionality
11:14
it's bulk insert
11:16
mappings and we pass it the class which
11:19
is minute bar
11:21
and then we'll do bars
11:22
we're going to change that to a
11:23
dictionary
11:25
let me orient that by records
11:29
so we're just taking
11:30
um the
11:33
uh the bars data frame making it into a
11:36
dictionary essentially uh a row by rows
11:39
and then we can just
11:41
import that so then we'll do session
11:44
commit
11:46
i'll switch over to the terminal so we
11:48
can see
11:49
the data being populated
11:52
we select star from symbol
11:54
should see a few in there perfect and
11:56
let's make sure that those minute bars
11:58
exist so i'll just select
12:00
symbol dot ticker so we'll grab the
12:02
ticker from the symbol table
12:05
and then we'll do
12:06
minute bar dot star to get all the
12:09
fields
12:10
in the minute bar
12:12
from symbol
12:14
join
12:15
minute bar
12:16
on symbol id is equal to minute bar dot
12:20
symbol id
12:22
where
12:23
symbol dot ticker is equal to a usd
12:28
hit enter and now we have our minute bar
12:32
and it's obviously associated with our
12:34
ticker
12:36
see that wasn't so bad we now have a
12:38
postgres database with crypto data in it
12:41
now all we need to do is figure out how
12:43
to get that data out to be able to start
12:45
back testing now you can probably guess
12:48
what the subject of our next video is so
12:50
if you click right here you'll be taken
12:52
to the next video which actually shows
12:54
you how to get that data out of postgres
12:57
thank you and i hope to see you there oh
12:59
and if you don't mind hit the subscribe
13:02
button if you like these videos too all
13:03
right thanks and see you soon
#Programming
#Investing
