Learn how to use the Polygon.io Python API to get historical price data for crypto, stocks, futures, and forex.
👍 Subscribe for more: https://bit.ly/3lLybeP
👉 Follow along: https://analyzingalpha.com/polygon-api-python/
00:00 Introduction
01:11 Install Python Polygon API Client
02:01 Configure Polygon.io API Key
02:51 Create Polygon.io Python REST Client
04:22 Get Equities Aggregates
10:08 Create Retry Strategy
15:20 Create Get Tickers Method
23:56 Create Get Bars Method
34:50 Validate Results
#polygon.io
#python
#algotrading
#trading
0:00
hello world today we're going to go ahead and you know have you ever had a long day
0:07
and you're just things just start coming out right that's today so we're going to use the polygon api
0:14
to grab minute data we're going to take that minute data and put it in our database that we created in this series
0:20
here right the goal of this my channel essentially is to track the markets
0:25
figure out what really works through the lens of data science and back testing right why i don't know i'm
0:31
weird i just love this stuff okay so that's the goal for today but we
0:36
can't just use the polygon api itself it isn't sufficient for our needs exactly
0:42
right it allows us to grab data but we need to add the ability for it to have a what's called a retry strategy or
0:48
reconnect if it has problems and we also want to make sure that we can grab all of the historical data you know by
0:54
having a start date and an end date polygon doesn't allow you to do that my guess is they're trying to limit massive
1:00
api calls but we're going to take that api that they have and improve upon it
1:05
so that's what we're doing today i hope you're excited about it let's create some code
1:10
so the first thing that we're going to want to do is we're going to want to open up our jupyter notebook i've got a few basic titles here but we really want
1:17
to install the python polygon api client into our virtual environment now if you
1:22
don't know what virtual environments are or haven't used virtual environments before i'll put a link in description
1:29
below because you definitely don't want to be installing a bunch of software into your global environment so first things first go ahead and uh
1:36
and run bang pip install polygon-api-client right within the jupyter notebook get that
1:43
installed i already have that api installed so uh no need for me
1:48
to do that there so the next two lines of code so now that we've imported or installed
1:55
the polygon api client we can import the rest client from the polygon api so we're going to do just that and then the
2:01
next line of code local settings this contains the api key for polygon we did
2:07
something very similar uh whenever we were connecting to postgres using sqlalchemy you can see
2:14
that this file local settings is just you know contains dictionaries of you know the various key value pairs
2:22
of what's needed to create those connections to be able to connect with sqlalchemy so
2:29
essentially i have one right now that you don't see but i'll upload it after this lesson
2:35
it's polygon equals api key you know colon then the api key as a key value pair right
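A minimal sketch of what such a local settings file might contain, assuming a dictionary named polygon with an api_key entry as described above. The file layout is an assumption and the key value is a placeholder, not a real key.

```python
# local_settings.py (hypothetical layout, based on the description above)
# Keep this file out of version control; the value below is a placeholder.
polygon = {
    "api_key": "YOUR_POLYGON_API_KEY",
}

# In the notebook, after `import local_settings as settings`, the key would
# be read with settings.polygon["api_key"]. Here we read it directly.
api_key = polygon["api_key"]
```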
2:42
so that's that i'm going to hit enter here now let's create a polygon
2:49
client from the RESTClient client equals RESTClient
2:55
what do we need here init looks like we need an auth key uh auth key that's a string don't need to
3:01
worry about the timeout so let's just go ahead add a string there
3:08
as the api key so settings api key okay and then
3:15
client that should give us a rest api client
3:20
and as we can see here's the rest api client at that memory address
3:25
now we will see what's available to us using this client
3:30
i type client period and hit tab shows us all of the various methods
3:36
why don't we use the stock aggregates we see stock equities aggregates
3:43
i don't know exactly right off the top of my head what's needed go ahead and read the documentation
3:48
found on polygon.io looks like we need the ticker multiplier time
3:54
span from to okay so the ticker is just we'll just use apple that's fine multiplier
4:01
uh essentially if we were using you know say minute we could get five minute bars instead of one
4:06
minute bars we'll just keep it at one minute time span gives us various uh different time spans so we'll just
4:12
keep defaults here and we'll just get that 2021 uh
4:18
yeah we'll get the january 2021 data all right so ticker apple
4:25
multiplier one timespan
4:30
timespan equal minute from equals 2021-01-01
4:39
and to equals 2021-02-01
4:44
okay
4:50
and we want to obviously return this in a response and then that should give us a response
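For reference, here is a rough sketch of the request URL that a call like this wraps. The path layout is an assumption based on Polygon's v2 aggregates REST endpoint, the helper name aggregates_url is made up for illustration, and the api key parameter is left out.

```python
from urllib.parse import quote

BASE_URL = "https://api.polygon.io"  # assumed default host


def aggregates_url(ticker, multiplier, timespan, from_, to):
    """Build the URL for an aggregates (bars) request (assumed path layout)."""
    return (
        f"{BASE_URL}/v2/aggs/ticker/{quote(ticker, safe=':')}"
        f"/range/{multiplier}/{timespan}/{from_}/{to}"
    )


# One-minute AAPL bars for January 2021, matching the arguments above.
url = aggregates_url("AAPL", 1, "minute", "2021-01-01", "2021-02-01")
print(url)
```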
4:56
right so we're using the polygon api client we're making a request
5:01
you know uh which is essentially wrapped in polygon's api client so we're not worrying about
5:06
you know the actual uh urls and things like that it just has all of these methods for us we get back a
5:14
response which has you know various uh you know attributes such as um results
5:20
in it so let's go ahead and hit enter see if that worked does look like it works so let's see
5:27
what is available to us response period and then tab
5:32
i see uh various items here what i want to see are the results okay i believe
5:38
that's in a dictionary format okay great so this is the open high low close
5:45
let's see i guess that's uh volume vwap open close high low
5:52
transactions and no time and
6:00
oh
6:06
don't know what n is that's okay all right so now what we want to do is we want to get this data into a data frame
6:13
pretty easy to do so we'll do uh import pandas as pd and then df equal pd.DataFrame
6:21
of response.results and we'll print out that data frame
6:28
maybe that's probably number of transactions time volume vwap open close high low that's
6:34
what that is okay so now we have our data frame 5000 rows perfect
6:40
uh let's see here this oops this isn't very useful for me
6:46
because i don't exactly know uh the amount of time that passed by so we'll do df
6:52
date equals pd.to_datetime
6:58
we'll put t in
7:03
and unit equal millisecond and that should give us the actual time
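As a small illustration of the conversion, here is a standard-library sketch that turns a millisecond timestamp like the t field into a readable UTC datetime. The helper name ms_to_datetime is made up for this example; in the notebook the same idea is done on the whole column with pd.to_datetime and unit set to ms.

```python
from datetime import datetime, timezone


def ms_to_datetime(ms):
    """Convert a Unix timestamp in milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


# 1609459200000 ms is midnight UTC on 2021-01-01.
first_bar = ms_to_datetime(1609459200000)
print(first_bar.isoformat())
```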
7:12
all right the point what i'm trying to show you is we only got data to the 11th okay
7:20
so that's part of the challenge with this okay so let's go back and look at our response object and
7:27
does it provide like a next url or anything like that and it doesn't
7:32
so that's that's part of the challenge with this api is that
7:38
it doesn't provide what your next query should be so you have to handle that so
7:43
if we want to grab a month we could actually we could probably get a month
7:48
by going through and changing the limit to you know 50000 the default was
7:54
5000 which is why we got the 5000 rows um but once we start
8:00
wanting more and more data you know we're going to be limited to that 50000 and we don't have a you know
8:07
next url so it doesn't help us in that regard so that's what we have to fix and the second thing is you know we don't
8:14
expect i mean if we're going to be downloading hundreds or thousands of tickers at
8:19
minute resolution it's likely that there'll be disconnects and retries and things like that so we also
8:25
want to add that too so you know so the great news is we've used the
8:31
current polygon api we've got the data but we need to modify that api in order
8:38
to make it both sort of a little more resilient and also to be able to automatically grab that historical data
8:45
as opposed to us you know just kind of guessing what the next five thousand rows would be if that makes sense okay
8:51
all right perfect so let's create a new uh title and go from there before we dig into the code let's think about what we
8:58
need to do so we need to create a retry strategy so that way if
9:04
for some reason there's a disconnect or we get some type of error we can look at the http code and determine you know
9:11
what are we going to retry this specific command that we just attempted and you know what is the delay on that going to
9:17
be that's the first thing that we need to do the other thing that we need to do is we want to create our own methods for
9:22
getting the tickers and getting the bars so i wanted to take a step back and kind of
9:28
you know before we just dive right in the code get a real good conceptual understanding of what we're trying to do
9:34
okay let's get back to the code and we call this improved polygon api
9:42
okay and let's add our steps we'll do um add
9:48
retry strategy and we'll also add
9:54
get tickers and we'll also add
10:00
get bars that'll be uh essentially what we're doing
10:06
okay so the first things first let's add our retry strategy so we can
10:12
go to the original rest client right and you can see all of this code
10:18
here uh you know we don't have to override any of this stuff we're just going to
10:23
use it we just have to add our retry strategy in the init i'm
10:30
going to copy this okay
10:37
and paste it here all right so now we have our RESTClient class
10:43
but we're going to call this one MyRESTClient
10:50
it will inherit RESTClient okay makes sense so far right
10:56
and then right because the default api or the see if i go back
11:04
here right to the default host that's part of this rest class right so whenever we
11:10
do a super init we will get all of that so we'll do super
11:18
dot init auth key so what are we doing here basically what
11:24
we're doing is we're saying okay let's go to the super class which is the RESTClient and run its init
11:31
method right and passing in the auth key okay that gives us access
11:38
to everything that we need specifically to use uh the session but you'll see that
11:45
in a minute so now that we've called super retry strategy equals Retry
11:51
total i'm gonna have to actually import this but we'll do total 10 times backoff
11:57
factor 10 and then the status forcelist equals then we put in a
12:04
bunch of http codes so 429 500 502 503 504 now if
12:11
you're wondering where to get this just google um retry strategy
12:18
for requests okay so speaking of requests we need to import that from requests
12:27
dot adapters import HTTPAdapter
12:32
and from urllib3 dot util dot retry
12:38
import Retry okay so that's right that's what we need right so we've
12:44
got our http adapter and our retry strategy
12:49
okay so now what we can do is or we already have all this stuff when
12:54
we called super it gave us all of this stuff the only thing that we really want to
13:00
override is this because we want to mount
13:06
and we want to use this session but we want to mount our adapter so you'll see what i mean here
13:12
i can delete all this right because that's already taken care of in our parent class
13:19
then i'll do adapter equals HTTPAdapter max retries equal retry
13:27
strategy which we just defined above and then self dot underscore session
13:34
dot mount https colon front slash front slash and then the adapter that we just created
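The retry setup just described can be sketched on a plain requests session. This is only an illustration: the session here is a stand-in for the client's underscore session attribute, and the subclass itself is not reproduced. The Retry values are the ones mentioned above.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 10 times, with backoff, on these HTTP status codes.
retry_strategy = Retry(
    total=10,
    backoff_factor=10,
    status_forcelist=[429, 500, 502, 503, 504],
)
adapter = HTTPAdapter(max_retries=retry_strategy)

# Stand-in for the client's own session (an assumption for this sketch).
session = requests.Session()
session.mount("https://", adapter)
```

Every https request made through this session now goes through the adapter, so failed calls with one of the listed status codes are retried instead of raising immediately.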
13:41
right so basically again to recap we're creating
13:46
our own MyRESTClient class we're inheriting all of the RESTClient functionality we're able to use this underscore session
13:56
session.mount because we use super here when we call super we
14:02
get all of this stuff right makes sense and then again that's pretty pretty
14:08
common so awesome you should have a good understanding of that now what we'll do
14:14
is we'll try to create a client for our rest
14:20
client right so i will do client equal my rest client
14:26
right and pass in
14:32
settings passing the api key and assuming we did everything correctly
14:38
which looks like i didn't i rest client i rest oh
14:45
that would be why bl and okay and it looks like we have it there
14:52
now what's cool is we can hit period hit tab and look we've got everything
14:58
um that we had previously on the original uh you know the really
15:04
original polygon api so we have all of that functionality there now the only thing that we changed was we added this
15:10
http adapter with a retry strategy okay awesome okay so now the next thing that
15:16
we need to do is to get the tickers okay so we'll
15:22
add a space here and then let's see what uh you know
15:27
that reference tickers v3 actually does see where we're at we'll go tickers
15:35
ticker types is that reference
15:40
what are the parameters okay stocks options so basically we have to pass in a market
15:46
and potentially a locale so let's let's give this a shot so we'll do
15:52
client dot reference
15:59
tickers v3 and then market equal
16:05
crypto and call this response
16:12
if that worked we did get a response period tab to give
16:18
us our options you can see here you know this is what i was complaining about before whenever we're getting the the data there was no
16:25
next url let's see if there is a next url here depends on yep so it does seem like we can get the
16:32
next url which will get the next part of our you know ticker list
16:37
okay and then results so that gives us so all of those results
16:44
and we can easily put that in a data frame if we want let's do that now okay so we're gonna
16:51
need to import pandas as pd
16:57
and we will simply do pd.DataFrame
17:04
response df equals and see if that
17:10
worked here oops results
17:18
perfect so and then obviously we can slice and dice that however we want
17:23
but i think at this point we have everything that we need to create our ticker method right
17:29
and let's see remember it's a method because it's part of the class we're actually going to copy all of this
17:38
paste this here now what we'll do is we'll create that
17:44
get tickers method so we'll do def get tickers
17:53
then market we want to pass in the market it'll be a string
17:59
again i'm just putting type hints here it makes it easier for people to kind of understand right so if
18:04
you're not familiar with uh type hints um essentially
18:09
uh you know it just shows you market we expect a string right now it's none and this will
18:15
uh return a data frame so
18:20
let's create a list of our markets we'll do markets equal
18:26
crypto and we'll just add stocks for right now you'll see what i'm doing in a second we'll say
18:33
if not market in markets raise exception
18:40
right and we should have market so market must be one of
18:46
markets okay and now what we'll do is we'll
18:51
use that reference tickers method which you know
18:56
we currently have access to because we have a subclass to get those
19:02
tickers we'll do uh response equals self dot reference
19:08
tickers v3 market equal market
19:15
okay and then what we'll do here is we'll say if hasattr
19:22
response results right because what happens if
19:28
you know we get a response that doesn't have any results i i you know i want to make sure that that exists
19:34
if it does have results df equals pd.DataFrame
19:40
response results just like we just did previously and then while
19:46
hasattr response next url we saw that right while there's
19:52
a next url that exists we want to make a new request
19:57
and get a new response we'll do self reference tickers v3
20:03
next url equals response next url right as we saw that previously whenever we
20:09
saw that the tickers had that next url once we get to the end of the list it won't have that anymore right that'll be
20:16
the last response and then we'll just append these on on top of each other so df equals df.append
20:23
pd.DataFrame response results
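The next url loop described above can be sketched without the real client. In this sketch fetch_page is a hypothetical callable that returns one page as a dictionary, and the page contents are fake data used only to show the control flow.

```python
def collect_results(fetch_page, first_url):
    """Follow next_url links until a page has none, gathering all results."""
    results = []
    url = first_url
    while url is not None:
        page = fetch_page(url)
        results.extend(page.get("results", []))
        # The last page has no next_url, which ends the loop.
        url = page.get("next_url")
    return results


# Fake pages standing in for real API responses (hypothetical data).
FAKE_PAGES = {
    "page1": {"results": [{"ticker": "X:BTCUSD"}], "next_url": "page2"},
    "page2": {"results": [{"ticker": "X:ETHUSD"}]},
}

tickers = collect_results(FAKE_PAGES.__getitem__, "page1")
print([row["ticker"] for row in tickers])
```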
20:29
okay and then if that'll be all of our
20:35
results then if market equals crypto
20:41
let's do um only use usd pairings right
20:46
and we'll do df equals df we'll do some boolean logic here currency
20:53
symbol equals usd but i only want anything
20:58
where the currency symbol has usd which means it's a us dollar pairing and then the name right now
21:05
uh we'll just set that to the base currency name so base currency name
21:12
remember we did this up here so um the base currency name
21:17
is you know one inch or eight essentially the oh no it's right here that's a base
21:23
currency base currency name right here so alpha cat or a cat right
21:28
okay i just want to show you where i'm getting that and then i only want certain uh
21:35
fields remember uh our entity relationship diagram or erd
21:41
we're only concerned with ticker
21:47
name market and active okay and then this
21:53
is almost everything that we need to do so we'll just for good measure df.drop_duplicates
22:01
it does happen duplicates subset so if the ticker is duplicated we want
22:07
to drop it it'll just drop um you know that i believe it's the second
22:13
one it's uh the exact match and return df or
22:19
return none so the return none doesn't actually have to be there you know i like to make it explicit so we understand how
22:26
we're handling that but essentially um you know we get tickers if there aren't
22:32
any it will return none and this could be um you know optional because it returns
22:38
none too but that's that point so for our intents and purposes right now this looks good so hit enter
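Here is a rough plain-Python sketch of the cleanup steps for crypto tickers: keep only usd pairings, take the name from the base currency name, drop duplicate tickers, and keep four fields. The notebook does this on a pandas data frame; the field names follow the ones mentioned above, and the sample rows are made up.

```python
def clean_crypto_tickers(rows):
    """Keep USD pairs, dedupe by ticker, and keep four fields."""
    seen = set()
    cleaned = []
    for row in rows:
        if row.get("currency_symbol") != "USD":
            continue  # not a US dollar pairing
        if row["ticker"] in seen:
            continue  # keep the first occurrence only
        seen.add(row["ticker"])
        cleaned.append({
            "ticker": row["ticker"],
            "name": row["base_currency_name"],
            "market": row["market"],
            "active": row["active"],
        })
    return cleaned


# Made-up sample rows for illustration.
rows = [
    {"ticker": "X:BTCUSD", "currency_symbol": "USD",
     "base_currency_name": "Bitcoin", "market": "crypto", "active": True},
    {"ticker": "X:BTCEUR", "currency_symbol": "EUR",
     "base_currency_name": "Bitcoin", "market": "crypto", "active": True},
    {"ticker": "X:BTCUSD", "currency_symbol": "USD",
     "base_currency_name": "Bitcoin", "market": "crypto", "active": True},
]
cleaned = clean_crypto_tickers(rows)
print(cleaned)
```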
22:45
and let's see uh how many typos i made the client equals my rest
22:52
client settings again just passing the api key in which then gets passed into
23:00
uh the init which then gets called with super right that's where all the magic happens
23:05
and then df equals client dot get tickers and market equal crypto
23:12
yeah i see i knew there would be something here drop duplicates
23:21
subset and that's it perfect so now let's just kind of recap
23:29
we ended up you know overwriting or adding a retry capability a retry strategy to our rest
23:36
class and we created this get tickers which will loop through all
23:43
of the next urls for us automatically right append it to a data frame and then clean up the data frame if it's a crypto data
23:49
frame and return it and that's what we see here perfect so that's the easy part now on to a little bit more challenging
23:55
which is the get bars and because again it is a class
24:02
i'm going to copy all of this paste it down here and now we can start
24:10
working on our get bars method def
24:15
get bars and now we'll do
24:22
market which will be a string
24:28
ticker string multiplier int
24:33
one timespan string equal minute
24:39
and then from date none and to
24:45
date and that will return a pd.DataFrame
24:51
okay again i know i could improve upon that but the whole point is just to make things a little more clear for for you
24:58
guys without taking up too much screen real estate okay so now we have get bars which is
25:04
essentially takes the um you know that crypto aggregate you know for crypto
25:12
and and brings it all together essentially creates almost like a next url for it right because that's the
25:18
challenge is that the aggregates don't help you like with the next page
25:24
of data you know you just get one you know one block of data and then that's it you have to handle all the
25:29
dates so go ahead and do this we'll do if not market in markets
25:38
we raise exception must be one of
25:46
markets and then
25:52
if ticker is none raise exception
25:58
ticker must not be none okay and then we'll do from
26:04
equals from if we passed it right from if we passed it
26:11
else we'll just make it uh year 2000
26:17
all right and then to will be today right so to equals to if we pass it else it
26:23
will be date today
26:30
from datetime import date okay so now
26:36
um we have various markets right so we're going to just handle the crypto in this one but maybe in the github version
26:44
i'll handle um stocks too but if market equals crypto
26:51
we'll do a response we've already seen this right self dot crypto aggregates
26:57
the ticker multiplier timespan
27:05
from strftime
27:10
we're going to convert it to a string right string format time
27:16
the year month day pattern and to
27:22
strftime all right same thing
27:27
year month day okay
27:32
and then we'll do the limit limit equals 50000 which is the max
27:40
and then this will create um you know that response and remember before we
27:45
ended up creating a data frame out of it so pd.DataFrame
27:50
response dot results let's set
27:56
the last minute equal to zero we're basically just
28:01
initializing a variable here so now let's think about what we want to do we want to
28:07
loop through all of the results right we want to get the last minute
28:13
or the last date from the result and then that'll be our next query right
28:20
so let's just say um you know we'll just say we got a bar or uh the 23rd our last date was the
28:27
23rd of october well that would be our last minute we want to get everything from the 23rd on
28:33
right so that's how we're going to identify you know what we're going to grab next
28:39
it's simply the last row or the last time stamp okay this would
28:45
be this will make this will be more clear as we continue through with the example we'll do
28:50
while response results negative one which is again the last row
28:58
then t remember i converted that from milliseconds before but that's the time or the time stamp
29:04
right so i'll say last minute and response to make that clear
29:10
we'll do last minute equal response results
29:18
while response results negative one t greater than last minute
29:24
sorry last minute response results and that's where you can see zero will
29:30
always be will always work right because minute will always be greater than zero
29:37
so we go last minute equals response results we want that last
29:44
row's t
29:50
and then we'll do last minute date last minute date
29:56
equals datetime fromtimestamp we do last minute divided
30:01
by a thousand and we can convert that with
30:07
strftime that'll be again year
30:13
month day now with that data we can then create a
30:18
request right so response equals self crypto aggregates
30:25
the ticker multiplier time span remember the ticker is just whatever
30:30
ticker is passed in the multiplier in this case is one timespan minute
30:38
last minute date is right here which is essentially the last time stamp right
30:45
but converted to a date of our prior call
30:50
then to is simply today so strftime
30:57
year month day and then we'll do that
31:02
limit equals five thousand or rather fifty thousand so we can get as many rows per request as we can
31:09
and then we'll call this new bars so new bars equals pd.DataFrame
31:15
response which is what we just called right here and we'll do results so that'll give us
31:21
the new the new data frame right because we've already made a request up here and assuming
31:26
uh you know yeah we've already made a request up here so now
31:32
we will then keep looping until the last minutes match you'll see what i mean in a second
31:40
but we're going to take new bars and then we're going to append those bars but you can't just mush
31:47
them together because if you think about it some of the bars might overlap we're
31:52
giving polygon a start date and end date we're not giving any minute values so if our
31:59
last minute bar was at noon we're still getting all of those you know prior
32:04
values we need to make sure whenever we are merging
32:09
or appending that we are appending only the bars where the
32:15
timestamp is greater than our last minute bar makes sense that
32:22
filters out the earlier bars then df date
32:28
is equal to pd.to_datetime df
32:34
t which is the time stamp with unit millisecond that simply converts that time stamp
32:41
into a more human readable date understanding that the unit is millisecond
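The whole get bars loop can be sketched with the standard library. This is an illustration of the idea rather than the notebook's code: fetch_bars is a hypothetical callable that takes from and to date strings and returns bars oldest first, and the fake data below stands in for the api. It uses utc when converting timestamps, which is an assumption of this sketch.

```python
from datetime import date, datetime, timezone


def get_all_bars(fetch_bars, from_=None, to=None):
    """Page through bars by re-querying from the date of the last bar seen."""
    from_ = from_ if from_ else date(2000, 1, 1)
    to = to if to else date.today()
    to_str = to.strftime("%Y-%m-%d")

    bars = list(fetch_bars(from_.strftime("%Y-%m-%d"), to_str))
    last_minute = 0
    # Keep going while the newest bar is newer than the one seen last time.
    while bars and bars[-1]["t"] > last_minute:
        last_minute = bars[-1]["t"]
        last_date = datetime.fromtimestamp(last_minute / 1000, tz=timezone.utc)
        new_bars = fetch_bars(last_date.strftime("%Y-%m-%d"), to_str)
        # The re-query starts at midnight, so drop bars we already have.
        bars.extend(b for b in new_bars if b["t"] > last_minute)
    return bars


DAY_MS = 86_400_000
START_MS = 1609459200000  # 2021-01-01 00:00 UTC
# Three days of fake data, one bar every six hours (hypothetical).
ALL_BARS = [{"t": START_MS + i * DAY_MS // 4} for i in range(12)]


def fake_fetch(from_str, to_str, limit=5):
    """Return at most `limit` fake bars on or after from_str (to_str ignored)."""
    start = datetime.strptime(from_str, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    start_ms = int(start.timestamp() * 1000)
    return [b for b in ALL_BARS if b["t"] >= start_ms][:limit]


bars = get_all_bars(fake_fetch, from_=date(2021, 1, 1), to=date(2021, 1, 4))
print(len(bars))
```

One limitation of this approach is that it only makes progress if a single day of bars fits within the request limit. A day of minute bars is at most 1440 rows, so a limit of 50000 leaves plenty of room.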
32:47
now let's rename our columns
32:57
so we'll filter for only the columns that we'll want and that will be
33:03
date open high low close volume
33:17
and we'll return the data frame and then if for some reason we did not get data we'll just return none
33:23
we don't necessarily have to do that but i like to be explicit
33:29
now let's test it we'll create a new client we'll do MyRESTClient
33:35
pass in settings the api key
33:44
enter see that we do have a rest client now let's use our new method
33:49
type client get bars get all of the crypto bars
33:57
the market will be crypto for the bitcoin bars and the ticker will be
34:02
bitcoin actually x colon btcusd
34:13
we do here oh data that should be date
34:30
i need datetime also i'll add that
34:48
now that that's finished running let's print the data frame to the screen and perfect it looks like that's exactly
34:54
what we want congratulations you now have minute bar
35:00
data from polygon io and you learned how to also take an existing api and improve
35:07
upon it pretty cool stuff so now that you have that minute bar data you can now finally start to test out various
35:14
strategies now if you need some help with that hey that's what i'm here for but if you click on this playlist here
35:21
this will be show again one of those days this will show a playlist
35:28
of a bunch of different ways to backtest various strategies including stocks forex all of that fun stuff so i hope
35:35
you liked this video if you did please hit the subscribe button which i think should be around here and hit the thumbs
35:41
up and i hope to see you in the next one thanks bye
