Python 3 Github API V3 Web Scraping Script to Scrape Github User Profile Info & Save it in Excel

0:00
Hello guys, welcome to this video.
0:02
In this video I will show you an
0:05
example of web scraping GitHub
0:08
to fetch a user's profile
0:10
details and save them as an Excel file.
0:14
It's a simple web scraping script
0:16
using the Beautiful Soup library in
0:18
Python. For this we'll be requesting the
0:20
GitHub profile page to fetch the
0:23
user's profile details. GitHub, as
0:26
you all know, is a popular website for
0:28
coding and programming. It's a
0:31
mini social network. If you want to
0:33
fetch a particular user's details, let's
0:36
suppose I take my own
0:38
example. This is what the profile
0:41
looks like: we have the profile
0:43
picture, we also have the
0:45
username, we have the short
0:48
description, the email address,
0:50
followers,
0:51
everything. You can take any other
0:54
example here. Let's
1:10
suppose I take this
1:14
one. This is my script here,
1:17
and here you just need to replace the
1:20
username.
1:23
This is what the
1:26
username looks like. You just
1:29
need to copy this username to fetch the
1:32
other profile's details. So I will simply
1:34
copy this username. After that I run the
1:37
Python script. You will see it
1:39
scrapes all the details and saves them
1:42
inside an Excel file. You will see it
1:44
creates this Microsoft Excel
1:47
file, and it scrapes
1:50
the description, the name, the company
1:54
name, and the public repositories as well. You
1:57
also get the avatar URL, which you can
2:03
see. So we have scraped everything here.
2:06
Similarly, you can put in any other
2:09
username.
2:15
So again you run the
2:23
script, and you can see it again
2:26
scrapes all these details: location,
2:32
296. The whole script is given in the
2:35
description of the video. Now let me
2:37
show you how to make this. The
2:40
first library you need to install is the
2:43
web scraping library for Python, which
2:45
is called Beautiful
2:47
Soup 4. It's a screen scraping library, so
2:52
if you want to extract data from web pages
2:55
directly you can use this package in
2:57
Python. After installing it, just
3:00
make a simple app.py file. Then we
3:04
first of all require the requests module,
3:06
which is a third-party package that you install with pip rather than a built-in module. Then from
3:08
the Beautiful Soup library we'll be
3:10
importing BeautifulSoup like this. For this we
3:14
also need the pandas library for
3:16
exporting the data to the Excel
3:18
file, so install this package as
3:23
well. You can see this is the
3:26
package. After that you just need
3:30
to define a function which will
3:33
do the scraping for us. We
3:36
define this function, scrape_github_profile.
3:38
We receive the username as an
3:41
argument and then we simply hit the
3:44
endpoint where we will be fetching the
3:47
user profile data:
3:50
github.com/ followed by your username, which you
3:52
paste right
3:55
here. This dollar sign is not required;
4:02
just write it like this. The username will be
4:06
the dynamic part. Then we'll be
4:08
providing our
4:11
headers. This is necessary:
4:13
whenever you are doing web scraping you
4:15
need to provide a user agent. You
4:19
can set it to this
4:22
value: Mozilla/5.0.
4:29
Then we will first of
4:33
all request the HTML of this page. We'll
4:37
simply be calling the get method, passing
4:39
the URL with these
4:42
headers, and then we will check
4:45
whether the response status code
4:49
is not equal to 200.
4:56
If it is not 200, we can
5:00
print out an error message that we
5:03
failed to fetch the user. But if the response
5:06
is there, we need to
5:08
extract the relevant information. First
5:11
of all we need to initialize the web
5:13
scraping library, Beautiful Soup, and here
5:16
we need to pass the response HTML text
5:20
which is coming back. Now we just need to
5:23
parse this HTML to extract the relevant
5:27
information.
5:30
This is
5:33
your HTML, which is contained inside
5:45
response.text. If I just call this
5:48
function right here,
5:59
passing this username,
6:03
and try to run
6:07
it, I think there is some kind of
6:10
error here.
6:15
The return,
6:21
this needs to come inside
6:36
this.
6:47
So here we need to initialize this by
6:51
passing this response.
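
A minimal sketch of the request step described so far, assuming the third-party requests package is installed. The helper names profile_url and fetch_profile_html, the exact User-Agent string, and the 10-second timeout are illustrative assumptions, not taken from the video's script.

```python
from typing import Optional

import requests

# A browser-like User-Agent, as the video suggests; the exact string is an assumption.
HEADERS = {"User-Agent": "Mozilla/5.0"}


def profile_url(username: str) -> str:
    """Build the public profile URL for a GitHub username."""
    return f"https://github.com/{username}"


def fetch_profile_html(username: str, timeout: float = 10.0) -> Optional[str]:
    """Return the profile page HTML, or None if the request did not succeed."""
    response = requests.get(profile_url(username), headers=HEADERS, timeout=timeout)
    if response.status_code != 200:
        print(f"Failed to fetch user {username!r} (status {response.status_code})")
        return None
    return response.text
```

Returning None on a non-200 status lets the caller skip the parsing and export steps with a simple if check.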
7:04
We can just print it out to see
7:07
what is returned here. If I
7:10
execute this, you will see it returns
7:12
all this HTML that is
7:15
on the page. Here is how web scraping works,
7:18
let me show
7:22
you. This is the profile
7:24
page. If you right-click and choose Inspect
7:27
Element, every element in this web
7:30
page has a particular tag,
7:32
as you'll see right here. We will be
7:34
doing the scraping by targeting these
7:36
tags by their CSS class or
7:41
ID. So we'll be inspecting each element
7:44
to work out how to target it. Let me
7:48
show you how we can do this inside
7:50
Beautiful Soup. Here we initialize
7:53
it by passing html.parser as the parser.
7:57
After that we can individually
8:00
target the elements. To extract the name
8:04
we have the find
8:06
method, and with it you can target
8:10
elements having unique properties. For
8:14
this one we are finding the span tag
8:18
which has a class of p-name.
8:21
If you cross-check
8:24
by moving your mouse over the name right
8:27
here, right-clicking, and choosing Inspect Element,
8:31
you will see in the HTML that
8:33
the name's span tag has this
8:37
unique class, p-name. So we are
8:40
targeting this to get the actual
8:43
value. This will return the name element.
8:45
Now to get the actual value we'll be
8:48
reading its text property.
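
The find-then-text step can be sketched as follows, assuming the beautifulsoup4 package is installed. SAMPLE_HTML is a made-up stand-in for the real profile page, and extract_name is a hypothetical helper name; only the span tag and the p-name class come from the video.

```python
from bs4 import BeautifulSoup

# Stand-in markup for illustration only; a real profile page is far larger.
SAMPLE_HTML = '<h1><span class="p-name vcard-fullname"> Example User </span></h1>'


def extract_name(html):
    """Return the text of the first <span class="p-name">, or None if absent."""
    soup = BeautifulSoup(html, "html.parser")
    tag = soup.find("span", class_="p-name")
    # find() returns None when nothing matches, so guard before reading .text.
    return tag.text.strip() if tag else None


print(extract_name(SAMPLE_HTML))
```

The None guard matters because a profile with no display name, or a change in GitHub's markup, would otherwise raise an AttributeError.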
8:51
This will get the name for
8:53
us. Similarly, we will extract more
8:57
details such as the biography, company name,
9:00
all these things. The process will
9:02
remain the same for all the
9:05
properties. You can see for the biography
9:08
we are finding a div element which
9:11
has this class,
9:13
and similarly for the company as well.
9:16
Then we also have the
9:18
location and the number of repositories.
9:22
In this way you can
9:25
fetch any sort of data which is there on
9:27
the web
9:33
page. Then there is the number of repositories,
9:37
and lastly the
9:39
profile picture, which goes
9:42
inside the avatar URL. In this
9:46
way we have fetched all these details, and
9:48
at last we just need to return these
9:50
details inside
9:51
a single
9:53
object,
10:02
putting all these properties inside it.
10:04
We are directly returning it
10:06
from this function, and it has username,
10:08
biography, name, company, location, public
10:11
repos, and avatar URL.
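
A sketch of collecting all the fields into one dictionary, assuming beautifulsoup4 is installed. Only the p-name class is named in the video; every other tag and class below (p-note, p-org, p-label, Counter, avatar-user) is an assumption about GitHub's markup that should be checked with Inspect Element, and parse_profile and _text are hypothetical names.

```python
from bs4 import BeautifulSoup


def _text(soup, tag, css_class):
    """Return the stripped text of the first matching element, or None."""
    element = soup.find(tag, class_=css_class)
    return element.text.strip() if element else None


def parse_profile(username, html):
    """Parse profile-page HTML into a dict of the fields listed in the video."""
    soup = BeautifulSoup(html, "html.parser")
    avatar = soup.find("img", class_="avatar-user")  # assumed class name
    return {
        "username": username,
        "biography": _text(soup, "div", "p-note"),       # assumed class name
        "name": _text(soup, "span", "p-name"),           # class named in the video
        "company": _text(soup, "span", "p-org"),         # assumed class name
        "location": _text(soup, "span", "p-label"),      # assumed class name
        "public_repos": _text(soup, "span", "Counter"),  # assumed class name
        "avatar_url": avatar.get("src") if avatar else None,
    }
```

Fields that are missing from the page come back as None instead of raising, so one incomplete profile does not stop the script.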
10:14
After getting this we just need to
10:17
save it inside an Excel file so that
10:21
these details are kept. Right here
10:23
we have an if condition: if the
10:26
profile details are there, then we'll be
10:29
building a table using the pandas
10:31
library by calling DataFrame
10:34
and saving it to
10:38
an Excel file. pandas has a
10:41
function for this, which is to_excel.
10:52
So this is the overall script.
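
The export step can be sketched as follows, assuming pandas is installed along with an Excel writer engine such as openpyxl. The function names and the github_profile.xlsx file name are hypothetical, not taken from the video's script.

```python
import pandas as pd


def profile_to_frame(profile):
    """Wrap one profile dict in a single-row DataFrame (keys become columns)."""
    return pd.DataFrame([profile])


def save_profile(profile, path="github_profile.xlsx"):
    """Write the profile to an Excel file; return False when there is nothing to save."""
    if not profile:
        return False
    # index=False leaves the DataFrame's row index out of the sheet.
    profile_to_frame(profile).to_excel(path, index=False)
    return True
```

Passing the dict inside a list gives one row per profile, so the same function extends to several users by passing a longer list to DataFrame.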
10:57
Let me just delete this file and
11:00
execute the script again. The nice thing
11:03
about this is that you can
11:06
run it as many times as you like. The
11:08
script itself doesn't have any sort of
11:10
limit, though GitHub may throttle very frequent requests. You will see after that
11:13
it creates the Excel file here, where
11:15
you will see the user details such as
11:18
username, biography, name, company, location,
11:20
public repos, and the avatar URL. So this
11:24
is the Python web scraping
11:26
script. All the source code is given
11:29
in the description of the video. Also
11:32
check out my website, freemediatools.com,
11:35
which contains thousands of tools.

webninjadeveloper.com
