Python 3 StackExchange API to Web Scrape Stackoverflow Questions of Any Topic and Export it in CSV
Jun 1, 2025
Get the full source code of application here: https://codingshiksha.com/python/python-3-stackexchange-api-to-web-scrape-stackoverflow-questions-of-any-topic-and-export-it-in-csv-file/
0:00
Hello guys, welcome to this video.
0:02
In this video I will show you a
0:04
Python script which will
0:06
return Stack Overflow questions,
0:09
coding questions on any topic you set
0:11
inside this Python script, and it is
0:14
using their Stack Exchange API
0:17
to scrape them. They do
0:20
offer the Stack Exchange API, so we are
0:23
just making a simple GET call to this
0:25
API to fetch the coding
0:28
questions which are asked on this
0:30
popular website, which is called Stack
0:34
Overflow, the most popular
0:37
programming and coding website,
0:38
where people ask questions. They
0:41
do offer their website, which is
0:42
stackexchange.com, so we'll be writing a
0:44
simple Python script which will
0:46
scrape the top questions related
0:48
to any topic; it can be Python,
0:50
JavaScript, PHP, any sort of topic,
0:54
and then we'll be saving these
0:56
questions in a CSV file. So this is
0:58
the Python script that you can
1:00
see right here, and for this we are using
1:02
the pandas library for handling the data. Let me
1:05
first of all show you by running this
1:07
application. We are setting the topic
1:09
here; you can see it's Python, so it is
1:12
fetching the top questions
1:14
which are asked there. If I run this
1:16
Python script, you will see on the left-hand
1:18
side it will scrape all
1:21
the top questions into the CSV file
1:24
here, and now you will see inside the CSV
1:27
file we have different properties: the
1:29
title of the question, the direct link to
1:31
that question, the creation date, what is
1:34
the score, what are the tags, who owns the
1:36
question, the search topic; everything is
1:40
there inside this. You can see
1:43
that this is the direct link to
1:45
the question which is asked right here,
1:47
so you can
1:49
directly copy the link here and you can
1:52
follow this question by directly pasting
1:54
that link. So this makes the job
1:57
much easier if you want to search a
2:00
specific set of topics on
2:01
stackoverflow.com and you need to get
2:03
questions. Similarly, I can
2:06
modify the script here and replace the
2:08
topic: instead of Python we want PHP-centric
2:11
questions, so I will simply
2:13
change the topic from Python to PHP. Once
2:15
again, if you run it, it will now create
2:19
the second file here. Let me change the
2:22
file name
2:24
here; I will say
2:28
PHP. So again, if I run this, you will
2:31
see it will create the second file here;
2:33
this is only for PHP questions. If I try
2:36
to open this, you will see once again it
2:39
fetches
2:40
the questions here related to
2:43
PHP on stackoverflow.com. I have given the
2:46
script in the description of the video.
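Changing the topic and the output file name by hand, as done in the demo, can also be tied together in code. The sketch below is only an illustration: the helper name `csv_name_for` and the file name pattern are assumptions, not something taken from the video's script.

```python
import re


def csv_name_for(topic):
    """Build an output file name from the search topic (hypothetical naming scheme)."""
    # Lowercase the topic and collapse anything that is not a letter or digit into "_".
    slug = re.sub(r"[^a-z0-9]+", "_", topic.lower()).strip("_")
    return f"stackoverflow_{slug}_questions.csv"


print(csv_name_for("Python"))  # stackoverflow_python_questions.csv
print(csv_name_for("PHP"))     # stackoverflow_php_questions.csv
```

With a helper like this, switching from Python to PHP would only require changing the topic; the second CSV file gets its own name automatically.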
2:48
Now let me show you how to create this.
2:51
For this you need to install the
2:53
pandas library, so you need to go
2:55
to this website and simply search for this
2:57
package, which is pandas, a data
3:00
analysis package for Python. The
3:03
command is simply pip install pandas,
3:05
so simply install this package.
3:08
After installing it you just need to
3:10
create an app.py file, and then
3:13
here you need to import the requests
3:16
module, which is a third-party package (installed with pip install requests)
3:18
used to make API calls, and the
3:21
second module is the pandas package, so
3:23
import pandas as pd, and then we
3:26
will be setting the topic here. The
3:28
topic can be anything; I will say
3:29
JavaScript questions I need to search
3:32
here. We just create this, and then we
3:34
will be hitting this endpoint here, the
3:36
URL of the Stack Exchange API, from where
3:40
we'll be fetching the questions, so
3:43
the URL is
3:48
api.stackexchange.com/2.3/search. This is
3:50
the endpoint here. It's a
3:52
free API, and you can use
3:54
it without an API key, but not an unlimited number of times:
3:56
requests without a key are limited to a daily quota of 300 per IP address,
3:58
and registering for an API key raises it to 10,000. It's a
4:01
free API, and then here you
4:03
need to pass the parameters.
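The setup described up to this point might look like the following minimal sketch. It assumes the file is `app.py` as in the video, that both `requests` and `pandas` were installed with pip, and that the endpoint is the API's `/2.3/search` route; the variable names `topic` and `url` are illustrative, not taken from the original script.

```python
# app.py: setup sketch. Install the two packages first:
#   pip install requests pandas
import requests  # HTTP client used for the GET call
import pandas as pd  # used later to write the CSV file

# Topic to search for; any technical topic can go here.
topic = "javascript"

# Search endpoint of the Stack Exchange API (version 2.3).
url = "https://api.stackexchange.com/2.3/search"
```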
4:06
So you can see the latest questions here.
4:09
We will be saying that for order we need
4:12
the questions to be in descending order (desc),
4:14
then we will also be passing the sort,
4:18
which will be activity, so the most recently active
4:20
questions will be returned first, and
4:23
then in intitle we will say that the title
4:26
should match the topic that we are
4:28
asking for; we only want the topic-centric
4:30
questions. Then you provide the
4:33
site, which is the site you are
4:35
searching for questions, so we will be
4:36
providing that the site should be stackoverflow,
4:38
and then there is the next
4:40
property, pagesize, which is how many questions you
4:42
want to fetch. Here you can fetch
4:46
a maximum of 100 questions per page, so I
4:49
will say 100, and then you can set the
4:52
page number as well, so I will say page
4:54
to be one. These are the different
4:56
arguments that you can pass.
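The arguments listed here could be collected in a dictionary like the one below. This is a sketch: the parameter names (`order`, `sort`, `intitle`, `site`, `pagesize`, `page`) are those of the Stack Exchange search endpoint, while the helper name `build_params` and its defaults are assumptions made for illustration.

```python
def build_params(topic, page=1, pagesize=100):
    """Return the query parameters for a title search on Stack Overflow."""
    return {
        "order": "desc",          # descending order
        "sort": "activity",       # most recently active questions first
        "intitle": topic,         # the title must contain the topic
        "site": "stackoverflow",  # which Stack Exchange site to search
        "pagesize": pagesize,     # questions per page; the API allows at most 100
        "page": page,             # page number, starting at 1
    }


params = build_params("javascript")
print(params)
```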
4:59
We will now make the request along with
5:01
these options, so we will simply call the
5:03
requests.get method and we
5:06
will make the simple GET request to the URL,
5:09
and then we'll be passing the parameters
5:11
that we set, and then we'll be getting
5:14
this data. We then just need to convert
5:16
this response to JSON, and then we can
5:20
print it out to see what
5:22
is returned. So if I
5:26
make the request again and run the
5:29
script, you will see this JSON response
5:32
will be returned to us. A total of 100
5:35
questions are returned here, and we have
5:37
various properties such as owner, which is
5:40
who owns the question. So now we just
5:42
need to save this inside the CSV file.
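The request step described above might be sketched as follows. The function name `fetch_questions`, the `get` argument (there only so the function can be tried without network access), and the 10-second timeout are assumptions for illustration; `requests.get`, `raise_for_status`, and `json` are the library's own calls.

```python
import requests

API_URL = "https://api.stackexchange.com/2.3/search"


def fetch_questions(params, get=requests.get):
    """Send the GET request and return the decoded JSON response as a dict."""
    response = get(API_URL, params=params, timeout=10)
    response.raise_for_status()  # raise on HTTP error statuses (4xx/5xx)
    return response.json()
```

Calling `data = fetch_questions(params)` and then `print(data)` would show the JSON response; the questions themselves sit under the `items` key, and the `quota_remaining` key reports how many requests are left for the day.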
5:45
For saving it we now need to extract the
5:48
information, so we'll make questions,
5:51
which will be an empty list, and then we
5:54
will loop through each question: for item
5:57
in
5:59
data.get(...), like this. After
6:02
that, for each question we will
6:05
be appending it inside this list
6:08
using the append method, and then we
6:10
have various properties here which we
6:13
can get, the first being the title of the
6:17
question, so we can simply get this
6:19
property, the item's title.
6:22
Similarly, we will get the direct link to
6:24
the question by the item's
6:27
link, and then we have the creation date,
6:31
so when the question was created, that
6:33
date will be returned here. You can
6:36
get this using this property, which is
6:39
creation_date,
6:42
and similarly we will get the score,
6:46
tags, owner, and the search topic;
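The extraction loop described here could look roughly like this. The field names (`title`, `link`, `creation_date`, `score`, `tags`, `owner`) are the ones the API returns for a question; the function name `extract_questions`, the `search_topic` column name, and the choice to join the tags into one string and keep only the owner's display name are assumptions, since the transcript does not show those details.

```python
def extract_questions(data, topic):
    """Turn the API response into a list of flat rows, one per question."""
    questions = []
    for item in data.get("items", []):
        questions.append({
            "title": item.get("title"),
            "link": item.get("link"),
            # Returned by the API as a Unix timestamp in seconds.
            "creation_date": item.get("creation_date"),
            "score": item.get("score"),
            # The API returns a list of tags; join them so they fit one CSV cell.
            "tags": ", ".join(item.get("tags", [])),
            # The owner is a nested object; keep only the display name.
            "owner": item.get("owner", {}).get("display_name"),
            "search_topic": topic,
        })
    return questions
```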
6:49
everything will be
6:52
there. You can go to the
6:55
description and simply get the full
6:57
script. Once you get all the
6:59
properties here, we can save them
7:02
inside a
7:05
CSV file. For this we'll be using the
7:07
pandas library, which contains this DataFrame
7:10
class; we'll be converting these
7:13
questions to a DataFrame, and then we will
7:16
have this function, which is to_csv.
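The save step discussed at this point might be sketched like this. `pd.DataFrame` and `DataFrame.to_csv` are the pandas calls named in the video; the function name `save_questions` and the wording of the printed message are assumptions.

```python
import pandas as pd


def save_questions(questions, filename):
    """Write a list of question rows (dicts) to a CSV file and return the DataFrame."""
    df = pd.DataFrame(questions)
    # index=False leaves out the DataFrame's row numbers as an extra column.
    df.to_csv(filename, index=False)
    print(f"Saved {len(df)} questions to {filename}")
    return df
```

A call such as `save_questions(questions, "questions.csv")` would then produce one CSV row per question, with the dictionary keys as the column headers.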
7:19
So this will create a CSV file with
7:21
the questions that you have fetched. You
7:23
can give it a file name and set the index
7:25
property to False, and then you can print
7:28
out that your questions have been
7:35
saved. So let me just delete everything
7:38
here and start from scratch. So now, as
7:41
you see, if
7:42
I run the Python script it will
7:46
scrape the top 100 questions and it will
7:48
save them inside the
7:50
CSV file here, you will
7:54
see. So now it has fetched the top 100
7:57
questions related to JavaScript on
8:00
stackoverflow.com. This is really useful
8:02
because
8:04
using this website you can search for
8:07
quality questions which are asked
8:09
related to the topic. So this is
8:11
JavaScript; you can change the topic here.
8:14
Let's suppose you say I want the
8:17
questions to be related to Photoshop, so
8:20
once again you
8:25
can. So it can be any topic, as long as
8:29
it's a technical topic, and then run this,
8:31
so it will again fetch the questions
8:38
here. So again you will see the top 100
8:41
questions related to
8:43
Photoshop.
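Each run shown here fetches a single page of up to 100 questions. If more were wanted, one possible extension, which is not part of the video's script, is to keep increasing the page number while the response's `has_more` flag is true. In this sketch `fetch_all`, `fetch_page`, and `max_pages` are hypothetical names; `fetch_page(page)` stands for any function that returns the decoded JSON for that page.

```python
def fetch_all(fetch_page, max_pages=3):
    """Collect items page by page until the API reports no more results.

    fetch_page is a hypothetical callable: fetch_page(page) -> decoded JSON dict.
    max_pages caps the number of requests so the daily quota is not used up.
    """
    items = []
    for page in range(1, max_pages + 1):
        data = fetch_page(page)
        items.extend(data.get("items", []))
        if not data.get("has_more"):
            break  # the API says this was the last page
    return items
```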
8:45
So this is the actual script, guys.
8:48
Thank you very much for watching this
8:50
video, and also check out my website,
8:53
freemediatools.com,
8:55
which contains thousands of tools.