Python 3 Beautifulsoup4 & Pandas Amazon API Product Info Web Scraping Script & Save it in Excel File
Jun 1, 2025
Get the full source code of application here: https://codingshiksha.com/python/python-3-beautifulsoup4-pandas-amazon-api-product-info-web-scraping-script-save-it-in-excel-file/
0:00
hello guys welcome to this video so
0:02
in this video I will show you a web
0:05
scraping script in Python by
0:07
which we can actually scrape the data
0:10
from Amazon products and save it inside
0:13
an Excel file so this is
0:16
actually the web scraping Python
0:19
script that I developed using the Beautiful
0:21
Soup library and the pandas library and here
0:24
you just need to create a file where you
0:28
will be storing all the products one
0:31
product per line so you need to go to
0:33
amazon.com so let me show you the
0:36
process what you need to do before
0:37
running the script go to amazon website
0:40
just search for your favorite
0:43
product copy the address of the product
0:45
here simply copy this
0:47
link go to the address bar simply copy
0:50
the overall link here of the product and
0:53
simply paste it so we create this file.txt
0:56
with one product URL per line now come to the
1:00
second line and again just
1:06
let's suppose I take this product for
1:09
example again copy the address in this
1:12
way you will
1:15
actually make a list of all the products
1:17
that you want to track the information
1:20
of the prices everything
1:27
so again let me copy this link
1:29
address paste
1:31
it so it's a very simple Python script I
1:34
have given this script guys in the
1:36
description of the
1:37
video you can copy this before running
1:40
this make sure that you make this file
1:43
here file.txt in the same folder where you
1:46
paste a series of products per
1:49
line and now just run this script here
1:52
it will actually store all the
1:54
information inside the Excel file in a
1:57
tabular structure all the information
1:59
will be scraped here from Amazon.com
2:01
you'll see that the price the rating the
2:05
reviews availability all these details
2:09
will be scraped the title of the product
2:12
as well after that we will store all
2:15
this information inside this Excel
2:18
file if you see if I try to open this
2:21
you will see all your four
2:25
products have been successfully saved
2:28
here we have got these columns rating
2:32
reviews availability and this is the
2:36
URL so all this information has been
2:39
successfully scraped from
2:41
amazon.com this is actually you will see
2:43
the price the
2:45
rating availability and this is your
2:49
URL so now let me show you how to build
2:52
this so first of all you do need to
2:55
install some packages
2:59
so the first package you will need is
3:02
your
3:06
beautiful
3:07
soup package which is very much
3:12
required it says web scraping library
3:15
the command is simple you simply install
3:17
this to
3:19
execute and the second one is the data
3:22
analysis library which is pandas
3:26
you also need to install this as well so
3:28
the command is simple simply install
3:31
this after that simply create an app.py
3:35
file and now just copy the code that I'm
3:38
doing it
3:40
so first of all we need to import the
3:43
necessary
3:44
packages the requests module and then
3:48
from bs4 the BeautifulSoup class we need to
3:52
import and then the pandas
3:56
library and then we also need to import
3:59
the time
4:00
module after importing all the modules
4:03
now we will simply define a custom
4:10
function so inside our main function
4:13
first of all we will
4:14
simply inside our try you can open this
4:18
file here which is your
4:21
file.txt inside the read mode and here
4:24
you'll be reading this
4:26
file where all your products are
4:30
mentioned line by
4:32
line so here we will be getting all
4:35
these
4:40
products so line.strip()
4:45
like this and here we'll
4:50
be reading each
4:58
line except FileNotFoundError
5:02
so here what we are doing we are
5:05
reading this file here that we created
5:07
file.txt on each line
5:10
number we have a specific product here
5:13
so it is reading line by line and
5:16
storing it inside your URLs
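
The file-reading step described above can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the function name `read_urls` and the choice to skip blank lines are assumptions. Only `file.txt`, `line.strip()`, and the `FileNotFoundError` handling come from the walkthrough.

```python
def read_urls(path="file.txt"):
    """Return the product URLs listed one per line in `path`."""
    try:
        with open(path, "r", encoding="utf-8") as f:
            # strip() removes the trailing newline and stray spaces;
            # blank lines are skipped (an assumption, not stated in the video)
            return [line.strip() for line in f if line.strip()]
    except FileNotFoundError:
        print(f"{path} was not found; create it with one product URL per line.")
        return []
```

Calling `read_urls()` with no argument looks for `file.txt` in the current working directory, which is why the file has to sit next to the script when you run it from that folder.
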
5:20
so after that we declare a variable of
5:24
data which will be an empty list and
5:27
then we are reading these URLs one by one
5:31
and again inside your try except we
5:36
simply first of all we define this
5:39
function here which will extract the
5:42
information so we define this function
5:44
which will be
5:46
extract product info and we pass this
5:49
URL so now we just need to define this
5:53
which is
5:57
extract product
6:00
info and we receive this URL as an
6:03
argument and right inside this function
6:07
we actually scrape this information and
6:10
for scraping this information we first
6:11
of all need to provide the necessary
6:14
headers which will be responsible
6:17
because if you are doing web scraping
6:20
then these headers are required so we
6:23
simply provide the User-Agent
6:25
and Accept-Language so after we
6:29
provide these headers now it becomes
6:31
easy to do web scraping so we use the
6:35
requests module to call get
6:40
like this pass the headers and your timeout
6:45
you provide 10
6:48
seconds the timeout is necessary you do
6:51
need to provide the
6:53
timeout and then response we call this
6:57
raise_for_status
7:00
if any sort of exception takes
7:03
place then we simply say that the
7:05
request URL is not correct after that we
7:11
actually pass this HTML that we scraped
7:14
to beautiful soup which is a web
7:16
scraping library so whatever is the
7:18
response coming
7:20
response.content and we'll be parsing it
7:24
using this parser lxml
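
The request and parsing steps just described might look something like this. It is a hedged sketch: the function name `fetch_soup`, the header values, and the `parser` parameter are assumptions added for illustration. The video only names the two header fields, the 10 second timeout, `raise_for_status`, `response.content`, and the `lxml` parser.

```python
import requests
from bs4 import BeautifulSoup

# Illustrative header values (assumptions); the video only names the two fields.
HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Accept-Language": "en-US,en;q=0.9",
}

def fetch_soup(url, parser="lxml"):
    """Download `url` and return a parsed BeautifulSoup tree, or None on failure."""
    try:
        response = requests.get(url, headers=HEADERS, timeout=10)
        response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    except requests.RequestException as exc:
        print(f"Request failed for {url}: {exc}")
        return None
    # The "lxml" parser needs the separate lxml package to be installed
    return BeautifulSoup(response.content, parser)
```

Without a `timeout`, `requests.get` can wait indefinitely on a server that never answers, so one stalled product page would hold up the whole run.
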
7:27
after that we extract the
7:30
title using this very
7:35
simply so here we are doing some web
7:38
scraping right here to extract the
7:40
relevant information so we extract the
7:44
title similarly we
7:46
also will extract other
7:55
information like this all the source
7:58
code is given in the description so here
7:59
we are extracting the title the
8:05
price and then we are also extracting
8:08
the rating as well if you
8:12
see so this is the overall function here
8:16
let me just paste it it is a little bit
8:19
complicated this is the main function
8:21
which is actually extracting all this
8:24
information using the concept of web
8:27
scraping let me just paste it lastly it
8:30
is returning this dictionary which will
8:34
contain these properties here the title
8:37
of the product price rating reviews
8:40
availability and URL so after that
8:44
inside this function right here where we
8:46
are getting this information now we just
8:49
need to save this information in
8:54
the Excel file so we simply first of all
8:59
append this in the data
9:03
variable and we sleep for 2 seconds
9:08
before going to the next URL
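
The extraction function and the collection loop described above could be sketched like this. Treat it as an illustration under assumptions: the CSS selectors are guesses at Amazon's markup and may not match the live page, and the names `text_of`, `extract_fields`, and `collect`, the dictionary keys, and the `"N/A"` fallback are hypothetical. The six fields, the append to `data`, and the 2 second sleep come from the walkthrough.

```python
import time
from bs4 import BeautifulSoup

def text_of(soup, selector):
    """Return the stripped text of the first match, or "N/A" if nothing matches."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node else "N/A"

def extract_fields(soup, url):
    """Build one row of product data from a parsed page."""
    # The selectors below are assumptions about Amazon's markup and may change.
    return {
        "Title": text_of(soup, "#productTitle"),
        "Price": text_of(soup, "span.a-offscreen"),
        "Rating": text_of(soup, "span.a-icon-alt"),
        "Reviews": text_of(soup, "#acrCustomerReviewText"),
        "Availability": text_of(soup, "#availability span"),
        "URL": url,
    }

def collect(urls, extract, delay=2):
    """Call extract(url) for each URL, pausing `delay` seconds between products."""
    data = []
    for url in urls:
        try:
            info = extract(url)
        except Exception as exc:  # keep going if one product fails
            print(f"Skipping {url}: {exc}")
            continue
        if info:
            data.append(info)
        time.sleep(delay)  # small pause before moving to the next URL
    return data
```

Passing the extraction function in as `extract` keeps the loop independent of the network, so it can be tried out with a stand-in function before pointing it at real product pages.
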
9:12
and then we simply export this data to an
9:15
Excel file by calling the pandas library
9:20
DataFrame
9:22
and we call this function
9:32
to_excel we provide the file name and then
9:34
the second argument is your index
9:43
and then we simply say that your data
9:45
has
9:48
been
9:51
exported to Excel file that's
9:54
all then we'll
9:58
simply provide this in the else block no
10:01
data was extracted that's all that is
10:05
needed here
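
The export step described above can be sketched as follows. This is a minimal sketch under assumptions: the file name `products.xlsx`, the column names, the helper names `build_frame` and `export_to_excel`, and the value `index=False` are not stated in the video, which only says that a file name and an index argument are passed to `to_excel`.

```python
import pandas as pd

# Column names are assumptions chosen to match the fields listed in the video.
COLUMNS = ["Title", "Price", "Rating", "Reviews", "Availability", "URL"]

def build_frame(data):
    """Turn the list of product dictionaries into a table with a fixed column order."""
    return pd.DataFrame(data, columns=COLUMNS)

def export_to_excel(data, filename="products.xlsx"):
    """Write the rows to an Excel file; return True if a file was written."""
    if data:
        # index=False leaves out the pandas row-number column (assumed setting);
        # writing .xlsx files requires the openpyxl package
        build_frame(data).to_excel(filename, index=False)
        print(f"Data has been exported to {filename}")
        return True
    else:
        print("No data was extracted")
        return False
```
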
10:08
now we just need to call this main
10:11
function so we simply say
10:19
here call this main function that's it so
10:24
now this first of all this main function
10:26
would be called this will read the file
10:29
here file.txt for each product it will
10:32
call this function extract product info
10:35
then we are appending this data to this
10:37
data variable and storing it inside
10:40
Excel file so inside this function it is
10:42
reading the actual product information
10:44
scraping
10:46
using Beautiful Soup we are
10:50
extracting the relevant information and
10:52
lastly storing it inside this
10:55
object and returning it from this
10:57
function so let me delete this file here
11:00
once again if I repeat this
11:03
process you will see one by one it will
11:07
scan all this it is saying on line
11:14
number so you can go to the description
11:16
guys it's a little bit complicated
11:21
but I have given the link in the
11:23
description simply
11:31
go and then it will
11:35
actually do this for
11:38
you you just need to
11:43
paste the actual URLs and then simply
11:46
run
11:50
this and then your data will be saved so
11:54
thank you very much guys for watching
11:55
this
11:57
video please hit that like button
11:59
subscribe to the channel and also check out
12:01
my website
12:04
freemediatools.com I think the internet
12:06
connection was gone
12:09
and also check out my website
12:13
freemediatools.com and I will be seeing
12:15
you in the next
