
Python 3 Streamlit Project to Export All PDF Pages to PNG Images Using pdf2image Library in Browser
Jan 9, 2025
Get the full source code of application here:
https://gist.github.com/gauti123456/d5b91b6ecea39299ee88a453bca52b45
0:00
Hello friends, welcome to this video.
0:02
In this video we will look at how to
0:04
build a Python web application which
0:06
allows you to export PDF document pages
0:09
into PNG images. You'll be using
0:12
Streamlit, which is an open-source
0:15
way of building web applications
0:17
fast in Python.
0:21
So I have this file right here, which is
0:23
PDF to images, and it has automatically
0:26
opened this application on port number
0:29
8501. So if I go to localhost
0:33
8501, this is your application, Convert
0:36
PDF to Images. Here you can select
0:39
the PDF file that you
0:42
want
0:44
to convert. You will see that as soon as I select one,
0:48
each PDF
0:50
page has a button right below it. You
0:54
can export each individual page into an
0:58
image. So there is a button right here,
1:00
Download Page 1 as Image, Download Page
1:03
2 as Image. If I click this button,
1:06
my page one will be converted into a JPG
1:09
image file.
1:12
So now you can see your PDF page has
1:15
been exported to an image file. In this
1:18
way you can export any individual
1:21
pages of the PDF file to individual
1:24
images very easily. So this is the application
1:27
that I want to develop in this video.
1:30
For doing this we are
1:32
using the pdf2image
1:34
Python package. pdf2image is
1:39
a free, open-source Python package which
1:42
allows you to convert your PDF documents
1:44
into
1:45
images. So this is the module, and you need
1:48
to simply install it using the pip
1:51
command: pip install pdf2image.
1:54
And we are also using
1:57
Streamlit, which is a faster way
2:00
to build web applications in
2:03
Python. It's completely free and open
2:06
source.
2:08
So now, to get started, just create a
2:12
simple Python file, and then we'll be
2:14
importing the streamlit package at the
2:17
very top. Then we'll be importing
2:19
pdf2image, and from this we will
2:23
import this function, which is convert_from_path.
2:25
And from io we will
2:29
import BytesIO, which is part of a built-in
2:32
module. We'll also be using the
2:34
Pillow library as well for downloading
2:37
the images. Pillow, if you don't know,
2:42
is a Python
2:44
package
2:46
for image processing. You can
2:49
install it using the command pip install
2:53
pillow, p-i-l-l-o-w.
2:57
And then we need to import the
3:00
tempfile module, which is a
3:02
built-in module in Python for creating
3:04
temporary
3:06
files. Here we'll be giving the title
3:09
to the app, which is Convert PDF to
3:13
Images. You start this
3:16
application with streamlit
3:18
run followed by the name of the
3:22
script. It supports auto refresh,
3:25
so whenever you make changes it will
3:27
automatically refresh, and you will see
3:29
the title.
3:30
Then we will write this short little
3:33
description using the write method, which
3:35
is "this app will
3:38
convert...", so "upload a PDF file". Then
3:43
Streamlit has a pre-built
3:47
drag-and-drop component where we can
3:50
actually allow the user
3:51
to upload files: it has this file_uploader method.
3:55
Here you'll simply
3:57
say "Choose a PDF file". The second
4:01
argument is the type of files
4:04
that we will be allowing the users to upload; we
4:07
will be allowing PDF files only. So
4:10
with a single line of code you will
4:11
see this drag-and-drop functionality
4:14
pre-built, so we don't need to make this
4:17
from scratch. Streamlit offers these
4:20
pre-built
4:22
components. And now we'll be having this
4:24
condition where we'll simply be
4:27
checking whether a file has been
4:30
uploaded; if so, this if condition will
4:31
evaluate to true. Then we'll be having
4:34
this try/except
4:38
block, and we are having this
4:53
exception: if any sort of exception takes
4:55
place, we can show this exception.
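The upload form and error handling described above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the author's exact code: the function name `upload_and_process`, the `process_pdf` callback, the description text and the error message are all made up for illustration, and the Streamlit module is passed in as `st` only so that the function can be tried without a running app. Python spells the "try catch" block as `try`/`except`.

```python
def upload_and_process(st, process_pdf):
    """Draw the upload form and hand the selected file to `process_pdf`.

    `st` is the imported streamlit module (`import streamlit as st`).
    `process_pdf` is a hypothetical callback that does the conversion.
    """
    st.title("Convert PDF to Images")
    # Assumed wording; the video does not spell out the full description.
    st.write("Upload a PDF file to convert its pages to images.")

    # The `type` argument restricts the drag-and-drop widget to PDF files.
    uploaded_pdf = st.file_uploader("Choose a PDF file", type=["pdf"])

    # file_uploader returns None until the user has selected a file.
    if uploaded_pdf is not None:
        try:
            process_pdf(uploaded_pdf)
        except Exception as exc:
            # Show the problem on the page instead of crashing the app.
            st.error(f"An error occurred: {exc}")
    return uploaded_pdf
```

In a script started with `streamlit run`, this would be called once at the top level, after `import streamlit as st`, with whatever conversion function the app defines.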
5:01
So inside this try block we will write
5:04
the
5:05
code. First of all we need to save
5:09
this PDF as a temporary file. For doing
5:12
this we will use the temporary file
5:15
module, tempfile, and it contains a
5:17
function which is NamedTemporaryFile.
5:23
Here it takes the argument
5:26
delete set to False. This means that
5:30
the file will not be deleted automatically when it is closed,
5:31
so it stays in the temporary location for processing.
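A minimal sketch of this temporary-file step, using only the standard library. The helper name `save_to_temp_pdf` and the placeholder bytes are assumptions for illustration; in the app the bytes would come from reading the uploaded file. With `delete=False` the file stays on disk after it is closed, so removing it afterwards is up to the caller.

```python
import os
import tempfile


def save_to_temp_pdf(pdf_bytes: bytes) -> str:
    """Write the PDF bytes to a temporary .pdf file and return its path."""
    # delete=False keeps the file on disk after the `with` block closes it,
    # so a later step can open it again by path.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pdf") as temp_pdf:
        temp_pdf.write(pdf_bytes)
        return temp_pdf.name


# Placeholder bytes stand in for the contents of a real upload.
path = save_to_temp_pdf(b"%PDF-1.4 placeholder bytes")
print(path, os.path.exists(path))
os.remove(path)  # with delete=False, cleanup is the caller's job
```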
5:34
So delete is set to False, and we need to
5:36
provide the suffix, which will be that of a
5:38
PDF document, so
5:41
".pdf", as the temporary
5:44
file. Inside
5:48
this we need to write to this temporary
5:52
file, so here you need to provide the
5:55
actual file: the uploaded PDF, which we need to
5:57
read, so we'll be using the read method.
6:01
So first of all, whatever file is
6:04
selected by the user, we will read that
6:06
file using the read function, and then
6:07
we'll be storing it as a
6:10
temporary file using the write
6:13
function. And then we will get the path:
6:19
using this temporary PDF, we'll get the
6:22
name of the file by using the name
6:26
property. So after this we need to
6:29
convert
6:30
this PDF to
6:32
images. For doing this we'll create a
6:34
new variable, pdf_images,
6:36
and then we'll be using this
6:38
function, convert_from_path, which is
6:40
coming from this open-source package, pdf2image.
6:43
This function actually takes
6:47
the path of the file, so you'll pass the temporary PDF
6:51
file path, and then the second
6:53
argument is the DPI value. You can set
6:56
this to
6:57
300, or you can change it as well. This
7:00
simply controls the resolution of the image
7:02
being rendered, and a DPI of 300 is more
7:05
than
7:06
enough. And then we need to display these
7:09
images for each page, so we'll be using
7:11
a for loop: for i, page in
7:17
enumerate(pdf_pages). enumerate will
7:21
actually loop through each of the PDF pages
7:24
present in this variable, pdf_images.
7:27
It's a list of images, and we
7:29
are just looping through each image. So what
7:32
we'll do is call st.image: we'll show the image
7:36
for each page, and we'll give it a
7:38
caption right here, which will be
7:43
"Page" followed by i + 1, so this will
7:47
be Page 1, Page 2, Page 3.
7:51
Essentially we are looping through. And
7:53
then the second property we need to give
7:55
is use_column_width,
8:00
set to
8:02
True. This means that the image will take
8:05
the full
8:07
width of the column. That's all that we need to do, so
8:09
you will see each image will show. So
8:12
if you now select your PDF
8:18
file... it is saying that pdf_pages
8:21
is not
8:27
defined. Oh sorry, the variable is called pdf_images.
8:30
I
8:34
think I'll just rename it to pdf_pages,
8:37
because these are PDF
8:38
pages, not PDF images. So pdf_pages.
8:50
So now you'll see two images are
8:54
showing right here. This is the
8:56
first page of the
8:58
PDF document and this is the second page: Page
9:02
1, Page 2. And now what we need to do is
9:04
show a button so that we can
9:06
export each page as an image file.
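The conversion and preview steps walked through so far could look roughly like this. It is a sketch under assumptions rather than the code from the video: the function names `pdf_to_page_images` and `show_pages` are invented here, and `st` stands for the imported Streamlit module. pdf2image relies on the Poppler utilities being installed on the system. The `use_column_width` flag is the one named in the video; newer Streamlit releases may flag it as deprecated in favour of `use_container_width`, so check the version you have installed.

```python
def pdf_to_page_images(pdf_path, dpi=300):
    """Render every page of the PDF at `pdf_path` as a Pillow image."""
    # Imported here so the rest of the sketch loads without pdf2image;
    # pdf2image itself needs the Poppler utilities on the system.
    from pdf2image import convert_from_path

    return convert_from_path(pdf_path, dpi=dpi)


def show_pages(st, pdf_pages):
    """Display each page image with a 1-based caption."""
    for i, page in enumerate(pdf_pages):
        # enumerate counts from 0, so add 1 for "Page 1", "Page 2", ...
        st.image(page, caption=f"Page {i + 1}", use_column_width=True)
```

Using one name, `pdf_pages`, for both the conversion result and the loop avoids the "not defined" error hit in the video.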
9:11
So for doing this, after this for
9:14
loop, we need to
9:19
create a variable image_buffer, which is a BytesIO.
9:24
Then we will call the save method on
9:27
the page object, passing the
9:29
image buffer and then the format of
9:33
the image. You can provide either PNG
9:35
or JPEG; I will provide
9:37
JPEG. And then we need to rewind this to the
9:40
start of the buffer, so
9:42
image_buffer.seek(0).
9:47
And then we need to also show a
9:50
download button for each image. You'll
9:52
be showing it using the download_button
9:54
function in Streamlit, and it takes
9:56
four arguments. First, it
9:59
takes the label of the button, so the
10:02
label will simply be "Download
10:08
Page", then i + 1,
10:12
then "as
10:15
Image". The second one will be the actual
10:18
data that you need to download. We
10:20
need to download the actual image, so you
10:22
will pass image_buffer. The file name is
10:26
the third argument, which will be
10:31
the actual
10:34
page. All the source code will be given
10:36
in the description, so you can check it
10:42
out. And the fourth one is the MIME type,
10:45
so mime is equal
10:47
to, for JPG images, image/jpeg.
10:52
So it takes four arguments, as you can
10:56
see: we have the label, then we have the
10:59
data, the file name and the MIME
11:12
type. And just make sure that you
11:14
write all this code inside the for loop. So
11:17
let me move this code into the for
11:21
loop. So just after this, all of this
11:24
needs to be present in the actual for
11:27
loop.
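Putting the buffer and the download button inside the loop might look like the sketch below. The names `page_to_jpeg_buffer` and `show_pages_with_download` and the `page_{n}.jpg` file naming are assumptions, since the video does not spell out the file name value; `st` is again the imported Streamlit module. Pillow names the format "JPEG", and the registered MIME type for it is `image/jpeg`.

```python
from io import BytesIO


def page_to_jpeg_buffer(page):
    """Save a Pillow image into an in-memory JPEG and rewind the buffer."""
    image_buffer = BytesIO()
    page.save(image_buffer, format="JPEG")  # Pillow's name for JPG output
    image_buffer.seek(0)  # rewind so the download starts at the first byte
    return image_buffer


def show_pages_with_download(st, pdf_pages):
    """Show each page followed by its own download button."""
    for i, page in enumerate(pdf_pages):
        st.image(page, caption=f"Page {i + 1}", use_column_width=True)
        st.download_button(
            label=f"Download Page {i + 1} as Image",
            data=page_to_jpeg_buffer(page),
            file_name=f"page_{i + 1}.jpg",  # assumed naming scheme
            mime="image/jpeg",
        )
```

The keyword is `file_name`, with an underscore. Because each label contains the page number, every button is distinct.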
11:29
We need to do this for every page,
11:32
which is why for every page we need to
11:33
show a download button. So you will just
11:35
need to move all this code into this for
11:37
loop, as you can see. So for each page
11:40
we are showing a download button to
11:42
export that page as an image file,
11:44
which takes four arguments: first is the
11:46
label, second is the data, then the file name and the
11:49
MIME type. So if you now refresh your
11:53
application and
11:55
select a PDF
11:57
document, you will see
12:01
that we get an unexpected argument error for the
12:05
file name, so this needs to be
12:07
file_name. Every time I
12:11
make this mistake. The parameter is actually
12:14
file_name.
12:17
So if you now select the PDF
12:20
document, for each page there will be
12:23
this download button appearing at the
12:25
bottom. So you can export this page as a
12:27
JPG image, as you can see. The
12:30
quality looks tremendous.
12:33
This is the second page. So in this way
12:36
you can take any PDF document, and for
12:38
each page it will show an export button
12:41
at the bottom, so you can export the
12:43
page as a JPG image file. So
12:47
this is the application. If you need
12:49
the full source code, the link is given
12:50
in the description. Thank you very
12:53
much for watching this video, and do hit
12:56
the like button, subscribe to the channel,
12:58
and do check out my website as well,
13:00
freemediatools.com, which contains
13:02
thousands of free tools related to audio,
13:04
video and image. I will be seeing you
13:07
in the next video.
