PHP Script to Extract Text From PDF Document in Textarea and Download it as Text File in Browser
Dec 21, 2025
Buy the full source code of the application here:
https://procodestore.com/index.php/product/php-script-to-extract-text-from-pdf-document-in-textarea-and-download-it-as-text-file-in-browser/
0:04
Hello friends. Today in this tutorial
0:08
I will basically show you how we can
0:11
code a PHP script which will allow us to
0:16
extract all the text which is present
0:19
inside the PDF document. So this is the
0:22
live demo of the application, guys. We
0:25
have a choose file button and as I
0:28
basically click this button I will be
0:31
able to select any PDF document. So
0:33
let's suppose I select this PDF document
0:36
you will see a sample text is present
0:38
inside this PDF document. So now if I
0:41
want to extract this text, so what I
0:43
will do is that first of all I will
0:45
select this file.
0:48
And now, guys, if I click the
0:51
Extract Text button, you will see that in
0:54
a Bootstrap modal window all the text
0:57
from the PDF document is extracted into
1:00
the text area.
1:03
And now we have a download text button
1:06
here. If you want to download all the
1:08
text as a .txt file, click this
1:11
button and a .txt file will be downloaded.
1:14
Open it and
1:16
you will see that all the text is
1:19
there.
1:22
So this is a really useful tool that we
1:25
will be building in this tutorial, guys.
1:29
This live stream walks through how
1:31
to build this application. It's a
1:33
fully fledged application where you
1:35
select a PDF document,
1:39
extract the text, and now you can see that
1:41
the text is extracted. The
1:44
PDF file can contain multiple pages
1:47
and it will extract all the text which
1:49
is present.
1:54
If I basically open this file,
1:57
you will see that a lot of text is
1:59
present right here in this file.
2:06
So we will try to build this
2:09
application, guys, and see how we can extract text
2:12
from a PDF document. The link is given in
2:14
the description of the video. If you
2:16
want the full source code you can
2:17
directly purchase the full source code
2:20
and after purchasing the source code
2:22
from stripe.com you will get this
2:24
directory structure. You will see
2:27
that all the dependencies will be
2:29
included. So let's start building this
2:31
application. The very first thing, guys, we
2:33
will do is create a
2:37
file, index.html.
2:42
So here guys we will have a simple form.
2:46
We will include Bootstrap for
2:48
this. We need Bootstrap
2:51
for the modal window that you are seeing
2:53
right here.
2:55
We will also need
2:57
jQuery.
3:00
We add a container
3:02
and then an element with the text-center
3:05
class that says PDF Text Extractor.
3:17
So after that, guys, we will
3:19
have a form which submits to
3:24
pdf2text.php. This is a PHP script
3:28
that we will write. The method is POST and
3:31
the encoding type (enctype)
3:33
will be multipart/form-data. So here
3:36
guys, we will allow the user to select a
3:39
PDF file.
3:42
We will have
3:45
a label.
3:47
We will say select PDF file.
3:52
Then we will have an input of type
3:55
file, and we will give it the Bootstrap
3:58
class form-control. It is
4:01
required, and we will also give it a name
4:04
of PDF file. We will only be accepting
4:08
.pdf files. That's all. And then
4:11
we will have a <br> tag. And after that
4:14
we will have a button which will be of
4:18
type submit. And here we will be giving
4:20
it the classes btn btn-primary,
4:24
and its label will be Extract
4:28
Text.
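The form described above can be sketched as a small PHP helper that returns the markup. This is only an illustration: the action file name pdf2text.php is read off the spoken transcript, and the field name pdf_file and the h1 heading are assumptions, not confirmed by the video.

```php
<?php
// Sketch only: the action file name and the field name "pdf_file" are assumptions.
function render_upload_form(string $action = 'pdf2text.php', string $field = 'pdf_file'): string
{
    // Escape the values so they are safe inside HTML attributes.
    $action = htmlspecialchars($action, ENT_QUOTES);
    $field = htmlspecialchars($field, ENT_QUOTES);

    return <<<HTML
<div class="container">
  <h1 class="text-center">PDF Text Extractor</h1>
  <form action="{$action}" method="post" enctype="multipart/form-data">
    <label for="{$field}">Select PDF file</label>
    <input type="file" class="form-control" id="{$field}" name="{$field}" accept=".pdf" required>
    <br>
    <button type="submit" class="btn btn-primary">Extract Text</button>
  </form>
</div>
HTML;
}

// Print the form where the page body should appear.
echo render_upload_form();
```

Without enctype set to multipart/form-data the browser would not send the file contents, so the PHP script would find nothing in $_FILES.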
4:32
So that's all, guys. Now refresh
4:35
the application.
4:39
Just rename this file to pdf2text.php.
4:48
So now you will see we have this
4:49
interface PDF text extractor. We have
4:52
this button as well. So now we need to
4:55
write the PHP script, guys. Once
4:59
the user submits the form, this PHP
5:02
script will be invoked. So now we need to
5:05
first of all upload the file. For
5:07
uploading the file, guys, we need to
5:09
create an uploads directory here. So just
5:12
create an uploads directory. It is needed
5:15
for the upload, and now we will write the PHP code.
5:19
So first of all we will require the
5:21
library guys. We are using the library
5:24
that I have written. So you will need
5:26
this library. So after you purchase the
5:29
source code, you will get this library
5:30
which will automatically extract the
5:32
text from the PDF document and show it
5:35
in the modal window.
5:37
So for that you need to purchase the
5:39
source code and then after you include
5:42
the library we will simply check if the
5:45
request method is equal to POST
5:50
and also
5:52
whether the file is set or not in
5:55
$_FILES, under the key
5:57
PDF file.
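A minimal sketch of that guard follows. The helper name has_pdf_upload and the field name pdf_file are hypothetical; the key must match whatever name attribute the file input uses.

```php
<?php
// Returns true when the request is a POST that carries the expected upload field.
// The field name "pdf_file" is an assumption; use your input's name attribute.
function has_pdf_upload(array $server, array $files, string $field = 'pdf_file'): bool
{
    $isPost = ($server['REQUEST_METHOD'] ?? '') === 'POST';
    return $isPost && isset($files[$field]);
}

// In the real script you would pass the superglobals.
if (has_pdf_upload($_SERVER, $_FILES)) {
    // Continue with validation and text extraction here.
}
```

Taking the arrays as parameters instead of reading the superglobals directly keeps the check easy to try out from the command line.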
6:01
So here, guys, we are checking whether the
6:03
method is POST and also whether
6:05
the file is present or not. If the file
6:08
is present, in that case we will get
6:11
the file. It is present
6:14
inside $_FILES,
6:16
under whatever name attribute you have
6:18
given inside index.html. We have
6:20
given this name attribute, that's why we
6:22
are using this. After that, guys, we will
6:25
work out the file extension. We
6:29
can get the file extension
6:31
using strtolower() around
6:34
pathinfo(). We pass the PDF file's
6:39
name, like this,
6:42
and then we will put a comma here and
6:45
then we will use this constant, which is
6:47
PATHINFO_EXTENSION.
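As a small sketch of that call (the helper name upload_extension is hypothetical), the extension is taken from the original file name and lowercased so that .PDF and .pdf are treated alike:

```php
<?php
// Returns the lowercased extension of a file name, or an empty string if there is none.
function upload_extension(string $originalName): string
{
    return strtolower(pathinfo($originalName, PATHINFO_EXTENSION));
}

// Typical use with an upload entry (the key "pdf_file" is an assumption):
// $extension = upload_extension($_FILES['pdf_file']['name']);
```

Keep in mind that the name is supplied by the browser, so the extension alone does not prove the file really is a PDF.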
7:00
Just copy this line.
7:05
So you can see that.
7:11
Here we are getting the file
7:13
extension, guys. After we get the file
7:15
extension, we will check it
7:18
using in_array(). in_array()
7:21
is a built-in function
7:22
in PHP which checks whether a value is in
7:26
an array, here the allowed extensions. The allowed
7:30
extensions
7:33
will be only pdf. So here we are
7:36
doing form validation on the
7:38
server side. We are only allowing
7:39
files which are PDFs. That's why we
7:43
have this if condition. It will
7:46
return true if the file is a PDF
7:50
file, and then we will get the path of
7:52
the file using the PDF file's temporary path.
7:56
We get the temporary path, guys, like
7:58
this.
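The allow-list check and the temporary path lookup can be sketched together. The function name allowed_upload_path is hypothetical, and the array passed in is assumed to be one entry of $_FILES with its standard name and tmp_name keys.

```php
<?php
// Returns the temporary path of the upload when its extension is allowed, else null.
function allowed_upload_path(array $file, array $allowedExtensions = ['pdf']): ?string
{
    $extension = strtolower(pathinfo($file['name'] ?? '', PATHINFO_EXTENSION));

    // The third argument makes in_array() compare strictly, without type juggling.
    if (!in_array($extension, $allowedExtensions, true)) {
        return null;
    }

    // PHP stores the uploaded bytes at this temporary location.
    return $file['tmp_name'] ?? null;
}
```

A null result means the script should stop and report that only PDF files are accepted.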
8:04
This is
8:06
the PDF file's tmp_name. After getting the
8:10
path, guys, it's very simple. We will
8:13
require the library once again,
8:15
vendor/autoload.php,
8:19
and after loading this we will simply
8:21
create the
8:23
PDF parser: we will initialize
8:26
the library that I have written, which is
8:29
Smalot. This is the syntax, guys, of the
8:33
library: Smalot\PdfParser
8:36
and then \Parser.
8:41
This
8:43
library contains a method which is
8:47
called parseFile().
8:52
So inside this you need to pass the path
8:54
of the uploaded file, and then
8:58
we will declare the text variable.
9:00
Currently this variable will be an empty string.
9:03
And now we will run a foreach
9:05
loop in PHP. And here we will be running
9:08
it for all the pages. So in order to get
9:10
all the pages in the PDF document we
9:13
will use this syntax, which is the PDF object's
9:17
getPages(). It returns the
9:19
pages, and we loop over every
9:21
page. We will
9:25
extract the text and append it with
9:27
.= and then we will say
9:29
the page's getText().
9:34
So this is really simple code, guys.
9:36
If you look closely at
9:38
what we are doing right here, we are
9:41
concatenating the text onto the text
9:43
variable, which is initially empty. And
9:45
here we are using the foreach loop. If
9:48
you see,
9:50
we are using this foreach loop, and
9:53
for every page in the PDF document we
9:55
are extracting the text with
9:58
really simple code.
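The loop can be sketched as follows. The class and method names (Smalot\PdfParser\Parser, parseFile(), getPages(), getText()) match the open-source smalot/pdfparser package, which can be installed with Composer. The two helper function names and the vendor/autoload.php location are assumptions made for illustration.

```php
<?php
// Concatenates the text of every page object that exposes getText().
function collect_page_text(iterable $pages): string
{
    $text = '';
    foreach ($pages as $page) {
        $text .= $page->getText();
    }
    return $text;
}

// Parses the PDF at $path and returns the text of all its pages.
// Assumes smalot/pdfparser is installed under vendor/ next to this script.
function extract_pdf_text(string $path): string
{
    require_once __DIR__ . '/vendor/autoload.php';

    $parser = new \Smalot\PdfParser\Parser();
    $pdf = $parser->parseFile($path);

    return collect_page_text($pdf->getPages());
}
```

Splitting the loop into its own function means the concatenation can be tried with simple stand-in page objects before any real PDF is involved.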
10:01
Also, in this
10:04
line of code we are initializing the
10:06
library.
10:07
This is the actual library.
10:12
So after doing this, guys, let's
10:15
check the result. If you just echo
10:18
out this text, guys, just
10:21
for clarity, and I cross-check:
10:24
if I select the PDF document and
10:27
click Extract Text, it is saying
10:29
undefined variable PDF file on line
10:31
number 10. Let me see.
10:37
Sorry, this needs to be PDF file.
10:44
Just select your file, click, and now,
10:47
guys, you will see that all the
10:49
text is extracted from the PDF
10:51
document. Now we need to show this text
10:54
in a Bootstrap modal window. For
10:58
that, guys, we now need
11:00
to echo out some dynamic JavaScript in
11:03
PHP. We need to mix JavaScript in PHP.
11:07
For doing this we will echo out some
11:10
statements here.
11:12
We will include the script tag. This
11:15
will be for including
11:18
jQuery and Bootstrap JS as well.
11:22
So these two echo statements are
11:23
required, guys. Simply write this. This
11:25
is actually the CDN of jQuery and
11:28
Bootstrap. We are just including it
11:32
using the echo statement in PHP.
11:35
So after doing this, guys, we will
11:38
have this element, which is a
11:41
script element which will
11:45
hold
11:48
the code for this.
11:51
It is slightly complicated code, so
11:54
let me just copy and paste it.
12:02
Copy, paste.
12:04
So what this code will
12:06
do, guys, is show the text in a modal, as you
12:10
will see.
12:13
Let me format this. We are writing some
12:17
jQuery code, guys. We are
12:21
encoding the text as JSON, which escapes the
12:23
special characters, and then we
12:25
are showing it in the modal window.
12:27
We also have a download button
12:29
which will download the text as
12:31
a .txt file. Then we are showing the
12:33
modal window right here. If I now
12:41
click.
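The pasted script is not shown in the transcript, so the following is only a sketch of the idea it describes: encode the text with json_encode() so it is a safe JavaScript string, put it in the textarea, show the modal, and offer a download. The element IDs (#extractedText, #textModal, #downloadBtn), the file name extracted.txt and the function name build_modal_script are all hypothetical. The .modal('show') call assumes Bootstrap's jQuery plugin is loaded.

```php
<?php
// Builds the inline script that fills the textarea, opens the modal and wires the download button.
// All element IDs used here are hypothetical.
function build_modal_script(string $text): string
{
    // The JSON_HEX_* flags escape <, >, &, ' and " so the text cannot close the <script> tag early.
    // JSON_INVALID_UTF8_SUBSTITUTE keeps json_encode() from failing on malformed bytes.
    $json = json_encode(
        $text,
        JSON_HEX_TAG | JSON_HEX_AMP | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_INVALID_UTF8_SUBSTITUTE
    );

    return <<<HTML
<script>
$(function () {
  var text = {$json};
  $('#extractedText').val(text);
  $('#textModal').modal('show');
  $('#downloadBtn').on('click', function () {
    var blob = new Blob([text], { type: 'text/plain' });
    var link = document.createElement('a');
    link.href = URL.createObjectURL(blob);
    link.download = 'extracted.txt';
    link.click();
  });
});
</script>
HTML;
}
```

Echoing the return value after the jQuery and Bootstrap script tags is enough; the download happens entirely in the browser, so no second request to the server is needed.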
12:44
So now guys nothing happens. Let me see
12:47
why.
12:57
Okay guys, we haven't added the modal
13:00
yet. So right after the form inside
13:03
index.html we need to add the
13:05
modal window as well.
13:08
So this is the code for adding the modal
13:10
window, guys. This is very simple
13:13
code.
13:15
So what I will do is
13:18
write this code. This is very
13:21
simple code, guys. You have seen modals
13:23
in Bootstrap. We have written this
13:25
modal very simply. We have the header.
13:28
We have the title. The title is
13:31
Extracted Text. The body will contain the
13:34
actual text of the PDF document. It
13:36
contains a text area, and the footer will
13:38
contain a download button. That's all.
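A sketch of such a modal, returned from a PHP helper so it can be echoed after the form, might look like this. The IDs (textModal, extractedText, downloadBtn), the rows value and the function name are hypothetical. Only Bootstrap's standard modal class names are used.

```php
<?php
// Returns the markup for the modal: header with title, body with a textarea, footer with a button.
// The IDs are hypothetical and must match whatever the page's script looks up.
function render_text_modal(): string
{
    return <<<HTML
<div class="modal" id="textModal" tabindex="-1">
  <div class="modal-dialog">
    <div class="modal-content">
      <div class="modal-header">
        <h5 class="modal-title">Extracted Text</h5>
      </div>
      <div class="modal-body">
        <textarea id="extractedText" class="form-control" rows="12"></textarea>
      </div>
      <div class="modal-footer">
        <button type="button" id="downloadBtn" class="btn btn-primary">Download Text</button>
      </div>
    </div>
  </div>
</div>
HTML;
}
```

The modal stays hidden until a script opens it, which is why nothing appears if the markup is missing from the page.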
13:44
Click
13:47
and once again there is some kind of
13:50
error.
13:51
Guys the link of the full source code is
13:54
given in the description. Sometimes in
13:56
live coding errors can take
13:59
place. This is the full file here. And let
14:02
me rename this file to index.php.
14:06
This is fully fledged code, guys. And
14:08
you can deploy this
14:10
application on the internet to
14:12
earn a lot of money. You will see that.
14:15
So now the text is showing right here.
14:17
And you can copy to clipboard or you
14:21
can download the text as a .txt file. So
14:23
you will see that. You can take any
14:26
PDF file, for example.
14:31
You can see that.
14:34
So this was the tool, guys, that we
14:36
developed in PHP. It's very easy to
14:38
develop. All the source code is given; you
14:40
can directly purchase the full source
14:42
code from stripe.com along with this
14:44
directory structure. Thank you very
14:47
much. Please hit that like button and
14:49
subscribe to the channel as well, and I will
14:52
be seeing you in the next tutorial. Until
14:54
then, thank you very
