Node.js Google Cloud Text to Speech API Converter and Downloads it as MP3 File Using Javascript

0:00
hello friends today in this tutorial
0:02
i will be showing you how to
0:03
convert
0:04
the text into mp3 file or speech using
0:08
node.js and we will be using the google
0:10
cloud text to speech api
0:13
in node.js
0:14
so simply i will be showing you
0:17
in this video so let's get started so
0:19
first of all i will go to my desktop
0:21
directory and inside my projects
0:23
directory i will open command line here
0:26
and i will make a new nodejs project for
0:29
that i will be creating a new directory
0:31
node text to
0:33
speech
0:34
after creating the directory i will move
0:36
into this directory node text to speech
0:40
so first of all in order to initialize
0:42
the nodejs project we will run
0:43
this command npm init -y so this
0:46
will create a basic package.json file
0:48
after that we can open this inside
0:50
visual studio code text editor
0:52
so you can see it has created this
0:55
package.json file inside the root
0:57
directory so it contains very basic
0:59
information about the project which is
1:01
name version description the main file
1:03
and the script section so now guys we
1:06
also need to install one
1:07
dependency which is
1:10
nodemon which will automatically restart
1:13
the application whenever we make any
1:15
sort of changes
1:16
so npm i nodemon
1:19
so it will take some time
1:21
and it will install this so you will see
1:23
it is successfully installed and now we
1:26
need to make our
1:28
entry-point file which is index.js
1:30
which is a main file
1:32
and now
1:34
you need to add a certain script here
1:36
which is start script you just need to
1:38
write here nodemon
1:41
index.js so basically we will just call
1:44
this script npm run start and this will
1:48
watch our application for any sort of
1:50
changes and it will automatically
1:52
restart it so now we just simply need to
1:54
run the application that is npm run
1:57
start
1:59
so you will see it will
2:01
automatically
2:03
watch for changes and it will
2:04
automatically restart the application we
2:06
do not have to manually restart the
2:08
application so now guys there is a
2:10
dependency out there in order to
2:12
interact with google cloud text to
2:14
speech api inside this
2:17
node.js there is a module so in order to
2:19
follow along with the video i have given
2:21
the link in the video description so
2:23
this is my step-by-step blog post so
2:25
where you will find all the source code
2:27
and also the step-by-step instructions
2:29
in order to build this application so
2:31
just go to the video description this is
2:33
my blog post so this is the
2:35
library guys so simply you need to copy
2:37
this and go to npmjs.com this is the
2:40
official node.js packages website so
2:43
simply search for this package and the
2:45
very first package which comes up
2:45
is the package we are talking about google
2:49
cloud text to speech
2:52
and if you click it you will see more
2:54
information about this package so you
2:55
will see 12,000 weekly downloads
2:58
so it's a very popular package
3:00
so now we simply need to
3:02
stop the application now and install
3:04
this npm i paste it
3:08
so like this so it will take some
3:10
time to download this package depending
3:12
upon your internet speed so now it has
3:14
successfully installed it so another
3:16
dependency guys we will use is dotenv
3:19
so basically in order to store
3:21
secure information we will
3:24
make use of this module which is npm i
3:27
dotenv that's it
3:29
so this is all about dependencies guys
3:31
so we can now again restart the
3:33
application
3:34
npm run start
3:36
so you will see that and now guys first
3:39
of all you need to have a valid
3:41
google cloud console account so if you
3:44
don't know about google
3:45
cloud just search google cloud console
3:48
here on google
3:49
and make sure that you link your
3:52
let me just see what is
3:55
you should first of all have a
3:57
valid
3:58
google cloud console account simply type
4:00
this and the very first link which comes
4:02
up simply click it for this you need to
4:05
verify your credit card
4:06
information just
4:08
if you're residing in india you can also
4:10
use a debit card so just go to
4:12
the billing section and add your card
4:15
after adding it just go to apis and
4:17
services and you need to go to library
4:20
and here we will search for the api
4:21
which is the text to speech
4:24
just search for this api
4:26
and
4:27
enter it
4:28
so there are two apis first is text to
4:30
speech and second is speech to text so
4:33
in this we are using this cloud text to
4:35
speech api so simply click this
4:37
and click on manage
4:40
you will see it will go to the api
4:41
section here so here you need to create
4:44
a service account credential so
4:46
just go to this fourth option which is
4:48
credentials so there are many ways by
4:51
which you can interact with google apis
4:53
first is through api key second is
4:55
through client id and third is the most
4:57
secure one which is service account so
4:59
we will be creating service account for
5:00
this
5:01
application so click on create
5:03
credentials
5:04
and then you need to choose here service
5:06
account
5:07
and here you can just name your service
5:10
account credential anything test service
5:12
account
5:14
six
5:14
and create and continue click it
5:18
and here you need to give a role i will
5:20
just choose basic and owner so this is a
5:24
default option so click owner here
5:26
that's it click continue and click on
5:28
done that's it so it will create your
5:30
service account credential so now this
5:33
is your sample text test service account
5:36
six so simply click this and now to
5:38
create a key here so simply click key
5:40
section here and click on add key create
5:43
a new key and there are two options here
5:46
either json or p12 json is the
5:48
recommended one click on create
5:50
it will download this as a
5:52
secure json file you should not share
5:55
this private key
5:56
which is
5:58
specific to your project so don't share
6:00
it with anyone it can be lost so secure
6:03
it in a secure location so
6:05
i will show the approach how to use this
6:07
inside the application so simply you
6:08
need to cut it and paste it inside your
6:10
working directory
6:12
so my working directory is node
6:15
let me move to the project no text to
6:17
speech this is the directory i think
6:20
sorry node text to speech
6:24
so this is a directory i will simply
6:26
paste it i will rename it as a
6:29
service account
6:31
dot json so this json file contains
6:34
secure information about your project so
6:37
don't share this file with anyone they
6:40
can misuse it and the same goes for your api keys
6:42
as well
6:43
i will not show you
6:45
for security reasons so
6:47
now guys first of all we will import
6:50
inside our index.js
6:54
i will make a new variable here text to
6:57
speech i will import the library that we
6:59
have installed here using the
7:01
require statement
7:02
so here we will provide the package name
7:05
@google-cloud/text-to-speech that's
7:07
it
7:08
now we also need to
7:10
invoke
7:11
or require dotenv so
7:14
as you know dotenv is a way by
7:17
which you can store
7:19
credentials such as passwords keys so we
7:21
need to require it
7:24
dotenv
7:25
and this contains a config method
7:28
so simply call this and now you can
7:30
store
7:31
environment variables inside your
7:32
environment files so this is very simple
7:35
simply create your env file here inside
7:38
the same directory .env and inside
7:40
this you need to define
7:43
uh
7:46
if you move here if you just see here
7:52
if you go to the blog post here you will
7:53
find the instructions but what you need
7:55
to do is just create a .env file
7:58
and here you need to paste this variable
8:00
that is GOOGLE_APPLICATION_CREDENTIALS
8:03
so
8:05
the variable name needs to be the same which is
8:07
google underscore application underscore
8:10
credentials
8:14
so if the variable name is different
8:16
then you will see an error so simply you
8:18
need to provide the path so my path is
8:20
inside the same directory
8:22
service account dot json that's it
8:25
if you have a different path then you
8:26
need to provide the full path the simplest
8:28
approach is to store the file in
8:30
the same directory that's it
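As a minimal sketch of the setup just described, the check below confirms that GOOGLE_APPLICATION_CREDENTIALS is set and points at a file that exists. `resolveCredentialsPath` is a hypothetical helper name, not part of dotenv or the Google library, and the `.env` line shown in the comment uses an assumed file name.

```javascript
const fs = require("fs");
const path = require("path");

// Assumed .env content (the file name is illustrative):
//   GOOGLE_APPLICATION_CREDENTIALS=serviceaccount.json

// Hypothetical helper: reads the variable from an env object (normally
// process.env after require("dotenv").config()) and returns an absolute
// path, or throws a readable error if something is missing.
function resolveCredentialsPath(env, baseDir = process.cwd()) {
  const value = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!value) {
    throw new Error("GOOGLE_APPLICATION_CREDENTIALS is not set");
  }
  const fullPath = path.resolve(baseDir, value);
  if (!fs.existsSync(fullPath)) {
    throw new Error(`Key file not found at ${fullPath}`);
  }
  return fullPath;
}
```

Calling `resolveCredentialsPath(process.env)` at startup would surface a wrong variable name or path before any request is made.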
8:34
after storing this guys we need to
8:37
send a request to the library so for that we
8:40
need to
8:41
include the file system
8:43
module as well in order to store the
8:46
mp3 file inside the local
8:49
file system this is a built-in module
8:51
and we also need to require util
8:55
which is again a built-in module
8:57
of node.js require util
9:00
so after requiring this
9:02
we will create a new client
9:04
const
9:08
so this will be new
9:10
and
9:12
text to speech that we have imported
9:14
text to speech dot
9:16
it contains a class which is
9:19
TextToSpeechClient
9:23
that's it so it will create a new text
9:25
to speech client reference here after
9:28
that guys we can declare a function here
9:31
which will convert
9:34
text to
9:36
mp3 so simply we can call this function
9:39
and now we need to write this function
9:41
also
9:42
at the top here so this will be an async
9:44
function guys async
9:46
put the async keyword here async function
9:49
convert
9:50
text to mp3
9:53
so
9:55
whenever we load the application we will
9:56
call this function so here we need to
9:58
define the text guys first of all which
10:01
text needs to be converted to mp3 so
10:04
here we can just provide john
10:07
latham is the
10:11
so basically this is a sentence that i
10:13
have written so which needs to be
10:15
converted to an english mp3 file
10:19
later on i will show how to change the
10:20
language as well to hindi or other
10:22
regional languages it supports
10:25
many languages around the world as
10:27
you know google cloud text to speech api
10:29
so i will show you that aspect also then
10:32
we need to define the request so
10:34
whichever request we will make here so
10:36
this will be an object this will hold
10:38
various properties first is the input
10:40
text so we will provide the input text
10:42
inside
10:44
this
10:45
text property text is equal to text
10:47
whichever text that we have defined
10:49
and after that we will put a comma
10:52
as this is an object then we will put a
10:54
voice property here so which voice that
10:57
you want here so first will be
10:59
language code
11:01
so basically from the name itself it
11:03
tells you which language you need
11:05
to use for
11:06
your
11:07
audio so i will put first of all english
11:10
so basically for every language there is
11:12
a two-letter code here en stands for
11:15
english then dash and the country code which
11:17
is us
11:19
so this is the first thing i am providing
11:22
and the second is the
11:25
gender so ssml
11:28
gender g is capital so basically here
11:31
it is saying which voice you want to
11:34
hear either male female or neutral so i
11:36
will put first of all neutral i will
11:38
show you all the voices as well
11:41
and lastly it has an audio
11:45
config also audio config
11:48
so inside this configuration you will
11:50
just
11:51
have a property which is audio encoding
11:54
and here you just need to put the
11:56
format of the sound file it
11:58
can be mp3 or wav as well but mp3 is
12:01
common and it has good quality so
12:04
after defining all these three
12:05
properties first is the input text then
12:07
is the voice
12:09
which contains language and gender then
12:11
there is the audio encoding so
12:14
after defining all these three
12:15
properties we can just
12:20
pass these properties
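The request object described above can be sketched as follows. The three properties (input text, voice with language code and gender, audio config with the encoding) come from the video; `buildRequest` and the sample sentence are hypothetical names and values chosen for illustration.

```javascript
// Builds the request object with the three properties described in the video.
// buildRequest is a hypothetical helper name, not part of the Google library.
function buildRequest(text, languageCode = "en-US", ssmlGender = "NEUTRAL") {
  return {
    // 1. the text to be spoken
    input: { text },
    // 2. the voice: language code plus MALE, FEMALE or NEUTRAL
    voice: { languageCode, ssmlGender },
    // 3. the audio encoding of the returned audio
    audioConfig: { audioEncoding: "MP3" },
  };
}

const request = buildRequest("Hello from Node.js");
console.log(JSON.stringify(request, null, 2));
```

Keeping the object in a small function makes it easy to switch the language code or gender later without touching the rest of the code.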
12:22
we will again declare this variable
12:24
response
12:27
and inside this
12:30
sorry this needs to be
12:33
not
12:34
those brackets this needs to be
12:35
square bracket response
12:38
and here we need to call this function
12:40
which is
12:42
first we need to await it because this
12:44
is an async function so we are using
12:46
async await
12:47
so client dot and this contains a method
12:50
guys which is synthesize
12:53
speech
12:54
so which will actually carry out the
12:55
process of
12:57
text to speech so here we can put our
12:59
request pass this request all these
13:01
options
13:02
so now we will get the response guys
13:05
so we need to store it inside
13:08
the local system so we can just say
13:09
const writeFile
13:13
and
13:14
we can use util method here util dot
13:18
promisify and here we can pass
13:21
file system
13:23
dot writeFile so there is a method
13:25
inside file system which lets you
13:29
write the file to a local file system
13:31
path so we use this method
13:33
util.promisify
13:35
and we have passed fs.writeFile
13:41
so after that guys we just need to say
13:43
await
13:45
again the await keyword you will write
13:46
and writeFile
13:49
and here we can pass the path to
13:51
whichever
13:52
let me call this output.mp3
13:55
and the second argument will be
13:58
response
14:00
dot audio content
14:04
and the third argument will be binary
14:06
this will be a binary file this is audio
14:08
file
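The file-writing step described above can be sketched on its own, using only Node.js built-ins. `saveAudio` is a hypothetical helper name; the bytes passed in stand in for the audio content returned by the API.

```javascript
const fs = require("fs");
const util = require("util");

// Promise-returning version of fs.writeFile, as described in the video.
const writeFile = util.promisify(fs.writeFile);

// Hypothetical helper: writes the audio bytes to disk and returns the path.
// audioContent is expected to be a Buffer / Uint8Array (or a binary string).
async function saveAudio(outputPath, audioContent) {
  // The "binary" encoding only matters when audioContent is a string;
  // it is ignored for Buffers.
  await writeFile(outputPath, audioContent, "binary");
  return outputPath;
}
```

On current Node.js versions, `fs.promises.writeFile` does the same job without `util.promisify`; the sketch keeps the approach used in the video.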
14:10
and lastly we can console log on the
14:11
screen that
14:13
audio text to speech has
14:19
has
14:20
completed
14:22
audio file has been saved
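Putting the steps together, a hedged sketch of the whole function might look like the following. It is not the exact code from the video: the client is passed in as a parameter (an assumption made so the function can be tried with a stand-in client), the sample sentence is made up, and the real API is only called when GOOGLE_APPLICATION_CREDENTIALS is set.

```javascript
const fs = require("fs");
const util = require("util");

const writeFile = util.promisify(fs.writeFile);

// client is any object with a synthesizeSpeech(request) method; in the real
// app it would be `new textToSpeech.TextToSpeechClient()`.
async function convertTextToMp3(client, text, outputPath = "output.mp3") {
  const request = {
    input: { text },
    voice: { languageCode: "en-US", ssmlGender: "NEUTRAL" },
    audioConfig: { audioEncoding: "MP3" },
  };
  // synthesizeSpeech resolves to an array; the first element is the response.
  const [response] = await client.synthesizeSpeech(request);
  await writeFile(outputPath, response.audioContent, "binary");
  console.log("Text to speech has completed. Audio file has been saved.");
  return outputPath;
}

// Load .env if dotenv is installed; otherwise rely on the shell environment.
try {
  require("dotenv").config();
} catch (_) {
  /* dotenv not installed */
}

// Only talk to the real API when credentials are configured.
if (process.env.GOOGLE_APPLICATION_CREDENTIALS) {
  const textToSpeech = require("@google-cloud/text-to-speech");
  convertTextToMp3(new textToSpeech.TextToSpeechClient(), "Hello from Node.js")
    .catch(console.error);
}
```

Passing the client in keeps the function easy to try out locally with a fake object that returns a few bytes, before spending any API quota.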
14:27
so now we can run this application in
14:29
the command line so simply
14:33
you will see it has successfully
14:36
rerun the application with nodemon and
14:38
it has shown the message on the screen
14:39
that text to speech has completed audio
14:42
file has been saved so you will see in
14:45
the left hand side it has created this
14:47
output dot mp3 file
14:49
so basically if i play this file here
14:51
you will see the text that i have
14:53
written here
14:54
if i open this file
14:56
john williamson latham is the captain of
14:58
new zealand
15:00
so now you can see a person is speaking
15:02
those words here
15:04
and
15:06
you can provide longer text as well
15:08
though the api does cap the input size per request
15:10
so
15:11
you can provide a long paragraph and it
15:14
will read it out and it will create a
15:16
long file
15:18
so i will now show you different
15:19
languages as well you can test this
15:21
application you can try out hindi as
15:23
well so you can just
15:24
see just search for
15:27
random text here
15:34
so let me just copy this text and paste
15:36
it inside the application
15:45
so now you can see
15:46
now this is a hindi text here so we also
15:48
need to change the language code
15:51
as well so
15:52
so for hindi you can search
15:55
language codes on the internet so this
15:56
will be hi for hindi and dash
16:00
IN for the india country code
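Switching languages only changes the voice part of the request. A small sketch, using the two codes mentioned in the video (en-US and hi-IN); `LANGUAGE_CODES` and `voiceFor` are hypothetical names, and other languages would need their own codes added to the table.

```javascript
// Language codes mentioned in the video; extend the table as needed.
const LANGUAGE_CODES = {
  english: "en-US",
  hindi: "hi-IN",
};

// Hypothetical helper: returns the `voice` part of the request
// for a given language name.
function voiceFor(language, ssmlGender = "NEUTRAL") {
  const languageCode = LANGUAGE_CODES[language.toLowerCase()];
  if (!languageCode) {
    throw new Error(`No language code configured for "${language}"`);
  }
  return { languageCode, ssmlGender };
}

console.log(voiceFor("Hindi")); // { languageCode: 'hi-IN', ssmlGender: 'NEUTRAL' }
```

The input text has to be written in the matching language as well, as done in the video by pasting Hindi text.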
16:04
and once again if you run this
16:07
now it has again created this output.mp3
16:10
file
16:14
so now if i play this
16:16
is
16:26
so now you can just see here it
16:28
has spoken in hindi and
16:31
so in this way you can try out for any
16:33
sort of language either spanish or any
16:36
other supported language so you just need
16:37
to change this language code and
16:40
just the text out there
16:42
so thanks so much for watching this
16:43
video i will be seeing you in the next
16:45
video

webninjadeveloper.com
