0:00
Hi there, friends, welcome back. In this lesson you're going to learn about different types of activation functions.
0:08
In the previous lesson we learned about activation functions, why we need them and what they are, and this is just an extension of that lesson.
0:17
Here we're going to learn more about the different types of activation functions. Okay, we're going to represent these activation functions
0:23
mathematically, graphically, as well as programmatically. Okay, first we're going to cover these two
0:29
parts, mathematically and graphically, and afterward we're going to apply some programming skills in order to, you know, represent these activation functions. Okay,
0:38
so stay tuned and let's get started. The very first function which we're
0:43
going to discuss over here is the sigmoid activation function. Here you can see
0:47
what its graph looks like; it's an S-shaped curve. Okay, and it is going to
0:52
map the input to a value between zero and one, okay, and this works
1:00
super well for, you know, doing binary classification, that is, in order to predict probabilities between two classes. Okay, so you have to apply the sigmoid
1:10
activation function in the output layer. Okay, it works in the output layer in order to
1:16
work for binary classification. Okay, and how can you represent this sigmoid function mathematically? Here you can see on my screen:
1:23
it is one upon one plus e raised to the power minus x, that is, sigmoid(x) = 1 / (1 + e^(-x)). Okay, so this x is the input.
1:29
This is how you can represent the sigmoid activation function mathematically. The disadvantage of the sigmoid activation function is that it has
1:39
the vanishing gradient problem. Okay, in order to tackle that we have some other types of activation functions which we can use.
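As a quick illustration of the formula just mentioned, here is a minimal NumPy sketch (an assumed standalone example, not the lecturer's notebook shown later) that squashes a few inputs into the (0, 1) range:

```python
import numpy as np

def sigmoid(x):
    # sigmoid(x) = 1 / (1 + e^(-x)): squashes any real input into (0, 1)
    return 1 / (1 + np.exp(-x))

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(sigmoid(x))  # near 0 for negative inputs, 0.5 at 0, near 1 for positive inputs
```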
1:48
Okay, so the next type of activation function is the hyperbolic tangent activation function. It is just similar to the sigmoid function, but it maps the input value between minus one and one.
2:02
In the case of sigmoid it maps the value between zero and one, but this hyperbolic
2:08
tangent activation function maps the value between minus one and one, okay,
2:14
and here you can see how we can represent this activation function mathematically okay
2:20
And this kind of activation function is basically used in hidden layers so that we can,
2:26
you know, recognize complex patterns and complex relationships in the data very easily,
2:31
which is going to help, you know, in training our neural network. Okay, so if you compare this
2:40
activation function with the sigmoid function, we're going to have a lot of advantages and a lot of differences as well. Okay, but yeah, it is more suitable to,
2:50
you know, model more complex patterns as compared to the sigmoid activation function.
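To make the comparison concrete, here is a small sketch (again an assumed NumPy example, not the notebook from the lesson) showing that tanh maps the same inputs into (-1, 1) instead of (0, 1):

```python
import numpy as np

def tanh(x):
    # tanh(x) = (e^x - e^-x) / (e^x + e^-x); NumPy provides it directly as np.tanh
    return np.tanh(x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(tanh(x))  # roughly [-0.96, -0.76, 0.0, 0.76, 0.96], centred around zero
```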
2:56
Okay, next we have another kind of activation function, which is called
3:00
the rectified linear unit activation function; in short you can also say ReLU. This is
3:04
the most commonly used activation function. Whenever you're going to look up
3:08
how to build a neural network in a Google search, you'll get to find this
3:12
particular activation function again and again. This is the most commonly used activation function in hidden layers, okay, and what it does basically is
3:20
that it returns the maximum of 0 and the input value, okay; it is going to
3:27
return 0 for negative values. Suppose you have some values like minus 1,
3:32
minus 2, and then, you know, 0, 1, 2, something like that; okay, for minus 2, minus 1, 0,
3:39
1, 2, the output will be 0, 0, 0, then 1, 2. All those negative
3:44
values will be converted into 0. This is how the ReLU activation function works. So somehow it also solves the vanishing gradient problem, okay, but there is another problem which, you know,
3:57
occurs over here because we are completely ignoring the negative part. Okay, so you can call it the dead neurons problem.
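Here is a minimal sketch of the ReLU idea just described (an assumed example, separate from the notebook shown later): negative inputs become 0, positive inputs pass through unchanged.

```python
import numpy as np

def relu(x):
    # max(0, x) applied element-wise: negatives become 0, positives pass through
    return np.maximum(0, x)

x = np.array([-2, -1, 0, 1, 2])
print(relu(x))  # [0 0 0 1 2], matching the example in the lesson
```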
4:03
This is the kind of problem we face, and in order to, you know, cope with this problem we're going to have another variant of the
4:10
ReLU activation function, which is known as the leaky ReLU activation function. This is just a variant of
4:16
the ReLU activation function where we're going to have a slight slope for the negative
4:21
values. And here, what we're going to do: in the case of ReLU we are making the negative values
4:27
zero, but here in the case of leaky ReLU we are multiplying negative values by some alpha
4:33
value. That alpha value is completely decided by us, okay; it can be some small value like 0.02.
4:41
You just have to understand the complexity of your problem and your dataset, and you have to
4:45
adjust the alpha value accordingly. Okay. So this is how you can use this ReLU and, you know,
4:53
this leaky ReLU activation function. Okay, it is basically solving the dying
4:59
ReLU problem, which we have in the case of ReLU. Okay.
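A minimal sketch of leaky ReLU as just described, with the alpha value as a parameter we pick ourselves (the 0.02 here is only an assumed illustration, not a recommended setting):

```python
import numpy as np

def leaky_relu(x, alpha=0.02):
    # positive inputs pass through; negatives are scaled by alpha instead of being zeroed
    return np.where(x > 0, x, alpha * x)

x = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
print(leaky_relu(x))  # [-0.04 -0.02  0.  1.  2.] -- negatives keep a small slope
```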
5:06
Next, we have the softmax activation function, okay, and this is an activation function which is commonly used in the output layer for multi-class classification problems.
5:15
Okay, sigmoid works fine for binary classification, the tanh activation works fine for
5:22
binary classification as well; this kind of activation function, the softmax activation function, works fine for
5:28
multi-class. We're going to have, you know, a single neuron for each class in the output layer, and at that point of time
5:36
we're going to use this softmax activation function instead of using, you know, this
5:41
sigmoid or tanh activation function. Okay, and this is how you can represent it; this is its mathematical representation, here you can find it.
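Here is a hedged sketch of softmax for a single output vector (an assumed example; the on-screen formula from the lecture is not reproduced here): it turns raw class scores into probabilities that sum to 1, one per class.

```python
import numpy as np

def softmax(z):
    # subtract the max for numerical stability, then normalise the exponentials
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])  # raw outputs for three classes
print(softmax(scores))              # roughly [0.66, 0.24, 0.10], sums to 1
```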
5:55
And yeah, so these are the different types of activation functions which we have learned over here. Now we're going to discuss how you can apply
6:01
them programmatically. Okay, so I hope it will be more interesting. Okay, so let's get
6:06
started. Here you can see I have opened a Google Colab notebook where I have
6:11
already created a program for the different types of activation functions.
6:15
And here you can see the very first thing which we did is to import the NumPy and
6:19
Matplotlib libraries. Then I have created an x variable which is going to hold a sequence of values from minus 5 to
6:25
plus 5. Then I've created functions for the different activation functions, such as sigmoid, where I just
6:30
used the mathematical expression which I showed you earlier. Then I have the tanh function, where I've just used the tanh built-in function
6:38
offered by the NumPy library. Then we've got the softmax function, again using that mathematical expression. Then we have ReLU and leaky ReLU; in
6:45
ReLU, as I mentioned, we replace all the negative values with 0, whereas in the case
6:50
of leaky ReLU we just have to multiply the negative values by the alpha value,
6:55
which is 0.2 over here. Then I just plot all those, you know, my different kinds of
7:00
activation functions, and this is how it looks. The sigmoid function, as usual, it
7:05
is S-shaped; the ReLU function is ignoring all those negative values; and then
7:09
the tanh activation function is similar to the sigmoid, but here it goes from minus one to plus one over inputs from minus five to
7:17
positive five; then we've got the softmax function; and then leaky ReLU, covering the negative slope as well.
7:23
So this is how it looks and how we can implement activation functions programmatically.
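The exact Colab notebook isn't reproduced here, but a sketch along the lines just described (x from -5 to 5, the five functions, one plot per function; the alpha of 0.2 is taken from the lesson, while the plot layout is an assumption) might look like this:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-5, 5, 200)  # input sequence from -5 to +5

def sigmoid(x):    return 1 / (1 + np.exp(-x))
def tanh(x):       return np.tanh(x)
def relu(x):       return np.maximum(0, x)
def leaky_relu(x): return np.where(x > 0, x, 0.2 * x)  # alpha = 0.2 as in the lesson
def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

functions = [("Sigmoid", sigmoid), ("Tanh", tanh), ("ReLU", relu),
             ("Leaky ReLU", leaky_relu), ("Softmax", softmax)]

# one subplot per activation function
fig, axes = plt.subplots(1, 5, figsize=(18, 3))
for ax, (name, fn) in zip(axes, functions):
    ax.plot(x, fn(x))
    ax.set_title(name)
plt.tight_layout()
plt.show()
```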
7:27
I hope it is clear to all of you. So in our upcoming lesson you're going to learn more about neural networks.
7:34
Till then, keep learning, keep exploring, and stay motivated.