0:00
Hey everyone, my name is Asta Chohan, welcome to Tutorials Point
0:04
In the previous video we talked about the Naive Bayes algorithm, and in this video we are going to talk about logistic regression
0:12
So let's see what's in it for you in this video. We are going to look at: why logistic regression, where does logistic regression fit in,
0:20
what is logistic regression, linear regression and why not linear regression, the math behind logistic regression,
0:27
the difference between linear regression and logistic regression, and the applications of logistic regression
0:35
So let's first understand why logistic regression. Let's take an example: we have some employees, and on the basis of the employees' ratings
0:44
we have to determine whether an employee is going to get the promotion or not
0:48
So let's build a model for this. We will train the model or teach the model on employees' data
0:54
then the model will drop the non-essential components of the dataset, that is, whatever is not important for predicting whether the person is going to get the promotion or not
1:04
Then the model will determine whether the employee will get the promotion or not
1:09
So this is where we use logistic regression, that is, in binary classification
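As a rough sketch of this idea (assuming scikit-learn and a small made-up set of employee ratings and promotion labels, not data from the video):

```python
# Minimal sketch: binary classification with logistic regression.
# Ratings and promotion labels below are made-up illustrative data.
import numpy as np
from sklearn.linear_model import LogisticRegression

ratings = np.array([[2.0], [3.1], [3.5], [4.0], [4.5], [4.8]])  # employee ratings
promoted = np.array([0, 0, 0, 1, 1, 1])                         # 1 = promoted, 0 = not

model = LogisticRegression()
model.fit(ratings, promoted)

new_rating = np.array([[4.2]])
print(model.predict(new_rating))        # predicted class, e.g. [1] -> gets the promotion
print(model.predict_proba(new_rating))  # [[P(no promotion), P(promotion)]]
```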
1:14
Now the question arises: where does logistic regression fit in? So let's see where it fits in
1:20
Machine learning algorithms are divided into two types: supervised learning and unsupervised learning. Supervised learning is further divided into classification and regression,
1:30
and unsupervised learning is further divided into clustering and association. So logistic regression falls under supervised learning, for solving classification-type problems
1:41
Now let's understand what logistic regression is. By definition, logistic regression produces results in a binary format and is used to predict the outcome of a categorical dependent variable
1:52
So the outcome should be discrete or categorical. That means we use logistic regression where we need to classify into groups, or where we have only binary outcomes like yes/no, true/false, whether an event is going to happen or not
2:07
You can observe the curve of logistic regression on the screen. This S-curve represents the logistic regression curve, and we have the probability of the event happening on the y-axis
2:18
and the independent variable on the x-axis. We have some threshold value in this graph
2:25
What does this threshold value mean? If the value is less than the threshold value,
2:30
then we treat the outcome as zero. That means the event is not going to happen
2:36
If the value of the probability is greater than this threshold, that means the event is going to happen
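To make the threshold rule concrete, here is a tiny sketch in plain Python; the 0.5 cut-off is just the commonly used default, not a value fixed by the video:

```python
# Turn a predicted probability into a binary outcome using a threshold.
def classify(probability: float, threshold: float = 0.5) -> int:
    """Return 1 if the event is predicted to happen, otherwise 0."""
    return 1 if probability >= threshold else 0

print(classify(0.82))  # 1 -> event predicted to happen
print(classify(0.31))  # 0 -> event predicted not to happen
```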
2:42
And this is how we determine the outcomes in logistic regression. Now the question arises, why not linear regression
2:50
So for that, let's understand linear regression first. We have also discussed linear regression
2:55
in one of our previous videos, but let's go over it again. By definition, linear regression is a statistical method used to determine the strength and direction of the relationship between a dependent variable and one or more independent variables
3:09
It is called linear regression because it assumes that the relationship between the variables is linear
3:16
That means a change in one variable is associated with a proportional change in another variable
3:21
You can observe the linear regression graph on the screen. We have some data points between the dependent variable and the independent variable, and this
3:31
straight line is called the line of regression, which helps us in predicting the values. For example,
3:37
we can predict the salary hike on the basis of the employee's rating but we can't predict
3:42
whether the employee will get the promotion or not. In this example we can predict the exact value
3:48
of the salary hike for every employee. That means in this case we have a numerical outcome or
3:54
continuous outcome. That means we use linear regression where we need a numerical value or a continuous value as the outcome.
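For comparison, here is a minimal sketch of that continuous-outcome case, fitting scikit-learn's LinearRegression to made-up rating and salary-hike numbers (purely illustrative figures):

```python
# Minimal sketch: linear regression for a continuous outcome.
# Ratings and salary-hike percentages below are made-up illustrative data.
import numpy as np
from sklearn.linear_model import LinearRegression

ratings = np.array([[2.0], [3.0], [3.5], [4.0], [4.5], [5.0]])
salary_hike_pct = np.array([4.0, 6.0, 7.0, 8.0, 9.0, 10.0])

model = LinearRegression()
model.fit(ratings, salary_hike_pct)

# Predict the exact (numerical) salary hike for a new rating
print(model.predict(np.array([[4.2]])))  # a continuous value, roughly 8.4
```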
4:10
Now let's try to use linear regression for another case: whether the employee will get the promotion or not, on the basis of the employee's rating. So here we can observe that using linear regression in this case is totally irrelevant,
4:16
as we have some data points where the employee will get the promotion and others where the employee will not
4:23
And this line is not helping us in predicting this; we have to clip the straight line like this
4:29
If the data points are here, that means the employee will not get the promotion
4:33
If the data points are here, then the employee will get the promotion. But again, we have three straight lines here
4:39
That means we have to deal with three equations and that is not efficient at all
4:45
So we will convert these three lines into a curve like this
4:50
which is called the sigmoid curve and represents logistic regression. And here we can easily
4:56
predict whether the employee will get the promotion or not. If the data points have a value
5:02
greater than the threshold value, then the employee will get the promotion, and if the value is less
5:07
than the threshold value, then the employee will not get the promotion. Now let's talk about the math
5:12
behind logistic regression. For understanding logistic regression, let's talk about the odds of
5:17
success. How can we calculate the odds of success? We can calculate this using the probability of the event
5:24
happening over the probability of the event not happening. That means p / (1 − p), where p is the probability of the event happening
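As a quick sanity check of that formula, here is a tiny sketch computing the odds from a probability; the probabilities chosen are arbitrary examples:

```python
# Odds of success = p / (1 - p), where p is the probability of the event happening.
def odds_of_success(p: float) -> float:
    return p / (1 - p)

print(odds_of_success(0.5))   # 1.0 -> the event is as likely to happen as not
print(odds_of_success(0.8))   # ~4  -> odds of roughly 4 to 1 in favour
print(odds_of_success(0.99))  # ~99 -> the odds grow towards infinity as p approaches 1
```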
5:34
The values of the odds of success range from 0 to infinity, while the value of the probability ranges from
5:40
0 to 1. Now let's consider the equation of a straight line. We all know a straight line is represented as y = β0 + β1·x, where
5:50
β0 is the y-intercept, β1 is the slope of the line,
5:54
and x is the value of the x-coordinate, and y is the predicted value
5:59
Now we predict the odds of success. Let's put y equal to the log of the odds of success. Then it becomes log(p(x) / (1 − p(x))) = β0 + β1·x. Exponentiating both sides and further
6:15
solving it, we get the final equation, which is p(x) = 1 / (1 + e^−(β0 + β1·x)),
6:22
which is also called the sigmoid function and represents the sigmoid curve or S-curve.
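Here is a short sketch of that final equation; the intercept and slope values are arbitrary, chosen only to show that the sigmoid maps any input to a probability between 0 and 1:

```python
import math

# Sigmoid function: p(x) = 1 / (1 + e^-(b0 + b1 * x))
# b0 (intercept) and b1 (slope) are arbitrary illustrative values.
def sigmoid(x: float, b0: float = -4.0, b1: float = 1.0) -> float:
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))

for x in [0, 2, 4, 6, 8]:
    print(x, round(sigmoid(x), 3))  # values rise from near 0 to near 1 (the S-curve)
```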
6:29
Now let's talk about the difference between linear
6:35
regression and logistic regression. We use linear regression for solving the regression type problems
6:41
while we use logistic regression for solving the classification type problems. The response variables
6:47
are continuous in nature in linear regression, while the response variable in logistic
6:53
regression is categorical in nature. Linear regression helps estimate the dependent variable
6:59
when there is a change in the independent variable, while logistic regression helps calculate the probability
7:06
of a particular event taking place. As we discussed before, we all know that linear regression is a straight line
7:13
and logistic regression is an S-curve, which is also called the sigmoid curve
7:18
Now, what are the applications of logistic regression? We use logistic regression in weather prediction
7:23
like whether it will rain or not, whether it will be sunny or not; in image categorization, like whether the given
7:29
picture is of a dog or not, a cat or not, and so on. It is also used in determining illness
7:36
So that was it for this video. We have now covered the first section, that is, supervised machine learning algorithms
7:43
In that, we covered the supervised machine learning algorithms: KNN, decision tree, linear regression, support
7:50
vector machine, random forest, Naive Bayes, and logistic regression. And in the next video, we are going to talk about unsupervised machine learning algorithms,
7:59
and the rest of the machine learning algorithms in further videos. So stay tuned with Tutorials Point
8:05
Thanks for watching and have a nice day