Machine Learning - Frequency Distributions and Graphs - Histograms, Frequency Polygons and Ogives
Oct 17, 2024
0:00
In this video, we are going to discuss histograms, frequency polygons and ogives.
0:07
We start with the histogram. The histogram is a graph that displays data by using contiguous vertical bars (unless
0:17
the frequency of a class is zero) of various heights to represent the frequency of each
0:24
class. To get a better understanding, consider this example.
0:29
Here we have the class boundaries: 99.5 to 104.5, then 104.5 to 109.5, and so on, giving us multiple
0:40
classes. Counting them, 1, 2, 3, 4, 5, 6, 7, we have seven classes in total, each defined by its
0:49
class boundaries, and next to them are the respective frequencies. For each class, we have one separate vertical bar.
0:58
So how many bars do we have here? We have seven bars, because we have seven classes.
1:05
The bars are plotted according to the frequency: a frequency of 2 is plotted up to 2, 8 is plotted up to 8, then 18, then 13, and so on. So this is the histogram for this set of data, which counts the record high temperatures falling in each range. Here the temperature
1:25
is expressed in degrees Fahrenheit. And along the y-axis, we plot the frequency.
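The example above can be sketched in a few lines of Python. This is only an illustration: the transcript states the first two class boundaries and the first four frequencies (2, 8, 18, 13), so the sketch assumes the boundaries keep the same width of 5 and leaves out the remaining three classes, whose frequencies are not given. The function name `text_histogram` is hypothetical, and the bars are drawn sideways with `#` marks rather than as vertical bars.

```python
# Class boundaries and frequencies from the example.
# Assumption: boundaries after 109.5 continue at the same width of 5.
# Only the four frequencies stated in the video are used.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5]
frequencies = [2, 8, 18, 13]


def text_histogram(boundaries, frequencies):
    """Return one text line per class: its boundaries and a bar of '#'."""
    lines = []
    # Pair each lower boundary with the next boundary (the upper one).
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        bar = "#" * freq  # bar length equals the class frequency
        lines.append(f"{lower:6.1f} - {upper:6.1f} | {bar} ({freq})")
    return lines


histogram_lines = text_histogram(boundaries, frequencies)
print("\n".join(histogram_lines))
```

Because adjacent classes share a boundary, the bars are contiguous, which is what distinguishes a histogram from an ordinary bar chart.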
1:33
Advantages and disadvantages of histograms. Histograms depict the frequencies of observations occurring in certain ranges, or intervals,
1:41
of values, and the intervals must be adjacent. By representing the distribution of numeric data, a histogram can give a rough sense of
1:53
the density of the underlying distribution of the data. So we get some idea of where the data is
2:02
more densely populated. Now the disadvantages. Random fluctuations in the values and alternative
2:11
choices for the ends of the intervals can give very different diagrams. That means if we change the interval limits, a different diagram will
2:22
be formed. Apparent multimodality can arise and then vanish for different choices of intervals or for different small samples. So we might see several separate peaks, for example several ranges sharing the highest frequency, that disappear under another choice of intervals.
2:40
These effects diminish as the size of the dataset increases. So these are the disadvantages of the histogram.
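The sensitivity to interval choice can be seen on a very small dataset. The sketch below is a rough illustration, not part of the video: the values are made up, and the helper names `bin_counts` and `count_peaks` are hypothetical. The same eight values are binned twice from the same starting point, once with width 2 and once with width 3, and the number of apparent peaks changes.

```python
# Hypothetical small sample (not from the video).
values = [1, 2, 2, 3, 5, 6, 6, 7]


def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each interval [start + k*width, start + (k+1)*width)."""
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


def count_peaks(counts):
    """Count local maxima, treating a run of equal neighbouring counts as one."""
    merged = [c for i, c in enumerate(counts) if i == 0 or c != counts[i - 1]]
    padded = [-1] + merged + [-1]  # sentinels so the end bins can be peaks
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] > padded[i + 1]
    )


narrow = bin_counts(values, start=0.5, width=2, n_bins=4)
wide = bin_counts(values, start=0.5, width=3, n_bins=3)

print(narrow, count_peaks(narrow))  # [3, 1, 3, 1] 2
print(wide, count_peaks(wide))      # [4, 3, 1] 1
```

With width 2 the sample looks bimodal; with width 3 the second peak vanishes. With only eight values, either picture could be an artifact, which is why the effect matters most for small samples.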
2:48
Next we are going to discuss the frequency polygon. The frequency polygon is a graph that displays the data by using lines that connect points
2:57
plotted for the frequencies at the midpoints of the classes. The frequencies are represented
3:04
by the heights of the points. In the same way as before, here we have the respective midpoints, and each midpoint
3:12
is nothing but the lower class boundary plus the upper class boundary, divided by two.
3:16
That gives us the midpoints, and next to them are the respective frequencies.
3:20
Along the x-axis we plot the respective midpoints, and these are the points
3:26
we are plotting, because along the y-axis we have the frequency.
3:30
So the height of each point is decided by its frequency.
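As a minimal sketch of the midpoint rule just described, the code below computes the points of a frequency polygon as (midpoint, frequency) pairs. It reuses the four classes whose frequencies are stated in the video and assumes the class width stays at 5; the function name `polygon_points` is hypothetical.

```python
# Assumption: boundaries after 109.5 continue at the same width of 5.
# Only the four frequencies stated in the video are used.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5]
frequencies = [2, 8, 18, 13]


def polygon_points(boundaries, frequencies):
    """Return (midpoint, frequency) pairs, one per class."""
    points = []
    for lower, upper, freq in zip(boundaries, boundaries[1:], frequencies):
        midpoint = (lower + upper) / 2  # lower plus upper boundary, divided by two
        points.append((midpoint, freq))
    return points


points = polygon_points(boundaries, frequencies)
print(points)  # [(102.0, 2), (107.0, 8), (112.0, 18), (117.0, 13)]
```

Joining these points in order with straight line segments gives the frequency polygon.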
3:34
And then we have a line graph which connects all
3:39
these points, and that produces the frequency polygon. Now we are going to discuss the ogive. The ogive is a graph that represents the cumulative frequencies for the classes in a frequency distribution.
3:56
Here you can see that for values less than 99.5, the cumulative frequency is 0.
4:00
You can find it from here, because this is the first range, and within it the frequency is 2.
4:04
So less than 99.5, the cumulative frequency is 0. Less than 104.5, the cumulative frequency is 2.
4:11
Then, less than 109.5, the cumulative frequency is 10,
4:20
because less than 109.5 we have 8 plus 2, which is 10.
4:26
Next it becomes 10 plus 18, which is 28.
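The running totals above can be reproduced with a short sketch. It pairs each class boundary with the cumulative frequency of all values below it, starting from 0 at the first boundary. As before, only the four frequencies stated in the video are used, the class width of 5 is assumed to continue, and the function name `ogive_points` is hypothetical.

```python
from itertools import accumulate

# Assumption: boundaries after 109.5 continue at the same width of 5.
boundaries = [99.5, 104.5, 109.5, 114.5, 119.5]
frequencies = [2, 8, 18, 13]


def ogive_points(boundaries, frequencies):
    """Return (boundary, cumulative frequency below that boundary) pairs."""
    # Nothing lies below the first boundary, so the running total starts at 0.
    cumulative = [0] + list(accumulate(frequencies))
    return list(zip(boundaries, cumulative))


ogive = ogive_points(boundaries, frequencies)
for boundary, cum_freq in ogive:
    print(f"less than {boundary}: {cum_freq}")
# less than 99.5: 0
# less than 104.5: 2
# less than 109.5: 10
# less than 114.5: 28
# less than 119.5: 41
```

Note that the ogive uses the class boundaries on the x-axis, not the midpoints used for the frequency polygon.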
4:30
In this way, the cumulative frequencies are formed. Now here we plot the class boundaries, and the cumulative frequency
4:40
is plotted along the y-axis. So you see, this is the cumulative frequency, which is plotted along the y-axis, and
4:46
the class boundaries are plotted along the x-axis, and this is the
4:52
line graph we get here. This is known as the ogive. So in this video
4:58
we have discussed the histogram, the frequency polygon, and also the ogive. Thanks for watching
5:04
this video.
#Machine Learning & Artificial Intelligence
#Statistics