Machine Learning - The Nature of Probability and Statistics - Data Collection and Sampling Techniques
Oct 18, 2024
0:00
In this video we are going to discuss data collection and sampling techniques.
0:05
Data collection is very important in machine learning because
0:11
in machine learning we work with huge sets of data,
0:16
so we need to understand how this data can be collected.
0:21
We shall discuss that, and also the sampling techniques.
0:26
Data can be collected in a variety of ways. One of the most common methods is the survey, and surveys can be done
0:36
using a variety of methods. We can go for telephone surveys, we can go for mailed questionnaire surveys, we can go
0:45
for personal interview surveys, and we can also survey records or directly observe the situation.
0:52
So there are multiple ways in which we can do a survey to collect our data.
0:56
A telephone survey is much cheaper because you do not need to travel
1:02
physically. But sometimes people may not pick up the phone, or they
1:07
may not be reachable; those are the disadvantages of the telephone survey.
1:13
A mailed questionnaire survey is a little bit costly, because you
1:17
have to mail all the questionnaires by courier or by some other means, although you can also send the questionnaire by email. In this case, not all of the questionnaires you send out will be answered.
1:35
But people might be more candid in their opinions, because you are not questioning them directly to gather the data.
1:45
Personal interview surveys are very fruitful because you can conduct the survey
1:50
by meeting the person face to face, but in some cases the
1:55
interviewer might be biased, so personal interview surveys may have flaws of their own. Data can also be collected by surveying records or by
2:06
direct observation of the situation. So there are different ways in which
2:11
data can be collected through the survey method. Statisticians use four
2:18
basic methods of sampling: random sampling, systematic
2:24
sampling, stratified sampling, and cluster sampling. We know that whenever the
2:30
population is huge, with a great many subjects, and
2:35
it is not feasible to reach and interact with each and every subject,
2:41
we go for sampling. A sample is nothing but a collection of subjects
2:45
taken from the population. First, we go for random samples. Random samples are selected by using chance methods or random numbers, so we pick subjects from the population at random.
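As a rough illustration of the random sample just described, here is a minimal Python sketch. The function name `simple_random_sample`, the subject IDs 1 to 100, the sample size of 20, and the seed are all assumptions made for the example; they are not from the video.

```python
import random


def simple_random_sample(population, sample_size, seed=None):
    """Pick sample_size distinct subjects purely by chance."""
    rng = random.Random(seed)  # a seed is used only to make the example repeatable
    return rng.sample(population, sample_size)


# Hypothetical population: subjects numbered 1 to 100.
subjects = list(range(1, 101))
random_sample = simple_random_sample(subjects, 20, seed=42)
print(sorted(random_sample))
```

Every subject has the same chance of being picked, and no subject is picked twice.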
3:03
Next, researchers obtain systematic samples by numbering each subject of the
3:08
population and then selecting every kth subject. Let us suppose we have 100 subjects.
3:17
Out of them, we are trying to
3:22
pick a sample of size, say, 20. In that case, 100 divided by 20 is 5, so we can
3:29
pick every fifth subject from the population. The numbers 100 and 20 are used just as an example.
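The arithmetic above (100 subjects, a sample of 20, so k = 5) can be sketched in Python as follows. The function name `systematic_sample` is hypothetical, and choosing the starting subject at random among the first k is an assumption added here; the video only says to take every kth subject.

```python
import random


def systematic_sample(population, sample_size, seed=None):
    """Select every kth subject, where k = population size // sample size."""
    k = len(population) // sample_size
    if k < 1:
        raise ValueError("sample_size cannot exceed the population size")
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random starting position among the first k
    return population[start::k][:sample_size]


# Hypothetical population: subjects numbered 1 to 100, sample of size 20.
subjects = list(range(1, 101))
systematic = systematic_sample(subjects, 20, seed=1)
print(systematic)
```

With these figures k is 5, so consecutive picks are always five subject numbers apart.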
3:36
These figures are only for example and explanation. So, researchers obtain systematic
3:43
samples by numbering each subject of the population and then selecting
3:47
every kth subject from the population. Next, we go for stratified samples.
3:54
Researchers obtain stratified samples by dividing the population into groups, also known as strata,
4:02
according to some characteristic that is important to the study, and then sampling from each group.
4:10
So, depending on some property, the population is divided into a number of strata, and then we do the sampling by taking subjects from each stratum. Next, we go for
4:26
cluster samples. Researchers also use cluster samples. Here the population is divided into groups, also known as clusters, by some means such as
4:38
geographic area or other properties, for example the schools in a large school district.
4:45
So, using some property, the population is divided into multiple clusters.
4:50
Then the researcher randomly selects some of the clusters and uses all the
4:56
subjects belonging to those selected clusters. That is, all the members of the selected clusters are used as the subjects of the sample.
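Stratified and cluster sampling both start by dividing the population into groups, so the difference is easiest to see side by side. The following is only a sketch: the helper names, the 12 made-up students, and the four school labels are assumptions for illustration. In the stratified version a few subjects are drawn from every group; in the cluster version a few whole groups are drawn and all of their members are used.

```python
import random
from collections import defaultdict


def group_by(population, key):
    """Divide the population into groups according to key(subject)."""
    groups = defaultdict(list)
    for subject in population:
        groups[key(subject)].append(subject)
    return dict(groups)


def stratified_sample(population, key, per_stratum, seed=None):
    """Draw up to per_stratum subjects at random from every stratum."""
    rng = random.Random(seed)
    sample = []
    for members in group_by(population, key).values():
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample


def cluster_sample(population, key, num_clusters, seed=None):
    """Pick num_clusters clusters at random and keep all of their subjects."""
    rng = random.Random(seed)
    clusters = group_by(population, key)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [subject for name in chosen for subject in clusters[name]]


def school_of(student):
    return student["school"]


# Hypothetical data: 12 students spread evenly over 4 schools.
students = [{"id": i, "school": f"school_{i % 4}"} for i in range(12)]

stratified = stratified_sample(students, school_of, per_stratum=1, seed=0)
clustered = cluster_sample(students, school_of, num_clusters=2, seed=0)
print(stratified)
print(clustered)
```

Here the stratified sample contains one student from each of the four schools, while the cluster sample contains every student from two randomly chosen schools.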
5:05
So that is another way to do the sampling. The last one is convenience sampling.
5:11
Here a researcher uses subjects that are convenient. For example, the researcher may interview subjects entering a local mall to determine the nature
5:21
of their visit and perhaps which stores they will be visiting for shopping.
5:28
So there are different ways in which we can collect data.
5:34
One method is the survey, and we have also seen the different ways in which we can draw a sample
5:41
from the population. Thanks for watching this video.
#Computer Education
#Machine Learning & Artificial Intelligence
#Programming
#Statistics