Basics of MapReduce Algorithm
Nov 18, 2024
Basics of MapReduce Algorithm. Watch more videos at https://www.tutorialspoint.com/videotutorials/index.htm. Lecture by: Mr. Arnab Chakraborty, Tutorials Point India Private Limited.
0:00
In this video we are discussing the basics of the MapReduce algorithm.
0:05
We shall see how the two tasks, map and reduce, are carried out with the help of the mapper class and
0:11
the reducer class, and how the whole process takes place,
0:16
with one sample example to clear all your doubts. So, let us start with the basics of the MapReduce algorithm.
0:24
So here we have the mapper class, and we also have the reducer class. The
0:29
output of the mapper class will be the input to the reducer class. The mapper class
0:34
takes the input, and ultimately the output is formed by the reducer class.
0:38
These inputs and outputs will be in the form of key-value pairs. The
0:44
mapper class will do the tokenizing of the input, the mapping, and the shuffle and sort, and
0:50
the reducer will do the searching and reducing operations. So these are the
0:56
main operations in the mapper class, and these are the two operations done in the reducer class. Next: different tasks in the MapReduce algorithm.
1:05
So, let us discuss this in more detail. MapReduce programs have two tasks, the map task
1:11
and the reduce task, as shown here. The map task is done by the mapper class
1:17
and the reduce task is done by the reducer class.
1:22
The mapper class takes an input, tokenizes it, maps it and sorts it. You can see that the mapper class takes one input, tokenizes it,
1:37
and does the shuffling and sorting.
1:46
The output of the mapper class is used as the input to the reducer class, as I mentioned
1:51
earlier, which in turn searches for the matching pairs and reduces them. According to the business
1:58
logic, whatever is required, the respective outputs will be obtained, and a customized
2:04
function will work in the reducer class for the reducing operation.
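The flow described so far (map, then shuffle and sort, then reduce with a custom function) can be sketched on a single machine as below. This is only an illustrative sketch, not a real distributed framework: `run_map_reduce`, `sales_mapper`, `sales_reducer` and the sample records are all hypothetical names and data chosen for the illustration.

```python
from itertools import groupby
from operator import itemgetter


def run_map_reduce(records, mapper, reducer):
    """Run map, shuffle-and-sort, and reduce in one process (illustrative only)."""
    # Map: every input record may produce any number of (key, value) pairs.
    pairs = [pair for record in records for pair in mapper(record)]
    # Shuffle and sort: order the pairs by key so that equal keys sit together.
    pairs.sort(key=itemgetter(0))
    # Reduce: hand each key and the list of its values to the reducer.
    results = []
    for key, group in groupby(pairs, key=itemgetter(0)):
        values = [value for _, value in group]
        results.append(reducer(key, values))
    return results


def sales_mapper(record):
    # Hypothetical record format: "<item> <quantity>".
    item, quantity = record.split()
    yield item, int(quantity)


def sales_reducer(key, values):
    # The "business logic" here is a plain sum; it could be any custom function.
    return key, sum(values)


totals = run_map_reduce(["apple 3", "pear 5", "apple 4"], sales_mapper, sales_reducer)
print(totals)  # [('apple', 7), ('pear', 5)]
```

The reducer is passed in as a parameter to reflect the point that the reducing operation is a customized function that depends on the business logic.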
2:10
MapReduce task example. One of the most fundamental MapReduce examples is the word count problem.
2:18
This word count problem is something like the "hello world" problem of MapReduce. It takes a string with different words as input and counts the number
2:27
of words of each type. Let us see how the mapper class and the reducer class work for this
2:34
task. A string will be taken as input, and then it will count how many times each distinct word occurs
2:39
in that particular string; that is known as the word count problem. The mapper class tokenizes
2:45
the string and makes a sorted list of the words. It makes the word
2:51
the key and the number the value. So the word will become the key, and the number,
2:56
that is, the frequency of occurrence or the count, will be the value. The reducer class takes the list and counts the number of entries of each word in its input,
3:09
which is the output from the mapper.
3:14
It will check how many times each word has occurred.
3:20
Finally, it creates another list of keys and their respective values, where each value is the count,
3:27
that is, the number of times the word has occurred in the sentence.
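As a plain reference for what the word count problem is expected to produce, here is a small single-machine sketch that uses Python's `collections.Counter` instead of a mapper and a reducer. The helper name `reference_word_count` is hypothetical, and splitting on whitespace is an assumption about how words are separated.

```python
from collections import Counter


def reference_word_count(text):
    """Return (word, count) pairs sorted by word, without any MapReduce steps."""
    # Assumption: words are separated by whitespace and compared exactly as written.
    return sorted(Counter(text.split()).items())


print(reference_word_count("a duck is a bird"))
# [('a', 2), ('bird', 1), ('duck', 1), ('is', 1)]
```

A mapper and reducer for the same problem should arrive at the same list of keys and counts.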
3:32
So, here is one example for your better understanding. Here you see there is one sentence.
3:39
One input is here. That is "a duck is a bird".
3:43
We have taken a very simple example. The mapper class here takes a line from the input and tokenizes it.
3:50
And for each word in the line, it emits (word, 1). So it will create such tokens here.
3:57
For each and every word in the sentence, it will emit (word, 1).
4:02
In this way, it will produce output like (a, 1), (duck, 1), (is, 1), (a, 1) and
4:09
(bird, 1). So that is the mapper output. Next, it goes for the sorted output.
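The mapper step just described can be sketched as follows. It is a minimal sketch, assuming that tokenizing means splitting the line on whitespace; the function name `mapper` is chosen only for this illustration.

```python
def mapper(line):
    """Tokenize one input line and emit a (word, 1) pair for every word."""
    # Assumption: tokens are whitespace-separated words.
    for word in line.split():
        yield word, 1


mapped = list(mapper("a duck is a bird"))
print(mapped)
# [('a', 1), ('duck', 1), ('is', 1), ('a', 1), ('bird', 1)]
```

Note that the mapper does not add anything up; the word "a" simply appears twice in its output, each time with the value 1.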
4:14
So now it has been sorted in alphabetical order: a, a, bird, duck, is.
4:19
So the sorted output has been obtained in alphabetical order of the keys.
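The sorting step, together with the handling of duplicate keys that comes next, can be sketched with `sorted` and `itertools.groupby`. This is one possible way to do it on a single machine, not how a real cluster shuffles data; the variable names are chosen for the illustration.

```python
from itertools import groupby
from operator import itemgetter

# Mapper output for the sentence "a duck is a bird".
mapped = [("a", 1), ("duck", 1), ("is", 1), ("a", 1), ("bird", 1)]

# Sort the pairs alphabetically by key.
sorted_pairs = sorted(mapped, key=itemgetter(0))
print(sorted_pairs)
# [('a', 1), ('a', 1), ('bird', 1), ('duck', 1), ('is', 1)]

# Keep each key once and collect all of its values into a list.
grouped = [
    (key, [value for _, value in group])
    for key, group in groupby(sorted_pairs, key=itemgetter(0))
]
print(grouped)
# [('a', [1, 1]), ('bird', [1]), ('duck', [1]), ('is', [1])]
```

`groupby` only merges neighbouring items with the same key, which is why the sort has to happen first.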
4:24
Now, remove duplicate keys. Here you can see that a has occurred twice, so the repeated key a is kept only once, with both of its values; the rest of the keys are unique, so no elimination takes place for them. Now this output will go as
4:40
the input to the reducer class. The reducer class has an algorithm something like this:
4:46
reducer(key, values): sum is equal to zero; for each value in values, sum is
4:52
equal to sum plus value; then emit (key, sum). In this way, it takes the keys and
4:59
values as input and emits the keys and sums as output, and that is also a key-value pair.
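The reducer algorithm read out above translates almost line for line into Python. In this sketch the name `reducer` and the `grouped` input list are assumptions made for the illustration; the input is what the sorted mapper output looks like once each key appears only once with its values.

```python
def reducer(key, values):
    """Add up all values for one key and emit (key, sum)."""
    total = 0
    for value in values:
        total = total + value
    return key, total


# Each key appears once, together with the list of its values.
grouped = [("a", [1, 1]), ("bird", [1]), ("duck", [1]), ("is", [1])]

counts = [reducer(key, values) for key, values in grouped]
print(counts)
# [('a', 2), ('bird', 1), ('duck', 1), ('is', 1)]
```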
5:05
So a will have the sum 2, then (bird, 1), (duck, 1) and (is, 1);
5:12
that is the final output which will be obtained. So from this word count example
5:18
it is getting clear what the purpose is and which operations are done in the
5:23
mapper class and also in the reducer class. MapReduce task example: let the string be
5:29
"a duck is a bird". It has five words, and it is given as the input to the
5:37
mapper class. So, at first this input will be taken by the mapper class, and the mapper will generate
5:42
the key-value pairs. After that, the reducer takes them as input and finds the final word count.
5:48
In this way, the final word count will be obtained as the output. So, in this video
5:54
we have discussed the basics of the MapReduce algorithm in detail,
6:00
with a sample example. Thanks for watching this video.