Combiner in MapReduce
Nov 18, 2024
0:00
In this video, we are discussing the combiner in MapReduce
0:04
The combiner works in between the mapper and the reducer
0:09
The output of the mapper can be huge. If we make that mapper output directly available to the reducer, it will
0:17
increase network congestion. As a result, we put a combiner in between the mapper
0:25
and the reducer. This combiner can also be called a mini-reducer. However, introducing the combiner into the MapReduce
0:35
model is optional. So let us discuss it a little further.
0:41
Here we have shown that there are two mappers and their respective reducers
0:46
This output, that is, the intermediate output of the mapper, will not be made available to the
0:51
reducer directly, because that would increase the probability of network congestion. So the combiner works in between, and it
0:59
works as a mini-reducer. So, what is a combiner? The Hadoop mapper creates a huge
1:07
amount of intermediate data, and sending this data to the reducer creates
1:13
massive network congestion. Combiners are used to overcome this very problem
1:20
The combiner is also known as the mini-reducer, and it processes the intermediate data
1:27
coming from the mapper. Use of the combiner is optional in MapReduce
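As a rough sketch of why the combiner eases network load, the toy function below counts how many key-value records would leave the mappers with and without a combiner. This is plain Python, not Hadoop's API; the function name `records_sent` and the input lines are made up for illustration, and each line is assumed to go to its own mapper.

```python
from collections import Counter

def records_sent(lines, use_combiner):
    """Count the key-value records the mappers would send to the reducer."""
    total = 0
    for line in lines:  # assumption: one input line per mapper
        words = line.split()
        if use_combiner:
            # After local combining: one record per distinct word.
            total += len(Counter(words))
        else:
            # Without combining: one (word, 1) record per occurrence.
            total += len(words)
    return total

lines = ["a b a b a", "b b c b"]  # hypothetical input
print(records_sent(lines, use_combiner=False))  # 9
print(records_sent(lines, use_combiner=True))   # 4
```

The saving depends on how often keys repeat within one mapper's output; with no repeated keys the combiner would not reduce anything.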
1:33
So how does the combiner work? Let us go through an example. In this example, we can see there are
1:44
two mappers, and the main data is divided into two parts. Each mapper produces the intermediate key-value pairs. Let us show you the diagram first. Here we have this data, and the data has been divided between two different mappers
2:00
So this first line, "Hadoop is Hadoop is", is available to the mapper, and the mapper has made key-value pairs
2:08
where the key is each and every word, and the value is one
2:12
So, that is a key-value pair. So we get (Hadoop, 1), (is, 1), (Hadoop, 1), and (is, 1)
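The mapper step just described can be sketched in a few lines of plain Python. This is only an illustration, not Hadoop's Mapper API; `map_words` is a hypothetical name, and the input line is the one reconstructed from the spoken example.

```python
def map_words(line):
    """Emit one (word, 1) pair for each word in the line."""
    return [(word, 1) for word in line.split()]

# Assumed input for Mapper 1, taken from the spoken example.
mapper1_output = map_words("Hadoop is Hadoop is")
print(mapper1_output)
# [('Hadoop', 1), ('is', 1), ('Hadoop', 1), ('is', 1)]
```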
2:18
Similarly, in this case, Mapper 2 has this particular data: "sky is cloudy is cloudy"
2:26
So in this way, it also produces the respective key-value pairs
2:30
These are nothing but intermediate data. Now this data will be made available to the combiner
2:37
So this data does not go to the reducer directly, but through the combiners
2:43
So the combiner is also doing some reducing work, a preliminary reduction, here
2:48
So, as a result of that, it has Hadoop. You see, the same key appears twice
2:53
So the counts have been added: one and one, so it has become two. And "is" also appears twice with the same key
2:59
So these two key-value pairs will merge, and we will have the value two here
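The merging step can be sketched as a local sum over pairs that share a key. Again this is plain Python rather than Hadoop's API; `combine` is a hypothetical name, and the input is the Mapper 1 output from the example.

```python
from collections import defaultdict

def combine(pairs):
    """Sum the values of pairs that share a key, within one mapper's output."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return sorted(totals.items())

mapper1_output = [("Hadoop", 1), ("is", 1), ("Hadoop", 1), ("is", 1)]
combiner1_output = combine(mapper1_output)
print(combiner1_output)  # [('Hadoop', 2), ('is', 2)]
```

Four records go in and two come out, which is the reduction the combiner is there to provide.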
3:04
Similarly, the same operations are done in Combiner 2. The outputs of Combiner 1 and Combiner 2 will be made available to the reducer, and the reducer
3:14
will produce the final output, which will be stored on HDFS, that is, the Hadoop Distributed
3:19
File System. So the output of the mapper is partially reduced by the combiner, and in the
3:27
last stage all the output of the combiners is reduced again by the reducer, which produces the final
3:33
output, and that will be stored on HDFS. So in this discussion we have
3:40
shown how the combiner works in MapReduce. Thanks for watching this video
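To tie the whole example together, here is a minimal end-to-end sketch in plain Python. It simulates the flow only and does not use Hadoop or HDFS; the names `map_words`, `sum_by_key`, and `run_job` are hypothetical, and the two input lines are the ones reconstructed from the spoken example.

```python
from collections import defaultdict

def map_words(line):
    """Mapper: emit (word, 1) for each word."""
    return [(word, 1) for word in line.split()]

def sum_by_key(pairs):
    """Shared logic for combiner and reducer: sum values per key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)

def run_job(splits, use_combiner=True):
    """Run mapper -> (optional combiner) -> reducer over the input splits."""
    shuffled = []  # records that would cross the network to the reducer
    for line in splits:  # assumption: one split per mapper
        pairs = map_words(line)
        if use_combiner:
            pairs = list(sum_by_key(pairs).items())  # partial, per-mapper sums
        shuffled.extend(pairs)
    return sum_by_key(shuffled)  # the reducer produces the final counts

splits = ["Hadoop is Hadoop is", "sky is cloudy is cloudy"]
print(run_job(splits, use_combiner=True))
# {'Hadoop': 2, 'is': 4, 'sky': 1, 'cloudy': 2}
print(run_job(splits, use_combiner=True) == run_job(splits, use_combiner=False))
# True
```

The final counts are the same with or without the combiner because addition can be applied in stages without changing the total, which is why the combiner can stay optional for a job like word count.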