Shuffle & Sorting of MapReduce Task
Nov 18, 2024
0:00
Shuffling and Sorting of MapReduce Task
0:03
Between the mapper and the reducer, the shuffling and sorting tasks are performed.
0:10
But if there is no reducer, that is, if the mapper output is the final output, then there is no need for any
0:16
shuffling or sorting tasks. So let us discuss this in more detail.
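The map-only case can be sketched in a few lines of Python. This is a single-process illustration, not Hadoop code: `run_job` and `word_mapper` are hypothetical names made up for this example, and passing `reducer=None` stands in for configuring a job with zero reduce tasks.

```python
from collections import defaultdict

def run_job(records, mapper, reducer=None):
    """Hypothetical single-process stand-in for a MapReduce job."""
    # Map phase: each input record yields zero or more (key, value) pairs.
    mapped = [pair for record in records for pair in mapper(record)]
    if reducer is None:
        # Map-only job: no shuffle, no sort; the mapper output is final.
        return mapped
    # Shuffle and sort (simplified): group values by key,
    # then hand the keys to the reducer in sorted order.
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)
    return [(key, reducer(key, groups[key])) for key in sorted(groups)]

def word_mapper(line):
    # Emit (word, 1) for every word in the line.
    for word in line.split():
        yield (word, 1)

lines = ["b a", "a c"]
map_only = run_job(lines, word_mapper)
with_reduce = run_job(lines, word_mapper, lambda key, values: sum(values))
```

In the map-only run the pairs come back in the order the mapper emitted them, unsorted and ungrouped. Only the run with a reducer pays for the grouping and sorting step.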
0:23
So what is shuffling? We start with shuffling first. The mapper creates the intermediate key-value pairs,
0:29
and these are transferred to the reducer tasks. This procedure is known as shuffling.
0:35
So, during shuffling, some reordering of these key-value pairs takes place.
0:40
The mapper takes key-value pairs as input and does some processing on them. The developer provides the business logic in the mapper to do the intended processing.
0:51
The mapper then produces its output in the form of key-value pairs as well,
0:55
and this output is known as the intermediate result. It is stored on the local disk,
1:01
not on HDFS. Shuffling is the process by which these key-value pairs go
1:10
to the reducer in a different order. So, using the shuffling procedure, the system can sort the data by key.
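As a rough illustration of the reordering described above, here is a minimal Python sketch of a shuffle. It assumes each key is routed to a reducer by a hash of the key and that each reducer's input is then sorted by key. The names `partition_for` and `shuffle` are hypothetical, and the CRC32-based partitioner is only a deterministic stand-in chosen for this example. A real framework does this across machines, with partial sorting on the map side and merging on the reduce side.

```python
import zlib

def partition_for(key, num_reducers):
    # Deterministic stand-in for a hash partitioner
    # (Python's built-in hash() of a str can vary between runs).
    return zlib.crc32(key.encode("utf-8")) % num_reducers

def shuffle(mapper_outputs, num_reducers):
    # One list of (key, value) pairs per reducer.
    partitions = [[] for _ in range(num_reducers)]
    for output in mapper_outputs:
        for key, value in output:
            # Every pair with the same key lands in the same partition.
            partitions[partition_for(key, num_reducers)].append((key, value))
    for partition in partitions:
        # Sort each reducer's input by key only.
        partition.sort(key=lambda pair: pair[0])
    return partitions

# Output of two mappers, in the order each mapper emitted it.
mapper_outputs = [
    [("pear", 1), ("apple", 1)],
    [("apple", 1), ("fig", 1), ("pear", 1)],
]
reducer_inputs = shuffle(mapper_outputs, num_reducers=2)
```

Whichever partition a key ends up in, all pairs for that key arrive at the same reducer, and each reducer sees its keys in sorted order.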
1:19
So this shuffling takes place depending on the values of the keys; that is, the key-value pairs are reordered for the sorting operation. The shuffling task begins when some of the map tasks are done. So the shuffling task begins
1:34
when some of the mappers have produced output. It does not wait for
1:39
all the mappers to complete, which makes the process faster. As a result, when some of the
1:45
mappers have completed their operations and their outputs are available, then
1:49
the shuffling operation starts working on them. So this is the faster approach, and it does
1:54
not wait for all the map tasks to complete. Next, we come to what is
2:03
sorting. The MapReduce framework automatically sorts the mapper output
2:09
by key. So before the data is sent to the reducer, all the key-value pairs are
2:15
sorted by key. As a result, the reducer takes
2:19
less time to do the reduce operation. With sorted key-value pairs, the reducer can easily tell when a new reduce
2:26
call should begin: when the next key differs from the previous one. And if the user sets no reducer task,
2:32
that is, if there are zero reducers, then the shuffling and sorting phases do not take place. If there
2:39
is no reducer, there is no need for any shuffling or sorting. The job ends after the
2:45
map tasks. So, in this discussion, we have covered
2:49
what shuffling and sorting are in MapReduce. Thanks for watching this video.
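The point about sorted input helping the reducer can be sketched as follows. This is a minimal Python illustration, not framework code: `reduce_sorted` is a hypothetical helper, and `itertools.groupby` is used here because it starts a new group exactly when the key changes from one pair to the next, which is the same boundary test a reducer can apply to sorted input.

```python
from itertools import groupby
from operator import itemgetter

def reduce_sorted(pairs, reducer):
    # Walk the pairs once; a new group (a new reduce call) starts
    # whenever the key differs from the key of the previous pair.
    results = []
    for key, group in groupby(pairs, key=itemgetter(0)):
        values = [value for _, value in group]
        results.append((key, reducer(key, values)))
    return results

def count(key, values):
    # Example reducer: total of the values seen for one key.
    return sum(values)

pairs = [("fig", 1), ("apple", 1), ("pear", 1), ("apple", 1)]

# Sorted input: each key forms one contiguous run, so one reduce call per key.
totals = reduce_sorted(sorted(pairs, key=itemgetter(0)), count)

# Unsorted input: "apple" appears in two separate runs and is reduced twice.
unsorted_totals = reduce_sorted(pairs, count)
```

With sorted input the single pass is enough and the reducer never has to keep more than one key's values at a time. With unsorted input the same pass splits "apple" into two partial results.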