MapReduce and Design Patterns - Distinct Pattern Example
Oct 18, 2024
0:00
In this video we are discussing a distinct pattern example.
0:04
In the previous video we discussed what the distinct filtering pattern is.
0:08
So in the case of the distinct filtering pattern, we are not interested in having repetition of
0:13
the same key values; we need to get a set of unique values only.
0:17
So in this particular video we'll be showing you how to write the respective Java
0:21
code, how to compile it, how to execute it, and how to get the output.
0:27
So, the distinct pattern example. Here we will use an XML file, batches.xml, and we will find the distinct
0:36
batches of the users. So this is the purpose of the assignment in this particular video.
0:42
So now let us go to a practical demonstration for the easy implementation of this process.
0:49
So in this video we are discussing the filtering design pattern; under this category
0:54
we're going to have the distinct filter. So we have the batches.xml file under the input/batch folder, with a size of 21.48 MB.
1:06
It has many rows under the batches tag, and each and every row has
1:11
multiple attributes: the ID, the user ID, name, date, class, and also the tag based.
1:21
So these are the multiple attributes under each and every row.
1:25
We're supposed to find all the distinct batch names. So now let me go to
1:34
the Java code. So let me come to Eclipse. Yes, so we have the class
1:40
named distinct batch MR task, and here we have two inner classes: one is the mapper class and the other is the reducer class. Under this mapper class we have the map method, and also
1:55
we have this batches title, which is a Text. Under the map method we have
2:00
this XML parsed, which is the hash map object getting initialized by the XML-to-map method.
2:07
You can see that this is the corresponding method, which will take the XML as input and
2:12
return the hash map object as output, and that will initialize our XML parsed.
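The helper described here, which turns XML into a hash map, can be sketched roughly as follows. This is only an illustration in plain Java, not the code shown in the video: the name `xmlToMap`, the class name `XmlRowParser`, the attribute spellings, and the sample row are all assumptions, and the sketch assumes each input record is a single row element whose attribute values contain no escaped quotes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class XmlRowParser {

    // Matches attribute="value" pairs inside a single XML element.
    private static final Pattern ATTRIBUTE = Pattern.compile("(\\w+)=\"([^\"]*)\"");

    // Hypothetical stand-in for the helper described in the video:
    // takes one XML row as text and returns its attributes as a map.
    public static Map<String, String> xmlToMap(String xmlRow) {
        Map<String, String> attributes = new HashMap<>();
        Matcher matcher = ATTRIBUTE.matcher(xmlRow);
        while (matcher.find()) {
            attributes.put(matcher.group(1), matcher.group(2));
        }
        return attributes;
    }

    public static void main(String[] args) {
        // Made-up row; the real attribute names and values in batches.xml may differ.
        String row = "<row Id=\"1\" UserId=\"5\" Name=\"Teacher\" Date=\"2010-07-19\" Class=\"3\" />";
        Map<String, String> parsed = xmlToMap(row);
        System.out.println(parsed.get("Name"));    // value of the Name attribute
        System.out.println(parsed.get("Missing")); // null when the attribute is absent
    }
}
```

A regular expression keeps the sketch short; a real job might prefer a proper XML parser if rows can contain escaped characters.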
2:17
So string title is equal to XML parsed dot get name; name is one of the
2:22
attributes there, so we find the name of the batches. So if title is equal to
2:28
null, then return; we need not execute this part of the code. Otherwise,
2:32
batches title dot set title, so you are just setting that on this
2:37
batches title. So context dot write; ctx is the context object.
2:42
Write, that is, batches title comma null writable dot get. So in this way the
2:48
corresponding batches title will be written on to the context. This is
2:52
my distinct reducer, which extends Reducer. So within this reduce method we
2:58
have this context dot write of key and null writable dot get, so the key-value pair
3:04
will be written there. Now let us discuss this main function. Within this
3:09
main function, we know that the usage will be distinct batch MR task, that is the class name, then
3:17
input path, then output path, so two arguments are to be passed. If that is not
3:22
so, it will exit. We have defined one job instance; job is the object
3:29
name, from get instance. The name of the job is find the title of the distinct batches. We are setting the mapper class and the reducer class, whatever we have defined as the inner classes there.
3:40
Then the output key and also the output value. The output key will be of the type Text
3:44
class; the output value will be the null writable class. From argument 0 and argument 1, we're just initializing the add input path and set output path, which will be passed as command-line arguments.
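Setting the cluster aside, the mapper and reducer logic described above can be imitated in plain Java to see why the output contains each batch name only once. This is a sketch under stated assumptions, not the Hadoop code from the video: the class `DistinctPatternSketch`, its method names, and the sample batch names are made up for illustration, and a sorted map stands in for the grouping that the framework performs between the map and reduce steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DistinctPatternSketch {

    // Map step: emit the batch name as the key; a row without a name emits nothing.
    public static void map(String name, List<String> emittedKeys) {
        if (name == null) {
            return;
        }
        emittedKeys.add(name);
    }

    // Shuffle step: the framework groups identical keys between map and reduce.
    // The stored count is only a placeholder, since the values carry no data.
    public static Map<String, Integer> shuffle(List<String> emittedKeys) {
        Map<String, Integer> grouped = new TreeMap<>();
        for (String key : emittedKeys) {
            grouped.merge(key, 1, Integer::sum);
        }
        return grouped;
    }

    // Reduce step: runs once per distinct key and writes only the key.
    public static List<String> reduce(Map<String, Integer> grouped) {
        return new ArrayList<>(grouped.keySet());
    }

    // Chains the three steps for a list of batch names.
    public static List<String> distinct(List<String> names) {
        List<String> emittedKeys = new ArrayList<>();
        for (String name : names) {
            map(name, emittedKeys);
        }
        return reduce(shuffle(emittedKeys));
    }

    public static void main(String[] args) {
        // Made-up batch names; null stands for a row with no name attribute.
        List<String> names = Arrays.asList("Teacher", "Student", null, "Teacher", "Editor");
        System.out.println(distinct(names)); // prints [Editor, Student, Teacher]
    }
}
```

The deduplication comes entirely from the grouping by key: the reduce step never inspects the values, which is why an empty placeholder value is enough for this pattern.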
3:57
Depending upon the code value, 0 or 1, we can decide whether successful
4:02
completion took place or not, and that value will be returned here. Now the
4:07
next step to be followed here is to create the respective jar file. So
4:13
before going for the execution, we'll be creating the respective jar file, and for that
4:17
purpose we go to the respective project name, then right button
4:22
click, then export, and then jar. We're supposed to give the jar path
4:27
and the jar file name properly, and then next and finish. But we have already created the jar file.
4:33
So we are not going to execute the process here again: just right button click, then export, then jar file, and the rest of the steps.
4:42
Let me execute our command here. So let me clear the console.
4:48
We're going to write the command for execution: hadoop, then jar. Then we are going to give the jar file path
4:57
here, that was so slash, then we go for map reduce underscore design pattern, then
5:03
slash jar file, so this is the respective path, and filtering pattern dot jar is the jar file
5:09
name. Then distinct filter is the respective package name, and the class name is our distinct batch MR task. Then
5:27
you shall give the input path name, that is slash input slash batch, then
5:33
the output path name, that is slash output. So in this way the command will be
5:37
executed. We need to find, that is, to filter out all the
5:46
distinct batch names, so that is our purpose. In the batches.xml we had one attribute,
5:52
name. So you see the command has executed successfully. Let me see
5:58
which part file has been created. So we shall go to the root of the name
6:05
node, then the output folder; we have this part file created, containing the
6:08
contents. So let me go for the command to see the contents here: hdfs
6:16
dfs minus cat, then slash output slash part star. So in this way the part file content
6:24
is being displayed. You can see that these are the distinct batch names. See, these are
6:29
the distinct batch names we have obtained in the output file. These are the
6:35
distinct batch names. You can easily see that, I mean, so many distinct batch names
6:40
are there; they have been selected in the output file, here in the part file under the output folder. So now it is good practice to delete
6:51
the output folder so that we can execute the next MapReduce task. So we are going to
6:57
delete that one. I hope that you have got the concept of how to execute
7:01
this program using the proper steps, as we have displayed. Thanks for watching.