MapReduce and Design Patterns - Hadoop Word Count Example
15K views
Oct 18, 2024
0:00
In this video we are discussing the Hadoop word count example. This is a very fundamental and basic example in the Hadoop environment.
0:10
Now, what is the word count example? The word count task in Hadoop is used to count the number of times each word occurs in one or more input data set files.
0:23
The data set files will contain many different words,
0:30
whether there is one data set file or several. From there, this task will take each and every word, and ultimately
0:38
it will produce the word, a comma, and the number of times the word has occurred in the same
0:44
file or across multiple data files. That is known as the word count example.
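To make the target of the task concrete, here is a minimal plain-Python sketch of the result word count is expected to produce over several inputs. It does not use Hadoop at all. The function name `count_words`, the two in-memory "files", and the choice of whitespace splitting with lowercasing are assumptions made for illustration, not something the video specifies.

```python
from collections import Counter


def count_words(documents):
    """Count how often each word occurs across one or more text inputs."""
    totals = Counter()
    for text in documents:
        # Assumption: words are separated by whitespace and compared
        # case-insensitively.
        totals.update(text.lower().split())
    return dict(totals)


# Two hypothetical input "files", represented here as in-memory strings.
files = ["a duck is a bird", "a bird can fly"]
word_counts = count_words(files)

# Print each result in the "word,count" form described in the video.
for word, count in sorted(word_counts.items()):
    print(f"{word},{count}")
```

A single-process counter like this only describes the desired output. The point of the Hadoop version is to reach the same result when the input is split across many machines.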
0:49
So let there be a file containing the sentence "a duck is a bird".
0:56
This is the content of one of the files. After performing the word count task, it will generate key-value pairs like this.
1:04
"a" has occurred two times. You can see that in "a duck is a bird",
1:09
"a" has occurred two times. "duck" has occurred one time,
1:14
"is" has occurred one time, and "bird" has occurred a single time. So "a", "duck", "is", and "bird" are the keys, and 2, 1, 1, and 1 are the values. The outcome will be in the form of key-value pairs. That is the basic concept
1:32
of the word count example in Hadoop. There are two different parts in this task: the mapper task
1:42
and the reducer task. After performing the mapper task, it generates key-value pairs. So the outcome of the mapper task will be something like this.
1:52
That means it will go through each and every word, and it will produce a value against
1:57
that key: the word will be the key, and the value will be one. So the mapper will produce an output
2:02
that is a key-value pair for each and every word; it will not check whether the word
2:08
has been repeated or not. It will produce the word as the key, then a comma, then one as the value.
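The mapper behaviour described here can be sketched in a few lines of plain Python. This is an illustration of the idea, not Hadoop's Mapper API. The function name `map_words` and the whitespace tokenization with lowercasing are assumptions.

```python
def map_words(line):
    """Emit a (word, 1) pair for every word in the line, repeats included."""
    # Assumption: whitespace tokenization and lowercasing; the video does
    # not specify how words are split.
    return [(word, 1) for word in line.lower().split()]


mapped = map_words("a duck is a bird")
print(mapped)
# [('a', 1), ('duck', 1), ('is', 1), ('a', 1), ('bird', 1)]
```

Note that "a" appears twice in the output, each time with the value 1. The mapper does no counting of its own; combining repeated keys is left to the reducer.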
2:13
So, in the reducer phase, the reducer takes each key and counts the number of values under
2:19
that key. Whenever we have the same key, the reducer will just go on adding the
2:26
respective counts. Finally, it generates one key-value pair per key, and from those pairs we can get
2:33
the number of occurrences of each distinct word in the given input. So that is our word count example in Hadoop.
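The reducer step can be sketched the same way in plain Python. This is a simulation of the idea rather than Hadoop's Reducer API. The names `group_by_key` and `reduce_counts` are hypothetical, and the hard-coded `mapper_output` list stands in for what a mapper would emit for "a duck is a bird". The grouping function mimics the step in which the framework collects all values for the same key before the reducer runs.

```python
from collections import defaultdict


def group_by_key(pairs):
    """Collect all values that share a key, as happens before the reduce step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Add up the counts collected under one key."""
    return key, sum(values)


# Stand-in for the mapper output of the sentence "a duck is a bird".
mapper_output = [("a", 1), ("duck", 1), ("is", 1), ("a", 1), ("bird", 1)]

reduced = dict(
    reduce_counts(key, values)
    for key, values in group_by_key(mapper_output).items()
)
print(reduced)
# {'a': 2, 'duck': 1, 'is': 1, 'bird': 1}
```

Summing the values, rather than just counting how many there are, keeps the reducer correct even if some upstream step has already merged a few of the ones into partial counts.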
2:42
Thanks for watching this video.
#Computer Science
#Programming