Hadoop Architecture
Nov 18, 2024
0:00
In this video, we are going to discuss Hadoop architecture
0:04
And in this discussion, we'll be going over the different components in the Hadoop architecture
0:08
and we shall identify them and discuss them in more detail
0:13
Architecture of Apache Hadoop. The Apache Hadoop versions follow the architecture shown in this diagram
0:20
to process large data sets across clusters of computers. And here you can find we have four different modules in our
0:30
Hadoop architecture. And these four modules are: Hadoop Common, which has been depicted as common utilities, Hadoop YARN
0:38
Hadoop Distributed File System, which in short can be called HDFS in abbreviated form, and Hadoop MapReduce
0:45
Since 2012, the term Hadoop often refers not just to the base modules mentioned above, that means, in this particular diagram
0:54
but also to the collection of additional software packages which can be installed on top of
1:00
or along with Hadoop. This installed software can be Apache Pig, Apache Hive, Apache HBase, Apache Spark, etc.
1:10
So let us go through these modules individually, one after another. So how many modules are there?
1:16
We have four modules: MapReduce, HDFS, the YARN framework, and Hadoop Common. So at first we're starting with Hadoop Common, that is, the common utilities
1:28
So these are the Java libraries and utilities which are needed by the other Hadoop modules as well
1:34
And these libraries provide the filesystem and OS-level abstractions and contain the necessary Java files and scripts required to start Hadoop. So, from the term common utilities, it is quite obvious that some modules and libraries will be there which will be accessed by the different other modules of this
1:55
Hadoop system. Let us go for Hadoop YARN. So what is the full form of YARN? Y-A-R-N, that is, Yet Another Resource Negotiator
2:04
So this YARN is a framework for job scheduling and mainly for cluster
2:09
resource management.
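As a small illustration of YARN's role, here is a minimal Java sketch, assuming a Hadoop client setup with the cluster's yarn-site.xml on the classpath, that uses the public YarnClient API to ask the ResourceManager which applications it is currently tracking:

import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.yarn.api.records.ApplicationReport;
import org.apache.hadoop.yarn.client.api.YarnClient;
import org.apache.hadoop.yarn.conf.YarnConfiguration;

public class YarnAppLister {
    public static void main(String[] args) throws Exception {
        // Picks up cluster settings (yarn-site.xml) from the classpath
        Configuration conf = new YarnConfiguration();

        // YarnClient is the public API for talking to the ResourceManager
        YarnClient yarnClient = YarnClient.createYarnClient();
        yarnClient.init(conf);
        yarnClient.start();

        // Ask the ResourceManager for the applications it knows about
        List<ApplicationReport> apps = yarnClient.getApplications();
        for (ApplicationReport app : apps) {
            System.out.println(app.getApplicationId() + "\t"
                    + app.getName() + "\t"
                    + app.getYarnApplicationState());
        }

        yarnClient.stop();
    }
}

The point of the sketch is that YARN, not MapReduce itself, owns the cluster-wide view of resources and running applications; MapReduce is just one kind of application that YARN schedules.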
2:15
So now let us go for the next topic, that is our HDFS, the Hadoop Distributed File System. So HDFS is a distributed file
2:20
system that provides high-throughput access to application data.
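To see that "distributed file system" in practice, here is a minimal Java sketch, assuming a reachable cluster whose core-site.xml sets fs.defaultFS (the /tmp/hello.txt path is just a hypothetical example). Behind this single FileSystem abstraction, HDFS splits large files into blocks and replicates them across DataNodes:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteRead {
    public static void main(String[] args) throws Exception {
        // Reads fs.defaultFS (e.g. hdfs://namenode:8020) from core-site.xml
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Path file = new Path("/tmp/hello.txt"); // hypothetical example path

        // Write a small file; block placement and replication
        // happen behind this API on the DataNodes.
        try (FSDataOutputStream out = fs.create(file, true)) {
            out.writeUTF("Hello, HDFS!");
        }

        // Read it back through the same FileSystem abstraction
        try (FSDataInputStream in = fs.open(file)) {
            System.out.println(in.readUTF());
        }
    }
}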
2:26
Next, we're going for this Hadoop MapReduce. This Hadoop MapReduce is a very important
2:30
module here, and this is a YARN-based system for parallel processing of large
2:35
data sets. So what is this MapReduce actually doing? The MapReduce is one of the main components of the Hadoop ecosystem, as we can find that
2:44
here we have mentioned this module at the top, and this MapReduce is designed to process
2:50
a large amount of data in parallel by dividing the work into smaller pieces, that is, into smaller
2:57
independent tasks. The whole job is taken from the user and divided into smaller tasks in our MapReduce, and
3:05
these are assigned to the worker nodes. The MapReduce programs take their input as a list and produce their output as a list as well
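To make the list-in, list-out idea concrete, here is the classic word-count program in Java, essentially the standard Apache Hadoop tutorial example: the map step turns each independent input split into a list of (word, 1) pairs, and the reduce step sums the list of counts that the framework has grouped under each word.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

    // Map step: each worker turns its slice of the input
    // into a list of (word, 1) pairs.
    public static class TokenizerMapper
            extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce step: the framework groups the pairs by word, and
    // each reducer sums the counts for the words assigned to it.
    public static class IntSumReducer
            extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get();
            }
            context.write(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}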
3:16
That is the purpose of this MapReduce. In our discussion, we have covered the four important modules in the Apache
3:24
Hadoop architecture and how they function. Thanks for watching this video.
#Cloud Storage
#Computer Science
#Distributed & Cloud Computing
#Education
#File Sharing & Hosting
#Networking
#Programming
#Software