HBase Architecture
Oct 24, 2024
HBase Architecture. Lecture by Mr. Arnab Chakraborty, Tutorials Point India Private Limited. Watch more videos at https://www.tutorialspoint.com/videotutorials/index.htm
0:00
In this video, we are discussing HBase architecture.
0:05
HBase architecture has three main components. The first is the HMaster, which handles load balancing.
0:13
Next is the region server, which is mainly responsible for read and write operations.
0:19
And the last is ZooKeeper, which is responsible for distributed coordination and synchronization.
0:26
So, let us start the discussion of HBase architecture. HBase has three main parts: the HMaster, the region servers, and ZooKeeper.
0:38
First, we deal with the HMaster. The HMaster in HBase is a process that assigns regions to region servers.
0:47
A region is a horizontal slice of an HBase table: a contiguous range of rows, sorted by row key. For those rows, it holds the data of all the table's column families.
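To make the idea of a region concrete, here is a minimal sketch of how a row key could be mapped to the region that owns it, assuming each region covers a contiguous range of row keys. The split points and region names are made up for illustration, and this is not HBase client code.

```python
import bisect

# Hypothetical split points: each region starts at one of these row keys.
# The first region starts at the empty key, so every key falls in some region.
REGION_START_KEYS = ["", "g", "n", "t"]
REGION_NAMES = ["region-1", "region-2", "region-3", "region-4"]


def find_region(row_key: str) -> str:
    """Return the region whose [start, next_start) range contains row_key."""
    # bisect_right counts the start keys that are <= row_key;
    # the owning region is the last of those.
    index = bisect.bisect_right(REGION_START_KEYS, row_key) - 1
    return REGION_NAMES[index]


if __name__ == "__main__":
    for key in ["apple", "grape", "zebra"]:
        print(key, "->", find_region(key))
```

Because the start keys are kept sorted, the lookup is a binary search rather than a scan over all regions.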
0:55
It balances the load by assigning regions across the region servers, so load balancing is done by the HMaster. The HMaster manages the HBase cluster, and it handles creating, modifying, and deleting tables in the database. So for the creation, modification, and deletion of tables, the HMaster
1:17
is responsible. It also takes care of the tasks that arise when a client wants to
1:24
change the schema or the metadata. So whenever you want to make changes to the metadata
1:30
or the schema, the HMaster performs that task as well.
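The load-balancing role described here can be sketched with a very simple policy: give each region to whichever server currently holds the fewest regions. This is an illustrative assumption, not HBase's actual balancer, which weighs more factors. The function and server names are hypothetical.

```python
def assign_regions(regions, servers):
    """Assign each region to the server that currently holds the fewest regions."""
    assignment = {server: [] for server in servers}
    for region in regions:
        # On a tie, min() picks the server that appears first in the list.
        target = min(servers, key=lambda s: len(assignment[s]))
        assignment[target].append(region)
    return assignment


if __name__ == "__main__":
    plan = assign_regions(["r1", "r2", "r3", "r4", "r5"], ["rs1", "rs2"])
    for server, held in plan.items():
        print(server, held)
```

With this policy the region counts on any two servers never differ by more than one.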
1:37
Next, we come to the region server. The region servers are the main worker nodes, and they handle read, write, and modify
1:47
requests from clients. So when requests for read, write, and
1:53
modification operations come in from a client, the region servers are responsible for executing them.
1:59
A region server runs on each worker node in the Hadoop cluster. So on every such node, a region server process is running.
2:09
It has a read cache called the block cache. Data that has been read is stored there, and when the cache becomes full, the least recently used data will
2:23
be removed from the cache, because this cache has a fixed size. The name of the cache is the
2:28
block cache. So when the cache becomes full of frequently used data,
2:34
the least recently used data is evicted from it. There is another cache, which is
2:41
known as the MemStore. It is the write cache, and it stores new data that has not yet been written
2:48
to disk. So data that is new but has not yet been written to disk
2:54
is kept in this cache, and its name is the MemStore. And each column
3:00
family has its own write cache. Then there is the actual storage file, called the HFile. The actual
3:08
storage file in HBase is known as the HFile, and it stores the actual data
3:15
on disk. In the diagram you can see the region server: it holds the regions, the block cache, the MemStore, the HFiles, and the index used for efficient lookups. The index is loaded when the HFile is opened, and the HFile index is kept in the block cache,
3:33
that is, in memory, so a lookup can be performed with a single disk seek.
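The pieces just described (write cache, storage file with an index, and a fixed-size read cache) can be put together in a small toy model. This is only a sketch under simplifying assumptions: a single column family, tiny two-row blocks, and a counter standing in for disk seeks. The class names `HFile` and `RegionStore` are illustrative and are not HBase's real classes.

```python
import bisect
from collections import OrderedDict


class HFile:
    """Immutable, sorted key/value file split into small blocks."""

    def __init__(self, name, items, block_size=2):
        rows = sorted(items)
        self.name = name
        self.blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
        # Index kept in memory: the first key of every block.
        self.index = [block[0][0] for block in self.blocks]

    def block_for(self, key):
        """Use the index to pick the one block that could contain key."""
        return max(bisect.bisect_right(self.index, key) - 1, 0)


class RegionStore:
    """Toy store for one column family of one region."""

    def __init__(self, flush_threshold=4, cache_capacity=2):
        self.memstore = {}                # write cache: data not yet on disk
        self.hfiles = []                  # flushed files, newest last
        self.block_cache = OrderedDict()  # read cache with LRU eviction
        self.flush_threshold = flush_threshold
        self.cache_capacity = cache_capacity
        self.disk_reads = 0               # stands in for disk seeks

    def put(self, key, value):
        self.memstore[key] = value
        if len(self.memstore) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Write the MemStore contents out as a new sorted file."""
        if not self.memstore:
            return
        name = f"hfile-{len(self.hfiles)}"
        self.hfiles.append(HFile(name, self.memstore.items()))
        self.memstore = {}

    def _read_block(self, hfile, block_no):
        cache_key = (hfile.name, block_no)
        if cache_key in self.block_cache:
            self.block_cache.move_to_end(cache_key)   # mark as recently used
        else:
            self.disk_reads += 1                      # one simulated seek
            self.block_cache[cache_key] = hfile.blocks[block_no]
            if len(self.block_cache) > self.cache_capacity:
                self.block_cache.popitem(last=False)  # evict least recently used
        return self.block_cache[cache_key]

    def get(self, key):
        if key in self.memstore:                      # newest data first
            return self.memstore[key]
        for hfile in reversed(self.hfiles):           # then newest file first
            for k, v in self._read_block(hfile, hfile.block_for(key)):
                if k == key:
                    return v
        return None
```

In this model a read that misses the MemStore costs at most one simulated seek per file, because the in-memory index already says which block to fetch, and repeated reads of the same block cost nothing while it stays in the cache.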
3:39
And that is why the operation is faster in this case. Now let us move on to the last component, ZooKeeper, which is responsible for distributed
3:48
coordination and synchronization. So, within the HBase architecture, the last component is
3:56
ZooKeeper. It is an open-source server that enables reliable distributed coordination,
4:03
and ZooKeeper is a centralized service that maintains configuration information. So the configuration information is kept centrally in ZooKeeper, and it also
4:14
provides distributed synchronization and so on. The ZooKeeper service keeps track of all the region servers available in HBase.
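One way to picture how a coordination service can keep track of live servers is with ephemeral entries: each server registers an entry tied to its session, and the entry disappears when the session ends. The sketch below is a toy in-memory stand-in, not the ZooKeeper API. The class name, method names, and paths are all hypothetical.

```python
class ToyCoordinator:
    """In-memory stand-in for a coordination service with ephemeral entries."""

    def __init__(self):
        self._ephemeral = {}   # path -> id of the session that owns it
        self._watchers = []    # callbacks fired when membership changes

    def watch(self, callback):
        """Ask to be told the list of live entries whenever it changes."""
        self._watchers.append(callback)

    def register(self, session_id, path):
        """Create an entry that lives only as long as its session."""
        self._ephemeral[path] = session_id
        self._notify()

    def expire_session(self, session_id):
        """Simulate a server crash: all entries of that session vanish."""
        dead = [p for p, s in self._ephemeral.items() if s == session_id]
        for path in dead:
            del self._ephemeral[path]
        if dead:
            self._notify()

    def children(self):
        return sorted(self._ephemeral)

    def _notify(self):
        for callback in self._watchers:
            callback(self.children())


if __name__ == "__main__":
    zk = ToyCoordinator()
    zk.watch(lambda live: print("live region servers:", live))
    zk.register("session-1", "/rs/server-a")
    zk.register("session-2", "/rs/server-b")
    zk.expire_session("session-1")
```

A master process that registers a watcher like this learns about a failed region server without polling each server itself.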
4:24
So, these are the main components of the HBase architecture, and we have discussed them with the help of some diagrams.
4:31
Thanks for watching this video