What is Hadoop?
0:00
what is Hadoop so you know that nowadays
0:03
we're dealing with huge amounts of data on
0:05
cloud and that's why we are getting a
0:08
new problem that is Big Data and
0:11
Hadoop is a solution to that problem
0:13
so Hadoop is nothing but one solution
0:17
towards this big data problem and we'll
0:19
be discussing what Hadoop is in
0:21
further detail so let us go for
0:23
further details we are going to discuss
0:25
what is Apache Hadoop so Apache Hadoop
0:29
is nothing but an open source framework
0:31
and Hadoop can easily handle large
0:34
amount of data on a low cost simple
0:38
hardware cluster so cluster means it is
0:41
collection of multiple computers and
0:43
low-cost computers means here we are
0:45
using the commodity nodes or commodity
0:47
servers that means the hardware cost is
0:49
very low and that's why you are going to
0:52
give a low cost solution and this Hadoop
0:55
is actually providing the solution for
0:57
the big data and this Hadoop is scalable
1:00
if you need to reduce the workload or
1:03
divide the workload across multiple servers
1:06
then the workload can be divided through
1:09
horizontal scaling so Hadoop can easily
1:12
handle the large amount of data on a low
1:14
cost simple hardware cluster and Hadoop
1:17
is also a scalable and fault tolerant
1:20
framework fault tolerant means if one
1:23
computer goes down if one server goes
1:25
down then another server will come into
1:28
play and from the user's point of
1:30
view they will not feel any kind of
1:32
failure, there is failure
1:34
transparency. Hadoop is not only a
1:38
storage system data can be processed
1:40
using this respective framework so data
1:42
will be processed analysis can be done
1:45
here so the Hadoop system is basically
1:47
written in the Java language. Hadoop is an open
1:53
source tool from the Apache Software
1:55
Foundation and as an open source
1:58
project we can even change the source
2:00
code of the Hadoop system so as it is
2:03
open source also we can change the code
2:05
and most of the code is written in
2:06
the Java language. Most of the Hadoop code
2:09
was written by Yahoo, IBM, Cloudera,
2:13
et cetera, so these companies have contributed
2:15
a lot of code to Hadoop as open source. Hadoop
2:18
provides parallel processing through
2:20
different commodity hardware
2:22
simultaneously so commodity hardware means
2:24
cheap hardware with the help of
2:26
which we can have parallel
2:27
processing so that data processing and
2:29
analysis will be done in a much faster
2:32
way as it works on commodity hardware so
2:35
the cost is very low and the commodity
2:37
hardware is low end and very cheap
2:39
hardware so the Hadoop solution is also
2:42
economical and cheap. Why should we use
2:46
Hadoop so it is a very important and
2:48
common question so the Hadoop solution
2:51
is very popular and it has captured at
2:54
least 90% of the big data market Hadoop
2:58
has some unique features that make
3:00
the solution very popular and
3:03
Hadoop is scalable as we have discussed
3:05
earlier to do the load sharing we can go
3:08
on adding other commodity
3:10
servers so that's why Hadoop is
3:12
scalable in that case so we can increase
3:15
the number of commodity hardware easily
3:17
and it is a fault tolerant solution
3:20
also we have discussed this one if one
3:22
node goes down then another node will
3:24
come into the system and when one node
3:26
goes down other nodes can process the
3:29
respective data. Data can be stored in
3:32
structured, unstructured and
3:34
semi-structured form so it is more
3:37
flexible in operation we know that in
3:40
case of structured data we can represent
3:41
our data in the rows and columns so
3:44
a database is a good example of
3:46
structured data and the sources might be
3:48
say we are having the web logs
3:50
we're having the machine generated data
3:52
we're having the sensor data so that is
3:55
the source of the structured data we're
3:57
having the unstructured data that means
3:58
the data is not structured as an example
4:00
you can consider a text file, a
4:03
PDF, video, images, images sent by the
4:06
respective satellites and different
4:09
machine generated data can be the
4:11
unstructured data and semi-structured
4:13
data is something like to some
4:15
extent structured and to some extent
4:17
unstructured as an example we can go for
4:20
XML files
4:21
JSON files and so on so all this type of
4:24
data can be stored in the
4:27
Hadoop system so in this particular video we
4:29
have discussed what Hadoop is and what
4:31
are the different features thanks for
4:33
watching this video
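
To make the ideas of parallel processing and fault tolerance from the video a little more concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any Hadoop API. The node names, the `failed_nodes` set, the block size, and all function names are hypothetical and exist only for illustration. The data is split into blocks, each block is handed to a simulated node, and if that node is down the block is simply retried on another node, so the caller still gets the same result.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_blocks(records, block_size):
    """Divide the data set into fixed-size blocks, one unit of work each."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def process_block(block, node, failed_nodes):
    """Count words in one block; raise if the simulated node is down."""
    if node in failed_nodes:
        raise RuntimeError(f"{node} is down")
    return sum(len(line.split()) for line in block)


def run_on_cluster(records, nodes, failed_nodes=frozenset(), block_size=2):
    """Spread blocks across nodes and retry a block elsewhere if its node fails."""
    blocks = split_into_blocks(records, block_size)

    def run_block(index):
        # Try the preferred node first, then fall back to the others in order.
        for offset in range(len(nodes)):
            node = nodes[(index + offset) % len(nodes)]
            try:
                return process_block(blocks[index], node, failed_nodes)
            except RuntimeError:
                continue
        raise RuntimeError(f"no healthy node left for block {index}")

    # One worker thread per simulated node, so blocks are processed concurrently.
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partial_counts = list(pool.map(run_block, range(len(blocks))))
    return sum(partial_counts)


# Hypothetical sample data and node names (assumptions, not from the video).
lines = [
    "big data on cheap hardware",
    "hadoop is scalable",
    "hadoop is fault tolerant",
    "data can be processed",
    "thanks for watching",
]
nodes = ["node-1", "node-2", "node-3"]

total_healthy = run_on_cluster(lines, nodes)
total_with_failure = run_on_cluster(lines, nodes, failed_nodes={"node-2"})
```

In this sketch the total word count is the same whether or not `node-2` is marked as failed, which is the failure transparency idea described above. Adding more names to `nodes` is the toy equivalent of adding more commodity servers.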

