Spark Map and FlatMap
Nov 28, 2024
0:00
In this video, we are discussing Spark map and flatMap.
0:04
We know that on a Spark RDD we can perform two kinds of operations: transformations and actions.
0:12
Both map and flatMap fall under the category of transformations.
0:19
A transformation takes one RDD as input and can produce one or more RDDs as output,
0:27
and that is the basic principle of transformations. So now let us concentrate on the map and flatMap operations on an RDD.
0:37
So what is map in Spark? It takes one element and produces exactly one element.
0:45
That is the main difference between map and flatMap: map takes one element as input and produces exactly one element as output.
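The one-in, one-out behaviour of map can be sketched in plain Python, without a Spark cluster. The helper `rdd_map` below is a hypothetical stand-in that models an RDD as an ordinary list; it is not part of Spark. In PySpark, the corresponding call would be `rdd.map(f)`.

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def rdd_map(data: List[T], f: Callable[[T], U]) -> List[U]:
    """Hypothetical local model of map: call f exactly once per element."""
    return [f(x) for x in data]


numbers = [1, 2, 3, 4]
squares = rdd_map(numbers, lambda x: x * x)

print(squares)                      # [1, 4, 9, 16]
print(len(numbers), len(squares))   # 4 4 (input and output lengths match)
```

Because the function is called once per element and each call returns one value, the output always has the same number of elements as the input.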
0:54
I shall be discussing flatMap as well. flatMap takes one element as input and can produce zero, one, or multiple elements
1:02
as output, and that is the basic difference between map and flatMap. Here you can see that
1:09
this RDD is the input RDD, and it is passed to map. The map can contain a
1:16
user-defined operation, and it then produces the final output, which is also an RDD. Since map and flatMap
1:24
are transformations, each takes an RDD as input and produces an RDD as output. So the main
1:33
feature of map is that it takes one element and produces one element. So what is map in Spark? map is a
1:42
transformation operation of Spark: it takes an RDD as input and produces another RDD as output.
1:50
In the map method, the programmer can use their own logic, and that one logic is applied to every element of the RDD. The length of the input
2:02
RDD and the output RDD will be the same in the case of map. So the method or
2:09
operation given to map is applied to all
2:14
the data of the input RDD. Now let us discuss flatMap. It
2:21
takes one element and produces zero, one, or more elements as output. Here you can see
2:27
that it takes one RDD as input, a flatMap operation is applied,
2:32
and the user can specify the respective logic there. After taking this input, the
2:36
flatMap operation is performed on the RDD, and you can see it
2:42
producing another RDD as output, which may contain zero
2:46
elements, one element, or more than one element. Now, what is flatMap in Spark?
2:54
flatMap is another transformation operation of Spark, and it is similar to the map
3:00
function, but flatMap may not return an RDD of the same size; it may return zero, one, or
3:08
more than one record as output. In the case of map, we had one input and one output, but in the case of the flatMap method,
3:16
each input returns zero, one, or more than one record as output. So in flatMap, the programmer may use their own logic, and using this
3:28
function, one logic is applied to all the data of the input RDD. So the length of the input
3:35
RDD and the output RDD may not always be the same in the case of flatMap. So in this
3:43
discussion, we have covered the transformation techniques that use the map and
3:48
flatMap methods. Thanks for watching this video.
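As a recap of the length difference described above, here is a minimal plain-Python sketch that runs the same splitting logic through both operations. The helpers `local_map` and `local_flat_map` are hypothetical stand-ins that model an RDD as a list; they are not Spark functions. In PySpark, the corresponding calls would be `rdd.map(f)` and `rdd.flatMap(f)`.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def local_map(data: List[T], f: Callable[[T], U]) -> List[U]:
    """Hypothetical model of map: one output element per input element."""
    return [f(x) for x in data]


def local_flat_map(data: List[T], f: Callable[[T], Iterable[U]]) -> List[U]:
    """Hypothetical model of flatMap: f returns a collection for each
    input element, and all returned collections are flattened together."""
    return [y for x in data for y in f(x)]


lines = ["hello spark", "", "map and flatMap"]

# map keeps one result per line, so the output has 3 elements.
mapped = local_map(lines, lambda s: s.split())
print(mapped)     # [['hello', 'spark'], [], ['map', 'and', 'flatMap']]

# flatMap flattens the per-line results: 2 + 0 + 3 = 5 elements.
flattened = local_flat_map(lines, lambda s: s.split())
print(flattened)  # ['hello', 'spark', 'map', 'and', 'flatMap']
```

The empty line contributes zero elements to the flatMap output, while the other lines contribute more than one each, which is why the output length differs from the input length.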
#Programming