MapReduce and Design Patterns - Download and Setup Sample data into HDFS
Oct 18, 2024
0:00
Download and set up sample data in HDFS
0:05
So we require some sample data for running our MapReduce programs. So how do we download it, and how do we keep it in HDFS?
0:13
We'll be discussing that step by step. So we shall download the sample data to run our MapReduce programs
0:21
And we shall use the Stack Exchange data dump. And you can get all the download links here
0:27
So this is the download link. From here you are supposed to download the file for android.stackexchange.com
0:35
So this file has to be downloaded. It contains different datasets
0:43
Now we set up HDFS and store the downloaded data. We should create some directories in HDFS
0:51
and then store the respective XML files in them. In our example, we have created a /input
0:58
folder and then further folders to store the respective XML files. So in the
1:04
HDFS root we'll be creating one folder; the name of the folder will be input.
1:08
Under that folder we'll be creating multiple subfolders, and the respective XML files will be stored in them. So under this particular input
1:17
folder we have the badges folder and then Badges.xml. We shall create
1:22
another folder, comments, for Comments.xml, and posts for Posts.xml. In this way you can find
1:28
that all these 1, 2, 3, 4, 5, 6, 7, 8 XML files will be saved, or transferred, to this
1:36
particular path, and they'll be used in our MapReduce application development. So let us go for a demonstration for easy understanding of this concept
1:47
At first we shall copy one URL here. So here is the URL, that is http://archive.org/download/stackexchange. So we shall copy this URL, paste it into the browser's URL box, and go
2:12
This webpage has opened, and here we are going to select
2:15
android.stackexchange.com.7z. So this particular file is to be downloaded. But actually I have
2:22
already downloaded this one to my download folder. So I'm not going for OK, I'm going for
2:27
Cancel. Otherwise I shall have to download it again. So I shall open the download folder. The file is
2:33
there. So we shall right-click and go for Extract Here. A new folder has been
2:40
created, android.stackexchange.com. So the extraction is taking place
2:52
Extraction has completed. So we are opening the folder. This folder has 8 XML files
2:59
You can see that it has 8 XML files. So we shall copy all these files into a folder under MapReduce Design Pattern
3:08
So we have gone to MapReduce Design Pattern and are creating a new folder
3:12
Let the folder name be xml_files. We are creating a new folder under the MapReduce Design Pattern folder
3:22
So we shall go for the respective download folder and there we are going to copy all
3:28
the 8 XML files, going for the Home, then snakes, then MapReduce Design Pattern
3:37
then xml_files. Now we are going to paste all the 8 XML files here
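The copy step just described could also be scripted. What follows is only a rough sketch and not something shown in the video: the function name `copy_xml_files` and both directory paths in the `__main__` part are assumptions for illustration, so adjust them to wherever the archive was actually extracted.

```python
import shutil
from pathlib import Path


def copy_xml_files(extracted_dir: Path, target_dir: Path) -> list[str]:
    """Copy every .xml file from extracted_dir into target_dir.

    Returns the sorted names of the copied files.
    """
    target_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for xml_file in sorted(extracted_dir.glob("*.xml")):
        shutil.copy2(xml_file, target_dir / xml_file.name)
        copied.append(xml_file.name)
    return copied


if __name__ == "__main__":
    # Hypothetical locations, not taken from the video.
    source = Path.home() / "Downloads" / "android.stackexchange.com"
    target = Path.home() / "MapReduceDesignPattern" / "xml_files"
    if source.is_dir():
        print(copy_xml_files(source, target))
    else:
        print(f"Extracted folder not found: {source}")
```

Printing the returned list gives a quick way to confirm that all 8 XML files arrived in xml_files.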
3:45
So now these XML files have to be put into HDFS, so Hadoop has to be started
3:57
We're opening a terminal, so we are going to start Hadoop. Now Hadoop is
4:14
starting. We shall also issue the jps command to see whether the Hadoop processes are running or not. All the processes are running
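The jps check can be automated along these lines. This is a minimal sketch, assuming a typical single-node setup where NameNode, DataNode, SecondaryNameNode, ResourceManager and NodeManager should all appear; that list and the helper name `missing_daemons` are assumptions, since the video does not list the processes.

```python
import subprocess

# Daemons of a typical single-node Hadoop setup (an assumption about this cluster).
EXPECTED_DAEMONS = {
    "NameNode",
    "DataNode",
    "SecondaryNameNode",
    "ResourceManager",
    "NodeManager",
}


def missing_daemons(jps_output: str) -> set[str]:
    """Return the expected daemons that do not appear in the given jps output."""
    running = set()
    for line in jps_output.splitlines():
        parts = line.split()
        # Each jps line looks like "<pid> <class name>"; the name can be absent.
        if len(parts) >= 2:
            running.add(parts[1])
    return EXPECTED_DAEMONS - running


if __name__ == "__main__":
    try:
        result = subprocess.run(["jps"], capture_output=True, text=True, check=True)
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not run jps: {exc}")
    else:
        missing = missing_daemons(result.stdout)
        if missing:
            print(f"Missing daemons: {sorted(missing)}")
        else:
            print("All expected daemons are running")
```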
4:29
Now we shall open the browser, and the
4:38
Hadoop file browser is there. We are going for a refresh. There is one folder called
4:42
input in the HDFS root. This input folder is empty. Now we shall
4:50
create the subfolders. I'm not going to execute the mkdir for the input folder, since it already exists. So I'm
4:56
going to create eight different folders under the input folder, and there we'll
5:02
be keeping the respective XML files. So the first folder has been created; that is
5:08
/input/badges. Now going for the creation of the remaining seven folders on HDFS
5:16
And then we shall copy all the eight XML files to the respective folders
5:20
We are supposed to create another four folders
5:42
We shall copy the XML files there respectively after creating all 8 folders
6:10
So all these files are to be copied
6:40
So there is a command for this, the put command, for copying files from the local system to HDFS
6:54
So let me copy all the 8 XML files into the respective folders
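The folder creation and the put step can be sketched as a small script. The exact commands typed in the video are not captured in the transcript, so this assumes the standard `hdfs dfs -mkdir -p` and `hdfs dfs -put` forms. It also assumes each subfolder is named after its file in lower case, for example Badges.xml goes to /input/badges. The function name `hdfs_commands` and the local path are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def hdfs_commands(local_dir: Path, hdfs_root: str = "/input") -> list[list[str]]:
    """Build one mkdir and one put command for each XML file in local_dir."""
    commands = []
    for xml_file in sorted(local_dir.glob("*.xml")):
        # Assumed naming rule: Badges.xml -> /input/badges
        target = f"{hdfs_root}/{xml_file.stem.lower()}"
        commands.append(["hdfs", "dfs", "-mkdir", "-p", target])
        commands.append(["hdfs", "dfs", "-put", str(xml_file), target])
    return commands


if __name__ == "__main__":
    # Hypothetical local folder holding the 8 XML files.
    local = Path.home() / "MapReduceDesignPattern" / "xml_files"
    if not local.is_dir():
        print(f"Local folder not found: {local}")
    for command in hdfs_commands(local):
        print(" ".join(command))
        # Commands are only printed unless --run is passed explicitly.
        if "--run" in sys.argv:
            subprocess.run(command, check=True)
```

By default the script only prints the commands, so they can be reviewed before anything is written to HDFS; passing `--run` executes them one by one.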
7:10
So in this way we shall create all the folders on HDFS, and then we shall
7:21
copy all these XML files to the respective folders. And now let us check the input
7:27
folder, and you can find that all the XMLs have been copied
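The same check can be done from the terminal instead of the browser. As a hedged sketch, the helper below (the name `xml_paths` is hypothetical) picks the XML paths out of the text printed by `hdfs dfs -ls -R /input`, assuming the usual eight-column listing with the path in the last column and no spaces in the paths.

```python
import subprocess


def xml_paths(ls_output: str) -> list[str]:
    """Extract the paths ending in .xml from `hdfs dfs -ls -R` output."""
    paths = []
    for line in ls_output.splitlines():
        fields = line.split()
        # Entries have 8 whitespace-separated fields; the path comes last.
        if len(fields) >= 8 and fields[-1].lower().endswith(".xml"):
            paths.append(fields[-1])
    return paths


if __name__ == "__main__":
    try:
        result = subprocess.run(
            ["hdfs", "dfs", "-ls", "-R", "/input"],
            capture_output=True, text=True, check=True,
        )
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not list /input: {exc}")
    else:
        found = xml_paths(result.stdout)
        print(f"{len(found)} XML files found under /input")
        for path in found:
            print(path)
```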