Lab – HDFS Operations and Commands

1. Copy the file bigdatafile-dihub.txt from the local file system to hdfs.

Note: The data files can be found in the dataMR directory, and the program files in the javaMR directory, both under the Documents directory.

Go to the dataMR directory inside Documents (for example, with the cd command shown below) and then run the copyFromLocal command that follows to copy the file to hdfs.
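For instance, assuming the username cent that this lab uses in step 2, changing into the directory would look like this:

cd /home/cent/Documents/dataMR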

hadoop dfs -copyFromLocal bigdatafile-dihub.txt /

Note: Whenever you run any dfs command with hadoop you will see a DEPRECATED message on screen. All dfs commands can be run with hdfs, which is specific to file system commands, while hadoop is common to both hdfs and jar commands. Running dfs commands with the hadoop syntax is more generic, but to avoid the warning you can run these dfs commands with hdfs as shown below:

hdfs dfs -copyFromLocal bigdatafile-dihub.txt /

Note: In the commands above we have given only the file name without any specific local_path because we are already in the directory where the desired file exists, and we have given the hdfs_path as /
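As an optional check that is not part of the original steps, you can list the HDFS root directory to confirm the file arrived:

hdfs dfs -ls /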

 

2. To download the file from hdfs to the local file system, the command is:

hadoop dfs -copyToLocal /bigdatafile-dihub.txt /home/cent/Documents/dataMR/downloaded-bigdatafile

Note: Here we have given an absolute local path to illustrate the local_path argument. In the path, cent is our username; if you are using a different username, please change the command accordingly.
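As another optional check, you can inspect the first few lines of the downloaded copy with an ordinary Linux command before moving on:

head -5 /home/cent/Documents/dataMR/downloaded-bigdatafile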

 

3. To see the contents of any file on hdfs, use the cat command. You can combine it with the Linux pipe feature to trim the output coming from hdfs, as shown:

hdfs dfs -cat /bigdatafile-dihub.txt | head -10
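hdfs dfs also has a built-in -tail option that prints the last kilobyte of a file, so you can peek at the end without streaming the whole file through cat; this example is an addition to the lab steps:

hdfs dfs -tail /bigdatafile-dihub.txt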

 

4. Delete a file from hdfs: To delete any file from hdfs, use the -rm command with the file path.

hdfs dfs -rm /bigdatafile-dihub.txt
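Note: -rm alone works only on files. To delete a directory and its contents, add the -r flag; the directory name below is just a hypothetical example:

hdfs dfs -rm -r /exampledir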