Ways of accessing data in HDFS

You can interact with HDFS through a command line interface (CLI) or a graphical user interface (GUI).

Interacting using CLI: If Hadoop is running on your system, you can access HDFS from the command line in your terminal. If it is not running, first start all Hadoop components by running the start-all.sh script in a terminal.

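The HDFS command line interface supports counterparts of most common Linux file commands (ls, mkdir, rmdir, etc.). Each one is written with a leading hyphen after the prefix “hadoop dfs”, which tells Hadoop to perform the operation on HDFS rather than on the local file system. Newer Hadoop releases mark “hadoop dfs” as deprecated and accept “hdfs dfs” or “hadoop fs” in its place.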

For example, to see the files and directories at the top level of HDFS, run ls with the prefix hadoop dfs and the path /, which is the root of the HDFS namespace. The command is:

hadoop dfs -ls /


Note: On a fresh cluster, if you have not yet uploaded any file to HDFS, this command lists no items.
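The listing step above can be wrapped in a small script. This is a minimal sketch and makes some assumptions: hdfs_ls_command and list_hdfs are hypothetical helper names, not part of Hadoop, and the script uses the “hadoop fs” form of the prefix. It only contacts the cluster when the hadoop binary is found on the PATH; otherwise it prints the command it would have run.

```shell
#!/bin/sh
# Sketch only: hdfs_ls_command and list_hdfs are hypothetical helpers,
# not commands shipped with Hadoop.

# Print the command line that lists the given HDFS path (default: root "/").
hdfs_ls_command() {
    printf 'hadoop fs -ls %s\n' "${1:-/}"
}

# Run the listing if a Hadoop client is installed; otherwise only
# report the command, so the script is safe to run anywhere.
list_hdfs() {
    if command -v hadoop >/dev/null 2>&1; then
        hadoop fs -ls "${1:-/}"
    else
        echo "hadoop not found; would run: $(hdfs_ls_command "${1:-/}")"
    fi
}

list_hdfs /
```

Calling list_hdfs with another path, for example list_hdfs /user, lists that directory instead of the root.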

Interacting using GUI: HDFS can also be accessed through a web interface in your browser, at localhost on port 50070 (the default in Hadoop 2.x; Hadoop 3.x uses port 9870). This port serves the NameNode status page. As discussed in earlier sections, the NameNode acts as the user's interface to HDFS.

Go to the Utilities tab and click “Browse the file system” to see all the files and directories residing in HDFS.

Initially, as you have not uploaded anything to HDFS, you will not see any files listed there. Once you upload a file or create a directory or file on HDFS in the next lab, it will appear here.
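To check from a terminal whether the web interface is up before opening the browser, something like the following sketch may help. It rests on several assumptions: namenode_ui_url is a hypothetical helper, the host and port default to localhost and 50070 as in the text, and the /explorer.html#/ path is assumed to be the page behind “Browse the file system”, which may differ between Hadoop versions. The reachability check runs only if curl is installed.

```shell
#!/bin/sh
# Sketch only: namenode_ui_url is a hypothetical helper, not part of Hadoop.
# Usage: namenode_ui_url [host] [port]
# Assumes the file browser lives at /explorer.html#/ on the NameNode web UI.
namenode_ui_url() {
    printf 'http://%s:%s/explorer.html#/\n' "${1:-localhost}" "${2:-50070}"
}

url=$(namenode_ui_url)
echo "File browser: $url"

# Optional reachability check, skipped when curl is not available.
if command -v curl >/dev/null 2>&1; then
    if curl -s -o /dev/null --max-time 5 "$url"; then
        echo "NameNode web UI is reachable"
    else
        echo "NameNode web UI is not reachable (is HDFS running?)"
    fi
fi
```

On a Hadoop 3.x installation the port can be overridden, for example namenode_ui_url localhost 9870.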