Lab – Component failure and recovery mechanisms

In this exercise we will run a few Hadoop commands to check the status of our cluster and to balance the distribution of file blocks across the nodes.

1. As an admin, you can get a status report for the Hadoop cluster with the following command (if you are working as the cent user, you will see [cent@localhost ~] in your terminal prompt):

hadoop dfsadmin -report
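
The report can be long on a cluster with many nodes. As a rough sketch, the output could be piped through a small filter that keeps only each node's name and usage percentages. The helper name summarize_report is hypothetical, and the field labels it matches (Name, DFS Used%, DFS Remaining%) are an assumption about the report format, which may differ between Hadoop versions.

```shell
#!/bin/sh
# summarize_report: hypothetical helper, not part of Hadoop.
# Reads a dfsadmin report on stdin and keeps only the lines that start
# with the assumed labels "Name:", "DFS Used%:" and "DFS Remaining%:".
summarize_report() {
    grep -E '^(Name|DFS Used%|DFS Remaining%):'
}

# Usage on a running cluster:
#   hadoop dfsadmin -report | summarize_report
```

If the filter prints nothing, check the labels in the raw report first and adjust the pattern to match.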

2. To see how the cluster is laid out (the racks and the data nodes attached to each rack), run the command:

hadoop dfsadmin -printTopology

3. To rebalance the distribution of blocks across the available data nodes, use the command:

hadoop balancer

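Running hadoop balancer without any arguments performs the balancing with the default threshold of 10.0 percent. The threshold is the maximum allowed difference between a data node's disk utilization and the average utilization of the whole cluster; the balancer moves blocks until every node falls within that range.

The threshold value in use is shown in the command's log output on the terminal.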
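
To make the threshold concrete, here is a minimal sketch of the check it implies: a node counts as balanced when its utilization differs from the cluster average by no more than the threshold. The helper is_balanced and the utilization figures are made up for illustration and are not part of Hadoop. The closing comment shows the balancer's -threshold option, which overrides the default.

```shell
#!/bin/sh
# is_balanced NODE_USED_PCT CLUSTER_USED_PCT THRESHOLD_PCT
# Hypothetical helper: succeeds (exit 0) when the node's utilization is
# within THRESHOLD_PCT percentage points of the cluster average.
is_balanced() {
    awk -v n="$1" -v c="$2" -v t="$3" 'BEGIN {
        d = n - c
        if (d < 0) d = -d          # absolute difference
        exit (d <= t) ? 0 : 1
    }'
}

# Made-up figures: node at 72% used, cluster average 55%, threshold 10.
if is_balanced 72 55 10; then
    echo "balanced"
else
    echo "needs balancing"
fi

# On a running cluster, a tighter threshold than the default can be requested with:
#   hadoop balancer -threshold 5
```

A smaller threshold gives a more even cluster but means the balancer has to move more blocks and runs longer.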