Internals of a File Read in HDFS

HDFS stores data in the form of blocks on datanodes.

When a Namenode receives a file read request, it searches the metadata on Namenode itself to check the blocks location on datanodes. Then Namenode gives this file location information to the client program and instructs to check the desired block on specified datanodes.

If we talk about physical storage of Namenode and datanode, these are directory locations specified by hadoop administrator during hadoop installation. You can find the location details of hdfs in hdfs-site.xml as shown:

You can also visit the file location  using explorer and can see the content of hdfs nodes as per their designed functionality.

Note: Datanodes contains data blocks while Namenode contains metadata.

Note: Storage node and compute nodes are same in hadoop.

In a production environment we work on a multi-node cluster where we have more than one datanodes connected to a Namenode and the blocks are replicated over multiple datanodes. The blocks are distributed over the cluster in such way that any node failure may not result in loss of data.

The data resides in these blocks which are 64 MB or more in size. In hadoop 2.x by default size of a data block is 128 MB. It can be configured as per requirement.

The size of block affects the disk i/o operations needed for reading / processing large files on hadoop cluster. Blocks the replicated across cluster to ensure availability and fault tolerance. Default replication factor is 3.

Image source:

Let’s understand the block replication in the shown image.

Part-0 has 2 data blocks 1 and 3 with replication 2.

Part-1 has three data blocks 2, 4, & 5 with replication 3.

Note that we can have different replication factor set for different files. If you think any file may need more replication and availability concern you can increase its replication factor while uploading to hdfs. That replication value will override the default value.

You can see the replication applied on blocks rather than on file. So, the blocks of part-0 file are replicated on datanode 1 and 3.  While blocks of part-1 file are replicated on all 4 data nodes ensuring 3 replications for each block.