In Part 1, Data Replication in Hadoop: Replicating Data Blocks (Part – 1), I explained how HDFS replicates data blocks. In this post I explain what happens when a disk or slave node fails.
Hadoop was originally designed to store data at petabyte scale, with any potential limitations to scaling out minimized. The large block size is a direct result of this requirement. First, each data block stored in HDFS has its own metadata and must be tracked by a central server, so that an application that needs to access a particular file can be directed to wherever the file's blocks are stored. If the block size were in the kilobyte range, even modest data volumes at terabyte scale would overwhelm the metadata server with too many blocks to track. Second, HDFS is built to provide high throughput so that parallel processing of these massive data sets happens as fast as possible.
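To put rough numbers on the metadata argument, here is a small back-of-the-envelope sketch. The block sizes (4 KB and 128 MB) are illustrative assumptions, not values taken from this post, and the data is treated as one large file so that per-file rounding is ignored.

```python
KB = 1024
MB = 1024 * KB
TB = 1024 * 1024 * MB

def block_count(data_bytes: int, block_bytes: int) -> int:
    """Number of blocks needed to hold data_bytes (ceiling division)."""
    return -(-data_bytes // block_bytes)

# Assumed block sizes, chosen only for comparison.
small_blocks = block_count(1 * TB, 4 * KB)    # 268,435,456 blocks to track
large_blocks = block_count(1 * TB, 128 * MB)  # 8,192 blocks to track

print(f"4 KB blocks:   {small_blocks:,}")
print(f"128 MB blocks: {large_blocks:,}")
```

Under these assumptions, a single terabyte needs hundreds of millions of entries with kilobyte-sized blocks, but only a few thousand with large blocks.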
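The key to Hadoop's scalability on the data processing side is, and always will be, parallelism: the ability to process the individual blocks of these massive files in parallel. To facilitate efficient processing, a balance needs to be struck. Blocks must be large enough to justify the overhead of starting a processing task for each one, yet small enough that a file still splits into many blocks that can be processed at the same time.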
Just like death and taxes, disk failures (and, given enough time, even node or rack failures) are inevitable. Given the example in the adjoining figure, even if one of the racks were to fail, the cluster would continue functioning. Performance would suffer because we would lose half our processing resources, but the system would still be online and the data would still be available.
When a disk drive or a slave node fails, the central metadata server for HDFS (called the NameNode) eventually finds out that the file blocks stored on the failed resource are no longer available. For example, if Slave Node 3 in this figure fails, Blocks A, C, and D become under-replicated. In other words, too few copies of these blocks are now available in HDFS. When HDFS detects that a block is under-replicated, it orders a new copy.
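The under-replication check can be sketched as a toy model. This is not NameNode code: the node names, the `placement` map, and the `under_replicated` helper are all hypothetical, and the layout only mirrors the example by putting Blocks A, C, and D on the third node with a target of three copies.

```python
TARGET_REPLICAS = 3  # replication factor used in the example

def under_replicated(placement, live_nodes, target=TARGET_REPLICAS):
    """Return {block: missing_copies} for blocks with too few live replicas."""
    missing = {}
    for block, nodes in placement.items():
        live = len(set(nodes) & set(live_nodes))
        if live < target:
            missing[block] = target - live
    return missing

# Hypothetical placement: node3 holds copies of Blocks A, C, and D.
placement = {
    "A": ["node1", "node3", "node4"],
    "B": ["node1", "node2", "node5"],
    "C": ["node2", "node3", "node5"],
    "D": ["node3", "node4", "node5"],
}

# node3 has failed, so it is absent from the set of live nodes.
live = {"node1", "node2", "node4", "node5"}
print(under_replicated(placement, live))  # {'A': 1, 'C': 1, 'D': 1}
```

Each entry in the result is a block that needs one more copy made on a surviving node.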
To continue the example, let's say that Slave Node 3 comes back online after a few hours. In the meantime, HDFS will have ensured that there are again three copies of every file block on the remaining nodes. So now Blocks A, C, and D have four copies apiece and are over-replicated. As with under-replicated blocks, the HDFS central metadata server will find out about this as well, and will order one copy of each over-replicated block to be deleted.
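The cleanup step can be sketched in the same toy style. The layout, the `excess_replicas` helper, and the rule for picking which copy to drop (the last node name alphabetically) are hypothetical placeholders. Real HDFS applies its own policy when choosing a replica to remove.

```python
def excess_replicas(placement, target=3):
    """Return {block: nodes_to_drop} for blocks with more than `target` copies."""
    to_delete = {}
    for block, nodes in placement.items():
        extra = len(nodes) - target
        if extra > 0:
            # Placeholder rule: drop the copies on the last nodes alphabetically.
            to_delete[block] = sorted(nodes)[-extra:]
    return to_delete

# Hypothetical state after node3 rejoins: A, C, and D each gained a fourth copy
# while node3 was offline, and node3 still holds its original copies.
after_recovery = {
    "A": ["node1", "node3", "node4", "node2"],
    "B": ["node1", "node2", "node5"],
    "C": ["node2", "node3", "node5", "node1"],
    "D": ["node3", "node4", "node5", "node1"],
}

print(excess_replicas(after_recovery))
# {'A': ['node4'], 'C': ['node5'], 'D': ['node5']}
```

Block B never lost a copy, so it does not appear in the result.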
One nice result of this data availability is that when disk failures do occur, there's no need to replace the failed hard drives immediately. Replacement can be done more effectively at regularly scheduled intervals.