Corrupted blocks in HDFS

Ambari on one of my test environments was warning me about under replicated blocks in the cluster:

ambari - under replicated blocks

Opening the Namenode web UI (http://test01.namenode1:50070/dfshealth.html#tab-overview) painted a different picture:

namenode web ui - overview information about blocks

11 blocks are missing which might lead to corrupted files in the cluster.

Running file check on one of the files:

sudo -u hdfs hdfs fsck /tmp/tony/adac.json

The output:

fsck - corrupted file

The output reveals that the file is corrupted, was split in 4 blocks and is replicated only once on the cluster.

Nothing to do but to permanenetly delete the file:

sudo -u hdfs hdfs dfs -rm -skipTrash /tmp/tony/adac.json

Ambari wil now show 4 blocks less in the Under Replicated Blocks widget. Refreshing the Namenode web UI will also show that the corrupted blocks are removed.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s