How to calculate a Hadoop cluster growth plan based on storage

This calculation is for a small 3-node Hadoop cluster and assumes an average daily ingest rate of 10 GB.

Average daily ingest rate: 10 GB
Replication factor: 3 (copies of each block)
Daily raw consumption: 30 GB (ingest × replication)
Node raw storage: 600 GB (2 × 300 GB SATA II HDD)
MapReduce temp space reserve: 25% (for intermediate MapReduce data)
Node-usable raw storage: 450 GB (node raw storage - MapReduce reserve)
1 year (flat growth): 24 nodes (ingest × replication × 365 / node-usable raw storage = 10 GB × 3 × 365 / 450 GB ≈ 24.3; round up to 25 nodes to hold the full year)
1 year (5% growth per month):
1 year (10% growth per month):
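The arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a definitive sizing tool: the function name `nodes_needed` and its parameters are hypothetical, and the growth model is an assumption (ingest stays constant within a month and compounds by the stated percentage at the start of each following month, with months of 365/12 days). The source does not state which growth model it intends, so the growth figures this produces are only one plausible reading.

```python
import math


def nodes_needed(daily_ingest_gb, replication, node_raw_gb, temp_reserve,
                 monthly_growth=0.0, months=12, days_per_year=365):
    """Estimate how many nodes are needed to hold `months` of ingest.

    Hypothetical helper: the compounding model is an assumption,
    not something stated in the original post.
    """
    # Storage left on each node after reserving MapReduce temp space.
    usable_per_node_gb = node_raw_gb * (1 - temp_reserve)
    days_per_month = days_per_year / months

    total_raw_gb = 0.0
    for month in range(months):
        # Assumed model: month 0 ingests at the base rate, and each later
        # month ingests (1 + monthly_growth) times the previous month.
        daily_rate = daily_ingest_gb * (1 + monthly_growth) ** month
        total_raw_gb += daily_rate * days_per_month * replication

    return total_raw_gb / usable_per_node_gb


for growth in (0.0, 0.05, 0.10):
    exact = nodes_needed(10, 3, 600, 0.25, monthly_growth=growth)
    print(f"{growth:.0%} monthly growth: {exact:.1f} -> {math.ceil(exact)} nodes")
```

With the inputs from the table, the flat case comes out at about 24.3, which rounds up to 25 nodes. Under the assumed compounding model, the 5% and 10% cases come out at roughly 32.3 and 43.4, or 33 and 44 nodes after rounding up.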


Published by Aryan Nava

Founder of "BlockchainMind", CTO for two Blockchain startups during 2018, Cloud/DevOps Consultant and Blockchain Trainer
