How to calculate a Hadoop cluster growth plan based on storage?
This calculation is for a small 3-node Hadoop cluster and assumes an average daily ingest rate of 10 GB.
|Average daily ingest rate||10 GB|
|Replication factor||3 (copies of each block)|
|Daily raw consumption||30 GB (Ingest × replication)|
|Node raw storage||600 GB (2 × 300 GB SATA II HDD)|
|MapReduce temp space reserve||25% (for intermediate MapReduce data)|
|Node-usable raw storage||450 GB (Node raw storage – MapReduce reserve)|
|1 year (flat growth)||25 nodes (Ingest × replication × 365 / node-usable raw storage: 10 GB × 3 × 365 / 450 GB ≈ 24.3, rounded up to whole nodes)|
|1 year (5% growth per month)|
|1 year (10% growth per month)|
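
The arithmetic in the table can be sketched in a few lines of Python. This is a minimal sketch, not a definitive sizing tool: the function name `nodes_needed` and its parameters are hypothetical, and the growth model is an assumption (ingest compounds once per month, starting from the baseline rate in the first month, with an average month of 365/12 days). The result is rounded up to whole nodes, because a fractional node cannot be provisioned; with the table's inputs, the flat-growth case works out to 24.3, which rounds up to 25.

```python
import math


def nodes_needed(daily_ingest_gb, replication, node_raw_gb,
                 temp_reserve, monthly_growth=0.0, months=12):
    """Estimate how many nodes are needed to hold `months` of ingested data.

    Assumption (not from the table): ingest grows by `monthly_growth`
    at the start of each month after the first, compounding.
    """
    # Storage left on one node after reserving MapReduce temp space.
    usable_per_node_gb = node_raw_gb * (1 - temp_reserve)

    # Raw (replicated) data written in an average baseline month.
    base_month_raw_gb = daily_ingest_gb * replication * 365 / 12

    # Sum the raw consumption of each month, applying compound growth.
    total_raw_gb = sum(
        base_month_raw_gb * (1 + monthly_growth) ** month
        for month in range(months)
    )

    # Round up: a partially filled node is still a whole node.
    return math.ceil(total_raw_gb / usable_per_node_gb)


if __name__ == "__main__":
    # Inputs taken from the table: 10 GB/day, 3 replicas,
    # 600 GB raw per node, 25% temp space reserve.
    for growth in (0.0, 0.05, 0.10):
        count = nodes_needed(10, 3, 600, 0.25, monthly_growth=growth)
        print(f"{growth:.0%} monthly growth: {count} nodes")
```

Under these assumptions the sketch gives 25 nodes for flat growth, 33 nodes for 5% growth per month, and 44 nodes for 10% growth per month. A different growth model (for example, growth applied daily or from the first month onward) would shift the two growth figures.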