Memory Optimization of your Hadoop Cluster

If you want to get better performance out of your Hadoop cluster, you might want to look into optimizing the memory settings of your NodeManagers. If you installed your cluster using Ambari and didn't change the memory settings during setup, you might be underutilizing the memory available on your machines. Therefore Hortonworks provides a nifty little tool that calculates the memory settings based on their best practices.

You can find the corresponding Python script on GitHub.

To calculate the recommended settings, just run:

git clone https://github.com/hortonworks/hdp-configuration-utils.git
cd hdp-configuration-utils
./2.1/hdp-configuration-utils.py -c 16 -m 64 -d 4 -k True

Adjust the parameters to fit the sizing of your machines (the sketch after this list shows one way to read these values off a node):

-c = number of CPU cores (here: 16)
-m = amount of memory in GB (here: 64)
-d = number of data disks (here: 4)
-k = whether HBase is installed (True/False)
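
If you are not sure about the right values for your nodes, you can read most of them straight off the machine. Here is a minimal Python sketch, assuming a Linux node and one mount point per data disk under /grid (both assumptions, adjust them to your layout):

import glob
import os

# -c: number of logical CPU cores
cores = os.cpu_count()

# -m: total RAM in GB, read from /proc/meminfo (Linux only)
with open("/proc/meminfo") as f:
    mem_kb = int(next(l for l in f if l.startswith("MemTotal")).split()[1])
mem_gb = mem_kb // (1024 * 1024)

# -d: number of data disks; assumes one mount point per disk under /grid
disks = len(glob.glob("/grid/*"))

print("-c {} -m {} -d {}".format(cores, mem_gb, disks))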

The result in this case would be:

Using cores=16 memory=64GB disks=4 hbase=True  
Profile: cores=16 memory=49152MB reserved=16GB usableMem=48GB disks=4  
Num Container=8  
Container Ram=6144MB  
Used Ram=48GB  
Unused Ram=16GB  
***** mapred-site.xml *****
mapreduce.map.memory.mb=6144  
mapreduce.map.java.opts=-Xmx4096m  
mapreduce.reduce.memory.mb=6144  
mapreduce.reduce.java.opts=-Xmx4096m  
mapreduce.task.io.sort.mb=1792  
***** yarn-site.xml *****
yarn.scheduler.minimum-allocation-mb=6144  
yarn.scheduler.maximum-allocation-mb=49152  
yarn.nodemanager.resource.memory-mb=49152  
yarn.app.mapreduce.am.resource.mb=6144  
yarn.app.mapreduce.am.command-opts=-Xmx4096m  
***** tez-site.xml *****
tez.am.resource.memory.mb=6144  
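
The numbers above follow a fairly simple heuristic that you can reproduce yourself. Here is a Python sketch approximating the calculation for this example; the reservation and minimum container size constants are assumptions inferred from the output above, not the script's exact lookup tables:

import math

def recommend(cores, mem_gb, disks, hbase):
    # Reserve memory for the OS stack and, if enabled, for HBase.
    # 8 GB each is an assumption inferred from "reserved=16GB" above.
    reserved_gb = 8 + (8 if hbase else 0)
    usable_mb = (mem_gb - reserved_gb) * 1024          # 48 GB -> 49152 MB

    # Assumption: 2 GB minimum container size for machines with >24 GB RAM
    min_container_mb = 2048

    # Containers are bounded by cores, spindles, and usable memory
    containers = int(min(2 * cores,
                         math.ceil(1.8 * disks),
                         usable_mb / min_container_mb))  # -> 8

    container_mb = max(min_container_mb, usable_mb // containers)  # -> 6144
    return containers, container_mb

print(recommend(16, 64, 4, True))  # prints (8, 6144)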

You can now change the corresponding properties in your *.xml files and restart the affected components/daemons. Afterwards you should have more YARN memory available in your cluster and can run more and bigger containers on the NodeManagers.
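
If you want to script the change instead of editing the files by hand, a small helper like the following can patch a Hadoop-style *-site.xml file. The path under /etc/hadoop/conf is an assumption, and note that on an Ambari-managed cluster you would normally set these values through Ambari so they are not overwritten on the next restart:

import xml.etree.ElementTree as ET

def set_property(path, name, value):
    # Update (or add) a property in a Hadoop-style *-site.xml file.
    tree = ET.parse(path)
    root = tree.getroot()
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            prop.find("value").text = value
            break
    else:
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    tree.write(path)

# Assumption: configuration files live under /etc/hadoop/conf
set_property("/etc/hadoop/conf/yarn-site.xml",
             "yarn.nodemanager.resource.memory-mb", "49152")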

Tweaking those settings further is also possible. But be careful not to overcommit your memory settings too much: if containers claim more RAM than is physically available, the operating system starts swapping memory back to disk, which is extremely slow and can have severe performance implications.
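
A simple sanity check along those lines: the JVM heap (-Xmx) has to stay below the container size to leave room for off-heap memory, and the total NodeManager allocation should stay below physical RAM minus the reservation. Using the numbers from above:

# Sanity checks with the values calculated above
physical_mb = 64 * 1024
reserved_mb = 16 * 1024
container_mb = 6144
heap_mb = 4096  # -Xmx4096m

# JVM heap must leave headroom inside the container for off-heap usage
assert heap_mb < container_mb

# Total YARN allocation must not eat into the OS/HBase reservation
assert 8 * container_mb <= physical_mb - reserved_mb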

Andreas Fritzler
