Resource Allocation and Scheduler Configuration

Category: Administration → Resource Management 

Applies to: Apache Hadoop 2.x, 3.x 

Issue Summary 

This document outlines critical configurations for resource allocation and scheduler management within Hadoop YARN (Yet Another Resource Negotiator). Proper configuration of YARN's ResourceManager and NodeManagers is fundamental to optimizing cluster resource utilization, ensuring application stability, and achieving predictable performance for various workloads (e.g., MapReduce, Spark, Tez). Incorrect settings can lead to resource starvation, inefficient job execution, or even cluster instability. 

Critical Properties (in yarn-site.xml and capacity-scheduler.xml / fair-scheduler.xml) 

  1. NodeManager Resource Configuration (yarn-site.xml) 

  • yarn.nodemanager.resource.memory-mb 

  • Description: The total amount of physical memory (in MB) that YARN will allow containers to use on a NodeManager. 

  • Value: Set this to a value less than the total physical RAM on the server (e.g., 75-80% of total RAM) to leave memory for the operating system, Hadoop daemons (DataNode, NodeManager itself), and other system processes. 

  • yarn.nodemanager.resource.cpu-vcores 

  • Description: The total number of virtual CPU cores that YARN will allow containers to use on a NodeManager. 

  • Value: Set this equal to the number of physical CPU cores on the server. With hyper-threading enabled, count each physical core as one vcore rather than counting each hardware thread, to avoid CPU over-subscription. 
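As an illustrative sketch, for a hypothetical worker node with 128 GB of RAM and 32 physical cores, the NodeManager section of yarn-site.xml might look like the following (the values, including the ~20% reservation for the OS and Hadoop daemons, are assumptions to adapt to your hardware):

```xml
<!-- yarn-site.xml (illustrative values for a 128 GB / 32-core node) -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>102400</value> <!-- ~100 GB: roughly 80% of 128 GB physical RAM -->
</property>
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>32</value> <!-- one vcore per physical core -->
</property>
```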

  2. Scheduler General Allocation Settings (yarn-site.xml) 

  • yarn.scheduler.minimum-allocation-mb 

  • Description: The minimum memory allocation request for any container (in MB). Requests below this value are raised to it, and larger requests are rounded up to a multiple of it. 

  • Value: 1024 (1GB) or 2048 (2GB) are common. Setting this too low can lead to an excessive number of small containers, increasing scheduling overhead. 

  • yarn.scheduler.maximum-allocation-mb 

  • Description: The maximum memory allocation request for any container (in MB). No container can request more memory than this. 

  • Value: Should not exceed yarn.nodemanager.resource.memory-mb. Set slightly lower to account for NodeManager overhead. 

  • yarn.scheduler.minimum-allocation-vcores 

  • Description: The minimum virtual CPU cores allocation request for any container. All container requests will be rounded up to this value. 

  • Value: 1 

  • yarn.scheduler.maximum-allocation-vcores 

  • Description: The maximum virtual CPU cores allocation request for any container. 

  • Value: Should not exceed yarn.nodemanager.resource.cpu-vcores. 

  • yarn.resourcemanager.scheduler.class 

  • Description: Specifies the YARN scheduler to use. 

  • Value: 

org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler (Default and generally recommended for multi-tenancy). 

org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler (For fair sharing of resources over time). 
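Putting the scheduler choice and allocation bounds together, a minimal yarn-site.xml fragment might look like this (illustrative values, assuming NodeManagers that expose roughly 100 GB and 32 vcores to YARN):

```xml
<!-- yarn-site.xml: scheduler selection and container allocation bounds (illustrative) -->
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>102400</value> <!-- must not exceed yarn.nodemanager.resource.memory-mb -->
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-vcores</name>
  <value>1</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-vcores</name>
  <value>32</value> <!-- must not exceed yarn.nodemanager.resource.cpu-vcores -->
</property>
```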

  3. CapacityScheduler Specific Configuration (capacity-scheduler.xml) 

  • yarn.scheduler.capacity.root.queues 

  • Description: Defines the top-level queues under the root. 

  • Value: Comma-separated list of queue names (e.g., default,dev,prod). 

  • yarn.scheduler.capacity.<queue-path>.capacity 

  • Description: The percentage of its parent's resources guaranteed to a specific queue (for top-level queues, a percentage of the cluster). The capacities of the direct children under any parent must sum to 100. 

  • Value: Percentage (e.g., 50). 

  • yarn.scheduler.capacity.<queue-path>.maximum-capacity 

  • Description: The maximum percentage of cluster resources a queue can use, allowing it to "burst" above its guaranteed capacity if other queues are idle. 

  • Value: Percentage (e.g., 100). Set to 100 for dynamic resource sharing. 

  • yarn.scheduler.capacity.<queue-path>.state 

  • Description: The state of a queue (RUNNING or STOPPED). 

  • Value: RUNNING. 

  • yarn.scheduler.capacity.<queue-path>.user-limit-factor 

  • Description: A multiplier on the queue's configured capacity that limits how much a single user can consume. 

  • Value: Floating-point number (e.g., 1.0 allows a single user to consume the queue's full configured capacity; values above 1.0 let one user burst beyond it, up to the queue's maximum-capacity). 

  • yarn.scheduler.capacity.<queue-path>.maximum-applications 

  • Description: The maximum number of applications that can run concurrently in a queue. 

  • Value: Integer. Prevents a single queue from overwhelming the ResourceManager. 

  • yarn.scheduler.capacity.resource-calculator 

  • Description: Specifies how resources are calculated (memory, or memory and CPU). 

  • Value: 

org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator (Memory only). 

org.apache.hadoop.yarn.util.resource.DominantResourceCalculator (Memory and CPU - recommended for balanced scheduling). 
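The queue properties above can be combined into a minimal capacity-scheduler.xml. The queue names (prod, dev) and percentages below are illustrative assumptions, not recommendations:

```xml
<!-- capacity-scheduler.xml: two illustrative queues under root -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>prod,dev</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.prod.capacity</name>
  <value>70</value> <!-- guaranteed share; direct children of root must sum to 100 -->
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.capacity</name>
  <value>30</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.maximum-capacity</name>
  <value>100</value> <!-- dev may burst into idle prod capacity -->
</property>
<property>
  <name>yarn.scheduler.capacity.root.dev.user-limit-factor</name>
  <value>1.0</value> <!-- one user may consume at most the queue's configured capacity -->
</property>
<property>
  <name>yarn.scheduler.capacity.resource-calculator</name>
  <value>org.apache.hadoop.yarn.util.resource.DominantResourceCalculator</value>
</property>
```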

Additional Notes: 

  • Memory vs. CPU: Modern workloads often demand a balance of CPU and memory. Use DominantResourceCalculator with CapacityScheduler to ensure jobs are scheduled based on the resources (CPU or Memory) they are most constrained by. 

  • JVM Overheads: When configuring spark.executor.memory for Spark applications (or the equivalent for other frameworks), remember that the YARN container allocated to the process must be larger than the JVM heap size requested by the application. This accounts for JVM overhead (e.g., off-heap memory, garbage collection structures, thread stacks); Spark models it with spark.executor.memoryOverhead. A common practice is to allocate an additional 10-20% beyond the application's requested heap. 
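As a worked sketch of the overhead arithmetic (all values illustrative): with a 4 GB executor heap, Spark requests the heap plus spark.executor.memoryOverhead, which defaults to max(384 MB, 10% of the executor memory):

```properties
# spark-defaults.conf (illustrative executor sizing)
spark.executor.memory          4g
# Default overhead would be max(384 MB, 10% of 4096 MB) = 410 MB; setting it
# explicitly is common when jobs use significant off-heap memory.
spark.executor.memoryOverhead  512m
# Total container request = 4096 + 512 = 4608 MB, which YARN rounds up to a
# multiple of yarn.scheduler.minimum-allocation-mb (e.g., 5120 MB with a 1024 MB minimum).
```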

  • Queue Hierarchy Design: For multi-tenant clusters, design a logical queue hierarchy in CapacityScheduler (e.g., root.dev, root.prod, root.data_science) to isolate workloads, manage resource guarantees, and apply different access policies. 

  • Small Containers Overhead: While smaller minimum-allocation-mb values offer more granularity, they increase scheduling overhead for the ResourceManager and lead to more JVM instances, consuming more memory for JVM processes themselves. Balance granularity with efficiency. 

  • Dynamic Resource Allocation (Spark): For Spark, enabling spark.dynamicAllocation.enabled combined with spark.shuffle.service.enabled allows Spark to dynamically acquire and release executors from YARN based on workload, improving cluster utilization. Ensure YARN queues have sufficient maximum capacity for this to be effective. 
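A minimal sketch of enabling dynamic allocation in spark-defaults.conf (the executor bounds are illustrative assumptions):

```properties
# spark-defaults.conf: dynamic executor allocation on YARN (illustrative bounds)
spark.dynamicAllocation.enabled        true
spark.shuffle.service.enabled          true
spark.dynamicAllocation.minExecutors   2
spark.dynamicAllocation.maxExecutors   50
```

Note that the external shuffle service must also be registered with each NodeManager (spark_shuffle added to yarn.nodemanager.aux-services) so that shuffle data survives executor removal.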

  • Monitoring: Continuously monitor YARN queue utilization, application resource consumption, and node health via the YARN ResourceManager UI (http://<rm_hostname>:8088) and cluster monitoring tools (e.g., Prometheus/Grafana, Ganglia) to fine-tune configurations. 

  • Preemption: CapacityScheduler supports preemption (enabled via yarn.resourcemanager.scheduler.monitor.enable and the policies listed in yarn.resourcemanager.scheduler.monitor.policies), which allows the scheduler to kill containers in queues running over their guaranteed capacity to free up resources for queues that are below their guaranteed capacity. This enforces fairness and resource guarantees. 
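Preemption is switched on in yarn-site.xml through the scheduler monitor; the property names below are per the Hadoop CapacityScheduler documentation:

```xml
<!-- yarn-site.xml: enable CapacityScheduler preemption monitoring -->
<property>
  <name>yarn.resourcemanager.scheduler.monitor.enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.resourcemanager.scheduler.monitor.policies</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy</value>
</property>
```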

Related Articles 

  • Critical Configuration Properties for HDFS, YARN, Spark, and Other Hadoop Components
  • Hadoop/YARN Jobs Not Starting – Stuck in ACCEPTED State
  • Troubleshooting YARN Application Failures
  • Monitoring and Managing Resource Consumption on a Single Node
  • Memory Overhead and OOM in Spark – Tuning Executor Settings