
Use Adaptive Autoscaling to Configure the Scaling of Compute

As an application administrator, you can tune the scaling of compute with adaptive auto-scaling, which supports gradual scale-up and scale-down of node pools.

You can set an adaptive auto-scaling strategy so that task node pools (App.Node.Role#TASK) automatically scale up or down based on the amount of work waiting to be processed in the queue.

By default, auto-scaling for leader node pools is disabled. If enabled, leader nodes use the App.NodePool.AutoScaleStrategy.InvalidationQueues strategy, the same as task nodes.

Set adaptive auto-scaling

Set adaptive auto-scaling for each node pool in a shared node environment by setting an App.NodePool.AutoScaleStrategy on the pool's App.NodePool.AutoScaleSpec.

You can set AutoScaleStrategy for various Types. The App.NodePool.AutoScaleStrategy.InvalidationQueues Type is one of the most common.

You can tune three fields for App.NodePool.AutoScaleStrategy.InvalidationQueues:

  • runIntervalMins - How often the platform checks whether the pool should scale up or down. The default is 10 minutes.

  • maxPctChangeForScaleUp - The maximum percentage by which the pool scales up when there is pending work. For task nodes, this is work in the invalidation queue. The default is a 100% scale up.

  • maxPctChangeForScaleDown - The maximum percentage by which the pool scales down when nodes are idle. The default is a 100% scale down.
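As a sketch, the three fields can be set with the same withFieldAtPath/setConfig pattern used elsewhere in this page. Chaining multiple withFieldAtPath calls and the "task" pool name are assumptions for illustration, not confirmed API behavior:

```javascript
// Hypothetical sketch: tune all three InvalidationQueues strategy fields
// on a task node pool. Verify the config paths in your environment.
C3.app().nodePool("task").config()
    .withFieldAtPath("autoScale.strategy.runIntervalMins", 5)            // check every 5 minutes
    .withFieldAtPath("autoScale.strategy.maxPctChangeForScaleUp", 50)    // at most +50% per event
    .withFieldAtPath("autoScale.strategy.maxPctChangeForScaleDown", 25)  // at most -25% per event
    .setConfig();
```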

Autoscale strategy behavior for InvalidationQueues

The default InvalidationQueues autoscaling strategy monitors pending work in invalidation queues and automatically adjusts node pool sizes based on workload demand.

With adaptive autoscaling, the number of task nodes increases toward the maximum node count if the pending invalidation queue entries exceed the current capacity, and decreases toward the minimum node count if there are fewer pending entries than the current capacity can handle.

The platform calculates current capacity as:

Current capacity = Current nodes × InvalidationQueue#maxConcurrentComputes per node

The adaptive autoscaling target node count equals the number of pending entries divided by the maxConcurrentComputes value, rounded up:

targetNodeCount = Math.ceil(entriesPending / maxConcurrentComputes)
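The two formulas above can be checked with plain arithmetic. The following is a standalone JavaScript sketch (not platform API) using the same names:

```javascript
// Standalone illustration of the capacity and target-node-count formulas.
// maxConcurrentComputes is the per-node InvalidationQueue#maxConcurrentComputes value.
function currentCapacity(currentNodes, maxConcurrentComputes) {
  return currentNodes * maxConcurrentComputes;
}

function targetNodeCount(entriesPending, maxConcurrentComputes) {
  return Math.ceil(entriesPending / maxConcurrentComputes);
}

// Example: 4 nodes with 8 concurrent computes each gives a capacity of 32.
// 100 pending entries then requires ceil(100 / 8) = 13 nodes.
console.log(currentCapacity(4, 8));   // 32
console.log(targetNodeCount(100, 8)); // 13
```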

Scaling constraints

The autoscaling strategy applies the following constraints to prevent rapid scaling changes:

  • Scale-up limit: maxPctChangeForScaleUp prevents scaling up by more than the specified percentage per scaling event
  • Scale-down limit: maxPctChangeForScaleDown prevents scaling down by more than the specified percentage per scaling event
  • Hard boundaries: Always respects the configured minNodeCount and maxNodeCount values
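A hedged sketch of how these constraints could combine, in standalone JavaScript. The order of operations and the use of ceil/floor are assumptions; the platform's exact rounding is not documented here:

```javascript
// Clamp a raw demand-based target by the per-event percentage limits,
// then apply the hard min/max node-count boundaries (assumed ordering).
function clampTarget(current, target, maxPctUp, maxPctDown, minNodes, maxNodes) {
  var upperByPct = Math.ceil(current * (1 + maxPctUp / 100));    // scale-up limit
  var lowerByPct = Math.floor(current * (1 - maxPctDown / 100)); // scale-down limit
  var clamped = Math.max(Math.min(target, upperByPct), lowerByPct);
  return Math.min(Math.max(clamped, minNodes), maxNodes);        // hard boundaries
}

// Example: 10 nodes, demand calls for 30, but a 100% scale-up cap
// allows at most 20 nodes in this scaling event.
console.log(clampTarget(10, 30, 100, 100, 1, 50)); // 20
```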

Node pool specific behavior

Adaptive autoscaling has the following behavior for task, leader, and user-defined node pools:

  • Task nodes: Includes general invalidation queue entries plus cron job entries, and excludes inactive or leader-only jobs
  • Leader nodes: Includes entries for active leader-specific cron jobs
  • User-defined node pools: Includes only invalidation queue entries specifically tagged for that node pool

The autoscaling strategy runs as a scheduled job at regular intervals defined by runIntervalMins (default: 10 minutes) and automatically adjusts the node count based on queue demand.

Create auto-scaling strategies that balance performance and cost requirements for a specific application.

Example

The following code snippet demonstrates how to change an application node pool's App.NodePool.AutoScaleSpec to allow a 200% scale-up:

JavaScript
C3.app().nodePool("task").config().withFieldAtPath("autoScale.strategy.maxPctChangeForScaleUp", 200).setConfig()
