Elastic Async Limits for Queueable and Future Jobs

The elastic limit, in beta as of Summer '26, lets Queueable and future jobs continue beyond the licensed daily async Apex limit, up to twice that limit. Jobs above the licensed allowance are throttled.

That replaces a cliff with a slope. It does not provide twice the normal-speed capacity.

Figure: Salesforce Apex Jobs banner showing async Apex usage over the licensed daily limit

Five ways to use the headroom

1. Monitor both limits

In API 67.0 and later, System.OrgLimits.getMap() exposes the licensed, elastic and processed async values. Track percentage of the normal allowance and percentage of the elastic ceiling.

Those are different signals: crossing the first means work may slow; approaching the second means admission must stop.
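One possible shape for the AsyncLimitStatus helper used in the later snippets is sketched below. It is an illustration under stated assumptions, not a confirmed implementation. It reads the long-standing DailyAsyncApexExecutions entry from System.OrgLimits.getMap() and derives the elastic ceiling as twice the licensed limit, as described above. It also assumes the reported value keeps counting past the licensed limit. The names of any dedicated elastic entries are not given here, so none are used.

```apex
// Sketch only: the method names match the snippets in this article,
// but the bodies are assumptions about how they could be implemented.
public with sharing class AsyncLimitStatus {
    // Existing OrgLimits key for the daily async Apex allowance.
    private static final String LICENSED_KEY = 'DailyAsyncApexExecutions';

    // Percentage of the licensed daily allowance already used.
    public static Decimal licensedUsagePercent() {
        System.OrgLimit lim = System.OrgLimits.getMap().get(LICENSED_KEY);
        if (lim == null) {
            return 0;
        }
        return percent(lim.getValue(), lim.getLimit());
    }

    // Percentage of the elastic ceiling already used.
    // Assumption: the ceiling is exactly twice the licensed limit.
    public static Decimal elasticUsagePercent() {
        System.OrgLimit lim = System.OrgLimits.getMap().get(LICENSED_KEY);
        if (lim == null) {
            return 0;
        }
        return percent(lim.getValue(), lim.getLimit() * 2);
    }

    // Pure helper, kept separate so it can be tested without org data.
    public static Decimal percent(Integer used, Integer ceiling) {
        if (used == null || ceiling == null || ceiling <= 0) {
            return 0;
        }
        return (used * 100.0 / ceiling).setScale(1);
    }
}
```

Keeping the arithmetic in one small method means alerting and admission control share one definition of "percent used".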

2. Classify work before a spike

Not every job deserves equal treatment. A customer receipt may be high priority while search indexing, enrichment or a cache refresh can wait.

Make that a business decision at the call site:

public enum AsyncPriority { HIGH, NORMAL, LOW }

AsyncDispatcher.enqueue(
    new RefreshSearchIndexQueueable(recordIds),
    AsyncPriority.LOW
);

Keep the limit policy inside one dispatcher rather than scattering thresholds through the codebase.

3. Delay low-priority Queueables

System.enqueueJob(job, delayInMinutes) accepts a delay of 0 to 10 minutes. Apply that friction to low-priority work as the org approaches its normal daily allowance, and increase it once the allowance is exceeded.

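public static Id enqueue(Queueable job, AsyncPriority priority) {
    // Percentage of the licensed daily allowance already consumed.
    Decimal usage = AsyncLimitStatus.licensedUsagePercent();

    // Only low-priority work is slowed: 2 minutes from 80%, 8 minutes at or past 100%.
    Integer delay = 0;
    if (priority == AsyncPriority.LOW && usage >= 80) {
        delay = usage < 100 ? 2 : 8;
    }

    // A delay of 0 enqueues immediately; the platform maximum is 10 minutes.
    return System.enqueueJob(job, delay);
}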

Queueable is generally easier than @future for this policy: @future methods have no delay parameter, and Queueable also provides job IDs, richer payloads, chaining and finalizers.

4. Defer work near the elastic ceiling

Delay is not enough when the org is already hot. Stop admitting optional work and persist it for a later worker.

if (AsyncLimitStatus.elasticUsagePercent() < 85) {
    AsyncDispatcher.enqueue(job, AsyncPriority.LOW);
} else {
    insert new Deferred_Work__c(
        Work_Type__c = 'RefreshSearchIndex',
        Payload__c = JSON.serialize(recordIds),
        Run_After__c = System.now().addHours(1)
    );
}

Persisted work needs its own retention, retry and observability rules, but it is preferable to blindly enqueueing into a nearly exhausted org.
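The "later worker" could be a scheduled class along the lines of the sketch below. The class name DeferredWorkDrainer is hypothetical, and the code assumes that recordIds was a List<Id> and that RefreshSearchIndexQueueable accepts one. Retry counts and error handling are left out, so treat it as a starting point only.

```apex
// Hypothetical scheduled worker that drains Deferred_Work__c rows.
public with sharing class DeferredWorkDrainer implements Schedulable {
    public void execute(SchedulableContext ctx) {
        // Stay idle while the org is still close to the elastic ceiling.
        if (AsyncLimitStatus.elasticUsagePercent() >= 85) {
            return;
        }

        // Never pick up more rows than this transaction can still enqueue.
        Integer room = Limits.getLimitQueueableJobs() - Limits.getQueueableJobs();
        if (room <= 0) {
            return;
        }

        Datetime now = System.now();
        List<Deferred_Work__c> due = [
            SELECT Id, Payload__c
            FROM Deferred_Work__c
            WHERE Work_Type__c = 'RefreshSearchIndex'
              AND Run_After__c <= :now
            ORDER BY Run_After__c ASC
            LIMIT :room
        ];

        for (Deferred_Work__c item : due) {
            // Assumption: the payload was written with JSON.serialize(List<Id>).
            List<Id> recordIds =
                (List<Id>) JSON.deserialize(item.Payload__c, List<Id>.class);
            AsyncDispatcher.enqueue(
                new RefreshSearchIndexQueueable(recordIds),
                AsyncPriority.LOW
            );
        }

        // Enqueue and delete share one transaction, so a failure rolls back both.
        delete due;
    }
}
```

Scheduling it with System.schedule on an hourly cron expression would match the one-hour Run_After__c offset used above.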

5. Compress duplicates

Twenty updates to one account rarely require twenty recalculation jobs. Enqueue one job for a set of IDs, or upsert pending work with a unique business key so repeated requests collapse into one item.

This is more valuable than elastic capacity because it removes the spike rather than merely absorbing it.
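A minimal sketch of the "unique business key" variant follows. It assumes a hypothetical Dedupe_Key__c text field on Deferred_Work__c, marked as a unique External ID, and a hypothetical RecalcAccount work type. Neither comes from the platform. Requests are collapsed in memory first, because a single upsert call rejects a list that repeats the same external ID.

```apex
// Sketch only: PendingWork, Dedupe_Key__c and 'RecalcAccount' are illustrative names.
public with sharing class PendingWork {
    public static void requestRecalc(List<Id> accountIds) {
        // Collapse repeated requests for the same account into one row.
        Map<String, Deferred_Work__c> byKey = new Map<String, Deferred_Work__c>();
        for (Id accountId : accountIds) {
            String key = 'RecalcAccount:' + accountId;
            byKey.put(key, new Deferred_Work__c(
                Dedupe_Key__c = key,
                Work_Type__c = 'RecalcAccount',
                Payload__c = JSON.serialize(new List<Id>{ accountId }),
                Run_After__c = System.now()
            ));
        }

        // Upsert on the business key: an existing pending row is updated,
        // not duplicated, when the same account is requested again.
        List<Deferred_Work__c> pending = byKey.values();
        upsert pending Deferred_Work__c.Fields.Dedupe_Key__c;
    }
}
```

With this shape, twenty updates to one account leave a single pending row, whether they arrive in one transaction or across several.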

What does not change

Elastic limits do not remove transaction CPU, SOQL, DML, heap, callout or row-locking constraints. They do not repair Queueables created in loops, retries without backoff or an external service that is already rate-limited.

A practical starting policy is:

below 80% of the licensed limit: enqueue normally
80–100%: delay low-priority work
above 100%: delay normal work and defer optional work
above 85% of the elastic ceiling: accept only critical work
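The policy above can be written as one pure function, which makes the thresholds easy to test and tune. The sketch below is illustrative. AdmissionPolicy and Decision are hypothetical names, "critical" is assumed to mean AsyncPriority.HIGH, and "optional" is assumed to mean AsyncPriority.LOW.

```apex
// Sketch only: maps a priority and two usage percentages to an admission decision.
public class AdmissionPolicy {
    public enum Decision { ENQUEUE, DELAY, DEFER }

    public static Decision decide(
        AsyncPriority priority,
        Decimal licensedPct,
        Decimal elasticPct
    ) {
        // Critical work is always admitted.
        if (priority == AsyncPriority.HIGH) {
            return Decision.ENQUEUE;
        }
        // Above 85% of the elastic ceiling: accept only critical work.
        if (elasticPct > 85) {
            return Decision.DEFER;
        }
        // Above 100% of the licensed limit: delay normal, defer optional.
        if (licensedPct > 100) {
            return priority == AsyncPriority.NORMAL
                ? Decision.DELAY
                : Decision.DEFER;
        }
        // 80-100% of the licensed limit: delay low-priority work only.
        if (licensedPct >= 80) {
            return priority == AsyncPriority.LOW
                ? Decision.DELAY
                : Decision.ENQUEUE;
        }
        // Below 80%: enqueue normally.
        return Decision.ENQUEUE;
    }
}
```

A dispatcher could call decide() with the two values from AsyncLimitStatus and branch on the result, so the percentages live in exactly one place.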

Tune the thresholds with production evidence. The principle is more important than the exact percentages: shape traffic while capacity remains, not after jobs start failing.

Elastic async limits are a shock absorber. The best use is adaptive queuing—monitor the org, protect important work and compress or postpone everything that can wait.