Extend autoscaling
Last updated on October 24, 2024
Hosted Extend service autoscaling strategy
Hosted Extend services use horizontal scaling to match demand. Horizontal scaling works by adjusting the number of running Extend app replicas. Scale-up is immediate: the number of replicas increases in response to demand with no delay or stabilization period. The CPU utilization percentage of each replica (including its internal components) is used as the proxy for demand. The Extend controller keeps the average CPU utilization across all replicas of a hosted Extend service close to 50% by using the following algorithm:
desiredAppReplicas = ceil[currentAppReplicas * ( currentAvgCPUUtilizationPercentage / 50% )]
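The formula can be sketched in Python as follows. This is a minimal illustration, not the Extend controller's actual implementation: the function name `desired_app_replicas` and its parameters are hypothetical, and any minimum or maximum replica limits that a service may enforce are not modeled.

```python
import math

# Target average CPU utilization stated in this document.
TARGET_CPU_PERCENT = 50.0


def desired_app_replicas(current_replicas: int,
                         current_avg_cpu_percent: float,
                         target_cpu_percent: float = TARGET_CPU_PERCENT) -> int:
    """Hypothetical helper that applies the documented scaling formula."""
    if current_replicas < 0 or current_avg_cpu_percent < 0:
        raise ValueError("replica count and CPU utilization must be non-negative")
    if target_cpu_percent <= 0:
        raise ValueError("target CPU utilization must be positive")
    # Multiply before dividing so that whole-number inputs stay exact
    # and ceil() is not pushed up by floating-point rounding.
    return math.ceil(current_replicas * current_avg_cpu_percent / target_cpu_percent)
```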
For example, in a scale-out scenario, given that:
- The current average CPU utilization of the app replicas is 100% (the app replicas are fully utilized).
- There are 2 app replicas currently running.
desiredAppReplicas = ceil[2 * ( 100% / 50% )]
desiredAppReplicas = 4
The desired number of app replicas will be 4.
As another example, in a scale-in scenario, given that:
- The current average CPU utilization of the app replicas is 20% (the app replicas are underutilized).
- There are 4 app replicas currently running.
desiredAppReplicas = ceil[4 * ( 20% / 50% )]
desiredAppReplicas = ceil[1.6]
desiredAppReplicas = 2
The desired number of app replicas will be 2.
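To see why this rule keeps utilization near 50%, the two examples can be replayed with a rough projection of the average CPU utilization after scaling. The sketch below assumes that total CPU demand stays constant during the scaling step and is spread evenly across replicas, which real workloads only approximate. The function name `project_scaling` is hypothetical and is not part of any Extend API.

```python
import math


def project_scaling(current_replicas: int,
                    current_avg_cpu_percent: float,
                    target_cpu_percent: float = 50.0) -> tuple[int, float]:
    """Return (desired replicas, projected average CPU %) under the stated assumptions."""
    # Total demand in "replica-percent"; assumed unchanged by the scaling step.
    total_load = current_replicas * current_avg_cpu_percent
    desired = math.ceil(total_load / target_cpu_percent)
    if desired == 0:
        # No load at all: nothing to project (guard against division by zero).
        return 0, 0.0
    # Assumed even distribution of the same load over the new replica count.
    return desired, total_load / desired


for replicas, cpu in [(2, 100.0), (4, 20.0)]:
    desired, projected = project_scaling(replicas, cpu)
    print(f"{replicas} replicas at {cpu:.0f}% -> {desired} replicas at about {projected:.0f}%")
# 2 replicas at 100% -> 4 replicas at about 50%
# 4 replicas at 20% -> 2 replicas at about 40%
```

In the scale-in case the projected average is 40% rather than 50% because the ceiling rounds 1.6 up to 2 replicas, so rounding always leaves spare capacity rather than a shortfall.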