Intro

My notes on autoscaling the App Service

Documentation

Tips and Tidbits

Scale up an app in Azure App Service
Examine autoscale factors
Scale Up: Get more CPU, memory, disk space, and extra features like dedicated virtual machines (VMs), custom domains and certificates, staging slots, autoscaling, and more.
- You scale up by manually changing the pricing tier of the App Service plan that your app belongs to.
Scale Out: Increase the number of VM instances that run your app.
- You can scale out to as many as 30 instances, depending on your pricing tier.
- App Service Environments (ASE) in the Isolated tier further increase your scale-out count to 100 instances.
- For more information about scaling out, see Scale instance count manually or automatically.

Autoscaling is a cloud system or process that adjusts available resources based on the current demand.
- Autoscaling performs scaling in and out, as opposed to scaling up and down.
Autoscaling provides elasticity for your services.
Autoscaling responds to changes in the environment by adding or removing web servers and balancing the load between them.
- Autoscaling doesn't have any effect on the CPU power, memory, or storage capacity of the web servers powering the app; it only changes the number of web servers.
A rule specifies the threshold for a metric, and triggers an autoscale event when this threshold is crossed.
- Autoscaling can also deallocate resources when the workload has diminished.
To prevent runaway autoscaling, an App Service Plan has an instance limit.
- Plans in more expensive pricing tiers have a higher limit.
- Autoscaling cannot create more instances than this limit.
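A minimal sketch of how the plan's instance limit caps what autoscaling can reach. The function name and the limit of 10 are illustrative assumptions, not part of any Azure API.

```python
def clamp_instance_count(requested: int, minimum: int, plan_limit: int) -> int:
    """Return the instance count autoscale could actually reach.

    `plan_limit` stands in for the App Service Plan's instance limit;
    the value used below is hypothetical.
    """
    if minimum > plan_limit:
        raise ValueError("minimum cannot exceed the plan's instance limit")
    # Never go below the minimum, never go above the plan limit.
    return max(minimum, min(requested, plan_limit))


# A rule asks for 14 instances, but the (hypothetical) plan limit is 10.
print(clamp_instance_count(requested=14, minimum=1, plan_limit=10))  # 10
```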
Identify autoscale factors
Azure provides two options for autoscaling:
- Scale based on a metric, such as the length of the disk queue, or the number of HTTP requests awaiting processing.
- Scale to a specific instance count according to a schedule. For example, you can arrange to scale out at a particular time of day, or on a specific date or day of the week.
- If you need to scale out incrementally, you can combine metric and schedule-based autoscaling in the same autoscale condition.
You can also scale based on metrics for other Azure services.
- For example, if the web app processes requests received from a Service Bus Queue, you might want to spin up additional instances of a web app if the number of items held in an Azure Service Bus Queue exceeds a critical length.
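The two options can be sketched roughly as follows. This is a plain-Python model, not Azure SDK code; the function names, the queue lengths, and the 08:00 to 18:00 window are all assumptions made for illustration.

```python
from datetime import datetime, time


def metric_based_count(current: int, queue_length: int, critical_length: int) -> int:
    """Scale out by one instance when the queue is longer than the critical length."""
    return current + 1 if queue_length > critical_length else current


def schedule_based_count(now: datetime, start: time, end: time,
                         scheduled: int, default: int) -> int:
    """Use a fixed instance count inside the daily window, the default outside it."""
    return scheduled if start <= now.time() < end else default


# Metric-based: a (hypothetical) Service Bus queue holds 1200 items, critical length 1000.
print(metric_based_count(current=2, queue_length=1200, critical_length=1000))  # 3

# Schedule-based: 5 instances between 08:00 and 18:00, otherwise 2.
print(schedule_based_count(datetime(2024, 1, 8, 9, 30), time(8), time(18), 5, 2))  # 5
```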

Explore autoscale best practices
An autoscale setting scales instances horizontally: out by increasing the number of instances, and in by decreasing it.
- An autoscale setting has a maximum, minimum, and default value of instances.
Before scaling in, autoscale tries to estimate what the final state will be if it scaled in.
- For example, suppose the rules scale out at an average of 600 threads per instance, and three instances currently average 575 threads each.
- Scaling in to two instances would leave 575 x 3 = 1,725 threads shared by 2 instances, or 1,725 / 2 = 862.5 threads per instance.
- This means autoscale would have to scale out again immediately after scaling in, even if the average thread count stays the same or falls only a small amount.
- If it scaled out again, the whole process would repeat, leading to an infinite loop.
- To avoid this situation (termed "flapping"), autoscale does not scale in at all.
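The estimate above can be reproduced with a short sketch. The 600-thread scale-out threshold and the function name are assumptions for illustration; the notes only give the 575-thread, 3-instance figures.

```python
def would_flap(avg_per_instance: float, current_instances: int,
               scale_out_threshold: float, step: int = 1) -> bool:
    """Estimate the per-instance load after scaling in by `step` instances and
    report whether it would immediately satisfy the scale-out rule again."""
    remaining = current_instances - step
    if remaining < 1:
        raise ValueError("cannot scale in below one instance")
    total_load = avg_per_instance * current_instances
    projected = total_load / remaining
    return projected >= scale_out_threshold


# 575 threads on each of 3 instances, projected onto 2 instances.
print(575 * 3 / 2)              # 862.5
# With an assumed scale-out threshold of 600, scaling in would flap.
print(would_flap(575, 3, 600))  # True
```

When the function returns True, the scale-in is skipped, which matches the behavior described above.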

Not all pricing tiers support autoscaling. The development pricing tiers are either limited to a single instance (the F1 and D1 tiers), or they only provide manual scaling (the B1 tier). If you've selected one of these tiers, you must first scale up to the S1 or any of the P-level production tiers.

Autoscale a web app
By default, an App Service Plan only implements manual scaling.
Modify the App Service Plan for a web app to implement autoscaling.
- An App Service Plan has scale-out settings that you use to enable autoscaling, add autoscale conditions, and define autoscale rules.

Steps:

Configure the web app to use the Standard App Service tier.
- Standard is the lowest-cost tier that supports autoscaling, so it keeps the cost down.
Enable autoscaling on the web app.
Add or configure a scale condition.
Add a scale rule.
- A scale rule belongs to a scale condition.

You enable autoscaling with the Enable autoscale button on the Scale out page for an App Service Plan.

Once you enable autoscaling, you can edit the default scale condition, and you can add your own custom scale conditions.
Remember that each scale condition can either scale based on a metric, or scale to a specific instance count.

A metric-based scale condition contains one or more scale rules. Initially, a scale condition contains only a default rule. You use the Add a rule link to add your own custom rules.
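One way to picture the relationship between conditions and rules is the data model below. It is a hypothetical sketch, not the Azure resource schema; the class names, field names, and the metric name are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ScaleRule:
    metric: str        # illustrative metric name, e.g. "CpuPercentage"
    operator: str      # ">" or "<"
    threshold: float
    change: int        # +1 to scale out, -1 to scale in


@dataclass
class ScaleCondition:
    name: str
    rules: List[ScaleRule] = field(default_factory=list)  # used when metric-based
    fixed_instance_count: Optional[int] = None             # used when scaling to a count

    @property
    def mode(self) -> str:
        """Report whether the condition scales on metrics or to a specific count."""
        return "metric" if self.rules else "fixed"


default = ScaleCondition(name="Default", fixed_instance_count=1)
busy = ScaleCondition(name="Busy hours",
                      rules=[ScaleRule("CpuPercentage", ">", 70.0, +1)])
print(default.mode, busy.mode)  # fixed metric
```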

How Autoscale Analyzes Metrics

How an autoscale rule analyzes metrics
Autoscaling works by analyzing trends in metric values over time across all instances.
Analysis is a multi-step process.
- First step:
  - An autoscale rule aggregates the values retrieved for a metric across all instances over a period of time known as the time grain.
  - Each metric has its own intrinsic time grain, but in most cases this period is 1 minute.
  - The aggregated value is known as the time aggregation.
- Second step:
  - The rule further aggregates the time aggregation values over a longer, user-specified period known as the Duration.
  - The minimum Duration is 5 minutes.
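The two aggregation steps can be sketched as follows, assuming a 1-minute time grain and using the average for both steps. The function names and sample values are illustrative, and a real rule may use other aggregations such as minimum, maximum, or sum.

```python
from statistics import mean
from typing import List


def time_aggregation(samples_per_instance: List[float]) -> float:
    """Step 1: aggregate one time grain's values across all instances (average here)."""
    return mean(samples_per_instance)


def duration_aggregation(grains: List[List[float]], duration_minutes: int = 5) -> float:
    """Step 2: aggregate the most recent time aggregations over the Duration.

    Assumes one grain per minute, so the last `duration_minutes` grains are used.
    """
    if duration_minutes < 5:
        raise ValueError("the minimum Duration is 5 minutes")
    recent = grains[-duration_minutes:]
    return mean(time_aggregation(g) for g in recent)


# CPU % for two instances over five 1-minute grains (made-up numbers).
grains = [[60.0, 80.0], [70.0, 90.0], [80.0, 80.0], [90.0, 70.0], [75.0, 85.0]]
print(duration_aggregation(grains))  # 78.0
```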
An autoscale action has a cool down period, specified in minutes.
- During this interval, the scale rule won't be triggered again.
- This is to allow the system to stabilize between autoscale events.
- The minimum cool down period is five minutes.
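A minimal sketch of the cool-down check, assuming the caller records when the last autoscale action ran. The function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Optional


def can_trigger(now: datetime, last_scale_action: Optional[datetime],
                cool_down_minutes: int = 5) -> bool:
    """Return True if the cool-down period since the last action has elapsed."""
    if cool_down_minutes < 5:
        raise ValueError("the minimum cool down period is five minutes")
    if last_scale_action is None:
        return True  # nothing has scaled yet, so there is no cool down to wait for
    return now - last_scale_action >= timedelta(minutes=cool_down_minutes)


last = datetime(2024, 1, 8, 12, 0)
print(can_trigger(datetime(2024, 1, 8, 12, 3), last))  # False
print(can_trigger(datetime(2024, 1, 8, 12, 5), last))  # True
```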
When determining whether to scale out, the autoscale action will be performed if any of the scale-out rules are met.
- When scaling in, the autoscale action will run only if all of the scale-in rules are met.
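The any/all asymmetry can be expressed in a few lines. This sketch assumes scale-out rules are checked before scale-in rules, which is an assumption on top of these notes. The function name is illustrative.

```python
from typing import List


def decide(scale_out_met: List[bool], scale_in_met: List[bool]) -> str:
    """Scale out if ANY scale-out rule is met; scale in only if ALL scale-in rules are met."""
    if any(scale_out_met):
        return "scale out"
    # Guard against an empty list, for which all() would return True.
    if scale_in_met and all(scale_in_met):
        return "scale in"
    return "no change"


print(decide([False, True], [True, True]))    # scale out
print(decide([False, False], [True, False]))  # no change
print(decide([False, False], [True, True]))   # scale in
```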
