Azure VM Scale Sets (VMSS)
Intro
My notes on VM Scale Sets
Documentation
Quickstart: Create a virtual machine scale set in the Azure portal
Azure CLI commands: az vmss
Tutorial: Install applications in virtual machine scale sets with an Azure template
Tips And Tidbits
Vertical scaling, also known as scale up and scale down, means increasing or decreasing virtual machine sizes in response to a workload.
It's dependent on the availability of larger hardware, which quickly hits an upper limit and can vary by region.
Vertical scaling also usually requires a virtual machine to stop and restart.
A VMSS can only be deployed in one region (i.e., it doesn't span regions).
A zone-redundant scale set lets you create a single scale set that spans multiple zones.
A regional (non-zonal) scale set uses placement groups, which act as an implicit availability set with five fault domains and five update domains.
Scale sets of more than 100 VMs span multiple placement groups.
Horizontal scaling, also referred to as scale out and scale in, means the number of VMs is altered depending on the workload: the number of virtual machine instances increases (scale out) or decreases (scale in).
Azure virtual machine scale sets let you create and manage a group of load balanced VMs.
The number of VM instances can automatically increase or decrease in response to demand or a defined schedule.
There is no cost for the scale set itself; you only pay for each VM instance that you create.
Virtual machine scale sets simplify designing for high availability by aligning fault domains and update domains.
You only have to define the fault domain count for the scale set. The number of fault domains available to scale sets may vary by region.
With scale sets, all VM instances are created from the same base OS image and configuration.
You can only deploy and manage a set of identical VMs.
Scale sets support the use of the Azure load balancer for basic layer-4 traffic distribution, and Azure Application Gateway for more advanced layer-7 traffic distribution and TLS termination.
Scale sets are used to run multiple instances of your application.
If one of these VM instances has a problem, customers continue to access your application through one of the other VM instances with minimal interruption.
Scale sets can automatically increase the number of VM instances as application demand increases, then reduce the number of VM instances as demand decreases.
Scale sets support up to 1,000 VM instances. If you create and upload your own custom VM images, the limit is 600 VM instances.
Requires use of managed disks
See more requirements for Working with large virtual machine scale sets
There is no additional cost to scale sets.
You only pay for the underlying compute resources such as the VM instances, load balancer, or Managed Disk storage.
The management and automation features, such as autoscale and redundancy, incur no additional charges over the use of VMs
You can assign an NSG to a VMSS by specifying the NSG as part of the NIC’s configuration in an ARM template. In this case, each NIC for VMs in the VMSS will have the NSG applied.
You can create a VMSS with 0, 1, or up to 1,000 VMs.
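The instance-count limits noted above (0 to 1,000 instances, or 600 with custom images) can be checked before a deployment. This is only a minimal sketch: the helper names `max_instances` and `validate_capacity` are made up for illustration, and the numbers are the ones recorded in these notes, which may change over time.

```python
# Limits as recorded in these notes; treat them as assumptions that may change.
PLATFORM_IMAGE_LIMIT = 1000
CUSTOM_IMAGE_LIMIT = 600


def max_instances(custom_image: bool) -> int:
    """Return the instance ceiling for a scale set (hypothetical helper)."""
    return CUSTOM_IMAGE_LIMIT if custom_image else PLATFORM_IMAGE_LIMIT


def validate_capacity(requested: int, custom_image: bool = False) -> bool:
    """True when the requested instance count fits within the stated limit."""
    return 0 <= requested <= max_instances(custom_image)
```

For example, `validate_capacity(601, custom_image=True)` is rejected, while `validate_capacity(0)` is accepted because an empty scale set is allowed.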
Networking for Azure virtual machine scale sets
In general, Azure scale set virtual machines do not require their own public IP addresses.
For most scenarios, it is more economical and secure to associate a public IP address to a load balancer or to an individual virtual machine (also known as a jumpbox), which then routes incoming connections to scale set virtual machines as needed (for example, through inbound NAT rules).
Autoscale Rule Evaluation
Cooldown: the amount of time to wait after a scale operation before scaling again.
For example, if cooldown = “PT10M”, Autoscale does not attempt to scale again for another 10 minutes.
The cooldown is to allow the metrics to stabilize after the addition or removal of instances.
An explanation on how cool down works: https://github.com/MicrosoftDocs/azure-docs/issues/17169#issuecomment-431136702
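To make the cooldown value concrete, here is a rough sketch of reading a duration such as "PT10M" and checking whether the window has passed. It is a simplified model, not how the Autoscale service is implemented; `parse_cooldown` and `can_scale` are hypothetical names, and only the hours/minutes/seconds part of ISO 8601 durations is handled.

```python
import re
from datetime import datetime, timedelta


def parse_cooldown(value: str) -> timedelta:
    """Parse a simple ISO 8601 time duration such as 'PT10M' or 'PT1H30M'."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?", value)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = (int(g) if g else 0 for g in match.groups())
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def can_scale(last_scale: datetime, now: datetime, cooldown: str) -> bool:
    """True once the cooldown window since the last scale operation has elapsed."""
    return now - last_scale >= parse_cooldown(cooldown)
```

With a cooldown of "PT10M", a check 9 minutes after the last scale operation returns False, and a check at 10 minutes or later returns True.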
All thresholds are calculated at an instance level. For example, "scale out by one instance when average CPU > 80% when instance count is 2", means scale-out when the average CPU across all instances is greater than 80%.
If one or more scale-out rules are triggered, Autoscale calculates the new capacity determined by the scaleAction of each of those rules.
Then it scales out to the maximum of the computation for each scale-out rule (i.e., the highest computed capacity is used).
Autoscale only takes a scale-in action if all of the scale-in rules are triggered.
Autoscale calculates the new capacity determined by the scaleAction of each of those rules.
Then it chooses the scale action that results in the maximum of those capacities to ensure service availability.
Scaling in won't take effect during an evaluation cycle if applying the rule would cause an immediate scale-out (this avoids flapping).
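The rule evaluation described above can be sketched roughly as follows. This is my own simplified model, not Azure's implementation: the `Rule` class and both functions are hypothetical, only "change by N instances" actions are modelled, and clamping to the profile's minimum and maximum instance counts is an assumption added here.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    triggered: bool  # did the metric condition match in this evaluation cycle?
    change: int      # instances to add (scale-out) or remove (scale-in)


def scale_out_capacity(current: int, rules: list[Rule], maximum: int) -> int:
    """Any triggered scale-out rule counts; the highest resulting capacity wins."""
    proposed = [current + r.change for r in rules if r.triggered]
    if not proposed:
        return current  # no scale-out rule triggered
    return min(max(proposed), maximum)  # assumed clamp to the profile maximum


def scale_in_capacity(current: int, rules: list[Rule], minimum: int) -> int:
    """Scale in only if ALL scale-in rules triggered; keep the largest capacity."""
    if not rules or not all(r.triggered for r in rules):
        return current
    proposed = [current - r.change for r in rules]
    return max(max(proposed), minimum)  # assumed clamp to the profile minimum
```

Starting from 5 instances with two triggered scale-in rules that remove 1 and 2 instances, the sketch settles on 4, the more conservative of the two results.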
As an example, consider the following rule combination, which leaves a gap between the scale-out and scale-in thresholds.
Increase instances by 1 count when CPU% >= 80
Decrease instances by 1 count when CPU% <= 60
In this case
Assume there are 2 instances to start with.
If the average CPU% across instances goes to 80, autoscale scales out adding a third instance.
Now assume that over time the CPU% falls to 60.
Autoscale's scale-in rule estimates the final state if it were to scale-in. For example, 60 x 3 (current instance count) = 180 / 2 (final number of instances when scaled down) = 90. So autoscale does not scale-in because it would have to scale-out again immediately. Instead, it skips scaling down.
The next time autoscale checks, the CPU has fallen further, to 50. It estimates again: 50 x 3 instances = 150 / 2 instances = 75, which is below the scale-out threshold of 80, so it scales in successfully to 2 instances.
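The arithmetic in this example can be reproduced with a small sketch. The function names and default thresholds below are illustrative only (they mirror the 80/60 rules above), and the real service may apply additional checks.

```python
def projected_cpu(avg_cpu: float, current: int, after: int) -> float:
    """Estimate average CPU if the same total load ran on `after` instances."""
    return avg_cpu * current / after


def should_scale_in(
    avg_cpu: float,
    current: int,
    scale_in_threshold: float = 60,
    scale_out_threshold: float = 80,
    step: int = 1,
) -> bool:
    """Scale in only if the rule fires and the result would not trigger scale-out."""
    if avg_cpu > scale_in_threshold or current - step < 1:
        return False  # scale-in rule not triggered, or nothing left to remove
    # The scale-out rule fires at >= threshold, so the estimate must stay below it.
    return projected_cpu(avg_cpu, current, current - step) < scale_out_threshold


print(projected_cpu(60, 3, 2), should_scale_in(60, 3))  # estimate 90: skip scale-in
print(projected_cpu(50, 3, 2), should_scale_in(50, 3))  # estimate 75: scale in
```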
VMSS on Dedicated Hosts
Notes on creating an Azure dedicated host to host your virtual machines (VMs) and scale set instances.
A host group is a resource that represents a collection of dedicated hosts. You create a host group in a region and an availability zone, and add hosts to it.
When planning for high availability, there are additional options. You can use one or both of the following options with your dedicated hosts:
Span across multiple availability zones. In this case, you are required to have a host group in each of the zones you wish to use.
Span across multiple fault domains which are mapped to physical racks.
When you deploy a scale set, you specify the host group.
When using Virtual Machine Scale Sets with Dedicated Hosts, each scale set must be linked to a host group. Thus, if we have 3 host groups, we also need 3 scale sets.
Using Azure CLI To Create A VM Scale Set
az vmss create --name app-scaleset --resource-group rg1lod16727460 --generate-ssh-keys --image app-server-image --instance-count 3 --lb "myLB"
SSH key files '/home/user1-16727460/.ssh/id_rsa' and '/home/user1-16727460/.ssh/id_rsa.pub' have been generated under ~/.ssh to allow SSH access to the VM. If using machines without permanent storage, back up your keys to a safe location.
{
"vmss": {
"doNotRunExtensionsOnOverprovisionedVMs": false,
"overprovision": true,
"provisioningState": "Succeeded",
"singlePlacementGroup": true,
"uniqueId": "efc14ecb-3b56-484b-b844-e7d48c3a68ba",
"upgradePolicy": {
"mode": "Manual",
"rollingUpgradePolicy": {
"maxBatchInstancePercent": 20,
"maxUnhealthyInstancePercent": 20,
"maxUnhealthyUpgradedInstancePercent": 20,
"pauseTimeBetweenBatches": "PT0S"
}
},
"virtualMachineProfile": {
"networkProfile": {
"networkInterfaceConfigurations": [
{
"name": "appsc31f5Nic",
"properties": {
"dnsSettings": {
"dnsServers": []
},
"enableAcceleratedNetworking": false,
"enableIPForwarding": false,
"ipConfigurations": [
{
"name": "appsc31f5IPConfig",
"properties": {
"loadBalancerBackendAddressPools": [
{
"id": "/subscriptions/6474fbd5-eeef-432f-94ec-861410d81d1d/resourceGroups/rg1lod16727460/providers/Microsoft.Network/loadBalancers/myLB/backendAddressPools/myLBBEPool",
"resourceGroup": "rg1lod16727460"
}
],
"loadBalancerInboundNatPools": [
{
"id": "/subscriptions/6474fbd5-eeef-432f-94ec-861410d81d1d/resourceGroups/rg1lod16727460/providers/Microsoft.Network/loadBalancers/myLB/inboundNatPools/myLBNatPool",
"resourceGroup": "rg1lod16727460"
}
],
"privateIPAddressVersion": "IPv4",
"subnet": {
"id": "/subscriptions/6474fbd5-eeef-432f-94ec-861410d81d1d/resourceGroups/rg1lod16727460/providers/Microsoft.Network/virtualNetworks/app-server-vnet/subnets/subnet",
"resourceGroup": "rg1lod16727460"
}
}
}
],
"primary": true
}
}
]
},
"osProfile": {
"adminUsername": "user1-16727460",
"allowExtensionOperations": true,
"computerNamePrefix": "appsc31f5",
"linuxConfiguration": {
"disablePasswordAuthentication": true,
"provisionVMAgent": true,
"ssh": {
"publicKeys": [
{
"keyData": "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC34WdO0RSfJXROzZS99qZhUwiY1MdoH7ic+pc9jS/uzjMEZd1nYOVxMfB75pJYEqaUc9Nq/AIWZgMmtRob8vZmUjo4ycZYPpR/zbG6823S4c5uK8+CvVTv3URHvwdUsnablS/ewtul04WtPnZUgMWOzPKutUdUeeaWBf4EXXJlviUiXBPW7NtTZK3RjfkhLOYwxV1bc0W8b9iMF7Oe2KVsJjrPsjZms9QitnVaPZz17M6M/wNZuHEw83CvTIEpajyF4Oru51MJu5AGG8+MNXpVy07livBfSrHlo6eXqB+xuVead2CWrgTNmDeP40XZhKpOiKlD4tSk9nZ8BSklAK6F",
"path": "/home/user1-16727460/.ssh/authorized_keys"
}
]
}
},
"requireGuestProvisionSignal": true,
"secrets": []
},
"storageProfile": {
"imageReference": {
"id": "/subscriptions/6474fbd5-eeef-432f-94ec-861410d81d1d/resourceGroups/rg1lod16727460/providers/Microsoft.Compute/images/app-server-image",
"resourceGroup": "rg1lod16727460"
},
"osDisk": {
"caching": "ReadWrite",
"createOption": "FromImage",
"diskSizeGB": 30,
"managedDisk": {
"storageAccountType": "Premium_LRS"
},
"osType": "Linux"
}
}
}
}
}
Choosing the right number of fault domains for virtual machine scale set
Virtual machine scale sets are created with five fault domains by default in Azure regions with no zones.
For the regions that support zonal deployment of virtual machine scale sets and this option is selected, the default value of the fault domain count is 1 for each of the zones.
FD=1 in this case implies that the VM instances belonging to the scale set will be spread across many racks on a best effort basis.
You can set the parameter --platform-fault-domain-count to 1, 2, or 3 (default of 3 if not specified).