
Monitor your Azure Landing Zone with Baseline Alerts


When deploying Azure resources, it is crucial to configure alerts to ensure your resources' health, performance, and security.

By setting up alerts, you can proactively monitor your resources and take timely actions to address any issues that may arise.

Here are the key reasons why configuring alerts is essential:

  • Early detection of issues: Alerts enable you to identify potential problems or anomalies in your Azure resources at an early stage. By monitoring key metrics and logs, you can detect high CPU usage, low memory, network connectivity problems, or security breaches. This allows you to take immediate action and prevent any negative impact on your applications or services.
  • Reduced downtime: By configuring alerts, you can minimise downtime by being notified of critical events or failures in real time. This allows you to quickly investigate and resolve issues before they escalate, ensuring the availability and reliability of your applications.
  • Optimised resource utilisation: Alerts help you optimise resource utilisation by providing insights into resource usage patterns and trends. By monitoring metrics such as CPU utilisation, memory consumption, or storage capacity, you can identify opportunities for optimisation and cost savings.
  • Compliance and security: Configuring alerts is essential for maintaining compliance with regulatory requirements and ensuring the security of your Azure resources. By monitoring security logs and detecting suspicious activities or unauthorised access attempts, you can immediately mitigate potential risks and protect your data.
  • Proactive capacity planning: Alerts provide valuable information for capacity planning and scaling your resources. By monitoring resource utilisation trends over time, you can identify patterns and forecast future resource requirements. This helps you avoid performance bottlenecks and ensure a smooth user experience.

Overview

Azure Monitor Baseline Alerts (AMBA) exists to help you get started with proactive and reactive monitoring right off the bat, focused on standard scenarios for your platform and application Landing Zones that are not industry- or application-specific.

Previously hosted in the azure/alz-monitor repository, it is now generally available in the Azure/azure-monitor-baseline-alerts GitHub repository.

The objectives of AMBA are to:

  • Help simplify onboarding to Azure Monitor through a scalable and consistent approach.
  • Reduce time to identify failure within Azure tenants and platforms (i.e., outages).

More documentation can be found on the official Microsoft Learn documentation page: Monitor Azure platform landing zone components.

Monitor your Azure Landing Zone with Azure Monitor Baseline Alerts

"AMBA provides best practice guidance around creating and configuring Azure Monitor Alerts for Azure services and workload patterns/scenarios."

One of the things that makes Azure Monitor Baseline Alerts (AMBA) so useful is the detail it provides about each alert.

Reference: Alerts Details

Out-of-the-box alerts that are part of the base AMBA solution include (but are not limited to) alerts such as:

| Resource Type | Name | Description | metricName |
|---|---|---|---|
| automationAccounts | TotalJob | The total number of jobs | TotalJob |
| storageAccounts | Availability | The percentage of availability for the storage service or the specified API operation. Availability is calculated by taking the TotalBillableRequests value and dividing it by the number of applicable requests, including those that produced unexpected errors. All unexpected errors result in reduced availability for the storage service or the specified API operation. | Availability |
| virtualMachines | Available Memory Bytes (MBytes) | Amount of physical memory, in bytes, immediately available for allocation to a process or for system use in the Virtual Machine | Available Memory Bytes |
| virtualMachineScaleSets | Percentage CPU | The percentage of allocated compute units that are currently in use by the Virtual Machine(s) | Percentage CPU |
| virtualMachineScaleSets | OS Disk IOPS Consumed Percentage | Percentage of operating system disk I/Os consumed per minute | OS Disk IOPS Consumed Percentage |
| virtualMachineScaleSets | Data Disk IOPS Consumed Percentage | Percentage of data disk I/Os consumed per minute | Data Disk IOPS Consumed Percentage |
| virtualMachineScaleSets | Outbound Flows | Outbound Flows are number of current flows in the outbound direction (traffic going out of the VM) | Outbound Flows |
| virtualMachineScaleSets | Inbound Flows | Inbound Flows are number of current flows in the inbound direction (traffic going into the VM) | Inbound Flows |
| virtualMachineScaleSets | Available Memory Bytes | Amount of physical memory, in bytes, immediately available for allocation to a process or for system use in the Virtual Machine | Available Memory Bytes |
| virtualMachineScaleSets | Network In Total | The number of bytes received on all network interfaces by the Virtual Machine(s) (Incoming Traffic) | Network In Total |
| virtualMachineScaleSets | Network Out Total | The number of bytes out on all network interfaces by the Virtual Machine(s) (Outgoing Traffic) | Network Out Total |
| virtualMachineScaleSets | VmAvailabilityMetric | Measure of Availability of Virtual machines over time. | VmAvailabilityMetric |
| virtualMachineScaleSets | Disk Read Operations/Sec | Disk Read IOPS | Disk Read Operations/Sec |
| virtualMachineScaleSets | Disk Write Operations/Sec | Disk Write IOPS | Disk Write Operations/Sec |
| virtualNetworks | If Under DDoS Attack | Metric Alert for VNet DDOS Attack | ifunderddosattack |
| publicIPAddresses | Bytes In DDoS | Metric Alert for Public IP Address Bytes IN DDOS | bytesinddos |
| publicIPAddresses | If Under DDoS Attack | Metric Alert for Public IP Address Under Attack | ifunderddosattack |
| publicIPAddresses | Packets In DDoS | Inbound packets DDoS | PacketsInDDoS |
| publicIPAddresses | VIP Availability | Average IP Address availability per time duration | VipAvailability |
| expressRouteCircuits | ARP Availability | ARP Availability from MSEE towards all peers. | ArpAvailability |
| expressRouteCircuits | BGP Availability | BGP Availability from MSEE towards all peers. | BgpAvailability |

These alerts are provided for the following alert types:

  • Metric Alerts
  • Log Alerts
  • Activity Logs
  • Service health alerts
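
To make the storageAccounts Availability description in the table concrete, here is a minimal Python sketch of the arithmetic it describes: billable requests divided by applicable requests, expressed as a percentage. The function name and the sample numbers are hypothetical and only illustrate the calculation; Azure computes this metric for you.

```python
def availability_percent(total_billable_requests: int, applicable_requests: int) -> float:
    """Availability as a percentage, following the table's description.

    applicable_requests includes requests that produced unexpected errors,
    so every unexpected error lowers the result.
    """
    if applicable_requests <= 0:
        # Assumption: treat "no applicable requests" as invalid input here.
        raise ValueError("applicable_requests must be positive")
    return 100.0 * total_billable_requests / applicable_requests


# Hypothetical sample: 1,000 applicable requests, 10 of which failed unexpectedly.
print(availability_percent(990, 1000))
```

An alert on this metric would fire when the percentage drops below whatever threshold the alert rule is configured with.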

Graph 1

Graph 2

Monitoring Management Resource Group

Deployment

The Azure Monitor Baseline Alerts are deployed using Azure Policy with the DeployIfNotExists (DINE) effect and are included in the Azure Enterprise-Scale landing zone accelerator.

Deployment is currently done via an Azure Resource Manager (ARM) template and can be deployed with the following:

  • PowerShell
  • Azure CLI
  • GitHub Actions DevOps

Because the Azure Monitor Baseline alerts are intended to follow the Azure Landing Zone framework, in accordance with the Ready phase of the Cloud Adoption Framework, the deployment can be scaled across multiple Management groups, such as:

  • Connectivity
  • Identity
  • Management
  • Service Health
  • Landing Zone (i.e., Application)

Alternatively, the deployment can target a single management group or subscription containing all features, with Alert processing rules sending the relevant subscription alerts to an Action group that notifies you by email when an alert is raised.

For my own Azure environment, I only have 2 Subscriptions that sit under the following Management Group hierarchy:

Management Group structure

We will deploy these initiatives to the mg management group, which then allows us to assign them to anything beneath it; in our case, the mg-landingzones and trey-platform management groups.

As part of the deployment, we will adjust the following parameters to match our environment:

  • Management groups
  • Assignment of PolicySetDefinitions
  • Resource Group Name and Tags
  • Location
  • Action group emails

We could also adjust the alert rule parameters, such as the severity, the policy effect, whether alerts auto-mitigate, or the MonitorDisable tag (which can be used to exclude resources from alert monitoring); we will leave these at their defaults.

Make sure you review and test these policies before proceeding to Production!

We will deploy the ARM template using the Azure Cloud Shell.

  1. Log in to the Azure Portal

  2. Click on Cloud Shell (top right)

  3. Select your subscription and create a Storage account to host your Cloud Shell drives (if it hasn't already been done)

  4. Type in:

    git clone https://github.com/Azure/azure-monitor-baseline-alerts

  5. Type in:

    cd azure-monitor-baseline-alerts/patterns/alz/

  6. Type in:

    nano alzArm.param.json

I'm going to adjust the parameters to match my environment by changing the following (these will need to be tuned to your specific environment):

| Parameters | Values |
|---|---|
| enterpriseScaleCompanyPrefix | lukegeeknz |
| platformManagementGroup | trey-platform |
| IdentityManagementGroup | mg-landingzones |
| managementManagementGroup | trey-platform |
| connectivityManagementGroup | trey-platform |
| LandingZoneManagementGroup | mg-landingzones |
| ALZMonitorResourceGroupName | rg-management-monitoring-001 |
| ALZMonitorResourceGroupLocation | australiaeast |
| ALZMonitorActionGroupEmail | [email protected] |

If you want the alerts to go to more than one email address, add them as a comma-separated list (e.g. "[email protected], [email protected], [email protected]").
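
If you would rather script these changes than edit the file by hand, the following is a minimal Python sketch of the idea. It assumes the parameter file uses the standard ARM shape (a top-level "parameters" object whose entries each hold a "value") and that the parameters you change sit at the top level; depending on the version of the repository, some of them may be nested inside another parameter, so check your copy first. The sample document, the helper names and the example.com addresses are all hypothetical.

```python
import json

# Hypothetical, trimmed-down stand-in for alzArm.param.json (assumed shape).
SAMPLE = {
    "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
    "contentVersion": "1.0.0.0",
    "parameters": {
        "enterpriseScaleCompanyPrefix": {"value": "contoso"},
        "ALZMonitorActionGroupEmail": {"value": ""},
    },
}


def join_emails(addresses):
    """Join addresses into the comma-separated form described above."""
    return ", ".join(address.strip() for address in addresses)


def set_parameters(document, overrides):
    """Return a copy of the parameter document with the given values set.

    Raises KeyError for a name that is not a top-level parameter, so a
    typo does not silently add a parameter the template will reject.
    """
    updated = json.loads(json.dumps(document))  # simple deep copy
    for name, value in overrides.items():
        if name not in updated["parameters"]:
            raise KeyError(f"parameter not found at top level: {name}")
        updated["parameters"][name]["value"] = value
    return updated


if __name__ == "__main__":
    result = set_parameters(
        SAMPLE,
        {
            "enterpriseScaleCompanyPrefix": "lukegeeknz",
            # Hypothetical addresses, for illustration only.
            "ALZMonitorActionGroupEmail": join_emails(
                ["ops@example.com", "oncall@example.com"]
            ),
        },
    )
    print(json.dumps(result, indent=2))
```

To apply this to the real file, you would load it with json.load, pass it through set_parameters and write the result back with json.dump.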

  1. Press Ctrl+X

  2. Press Y to save buffer

  3. Press Enter to overwrite the parameter json file.

  4. Before proceeding to the next step, you can cat the file (cat ./alzArm.param.json) to view it and make sure the parameters are correct and nothing else needs changing.

  5. Now that the parameters are sorted, it is time to deploy:

    $location = 'AustraliaEast'
    $pseudoRootManagementGroup = 'mg'
    New-AzManagementGroupDeployment -ManagementGroupId $pseudoRootManagementGroup -Location $location -TemplateUri 'https://raw.githubusercontent.com/Azure/azure-monitor-baseline-alerts/main/patterns/alz/alzArm.json' -TemplateParameterFile 'alzArm.param.json'

The ARM template location needs to be internet accessible because the template links to dependent resources, although the parameter file can be sourced locally. If the template itself is deployed from a local file, the deployment fails with an error such as:

    Error: Code=InvalidTemplate; Message=Deployment template validation failed: 'The template variable 'deploymentUris' is not valid: The language expression property 'templateLink' doesn't exist, available properties are 'template, templateHash, parameters, mode, provisioningState'.. Please see https://aka.ms/arm-functions for usage details.'

All going well, after a few minutes your initiatives and policies will have been deployed.

Deploy - AMBA

AMBA - Policy Definition

AMBA - Policy Assignments

If you already have existing Azure resources, you can remediate the policies, forcing them to create the Resource Group, Alerts and Action groups, by running the following PowerShell commands:

cd scripts

$pseudoRootManagementGroup = "mg"
$identityManagementGroup = "mg-landingzones"
$managementManagementGroup = "trey-platform"
$connectivityManagementGroup = "trey-platform"
$LZManagementGroup = "mg-landingzones"

# Run the following commands to initiate remediation
.\Start-AMBARemediation.ps1 -managementGroupName $managementManagementGroup -policyName Alerting-Management
.\Start-AMBARemediation.ps1 -managementGroupName $connectivityManagementGroup -policyName Alerting-Connectivity
.\Start-AMBARemediation.ps1 -managementGroupName $identityManagementGroup -policyName Alerting-Identity
.\Start-AMBARemediation.ps1 -managementGroupName $LZManagementGroup -policyName Alerting-LandingZone
.\Start-AMBARemediation.ps1 -managementGroupName $pseudoRootManagementGroup -policyName Alerting-ServiceHealth

You can check the Alert rules in Azure Monitor:

AMBA - Alert Rules

Not all alert rules are deployed unless you have the matching resources. For example, the Deploy VNetG Tunnel Bandwidth alert policy only deploys its alert rule when it finds a Virtual Network Gateway with a VPN gateway type that doesn't include the MonitorDisable tag. If I were to deploy a VPN Gateway, it would create the alert rule for me.
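
The matching logic described above can be sketched in a few lines of Python. This is a simplified model of the three conditions, not the policy definition itself: the function name and the flattened resource dictionary are hypothetical, and the sketch treats any MonitorDisable tag as an exclusion, whereas the real policy rule may also compare the tag's value.

```python
def should_deploy_tunnel_bandwidth_alert(resource: dict) -> bool:
    """Model the three conditions from the text for one resource.

    `resource` is a hypothetical, flattened view of an Azure resource with
    "type", "gatewayType" and "tags" keys.
    """
    is_gateway = (
        resource.get("type", "").lower()
        == "microsoft.network/virtualnetworkgateways"
    )
    is_vpn = resource.get("gatewayType", "").lower() == "vpn"
    tags = resource.get("tags") or {}
    # Assumption: the mere presence of the tag excludes the resource.
    excluded = any(key.lower() == "monitordisable" for key in tags)
    return is_gateway and is_vpn and not excluded


if __name__ == "__main__":
    vpn_gateway = {
        "type": "Microsoft.Network/virtualNetworkGateways",
        "gatewayType": "Vpn",
        "tags": {},
    }
    print(should_deploy_tunnel_bandwidth_alert(vpn_gateway))
```

An ExpressRoute gateway, or a VPN gateway carrying the MonitorDisable tag, fails one of the conditions, so no alert rule is created for it.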

Finally, if you want to clean up (delete the resources), you can use the Start-AMBACleanup.ps1 script; it leaves the Resource Group, Alert processing rules and Action group in place.