
· 5 min read

Organizing cloud-based resources is a crucial task for IT unless you only have simple deployments. Use naming and tagging standards to organize your resources for these reasons:

  • Resource management: Your IT teams will need to quickly locate resources associated with specific workloads, environments, ownership groups, or other important information. Organizing resources is critical to assigning organizational roles and access permissions for resource management.
  • Cost management and optimization: Making business groups aware of cloud resource consumption requires IT to understand each team's resources and workloads.
  • Operations management: Visibility for the operations management team regarding business commitments and SLAs is an essential aspect of ongoing operations.
  • Security: Classification of data and security impact is a vital data point for the team when breaches or other security issues arise.
  • Governance and regulatory compliance: Maintaining consistency across resources helps identify deviation from agreed-upon policies.
  • Automation: In addition to making resources easier for IT to manage, a proper organizational scheme allows you to take advantage of automation as part of resource creation, operational monitoring, and DevOps processes.
  • Workload optimization: Tagging can help identify patterns and resolve broad issues. A tag can also help determine the assets required to support a single workload. Tagging all assets associated with each workload enables a deeper analysis of your mission-critical workloads, so you can make sound architectural decisions.

Tagging Types

The common tagging patterns listed below provide examples of how tagging can be used to organize cloud assets. These patterns are not meant to be exclusive and can be used in parallel, providing multiple ways of organizing assets based on your company's needs.

| Tag type | Examples | Description |
| --- | --- | --- |
| Functional | app = catalogsearch1, tier = web, webserver = apache, env = prod, env = staging, env = dev | Categorize resources in relation to their purpose within a workload, what environment they have been deployed to, or other functionality and operational details. |
| Classification | confidentiality = private, SLA = 24hours | Classifies a resource by how it is used and what policies apply to it. |
| Accounting | department = finance, program = business-initiative, region = northamerica | Allows a resource to be associated with specific groups within an organization for billing purposes. |
| Partnership | owner = jsmith, contactalias = catsearchowners, stakeholders = user1; user2; user3 | Provides information about what people (outside of IT) are related or otherwise affected by the resource. |
| Purpose | businessprocess = support, businessimpact = moderate, revenueimpact = high | Aligns resources to business functions to better support investment decisions. |

Tagging Baselines

Tag at the Resource Group level, then implement an Azure Policy that applies the Resource Group's tags to the resources inside it.

| Tag Name | Value | Tag Type | Description | Example |
| --- | --- | --- | --- | --- |
| Environment | Production, Development, Sandbox | Functional | Tags the resources with the Environment tag. This can be used to determine if a resource is Production, Development or Sandbox. | Environment: Production |
| Creator | {CreatorName} | Partnership | Tags the resource with the name of who created it, so you know who to ask for more information. | Creator: Luke Murray |
| CreatedDate | {CreatedDate} | Purpose | Tags the resource with the date/time when it was created. This can be used to determine how old a resource is, which helps when looking at new functionality for existing resources or checking whether resources are still required. | CreatedDate: 10:00 PM 03/06/2022 NZT |
| Criticality | P1, P2, P3 | Purpose | Tags the resources with their criticality, i.e., if critical, then it is P1. This can be used to determine whether resources need to be highly available and whether changes can be made during or outside business hours. | Criticality: P1 |
| SupportedBy | {TeamName} | Partnership | Tags the resources with the team, person or company who supports them, whether supported internally or outsourced. | SupportedBy: Company |
| RequesterName | {Requestor}-{CompanyName} | Partnership | Tags the resources with the user that requested their creation. | RequesterName: Project Manager |
| BillTo | {BillTo} | Accounting | Tags the resources with the cost centre or project code that will pay for them. | BillTo: AppTransformationProject1 |
| AutoShutDown | Yes | Functional | An automation functional tag, i.e., tag the resource (Virtual Machine) so that automation shuts down and starts up the Virtual Machine at specified times. | AutoShutDown: Yes |
| ApplicationName | {ProjectName} | Partnership | Tags the resource with the name of the project, or what the resources in the resource group are for. | ApplicationName: AzureVirtualDesktopSH |
| BusinessUnit | {BusinessUnit} | Partnership | Tags the resource with the name of the Business Unit or Company that owns the resources. | BusinessUnit: Finance |
| Snapshot | True | Functional | An automation functional tag, i.e., tag the resource (Disk) so that automation creates daily snapshots of the disk. | Snapshot: True |
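
As a rough illustration of how a baseline like this could be checked, the sketch below models "inherit from the Resource Group if missing" and a simple required-tag check. It is written in Python purely for illustration; the names `REQUIRED_TAGS`, `inherit_tags` and `find_tag_violations` are hypothetical, and which tags are mandatory is an assumption made for the example rather than something Azure Policy enforces by default.

```python
# Hypothetical baseline taken from the table above; adjust to your own standard.
# A set lists the allowed values, None means any non-empty value is accepted.
REQUIRED_TAGS = {
    "Environment": {"Production", "Development", "Sandbox"},
    "Criticality": {"P1", "P2", "P3"},
    "Creator": None,
    "SupportedBy": None,
    "BillTo": None,
    "ApplicationName": None,
}


def inherit_tags(group_tags, resource_tags):
    """Fill in tags missing on the resource from its resource group."""
    merged = dict(group_tags)
    merged.update(resource_tags)  # tags set on the resource itself win
    return merged


def find_tag_violations(tags):
    """Return a sorted list of problems found in a tag dictionary."""
    problems = []
    for name, allowed in REQUIRED_TAGS.items():
        value = tags.get(name, "")
        if not value:
            problems.append(f"missing tag: {name}")
        elif allowed is not None and value not in allowed:
            problems.append(f"invalid value for {name}: {value}")
    return sorted(problems)
```

A resource would typically be run through `inherit_tags` first and then through `find_tag_violations`, so that only gaps the Resource Group cannot fill are reported.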

· One min read

The path to becoming a Microsoft MVP (Most Valuable Professional) is not as linear as some might think. Some have a goal to receive the Microsoft MVP Award; others have a passion for technology that shows in community activities such as speaking, running user groups, helping others on forums, maintaining documentation, and helping others stay up to speed with the ever-changing ecosystem that is the Microsoft stack. It is not one size fits all: just as there are multiple ways of learning, there are multiple paths to the MVP Award.

I joined Christian Buckley for episode 167 of his #MVPbuzChat podcast/video chat to tell my story. Feel free to check it out (if you can ignore the bad camera angle!), along with the episodes in which other MVPs talk about their journey to the Microsoft MVP Award.

· 3 min read

Solution architecture is concerned with the planning, design, implementation, and ongoing improvement of a technology system.

The architecture of a system must balance and align the business requirements with the technical capabilities that are needed to execute those requirements.

The finished architecture is a balance of risk, cost, and capability throughout the system and its components.

Running a solution in the cloud does not reduce the need for requirements to be clear. In fact, the flexibility and power provided by the cloud mean that it is even more important to have clear requirements from business stakeholders; otherwise, you could end up solving problems that don't exist, missing an important design decision, or going beyond the available budget by adding unnecessary resiliency.

Requirements and Architecture

Non-functional requirements (NFRs)

Below is a short list of NFRs (not exhaustive) that may be provided by the business to help inform the design of a solution.

Reliability requirements
  • Service level agreement (SLA)
  • Uptime objective
  • Recovery time objective (RTO)
  • Recovery point objective (RPO)
  • Recoverability
Security requirements
  • Geographical location
  • Compliance and legislation
  • Identity and access management
  • Privacy
  • Data Integrity
  • Public or private endpoints (or both)
  • OWASP
  • Hybrid connectivity
  • DDoS
Performance requirements
  • Peak throughput, e.g., Requests per minute (RPM), active users
  • Business plan for growth
  • UX metrics (e.g., Page load time)
  • Asynchronous vs Synchronous operations
  • Workload profile (predictable, unpredictable, peak time of day)
  • Scalability
  • Data estate size and growth rate
  • Time-to-live (TTL) of reports and views (real-time vs eventual consistency)
Operational requirements
  • Prod and non-prod environments (Dev/Test, QA, Pre-prod, Prod)
  • Release frequency (hours / days / months)
  • Time to onboard (new customer)
  • Licensing
  • Cost (Management)
  • Manageability
Cost optimization
  • Cost per user
  • Target hosting costs as a percentage of revenue
  • Pricing model
  • Tenancy model
Azure SLAs
  • Familiarize yourself with Azure service-level agreements
  • An Azure Service-level Agreement (SLA) can also be read as a minimum service-level objective (SLO).
  • An SLA is a financial guarantee, not an absolute guarantee
  • Read the SLA details carefully, particularly the definition of "downtime" for each service, which gives important hints about failure modes
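
To make an SLA percentage more tangible, it helps to convert it into a downtime budget. The sketch below (Python, for illustration only) shows the arithmetic; the function names are hypothetical, and any percentages you pass in are your own inputs, not actual Azure SLA figures.

```python
def allowed_downtime_minutes(sla_percent, days=30):
    """Minutes of downtime a service can have in the period and still meet the SLA."""
    total_minutes = days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


def composite_sla(*sla_percents):
    """Combined availability of services that must ALL be up (serial dependency)."""
    availability = 1.0
    for percent in sla_percents:
        availability *= percent / 100
    return availability * 100
```

For example, a 99.9% target over a 30-day month leaves roughly 43.2 minutes of downtime, and chaining two dependent services always gives a combined figure lower than either individual one.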

For example, in the SLA for Azure SQL Database, "downtime" is defined as:

"The total accumulated Deployment Minutes across all Databases in a given Microsoft Azure subscription during which the Database is unavailable. A minute is considered unavailable for a given Database if all continuous attempts by Customer to establish a connection to the Database within the minute fail."

The Azure SQL Database team expects almost all outages to be transient (brief and non-recurring). Therefore, the retry pattern should be used to keep retrying for up to a minute. This is typical in cloud services; retry has been the default behaviour in ADO.NET since .NET Framework 4.6.1.
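
A minimal sketch of the retry pattern, assuming failures are signalled with a hypothetical `TransientError` exception and that retrying for up to a minute with exponential backoff is acceptable. It is written in Python for illustration and is not the ADO.NET implementation.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry(operation, max_seconds=60.0, delay=1.0, backoff=2.0,
          sleep=time.sleep, clock=time.monotonic):
    """Call operation() until it succeeds or the time budget is used up."""
    deadline = clock() + max_seconds
    while True:
        try:
            return operation()
        except TransientError:
            # Give up if the next wait would take us past the deadline.
            if clock() + delay > deadline:
                raise
            sleep(delay)
            delay *= backoff  # wait longer after each failed attempt
```

The `sleep` and `clock` parameters exist only so the behaviour can be tested without waiting; in normal use the defaults are enough, e.g. `retry(open_connection)` where `open_connection` is your own function.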

External Resources

Finally, resources such as the Azure Architecture Center, the Cloud Adoption Framework, and the Well-Architected Framework can help with thinking around the design and building blocks of your architecture.

· 2 min read

Did you know you can contribute to Microsoft documentation (ms docs)?

If you see something that is not quite right, whether technically or in the document's readability, then, in true community style, you can contribute!

Tip: You can edit a document directly from the GitHub web page, or press "." in a GitHub repository to open the github.dev web-based editor (Visual Studio Code in your browser), where the markdown linter helps check your changes against best practices.

See the image below for an example:

[Image: Update Microsoft documentation]

Once the pull request is made, it will be reviewed by designated technical document reviewers/product owners at Microsoft. Then your changes will be merged live if successful (and if not, the reviewers will let you know why and what changes could be made)!

If you don't want to make the edit yourself, you can also raise an issue and give your feedback by linking to the document. Someone will then review it, contact the relevant product owners, and amend the document.

[Image: MS Docs - GitHub Raise an Issue]

Try to be as concise as possible, as people reading it may not have the same experience as you!

· 13 min read

Virtual Machines in Microsoft Azure have different power states, and the state a Virtual Machine is in determines whether you are billed for Compute (storage and network adapters are billed regardless).

| Power state | Description | Billing |
| --- | --- | --- |
| Starting | Virtual Machine is powering up. | Billed |
| Running | Virtual Machine is fully up. This is the standard working state. | Billed |
| Stopping | This is a transitional state between running and stopped. | Billed |
| Stopped | The Virtual Machine is allocated on a host but not running. Also called PoweredOff state or Stopped (Allocated). This can be the result of invoking the PowerOff API operation or invoking shutdown from within the guest OS. The Stopped state may also be observed briefly during VM creation or while starting a VM from the Deallocated state. | Billed |
| Deallocating | This is the transitional state between running and deallocated. | Not billed |
| Deallocated | The Virtual Machine has released the lease on the underlying hardware and is completely powered off. This state is also referred to as Stopped (Deallocated). | Not billed |
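
The table can be expressed as a small lookup, which is handy when scripting cost checks. This is an illustrative Python sketch; `COMPUTE_BILLED` and `is_compute_billed` are hypothetical names, and the `PowerState/...` form is the status code format the runbook later in this article compares against.

```python
# Compute billing per power state, exactly as listed in the table above.
COMPUTE_BILLED = {
    "Starting": True,
    "Running": True,
    "Stopping": True,
    "Stopped": True,
    "Deallocating": False,
    "Deallocated": False,
}


def is_compute_billed(power_state):
    """Accept a plain state ('Stopped') or a status code ('PowerState/stopped')."""
    state = power_state.split("/")[-1].strip().capitalize()
    if state not in COMPUTE_BILLED:
        raise ValueError(f"unknown power state: {power_state}")
    return COMPUTE_BILLED[state]
```

The important row is 'Stopped': a VM shut down from inside the guest OS still counts as billed until it is deallocated.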

If a Virtual Machine is not being used, turn it off from the Microsoft Azure Portal (or programmatically via PowerShell/Azure CLI) so that the Virtual Machine is deallocated and its allocation on the host is released.

[Image: Microsoft Azure - Virtual Machine Power States]

However, not everyone knows this. Those new to Microsoft Azure, or users who don't have Virtual Machine Administrator rights to deallocate a Virtual Machine, may simply shut down the operating system, leaving the Virtual Machine in a 'Stopped' state that is still tied to an underlying Azure host and still incurring cost.

Our solution can help: by triggering an Alert when a Virtual Machine becomes unavailable due to a user-initiated shutdown, we can start an Azure Automation runbook to deallocate the Virtual Machine.

Overview

Today, we are going to set up an Azure Automation runbook, triggered by a Resource Health alert that will go through the following steps:

  1. A user shuts down the Virtual Machine from within the Operating System.
  2. The Virtual Machine enters an unavailable state.
  3. A Resource Health alert is triggered when the Virtual Machine becomes unavailable (after being available) because of a user-initiated event.
  4. The Alert triggers a Webhook to an Azure Automation runbook.
  5. Using permissions assigned to the Azure Automation account through a System Managed Identity, the runbook connects to Microsoft Azure and checks the VM state; if the Virtual Machine state is still 'Stopped', it deallocates the Virtual Machine.
  6. Finally, the runbook resolves the triggered alert.
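
The decision at the heart of these steps can be sketched as a small function. This is illustrative Python, not the runbook itself (which is PowerShell); `decide_action` and its return values are hypothetical labels, while the 'Fired'/'Activated' conditions and `PowerState/...` codes are the ones the runbook checks.

```python
def decide_action(monitor_condition, power_state_code):
    """Return what the automation should do for one alert.

    'ignore'     - the alert is not firing, so nothing is checked
    'deallocate' - the VM is stopped but still allocated: deallocate, then close the alert
    'none'       - the VM is already deallocated, running, or in another state
    """
    if monitor_condition.lower() not in ("fired", "activated"):
        return "ignore"
    if power_state_code.lower() == "powerstate/stopped":
        return "deallocate"
    return "none"
```

Checking the power state again inside the runbook matters because the VM may have been started or deallocated by someone else between the alert firing and the runbook running.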

To do this, we need a few resources.

  • Azure Automation Account
  • Az.AlertsManagement module in the Azure Automation account
  • Az.Accounts module (updated in the Azure Automation account)
  • Azure Automation runbook (I will supply this below)
  • Resource Health Alert
  • Webhook (to trigger to the runbook and pass the JSON from the alert)

And, of course, 'Contributor' rights to the Microsoft Azure subscription, to create the resources and alerts and to set up the system managed identity.

We will set this up from scratch using the Azure Portal and an already created PowerShell Azure Automation runbook.

Deploy Deallocate Solution

Setup Azure Automation Account

Create Azure Automation Account

First, we need an Azure Automation resource.

  1. Log into the Microsoft Azure Portal.
  2. Click + Create a resource.
  3. Type in automation
  4. Select Create under Automation, and select Automation.
  5. Select your subscription
  6. Select your Resource Group or Create one if you don't already have one (I recommend placing your automation resources in an Azure Management or Automation resource group, this will also contain your Runbooks)
  7. Select your region
  8. Select Next
  9. Make sure: System assigned is selected for Managed identities (this will be required for giving your automation account permissions to deallocate your Virtual Machine, but it can be enabled later if you already have an Azure Automation account).
  10. Click Next
  11. Leave Network connectivity as default (Public access)
  12. Click Next
  13. Enter in appropriate tags
  14. Click Review + Create
  15. After validation has passed, select Create

Configure System Identity

Now that we have our Azure Automation account, it's time to set up the System Managed Identity and grant it the following roles:

  • Virtual Machine Contributor (to deallocate the Virtual Machine)
  • Monitoring Contributor (to close the Azure Alert)

You can set up a custom role to be least privileged and use that instead. But in this article, we will stick to the built-in roles.

  1. Log into the Microsoft Azure Portal.
  2. Navigate to your Azure Automation account
  3. Click on: Identity
  4. Make sure that the System assigned toggle is: On and click Azure role assignments.
  5. Click + Add role assignments
  6. Select the Subscription (make sure this is the same subscription your Virtual Machines are in)
  7. Select Role: Virtual Machine Contributor
  8. Click Save
  9. Now we repeat the same process for Monitoring Contributor
  10. Click + Add role assignments
  11. Select the Subscription (make sure this is the same subscription your Virtual Machines are in)
  12. Select Role: Monitoring Contributor
  13. Click Save
  14. Click Refresh (it may take a few seconds to update the Portal, so if it is blank - give it 10 seconds and try again).
  15. You have now set up the System Managed identity and granted it the roles necessary to execute the automation.

Import Modules

The runbook uses a few Azure PowerShell modules. By default, Azure Automation has the base Azure PowerShell modules, but we will need to add Az.AlertsManagement, and update Az.Accounts first, as it is a prerequisite for Az.AlertsManagement.

  1. Log into the Microsoft Azure Portal.
  2. Navigate to your Azure Automation account
  3. Click on Modules
  4. Click on + Add a module
  5. Click on Browse from Gallery
  6. Click: Click here to browse from the gallery
  7. Type in: Az.Accounts
  8. Press Enter
  9. Click on Az.Accounts
  10. Click Select
  11. Make sure that the Runtime version is: 5.1
  12. Click Import
  13. Now that Az.Accounts has been updated, it's time to import Az.AlertsManagement!
  14. Click on Modules
  15. Click on + Add a module
  16. Click on Browse from Gallery
  17. Click: Click here to browse from the gallery
  18. Type in: Az.AlertsManagement (note it's Alerts, plural)
  19. Click Az.AlertsManagement
  20. Click Select
  21. Make sure that the Runtime version is: 5.1
  22. Click Import (if you get an error, make sure that Az.Accounts has been updated, through the Gallery import as above)
  23. Now you have successfully added the dependent modules!

Import Runbook

Now that the modules have been imported into your Azure Automation account, it is time to import the Azure Automation runbook.

  1. Log into the Microsoft Azure Portal.
  2. Navigate to your Azure Automation account
  3. Click on Runbooks
  4. Click + Create a runbook
  5. Specify a name (e.g. Deallocate-AzureVirtualMachine)
  6. Select Runbook type of: PowerShell
  7. Select Runtime version of: 5.1
  8. Type in a Description that explains the runbook (this isn't mandatory but, like Tags, is recommended; it is an opportunity to indicate to others what the runbook is for and who set it up)
  9. Click Create
  10. Now you will be greeted with a blank edit pane; paste in the Runbook from below:

Deallocate-AzureVirtualMachine.ps1
#requires -Version 3.0 -Modules Az.Accounts, Az.AlertsManagement
<#
    .SYNOPSIS
    PowerShell Azure Automation Runbook for deallocating Virtual Machines that have been shut down within the Windows Operating System (Stopped and not Deallocated).
    .NOTES
    Author: Luke Murray (https://github.com/lukemurraynz/)
#>

[OutputType('PSAzureOperationResponse')]
param (
    [Parameter(Mandatory = $true, HelpMessage = 'Data from the WebHook/Azure Alert')][Object]$WebhookData
)

Import-Module Az.AlertsManagement
$ErrorActionPreference = 'stop'

# Get the request body (the alert JSON) from WebhookData
$WebhookData = $WebhookData.RequestBody
Write-Output -InputObject $WebhookData
$Schema = $WebhookData | ConvertFrom-Json

# Pull the 'essentials' block out of the common alert schema payload
$Essentials = [object] ($Schema.data).essentials
Write-Output -InputObject $Essentials

# Split the alert ID and the target resource ID into their path segments.
# Only the first target is used, as this script doesn't handle multiple targets.
$alertIdArray = ($Essentials.alertId).Split('/')
$alertTargetIdArray = (($Essentials.alertTargetIds)[0]).Split('/')
$alertid = $alertIdArray[6]
$SubId = $alertTargetIdArray[2]
$ResourceGroupName = $alertTargetIdArray[4]
$ResourceType = $alertTargetIdArray[6] + '/' + $alertTargetIdArray[7]
$ResourceName = $alertTargetIdArray[-1]
$status = $Essentials.monitorCondition
Write-Output -InputObject $alertTargetIdArray
Write-Output -InputObject "status: $status"

# Only act when the alert is firing
if (($status -eq 'Activated') -or ($status -eq 'Fired')) {
    Write-Output -InputObject "resourceType: $ResourceType"
    Write-Output -InputObject "resourceName: $ResourceName"
    Write-Output -InputObject "resourceGroupName: $ResourceGroupName"
    Write-Output -InputObject "subscriptionId: $SubId"

    # Determine code path depending on the resourceType
    if ($ResourceType -eq 'Microsoft.Compute/virtualMachines') {
        # This is a Resource Manager VM
        Write-Output -InputObject 'This is a Resource Manager VM.'

        # Ensures you do not inherit an AzContext in your runbook
        Disable-AzContextAutosave -Scope Process

        # Connect to Azure with system-assigned managed identity
        $AzureContext = (Connect-AzAccount -Identity).context

        # Set and store context
        $AzureContext = Set-AzContext -SubscriptionName $AzureContext.Subscription -DefaultProfile $AzureContext
        Write-Output -InputObject $AzureContext

        # Check the Azure VM status. The power state is looked up by its
        # 'PowerState/' prefix rather than by a fixed position in the list.
        $VMStatus = Get-AzVM -ResourceGroupName $ResourceGroupName -Name $ResourceName -Status
        Write-Output -InputObject $VMStatus
        $PowerState = ($VMStatus.Statuses | Where-Object { $_.Code -like 'PowerState/*' }).Code

        if ($PowerState -eq 'PowerState/stopped') {
            Write-Output -InputObject "Stopping the VM, it was Shutdown without being Deallocated - $ResourceName - in resource group - $ResourceGroupName"
            # Stop-AzVM deallocates the VM by default
            Stop-AzVM -Name $ResourceName -ResourceGroupName $ResourceGroupName -DefaultProfile $AzureContext -Force -Verbose

            # Check VM status after deallocation
            $VMStatus = Get-AzVM -ResourceGroupName $ResourceGroupName -Name $ResourceName -Status -Verbose
            Write-Output -InputObject $VMStatus
            $PowerState = ($VMStatus.Statuses | Where-Object { $_.Code -like 'PowerState/*' }).Code

            if ($PowerState -eq 'PowerState/deallocated') {
                # Close the alert now that the VM is deallocated
                Write-Output -InputObject $PowerState
                Write-Output -InputObject $alertid
                Get-AzAlert -AlertId $alertid -Verbose -DefaultProfile $AzureContext
                Get-AzAlert -AlertId $alertid -Verbose -DefaultProfile $AzureContext | Update-AzAlertState -State 'Closed' -Verbose -DefaultProfile $AzureContext
            }
        }
        elseif ($PowerState -eq 'PowerState/deallocated') {
            Write-Output -InputObject 'Already deallocated'
        }
        elseif ($PowerState -eq 'PowerState/running') {
            Write-Output -InputObject 'VM running. No further actions'
        }
    }
}
else {
    # The alert status was not 'Activated' or 'Fired' so no action taken
    Write-Output -InputObject ('No action taken. Alert status: ' + $status)
}
  1. Click Save
  2. Click Publish (so the runbook is actually in production and can be used)
  3. You can select View or Edit at any stage, but you have now imported the Azure Automation runbook!
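
The runbook works out which Virtual Machine to act on by splitting the alert's resource IDs on '/' and picking segments by position. The Python sketch below reproduces that indexing so the positions are easier to follow; the function names are hypothetical and the IDs you would pass in are placeholders for your own.

```python
def parse_target_id(resource_id):
    """Split a VM resource ID the same way the runbook does."""
    parts = resource_id.split("/")  # parts[0] is '' because the ID starts with '/'
    return {
        "subscription_id": parts[2],
        "resource_group": parts[4],
        "resource_type": parts[6] + "/" + parts[7],
        "resource_name": parts[-1],
    }


def parse_alert_id(alert_id):
    """Return the alert GUID, which is the seventh segment of the alert ID."""
    return alert_id.split("/")[6]
```

Because the segments are positional, this only holds for IDs shaped like `/subscriptions/<id>/resourcegroups/<group>/providers/<namespace>/<type>/<name>`; nested resource types would need different handling.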

Setup Webhook

Now that the Azure runbook has been imported, we need to set up a Webhook for the Alert to trigger and start the runbook.

  1. Log into the Microsoft Azure Portal.
  2. Navigate to your Azure Automation account
  3. Click on Runbooks
  4. Click on the runbook you just imported (e.g. Deallocate-AzureVirtualMachine)
  5. Click on Add webhook
  6. Click Create a new webhook
  7. Enter a name for the webhook
  8. Make sure it is Enabled
  9. You can edit the expiry date to match your security requirements; make sure you record the expiry date, as the webhook will need to be renewed before it expires.
  10. Copy the URL and paste it somewhere safe (you won't see this again, and you need it for the next steps!)
  11. Click Ok
  12. Click on Configure parameters and run settings.
  13. Because we will be taking in dynamic data from an Azure Alert, enter in: [EmptyString]
  14. Click Ok
  15. Click Create
  16. You have now set up the webhook (make sure you have saved the URL from the earlier step as you will need it in the following steps)!
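
If you want to exercise the webhook before the alert exists, one option is to post a hand-built body to it. The Python sketch below builds a minimal payload containing only the fields the runbook reads; `build_test_alert` and `post_to_webhook` are hypothetical helpers, a real alert carries many more fields, and with a made-up alert GUID the runbook's final "close alert" step should be expected to fail.

```python
import json
import urllib.request


def build_test_alert(subscription_id, resource_group, vm_name, alert_guid,
                     monitor_condition="Fired"):
    """Build a minimal common-alert-schema style body for testing."""
    target_id = (f"/subscriptions/{subscription_id}/resourcegroups/{resource_group}"
                 f"/providers/microsoft.compute/virtualmachines/{vm_name}")
    alert_id = (f"/subscriptions/{subscription_id}/providers"
                f"/Microsoft.AlertsManagement/alerts/{alert_guid}")
    payload = {
        "schemaId": "azureMonitorCommonAlertSchema",
        "data": {
            "essentials": {
                "alertId": alert_id,
                "monitorCondition": monitor_condition,
                "alertTargetIDs": [target_id],
            }
        },
    }
    return json.dumps(payload)


def post_to_webhook(webhook_url, body):
    """POST the body to the webhook URL saved earlier; returns the HTTP status."""
    request = urllib.request.Request(
        webhook_url,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

Treat the webhook URL like a password: anyone holding it can start the runbook, so keep it out of scripts that are checked into source control.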

Setup Alert & Action Group

Now that the Automation framework has been created with the Azure Automation account, runbook and webhook, we now need a way to detect if a Virtual Machine has been Stopped; this is where a Resource Health alert will come in.

  1. Log into the Microsoft Azure Portal.
  2. Navigate to: Monitor
  3. Click on Service Health
  4. Select Resource Health
  5. Select + Add resource health alert
  6. Select your subscription
  7. Select Virtual machine for Resource Type
  8. You can target specific Resource Groups for your alert (and, as such, your automation) or select all.
  9. Check Include all future resource groups
  10. Check Include all future resources
  11. Under the Alert conditions, make sure Event Status is: All selected
  12. Set Current resource status to Unavailable
  13. Set Previous resource status to All selected
  14. For reason type, select: User initiated and unknown
  15. Now that we have the Alert rule configured, we need to set up an Action group, which will get triggered when the alert fires.
  16. Click Select Action groups.
  17. Click + Create action group
  18. Select your subscription and resource group (this is where the Action Group will go; I recommend your Azure Management/Monitoring resource group, which may already hold a Log Analytics workspace, for example).
  19. Give your Action Group a name, e.g. AzureAutomateActionGroup
  20. The display name will be automatically generated, but feel free to adjust it to suit your naming convention
  21. Click Next: Notifications
  22. Under Notifications, you can trigger an email alert, which can be handy in determining how often the runbook runs. This is especially useful during testing, and can be modified or removed once the solution is running.
  23. Click Next: Actions
  24. Under Action Type, select Webhook
  25. Paste in the URI created earlier when setting up the Webhook
  26. Select Yes to enable the common alert schema (this is required, as the runbook expects the JSON it parses to be in that schema; if it isn't, the runbook will fail)
  27. Click Ok
  28. Give the webhook a name.
  29. Click Review + create
  30. Click Create
  31. Finally, enter in an Alert name and description, specify the resource group for the Alert to go into and click Save.
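
Since the runbook depends on the common alert schema being enabled, a defensive check on the incoming body can make a misconfigured Action Group easier to spot. This is an illustrative Python sketch with a hypothetical `read_essentials` helper; it assumes the schema identifier `azureMonitorCommonAlertSchema` and accepts either spelling of the target ID key, because Python keys are case-sensitive whereas PowerShell property access is not.

```python
import json

COMMON_SCHEMA_ID = "azureMonitorCommonAlertSchema"


def read_essentials(request_body):
    """Return (monitor_condition, first_target_id) or raise ValueError."""
    payload = json.loads(request_body)
    if payload.get("schemaId") != COMMON_SCHEMA_ID:
        raise ValueError("payload is not in the common alert schema")
    essentials = payload["data"]["essentials"]
    # Accept both key spellings so the check does not hinge on letter case.
    targets = essentials.get("alertTargetIDs") or essentials.get("alertTargetIds") or []
    if not targets:
        raise ValueError("alert has no target resource IDs")
    return essentials["monitorCondition"], targets[0]
```

Failing early with a clear message is usually easier to diagnose in the runbook job output than an error from a later step that received empty values.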

Test Deallocate Solution

So now we have stood up our:

  • Azure automation account
  • Alert
  • Action Group
  • Azure automation runbook
  • Webhook

It is time to test! I have a VM called VM-D01, running Windows, in the same subscription that the alert has been deployed against. (Theoretically, this runbook will also work against Linux workloads, as it's relying on the Azure agent to send the correct status to the Azure Logs, but my testing was against Windows workloads.)

As you can see below, I shut down the Virtual Machine. After a few minutes (be patient, Azure needs to wait for the status of the VM to be triggered), an Azure Alert was fired into Azure Monitor, which triggered the webhook and runbook, and the Virtual Machine was deallocated, and the Azure Alert was closed.

[Image: Azure deallocate testing]