
Running your own Azure Proactive Resiliency Assessment

· 11 min read

The Azure Proactive Resiliency Library is a curated collection of best practices, guidance, and recommendations designed to improve the resiliency of applications and services running in Azure. Built on the Reliability pillar of the Well-Architected Framework, this catalog provides valuable insights to help keep your workloads robust and reliable.

The library also includes automation capabilities (built on Azure Resource Graph queries) that allow you to collect data, analyze it, and generate detailed Word and PowerPoint reports. These reports, part of the Well-Architected Resiliency Assessment workshop, provide visibility into the resiliency of your Azure workloads. The toolset, often used by Microsoft Cloud Solution Architects, can also be leveraged to identify areas for improvement in your own Azure estate, following the Well-Architected reliability principles.

In this article, we will walk through running the scripts from the Azure Proactive Resiliency Library to collect, analyze, and report on resiliency data for your workloads, helping you identify and address potential reliability issues in your Azure environment.
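
To give a flavor of what the library's automation is doing under the hood, here is a minimal sketch (not the library's own collector script) of running an Azure Resource Graph query from Python. The subscription ID placeholder and the example query are illustrative assumptions; the library ships its own curated queries.

```python
# Minimal sketch: run an Azure Resource Graph query similar in spirit to
# those the library's collector uses. Assumes the azure-identity and
# azure-mgmt-resourcegraph packages are installed and you are signed in
# (e.g. via `az login`). The query below is an illustrative example, not
# one of the library's own recommendation queries.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resourcegraph import ResourceGraphClient
from azure.mgmt.resourcegraph.models import QueryRequest

credential = DefaultAzureCredential()
client = ResourceGraphClient(credential)

# Example: list virtual machines and whether they use availability zones.
request = QueryRequest(
    subscriptions=["<your-subscription-id>"],  # placeholder
    query="""
        Resources
        | where type =~ 'microsoft.compute/virtualmachines'
        | project name, location, zones
    """,
)

result = client.resources(request)
for row in result.data:
    print(row)
```

The library's actual collector runs a curated set of such queries and feeds the results into the Word and PowerPoint report generation described above.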

Export Azure DevOps Repositories to Azure Storage Account

· 16 min read

When deleted, Azure DevOps code repositories go into a soft-delete state and are retained for 30 days before being permanently deleted.

For disaster recovery scenarios, or for cases where you may not realize something has been deleted until after the retention period has passed, you may want to export your repositories at a single point in time to an Azure storage account.

The Azure DevOps service itself is highly redundant: it is built on core Azure platform infrastructure such as Availability Zones, and uses cloud-native Azure Storage and Azure SQL services, which are themselves redundant and backed up.

However, mistakes can happen, and there may be organizational requirements to retain backups of the repositories.

Today, we will examine exporting your repositories to an Azure Storage Account using an Azure DevOps pipeline. This pipeline will capture the repositories in their current state and run nightly.
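
Before diving into the pipeline itself, here is a minimal sketch of the export step it performs: list the project's repositories via the Azure DevOps REST API, mirror-clone each one, and upload an archive to the storage account. The organization, project, and container names are placeholders, and the sketch assumes a personal access token with Code (Read) scope and a storage connection string supplied as pipeline secrets.

```python
# Minimal sketch of the export step: list repositories, mirror-clone each
# one, zip it, and upload the archive to blob storage. ORG, PROJECT, the
# environment variable names, and the container name are all assumptions
# to adapt to your environment; repo names are assumed to be URL-safe.
import os
import shutil
import subprocess

import requests
from azure.storage.blob import BlobServiceClient

ORG = "my-org"          # assumption: your Azure DevOps organization
PROJECT = "my-project"  # assumption: your project name
PAT = os.environ["AZURE_DEVOPS_PAT"]  # personal access token, Code (Read)

# List the Git repositories in the project (Azure DevOps REST API).
resp = requests.get(
    f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/git/repositories?api-version=7.0",
    auth=("", PAT),  # basic auth: empty username, PAT as password
)
resp.raise_for_status()
repos = resp.json()["value"]

blob_service = BlobServiceClient.from_connection_string(
    os.environ["STORAGE_CONNECTION_STRING"]  # assumption: pipeline secret
)
container = blob_service.get_container_client("repo-backups")

for repo in repos:
    name = repo["name"]
    # Mirror clone preserves all branches and tags, not just the default branch.
    clone_url = f"https://pat:{PAT}@dev.azure.com/{ORG}/{PROJECT}/_git/{name}"
    subprocess.run(["git", "clone", "--mirror", clone_url, name], check=True)
    archive = shutil.make_archive(name, "zip", name)
    with open(archive, "rb") as data:
        container.upload_blob(name=f"{name}.zip", data=data, overwrite=True)
```

Mirror cloning is the key design choice here: unlike a plain clone, it captures every branch and tag, so the uploaded archive can restore the repository in full.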

Deploying Large Language Models on AKS with Kaito

· 16 min read

Today, we are going to look at deploying a large language model (LLM) directly into your AKS (Azure Kubernetes Service) cluster, running on GPU-enabled nodes, using Kaito (Kubernetes AI Toolchain Operator).

Kaito is an open-source operator that transforms how you deploy AI models on Kubernetes. It streamlines the process by automating critical tasks such as infrastructure provisioning and resource optimization, intelligently selecting the optimal hardware configuration for your specific model from the CPU and GPU resources available on AKS. By eliminating manual setup complexity, Kaito accelerates deployment time and reduces associated costs.
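
As a preview of what this looks like in practice, here is a hedged sketch that creates a Kaito Workspace custom resource with the official Kubernetes Python client. The manifest mirrors the falcon-7b example from the Kaito repository; the instance type, label selector, and namespace are assumptions to adapt to your cluster.

```python
# Minimal sketch: create a Kaito Workspace custom resource with the official
# Kubernetes Python client. The manifest follows the falcon-7b example from
# the Kaito repository; instance type, labels, and namespace are assumptions.
from kubernetes import client, config

config.load_kube_config()  # uses your current kubeconfig context (the AKS cluster)

# Note: in the kaito.sh/v1alpha1 CRD, `resource` and `inference` sit at the
# top level of the object, alongside metadata, rather than under `spec`.
workspace = {
    "apiVersion": "kaito.sh/v1alpha1",
    "kind": "Workspace",
    "metadata": {"name": "workspace-falcon-7b"},
    "resource": {
        "instanceType": "Standard_NC12s_v3",  # GPU node size Kaito provisions
        "labelSelector": {"matchLabels": {"apps": "falcon-7b"}},
    },
    "inference": {"preset": {"name": "falcon-7b"}},
}

api = client.CustomObjectsApi()
api.create_namespaced_custom_object(
    group="kaito.sh",
    version="v1alpha1",
    namespace="default",  # assumption: deploy into the default namespace
    plural="workspaces",
    body=workspace,
)
```

Once the Workspace is created, the operator takes over: it provisions the GPU node pool, pulls the model preset, and exposes an inference endpoint, which is exactly the manual toil it is designed to remove.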