Module 1: Introduction to AI operations on Azure
This module introduces the principles of AIOps, combining MLOps and GenAIOps into a unified operational framework.
- understanding AIOps concepts and lifecycle
- differences between experimentation and production AI
- overview of Azure AI services and architecture
- introduction to Azure Machine Learning and Microsoft Foundry
- defining operational requirements for AI solutions
Module 2: Setting up infrastructure for AI workloads
This module focuses on preparing the infrastructure required to support scalable AI operations.
- provisioning Azure Machine Learning workspaces
- configuring compute resources and environments
- managing data storage and access
- implementing identity and access management
- setting up networking and security controls
Module 3: Implementing MLOps workflows
This module covers the deployment and lifecycle management of machine learning models.
- versioning datasets, models, and code
- building automated training pipelines
- deploying models to endpoints
- implementing CI/CD for machine learning
- managing model lifecycle and updates
Module 4: Operationalising generative AI solutions
This module focuses on deploying and managing generative AI applications and agents.
- deploying generative AI models using Microsoft Foundry
- building and managing AI agents
- integrating generative AI into applications
- managing prompts, embeddings, and vector stores
- evaluating generative AI outputs and performance
Module 5: Monitoring and observability
This module explores how to monitor AI systems in production to ensure reliability and performance.
- implementing logging and telemetry
- monitoring model performance and drift
- tracking generative AI usage and outputs
- setting up alerts and dashboards
- troubleshooting production issues
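As an illustration of the drift-monitoring topic listed above, here is a minimal sketch of one widely used drift score, the Population Stability Index (PSI), computed between a baseline sample and a current sample of a single numeric feature. The function name, the equal-width binning, and the small floor applied to empty bins are assumptions made for this example; it is not presented as the method used by Azure Machine Learning or any other Azure service.

```python
import math
from typing import List, Sequence


def population_stability_index(
    baseline: Sequence[float], current: Sequence[float], bins: int = 10
) -> float:
    """Return the PSI between two samples of one numeric feature."""
    if not baseline or not current:
        raise ValueError("both samples must be non-empty")

    # Shared equal-width bins spanning both samples (an assumption of this sketch).
    low = min(min(baseline), min(current))
    high = max(max(baseline), max(current))
    if high == low:
        return 0.0  # every value is identical, so there is nothing to compare
    width = (high - low) / bins

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            # Clamp so the maximum value falls into the last bin.
            index = min(int((value - low) / width), bins - 1)
            counts[index] += 1
        # Floor empty bins at a tiny value to avoid log(0) and division by zero.
        return [max(count / len(values), 1e-6) for count in counts]

    base = proportions(baseline)
    curr = proportions(current)
    return sum((c - b) * math.log(c / b) for b, c in zip(base, curr))


if __name__ == "__main__":
    training_scores = [i / 100 for i in range(100)]
    production_scores = [0.5 + i / 100 for i in range(100)]
    print(population_stability_index(training_scores, production_scores))
```

A score of zero means the binned distributions match, and larger scores mean a larger shift. A commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift, but any alerting threshold should be validated against the workload in question.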
Module 6: Governance and responsible AI
This module focuses on governance frameworks and responsible AI practices.
- implementing responsible AI principles
- managing compliance and regulatory requirements
- auditing AI systems and decision-making processes
- managing data privacy and security
- applying governance policies across AI workloads
Module 7: Optimisation and scaling
This module teaches how to improve the performance, cost-efficiency, and scalability of AI systems.
- optimising model performance and latency
- scaling infrastructure for high-demand workloads
- cost management strategies for AI services
- optimising generative AI usage and responses
- implementing caching and efficiency techniques
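To make the caching topic above concrete, the following is a minimal sketch of an in-memory response cache with a time-to-live, placed in front of a text-generation call. The class name `ResponseCache`, the `generate` callable, and the whitespace-only prompt normalisation are all hypothetical choices for this example; no Azure or Microsoft Foundry API is used or implied.

```python
import hashlib
import time
from typing import Callable, Dict, Tuple


class ResponseCache:
    """Cache generated responses by prompt, expiring entries after a TTL."""

    def __init__(
        self,
        generate: Callable[[str], str],  # hypothetical model-calling function
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._generate = generate
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, str]] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share an entry.
        normalised = " ".join(prompt.split())
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> str:
        key = self._key(prompt)
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            self.hits += 1
            return entry[1]
        # Missing or expired: call the model and store the fresh response.
        self.misses += 1
        response = self._generate(prompt)
        self._store[key] = (now, response)
        return response


if __name__ == "__main__":
    cache = ResponseCache(lambda prompt: f"echo: {prompt}")
    cache.get("What is AIOps?")
    cache.get("What   is AIOps?")  # served from the cache
    print(cache.hits, cache.misses)
```

Exact-match caching like this only helps when prompts repeat verbatim apart from whitespace. Whether it is safe to reuse a response at all depends on the application, for example whether outputs are personalised or time-sensitive.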
Module 8: End-to-end AI solution lifecycle
This module consolidates learning by examining the full lifecycle of AI solutions in production.
- designing end-to-end AI pipelines
- integrating MLOps and GenAIOps workflows
- managing continuous improvement cycles
- case study: productionising an AI solution on Azure
- preparing for the AI-300 exam