
How to do IaC right in Azure?
- How to do IaC right in Azure?
- A Key Factor for smooth Cloud Journey
- What is IaC in Azure
- 9 Best Practices in IaC
- 1️⃣ Establish a Central Module Library
- 2️⃣ Separate Code, Configurations & Environments
- 3️⃣ Enforce Version Control & Code Reviews
- 4️⃣ Restrict Azure Portal Access to Read-Only
- 5️⃣ Implement Robust State Management
- 6️⃣ Secure Handling of Secrets
- 7️⃣ Continuously Test Modules with Provider Updates
- 8️⃣ Enforce Separation of Service Principals
- 9️⃣ Standardize CI/CD-Pipelines
- Operating Model & Responsibilities
- Conclusion
A Key Factor for smooth Cloud Journey
The concept of Infrastructure as Code (IaC) has significantly reshaped the provisioning and management of cloud resources within the Microsoft Azure platform. At least from a conceptual standpoint! In practice, however, experience from numerous real-life Azure adoption projects shows that many organizations and enterprise customers still rely heavily on the Azure Portal for resource deployment and configuration, effectively assembling their environments through purely manual interactions. This approach is disparagingly called “ClickOps”!
As Azure environments expand due to an increasing number of workloads, multi-subscription architectures, and enterprise-scale landing zone designs, portal-based (manual) resource management becomes progressively less efficient. It leads to configuration drift and lacks reproducibility. It also complicates Azure governance implementations, particularly in scenarios that require a clean, unified Azure hierarchy (according to the Microsoft CAF), consistent Azure Policy enforcement, role-based access control (Azure RBAC), and standardized networking and security configurations.
In this context, Infrastructure as Code represents a fundamental paradigm shift for Azure platform engineering. By leveraging tools such as Bicep or Terraform, infrastructure definitions become declarative, version-controlled, and fully integrated into CI/CD pipelines. This enables consistent and repeatable deployments across different Azure subscriptions and environments, while aligning with Azure-native governance capabilities such as Azure Policy or the Azure hierarchy (e.g., Management Groups).
The following chapters analyze the implications of adopting IaC within Azure environments, outline the organizational and technical challenges associated with this transition, and explain why establishing a high degree of automation, particularly during the early stages of building Azure landing zones, is essential for achieving scalability, compliance, and operational resilience.
Traditional Infrastructure Management
For decades, enterprise IT infrastructure, particularly in traditional on-premises data centers, has mostly been managed manually. Infrastructure changes typically followed a ticket-driven process. A request was created, and administrators implemented the required changes via graphical user interfaces (GUIs). Operating systems were installed manually on virtual machines (VMs), networking configurations were applied through GUIs or command-line tools such as PowerShell, and services like firewalls (incl. WAFs), load balancers, and database clusters were configured individually, often without standardized or centrally documented procedures.
This approach introduced significant operational risks. Changes were prone to inconsistencies, difficult to reproduce, poorly documented, and highly dependent on individual expertise. In many organizations, only a small number of specialists possessed the knowledge required to perform these tasks, creating bottlenecks in cases of absence or staff turnover. As a result, infrastructure management processes often became fragmented, script-oriented, and difficult to maintain. This manual model was sufficient for relatively static environments, in which resources such as database clusters or VMs persisted for months or even years and changes were infrequent. However, it no longer scales to modern requirements. This becomes especially critical when infrastructure services are provisioned and scaled dynamically within cloud platforms such as Microsoft Azure.
Rise of “ClickOps” and its Limitations
With the adoption of Microsoft Azure, organizations gained access to powerful, web-based management capabilities such as the Azure Portal. However, while the tooling evolved, operational practices often remained unchanged. Administrators continued provisioning and configuring resources manually through GUIs, using the “ClickOps” approach. Azure’s ease of use, combined with its extensive service catalog, encouraged “rapid” deployment, but also reintroduced familiar challenges.
Azure resources were created individually, configurations diverged tremendously across environments (e.g., DEV, TEST, INT, PROD), and documentation was frequently incomplete or outdated. As Azure environments grew, driven by complex architectures, enterprise-scale landing zones, and an increasing number of application workloads, maintaining consistency and governance became increasingly difficult. Among cloud engineers and architects, this anti-pattern is also known as “Click-Click-Cloud”.
Characteristics of Modern Azure Infrastructure
Azure fundamentally changes the nature of infrastructure. Resources and capabilities are provisioned via standardized and fully automated interfaces such as the Azure Resource Manager (ARM) and its underlying REST APIs. This enables a uniform control plane across all services.
Moreover, Azure resources are inherently ephemeral. They are often provisioned dynamically for hours, days, or weeks, depending on workload requirements. This elasticity enables scenarios such as automatic scaling during periods of peak demand, as well as cost optimization through the automated decommissioning or even deletion of resources when they are no longer required. However, it also significantly increases operational complexity. Manual management approaches cannot keep pace with the scale, velocity, and variability of modern Azure environments. The administrative overhead grows non-linearly with the number of Azure resources, making manual operations unsustainable and hard to manage.
IaC as the operational Foundation
This is where the concept of Infrastructure as Code (IaC) becomes essential. Previously manual tasks are translated into declarative or imperative code that fully describes the desired state of the Azure environment. This code is version-controlled and deployed automatically through CI/CD pipelines. As a result, infrastructure can be provisioned, updated, and decommissioned in a consistent, repeatable, and error-resistant way throughout its lifecycle. In the Azure ecosystem, this is commonly implemented using tools such as Azure Bicep, HashiCorp Terraform, Pulumi, or Ansible, often integrated with platforms like GitHub or Azure DevOps.
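To make the idea of a declarative desired state more concrete, here is a minimal Python sketch that compares the resources defined in code with what currently exists and derives a plan of create, update, and delete actions. It is a toy model and not how Bicep or Terraform work internally. The resource names, the dictionary layout, and the `plan_changes` function are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical resource model: resource name -> properties.
Resources = Dict[str, Dict[str, str]]


def plan_changes(desired: Resources, actual: Resources) -> List[Tuple[str, str]]:
    """Return (action, resource_name) pairs needed to reach the desired state."""
    plan: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in actual:
            plan.append(("create", name))
        elif desired[name] != actual[name]:
            plan.append(("update", name))
    for name in sorted(actual):
        if name not in desired:
            plan.append(("delete", name))
    return plan


if __name__ == "__main__":
    # Example values only; none of these resources come from a real environment.
    desired = {
        "vnet-hub": {"address_space": "10.0.0.0/16"},
        "st-logs": {"sku": "Standard_LRS"},
    }
    actual = {
        "vnet-hub": {"address_space": "10.0.0.0/24"},  # drifted via a manual change
        "vm-legacy": {"size": "Standard_B2s"},  # exists, but is not defined in code
    }
    for action, name in plan_changes(desired, actual):
        print(f"{action}: {name}")
```

In this example the plan creates `st-logs`, updates the drifted `vnet-hub`, and deletes `vm-legacy`. The point is that the code describes the target state, and the tooling works out the steps.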
Infrastructure as Code is more than a tooling evolution. It represents a fundamental shift in how infrastructure is designed, deployed, and operated. It replaces manual, ad-hoc configurations with standardized, scalable, and automated processes. In practice, this requires not only new technical skills but also a cultural transformation within organizations. Operations teams must adopt software engineering principles, and collaboration with development teams becomes significantly closer. This is all aligned with DevOps practices, as explained in more detail further below.
Infrastructure changes in Azure are no longer performed interactively through the Azure Portal but are instead managed through pull requests, automated validation, and deployment pipelines. This ensures traceability, auditability, versioning, and compliance with governance requirements such as policy enforcement, role-based access control, and standardized network and security configurations.
Challenges and Strategic Importance of IaC
However, the transition to IaC is not without challenges for an organization. It requires upskilling teams, establishing robust module libraries and pipeline architectures, and redefining operational processes. Despite these hurdles, Infrastructure as Code is indispensable for organizations aiming to operate Azure environments efficiently, securely, and at scale. It forms the foundation of a modern, agile, and future-proof IT landscape, capable of meeting the increasing demands of cloud transformation.
What is IaC in Azure
In essence, Infrastructure as Code (IaC) means treating the entire IT infrastructure, from the foundational platform layer to application-facing services, as a version-controlled software project. This approach is often referred to as full “end-to-end” automation. While this paradigm may initially feel unfamiliar or even intimidating to traditional system administrators, such concerns are largely unfounded. IT professionals who are already experienced with scripting languages (such as PowerShell or Bash) will find it relatively straightforward to transition into IaC practices. The underlying concepts are well-structured, tooling is mature, and extensive documentation and real-world examples significantly lower the barrier to entry. Rather than viewing IaC as a disruption, IT teams should recognize its potential. It enables the automation of repetitive operational tasks, reduces the likelihood of human error, and improves both efficiency and traceability over time.
In the context of Microsoft Azure, implementing Infrastructure as Code (IaC) means that every Azure resource or capability is defined and managed through code. This applies without exception to all components required for operating the platform or supporting applications. This includes a wide range of Azure services, from virtual networks and compute resources to storage, databases, as well as platform-native security and monitoring integrations. Crucially, the IaC approach only delivers its full value when the entire provisioning and management lifecycle is fully automated and free of manual intervention. Any remaining manual steps, whether in deployment, configuration, or change management, introduce inconsistencies and reduce reproducibility. They also undermine the fundamental principles of Infrastructure as Code (IaC). Therefore, all infrastructure changes must be executed exclusively through automated pipelines to ensure consistency and reliable operation at scale.
Technologies such as Azure Resource Manager (ARM), Azure Bicep, or HashiCorp Terraform are commonly used to declaratively define this infrastructure. The objective is to provision and manage Azure environments in a consistent, automated, and reproducible manner. By defining infrastructure as code, organizations establish a reliable foundation for scalable cloud operations, governance, and continuous delivery.
9 Best Practices in IaC
To fully leverage the potential of Infrastructure as Code (IaC) and avoid common pitfalls, adhering to proven best practices is essential. While IaC enables automation and standardization of infrastructure provisioning, it also introduces risks if processes and governance models are not consistently implemented across the organization. This is particularly relevant in complex enterprise environments on Microsoft Azure, where multiple teams, subscriptions, stages and landing zones must be managed in a controlled and scalable manner.
The following best practices form the foundation for a secure, scalable, and maintainable IaC implementation in Azure. They help reduce errors and improve maintainability. They also strengthen collaboration between development, operations, and security teams, ultimately ensuring consistent and reproducible deployments.

1️⃣ Establish a Central Module Library
By maintaining a shared module library, organizations can standardize deployments across teams and continuously improve modules over time, avoiding redundant implementations (DRY principle!). In Azure-centric environments, this can be achieved using internal modules or leveraging curated solutions such as Azure Verified Modules, which provide validated and production-ready building blocks that can be adapted to project-specific requirements.
Details on how to set up a central module library can be found in the separate blog post “The Art of Modularization in Terraform”.

2️⃣ Separate Code, Configurations & Environments
In Azure, it is also recommended to structure infrastructure into logical layers (e.g., networking, identity, security, application workloads). This layered approach aligns well with enterprise-scale landing zone architectures and allows for flexible ownership models across teams.
Details on how best to structure the code can be found in the separate blog post “Repository Structure”.
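The separation named in this practice can be sketched in a few lines of Python: one shared definition, with everything environment-specific kept in a separate settings structure. This is only an illustration of the principle, not a replacement for Bicep parameter files or Terraform variable files. The environment names, VM sizes, and the `render_workload` function are assumptions chosen for the example.

```python
from typing import Dict

# Hypothetical environment-specific settings, kept apart from the shared code.
ENVIRONMENTS: Dict[str, Dict[str, object]] = {
    "dev": {"location": "westeurope", "vm_size": "Standard_B2s", "instance_count": 1},
    "prod": {"location": "westeurope", "vm_size": "Standard_D4s_v5", "instance_count": 3},
}


def render_workload(environment: str) -> Dict[str, object]:
    """Build the same resource definition for any environment from its settings."""
    settings = ENVIRONMENTS[environment]
    return {
        "name": f"app-{environment}",
        "location": settings["location"],
        "vm_size": settings["vm_size"],
        "instance_count": settings["instance_count"],
    }


if __name__ == "__main__":
    for env in ENVIRONMENTS:
        print(render_workload(env))
```

Because the code path is identical for every environment, differences between DEV and PROD are limited to the values in the settings and remain visible in version control.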

3️⃣ Enforce Version Control & Code Reviews
Keeping all infrastructure code under version control and reviewing every change before it is merged ensures full traceability, auditability, and compliance. It also establishes a controlled change process, which is critical for operating Azure environments in regulated or security-sensitive contexts.

4️⃣ Restrict Azure Portal Access to Read-Only
Deployments should be performed using dedicated identities such as Service Principals or Managed Identities, ensuring that all changes are traceable, reproducible, and compliant with governance requirements.

5️⃣ Implement Robust State Management
In Azure, a remote backend such as an Azure Storage Account should be used, with features like state locking, blob versioning, and backup mechanisms enabled. This prevents concurrent modifications and protects against data loss. Alternatively, enterprise solutions such as Terraform Cloud or Terraform Enterprise provide enhanced capabilities, including collaboration features, policy enforcement, and integrated security controls.
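To illustrate why state locking matters, the following Python sketch models a lock with an exclusively created lock file: a second run that tries to modify the same state while the first is still active is rejected. It is a local stand-in for the concept only. A remote backend on Azure Storage relies on blob leases rather than local files, and the names `state_lock` and `StateLockedError` are hypothetical.

```python
import os
import tempfile
from contextlib import contextmanager


class StateLockedError(RuntimeError):
    """Raised when another run already holds the state lock."""


@contextmanager
def state_lock(lock_path: str):
    """Hold an exclusive lock file for the duration of a state-changing run."""
    try:
        # O_EXCL makes the creation fail if the lock file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError as exc:
        raise StateLockedError(f"state is locked: {lock_path}") from exc
    try:
        os.close(fd)
        yield
    finally:
        # Release the lock so that the next run can proceed.
        os.remove(lock_path)


if __name__ == "__main__":
    lock_file = os.path.join(tempfile.mkdtemp(), "state.lock")
    with state_lock(lock_file):
        try:
            with state_lock(lock_file):
                pass
        except StateLockedError as err:
            print("second run rejected:", err)
```

The rejected second run is the desired behavior: two pipelines writing the same state at once could otherwise corrupt it.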

6️⃣ Secure Handling of Secrets
Secrets should not be stored in plain text in code or repositories. Instead, dedicated secret management solutions such as Azure Key Vault should be used. These services provide secure storage, fine-grained access control via RBAC, and automated rotation capabilities, ensuring compliance with enterprise security standards.
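A minimal Python sketch of this idea: configuration in the repository contains only references to secrets, and the actual values are fetched at deployment time. The `secretref:` marker, the `resolve_settings` function, and the in-memory `fake_vault` are assumptions for illustration. In a real pipeline, the `fetch_secret` callable would query a secret store such as Azure Key Vault.

```python
from typing import Callable, Dict

SECRET_PREFIX = "secretref:"  # hypothetical marker for values held in a vault


def resolve_settings(
    settings: Dict[str, str], fetch_secret: Callable[[str], str]
) -> Dict[str, str]:
    """Replace secret references with values fetched at deployment time."""
    resolved: Dict[str, str] = {}
    for key, value in settings.items():
        if value.startswith(SECRET_PREFIX):
            # Only the secret's name is stored in the repository.
            resolved[key] = fetch_secret(value[len(SECRET_PREFIX):])
        else:
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    # Stand-in for a real secret store; never keep real secrets in code.
    fake_vault = {"db-password": "example-only"}
    settings = {"db_user": "app", "db_password": "secretref:db-password"}
    resolved = resolve_settings(settings, fake_vault.__getitem__)
    print(sorted(resolved))  # print the keys only, never the secret values
```

The repository therefore never contains the secret itself, and access to the value is governed by the permissions on the secret store.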

7️⃣ Continuously Test Modules with Provider Updates
Automated testing, preferably through integration or end-to-end tests, should be embedded in the CI/CD pipeline. This allows early detection of breaking changes, reduces deployment risks, and ensures long-term stability of Azure environments.

8️⃣ Enforce Separation of Service Principals
Using separate service principals enforces the principle of least privilege and reduces the blast radius of potential security incidents. The separation should also be consistently applied across environments (e.g., DEV, TEST, PROD) to ensure clear boundaries and controlled access patterns.
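The effect of scoped deployment identities can be modeled with a small Python sketch: each identity is bound to one scope, and a deployment is only allowed if the target lies within that scope. In Azure, this is enforced by RBAC role assignments rather than by application code. The identity names, the placeholder subscription ID, and the `may_deploy` function are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentIdentity:
    name: str
    scope: str  # scope the identity may deploy to (placeholder IDs below)


def may_deploy(identity: DeploymentIdentity, target_id: str) -> bool:
    """Allow a deployment only if the target lies within the identity's scope."""
    scope = identity.scope.rstrip("/").lower()
    target = target_id.rstrip("/").lower()
    # Match on a path boundary so "rg-app-dev" does not cover "rg-app-dev2".
    return target == scope or target.startswith(scope + "/")


if __name__ == "__main__":
    sub = "/subscriptions/00000000-0000-0000-0000-000000000000"
    sp_dev = DeploymentIdentity("sp-app-dev", f"{sub}/resourceGroups/rg-app-dev")
    print(may_deploy(sp_dev, f"{sub}/resourceGroups/rg-app-dev/providers/Microsoft.Storage/storageAccounts/stappdev"))
    print(may_deploy(sp_dev, f"{sub}/resourceGroups/rg-app-prod"))
```

A compromised DEV identity in this model cannot touch PROD resources, which is exactly the reduction in blast radius described above.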

9️⃣ Standardize CI/CD-Pipelines
A well-defined pipeline enforces validation, planning, and approval stages before applying changes, and integrates quality gates such as static analysis and security checks. Combined with environment-specific promotion workflows and approval mechanisms, this approach guarantees that all infrastructure changes are reproducible, auditable, and aligned with organizational policies, effectively eliminating “ClickOps” and uncontrolled modifications.
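The gating behavior of such a pipeline can be sketched in Python as an ordered list of stages where the first failing stage stops everything that follows. This is a conceptual model, not a GitHub Actions or Azure DevOps definition. The stage names and the `run_pipeline` function are assumptions for illustration.

```python
from typing import Callable, List, Tuple

# A stage is a name plus a callable that returns True on success.
Stage = Tuple[str, Callable[[], bool]]


def run_pipeline(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first one that fails."""
    completed: List[str] = []
    for name, step in stages:
        if not step():
            break
        completed.append(name)
    return completed


if __name__ == "__main__":
    stages: List[Stage] = [
        ("validate", lambda: True),
        ("static-analysis", lambda: True),
        ("plan", lambda: True),
        ("approval", lambda: False),  # approval not yet granted
        ("apply", lambda: True),
    ]
    print(run_pipeline(stages))
```

With the approval missing, the run ends after `plan`, so nothing is applied to the environment.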
Operating Model & Responsibilities
A successful Infrastructure as Code adoption requires a clearly defined operating model that establishes ownership, responsibilities, and collaboration patterns across teams. This can have a massive impact on organizations, as a separate blog post describes in more detail. Without this clarity, organizations risk creating fragmented implementations, inconsistent environments, and uncontrolled changes.
In a typical Azure setup, the responsibility for developing reusable infrastructure modules lies with a dedicated Platform Team. This team designs and maintains the central module library, encapsulating organizational standards and architectural best practices into reusable building blocks. By treating modules as products, the Platform Team ensures consistency, versioning, and continuous improvement, while shielding application teams from underlying complexity. Moreover, the Platform Team is also responsible for operating the platform itself. In this role, the platform is not a passive foundation but an actively managed environment with clear lifecycle ownership, operational processes, and integration into security and monitoring systems.
Infrastructure changes are executed exclusively through standardized CI/CD pipelines, using dedicated service principals with scoped permissions. Application Teams are typically allowed to initiate deployments within their environments, but they do not receive direct write access to Azure resources. Instead, all changes must pass through defined pipeline stages, ensuring validation, planning, and controlled execution.
A key control mechanism within this process is the enforcement of the four-eyes principle. Changes, especially those targeting higher environments such as PROD, must be reviewed and approved by at least one additional qualified individual before they are applied. This is typically implemented through pull request reviews in version control systems and reinforced by mandatory approval gates within CI/CD pipelines. The combination of code review and pipeline-based approvals ensures both technical correctness and organizational accountability.
The interaction model between the Platform Team and Application Teams is based on a provider-consumer relationship. The Platform Team provides well-defined interfaces in the form of modules and platform capabilities, while Application Teams consume these interfaces to deploy and operate their workloads. Close collaboration is essential, as feedback from Application Teams drives the evolution of modules, while the Platform Team ensures that new requirements are implemented in a standardized and secure manner. This balance enables autonomy for development teams without compromising control or consistency.
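The four-eyes principle mentioned above boils down to a simple rule that can be expressed in Python: an approval only counts if it comes from someone other than the author of the change. Real platforms enforce this through branch policies and approval gates. The function name `satisfies_four_eyes` and the user names are hypothetical.

```python
from typing import Iterable


def satisfies_four_eyes(author: str, approvers: Iterable[str], required: int = 1) -> bool:
    """Check that enough distinct people other than the author approved."""
    independent = {person for person in approvers if person != author}
    return len(independent) >= required


if __name__ == "__main__":
    print(satisfies_four_eyes("alice", ["alice"]))         # self-approval only
    print(satisfies_four_eyes("alice", ["alice", "bob"]))  # one independent approver
```

The `required` parameter allows stricter rules for higher environments, for example two independent approvers for PROD.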
By clearly defining who develops modules, who operates the platform, who is allowed to deploy, and how teams collaborate, the operating model becomes a critical enabler for scaling Infrastructure as Code. It transforms IaC from a tooling decision into a sustainable organizational capability.
Conclusion
The introduction of Infrastructure as Code (IaC) in the context of Microsoft Azure marks a fundamental shift in how cloud infrastructures are designed, provisioned, and operated: a move away from manual “ClickOps” in the Azure Portal and error-prone individual configurations toward a standardized, consistent, and reliable deployment model. What may initially appear as a purely technical enhancement through tools such as Terraform, Bicep, or Pulumi often proves to be a far-reaching transformation with significant implications for organizational structures, processes, and, above all, corporate culture!
While the technical advantages of IaC in Azure (like automation, repeatability, and version control) are quite obvious, the real challenge lies mainly beyond the tooling itself. It requires a fundamental shift in mindset and ways of working. Many organizations underestimate that adopting IaC in Azure is not just about introducing new tools, but about introducing a new operating model. Azure resources are no longer configured manually, but are defined, tested, and deployed as code, aligned with principles known from software engineering and governed within dedicated platform and workload teams.
A successful adoption of IaC in Azure therefore requires more than just implementing modern deployment tools. It demands clear governance structures (e.g., aligned with Azure Landing Zones and the Cloud Adoption Framework), standardized deployment processes, and a strong enablement strategy for engineers. Equally important is the close collaboration between development, platform engineering, operations, and IT security to ensure that deployments are secure, compliant, and aligned with organizational standards.
Organizations that establish IaC in Azure in a structured and sustainable manner not only create a solid foundation for stable, scalable, and secure cloud environments, but also enable a modern, agile and cross-team operating model. In this context, Infrastructure as Code becomes a key enabler for implementing enterprise-scale Azure environments, increasing transparency, and ensuring long-term maintainability. IaC is therefore not an end in itself, but a critical capability for achieving efficiency, governance, and future readiness in Azure-based cloud infrastructures.
