Modernizing Terraform Pipelines on Azure: OIDC Federation for GitHub Actions and Azure DevOps

Many Terraform-on-Azure pipelines still authenticate the way they did three years ago: with a long-lived ARM_CLIENT_SECRET stored in GitHub Actions or Azure DevOps, set once, passed around, and rotated only when something breaks.

This credential is one of the most overlooked in a cloud estate and one of the easiest to leak. A developer screenshots a variable group, a pipeline log echoes the secret, a fork inherits access unexpectedly, or the secret simply expires on a Friday evening and blocks production deployments.

Workload Identity Federation (WIF) removes this class of problem. Instead of storing a secret, the pipeline requests a short-lived token at runtime and exchanges it for an Azure access token through Microsoft Entra. GitHub Actions has supported this since 2021, Azure DevOps made WIF generally available in February 2024, and the azurerm Terraform provider has supported it since version 3.7.

This post will guide you through the entire process for both GitHub Actions and Azure DevOps, based on my experiences across multiple client environments.

Before delving into any YAML configuration, let’s picture the process:

  1. The CI system (either GitHub or ADO) generates a short-lived JWT that specifies exactly what’s being run—such as the repository, branch, environment, and service connection.
  2. The pipeline sends this JWT to Microsoft Entra ID.
  3. Entra verifies it against a federated identity credential you’ve set up on either a managed identity or an app registration. The claims iss, sub, and aud must match, and this comparison is case-sensitive.
  4. If verification succeeds, Entra returns a short-lived Azure access token.
  5. Terraform uses this token for the run. It expires on its own shortly afterwards, so no long-lived credential is left behind to rotate or leak.
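Steps 2 to 4 boil down to a single OAuth 2.0 client-credentials request in which the CI-issued JWT takes the place of the client secret. The sketch below only builds that request so you can see its shape. The function name and placeholder values are illustrative, and in a real pipeline the azurerm provider makes this call for you.

```python
from urllib.parse import urlencode

# Fixed identifier for JWT client assertions (RFC 7523).
JWT_BEARER = "urn:ietf:params:oauth:client-assertion-type:jwt-bearer"


def build_token_exchange(tenant_id, client_id, ci_jwt,
                         scope="https://management.azure.com/.default"):
    """Return (url, form_body) for swapping a CI-issued JWT for an access token.

    Hypothetical helper for illustration only; nothing is sent over the network.
    """
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        # The JWT from GitHub or ADO is presented where a secret used to go.
        "client_assertion_type": JWT_BEARER,
        "client_assertion": ci_jwt,
        "scope": scope,
    })
    return url, body


url, body = build_token_exchange("<tenant-id>", "<client-id>", "<jwt-from-ci>")
print(url)
print(body)
```

Note that no secret appears anywhere in the request: the only proof of identity is the signed JWT, which Entra checks against the federated credential.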

The token is specifically tied to a subject, such as repo:contoso/platform:environment:prod or sc://contoso/platform/azure-prod. This means it cannot be used across different repositories, branches, or pipelines.
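To make the binding concrete, here is a small model of the comparison, assuming an exact, case-sensitive match on iss, sub, and aud as described above. It is my own illustration of the behaviour, not Entra's implementation, and the class and function names are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FederatedCredential:
    issuer: str
    subject: str
    audience: str


def matches(cred, claims):
    """Exact, case-sensitive comparison of iss, sub and aud (illustrative model)."""
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    return (
        claims.get("iss") == cred.issuer
        and claims.get("sub") == cred.subject
        and cred.audience in audiences
    )


prod = FederatedCredential(
    issuer="https://token.actions.githubusercontent.com",
    subject="repo:contoso/platform:environment:prod",
    audience="api://AzureADTokenExchange",
)

good = {"iss": prod.issuer, "sub": prod.subject, "aud": prod.audience}
other_env = {**good, "sub": "repo:contoso/platform:environment:nonprod"}
wrong_case = {**good, "sub": "repo:Contoso/platform:environment:prod"}
trailing_slash = {**good, "iss": prod.issuer + "/"}

for name, claims in [("good", good), ("other_env", other_env),
                     ("wrong_case", wrong_case), ("trailing_slash", trailing_slash)]:
    print(name, matches(prod, claims))
```

Only the first token matches. A different environment, a different letter case, or a stray trailing slash on the issuer is enough to fail the comparison.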

Here are some practical choices that often work well in production:

| Decision | Choice |
| --- | --- |
| Identity type | User-assigned managed identity (UAMI), not app registration |
| Identity granularity | One UAMI per environment (not per pipeline) |
| Trust scope | Pinned to the environment claim, not the branch |
| RBAC scope | Resource group, not subscription |
| Remote state | OIDC + use_azuread_auth = true, shared key access disabled |

Why choose UAMIs? They live in your subscription, don’t require Application Administrator rights to manage, and follow the lifecycle of their resource group. One per environment is sensible because creating an identity for every pipeline quickly leads to an unmanageable number of identities. Mapping one identity to each environment keeps things neat and organised.

You only need to run two commands for each environment:

az identity create -g rg-platform-identity -n id-tf-prod -l eastus

az identity federated-credential create \
  --name github-prod \
  --identity-name id-tf-prod \
  --resource-group rg-platform-identity \
  --issuer https://token.actions.githubusercontent.com \
  --subject repo:contoso/platform:environment:prod \
  --audiences api://AzureADTokenExchange

Simply repeat these for your non-production environments. No secrets are created during this process.
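If you have more than a couple of environments, you could generate the command pairs instead of copying them by hand. The sketch below only prints the two commands shown above for each environment. The environment list and the id-tf-<env> / github-<env> naming pattern are assumptions carried over from the prod example.

```python
import shlex

# Assumed environment names; adjust to match your GitHub environments.
ENVIRONMENTS = ["nonprod", "prod"]


def az_commands(env, repo="contoso/platform",
                resource_group="rg-platform-identity", location="eastus"):
    """Return the two az CLI invocations for one environment as argument lists."""
    identity = f"id-tf-{env}"
    create_identity = [
        "az", "identity", "create",
        "-g", resource_group, "-n", identity, "-l", location,
    ]
    create_credential = [
        "az", "identity", "federated-credential", "create",
        "--name", f"github-{env}",
        "--identity-name", identity,
        "--resource-group", resource_group,
        "--issuer", "https://token.actions.githubusercontent.com",
        "--subject", f"repo:{repo}:environment:{env}",
        "--audiences", "api://AzureADTokenExchange",
    ]
    return [create_identity, create_credential]


for env in ENVIRONMENTS:
    for command in az_commands(env):
        # Print rather than execute, so the output can be reviewed first.
        print(shlex.join(command))
```

Printing the commands keeps a human in the loop. Piping the output to a shell, or passing each list to subprocess.run, is a small step once the output looks right.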

In your repository, navigate to Settings → Environments and create nonprod and prod. On the prod environment, add required reviewers and a deployment branch rule that restricts deployments to the main branch. Then add three environment variables (not secrets, as they’re not sensitive): AZURE_CLIENT_ID, AZURE_TENANT_ID, and AZURE_SUBSCRIPTION_ID.

Your workflow will be compact:

permissions:
  id-token: write
  contents: read

jobs:
  apply:
    runs-on: ubuntu-latest
    environment: prod
    env:
      ARM_USE_OIDC: "true"
      ARM_CLIENT_ID: ${{ vars.AZURE_CLIENT_ID }}
      ARM_TENANT_ID: ${{ vars.AZURE_TENANT_ID }}
      ARM_SUBSCRIPTION_ID: ${{ vars.AZURE_SUBSCRIPTION_ID }}
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init && terraform apply -auto-approve

Here are three features that enhance security:

  • The only elevated permission is id-token: write, which grants no write access to anything in GitHub; it only lets the job request an OIDC token.
  • The environment: line both selects the environment-scoped AZURE_CLIENT_ID and determines the sub claim. A token carrying any other sub is rejected by the federated credential.
  • There’s no need for an azure/login step when using Terraform, as the azurerm provider automatically reads the OIDC environment variables from GitHub.
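For the curious, this is roughly what happens behind that last bullet. With id-token: write granted, the runner exposes ACTIONS_ID_TOKEN_REQUEST_URL and ACTIONS_ID_TOKEN_REQUEST_TOKEN, and a client fetches the JWT with a single GET. The helper below is a hypothetical sketch of that request, not the provider's code, and it only performs the call when it is actually running inside a job.

```python
import json
import os
import urllib.request
from urllib.parse import quote


def build_github_oidc_request(env=os.environ,
                              audience="api://AzureADTokenExchange"):
    """Build the GET request that asks GitHub for an OIDC token."""
    base = env["ACTIONS_ID_TOKEN_REQUEST_URL"]
    bearer = env["ACTIONS_ID_TOKEN_REQUEST_TOKEN"]
    # The runner-provided URL usually already has a query string.
    separator = "&" if "?" in base else "?"
    url = f"{base}{separator}audience={quote(audience, safe='')}"
    return urllib.request.Request(url, headers={"Authorization": f"bearer {bearer}"})


# Only attempt the call when running inside a GitHub Actions job.
if "ACTIONS_ID_TOKEN_REQUEST_URL" in os.environ:
    with urllib.request.urlopen(build_github_oidc_request()) as response:
        jwt = json.load(response)["value"]
    # Never print the token itself; its length is enough to confirm success.
    print("received OIDC token of length", len(jwt))
```

If the two environment variables are missing inside a job, the usual cause is a missing id-token: write permission on the workflow or job.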

The Azure DevOps flow is conceptually the same as the GitHub Actions one, but the setup differs.

Azure DevOps provides two options for creating a WIF service connection: automatic (where it generates an app registration) and manual (where you supply your own UAMI). For platform teams, the manual option with a UAMI tends to be preferable since it keeps identity governance in your own hands.

The process involves a straightforward sequence across two platforms:

  1. In Azure DevOps, initiate the creation of a new ARM service connection, select Workload Identity Federation (manual), and enter your UAMI’s client ID, tenant ID, and subscription. Save it as a draft, and ADO will display the issuer URL and subject identifier.
  2. Then, in Azure, on the UAMI, create a federated credential using the values provided by ADO. The subject will resemble sc://contoso/platform/azure-prod.
  3. Finally, return to ADO and hit Verify and save.

In the pipeline, the service connection is only loaded if a task in the job references it. The simplest way to do this is with the AzureCLI@2 task:

- task: AzureCLI@2
  inputs:
    azureSubscription: azure-prod   # the WIF service connection
    scriptType: bash
    scriptLocation: inlineScript
    inlineScript: |
      terraform init && terraform apply -auto-approve
  env:
    ARM_USE_OIDC: "true"
    ARM_CLIENT_ID: $(AZURE_CLIENT_ID)
    ARM_TENANT_ID: $(AZURE_TENANT_ID)
    ARM_SUBSCRIPTION_ID: $(AZURE_SUBSCRIPTION_ID)
    ARM_ADO_PIPELINE_SERVICE_CONNECTION_ID: $(SERVICE_CONNECTION_ID)
    SYSTEM_ACCESSTOKEN: $(System.AccessToken)
    SYSTEM_OIDCREQUESTURI: $(System.OidcRequestUri)
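The last three variables are what allow a client to fetch the ADO-issued JWT. As a rough sketch of that request: it is a POST to the OIDC request URI, authorised with the job access token and qualified by the service connection ID. The api-version value and the oidcToken response field are assumptions on my part, so check them against the current Azure DevOps REST documentation before relying on them.

```python
import json
import os
import urllib.request
from urllib.parse import urlencode


def build_ado_oidc_request(env=os.environ, api_version="7.1"):
    """Build the POST request that asks Azure DevOps for an OIDC token.

    api_version is an assumed value, not something stated in this post.
    """
    query = urlencode({
        "api-version": api_version,
        "serviceConnectionId": env["ARM_ADO_PIPELINE_SERVICE_CONNECTION_ID"],
    })
    url = f"{env['SYSTEM_OIDCREQUESTURI']}?{query}"
    return urllib.request.Request(
        url,
        data=b"",  # empty body; the parameters travel in the query string
        method="POST",
        headers={
            "Authorization": f"Bearer {env['SYSTEM_ACCESSTOKEN']}",
            "Content-Type": "application/json",
        },
    )


# Only attempt the call when running inside an Azure DevOps job.
if "SYSTEM_OIDCREQUESTURI" in os.environ:
    with urllib.request.urlopen(build_ado_oidc_request()) as response:
        jwt = json.load(response)["oidcToken"]  # assumed response field name
    print("received OIDC token of length", len(jwt))
```

This also shows why SYSTEM_ACCESSTOKEN has to be mapped explicitly in the env block: without it, there is nothing to authorise the request with.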

For teams transitioning multiple legacy connections, the Azure DevOps team has released a PowerShell helper. This tool examines each ARM service connection in a project and converts them in place, providing a 7-day rollback window on each connection, ensuring a low-risk migration process.

The Terraform state is where your greatest risks lie. With OIDC, it’s straightforward to secure it. The same UAMI can read and write blob data without needing the storage account key:

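terraform {
  backend "azurerm" {
    resource_group_name  = "rg-tfstate"
    storage_account_name = "sttfstateprodeastus"
    container_name       = "platform-prod"
    key                  = "platform.tfstate"
    use_oidc             = true # authenticate with the pipeline's OIDC token
    use_azuread_auth     = true # access blobs via Entra ID, not the account key
  }
}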

Assign the UAMI the Storage Blob Data Contributor role on the container (not the whole account), and disable shared key access on the storage account. This action removes the last remaining secret in your pipeline.

While federation eliminates the need for a credential, it doesn’t take away the privileges. Here are some valuable habits to maintain:

  • Limit role assignments to resource groups instead of entire subscriptions. With no secret to manage per identity, narrow scoping no longer carries an operational cost.
  • Opt for Role Based Access Control Administrator instead of User Access Administrator if your Terraform creates role assignments, as it’s a more narrowly scoped role.
  • Maintain a documented emergency access plan. If there’s a token-service failure in GitHub or ADO, you should still have a method to push a hotfix. A single, hardware-key-protected emergency app registration within a separate identity boundary works well, provided it’s audited regularly.
  • Keep an eye on sign-ins. Every federated exchange is recorded in the Entra sign-in logs as a service principal sign-in. Forward these logs to Sentinel and set up alerts for unusual activities, such as logins outside of typical hours or from non-standard IP addresses.

| Symptom | What it actually indicates |
| --- | --- |
| AADSTS70021: No matching federated identity record found | There’s often a case-sensitive mismatch in iss, sub, or aud, usually due to a trailing slash or incorrect character casing. |
| AADSTS700016: Application not found in directory | The client ID or tenant is incorrect; this is not a federated identity issue. |
| 403 on a resource despite successful token exchange | Your RBAC configuration might be off, not the federation. Check the exact scope. |
| Unable to determine OIDC token (ADO) | No task in the job is loading the service connection. Add an AzureCLI@2 step to resolve this. |
| Works on main, fails on tags | You’ve likely pinned sub to a branch. Either create an additional federated credential for tags or transition to environment-based scoping. |

When migrating, you rarely have the luxury of starting from scratch. Here’s an order of operations that has worked well for me on legacy setups:

  1. Create the new UAMI alongside the existing service principal, keeping the same role assignments.
  2. Connect one canary pipeline and confirm it deploys correctly.
  3. Gradually transition pipelines, starting with the least risky environments.
  4. After a complete release cycle with no issues, disable the secret for the old service principal.
  5. Wait another cycle, and then remove the service principal entirely.
  6. Add a CI check that fails any pipeline change that tries to introduce ARM_CLIENT_SECRET.
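The check in step 6 can be very small. Here is a minimal sketch that scans YAML files under a directory for the variable name and reports each hit. The file patterns and the function names are my assumptions, so widen them if your pipelines live in other file types.

```python
import sys
from pathlib import Path

FORBIDDEN = "ARM_CLIENT_SECRET"
# Assumed locations of pipeline definitions: any YAML file under the root.
PIPELINE_GLOBS = ("*.yml", "*.yaml")


def find_violations(root):
    """Return (path, line_number) for every line that mentions the secret."""
    hits = []
    for pattern in PIPELINE_GLOBS:
        for path in sorted(Path(root).rglob(pattern)):
            text = path.read_text(encoding="utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if FORBIDDEN in line:
                    hits.append((str(path), number))
    return hits


def main(argv):
    """Print each violation and return a process exit code (1 if any found)."""
    root = argv[1] if len(argv) > 1 else "."
    violations = find_violations(root)
    for path, number in violations:
        print(f"{path}:{number}: {FORBIDDEN} is not allowed; use OIDC instead")
    return 1 if violations else 0
```

To use it as a pipeline step, end the script with `raise SystemExit(main(sys.argv))` and run it against the repository root. A non-zero exit code then fails the build.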

The old and new authentication methods can operate side by side within the same subscription throughout the transition. There’s no abrupt cutover or maintenance window required, just a steady progress towards eliminating secrets.

If you take away one thing from this post, let it be this: search through your CI variable groups for ARM_CLIENT_SECRET. Every instance you find represents a potential outage or security breach.

Federation is rare in that it offers both improved security and reduced operational workload. Once you’ve set it up, you won’t have to fret about rotating credentials, expiry of secrets, or quarterly reviews of service principal access. Your pipeline runs seamlessly, and the audit trail remains within Entra—where it rightly belongs.

That’s a worthwhile trade-off.
