Running generative AI on Azure, through Azure OpenAI and Azure AI Foundry, gives teams a fast path to production. It also introduces risk surfaces that traditional application security was never designed for. The same governance discipline I apply to infrastructure at scale applies here: define what is allowed, enforce it where every request passes through, and produce evidence continuously. What changes are the specific control points, because AI behaves differently from static software.

Why AI workloads need their own controls

A conventional app has inputs you can review at design time and outputs you can predict. A generative AI workload is dynamic and often multi-step: the same prompt can produce different outputs as retrieval context changes or the underlying model is updated. Three patterns in particular break controls that assume static behavior, and each shows up directly in Azure deployments.

Pattern 1: Retrieval-augmented generation (RAG)

RAG is the most common Azure OpenAI pattern, grounding model responses in your own data through Azure AI Search or a vector store. It also makes data access dynamic: what the model can see depends on what the retriever returns at runtime.

The risks are retrieval-source poisoning, confused-deputy access (the app retrieving documents the end user should not see), and document-level over-permission. The governance patterns:

  • Enforce identity-aware retrieval so the user’s permissions, carried through Microsoft Entra ID, constrain what the retriever can return, rather than the application holding broad access on everyone’s behalf.
  • Attest and control retrieval sources, so only approved indexes feed the model.
  • Log every retrieval as evidence, so you can reconstruct what grounded any given answer.
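To make those three bullets concrete, here is a minimal sketch in plain Python. It does not call Azure AI Search; the `Document` and `User` types, the `APPROVED_INDEXES` allowlist, and the in-memory `retrieval_log` are all hypothetical stand-ins. In a real deployment the group check would more likely be pushed into the search query as a filter, with the groups taken from the user's Entra ID token.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    doc_id: str
    index: str                  # which search index the document came from
    allowed_groups: frozenset   # group IDs permitted to read this document
    text: str


@dataclass(frozen=True)
class User:
    user_id: str
    groups: frozenset           # assumed to come from the user's identity token


# Hypothetical allowlist of attested retrieval sources.
APPROVED_INDEXES = {"hr-policies", "eng-handbook"}

# Stand-in for a durable evidence store.
retrieval_log = []


def retrieve_for_user(user, candidates):
    """Keep only documents from approved indexes that the user may read."""
    allowed = [
        d for d in candidates
        if d.index in APPROVED_INDEXES and d.allowed_groups & user.groups
    ]
    # Record what was returned and what was withheld, so the grounding
    # of any later answer can be reconstructed.
    retrieval_log.append({
        "ts": time.time(),
        "user": user.user_id,
        "returned": [d.doc_id for d in allowed],
        "denied": [d.doc_id for d in candidates if d not in allowed],
    })
    return allowed
```

The point of the design is that the filter runs with the user's groups, not the application's, so the app never holds a result set the user could not have read directly.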

Pattern 2: Tool-using agents

When an Azure OpenAI assistant or AI Foundry agent can call tools and APIs, it stops merely answering and starts acting. Every tool call becomes an audit event, and tool privilege creep becomes the dominant risk: once an agent holds an API credential, the blast radius of misuse is everything that credential can reach, not just the calls the task needs.

The governance patterns: scope tool access per role rather than granting broad capability, log every action the agent takes, and require human approval gates on consequential operations. Treat the agent’s permissions the way you would treat a service principal’s: least privilege, scoped, and auditable.
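A minimal sketch of that gate, assuming a single dispatch function sits between the agent and its tools. The role names, the tools, the `REQUIRES_APPROVAL` set, and the `approver` callback are hypothetical; none of this is an Azure OpenAI or AI Foundry API.

```python
# Hypothetical role-to-tool scopes: each role sees only the tools it needs.
ROLE_TOOL_SCOPES = {
    "support_agent": {"lookup_order", "issue_refund"},
    "readonly_agent": {"lookup_order"},
}

# Consequential operations that need a human decision before they run.
REQUIRES_APPROVAL = {"issue_refund"}

# Stand-in for a durable, append-only audit store.
audit_log = []


def lookup_order(order_id):
    return {"order_id": order_id, "status": "shipped"}


def issue_refund(order_id, amount):
    return {"order_id": order_id, "refunded": amount}


TOOLS = {"lookup_order": lookup_order, "issue_refund": issue_refund}


def call_tool(role, tool_name, args, approver=None):
    """Check scope, then approval, log the decision, and only then execute."""
    if tool_name not in ROLE_TOOL_SCOPES.get(role, set()):
        decision = "denied_out_of_scope"
    elif tool_name in REQUIRES_APPROVAL and not (approver and approver(tool_name, args)):
        decision = "denied_no_approval"
    else:
        decision = "allowed"

    # Every attempt is an audit event, including the denied ones.
    audit_log.append(
        {"role": role, "tool": tool_name, "args": args, "decision": decision}
    )
    if decision != "allowed":
        raise PermissionError(f"{role} may not call {tool_name}: {decision}")
    return TOOLS[tool_name](**args)
```

A call such as `call_tool("support_agent", "issue_refund", {"order_id": "A1", "amount": 20}, approver=ask_human)` runs only if the hypothetical `ask_human` callback returns a truthy value; without an approver the refund is refused and the refusal is still logged.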

Pattern 3: Agentic chains

When agents chain steps, a bad assumption early in the chain cascades through every later step, and the human-in-the-loop you had in the pilot is gone. Governing the chain means instrumenting decision lineage end to end, so you can trace which inputs, retrievals, and tool calls produced a final action, not just inspect the last step.
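One way to picture decision lineage is as a small graph in which every step records the steps it depended on. The sketch below is an assumption about how that could be structured, not a feature of any Azure service; the `Lineage` class, the step kinds, and the step IDs are all hypothetical.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    step_id: str
    kind: str            # e.g. "input", "retrieval", "tool_call", "action"
    detail: str
    parents: tuple = ()  # IDs of the steps this one depended on


class Lineage:
    """Records each step of one agent run together with its dependencies."""

    def __init__(self):
        self.run_id = str(uuid.uuid4())
        self.steps = {}

    def record(self, kind, detail, parents=()):
        step_id = f"s{len(self.steps) + 1}"
        self.steps[step_id] = Step(step_id, kind, detail, tuple(parents))
        return step_id

    def trace(self, step_id):
        """Return every step that fed into step_id, dependencies first."""
        seen, ordered = set(), []

        def visit(sid):
            if sid in seen:
                return
            seen.add(sid)
            for parent in self.steps[sid].parents:
                visit(parent)
            ordered.append(self.steps[sid])

        visit(step_id)
        return ordered


run = Lineage()
question = run.record("input", "user asks about order A1")
doc = run.record("retrieval", "doc-42", parents=[question])
other = run.record("retrieval", "doc-99", parents=[question])   # never used downstream
lookup = run.record("tool_call", "lookup_order(A1)", parents=[doc])
action = run.record("action", "send status email", parents=[lookup, doc])
```

Calling `run.trace(action)` walks back from the final action to the input, retrieval, and tool call that produced it, and leaves out the retrieval that never contributed.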

Wiring it into the Azure control plane

These patterns become real when they run as policy, not documentation. On Azure that means:

  • Azure Policy and Microsoft Defender for Cloud to enforce baseline configuration on the resources hosting the workload (network isolation, private endpoints for Azure OpenAI, key and identity hygiene).
  • Microsoft Entra ID as the identity backbone for identity-aware retrieval and scoped agent permissions.
  • Content filtering and abuse monitoring in Azure OpenAI as a runtime guardrail, complemented by your own output classifiers for domain-specific policy.
  • Centralized logging (Azure Monitor and your evidence store) so decision lineage, retrieval logs, and tool-call records are queryable and retained as audit evidence.

Each control should map back to the frameworks you answer to (EU AI Act articles on logging and human oversight, NIST AI RMF functions, ISO/IEC 42001), so that one enforced control feeds multiple compliance reports at once.
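The mapping itself can be kept as data and inverted on demand. In the sketch below the control names are hypothetical and the framework tags are illustrative placeholders, not a vetted legal or audit mapping; which control satisfies which requirement is a call for your compliance team.

```python
from collections import defaultdict

# Illustrative tags only: an assumption for the sake of the example,
# not an authoritative mapping of controls to requirements.
CONTROL_MAP = {
    "retrieval_logging": [
        "EU AI Act Art. 12 (record-keeping)",
        "NIST AI RMF: MEASURE",
        "ISO/IEC 42001",
    ],
    "human_approval_gate": [
        "EU AI Act Art. 14 (human oversight)",
        "NIST AI RMF: MANAGE",
        "ISO/IEC 42001",
    ],
    "scoped_tool_access": [
        "NIST AI RMF: MANAGE",
        "ISO/IEC 42001",
    ],
}


def controls_by_framework(control_map):
    """Invert control -> frameworks into framework -> sorted list of controls."""
    report = defaultdict(list)
    for control, references in control_map.items():
        for reference in references:
            report[reference].append(control)
    return {ref: sorted(controls) for ref, controls in report.items()}


report = controls_by_framework(CONTROL_MAP)
```

Each key of `report` is one compliance view, and the same enforced control appears under every framework it supports.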

The takeaway

Securing AI on Azure is not a matter of bolting a filter onto the model. It is governing the system around the model: identity-aware retrieval, scoped agent permissions, decision lineage, and runtime guardrails, all enforced through the Azure control plane and producing evidence as a byproduct. The platform gives you the building blocks. The governance, deciding what is allowed and enforcing it where every request flows, is the work that makes an AI workload trustworthy in production.


I write about cloud security, DevSecOps governance, and AI risk, and I speak on building and governing trustworthy AI systems at scale.