Step-by-step security checklists for the most common AI security tasks. Each playbook distils field experience into the minimum viable set of controls: what to do, in what order, and what to watch for.
Six analytic rules and a dedicated workbook for Copilot telemetry, contributed to the official Microsoft Sentinel GitHub repository by Samik Roy (May 2026). Deploy as a single solution from Sentinel Content Hub → Microsoft Copilot solution. Requires CopilotActivity table ingestion via the Copilot Data Connector.
Step 1: Ensure CopilotActivity logs are ingested
- Sentinel Content Hub → Copilot Data Connector (Public Preview, Feb 2026)
- Requires the Global Administrator or Security Administrator role
Step 2: Deploy the Microsoft Copilot solution
- Microsoft Sentinel → Content Hub → search "Microsoft Copilot" → Install
- Includes: workbook + 6 analytic rules + hunting queries in one solution
GitHub source:
Analytic rules: Azure-Sentinel/Solutions/Microsoft Copilot/Analytic Rules
Hunting queries: Azure-Sentinel/Solutions/Microsoft Copilot/Hunting Queries
Workbook: Azure-Sentinel/Solutions/Microsoft Copilot/Workbooks/MicrosoftCopilotActivityMonitoring.json
Step 3: Open workbook and validate data against your environment
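Before opening the workbook, a minimal ingestion check can confirm events are flowing. This is a sketch: TimeGenerated exists on every Log Analytics table, but other CopilotActivity column names may vary by connector version.

// Sketch: confirm CopilotActivity events arrived in the last 24 hours
CopilotActivity
| where TimeGenerated > ago(24h)
| summarize Events = count() by bin(TimeGenerated, 1h)
| order by TimeGenerated desc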
Step 4: Enable and tune analytic rules to your policy and risk appetite

When you enable "Continuously detect managed devices" in Agent 365 Shadow AI, Intune automatically creates a Properties catalog profile called A365 - Monitor OpenClaw. It uses the new Local AI Agent Settings Catalog node, runs via the Intune Management Extension (IME), and refreshes every 24 hours. Read-only, so it is safe to deploy. Source: Derk van der Woude, May 2026. The profile inventories the following properties per device:
- Agent Name – canonical identifier for the agent type
- Agent Version – version string of the installed agent
- Host Process – parent process executing the agent
- Install Location – filesystem path of the installation
- Install Scope – per-user vs per-machine
- Install Scope Platform User ID – Windows SID of the installing user
- Install Scope User ID – Entra ID user identifier (UPN)
- Local AI Agent Execution Context – user / elevated / SYSTEM (⚠️ SYSTEM = high risk)

// Requires Intune device inventory data in Log Analytics
// IntuneDeviceCompliancePolicies or custom inventory table from IME
// Filter for elevated/SYSTEM execution context
IntuneDevices
| where Properties has "Local AI Agent Execution Context"
| extend AgentName = extract("AgentName:([^,]+)", 1, Properties)
| extend ExecContext = extract("ExecutionContext:([^,]+)", 1, Properties)
| extend InstallUser = extract("InstallScopeUserID:([^,]+)", 1, Properties)
| where ExecContext in ("elevated","SYSTEM")
| project DeviceName, AgentName, ExecContext, InstallUser, LastSync
| order by ExecContext desc, LastSync desc

The CloudAppEvents table captures Copilot and agent activity from the M365 Unified Audit Log via Defender for Cloud Apps. Requires: Settings → Cloud Apps → App connectors → M365 activities enabled. Metadata only: no prompt content.
CloudAppEvents
| where Timestamp > ago(7d)
| where ActionType startswith "CopilotAgent" or ActionType startswith "UpdateCopilot"
| project Timestamp, ActionType, AccountDisplayName,
AgentName = tostring(RawEventData.CopilotAgentName),
ChangeDetail = tostring(RawEventData)
| order by Timestamp desc

// Correlate Copilot agent updates made by an admin with emails from the same
// account whose subject suggests sensitive content. Note: the join assumes
// AccountDisplayName holds the sender address; if not, join on a UPN column instead.
CloudAppEvents
| where Timestamp > ago(7d)
| where ActionType startswith "UpdateCopilotAgent"
| project CopilotActionTime = Timestamp,
AdminAccount = AccountDisplayName,
AgentName = tostring(RawEventData.CopilotAgentName),
Action = ActionType
| join kind=inner (
EmailEvents
| where Timestamp > ago(7d)
| where Subject has_any ("confidential","restricted","sensitive")
) on $left.AdminAccount == $right.SenderFromAddress
| project CopilotActionTime, AdminAccount, AgentName, Action,
    EmailTime = Timestamp, EmailSubject = Subject

// Daily volume of Security Copilot trigger events over the past 30 days
CloudAppEvents
| where Timestamp > ago(30d)
| where ActionType == "CopilotForSecurityTrigger"
| summarize TriggerCount = count(), UniqueUsers = dcount(AccountObjectId)
by ActionType, Application, bin(Timestamp, 1d)
| order by Timestamp desc

Identifies which AI model each Copilot Studio agent is using by extracting modelNameHint from RawAgentInfo. Flags EU Data Boundary (EUDB) compliance status per agent, which is critical for EU organisations because Anthropic-hosted models process data outside the EUDB. Source: Blue161616/Agent-Identity.
AIAgentsInfo
| summarize arg_max(Timestamp, *) by AIAgentId
| where Platform == "Copilot Studio" and AgentStatus != "Deleted"
| extend ModelNameHint = extract(@"modelNameHint:\s*([A-Za-z0-9_\-\.]+)", 1, RawAgentInfo)
| extend HintLower = tolower(ModelNameHint)
| extend Provider = case(
HintLower startswith "sonnet" or HintLower startswith "haiku" or HintLower startswith "opus", "Anthropic",
HintLower startswith "gpt" or HintLower startswith "o1" or HintLower startswith "o3", "OpenAI (Microsoft-hosted)",
isempty(ModelNameHint), "Environment default",
"Other"
)
| extend EUDB_Status = case(
Provider == "Anthropic", "OUT OF EUDB โ cross-geo processing",
Provider == "OpenAI (Microsoft-hosted)", "In EUDB (if environment is in EU)",
Provider == "Environment default", "Depends on tenant default โ verify",
"Verify"
)
| project AIAgentName, Provider, ModelNameHint, EUDB_Status,
EnvironmentId, CreatorAccountUpn, OwnerAccountUpns, LastModifiedTime
| sort by Provider asc, AIAgentName asc

The RegistrySource column distinguishes agent origin: "A365" (Agent 365 registered) vs "PowerPlatform" (Copilot Studio via the Power Platform connector). Use these filters to target the right agent population; a minimal filter sketch follows below. The four queries below cover the highest-risk patterns for A365-registered agents; to cover Copilot Studio agents, use RegistrySource == "PowerPlatform" in the same queries. The no-auth query from Step 1 remains the most important for Copilot Studio, as Query 8a-8d are specifically designed for A365-registered agents.
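The sketch below reuses the AIAgentsInfo columns shown in this section; treat it as a starting point rather than a tested query:

// Latest record per agent, filtered to A365-registered agents
AIAgentsInfo
| summarize arg_max(Timestamp, *) by AIAgentId
| where RegistrySource == "A365" // or "PowerPlatform" for Copilot Studio
| project AIAgentName, RegistrySource, Platform, AgentStatus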
pip install pyrit

# Required env vars (for Azure OpenAI target + scorer)
export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_ENDPOINT="https://your-instance.openai.azure.com/"
export AZURE_OPENAI_DEPLOYMENT="gpt-4o"

Config-driven: switch strategies, datasets, or thresholds with no code changes. Point target at your agent entry point.
# scan.yaml
scan:
  name: "my-agent-security-scan"
  target:
    module: "my_agent.main"          # Your agent module
    function: "handle_prompt"        # Your agent entry function
  strategies:
    - name: "prompt_sending"         # Single-turn baseline
    - name: "crescendo"              # Gradual escalation
      max_turns: 5
    - name: "tap"                    # Tree of Attacks with Pruning
  converters:                        # Stack: applied in order
    - name: "Base64Converter"
    - name: "LeetspeakConverter"
    - name: "TranslationConverter"
  datasets:
    - "airt_illegal"                 # Microsoft AI Red Team dataset
    - "harmbench"                    # Harm evaluation benchmark
    - "advbench"                     # Adversarial benchmark
    - path: "datasets/custom.yaml"   # Your domain-specific prompts
  gate:
    fail_on:
      critical_findings: 0           # Zero tolerance for critical
      high_findings: 3
      max_score: 0.3                 # Defense rate must exceed 70%

Run in sequence: if plain prompts pass, layer evasion. Each phase builds on the previous.
Phase 1: Plain prompts
- Baseline: establishes what passes without evasion
- Catches basic prompt injection and policy violations

Phase 2: Encoded prompts
- Base64, ROT13, Leetspeak, Unicode confusables
- Tests whether your agent/guardrails decode before evaluating

Phase 3: Semantic attacks
- LLM-powered rephrasing, translation, multimodal injection
- Converters stack: translate → Base64 → embed in image

Phase 4: Multi-turn dialogue attacks
- CrescendoAttack: gradual escalation over 5–10 turns
- TreeOfAttacksWithPruning (TAP): branching attack trees
- Tests whether context accumulation bypasses initial guardrails
Map findings to OWASP LLM Top 10 (2025) for structured risk reporting. Turns PyRIT output into a risk register your security team understands.
LLM01 Prompt Injection – PromptSendingAttack + injection datasets (airt_illegal)
LLM02 Sensitive Info – Data exfiltration datasets + PII scorers
LLM06 Excessive Agency – Tool-calling attack datasets (advbench)
LLM07 System Prompt Leakage – System prompt extraction datasets
LLM10 Unbounded Consumption – High-volume automated attack patterns
Integrate into your pipeline. Exit code 0 = pass (deploy), exit code 1 = fail (block). No custom actions needed.
# GitHub Actions example
jobs:
  security-scan:
    runs-on: ubuntu-latest
    steps:
      - name: Run AI security scan
        run: |
          pip install pyrit
          python scanner.py --config scan.yaml --output reports/
      - name: Evaluate release gate
        run: |
          python gate.py --report reports/scan_results.json
          # Exit 1 blocks deployment automatically

# When to run:
# Every merge to main: Quick scan only (phases 1-2, ~10 min)
# Pre-release branch: Full scan (all 4 phases, architect approval)
# Weekly scheduled: Full scan across the full agent estate