Microsoft Copilot Now Runs Claude, Gemini & GPT: The Enterprise Multi-AI Strategy Guide for 2026
Microsoft 365 Copilot is no longer a single-model tool. With Claude enabled by default since January 2026, plus GPT-5 and Gemini routing, enterprises face a new governance challenge: managing a multi-AI ecosystem inside their tenant. Here is the CIO playbook for getting it right.
Copilot Consulting
February 24, 2026
18 min read
On January 7, 2026, Microsoft flipped a switch that fundamentally changed enterprise AI. Anthropic's Claude became enabled by default inside Microsoft 365 Copilot. Not as an optional add-on. Not buried in an admin setting. Enabled by default for every Copilot user across your tenant.
This was not a minor product update. It was the end of single-model enterprise AI. Microsoft Copilot now orchestrates across Claude (Anthropic), GPT-5 (OpenAI), Gemini 2.5 Pro (Google), and Phi-4 (Microsoft's own small language model). The system dynamically selects which model handles each request based on task complexity, latency requirements, and cost optimization. A simple email summary might route to Phi-4. A complex legal document analysis might route to Claude. A creative marketing draft might route to GPT-5.
For CIOs and enterprise architects, this creates an entirely new governance surface. Your data is no longer flowing to a single AI provider on a single cloud. It is flowing to multiple providers, across multiple clouds, under multiple data processing agreements. Claude runs on AWS infrastructure, not Azure. Gemini runs on Google Cloud. Your Microsoft 365 data---emails, documents, Teams conversations---is now potentially traversing three different hyperscaler environments in a single Copilot session.
This guide covers what changed, what it means for enterprise governance, where Perplexity fits into the landscape, and how to build a multi-AI strategy that does not leave your organization exposed.
What Actually Changed: The Multi-Model Architecture
Microsoft Copilot's orchestration layer now operates as a model router. When a user submits a prompt, the orchestration engine evaluates the request against multiple factors before selecting a model:
- Task type: Summarization, generation, analysis, code, reasoning, research
- Complexity: Simple tasks route to smaller, faster models; complex tasks route to frontier models
- Latency sensitivity: Real-time interactions favor faster models; batch operations can use slower, more capable models
- Cost optimization: Microsoft dynamically balances quality against compute cost
- Compliance constraints: Admin-configured policies can restrict which models are available
The Model Roster (February 2026)
| Model | Provider | Cloud Infrastructure | Strengths | Default Status |
|-------|----------|----------------------|-----------|----------------|
| GPT-5 | OpenAI | Azure | Creative generation, broad knowledge, multimodal | Enabled by default |
| Claude 3.5 Opus | Anthropic | AWS (us-east-1, eu-west-1) | Long-context analysis, reasoning, safety | Enabled by default (Jan 7, 2026) |
| Gemini 2.5 Pro | Google | Google Cloud | Multimodal, large context window, search grounding | Enabled by default |
| Phi-4 | Microsoft | Azure | Fast inference, cost-efficient, simple tasks | Enabled by default |
| GPT-4o | OpenAI | Azure | Balanced performance, multimodal | Enabled by default |
The critical detail most enterprises are missing: you cannot see which model handled a specific request unless you enable detailed audit logging through Microsoft Purview. The default Copilot interaction log does not surface model routing decisions. This means your compliance team has no visibility into whether a regulated document was processed by a model running on Azure, AWS, or Google Cloud without additional configuration. For audit logging setup, see our guide on auditing Microsoft Copilot activity with Purview integration.
How the Orchestrator Selects Models
Microsoft's orchestration engine uses a multi-step routing process:
- Intent classification: The system classifies the user's request into task categories (summarization, drafting, analysis, coding, research, translation)
- Complexity scoring: A lightweight classifier estimates task complexity on a 1-5 scale
- Model matching: The orchestrator maps the task type and complexity to a model performance matrix
- Policy enforcement: Admin-configured restrictions filter out disallowed models
- Load balancing: Among eligible models, the system selects based on current throughput and latency
- Execution: The selected model processes the request; the orchestrator manages context injection, retrieval-augmented generation (RAG), and response formatting
This happens in milliseconds. The user sees a single Copilot response. They have no idea which model generated it unless the admin has enabled model attribution in the Copilot admin center.
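The routing steps above can be sketched as a small policy-aware router. Everything here is illustrative: the model names, their attributes, and the latency ordering are assumptions for the sake of the sketch, since Microsoft's actual orchestrator logic and scoring are not publicly documented.

```python
# Hypothetical model catalog mirroring the roster described above.
# Attributes and values are illustrative, not a Microsoft API.
MODELS = {
    "phi-4":       {"cloud": "azure", "max_complexity": 2, "latency": "fast"},
    "gpt-4o":      {"cloud": "azure", "max_complexity": 3, "latency": "fast"},
    "gpt-5":       {"cloud": "azure", "max_complexity": 5, "latency": "medium"},
    "claude-opus": {"cloud": "aws",   "max_complexity": 5, "latency": "slow"},
    "gemini-pro":  {"cloud": "gcp",   "max_complexity": 5, "latency": "medium"},
}

def route(task_type: str, complexity: int, allowed_clouds: set[str]) -> str:
    """Pick a model: apply admin policy first, then prefer the fastest
    model able to handle the complexity score (1-5). task_type is
    reserved for the intent-classification step (step 1)."""
    # Step 4: policy enforcement, drop models outside the compliance boundary
    eligible = {
        name: m for name, m in MODELS.items()
        if m["cloud"] in allowed_clouds and m["max_complexity"] >= complexity
    }
    if not eligible:
        raise ValueError("No model satisfies the policy for this request")
    # Steps 3 and 5: among capable models, prefer lower latency
    order = {"fast": 0, "medium": 1, "slow": 2}
    return min(eligible, key=lambda n: order[eligible[n]["latency"]])

# A simple summary stays on the small Azure model; a complex analysis
# under an Azure-only policy falls back to the frontier Azure model.
print(route("summarization", 1, {"azure", "aws", "gcp"}))  # phi-4
print(route("analysis", 5, {"azure"}))                     # gpt-5
```

The key structural point the sketch illustrates: policy enforcement happens *before* quality/latency selection, which is why tenant-level model restrictions are the right control surface for compliance.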
The Cross-Cloud Data Risk No One Is Talking About
Here is the governance problem that should keep every CISO awake: Claude runs on Amazon Web Services. Not Azure. When Microsoft Copilot routes a request to Claude, your Microsoft 365 data leaves the Azure boundary and enters AWS infrastructure.
What This Means for Enterprise Data Flows
Scenario: A financial analyst asks Copilot to summarize a confidential Q4 earnings report stored in SharePoint. The orchestrator determines this is a complex summarization task and routes it to Claude for its superior long-context analysis. The document content is sent to Anthropic's API running on AWS us-east-1.
Data path: SharePoint (Azure) -> Copilot Orchestrator (Azure) -> Claude API (AWS us-east-1) -> Response returned to Copilot (Azure) -> User
Your earnings report just traversed two cloud providers. Your data residency policy almost certainly does not account for this. Your Data Processing Agreement (DPA) with Microsoft covers Azure. It may or may not cover data processed by Anthropic on AWS through the Copilot orchestration layer.
Cross-Cloud Risk Assessment Matrix
| Risk Category | Single-Model (GPT only) | Multi-Model (Current Default) | Mitigation |
|---------------|-------------------------|-------------------------------|------------|
| Data residency | Azure regions only | Azure + AWS + Google Cloud | Restrict models by region in admin center |
| DPA coverage | Microsoft DPA covers all processing | Subprocessor agreements required for Anthropic, Google | Review Microsoft's updated DPA subprocessor list |
| Audit trail | Single provider logs | Logs split across providers | Enable Purview advanced audit with model attribution |
| Breach notification | Microsoft SLA | Multi-party notification chain | Verify contractual breach notification terms for each subprocessor |
| Encryption in transit | Azure-to-Azure (internal) | Cross-cloud TLS 1.3 | Verify encryption standards for each model endpoint |
| Regulatory compliance | Azure compliance certifications apply | Each cloud has separate certifications | Map model routing to compliance boundary requirements |
For organizations in regulated industries---healthcare, financial services, government---this is not a theoretical risk. It is a compliance gap that auditors will identify. Your governance framework must now account for multi-cloud AI processing.
Immediate Actions for CISOs
- Audit your current Copilot model configuration: Check the Microsoft 365 admin center under Copilot > Model Management to see which models are enabled
- Review the updated Microsoft DPA: Microsoft updated its subprocessor list in January 2026 to include Anthropic and Google as AI subprocessors
- Enable model attribution logging: Turn on detailed Copilot audit logs in Microsoft Purview to track which model processes each request
- Map data flows against compliance requirements: If you operate under HIPAA, FedRAMP, or SEC regulations, determine whether cross-cloud model routing violates your compliance posture
- Configure model restrictions if needed: The admin center allows you to disable specific models tenant-wide or for specific user groups
For step-by-step DLP configuration, see our guide on data loss prevention policies for Microsoft Copilot.
The Enterprise Governance Framework for Multi-Model AI
Single-model governance was straightforward: one provider, one API, one set of terms, one audit trail. Multi-model governance requires a fundamentally different approach. Here is the framework we deploy for enterprise clients.
Layer 1: Model Policy Configuration
Define which models are permitted for which user populations and data classification levels.
Policy matrix example:
| Data Classification | Permitted Models | Rationale |
|---------------------|------------------|-----------|
| Public | All models (GPT-5, Claude, Gemini, Phi-4) | No sensitivity constraints |
| Internal | GPT-5, Phi-4 (Azure-only) | Keeps data within Azure boundary |
| Confidential | GPT-5, Phi-4 (Azure-only) | Azure compliance certifications apply |
| Highly Confidential | Phi-4 only (Azure, no external API) | Minimizes data exposure surface |
| Regulated (HIPAA/PCI) | Phi-4 only or Copilot disabled | Maximum control, minimum risk |
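The policy matrix is simple enough to encode directly, which is useful for documentation and for pre-deployment validation scripts. This is an illustrative encoding with hypothetical label and model names, not a Microsoft 365 configuration API:

```python
# Hypothetical encoding of the policy matrix above.
POLICY = {
    "public":              {"gpt-5", "claude", "gemini", "phi-4"},
    "internal":            {"gpt-5", "phi-4"},
    "confidential":        {"gpt-5", "phi-4"},
    "highly-confidential": {"phi-4"},
    "regulated":           {"phi-4"},  # or disable Copilot entirely
}

def enforce(classification: str, requested_model: str) -> str:
    """Return the requested model if policy permits it; otherwise fall
    back to the most restrictive permitted model (Phi-4 on Azure)."""
    permitted = POLICY[classification]
    return requested_model if requested_model in permitted else "phi-4"

print(enforce("public", "claude"))        # claude
print(enforce("confidential", "claude"))  # phi-4 (blocked: Claude runs on AWS)
```

The fallback-rather-than-fail design mirrors how Copilot's orchestrator behaves when a policy excludes a model: the request is rerouted, not rejected, which is why audit logging matters even with restrictions in place.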
This is configurable today through the Microsoft 365 admin center. Most enterprises have not configured it because they do not know these controls exist.
Layer 2: Data Flow Monitoring
Deploy continuous monitoring of Copilot data flows to detect policy violations.
- Microsoft Purview Information Protection: Apply sensitivity labels to documents; configure Copilot to respect label-based model restrictions
- Purview Audit (Advanced): Enable the CopilotModelRouting audit event to log which model processed each request
- Purview Data Loss Prevention: Create DLP policies that block Copilot from processing documents with specific sensitivity labels through non-Azure models
- Microsoft Defender for Cloud Apps: Monitor for anomalous Copilot usage patterns that might indicate data exfiltration through model routing
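Once audit export is flowing, the monitoring itself can be lightweight. A sketch of how a compliance team might tally routing decisions from an exported log follows; the `CopilotModelRouting` operation name comes from the bullet above, and the `AuditData` field names are assumptions for illustration, not a documented Purview schema:

```python
import json
from collections import Counter

# Assumed export shape; verify field names against your actual
# Purview audit export before using in production.
sample_export = '''[
 {"Operation": "CopilotModelRouting", "AuditData": {"Model": "claude-opus", "Cloud": "aws"}},
 {"Operation": "CopilotModelRouting", "AuditData": {"Model": "phi-4", "Cloud": "azure"}},
 {"Operation": "CopilotInteraction",  "AuditData": {}}
]'''

records = json.loads(sample_export)
by_cloud = Counter(
    r["AuditData"]["Cloud"]
    for r in records
    if r["Operation"] == "CopilotModelRouting"
)
# Flag requests that left the Azure boundary
non_azure = sum(n for cloud, n in by_cloud.items() if cloud != "azure")
print(dict(by_cloud), "non-Azure requests:", non_azure)
```

In practice this logic would run as a scheduled job against the audit export, with the non-Azure count feeding an alert threshold rather than a print statement.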
Layer 3: Vendor Risk Management
Each AI model provider is now a subprocessor of your Microsoft 365 data. Treat them accordingly.
Vendor assessment checklist:
- [ ] Anthropic (Claude): Review AWS SOC 2 Type II report, data retention policies, training data usage commitments
- [ ] Google (Gemini): Review Google Cloud compliance certifications, data processing terms, EU data residency options
- [ ] OpenAI (GPT-5): Review Microsoft-OpenAI data processing relationship, Azure-hosted inference guarantees
- [ ] Microsoft (Phi-4): Review Azure compliance certifications for your specific region and workload
Layer 4: Incident Response Updates
Your incident response plan must now account for multi-provider breach scenarios.
- Scenario 1: Anthropic discloses a security incident affecting Claude API---does your IR plan cover third-party AI subprocessor breaches?
- Scenario 2: A Copilot user inadvertently processes regulated data through Gemini on Google Cloud---how do you detect and report this?
- Scenario 3: Model routing logs show unauthorized data leaving your compliance boundary---what is your containment procedure?
Update your incident response runbook to include these scenarios. Our AI governance services include multi-model IR planning as a standard deliverable.
Where Perplexity Fits: The Research Layer
Perplexity occupies a different position in the enterprise AI stack than Copilot, ChatGPT, Claude, or Gemini. It is not a general-purpose AI assistant. It is a research and information synthesis engine.
Perplexity Enterprise Pro: What It Does Differently
- Real-time web search: Every response is grounded in current web sources with citations
- Source attribution: Every claim includes a clickable reference to the source document
- Grounded answers: Responses are built from retrieved sources rather than model memory alone, which sharply reduces (though does not eliminate) hallucination risk
- Enterprise features: SOC 2 Type II compliant, SSO integration, admin controls, no training on enterprise data
The Strategic Role of Perplexity in a Multi-AI Enterprise
Perplexity does not compete with Copilot. It complements it. Here is how:
| Use Case | Best Tool | Why |
|----------|-----------|-----|
| Summarize an internal document | Copilot | Has access to Microsoft 365 data |
| Research a competitor's latest quarterly earnings | Perplexity | Real-time web search with source citations |
| Draft an email response | Copilot | Integrated with Outlook, understands context |
| Investigate a regulatory change | Perplexity | Retrieves current regulatory text with citations |
| Analyze a spreadsheet | Copilot | Native Excel integration |
| Prepare for a client meeting with market research | Perplexity | Comprehensive web synthesis with sources |
| Build a PowerPoint from internal data | Copilot | Native PowerPoint integration |
| Fact-check a claim in a legal brief | Perplexity | Source-grounded verification |
The enterprise play is deploying Perplexity Enterprise Pro alongside Copilot, not instead of it. Perplexity handles the research and fact-finding layer. Copilot handles the productivity and workflow layer. Together, they cover the full spectrum of knowledge work.
Perplexity Deployment Considerations
- Cost: Perplexity Enterprise Pro runs approximately $40/user/month (annual commitment), compared to Copilot's $30/user/month
- Data isolation: Perplexity Enterprise does not access your Microsoft 365 data (this is both a limitation and a security feature)
- SSO: Supports SAML 2.0 and OIDC for enterprise identity integration
- Admin controls: Centralized user management, usage analytics, and policy configuration
- API access: Available for custom integrations and workflow automation
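For the API-access point, a minimal sketch of assembling a Perplexity request is below. The endpoint follows Perplexity's OpenAI-compatible chat-completions format, but the model name `sonar` and the response fields are assumptions that should be verified against the current API documentation; request construction is kept separate from sending so it can be inspected without a live key.

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question: str, api_key: str) -> urllib.request.Request:
    """Assemble the HTTP request for a research query."""
    payload = {
        "model": "sonar",  # assumed model identifier; check current docs
        "messages": [{"role": "user", "content": question}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

req = build_request("Latest SEC guidance on AI disclosure", "pplx-demo-key")
print(req.full_url)
# Sending with urllib.request.urlopen(req) returns JSON containing the
# answer and, on research-oriented models, a list of source citations.
```

A real integration would wrap this in retry logic and route the cited sources into whatever downstream workflow (brief, deck, CRM note) the automation serves.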
The CIO Decision Matrix: When to Use What
This is the decision framework we provide to CIOs evaluating their multi-AI strategy. Use our readiness assessment to get a customized version for your organization.
Decision Matrix by Task Category
| Task Category | Primary Tool | Secondary Tool | Rationale |
|---------------|--------------|----------------|-----------|
| Internal document work (drafts, summaries, analysis) | Microsoft Copilot | N/A | Native M365 integration, data stays in tenant |
| External research (market analysis, competitive intel, regulatory) | Perplexity Enterprise | Copilot web search | Source citations, real-time data, grounded answers |
| Complex reasoning (legal analysis, strategic planning) | Copilot (Claude routing) | Claude direct API | Claude excels at nuanced reasoning tasks |
| Code generation & review | Copilot (GitHub Copilot) | Claude API | GitHub Copilot for IDE integration, Claude for complex refactoring |
| Creative content (marketing, communications) | Copilot (GPT-5 routing) | ChatGPT Enterprise | GPT-5 strengths in creative generation |
| Data analysis & visualization | Copilot (Excel/Power BI) | N/A | Native integration with Microsoft data stack |
| Customer-facing AI (chatbots, agents) | Copilot Studio | N/A | Enterprise controls, Microsoft identity integration |
| Compliance & audit | Microsoft Purview | N/A | Native compliance tooling for M365 ecosystem |
Decision Matrix by Industry
| Industry | Recommended Configuration | Key Constraint |
|----------|---------------------------|----------------|
| Healthcare | Copilot (Phi-4 only) + Perplexity (non-PHI research) | HIPAA prohibits PHI processing through non-BAA-covered models |
| Financial Services | Copilot (GPT-5 + Phi-4) + Perplexity | SEC/FINRA require audit trails; restrict Claude/Gemini for regulated data |
| Government (Federal) | Copilot GCC High (GPT-5 only) | FedRAMP boundary excludes non-Azure models |
| Government (State/Local) | Copilot (all models) + Perplexity | Less restrictive but still requires data residency controls |
| Legal | Copilot (Claude routing preferred) + Perplexity | Claude excels at legal reasoning; Perplexity for case research |
| Technology | All tools unrestricted | Maximum flexibility, standard data classification controls |
For industry-specific implementation, explore our healthcare and financial services practice areas.
ROI Impact: What Multi-Model Copilot Changes
The ROI equation for Copilot shifted with multi-model routing. Here is how.
Positive ROI Impacts
- Better task matching: The right model for each task means higher output quality. Claude handling legal analysis produces better results than routing everything through a single general-purpose model. This translates to fewer revision cycles and higher first-draft acceptance rates.
- Faster inference for simple tasks: Phi-4 handles simple summarization and classification tasks 3-5x faster than GPT-5, reducing wait times for routine interactions.
- Broader capability coverage: Multi-model means Copilot can handle tasks it previously struggled with. Gemini's multimodal capabilities improve image and document understanding. Claude's long-context window enables analysis of longer documents.
ROI Quantification (1,000-User Enterprise)
| Metric | Single-Model Copilot (2025) | Multi-Model Copilot (2026) | Delta |
|--------|-----------------------------|----------------------------|-------|
| Average time saved per user/week | 4.2 hours | 5.8 hours | +38% |
| First-draft acceptance rate | 62% | 74% | +19% |
| Tasks Copilot can handle | ~70% of knowledge work | ~85% of knowledge work | +21% |
| Annual productivity value (per user) | $8,400 | $11,600 | +$3,200 |
| Annual Copilot cost (per user) | $360 | $360 | $0 |
| Net annual ROI (per user) | $8,040 | $11,240 | +$3,200 |
| Net annual ROI (1,000 users) | $8.04M | $11.24M | +$3.2M |
These figures come from aggregated deployment data across our client base. Your specific ROI will depend on industry, use case distribution, and adoption maturity. Use our Copilot ROI Calculator to model your organization's specific scenario.
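The arithmetic behind the table reduces to one formula, which makes it easy to rerun with your own inputs (net ROI = productivity value minus license cost, scaled by headcount):

```python
def net_roi(productivity_value: float, annual_cost: float, users: int) -> float:
    """Net annual ROI = (per-user productivity value - per-user tool cost) x headcount."""
    return (productivity_value - annual_cost) * users

single = net_roi(8_400, 360, 1_000)   # 2025 single-model baseline
multi = net_roi(11_600, 360, 1_000)   # 2026 multi-model
print(f"${single:,.0f} -> ${multi:,.0f} (delta ${multi - single:,.0f})")
# $8,040,000 -> $11,240,000 (delta $3,200,000)
```

Substitute your own per-user productivity estimate; the $360 annual license cost ($30/user/month) is the only fixed input.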
Negative ROI Risks
- Governance overhead: Multi-model requires more sophisticated governance. Budget $50K-$150K annually for additional compliance tooling, policy management, and audit capabilities.
- Training complexity: Users need to understand that Copilot now uses multiple models. Change management programs must be updated. Budget 2-4 additional training hours per user.
- Vendor risk management: Three additional subprocessors require ongoing vendor risk assessments. Budget $20K-$40K annually for vendor risk management activities.
Net assessment: The productivity gains significantly outweigh the governance costs for most enterprises, but only if governance is proactively implemented. Organizations that ignore the governance implications will face compliance gaps that could result in regulatory penalties far exceeding the productivity benefits.
The Agentic AI Future: What Comes Next
Multi-model orchestration is the foundation for agentic AI in the enterprise. Here is the trajectory:
Phase 1: Multi-Model Routing (Current State - 2026)
Copilot dynamically selects models per request. Users interact with a single interface. The orchestrator handles model selection transparently.
Phase 2: Multi-Model Chaining (Late 2026)
Single tasks will be decomposed across multiple models. A research request might use Perplexity for web retrieval, Claude for analysis, and GPT-5 for drafting the final output. Each model handles the subtask it performs best.
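The chaining pattern described above is, structurally, a pipeline where each step's output becomes the next step's input. A minimal sketch follows; the step-to-model mapping is illustrative, and the model client is injected as a stub so the control flow is visible without any live API:

```python
# Hypothetical decomposition of one research task across specialists.
PIPELINE = [
    ("retrieve", "perplexity"),  # web retrieval with citations
    ("analyze",  "claude"),      # long-context analysis
    ("draft",    "gpt-5"),       # final output generation
]

def run_chain(task: str, call_model) -> str:
    """Feed each step's output into the next. call_model is an injected
    client (model, step, context) -> str, so the chain is testable."""
    context = task
    for step, model in PIPELINE:
        context = call_model(model, step, context)
    return context

# Stub client that records which model handled each hop
trace = []
def stub(model, step, context):
    trace.append(model)
    return f"{context} -> {step}:{model}"

print(run_chain("Q1 competitor report", stub))
print(trace)  # ['perplexity', 'claude', 'gpt-5']
```

Note the governance implication: a single user request now generates three cross-provider data flows, so per-step attribution logging becomes mandatory, not optional.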
Phase 3: Autonomous Agents with Multi-Model Backends (2027)
Copilot agents will operate autonomously, selecting different models for different steps in a multi-step workflow. A procurement agent might use Phi-4 for routine approvals, Claude for contract analysis, and GPT-5 for vendor communication drafting---all within a single automated workflow.
Phase 4: Cross-Platform Agent Orchestration (2027-2028)
Agents will orchestrate across platforms, not just models. A competitive intelligence agent might query Perplexity for market data, use Copilot to analyze internal sales data, leverage Claude to synthesize findings, and publish results through Power BI---all autonomously.
For organizations building toward agentic AI, the governance framework you deploy today for multi-model routing is the foundation for agent governance tomorrow. The organizations that get multi-model governance right in 2026 will be positioned to deploy autonomous agents safely in 2027. Start with our Copilot Studio and custom agents service to build your agentic foundation.
Building Your Multi-AI Strategy: A 90-Day Roadmap
Days 1-30: Assessment and Policy
- Run a readiness assessment to evaluate your current Copilot configuration, data classification maturity, and compliance posture. Our readiness assessment service covers multi-model specific risks.
- Audit current model configuration: Document which models are enabled, for which users, with what restrictions.
- Map data flows: Identify which data classifications are being processed through Copilot and trace potential multi-cloud routing paths.
- Review DPA and subprocessor agreements: Confirm your Microsoft agreement covers Anthropic and Google as AI subprocessors.
- Establish model policies: Define which models are permitted for which data classifications using the policy matrix above.
Days 31-60: Implementation and Controls
- Configure model restrictions: Implement tenant-level and group-level model policies in the Microsoft 365 admin center.
- Enable audit logging: Deploy Purview advanced audit with model attribution for full visibility into model routing.
- Deploy DLP policies: Create sensitivity-label-based policies that prevent regulated data from routing to non-compliant models.
- Update incident response: Add multi-provider breach scenarios to your IR runbook.
- Begin Perplexity evaluation: If research-intensive workflows are common, start a Perplexity Enterprise Pro pilot with 50-100 users.
Days 61-90: Optimization and Training
- Analyze model routing data: Review 60 days of audit logs to understand which models handle which tasks and whether routing aligns with your policy intent.
- Optimize model policies: Refine restrictions based on actual usage data---you may find opportunities to safely enable additional models for specific groups.
- Train users: Update change management programs to cover multi-model concepts, especially for power users and department leads.
- Evaluate Perplexity pilot: Assess ROI from the Perplexity pilot and decide on broader deployment.
- Plan for agentic: Begin scoping autonomous agent use cases that leverage multi-model backends.
Review our AI governance framework for detailed implementation guidance and use the Copilot security checklist as a deployment validation tool.
What Most Consulting Firms Get Wrong
Having deployed Copilot for over 300 enterprise clients, here are the mistakes we see consulting firms make with multi-model strategy:
- Treating it as a single-vendor solution: Multi-model Copilot involves three AI providers across three clouds. Governance must account for all of them, not just Microsoft.
- Ignoring model routing in compliance assessments: Most Copilot compliance assessments still evaluate it as a single-model system. If your assessor did not ask about Claude on AWS or Gemini on Google Cloud, your assessment is incomplete.
- Over-restricting models instead of governing them: Some firms recommend disabling all non-GPT models. This eliminates the productivity benefits of multi-model routing. Better approach: govern model access by data classification, not by blanket restriction.
- Forgetting Perplexity: Firms focused exclusively on Copilot miss the research layer entirely. Perplexity handles use cases that Copilot is architecturally not designed for: real-time web research with source attribution.
- Not planning for agentic AI: Multi-model governance is the foundation for agent governance. Firms that deploy point solutions today without a forward-looking architecture will have to rearchitect when agents arrive.
Frequently Asked Questions
Is Claude really enabled by default in Microsoft Copilot?
Yes. As of January 7, 2026, Anthropic's Claude is enabled by default for all Microsoft 365 Copilot users. This was announced at Microsoft Ignite 2025 and rolled out globally in early January. Admins can disable Claude through the Microsoft 365 admin center under Copilot > Model Management, but the default is enabled. This means any organization that has not explicitly reviewed and configured model policies is potentially routing data to Claude on AWS infrastructure without awareness.
What models does Microsoft Copilot currently use?
As of February 2026, Microsoft Copilot's orchestration layer routes requests across five models: GPT-5 and GPT-4o (OpenAI, hosted on Azure), Claude 3.5 Opus (Anthropic, hosted on AWS), Gemini 2.5 Pro (Google, hosted on Google Cloud), and Phi-4 (Microsoft, hosted on Azure). The orchestrator automatically selects the model based on task type, complexity, latency requirements, and admin-configured policies. The model roster is expected to expand throughout 2026.
How do we govern multi-model AI when each model runs on a different cloud?
Governance requires a four-layer approach: (1) Model policy configuration that maps data classifications to permitted models, (2) Data flow monitoring through Microsoft Purview with model attribution logging enabled, (3) Vendor risk management that treats each model provider as a subprocessor, and (4) Updated incident response plans that cover multi-provider breach scenarios. The key tool is the Microsoft 365 admin center's Model Management panel, combined with Purview advanced audit logging. Organizations in regulated industries should also implement sensitivity-label-based DLP policies that restrict which models can process specific data classifications.
Can we use Perplexity alongside Microsoft Copilot?
Yes, and for many enterprises, you should. Perplexity Enterprise Pro and Microsoft Copilot serve different layers of the knowledge work stack. Copilot handles internal document work, productivity tasks, and workflow automation within the Microsoft 365 ecosystem. Perplexity handles external research, competitive intelligence, and fact-finding with real-time web search and source citations. Deploying both gives your organization comprehensive AI coverage: internal productivity (Copilot) plus external research (Perplexity). The combined cost is approximately $70/user/month ($30 Copilot + $40 Perplexity), but most organizations deploy Perplexity selectively to research-heavy roles rather than organization-wide.
What is the cross-cloud data risk with multi-model Copilot?
The primary risk is data residency and compliance boundary violations. When Copilot routes a request to Claude, your data leaves Azure and enters AWS infrastructure. When it routes to Gemini, data enters Google Cloud. For organizations subject to data residency requirements (GDPR, HIPAA, FedRAMP, state privacy laws), this cross-cloud routing may violate compliance postures that assume all Microsoft 365 data remains within Azure boundaries. The mitigation is to configure model restrictions in the admin center based on data classification---restricting sensitive and regulated data to Azure-hosted models (GPT-5, Phi-4) while allowing broader model access for non-sensitive workloads. Microsoft's updated DPA covers these subprocessors, but you must verify that the subprocessor terms meet your specific regulatory requirements.
Need help building a multi-AI governance framework for your organization? Contact our enterprise AI team for a complimentary assessment of your current Copilot configuration and multi-model readiness.
Errin O'Connor
Founder & Chief AI Architect
EPC Group / Copilot Consulting
With 25+ years of enterprise IT consulting experience and 4 Microsoft Press bestselling books, Errin specializes in AI governance, Microsoft 365 Copilot risk mitigation, and large-scale cloud deployments for compliance-heavy industries.