- Why AI Data Rights Are the New IP Battleground
- Training Data Consent and Model Improvement Clauses
- Output Ownership: Who Owns What the AI Produces
- Data Residency and Sovereignty Requirements
- Third-Party Data Sharing and Sub-Processor Chains
- Change of Control and Insolvency Protections
- Non-Negotiable Clauses: A Reference List
- How Major AI Vendors Approach Data Rights by Default
Why AI Data Rights Are the New IP Battleground
When you deploy an enterprise AI system, you are not simply purchasing software. You are feeding your organisation's most sensitive assets — customer data, proprietary documents, internal communications, financial records, trade secrets — into systems controlled by third parties, operating under terms of service that most procurement teams don't read beyond the pricing section.
The consequences of inadequate data rights protections in AI contracts have moved from theoretical to concrete. In 2024 and 2025, we saw the first wave of enterprise disputes arising from AI vendors' data-handling failures: training data leakage across enterprise tenants, outputs incorporating proprietary customer methodologies surfacing in competitor contexts, and regulatory enforcement actions against enterprises whose AI vendors processed personal data in non-compliant jurisdictions.
The AI vendor community, to varying degrees, has responded with improved default terms for enterprise tiers. But "improved" is not the same as "adequate" — and defaults remain the floor, not the ceiling, of what is achievable in negotiation.
"Every AI contract we review contains at least one data rights provision that, if triggered, would create significant liability or competitive risk for the customer. Most of those provisions are removable in negotiation."
Training Data Consent and Model Improvement Clauses
The central data rights question in any AI vendor contract is whether the vendor can use your data — inputs, outputs, usage patterns, documents — to train or improve their models. This is frequently buried in "Service Improvement" or "Product Enhancement" language rather than being prominently flagged as a training data provision.
What Default Terms Typically Allow
Under standard terms (not enterprise-negotiated), most AI vendors reserve the right to use customer data for model improvement unless you explicitly opt out or upgrade to a paid enterprise tier. Even on enterprise tiers, the default opt-out is typically limited to direct training data use — it may not cover usage pattern analysis, aggregate telemetry, or the use of your prompting patterns to improve the vendor's own prompt engineering.
What to Negotiate
Your contract should include a blanket restriction on the use of any customer data — including inputs, outputs, metadata, usage logs, and aggregated telemetry — for any AI training, model improvement, fine-tuning, evaluation, or benchmarking purpose, except with explicit written consent on a case-by-case basis. This should apply to the vendor's sub-processors and affiliated entities, not just the vendor itself.
A stronger position includes a warranty from the vendor that their foundation models were not trained on data sourced from enterprises without explicit consent — a provision that has significant implications for IP indemnification (addressed below).
Output Ownership: Who Owns What the AI Produces
Enterprise AI systems generate outputs — documents, code, analyses, recommendations, images, audio — that form the basis of commercial decisions and customer deliverables. The IP ownership question is more complex than it appears.
The Current Legal Landscape
Copyright law in most jurisdictions does not currently protect AI-generated content as such — copyright requires human authorship. The practical implication is that AI outputs may be unprotectable as original works, regardless of what your contract says about "ownership." What your contract can determine is who has rights to use those outputs commercially and whether the vendor retains any right to use your outputs for their purposes.
Key Contractual Provisions
| Provision | Minimum Acceptable Position | Preferred Position |
|---|---|---|
| Output usage rights | Customer has unlimited commercial use rights to outputs | Customer has exclusive rights; vendor has no use rights |
| Vendor use of outputs | Vendor cannot use outputs to train or improve models | Vendor cannot use outputs for any purpose without consent |
| IP indemnification | Vendor indemnifies customer against third-party IP claims arising from vendor-side training data | Uncapped indemnification with defence obligations |
| Output warranties | Vendor warrants that, to its knowledge, outputs do not infringe third-party IP | Plus: accuracy/fitness representations with SLA remedies |
The IP indemnification clause deserves particular attention. If an AI vendor's foundation model was trained on copyrighted material without licence, outputs from that model may constitute copyright infringement — and without an adequate indemnification clause, your enterprise bears that risk. Several major AI vendors now provide indemnification for commercially deployed models, but the scope and caps vary significantly. See our guide on essential AI vendor contract clauses for the full analysis.
Data Residency and Sovereignty Requirements
Data sovereignty is where AI procurement intersects most directly with regulatory compliance. Enterprises subject to GDPR, the EU AI Act, sector-specific data localisation requirements (banking, healthcare, defence), or national security frameworks face a non-trivial challenge: most AI inference infrastructure is designed for global distribution, not geographic containment.
Processing Location vs Storage Location
A critical distinction many procurement teams miss: data residency provisions typically cover where data is stored, not necessarily where it is processed. For AI systems, the processing location is often the more sensitive consideration — inference (where the model runs on your prompts) may occur in a different jurisdiction from where your data is stored. Your contract must address both storage and processing locations explicitly.
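The storage/processing distinction lends itself to a mechanical check. As an illustration, a procurement or compliance team could encode its approved regions and validate a vendor's declared locations against them; the region names, data categories, and disclosure structure below are hypothetical, not drawn from any vendor's actual disclosure format:

```python
# Hypothetical sketch: check a vendor's declared data locations against policy.
# Both storage and inference (processing) regions are validated separately,
# because a compliant storage region does not imply a compliant inference region.
APPROVED_REGIONS = {"eu-west-1", "eu-central-1"}  # example policy values

def residency_violations(disclosure: dict) -> list[str]:
    """Return residency violations found in a vendor disclosure.

    `disclosure` maps each data category to its declared storage and
    inference regions.
    """
    violations = []
    for category, locations in disclosure.items():
        for kind in ("storage", "inference"):
            region = locations.get(kind)
            if region not in APPROVED_REGIONS:
                violations.append(f"{category}: {kind} in {region!r} not approved")
    return violations

disclosure = {
    "customer_records": {"storage": "eu-west-1", "inference": "us-east-1"},
    "internal_docs": {"storage": "eu-central-1", "inference": "eu-central-1"},
}
# Storage passes for customer_records, but inference does not: exactly the
# gap a storage-only residency clause would miss.
print(residency_violations(disclosure))
```

The point of the sketch is the loop over both location kinds: a review process that only checks storage regions reproduces the contractual blind spot described above.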
What to Include in Data Residency Provisions
- Explicit named regions for both data storage and inference processing
- Prohibition on cross-border data transfer for your specific data categories without written consent
- Notification obligations if processing location changes (minimum 60-day notice)
- Annual audit rights to verify residency compliance, including the right to review sub-processor locations
- Termination for cause rights if residency commitments are breached without cure within 30 days
"Data residency for AI systems must cover inference location, not just storage. The model runs your data somewhere — and that 'somewhere' needs to be in your contract."
Third-Party Data Sharing and Sub-Processor Chains
Enterprise AI platforms routinely rely on sub-processor chains: cloud infrastructure providers, fine-tuning specialists, evaluation services, safety classifiers, and content moderation layers. Each of these sub-processors potentially touches your data. Your vendor contract governs the prime vendor's obligations — but sub-processor obligations are typically addressed only by reference to a "Sub-Processor List" that can change with minimal notice.
Minimum Sub-Processor Protections
Your contract should require: (1) prior written notice before adding sub-processors who will access your enterprise data; (2) your right to object to new sub-processors with a reasonable cure period; (3) contractual flow-down of your data rights protections to all sub-processors; and (4) vendor liability for sub-processor breaches as if they were vendor breaches.
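Because the Sub-Processor List can change with minimal notice, some teams monitor it programmatically between contract reviews. A minimal sketch of requirement (1) — detecting additions that never went through the notice-and-objection process — with entirely hypothetical company names:

```python
# Hypothetical sketch: flag sub-processors on the vendor's current list
# that were never approved through the contractual notice process.
def unapproved_additions(approved: set[str], current: set[str]) -> set[str]:
    """Sub-processors present now that were never approved; each one
    should trigger the notice-and-objection process in the contract."""
    return current - approved

approved = {"CloudHost Inc", "SafetyFilter Ltd"}          # from last review
current = {"CloudHost Inc", "SafetyFilter Ltd", "NewEval Corp"}  # scraped today
print(unapproved_additions(approved, current))  # {'NewEval Corp'}
```

A simple set difference is enough here; the harder part in practice is sourcing `current` reliably, since vendors publish these lists in varying formats.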
Change of Control and Insolvency Protections
The AI vendor landscape is consolidating rapidly. Several significant AI companies that enterprises contracted with in 2023 have since been acquired, merged, or restructured. Without explicit change of control protections, the contract's data obligations may nominally survive a transaction while your operational protections, and your leverage to enforce them, do not.
Essential provisions include: termination for convenience triggered by a change of control event if the acquirer fails to assume equivalent data obligations within 60 days; data return or certified destruction within 30 days of contract termination; and step-in rights or escrow arrangements for critical data processing dependencies.
Non-Negotiable Clauses: A Reference List
Based on our AI procurement advisory engagements, the following provisions should be treated as non-negotiable in any enterprise AI contract involving sensitive or regulated data:
- Training data exclusion: Your data will not be used to train, fine-tune, or evaluate AI models without explicit written consent
- Output ownership: All outputs generated from your data and prompts are owned by your organisation
- IP indemnification: Vendor defends and indemnifies against IP claims arising from vendor-side model training
- Data residency: Storage and processing locations specified by name, including sub-processors
- Sub-processor controls: Prior notice and objection rights for changes to sub-processor chain
- Data return/destruction: Certified within 30 days of termination
- Audit rights: Annual right to audit data handling practices
- Change of control: Termination rights if acquirer fails to assume equivalent obligations
- Regulatory cooperation: Vendor cooperates with regulatory investigations involving your data
- Breach notification: 72-hour notification of any data breach involving your data
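The list above can also serve as a machine-checkable gate in a procurement workflow, so a draft contract either clears all ten items or is flagged for escalation. A minimal sketch; the clause keys are an illustrative shorthand, not a legal taxonomy:

```python
# Hypothetical sketch: treat the non-negotiable list as a pass/fail gate.
NON_NEGOTIABLE = [
    "training_data_exclusion", "output_ownership", "ip_indemnification",
    "data_residency", "subprocessor_controls", "data_return_destruction",
    "audit_rights", "change_of_control", "regulatory_cooperation",
    "breach_notification_72h",
]

def contract_gate(present: set[str]) -> tuple[bool, list[str]]:
    """Return (passes, missing_clauses) for a reviewed draft contract."""
    missing = [c for c in NON_NEGOTIABLE if c not in present]
    return (not missing, missing)

# A draft covering only four of the ten items fails the gate.
ok, missing = contract_gate({"training_data_exclusion", "output_ownership",
                             "data_residency", "audit_rights"})
print(ok, missing)
```

The all-or-nothing return value is deliberate: per the framing above, these are non-negotiable, so partial coverage is a fail, not a score.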
How Major AI Vendors Approach Data Rights by Default
Understanding vendor defaults helps you prioritise where to focus your negotiating effort:
| Vendor | Default Training Data Use | Enterprise Opt-Out Available | IP Indemnification |
|---|---|---|---|
| OpenAI (Enterprise) | Excluded by default for API/Enterprise | Yes — included in enterprise tier | Limited — Copyright Shield programme |
| Microsoft Copilot (Enterprise) | Excluded for M365 Copilot enterprise | Yes — Microsoft Purview controls | Yes — Copilot Copyright Commitment |
| Google Gemini (Enterprise) | Excluded for Workspace Enterprise | Yes — admin controls available | Limited — indemnification for certain services |
| Anthropic Claude (Enterprise) | Excluded by default in enterprise agreements | Yes — contractual guarantee | Available — scope varies by agreement |
| Amazon Bedrock | AWS doesn't use customer inputs to train AWS models | Built into service by default | Third-party model IP indemnification limited |
These defaults reflect enterprise tier positions and can change. Always verify current terms directly and supplement them with contractual language — defaults can change unilaterally through terms of service updates unless contractually locked. For a full pre-signature review framework, see our GenAI Procurement Checklist for Enterprise Buyers.