What AI buyers really want from your data

At this year’s CDOIQ conference in Boston, we walked attendees through one of the fastest-growing (and most misunderstood) segments of the data economy: monetizing enterprise data for use in AI systems.

Aug 11, 2025

Insights from Jessica Li Gebert’s presentation at CDOIQ Boston 2025:

At this year’s CDOIQ conference in Boston, I walked attendees through one of the fastest-growing (and most misunderstood) segments of the data economy: monetizing enterprise data for use in AI systems.

Here are five key insights from my session:

1. Why are news and academic publishers dominating AI data deals?

I analyzed 52 licensing agreements dating from late 2022 onward. The most active sellers? News outlets and academic publishers. Why? Their data is high-quality, language-rich, and - crucially - clearly owned. AI developers need large volumes of clean, vetted text to train models, and these sources deliver that with minimal legal friction.

2. You don’t have to give away exclusivity - slice your data strategically instead.

Non-exclusive licensing is the norm in this space. Even better, you can segment your data by topic, timeframe or use case. Think of it as “slice and sell.” One dataset can become multiple products, each aligned with different AI needs - without relinquishing full control.
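To make "slice and sell" concrete, here is a minimal Python sketch of cutting one archive into separate licensable products by topic and timeframe. The `Record` fields, the `slice_dataset` helper and the sample entries are all hypothetical, chosen only for illustration; a real catalog would carry its own metadata and rights information.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Record:
    # Hypothetical metadata; real archives will have their own schema.
    doc_id: str
    topic: str
    published: date


def slice_dataset(
    records: list[Record],
    topic: Optional[str] = None,
    start: Optional[date] = None,
    end: Optional[date] = None,
) -> list[Record]:
    """Return records matching an optional topic and an inclusive date window."""
    return [
        r for r in records
        if (topic is None or r.topic == topic)
        and (start is None or r.published >= start)
        and (end is None or r.published <= end)
    ]


# Illustrative sample archive (made-up entries).
archive = [
    Record("a1", "finance", date(2021, 3, 1)),
    Record("a2", "health", date(2022, 6, 15)),
    Record("a3", "finance", date(2024, 1, 10)),
]

# Two products cut from the same archive, each licensable non-exclusively.
finance_backfile = slice_dataset(archive, topic="finance", end=date(2022, 12, 31))
recent_all_topics = slice_dataset(archive, start=date(2023, 1, 1))

print([r.doc_id for r in finance_backfile])
print([r.doc_id for r in recent_all_topics])
```

The point of the sketch is that the slices are views over one dataset: the owner keeps the full archive and decides which cut goes to which buyer.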

3. Four ways to monetize your data in the AI era

The playbook is expanding. Some sellers opt for traditional fixed-term licenses. Others strike licensing-plus-partnership deals, like OpenAI funding newsrooms in return for content. Usage-based pricing is emerging (e.g. Cloudflare's pay per crawl). And revenue-sharing models are on the horizon, though still maturing.

Each model carries trade-offs - understanding them is key to choosing the right one.
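One rough way to compare the trade-offs is to put the cash-based models side by side under the same assumptions. The Python sketch below does that for fixed-term, usage-based and revenue-sharing deals. Every figure and function name in it is hypothetical and does not come from any actual agreement. Licensing-plus-partnership deals are left out because much of their value is non-cash and hard to reduce to a single number.

```python
def fixed_term(annual_fee: float, years: int) -> float:
    """Total income from a flat annual license fee."""
    return annual_fee * years


def usage_based(price_per_thousand_requests: float, requests: int) -> float:
    """Income that scales with how often the buyer accesses the data."""
    return price_per_thousand_requests * requests / 1000


def revenue_share(buyer_revenue: float, share_percent: float) -> float:
    """Income as a percentage of the revenue the buyer earns from the product."""
    return buyer_revenue * share_percent / 100


# Hypothetical scenario: all numbers are made up for illustration.
scenario = {
    "fixed-term": fixed_term(annual_fee=100_000, years=2),
    "usage-based": usage_based(price_per_thousand_requests=10, requests=5_000_000),
    "revenue share": revenue_share(buyer_revenue=2_000_000, share_percent=5),
}

for model, income in scenario.items():
    print(f"{model}: ${income:,.0f}")
```

Fixed-term income is known up front, while the other two depend on usage or buyer revenue that the seller cannot control, so it is worth rerunning the comparison with pessimistic and optimistic inputs.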

4. The future of AI lies in narrow models and niche data

General-purpose LLMs may dominate the headlines, but the real growth is happening in small language models and vertical AI: models built for finance, healthcare, supply chain and more. That shift unlocks demand for specialized datasets - including underrepresented languages, domain-specific logic and data, and conversation transcripts.

If your data is niche, accurate and hard to replicate, it may be far more valuable than you think.

5. AI data monetization carries risk - but it’s manageable

Yes, there are concerns: copyright infringement, geopolitical restrictions, data leakage, competitive overlap. But they’re not deal-breakers. With smart licensing terms, technical access controls and proper due diligence, enterprises can navigate this landscape safely - and profitably.

The bottom line? If you’ve been sitting on high-quality, structured content, now is the time to assess its AI value. The demand is real, and the playbook is already forming.

For more information, contact: consulting@neudata.co
