The licensing infrastructure for academic pedagogy

The human curriculum, licensed so machines can actually reason.

Frontier models have saturated web text. The highest-density reasoning data on Earth still lives inside universities. We help bring it to light the right way: rights-cleared and auditable.

Real-world grounding
Progressive exercises. Multi-step proofs. Expert feedback loops. The pedagogy that builds human intelligence.
Provenance cleared
LIC-FR-MATH-2026-0417
Université Paris-Saclay
14.2M tokens · 47 profs
EU AI Act compliant · 800+ French institutions in reach · Moral rights preserved by design · Auditable from lecture to token

The problem · shadow extraction

Today, academic pedagogy reaches AI models through the back door, with students pasting it in and no one keeping track.

Every day, students paste proprietary courses, structured exams, and grading rubrics into chat models. This shadow extraction quietly bootstrapped today's models, but no one chose it, no one cleared it, and no one was paid for it. It cannot sustain the next generation of models, and the lawsuits will only pile up.

AI labs get

Reasoning data with no clear provenance, and real legal exposure.

Students get

Their homework done. Millions of uploads a day.

Educators get

Nothing. No consent, no credit, no compensation.

How MLectio works

The compliant bridge between European academia and frontier AI.

01

Clear the rights

We navigate educators’ inalienable moral rights and the patrimonial rights held by professors or institutions, and we manage opt-in and opt-out at the source.

02

Clean & structure

Lectures, exercises, exams, and graded feedback become high-density, structured corpora: the seed data labs need to generate synthetic reasoning data.

03

Deliver, auditably

EU AI Act-compliant provenance, traceable from lecture to token, ready for advanced SFT and RL pipelines.
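
For technical readers, the three steps above can be pictured as one provenance record per licensed package, filtered by consent before export. This is a minimal sketch and not MLectio's actual schema: `LicenseRecord`, `export_audit_trail`, and every field name are hypothetical. The first record reuses the sample license shown at the top of this page, with its "14.2M tokens" rounded to an integer; the second record is an invented placeholder that illustrates an opt-out.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LicenseRecord:
    """Hypothetical provenance record for one licensed course package."""
    license_id: str
    institution: str
    token_count: int
    contributors: int
    consent_active: bool = True  # flipped to False when an educator opts out


def export_audit_trail(records):
    """Return a JSON audit trail covering only records with active consent."""
    cleared = [asdict(r) for r in records if r.consent_active]
    trail = {
        "licenses": cleared,
        "total_tokens": sum(r["token_count"] for r in cleared),
    }
    return json.dumps(trail, indent=2, ensure_ascii=False)


records = [
    # Sample license from this page; 14.2M tokens approximated as an integer.
    LicenseRecord("LIC-FR-MATH-2026-0417", "Université Paris-Saclay", 14_200_000, 47),
    # Invented placeholder showing an opted-out package being excluded.
    LicenseRecord("LIC-EXAMPLE-0001", "Example University", 1_000_000, 3,
                  consent_active=False),
]

print(export_audit_trail(records))
```

In this sketch an opt-out is a single flag, so a withdrawn package drops out of the next export without touching the other records.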

The opportunity

A market shaped by compliance, and the data only universities hold.

$22.6B

AI training-data market by 2034

Grand View Research, 2025 · 18.9% CAGR

164+

active copyright lawsuits over AI training

Stanford HAI AI Index, 2026

~5,000

higher-education institutions across the EU

800+ in France alone

For frontier labs

Auditable reasoning tokens, cleared for training.

Talk to our team

High-density corpora

Progressive curricula, multi-step proofs, structured exams and expert feedback. The reasoning signal web text simply doesn’t have.

EU AI Act provenance

Every token traces to a licensed source. Export an audit trail for any package.

Annual refresh

Licenses renew as courses update, so the signal keeps coming. The year-over-year record of how a course changes is itself a rare dataset.

For universities & professors

Your curriculum, licensed on your terms.

Talk to our team

Moral rights, preserved

Inalienable and untouched. You stay the author; we only license patrimonial use.

Opt-in, opt-out, anytime

You give consent course by course, and we honor it at the source. Change your mind and the change propagates.

Compensation that recurs

When your pedagogy helps train a model, you get paid. Every refresh, transparently.

Clean data. Clear rights. Better models.

The next paradigm of reasoning runs on rights-cleared data.

Let's build the auditable bridge between European academia and your training pipeline. We'd love to talk.