Measure-to-measure interpolation using Transformers

Working group (GT): Optimal Transport - PDE - Machine Learning

March 10, 2025

Speaker:	Borjan Geshkovski
Institution:	Inria-Paris and LJLL
Time:	14:00 - 15:00
Location:	1A13

Transformers are deep neural network architectures that underpin the recent successes of large language models. Unlike more classical architectures that can be viewed as point-to-point maps, a Transformer acts as a measure-to-measure map implemented as a specific interacting particle system on the unit sphere: the input is the empirical measure of tokens in a prompt and its evolution is governed by the continuity equation. Transformers are not limited to empirical measures and can in principle process any input measure. We provide an explicit choice of parameters that allows a single Transformer to match N arbitrary input measures to N arbitrary target measures, under the minimal assumption that every pair of input-target measures can be matched by some transport map.
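To make the particle-system picture concrete, here is a minimal sketch of tokens evolving on the unit sphere under a softmax self-attention interaction. The abstract does not state the equations, so everything below is an assumption for illustration: the drift of token i is taken to be the attention-weighted average of V x_j with weights proportional to exp(beta <Q x_i, K x_j>), projected onto the tangent space of the sphere at x_i. The names `attention_velocity`, `evolve`, `beta`, `Q`, `K`, `V`, and the explicit Euler step with renormalization are hypothetical choices, not the speaker's construction.

```python
import numpy as np

def attention_velocity(x, beta=1.0, Q=None, K=None, V=None):
    """Assumed self-attention drift at each particle, projected onto the sphere.

    x is an (n, d) array whose rows are unit vectors (the tokens).
    Q, K, V default to the identity; they are illustrative parameters only.
    """
    n, d = x.shape
    Q = np.eye(d) if Q is None else Q
    K = np.eye(d) if K is None else K
    V = np.eye(d) if V is None else V
    # logits[i, j] = beta * <Q x_i, K x_j>
    logits = beta * (x @ Q.T) @ (x @ K.T).T
    logits -= logits.max(axis=1, keepdims=True)  # stabilize the softmax
    weights = np.exp(logits)
    weights /= weights.sum(axis=1, keepdims=True)
    drift = weights @ (x @ V.T)  # row i: sum_j weights[i, j] * V x_j
    # Remove the radial component so the velocity is tangent to the sphere at x_i.
    return drift - np.sum(drift * x, axis=1, keepdims=True) * x

def evolve(x, steps=200, dt=0.05, **params):
    """Explicit Euler steps, renormalizing to stay on the unit sphere."""
    for _ in range(steps):
        x = x + dt * attention_velocity(x, **params)
        x /= np.linalg.norm(x, axis=1, keepdims=True)
    return x

# Empirical measure of 16 random tokens on the sphere in R^3.
rng = np.random.default_rng(0)
tokens = rng.normal(size=(16, 3))
tokens /= np.linalg.norm(tokens, axis=1, keepdims=True)
final = evolve(tokens)
```

In this sketch the input measure is the empirical measure of the rows of `tokens`, and `final` holds the particles after the flow. For a general input measure, one would replace the average over particles by an integral against the measure, which is the setting of the continuity equation mentioned in the abstract.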
