ROADEF 2026 - Sciencesconf.org

sciencesconf.org:roadef2026:685221

Reinforcement learning for online stochastic matching

Antoine Lunven 1, @ , Ana Bušić 1, @ , Jean-Michel Fourneau 1, 2, @

1 : Apprentissage, graphes et optimisation distribuée

Département d'informatique - ENS-PSL, Centre Inria de Paris

2 : Données et algorithmes pour une ville intelligente et durable - DAVID

Université de Versailles Saint-Quentin-en-Yvelines

We consider online stochastic matching model, with items arriving over time according to a stochastic process. A matching model is defined as a set of classes and a set a compatibilities among these classes of items, which are represented by a connected compatibility graph. Items enter the system one by one and their class is drawn according to a probability distribution. We consider First Come First Matched Policy, i.e. uppon arrival an item is matched with the oldes compatible item in the system, and stored in a queue if there is no compatible items. We propose to use Reinforcement Learning (RL) algorithms to investigate the impact of graph topology on the system performance (e.g. what edges to add to the graph in order to minimize the queue size) for the FCFM policy, for which there is a known closed form steady-state distribution.

Type :	:	Résumé
Export du papier vers HAL	:	Oui
Thématiques	:	[INVITE] [GT COSMOS et GT DAAO] Apprentissage par renforcement
Mots-Clés	:	stochastic matching ; product form queueing networks ; reinforcement learning
Licence du fichier	:	Paternité

Vie privée | Accessibilité