Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit

Julien Zhou; Pierre Gaillard; Thibaud Rahier; Julyan Arbel

Preprint, Working Paper. Year: 2024

Julien Zhou

Role: Corresponding author
PersonId: 1355200

Apprentissage de modèles à partir de données massives

Criteo AI Lab

Modèles statistiques bayésiens et des valeurs extrêmes pour données structurées et de grande dimension

Pierre Gaillard

Role: Author
PersonId: 13025
IdHAL: pierre-gaillard
ORCID: 0000-0002-5665-7904
IdRef: 19041992X

Apprentissage de modèles à partir de données massives

Thibaud Rahier

Role: Author
PersonId: 1084694

Criteo AI Lab

Julyan Arbel

Role: Author
PersonId: 5183
IdHAL: julyanarbel
ORCID: 0000-0002-2525-4416
IdRef: 178641936

Modèles statistiques bayésiens et des valeurs extrêmes pour données structurées et de grande dimension

Abstract

We address the online unconstrained submodular maximization problem (Online USM) in a setting with stochastic bandit feedback. In this framework, a decision-maker receives noisy rewards from a nonmonotone submodular function taking values in a known bounded interval. This paper proposes Double-Greedy - Explore-then-Commit (DG-ETC), which adapts the Double-Greedy approach from the offline and online full-information settings. DG-ETC satisfies an O(d log(dT)) problem-dependent upper bound on the 1/2-approximate pseudo-regret and, simultaneously, an O(dT^{2/3} log(dT)^{1/3}) problem-free bound, outperforming existing approaches. To that end, we introduce a notion of hardness for submodular functions, which characterizes how difficult they are to maximize with this type of strategy.
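To make the idea concrete, here is a minimal Python sketch of an explore-then-commit variant of the classical randomized Double-Greedy procedure. It is an illustration under stated assumptions, not the DG-ETC algorithm from the paper, whose exact sampling and stopping rules are not given in this abstract. The fixed per-element sample budget `m`, the function names, and the toy cut function on a 3-node path graph are all hypothetical choices made for this example.

```python
import random

# Toy nonmonotone submodular function (assumption for illustration):
# the cut value of a subset S on the path graph 0 - 1 - 2.
EDGES = [(0, 1), (1, 2)]

def cut_value(S):
    """Number of edges with exactly one endpoint in S."""
    return sum((u in S) != (v in S) for u, v in EDGES)

def noisy_mean(noisy_f, subset, m):
    """Average of m noisy evaluations of the reward at `subset`."""
    return sum(noisy_f(subset) for _ in range(m)) / m

def double_greedy_etc_sketch(d, noisy_f, m, rng):
    """Explore-then-commit sketch of randomized Double-Greedy.

    Assumption: every marginal gain is estimated from a fixed budget of
    m samples per queried set, so exploration costs 4 * d * m rounds.
    Returns the set to commit to for the remaining rounds.
    """
    lower = set()            # grows from the empty set
    upper = set(range(d))    # shrinks from the full ground set
    for i in range(d):
        # Explore: estimate the gain of adding i to `lower`
        # and the gain of removing i from `upper`.
        a = noisy_mean(noisy_f, lower | {i}, m) - noisy_mean(noisy_f, lower, m)
        b = noisy_mean(noisy_f, upper - {i}, m) - noisy_mean(noisy_f, upper, m)
        a_pos, b_pos = max(a, 0.0), max(b, 0.0)
        # Randomized Double-Greedy rule: add i with probability a+ / (a+ + b+).
        p_add = 1.0 if a_pos + b_pos == 0 else a_pos / (a_pos + b_pos)
        if rng.random() < p_add:
            lower.add(i)
        else:
            upper.discard(i)
    return lower  # at this point lower == upper

if __name__ == "__main__":
    rng = random.Random(0)
    # Bounded noise, so rewards stay in a known interval.
    noisy_cut = lambda S: cut_value(S) + rng.uniform(-0.5, 0.5)
    committed = double_greedy_etc_sketch(3, noisy_cut, m=200, rng=rng)
    print(sorted(committed), cut_value(committed))
```

In this sketch the learner spends 4dm rounds exploring and then plays the returned set until the horizon T. A larger `m` makes the estimated gains more reliable but lengthens the exploration phase, which is the usual explore-then-commit trade-off.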

Keywords

Unconstrained submodular maximization; Stochastic bandits; Logarithmic regret; Sampling complexity

Domains

Machine Learning [stat.ML]; Machine Learning [cs.LG]; Combinatorics [math.CO]; Optimization and Control [math.OC]

Main file

Nonmonotone_Submodular_Bandits_HAL.pdf (717.45 KB)

Origin: Files produced by the author(s)

Contributor: Julien Zhou

https://hal.science/hal-04729023

Submitted on: Thursday, October 10, 2024, 11:03:46

Last modified on: Saturday, November 9, 2024, 22:02:19

Dates and versions

hal-04729023, version 1 (10-10-2024)

License

Attribution

Identifiers

HAL Id: hal-04729023, version 1

Cite

Julien Zhou, Pierre Gaillard, Thibaud Rahier, Julyan Arbel. Logarithmic Regret for Unconstrained Submodular Maximization Stochastic Bandit. 2024. ⟨hal-04729023⟩

Collections

UGA CNRS INRIA INSMI LJK LJK_GI LJK_PS INRIA2 TDS-MACS LJK-GI-THOTH LJK-PS-STATIFY
