Computing an Index Policy for Bandits with Switching Penalties

Niño-Mora, José (2010) Computing an Index Policy for Bandits with Switching Penalties. In: 1st International ICST Workshop on Tools for solving Structured Markov Chains.

Abstract

We address the multiarmed bandit problem with switching penalties including both costs and delays. Asawa and Teneketzis (1996) introduced an index for bandits with switching penalties that partially characterizes optimal policies, attaching to each project state a "continuation index" (its Gittins i

Item Type: Conference or Workshop Item (UNSPECIFIED)
Date Deposited: 04 Mar 2026 08:45
Last Modified: 18 Apr 2026 06:30
URI: http://eprints.eai.eu/id/eprint/948
