Video: The Song of the Multi-Armed Bandit

A song designed to help teach the basics of multi-armed bandits (MAB), a type of machine learning algorithm that is the foundation for many recommender systems. These algorithms spend part of their time exploiting choices (arms) that they know are good and the rest exploring new choices. The song (music and lyrics) was written in 2021 by Cynthia Rudin of Duke University and was part of a set of three data-science-oriented songs that won the grand prize in the 2023 A-mu-sing competition. The lyrics are full of double entendres, so the whole song has a second meaning in which the bandit could be someone who just takes advantage of other people! The composer mentions these examples of lines with important meanings:
"explore/exploit" - the fundamental topic in MAB!
"No regrets" - the job of the bandit is to minimize regret throughout the game, where regret is the cost of choosing a suboptimal arm
"I keep score" - I keep track of the regret for all the turns in the game
"without thinking too hard," - MAB algorithms typically don't require much computation
"no context, there to use," - This particular bandit isn't a contextual bandit; it doesn't have feature vectors
"uncertainty drove this ride." - rewards are probabilistic
"I always win my game" - asymptotically, the bandit always finds the best arm
"help you, decide without the AB testing you might do" - Bandits are an alternative to massive A/B testing of all pairs of arms
"Never, keeping anyone, always looking around and around" - There's always some probability of exploration throughout the play of the bandit algorithm

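The ideas behind the lines above can be sketched in a few lines of Python. This is a minimal epsilon-greedy bandit, one common textbook strategy and not necessarily the algorithm the song has in mind. The function name, the arm probabilities, and the parameter values are all illustrative assumptions.

```python
import random


def run_epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli multi-armed bandit with an epsilon-greedy strategy.

    true_means holds each arm's (hypothetical) probability of paying out 1.
    Returns the pull counts, the estimated means, and the cumulative regret.
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # "I keep score": pulls per arm
    estimates = [0.0] * n_arms   # running average reward per arm
    best_mean = max(true_means)
    regret = 0.0                 # expected reward lost to suboptimal pulls

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: "always looking around and around"
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best estimate so far
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # "Uncertainty drove this ride": the reward is a coin flip
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Cheap incremental mean update: "without thinking too hard"
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        regret += best_mean - true_means[arm]

    return counts, estimates, regret


if __name__ == "__main__":
    # Three arms with made-up payout probabilities
    counts, estimates, regret = run_epsilon_greedy([0.2, 0.5, 0.8])
    print("pulls per arm:", counts)
    print("estimated means:", [round(e, 3) for e in estimates])
    print("cumulative regret:", round(regret, 1))
```

Because the exploration rate here is fixed, the bandit never stops trying other arms, which matches the last lyric; there are no feature vectors anywhere in the loop, which matches "no context".
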
Source Code Available: 
Source Code Not Available
Free for Nonprofits
