Video: The Song of the Multi-Armed Bandit

A music video designed to help teach the basics of Multi-Armed Bandits (MABs), a type of machine learning algorithm that is the foundation for many recommender systems. These algorithms spend part of their time exploiting choices (arms) that they know are good and part of it exploring new choices (think of an ad company showing an advertisement it knows is good versus finding out how good a new advertisement is). The music and lyrics were written by Cynthia Rudin of Duke University. The song was one of three data science songs that won the grand prize and first place in the song category of the 2023 A-mu-sing competition.
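The explore/exploit trade-off described above can be sketched with epsilon-greedy, one simple bandit strategy. This is only an illustration under stated assumptions and is not taken from the video: the function name `epsilon_greedy`, the reward probabilities, and the 10% exploration rate are all hypothetical choices.

```python
import random


def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Play a Bernoulli bandit with epsilon-greedy (illustrative sketch).

    true_means: hypothetical click probabilities, one per arm (ad).
    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # how often each arm was pulled
    estimates = [0.0] * n_arms   # running average reward per arm

    for _ in range(n_rounds):
        if rng.random() < epsilon:
            # Explore: try an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pull the arm with the best estimate so far.
            arm = max(range(n_arms), key=lambda a: estimates[a])

        # Rewards are probabilistic: 1 with probability true_means[arm].
        reward = 1.0 if rng.random() < true_means[arm] else 0.0

        # Incremental update of the running average for this arm.
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return counts, estimates


# Three hypothetical ads with assumed click probabilities.
counts, estimates = epsilon_greedy([0.05, 0.10, 0.30])
print(counts, estimates)
```

Because the exploration rate stays fixed, the sketch keeps sampling every arm for as long as it runs, while most pulls go to whichever arm currently looks best.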

The lyrics are full of double entendres, so the whole song carries a second meaning in which the bandit could be someone who just takes advantage of other people! The author provides these examples of lines with important meanings:
"explore/exploit" - the fundamental topic in MAB!
"No regrets" - the job of the bandit is to minimize regret, the reward lost by choosing a suboptimal arm, throughout the game
"I keep score" - the bandit keeps track of the regret over all the turns in the game
"without thinking too hard," - MAB algorithms typically don't require much computation
"no context, there to use," - This particular bandit isn't a contextual bandit; it doesn't have feature vectors
"uncertainty drove this ride." - rewards are probabilistic
"I always win my game" - asymptotically, the bandit always finds the best arm
"help you, decide without the AB testing you might do" - Bandits are an alternative to massive A/B testing of all pairs of arms
"Never, keeping anyone, always looking around and around" - There's always some probability of exploration throughout the play of the bandit algorithm
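The "No regrets" and "I keep score" lines can be made concrete with a small sketch of cumulative regret: on each turn, the regret is the gap between the best arm's expected reward and the expected reward of the arm actually chosen. The function name `cumulative_regret` and the example numbers are hypothetical, and the true means are assumed to be known here only so that the score can be computed for illustration.

```python
def cumulative_regret(true_means, pulls):
    """Return the running total of regret after each turn (illustrative sketch).

    true_means: assumed expected reward of each arm.
    pulls: sequence of arm indices chosen, one per turn.
    """
    best = max(true_means)
    total = 0.0
    history = []
    for arm in pulls:
        # Per-turn regret is zero when the best arm is chosen.
        total += best - true_means[arm]
        history.append(total)
    return history


# Hypothetical game: try arms 0 and 1 once, then stick with the best arm (2).
regret = cumulative_regret([0.05, 0.10, 0.30], [0, 1, 2, 2])
print([round(r, 2) for r in regret])
```

In this example the running total grows during the two exploratory turns (to about 0.25, then 0.45) and stays flat once the best arm is pulled, which is the behavior a good bandit strategy aims for over a long game.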

Source Code Available: 
Source Code Not Available
Free for Nonprofits
