Make Out Like a Bandit

Continuous Improvement, A-B Testing
This simulation is intended to teach you about the multi-armed bandit problem and bandit algorithms. Each button will give you a different random amount of fictional money but costs a fictional $5 to click. How much fictional money can you make?

an octopus playing with slot machines

You are an octopus at a casino. You want to make as much money as possible by pulling levers on 8 different slot machines. Each time you pull a lever, it costs you $5.

Click on the buttons below to pull a lever.

How much fictional money can you make in...


Which button was the most profitable?

How much money did you have to lose before you figured this out (and started to come out ahead)?

What was your strategy for balancing Exploitation (i.e., earning the most money possible from buttons that were returning high values) and Exploration (i.e., trying new buttons)? When did you move on, and when did you stay?
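One common way to express this balance in code is an epsilon-greedy rule: most of the time pull the lever with the best average so far, and occasionally pull a random one. The sketch below is only an illustration. The payout model (`make_machines`, Gaussian payouts around hidden means) is an assumption and not how the buttons on this page actually behave; only the 8 machines and the $5 cost come from the simulation.

```python
import random

COST_PER_PULL = 5  # from the simulation: each pull costs $5


def make_machines(n=8, seed=0):
    """Hypothetical machines: each has a hidden mean payout (assumed, not the page's real values)."""
    rng = random.Random(seed)
    means = [rng.uniform(0, 8) for _ in range(n)]
    return means, rng


def pull(means, rng, arm):
    """Net result of one pull: a noisy, non-negative payout minus the $5 cost."""
    payout = max(0.0, rng.gauss(means[arm], 2.0))
    return payout - COST_PER_PULL


def epsilon_greedy(n_pulls=100, epsilon=0.1, seed=0):
    """Play n_pulls times, exploring with probability epsilon and exploiting otherwise."""
    means, rng = make_machines(seed=seed)
    n_arms = len(means)
    counts = [0] * n_arms      # pulls per machine
    averages = [0.0] * n_arms  # running average net result per machine
    total = 0.0
    for _ in range(n_pulls):
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]                     # try every machine once first
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)          # explore: random machine
        else:
            arm = max(range(n_arms), key=lambda a: averages[a])  # exploit: best so far
        net = pull(means, rng, arm)
        counts[arm] += 1
        averages[arm] += (net - averages[arm]) / counts[arm]  # incremental mean update
        total += net
    return total, counts, averages


if __name__ == "__main__":
    total, counts, averages = epsilon_greedy()
    print(f"net winnings: {total:.2f}")
    print("pulls per machine:", counts)
```

Raising `epsilon` means more exploring (and more $5 fees spent on weak machines); lowering it means committing earlier, at the risk of settling on a machine that only looked good by chance.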

Are you certain that you couldn't have made more money if you had explored more? Why? (The best algorithms can net around $60 in 100 clicks.)

If you were to write instructions to systematically choose the best solution in another scenario, what process would you follow?
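As one possible answer, such instructions could follow an upper-confidence-bound (UCB) style rule: try every option once, then always pick the option whose average plus an "uncertainty bonus" is highest, so rarely tried options get revisited. The sketch below is an assumption-laden illustration, not the method used by this page. The scale factor `c` and the demo payout function `demo_pull` are hypothetical.

```python
import math
import random


def ucb_choose(counts, averages, c=2.0):
    """Pick the option with the highest average plus exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # step 1: try every option at least once
    t = sum(counts)
    scores = [
        avg + c * math.sqrt(2 * math.log(t) / n)  # bonus shrinks as an option is tried more
        for avg, n in zip(averages, counts)
    ]
    return max(range(len(counts)), key=lambda a: scores[a])


def run_ucb(pull, n_arms=8, n_pulls=100, c=2.0):
    """Repeatedly choose, observe the net result, and update the running averages."""
    counts = [0] * n_arms
    averages = [0.0] * n_arms
    total = 0.0
    for _ in range(n_pulls):
        arm = ucb_choose(counts, averages, c)
        net = pull(arm)
        counts[arm] += 1
        averages[arm] += (net - averages[arm]) / counts[arm]
        total += net
    return total, counts


# Hypothetical payouts for demonstration only (not the page's real values).
_rng = random.Random(1)
_hidden_means = [_rng.uniform(0, 8) for _ in range(8)]


def demo_pull(arm):
    """Noisy, non-negative payout minus the $5 cost per pull."""
    return max(0.0, _rng.gauss(_hidden_means[arm], 2.0)) - 5


if __name__ == "__main__":
    total, counts = run_ucb(demo_pull)
    print(f"net winnings: {total:.2f}")
    print("pulls per option:", counts)
```

The same three steps (try everything once, score each option by average plus uncertainty, update after every result) carry over to other scenarios such as A-B testing, where `pull` would show a variant to a user and return the observed outcome.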

Royce Kimmons

Brigham Young University

Royce Kimmons is an Associate Professor of Instructional Psychology and Technology at Brigham Young University, where he seeks to end the effects of socioeconomic divides on educational opportunities through open education and transformative technology use. He is the founder of many sites focused on providing free, high-quality learning resources to all. You may dialogue with him on Twitter @roycekimmons.

This content is provided to you freely by EdTech Books.
