Multi-armed bandits


You may have recently seen or heard about the Netflix documentary The Social Dilemma, which explores the potential problems of companies constantly bombarding us with suggestions and adverts. But how do they pick which adverts to show us? I’ll introduce you to recommender systems and show, through a simplified example called the multi-armed bandit problem, how these companies can learn which adverts are most effective. I’ll then discuss where this could go in the future and probably sound like a maniac telling you that we’re all doomed to be controlled by the machines. It should be a lot of fun! Disclaimer: I am no expert in statistics or reinforcement learning, so I will try to keep the serious maths to a minimum and cover any holes in my knowledge with pretty pictures.

Oct 29, 2020 10:15 AM
Bath Postgraduate Student Seminar
Jeremy Worsfold
PhD in Applied Mathematics and Collective Behaviour

My research interests include Collective Behaviour, specifically swarming models and interacting particle systems. I also have interests in Reinforcement Learning and Scientific Computing.