You may have recently seen or heard about the Netflix documentary The Social Dilemma, which explores the potential problems of companies constantly bombarding us with suggestions and adverts. But how do they pick which adverts to show us? I’ll introduce you to recommender systems and, through a simplified example called the multi-armed bandit problem, show how these companies can learn which adverts are most effective. I’ll then discuss where this could go in the future, and probably sound like a maniac telling you that we’re all doomed to be controlled by the machines. It should be a lot of fun! Disclaimer: I am no expert in statistics or reinforcement learning, so I will try to keep the serious maths to a minimum and cover any holes in my knowledge with pretty pictures.