Deep Q Learning for Pokemon Showdown by Tayyab

Deep Q-Learning for Pokémon Showdown
Tayyab Hussain
Supervisor: Tom Chothia
BSc Computer Science 2019/2020

Abstract

Pokémon Showdown is an open-source battle simulator for the Pokémon video game series, used by thousands of players to practice for tournaments or simply to play for fun. Pokémon is a diverse and difficult game to master: it has a very large state space and demands extensive game knowledge and the ability to predict a human opponent. This project applies deep Q-learning to investigate whether the game is solvable by a machine or whether the problem is intractable for computers. Several features make Pokémon a potentially difficult game for machine learning. For example, move trees cannot be built easily because every move carries inherent randomness, such as the chance of missing, critical hits and damage rolls. This randomness increases the state space drastically, so simple expectimax algorithms will not produce the best results, and simple Q-learning is insufficient. In addition, the very best Pokémon Showdown players predict their opponent's moves using human psychology. There is not always an obviously correct move, so an opponent's choice cannot always be predicted, and players often fall back on "gut feeling". However, deep learning has shown very strong results in games such as Go, which also involve an element of human psychology, and this is why it was chosen as the technique for this project.