Deep Q-Learning for Pokémon Showdown
Tayyab Hussain
Supervisor: Tom Chothia
BSc Computer Science 2019/2020
Abstract
Pokémon Showdown is an open-source battle simulator for the Pokémon video game series, used by thousands of players to practice for tournaments or simply to play for fun. Pokémon is a diverse and difficult game to master: it has a very large state space and demands extensive game knowledge and the ability to predict a human opponent. This project uses deep Q-learning to investigate whether the game is solvable by a machine or whether the problem is intractable for computers. Several features make Pokémon a potentially difficult game for machine learning to solve. For example, move trees cannot easily be constructed because every move carries inherent randomness, such as the chance of missing, critical hits and damage rolls. This randomness drastically increases the state space, so simple expectimax algorithms will not produce the best results and simple Q-learning is insufficient. In addition, the very top players in Pokémon Showdown predict their opponent’s moves based on human psychology. There is not always an obviously correct move, so choices cannot always be easily predicted, and some moves are made on “gut feeling”. However, deep learning has shown extremely impressive results in games such as Go, which also have an element of human psychology, and this is why it was chosen as the preferred technique for solving this game.