Grid World Reinforcement Learning in Python