24. December 2018
3 min

Reinforcement learning - Part 3: Creating your own gym environment

What good is the ability to create an agent if you don't know how to implement an environment? The Gym toolkit already provides a bunch of environments, but not every problem is covered. In this post I will show you how to create your very own Gym environment. I will use a bubble shooter game written in Python and wrap it into the expected shape. The procedure stays pretty much the same for every problem, so you just have to adapt this tutorial to your needs.


Before you start building your environment, you need to install a few things first. You will need Git and Python 3.5 or higher, as well as Gym itself. I recommend cloning the Gym Git repository directly. This is particularly useful when you’re working on modifying Gym itself or adding new environments (which is exactly what we are planning to do). You can download and install it using:
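A source install along the lines of the Gym README looks like this (assuming `pip` points at your Python 3 installation):

```shell
# Clone the Gym repository and install it in editable mode,
# so local changes to Gym are picked up immediately.
git clone https://github.com/openai/gym
cd gym
pip install -e .
```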

For this special case we also need the PyGame library, since the bubble shooter is built on it. You can skip this step if your game doesn’t use PyGame. For installation just do:
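The usual pip install is enough here:

```shell
# Install PyGame from PyPI.
pip install pygame
```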

Setting up the package structure

Gym environments are distributed as pip packages, so we will set up that structure now. Just replace everything specific to bubble shooter with your own innovative name. We create a new repository called “gym-bubbleshooter”. It should contain at least the following files:
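The layout, following the structure recommended in the Gym documentation, looks roughly like this:

```
gym-bubbleshooter/
  README.md
  setup.py
  gym_bubbleshooter/
    __init__.py
    envs/
      __init__.py
      bubbleshooter_env.py
```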

The README.md can contain a short description of your environment; in this case it briefly explains the game bubble shooter. The file gym-bubbleshooter/setup.py should contain:
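A minimal setup.py in the style of the Gym documentation could look like this (the version number and dependency list are illustrative):

```python
from setuptools import setup

setup(
    name='gym_bubbleshooter',
    version='0.0.1',
    # The environment needs Gym for the interface and PyGame for rendering.
    install_requires=['gym', 'pygame'],
)
```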

The gym-bubbleshooter/gym_bubbleshooter/__init__.py should contain the following lines. The id is the name we will later pass to gym.make() to instantiate the environment.
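A sketch of the registration call, using Gym's standard register() function (the id `bubbleshooter-v0` and class name `BubbleShooterEnv` are assumptions matching the naming used in this post):

```python
from gym.envs.registration import register

register(
    id='bubbleshooter-v0',  # the name passed to gym.make() later
    entry_point='gym_bubbleshooter.envs:BubbleShooterEnv',
)
```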

The gym-bubbleshooter/gym_bubbleshooter/envs/__init__.py  should have:
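This file only needs to re-export the environment class so the entry point above resolves (again, `BubbleShooterEnv` is the class name assumed throughout this post):

```python
from gym_bubbleshooter.envs.bubbleshooter_env import BubbleShooterEnv
```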


The core of the environment is gym-bubbleshooter/gym_bubbleshooter/envs/bubbleshooter_env.py. It contains the environment class with the four methods we know from interacting with other environments.

The first method initializes the class and sets the initial state. The second one, step(), takes an action and advances the environment by one step. It returns the next state, the reward for that action, a boolean indicating whether the episode is over, and a dictionary with additional info about our problem. The remaining two methods are reset(), which resets the state and returns it, and render(), which visualizes the state of the environment in some form.
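To make the interface concrete, here is a toy, self-contained sketch of the four methods using placeholder counter logic instead of the actual bubble shooter game. In the real file the class subclasses gym.Env and defines action_space and observation_space via gym.spaces; those parts are omitted here so the sketch runs on its own.

```python
class BubbleShooterEnv:
    """Gym-style environment skeleton (game logic is a placeholder)."""

    def __init__(self):
        self.state = None
        self.reset()

    def step(self, action):
        # Apply the action to the state, then report the outcome.
        self.state = self.state + action          # placeholder game logic
        reward = 1.0 if self.state == 3 else 0.0  # placeholder reward
        done = self.state >= 3                    # episode ends at state 3
        info = {}                                 # extra diagnostics
        return self.state, reward, done, info

    def reset(self):
        # Restore the initial state and return it.
        self.state = 0
        return self.state

    def render(self, mode='human'):
        # Visualize the current state (the real env draws via PyGame).
        print(f"state: {self.state}")
```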

For my special case I took an existing game and shaped it into the form above. I will not go into detail on every little step, but rather give you my general approach to the problem. The first thing I did was create a new repository following all of the steps above. Then I cloned the game’s Git repository and copied the code of puzzlebubble.py into my environment class. After that, I deleted everything unnecessary, like the start of the main function, everything related to music, and the endscreen method. If you don’t start from scratch, I highly recommend investing some time into understanding the given code. This will help you a lot as you piece everything together.

The main difficulties with my environment were creating an interface so that you interact with the game through step() instead of the keyboard, and separating the game logic from the rendering. The latter was very time-consuming, as I basically had to rewrite the whole collision logic. You should also take a look at other Gym environments to get a feeling for how they work and how they use spaces. Once you are finished, you can go on with the next section.


Once you are done implementing, you can install the environment. Just go to the gym-bubbleshooter folder and run the following command:
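The install is the usual editable pip install from inside the package folder:

```shell
# Install the package in editable mode so code changes take effect
# without reinstalling.
cd gym-bubbleshooter
pip install -e .
```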


There it is: a working Gym environment for bubble shooter. It probably needs some performance improvements and a bit of cleanup before you can unleash your agents on it. But for now I’m pretty content with it.

To use this environment, you have to clone my Git repository and install it. Then you can use the following lines of code.
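Usage follows the standard Gym pattern; the environment id `bubbleshooter-v0` is an assumption matching the naming used in this post:

```python
import gym
import gym_bubbleshooter  # importing the package registers the environment

env = gym.make('bubbleshooter-v0')
state = env.reset()
state, reward, done, info = env.step(env.action_space.sample())
env.render()
```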

And that’s the end of my blog post trilogy about reinforcement learning. After these three posts full of RL, you should be able to create your own agent and environment. I hope you learned something new and enjoyed reading. I would love to read your feedback in the comments. Merry Christmas to you all and a happy new year!





