Cartpole - Introduction to Reinforcement Learning (DQN - Deep Q-Learning)

Solving OpenAI Gym Environment

Greg Surma
6 min read · Sep 26, 2018


In today’s article, I am going to introduce you to the hot topic of Reinforcement Learning. After reading this post, you will be able to create an agent that learns through trial and error and ultimately solves the cartpole problem.

Before and after training / Before and after reading this article

Cartpole Problem

Cartpole, also known as an inverted pendulum, is a pendulum with its center of gravity above its pivot point. It is unstable, but it can be controlled by moving the pivot point under the center of mass. The goal is to keep the cartpole balanced by applying appropriate forces to the pivot point.

Cartpole schematic drawing
  • The violet square indicates the pivot point
  • The red and green arrows show the possible horizontal forces that can be applied to the pivot point

A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center. (This is the environment's official description; Gym's CartPole implementation itself ends the episode once the pole passes 12 degrees.)
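To make the reward and termination rules concrete, here is a minimal sketch in plain Python. It is not the Gym implementation: the function names, the trajectory format, and the choice to measure the angle in radians are assumptions made for illustration, and the thresholds are simply the ones quoted in the description above.

```python
import math

# Thresholds quoted in the description above (illustrative assumptions,
# not values read from the Gym source code).
MAX_ANGLE_DEG = 15.0     # pole angle limit, in degrees from vertical
MAX_CART_OFFSET = 2.4    # cart distance limit from the center of the track


def episode_over(cart_position, pole_angle_rad):
    """Return True once either termination condition is met."""
    pole_too_tilted = abs(math.degrees(pole_angle_rad)) > MAX_ANGLE_DEG
    cart_out_of_bounds = abs(cart_position) > MAX_CART_OFFSET
    return pole_too_tilted or cart_out_of_bounds


def total_reward(trajectory):
    """Sum +1 for every timestep in which the pole is still upright.

    `trajectory` is a hypothetical list of (cart_position, pole_angle_rad)
    pairs, one per timestep.
    """
    reward = 0
    for cart_position, pole_angle_rad in trajectory:
        if episode_over(cart_position, pole_angle_rad):
            break  # the episode ends; later timesteps earn nothing
        reward += 1
    return reward


# A made-up trajectory in which the pole tilts a little more each step.
trajectory = [(0.0, 0.0), (0.1, 0.05), (0.3, 0.15), (0.6, 0.30), (0.9, 0.5)]
print(total_reward(trajectory))  # 3: the fourth state (about 17 degrees) ends the episode
```

The longer the agent keeps both conditions from triggering, the more reward it collects, which is exactly the signal a learning agent will try to maximize.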

Take a look at the video below for a real-life demonstration of the cartpole learning process.