🚀 Getting Started

Overview

GEM is a diverse collection of environments for training LLM agents in the era of experience. The library includes math, code, general reasoning, and question-answering environments, as well as a suite of games (Mastermind, Minesweeper, Hangman, etc.). GEM also features fully integrated Python and search tool use.

New to GEM? Start with the Quick Start guide below to get up and running in minutes.

Installation

pip install gem-llm

Quick Start

Here’s a simple example to get you started. The interface closely follows Gym and other popular RL environment suites.

Environments are initialized with make() (or make_vec() for parallelization), and each environment provides the methods Env.reset(), Env.step(), and Env.sample_random_action().

import gem

# Initialize the environment
env = gem.make("game:GuessTheNumber-v0")

# Reset the environment to generate the first observation
observation, info = env.reset()
for _ in range(30):
    action = env.sample_random_action()  # insert policy here

    # apply action and receive next observation, reward
    # and whether the episode has ended
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended then reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()

See the rest of the documentation for details on vectorized environments, automatic resetting, observation and chat templates, and integrated tools.
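
The "insert policy here" comment in the loop above can be made concrete by wrapping the loop in a small rollout helper that takes a policy function. The sketch below is illustrative only: ToyGuessEnv is a hypothetical stand-in written for this example (it is not part of GEM) so that the snippet runs without installing anything, and run_episode is a helper name we made up. It assumes only the reset(), step(), and sample_random_action() methods described above; the action format of real GEM environments may differ.

```python
import random
from typing import Callable, Tuple


class ToyGuessEnv:
    """Hypothetical stand-in environment (NOT part of GEM).

    It mimics the Gym-style interface from the Quick Start:
    reset() -> (observation, info)
    step(action) -> (observation, reward, terminated, truncated, info)
    """

    def __init__(self, low: int = 1, high: int = 10, max_turns: int = 5, seed: int = 0):
        self._rng = random.Random(seed)
        self._low, self._high, self._max_turns = low, high, max_turns
        self._target = low
        self._turns = 0

    def reset(self) -> Tuple[str, dict]:
        self._target = self._rng.randint(self._low, self._high)
        self._turns = 0
        return f"Guess a number between {self._low} and {self._high}.", {}

    def step(self, action: str) -> Tuple[str, float, bool, bool, dict]:
        self._turns += 1
        try:
            guess = int(action)
        except ValueError:
            guess = None  # unparsable actions count as a wasted turn
        terminated = guess == self._target
        truncated = (not terminated) and self._turns >= self._max_turns
        if terminated:
            observation = "Correct!"
        elif guess is None:
            observation = "Invalid guess."
        elif guess < self._target:
            observation = "Too low."
        else:
            observation = "Too high."
        return observation, (1.0 if terminated else 0.0), terminated, truncated, {}

    def sample_random_action(self) -> str:
        return str(self._rng.randint(self._low, self._high))


def run_episode(env, policy: Callable[[str], str], max_steps: int = 30) -> Tuple[float, int]:
    """Run one episode with the given policy; return (total reward, steps taken)."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    for _ in range(max_steps):
        action = policy(observation)  # the policy maps an observation to an action
        observation, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        steps += 1
        if terminated or truncated:
            break
    return total_reward, steps


# Random policy, equivalent to calling env.sample_random_action() in the loop above
env = ToyGuessEnv(seed=0)
total_reward, steps = run_episode(env, lambda obs: env.sample_random_action())
print(f"return={total_reward}, steps={steps}")
```

Because run_episode relies only on reset() and step(), the same helper should in principle accept an environment created with gem.make(), with the lambda replaced by a call into an LLM. We have not verified this against every GEM environment.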

Training Agents

GEM includes single-file examples for training an LLM agent with either the OAT or the verl framework.

Train with OAT

The OAT framework provides a comprehensive solution for training language model agents in reinforcement learning environments.

Train with verl

The verl framework offers another approach to training agents, with different optimization strategies and capabilities.