Getting Started
GEM is a diverse collection of environments for training LLM agents in the era of experience. The library includes math, code, general reasoning, and question-answering environments, as well as a suite of games (Mastermind, Minesweeper, Hangman, etc.). GEM also features fully integrated Python and search tool use.
Installation
pip install gem-llm
Quick Start
Here’s a simple example to get you started. The interface closely follows Gym and other popular RL environment suites.
Environments can be initialized with make() (or make_vec() for parallelization), and each environment has Env.reset(), Env.step(), and Env.sample_random_action() functions.
import gem

# Initialize the environment
env = gem.make("game:GuessTheNumber-v0")

# Reset the environment to generate the first observation
observation, info = env.reset()

for _ in range(30):
    action = env.sample_random_action()  # insert policy here

    # Apply the action and receive the next observation, the reward,
    # and whether the episode has ended
    observation, reward, terminated, truncated, info = env.step(action)

    # If the episode has ended, reset to start a new episode
    if terminated or truncated:
        observation, info = env.reset()
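To make the "insert policy here" step concrete, the following is a minimal sketch that does not use GEM itself. ToyGuessEnv is a hypothetical stand-in written only for illustration: it mimics the reset() / step() / sample_random_action() interface described above, and its observation strings, string-valued actions, reward values, and turn limit are all assumptions, not GEM's actual GuessTheNumber behavior. The policy is a simple bisection over the hints.

```python
import random
from typing import Any, Dict, Optional, Tuple


class ToyGuessEnv:
    """Hypothetical number-guessing environment (NOT GEM's implementation)."""

    def __init__(self, low: int = 1, high: int = 20,
                 max_turns: int = 10, seed: Optional[int] = None):
        self.low, self.high, self.max_turns = low, high, max_turns
        self.rng = random.Random(seed)
        self.target = low
        self.turns = 0

    def reset(self) -> Tuple[str, Dict[str, Any]]:
        # Draw a new target and return the first observation plus an info dict.
        self.target = self.rng.randint(self.low, self.high)
        self.turns = 0
        return f"Guess a number between {self.low} and {self.high}.", {}

    def sample_random_action(self) -> str:
        # Actions are plain strings here (an assumption for this sketch).
        return str(self.rng.randint(self.low, self.high))

    def step(self, action: str) -> Tuple[str, float, bool, bool, Dict[str, Any]]:
        self.turns += 1
        guess = int(action)
        if guess == self.target:
            return "Correct!", 1.0, True, False, {}
        # The episode is truncated once the turn budget is used up.
        truncated = self.turns >= self.max_turns
        hint = "higher" if guess < self.target else "lower"
        return f"The target is {hint}.", 0.0, False, truncated, {}


def run_bisection_episode(env: ToyGuessEnv) -> Tuple[float, int]:
    """Play one episode with a bisection policy; return (total reward, turns)."""
    observation, info = env.reset()
    lo, hi = env.low, env.high
    total_reward, turns = 0.0, 0
    while True:
        guess = (lo + hi) // 2  # the policy: always guess the midpoint
        observation, reward, terminated, truncated, info = env.step(str(guess))
        total_reward += reward
        turns += 1
        if terminated or truncated:
            return total_reward, turns
        # Narrow the search interval using the hint in the observation.
        if "higher" in observation:
            lo = guess + 1
        else:
            hi = guess - 1


if __name__ == "__main__":
    env = ToyGuessEnv(seed=0)
    print(run_bisection_episode(env))
```

The loop has the same shape as the random-action example: only the line that chooses the action changes. With an LLM agent, that line would instead turn the observation into a prompt and parse the model's reply into an action.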
Training Agents
GEM includes single-file examples for training an LLM agent with the oat or verl framework.
The OAT framework provides a comprehensive solution for training language model agents in reinforcement learning environments.
The VERL framework offers another approach to training agents with different optimization strategies and capabilities.