

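import gymnasium as gym
# Create an environment
env = gym.make('CartPole-v1')
# Initialize the environment; reset() returns the first observation and an info dict
state, info = env.reset()
# Example of taking a random action
action = env.action_space.sample()
# step() returns five values; the episode is over when terminated or truncated is True
next_state, reward, terminated, truncated, info = env.step(action)
done = terminated or truncated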
env = create_environment()
state = env.get_initial_state()
for i in range(n_iterations):
    action = choose_action(state)
    state, reward = env.execute(action)
    update_knowledge(state, action, reward)
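The generic loop above can be made concrete with a toy example. The sketch below is only an illustration, not a real library API: `TwoArmedBandit`, its payout probabilities, and `run_agent` are hypothetical names invented here, and the epsilon-greedy rule and running-average update are one simple choice for `choose_action` and `update_knowledge` among many.

```python
import random


class TwoArmedBandit:
    """Hypothetical toy environment: two actions, each pays 1.0 with a fixed probability."""

    def __init__(self, win_probs=(0.2, 0.8), seed=0):
        self.win_probs = win_probs
        self.rng = random.Random(seed)

    def get_initial_state(self):
        return 0  # a single dummy state; this toy problem has no real state

    def execute(self, action):
        # Return (next_state, reward), mirroring env.execute(action) in the loop above
        reward = 1.0 if self.rng.random() < self.win_probs[action] else 0.0
        return 0, reward


def run_agent(n_iterations=2000, epsilon=0.1, seed=0):
    env = TwoArmedBandit(seed=seed)
    rng = random.Random(seed + 1)
    value = [0.0, 0.0]  # running average reward observed for each action
    count = [0, 0]      # how many times each action was taken
    state = env.get_initial_state()
    for _ in range(n_iterations):
        # choose_action: explore with probability epsilon, otherwise act greedily
        if rng.random() < epsilon:
            action = rng.randrange(2)
        else:
            action = max(range(2), key=lambda a: value[a])
        state, reward = env.execute(action)
        # update_knowledge: incremental mean of the rewards seen for this action
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value, count


if __name__ == "__main__":
    value, count = run_agent()
    print("estimated values:", value)
    print("times each action was taken:", count)
```

Over many iterations the agent should end up taking the better-paying action far more often, which is the reward-driven behavior the loop is meant to produce.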
Tasks
- episodic - tasks segmented into episodes; each episode has a clear beginning and end (for example, one CartPole run).
- continuing - tasks that go on without a natural end point.
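One place the two task types differ in practice is how total reward is measured. The sketch below assumes a common convention that these notes do not state: an episode's return is the plain sum of its rewards, while a task without an end uses a discount factor (called `gamma` here, an assumed name and value) so the sum stays finite. The function names are illustrative.

```python
def episodic_return(rewards):
    """Total reward collected over one finished episode."""
    return sum(rewards)


def discounted_return(rewards, gamma=0.9):
    """Discounted sum r0 + gamma*r1 + gamma^2*r2 + ..., accumulated from the last reward backwards."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    print(episodic_return([1, 1, 1]))
    print(discounted_return([1, 1, 1], gamma=0.5))
    # A long stream of reward 1 with gamma=0.9 approaches 1 / (1 - 0.9) = 10
    print(discounted_return([1] * 1000, gamma=0.9))
```

With `gamma` below 1, rewards far in the future count for less, so even an unending stream of bounded rewards has a bounded return.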
Key points
- RL learns from rewards for desirable behaviors and penalties for undesirable ones.
- RL is built on the interaction between an agent and an environment, with the agent acting to achieve a specific goal.
