WindyGridworld: Windy Gridworld

Description

Windy Gridworld problem for reinforcement learning. The available actions are left, right, up and down. In each column, the wind pushes the agent upward by a column-specific number of steps when it takes its next action. If an action would take the agent off the grid, it remains in its previous state. Each step yields a reward of -1 until a terminal state is reached.

Arguments

...

[any] Arguments passed on to makeEnvironment.

Usage

makeEnvironment("windy.gridworld", ...)

Methods

$step(action) Take action in environment. Returns a list with state, reward, done.
$reset() Resets the done flag of the environment and returns an initial state. Useful when starting a new episode.
$visualize() Visualizes the environment (if there is a visualization function).

Details

This is the gridworld (goal state denoted G, start state denoted S). The last row specifies the upward wind in each column.

.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
S	.	.	.	.	.	.	G	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
.	.	.	.	.	.	.	.	.	.
0	0	0	1	1	1	2	2	1	0
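
The dynamics described above can be sketched in a few lines of base R. This is an illustrative sketch only, not the package's implementation: the names `wind`, `moves`, `windy_step`, the `(row, col)` state encoding and the string action names are all hypothetical, and the grid size is read off the drawing above. It also assumes that the wind of the column being left is applied together with the move, and that positions are clamped at the grid border, which may differ in detail from how the environment handles off-grid moves.

```r
# Upward wind strength per column, copied from the last row of the grid above.
wind <- c(0, 0, 0, 1, 1, 1, 2, 2, 1, 0)
n_rows <- 10L              # number of rows drawn in the grid above
n_cols <- length(wind)
start <- c(row = 4L, col = 1L)   # cell marked S
goal  <- c(row = 4L, col = 8L)   # cell marked G

# Hypothetical action names mapped to (row, col) offsets; row 1 is the top row.
moves <- list(left = c(0L, -1L), right = c(0L, 1L),
              up = c(-1L, 0L), down = c(1L, 0L))

windy_step <- function(state, action) {
  delta <- moves[[action]]
  # The wind of the column being left pushes the agent upward (toward row 1).
  row <- state[["row"]] + delta[1] - wind[state[["col"]]]
  col <- state[["col"]] + delta[2]
  # Assumption: clamp to the grid border instead of leaving the grid.
  row <- min(max(row, 1L), n_rows)
  col <- min(max(col, 1L), n_cols)
  next_state <- c(row = row, col = col)
  # Every step costs -1; the episode ends once the goal cell is reached.
  list(state = next_state, reward = -1, done = all(next_state == goal))
}

# Moving right from a column with wind 1 also shifts the agent one row up.
windy_step(c(row = 4L, col = 4L), "right")$state
```

Returning a list with `state`, `reward` and `done` mirrors the shape documented for `$step(action)`, so the sketch can be read side by side with the environment's output.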

References

Sutton and Barto (Book draft 2017): Reinforcement Learning: An Introduction, Example 6.5

Examples

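# NOT RUN {
library(reinforcelearn)  # assumed: the package that provides makeEnvironment
env = makeEnvironment("windy.gridworld")
env$reset()  # start a new episode and return the initial state
# }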
