Reinforcement Learning Assignment 3 Solution

Description

Introduction

The goal of this assignment is to experiment with model-free control, including on-policy learning (Sarsa) and off-policy learning (Q-learning). To gain a deeper understanding of the principles behind these two iterative approaches and the differences between them, you will implement both Sarsa and Q-learning on the Cliff Walking example.

Cliff Walking

Figure 1: Cliff Walking

Consider the gridworld shown in Figure 1. This is a standard undiscounted, episodic task, with a start state (S), a goal state (G), and the usual actions causing movement up, down, right, and left. The reward is -1 on all transitions except those into the region marked “The Cliff”. Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start.
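The environment described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the figure is not reproduced here, so the 4 x 12 layout, the corner positions of S and G, and the cliff occupying the bottom row between them are taken from the usual textbook version of this example. The class name, the action encoding, and the choice to leave the agent in place when it walks into a wall are likewise hypothetical.

```python
class CliffWalkingEnv:
    """Minimal Cliff Walking gridworld (assumed layout: 4 rows x 12 columns)."""

    # Assumed action encoding: 0 = up, 1 = down, 2 = right, 3 = left
    MOVES = [(-1, 0), (1, 0), (0, 1), (0, -1)]

    def __init__(self, rows=4, cols=12):
        self.rows, self.cols = rows, cols
        self.start = (rows - 1, 0)         # S: bottom-left corner (assumed)
        self.goal = (rows - 1, cols - 1)   # G: bottom-right corner (assumed)
        self.state = self.start

    def is_cliff(self, state):
        """The cliff is assumed to fill the bottom row between S and G."""
        row, col = state
        return row == self.rows - 1 and 0 < col < self.cols - 1

    def reset(self):
        """Start a new episode and return the start state."""
        self.state = self.start
        return self.state

    def step(self, action):
        """Apply an action and return (next_state, reward, done)."""
        d_row, d_col = self.MOVES[action]
        # Assumption: moving into a wall leaves the agent where it is
        row = min(max(self.state[0] + d_row, 0), self.rows - 1)
        col = min(max(self.state[1] + d_col, 0), self.cols - 1)
        next_state = (row, col)
        if self.is_cliff(next_state):
            # Reward -100, back to the start, and the episode continues
            self.state = self.start
            return self.state, -100, False
        self.state = next_state
        return next_state, -1, next_state == self.goal
```

Returning `done = False` after a fall follows the task statement, which says the agent is sent back to the start rather than the episode ending.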

Experiment Requirements

Programming language: Python 3

You should build the Cliff Walking environment and search for the optimal travel path using Sarsa and Q-learning, respectively.
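The two methods differ only in the value they bootstrap from, which a minimal sketch of the two update rules may make clearer. The function names, the table layout (a dict mapping each state to a list of action values), and the step size `alpha = 0.5` are assumptions made for illustration. `gamma = 1.0` reflects the undiscounted task.

```python
def sarsa_update(q, state, action, reward, next_state, next_action, done,
                 alpha=0.5, gamma=1.0):
    """On-policy TD update: bootstrap from the action actually taken next."""
    target = reward if done else reward + gamma * q[next_state][next_action]
    q[state][action] += alpha * (target - q[state][action])


def q_learning_update(q, state, action, reward, next_state, done,
                      alpha=0.5, gamma=1.0):
    """Off-policy TD update: bootstrap from the greedy action in next_state."""
    target = reward if done else reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])


# Hypothetical two-state table: q[state] is a list of action values
q = {"a": [0.0, 0.0], "b": [2.0, 4.0]}
sarsa_update(q, "a", 0, -1, "b", 0, False)    # target = -1 + 2.0 = 1.0
q_learning_update(q, "a", 1, -1, "b", False)  # target = -1 + 4.0 = 3.0
print(q["a"])  # [0.5, 1.5]
```

Sarsa needs the next action before it can update, so the action for the following step has to be selected first. Q-learning can update as soon as the next state is observed.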

Different settings of ε lead to different amounts of exploration during policy updates. Try several values (e.g. ε = 0.1 and ε = 0) to investigate their impact on performance.
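One common way to expose this exploration parameter is epsilon-greedy action selection, sketched below. The assignment does not prescribe this exact rule, so the function name, its signature, and the random tie-breaking are assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, otherwise a greedy one.

    q_values: list of action values for the current state.
    rng: the random module or a random.Random instance, for reproducibility.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties among equally valued actions at random
    return rng.choice([a for a, v in enumerate(q_values) if v == best])


rng = random.Random(0)  # fixed seed so repeated runs are comparable
greedy_action = epsilon_greedy([0.0, 5.0, 1.0, 2.0], epsilon=0.0, rng=rng)
explore_action = epsilon_greedy([0.0, 5.0, 1.0, 2.0], epsilon=0.1, rng=rng)
```

With `epsilon = 0` the rule never explores and always returns a highest-valued action. With `epsilon = 0.1` roughly one step in ten is chosen uniformly at random.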

Report and Submission

Your report and source code should be compressed into a single archive named “studentID+name”.

The files should be submitted on Canvas before Apr. 24, 2020.
