Reinforcement Learning Assignment 3 Solution


  • Introduction

The goal of this assignment is to experiment with model-free control, including on-policy learning (Sarsa) and off-policy learning (Q-learning). To gain a deep understanding of the principles of these two iterative approaches and the differences between them, you will implement both Sarsa and Q-learning on the Cliff Walking example.

  • Cliff Walking

Figure 1: Cliff Walking

Consider the gridworld shown in Figure 1. This is a standard undiscounted, episodic task, with start state (S), goal state (G), and the usual actions causing movement up, down, right, and left. The reward is -1 on all transitions except those into the region marked “The Cliff”. Stepping into this region incurs a reward of -100 and sends the agent instantly back to the start.
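The transition rules described above can be sketched as a small environment. This is a minimal illustration, not a reference implementation: the figure is not reproduced here, so the 4 x 12 grid with the cliff along the bottom row (the layout of the well-known textbook version of this task) is an assumption, as are the names `step` and `is_cliff`, the action numbering, and the choice to keep the agent in place when a move would leave the grid.

```python
# Assumed layout: 4 rows x 12 columns, cliff along the bottom row between
# the start (bottom-left) and the goal (bottom-right).
N_ROWS, N_COLS = 4, 12
START = (3, 0)
GOAL = (3, 11)

# Hypothetical action numbering: 0 = up, 1 = down, 2 = right, 3 = left.
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, 1), 3: (0, -1)}


def is_cliff(state):
    """True if the state lies in the region marked 'The Cliff'."""
    row, col = state
    return row == N_ROWS - 1 and 0 < col < N_COLS - 1


def step(state, action):
    """Apply one action and return (next_state, reward, done)."""
    d_row, d_col = MOVES[action]
    # Assumption: a move that would leave the grid keeps the agent in place.
    row = min(max(state[0] + d_row, 0), N_ROWS - 1)
    col = min(max(state[1] + d_col, 0), N_COLS - 1)
    next_state = (row, col)
    if is_cliff(next_state):
        # Reward -100 and an instant return to the start; the episode continues.
        return START, -100, False
    # Every other transition costs -1; the episode ends at the goal.
    return next_state, -1, next_state == GOAL
```

An episode would start at `START` and call `step` repeatedly until `done` is true.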

  • Experiment Requirements

Programming language: Python 3

You should build the Cliff Walking environment and search for the optimal travel path with Sarsa and Q-learning, respectively.
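The difference between the two methods lies in the bootstrap target of the update. The sketch below shows both tabular updates side by side. The function names, the dictionary-based Q-table, and the step size `alpha = 0.5` are illustrative assumptions; `gamma = 1.0` reflects the undiscounted setting of the task.

```python
from collections import defaultdict

N_ACTIONS = 4  # up, down, right, left


def make_q_table():
    """Q-table mapping each state to a list of action values, initialised to 0."""
    return defaultdict(lambda: [0.0] * N_ACTIONS)


def sarsa_update(q, s, a, r, s_next, a_next, done, alpha=0.5, gamma=1.0):
    """On-policy update: bootstrap from the action actually chosen in s_next."""
    target = r if done else r + gamma * q[s_next][a_next]
    q[s][a] += alpha * (target - q[s][a])


def q_learning_update(q, s, a, r, s_next, done, alpha=0.5, gamma=1.0):
    """Off-policy update: bootstrap from the greedy action in s_next."""
    target = r if done else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
```

Sarsa needs the next action `a_next` before it can update, so that action is selected first and then reused in the following step. Q-learning only needs the next state.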

Different settings for ε lead to different amounts of exploration during policy updates. Try several values (e.g. ε = 0.1 and ε = 0) to investigate their impact on performance.
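Assuming the parameter being varied is the exploration rate ε of an ε-greedy policy, action selection could look like the sketch below. The function name `epsilon_greedy` and the random tie-breaking among equally valued actions are illustrative choices, not part of the assignment text.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else a greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    best = max(q_values)
    # Break ties at random so that no action is favoured by its index.
    return rng.choice([a for a, v in enumerate(q_values) if v == best])
```

With `epsilon = 0` the policy is purely greedy and never explores. With `epsilon = 0.1` roughly one action in ten is chosen uniformly at random.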


  • Report and Submission

Your report and source code should be compressed into a single archive named “studentID+name”.

The files should be submitted on Canvas before Apr. 24, 2020.
