Implement value iteration


Description

^^^^^^^^^^^

In this exercise you will implement the value iteration algorithm for Markov Decision Processes.

The value iteration algorithm is applied to a 2D grid decision process, where different locations on the grid can contain different rewards. The purpose is to compute the value of each location and the corresponding policy.
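As a refresher, one value-iteration step applies the Bellman optimality backup to a state: its immediate reward plus the discounted best expected value over the available actions. A minimal sketch, assuming a hypothetical MDP interface (the method names `actions`, `transitions`, `reward` and the discount `gamma` are illustrative, not necessarily those in the exercise's `mdp.py`):

```python
def bellman_backup(mdp, values, state, gamma=0.9):
    """One Bellman optimality backup for `state`.

    Assumed (hypothetical) interface, which may differ from mdp.py:
      mdp.actions(s)        -> iterable of actions available in s
      mdp.transitions(s, a) -> iterable of (next_state, probability) pairs
      mdp.reward(s)         -> immediate reward for being in s
    """
    return mdp.reward(state) + gamma * max(
        # Expected value of the successor states under action a.
        sum(p * values[nxt] for nxt, p in mdp.transitions(state, a))
        for a in mdp.actions(state)
    )
```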

Instructions

^^^^^^^^^^^^

1. To prepare for the exercise, make sure you have consulted the lecture slides
and MyCourses material related to Markov Decision Processes, the Bellman
equation, and value iteration.

2. Copy `template-valueiteration.py` to `valueiteration.py`

3. Read and understand all code

– mdp.py :: This file defines an abstract class providing a general interface
for Markov Decision Processes. No need to edit.

– valueiteration.py :: Declares functions related to value iteration.
TASKs 2.x are found here.

– gridmdp.py :: This file defines a grid Markov Decision Process by
inheriting from mdp.py. No need to edit.

– gridactions.py :: Defines actions used by gridmdp.py. No need to edit.

– utils.py :: Defines some utility functions, notably `argmax`, which may come in handy.

4. Implement TASKs 2.1 and 2.2.
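For orientation, an `argmax` utility like the one mentioned for `utils.py` typically returns the element of a sequence that maximizes a key function. This is only a guess at the idea; check the actual signature in `utils.py` before relying on it:

```python
def argmax(items, key):
    """Return the item that maximizes key(item).

    Hypothetical sketch -- the real utils.argmax may take different
    arguments or break ties differently.
    """
    best, best_val = None, float("-inf")
    for item in items:
        val = key(item)
        if val > best_val:
            best, best_val = item, val
    return best
```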

Tasks

^^^^^

– TASK 2.1 :: Implement the `value_of` function.

– TASK 2.2 :: Implement the `value_iteration` function (using `value_of`).
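The two tasks fit together roughly as follows: `value_of` performs one Bellman backup for a single state, and `value_iteration` sweeps that backup over all states until the values stop changing. The sketch below assumes a hypothetical interface (`mdp.states()`, `mdp.actions()`, `mdp.transitions()`, `mdp.reward()`, and the `gamma`/`epsilon` parameters are illustrative); the real signatures are dictated by the template file:

```python
def value_of(mdp, values, state, gamma):
    """One Bellman optimality backup (hypothetical MDP interface)."""
    return mdp.reward(state) + gamma * max(
        sum(p * values[nxt] for nxt, p in mdp.transitions(state, a))
        for a in mdp.actions(state)
    )

def value_iteration(mdp, gamma=0.9, epsilon=1e-6):
    """Sweep Bellman backups over all states until convergence.

    Hypothetical: mdp.states() yields every state of the process.
    """
    values = {s: 0.0 for s in mdp.states()}
    while True:
        new_values = {s: value_of(mdp, values, s, gamma) for s in mdp.states()}
        # Largest change in this sweep; stop once it is negligible.
        delta = max(abs(new_values[s] - values[s]) for s in mdp.states())
        values = new_values
        if delta < epsilon:
            return values
```

A common design choice here is synchronous updates (a fresh `new_values` dict each sweep); in-place (Gauss-Seidel) updates also converge and often faster, but make the result depend on state ordering.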

Testing

^^^^^^^

– `python valueiteration.py` :: Will execute a few basic examples on grids.

– `python test_valueiteration.py` :: Will execute a few unit tests.

Good luck!
