Homework #8 Solution





Description


Please refer to Dr. Sutton’s book (Ch. 1 and Ch. 2) only as needed.

http://incompleteideas.net/book/bookdraft2017nov5.pdf

Then answer the following:

1. Consider the use case (application) of a robot driving a car. In this context, what is reinforcement learning (RL)? How can the adaptive dynamic programming (ADP) and temporal-difference (TD) methods be used for this? What about the Active RL method? [30 points]

2. Based on Ch. 21 of the textbook, Fig. 21.9 [50 points]

For the problem shown in Fig. 21.9 (balancing a long pole on a moving cart):

    1. Construct a Q-learning representation and explain it as an Active RL problem. Show the details of the policy and the transitions, and explain why it is an Active RL problem.
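For orientation, the standard tabular Q-learning update that such a representation would build on can be written as below. This is only a sketch of the generic rule, not a worked answer: the state variables follow the usual cart-pole description (cart position and velocity, pole angle and angular velocity), while the two-action set, the reward $r$, the learning rate $\alpha$, and the discount $\gamma$ are assumptions that the question leaves open.

```latex
% State (assumed continuous, typically discretized for a table):
%   s = (x, \dot{x}, \theta, \dot{\theta})
% Actions (assumed): a \in \{\text{push left}, \text{push right}\}
\[
Q(s, a) \leftarrow Q(s, a)
  + \alpha \left[ r + \gamma \max_{a'} Q(s', a') - Q(s, a) \right]
\]
% Greedy policy derived from the learned table:
\[
\pi(s) = \arg\max_{a} Q(s, a)
\]
```

The agent chooses $a$ itself (for example, $\epsilon$-greedily), so the transition $s \to s'$ it observes depends on its own decisions rather than on a fixed policy.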

3. Answer Question 21.1 from our textbook (Russell & Norvig, page 858); this requires a Python implementation. [80 points]

4. Implement a Q-learning algorithm similar to this tutorial:

https://www.learndatasci.com/tutorials/reinforcement-q-learning-scratch-python-openai-gym/

but using the maze problem we learned in class (see Q-Learning Example.docx), and demonstrate that your implementation works using this data set. [140 points]
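As a rough illustration of what a tabular Q-learning loop for a maze might look like, here is a minimal sketch in plain Python. Everything specific in it is an assumption: the 4x4 `MAZE` layout, the rewards (-1 per move, +10 at the goal), and the hyperparameters are hypothetical and are not taken from Q-Learning Example.docx or from the linked tutorial, which uses an OpenAI Gym environment instead.

```python
import random

# Hypothetical 4x4 maze (NOT the class example): S start, G goal, # wall.
MAZE = [
    "S..#",
    ".#..",
    "...#",
    "#..G",
]
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ROWS, COLS = len(MAZE), len(MAZE[0])
START, GOAL = (0, 0), (3, 3)


def step(state, action):
    """Apply an action; hitting a wall or the border leaves the state unchanged."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    if not (0 <= r < ROWS and 0 <= c < COLS) or MAZE[r][c] == "#":
        r, c = state
    done = (r, c) == GOAL
    reward = 10.0 if done else -1.0  # assumed reward scheme
    return (r, c), reward, done


def train(episodes=2000, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * 4 for r in range(ROWS) for c in range(COLS)}
    for _ in range(episodes):
        state = START
        for _ in range(200):  # cap the episode length
            if rng.random() < epsilon:
                action = rng.randrange(4)  # explore
            else:
                action = max(range(4), key=lambda a: q[state][a])  # exploit
            nxt, reward, done = step(state, action)
            # Q-learning target: bootstrap from the best next action.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
            if done:
                break
    return q


def greedy_path(q, max_steps=50):
    """Follow the greedy policy from START and return the visited states."""
    state, path = START, [START]
    for _ in range(max_steps):
        action = max(range(4), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        path.append(state)
        if done:
            break
    return path


if __name__ == "__main__":
    q_table = train()
    print(greedy_path(q_table))
```

One way to check such an implementation is to compare the greedy path against the shortest route found by hand; swapping in the class maze would mean replacing `MAZE`, `START`, `GOAL`, and the reward values with those from the course material.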
