Description
Department Of Computer Science and Engineering, IIT Palakkad
Q1) Implement the following iteration:
$$x_{t+1} = x_t + \alpha_t (y_t - x_t) \qquad (1)$$
where $x_t \in \mathbb{R}$, $y_t$ is a random variable, and $\alpha_t > 0$ is a step-size. Let us understand how this works by changing the step-size and the random variable:
[25 Marks] Keep the step-size $\alpha_t = 0.1, 0.01, 0.001$ and then
1. $y_t$ is uniform in $[-1, 1]$. Plot $x_t$.
2. $y_t$ is uniform in $[0, 1]$. Plot $x_t$.
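One possible starting point for the constant step-size case is sketched below. The names run_iteration and plot_runs, the horizon num_steps, the initial value x0 and the seed are illustrative assumptions; the question does not fix any of them.

```python
import random


def run_iteration(step_size, low, high, num_steps=5000, x0=0.0, seed=0):
    """Iterate x_{t+1} = x_t + alpha_t * (y_t - x_t) with y_t ~ Uniform[low, high].

    step_size is a function t -> alpha_t. num_steps, x0 and seed are
    assumed values chosen only for illustration.
    """
    rng = random.Random(seed)
    x = x0
    xs = [x]
    for t in range(num_steps):
        y = rng.uniform(low, high)          # draw the random variable y_t
        x = x + step_size(t) * (y - x)      # move x_t a fraction alpha_t towards y_t
        xs.append(x)
    return xs


def plot_runs(runs, title):
    """Plot every labelled trajectory in runs (matplotlib is imported lazily)."""
    import matplotlib.pyplot as plt

    for label, xs in runs.items():
        plt.plot(xs, label=label)
    plt.xlabel("t")
    plt.ylabel("x_t")
    plt.title(title)
    plt.legend()
    plt.show()


# The three constant step-sizes from the question, for both distributions of y_t.
constant_runs = {}
for low, high in [(-1.0, 1.0), (0.0, 1.0)]:
    for alpha in (0.1, 0.01, 0.001):
        label = f"alpha={alpha}, y~U[{low}, {high}]"
        # a=alpha binds the current value so each run keeps its own step-size
        constant_runs[label] = run_iteration(lambda t, a=alpha: a, low, high)
```

Calling plot_runs(constant_runs, "Constant step-size") draws all six trajectories in one figure. With a constant step-size in (0, 1], each update is a weighted average of the current value and the new sample, so $x_t$ stays inside the range spanned by x0 and the samples.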
[25 Marks] Keep the step-size $\alpha_t = 1/(t+1)$, $\alpha_t = c/(t+c')$ for some $c, c' > 0$, and then
1. $y_t$ is uniform in $[-1, 1]$. Plot $x_t$.
2. $y_t$ is uniform in $[0, 1]$. Plot $x_t$.
For all the above cases, plot $x_t$.
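A minimal sketch for the diminishing step-size case follows. It assumes that t starts at 0 and uses two hypothetical choices of (c, c'), namely (1, 10) and (5, 10); the question leaves these constants open. The function also returns the samples so that $x_t$ can be compared with their running mean.

```python
import random


def run_iteration(step_size, low, high, num_steps=5000, x0=0.0, seed=0):
    """Iterate x_{t+1} = x_t + alpha_t * (y_t - x_t); return the x's and the samples."""
    rng = random.Random(seed)
    x, xs, ys = x0, [x0], []
    for t in range(num_steps):            # assumption: t is counted from 0
        y = rng.uniform(low, high)
        ys.append(y)
        x = x + step_size(t) * (y - x)
        xs.append(x)
    return xs, ys


def harmonic_step(t):
    """alpha_t = 1 / (t + 1)."""
    return 1.0 / (t + 1)


def make_ratio_step(c, c_prime):
    """Return the schedule alpha_t = c / (t + c')."""
    return lambda t: c / (t + c_prime)


# Hypothetical constants: (c, c') = (1, 10) and (5, 10) are illustrative only.
schedules = {
    "1/(t+1)": harmonic_step,
    "1/(t+10)": make_ratio_step(1.0, 10.0),
    "5/(t+10)": make_ratio_step(5.0, 10.0),
}

diminishing_runs = {}
for low, high in [(-1.0, 1.0), (0.0, 1.0)]:
    for name, schedule in schedules.items():
        xs, _ = run_iteration(schedule, low, high)
        diminishing_runs[f"alpha_t={name}, y~U[{low}, {high}]"] = xs
```

With $\alpha_t = 1/(t+1)$ and t counted from 0, the first update sets $x_1 = y_0$ and every later $x_{t+1}$ equals the sample mean of $y_0, \dots, y_t$, which is a useful sanity check on the implementation. Each entry of diminishing_runs is a list that can be passed directly to a plotting routine such as matplotlib's plot.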
Q2) Implement value iteration for grid world using Q-values. This is the same as the second question of the previous lab; however, use a 2-D array, namely the Q-values. [30 Marks]
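The sketch below shows one way value iteration over Q-values could be organised. The previous lab's grid world is not described here, so the 4x4 layout, the terminal goal in the bottom-right cell, the reward of -1 per move, the deterministic transitions and the discount factor 0.9 are all assumptions to be replaced with the actual environment.

```python
# Hypothetical grid world: every constant below is an assumption.
N_ROWS, N_COLS = 4, 4
N_STATES = N_ROWS * N_COLS
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
GOAL = N_STATES - 1                            # bottom-right cell, terminal
STEP_REWARD = -1.0
GAMMA = 0.9


def step(state, action):
    """Deterministic move; a move that would leave the grid keeps the agent in place."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < N_ROWS and 0 <= new_col < N_COLS):
        new_row, new_col = row, col
    return new_row * N_COLS + new_col, STEP_REWARD


def q_value_iteration(tol=1e-8, max_sweeps=1000):
    """Synchronous sweeps of Q(s, a) <- r + gamma * max_a' Q(s', a')."""
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]   # 2-D array: state x action
    for _ in range(max_sweeps):
        new_q = [row[:] for row in q]
        delta = 0.0
        for s in range(N_STATES):
            if s == GOAL:
                continue                                   # terminal Q-values stay 0
            for a in range(len(ACTIONS)):
                s_next, r = step(s, a)
                new_q[s][a] = r + GAMMA * max(q[s_next])
                delta = max(delta, abs(new_q[s][a] - q[s][a]))
        q = new_q
        if delta < tol:                                    # stop once a sweep changes nothing
            break
    return q


q_star = q_value_iteration()
# Greedy policy: index of the best action in each state.
greedy_policy = [max(range(len(ACTIONS)), key=lambda a: q_star[s][a])
                 for s in range(N_STATES)]
```

Under these assumptions the state value can be read off as the row maximum, max(q_star[s]), and the greedy policy is the corresponding argmax.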
Q3) Implement Q-learning for grid world. [20 Marks]
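A possible skeleton for tabular Q-learning is given below. As the grid world itself is not specified here, the 4x4 layout, the start state in the top-left cell, the reward of -1 per move, the discount factor and the values of alpha, epsilon and the episode count are hypothetical. The update has the same form as iteration (1) in Q1, with the Q-value in the role of $x_t$ and the bootstrapped target in the role of $y_t$.

```python
import random

# Hypothetical grid world: every constant below is an assumption.
N_ROWS, N_COLS = 4, 4
N_STATES = N_ROWS * N_COLS
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]   # up, down, left, right
START, GOAL = 0, N_STATES - 1
STEP_REWARD = -1.0
GAMMA = 0.9


def step(state, action):
    """Deterministic move; a move that would leave the grid keeps the agent in place."""
    row, col = divmod(state, N_COLS)
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col
    if not (0 <= new_row < N_ROWS and 0 <= new_col < N_COLS):
        new_row, new_col = row, col
    return new_row * N_COLS + new_col, STEP_REWARD


def q_learning(num_episodes=5000, alpha=0.1, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0] * len(ACTIONS) for _ in range(N_STATES)]
    for _ in range(num_episodes):
        s = START
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))            # explore
            else:
                best = max(q[s])
                # exploit, breaking ties between equally good actions at random
                a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
            s_next, r = step(s, a)
            done = s_next == GOAL
            target = r if done else r + GAMMA * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])          # same form as iteration (1)
            s = s_next
            if done:
                break
    return q


q_learned = q_learning()
```

Comparing max(q_learned[s]) with the result of value iteration on the same environment is one way to check that the learned Q-values are reasonable.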