Description
This Assignment carries a total of 8 marks for CS337 Theory and 14.5 marks for CS335 Lab
-
Kernel Methods
1.1 CS 337: Theoretical Problem
-
Prove that the function K : R
n
R
n
! R de ned as K (x; y) = exp
x yk2
is a valid Kernel.
2 2
You may use the properties proved in class. [Hint – Taylor series
expansion]
(2 marks)
1.2 CS 335: Lab Problems
In this task, you need to complete functions in the les kernel.py, krr.py and kernel logistic.py. The details for each subtask are listed below-
1
-
kernel.py – Complete the functions gaussian kernel and linear kernel matrix. For vectors x and y in Rd, the linear kernel is given by K(x; y) =
-
gaussian kernel is given by K (x; y) = exp
x y
k
2
2 2
(b) kernel logistic.py –
to return the Gram
Pd
i=1 xiyi and the
(2 marks)
-
Complete the functions fit and predict to train a logistic regression using the gaus-sian kernel. Use gradient descent to minimize the dual kernelized objective of logistic
regression as presented in Lecture Lecture 10 . (1.5 marks)
(ii) Complete the function k fold cv to perform k-fold cross validation to get the best . Divide the dataset into k equal parts for this without any randomization. Report the best obtained along with the plot in the report. Ensure that you use the default values
of ; ; max iter as provided in the starter code. (1.5 marks)
-
-
Brie y justify the plot of the variation of error and sigma that you obtained. (1 mark)
-
-
krr.py –
-
-
Complete the functions fit and predict. These need to be done without any for loops. (1.5 marks)
-
(ii) Put the rst two plots obtained depicting variation of t with and in the report.
Interpret the graph produced by various values of and . (1.5 marks)
-
Kernel Design
2.1 CS 337: Theory Questions
Given that K(x; x0 ) is a valid kernel, where x 2 Rm, prove the following –
-
Let g : Rm ! Rm be a function. Then show that K(g(x); g(x0 )) is a valid kernel.(1.5 marks)
-
Let q be a polynomial with non-negative coe cients. Then show that q(K(x; x0 )) is a valid
kernel. (1.5 marks)
2.2 CS 335: Lab Question
In this task, you need to design a kernel to t the given data. Complete my kernel() function in kernel.py such that a good t to the data is obtained in the third gure plotted when running krr.py. Mention your kernel function in the report along with the obtained plot. To get an idea of what type of kernel to use, you may plot variation of the target value of the data with y at a
constant x or vice versa. (1.5 marks)
-
K-means clustering for Image compression
3.1 CS 337: Theory Questions
-
d
2, where for i
2 f
1; 2; : : : ; n
g
; xi
2
Consider an optimal 2-clustering of a data set x1; x2; : : : ; xn; n
R , where d 1. Without loss of generality, assume that for some m 2 f1; 2; : : : ; n
1g, the points
2
x1; x2; : : : ; xm are assigned to the rst cluster, and the points xm+1; xm+2; : : : ; xn are assigned to the second cluster. Assume that the n points are all distinct, and no point is equidistant to both cluster centres.
Show that there exists a hyperplane of the form a x + b = 0, where a 2 Rd; b 2 R, such that all the points in the rst cluster (x1; x2; : : : ; xm) lie on one side of the hyperplane and all the points in the second cluster (xm+1; xm+2; : : : ; xn) lie to the other side of the hyperplane. You should express
a and b in terms of x1; x2; : : : ; xn, n and m. (3 marks)
3.2 CS 335: Lab Question
(i) In kmeans.py, complete the functions init clusters, pred and train (2 marks)
-
For each of the given 3 images, comment on the relation between number of clusters and quality of the generated images, i.e. generate images for k = 2; 5; 10 for all three images and
comment on the resultant image. Attach all 9 images in the report. (1 marks)
(iii) Describe why some generated images look ne with smaller number of clusters while others
need more clusters (1 marks)
3