Kernel SVM, Chow-Liu Algorithm & Game of Classifiers


Instructions: You can choose any programming language as long as you implement the algorithm from scratch. Use this latex file as a template to develop your homework. Submit your homework on time as a single pdf file to Canvas. Please check Piazza for updates about the homework.

  • Kernel SVM [30 pts]

Consider the following kernel function defined over $z, z' \in Z$, where $Z$ is some set:

$$\kappa(z, z') = \begin{cases} 1 & \text{if } z = z', \\ 0 & \text{otherwise.} \end{cases}$$
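For concreteness, here is a minimal sketch of this kernel and the kernel matrix it induces (NumPy; the string inputs are deliberately not vectors, anticipating the comment at the end of this problem):

```python
import numpy as np

def kappa(z1, z2):
    """The kernel above: 1 if the two inputs are equal, 0 otherwise."""
    return 1.0 if z1 == z2 else 0.0

def kernel_matrix(zs):
    """m x m matrix K with K[i, j] = kappa(zs[i], zs[j])."""
    return np.array([[kappa(zi, zj) for zj in zs] for zi in zs])

# The elements of Z need not be vectors; any objects supporting == work.
K = kernel_matrix(["turkey_a", "turkey_b", "turkey_c"])
print(K)  # with pairwise-distinct inputs this is the 3 x 3 identity matrix
```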

  1. (10 pts) Prove that for any positive integer $m$ and any $z_1, \dots, z_m \in Z$, the $m \times m$ kernel matrix $K = [K_{ij}]$ is positive semi-definite, where $K_{ij} = \kappa(z_i, z_j)$ for $i, j = 1, \dots, m$. (Assume that $z_i \neq z_j$ for $i \neq j$.) Hint: an $m \times m$ matrix $K$ is positive semi-definite if $u^\top K u \ge 0$ for all $u \in \mathbb{R}^m$.

  2. (10 pts) Given a training set $(z_1, y_1), \dots, (z_n, y_n)$ with binary labels, the dual SVM problem with the above kernel $\kappa$ will have parameters $a_1, \dots, a_n, b \in \mathbb{R}$. (Again assume that $z_i \neq z_j$ for $i \neq j$.) The predictor for input $z$ takes the form

$$f(z) = \sum_{i=1}^{n} a_i y_i \,\kappa(z_i, z) + b.$$

Recall that the label prediction is $\mathrm{sgn}(f(z))$. Prove that there exist $a_1, \dots, a_n, b$ such that $f$ correctly separates the training set. In other words, $\kappa$ induces a feature space rich enough that any training set can be linearly separated in it.

  3. (10 pts) How does this $f$ predict an input $z$ that is not in the training set?

Comment: One useful property of kernel functions is that the input space $Z$ does not need to be a vector space; in other words, $z$ does not need to be a feature vector. For all we know, $Z$ can be all the turkeys in the world. As long as we can compute $\kappa(z, z')$, kernel SVM works on turkeys.

  • Chow-Liu Algorithm [30 pts]

Suppose we wish to construct a directed graphical model for three features $X$, $Y$, and $Z$ using the Chow-Liu algorithm. We are given data from 100 independent experiments in which each feature is binary and takes value T or F. The table below summarizes the observed counts:

X   Y   Z   Count
T   T   T      36
T   T   F       4
T   F   T       2
T   F   F       8
F   T   T       9
F   T   F       1
F   F   T       8
F   F   F      32


  1. Compute the mutual information $I(X, Y)$ based on the frequencies observed in the data; a computational sketch follows this list. (5 pts)

  2. Compute the mutual information $I(X, Z)$ based on the frequencies observed in the data. (5 pts)

  3. Compute the mutual information $I(Z, Y)$ based on the frequencies observed in the data. (5 pts)

  4. Which undirected edges will be selected by the Chow-Liu algorithm as the maximum spanning tree? (5 pts)

  5. Root your tree at node $X$ and assign directions to the selected edges. (10 pts)
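For reference, a minimal sketch of how these mutual-information values can be computed from the counts above (NumPy; log base 2 is an assumption, and any consistent base works as long as you use it throughout):

```python
import numpy as np

# Joint counts indexed as counts[x, y, z] with index 0 = T and 1 = F,
# copied from the table above.
counts = np.array([[[36, 4], [2, 8]],
                   [[9, 1], [8, 32]]], dtype=float)
joint = counts / counts.sum()           # empirical joint distribution of (X, Y, Z)

def mutual_information(p2):
    """Mutual information of a 2 x 2 joint distribution p2 (log base 2)."""
    px = p2.sum(axis=1, keepdims=True)  # marginal of the first variable
    py = p2.sum(axis=0, keepdims=True)  # marginal of the second variable
    mask = p2 > 0
    return float((p2[mask] * np.log2(p2[mask] / (px @ py)[mask])).sum())

I_XY = mutual_information(joint.sum(axis=2))  # marginalize out Z
I_XZ = mutual_information(joint.sum(axis=1))  # marginalize out Y
I_YZ = mutual_information(joint.sum(axis=0))  # marginalize out X; I(Z, Y) = I(Y, Z)

# Chow-Liu keeps the two largest of these three pairwise weights
# (the maximum spanning tree over X, Y, Z).
```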

  • Game of Classifiers [60 pts]

3.1 Implementation

Implement the following models in the programming language of your choice. Include slack variables in your SVM implementation if needed. You can use the autograd features of PyTorch, TensorFlow, etc., or derive the gradients yourself (but do not use built-in SVM, Kernel SVM, or Logistic Regression models from libraries).

  • Implement Linear SVM (without kernels).

  • Implement Kernel SVM, with options for linear, rbf, and polynomial kernels. Keep the kernel parameters tunable; e.g., do not fix the degree of the polynomial kernel, but keep it as a variable and experiment with different values (one way to do this is sketched after this list). Is Linear SVM a special case of Kernel SVM?

  • Implement Logistic Regression with and without kernels (use the same kernels as above).
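One possible way to keep the kernels tunable is to represent each kernel as a closure over its hyperparameters; a sketch (the parameter names gamma, degree, and coef0 are illustrative):

```python
import numpy as np

def linear_kernel():
    return lambda x, z: x @ z

def rbf_kernel(gamma=1.0):
    # gamma is tunable: larger values give a narrower radial basis function
    return lambda x, z: np.exp(-gamma * np.sum((x - z) ** 2))

def polynomial_kernel(degree=3, coef0=1.0):
    # the degree stays a variable rather than being hard-coded
    return lambda x, z: (x @ z + coef0) ** degree
```

Running your Kernel SVM with linear_kernel() gives one direct way to probe the question in the second bullet.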

3.2 Synthetic Dataset-1 (20 pts)

Generate a 2-D dataset as follows: let $\mu = 2.5$ and let $I_2$ be the $2 \times 2$ identity matrix. Generate points for the positive and negative classes from $N([\mu, 0], I_2)$ and $N([-\mu, 0], I_2)$, respectively. For each class, generate 750 points (1500 in total). Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively.
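A minimal sketch of this generation and splitting step, assuming NumPy (the seed is arbitrary):

```python
import numpy as np

def make_dataset(mu, n_per_class=750, seed=0):
    """Two 2-D Gaussian blobs at [+mu, 0] and [-mu, 0] with identity covariance,
    randomly split 1000/250/250 into train/validation/test."""
    rng = np.random.default_rng(seed)
    pos = rng.multivariate_normal([mu, 0.0], np.eye(2), size=n_per_class)
    neg = rng.multivariate_normal([-mu, 0.0], np.eye(2), size=n_per_class)
    X = np.vstack([pos, neg])
    y = np.concatenate([np.ones(n_per_class), -np.ones(n_per_class)])
    perm = rng.permutation(len(X))
    tr, va, te = perm[:1000], perm[1000:1250], perm[1250:]
    return X[tr], y[tr], X[va], y[va], X[te], y[te]

X_tr, y_tr, X_va, y_va, X_te, y_te = make_dataset(mu=2.5)
```

Do the following with this dataset: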

  1. (5 pts) Train your Linear SVM and Logistic Regression models and report decision boundaries and test accuracies.

  2. (5 pts) Show the decision boundaries with k-NN and Naive Bayes classifiers. (You can use library implementations or implement from scratch. Figure out the hyper-parameters using the validation set.)

  3. (5 pts) Repeat the process, varying µ from 1.0 to 2.4 with a step size of 0.2; for each value of µ, obtain the test accuracies of the models and plot them (µ on the x-axis, test accuracy on the y-axis). (You will have one curve for each of the four classifiers mentioned above; a sweep sketch follows this list.)

  4. (5 pts) What are your conclusions from this exercise?
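The sweep in item 3 could reuse the make_dataset helper sketched above; build_models below is a hypothetical placeholder for constructing your four classifiers:

```python
import numpy as np

mus = np.arange(1.0, 2.4 + 1e-9, 0.2)   # 1.0, 1.2, ..., 2.4
names = ["Linear SVM", "Logistic Regression", "k-NN", "Naive Bayes"]
accuracies = {name: [] for name in names}

for mu in mus:
    X_tr, y_tr, X_va, y_va, X_te, y_te = make_dataset(mu)
    # build_models() is a placeholder returning your four classifiers, each with
    # fit/score-style methods and hyperparameters tuned on the validation set.
    for name, model in zip(names, build_models()):
        model.fit(X_tr, y_tr)
        accuracies[name].append(model.score(X_te, y_te))

# plot: mu on the x-axis, accuracies[name] on the y-axis, one curve per classifier
```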

3.3 Synthetic Dataset-2 (20 pts)

Generate 1500 data points from the 2-D circles dataset of sklearn:

sklearn.datasets.make_circles

Randomly create train, validation, and test splits of 1000, 250, and 250 points, respectively. Evaluate the above classifiers in this setting.
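One way to generate and split the data (the noise and factor values here are illustrative; the assignment does not fix them):

```python
import numpy as np
from sklearn.datasets import make_circles

X, y = make_circles(n_samples=1500, noise=0.05, factor=0.5, random_state=0)

# random 1000/250/250 train/validation/test split
rng = np.random.default_rng(0)
perm = rng.permutation(len(X))
train, val, test = perm[:1000], perm[1000:1250], perm[1250:]
X_tr, y_tr = X[train], y[train]
X_va, y_va = X[val], y[val]
X_te, y_te = X[test], y[test]
```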

  1. (5 pts) Show decision boundaries for the Linear SVM and Logistic Regression classifiers. (A grid-plotting sketch follows this list.)

  2. (5 pts) Show decision boundaries for Kernel SVM and Kernel Logistic Regression (use rbf and polynomial kernels). Try different values of the hyperparameters and report results with whichever works best.

  3. (5 pts) Train the Neural Network from HW4 and k-NN classifiers on this dataset and show decision boundaries. (You can use library implementations for these classifiers.)

  4. (5 pts) What are your conclusions from this exercise?
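For the decision-boundary plots requested in this and the previous section, a common approach is to evaluate the trained classifier over a dense grid and shade the predictions; a sketch (matplotlib; predict stands in for whatever prediction interface your models expose):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_decision_boundary(predict, X, y, steps=300):
    """Shade the predicted label over a grid covering the data, then overlay the points."""
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.linspace(x_min, x_max, steps),
                         np.linspace(y_min, y_max, steps))
    grid = np.c_[xx.ravel(), yy.ravel()]
    zz = predict(grid).reshape(xx.shape)
    plt.contourf(xx, yy, zz, alpha=0.3)   # predicted regions
    plt.scatter(X[:, 0], X[:, 1], c=y, s=10)  # data points
    plt.show()
```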


3.4 Evaluation on Real Dataset (20 pts)

Let’s put all this to some real use. For this problem, use the Wisconsin Breast Cancer dataset. You can load it from the sklearn library:

sklearn.datasets.load_breast_cancer
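A loading-and-normalization sketch (scikit-learn; the 80/20 split here is illustrative, since the 1000/250/250 split of Section 3.3 does not apply to the 569 available samples):

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
X, y = data.data, data.target            # 569 samples, 30 features

rng = np.random.default_rng(0)
perm = rng.permutation(len(X))
n_train = int(0.8 * len(X))              # illustrative 80/20 split
train, test = perm[:n_train], perm[n_train:]

# fit the scaler on training data only, then apply it to both splits
scaler = StandardScaler().fit(X[train])
X_tr, X_te = scaler.transform(X[train]), scaler.transform(X[test])
y_tr, y_te = y[train], y[test]
```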

  1. (10 pts) Do all the points of Section 3.3 on this dataset. Since the data are high-dimensional, you do not have to show decision boundaries. Report test accuracies for these classifiers and discuss your findings.

  2. (10 pts) In addition, you also want to figure out the important features that determine the class. Which regularization will you use for this? Upgrade your SVM and Kernel SVM implementations to include this regularization. Discuss the important features you obtain by running your regularized SVM on this dataset. (You might need to normalize the dataset before training any classifier; a sketch follows this list.)
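One regularizer commonly used to obtain sparse, interpretable weights is the L1 (lasso) penalty. Under that assumption, a minimal subgradient-descent sketch of an L1-regularized linear SVM might look as follows; lam, lr, and epochs are illustrative values to tune on the validation set:

```python
import numpy as np

def train_l1_svm(X, y, lam=0.01, lr=1e-3, epochs=200):
    """Linear SVM trained by subgradient descent on hinge loss plus an L1 penalty.

    y must be in {-1, +1}. lam controls the sparsity of w; both lam and lr
    are illustrative values to tune on the validation set.
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        mask = margins < 1                    # points violating the margin
        grad_w = -(y[mask, None] * X[mask]).sum(axis=0) / n + lam * np.sign(w)
        grad_b = -y[mask].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# after training, the features with the largest |w_j| are candidates
# for the most informative ones
```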

