Problem Set 5: Boosting and Unsupervised Learning


Submission instructions

Submit your solutions electronically on the course Gradescope site as PDF files.

If you plan to typeset your solutions, please use the LaTeX solution template. If you must submit scanned handwritten solutions, please use a black pen on blank white paper and a high-quality scanner app.

  • AdaBoost [5 pts]

In the lecture on ensemble methods, we said that in iteration $t$, AdaBoost picks $(h_t, \beta_t)$ to minimize the objective:

\[
(h_t(x), \beta_t) = \arg\min_{(h_t(x),\, \beta_t)} \sum_n w_t(n)\, e^{-y_n \beta_t h_t(x_n)}
\]
\[
= \arg\min_{(h_t(x),\, \beta_t)} \left( e^{\beta_t} - e^{-\beta_t} \right) \sum_n w_t(n)\, \mathbb{I}[y_n \neq h_t(x_n)] + e^{-\beta_t} \sum_n w_t(n)
\]

We define the weighted misclassification error at time $t$ to be $\epsilon_t = \sum_n w_t(n)\, \mathbb{I}[y_n \neq h_t(x_n)]$. Also, the weights are normalized so that $\sum_n w_t(n) = 1$.
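Before taking derivatives, it can help to check the algebraic rewriting above numerically. Below is a minimal sketch (not part of the assignment) that verifies, for random labels, predictions, and normalized weights, that the exponential-loss objective equals $(e^{\beta_t} - e^{-\beta_t})\,\epsilon_t + e^{-\beta_t}$; all variable names are illustrative.

```python
import numpy as np

# Minimal numerical check of the identity above; names are illustrative.
rng = np.random.default_rng(0)
N = 8
y = rng.choice([-1, 1], size=N)    # labels y_n
h = rng.choice([-1, 1], size=N)    # base-classifier predictions h_t(x_n)
w = rng.random(N)
w /= w.sum()                       # normalized weights, sum_n w_t(n) = 1
beta = 0.7                         # an arbitrary beta_t

lhs = np.sum(w * np.exp(-y * beta * h))   # original exponential-loss objective
eps = np.sum(w * (y != h))                # weighted misclassification error eps_t
rhs = (np.exp(beta) - np.exp(-beta)) * eps + np.exp(-beta)
assert np.isclose(lhs, rhs)               # the two forms of the objective agree
```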

1. Take the derivative of the above objective function with respect to $\beta_t$ and set it to zero to solve for $\beta_t$ and obtain the update for $\beta_t$.

2. Suppose the training set is linearly separable, and we use a hard-margin linear support vector machine (no slack) as the base classifier. In the first boosting iteration, what would the resulting $\beta_1$ be?

• K-means for single-dimensional data [5 pts]

In this problem, we will work through K-means for single-dimensional data.

1. Consider the case where $K = 3$ and we have 4 data points $x_1 = 1,\ x_2 = 2,\ x_3 = 5,\ x_4 = 7$. What is the optimal clustering for this data? What is the corresponding value of the objective?

Parts of this assignment are adapted from course material by Jenna Wiens (UMich) and Tommi Jaakkola (MIT).


2. One might be tempted to think that Lloyd's algorithm is guaranteed to converge to the global minimum when $d = 1$. Show that there exists a suboptimal cluster assignment (i.e., initialization) for the data in the above part that Lloyd's algorithm will not be able to improve; the sketch after this part gives one way to experiment. (To get full credit, you need to show the assignment, show why it is suboptimal, and explain why it will not be improved.)
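To experiment with different initializations, one can run Lloyd's algorithm directly on the four points. Below is a minimal sketch, assuming the usual squared-distance objective; the function name `lloyd_1d` and the example initialization are illustrative, not prescribed by the assignment.

```python
import numpy as np

def lloyd_1d(x, centers, iters=20):
    """Plain Lloyd's algorithm for 1-D data with squared-distance objective."""
    x = np.asarray(x, dtype=float)
    centers = np.asarray(centers, dtype=float)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        assign = np.argmin(np.abs(x[:, None] - centers[None, :]), axis=1)
        # Update step: each center moves to the mean of its assigned points.
        for k in range(len(centers)):
            if np.any(assign == k):
                centers[k] = x[assign == k].mean()
    obj = np.sum((x - centers[assign]) ** 2)
    return assign, centers, obj

x = [1, 2, 5, 7]
print(lloyd_1d(x, centers=[1.0, 2.0, 5.0]))  # try other initializations too
```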


  • Gaussian Mixture Models [8 pts]

We would like to cluster data $\{x_1, \dots, x_N\}$, $x_n \in \mathbb{R}^d$, using a Gaussian Mixture Model (GMM) with $K$ mixture components. To do this, we need to estimate the parameters of the GMM, i.e., we need to set the values $\theta = \{\omega_k, \mu_k, \Sigma_k\}_{k=1}^{K}$, where $\omega_k$ is the mixture weight associated with mixture component $k$, and $\mu_k$ and $\Sigma_k$ denote the mean and the covariance matrix of the Gaussian distribution associated with mixture component $k$.

If we knew which cluster each sample $x_n$ belongs to (i.e., we had complete data), we showed in the lecture on Clustering that the log-likelihood $\ell$ is given below, and we can compute the maximum likelihood estimate (MLE) of all the parameters.

\[
\ell(\theta) = \sum_n \log p(x_n, z_n) = \sum_n \sum_k \gamma_{nk} \log \omega_k + \sum_n \sum_k \gamma_{nk} \log \mathcal{N}(x_n \mid \mu_k, \Sigma_k) \tag{1}
\]

Since we do not have complete data, we use the EM algorithm. The EM algorithm works by iterating between setting each $\gamma_{nk}$ to the posterior probability $p(z_n = k \mid x_n)$ (step 1 on slide 26 of the lecture on Clustering) and then using $\gamma_{nk}$ to find the value of $\theta$ that maximizes $\ell$ (step 2 on slide 26). We will now derive updates for one of the parameters, i.e., $\mu_j$ (the mean parameter associated with mixture component $j$).
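As a concrete reference for step 1, below is a minimal sketch of the responsibility computation for a one-dimensional GMM; the helper name `e_step` is illustrative, and the computation is just Bayes' rule, $\gamma_{nk} \propto \omega_k\, \mathcal{N}(x_n \mid \mu_k, \sigma_k^2)$.

```python
import numpy as np
from scipy.stats import norm

def e_step(x, omega, mu, sigma):
    """Responsibilities gamma_nk = p(z_n = k | x_n) for a 1-D GMM.

    x: (N,) data; omega, mu, sigma: (K,) mixture weights, means, std devs.
    """
    # omega_k * N(x_n | mu_k, sigma_k^2) for every (n, k) pair
    unnorm = omega[None, :] * norm.pdf(x[:, None], loc=mu[None, :],
                                       scale=sigma[None, :])
    # Normalize over components so each row sums to one (Bayes' rule).
    return unnorm / unnorm.sum(axis=1, keepdims=True)

# Example (illustrative parameter values):
# e_step(np.array([5., 15.]), np.array([.5, .5]), np.array([10., 30.]), np.array([1., 1.]))
```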

(a) To maximize $\ell$, compute $\nabla_{\mu_j} \ell(\theta)$: the gradient of $\ell(\theta)$ with respect to $\mu_j$.

(b) Set the gradient to zero and solve for $\mu_j$ to show that $\mu_j = \frac{\sum_n \gamma_{nj} x_n}{\sum_n \gamma_{nj}}$.

(c) Suppose that we are fitting a GMM to data using $K = 2$ components. We have $N = 5$ samples in our training data, with $x_n,\ n \in \{1, \dots, N\}$, equal to $\{5, 15, 25, 30, 40\}$.

We use the EM algorithm to find the maximum likelihood estimates for the model parameters, which are the mixing weights for the two components, $\omega_1$ and $\omega_2$, and the means for the two components, $\mu_1$ and $\mu_2$. The standard deviations for the two components are fixed at 1. Suppose that at the end of step 1 of iteration 5 of the EM algorithm, the soft assignments $\gamma_{nk}$ for the five data items are as shown in Table 1.

        k = 1   k = 2
n = 1    0.2     0.8
n = 2    0.2     0.8
n = 3    0.8     0.2
n = 4    0.9     0.1
n = 5    0.9     0.1

Table 1: Entry in row $n$ and column $k$ of the table corresponds to $\gamma_{nk}$.

What are the updated values of the parameters $\omega_1$, $\omega_2$, $\mu_1$, and $\mu_2$ at the end of step 2 of the EM algorithm?
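As a sanity check for this part, below is a minimal sketch of the standard GMM M-step (mixing weights and means) applied to Table 1; it assumes the usual updates $\omega_k = \frac{1}{N}\sum_n \gamma_{nk}$ and $\mu_k = \frac{\sum_n \gamma_{nk} x_n}{\sum_n \gamma_{nk}}$, so use it only to verify a hand computation.

```python
import numpy as np

x = np.array([5.0, 15.0, 25.0, 30.0, 40.0])  # the N = 5 training samples
gamma = np.array([[0.2, 0.8],                # soft assignments from Table 1,
                  [0.2, 0.8],                # row n, column k
                  [0.8, 0.2],
                  [0.9, 0.1],
                  [0.9, 0.1]])

Nk = gamma.sum(axis=0)                       # effective count per component
omega = Nk / len(x)                          # mixing weights omega_k
mu = (gamma * x[:, None]).sum(axis=0) / Nk   # means mu_k
print(omega, mu)
```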
