MDS5210 · Homework 1


Problem 1 (30pts). Fundamental Knowledge

  1. Clearly state the difference between supervised learning and unsupervised learning.

  2. Explain the usage of training set, validation set, and test set in a learning task. Also explain why we need a validation set.

  3. Suppose we have a dataset of people in which we record their heights $h_i$, as well as the lengths of their left arms $l_i$ and right arms $r_i$. Suppose $h_i \sim \mathcal{N}(10, 2)$ (in unspecified units), $l_i \sim \mathcal{N}(\rho_i h_i, 0.02)$, and $r_i \sim \mathcal{N}(\rho_i h_i, \sigma^2)$, with $\rho_i \sim \mathrm{Unif}(0.6, 0.7)$. Is using both arms necessarily a better choice than using only one arm to approximate $h_i$? What if $\sigma^2 = 0.02$? Explain the intuition.

  4. Let $X \in \mathbb{R}^{n \times d}$ be a full column rank matrix. Explain why $X^\top X$ is positive definite using the SVD. (Hint: the matrices of singular vectors are orthonormal.)

  5. Consider the problem

$$\min_{\theta} \ \|X\theta - y\|_2^2 + \lambda \|\theta\|_2^2.$$

Suppose $X$ has full column rank; write down its optimal solution $\theta^\star$.
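For orientation, a minimal sketch of the usual first-order-optimality route for problem 5, under the reconstruction of the objective above (a reference point, not necessarily the intended derivation):

```latex
% Sketch: setting the gradient of the ridge objective to zero
\nabla_\theta \left( \|X\theta - y\|_2^2 + \lambda \|\theta\|_2^2 \right)
  = 2X^\top (X\theta - y) + 2\lambda\theta = 0
\;\Longrightarrow\;
(X^\top X + \lambda I)\,\theta^\star = X^\top y .
```

For $\lambda > 0$ the matrix $X^\top X + \lambda I$ is positive definite, hence invertible, so $\theta^\star$ is unique.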

Problem 2 (30pts). Least Squares without Full Column Rank. Consider the problem

$$\min_{\theta} \ \|X\theta - y\|_2^2,$$

where $X \in \mathbb{R}^{n \times d}$, $\theta \in \mathbb{R}^d$, and $y \in \mathbb{R}^n$.

(1) Given

$$X = \begin{bmatrix} 1 & 0 \end{bmatrix}, \qquad y = 1,$$

plot the objective function using Python.
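A minimal plotting sketch, assuming the reconstruction $X = [1\ 0]$, $y = 1$ above; with this data the objective reduces to $(\theta_1 - 1)^2$, a valley that is flat along $\theta_2$:

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed reconstruction of the problem data: X is 1x2, y is a scalar.
X = np.array([[1.0, 0.0]])
y = np.array([1.0])

# Evaluate the objective ||X @ theta - y||^2 on a grid over (theta_1, theta_2).
t1, t2 = np.meshgrid(np.linspace(-2, 4, 200), np.linspace(-3, 3, 200))
obj = (X[0, 0] * t1 + X[0, 1] * t2 - y[0]) ** 2

fig = plt.figure()
ax = fig.add_subplot(projection="3d")
ax.plot_surface(t1, t2, obj, cmap="viridis")
ax.set_xlabel(r"$\theta_1$")
ax.set_ylabel(r"$\theta_2$")
ax.set_zlabel("objective")
plt.show()
```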

(2) The thin SVD of $X$ is given by

$$X = V \begin{bmatrix} \Sigma_1 & 0 \end{bmatrix} \begin{bmatrix} U_1^\top \\ U_2^\top \end{bmatrix} = V \Sigma_1 U_1^\top.$$

Show that when $n < d$, the optimal solutions are non-unique. Derive the expression of the optimal solutions using the thin SVD. (Hint: Let $A := V\Sigma_1$ and $z := U_1^\top \theta$. Solve $\min_z \|Az - y\|_2^2$ first, then solve $U_1^\top \theta = z$.)
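As an illustrative numerical check (a sketch, reusing the assumed $X = [1\ 0]$, $y = 1$ from part (1)): since $XU_2 = 0$, adding any combination of the columns of $U_2$ to a solution leaves the objective unchanged.

```python
import numpy as np

# Assumed data from part (1): n = 1 < d = 2.
X = np.array([[1.0, 0.0]])
y = np.array([1.0])

# Thin SVD in the document's notation X = V @ Sigma1 @ U1.T
# (numpy returns X = u @ np.diag(s) @ vt, so V = u, Sigma1 = diag(s), U1 = vt.T).
u, s, vt = np.linalg.svd(X, full_matrices=True)
V, Sigma1, U1 = u, np.diag(s), vt[: len(s)].T
U2 = vt[len(s):].T  # basis of the null space of X

theta_min_norm = U1 @ np.linalg.solve(Sigma1, V.T @ y)  # minimum-norm solution
for w in [0.0, 1.0, -2.5]:
    theta = theta_min_norm + U2 @ np.array([w])
    print(w, np.sum((X @ theta - y) ** 2))  # same objective value for every w
```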

Problem 3 (50pts). A Robust LP Formulation. Suppose we have the generative linear regression model

$$y = X\theta + \epsilon,$$

where $\epsilon$ is the error term and $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. The maximum likelihood estimator for $\theta$ is

$$\hat{\theta}_{LS} = \operatorname*{argmin}_{\theta \in \mathbb{R}^d} \ \|X\theta - y\|_2^2 = (X^\top X)^{-1} X^\top y.$$

(a) Suppose the error term $\epsilon = [\varepsilon_1, \varepsilon_2, \cdots, \varepsilon_n]^\top$ follows the Laplace distribution, i.e., $\varepsilon_i \overset{\text{i.i.d.}}{\sim} \mathcal{L}(0, b)$, $i = 1, 2, \cdots, n$, and the probability density function is

$$P(\varepsilon_i) = \frac{1}{2b} e^{-\frac{|\varepsilon_i|}{b}}$$

for some $b > 0$. Under the MLE principle, what is the learning problem? Please write out the derivation process. (15 points)

Figure 1: PDF of Laplace distribution
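A sketch of the standard route, writing $x_i^\top$ for the $i$-th row of $X$ (an illustrative outline for reference, not necessarily the expected write-up): the negative log-likelihood turns the product of Laplace densities into an $\ell_1$ objective.

```latex
% Sketch: MLE under i.i.d. Laplace noise reduces to L1 minimization
\hat{\theta} = \operatorname*{argmax}_{\theta}
  \prod_{i=1}^{n} \frac{1}{2b}
  \exp\!\left( -\frac{|y_i - x_i^\top \theta|}{b} \right)
= \operatorname*{argmin}_{\theta}
  \sum_{i=1}^{n} |y_i - x_i^\top \theta|
= \operatorname*{argmin}_{\theta} \|X\theta - y\|_1 .
```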

(b) Huber smoothing. The $L_1$-norm minimization

$$\hat{\theta}_{L_1} = \operatorname*{argmin}_{\theta} \ \|X\theta - y\|_1$$

is one possible approach to robust regression. However, it is non-differentiable. We therefore use a smoothing technique to approximately solve the $L_1$-norm minimization; the Huber function is one possibility. Its definition and a sketch are shown below.

$$h_\mu(z) = \begin{cases} \dfrac{z^2}{2\mu}, & |z| \le \mu, \\[4pt] |z| - \dfrac{\mu}{2}, & |z| \ge \mu. \end{cases}$$

Then, for $z \in \mathbb{R}^n$,

$$H_\mu(z) = \sum_{j=1}^{n} h_\mu(z_j).$$

By using Huber smoothing, the $L_1$-norm minimization can be approximated by

$$\min_{\theta} \ H_\mu(X\theta - y).$$

Let

$$f(\theta) = H_\mu(X\theta - y);$$

find the gradient $\nabla f(\theta)$. (10 points)


Figure 2: Huber smoothing
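A minimal sketch for part (b), assuming the piecewise definition of $h_\mu$ reconstructed above (helper names are illustrative): it implements $H_\mu$ and the chain-rule gradient $\nabla f(\theta) = X^\top h_\mu'(X\theta - y)$, with $h_\mu'$ applied elementwise, and checks it against finite differences.

```python
import numpy as np

def huber(z, mu):
    """Elementwise Huber function h_mu: quadratic near 0, linear in the tails."""
    return np.where(np.abs(z) <= mu, z**2 / (2 * mu), np.abs(z) - mu / 2)

def huber_grad(z, mu):
    """Elementwise derivative h'_mu(z): z/mu inside [-mu, mu], sign(z) outside."""
    return np.where(np.abs(z) <= mu, z / mu, np.sign(z))

def f(theta, X, y, mu):
    return np.sum(huber(X @ theta - y, mu))

def grad_f(theta, X, y, mu):
    # Chain rule: grad f = X^T h'_mu(X theta - y).
    return X.T @ huber_grad(X @ theta - y, mu)

# Finite-difference sanity check on random data.
rng = np.random.default_rng(0)
X, y = rng.standard_normal((20, 5)), rng.standard_normal(20)
theta, mu, eps = rng.standard_normal(5), 0.1, 1e-6
g = grad_f(theta, X, y, mu)
for j in range(5):
    e = np.zeros(5); e[j] = eps
    fd = (f(theta + e, X, y, mu) - f(theta - e, X, y, mu)) / (2 * eps)
    print(j, g[j], fd)  # the two columns should agree to ~1e-6
```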

(c) Gradient descent for minimizing $f(\theta)$. The gradient descent procedure is as follows.

    1. Input: observed data $X, y$, initialization $\theta_0$, Huber smoothing parameter $\mu$, total iteration number $T$, learning rate $\alpha$.
    2. for $k = 0, 1, \cdots, T-1$ do
    3.   $\theta_{k+1} = \theta_k - \alpha \nabla f(\theta_k)$
    4. end for
    5. return $\theta_T$

The data set is generated by the linear model

$$y = X\theta + \epsilon_1 + \epsilon_2,$$

where $\epsilon_1 \in \mathbb{R}^n$ follows a Gaussian distribution and $\epsilon_2$ contains outliers. Given the observed data $(x, y) = \{(x_1, y_1), (x_2, y_2), \cdots, (x_n, y_n)\}$ and the true value $\theta$:

(1) Calculate the estimate $\hat{\theta}_{LS}$ using linear least squares and compute $\|\hat{\theta}_{LS} - \theta\|_2$. (5 points)
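A minimal sketch for part (1), assuming the Blackboard data have been saved as NumPy arrays (the file names below are placeholders, not specified by the assignment):

```python
import numpy as np

# Placeholder file names; the assignment's data come from Blackboard.
X = np.load("X.npy")               # shape (n, d)
y = np.load("y.npy")               # shape (n,)
theta_true = np.load("theta.npy")  # shape (d,)

# Least squares via lstsq (more stable than forming (X^T X)^{-1} explicitly).
theta_ls, *_ = np.linalg.lstsq(X, y, rcond=None)
print("||theta_LS - theta||_2 =", np.linalg.norm(theta_ls - theta_true))
```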

(2) Suppose $n = 1000$, $d = 50$. Use Python to implement the gradient descent algorithm to minimize $f(\theta)$; the parameters are set as $\mu = 10^{-5}$, $\alpha = 0.001$, $T = 1000$. Plot the error $\|\theta_k - \theta\|_2$ as a function of the iteration number. You can download the data $\{y, X, \theta\}$ from Blackboard. (20 points)
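A sketch of the experiment in part (2), reusing the Huber gradient from the sketch after Figure 2; the file names and the zero initialization are assumptions, not prescribed by the assignment:

```python
import numpy as np
import matplotlib.pyplot as plt

# Placeholder file names; download the actual data from Blackboard.
X, y, theta_true = np.load("X.npy"), np.load("y.npy"), np.load("theta.npy")

def huber_grad(z, mu):
    # Elementwise derivative of the Huber function h_mu.
    return np.where(np.abs(z) <= mu, z / mu, np.sign(z))

mu, alpha, T = 1e-5, 0.001, 1000
theta = np.zeros(X.shape[1])  # assumed initialization theta_0 = 0
errors = []
for k in range(T):
    theta = theta - alpha * (X.T @ huber_grad(X @ theta - y, mu))
    errors.append(np.linalg.norm(theta - theta_true))

plt.plot(errors)
plt.xlabel("iteration $k$")
plt.ylabel(r"$\|\theta_k - \theta\|_2$")
plt.show()
```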
