Homework 7: Convex programs

1. Moving averages. There are many ways to model the relationship between an input sequence $\{u_1, u_2, \dots\}$ and an output sequence $\{y_1, y_2, \dots\}$. In class, we saw the moving average (MA) model, where each output is approximated by a linear combination of the $k$ most recent inputs:

MA: $y_t \approx b_1 u_t + b_2 u_{t-1} + \cdots + b_k u_{t-k+1}$

We then used least squares to find the coefficients $b_1, \dots, b_k$. What if we didn’t have access to the inputs at all, and we were asked to predict future $y$ values based only on the previous $y$ values? One way to do this is by using an autoregressive (AR) model, where each output is approximated by a linear combination of the $\ell$ most recent outputs (excluding the present one):

AR: $y_t \approx a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_\ell y_{t-\ell}$

Of course, if the inputs contain pertinent information, we shouldn’t expect the AR method to outperform the MA method!

Using the same dataset from class, uy_data.csv, plot the true $y$ and, on the same axes, the estimate $\hat{y}$ from the MA model and the estimate $\hat{y}$ from the AR model. Use k = 5 for both models. To quantify the difference between the estimates, also compute $\|y - \hat{y}\|$ for both cases.
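The two fits above can be sketched as follows. This is a minimal illustration in Python with NumPy rather than the Julia used in the course; `np.linalg.lstsq` stands in for the backslash solve. The synthetic `u` and `y`, the helper name `lagged_matrix`, and the choice to treat samples before the start of the record as zero are all assumptions made for the example. The actual exercise uses uy_data.csv, and plotting is left out.

```python
import numpy as np

def lagged_matrix(seq, width, offset):
    """Build a regression matrix of lagged samples.

    Row t holds seq[t-offset], seq[t-offset-1], ..., seq[t-offset-width+1].
    Samples before the start of the record are taken to be zero (assumption).
    """
    T = len(seq)
    A = np.zeros((T, width))
    for j in range(width):
        lag = offset + j
        A[lag:, j] = seq[:T - lag]
    return A

# Synthetic stand-in for uy_data.csv (assumption): y is driven by u.
rng = np.random.default_rng(0)
T = 200
u = rng.standard_normal(T)
b_true = np.array([1.0, 0.5, -0.3, 0.2, 0.1])
y = lagged_matrix(u, 5, 0) @ b_true + 0.05 * rng.standard_normal(T)

k = 5

# MA model: regress y_t on u_t, u_{t-1}, ..., u_{t-k+1} (offset 0).
A_ma = lagged_matrix(u, k, 0)
b_hat, *_ = np.linalg.lstsq(A_ma, y, rcond=None)
y_ma = A_ma @ b_hat

# AR model: regress y_t on y_{t-1}, ..., y_{t-k} (offset 1 skips the present).
A_ar = lagged_matrix(y, k, 1)
a_hat, *_ = np.linalg.lstsq(A_ar, y, rcond=None)
y_ar = A_ar @ a_hat

err_ma = np.linalg.norm(y - y_ma)
err_ar = np.linalg.norm(y - y_ar)
print("MA error:", err_ma)
print("AR error:", err_ar)
```

The only difference between the two models is which sequence fills the regression matrix and whether the current sample is included, which is why a single helper with an `offset` argument covers both.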

Yet another possible modeling choice is to combine both AR and MA. Unsurprisingly, this is called the autoregressive moving average (ARMA) model:

ARMA: $y_t \approx a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_\ell y_{t-\ell} + b_1 u_t + b_2 u_{t-1} + \cdots + b_k u_{t-k+1}$

Solve the problem once more, this time using an ARMA model with $k = \ell = 1$. Plot $y$ and $\hat{y}$ as before, and also compute the error $\|y - \hat{y}\|$.
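With $k = \ell = 1$ the ARMA model reduces to $y_t \approx a_1 y_{t-1} + b_1 u_t$, so the regression matrix has just two columns. The sketch below is in Python with NumPy instead of the course's Julia, and it runs on synthetic data generated under assumed values `a_true` and `b_true`. Those values, the noise level, and the convention that $y_0$ has a zero predecessor are illustrative assumptions, not part of the assignment.

```python
import numpy as np

# Synthetic stand-in for the class dataset (assumption).
rng = np.random.default_rng(1)
T = 300
u = rng.standard_normal(T)
a_true, b_true = 0.8, 1.5
y = np.zeros(T)
for t in range(T):
    prev = y[t - 1] if t > 0 else 0.0  # no output before the record starts
    y[t] = a_true * prev + b_true * u[t] + 0.01 * rng.standard_normal()

# Column 1: previous output y_{t-1}; column 2: current input u_t.
y_prev = np.concatenate(([0.0], y[:-1]))
A = np.column_stack([y_prev, u])

# Least-squares solve, the NumPy counterpart of the backslash operator.
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
a_hat, b_hat = coef
y_hat = A @ coef
err = np.linalg.norm(y - y_hat)
print("a1 =", a_hat, "b1 =", b_hat, "error =", err)
```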

Note: For the problems in this question you don’t need to use optimization codes; you can just use the “backslash” (\) notation for solving linear least squares.

2. The Huber loss. In statistics, we frequently encounter data sets containing outliers, which are bad data points arising from experimental error or abnormally high noise. Consider for example the following data set consisting of 15 pairs $(x, y)$.

x     1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
y  6.31  3.78    24  1.71  2.99  4.53  2.11  3.88  4.67  4.25  2.06    23  1.58  2.17  0.02

The y values corresponding to x = 3 and x = 12 are outliers because they are far outside the expected range of values for the experiment.

a) Compute the best linear fit to the data using an $\ell_2$ cost (least squares). In other words, we are looking for the $a$ and $b$ that minimize the expression:

$\ell_2$ cost: $\sum_{i=1}^{15} (y_i - a x_i - b)^2$

Repeat the linear fit computation, but this time exclude the outliers from your data set. On a single plot, show the data points and both linear fits. Explain the difference between the two fits.
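A minimal sketch of the two least-squares fits, in Python with NumPy rather than the course's Julia. The data are copied from the table as printed; the helper name `fit_line` is an assumption for illustration, and the plot is omitted.

```python
import numpy as np

# Data from the table above, as printed.
x = np.arange(1, 16, dtype=float)
y = np.array([6.31, 3.78, 24, 1.71, 2.99, 4.53, 2.11, 3.88,
              4.67, 4.25, 2.06, 23, 1.58, 2.17, 0.02])

def fit_line(xs, ys):
    """Least-squares slope a and intercept b for ys ~ a*xs + b."""
    A = np.column_stack([xs, np.ones_like(xs)])
    (a, b), *_ = np.linalg.lstsq(A, ys, rcond=None)
    return a, b

# Fit on all 15 points.
a_all, b_all = fit_line(x, y)

# Fit with the outliers at x = 3 and x = 12 removed.
keep = ~np.isin(x, [3, 12])
a_clean, b_clean = fit_line(x[keep], y[keep])

print("all points:       a =", a_all, " b =", b_all)
print("outliers removed: a =", a_clean, " b =", b_clean)
```

Because the cost squares each residual, the two large $y$ values contribute disproportionately to the sum and pull the first line toward themselves.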

CS/ECE/ISyE 524 Introduction to Optimization Steve Wright, Spring 2021

It’s not always practical to remove outliers from the data manually, so we’ll investigate ways of automatically dealing with outliers by changing our cost function. Find the best linear fit again (including the outliers), but this time use the $\ell_1$ cost function:

$\ell_1$ cost: $\sum_{i=1}^{15} |y_i - a x_i - b|$

Include a plot containing the data and the best $\ell_1$ linear fit. Does the $\ell_1$ cost handle outliers better or worse than least squares? Explain why.
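In a course setting the $\ell_1$ fit would normally be posed as a linear program, with one auxiliary variable bounding each absolute residual. The sketch below deliberately swaps in a different technique that needs no solver. For a two-parameter line, some optimal $\ell_1$ fit passes through at least two data points, so it is enough to try every pair of points and keep the cheapest line. It is written in Python with NumPy rather than Julia, and the names `l1_cost` and `best` are illustrative assumptions.

```python
import itertools
import numpy as np

# Data from the table above, as printed (outliers included).
x = np.arange(1, 16, dtype=float)
y = np.array([6.31, 3.78, 24, 1.71, 2.99, 4.53, 2.11, 3.88,
              4.67, 4.25, 2.06, 23, 1.58, 2.17, 0.02])

def l1_cost(a, b):
    """Sum of absolute residuals for the line y = a*x + b."""
    return float(np.sum(np.abs(y - a * x - b)))

# Brute force: try the line through every pair of data points.
best = None
for i, j in itertools.combinations(range(len(x)), 2):
    a = (y[j] - y[i]) / (x[j] - x[i])
    b = y[i] - a * x[i]
    cost = l1_cost(a, b)
    if best is None or cost < best[0]:
        best = (cost, a, b)
cost_l1, a_l1, b_l1 = best

# Least-squares line on the same data, for comparison.
A = np.column_stack([x, np.ones_like(x)])
(a_l2, b_l2), *_ = np.linalg.lstsq(A, y, rcond=None)

print("l1 fit: a =", a_l1, " b =", b_l1, " l1 cost =", cost_l1)
print("l2 fit: a =", a_l2, " b =", b_l2, " l1 cost =", l1_cost(a_l2, b_l2))
```

The enumeration costs $O(n^3)$ work and only makes sense for a small data set like this one; it does not generalize the way the linear-programming formulation does.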

Another approach is to use an $\ell_2$ penalty for points that are close to the line but an $\ell_1$ penalty for points that are far away. Specifically, we’ll use something called the Huber loss, defined as:

Consider a simple instance of this problem, where C_max = 500 and ₁ = ₂ = ₃ = ₄ = 1. Also assume for simplicity that each variable has a lower bound of zero and no upper bound. Solve this problem using JuMP. Use the Ipopt solver and the command @NLconstraint(…) to specify nonlinear constraints such as log-sum-exp functions. Have your code print the optimal values of T, r, and w, as well as the optimal objective value.
