Description

1. Moving averages. There are many ways to model the relationship between an input sequence $\{u_1, u_2, \dots\}$ and an output sequence $\{y_1, y_2, \dots\}$. In class, we saw the moving average (MA) model, where each output is approximated by a linear combination of the $k$ most recent inputs:
$$\text{MA:}\qquad y_t \approx b_1 u_t + b_2 u_{t-1} + \cdots + b_k u_{t-k+1}$$
We then used least squares to find the coefficients $b_1, \dots, b_k$. What if we didn't have access to the inputs at all, and we were asked to predict future $y$ values based only on the previous $y$ values? One way to do this is by using an autoregressive (AR) model, where each output is approximated by a linear combination of the $\ell$ most recent outputs (excluding the present one):
$$\text{AR:}\qquad y_t \approx a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_\ell y_{t-\ell}$$
Of course, if the inputs contain pertinent information, we shouldn’t expect the AR method to outperform the MA method!

Using the same dataset from class, uy_data.csv, plot the true $y$, and on the same axes also plot the estimated $\hat y$ using the MA model and the estimated $\hat y$ using the AR model. Use $k = \ell = 5$ for both models. To quantify the difference between the estimates, also compute $\|y - \hat y\|$ for both cases.
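Both fits are plain linear least-squares problems once the design matrix is built. A sketch in Python/NumPy (standing in for Julia's backslash; the helper names are ours, and loading of uy_data.csv is omitted since its column layout is not specified here):

```python
# Least-squares fits for the MA and AR models. np.linalg.lstsq plays the
# role of Julia's backslash; arrays u and y are assumed already loaded
# from uy_data.csv.
import numpy as np

def fit_ma(u, y, k):
    """MA fit: y_t ≈ b_1 u_t + b_2 u_{t-1} + ... + b_k u_{t-k+1}."""
    rows = range(k - 1, len(y))          # rows t with k inputs available
    A = np.column_stack([[u[t - j] for t in rows] for j in range(k)])
    b, *_ = np.linalg.lstsq(A, y[k - 1:], rcond=None)
    return b

def fit_ar(y, ell):
    """AR fit: y_t ≈ a_1 y_{t-1} + a_2 y_{t-2} + ... + a_ell y_{t-ell}."""
    rows = range(ell, len(y))            # rows t with ell past outputs
    A = np.column_stack([[y[t - j] for t in rows] for j in range(1, ell + 1)])
    a, *_ = np.linalg.lstsq(A, y[ell:], rcond=None)
    return a
```

Each $\hat y$ is then the design matrix times the recovered coefficients, and the reported error is the norm of the residual on the rows actually used.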

Yet another possible modeling choice is to combine both AR and MA. Unsurprisingly, this is called the autoregressive moving average (ARMA) model:
$$\text{ARMA:}\qquad y_t \approx a_1 y_{t-1} + a_2 y_{t-2} + \cdots + a_\ell y_{t-\ell} + b_1 u_t + b_2 u_{t-1} + \cdots + b_k u_{t-k+1}$$
Solve the problem once more, this time using an ARMA model with $k = \ell = 1$. Plot $y$ and $\hat y$ as before, and also compute the error $\|y - \hat y\|$.
Note: For the problems in this question you don't need to use optimization codes; you can just use the backslash ("\") notation for solving linear least squares.
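With $k = \ell = 1$ the ARMA model reduces to a two-column least-squares problem. A sketch (again NumPy in place of backslash; the helper name fit_arma11 is our own):

```python
# ARMA fit with k = ell = 1: y_t ≈ a1*y_{t-1} + b1*u_t, by least squares.
import numpy as np

def fit_arma11(u, y):
    A = np.column_stack([y[:-1], u[1:]])    # rows are t = 1, ..., T-1
    coeffs, *_ = np.linalg.lstsq(A, y[1:], rcond=None)
    yhat = A @ coeffs
    err = np.linalg.norm(y[1:] - yhat)      # the error ||y - yhat||
    return coeffs[0], coeffs[1], err
```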
2. The Huber loss. In statistics, we frequently encounter data sets containing outliers, which are bad data points arising from experimental error or abnormally high noise. Consider for example the following data set consisting of 15 pairs $(x, y)$.

x |     1      2      3      4      5      6      7      8      9     10     11     12     13     14     15
y |  6.31   3.78     24   1.71   2.99   4.53   2.11   3.88   4.67   4.25   2.06     23   1.58   2.17   0.02
The y values corresponding to x = 3 and x = 12 are outliers because they are far outside the expected range of values for the experiment.
a) Compute the best linear fit to the data using an $\ell_2$ cost (least squares). In other words, we are looking for the $a$ and $b$ that minimize the expression:

$$\ell_2\ \text{cost:}\qquad \sum_{i=1}^{15} (y_i - a x_i - b)^2$$
Repeat the linear fit computation, but this time exclude the outliers from your data set. On a single plot, show the data points and both linear fits. Explain the difference between the two fits.
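A sketch of the two $\ell_2$ fits, with and without the outliers, using NumPy's least-squares solver in place of backslash (the data values are transcribed from the table above):

```python
# Least-squares line fit y ≈ a*x + b, on all 15 points and again with the
# outliers at x = 3 and x = 12 removed.
import numpy as np

def l2_fit(x, y):
    A = np.column_stack([x, np.ones_like(x)])
    (a, b), *_ = np.linalg.lstsq(A, y, rcond=None)
    return a, b

x = np.arange(1.0, 16.0)
y = np.array([6.31, 3.78, 24, 1.71, 2.99, 4.53, 2.11, 3.88,
              4.67, 4.25, 2.06, 23, 1.58, 2.17, 0.02])

a_all, b_all = l2_fit(x, y)               # fit including outliers
keep = (x != 3) & (x != 12)
a_in, b_in = l2_fit(x[keep], y[keep])     # fit excluding outliers
```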
CS/ECE/ISyE 524 Introduction to Optimization Steve Wright, Spring 2021

b) It's not always practical to remove outliers from the data manually, so we'll investigate ways of automatically dealing with outliers by changing our cost function. Find the best linear fit again (including the outliers), but this time use the $\ell_1$ cost function:

$$\ell_1\ \text{cost:}\qquad \sum_{i=1}^{15} \left| y_i - a x_i - b \right|$$
Include a plot containing the data and the best $\ell_1$ linear fit. Does the $\ell_1$ cost handle outliers better or worse than least squares? Explain why.
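One standard way to compute the $\ell_1$ fit is to pose it as a linear program: minimize $\sum_i t_i$ subject to $-t_i \le y_i - a x_i - b \le t_i$. A sketch using SciPy's linprog (a JuMP model in Julia would work equally well; the function name l1_fit is our own):

```python
# L1 line fit as a linear program over variables [a, b, t_1, ..., t_n],
# minimizing sum(t) with -t_i <= y_i - a*x_i - b <= t_i.
import numpy as np
from scipy.optimize import linprog

def l1_fit(x, y):
    n = len(x)
    c = np.concatenate([[0.0, 0.0], np.ones(n)])   # cost: sum of t_i
    I = np.eye(n)
    # Rows:  a*x_i + b - t_i <= y_i   and   -a*x_i - b - t_i <= -y_i
    A_ub = np.block([[ x[:, None],  np.ones((n, 1)), -I],
                     [-x[:, None], -np.ones((n, 1)), -I]])
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None), (None, None)] + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[0], res.x[1]
```

Because the constraints force $t_i \ge |y_i - a x_i - b|$ at the optimum, this LP minimizes exactly the $\ell_1$ cost above.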

c) Another approach is to use an $\ell_2$ penalty for points that are close to the line but an $\ell_1$ penalty for points that are far away. Specifically, we'll use something called the Huber loss, defined as:
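In its standard form, with a threshold parameter $M > 0$ separating the two regimes (the symbol $M$ is our choice of name here), the Huber loss of a residual $r$ is:

```latex
\phi_{\mathrm{huber}}(r) =
\begin{cases}
  r^2            & \text{if } |r| \le M,\\
  M\,(2|r| - M)  & \text{if } |r| > M.
\end{cases}
```

It is quadratic for small residuals and only linear for large ones, and the two pieces meet with matching value ($M^2$) and slope ($2M$) at $|r| = M$, so the loss is continuously differentiable.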

Consider a simple instance of this problem, where $C_{\max} = 500$ and the four subscripted parameters are each equal to $1$. Also assume for simplicity that each variable has a lower bound of zero and no upper bound. Solve this problem using JuMP. Use the Ipopt solver and the command @NLconstraint(...) to specify nonlinear constraints such as log-sum-exp functions. Have your code print the optimal values of $T$, $r$, and $w$, as well as the optimal objective value.