Description
1. For SVR, prove that w_0 = y_j − ε − Σ_{i∈S} (α_i − α_i^*) K(x_i, x_j), where S is the set of indices of the support vectors and x_j is a support vector at the upper edge of the ε-tube. (10 pts)
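A sketch of where this identity comes from, written in the standard ε-SVR dual notation (α_i, α_i^* are the dual variables and K is the kernel; adapt the symbols to the ones used in lecture):

```latex
% Dual form of the epsilon-SVR prediction:
f(x) = \sum_{i \in S} (\alpha_i - \alpha_i^{*})\, K(x_i, x) + w_0
% A support vector x_j on the upper edge of the tube (0 < \alpha_j < C)
% satisfies the KKT condition  y_j - f(x_j) = \varepsilon,  hence
w_0 = y_j - \varepsilon - \sum_{i \in S} (\alpha_i - \alpha_i^{*})\, K(x_i, x_j)
```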

For the following classification problem, design a single-layer perceptron using the multi-class Perceptron update rule. (20 pts)
D: ω_1 ↔ x_1, ω_2 ↔ x_2, ω_3 ↔ x_3
[Three two-dimensional training patterns, one per class; their ±1 component values are given in the figure in the original problem statement.]
Use one-hot coding for the classes; for example, ω_3 should be represented by the vector

y_3 = [−1  −1  1]^T

(+1 in the third position and −1 elsewhere).

Start with W(0) = 0_{2×3} and choose η(i) = 0.5 in W(i + 1) = W(i) + η(i) x(i) e^T. Do not use the augmented space and assume that the biases are always zero (no update for the biases). Show multiple steps of your algorithm. Does it converge? Why? It is alright to use a computer or calculator to perform the matrix calculations, but you should write down all the steps, and should not write a computer program to yield the final results.
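To check your hand calculations (not to replace them), the update rule can be sketched in NumPy. The patterns and targets below are placeholders, since the actual component values come from the figure in the problem statement:

```python
import numpy as np

# Placeholder patterns (substitute the values from the figure): x1, x2, x3 as rows.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
# One-hot (+1/-1) targets: y_k has +1 in position k and -1 elsewhere.
Y = np.where(np.eye(3) == 1, 1.0, -1.0)

W = np.zeros((2, 3))        # W(0) = 0_{2x3}; no biases (fixed at zero)
eta = 0.5                   # learning rate eta(i)

for epoch in range(2):
    for i in range(3):
        x = X[i].reshape(2, 1)          # current pattern x(i)
        out = np.sign(W.T @ x)          # outputs of the three discriminants
        e = Y[i].reshape(3, 1) - out    # error vector e
        W = W + eta * x @ e.T           # W(i+1) = W(i) + eta(i) x(i) e^T
        print(f"epoch {epoch}, step {i}:\n{W}")
```

Note that NumPy's `np.sign(0)` returns 0, so the very first step (with W(0) = 0) produces e equal to the target vector itself.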

Now redo the previous procedure, but this time treat each column of W as one perceptron, i.e., update each column (the weight vector associated with one linear discriminant) separately. For example, the first iteration becomes:
W(0) = [w_1(0) w_2(0) w_3(0)]
w_1(1) = w_1(0) + e_1 x(1)
w_2(1) = w_2(0) + e_2 x(1)
w_3(1) = w_3(0) + e_3 x(1)
where e_i = y_{i1} − sign(w_i(0)^T x(1)) is the difference between the i-th element of y_1 (the target vector for x_1) and the output of the i-th neuron/linear discriminant w_i(0).
Perform two epochs only. This exercise is essentially to make you observe that a multi-category Perceptron algorithm is based on multiple binary problems.
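The per-column view can be sketched as three independent binary perceptrons (again with placeholder data standing in for the figure's values):

```python
import numpy as np

# Placeholder patterns and +1/-1 one-hot targets; substitute the values
# from the figure in the problem statement.
X = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]])
Y = np.where(np.eye(3) == 1, 1.0, -1.0)

# Treat each column of W as an independent binary perceptron.
w = [np.zeros(2) for _ in range(3)]

for epoch in range(2):                          # perform two epochs only
    for n in range(3):                          # present x(1), x(2), x(3)
        x = X[n]
        for i in range(3):                      # update each column separately
            e_i = Y[n, i] - np.sign(w[i] @ x)   # target minus neuron output
            w[i] = w[i] + e_i * x               # w_i(n+1) = w_i(n) + e_i x(n)
```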

Consider the two classes of patterns shown in the figure below. Design a multi-layer neural network with the following architecture to distinguish these categories. (30 pts)
[Figure: two classes of patterns in the (x_1, x_2) plane, and the prescribed architecture — a hidden layer of AND operations on the inputs feeding a single OR operation at the output.]
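The idea behind this architecture can be sketched with hard-limit units: each hidden AND unit carves out one convex region, and the output OR unit takes their union. The weight/bias values below are one standard choice, not the only one:

```python
import numpy as np

def threshold(z):
    """Hard-limit activation: 1 if z >= 0, else 0."""
    return (z >= 0).astype(float)

# A single perceptron realizes AND / OR of binary inputs by choosing
# weights and a bias (threshold):
def AND(x1, x2):
    return threshold(1.0 * x1 + 1.0 * x2 - 1.5)   # fires only when both are 1

def OR(x1, x2):
    return threshold(1.0 * x1 + 1.0 * x2 - 0.5)   # fires when either is 1

x1 = np.array([0, 0, 1, 1])
x2 = np.array([0, 1, 0, 1])
print(AND(x1, x2))   # [0. 0. 0. 1.]
print(OR(x1, x2))    # [0. 1. 1. 1.]
```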


Use the design parameters that you chose in the first part and train a neural network, but this time set early_stopping=True. Research what early stopping is, and compare the performance of your network on the test set with that of the previous network. You can leave validation_fraction at its default (0.1) or change it to see whether you can obtain a better model. Remember to standardize your features. Report your R^2 on both the training and test sets. (10 pts)
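A minimal sketch of the early-stopping setup with scikit-learn, using synthetic stand-in data (substitute the assignment's dataset and your own design parameters):

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; use the assignment's dataset in practice.
rng = np.random.RandomState(0)
X = rng.randn(500, 5)
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.randn(500)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Standardize features using training-set statistics only.
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_te_s = scaler.transform(X_tr), scaler.transform(X_te)

# early_stopping=True holds out validation_fraction of the training data
# and stops training when the validation score stops improving.
net = MLPRegressor(hidden_layer_sizes=(50,), early_stopping=True,
                   validation_fraction=0.1, max_iter=2000, random_state=0)
net.fit(X_tr_s, y_tr)

print("train R^2:", net.score(X_tr_s, y_tr))   # .score() returns R^2
print("test  R^2:", net.score(X_te_s, y_te))
```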

Note: there are a lot of design parameters in a neural network. If you are not sure how they work, just set them as the default of sklearn, but if you use them masterfully, you can have better models.

Optional Programming Assignment: (Deep) CNNs for Image Colorization. This part will not be graded.


This assignment uses a convolutional neural network for image colorization, which turns a grayscale image into a colored image.^2 By converting an image to grayscale we lose the color information, so converting a grayscale image back to a colored version is not an easy job. We will use the CIFAR-10 dataset. Download the dataset from http://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz.



From the training and test datasets, extract the class "bird". We will focus on this class, which has 6000 members.



Those 6000 images have 6000 × 32 × 32 pixels. Choose at least 10% of the pixels at random; it is strongly recommended that you choose a large number, or all, of the pixels. You will have between P = 614,400 and P = 6,144,000 pixels. Each pixel is an RGB vector with three elements.
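The sampling step can be sketched as follows (a random array stands in for the extracted bird images):

```python
import numpy as np

# Stand-in for the 6000 bird images, shape (6000, 32, 32, 3).
birds = np.random.randint(0, 256, size=(6000, 32, 32, 3), dtype=np.uint8)

pixels = birds.reshape(-1, 3)                  # all 6,144,000 RGB vectors
rng = np.random.RandomState(0)
n = int(0.10 * len(pixels))                    # 10% of the pixels: P = 614,400
idx = rng.choice(len(pixels), size=n, replace=False)
sample = pixels[idx]                           # shape (614400, 3)
print(sample.shape)
```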



Run k-means clustering on the P vectors with k = 4. The centers of the clusters will be your main colors. Convert the colored images to k-color images by mapping each pixel's value to the closest main color in terms of Euclidean distance. These are the outputs of your network; each of their pixels falls in one of those k classes.^3
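A sketch of the clustering and quantization steps with scikit-learn (a small random sample stands in for the P pixel vectors):

```python
import numpy as np
from sklearn.cluster import KMeans

# Stand-in for the randomly chosen RGB pixel vectors of shape (P, 3).
rng = np.random.RandomState(0)
sample = rng.randint(0, 256, size=(20000, 3)).astype(float)

kmeans = KMeans(n_clusters=4, n_init=10, random_state=0).fit(sample)
main_colors = kmeans.cluster_centers_          # (4, 3): the four main colors

def quantize(image):
    """Map each pixel of a (32, 32, 3) image to its nearest main color."""
    px = image.reshape(-1, 3).astype(float)
    # Euclidean distance from every pixel to every cluster center:
    d = np.linalg.norm(px[:, None, :] - main_colors[None, :, :], axis=2)
    labels = d.argmin(axis=1)                  # class index in {0, 1, 2, 3}
    return main_colors[labels].reshape(32, 32, 3), labels.reshape(32, 32)

img = rng.randint(0, 256, size=(32, 32, 3)).astype(float)
four_color_img, labels = quantize(img)         # the network's target image
```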



Use any tool (e.g., OpenCV or scikit-learn) to obtain grayscale 32 × 32 × 1 images from the original 32 × 32 × 3 images. The grayscale images are the inputs of your network.
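If you prefer not to depend on a library for this step, the conversion is a weighted sum of the channels; the sketch below uses the standard luminosity weights (the same convention OpenCV's RGB-to-gray conversion uses):

```python
import numpy as np

def to_grayscale(img_rgb):
    """Convert a (32, 32, 3) RGB image to (32, 32, 1) grayscale."""
    w = np.array([0.299, 0.587, 0.114])        # R, G, B luminosity weights
    gray = img_rgb.astype(float) @ w           # weighted sum -> (32, 32)
    return gray[..., None]                     # add channel axis -> (32, 32, 1)

img = np.zeros((32, 32, 3))
img[...] = [255, 255, 255]                     # an all-white test image
g = to_grayscale(img)
print(g.shape)
```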



Set up a deep convolutional neural network with two (or more) convolution layers and two (or more) MLP layers. Use 5 × 5 filters and a softmax output layer. Determine the number of filters, the strides, and whether or not to use padding yourself. Use at least one max-pooling layer. Use a classification scheme, which means your output must determine one of the k = 4 color classes for each pixel in your grayscale image. Your input is a grayscale version of an image (32 × 32 × 1) and the output is 32 × 32 × 4. The output assigns one of the k = 4 colors to each of the 32 × 32 pixels; therefore, each pixel is classified into one of the classes [1 0 0 0], [0 1 0 0], [0 0 1 0], [0 0 0 1]. After each pixel is classified into one of the main colors, the RGB code of that color can be assigned to the pixel. For example, if the third main color^4 is [255 255 255] and pixel (32, 32) of an image has the one-hot encoded class [0 0 1 0], i.e., it was classified as the third color, then position (32, 32) in the output can be associated with [255 255 255].

The size of the output of the convolutional part, c_1 × c_2, depends on the sizes of the convolutional layers you choose, and is a feature map, i.e., a matrix. That matrix must be flattened or reshaped, i.e., turned into a vector of size c_1 c_2 × 1, before it is fed to the MLP part. Choose the number of neurons in the first layer of the MLP (and in any other hidden layers, if you are willing to have more than one) yourself, but the last layer must have 32 × 32 × 4 = 4096 neurons, each of which represents a pixel being in one of the k = 4 classes. Add a softmax layer,^5 which will choose the highest value out of its k = 4 inputs for each of the 1024 pixels. The output of the MLP therefore has to be reshaped into a 32 × 32 × 4 matrix, and to get the colored image, each of the k = 4 classes has to be converted to its RGB vector, so an output image will be 32 × 32 × 3.

Train for at least 5 epochs (30 epochs is strongly recommended). Plot the training, (validation), and test errors at each epoch. Report the train and test errors, and visually compare the artificially colored versions of the first 10 images in the test set with the original images.^6

^2 MATLAB seems to have an easy-to-use CNN library: https://www.mathworks.com/help/nnet/examples/train-a-convolutional-neural-network-for-regression.html

^3 Centers of clusters have previously been reported to come out too close together, so the resulting tetrachrome images will be very close to grayscale. If you would like to see colorful images, repeat the exercise with colors you select from https://sashat.me/2017/01/11/list-of-20-simple-distinct-colors/ or https://www.rapidtables.com/web/color/RGB_Color.html. A suggestion would be Navy = (0, 0, 128), Red = (230, 25, 75), Mint = (170, 255, 195), and White = (255, 255, 255).
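The final decoding step — turning the network's per-pixel class scores back into an RGB image — can be sketched as follows. The scores array is a hypothetical network output, and the palette is the suggested distinct-color set from the footnote:

```python
import numpy as np

# Hypothetical network output for one image: per-pixel scores over the
# k = 4 color classes, shape (32, 32, 4) after reshaping the 4096 outputs.
rng = np.random.RandomState(0)
scores = rng.randn(32, 32, 4)

# Example main colors (the suggested palette from the footnote):
palette = np.array([[0, 0, 128],        # Navy
                    [230, 25, 75],      # Red
                    [170, 255, 195],    # Mint
                    [255, 255, 255]])   # White

labels = scores.argmax(axis=-1)         # pick the winning class per pixel
colored = palette[labels]               # (32, 32, 3) RGB output image
print(colored.shape)
```

In a real run, the k-means cluster centers (rounded to integer RGB values) would replace this palette.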

Extra Practice: Repeat the whole exercise with k = 16, 24, 32 colors if your computer can handle the computations.