Description

You are given the following training data in a 1D (1 feature), 2-class problem:
S_{1}: x_{1} = 0, x_{2} = 2, x_{3} = 3
S_{2}: x_{4} = −2, x_{5} = −3, x_{6} = −4

(a)
(i) Plot the data in nonaugmented feature space. Draw a linear decision boundary H (a point) that correctly classifies them, showing which side is positive.
(ii) Give the weight vector w that corresponds to your H. Because the length of w is not determined by the given quantities, choose w so that ||w|| = 1.

Hint: the boundary is given by g(x) = 0, and the decision region for Γ_{1} is g(x) > 0.

(b) Plot the data points (in augmented form) in augmented feature space. Also show the decision boundary and which side is positive.

(c) Plot the augmented data points as lines (each given by w^{T} x_{n} = 0) in weight space. Show the positive side of each such line with a small arrow. Show the solution region.

(d) Plot the weight vector w of H from part (a) as a point in weight space. Is w in the solution region?
For parts (e)-(f) below, you will use the same dataset of points x_{n} given above, in augmented form as in (b), but you will reflect them and use the reflected data points z_{n} (z_{n} = x_{n} for x_{n} ∈ S_{1}, and z_{n} = −x_{n} for x_{n} ∈ S_{2}) for your plots.

(e) Plot the reflected data points z_{n} in augmented feature space. Draw the linear decision boundary H from your part (b) and its weight vector. Does H correctly classify all the data points?
Hint: write down the general condition for correct classification of reflected data points.

(f) Plot the data points z_{n} as lines in 2D weight space, showing the positive side of each with a small arrow. Show the final solution region.
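If you want to sanity-check your hand-drawn answers, here is a minimal Python sketch (assuming numpy and matplotlib are available; the boundary x = −1 and the weight vector below are one illustrative choice, not the required answer). It plots the data and verifies the reflected-data condition w^{T} z_{n} > 0:

```python
import numpy as np
import matplotlib.pyplot as plt

# Training data (1 feature): S_1 and S_2 as given above.
S1 = np.array([0.0, 2.0, 3.0])
S2 = np.array([-2.0, -3.0, -4.0])

# One illustrative boundary H: the point x = -1, with g(x) = x + 1,
# so the positive side (decision region for S_1) is x > -1.
# In augmented form x_aug = [1, x]^T this corresponds to w = [1, 1]^T,
# normalized below so that ||w|| = 1.
w = np.array([1.0, 1.0])
w = w / np.linalg.norm(w)

# Augment: prepend a constant 1 to each data point.
aug = lambda x: np.stack([np.ones_like(x), x], axis=-1)
X1, X2 = aug(S1), aug(S2)

# Reflect: z_n = x_n for S_1, z_n = -x_n for S_2, so correct
# classification of all points becomes the single condition w^T z_n > 0.
Z = np.vstack([X1, -X2])
print("w^T z_n:", Z @ w)  # all entries should be positive
print("H classifies all points correctly:", np.all(Z @ w > 0))

# Plot the nonaugmented data and the boundary point.
plt.scatter(S1, np.zeros_like(S1), marker="o", label="S_1")
plt.scatter(S2, np.zeros_like(S2), marker="x", label="S_2")
plt.axvline(-1.0, linestyle="--", label="H: x = -1 (positive side: x > -1)")
plt.legend()
plt.show()
```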

This problem uses the notation we used in Lecture 6, and m is a positive integer. For the following computational complexity:
p(m) = 2m + 1000
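Throughout, a and m_{0} (and b and m_{1} below) refer to the constants in the Lecture 6 definitions. Assuming those definitions take the standard form (verify against your notes), they read:

```latex
% Assumed (standard) forms of the definitions -- check against Lecture 6.
\[
p(m) = O(q(m)) \iff \exists\, a > 0,\ \exists\, m_0 : \ p(m) \le a\,q(m) \ \ \forall m \ge m_0
\]
\[
p(m) = \Omega(q(m)) \iff \exists\, b > 0,\ \exists\, m_1 : \ p(m) \ge b\,q(m) \ \ \forall m \ge m_1
\]
\[
p(m) = \Theta(q(m)) \iff p(m) = O(q(m)) \ \text{and} \ p(m) = \Omega(q(m))
\]
```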

(a) Is p(m) = O(m)? If yes, prove your answer by letting a = 1 and solving for what m_{0} we have m ≥ p(m) ∀m ≥ m_{0}. If you need a larger a, then state what value of a will work. Find the smallest such integer m_{0} for the value of a you used. If no, justify why not.
(b) Is p(m) = Ω(m)? If yes, prove your answer by letting b = 1 and solving for what m_{1} we have m ≤ p(m) ∀m ≥ m_{1}. If you need a smaller b, then state what value of b will work. Find the smallest such integer m_{1} for the value of b you used. If no, justify why not.
(c) Is p(m) = Θ(m)? Justify your answer.
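A small Python sketch for numerically sanity-checking candidate constants (the trial value a = 3 below is only an example of how such a check might look, not necessarily the value to use):

```python
# Numerically sanity-check big-O / big-Omega constants for p(m) = 2m + 1000.
# Illustrative sketch only: the trial constants are examples, not the answer.

def p(m):
    return 2 * m + 1000

def smallest_threshold(cond, limit=10**6):
    """Smallest integer m0 such that cond(m) holds for all m >= m0,
    found by brute-force scan up to `limit`. A finite scan suffices here
    because both sides of each comparison are linear in m, so once the
    condition holds it keeps holding."""
    m0 = 1
    for m in range(1, limit + 1):
        if not cond(m):
            m0 = m + 1  # condition violated at m; threshold must be later
    return m0

a = 3  # trial constant for O(m): need a*m >= p(m) for all m >= m0
print("m0 for a =", a, ":", smallest_threshold(lambda m: a * m >= p(m)))

b = 1  # trial constant for Omega(m): need b*m <= p(m) for all m >= m1
print("m1 for b =", b, ":", smallest_threshold(lambda m: b * m <= p(m)))
```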

This problem also uses the notation of Lecture 6.


(a) Suppose we have a function p(m) that can be expressed as:

p(m) = p_{1}(m) + p_{2}(m) + p_{3}(m)

and we have:

p_{k}(m) = O(q_{k}(m)), k = 1, 2, 3    (i)

Prove that:

p(m) = O(q_{1}(m) + q_{2}(m) + q_{3}(m))    (ii)

Hints: use the definition of big-O in equation (i). If you find the problem statement unclear or confusing, try looking at the example in the appendix below.
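As a stepping stone (not the requested proof), here is how the definition handles a single term, under the added assumption that each q_{k}(m) is nonnegative for all sufficiently large m:

```latex
% One-term warm-up, assuming q_k(m) >= 0 for all sufficiently large m:
% if p_1(m) <= a_1 q_1(m) for all m >= m_{0,1}, then also
\[
p_1(m) \;\le\; a_1\, q_1(m) \;\le\; a_1\,\bigl(q_1(m) + q_2(m) + q_3(m)\bigr)
\quad \forall m \ge m_{0,1},
\]
% so p_1(m) = O(q_1(m) + q_2(m) + q_3(m)). The remaining work is to
% combine the three such statements, choosing a single a and m_0.
```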





(b) Is a similar statement to (a) true for Ω(..)? (That is, if you replace each O(..) in part (a) with Ω(..), would equation (ii) still be true?) Justify your answer.



Consider the following computational complexity:

p(m) = m^{2} log_{2} m + 2m^{3} + (log_{2} m)^{3} log_{2} m
(a) Is p(m) = O(m^{3})? If yes, justify it by showing it satisfies the definition. If no, justify why not.
Hint: p(m) is the sum of 3 terms. Try setting a = 1, and check each term individually to see if it is O(m^{3}) (if it is, then give the value of each m_{0}). Then use the result of Problem 3(a).
(b) Is p(m) = Ω(m^{3})? Justify your answer.
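To build intuition about which term dominates (assuming the reconstruction of p(m) above), here is a quick numeric comparison of each term against m^{3}; note that ratios at finite m suggest, but do not prove, asymptotic bounds:

```python
# Compare each term of p(m) against m^3 at a few sizes.
# This only builds intuition; it is not a proof of any bound.
import math

terms = {
    "m^2 log2 m":        lambda m: m**2 * math.log2(m),
    "2 m^3":             lambda m: 2 * m**3,
    "(log2 m)^3 log2 m": lambda m: math.log2(m)**4,
}

for m in (2**4, 2**8, 2**16, 2**20):
    ratios = {name: f(m) / m**3 for name, f in terms.items()}
    print(m, {k: f"{v:.3g}" for k, v in ratios.items()})
```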

For a nearest-means classifier, with C = 2 classes (held constant), N/2 data points in each class (assume N is even), and D features, find the computational time complexity and space complexity for the operations given in (a) and (b) below.
For all parts of this problem, assume a completely serial computer. Your answer may be in the form of a big-O upper bound; express the big-O bound in simplest form, without making it looser (higher) than necessary. As part of your answer to each part, please:



- Give a brief statement of the algorithm you are analyzing
- Show your reasoning in calculating the complexity

(a) Computing each mean vector from the training data (the training phase)

(b) Classifying M data points, given the mean vectors (the classification phase)

(c), (d) Repeat parts (a) and (b), except for a C-class problem, with N/C data points per class, in which C is a variable.
Hint for (d): you may use without proof that finding the minimum of C values that are unsorted can be done in O(C) time and O(1) space.
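If it helps to pin down the algorithm being analyzed, here is a minimal nearest-means sketch in Python (an illustrative reading of the algorithm, not necessarily the exact lecture pseudocode; written for clarity, not to prejudge the complexity answers):

```python
import numpy as np

def train_means(X, y, C):
    """Training phase: compute one D-dimensional mean vector per class.
    X: (N, D) data matrix; y: (N,) labels in {0, ..., C-1}."""
    D = X.shape[1]
    means = np.zeros((C, D))
    for c in range(C):
        # Average the class-c samples: roughly O(N_c * D) work per class.
        means[c] = X[y == c].mean(axis=0)
    return means

def classify(points, means):
    """Classification phase: assign each point to the class of the
    nearest mean (squared Euclidean distance; same argmin as Euclidean)."""
    labels = np.empty(len(points), dtype=int)
    for i, x in enumerate(points):
        d2 = ((means - x) ** 2).sum(axis=1)  # O(C * D) per point
        labels[i] = int(np.argmin(d2))       # min of C values: O(C)
    return labels

# Tiny usage example (C = 2, D = 1, matching the data in Problem 1):
X = np.array([[0.0], [2.0], [3.0], [-2.0], [-3.0], [-4.0]])
y = np.array([0, 0, 0, 1, 1, 1])
means = train_means(X, y, C=2)
print(means.ravel())                        # the two class means
print(classify(np.array([[1.0], [-1.0]]), means))
```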
Appendix – Example (relates to Problem 3)
Suppose we want to find and prove the (tightest) asymptotic upper bound of p(m), with:
p(m) = 3m^{3} + 100m^{2} log_{2} m + 0.1(2^{m})
Applying the definition directly to p(m) (especially to prove your bound, including finding m_{0}) might be difficult. Instead, you could use the result of Problem 3(a) to apply the big-O bound to each term independently:
3m^{3} = O(m^{3})
100m^{2} log_{2} m = O(m^{2} log m)
0.1(2^{m}) = O(2^{m})
Then, using Problem 3(a) equation (ii), we can conclude:
3m^{3} + 100m^{2} log_{2} m + 0.1(2^{m}) = O(m^{3} + m^{2} log m + 2^{m}) = O(2^{m})