Homework 3 (Week 4)


Please note: in AML, Exercises are embedded in the text of each section; Problems are at the end of each chapter. Please be sure you work the assigned problem or exercise to get credit.

  1. AML Problem 1.7(a) (p. 36), plus the following part:

    1. Take the scenario of part (a), for the case of 1,000 coins and µ = 0.05. Consider the following interpretation when applying it in a machine-learning setting.

There is one hypothesis that is given (one decision boundary and corresponding set of decision regions, or one decision rule); call it h. The out-of-sample error is E_out(h) = 0.05, and the in-sample error depends on the dataset drawn.

Hint: The number of tosses of a coin, N = 10, corresponds to the size of a dataset.

Complete the machine-learning interpretation by answering the following:

      1. What do the 1,000 coins represent?

      2. What does the calculation in part (a), for 1,000 coins and µ = 0.05, represent?

Tip: if you are not sure of your answers, try working Problem 2, then come back to this problem.
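To build intuition for part (a) before answering the interpretation questions, here is a quick Monte-Carlo sketch of the 1,000-coin scenario (a sketch only, assuming µ = 0.05 is the probability of heads and 10 tosses per coin; the variable names and use of numpy are my own, not part of the assignment):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n_coins, n_tosses, n_trials = 0.05, 1000, 10, 2000

# Each trial: toss 1,000 coins 10 times each, then check whether at least
# one coin comes up heads zero times (nu = 0).
heads = rng.random((n_trials, n_coins, n_tosses)) < mu
at_least_one_nu_zero = (heads.sum(axis=2) == 0).any(axis=1)

est = at_least_one_nu_zero.mean()
# Closed form: P(one coin has nu = 0) = (1 - mu)^10, then complement over 1,000 coins.
exact = 1 - (1 - (1 - mu) ** n_tosses) ** n_coins
print(f"simulated P(some coin has nu = 0): {est:.4f}")
print(f"closed form:                       {exact:.6f}")
```

With 1,000 coins the event is nearly certain, even though for a single coin P(nu = 0) = 0.95^10 is only about 0.6; keep that contrast in mind when answering the questions above.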

  2. Suppose you have trained a model for a binary classification problem, and you want to estimate its error based on a test set.

You want to estimate this error empirically for the case where labeled data is expensive. So you decide to first do some simulations to see how your test-set error might vary with the “luck of the draw” of the test set.

Let the true probability of error of your model be E_out(h) = µ. Because µ is unknown to you, for purposes of simulation you will try different values of µ. Method: conceptually, a test set is created by drawing N data points randomly from the input space, with replacement. An expert could correctly label each data point, and then you could color each data point as “correct” or “incorrect” depending on the ML classifier’s prediction.

You decide to simulate this by drawing (colored) data points randomly, with replacement, from a bin of “correct” and “incorrect” data points, with P(incorrect) = µ. Let µ = 0.20.

    1. Draw a colored dataset of size N = 10. From your 10 drawn data points, compute the error rate E_test(h). Is it equal to 0.20? Explain.
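As a sanity check on the “luck of the draw,” the bin model above can be simulated directly (a minimal sketch, assuming numpy with a fixed seed; the repeat count of 10,000 is my own choice for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
mu, N = 0.20, 10

# One simulated test set: True = "incorrect", False = "correct".
draw = rng.random(N) < mu
print("E_test(h) for this single draw:", draw.mean())

# Repeat many times to see how the estimate fluctuates around mu.
rates = (rng.random((10_000, N)) < mu).mean(axis=1)
print("mean over 10,000 draws:", rates.mean().round(3))
print("std  over 10,000 draws:", rates.std().round(3))
```

Any single draw of size 10 can land well away from µ = 0.20; the spread across repeated draws is what the problem asks you to reason about.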

    2. Show that m(3) = 2^3.

    3. Show that m(4) = 2^4.

Hint: Choose a set of data points that are symmetric in some sense; then count dichotomies using symmetry (e.g., show one dichotomy and state how many other dichotomies can be realized in the same way due to symmetry). This can save you some writing and drawing.

    4. You are given that m(5) < 2^5. Give a simple upper bound for m(N), valid for any N, that is a polynomial in N.

Hint: AML Theorem 2.4 will not be considered a “simple” polynomial for this problem.
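For numerical intuition (not a substitute for your own derivation), you can compare the Theorem 2.4 bound against a candidate polynomial and against 2^N. The sketch below assumes the break point implied above is k = 5 and uses N^(k-1) + 1 as one candidate polynomial bound; both choices are assumptions for illustration:

```python
from math import comb

k = 5  # assumed break point, from m(5) < 2^5

def theorem_2_4_bound(N, k):
    """AML Theorem 2.4: m(N) <= sum_{i=0}^{k-1} C(N, i)."""
    return sum(comb(N, i) for i in range(k))

def candidate_poly_bound(N, k):
    """One candidate 'simple' polynomial: N^(k-1) + 1."""
    return N ** (k - 1) + 1

for N in (5, 10, 20, 50):
    # The candidate polynomial dominates the Theorem 2.4 sum for these N,
    # while 2^N grows much faster than either.
    print(N, theorem_2_4_bound(N, k), candidate_poly_bound(N, k), 2 ** N)
```

The point of the exercise is to justify such a bound, not merely to tabulate it.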

  3. [Based on AML Exercise 2.1, p. 45]

    1. Find the smallest break point k for the hypothesis set consisting of Positive Rays (defined in Example 2.2).

    2. Find the smallest break point k for the hypothesis set consisting of Positive Intervals (defined in Example 2.2).
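A brute-force dichotomy counter can verify the counts you derive by hand for these two hypothesis sets. This is a sketch under my own conventions: points are placed at 1..N on the line, and thresholds halfway between (and just outside) the points cover every distinct hypothesis; the function names are hypothetical, not from AML:

```python
from itertools import combinations

def dichotomies_positive_rays(n):
    """Count dichotomies realized on n points by h(x) = +1 iff x > a."""
    thresholds = [i + 0.5 for i in range(n + 1)]
    return len({tuple(x > a for x in range(1, n + 1)) for a in thresholds})

def dichotomies_positive_intervals(n):
    """Count dichotomies realized on n points by h(x) = +1 iff a < x < b."""
    thresholds = [i + 0.5 for i in range(n + 1)]
    dichos = {tuple(a < x < b for x in range(1, n + 1))
              for a, b in combinations(thresholds, 2)}
    dichos.add((False,) * n)  # the empty interval labels every point -1
    return len(dichos)

for n in range(1, 6):
    print(n, dichotomies_positive_rays(n),
          dichotomies_positive_intervals(n), 2 ** n)
```

The smallest break point is the smallest n whose count falls below 2^n; use the code only to check your analytical answer.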
