CS–Assignment #2 Solution

Description

Note 1: Your submission header must have the format as shown in the above-enclosed rounded rectangle.

Note 2: Homework is to be done individually. You may discuss the homework problems with your fellow students, but you are NOT allowed to copy – either in part or in whole – anyone else’s answers.

Note 3: Your deliverable should be a .pdf file submitted through Gradescope by the deadline. Do not forget to assign a page to each of your answers when making a submission. In addition, source code (.py files) should be added to an online repository (e.g., github) so that it can be downloaded and executed later.

Note 4: All submitted materials must be legible. Figures/diagrams must have good quality.

Note 5: Please use and check the Canvas discussion for further instructions, questions, answers, and hints.

1. [16 points] Considering that ID3 built the decision tree below after analyzing a given training set, answer the following questions:

[Figure: ID3 decision tree. Root node: Tear (branches Normal / Reduced). Internal nodes: Spectacle (branches Myope / Hypermetrope) and Astigmatism (branches No / Yes). Leaf nodes: Yes / No.]

  a. [12 points] What is the accuracy of this model when applied to the test set below? You must report the values for True Positives, True Negatives, False Positives, and False Negatives for full credit.

Age           | Spectacle    | Astigmatism | Tear    | Lenses (ground truth)
Young         | Hypermetrope | Yes         | Normal  | Yes
Young         | Hypermetrope | No          | Normal  | Yes
Young         | Myope        | No          | Reduced | No
Presbyopic    | Hypermetrope | No          | Reduced | No
Presbyopic    | Myope        | No          | Normal  | No
Presbyopic    | Myope        | Yes         | Reduced | No
Prepresbyopic | Myope        | Yes         | Normal  | Yes
Prepresbyopic | Myope        | No          | Reduced | No

  b. [4 points] What are the precision, recall, and F1-measure of this model when applied to the same test set?
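
For reference, parts a and b use the standard confusion-matrix metrics. The helper below only recalls their definitions; the function names are illustrative and are not part of any provided template.

# Standard confusion-matrix metrics used in parts a and b.
# tp, tn, fp, fn are the counts obtained by applying the tree to the test set.
def accuracy(tp, tn, fp, fn):
    return (tp + tn) / (tp + tn + fp + fn)

def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1(tp, fp, fn):
    p, r = precision(tp, fp), recall(tp, fn)
    return 2 * p * r / (p + r)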

2. [15 points] Complete the Python program (decision_tree.py) that will read the files contact_lens_training_1.csv, contact_lens_training_2.csv, and contact_lens_training_3.csv. Each of these training sets has a different number of instances. You will observe that the trees are now created with the parameter max_depth = 3, which is used in sklearn to define the maximum depth of the tree (a pre-pruning strategy). Your goal is to train, test, and output the performance of the model built from each training set on the provided test set (contact_lens_test.csv). You must repeat this process 10 times (training and testing with each of the different training sets), choosing the lowest accuracy as the final classification performance of each model.
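
A minimal sketch of this workflow is shown below. It is not the graded template: the categorical-to-numeric encoding, the assumption that each CSV has a header row, and the column order (Age, Spectacle, Astigmatism, Tear, Lenses) may all differ from the provided decision_tree.py skeleton.

# Sketch of the decision_tree.py task (assumptions noted in the comments).
import csv
from sklearn import tree

# Hypothetical categorical -> numeric encoding; the real template may differ.
ENCODE = {
    'Young': 1, 'Prepresbyopic': 2, 'Presbyopic': 3,
    'Myope': 1, 'Hypermetrope': 2,
    'Yes': 1, 'No': 2,
    'Normal': 1, 'Reduced': 2,
}

def load(path):
    X, y = [], []
    with open(path) as f:
        rows = list(csv.reader(f))
    for row in rows[1:]:                      # skip the assumed header row
        X.append([ENCODE[v] for v in row[:-1]])
        y.append(ENCODE[row[-1]])
    return X, y

training_sets = ['contact_lens_training_1.csv',
                 'contact_lens_training_2.csv',
                 'contact_lens_training_3.csv']
X_test, y_test = load('contact_lens_test.csv')

for ts in training_sets:
    X_train, y_train = load(ts)
    lowest = None
    for _ in range(10):                       # repeat train/test 10 times
        # criterion='entropy' matches ID3's information gain; an assumption here.
        clf = tree.DecisionTreeClassifier(criterion='entropy', max_depth=3)
        clf = clf.fit(X_train, y_train)
        correct = sum(1 for x, y in zip(X_test, y_test)
                      if clf.predict([x])[0] == y)
        accuracy = correct / len(y_test)
        lowest = accuracy if lowest is None else min(lowest, accuracy)
    print('final accuracy when training on', ts, ':', lowest)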

3. [32 points] Consider the dataset below to answer the following questions:

[Figure: scatter plot of the dataset (features x and y, two classes); these are the points provided in binary_points.csv used in part e.]

  a. [4 points] What is the leave-one-out cross-validation error rate (LOO-CV) for 1NN? Use the Euclidean distance as your distance measure and calculate the error rate as:

  error rate = number of misclassified instances / total number of instances

  b. [4 points] What is the leave-one-out cross-validation error rate (LOO-CV) for 3NN?

  c. [4 points] What is the leave-one-out cross-validation error rate (LOO-CV) for 9NN?

  d. [5 points] Draw the decision boundary learned by the 1NN algorithm.

  e. [15 points] Complete the Python program (knn.py) that will read the file binary_points.csv and output the LOO-CV error rate for 1NN (same answer as part a).
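
A rough sketch of what knn.py is expected to compute is given below: hold out each point in turn, fit 1NN on the remaining points, predict the held-out point, and report the fraction of misclassifications. The column layout (x, y, class) and the presence of a header row in binary_points.csv are assumptions; the provided template may organize the loop differently.

# Sketch of the LOO-CV 1NN error rate for knn.py.
# Assumes binary_points.csv has a header row and the columns x, y, class.
import csv
from sklearn.neighbors import KNeighborsClassifier

with open('binary_points.csv') as f:
    rows = list(csv.reader(f))[1:]            # skip the assumed header row

errors = 0
for i, held_out in enumerate(rows):
    # Train on every point except the held-out one (leave-one-out).
    X = [[float(r[0]), float(r[1])] for j, r in enumerate(rows) if j != i]
    y = [r[2] for j, r in enumerate(rows) if j != i]
    clf = KNeighborsClassifier(n_neighbors=1, p=2)   # p=2 -> Euclidean distance
    clf.fit(X, y)
    predicted = clf.predict([[float(held_out[0]), float(held_out[1])]])[0]
    if predicted != held_out[2]:
        errors += 1

# error rate = number of misclassified instances / total number of instances
print('LOO-CV error rate for 1NN:', errors / len(rows))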

4. [12 points] Find the class of instance #10 below following the 3NN strategy. Use the Euclidean distance as your distance measure. You must show all your calculations for full credit; a sketch of the computation is given after the table.

ID  | Red | Green | Blue | Class
#1  | 220 |  20   |  60  | 1
#2  | 255 |  99   |  21  | 1
#3  | 250 | 128   |  14  | 1
#4  | 144 | 238   | 144  | 2
#5  | 107 | 142   |  35  | 2
#6  |  46 | 139   |  87  | 2
#7  |  64 | 224   | 208  | 3
#8  | 176 | 224   |  23  | 3
#9  | 100 | 149   | 237  | 3
#10 | 154 | 205   |  50  | ?
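
The required calculation follows this pattern: compute the Euclidean distance in (Red, Green, Blue) space from instance #10 to each labeled instance, keep the three smallest distances, and take the majority class among those three neighbors. The sketch below simply automates that pattern over the table values; the hand calculations must still be shown for credit.

# 3NN over the table above, using Euclidean distance in RGB space.
from math import sqrt

labeled = {                      # ID: ((Red, Green, Blue), Class)
    1: ((220,  20,  60), 1), 2: ((255,  99,  21), 1), 3: ((250, 128,  14), 1),
    4: ((144, 238, 144), 2), 5: ((107, 142,  35), 2), 6: (( 46, 139,  87), 2),
    7: (( 64, 224, 208), 3), 8: ((176, 224,  23), 3), 9: ((100, 149, 237), 3),
}
query = (154, 205, 50)           # instance #10

def euclidean(a, b):
    return sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# Sort the labeled instances by distance to #10 and keep the 3 nearest.
nearest = sorted(labeled.items(), key=lambda kv: euclidean(kv[1][0], query))[:3]
for idx, (point, cls) in nearest:
    print('#%d: distance = %.2f, class = %d' % (idx, euclidean(point, query), cls))

# Majority vote among the 3 nearest neighbors (ties are not handled here).
votes = [cls for _, (_, cls) in nearest]
print('predicted class for #10:', max(set(votes), key=votes.count))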

5. [25 points] Use the dataset below to answer the following questions:

  a. [10 points] Classify the instance D15 = (Sunny, Mild, Normal, Weak) following the Naïve Bayes strategy. Show all your calculations up to the final normalized probability values (the posterior formula is recalled in the sketch after the sample output below).

  b. [15 points] Complete the Python program (naïve_bayes.py) that will read the file weather_training.csv (training set) and output the classification of each test instance from the file weather_test (test set) whenever the classification confidence is >= 0.75. A rough sketch of one possible approach is given after the sample output. Sample output:

Day | Outlook | Temperature | Humidity | Wind | PlayTennis | Confidence
D15 | Sunny   | Hot         | High     | Weak | No         | 0.86
D16 | Sunny   | Mild        | High     | Weak | Yes        | 0.78
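
For part a, the Naïve Bayes posterior of each class is proportional to the class prior times the product of the per-attribute conditional probabilities, P(class | x) is proportional to P(class) * P(Outlook | class) * P(Temperature | class) * P(Humidity | class) * P(Wind | class), and the final values are normalized so that the Yes/No posteriors sum to 1; the confidence reported in part b is the larger normalized posterior. The sketch below is one possible shape for naïve_bayes.py, not the provided template: the category encodings, the header row, the column order (Day, Outlook, Temperature, Humidity, Wind, PlayTennis), and the test file name weather_test.csv are assumptions, and sklearn's CategoricalNB applies Laplace smoothing by default, so its confidences may differ slightly from the hand calculation.

# Sketch for naive_bayes.py (assumptions noted in the comments).
import csv
from sklearn.naive_bayes import CategoricalNB

# Hypothetical category encodings; the real files may use other values.
ENCODE = {
    'Sunny': 0, 'Overcast': 1, 'Rain': 2,
    'Hot': 0, 'Mild': 1, 'Cool': 2,
    'High': 0, 'Normal': 1,
    'Weak': 0, 'Strong': 1,
    'Yes': 0, 'No': 1,
}
DECODE_CLASS = {0: 'Yes', 1: 'No'}

def read_rows(path):
    with open(path) as f:
        return list(csv.reader(f))[1:]        # skip the assumed header row

train = read_rows('weather_training.csv')
X_train = [[ENCODE[v] for v in row[1:-1]] for row in train]   # drop Day and label
y_train = [ENCODE[row[-1]] for row in train]

clf = CategoricalNB().fit(X_train, y_train)   # Laplace smoothing on by default

test = read_rows('weather_test.csv')          # file name assumed
print('Day Outlook Temperature Humidity Wind PlayTennis Confidence')
for row in test:
    x = [[ENCODE[v] for v in row[1:5]]]       # Outlook..Wind of this test day
    probs = clf.predict_proba(x)[0]           # normalized posteriors
    confidence = max(probs)
    if confidence >= 0.75:                    # report only confident predictions
        label = DECODE_CLASS[int(clf.classes_[probs.argmax()])]
        print(row[0], row[1], row[2], row[3], row[4], label,
              round(float(confidence), 2))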

Important Note: Answers to all questions must be written clearly and concisely, and each answer must be unmistakably delineated. You may resubmit multiple times until the deadline (the last submission will be considered).

NO LATE ASSIGNMENTS WILL BE ACCEPTED. ALWAYS SUBMIT WHATEVER YOU HAVE COMPLETED FOR PARTIAL CREDIT BEFORE THE DEADLINE!
