Project 3 Solution


General Guidelines:

  • Please prepare a typed report that describes what you did. The report should be as concise as possible while providing all necessary information required to replicate your plots.

  • For each problem, please provide, at the end of your report, a commented version of your Python code. Python notebook (.ipynb) files are preferred. You may put the code for all problems in a single .ipynb file, with text cells separating each problem.

P3-1. Revisit Text Documents Classification

Use the 20 newsgroups dataset embedded in scikit-learn:

from sklearn.datasets import fetch_20newsgroups

(See https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_20newsgroups.html#sklearn.datasets.fetch_20newsgroups)

  1. Load the following 4 categories from the 20 newsgroups dataset: categories = ['rec.autos', 'talk.religion.misc', 'comp.graphics', 'sci.space'].

  2. Build classifiers using the following methods:

    • Support Vector Machine (sklearn.svm.LinearSVC)

    • Naive Bayes classifiers (sklearn.naive_bayes.MultinomialNB)

    • K-nearest neighbors (sklearn.neighbors.KNeighborsClassifier)

    • Random forest (sklearn.ensemble.RandomForestClassifier)

    • AdaBoost classifier (sklearn.ensemble.AdaBoostClassifier)

Optimize the hyperparameters of each method and compare the results across methods.
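One possible way to organize the comparison in P3-1 is sketched below. It is not a prescribed solution: the TF-IDF features, the small parameter grids, the 3-fold cross-validation, and the helper names `compare_classifiers` and `load_newsgroups` are all assumptions made for illustration. The assignment only names the dataset, the categories, and the five classifiers.

```python
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.naive_bayes import MultinomialNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.svm import LinearSVC

CATEGORIES = ["rec.autos", "talk.religion.misc", "comp.graphics", "sci.space"]

# Each entry: (estimator, parameter grid). The grid values are illustrative
# assumptions, not tuned recommendations; widen them for a real search.
MODELS = {
    "LinearSVC": (LinearSVC(), {"clf__C": [0.1, 1.0, 10.0]}),
    "MultinomialNB": (MultinomialNB(), {"clf__alpha": [0.01, 0.1, 1.0]}),
    "KNeighbors": (KNeighborsClassifier(), {"clf__n_neighbors": [1, 3, 5]}),
    "RandomForest": (RandomForestClassifier(random_state=0),
                     {"clf__n_estimators": [50, 100]}),
    "AdaBoost": (AdaBoostClassifier(random_state=0),
                 {"clf__n_estimators": [50, 100]}),
}


def compare_classifiers(train_texts, train_labels, test_texts, test_labels, cv=3):
    """Grid-search each model on TF-IDF features and report its accuracy."""
    results = {}
    for name, (clf, grid) in MODELS.items():
        # Assumption: TF-IDF is used as the text representation.
        pipe = Pipeline([("tfidf", TfidfVectorizer()), ("clf", clf)])
        search = GridSearchCV(pipe, grid, cv=cv, scoring="accuracy")
        search.fit(train_texts, train_labels)
        results[name] = {
            "best_params": search.best_params_,
            "cv_accuracy": search.best_score_,
            "test_accuracy": search.score(test_texts, test_labels),
        }
    return results


def load_newsgroups():
    """Fetch the four categories (downloads the data on first use)."""
    from sklearn.datasets import fetch_20newsgroups
    train = fetch_20newsgroups(subset="train", categories=CATEGORIES)
    test = fetch_20newsgroups(subset="test", categories=CATEGORIES)
    return train.data, train.target, test.data, test.target
```

Calling `compare_classifiers(*load_newsgroups())` would return one dictionary per method, which can be turned into a results table for the report. Wrapping the vectorizer and classifier in a single `Pipeline` keeps the TF-IDF vocabulary from being fitted on the validation folds during the search.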

P3-2. Recognizing hand-written digits

Use the hand-written digits dataset embedded in scikit-learn:

from sklearn import datasets

digits = datasets.load_digits()

  1. Develop a multi-layer perceptron classifier to recognize images of hand-written digits. To build your classifier, you can use:

sklearn.neural_network.MLPClassifier

(See https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier)

Instructions: use sklearn.model_selection.train_test_split to split your dataset into random train and test subsets, where you set test_size=0.5.

  2. Optimize the hyperparameters of your neural network to maximize the classification accuracy. Show the confusion matrix of your neural network. Discuss and compare your results with those obtained using a support vector classifier (see https://scikit-learn.org/stable/auto_examples/classification/plot_digits_classification.html#sphx-glr-auto-examples-classification-plot-digits-classification-py).
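The P3-2 workflow could be sketched roughly as follows. Only `MLPClassifier`, `train_test_split`, and `test_size=0.5` come from the assignment. The grid values, `max_iter=500`, `random_state=0`, the 3-fold cross-validation, and the `SVC(gamma=0.001)` baseline settings are assumptions chosen to keep the example short.

```python
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC

digits = load_digits()
X, y = digits.data, digits.target  # 8x8 images flattened to 64 features

# Random 50/50 split, as the instructions require (the seed is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0
)

# Small illustrative grid; extend it (learning rate, depth, ...) as needed.
param_grid = {
    "hidden_layer_sizes": [(32,), (64,)],
    "alpha": [1e-4, 1e-2],
}
search = GridSearchCV(
    MLPClassifier(max_iter=500, random_state=0), param_grid, cv=3
)
search.fit(X_train, y_train)

# Evaluate the best network on the held-out half.
y_pred = search.predict(X_test)
mlp_accuracy = accuracy_score(y_test, y_pred)
mlp_confusion = confusion_matrix(y_test, y_pred)

# Support vector baseline on the same split (gamma value is an assumption).
svc = SVC(gamma=0.001).fit(X_train, y_train)
svc_accuracy = accuracy_score(y_test, svc.predict(X_test))

print("Best MLP parameters:", search.best_params_)
print("MLP test accuracy:", mlp_accuracy)
print("SVC test accuracy:", svc_accuracy)
print(mlp_confusion)
```

Rows of `mlp_confusion` are true digits and columns are predicted digits, so off-diagonal entries show which digits the network confuses. Evaluating both models on the same split keeps the comparison fair.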

P3-3. Nonlinear Support Vector Machine

  1. Randomly generate the following 2-class data points:

import numpy as np

np.random.seed(0)

X = np.random.rand(300, 2)*10-5

Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

  2. Develop a nonlinear SVM binary classifier (sklearn.svm.NuSVC).

  3. Plot these data points and the corresponding decision boundaries, similar to the figure on slide 131 of Chapter 4.
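A minimal sketch of the three steps in P3-3 follows. The data generation is taken from the assignment. The RBF kernel with default `nu` and `gamma`, the grid resolution, the colormaps, and the helper names `decision_grid` and `plot_decision_boundary` are assumptions, and the resulting figure is not guaranteed to match the slide.

```python
import numpy as np
from sklearn.svm import NuSVC

# Step 1: XOR-style data, exactly as given in the assignment.
np.random.seed(0)
X = np.random.rand(300, 2) * 10 - 5
Y = np.logical_xor(X[:, 0] > 0, X[:, 1] > 0)

# Step 2: nonlinear SVM (RBF kernel and default nu/gamma are assumptions).
clf = NuSVC(kernel="rbf", gamma="scale")
clf.fit(X, Y)


def decision_grid(model, lim=5.0, steps=200):
    """Evaluate the model's decision function on a square grid."""
    axis = np.linspace(-lim, lim, steps)
    xx, yy = np.meshgrid(axis, axis)
    points = np.c_[xx.ravel(), yy.ravel()]
    zz = model.decision_function(points).reshape(xx.shape)
    return xx, yy, zz


def plot_decision_boundary(model, X, Y):
    """Step 3: draw the data, the decision values, and the zero contour."""
    import matplotlib.pyplot as plt  # imported here so fitting works without it

    xx, yy, zz = decision_grid(model)
    fig, ax = plt.subplots(figsize=(6, 6))
    ax.imshow(
        zz,
        extent=(xx.min(), xx.max(), yy.min(), yy.max()),
        origin="lower",
        aspect="auto",
        cmap=plt.cm.PuOr_r,
    )
    # The decision boundary is where the decision function crosses zero.
    ax.contour(xx, yy, zz, levels=[0], linewidths=2, colors="k")
    ax.scatter(X[:, 0], X[:, 1], c=Y.astype(int), cmap=plt.cm.Paired,
               edgecolors="k", s=30)
    ax.set_xlabel("x1")
    ax.set_ylabel("x2")
    return fig
```

Calling `plot_decision_boundary(clf, X, Y)` followed by `matplotlib.pyplot.show()` displays the figure. Since the classes are defined by the signs of the two coordinates, a good fit should place the zero contour close to the two axes.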
