Description
- Marks will be awarded for the correctness and soundness of the outputs.
- Marks will be deducted for plagiarism.
- Proper indentation and appropriate comments (where necessary) are mandatory.
- Frameworks such as scikit-learn may be used.
- All benchmarks (accuracy, etc.), answers to questions, and supporting examples should be added in a separate file named ‘report’.
- All code must be submitted in ‘.py’ format. Even if you write it in ‘.ipynb’ format, download it in ‘.py’ format before submitting.
- Zip all the required files and name the zip file as follows:
  - <roll_no>_assignment_<#>.zip, e.g. 1501cs11_assignment_01.zip.
- Upload your assignment (the zip file) at the following link:
Problem Statement:
- The assignment aims to implement the K-Means and K-Medoid algorithms to cluster a dataset of socio-economic and health factors of countries, and from the clusters determine the overall development of each country.
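K-Medoids differs from K-Means in that each cluster centre must be an actual data point (a medoid) rather than a mean. Since K-Medoids may need to be written by hand, the following is a minimal sketch of the simple alternating variant (assign points to the nearest medoid, then re-pick each medoid), not the full PAM swap procedure. The function name `k_medoids`, its parameters, and the choice of Euclidean distance are assumptions made for illustration.

```python
import numpy as np

def k_medoids(X, k, max_iter=100, seed=0):
    """Alternating K-Medoids (illustrative sketch, not PAM).

    Returns the indices of the chosen medoids and a cluster label per row.
    """
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances between all rows, shape (n, n).
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Start from k distinct rows chosen at random.
    medoids = rng.choice(len(X), size=k, replace=False)
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest medoid.
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid if a cluster is empty
            # Update step: pick the member with the smallest summed
            # distance to all other members of the same cluster.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break  # medoids stopped changing: converged
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels
```

Precomputing the full distance matrix keeps the code short but costs O(n²) memory, which is acceptable for a dataset with one row per country.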
Implementation:
- Implement the K-Means and K-Medoid algorithms to cluster the given dataset as follows:
  - Perform standard data preprocessing, such as data cleaning (handling missing values) and data scaling (handling outliers).
  - Perform 5-fold cross-validation.
  - Classify the countries into the following categories:
    - Developed Country
    - Developing Country
    - Under-Developing Country
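The preprocessing and cross-validation steps above can be sketched with scikit-learn as follows. This is one possible arrangement, not a required design: median imputation, `RobustScaler`, the helper name `cluster_with_cv`, and the synthetic `X_demo` array (standing in for the country data, whose file is not named here) are all assumptions. Each fold fits the pipeline on the training rows only and assigns clusters to the held-out rows.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

def cluster_with_cv(X, n_clusters=3, n_splits=5, seed=0):
    """Fit impute -> scale -> K-Means on each training fold and assign
    clusters to the held-out fold.

    Returns a list of (test_indices, cluster_ids) pairs, one per fold.
    """
    kfold = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_predictions = []
    for train_idx, test_idx in kfold.split(X):
        model = make_pipeline(
            SimpleImputer(strategy="median"),  # fill missing values
            RobustScaler(),  # median/IQR scaling, less sensitive to outliers
            KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
        )
        model.fit(X[train_idx])
        fold_predictions.append((test_idx, model.predict(X[test_idx])))
    return fold_predictions

# Synthetic stand-in for the real feature matrix (assumption).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 4))
X_demo[3, 1] = np.nan  # one missing value for the imputer to handle
folds = cluster_with_cv(X_demo)
```

Cluster ids are arbitrary integers and can differ between folds, so mapping them to the three development categories (for example by inspecting each cluster's average income or child mortality) is a separate step.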
Dataset:
- Model code
- Accuracy, precision, recall, and F1 score for each fold
- Visualization of the clusters after the model has converged
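Accuracy, precision, recall, and F1 score all require reference labels, and the source of those labels is not specified here. Assuming a reference category is available for each country in a fold, one hedged way to score a clustering is to map every cluster to the most common reference label among its members and then apply the usual classification metrics. The helper name `fold_scores` and the use of macro averaging are assumptions.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def fold_scores(y_true, cluster_ids):
    """Score one fold by mapping each cluster to its majority reference label."""
    y_true = np.asarray(y_true)
    cluster_ids = np.asarray(cluster_ids)
    y_pred = np.empty_like(y_true)
    for c in np.unique(cluster_ids):
        mask = cluster_ids == c
        # Most frequent reference label inside this cluster.
        values, counts = np.unique(y_true[mask], return_counts=True)
        y_pred[mask] = values[np.argmax(counts)]
    precision, recall, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical fold: cluster ids are a relabelling of the reference classes.
scores = fold_scores([0, 0, 1, 1, 2, 2], [2, 2, 0, 0, 1, 1])
```

Because the mapping is chosen on the same rows that are scored, the numbers are optimistic; deriving the mapping on the training fold and applying it to the held-out fold gives a stricter estimate.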