Description
Lab Objective:
In this lab, you will analyze diabetic retinopathy in the following three steps. First, you need to write your own custom DataLoader with the PyTorch framework. Second, you need to classify diabetic retinopathy grading via the ResNet architecture [1]. Finally, you have to calculate the confusion matrix to evaluate the performance.
Turn in:
- Experiment Report (.pdf)
- Source code
Notice: zip all files into one archive and name it like 「DLP_LAB4_yourStudentID_name.zip」, e.g. 「DLP_LAB4_0851909_陳昭宇.zip」
Requirements:
- Implement the ResNet18 and ResNet50 architectures and load parameters from a pretrained model.
- Compare and visualize the accuracy trend between the pretrained model and the model trained without pretraining for the same architecture; you need to plot the accuracy (not loss) of each epoch during both the training and testing phases.
- Implement your own custom DataLoader.
- Calculate and plot the confusion matrix.
Dataset - Diabetic Retinopathy Detection (Kaggle):
Diabetic retinopathy is the leading cause of blindness in the working-age population of the developed world. It is estimated to affect over 93 million people. This dataset provides a large set of high-resolution retina images taken under a variety of imaging conditions. Format: .jpeg
Reference:
https://www.kaggle.com/c/diabetic-retinopathy-detection#description
- Prepare Data
The dataset contains 35,124 images, which we divided into 28,099 training images and 7,025 testing images. The image resolution is 512×512 and the images have been preprocessed.
Download link:
https://drive.google.com/open?id=1RTmrk7Qu9IBjQYLczaYKOvXaHWBS0o72
- Custom Dataloader
  - This is the skeleton that you have to fill in to build a custom dataset; refer to “dataloader.py”.
  - The __init__ function is where the initial logic happens, such as reading a csv and assigning transforms.
  - The __getitem__ function returns the data and label; you can process the data (loading the image, preprocessing, transforming) before returning it.
  - The index parameter of the __getitem__ function indicates the nth data/image you are going to return.
  - Reference:
    https://pytorch.org/tutorials/beginner/data_loading_tutorial.html
    https://github.com/utkuozbulak/pytorch-custom-dataset-examples
  - You can use the getData function to read all image names and ground-truth labels. A minimal sketch of the skeleton is shown below.
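The following is only a sketch of the skeleton described above. It assumes getData returns parallel arrays of image names and labels (as in the provided dataloader.py); the class name, csv file names, and folder layout are illustrative and should be adapted to the provided files.

```python
import os

import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


def getData(mode):
    # Illustrative version of the provided helper: reads the image names and
    # ground-truth labels for the given split ('train' or 'test').
    img = pd.read_csv(f'{mode}_img.csv', header=None)
    label = pd.read_csv(f'{mode}_label.csv', header=None)
    return img.values.squeeze(), label.values.squeeze()


class RetinopathyDataset(Dataset):
    def __init__(self, root, mode, transform=None):
        # Initial logic: read the csv files and assign the transforms.
        self.root = root
        self.img_names, self.labels = getData(mode)
        self.transform = transform if transform is not None else transforms.ToTensor()

    def __len__(self):
        # Number of samples in this split.
        return len(self.img_names)

    def __getitem__(self, index):
        # index selects the nth image/label pair of the split.
        path = os.path.join(self.root, str(self.img_names[index]) + '.jpeg')
        img = Image.open(path).convert('RGB')
        img = self.transform(img)
        label = torch.tensor(int(self.labels[index]), dtype=torch.long)
        return img, label
```

Wrapping such a dataset in torch.utils.data.DataLoader then yields shuffled mini-batches of (image, label) tensors for training.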
- ResNet
ResNet (Residual Network) was the winner of ILSVRC 2015 in image classification, detection, and localization, as well as the winner of MS COCO 2015 detection and segmentation [2].
- Degradation problem: as the network depth increases, accuracy gets saturated (which might be unsurprising) and then degrades rapidly. This is not caused by overfitting; it is attributed to the vanishing/exploding gradient problem.
Figure: Training error (left) and test error (right) on CIFAR-10 with 20-layer and 56-layer “plain” networks. The deeper network has higher training error, and thus higher test error.
- Skip/shortcut connection: to address the problem of vanishing/exploding gradients, a skip/shortcut connection is added so that the input x is added to the output after a few weight layers, as shown below [2].
- Building the residual basic block and bottleneck block:
Figure: Basic block and bottleneck block.
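For reference only, here is a minimal sketch of the two block types following the structure in [1]; the exact channel counts, strides, and the downsample projection for mismatched shapes are left to your design.

```python
import torch.nn as nn


class BasicBlock(nn.Module):
    """Two 3x3 convolutions with an identity (or 1x1 projection) shortcut; used in ResNet18."""
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 3, stride, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, 1, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample  # 1x1 conv when x and F(x) have different shapes

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)  # F(x) + x


class Bottleneck(nn.Module):
    """1x1 reduce -> 3x3 -> 1x1 expand, with a shortcut; used in ResNet50."""
    expansion = 4

    def __init__(self, in_channels, out_channels, stride=1, downsample=None):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, 1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, 3, stride, 1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.conv3 = nn.Conv2d(out_channels, out_channels * self.expansion, 1, bias=False)
        self.bn3 = nn.BatchNorm2d(out_channels * self.expansion)
        self.relu = nn.ReLU(inplace=True)
        self.downsample = downsample

    def forward(self, x):
        identity = x if self.downsample is None else self.downsample(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.relu(self.bn2(self.conv2(out)))
        out = self.bn3(self.conv3(out))
        return self.relu(out + identity)
```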
- Using a pretrained model and reinitializing specific layers:
https://pytorch.org/tutorials/beginner/finetuning_torchvision_models_tutorial.html
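One common way to do this (a sketch, assuming the five retinopathy grades 0–4 as output classes and the pretrained= interface from the torchvision tutorial above) is to load the ImageNet weights and replace only the final fully connected layer:

```python
import torch.nn as nn
from torchvision import models


def build_resnet(arch='resnet18', num_classes=5, pretrained=True):
    # Load ResNet18/ResNet50 with (or without) ImageNet weights, then
    # reinitialize the last fully connected layer for our class count.
    constructor = models.resnet18 if arch == 'resnet18' else models.resnet50
    model = constructor(pretrained=pretrained)
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model
```

Setting pretrained=False in the same helper gives the without-pretraining baseline required for the comparison figures.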
- Your accuracy visualization figure should look like the example below.
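A possible way to produce such a figure (a sketch; the curve names and per-epoch accuracy lists are placeholders that you would collect during training and testing):

```python
import matplotlib.pyplot as plt


def plot_accuracy_curves(history, title='Result comparison (ResNet18)'):
    # history: dict mapping a curve name, e.g. 'Train (with pretraining)',
    # to a list of per-epoch accuracies in percent.
    plt.figure()
    for name, accuracies in history.items():
        plt.plot(range(1, len(accuracies) + 1), accuracies, label=name)
    plt.xlabel('Epochs')
    plt.ylabel('Accuracy (%)')
    plt.title(title)
    plt.legend()
    plt.savefig('accuracy_comparison.png')
```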
A confusion matrix is a table that is often used to describe the performance of a classification model on a set of test data for which the true values are known. It is extremely useful for measuring Recall, Precision, Specificity, Accuracy, and most importantly the AUC-ROC curve [3].
- Sample code:
You can use the following example and library to complete this section.
https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py
- The confusion matrix will be similar to the following example; a sketch of how to compute and plot it is given below.
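For example, with the scikit-learn utilities from the link above you could normalize each row so the diagonal shows per-class recall (a sketch; class labels 0–4 and the output file name are assumptions):

```python
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix


def plot_confusion(y_true, y_pred, num_classes=5):
    # Row-normalized confusion matrix: each row sums to 1, so the diagonal
    # shows the per-class recall on the test set.
    cm = confusion_matrix(y_true, y_pred,
                          labels=list(range(num_classes)), normalize='true')
    disp = ConfusionMatrixDisplay(cm, display_labels=list(range(num_classes)))
    disp.plot(cmap=plt.cm.Blues, values_format='.2f')
    plt.title('Normalized confusion matrix')
    plt.savefig('confusion_matrix.png')
```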
Hyper-parameters:
- Batch size = 4
- Learning rate = 1e-3
- Epochs = 10 (ResNet18), 5 (ResNet50)
- Optimizer: SGD (momentum = 0.9, weight_decay = 5e-4)
- Loss function: torch.nn.CrossEntropyLoss()
You can adjust the hyper-parameters according to your own ideas.
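A sketch of how these hyper-parameters could be wired together, reusing the hypothetical RetinopathyDataset and build_resnet helpers from the earlier sketches (the data path and printed metric are illustrative):

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Batch size 4, as listed above; 'data/train' is an assumed image folder.
train_loader = DataLoader(RetinopathyDataset('data/train', 'train'),
                          batch_size=4, shuffle=True, num_workers=4)

model = build_resnet('resnet18', num_classes=5, pretrained=True).to(device)
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3,
                            momentum=0.9, weight_decay=5e-4)

for epoch in range(10):  # 10 epochs for ResNet18, 5 for ResNet50
    model.train()
    correct = 0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        correct += (outputs.argmax(dim=1) == labels).sum().item()
    # Record this (and the testing accuracy) every epoch for the comparison figure.
    print(f'epoch {epoch + 1}: train accuracy = '
          f'{100.0 * correct / len(train_loader.dataset):.2f}%')
```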
Report Spec
- Introduction (20%)
- Experiment setups (30%)
  - The details of your model (ResNet)
  - The details of your Dataloader
  - Describing your evaluation through the confusion matrix
- Experimental results (30%)
  - The highest testing accuracy
    - Screenshot
    - Anything you want to present
  - Comparison figures
    - Plotting the comparison figures (ResNet18/50, with/without pretraining)
- Discussion (20%)
  - Anything you want to share
Criterion of result (40%):
- Accuracy >= 82%: 100 pts
- Accuracy 80~82%: 90 pts
- Accuracy 75~80%: 80 pts
- Accuracy < 75%: 70 pts
Score: 40% experimental results + 60% (report + demo score)
P.S. If the zip file name or the report spec has a format error, there will be a penalty (-5).
In the demo phase, you need to load your trained model and evaluate it.
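A sketch of what demo-time loading and evaluation could look like, assuming the weights were saved with torch.save(model.state_dict(), ...) and reusing the hypothetical helpers from the earlier sketches (the checkpoint name and test data path are placeholders):

```python
import torch
from torch.utils.data import DataLoader

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
test_loader = DataLoader(RetinopathyDataset('data/test', 'test'), batch_size=4)

# Rebuild the architecture, then restore the trained weights from disk.
model = build_resnet('resnet18', num_classes=5, pretrained=False).to(device)
model.load_state_dict(torch.load('resnet18_pretrained.pt', map_location=device))
model.eval()

correct = 0
with torch.no_grad():
    for images, labels in test_loader:
        preds = model(images.to(device)).argmax(dim=1)
        correct += (preds == labels.to(device)).sum().item()
print(f'test accuracy = {100.0 * correct / len(test_loader.dataset):.2f}%')
```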
References:
[1] He, Kaiming, et al. “Deep residual learning for image recognition.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
[2] Review: ResNet. https://towardsdatascience.com/review-resnet-winner-of-ilsvrc-2015-image-classification-localization-detection-e39402bfa5d8
[3] Understanding Confusion Matrix. https://towardsdatascience.com/understanding-confusion-matrix-a9ad42dcfd62
[4] Confusion Matrix. http://www2.cs.uregina.ca/~dbd/cs831/notes/confusion_matrix/confusion_matrix.html