Description
1 Overview
The goal of this assignment is to implement basic image processing functions and assemble them into a data augmentation pipeline for training machine learning models. The assignment covers image processing techniques (e.g., resizing, cropping, color manipulation, and rotation) and the data input pipeline.
2 Setup
You won’t need cloud computing or a GPU for this assignment.
This assignment is NOT team-based and you must complete the assignment on your own.
You will need to install some Python packages and fill in the missing code in:
./code/student_code.py
You can test your code by running the provided notebook: jupyter notebook ./code/proj1.ipynb
You can generate the submission file once you’ve finished the project using: python zip_submission.py
This assignment is worth 12 points plus 2 bonus points.
3 Details
This project is intended to familiarize you with Python and image processing. If you do not have previous experience with Python or image processing, please refer to the resources in our tutorial. All tutorial materials can be found on Canvas.
3.1 Set Up the Computing Environment (2 Pts)
The first step is to properly set up a computing environment for the assignment.
Install Anaconda or Miniconda (recommended). We recommend using Conda to manage your packages.
The following packages are needed: OpenCV, NumPy, Matplotlib, Jupyter Notebook, PyTorch. You might need to do a bit of research on how to install all packages (and hopefully their latest versions). One potential issue is described at https://github.com/pytorch/vision/issues/4076. Upon successful installation, run
conda list > ./results/packages.txt
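Before exporting the package list, it can help to confirm that every package actually imports. The following is a minimal sketch, not part of the starter code: the function name check_packages is hypothetical, and the module names assume the usual import names (cv2 for OpenCV, torch for PyTorch, notebook for Jupyter Notebook).

```python
import importlib

# Usual import names for the required packages (an assumption of this sketch).
REQUIRED = ["cv2", "numpy", "matplotlib", "notebook", "torch"]


def check_packages(names=REQUIRED):
    """Return {module_name: version string, or None if the import fails}."""
    report = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Missing package or broken installation.
            report[name] = None
        else:
            # Not every module exposes __version__; fall back to "unknown".
            report[name] = getattr(module, "__version__", "unknown")
    return report
```

Calling check_packages() and printing the returned dictionary shows which packages are missing (value None) before you run the notebook.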
3.2 Image Processing
Our next step is to implement a set of functions that manipulate an input image. These functions include resizing, cropping, color jittering, and rotation.
Image Resizing (2 Pts): Resampling is one of the fundamental operations in image processing. You can find many implementations in different packages, yet resampling can be tricky to get right. We have provided a helper function for resizing an input image (./code/utils/image_resize). Your goal is to implement a version of adaptive resizing, often used in machine learning models. Specifically, given an input image, you will resize the image so that its shortest side matches a pre-specified length while the aspect ratio is preserved. Please fill in the missing code in class Scale. You must use the provided image_resize function. More details can be found in the code and comments.
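The size computation behind shortest-side resizing can be sketched as follows. This is only an illustration under stated assumptions: the function name scaled_size is hypothetical, rounding to the nearest integer is one possible choice, and the actual resampling is left to the provided resize helper, whose signature is not shown here.

```python
def scaled_size(width, height, target_short_side):
    """Return (new_width, new_height) so the shorter side equals target_short_side.

    The aspect ratio is preserved up to integer rounding.
    """
    if width <= 0 or height <= 0 or target_short_side <= 0:
        raise ValueError("all sizes must be positive")
    scale = target_short_side / min(width, height)
    if width <= height:
        # Width is the shorter side: pin it to the target, scale the height.
        return target_short_side, max(1, round(height * scale))
    # Height is the shorter side: pin it to the target, scale the width.
    return max(1, round(width * scale)), target_short_side
```

Keep in mind that a NumPy image array is indexed as (height, width, channels), so the width and height have to be read from shape[1] and shape[0] respectively.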
Image Cropping (2 Pts): Cropping selects a region from an image. You will implement a more advanced version of image cropping. Concretely, this version crops an image by sampling a random region, where the area of the region is drawn from a uniform distribution over a given range of areas, and the aspect ratio is constrained to a pre-specified interval. This region is then resized to a fixed size. This technique is described in [2] and has been widely used for training deep networks. Please fill in the code in the class RandomSizedCrop. We recommend using NumPy for cropping the region and using our provided image_resize function to resize the region. Again, more details can be found in the code and comments.
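One way to sample such a region is sketched below. All names and default values are hypothetical placeholders rather than the ones used in the starter code, and both the retry loop and the fallback to the full image are assumptions of this sketch; follow the comments in the starter code for the required behavior.

```python
import math
import random


def sample_crop_box(width, height, area_range=(0.25, 1.0),
                    ratio_range=(3 / 4, 4 / 3), num_trials=10, rng=random):
    """Sample (x, y, w, h) of a random crop inside a width x height image.

    area_range is a fraction of the image area; ratio_range bounds w / h.
    """
    for _ in range(num_trials):
        target_area = rng.uniform(*area_range) * width * height
        aspect = rng.uniform(*ratio_range)  # desired w / h
        w = int(round(math.sqrt(target_area * aspect)))
        h = int(round(math.sqrt(target_area / aspect)))
        if 0 < w <= width and 0 < h <= height:
            # Choose the top-left corner so the box stays inside the image.
            x = rng.randint(0, width - w)
            y = rng.randint(0, height - h)
            return x, y, w, h
    # Assumed fallback when no sampled box fits: use the whole image.
    return 0, 0, width, height
```

With a NumPy image, the sampled box can be cut out as image[y:y + h, x:x + w] and then passed to the resize helper.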
Color Jitters (2 Pts): A small perturbation in the color space can lead to images with drastically different pixel values that are still perceptually meaningful. You will implement a simple version of color jitters in the class RandomColor. For each color channel, your code will sample a scale factor $\alpha$ from a uniform distribution $U(1 - r, 1 + r)$ with $r \in (0, 1)$. $\alpha$ is then multiplied with the corresponding color channel. This is done independently for each color channel. The technique is described in several papers, e.g., [1]. Details can be found in the code and comments. Hint: you can make this very efficient by building a lookup table for each color channel.
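The lookup-table hint can be illustrated with a short sketch. It assumes an 8-bit image stored as an H x W x C NumPy array; the function name color_jitter, the rng argument, and the choice to round and clip to [0, 255] are assumptions of this sketch, not requirements taken from the starter code.

```python
import numpy as np


def color_jitter(image, r, rng=None):
    """Scale each channel of a uint8 H x W x C image by a factor from U(1 - r, 1 + r)."""
    rng = np.random.default_rng() if rng is None else rng
    levels = np.arange(256, dtype=np.float32)  # every possible 8-bit input value
    out = np.empty_like(image)
    for c in range(image.shape[2]):
        alpha = rng.uniform(1.0 - r, 1.0 + r)  # independent factor per channel
        # 256-entry table: scaled, rounded, and clipped output for each input value.
        lut = np.clip(np.rint(levels * alpha), 0, 255).astype(np.uint8)
        out[..., c] = lut[image[..., c]]  # one table lookup per pixel
    return out
```

The table costs 256 multiplications per channel regardless of image size, after which each pixel only needs an array lookup.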
Rotation (2 Pts): 2D rotation is a simple form of parametric warping. It rotates the image pixels around the center of the image by a random angle (in degrees) sampled uniformly from a given interval. You will need to implement this function in class RandomRotate.
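The underlying transform can be sketched as a 2 x 3 affine matrix. The matrix below has the same form as the one documented for cv2.getRotationMatrix2D with scale 1; the function names, the symmetric sampling interval, and the choice of ((width - 1) / 2, (height - 1) / 2) as the center are assumptions of this sketch.

```python
import math
import random


def sample_angle(max_deg, rng=random):
    """Draw a rotation angle uniformly from [-max_deg, max_deg] degrees."""
    return rng.uniform(-max_deg, max_deg)


def rotation_matrix(width, height, angle_deg):
    """Return a 2 x 3 affine matrix rotating by angle_deg around the image center.

    Positive angles rotate counter-clockwise with the origin at the top-left.
    """
    cx, cy = (width - 1) / 2.0, (height - 1) / 2.0
    a = math.cos(math.radians(angle_deg))
    b = math.sin(math.radians(angle_deg))
    return [[a, b, (1 - a) * cx - b * cy],
            [-b, a, b * cx + (1 - a) * cy]]


def apply_affine(matrix, x, y):
    """Map the point (x, y) through a 2 x 3 affine matrix."""
    (m00, m01, m02), (m10, m11, m12) = matrix
    return m00 * x + m01 * y + m02, m10 * x + m11 * y + m12
```

The image center is a fixed point of this transform. Converted to a float NumPy array, such a matrix is the kind of input that cv2.warpAffine expects.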
Bonus (rotation without black pixels) (+2 Pts): In PIL, you can simply rotate an image using the function Image.rotate. However, this function will create empty black pixels in the result image. See an example in Figure 1. Oftentimes, we want to avoid these black pixels. This can be done by cropping the result image. While there are many different ways of cropping, we are interested in finding a rectangular region with the maximum area that does not contain a single empty pixel. Bonus points will be provided for (1) deriving the analytic solution of finding the rectangular region of the max area without an empty pixel; and (2) implementing the solution in the code.
Rubric: With a bit of effort, you can find the solution online. The bonus points are granted only for those solutions with full derivation.
Hint: You should check cv2.warpAffine.
Figure 1: How can you find the rectangular region of the max area without an empty pixel after a random 2D rotation?
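When working on the bonus, it is useful to check a candidate crop independently of your derivation. The sketch below does not compute the maximum-area rectangle; it only tests whether a given centered, axis-aligned crop lies entirely inside the rotated image content. The function name is hypothetical, and it assumes the image is rotated about its center onto a canvas of the same size as the input.

```python
import math


def crop_is_free_of_empty_pixels(width, height, angle_deg, crop_w, crop_h, eps=1e-9):
    """Check that a centered crop_w x crop_h crop contains no empty pixels.

    The width x height image is assumed to be rotated by angle_deg about its
    center, with the output canvas keeping the original size.
    """
    if crop_w > width or crop_h > height:
        return False  # the crop would not even fit on the canvas
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    for sx in (-1, 1):
        for sy in (-1, 1):
            # Crop corner relative to the image center.
            x, y = sx * crop_w / 2.0, sy * crop_h / 2.0
            # Rotate the corner back into the frame of the unrotated image.
            u = c * x + s * y
            v = -s * x + c * y
            if abs(u) > width / 2.0 + eps or abs(v) > height / 2.0 + eps:
                return False
    # The rotated image is convex, so checking the four corners is sufficient.
    return True
```

For example, a 100 x 100 image rotated by 45 degrees admits a centered 70 x 70 crop but not the full 100 x 100 frame.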
3.3 Data Augmentation and Input Pipeline (2 Pts)
Putting things together, we have included a full data augmentation and input pipeline at the end of the notebook, built on PyTorch's DataLoader class. Please go through the implementation, run the code, and check the results. Note that you have to finish the code in Sec. 3.2 to get this part working.
Composition of Image Transforms: We have provided a sample implementation in the helper code that composes a series of transforms and applies them to an input image. Because the augmentation is random, you should run this part of the code multiple times so that errors triggered only by corner cases do not go unnoticed.
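The idea of composing transforms and running them repeatedly can be sketched as follows. This is not the provided helper code: the class name Compose and the function stress_test are hypothetical, and the sketch only assumes that each transform is a callable taking an image and returning an image.

```python
import random


class Compose:
    """Apply a sequence of callables to an input, one after another."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def __call__(self, image):
        for transform in self.transforms:
            image = transform(image)
        return image


def stress_test(pipeline, make_input, num_runs=100, seed=0):
    """Run the pipeline num_runs times and collect (run_index, exception) pairs."""
    # Seeds Python's random module only; NumPy's generator is seeded separately.
    random.seed(seed)
    failures = []
    for i in range(num_runs):
        try:
            pipeline(make_input())
        except Exception as exc:
            failures.append((i, exc))
    return failures
```

An empty list from stress_test means no run raised an exception; it does not prove the outputs are correct, so the transformed images should still be inspected visually.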
4 Writeup
For this assignment, and all other assignments, you must submit a project report in PDF. In the report you will describe your algorithm and any decisions you made to write your algorithm a particular way. You will show and discuss the results of your algorithm, and answer all questions in the assignment. For this project, please include the results of your transformed images (the notebook should have saved such images already). Also, discuss anything extra you did (e.g., your derivation for the bonus question). Feel free to add any other information you feel is relevant. A good writeup doesn’t just show results; it also shares insights or draws conclusions from your experiments.
5 Handing in
This is very important as you will lose points if you do not follow instructions. Every time after the first that you do not follow instructions, you will lose 5%. Hand in your project as a zip file through Canvas. You can create this zip file using python zip_submission.py. The zip file you hand in must contain the following folders:
code/ – directory containing all your code for this assignment
writeup/ – directory containing your report (PDF) for this assignment
results/ – directory containing your results (generated by the notebook)
Do not use absolute paths in your code (e.g. /user/classes/proj1). Your code will break if you use absolute paths and you will lose points because of it. Please use relative paths as the starter code already did. Do not turn in the /data/ folder unless you have added new data.
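As an illustration of the relative-path rule, paths can be built from the project root with pathlib. This is a minimal sketch: the helper name result_path is hypothetical, and it assumes the code is run from the directory that contains code/ and results/.

```python
from pathlib import Path

# Relative to the directory the code is run from (assumed: the project root).
RESULTS_DIR = Path(".") / "results"


def result_path(filename):
    """Return a relative path inside the results/ directory."""
    return RESULTS_DIR / filename
```

A path built this way keeps working when the project folder is unzipped on another machine, which is not true of a path such as /user/classes/proj1.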
References
[1] D. Eigen, C. Puhrsch, and R. Fergus. Depth map prediction from a single image using a multi-scale deep network. In Advances in Neural Information Processing Systems 27, pages 2366–2374, 2014.
[2] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich. Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1–9, 2015.