
Multiple Instance Learning: image classification

In this exercise we will make an image classifier, using a Multiple Instance Learning approach. To keep it simple, we will use a relatively tiny dataset, with simple features and a simple classifier.

We will try to distinguish images of apples from bananas. Below are two examples of images, the first containing an apple, the second containing a banana. Notice that the backgrounds in the images are quite similar.

To make an image classifier based on MIL, we have to take several steps:

  1. Define the instances that constitute a bag. Here we will make use of a mean-shift image segmentation to segment an image into subparts.

  2. Define the features that characterize each instance. Here we will just use the (average) red, green and blue color.

  3. Define the MIL classifier. Here we will use a naive approach, that is, a standard classifier trained on the individual instances.

  4. Define the combination rule that combines the predicted labels on the individual instances into a predicted label for a bag.

In the coming exercise we will go through it step by step.

  • The Naive MIL classifier

    1. Get the data sivalsmall.zip and some additional Matlab functions additionalcode.zip from Blackboard. The data should contain two folders, one with apple images, the other with banana images. The additional code contains a function im_meanshift to segment an image, and a function bags2dataset to convert a cell array of bags into a Prtools dataset.

    2. Implement a script that reads all images from a given directory. You can use the Matlab functions dir and imread; a minimal sketch is given below.
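
A minimal sketch, assuming the images are JPEG files in a folder named apple (adjust the folder name and extension to the unpacked sivalsmall.zip):

```matlab
% Read all images from one directory into a cell array.
% Folder name and extension are assumptions; adjust as needed.
files = dir(fullfile('apple', '*.jpg'));
ims = cell(1, numel(files));
for i = 1:numel(files)
    ims{i} = imread(fullfile('apple', files(i).name));
end
```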

    3. Next implement a function extractinstances that segments an image using the mean shift algorithm (using im_meanshift), computes the average red, green and blue color per segment, and returns the resulting features in a small data matrix.

Notice that the number of segments that you obtain depends on a width parameter that you have to supply to im_meanshift. Set this parameter such that the background in the first ‘apple’ image is roughly one segment. What value of the width parameter did you find? A sketch of extractinstances is given below.
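
A sketch of extractinstances, under the assumption that im_meanshift returns a label image in which every pixel carries the integer index of its segment (check the help of the supplied function for its actual output format):

```matlab
function x = extractinstances(im, width)
% Segment the image with mean shift and return the average red, green
% and blue value per segment, one row (instance) per segment.
% Assumption: im_meanshift returns an integer label image.
lab = im_meanshift(im, width);
im  = double(im);
nseg = max(lab(:));
x = zeros(nseg, 3);
for k = 1:nseg
    mask = (lab == k);              % pixels belonging to segment k
    for c = 1:3
        chan = im(:,:,c);
        x(k,c) = mean(chan(mask));  % average color in channel c
    end
end
```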

    4. Create a function gendatmilsival that creates a MIL dataset by going through all apple and banana images, extracting the instances per image, and storing them in a Prtools dataset with bags2dataset (a sketch follows below). Make sure you give the apple and banana objects a different class label (for instance label 1 and 2). The resulting dataset should contain all instances of all images, and the label of each instance is copied from the bag label. Additionally, the bag identifiers are stored in the dataset. If you are interested, you can retrieve them using bagid = getident(a,’milbag’).


How many bags did you obtain? How many features do the instances have? How many instances are there per bag? Make a scatterplot to see if the instances from the two classes are somewhat separable.
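
A sketch of gendatmilsival along these lines; the folder names, the image extension and the exact calling convention of bags2dataset are assumptions:

```matlab
function a = gendatmilsival(width)
% Build a MIL dataset: one bag of instances per image, with the bag
% label copied to all its instances by bags2dataset.
bags = {}; labs = [];
classes = {'apple', 'banana'};      % folder names assumed
for d = 1:2
    files = dir(fullfile(classes{d}, '*.jpg'));
    for i = 1:numel(files)
        im = imread(fullfile(classes{d}, files(i).name));
        bags{end+1} = extractinstances(im, width);
        labs(end+1) = d;            % label 1 = apple, 2 = banana
    end
end
a = bags2dataset(bags, labs(:));    % calling convention assumed
```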

    5. Create a function combineinstlabels that accepts a list of labels and outputs a single label obtained by majority voting, as sketched below.
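
For numeric labels this can be a one-liner around mode, which returns the most frequent value (a sketch; for Prtools character labels you could first convert them to indices with renumlab):

```matlab
function lab = combineinstlabels(labels)
% Majority vote: return the most frequent label in the list.
lab = mode(labels(:));
```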

    6. Now we are almost ready to classify images… First we have to train a classifier; let’s use a Fisher classifier (fisherc) for this. Then apply the trained classifier to each instance in a bag, classify the instances (using labeld), and combine the label outputs (using your combineinstlabels) to get a bag label; a sketch of this procedure is given below.

How many apple images are misclassified as banana? And vice versa? Why is this error estimate not trustworthy?
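
A sketch of this per-bag evaluation, assuming the dataset a from gendatmilsival and the functions above:

```matlab
% Train a Fisher classifier on the individual instances, then label
% each bag by majority voting over its predicted instance labels.
w = a * fisherc;
bagid = getident(a, 'milbag');      % bag identifier of every instance
ids = unique(bagid);
baglab = zeros(numel(ids), 1);
for i = 1:numel(ids)
    bag = a(bagid == ids(i), :);    % all instances of one bag
    instlab = bag * w * labeld;     % predicted label per instance
    baglab(i) = combineinstlabels(instlab);
end
```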

    7. Invent at least two ways in which you may improve the performance of this classifier (think of how you obtained your MIL dataset).

  • MILES

The classifier that we used in the previous section was very simple. In this section we implement one of the most successful MIL classifiers, called MILES. Also have a look at the article “MILES: Multiple-Instance Learning via Embedded Instance Selection” by Yixin Chen, Jinbo Bi, and James Ze Wang, IEEE Transactions on Pattern Analysis and Machine Intelligence (2006): 1931–1947.

    1. Implement a function bagembed that represents a bag of instances Bi by a feature vector m(Bi), using equation (7) from the article; a sketch is given after the question below.

How large will this feature vector m(Bi) become for our apple-banana problem?
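
In the article, a bag Bi is embedded by its maximal similarity to each instance x^k in a pool of n instances: m(Bi) = [s(x^1,Bi), ..., s(x^n,Bi)]', with s(x^k,Bi) = max_j exp(-||x_ij - x^k||^2 / sigma^2). A sketch of bagembed under that reading, with the instance pool passed in as a matrix X:

```matlab
function m = bagembed(bag, X, sigma)
% Embed a bag (rows = instances) as its maximal similarity to every
% instance in the pool X (rows = concepts), cf. equation (7).
n = size(X, 1);
m = zeros(n, 1);
for k = 1:n
    d2 = sum((bag - X(k,:)).^2, 2); % squared distances to x^k
                                    % (use bsxfun on Matlab < R2016b)
    m(k) = max(exp(-d2 / sigma^2)); % closest instance dominates
end
```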

    2. Make a Prtools dataset with the vectors m(Bi) and their corresponding labels yi. Choose a sensible value for σ such that not all numbers become 0 or 1.¹ Train on this large dataset an L1 support vector classifier (or, more correctly called, LIKNON): liknonc. A sketch of these steps is given below.

¹ For me, σ = 25 appeared to work reasonably well.
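
Putting it together, a sketch that embeds every bag against the pool of all training instances and trains LIKNON. The Prtools-style interface of liknonc (dataset in, trained mapping out) is an assumption, as is prdataset (called dataset in older Prtools versions):

```matlab
X = +a;                             % pool: all training instances
bagid = getident(a, 'milbag');
ids = unique(bagid);
M = zeros(numel(ids), size(X, 1));  % one embedded bag m(Bi) per row
y = zeros(numel(ids), 1);
for i = 1:numel(ids)
    I = (bagid == ids(i));
    M(i,:) = bagembed(+a(I,:), X, 25)';  % sigma = 25, see the footnote
    labs = getlab(a(I,:));
    y(i) = labs(1);                 % all instances carry the bag label
end
b = prdataset(M, y);
w = b * liknonc;                    % train the L1 SVM (LIKNON)
```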


    3. Test the LIKNON classifier on this dataset: how many errors does this classifier make? Is this classifier better than the naive MIL classifier trained in the previous section? What can you do to make this MILES classifier perform better?

  • Another MIL classifier

    1. Finally, implement your own MIL classifier. Any classifier will do, except for the naive approach and the MILES classifier (obviously). It may be something you invented yourself, or some classifier from the literature.

Explain what MIL classifier you are implementing, give the code, and compare its performance with that of the naive classifier (i.e. the Fisher classifier with a majority vote) and of the MILES classifier.

