In this assignment you will practice HoG, DenseSIFT, and SIFT feature extraction from images, and then test the VLAD and Fisher Vector aggregation schemes to generate a uniform-dimension feature representation for each image. Then, for a given set of images and their pairwise matching/non-matching information from the CDVS data set, you will test the performance of different combinations of features and aggregation schemes to see which one gives the best TPR-FPR ROC performance.
[Q1, 20pts] Compute image features. Create a MATLAB/Python function that computes an n x d feature matrix by calling the VLFeat HoG, DSIFT, and SIFT functions (note that VLFeat also has a Python version), implementing the following function:
function [f] = getImageFeatures(im, opt)
% im       - input image; let us make them all grayscale only, so it is an h x w matrix
% opt.type - one of {'hogf', 'sift', 'dsft'} for HoG, SIFT, and DenseSIFT
% f        - n x d matrix containing n features of dimension d
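A minimal sketch of one possible body for this function, using VLFeat calls, is given below; the HoG cell size of 8 and the dense-SIFT step of 8 are assumed values rather than requirements, and descriptors are stacked one per row so they can be concatenated across images:

% Sketch only: assumes VLFeat is on the MATLAB path (run vl_setup first).
im = single(im);                            % VLFeat expects single-precision input
switch opt.type
    case 'hogf'
        cellSize = 8;                       % assumed HoG cell size
        hog = vl_hog(im, cellSize);         % (h/cellSize) x (w/cellSize) x 31 array
        f = reshape(hog, [], size(hog, 3)); % one 31-d descriptor per cell
    case 'sift'
        [~, d] = vl_sift(im);               % d is 128 x n, uint8
        f = double(d');                     % n x 128
    case 'dsft'
        [~, d] = vl_dsift(im, 'Step', 8);   % dense SIFT on an assumed 8-pixel grid
        f = double(d');                     % n x 128
    otherwise
        error('Unknown opt.type: %s', opt.type);
end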
[Q2, 20pts] Compute VLAD and Fisher Vector models of the image features. For this you first need to compute a k-means model for each of HoG, DenseSIFT, and SIFT. For convenience, use the given CDVS data set for both training and testing (not the right way in research, where you should use a different data set, say FLICKR MIR or ImageNet), implementing the following functions:
function [vlad_km, A] = getVladModel(f, kd, k)
% f       - n x d matrix containing training features from, say, 100 images
% kd      - desired dimension of the reduced feature
% k       - number of clusters in the VLAD k-means model
% vlad_km - VLAD k-means model
% A       - PCA projection for dimension reduction

% PCA dimension reduction of the features (pca replaces the deprecated princomp)
[A, ~, ~] = pca(f);
f0 = f * A(:, 1:kd);   % the features projected to the desired kd dimensions
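As a sketch of how the model could be completed and then used, the reduced features can be clustered with vl_kmeans, and vl_vlad can aggregate one image's features into a single k*kd-dimensional vector; the kd-tree hard-assignment step and the helper name getVladEncoding are illustrative assumptions, not part of the assignment:

% Sketch only: complete the model by clustering the reduced features.
vlad_km = vl_kmeans(f0', k);   % vl_kmeans clusters columns (kd x n data); centers are kd x k

% Hypothetical helper: encode one image's features into a VLAD vector.
function [v] = getVladEncoding(f, A, vlad_km)
kd      = size(vlad_km, 1);                  % reduced feature dimension
centers = single(vlad_km);
f0      = single(f * A(:, 1:kd))';           % kd x n, one feature per column
% Hard-assign each feature to its nearest cluster center.
kdtree  = vl_kdtreebuild(centers);
nn      = vl_kdtreequery(kdtree, centers, f0);
assign  = zeros(size(centers, 2), size(f0, 2), 'single');
assign(sub2ind(size(assign), double(nn), 1:size(f0, 2))) = 1;
v = vl_vlad(f0, centers, assign);            % k*kd x 1 VLAD descriptor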