Locating representative pixels of the objects in an image, without any a priori information about the scene, is a useful first step for a wide range of applications, including object segmentation, feature extraction, image matching, image retrieval, and figure-ground separation.
To this end, saliency detectors are widely used by the image processing and computer vision community. A saliency detector inspects low-level cues like edges, orientation and color channels to arrive at a likelihood of each pixel being “interesting” or not.
clear; close all;
[X, map] = imread('shadow.tif');
shadow = ind2rgb(X, map); % convert to truecolor
figure(1), imshow(shadow); title('RGB image');
% gbvs() comes from the GBVS package, which must be on the MATLAB path
salmap = gbvs(shadow);
salmap = salmap.master_map_resized;
figure(2), imshow(salmap); title('Original saliency map');
Above is an example of an image and its saliency map obtained using the popular GBVS package. The saliency map takes values in the range [0, 1], where high values correspond to the “hot” spots in the map. While it is easy to spot two distinct regions in the saliency map by eye, corresponding to the regions where the jackfruit and the shadows meet, it is much harder to select them using a threshold, a difficulty also acknowledged in the literature [1].
Opening by Reconstruction
We propose a technique that is little known in the image processing community, called ‘opening by reconstruction’, to obtain a binary mask containing blobs that lie, preferably, within the object boundaries. The first step is to erode the original saliency map using a disk-shaped structuring element of radius 5 pixels. The erosion step retains bright structures that are at least roughly ten pixels wide (wide enough to contain the disk), without placing any restriction on their maximum size.
se = strel('disk',5,0);
Ie = imerode(salmap, se);
figure(3), imshow(Ie); title('Eroded saliency map');
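To make the size selectivity of this erosion concrete, here is a minimal sketch on a synthetic binary image. The image `toy` and its two squares are made up for illustration and are not part of the pipeline above; the sketch assumes the Image Processing Toolbox is installed.

```matlab
% Toy image (hypothetical): one 9-pixel-wide and one 15-pixel-wide square.
toy = zeros(40, 40);
toy(5:13, 5:13)   = 1;   % 9 x 9 square, too narrow to contain the disk
toy(20:34, 20:34) = 1;   % 15 x 15 square, wide enough to contain the disk

se = strel('disk', 5, 0);             % same structuring element as above
nhood_size = size(se.Neighborhood);   % footprint of the disk, in pixels

toy_eroded = imerode(toy, se);

% Check which of the two squares left any nonzero pixel after erosion.
small_survives = any(any(toy_eroded(5:13, 5:13)));
large_survives = any(any(toy_eroded(20:34, 20:34)));

fprintf('Disk footprint: %d x %d\n', nhood_size(1), nhood_size(2));
fprintf('9 px square survives: %d\n', small_survives);
fprintf('15 px square survives: %d\n', large_survives);
```

The narrow square disappears entirely, while the wide one shrinks but survives, which is the behaviour the erosion step relies on.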
The second step is to dilate the eroded image, constrained by the original saliency map. In other words, the original saliency map acts as a “guide” to retain the “shape” of the eroded map. This dilation step is iterated until convergence. For a detailed discussion of ‘opening by reconstruction’, please refer to our paper. The iterative dilation step is easily implemented using the following code.
Iobr = imreconstruct(Ie, salmap);
figure(4), imshow(Iobr); title('Reconstructed saliency map');
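For readers who want to see the "dilate, constrain, repeat" iteration spelled out, the following is a rough sketch of the same idea as an explicit loop, run on a synthetic map. The toy map and the variable names (`toy`, `marker`, `mask`, `rec`) are assumptions made for illustration; the 3 x 3 neighbourhood is chosen to match the default 8-connectivity of imreconstruct on 2-D images.

```matlab
% Toy saliency map (hypothetical): a broad plateau with a narrow spike on
% top of it, plus an isolated narrow spike elsewhere.
toy = zeros(40, 40);
toy(10:30, 10:30) = 0.6;   % broad plateau, 21 pixels wide
toy(20, 20)       = 1.0;   % narrow spike on the plateau
toy(5, 35)        = 0.9;   % isolated narrow spike

marker = imerode(toy, strel('disk', 5, 0));   % eroded map
mask   = toy;                                 % the "guide"

% Geodesic dilation iterated until nothing changes:
% grow the marker by one pixel, then clip it to the guide.
rec = marker;
while true
    next = min(imdilate(rec, ones(3)), mask);
    if isequal(next, rec)
        break;
    end
    rec = next;
end

% Compare against the built-in implementation.
same_as_builtin = isequal(rec, imreconstruct(marker, mask));
fprintf('Loop matches imreconstruct: %d\n', same_as_builtin);
```

The loop is only meant to make the definition concrete. In practice imreconstruct is the better choice, since it avoids repeatedly dilating the whole image.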
Obtaining regional maxima on the reconstructed image selects meaningful blobs instead of isolated pixels. Note that obtaining regional maxima directly on the saliency map won’t work well in most cases, because of its “peaky” structures. The reconstruction step flattens those peaks into plateaus, so that each regional maximum becomes a connected blob.
fgm = imregionalmax(Iobr,8);
figure(5), imshow(fgm); title('Regional maxima on the reconstructed saliency map');
direct_maxima = imregionalmax(salmap,8);
figure(6), imshow(direct_maxima); title('Regional maxima on the original saliency map');
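The difference between the two sets of maxima is easy to reproduce on a synthetic map. The sketch below uses a made-up map (`toy`, not the jackfruit image) with a broad plateau and two one-pixel spikes, and counts the pixels selected in each case.

```matlab
% Toy saliency map (hypothetical): broad plateau plus two narrow spikes.
toy = zeros(40, 40);
toy(10:30, 10:30) = 0.6;   % broad plateau, 21 x 21 pixels
toy(20, 20)       = 1.0;   % narrow spike on the plateau
toy(5, 35)        = 0.9;   % isolated narrow spike

% Opening by reconstruction, as in the steps above.
toy_eroded = imerode(toy, strel('disk', 5, 0));
toy_obr    = imreconstruct(toy_eroded, toy);

% Regional maxima with and without the reconstruction step.
maxima_direct = imregionalmax(toy, 8);
maxima_obr    = imregionalmax(toy_obr, 8);

n_direct = nnz(maxima_direct);   % only the two spike pixels
n_obr    = nnz(maxima_obr);      % the whole flattened plateau

fprintf('Pixels selected directly: %d\n', n_direct);
fprintf('Pixels selected after reconstruction: %d\n', n_obr);
```

On this toy map the direct maxima are just the two spike pixels, whereas the maxima after reconstruction cover the entire plateau, and the isolated spike is no longer selected at all.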
The drawback of ‘opening by reconstruction’ is that it selects every peak in the original saliency map that is broad enough to survive erosion, irrespective of its actual saliency value. We can hope to discard blobs outside the object of interest by throwing away those with very low saliency. A single call to regionprops calculates the average saliency value of each blob in the original saliency map. The averages are rescaled to [0, 1], a relative threshold of 0.2 is applied to discard low-saliency blobs, and ismember selects all the blobs that exceed the threshold.
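discard_thresh = 0.2;
labelimg = bwlabel(fgm); % label each blob in the binary mask
s = regionprops(labelimg, salmap, 'MeanIntensity');
avg_sal = [s.MeanIntensity]; % mean saliency of each blob
avg_sal = rescale(avg_sal,0,1); % make the threshold relative
idx = find(avg_sal > discard_thresh);
if ~isempty(idx) % keep the mask unchanged if no blob passes
    fgm = ismember(labelimg,idx);
end
figure(7), imshow(fgm); title('Final binary mask');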
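As a small worked example of the discard step, the sketch below applies the same calls to a hand-made mask with three blobs. The mask `toy_mask`, the map `toy_sal` and the saliency values 0.9, 0.5 and 0.1 are invented for illustration only.

```matlab
% Toy binary mask (hypothetical) with three separate 2 x 2 blobs.
toy_mask = false(4, 10);
toy_mask(2:3, 2:3) = true;   % blob A
toy_mask(2:3, 5:6) = true;   % blob B
toy_mask(2:3, 8:9) = true;   % blob C

% Toy saliency map (hypothetical): each blob sits on a constant value.
toy_sal = zeros(4, 10);
toy_sal(2:3, 2:3) = 0.9;
toy_sal(2:3, 5:6) = 0.5;
toy_sal(2:3, 8:9) = 0.1;

discard_thresh = 0.2;
labelimg = bwlabel(toy_mask);
s = regionprops(labelimg, toy_sal, 'MeanIntensity');
avg_sal = rescale([s.MeanIntensity], 0, 1);   % min-max rescaling to [0, 1]
idx = find(avg_sal > discard_thresh);          % labels of the blobs to keep
kept = ismember(labelimg, idx);                % background label 0 is never kept

fprintf('Blobs kept: %d of %d\n', numel(idx), numel(s));
```

Here blobs A and B are kept and blob C is dropped. One consequence of min-max rescaling is worth keeping in mind: the blob with the lowest mean saliency always maps to 0, so it is discarded whenever at least two blobs with different means are present.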
Let us overlay the binary mask on the RGB image.
figure(8), imshow(shadow);
hold on;
h1 = imshow(fgm); set(h1,'alphadata',0.4);
hold off
The ROI detector is available on MATLAB Central as a single file. Please feel free to send me any comments or criticisms.