Comparative evaluation of hand-crafted image descriptors vs. off-the-shelf CNN-based features for colour texture classification under ideal and realistic conditions
Raquel Bello-Cerezo; Francesco Bianconi; Francesco Di Maria; Fabrizio Smeraldi
2019
Abstract
Convolutional Neural Networks (CNNs) have brought spectacular improvements in several fields of machine vision, including object, scene and face recognition. Nonetheless, the impact of this new paradigm on the classification of fine-grained images, such as colour textures, is still controversial. In this work, we evaluate the effectiveness of traditional, hand-crafted descriptors against off-the-shelf CNN-based features for the classification of different types of colour textures under a range of imaging conditions. The study covers 68 image descriptors (35 hand-crafted and 33 CNN-based) and 46 compilations of 23 colour texture datasets divided into 10 experimental conditions. On average, the results indicate a marked superiority of deep networks, particularly with non-stationary textures and in the presence of multiple changes in the acquisition conditions. By contrast, hand-crafted descriptors were better at discriminating stationary textures under steady imaging conditions and proved more robust to image rotation than CNN-based features.