Dimensionality reduction strategies for CNN-based classification of histopathological images

Cascianelli, Silvia; Bello Cerezo, Raquel; Bianconi, Francesco; Fravolini, Mario Luca; Belal, Mehdi; Palumbo, Barbara; Kather, Jakob N.

doi:10.1007/978-3-319-59480-4_3

Features from pre-trained Convolutional Neural Networks (CNNs) have proved effective for many tasks, such as object, scene and face recognition. Compared with traditional, hand-designed image descriptors, CNN-based features produce higher-dimensional feature vectors. In applications where the number of samples may be limited, as is the case with histopathological images, high dimensionality could cause overfitting and redundancy in the information to be processed and stored. To overcome these potential problems, feature reduction methods can be applied, at the cost of a moderate reduction in discrimination accuracy. In this paper we investigate dimensionality reduction schemes for CNN-based features applied to computer-assisted classification of histopathological images. The purpose of this study is to find the best trade-off between accuracy and dimensionality. Specifically, we test two well-known techniques (Principal Component Analysis and Gaussian Random Projection) and propose a novel reduction strategy based on the cross-correlation between the components of the feature vector. The results show that it is possible to reduce CNN-based features by a high ratio with only a moderate decrease in accuracy with respect to the original values.
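
To make two of the reduction ideas concrete, the following is a minimal sketch in plain Python. The abstract does not describe the details of the cross-correlation strategy, so the greedy pruning rule shown here (drop a feature when its absolute Pearson correlation with an already kept feature exceeds a threshold) is an assumption for illustration and not necessarily the authors' method. The function names, the threshold value and the toy data are likewise hypothetical.

```python
import math
import random
from statistics import mean, pstdev


def gaussian_random_projection(X, k, seed=0):
    """Project the rows of X (n x d) onto k dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    d = len(X[0])
    # Entries are drawn from N(0, 1/k), so squared norms are preserved in expectation.
    R = [[rng.gauss(0.0, 1.0 / math.sqrt(k)) for _ in range(k)] for _ in range(d)]
    return [[sum(row[i] * R[i][j] for i in range(d)) for j in range(k)] for row in X]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sa, sb = pstdev(a), pstdev(b)
    if sa == 0.0 or sb == 0.0:
        return 0.0
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)
    return cov / (sa * sb)


def correlation_pruning(X, threshold=0.95):
    """Assumed rule: keep a feature only if |corr| with every kept feature <= threshold."""
    columns = list(zip(*X))
    kept = []
    for j, col in enumerate(columns):
        if all(abs(pearson(col, columns[i])) <= threshold for i in kept):
            kept.append(j)
    return kept, [[row[j] for j in kept] for row in X]


# Toy feature matrix: column 1 is a linear function of column 0 (fully redundant).
X = [[1, 3, 5], [2, 5, 1], [3, 7, 4], [4, 9, 2]]
kept, X_pruned = correlation_pruning(X)  # column 1 is dropped as redundant
Z = gaussian_random_projection(X, k=2)   # 4 samples, 2 projected features each
```

In practice, the CNN feature vectors would have thousands of components and a numerical library would be used in place of the nested loops; the sketch only illustrates the shape of the two operations.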