A Fuzzy Ensemble of Features Importances via SMART-or aggregation
Andrea Capotorti (Member of the Collaboration Group); Alessio Troiani (Member of the Collaboration Group)
2024
Abstract
We propose a novel approach for aggregating the scores of different feature selection (FS) methods using the SMART-or fuzzy aggregation operator. Feature selection is a valuable technique for dimension reduction and explainability in artificial intelligence. However, there is no one-size-fits-all FS method suitable for all domains. To address this, ensemble methods combining multiple FS methods have gained popularity. In our study, we aggregate the scores assigned by four popular filter methods (CFS, ReliefF, MIFS, and Inf-FS$_S$). Filter methods are model agnostic, meaning they assign scores to features independently of any subsequent learning algorithm used for classification or clustering. To mitigate the variability of individual methods, we employ bootstrapping techniques and fuzzy sets to express the vagueness of results. Unlike other proposals, we follow the principle of maximum specificity by eliciting fuzzy sets through a probability-possibility transformation. Instead of using common weighted fuzzy combinations or drastic sum aggregations, we utilize the SMART-or aggregation operator and Yager's ordering to obtain feature rankings. The SMART-or operator considers degrees of agreement/disagreement among fuzzy sets without requiring weight choices, while Yager's ordering is sensitive to the specific shapes of membership functions. Empirical results on eight benchmark databases from the UCI repository demonstrate the accuracy and stability of our approach. Furthermore, our proposal can be applied to any ensemble of filter methods.
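The bootstrapping and probability-possibility steps mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the bootstrap scores of one feature are binned into a histogram (an assumption made here for illustration) and then applies the standard maximally specific transformation for discrete distributions, in which each outcome receives the total probability of all outcomes no more probable than itself. The names `prob_to_poss`, `bootstrap_fuzzy_score`, and `score_fn` are hypothetical, and `score_fn` is a placeholder rather than any of the four filter methods. The SMART-or aggregation and Yager's ordering are not reproduced.

```python
import random
from collections import Counter


def prob_to_poss(probs):
    """Maximally specific possibility distribution dominating `probs`.

    Each outcome gets the total probability of all outcomes that are
    no more probable than itself, so tied outcomes share one degree.
    """
    return [sum(q for q in probs if q <= p) for p in probs]


def bootstrap_fuzzy_score(values, score_fn, n_boot=200, n_bins=5, seed=0):
    """Bootstrap a scalar score and turn its histogram into a fuzzy set.

    `score_fn` is a hypothetical stand-in for a filter method's score.
    Returns (bin_centers, possibility_degrees).
    """
    rng = random.Random(seed)
    n = len(values)
    scores = []
    for _ in range(n_boot):
        # Resample with replacement and score the resample.
        sample = [values[rng.randrange(n)] for _ in range(n)]
        scores.append(score_fn(sample))

    lo, hi = min(scores), max(scores)
    # Fall back to a unit width if every bootstrap score is identical.
    width = (hi - lo) / n_bins or 1.0
    # Histogram of scores; the maximum score is clamped into the last bin.
    counts = Counter(min(int((s - lo) / width), n_bins - 1) for s in scores)
    probs = [counts.get(b, 0) / n_boot for b in range(n_bins)]
    centers = [lo + (b + 0.5) * width for b in range(n_bins)]
    return centers, prob_to_poss(probs)


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    centers, degrees = bootstrap_fuzzy_score(data, lambda s: sum(s) / len(s))
    for c, d in zip(centers, degrees):
        print(f"score ~ {c:.2f}: possibility {d:.2f}")
```

For probabilities (0.5, 0.3, 0.2) the transformation gives possibility degrees (1, 0.5, 0.2): the most probable outcome is fully possible, and less probable outcomes receive the cumulative mass at or below their own probability.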