MangoDetNet: a novel label-efficient weakly supervised fruit detection framework

Crocetti F.; Costante G.; Valigi P.; Fravolini M. L.
2024

Abstract

Purpose: Fruit detection and counting are among the most important steps toward yield estimation, a well-established practice on which farmers base the management of the harvesting, storage, and distribution of agricultural products. In the era of precision agriculture, yield estimation, previously performed only by human operators, is being redesigned around Artificial Intelligence and Computer Vision techniques. Despite the impressive results that AI has demonstrated in fruit detection, these systems rely on large image datasets, whose availability is still limited compared to the large number of crop types. For this reason, great interest has recently been devoted to weakly supervised algorithms, which reduce the annotation effort by requiring only simple image-level labels.
Method: Based on these considerations, this work proposes a new method built on a sample-efficient weakly supervised approach. The proposed system, named MangoDetNet, is trained with a two-stage curriculum learning approach: first an image reconstruction task, then a binary image classification task used for heatmap generation. In the first stage, the network is trained in an unsupervised manner on image reconstruction, to promote the learning of robust feature extractors tailored to fruit scenes. In the second stage, the network is trained for binary image classification using presence/absence labels. This stage further refines the feature extractor from the first stage and yields more precise activation maps.
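To make the heatmap step concrete, the sketch below shows one common way to turn a binary classifier into a localization map: class activation mapping (CAM), followed by a simple local-maximum search. This is an illustration under assumptions, not the MangoDetNet implementation; the abstract does not state how its activation maps are computed, and the function names, the 3x3 neighbourhood, and the 0.5 threshold are hypothetical choices.

```python
from __future__ import annotations

import numpy as np


def class_activation_map(features: np.ndarray, class_weights: np.ndarray) -> np.ndarray:
    """Weight each feature channel by its classifier weight and sum over channels.

    features: (C, H, W) activations of the last convolutional layer.
    class_weights: (C,) weights of the "fruit present" output unit.
    Returns an (H, W) heatmap scaled to [0, 1].
    """
    cam = np.tensordot(class_weights, features, axes=([0], [0]))  # shape (H, W)
    cam = np.maximum(cam, 0.0)  # keep only evidence in favour of "present"
    peak = cam.max()
    return cam / peak if peak > 0 else cam


def local_peaks(heatmap: np.ndarray, threshold: float = 0.5) -> list[tuple[int, int]]:
    """Return (row, col) of pixels that reach the threshold and are the
    maximum of their 3x3 neighbourhood (flat plateaus yield several peaks)."""
    h, w = heatmap.shape
    padded = np.pad(heatmap, 1, mode="constant", constant_values=-np.inf)
    # Stack the nine shifted views so that axis 0 enumerates the neighbourhood.
    neighbours = np.stack(
        [padded[dr:dr + h, dc:dc + w] for dr in range(3) for dc in range(3)]
    )
    is_peak = (heatmap >= neighbours.max(axis=0)) & (heatmap >= threshold)
    return [(int(r), int(c)) for r, c in zip(*np.nonzero(is_peak))]


# Toy input (made up): two channels on a 5x5 grid.
features = np.zeros((2, 5, 5))
features[0, 1, 1] = 2.0   # strong "fruit" response
features[0, 3, 3] = 1.5   # weaker "fruit" response
features[1, 0, 4] = 3.0   # response of a channel that argues against "present"
weights = np.array([1.0, -0.5])

heatmap = class_activation_map(features, weights)
detections = local_peaks(heatmap)
print(detections)  # [(1, 1), (3, 3)]
```

In this toy case the negatively weighted channel is suppressed by the ReLU, and the two remaining peaks give a count of two. A real pipeline would take `features` and `class_weights` from the trained network rather than from hand-written arrays.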
Conclusion: In experiments on a mango orchard image dataset, MangoDetNet outperforms state-of-the-art weakly supervised approaches, achieving an F1 score of 0.861, which is on par with fully supervised methods, and an F1 score of 0.856 when the number of labeled training samples is halved.
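For readers unfamiliar with the metric, the F1 score is the harmonic mean of precision and recall. The sketch below computes it from detection counts; the counts in the example are made up for illustration and are not taken from the paper, and the function name is hypothetical.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from detection counts."""
    predicted = true_pos + false_pos   # everything the detector reported
    actual = true_pos + false_neg      # everything that is really there
    if predicted == 0 or actual == 0 or true_pos == 0:
        return 0.0
    precision = true_pos / predicted
    recall = true_pos / actual
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 fruits found correctly, 10 false alarms, 20 missed.
score = f1_score(true_pos=80, false_pos=10, false_neg=20)
print(round(score, 3))  # 0.842
```

Because F1 penalizes both false alarms and missed fruits, a detector cannot reach a high score by over- or under-reporting.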

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1587780