
Extending a Moldable Computer Architecture to Accelerate DL Inference on FPGA

Mariotti, Mirko; Bianchini, Giulio; Neri, Igor; Ciangottini, Diego; Storchi, Loriano
2025

Abstract

Over the past years, the field of Machine Learning (ML) and Deep Learning (DL) has seen strong developments in both software and hardware, driven by the rise of specialized devices. One of the biggest challenges in this field is the inference phase, where the trained model makes predictions on unseen data. Although computationally powerful, traditional computing architectures face limitations in efficiently handling inference requests, especially from an energy point of view. For this reason, the need arose to find alternative hardware solutions, and among these are Field Programmable Gate Arrays (FPGAs): their key feature of being reconfigurable, combined with parallel processing capability, low latency, and low power consumption, makes these devices uniquely suited to accelerating inference tasks. In this paper, we present a novel approach to accelerating the inference phase of a multi-layer perceptron (MLP) using the BondMachine framework, an open-source framework for the design of hardware accelerators for FPGAs. An analysis of latency, energy consumption, and resource usage, together with comparisons against standard architectures and other FPGA approaches, is presented, highlighting the strengths and critical points of the proposed solution. The present work represents an exploratory study to validate the proposed methodology on MLP architectures, establishing a crucial foundation for future work on scalability and the acceleration of more complex neural network models.
Files for this record:
No files are associated with this record.

Documents in IRIS are protected by copyright, and all rights are reserved unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11391/1615987
Citations
  • PMC: ND
  • Scopus: 0
  • Web of Science: 0