Caching Suggestions Using Reinforcement Learning
Tracolli M.; Baioletti M.; Poggioni V.
2020
Abstract
Big data is usually processed in a decentralized computational environment, with a number of distributed storage systems and processing facilities that enable both online and offline data analysis. In such a context, efficient data access is fundamental both to processing efficiency and to the experience of users inspecting the data, and caching is a solution widely adopted across many domains. Optimizing cache management therefore plays a central role in sustaining the growing demand for data. In this article, we propose an autonomous approach based on Reinforcement Learning, in which an agent manages the file storing decisions. We test the proposed method in a real context, using information on the data analysis workflows of the CMS experiment at CERN.