Reinforcement Learning Applicability for Resource-Based Auto-scaling in Serverless Edge Applications

Benedetti, Priscilla; Femminella, Mauro; Reali, Gianluca; Steenhaut, Kris

Serverless computing is an alternative deployment paradigm for cloud computing platforms, intended to provide scalability and cost reduction without imposing additional deployment overhead on developers. Open-source serverless computing platforms generally rely on two auto-scaling approaches: workload-based and resource-based. In the former, a designated algorithm scales function instances according to the number of incoming requests. In the latter, instances are scaled when a resource usage threshold, such as a maximum Central Processing Unit (CPU) utilization, is reached. Resource-based auto-scaling is usually implemented with the Kubernetes Horizontal Pod Autoscaler (HPA). In this work, we investigate the applicability of a reinforcement learning-based approach to resource-based auto-scaling in OpenFaaS, the most widely used open-source serverless platform. Serverless technologies are particularly convenient for edge computing on constrained devices or resource-limited machines. We therefore conducted our experimental analysis on constrained Kubernetes-based nodes, to emulate such an edge application scenario. Preliminary results show that the proposed model learns, within a limited number of iterations, an effective scaling policy based on CPU utilization that minimizes service latency.
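
To make the contrast between the two resource-based strategies concrete, the following is a minimal sketch, not the implementation evaluated in this work. The first function reproduces the replica formula documented for the Kubernetes HPA. The remaining functions illustrate one possible reinforcement learning formulation, assuming tabular Q-learning, a state given by a discretized CPU utilization level, three scaling actions, and a reward equal to the negative of the observed latency. The abstract does not specify the algorithm, state, or reward, so all of these choices, along with the names `ACTIONS`, `q_update`, and `choose_action`, are hypothetical.

```python
import math
import random


def hpa_desired_replicas(current_replicas, current_cpu, target_cpu):
    """Documented Kubernetes HPA rule:
    desired = ceil(currentReplicas * currentMetric / targetMetric)."""
    return math.ceil(current_replicas * current_cpu / target_cpu)


# Assumed action set: remove one replica, keep the count, add one replica.
ACTIONS = (-1, 0, 1)


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """One tabular Q-learning step; q maps (state, action) to a value.

    The reward is assumed to be the negative service latency, so that
    maximizing reward minimizes latency.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


def choose_action(q, state, epsilon=0.1, rng=random):
    """Epsilon-greedy selection over the assumed action set."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q.get((state, a), 0.0))
```

With two replicas at 90% CPU utilization and a 50% target, `hpa_desired_replicas(2, 90, 50)` yields 4 replicas. The learned policy instead selects a scaling action from the values accumulated in `q`, which is why it needs a number of training iterations before it becomes effective.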