
— WP Objectives
Lead: Shadi Ibrahim
- Design and implement multi-criteria strategies for intelligent, dynamic and holistic orchestration
- Design and implement a framework to facilitate the analysis and integration of multiple orchestrators
- Design and implement a software framework for scheduling and resource allocation across the edge-cloud continuum
- Develop dynamic and adaptive configurations
- Implement and optimise efficient serverless computing across the edge-cloud continuum.
— Missions
The evolution of distributed and heterogeneous infrastructures (from the edge to centralised data centres), together with the growing variety and complexity of applications, limits the effectiveness of current optimisation strategies, which usually target a single type of application or infrastructure and a single objective (or several correlated ones). Moreover, these state-of-the-art techniques typically rely on limited, static information extracted from the underlying applications or infrastructures. In this work package, we will study the optimisation problems associated with operating these infrastructures and executing applications on them, and how the two can be combined to achieve the desired objectives. In particular, we will study multi-objective problems involving sometimes conflicting criteria (cost, utilisation, energy, response time, quality of service, etc.) and a large number of decision variables. The work package covers several subtasks, including the optimisation of resource orchestrators and job-scheduling strategies to account for dynamic changes in resources and workloads, and focuses on two emerging scenarios: optimising the deployment and execution of so-called ‘urgent’ and ‘serverless computing’ workflows in the edge-cloud continuum.
The experiments will be linked to the SILECS project and the associated SLICES-FR platform.
— Tasks description
T4.1 Multi-criteria strategies for intelligent, dynamic and holistic orchestration
Modern infrastructures are heterogeneous to support the diversity of services, making resource orchestration complex and highly dynamic. This task aims to optimise existing orchestrators, or to design new ones, capable of efficiently managing this complexity both at design time and at runtime. It rests on three key principles:
- Prediction, via machine learning models, of the best orchestration strategy to apply;
- A holistic approach that integrates multiple variables, including energy consumption and quality of service;
- Efficient coordination of existing orchestrators, with or without an analytics engine.
Artificial intelligence of things (AIoT) reinforces the need for adaptive resource management along the IoT-Edge-Cloud continuum. Machine learning models can be trained to predict demand and dynamically adjust resource configurations, and a self-adaptive protocol is being developed for energy-efficient, effective management.
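The prediction-then-adjustment loop described above can be sketched minimally as follows. The moving-average forecaster stands in for the trained ML models, and the names (`DemandForecaster`, `target_replicas`) and the per-replica capacity are illustrative assumptions, not project artifacts:

```python
import math
from collections import deque

class DemandForecaster:
    """Sliding-window moving average over recent demand observations;
    a minimal stand-in for the trained ML models mentioned in the text."""

    def __init__(self, window: int = 5):
        self.samples = deque(maxlen=window)

    def observe(self, demand: float) -> None:
        self.samples.append(demand)

    def predict(self) -> float:
        # Forecast next-interval demand as the mean of the window.
        return sum(self.samples) / len(self.samples) if self.samples else 0.0

def target_replicas(predicted_demand: float, capacity_per_replica: float) -> int:
    # Provision enough replicas for the forecast, never fewer than one.
    return max(1, math.ceil(predicted_demand / capacity_per_replica))
```

Feeding observed rates of 80, 120 and 100 req/s yields a forecast of 100 req/s, hence 3 replicas if each handles 40 req/s; a real deployment would replace the moving average with the learned demand model.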
Network and storage contention impacts application performance and calls for multi-criteria resource allocation. The goal is to optimise dynamic resource sharing according to the specific needs of each application.
To guarantee SLOs in container orchestration, several strategies will be implemented: dynamic load balancing, automatic scaling and proactive container migration. The impact of orchestrators on performance (latency, energy consumption, failure rates) will also be evaluated.
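As a sketch of how the three SLO-preserving strategies above could be selected from observed metrics, the rule-based controller below maps latency, CPU utilisation and node health to an action. The function name, action labels and thresholds (0.5, 0.3) are illustrative assumptions; the project's orchestrators would presumably learn or tune such policies:

```python
def orchestrate(latency_ms: float, slo_ms: float,
                cpu_util: float, node_failing: bool) -> str:
    """Map observed metrics to one of the remediation actions
    (migration, scaling, load balancing) listed in the text."""
    if node_failing:
        return "migrate"      # proactive container migration off the failing node
    if latency_ms > slo_ms:
        return "scale-up"     # automatic scaling to bring latency back under the SLO
    if latency_ms < 0.5 * slo_ms and cpu_util < 0.3:
        return "scale-down"   # reclaim resources when comfortably within the SLO
    return "rebalance"        # dynamic load balancing in the steady state
```

Evaluating such a policy against latency, energy and failure-rate measurements is exactly the kind of orchestrator impact study the task proposes.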
T4.2 Scheduling and resource allocation on multiple, heterogeneous platforms
This task investigates how to efficiently allocate resources and schedule tasks and jobs to achieve the desired performance, taking into account the heterogeneity of resources and data, and doing so dynamically.
In the IoT-Edge-Cloud continuum, execution models should be thoroughly revised to account for this intrinsic heterogeneity and its multiple indicators, so as to achieve better multi-criteria optimisation of placement decisions. In addition, we will explore how these models can be used to derive scheduling algorithms whose solutions are both theoretically sound and computed quickly enough to be usable in practice. Taking this a step further, we will investigate how to efficiently allocate resources and schedule tasks (or operators) to achieve the desired performance while coping with fluctuations in input data and with the dynamic, shared nature of fog resources.
The goal is to use the data collected from the underlying applications and systems to extract knowledge (via machine learning) and thereby optimise the allocation of resources and tasks when deploying stream-processing applications within the fog.
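A minimal sketch of the multi-criteria placement decision discussed in this task: each feasible node is scored by a weighted sum of normalised criteria (lower is better) and the best one is chosen. The node attributes, criteria names and weights are illustrative assumptions, not values defined by the project:

```python
def place(cpu_demand: float, nodes: list, weights: dict):
    """Weighted-sum multi-criteria placement: among nodes with enough
    free CPU, pick the one with the lowest combined score."""
    feasible = [n for n in nodes if n["free_cpu"] >= cpu_demand]
    if not feasible:
        return None  # no node in the continuum can host the task
    best = min(feasible, key=lambda n: sum(w * n[c] for c, w in weights.items()))
    return best["name"]

# Hypothetical candidate nodes with normalised (0..1) criteria values.
nodes = [
    {"name": "edge-1",  "free_cpu": 2, "latency": 0.1, "energy": 0.6, "cost": 0.2},
    {"name": "cloud-1", "free_cpu": 8, "latency": 0.8, "energy": 0.3, "cost": 0.5},
]
weights = {"latency": 0.5, "energy": 0.3, "cost": 0.2}
```

With these weights a small task goes to `edge-1` (low latency dominates), while a 4-CPU task falls back to `cloud-1`; the ML models described above would, in practice, supply both the criteria values and the weights.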
T4.3 Configuration and optimization for complex applications and dynamic environments
This task focuses on adaptations between the different layers of the system to enable resource selection and configuration of application services, taking into account the constraints of application models for data-driven applications.
The IoT-Edge-Cloud continuum requires the ability to dynamically reconfigure the infrastructure fabric, both because the volume of data to be processed is volatile and because the infrastructure is shared among multiple users. The infrastructure must therefore be able to scale almost instantly with the workload at hand. This makes the implementation of ‘urgent’ workflows, i.e. workflows whose computations must satisfy strict time and quality constraints so that a decision can be made within a deadline, extremely complex. The objectives of this task are:
- (1) to model all the factors that need to be taken into account, i.e. the type of data to be received, the hardware on which the infrastructure runs, the libraries available, etc.;
- (2) to evaluate the interaction scenarios between the different layers while allowing the model to evolve;
- (3) finally, to apply this model to application scenarios in order to evaluate its ability to manage the cost/performance trade-offs involved in decision-making within the application and in infrastructure sizing.
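The cost/performance trade-off of objective (3) can be sketched as a deadline-constrained sizing choice: among candidate infrastructure sizings, keep those whose predicted runtime honours the urgent deadline and pick the cheapest. The configuration fields and numbers are illustrative assumptions; in the task, the predicted runtimes would come from the model built in objectives (1) and (2):

```python
def size_infrastructure(configs: list, deadline_s: float):
    """Cheapest sizing that still meets the urgent workflow's deadline;
    returns None when no candidate can honour it."""
    feasible = [c for c in configs if c["runtime_s"] <= deadline_s]
    return min(feasible, key=lambda c: c["cost"]) if feasible else None

# Hypothetical candidate sizings with model-predicted runtimes and costs.
configs = [
    {"nodes": 2, "runtime_s": 120, "cost": 4.0},
    {"nodes": 4, "runtime_s": 70,  "cost": 7.0},
    {"nodes": 8, "runtime_s": 40,  "cost": 12.0},
]
```

For a 90 s deadline this selects the 4-node sizing: the 2-node one is too slow, and the 8-node one meets the deadline but costs more than necessary.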
T4.4 Setting up and optimizing efficient serverless computing in the Edge-Cloud continuum
Serverless computing (or Function-as-a-Service) has emerged as an important platform for building future web services, thanks in particular to its fine-grained pricing, elasticity and ease of management. Many of these services are built on distributed machine learning (ML) or deep learning (DL) applications.
While much effort has gone into deploying and optimising these applications in homogeneous data centres, little has been done to support serverless computing in the edge-cloud continuum, where resources are heterogeneous, have limited compute and storage capacity, and must host multiple applications simultaneously. The objectives of this task are:
- (1) to introduce a new software framework to enable serverless computing in the edge-cloud continuum;
- (2) to optimise the performance of stateless or machine-learning (ML) applications when their functions are co-located;
- (3) to enable these applications to scale to adapt to fluctuating workloads and optimise the use of available resources.
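Objective (3), scaling functions to fluctuating workloads on capacity-limited edge nodes, can be sketched with the concurrency-driven rule common to FaaS platforms: one replica per target number of in-flight requests, capped by the node's capacity, with scale-to-zero when idle. The function name and parameters are illustrative assumptions, not the framework's API:

```python
import math

def faas_replicas(in_flight: int, target_concurrency: int, max_replicas: int) -> int:
    """Concurrency-driven function autoscaling: enough replicas to keep
    per-replica concurrency near the target, within the edge node's cap."""
    if in_flight == 0:
        return 0  # scale to zero: a key cost advantage of serverless
    return min(max_replicas, math.ceil(in_flight / target_concurrency))
```

With a target concurrency of 10 and a cap of 4 replicas, 25 in-flight requests yield 3 replicas and 100 requests saturate the cap at 4, making the cap itself the signal for spilling work from the edge towards the cloud.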
Other Work packages