**Model Learning in Combinatorial Optimization: A Case Study on Thermal Aware Dispatching**

**A Google Focused Grant Program on Mathematical Optimization and Combinatorial Optimization in Europe**

Principal Investigator: Prof. Michela Milano

### The Challenge

Combinatorial optimization technology has achieved a good level of maturity in the last decades. Approaches such as Mathematical Programming, Constraint Programming (CP) and incomplete methods like meta-heuristics have been applied to effectively solve optimization and decision problems.These techniques rely on some kind of system model: their effectiveness depends to a tremendous extent on the model accuracy and on the ability to exploit its specificities to rule out infeasible solutions or quickly find good quality ones.

In many real world settings, however, decisions to be taken affect and are affected by complex systems, that exhibit phenomena emerging from a collection of interacting objects, capable to adapt their behavior according to their history and to external feedback. Such systems are unfortunately impervious to modeling efforts via state-of-the-art combinatorial optimization techniques. An example is provided by the thermal behavior of many core CPUs: this depends on the interaction of many concurrent factors such as heat conduction, the workload of each core, the chip layout and on the presence of active components such as power-management/resource-arbitration policies. The thermal behavior of a computer room in an High Performance Computing (HPC) center is a second, related, example: despite some of the concurring factors may be different (CRAC1 groups play the role of heat sinks and heat transfer occurs via air flows), the overall behavior depends on a similarly complex set of interacting actors. If the dynamics of the single actors is known, it is possible to devise a numerical model (i.e. a simulator), but the resulting evaluation time is typically too large for a direct use within an optimization routine. Yet, accurately taking into account thermal issues in workload dispatching would enable substantial power savings and performance improvements.

Our group is pursuing a research line aiming to provide a methodology to connect and integrate decision making and optimization technology with complex systems. In this project we plan to investigate an integration method based on the use of a hybrid combinatorial/machine-learning model. In detail, we will use learning techniques to extract (part of) the optimization model from simulation or from direct system observation. As a practical case study, we will consider thermal aware workload dispatching on many core chips (Scenario A) and on HPC centers (Scenario B). The developed methodology will provide accurate, adaptive, easy to deploy thermal models for optimization methods, and hence the opportunity to achieve power savings and performance improvements beyond the current state of the art.

### State of the Art

Simulation and optimization have been merged in the so-called simulation optimization field. In this case optimization is used for choosing optimal simulation parameters. Similarly, Neuro-Dynamic Programming and Reinforcement Learning have used simulation for learning the best agent policy parameters. Here we do not employ learning to optimize the simulation outcome, but rather to understand how high level decisions affect the system behavior. As an advantage, our approach requires no modification in the controlled system. Simulation-aware optimization has been considered in the context of Genetic Algorithms (GA), by solving a numerical model to evaluate the fitness function. Although GAs encode some system knowledge through the population individuals, they have difficulties in exploiting model properties to improve their effectiveness. Some approaches consider aspects related to model learning (either entire models, or constraints structures from examples), but much work needs to be done to generalize these conclusions.

In the context of scenario A, proactive workload dispatching has been investigated as a mean to prevent the activation of chip cooling measures with adverse effects on the performance. The related approaches consist of: (1) on-line optimization policies, taking advantage of run-time temperatures read from hardware sensors, but typically producing sub-optimal solutions. (2) Off-line allocation and scheduling approaches, embedding a simplified thermal model of the target platform. (3) Off-line iterative methods, relying on a time-consuming simulation to assess the quality of the incumbent solution.

As concerns scenario B, heat management is rapidly becoming a major limiter for HPC and data centers (Approximately 50\% of the energy consumed by data centers is used for powering the cooling infrastructure). A reduction of power consumption while respecting the Service Level Agreement can be obtained via power saving technology and clever job dispatching policies. In many cases some model of the computer room is exploited to evaluate the thermal impact of the dispatching decisions; a learning step is used to calibrate parameters, but the models themselves tend to be either very simple or extremely hard to engineer.

### Method Outline and Evaluation Process

The key idea in this project is to use a machine learning model (e.g. a Neural Network, a Support Vector Machine, a Decision Tree...) to extract the relation between the decision variables and their effect on the system behavior or on the simulation observables. The trained component can then be merged with a declarative model and exploited to rule out infeasible solutions and guide search toward promising ones.

We have pioneered the approach in out CP 2012 paper, where a Neural Network is used to learn the thermal behavior of a quad core CPU and merged with a Constraint Programming model. Much work is needed to extend such result into a solid methodology. We plan to investigate the following directions:

**Identification of Suitable Machine Learning Techniques:**promising candidates will be identified based on their accuracy and their ease of integration within a combinatorial model.**Improving the Training and the Solution Process:**this includes the definition of rules to build effective training sets, the identification of system symmetries to boost the training/solution time, properties characterization of the learned decision making components to devise heuristics and filtering algorithm. In addition, multi-level optimization will be investigated. Simple optimization and unexpected events can be handled by a fast on-line optimization policy, while a complex solver operating at a more coarse-grained time scale may focus on critical decisions.**Continuos Learning and Adaptive Optimization:**the system behavior may change over time, due to external inputs (e.g. weather conditions) or to internal processes (e.g. hardware failure); alternatively, the learned model may just turn out to be inaccurate for a particular input configuration. In all those cases it is possible to repeat the whole process, so that the training stage becomes part of an iterative optimization approach. This idea is particularly appealing in the context of scenario B, due to the availability of computational power and the relatively large time horizon.

These techniques will be experimented on scenarios A and B by defining benchmarks and defining proper evaluation metrics based on the workload. Our research group has a strong expertise in hybrid optimization and is actively cooperating with people at the MICREL lab, having a strong expertise on many core architectures. MICREL has been recently granted access to a 48 core experimental platform by Intel (Single-chip Cloud Computer, SCC3), that will serve as a basis for our experimentation on scenario A. Concerning scenario B, we plan to take advantage of an active connection with CINECA4, to get access to real world measurements. All the developed techniques will be implemented on top of the Google or-tools framework.