Toolchain Porting

Robustness is at the heart of all Embecosm’s compiler work: exhaustive testing ensures C/C++ standards-compliant code that always executes correctly.

We specialize in developing compilers very early in the life cycle — in some cases before silicon tape-out — by using processor models. Our expertise extends to high-performance compiler development for unusual architectures. We are the only consultancy that can implement LLVM for word-addressed embedded Harvard architectures with less than 32-bit addressing.

Our active research programs have allowed us to bring forward commercially robust versions of GCC and LLVM that incorporate machine learning optimizers. These ensure your code automatically uses the best optimization options, can optimize the generated code to be more energy efficient, and can apply superoptimization to achieve the best theoretical translation of key pieces of code.

The Power Problem

Modern CPU hardware has some very effective methods of saving power. Techniques such as clock-gating and dynamic voltage and frequency scaling (DVFS) ensure the hardware minimizes static and dynamic energy losses, only consuming power when it is actually doing something.

However, all these gains can be lost to poor software. Famously, an embedded Linux system saved 70% of its energy usage by switching off the blinking cursor on its console. The consequences of such poor software are mobile devices that need frequent recharging, remote sensors that need frequent battery replacement, and reduced device lifetime and reliability due to overheating.

Using the Compiler to Reduce Energy Consumption

Embecosm are pioneers in using compilers to minimize energy usage by embedded software.

The starting point is simple: a fast program is a power-efficient program. The sooner execution finishes, the sooner clock-gating and DVFS can come into effect to shut off the power drain. Our optimizing compiler back-ends provide that first step in reducing the energy consumption of your software.

Modern compilers such as the GNU Compiler Collection (GCC) and LLVM are large and complex pieces of software; GCC is between 3 and 4 million lines of code, with over 200 different optimization passes available, each controlled by its own command-line flag. In addition, the compiler must serve the conflicting needs of a diverse user base, ranging from those running programs on large server clusters to those squeezing the maximum performance out of tiny embedded processors.

For the user, selecting which optimizations to use is virtually impossible. Compilers do provide several “packaged” optimization settings, from -Os, which generates the most compact code, to -O3, which tries every optimization that might help. However, these represent a compromise suited to all users compiling for all targets, and they are focused on code size and performance alone. There is no “-Oe” to optimize for the energy consumption of the compiled program.
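A quick back-of-envelope calculation shows why hand-selecting optimizations is impractical. Treating each of the roughly 200 pass-controlling flags as an independent on/off choice (a simplifying assumption — many flags interact or take values) gives a search space far beyond exhaustive exploration:

```python
# With ~200 independent on/off optimization flags, the number of
# possible flag combinations is astronomically large.
n_passes = 200
combinations = 2 ** n_passes
print(f"{combinations:.3e} combinations")  # roughly 1.6e60
```

Even at a billion compile-and-measure cycles per second, checking every combination would take many orders of magnitude longer than the age of the universe, which is why structured sampling techniques are needed.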

Our research has shown that if you want low power consumption you can do much better than just optimizing for speed. We were the first to use fractional factorial design to identify compiler optimizations that further reduce power consumption. We can examine all the compiler optimization passes to tune your compiler to minimize the power consumption of your application.
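To illustrate the idea, here is a minimal sketch of a two-level fractional factorial screening design. The flag names are hypothetical stand-ins, not Embecosm’s actual screening set; the construction is the textbook 2^(7−4) resolution-III design, which screens seven on/off flags in only eight compile-and-measure runs instead of 128:

```python
from itertools import product

# Hypothetical GCC flags to screen; any set of on/off options works.
FLAGS = ["-funroll-loops", "-finline-functions", "-ftree-vectorize",
         "-fomit-frame-pointer", "-fgcse", "-fschedule-insns", "-fipa-cp"]

def fractional_factorial(flags):
    """Build a 2^(7-4) resolution-III design: 8 runs screen 7 flags.

    The first 3 flags form a full two-level factorial; the other 4 are
    aliased to the interaction columns (ab, ac, bc, abc) -- the standard
    construction for a saturated two-level screening design.
    """
    runs = []
    for a, b, c in product((-1, 1), repeat=3):
        levels = (a, b, c, a * b, a * c, b * c, a * b * c)
        runs.append([f for f, lv in zip(flags, levels) if lv == 1])
    return runs

design = fractional_factorial(FLAGS)
for run in design:
    # Each run would be compiled and its energy measured, e.g.:
    # gcc -O2 <subset of flags> ...
    print(" ".join(["-O2"] + run))
```

Each flag is enabled in exactly half of the runs, so comparing the average measured energy of runs with a flag on against runs with it off gives a first estimate of that flag’s main effect.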

This is not a theoretical analysis: we instrument your platform to measure, microsecond by microsecond, the energy being used. From that data we can then find the best ways for your compiler to generate energy-efficient code.

Using Machine Learning with Compilers to Optimize for Power

By using fractional factorial design, we can tune your compiler and application very effectively, but it is still an iterative procedure that takes some weeks to complete, and is very specific to a particular application or class of applications.

Wouldn’t it be great if the compiler could “learn” from this exercise? It could then immediately suggest which options would be best for future programs.

This is where machine learning techniques, combined with hardware instrumentation, can help.

We first train the compiler, using iterative compilation directed by a fractional factorial design analysis. The compiler runs with many different optimization settings over a representative set of programs. It uses machine learning techniques to map static characteristics of the programs being compiled (for example, the number of loops or the frequency of array accesses) to the optimization passes used and the energy consumed. From this it builds up an optimization database, which has “learned” which optimizations best suit code with particular characteristics.

New programs can then be supplied to the compiler, and the database used to suggest the best optimization passes to use. Compilation takes little longer than a standard compilation, yet the compiled code is close to optimal for energy usage.
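The lookup step can be sketched as follows. This is a deliberately simplified nearest-neighbour illustration of the idea — the feature vectors, flag sets, and “trained” values are invented, and a production system would use richer features and models:

```python
import math

# Toy "optimization database": static program features mapped to the
# flag set that minimized measured energy during training.
# Features here: (number of loops, array accesses per 100 statements).
TRAINED = [
    ((12.0, 40.0), ["-O2", "-funroll-loops", "-ftree-vectorize"]),
    (( 2.0,  5.0), ["-Os", "-fomit-frame-pointer"]),
    (( 6.0, 25.0), ["-O2", "-finline-functions"]),
]

def suggest_flags(features, database):
    """Return the flag set recorded for the training program whose
    static features lie closest to those of the new program."""
    _, flags = min(database, key=lambda entry: math.dist(entry[0], features))
    return flags

# A new, loop-heavy program gets the loop-oriented flag set.
print(suggest_flags((10.0, 35.0), TRAINED))
```

Because prediction is just a database lookup over features the compiler already computes, it adds almost nothing to compile time — which is the property the text above describes.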

This technology builds on earlier work in which Embecosm staff were involved in the EU-funded MILEPOST (MachIne Learning for Embedded PrOgramS opTimization) project. Running from 2006 to 2009, the project demonstrated that machine learning was highly effective in selecting the best optimizations, more than doubling execution performance in some cases.

MILEPOST was an excellent proof of the approach, but it focused only on program speed and was deeply integrated into a single release of GCC.

Embecosm is taking the lead in creating the next generation of commercially robust machine learning compiler frameworks. Our technology will work with any compiler, and can select for any measurable optimization criterion. In particular, it can use the results from hardware instrumentation to select the compiler optimizations that minimize the energy consumption of the compiled program.