The compiler tool chain is one of the largest and most complex components of any system, and increasingly will be based on open source code, either GCC or LLVM. On a Linux system only the operating system kernel and browser will have more lines of code. For a commercial system, the compiler has to be completely reliable—whatever the source code, it should produce correct, high performance binaries.
So how much does producing this large, complex and essential component cost? Thanks to open source, not as much as you might think. In this post, I provide a real-world case study, which shows how bringing up a new commercially robust compiler tool chain need not be a huge effort.
How much code?
Analysis with David A Wheeler’s SLOCCount shows that GCC is over 5 million lines. LLVM is smaller at 1.6 million lines, but it is newer, supports only C and C++ by default and includes around one third as many target architectures. However, a useful tool chain needs many more components.
- Debugger: Either GDB (800k lines) or LLDB (600k lines)
- Linker: GNU ld (160k lines), gold (140k lines) or lld (60k lines)
- Assembler/disassembler: GNU gas (850k lines) or the built in LLVM assembler
- Binary utilities: GNU (90k lines) and/or LLVM (included in main LLVM source)
- Emulation library: libgcc (included in GCC source) or CompilerRT (340k lines)
- Standard C library: newlib (850k lines), glibc (1.2M lines), musl (82k lines) or uClibC-ng (251k lines)
In addition the tool chain needs testing. In most GNU tools, the regression test suite is included with the main source. However for LLVM, the regression tests are a separate code base of 500 thousand lines. Plus for any embedded system, it is likely a debug server will be needed to talk to the debugger to allow tests to be loaded.
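To give a feel for the totals, the figures quoted above can be added up for two possible tool chains. This is only a rough sketch: the choice of components in each stack (for example musl with LLVM, newlib with GCC) is an assumption made for illustration, and GCC is counted at its quoted lower bound of 5 million lines.

```python
# Approximate sizes quoted above, in thousands of lines of code.
# The grouping into two stacks is an illustrative assumption.
GNU_STACK = {
    "GCC (includes libgcc and its tests)": 5000,  # "over 5 million", so a lower bound
    "GDB": 800,
    "GNU ld": 160,
    "GNU gas": 850,
    "GNU binary utilities": 90,
    "newlib": 850,
}

LLVM_STACK = {
    "LLVM (includes assembler and binary utilities)": 1600,
    "LLDB": 600,
    "lld": 60,
    "CompilerRT": 340,
    "musl": 82,
    "LLVM regression tests": 500,
}


def total_kloc(stack):
    """Sum the component sizes of one tool chain, in thousands of lines."""
    return sum(stack.values())


if __name__ == "__main__":
    for name, stack in (("GNU", GNU_STACK), ("LLVM", LLVM_STACK)):
        print(f"{name} stack: about {total_kloc(stack) / 1000:.1f} million lines")
```

Either way the tool chain runs to several million lines, which is why the separation of generic and target specific code matters so much.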
What is involved in porting a compiler?
Our interest is in a port of the tool chain that is robust for commercial deployment. Many PhD students round the world port compilers for their research, but their effort is dedicated to exploring a particular research theme. The resulting compiler is often produced quickly, but is neither complete, nor reliable—since this is not the point of a research program.
This article is instead concerned with creating a set of tools which reliably produce correct and efficient binaries for any source program in a commercial/industrial environment.
Fortunately, most of this huge code base is completely generic. All mainstream compiler tools go to considerable effort to provide a clean separation of target specific code, so porting a compiler tool chain to a new architecture is a manageable task. There are five stages in porting a compiler tool chain to a new target.
- Proof of concept tool chain. Initial working ports of all components are created. This prototype is essential to identify areas of particular challenge during the full porting process and should be completed in the first few months of the project. At this point it will be possible to compile a set of representative programs to demonstrate the components will work together as expected.
- Implementation of all functionality. All functionality of the compiler and other tools is completed. Attributes, builtin/intrinsic functions and emulations of missing functionality are completed. All relocations are added to the linker, the full C library board support package is written and custom options are added to tools as needed. At the end of this process, the customer will have a fully functional tool chain. Most importantly, all the regression test suites will be running, with the great majority of tests passing.
- Production testing. This is often the largest part of the project. Testing must pass in three areas:
  - regression testing, to demonstrate that the tool chain has not broken anything which works on other architectures;
  - compliance testing, often using the customer’s tests to show that all required functionality is present; and
  - benchmarking, to demonstrate that the tool chain generates code which meets the required performance criteria, whether for execution speed, code size or energy efficiency.
- Roll out. This is primarily about helping users understand their new compiler and how it differs from the previous tools, and usually involves written and video tutorials. While there will be real bugs uncovered in use, invariably there will also be numerous bug reports which amount to “this new compiler is different to the old one”. This is particularly pronounced where GCC and LLVM replace old proprietary compilers, because the step up in functionality is so great. Where there is a large user base, phased roll-out is essential to manage the initial support load.
- Maintenance. LLVM and GCC are very active projects, and new functionality is always being added, both to support new language standards in the front end and to add new optimizations in the back end. The compiler will need to be kept up to date with these changes. There will also be new target specific functionality required and, on such a large project, bugs reported by users.
How much effort: the general case
Let us consider the general case: a new architecture with a large external user base, which must support C and C++ in both bare metal and embedded Linux targets. In this case it is likely that the architecture provides a range of implementations, from small processors used as bare metal or with an RTOS in deeply embedded systems, to large processors capable of supporting a full application Linux environment.
Overall, the first production release of such a tool chain takes 1-3 engineer years. The initial proof of concept tool chain should be completed in 3 months. Implementation of all the functionality then takes a further 6-9 months, with a further 3 months if both bare metal and Linux targets are to be supported.
Production testing takes at least 6 months, but with a large range of customer specific testing this can be as large as 12 months. Initial roll-out takes 3 months, but with a large user base, phased general release can take up to 9 months more.
Maintenance effort depends hugely on the size of the customer base reporting issues and the number of new features needed. It can be as little as 0.5 engineer months per month, but is more usually 1 engineer month per month.
It is important to note that a complete team of engineers will work on this: compiler specialists, debugger experts, library implementation engineers and so on. Compiler engineering is one of the most technically demanding disciplines in computing, and no one engineer can have all the expertise needed.
How much effort: the simpler case
Not everyone needs a full compiler release for a large number of external users. There are numerous application specific processors, particularly DSPs, which are used solely in-house by one engineering company. Where such processors have proved commercially successful, they have been developed further, and what was a tiny core programmed in assembler by one engineer has become a much more powerful processor with a large team of assembler programmers. In such cases moving to C compilation would mean a great increase in productivity and reduction in cost.
For such use cases, the tool chain need only support C, not C++, and a minimal C library is sufficient. There may well be a pre-existing assembler and linker that can be reused. This greatly reduces the effort and timescales, to as little as one engineer year for a full production compiler.
The proof-of-concept still takes 3 months, but then completing full functionality can be achieved in as little as 3 more months. Production testing is still the largest effort, taking 3-6 months, but with a small user base 3 months is more than sufficient for roll out.
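The stage figures for the simpler case can be added up as a quick sanity check. The sketch below assumes the stages run one after another and that one elapsed month corresponds to one engineer month; both are simplifications for illustration, not statements about how any particular project is staffed.

```python
# Stage durations for the simpler case in months, as (minimum, maximum).
# Where the text gives a single figure, minimum and maximum are set equal
# (an assumption made to keep the sketch simple).
SIMPLER_CASE = {
    "proof of concept": (3, 3),
    "full functionality": (3, 3),
    "production testing": (3, 6),
    "roll out": (3, 3),
}


def total_months(stages):
    """Return (minimum, maximum) total months, assuming sequential stages."""
    low = sum(lo for lo, _ in stages.values())
    high = sum(hi for _, hi in stages.values())
    return low, high


if __name__ == "__main__":
    low, high = total_months(SIMPLER_CASE)
    print(f"Simpler case: {low} to {high} months")
```

Under these assumptions the minimum comes to 12 months, which matches the figure of as little as one engineer year for a full production compiler.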
The tool chain still needs to be maintained, but for this simpler system with a small user base, an effort of 0.25 engineer months/month is typically enough.
For the smallest customers, it can be sufficient to stop after completing full functionality. If there are only a handful of standard programs to be compiled, it may be enough to demonstrate that the compiler handles these well and efficiently without progressing to full production testing.
A case study
In 2016, Embecosm was approached by an electronic design company, which for many years had used an in-house 16-bit word-addressed DSP designed to meet the needs of their specialist area. This DSP was now on its third generation and they were conscious that they needed a great deal of assembler programming effort. This was aggravated by the standard codecs on which they relied having C reference implementations. They had an existing compiler, but it was very old and porting it to the new generation DSP was not feasible.
Embecosm were tasked with providing an LLVM-based tool chain capable of compiling their C codecs and delivering high quality code. There was an assumption that this code would then be hand-modified if necessary. They had an existing assembler/linker, which worked by combining all the assembler into a single source file, resolving cross references and generating a binary file to load onto the DSP. The customer was also keen to build up in-house compiler expertise, so one of their engineers joined the Embecosm implementation team and has been maintaining the compiler since the end of the project.
In the first 3 months, we created a tool chain based on their existing assembler/disassembler. In order to use newlib, we created a pseudo-linker, which would extract the required files from newlib as source assembler to combine with the test program. Because silicon was not yet available, we tested against a Verilator model of the chip. For this we wrote a gdbserver, allowing GDB to talk to the model. In the absence of ELF, source level debugging was not possible, but GDB was capable of loading programs and determining results, which was sufficient for testing purposes. In the absence of 16-bit char support in LLVM, we used packed chars for the proof-of-concept. This meant many character-based programs would not work, but was sufficient for this stage.
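A gdbserver of this kind talks to GDB using the GDB Remote Serial Protocol, in which each packet is sent as `$`, the payload, `#` and a two digit hexadecimal checksum (the sum of the payload bytes modulo 256). The following is a minimal sketch of that framing only. It is not the server written for this project, the function names are made up for illustration, and it leaves out escaping of special characters, acknowledgements and everything that actually drives the model.

```python
def rsp_checksum(payload: str) -> int:
    """Checksum used by the GDB remote protocol: payload bytes summed modulo 256."""
    return sum(payload.encode("ascii")) % 256


def rsp_frame(payload: str) -> str:
    """Wrap a payload as $<payload>#<two hex digit checksum>."""
    return f"${payload}#{rsp_checksum(payload):02x}"


def rsp_unframe(packet: str) -> str:
    """Check the framing and checksum of a packet and return its payload."""
    if len(packet) < 4 or not packet.startswith("$") or packet[-3] != "#":
        raise ValueError("malformed packet")
    payload = packet[1:-3]
    if rsp_checksum(payload) != int(packet[-2:], 16):
        raise ValueError("checksum mismatch")
    return payload


if __name__ == "__main__":
    print(rsp_frame("OK"))        # a typical success reply from the server
    print(rsp_unframe("$g#67"))   # GDB asking for the general registers
```

A real server sits in a loop reading such packets from a socket, translating each request into an access to the simulated chip and framing the reply in the same way.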
This allowed us to compile representative test programs and demonstrate that the compiler tool chain would work. It became clear that there were two major obstacles to achieving full functionality: 1) lack of ELF binary support; and 2) lack of proper 16-bit character support.
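The difference between packed chars and true 16-bit chars can be illustrated with a small model of word-addressed memory. This is a sketch under assumptions: the byte order within a packed word (low byte first) is chosen arbitrarily here, and the helper names are hypothetical rather than anything from the actual port.

```python
def pack_chars(text: bytes) -> list:
    """Store two 8-bit characters in each 16-bit word (low byte first, assumed)."""
    padded = text + b"\0" * (len(text) % 2)  # pad to a whole number of words
    return [padded[i] | (padded[i + 1] << 8) for i in range(0, len(padded), 2)]


def wide_chars(text: bytes) -> list:
    """Store one character in each 16-bit word, so each has its own address."""
    return list(text)


if __name__ == "__main__":
    print([hex(w) for w in pack_chars(b"abc")])
    print([hex(w) for w in wide_chars(b"abc")])
```

With packing, the second character of each pair has no word address of its own, so an ordinary `char *` cannot point at it, which is one reason character-based programs tend to break. With one character per word, every character is individually addressable at the cost of memory.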
For phase two, we implemented a GNU assembler/disassembler using CGEN, which required approximately 10 days of effort. We also implemented 16-bit character support for LLVM as documented in this blog post. With these two features, completing the tool chain functionality became much more straightforward and we were able to run standard LLVM lit and GCC regression tests for the tool chain, the great majority of which passed. The DSP has a number of specialist modes for providing saturating fixed-point arithmetic. To support these we implemented specialist builtin and intrinsic functions.
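For readers unfamiliar with saturating arithmetic, the sketch below contrasts a saturating 16-bit signed addition with the usual wrap-around behaviour. It is a generic illustration only: the DSP's actual modes and the names of the builtins are not described here, so the function names are hypothetical.

```python
INT16_MIN, INT16_MAX = -32768, 32767


def sat_add16(a: int, b: int) -> int:
    """Add two 16-bit signed values, clamping the result to the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, a + b))


def wrap_add16(a: int, b: int) -> int:
    """Add two 16-bit signed values with two's complement wrap-around."""
    return ((a + b + 32768) % 65536) - 32768


if __name__ == "__main__":
    print(sat_add16(30000, 10000))   # clamps at the largest positive value
    print(wrap_add16(30000, 10000))  # wraps round to a negative value
```

In signal processing code, clamping is usually much less harmful than wrapping, since an overflow produces a clipped sample rather than one of the opposite sign. A builtin lets the C programmer request the hardware's saturating operation directly instead of writing the clamp by hand.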
At this point we had a compiler which correctly compiled the customer’s code. The ELF support meant techniques such as link-time optimization (LTO) and garbage collection of sections were possible, leading to successful optimization of the code so it met the memory constraints of the customer. With an investment of 120 engineer days, they had achieved their goal of being able to compile C code efficiently for their new DSP.
The customer concluded that they had all the functionality they needed at this point and that no further work was required. Should they decide to make the compiler more widely available, they have the option to continue with full production testing of the compiler tool chain.
Two factors made it possible to deliver a fully functional compiler tool chain in 120 engineer days.
- Using an open source compiler. The tools used in this project represent a cumulative effort of thousands of engineer years by the compiler community over three decades. Our customer was able to take advantage of this to acquire a state-of-the-art compiler tool chain.
- An expert compiler team. Although this was a 120-day project, a team of five was involved, each bringing years of specialist expertise. No one individual could know everything about emulation, GDB, CGEN assemblers, the GNU linker and the LLVM compiler. But within Embecosm we have the skill set to deliver the full project.
If you would like to know more about bringing up a compiler tool chain for your processor, please get in touch.