Free software is everywhere. You may be reading this article on a smartphone running Android, using a browser such as Firefox, with the article supplied by a web server running Apache. All this works because software has a marginal cost of distribution that is effectively zero. It costs no more to supply one million copies of a program than a single copy. Instead we can make money from anything associated with that software, whether it is advertising on the site from which it is downloaded, hardware on which to run it, or services to support it. And typically there is much more money to be made in this way if there are a million copies of that software out there.
But how about applying this model to hardware rather than software? Real hardware costs money, and even if it is cheap hardware, a million copies will be a lot of money to give away in order to reap profit from advertising or services. What we are really talking about is not free and open source hardware, but free and open source hardware designs. Like software, designs also have a zero marginal cost of distribution, so can be provided by the million. And if other things can be sold alongside those millions of copies, then “free” business models can still apply.
Softcores for FPGAs fall squarely into this category. Indeed, blasting a bitstream onto an FPGA is functionally little different from loading a program into a computer’s memory. So there can be a business case for free softcores.
Which free softcore?
I’ll focus primarily on four widely used free softcores. Major FPGA providers offer “free” cores, such as Nios or MicroBlaze, but in the sense of “not paid for”, and we are not interested here in that sort of “free”. The four cores we shall consider are the OpenRISC 1000 architecture from the OpenCores community, the Lattice Semiconductor LM32, the LEON3 from Aeroflex Gaisler and the OpenSPARC family from Oracle. All are made freely available with their RTL. I’ll describe the OpenRISC 1000 in most detail — partly because it has the most active community — and then describe the other processors largely by comparison.
The OpenRISC 1000
The OpenRISC 1000 architecture dates from an initiative to develop a freely available RISC architecture by Damjan Lampret and colleagues in 1999. For the first few years the project was backed by their employer, Flextronics, who produced an ASIC version of the design. From the start the project was led by an independent community, opencores.org. While opencores.org hosts well over 1000 projects today, the OpenRISC 1000 remains its flagship project.
The OpenRISC 1000 is not an actual chip design, but an architecture specification for a family of 32 and 64-bit processors. The specification includes options for floating point and vector processing support as well as multi-core implementations. Architecturally it draws heavily on the DLX design of David Patterson and John Hennessy, which itself is closely based on MIPS. There is a regular array of 32 32-bit registers, with simple addressing modes, and all arithmetic is register-to-register. It is a Harvard architecture in the sense that it has separate instruction and data buses, but instructions and data share a unified memory address space.
The first implementation was the OpenRISC 1200, written in Verilog, which offered 32-bit integer functionality. Although always conceived as being suitable for FPGA use, this first design was also made into an ASIC by Flextronics, requiring around 150 thousand gates plus memory blocks.
A processor is only as good as the peripherals around it, and from an early stage, a reference system-on-chip was developed using free IP blocks, as a way to test the processor design. The OpenRISC Reference Platform System-on-Chip, ORPSoC, adds memory management units, a UART, flash and SRAM controllers, with GPIO, VGA, PS2, Ethernet and audio interfaces. The implementation uses a 4/5 stage pipeline, much like early MIPS designs.
A SoC needs a bus to connect components, and popular buses, such as AMBA, are proprietary. OpenCores has developed its own bus architecture, Wishbone, which has gone on to be used by many other projects, including the LM32 (of which more later). Wishbone is intended as a “logic bus”. It does not specify electrical information or the bus topology. Instead, the specification is written in terms of “signals”, clock cycles, and high and low levels.
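To make the “logic bus” idea concrete, here is a minimal Python sketch of a single Wishbone-style read and write, modelled purely as signal levels sampled once per clock cycle. The signal names CYC, STB, WE, ADR, DAT and ACK come from the Wishbone specification. The `WishboneSlave` class, its registered acknowledge and the `master_transfer` helper are assumptions made for illustration and are not taken from any OpenCores code.

```python
class WishboneSlave:
    """Toy memory slave: raises ACK on the clock edge where it sees CYC and STB high."""

    def __init__(self, size=16):
        self.mem = [0] * size
        self.ack = 0      # ACK level driven back to the master
        self.dat_o = 0    # read data, meaningful while ack is high

    def clock(self, cyc, stb, we, adr, dat_i):
        """Advance one clock edge using the levels the master is driving."""
        if cyc and stb and not self.ack:
            if we:
                self.mem[adr] = dat_i       # write cycle
            else:
                self.dat_o = self.mem[adr]  # read cycle
            self.ack = 1
        else:
            self.ack = 0
        return self.ack, self.dat_o


def master_transfer(slave, adr, we=0, data=0, max_cycles=8):
    """Hold CYC and STB high until ACK is seen; return (read_data, cycles_used)."""
    for cycle in range(1, max_cycles + 1):
        ack, dat = slave.clock(cyc=1, stb=1, we=we, adr=adr, dat_i=data)
        if ack:
            # The master drops CYC and STB; the slave sees this on the next edge.
            slave.clock(cyc=0, stb=0, we=0, adr=0, dat_i=0)
            return dat, cycle
    raise TimeoutError("slave never asserted ACK")


slave = WishboneSlave()
master_transfer(slave, adr=3, we=1, data=0xBEEF)   # write 0xBEEF to address 3
value, cycles = master_transfer(slave, adr=3)      # read it back
```

Nothing in the sketch mentions voltages, connectors or how many masters share the bus, which is the point: the protocol is defined only by which signals are high or low on each clock.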
The OpenRISC 1200 and ORPSoC were the mainstay of OpenRISC development for many years, with implementations available for common FPGA development boards. However, being a freely available design, there is nothing to stop others coming up with their own implementations. For example Raul Fajardo, a research student at Hanover University, developed a simpler SoC, minsoc, during 2011.
Increasing frustration with some aspects of the OR1200 implementation led Julius Baxter to write a new implementation of the OpenRISC 32-bit architecture, mor1kx. This design was intended to eliminate some of the bottlenecks that limited clock speed with the OR1200 design. It has now grown to 3 variants, with different pipeline implementations. “Cappuccino” provides a 6-stage pipeline with MMU and cache support aimed at top performance for designs that will run Linux. “Espresso” is a simpler 2-stage pipeline without cache or MMU support, intended for more deeply embedded applications. “Pronto Espresso” is the simplest of all: it also has a 2-stage pipeline, but eliminates the branch delay slot.
The original OpenRISC 1000 architecture specification did not conceive of removing the delay slot. However, it is a living document and continues to be refined and extended as required, in this case by allowing branches without delay slots. The benefit of the re-implementation can be seen in the performance figures, with Cappuccino achieving around 100 MHz on common FPGA boards, while the old OR1200 struggles to exceed 25 MHz.
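For readers unfamiliar with branch delay slots, the toy interpreter below shows the difference that removing one makes to program behaviour. It is a sketch only: the `addi`, `j` and `halt` operations and the `run` function are hypothetical and do not reflect real OpenRISC encodings or any mor1kx code.

```python
def run(program, delay_slot, max_steps=100):
    """Execute a toy program and return the final register values.

    program: list of ("addi", reg, imm), ("j", target_index) or ("halt",)
    delay_slot: if True, the instruction after a jump runs before the jump lands.
    """
    regs = {}
    pc = 0
    pending = None  # jump target waiting for the delay slot instruction to finish
    for _ in range(max_steps):
        op = program[pc]
        next_pc = pc + 1
        if pending is not None:
            # We are executing the delay slot: afterwards, land on the jump target.
            next_pc, pending = pending, None
        if op[0] == "addi":
            regs[op[1]] = regs.get(op[1], 0) + op[2]
        elif op[0] == "j":
            if delay_slot:
                pending = op[1]
            else:
                next_pc = op[1]
        elif op[0] == "halt":
            return regs
        pc = next_pc
    raise RuntimeError("step limit reached")


PROGRAM = [
    ("addi", "r1", 1),
    ("j", 3),
    ("addi", "r2", 10),   # sits in the delay slot of the jump above
    ("addi", "r3", 100),
    ("halt",),
]

with_slot = run(PROGRAM, delay_slot=True)      # r2 is updated
without_slot = run(PROGRAM, delay_slot=False)  # r2 is never touched
```

The same instruction sequence gives different results in the two modes, so compilers and hand-written assembly need to know which variant they are targeting.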
There are many other projects using OpenRISC, particularly in academic research. For example, Stefan Wallentowitz at the Technical University of Munich has developed a multi-core implementation. A large team led by Prof Luca Benini at the University of Bologna, and also involving researchers at ETH Zurich, is currently developing an ultra-low power multi-core OpenRISC SoC, which will be fabricated by ST Microelectronics.
The reference SoC, ORPSoC, remains a key proving ground for all these processor implementations. Its third revision, released this year, provides a much more configurable design flow, making it easy to add and remove peripherals as required. The design is now capable of fitting on sub-$100 FPGA development boards while running Linux.
The Lattice Semiconductor LM32
Like the OpenRISC 1000, the LM32 is a RISC design, with 32 32-bit general purpose registers. It is also a Harvard architecture, with separate instruction and data buses but a unified address space. Arithmetic operations are always register-to-register. Unlike OpenRISC, LM32 is not an architecture specification, but an actual chip design and implementation. It features a 6-stage pipeline which is fully bypassed and interlocked. Lattice Semiconductor make available a range of free peripherals, allowing SoC implementations to be constructed. These include SDRAM, DDR and DMA controllers, a timer, and GPIO, I2C and SPI interfaces.
The best-known user of the LM32 is the Milkymist project, which has developed an open source video synthesis system for live performance artists. At the heart of the hardware is an FPGA using the LM32 processor. Milkymist engineers have modified the LM32 design and contributed those modifications back to the wider community.
The LM32 is the youngest of the four free designs considered here, and while it is a comprehensive and well supported design, its youth means it has the smallest support community. The fact that it is a processor design, rather than an architecture specification, means the opportunity for reimplementation is somewhat limited.
The Aeroflex Gaisler LEON3
The LEON project was initiated by the European Space Agency, ESA, in late 1997 to study and develop a high-performance processor to be used in European space projects. The objectives for the project were to provide an open, portable and non-proprietary processor design, capable of meeting future requirements for performance, software compatibility and system cost.
Being aimed at space environments, another objective was to be able to manufacture in a single event upset (SEU) sensitive semiconductor process. To maintain correct operation in the presence of SEUs, extensive error detection and error handling functions were needed. The goal was to be able to detect and tolerate one error in any register without software intervention, and to suppress effects from single event transient (SET) errors in combinatorial logic.
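One widely used way to tolerate a single upset in a register is triple modular redundancy with a bitwise majority vote. The sketch below is a generic illustration of that idea and is not a description of the mechanism LEON actually uses; the `TmrRegister` class and its method names are hypothetical.

```python
class TmrRegister:
    """32-bit register held as three copies with bitwise majority voting."""

    MASK = 0xFFFFFFFF

    def __init__(self, value=0):
        self.copies = [value & self.MASK] * 3

    def write(self, value):
        self.copies = [value & self.MASK] * 3

    def read(self):
        a, b, c = self.copies
        # Each result bit takes the value held by at least two of the three copies.
        voted = (a & b) | (a & c) | (b & c)
        # Scrub: rewrite all copies so a single corrupted copy is repaired.
        self.copies = [voted] * 3
        return voted

    def inject_upset(self, copy_index, bit):
        """Simulate an SEU by flipping one bit in one of the copies."""
        self.copies[copy_index] ^= (1 << bit)


reg = TmrRegister(0x12345678)
reg.inject_upset(copy_index=1, bit=7)   # corrupt one copy
recovered = reg.read()                  # the vote masks the error
```

The read returns the original value with no software involvement, at the cost of three times the storage. Two upsets in the same bit position of different copies would defeat the vote, which is why the stated goal is one error per register.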
All LEON devices are implemented in VHDL. The first, LEON 1, was a test chip to prove the fault-tolerance concepts. LEON 2 was used in the commercially available Atmel AT697. Both LEON 1 and LEON 2 were developed by ESA. LEON 3 was developed commercially, although still as a free design, by Gaisler Research, Jiri Gaisler having led the original LEON development. Gaisler Research is now Aeroflex Gaisler, and has recently announced the LEON 4 processor.
LEON is not a new processor architecture, but instead is based on the SPARC v8 RISC architecture. This has the advantage that a huge body of SPARC software can be reused. Although all LEON versions are fault-tolerant designs, smaller and simpler non-fault-tolerant versions are provided.
All LEON implementations are based on a 5-stage pipeline. The new LEON 4 adds branch prediction to the pipeline for increased performance. As with OpenRISC 1000 and LM32, free peripheral IP components are provided to build SoCs, but in the case of LEON, using the proprietary ARM AMBA AHB bus. The peripherals are marketed by Aeroflex Gaisler as GRLIB. While originally quite small in number, for LEON 3, the range of peripherals is extensive, with a large number of memory and bus controllers and a wide range of interface blocks.
While the OpenRISC 1000 and LM32 are both primarily aimed at FPGAs, LEON is intended to be used for ASIC. This is particularly true for space applications, where suitable ASIC processes are needed to achieve fault-tolerance. However, LEON 3 has proved popular in other applications, not least because of its SPARC compatibility.
The Oracle OpenSPARC
OpenSPARC has the youngest open source pedigree of the four processors described here, but it is the oldest architecture. It arose from Sun Microsystems’ decision to open source the UltraSPARC T1 design in 2005. Like LEON, this follows the SPARC architecture, but is implemented in Verilog rather than VHDL. In late 2007 Sun also open sourced the UltraSPARC T2 design. Oracle have since acquired Sun Microsystems and now have responsibility for the UltraSPARC project.
This is the only 64-bit free softcore, and the only true multi-core implementation, although there are research multi-core implementations of the OpenRISC 1000. Unlike the other three softcores, the design is aimed at desktop environments.
Other free softcores
There are a huge number of softcores available, many developed for academic research. The expiration of some of the early RISC patents has meant that free versions of early designs are now available.
Perhaps one of the most exciting is the Bluespec Extensible RISC Implementation, BERI, developed by Simon Moore and his team at Cambridge University. This is a 64-bit implementation of the MIPS ISA, using Bluespec. This project has also been extended to add capability hardware for security applications. The Capability Hardware Enhanced RISC Instructions (CHERI) design runs a version of FreeBSD with the Capsicum hybrid capability operating system extensions.
This is also an interesting project, in that while fully free and open source, it relies on a technology, Bluespec, for which only proprietary tools are available. However this has in turn spawned a project to develop free Bluespec tools.
Commercial applications using free softcores
OpenRISC was first fabricated into a commercial standalone ASIC by Flextronics in 2003. More recently it has been used by Samsung in their set top box processors, starting with the SDP-83 ‘B’ series through to the SDP-1003 and SDP-1006 ‘E’ series. A fault-tolerant version of OpenRISC was developed by the Swedish space and defense company ÅAC Microtec, and flew in NASA’s TechEdSat last year. Beyond Semiconductor is a Slovenian chip design company founded by some of the original OpenRISC project team. One of its BA family of processors, derived from the original OpenRISC, was used by NXP in its JN5148 ultra-low power Zigbee transceiver chip.
As mentioned earlier the LM32 is a relatively new processor. By far its most high-profile use has been in the Milkymist video mixer, which, while an open source community project, is intended to deliver a product for commercial deployment. Its primary commercial role is, however, to be the free processor IP for Lattice Semi’s FPGA products, and as such it is most widely seen in their FPGA development boards.
LEON has been used for many space based projects by both the European Space Agency and NASA. The EADS-Astrium SCOC3 is a “spacecraft controller on a chip” developed using the LEON3-FT processor design.
Central to any processor’s adoption is the availability of a robust software ecosystem. This applies just as much to free softcores.
Compiler tool chains
A processor must have a compiler, and all four designs offer the GNU tool chain. A challenge for commercial deployment is to develop not just a functional compiler, but one that is robustly accurate. LEON and OpenSPARC can rely on the standard SPARC compilers, although this has the disadvantage of not offering processor specific optimization. Aeroflex Gaisler has addressed this for LEON with their own variants of GCC, although as a consequence their tool chain is based on a very out-of-date version (GCC 4.4.2). Embecosm developed a commercially robust release of GCC 4.5.1 for OpenRISC for the TechEdSat project. The compiler has since continued to be developed by the community, with GCC 4.8 being available. LM32’s compiler is not only up-to-date, but has been adopted into the FSF mainline.
These are not the only compilers in town. Oracle encourages users to use their proprietary Sun Studio compiler. SPARC is supported by the generic LLVM compiler, while an experimental version of LLVM 3.3 for the OpenRISC 1000 has been developed by Simon Cook of Embecosm and Stefan Kristiansson.
All four platforms offer Linux support. OpenRISC support is part of the official Linux kernel distribution; the latest version, 3.11, is available and is commonly used with BusyBox. LM32 has also offered Linux, although it is not part of the official distribution and, at the time of writing, its status and availability are unclear.
LEON3 has its own port of Linux, the Snapgear Embedded Linux. However it has a problem in that it uses very old versions of the kernel. For non-fault tolerant processors it is kernel 188.8.131.52, while the fault-tolerant version only offers an ancient 2.0.x based kernel.
OpenSPARC, as befits a design aimed at desktop environments, offers not just Linux kernels, but full Linux distributions. Both Gentoo and Ubuntu distributions are supported. OpenSPARC also supports other Unix variants, with both FreeBSD and OpenSolaris available.
Real Time Operating Systems
The OpenRISC 1000, LM32 and LEON3 are all aimed at embedded applications. For larger applications, Linux may be appropriate, but for more deeply embedded uses, a real-time operating system is needed.
OpenRISC is supported by RTEMS, FreeRTOS and eCos, although the status and quality of these ports depends on the activity and motivation of the particular project carrying out the work. For LM32 there is some commercial RTOS support with uC/OS-II, and it also supports the open source uITRON RTOS.
LEON3 is the clear leader in RTOS support, with implementations of many commercial RTOS, such as RTLinux, PikeOS, VxWorks and LynxOS. It also supports the open source RTEMS and eCos.
Licensing free softcores
Licensing of softcores is something of a gray area. In general developers have tended to regard RTL implementations as software. The design is then licensed using standard free and open source software licenses, with the RTL as source code and the bitstream as object code. While this is probably legally defensible for FPGAs, it is not clear that the provisions of such licenses would apply to ASIC manufacture. It is also problematic even with FPGA, because such licenses tend to refer to terminology, such as “linking”, “compiling” and “object code”, which are not directly translatable to FPGA synthesis.
In many cases commercial manufacturers have stuck to the spirit of the licensing, with for example Samsung making the RTL of the OR1200 available through their open source download center. Others have not been so forthcoming. One supplier likes to proclaim their OpenRISC heritage until you request the RTL source, at which point they are keen to tell you that it is a completely new design. Similarly, the novel fault-tolerant features of the TechEdSat design have never been made publicly available.
This in part represents the relative immaturity of some of the players in open source softcore design. The same was the case in the early days of free and open source software. However, in time companies learn that if you take this approach, you will lose the community support on which you depend. You also lose the trust of end users: if you are willing to betray the community which provided the original design, how will you treat your customers? It is thus perhaps not surprising that the largest companies, with past experience in open source software, have taken the most enlightened approach to free softcores.
In general free softcores have relied on standard open source software licenses. It is not clear that this is legally completely effective, even for FPGA, since the terminology (compiling, linking and so on) does not map directly to RTL synthesis. However, a processor is only useful with its ISA description, and standard free documentation licenses, such as the GFDL and Creative Commons, are effective here.
Looking forward there are now alternatives, better suited to hardware, such as the permissive SolderPad License, the weakly copyleft CERN Open Hardware License and the more strongly copyleft TAPR Open Hardware License. These are still not perfect for protecting silicon chip designs, but they are an important step on the way.
The free softcore communities continue to thrive, as can be seen by the rapid growth of projects on communities such as opencores.org. But perhaps the most interesting recent development is the emergence of crowd funding.
The problem with paying for open source development is that you are at risk of paying for something from which others (possibly competitors) will then benefit without having contributed. There are many models of crowd funding, but the Kickstarter approach, which is essentially about pre-paid sales, is of most interest.
You may have seen very recently the Kickstarter project to develop an open source GPU. This will require many people to come together before the project can get off the ground. Unless everyone plays their part it won’t happen; players hoping to come along afterwards and get something for free risk not having the project start at all.
We are not sure this particular project is going to succeed – we think they have their benefits (the pre-sales) poorly defined. However the general approach is something that may allow competitors to jointly support free and open source development that is to their mutual benefit.
- there are a range of well established free softcores available;
- these free softcores have been deployed in many commercial projects;
- there is generally good tool, software and operating system support;
- there are strong communities supporting free softcores;
- commercial support is available where needed;
- the risks are well understood and mitigated; and finally
- there is no reason not to consider a free softcore for your next project.
OpenRISC is a community project of which Embecosm is just a small part. It is the cumulative result of 14 years’ work by a very large number of people.
This post is a shortened version of a talk given by Simon Cook to the NMI meeting “Embedded Processors – You’ve got the power, but which to choose?” held on 24 October 2013 in Bristol, UK. The full paper is available as Embecosm Article EAR15.