GCC gets an overhaul
Published: 15 Mar 2005 12:00 GMT
The entire realm of open source software could get a performance boost if all goes well with a plan to overhaul the GNU Compiler Collection, more commonly known as GCC.
Almost all open source software is built with GCC, a compiler that translates the source code programmers write into the binary instructions a computer runs. The forthcoming GCC 4.0 includes a new foundation that will allow that translation to become more sophisticated, said Mark Mitchell, the GCC 4 release manager and "chief sourcerer" of a small company called CodeSourcery.
"The primary purpose of 4.0 was to build an optimisation infrastructure that would allow the compiler to generate much better code," Mitchell said.
Compilers are rarely noticed outside the software development community, but GCC carries broad significance. For one thing, an improved GCC could boost performance for the open source software realm — everything from Linux and Firefox to OpenOffice.org and Apache.
For another, GCC is a foundation for an entire philosophy of cooperative software development. It's not too much of a stretch to say GCC is as central an enabler to the free and open source programming movements as a free press is to democracy.
GCC was one of the original projects in the GNU effort. Richard Stallman launched GNU and the accompanying Free Software Foundation in the 1980s to create a clone of Unix that was free from proprietary licensing constraints.
The first GCC version was released in 1987, and GCC 3.0 was released in 2001. A company called Cygnus Solutions, an open source business pioneer acquired in 1999 by Linux seller Red Hat, funded much of the compiler's development.
But improving GCC isn't a simple matter, said Evans Data analyst Nicholas Petreley. Moving from GCC 3.3 to 3.4 brought performance improvements, but at the expense of backwards-compatibility: some software that compiled fine with 3.3 broke with 3.4, Petreley said.
RedMonk analyst Stephen O'Grady added that updating GCC shouldn't compromise its ability to produce software that works on numerous processor types.
"If they can achieve the very difficult goal of not damaging that cross-platform compatibility and backwards-compatibility, and they can bake in some optimisations that really do speed up performance, the implications will be profound," O'Grady said.
What's coming in 4.0
GCC 4.0 will bring a foundation to which optimisations can be added. Those optimisations can take several forms, but in general, they'll give the compiler ways to look at an entire program.
For example, the current version of GCC can optimise small, local parts of a program. But one new optimisation, called scalar replacement of aggregates, lets GCC find data structures that span a larger amount of source code. GCC then can break those objects apart so that object components can be stored directly in fast on-chip memory rather than in sluggish main memory.
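As a rough illustration of the idea, the C sketch below shows a function written with a small structure, followed by a hand-written version in which the structure has been split into separate scalar variables. The names struct point, sum_coords and sum_coords_scalarised are hypothetical and chosen only for this example; the second function approximates by hand what the optimisation does conceptually and is not GCC's actual output.

```c
#include <stddef.h>

/* Hypothetical aggregate, used only for illustration. */
struct point {
    int x;
    int y;
};

/* As a programmer might write it: the running totals live in a struct. */
int sum_coords(const int *xs, const int *ys, size_t n)
{
    struct point p = { 0, 0 };
    for (size_t i = 0; i < n; i++) {
        p.x += xs[i];
        p.y += ys[i];
    }
    return p.x + p.y;
}

/* Hand-written equivalent after the struct is broken apart: each field
 * becomes an independent scalar, which a compiler can more easily keep
 * in a register for the whole loop. */
int sum_coords_scalarised(const int *xs, const int *ys, size_t n)
{
    int p_x = 0;
    int p_y = 0;
    for (size_t i = 0; i < n; i++) {
        p_x += xs[i];
        p_y += ys[i];
    }
    return p_x + p_y;
}
```

Both functions compute the same result; the difference lies only in how easily the compiler can keep the running totals out of main memory.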
"Optimisation infrastructure is being built to give the compiler the ability to see the big picture," Mitchell said. The framework is called Tree SSA.
However, Mitchell said the optimisation framework is only the first step. Next will come writing optimisations that plug into it. "There is not as much use of that infrastructure as there will be over time," Mitchell said.
One optimisation that is likely to be introduced in GCC 4.1 is called autovectorisation, said Richard Henderson, a Red Hat employee and GCC core programmer. That feature economises processor operations by finding areas in software in which a single instruction can be applied to multiple data elements, something handy for everything from video games to supercomputing.
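The kind of code that stands to benefit can be sketched in C. The first function below is an ordinary element-by-element loop; the second is a hand-written approximation of the transformation, handling four elements per iteration so that each group could map onto one wide instruction. The function names and the group size of four are assumptions made for this illustration, not details taken from GCC.

```c
#include <stddef.h>

/* Plain loop: one addition per element. The restrict qualifiers promise
 * the arrays do not overlap, which is what makes the loop safe to
 * process several elements at a time. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Hand-written approximation of the vectorised form: four independent
 * additions per iteration, then a scalar loop for leftover elements. */
void add_arrays_by_four(float *restrict dst, const float *restrict a,
                        const float *restrict b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        dst[i]     = a[i]     + b[i];
        dst[i + 1] = a[i + 1] + b[i + 1];
        dst[i + 2] = a[i + 2] + b[i + 2];
        dst[i + 3] = a[i + 3] + b[i + 3];
    }
    for (; i < n; i++)
        dst[i] = a[i] + b[i];
}
```

With autovectorisation, the programmer would write only the first form and leave the grouping to the compiler.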





