I have started work on a simple but powerful BSD-licensed C compiler suite at https://github.com/andrewchambers/c (a C port and continuation of my now-frozen Go-based C compiler). After a few months of work in my free time, the compiler builds some non-trivial test cases on amd64 Ubuntu, but no real software yet.
I encourage you to clone it and have a play around.
Some general goals I have in mind are:
- Compile times that are 2 to 5 times faster than gcc or clang. TCC is 10 times faster, but does not emit text assembly or build an AST.
- Be one to two orders of magnitude smaller than gcc and clang/llvm. For every million lines of gcc code, we could have ten thousand lines of code.
- Emit assembly that performs at least as well as tcc's output. This is a deliberately modest performance goal, so that we don't prematurely focus on it over compatibility.
- Have the whole system build from source in less than 30 seconds (probably much less) on a modest desktop machine or even on low-end ARM systems.
- Be zero-config compatible with the excellent musl libc on Linux.
To answer why I would start a new compiler suite from scratch, perhaps the following will resonate with you.
GCC is large, complicated, and non-standard. Porting it is generally difficult and out of reach of hobbyists. Building it from source takes anywhere from 20 minutes to many hours. LLVM and Clang suffer from the same issues, and they have added CMake to the list of things I can't get behind.
For most of my use cases I question the need for hundreds of thousands of lines of optimizer code. I think the 30-second build of the Google Go toolchain and stdlib shows this nicely. I would prefer a simple C compiler written in C to a complicated C++ compiler written in C++ that supports all of C++ with C on the side.
Bootstrapping these cross compilers with working libcs is so complicated and arcane that there are dedicated tools like buildroot and crosstool-ng just to manage the complexity.
Both of these compilers also seem to require more RAM and CPU to self-host than modest hardware or emulators like QEMU can provide. This is a serious barrier to overcome when trying to work with many platforms.
TCC is extremely fast and small, and I generally use it as my primary C compiler when I don't want to deal with GCC. I have two issues with this compiler, though.
I don't think I am alone in saying the code style is terse and hard to understand. Perhaps it was written with speed alone in mind, perhaps the lack of an AST has allowed some ugly hacks into the code base, or perhaps my taste is just different. I would encourage you to make this judgement call for yourself by comparing the code.
The major limitation, however, is that because TCC emits binary directly with no text assembly, it is much harder to use with hobby systems that already have their own assemblers. This was the main deal-breaker for me.
These are the best candidates so far to meet my goals. All I can really say is that I think we can take the best ideas from these projects, and I have no problem sharing code and design in order to create the best system possible.