Thursday, 3 December 2015

Self hosted C - breakdown

I did it. It wasn't easy, but I did it. My C compiler can compile itself. Even though it still has holes in functionality and obvious bugs, it gives me a funny sense of pride that my compiler can now be used to improve itself. I consider it a significant milestone, and this post shares an overview of what was involved.

Let's look at the breakdown...

The self hosting commit:

The final commit needed for self-hosting added support for C va_list, just enough to compile the code used in the compiler itself.
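For readers who have not used va_list, here is a minimal sketch of the kind of variadic code a compiler tends to contain, such as a printf-style diagnostic helper. It is not taken from the compiler's source: the names sum_ints and format_diag are hypothetical and chosen only for illustration. It shows the standard <stdarg.h> pattern of va_start, va_arg and va_end, plus forwarding a va_list to vsnprintf.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical example: add up `count` int arguments.
 * The callee cannot discover how many variadic arguments were passed,
 * so the caller has to say so explicitly. */
int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;

    va_start(ap, count);            /* start after the last named parameter */
    for (int i = 0; i < count; i++)
        total += va_arg(ap, int);   /* fetch the next argument as an int */
    va_end(ap);                     /* every va_start needs a va_end */
    return total;
}

/* Hypothetical example: write "file:line: message" into buf.
 * Returns the length of the formatted text, or a negative value on error. */
int format_diag(char *buf, size_t size, const char *file, int line,
                const char *fmt, ...)
{
    va_list ap;
    int prefix, body;

    prefix = snprintf(buf, size, "%s:%d: ", file, line);
    if (prefix < 0 || (size_t)prefix >= size)
        return prefix;              /* error, or no room left for the message */

    va_start(ap, fmt);
    /* Pass the va_list on to the v* variant of the formatting function. */
    body = vsnprintf(buf + prefix, size - (size_t)prefix, fmt, ap);
    va_end(ap);

    return body < 0 ? body : prefix + body;
}
```

A call such as format_diag(buf, sizeof buf, "parse.c", 12, "unexpected %s", "token") leaves "parse.c:12: unexpected token" in buf. A compiler that wants to build code like this has to understand the va_list type and the va_start, va_arg and va_end operations for its target.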

Timeline:





Roughly 188 days of self-directed work.


Commits: 

Approximately 380 commits, though probably more, because I discarded lots of work in local branches. I estimate a commit to be 20-30 minutes of work on average, which works out to roughly 127 to 190 hours, or about 4 work weeks of actual coding.

Punch card:

 





I worked whenever I could take a break, including evenings and lunch times. Sometimes I couldn't sleep and would still be hacking away at 3 am. Once I woke up at 6 am, probably to fix a bug that was giving me nightmares. 

Lines of code:

5902 lines of code currently, but that will grow. My original goal was a complete toolchain in less than 15k lines of code, and I think I still have room to spare for an optimizing backend and assemblers.

Code quality/clarity:

I think I have done a good job, though there is always room for improvement. I really wanted to make something anybody could understand. It is a matter of opinion, but look for yourself.

Compare my for loop parser:
https://github.com/andrewchambers/c/blob/7775638eeb241979d2568ec699911bc797f7bb6e/src/cc/parse.c#L1279

To the equivalent Clang for loop parser:
https://github.com/llvm-mirror/clang/blob/08e3bfe1f5d00ebe115c2f2e44a93e396d59177e/lib/Parse/ParseStmt.cpp#L1474

To the equivalent GCC parser:
https://github.com/gcc-mirror/gcc/blob/e01e62c7a9ae012337243c86e1e1a2e0041f9895/gcc/c/c-parser.c#L5596

To the equivalent Tiny C compiler parser:
https://github.com/andrewchambers/tccmirror/blob/d6d7686b608c4b7cd88877b30579ca2346e5d284/tccgen.c#L4526

Motivation levels:

I am unfortunately the type of person who constantly starts new projects and stops before they hit a major milestone. I eventually reached a point where I felt like a failure who couldn't finish anything. Overcoming this can be a struggle, but in this case I challenged myself to not be a quitter. Whenever I hit a brick wall and wanted to give up, I told myself that this barrier would stop someone else, but it won't stop me.

My motivation levels did drop at times, but I picked myself back up every time to reach this milestone.

The future:

My crazy ambition was to write the cleanest C compiler that could be used for an all-C operating system like Plan 9 or OpenBSD, and there is still lots of work to do to reach that level of sophistication. Real OS support will require funding or many more dedicated code contributors.

Conclusions:

This milestone really made me reflect on and appreciate what some of the early programming language pioneers went through. The first self-hosted languages really are something special.

Let me know if you enjoyed this work and want it to continue.