author Ekaitz Zarraga <ekaitz@elenq.tech> 2022-08-02 14:14:40 +0200
committer Ekaitz Zarraga <ekaitz@elenq.tech> 2022-08-02 14:14:40 +0200
commit 6e551039cd9afea33d0acde6cfd248ece8e63e04
tree a01d10fc291cc2cd681e8158f9a009f9164ce34b
parent 775281bfd957ced61dbca446d47a36202e2d7944
Add post about tinycc
-rw-r--r-- content/bootstrapGcc/05_tcc_changes.md 278
1 files changed, 278 insertions, 0 deletions
diff --git a/content/bootstrapGcc/05_tcc_changes.md b/content/bootstrapGcc/05_tcc_changes.md
new file mode 100644
index 0000000..0eb42c6
--- /dev/null
+++ b/content/bootstrapGcc/05_tcc_changes.md
@@ -0,0 +1,278 @@
+Title: Adding TinyCC to the mix
+Date: 2022-08-01
+Category:
+Tags: Bootstrapping GCC in RISC-V
+Slug: bootstrapGcc5
+Lang: en
+Summary:
+ Discussing which changes are needed to make GCC compilable from a
+ simpler C compiler, TinyCC.
+
+In the [series]({tag}Bootstrapping GCC in RISC-V) we already introduced GCC,
+made it able to compile C programs and so on, but we didn't solve how to build
+that GCC with a simpler compiler. In this post I'll try to explain which
+changes must be applied across the whole ecosystem to make this possible.
+
+### The current status
+
+I already talked about this in the past, but it's always a good moment to
+recall the bootstrapping process we are immersed in. There are steps before
+these, but I'm going to start at GNU Mes, which is the core of all this.
+
+For the part that interests us, GNU Mes has a C compiler, called MesCC. This C
+compiler is the one we use to compile TinyCC and we use that TinyCC to compile
+a really old version of GCC, the 2.95, and from that we compile more recent
+versions until we reach the current one. From the current one we compile the
+world.
+
+That's the theory, and it's what we currently have in the most widely supported
+architectures (`i386` and maybe some ARM flavour). Problems arise when you deal
+with some new architecture, like the one we have to deal with: RISC-V.
+
+RISC-V was invented recently, and compilers did not add support for it until
+a few years ago. GCC gained RISC-V support in its 7 series (7.5 is the
+version we have been discussing through this series), and that version needs
+a C++ compiler in order to be built. That's a problem we almost solved in the
+previous steps, backporting the RISC-V support to a GCC that only needs a C
+compiler to be built.
+
+Now, extra problems appear. Which C compiler are we going to use to build that
+GCC 4.6.4 that has the RISC-V support we backported?
+
+According to the process we described, we should use GCC 2.95, but it doesn't
+support RISC-V so we would need to backport the RISC-V support to that one too.
+That's not cool.
+
+Another option would be to remove GCC 2.95 from the equation and compile GCC
+4.6.4 directly with TinyCC, if that's possible. That would make the whole
+process faster and remove some dependencies, but it means TinyCC has to be
+able to compile GCC 4.6.4. We are going to try this option, but it requires
+some work that we will describe today.
+
+On the other hand, in order to be able to build all this for RISC-V, TinyCC and
+MesCC have to be able to target RISC-V...
+
+Too many conditions have to hold for all this to work. But hey! Let's go step
+by step.
+
+### RISC-V support in TinyCC
+
+First, we have to make sure that TinyCC has RISC-V support, and it does. As of
+fairly recently, TinyCC is able to compile, assemble and link for RISC-V,
+but only for 64 bits.
+
+I tested this support using a TinyCC cross-compiler and it works. If you want
+to try it, I have a simple [Guix package][tcc-package] for the cross-compiler,
+and I also fixed the official Guix package for the native TinyCC, which had
+been broken for a long time.
+
+I haven't tested the RISC-V support natively yet, but if the cross-compiler
+works, chances are the native one will also work, so I'm not really worried
+about this point.
+
+[tcc-package]: https://github.com/ekaitz-zarraga/tcc/blob/guix_package/guix.scm
+
+
+### GNU Mes compiling TinyCC
+
+GNU Mes supports an old C standard that is simpler than the one TinyCC uses, so
+it uses a fork of TinyCC with some C features removed. This fork was made way
+before the RISC-V support was added to TinyCC and many things have changed
+since then.
+
+[We need to backport the TinyCC RISC-V support to Mes's own TinyCC fork,
+then.](https://www.youtube.com/watch?v=-1qju6V1jLM) Or at least do something
+about it.
+
+When I first took a look into this issue, I thought it would be an easy fix: I
+already backported GCC, which is orders of magnitude larger than TinyCC... But
+it's not that easy. TinyCC's internal API has changed quite a bit since the
+fork was made, and I need to review all of it in order to make it work. Also,
+this process requires converting all the modern C that MesCC does not support
+into the older C constructs that it does.
+
+It's a lot of work, but it's doable to a certain degree, and it could be a big
+step for the full source bootstrap process. Like what I did in GCC, it's
+not going to solve everything, but it's a huge step in the right direction.
+
+
+### GNU Mes supporting RISC-V
+
+On the lower level part of the story, if we want to make all this process work
+for RISC-V, GNU Mes itself should be runnable on it, and able to generate
+binaries for it.
+
+[There have been efforts][mes-riscv-effort] to make all this possible, and I
+don't expect this support to take long to finally appear in GNU Mes. It's just
+a matter of time and funding. I am aware that Jan is also interested in
+spending time on this, so I think we are covered in this area.
+
+[mes-riscv-effort]: https://lists.gnu.org/archive/html/bug-mes/2021-04/msg00031.html
+
+
+### GCC compilation with TinyCC
+
+The only point we are missing then is to be able to build the backported GCC
+with TinyCC, without the intermediate GCC 2.95. This is a tough one to test
+and achieve, because the GCC compilation process is extremely complex, and we
+need to make quite complex packages for this process to work.
+
+On the other hand, the work I already did packaging my backported GCC for Guix
+is not enough for several reasons: it was designed to work with a modern GCC
+toolchain, and not with TinyCC; and a cross-compiler is not the same thing as a
+native one.
+
+GCC is normally compiled in stages, which are called *bootstrap* by the GCC
+build system. I described a little bit of that process [in a footnote in a
+past post][staged]. That process is not activated in a cross-compilation
+environment, which is what I used when the backend I backported was
+<del>back</del>tested. If the *bootstrap* process doesn't work, the whole
+compilation fails, so a native build can hit errors in the build system that
+we were avoiding thanks to the cross-compilation trick.
+
+[staged]: https://ekaitz.elenq.tech/bootstrapGcc3.html#fn:staged
+
+I did this on purpose, of course. I just wanted a simple working environment
+that let me test the backported RISC-V backend of the compiler, but
+now we need to make a proper package for GCC 4.6.4, and make it work with
+TinyCC.
+
+I wouldn't mention this if I hadn't tried to make this package and failed. It's
+not especially difficult to make a package, or it doesn't look like it, until
+you get errors like:
+
+``` weird-error-lol
+configure: error: C compiler cannot create executables
+
+`¯\_(ツ)_/¯`
+```
+
+That being said, this is not only a packaging issue. As we already mentioned,
+we are removing GCC 2.95 from the pipeline, so TinyCC has to be able to deal
+with the GCC 4.6.4 codebase directly, including the backport I did.
+
+The easiest way to test this is to compile GCC 4.6.4 for x86_64 on my machine,
+with no emulation in between, so we can find the things TinyCC can't deal with.
+Later we would be able to test this further in an emulated environment or
+directly on a RISC-V machine to make sure TinyCC can deal with the RISC-V
+backend, but for a first review of the GCC core, using x86_64 can be enough.
+It requires no weird setup beyond a working package... Ouch!
+
+I'm not really good at this part and I'm not sure if anyone else is, but I
+don't feel like spending time trying to make this package cascade. I feel
+like my time is better spent on fixing stuff, or, once the package cascade is
+done, fixing the compatibility.
+
+During the whole project, making Guix packages and figuring out build systems
+is the part where the most time was spent, and it's the one with the lowest
+success rate. It feels like I wasted hours trying to make the build process
+work for nothing.
+
+The funny part of this is that Guix is partially to blame here: not conforming
+to the FHS and having this weird way to handle inputs is what makes the
+whole process really complex. Code has to be patched to find the libraries,
+scripts must be patched too, binaries are hard to find... On the good side,
+it's Guix that makes this work worth the effort, and also what makes this
+process reproducible, once it's done, to let everyone enjoy it.
+
+
+#### Wait, but didn't Mes use a TinyCC fork?
+
+Oh yeah of course. What I forgot to mention is that the step we just described,
+making TinyCC able to compile the backported GCC 4.6.4, is not as simple
+as I made it sound. If we use upstream TinyCC to compile GCC, who is going to
+compile that TinyCC? We already said MesCC is not able to do that directly.
+
+We could build that TinyCC with the TinyCC fork Mes has or make the TinyCC fork
+go directly for the GCC 4.6.4, but in any case there's an obvious task to
+tackle: the RISC-V support must reach the TinyCC fork before we can do
+anything else. And that's where I want to focus.
+
+### This is not only about RISC-V
+
+I have to be clear with you: I mixed two problems together and I did that on
+purpose.
+
+On the one hand we have the RISC-V support related changes. And on the other
+hand we have the changes on the compilation pipeline: the removal of GCC 2.95.
+
+The second part is just a consequence of the first, but it's not only related
+to the RISC-V world. Once we have our compilers ready, we are going to apply
+the change for the whole thing. Removing a step is a really important task for
+many reasons, but one is obvious at this point: having a really old compiler
+like GCC 2.95 forces us to stay with the architectures it was able to target,
+or makes us add them and maintain them ourselves. It's a huge flexibility
+issue for the little gain it gives: GCC 4.6.4 is already compilable with a C90
+compiler.
+
+So, this is an important milestone, not only for my part of the job but also
+for the whole GNU Mes and bootstrapping effort. Skipping GCC 2.95 has to be
+done in every architecture, and the packaging effort of that is unavoidable.
+
+### What I already did
+
+While I was reviewing what needed to be done, I started doing things here
+and there, preparing the work and making sure I was understanding the context
+better.
+
+First, I realized I had introduced some non-C90 constructs in the backport of
+GCC, because I copied some code directly from 7.5, so I removed them. This is
+important, because we need to be able to compile all this with TinyCC, and I
+don't expect TinyCC to support modern constructs.
+
+
+I packaged a TinyCC RISC-V cross-compiler [for the upstream
+project][tcc-package], and also for [the Mes fork][mes-tcc-package], even
+though the latter cannot be compiled yet: we need to backport
+the backend in order to make it work. Still, it's important work, because it
+lets me start the backport easily. I'll need to apply more changes on top of
+it, for sure, but at the moment I have all I need to start coding the new
+backend.
+
+[mes-tcc-package]: https://github.com/ekaitz-zarraga/tcc/blob/riscv-mes/guix.scm
+
+I spent countless hours trying to make a proper GCC package and trying to use
+TinyCC as the C compiler for it with no success. This is why I decided to move
+on and work on a more interesting and usable part: adding the RISC-V backend to
+the Mes fork of TinyCC.
+
+Of course, I already started working on the RISC-V support of the TinyCC fork
+from Mes, and started encountering API mismatches here and there. Most of them
+are related to some optimizations introduced after the fork, which I need to
+review in more detail in the upcoming weeks. I also spent some time trying to
+understand how TinyCC works, and it's a very interesting approach I have to
+say[^maybe].
+
+[^maybe]: Maybe I'll have the time to explain it in a future blog post, maybe
+ not.
+
+
+### Conclusions
+
+I'd love to tackle all these problems together and fix the whole system, but
+I'm just one guy coding from his couch. It's not realistic to think I can fix
+everything, and trying to do so is detrimental to my mental health.
+
+So I decided to go for the RISC-V support for the TinyCC fork we have at Mes.
+This would leave all the ingredients ready for someone more experienced than me
+to make the final recipe.
+
+The same thing happened with the GCC backport. I didn't really finish the job:
+there's no C++ compiler working yet, but that's not what matters. Anyone can
+take what I did, package it properly, which happened to be an impossible
+task for me, and make it ready. We already made a huge step.
+
+Fighting against a wall is bad for everyone; it's better to pick a task where
+you can provide something. You feel better, and the overall state of the
+project is improved. Achieving things is the best gasoline you can get for
+achieving new things.
+
+Regarding the task I chose, I've already spent some hours working on it. It's
+not an easy task. The internal TinyCC API has changed a lot since the fork
+was made, and there are many commits related to RISC-V since then. One
+of the most recent ones fixes the RISC-V assembler after I reported it wasn't
+working, a few weeks ago. All these changes must be reviewed carefully, undoing
+the API changes and also, most importantly, keeping the code compatible with
+GNU Mes's C compiler.
+
+Not an easy task.