-rw-r--r-- | content/bootstrapGcc/11_sidetracked.md | 292 |
1 files changed, 292 insertions, 0 deletions
diff --git a/content/bootstrapGcc/11_sidetracked.md b/content/bootstrapGcc/11_sidetracked.md
new file mode 100644
index 0000000..4d9a37a
--- /dev/null
+++ b/content/bootstrapGcc/11_sidetracked.md
@@ -0,0 +1,292 @@
+Title: So we've got sidetracked...
+Date: 2024-03-30
+Category:
+Tags: Bootstrapping GCC in RISC-V
+Slug: bootstrapGcc11
+Lang: en
+Summary:
+    We've got sidetracked, but it doesn't really matter if we continue forward.
+
+
+There are many software projects involved in our bootstrapping process, not
+only compilers! And many of them are not fully supported for RISC-V or we don't
+have the compilers ready to build them. In order to be able to build
+everything, we need to touch the build scripts of many of them, add patches to
+them or fix our C standard library and compilers, as they are extremely minimal
+and may lack the features those projects need.
+
+These past few days we had a really interesting case of back-and-forth that
+sidetracked us a little bit, so let me share it with you, and meanwhile I'll
+introduce most of the work we've been doing since January.
+
+#### Gash
+
+In the bootstrapping in Guix we don't rely on Bash to run our scripts. Instead
+we use Gash.
+
+During the bootstrapping process in Guix, we found Gash hangs at some specific
+points of the process, mostly when configuring Binutils.
+
+I managed to use Bash for Binutils on my RISC-V hardware instead, but on my
+`x86_64` laptop I was unable to build all the dependencies I needed to build
+Bash for RISC-V. This is not cool.
+
+Thankfully, Gash maintainers are friendly and we can talk with them to try to
+fix the issue.
+
+#### Gzip
+
+Gzip is also an integral part of the process, as we download the software
+releases in Gzip format. We need to decompress them as fast as possible (and
+correctly).
+
+It built pretty easily using the bootstrappable TinyCC, but it didn't run
+properly at first. It was able to decompress files but then, when it tried to
+compare the file checksum, it failed to do so.
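To make the checksum failure more concrete: the checksum Gzip compares is a CRC-32, which is built entirely from 32-bit unsigned shifts and XORs. The block below is a minimal sketch in plain C, not Gzip's actual code. The names `crc32_sketch`, `widen_right` and `widen_wrong` are hypothetical, and the two `widen_*` helpers only illustrate, as an assumption about the general failure mode, how a 32-bit value can end up with different upper bits when it is widened to 64 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32 using the reflected polynomial 0xEDB88320.
 * Illustrative only: this is not the table-driven code Gzip ships. */
uint32_t crc32_sketch(const unsigned char *data, size_t len)
{
    assert(data != NULL || len == 0);

    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones when the low bit is set, zero otherwise */
            uint32_t mask = -(crc & 1u);
            /* the right shift must bring zeros in from the top */
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Zero extension: the upper 32 bits of the result stay clear. */
uint64_t widen_right(uint32_t v)
{
    return (uint64_t)v;
}

/* Sign extension: if bit 31 is set, the upper 32 bits become ones.
 * (The cast to int32_t assumes the usual two's complement behavior.) */
uint64_t widen_wrong(uint32_t v)
{
    return (uint64_t)(int64_t)(int32_t)v;
}
```

Any value with bit 31 set, which is roughly half of all CRC values, differs between the two widenings, so a comparison done on 64-bit registers can fail even when the low 32 bits match.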
+
+This happened to be related to some missing integer support in our
+bootstrappable TinyCC. The [`riscv-mes`
+branch](https://github.com/ekaitz-zarraga/tcc/commits) of our bootstrappable
+TinyCC shows all the commits we needed to add to fix this issue and the rest of
+the issues that I share in this post.
+
+The Gzip issue was fixed by this commit[^commit-gzip]:
+
+[^commit-gzip]: [Link to GitHub](https://github.com/ekaitz-zarraga/tcc/commit/589d2ab1847dd2dcb5f35b3709890d6f51a7cacb)
+
+``` bash
+589d2ab1 RISCV: 32bit val sign extension
+```
+
+It has some 32 bit value sign extension support we were missing. Without it,
+the binary operations that calculate the checksum in Gzip were simply wrong, as
+everything was incorrectly sign-extended[^integers].
+
+[^integers]: [I already told you *integers are
+    hard*](https://ekaitz.elenq.tech/bootstrapGcc8.html#size-problems)
+
+As you might imagine, due to the lack of proper debug symbols and the fact that
+the issue was so specific, this was really hard to deal with until we found out
+the real problem. Of course, this very issue would affect many other programs
+but this was the first time we saw it. That's why it's very important to fix
+things properly, as they may have many ramifications.
+
+#### GNU-Make
+
+One of the other dependencies is GNU-Make, needed in many projects. In the
+previous steps of the bootstrap we manage by running commands manually, but in
+more complex projects GNU-Make is necessary.
+
+In January we built Make using our bootstrappable TinyCC, but it didn't work!
+
+First, it segfaulted when using long options (`--help`), while short ones
+(`-h`) worked. This happened because our bootstrappable TinyCC had a
+missing piece from the backport I did. The commit `db9d7b09` of the `riscv-mes`
+branch and the following two show a clear history of how this went. We first
+realized the `char *` was loaded to a register using a `lb` instruction, which
+is a *load byte*.
+I realized this by printing the value of the pointer in hex
+format, which showed that only the lower bytes were the same, while the higher
+ones were empty. Then I disassembled and found the `lb`, which should have been
+an `ld` (*load doubleword*), that is, the pointer size in a 64 bit machine.
+
+The problem here was that the `char *` was detected in the compiler as an array
+of characters, which has a size of 1: a byte. The TinyCC I took the code from
+uses a function that calculates the size of the type, a pretty reasonable thing
+to have in a compiler. The problem we had was that the type information was not
+stored properly and that function calculated the size based on that wrong
+information. My first attempt was to use a different function for that, but
+when I sent a patch to upstream TinyCC, thinking they also had this issue, they
+told me they didn't have it in the first place[^check]. That was more than
+surprising for me, so I dug into the Git history until I realized they had a
+very interesting commit: `Make casts lose top-level qualifiers`. Ah-ha! The
+commit only has one line and then some tests. This is the content, without the
+tests:
+
+[^check]: Yes, I should have checked better. My bad.
+
+``` diff
++ vtop->type.t &= ~ ( VT_CONSTANT | VT_VOLATILE | VT_ARRAY );
+```
+
+This removed the `VT_ARRAY` flag from the pointer in the type, so the function
+that calculates the size treats the type as a pointer, 64 bits in our case,
+so `ld` is emitted and we are happy. We cherry-picked the commit from upstream,
+reverted our fix and went on.
+
+But of course that was not enough, that'd be too easy. We found some other
+issues in Make.
+
+Now Make was running and not failing, but it said *`"No makefile was found"`*
+and it never ran the recipes.
+We realized later there was some kind of issue
+when reading files, and my colleague Andrius found the `getdents` system call
+returns a different struct in 64 bits, and we were reading it like the 32 bit
+structure, so he fixed that in MeslibC for all 64 bit architectures. This error
+makes a lot of sense in MeslibC, because all the previous attempts in the
+bootstrapping were in 32 bits and our starting point only supports that. That's
+one of the other sources of errors we have: we are also making this whole thing
+*64bit-ready*.
+
+Once Make was able to find and read the Makefile and run it, we noticed another
+problem, this one related to the dates of the files. Make started to give us
+weird messages like *`"Timestamp out of range; substituting..."`*. Later, I
+found that some recipes were executed even if the files they required hadn't
+changed.
+
+This is not a big deal if you just want things to be built once, so we left this
+as a not-very-important-thing until I used this Make in a Guix package. The
+`gnu-build-system` in Guix first runs `./configure` (`configure` phase) and
+later runs `make` (`build` phase). This `make` reran the `./configure` command
+from the previous step, because it thought some of the files were changed
+between both phases. This behavior is more problematic than it seems, because
+Guix needs to fix the shebangs of all the scripts in the
+project[^guix-shebang], and it has a phase for this between the ones I just
+mentioned: `patch-generated-file-shebangs`. If it's the `make` run itself that
+configures the project and right after that starts building, the shebangs of
+the generated files are not fixed, and the process fails. The issue is not a
+not-very-important-thing anymore!
+
+[^guix-shebang]: Guix doesn't store binaries in the classic places. It does not
+    follow the Filesystem Hierarchy Standard.
+    It needs to replace the references to
+    things like `#!/bin/bash` with something like
+    `#!/gnu/store/295aavfhzcn1vg9731zx9zw92msgby5a-bash-5.1.16/bin/bash`
+
+Of course, after what I just explained I was forced to fix this. Some debugging
+sessions later I found the `stat` system call's result was not interpreted
+correctly in MeslibC. There were some padding issues, so I fixed that for
+RISC-V, which mostly fixed Make. Now Make, built using our bootstrappable
+TinyCC, works well enough for us.
+
+#### TinyCC
+
+In my talk this February at [FOSDEM-2024][fosdem2024] I explained upstream
+TinyCC was missing some RISC-V support and that we didn't have it working yet.
+During this time we solved the main issue we had with it:
+
+[fosdem2024]: https://fosdem.org/2024/schedule/event/fosdem-2024-1755-risc-v-bootstrapping-in-guix-and-live-bootstrap/
+
+``` something
+Unimplemented large addend for global address
+```
+
+I had no idea how to fix this, so I wrote an email to the person that
+wrote most of the code around the relocations and he answered me, giving me a
+very interesting answer. Thank you, Michael.
+
+That answer was more than enough for me to write the code for it (it was almost
+done) and in a couple of hours I had a fix for this. The large addend support
+was pretty simple, actually. It was just that relocations are still a little
+bit scary for me, and the codebase doesn't help a lot.
+
+With this issue fixed, we can now move to upstream TinyCC and use it for later
+steps of the project, as we do in the bootstrapping chains of other
+architectures, because upstream TinyCC is more stable and capable than our
+bootstrappable fork.
+
+#### Binutils
+
+We need to remember our goal is to build GCC. That's why we try to use upstream
+TinyCC, as it is able to build it whereas our bootstrappable TinyCC might not
+be.
+
+Building GCC requires Binutils, so we tried to build it.
+We had several issues
+in Binutils and we haven't managed to build Binutils programs that don't
+explode. The problem is probably caused by limitations of our standard
+library, so here comes the sidetrack.
+
+We considered using Musl instead, as it's a powerful standard library that is
+also very simple.
+
+#### Musl
+
+Musl is really cool. We've used it a lot as a reference for MeslibC, but Musl
+is not used in Guix's bootstrapping process in other architectures. Our plan is
+to try to use it for Binutils to see if our broken binaries are because of
+MeslibC or because of something else.
+
+Musl, like most C standard libraries, requires some support for assembly, and
+more specifically Extended Asm.
+
+We already talked about Extended Asm[^extended-asm] support before but, in
+summary, it was unimplemented in TinyCC's backend for RISC-V.
+
+Apart from that, TinyCC lacks some very important pseudoinstructions that are
+used in Musl, and the assembly syntax it uses is not the one that the GNU
+Assembler uses, so TinyCC is unable to build simple instructions like:
+
+``` asm
+ld a0, 8(a0)
+```
+
+As TinyCC expects something like:
+
+``` asm
+ld a0, a0, 8
+```
+
+Hmm...
+
+[^extended-asm]: Extended Asm helps you call assembly blocks using C variables,
+    and it also protects the variables you don't want to touch.
+    You can read more about that in [GCC's
+    documentation](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html).
+
+
+#### Back to TinyCC
+
+This is where the sidetrack went so wild we went back to almost the beginning.
+I wanted to make Musl build, so I started to write support for everything I
+wanted it to do.
+
+I implemented many pseudoinstructions and instructions that were missing and
+Musl needed. This includes GNU Assembler syntax for memory access instructions
+like loads and stores. By the way, don't trust them blindly because I realized
+I did `jal` wrong (some relocation issue again!) and I had to fix it later.
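The difference between the two memory operand syntaxes can be sketched in a few lines of C. This is only an illustration of the idea, not TinyCC's assembler code: `parse_mem_operand` and `to_three_operand` are hypothetical names, and the sketch assumes plain decimal offsets and lowercase register names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Parse a GNU-style memory operand such as "8(a0)" into an offset and a
 * base register name. Returns 1 on success and 0 on malformed input.
 * Hypothetical helper: TinyCC's real parser works on tokens, not sscanf. */
int parse_mem_operand(const char *text, long *offset, char reg[8])
{
    int consumed = -1;

    assert(text != NULL && offset != NULL && reg != NULL);

    /* %n is only stored if the closing parenthesis matched */
    if (sscanf(text, "%ld ( %7[a-z0-9] ) %n", offset, reg, &consumed) != 2)
        return 0;
    return consumed >= 0 && text[consumed] == '\0';
}

/* Rewrite "mnemonic rd, offset(base)" as "mnemonic rd, base, offset",
 * the three-operand form the older TinyCC assembler expected. */
int to_three_operand(const char *mnemonic, const char *rd, const char *mem,
                     char *out, size_t out_size)
{
    long offset;
    char base[8];
    int written;

    if (strlen(mnemonic) == 0 || strlen(rd) == 0)
        return 0;
    if (!parse_mem_operand(mem, &offset, base))
        return 0;

    written = snprintf(out, out_size, "%s %s, %s, %ld",
                       mnemonic, rd, base, offset);
    return written > 0 && (size_t)written < out_size;
}
```

Calling `to_three_operand("ld", "a0", "8(a0)", buf, sizeof buf)` fills `buf` with `ld a0, a0, 8`, the second form shown in the Musl section. A real assembler also has to accept symbols and relocation expressions as offsets, which this sketch ignores.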
+
+I also added the `.option` directive for RISC-V assembly, which is used really
+often. I didn't implement its behavior yet; I did just enough to make the
+builds pass. Most of the time the `.option` directive is used to disable the
+linker relaxation, which TinyCC doesn't do anyway so... Why bother?
+
+I also have a draft for the Extended Asm, and I have it kind of working. I am
+not sure about some of the things I did but I feel it's pretty close.
+
+The Extended Asm support is not upstreamed yet, but I sent it to the TinyCC
+mailing list. The rest I already sent to TinyCC, and you can
+see it in the `mob` branch.
+
+#### MeslibC
+
+Of course, I can't stop, so I took all the support I wrote for TinyCC and
+tried to apply it to the bootstrappable TinyCC.
+
+I was also a little bit forced to do so because we rebuild MeslibC with TinyCC
+and after the changes we could not do it. When we started we had to make a copy
+of MeslibC that didn't have the GNU As style assembly and supported the TinyCC
+style assembly instead. Mes' Guix package as-is only provides one of the
+flavors of the MeslibC code, the TinyCC style one, which we can't rebuild with
+the modern support in TinyCC.
+
+My solution was to backport all the Extended Asm support and all the new
+assembler to the bootstrappable TinyCC and then remove the MeslibC copy that
+used the old syntax. I managed to make it build but the executables generated
+with it explode at the time of writing, so we need to review that further. In
+any case, this is a good change because it reduces the amount of code we have,
+and it uses the more recent TinyCC assembler, which has had many improvements
+since I did the backport, a year ago.
+
+
+#### So...
+
+It looks like we are back at the very beginning, and near the end at the
+same time, if you take into account what I shared in the latest post of the
+series about GCC.
+
+We still need to work on some other related projects, like Patch, that would
+allow us to apply our bootstrapping patches, but that's also almost working. I
+want to believe it's not going to give us many headaches in the future.
+
+In summary, it looks like sometimes you have to run and later go back to walk
+the same path, slowly this second time, with all the knowledge you got in the
+first run.
+
+Here we are. Sidetracked, but also pretty happy, as this is still going
+forward.