author Ekaitz Zarraga <ekaitz@elenq.tech> 2024-03-29 19:53:07 +0100
committer Ekaitz Zarraga <ekaitz@elenq.tech> 2024-03-30 00:36:25 +0100
commit 53f62e743d7643cfff1c61f510b9d90c5cae4ab3
tree f0f63245c257154129328770b6b7f375bc051947
parent 4762d945f5a464330c5130dbdcceaf7b57e6ca9e
add bootstrapGcc-11
-rw-r--r-- content/bootstrapGcc/11_sidetracked.md 292
1 files changed, 292 insertions, 0 deletions
diff --git a/content/bootstrapGcc/11_sidetracked.md b/content/bootstrapGcc/11_sidetracked.md
new file mode 100644
index 0000000..4d9a37a
--- /dev/null
+++ b/content/bootstrapGcc/11_sidetracked.md
@@ -0,0 +1,292 @@
+Title: So we've got sidetracked...
+Date: 2024-03-30
+Category:
+Tags: Bootstrapping GCC in RISC-V
+Slug: bootstrapGcc11
+Lang: en
+Summary:
+ We've got sidetracked, but it doesn't really matter if we continue forward.
+
+
+There are many software projects involved in our bootstrapping process, not
+only compilers! And many of them are not fully supported for RISC-V or we don't
+have the compilers ready to build them. In order to be able to build
+everything, we need to touch the build scripts of many of them, add patches to
+them or fix our C standard library and compilers, as they are extremely minimal
+and may lack support.
+
+During these days we had a really interesting case of back-and-forth that
+sidetracked us a little bit, so let me share it with you, and meanwhile I'll
+introduce most of the work we've been doing since January.
+
+#### Gash
+
+In the bootstrapping in Guix we don't rely on Bash to run our scripts. Instead
+we use Gash.
+
+During the bootstrapping process in Guix, we found that Gash hangs at some
+specific points of the process, mostly when configuring Binutils.
+
+I managed to use Bash for Binutils on my RISC-V hardware instead, but on my
+`x86_64` laptop I was unable to build all the dependencies I needed to build
+Bash for RISC-V. This is not cool.
+
+Thankfully, Gash maintainers are friendly and we can talk with them to try to
+fix the issue.
+
+#### Gzip
+
+Gzip is also an integral part of the process, as we download the software
+releases in Gzip format. We need to decompress them as fast as possible (and
+correctly).
+
+It built pretty easily using the bootstrappable TinyCC, but it didn't run
+properly at first. It was able to decompress files but then, when it tried to
+check the file checksum, it failed.
+
+This happened to be related to some missing integer support in our
+bootstrappable TinyCC. The [`riscv-mes`
+branch](https://github.com/ekaitz-zarraga/tcc/commits) of our bootstrappable
+TinyCC shows all the commits we needed to add to fix this issue and the rest of
+the issues that I share in this post.
+
+The Gzip issue was fixed by this commit[^commit-gzip]:
+
+[^commit-gzip]: [Link to GitHub](https://github.com/ekaitz-zarraga/tcc/commit/589d2ab1847dd2dcb5f35b3709890d6f51a7cacb)
+
+``` bash
+589d2ab1 RISCV: 32bit val sign extension
+```
+
+It adds the 32-bit value sign extension support we were missing. Without it,
+the binary operations that calculate the checksum in Gzip were simply wrong, as
+everything was incorrectly sign-extended[^integers].
+
+[^integers]: [I already told you *integers are
+ hard*](https://ekaitz.elenq.tech/bootstrapGcc8.html#size-problems)
+
+As you might imagine, due to the lack of proper debug symbols and the fact that
+the issue was so specific, this was really hard to deal with until we found out
+the real problem. Of course, this very issue would affect many other programs
+but this was the first time we saw it. That's why it's very important to fix
+things properly, as they may have many ramifications.
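
To get a feeling for how a wrong sign extension can break a checksum, here is a toy model in Python. It is not the TinyCC bug itself (the commit linked above is the real reference). It only assumes that a 32-bit unsigned accumulator is kept in a 64-bit register and compares zero extension with sign extension inside a bit-by-bit CRC-32, the checksum Gzip uses. The names `zero_extend`, `sign_extend` and `crc32_bitwise` are made up for this sketch.

```python
import zlib

MASK32 = 0xFFFFFFFF


def zero_extend(value):
    """A 32-bit unsigned value widened correctly into a 64-bit register."""
    return value & MASK32


def sign_extend(value):
    """The same value widened as if it were signed: bit 31 fills bits 32-63."""
    value &= MASK32
    if value & 0x80000000:
        value |= 0xFFFFFFFF00000000
    return value


def crc32_bitwise(data, extend):
    """Bit-by-bit CRC-32; `extend` models how the accumulator sits in a register."""
    crc = MASK32
    for byte in data:
        crc ^= byte
        for _ in range(8):
            reg = extend(crc)      # contents of the 64-bit register
            low_bit = reg & 1
            reg >>= 1              # logical shift of the whole 64-bit register
            if low_bit:
                reg ^= 0xEDB88320  # reflected CRC-32 polynomial
            crc = reg & MASK32     # stored back as a 32-bit value
    return crc ^ MASK32


data = b"\x00"
good = crc32_bitwise(data, zero_extend)
bad = crc32_bitwise(data, sign_extend)
print(hex(good), hex(bad), good == zlib.crc32(data))
```

With sign extension, the shift pulls a `1` into bit 31 every time the accumulator has its top bit set, so the result stops matching what `zlib.crc32` computes.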
+
+#### GNU-Make
+
+One of the other dependencies is GNU-Make, needed in many projects. In the
+previous steps of the bootstrap we manage by running commands manually, but in
+more complex projects GNU-Make is necessary.
+
+In January we built Make using our bootstrappable TinyCC, but it didn't work!
+
+First, it segfaulted when using long options (`--help`), while short ones
+(`-h`) worked. This happened because our bootstrappable TinyCC had a missing
+piece from the backport I did. The commit `db9d7b09` of the `riscv-mes` branch
+and the following two show a clear history of how this went. We first realized
+the `char *` was loaded to a register using an `lb` instruction, which is a
+*load byte*. I realized this by printing the value of the pointer in hex
+format, which showed only the lower bytes were the same, while the higher ones
+were empty. Then I disassembled and found the `lb`, which should have been an
+`ld` (*load doubleword*), as that is the pointer size in a 64-bit machine.
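
The effect of loading a pointer with the wrong instruction can be reproduced with a small Python model of RISC-V loads. This is only a sketch: the pointer value is invented, and `load` is a hypothetical helper that mimics how `lb`, `lh`, `lw` and `ld` read 1, 2, 4 or 8 little-endian bytes and sign-extend them into a 64-bit register.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF
LOAD_WIDTH = {"lb": 1, "lh": 2, "lw": 4, "ld": 8}  # bytes read by each load


def load(memory, address, mnemonic):
    """Model a sign-extending RISC-V load into a 64-bit register."""
    width = LOAD_WIDTH[mnemonic]
    raw = memory[address:address + width]
    return int.from_bytes(raw, "little", signed=True) & MASK64


# A made-up pointer value, stored in memory as 8 little-endian bytes.
pointer = 0x0000003FFFE4A230
memory = pointer.to_bytes(8, "little")

print(hex(load(memory, 0, "ld")))  # the whole pointer
print(hex(load(memory, 0, "lb")))  # only its lowest byte survives
```

Printing both values in hex shows the same symptom as the debugging session: the low end matches and everything above it is gone.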
+
+The problem here was that the `char *` was detected in the compiler as an array
+of characters, which has a size of 1: a byte. The TinyCC I took the code from
+uses a function that calculates the size of the type, a pretty reasonable thing
+to have in a compiler. The problem we had was that the type information was not
+stored properly, and that function calculated the size based on that wrong
+information. My first attempt was to use a different function for that, but
+when I sent a patch to upstream TinyCC, thinking they also had this issue, they
+told me they didn't have it in the first place[^check]. That was more than
+surprising for me, so I dug into the Git history until I found they had a very
+interesting commit: `Make casts lose top-level qualifiers`. Ah-ha! The commit
+only has one line and then some tests. This is the content, without the tests:
+
+[^check]: Yes, I should have checked better. My bad.
+
+``` diff
++ vtop->type.t &= ~ ( VT_CONSTANT | VT_VOLATILE | VT_ARRAY );
+```
+
+This removes the `VT_ARRAY` flag from the pointer in the type, so the function
+that calculates the size treats the type as a pointer, 64 bits in our case,
+so `ld` is emitted and we are happy. We cherry-picked the commit from upstream,
+reverted our fix and moved on.
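
The chain from "flag in the type" to "instruction emitted" can be sketched with a toy model. This is not TinyCC's code: only the name `VT_ARRAY` comes from the commit above, while the flag value, `toy_type_size`, `load_instruction` and `cast` are hypothetical and exist just to show the idea.

```python
VT_ARRAY = 0x1     # invented value; only the flag name comes from the commit
POINTER_SIZE = 8   # bytes, for a 64-bit target

LOAD_BY_SIZE = {1: "lb", 2: "lh", 4: "lw", 8: "ld"}


def toy_type_size(flags, element_size, element_count):
    """Arrays are as big as their contents, plain pointers as big as a pointer."""
    if flags & VT_ARRAY:
        return element_size * element_count
    return POINTER_SIZE


def load_instruction(flags, element_size=1, element_count=1):
    """Pick the load that matches the size the type claims to have."""
    return LOAD_BY_SIZE[toy_type_size(flags, element_size, element_count)]


def cast(flags):
    """Model of the one-line fix: a cast drops the array flag."""
    return flags & ~VT_ARRAY


print(load_instruction(VT_ARRAY))        # the type still looks like a char array
print(load_instruction(cast(VT_ARRAY)))  # after the cast it is a plain pointer
```

In this model the stale array flag makes a one-character array report a size of 1, which selects `lb`. Once the flag is cleared, the size is the pointer size and `ld` is selected.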
+
+But of course that was not enough, that'd be too easy. We found some other
+issues in Make.
+
+Now Make was running and not failing, but it said *`"No makefile was found"`*
+and it never ran the recipes. We realized later there was some kind of issue
+when reading files, and my colleague Andrius found that the `getdents` system
+call returns a different struct in 64 bits, while we were reading it like the
+32 bit structure. He fixed that in MeslibC for all 64 bit architectures. This
+error makes a lot of sense in MeslibC, because all the previous attempts in the
+bootstrapping were in 32 bits and our starting point only supports that. That's
+another source of errors we have: we are also making this whole thing
+*64bit-ready*.
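
Here is a sketch of why reading a directory entry with the wrong layout hides the Makefile. It assumes the two record layouts documented in the Linux `getdents(2)` manual page: the 64-bit record has 8-byte `d_ino` and `d_off`, then `d_reclen`, a `d_type` byte and the name, and the legacy 32-bit record has 4-byte `d_ino` and `d_off`, then `d_reclen` and the name. The exact mistake in MeslibC is not reproduced here, and `make_dirent64` and `parse` are hypothetical helpers.

```python
import struct

DIRENT64 = struct.Struct("<QqHB")  # d_ino, d_off, d_reclen, d_type (19 bytes)
DIRENT32 = struct.Struct("<IIH")   # d_ino, d_off, d_reclen (10 bytes)
DT_REG = 8                         # d_type value for a regular file


def make_dirent64(ino, off, dtype, name):
    """Build one record the way a 64-bit kernel lays it out."""
    encoded = name.encode() + b"\0"
    reclen = (DIRENT64.size + len(encoded) + 7) & ~7  # padded to 8 bytes
    header = DIRENT64.pack(ino, off, reclen, dtype)
    return (header + encoded).ljust(reclen, b"\0")


def parse(record, layout):
    """Read the record length and the name using the given header layout."""
    reclen = layout.unpack_from(record, 0)[2]
    name = record[layout.size:].split(b"\0", 1)[0]
    return reclen, name


record = make_dirent64(1234, 1, DT_REG, "Makefile")
print(parse(record, DIRENT64))  # the layout the kernel used
print(parse(record, DIRENT32))  # the layout a 32-bit-only libc expects
```

With the 32-bit layout, the record length and the name are read from the middle of `d_ino` and `d_off`, so the file name comes out empty and there is no `Makefile` to be found.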
+
+Once Make was able to find and read the Makefile and run it, we ran into
+another problem, this one related to the timestamps of the files. Make started
+to give us weird messages like *`"Timestamp out of range; substituting..."`*.
+Later, I found that some recipes were executed even if the files they required
+hadn't changed.
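
The spurious rebuilds follow directly from the rule Make uses: a target is remade when it doesn't exist or when any prerequisite is newer than it. A minimal Python sketch of that rule shows what a misread timestamp does. The function name and the timestamp values are made up for the example.

```python
def needs_rebuild(target_mtime, prerequisite_mtimes):
    """Make-style check: rebuild if the target is missing or older than a prerequisite."""
    if target_mtime is None:  # the target does not exist yet
        return True
    return any(mtime > target_mtime for mtime in prerequisite_mtimes)


target = 1_711_700_000         # target written after its prerequisite
prerequisite = 1_711_600_000
misread = prerequisite << 32   # same timestamp, read from the wrong offset

print(needs_rebuild(target, [prerequisite]))  # nothing changed
print(needs_rebuild(target, [misread]))       # garbage timestamp looks newer
```

Any field read from the wrong place can end up looking newer than the target, and then the recipe runs again even though nothing changed.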
+
+This is not a big deal if you just want things to be built once, so we left
+this as a not-very-important-thing until I used this Make in a Guix package.
+The `gnu-build-system` in Guix first runs `./configure` (`configure` phase) and
+later runs `make` (`build` phase). This `make` reran the `./configure` command
+from the previous step, because it thought some of the files were changed
+between both phases. This behavior is more problematic than it feels, because
+Guix needs to fix the shebangs of all the scripts in the
+project[^guix-shebang], and it has a phase for this between the ones I just
+mentioned: `patch-generated-file-shebangs`. If it's the `make` run itself that
+configures the project and right after that starts building, the shebangs of
+the generated files are not fixed, and the process fails. The issue is not a
+not-very-important-thing anymore!
+
+[^guix-shebang]: Guix doesn't store binaries in the classic places. It does not
+ follow the Filesystem Hierarchy Standard. It needs to replace the references
+ to things like `#!/bin/bash` with something like
+ `#!/gnu/store/295aavfhzcn1vg9731zx9zw92msgby5a-bash-5.1.16/bin/bash`.
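
As a rough idea of what a shebang-patching phase has to do, here is a Python sketch. Guix's real phase is written in Scheme and is more careful than this. `patch_shebang` and the `interpreters` mapping are hypothetical, and only the store path is taken from the example in the footnote.

```python
def patch_shebang(script_text, interpreters):
    """Rewrite the interpreter in the first line if we know a replacement for it."""
    first, newline, rest = script_text.partition("\n")
    if not first.startswith("#!"):
        return script_text
    parts = first[2:].strip().split(None, 1)
    if not parts:
        return script_text
    name = parts[0].rsplit("/", 1)[-1]  # "/bin/bash" -> "bash"
    replacement = interpreters.get(name)
    if replacement is None:
        return script_text
    arguments = " " + parts[1] if len(parts) > 1 else ""
    return "#!" + replacement + arguments + newline + rest


store_bash = "/gnu/store/295aavfhzcn1vg9731zx9zw92msgby5a-bash-5.1.16/bin/bash"
script = "#!/bin/bash\necho configured\n"
print(patch_shebang(script, {"bash": store_bash}))
```

If the scripts are generated after this step has already run, which is what happens when `make` reruns `./configure` by itself, nobody rewrites their first line and they point to an interpreter that doesn't exist.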
+
+Of course, after what I just explained I was forced to fix this. Some debugging
+sessions later I found the `stat` system call's result was not interpreted
+correctly in MeslibC. There were some padding issues, so I fixed them for
+RISC-V and that mostly fixed Make. Now Make, built using our bootstrappable
+TinyCC, works well enough for us.
+
+#### TinyCC
+
+In my talk this February at [FOSDEM-2024][fosdem2024] I explained that
+upstream TinyCC was missing some RISC-V support and that we didn't have it
+working yet. Since then we solved the main issue we had with it:
+
+[fosdem2024]: https://fosdem.org/2024/schedule/event/fosdem-2024-1755-risc-v-bootstrapping-in-guix-and-live-bootstrap/
+
+``` something
+Unimplemented large addend for global address
+```
+
+I had no idea how to fix this, so I wrote an email to the person who wrote most
+of the code around the relocations, and he gave me a very interesting answer.
+Thank you, Michael.
+
+That answer was more than enough for me to write the code for it (it was almost
+done) and in a couple of hours I had a fix for this. The large addend support
+was pretty simple, actually. It was just that relocations are still a little
+bit scary for me, and the codebase doesn't help a lot.
+
+With this issue fixed, we can now move to upstream TinyCC and use it for later
+steps of the project, as we do in the bootstrapping chains of other
+architectures, because upstream TinyCC is more stable and capable than our
+bootstrappable fork.
+
+#### Binutils
+
+We need to remember our goal is to build GCC. That's why we try to use upstream
+TinyCC, as it is able to build it whereas our bootstrappable TinyCC might not
+be.
+
+Building GCC requires Binutils, so we tried to build it. We had several issues
+in Binutils and we haven't yet managed to build Binutils programs that don't
+explode. The problem is probably caused by limitations of our standard
+library, so here comes the sidetrack.
+
+We considered using Musl instead, as it's a powerful standard library that is
+also very simple.
+
+#### Musl
+
+Musl is really cool. We've used it a lot as a reference for MeslibC, but Musl
+is not used in Guix's bootstrapping process in other architectures. Our plan is
+to try to use it for Binutils to see if our broken binaries are because of
+MeslibC or because of something else.
+
+Musl, as most C standard libraries, requires some support for assembly, and
+more specifically Extended Asm.
+
+We already talked about Extended Asm[^extended-asm] support before but, in
+summary, it was unimplemented in TinyCC's backend for RISC-V.
+
+Apart from that, TinyCC lacks some very important pseudoinstructions that are
+used in Musl, and the assembly syntax it uses is not the one that the GNU
+Assembler uses, so TinyCC is unable to assemble simple instructions like:
+
+``` asm
+ld a0, 8(a0)
+```
+
+As TinyCC expects something like:
+
+``` asm
+ld a0, a0, 8
+```
+
+Hmm...
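
The two spellings above carry the same information and only the operand layout changes. This sketch converts the GNU Assembler form of a load into the three-operand form shown above. It covers only the load pattern from the example. The function name and the regular expression are made up, and other instructions are left untouched because their TinyCC-style form isn't described here.

```python
import re

# mnemonic rd, offset(base)   e.g. "ld a0, 8(a0)"
MEMORY_OPERAND = re.compile(r"^\s*(\w+)\s+(\w+)\s*,\s*(-?\w*)\((\w+)\)\s*$")


def gnu_to_three_operand(line):
    """Turn `op rd, off(base)` into `op rd, base, off`; leave anything else alone."""
    match = MEMORY_OPERAND.match(line)
    if match is None:
        return line
    mnemonic, rd, offset, base = match.groups()
    return f"{mnemonic} {rd}, {base}, {offset or '0'}"


print(gnu_to_three_operand("ld a0, 8(a0)"))
print(gnu_to_three_operand("add a0, a1, a2"))
```

Supporting the GNU syntax in the assembler itself means doing this parsing the other way around, inside TinyCC, for every memory access instruction.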
+
+[^extended-asm]: Extended Asm helps you call assembly blocks using C variables,
+ and it also protects the variables you don't want to touch.
+ You can read more about that in [GCC's
+ documentation](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html).
+
+
+#### Back to TinyCC
+
+This is where the sidetrack went so wild we went back to almost the beginning.
+I wanted to make Musl build so I started to write support for everything I
+wanted it to do.
+
+I implemented many pseudoinstructions and instructions that were missing and
+Musl needed. This includes GNU Assembler syntax for memory access instructions
+like loads and stores. By the way, don't trust them blindly because I realized
+I did `jal` wrong (some relocation issue again!) and I had to fix it later.
+
+I also added the `.option` directive for RISC-V assembly, which is used really
+often. It's accepted but not actually implemented yet: I did just enough to
+make the builds pass. Most of the time the `.option` directive is used to
+disable linker relaxation, which TinyCC doesn't do anyway so... Why bother?
+
+I also have a draft for the Extended Asm, and I have it kind of working. I am
+not sure about some of the things I did but I feel it's pretty close.
+
+The Extended Asm support is not upstreamed yet, but I sent it to the TinyCC
+mailing list. The rest I already sent to TinyCC, and you can see it in the
+`mob` branch.
+
+#### MeslibC
+
+Of course, I can't stop, so I took all the support I wrote for TinyCC and
+tried to apply it to the bootstrappable TinyCC.
+
+I was also a little bit forced to do so because we rebuild MeslibC with TinyCC
+and after the changes we could not do it. When we started we had to make a copy
+of MeslibC that didn't have the GNU As style assembly and supported the TinyCC
+style assembly instead. Mes' Guix package as-is only provides one of the
+flavors of the MeslibC code, the TinyCC style one, which we can't rebuild with
+the modern support in TinyCC.
+
+My solution was to backport all the Extended Asm support and all the new
+assembler to the bootstrappable TinyCC and then remove the MeslibC copy that
+used the old syntax. I managed to make it build but the executables generated
+with it explode at the time of writing, so we need to review that further. In
+any case, this is a good change because it reduces the amount of code we have,
+and it uses the more recent TinyCC assembler, which has had many improvements
+since I did the backport, a year ago.
+
+
+#### So...
+
+It looks like we are back again at the very beginning, and near the end at the
+same time, if you take into account what I shared in the latest post of the
+series about GCC.
+
+We still need to work on some other related projects, like Patch, that would
+allow us to apply our bootstrapping patches, but that's also almost working. I
+want to believe it's not going to give us many headaches in the future.
+
+In summary, it looks like sometimes you have to run and later go back to walk
+the same path, slowly this second time, with all the knowledge you got in the
+first run.
+
+Here we are. Sidetracked, but also pretty happy, as this is still going
+forward.