Title: So we've got sidetracked...
Date: 2024-03-30
Category:
Tags: Bootstrapping GCC in RISC-V
Slug: bootstrapGcc11
Lang: en
Summary:
    We've got sidetracked, but it doesn't really matter if we continue forward.


There are many software projects involved in our bootstrapping process, not
only compilers! Many of them don't fully support RISC-V, or we don't have the
compilers ready to build them. To be able to build everything, we need to touch
the build scripts of many of them, add patches to them, or fix our C standard
library and compilers, which are extremely minimal and may lack the features
those projects need.

These days we had a really interesting case of back-and-forth that sidetracked
us a little bit, so let me share it with you. Along the way I'll introduce most
of the work we've been doing since January.

#### Gash

In the Guix bootstrap we don't rely on Bash to run our scripts. Instead we use
Gash.

During the bootstrapping process we found that Gash hangs at some specific
points, mostly when configuring Binutils.

On my RISC-V hardware I managed to use Bash for Binutils instead, but on my
`x86_64` laptop I was unable to build all the dependencies I needed to build
Bash for RISC-V. This is not cool.

Thankfully, Gash maintainers are friendly and we can talk with them to try to
fix the issue.

#### Gzip

Gzip is also an integral part of the process, as we download the software
releases in Gzip format. We need to decompress them as fast as possible (and
correctly).

It built pretty easily using the bootstrappable TinyCC, but it didn't run
properly at first. It was able to decompress files, but then, when it tried to
compare the file checksum, it failed.

This turned out to be related to some missing integer support in our
bootstrappable TinyCC. The [`riscv-mes`
branch](https://github.com/ekaitz-zarraga/tcc/commits) of our bootstrappable
TinyCC shows all the commits we needed to add to fix this issue and the rest of
the issues that I share in this post.

The Gzip issue was fixed by this commit[^commit-gzip]:

[^commit-gzip]: [Link to GitHub](https://github.com/ekaitz-zarraga/tcc/commit/589d2ab1847dd2dcb5f35b3709890d6f51a7cacb)

``` bash
589d2ab1 RISCV: 32bit val sign extension
```

It adds the sign extension of 32-bit values that we were missing. Without it,
the binary operations that calculate the checksum in Gzip were simply wrong, as
everything was incorrectly sign-extended[^integers].
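
To see why this breaks a checksum, here is a minimal sketch in C. It is not
Gzip's or TinyCC's code, and the function names are made up for illustration.
It simulates a 32-bit checksum value that ends up sign-extended in a 64-bit
register and is then shifted there, which is the kind of step CRC-style
checksums repeat for every byte:

```c
#include <stdint.h>

/* Correct behaviour: the value is an unsigned 32-bit number, so the
 * shift brings zeros in from the top. */
uint32_t shift_ok(uint32_t crc)
{
    return crc >> 8;
}

/* Simulated bug: the value is widened to 64 bits with sign extension
 * and the shift happens on the whole 64-bit register. If bit 31 was
 * set, ones leak into the upper bits of the 32-bit result.
 * (The unsigned-to-signed conversion wraps on every common compiler.) */
uint32_t shift_wrongly_extended(uint32_t crc)
{
    int64_t widened = (int32_t)crc;       /* sign-extends bit 31 */
    uint64_t reg = (uint64_t)widened;     /* what the register holds */
    return (uint32_t)(reg >> 8);          /* keep the low 32 bits */
}
```

Values with bit 31 clear behave the same in both versions, which fits a bug
that only shows up with some inputs.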

[^integers]: [I already told you *integers are
    hard*](https://ekaitz.elenq.tech/bootstrapGcc8.html#size-problems)

As you might imagine, due to the lack of proper debug symbols and the fact that
the issue was so specific, this was really hard to deal with until we found out
the real problem. Of course, this very issue would affect many other programs
but this was the first time we saw it. That's why it's very important to fix
things properly, as they may have many ramifications.

#### GNU-Make

One of the other dependencies is GNU-Make, needed in many projects. In the
previous steps of the bootstrap we manage by running commands manually, but in
more complex projects GNU-Make is necessary.

In January we built Make using our bootstrappable TinyCC, but it didn't work!

At first it segfaulted when using long options (`--help`), while short ones
(`-h`) worked. This happened because our bootstrappable TinyCC was missing a
piece from the backport I did. The commit `db9d7b09` of the `riscv-mes` branch
and the following two show a clear history of how this went. We first realized
the `char *` was loaded to a register using an `lb` instruction, which is a
*load byte*. I realized this by printing the value of the pointer in hex
format, which showed that only the lower bytes were the same, while the higher
ones were empty. Then I disassembled the binary and found the `lb`, which
should have been an `ld` (*load doubleword*), as a doubleword is the size of a
pointer on a 64-bit machine.
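
The effect can be mimicked in plain C. This is only an illustration with
made-up function names and a made-up pointer value: it models the two
instructions as operations on the 64 bits that should have been loaded.

```c
#include <stdint.h>

/* What `ld` gives you: the whole 64-bit pointer. */
uint64_t as_loaded_by_ld(uint64_t pointer_bits)
{
    return pointer_bits;
}

/* What `lb` gives you: only the lowest byte, sign-extended to the
 * register width. Everything above the first byte is lost. */
uint64_t as_loaded_by_lb(uint64_t pointer_bits)
{
    int8_t low = (int8_t)(pointer_bits & 0xff);
    return (uint64_t)(int64_t)low;
}
```

Dereferencing the second result as a `char *` points to a nearly random low
address, so a segfault is the expected outcome.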

The problem here was that the `char *` was detected in the compiler as an array
of characters, which has a size of 1: a byte. The TinyCC I took the code from
uses a function that calculates the size of the type, a pretty reasonable thing
to have in a compiler. The problem was that the type information was not
stored properly, and that function calculated the size based on that wrong
information. My first attempt was to use a different function for that, but
when I sent a patch to upstream TinyCC, thinking they also had this issue, they
told me they didn't have it in the first place[^check]. That was more than
surprising for me, so I dug into the Git history until I realized they had a
very interesting commit: `Make casts lose top-level qualifiers`. Ah-ha! The
commit only has one line and then some tests. This is the content, without the
tests:

[^check]: Yes, I should have checked better. My bad.

``` diff
+    vtop->type.t &= ~ ( VT_CONSTANT | VT_VOLATILE | VT_ARRAY );
```

This removes the `VT_ARRAY` flag from the type, so the function that
calculates the size treats it as a pointer, which is 64 bits in our case, so
`ld` is emitted and we are happy. We cherry-picked the commit from upstream,
reverted our fix and went on.
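
A toy model of that mechanism, assuming a much simpler type representation
than TinyCC's: the `T_*` flags, `ToyType` and both functions are hypothetical
and only mirror the shape of the one-line change above.

```c
#include <stddef.h>

/* Hypothetical flag values; TinyCC's real VT_* constants differ. */
enum {
    T_CHAR      = 1,
    T_PTR       = 2,
    T_BASE_MASK = 0x0f,
    T_CONSTANT  = 0x10,
    T_VOLATILE  = 0x20,
    T_ARRAY     = 0x40
};

typedef struct {
    int t;              /* base type plus flags */
    size_t elem_size;   /* size of the pointed-to or element type */
    size_t count;       /* number of elements, for arrays */
} ToyType;

/* Size used to pick the load instruction: 1 -> lb, 8 -> ld. */
size_t toy_type_size(ToyType ty)
{
    if (ty.t & T_ARRAY)
        return ty.elem_size * ty.count;   /* arrays: element size * length */
    if ((ty.t & T_BASE_MASK) == T_PTR)
        return 8;                         /* pointers on a 64-bit target */
    return 1;                             /* char */
}

/* The cast drops the top-level qualifiers and the array flag. */
ToyType toy_cast(ToyType ty)
{
    ty.t &= ~(T_CONSTANT | T_VOLATILE | T_ARRAY);
    return ty;
}
```

With a stale array flag on a `char *`, the toy size is 1. After the cast it is
8, which is the difference between `lb` and `ld`.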

But of course that was not enough, that'd be too easy. We found some other
issues in Make.

Now Make was running and not failing, but it said *`"No makefile was found"`*
and it never ran the recipes. We realized later there was some kind of issue
when reading files, and my colleague Andrius found that the `getdents` system
call returns a different struct in 64 bits, while we were reading it as the
32-bit structure, so he fixed that in MeslibC for all 64-bit architectures.
This error makes a lot of sense in MeslibC, because all the previous attempts
in the bootstrapping were in 32 bits and our starting point only supports that.
That's another source of the errors we have: we are also making this whole
thing *64bit-ready*.
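
A minimal sketch of this kind of mismatch, assuming fixed-width stand-ins for
the two record layouts. `struct dirent64` follows Linux's `linux_dirent64`;
`struct dirent32` is a 32-bit style record. The helper names are made up, and
the real MeslibC definitions may differ.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 32-bit style record: two 4-byte fields before the length. */
struct dirent32 {
    uint32_t d_ino;
    uint32_t d_off;
    uint16_t d_reclen;
    char     d_name[];     /* starts at offset 10 */
};

/* 64-bit record, as in linux_dirent64. */
struct dirent64 {
    uint64_t d_ino;
    int64_t  d_off;
    uint16_t d_reclen;
    uint8_t  d_type;
    char     d_name[];     /* starts at offset 19 */
};

/* Write one 64-bit record into buf, like the kernel would
 * (without the alignment padding a real kernel adds). */
size_t make_record64(unsigned char *buf, uint64_t ino, const char *name)
{
    size_t name_off = offsetof(struct dirent64, d_name);
    uint16_t reclen = (uint16_t)(name_off + strlen(name) + 1);
    int64_t off = 0;

    memcpy(buf + offsetof(struct dirent64, d_ino), &ino, sizeof ino);
    memcpy(buf + offsetof(struct dirent64, d_off), &off, sizeof off);
    memcpy(buf + offsetof(struct dirent64, d_reclen), &reclen, sizeof reclen);
    buf[offsetof(struct dirent64, d_type)] = 0;
    memcpy(buf + name_off, name, strlen(name) + 1);
    return reclen;
}

/* Read the name back using each layout. */
const char *name_as_64(const unsigned char *rec)
{
    return (const char *)rec + offsetof(struct dirent64, d_name);
}

const char *name_as_32(const unsigned char *rec)
{
    return (const char *)rec + offsetof(struct dirent32, d_name);
}
```

Reading a 64-bit record with the 32-bit offsets lands inside `d_off` instead of
the name, so a lookup for `Makefile` never matches anything.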

Once Make was able to find and read the Makefile and run it, we found another
problem, this one related to the dates of the files. Make started to give us
weird messages like *`"Timestamp out of range; substituting..."`*. Later, I
found that some recipes were executed even if the files they required hadn't
changed.

This is not a big deal if you just want things to be built once, so we left
this as a not-very-important-thing until I used this Make in a Guix package.
The `gnu-build-system` in Guix first runs `./configure` (`configure` phase) and
later runs `make` (`build` phase). This `make` reran the `./configure` command
from the previous step, because it thought some of the files were changed
between both phases. This behavior is more problematic than it seems, because
Guix needs to fix the shebangs of all the scripts in the
project[^guix-shebang], and it has a phase for this between the ones I just
mentioned: `patch-generated-file-shebangs`. If it's the `make` run itself that
configures the project and right after that starts building, the shebangs of
the generated files are not fixed, and the process fails. The issue is not a
not-very-important-thing anymore!

[^guix-shebang]: Guix doesn't store binaries in the classic places. It does not
    follow the Filesystem Hierarchy Standard. It needs to replace the
    references to things like `#!/bin/bash` with something like
    `#!/gnu/store/295aavfhzcn1vg9731zx9zw92msgby5a-bash-5.1.16/bin/bash`

Of course, after what I just explained I was forced to fix this. Some debugging
sessions later I found the `stat` system call's result was not interpreted
correctly in MeslibC. There were some padding issues, so I fixed that for
RISC-V, which mostly fixed Make. Now Make, built using our bootstrappable
TinyCC, works well enough for us.
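
A sketch of how a padding mistake moves a timestamp. `struct stat_ok` is
loosely modelled on Linux's generic 64-bit `struct stat`. Which padding MeslibC
actually got wrong is not stated above, so dropping `pad1` in the second struct
is an arbitrary choice for illustration, and the field names are simplified.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct stat_ok {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    uint32_t st_nlink;
    uint32_t st_uid;
    uint32_t st_gid;
    uint64_t st_rdev;
    uint64_t pad1;
    int64_t  st_size;
    int32_t  st_blksize;
    int32_t  pad2;
    int64_t  st_blocks;
    int64_t  st_atime_sec;
    uint64_t st_atime_nsec;
    int64_t  st_mtime_sec;
    uint64_t st_mtime_nsec;
};

/* Same fields, but one padding member was forgotten. */
struct stat_missing_pad {
    uint64_t st_dev;
    uint64_t st_ino;
    uint32_t st_mode;
    uint32_t st_nlink;
    uint32_t st_uid;
    uint32_t st_gid;
    uint64_t st_rdev;
    int64_t  st_size;
    int32_t  st_blksize;
    int32_t  pad2;
    int64_t  st_blocks;
    int64_t  st_atime_sec;
    uint64_t st_atime_nsec;
    int64_t  st_mtime_sec;
    uint64_t st_mtime_nsec;
};

/* Read an 8-byte timestamp at a given offset of the kernel's buffer. */
int64_t read_time_at(const void *buf, size_t offset)
{
    int64_t value;
    memcpy(&value, (const char *)buf + offset, sizeof value);
    return value;
}
```

With the missing padding, every later field sits 8 bytes too early, so the
modification time is read from the nanoseconds of the access time. Timestamps
like that would be consistent with the out-of-range warnings and with recipes
running when nothing changed.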

#### TinyCC

In my talk this February at [FOSDEM-2024][fosdem2024] I explained that upstream
TinyCC was missing some RISC-V support and that we didn't have it working yet.
Since then we solved the main issue we had with it:

[fosdem2024]: https://fosdem.org/2024/schedule/event/fosdem-2024-1755-risc-v-bootstrapping-in-guix-and-live-bootstrap/

``` something
Unimplemented large addend for global address
```

I had no idea how to fix this, so I wrote an email to the person that wrote
most of the code around the relocations, and he gave me a very interesting
answer. Thank you, Michael.

That answer was more than enough for me to write the code for it (it was almost
done) and in a couple of hours I had a fix for this. The large addend support
was pretty simple, actually. It was just that relocations are still a little
bit scary for me, and the codebase doesn't help a lot.
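
As background on why an addend can be "large" at all (this is not the actual
TinyCC patch): RISC-V load, store and `addi` immediates are signed 12-bit
values, so anything outside that range has to be split into an upper part and
a 12-bit lower part. A minimal sketch of that split, with made-up function
names:

```c
#include <stdint.h>

/* Does the value fit in a signed 12-bit immediate? */
int fits_simm12(int64_t value)
{
    return value >= -2048 && value <= 2047;
}

/* Split value so that hi * 4096 + lo == value, with lo in
 * [-2048, 2047]. The low part is sign-extended by the hardware, so
 * the high part has to compensate when bit 11 is set. */
void split_hi_lo(int32_t value, int32_t *hi20, int32_t *lo12)
{
    int32_t lo = (int32_t)((uint32_t)value & 0xfffu);
    if (lo >= 0x800)
        lo -= 0x1000;                     /* sign-extend 12 bits */
    *lo12 = lo;
    /* value - lo is a multiple of 4096, so the division is exact. */
    *hi20 = (int32_t)(((int64_t)value - lo) / 4096);
}
```

For example, `0x12345fff` splits into a high part of `0x12346` and a low part
of `-1`.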

With this issue fixed, we can now go for upstream TinyCC and use it for later
steps in the project, as we do in the bootstrapping chains of other
architectures. Upstream TinyCC is more stable and capable than our
bootstrappable fork.

#### Binutils

We need to remember our goal is to build GCC. That's why we try to use upstream
TinyCC, as it is able to build it whereas our bootstrappable TinyCC might not
be.

Building GCC requires Binutils, so we tried to build it. We had several issues
with Binutils, and we haven't yet managed to produce Binutils programs that
don't explode. The problem is probably caused by limitations of our standard
library, so here comes the sidetrack.

We considered using Musl instead, as it's a powerful standard library that is
also very simple.

#### Musl

Musl is really cool. We've used it a lot as a reference for MeslibC, but Musl
is not used in Guix's bootstrapping process in other architectures. Our plan is
to try to use it for Binutils, to see if our broken binaries are because of
MeslibC or because of something else.

Musl, as most C standard libraries, requires some support for assembly, and
more specifically Extended Asm.

We already talked about Extended Asm[^extended-asm] support before but, in
summary, it was unimplemented in TinyCC's backend for RISC-V.
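
For readers who haven't seen it, this is roughly what Extended Asm looks like.
It is a minimal sketch, not code taken from Musl: `add_one` is a made-up
function, and the plain C branch only exists so the example also builds on
machines that are not RISC-V.

```c
/* Return x + 1, using a RISC-V instruction when possible. */
long add_one(long x)
{
    long result;
#if defined(__riscv) && defined(__GNUC__)
    /* %0 is the output operand, %1 the input operand. The compiler
     * picks the registers and substitutes them in the template. */
    __asm__ ("addi %0, %1, 1" : "=r"(result) : "r"(x));
#else
    result = x + 1;     /* fallback for other architectures */
#endif
    return result;
}
```

The part after the first colon is what the compiler backend has to implement:
it must choose registers for the operands and respect the constraints.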

Apart from that, TinyCC lacks some very important pseudoinstructions that are
used in Musl, and the assembly syntax it uses is not the one that the GNU
Assembler uses, so TinyCC is unable to assemble simple instructions like:

``` asm
ld a0, 8(a0)
```

As TinyCC expects something like:

``` asm
ld a0, a0, 8
```

Hmm...
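
The difference between both syntaxes is small enough to express as a rewrite.
The following is only an illustrative sketch with made-up function names, not
the way TinyCC's assembler parses operands: it reads a GNU-style memory operand
such as `8(a0)` and prints the three-operand form shown above.

```c
#include <stddef.h>
#include <stdio.h>

/* Parse "offset(register)". Returns 1 on success, 0 otherwise. */
int parse_gnu_mem_operand(const char *text, long *offset, char base[8])
{
    int consumed = -1;

    /* %n is only reached if the closing parenthesis matched. */
    if (sscanf(text, " %ld ( %7[^) ] ) %n", offset, base, &consumed) != 2)
        return 0;
    return consumed >= 0 && text[consumed] == '\0';
}

/* Build "op rd, rs, offset" from "op", "rd" and "offset(rs)". */
int to_three_operand(const char *op, const char *rd, const char *mem,
                     char *out, size_t out_size)
{
    long offset;
    char base[8];
    int written;

    if (!parse_gnu_mem_operand(mem, &offset, base))
        return 0;
    written = snprintf(out, out_size, "%s %s, %s, %ld", op, rd, base, offset);
    return written > 0 && (size_t)written < out_size;
}
```

Calling `to_three_operand("ld", "a0", "8(a0)", out, sizeof out)` fills `out`
with `ld a0, a0, 8`.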

[^extended-asm]: Extended Asm helps you call assembly blocks using C variables,
    and it also protects the variables you don't want to touch.
    You can read more about that in [GCC's
    documentation](https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html).


#### Back to TinyCC

This is where the sidetrack went so wild we went back to almost the beginning.
I wanted to make Musl build, so I started to write support for everything I
wanted it to do.

I implemented many pseudoinstructions and instructions that were missing and
Musl needed. This includes GNU Assembler syntax for memory access instructions
like loads and stores. By the way, don't trust them blindly because I realized
I did `jal` wrong (some relocation issue again!) and I had to fix it later.

I also added the `.option` directive for the RISC-V assembly, which is used
really often. I didn't really implement it yet, I only did enough to make the
builds pass. Most of the time the `.option` directive is used to disable the
linker relaxation, which TinyCC doesn't do anyway so... Why bother?

I also have a draft for the Extended Asm, and I have it kind of working. I am
not sure about some of the things I did but I feel it's pretty close.

The Extended Asm support is not upstreamed yet, but I sent it to the TinyCC
mailing list. The rest of the changes are already sent to TinyCC and you can
see them in the `mob` branch.

#### MeslibC

Of course, I can't stop, so I took all the support I wrote for TinyCC and tried
to apply it to the bootstrappable TinyCC.

I was also a little bit forced to do so, because we rebuild MeslibC with TinyCC
and after the changes we could no longer do it. When we started, we had to make
a copy of MeslibC that didn't have the GNU As style assembly and supported the
TinyCC style assembly instead. Mes' Guix package as-is only provides one of the
flavors of the MeslibC code, the TinyCC style one, which we can't rebuild with
the modern support in TinyCC.

My solution was to backport all the Extended Asm support and all the new
assembler to the bootstrappable TinyCC and then remove the MeslibC copy that
used the old syntax. I managed to make it build, but the executables generated
with it explode at the time of writing, so we need to review that further. In
any case, this is a good change because it reduces the amount of code we have,
and it uses the more recent TinyCC assembler, which has had many improvements
since I did the backport a year ago.


#### So...

It looks like we are back again at the very beginning, and near the end at the
same time, if you take into account what I shared in the latest post of the
series about GCC.

We still need to work on some other related projects, like Patch, that would
allow us to apply our bootstrapping patches, but that's also almost working. I
want to believe it's not going to give us many headaches in the future.

In summary, it looks like sometimes you have to run and later go back to walk
the same path, slowly this second time, with all the knowledge you got in the
first run.

Here we are. Sidetracked, but also pretty happy, as this is still going
forward.