Title: Intro to GCC bootstrap in RISC-V
Date: 2022-02-14
Category:
Tags: Bootstrapping GCC in RISC-V
Slug: bootstrapGcc0
Lang: en
Summary:
    Introduction to my new adventure bootstrapping GCC for RISC-V. Why, how,
    and who is going to pay for it.

You probably already know that I have spent more than a year having fun with
RISC-V and bootstrapping software from source.

As some of you may know from my [FOSDEM talk][fosdem22], [NLnet / NGI-Assure
provided the funds][nlnet] so I can spend more time on this during this year,
and I decided to work on GCC's bootstrapping process for RISC-V.

[nlnet]: https://nlnet.nl/project/GNUMes-RISCV/
[fosdem22]: https://fosdem.org/2022/schedule/event/riscvadventures/

### Why GCC

GCC is probably the most used compiler collection, period.  With GCC we can
compile the world and have a proper distribution directly from source, but who
compiles the compiler?[^1]

[^1]: *wHo wATcHes tHE wAtchMEN?*

Well, someone has to.

### The bootstrap

Bootstrapping a compiler with a long history like GCC for a new architecture
like RISC-V involves some complications, starting with the fact that the first
version of GCC that supports RISC-V needs a C++98-capable compiler in order to
build. C++98 is a really complex standard, so there's no way we can bootstrap a
C++98 compiler for RISC-V at the moment. The easiest way we can think of at
this point is to use an older version of GCC for that: one that is able to
build C++98 programs but only requires a C compiler to build itself. Older
versions of GCC, of course, don't have RISC-V support so... We need a
*backport*[^2].

[^2]: Insert "Back to the Future" music here.

So that's what I'm doing right now. I'm taking an old version of GCC that only
depends on C89 and is able to compile C++98 code and I'm porting it to RISC-V
so we can build newer GCCs with it.

Only needing C to compile it is a huge improvement because there are *Tiny C
Compilers* out there that can compile C to RISC-V, and those are written in
simple C that we can bootstrap with simpler tools of a more civilized world.

In summary:

- C++98 is too complex, but C89 is fine.
- GCC is the problem and also the solution.

### What about GNU Mes?

When *we*[^3] started with this effort we wanted to prepare GNU Mes, a small C
compiler that is able to compile a *Tiny C Compiler*, to work with RISC-V so we
could start working on this bootstrap process from the bottom.

[^3]: "*We*" means I shared my thoughts and plans with other people who have a
  much better understanding of this than myself.

Some random events, like someone else working on that part, made us rethink our
strategy so we decided to start from the top and try to combine both efforts at
the end. We share the same goal: full source bootstrap for RISC-V.

### Tiny C Compilers?

There are many small C compilers out there that are written in simple C and are
able to compile an old GCC that is written in C. Our favorite is TinyCC (Tiny C
Compiler).[^4]

[^4]: But there are some others that are really interesting (see
  [cproc](https://sr.ht/~mcf/cproc/), for example).

GNU Mes is able to build a patched version of TinyCC, which already supports
RISC-V (RV64 only), and we can use that TinyCC to compile the GCC version I'm
backporting.

We'd probably need to patch some things in both projects to make everything
work smoothly but that's also included in the project plan.
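
To make the chain explicit, here is a rough Python sketch of the plan described
so far. It is not part of the project: the `Stage` class, the `check_chain`
helper and the language labels are made up for illustration, and they simplify
what each tool really accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    written_in: str       # language needed to build this tool
    compiles: frozenset   # languages this tool can compile once built


# Hypothetical, simplified labels; the real tools are more nuanced.
CHAIN = [
    Stage("GNU Mes", "bootstrap seed", frozenset({"simple C"})),
    Stage("TinyCC (patched)", "simple C", frozenset({"C89"})),
    Stage("old GCC (RISC-V backport)", "C89", frozenset({"C89", "C++98"})),
    Stage("first GCC with RISC-V support", "C++98", frozenset({"C89", "C++98"})),
]


def check_chain(chain):
    """Return the name of the first stage that the previous stage
    cannot build, or None if every link in the chain holds."""
    for prev, nxt in zip(chain, chain[1:]):
        if nxt.written_in not in prev.compiles:
            return nxt.name
    return None


if __name__ == "__main__":
    print(check_chain(CHAIN))
    # Without the backport, TinyCC would have to build a C++98 program.
    without_backport = [s for s in CHAIN if "backport" not in s.name]
    print(check_chain(without_backport))
```

Dropping the backported GCC from the list breaks the chain at the first GCC
with RISC-V support, which is exactly the gap this project tries to fill.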

### Binutils

Binutils is also a problem, mostly because GCC, as we will discuss in the
future, does not compile to binary directly. GCC generates assembly code and
coordinates calls to `as` and `ld` (the GNU Assembler and Linker) to generate
the final binaries. Thankfully, TinyCC can act as an assembler and a linker,
and there's also the chance to compile a modern binutils version because it is
written in C.
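
As a rough illustration of that division of work, the sketch below only builds
the three command lines for a file like `hello.c`; it does not run them. The
`pipeline` helper and the file names are hypothetical. It assumes `gcc`, `as`
and `ld` are the tool names on your system, and it is not what the GCC driver
literally executes: a real link step also needs startup files and a C library.

```python
from pathlib import Path


def pipeline(source: str) -> list[list[str]]:
    """Build (but do not run) simplified compile/assemble/link commands."""
    src = Path(source)
    asm = src.with_suffix(".s")   # assembly emitted by the compiler
    obj = src.with_suffix(".o")   # object file produced by the assembler
    exe = src.with_suffix("")     # final binary produced by the linker
    return [
        ["gcc", "-S", str(src), "-o", str(asm)],  # C -> assembly
        ["as", str(asm), "-o", str(obj)],         # assembly -> object
        # Simplified: a real link also passes startup files and libc.
        ["ld", str(obj), "-o", str(exe)],         # object -> executable
    ]


if __name__ == "__main__":
    for step in pipeline("hello.c"):
        print(" ".join(step))
```

Only the first step belongs to GCC itself; the other two are the part that
binutils, or TinyCC acting as assembler and linker, has to cover.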

In any case, binary file generation and support must be taken into account,
because GCC is not the only actor in this film and RISC-V has some weird things
in the assembly and the binaries that have to be supported correctly.

### Conclusion

This is a very interesting project, where I need to dig into **BIG** stuff,
which is cool, but it also has a huge level of uncertainty, which scares the
hell out of me. I hope everything goes well...

In any case, I'll share all I learn here on the blog and I'll keep you all
posted with the news we have.

That's all for this time. If you have any questions or comments or want to
share your thoughts and feelings with me[^5] you can find my
[contact information here](https://ekaitz.elenq.tech/pages/about.html).

[^5]: Or even hire me for some freelance IT stuff 🤓

---

> PS: Big up to NLnet / NGI-Assure for the money.

<style>
.container{
    display: flex;
    flex-flow: row wrap;
    justify-content: center;
    gap: 40px;
}
.no-side-margin{
    margin:  0px;
}
</style>
<div class="container">
<img class="no-side-margin" src="{attach}/bootstrapGcc/nlnet.svg"     width=200px>
<img class="no-side-margin" src="{attach}/bootstrapGcc/NGIAssure.svg" width=200px>
</div>