Title: Milestone — Minimal RISC-V support in GCC 4.6.4
Date: 2022-04-08
Category:
Tags: Bootstrapping GCC in RISC-V
Slug: bootstrapGcc3
Lang: en
Summary:
    Description of the changes for minimal RISC-V support in GCC-4.6.4 and
    how I reached this point.

In the [series]({tag}Bootstrapping GCC in RISC-V) we already introduced GCC,
its internals, and the work I'm doing to make it able to bootstrap on RISC-V.
In this post we are going to tackle the backporting effort and see how I
managed to make GCC-4.6.4 compile a simple program to RISC-V.



### How to follow this post

As this is going to be deeply connected to the changes I introduced in the
codebase, I suggest you follow it directly in [the
repository](https://github.com/ekaitz-zarraga/gcc). The branch where I made the
changes is `riscv`, which starts from `releases/gcc-4.6.4`. As I will continue
adding changes on top of this, I left [a tag called `minimal-compiler`][tag]
that points to the contents of the repository when this blog post was written.

[tag]: https://github.com/ekaitz-zarraga/gcc/releases/tag/minimal-compiler

In any case, I'll share small pieces of the code in the post, but of course I
can't share everything here so I recommend you go to the sources. I won't
link the sources directly but mention where you can find the changes, so you
are not forced to follow all the links in the browser and can use your favorite
editor instead.



### Overview of the commits

The `riscv` branch where I did all the work is split into several commits on
top of `releases/gcc-4.6.4`, where it started.

First comes a series of 4 commits that make GCC-4.6.4 compilable with more
recent toolchains. These should later be separated into independent patches and
applied by the distribution tool, Guix in this case.

Next, a couple of commits add a precarious `guix.scm` file that should
compile the project properly. It's not fully ready for distribution yet, but
that's not really our job in the project, so I don't want to spend a lot of
time on that. At the moment it just works: you can run
`guix build -f guix.scm` from the project directory and it should build a
minimal compiler, as we'll see later. There's also a `channels.scm` file, so
you can use the exact packages I used thanks to the very powerful `guix
time-machine` command and replicate my exact build.

Even if I didn't want to spend a long time on the Guix package, I'd be lying
if I told you I didn't. Compiling legacy software is extremely difficult.
In this case, I had to patch the code to be compatible with more modern GCC
toolchains, package an old `flex`, choose lots of configure-time options...
Still, there are tons of things missing: there's no C++ support, the package
doesn't find the system's libraries such as glibc and it's not integrated with
the system's binutils. I don't know how I'm going to fix that to be honest, but
I don't want to think about that right now.

The next commits are what interests us the most: changes on top of GCC.

The first of them[^port] is just the RISC-V port commit from upstream GCC
applied on top of the project, being a little bit careful about
conflicts[^conflicts]. Obviously, this change doesn't really work, it doesn't
even compile, but it lets us see which changes were needed on top of it.

[^port]: `06166d9e5ff121fd3dfd6c0995621e557a023ef0`

[^conflicts]: I screwed up the ChangeLog files anyway LOL.

In the next commit[^md-files] I made a high-level fix on the Machine
Description files. If you remember from the [post about GCC
internals]({filename}01_internals.md), the machine description files are some
kind of Lisp-like files that describe both the translations between GIMPLE and
RTL and also between RTL and assembly, among other things. In this commit I
just removed some of the RTXs that were not available back in the 4.6.4 days
but were in use in the port. I'm talking, more specifically, about
`define_int_iterator` and `define_int_attr`.  Thankfully they were just a
couple of loops that were easy to unroll by hand.  Not a big deal.

[^md-files]: `af295d607786f96b4e8f2e35f41ca34820a9aacb`

Then, I made a larger commit that tries to fix the rest of the
`gcc/config/riscv` folder[^large-commit]. In this one I had two goals: make the
port compatible with the old C-based API and remove parts that weren't strictly
necessary but were complex to keep. This means I removed all the builtins
support so I didn't need to port them (nice trick, huh?) and I kept the code
related to memory models out of the equation. I may need to fix that in the
future, but I was looking for minimal support and I didn't need that for my
goal.

[^large-commit]: `14577a05e3d64c9e2a05e8f0ff1f8965ddb27b68`

After that I tried to compile the project and run it, but I realized there was
a problem with the argument handling of the compiler. It was unable to find
arguments like `-march` and it was always failing to compile anything.

I realized there was a weird file at `gcc/common/config/riscv/riscv-common.c`
that looked like it was handling input arguments, so I focused on porting that
one too. It happens that the old GCC didn't have that code structure:
everything was done in `gcc/config/` back then, so I moved the support and
made the argument handling follow the old API. That's the last commit of the
series[^last-commit].

[^last-commit]: `2b97a03a443fe8e408d7129bce9658032d0d9cd2`



### Deep diving

Now I'll try to explain the changes I made in the code here and there, but
first I have to explain the method I followed to make this.

It might be surprising but for the first time I didn't try to understand
everything but work my way through it. This means I have absolutely no clue
about what the code does in most of the places[^guilt]. I just looked at the
overall shape of it and tried to match that shape with the code found in other
architectures, mostly MIPS, which the RISC-V support was based on. If I found
anything that I didn't know how to convert, I would read how that thing was
implemented on MIPS when the RISC-V support was added and then compare that
implementation with the one at 4.6.4. That would give me an idea about how to
convert it to the old way of doing things.

[^guilt]: And I'm trying not to feel guilty for it.

So, yeah, most of the coding was a mental exercise of pattern matching code and
conversion. There are very few things that I really coded myself, that is,
deeply understanding what I was doing.

This doesn't really mean you don't need any knowledge to do this. Of course you
do. You need to understand what the code does at a very high level, and know
how targets are described in GCC[^gcc-course], but you don't really need to
know each function in detail.

Sadly, in some cases I had to read functions carefully and understand them, so
there's some knowledge needed, still.

[^gcc-course]: There's a great set of [videos about GCC at the GCC Resource
  Center](https://www.cse.iitb.ac.in/grc/index.php?page=videos). They
  specifically talk about GCC 4.6! I watched them before going for the code and
  they helped me a lot to understand how the code was organized and how GCC
  worked. I recommend them a lot.


#### First patch set

The first patch set is not really relevant. I just made it while I was trying
to compile the project without changes. The compilation ended with errors, I
reviewed them, went to the GCC issue tracker and searched. In some cases I was
lucky and found a patch that fixed them, in others I only found suggestions
and I had to fix the thing myself. Not really interesting, honestly.

#### The Guix package

The Guix part in `guix.scm` is not really interesting either, at least for the
moment. The most interesting part might be the addition of `flex-2.5` to the
inputs and the use of `local-file` as a source for the GCC package[^efraim].

[^efraim]: I learned this `local-file` thing from Efraim Flashner, currently a
  Guix maintainer, who gave a talk called "Compile it with Guix" where he
  introduces this method. Sadly, I can't find the talk on the web to link you
  to it.

All the rest is playing around with the configure flags and trying to read
Guix's GCC packages and [Janneke's work with the full-source
bootstrap][janneke].

[janneke]: https://gitlab.com/janneke/guix/-/blob/wip-full-source-bootstrap/gnu/packages/commencement.scm

Even with all that, there are some things missing, so I have to come back to
this in the future.

There is, though, a really interesting point to take into account. We already
said in the [post about GCC internals]({filename}01_internals.md) that GCC is
a driver that calls other programs, such as `as` and `ld` from GNU Binutils.
To test that our compiler can output RISC-V assembly we only need the very
basics, so we can ignore everything else and focus on one thing: I'm talking,
of course, about `cc1`, the C compiler.

That's why I only set the target to `all-gcc` and focus on that. Later we'll
need to dig deeper.

One of the issues I'll have to tackle is that the GCC I'm building is a
cross-compiler, but this whole project is being developed for a RISC-V target.
This doesn't let the compiler check itself using the staged approach[^staged],
which is something I'm interested in seeing.

[^staged]: In this process you compile GCC with the compiler you had
  (stage-1), then the resulting GCC compiles itself (stage-2), and the
  resulting GCC compiles itself again (stage-3). One way to make sure
  everything is correct is to compare the binaries of stage-2 and stage-3.
  If they are the same, there are chances that our code is correct.
  If they are different, our code is wrong. GCC's build system does this
  automatically (if `--disable-bootstrap` is not set), but you can't do it
  when cross-compiling, because the stage-1 compiler produces binaries for the
  target, which can't run on the machine doing the build. I would like to see
  the result of this process, but I can't at the moment.

Once the proper `guix.scm` file is generated, I'll prepare a package for the
RISC-V bootstrapping process. In that package I'll define the first 4 commits
as separate patches to apply on top of the source, but I'll remove them from
the original source. That way the codebase will continue to be compatible with
old toolchains and we'll only apply those patches where needed, that is, when
we try to build with more recent environments.

#### Machine Description files

The machine description files did not change that much during the years. Some
extra constructs were added but the idea, the goal and the shape of the files
didn't really change.

As we introduced already, the RISC-V port used `define_int_iterator` constructs
in order to simplify some of the work, repeating pieces of the machine
description file according to the integer iterator. Back in GCC 4.6.4 that
construct was not available, so I unrolled the loop by hand following the
example in the GCC documentation:

<https://gcc.gnu.org/onlinedocs/gccint/Int-Iterators.html>

Simply repeat the structures (unroll them) using the value of the iterators and
use the `define_int_attr` to set some of the fields too. The example in the
docs gives a good description on how to do it.

On the other hand, I also found that the RTL in the RISC-V port was using
`simple_return` in some places and I realized that didn't exist in the past. I
replaced that with `return`, hoping that it was the same, but I don't remember
if I reasoned further[^see]. In any case, you can take a look at
`gcc/rtl.def`[^def] and see how `SIMPLE_RETURN` was added later.

[^see]: See? That's why I try to write blog posts about the things I do, that
  way I don't forget things. It was too late for this.

[^def]: These `.def` files are a lot of fun in GCC's codebase. They appear
  really often. They are files that look like a bunch of similar function calls
  but are actually macro calls. These files are `#include`d into another file
  right after the macro is defined, so they generate code. Later, you can
  redefine the macro to create some other output and `#include` them again, so
  they'll always generate coherent code. This is used a lot for enums and
  switch-case statements: if you want them both to be coherent, you can move
  them to a `.def` file, define all the possible values of the enum there, and
  generate first the enum with the first `#include` and then the switch-case
  with a second `#include`. Take a look at `gcc/rtl.c` and you'll see what I
  mean. (Yes I know this is like hardcore magic and it's hard to understand, I
  didn't choose to do this).
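
The `.def` trick described in the footnote can be sketched in a single file.
This is only an illustration: the list below stands in for a `.def` file, and
every name in it (`DEMO_RTX_CODES`, `DEF_DEMO_RTX`, `demo_code_name`...) is
made up for the example, not taken from GCC.

```c
/* Stand-in for a `.def` file. In GCC the list would live in its own file and
   be pulled in with #include; a macro keeps this sketch in one file.
   All entries are hypothetical.  */
#define DEMO_RTX_CODES                 \
  DEF_DEMO_RTX (DEMO_SET, "set")       \
  DEF_DEMO_RTX (DEMO_RETURN, "return") \
  DEF_DEMO_RTX (DEMO_PLUS, "plus")

/* First expansion: each entry becomes an enum value.  */
#define DEF_DEMO_RTX(ENUM, NAME) ENUM,
enum demo_rtx_code { DEMO_RTX_CODES DEMO_LAST_CODE };
#undef DEF_DEMO_RTX

/* Second expansion: each entry becomes a case of a switch, so the enum and
   the switch can never get out of sync.  */
static const char *
demo_code_name (enum demo_rtx_code code)
{
  switch (code)
    {
#define DEF_DEMO_RTX(ENUM, NAME) case ENUM: return NAME;
      DEMO_RTX_CODES
#undef DEF_DEMO_RTX
    default:
      return "unknown";
    }
}
```

Adding a new line to the list updates both the enum and the switch at once,
which is the whole point of the technique.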

#### Matching the API

There are other more meaningful changes. The large commit[^large-commit] is
full of changes related to the conversion back to the C API.

The most obvious ones are converting from `rtx_insn *` to `rtx`, and
adding/removing machine modes where needed. It was just a matter of searching
the functions being used in the MIPS target and trying to match them. Boring,
and probably wrong in a couple of places, but looks like it's working, I don't
know. Examples:

``` diff
-  emit_insn (gen_rtx_SET (target, src));
+  emit_insn (gen_rtx_SET (VOIDmode, target, src));
```

``` diff
-    op = plus_constant (Pmode, UNSPEC_ADDRESS (base), INTVAL (offset));
+    op = plus_constant (UNSPEC_ADDRESS (base), INTVAL (offset));
```

There were a couple of functions using a small class called `cumulative_args_t`
that was easy to convert to `CUMULATIVE_ARGS *` by just removing the calls to
`get_cumulative_args` and `pack_cumulative_args`. In C everything is rougher
and lower level. Thankfully in this case, the low level API was still present so
we could just use that instead of the new C++ one, and removing the abstraction
level was trivial. See `riscv_setup_incoming_varargs` in
`gcc/config/riscv/riscv.c` as an example. There might be some things wrong, but
it looks reasonable.
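
The shape of that conversion can be sketched with mock types. Nothing below is
GCC's real definition: the struct layouts and the two `*_next_reg` functions
are hypothetical, and only the names `CUMULATIVE_ARGS`, `cumulative_args_t`,
`get_cumulative_args` and `pack_cumulative_args` come from the text above.

```c
/* Mock types, NOT GCC's real ones: just enough to show the shape.  */
typedef struct { int num_gprs; } CUMULATIVE_ARGS;
typedef struct { void *p; } cumulative_args_t;  /* opaque wrapper */

static CUMULATIVE_ARGS *
get_cumulative_args (cumulative_args_t arg)
{
  return (CUMULATIVE_ARGS *) arg.p;
}

static cumulative_args_t
pack_cumulative_args (CUMULATIVE_ARGS *arg)
{
  cumulative_args_t packed;
  packed.p = arg;
  return packed;
}

/* Newer style: the hook receives the wrapper and unpacks it first.  */
static int
new_style_next_reg (cumulative_args_t cum_v)
{
  CUMULATIVE_ARGS *cum = get_cumulative_args (cum_v);
  return cum->num_gprs++;
}

/* Older style: the hook receives the pointer directly, so converting back
   means dropping the unpacking line and changing the parameter type.  */
static int
old_style_next_reg (CUMULATIVE_ARGS *cum)
{
  return cum->num_gprs++;
}
```

Both functions do the same work. The only difference is whether the caller
has to go through `pack_cumulative_args` before calling the hook.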

There were also a couple of `std::swap` calls here and there I needed to get
rid of. I made a temporary variable and made the swap by hand in the classic
way.
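
For reference, the classic swap looks like this. The helper name and the
`int` type are just for illustration, the values being swapped in the port are
of other types.

```c
/* Hypothetical helper: swap two values through a temporary, which is what
   std::swap does for us in C++.  */
static void
swap_ints (int *a, int *b)
{
  int tmp = *a;
  *a = *b;
  *b = tmp;
}
```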

Some other changes were harder to spot. Like these:

``` diff
            || !TYPE_MIN_VALUE (index)
-           || !tree_fits_uhwi_p (TYPE_MIN_VALUE (index))
-           || !tree_fits_uhwi_p (elt_size))
+           || !host_integerp(TYPE_MIN_VALUE (index),0)
+           || !host_integerp(elt_size,0))
          return -1;
 
-       n_elts = 1 + tree_to_uhwi (TYPE_MAX_VALUE (index))
-                  - tree_to_uhwi (TYPE_MIN_VALUE (index));
+       n_elts = 1 + TREE_INT_CST_LOW(TYPE_MAX_VALUE (index))
+                  - TREE_INT_CST_LOW (TYPE_MIN_VALUE (index));
```

All those functions and macros look pretty different, but they happen to do
more or less the same. What I did here was: read the newer MIPS implementation,
try to find those and then go back in time to the old MIPS implementation and
see what they were using instead. It wasn't obvious at the beginning so I read
the definitions of all of those things (ctags for the win!) and I even had to
define some like `sext_hwi`, which I added to `gcc/hwint.h` as best I could.
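
To give an idea of what a helper like `sext_hwi` does, here is a standalone
sketch of sign extension. It is not the code in `gcc/hwint.h`: the name
`demo_sext` is made up, it uses `int64_t` instead of GCC's `HOST_WIDE_INT`,
and it assumes `prec` is between 1 and 64.

```c
#include <stdint.h>

/* Sign-extend the low PREC bits of SRC to 64 bits.
   Assumes 1 <= prec <= 64. Hypothetical stand-in for sext_hwi.  */
static int64_t
demo_sext (int64_t src, unsigned int prec)
{
  uint64_t mask = (prec == 64) ? ~UINT64_C (0)
                               : ((UINT64_C (1) << prec) - 1);
  uint64_t value = (uint64_t) src & mask;        /* keep the low PREC bits */
  uint64_t sign_bit = UINT64_C (1) << (prec - 1);

  /* Flipping the sign bit and subtracting it back propagates it upwards.
     The final conversion to a signed type relies on the usual
     two's-complement behaviour of common compilers.  */
  return (int64_t) ((value ^ sign_bit) - sign_bit);
}
```

With a 12-bit field, which is the size of many RISC-V immediates, `0xFFF`
becomes `-1` and `0x7FF` stays `2047`.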

#### The include dance

If you check the changes on the top of `gcc/config/riscv/riscv.c`, you'll see
there are a lot of `#include`s removed and some new ones are added. This is
normal, as the older C API was very different to the newer C++ one, but also
because many of these includes were not really used inside of the code. First I
reviewed which files actually existed, but later I just copied the list from
MIPS and rearranged it until the thing compiled.

#### Crazy changes and inventions

Some other changes were crazier. I had to add the `riscv_cpu_cpp_builtins`
which was defined in `gcc/config/riscv/riscv-c.c` but I had no way to make it
work so I copied what was done in other places and made it a huge macro, added
it to `gcc/config/riscv/riscv.h` and prayed. The compiler was happy with that
change, and I was too. That let me remove the `riscv-c.c` file from the
compilation process, even if it's still included in the repository (yeah, I
know...).

The `riscv.h` file has some other magic tricks too. The `ASM_SPEC` is a lot of
fun now. Basically a copy of somewhere else, because defining the craziest
macro I've seen in my life was too much for me:

``` diff
 #define ASM_SPEC "\
 %(subtarget_asm_debugging_spec) \
-%{" FPIE_OR_FPIC_SPEC ":-fpic} \
+%{fpic|fPIC|fpie|fPIE:-k}\
 %{march=*} \
 %{mabi=*} \
 %(subtarget_asm_spec)"
```

Wanna see the macro? Well you asked for it (this is just half of it):

``` c
#ifdef ENABLE_DEFAULT_PIE
#define NO_PIE_SPEC		"no-pie|static"
#define PIE_SPEC		NO_PIE_SPEC "|r|shared:;"
#define NO_FPIE1_SPEC		"fno-pie"
#define FPIE1_SPEC		NO_FPIE1_SPEC ":;"
#define NO_FPIE2_SPEC		"fno-PIE"
#define FPIE2_SPEC		NO_FPIE2_SPEC ":;"
#define NO_FPIE_SPEC		NO_FPIE1_SPEC "|" NO_FPIE2_SPEC
#define FPIE_SPEC		NO_FPIE_SPEC ":;"
#define NO_FPIC1_SPEC		"fno-pic"
#define FPIC1_SPEC		NO_FPIC1_SPEC ":;"
#define NO_FPIC2_SPEC		"fno-PIC"
#define FPIC2_SPEC		NO_FPIC2_SPEC ":;"
#define NO_FPIC_SPEC		NO_FPIC1_SPEC "|" NO_FPIC2_SPEC
#define FPIC_SPEC		NO_FPIC_SPEC ":;"
#define NO_FPIE1_AND_FPIC1_SPEC	NO_FPIE1_SPEC "|" NO_FPIC1_SPEC
#define FPIE1_OR_FPIC1_SPEC	NO_FPIE1_AND_FPIC1_SPEC ":;"
#define NO_FPIE2_AND_FPIC2_SPEC	NO_FPIE2_SPEC "|" NO_FPIC2_SPEC
#define FPIE2_OR_FPIC2_SPEC	NO_FPIE2_AND_FPIC2_SPEC ":;"
#define NO_FPIE_AND_FPIC_SPEC	NO_FPIE_SPEC "|" NO_FPIC_SPEC
#define FPIE_OR_FPIC_SPEC	NO_FPIE_AND_FPIC_SPEC ":;"
```
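
What makes these macros hard to read is that they rely on C pasting adjacent
string literals together. As a sketch, taking only the defines from the
listing above that `FPIE_OR_FPIC_SPEC` depends on, we can let the compiler
build the final string. The variable `demo_asm_fragment` is made up for the
example.

```c
/* Copied from the listing above (ENABLE_DEFAULT_PIE half only).  */
#define NO_FPIE1_SPEC          "fno-pie"
#define NO_FPIE2_SPEC          "fno-PIE"
#define NO_FPIE_SPEC           NO_FPIE1_SPEC "|" NO_FPIE2_SPEC
#define NO_FPIC1_SPEC          "fno-pic"
#define NO_FPIC2_SPEC          "fno-PIC"
#define NO_FPIC_SPEC           NO_FPIC1_SPEC "|" NO_FPIC2_SPEC
#define NO_FPIE_AND_FPIC_SPEC  NO_FPIE_SPEC "|" NO_FPIC_SPEC
#define FPIE_OR_FPIC_SPEC      NO_FPIE_AND_FPIC_SPEC ":;"

/* The fragment that was removed from ASM_SPEC. After macro expansion the
   adjacent literals are concatenated into a single string:
   "%{fno-pie|fno-PIE|fno-pic|fno-PIC:;:-fpic}"  */
static const char demo_asm_fragment[] = "%{" FPIE_OR_FPIC_SPEC ":-fpic}";
```

Printing `demo_asm_fragment` is a quick way to see what the driver actually
receives once all the macro layers are gone.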

Well anyway, more things were basically made up like that, like these lines in
`gcc/config/riscv/linux.h`:

``` diff
-#define TARGET_OS_CPP_BUILTINS()                               \
-  do {                                                         \
-    GNU_USER_TARGET_OS_CPP_BUILTINS();                         \
-  } while (0)
+#define TARGET_OS_CPP_BUILTINS()  LINUX_TARGET_OS_CPP_BUILTINS()
```


``` diff
   %{!shared: \
     %{!static: \
       %{rdynamic:-export-dynamic} \
-      -dynamic-linker " GNU_USER_DYNAMIC_LINKER "} \
+      -dynamic-linker " LINUX_DYNAMIC_LINKER "} \
     %{static:-static}}"
```

I just copied from other places because there were absolutely no references to
those macros, so... I thought the best way to do this was to copy what other
targets did.

Of course this whole thing is not really tested right now, because this affects
how the linker is called, but that was broken anyway because of my distribution
of choice (Guix I love you but...) so what could I do? Just make them up and
fix them later sounded like a good plan.

As I already mentioned, I left builtins and memory models out of the equation.
Just commented them out and hoped everything worked properly for small
programs. I will try larger programs later.

#### Argument handling

The last commit[^last-commit] was a little bit hard to do too. The changes in
it came from a file that was completely out of place, as we said earlier, so I
reviewed other architectures to find out how they dealt with this. The API was
pretty different, so the first thing I did was to make the functions' formal
arguments fit those of the old API, and then I started making changes.

It was really hard to realize how the `MASK_*` macros worked just by looking at
the code, because they were defined nowhere!

The problem was I wasn't looking in the correct place. More code generation
magic! The `gcc/config/riscv/riscv.opt` file is what handles all those masks
and `TARGET_*` macros, like `TARGET_MUL` to check if the target has the
multiplication extension. All those were defined there, even if the definition
was obscure and hard to match with anything else in the code[^hard-to-match].
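
To make it less magic, this is roughly the shape of the code that ends up
being generated from an `.opt` file. Treat it as a sketch: the bit positions
are invented, `MASK_DIV` and `TARGET_DIV` are only there as a second example,
and in GCC `target_flags` is a global defined elsewhere, not in a file like
this one.

```c
/* Each mask is one bit of the target_flags word. Bit positions here are
   invented for the example.  */
#define MASK_MUL (1 << 0)
#define MASK_DIV (1 << 1)

/* Stand-in for GCC's global flags word.  */
static int target_flags;

/* The TARGET_* macros just test the corresponding bit.  */
#define TARGET_MUL ((target_flags & MASK_MUL) != 0)
#define TARGET_DIV ((target_flags & MASK_DIV) != 0)
```

Since the names are assembled by the generator, searching the sources for
`MASK_MUL` or `TARGET_MUL` finds nothing, which matches what happened to me.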

Once that was understood everything else was easier to do, "just follow MIPS
and you'll be fine" I told myself, and it worked. Moved everything to `riscv.c`
where all the other target description macros and functions are defined and...
Boom! Working compiler.

[^hard-to-match]: I say "hard to match" because searching for `TARGET_MUL` or
  `MASK_MUL` gave **NO** results, and searching for `MUL` gave too many.

### Result

With all these changes it is now possible to generate a minimal compiler and
compile a file. As we said, we are only interested in the C to assembly
conversion at the moment, and that's what we have and nothing else.

Taking the project as it is right now you can run:

``` bash
$ guix build -f guix.scm
...
/gnu/store/gsq72r3xnv7b2f1l4z5idpy3j900hizk-gcc-4.6.4-HEAD-debug
/gnu/store/qglp0cx0nq2nblcg9ya4gmc5gfk2amjg-gcc-4.6.4-HEAD-lib
/gnu/store/l612a4h9a6l4hs7kq49rph4clwf6l2k5-gcc-4.6.4-HEAD
```

So you'll get something like this:

<style>
code {
    line-height: 1;
}
</style>

``` bash
$ tree /gnu/store/l612a4h9a6l4hs7kq49rph4clwf6l2k5-gcc-4.6.4-HEAD
/gnu/store/l612a4h9a6l4hs7kq49rph4clwf6l2k5-gcc-4.6.4-HEAD
├── bin
│   ├── riscv64-unknown-linux-gnu-cpp
│   ├── riscv64-unknown-linux-gnu-gcc
│   ├── riscv64-unknown-linux-gnu-gcc-4.6.4
│   └── riscv64-unknown-linux-gnu-gcov
├── etc
│   └── ld.so.cache
├── libexec
│   └── gcc
│       └── riscv64-unknown-linux-gnu
│           └── 4.6.4
│               ├── cc1
│               ├── collect2
│               ├── install-tools
│               │   ├── fixincl
│               │   ├── fixinc.sh
│               │   ├── mkheaders
│               │   └── mkinstalldirs
│               └── lto-wrapper
├── riscv64-unknown-linux-gnu
│   └── lib
└── share

...

16 directories, 28 files
```

If you want to try it, you can generate an extremely simple C file and give it
a go:

``` bash
$ cat <<END > hello.c
int main (int argc, char * argv[]){
    return 19;
}
END

$ /gnu/store/...-gcc-4.6.4-HEAD/bin/riscv64-unknown-linux-gnu-gcc -S hello.c
$ cat hello.s
	.file	"hello.c"
	.option nopic
	.text
	.align	1
	.globl	main
	.type	main, @function
main:
	add	sp,sp,-32
	sd	s0,24(sp)
	add	s0,sp,32
	mv	a5,a0
	sd	a1,-32(s0)
	sw	a5,-20(s0)
	li	a5,19
	mv	a0,a5
	ld	s0,24(sp)
	add	sp,sp,32
	jr	ra
	.size	main, .-main
	.ident	"GCC: (GNU) 4.6.4"
```

This can later be assembled and linked using binutils without much
trouble, as we may have mentioned in earlier posts.


### Conclusion

The process, as you can see, is pretty much a pattern matching exercise, as I
already mentioned in the beginning. Of course there were some places where I
needed to review the different APIs and their implementation, but those were
just a few. Not bad. We made this "work" in a short period of time and it looks
pretty good.

Now I need to test this further, make more complex programs and try it, but
it's actually very difficult to do with the current compilation process because
the standard C library is not found correctly and the assembler and the linker
have to be dealt with independently.  This means I need to fix the context
first and then review the compiler itself.

On the other hand, the memory model related code, the builtins and the code I
basically made up are a worrying part of the project, because they might be a
point of failure in the future. If they only matter for optimizations and
multithreading, that might not be an issue, because I don't know how much of
that is used in the GCC version we are going to compile with this compiler.
Remember our backport's only goal is to compile a more recent GCC with it, so
we don't really need to care about other programs.

I already asked some people[^people] about the memory model parts and I got a
very simple solution from them (basically forget about the memory models and
always make a `fence` before and after synchronization code), so that's going
to be solved for the next post, and I can always review the builtins later if I
need them.
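
Just to illustrate the idea at the C level, and not what the backend will
actually emit, the "fence before and after" approach looks like this with C11
atomics. The function name is hypothetical, and the real change would live in
the machine description rather than in user code.

```c
#include <stdatomic.h>

/* Conservative atomic add: ignore the requested memory model and put a full
   fence on both sides of a relaxed operation. Hypothetical sketch only.  */
static int
conservative_fetch_add (atomic_int *counter, int value)
{
  atomic_thread_fence (memory_order_seq_cst);
  int old = atomic_fetch_add_explicit (counter, value, memory_order_relaxed);
  atomic_thread_fence (memory_order_seq_cst);
  return old;
}
```

It trades performance for simplicity, as it orders more than strictly needed.
That is fine for a compiler whose only job is to build a newer compiler.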

[^people]: I asked Andrew Waterman himself (one of the authors of RISC-V, and
  the current maintainer of the RISC-V GCC target). Yep, and he actually
  answered.

The rest of the code looks like it would work in more complex cases, but still
this needs proper testing and I need to be able to include the standard C
library for that.


### Reviewing the code

Of course, we are going to find bugs, and I did find some bugs during the
development process. The code review is really hard to do so it's better
to use tricks and magic.

First of all, we need some debug symbols for `gdb` to find where the errors are
and be able to debug them properly. The defined Guix package has a
strip-binaries step that moves all the debug symbols to a separate folder:

``` bash
$ guix build -f guix.scm
...
/gnu/store/gsq72r3xnv7b2f1l4z5idpy3j900hizk-gcc-4.6.4-HEAD-debug
/gnu/store/qglp0cx0nq2nblcg9ya4gmc5gfk2amjg-gcc-4.6.4-HEAD-lib
/gnu/store/l612a4h9a6l4hs7kq49rph4clwf6l2k5-gcc-4.6.4-HEAD
```

The `debug` directory there contains the debug symbols of the binaries so we
can just call `gdb` and then use the `symbol-file` command to load the debug
symbols associated with the program itself.

It is important to note that loading the `gcc` binary is a problem because it
is a driver that `exec`s other binaries, so the errors can't be really followed
properly. It's better to choose the specific program we want to debug, normally
`cc1`.

This happened to be extremely important because I forgot to convert one
function to the old API and it was giving a segmentation fault. Using the GNU
Debugger I found the source of the error and I just replaced formal arguments
with the proper ones.


### Last words

So, all that being said, we covered the changes, the possible problems, how to
debug and what's coming next. That was basically it.

If you have any questions, suggestions, comments, or anything you want to share
about this, contact me[^contact]. I'd be very happy to discuss.

From here, the plan is to review what I already did, test more complex software
and share the results with you and also try to make the compilation process
more reasonable. I hope it's easier to do than it looks.

Wish me luck.

[^contact]: You can find my contact info in the [About
  page](/pages/about.html).