summaryrefslogtreecommitdiff
path: root/content/fasterMes
diff options
context:
space:
mode:
Diffstat (limited to 'content/fasterMes')
-rw-r--r--content/fasterMes/02_garbage_collector.md274
1 files changed, 274 insertions, 0 deletions
diff --git a/content/fasterMes/02_garbage_collector.md b/content/fasterMes/02_garbage_collector.md
new file mode 100644
index 0000000..8b7ee26
--- /dev/null
+++ b/content/fasterMes/02_garbage_collector.md
@@ -0,0 +1,274 @@
+Title: GNU Mes and garbage collection
+Date: 2025-09-17
+Category:
+Tags: GNU Mes interpreter speedup
+Slug: fasterMes2
+Lang: en
+Summary: How does the GNU Mes garbage collector work and how to improve it.
+
+In our previous blogpost in the series we described how GNU Mes' scheme
+interpreter handles evaluation and how fast it was. There were some details
+that weren't really obvious, though: why does it really do all that magic to
+emulate function calls? Is it just to make it look like scheme?
+
+In this blog post we'll explain how does the garbage collector work in GNU Mes,
+try to take a new look to the `eval_apply` function from the perspective of the
+garbage collector, and propose an improvement that would enable a different
+style of programming.
+
+### GNU Mes' garbage collection
+
+GNU Mes' scheme interpreter has a very simple approach to memory management. It
+allocates a big chunk of memory when the interpreter is run and that is used
+for every scheme value. All scheme values are represented with the same
+`struct`, called `struct scm`, and every `make_`whatever function returns a
+pointer to an `struct scm` bump-allocated in that big chunk of memory which
+we'll call *arena*.
+
+The bump-allocation is very easy to do, and doesn't require much computation.
+It's just in increase in the pointer to the first free element in the *arena*.
+
+If we just had this, our scheme interpreter would just eat memory and never
+release or reuse the space we allocated for the scheme values we are not using
+anymore. This is a possible approach, really, but GNU Mes does provide garbage
+collection so the memory is reused and we don't run out of our precious
+resource.
+
+Mes uses a stop-the-world copying garbage collector. Similar to the one
+described in the "Structure and Interpretation of Computer Programs" (aka SICP)
+book or the one described in [This Wikipedia entry about Cheney's garbage
+collection algorithm based on a paper from 1970][cheney-wiki]. It starts in few
+scheme values and copies them from the current arena to some other piece of
+memory, one after the other. It also check their internal values like elements
+in a list, a vector or so on, and copies those too, adjusting the pointers that
+refer to them. When a scheme value is copied, its memory representation is
+tagged with a *broken heart* (this appears in SICP), something that identifies
+that value as already processed, and the pointer to the new location replaces
+its value. Now, it can check if an object that is being copied points to a
+value that has been already copied, and in that case, it only needs to update
+the pointer of its reference.
+
+[cheney-wiki]: https://en.wikipedia.org/wiki/Cheney%27s_algorithm
+
+Once all the objects and its references have been processed, what we have is a
+new memory space that contains only the objects in use, one after the other,
+with no empty spaces in between. This is why this process is called also a
+compacting garbage collector.
+
+In Mes, once the new piece of memory is set up, it is moved back to the
+original position, element by element (remember all elements are the same
+size). In a process that in Mes is called `flip`. The goal of the flip is to
+keep the memory footprint low, only requiring a maximum memory of the twice of
+the current memory usage (and oftentimes way less) which, potentially, is
+smaller than having two fixed semi-spaces as this kind of garbage collectors
+often have.
+
+But now two things are left to explain: when is this whole process run and how
+it is kickstarted.
+
+The garbage collector is run during the interpreter execution. When the
+interpreter's programmer decides to (by adding a call to the garbage collection
+function, `gc_check`). This function will check if the memory usage is large
+enough so the garbage collecting is worth the effort and, in affirmative case,
+then run the garbage collector process.
+
+The first thing the garbage collection process does is copy a group of objects
+that will kickstart the copying of the rest of the *live* objects, meaning
+those objects that the running scheme program still has access to. Those
+objects are called *roots*, and they are fundamental to understand how this
+whole thing can be improved. The roots, more than being a great hip hop band,
+are easy to deduce what they can be in the interpreter: the current
+*environment*, the *stack*, the known symbols, and such.
+
+Currently in Mes the roots are fixed, meaning they are hardcoded in the code,
+like so:
+
+``` clike
+ struct scm *new_cell_nil = g_free;
+ struct scm *s;
+ for (s = cell_nil; s < g_symbol_max; s = s + M2_CELL_SIZE)
+ gc_copy (s);
+
+ g_symbols = gc_copy (g_symbols);
+ g_macros = gc_copy (g_macros);
+ g_ports = gc_copy (g_ports);
+ scm_hash_table_type = gc_copy (scm_hash_table_type);
+ scm_variable_type = gc_copy (scm_variable_type);
+ M0 = gc_copy (M0);
+ M1 = gc_copy (M1);
+
+ long i;
+ for (i = g_stack; i < STACK_SIZE; i = i + 1)
+ copy_stack (i, gc_copy (g_stack_array[i]));
+
+ gc_loop (new_cell_nil);
+```
+
+### The issues
+
+This kind of copying garbage collectors have a really two very obvious
+problems that affect the way we write our C code.
+
+Consider this code:
+
+``` clike
+struct scm *val = make_number (10); // Makes a scheme value = 10
+gc_check (); // May garbage collect
+val->value; // What do we have there?
+```
+
+If we make a number like in the example and then garbage collect, the number
+will be allocated in the *arena*, but during garbage collection it won't have
+anything pointing to it so it would be cleaned from the memory. Interestingly
+enough, it won't probably produce a segfault or any kind of error like that,
+because it will try to access to the *arena*, and that's valid memory, but
+it would still explode in very dirty ways afterwards, maybe after some other
+garbage collection loops. Nasty stuff.
+
+Also, even if you keep the variable from being collected, it would be moved by
+the garbage collector, which will give you a very low chance of the value
+falling in the same spot, so your pointer would be outdated, and you wouldn't
+have any way to know where the variable is now.
+
+Do you want an easy solution?
+
+First, register that `val` as another root in the garbage collector, so it's
+not lost in each loop. The current garbage collector in Mes needs to hardcode
+the roots, so if this example were just a local variable for a function, you'd
+need to make it global for the garbage collector to find it. Of course, you
+shouldn't do this for every single variable in your code, but if you make sure
+you only need a couple of temporaries you could call them `R1` and `R2` and use
+those all the time for everything[^oh-yes].
+
+[^oh-yes]: Oh! Yes! I know what you are thinking. Trust me.
+
+That would leave you with the moving problem. After a garbage collector run,
+your variables `R1` and `R2` could potentially point to a memory location where
+they are not anymore, because of the copying.
+
+At this point if you were really inspired, you could implement a stack of those
+lets call them "*registers*" (`R1`, `R2`...), register it as a root, and there
+you would have emulated a function call system that respects your garbage
+collector. Oh! *But isn't that what we described in the previous blog post?*
+
+Implementing a stack that knows where to push and where to pop (those global
+registers) would make every single piece of your code keep your pointers to the
+variables coherent in the C world. You would just need to refer to `R1` or
+whatever, and it would always point to the proper memory location after the pop
+is run.
+
+But of course this is a little bit of a Henry Ford situation here: *"everybody
+will have a car of the color they like as long as it is black"*. So you can
+have all the variables you want as long as they are `R0`, `R1`...
+
+
+### One solution
+
+Consider this code:
+
+``` clike
+struct scm *
+macro_expand_program (struct scm *program)
+{
+ struct scm *cursor = program;
+ gc_preserve (&program);
+ for (cursor = program; cursor != cell_nil; cursor = cursor->cdr)
+ {
+ gc_preserve (&cursor);
+ cursor->car = macro_expand_expr (cursor->car); // May GC
+ gc_release (1); // cursor
+ }
+ gc_release (1); // program
+ return program;
+}
+```
+
+In this piece of code we can see how `macro_expand_expr` could try to garbage
+collect and make us lose references to our local variables. By calling
+`gc_preserve` we register them as roots, and we can release them later, when we
+don't need them anymore. We send the address to the pointer to the garbage
+collector so it's able to fix the pointers when the copying is done.
+
+This looks boring and boilerplatey, but we would only need it for those
+variables that would cross the function barrier to functions that we know that
+may garbage collect. The rest of the cases we can just use an ordinary C
+variable without preservation.
+
+The improvements I see here is we can call the variables whatever we want, use
+as many as we want and actually use C function calls without all the `goto` and
+`push_cc` ceremony we described in the previous article, which is also
+extremely confusing to me.
+
+Also, we could get rid of all the other global roots pushing them first and
+never releasing them.
+
+#### How is this thing implemented
+
+I have to admit it's not the greatest solution in the world but it works well
+enough if you are careful.
+
+First, I added a new scheme value type for roots, that can store a reference to
+an scheme value and the address of the pointer to update. It just covers the
+simplest case.
+
+The `gc_preserve` function adds a root to a global list of preserved roots. The
+`gc_release` function removes the top `n` elements from the list of preserved
+roots.
+
+With that ready, the garbage collector only needs to handle that global list as
+any other hardcoded root we mentioned before, but after the `gc_flip` the
+pointers stored in the roots themselves are updated. This means the pointers in
+our C code would be updated when we exit the garbage collection operation,
+keeping our C pointers consistent with the memory.
+
+So, that leaves us with this copy step:
+
+``` diff
+@@ -647,6 +695,8 @@ gc_ ()
+ gc_copy (s);
+
+ g_symbols = gc_copy (g_symbols);
++ if (g_preserved != NULL)
++ g_preserved = gc_copy (g_preserved);
+ g_macros = gc_copy (g_macros);
+ g_ports = gc_copy (g_ports);
+ scm_hash_table_type = gc_copy (scm_hash_table_type);
+```
+
+And this flip step:
+
+``` diff
+@@ -464,6 +503,15 @@ gc_flip ()
+ long dist = g_news - g_cells;
+ g_free = g_free - dist;
+ g_symbols = g_symbols - dist;
++ if (g_preserved != NULL)
++ {
++ g_preserved = g_preserved - dist;
++ /* Update the preserved pointers */
++ struct scm *s;
++ for (s = g_preserved; s != cell_nil; s = s->cdr)
++ *(s->car->ptr) = s->car->car;
++ }
++
+ g_macros = g_macros - dist;
+ g_ports = g_ports - dist;
+ scm_hash_table_type = scm_hash_table_type - dist;
+```
+
+As my roots are copied and flipped the values they point at are also copied and
+flipped. That way I only need to update the pointers using the new address of
+the pointed objects.
+
+And that's it.
+
+The only thing missing is to use the `gc_preserve` and `gc_release` properly.
+It's easy to misuse them and overrelease or overpreserve.
+
+I'm writing a macro expander using this and happens to be somewhat comfortable,
+at least way more comfortable than the goto and `push_cc` style. I was having a
+very hard time with that one.
+
+We'll see how it goes.
+
+Cheers!