diff options
author | Ekaitz Zarraga <ekaitz@elenq.tech> | 2025-09-17 00:39:57 +0200 |
---|---|---|
committer | Ekaitz Zarraga <ekaitz@elenq.tech> | 2025-09-17 00:39:57 +0200 |
commit | 759b8e466bb455298f9e8903d8cd25e649d4e0d2 (patch) | |
tree | 7ec68b2c36611d6b05913ce0f6d68dd0b70946aa /content | |
parent | 028382476218e83676197cfbc6068d6fe5f2ef1c (diff) |
Diffstat (limited to 'content')
-rw-r--r-- | content/fasterMes/02_garbage_collector.md | 274 |
1 files changed, 274 insertions, 0 deletions
diff --git a/content/fasterMes/02_garbage_collector.md b/content/fasterMes/02_garbage_collector.md new file mode 100644 index 0000000..8b7ee26 --- /dev/null +++ b/content/fasterMes/02_garbage_collector.md @@ -0,0 +1,274 @@ +Title: GNU Mes and garbage collection +Date: 2025-09-17 +Category: +Tags: GNU Mes interpreter speedup +Slug: fasterMes2 +Lang: en +Summary: How does the GNU Mes garbage collector work and how to improve it. + +In our previous blogpost in the series we described how GNU Mes' scheme +interpreter handles evaluation and how fast it was. There were some details +that weren't really obvious, though: why does it really do all that magic to +emulate function calls? Is it just to make it look like scheme? + +In this blog post we'll explain how does the garbage collector work in GNU Mes, +try to take a new look to the `eval_apply` function from the perspective of the +garbage collector, and propose an improvement that would enable a different +style of programming. + +### GNU Mes' garbage collection + +GNU Mes' scheme interpreter has a very simple approach to memory management. It +allocates a big chunk of memory when the interpreter is run and that is used +for every scheme value. All scheme values are represented with the same +`struct`, called `struct scm`, and every `make_`whatever function returns a +pointer to an `struct scm` bump-allocated in that big chunk of memory which +we'll call *arena*. + +The bump-allocation is very easy to do, and doesn't require much computation. +It's just in increase in the pointer to the first free element in the *arena*. + +If we just had this, our scheme interpreter would just eat memory and never +release or reuse the space we allocated for the scheme values we are not using +anymore. This is a possible approach, really, but GNU Mes does provide garbage +collection so the memory is reused and we don't run out of our precious +resource. + +Mes uses a stop-the-world copying garbage collector. Similar to the one +described in the "Structure and Interpretation of Computer Programs" (aka SICP) +book or the one described in [This Wikipedia entry about Cheney's garbage +collection algorithm based on a paper from 1970][cheney-wiki]. It starts in few +scheme values and copies them from the current arena to some other piece of +memory, one after the other. It also check their internal values like elements +in a list, a vector or so on, and copies those too, adjusting the pointers that +refer to them. When a scheme value is copied, its memory representation is +tagged with a *broken heart* (this appears in SICP), something that identifies +that value as already processed, and the pointer to the new location replaces +its value. Now, it can check if an object that is being copied points to a +value that has been already copied, and in that case, it only needs to update +the pointer of its reference. + +[cheney-wiki]: https://en.wikipedia.org/wiki/Cheney%27s_algorithm + +Once all the objects and its references have been processed, what we have is a +new memory space that contains only the objects in use, one after the other, +with no empty spaces in between. This is why this process is called also a +compacting garbage collector. + +In Mes, once the new piece of memory is set up, it is moved back to the +original position, element by element (remember all elements are the same +size). In a process that in Mes is called `flip`. The goal of the flip is to +keep the memory footprint low, only requiring a maximum memory of the twice of +the current memory usage (and oftentimes way less) which, potentially, is +smaller than having two fixed semi-spaces as this kind of garbage collectors +often have. + +But now two things are left to explain: when is this whole process run and how +it is kickstarted. + +The garbage collector is run during the interpreter execution. When the +interpreter's programmer decides to (by adding a call to the garbage collection +function, `gc_check`). This function will check if the memory usage is large +enough so the garbage collecting is worth the effort and, in affirmative case, +then run the garbage collector process. + +The first thing the garbage collection process does is copy a group of objects +that will kickstart the copying of the rest of the *live* objects, meaning +those objects that the running scheme program still has access to. Those +objects are called *roots*, and they are fundamental to understand how this +whole thing can be improved. The roots, more than being a great hip hop band, +are easy to deduce what they can be in the interpreter: the current +*environment*, the *stack*, the known symbols, and such. + +Currently in Mes the roots are fixed, meaning they are hardcoded in the code, +like so: + +``` clike + struct scm *new_cell_nil = g_free; + struct scm *s; + for (s = cell_nil; s < g_symbol_max; s = s + M2_CELL_SIZE) + gc_copy (s); + + g_symbols = gc_copy (g_symbols); + g_macros = gc_copy (g_macros); + g_ports = gc_copy (g_ports); + scm_hash_table_type = gc_copy (scm_hash_table_type); + scm_variable_type = gc_copy (scm_variable_type); + M0 = gc_copy (M0); + M1 = gc_copy (M1); + + long i; + for (i = g_stack; i < STACK_SIZE; i = i + 1) + copy_stack (i, gc_copy (g_stack_array[i])); + + gc_loop (new_cell_nil); +``` + +### The issues + +This kind of copying garbage collectors have a really two very obvious +problems that affect the way we write our C code. + +Consider this code: + +``` clike +struct scm *val = make_number (10); // Makes a scheme value = 10 +gc_check (); // May garbage collect +val->value; // What do we have there? +``` + +If we make a number like in the example and then garbage collect, the number +will be allocated in the *arena*, but during garbage collection it won't have +anything pointing to it so it would be cleaned from the memory. Interestingly +enough, it won't probably produce a segfault or any kind of error like that, +because it will try to access to the *arena*, and that's valid memory, but +it would still explode in very dirty ways afterwards, maybe after some other +garbage collection loops. Nasty stuff. + +Also, even if you keep the variable from being collected, it would be moved by +the garbage collector, which will give you a very low chance of the value +falling in the same spot, so your pointer would be outdated, and you wouldn't +have any way to know where the variable is now. + +Do you want an easy solution? + +First, register that `val` as another root in the garbage collector, so it's +not lost in each loop. The current garbage collector in Mes needs to hardcode +the roots, so if this example were just a local variable for a function, you'd +need to make it global for the garbage collector to find it. Of course, you +shouldn't do this for every single variable in your code, but if you make sure +you only need a couple of temporaries you could call them `R1` and `R2` and use +those all the time for everything[^oh-yes]. + +[^oh-yes]: Oh! Yes! I know what you are thinking. Trust me. + +That would leave you with the moving problem. After a garbage collector run, +your variables `R1` and `R2` could potentially point to a memory location where +they are not anymore, because of the copying. + +At this point if you were really inspired, you could implement a stack of those +lets call them "*registers*" (`R1`, `R2`...), register it as a root, and there +you would have emulated a function call system that respects your garbage +collector. Oh! *But isn't that what we described in the previous blog post?* + +Implementing a stack that knows where to push and where to pop (those global +registers) would make every single piece of your code keep your pointers to the +variables coherent in the C world. You would just need to refer to `R1` or +whatever, and it would always point to the proper memory location after the pop +is run. + +But of course this is a little bit of a Henry Ford situation here: *"everybody +will have a car of the color they like as long as it is black"*. So you can +have all the variables you want as long as they are `R0`, `R1`... + + +### One solution + +Consider this code: + +``` clike +struct scm * +macro_expand_program (struct scm *program) +{ + struct scm *cursor = program; + gc_preserve (&program); + for (cursor = program; cursor != cell_nil; cursor = cursor->cdr) + { + gc_preserve (&cursor); + cursor->car = macro_expand_expr (cursor->car); // May GC + gc_release (1); // cursor + } + gc_release (1); // program + return program; +} +``` + +In this piece of code we can see how `macro_expand_expr` could try to garbage +collect and make us lose references to our local variables. By calling +`gc_preserve` we register them as roots, and we can release them later, when we +don't need them anymore. We send the address to the pointer to the garbage +collector so it's able to fix the pointers when the copying is done. + +This looks boring and boilerplatey, but we would only need it for those +variables that would cross the function barrier to functions that we know that +may garbage collect. The rest of the cases we can just use an ordinary C +variable without preservation. + +The improvements I see here is we can call the variables whatever we want, use +as many as we want and actually use C function calls without all the `goto` and +`push_cc` ceremony we described in the previous article, which is also +extremely confusing to me. + +Also, we could get rid of all the other global roots pushing them first and +never releasing them. + +#### How is this thing implemented + +I have to admit it's not the greatest solution in the world but it works well +enough if you are careful. + +First, I added a new scheme value type for roots, that can store a reference to +an scheme value and the address of the pointer to update. It just covers the +simplest case. + +The `gc_preserve` function adds a root to a global list of preserved roots. The +`gc_release` function removes the top `n` elements from the list of preserved +roots. + +With that ready, the garbage collector only needs to handle that global list as +any other hardcoded root we mentioned before, but after the `gc_flip` the +pointers stored in the roots themselves are updated. This means the pointers in +our C code would be updated when we exit the garbage collection operation, +keeping our C pointers consistent with the memory. + +So, that leaves us with this copy step: + +``` diff +@@ -647,6 +695,8 @@ gc_ () + gc_copy (s); + + g_symbols = gc_copy (g_symbols); ++ if (g_preserved != NULL) ++ g_preserved = gc_copy (g_preserved); + g_macros = gc_copy (g_macros); + g_ports = gc_copy (g_ports); + scm_hash_table_type = gc_copy (scm_hash_table_type); +``` + +And this flip step: + +``` diff +@@ -464,6 +503,15 @@ gc_flip () + long dist = g_news - g_cells; + g_free = g_free - dist; + g_symbols = g_symbols - dist; ++ if (g_preserved != NULL) ++ { ++ g_preserved = g_preserved - dist; ++ /* Update the preserved pointers */ ++ struct scm *s; ++ for (s = g_preserved; s != cell_nil; s = s->cdr) ++ *(s->car->ptr) = s->car->car; ++ } ++ + g_macros = g_macros - dist; + g_ports = g_ports - dist; + scm_hash_table_type = scm_hash_table_type - dist; +``` + +As my roots are copied and flipped the values they point at are also copied and +flipped. That way I only need to update the pointers using the new address of +the pointed objects. + +And that's it. + +The only thing missing is to use the `gc_preserve` and `gc_release` properly. +It's easy to misuse them and overrelease or overpreserve. + +I'm writing a macro expander using this and happens to be somewhat comfortable, +at least way more comfortable than the goto and `push_cc` style. I was having a +very hard time with that one. + +We'll see how it goes. + +Cheers! |