From 1e9a20f654abbace9e687457d50c338db6266f90 Mon Sep 17 00:00:00 2001
From: Ekaitz Zarraga
Date: Fri, 28 May 2021 21:36:33 +0200
Subject: Update recent posts

---
 content/2020.md       | 240 ++++++++++++++++++++++++++++++++++++++++
 content/lightening.md | 300 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 540 insertions(+)
 create mode 100644 content/2020.md
 create mode 100644 content/lightening.md

diff --git a/content/2020.md b/content/2020.md
new file mode 100644
index 0000000..9d354c3
--- /dev/null
+++ b/content/2020.md
@@ -0,0 +1,240 @@
+Title: Review of 2020
+Date: 2021-05-16
+Category:
+Tags:
+Slug: 2020
+Lang: en
+Summary:
+    The review of our year 2020 at ElenQ Technology.
+
+It's been a while since the previous post here, and it's not because I don't
+have anything to talk about. I've been working on many things since the
+previous one.
+
+I wanted to write specifically about something I'm doing these days, but that's
+difficult to contextualize if there's a full year gap in the middle. So I
+decided to talk about 2020 and make a short review of what we did so we
+can look forward and see what we can build from this.
+
+#### 2020 at ElenQ Technology
+
+2020 has been harsh for everyone, including ElenQ Technology. We started the
+year with a lot of energy and we were pretty busy with courses here and there.
+But then the pandemic came and all the in-person training stopped, so we lost
+our main income source, which is also one of the kinds of work I personally
+enjoy the most.
+
+So, after finishing our course on *Modern C++* in July (we'll talk about that
+in a future post), right after we were freed from the lockdown here, everything
+stopped. No more in-person courses, no more clients, nothing.
+
+We knew that the pandemic was affecting the economy so we were well aware that
+there were few chances to get clients in the rest of the year. Thankfully, we
+had some work to do: [ElenQ Publishing](https://publishing.elenq.tech/en/).
+
+We spent the summer and part of the autumn preparing the books, the printing
+and doing the paperwork, as well as the tools we needed for the website and
+future books. By November 13 we already had every book shipped and the
+website was almost ready. At the beginning of December, the website was
+finished and published.
+
+It was more work than we expected but now we have a complete set of tools for
+future publications that can cover any point of the process with
+almost no human interaction. We automated almost everything, and the things
+we didn't automate are simple once you know how to do them.
+
+Of course, as engineers, we only consider automating things that we are going
+to repeat, so you can think about all this work as a plan to keep publishing new
+material in the future.
+
+It's really interesting to mention that our whole process is reproducible as we
+are using [Guix](https://guix.gnu.org) as a tool, so no matter what happens we
+could still go back in time and remake the books exactly as they were when we
+published them.
+
+As you see, at a company level, most of our work in 2020 was focused on
+teaching and making the books (another form of teaching), because it's
+something I personally enjoy a lot and I'd say it's more fulfilling than
+anything else I've done. But it was sadly affected by the pandemic, so we need
+to reorganize our strategy a little bit.
+
+#### Personal level
+
+Of course, I spend time on other things too. A great part of my job is to
+randomly research anything I find interesting, so I can keep my mind fresh for
+the possible projects that may come. This gives me tools and ideas, and also
+lets me learn from other people.
+
+During the year I spent some time contributing to Guix, for [reasons I already
+discussed here](https://ekaitz.elenq.tech/donations-guix-01.html).
+The most notable contributions were the addition of a really interesting
+package that was missing, Meshlab, and the fix of a package that had been
+failing to compile for months, FreeCAD.
+
+Being locked at home, I also had the chance to go back to electronics, which
+is a huge part of what I studied at university, but I never had the chance to
+work on it at a professional level. I even designed some PCBs, produced
+and soldered them with the highest level of quality possible. It was a great
+experience.
+
+On the other hand, I also needed some time to relax and try to recover from
+some longstanding health issues I've been dealing with, which also got worse
+because of the pandemic.
+
+After some time practicing yoga and taking care of my body, I feel much better
+in general. Even if my issues are still there, at least they are not aggravated
+by the bad posture and the physical stress that working at a computer can
+provoke. So, if you are open to a suggestion: stretch, do some strength
+exercises and try to keep your body in shape, especially if you work in an
+office or do any other kind of sedentary work that involves repetitive
+movements like using a mouse or typing on a keyboard.
+
+##### December
+
+As I mentioned, our work with ElenQ Publishing was done at the beginning of
+December. We approached that as a chance to stop and think.
+
+During the last three years I had few chances to focus on a specific subject
+for a long time. I had to quickly jump from one thing to another in order to
+be able to reach all the projects we had.
+
+I was frustrated because of that. I'm easily distracted and it's hard for me to
+pay attention to the same thing for a while, but I really like to understand
+things **deeply** (those who know me or attended my courses know it),
+and my everyday life, full of stress and various stimuli, was making me unable
+to concentrate.
+
+I had moments of attention and clarity of mind during the pandemic (and due
+to the pandemic) that made me feel at peace, so I wanted to live that kind of
+frustration-free life on purpose, not only when things happen to come like that.
+
+So that's what I did. I just needed something to investigate, something I had
+been interested in since the very beginning of my career: programming languages.
+
+I collected some books on compiler implementation and started reading them,
+then I realized I was interested in operating system implementation so I read
+about that too. Both things need to run somewhere so I also spent some time
+digging into various architectures and their instruction sets, and so on.
+
+I started developing a simple [Scheme
+implementation](https://github.com/ekaitz-zarraga/blas) (only started, not
+finished or anything) that served as an excuse to have a goal in mind in the
+process. Also, I decided to [live stream](https://twitch.tv/ekaitzza) my
+research process so I could share my findings with others and let them provide
+me some thoughts and help me go slowly, paying attention to the interesting
+details.
+
+And let me tell you, compiler implementation is often a difficult subject for
+me, especially the theory, because my background is lacking some of the concepts
+that Computer Science students have and I have to study them from scratch[^note].
+
+Having the chance to tackle a difficult long term task helped me forget and not
+worry about the *bad* year we had as a company, in which we only had actual
+paid work during the first half of the year. I was just grateful to be able to
+sustain myself long enough to have the chance to breathe and spend more time
+with myself, doing something I don't always have the chance to do, regardless
+of everything we, individually and collectively, were going through.
+
+I hope you had some moments of relief too.
+
+
+##### What I learned
+
+I obviously learned many things during the year (books have been read!)
+but I don't want to focus on that.
+
+Sometimes the most important thing is not the goal, but the process. You learn
+more from the journey than from the arrival, right?
+
+I like to think that I learned to care more about myself in 2020. I'm still
+sick, and my recovery got stuck as I was literally stuck at home, but that's
+just a temporary issue, because I'm taking care of myself. Maybe not every day,
+but almost every day I take care of myself. That's what counts.
+
+2020 taught me how to make a publishing house. That's an important piece of
+knowledge, but I consider it more valuable to have reclaimed my time and my
+attention. That taught me an important lesson by itself and it also helped me
+learn about myself.
+
+I learned that I was feeling alone in my interests. I had no one to share my
+interests with. I know it is surprising to you, but basically nobody is
+interested in how garbage collectors, processors or anything like that work.
+Most people don't even care about what they are. Crazy, huh?
+
+Sharing my findings, my research and my errors with other people makes me feel
+better. I feel someone is there, on the other side. It helps me avoid
+the frustration and the lack of motivation I have been feeling during the last
+years.
+
+The streaming helped with that[^english]: I had people reacting instantly, some
+sent me papers to read and ideas, and others proposed interesting things for me
+to do. That feels good. It helped me remember that I'm not alone.
+
+If 2020 has taught me anything, it is that I, or we, need others to feel better.
+We need to take care of people[^people], because life is much better with them.
+
+On top of many things, being conscious that I was researching **deep** opened
+the door to applying that depth in my everyday life more often.
+Not that I
+wasn't doing that before (those who know me are aware that I'm kind of an
+intense guy), but now I'm more conscious about it and I can selectively choose
+to go deeper into my thoughts and feelings.
+
+This time for myself reminded me how intense I was back then and how I enjoyed
+being a dedicated person.
+
+
+
+#### So what
+
+As I said, at a company level I decided to use that time to arrange a new
+strategy. I wouldn't say I changed it that much, because I was at peace when it
+was developed, almost 4 years ago, but it let me rethink it taking into account
+my professional and personal experience in recent years.
+
+Collaborating on free software projects has shown me that I feel comfortable
+with larger codebases and more complex concepts that were too much for me in
+the past. Now I feel more confident about that.
+
+Of course, this came with practice and time, but also after years of stressful
+work and random research that is not really fulfilling. I don't mean that you
+need to spend time on that to be able to tackle bigger projects. I mean that my
+past is part of what I am now, and even the bad times can help forge a better
+future.
+
+I decided to keep researching the way I was, because it's something that makes
+me feel good, and to work more slowly, paying attention to the details as I
+like to do.
+
+I'll try to share more about my work, at a technical and a personal level. I'll
+keep streaming for some time, and I'll try to use this blog more, as I did in
+the past.
+
+So, as I was saying, all this year helped me remember the important things,
+and forget a little bit about the urgent ones.
+
+> "Instead of swimming fast trying to reach as far as I could, pumping my
+> blood, splashing water around and having to take a short breath between each
+> arm stroke, now I want to dive. I'm far enough from the coast, already.
+>
+> I want to stay at the surface until I'm ready, having some rest and breathing
+> as much as I want, and then, I'll dive. I'll discover the colors of the
+> coral reef, the sea creatures and even the deepest darkness if I feel like
+> it. When I'm done or I'm tired, I'll go back to the surface, take a deep
+> breath and have some rest, feeling the sun on my face, until the next
+> immersion.
+>
+> I'm not going anywhere. I'm not in a hurry anymore."
+
+
+[^note]: But hey, I'm much more comfortable with low level stuff like ISAs and
+    all that. My degree is not useless after all.
+
+[^blog]: In this blog, by contrast, I can't really know how many people read
+    or interact with what I write. So I encourage you to contact me and share
+    ideas!
+
+[^english]: Making the videos also helped me to feel more confident about my
+    English (people understand what I say!) and that is helping me tackle larger
+    projects that involve people from different places.
+
+[^people]: Even more now that we have some heavy shit going on out there.
diff --git a/content/lightening.md b/content/lightening.md
new file mode 100644
index 0000000..d754082
--- /dev/null
+++ b/content/lightening.md
@@ -0,0 +1,300 @@
+Title: RISC-V Adventures: Migrating Lightening
+Date: 2021-05-19
+Category:
+Tags:
+Slug: lightening
+Lang: en
+Summary:
+    The migration of Lightening, the code generation library used in
+    Guile Scheme, and other adventures in the low level world of RISC-V.
+
+In [the previous post](https://ekaitz.elenq.tech/2020.html) I summarized the last
+year because I wanted to talk about what I'm doing **now**. At this very moment
+I just realized that almost half of 2021 is already gone, so following
+the breadcrumbs until this day could be a difficult task. That's why I won't
+give you more context than this: RISC-V is a deep, *deep*, hole.
+
+I told you I was researching programming languages and that made me research
+a little bit about ISAs.
+That's how I started reading about RISC-V, and I
+realized learning about it was a great idea for many reasons: it's a new thing
+and as an R&D engineer I should keep up to date, and the book I chose is really
+good[^book] and gives a great description of the design decisions behind
+RISC-V.
+
+[^book]: It's available for free in some languages and it's 20 bucks in
+    English. Totally worth it:
+
+
+From that, and I don't really know how, I started taking part in the efforts of
+migrating Guix to RISC-V. One of the things I'm working on right now is the
+migration of the machine code generation library that Guile uses, called
+`lightening`, to RISC-V, and that's what I'm talking about today.
+
+
+### The lightening
+
+Lightening is a lightweight fork of [GNU
+Lightning](https://www.gnu.org/software/lightning/), a machine code generation
+library that can be used for many things that need to abstract from the target
+CPU, like JIT compilers.
+
+The design of GNU Lightning is easy to understand. It exposes a set of
+instructions that are inspired by RISC machines. You use those, the library
+maps them to actual machine instructions on the target CPU and returns a
+pointer to the generated function. Simple stuff.
+
+The code is not that easy to understand: it makes a pretty aggressive and
+clever use of C macros that I'm not that used to reading, so it is a little bit
+hard for me.
+
+I could try to explain the reasons behind the fork, but [the guy who did it,
+who is also the maintainer of Guile, explains it much better than I
+could](https://wingolog.org/archives/2019/05/24/lightening-run-time-code-generation).
+But at least I can summarize: lightening is simpler and it fits better what
+Guile needs for its JIT compiler.
+
+Boom! Lightened!
+
+
+### The process
+
+So Lightening is basically simpler but the idea is the same. But how do you
+migrate a library like that to another architecture?
+
+The idea is kind of simple, but we need to talk about the basics first.
+
+Lightening (and GNU Lightning too, but we are going to specifically talk about
+Lightening from here) emulates a fake RISC machine with its functions. It
+provides `movr`, `movi`, `addr` and so on. Basically, all those are C functions
+you call, but they actually look like assembly. Look at a random example
+taken from the [`tests/addr.c` file](https://gitlab.com/wingo/lightening/-/blob/main/tests/addr.c#L6):
+
+    ::clike
+    jit_begin(j, arena_base, arena_size);
+    size_t align = jit_enter_jit_abi(j, 0, 0, 0);
+    jit_load_args_2(j, jit_operand_gpr (JIT_OPERAND_ABI_WORD, JIT_R0),
+                    jit_operand_gpr (JIT_OPERAND_ABI_WORD, JIT_R1));
+
+    jit_addr(j, JIT_R0, JIT_R0, JIT_R1);
+    jit_leave_jit_abi(j, 0, 0, align);
+    jit_retr(j, JIT_R0);
+
+    size_t size = 0;
+    void* ret = jit_end(j, &size);
+
+    int (*f)(int, int) = ret;
+    ASSERT(f(42, 69) == 111);
+
+
+Basically you can see we get the `f` function from the calls to `jit_WHATEVER`,
+which include the preparation of the arguments, `jit_load_args_2`,
+and the actual body of the function: `jit_addr`. The word `addr` comes from
+*add* and *r*egisters, so you can understand what it does: it adds the contents
+of the registers and stores the result in another register.
+
+The registers have understandable names like `JIT_R0` and `JIT_R1`, which are
+basically the register number (the `R` comes from "register").
+
+So, if you check the line of the `jit_addr` you can understand it's adding the
+contents of register `0` and register `1` and storing the result in
+register `0` (the first argument is the destination).
+
+That's pretty similar to RISC-V's `add` instruction, isn't it?
+
+Well, it's basically the same thing. The only problem is that we have to emit
+the machine code associated with the `add`, not just write it down as text,
+and we also need to declare which registers `JIT_R0` and `JIT_R1` are in
+our actual machine.
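
To give an idea of what "emitting the machine code" means, here is a minimal standalone sketch of how a RISC-V `add` could be encoded by hand. It is not Lightening's actual code: the `riscv_r_type` and `riscv_add` helpers and the `X_A0`/`X_A1` names are hypothetical, and mapping `JIT_R0` and `JIT_R1` to `a0` and `a1` is only an assumption made for the illustration. The field layout is the R-type format of the RISC-V base ISA.

```c
#include <stdint.h>

/* Hypothetical names: a0 and a1 are registers x10 and x11 in the RISC-V ABI. */
enum { X_A0 = 10, X_A1 = 11 };

/* Pack the fields of an R-type instruction into a 32-bit word. */
uint32_t riscv_r_type(uint32_t funct7, uint32_t rs2, uint32_t rs1,
                      uint32_t funct3, uint32_t rd, uint32_t opcode)
{
  return (funct7 << 25) | (rs2 << 20) | (rs1 << 15)
       | (funct3 << 12) | (rd << 7) | opcode;
}

/* add rd, rs1, rs2: opcode 0x33, funct3 0, funct7 0. */
uint32_t riscv_add(uint32_t rd, uint32_t rs1, uint32_t rs2)
{
  return riscv_r_type(0x00, rs2, rs1, 0x0, rd, 0x33);
}
```

Under these assumptions, `riscv_add(X_A0, X_A0, X_A1)` produces `0x00b50533`, which is the word an assembler emits for `add a0,a0,a1`. Something equivalent to `jit_addr(j, JIT_R0, JIT_R0, JIT_R1)` would then boil down to writing that word into the code arena.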
+
+Thankfully, the library already has all the machinery to do all that. There
+are functions that emit the code for us, and we can also make some `define`s to
+set `JIT_R0` to the RISC-V `a0` register, and so on.
+
+We just need to make new files for RISC-V, define the mappings and add a little
+bit of glue around.
+
+### The problems
+
+All that sounds simple and easy (on purpose), but it's not *that* easy.
+
+Some instructions that Lightening provides don't have a simple mapping to
+RISC-V and we need to play around with them.
+
+There's an interesting example: `movi` (move immediate to register).
+
+Loading an immediate into a register is something that sounds extremely simple,
+but it's more complex than it looks. The RISC-V assembly has a
+pseudoinstruction for that, called `li` (load immediate), that can be literally
+mapped to `movi`. The main problem is that pseudoinstructions *don't really
+exist*.
+
+You all know there are CISC and RISC machines. CISC machines were a way to make
+simpler compilers, pushing that complexity to the hardware. RISC machines are
+the other way around.
+
+RISC hardware tends to be simple and to have few instructions; the
+compiler is the one that has to do the dirty work, trying to make the
+programmer's life better.
+
+Pseudoinstructions are a case of that. The programmer only wants to load a
+constant into a register but real life can be very depressing. When you want to
+load an immediate you don't want to think about its size: if it fits in a
+register you are fine, aren't you?
+
+Pseudoinstructions are expanded to actual instructions by the assembler, so you
+don't need to worry about those details. In fact, RISC-V doesn't really have
+move instructions: they are all pseudoinstructions that are expanded to
+something like:
+
+    addi destination, source, 0
+
+Which means "add 0 to source and store the result in destination".
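
As a rough illustration of that expansion, the sketch below encodes `mv` as an `addi` with a zero immediate. The `riscv_i_type`, `riscv_addi` and `riscv_mv` helpers are hypothetical names and not Lightening's API; the bit layout is the I-type format of the RISC-V base ISA.

```c
#include <stdint.h>

/* Pack an I-type instruction: the 12-bit signed immediate goes in bits 31..20. */
uint32_t riscv_i_type(int32_t imm, uint32_t rs1, uint32_t funct3,
                      uint32_t rd, uint32_t opcode)
{
  return (((uint32_t) imm & 0xfffu) << 20) | (rs1 << 15)
       | (funct3 << 12) | (rd << 7) | opcode;
}

/* addi rd, rs1, imm: opcode 0x13, funct3 0. */
uint32_t riscv_addi(uint32_t rd, uint32_t rs1, int32_t imm)
{
  return riscv_i_type(imm, rs1, 0x0, rd, 0x13);
}

/* mv rd, rs is not a real instruction: it is emitted as addi rd, rs, 0. */
uint32_t riscv_mv(uint32_t rd, uint32_t rs)
{
  return riscv_addi(rd, rs, 0);
}
```

With `a0` being register x10 and `sp` being x2, `riscv_mv(10, 10)` gives `0x00050513` and `riscv_addi(2, 2, -8)` (that is, `addi sp,sp,-8`) gives `0xff810113`. Stored in little-endian order, that second word is the byte sequence `\023\001\201\377` that GDB prints at the start of the arena in the debugging session later in this post.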
+
+The `li` pseudoinstruction is a very interesting case, because the expansion is
+kind of complex, it's not just a conversion.
+
+In RISC-V all the instructions are 32 bits long (or 16 if you take into account
+the compressed instruction extension) and the registers are 32 bits wide in RV32
+and 64 bits wide in RV64. You see the problem, right? No 32-bit instruction is
+able to load a full register at once, because that would mean that all the bits
+available for the instruction (or more!) need to be used to store the
+immediate.
+
+Depending on the size of the immediate you want to load, the `li`
+pseudoinstruction can be expanded to just one instruction (`addi`), two (`lui`
+and `addi`) or, if you are in RV64, to a series of eight instructions (`lui`,
+`addi`, `slli`, `addi`, `slli`, `addi`, `slli`, `addi`). There are also sign
+extensions in the middle that make the whole process even funnier.
+
+Of course, as we are generating the machine code, we can't rely on an assembler
+to do the dirty work for us: we need to expand everything ourselves.
+
+So, something that looked extremely simple, the implementation of an obvious
+instruction, can get really messy, so we need a reasonable way to check if we
+did the expansions correctly.
+
+And we haven't talked yet about those instructions that don't have a clear
+mapping to the machine!
+
+Don't worry: we won't. I just wanted to point out the need for proper tools for
+this task.
+
+### The debugging
+
+The debugging process is not as complex as I thought it was going to be, but
+my setup is a little bit of a mess, basically because I'm on Guix, which
+doesn't have proper support for RISC-V so I can't really test on my machine
+(if there's a way please let me know!).
+
+I'm using an external Debian Sid machine (see acknowledgements below) for it.
+
+I basically followed the [Debian
+tutorial](https://wiki.debian.org/RISC-V#Cross_compilation) for cross
+compilation environments and Qemu, and everything is perfectly set for the
+task.
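
Not everything needs the emulator, though. The arithmetic behind the two-instruction case of the `li` expansion mentioned earlier (`lui` plus `addi`) can be checked on any machine. The following is only a sketch with hypothetical names (`li32_parts`, `li32_split`, `li32_join`), not code taken from Lightening, and it models a 32-bit register only: on RV64 `lui` sign-extends its result, which is part of why the longer sequence exists.

```c
#include <stdint.h>

/* Result of splitting a 32-bit immediate for a lui + addi pair. */
struct li32_parts {
  uint32_t hi20;  /* 20-bit value for lui (it fills bits 31..12) */
  int32_t  lo12;  /* signed 12-bit value for addi, in [-2048, 2047] */
};

/* addi sign-extends its immediate, so when bit 11 of imm is set the
   upper part has to be incremented to compensate. Adding 0x800 before
   shifting does exactly that. */
struct li32_parts li32_split(uint32_t imm)
{
  struct li32_parts p;
  p.hi20 = ((imm + 0x800u) >> 12) & 0xfffffu;
  p.lo12 = (int32_t) (imm & 0xfffu);
  if (p.lo12 >= 0x800)
    p.lo12 -= 0x1000;   /* reinterpret the low bits as a signed 12-bit number */
  return p;
}

/* What lui followed by addi would leave in a 32-bit register. */
uint32_t li32_join(struct li32_parts p)
{
  return (p.hi20 << 12) + (uint32_t) p.lo12;
}
```

For example, `0x12345fff` splits into `hi20 = 0x12346` and `lo12 = -1`: the upper part is one more than the naive `0x12345` because the `addi` subtracts one afterwards. Looping over a few values and comparing `li32_join(li32_split(x))` with `x` is a cheap first check before looking at the real disassembly.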
+
+Next: how to debug the code?
+
+I'm using Qemu as a target for GDB, so I can run a binary on Qemu like this:
+
+    ::sh
+    qemu-riscv64-static -g 1234 test-riscv-movi
+
+Now I can attach GDB to that port and disassemble the `*f` function that was
+returned from Lightening to see if the expansion is correct:
+
+
+    $ gdb-multiarch
+    GNU gdb (Debian 10.1-2) 10.1.90.20210103-git
+    ...
+    For help, type "help".
+    Type "apropos word" to search for commands related to "word".
+    (gdb) file lightening/tests/test-riscv-movi
+    Reading symbols from lightening/tests/test-riscv-movi...
+    (gdb) target remote :1234
+    Remote debugging using :1234
+    0x0000000000010538 in _start ()
+    (gdb) break movi.c:15
+    Breakpoint 1 at 0x1d956: file movi.c, line 15.
+    (gdb) continue
+    Continuing.
+
+    Breakpoint 1, run_test (j=0x82e90, arena_base=0x4000801000
+    "\023\001\201\377#0\021", arena_size=4096) at movi.c:15
+    15        ASSERT(f() == 0xa500a500);
+    (gdb) disassemble *f,+100
+    Dump of assembler code from 0x4000801000 to 0x4000801064:
+       0x0000004000801000:  addi    sp,sp,-8
+       0x0000004000801004:  sd      ra,0(sp)
+       0x0000004000801008:  lui     a0,0x0
+       0x000000400080100c:  slli    a0,a0,0x20
+       0x0000004000801010:  srli    a0,a0,0x21
+       0x0000004000801014:  mv      a0,a0
+       0x0000004000801018:  slli    a0,a0,0xb
+       0x000000400080101c:  addi    a0,a0,660 # 0x294
+       0x0000004000801020:  slli    a0,a0,0xb
+       0x0000004000801024:  addi    a0,a0,20
+       0x0000004000801028:  slli    a0,a0,0xb
+       0x000000400080102c:  addi    a0,a0,1280
+       0x0000004000801030:  ld      ra,0(sp)
+       0x0000004000801034:  addi    sp,sp,8
+       0x0000004000801038:  mv      a0,a0
+       0x000000400080103c:  ret
+       0x0000004000801040:  unimp
+    ...
+
+Of course, I can debug the library code normally, but the generated code has to
+be checked like this, because there's no debug symbol associated with it and
+GDB is lost in there.
+
+Important stuff. Take notes.
+
+---
+
+
+This free software work is also work. It needs funding!
+Remember you can hire ElenQ +Technology to help you with your research, development or training.
+If you want to encourage my free software work you can support me on Buy Me a Coffee or on Liberapay. +
+
+---
+
+### The acknowledgements
+
+It's weird to have acknowledgments in a random blog post like this one, but I
+have to thank my friend [Fanta](https://56k.es/) for preparing a Debian
+machine I can use for all this.
+
+Also I'd like to thank Andy Wingo for the disassembly trick you just read.
+Yeah, there was no chance I would have discovered that by myself!
+
+### The code
+
+The whole process can be followed on the project's GitLab, where I opened a
+Merge Request. Feel free to comment and propose changes.
+
+[Here's the link](https://gitlab.com/wingo/lightening/-/merge_requests/14/commits).
+
+
+### The future
+
+There's still plenty of work to do. I only implemented the basics of the ALU and
+some configuration of the RISC-V context, like the registers and all that, but
+I'd say the project is heading in the right direction.
+
+I don't know if I'm going to be able to spend as much time as I want on it
+but I'm surely going to keep adding new instructions and eventually try to wrap
+my head around how jumps are implemented.
+
+It's going to be a lot of fun, that's for sure.
--
cgit v1.2.3