author Ekaitz Zarraga <ekaitz@elenq.tech> 2021-06-28 20:35:11 +0200
committer Ekaitz Zarraga <ekaitz@elenq.tech> 2021-06-28 20:35:25 +0200
commit 93e2411b543392875a3ab18947ff2bf48178bb20
tree bdf8ac8956a2cbc9f7bf0d9e2f663947cb84faa1
parent be5f6884f19fa966b37b4fc89fa16642931d8ee3
Corrections in machine-code-generation.md
-rw-r--r-- content/machine-code-generation.md 36
1 files changed, 18 insertions, 18 deletions
diff --git a/content/machine-code-generation.md b/content/machine-code-generation.md
index ebf2fe2..470937d 100644
--- a/content/machine-code-generation.md
+++ b/content/machine-code-generation.md
@@ -97,7 +97,7 @@ the instruction. This one is from the `I` format, because it includes an
- First the *opcode*, `addi` for you, has a binary counterpart: `0010011`. 7
bits for this instruction format.
-- Then the destination register, `a0`, has an binary representation: `01010`.
+- Then the destination register, `a0`, has a binary representation: `01010`.
There are 32 registers in RISC-V so each of them are represented by a 5 bit
value.
- There's some extra space for an opcode-like field called `funct3`: `000`
@@ -180,7 +180,7 @@ what happens:
}
In that example we build an array of two values. The first one corresponds to
-the instruction we encoded by had before and the second corresponds to `jalr
+the instruction we encoded by hand before and the second corresponds to `jalr
zero, ra, 0`, the return instruction, which you can encode yourself.
After that we convert the address of the array to a function that returns and
@@ -234,12 +234,12 @@ The C example where we executed an array is correct, it runs and all that, but
the reality is that memory has different kinds of permissions for each part of
it.
-Code in memory is normally read only and executable, and data can be read-only
+Code in memory is normally read-only and executable, and data can be read-only
or not, depending on the goal it has (constant or variable).
If you think about the example above, once the array is set, we can overwrite
it later, or even write it from the instructions we inserted on it. This could
-led to security issues or unexpected results. That's why code is normally read
+lead to security issues or unexpected results. That's why code is normally read
only and any attempt to write it will raise an exception to the kernel.
There are several ways to identify a memory block as code: the RISC-V assembly
@@ -319,7 +319,7 @@ previously.
### Lessons learned {#problems}
There are many problems that a machine code generation library like that can
-encounter, but they are not exclusive for those libraries. These kind of
+encounter, but they are not exclusive for those libraries. This kind of
problems can also appear in compilers, assemblers and many other things.
The lessons I learned come as problems I encountered during these days of
@@ -372,7 +372,7 @@ Consider this RV64 code:
addi t0, t0, 1 // x[t0] = x[t0] + 1
// What's the value of t0 here?
-The code has some comments in the right that I'm going to use through the whole
+The code has some comments on the right that I'm going to use through the whole
post, so get used to them. The `x` means register access (base registers are
called X registers in RISC-V), and `mem` is memory, `PC` (program counter) is
written in uppercase and not as if it was a register because it's not
@@ -389,7 +389,7 @@ to the question is `0xAAAAAAAAAAAAAAAB`.
But can you see the trick we are using?
The `jal` instruction is jumping over the constant so we can't execute it by
-accident (which will cause an `Illegal Instruction` error), and using the `ld`
+accident (which would cause an `Illegal Instruction` error), and using the `ld`
instruction we are able to load a big constant to a register. A constant which
**is mixed with the code**, as any immediate would be, but without being
associated with any instruction.
@@ -460,7 +460,7 @@ get the labels, associate them with their addresses and then assemble the whole
thing, but you are not always in this situation.
Lightening, for instance, generates the code as you call the API, so it doesn't
-know where does your jump point to until you call the API for the label later.
+know where your jump points to until you call the API for the label later.
Compilers may encounter this issue too, when they are using separate
compilation and linking steps. You must be able to compile one source file by
@@ -537,9 +537,9 @@ instructions when we know where do they need to point to.
###### Example: Lightening {#relocs-lightening}
-Same approach is followed in Lightening, and you can follow in your assembler,
-library or anything that has a similar problem. Let's consider some code using
-Lightening (obtained from `tests/beqr.c`, comments on my own):
+The same approach is followed in Lightening, and you can follow in your
+assembler, library or anything that has a similar problem. Let's consider some
+code using Lightening (obtained from `tests/beqr.c`, comments added by me):
::clike
// Make a function that loads two arguments
@@ -568,9 +568,9 @@ You see the `beqr` function doesn't receive the target address or offset as an
argument, but it returns a `jit_reloc_t`, which other functions like `reti`
don't return.
-That `jit_reloc_t` is what we are patching later with the `jit_patch_here` to
-tell where does it need to jump. The `jit_patch_here` function is going to
-correct the bits we set to zero because we didn't know the target at that
+That `jit_reloc_t` is what we are patching later with the `jit_patch_here`
+indicating where does it need to jump. The `jit_patch_here` function is going
+to correct the bits we set to zero because we didn't know the target at that
moment.
There are different kinds of relocations, as it happened in the previous
@@ -582,7 +582,7 @@ associated with it, so we can check and act accordingly.
#### Problem: Long jumps {#jumps}
As we saw, some jumps encode the target as an *immediate*. This has a couple of
-implications that we just described in previously:
+implications that we described previously:
- The jump target could be larger than the space we have for the immediate.
- Sometimes we can't know the target until we reach the position where the jump
@@ -637,7 +637,7 @@ targets, not only one.
###### Optimization: pointer relaxation {#relaxation}
-But using the largest possible jumps can led to inefficiencies because we use
+But using the largest possible jumps can lead to inefficiencies because we use
two instructions for jumps that can potentially fit in just one.
We can use something we saw before for that: relocations. More specifically,
@@ -684,7 +684,7 @@ Is optimized to this:
::clike
ld t0, offset_from_data(gp) // x[t0] = mem[ x[gp] + offset_from_data ]
-Of course, the offsets have to be calculated and all that, but it's not that
+Of course, the offsets have to be calculated and all that, but this not that
difficult.
@@ -866,7 +866,7 @@ encourage my free software work.
### Final thoughts {#final}
-I know these are just a few things, but they are enough to let you make you
+I know these are just a few things, but they are enough to let you make your
first program that involves machine code generation to certain level.
I'm not a computer scientist but a telecommunication engineer[^engineer], so I