Title: Call me maybe
Date: 2019-01-09
Category:
Tags:
Slug: call-me-maybe
Lang: en
Summary: Recursion, stacks and optimizations.

Do you remember what happens when you call a function in your program? What
happens when you make too many nested calls?[^1]

When you call a function, there is some bookkeeping going on in memory:
local variables, the program counter and all that must be stored somewhere
so that, when the function ends, execution can come back to the place where
it was called. Right?

The place where all that is stored is the stack. You already know all this.
When you call many nested functions, the stack keeps pushing more and more
data, with no chance to pop any of it, so it overflows.

This can happen at any time, but the risk is higher when you call functions
recursively, because by definition they call themselves many times. It can
happen in a non-recursive program too, but most devices can handle deep
nesting, so it's less likely there (although on small devices like
microcontrollers you have to take care of this as well).

This doesn't mean recursive functions will always end in a stack overflow.
That only happens when the nesting depth is bigger than the stack can hold.

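To see that limit in practice, here's a small sketch in Python (in CPython
the stack is guarded by a recursion limit, so the overflow shows up as a
`RecursionError` instead of crashing the process):

```python
import sys

def depth(n):
    """Recurse n times; every call pushes a new frame onto the stack."""
    if n == 0:
        return 0
    return 1 + depth(n - 1)

print(depth(100))  # 100: shallow nesting is fine

try:
    # Deeper than the interpreter allows: the stack "overflows".
    depth(sys.getrecursionlimit() + 100)
except RecursionError as exc:
    print("overflowed:", exc)
```
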
> You are so stupid the recursive function that calculates your stupidity
> causes a stack overflow.
> — Heard in a computer science class
But this is not the whole story. There are some optimizations that can
change this behaviour and let you write stack-safe recursion. Let's talk
about **tail-call optimization**.

Some programming languages implement tail-call optimization which, if used
correctly, avoids stack overflows in recursive calls and increases
performance.

First of all, to make tail-call optimization possible, the function **must**
have a call as its last action (a tail call). This means **it needs to be
structured in a specific way**. Let's see it with an (oversimplified)
example (in Python, but don't pay attention to the language):

    :::python
    def factorial(a):
        """This function does not provide a tail call, because the last
        thing it executes is a multiplication, not a call."""
        if a == 1:
            return 1
        return a * factorial(a - 1)

    def factorial_tail(a, acc=1):
        """This function provides a tail call: the last thing happening
        in it is a function call."""
        if a == 1:
            return acc
        return factorial_tail(a - 1, acc=acc * a)

As the comments say, the first function is not tail-recursive, but the
second one is. So, what's the difference?

The main point is that the first function, `factorial`, needs to go back in
the call stack to retrieve the previous step's `a` value, while the second
one doesn't. That's why the second can be optimized and the first cannot.

The optimization exploits this behaviour in a really clever way to avoid the
stack overflows I mentioned before. Tail-call optimization just changes the
input parameters of the function and calls it again, replacing the original
call with a new call with different arguments. This is possible because the
function is written in a way that doesn't need anything from the previous
step.

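One way to see this "replace the call" idea in running code is a
trampoline, a common workaround in languages without the optimization (the
names below are my own; this is just a sketch): the function hands back a
zero-argument callable instead of calling itself, and a driver loop keeps
swapping each call for the next one, so the stack never grows.

```python
def factorial_step(a, acc=1):
    """Instead of recursing, return a thunk describing the next call."""
    if a == 1:
        return acc
    return lambda: factorial_step(a - 1, acc=acc * a)

def trampoline(result):
    """Replace each call with the next one until a plain value appears."""
    while callable(result):
        result = result()
    return result

print(trampoline(factorial_step(3)))  # 6
```
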
Imagine that we pass a `3` to both functions and compare the executions.
Let's check `factorial` first:

- Call `factorial(3)`
- Call `factorial(2)`
- Call `factorial(1)`
- Return `1`
- Return `2 * 1`
- Return `3 * 2`

Now with the `factorial_tail` function, but without any optimization:

- Call `factorial_tail(3)`
- Call `factorial_tail(2, acc=3)`
- Call `factorial_tail(1, acc=6)`
- Return `6`
- Return `6`
- Return `6`

See the difference?

The `factorial_tail` call doesn't need anything from the previous step: the
result of the last `factorial_tail(1, acc=6)` call is the same as the result
of the whole `factorial_tail(3)` call. That changes everything!

What tail-call optimization does is just change the call arguments and keep
running the same code. There's no need to store anything on the stack, just
replace the function call with the tail call.

Let's optimize the second trace now:

- Call `factorial_tail(3)`
- Replace the call with `factorial_tail(2, acc=3)`
- Replace the call with `factorial_tail(1, acc=6)`
- Return `6`

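That optimized trace is exactly what you get if you apply the transformation
by hand: since each tail call only rebinds the arguments, the recursion can
be rewritten as a loop that reassigns them. A sketch of the code the
optimizer effectively produces (the function name is mine):

```python
def factorial_tail_optimized(a, acc=1):
    """The tail call replaced by rebinding the arguments and looping."""
    while a != 1:
        a, acc = a - 1, acc * a
    return acc

print(factorial_tail_optimized(3))  # 6
```

The loop uses constant stack space no matter how big `a` is, which is the
whole point of the optimization.
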
This can be stretched further! It can involve different functions: anywhere
a tail call is made, even if the called function is a different one, this
kind of optimization can be applied, reducing stack usage and increasing
performance.

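The classic case of tail calls between different functions is mutual
recursion. A toy pair like this would run in constant stack space under the
optimization, while without it each call still consumes a frame:

```python
def is_even(n):
    """Tail-calls is_odd; an optimizer would turn this into a plain jump."""
    if n == 0:
        return True
    return is_odd(n - 1)

def is_odd(n):
    """Tail-calls is_even."""
    if n == 0:
        return False
    return is_even(n - 1)

print(is_even(10))  # True
```
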
If you want to read more about this, there's [a great Wikipedia page on the
subject][wikipedia] and [a really good explanation in the book Programming
in Lua][lua].

But how do programming languages handle all this, you may ask. There's no
single answer: each of them has its own style of dealing with it. Let me
give you some examples.

**Python**, just to point out that the language I chose for the example is
not the best example of this, has no tail-recursion elimination. Guido and
the Pythonists[^2] argue that tail-call optimization alters stack traces
(which is true) and that they don't like recursion as a base for
programming, so they try to avoid it. CPython has no tail-call optimization,
but they don't forbid (they can't!) any other Python implementation from
implementing that particular optimization. There's a really [interesting
post by Guido van Rossum about this][guido].

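It's easy to check that CPython really doesn't do it: even the
tail-recursive `factorial_tail` from the example above hits the recursion
limit, because every call still gets its own stack frame.

```python
import sys

def factorial_tail(a, acc=1):
    """Tail-recursive, but CPython still pushes one frame per call."""
    if a == 1:
        return acc
    return factorial_tail(a - 1, acc=acc * a)

try:
    factorial_tail(sys.getrecursionlimit() + 100)
except RecursionError:
    print("still no tail-call optimization")
```
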
**Lua**, as you've seen in the [previous link][lua], implements proper tail
calls (as they call them there) and there's nothing the programmer needs to
do to make sure they are optimized, other than writing the tail calls
correctly.

**Scala** implements tail-recursion optimization at compile time: the
compiler transforms the recursive call into a loop. What's interesting is
that there's a compile-time check too. An annotation called `@tailrec` can
be used to make sure your function is going to be optimized: if the compiler
is not able to optimize a function carrying the `@tailrec` annotation, it
throws an error. If the annotation is absent, it simply compiles a standard
recursion. The [annotations tour of the Scala language][scala] has some
words about `@tailrec`.

**Clojure** is a really interesting case too. Clojure doesn't implement
tail-call optimization, but it has one (and only one) special form for
non-stack-consuming looping: `recur`. This special form rebinds the
recursion point's bindings or arguments and jumps back to the recursion
point. The *recursion point* can be a `loop` special form or a function
definition. So, it's an explicit request for tail-recursion optimization.
The tail call must be placed correctly too: `recur` is only allowed in tail
position, and the compiler checks that it's located in the correct place.
It also has some specific rules that must be taken into consideration
(multiple-arity functions and so on) that are better [read in the
documentation][clojure].

> *Edited 2019-01-25*: Thanks to a discussion in the fediverse about the
> topic, I found the moment when **Emacs Lisp** got its tail-call
> optimization, with everything explained [in the author's blog][emacs].
> It's really interesting.

[^1]: Some help: what's the name of the website you check when you don't know
how to solve your programming problem?
[^2]: My next music band name.
[wikipedia]: https://en.wikipedia.org/wiki/Tail_call
[lua]: https://www.lua.org/pil/6.3.html
[guido]: https://neopythonic.blogspot.com/2009/04/tail-recursion-elimination.html
[scala]: https://docs.scala-lang.org/tour/annotations.html
[clojure]: https://clojure.org/reference/special_forms#recur
[emacs]: https://chrismgray.github.io/posts/emacs-tco/