Loops are the basic control flow structure we use in imperative programming whenever we want our program to do something multiple times. E.g. if we wanted to find the largest element in array, we’d need to check all the array elements, so we can put the checking code in a loop and execute it as many times as there are elements:
function maxLoop = function(array){
if(array.length == 0){
return null;
}
let max = Number.MIN_VALUE;
for(let i = 0; i < array.length; i++){
if(array[i] > max){
max = array[i];
}
}
return max;
}
maxLoop([]);
//=> null
maxLoop([3]);
//=> 3
maxLoop([3, 4, 1, -6, 9, 5]);
//=> 9
Pretty straightforward, but with one caveat - it relies on variable mutation. We are mutating the loop counter
i
and the max
variable. We’ll just hand-wave the loop counter - it’s something that goes with the territory
if you use a for
loop. However, max
variable is a whole different thing. For starters, its name is a bit wrong.
Initially, it holds the smallest possible number value so it cannot be the maximum for any array that doesn’t
hold this value, i.e. any of the test cases above. Then it becomes a “temporary maximum” of all the elements
visited so far and only becomes “true maximum” after the loop is finished. So, the name max
only becomes correct
in the last line of the code.
Generally, heavy reliance on data mutation breaks away from the declarative nature of functional programming. In order to achieve the same effect using only functional programming concepts, we’ll need to use recursion. Simply put, if we want to do something multiple times, we’ll call the same function multiple times. And if we want to do that without any loops, we’ll need to rig our function to call itself.
So, how do we write a recursive equivalent of our maxLoop
function? We can start with the edge cases - we
know that an empty array has no maximum and that the maximum of an array with exactly one element is that very
element.
let max = function(array){
if(array.length == 0){
return null;
}
if(array.length == 1){
return array[0];
}
}
max([]);
//=> null
max([6]);
//=> 6
max([3, 4, 1, -6, 9, 5]);
//=> undefined
So far so good, but the function has a massive flaw - it simply doesn’t work on arrays with more than one element. That’s something we can fix with recursion.
We can get the maximum element of an array by comparing the first element with whatever element is the maximum of the remainder of the array. For example:
maximum of [3, 4, 1, -6, 9, 5]
is either 3 or maximum of [4, 1, -6, 9, 5]
following the same thought
maximum of [4, 1, -6, 9, 5]
is either 4 or maximum of [1, -6, 9, 5]
then
maximum of [1, -6, 9, 5]
is either 1 or maximum of [-6, 9, 5]
and then
maximum of [-6, 9, 5]
is either -6 or maximum of [9, 5]
and then again
maximum of [9, 5]
is either 9 or maximum of [5]
finally
maximum of [5]
is, oh wait, our function already knows how to do this
it's just 5
Or in actual JavaScript code:
let max = function(array){
if(array.length == 0){
return null;
}
if(array.length == 1){
return array[0];
}
let first = array[0];
let rest = array.slice(1);
let maxRest = max(rest); //recursive call
if(first > maxRest){
return first;
}
return maxRest;
}
max([]);
//=> null
max([6]);
//=> 6
max([3, 4, 1, -6, 9, 5]);
//=> 9
And it works fine! Now, let’s analyze it. Every recursive function must contain:
- at least one recursive call, i.e. call to the same function - in our example that happens when we calculate the maximum of the remaining elements of the array.
- at least one exit clause, i.e. a set of conditions when the value is returned without any further recursive call - in our example that happens when the passed array has zero or one elements in it.
Both parts are equally important for a well-defined recursive function. Without a recursive call we’d only be able to process one element of the data structure and without an exit clause our function would keep recursively calling itself until it crashes.
Speaking of crashes, let’s show one of the most common problems with recursive function implementation.
To illustrate it, we’ll write a recursive function that calculates the sum of first n
numbers.
let sumFirst = function(n){
if(n == 0){
return 0;
}
let sumOfRest = sumFirst(n - 1);
return n + sumOfRest;
}
sumFirst(0);
//=> 0
sumFirst(1);
//=> 1
sumFirst(3);
//=> 6
sumFirst(6);
//=> 21
Looks ok, but what if we called it with a huge number?
sumFirst(100000)
//=> Uncaught RangeError: Maximum call stack size exceeded
And we got ourselves a stack overflow. What happens here is that every time we have a nested function call, the state of the function that called it is added to a stack. So, when the recursion finally reaches the exit clause it returns the value and the stack gets emptied all the way up to the initial function call.
This is nothing specific to recursion - it’s generally the way nested function calls work. In theory, we’d get the same crash if we nested 100000 calls to completely different functions. But since doing that is a bit unrealistic, this scenario is in practice much more likely if we use recursion. The bottom line is that, due to this constraint, the number of nested calls is limited by the stack size and the size of the context state each function call needs to store.
Having a hard limit on the recursion depth isn’t a very good thing. Luckily, there’s a few ways around that.
The obvious workaround is to only use recursion when we “know” that depth is not going to be large enough to crash the stack and to just use iteration otherwise.
A more technical solution is to use something called tail call optimization. We start with by questioning why do we need this stack in the first place. The purpose of the stack is to store the function context state so that the caller function instance could continue to run after the called function instance returns its result to it. E.g. in our example:
let sumFirst = function(n){
if(n == 0){
return 0;
}
let sumOfRest = sumFirst(n - 1);
return n + sumOfRest;
}
If we call the function with number 1 as its parameter then sumFirst
will run exactly twice - once with 1 as
its parameter and from there it will recursively call itself with 0 as its parameter.
So, the moment when recursive call happens, the environment stores the context state of sumFirst(1)
so that the
function could continue to run after sumFirst(0)
returns its value.
That said, if recursive call was the very last thing that happened in the code, we wouldn’t need the context state, simply because the caller function would not need to continue running after the called function returned value.
The very last thing done in the function body is said to be in tail position, i.e. it’s at the function end. Word “tail” here means “end”, it’s the opposite of “head” which means “beginning”. So, if optimize the code by calling the recursive function from a tail position, we’ll have a tail call optimization, get it?
Let’s try doing that in our example:
let sumFirstAlmostTailCall = function(n){
if(n == 0){
return 0;
}
return n + sumFirstAlmostTailCall(n - 1);
}
Almost, but not quite. Even though the recursive call is technically in the very last line of code, it’s not the
last thing that happens in the function. After it returns the value we add it to n
, so the addition operation
is the last thing. Which means that we still need to store n
parameter for every call and thus we still need
the stack.
But we can do better than that:
let sumFirstTailCall = function(n, acc){
if(n == 0){
return acc;
}
return sumFirstTailCall(n - 1, acc + n);
}
sumFirstTailCall(0, 0);
//=> 0
sumFirstTailCall(1, 0);
//=> 1
sumFirstTailCall(3, 0);
//=> 6
sumFirstTailCall(6, 0);
//=> 21
What we did now is introduce acc
- the accumulator parameter. Its purpose is to store the work-in-progress
result and pass it down the recursive calls. When we call the function, we always need to pass zero as the initial
accumulator value.
The accumulator parameter is really intended to be used by recursive calls, not our initial function call. If we don’t like to always manually pass a zero we can trim down the function interface by offering a one-parameter version. In JavaScript, we can optionally omit a parameter when we call a function and then just check its existence in function body.
let sumFirstTailCall = function(n, acc){
if(!acc){ //undefined is treated as false in if
acc = 0;
}
if(n == 0){
return acc;
}
return sumFirstTailCall(n - 1, acc + n);
}
sumFirstTailCall(0);
//=> 0
sumFirstTailCall(1);
//=> 1
sumFirstTailCall(3);
//=> 6
sumFirstTailCall(6);
//=> 21
Let’s see if we achieved anything with this:
Looks ok, but what if we called it with a huge number?
sumFirstTailCall(100000)
//=> Uncaught RangeError: Maximum call stack size exceeded
Wow, again!?
The bad news, in order to take advantage of tail call optimization, your programming language compiler or runtime must have tail call optimization feature implemented. As this does not exist for JavaScript at the time of this writing, our example above is just a theoretical illustration of what tail position is in a function.
It has been addressed in EcmaScript6 specifications and hopefully could be introduced at some point, but until then, we’ll need to use some other language to show an example.
We’ll use Clojure to implement a version without and with tail call optimization and compare the results:
(defn sum-first [n]
(if (zero? n)
0
(+ n (sum-first (- n 1)))))
(sum-first 0)
;;=> 0
(sum-first 1)
;;=> 1
(sum-first 3)
;;=> 6
(sum-first 6)
;;=> 21
(sum-first 100000)
;;=> StackOverflowError clojure.lang.Numbers$LongOps.negate (Numbers.java:534)
(defn sum-first-almost-tail-call
([n acc]
(if (zero? n)
acc
(sum-first-almost-tail-call (- n 1) (+ acc n))))
([n] (sum-first-almost-tail-call n 0)))
(sum-first-almost-tail-call 100000)
;;=> StackOverflowError clojure.lang.Numbers$LongOps.negate (Numbers.java:534)
Almost there, we just need to address one language-specific peculiarity. In Clojure, if we want to use tail call
optimization, we must call the function by using recur
rather than the actual function name, e.g:
(defn sum-first-tail-call
([n acc]
(if (zero? n)
acc
(recur (- n 1) (+ acc n))))
([n] (sum-first-tail-call n 0)))
(sum-first-tail-call 0)
;;=> 0
(sum-first-tail-call 1)
;;=> 1
(sum-first-tail-call 3)
;;=> 6
(sum-first-tail-call 6)
;;=> 21
(sum-first-tail-call 100000)
;;=> 5000050000
And that’s it. Stack did not overflow. We’ve got the result instead. How about even bigger numbers?
(sum-first-tail-call 9876543)
;;=>
(sum-first-tail-call 99000000)
;;=> 4900500049500000
(sum-first-tail-call 123456789)
;;=> 7620789436823655
Yep, still good.
But that doesn’t change the fact that we simply can’t do this with JavaScript. In fact, not many languages support tail call optimization often making recursion a risky choice for any large enough problem. Is there anything at all we can do about that? Luckily, it turns out there is. Let’s take another look at the JavaScript version with the tail call:
let sumFirstTailCall = function(n, acc){
if(!acc){
acc = 0;
}
if(n == 0){
return acc;
}
return sumFirstTailCall(n - 1, acc + n);
}
Logically speaking, what happens here is that we go from n
all the way down to zero, adding all numbers as we go.
We practically loop from n
to 0
and it just happens that we implemented our loop with recursive function calls.
The code above is equivalent to:
let sumFirstEquivalentLoop = function(n){
let acc = 0; //zero passed in initial call
while(n > 0){ //if condition met - recursive call
//otherwise - exit clause condition
acc += n; //recursive call second parameter
n--; //recursive call first parameter
}
return acc; //exit clause return
}
Comments next to each line show which step in iterative version is equivalent to which step in the recursive one. Arguably, it’s less readable, but go through it line by line and you’ll see that it follows the same algorithm. And without any dependence on function call stack, so it doesn’t crash:
sumFirstEquivalentLoop(10000);
;;=> 50005000
What’s the deal?, you could ask now, First you tell me to use recursion, not iteration, to solve problems. And now you tell me to drop the recursion and just use the loop? I could even just do this and be done with it:
let sumFirstForLoop = function(n){
let sum = 0;
for(let i = n; i > 0; i--){
sum += i;
}
return sum;
}
Well, if your favorite programming language supports tail call optimization (or the problem is not large in scope), then all is well.
But if not, there’s two things you can do:
- Khm, khm, switch to a language that better suits your needs. Just saying!
- Model your problem by solving it recursively, if it’s easier to think about the problem in a recursive way, and then possibly convert it to an iterative solution, like we just did now.
- Use the iterative solution from start. Using loops is not the end of the world - as long as the overall function is well thought and pure, you’ll be fine.
Now that we covered recursion, we continue our functional programming journey with a new tool in our belt. On we go!