In previous chapters we already noticed that in functional programming we don’t really care much about the exact order in which the individual steps of our programs are executed. Instead, we just specify what kind of result we want and based on which parameters and let the environment take care of the rest.
That is the reward we get for using pure functions. However, our own comfort is not the only benefit we get. Armed with the assumption of function purity, execution environment can transparently give you some pretty cool optimizations for free.
Lazy evaluation is one of those optimizations and its underlying philosophy is rather simple - no code should be executed until it is absolutely necessary to do so.
Let’s start with a non-lazy example. We can have a simple arithmetic expression in JavaScript:
(4 + 5 + 6) * (1 + 1)
Final result of this expression is the product of the two sums. So, we obviously need those sums evaluated first.
However, note that it really does not matter which sub-expression will be evaluated first, the left one or the
right one. Furthermore, the order in which addition operations are evaluated within expression (4 + 5 + 6)
also does not matter.
//is it going to be like this:
(4 + 5 + 6) * (1 + 1)
(9 + 6) * (1 + 1)
15 * (1 + 1)
15 * 2
30
//or like this:
(4 + 5 + 6) * (1 + 1)
(4 + 11) * (1 + 1)
15 * (1 + 1)
15 * 2
30
//or even like this:
(4 + 5 + 6) * 2
(4 + 11) * 2
15 * 2
30
Programming language specifications are typically strict about their evaluation order and JavaScript is no different. However, the point we were making here is that, at least logically speaking, the evaluation order in this example does not affect the final evaluation of this piece of code.
If we compare all the different evaluation orders, we can notice that in all of them every sub-expressions eventually gets evaluated and all of their operations get executed. Not a single addition or multiplication gets skipped, whichever path we use to get to the result.
Now, let’s use a different expression:
(5 == 5) || (7 == 1)
Here we have a logical OR operation that will be true if at least one of its parameters is true. Effectively, if we find that the first parameter is true (which it is), we don’t really need to evaluate the second one as we know that the whole expression is also true.
The evaluation order in this case goes like this:
(5 == 5) || (7 == 1)
true || (7 == 1)
true
Which means that sub-expression (7 == 1)
never actually gets evaluated - we simply don’t need it to know the
final result.
We can try a few variations of this example, e.g. we can use expressions with side effects to demonstrate that they are not evaluated:
//the console message will never appear
(5 == 5) || console.log("hey")
true || console.log("hey")
true
We’ll refer to this expression as our elementary lazy evaluation example. No sub-expression gets evaluated until it is absolutely necessary to do so. And since the second sub-expression never becomes necessary, it ends up unevaluated.
For the sake of comparison, we’ll also show the eager version of OR operation - its symbol is a single pipe character:
(5 == 5) | (7 == 1)
true | (7 == 1)
true | false
true
As we expected, eager operation forced the evaluation of all its sub-expressions even after it was obvious what the result would be. Since this example used pure sub-expressions, there was no visible difference. But what is going to happen if we retry our example with side effects using the eager OR operation:
(5 == 5) | console.log("hey")
true | console.log("hey")
true | undefined //at this point "hey" appears in console log
true
Now that we’ve covered the elementary cases, let’s try out something a bit more complex. Let’s say we have an array of numbers, e.g.
let numbers = [3, 4, 1, 7, 3, 10];
And we want to create another array that contains squares of those numbers. Easy enough:
let squares = [];
for(let i = 0; i < numbers.length; i++){
let n = numbers[i];
squares[i] = n * n;
}
squares;
//=> [9, 16, 1, 49, 9, 100]
Looks good! However, it’s a bit wasteful to run the calculations for all the array elements upfront without even considering which elements we actually need calculated. For all we know we may only need the square of, say, the third number:
squares[2];
//=> 1
This doesn’t make much of a difference in our tiny example, but imagine if our array was huge and if, instead of a simple multiplication, we called some expensive function on each element. Definitely not good if we only end up needing a small part of the result.
The complicated solution to this problem would be to figure out which elements we need calculated in advance and then process only them.
The good and simple solution is to use lazy evaluation. Again, we’ll start with the same array of numbers:
let numbers = [3, 4, 1, 7, 3, 10];
But this time, instead of creating a new array that contains pre-calculated squares, we’ll write a function that creates a wrapper around the original array:
let wrapWithSquares = function(array){
return {length: array.length,
get: function(index){
let value = array[index];
return value * value;
}}
}
let squaresWrapped = wrapWithSquares(numbers);
That gives us a data structure around the original wrapper that offers access to the processed items.
One particular difference is that we’ll use get
method instead of native index access, e.g. we’ll get the
third element like this:
squaresWrapped.get(2);
//=> 1
So, why is this solution lazy? Because we created the squares
structure without any squaring whatsoever.
In fact, the only time any multiplication happened was when we requested the third element and we only processed
that element. So, the code above only had one multiplication calculated and that would’ve been equally true for
an array the size of 100000 as it is for this one that has only six elements.
Next thing we’ll do is calculate something different with the same array of numbers, e.g. factorial for each element in the array. In order to do that we can re-use the same wrapped logic and only change the transformation function. In fact, by using higher order functions, we can make it pluggable:
let wrapArray = function(array, f){
return {length: array.length,
get: function(index){
return f(array[index]);
}}
}
let factorial = function(n){
if(n == 1){
return 1;
}
return n * factorial(n - 1);
}
let factorials = wrapArray(numbers, factorial);
factorials.get(0); // element at index 0 is 3
//=> 6
factorials.get(1); // element at index 0 is 4
//=> 24
factorials.get(2); // element at index 0 is 1
//=> 1
Pretty cool, but there is one drawback. From the performance perspective this all works fine only as long as we don’t need to access the same element multiple times. Because if we do, we’ll just re-run the transformation function over and over, which is a bit wasteful and could potentially drop the performance of our app, especially if the function is expensive.
We can see more clearly what happens if we use a transformation function with a side effect.
let double = function(n){
console.log("Doubling number " + n);
return n * 2;
}
let doubles = wrapArray(numbers, double);
doubles.get(0);
//=> Doubling number 3
//=> 6
doubles.get(1);
//=> Doubling number 4
//=> 4
doubles.get(2);
//=> Doubling number 1
//=> 2
//then we access the same element multiple times
doubles.get(0);
//=> Doubling number 3
//=> 6
doubles.get(0);
//=> Doubling number 3
//=> 6
doubles.get(0);
//=> Doubling number 3
//=> 6
doubles.get(0);
//=> Doubling number 3
//=> 6
//and console messages show us that it was calculated
//from scratch over and over again.
Ideally, we would have a data structure that keeps all its elements unevaluated until they are requested for the first time, but when they do get evaluated, the results would get cached for later reference. That would mean that every element would get evaluated at most once.
We’ll see more about result caching in the upcoming chapter about memoization, so stay tuned!