The definition of immutable data is completely contained in the name - it’s data that we cannot change or mutate.
Since this concept is defined by what it isn’t rather than what it is, we’ll start with an explanation of its opposite - mutable data.
What is mutable data?
Simply put, mutable data structures are the ones that we can change over the course of time. In JavaScript, regular objects are the typical example of this behavior. Let’s start by creating one now:
let car = {
  brand: 'Ford',
  model: 'Fiesta',
  color: 'gray'
};
What we have here is one object with three properties and their corresponding values, plus one reference to that object stored in a variable called car. Graphically, we could depict the whole thing as the variable car pointing to a single object that holds the three properties.
We can change any of the object properties at any moment, e.g. we can decide that blue color looks much better than gray:
car.color = 'blue';
And from that line of code the color of the car is blue. At least until another similar statement changes the value again to something else. What happened here is sometimes called destructive update. Destructive in the sense that the original value of the color property is now lost - it’s been destroyed, overwritten by the new value. We have no way of knowing that the car was ever gray, all we know is that it’s blue now.
Destructive updates are the defining feature of mutable data; they are what makes it mutable. Now let’s see some interesting consequences that object mutation can have.
Since the variable only holds the reference, when we copy its value and store it in another variable, we’ll end up with two variables referring to the same object, i.e. this:
let anotherCar = car;
Whichever variable we use, we’ll get to the same object. So, what happens if we change the color again?
car.color = 'red';
//evaluating the same variable has no surprises
car
//=> {brand: 'Ford', model: 'Fiesta', color: 'red'}
//but what about the other one?
anotherCar
//=> {brand: 'Ford', model: 'Fiesta', color: 'red'}
Since both variables refer to the same data structure, by changing a property for one of them, we inadvertently changed it for the other one as well.
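To make the aliasing concrete, here is a small sketch that checks identity with === and contrasts the alias with a shallow copy made using object spread syntax. The names sameCar and copiedCar are illustrative only and are not part of the example above.

```javascript
// Two variables that refer to one and the same object
const car = { brand: 'Ford', model: 'Fiesta', color: 'gray' };
const sameCar = car;

// A shallow copy is a different object with equal property values
const copiedCar = { ...car };

car.color = 'red';

console.log(sameCar === car);   // true: same object
console.log(sameCar.color);     // 'red': the change is visible through the alias
console.log(copiedCar === car); // false: a separate object
console.log(copiedCar.color);   // 'gray': the copy was not affected
```

Copying the reference is cheap, but it shares every later change. Copying the object isolates the two variables from each other.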
That’s it - mutable data is data that we can change. Now, let’s get back to the main topic.
What is immutable data?
The most common example of immutable data is the String data type, which is usually implemented as an immutable data structure, although it may also have a mutable counterpart depending on the programming language or library. JavaScript strings are immutable, so we’ll use them:
let greeting = "Hello";
When we create a JavaScript string, we get a value that consists of a number of characters, but it is immutable. In other words, we can’t change that same string, all we can do is use it to get a new one based on it.
For example, we might want to add an exclamation mark to our short greeting there. So, we just want to append a single character at the end of the string, but alas we cannot do that - not literally anyway. What we can do instead is to create a new string that consists of all the same characters, but also has an extra one.
let greeting = "Hello";
let anotherGreeting = greeting + "!";
Ok, but that’s not quite what we wanted to achieve. We wanted to change the value of greeting, not create a new variable. Well, there is nothing to prevent us from assigning the result to the same variable:
let greeting = "Hello";
greeting = greeting + "!";
Or even shorter:
let greeting = "Hello";
greeting += "!";
Note that, while we did change the value in the variable, we never actually changed the original string value that only said "Hello". Instead, we used that value and another string value containing only the exclamation mark and we produced a completely new string based on that.
So, what’s the consequence then? Let’s say we used the same string object with two different variables.
let greeting = "Hello";
let anotherGreeting = greeting;
So, we have two references to the same string. Simple enough, isn’t it? Let’s say we repeat what we did there and append an exclamation mark to greeting.
greeting += "!";
greeting
//=> "Hello!"
anotherGreeting
//=> "Hello"
What happened there is this:
- We started with two references to the same string.
- We created a new string with only one exclamation mark character in it.
- Then, we created another string by appending those two.
- We assigned a reference to the new string to the variable greeting, leaving the variable anotherGreeting alone.
- We ended up with the variable greeting referring to the new string and anotherGreeting still referring to the intact original string.
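The same behaviour shows up with the built-in string methods. As a brief illustration (the variable names are only for this sketch), methods such as toUpperCase and concat return a new string and leave the original untouched:

```javascript
const greeting = 'Hello';

// Each call produces a new string value
const shouted = greeting.toUpperCase();
const longer = greeting.concat('!');

console.log(shouted);  // 'HELLO'
console.log(longer);   // 'Hello!'
console.log(greeting); // 'Hello': the original is unchanged
```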
As another example, all JavaScript primitive types are immutable. That includes numbers and booleans, e.g.:
//we can evaluate a number literal
5
//=> 5
//we can use it in expressions
5 + 1
//=> 6
//or we can put it in a variable
let a = 5
//and then use it in an expression
a + 1
//=> 6
//but we can never change it
//a five is always a five,
//we can't change its fiveness!
Immutability and function purity
Alright, we’ve seen what immutable data is, now let’s see why we need it in the first place.
Remember how we defined pure functions - they get all their inputs as parameters, return the value only as the result and don’t touch the external state at any point in between.
Well, this only really works if all the inputs are immutable. Otherwise, what we get as a parameter is nothing but a reference to an object whose state may be changed by any other code that has a reference to it, i.e. dependence on some shared state. Likewise, if our function updates any fields in the object, other functions that refer to it will also be affected.
Ok, this is simple enough, but how would an immutable object be any better? Isn’t it also a reference to some external state? Well, yes. However, it is a state that cannot be modified by our function (no side effects) and also cannot be modified by any other code so it is fixed from the moment our function receives it as a parameter as if it were a hardcoded value literal, e.g. 5.
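As a rough sketch of the difference, compare a function that updates its argument in place with one that returns a new object. The function names repaintInPlace and repainted are made up for this illustration.

```javascript
// Impure: changes the object it was given, so every holder of a reference sees it
function repaintInPlace(car, color) {
  car.color = color;
  return car;
}

// Pure: reads its input and returns a new object, leaving the input alone
function repainted(car, color) {
  return { ...car, color };
}

const original = Object.freeze({ brand: 'Ford', model: 'Fiesta', color: 'gray' });
const blueCar = repainted(original, 'blue');

console.log(original.color); // 'gray'
console.log(blueCar.color);  // 'blue'

const mutableCar = { brand: 'Ford', model: 'Fiesta', color: 'gray' };
repaintInPlace(mutableCar, 'red');
console.log(mutableCar.color); // 'red': the caller's object was changed
```

The pure version works just as well on a frozen object, because it never tries to write to its input.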
To be fair, strictly speaking this is only a big deal in multithreaded environments, since you really need those two functions to run in parallel in order for them to simultaneously access the state of the same object. Still, even in single-threaded environments immutable data is much easier to make sense of, just as pure code is easier to test and debug.
The other side of the same coin is that it is incredibly difficult to parallelize a program that uses mutable data, whereas the same thing is almost trivial with immutable structures.
That’s it more or less - immutable data lets us write cleaner code, it’s easier to reason about and in multithreaded environments it is a necessary requirement for function purity.
Persistent data structures
Different environments have different default implementations for the built-in data structures, and there are even more to be found in libraries.
For example, in JavaScript regular objects, dates and arrays are mutable, while numbers, booleans and strings are not. The situation is similar in Java, C# and other mainstream object-oriented languages.
On the other hand, languages that are more focused on functional programming style typically use immutable built-in data structures as the default, e.g. Clojure, Haskell, etc.
Now, since our examples are in JavaScript, we might also mention that there is a way to make a regular object immutable, even though it is not particularly useful:
let castle = {princesses : 2, dragons: 0};
Object.freeze(castle);
The function Object.freeze takes one presumably mutable object as its input parameter and makes that object immutable. What’s useless about that, you might ask. What if we wanted to add a dragon to the castle?
castle.dragons = 1;
Depending on whether the code runs in strict mode, this line will either throw a TypeError or fail silently. Either way, it won’t update the object:
castle
//=> {princesses : 2, dragons: 0}
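The failed write can also be observed directly. The following sketch wraps the assignment in a function that opts into strict mode, so the attempt throws instead of being ignored. The helper name tryToAddDragon is hypothetical.

```javascript
const castle = Object.freeze({ princesses: 2, dragons: 0 });

// The 'use strict' directive makes the failed write throw instead of being ignored
function tryToAddDragon(target) {
  'use strict';
  try {
    target.dragons = 1;
    return 'updated';
  } catch (error) {
    return error instanceof TypeError ? 'TypeError' : 'unexpected error';
  }
}

console.log(tryToAddDragon(castle));  // 'TypeError'
console.log(castle.dragons);          // 0
console.log(Object.isFrozen(castle)); // true
```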
But, that’s what we wanted, wasn’t it? Almost, but not quite - we wanted to be able to “change” the object by creating a new one that’s different in one detail. We wanted to create a new castle the same as this one with the exception that it also has a dragon in it.
Sure, we can create a blank object and copy the properties one at a time, but it would be a bit tedious. What we really need is something like those functions String objects have where we pass some parameters and get the updated object as the function return value.
//not actual JavaScript!
//illustrative example only - it won't run
let changedCastle = update(castle, 'dragons', 1);
See, the problem here is that the object we got by calling Object.freeze is immutable, but it is not persistent!
A persistent data structure is one that is immutable, but also offers a fairly easy interface consisting of operations that look like mutations but actually create new objects and return them in a purely functional manner.
That said, we can see that built-in JavaScript String is a persistent data structure. We won’t find many of those in the standard library.
Of course, we could write our own data structures and functions to cover that gap. For illustrative purposes, we’ll write a naive implementation of the function update now:
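function update(object, property, value) {
  // copy every enumerable property into a fresh object
  let copy = {};
  for (let key in object) {
    copy[key] = object[key];
  }
  // apply the change to the copy, never to the original
  copy[property] = value;
  // freeze the copy so it cannot be modified later
  Object.freeze(copy);
  return copy;
}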
We first create an exact copy of the object, then set the extra property and then freeze the object just to make sure it’s immutable. Again, this is a very naive example, so it does not even work properly with nested objects, not to mention that with all the copying its performance can’t be good. Yet, we can use it to run our previous example:
let castle = Object.freeze({princesses : 2, dragons: 0});
let castleWithDragons = update(castle, 'dragons', 1);
castle
//=> {princesses : 2, dragons: 0}
castleWithDragons
//=> {princesses : 2, dragons: 1}
So, they’re persistent, alright. We can check if both objects are also immutable:
castle.dragons = 11;
castle
//=> {princesses : 2, dragons: 0}
castleWithDragons.princesses = 5;
castleWithDragons
//=> {princesses : 2, dragons: 1}
So, our objects themselves cannot be modified and the only way to get changed objects is through a pure function that produces a new value. Mission accomplished!
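As a variation on the naive update function, the same idea can be sketched with object spread syntax and a computed property name. The name updateWithSpread and the tower property are made up for this illustration. The sketch also shows the nested-object limitation mentioned earlier: Object.freeze is shallow, so the copy and the original still share any nested objects.

```javascript
// Hypothetical helper: copy all own enumerable properties, override one, freeze the result
function updateWithSpread(object, property, value) {
  return Object.freeze({ ...object, [property]: value });
}

const castle = Object.freeze({ princesses: 2, dragons: 0, tower: { guards: 1 } });
const castleWithDragon = updateWithSpread(castle, 'dragons', 1);

console.log(castle.dragons);           // 0
console.log(castleWithDragon.dragons); // 1

// Shallow copy and shallow freeze: the nested object is shared and still mutable
console.log(castleWithDragon.tower === castle.tower); // true
castleWithDragon.tower.guards = 3;
console.log(castle.tower.guards);      // 3
```

A fuller implementation would need to copy and freeze nested objects as well, which is one of the problems the dedicated libraries take care of.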
Of course, we don’t have to reinvent the wheel. There are plenty of libraries that fully focus on this problem, such as Immutable.js for JavaScript and many others for different languages and platforms.
Immutable data modelling
From a purely modelling point of view, there are various cases where either the mutable or the immutable approach corresponds more closely to what’s happening in the real system.
For example, when we changed the color of the car, it was actually more intuitive to use a mutable structure since that’s how things happen in real life.
car.color = 'red'
When you paint your car, you change the actual object. And it really is a destructive update in the sense that the previous color is destroyed. You don’t create a new car that’s the same as the old one except for the paint.
This is also where our vocabulary somewhat fails us when it comes to persistent data structure operations.
We will refer to an operation as set, update, change and so on even though we know that that’s exactly what they don’t do.
However, it turns out that this way of thinking is not any more unnatural than any other data modelling we used - it simply uses a different point of view.
With mutable data we used a data structure to represent a real world object as it exists through time. For example, a mutable car object represents the same car yesterday, right now, five minutes from now and tomorrow, too. Naturally, just like the real world car changes as time flows by, so does our data structure. And just like any changes to the real world car destroy its previous state, so do all the mutation operations on our data structure.
That’s where immutable data modelling is completely different. We eliminate the flow of time from the equation!
Instead of using a data structure to represent the live state of the car, what we actually do is use it as a snapshot of its state in one particular moment. So, when we update a value, what we actually do is update the knowledge our model has about the real world car.
Mutable data is a live broadcast that we can’t pause or rewind and we only ever see the latest state. On the other hand, immutable persistent data is a series of snapshots we can go back and forth with.
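The snapshot idea can be sketched as a list of frozen objects, where every change appends a new snapshot instead of overwriting the old one. The helper recordChange and the variable names are assumptions made for this example only.

```javascript
// Each entry is a frozen snapshot of the car at one moment
const history = [Object.freeze({ brand: 'Ford', model: 'Fiesta', color: 'gray' })];

// Hypothetical helper: record a new snapshot based on the latest one
function recordChange(snapshots, property, value) {
  const latest = snapshots[snapshots.length - 1];
  const next = Object.freeze({ ...latest, [property]: value });
  return [...snapshots, next];
}

const afterBlue = recordChange(history, 'color', 'blue');
const afterRed = recordChange(afterBlue, 'color', 'red');

console.log(afterRed.map(snapshot => snapshot.color)); // [ 'gray', 'blue', 'red' ]
console.log(afterRed[0].color); // 'gray': the earliest snapshot is still available
console.log(history.length);    // 1: the original list was not changed either
```

Because no snapshot is ever overwritten, going back is just a matter of picking an earlier index.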