A Deep Dive into Javascript References: Eliminating a Category of Bugs

As a full-stack developer who has worked extensively in Javascript, I've seen my fair share of perplexing bugs. More often than not, tricky Javascript bugs come down to a misunderstanding of the language's nuances and quirks. Perhaps the most notorious is how Javascript handles references when copying objects. It's a concept that seems simple on the surface, but in practice catches even senior developers off guard. In this deep dive, we'll explore exactly how object references work, the kinds of bugs they cause, and strategies to avoid them altogether.

Pass By Reference vs Pass By Value

To understand Javascript's quirk with object copying, we first need to take a step back and look at the two main ways programming languages can pass around data – pass by reference and pass by value.

Pass by value is the simpler of the two. It means whenever you assign a variable to another variable, it creates a completely separate copy of that value. Modifying one does not affect the other.

let num1 = 5;
let num2 = num1;

num2 = 10;

console.log(num1); // 5
console.log(num2); // 10

In this example, num1 is initialized with the value 5. When we assign num2 = num1, it copies that value 5 into a completely separate spot in memory for num2. So even though we later change num2 to 10, num1 remains unmodified as 5.

In contrast, pass by reference means when you assign a variable to another variable, it only copies the reference or "address" of the value in memory, not the value itself. Both variables end up pointing to the same underlying data. So modifying one also modifies the other.

let obj1 = {a: 1, b: 2};
let obj2 = obj1;

obj2.c = 3;

console.log(obj1); // {a: 1, b: 2, c: 3}
console.log(obj2); // {a: 1, b: 2, c: 3}

Here obj1 is initialized as an object {a: 1, b: 2}. When we assign obj2 = obj1, it doesn't create a new object; it just copies the reference to the existing object. You can visualize it like two arrows pointing to the same object in memory. So when we add a property c to obj2, we're actually modifying the shared object that both obj1 and obj2 reference.

[Figure: Object references visualized]

How Javascript Decides to Pass By Value or Reference

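So does Javascript use pass by value or pass by reference? In effect both, depending on the data type. Primitives (string, number, bigint, boolean, null, undefined, symbol) are copied by value, while objects (including arrays and functions) are shared by reference. Strictly speaking, Javascript always copies a value, but for an object the value being copied is the reference itself. That is why reassigning one variable never affects the other, whereas mutating the object it points to does.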

This often confuses new Javascript developers who expect consistency. Intuitively, it seems like objects should also be passed by value. After all, when you assign a number it makes a copy – so shouldn't assigning an object also make a copy? But because objects can be quite large and complex, copying them on every assignment would be very inefficient. So as a performance optimization, Javascript only copies the reference and both variables share the same underlying object.

As we saw in the initial example, this reference sharing can lead to some unexpected behavior if you're not aware of it. Modifying an object in one place can accidentally change it in other places. It violates our intuitive assumption that assigning a variable shouldn't affect any other variables.
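A small sketch may make the distinction concrete. The helper names addFlag and replaceObject are made up for this illustration. It shows that what gets copied on assignment, or on a function call, is the reference: mutating through the copy is visible to the caller, while rebinding the local variable is not.

```javascript
// Hypothetical helpers, named only for this illustration.
function addFlag(target) {
  // Mutates the object that the caller also holds a reference to.
  target.flagged = true;
}

function replaceObject(target) {
  // Rebinds only the local parameter; the caller's variable is untouched.
  target = { replaced: true };
  return target;
}

const original = { id: 1 };
const alias = original; // copies the reference, not the object

addFlag(alias);
console.log(original.flagged); // true

const result = replaceObject(original);
console.log(original.replaced); // undefined
console.log(result.replaced); // true
console.log(alias === original); // true
```

The useful rule of thumb is that reassignment is always local, but mutation travels with the reference.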

The Many Manifestations of Reference Bugs

The copying quirk can manifest as bugs in many scenarios. The most common is unintended mutation of shared state.

For example, consider a Redux reducer that needs to update an item in an array.

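function todosReducer(state = [], action) {
  switch (action.type) {
    case 'TOGGLE_TODO':
      // BUG (intentional, explained below): todo is the same object held in state
      let todo = state[action.index];
      todo.completed = !todo.completed;
      return state; // same array reference as before
    default:
      return state;
  }
}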

At first glance this code looks okay. It finds the right todo in the state array, flips its completed flag, and returns the state. But it doesn't actually work correctly. Because todo is a reference to an item in the state array, the reducer is directly mutating the previous state. The flag does flip, but return state hands back the very same array reference, so anything that detects changes by comparing the old and new state references (as React-Redux does) sees no change. This leads to all sorts of problems like the UI not reflecting the change.
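One way to repair the reducer is to return a new array in which only the toggled todo is replaced by a copy. This is a minimal sketch that keeps the action shape from the example above; the sample todos are invented for the demonstration.

```javascript
// A non-mutating version of the reducer; the action shape ({ type, index })
// is taken from the example above.
function todosReducer(state = [], action) {
  switch (action.type) {
    case 'TOGGLE_TODO':
      // map returns a new array; only the toggled todo is replaced by a copy.
      return state.map((todo, i) =>
        i === action.index ? { ...todo, completed: !todo.completed } : todo
      );
    default:
      return state;
  }
}

// Sample data, invented for the demonstration.
const before = [
  { text: 'write post', completed: false },
  { text: 'review post', completed: false },
];
const after = todosReducer(before, { type: 'TOGGLE_TODO', index: 0 });

console.log(after === before); // false: a new array, so the change is detectable
console.log(before[0].completed); // false: previous state is untouched
console.log(after[0].completed); // true
console.log(after[1] === before[1]); // true: unchanged items are reused
```

Unchanged todos are reused rather than copied, which keeps the update cheap even for long lists.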

Here's a more insidious case that recently caught me in a recursive function. See if you can spot the bug:

function sumLeaves(node, sum = 0) {
  if (!node) return sum;

  if (node.left) sum = sumLeaves(node.left, sum);
  if (node.right) sum = sumLeaves(node.right, sum);

  return sum + node.val;
}

This function is meant to sum the values of the leaf nodes in a binary tree, but it adds every node's value, internal nodes included, because the final line adds node.val unconditionally. Note that this is not a reference bug: sum is a number, so each recursive call receives its own copy, and the running total only changes through the return values. The reference-sharing version of this mistake appears when the accumulator is an array or object. Every recursive call then receives the same accumulator, and a mutation made deep in one branch is visible in every other branch.
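Sharing between recursive calls becomes real once the accumulator is an array or object rather than a number. The sketch below is a hypothetical example, not taken from the code above: it collects every root-to-leaf path in a binary tree, first with a shared path array and then with a fresh array per call. The function names and the sample tree are assumptions made for illustration.

```javascript
// Hypothetical example: collect every root-to-leaf path in a binary tree.
function collectPathsBuggy(node, path = [], paths = []) {
  if (!node) return paths;
  path.push(node.val); // mutates the one array shared by every call
  if (!node.left && !node.right) {
    paths.push(path); // stores a reference, not a snapshot
  }
  collectPathsBuggy(node.left, path, paths);
  collectPathsBuggy(node.right, path, paths);
  return paths;
}

function collectPaths(node, path = [], paths = []) {
  if (!node) return paths;
  const nextPath = [...path, node.val]; // fresh array for this call
  if (!node.left && !node.right) {
    paths.push(nextPath);
  }
  // paths is shared on purpose: it is the single result list.
  collectPaths(node.left, nextPath, paths);
  collectPaths(node.right, nextPath, paths);
  return paths;
}

const tree = {
  val: 1,
  left: { val: 2, left: null, right: null },
  right: { val: 3, left: null, right: null },
};

console.log(collectPathsBuggy(tree)); // [[1, 2, 3], [1, 2, 3]]
console.log(collectPaths(tree)); // [[1, 2], [1, 3]]
```

In the buggy version both entries in the result are the same array, so every path ends up showing every node visited.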

Reference bugs also frequently appear in asynchronous code. Consider this snippet:

const fs = require('fs');

function printFiles(files) {
  files.forEach(file => {
    fs.readFile(file, 'utf8', (err, contents) => {
      if (err) {
        console.error(file + ': ' + err.message);
        return;
      }
      console.log(file + ': ' + contents);
    });
  });
}

let files = ['file1.txt', 'file2.txt', 'file3.txt'];
printFiles(files);
files = [];

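The goal is to print the contents of each file in the files array. Despite appearances, reassigning files afterwards is harmless here: forEach has already iterated synchronously, each callback has closed over its own file string, and files = [] only rebinds the variable to a new array without touching the original. The danger arises when an asynchronous callback reads from a shared object that is mutated in place, rather than reassigned, before the callback runs. Clearing the array with files.length = 0 while a pending callback still reads from it is one such case.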
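Timing-dependent sharing shows up when a callback reads from an object that is mutated in place before it runs. The following is a minimal sketch with hypothetical names (queue, scheduleReport, scheduleReportSafely). It uses setTimeout instead of file I/O so that it runs without any files on disk.

```javascript
// Hypothetical job queue, used only to illustrate the timing problem.
const queue = ['file1.txt', 'file2.txt', 'file3.txt'];

function scheduleReport(items) {
  return new Promise(resolve => {
    // The callback runs later and reads whatever items holds at that time.
    setTimeout(() => resolve(items.length), 0);
  });
}

function scheduleReportSafely(items) {
  const snapshot = [...items]; // copy taken before any later mutation
  return new Promise(resolve => {
    setTimeout(() => resolve(snapshot.length), 0);
  });
}

const shared = scheduleReport(queue);
const isolated = scheduleReportSafely(queue);

queue.length = 0; // empties the array in place, before either timer fires

const reports = Promise.all([shared, isolated]);
reports.then(([sharedCount, isolatedCount]) => {
  console.log(sharedCount); // 0: the callback saw the emptied array
  console.log(isolatedCount); // 3: the snapshot was unaffected
});
```

Taking the snapshot synchronously, before control returns to the caller, is what protects the second version.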

As these examples illustrate, unintended reference sharing bugs can crop up in a variety of scenarios beyond just a simple object copy. Anytime you pass an object into a function (as a parameter, to a nested function, to a callback, in a recursive call, etc.), you open the door to referencing the same object in multiple places. And mutating one of those references will lead to confusing bugs.

Preventing Reference Bugs with Cloning

The fundamental problem with copying objects by reference is it violates our mental model. We expect assignments and function calls to be isolated, but shared references let them affect each other. The solution is to change how we copy objects to match our intuitive expectations – always copy the object data, not just the reference. In other words, clone the object.

There are a few common ways to clone an object in Javascript:

  1. Use the spread operator to shallow copy an object:

    let obj = {a: 1, b: 2};
    let clone = {...obj};

  2. Use Object.assign to shallow copy an object:

    let obj = {a: 1, b: 2};
    let clone = Object.assign({}, obj);

  3. Use JSON stringify/parse to deep copy an object:

    let obj = {a: 1, b: {c: 2}};
    let clone = JSON.parse(JSON.stringify(obj));

Each of these techniques creates a separate copy of the object's top-level data, so adding or reassigning properties on the clone doesn't affect the original. This solves the immediate shared reference problem. However, cloning comes with its own issues.

Both spread and Object.assign only create a shallow copy, meaning they only copy the top-level properties. If your object contains nested objects, those nested objects will still be shared by reference between the original and the copy. Of the three, only stringify/parse does a deep clone, and it only works on JSON-safe data: properties holding functions, symbols, or undefined are dropped, Date objects come back as strings, and circular references throw an error.
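The difference between these copies is easy to check directly. The sketch below also tries structuredClone, a built-in that is not among the three techniques listed above. It assumes a runtime that provides it, such as Node.js 17 or later or a current browser. The sample object is invented for the demonstration.

```javascript
// Sample object, invented for the demonstration.
const original = { a: 1, nested: { c: 2 }, created: new Date(0) };

// Shallow copy: the nested object is still shared with the original.
const shallow = { ...original };
shallow.nested.c = 99;
console.log(original.nested.c); // 99

// JSON round trip: nested data is copied, but the Date becomes a string.
const viaJson = JSON.parse(JSON.stringify(original));
viaJson.nested.c = 5;
console.log(original.nested.c); // 99 (unaffected this time)
console.log(typeof viaJson.created); // "string"

// structuredClone is assumed to exist in the runtime (see note above).
const deep = structuredClone(original);
deep.nested.c = 7;
console.log(original.nested.c); // 99
console.log(deep.created instanceof Date); // true
```

structuredClone still cannot copy functions, so none of these options is a universal deep clone.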

Cloning also has performance implications to keep in mind. Making copies of large or complex objects on every operation gets expensive fast. Imagine you have an array of thousands of large objects and you want to update a property on one of them. Making a complete copy of the entire array for that one update is wasteful.

Immutable Data Structures

The alternative to cloning is to change how we model our data structures to make them immutable. An immutable data structure can't be changed after it's created. Instead of modifying it, you create an updated copy of it. The original remains unchanged.

There are a few ways to achieve immutability in Javascript. One is to embrace functional programming principles and treat your objects as immutable, even though Javascript doesn't enforce it. Every time you need to update something, you return a new object instead of modifying the existing one. This is easier said than done – it requires rethinking common operations and diligently avoiding mutating methods.
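The language does offer some optional help with this discipline: Object.freeze makes an object reject writes, although only at the top level. The sketch below uses a hypothetical deepFreeze helper to cover nested objects and then performs an update by building a new object. The helper names and the settings object are assumptions for illustration.

```javascript
// Hypothetical helper: freezes an object and every object nested inside it.
function deepFreeze(value) {
  if (value !== null && typeof value === 'object' && !Object.isFrozen(value)) {
    Object.freeze(value);
    Object.values(value).forEach(deepFreeze);
  }
  return value;
}

// Hypothetical helper: reports whether a write to a nested property is rejected.
function rejectsMutation(obj) {
  'use strict'; // writes to frozen objects throw only in strict mode
  try {
    obj.layout.columns = 3;
    return false;
  } catch (err) {
    return err instanceof TypeError;
  }
}

const settings = deepFreeze({ theme: 'dark', layout: { columns: 2 } });
console.log(rejectsMutation(settings)); // true

// Updates produce a new object; the frozen original is left alone.
const updated = { ...settings, layout: { ...settings.layout, columns: 3 } };
console.log(settings.layout.columns); // 2
console.log(updated.layout.columns); // 3
```

Outside strict mode the failed write is silently ignored rather than thrown, so freezing works best as a development-time safety net.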

A more robust approach is to use a library that implements immutable data structures, like Immutable.js or Mori. These model your application data as a tree of immutable objects. Whenever you need to make a change, you get an updated reference to the tree. It works somewhat like Git versioning for your data structures. The libraries use advanced techniques like structural sharing to efficiently make copies while minimizing memory overhead.
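The idea behind structural sharing can be sketched by hand. This is not the API or the internal implementation of Immutable.js or Mori. It is only a hypothetical setIn helper for plain objects, which copies the objects along the changed path and reuses every other branch.

```javascript
// Hypothetical helper: returns a copy of obj with the value at path replaced,
// copying only the plain objects along that path.
function setIn(obj, path, value) {
  if (path.length === 0) return value;
  const [key, ...rest] = path;
  return { ...obj, [key]: setIn(obj[key], rest, value) };
}

// Sample data, invented for the demonstration.
const v1 = {
  user: { name: 'Ada', address: { city: 'London' } },
  todos: [{ text: 'write post' }],
};
const v2 = setIn(v1, ['user', 'address', 'city'], 'Paris');

console.log(v1.user.address.city); // "London": the old version is intact
console.log(v2.user.address.city); // "Paris"
console.log(v2.user === v1.user); // false: on the changed path, so copied
console.log(v2.todos === v1.todos); // true: untouched branch is shared
```

Because untouched branches are shared rather than copied, the cost of an update grows with the depth of the change, not with the size of the whole structure.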

Using immutable data structures solves the shared reference problem at a fundamental level. You never have to worry about one part of your code mutating an object by reference and affecting another because mutating isn't allowed. It forces a more functional style of programming focused on transformations rather than mutations.

There are tradeoffs to consider with immutability. It can make your code more verbose since you can't use mutating operations. It also has a learning curve – you have to understand and adhere to the immutable APIs. And it's yet another dependency to add to your codebase. But in my experience, the elimination of reference bugs is well worth the cost.

Internalizing Javascript Mental Models

At the end of the day, unintended reference sharing is just one of many counterintuitive parts of Javascript. The path to Javascript mastery is internalizing accurate mental models of how these language features really work. When you read let b = a and a holds an object, you should immediately think "b refers to the same object as a" instead of "b is a copy of a". The goal is for this mental model to become natural and automatic.

Building these mental models takes a lot of intentional practice. Anytime you encounter a bug related to a Javascript quirk, take the time to dive deep into it. Don't just find a quick fix on Stack Overflow. Read explanations and dissect examples until the underlying concept really clicks for you. Over time, you'll build a robust framework for reasoning about Javascript edge cases.

More broadly, strive to understand the why behind the design of Javascript. Read up on the language's history and philosophy. The quirks make a lot more sense when you realize Javascript's core tenet is flexibility over strictness. Gaining an appreciation for this context will help you work with the grain of the language rather than fighting against it.

Conclusion

If you take one thing away from this deep dive, let it be this – Javascript copies objects by reference, and it can lead to some truly perplexing bugs. Anytime you pass an object around, there's a risk that it's actually a reference to some shared data structure. Mutating it in one place can cause unexpected side effects in another.

The two main strategies to combat this are cloning objects to make true copies or using immutable data structures to prevent mutation altogether. Both have tradeoffs in performance and complexity, but they're essential tools for avoiding reference bugs.

More importantly, seek to deeply understand Javascript's reference model and build an accurate mental representation of it. Don't settle for the faulty intuition that objects should copy like primitives. Recognize and anticipate scenarios where reference sharing can bite you. The key to mastering Javascript is internalizing its quirks so you can focus on the fun parts – building awesome things.
