Let's Learn How Module Bundlers Work and Then Write One Ourselves

Module bundlers are the unsung heroes of the modern web development world. Chances are if you've built a web application with JavaScript in the last 5+ years, you've used a module bundler, maybe without even realizing it! Tools like webpack, Rollup, and Parcel have revolutionized the way we build for the web, but they can seem like mysterious black boxes. Today we're going to dive deep and learn exactly what module bundlers do, how they work under the hood, and even write a simple one ourselves!

A Brief History of Bundlers

First, let's set the stage with a quick history lesson. Back in the early days of JavaScript and the web, most code was written in a single file. As applications and codebases grew larger, developers began splitting their JavaScript into multiple files and loading them with multiple <script> tags. While this made code more maintainable, it led to problems:

  • Multiple script tags meant more HTTP requests, slowing down page loads
  • Scripts needed to be loaded in a specific order to handle dependencies
  • Scoping issues could easily leak variables into the global namespace

In 2009, a project called CommonJS was started with the goal of bringing the code organization benefits of modules in languages like Python and Ruby to JavaScript. Node.js adopted the CommonJS require and module.exports syntax, but browsers didn't have support for modules yet.

Some early tools like Browserify and RequireJS emerged to fill the gap, letting developers write modular code that could be bundled for browser usage. However, as JavaScript applications grew in size and complexity, so did the demands on bundlers. More advanced features like code splitting, hot module replacement, and tree shaking led to the development of a new generation of bundlers like webpack (first released in 2012, with version 1.0 following in 2014) and Rollup (2015).

Today, webpack is by far the most popular JavaScript bundler, with over 20 million weekly downloads on npm. A 2020 survey from the State of JS found that 77% of respondents use webpack, while 11% use Rollup and 8% use Parcel.

Module Bundler Concepts

So what exactly does a module bundler do? At a high level, a bundler:

  1. Takes an entry point file and follows its dependencies to build a dependency graph
  2. Resolves those dependencies, recursively building up the full graph
  3. Bundles the modules into one or more optimized output files

Let's look at each of those steps in more detail.

Entry Points

An entry point is simply the file where the bundler starts the process of building the dependency graph. This is typically the file that kicks off your application, like an index.js or app.js. You can specify multiple entry points to generate separate bundles, for example, to create a separate vendor bundle for third-party libraries.

Dependency Graph

The dependency graph is a representation of how all the modules in your project depend on each other. It's a directed graph where each node is a module and each edge is a dependency. The graph is usually acyclic, but circular dependencies are possible, so a bundler can't assume it.

Building this graph is the key job of the bundler's dependency resolution phase. Starting from the entry point, the bundler parses the code to identify dependencies (usually by looking for import or require statements), and then recursively does the same for each of those dependencies.

The parsing process can be tricky since it needs to handle different module formats (CommonJS, ESM, AMD) and deal with issues like circular dependencies. Bundlers use abstract syntax tree (AST) parsers to statically analyze the code and extract dependencies without actually executing it.
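To make the traversal concrete, here is a minimal sketch that walks a small graph containing a cycle. It makes two simplifying assumptions: the modules live in a hypothetical in-memory files object rather than on disk, and a regular expression stands in for a real AST parser. All names are illustrative.

```javascript
// Hypothetical in-memory "file system": module id -> source text.
// All modules sit in one directory, so specifiers double as ids.
const files = {
  './index.js': "import { a } from './a.js';\nimport { b } from './b.js';",
  './a.js': "import { b } from './b.js';\nexport const a = 1;",
  './b.js': "import { a } from './a.js';\nexport const b = 2;", // cycle: a <-> b
};

// Stand-in for an AST parser: finds static `import ... from '...'` statements.
function findImports(source) {
  const pattern = /import\s+[^'"]*?from\s+['"]([^'"]+)['"]/g;
  return Array.from(source.matchAll(pattern), (match) => match[1]);
}

// Walk the graph breadth-first, visiting each module exactly once.
function buildGraph(entryId) {
  const graph = new Map(); // module id -> array of dependency ids
  const queue = [entryId];
  while (queue.length > 0) {
    const id = queue.shift();
    if (graph.has(id)) continue; // already visited, which also breaks cycles
    const deps = findImports(files[id]);
    graph.set(id, deps);
    queue.push(...deps);
  }
  return graph;
}

const graph = buildGraph('./index.js');
console.log([...graph.keys()]);
```

The visited check is what keeps the walk from looping forever when a.js and b.js import each other.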

Module Resolution

Module resolution goes hand in hand with building the dependency graph. Each time the bundler finds an import, it has to map that specifier to an actual file on disk before it can read and parse the file.

Bundlers have various algorithms for resolving modules, but most follow a similar pattern:

  1. If the path is relative (like ./utils), resolve it relative to the current file
  2. If the path is absolute (like /lib/math), resolve it from the file system root or, depending on the bundler's configuration, from the project root
  3. If the path is a bare module name (like lodash), resolve it from the node_modules folder

The resolving process needs to handle file extensions (like .js or .ts), package.json entries like module or main, and browser field mappings for packages that have separate builds for node vs browser.
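The three rules can be sketched roughly as follows. This is not the algorithm of any particular bundler: resolveSpecifier and candidateFiles are hypothetical helpers, absolute paths are assumed to be relative to the project root, and node_modules is only looked up in the project root, whereas Node also walks up through parent directories. package.json fields are not handled.

```javascript
const path = require('path');

// Map an import specifier to a candidate path using the three rules above.
// posix paths keep the output identical across operating systems.
function resolveSpecifier(specifier, importerFile, projectRoot) {
  if (specifier.startsWith('./') || specifier.startsWith('../')) {
    // Rule 1: relative to the directory of the importing file
    return path.posix.join(path.posix.dirname(importerFile), specifier);
  }
  if (specifier.startsWith('/')) {
    // Rule 2: treated here as relative to the project root (an assumption)
    return path.posix.join(projectRoot, specifier);
  }
  // Rule 3: bare module name, looked up in node_modules
  return path.posix.join(projectRoot, 'node_modules', specifier);
}

// File names to try when the resolved path has no extension.
function candidateFiles(resolvedPath, extensions = ['.js', '.ts']) {
  if (path.posix.extname(resolvedPath) !== '') return [resolvedPath];
  return [
    ...extensions.map((ext) => resolvedPath + ext),
    ...extensions.map((ext) => path.posix.join(resolvedPath, 'index' + ext)),
  ];
}

console.log(resolveSpecifier('./utils', '/app/src/index.js', '/app'));
console.log(candidateFiles('/app/src/utils'));
```

A real resolver would check each candidate against the file system and pick the first one that exists.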

Bundling

Finally, after resolving all the dependencies, the bundler is ready to generate the output bundle(s). This process is also called "packing".

The basic approach is to wrap each module in a function and concatenate them into a single scope, with some additional runtime code to handle the module loading process. Here's a simplified example:

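// Each module is keyed by the same id that other modules use to require it
const modules = {
  './index.js': function(require, module, exports) {
    const greet = require('./greet.js');
    document.body.textContent = greet('world');
  },
  './greet.js': function(require, module, exports) {
    module.exports = function greet(name) {
      return `Hello, ${name}!`;
    };
  }
};

(function(modules) {
  // Modules that have already been loaded, keyed by module id
  const cache = {};

  function require(id) {
    // Return the cached exports if this module has already run
    if (cache[id]) {
      return cache[id].exports;
    }

    // Register the module in the cache before running it
    const module = cache[id] = {
      exports: {}
    };

    modules[id](require, module, module.exports);

    return module.exports;
  }

  // Kick off the program from the entry point
  require('./index.js');
})(modules);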

In this example, each module is wrapped in a function that takes require, module, and exports arguments (similar to the CommonJS environment). The require function is defined in the outer scope and handles loading and caching modules. Finally, the entry point is required to kick off the program.

Real bundlers have more advanced variations of this process. They might:

  • Use a more efficient code generation approach than simple concatenation
  • Rename variables to avoid naming collisions
  • Remove unused exports (tree shaking)
  • Split modules into separate chunks that can be loaded on demand
  • Generate source maps for easier debugging

Writing a Basic Bundler

Now that we understand the key concepts, let's apply them by writing a basic module bundler ourselves! Our bundler will take an entry file, build its dependency graph, and generate a bundled JS file.

Setup

First, we'll set up a new project:

mkdir bundler-demo
cd bundler-demo
npm init -y
npm pkg set type=module
npm install --save-dev acorn

We'll use the acorn library to handle parsing JavaScript and extracting dependencies. Setting "type": "module" in package.json lets Node run bundler.js, which uses import syntax.

Next, let's create an example project to bundle:

// add.js
export function add(a, b) {
  return a + b;
}

// index.js
import { add } from './add.js';

console.log(add(1, 2));

Creating the Bundler

Now we're ready to implement the bundler. Here's the code with explanations inline:

// bundler.js
import fs from 'fs';
import path from 'path';
import { parse } from 'acorn';

// Parse a file and extract its dependencies
function parseFile(filename) {
  const content = fs.readFileSync(filename, 'utf-8');
  // sourceType: 'module' makes acorn accept import/export syntax
  const ast = parse(content, { ecmaVersion: 'latest', sourceType: 'module' });

  const dependencies = [];

  // Import declarations are only allowed at the top level of a module,
  // so scanning the program body is enough to find them all
  for (const node of ast.body) {
    if (node.type === 'ImportDeclaration') {
      dependencies.push(node.source.value);
    }
  }

  return {
    filename,
    dependencies
  };
}

// Build the dependency graph given an entry file
function buildDependencyGraph(entryFile) {
  const rootModule = parseFile(entryFile);
  const queue = [rootModule];
  // Track visited files so shared or circular dependencies are parsed once
  const seen = new Set([entryFile]);

  // The queue grows while we iterate, so the loop reaches every module
  for (const module of queue) {
    const dirname = path.dirname(module.filename);

    module.dependencies.forEach(relativePath => {
      // Keep ids in './name.js' form so they match the import specifiers.
      // This only lines up when all modules live in the same directory.
      const dependencyFile = './' + path.posix.join(dirname, relativePath);
      if (seen.has(dependencyFile)) return;
      seen.add(dependencyFile);
      queue.push(parseFile(dependencyFile));
    });
  }

  return queue;
}

// Generate the bundled code given a dependency graph
function generateBundle(graph) {
  let modules = '';

  // Wrap the source of each module in a function keyed by its filename
  graph.forEach(mod => {
    modules += `'${mod.filename}': function(exports, require) {\n`;
    modules += fs.readFileSync(mod.filename, 'utf-8');
    modules += `\n},`;
  });

  // Runtime: a minimal require function, then a call to the entry module
  const result = `
    (function(modules) {
      function require(id) {
        const fn = modules[id];
        const module = { exports: {} };
        fn(module.exports, require);
        return module.exports;
      }
      require('${graph[0].filename}');
    })({${modules}})
  `;

  return result;
}

// Main function to bundle an entry file
function bundle(entryFile) {
  const graph = buildDependencyGraph(entryFile);
  const bundledCode = generateBundle(graph);
  fs.writeFileSync('./output.js', bundledCode);
}

bundle('./index.js');

Here's a step-by-step breakdown:

  1. In parseFile, we use acorn to parse a file into an AST. Because import declarations can only appear at the top level of a module, we scan the program body for ImportDeclaration nodes and collect their source (the relative import path).

  2. buildDependencyGraph takes an entry file, parses it, and adds it to a queue. Then it processes each module in the queue, parses its dependencies, and adds them to the queue until all dependencies have been processed. The result is an array representing the dependency graph.

  3. generateBundle takes the dependency graph and generates the bundled code. It wraps each module in a function and concatenates them into a single scope, with a require function to load modules at runtime.

  4. Finally, the bundle function is the main entry point that orchestrates the process. It builds the dependency graph, generates the bundled code, and writes it to disk.

If we run the bundler on our example project:

node bundler.js

It will generate an output.js file with the bundled code:

(function(modules) {
  function require(id) {
    const fn = modules[id];
    const module = { exports: {} };
    fn(module.exports, require);
    return module.exports;
  }
  require('./index.js');
})({
  './index.js': function(exports, require) {
    import { add } from './add.js';

    console.log(add(1, 2));
  },
  './add.js': function(exports, require) {
    export function add(a, b) {
      return a + b;
    }
  }
})

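And that's it! We've built a basic JavaScript module bundler. It can take an entry file, build its dependency graph, and generate a bundle with all the modules wrapped in functions. One caveat: the import and export statements are copied into the bundle unchanged, and they are a syntax error inside a function body, so output.js will not run as is. A complete bundler first rewrites them into require calls and exports assignments, for example with a transpiler such as Babel.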
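Since import and export statements are not valid inside the wrapper functions, the bundle needs a rewrite step before it can execute. The sketch below shows the idea for the two forms used in the example project only (a named import and an exported function declaration). The toCommonJS helper and its regular expressions are illustrative assumptions; real bundlers perform this transformation on the AST, and this version does not handle aliases, default exports, or other syntax.

```javascript
// Rewrite the two ES module forms from the example project into CommonJS.
function toCommonJS(source) {
  const exportedNames = [];
  const code = source
    // import { a, b } from './x.js';  ->  const { a, b } = require('./x.js');
    .replace(
      /import\s*\{([^}]*)\}\s*from\s*(['"][^'"]+['"]);?/g,
      (_, names, specifier) => `const {${names}} = require(${specifier});`
    )
    // export function name(...)  ->  function name(...), remembering the name
    .replace(/export\s+function\s+(\w+)/g, (_, name) => {
      exportedNames.push(name);
      return `function ${name}`;
    });
  // Attach every exported function to the exports object
  const assignments = exportedNames.map((name) => `exports.${name} = ${name};`);
  return [code, ...assignments].join('\n');
}

const sources = {
  './add.js': 'export function add(a, b) {\n  return a + b;\n}',
  './index.js': "import { add } from './add.js';\n\nconsole.log(add(1, 2));",
};

// Wrap each transformed module in a function, as the bundler does.
const modules = {};
for (const [id, source] of Object.entries(sources)) {
  modules[id] = new Function('exports', 'require', toCommonJS(source));
}

function localRequire(id) {
  const module = { exports: {} };
  modules[id](module.exports, localRequire);
  return module.exports;
}

localRequire('./index.js'); // logs 3
```

Applying a transform like this to each file before wrapping it in generateBundle would make the generated output executable.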

Real-World Bundlers

Of course, real-world bundlers like webpack and Rollup are much more sophisticated than our basic example. Some additional features they offer include:

  • Support for multiple entry points
  • Code splitting to generate separate bundles that can be loaded on demand
  • Tree shaking to remove unused exports and reduce bundle size
  • Support for CSS, images, and other asset types
  • Loaders and plugins to transform code (like transpiling ES2015+ to ES5)
  • Hot Module Replacement for faster development
  • Caching and incremental builds for better performance

Implementing these features requires a much more robust architecture, but the core concepts are the same ones we've covered.

The Future of Bundlers

As JavaScript and web development continue to evolve, so do the tools we use to build applications. One of the most exciting developments in recent years is native ES modules in browsers. Modern browsers now support using <script type="module"> to load ES modules directly, without needing a bundler.

This has led some to question whether bundlers are still necessary. While it's true that bundlers may not be needed for simpler applications, they still offer many benefits for larger, more complex apps:

  • Browsers load modules exactly as written. They do not perform build-time optimizations such as tree shaking or minification, so unused code is still downloaded.
  • Shipping large numbers of small, unbundled files can negatively impact performance. Bundlers can optimize and minimize network requests.
  • Many existing tools and frameworks still depend on a bundled environment. Migrating to unbundled, native modules would require significant changes.
  • Bundlers offer powerful features for optimizing production builds that go beyond just managing modules.

So while the role of bundlers may evolve, it's likely they will continue to be an important part of the web development toolchain for the foreseeable future.

Conclusion

Module bundlers are a critical piece of the modern JavaScript ecosystem. By allowing us to write modular, reusable code and efficiently packaging it for the browser, they enable the large, complex applications we build today.

I hope this deep dive has given you a better understanding of how bundlers work under the hood. While our DIY example only scratched the surface of what real-world bundlers can do, the core concepts of dependency resolution and code generation are the same.

Lastly, I want to emphasize that you don't need to deeply understand your bundler to be an effective developer. Tools like webpack have many powerful features, but can also be intimidating due to their complexity. My advice is to start simple, using a tool's defaults or a pre-configured starter, and gradually learn the more advanced customization options as you need them. Understanding the fundamentals of how your tools work is valuable, but shouldn't get in the way of shipping code.

So what do you think: will you be writing your own module bundler anytime soon? Probably not. But next time you're waiting for your webpack build to finish, perhaps you'll have a new appreciation for all the work it's doing behind the scenes!
