Supercharging Docker for Mac File Access: An Expert Guide

If you've been using Docker for Mac to develop and deploy applications, you may have noticed one glaring issue: file access speeds for mounted volumes are abysmally slow compared to native file systems. Operations that take just milliseconds on Linux can grind to a halt when running inside a container on macOS, leading to a sluggish and frustrating development experience.

In this post, we'll take a deep dive into the root causes of this performance bottleneck and explore several strategies to dramatically speed up file access in your dockerized development environment on macOS. Along the way, we'll cover the technical details of how each approach works and provide benchmarks to quantify the performance gains. Let's get started!

Understanding the Docker File System Landscape on macOS

On macOS, Docker runs containers inside a lightweight hypervisor called HyperKit. This virtualization layer is necessary to run the Linux kernel and containerized processes on top of macOS. However, it also means that file system calls from inside containers must traverse multiple layers of abstraction, incurring significant overhead.

When you mount a macOS directory into a container using the -v flag, Docker uses a custom file system called osxfs to facilitate the mount. Osxfs is based on FUSE (Filesystem in Userspace), which allows file systems to be implemented in user space rather than kernel space.

While this architecture enables the necessary translation between the macOS and Linux file systems, it comes at a steep performance cost. File system calls that normally take microseconds can balloon to milliseconds as they are passed through the FUSE layer and hypervisor.

To quantify the performance hit, let's run some benchmarks comparing file access speeds between native Linux, macOS, and Docker for Mac with an osxfs mount. We'll use the dd command to write a 100MB file and measure the time taken.

On native Linux:

$ dd if=/dev/zero of=./test.img bs=1M count=100
104857600 bytes (105 MB, 100 MiB) copied, 0.0983159 s, 1.1 GB/s

On macOS:

$ dd if=/dev/zero of=./test.img bs=1m count=100
104857600 bytes transferred in 0.376233 secs (278506448 bytes/sec)

On Docker for Mac with osxfs:

$ docker run --rm -v $PWD:/data alpine dd if=/dev/zero of=/data/test.img bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 7.80701 s, 13.4 MB/s

The results are startling. The native Linux file system clocks in at a blistering 1.1 GB/s, while macOS manages a respectable 278 MB/s. But with osxfs, the write speed plummets to just 13.4 MB/s: roughly 20x slower than native macOS and over 80x slower than native Linux!
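The dd test only measures bulk sequential throughput; the overhead is even more punishing for metadata-heavy workloads, where every stat and open pays the full FUSE round trip. As a rough, hedged illustration (exact numbers will vary widely by machine and Docker version), you can time the creation of a thousand small files on an osxfs bind mount versus the container's own VM-local filesystem:

# Create 1,000 empty files on the osxfs bind mount, then on the container's own filesystem
$ docker run --rm -v "$PWD":/mnt alpine sh -c '
    echo "osxfs bind mount:" && time sh -c "i=0; while [ \$i -lt 1000 ]; do touch /mnt/f\$i; i=\$((i+1)); done"
    echo "container filesystem:" && time sh -c "i=0; while [ \$i -lt 1000 ]; do touch /tmp/f\$i; i=\$((i+1)); done"
    rm -f /mnt/f*   # clean up the test files from the host directory
  '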

Clearly, the overhead of FUSE and the hypervisor is having a massive impact on file system performance. So what can we do about it? Let's explore some options.

Option 1: Using NFS Volumes

One alternative to osxfs is to use NFS (Network File System) to mount directories into containers. NFS has been around for decades and is known for its simplicity and performance. Best of all, it's built right into macOS, so you don't need to install any additional software.

To set up an NFS mount, first make sure the NFS server is enabled on your Mac. Open a terminal and run:

$ sudo nfsd enable

Next, configure the directories you want to share by editing the /etc/exports file. Add a line for each directory you want to share, specifying the path, any export options, and the clients allowed to access it:

/Users/yourusername/projects -alldirs -mapall=501:20 localhost

This line shares the /Users/yourusername/projects directory and all its subdirectories (-alldirs) with the localhost client. The -mapall=501:20 option maps all incoming UIDs and GIDs to the UID 501 and GID 20, which are the default IDs for the first user account on macOS.
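If you aren't sure which IDs to use, you can look them up for your own account. On a default single-user Mac they are usually 501 and 20 (the staff group); the output below is just illustrative:

# Print your numeric user ID and primary group ID
$ id -u
501
$ id -g
20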

After saving the file, restart the NFS server:

$ sudo nfsd restart
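Before moving on, it's worth sanity-checking the export. The following is a hedged sketch: nfsd checkexports validates the /etc/exports syntax, and showmount -e lists what is actually being served. On some setups Docker's NFS client also connects from a non-reserved port, in which case you may additionally need the nfs.conf tweak shown here (followed by another restart):

# Validate /etc/exports and list the active exports
$ sudo nfsd checkexports
$ showmount -e localhost

# If mounts from the Docker VM are refused, allowing non-reserved source ports is a common fix
$ echo "nfs.server.mount.require_resvport = 0" | sudo tee -a /etc/nfs.conf
$ sudo nfsd restart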

Now you can create a named Docker volume backed by the NFS export and mount it into a container. The exact mount options vary between Docker Desktop versions and network setups; the form below assumes the macOS NFS server is reachable from the Docker VM as host.docker.internal and uses NFS version 3:

$ docker volume create --driver local \
    --opt type=nfs \
    --opt o=addr=host.docker.internal,rw,nolock,hard,nfsvers=3 \
    --opt device=:/Users/yourusername/projects \
    nfs-projects
$ docker run -it -v nfs-projects:/projects alpine sh

This creates a named volume, nfs-projects, whose contents come from the NFS export of /Users/yourusername/projects, and starts an Alpine Linux container with that volume mounted at /projects.
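To confirm that data is really flowing over NFS rather than osxfs, you can inspect the mount table inside a container using the volume created above; the /projects entry should typically be reported with an nfs filesystem type:

# The mount for /projects should show up as type nfs
$ docker run --rm -v nfs-projects:/projects alpine sh -c 'mount | grep /projects'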

Let's re-run the earlier dd benchmark, this time writing into the NFS-backed volume:

$ docker run --rm -v nfs-projects:/data alpine dd if=/dev/zero of=/data/test.img bs=1M count=100
100+0 records in
100+0 records out
104857600 bytes (105 MB, 100 MiB) copied, 0.335776 s, 312 MB/s

The results are impressive – with NFS, the write speed jumps to 312 MB/s, a 23x improvement over osxfs! Reads are also substantially faster, making NFS a solid choice for many performance-sensitive workloads.

However, NFS does have some limitations. It can be tricky to set up, especially if you have a complex network topology. It also lacks some of the advanced features of more modern file systems, like strong consistency guarantees and fine-grained access controls. And if you're frequently switching between different projects or code bases, manually editing the /etc/exports file and restarting the NFS server can quickly become a chore.
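If you do end up juggling exports regularly, a tiny helper script can append a new entry and bounce the NFS server for you. This is only a sketch (the script name is made up here, and it reuses the same -mapall/localhost options as above); adapt it to your own setup:

#!/bin/bash
# share-nfs.sh: export a directory over NFS and restart the server
# Usage: ./share-nfs.sh /Users/yourusername/projects/another-app
set -euo pipefail
DIR="$1"
echo "$DIR -alldirs -mapall=501:20 localhost" | sudo tee -a /etc/exports
sudo nfsd checkexports   # validate /etc/exports before restarting
sudo nfsd restart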

Option 2: Remote Container Development with Rsync

Another approach that has gained popularity recently is using a remote Linux host, VM, or cloud instance for container development. With this method, you run your containers on a remote machine, using rsync to synchronize your local source code files with the remote host.

The biggest advantage of this approach is that you get native file system performance since your containers are running on a Linux machine with no virtualization overhead. You can use any Linux distribution and file system you want, and you have full control over the environment.

To get started, you'll need a remote Linux machine. For this tutorial, we'll assume you're using an Ubuntu 20.04 VM on your local network, reachable at 192.168.1.100 as the ubuntu user, with your SSH key authorized to access it.

Install Docker on the remote machine:

$ ssh ubuntu@192.168.1.100
$ sudo apt update
$ sudo apt install docker.io
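A couple of optional but common post-install steps let you run docker without sudo and make sure the daemon starts on boot (you'll need to log out and back in for the group change to take effect):

# Run docker without sudo and start the daemon automatically
$ sudo usermod -aG docker $USER
$ sudo systemctl enable --now docker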

Next, install rsync on both your local Mac and the remote host:

# On macOS
$ brew install rsync

# On Ubuntu 
$ sudo apt install rsync

Now, create a directory on the remote host to store your project files:

$ ssh ubuntu@192.168.1.100 "mkdir -p projects/myapp"

We'll use rsync to synchronize the local project directory with the remote one. Test it out first with a dry run; the -n flag makes rsync show what it would transfer without copying anything:

$ rsync -avzn -e ssh --delete /Users/yourusername/projects/myapp/ ubuntu@192.168.1.100:projects/myapp

This command recursively syncs the local /Users/yourusername/projects/myapp directory to projects/myapp on the remote host. The --delete flag tells rsync to remove any files on the remote side that don't exist locally, keeping the two directories perfectly in sync.
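In a real project you will usually want to skip bulky directories that the remote side can rebuild for itself. rsync's --exclude flag handles this; the patterns below are just examples, still shown as a dry run:

# Skip version control and dependency directories that don't need to cross the wire
$ rsync -avzn -e ssh --delete \
    --exclude '.git' --exclude 'node_modules' --exclude '*.log' \
    /Users/yourusername/projects/myapp/ ubuntu@192.168.1.100:projects/myapp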

If everything looks good, re-run rsync without the -n flag to actually transfer the files. Then, SSH into the remote machine and start your containers there, mounting the synced directory:

$ ssh ubuntu@192.168.1.100
$ docker run -it --rm -v /home/ubuntu/projects/myapp:/app alpine sh

Any changes you make locally will be synced to the remote machine, so you can edit your code in your favorite IDE and run it in containers with native Linux speed!
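You don't even need to keep an interactive SSH session open; for one-off runs you can drive Docker on the remote host straight from your Mac (paths as set up above):

# Run a throwaway container against the synced code, directly from macOS
$ ssh ubuntu@192.168.1.100 "docker run --rm -v /home/ubuntu/projects/myapp:/app alpine ls /app"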

For even more convenience, you can automate the syncing process using a helper like fswatch. Install it on macOS:

$ brew install fswatch

Then, create a simple bash script to watch for local file system changes and automatically trigger an rsync:

#!/bin/bash

# Watch the local directory for changes
fswatch -o /Users/yourusername/projects/myapp | while read f; do

  # Sync changes to the remote host
  rsync -avz -e ssh --delete /Users/yourusername/projects/myapp/ ubuntu@192.168.1.100:projects/myapp

done

Save the script as sync.sh, make it executable with chmod +x sync.sh, and run it in a terminal window. Now, any time you change a file locally, it will be automatically synced to the remote host, ready for testing in your containers.
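One refinement worth considering: editors often emit a burst of events for a single save, which would trigger several back-to-back rsyncs. fswatch's --latency option batches events over a short window, so a variant of sync.sh might look like this:

#!/bin/bash
# Variant of sync.sh: batch bursts of change events over a 2-second window
fswatch -o --latency 2 /Users/yourusername/projects/myapp | while read f; do
  rsync -avz -e ssh --delete /Users/yourusername/projects/myapp/ ubuntu@192.168.1.100:projects/myapp
done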

The remote container development approach offers the best of both worlds – the performance of native Linux with the convenience of local editing. The main downside is the need to maintain a separate remote machine and keep your code in sync. But for many development workflows, the performance gains are well worth the added complexity.

Conclusion

As we've seen, there are several ways to work around the file system performance issues with Docker for Mac and achieve near-native speeds for your containerized development environment.

If you're looking for a quick and easy solution, using NFS mounts can provide a substantial performance boost with minimal configuration. It's a good choice for smaller projects or when you don't need the absolute highest levels of performance.

For larger codebases or more intensive workloads, the rsync-based remote container development approach offers the best performance by running your containers on a native Linux machine. It requires a bit more setup and infrastructure, but the gains in speed and productivity can be enormous, especially for I/O-heavy applications.

Ultimately, the right approach depends on your specific needs and constraints. But by taking the time to optimize your file system performance, you can supercharge your Docker for Mac development experience and keep your code flowing smoothly.

I hope this guide has been helpful in demystifying the performance challenges of Docker for Mac and providing some practical solutions. Happy coding, and may your containers be swift and speedy!
