Garbage Collection in Java – What is GC and How it Works in the JVM

As a Java developer, you may have heard about garbage collection, but do you really understand how it works under the hood? Garbage collection is an essential feature of Java that automatically frees memory occupied by objects the program can no longer reach. This helps avoid common memory-related bugs such as dangling pointers, double frees, and many memory leaks, which can cause crashes or performance issues in lower-level languages like C and C++.

In this article, we'll take a deep dive into how garbage collection is implemented in the Java Virtual Machine (JVM). We'll explore how the JVM manages memory, how the garbage collector identifies and frees unused objects, the different types of collectors available, and best practices for writing GC-friendly code and tuning the JVM. Let's get started!

Java Memory Management and the Heap

To understand garbage collection, we first need to know a bit about how the JVM manages memory. When a Java program runs, objects are allocated from a region of memory called the heap. The heap is created when the JVM starts up and is destroyed when it shuts down. Its size can be specified with the -Xms (initial) and -Xmx (maximum) flags passed to the java command, for example -Xms256m -Xmx2g.
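
To see what heap limits a running JVM actually ended up with, you can ask the standard Runtime class. The sketch below is only illustrative: the class name HeapInfo and the helper toMiB are made up for this example, and the numbers you see depend on your machine and on any -Xms/-Xmx flags you pass.

```java
class HeapInfo {
    /** Converts a byte count to whole mebibytes. */
    static long toMiB(long bytes) {
        return bytes / (1024 * 1024);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        long max = rt.maxMemory();         // upper bound the heap may grow to (governed by -Xmx)
        long committed = rt.totalMemory(); // heap currently reserved by the JVM (starts near -Xms)
        long free = rt.freeMemory();       // unused part of the committed heap

        System.out.println("max heap:       " + toMiB(max) + " MiB");
        System.out.println("committed heap: " + toMiB(committed) + " MiB");
        System.out.println("used heap:      " + toMiB(committed - free) + " MiB");
    }
}
```

Running it once with default settings and once with a small -Xmx value is a quick way to confirm that the flag took effect.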

Here's an example of allocating an object on the heap:

public class Scratch {
    public static void main(String[] args) {
        // Allocate a 1MB byte array on the heap
        byte[] data = new byte[1024 * 1024];
    }
}

In this code, a 1MB byte array is allocated on the heap when main() runs. The JVM keeps this block of memory reserved for the array for as long as the object is reachable (i.e. some live reference still leads to it). Once the array becomes unreachable, here when main() returns, its memory becomes eligible to be freed by the garbage collector.

[Figure: Java heap memory structure]

It's important to realize that the heap is finite in size. If the program keeps allocating objects that remain reachable, the heap eventually fills up and the JVM throws an OutOfMemoryError. This is where the garbage collector comes in – its job is to automatically find objects that are no longer needed and free their memory so it can be reused for new objects.

How the Garbage Collector Works

A typical garbage collector has three main phases:

  1. Mark – identify which objects on the heap are still reachable by traversing references from root objects (such as local variables on thread stacks and static fields). Objects that are not reachable are considered garbage.

  2. Sweep – free the memory used by the unreachable objects identified in the mark phase.

  3. Compact (optional) – move the reachable objects next to each other to eliminate memory fragmentation and allow new large objects to be allocated. Some GCs do this in a separate phase while others combine it with sweeping.
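
To make the mark and sweep phases concrete, here is a toy simulation. It is not how the JVM is implemented: the "heap" is just a map from a made-up object id to the ids it references, and the names ToyMarkSweep, mark, and sweep are invented for this sketch. Compaction is left out.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class ToyMarkSweep {
    /** Mark phase: returns the ids of all objects reachable from the roots. */
    static Set<String> mark(Map<String, List<String>> heap, Set<String> roots) {
        Set<String> marked = new HashSet<>();
        Deque<String> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            String id = pending.pop();
            if (marked.add(id)) { // first visit: follow this object's outgoing references
                pending.addAll(heap.getOrDefault(id, Collections.<String>emptyList()));
            }
        }
        return marked;
    }

    /** Sweep phase: removes every unmarked object and returns how many were freed. */
    static int sweep(Map<String, List<String>> heap, Set<String> marked) {
        int before = heap.size();
        heap.keySet().retainAll(marked);
        return before - heap.size();
    }

    public static void main(String[] args) {
        Map<String, List<String>> heap = new HashMap<>();
        heap.put("A", Arrays.asList("B"));
        heap.put("B", Collections.<String>emptyList());
        heap.put("C", Arrays.asList("D")); // C and D reference each other...
        heap.put("D", Arrays.asList("C")); // ...but no root leads to them

        Set<String> marked = mark(heap, Collections.singleton("A"));
        int freed = sweep(heap, marked);
        System.out.println("live=" + new TreeSet<>(heap.keySet()) + " freed=" + freed);
        // prints: live=[A, B] freed=2
    }
}
```

Note that C and D are collected even though they point at each other. Because marking starts from the roots, a cycle of garbage is still garbage.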

[Figure: Mark and sweep garbage collection]

Let's walk through a simple example to see how this works in practice:

public void doSomething() {
    // Allocate objects A, B, and C
    Object objA = new Object();
    Object objB = new Object(); 
    Object objC = new Object();

    // Drop the only reference to object A
    objA = null;

    // Simulate some computation
    // ...

    // Only objects B and C are reachable now

    // The doSomething() method returns
}

When doSomething() is called:

  1. Objects A, B, and C are allocated memory on the heap
  2. The reference to object A is lost when objA is set to null
  3. doSomething() eventually returns, so the locals objB and objC go away and objects B and C become unreachable too

At this point, none of the three objects is reachable, so all of them are eligible for garbage collection. The next time the GC runs its mark phase, it will not reach A, B, or C from any root. The sweep phase will then free the memory used by those objects.
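
You can observe reachability from inside a program with a WeakReference, which does not keep its target alive. The following is a hedged sketch with made-up names (ReachabilityDemo and its two methods). It calls System.gc() purely for demonstration; that call is only a request, so the second result is likely but not guaranteed to be true on every JVM.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** While a strong reference exists, the object must survive any collection. */
    static boolean survivesWhileReferenced() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // only a request; the JVM may ignore it
        return weak.get() == strong;
    }

    /** With no strong reference left, the collector is free to clear the weak one. */
    static boolean clearedOnceUnreachable() throws InterruptedException {
        WeakReference<Object> weak = new WeakReference<>(new Object());
        for (int attempt = 0; attempt < 50 && weak.get() != null; attempt++) {
            System.gc();
            Thread.sleep(10);
        }
        return weak.get() == null;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("still referenced, survives: " + survivesWhileReferenced());
        System.out.println("unreachable, cleared:       " + clearedOnceUnreachable());
    }
}
```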

It's important to note that the programmer never explicitly frees memory in Java. It is done automatically by the garbage collector. The collectors in the standard HotSpot JVM do not use reference counting. Instead, they trace reachability from a set of GC roots, so even objects that reference each other in a cycle are collected once nothing reachable points to them.

Generational Garbage Collection

The memory managed by the JVM is divided into three main areas. The first two, called generations, make up the object heap and are garbage collected with different frequencies:

  • Young generation – where new objects are initially allocated. It is GCed frequently since most objects die young. The Eden space is where objects are first allocated. Objects that survive a young collection are copied into one of the two Survivor spaces (S0 and S1).

  • Old generation – where long-lived objects end up after surviving a number of young GC cycles. It is GCed less often.

  • Permanent generation (Java 7 and earlier) / Metaspace (Java 8+) – used for class definitions and related metadata. The permanent generation was a fixed-size area managed alongside the heap, while Metaspace is allocated from native memory outside the Java heap. Class metadata may be reclaimed when its class loader becomes unreachable (like if a web app is redeployed).

[Figure: Generations and object lifetimes]

The rationale behind this generational design is that most objects are short-lived, as the above graph shows. By collecting the young generation more frequently, we can avoid scanning the entire heap every time. Objects that survive repeated young collections (each time being copied from one survivor space to the other) are eventually promoted (or tenured) to the old generation.

Each generation is collected with a different algorithm. For example, the young generation is usually collected with a copying collector that moves objects between spaces. The old generation may use a mark-compact algorithm that slides live objects together to eliminate fragmentation.
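
If you want to see which memory areas your own JVM uses, the java.lang.management API can list them. This is a small sketch; the class name MemoryPools and the method describePools are made up here, and the pool names printed depend on the JVM version and the selected collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.List;

class MemoryPools {
    /** Returns one line per memory pool, tagged as heap or non-heap. */
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String kind = pool.getType() == MemoryType.HEAP ? "heap" : "non-heap";
            lines.add(pool.getName() + " (" + kind + ")");
        }
        return lines;
    }

    public static void main(String[] args) {
        for (String line : describePools()) {
            System.out.println(line);
        }
    }
}
```

On a typical HotSpot JVM you should see pools for the eden, survivor, and old areas reported as heap, and Metaspace reported as non-heap.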

The GC can be tuned by setting the relative sizes of the generations with JVM options. For instance, -XX:NewRatio=3 sets the ratio of old to young generation size to 3, meaning the young generation (Eden plus both survivor spaces) will be 1/3 the size of the old generation, or one quarter of the heap. A larger young generation fills up more slowly, so young collections run less often and more short-lived objects die before they ever need to be copied.
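
The arithmetic behind these ratio flags is easy to get wrong, so here is a small worked sketch. The class GenerationSizing and its methods are made up for illustration. It assumes the HotSpot meaning of the flags: NewRatio is old size divided by young size, and SurvivorRatio (default 8) is Eden size divided by the size of one survivor space. Real JVMs round these values and may resize generations adaptively, so treat the results as approximations.

```java
class GenerationSizing {
    /** NewRatio = old / young, so young takes 1 part out of (newRatio + 1). */
    static long youngBytes(long heapBytes, int newRatio) {
        return heapBytes / (newRatio + 1);
    }

    static long oldBytes(long heapBytes, int newRatio) {
        return heapBytes - youngBytes(heapBytes, newRatio);
    }

    /** SurvivorRatio = eden / one survivor, so young = eden + 2 survivors = (ratio + 2) parts. */
    static long survivorBytes(long youngBytes, int survivorRatio) {
        return youngBytes / (survivorRatio + 2);
    }

    static long edenBytes(long youngBytes, int survivorRatio) {
        return youngBytes - 2 * survivorBytes(youngBytes, survivorRatio);
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        long heap = 4096 * mib; // assume -Xmx4g
        long young = youngBytes(heap, 3);
        System.out.println("young:    " + young / mib + " MiB");
        System.out.println("old:      " + oldBytes(heap, 3) / mib + " MiB");
        System.out.println("eden:     " + edenBytes(young, 8) / mib + " MiB");
        System.out.println("survivor: " + survivorBytes(young, 8) / mib + " MiB (each)");
    }
}
```

For a 4 GiB heap with NewRatio=3 this gives a 1024 MiB young generation and a 3072 MiB old generation.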

Types of Garbage Collectors

Over the years, Java has provided a number of different garbage collectors optimized for various use cases. Here are some of the main ones:

  • Serial – single-threaded GC designed for basic, small applications
  • Parallel (aka Throughput Collector) – multi-threaded young generation GC designed to maximize throughput
  • Parallel Old – multi-threaded old generation GC
  • CMS (Concurrent Mark Sweep) – low-pause GC that collects the old generation concurrently with the application (deprecated in Java 9 and removed in Java 14)
  • G1 (Garbage First) – a server-style GC designed for large heaps, that aims to keep pause times low
  • ZGC (experimental in Java 11, production-ready since Java 15) – a low latency GC that performs nearly all of its work, including compaction, concurrently with the application
  • Shenandoah (experimental in Java 12, production-ready since Java 15) – another low latency GC that compacts concurrently

The default collector in Java 8 is the Parallel GC; since Java 9 the default is G1. You can select another collector with a command line flag passed to java, like -XX:+UseG1GC to enable the G1 collector. It's also possible to tune specific aspects of each collector's operation with dozens of JVM options.
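
To check which collector a JVM is really running, you can query the garbage collector MXBeans. This is a minimal sketch with made-up names (GcInfo, collectorNames). The bean names it prints are chosen by the JVM and differ between collectors and versions.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

class GcInfo {
    /** Returns the names of the collectors active in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Count and time are cumulative since JVM start; -1 means "not available"
            System.out.println(gc.getName()
                    + ": collections=" + gc.getCollectionCount()
                    + ", total time=" + gc.getCollectionTime() + " ms");
        }
    }
}
```

Try running it with and without a flag such as -XX:+UseG1GC and compare the names that are reported.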

Here's a quick comparison of the collectors (note that ZGC and Shenandoah required -XX:+UnlockExperimentalVMOptions while they were still experimental, before Java 15):

[Table: Serial, Parallel, ParallelOld, CMS, G1, ZGC, and Shenandoah compared on multithreaded young generation collection, multithreaded old generation collection, parallel compaction, concurrent marking, and concurrent compaction]

In general, applications that require low pause times (like web servers) should consider G1 (or CMS on older JVMs where it is still available), while batch applications that are not sensitive to pauses can use the Parallel collector. The newer ZGC and Shenandoah collectors target very low pauses even on large heaps, which suits latency-sensitive applications.

Garbage Collection Best Practices

While the JVM provides sophisticated garbage collection algorithms, application developers still need to be aware of memory usage and GC performance. Here are some best practices to keep in mind:

  1. Avoid creating unnecessary objects – Allocation in the JVM is cheap, but it is not free, and every object adds work for the collector. Reuse objects where it is natural, like by using a StringBuilder to concatenate strings in a loop. Avoid allocating in the hot paths of performance-critical code when a simple alternative exists.

  2. Minimize object lifetimes – Short-lived objects are collected cheaply by the young generation GC. Try to avoid holding references to objects that are no longer needed. Explicitly nulling references is rarely needed for local variables, but it helps for long-lived fields, arrays, and collections that would otherwise keep dead objects reachable.

  3. Beware of memory leaks – Just because Java has a GC doesn't mean you can't leak memory. Leaks often arise from holding references to objects in static fields or collections. Use a memory profiler to look for unexpected memory growth over time.

  4. Tune the GC – If GC pauses are impacting your application, consider tuning the GC algorithm and parameters. Increasing the heap size can reduce the frequency of GC cycles. Adjusting generation sizes can optimize for your app's memory usage patterns. Choosing a low-pause collector like G1, ZGC, or Shenandoah (or CMS on older JVMs) can keep your app responsive.

  5. Don't explicitly call System.gc() – Forcing a GC with System.gc() is almost never necessary or beneficial. It can actually hurt performance by forcing a full GC cycle when one may not be needed. Let the JVM decide when to collect!
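
As an illustration of the memory leak tip, a cache that only ever grows keeps every entry reachable, so the GC can never reclaim it. One common remedy is to bound the cache. The sketch below uses the standard LinkedHashMap eviction hook; the class name BoundedCache and the limit of 100 entries are assumptions made for this example, not a recommendation for your workload.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        super(16, 0.75f, true); // true = access order, so eviction is least-recently-used
        this.maxEntries = maxEntries;
    }

    /** Called by LinkedHashMap after each insertion; returning true evicts the eldest entry. */
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> cache = new BoundedCache<>(100);
        for (int i = 0; i < 10_000; i++) {
            cache.put(i, new byte[1024]); // older entries are evicted and become garbage
        }
        System.out.println("entries kept: " + cache.size()); // prints: entries kept: 100
    }
}
```

With a plain HashMap in a static field, the same loop would keep all 10,000 arrays alive for the lifetime of the class.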

By following these tips, you can get the most out of Java's automatic memory management and write efficient, responsive applications. Happy coding!

Conclusion

Garbage collection is a key feature of Java that makes developers' lives easier by automating memory management. But it's not a magic bullet – memory leaks and performance problems can still arise if you're not careful.

In this article, we took a deep dive into how garbage collection works in the JVM. We looked at how the heap is structured, how objects are allocated, and how the garbage collector identifies and frees dead objects. We also compared the different GC algorithms available and discussed best practices for writing memory-efficient code.

I hope this article has given you a better understanding of what goes on under the hood of Java's memory management. Knowing how the GC works can help you diagnose memory issues, tune your application's performance, and write cleaner code. So next time you see a GC pause in your application logs, you'll know exactly what's happening!

If you enjoyed this article, check out my other Java tutorials and guides on my blog. You can also find me on LinkedIn and GitHub. Happy coding!
