The Double Checked Locking confusion

Last Monday I attended a presentation by Brian Goetz about the Java Memory Model. One of his points was about lazy initialization, which he concluded with "don’t use the double-checked locking idiom". In last Thursday’s keynote at Devoxx, Joshua Bloch told the audience that if you really, really need performance you should use the double-checked locking idiom.

I was a little puzzled at first but after some reading it makes more sense. And they were both right!

Initializing a variable can be very expensive, for example because it requires a trip to a database or a costly computation. Only in that case should lazy initialization be considered. Otherwise use ‘normal’ initialization.

public class Bar {
  private final Foo instance = new Foo();
}

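In this case there aren’t any possible concurrency problems. The field is final and is assigned while the Bar object is constructed, so the final-field rules of the memory model guarantee that every thread sees a fully initialized Foo, as long as the this reference does not escape during construction. But if the initialization of the Foo instance is very expensive, lazy initialization could be considered. If the instance variable is static, the initialize-on-demand holder idiom can be used.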

public class Bar {
  private static class LazyFooHolder {
    public static final Foo instance = new Foo();
  }
  public static Foo getInstance() {
    return LazyFooHolder.instance;
  }
}

In this case the variable is initialized when the static nested class LazyFooHolder is initialized, and the JVM guarantees that class initialization is thread-safe, so we don’t have any concurrency problems. The LazyFooHolder class is initialized the first time it is referenced, which only happens when the getInstance method is called.
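The laziness of the holder idiom can be observed with a small sketch. The names ExpensiveFoo, HolderBar and describe are made up for this illustration; the counter only exists to show when the constructor runs.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an expensive object; counts how often it is built.
class ExpensiveFoo {
    static final AtomicInteger CREATED = new AtomicInteger();

    ExpensiveFoo() {
        CREATED.incrementAndGet();
    }
}

class HolderBar {
    // The nested holder class is initialized on first access to INSTANCE.
    private static class LazyFooHolder {
        static final ExpensiveFoo INSTANCE = new ExpensiveFoo();
    }

    static ExpensiveFoo getInstance() {
        return LazyFooHolder.INSTANCE;
    }

    // Calling this initializes HolderBar, but not LazyFooHolder.
    static String describe() {
        return "HolderBar is initialized";
    }
}

class HolderDemo {
    public static void main(String[] args) {
        HolderBar.describe();
        System.out.println("after describe(): " + ExpensiveFoo.CREATED.get());    // 0
        HolderBar.getInstance();
        HolderBar.getInstance();
        System.out.println("after getInstance(): " + ExpensiveFoo.CREATED.get()); // 1
    }
}
```

Using the outer class does not build the Foo; only the first call to getInstance does, and later calls reuse the same object.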

But what if lazy initialization is needed and the instance variable is not static? With the following example in a multi-threaded environment, the expensive initialization could be performed more than once, which is exactly what we are trying to avoid.

public class Bar {
  private Foo instance;
  public Foo getInstance() {
    if (instance == null) {
      instance = new Foo();
    }
    return instance;
  }
}
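To make the unlucky interleaving visible, here is a sketch that forces it. This is an illustration only: the names RacyFoo, RacyBar and RaceDemo are made up, and the CyclicBarrier is an artificial stand-in for a slow constructor. It keeps the first thread inside the constructor until a second thread has also passed the null check.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;
import java.util.concurrent.atomic.AtomicInteger;

class RacyFoo {
    static final AtomicInteger CREATED = new AtomicInteger();

    // The barrier simulates slow initialization: the constructor cannot
    // finish until a second thread is inside the constructor as well.
    RacyFoo(CyclicBarrier bothInside) {
        CREATED.incrementAndGet();
        try {
            bothInside.await();
        } catch (InterruptedException | BrokenBarrierException e) {
            throw new IllegalStateException(e);
        }
    }
}

class RacyBar {
    private final CyclicBarrier bothInside = new CyclicBarrier(2);
    private RacyFoo instance;

    RacyFoo getInstance() {
        if (instance == null) {                  // check
            instance = new RacyFoo(bothInside);  // act, not atomic with the check
        }
        return instance;
    }
}

class RaceDemo {
    // Returns how many RacyFoo objects two concurrent callers created.
    static int run() {
        int before = RacyFoo.CREATED.get();
        RacyBar bar = new RacyBar();
        Thread a = new Thread(bar::getInstance);
        Thread b = new Thread(bar::getInstance);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return RacyFoo.CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + run()); // prints "instances created: 2"
    }
}
```

In real code the window is much smaller and the double initialization happens only occasionally, which makes the bug hard to find.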

Because the check and the initialization don’t happen as one atomic action, we have the so-called check-then-act problem. To solve this problem the synchronized keyword can be used on the method getInstance. By making the getInstance method synchronized, a thread can only enter the method while it holds the lock on the Bar instance.

public class Bar {
  private Foo instance;
  public synchronized Foo getInstance() {
    if (instance == null) {
      instance = new Foo();
    }
    return instance;
  }
}

This should be the preferred solution for lazy initialization of a non-static instance variable.
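A quick way to convince yourself is to hammer the synchronized getter from several threads and count the constructor calls. The sketch below uses made-up names (CountedFoo, SynchronizedBar, SynchronizedDemo) and a CountDownLatch so that all threads start at roughly the same moment.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical Foo that counts how often it is constructed.
class CountedFoo {
    static final AtomicInteger CREATED = new AtomicInteger();

    CountedFoo() {
        CREATED.incrementAndGet();
    }
}

class SynchronizedBar {
    private CountedFoo instance;

    // The lock on this SynchronizedBar makes check and initialization atomic.
    synchronized CountedFoo getInstance() {
        if (instance == null) {
            instance = new CountedFoo();
        }
        return instance;
    }
}

class SynchronizedDemo {
    // Returns how many CountedFoo objects were created by threadCount callers.
    static int run(int threadCount) {
        int before = CountedFoo.CREATED.get();
        SynchronizedBar bar = new SynchronizedBar();
        CountDownLatch start = new CountDownLatch(1);
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(() -> {
                try {
                    start.await();       // wait so all threads race together
                    bar.getInstance();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            threads[i].start();
        }
        start.countDown();
        try {
            for (Thread t : threads) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return CountedFoo.CREATED.get() - before;
    }

    public static void main(String[] args) {
        System.out.println("instances created: " + run(8)); // prints "instances created: 1"
    }
}
```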

Because synchronization is a relatively expensive operation, some people came up with the double-checked locking idiom. First check, without a lock, whether the instance has been created. If the instance is null, acquire the lock and check again whether the instance is still null. If it is, create the instance and release the lock.

don’t use!!!

public class Bar {
  private Foo instance;
  public Foo getInstance() {
    if (instance == null) {
      synchronized (this) {
        if (instance == null) {
          instance = new Foo();
        }
      }
    }
    return instance;
  }
}

don’t use!!!

Because it is not guaranteed that the Foo instance is fully initialized before the reference to it is written to the instance variable, another thread could see a non-null reference and end up with a partially constructed Foo.
It took a while before it was proven that this idiom is broken, but it is, so don’t use it. This is exactly what Brian Goetz was saying.

Joshua Bloch said exactly the same thing about lazy initialization: just use the synchronized keyword if you want to lazily initialize an instance variable. But if performance really, really matters, you can use the fixed version of the double-checked locking idiom. For this you need the new JMM (Java 5), which gives some more guarantees about synchronization. The trick is to make the instance variable volatile. The volatile rule: a write to a volatile variable happens-before every subsequent read of that same volatile variable.

public class Bar {
  private volatile Foo instance;
  public Foo getInstance() {
    Foo result = instance;
    if (result == null) {
      synchronized (this) {
        result = instance;
        if (result == null) {
          instance = result = new Foo();
        }
      }
    }
    return result;
  }
}

In this example an extra local variable result is used, which seems unnecessary. Strictly speaking it is, but Joshua claims that using the local variable gives a 25% performance gain on his machine. The local variable ensures that the volatile field is read only once in the common case where the instance is already initialized, and a volatile read is more expensive than reading a local variable.
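If the fixed idiom is needed in more than one place, it could be wrapped in a small helper so the volatile keyword and the local variable can’t be forgotten. The sketch below is my own illustration, not code from either talk; the LazyRef name and the Supplier-based constructor are assumptions. It also assumes the factory never returns null, because null is used as the "not yet initialized" marker.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical helper: lazily computes one value per LazyRef instance.
final class LazyRef<T> {
    private final Supplier<T> factory;
    // volatile is essential; without it this is the broken idiom again.
    private volatile T value;

    LazyRef(Supplier<T> factory) {
        this.factory = factory;
    }

    T get() {
        T result = value;              // single volatile read on the fast path
        if (result == null) {
            synchronized (this) {
                result = value;        // second check, now under the lock
                if (result == null) {
                    value = result = factory.get();
                }
            }
        }
        return result;
    }
}

class LazyRefDemo {
    // Calls get() from several threads and returns how often the factory ran.
    static int countFactoryCalls(int threadCount) {
        AtomicInteger calls = new AtomicInteger();
        LazyRef<String> ref = new LazyRef<>(() -> {
            calls.incrementAndGet();
            return "expensive value";
        });
        Thread[] threads = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++) {
            threads[i] = new Thread(ref::get);
            threads[i].start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return calls.get();
    }

    public static void main(String[] args) {
        System.out.println("factory calls: " + countFactoryCalls(8)); // prints "factory calls: 1"
    }
}
```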

Because it is easy to misunderstand Joshua’s code, or even to forget the volatile keyword on the instance variable, I agree with Brian not to use double-checked locking. But there are some use cases that need the best performance one can get. In those cases I would use the fixed version of the idiom, but with care and some good comments in the code.

To conclude:

* Only use lazy initialization when really needed; otherwise use ‘normal’ initialization
* When the instance is static, use the initialize-on-demand holder idiom
* Use a synchronized method if the instance variable is non-static
* When performance is really important, use the fixed double-checked locking with a volatile variable and some comments in the code. For best performance, copy Joshua’s example.

Some references:

* http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
* http://jeremymanson.blogspot.com/
* http://www.javaworld.com/jw-02-2001/jw-0209-double.html
* http://www.briangoetz.com/pubs.html
