Wednesday, July 09, 2008

Type-safe Builder Pattern in Scala

The Builder Pattern is an increasingly popular idiom for object creation. Traditionally, one of it's shortcomings in relation to simple constructors is that clients can try to build incomplete objects, by omitting mandatory parameters, and that error will only show up in runtime. I'll show how to make this verification statically in Scala.


So, let's say you want to order a shot of scotch. You'll need to ask for a few things: the brand of the whiskey, how it should be prepared (neat, on the rocks or with water) and if you want it doubled. Unless, of course, you are a pretentious snob, in that case you'll probably also ask for a specific kind of glass, brand and temperature of the water and who knows what else. Limiting the snobbery to the kind of glass, here is one way to represent the order in scala.
sealed abstract class Preparation  /* This is one way of coding enum-like things in scala */
case object Neat extends Preparation
case object OnTheRocks extends Preparation
case object WithWater extends Preparation

sealed abstract class Glass
case object Short extends Glass
case object Tall extends Glass
case object Tulip extends Glass

case class OrderOfScotch(val brand:String, val mode:Preparation, val isDouble:Boolean, val glass:Option[Glass])

A client can instantiate their orders like this:
val normal = new OrderOfScotch("Bobby Runner", OnTheRocks, false, None)
val snooty = new OrderOfScotch("Glenfoobar", WithWater, false, Option(Tulip));

Note that if the client doesn't want to specify the glass he can pass None as an argument, since the parameter was declared as Option[Glass]. This isn't so bad, but it can get annoying to remember the position of each argument, specially if many are optional. There are two traditional ways to circumvent this problem — define telescoping constructors or set the values post-instantiation with accessors — but both idioms have their shortcomings. Recently, in Java circles, it has become popular to use a variant of the GoF Builder pattern. So popular that it is Item 2 in the second edition of Joshua Bloch's Effective Java. A Java-ish implementation in Scala would be something like this:
class ScotchBuilder {
private var theBrand:Option[String] = None
private var theMode:Option[Preparation] = None
private var theDoubleStatus:Option[Boolean] = None
private var theGlass:Option[Glass] = None

def withBrand(b:Brand) = {theBrand = Some(b); this} /* returning this to enable method chaining. */
def withMode(p:Preparation) = {theMode = Some(p); this}
def isDouble(b:Boolean) = {theDoubleStatus = some(b); this}
def withGlass(g:Glass) = {theGlass = Some(g); this}

def build() = new OrderOfScotch(theBrand.get, theMode.get, theDoubleStatus.get, theGlass);
}

This is almost self-explanatory, the only caveat is that verifying the presence of non-optional parameters (everything but the glass) is done by the Option.get method. If a field is still None, an exception will be thrown. Keep this in mind, we'll come back to it later.

The var keyword prefixing the fields means that they are mutable references. Indeed, we mutate them in each of the building methods. We can make it more functional in the traditional way:
object BuilderPattern {
class ScotchBuilder(theBrand:Option[String], theMode:Option[Preparation], theDoubleStatus:Option[Boolean], theGlass:Option[Glass]) {
def withBrand(b:String) = new ScotchBuilder(Some(b), theMode, theDoubleStatus, theGlass)
def withMode(p:Preparation) = new ScotchBuilder(theBrand, Some(p), theDoubleStatus, theGlass)
def isDouble(b:Boolean) = new ScotchBuilder(theBrand, theMode, Some(b), theGlass)
def withGlass(g:Glass) = new ScotchBuilder(theBrand, theMode, theDoubleStatus, Some(g))

def build() = new OrderOfScotch(theBrand.get, theMode.get, theDoubleStatus.get, theGlass);
}

def builder = new ScotchBuilder(None, None, None, None)
}

The scotch builder is now enclosed in an object, this is standard practice in Scala to isolate modules. In this enclosing object we also find a factory method for the builder, which should be called like so:
import BuilderPattern._

val order = builder withBrand("Takes") isDouble(true) withGlass(Tall) withMode(OnTheRocks) build()

Looking back at the ScotchBuilder class and it's implementation, it might seem that we just moved the huge constructor mess from one place (clients) to another (the builder). And yes, that is exactly what we did. I guess that is the very definition of encapsulation, sweeping the dirt under the rug and keeping the rug well hidden. On the other hand, we haven't gained all the much from this "functionalization" of our builder; the main failure mode is still present. That is, having clients forget to set mandatory information, which is a particular concern since we obviously can't fully trust the sobriety of said clients*. Ideally the type system would prevent this problem, refusing to typecheck a call to build() when any of the non-optional fields aren't set. That's what we are going to do now.

One technique, which is very common in Java fluent interfaces, would be to write an interface for each intermediate state containing only applicable methods. So we would begin with an interface VoidBuilder having all our withFoo() methods but no build() method, and a call to, say, withMode() would return another interface (maybe BuilderWithMode), and so on, until we call the last withBar() for a mandatory Bar, which would return an interface that finally has the build() method. This technique works, but it requires a metric buttload of code — for n mandatory fields 2n interfaces should be created. This could be automated via code generation, but there is no need for such heroic efforts, we can make the typesystem work in our favor by applying some generics magic. First, we define two abstract classes:
abstract class TRUE
abstract class FALSE

Then, for each mandatory field, we add to our builder a generic parameter:
class ScotchBuilder[HB, HM, HD](val theBrand:Option[String], val theMode:Option[Preparation], val theDoubleStatus:Option[Boolean], val theGlass:Option[Glass]) {

/* ... body of the scotch builder .... */

}

Next, have each withFoo method pass ScotchBuilder's type parameters as type arguments to the builders they return. But, and here is where the magic happens, there is a twist on the methods for mandatory parameters: they should, for their respective generic parameters, pass instead TRUE:
class ScotchBuilder[HB, HM, HD](val theBrand:Option[String], val theMode:Option[Preparation], val theDoubleStatus:Option[Boolean], val theGlass:Option[Glass]) {
def withBrand(b:String) =
new ScotchBuilder[TRUE, HM, HD](Some(b), theMode, theDoubleStatus, theGlass)

def withMode(p:Preparation) =
new ScotchBuilder[HB, TRUE, HD](theBrand, Some(p), theDoubleStatus, theGlass)

def isDouble(b:Boolean) =
new ScotchBuilder[HB, HM, TRUE](theBrand, theMode, Some(b), theGlass)

def withGlass(g:Glass) =
new ScotchBuilder[HB, HM, HD](theBrand, theMode, theDoubleStatus, Some(g))
}

The second part of the magic act is to apply the world famous pimp-my-library idiom and move the build() method to an implicitly created class, which will be anonymous for the sake of simplicity:
implicit def enableBuild(builder:ScotchBuilder[TRUE, TRUE, TRUE]) = new {
def build() =
new OrderOfScotch(builder.theBrand.get, builder.theMode.get, builder.theDoubleStatus.get, builder.theGlass);
}

Note the type of the parameter for this implicit method: ScotchBuilder[TRUE, TRUE, TRUE]. This is the point where we "declare" that we can only build an object if all the mandatory parameters are specified. And it really works:
scala> builder withBrand("hi") isDouble(false) withGlass(Tall) withMode(Neat) build()
res5: BuilderPattern.OrderOfScotch = OrderOfScotch(hi,Neat,false,Some(Tall))

scala> builder withBrand("hi") isDouble(false) withGlass(Tall) build()
<console>:9: error: value build is not a member of BuilderPattern.ScotchBuilder[BuilderPattern.TRUE,BuilderPattern.FALSE,BuilderPattern.TRUE]
builder withBrand("hi") isDouble(false) withGlass(Tall) build()

So, we achieved our goal (see the full listing below). If you are worried about the enormous parameter lists inside the builder, I've posted here an alternative implementation with abstract members instead. It is more verbose, but also cleaner.

Now, remember those abstract classes TRUE and FALSE? We never did subclass or instantiate them at any point. If I'm not mistaken, this is an idiom named Phantom Types, commonly used in the ML family of programming languages. Even though this application of phantom types is fairly trivial, we can glimpse at the power of the mechanism. We have, in fact, codified all 2n states (one for each combination of mandatory fields) as types. ScotchBuilder's subtyping relation forms a lattice structure and the enableBuild() implicit method requires the supremum of the poset (namely, ScotchBuilder[TRUE, TRUE, TRUE]). If the domain requires, we could specify any other point in the lattice — say we can doll-out a dose of any cheap whiskey if the brand is not given, this point is represented by ScotchBuilder[_, TRUE, TRUE]. And we can even escape the lattice structure by using Scala inheritance. Of course, I didn't invent any of this; the idea came to me in this article by Matthew Fluet and Riccardo Pucella, where they use phantom types to encode subtyping in a language that lacks it.



object BuilderPattern {
sealed abstract class Preparation
case object Neat extends Preparation
case object OnTheRocks extends Preparation
case object WithWater extends Preparation

sealed abstract class Glass
case object Short extends Glass
case object Tall extends Glass
case object Tulip extends Glass

case class OrderOfScotch(val brand:String, val mode:Preparation, val isDouble:Boolean, val glass:Option[Glass])

abstract class TRUE
abstract class FALSE

class ScotchBuilder
[HB, HM, HD]
(val theBrand:Option[String], val theMode:Option[Preparation], val theDoubleStatus:Option[Boolean], val theGlass:Option[Glass]) {
def withBrand(b:String) =
new ScotchBuilder[TRUE, HM, HD](Some(b), theMode, theDoubleStatus, theGlass)

def withMode(p:Preparation) =
new ScotchBuilder[HB, TRUE, HD](theBrand, Some(p), theDoubleStatus, theGlass)

def isDouble(b:Boolean) =
new ScotchBuilder[HB, HM, TRUE](theBrand, theMode, Some(b), theGlass)

def withGlass(g:Glass) = new ScotchBuilder[HB, HM, HD](theBrand, theMode, theDoubleStatus, Some(g))
}

implicit def enableBuild(builder:ScotchBuilder[TRUE, TRUE, TRUE]) = new {
def build() =
new OrderOfScotch(builder.theBrand.get, builder.theMode.get, builder.theDoubleStatus.get, builder.theGlass);
}

def builder = new ScotchBuilder[FALSE, FALSE, FALSE](None, None, None, None)
}



* Did you hear that noise? It's the sound of my metaphor shattering into a million pieces



EDIT 2008-07-09 at 19h00min: Added introductory paragraph.

19 comments:

Henry Ware said...

Great article, thanks.

Justin said...

This technique is really cool. I came to it through Haskell. You'll find you can start encoding things much more complex than true and false. Lists, numbers and more can be encoded in the types. Haskell's typeclass plus inference amount to a logic programming language in the type system. One of the best articles I've read which explains how these work is "Type-Level Instant Insanity" by Conrad Parker in The Monad Reader, Issue 8 (http://www.haskell.org/sitewiki/images/d/dd/TMR-Issue8.pdf)

I also coded up a version of your problem in Haskell. Building is done by pipelining the value through composition. For example, to make a scotch order neat with Dewars:

buildA . withPreparation Neat . withBrand "Dewars" $ scotchA

Preparation and brand are required, so if I try build a scotch without it I get a runtime error:

> buildA . withPreparation Neat $ scotchA

Phantom types can be used to ensure required arguments are provided just as you described. The type ScotchB has two phantom types, representing if a preparation and a brand have been specified:

data ScotchB hasPrep hasBrand = ...

The functions to set brand and preparation then transform their type to specify the property is set:

withBrandB :: ... -> ScotchB p b -> ScotchB p HasBrand

withPrepB :: ... -> ScotchB p b -> ScotchB HasPrep b

When a scotch is first made, its type says no parameters are set yet:

scotchB :: ScotchB NoPrep NoBrand

And the builder only takes a scotch with the parameters set:

buildB :: ScotchB HasPrep HasBrand

The entire module is pasted below as a literate haskell program. Copy and paste it out of this comment, save as "Scotch.lhs" and load into a Haskell interpreter. Try building an invalid scotch with buildA:

buildA . withPrepA Neat $ scotchA

versus with an invalid scotch with buildB:

buildB . withPrepB Neat $ scotchB

> data Preparation = Neat | OnTheRocks | WithWater deriving Show
> data Glass = Short | Tall | Tulip deriving Show
>
> type Brand = String
>
> data ScotchA = ScotchA { brandA :: Brand, prepA :: Preparation, doubleA :: Bool, glassA :: (Maybe Glass) } deriving Show
>
> withBrandA :: Brand -> ScotchA -> ScotchA
> withBrandA brand scotch = scotch { brandA = brand }
>
> withPrepA :: Preparation -> ScotchA -> ScotchA
> withPrepA prep scotch = scotch { prepA = prep }
>
> withGlassA :: Glass -> ScotchA -> ScotchA
> withGlassA glass scotch = scotch { glassA = Just glass}
>
> noGlassA :: ScotchA -> ScotchA
> noGlassA scotch = scotch { glassA = Nothing }
>
> scotchA = ScotchA (error "No brand given") (error "No preparation set.") False Nothing
>
> -- Use as
> --
> -- buildA . withPrepA Neat . withBrandA "Dewars" $ scotchA
> --
> -- Don't forget all necessary arguments!
> --
> -- buildA . withPrepA Neat $ scotchA
> --
> -- gives an error.
> buildA :: ScotchA -> ScotchA
> buildA = id
>
> data ScotchB hasPreparation hasBrand = ScotchB ScotchA
>
> data NoPrep = NotPrep
> data NoBrand = NotBrand
> data HasPrep = HasPrep
> data HasBrand = HasBrand
>
> withPrepB :: Preparation -> ScotchB p b -> ScotchB HasPrep b
> withPrepB prep (ScotchB scotch) = ScotchB (withPrepA prep scotch)
>
> withBrandB :: Brand -> ScotchB p b -> ScotchB p HasBrand
> withBrandB brand (ScotchB scotch) = ScotchB (withBrandA brand scotch)
>
> withGlassB :: Glass -> ScotchB p b -> ScotchB p b
> withGlassB glass (ScotchB scotch) = ScotchB (withGlassA glass scotch)
>
> noGlassB :: ScotchB p b -> ScotchB p b
> noGlassB (ScotchB scotch) = ScotchB (noGlassA scotch)
>
> scotchB :: ScotchB NoPrep NoBrand
> scotchB = ScotchB scotchA
>
> -- Must provide required arguments:
> -- buildB . withPrepB Neat . withBrandB "Dewars" $ scotchB
> --
> -- works but
> --
> -- buildB . withBrandB "Dewars" $ scotchB
> --
> -- Gives a type error.
> buildB :: ScotchB HasPrep HasBrand -> ScotchA
> buildB (ScotchB scotch) = scotch

Rafael de F. Ferreira said...

@henry
Sure thing. Thank you.

@justin
That's an awesome comment, thanks! It reminded me yet again that I really need to learn Haskell properly.

Chris said...

My but it's sad that Java/Scala has to resort to a design pattern for this (if I'm understanding the post correctly).

In Python, we have keyword arguments and default argument values, which allows for very rich parameter declarations. So we can just do:

#assuming only brand is mandatory. just remove the '=whatever' in the parameter declaration to make a parameter mandatory
class OrderOfScotch(object):
def __init__(self, brand, mode="OnTheRocks", isDouble=False, glass="short"):
self.brand = brand
self.mode = mode
self.glass = glass
self.isDouble = isDouble

#and now to create some scotch orders...
#only specify brand
a = OrderOfScotch("Scotty Dog")
#specify positionally
b = OrderOfScotch("Scotchy", "WithWater", True, "tall")
#specify using keywords
#note how order isn't significant
c = OrderOfScotch("Bobby Runner", glass="Tulip", isDouble=True,)

paulk said...
This comment has been removed by the author.
paulk said...

In Groovy I would be tempted to do something like:

enum Preparation { OnTheRocks, WithWater, Neat }
enum Glass { Short, Tall, Tulip }

// optional static imports to make things a little prettier below
import static Preparation.OnTheRocks
import static Preparation.WithWater
import static Glass.Tulip
import static Glass.Tall

class OrderOfScotch {
  String brand
  Preparation prep
  boolean isDouble
  Glass glass
  OrderOfScotch(b, p, d, g=Tall) {
    prep = p; brand = b; isDouble = d; glass = g
  }
}

normal = new OrderOfScotch("Bobby Runner", OnTheRocks, false)
snooty = new OrderOfScotch("Glenfoobar", WithWater, false, Tulip)

This keeps the strong typing without too much pain. You could of course use named parameters in the constructor if you wanted instead of the explicit constructor.

harshad said...

Good article, though I am still trying to digest the last part.

@Chris
I agree that the scala/java way is cumbersome, but there is a good reason why the title mentions "type-safe" builder. Try the following in python and watch it crash:
d = OrderOfScotch("NoBoolean", isDouble="Gotcha")

Rob said...

I think I can see a limitation: you lose the ability to actually build the end result in stages. For example, the following only compiles in the first (none type-safe) version:

var b = builder
b = b withBrand("hi")
b = b isDouble(false)
b = b withGlass(Tall)
//b = b withMode(Neat)
b build()

Ideally, your pattern would cause the compiler to complain only about the last line (which calls the build method), but it complains about the 2nd, 3rd and 4th as well.

James Iry said...

Paul,

I think you missed the point of the article which was to show how to use phantom types to create a statically type safe builder with some optional and some required elements. It had nothing to do with named parameters.

Rob,

You're trying to assign values of different types to the same variable. If you want to build in stages, try this instead

val b1 = builder
val b2 = b1 withBrand("hi")
val b3 = b2 isDouble(false)
val b4 = b3 withGlass(Tall)
val b5 = b4 withMode(Neat)
val b = b5.build()

Rob said...

James,

You are indeed correct. Many thanks!

Rafael de F. Ferreira said...

I responded to comments in a new post.

Greg Allen said...

Neat concept. Can even be used in Java, with only a tiny bit of ugliness. Seems to need the build() method to be static, as it needs method overloading on the type of the builder instance. Fragments:

public static Thing build(ThingBuilder<Supplied, Supplied> builder) { return builder.build();
}

public static final ThingBuilder<Missing, Missing> BUILDER = new ThingBuilder<Missing, Missing>(0, false);

final Thing thing = ThingBuilder.build(ThingBuilder.BUILDER
.withSize(1).withAlive(true));

andrew123 said...

I think you missed the point of the article which was to show how to use phantom types to create a statically type safe builder with some optional and some required elements.
===================================
Andrew William

Link Building

Przemek said...

Hi Rafael, the technique is nice. However the abstract of the post is wrong: traditional Builder has no problems with mandatory parameters - just place them on Builder constructor's arg list. Type safety guaranteed and KISS rule preserved ;)

You could promote this technique presenting its other benefits. Currently it seems that there are not many.

sp said...

If you are building a DSL, there certainly is a benefit to doing it this way. Instead of specifying all of the arguments upfront which is what it would come out to be without using this typesafe builder you would be able to specify them in a more builder style one at a time to create a more fluent language.

Przemek said...

Ok, when you build a DSL this approach makes sense.

Susan said...

Rafael, Great description on the technique used.

harveywi said...

Hi Rafael,

Thanks for your great post! I was inspired by your idea and I wanted to see if I could use the Scala shapeless library to implement the builder pattern.

Here is a link to the Github repository:
https://github.com/harveywi/shapeless-builder

And here is a link to the announcement that I put on the shapeless-dev listserv:
https://groups.google.com/forum/#!topic/shapeless-dev/y96yaNaSCGc

Cheers,

William Harvey

Torgeir said...

I'm late to the party; but what an awesome post!

Suggestions for further improvement: make your imlicit not return an instance with a build method, but rather make it build your instance, ridding the need for a build() method completely