An Intuition for Optics
In this article, we are going to explore a particular abstraction called optics. Like many abstractions, their practical application and benefits are not immediately obvious. We will define optics, while also simultaneously demonstrating their application. We will use the Scala programming to demonstrate code examples.
Abstraction
In the early 2000's, uttering the word "monad" on the street, even quietly, could almost land you in a jail cell overnight. Today, the benefits of monads are generally better understood: to prevent the repetition of work. Since that time, we have all argued over The Monad Tutorial and its value in conveying helpful information to others. On monads, we have arrived at a general acceptance.
Prior to this acceptance, we (at least I) would often encounter software projects where it was very clear that intelligent programmers have gone to great effort to essentially reinvent the benefits of monads. However, without a vocabulary and intuition for monads, that implementation often carried some unnecessary and significant penalties. This observation inspired many "monad tutorials" in an attempt to point out this insight to others, to give a little helping hand over the fence, to observe and exploit the collective result of the work that extends an already-existing intuition.
On optics, this acceptance is not so developed. This is unfortunate, given that the practical benefit is very much aimed toward the same goal, though in practice and concrete outcomes, the magnitude is almost always much greater than that for monads. That is to say, one of the primary benefits of optics is to prevent the repetition of work; and it's not just by a little bit.
Monad is an abstraction and the primary practical benefit comes from exploiting that abstraction. What do we mean by this? An explanation that I believe is unhelpful is describing several examples of instances of the monad interface. For example, "Option
is a monad, and so is List
", but this misses the point and observers of such commentary are left understandably confused. This does not describe the concrete practical benefits that we can obtain because Option
is a monad and List
is a monad.
The underlying principles of an abstraction are the following two fundamentals:
- to minimise the amount of necessary work in witnessing the abstraction
- to maximise the amount of free work that comes about from having witnessed that abstraction
Let's provide an example for the monad abstraction. To meet the first principle, we must provide an implementation of the monad interface for our value F
:
bind[A, B]: (A => F[B]) => F[A] => F[B]
pure[A]: A => F[A]
- satisfying laws of identity and associativity
In other words, the monad interface declares:
"If you can witness (implement) these simple requirements on my interface of
bind
,pure
, and these laws, then I will provide you with lots of free work that has already been done."
A good abstraction will minimise the amount of this necessary work, while maximising the amount of free work that comes about from meeting this contract. What free work do we get in return?
We would, for example, get the sequence
function, for free. That is, since we have done the minimal work of witnessing the monad interface for some concrete value (which we will call F
), we can now, "turn a list of our value (F) of elements into our value (F) of a list of elements."
def sequence[F[_]: Monad, A]: List[F[A]] => F[List[A]] =
...
There are many other such functions that we get for free, by having performed that initial work of witnessing the monad interface. We can use sequence
, and many other functions, merely because we witnessed a monad instance. All of this comes for free — no work required from us. This is the primary practical benefit of the monad interface, and any other abstraction.
A common problem in developing an intuition for abstraction is in recognising the concrete benefits of that free work. "So I now have all this free functionality by having done a small amount of work — so what?" It is often not immediately obvious where the benefits can actually be applied.
Consider these three code examples written using pseudo-code:
// >>> given([notnull 1,null,notnull 3])
// null
// >>> given([notnull 1,notnull 2,notnull 3])
// notnull [1,2,3]
given(list) {
var r = List.empty
for(var i = 0; i < list.length; i++) {
if(list[i] == null)
return null
else
r.add(list[i])
}
return r
}
// >>> given([(+1),(*2),(/3)], 99)
// [100,198,33]
given(list, x) {
var r = List.empty
for(var i = 0; i < list.length; i++) {
r.add(list[i].apply(x))
}
return r
}
// >>> given([[1],[3,4],[6,7]])
// [[1,3,6],[1,3,7],[1,4,6],[1,4,7]]
given(list) {
var r = List.empty;
for(var i = 0; i < list.length; i++) {
var s = List.empty
for(var j = 0; j < list[i].length; j++) {
s.add(list[i][j])
}
r.append(s)
}
return r
}
This pseudo-code above might be described as code that you might typically encounter on a software project. Hopefully, you'd agree. However, did you notice the code repetition? Possibly not, and it is that code repetition that we can eliminate by exploiting an abstraction. Code repetition?
Claim: They are all the same function.
Let's examine this claim. First, observe that all three of these functions have a similar structure. Let's describe the structure, or type, of each of the above three functions:
List[PossiblyNull[ X]] => PossiblyNull[ List[X]]
List[Function1[Int, X]] => Function1[Int, List[X]]
List[List[ X]] => List[ List[X]]
This more general structure is evident in each of these functions with some careful observation. We can generalise this structure to a single description: List[F[X]] => F[List[X]]
for varying values of F
in each of the examples.
List[PossiblyNull[ X]] => PossiblyNull[ List[X]]
List[Function1[Int, X]] => Function1[Int, List[X]]
List[List[ X]] => List[ List[X]]
List[F[ X]] => F[ List[X]]
We can then ask the question:
What is the minimal requirement on
F
such that these three functions can be implemented all at once?
The answer to this question is the Monad
interface1. And in fact, all three of these functions re-implement the sequence
function described earlier. Can you see how that might not be immediately obvious to the author who wrote all three of these functions (and possibly more)? If it is not clear, take a closer look, since the implementations themselves, do precisely what a single function already does, if only each value for F
can meet the Monad
interface (tip: they can).
It's a big ask at this point. In order to exploit this abstraction so as to first recognise, then prevent, the code repetition, a programmer must put all of this intuition together, internalise it and speak fluently with it. This is quite a significant request! Let's not dismiss the size of this demand of a programmer. It is important in working and communicating with other programmers and this factor can easily be disregarded. Let's not do that.
It is this human factor in understanding and exploiting abstraction that we will attempt to overcome in this article, though not for the monad abstraction, but for optics. We will use the same two fundamental principles of abstraction; to minimise amount of initial work, from which we derive and maximise some amount of free work.
A Brief History of Optics
You may have come across the term "lens" in relation to optics. While a lens is one specific example of an optic (which we will see later), the emphasis on lens is an accident of history. Lenses were first described in terms of relational database theory2, where we have relations (or records).
While lenses pertain specifically to database relations, since that original work, other efforts have been made toward achieving similar benefits for other types of relationships, with varying success. Nevertheless, we have arrived at a general consensus of being able to describe the relevant relationships to general programming, of which lens is just one example, and that general structure is called an optic.
Optics
So let's get started on optics.
Suppose we are at the ACME Vehicle Registration Centre and have a number of "domain objects" for our software application:
case class Person(name: String, age: Int)
case class VehicleRegistration(num: String, owner: Person)
case class Vehicle(make: String, reg: VehicleRegistration)
We have a phone centre, where someone can ring up and do three things:
- Request the age of the registered own of a vehicle.
- Submit a "birthday request" for a vehicle because the owner's age has incremented by
1
. - Update a vehicle registration number because they have purchased personalised number plates.
To meet the first use-case, we don't need to do any work. After all, the Scala programming language provides the necessary functions already: def getAge = _.reg.owner.age
.
However, the other two require some work, so we implement these use-cases on our domain objects:
case class Person(name: String, age: Int) {
def birthday: Person =
Person(name, age+1)
}
case class VehicleRegistration(num: String, owner: Person) {
def newNumber(n: String): VehicleRegistration =
VehicleRegistration(n, owner)
def modifyOwner(fn: Person => Person): VehicleRegistration =
VehicleRegistration(num, fn(owner))
}
case class Vehicle(make: String, reg: VehicleRegistration) {
def birthday: Vehicle =
Vehicle(make, reg.modifyOwner(_.birthday))
}
One day, the project team decides we have a fourth new use-case: we need to be able capitalise the name of a vehicle's registered owner. So we implement that use-case as well.
Wouldn't it be nice to implement all four use-cases, and any future use-cases all at once? Perhaps more importantly, we can anticipate less trivial requirements in the future. For example, a person now has an account balance, and given a list of vehicles, those with a vehicle registration starting with 'X'
, give them a $100 bonus to their account. Oh, and vehicles don't always have a make anymore, but they might also have a colour, oh wait, and the vehicle registration also has the state of registration.
Would be nice, wouldn't it? We need an abstraction.
Let's make some observations about the relationships between our domain objects.
- a
Vehicle
has aString
make field and some other parts. - a
Vehicle
has aVehicleRegistration
reg field and some other parts. - a
VehicleRegistration
has aString
num field and some other parts. - a
VehicleRegistration
has aPerson
owner field and some other parts. - a
Person
has aString
name field and some other parts. - a
Person
has aInt
age field and some other parts.
I want to draw your attention to this part of the relationships that is common to all:
a
Something
has aSomethingElse
field and some other parts.
This relationship describes a lens. Specifically, there is a Lens
from Something
to SomethingElse
and that relationship is a lens, precisely because Something
has exactly one SomethingElse
and some other parts.
We could write a lens, for example, from Vehicle
to VehicleRegistration
, using the monocle library. Keep in mind that in terms of the principles of abstraction, we are implementing the first step here. In other words, we are witnessing the lens, so that we can pick up all that free work that comes about by having implemented that minimal amount of work. The free work, by the way, is all the use-cases listed so far, those that we anticipate, and even those we haven't thought of yet. Here is the implementation of the lens:
def vehicleRegistration: Lens[Vehicle, VehicleRegistration] =
Lens[Vehicle, VehicleRegistration]
(_.reg)
(r => t => VehicleRegistration(t.make, r))
We could continue, and write a lens for all of the relationships above. Notice our use of vocabulary earlier:
"due to observing this relationship, there is a lens from something to something else"
We could back up for a minute and replace the word "lens" with the word "function":
"... there is a function from something to something else"
Indeed, there is a function from something to something else, given a lens from something to something else, but not the other way around. For example, there exists a function from Vehicle
to VehicleRegistration
. This relationship, where we have a function (not a lens), is called a Getter
. Given a Lens
, we can obtain a Getter
, but a Lens
is more than a Getter
, and so by implementing the lens, a stronger constraint, we get more free work than if we only had the getter.
Let's also observe a relationship between lenses in that they are closed under composition, similar to that of functions. Hopefully it is clear that if we have a function from Vehicle
to VehicleRegistration
, and a function from VehicleRegistration
to Person
, then by composition, we can get a function from Vehicle
to Person
. This closure of composition is also true of lenses. Given a Lens[Vehicle, VehicleRegistration]
and Lens[VehicleRegistration, Person]
, then by composition, we can get a Lens[Vehicle, Person]
.
Anything that has this property is said to be a category3 — yes that category, from category theory. Both functions, and lenses, are morphisms for a category.
OK, so we have a bunch of lenses. We have met the obligations of our abstraction. How do we achieve our use-cases? What free work have we obtained?
- Request the age of the registered own of a vehicle.
- We have a
Lens[Vehicle, Person]
by composition. - We ask that lens for its
Getter
.
- Submit a "birthday request" for a vehicle because the owner's age has incremented by
1
.
- We have a
Lens[Vehicle, Person]
by composition. - We ask that lens for its
modify
operation and pass it(+1)
.
- Update a vehicle registration number because they have purchased personalised number plates.
- We have a
Lens[Vehicle, VehicleRegistration]
. - We ask that lens for its
set
operation and pass the new registration number.
The monocle library has many other functions (free work), that come about because we have written our lenses and those functions cover most use-cases for data types that have the lens relationship.
But what about other relationships? Consider this data type:
sealed trait Json
case class JsonString(s: String) extends Json
case class JsonNull() extends Json
...
It is not correct to say that a Json
has a String
and some other parts. We are looking at a different relationship here between Json
and String
. The relationship may be expressed as follows:
a
Json
has aString
field or some other parts.
Note the difference in the relationship here to that of lenses. The relationship being described here is called a prism. Just like a lens is a type of optic, so is a prism, which is an abstraction standing in for this relationship. Let's write the prism using the monocle library:
def _JsonString: Prism[Json, String] =
Prism[Json, String] {
case JsonString(s) => Some(s)
case JsonNull() => None
...
}(JsonString(_))
Just like functions and lenses, prisms are also closed under composition. That is to say, if we have a Prism[A, B]
and a Prism[B, C]
, then under composition, we can get a Prism[A, C]
. This property is quite useful for navigating down a Json
data structure and potentially applying an update to any value we find there: we simply compose our prisms and use the modify
operation of the prism. Also like lenses, prisms come with a large library of free functions that we can use because we have done the ground work of writing our prisms.
Let's make all this a bit more real, and detailed.
Here are some data types taken from ChatRoulette production code:
sealed trait Active
case object WaitingForPartner extends Active
case object Connecting extends Active
case object Connected extends Active
case class CorrelationId(value: UUID)
case class Conversation(
st: Active, cId: CorrelationId, chn: Matching.Channel)
sealed trait ChatState
case object Stopped extends ChatState
case class ActiveChatState(ch: Conversation) extends ChatState
sealed trait ClientState
case object Unattached extends ClientState
case class Attached(
tId: TrackingId, state: ChatState) extends ClientState
There are lots of different relationships to be discussed here. Just to keep things simple for now, let's first look at the ClientState
data type. Suppose we have a use-case where we want to "get the TrackingId
out of the ClientState
." Of course, not all ChatState
values will carry a TrackingId
. Let's assume we are not familiar with optics, and look at what might happen in the evolution of our code.
We might start off with a few usages of this data type, where we are specifically interested in the TrackingId
:
def numberOfThingos(s: ClientState): Int =
s match {
case Unattached => 99
case Attached(tId, _) => someFunction(tId)
}
def isAttached(s: ClientState): Boolean =
s match {
case Unattached => false
case Attached(_, _) => true
}
def swizzleTheTrackingId(s: ClientState): ClientState =
s match {
case Unattached => Unattached
case Attached(tId, c) => Attached(swizzle(tId), c)
}
A second iteration may come about from the observation that the first two of these can be written using a single "get" function.
def getTrackingId(s: ClientState): Option[TrackingId] =
s match {
case Unattached => None
case Attached(tId, _) => Some(tId)
}
def numberOfThingos(s: ClientState): Int =
getTrackingId(s).fold(99)(someFunction)
def isAttached(s: ClientState): Boolean =
getTrackingId(s).isDefined
def swizzleTheTrackingId(s: ClientState): ClientState =
s match {
case Unattached => Unattached
case Attached(tId, c) => Attached(swizzle(tId), c)
}
Another use-case may come up later, where we don't wish to swizzle the ClientState
, but rather, to also swozzle the ClientState
, so we factor out a function to which we can pass the swizzle
and the swozzle
operations in the implementation:
def getTrackingId(s: ClientState): Option[TrackingId] =
s match {
case Unattached => None
case Attached(tId, _) => Some(tId)
}
def numberOfThingos(s: ClientState): Int =
getTrackingId(s).fold(99)(someFunction)
def isAttached(s: ClientState): Boolean =
getTrackingId(s).isDefined
def modifyTheTrackingId(mod: TrackingId => TrackingId)
( s: ClientState): ClientState =
s match {
case Unattached => Unattached
case Attached(tId, c) => Attached(mod(tId), c)
}
def swizzleTheTrackingId(s: ClientState): ClientState =
modifyTheTrackingId(swizzle)(s)
def swozzleTheTrackingId(s: ClientState): ClientState =
modifyTheTrackingId(swozzle)(s)
However, in doing these refactorings, we have essentially, reimplemented the free work that comes with an optics library. In other words, if we write the optics instead, in this case prisms, we get all of these use-cases for free (and more). We have factored out some of the code repetition, but if we were familiar with optics, we'd have started there, and achieved all these use-cases, including those in the future, all in one shot.
def _Unattached: Prism[ClientState, Unit] =
Prism[ClientState, Unit] {
case Unattached => Some(())
case Attached(_, _) => None
...
}(_ => Unattached)
def _Unattached: Prism[ClientState, (TrackingId, ChatState)] =
Prism[ClientState, (TrackingId, ChatState)] {
case Unattached => None
case Attached(tId, c) => Some((tId, c))
...
}{
case (tId, c) => Attached(tId, c)
}
At this point, there is nothing else to do, as all the use-cases above, and those not yet discovered, all come for free.
An astute observer will point to a little lie. We do not have a Prism
from ClientState
to TrackingId
, so we don't yet have these use-cases fulfilled. Rather, we have a Prism
from ClientState
to (TrackingId, ChatState)
.
Remember the lens relationship?
a
(TrackingId, ChatState)
has aTrackingId
and some other parts.
To get to a TrackingId
, we need to use this lens. It comes in the monocle library, and it is called _1
, which is a common naming convention for this lens. So ... now ... we take our prism and compose it with our lens to get from ClientState
to TrackingId
. But wait a minute, compose a prism with a lens? We haven't yet talked about this. While both prisms and lenses are closed under composition, what happens if we compose a prism with a lens?
???[ClientState, TrackingId]
This relationship is called a Traversal
4. A traversal relationship can be described as:
"a
ClientState
has some number ofTrackingId
values and/or some other parts"
A traversal is a strictly weaker relationship than both a prism and a lens. However, note that traversals are also closed under composition (actually, all optics are).
We can see other traversal relationships in our data types if we move passed ClientState
.
Consider for example, the following use-case:
Given a
ClientState
, if it isAttached
, take theChatState
and if it isActiveChatState
, then take theConversation
and then itsCorrelationId
field, which under its constructor is aUUID
and swap the least and most significant bits.
This use-case is not a real one in production of course. However, it is used here as a demonstration of even the most convoluted example. What would be required to achieve the goals of this use-case?
A naïve implementation could pattern-match through the data types, navigating until we got to the UUID
, swap the least and most significant bits field, then reconstruct the data type back to a ClientState
. This would clearly be somewhat clumsy and error-prone. However, what if we'd done the ground work of implementing our optics?
Suppose we already had on hand:
// as described earlier
def _Attached: Prism[ClientState, (TrackingId, ChatState)]
// the _2 lens comes with monocle
def _2: Lens[(TrackingId, ChatState), ChatState]
// which acts on the `ActiveChatState{:scala}` constructor
def _ActiveChatState: Prism[ChatState, Conversation]
// a lens on the class' field
def correlationId: Lens[Conversation, CorrelationId]
// we have not yet discussed the Iso relationship
def _UUID: Iso[CorrelationId, UUID]
// swaps the least and most significant bits of a UUID
// (note that this can be written using monocle)
def swap: Iso[UUID, UUID]
Our solution, for our use-case, becomes:
val swapBitSignificance: Traversal[ClientState, UUID] =
_Attached compose
_2 compose
_ActiveChatState compose
correlationId compose
_UUID compose
swap
def useCase: ClientState => ClientState =
swapBitSignificance modify identity
Note the use of optic composition to navigate down the data structures, where we end at a Traversal
. Finally, we use the traversal and invoke its modify
operation.
One relationship we saw above is Iso
. This relationship, short for isomorphism, can be described as:
a
Something
has aSomethingElse
and no other parts.
This is the relationship between a CorrelationId
and a UUID
, since a CorrelationId
has a UUID
and no other parts. This is also known as a newtype, terminology taken from the Haskell programming language.
One important observation is that there needn't be such a direct relationship to observe an optic. For example, although a Conversation
is implemented as a case class with a CorrelationId
field, this is not a requirement to observe and implement a lens relationship.
Let's give a concrete example. Given a value K
, it is possible to implement a Lens[Map[K, V], Option[V]]
even though a Map
does not have an Option[V]
field; it merely observes the lens relationship. This lens implementation will do updates, deletions and inserts into the Map
. Try writing it!
def mapLens[K, V](k: K): Lens[Map[K, V], Option[V]] =
...
In summary, we have described the following relationships and concreted them into different optic relationships:
// a X has a Y and some other parts
Lens[X, Y]
// a X has a Y or some other parts
Prism[X, Y]
// a X has some number of Y values and/or some other parts
Traversal[X, Y]
// a X has a Y and no other parts
Iso[X, Y]
In the next section, we will look at more optics, more relationships, more free operations from the abstraction, and the relationships between the different types of optics.
Footnotes
-
Technically, the
Applicative
functor interface, though this technical difference is not particularly relevant to the point. ↩ -
A. Bohannon, B.C. Pierce, and J.A. Vaughan. Relational lenses: a language for updatable views. In Proceedings of the twenty-fifth ACM SIGMOD-SIGACT-SIGART symposium on Principles of databases systems, pages 338–347. ACM, 2006. ↩
-
There are some additional, less interesting requirements of categories; both functions and lenses satisfy these requirements. ↩
-
In the monocle library, this actually creates an
Optional
not aTraversal
, which is a design error. ↩