An Introduction to ReactiveCocoa

A lot of the posts I’ve written so far are by and large foundational work. They are, so to speak, table stakes for functional programming. But once at the table, it’s hard to know exactly where to go. There are many great articles on using these principles to, e.g., parse JSON, but at the end of the day, that’s one problem, there are solid solutions out there, and it doesn’t need to be solved again. Parsing JSON is hardly a reason to adopt functional programming wholesale. Functional programming should help you write better code.

Over the past couple of months, our team at work has been developing an application in pure Swift using the pre-release versions of ReactiveCocoa, and it has been a complete joy. We have been able to test far more of our code than ever before in unit tests, we have been able to break it into tiny functions that are easy to review on their own, and we have been having a ton of fun. Since RAC, as it is often called, uses and expands on a lot of the topics I’ve written about in the past, I thought it would be good to share.

Let’s get the introductions out of the way. ReactiveCocoa is a functional reactive programming (FRP) framework developed by GitHub, primarily Justin Spahr-Summers and Josh Abernathy. FRP, for its part, is a specific way of writing and architecting software that creates a malleable abstraction for timelines; RAC implements one version of it for iOS and OS X.

The good folks at GitHub are about to release version 3.0, which is the one we have been using and I will focus on. While version 3.0 is (mostly) backward compatible with version 2.0, we have actually not used 2.0 at all; the basic Swift API has served us well so far, and understanding it will give you the tools to explore the more elaborate features at your leisure. So with that said, let’s get started. I assume you have no idea what FRP is about and will try to build things up from scratch.

As a result, this will be long.

• • • • •

At the base of FRP is the notion of events. Events are simply things that happen — which is obviously a concept that every type of programming supports. However, in ReactiveCocoa, events are first-class citizens; in fact, they have their own type. Here is a summary:1 2

enum Event<T, E: ErrorType> {
    case Next(T)
    case Error(E)
    case Completed
    case Interrupted
}

The .Error case is the simplest: it represents an error event. The .Next, .Completed and .Interrupted cases are a little different: they imply an ordering. What does a .Next event follow? What is .Completed? What got .Interrupted? Say hello to the next fundamental type: the signal.

struct Signal<T, E:ErrorType> { /*…*/ }

A Signal<T, E> is a sequence of Event<T,E>s in time, with precise semantics: every event must be of type .Next, except the last one. The last one can be an .Error, a .Completed, or an .Interrupted. But the key factor is that these events carry information. Those are the generic types T and E, which denote the value and error information carried by both Signal and Event.

The information T can be anything: the components of a data stream, the contents of a text field over time, or even a Void type that signifies something happened, but doesn’t have any actual data (think of a signal that represents button presses; we need to know they happened, but there is no data associated with the button press event).

Here is a valid sequence of events for a signal, one that enumerates the first three letters of the alphabet:

.Next("a") -- .Next("b") -- .Next("c") -- .Completed

Here’s another, that enumerates the result of dividing 3 by 3, 2, 1, and 0:

.Next(1) -- .Next(1.5) -- .Next(3) -- .Error("Division by zero")

The remaining case, .Interrupted, can come up when a signal is forcibly stopped, but it has been relatively rare for us, and the framework often handles that case transparently.
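To make the "only the last event may be terminal" rule concrete, here is a minimal sketch in current Swift syntax. It is not the ReactiveCocoa declaration: the lowercase cases, the MathError type, and the isValidTimeline helper are all assumptions made for illustration, and the check only makes sense for a finite, already-recorded timeline.

```swift
// Stand-in for the Event type above, written in current Swift (not RAC's own code).
enum Event<T, E: Error> {
    case next(T)
    case error(E)
    case completed
    case interrupted

    // Everything except .next ends a signal.
    var isTerminal: Bool {
        if case .next = self { return false }
        return true
    }
}

// Hypothetical error type for the division example.
enum MathError: Error { case divisionByZero }

// A finite timeline is valid when its last event is terminal and no earlier one is.
func isValidTimeline<T, E: Error>(_ events: [Event<T, E>]) -> Bool {
    guard let last = events.last, last.isTerminal else { return false }
    return events.dropLast().allSatisfy { !$0.isTerminal }
}

let division: [Event<Double, MathError>] =
    [.next(1), .next(1.5), .next(3), .error(.divisionByZero)]
let broken: [Event<Double, MathError>] =
    [.next(1), .completed, .next(3)]

print(isValidTimeline(division)) // true
print(isValidTimeline(broken))   // false
```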

• • • • •

Let’s be honest, the examples above for a signal were pretty artificial. That’s because in real life, signals actually come in two flavors, typically referred to as “hot” signals and “cold” signals, and I wanted to avoid mixing them up.

Signal represents hot signals: signals that have no beginning and typically no end, but are simply a set of events in the world that can be observed. UI interactions, for instance, fall nicely within this. Button presses are a signal: they don’t really have a beginning, they just happen. But so do events like push notifications. Signal can represent any stream of such events, possibly combined.

For instance, suppose we want to update the screen on button press and push notification. We can represent both these events with a single Signal<Void, NoError>, where NoError is a built-in type that, you guessed it, means the signal can’t error out. This makes sense, since there is no notion of a button press being an error from the application’s standpoint, nor of a push notification being one. The timeline for a signal like that is dead simple:

…  --  .Next(Void) -- .Next(Void) -- .Next(Void) -- …

In our experience, most of our Signal instances have had NoError as their error type. When something doesn’t have a well-defined beginning or end, it becomes more convenient to model it as never failing.

Cold signals, by contrast, are signals that encapsulate a behavior that can be started and that often finishes. A network call is an excellent example: it is started on demand, and it can succeed, returning the data, or it can fail, returning an error code. The type we use for cold signals is SignalProducer:

struct SignalProducer<T, E:ErrorType> { /*…*/ }

Like Signal, a SignalProducer emits Events. The big difference comes in the way the timelines will typically look. For instance, let’s think again about our network call. Its type would likely be SignalProducer<NSData, NetworkError>, where we assume we have a NetworkError type that conforms to ErrorType. We have several possible timelines for this signal. One is the successful network call:

| -- .Next(data) -- .Completed

Here, I have used | as an indication that the producer was explicitly started. Another timeline is the bad call:

| -- .Error(.NotFound)

Finally, another one is the cancelled call:

| -- .Interrupted

But what makes this representation really powerful is that there is no need to assume all the data returns at once. If we have a long-lived data task, say an NSURLSessionDownloadTask that calls a delegate many times during its execution, it can still be represented by the same type. Here are the equivalent timelines in that case:

| -- .Next(data1) -- .Next(data2) -- .Completed

| -- .Next(data1) -- .Error(.ConnectionLost)

| -- .Next(data1) -- .Interrupted

Thus, a SignalProducer<NSData, NetworkError> is a generalized representation of a network call that can be adapted to any specific case.

• • • • •

Okay, all of this may make sense, but it still doesn’t explain why anyone would go through the effort of representing things as Signals or SignalProducers. Nor do we yet know how to create them, or use them. So let’s look at the first part, creating a SignalProducer.

Comic Cathy is writing an app that keeps track of her comic book collection. The collection is on a server, but she doesn’t want to download the entire thing every time she launches her app, so she has a local store. When the app launches, she wants to populate a table view with her collection. However, she doesn’t want to wait until the app is done syncing; she wants to display what’s in the store first, and then load any updates.

Fresh from learning about SignalProducers, Cathy thinks about her data. She sees that what she wants will come in two steps: first an array of existing comics, and then an array with any new comics. Getting the first array could fail if the local store produces an error, and getting the second array could fail if the network call produces an error. Either way, that would be a retrieval error. Oh, and if the store fails, she doesn’t want to make the network call and show the user confusing or incomplete info. Perfect! Cathy can define a function that returns the appropriate producer:

func comicCollectionProducer()
       -> SignalProducer<[Comic], RetrievalError>

What should the implementation be? Let’s make a few simplifying assumptions. Let’s assume that both retrieving the comic info from the local store and retrieving it from the network are synchronous calls. Let’s say the functions are:3

func localComics() -> Result<[Comic], LocalStoreError>
func networkComics() -> Result<[Comic], NetworkError>

If we look at the API for SignalProducer, we see that the main initializer has a strange type:

public init(_ startHandler:
            (Signal<T, E>.Observer, CompositeDisposable) -> ())

Ouch. What does that mean? Let’s break it down. First of all, init takes a closure. This is called the startHandler because it gets called when the start method is called on the producer. Now let’s look at the parameters. The first parameter to the handler is a Signal<T, E>.Observer; this is, in common parlance, a sink: it’s where we send the events that the producer generates. The second parameter is a disposable. This is a memory management mechanism that is specific to ReactiveCocoa; for now, we can ignore it.

Armed with this knowledge, Cathy writes the following implementation for her function:

func comicCollectionProducer()
       -> SignalProducer<[Comic], RetrievalError> {

    return SignalProducer { sink, disposable in
        switch localComics() {
        case .Success(let comics):
            sendNext(sink, comics)
        case .Failure(let error):
            sendError(sink, retrievalErrorFromStoreError(error))
            return // errors terminate the signal
        }

        switch networkComics() {
        case .Success(let comics):
            sendNext(sink, comics)
            sendCompleted(sink)
        case .Failure(let error):
            sendError(sink, retrievalErrorFromNetworkError(error))
        }
    }
}

She first fetches the comics from the local store and sends them along by calling sendNext. This creates a .Next event of the appropriate type and emits it on the sink. If that fails, she sends an error to the sink, first turning it into a RetrievalError.

If the local fetch completes successfully, she carries out the network call and repeats the process. The only difference is that, since there is no more work to be done after sending the network data, she calls sendCompleted, thus terminating the signal.

To her delight, everything works.
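If the start handler still feels abstract, a toy model may help. The sketch below is written in current Swift and is only my assumption of the general shape. ToyProducer, its Observer typealias, and the hard-coded comic titles are invented for illustration, and the disposable is left out entirely. The point it shows is that building the producer does no work. The handler runs only when start is called.

```swift
// Toy stand-ins in current Swift; these are not the ReactiveCocoa types.
enum Event<T, E: Error> {
    case next(T), error(E), completed, interrupted
}

struct ToyProducer<T, E: Error> {
    typealias Observer = (Event<T, E>) -> Void
    private let startHandler: (@escaping Observer) -> Void

    // Stores the handler; nothing runs at this point.
    init(_ startHandler: @escaping (@escaping Observer) -> Void) {
        self.startHandler = startHandler
    }

    // Runs the stored handler, which sends its events to `observer`.
    func start(_ observer: @escaping Observer) {
        startHandler(observer)
    }
}

enum RetrievalError: Error { case store }

var handlerRuns = 0
let producer = ToyProducer<[String], RetrievalError> { observer in
    handlerRuns += 1
    observer(.next(["Saga #1"]))   // stands in for the local fetch
    observer(.next(["Saga #2"]))   // stands in for the network fetch
    observer(.completed)
}
let runsBeforeStart = handlerRuns  // creating the producer did no work

var received: [[String]] = []
var completed = false
producer.start { event in
    switch event {
    case .next(let comics): received.append(comics)
    case .completed: completed = true
    case .error, .interrupted: break
    }
}
```

After the start call, received holds the two batches in order and handlerRuns has gone from 0 to 1.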

• • • • •

Still, this doesn’t seem like it’s a brilliant argument for using ReactiveCocoa, does it? Or for thinking all those posts about error handling and flatMap were particularly useful. That’s because Cathy’s implementation doesn’t make use of the standard library of functions that ships with RAC and, indeed, with any FRP framework.

You see, signals are fundamentally collections. And just like one can define map, reduce, flatMap, and other functions on arrays, one can define similar functions on signals and signal producers. So the power of this representation is in the way it allows us to manipulate signals as collections. For instance, what if I told you that the implementation above could be rewritten as:

func comicCollectionProducer()
       -> SignalProducer<[Comic], RetrievalError> {

    let localFetchProducer = SignalProducer(result:localComics())
        |> mapError(retrievalErrorFromStoreError)

    let networkFetchProducer = SignalProducer.try(networkComics)
        |> mapError(retrievalErrorFromNetworkError)

    return localFetchProducer |> concat(networkFetchProducer)
}

Now it’s looking more interesting, isn’t it? But of course, a lot more dense. So let’s take a look line by line at what is happening. I am going to focus on the meaning of the lines, and less on the mechanics of how every detail is achieved, because part of the power of FRP is that it gives you a vocabulary that abstracts those mechanics away.

At a high level, we are creating two producers and concatenating them. The |> operator is an extremely versatile and powerful operator whose mechanics we are going to ignore for now. When you read it, just read “take the thing on the left, and do the thing on the right to it once the signal is started”.

That last bit is crucial, by the way. The |> operator creates a specification. It doesn’t do anything during the call to comicCollectionProducer; instead, it defers all actions to the moment when the SignalProducer returned by the function is started.

Reading the return line in that light, we see it says “take the producer on the left and concatenate to it the producer on the right”. In this context, “concatenate” means “wait until the first one is done, and then start the second one”. Simple. Crucially, the second one is started only if the first one completes; if it is interrupted or errors out, the second one is never started. This is exactly the behavior Cathy wants.
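Here is a rough sketch of those concatenation semantics, with a producer modelled as nothing more than a function that feeds events to an observer. This is my own simplification in current Swift, not how RAC implements concat. The Producer typealias, the free concat function, and FetchError are assumptions made for the example.

```swift
enum Event<T, E: Error> { case next(T), error(E), completed, interrupted }

// A toy producer: calling it "starts" it and feeds events to the observer.
typealias Producer<T, E: Error> = (@escaping (Event<T, E>) -> Void) -> Void

func concat<T, E: Error>(_ first: @escaping Producer<T, E>,
                         _ second: @escaping Producer<T, E>) -> Producer<T, E> {
    return { observer in
        first { event in
            switch event {
            case .next:
                observer(event)      // pass values straight through
            case .completed:
                second(observer)     // only a clean completion starts the second
            case .error, .interrupted:
                observer(event)      // terminate; the second never starts
            }
        }
    }
}

enum FetchError: Error { case store }

var secondStarted = false
let local: Producer<Int, FetchError> = { observer in
    observer(.next(1))
    observer(.completed)
}
let failing: Producer<Int, FetchError> = { observer in
    observer(.error(.store))
}
let network: Producer<Int, FetchError> = { observer in
    secondStarted = true
    observer(.next(2))
    observer(.completed)
}

// Happy path: both producers run, in order.
var values: [Int] = []
let combined = concat(local, network)
combined { event in
    if case .next(let value) = event { values.append(value) }
}
let startedAfterSuccess = secondStarted

// Failure path: the error ends the chain before the second producer starts.
secondStarted = false
var sawError = false
let stopped = concat(failing, network)
stopped { event in
    if case .error = event { sawError = true }
}
```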

Now let’s take a look at how we create the two producers. The first producer, localFetchProducer, is created in two steps. First, we create a new SignalProducer from the result of the localComics() call. This is equivalent to writing the following:

SignalProducer { sink, disposable in
        switch localComics() {
        case .Success(let comics):
            sendNext(sink, comics)
            sendCompleted(sink)
        case .Failure(let error):
            sendError(sink, error)
        }
}

It’s such a common thing to want to write that the framework provides this convenience initializer. Now if you look at the code carefully, you’ll see that the type of this producer is SignalProducer<[Comic], LocalStoreError>. However, the signature of the comicCollectionProducer function calls for the error to be a RetrievalError. That’s where the second part of the creation comes in.

Per our previous semantics, the second part of the initialization of localFetchProducer says “take the signal producer and map its errors to new errors using the retrievalErrorFromStoreError function”. Again, it looks like this:

    |> mapError(retrievalErrorFromStoreError)

In other words, if the first signal results in an array of comics, this line has no effect. If, however, it results in an error, it takes that error and uses the retrievalErrorFromStoreError to turn it into a RetrievalError. Since this is a specification rather than a direct action, what the |> operator returns is actually another signal producer, with type SignalProducer<[Comic], RetrievalError>. Victory! That’s what we wanted.
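The same idea exists on single values. The Result type that ships with current Swift (not the third-party one this post relies on) has its own mapError, and it makes a handy small-scale illustration: successes pass through untouched, failures get converted. The error cases and the body of retrievalErrorFromStoreError below are placeholders I made up.

```swift
enum LocalStoreError: Error { case corrupted }
enum RetrievalError: Error, Equatable { case storeUnavailable }

// Hypothetical conversion; the real mapping is whatever Cathy's app needs.
func retrievalErrorFromStoreError(_ error: LocalStoreError) -> RetrievalError {
    return .storeUnavailable
}

let success: Result<[String], LocalStoreError> = .success(["Saga #1"])
let failure: Result<[String], LocalStoreError> = .failure(.corrupted)

// Values pass through unchanged; only the error type (and any error) changes.
let mappedSuccess = success.mapError(retrievalErrorFromStoreError)
let mappedFailure = failure.mapError(retrievalErrorFromStoreError)
```

Both mapped values now have type Result<[String], RetrievalError>, which mirrors what happens to the producer's error type.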

The second producer is slightly different. We want to make the network call, but it’s very important that it happen after the local store fetch, because we don’t want to block or delay that fetch (remember that networkComics is synchronous).

If we were to use the same initializer as before, networkComics would get called at initialization, i.e. during the comicCollectionProducer call. That’s fine for the local store call, but definitely not for the network call, which should not be made if, say, the local store call ends in error.

Fortunately, that too is a very common scenario, and SignalProducer has a static try function that takes a closure returning a Result rather than a Result itself. That closure gets called only when the start method is called on the producer. Effectively, try can take a function, like networkComics, and wait until the producer is started to execute it.
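The difference between passing a result and passing a function is easy to see with plain closures. In this sketch (current Swift, no RAC), a "producer" is just a closure, and calling it stands in for start. The two producer functions, the Fetch typealias, and the call counter are assumptions for illustration only.

```swift
enum NetworkError: Error { case offline }
typealias Fetch = () -> Result<[String], NetworkError>

var networkCalls = 0

// Hypothetical stand-in for the synchronous networkComics() call.
func networkComics() -> Result<[String], NetworkError> {
    networkCalls += 1
    return .success(["Saga #2"])
}

// Eager flavor: the caller has already run the fetch; we only store its result.
func producer(result: Result<[String], NetworkError>) -> Fetch {
    return { result }
}

// Deferred flavor: we store the function and run it on each "start".
func producer(deferring work: @escaping Fetch) -> Fetch {
    return work
}

let eager = producer(result: networkComics())
let callsAfterEagerInit = networkCalls      // the fetch already happened
let deferred = producer(deferring: networkComics)
let callsAfterDeferredInit = networkCalls   // no new fetch yet
_ = deferred()                              // calling the closure plays the role of start()
let callsAfterStart = networkCalls          // now the fetch has run
_ = eager()                                 // replays the stored result, no new fetch
let callsAfterEagerStart = networkCalls
```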

Once again, the signal producer returned by SignalProducer.try(networkComics) has the wrong type: SignalProducer<[Comic], NetworkError>. Like before, we deal with that through mapError, which is the second line of this call.

• • • • •

If you’re still with me, you waded through twelve paragraphs to explain five lines of code. First of all, congratulations. Second, isn’t that cool? The semantics of that code are precise and concise — and more abstract than anything I’ve ever been able to write with any framework. At the end of the day, this is what that code is saying:

The fact that we can express it almost as concisely as we can say it at that high level is incredible. And in addition to being concise and high-level, each part of this process is testable in isolation:

Finally, if we make networkComics and localComics parameters to comicCollectionProducer, the entire chain can be unit tested. In a completely controlled manner. That’s truly golden.
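To give a flavor of what that kind of test looks like, here is a synchronous stand-in for the chain with both fetches passed in as parameters. It is a sketch in current Swift, not RAC code: comicCollection, its tuple return type, and the stub closures are all invented for the example, and the real producer would deliver events over time instead of returning them in one go.

```swift
enum RetrievalError: Error, Equatable { case store, network }

// Local first, then network, stopping at the first failure.
// Returns the batches delivered so far plus the error, if any.
func comicCollection(local: () -> Result<[String], RetrievalError>,
                     network: () -> Result<[String], RetrievalError>)
    -> (batches: [[String]], error: RetrievalError?) {
    var batches: [[String]] = []

    switch local() {
    case .success(let comics): batches.append(comics)
    case .failure(let error): return (batches, error)
    }

    switch network() {
    case .success(let comics): batches.append(comics)
    case .failure(let error): return (batches, error)
    }

    return (batches, nil)
}

// Stubbed failure: the store fails, so the network stub must never run.
var networkCalled = false
let failed = comicCollection(
    local: { .failure(.store) },
    network: {
        networkCalled = true
        return .success(["Saga #2"])
    })

// Stubbed success: both batches arrive, in order.
let happy = comicCollection(
    local: { .success(["Saga #1"]) },
    network: { .success(["Saga #2"]) })
```

No store, no network, and every branch is reachable from a test.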

• • • • •

Oh, man. It’s the end of the day, and Cathy just decided she really would rather wait for everything to sync. And she really wants to show her work to her friend in like an hour. How can she rewrite the whole thing to return everything at once quickly and without error?

Turns out, she doesn’t have to. All she needs to do is change the return line in comicCollectionProducer from this:

return localFetchProducer |> concat(networkFetchProducer)

to this:

return localFetchProducer |> concat(networkFetchProducer)
       |> reduce([]) { $0 + $1 }

It’s that simple. I promise.
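If the arguments to reduce look familiar, that's because they are the same ones you would hand to reduce on an array. As a small analogy in current Swift, with made-up titles and the two emitted batches modelled as an array of arrays:

```swift
// The two batches the concatenated producer would emit, modelled as an array.
let batches: [[String]] = [["Saga #1", "Saga #2"], ["Saga #3"]]

// Same arguments as above: start from [] and append each batch as it arrives.
let everything: [String] = batches.reduce([]) { $0 + $1 }
print(everything) // ["Saga #1", "Saga #2", "Saga #3"]
```

On a producer, the combined array is sent as a single value once the underlying producer completes, which is exactly the "everything at once" behavior Cathy asked for.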

• • • • •

Alright, I’ve shown you a reasonably thorough example of how to create a SignalProducer, including creating simple ones from primitives and combining them into the producer we want using the |> operator and various higher order functions. That’s half of using FRP. The other half is how to use the events the producer emits. This post has grown enormous already, so I’m going to leave that aspect for my next post. Stay tuned.








1 Swift 1.2 still doesn’t support declarations like that one; the cases have to instead take Box<T> and Box<E> types. I have removed that for simplicity and because I’m hopeful that one day, that blight on my soul will be lifted and this post will be both easy to follow and valid Swift.

2 SWIFT 2 MAKES THIS CORRECT CODE!! Ahem. Carry on.

3 I’ve written before about the Result enum, and Robert Napier wrote a nice implementation that has been merged into the microframeworks maintained by Rob Rix.

 