FRP - Three principles for GUI elements with bidirectional data flow

I present three principles for programming GUI elements with bidirectional data flow using functional reactive programming. It turns out that traditional GUI frameworks need some work to fit.

After the recent release of version 0.5 of my reactive-banana library, I’m going to take a small break from expanding the semantics and focus more on the application side of things. Today, I would like to talk about GUI frameworks.

A while ago, I asked people for GUI examples that they would like to see implemented in FRP. One example in particular turned out to be very fruitful, namely the CRUD.hs example, which implements an interface for a small (toy) database, a very common task in GUI programming.

Even with the blessings of functional reactive programming (FRP), the task of implementing a small database interface turned out to be surprisingly difficult. Fortunately, this forced me to find good patterns for GUI programming. I think I have now found a pleasant set of principles that I will describe below. In turn, these principles tell us how GUI frameworks should behave to be maximally supportive of the FRP style; it turns out that wxHaskell (and likely GTK as well) needs some work in that respect. I will detail this as well. And I would even go further: these principles apply to any kind of GUI programming, not just the FRP style.

As usual, I welcome your input and discussion! While I am confident that I have found “the right” principles, do they really hold up in all situations?

Before we start with the principles proper, I have to make an important observation, namely that GUI elements should have bidirectional data flow by default. Since the post is rather long, you may want to jump directly to the principles instead.

GUI elements have bidirectional data flow by default

Input and output elements

It seems natural to classify GUI elements as to whether they are input elements or output elements. For instance, a button is an input element because it only reacts to user input, whereas a static text display would be an output element, because it only displays data, but does not respond to user actions. A simple example following this separation would be a counter:

Two buttons accept user input and translate it to an output number.

This example is straightforward to describe in FRP: the button gives an event e that corresponds to user clicks

( e :: Event () ) <- event0 button command

while the static text display takes a behavior b that describes the text value

sink staticText [ text :== ( b :: Behavior String ) ]

Remember that behaviors correspond to time-varying values and events to a stream of occurrences

Behavior a =~ Time -> a
Event    a =~ [(Time,a)]
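To make these two equations concrete, here is a minimal sketch of the counter in a toy model written in plain Haskell. It is not the reactive-banana API: the type synonyms follow the equations above literally, and stepper, accumE, merge and the click times are illustrative definitions made up for this sketch.

```haskell
-- Toy denotational model that follows the two equations above literally.
-- NOT the reactive-banana API; every name here is illustrative.
type Time = Double

type Behavior a = Time -> a      -- a value for every point in time
type Event a    = [(Time, a)]    -- occurrences, sorted by time

-- | Step function: the initial value, then the value of the latest
-- occurrence that happened strictly before the current time.
stepper :: a -> Event a -> Behavior a
stepper x0 e t = last (x0 : [x | (s, x) <- e, s < t])

-- | Apply the functions carried by an event one after another, from x0.
accumE :: a -> Event (a -> a) -> Event a
accumE x0 e = zip (map fst e) (tail (scanl (flip ($)) x0 (map snd e)))

-- | Merge two events by time (left-biased on simultaneous occurrences).
merge :: Event a -> Event a -> Event a
merge xs [] = xs
merge [] ys = ys
merge xs@((s, x) : xs') ys@((t, y) : ys')
  | s <= t    = (s, x) : merge xs' ys
  | otherwise = (t, y) : merge xs ys'

-- Assumed click times for the two buttons of the counter example.
clicksUp, clicksDown :: Event ()
clicksUp   = [(1, ()), (2, ()), (5, ())]
clicksDown = [(3, ())]

-- | The number shown by the counter at any point in time.
counter :: Behavior Int
counter = stepper 0 (accumE 0 changes)
  where
    changes = merge (map (fmap (const (+ 1))) clicksUp)
                    (map (fmap (const (subtract 1))) clicksDown)

-- | The text that the static text display would show.
counterText :: Behavior String
counterText = show . counter
```

Loaded into GHCi, map counter [0, 2.5, 4, 6] evaluates to [0,2,1,2]. Note that stepper only looks at occurrences strictly before the current time; this small delay becomes important once behaviors and events depend on each other.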

Text entries have bidirectional data flow

So far, so good, but you will quickly find that some GUI elements have bidirectional data flow. Their data can be manipulated by both user input and program output. In fact, even the seemingly innocuous text entry field is bidirectional!

Usually, the text entry field is used as an input element, giving a behavior that describes the text that the user entered over time

( bi :: Behavior String ) <- behavior entry text

On the other hand, you can also use the text entry as an output element to display text

sink entry [ text :== ( bo :: Behavior String ) ]

But can you do both at once? To do that, you have to feed the “input” bi back into the “output” bo, i.e. write something like bo = bi. To a Haskell programmer, this recursive use comes naturally, but the problem is that doing so directly will generate a feedback loop: the value of bo at a particular moment in time depends on itself.

This problem needs to be solved, for it would be deeply unsatisfying to have text entries in two flavors: one that can only do input and one that can only do output. In fact, genuine bidirectionality is very natural for text entries. For instance, consider the CRUD.hs example. The list box allows you to select a person, whose name is then displayed in the text entry fields. At the same time, you can edit the name by simply typing in the text entries.

Another example shall demonstrate that unidirectional GUI elements are indeed the exception and not the rule.

List boxes are bidirectional, too

At first glance, while a list box both displays data (the list of items) and reacts to user input (the selection), it does not appear to be genuinely bidirectional in the sense that there could be a feedback loop.

However, it is not uncommon to manipulate the selection programmatically as well, and then you have bidirectionality for the selection marker. For instance, in the CRUD example, adding a new item should select that item and deleting an item should unselect any items. But at the same time, the user can change the selection marker, too.
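As a rough sketch of what this means for the selection marker, the following plain Haskell fragment folds user clicks and program actions into a single selection. The Change type and its constructors are hypothetical names chosen for this illustration; they are not taken from CRUD.hs.

```haskell
-- Hypothetical vocabulary for everything that can move the selection marker.
data Change
  = UserSelects Int   -- the user clicks on the item with this index
  | ItemAdded Int     -- the program added an item at this index
  | ItemDeleted       -- the program deleted the selected item
  deriving (Show, Eq)

-- | New selection after a single change, whoever caused it.
step :: Maybe Int -> Change -> Maybe Int
step _ (UserSelects i) = Just i    -- user input moves the marker
step _ (ItemAdded i)   = Just i    -- adding an item selects it
step _ ItemDeleted     = Nothing   -- deleting unselects everything

-- | Selection after a whole history of changes, starting with no selection.
selectionAfter :: [Change] -> Maybe Int
selectionAfter = foldl step Nothing

-- An assumed history that mixes program actions and user clicks.
exampleHistory :: [Change]
exampleHistory =
  [ItemAdded 0, ItemAdded 1, UserSelects 0, ItemDeleted, ItemAdded 0]
```

Here selectionAfter exampleHistory is Just 0, while selectionAfter (take 4 exampleHistory) is Nothing: both the user and the program write to the same piece of state.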

Bidirectional as default

There is also the issue of composing user interface elements. For instance, a text entry field is made from a rectangle, individual letters and an insertion caret. It may well be that the individual components are unidirectional, but the compound GUI element is bidirectional because it manipulates compound data.

And isn’t bidirectionality what user interfaces are all about? The users should be able to act directly on the display of data instead of having to use a set of “knobs and levers” to influence an output display.

All this suggests that it’s actually easier to treat bidirectionality as the default and to consider unidirectional GUI elements as a special case.

Principles for programming bidirectional data flow

The three principles

So, we have seen that GUI elements are best treated as bidirectional by default. Now, to the core problem: how to avoid feedback loops?

Fortunately, the solution is not very difficult and follows from three simple principles.

  1. Principle. Only the program may manipulate the output display, never the user.

    In other words, if we take a text entry field and say

     sink entry [ text :== ( bo :: Behavior String ) ]

    then the text inside the text entry will always be described by bo. Unless we incorporate user input into the value of bo, the user will never be able to see any text he types. This is the only sensible possibility if we want to describe the text as a function Time -> String rather than a series of “updates”.

  2. Principle. User input is presented in the form of events.

    In other words, a text entry field emits an event whenever the user types a letter on the keyboard.

     ( e :: Event String ) <- event1 entry text

    This event contains the new value of the text, but it is up to the programmer to incorporate it into the behavior bo that describes the text that is actually displayed. Presenting user input as events is the only way to avoid feedback loops and it works thanks to Conal Elliott’s brilliant semantics for mutual recursion between Behavior and Event.

    You can think of the event as a suggestion by the user: he indicates that he wants a particular value and it is up to the program to accept, adjust or even reject it.

  3. Principle. GUI elements generate events only in response to user input, never in response to program output.

    In other words, the event mentioned in principle 2 only occurs when the user types something into the text field, never when the program changes the text field.

    Clearly, to avoid feedback loops, you need to distinguish user events from program events, because the latter cause feedback loops. We restrict ourselves to user events then, as you can obtain the program events from principle 1.

    This principle is also obvious if you treat the text of a text entry as a continuous function Time -> String: the notion “now we update the text value display” then simply doesn’t exist.

    Yet, various GUI frameworks are tempted to provide response to program updates as well, in the name of ensuring consistency further down the road. I believe that I have heard that GTK does this. My recommendation: don’t do that.

That’s it, these are my principles for avoiding feedback loops for GUI elements with bidirectional data flow.

You probably have one immediate question that I need to address. Namely, what about the case where you want to use a text entry field as an input element? Don’t you always have to supply the text as a behavior, like this?

mdo
    e <- event1 entry text
    sink entry [ text :== stepper "" e ]

The answer is “Yes, you have to do that, according to principle 1”. Of course, it is a good idea to provide this behavior as a default for text entries. But the point is that as soon as you want to use the text as output as well, the default behavior should be disabled and you take full responsibility for displaying the text including the display of user input.
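A minimal sketch may help to see the difference between the default and taking over responsibility. It uses a toy model in plain Haskell where a behavior is a function of time and an event is a list of timed occurrences; this is not the reactive-banana API, and the digits-only policy as well as the sample edits are assumptions made purely for illustration.

```haskell
import Data.Char (isDigit)

-- Toy model; NOT the reactive-banana API. All names are illustrative.
type Time = Double
type Behavior a = Time -> a
type Event a    = [(Time, a)]

-- | Value of the latest occurrence strictly before the current time.
stepper :: a -> Event a -> Behavior a
stepper x0 e t = last (x0 : [x | (s, x) <- e, s < t])

-- Principles 2 and 3: user input arrives as an event of suggested texts,
-- and this event is the only thing the text entry ever emits.
userEdits :: Event String
userEdits = [(1, "4"), (2, "4x"), (3, "42")]

-- The default: display exactly what the user suggested.
defaultDisplay :: Behavior String
defaultDisplay = stepper "" userEdits

-- Hypothetical policy: the program only accepts suggestions made of digits.
accepted :: Event String
accepted = filter (all isDigit . snd) userEdits

-- Principle 1: the displayed text is whatever the program says it is.
displayed :: Behavior String
displayed = stepper "" accepted
```

Sampling at times 0.5, 1.5, 2.5 and 3.5 gives "", "4", "4x", "42" for defaultDisplay but "", "4", "4", "42" for displayed: the suggestion "4x" is rejected and never shows up on screen.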

In fact, if you look at the principles above, they don’t really have anything to do with functional reactive programming at all. These are completely general principles for assigning responsibility when programming GUI applications. Do you want to let the user input text in a text entry? Then the entry has the sole responsibility for the text display and you may not set the text attribute at all. Do you want to set the text attribute? Then you also take full responsibility for handling the user input.

GUI programming is hard because the code quickly turns into a mess, and I would say that a large cause of that is unclear responsibilities. Sometimes the programmer sets a text value, sometimes the GUI framework, sometimes the user. The principles above give clear guidance on how to assign responsibility.

Implications for Haskell GUI frameworks

Unfortunately, wxHaskell and Gtk2Hs do not always work according to the principles above, so we actually have to write a lot of glue code to support the FRP style. For instance, there is no default event in wxHaskell that text entries emit when they receive user input. GTK does have such an event, but the documentation only says that “The ::changed signal is emitted at the end of a single user-visible operation on the contents of the GtkEditable”. Apparently, “single user-visible” refers to things like pasting text, but the important question is: does this adhere to principle 3 or not? I can’t tell :-(. Also, we need a way to block the default behavior of setting the text entry on user input.

Since the principles apply to traditional GUI programming as well, it may be worthwhile to incorporate the glue code into the wxHaskell package. I could put everything into the reactive-banana-wx package, but that’s a lot of work and I don’t want to restrict it to the particular FRP style that reactive-banana uses.

Of course, such an undertaking assumes that the principles actually work in all cases. Can you think of an example where this is not the case?

Alternatively, it may be worthwhile to turn our attention towards the browser, where we have the lucky situation that there is no native GUI framework and we get to invent our own, with semantics of our own choosing. (The Utrecht Haskell Compiler supports compilation to JavaScript and I will look into making reactive-banana compile on UHC.)

Relation to MVC

For scholarly reasons, it may be interesting to compare the above principles to the traditional model-view-controller (MVC) pattern. The original paper writes

In either case, the string model is a completely passive holder of the string data manipulated by the view and the controller. It adds, removes, or replaces substrings upon demand from the controller and regurgitates appropriate substrings upon request from the view.

A charitable interpretation would be that the view displays data from the model/program according to principle 1 and the controller sends user input according to principle 2. As for principle 3, the paragraph

But all models cannot be so passive. Suppose that the data object – the string in the above example – changes as a result of messages from objects other than its view or controller. [..] In that case the object which depends upon the model’s state – its view – must be notified that the model has changed. Because only the model can track all changes to its state, the model must have some communication link to the view.

can indeed be interpreted to mean the model is responsible for program events, and thus not the controller.

Then again, the MVC pattern was probably never meant to be applied as strictly as the three principles above. For instance, I am not aware of any MVC GUI framework where text entry fields do not respond to user input once the programmer takes control of the text value. However, the very point of my three principles is that they are to be applied without exception.

A practical example

The CRUD example

I have used all three principles to implement the CRUD.hs example, to give you an idea of what this looks like in practice. (You can also download the CRUD example here: CRUD.zip). I like how elegant the result turned out.

Unfortunately, I had to paper over many “shortcomings” of the wxWidgets framework. In particular, setting the text field will move the insertion caret to the end, so that it may jump around unexpectedly.

Composing user events

The example also makes use of a neat data type called Tidings that allows us to compose GUI elements, or more precisely their user input events.

The idea is the following: Tidings consist of facts and rumors

data Tidings t a = Tidings
    { facts  :: Behavior t a
    , rumors :: Event t a }

For a text entry field, the facts would simply be the time-varying text value while the rumors would be the user event that is triggered whenever the user types into the field. Mnemonic: humans cause rumors while programs make facts.

Note that the values occurring in the rumors don’t need to have any relation to the facts. Remember that the user only suggests a change to the text value, but it is up to the program to set the text value. Again, this is a nice mnemonic: rumors don’t need to be related to facts.

The key point is that Tidings is an applicative functor, so that we can combine user inputs from different GUI elements. In the CRUD.hs example, the reactiveDataItem combines two reactiveTextEntry in this fashion. The idea is that the user can change the first or the last name and now we need to pair this with the other part to turn this into a change of the whole name.

To be precise, pairing Tidings works like this:

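pair :: Tidings t a -> Tidings t b -> Tidings t (a,b)

rumors (pair a b) = unionWith join x y
    where
    -- a rumor about b, completed with the current fact about a
    y =      (,) <$> facts a <@> rumors b
    -- a rumor about a, completed with the current fact about b
    x = flip (,) <$> facts b <@> rumors a
    -- simultaneous rumors: take each component from its own rumor
    join (x,_) (_,y) = (x,y)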

When there are no rumors for one of the components, we resort to facts instead. Otherwise, we combine the rumors. I think this captures exactly how rumors behave in human societies. Maybe this correspondence also shows that we should apply category theory to the social sciences. Or maybe not.
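To see this pairing in action without any GUI library, here is a sketch in a toy model written in plain Haskell. It drops the extra type parameter t and models behaviors as functions of time and events as lists of timed occurrences. The helpers applyAt (standing in for <@>) and unionWith are simplified re-implementations for this sketch, not the reactive-banana functions, and the names in the example are made up.

```haskell
-- Toy model; NOT the reactive-banana API. All names are illustrative.
type Time = Double
type Behavior a = Time -> a
type Event a    = [(Time, a)]   -- occurrences, sorted by time

data Tidings a = Tidings
  { facts  :: Behavior a
  , rumors :: Event a
  }

-- | Sample a behavior at each occurrence and apply it (toy stand-in for <@>).
applyAt :: Behavior (a -> b) -> Event a -> Event b
applyAt b e = [(t, b t x) | (t, x) <- e]

-- | Merge two events; simultaneous occurrences are combined with f.
unionWith :: (a -> a -> a) -> Event a -> Event a -> Event a
unionWith _ xs [] = xs
unionWith _ [] ys = ys
unionWith f xs@((s, x) : xs') ys@((t, y) : ys')
  | s < t     = (s, x) : unionWith f xs' ys
  | s > t     = (t, y) : unionWith f xs ys'
  | otherwise = (s, f x y) : unionWith f xs' ys'

pair :: Tidings a -> Tidings b -> Tidings (a, b)
pair a b = Tidings
  { facts  = \t -> (facts a t, facts b t)
  , rumors = unionWith combine x y
  }
  where
    x = applyAt (\t ra -> (ra, facts b t)) (rumors a)  -- rumor a, fact b
    y = applyAt (\t rb -> (facts a t, rb)) (rumors b)  -- fact a, rumor b
    combine (ra, _) (_, rb) = (ra, rb)                 -- both are rumors

-- Assumed example: the program keeps the facts constant while the user
-- suggests changes to the first name, the last name, and then both at once.
firstName, lastName :: Tidings String
firstName = Tidings (const "Ada")      [(1, "Augusta"), (3, "A.")]
lastName  = Tidings (const "Lovelace") [(2, "Byron"),   (3, "L.")]

fullNameRumors :: Event (String, String)
fullNameRumors = rumors (pair firstName lastName)
```

In this example, fullNameRumors is [(1.0,("Augusta","Lovelace")),(2.0,("Ada","Byron")),(3.0,("A.","L."))]: a lone rumor is completed with the fact about the other component, and simultaneous rumors are combined. Since the facts stay constant here, it also illustrates that rumors need not be related to facts.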

Alright, that’s the end of today’s post! Leave your comments below.
