On Zero-Side-Effect Interactive Programming, Actors, and FSMs

Functional programming is alien to many programmers. Sergey Ignatchenko considers parallels between actors and finite state machines.

Greetings Earthlings! We come in peace. Take me to your leader...
~ Aliens

Disclaimer: as usual, the opinions within this article are those of ‘No Bugs’ Hare, and do not necessarily coincide with the opinions of the translators and Overload editors; also, please keep in mind that translation difficulties from Lapine (like those described in [ Loganberry04 ]) might have prevented an exact translation. In addition, the translator and Overload expressly disclaim all responsibility from any action or inaction resulting from reading this article.

NB: Please don’t expect to find any Big Computer Science Truths within this article; all the things mentioned here are either well-known or should be well-known to computer science people. It is the question ‘how to make functional programming more usable in industry’ which this article is about. Also note that I’m coming from the imperative programming side, so don’t hit me too hard if I use terminology which is unusual in functional programming circles.

Functional programming introduces quite a few interesting concepts (personally, I am a big fan of pure functions). However, whether we like it or not, functional programming languages (especially ‘pure’ ones such as Haskell) are very much out of mainstream use, at least within that part of the industry which deals with interactive programming (more on interactive programming below).

This utter lack of popularity outside of computational programming is a trivial observation, hardly worth any further discussion, except for the question “ WHY are functional programming languages not popular? ” Of course, any explanation which says that there is something wrong with a language inevitably invites brutal lashing by its hardcore zealots (who will say that the problem is not with the language, and that it has all the necessary features, ¹ it is just idiots like me who don’t understand the beauty of it all). On the other hand, to start improving things one needs to realize what the problems are, so I will take the risk of being pounded in the name of the ‘greater good’ (the one of pure functions getting into industry programming).

On computational programming and interactive programming’

IMHO, one of the big reasons behind the lack of popularity of functional languages is that for functional programming, ² there is a significant difference between two well-known entities, which I will call computational programming and interactive programming .

Computational programming

Let’s define computational programming as the process of traditional calculations, with lots of input data coming in, and some results coming out. For computational programming , the result is completely defined by inputs (with inputs not allowed to change while we’re computing), and all we need is to calculate an arbitrarily complex function

  F(INPUTS[])

where F is our function, and INPUTS[] is a vector of inputs.

This is the area where functional programming really shines. All the functions can be made ‘pure functions’ in a very natural manner (‘natural manner’ being a synonym for ‘a manner easily understandable by subject matter experts’), with no real need to assign variables, and all the other resulting goodies. This is to be expected, as it is also very natural for academics (who’re traditionally positively in love with formulas), so no wonder that they’ve designed something optimized for their own needs. BTW, I don’t mean that such an optimization is a bad thing per se . However, when we’re using something optimized for one field, using it in another field requires us to conduct applicability analysis.

From practical perspective, computational programming applies to such fields as HPC.

HPC

High Performance Computing (HPC) most generally refers to the practice of aggregating computing power in a way that delivers much higher performance than one could get out of a typical desktop computer or workstation in order to solve large problems in science, engineering, or business. [ HPC ]

Interactive programming

On the other hand, most of out-of-academia (and out-of-HPC) programming cannot easily be described by merely taking pre-defined inputs and producing outputs. Let’s take a very simple (but immensely practical) example. Let’s consider a system which has two users, and needs to give something (let’s say, a cookie) to the user who comes first. This task, while being extremely simple, illustrates the huge difference between computational programming and interactive programming . In particular, we cannot possibly decide which of the users will get the cookie until after we’ve launched our program.

Describing this kind of logic in terms of imperative programming is trivial, while describing it in terms of functional programming is much less obvious. Strictly speaking, to solve an interactive problem such as above via functions, we’d need to write a function

  F'(INPUTSINTIME[][])

with vector INPUTSINTIME of inputs being two-dimensional, with the first dimension being the same as that for INPUTS[] , and the second dimension being time (thank goodness, time can be seen as discrete for our purposes).

Now, let’s observe that while the F'() representation is technically correct, dealing with it in practice is really awkward. Not only does the logic itself have nothing to do with the terms in which the task is defined, but we’re also required to keep all the history of all the INPUTSINTIME vector to run our program, real ouch!

One common solution to deal with this problem practically is to introduce (in addition to pure functions) some ‘actions’. However, I don’t want to accept ‘actions’ as a viable solution for the business-logic parts of the code, as they re-introduce side effects, defeat all the purity-related goodies, and essentially bring us back to the side-effect-ridden world :-(.

Alternatively, for the vast majority of cases (if not for all cases) we can rewrite our function F'() into a much easier to deal with form, by defining a finite state machine (more precisely – an ad hoc finite state machine, as defined in [ Calderone13 ] and [ NoBugs15a ]) with a transition function T(S,INPUTS[]) (with T() taking current state S and current INPUTS[] and returning new state S' back). Then, we can rewrite F'(INPUTSINTIME[][]) as follows:

  F'(INPUTSINTIME[][]) =
    T(...T(T(S0,INPUTSINTIME[][0]),
                INPUTSINTIME[][1])...)

where S0 is an initial state of our state machine.

It means that to design an interactive program in a really pure function style, we don’t need to define the really awkward F'(INPUTSINTIME[][]) , ³ but can define the easily understandable transition function T(S,INPUTS[]) (which can and should be a ‘pure function’). Now, compared to our F'() function, we have come much closer to having a ‘natural’ representation of our interactive program. ⁴ In fact, function T() is very similar to the way interactive (and especially distributed) programs are usually written in a imperative programming language: as an event-driven program (which is equivalent to an ad hoc state machine).

BTW, implementation-wise, when a programming language knows that our function T is going to be used as a part of state machine, it can easily discard all the states except for current one from the cache, as well as discard all the previous values of INPUTSINTIME[][] provided that it always carries the last state. On the other hand, technically, it is no more than optimization compared to our purely functional completely side-effect-free function F'() , albeit a very important one (as storing all the INPUTSINTIME[][] vector forever can be way too much for lots of practical uses).

On the other hand, let’s mention that while it is technically possible to describe our interactive system in terms of pure functions (see F'() above), it is strictly necessary to store some kind of historical state for our system; it may take the form of storing prior inputs ( INPUTSINTIME[] ), or may take the form of storing S , but for any interactive system we cannot possibly live without some kind of state which changes over time. And given the choice between storing INPUTSINTIME[] and storing S , I certainly prefer storing S .

FSM-based actors: a very practical observation

In practice, even before I learned about functional programming, I found myself using (and forcing my teams to use as part of system-wide architecture) ‘actors’ based on ad hoc finite state machines for large-scale distributed systems; here I mean ‘actor’ as a thread, a queue, and a deterministic Finite State Machine (FSM, a.k.a. finite-state automaton). ⁵ I’ve found that this Actor/FSM approach has lots (and I mean LOTS) of very practical benefits (described, in particular, in [ NoBugs15a ]). Just to give one extremely practical example described in [ NoBugs15a ], such deterministic FSMs allow the holy grail of production post-mortem to be achieved(!).

The programming style of these ad hoc finite state machines, while imperative, has almost universally used the pattern in Listing 1.

class FSM {
  State s;
  void process_event(Event& event) {
   // CHECK INPUTS
   //   check validity of event,
   //   taking into account s
   //   DO NOT modify s
   //   may throw an exception
   //   semantics of thrown exception is obvious
   //     as s is not modified
   ...
   
   // CALCULATE CHANGES
   //   calculate changes which are to be made
   //   DO NOT modify s (yet)
   //   may throw an exception
   //   semantics of thrown exception is still
   //     obvious
   ...
  
   // APPLY CHANGES AND SEND OUTGOING EVENTS
   //   modify s
   //   no exceptions thrown, as semantics can 
   //     become difficult to manage
  }
};

Listing 1

Moreover, I’ve found myself arguing to make the process_event() function entirely deterministic (by moving all the time-related and other system-related stuff out of the function, see [ NoBugs15a ] for further discussion), which brings the process_event() very close to being a ‘pure’ function (in a sense that it doesn’t have side effects, with side effects defined quite loosely but still guaranteeing 100% determinism).

Now, if we take a closer look at this pattern, we’ll see that it can be converted into a functional-style function T() in a trivial manner, simply by making a transition function T out of the CHECK INPUTS and CALCULATE CHANGES parts of process_event() (and the APPLY CHANGES part will stay outside our function T( and will be made by the compiler or something).

Let’s take our very simple example of a stateful program, which needs to give a cookie on a first-come first-served basis.

Then, in usual C++ code, this might look like Listing 2.

class FSM {
  bool still_have_cookie;
  FSM() {
    still_have_cookie = true;
  }

  void process_event(const Event& ev) {
    // we assume that the only event which can come,
    // is request for a cookie
    // CALCULATE CHANGES 
    bool give_cookie = still_have_cookie;
  
    // APPLY CHANGES
    if(give_cookie) {
      post_event(GIVE_COOKIE_ID);
      still_have_cookie = false;
    }
    else {
      post_event(NO_COOKIE_ID);
    }
  }
};

Listing 2

The separation between different parts of the process_event() is not 100% clear, but this can be easily rewritten into Listing 3.

void process_event(const Event& ev) {
  pair<int,bool> event_and_new_state = T(ev);
  //APPLY CHANGES
  still_have_cookie =  event_and_new_state.second;
  post_event(event_and_new_state.first);
}

pair<int,bool> T(const Event& ev) const {
  //we still assume that the only event which 
  //can come, is request for a cookie
  //CALCULATE CHANGES 
  bool give_cookie = still_have_cookie;

  //STILL CALCULATE CHANGES
  int event_to_be_posted;
  new_still_have_cookie = still_have_cookie;
  if(give_cookie) {
    event_to_be_posted = GIVE_COOKIE_ID;
    new_still_have_cookie = false;
  }
  else {
    event_to_be_posted = NO_COOKIE_ID;
  }
  return pair<int,bool>(event_to_be_posted,
                        new_still_have_cookie);
}

Listing 3

Note that here all substantial logic resides within the side-effect-free function T() (which is actually nothing but a classic FSM transition function), and that process_event() can actually be written in a very generic manner. More importantly, this second C++ form is very close to pure functional programming. In fact, in Haskell, the very same transition function T() can be defined along the lines shown in Listing 4, which is probably not the best Haskell but is hopefully sufficient to tell the story.

t :: State -> Event -> ([Event],State)

give_cookie :: State -> Bool
give_cookie st = still_have_cookie st
event_to_be_posted st ev 
  = if give_cookie st then [Event(give_cookie_id)] else [Event(no_cookie_id)]
new_still_have_cookie st ev 
  = if give_cookie st then State(False) else st 

t st ev = ( event_to_be_posted st ev,
            new_still_have_cookie st ev )

-- t() just calculates and returns (new_state,
-- list_of_events_to_be_posted)
-- process_event() (not shown) will need to have
-- side effects, but can be written in a
-- completely generic manner and kept completely
-- out of sight so our FSM code is completely
-- side-effect free

Listing 4

While syntactically Haskell’s t() looks quite different from C++::T() , semantically they’re strictly equivalent. For example, for the C++::T() give_cookie variable, the Haskell version has an equivalent function give_cookie (which, when called, will return exactly the same value which C++::give_cookie variable will have when the C++ code is executed). The same stands for each and every variable from C++::T() , all the way to the returned C++ pair<> (and the Haskell tuple).

And note that the Haskell version is functional in its purest form – as there is no I/O, there is no need for actions, no sequencing, no monads, etc.

This similarity in semantics between imperative event-driven programming and functional programming (I hope) means that in fact, industry developers (especially those who need to deal with distributed systems) are already more or less prepared to write state machines in a style that is very close to functional (a ‘pure function’ style), the only remaining (though admittedly large) challenge being how to wrap this style into more familiar syntax.

Another way to put this observation (which is the key point I want to make in this article) is the following.

Ad hoc deterministic Finite State Machines is an interactive programming style, which can be understood (and reasonably expressed) by both industry-oriented imperative programmers, and functional programmers.

On Actors, concurrency, and interactive programming

Usually, Actors are considered as merely a way to achieve concurrency. I am arguing that they have a much more important role than just doing that: I see Actors (based on deterministic FSMs) as the way to implement interactive programs as defined above. And as interactive programming is what the vast majority of the industry developers are doing out there, making writing such Actors convenient is important for the industry to start benefiting from some of functional programming concepts (specifically – from ‘pure functions’ and determinism).

On Actors and composability

In [ Chiusano10 ], it is argued that actors are not ‘composable’, ⁶ which impedes their use for concurrency purposes. If this is the case, it would mean that indeed actors are not so good for practical use. Let’s take a closer look at this problem.

Actually, the author admits that actors can be made composable, but he argues that in this case they will be just an inferior form of pure functions. While this observation does stand for computational programming as described above (and where all the examples provided by the author belong), it doesn’t stand for interactive programming . With interactive programming , even if we have our Actor as completely deterministic, it still inherently has some kind of state (see above), so it won’t degenerate into a mere ‘pure function’.

In other words, I tend to agree with [ Chiusano10 ] that using Actors to describe concurrency for computational programming might be not that good an idea and that pure functions describe what we need for concurrency better (in theory, that is, while we can leave practical considerations such as compilers able to compile these things properly to others).

However, using Actors for interactive programming is a very different story, and that’s where (stateful!) Actors really shine. It is confirmed by real-world experiences, starting from the practical use of Erlang for building large systems (with billions of transactions per day), continuing with my own experience with C++ FSMs for a highly interactive system (processing around a billion messages per day), and the recent wave of popularity for Akka Actors. IMNSHO, deterministic Actors are the very best thing in existence for interactive programming , with lots of very practical benefits (from replay-based regression testing and production post-mortem described in [ NoBugs15a ], to protection of in-memory state against server faults as described in [ NoBugs15b ], with no need to deal with thread sync and business logic at the same time, in between).

Bottom line

With this article, I actually want to emphasize two points. The first one is more of theoretical nature (and it is known but not often articulated). It is that deterministic FSMs (or ‘actors’ with the event processing function being ‘pure’) can be seen as a flavour of functional programming with a quite specific caching.

The second point is of more practical nature. There are certainly some very useful things which are already carried from functional programming into industry; in particular, its pure functions and determinism. However, to simplify adoption of functional languages for interactive programming which forms a huuuge part of the industry, I (as a guy coming from industry), would like to see the following:

explicit and extensive discussion about the differences between computational programming and interactive programming . These two are very different beasts, and mixing them together causes lots of confusion. While the difference can be found in literature, it is by far not as prominent as it should be
support for ad hoc FSMs (a.k.a. event-driven programs) as ‘first-class citizens’ (of course, with transition function being ‘pure’). While it is possible to program a finite-state automaton in most (all?) functional programming languages, the syntax is usually ugly and confusing (for example, creating a new instance of FSM on each iteration certainly doesn’t look ‘natural’ enough for subject matter experts; and ‘actions’ seem to go against the very nature of functional programming; an explicit ‘transition function’ will do much better in this regard). Event-driven programs are very common for interactive stuff in the industry, and can be explained very easily to the guys coming from there. On the other hand, given that interactive programming covers the vast majority of industry programs, explicitly supporting it (and spending time on making it very easy to use) makes perfect sense (of course, this only makes sense if there is a desire to proliferate functional programming beyond academia and HPC).

References

[Calderone13] Jean-Paul Calderone, ‘What is a State Machine?’

[Chiusano10] Paul Chiusano, ‘Actors are not a good concurrency model’, http://pchiusano.blogspot.com/2010/01/actors-are-not-good-concurrency-model.html

[Hewitt73] Carl Hewitt; Peter Bishop; Richard Steiger (1973). ‘A Universal Modular Actor Formalism for Artificial Intelligence’. IJCAI.

[HPC] http://insidehpc.com/hpc-basic-training/what-is-hpc/

[ Loganberry04] David ‘Loganberry’ Buttery, ‘Frithaes! – an Introduction to Colloquial Lapine’, http://bitsnbobstones.watershipdown.org/lapine/overview.html

[NoBugs15a] ‘No Bugs’ Hare, ‘MMO Modular Architecture: Client-Side. On Debugging Distributed Systems, Deterministic Logic, and Finite State Machines’. http://ithare.com/chapter-vc-modular-architecture-client-side-on-debugging-distributed-systems-deterministic-logic-and-finite-state-machines

[NoBugs15b] ‘No Bugs’ Hare, ‘Server-Side MMO Architecture. Naïve, Web-Based, and Classical Deployment Architectures’, http://ithare.com/chapter-via-server-side-mmo-architecture-naive-and-classical-deployment-architectures/

Acknowledgement

Cartoons by Sergey Gordeev from Gordeev Animation Graphics, Prague.

Of course, they are. All Turing-complete programming languages do have all the necessary features, and this includes the Brainfuck programming language. The only difference between the languages is how convenient is it to use the language.
Actually, the difference stands for any kind of programming paradigm, but imperative languages tend to hide the differences with more skill than functional ones.
Nor to define ‘actions’ with side effects.
When compared to ‘actions’, we are still completely within the ‘pure function’ world(!)
I don’t want (nor I am really qualified) to go into discussions whether this interpretation of Actors is a strict Actor concurrency model as defined by Hewitt [ Hewitt73 ]; it does, however, correspond to ‘actors’ as implemented by Erlang and Akka, which I think is more important for our purposes.
Entities are composable if we can easily and generally combine their behaviours in some way without having to modify the entities being combined .