Monday, August 17, 2015

A turtle's tale of state streams and state machines.

Loading Turtles...

The turtles simulation is inspired by Logo, in which a turtle is a drawing robot that is driven by commands such as forward, left, and right.

How turtles spiral

The keystone of the four turtles simulation is the spiral-step function:

(defn spiral-step
  [{:as turtle
    :keys [x y angle spin velocity acceleration next-step]
    :or {angle 0
         spin 0.2
         velocity 10
         acceleration 1.1}}]
  (let [switch? (or (and (> velocity 50)
                         (> acceleration 1))
                    (and (< velocity 1)
                         (< acceleration 1)))]
    (assoc turtle
           :x (+ x (* velocity (Math/cos angle)))
           :y (+ y (* velocity (Math/sin angle)))
           :angle (+ angle spin)
           :spin (if switch?
                   (- spin)
           :acceleration (if switch?
                           (/ 1 acceleration)
           :velocity (* acceleration velocity)
           :next-step (if (and switch? (< velocity 1))

The spiral-step function takes a single value as input and returns a single value. The input value is a map containing x, y, angle, spin, velocity, acceleration and the next step. The output is almost the same value as the input, but with some of the fields updated.

Here is the important part for movement:
           :x (+ x (* velocity (Math/cos angle)))
           :y (+ y (* velocity (Math/sin angle)))

The return value has a new position calculated based on the current position and velocity.

Spin is the direction the turtle will change angle per step, and by how much it will turn.

Notice that if the velocity gets too high, the turtle changes direction and starts decelerating. If the velocity gets too low, the turtle starts a different step function.

There are three step functions chained in a cycle; spiral-step, linear-spiral-step, and square-spiral-step. Leonardo, Michelangelo and Raphael cycle through these step functions. But Donatello starts with a much simpler rand-step function that never transitions to any other step function.

The step function applies the turtle's intelligence, but cannot change the current world directly. Change occurs through a transactor:

  (doseq [tname (shuffle (keys (session/get :turtles)))]
    (session/swap! transact tname))

The rules of the simulation are applied externally:

(defn bound [x]
  (min 500 (max -500 x)))

(defn bounded [{:as turtle :keys [x y]}]
  (assoc turtle
         :x (bound x)
         :y (bound y)))

This rule is applied by the transactor to the output of each attempted step to ensure the turtle stays within the view bounds. There is also another rule to ensure new steps do not intersect with existing trails. The step functions themselves are oblivious to these rules. We can think about the activity of drawing a spiral separately from the verification of whether a step is allowed.

Another advantage of step functions is that you can verify individual step behavior.

State streams

A state stream is the iteration of a function that takes structured data and returns structured data of similar shape, but modified content.

{} => {} => {} => {} => {}

One example of iteration is the use of the iterate function itself:
Usage: (iterate f x)
Returns a lazy sequence of x, (f x), (f (f x)) etc.
f must be free of side-effects
(take 5 (iterate inc 1))
=> (1 2 3 4 5)
Consider a Fibonacci sequence. The next number in a Fibonacci sequence is the sum of the previous two numbers:

(defn fib-step [[a b]]
  [b (+ a b)])

This function takes a single value; a tuple of a and b; and returns a tuple of b and a + b. You can produce a sequence from this function using iterate:

(take 5 (iterate fib-step [0 1]))
=> ([0 1] [1 1] [1 2] [2 3] [3 5])

I could have chosen to represented my world as a sequence instead of storing the current world value in an atom:

(take 5 (iterate spiral-step {}))
 {:x 10,
  :y 0,
  :angle 0.2,
  :spin 0.2,
  :acceleration 1.1,
  :velocity 11}
 {:x 20.780732356253658,
  :y 2.1853626387456733,
  :angle 0.4,
  :spin 0.2,
  :acceleration 1.1,
  :velocity 12.100000000000001}
 {:x 31.92557038368857,
  :y 6.897324580680346,
  :angle 0.6000000000000001,
  :spin 0.2,
  :acceleration 1.1,
  :velocity 13.310000000000002}
 {:x 42.91078741813639,
  :y 14.41271590156827,
  :angle 0.8,
  :spin 0.2,
  :acceleration 1.1,
  :velocity 14.641000000000004})

Whether your state stream flows through an atom or a sequence or some other mechanism depends on your particular problem and requirements. For example if you wanted to preserve an audit trail of states, you might choose to write every state to a log or database.

The important thing is that instead of thinking about "how am I going to modify my state?" think about "what transitions and rules define my system?" Rather than writing functions that change state, write functions that take values and return values. Keep the concept of "current state" orthogonal from "state transition".

O.K. so step functions are useful in sequence generation, and a state stream is an iteration of step functions on values; But why the special name? Why not just talk about sequences?

1. The values represent entities at a point in time
2. There are rules governing valid transitions
3. The iteration is stepwise
4. The current value may belong in a place instead of a sequence

Streams or Machines?

Visualizing a state stream is rather dull:

State machine diagrams on the other hand, are more meaningful:

State machine diagrams are informative for thinking about and describing how the system behaves. The diagram above shows clearly the cycle of steps, and how rand-step never transitions to another step.

State machine design and representation is relative to what we consider important in the system. We can draw a more basic state machine of the turtles simulation as:

Or a more complete version as:

The point is that these diagrams are visual representations of allowable state transitions. They can show as much or as little detail as is relevant to the logic you are describing.

State machine diagrams don't have to show everything that happens in your system.

For example turtles can become blocked by other turtle trails. If we wanted to include blocking inside a state machine it might look something like this:

Messy! Including the blocked property has made the interesting logic more obscure. Blocking and bounds protection is an orthogonal concern which is best left out of the state machine diagram.

When designing a state machine keep in mind that you can represent interesting aspects as different diagrams. Avoid the trap of thinking that you must make one canonical diagram that matches the code. There is nothing preventing you from having several diagrams that all align with different aspects of your code.

About state and places

Let's talk a little bit about state, defined as:
"mode or condition of being"
In programming terms state refers to values representing some entity or scope of entities.

State can be as big or small as we want it to be. A single binary value can be called state, as can a record of everything we know about a person, an entire database, or even an entire history of a database. Hence when using the word state, it is meaningful in conjunction with some semantic scope and model. For instance, the world state for the four turtles consists of all the turtle locations and current activities.

Do not confuse state with the place where values are stored. Values occupy a place, and that place can be occupied with different values over time. But the state is not the place where we store values and state is not the entity itself. State is the collection of values we use to represent the entity at some given point in time. It is a snapshot of values.

Stateful objects are places. Places that can be referenced and modified by any code that has ever seen them. For instance imagine if the step function could modify the world directly? Such a property exposes the system to incidental complexity. How can you enforce rules in such a scenario?

I used the term state streams to indicate states that are immutable, occur sequentially, and that transitions are defined independently from the application of the change. By contrast state machines are often presented as the direct assignment of values to places. But we need not think of them like that. State machines are a more powerful abstraction when we think about them as the logical visualization of step functions.

Abstract state is ambiguous. Consider an abstraction such as reduce. One way to describe reduce is as calling a function over a collection that builds some accumulation. The accumulate may be described as either state or a value. Value is a more direct explanation. State alludes to some notion of time, model and scope which may well be present when using reduce, but does not help understand the fundamental operation. Value on the other hand alludes to an integer, but can just as well mean rich structured data. We might be more inclined to think of reducing values as the summation of integers, but we still understand that the same abstraction could be applied to aggregating a histogram of favorite colors over a sequence of structured data representing people. Perhaps it would be better to use the terms value streams and flow charts instead of state streams and state machines?

Often the cause of transitions between states are described as events. Event names are used to label the arrows between states. I don't find the concept of events in this context very helpful because it is usually clear what the cause of a transition is. Event names are redundant unless there are heterogeneous events which need further description. I did not label any transitions in my diagrams, but I trust you still understood them.

What about a larger business process state machine diagram?

We could have labeled the link between booking and booked as confirmation received or similar, but it would not add any meaning. The booked state occurring is self descriptive. We could have indicated that transitioning to abandoned occurs after 20 minutes of inactivity. I think that would be a useful annotation, so definitely worthy of inclusion. But don't feel you need to label every link, as it may in fact be more distracting than useful.

Final thoughts

Complex behaviors can be modeled with simple rules. Think about change as step functions that take a value and return a value. Transition rules can be applied externally to step handling. Clojure provides a strong foundation for representing value streams with behavior designed and communicated with flow charts. Spirals are fun.

The diagrams in this post were created with LeaderboardX. The full code for the turtles simulation is on GitHub.