Automata vs Spaghetti Code
(Frame is a new Domain Specific Langauge (DSL) focused on enabling developers to easily create programs using automata. See the Frame documentation as well as Getting Started With Frame to find tools and other resources to explore the Frame language.)
“Spaghetti code” is a perfect phrase for software that essentially is a cognitive and esthetic hot mess. This article will reveal a 3 point plan to kill spaghetti code:
- Discuss why spaghetti code isn’t very tasty, al dente or not
- Introduce a new way to look at what code actually does
- Discuss Frame Machine Notation (FMN) that helps disentangle developers from their dinner
We all know how hard everybody else’s code can be to read. This can be because the problem itself is hard or because the structure of the code is — um — creative. Frequently they go hand-in-hand.
Hard problems are just hard problems and usually nothing except a breakthrough will make them any simpler. However, it is often the case that the actual structure of the software has a lot of unnecessary complexity, which is something that can be addressed.
The bad and the ugly of spaghetti code is complicated conditional logic. And while it may be hard to imagine a life without a lot of tricky if-then-else statements, this article is, of course, going to show a better way (the good).
To illustrate our situation with spaghetti code, we will first need to turn this:
Into this:
Let's get cooking!
Implicit State
To cook pasta, we will certainly need to bring some water to a boil. However, even something so apparently simple can be fraught with confusion when spaghetti code is involved.
Here is a simple example:
(temp < 32)
What does this test really detect? It obviously divides a number line into two partitions, but what do the partitions mean? I’m sure you have a reasonable guess, but the fact that the code doesn’t explicitly tell you is the problem.
If I confirm that yes, the test is to see if water is SOLID, then what does it logically mean if it is false?
While the test divided the numbers into two groups, there are actually three logical states — SOLID, LIQUID, GAS!
So this number line:
is partitioned like this by the conditional test:
if (temp < 32) {
} else {
}
Please note what has happened, as it is very important to understanding the unsavory nature of spaghetti code. The boolean test has divided a number space into two parts but has NOT categorized the system as the actual logical structure of (SOLID, LIQUID, GAS). Instead, the test partitioned the space as (SOLID, EVERYTHING ELSE).
Here is a similar test:
Visually this would look like this:
if (temp > 212) {
} else {
}
Notice that:
- the complete set of possible states are never declared anywhere
- the logical states or groupings of states being tested are never declared anywhere in the conditional
- some states are silently grouped by the structure of the conditional logic and branching
This code is fragile but common, and small enough in scope that no one is going to quit if they had to maintain it. So let’s push the situation over the edge.
The code above assumes three states of matter — SOLID, LIQUID, GAS. However, according to science, there are actually four observable states that include PLASMA (there are really a lot of other ones, but this will make the point). While we would never cook pasta with plasma, if this code was posted to Github and then forked by a high-energy physics grad student, we would certainly want it to work well for her.
Adding plasma to the mix, however, now makes the simple code above naively do the following:
It is quite possible that the old code in the else branches for the three state system would break by adding PLASMA to the set of states. Unfortunately, there is nothing in the structure of the code that helps to easily flag the existence of the new state or the impact of the change. Additionally, any bugs are probably subtle, which is the worst kind of defect. Just say no to bugs in spaghetti.
In a nutshell, the problem is this — boolean tests are used to determine states implicitly. The logical states are often undeclared and invisible in the code. As we see above, when the system adds new logical states, existing code can break. To prevent this, developers must reexamine every single conditional test and branch to determine if the code paths are still correct for all the logical states that map to them! This is the primary reason why big code blobs degrade as they become more complicated.
While there is no way to completely eliminate conditional tests on data, any technique that minimizes them will simplify the complexity of the code.
Let’s now take a look at a typical object-oriented implementation of a class that provides a very simple model of a sample of water. The class will manage the sample through the common state changes of matter. After we explore some of the issues with a classical approach to this problem, we will then discuss a new notation called Frame and how it can address the challenges we will find.
First Bring Water to a Boil…
Science has given names to each of the possible transitions matter can go through as temperature changes.
Our class is very simple (and not very useful). It responds to calls to do state transitions and adjusts the temperature until it is correct for the desired target state:
(NOTE: this is a pseudocode of my devising. Use professionally at your own risk)
This code has a few improvements on our original snippets. The first is the replacement of hardcoded “magic” numbers (32, 212) with constants for the state temperature boundaries (WATER_SOLID_TEMP, WATER_GAS_TEMP). This change starts to make the states more explicit, although indirectly.
The code also introduces “defensive programming” tests that guard calling the method when it is in an invalid state for the operation. For instance, water can’t freeze if it is not a liquid — it's against the law (of nature). But adding guard conditions makes it harder to understand the purpose of the code. For instance:
// liquid -> solidif (!(temp < WATER_GAS_TEMP && temp > WATER_SOLID_TEMP))
throw new IllegalStateError()
This conditional test does the following:
- Tests temp is less than the GAS boundary temp
- Tests temp is greater than the SOLID boundary temp
- Throws an error if either of those tests is not true
This is confusing logic. First of all, the state of being liquid is determined by determining what it is not — a solid or gas.
(temp < WATER_GAS_TEMP && temp > WATER_SOLID_TEMP) // is liquid?
Next, the code inspects if the sample is not a liquid to determine if it should throw an error.
!(temp < WATER_GAS_TEMP && temp > WATER_SOLID_TEMP) // Seriously?
It takes more than a glance to parse this subtle double negative state of affairs. Here is a simplification which somewhat decreases the complexity of the expression:
bool isLiquidWater = (temp < WATER_GAS_TEMP
&& temp > WATER_SOLID_TEMP)if (!isLiquidWater) throw new IllegalStateError()
This code is easier to understand because the isLiquidWater state is explicit.
We will now explore some techniques that institutionalize explicit state as the best way to do things. With this approach, the logical states of a system become the physical structure of the software which makes code both better and easier to understand.
The Frame Machine Notation
The Frame Machine Notation (FMN) is a Domain Specific Language (DSL) defining an opinionated, methodical and easy approach for specifying and implementing various types of automata. For simplicity, I will use the term “machine” when referring to Frame automata, as the notation can define systems that meet the theoretical criteria for any of the different types (state machines, pushdown automata, and the top automata predator — Turing machines). Please refer to Wikipedia to get all the details about their various flavors and uses as computational formalisms.
While automata theory can be interesting (a HIGHLY controversial statement), this article will focus on the practical matter of making these powerful concepts a natural way to architect systems and write to code.
To do so Frame introduces standardized notation that works at three integrated levels:
- A text-based DSL for defining Frame controllers using an elegant and economical syntax
- A set of coding reference patterns for implementing object-oriented classes as machines that Frame calls “controllers”
- A visual notation that takes advantage of FMN to express complex operations that are difficult to represent graphically — the Frame Visual Notation (FVN)
In this article, I will focus on the first two points — FMN and the reference patterns — and save FVN discussion for future articles.
Frame is a different kind of notation in a few important ways:
- FMN has first-class entities related to automata concepts that are not present in object-oriented languages.
- the FMN specification defines standard implementation patterns in pseudocode that demonstrate how the FMN notation can be implemented.
- FMN must be able to be transpiled (a work in progress) into any object-oriented language
PLEASE NOTE: the reference implementation is to demonstrate a 1–1 equivalence between FMN notation and a simple way to implement it in any object-oriented language. Feel free to choose any way you prefer.
We will now introduce the two most important first-class entity types in Frame — Frame Events and Frame Controllers.
Frame Events
FrameEvents are essential to the simplicity of the FMN notation. A FrameEvent is implemented as a structure or class that has, at the minimum, the following member variables:
- a message identifier
- a parameter dictionary or list
- a return object
Here is pseudocode for a FrameEvent class:
class FrameEvent { var _msg:String
var _params:Object
var _return:Object FrameEvent(msg:String, params:Object = null) {
_msg = msg
_params = params
}
}
Frame notation uses the @ symbol to identify a FrameEvent object. Each of the mandatory FrameEvent attributes has a special token for accessing it:
@|message| : the pipe bracket references the _msg attribute@[param1] : the [] notation enables dereferencing a variable
parameter list@^ : the carat is used here and in other contexts
to indicate the _return attribute
Often it is not necessary to indicate which FrameEvent is being operated on. As most contexts will have only one FrameEvent in scope at a time, the notation can be unambiguously simplified to just use the attribute selectors in such situations. Therefore we can simplify access to:
|buttonClick| // Select for a "buttonClick" event _msg[firstName] = "Mark" // Set firstName _params property to "Mark"^ = "YES" // Set the _return object to "YES"
While this notation may seem strange at first, we will see how this simple syntax for events makes the FMN code very easy to understand.
Frame Controllers
A Frame Controller is an object-oriented class organized in a defined way to implement a Frame machine. Controller types are identified using a # symbol prefix:
#MyController
is equivalent to this object-oriented pseudocode:
class MyController {}
This is obviously not a very useful class. In order to do anything, the controller needs at least one state to respond to events.
Controllers are structured to contain blocks of different kinds that are identified by dashes around the block type name:
#MyController
-block 1-
-block 2-
-block 3-
There can be at most one instance of each block type in a controller and block types can contain only certain kinds of subcomponents. In this article, we will only explore the -machine- block, which can only contain states. States are identified by the $ prefix token.
Here we see the FMN for a controller hosting a machine with only one state:
#MyController // controller declaration
-machine- // machine block
$S1 // state declaration
Here is an implementation of the FMN code above:
class MyController { // -machine-
var _state(e:FrameEvent) = S1 // initialize state variable
// to $S1 func S1(e:FrameEvent) { // state $S1 does nothing
}}
The machine block implementation consists of:
- a _state variable that references the current state function which is initialized to the first state function in the controller
- one or more state methods
A Frame state method is defined to be a function with the following signature:
func MyState(e:FrameEvent);
With these basics of the machine block implementation defined, we can now see how the FrameEvent object interplays nicely with the machine.
The Interface Block
The interplay of FrameEvents driving machine activity is the essence of both the simplicity and power of Frame notation. A lingering question, however, is where do FrameEvents come from — how do they get into a controller to drive it? One possibility is that external clients can create and initialize FrameEvents themselves and then directly call the method pointed to by the _state member variable:
myController._state(new FrameEvent("buttonClick"))
A much better alternative would be to provide a generic public interface method that wrapped the direct call to _state member variable:
myController.sendEvent(new FrameEvent("buttonClick"))
However, the most seamless approach that conforms to the usual way of creating object-oriented software is to create public methods that send the event on behalf of the client to the internal machine:
class MyController { func buttonClick() {
FrameEvent e = new FrameEvent("buttonClick")
_state(e)
return e._return
}}
Frame defines a syntax for an interface block which contains methods that turn calls to the public interface into FrameEvents.
#MyController
-interface-
buttonClick
...
There is much more to explain regarding the interface block but this example gives an idea of how it works. Stay tuned for future articles on the topic.
Now let's continue exploring how the Frame machine works.
Event Handlers
Even though we have demonstrated how to define a machine, we do not have any notation to actually do anything yet. To handle events we need to 1) be able to select an event we want to handle and 2) map it to behavior to execute.
Here is a simple Frame controller that provides the infrastructure to handle events:
#MyController // controller declaration
-machine- // machine block
$S1 // state declaration
|e1| ^ // e1 event handler and return
As discussed earlier, FMN uses the pipe bracket notation to access the FrameEvent _msg attribute:
|messageName|
FMN also uses the caret ^ token to indicate a return statement. Our controller above would be implemented like this:
class MyController { // #MyController // -machine- var _state(e:FrameEvent) = S1 func S1(e:FrameEvent) { // $S1
if (e._msg == "e1") { // |e1|
return // ^
}
}
}
We can see here how FMN notation maps cleanly to an implementation pattern that is both easy to understand and code.
With these essential aspects of events, controllers, machines, states and event handlers defined we can now turn our attention to using machines to solve real(ish) problems.
Mono Focus Machines
Previously we saw a controller with no states which, we observed, was rather useless.
#MyController
One degree up the food chain of utility is a class with only a single state, which, while not useless, is just boring. But at least it does something.
To start, let’s see how a class with just one (implicit) state would be implemented:
class Mono { String status() { return "OFF" }
}
There is no state declared or even implied, but let's assume that if the code does anything at all then the system is in a “Working” state.
We will also introduce the important idea of considering calls to an interface to be the same as sending an event to the object. Therefore the code above can be viewed as a way to send a |status| event to the Mono class that is always in a $Working state.
One way to visualize this situation is by an event mapping table:
Let’s now take a look at the FMN that demonstrates the same functionality and conforms to the same mapping table:
Here is the implementation:
You can see we have also introduced new notation for the return statement that indicates to evaluate the expression and return the result to the interface:
^(return_expr)
This statement equivalent to
@^ = return_expr
or just
^ = return_expr
All of these statements are functionally equivalent and can be used interchangeably, but ^(return_expr) is arguably the most concise.
Turning on the Stove
At this point, we have seen a controller with 0 states and a controller with 1 state. Not too useful yet but we are on the cusp of doing something interesting.
To boil our pasta, we will first need to switch on our stove. Below we see a simple Switch class with a single boolean variable:
Though it is not directly obvious from its structure, the code above implements this event mapping table:
In contrast, here is the FMN for the same behavior:
We can see how directly the Frame notation conforms with the purpose of the code — to map an event (method call) to behavior based on what state the controller is in. Additionally, the structure of the implementation is also aligned with the mapping table as well:
The table makes it easy to quickly understand the purpose of the controller in its different states. Both the structure of Frame notation as well as the implementation pattern give similar benefits.
However, there is one glaring functional issue with our switch. It is initialized in the $Off state but there is no way to switch it $On! To do so, we need to introduce the change state operator.
Changing State
The change state operator looks like this:
->> $NewState
We can now use this operator to toggle between $Off and $On:
Here is the corresponding event map table:
A new |toggle| event now triggers a change that simply cycles between the two states. So how can a change state operation be implemented?
Easy peasy. Here’s the implementation for Switch2:
One final improvement to our Switch2 could be to not only permit toggling between the states but also enable explicitly setting the state:
Unlike with the |toggle| event, if |turnOn| is sent when the Switch3 is already on or |turnOff| when it is already off, the message is ignored and nothing happens. This small improvement gives the client a way to be explicit about the state that the switch should be in:
This final evolution of our switch shows how easy it is to understand the intended workings of the controller in FMN. The corresponding code demonstrates how simple it can be implemented using Frame mechanisms.
With the Switch machine now designed, we can get the heat on and start cooking!
Probing for State
A key, but subtle, aspect to automata is that the current state of a machine is the result of either a situation (like starting up) or some analysis of data or the environment. Once a machine switches to a given state, it is assumed that the situation won’t change without the machine being aware of it.
This assumption, however, is not always true. Certain situations absolutely require testing (or “probing”) data to determine what the current logical state is:
- initial restored state — when a machine is restored from a persisted state
- external state — determining “the facts on the ground” that exist in the environment when a machine is created, restored or running
- volatile internal state — where some of the internal data managed by a running machine can change outside of the machine’s control
In all these cases either the data, the environment or both must be “probed” to determine the situation and set the machine state accordingly. Ideally, this boolean logic can be factored into a single function that will determine the correct logical state. To support this pattern, Frame notation has a special type of function to probe the universe and determine what the situation is at that instant. These functions are indicated by a $ prefix in front of a method name that returns a state reference:
$probeForState()
For our situation, this method could be implemented like this:
func probeForState():FrameState { if (temp < 32) return Solid
if (temp < 212) return Liquid
return Gas
}
We see above that the method simply returns a reference to the state function that corresponds to the correct logical state. This probe function can then be used to transition to the correct state:
->> $probeForState()
The implementation mechanism looks like this:
_state = probeForState()
The state probe method is an example of Frame notation for manipulating state in a defined manner. Next, we will explore important notation for manipulating FrameEvents as well.
Behavioral Inheritance and the Dispatch Operator
Despite sounding like a bad band name for nerds, behavioral inheritance and the dispatch operator is actually a powerful programming paradigm and the last topic on Frame notation in this article.
Frame uses a form of behavioral inheritance and not the inheritance of data or other attributes. To do so, states send FrameEvents to other states if the original state doesn’t handle the event (or as we will see in other articles, just wants to pass it on). This chain of event passing can go on to any depth desired.
To do this, machines can be implemented using a technique called method chaining. The FMN notation for passing an event from one state to another is the dispatch operator => :
$S1 => $S2
This FMN statement can be implemented like this:
func S1(e:FrameEvent) { S2(e) // $S1 => $S2}
We can now see how easy it is to chain state methods. So let's apply this technique to a moderately complex situation:
Above we see that there are two base states — $Moving and $Motionless — and the other five states inherit important functionality from them. The event map makes it clear what the aggregated maps look like:
The implementation is straightforward based on the techniques we have learned:
The Water Machine
We now have the basic FMN knowledge needed to understand how to reimplement the original WaterSample class in a stateful and much more intelligible way. We will also make it ready to use for our physics student and add in the new $Plasma state for her:
Here is the full FMN implementation:
We can see that we have a start state $Begin that responds to the |create| message and saves off the temp value. A probe function first tests the initial temp to determine the logical state and then the machine transitions to it.
The physical states ($Solid, $Liquid, $Gas, $Plasma) all inherit defensive behavior from the $Default state. Any events that are invalid for the current state are passed to the $Default state which will throw an InvalidStateError. This demonstrates how simple defensive programming can be implemented using behavioral inheritance.
And now for the implementation:
Conclusion
Automata are concepts at the core of computer science that have, for too long, been reserved for specialized areas of software and hardware development. The primary goal of Frame is to provide a notation for describing automata and defining simple coding patterns or “mechanisms” for implementing them. The hope is that Frame changes how developers think about automata, gives them an easy way to practically apply them to daily programming problems, and of course, keeps spaghetti on the plate and not in the code.
Future articles will build on these concepts and introduce more power and expressiveness to the FMN notation. Eventually, I will expand the discussion to explore the visual modeling that incorporates FMN and resolves ambiguous behavior in current software modeling approaches.