EmbeddedRelated.com
Blogs

Unuglify C++ FSM with DSL

Massimiliano PaganiMay 7, 2024

This is the third installment of my finite state machine in C++ series. In the first post, I introduced the subject of a state machine and how to go from a trivial but poor implementation to a more robust and better scalable one.

In the second article, I showed how to decouple the implementation of the state machine engine (for reuse) and the specific part (that can't be reused).

The result of the second post was satisfying in terms of functionalities and decoupling, but the states require some boilerplate code that doesn't make them pretty. In this article, we'll see how to improve the look and avoid the boilerplate by defining a preprocessor-based DSL.

This article is available in PDF format for easy printing

First, let's get the elephant in the room - what are DSLs? DSL stands for Domain Specific Language and is a usually small language whose constructs are specifically targeted to a given domain. SQL is a good example of a DSL - it is designed just for database interactions. The good old Makefile is written in a DSL and interpreted by the make command. DSL can also be embedded in a general-purpose language, LINQ is an example of such a thing in C#. The Scala language ecosystem has many DSLs available for several domains including unit testing and database interactions.

The idea proposed in this post is to leverage the C++ preprocessor to define a DSL for simplifying the writing of FSMs using the FSM engine designed in my last post.

I'm going a bit down nostalgia lane, so if you are not in the mood, please skip the next paragraphs (scroll to the first source code listing).

My first attempt at introducing a DSL for state machines was back at the turn of the millennium - while designing the engine for Rayman GBC (playable version). Z80 macro assembler by Nintendo provided the only language available, so the abstraction level was pretty low. The assembler had some quite powerful macro constructs. This allowed me to design a bunch of macros that defined states, with their entry and exit handlers. It was far from a general solution since everything was quite tightly coupled with the game engine, but it allowed all my colleagues even the least experienced in assembly to write gameplay code.

Next, we switched to GameBoy Advance with C language and I opted to repeat the same approach. The host language changed and the C preprocessor was less powerful than the macro assembler I used in the previous project, but the result was ultimately good - the DSL helped to shape and define game object behaviors in a clear and self-descriptive way.

Fast forward a few years in the firmware world, I brought on with this approach until my boss made me ponder that I was defining a language that was not C. This is the crucial point of the DSL embedded in another language - you are introducing a new language, something that a newcomer has to learn and that might look unfamiliar or scary, and possibly hard to master when the domain (of the DSL) is not well known.

When a DSL is embedded in another language, it has to follow the host language rules. The closer the two languages are, the less this is a problem. But to overcome the differences is not unusual to force odd and artificial-looking syntax. This is not nice and beyond a threshold, it breaks the magic pushing the users' preference to plain host language with proper libraries.

Pending my boss' observation, I simplified the code to be acceptable even without a DSL. This code worked and required no strange construct, but the model was very naive and still required the programmer to look up how to do stuff and what pre-conditions, and post-conditions were. So, it doesn't matter how you shift the balance, you can't have the cake and eat it too.

Let's take a state source code from my previous article and then try to shape the DSL:

Fsm::State stateOff( Fsm::StateArg const& event )
{
  return std::visit(
    overloaded{
      [](Fsm::OnEntry) {
        gpioWrite(0);
        return Fsm::Stay;
      },
      [](Fsm::OnExit) {
        return Fsm::Stay;
      },
      [](ButtonPress e) {
        return e.buttonCode == 42 ?
          &LightSwitcher::stateOn :
          Fsm::Stay;
      },
      []( auto _ ) {
        return Fsm::Stay;
      }
    },
    event
  );
}

Now, if we want to get rid of the boilerplate code, ideally we would like to write something like this:

STATE(Off)
  ON_ENTRY
    gpioWrite( 0 );
  ON_EVENT( ButtonPress e )
    if( e.buttonCode == 42 ) {
      TRANSITION_TO( On );
    }
END

As you can see my former boss was right - saved for curly braces and double equals, this barely resembles C++. And if you are not familiar with state machines (and possibly with the path we walked from my first FSM article to now), you may wonder what the rules for this language are.

The other problem is that you are not the only one to be confused about this - your IDE/Editor may not be able to recognize it, showing it without syntax highlighting or with syntax error marks, and/or failing with completions or any other automatic (non-AI) code suggestion.

But on the bright side, we halved the count of needed lines, we prevented mistyping errors, we prevented the programmer from forgetting the trailing argument in the overloaded call (I always do), and we got rid of housekeeping code.

Moreover, we can embed logging and tracing instructions in those macros, so to simplify our debugging. I won't show this in this article, but it is easier to add a log here where you know the symbolic name of the state function, rather than in the Fsm engine library where you know only the address.

Another advantage is that by inserting an abstraction layer between your code and the underlying FSM engine, you could, switch to another FSM engine, or change the interface requiring just the adjustment of the DSL macros and not the rewriting of the code that uses the DSL.

Let's see if the C preprocessor provides everything we need to define this DSL. There are two strong limitations with the preprocessor -

  1. it is a plain textual replacement, so it doesn't understand, nor comply with the C++ syntax and context.
  2. you cannot define new preprocessor symbols when instantiating a preprocessor macro. So you can't write a macro generator.

These limitations will force a bit the DSL syntax. Moreover, limitation #1 means that preprocessor macros have no scope. Define X and you will get X all over potentially conflicting with every other X in the code universe. There is no solution, but the problem can be mitigated by having a potentially unique prefix. My library is called Chef, so using a CHEFFSM_ prefix may do the job. But this reduces the readability. In the rest of the article, I will ignore the problem and show code without prefixes. In my library, the symbols are all prefixed.

Let's start with the first token STATE(Name). This must be roughly translated into the code:

Fsm::State stateName( Fsm::StateArg arg ) const
{
  return std::visit( overloaded(

This code expects that the type of the state machine is named Fsm. You may remember that the state is a member function of a host class, i.e. a class where the specific behavior of the state machine is described. This class hosts the state machine engine instance and, for sure knows its type. So, we may require the state machine type to be defined as the nested type Host::FsmType. This design choice takes away a bit of flexibility (requiring a fixed name for a type), but simplifies the usage of the DSL.

With this in mind, the STATE macro can be written as:

#define STATE( H, N )                                         \
    H::FsmType::State H::state##N( H::FsmType::StateArg const& chefFsmStateMachine_event ) { \
        using ChefFsmHostType = H;                                                                   \
        return std::visit(                                                  \
            overloaded{

Two notes on this code. First, the argument name of the state method is purposely wordy to avoid a name clash with user variables or symbols. Second, I captured the type of the host class (H) in a local type ChefFsmHostType because I knew that it would be needed later. This saves me from passing the host type as a macro argument to other macros.

Also, note that the overloaded template is not defined in the standard library, so we need to define it somewhere.

Now it is time to define ON_ENTRY and ON_EXIT. From the original state source code, we see that handlers are concatenated one after the other using a comma. According to the direction you look at it - every handler but the first is preceded by a comma, or every handler but the last is followed by a comma. So it is not possible to treat all the handlers in the same way. In the DSL I removed the handler blocks graphs for a couple of reasons, first being a new syntax the programmer has no clue whether these blocks have to be separated or not, and in case they have to be, what the separator is (comma? semicolon?). The second reason is that I want the return Stay to be implicit so that the programmer has to take care only of the outgoing transitions.

To simplify the implementation I decided to force the ON_ENTRY handler to be the first. It comes quite naturally and goes on par with classes where constructors are usually defined first.

#define ON_ENTRY                               \
                [this](FsmType::OnEntry) {

#define ON_EXIT                                \
                    return FsmType::Stay;      \
                },                             \
                [this](FsmType::OnExit) {

Here you can see that the ON_EXIT handler starts producing the code needed to close the previous state, this part will be present also in the event handler we'll see below. Also, note that having required the symbol FsmType to be the type of the FSM engine simplifies these macros by preventing us from passing that type as a macro argument.

ON_EVENT macro is not much different:

#define ON_EVENT( E )                          \
                    return FsmType::Stay;      \
                },                             \
                [this](E) {

Again it starts with closing the previous handler and then introduces the event handling lambda. Note that macro argument E can include spaces (it is a textual substitution and can give you headaches when you need to deal with commas, but this is not the case), so you can write the ButtonPress e code that you need in the handler body to check the button code.

Now it would be nice that every unhandled event would be ignored as per the State Machine definition. If you have tried to play a bit with the code I published in my last article (or with std::visit and overloaded), you may have noticed that the compiler gets angry if you miss a handler for an event. This is a safety net, but I find it a bit annoying, also because it is perfectly fine to ignore some events on a state-by-state basis. So I included the default case in the END macro.

#define END                                       \
                    return FsmType::Stay;         \
                },                                \
                [this](auto const&) {             \
                    return FsmType::Stay;         \
                }                                 \
            },                                    \
            chefFsmStateMachine_event             \
        );                                        \
    }

Note that you may still add a default handler, as long as it is the last one - use the ON_EVENT macro with auto const& _ argument. This ON_EVENT is a template so the argument _ will potentially be of one of several types. Usually, this is fine since the default action for a group of different events does not depend on the event value or type. Should you need to differentiate reactions, then you might want to use dedicated handlers. Nonetheless, the _ variable could be used as an argument for template calls or to retrieve type info and print them for logging or debugging.

The last macro we need is the transition macro. This macro leverages the ChefFunHostType type that we have defined in the state begin macro:

#define TRANSITION_TO( S )                                           \
                return &ChefFsmHostType::state##S

The state macros are complete, we have seen that using the preprocessor forced us a bit of constraint that may not come naturally to the user:

  • STATE/END group must be textually located outside the host class declaration;
  • ON_ENTRY must be the first handler in the state;
  • Someone has to define FsmType;
  • Someone has to define the overloaded template.

We'll take care of the last two, but the first two we have to live with.

To address the FsmType we may introduce a macro to declare the state machine. This macro will be instantiated in the class definition and may be written as:

#define DECLARE_FSM_TYPE( H, Ev... ) \
    using FsmType = ChefFsm::StateMachine<H, Ev>;

Here I used a variable argument macro to collect all the possible events and provided the symbol FsmType to all the methods of the host class. Unfortunately, you need to explicitly pass the host class type because there is no way to get the class type inside its declaration in a defined or portable way. Everything that comes to mind involving the this pointer is going to fail because this is not defined at class definition time. You could get something close by using the CRTP approach, but it would need to write the class type in the base template type argument list, so there is not much gain.

The last thing is to declare the states, we have provided macros for their definitions (out of the class definition), but we still need to declare them in the class definition.

#define DECLARE_STATE( S ) \
    FsmType::State state##S( FsmType::StateArg const& event )

And we are done. I just left out the overloaded template. We could put it everywhere we want, in a macro, or some library. In my Chef library, you can find it inside the ChefFsm namespace in the Fsm DSL header.

So, let's see how the host class definition looks now:

class LightSwitcher
{
  public:
    LightSwitcher()
    : mFsm{ &LightSwitcher::stateOff }
    {
    }
    template<typename E>
    void feed( E event )
    {
        mFsm.feed( event );
    }
  private:
    DECLARE_FSM_TYPE(LightSwitcher,ButtonPress);
    DECLARE_STATE(On);
    DECLARE_STATE(Off);
    FsmType mFsm;
};

There are still some sharp edges, that aren't hidden behind the DSL -

  • FSM engine type has to be named FsmType;
  • an explicit feeder is needed;
  • state machine starts with the constructor.
These are details that can be refined by adding other macros. But let's proceed to examine the class implementation
STATE(LightSwitcher, Off)
  ON_ENTRY
    gpioWrite( 0 );
  ON_EVENT( ButtonPress e )
    if( e.buttonCode == 42 ) {
      TRANSITION_TO( On );
    }
END

STATE(LightSwitcher, On)
  ON_ENTRY
    gpioWrite( 1 );
  ON_EVENT( ButtonPressed e )
    if( e.buttonCode == 42 ) {
      TRANSITION_TO( Off );
    }
END

The resulting code is definitively terser and more focused on the problem the code is trying to solve rather than the mechanism we use to solve it.

We saw the restrictions of this approach, is there anything else that we could miss? We are constrained to just one finite state machine per hosting class. This was quite fine to me until I realized that this could have been a way to implement a hierarchical state machine, i.e. one state is composed of a sub-state machine. This could have been done with the library (although it would have become yet more verbose) but is ruled out by the definition of type Fsm.

In a production context, macros will need some prefixes, reducing a bit the readability, but it is the price we have to pay to use something devised 50 years ago in much simpler times.

Engineering is mostly about trade-offs, you may opt for a simpler unobtrusive FSM framework that doesn't need DSL, but it possibly won't scale up gracefully, you may go with a code generator but then you have to keep the original model and the generated code in sync and debugging could be hard, or you may go with a DSL, but then your IDE may not fully support you. In other words, each project has its peculiarities and each programmer has their preferences, carefully choosing the tools that best suit the project (and your taste) is always recommended.

In this article, we have seen what DSL are and how can be implemented in C++ using the preprocessor. By applying the process to the state machine engine, we translated the example presented in my previous article into a much shorter and clearer code.
Next article will be more generally focused on state machine ecosystems, feeding, and interacting with FSM.



To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.

Please login (on the right) if you already have an account on this platform.

Otherwise, please use this form to register (free) an join one of the largest online community for Electrical/Embedded/DSP/FPGA/ML engineers: