EmbeddedRelated.com
Blogs
Memfault Beyond the Launch

Simple C++ State Machine Engine

Massimiliano PaganiMarch 14, 2024

When the going gets tough, the tough get going. So, if you are going to deal with a problem slightly harder than switching a lightbulb on and off, you may want something better in your toolbox than what I described in my previous post on State Machines.

In this post, I'll start from the basic ad-hoc implementation presented in my last post and will decouple the state handling (the state machine engine) from the specific implementation of the state machine.

Throughout this article, I'll keep the example I used in the first post - the pushbutton-operated light switch. The code could look a bit overengineered for such a simple device. This is correct, but I prefer to keep the function simple, without introducing a more convoluted model, to keep the focus on the state machine itself, rather than the implemented function.

This article is available in PDF format for easy printing

Although reading my previous post is not required, a quick glimpse at the beginning, could provide some useful context to better follow the examples. Also, I'll make use of C++20, although it is not required to have a full grasp of the entire standard, a basic understanding of lambdas, std::variant, and std::visit is advisable to follow the code.

In the last code example in my previous post, I defined states as methods of a class, each one accepting the event as a parameter and returning the next state to transition to.

This change paves the road for more efficient handling of the state - nothing prevents us from using something other than the enumeration as long as we can make a distinction between states and we can invoke the proper state handling function. These two features can be combined by using the state function pointer itself.

The direction is good also because jumping to the active state would be very fast since you skip the switch statement. Unfortunately, we have a little conundrum here. Suppose our two state-methods whose signatures are:

State stateOff( Event e );
State stateOn( Event e );

Now State is the type of the state, i.e. a method that accepts an event and returns a pointer to the same type. But this type is not yet defined, so we cannot reference it. The problem is solvable, and later I show you how. for the moment let's keep this State type around with the somewhat recursive meaning (a pointer to a member function accepting an event and returning a pointer to a member function... ok, you got the picture).

As a side note - it may be tempting to go with a void*, but it would be a dead end.

Another side note - by leaving the enumeration behind, we lose an easy way to print the state name (e.g. by using the magic_enum library). Retrieving the state name from function pointers is not that straightforward and may require some additional code to get it working.

class LightSwitcher
{
  public:
    using State = std::any; // using std::any just to avoid compilation
                            // errors. The right type will be defined later.
    enum class Event 
    {
      ButtonPressed
    };
    void feed( Event event )
    {
      mState = mState(event);
    }
  private:
    State stateOff( Event event ) const
    {
      if( event == Event::ButtonPressed ) {
        gpioWrite( 1 );
        return &LightSwitcher::stateOn;
      }
      return &LightSwitcher::stateOff;
    }
    State stateOn( Event event ) const
    {
      if( event == Event::ButtonPressed ) {
        gpioWrite( 0 );
        return &LightSwitcher::stateOff;
      }
      return &LightSwitcher::stateOn;
    }
    State mState{&LightSwitcher::stateOff};
};

This approach allows us to decouple the state machine implementation (function feed) that takes care of making the state machine evolve, from the state machine structure. In other words, any state machine, regardless of its structure can be driven by this feed function.

At this point, before solving the State type mystery, it is worth splitting the state machine into two classes - the first is a state machine engine, and the latter is the specific state machine implementation (which I refer to as the host)

template<typename T,typename E>
class StateMachine
{
  public:
    StateMachine( T* host, State initialState )
    : mState{ initialState }
    {}
    void feed( E event )
    {
      mState = mState( event );
    }
  private:
    State mState;
    T* mHost;
};

In this code, the StateMachine type depends on the type of the host class (needed to store the pointer to the state function) and the type of the event.

The implementation (host) now becomes

class LightSwitcher
{
  public:
    LightSwitcher()
    : mFsm{ this, &LightSwitcher::stateOff }
    {
       // hw init to match the initial fsm state
       gpioWrite( 1 );
    }
  private:
    enum class Event 
    {
      ButtonPressed
    };
    State stateOff( Event event ) const
    {
      if( event == Event::ButtonPressed ) {
        gpioWrite( 1 );
        return &LightSwitcher::stateOn;
      }
      return &LightSwitcher::stateOff;
    }
    State stateOn( Event event ) const
    {
      if( event == Event::ButtonPressed ) {
        gpioWrite( 0 );
        return &LightSwitcher::stateOff;
      }
      return &LightSwitcher::stateOn;
    }
    StateMachine<LightSwitcher,Event> mFsm;
};

And now let's focus on the State mystery. Do we need a pointer to a member function? Well, not exactly, we can devise a class that has all the information we need and that can invoke a state member function. There is no need to use a functor, i.e. a class that sports an operator() so that it could be called in the same way a function is called.

This State class can be a StateMachine template inner class.

    template<typename T, typename E>
    class State
    {
      public:
        typedef State (T::*StateFn)( E event );
        // the following function need to be implicit.
        State( StateFn stateFn ) noexcept : mState( stateFn )
        {}
        State feed( T& host, E event )
        {
          return (host.*mState)( event );
        }
      private:
        StateFn mState;
    };

Now we have come to a satisfying design for simple state machines - we have a state machine engine that can power an implementation state machine. There is a clear locality for state description and the state variable is protected from rogue access.

The implementation class may also store additional state data. Suppose you want to implement a timer button: you press the button once and the light goes on. After a timeout, the light goes off. If the button is pressed again when the light is on the timer restarts. In this case, you need a timer object. Whether you are running on the bare metal, or within some scheduling RTOS, the specific structure may vary, but in the end, you will need to add data to the state. The state machine engine does not handle these data and may go into the implementation class.

This timer object is specific to the LightOn state since it has no business with the LightOff state. On entering LightOn the timer is set, and the timer expiration event is handled by the LightOn state and timer restart pending a button press is still something operated by the LightOn state.

But adding the timer object in the LightSwitcher object means that even code in the LightOff state may access the timer. In other words, there is no data segregation among states. The programmer has to take care to operate on the right data from a given state.  I don't like much this design, but I consider it a compromise. On the other hand, if you want better state data segregation you would need to create a class for each state, something that would make writing state machines without the help of a code generation tool (or a DSL), harder. Moreover, it would take some effort to design this system properly without the need for dynamic memory (which I prefer to avoid in embedded systems), and virtual function calls (which would introduce some performance overhead).

So, if you go for it, try to be disciplined while accessing data - better grouping state data together to make it clear what can be accessed from where even if the access restrictions are not enforced by the language.

We are ready to make things more convoluted by adding onEntry and onExit handlers. It would be nice if these handlers could be managed by the state machine engine. At the same time if we want to keep state locality it would be better if the same function contains the onEntry/onExit code alongside code for handling events in the same state. This design choice can be challenged, since you may prefer to adhere to the single responsibility principle - so that three functions (onEntry, onExit, onEvent) would be needed. I preferred the single function as a prompt to keep state code simple and delegate complex behaviors to other functions to be called from the state function.

The state function is invoked in this configuration with mutually exclusive arguments - onEntry, onExit, or an application Event. In my earlier design, I employed two arguments for the state function, the first selected the required operation (on entry, on exit, or event) and the latter optionally contained the event. This had two shortcomings - it needed the implementation to do some checking and ignore or consider the event argument. Moreover, it required a dummy event for the argument when the entry/exit handler had to be invoked.

A better approach is to translate the mutual exclusion into the argument type. With type enforcement, writing the wrong code becomes much harder. Modern C++ std::variant is the right choice for this kind of need. It behaves like a traditional union but with type safety added.

Since the state function is expected to return the next state, I am forcing things a bit, because when the function is called to perform the onEntry or onExit actions, I don't want the state to be changed (only events may change the state). So the value returned in these two cases will be ignored by the state machine engine. Nonetheless, I have to return something. Similarly, an event may trigger some actions, but resulting in a transition to another state.

To indicate this intent, I let the function return a nullptr. I could have used a std::optional, but I didn't want to add overhead and the nullptr is safely handled inside the engine and the user code never uses the pointer.
Since, as we see later, using nullptr is not possible in the code I'm going to write, I'm going to define the symbol Stay which is a properly type-casted nullptr.

template<typename T,typename E>
class StateMachine
{
  public:
    // State machine tags
    struct OnEntry {};
    struct OnExit {};
    using StateArg = std::variant<OnEntry,OnExit,E>;
    class State
    {
      public:
        using StateFn = State (T::*)(StateArg const&);
        // the following function need to be implicit.
        State( StateFn stateFn ) noexcept : mState( stateFn )   // NOLINT
        {}
        State feed( T& host, StateArg arg )
        {
          return (host.*mState)( arg );
        }
        [[nodiscard]] bool isNull() const noexcept
        {
            return mState != nullptr;
        }
        [[nodiscard]] bool operator!=(State const& other ) const noexcept
        {
            return mState != other.mState;
        }
      private:
        StateFn mState;
    }; // as defined previously
    static auto constexpr Stay = static_cast<typename State::StateFn>(nullptr);
    explicit StateMachine( T* host, State initialState )
    : mState{ initialState }
    , mHost{ host }
    {
      mState.feed( *mHost, OnEntry{} );
    }
    void feed( E event )
    {
      auto nextState = mState.feed( *mHost, event );
      if( !nextState.isNull && nextState != mState ) {
         mState( OnExit{} );
         mState = nextState;
         mState( OnEntry{} );
      }
    }
  private:
    State mState;
    T* mHost;
};

isNull() and operator!= have been added to make the feed function easier to write and read.

Note that the StateMachine constructor takes the initial state as an argument, and calls the onEntry handler of such a state. This is not usually a good idea - the time you build the state machine, may not be the time you want to start it. For example, the onEntry could depend on something that has not yet been properly set up. I'll keep this design to keep the code in the examples from being too complex.

The LightSwitcher implementation can be rewritten as:

class LightSwitcher
{
  public:                          
    LightSwitcher()
    : mFsm{ this, &LightSwitcher::stateOff }
    {
       // no need to init hw, since it is done in the onEntry
       // of the first state.
    }
  private:
    enum class Event 
    {
      ButtonPressed
    };
    using Fsm = StateMachine<LightSwitcher,Event>;
    Fsm::State stateOff( Fsm::StateArg const& event )
    {
      return std::visit(
        overloaded(
          []( Fsm::OnEntry ){
             gpioWrite( 0 );
             return Fsm::Stay;
          },
          []( Fsm::OnExit ) { return Fsm::Stay; },
          []( Event e ) {
            return e == Event::ButtonPressed ?
              &LightSwitcher::stateOn :
              Fsm::Stay;
          }
        ),
        event
      );
    }
    Fsm::State stateOn( Fsm::StateArg const& event )
    {
      return std::visit(
        overloaded(
          []( Fsm::OnEntry ){
             gpioWrite( 1 );
             return Fsm::Stay;
          },
          []( Fsm::OnExit ) { return Fsm::Stay; },
          []( Event e ) {
            return e == Event::ButtonPressed ?
              &LightSwitcher::stateOff :
              Fsm::Stay;
          }
        ),
        event
      );
    }
    Fsm mFsm;
};

Where overload is defined as:

template<class... Ts>
struct overloaded : Ts... { using Ts::operator()...; };
template<class... Ts>
overloaded(Ts...) -> overloaded<Ts...>;

(code taken from cpp reference website)

Unfortunately, the C++ type system is not powerful enough to infer the right type for nullptr when it is used in the std::visit function. for this reason, I had to provide a Stay static constexpr in the StateMachine class that is just a nullptr cast to the right type.

The use of onEntry/onExit, cleaned a bit up the states, by moving the gpio writing at the entry of the state, disentangling it from the event decoding part.

Since OnExit handlers are empty, you may wonder if they can be omitted. If you try you will get an (unbelievably verbose) error complaining that a type is missing from the visit arguments. This can be annoying, but it is also a safeguard. Rather than defaulting to ignore missing events, you should explicitly tell that you don't intend to handle them.

In this example, there are a few cases, so adding the empty handler is not a problem, but when the number of events increases, then a way to say "ignore everything not explicitly mentioned" is needed. This can be achieved by using a template lambda:

[]( auto&& ){ return Fsm::Stay; }

Just be sure to add this entry at the end of the event handlers (as you would do for the default label in a switch statement).

Let's consider now the event part. Until now we considered just a single event type, but in the real world, you will have many different event types, coming from different unrelated sources. The traditional approach was to unify all the events in some polymorphic object, but this was a kind of forcing round pegs through square holes. You had to decide whether to use a common struct with a type discrimination field (à la Windows message) or to use dynamic memory and dynamic casting. No solution was satisfactory.

Looking at the code, we can see that we already have a sum type (the std::variant) for feeding one of the possible stimuli to the state function. We may leverage this approach and add other kinds of events to the same StateArg type.

This would take advantage of the std::visit/overload pattern matching and would simplify the event decoding part of the state functions. To demonstrate this, let's change the problem a bit and suppose the button event is decorated with a button code. We need to switch the light for a button with code (guess what) 42. Also, a timer expired event will switch the light off.
struct ButtonPress
{
  int buttonCode;
};
struct TimerExpired {};

class LightSwitcher
{
  public:
    LightSwitcher()
    : mFsm{ this, &LightSwitcher::stateOff }
    {
    }
  private:
    using Fsm = StateMachine<LightSwitcher,ButtonPress,TimerExpired>;
    Fsm::State stateOff( Fsm::StateArg const& event )
    {
      return std::visit(
        overloaded(
          []( Fsm::OnEntry ){
             gpioWrite( 0 );
             return Fsm::Stay;
          },
          []( Fsm::OnExit ) { return Fsm::Stay; },
          []( ButtonPress e ) {
            return e.buttonCode == 42 ?
              &LightSwitcher::stateOn :
              Fsm::Stay;
          },
          []( TimerExpired ) {
            return Fsm::Stay;
          }
        ),
        event
      );
    }
    Fsm::State stateOn( Fsm::StateArg const& event )
    {
      return std::visit(
        overloaded(
          []( Fsm::OnEntry ){
             gpioWrite( 1 );
             // some code to arm the timer
             return Fsm::Stay;
          },
          []( Fsm::OnExit ) { return Fsm::Stay; },
          []( ButtonPress e ) {
            return e.buttonCode == 42 ?
              &LightSwitcher::stateOff :
              Fsm::Stay;
          },
          []( TimerExpired ) {
            return &LightSwitcher::stateOff;
          }
        ),
        event
      );
    }
    Fsm mFsm;
};

To get this code working with the state machine engine, we need a small refactoring using a template parameter pack to handle the additional event types.

template<typename T,typename ...Es>
class StateMachine
{
  public:
    // State machine tags
    struct OnEntry {};
    struct OnExit {};
    using StateArg = std::variant<OnEntry,OnExit,Es...>;
    class State
    {
      public:
        using StateFn = State (T::*)(StateArg const&);
        // the following function need to be implicit.
        State( StateFn stateFn ) noexcept : state_( stateFn )   // NOLINT
        {}
        State operator() ( T& host, StateArg arg )
        {
          return (host.*state_)( arg );
        }
        [[nodiscard]] bool isNull() const
        {
            return state_ != nullptr;
        }
        [[nodiscard]] bool operator!=(State const& other ) const noexcept
        {
            return mState != other.mState;
        }
      private:
        StateFn state_;
    }; // as defined previously
    explicit StateMachine( State initialState )
    : mState{ initialState }
    {
      mState( OnEntry{} );
    }
    template<typename E>
    void feed( E&& event )
    {
      StateArg arg = std::forward(event);
      feed( arg );
    }
  private:
    void feed( StateArg const &arg )
    {
      auto nextState = mState( arg );
      if( !nextState.isNull && nextState != arg ) {
         mState( OnExit{} );
         mState = nextState;
         mState( OnEntry{} );
      }
    }
    State mState;
};

This is a pretty good component for modeling state machines in your code. You can find a ready-to-use component in my GitLab repository,

The LightSwitcher, although a bit verbose,  is quite clear to read if you have some familiarity with C++ lambda function syntax and a grasp of state machine concepts. 

In my experience, this is a good compromise between staying with conventional C++ code (no DSL) and describing a moderately complex state machine. This design lacks mainly nested state machines (or sub-state machines) and transition actions. They can be added, but it gets harder to stay in the readable part of the spectrum.

In this article, I walked through the design process to define an implementation for a generic state machine engine. Although in my experience this worked quite well, some design decisions are questionable, and different choices may have different advantages and disadvantages, that may better fit your domain.

In my next article, I will apply the DSL idea to this state machine implementation to reduce the effort in writing this code and to improve the readability.



Memfault Beyond the Launch

To post reply to a comment, click on the 'reply' button attached to each comment. To post a new comment (not a reply to a comment) check out the 'Write a Comment' tab at the top of the comments.

Please login (on the right) if you already have an account on this platform.

Otherwise, please use this form to register (free) an join one of the largest online community for Electrical/Embedded/DSP/FPGA/ML engineers: