Round-robin or RTOS for my embedded system

Manuel Herrera, June 9, 2019, 7 comments

First of all, I would like to introduce myself. I am Manuel Herrera, and I am starting to write blog posts about situations I have faced over the years of my career and discussed with colleagues.

To begin, I would like to open a conversation about a dilemma that comes up at the start of many projects: should I use an operating system or not?

I hope it helps you form your own criteria and, above all, that you enjoy it.

Does my embedded system need an RTOS?

Embedded systems without an operating system, often called round-robin systems, are usually structured in one of two ways:

In the first option, the main loop calls a list of functions in a fixed order. Each function processes its own events internally and tries to return to the main loop as quickly as possible; together they behave like one large state machine that encompasses the whole operation of the system. Depending on the complexity of its work, each function may or may not be implemented as a state machine itself. In this scheme, the correct functioning of the entire system depends heavily on the design of the global state machine and on the synchronization of all the internal state machines it rests on.
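A minimal sketch of this first option, assuming a hypothetical system with one debounced push button that toggles an LED. The names (button_step, led_step, superloop_pass) and the three-pass debounce are illustrative assumptions, not part of the original article.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical module: a push button debounced over several loop passes. */
typedef enum { BTN_IDLE, BTN_DEBOUNCE, BTN_PRESSED } button_state_t;

typedef struct {
    button_state_t state;
    uint8_t stable_count;   /* consecutive passes the input has read "pressed" */
    bool pressed_event;     /* true for exactly one pass when a press is confirmed */
} button_t;

#define DEBOUNCE_PASSES 3u  /* assumed value, tune for the real loop rate */

/* One non-blocking step: inspects the input, updates state, returns at once. */
void button_step(button_t *b, bool raw_level)
{
    b->pressed_event = false;
    switch (b->state) {
    case BTN_IDLE:
        if (raw_level) {
            b->stable_count = 1;
            b->state = BTN_DEBOUNCE;
        }
        break;
    case BTN_DEBOUNCE:
        if (!raw_level) {
            b->state = BTN_IDLE;            /* glitch: start over */
        } else if (++b->stable_count >= DEBOUNCE_PASSES) {
            b->pressed_event = true;
            b->state = BTN_PRESSED;
        }
        break;
    case BTN_PRESSED:
        if (!raw_level) {
            b->state = BTN_IDLE;
        }
        break;
    }
}

typedef struct { bool on; } led_t;

/* Second function in the loop: reacts to the event produced by the first. */
void led_step(led_t *led, const button_t *b)
{
    if (b->pressed_event) {
        led->on = !led->on;
    }
}

/* One pass of the main loop: the call order is fixed and matters. */
void superloop_pass(button_t *b, led_t *led, bool raw_level)
{
    button_step(b, raw_level);
    led_step(led, b);
}
```

On a target, main() would simply call superloop_pass() forever, passing the sampled pin level. Note how led_step depends on button_step having run first in the same pass, which is the kind of synchronization between internal state machines that the paragraph above describes.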

The second option is a round-robin system with a scheduler. Here a scheduler mechanism drives the global execution of the system by assigning each function a fixed time slot in which to execute. The system may rearrange the execution order of the tasks according to criteria such as their priority and assigned time. When a task needs more execution time than its assigned slot, the system will not obtain that task's result until the scheduler has run it enough times.
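A rough sketch of this second option follows. It makes a simplifying assumption: a "time slot" is modeled as a bounded amount of work per call rather than a slot measured by a hardware timer. All names (task_t, scheduler_round, checksum_task, CHUNK) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* A task does at most one slot's worth of work per call and
 * returns true once its whole job is complete. */
typedef bool (*task_fn_t)(void *ctx);

typedef struct {
    task_fn_t run;
    void *ctx;
    uint32_t slots_used;   /* how many slots the task has consumed so far */
    bool done;
} task_t;

/* Give every unfinished task exactly one slot, in table order.
 * Returns the number of tasks that still need more slots. */
size_t scheduler_round(task_t *tasks, size_t count)
{
    size_t pending = 0;
    for (size_t i = 0; i < count; i++) {
        if (tasks[i].done) {
            continue;
        }
        tasks[i].slots_used++;
        tasks[i].done = tasks[i].run(tasks[i].ctx);
        if (!tasks[i].done) {
            pending++;
        }
    }
    return pending;
}

/* Example of a job that does not fit in one slot:
 * a byte sum over a buffer, CHUNK bytes per slot. */
#define CHUNK 4u

typedef struct {
    const uint8_t *data;
    size_t len;
    size_t pos;      /* progress is kept between slots */
    uint32_t sum;
} checksum_job_t;

bool checksum_task(void *ctx)
{
    checksum_job_t *job = (checksum_job_t *)ctx;
    size_t end = job->pos + CHUNK;
    if (end > job->len) {
        end = job->len;
    }
    while (job->pos < end) {
        job->sum += job->data[job->pos++];
    }
    return job->pos == job->len;
}
```

With a 10-byte buffer and 4 bytes per slot, the sum only becomes available after the third round, which illustrates the last sentence above: a task that needs more than its slot delivers its result only after being scheduled enough times.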

The two options above describe simple control-flow schemes that do not impose significant overhead. Real-time operating systems aim to solve the programming complexity posed by these two round-robin options. An RTOS tries to let each task run as if it were the only one in the system. It has a mechanism that interrupts each task according to certain criteria; once the interruption is over, the RTOS returns control to the task at exactly the point where it was interrupted.

Of the three options presented so far, the RTOS is certainly the one that involves the most code. Because in most cases it is not written from scratch for the project at hand, other issues such as monetary cost and licensing have to be considered. There is also the cost of carrying a lot of extra code in the project: we could make the mistake of adapting the system to the chosen RTOS instead of adapting the software to the original solution.

Another important consideration is that using any RTOS will generally require greater software knowledge.

The question of whether an RTOS is necessary should start by evaluating whether the functionality of the system and the events it handles can be managed safely both with and without an RTOS. Rate Monotonic Analysis (RMA) is a recommended tool for this analysis. Even without RMA, your scheduler can help ensure deadlines are met. Ironically, an RTOS will typically increase interrupt latency.
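As a hedged illustration of the RMA mentioned above, the sketch below applies the classic Liu and Layland utilization bound, U <= n(2^(1/n) - 1). It assumes periodic, independent tasks with deadlines equal to their periods and fixed priorities assigned by rate. The type and function names are hypothetical, and the task figures in the note below are invented for the example.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    double exec_time;   /* worst-case execution time C_i */
    double period;      /* period T_i, in the same unit as exec_time */
} rma_task_t;

/* Total processor utilization U = sum of C_i / T_i. */
double rma_utilization(const rma_task_t *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; i++) {
        u += tasks[i].exec_time / tasks[i].period;
    }
    return u;
}

/* Sufficient (not necessary) test: U <= n * (2^(1/n) - 1).
 * Rewritten as (1 + U/n)^n <= 2 so that no n-th root is needed. */
bool rma_passes_bound(const rma_task_t *tasks, size_t n)
{
    if (n == 0) {
        return true;
    }
    double base = 1.0 + rma_utilization(tasks, n) / (double)n;
    double power = 1.0;
    for (size_t i = 0; i < n; i++) {
        power *= base;
    }
    return power <= 2.0;
}
```

For example, three tasks with (C, T) of (1, 4), (1, 5) and (2, 10) give U = 0.65, below the three-task bound of roughly 0.78, so the set passes. A set that fails this test is not necessarily unschedulable; it only means a more exact analysis, such as response-time analysis, would be needed.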

An RTOS can help by allowing the project to be broken into independent threads or processes that communicate and synchronize through services such as message queues, mutexes, and semaphores. This requires a certain way of thinking, and the project may end up highly complex.

Let's not forget a crucial component of system development: reusable code. Whether the system is round-robin or RTOS-based, different layers can most likely be reused across projects. The question here is whether drivers and protocol handlers coded against an RTOS API will plug into future projects more easily. Although any well-structured project should, by definition, allow its software components to be reused, it is true that there are already standards that facilitate the reuse of software, and it is necessary to analyze whether adhering to them suits the project. Keep in mind as well that, once the decision to use an RTOS has been made, the selection process will depend on the specific considerations and requirements that best fit the system.


My intention has not been to push my personal position on the matter, because I consider that each project has its own scope and restrictions. That said, I personally prefer to avoid extra software layers whose internal workings are unknown to me, or that leave me without control over what happens underneath them, because sooner or later something in their behavior will have to change.

Comment by QL, June 11, 2019
The article mentions the term "state machine" quite a bit, but it seems to lump "state machines" together into the "round robin" (a.k.a. "superloop" or "main+ISRs") architecture. I see this loose understanding of the term "state machine" as a nebulous blob of conditional code happening quite a bit. For example, Richard Barry--the creator of FreeRTOS--also often contrasts [messy] "state machines" with the [superior] RTOS-based architecture (e.g., see ).

But really, the term "state machine" actually means something, and it is typically the exact opposite of what people put into their "superloops".

To call a piece of code a "state machine", it must have at the very least a few properties, such as: (1) a clearly identifiable, central "state variable" which indicates the currently active state, (2) clear rules for changing the "state variable", which are called "state transitions", and (3) it should be possible to depict the operation of the code in a state diagram.
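To make the three properties concrete, here is a small table-driven sketch. It is an illustration added for this discussion, not code from the commenter, and the motor controller with its states and events is entirely hypothetical.

```c
/* Property (1): one central state variable, with an explicit set of states. */
typedef enum { ST_STOPPED, ST_RUNNING, ST_FAULT, ST_COUNT } state_t;
typedef enum { EV_START, EV_STOP, EV_ERROR, EV_RESET, EV_COUNT } event_t;

/* Property (2): every transition rule lives in one place.
 * Rows are current states, columns are events, cells are next states. */
static const state_t transitions[ST_COUNT][EV_COUNT] = {
    /*               START       STOP        ERROR     RESET      */
    [ST_STOPPED] = { ST_RUNNING, ST_STOPPED, ST_FAULT, ST_STOPPED },
    [ST_RUNNING] = { ST_RUNNING, ST_STOPPED, ST_FAULT, ST_RUNNING },
    [ST_FAULT]   = { ST_FAULT,   ST_FAULT,   ST_FAULT, ST_STOPPED },
};

typedef struct {
    state_t state;
} motor_fsm_t;

/* The only function allowed to change the state variable. */
void motor_fsm_dispatch(motor_fsm_t *fsm, event_t ev)
{
    fsm->state = transitions[fsm->state][ev];
}
```

Property (3) follows from the table: each cell whose next state differs from its row is one arrow in the state diagram, so the diagram can be drawn directly from the code. Here, for instance, ST_FAULT can only be left through EV_RESET.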

Unfortunately, most code that you can find inside "superloops" doesn't have these properties. In fact, the code tends to contain a multitude of flags and variables, which are then set and tested in convoluted IF-THEN-ELSE branches of code. The other name for this type of code is "spaghetti" or BBM (Big Ball of Mud). Needless to say, it is impossible to reverse-engineer BBM into a real state machine diagram.

So, in this sense "state machines" are exactly NOT this. In fact, the most important benefit of "state machines" is that they act as the most powerful "spaghetti reducers".

Comment by pmueller, June 19, 2019

I agree with Miro's arguments. A good state machine design, with or without an RTOS, is key for embedded control applications. If the machines are not trivial, it is also highly recommended to use a code generation tool for the otherwise error-prone task of coding state machines.

Comment by mr_bandit, November 20, 2019

I have done plenty of systems where a command loop calls a number of state machines. The devices were interrupt driven. 

These were systems where exact timing was not critical, or where I could isolate the real-time functionality and handle it with timers or in the device driver ISRs. Obviously, I made sure the ISR execution time was properly limited.

For cases where latency is critical, an RTOS is a logical choice. But I did a command loop on a time-critical system, because I isolated the time-critical parts to hardware: FPGA and the ethernet to a WizNet chip. Basically, "set and forget".

Another system had a timer ISR that set flags/states for the state machines, which were called in a command loop. A state machine only did something when triggered by the ISR. There was a bit of latency, but within design bounds. And the design assured determinism in what executed when.
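The pattern described above could look roughly like the sketch below. It is an assumed reconstruction for illustration, not the commenter's code: timer_isr stands in for whatever handler the real timer peripheral would call, and the blinker state machine is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Written by the ISR, read and cleared by the command loop. */
static volatile bool tick_flag;

/* On a real target this would be registered as the timer interrupt handler.
 * It does the minimum possible work: it only raises the flag. */
void timer_isr(void)
{
    tick_flag = true;
}

typedef enum { BLINK_OFF, BLINK_ON } blink_state_t;

typedef struct {
    blink_state_t state;
    uint32_t steps;      /* number of triggered steps, useful for instrumentation */
} blinker_t;

/* Called on every pass of the command loop. It does nothing unless the ISR
 * has fired since the last pass. Returns true if a step was executed. */
bool blinker_poll(blinker_t *b)
{
    if (!tick_flag) {
        return false;
    }
    tick_flag = false;
    b->state = (b->state == BLINK_OFF) ? BLINK_ON : BLINK_OFF;
    b->steps++;
    return true;
}
```

Two caveats apply to this sketch. A single boolean coalesces ticks, so if the loop is slower than the timer, some ticks go unnoticed; a counter would preserve them. Also, whether the test-and-clear needs to be protected by briefly disabling interrupts depends on the target and on how much a lost tick matters.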

Proper design is critical, as is instrumenting to verify the design.

Comment by vinnie, July 17, 2019

I'm curious about this statement: "it is true that there are already standards that facilitate the reuse of software." What specifically are you referring to for 8-, 16-, and 32-bit MCU standards?

Frameworks and Design Templates (Patterns) offer some help here, but using a commercial RTOS will often allow you to use verified code (for a fee) that gives file-system, network stacks, crypto libraries, USB, GUI-tools, and many higher functions without porting or writing your own.

Some solutions like Renesas Synergy even give you all that for free, without the licenses required in traditional development. The RTOS itself is crucial for IoT work, as is evident from Microsoft's and Amazon's recent acquisitions. (I work for Renesas, BTW.)


Comment by matthewbarr, June 19, 2019

It is possible to have a hybrid of the two round-robin systems you outline; it isn't necessarily one or the other. You can have a main loop calling functions in some order, and one or more of the functions might use a software timer so that it runs periodically. The timer is tested on entry, and the function returns if it has not expired. Scheduling is of course not precise, but if you do a good job implementing stateful non-blocking functions and schedule appropriately, it can work well, with execution timing centered around a target time period with some +/- slop. For example, a PID temperature control function can tolerate some scheduling imprecision and still work very well.
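A sketch of the "timer tested on entry" idea might look like this. It assumes a free-running 32-bit millisecond tick is available from somewhere (passed in here as now_ms), and every name in it is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t last_run;   /* tick value when the timer last expired */
    uint32_t period;     /* target period in ticks */
} soft_timer_t;

/* Returns true, and restarts the timer, once at least `period` ticks have
 * elapsed. The unsigned subtraction stays correct when the tick counter
 * wraps around. */
bool soft_timer_expired(soft_timer_t *t, uint32_t now)
{
    if ((uint32_t)(now - t->last_run) < t->period) {
        return false;
    }
    t->last_run = now;
    return true;
}

typedef struct {
    soft_timer_t timer;
    uint32_t runs;       /* counts executions, for illustration only */
} control_task_t;

/* Called on every pass of the main loop. */
void control_task(control_task_t *task, uint32_t now_ms)
{
    if (!soft_timer_expired(&task->timer, now_ms)) {
        return;          /* not due yet: hand control back to the main loop */
    }
    task->runs++;        /* the periodic work (e.g. one PID update) goes here */
}
```

Restarting from `now` means any lateness carries into the next period, which is one source of the slop mentioned above. Advancing last_run by period instead keeps the long-term average rate closer to the target, at the cost of possible back-to-back runs after a long stall.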

I would never argue that the round-robin (aka super loop) approach is superior to or even as good as an RTOS; it simply isn't, for a variety of reasons. If you're trying to run the LWIP stack or similar, then you absolutely want an RTOS and a more capable processor. However, an RTOS can be problematic for simple control applications on resource-limited microcontrollers. Memory for program and variables can be so limited that RTOS + application is a difficult if not impossibly tight fit. Some microcontroller instruction set architectures don't provide the instructions required to implement efficient task switching, producing additional run-time overhead relative to simple call/return handling. Under these particular constraints an RTOS may not be the best answer: a big hammer for a small nail.

Comment by felipelavratti, June 20, 2019

Why not an event-loop (or reactive) framework?
(Full disclaimer, I am the author of the project above)

Comment by Myth832, September 30, 2022
Many embedded operating systems rely on applications either to be built from callback routines that do a little work, schedule another callback, and return quickly, or else to make sure they never go too long without calling a "task switch if needed" function. These operating systems frequently do not use a timer for task switching. If one knows which functions may or may not allow task switches, most uses of locks are easy to avoid. If code changes several objects without calling a task switch function, or anything else that could call it, between the updates of the first and last object, there is no need to explicitly prevent other tasks from seeing a partially changed collection of objects.
