To use or not to use an #RTOS in an embedded system can sometimes be an easy decision but sometimes a tricky one.
With your help, we hope this discussion thread will become a great resource where the embedded systems community can gain a better understanding of:
- what is meant by 'bare-metal' programming in the world of embedded/IoT systems
- what is an RTOS
- the pros and cons of both approaches
- when should an RTOS be considered and when should bare-metal programming be preferred
- whatever else you think could add insights to this topic
Feel free to include personal stories and/or code snippets in your post if you think they could help the reader.
Thanks to Percepio, the makers of the highly acclaimed Tracealyzer visualization tool for developers of RTOS or Linux based embedded software systems, $200 will be divided between the authors of the most appreciated contributions to this new #FAQ discussion, based on the number of thumbs-ups received. Check out this interview with Johan Kraft CEO of Percepio at Embedded World.
Thank you for your participation and please do not forget to 'thumbs-up' the posts that you believe deserve to be at the top of the thread.
mirceac raises the good point that the definition of RTOS is murky in the embedded world. Some folks consider Linux built with its real-time extensions an RTOS, others envision something off-the-shelf like VxWorks, and others would include simple schedulers in that definition. As with so much in the embedded world the right answer is 'it depends'; it depends on the processor, the team, the company, and most importantly the requirements of the end product.
Answering the original questions using my own personal definitions:
'Bare-metal' programming is proceeding without a scheduler of any sort. The base loop is just that, and all activity is either polled or interrupt-driven. There may be some vendor-supplied header files or driver subroutines, but your code determines what happens when and is in total control. These systems are the most deterministic, are easier to test and debug, and are the way to go for most simple microcontroller-based applications that don't need much in the way of 'desktop I/O' (e.g., TCP/IP, USB, etc.).
An RTOS includes a scheduler of some sort. It doesn't have to be a fancy one. I wrote a rate-monotonic scheduler once for a 68000-based application that only had a few hundred lines of code. Interrupt handlers would call the scheduler at the end of the handler, and the scheduler would examine the run queue and could checkpoint a running task on the stack and launch another with the return-from-interrupt instruction. It was much more sophisticated than a bare-metal base loop and a bit harder to debug and trace, but it made it easier to manage the large variety of tasks--some high-priority, some not--that existed on that system. At the other end, many of the high-end microprocessors you can find in embedded systems these days almost require a real-time Linux or off-the-shelf RTOS because they are essentially little 'mini-desktop' systems, with dual and quad cores, full TCP/IP stacks, USB, HD video, and a host of other subsystems that a single team would spend a lifetime writing 'bare-metal' code for. The advantage in time-to-market versus the trade-off in never quite knowing for sure what the code paths are is obvious in these large systems.
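The selection step of a rate-monotonic scheduler like the one described above can be sketched in a few lines of C. This is only an illustration, not the original 68000 code: the `rm_task_t` structure and the function names are hypothetical, and the context save/restore that a real scheduler performs around return-from-interrupt is left out. Rate-monotonic priority simply means that the task with the shortest period wins.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task descriptor; field names are illustrative only. */
typedef struct {
    uint32_t period_ticks; /* shorter period = higher rate-monotonic priority */
    bool     ready;        /* set (e.g. by an interrupt handler) when work is pending */
} rm_task_t;

/* Return the index of the ready task with the shortest period,
 * or -1 if no task is ready. Ties go to the lower index. */
int rm_pick_next(const rm_task_t *tasks, size_t count)
{
    int best = -1;
    for (size_t i = 0; i < count; i++) {
        if (!tasks[i].ready) {
            continue;
        }
        if (best < 0 || tasks[i].period_ticks < tasks[best].period_ticks) {
            best = (int)i;
        }
    }
    return best;
}

/* Called at the end of an interrupt handler: should the scheduler switch
 * away from the task at index `running` (-1 means the CPU was idle)? */
bool rm_should_preempt(const rm_task_t *tasks, size_t count, int running)
{
    int next = rm_pick_next(tasks, count);
    if (next < 0) {
        return false; /* nothing ready, keep going */
    }
    if (running < 0) {
        return true;  /* idle, so run whatever is ready */
    }
    return tasks[next].period_ticks < tasks[running].period_ticks;
}
```

With three tasks of periods 100, 10 and 50 ticks where only the last two are ready, `rm_pick_next` selects the 10-tick task, and `rm_should_preempt` reports that it should displace the 50-tick task if that one is running.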
In general, you need bare-metal programming when the project is simple or, even more importantly, when the penalty for failure is high. You need an RTOS when there are lots of tasks, lots of desktop-style I/O, or a sophisticated user interface. Most projects that go wrong do so when they try to cram these mutually-exclusive requirements into a single chip. If I need to assure a valve is closed NOW, I use a microcontroller and program to the metal. If I need to show the state of that valve on a fancy LCD display via TCP/IP, I use a microprocessor with an RTOS and a software library full of graphic routines and network stacks. What I DON'T do is try to do both jobs in one processor at the same time. That's when the world will blow up, sooner or later. Tie the valve to a microcontroller with only one job--keep that valve from blowing up the world. Run the fancy UI with a microprocessor to get the thing to market in a reasonable time-frame. Tie the microcontroller to the microprocessor with a UART or something so they can talk to each other. As always, it's about using the right tool for the job.
Trust and control. Those are the compelling reasons for "programming at the bare metal" - which is a phrase that I first heard in the early 1980s. At that time, it meant writing programs in assembly language for just about any available computer, especially (though not limited to) micro-controllers like 8080, Z80, 8051 and those weird MicroChip PIC processors. During my undergraduate years in the late 1960s (when we finally got rid of the abacus and slide rule), we all took a shot at bare metal programming on the dinosaurs in the "computing lab". We had an IBM 1620, later replaced by an IBM 360/40.
With the advent of the IBM PC (sounds like an IBM commercial, doesn't it? I actually worked at UNIVAC and TI for a time, so it was just that IBMs were available), there was DOS. If you wanted to do anything fancy, you could use some of the simple services, but you would still end up writing a bunch of bare-metal code. Want to use interrupts for a serial communications program? Write them yourself. Want to twiddle bits to make something happen? Write it yourself. I did that enough to get the attention of a publisher. Read all about it in "The Programmer's Guide to MS-DOS" if any copies still exist.
Side note: I was on a compiler writing team at UNIVAC (late 1970s) for a military mini-computer and was responsible for a part of the Pascal compiler and the runtime environment. I spent a year commuting from Minneapolis to Florida doing field work at an installation. I would get updates by courier (no internet), but any bug fix that I made meant inserting a BRANCH instruction in binary form at the point of the fix, then inserting the fix itself in high memory, also in binary, with a BRANCH back to just past the error. This was all done with toggle switches for each bit of the address and the data, with lamps to show the memory contents. You could get really good at entering code that way after a while. Easy place to mess up, too.
Over the years, I have accumulated a hoard of "bare-metal" routines that I use all of the time. That is the control part. I know exactly what is happening in the hardware and software because I wrote it. If something goes wrong, I know that it is my own fault and I fix it. No mistakenly pointed fingers at some company for my own dumb typos. All of the interrupt routines, data structure schemes and program timing are my responsibility; I don't have to trust somebody else to have it right.
I currently use ARM-7 and Cortex processors from various sources. My workspace is awash with development boards - many of which are available for adoption to a good home. I am trying and failing to retire. :-)
My programming model is simple and works well. An infinite loop that repeatedly calls the major functions in the scheme. Call them "threads" if you wish. Each function is a finite-state machine which either performs the current function or waits for another event (from an interrupt, for instance) or waits for a timer to expire. No lengthy loops allowed in the FSM - just enter, do something simple, change the state, if required, and leave.
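A rough C sketch of that model follows. It is an assumption-laden illustration, not the poster's code: the blinker and receive state machines, the `ON_TICKS`/`OFF_TICKS` values, and all names are hypothetical, and the tick count is passed in rather than read from a hardware timer. Each step function enters, does one small thing, changes state if required, and leaves.

```c
#include <stdbool.h>
#include <stdint.h>

#define ON_TICKS  2u /* hypothetical LED on-time, in timer ticks  */
#define OFF_TICKS 3u /* hypothetical LED off-time, in timer ticks */

typedef enum { LED_OFF, LED_ON } led_state_t;

typedef struct {
    led_state_t state;
    uint32_t    deadline; /* tick at which the current state expires */
    uint32_t    toggles;  /* number of state changes so far */
} blinker_t;

typedef enum { RX_IDLE, RX_HANDLE } rx_state_t;

typedef struct {
    rx_state_t    state;
    volatile bool event;   /* would be set by an interrupt handler */
    uint32_t      handled; /* number of events processed */
} rx_fsm_t;

void blinker_init(blinker_t *b, uint32_t now)
{
    b->state = LED_OFF;
    b->deadline = now + OFF_TICKS;
    b->toggles = 0;
}

/* FSM that waits for a timer to expire. */
void blinker_step(blinker_t *b, uint32_t now)
{
    /* The signed difference keeps the comparison valid across tick wrap-around. */
    if ((int32_t)(now - b->deadline) < 0) {
        return; /* timer not expired: leave immediately */
    }
    if (b->state == LED_OFF) {
        b->state = LED_ON;
        b->deadline = now + ON_TICKS;
    } else {
        b->state = LED_OFF;
        b->deadline = now + OFF_TICKS;
    }
    b->toggles++;
}

/* FSM that waits for an event, then handles it on the following pass. */
void rx_step(rx_fsm_t *r)
{
    switch (r->state) {
    case RX_IDLE:
        if (r->event) {
            r->event = false;
            r->state = RX_HANDLE;
        }
        break;
    case RX_HANDLE:
        r->handled++;
        r->state = RX_IDLE;
        break;
    }
}

/* One pass of the infinite loop; main() would call this forever. */
void loop_once(blinker_t *b, rx_fsm_t *r, uint32_t now)
{
    blinker_step(b, now);
    rx_step(r);
}
```

On a real target, `main()` would initialize the machines and then call `loop_once()` inside `for (;;)`, reading `now` from a timer-driven tick counter.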
Which brings me to the innovation that lies between "bare-metal" and RTOS: The manufacturer supplied SDK and drivers. Who knows better than the chip maker on how to use the peripherals? Well, this became a trust issue. How good are the people at NXP, STM, MicroChip, Freescale, etc. at programming their own stuff? Do you trust them? I don't. I use the drivers, but I read them to understand what they are doing.
For instance, there are days when USB drivers appear to be on the edge of magic. I like to use the supplied drivers. It would even be better if they worked properly. I got a great education in USB by debugging the supplied driver from %&$&^. At least I had the source for the driver, found the error, fixed it and notified %&$&^ of the problem and fix. Have no idea what happened after that. Trust and control.
I considered an RTOS at one time. Turned out that I was so deep into "rolling my own" that I had most of the functions covered. I would have had to change some of my style (another reason for "bare-metal") to match the style of the RTOS. Stayed with what I knew.
I think that I have rambled enough.
Everything IMHO ( TM ;)
Nowadays there is no "bare metal" programming, there are barely any controllers that are not running Linux or some sort of full OS, and on the RTOS side there is such a GIGANTIC offering of all sorts of more or less real-time OSes that your project has to really be on a very special platform and for a very special purpose to NOT find a suitable one. The offerings range from the likes of QNX and PikeOS among the certified and commercially supported OSes, to FreeRTOS and Contiki among the open source ones, and even there, along with the community support, you can get commercial support as well. And I didn't mention here the vendor attempts at creating specialized RTOSes for their specific platforms, because vendor lock-in is a concept that will never grow old to the management ;).
Let's break the wall of text and list some pros for an RTOS over a "bare metal" solution:
- Not reinventing the wheel, tested and confirmed solution, vendor support for different architectures.
- Debugging support: many debugging tool vendors support a well-known RTOS that was integrated with their tools and development environment much better than a bare-metal JTAG connection to an unknown "OS".
- Time to market: having the primitives done and a user library of tested and proven functions shortens the development cycle a lot, compared with "what strange bug in the interrupt/DMA/SPI/etc. controller do I have to solve this week".
- Third-party support for common tasks: there is no need to redo this SPI, UART or HDLC library again and again, not to mention network stacks. If one adds mini web/FTP servers as well, doing them from scratch quickly becomes complicated.
- Certification for different things like Common Criteria, or usage in a specific field, like medical, critical industrial infrastructure, banking or defense and aerospace applications. While it's not impossible to certify your own implementation of everything, most of the time it's not economically feasible, or it's just plain impossible because the specification asks for only a very small set of RTOSes, usually with just one element ;).
- Finally, post-development and customer support: for an established RTOS, you'll more easily find talent to do support and bug fixing than for your proprietary solution developed in-house, which nobody except the alpha geek knows exactly how it works, and when they get bored and leave, you have a big mess on your hands.
Pros for "bare metal" approach:
- Your platform/SoC really has very limited resources, like scarily limited: every bit and byte matters, you can't select another platform, and you have to live with it. In this situation even the smallest overhead is worth getting rid of, and an ultra-customized solution is worth doing.
- Your project needs to have the hardest of the hard real-time constraints, every nanosecond counts, and the smallest desynchronization will kill your 10K engine/device or put the reactor on critical. Or just your inkjet printer will print wrong. In this situation one needs a strict analysis of every CPU clock cycle and a check of all possible interactions with external events. In this situation it may be worth doing a "bare metal" application, to have everything under control, but here an FPGA/ASIC implementation of the time-critical part, with a small MCU around it for configuration and control, may be a better solution.
- You're a student or hobbyist, or want to get to the bottom of how an RTOS operates. Then by all means: everybody needs to do a "bare metal" implementation at least once, starting with the MCU/SoC datasheet and an assembler, along with a scope and a logic analyzer. There is no substitute for this; no university or theoretical training will be able to give you the insights and skills that one gets doing a bare metal project.
This is my take on the RTOS/"bare metal" theme, which recurs in the embedded/IoT discussion forums.
mirceac - you make good points. Jack Ganssle made similar arguments 10 years ago on p. 106 of his book, "The Art of Designing Embedded Systems, Second Edition," from Newnes Press, a book I still use.
My experience has been that very small systems or mission-critical systems, such as spacecraft, sometimes use a superloop or state diagram in place of an RTOS. Spacecraft design and operation is very concerned with robust operation and recovery in the event of a fault; these tend towards custom architectures. Clearly these are corner-cases and not the more mainstream products.
The book is a classic, I think. I read it a couple of years back and wrote a review of the same: https://indianengineeringdesignforum.wordpress.com...
> Your project needs to have the hardest of the hard real-time constraints, every nanosecond counts, and the smallest desynchronization will kill your 10K engine/device or put the reactor on critical. Or just your inkjet printer will print wrong. In this situation one needs a strict analysis of every CPU clock cycle and a check of all possible interactions with external events. In this situation it may be worth doing a "bare metal" application, to have everything under control, but here an FPGA/ASIC implementation of the time-critical part, with a small MCU around it for configuration and control, may be a better solution.
You make it sound like hard-real-time constraints are "extreme" ("every nanosecond counts" ... "put the reactor on critical").
There are plenty of applications in DSP / switching power supply / motor control where "real-time" means low timing jitter in the 100s of nanoseconds / low microseconds range. These signal processing and control algorithms have severe consequences for being late, with penalty being "hiccups" in the audio/video/voltage/torque output, all of which are detrimental performance hits. These aren't million dollar devices, they're things like your cell phone or the power supply in a server (which is more and more likely these days to be based around a microcontroller) or a drone motor controller.
None of these precludes the presence of an RTOS. But they almost always require processing power at a level below the RTOS scheduler, via a high-priority interrupt.
I disagree with a lot of mirceac's post. Firstly, yes, there is still bare-metal programming (I just finished a bare-metal embedded project myself.) And yes, there are still many embedded projects not running a general-purpose OS such as Linux.
Secondly, mirceac's RTOS pros don't address the RTOS vs bare-metal question but instead are simply pros of using a commercial RTOS as opposed to home-brewed software. One can create a home-brewed RTOS and it wouldn't benefit from any of the listed pros.
Do not use an Operating System when everything that needs to be done can be done synchronously. Use an Operating System when you need to do things asynchronously.
A simple big loop or a more complex collection of smaller loops.
Is this too simple a way to look at it?
I have been designing firmware for almost 30 years and have never found a compelling reason to use an RTOS. I have evaluated RTOS’s in the past and I prefer to use Linux on the PC, over Windows. This said, as much as I have attempted to move to embedded Linux, I have still not found a compelling reason to do so. My specialty has been hard real time embedded systems and my projects have ranged from oceanographic research packages to LIDAR speed/ranging systems, so, I may be a bit of an outlier, as far as embedded systems design. I did not move from assembly language to C until after the turn of the century. That being said, here is my reasoning:
Pros for an RTOS
- Well tested and qualified code base.
- Support and updates.
- Tailored for many processors.
- Fast time-to-market.
Cons for an RTOS
- Created to be a one-shoe-fits-all solution.
- Not necessarily a lean code base.
- Usually significantly expensive.
I am sure that if I went to college when RTOSs were popular, and had some experience during that period, I might be more inclined to jump on board. However, my view is, clock speed (power consumption), memory, peripherals, and development tools are what differentiate processors/vendors, and getting the low level drivers written is a benefit of the RTOS. But, a good software engineer is actually half hardware engineer and understands exactly what the code actually does at the register level. An RTOS provides the initial framework and allows a programmer to bypass that low level step and jump right to application development. That is fine most of the time, until it isn’t.
As for debugging, I find that I only need an o-scope and a DVM, and with some creative coding, get a better picture of what is wrong, in real time, not simulated time. I worry when we start giving away creativity for expediency. Perhaps that is why Windows 10 is not popular and why you have so many updates to your phones. Good engineering does not require weekly updates.
There are many fine reasons for using an RTOS. One very common example is the requirement to support a TCP/IP networking stack. It's difficult to implement TCP/IP networking without an RTOS.
An RTOS does not provide low-level drivers nor allow a developer to jump right to the application. An RTOS is simply a library that provides support for multi-threading. The RTOS does not provide microcontroller peripheral support. A developer using an RTOS still needs to understand the hardware registers in order to write a peripheral driver. And a developer using an RTOS still has to understand how the startup code works and get to main() just like a super-loop developer. However, once you get to main you start the RTOS instead of starting the super-loop.
I feel like this post and others are confusing RTOS with "commercial software" and/or "general-purpose operating systems" such as Linux. An RTOS doesn't have to be commercial software. And an RTOS is not a general-purpose (e.g., "desktop") operating system and doesn't provide nearly the same services as a general-purpose operating system.
I'm going to define "bare metal" as a system with some sort of a non-preemptive task loop, whether it be an explicit superloop in 'main()', or some fancy C++ thing that does the equivalent.
I'm going to define an "RTOS" as a preemptive, prioritized scheduler with controlled latency times. Any added-on gew-gaws like file access or communications stacks need to also have controlled latency times for me to accept the assembly as real time.
I'm an adherent to the definition of "real time" that says "real fast don't mean real time" -- real time requirements are ones that say that something has to happen in a specific interval of time. So if your shortest required time interval is 50ms, then as I understand it* you may be able to satisfy that with RT-Linux, at least if you're not running on a PC**.
So when would I use a task loop ("bare metal")? My two biggest indications would be uniform task completion time, and a project team that's 100% staffed with people who understand real time. Third on the list would be the absolute lowest-latency requirement. Fourth, and really a rough indicator of the other two, is the size of the software -- if it fits in 8K or 16K of flash, then you probably don't need an RTOS. If it needs 128K of flash and you're not using an RTOS, you should be asking yourself why.
The decision to use an RTOS, for me, comes about when the costs (time, money, memory space, processor ticks, etc.) that are expended screwing with the RTOS (including educating my fellow team members on its use) no longer exceed the costs incurred by not using it.
My biggest indication to choose "bare metal" over an RTOS would be if I have a bunch of things going on that can easily be written to execute in small chunks, so that my longest-running task never makes my shortest-deadline task miss the deadline. An example of this would be a motor control loop that needs to sample at 10kHz, and that receives short, easily parsed messages on serial.
The counter-indication is if there are a bunch of different things going on that have wildly different deadlines. If I have my 10kHz motor control loop (with its implied 100us deadline) that needs to run alongside a chess-playing program (perhaps a chess-playing robot, on one processor?) that takes seconds to grind through its calculations, then the RTOS wins the selection process, hands down.
My second-biggest indication would be staffing. Nothing screws up a bare-metal application more than some otherwise good software engineer who just does not understand that spending extra time on a long-deadline task will make a short-deadline task malfunction. Even in an all-expert group, if you have a long-running algorithm it's a royal pain in the behind to try to split it up into numerous small chunks -- and impossible if it's handed to you already written. It's a lot easier to ride herd on a newbie in (or import code into) an RTOS-driven environment by giving them some strong "don't do this" statements than it is in a task-loop driven environment (they'll still manage to screw things up, though -- been there, done that).
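The arithmetic behind that choice can be made explicit. The sketch below assumes a simple round-robin task loop in which every task runs one chunk per pass, so a conservative feasibility test is that the worst-case time for a whole pass fits within the shortest deadline. The structure, function names, and the chunk times used in the example are hypothetical; only the 100us deadline comes from the 10kHz figure above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-task timing figures, in microseconds. */
typedef struct {
    uint32_t worst_chunk_us; /* longest time a single call can take */
    uint32_t deadline_us;    /* how soon the task must be serviced  */
} loop_task_t;

/* Worst-case duration of one full pass: every task takes its longest chunk. */
uint32_t loop_worst_pass_us(const loop_task_t *t, size_t n)
{
    uint32_t total = 0;
    for (size_t i = 0; i < n; i++) {
        total += t[i].worst_chunk_us;
    }
    return total;
}

/* Shortest deadline among all tasks (UINT32_MAX when there are none). */
uint32_t loop_shortest_deadline_us(const loop_task_t *t, size_t n)
{
    uint32_t shortest = UINT32_MAX;
    for (size_t i = 0; i < n; i++) {
        if (t[i].deadline_us < shortest) {
            shortest = t[i].deadline_us;
        }
    }
    return shortest;
}

/* Conservative test: a whole pass must fit inside the shortest deadline. */
bool loop_is_feasible(const loop_task_t *t, size_t n)
{
    return loop_worst_pass_us(t, n) <= loop_shortest_deadline_us(t, n);
}
```

For example, a motor loop taking 20us per call with a 100us deadline plus a serial parser taking 30us per call gives a 50us worst-case pass, which fits. Adding a computation that runs for 2 seconds in one call pushes the pass far beyond 100us, which is the point at which the long task must either be chopped into small chunks or handed to a preemptive scheduler.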
My third-biggest indication would be the smallest latency that had to be achieved. Even the best RTOS's take some time to do a context switch, and a well-crafted task loop will always beat that. So there are some things that just can't be done with an RTOS that can be done with a task loop. Some of those can be dealt with by violating the prime directive of good embedded programming, and putting a lot of processing in an interrupt, but you have to approach such shenanigans with care.
* I haven't worked with RT-Linux.
** Apparently, the processor temperature control in modern PC BIOS's can consume the processor for long and unpredictable amounts of time -- and you risk the processor if you bypass them.
All of my work in the last 25 years or so has been in military and space embedded applications using Ada on various processors with no RTOS but just using Ada’s built-in runtime support which included tasking and many features usually only available with an RTOS. I am now retired and do a lot of work using the ARM STM32F4 and F7 series processors from STMicroelectronics and I exclusively use AdaCore’s GNAT Ada on those for the same reasons. AdaCore has great support for Ada on these processors and you can choose either a zero-footprint runtime (essentially bare-metal), a small-footprint runtime with limited tasking, or a full runtime with all the language features. I prefer the small-footprint because it gives a smaller size executable and still supports tasking and has a global exception handler which can be used to report any runtime exceptions (where they occurred, etc.) to make debugging easier. I usually set up an environment similar to the Arduino system – an initialization sequence followed by an endless loop controller. Then I can use tasking, DMA, interrupts, etc. as needed for the particular application. I have this environment running on dozens of different ST boards (including many of the Nucleo boards which are Arduino compatible so I can use most of the sensors and peripherals available for Arduino apps).
If you haven’t used Ada, of course there is a bit of a learning curve but to me it is worth the effort to get the power, security, and safety that Ada provides and have the RTOS features you normally need without having to deal with a third-party RTOS. Also, GNAT Ada for ARM is free for non-commercial use or any use under their GPL license. I described my initial efforts into this combination of Ada and ARM in an article in Electronic Design magazine in September, 2014 ("ARMed and Ready").
I think "bare-metal" has come to mean "without an RTOS". That's how you're using it in the question. But originally I think "bare-metal" was supposed to imply that the software is developed from the hardware up, without the benefit of intermediate layers of software such as an operating system. When someone develops an application for a general-purpose OS they rely on the software APIs provided by the OS and drivers and they may not even need to know anything about the hardware. That's the opposite of bare-metal programming. When someone is developing "bare-metal" software they're writing software for the hardware APIs provided by the microcontroller and peripherals. Ironically, this hardware-based definition of "bare-metal" should probably still apply to embedded software that employs an RTOS. But somewhere along the line, "RTOS" got conflated with general-purpose operating systems and then "bare-metal" came to mean "without an RTOS". It might be better to state the question as "RTOS vs Super-Loop".
A super-loop is a single-threaded software program with a loop that gets repeated forever. This thread might get interrupted by interrupt service routines but when the interrupt is complete control returns to the original super-loop thread.
An RTOS is a software component that provides the ability to create multiple threads of software execution and a scheduler for managing those threads. An RTOS usually provides additional APIs for inter-thread synchronization mechanisms and software timer services.
An RTOS is basically a tool for managing software complexity. It allows one to divide a complex set of software requirements into multiple threads of execution. When done properly, each individual thread has a relatively simple set of requirements and becomes easier to implement. And thread prioritization can make it easier to fulfill the real-time requirements of the application. These are basically the pros of employing an RTOS.
However, an RTOS is not a panacea. An RTOS increases the overall system complexity and opens you up to new types of bugs (such as deadlocks). It requires knowledge and experience to design and implement an RTOS based program effectively.
If the software requirements are simple enough to be implemented with a super-loop that is maintainable (and hopefully somewhat extensible) then you probably don't need an operating system. As the software requirements increase, the super-loop gets more complex. When the software requirements are so many that the super-loop becomes too complex or cannot fulfill the real-time requirements of the system then it is time to consider another architecture.
My rule of thumb is that you should consider an RTOS if the product requires at least one of the following: a TCP/IP stack (or other complex networking stack), a complex GUI (perhaps one with GUI objects such as windows and events), or a file system. And if your project requires all of those things then you should consider general-purpose operating systems. General-purpose operating systems include drivers for the low-level details and allow you to focus on your application.
RTOSs have become the most popular (and most promoted) next-step beyond the main super-loop. But they're not the only option (and perhaps they're undeserving of their popularity). Consider alternatives such as Multi-Rate Main Loop Tasking or an event-based state machine architecture (such as QP). These alternatives might be simpler, easier to understand, or more compatible with your way of designing software.
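As a rough illustration of the first alternative, here is one common reading of a multi-rate main loop: a periodic tick drives the loop, and each task carries its own period so that fast and slow work share a single thread. This is a sketch under assumptions, not a reference implementation; `rate_task_t`, `multirate_poll`, and the two example tasks are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef void (*task_fn_t)(void);

/* Hypothetical task table entry. */
typedef struct {
    task_fn_t run;          /* short, non-blocking function */
    uint32_t  period_ticks; /* how often it should run */
    uint32_t  next_due;     /* tick at which it is next released */
} rate_task_t;

/* Run every task whose release time has arrived; return how many ran. */
size_t multirate_poll(rate_task_t *tasks, size_t count, uint32_t now)
{
    size_t ran = 0;
    for (size_t i = 0; i < count; i++) {
        /* The signed difference keeps the comparison valid across tick wrap-around. */
        if ((int32_t)(now - tasks[i].next_due) >= 0) {
            tasks[i].run();
            /* Advance from the due time, not from `now`, to avoid drift. */
            tasks[i].next_due += tasks[i].period_ticks;
            ran++;
        }
    }
    return ran;
}

/* Example tasks that only count their invocations. */
unsigned fast_runs;
unsigned slow_runs;

void fast_task(void) { fast_runs++; }
void slow_task(void) { slow_runs++; }
```

Polling once per tick for 20 ticks with a 1-tick task and a 10-tick task runs the first 20 times and the second twice. Table order doubles as a fixed priority within a tick, so the most time-critical task would normally go first.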
There's been a lot of interesting posts so far so I'll just add a few personal insights and beliefs.
First, what is meant by "bare-metal" programming? To me, bare metal programming is when a developer uses a super loop or a very simple scheduler such as a cooperative scheduler to manage time within their application. Baremetal applications tend to be written at a low-level, where the application developer is directly accessing registers using their own software.
An RTOS is a collection of libraries that are designed to aid a developer in creating a multi-tasking and deterministic run-time environment. It adds in mechanisms such as mutexes, queues, semaphores and event flags. The RTOS itself though is a scheduling kernel. The definition that I use in my RTOS course is:
"A RTOS is an operating system designed to manage hardware resources of an embedded system with very precise timing and a high degree of reliability."
There are many characteristics that an RTOS will exhibit such as:
An RTOS should be used when:
- It will simplify the software architecture
- It will ease software maintenance
- Real-time timing requirements are complex (e.g., TCP/IP, USB, a file system, or MQTT is required)
- Pre-emption is required (the ability for a task to interrupt another task)
- RTOS diagnostics can make debugging and system validation easier
- The microprocessor can support it
- It will ease 3rd party component integration
These are just a few examples. For any real-time system, a developer could go without the RTOS and go baremetal but the software will be far more complex and difficult to test and validate.
Now I'm not trying to be self promoting, but, I've written several articles and given many presentations on bare-metal versus RTOS so if anyone is interested they can find more of my opinions on my RTOS blog category at https://www.beningo.com/category/rtos/
> Pre-emption is required (the ability for a task to interrupt another task)
You can achieve preemption with an ISR cascade, so you do have an alternative to an RTOS for this requirement.
You can mimic preemption using ISR's but if you really want preemption, where a task interrupts another task, you need an RTOS. Best practices for ISR dictate they should be fast and minimal. Get in and get out. For a simple application, using ISR cascade can certainly work. For an IoT or a more complex design, using ISR priorities is asking for trouble. At least in my experience.
Thanks for pointing that out. I once sat in on a talk that covered ~50 different architectural designs to get real-time preemptive behavior without an RTOS. It was a great talk.
I didn't find, on the page you link to, the two RTOS articles you wrote in your blog on EmbeddedRelated. Here they are:
The best argument I read so far, IMO, was about the `Trust and control` from dnj.
Besides Trust and Control, I'd also add simplicity.
I try my best to keep everything as simple as possible, so, for every increment in complexity I think a lot about how to avoid it. I do that with an RTOS too.
An RTOS does increase your complexity: you simply have more code to maintain and debug. As a replacement for the RTOS you can use an event-loop, a main super loop or even an interrupt cascade if your application is simple enough. With the exception of the last, you have to deal with almost zero possible race conditions in the firmware, which is a big plus in simplicity.
I have also had to use real-time Linux in a project. When we were evaluating the adoption, I spent a lot of effort trying to simplify the project so as not to use it, but there was no way. We had to adopt it due to the very complex algorithms we were running on it and their dependencies on libraries that were not ported to any smaller RTOS. That was a big increment in complexity, but at least we were aware there was no other way around it, so I believe that was a good decision.
I have never used a generic RTOS in a product, TBH, only specialized ones, such as Contiki (to get the 6LoWPAN stack, and Ethernet as a side effect) and embedded Linux for complex networking applications and complex user interfaces.
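A minimal event-loop sketch shows where that simplicity comes from: interrupt handlers only post events, and the main loop runs one handler at a time to completion, so handlers never preempt each other. Everything here is hypothetical (the event names, the 8-slot queue, the handler table), and the queue assumes a single producer and a single consumer on one core. On a real target you would still need to check that the index updates are atomic for your MCU.

```c
#include <stdbool.h>
#include <stdint.h>

#define EVQ_SIZE 8u /* power of two, so indices wrap with a mask */

typedef enum { EV_BUTTON, EV_TIMER, EV_COUNT } event_t;
typedef void (*handler_t)(void);

typedef struct {
    event_t          buf[EVQ_SIZE];
    volatile uint8_t head; /* written only by the producer (e.g. an ISR) */
    volatile uint8_t tail; /* written only by the consumer (main loop)   */
} evq_t;

/* Post an event; returns false (event dropped) when the queue is full. */
bool evq_post(evq_t *q, event_t ev)
{
    uint8_t next = (uint8_t)((q->head + 1u) & (EVQ_SIZE - 1u));
    if (next == q->tail) {
        return false; /* full: one slot is always left empty */
    }
    q->buf[q->head] = ev;
    q->head = next;
    return true;
}

/* Fetch the oldest event; returns false when the queue is empty. */
bool evq_get(evq_t *q, event_t *ev)
{
    if (q->tail == q->head) {
        return false;
    }
    *ev = q->buf[q->tail];
    q->tail = (uint8_t)((q->tail + 1u) & (EVQ_SIZE - 1u));
    return true;
}

/* Dispatch queued events in arrival order; returns how many were taken. */
unsigned event_loop_drain(evq_t *q, const handler_t handlers[EV_COUNT])
{
    unsigned n = 0;
    event_t ev;
    while (evq_get(q, &ev)) {
        if (ev < EV_COUNT && handlers[ev] != 0) {
            handlers[ev](); /* runs to completion before the next event */
        }
        n++;
    }
    return n;
}

/* Example handlers that only count their invocations. */
unsigned button_presses;
unsigned timer_ticks;

void on_button(void) { button_presses++; }
void on_timer(void)  { timer_ticks++; }
```

The main loop would call `event_loop_drain()` forever (optionally sleeping when the queue is empty), while ISRs call `evq_post()`. The only shared state is the queue itself, which is why the race-condition surface stays so small.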
Simple networking applications and USB I tend to stick to bare-metal. Even simple GUIs I stick to the bare metal.
Bare metal is sufficient for most low-level products such as sensors, transmitters, field devices, displays, human-machine interfaces, keyboards, and touch panels.
PROS: a modular approach to implementing the code and integrating the pieces.
CONS: very difficult to test all possible failure modes.
An RTOS is an easier and quicker way to implement the products above.
PROS: you start from the demo project, add tasks, and integrate.
CONS: if there is a deadlock, it is very difficult to find out where the problem is.
But sometimes an RTOS is overhead. Think of a card-swipe device hooked to a gateway through RS485, running a proprietary communication protocol. Bare metal does everything required there, and the device can be put together at low cost. If you take the RTOS route, you will end up with a higher-end controller (because the demo program is available on such a controller), and the product may end up costing more, because the controller featured in the demo has more peripherals and you may not be using any of them.
Capturing the requirements, defining and implementing the architecture, and partitioning matter most in such situations.
Reusing an already-implemented product as the basis for a new one with additional features has many problems when an RTOS is involved. In one of our projects (about 3 years ago) we faced a lot of them.
An existing product needed a new feature. The product had several components, some of them very low end with a bare-metal implementation and some mid-range, which used an RTOS.
The requirements were analyzed and the existing architecture was reviewed.
For the low-end component it was decided to use a controller with an additional UART and to add one more LED, and there were changes to the mechanicals of the product.
For the mid-range component the existing hardware was sufficient, and in firmware it was decided to add one more high-priority task and use one more port.
The low-end component was bare metal, while the mid-range one used embOS.
During implementation the mid-range component was finished sooner (because there was no hardware change), while the low-end component took some time.
When the new feature was completely implemented on all components, it was integrated, tested, and then released to the users.
Once it was released, there were complaints from the field that the new feature used to interrupt the primary functionality of the product.
A decision was taken to add an RTOS to the low-end component, but fortunately the real cause was found: the additional task in the mid-range component had been implemented without a proper understanding of semaphores.
This was fixed and a new release was made.
Moral: When using an RTOS, understand the RTOS thoroughly.
To add a bit extra to the debate, the whole concept of an RTOS is a bit muddy for many. Before accurately answering the question "Are you using an RTOS or bare metal?" it is necessary to define what an RTOS is. Many of the well-known RTOSes are little more than a collection of initialization routines, drivers, and utility functions, and the "tasklets" or whateverlets ;) are more or less running unscheduled or executing when an interrupt comes in, without any preemption or scheduling.
To me personally, the difference between an RTOS and some "educated" bare-metal thing is the presence of a scheduler and the capability of assigning precise and reproducible time quanta to the running tasks. This has to be designed into the OS architecture, including drivers and libraries, rather than being some abused timer interrupt with whatever interrupt priorities the MCU has (if any).
Everything else is bare metal, or a "desktop OS" in case Linux/WhateverBSD is shoehorned into a project.
Of course, this is just IMO. What are your criteria for distinguishing between an RTOS, "bare metal", and a "desktop OS" in an embedded project?
For the past 15 years I have worked in the telecom industry, doing firmware for modem chips (yes, the other end of the AT command). Those projects are surprisingly large: 100000s of lines of code and 2000+ developers working on the firmware, and even more people working on the hardware/chip design.
Here a very small RTOS is used, and this has big advantages:
1) Even though the RTOS is 3rd party, you cannot consider it bug-free. No software is bug-free. With a small-scale RTOS you have a chance of identifying such a bug.
2) Resource demand and overhead are small compared to a Linux distribution, yet you have the luxury of using multiple threads without having to write the scheduler and synchronization mechanisms such as semaphores, queues, etc. on your own. You do not lose control.
3) It scales. A multitude of threads, both statically and dynamically created, are supported, and it's easy to add new threads. Of course thread attributes such as priority and stack size need to be considered carefully, but it is rather easy to find a well-balanced sweet spot.
Well, this is true for a modem application. It does not require a web interface, but there are quite a few high-performance external connections such as MIPI and USB. For those, drivers are in place, some of them wholly or partly 3rd party.
For an application where more complex communication is required I'd also prefer using a Linux-based machine. However, I'd consider splitting the application into 2 parts, maybe even running on different controllers: do the real-time stuff on a small processor using bare-metal firmware or a small RTOS with a low footprint, do the fancy interface stuff on a Linux-based machine which comes with all the peripherals for attaching a touch display, internet and so on, and connect both via a lean and clean interface.
Sometimes you have to port multi-threaded code from (for example) Linux to an embedded MCU, and you do not have time [i.e., are not given the time!] to rework the code to be single-threaded (e.g., by replacing the threads with a call to select()). In this case, porting the code to an RTOS (rather than bare metal) may be the easiest way to go.
You can see here that I define RTOS as supporting multiple threads of execution (tasks), and bare metal as, in effect, a single thread of execution (+ ISRs, of course). E.g., the lwIP TCP/IP stack is designed to support bare metal, whereas ThreadX provides a TCP/IP stack designed to run in the corresponding (ThreadX) RTOS.
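As an illustration of the select() alternative mentioned above, here is a minimal sketch for a POSIX host. Instead of one blocking reader thread per descriptor, a single thread waits on both descriptors and serves whichever becomes readable. The helper name `serve_one` is hypothetical. Whether select() is available on a given embedded target depends on its libraries, so treat this as a sketch of the idea rather than drop-in firmware.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: wait until one of two descriptors is readable and
   read one chunk from it. Returns the index (0 or 1) of the descriptor
   that was served, or -1 on error. If both are readable, fd0 wins. */
int serve_one(int fd0, int fd1, char *buf, size_t len, ssize_t *nread)
{
    fd_set readable;
    FD_ZERO(&readable);
    FD_SET(fd0, &readable);
    FD_SET(fd1, &readable);
    int maxfd = (fd0 > fd1) ? fd0 : fd1;

    /* Blocks until at least one descriptor has data (no timeout). */
    if (select(maxfd + 1, &readable, NULL, NULL, NULL) < 0) {
        return -1;
    }

    int which = FD_ISSET(fd0, &readable) ? 0 : 1;
    *nread = read(which == 0 ? fd0 : fd1, buf, len);
    return (*nread < 0) ? -1 : which;
}
```

Calling `serve_one()` in a loop replaces the two reader threads. The cost is that a slow handler for one descriptor delays the other, which is exactly the rework effort the original multi-threaded code avoided.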