RTOS Debugging Techniques

Started by beningjw 3 years ago · 7 replies · latest reply 3 years ago · 1318 views

Hello! I am putting together a webinar on "Mastering #RTOS Debugging Techniques" and I would greatly appreciate your insights as to what you think should be added to the presentation to make it as insightful and interesting as possible. At the moment, I'm thinking about covering:

1) Properly sizing the stack by performing a worst-case stack analysis

2) Setting up application trace capabilities

3) Detecting and resolving common RTOS application issues such as Priority Inversion, Deadlock, etc

4) RTOS application best practices

Does this sound like an interesting webinar? What other topics would you like to see included? Any specific demonstrations?

I greatly appreciate the feedback!

UPDATE - Here is the webinar:

Reply by beningjw, May 3, 2017

Thanks everyone for your feedback. I've put together the abstract based on your feedback and am planning to run the webinar on Thursday May 11th. 

If you are interested in attending, the abstract and registration can be found at

Thanks again!

Reply by Laszlo, April 26, 2017

That is a great idea! I debugged a complex RTOS on a day-to-day basis (tech support), so I hit a few bumps along the way.

Tracing functionality is the most critical:

- memory consumption, focus on heap

- task switching/semaphore/mutex states

- idle/wake-up/CPU frequency tracing - especially critical in low power systems

- ISR tracing with timestamps

- as much hardware usage trace as possible (DMA/Timers/Co-processors/Multi-core)

- ... and probably more

Looking forward to seeing the outcome.

Reply by beningjw, April 26, 2017

Thanks! Those are all great ideas and critical points. I will make sure that I cover these topics. Thanks for the feedback!

Reply by natersoz, April 26, 2017

It will be interesting to see what people emphasize within this thread.

  • I suggest, when designing for an RTOS, the following or similar reference:

Meeting Deadlines in Hard Real-Time Systems: The Rate Monotonic Approach

The strategy described within this text involves:

    • Determining hard and soft deadlines as design inputs.
    • Determining task run-time durations as a design input.
    • Determining task run-time periodicity as a design input.
    • Determining the task duration variability when necessary for these inputs.
    • Using timeline techniques to gain confidence and intuition with respect to the system behavior.

Based on these criteria and the rules for Rate Monotonic scheduling, a real-time system can be designed and analyzed for meeting its deadlines.
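As a rough illustration of the kind of analysis this enables, here is a minimal sketch of the classic Liu and Layland utilization test for rate monotonic scheduling: a set of n independent periodic tasks is schedulable if the total utilization does not exceed n(2^(1/n) - 1). The type and function names (`rm_task_t`, `rm_utilization`, `rm_bound`, `rm_passes_bound`) are made up for this example, and it assumes deadlines equal periods and ignores blocking and context-switch overhead.

```c
#include <stdbool.h>
#include <stddef.h>

/* One periodic task: worst-case run time and period, in the same time unit. */
typedef struct {
    double wcet;
    double period;
} rm_task_t;

/* Total CPU utilization U = sum of C_i / T_i over all tasks. */
double rm_utilization(const rm_task_t *tasks, size_t n)
{
    double u = 0.0;
    for (size_t i = 0; i < n; ++i) {
        u += tasks[i].wcet / tasks[i].period;
    }
    return u;
}

/* Liu and Layland bound n * (2^(1/n) - 1), tabulated (rounded down) for
 * small n so no math library is needed. For larger n the limit ln(2) is
 * returned, which is slightly below the exact bound and so conservative. */
double rm_bound(size_t n)
{
    static const double table[] = {
        0.0, 1.0, 0.828427, 0.779763, 0.756828, 0.743491
    };
    if (n < sizeof table / sizeof table[0]) {
        return table[n];
    }
    return 0.693147;
}

/* Sufficient but not necessary: true means the set meets its deadlines
 * under rate monotonic priorities; false means "not proven". */
bool rm_passes_bound(const rm_task_t *tasks, size_t n)
{
    return rm_utilization(tasks, n) <= rm_bound(n);
}
```

For example, three tasks with (run time, period) of (1, 4), (2, 8) and (1, 10) have a utilization of 0.6, which is under the three-task bound of about 0.78. A set that fails this test is not necessarily unschedulable; it just needs a more exact analysis, such as the timeline techniques mentioned above.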

  • To analyze the system under test, my preferred method is GPIO pins and the Saleae logic analyzer. 

Using GPIO pins to trace context switches and task durations provides low-latency profiling of the system under test.
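A minimal sketch of that GPIO approach, assuming one spare pin per task of interest. The register here is a plain variable (`trace_gpio_out`) standing in for a real memory-mapped port so the code also runs on a host, and the pin names and function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a memory-mapped GPIO output register so the sketch also runs
 * on a host PC. On a target this would be the port's output register. */
volatile uint32_t trace_gpio_out;

/* Hypothetical pin assignment: one pin per task being profiled. */
enum { TRACE_PIN_SENSOR_TASK = 0, TRACE_PIN_COMMS_TASK = 1 };

/* Drive the task's pin high when the task is switched in. */
void trace_task_in(unsigned pin)
{
    trace_gpio_out |= (uint32_t)1u << pin;
}

/* Drive the task's pin low when the task is switched out. */
void trace_task_out(unsigned pin)
{
    trace_gpio_out &= ~((uint32_t)1u << pin);
}
```

On the logic analyzer, the width of each high pulse is then the time that task held the CPU. Where the kernel provides switch-in/switch-out hooks (FreeRTOS, for instance, has the `traceTASK_SWITCHED_IN()` and `traceTASK_SWITCHED_OUT()` macros), those are a natural place to make these calls. Note that the read-modify-write above is not atomic; on real hardware, dedicated set/clear registers are usually the safer choice if the port has them.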

  • I have never had the opportunity to use Traceback debugging techniques, which would make the logic analyzer method obsolete.

I would specifically like to hear about others' experience with traceback debugging tools on the ARM Cortex series and what tools you have found useful. It appears that every ARM device on the market these days contains the Embedded Trace Macrocell.

This should be standard practice - but I am not yet up to speed  :(

Reply by beningjw, April 26, 2017

Thanks for sharing that link and providing your input. A core piece of the webinar is going to be discussing the ARM Cortex tracing capabilities. It might be a good webinar for you to attend to get up to speed.

The logic analyzer method is one I've used in the past on 8 and 16 bit machines. When using ARM, utilizing the internal hardware and external trace tools can provide a lot of insight and not use up GPIO. I think this will be a good point to mention!

Thanks again for your input!  

Reply by Laszlo, April 27, 2017

My 2 cents: what I mentioned as tracing is basically logging data into RAM or NVM buffers (with or without HW trace support), then dumping it and analyzing it with the help of scripts (to put it into a readable form).

This all needs a piece of firmware, usually enabled/disabled via compiler switches.

If you have a debug/trace firmware component that can be merged into any system, the low-level drivers can have HW support or a pure SW implementation, but that should be abstracted from the actual tracing. That way you can integrate this component into a project, making "any" RTOS traceable.

It sounds more complex than it actually is. What you need:

- Define your data structure (struct) to contain all the data you need, but no redundant data (you are using RAM/NVM for it). Add a field that you can correlate with the other traces, such as a timestamp/clock count. Also track which task you are in, if relevant. Important: add a trace element number so you know which entry you are looking at (see below).

- Define an array of your trace structure and use it as a circular buffer.

- Collected data can be dumped with a memory dump, downloaded via serial/USB ports, or even checked via JTAG/watch expressions.
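The three steps above can be sketched roughly as follows. All names here (`trace_entry_t`, `trace_log_t`, `trace_record`, `TRACE_DEPTH`, `TRACE_ENABLED`) and the particular choice of fields are assumptions for illustration; a real project would pick fields and sizes to suit its own RAM budget.

```c
#include <stdint.h>

#ifndef TRACE_ENABLED
#define TRACE_ENABLED 1        /* compiler switch: set to 0 to compile tracing out */
#endif

#define TRACE_DEPTH 64u        /* number of entries kept; a power of two */

/* One trace record: only what is needed, since it lives in RAM/NVM. */
typedef struct {
    uint32_t seq;              /* running element number of this entry */
    uint32_t timestamp;        /* clock count, to correlate with other traces */
    uint8_t  task_id;          /* task that was running when it was logged */
    uint8_t  event;            /* application-defined event code */
    uint16_t arg;              /* small event-specific payload */
} trace_entry_t;

typedef struct {
    trace_entry_t entries[TRACE_DEPTH];
    uint32_t next_seq;         /* total number of entries ever recorded */
} trace_log_t;

/* Store one entry, overwriting the oldest one once the buffer is full. */
void trace_record(trace_log_t *log, uint32_t timestamp,
                  uint8_t task_id, uint8_t event, uint16_t arg)
{
    trace_entry_t *e = &log->entries[log->next_seq % TRACE_DEPTH];
    e->seq = log->next_seq;
    e->timestamp = timestamp;
    e->task_id = task_id;
    e->event = event;
    e->arg = arg;
    log->next_seq++;
}

/* Number of valid entries currently held in the buffer. */
uint32_t trace_count(const trace_log_t *log)
{
    return (log->next_seq < TRACE_DEPTH) ? log->next_seq : TRACE_DEPTH;
}

/* Array index of the oldest valid entry, i.e. where a dump should start. */
uint32_t trace_oldest_index(const trace_log_t *log)
{
    return (log->next_seq < TRACE_DEPTH) ? 0u : (log->next_seq % TRACE_DEPTH);
}

#if TRACE_ENABLED
#define TRACE(log, ts, task, ev, arg) trace_record((log), (ts), (task), (ev), (arg))
#else
#define TRACE(log, ts, task, ev, arg) ((void)0)
#endif
```

The `seq` field is what makes a raw memory dump readable: the entry with the lowest number is the oldest, so a script can put the records back in order without knowing where the write position was. This sketch does no locking, so calls from both ISRs and tasks would need a critical section around `trace_record` on a real target.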

Debugging with a GPIO is possible in the classic "super loop" system; however, task/context switching will be hard to follow, and the "trace" is not that easy to share or analyze.

That said, GPIO debugging is very useful if you have "strange" resets and can't actually take the dump, or if the system is not even booting up properly. Toggling a GPIO at a specific code location can then show you whether you reached it or not. Detecting a reset event is also very useful.

Looking forward :)

Reply by Bob11, April 30, 2017

IMO, the closer to the metal (silicon?) the RTOS and application are, the more important it is to be familiar with the ABI and start-up processes of the software toolchain. Knowing which processor registers are 'scratch' registers vs which are saved during function calls, the direction of stack pushes, where code and data segments are placed by the linker and how they are initialized, the proper use of type qualifiers such as 'volatile', the access models of the processor, etc. etc., is more useful when debugging an RTOS application than when debugging, say, a typical desktop application. For example, I've seen embedded code fail when someone tried to squeeze ~2100 bytes into a data segment when the processor only had 2048 bytes of data FLASH, and the linker/loader was not properly configured to warn about the segment size. As long as no one accessed those last 52 bytes that aliased around to the first 52 (no run-time error naturally) everything was fine.
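One cheap guard against that kind of silent overflow is a compile-time size check on the objects placed in the region. The sketch below assumes a 2048-byte region as in the example above; the table, its element type and the macro names are hypothetical. It only covers the one object it names, so it complements rather than replaces correct region sizes in the linker configuration.

```c
#include <stdint.h>

/* Assumed capacity of the data region, taken from the example above. */
#define DATA_REGION_BYTES 2048u

/* Hypothetical record type and table destined for that region. */
typedef struct {
    uint16_t raw;
    uint16_t scaled;
} calib_point_t;

#define CALIB_POINTS 500u

const calib_point_t calib_table[CALIB_POINTS] = { { 0u, 0u } };

/* The build fails here if the table alone outgrows the region. */
_Static_assert(sizeof(calib_table) <= DATA_REGION_BYTES,
               "calib_table does not fit in the data region");
```

With 500 four-byte entries the table takes 2000 bytes and the check passes; raising `CALIB_POINTS` to 525 (2100 bytes) would stop the build instead of producing an image that wraps around at run time.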

When tracing an RTOS/application on the smaller processors, being able to follow the flow of execution below the source code level is usually a necessary skill; at least that's been my experience. I agree with the rest of your list as well--sounds like an interesting webinar!