EmbeddedRelated.com

"Boost context" for task switching on embedded Linux on ARM

Started by Unknown August 3, 2018
On Monday, August 6, 2018 at 8:04:22 PM UTC+12, George Neuner wrote:

> >If we make each co-operative task a proper thread then I was thinking we
> >would suspend the thread when it does a task switch and the task switch
> >function would release a mutex to allow other "non co-operative code"
> >to access the data being generated by the co-operative threads.
> >The task switch function would resume the next co-operative thread in
> >round robin fashion. The legacy code just saves each task stack pointer
> >in an array and cycles around them.
>
> Just a word of caution: suspending a thread from the outside can cause
> problems if done at the wrong time. It's acceptable for a thread to
> suspend itself, but apart from debugging there really are no good
> reasons for one thread to suspend another.
>
> It's ok for a thread to be in a "suspended" state while waiting for
> some event that makes it runnable again. Just make sure to
> distinguish the state from the action.
>
> George

ok, thanks. I'm not sure if the code below will work correctly - it's not as simple as I thought. I suspect I need a supervising thread. Maybe boost::context would be cleaner and simpler.

void task_switch()
{
    release_mutex();
    suspend_me_for_a_while();
    acquire_mutex();

    if ( ++task_id >= max_tasks ) {
        task_id = 0;
    }
    switch ( task_id ) {
        default:
        case 0 : resume_thread(0); break;
        case 1 : resume_thread(1); break;
        // ...
    }
    suspend_me();
}

On Mon, 6 Aug 2018 04:09:52 -0700 (PDT), gp.kiwi@gmail.com wrote:

>I'm not sure if the code below will work correctly - it's not as
>simple as I thought. I suspect I need a supervising thread. Maybe
>boost::context would be cleaner and simpler.
>
>void task_switch()
>{
>    release_mutex();
>    suspend_me_for_a_while();
>    acquire_mutex();
>
>    if ( ++task_id >= max_tasks ) {
>        task_id = 0;
>    }
>    switch ( task_id ) {
>        default:
>        case 0 : resume_thread(0); break;
>        case 1 : resume_thread(1); break;
>        // ...
>    }
>    suspend_me();
>}

It would be better to sleep() [or in pthreads sched_yield()] rather than suspend - a suspended thread can't wake up again unless resumed from outside (e.g., by a supervisor as you mentioned).

However, whether this pattern will work depends on the mutex implementation. For example, pthreads does NOT guarantee the order of threads waiting for a mutex - it depends on the scheduler which waiting thread will get the mutex next.

pthreads (Linux thread group) RR scheduling only works AS EXPECTED when all the threads remain runnable (iow, they are simply using up their timeslots and otherwise are not waiting for anything).

I don't know what Boost does.

If you want to force round robin in the face of resource contention, you need to find some way to play the moral equivalent of "pass the token".

George

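One way to read "pass the token" is a turn variable guarded by the mutex plus a condition variable, so that run order comes from the token and not from whichever waiter the scheduler happens to pick. The following is only a minimal sketch under that assumption. The names `token`, `token_task`, `run_token_demo`, `NUM_TASKS` and `NUM_ROUNDS` are hypothetical and do not come from the thread, and the "work" is reduced to recording the task id.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define NUM_TASKS  3   /* hypothetical task count */
#define NUM_ROUNDS 4   /* hypothetical number of turns per task */

static pthread_mutex_t data_lock   = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  token_moved = PTHREAD_COND_INITIALIZER;
static int token;                                /* id of the task whose turn it is */
static int token_trace[NUM_TASKS * NUM_ROUNDS];  /* order in which tasks ran */
static int token_trace_len;

static void *token_task(void *arg)
{
    int self = (int)(intptr_t)arg;

    for (int round = 0; round < NUM_ROUNDS; ++round) {
        pthread_mutex_lock(&data_lock);
        /* Owning the mutex is not enough: also wait until we hold the token. */
        while (token != self)
            pthread_cond_wait(&token_moved, &data_lock);

        token_trace[token_trace_len++] = self;   /* stand-in for the task's work */

        token = (self + 1) % NUM_TASKS;          /* pass the token to the next task */
        pthread_cond_broadcast(&token_moved);
        pthread_mutex_unlock(&data_lock);        /* other code may take the lock here */
    }
    return NULL;
}

/* Runs all tasks to completion and returns the number of turns recorded. */
int run_token_demo(void)
{
    pthread_t tid[NUM_TASKS];

    token = 0;
    token_trace_len = 0;
    for (int i = 0; i < NUM_TASKS; ++i) {
        int rc = pthread_create(&tid[i], NULL, token_task, (void *)(intptr_t)i);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_TASKS; ++i)
        pthread_join(tid[i], NULL);
    return token_trace_len;
}
```

Calling `run_token_demo()` from a `main` (built with `-pthread`) leaves the run order in `token_trace`. The mutex still protects the shared data, but the token alone decides which co-operative task goes next.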
George Neuner <gneuner2@comcast.net> writes:
> Multi-tasking with [setjmp/jmp_buf] is another issue. It's generally
> pretty easy to set up a new stack, but initializing all the other CPU
> state for the new "task" can be problematic. It was easier to do with
> simpler CPUs. It still can be done with a modern CPU, but it is much
> more difficult to get everything right [the more so for lack of
> documentation].

This sounds pretty horrendous and compiler and hardware dependent. You are basically writing a miniature OS in your scheduler. It helps if the compiler docs explicitly say it's supposed to work. Otherwise, if you reverse engineer it for a particular compiler version, all bets are off for the next one.

There are some very lightweight RTOS out there and it's probably saner to just use one instead of a hack like this.

On 18-08-06 21:21 , Paul Rubin wrote:
> George Neuner <gneuner2@comcast.net> writes:
>> Multi-tasking with [setjmp/jmp_buf] is another issue. It's generally
>> pretty easy to set up a new stack, but initializing all the other CPU
>> state for the new "task" can be problematic. It was easier to do with
>> simpler CPUs. It still can be done with a modern CPU, but it is much
>> more difficult to get everything right [the more so for lack of
>> documentation].
>
> This sounds pretty horrendous and compiler and hardware dependent. You
> are basically writing a miniature OS in your scheduler. It helps if the
> compiler docs explicitly say it's supposed to work. Otherwise, if you
> reverse engineer it for a particular compiler version, all bets are off
> for the next one.
>
> There are some very lightweight RTOS out there and it's probably saner
> to just use one instead of a hack like this.

Simplest robust solution I can think of is to create one semaphore for each of the round-robin threads, initialise all semaphores but one to zero, and initialise the semaphore for the first thread that you want to run, to one (1).

Each thread starts with "taking" its own semaphore, which means that only the chosen "first thread" actually starts running, and the other threads all block on their semaphores. When a thread wants to yield to some other thread, it calls a function that:

- "signals" the semaphore for the thread that should run next (how ever you want to choose that thread), and then

- "takes" the semaphore for the current thread (the one that is yielding), which blocks this thread until someone signals this semaphore.

In this way, you can implement either round-robin execution or "directed" yields to a specific thread, because you control which semaphore is signalled next.

Assuming that the base system is Linux, or some competent RTOS, I would never mess around with jmp_buf or some system-dependent threading libraries.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

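The per-thread semaphore scheme described above can be sketched with POSIX unnamed semaphores. This is only an illustration under stated assumptions: the names `run_sem`, `take`, `yield_to`, `coop_task`, `run_semaphore_demo`, `NUM_TASKS` and `NUM_ROUNDS` are hypothetical, and each task's work is reduced to recording its id. One detail the description leaves open is termination; here a task that has finished signals its successor without waiting again, so no thread is left blocked.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdint.h>

#define NUM_TASKS  3   /* hypothetical task count */
#define NUM_ROUNDS 4   /* hypothetical number of turns per task */

static sem_t run_sem[NUM_TASKS];               /* one semaphore per co-operative thread */
static int sem_trace[NUM_TASKS * NUM_ROUNDS];  /* order in which tasks ran */
static int sem_trace_len;

/* "Take" a semaphore, retrying if a signal handler interrupts the wait. */
static void take(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR) {
        /* retry */
    }
}

/* Signal the thread that should run next, then block until signalled again. */
static void yield_to(int self, int next)
{
    sem_post(&run_sem[next]);
    take(&run_sem[self]);
}

static void *coop_task(void *arg)
{
    int self = (int)(intptr_t)arg;
    int next = (self + 1) % NUM_TASKS;   /* round robin; any choice would work */

    take(&run_sem[self]);                /* only the chosen first thread gets past this */
    for (int round = 0; round < NUM_ROUNDS; ++round) {
        sem_trace[sem_trace_len++] = self;   /* stand-in for the task's work */
        if (round + 1 < NUM_ROUNDS)
            yield_to(self, next);
        else
            sem_post(&run_sem[next]);    /* finished: hand over without waiting */
    }
    return NULL;
}

/* Runs all tasks to completion and returns the number of turns recorded. */
int run_semaphore_demo(void)
{
    pthread_t tid[NUM_TASKS];

    sem_trace_len = 0;
    for (int i = 0; i < NUM_TASKS; ++i)
        sem_init(&run_sem[i], 0, i == 0 ? 1u : 0u);   /* only task 0 starts at one */
    for (int i = 0; i < NUM_TASKS; ++i) {
        int rc = pthread_create(&tid[i], NULL, coop_task, (void *)(intptr_t)i);
        assert(rc == 0);
    }
    for (int i = 0; i < NUM_TASKS; ++i)
        pthread_join(tid[i], NULL);
    for (int i = 0; i < NUM_TASKS; ++i)
        sem_destroy(&run_sem[i]);
    return sem_trace_len;
}
```

A "directed" yield is the same call with a different `next` argument. Because exactly one semaphore holds a count at any time, only one co-operative thread is runnable, whatever the kernel scheduler does.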
On Tuesday, August 7, 2018 at 7:48:33 AM UTC+12, Niklas Holsti wrote:
> Simplest robust solution I can think of is to create one semaphore for
> each of the round-robin threads, initialise all semaphores but one to
> zero, and initialise the semaphore for the first thread that you want to
> run, to one (1).
>
> Each thread starts with "taking" its own semaphore, which means that
> only the chosen "first thread" actually starts running, and the other
> threads all block on their semaphores. When a thread wants to yield to
> some other thread, it calls a function that:
>
> - "signals" the semaphore for the thread that should run next (how ever
> you want to choose that thread), and then
>
> - "takes" the semaphore for the current thread (the one that is
> yielding), which blocks this thread until someone signals this semaphore.
>
> In this way, you can implement either round-robin execution or
> "directed" yields to a specific thread, because you control which
> semaphore is signalled next.
>
> Assuming that the base system is Linux, or some competent RTOS, I would
> never mess around with jmp_buf or some system-dependent threading libraries.

Thanks for the suggestion.
On Mon, 06 Aug 2018 11:21:23 -0700, Paul Rubin
<no.email@nospam.invalid> wrote:

>George Neuner <gneuner2@comcast.net> writes:
>> Multi-tasking with [setjmp/jmp_buf] is another issue. It's generally
>> pretty easy to set up a new stack, but initializing all the other CPU
>> state for the new "task" can be problematic. It was easier to do with
>> simpler CPUs. It still can be done with a modern CPU, but it is much
>> more difficult to get everything right [the more so for lack of
>> documentation].
>
>This sounds pretty horrendous and compiler and hardware dependent. You
>are basically writing a miniature OS in your scheduler.

??? This is *userspace* thread switching - not kernel thread or process switching.

Do you not remember using LWT thread libraries - e.g., cthreads - before operating systems had kernel threads? What do you think they did?

>It helps if the compiler docs explicitly say it's supposed to work.
Actually, the C standard says it works - at least if you read between the lines.

7.13 guarantees that CPU state[*] sufficient to restore the calling environment of setjmp will be preserved in the jmp_buf structure, and that that state [modulo setjmp's return value] will be restored by calling longjmp.

Although the standard does not address its use in multitasking, the description of the operation of setjmp/longjmp matches quite well the description of a register state task switcher.

The hard part of using it for multitasking is that compiler vendors often don't document the jmp_buf structure very well because the standard explicitly speaks only to its use within a single thread, and that use does not require modifying the structure data.

Generally, setting up a new stack is pretty easy. The hard part is figuring out how to set the saved instruction pointer so as to properly enter your new "thread" function. [At minimum you need to know if/how your CPU autoincrements addresses.]

Once you get past that hurdle, switching "threads" is as simple as

    if (setjmp(mystate) == 0) longjmp(yourstate, 1);

and "scheduling" is just picking the next jmp_buf to restore from an array or list of jmp_bufs that represent your runnable "threads".

[*] 7.13 says explicitly that the jmp_buf does not record the state of FPU status flags.

>Otherwise, if you reverse engineer it for a particular compiler version,
>all bets are off for the next one.
>
>There are some very lightweight RTOS out there and it's probably saner
>to just use one instead of a hack like this.

I am NOT advocating use of setjmp/longjmp to implement multitasking: getting a "thread" to start properly can be tedious and error prone unless the compiler documentation is good. The OP mentioned it as a possible solution to his problem, and I was merely explaining more about it.

However, Boost::context must be doing something extremely similar, and likely it is simply building on top of setjmp/longjmp which already are provided (yes, C++ has them too).

George

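For comparison, glibc on Linux provides the `<ucontext.h>` functions `getcontext`, `makecontext` and `swapcontext`, which cover the two steps described above as hard to do by hand: setting up a new stack and arranging the entry point of the new "thread". The sketch below is a swapped-in technique, not the setjmp/longjmp approach and not Boost.Context's API. It assumes glibc (these functions were dropped from POSIX.1-2008 and some other C libraries omit them), and the names `task_switch`, `task_body`, `run_context_demo`, `NUM_TASKS`, `NUM_ROUNDS` and `STACK_SIZE` are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define NUM_TASKS  2            /* hypothetical task count */
#define NUM_ROUNDS 3            /* hypothetical number of turns per task */
#define STACK_SIZE (64 * 1024)  /* assumed to be ample for this demo */

static ucontext_t main_ctx;                    /* context of the caller */
static ucontext_t task_ctx[NUM_TASKS];         /* one saved context per task */
static int current;                            /* index of the running task */
static int ctx_trace[NUM_TASKS * NUM_ROUNDS];  /* order in which tasks ran */
static int ctx_trace_len;

/* Co-operative switch: save this task's context, resume the next one. */
static void task_switch(void)
{
    int prev = current;
    current = (current + 1) % NUM_TASKS;
    swapcontext(&task_ctx[prev], &task_ctx[current]);
}

static void task_body(void)
{
    for (int round = 0; round < NUM_ROUNDS; ++round) {
        ctx_trace[ctx_trace_len++] = current;  /* stand-in for the task's work */
        task_switch();
    }
    /* Returning here resumes the context named in uc_link (main_ctx). */
}

/* Runs until the first task finishes; returns the number of turns recorded. */
int run_context_demo(void)
{
    void *stacks[NUM_TASKS];

    current = 0;
    ctx_trace_len = 0;
    for (int i = 0; i < NUM_TASKS; ++i) {
        stacks[i] = malloc(STACK_SIZE);
        assert(stacks[i] != NULL);
        getcontext(&task_ctx[i]);              /* initialise before makecontext */
        task_ctx[i].uc_stack.ss_sp = stacks[i];
        task_ctx[i].uc_stack.ss_size = STACK_SIZE;
        task_ctx[i].uc_link = &main_ctx;
        makecontext(&task_ctx[i], task_body, 0);
    }

    swapcontext(&main_ctx, &task_ctx[0]);      /* start task 0; come back via uc_link */

    for (int i = 0; i < NUM_TASKS; ++i)
        free(stacks[i]);
    return ctx_trace_len;
}
```

Everything runs on a single kernel thread, so no locking is needed between the tasks. The demo returns as soon as the first task falls off the end of `task_body`; the other task is simply abandoned in the middle of its last switch, which is acceptable only because nothing in it needs cleanup.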
On 18-08-07 22:10 , George Neuner wrote:
> On Mon, 06 Aug 2018 11:21:23 -0700, Paul Rubin
> <no.email@nospam.invalid> wrote:
>
>> George Neuner <gneuner2@comcast.net> writes:
>>> Multi-tasking with [setjmp/jmp_buf] is another issue
...
>>
>> This sounds pretty horrendous and compiler and hardware dependent. You
>> are basically writing a miniature OS in your scheduler.
...
>> It helps if the compiler docs explicitly say it's supposed to work.
>
> Actually, the C standard says it works - at least if you read between
> the lines.

C11 says it is undefined -- see below.
> [Section] 7.13 guarantees that CPU state[*] sufficient to restore
> the calling environment of setjmp will be preserved in the jmp_buf
> structure, and that that state [modulo setjmp's return value] will be
> restored by calling longjmp.
>
> Although the standard does not address its use in multitasking, the
> description of the operation of setjmp/longjmp matches quite well the
> description of a register state task switcher.

I don't remember which version of C the OP intends to use, but the C11 draft (n1570.pdf) contains this text in section 7.13.2.1, for longjmp:

"The longjmp function restores the environment saved by the most recent invocation of the setjmp macro in the same invocation of the program with the corresponding jmp_buf argument. If ... the invocation was from another thread of execution ... the behavior is undefined."

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

On Tue, 7 Aug 2018 22:41:18 +0300, Niklas Holsti
<niklas.holsti@tidorum.invalid> wrote:

>On 18-08-07 22:10 , George Neuner wrote:
>> On Mon, 06 Aug 2018 11:21:23 -0700, Paul Rubin
>> <no.email@nospam.invalid> wrote:
>>
>>> George Neuner <gneuner2@comcast.net> writes:
>>>> Multi-tasking with [setjmp/jmp_buf] is another issue
> ...
>>>
>>> This sounds pretty horrendous and compiler and hardware dependent. You
>>> are basically writing a miniature OS in your scheduler.
> ...
>>> It helps if the compiler docs explicitly say it's supposed to work.
>>
>> Actually, the C standard says it works - at least if you read between
>> the lines.
>
>C11 says it is undefined -- see below.
>
>> [Section] 7.13 guarantees that CPU state[*] sufficient to restore
>> the calling environment of setjmp will be preserved in the jmp_buf
>> structure, and that that state [modulo setjmp's return value] will be
>> restored by calling longjmp.
>>
>> Although the standard does not address its use in multitasking, the
>> description of the operation of setjmp/longjmp matches quite well the
>> description of a register state task switcher.
>
>I don't remember which version of C the OP intends to use, but the C11
>draft (n1570.pdf) contains this text in section 7.13.2.1, for longjmp:
>
>"The longjmp function restores the environment saved by the most recent
>invocation of the setjmp macro in the same invocation of the program
>with the corresponding jmp_buf argument. If ... the invocation was from
>another thread of execution ... the behavior is undefined."

Yes. setjmp/longjmp is not *intended* to be used for multitasking. That was made clear already in C89/90.

However, almost every C program does things that technically are "undefined" or "implementation defined" according to the standard. And things get used for more than they originally were intended.

The point is not whether it's legal, but whether it works. You can multitask in C with setjmp/longjmp if you do it right.

Obviously it's better to use the standard-approved thread API in C11, but users of previous versions did not have that luxury: whatever they did using a thread library or OS thread API was "undefined" wrt C.

Userspace threading in C has been done almost since the beginning: the 1st version of what later became the portable cthreads library popped out in 1981. The portable cthreads API (circa 1990) was one of the inspirations for pthreads. cthreads itself was based on setjmp/longjmp.

George

On 18-08-08 07:42 , George Neuner wrote:
> On Tue, 7 Aug 2018 22:41:18 +0300, Niklas Holsti
> <niklas.holsti@tidorum.invalid> wrote:
>
>> On 18-08-07 22:10 , George Neuner wrote:
>>> On Mon, 06 Aug 2018 11:21:23 -0700, Paul Rubin
>>> <no.email@nospam.invalid> wrote:
>>>
>>>> George Neuner <gneuner2@comcast.net> writes:
>>>>> Multi-tasking with [setjmp/jmp_buf] is another issue
>> ...
>>>>
>>>> This sounds pretty horrendous and compiler and hardware dependent. You
>>>> are basically writing a miniature OS in your scheduler.
>> ...
>>>> It helps if the compiler docs explicitly say it's supposed to work.
>>>
>>> Actually, the C standard says it works - at least if you read between
>>> the lines.
>>
>> C11 says it is undefined -- see below.
>>
>>> [Section] 7.13 guarantees that CPU state[*] sufficient to restore
>>> the calling environment of setjmp will be preserved in the jmp_buf
>>> structure, and that that state [modulo setjmp's return value] will be
>>> restored by calling longjmp.
>>>
>>> Although the standard does not address its use in multitasking, the
>>> description of the operation of setjmp/longjmp matches quite well the
>>> description of a register state task switcher.
>>
>> I don't remember which version of C the OP intends to use, but the C11
>> draft (n1570.pdf) contains this text in section 7.13.2.1, for longjmp:
>>
>> "The longjmp function restores the environment saved by the most recent
>> invocation of the setjmp macro in the same invocation of the program
>> with the corresponding jmp_buf argument. If ... the invocation was from
>> another thread of execution ... the behavior is undefined."
>
> Yes. setjmp/longjmp is not *intended* to be used for multitasking.
> That was made clear already in C89/90.
>
> However, almost every C program does things that technically are
> "undefined" or "implementation defined" according to the standard. And
> things get used for more than they originally were intended.

True, but not desirable.
> The point is not whether it's legal, but whether it works. You can
> multitask in C with setjmp/longjmp if you do it right.

You can multitask in C with foo/bar if you implement foo and bar "right". Sure, for a simple thread context setjmp/longjmp can have some of the required properties of foo and bar, but may also have some undesirable properties.

The undefinedness of a cross-thread longjmp can be far more serious than, say, shifting or overflowing a signed integer. A longjmp is intended to *abandon* the current execution point and return to a point *lower* (earlier) in the call stack. Some C or C++ systems might implement this by unwinding the stack, frame by frame, until the setjmp frame is reached. This will have drastic "undefinedness" if the setjmp refers to another stack...

It is not advisable, today, to try to misuse setjmp/longjmp for thread switching.

-- 
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .

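The intended use described above, abandoning the current execution point and returning to an earlier frame on the same stack, looks roughly like the following sketch. The names `recover_point`, `descend` and `run_longjmp_demo` and the value 42 are hypothetical. The `setjmp` call sits directly in the controlling expression of a `switch`, which is one of the contexts C11 7.13.1.1 permits for it.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recover_point;   /* environment saved by setjmp */

/* Recurse a few frames deep, then bail out straight to the setjmp frame. */
static void descend(int depth)
{
    if (depth == 3)
        longjmp(recover_point, 42);   /* abandons every frame above the setjmp */
    descend(depth + 1);
}

/* Returns the value delivered by longjmp, or a negative value on surprise. */
int run_longjmp_demo(void)
{
    switch (setjmp(recover_point)) {
    case 0:            /* direct return: environment has just been saved */
        descend(0);
        return -1;     /* not reached, because descend() never returns */
    case 42:           /* arrived here again through longjmp */
        return 42;
    default:
        return -2;
    }
}
```

The jump is well defined here because `run_longjmp_demo` is still active when `longjmp` is called, and both calls happen on the same thread and the same stack. That is exactly the condition the cross-thread case breaks.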
On Wed, 8 Aug 2018 08:21:08 +0300, Niklas Holsti
<niklas.holsti@tidorum.invalid> wrote:

>The undefinedness of a cross-thread longjmp can be far more serious
>than, say, shifting or overflowing a signed integer. A longjmp is
>intended to *abandon* the current execution point and return to a point
>*lower* (earlier) in the call stack. Some C or C++ systems might
>implement this by unwinding the stack, frame by frame, until the setjmp
>frame is reached. This will have drastic "undefinedness" if the setjmp
>refers to another stack...

But none do ... every implementation I have seen leaves the stack data intact and simply resets the CPU register(s).

longjmp is NOT equivalent to throwing an exception, and, in fact, it is dangerous to use it that way in C++. Using longjmp will NOT cause destructors to be called for local objects on the "abandoned" stack. It exists in C++ only for compatibility with C.

>It is not advisable, today, to try to misuse setjmp/longjmp for thread
>switching.

I already said that I don't advocate doing it, and that I merely was explaining it to the OP.

Maybe you prefer to conceal information about things you don't like, but I prefer to educate people and let them make up their own minds.

George