
IAR MSP430 compiler problem

Started by brOS November 23, 2009
Niklas Holsti wrote:
>> Niklas Holsti wrote: >>> brOS wrote: >>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>> >>>>>> Dear all, >>>>>> >>>>>> Does anybody knows how to force compiler to use call instruction >>>>>> instead of br(branch)for disassembling function call? >>>>>> It is extremely important for me to specific function is disassembled >>>>>> using call instead of brunch, as compiler always does. >>>>>> >>> ... >>>>> >>>> This is why i need it.... >>>> Function I'm calling have looks something like this: >>>> void Spin(void){ >>>> for(;;){} >>>> } >>>> So if it is disassembled with call before entering in pc will be >>>> saved on >>>> stack and it will point to instruction after function spin....So I >>>> want to >>>> use that pc and to save context so when my scheduler schedule that task >>>> again it will not continue spinning in that forever loop but it will >>>> jump >>>> to next instruction after Spin function..... >>>> branch doesn t push pc to stack so taht s my problem;) >>> >>> The compiler has deduced that a branch instruction is as good as a >>> call instruction for this/these calls of Spin. There can be two >>> reasons for that: >>> >>> 1. If the compiler has seen the code of Spin (if it is in the same >>> source-code file as the calling function) it may have deduced that >>> Spin never returns, so it does not need the return address that a >>> call instruction would push on the stack. Of course the compiler >>> cannot know that your scheduler breaks C semantics (I assume by >>> interrupting the eternal loop in Spin) and needs the return address. > > I experimented a bit with the IAR MSP430 compiler (current "kickstart" > version), and it uses call instructions to call a non-returning function > containing only an eternal for-loop, even if the function is presented > in the same source-code file as the call. If the function is marked with > the __noreturn keyword the compiler will use a branch or jump > instruction, though. (I assume that the OP has not marked Spin with > __noreturn.) > > So it seems my suggested reason 1 is not the true explanation. >
It's always fun to test and compare compilers. The stable version of gcc for the msp430 is an older one - 3.2.3 (with 4.x under development). It always "calls" the function even when it knows it is non-returning, and there is a "ret" after the call (and after the infinite loop). Newer gcc versions give tighter code (testing with avr-gcc 4.3.2) - a function calling Spin() inlines the infinite loop into the caller. There are no jumps, calls, or returns. The point here is that such details vary from compiler to compiler, and from version to version. The compiler will do exactly what you tell it, but you can't rely on it using a particular method to implement a particular construct.
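For reference, this is the marking Niklas mentions - a minimal sketch, not the OP's code, using IAR's __noreturn keyword with the gcc attribute as an alternative:

/* Sketch only: telling the compiler that Spin() never returns. With this
   knowledge the compiler may legitimately use a branch instead of a call,
   drop the return sequence, or inline the loop into the caller. */
#ifdef __ICC430__                            /* IAR EW430 */
__noreturn void Spin(void)
#else                                        /* gcc-style compilers */
__attribute__((noreturn)) void Spin(void)
#endif
{
    for (;;) {
        /* spin forever */
    }
}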
>>> 2. If the call to Spin is the last statement in the calling function >>> (a "tail call"), the compiler understands that the call does not have >>> to push a return address, because Spin will return (assuming it would >>> return) to the end of the calling function, which immediately returns >>> to *its* caller. The branch instruction leaves the calling function's >>> return address on the stack, so when Spin returns (assuming it could >>> return) it will take a short-cut and return to the caller of the >>> calling function. This optimization saves time and stack space. > > In my small experiments, the IAR compiler does code a tail call to Spin > using a branch or jump instruction, instead of a call. So reason 2 is a > possible explanation for the OP's observation. Interestingly, this > happens even if the optimization level is set to "None", so this advice > of mine: > >>> Another possibility is to avoid the "High" optimization level of the >>> compiler. > > does not work. >
Optimisation levels are never more than a hint to the compiler. You are just making a suggestion as to how it should balance compile time, ease of debugging, and size and speed of the generated code. Optimisation flags are never demands, and the compiler is free to apply all its optimisations at any level (though obviously it is more user-friendly to have some correlation). Code that is dependent on the optimisation level for correctness is broken code. (Obviously it can be dependent on the optimisation level for size and speed requirements.)
> David Brown wrote: > >> Your diagnosis of the problem is fair enough, but your workarounds >> are, IMHO, totally wrong. Anything that involves trying to trick or >> cripple the compiler (separate compiled files, disabling >> optimisations, fake extra inline assembly, gratuitous function pointer >> usage, etc.) is at best an ugly hack, and at worst a maintenance >> nightmare. Remember, the compiler is free to work around all these >> workarounds - lying to your tools is a bad idea. > > In general I agree with you, David, but the OP is trying to run C code > under a custom scheduler, apparently in some kind of simple > multi-threading or coroutine style. This is out of scope for the C > language, so the operation of the scheduler will involve some things > that the compiler does not know about -- and should not (have to) know > about. The scheduler/kernel routines should follow the C compiler's > calling protocols, but will themselves do things that exceed C's semantics. >
No, the scheduler/kernel should /not/ rely on the compiler's calling protocols. The compiler can change these as it wants, and mix them for different functions. If the scheduler depends on the compiler using particular instructions to call a function, the scheduler is broken - a pre-emptive scheduler can assume /nothing/ about the code it is pre-empting. If you have a scheduler that for some reason needs a way to get a function's return address, then it needs to use a compiler-specific feature such as gcc's "__builtin_return_address()" function. If the compiler doesn't have such a feature, then you are out of luck. Get a different compiler, or write a scheduler that doesn't depend on knowing the return address. Under no circumstances is it correct to tell the compiler you have an infinite loop, and then complain because you can't see how to break out of it.
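Something along these lines - a rough sketch, gcc-specific, with a made-up task-control-block layout:

/* Hypothetical task control block - not from the OP's project. */
struct tcb {
    void *resume_pc;      /* where the scheduler should resume this task */
    /* ... saved registers, stack pointer, and so on ... */
};

extern struct tcb *current_tcb;  /* assumed to be maintained by the kernel */

/* A co-operative "yield": record where to resume, then hand over to the
   scheduler. __builtin_return_address(0) is gcc's way of asking for the
   address this call will return to; noinline keeps the answer meaningful. */
__attribute__((noinline)) void yield(void)
{
    current_tcb->resume_pc = __builtin_return_address(0);
    /* ... save the rest of the context and invoke the scheduler here ... */
}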
> Of course, the person writing the scheduler should know all about the C > compiler's calling protocols and run-time system so that the scheduler > can save and restore thread contexts properly. > > The Spin function seems intended to be part of the application/scheduler > interface; an application task calls it when it has finished its job and > yields to the scheduler. Writing this "yield" routine as an eternal loop > is unusual, but can be OK for a custom kernel. In a more conventional
It is not "unusual", it is "wrong". There is no point in trying to help the OP find some workaround to get this system to compile - he must fix the code.
> kernel, the application would call a kernel "yield" or "suspend_me" > function, the kernel would check if some other thread is ready to run, > and if not the kernel would stick in a loop, or schedule a looping "null > thread" that is always ready to run. >
Exactly. When a task has finished, control must be returned to the scheduler, either by calling a "yield" function, or by returning to its caller (the kernel). You could, I suppose, end a task in an infinite loop and rely on the pre-empter to make sure other tasks get processor time. But you certainly wouldn't expect that thread to ever leave the infinite loop - that's why it's called an "infinite loop".
>> The function is called by branch, not call, because it never returns. > > That could be a reason, but I now doubt it for the IAR compiler -- see > my note on experiments above. The tail-call explanation is the more > likely one. > >> That's what you (OP) wrote in the source code, so that's what the >> compiler does. >> If you want the function to return, you have to write code that allows >> the function to return. In particular, you need to have some way of >> exiting the spin, otherwise it is useless. > > As I understand it, the OP's scheduler (most likely running in an > interrupt handler) will break out of the "eternal" loop by popping the > return address from the stack into the PC, forcing a return from Spin. > This is legal MSP430 code, but out of C semantics. >
If the OP wants to write such brain-dead code in some sort of non-C, that's up to him - but he should not expect to use a C compiler to achieve it.
>> If you can't see why you need something along these lines, you'll have >> to think a bit harder about how you want your code to work. But >> telling the compiler you want a tight infinite loop, and then trying >> to find some way to break out of it, is definitely not the answer. > > Making Spin test a flag that the scheduler sets is a solution, but a > different solution. >
That's /almost/ correct. Making Spin test a flag /is/ a solution. But it's not a "different solution", because he doesn't have a solution at the moment - his scheduler concept /cannot/ be made to work the way he thinks. An infinite loop is a dead end to the thread that hits it - no exits, no escapes, no returns. It's dead. The end. Rather than trying to play Dr. Frankenstein, the OP should re-think the way his scheduler should work, and what Spin() should actually do. In particular, if he wants the function to be able to return, he must give it a way to return.
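In concrete terms, something like this - a sketch only, with the flag name and the question of who sets it left open:

#include <stdint.h>

/* Set by the scheduler's tick interrupt when this task may continue.
   'volatile' is essential: it tells the compiler the value can change
   behind its back, so the loop below cannot be optimised away. */
static volatile uint8_t resume_flag;

void Spin(void)
{
    while (!resume_flag) {
        /* wait for the scheduler */
    }
    resume_flag = 0;   /* consume the event */
    /* plain C return - now there is always a genuine way out, and the
       compiler can no longer deduce that Spin() never returns */
}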
> It could be safer to write Spin in assembly language, to prevent the C > compiler gaining any false knowledge about its behaviour, such as "does > not return" knowledge.
Rubbish. Fake assembly to lie to the compiler is not the answer.
> But if the OP knows that the C compiler does not > transport such knowledge across compilation units, writing Spin in C > (for separate compilation) is safe. Of course this has to be rechecked
Dangerous rubbish. Code that relies on separate compilation is as broken as code that relies on hobbling the optimiser. You don't have that choice - the compiler can transport anything it wants across compilation units, and you can't choose to stop that.
> for each new version of the compiler, so it is indeed a maintenance > burden, over and above the burden of checking for changes in the calling > protocols and run-time system structure, which a scheduler author has to > do for every compiler version anyway. > > Summary: Tail call optimization is the likely cause of the compiler > using a branch instead of a call instruction. So: >
Real summary: The original idea is /wrong/. An infinite loop has no exit and no return. If the function Spin() needs to exit, it should have an exit. Write code that says what you want it to do, don't write something totally different and rely on layers of workarounds, compiler-specific hacks, assembly tricks and other nonsense.
[ Quotations edited severely but hopefully without misattribution.]

>>>> brOS wrote: >>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>> >>>>>>> Dear all, >>>>>>> >>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>> instead of br(branch)for disassembling function call? >>>>>>> It is extremely important for me to specific function is >>>>>>> disassembled >>>>>>> using call instead of brunch, as compiler always does. >>>>>>> >>>> ... >>>>>> >>>>> This is why i need it.... >>>>> Function I'm calling have looks something like this: >>>>> void Spin(void){ >>>>> for(;;){} >>>>> } >>>>> So if it is disassembled with call before entering in pc will be >>>>> saved on >>>>> stack and it will point to instruction after function spin....So I >>>>> want to >>>>> use that pc and to save context so when my scheduler schedule that >>>>> task >>>>> again it will not continue spinning in that forever loop but it >>>>> will jump >>>>> to next instruction after Spin function..... >>>>> branch doesn t push pc to stack so taht s my problem;)
> Niklas Holsti wrote:
>> ... The scheduler/kernel routines should follow the C compiler's >> calling protocols, but will themselves do things that exceed C's >> semantics.
David Brown wrote:
> > No, the scheduler/kernel should /not/ rely on the compiler's calling > protocols.
I didn't say "rely"-- I said "follow". If the application calls a kernel routine, it will use the compiler's calling protocols that, for example, say which registers must be preserved, and which can be overwritten. The kernel routine should follow these rules, but is certainly allowed to change the values of the overwritable registers, for example. (Note, I am not talking about *pre-emption* here, nor was the OP, I believe.)
> The compiler can change these as it wants, and mix them for > different functions. If the scheduler depends on the compiler using > particular instructions to call a function,
The question here is not really about particular instructions, but about the state in which the Spin routine is entered, specifically whether there is a usable return address on the stack. The presence of a return address on the stack must be defined in the compiler's calling protocol if the compiler is meant to be able to interface to assembly-language routines or generally "foreign" routines.
> the scheduler is broken - a > pre-emptive scheduler can assume /nothing/ about the code it is > pre-empting.
In principle true -- for a preemptive scheduler. (The OP is most likely not making a pre-emptive scheduler, however.) But in practice a pre-emptive scheduler must sometimes know about the run-time architecture of the pre-empted software. For example, some small systems use statically allocated memory for thread-specific data, such as additional working "registers" for floating-point libraries. A pre-emptive kernel has to know about such things in order to save and restore context. The alternative is to disable preemption while a thread uses such software-defined shared resources; the choice is a trade-off between latency and context-switching overhead. But that is veering off-topic, I think.
> If you have a scheduler that for some reason needs a way to get a > function's return address, then it needs to use a compiler-specific > feature such as gcc's "__builtin_return_address()" function. If the > compiler doesn't have such a feature, then you are out of luck. Get a > different compiler, or write a scheduler that doesn't depend on knowing > the return address.
Not very helpful to the OP. But "tough love", perhaps :-)
> Under no circumstances is it correct to tell the compiler you have an > infinite loop, and then complain because you can't see how to break out > of it.
Who was complaining? The OP seems to know perfectly well how to break out of this loop by changing the PC in the scheduler (when the looping code is interrupted).
> There is no point in trying to help the OP find some workaround to get > this system to compile - he must fix the code.
Eh? The system compiles. And can work, if the compiler's use of a branch instruction instead of a call instruction is only due to tail-call optimization, and there is always a return address on the stack.
>> ... the application would call a kernel "yield" or "suspend_me" >> function, the kernel would check if some other thread is ready to run, >> and if not the kernel would stick in a loop, or schedule a looping >> "null thread" that is always ready to run. >> > > Exactly. When a task has finished, control must be returned to the > scheduler, either by calling a "yield" function, or by returning to its > caller (the kernel). You could, I suppose, end a task in an infinite > loop and rely on the pre-empter to make sure other tasks get processor > time. But you certainly wouldn't expect that thread to ever leave the > infinite loop - that's why it's called an "infinite loop".
It is not uncommon for kernels to (internally) use an eternal loop ("lab: jump lab") to wait for the next interrupt that creates some work to do, as the OP does in Spin. Yes, the loop is syntactically eternal/infinite, but in the presence of interrupts it can be terminated.
>> As I understand it, the OP's scheduler (most likely running in an >> interrupt handler) will break out of the "eternal" loop by popping the >> return address from the stack into the PC, forcing a return from Spin. >> This is legal MSP430 code, but out of C semantics. >> > > If the OP wants to write such brain-dead code in some sort of non-C, > that's up to him - but he should not expect to use a C compiler to > achieve it.
The OP is combining C semantics -- the loop is eternal -- with interrupt semantics -- the loop can be broken. This approach is normal for writing kernels and schedulers, but of course has its pitfalls. I agree that it would be cleaner to write the non-C-semantics code, such as Spin, in assembly language.
>> Making Spin test a flag that the scheduler sets is a solution, but a >> different solution. > > That's /almost/ correct. Making Spin test a flag /is/ a solution. But > it's not a "different solution", because he doesn't have a solution at > the moment - his scheduler concept /cannot/ be made to work the way he > thinks.
Sure it can -- that is, an interrupt handler can break the Spin loop and resume execution at the point after the Spin call, as long as there is a return address.
>> It could be safer to write Spin in assembly language, to prevent the C >> compiler gaining any false knowledge about its behaviour, such as >> "does not return" knowledge. > > Rubbish. Fake assembly to lie to the compiler is not the answer.
There is nothing "fake" about this. A kernel/scheduler (especially if pre-emptive) has to go beyond C semantics. Using assembly language is the normal way to do this. And using the return address is the normal way for a kernel to save the PC of a thread, when a kernel routine suspends the thread.
> The original idea is /wrong/. An infinite loop has no exit and no > return.
The loop in Spin can be terminated by the scheduler using PC manipulations, as in a typical scheduler. Nothing wrong with that, although it is risky to write it in C, for the reason that we agree on: the C compiler will only see the C semantics, and may use them in ways that cause problems for this idea.
Niklas Holsti wrote:
> [ Quotations edited severely but hopefully without misattribution.] > >>>>> brOS wrote: >>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>> >>>>>>>> Dear all, >>>>>>>> >>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>> It is extremely important for me to specific function is >>>>>>>> disassembled >>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>> >>>>> ... >>>>>>> >>>>>> This is why i need it.... >>>>>> Function I'm calling have looks something like this: >>>>>> void Spin(void){ >>>>>> for(;;){} >>>>>> } >>>>>> So if it is disassembled with call before entering in pc will be >>>>>> saved on >>>>>> stack and it will point to instruction after function spin....So I >>>>>> want to >>>>>> use that pc and to save context so when my scheduler schedule that >>>>>> task >>>>>> again it will not continue spinning in that forever loop but it >>>>>> will jump >>>>>> to next instruction after Spin function..... >>>>>> branch doesn t push pc to stack so taht s my problem;) > >> Niklas Holsti wrote: > >>> ... The scheduler/kernel routines should follow the C compiler's >>> calling protocols, but will themselves do things that exceed C's >>> semantics. > > David Brown wrote: >> >> No, the scheduler/kernel should /not/ rely on the compiler's calling >> protocols. > > I didn't say "rely"-- I said "follow". If the application calls a kernel > routine, it will use the compiler's calling protocols that, for example, > say which registers must be preserved, and which can be overwritten. The > kernel routine should follow these rules, but is certainly allowed to > change the values of the overwritable registers, for example. (Note, I > am not talking about *pre-emption* here, nor was the OP, I believe.) >
The OP has given us very little information to go on - a lot of what we both are writing about is speculation (and I am just as likely to guess incorrectly as you). However, since an infinite loop can clearly never be broken without pre-emption, I am assuming he /does/ want pre-emption. Certainly the kernel should follow the compiler's conventions for function calling - it should, as far as practically possible, be written in C, and thus calling conventions follow automatically. I misinterpreted your post - I thought you meant the kernel could assume that the code it is scheduling always follows the compiler's conventions.
>> The compiler can change these as it wants, and mix them for different >> functions. If the scheduler depends on the compiler using particular >> instructions to call a function, > > The question here is not really about particular instructions, but about > the state in which the Spin routine is entered, specifically whether > there is a usable return address on the stack. The presence of a return > address on the stack must be defined in the compiler's calling protocol > if the compiler is meant to be able to interface to assembly-language > routines or generally "foreign" routines. >
That is only true at the points at which it actually /is/ interfaced to "foreign" code. When a C function calls another C function, the compiler can use or abuse whatever calling convention it likes at the time. Good compilers can and will do all sorts of re-arrangements to get better code, including inlining code bodies, changing register usage, or using a "branch" instead of a "call" when the called function cannot return. Nothing you can do with compiler flags, separate compilation, or other tricks can change that in a reliable way.
>> the scheduler is broken - a pre-emptive scheduler can assume /nothing/ >> about the code it is pre-empting. > > In principle true -- for a preemptive scheduler. (The OP is most likely > not making a pre-emptive scheduler, however.) But in practice a
The code is just as broken for a co-operative scheduler. As you have said yourself, when a task wants to release the processor it should call the kernel scheduler. I don't know how much you have worked on schedulers, but I get the impression you know what you are doing and could write one perfectly well. You would solve the same sorts of problems in a similar way to the way I or most other scheduler writers would. So I don't really want to sound like I am trying to teach you something you already know about. But I just cannot comprehend why you are defending the OP's bad design, and trying to find ways to jam that square peg into a round hole. You know as well as I do that writing a tight infinite loop, and then trying to find some way to go around the compiler to break out of the loop, is bad design from step 1. Everything else in this thread is of minor relevance (though interesting).
> pre-emptive scheduler must sometimes know about the run-time > architecture of the pre-empted software. For example, some small systems > use statically allocated memory for thread-specific data, such as > additional working "registers" for floating-point libraries. A > pre-emptive kernel has to know about such things in order to save and > restore context. The alternative is to disable preemption while a thread > uses such software-defined shared resources; the choice is a trade-off > between latency and context-switching overhead. >
That is true enough. In such a situation, the OS must know whether these additional "registers" (or for some devices, they are real registers) must be preserved and restored. In "normal" embedded code the same situation turns up with interrupts. For example, when using the embedded multiplier on the msp430 you must disable interrupts or be sure that the interrupt routines don't use the multiplier.
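For instance, a minimal sketch of the multiplier case, assuming IAR intrinsics and a device whose header provides the classic MPY/OP2/RESLO registers:

#include <msp430.h>      /* device header; IAR projects may use <io430.h> */
#include <intrinsics.h>  /* __disable_interrupt() / __enable_interrupt() */

/* 16x16 unsigned multiply (low word) using the hardware multiplier.
   The MPY/OP2/RESLO registers are shared state, so an ISR must not be
   allowed to use the multiplier while we are in the middle of this. */
unsigned int hw_mul16(unsigned int a, unsigned int b)
{
    unsigned int result;

    __disable_interrupt();   /* a real version might save and restore GIE */
    MPY = a;                 /* first operand, unsigned multiply mode */
    OP2 = b;                 /* writing the second operand starts the multiply */
    result = RESLO;          /* low word of the 32-bit product */
    __enable_interrupt();

    return result;
}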
> But that is veering off-topic, I think. >
Only a little :-)
>> If you have a scheduler that for some reason needs a way to get a >> function's return address, then it needs to use a compiler-specific >> feature such as gcc's "__builtin_return_address()" function. If the >> compiler doesn't have such a feature, then you are out of luck. Get a >> different compiler, or write a scheduler that doesn't depend on >> knowing the return address. > > Not very helpful to the OP. But "tough love", perhaps :-) >
That is, IMHO, what the OP needs here. Any advice he gets that help him continue down his original path is false help.
>> Under no circumstances is it correct to tell the compiler you have an >> infinite loop, and then complain because you can't see how to break >> out of it. > > Who was complaining? The OP seems to know perfectly well how to break > out of this loop by changing the PC in the scheduler (when the looping > code is interrupted). >
He is complaining because although he knows he has to change the PC, he doesn't know what new value to use.
>> There is no point in trying to help the OP find some workaround to get >> this system to compile - he must fix the code. > > Eh? The system compiles. And can work, if the compiler's use of a branch > instruction instead of a call instruction is only due to tail-call > optimization, and there is always a return address on the stack. >
A system can work (assuming for a moment that it can be made to work), and yet still be so badly designed and fragile that it is "broken".
>>> ... the application would call a kernel "yield" or "suspend_me" >>> function, the kernel would check if some other thread is ready to >>> run, and if not the kernel would stick in a loop, or schedule a >>> looping "null thread" that is always ready to run. >>> >> >> Exactly. When a task has finished, control must be returned to the >> scheduler, either by calling a "yield" function, or by returning to >> its caller (the kernel). You could, I suppose, end a task in an >> infinite loop and rely on the pre-empter to make sure other tasks get >> processor time. But you certainly wouldn't expect that thread to ever >> leave the infinite loop - that's why it's called an "infinite loop". > > It is not uncommon for kernels to (internally) use an eternal loop > ("lab: jump lab") to wait for the next interrupt that creates some work > to do, as the OP does in Spin. Yes, the loop is syntactically > eternal/infinite, but in the presence of interrupts it can be terminated. >
It is certainly possible to have such an infinite loop in the kernel - but only as an idle function for when the processor is doing nothing. The thread is never expected to continue beyond the loop, or return from it in any way.
>>> As I understand it, the OP's scheduler (most likely running in an >>> interrupt handler) will break out of the "eternal" loop by popping >>> the return address from the stack into the PC, forcing a return from >>> Spin. This is legal MSP430 code, but out of C semantics. >>> >> >> If the OP wants to write such brain-dead code in some sort of non-C, >> that's up to him - but he should not expect to use a C compiler to >> achieve it. > > The OP is combining C semantics -- the loop is eternal -- with interrupt > semantics -- the loop can be broken. This approach is normal for writing > kernels and schedulers, but of course has its pitfalls. >
It is perfectly common and reasonable to have almost-infinite loops. An obvious example is a real spin lock, as implemented in real working schedulers - you have a tight loop that checks for an external event (such as a flag set within an interrupt routine or another task), and exits the loop when the flag is set. But the critical point here is that the loop has an exit clause. If you want to write a loop that will be exited, you write a loop with an exit clause.
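The interrupt-routine side of such a loop might look like this - again only a sketch, in IAR's interrupt syntax, with the vector and flag names as placeholders:

#include <msp430.h>     /* device header; vector names vary by device */
#include <stdint.h>

/* The flag that the waiting loop polls; set here when it is time to continue. */
extern volatile uint8_t resume_flag;

/* Periodic tick interrupt in IAR syntax; TIMERA0_VECTOR is only an
   example - use whatever timer the scheduler tick actually runs on. */
#pragma vector = TIMERA0_VECTOR
__interrupt void tick_isr(void)
{
    /* ... any scheduler bookkeeping ... */
    resume_flag = 1;    /* releases the task spinning on the flag */
}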
> I agree that it would be cleaner to write the non-C-semantics code, such > as Spin, in assembly language. >
I am /not/ saying this sort of code should be written in assembly - I am saying it should not be written at all! It can never be "clean" code. But if it is written in assembly, then at least you are giving the tools no useful information, instead of directly lying to them.
>>> Making Spin test a flag that the scheduler sets is a solution, but a >>> different solution. >> >> That's /almost/ correct. Making Spin test a flag /is/ a solution. >> But it's not a "different solution", because he doesn't have a >> solution at the moment - his scheduler concept /cannot/ be made to >> work the way he thinks. > > Sure it can -- that is, an interrupt handler can break the Spin loop and > resume execution at the point after the Spin call, as long as there is a > return address. >
How is this in any way "better" than having Spin loop until a flag is set, and have the interrupt handler set that flag? Doing it the right way is entirely standard C, is far easier, far safer, far more portable, far more maintainable, and is smaller and faster than any sort of hack you might conceivably get working.
>>> It could be safer to write Spin in assembly language, to prevent the >>> C compiler gaining any false knowledge about its behaviour, such as >>> "does not return" knowledge. >> >> Rubbish. Fake assembly to lie to the compiler is not the answer. > > There is nothing "fake" about this. A kernel/scheduler (especially if > pre-emptive) has to go beyond C semantics. Using assembly language is > the normal way to do this. And using the return address is the normal > way for a kernel to save the PC of a thread, when a kernel routine > suspends the thread. >
Using assembly language where assembly language is needed is absolutely fine - and a pre-emptive scheduler is always going to need some assembly language. But using assembly language to try to force the compiler not to optimise some code is almost always bad design. And the scheduler gets the PC of a thread by looking at the return address for the interrupt routine, not by trying to dig down the stack and guess the return address for the current function in the interrupted thread.
>> The original idea is /wrong/. An infinite loop has no exit and no >> return. > > The loop in Spin can be terminated by a scheduler using PC > manipulations, as in a typical scheduler. Nothing wrong about that, > although it is risky to write it in C, for the reason that we agree on: > the C compiler will only see the C semantics, and may use them in ways > that cause problems for this idea. >
One thing we haven't really discussed here is how the interrupt routine / scheduler knows that the thread is in the Spin function. Is it going to take the real thread PC (from the interrupt routine's return stack) and compare it to the address of the Spin function to determine if the thread is currently at the "lab: jump lab" instruction? If it is there, then it will look deeper in the stack for the previous return address, and return to that point. If not, then the thread is somewhere else and the interrupt routine (or the scheduler) must return there. While such a scheme may theoretically be made to work, it is needlessly complicated, very fragile and highly dependent on getting the code compiled in exactly the right way, and hopelessly restrictive and inflexible. Maybe there is something here that I'm missing - perhaps the OP will come back to us with some more information.
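To make the fragility concrete, a rough sketch of what that comparison would have to look like - every symbol name and stack-layout assumption here is a guess, which is rather the point:

#include <stdint.h>

/* Purely illustrative: the addresses bracketing the Spin loop would have
   to come from the linker or a map file, and the interrupted PC and stack
   pointer would have to be obtained by target-specific assembly. */
extern const uint8_t spin_loop_start[];
extern const uint8_t spin_loop_end[];

void *find_resume_point(uint16_t interrupted_pc, const uint16_t *thread_sp)
{
    if (interrupted_pc >= (uint16_t)(uintptr_t)spin_loop_start &&
        interrupted_pc <  (uint16_t)(uintptr_t)spin_loop_end) {
        /* Thread was parked in Spin: hope that the word at this guessed
           stack offset is the return address pushed by the CALL to Spin. */
        return (void *)(uintptr_t)thread_sp[0];
    }
    /* Thread was somewhere else: resume it at the interrupted PC itself. */
    return (void *)(uintptr_t)interrupted_pc;
}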
David Brown wrote:
> Niklas Holsti wrote: >> [ Quotations edited severely but hopefully without misattribution.] >> >>>>>> brOS wrote: >>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>> >>>>>>>>> Dear all, >>>>>>>>> >>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>> It is extremely important for me to specific function is >>>>>>>>> disassembled >>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>> >>>>>> ... >>>>>>>> >>>>>>> This is why i need it.... >>>>>>> Function I'm calling have looks something like this: >>>>>>> void Spin(void){ >>>>>>> for(;;){} >>>>>>> } >>>>>>> So if it is disassembled with call before entering in pc will be >>>>>>> saved on >>>>>>> stack and it will point to instruction after function spin....So >>>>>>> I want to >>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>> that task >>>>>>> again it will not continue spinning in that forever loop but it >>>>>>> will jump >>>>>>> to next instruction after Spin function..... >>>>>>> branch doesn t push pc to stack so taht s my problem;) >> > > The OP has given us very little information to go on - a lot of what we > both are writing about is speculation (and I am just as likely to guess > incorrectly as you).
Yes. I wrote my answer assuming that the OP knows what he or she is doing but was concerned that the branch instruction might not leave a good return address on the stack.
> However, since an infinite loop can clearly never > be broken without pre-emption, I am assuming he /does/ want pre-emption.
I would not call it pre-emption, but interruption. To me, pre-emption means suspending a task at some arbitrary point in its execution and switching control to another task. In the OP's code, the Spin function seems to be the expected place for suspending and resuming the task, so the task is prepared for it, at that point. This looks like co-operative multi-tasking. My guess about the OP's design was that the Spin function would be used for consuming the rest of a thread's time-slice when the thread has finished its current job, and that the OP would not try to schedule another ready thread to use this (slack) time, perhaps in order to have deterministic time-triggered behaviour, or perhaps to avoid pre-emptions.
> I misinterpreted your post - I thought you meant the kernel could assume > that the code it is scheduling always follows the compiler's conventions.
I agree completely that a pre-emptive kernel cannot assume that. (Well, there may be *some* conventions that always hold, for example relating to the stack pointer. But all conventions known to hold at a "foreign" call are generally not true at arbitrary points.)
> That is only true at the points at which it actually /is/ interfaced to > "foreign" code. When a C function calls another C function, the > compiler can use or abuse whatever calling convention it likes at the > time.
Agreed. For most embedded compilers, though, anything in a separate compilation is considered "foreign". But as you say, there is no guarantee in general.
> The code is just as broken for a co-operative scheduler. As you have > said yourself, when a task wants to release the processor it should call > the kernel scheduler.
In my guess as to what the OP is doing, the call to Spin *is* this call, which would make the OP's kernel a rather special one. On the other hand, perhaps I mis-guessed, and the call to Spin happens *within* the OP's kernel, after the kernel has done the more normal things such as looking for other ready tasks.
> I don't know how much you have worked on schedulers, but I get the > impression you know what you are doing and could write one perfectly > well.
Thanks. I've written a couple of simple, co-operative ones, a while ago, for obsolete processors, and studied a few other, current ones from the point of view of static WCET analysis.
> But I just cannot comprehend why you are defending the OP's bad design, > and trying to find ways to jam that square peg into a round hole. > > You know as well as I do that writing a tight infinite loop, and then > trying to find some way to go around the compiler to break out of the > loop, is bad design from step 1. Everything else in this thread is of > minor relevance (though interesting).
I'm not so ready to call this "bad design" without knowing more about the OP's requirements and design. The code generated for Spin is exactly the kind of tight eternal loop that you often find in a kernel where the kernel has no ready tasks and waits for an interrupt. I haven't tried it, but it seems to me that writing this loop as a conditional, flag-checking one could increase (by a little) the latency for resuming the right task when an interrupt happens, compared to resuming the task directly from the interrupt handler and simply abandoning the tight loop.

It may be bad practice to rely on the C compiler to generate this code, and perhaps I should have said so in my original reply to the OP. It has been said now, good.

David Brown wrote:
>>> Under no circumstances is it correct to tell the compiler you have an >>> infinite loop, and then complain because you can't see how to break >>> out of it.
Niklas Holsti replied:
>> Who was complaining? The OP seems to know perfectly well how to break >> out of this loop by changing the PC in the scheduler (when the looping >> code is interrupted).
David Brown replied:
> He is complaining because although he knows he has to change the PC, he > doesn't know what new value to use.
Because the OP thought that a branch instruction would not leave a return address on the stack. But if the branch instruction implements a tail call, it does leave a return address (although for an outer call).
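For reference, this is the shape of code where that happens - a minimal sketch with invented function names:

void do_work(void);
void Spin(void);

void task_body(void)
{
    do_work();
    /* Tail call: nothing remains to be done in task_body after Spin(),
       so the compiler may emit "br #Spin" instead of "call #Spin",
       leaving task_body's own (outer) return address on the stack. */
    Spin();
}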
> How is this in any way "better" than having Spin loop until a flag is > set, and have the interrupt handler set that flag?
See my comment on latency, above. But of course this is again a guess as to why the OP is doing it this way.
> Using assembly language where assembly language is needed is absolutely > fine - and a pre-emptive scheduler is always going to need some assembly > language. But using assembly language to try to force the compiler not > to optimise some code is almost always bad design.
Writing a function in assembly language (and not, of course, as "in-line assembly code" in a C file) is a pretty sure way of making the C compiler treat it as a "foreign" function and so ensure that calls use the standard conventions, including pushing a return address.
> And the scheduler gets the PC of a thread by looking at the return > address for the interrupt routine,
Or the return address of the call from the thread to the kernel function, which is the case for Spin (I guess).
> One thing we haven't really discussed here is how the interrupt routine > / scheduler knows that the thread is in the Spin function. Is it going > to take the real thread PC (from the interrupt routine's return stack) > and compare it to the address of the Spin function to determine if the > thread is current at the "lab: jump lab" instruction? If it is there, > then it will look deeper in the stack for the previous return address, > and return to that point.
That is (also) my guess of what the OP is trying to do.
> If not, then the thread is somewhere else and > the interrupt routine (or the scheduler) must return there.
Maybe not. In my guess of the OP's design, if the thread is not in Spin when the interrupt happens, the thread has exceeded its time-slice. I don't of course know what the OP intends the kernel/scheduler to do, in that case; perhaps log a fatal error and reboot. Another choice is to set an error flag and let the thread continue until the next tick, when it is checked again.
> Maybe there is something here that I'm missing - perhaps the OP will > come back to us with some more information.
That would be good.
On Nov 23, 3:46 pm, Rob Gaddi <rga...@technologyhighland.com> wrote:

> B) I think the world would be a generally happier place if more > processors had a dedicated brunch instruction. I figure that properly > implemented it ought to take a good hour and a half to return, and then > come back with the stack smelling of coffee and bacon.
*PROPERLY* implemented it should divert to the nearest pub and not return until the keg is dry.
Niklas Holsti wrote:
> David Brown wrote: >> Niklas Holsti wrote: >>> [ Quotations edited severely but hopefully without misattribution.] >>> >>>>>>> brOS wrote: >>>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Dear all, >>>>>>>>>> >>>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>>> It is extremely important for me to specific function is >>>>>>>>>> disassembled >>>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>>> >>>>>>> ... >>>>>>>>> >>>>>>>> This is why i need it.... >>>>>>>> Function I'm calling have looks something like this: >>>>>>>> void Spin(void){ >>>>>>>> for(;;){} >>>>>>>> } >>>>>>>> So if it is disassembled with call before entering in pc will be >>>>>>>> saved on >>>>>>>> stack and it will point to instruction after function spin....So >>>>>>>> I want to >>>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>>> that task >>>>>>>> again it will not continue spinning in that forever loop but it >>>>>>>> will jump >>>>>>>> to next instruction after Spin function..... >>>>>>>> branch doesn t push pc to stack so taht s my problem;) >>> >> >> The OP has given us very little information to go on - a lot of what >> we both are writing about is speculation (and I am just as likely to >> guess incorrectly as you). > > Yes. > > I wrote my answer assuming that the OP knows what he or she is doing but > was concerned that the branch instruction might not leave a good return > address on the stack. >
I don't think "the OP knows what he or she is doing" is a fair assumption, based on the posted code for Spin() !
>> However, since an infinite loop can clearly never be broken without >> pre-emption, I am assuming he /does/ want pre-emption. > > I would not call it pre-emption, but interruption. To me, pre-emption > means suspending a task at some arbitrary point in its execution and > switching control to another task. In the OP's code, the Spin function > seems to be the expected place for suspending and resuming the task, so > the task is prepared for it, at that point. This looks like co-operative > multi-tasking. >
It's possible (or maybe even likely) that the OP is /trying/ to implement a co-operative scheduler. But it doesn't actually co-operate - an eternal loop is not co-operative, even if you cheat and break out using interrupts. Interrupts are inherently asynchronous - if the thread can be suspended by an interrupt function, that is pre-emptive multitasking.
> My guess about the OP's design was that the Spin function would be used > for consuming the rest of a thread's time-slice when the thread has > finished its current job, and that the OP would not try to schedule > another ready thread to use this (slack) time, perhaps in order to have > deterministic time-triggered behaviour, or perhaps to avoid pre-emptions. >
That could well be the intention. But spinning like that is a silly idea, and even if he wants to do what you suggest here, the implementation is totally wrong. The interrupt should set a flag, and the spin lock should block waiting for the flag.
>> I misinterpreted your post - I thought you meant the kernel could >> assume that the code it is scheduling always follows the compiler's >> conventions. > > I agree completely that a pre-emptive kernel cannot assume that. (Well, > there may be *some* conventions that always hold, for example relating > to the stack pointer. But all conventions known to hold at a "foreign" > call are generally not true at arbitrary points.) > >> That is only true at the points at which it actually /is/ interfaced >> to "foreign" code. When a C function calls another C function, the >> compiler can use or abuse whatever calling convention it likes at the >> time. > > Agreed. For most embedded compilers, though, anything in a separate > compilation is considered "foreign". But as you say, there is no > guarantee in general. >
These days, full program optimisation is not uncommon. Even gcc (despite its critics' opinions) can do reasonable full program optimisation by compiling all the C modules in one shot.
>> The code is just as broken for a co-operative scheduler. As you have >> said yourself, when a task wants to release the processor it should >> call the kernel scheduler. > > In my guess as to what the OP is doing, the call to Spin *is* this call, > which would make the OP's kernel a rather special one. On the other > hand, perhaps I mis-guessed, and the call to Spin happens *within* the > OP's kernel, after the kernel has done the more normal things such as > looking for other ready tasks. > >> I don't know how much you have worked on schedulers, but I get the >> impression you know what you are doing and could write one perfectly >> well. > > Thanks. I've written a couple of simple, co-operative ones, a while ago, > for obsolete processors, and studied a few other, current ones from the > point of view of static WCET analysis. >
I think most of our apparent disagreements have the basis in different guesses as to what we think the OP is trying to do. Hopefully the OP is still reading the thread, and will take some inspiration from our discussion!
>> But I just cannot comprehend why you are defending the OP's bad >> design, and trying to find ways to jam that square peg into a round hole. >> >> You know as well as I do that writing a tight infinite loop, and then >> trying to find some way to go around the compiler to break out of the >> loop, is bad design from step 1. Everything else in this thread is of >> minor relevance (though interesting). > > I'm not so ready to call this "bad design" without knowing more about > the OP's requirements and design. The code generated for Spin is exactly > the kind of tight eternal loop that you often find in a kernel where the > kernel has no ready tasks and waits for an interrupt. I haven't tried > it, but it seems to me that writing this loop as a conditional, > flag-checking one could increase (by a little) the latency for resuming > the right task when an interrupt happens, compared to resuming the task > directly from the interrupt handler and simply abandoning the tight loop. >
Nah, the loop overhead to continually read a flag would be a few cycles at most. The interrupt function overhead to figure out return addresses from the stack will be much, much worse.

When I see someone write one thing, and mean another, I see a mistake. When the author knows what he has written and wants to find some way to work around this difference rather than correcting the code, I see a bad design. Maybe I'm just less tolerant than you.
> It may be bad practice to rely on the C compiler to generate this code, > and perhaps I should have said so in my original reply to the OP. It has > been said now, good. > > David Brown wrote: >>>> Under no circumstances is it correct to tell the compiler you have >>>> an infinite loop, and then complain because you can't see how to >>>> break out of it. > > Niklas Holsti replied: >>> Who was complaining? The OP seems to know perfectly well how to break >>> out of this loop by changing the PC in the scheduler (when the >>> looping code is interrupted). > > David Brown replied: >> He is complaining because although he knows he has to change the PC, >> he doesn't know what new value to use. > > Because the OP thought that a branch instruction would not leave a > return address on the stack. But if the branch instruction implements a > tail call, it does leave a return address (although for an outer call). > >> How is this in any way "better" than having Spin loop until a flag is >> set, and have the interrupt handler set that flag? > > See my comment on latency, above. But of course this is again a guess as > to why the OP is doing it this way. > >> Using assembly language where assembly language is needed is >> absolutely fine - and a pre-emptive scheduler is always going to need >> some assembly language. But using assembly language to try to force >> the compiler not to optimise some code is almost always bad design. > > Writing a function in assembly language (and not, of course, as "in-line > assembly code" in a C file) is a pretty sure way of making the C > compiler treat is as a "foreign" function and so ensure that calls use > the standard conventions, including pushing a return address. >
That is true, but my point is that you should not use assembly like this just to "get around" the compiler - not without very good reasons. I've often seen people use assembly code to try to force the compiler to act in some way, when they could have done much better while staying within C.
>> And the scheduler gets the PC of a thread by looking at the return >> address for the interrupt routine, > > Or the return address of the call from the thread to the kernel > function, which is the case for Spin (I guess). > >> One thing we haven't really discussed here is how the interrupt >> routine / scheduler knows that the thread is in the Spin function. Is >> it going to take the real thread PC (from the interrupt routine's >> return stack) and compare it to the address of the Spin function to >> determine if the thread is current at the "lab: jump lab" >> instruction? If it is there, then it will look deeper in the stack >> for the previous return address, and return to that point. > > That is (also) my guess of what the OP is trying to do. > >> If not, then the thread is somewhere else and the interrupt routine >> (or the scheduler) must return there. > > Maybe not. In my guess of the OP's design, if the thread is not in Spin > when the interrupt happens, the thread has exceeded its time-slice. I > don't of course know what the OP intends the kernel/scheduler to do, in > that case; perhaps log a fatal error and reboot. Another choice is to > set an error flag and let the thread continue until the next tick, when > it is checked again. >
Your guesses as to the OP's ideas make a certain sense - perhaps he is trying to implement a sort of fixed time-slice scheduler. The implementation of Spin() is still wrong (you'll never convince me otherwise!), but that might bring us a little closer to helping him get a working implementation.
>> Maybe there is something here that I'm missing - perhaps the OP will >> come back to us with some more information. > > That would be good. >
David Brown wrote:
> Niklas Holsti wrote: >> David Brown wrote: >>> Niklas Holsti wrote: >>>> [ Quotations edited severely but hopefully without misattribution.] >>>> >>>>>>>> brOS wrote: >>>>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Dear all, >>>>>>>>>>> >>>>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>>>> It is extremely important for me to specific function is >>>>>>>>>>> disassembled >>>>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>>>> >>>>>>>> ... >>>>>>>>>> >>>>>>>>> This is why i need it.... >>>>>>>>> Function I'm calling have looks something like this: >>>>>>>>> void Spin(void){ >>>>>>>>> for(;;){} >>>>>>>>> } >>>>>>>>> So if it is disassembled with call before entering in pc will >>>>>>>>> be saved on >>>>>>>>> stack and it will point to instruction after function >>>>>>>>> spin....So I want to >>>>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>>>> that task >>>>>>>>> again it will not continue spinning in that forever loop but it >>>>>>>>> will jump >>>>>>>>> to next instruction after Spin function..... >>>>>>>>> branch doesn t push pc to stack so taht s my problem;) >>>> >>> >>> The OP has given us very little information to go on - a lot of what >>> we both are writing about is speculation (and I am just as likely to >>> guess incorrectly as you). >> >> Yes. >> >> I wrote my answer assuming that the OP knows what he or she is doing >> but was concerned that the branch instruction might not leave a good >> return address on the stack. >> > > I don't think "the OP knows what he or she is doing" is a fair > assumption, based on the posted code for Spin() ! > >>> However, since an infinite loop can clearly never be broken without >>> pre-emption, I am assuming he /does/ want pre-emption. >> >> I would not call it pre-emption, but interruption. To me, pre-emption >> means suspending a task at some arbitrary point in its execution and >> switching control to another task. In the OP's code, the Spin function >> seems to be the expected place for suspending and resuming the task, >> so the task is prepared for it, at that point. This looks like >> co-operative multi-tasking. >> > > It's possible (or maybe even likely) that the OP is /trying/ to > implement a co-operative scheduler. But it doesn't actually co-operate > - an eternal loop is not co-operative, even if it you cheat and break > out using interrupts. Interrupts are inherently asynchronous - if the > thread can be suspended by an interrupt function, that is pre-emptive > multitasking.
Well, what constitutes "co-operation" may be a matter of precise definition (in real life, sometimes of litigation :-). In my guess of the OP's kernel/scheduler design, the suspension is designed to happen only when the thread is looping in the Spin function. By calling Spin the thread shows that it is ready to be suspended, so it is co-operating in my view. (As discussed earlier, we don't know what happens if the scheduler interrupt finds the thread is *not* in Spin.)
>> Agreed. For most embedded compilers, though, anything in a separate >> compilation is considered "foreign". But as you say, there is no >> guarantee in general. >> > > These days, full program optimisation is not uncommon. Even gcc > (despite its critics' opinions) can do reasonable full program > optimisation by compiling all the C modules in one shot.
Sure, but in that case it would not be "separate compilation". Interesting question, though: Is there a standard way in a C environment to ensure that the standard calling sequence is used for an extern function, with no C-calling-C optimizations?
>> I'm not so ready to call this "bad design" without knowing more about >> the OP's requirements and design. The code generated for Spin is >> exactly the kind of tight eternal loop that you often find in a kernel >> where the kernel has no ready tasks and waits for an interrupt. I >> haven't tried it, but it seems to me that writing this loop as a >> conditional, flag-checking one could increase (by a little) the >> latency for resuming the right task when an interrupt happens, >> compared to resuming the task directly from the interrupt handler and >> simply abandoning the tight loop. >> > > Nah, the loop overhead to continually read a flag would be a few cycles > at most. The interrupt function overhead to figure out return addresses > from the stack will be much, much worse.
Let's consider what the kernel has to do, in my guess of the OP's design, considering the two cases of (A) an unconditional "eternal" loop and (B) a flag-checking loop.

The kernel knows which thread is running. When the thread finishes its job in this time-slice, it calls Spin, expecting to be resumed at the next instruction after the Spin call, say instruction R. The Spin function loops, eating up the rest of the time-slice.

The tick interrupt comes in. The tick interrupt handler saves the context of the interrupted thread. By comparing its PC to the address of the Spin loop, it can check that the thread has not overrun its time-slice. At this point:

- For (A) the handler gets the return address of Spin by a POP and stores this return address as the resumption point for the thread to be suspended.
- For (B) the handler stores the interrupted PC (in the flag-checking loop) as the resumption point.

The interrupt handler (scheduler) finds the thread to run in the next time-slice. In case (B) it then sets the (thread-specific) flag on which Spin is waiting. In case (A) it does not need to set any flag.

The handler restores the context of the new thread. As the last step in this, it pushes the resumption address and the restored status register and does return-from-interrupt (RETI). In case (A), the thread is resumed immediately at the desired instruction, the instruction R that follows the Spin call. In case (B), the thread is resumed in the middle of the flag-checking loop. It still has to read the flag, branch out of the loop, and execute a return instruction (effectively a POP from the stack), before instruction R is reached.

In summary, case (A) and case (B) both have to POP the stack to get to instruction R, but case (B) also has to set a flag and check a flag. It is a close call, but you might save some cycles in case (A). Moreover, in case (B) the flag has to be thread-specific, so it has to be passed to Spin with a parameter, consuming more cycles.
An extension and a correction to my last (deleting the context):

Niklas Holsti wrote:

> In summary, case (A) and case (B) both have to POP the stack to get to
> instruction R, but case (B) also has to set a flag and check a flag.
Plus the flag has to be cleared at some point (being careful to avoid race conditions).
> Moreover, in case (B) the flag has to be thread-specific, so it has to
> be passed to Spin with a parameter, consuming more cycles.
... except if the flag is in a register that is cleared in Spin before
the loop, but set by the scheduler in the context that is restored
(except for this flag) when the Spin loop is resumed.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
And a second correction to myself:

Niklas Holsti wrote:

> Moreover, in case (B) the flag has to be thread-specific, so it has to
> be passed to Spin with a parameter, consuming more cycles.
In fact the flag can be global, not thread-specific, since only one
thread is resumed at a time. But since this thread *is* resumed, it is
certain to find the flag set, which goes to show that the flag is
redundant in this design, and the flagless unconditional loop in
case (A) makes more sense.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
Niklas Holsti wrote:
> David Brown wrote:
>> Niklas Holsti wrote:
>>> ...
>>>
>>> Agreed. For most embedded compilers, though, anything in a separate
>>> compilation is considered "foreign". But as you say, there is no
>>> guarantee in general.
>>
>> These days, full program optimisation is not uncommon.
>> Even gcc (despite its critics' opinions) can do reasonable full
>> program optimisation by compiling all the C modules in one shot.
>
> Sure, but in that case it would not be "separate compilation".
Fair enough. What about gcc 4.5 with -flto ? Then you can compile C modules separately into object files, but the object files hold a copy of the internal trees as well as generated object code. When you link these object files, the trees are used for link-time optimisation, including inlining across modules. You lose all clarity in the definitions of "compile", "link", and "separate compilation". But that is a digression, especially since the msp430 gcc port is not (yet) updated to gcc 4.5, which is itself not yet released.
> Interesting question, though: Is there a standard way in a C environment
> to ensure that the standard calling sequence is used for an extern
> function, with no C-calling-C optimizations?
I think the only way is by being sure that the compiler can't access the code for a function declared as "extern". It should not be hard to do, but you may have to do it explicitly. For example, if you use a compiler's IDE and project manager, you might have to go out of your way to force true separate compilation.
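As a sketch of what forcing true separate compilation can look like in
practice (the file names and layout here are illustrative assumptions,
not from the thread), keep the function's body in its own translation
unit and let callers see only the extern declaration:

/* spin.h -- callers see only the declaration */
extern void Spin(void);

/* spin.c -- compiled on its own, so callers never see the body */
#include "spin.h"

void Spin(void)
{
    /* ... body kept out of sight of the callers ... */
}

/* task.c -- compiled separately; without whole-program or link-time
   optimisation the compiler cannot use knowledge of Spin's body here
   (no inlining, no custom calling convention), although as noted
   earlier in the thread a tail call may still be emitted as a branch */
#include "spin.h"

void Task(void)
{
    Spin();
}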
>>> I'm not so ready to call this "bad design" without knowing more about
>>> the OP's requirements and design. ...
>>
>> Nah, the loop overhead to continually read a flag would be a few
>> cycles at most. The interrupt function overhead to figure out return
>> addresses from the stack will be much, much worse.
>
> Let's consider what the kernel has to do, in my guess of the OP's
> design, considering the two cases of (A) an unconditional "eternal"
> loop and (B) a flag-checking loop.
>
> The kernel knows which thread is running.
>
> When the thread finishes its job in this time-slice, it calls Spin,
> expecting to be resumed at the next instruction after the Spin call,
> say instruction R.
>
> The Spin function loops, eating up the rest of the time-slice.
>
> The tick interrupt comes in.
>
> The tick interrupt handler saves the context of the interrupted thread.
> By comparing its PC to the address of the Spin loop, it can check that
> the thread has not overrun its time-slice. At this point:
Note that a sensible Spin function would tell the kernel that it is finished and entering the spin loop, rather than leaving the interrupt handler to figure it out in this fragile way.
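A minimal sketch of what "telling the kernel" could look like, assuming
a kernel-owned flag (the name thread_done is invented for illustration):

/* Hypothetical flag: set by the thread, read (and cleared) by the
   tick interrupt handler, so the handler need not inspect the PC. */
volatile unsigned char thread_done;

void Spin(void)
{
    thread_done = 1;            /* announce: work for this slice is finished */
    for (;;)
        ;                       /* wait for the scheduler to take over */
}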
> - For (A) the handler gets the return address of Spin
>   by a POP and stores this return address as the resumption
>   point for the thread to be suspended.
Assuming, of course, that you've figured out a way to do that safely and reliably....
> - For (B) the handler stores the interrupted PC (in
>   the flag-checking loop) as the resumption point.
This bit will typically require some assembly, compiler-specific features, or some knowledge of the way the compiler generates interrupt routines. But that's unavoidable when you have an interrupt-based scheduler.
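As one concrete, purely illustrative example: with the IAR MSP430
compiler the tick handler itself can at least be declared in C; the
vector name depends on the device header, and the actual context
save/restore and return-address handling would still need intrinsics
or assembly:

#include <msp430.h>             /* device header; exact name depends on the toolchain */

/* Hypothetical skeleton of a tick handler in IAR C syntax.  The
   interesting parts -- saving the thread context and fishing the
   resumption address off the stack -- are not expressible in
   portable C and are omitted here. */
#pragma vector = TIMERA0_VECTOR
__interrupt void TickHandler(void)
{
    /* save context, choose the next thread, restore its context ... */
}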
> The interrupt handler (scheduler) finds the thread to run in the next
> time-slice. In case (B) it then sets the (thread-specific) flag on which
> Spin is waiting. In case (A) it does not need to set any flag.
Fair enough, although setting a flag is hardly a difficult job, and can be done within standard C.
> The handler restores the context of the new thread. As the last step in
> this, it pushes the resumption address and the restored status register
> and does return-from-interrupt (RETI).
OK.
> In case (A), the thread is resumed immediately at the desired
> instruction, the instruction R that follows the Spin call.
Again assuming that it is possible to figure out the address of R in a reliable way...
> In case (B), the thread is resumed in the middle of the flag-checking
> loop. It still has to read the flag, branch out of the loop, and execute
> a return instruction (effectively a POP from the stack), before
> instruction R is reached.
Yes, you can expect it to take about 3 or 4 instructions before getting to R. That would still be a lot less time than you spend messing around getting the address of R in case (A), so case (B) wins here in time.
> In summary, case (A) and case (B) both have to POP the stack to get to
> instruction R, but case (B) also has to set a flag and check a flag. It
> is a close call, but you might save some cycles in case (A). Moreover,
> in case (B) the flag has to be thread-specific, so it has to be passed
> to Spin with a parameter, consuming more cycles.
Remember that all ideas about how case (A) could feasibly be implemented
are based on hobbling the compiler. Write the code correctly (case B),
and you can let the optimiser do its job - that will overwhelm any
conceivable time advantage case A might have had. Among other things,
Spin() could be inlined into its calling function, removing most of the
overhead.

Even making the great leap of faith that there is a reliable way to get
the desired return address, and then making a second leap of faith that
case A is faster, the concept is /still/ wrong. There is no way that
shaving a few cycles off the latency could justify using this horrible
hack. If those cycles matter, you need a new design.
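For instance, written as ordinary C the wait can be a small static
function that the optimiser is free to inline into the thread body
(all names here are illustrative, not from the thread):

/* Flag set by the scheduler's tick handler when this thread may run on. */
static volatile unsigned char resume_flag;

static void WaitForResume(void)
{
    resume_flag = 0;
    while (resume_flag == 0)
        ;                       /* wait for the next time-slice */
}

void ThreadBody(void)
{
    for (;;) {
        /* do this time-slice's work ... */
        WaitForResume();        /* a good compiler may simply inline this loop */
    }
}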