
IAR MSP430 compiler problem

Started by brOS November 23, 2009
Niklas Holsti wrote:
>> Niklas Holsti wrote: >>> brOS wrote: >>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>> >>>>>> Dear all, >>>>>> >>>>>> Does anybody knows how to force compiler to use call instruction >>>>>> instead of br(branch)for disassembling function call? >>>>>> It is extremely important for me to specific function is disassembled >>>>>> using call instead of brunch, as compiler always does. >>>>>> >>> ... >>>>> >>>> This is why i need it.... >>>> Function I'm calling have looks something like this: >>>> void Spin(void){ >>>> for(;;){} >>>> } >>>> So if it is disassembled with call before entering in pc will be >>>> saved on >>>> stack and it will point to instruction after function spin....So I >>>> want to >>>> use that pc and to save context so when my scheduler schedule that task >>>> again it will not continue spinning in that forever loop but it will >>>> jump >>>> to next instruction after Spin function..... >>>> branch doesn t push pc to stack so taht s my problem;) >>> >>> The compiler has deduced that a branch instruction is as good as a >>> call instruction for this/these calls of Spin. There can be two >>> reasons for that: >>> >>> 1. If the compiler has seen the code of Spin (if it is in the same >>> source-code file as the calling function) it may have deduced that >>> Spin never returns, so it does not need the return address that a >>> call instruction would push on the stack. Of course the compiler >>> cannot know that your scheduler breaks C semantics (I assume by >>> interrupting the eternal loop in Spin) and needs the return address. > > I experimented a bit with the IAR MSP430 compiler (current "kickstart" > version), and it uses call instructions to call a non-returning function > containing only an eternal for-loop, even if the function is presented > in the same source-code file as the call. If the function is marked with > the __noreturn keyword the compiler will use a branch or jump > instruction, though. (I assume that the OP has not marked Spin with > __noreturn.) > > So it seems my suggested reason 1 is not the true explanation. >
It's always fun to test and compare compilers. The stable version of gcc for the msp430 is an older one - 3.2.3 (with 4.x under development). It always "calls" the function even when it knows it is non-returning, and there is a "ret" after the call (and after the infinite loop). Newer gcc versions give tighter code (testing with avr-gcc 4.3.2) - a function calling Spin() inlines the infinite loop into the caller. There are no jumps, calls, or returns. The point here is that such details vary from compiler to compiler, and from version to version. The compiler will do exactly what you tell it, but you can't rely on it using a particular method to implement a particular construct.
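For reference, this is the marking Niklas mentions - a minimal sketch, not the OP's code, using IAR's __noreturn keyword with the gcc attribute as an alternative:

/* Sketch only: telling the compiler that Spin() never returns. With this
   knowledge the compiler may legitimately use a branch instead of a call,
   drop the return sequence, or inline the loop into the caller. */
#ifdef __ICC430__                            /* IAR EW430 */
__noreturn void Spin(void)
#else                                        /* gcc-style compilers */
__attribute__((noreturn)) void Spin(void)
#endif
{
    for (;;) {
        /* spin forever */
    }
}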
>>> 2. If the call to Spin is the last statement in the calling function >>> (a "tail call"), the compiler understands that the call does not have >>> to push a return address, because Spin will return (assuming it would >>> return) to the end of the calling function, which immediately returns >>> to *its* caller. The branch instruction leaves the calling function's >>> return address on the stack, so when Spin returns (assuming it could >>> return) it will take a short-cut and return to the caller of the >>> calling function. This optimization saves time and stack space. > > In my small experiments, the IAR compiler does code a tail call to Spin > using a branch or jump instruction, instead of a call. So reason 2 is a > possible explanation for the OP's observation. Interestingly, this > happens even if the optimization level is set to "None", so this advice > of mine: > >>> Another possibility is to avoid the "High" optimization level of the >>> compiler. > > does not work. >
Optimisation levels are never more than a hint to the compiler. You are just making a suggestion as to how it should balance compile time, ease of debugging, and size and speed of the generated code. Optimisation flags are never demands, and the compiler is free to apply all its optimisations at any level (though obviously it is more user-friendly to have some correlation). Code that is dependent on the optimisation level for correctness is broken code. (Obviously it can be dependent on the optimisation level for size and speed requirements.)
> David Brown wrote: > >> Your diagnosis of the problem is fair enough, but your workarounds >> are, IMHO, totally wrong. Anything that involves trying to trick or >> cripple the compiler (separate compiled files, disabling >> optimisations, fake extra inline assembly, gratuitous function pointer >> usage, etc.) is at best an ugly hack, and at worst a maintenance >> nightmare. Remember, the compiler is free to work around all these >> workarounds - lying to your tools is a bad idea. > > In general I agree with you, David, but the OP is trying to run C code > under a custom scheduler, apparently in some kind of simple > multi-threading or coroutine style. This is out of scope for the C > language, so the operation of the scheduler will involve some things > that the compiler does not know about -- and should not (have to) know > about. The scheduler/kernel routines should follow the C compiler's > calling protocols, but will themselves do things that exceed C's semantics. >
No, the scheduler/kernel should /not/ rely on the compiler's calling protocols. The compiler can change these as it wants, and mix them for different functions. If the scheduler depends on the compiler using particular instructions to call a function, the scheduler is broken - a pre-emptive scheduler can assume /nothing/ about the code it is pre-empting. If you have a scheduler that for some reason needs a way to get a function's return address, then it needs to use a compiler-specific feature such as gcc's "__builtin_return_address()" function. If the compiler doesn't have such a feature, then you are out of luck. Get a different compiler, or write a scheduler that doesn't depend on knowing the return address. Under no circumstances is it correct to tell the compiler you have an infinite loop, and then complain because you can't see how to break out of it.
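Something along these lines - a rough sketch, gcc-specific, with a made-up task-control-block layout:

/* Hypothetical task control block - not from the OP's project. */
struct tcb {
    void *resume_pc;      /* where the scheduler should resume this task */
    /* ... saved registers, stack pointer, and so on ... */
};

extern struct tcb *current_tcb;  /* assumed to be maintained by the kernel */

/* A co-operative "yield": record where to resume, then hand over to the
   scheduler. __builtin_return_address(0) is gcc's way of asking for the
   address this call will return to; noinline keeps the answer meaningful. */
__attribute__((noinline)) void yield(void)
{
    current_tcb->resume_pc = __builtin_return_address(0);
    /* ... save the rest of the context and invoke the scheduler here ... */
}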
> Of course, the person writing the scheduler should know all about the C > compiler's calling protocols and run-time system so that the scheduler > can save and restore thread contexts properly. > > The Spin function seems intended to be part of the application/scheduler > interface; an application task calls it when it has finished its job and > yields to the scheduler. Writing this "yield" routine as an eternal loop > is unusual, but can be OK for a custom kernel. In a more conventional
It is not "unusual", it is "wrong". There is no point in trying to help the OP find some workaround to get this system to compile - he must fix the code.
> kernel, the application would call a kernel "yield" or "suspend_me" > function, the kernel would check if some other thread is ready to run, > and if not the kernel would stick in a loop, or schedule a looping "null > thread" that is always ready to run. >
Exactly. When a task has finished, control must be returned to the scheduler, either by calling a "yield" function, or by returning to its caller (the kernel). You could, I suppose, end a task in an infinite loop and rely on the pre-empter to make sure other tasks get processor time. But you certainly wouldn't expect that thread to ever leave the infinite loop - that's why it's called an "infinite loop".
>> The function is called by branch, not call, because it never returns. > > That could be a reason, but I now doubt it for the IAR compiler -- see > my note on experiments above. The tail-call explanation is the more > likely one. > >> That's what you (OP) wrote in the source code, so that's what the >> compiler does. >> If you want the function to return, you have to write code that allows >> the function to return. In particular, you need to have some way of >> exiting the spin, otherwise it is useless. > > As I understand it, the OP's scheduler (most likely running in an > interrupt handler) will break out of the "eternal" loop by popping the > return address from the stack into the PC, forcing a return from Spin. > This is legal MSP430 code, but out of C semantics. >
If the OP wants to write such brain-dead code in some sort of non-C, that's up to him - but he should not expect to use a C compiler to achieve it.
>> If you can't see why you need something along these lines, you'll have >> to think a bit harder about how you want your code to work. But >> telling the compiler you want a tight infinite loop, and then trying >> to find some way to break out of it, is definitely not the answer. > > Making Spin test a flag that the scheduler sets is a solution, but a > different solution. >
That's /almost/ correct. Making Spin test a flag /is/ a solution. But it's not a "different solution", because he doesn't have a solution at the moment - his scheduler concept /cannot/ be made to work the way he thinks. An infinite loop is a dead end to the thread that hits it - no exits, no escapes, no returns. It's dead. The end. Rather than trying to play Dr. Frankenstein, the OP should re-think the way his scheduler should work, and what Spin() should actually do. In particular, if he wants the function to be able to return, he must give it a way to return.
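In concrete terms, something like this - a sketch only, with the flag name and the question of who sets it left open:

#include <stdint.h>

/* Set by the scheduler's tick interrupt when this task may continue.
   'volatile' is essential: it tells the compiler the value can change
   behind its back, so the loop below cannot be optimised away. */
static volatile uint8_t resume_flag;

void Spin(void)
{
    while (!resume_flag) {
        /* wait for the scheduler */
    }
    resume_flag = 0;   /* consume the event */
    /* plain C return - now there is always a genuine way out, and the
       compiler can no longer deduce that Spin() never returns */
}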
> It could be safer to write Spin in assembly language, to prevent the C > compiler gaining any false knowledge about its behaviour, such as "does > not return" knowledge.
Rubbish. Fake assembly to lie to the compiler is not the answer.
> But if the OP knows that the C compiler does not > transport such knowledge across compilation units, writing Spin in C > (for separate compilation) is safe. Of course this has to be rechecked
Dangerous rubbish. Code that relies on separate compilation is as broken as code that relies on hobbling the optimiser. You don't have that choice - the compiler can transport anything it wants across compilation units, and you can't choose to stop that.
> for each new version of the compiler, so it is indeed a maintenance > burden, over and above the burden of checking for changes in the calling > protocols and run-time system structure, which a scheduler author has to > do for every compiler version anyway. > > Summary: Tail call optimization is the likely cause of the compiler > using a branch instead of a call instruction. So: >
Real summary: The original idea is /wrong/. An infinite loop has no exit and no return. If the function Spin() needs to exit, it should have an exit. Write code that says what you want it to do, don't write something totally different and rely on layers of workarounds, compiler-specific hacks, assembly tricks and other nonsense.
[ Quotations edited severely but hopefully without misattribution.]

>>>> brOS wrote: >>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>> >>>>>>> Dear all, >>>>>>> >>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>> instead of br(branch)for disassembling function call? >>>>>>> It is extremely important for me to specific function is >>>>>>> disassembled >>>>>>> using call instead of brunch, as compiler always does. >>>>>>> >>>> ... >>>>>> >>>>> This is why i need it.... >>>>> Function I'm calling have looks something like this: >>>>> void Spin(void){ >>>>> for(;;){} >>>>> } >>>>> So if it is disassembled with call before entering in pc will be >>>>> saved on >>>>> stack and it will point to instruction after function spin....So I >>>>> want to >>>>> use that pc and to save context so when my scheduler schedule that >>>>> task >>>>> again it will not continue spinning in that forever loop but it >>>>> will jump >>>>> to next instruction after Spin function..... >>>>> branch doesn t push pc to stack so taht s my problem;)
> Niklas Holsti wrote:
>> ... The scheduler/kernel routines should follow the C compiler's >> calling protocols, but will themselves do things that exceed C's >> semantics.
David Brown wrote:
> > No, the scheduler/kernel should /not/ rely on the compiler's calling > protocols.
I didn't say "rely"-- I said "follow". If the application calls a kernel routine, it will use the compiler's calling protocols that, for example, say which registers must be preserved, and which can be overwritten. The kernel routine should follow these rules, but is certainly allowed to change the values of the overwritable registers, for example. (Note, I am not talking about *pre-emption* here, nor was the OP, I believe.)
> The compiler can change these as it wants, and mix them for > different functions. If the scheduler depends on the compiler using > particular instructions to call a function,
The question here is not really about particular instructions, but about the state in which the Spin routine is entered, specifically whether there is a usable return address on the stack. The presence of a return address on the stack must be defined in the compiler's calling protocol if the compiler is meant to be able to interface to assembly-language routines or generally "foreign" routines.
> the scheduler is broken - a > pre-emptive scheduler can assume /nothing/ about the code it is > pre-empting.
In principle true -- for a preemptive scheduler. (The OP is most likely not making a pre-emptive scheduler, however.) But in practice a pre-emptive scheduler must sometimes know about the run-time architecture of the pre-empted software. For example, some small systems use statically allocated memory for thread-specific data, such as additional working "registers" for floating-point libraries. A pre-emptive kernel has to know about such things in order to save and restore context. The alternative is to disable preemption while a thread uses such software-defined shared resources; the choice is a trade-off between latency and context-switching overhead. But that is veering off-topic, I think.
> If you have a scheduler that for some reason needs a way to get a > function's return address, then it needs to use a compiler-specific > feature such as gcc's "__builtin_return_address()" function. If the > compiler doesn't have such a feature, then you are out of luck. Get a > different compiler, or write a scheduler that doesn't depend on knowing > the return address.
Not very helpful to the OP. But "tough love", perhaps :-)
> Under no circumstances is it correct to tell the compiler you have an > infinite loop, and then complain because you can't see how to break out > of it.
Who was complaining? The OP seems to know perfectly well how to break out of this loop by changing the PC in the scheduler (when the looping code is interrupted).
> There is no point in trying to help the OP find some workaround to get > this system to compile - he must fix the code.
Eh? The system compiles. And can work, if the compiler's use of a branch instruction instead of a call instruction is only due to tail-call optimization, and there is always a return address on the stack.
>> ... the application would call a kernel "yield" or "suspend_me" >> function, the kernel would check if some other thread is ready to run, >> and if not the kernel would stick in a loop, or schedule a looping >> "null thread" that is always ready to run. >> > > Exactly. When a task has finished, control must be returned to the > scheduler, either by calling a "yield" function, or by returning to its > caller (the kernel). You could, I suppose, end a task in an infinite > loop and rely on the pre-empter to make sure other tasks get processor > time. But you certainly wouldn't expect that thread to ever leave the > infinite loop - that's why it's called an "infinite loop".
It is not uncommon for kernels to (internally) use an eternal loop ("lab: jump lab") to wait for the next interrupt that creates some work to do, as the OP does in Spin. Yes, the loop is syntactically eternal/infinite, but in the presence of interrupts it can be terminated.
>> As I understand it, the OP's scheduler (most likely running in an >> interrupt handler) will break out of the "eternal" loop by popping the >> return address from the stack into the PC, forcing a return from Spin. >> This is legal MSP430 code, but out of C semantics. >> > > If the OP wants to write such brain-dead code in some sort of non-C, > that's up to him - but he should not expect to use a C compiler to > achieve it.
The OP is combining C semantics -- the loop is eternal -- with interrupt semantics -- the loop can be broken. This approach is normal for writing kernels and schedulers, but of course has its pitfalls. I agree that it would be cleaner to write the non-C-semantics code, such as Spin, in assembly language.
>> Making Spin test a flag that the scheduler sets is a solution, but a >> different solution. > > That's /almost/ correct. Making Spin test a flag /is/ a solution. But > it's not a "different solution", because he doesn't have a solution at > the moment - his scheduler concept /cannot/ be made to work the way he > thinks.
Sure it can -- that is, an interrupt handler can break the Spin loop and resume execution at the point after the Spin call, as long as there is a return address.
>> It could be safer to write Spin in assembly language, to prevent the C >> compiler gaining any false knowledge about its behaviour, such as >> "does not return" knowledge. > > Rubbish. Fake assembly to lie to the compiler is not the answer.
There is nothing "fake" about this. A kernel/scheduler (especially if pre-emptive) has to go beyond C semantics. Using assembly language is the normal way to do this. And using the return address is the normal way for a kernel to save the PC of a thread, when a kernel routine suspends the thread.
> The original idea is /wrong/. An infinite loop has no exit and no > return.
The loop in Spin can be terminated by the scheduler using PC manipulations, as in a typical scheduler. Nothing wrong with that, although it is risky to write it in C, for the reason that we agree on: the C compiler will only see the C semantics, and may use them in ways that cause problems for this idea.
Niklas Holsti wrote:
> [ Quotations edited severely but hopefully without misattribution.] > >>>>> brOS wrote: >>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>> >>>>>>>> Dear all, >>>>>>>> >>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>> It is extremely important for me to specific function is >>>>>>>> disassembled >>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>> >>>>> ... >>>>>>> >>>>>> This is why i need it.... >>>>>> Function I'm calling have looks something like this: >>>>>> void Spin(void){ >>>>>> for(;;){} >>>>>> } >>>>>> So if it is disassembled with call before entering in pc will be >>>>>> saved on >>>>>> stack and it will point to instruction after function spin....So I >>>>>> want to >>>>>> use that pc and to save context so when my scheduler schedule that >>>>>> task >>>>>> again it will not continue spinning in that forever loop but it >>>>>> will jump >>>>>> to next instruction after Spin function..... >>>>>> branch doesn t push pc to stack so taht s my problem;) > >> Niklas Holsti wrote: > >>> ... The scheduler/kernel routines should follow the C compiler's >>> calling protocols, but will themselves do things that exceed C's >>> semantics. > > David Brown wrote: >> >> No, the scheduler/kernel should /not/ rely on the compiler's calling >> protocols. > > I didn't say "rely"-- I said "follow". If the application calls a kernel > routine, it will use the compiler's calling protocols that, for example, > say which registers must be preserved, and which can be overwritten. The > kernel routine should follow these rules, but is certainly allowed to > change the values of the overwritable registers, for example. (Note, I > am not talking about *pre-emption* here, nor was the OP, I believe.) >
The OP has given us very little information to go on - a lot of what we both are writing about is speculation (and I am just as likely to guess incorrectly as you). However, since an infinite loop can clearly never be broken without pre-emption, I am assuming he /does/ want pre-emption. Certainly the kernel should follow the compiler's conventions for function calling - it should, as far as practically possible, be written in C, and thus calling conventions follow automatically. I misinterpreted your post - I thought you meant the kernel could assume that the code it is scheduling always follows the compiler's conventions.
>> The compiler can change these as it wants, and mix them for different >> functions. If the scheduler depends on the compiler using particular >> instructions to call a function, > > The question here is not really about particular instructions, but about > the state in which the Spin routine is entered, specifically whether > there is a usable return address on the stack. The presence of a return > address on the stack must be defined in the compiler's calling protocol > if the compiler is meant to be able to interface to assembly-language > routines or generally "foreign" routines. >
That is only true at the points at which it actually /is/ interfaced to "foreign" code. When a C function calls another C function, the compiler can use or abuse whatever calling convention it likes at the time. Good compilers can and will do all sorts of re-arrangements to get better code, including inlining code bodies, changing register usage, or using a "branch" instead of a "call" when the called function cannot return. Nothing you can do with compiler flags, separate compilation, or other tricks can change that in a reliable way.
>> the scheduler is broken - a pre-emptive scheduler can assume /nothing/ >> about the code it is pre-empting. > > In principle true -- for a preemptive scheduler. (The OP is most likely > not making a pre-emptive scheduler, however.) But in practice a
The code is just as broken for a co-operative scheduler. As you have said yourself, when a task wants to release the processor it should call the kernel scheduler. I don't know how much you have worked on schedulers, but I get the impression you know what you are doing and could write one perfectly well. You would solve the same sorts of problems in a similar way to the way I or most other scheduler writers would. So I don't really want to sound like I am trying to teach you something you already know about. But I just cannot comprehend why you are defending the OP's bad design, and trying to find ways to jam that square peg into a round hole. You know as well as I do that writing a tight infinite loop, and then trying to find some way to go around the compiler to break out of the loop, is bad design from step 1. Everything else in this thread is of minor relevance (though interesting).
> pre-emptive scheduler must sometimes know about the run-time > architecture of the pre-empted software. For example, some small systems > use statically allocated memory for thread-specific data, such as > additional working "registers" for floating-point libraries. A > pre-emptive kernel has to know about such things in order to save and > restore context. The alternative is to disable preemption while a thread > uses such software-defined shared resources; the choice is a trade-off > between latency and context-switching overhead. >
That is true enough. In such a situation, the OS must know whether these additional "registers" (or for some devices, they are real registers) must be preserved and restored. In "normal" embedded code the same situation turns up with interrupts. For example, when using the embedded multiplier on the msp430 you must disable interrupts or be sure that the interrupt routines don't use the multiplier.
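For instance, a minimal sketch of the multiplier case, assuming IAR intrinsics and a device whose header provides the classic MPY/OP2/RESLO registers:

#include <msp430.h>      /* device header; IAR projects may use <io430.h> */
#include <intrinsics.h>  /* __disable_interrupt() / __enable_interrupt() */

/* 16x16 unsigned multiply (low word) using the hardware multiplier.
   The MPY/OP2/RESLO registers are shared state, so an ISR must not be
   allowed to use the multiplier while we are in the middle of this. */
unsigned int hw_mul16(unsigned int a, unsigned int b)
{
    unsigned int result;

    __disable_interrupt();   /* a real version might save and restore GIE */
    MPY = a;                 /* first operand, unsigned multiply mode */
    OP2 = b;                 /* writing the second operand starts the multiply */
    result = RESLO;          /* low word of the 32-bit product */
    __enable_interrupt();

    return result;
}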
> But that is veering off-topic, I think. >
Only a little :-)
>> If you have a scheduler that for some reason needs a way to get a >> function's return address, then it needs to use a compiler-specific >> feature such as gcc's "__builtin_return_address()" function. If the >> compiler doesn't have such a feature, then you are out of luck. Get a >> different compiler, or write a scheduler that doesn't depend on >> knowing the return address. > > Not very helpful to the OP. But "tough love", perhaps :-) >
That is, IMHO, what the OP needs here. Any advice he gets that help him continue down his original path is false help.
>> Under no circumstances is it correct to tell the compiler you have an >> infinite loop, and then complain because you can't see how to break >> out of it. > > Who was complaining? The OP seems to know perfectly well how to break > out of this loop by changing the PC in the scheduler (when the looping > code is interrupted). >
He is complaining because although he knows he has to change the PC, he doesn't know what new value to use.
>> There is no point in trying to help the OP find some workaround to get >> this system to compile - he must fix the code. > > Eh? The system compiles. And can work, if the compiler's use of a branch > instruction instead of a call instruction is only due to tail-call > optimization, and there is always a return address on the stack. >
A system can work (assuming for a moment that it can be made to work), and yet still be so badly designed and fragile that it is "broken".
>>> ... the application would call a kernel "yield" or "suspend_me" >>> function, the kernel would check if some other thread is ready to >>> run, and if not the kernel would stick in a loop, or schedule a >>> looping "null thread" that is always ready to run. >>> >> >> Exactly. When a task has finished, control must be returned to the >> scheduler, either by calling a "yield" function, or by returning to >> its caller (the kernel). You could, I suppose, end a task in an >> infinite loop and rely on the pre-empter to make sure other tasks get >> processor time. But you certainly wouldn't expect that thread to ever >> leave the infinite loop - that's why it's called an "infinite loop". > > It is not uncommon for kernels to (internally) use an eternal loop > ("lab: jump lab") to wait for the next interrupt that creates some work > to do, as the OP does in Spin. Yes, the loop is syntactically > eternal/infinite, but in the presence of interrupts it can be terminated. >
It is certainly possible to have such an infinite loop in the kernel - but only as an idle function for when the processor is doing nothing. The thread is never expected to continue beyond the loop, or return from it in any way.
>>> As I understand it, the OP's scheduler (most likely running in an >>> interrupt handler) will break out of the "eternal" loop by popping >>> the return address from the stack into the PC, forcing a return from >>> Spin. This is legal MSP430 code, but out of C semantics. >>> >> >> If the OP wants to write such brain-dead code in some sort of non-C, >> that's up to him - but he should not expect to use a C compiler to >> achieve it. > > The OP is combining C semantics -- the loop is eternal -- with interrupt > semantics -- the loop can be broken. This approach is normal for writing > kernels and schedulers, but of course has its pitfalls. >
It is perfectly common and reasonable to have almost-infinite loops. An obvious example is a real spin lock, as implemented in real working schedulers - you have a tight loop that checks for an external event (such as a flag set within an interrupt routine or another task), and exits the loop when the flag is set. But the critical point here is that the loop has an exit clause. If you want to write a loop that will be exited, you write a loop with an exit clause.
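The interrupt-routine side of such a loop might look like this - again only a sketch, in IAR's interrupt syntax, with the vector and flag names as placeholders:

#include <msp430.h>     /* device header; vector names vary by device */
#include <stdint.h>

/* The flag that the waiting loop polls; set here when it is time to continue. */
extern volatile uint8_t resume_flag;

/* Periodic tick interrupt in IAR syntax; TIMERA0_VECTOR is only an
   example - use whatever timer the scheduler tick actually runs on. */
#pragma vector = TIMERA0_VECTOR
__interrupt void tick_isr(void)
{
    /* ... any scheduler bookkeeping ... */
    resume_flag = 1;    /* releases the task spinning on the flag */
}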
> I agree that it would be cleaner to write the non-C-semantics code, such > as Spin, in assembly language. >
I am /not/ saying this sort of code should be written in assembly - I am saying it should not be written at all! It can never be "clean" code. But if it is written in assembly, then at least you are giving the tools no useful information, instead of directly lying to them.
>>> Making Spin test a flag that the scheduler sets is a solution, but a >>> different solution. >> >> That's /almost/ correct. Making Spin test a flag /is/ a solution. >> But it's not a "different solution", because he doesn't have a >> solution at the moment - his scheduler concept /cannot/ be made to >> work the way he thinks. > > Sure it can -- that is, an interrupt handler can break the Spin loop and > resume execution at the point after the Spin call, as long as there is a > return address. >
How is this in any way "better" than having Spin loop until a flag is set, and have the interrupt handler set that flag? Doing it the right way is entirely standard C, is far easier, far safer, far more portable, far more maintainable, and is smaller and faster than any sort of hack you might conceivably get working.
>>> It could be safer to write Spin in assembly language, to prevent the >>> C compiler gaining any false knowledge about its behaviour, such as >>> "does not return" knowledge. >> >> Rubbish. Fake assembly to lie to the compiler is not the answer. > > There is nothing "fake" about this. A kernel/scheduler (especially if > pre-emptive) has to go beyond C semantics. Using assembly language is > the normal way to do this. And using the return address is the normal > way for a kernel to save the PC of a thread, when a kernel routine > suspends the thread. >
Using assembly language where assembly language is needed is absolutely fine - and a pre-emptive scheduler is always going to need some assembly language. But using assembly language to try to force the compiler not to optimise some code is almost always bad design. And the scheduler gets the PC of a thread by looking at the return address for the interrupt routine, not by trying to dig down the stack and guess the return address for the current function in the interrupted thread.
>> The original idea is /wrong/. An infinite loop has no exit and no >> return. > > The loop in Spin can be terminated by a scheduler using PC > manipulations, as in a typical scheduler. Nothing wrong about that, > although it is risky to write it in C, for the reason that we agree on: > the C compiler will only see the C semantics, and may use them in ways > that cause problems for this idea. >
One thing we haven't really discussed here is how the interrupt routine / scheduler knows that the thread is in the Spin function. Is it going to take the real thread PC (from the interrupt routine's return stack) and compare it to the address of the Spin function to determine if the thread is currently at the "lab: jump lab" instruction? If it is there, then it will look deeper in the stack for the previous return address, and return to that point. If not, then the thread is somewhere else and the interrupt routine (or the scheduler) must return there. While such a scheme may theoretically be made to work, it is needlessly complicated, very fragile and highly dependent on getting the code compiled in exactly the right way, and hopelessly restrictive and inflexible. Maybe there is something here that I'm missing - perhaps the OP will come back to us with some more information.
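To make the fragility concrete, a rough sketch of what that comparison would have to look like - every symbol name and stack-layout assumption here is a guess, which is rather the point:

#include <stdint.h>

/* Purely illustrative: the addresses bracketing the Spin loop would have
   to come from the linker or a map file, and the interrupted PC and stack
   pointer would have to be obtained by target-specific assembly. */
extern const uint8_t spin_loop_start[];
extern const uint8_t spin_loop_end[];

void *find_resume_point(uint16_t interrupted_pc, const uint16_t *thread_sp)
{
    if (interrupted_pc >= (uint16_t)(uintptr_t)spin_loop_start &&
        interrupted_pc <  (uint16_t)(uintptr_t)spin_loop_end) {
        /* Thread was parked in Spin: hope that the word at this guessed
           stack offset is the return address pushed by the CALL to Spin. */
        return (void *)(uintptr_t)thread_sp[0];
    }
    /* Thread was somewhere else: resume it at the interrupted PC itself. */
    return (void *)(uintptr_t)interrupted_pc;
}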
David Brown wrote:
> Niklas Holsti wrote: >> [ Quotations edited severely but hopefully without misattribution.] >> >>>>>> brOS wrote: >>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>> >>>>>>>>> Dear all, >>>>>>>>> >>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>> It is extremely important for me to specific function is >>>>>>>>> disassembled >>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>> >>>>>> ... >>>>>>>> >>>>>>> This is why i need it.... >>>>>>> Function I'm calling have looks something like this: >>>>>>> void Spin(void){ >>>>>>> for(;;){} >>>>>>> } >>>>>>> So if it is disassembled with call before entering in pc will be >>>>>>> saved on >>>>>>> stack and it will point to instruction after function spin....So >>>>>>> I want to >>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>> that task >>>>>>> again it will not continue spinning in that forever loop but it >>>>>>> will jump >>>>>>> to next instruction after Spin function..... >>>>>>> branch doesn t push pc to stack so taht s my problem;) >> > > The OP has given us very little information to go on - a lot of what we > both are writing about is speculation (and I am just as likely to guess > incorrectly as you).
Yes. I wrote my answer assuming that the OP knows what he or she is doing but was concerned that the branch instruction might not leave a good return address on the stack.
> However, since an infinite loop can clearly never > be broken without pre-emption, I am assuming he /does/ want pre-emption.
I would not call it pre-emption, but interruption. To me, pre-emption means suspending a task at some arbitrary point in its execution and switching control to another task. In the OP's code, the Spin function seems to be the expected place for suspending and resuming the task, so the task is prepared for it, at that point. This looks like co-operative multi-tasking. My guess about the OP's design was that the Spin function would be used for consuming the rest of a thread's time-slice when the thread has finished its current job, and that the OP would not try to schedule another ready thread to use this (slack) time, perhaps in order to have deterministic time-triggered behaviour, or perhaps to avoid pre-emptions.
> I misinterpreted your post - I thought you meant the kernel could assume > that the code it is scheduling always follows the compiler's conventions.
I agree completely that a pre-emptive kernel cannot assume that. (Well, there may be *some* conventions that always hold, for example relating to the stack pointer. But all conventions known to hold at a "foreign" call are generally not true at arbitrary points.)
> That is only true at the points at which it actually /is/ interfaced to > "foreign" code. When a C function calls another C function, the > compiler can use or abuse whatever calling convention it likes at the > time.
Agreed. For most embedded compilers, though, anything in a separate compilation is considered "foreign". But as you say, there is no guarantee in general.
> The code is just as broken for a co-operative scheduler. As you have > said yourself, when a task wants to release the processor it should call > the kernel scheduler.
In my guess as to what the OP is doing, the call to Spin *is* this call, which would make the OP's kernel a rather special one. On the other hand, perhaps I mis-guessed, and the call to Spin happens *within* the OP's kernel, after the kernel has done the more normal things such as looking for other ready tasks.
> I don't know how much you have worked on schedulers, but I get the > impression you know what you are doing and could write one perfectly > well.
Thanks. I've written a couple of simple, co-operative ones, a while ago, for obsolete processors, and studied a few other, current ones from the point of view of static WCET analysis.
> But I just cannot comprehend why you are defending the OP's bad design, > and trying to find ways to jam that square peg into a round hole. > > You know as well as I do that writing a tight infinite loop, and then > trying to find some way to go around the compiler to break out of the > loop, is bad design from step 1. Everything else in this thread is of > minor relevance (though interesting).
I'm not so ready to call this "bad design" without knowing more about the OP's requirements and design. The code generated for Spin is exactly the kind of tight eternal loop that you often find in a kernel where the kernel has no ready tasks and waits for an interrupt. I haven't tried it, but it seems to me that writing this loop as a conditional, flag-checking one could increase (by a little) the latency for resuming the right task when an interrupt happens, compared to resuming the task directly from the interrupt handler and simply abandoning the tight loop.

It may be bad practice to rely on the C compiler to generate this code, and perhaps I should have said so in my original reply to the OP. It has been said now, good.

David Brown wrote:
>>> Under no circumstances is it correct to tell the compiler you have an >>> infinite loop, and then complain because you can't see how to break >>> out of it.
Niklas Holsti replied:
>> Who was complaining? The OP seems to know perfectly well how to break >> out of this loop by changing the PC in the scheduler (when the looping >> code is interrupted).
David Brown replied:
> He is complaining because although he knows he has to change the PC, he > doesn't know what new value to use.
Because the OP thought that a branch instruction would not leave a return address on the stack. But if the branch instruction implements a tail call, it does leave a return address (although for an outer call).
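For reference, this is the shape of code where that happens - a minimal sketch with invented function names:

void do_work(void);
void Spin(void);

void task_body(void)
{
    do_work();
    /* Tail call: nothing remains to be done in task_body after Spin(),
       so the compiler may emit "br #Spin" instead of "call #Spin",
       leaving task_body's own (outer) return address on the stack. */
    Spin();
}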
> How is this in any way "better" than having Spin loop until a flag is > set, and have the interrupt handler set that flag?
See my comment on latency, above. But of course this is again a guess as to why the OP is doing it this way.
> Using assembly language where assembly language is needed is absolutely > fine - and a pre-emptive scheduler is always going to need some assembly > language. But using assembly language to try to force the compiler not > to optimise some code is almost always bad design.
Writing a function in assembly language (and not, of course, as "in-line assembly code" in a C file) is a pretty sure way of making the C compiler treat it as a "foreign" function and so ensure that calls use the standard conventions, including pushing a return address.
> And the scheduler gets the PC of a thread by looking at the return > address for the interrupt routine,
Or the return address of the call from the thread to the kernel function, which is the case for Spin (I guess).
> One thing we haven't really discussed here is how the interrupt routine > / scheduler knows that the thread is in the Spin function. Is it going > to take the real thread PC (from the interrupt routine's return stack) > and compare it to the address of the Spin function to determine if the > thread is current at the "lab: jump lab" instruction? If it is there, > then it will look deeper in the stack for the previous return address, > and return to that point.
That is (also) my guess of what the OP is trying to do.
> If not, then the thread is somewhere else and > the interrupt routine (or the scheduler) must return there.
Maybe not. In my guess of the OP's design, if the thread is not in Spin when the interrupt happens, the thread has exceeded its time-slice. I don't of course know what the OP intends the kernel/scheduler to do, in that case; perhaps log a fatal error and reboot. Another choice is to set an error flag and let the thread continue until the next tick, when it is checked again.
> Maybe there is something here that I'm missing - perhaps the OP will > come back to us with some more information.
That would be good.
On Nov 23, 3:46 pm, Rob Gaddi <rga...@technologyhighland.com> wrote:

> B) I think the world would be a generally happier place if more > processors had a dedicated brunch instruction. I figure that properly > implemented it ought to take a good hour and a half to return, and then > come back with the stack smelling of coffee and bacon.
*PROPERLY* implemented it should divert to the nearest pub and not return until the keg is dry.
Niklas Holsti wrote:
> David Brown wrote: >> Niklas Holsti wrote: >>> [ Quotations edited severely but hopefully without misattribution.] >>> >>>>>>> brOS wrote: >>>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Dear all, >>>>>>>>>> >>>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>>> It is extremely important for me to specific function is >>>>>>>>>> disassembled >>>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>>> >>>>>>> ... >>>>>>>>> >>>>>>>> This is why i need it.... >>>>>>>> Function I'm calling have looks something like this: >>>>>>>> void Spin(void){ >>>>>>>> for(;;){} >>>>>>>> } >>>>>>>> So if it is disassembled with call before entering in pc will be >>>>>>>> saved on >>>>>>>> stack and it will point to instruction after function spin....So >>>>>>>> I want to >>>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>>> that task >>>>>>>> again it will not continue spinning in that forever loop but it >>>>>>>> will jump >>>>>>>> to next instruction after Spin function..... >>>>>>>> branch doesn t push pc to stack so taht s my problem;) >>> >> >> The OP has given us very little information to go on - a lot of what >> we both are writing about is speculation (and I am just as likely to >> guess incorrectly as you). > > Yes. > > I wrote my answer assuming that the OP knows what he or she is doing but > was concerned that the branch instruction might not leave a good return > address on the stack. >
I don't think "the OP knows what he or she is doing" is a fair assumption, based on the posted code for Spin() !
>> However, since an infinite loop can clearly never be broken without >> pre-emption, I am assuming he /does/ want pre-emption. > > I would not call it pre-emption, but interruption. To me, pre-emption > means suspending a task at some arbitrary point in its execution and > switching control to another task. In the OP's code, the Spin function > seems to be the expected place for suspending and resuming the task, so > the task is prepared for it, at that point. This looks like co-operative > multi-tasking. >
It's possible (or maybe even likely) that the OP is /trying/ to implement a co-operative scheduler. But it doesn't actually co-operate - an eternal loop is not co-operative, even if you cheat and break out using interrupts. Interrupts are inherently asynchronous - if the thread can be suspended by an interrupt function, that is pre-emptive multitasking.
> My guess about the OP's design was that the Spin function would be used > for consuming the rest of a thread's time-slice when the thread has > finished its current job, and that the OP would not try to schedule > another ready thread to use this (slack) time, perhaps in order to have > deterministic time-triggered behaviour, or perhaps to avoid pre-emptions. >
That could well be the intention. But spinning like that is a silly idea, and even if he wants to do what you suggest here, the implementation is totally wrong. The interrupt should set a flag, and the spin lock should block waiting for the flag.
>> I misinterpreted your post - I thought you meant the kernel could >> assume that the code it is scheduling always follows the compiler's >> conventions. > > I agree completely that a pre-emptive kernel cannot assume that. (Well, > there may be *some* conventions that always hold, for example relating > to the stack pointer. But all conventions known to hold at a "foreign" > call are generally not true at arbitrary points.) > >> That is only true at the points at which it actually /is/ interfaced >> to "foreign" code. When a C function calls another C function, the >> compiler can use or abuse whatever calling convention it likes at the >> time. > > Agreed. For most embedded compilers, though, anything in a separate > compilation is considered "foreign". But as you say, there is no > guarantee in general. >
These days, full program optimisation is not uncommon. Even gcc (despite its critics' opinions) can do reasonable full program optimisation by compiling all the C modules in one shot.
>> The code is just as broken for a co-operative scheduler. As you have >> said yourself, when a task wants to release the processor it should >> call the kernel scheduler. > > In my guess as to what the OP is doing, the call to Spin *is* this call, > which would make the OP's kernel a rather special one. On the other > hand, perhaps I mis-guessed, and the call to Spin happens *within* the > OP's kernel, after the kernel has done the more normal things such as > looking for other ready tasks. > >> I don't know how much you have worked on schedulers, but I get the >> impression you know what you are doing and could write one perfectly >> well. > > Thanks. I've written a couple of simple, co-operative ones, a while ago, > for obsolete processors, and studied a few other, current ones from the > point of view of static WCET analysis. >
I think most of our apparent disagreements have the basis in different guesses as to what we think the OP is trying to do. Hopefully the OP is still reading the thread, and will take some inspiration from our discussion!
>> But I just cannot comprehend why you are defending the OP's bad >> design, and trying to find ways to jam that square peg into a round hole. >> >> You know as well as I do that writing a tight infinite loop, and then >> trying to find some way to go around the compiler to break out of the >> loop, is bad design from step 1. Everything else in this thread is of >> minor relevance (though interesting). > > I'm not so ready to call this "bad design" without knowing more about > the OP's requirements and design. The code generated for Spin is exactly > the kind of tight eternal loop that you often find in a kernel where the > kernel has no ready tasks and waits for an interrupt. I haven't tried > it, but it seems to me that writing this loop as a conditional, > flag-checking one could increase (by a little) the latency for resuming > the right task when an interrupt happens, compared to resuming the task > directly from the interrupt handler and simply abandoning the tight loop. >
Nah, the loop overhead to continually read a flag would be a few cycles at most. The interrupt function overhead to figure out return addresses from the stack will be much, much worse.

When I see someone write one thing, and mean another, I see a mistake. When the author knows what he has written and wants to find some way to work around this difference rather than correcting the code, I see a bad design. Maybe I'm just less tolerant than you.
> It may be bad practice to rely on the C compiler to generate this code, > and perhaps I should have said so in my original reply to the OP. It has > been said now, good. > > David Brown wrote: >>>> Under no circumstances is it correct to tell the compiler you have >>>> an infinite loop, and then complain because you can't see how to >>>> break out of it. > > Niklas Holsti replied: >>> Who was complaining? The OP seems to know perfectly well how to break >>> out of this loop by changing the PC in the scheduler (when the >>> looping code is interrupted). > > David Brown replied: >> He is complaining because although he knows he has to change the PC, >> he doesn't know what new value to use. > > Because the OP thought that a branch instruction would not leave a > return address on the stack. But if the branch instruction implements a > tail call, it does leave a return address (although for an outer call). > >> How is this in any way "better" than having Spin loop until a flag is >> set, and have the interrupt handler set that flag? > > See my comment on latency, above. But of course this is again a guess as > to why the OP is doing it this way. > >> Using assembly language where assembly language is needed is >> absolutely fine - and a pre-emptive scheduler is always going to need >> some assembly language. But using assembly language to try to force >> the compiler not to optimise some code is almost always bad design. > > Writing a function in assembly language (and not, of course, as "in-line > assembly code" in a C file) is a pretty sure way of making the C > compiler treat is as a "foreign" function and so ensure that calls use > the standard conventions, including pushing a return address. >
That is true, but my point is that you should not use assembly like this just to "get around" the compiler - not without very good reasons. I've often seen people use assembly code to try to force the compiler to act in some way, when they could have done much better while staying within C.
>> And the scheduler gets the PC of a thread by looking at the return >> address for the interrupt routine, > > Or the return address of the call from the thread to the kernel > function, which is the case for Spin (I guess). > >> One thing we haven't really discussed here is how the interrupt >> routine / scheduler knows that the thread is in the Spin function. Is >> it going to take the real thread PC (from the interrupt routine's >> return stack) and compare it to the address of the Spin function to >> determine if the thread is current at the "lab: jump lab" >> instruction? If it is there, then it will look deeper in the stack >> for the previous return address, and return to that point. > > That is (also) my guess of what the OP is trying to do. > >> If not, then the thread is somewhere else and the interrupt routine >> (or the scheduler) must return there. > > Maybe not. In my guess of the OP's design, if the thread is not in Spin > when the interrupt happens, the thread has exceeded its time-slice. I > don't of course know what the OP intends the kernel/scheduler to do, in > that case; perhaps log a fatal error and reboot. Another choice is to > set an error flag and let the thread continue until the next tick, when > it is checked again. >
Your guesses as to the OP's ideas make a certain sense - perhaps he is trying to implement a sort of fixed time-slice scheduler. The implementation of Spin() is still wrong (you'll never convince me otherwise!), but that might bring us a little closer to helping him get a working implementation.
>> Maybe there is something here that I'm missing - perhaps the OP will >> come back to us with some more information. > > That would be good. >
David Brown wrote:
> Niklas Holsti wrote: >> David Brown wrote: >>> Niklas Holsti wrote: >>>> [ Quotations edited severely but hopefully without misattribution.] >>>> >>>>>>>> brOS wrote: >>>>>>>>>> On Mon, 23 Nov 2009 14:19:14 -0600 >>>>>>>>>> "brOS" <bogdanrosandic@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Dear all, >>>>>>>>>>> >>>>>>>>>>> Does anybody knows how to force compiler to use call instruction >>>>>>>>>>> instead of br(branch)for disassembling function call? >>>>>>>>>>> It is extremely important for me to specific function is >>>>>>>>>>> disassembled >>>>>>>>>>> using call instead of brunch, as compiler always does. >>>>>>>>>>> >>>>>>>> ... >>>>>>>>>> >>>>>>>>> This is why i need it.... >>>>>>>>> Function I'm calling have looks something like this: >>>>>>>>> void Spin(void){ >>>>>>>>> for(;;){} >>>>>>>>> } >>>>>>>>> So if it is disassembled with call before entering in pc will >>>>>>>>> be saved on >>>>>>>>> stack and it will point to instruction after function >>>>>>>>> spin....So I want to >>>>>>>>> use that pc and to save context so when my scheduler schedule >>>>>>>>> that task >>>>>>>>> again it will not continue spinning in that forever loop but it >>>>>>>>> will jump >>>>>>>>> to next instruction after Spin function..... >>>>>>>>> branch doesn t push pc to stack so taht s my problem;) >>>> >>> >>> The OP has given us very little information to go on - a lot of what >>> we both are writing about is speculation (and I am just as likely to >>> guess incorrectly as you). >> >> Yes. >> >> I wrote my answer assuming that the OP knows what he or she is doing >> but was concerned that the branch instruction might not leave a good >> return address on the stack. >> > > I don't think "the OP knows what he or she is doing" is a fair > assumption, based on the posted code for Spin() ! > >>> However, since an infinite loop can clearly never be broken without >>> pre-emption, I am assuming he /does/ want pre-emption. >> >> I would not call it pre-emption, but interruption. To me, pre-emption >> means suspending a task at some arbitrary point in its execution and >> switching control to another task. In the OP's code, the Spin function >> seems to be the expected place for suspending and resuming the task, >> so the task is prepared for it, at that point. This looks like >> co-operative multi-tasking. >> > > It's possible (or maybe even likely) that the OP is /trying/ to > implement a co-operative scheduler. But it doesn't actually co-operate > - an eternal loop is not co-operative, even if it you cheat and break > out using interrupts. Interrupts are inherently asynchronous - if the > thread can be suspended by an interrupt function, that is pre-emptive > multitasking.
Well, what constitutes "co-operation" may be a matter of precise definition (in real life, sometimes of litigation :-). In my guess of the OP's kernel/scheduler design, the suspension is designed to happen only when the thread is looping in the Spin function. By calling Spin the thread shows that it is ready to be suspended, so it is co-operating in my view. (As discussed earlier, we don't know what happens if the scheduler interrupt finds the thread is *not* in Spin.)
>> Agreed. For most embedded compilers, though, anything in a separate >> compilation is considered "foreign". But as you say, there is no >> guarantee in general. >> > > These days, full program optimisation is not uncommon. Even gcc > (despite its critics' opinions) can do reasonable full program > optimisation by compiling all the C modules in one shot.
Sure, but in that case it would not be "separate compilation". Interesting question, though: Is there a standard way in a C environment to ensure that the standard calling sequence is used for an extern function, with no C-calling-C optimizations?
>> I'm not so ready to call this "bad design" without knowing more about >> the OP's requirements and design. The code generated for Spin is >> exactly the kind of tight eternal loop that you often find in a kernel >> where the kernel has no ready tasks and waits for an interrupt. I >> haven't tried it, but it seems to me that writing this loop as a >> conditional, flag-checking one could increase (by a little) the >> latency for resuming the right task when an interrupt happens, >> compared to resuming the task directly from the interrupt handler and >> simply abandoning the tight loop. >> > > Nah, the loop overhead to continually read a flag would be a few cycles > at most. The interrupt function overhead to figure out return addresses > from the stack will be much, much worse.
Let's consider what the kernel has to do, in my guess of the OP's design, considering the two cases of (A) an unconditional "eternal" loop and (B) a flag-checking loop.

The kernel knows which thread is running. When the thread finishes its job in this time-slice, it calls Spin, expecting to be resumed at the next instruction after the Spin call, say instruction R. The Spin function loops, eating up the rest of the time-slice.

The tick interrupt comes in. The tick interrupt handler saves the context of the interrupted thread. By comparing its PC to the address of the Spin loop, it can check that the thread has not overrun its time-slice. At this point:

- For (A) the handler gets the return address of Spin by a POP and stores this return address as the resumption point for the thread to be suspended.
- For (B) the handler stores the interrupted PC (in the flag-checking loop) as the resumption point.

The interrupt handler (scheduler) finds the thread to run in the next time-slice. In case (B) it then sets the (thread-specific) flag on which Spin is waiting. In case (A) it does not need to set any flag.

The handler restores the context of the new thread. As the last step in this, it pushes the resumption address and the restored status register and does return-from-interrupt (RETI). In case (A), the thread is resumed immediately at the desired instruction, the instruction R that follows the Spin call. In case (B), the thread is resumed in the middle of the flag-checking loop. It still has to read the flag, branch out of the loop, and execute a return instruction (effectively a POP from the stack), before instruction R is reached.

In summary, case (A) and case (B) both have to POP the stack to get to instruction R, but case (B) also has to set a flag and check a flag. It is a close call, but you might save some cycles in case (A). Moreover, in case (B) the flag has to be thread-specific, so it has to be passed to Spin with a parameter, consuming more cycles.
An extension and a correction to my last (deleting the context):

Niklas Holsti wrote:

> In summary, case (A) and case (B) both have to POP the stack to get to
> instruction R, but case (B) also has to set a flag and check a flag.
Plus the flag has to be cleared at some point (being careful to avoid race conditions).
> Moreover, in case (B) the flag has to be thread-specific, so it has to
> be passed to Spin with a parameter, consuming more cycles.
... except if the flag is in a register that is cleared in Spin before
the loop, but set by the scheduler in the context that is restored
(except for this flag) when the Spin loop is resumed.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
And a second correction to myself:

Niklas Holsti wrote:

> Moreover, in case (B) the flag has to be thread-specific, so it has to
> be passed to Spin with a parameter, consuming more cycles.
In fact the flag can be global, not thread-specific, since only one
thread is resumed at a time. But since this thread *is* resumed, it is
certain to find the flag set, which goes to show that the flag is
redundant in this design, and the flagless unconditional loop in
case (A) makes more sense.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
Niklas Holsti wrote:
> David Brown wrote:
>> Niklas Holsti wrote:
>>> ...
>>>
>>> Agreed. For most embedded compilers, though, anything in a separate
>>> compilation is considered "foreign". But as you say, there is no
>>> guarantee in general.
>>
>> These days, full program optimisation is not uncommon.
>> Even gcc (despite its critics' opinions) can do reasonable full
>> program optimisation by compiling all the C modules in one shot.
>
> Sure, but in that case it would not be "separate compilation".
Fair enough. What about gcc 4.5 with -flto ? Then you can compile C modules separately into object files, but the object files hold a copy of the internal trees as well as generated object code. When you link these object files, the trees are used for link-time optimisation, including inlining across modules. You lose all clarity in the definitions of "compile", "link", and "separate compilation". But that is a digression, especially since the msp430 gcc port is not (yet) updated to gcc 4.5, which is itself not yet released.
> Interesting question, though: Is there a standard way in a C environment
> to ensure that the standard calling sequence is used for an extern
> function, with no C-calling-C optimizations?
I think the only way is by being sure that the compiler can't access the code for a function declared as "extern". It should not be hard to do, but you may have to do it explicitly. For example, if you use a compiler's IDE and project manager, you might have to go out of your way to force true separate compilation.
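As a sketch of what forcing true separate compilation can look like in
practice (the file names and layout here are illustrative assumptions,
not from the thread), keep the function's body in its own translation
unit and let callers see only the extern declaration:

/* spin.h -- callers see only the declaration */
extern void Spin(void);

/* spin.c -- compiled on its own, so callers never see the body */
#include "spin.h"

void Spin(void)
{
    /* ... body kept out of sight of the callers ... */
}

/* task.c -- compiled separately; without whole-program or link-time
   optimisation the compiler cannot use knowledge of Spin's body here
   (no inlining, no custom calling convention), although as noted
   earlier in the thread a tail call may still be emitted as a branch */
#include "spin.h"

void Task(void)
{
    Spin();
}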
>>> I'm not so ready to call this "bad design" without knowing more about
>>> the OP's requirements and design. ...
>>
>> Nah, the loop overhead to continually read a flag would be a few
>> cycles at most. The interrupt function overhead to figure out return
>> addresses from the stack will be much, much worse.
>
> Let's consider what the kernel has to do, in my guess of the OP's
> design, considering the two cases of (A) an unconditional "eternal"
> loop and (B) a flag-checking loop.
>
> The kernel knows which thread is running.
>
> When the thread finishes its job in this time-slice, it calls Spin,
> expecting to be resumed at the next instruction after the Spin call,
> say instruction R.
>
> The Spin function loops, eating up the rest of the time-slice.
>
> The tick interrupt comes in.
>
> The tick interrupt handler saves the context of the interrupted thread.
> By comparing its PC to the address of the Spin loop, it can check that
> the thread has not overrun its time-slice. At this point:
Note that a sensible Spin function would tell the kernel that it is finished and entering the spin loop, rather than leaving the interrupt handler to figure it out in this fragile way.
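A minimal sketch of what "telling the kernel" could look like, assuming
a kernel-owned flag (the name thread_done is invented for illustration):

/* Hypothetical flag: set by the thread, read (and cleared) by the
   tick interrupt handler, so the handler need not inspect the PC. */
volatile unsigned char thread_done;

void Spin(void)
{
    thread_done = 1;            /* announce: work for this slice is finished */
    for (;;)
        ;                       /* wait for the scheduler to take over */
}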
> - For (A) the handler gets the return address of Spin
>   by a POP and stores this return address as the resumption
>   point for the thread to be suspended.
Assuming, of course, that you've figured out a way to do that safely and reliably....
> - For (B) the handler stores the interrupted PC (in
>   the flag-checking loop) as the resumption point.
This bit will typically require some assembly, compiler-specific features, or some knowledge of the way the compiler generates interrupt routines. But that's unavoidable when you have an interrupt-based scheduler.
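As one concrete, purely illustrative example: with the IAR MSP430
compiler the tick handler itself can at least be declared in C; the
vector name depends on the device header, and the actual context
save/restore and return-address handling would still need intrinsics
or assembly:

#include <msp430.h>             /* device header; exact name depends on the toolchain */

/* Hypothetical skeleton of a tick handler in IAR C syntax.  The
   interesting parts -- saving the thread context and fishing the
   resumption address off the stack -- are not expressible in
   portable C and are omitted here. */
#pragma vector = TIMERA0_VECTOR
__interrupt void TickHandler(void)
{
    /* save context, choose the next thread, restore its context ... */
}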
> The interrupt handler (scheduler) finds the thread to run in the next
> time-slice. In case (B) it then sets the (thread-specific) flag on which
> Spin is waiting. In case (A) it does not need to set any flag.
Fair enough, although setting a flag is hardly a difficult job, and can be done within standard C.
> The handler restores the context of the new thread. As the last step in
> this, it pushes the resumption address and the restored status register
> and does return-from-interrupt (RETI).
OK.
> In case (A), the thread is resumed immediately at the desired
> instruction, the instruction R that follows the Spin call.
Again assuming that it is possible to figure out the address of R in a reliable way...
> In case (B), the thread is resumed in the middle of the flag-checking
> loop. It still has to read the flag, branch out of the loop, and execute
> a return instruction (effectively a POP from the stack), before
> instruction R is reached.
Yes, you can expect it to take about 3 or 4 instructions before getting to R. That would still be a lot less time than you spend messing around getting the address of R in case (A), so case (B) wins here in time.
> In summary, case (A) and case (B) both have to POP the stack to get to
> instruction R, but case (B) also has to set a flag and check a flag. It
> is a close call, but you might save some cycles in case (A). Moreover,
> in case (B) the flag has to be thread-specific, so it has to be passed
> to Spin with a parameter, consuming more cycles.
Remember that all ideas about how case (A) could feasibly be implemented
are based on hobbling the compiler. Write the code correctly (case B),
and you can let the optimiser do its job - that will overwhelm any
conceivable time advantage case A might have had. Among other things,
Spin() could be inlined into its calling function, removing most of the
overhead.

Even making the great leap of faith that there is a reliable way to get
the desired return address, and then making a second leap of faith that
case A is faster, the concept is /still/ wrong. There is no way that
shaving a few cycles off the latency could justify using this horrible
hack. If those cycles matter, you need a new design.
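For instance, written as ordinary C the wait can be a small static
function that the optimiser is free to inline into the thread body
(all names here are illustrative, not from the thread):

/* Flag set by the scheduler's tick handler when this thread may run on. */
static volatile unsigned char resume_flag;

static void WaitForResume(void)
{
    resume_flag = 0;
    while (resume_flag == 0)
        ;                       /* wait for the next time-slice */
}

void ThreadBody(void)
{
    for (;;) {
        /* do this time-slice's work ... */
        WaitForResume();        /* a good compiler may simply inline this loop */
    }
}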