Getting the size of a C function

Started by john January 22, 2010
Mark Borgerson wrote:
) In article <87tyucpp7x.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
) says...
)> Mark Borgerson <mborgerson@comcast.net> writes:
)> 
)> > In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
)> > says...
)> >> Mark Borgerson <mborgerson@comcast.net> writes:
)> >> You seem to be assuming that the compiler emits machine code that
)> >> is in the same order as the corresponding C code, i.e. that the
)> >> call to Markend() will occur at the end of MoveMe().  This is not
)> >> a good assumption.
)> >
)> > This would certainly be a dangerous technique on a processor
)> > with multi-threading and possible out-of-order execution.
)> > I think it will work OK on the MSP430 that is the CPU where
)> > I am working on a flash-burning routine.
)> 
)> Threading and out-of-order execution has little if anything to do
)> with it.  The issue is the order of the code emitted by compiler,
)> not the order of the code's execution.
)> 
) But wouldn't an optimizing compiler generating code for a
) complex processor be more likely to optimize in
) a way that changed the order of operations?  I think
) that might apply particularly to a call to a function
) that returns no result to be used in a specific
) place inside the outer function.

More specifically, it could generate code like this:
(example in pseudocode)

(begin MoveMe)
TEST var
SKIP NEXT on zero
JUMP Markend
... ; the rest of the code
RETURN
(end MoveMe)


SaSW, Willem
-- 
Disclaimer: I am in no way responsible for any of the statements
            made in the above text. For all I know I might be
            drugged or something..
            No I'm not paranoid. You all think I'm paranoid, don't you !
#EOT
On Sat, 23 Jan 2010 13:17:45 -0800, Mark Borgerson wrote:

>In article <87tyucpp7x.fsf@blp.benpfaff.org>, blp@cs.stanford.edu
>says...
<<SNIP quoted exchange>>
>> Threading and out-of-order execution has little if anything to do
>> with it.  The issue is the order of the code emitted by compiler,
>> not the order of the code's execution.
>>
>But wouldn't an optimizing compiler generating code for a
>complex processor be more likely to optimize in
>a way that changed the order of operations?  I think
>that might apply particularly to a call to a function
>that returns no result to be used in a specific
>place inside the outer function.
Ben quite correctly brought you up short on the right point. Your example was, just to refresh ourselves:
>: int MoveMe ( ...., bool findend ) {
>:    if ( !findend ) {
>:       // do normal function stuff
>:    } else
>:       Markend();
>: }
Let's divert from this for a moment and take the case of a for-loop in c. It looks like:
>: for ( init-block; condition; iterate-block )
>:    body-block;
A compiler will often translate this into this form:
>: init-block;
>: goto A;
>: B: body-block;
>: C: iterate-block;
>: A: if ( condition ) goto B;
>: D:
(The reason for the C label is to support the continue-
statement and the reason for the D label is to support a
break-statement, of course.)

The straight interpretation would have been more like this:
>: init-block;
>: A: if ( !condition ) goto D;
>: B: body-block;
>: C: iterate-block;
>: goto A;
>: D:
But note that the execution of the for-loop's main body,
presumed by the compiler to have "many iterations" as a
reasonable guess, includes execution for the "goto A"
statement in each and every iteration.  But so is, in effect,
the conditional test, too.  In other words, it takes longer
to execute the body, even if that only means the execution of
one jump instruction.  It's more efficient to redesign the
model used by the compiler to the first example I gave,
merely because the c compiler takes the position that the
added one-time execution of the first "goto A" will be the
lower cost approach (which it almost always will be.)

Now let's assume that the compiler takes the position that
the first case of an if-statement section is the more
frequently travelled one.  In other words, when the
conditional case is executed, it will more often be "true"
than "false."  The model used might very well then be to
convert:
>: if ( condition )
>:    s1-block;
>: else
>:    s2-block;
into:
>: if ( !condition ) goto A;
>: s2-block;
>: goto B;
>: A: s1-block;
>: B:
This provides s1-block execution with one less jump and
therefore lets it execute slightly faster with the idea that
it is the preferred path.

So let's revisit your example again in this light:
>: int MoveMe ( ...., bool findend ) {
>:    if ( !findend ) {
>:       // do normal function stuff
>:    } else
>:       Markend();
>: }
This _may_ be taken by a c compiler to be:
>: int MoveMe ( ...., bool findend ) {
>:    if ( findend ) goto A;
>:    Markend();
>:    goto B;
>: A: // do normal function stuff
>: B:
>: }
Leaving your function call to Markend not exactly where you'd
have liked to see it occur.

An old book you can pick up talking about a method used to
_explicitly_ inform the compiler about statistics of branch
likelihoods is the Ph.D. thesis by John Ellis:

http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2

Worth a read, some snowy day.

Jon
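For reference, GCC-family compilers (including the GCC ports for the MSP430) also let you pass this kind of branch-likelihood hint directly in the source with __builtin_expect. A minimal sketch, assuming GCC; the LIKELY/UNLIKELY macros and the placeholder function body are illustrative only, not anything from the original posts:

/* Sketch only: assumes a GCC-compatible compiler.
   __builtin_expect(expr, expected) tells the optimizer which value of
   expr is the common case, so it can lay the preferred path out as the
   fall-through and push the rare path (here, the Markend() call)
   somewhere else. */
#include <stdbool.h>

#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)

extern void Markend(void);

int MoveMe(int arg, bool findend)
{
    if (UNLIKELY(findend)) {
        Markend();           /* hinted as the rarely taken path */
        return 0;
    }
    /* do normal function stuff */
    return arg + 1;          /* placeholder body */
}

Even with the hint, nothing obliges the compiler to emit the Markend() call at the physical end of MoveMe(), so this influences layout but still does not make the end-marker trick reliable.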
In article <slrnhlmqth.1evp.willem@turtle.stack.nl>, willem@stack.nl 
says...
> Mark Borgerson wrote:
<<SNIP quoted exchange>>
> More specifically, it could generate code like this:
> (example in pseudocode)
>
> (begin MoveMe)
> TEST var
> SKIP NEXT on zero
> JUMP Markend
> ... ; the rest of the code
> RETURN
> (end MoveMe)
>
I've actually seen constructs like that intentionally coded in
assembly language, since it saves the address push and pop you
would need in a branch to a subroutine.

I haven't seen it recently in compiler output, but that may be
because I limit optimization to make debugging easier.  Since I
do limited numbers of systems in a niche market, I save money by
spending a few extra dollars on more memory and CPU cycles if it
saves me a few hours of debugging time.

In any of these instances, I would certainly review the assembly
code to make sure the compiler was doing what I intended in the
order I wanted.  Maybe programmers in comp.lang.c don't do that
as often as programmers in comp.arch.embedded.  ;-)

Mark Borgerson
In article <fvqml5lfdi277l5ri725ia2l4pv8hhseoo@4ax.com>, 
jonk@infinitefactors.org says...
> On Sat, 23 Jan 2010 13:17:45 -0800, Mark Borgerson wrote:
<<SNIP quoted exchange>>
> But note that the execution of the for-loop's main body,
> presumed by the compiler to have "many iterations" as a
> reasonable guess, includes execution for the "goto A"
> statement in each and every iteration.  But so is, in effect,
> the conditional test, too.  In other words, it takes longer
> to execute the body, even if that only means the execution of
> one jump instruction.  It's more efficient to redesign the
> model used by the compiler to the first example I gave,
> merely because the c compiler takes the position that the
> added one-time execution of the first "goto A" will be the
> lower cost approach (which it almost always will be.)
I've also run across main processing loops such as

void MainLoop(void) {
   while (1) {
      // get user input
      // execute commands
   }
   MarkEnd();
}

where MarkEnd doesn't appear in the generated machine
code, because the compiler, even at the lowest optimization
setting, recognizes that the code after the loop
will never get executed.
>
> Now let's assume that the compiler takes the position that
> the first case of an if-statement section is the more
> frequently travelled one.
<<SNIP if/else transformation example>>
> This _may_ be taken by a c compiler to be:
>
> >: int MoveMe ( ...., bool findend ) {
> >:    if ( findend ) goto A;
> >:    Markend();
> >:    goto B;
> >: A: // do normal function stuff
> >: B:
> >: }
>
> Leaving your function call to Markend not exactly where you'd
> have liked to see it occur.
That could certainly occur.  I would be interested in the logic
that could come to the conclusion that one or the other
of the branches would be more likely to occur.  I guess the
compiler could check all the calls to MoveMe and compare the
number of times the findend parameter was true and false.  However that
might be pretty difficult if a variable was used.

Still, a good reason, as I've said in other posts, to look
at the resulting assembly language.  I did it for one
MSP430 compiler, and it worked the way I wanted.  YMMV.

I wonder how many compilers would make that kind of optimization
and under which optimization settings.
>
> An old book you can pick up talking about a method used to
> _explicitly_ inform the compiler about statistics of branch
> likelihoods is the Ph.D. thesis by John Ellis:
>
> http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2
>
> Worth a read, some snowy day.
Those are pretty rare in Corvallis. Only one or two so far this winter. Now, rainy days----those I get in plentitude!
Mark Borgerson
On Sat, 23 Jan 2010 15:18:25 -0800, Mark Borgerson
<mborgerson@comcast.net> wrote:

>In article <fvqml5lfdi277l5ri725ia2l4pv8hhseoo@4ax.com>,
>jonk@infinitefactors.org says...
<<SNIP quoted discussion>>
>I've also run across main processing loops such as
>
>void MainLoop(void) {
>   while (1) {
>      // get user input
>      // execute commands
>   }
>   MarkEnd();
>}
>
>where MarkEnd doesn't appear in the generated machine
>code, because the compiler, even at the lowest optimization
>setting, recognizes that the code after the loop
>will never get executed.
Yes, of course. That is another possibility. The intended function may be essentially the "main loop" of the code and as such never returns. However, whether or not MarkEnd() were optimized out, it wouldn't ever get executed anyway. So you'd never get the address stuffed into something useful... and so it doesn't even matter were it that the compiler kept the function call. So it makes a good case against your approach for an entirely different reason than optimization itself.
<<SNIP discussion of if/else code placement>>
>That could certainly occur.  I would be interested in the logic
>that could come to the conclusion that one or the other
>of the branches would be more likely to occur.
I wasn't suggesting that the optimizer includes a feature where
it "tries" to adduce the likelihood.  I was suggesting the idea
that the compiler writer makes the 'a priori' decision that it is.

Think of it this way.  Ignorant of application specific
information, the compiler writer has two options to take when
considering the if..else case's approach.  Regardless of which
way the compiler author chooses, one of the two blocks will get
a run-time preference.  So, does the compiler author _choose_ to
prefer the if-case or the else-case?  Without knowledge, which
way would _you_ decide to weigh in on?  Either way you go, you
are making a choice.  No escaping that fact.

Now, the Bulldog compiler provides a way for the author of the
code to supply known information _or_ to use run-time profiling
to provide that information, automatically.  But I'm not talking
about this case.  That's for another discussion.  I only pointed
that out for leisure reading.  Not as a point in this specific
discussion.
>I guess the
>compiler could check all the calls to MoveMe and compare the
>number of times the findend parameter was true and false.  However that
>might be pretty difficult if a variable was used.
Run-time profiling could provide that information. But that wasn't anything I wanted you worrying over in this talk. It distracts from the central point -- which is that a compiler writer, sans application knowledge and sans anything in the compiler or compiler syntax provided to the application coder to better inform him/her about which way to go, must choose. Either prefer the if-case or prefer the else-case. There is no other option. So which way would you go?
>Still, a good reason, as I've said in other posts, to look
>at the resulting assembly language.  I did it for one
>MSP430 compiler, and it worked the way I wanted.  YMMV.
Indeed. I think the point here is that one is left entirely to the vagaries of the compiler author. And on that point, they may decide to go either way. There is NOTHING in the c language itself to help them decide which is better.
>I wonder how many compilers would make that kind of optimization
>and under which optimization settings.
I think this question is moot. The point I was making remains even _without_ optimizations that may help inform the compiler about frequency of execution. So there is no need to argue this point.
>> An old book you can pick up talking about a method used to
>> _explicitly_ inform the compiler about statistics of branch
>> likelihoods is the Ph.D. thesis by John Ellis:
>>
>> http://mitpress.mit.edu/catalog/item/default.asp?tid=5098&ttype=2
>>
>> Worth a read, some snowy day.
>
>Those are pretty rare in Corvallis.  Only one or two so far this winter.
>Now, rainy days----those I get in plentitude!
>
>Mark Borgerson
Hehe.  I live near Mt. Hood at an elevation of about 1000' ASL.
So I get three feet of snow and ice, from time to time.  I've
had to use my JD 4320 tractor on more than one occasion!  :)

Jon
In article <uq4nl51thfeujhkgk5i7nhtmapjesi6ns8@4ax.com>, 
jonk@infinitefactors.org says...
> On Sat, 23 Jan 2010 15:18:25 -0800, Mark Borgerson
> <mborgerson@comcast.net> wrote:
>
<<SNIP discussion of unpredictable, but legal, compiler behavior>>
<<SNIP MainLoop example>>
>
> Yes, of course.  That is another possibility.  The intended
> function may be essentially the "main loop" of the code and
> as such never returns.  However, whether or not MarkEnd()
> were optimized out, it wouldn't ever get executed anyway.  So
> you'd never get the address stuffed into something useful...
> and so it doesn't even matter were it that the compiler kept
> the function call.  So it makes a good case against your
> approach for an entirely different reason than optimization
> itself.
Well, I would not dream of using this approach on a function
that never returns.  OTOH, a flash-update routine had better
return, or it won't be particularly useful (unless your goal
is to test the write-endurance of the flash) ;-)
<<SNIP>>
Mark Borgerson
On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote:
> > "Grant Edwards" <invalid@invalid.invalid> wrote in message > news:hjdjjj$njp$1@reader1.panix.com... >> On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote: >> >>> in this case, it might actually be better advised to generate >>> the function as a chunk of arch-specific ASM or machine code >>> (ASM is preferable IMO, but requires an assembler...), which >>> could then be located wherever (such as the heap). >> >> IMO, the "right" thing to do is to tell the compiler to put >> the function into a separate section and then have it linked >> so that it's "located" to run in RAM at the proper address but >> stored in ROM. >> >> That way you know the code will work correctly when it's run >> from RAM. Defining approprate symbols in the linker command >> file will allow the program to refer to the start and end of >> the section's address in ROM. > > this is a little closer to the second option, of having a > secondary image file embedded as data...
Yup, it's pretty much exactly that.
>> The OP needs to spend some time studying the manuals for his
>> compiler and linker.
>
> this is, assuming the linker or image format actually supports
> the "separate section" idea...
Every C compiler/toolchain I've used for embedded systems development for the past 25 years supported things like that. If his tools don't support multiple sections, then the first order of business is to find a decent toolchain.
> dunno about ELF,
ELF supports multiple sections, and I've done exactly such
things with ELF-based toolchains (Gnu binutils and GCC) when
working on stuff like bootloaders where the memory map changes
completely part-way through the program as the memory
controller gets configured.
> but PE/COFF would not support this, since it would require
> breaking some of the internal assumptions of the file format
> (for example, that the image is continuous from ImageBase to
> ImageBase+ImageSize, ...).
>
> ELF may have similar restrictions (actually, I think most ELF
> images are position independent anyways,
That depends on the compiler options and linker command file. In my experience, "executable" ELF files on embedded systems (images that are ready to load into RAM and run) are generally not relocatable.
> can't say so much about other file formats though...
The COFF-based toolchains I've used all seem to support
multiple sections, but that may have been due to
vendor-specific extensions.

-- 
Grant
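To make the separate-section suggestion concrete, here is a minimal sketch assuming a GCC/binutils toolchain. The section name .ramfunc and the __ramfunc_* symbols are placeholders that would have to be defined in the linker command file (an example fragment is shown in the comment); they are not names any particular vendor provides:

/* Sketch only: assumes GCC/binutils and a linker script with RAM and
   ROM memory regions.  The script places input section .ramfunc so it
   is stored in ROM but linked to run from RAM, and defines the symbols
   used below, e.g.:

       .ramfunc : {
           __ramfunc_run_start = .;
           *(.ramfunc)
           __ramfunc_run_end = .;
       } > RAM AT > ROM
       __ramfunc_load_start = LOADADDR(.ramfunc);
*/
#include <stddef.h>

extern char __ramfunc_load_start[];  /* where the code is stored (ROM)  */
extern char __ramfunc_run_start[];   /* where it is linked to run (RAM) */
extern char __ramfunc_run_end[];     /* end of the RAM image            */

/* The flash-programming routine is forced into its own section. */
__attribute__((section(".ramfunc"), noinline))
void FlashWriteWord(volatile unsigned short *dst, unsigned short val)
{
    /* device-specific programming sequence would go here */
    *dst = val;
}

/* The "how big is the function?" question disappears: the linker
   reports the size of the whole section, which is what has to move. */
size_t RamCodeSize(void)
{
    return (size_t)(__ramfunc_run_end - __ramfunc_run_start);
}

The size and both addresses come from the linker, which actually controls layout, rather than from guesses about where the compiler chose to place a marker call.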
"Grant Edwards" <invalid@invalid.invalid> wrote in message 
news:hjgfa1$jcd$1@reader1.panix.com...
> On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote:
>>
>> "Grant Edwards" <invalid@invalid.invalid> wrote in message
>> news:hjdjjj$njp$1@reader1.panix.com...
>>> On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote:
>>>
<snip>
>>
>> this is a little closer to the second option, of having a
>> secondary image file embedded as data...
>
> Yup, it's pretty much exactly that.
>
ok.
>>> The OP needs to spend some time studying the manuals for his
>>> compiler and linker.
>>
>> this is, assuming the linker or image format actually supports
>> the "separate section" idea...
>
> Every C compiler/toolchain I've used for embedded systems
> development for the past 25 years supported things like that.
> If his tools don't support multiple sections, then the first
> order of business is to find a decent toolchain.
>
well, I haven't personally had much experience with embedded systems, so I am not sure here.
>> dunno about ELF,
>
> ELF supports multiple sections, and I've done exactly such
> things with ELF-based toolchains (Gnu binutils and GCC) when
> working on stuff like bootloaders where the memory map changes
> completely part-way through the program as the memory
> controller gets configured.
>
yes, I know it has multiple sections, but AFAIK it is generally assumed that the final image is in a continuous region of memory (with the sections generally packed end-to-end), at least in the cases I have seen. granted, in cases I have seen, ELF has usually been x86 and PIC as well (the default build for Linux).
>> but PE/COFF would not support this, since it would require
>> breaking some of the internal assumptions of the file format
>> (for example, that the image is continuous from ImageBase to
>> ImageBase+ImageSize, ...).
>>
>> ELF may have similar restrictions (actually, I think most ELF
>> images are position independent anyways,
>
> That depends on the compiler options and linker command file.
> In my experience, "executable" ELF files on embedded systems
> (images that are ready to load into RAM and run) are generally
> not relocatable.
>
interesting...

well, I have really only seen ELF on Linux on x86, and there it
is almost invariably position-independent.

granted, I don't know what other systems do...
>> can't say so much about other file formats though...
>
> The COFF-based toolchains I've used all seem to support
> multiple sections, but that may have been due to
> vendor-specific extensions.
>
COFF has multiple sections, but PE/COFF (in particular) also has
ImageBase and ImageSize fields (I forget their exact official names
right off), which are located in the optional header, which is
mandatory in PE/COFF, and also contains things like the subsystem
(Console, GUI, ...), and references to the import and export tables
(related to DLL's), ...

AFAIK, PE/COFF also tends to assume that the image is continuous
between these addresses, and also that all loadable sections be
between them (doing otherwise could break the DLL / EXE loader).
however, they may support additional "non-loadable" sections, which
AFAIK need not obey this (but are usually ignored by the loader).

granted, to really know, I would have to dig around more closely in
the PE/COFF spec (and Microsoft's sometimes confusing writing style,
which caused great fun in a few cases when trying to debug my custom
EXE/DLL loader...).  however, I can't say much about how much of this
is common with other variants of COFF (IOW: the ones which don't
necessarily begin with an MS-DOS stub, ...).

nevermind the added layer of hackery needed for .NET ...

I guess it all depends then on whether the particular linker for the
particular target supports non-continuous images then, or if
alternative means would be needed instead... or such...
BGB / cr88192 wrote:
> "Grant Edwards" <invalid@invalid.invalid> wrote in message > news:hjgfa1$jcd$1@reader1.panix.com... >> On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote: >>> "Grant Edwards" <invalid@invalid.invalid> wrote in message
<snip>
> well, I haven't personally had much experience with embedded systems, so I
> am not sure here.
Then believe people who have...
>>> dunno about ELF,
>> ELF supports multiple sections, and I've done exactly such
>> things with ELF-based toolchains (Gnu binutils and GCC) when
>> working on stuff like bootloaders where the memory map changes
>> completely part-way through the program as the memory
>> controller gets configured.
>>
>
> yes, I know it has multiple sections, but AFAIK it is generally assumed that
> the final image is in a continuous region of memory (with the sections
> generally packed end-to-end), at least in the cases I have seen. granted, in
> cases I have seen, ELF has usually been x86 and PIC as well (the default
> build for Linux).
It simply isn't true for embedded systems.
>>> but PE/COFF would not support this, since it would require
>>> breaking some of the internal assumptions of the file format
>>> (for example, that the image is continuous from ImageBase to
>>> ImageBase+ImageSize, ...).
>>>
>>> ELF may have similar restrictions (actually, I think most ELF
>>> images are position independent anyways,
>> That depends on the compiler options and linker command file.
>> In my experience, "executable" ELF files on embedded systems
>> (images that are ready to load into RAM and run) are generally
>> not relocatable.
>
> interesting...
>
> well, I have really only seen ELF on Linux on x86, and there it is almost
> invariably position-independent.
>
> granted, I don't know what other systems do...
Then believe people who do...

<snip>
> I guess it all depends then on whether the particular linker for the
> particular target supports non-continuous images then, or if alternative
> means would be needed instead...
>
> or such...
Or you could believe people with experience on embedded systems.
It is a common requirement to have non-contiguous sections, and
sections which are loaded into one location but run from another,
and all sorts of funky things.

Sometimes you can execute code faster from RAM than ROM, so you
move the code at run-time; having boot loaders which get code
from one place (sometimes on a different processor) and put it
in another is common, and I've even had gaps in the ROM where
there was RAM!  All of which means having separate sections
which are not adjacent.

It's all specific to the given tool-chain as to the best way to
achieve it though.

-- 
Flash Gordon
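A sketch of the load-in-one-place, run-from-another arrangement described above, reusing the same placeholder section and symbol names as the earlier .ramfunc sketch (again assuming a GCC/binutils toolchain):

#include <stddef.h>
#include <string.h>

/* Placeholder symbols, assumed to be defined by the linker script so
   that .ramfunc is linked to run in RAM but stored (loaded) in ROM. */
extern char __ramfunc_load_start[];
extern char __ramfunc_run_start[];
extern char __ramfunc_run_end[];

/* Defined elsewhere with __attribute__((section(".ramfunc"))). */
extern void FlashUpdate(const unsigned char *image, size_t len);

void StartFlashUpdate(const unsigned char *image, size_t len)
{
    /* Copy the code from its load address (ROM) to its run address
       (RAM) before the first call; until then the RAM addresses hold
       whatever happened to be there. */
    memcpy(__ramfunc_run_start, __ramfunc_load_start,
           (size_t)(__ramfunc_run_end - __ramfunc_run_start));

    FlashUpdate(image, len);   /* now executes out of RAM */
}

Boot loaders that fetch the code from somewhere else and place it in RAM follow the same pattern; only the source of the bytes changes.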
Mark Borgerson wrote:
> In article <uq4nl51thfeujhkgk5i7nhtmapjesi6ns8@4ax.com>,
> jonk@infinitefactors.org says...
<snip>
> Well, I would not dream of using this approach on a function
> that never returns.  OTOH, a flash-update routine had better
> return, or it won't be particularly useful (unless your goal
> is to test the write-endurance of the flash) ;-)
<snip>

Sometimes you *do* write such functions so they never return.
Whilst reprogramming the flash it keeps kicking the watchdog,
but it stops when it's finished and the watchdog resets the
system, thus booting it into the new code.  Or it might branch
to the reset (or power-up) vector rather than return.

In fact, returning could easily be impossible because the code
from which the function was called is no longer there!

-- 
Flash Gordon
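A rough sketch of that never-return style, under the assumption of a hypothetical part with a watchdog; BLOCK_SIZE and the helper functions below are placeholders, not real library calls:

#include <stddef.h>

#define BLOCK_SIZE 512u   /* placeholder block size */

/* Placeholders for device-specific helpers. */
extern void kick_watchdog(void);
extern void program_flash_block(size_t block, const unsigned char *src);

/* Runs from RAM; the flash image it replaces may include its own
   caller, so it deliberately never returns. */
void FlashUpdateAndReboot(const unsigned char *image, size_t nblocks)
{
    size_t i;

    for (i = 0; i < nblocks; i++) {
        kick_watchdog();   /* keep the dog fed while writing */
        program_flash_block(i, image + i * BLOCK_SIZE);
    }

    /* Stop feeding the watchdog and wait: the reset it generates
       boots the system into the freshly written code. */
    for (;;) {
    }
}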