Getting the size of a C function | page 5

Reply by Mark Borgerson ● January 24, 2010

In article <efkt27x98l.ln2@news.flash-gordon.me.uk>, 
smap@spam.causeway.com says...
> Mark Borgerson wrote:
> > In article <uq4nl51thfeujhkgk5i7nhtmapjesi6ns8@4ax.com>, 
> > jonk@infinitefactors.org says...
> >> On Sat, 23 Jan 2010 15:18:25 -0800, Mark Borgerson
> >> <mborgerson@comcast.net> wrote:
> >>
> > <<SNIP discussion of unpredictable, but legal, compiler behavior>>
> >>> I've also run across main processing loops such as
> >>>
> >>> void MainLoop(void)
> >>> {
> >>>    while(1){
> >>>       /* get user input */
> >>>       /* execute commands */
> >>>    }
> >>>    MarkEnd();
> >>> }
> >>>
> >>> where MarkEnd doesn't appear in the generated machine
> >>> code, because the compiler, even at lowest optimization
> >>> setting, recognizes that the code after the loop
> >>> will never get executed.
> >> Yes, of course.  That is another possibility.  The intended
> >> function may be essentially the "main loop" of the code and
> >> as such never returns.  However, whether or not MarkEnd()
> >> were optimized out, it wouldn't ever get executed anyway.  So
> >> you'd never get the address stuffed into something useful...
> >> and so it doesn't even matter were it that the compiler kept
> >> the function call.  So it makes a good case against your
> >> approach for an entirely different reason than optimization
> >> itself.
> > 
> > Well, I would not dream of using this approach on a function
> > that never returns.  OTOH a flash-update routine had better
> > return, or it won't be particularly useful (unless your goal
> > is to test the write-endurance of the flash)   ;-)
> 
> <snip>
> 
> Sometimes you *do* write such functions so they never return. Whilst 
> reprogramming the flash it keeps kicking the watchdog, but it stops when 
> it's finished and the watchdog resets the system, thus booting it into 
> the new code. Or it might branch to the reset (or power-up) vector 
> rather than return. In fact, returning could easily be impossible 
> because the code from which the function was called is no longer there!
> 
Hmmmm.  I hadn't thought of that watchdog idea, since TI recommends
shutting off the watchdog and disabling interrupts while programming
flash.

I also agree about the return not being normal---there's probably not
much chance that returning to the address on the stack is going to
work out, so a reset is probably the best idea after a firmware
update.

I should have said that I wouldn't use this idea on a function
designed to run forever---or at least not one that the compiler
might think runs forever.  I would also examine the
resulting code to make sure the compiler was doing what I
intended.


I think that the ideas I have described will work on some
processors and compilers for some functions, but not
on all compilers for all processors and functions.  If you
do a lot of embedded systems programming, restrictions
like that are nothing new.


Mark Borgerson

Reply by Albert van der Horst ● January 24, 2010

In article <hjda8u$t4k$1@speranza.aioe.org>, john  <john@nospam.com> wrote:
>Hi,
>
>I need to know the size of a function or module because I need to
>temporarily relocate the function or module from flash into sram to
>do firmware updates.
>
>How can I determine that at runtime? The
>sizeof( myfunction)
>generates an error: "size of function unknown".

Admit it: you are doing something that can't be done in C.
By far the simplest approach is to generate assembler code and
add a little instrumentation to it.
Start by accessing the function through a function pointer.
Then you can store an SRAM address there when needed.
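
A minimal sketch of that pointer indirection, in C. All of the names here (flash_write_in_flash, flash_write_in_sram, redirect_flash_write, call_flash_write) are hypothetical, and the "SRAM copy" is only a second ordinary function standing in for relocated code, because actually copying machine code into RAM is toolchain-specific and not shown.

```c
#include <stddef.h>

/* Type of the routine that callers reach only through a pointer. */
typedef int (*flash_write_t)(int value);

static int sram_calls = 0;   /* counts calls that reached the stand-in */

/* Hypothetical routine as linked into flash. */
int flash_write_in_flash(int value)
{
    return value + 1;
}

/* Stand-in for the relocated copy.  On real hardware this would be an
   address inside SRAM obtained by toolchain-specific means. */
int flash_write_in_sram(int value)
{
    sram_calls++;
    return value + 1;
}

/* All callers go through this pointer, never the function name. */
static flash_write_t flash_write_fn = flash_write_in_flash;

/* Store a new target (for example the SRAM address) in the pointer. */
void redirect_flash_write(flash_write_t target)
{
    if (target != NULL)
        flash_write_fn = target;
}

int call_flash_write(int value)
{
    return flash_write_fn(value);
}

int sram_call_count(void)
{
    return sram_calls;
}
```

Before an update, the code would call redirect_flash_write() with the SRAM address and from then on use only call_flash_write(). The indirection says nothing about how many bytes to copy; that still has to come from the assembler or linker output.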

>
>Thanks.

-- 
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst

Reply by David Brown ● January 24, 2010

Keith Thompson wrote:
> WangoTango <Asgard24@mindspring.com> writes:
>> In article <hjda8u$t4k$1@speranza.aioe.org>, john@nospam.com says...
>>> I need to know the size of a function or module because I need to
>>> temporarily relocate the function or module from flash into sram to
>>> do firmware updates.
>>>
>>> How can I determine that at runtime? The
>>> sizeof( myfunction)
>>> generates an error: "size of function unknown".
>>>
>> Good question, and I would like to know if there is an easy way to do it 
>> during runtime, and a portable way would be nice too.  I would probably 
>> look at the map file and use the size I calculated from there, but 
>> that's surely not runtime.
>>
>> You can get the starting address of the function pretty easy, but how 
>> about the end?  Hmmm, gotta' think about that.
> 
> You can't even portably assume that &func is the memory address of the
> beginning of the function.  I think there are systems (AS/400) where
> function pointers are not just machine addresses.
> 

Closer to comp.arch.embedded, &func may not be the memory address of a 
function on smaller micros with more than 64KB (or sometimes 64K words) 
of flash.  gcc for the AVR, for example, uses trampolines for function 
pointers on devices with more than 64K words flash - &func gives the 
address of a jump instruction in the lower 64K memory, which jumps to 
the real function.  That way you can use 16-bit function pointers with 
larger memories.

> Given whatever it is you're doing, you're probably not too concerned
> with portability, so that's likely not to be an issue.  But there's no
> portable way in C to determine the size of a function, so you're more
> likely to get help somewhere other than comp.lang.c.
>

Reply by David Brown ● January 24, 2010

Mark Borgerson wrote:
> In article <pan.2010.01.23.05.08.12.672000@nowhere.com>, 
> nobody@nowhere.com says...
>> On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:
>>
>>> I need to know the size of a function or module because I need to
>>> temporarily relocate the function or module from flash into sram to
>>> do firmware updates.
>> Do you need to be able to run it from RAM? If so, simply memcpy()ing it
>> may not work. And you would also need to copy anything which the function
>> calls (just because there aren't any explicit function calls in the source
>> code, that doesn't mean that there aren't any in the resulting object code).
>>
>>
> At the expense of a few words of code and a parameter, you could do
> 
> 
> int MoveMe(...., bool findend){
> 	if(!findend){
> 
> 	// do all the stuff the function is supposed to do
> 
> 	} else Markend();
> 
> }
> 
> 
> Where Markend is a function that pulls the return 
> address off the stack and stashes it somewhere
> convenient.  Markend may have to have some
> assembly code.   External code can then
> subtract the function address from the address
> stashed by Markend(), add a safety margin, and
> know how many bytes to move to RAM.
> 
> 
> Mark Borgerson
> 

Anything that relies on the compiler being stupid, or deliberately 
crippled ("disable all optimisations") or other such nonsense is a bad 
solution.  It is conceivable that it might happen to work - /if/ you can 
get the compiler in question to generate bad enough code.  But it is 
highly dependent on the tools in question, and needs to be carefully 
checked at the disassembly level after any changes.

In this particular example of a highly risky solution, what happens when 
the compiler generates proper code?  The compiler is likely to generate 
the equivalent of:

int MoveMe(..., bool findend) {
	if (findend) "jump" Markend();
	// do all the stuff
}

Or perhaps it will inline Markend, MoveMe, or both.  Or maybe it will 
figure out that MoveMe is never called with "findend" set, and thus 
optimise away that branch.  All you can be sure of is that there is no 
way to demand that a compiler produce exactly the code you 
apparently want it to produce - C is not assembly.
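
For reference, the technique being criticised can be sketched in C with the GCC/Clang builtin __builtin_return_address. This is an assumption about one possible implementation, not the code Mark actually used: the names g_end_marker, end_marker and estimated_size are hypothetical, MoveMe's first parameter is a placeholder, and the function-pointer-to-integer cast is implementation-defined.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Address stashed by Markend(); NULL until MoveMe(..., true) has run. */
static void *g_end_marker = NULL;

/* GCC/Clang-specific: record the address this call will return to. */
__attribute__((noinline)) void Markend(void)
{
    g_end_marker = __builtin_return_address(0);
}

__attribute__((noinline)) int MoveMe(int arg, bool findend)
{
    int result = 0;

    if (!findend) {
        result = arg * 2;   /* placeholder for the real work */
    } else {
        Markend();          /* hoped to be the last code in MoveMe */
    }
    return result;
}

void *end_marker(void)
{
    return g_end_marker;
}

/* Marker address minus function address, plus a margin.  The value is
   only meaningful if the disassembly confirms that the call really is
   emitted at the end of MoveMe; nothing in C guarantees that. */
ptrdiff_t estimated_size(ptrdiff_t safety_margin)
{
    return (ptrdiff_t)((uintptr_t)g_end_marker - (uintptr_t)&MoveMe)
           + safety_margin;
}
```

The noinline attributes only rule out inlining. The compiler remains free to place the call anywhere inside MoveMe, turn it into a jump, or lay out the "real work" after it, so the result of estimated_size() can be too small or even negative.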

Reply by David Brown ● January 24, 2010

Mark Borgerson wrote:
> In article <slrnhlmqth.1evp.willem@turtle.stack.nl>, willem@stack.nl 
> says...
>> Mark Borgerson wrote:
>> ) In article <87tyucpp7x.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
>> ) says...
>> )> Mark Borgerson <mborgerson@comcast.net> writes:
>> )> 
>> )> > In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
>> )> > says...
>> )> >> Mark Borgerson <mborgerson@comcast.net> writes:
>> )> >> You seem to be assuming that the compiler emits machine code that
>> )> >> is in the same order as the corresponding C code, i.e. that the
>> )> >> call to Markend() will occur at the end of MoveMe().  This is not
>> )> >> a good assumption.
>> )> >
>> )> > This would certainly be a dangerous technique on a processor
>> )> > with multi-threading and possible out-of-order execution.
>> )> > I think it will work OK on the MSP430 that is the CPU where
>> )> > I am working on a flash-burning routine.
>> )> 
>> )> Threading and out-of-order execution have little if anything to do
>> )> with it.  The issue is the order of the code emitted by the compiler,
>> )> not the order of the code's execution.
>> )> 
>> ) But wouldn't an optimizing compiler generating code for a 
>> ) complex processor be more likely to optimize in
>> ) a way that changed the order of operations?  I think
>> ) that might apply particularly to a call to a function
>> ) that returns no result to be used in  a specific
>> ) place inside the outer function.
>>

You get good and bad compilers for all sorts of processors, and even a 
half-decent one will be able to move code around if it improves the 
speed or size of the target - something that can apply on any size of 
processor.

<snip>

> 
> In any of these instances,  I would certainly review
> the assembly code to make sure the compiler was doing
> what I intended in the order I wanted.  Maybe programmers
> in comp.lang.c don't do that as often as programmers
> in comp.arch.embedded.  ;-)
> 

I don't know about typical "comp.lang.c" programmers, but typical 
"comp.arch.embedded" programmers use compilers that generate tight code, 
and they let the compiler do its job without trying to force the tools 
into their way of thinking.  At least, that's the case for good embedded 
programmers - small and fast code means cheap and reliable 
microcontrollers in this line of work.  And code that has to be 
disassembled and manually checked at every change is not reliable or 
quality code.

Reply by James Harris ● January 24, 2010

On 24 Jan, 21:44, David Brown <david.br...@hesbynett.removethisbit.no>
wrote:
...
> Anything that relies on the compiler being stupid, or deliberately
> crippled ("disable all optimisations") or other such nonsense is a bad
> solution.

I *think* Mark is aware of the limitations of his suggestion but there
seems to be no C way to solve the OP's problem. It does sound like the
problem only needs to be solved as a one-off in a particular
environment.

That said, what about taking function pointers for all functions and
sorting their values? It still wouldn't help with the size of the last
function. Can we assume the data area would follow the code? I guess
not.

James
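
The sorting idea can be sketched in C as follows, with the caveats already raised: the address values would have to come from implementation-defined casts such as (uintptr_t)&f, the gap between two neighbours is at best an upper bound on the first function's size, and nothing is learned about the last entry. The helper names sort_addresses and gap_after are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* qsort comparator for addresses held as integers. */
static int cmp_addr(const void *a, const void *b)
{
    uintptr_t x = *(const uintptr_t *)a;
    uintptr_t y = *(const uintptr_t *)b;

    return (x > y) - (x < y);
}

/* Sort the collected function addresses into ascending order. */
void sort_addresses(uintptr_t *addrs, size_t n)
{
    qsort(addrs, n, sizeof addrs[0], cmp_addr);
}

/* Distance from entry i to the next higher address, or 0 for the last
   entry, whose size cannot be estimated this way. */
size_t gap_after(const uintptr_t *sorted, size_t n, size_t i)
{
    assert(i < n);
    return (i + 1 < n) ? (size_t)(sorted[i + 1] - sorted[i]) : 0;
}
```

Any function, padding or constant pool that the table does not list ends up silently included in a neighbour's gap, so the numbers need checking against the map file.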

Reply by Jon Kirwan ● January 24, 2010

On Sun, 24 Jan 2010 22:53:01 +0100, David Brown
<david.brown@hesbynett.removethisbit.no> wrote:

>Mark Borgerson wrote:
>> In article <slrnhlmqth.1evp.willem@turtle.stack.nl>, willem@stack.nl 
>> says...
>>> Mark Borgerson wrote:
>>> ) In article <87tyucpp7x.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
>>> ) says...
>>> )> Mark Borgerson <mborgerson@comcast.net> writes:
>>> )> 
>>> )> > In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
>>> )> > says...
>>> )> >> Mark Borgerson <mborgerson@comcast.net> writes:
>>> )> >> You seem to be assuming that the compiler emits machine code that
>>> )> >> is in the same order as the corresponding C code, i.e. that the
>>> )> >> call to Markend() will occur at the end of MoveMe().  This is not
>>> )> >> a good assumption.
>>> )> >
>>> )> > This would certainly be a dangerous technique on a processor
>>> )> > with multi-threading and possible out-of-order execution.
>>> )> > I think it will work OK on the MSP430 that is the CPU where
>>> )> > I am working on a flash-burning routine.
>>> )> 
>>> )> Threading and out-of-order execution have little if anything to do
>>> )> with it.  The issue is the order of the code emitted by the compiler,
>>> )> not the order of the code's execution.
>>> )> 
>>> ) But wouldn't an optimizing compiler generating code for a 
>>> ) complex processor be more likely to optimize in
>>> ) a way that changed the order of operations?  I think
>>> ) that might apply particularly to a call to a function
>>> ) that returns no result to be used in  a specific
>>> ) place inside the outer function.
>>>
>
>You get good and bad compilers for all sorts of processors, and even a 
>half-decent one will be able to move code around if it improves the 
>speed or size of the target - something that can apply on any size of 
>processor.
>
><snip>
>
>> 
>> In any of these instances,  I would certainly review
>> the assembly code to make sure the compiler was doing
>> what I intended in the order I wanted.  Maybe programmers
>> in comp.lang.c don't do that as often as programmers
>> in comp.arch.embedded.  ;-)
>> 
>
>I don't know about typical "comp.lang.c" programmers, but typical 
>"comp.arch.embedded" programmers use compilers that generate tight code, 
>and they let the compiler do its job without trying to force the tools 
>into their way of thinking.  At least, that's the case for good embedded 
>programmers - small and fast code means cheap and reliable 
>microcontrollers in this line of work.  And code that has to be 
>disassembled and manually checked at every change is not reliable or 
>quality code.

You give me a great way to segue into something.  There are
cases where you simply have no other option than to do
exactly that.  I'll provide one example.  There are others.

I was working on a project using the PIC18F252 processor and,
at the time, the Microchip C compiler was in its roughly-v1.1
incarnation.  We'd spent about 4 months in development time
and the project was nearing completion when we discovered an
intermittent (very rarely occurring) problem in testing.  Once
in a while, the program would emit strange outputs that we
simply couldn't understand when closely examining and walking
through the code that was supposed to generate that output.
It simply wasn't possible.  Specific ASCII characters were
being generated that simply were not present in the code
constants.

In digging through the problem, by closely examining the
generated assembly output, I discovered one remarkable fact
that led me to imagine a possibility that might explain
things.  The Microchip C compiler was using static variables
for compiler temporaries.  And it would _spill_ live
variables that might be destroyed across a function call into
them.  They would be labelled something like __temp0 and the
like.

There was _no_ problem when the C compiler was doing that for
calls made to functions within the same module, because they
had anticipated that there might be more than one compiler
temporary needed in nested calls and they added the extra
code in the C compiler to observe if a descendant function,
called by a parent, would also need to spill live variables
and would then construct more __temp1... variables to cover
that case.  Not unlike what good 8051 compilers might do when
generating static variable slots for nested call parameters
for efficiency (counting spills all the way down, so to
speak.)

However, when calling functions in _other_ modules, where the
C compiler had _no_ visibility about what it had already done
over there on a separate compilation, it had no means to do
that and, of course, there was a problem.  What was
spilled into __temp0 in module-A was also spilled into
__temp0 in module-B and, naturally, I just happened to have a
case where that became a problem under the influence of
interrupt processing.  I had completely saved _all_ registers
at the moment of the interrupt code before attempting to call
any C functions, of course.  That goes without saying.  But
I'd had _no_ idea that I might have to save some statics
which may, or may not, at the time be "live."

Worse, besides the fact that there was no way I could know in
advance which naming the C compiler would use in any
circumstance, the C compiler chose these names in such a way
that they were NOT global or accessible either to C code or
to assembly.  I had to actually _observe_ in the linker file
the memory location where they resided and make sure that the
interrupt routine protected them, as well.

This required me to document a procedure where every time we
made a modification to the code that might _move_ the
location of these compiler-generated statics, we had to
update a #define constant to reflect it, and then recompile
again.
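
A hedged sketch of the kind of workaround described, in portable C rather than the actual PIC18 code. TEMP_BASE and TEMP_COUNT are hypothetical names for the #define constants; on the real part TEMP_BASE would be the absolute address read from the linker output, whereas here a small array (fake_ram) stands in for device RAM so the save/restore logic can be exercised on a host.

```c
#include <string.h>

/* Stand-in for device RAM; assumption made for host testing only. */
static unsigned char fake_ram[16];

/* Hypothetical constants.  In the scheme described they would be
   updated by hand from the linker output after each build that moved
   the compiler-generated temporaries. */
#define TEMP_BASE  (&fake_ram[8])
#define TEMP_COUNT 4u

/* Example handler body that overwrites the temporaries, as compiled C
   code called from an interrupt might. */
void clobbering_handler(void)
{
    memset(TEMP_BASE, 0xFF, TEMP_COUNT);
}

/* Save the temporaries, run the handler, then restore them, so the
   interrupted code sees its spilled values unchanged. */
void isr_wrapper(void (*handler)(void))
{
    unsigned char saved[TEMP_COUNT];

    memcpy(saved, TEMP_BASE, TEMP_COUNT);
    handler();
    memcpy(TEMP_BASE, saved, TEMP_COUNT);
}

unsigned char *temp_region(void)
{
    return TEMP_BASE;
}
```

The weak point is the one noted in the post: nothing ties TEMP_BASE to the build, so a stale constant makes the wrapper protect the wrong bytes.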

Got us by.

Whether it is _reliable_ or not would be another debate.  The
resulting code was very reliable -- no problems at all.
However, the process/procedures we had to apply were not
reliable, of course, because we might forget to apply the
documented procedure before release.  So on that score, sure.

Life happens.  Oh, well.

Jon

Reply by bartc ● January 24, 2010

"James Harris" <james.harris.1@googlemail.com> wrote in message
news:c448f39c-2775-4ea5-b25a-7c8bfa0c6ded@b2g2000yqi.googlegroups.com...
> On 24 Jan, 21:44, David Brown <david.br...@hesbynett.removethisbit.no>
> wrote:
> ...
>> Anything that relies on the compiler being stupid, or deliberately
>> crippled ("disable all optimisations") or other such nonsense is a bad
>> solution.
>
> I *think* Mark is aware of the limitations of his suggestion but there
> seems to be no C way to solve the OP's problem. It does sound like the
> problem only needs to be solved as a one-off in a particular
> environment.
>
> That said, what about taking function pointers for all functions and
> sorting their values? It still wouldn't help with the size of the last
> function. Can we assume the data area would follow the code? I guess
> not.

You'd need to sort *all* the functions of an application (including
non-global functions), and there would still be the possibility that some 
function or other stuff you don't know about resides between 'consecutive' 
functions f() and g().

Reading f() might be alright but overwriting it would be tricky.

-- 
Bartc

Reply by Jon Kirwan ● January 24, 2010

On Sun, 24 Jan 2010 15:13:15 -0800 (PST), James Harris wrote:

>On 24 Jan, 21:44, David Brown <david.br...@hesbynett.removethisbit.no>
>wrote:
>...
>> Anything that relies on the compiler being stupid, or deliberately
>> crippled ("disable all optimisations") or other such nonsense is a bad
>> solution.
>
>I *think* Mark is aware of the limitations of his suggestion but there
>seems to be no C way to solve the OP's problem. It does sound like the
>problem only needs to be solved as a one-off in a particular
>environment.
>
>That said, what about taking function pointers for all functions and
>sorting their values? It still wouldn't help with the size of the last
>function. Can we assume the data area would follow the code? I guess
>not.

In general, no universally "good" assumptions exist.  Partly
that is because the very idea of "moving a function" in
memory at run-time has not yet been well defined by those
talking about it here.

Any given function may have the following:

code -->  Code is essentially strings of constants.  It may
reside in a von Neumann memory system or a Harvard one.  It
therefore may be readable by other code, or not.  Many of the
Harvard implementations include a special instruction or a
special pointer register, perhaps, to allow access to the
code space memory.  But not all do.  In general, it may not
even be possible to read and move code.  Even in von Neumann
memory systems where, in theory, there is no problem, the code
may have been "distributed" in pieces.  An example here would
be an implementation I saw with Metaware's C compiler where
they had extended it to support a type of co-routine called
an 'iterator.'  In this case, the body-block of a for-loop
would be moved outside the function's code region into a
separate function so that their implementation could call the
for-loop body through their very excellently considered
support mechanism for iterators.  You'd need to know where
that part was, as well, to meaningfully move things.

constants --> A function may include instanced constants
(which a smart compiler may "understand" from something like
'const int aa = 5;', if it also finds that some other code
takes the address of 'aa'.)  These may also need to be moved.
Especially if one is trying to download an updated function
into RAM before flashing it for permanence as a "code update"
procedure.  These constants may also be placed either in
von Neumann memory systems and be accessed via PC-relative or
absolute memory locations -- itself a potential bag of worms
-- or in Harvard code space if the processor supports
accessing it or in Harvard data space, otherwise, especially
if there is some of that which is non-volatile.

static initialized data -->  A function may include instanced
locations that must be initialized prior to main(), but where
the actual values of these instances are located in some
general collection place used by who-knows-what code in the
crt0 library routine that does this job of pre-initing.  Once
again, more issues to deal with and wonder about.

And that's just what trips off my tongue to start.

It's a tough problem to solve generally.  To do it right, the
language semantics (and syntax, most likely, as well) itself
would need to be expanded to support it.  That could be done,
I suppose.  But I imagine a lot of gnashing of teeth along
the way.

Jon

Reply by James Harris ● January 24, 2010

On 22 Jan, 22:53, john <j...@nospam.com> wrote:

> I need to know the size of a function or module because I need to
> temporarily relocate the function or module from flash into sram to
> do firmware updates.
>
> How can I determine that at runtime? The
> sizeof( myfunction)
> generates an error: "size of function unknown".

...

On 24 Jan, 23:37, "bartc" <ba...@freeuk.com> wrote:
> "James Harris" <james.harri...@googlemail.com> wrote in message

...

> > there seems to be no C way to solve the OP's problem.

...

> > That said, what about taking function pointers for all functions and
> > sorting their values? It still wouldn't help with the size of the last
> > function. Can we assume the data area would follow the code? I guess
> > not.
>
> You'd need to sort *all* the functions of an application (including
> non-global functions), and there would still be the possibility that some
> function or other stuff you don't know about resides between 'consecutive'
> functions f() and g().
>
> Reading f() might be alright but overwriting it would be tricky.

Since you've commented, Bart, do you have any thoughts on making
metadata about functions available in a programming language? Maybe
you already do this in one of your languages.

The thread got me thinking that if a function is a first-class object
perhaps some of its attributes should be transparent. Certainly its
code size and maybe its data size too; possibly its location, maybe a
signature for its input and output types. Then there are other
attributes such as whether it is in byte code or native code, whether
it is relocatable or not, what privilege it needs etc.

If portability is not needed a function object could also be
decomposed to individual instruction or subordinate function objects.
I'm not saying I like this idea - portability is a key goal for me -
but I'm just offering some ideas for comment.
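
To make the idea concrete, here is one way such metadata could look if it were maintained by hand in C today. This is only an illustrative sketch: struct fn_info, fn_table and fn_lookup are hypothetical names, and the sizes in the table are made-up example values, not measurements of any real code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* A few of the attributes listed above, exposed as plain data. */
struct fn_info {
    const char *name;
    size_t      code_size;    /* bytes of code */
    size_t      data_size;    /* bytes of static data */
    bool        native;       /* native code rather than byte code */
    bool        relocatable;  /* safe to copy and run elsewhere */
};

/* Made-up example entries; a language with built-in support would
   generate this table itself. */
static const struct fn_info fn_table[] = {
    { "flash_update", 212, 16, true, false },
    { "checksum",      64,  0, true, true  },
};

/* Look up a function's metadata by name; NULL if it is not listed. */
const struct fn_info *fn_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof fn_table / sizeof fn_table[0]; i++) {
        if (strcmp(fn_table[i].name, name) == 0)
            return &fn_table[i];
    }
    return NULL;
}
```

A hand-kept table like this goes stale as soon as the code changes, which is part of the argument for having the language or toolchain supply the attributes.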

Any thoughts on what's hot and what's not?

Followups set to only comp.lang.misc.

James
