Getting the code size of a function in C | page 3

Reply by Hans-Bernhard Broeker ● November 2, 2005

Grant Edwards <grante@visi.com> wrote:
> On 2005-11-02, Hans-Bernhard Broeker <broeker@physik.rwth-aachen.de> wrote:
> > Grant Edwards <grante@visi.com> wrote:

> >> Sure, but once you've put in a label so you know where the
> >> function starts, putting in a second one so you know where it
> >> ends only takes a couple more keystrokes.

> > Except that unlike the single starting point, which usually
> > must exist for the C function to be callable from unrelated
> > translation units, there's no particular reason for a given C
> > function to even *have* exactly one end where such a label
> > could be put.

> Nonsense.  There is some address X such that all bytes in the
> function have addresses less than X.  

Indeed, such an address X must exist --- but calling it 'the end' of that
function may be premature.  There's no general way of making sure that
between X and that other label, you'll have enclosed significantly
less than the entire program, and even less that positioning such
labels in the C source has any relation with the actual address range
covered by (fragments of) the function.

And that's only scratching the surface.  Next you'll have to worry
about other functions being called by the function in question, often
without the C source showing any sign of their existence.  E.g. on
some C-unfriendly architecture, accessing a structure element in an
array of structures in "far" memory can easily cost 3 or more calls to
"secret" C runtime library functions.

> > There's not even a requirement that a C compiler would have to
> > translate a single function to a single, consecutive block of
> > code. Optimization by function tail merging exists.

> It may not be a requirement, but I've never seen a C compiler
> that didn't.  

Then, with all due respect, you haven't been looking closely enough.

> This is comp.arch.EMBEDDED.  We've got to work with real-world
> toolchains here, not some imaginary "could do anything the ISO spec
> allows" toolchain.  comp.lang.c is that way --->

Let's stay right here, shall we?  I hope we can agree that Keil C51
(current version) is quite definitely a real-world toolchain, and as
embedded as they come, right?  So: inspect what its optimization
options like "reuse common entry code", "common block subroutines",
and particularly the machinery they call "linker code packing" can do.

Fact is that a single function can fall apart into multiple distinct
blocks of code, which the linker can in principle distribute all over
the place, if it so wishes, and that toolchains doing this not only
actually exist, but are highly relevant in this newsgroup.

-- 
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Reply by Grant Edwards ● November 2, 2005

On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:

>> I've been doing embedded C for 20 years on a dozen different
>> target architectures and at least as many toolchains.  I've
>> never seen a target/toolchain where what the OP wants isn't
>> easily doable with some trivial linker-script hacking.
>
> OK. Have a look at the Hi-Cross C Compiler. It translates
> every function into a separately linked unit. The units
> are linked in only if there are references to them.

Yup, that's exactly the same as the way gcc/ld works (at least
that's how I use it for embedded work).

> The addresses allocated by the linker for consecutive
> functions are not consecutive. For the reasons in this
> discussion (Flash writer) I tried to find the function sizes,
> but failed miserably.

In the linker script isn't there any way to tell it to link in
a "unit" and put that "unit" into a specified section of
memory?

> The binary file created was also a jumble of criss-crossed
> function-size pieces, so that I had to make an extra
> pre-sorting and merge pass before writing to Flash.
>
> My guess is that the functions were ordered by the stored call
> tree order collapsed in a weird way.

But the link should still put them into the specified output
sections, shouldn't it?

-- 
Grant Edwards                   grante             Yow!  With YOU, I can be
                                  at               MYSELF... We don't NEED
                               visi.com            Dan Rather...

Reply by Grant Edwards ● November 2, 2005

On 2005-11-02, Hans-Bernhard Broeker <broeker@physik.rwth-aachen.de> wrote:

>> Nonsense.  There is some address X such that all bytes in the
>> function have addresses less than X.  
>
> Indeed, such an address X must exist --- but calling it 'the end' of that
> function may be premature.  There's no general way of making sure that
> between X and that other label, you'll have enclosed significantly
> less than the entire program, and even less that positioning such
> labels in the C source has any relation with the actual address range
> covered by (fragments of) the function.

I never said anything about positioning labels in C source.  I
thought I was quite explicit that I was talking about using the
linker script for placing symbols before/after the memory
section containing the function.

> And that's only scratching the surface.  Next you'll have to
> worry about other functions being called by the function in
> question,

Of course.

> often without the C source showing any sign of their
> existence.  E.g. on some C-unfriendly architecture, accessing
> a structure element in an array of structures in "far" memory
> can easily cost 3 or more calls to "secret" C runtime library
> functions.

Then relocating that function probably won't work if the called
functions aren't available.  It's quite easy to determine if
that's the case by looking at the generated assembly.

> Let's stay right here, shall we?  I hope we can agree that
> Keil C51 (current version) is quite definitely a real-world
> toolchain, and as embedded as they come, right?  So: inspect
> what its optimization options like "reuse common entry code",
> "common block subroutines", and particularly the machinery
> they call "linker code packing" can do.

Then don't use those optimizations.

> Fact is that a single function can fall apart into multiple
> distinct blocks of code, which the linker can in principle
> distribute all over the place, if it so wishes, and that
> toolchains doing this not only actually exist, but are highly
> relevant in this newsgroup.

With all the toolchains I've used, there was always a way to
place a function into a distinct section of memory such that
that section's start/end addresses could be made globally
visible at link-time.  It may have required placing the
function in a separate file and not using optimizations, but it
was never that hard.  If the Keil 51 compiler really is that
difficult to work with, then I'm glad I never had to use it. 

-- 
Grant Edwards                   grante             Yow!  Excuse me, but didn't
                                  at               I tell you there's NO HOPE
                               visi.com            for the survival of OFFSET
                                                   PRINTING?

Reply by cs_posting@hotmail.com ● November 2, 2005

Grant Edwards wrote:

> > In ELF files generated by GCC, the symbol table has not only
> > the name, type (FUNC), and address of each symbol, but its
> > size as well.

> The ELF file isn't available to the program at run-time.

True, but you could reserve initialized memory space to hold the
information and have a post-link tool that extracts the desired
information from the ELF header and then stores it in the space you
reserved...

More basically, the ELF header might give you some idea if the block of
code in question is contiguous - ie, if the scheme is going to be
possible provided you can provide the addresses at runtime, or if the
compiler/linker is going to need adjustment in order to produce
contiguous functions.
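
A rough sketch of what such a post-link tool could look like, assuming a Linux-style host that provides <elf.h>, a 64-bit ELF file in host byte order, and a symbol table that has not been stripped. The names read_at and elf_func_size are invented for this illustration; a 32-bit target would need the Elf32_* types instead.

```c
#include <elf.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Read `size` bytes at file offset `off` into a fresh buffer with one
   extra zero byte appended; returns NULL on any failure. */
static void *read_at(FILE *f, unsigned long off, size_t size)
{
    char *buf = calloc(size + 1, 1);
    if (buf && (fseek(f, (long)off, SEEK_SET) != 0 ||
                fread(buf, 1, size, f) != size)) {
        free(buf);
        buf = NULL;
    }
    return buf;
}

/* Return the st_size recorded for the FUNC symbol `name` in the 64-bit
   ELF file at `path`, or -1 if the file or the symbol cannot be read. */
long elf_func_size(const char *path, const char *name)
{
    long result = -1;
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    Elf64_Shdr *sh = NULL;
    /* Only load the section headers if this looks like 64-bit ELF. */
    if (fread(&eh, sizeof eh, 1, f) == 1 &&
        memcmp(eh.e_ident, ELFMAG, SELFMAG) == 0 &&
        eh.e_ident[EI_CLASS] == ELFCLASS64 &&
        eh.e_shentsize == sizeof(Elf64_Shdr))
        sh = read_at(f, eh.e_shoff, (size_t)eh.e_shnum * sizeof *sh);

    for (unsigned i = 0; sh && i < eh.e_shnum && result < 0; i++) {
        if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh.e_shnum)
            continue;
        /* sh_link of a symbol table names its string table section. */
        const Elf64_Shdr *strsec = &sh[sh[i].sh_link];
        Elf64_Sym *syms = read_at(f, sh[i].sh_offset, sh[i].sh_size);
        char *strs = read_at(f, strsec->sh_offset, strsec->sh_size);
        size_t count = sh[i].sh_size / sizeof(Elf64_Sym);

        for (size_t j = 0; syms && strs && j < count; j++) {
            if (ELF64_ST_TYPE(syms[j].st_info) == STT_FUNC &&
                syms[j].st_name < strsec->sh_size &&
                strcmp(strs + syms[j].st_name, name) == 0) {
                result = (long)syms[j].st_size;
                break;
            }
        }
        free(syms);
        free(strs);
    }
    free(sh);
    fclose(f);
    return result;
}
```

A call such as elf_func_size("firmware.elf", "flash_write") (both names hypothetical) would return the size field stored for that symbol, which the tool could then patch into the reserved memory. The value describes one symbol as the toolchain recorded it, so by itself it does not prove that all of the function's code lives in that range.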

Reply by Grant Edwards ● November 2, 2005

On 2005-11-02, cs_posting@hotmail.com <cs_posting@hotmail.com> wrote:
> Grant Edwards wrote:
>
>>> In ELF files generated by GCC, the symbol table has not only
>>> the name, type (FUNC), and address of each symbol, but its
>>> size as well.
>
>> The ELF file isn't available to the program at run-time.
>
> True, but you could reserve initialized memory space to hold
> the information and have a post-link tool that extracts the
> desired information from the ELF header and then stores it in
> the space you reserved...

Perhaps, but it's way, way, way easier to do this in the linker
script.  After all, keeping track of what's where in memory is
what linkers _do_.  Those ELF headers were created by the
linker, and all that information is available in a trivial
manner to the linker script.

-- 
Grant Edwards                   grante             Yow!  Yow! Maybe I should
                                  at               have asked for my Neutron
                               visi.com            Bomb in PAISLEY--

Reply by Arie de Muynck ● November 2, 2005

"Grant Edwards" ...
> Hans-Bernhard Broeker ...
>
> > There's not even a requirement that a C compiler would have to
> > translate a single function to a single, consecutive block of
> > code. Optimization by function tail merging exists.
>
> It may not be a requirement, but I've never seen a C compiler
> that didn't.  This is comp.arch.EMBEDDED.  We've got to work
> with real-world toolchains here, not some imaginary "could do
> anything the ISO spec allows" toolchain.  comp.lang.c is that
> way --->

Sure:                        Avocet C51 V3
Not sure anymore (long ago): IAR C51

AFAIK an 8051 is embedded stuff   ;-)

Arie de Muynck

Reply by Tauno Voipio ● November 2, 2005

Grant Edwards wrote:
> On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:
> 
> 
>>>I've been doing embedded C for 20 years on a dozen different
>>>target architectures and at least as many toolchains.  I've
>>>never seen a target/toolchain where what the OP wants isn't
>>>easily doable with some trivial linker-script hacking.
>>
>>OK. Have a look at the Hi-Cross C Compiler. It translates
>>every function into a separately linked unit. The units
>>are linked in only if there are references to them.
> 
> 
> Yup, that's exactly the same as the way gcc/ld works (at least
> that's how I use it for embedded work).

Nope. The linkable unit in GCC is a module, not a single
function. Hi-Cross separates even functions within a module.

>>The addresses allocated by the linker for consecutive
>>functions are not consecutive. For the reasons in this
>>discussion (Flash writer) I tried to find the function sizes,
>>but failed miserably.
> 
> In the linker script isn't there any way to tell it to link in
> a "unit" and put that "unit" into a specified section of
> memory?

Hi-Cross does not have such a thing. There is a primitive
sectionization, but nothing else.

>>The binary file created was also a jumble of criss-crossed
>>function-size pieces, so that I had to make an extra
>>pre-sorting and merge pass before writing to Flash.
>>
>>My guess is that the functions were ordered by the stored call
>>tree order collapsed in a weird way.
> 
> 
> But the link should still put them into the specified output
> sections, shouldn't it?

See above.

After sorting, the final output was a sensible binary
file in one piece, but the functions that were together
in the source in a module, were usually no longer together
in the final file, except occasionally. It is pretty clear
that there is then no way to put in labels to calculate
the size of a function. The code (M68k) is inherently
position-independent in small functions, so the only
problem was to get the size of the code.

-- 

Tauno Voipio
tauno voipio (at) iki fi

Reply by Grant Edwards ● November 2, 2005

On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:

>> Yup, that's exactly the same as the way gcc/ld works (at least
>> that's how I use it for embedded work).
>
> Nope. The linkable unit in GCC is a module, not a single
> function. Hi-Cross separates even functions within a module.

So does the GNU toolchain if you tell it to.  If you use the
-ffunction-sections option, each function goes into its own
section. This means that each function is a linkable unit. [For
most purposes, it's the same as putting every function in a
separate file.]

When you link, you specify --gc-sections, and only sections
(functions) that are referenced will be linked into the output
image.

This also means that in the linker script you can specify a
separate output section for a particular function if you want
to.  Once the function is in its own section, finding the size
of that section is trivial using the SIZEOF() operator in the
linker script.
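
A minimal sketch of that idea, assuming a GNU toolchain and an ELF target. Instead of a hand-written SIZEOF() in a linker script, it leans on a GNU ld convenience: for a section that the linker script does not mention (an "orphan" section) and whose name is a valid C identifier, the linker provides __start_<name> and __stop_<name> symbols by itself. The section name ramfunc and the functions add_one and ramfunc_size are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical function to be measured.  "ramfunc" is an assumed
   section name; noinline keeps a real copy of the body in it. */
__attribute__((section("ramfunc"), noinline))
int add_one(int x)
{
    return x + 1;
}

/* Provided by GNU ld for an orphan section named "ramfunc"; they mark
   the first byte of the section and the byte just past its end. */
extern const char __start_ramfunc[];
extern const char __stop_ramfunc[];

/* Size in bytes of everything the linker placed in "ramfunc". */
size_t ramfunc_size(void)
{
    return (size_t)((uintptr_t)__stop_ramfunc - (uintptr_t)__start_ramfunc);
}
```

ramfunc_size() reports the extent of the whole section, padding included, so it only matches the size of add_one while add_one is the section's sole occupant. With a custom embedded linker script that places the section explicitly, the equivalent symbols would have to be defined there by hand. Neither variant accounts for helper routines the compiler calls behind the scenes.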

In all the other toolchains I've used (which I guess weren't
all that representative) you could easily accomplish much the
same thing if the function was placed in its own file.

[You can also do the same thing with variables: if you use
-fdata-sections when compiling, then unreferenced global
variables will not be linked into the output image.]

>> In the linker script isn't there any way to tell it to link in
>> a "unit" and put that "unit" into a specified section of
>> memory?
>
> Hi-Cross does not have such a thing. There is a primitive
> sectionization, but nothing else.

Hi-Cross sounds like a nasty bit of work.

-- 
Grant Edwards                   grante             Yow!  I had pancake makeup
                                  at               for brunch!
                               visi.com

Reply by Tauno Voipio ● November 3, 2005

Grant Edwards wrote:
> On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:
> 
> 
>>>Yup, that's exactly the same as the way gcc/ld works (at least
>>>that's how I use it for embedded work).
>>
>>Nope. The linkable unit in GCC is a module, not a single
>>function. Hi-Cross separates even functions within a module.
> 
> 
> So does the GNU toolchain if you tell it to.  If you use the
> -ffunction-sections option, each function goes into its own
> section. This means that each function is a linkable unit. [For
> most purposes, it's the same as putting every function in a
> separate file.]

We were not asking for it, but how to avoid it.

If we're talking about GCC and the GNU toolchain, the
whole discussion is moot: there are plenty of ways
to create the desired effect. IMHO, by far the easiest
is to sectionize and locate the function to .data,
so we do not need to consider position-dependency
issues at all and the start-up code does the loading
of the function code to RAM.

>>>In the linker script isn't there any way to tell it to link in
>>>a "unit" and put that "unit" into a specified section of
>>>memory?
>>
>>Hi-Cross does not have such a thing. There is a primitive
>>sectionization, but nothing else.
> 
> 
> Hi-Cross sounds like a nasty bit of work.

It is a family of compilers sold for embedded work.
The OP question was about C language without specifying
a compiler.

AFAIK, Hi-Cross is not alone with this kind of
'innovations' in the embedded world.

-- 

Tauno Voipio
tauno voipio (at) iki fi

PS. I learned something of the compiler: my current projects
     are all using the GNU toolset.

TV
