EmbeddedRelated.com
Memfault Beyond the Launch

Getting the code size of a function in C

Started by Tosca Berisha November 1, 2005
Grant Edwards <grante@visi.com> wrote:
> On 2005-11-02, Hans-Bernhard Broeker <broeker@physik.rwth-aachen.de> wrote:
> > Grant Edwards <grante@visi.com> wrote:
> >> Sure, but once you've put in a label so you know where the
> >> function starts, putting in a second one so you know where it
> >> ends only takes a couple more keystrokes.
> > Except that unlike the single starting point, which usually
> > must exist for the C function to be callable from unrelated
> > translation units, there's no particular reason for a given C
> > function to even *have* exactly one end where such a label
> > could be put.
> Nonsense. There is some address X such that all bytes in the
> function have addresses less than X.
Indeed, such an address X must exist --- but calling it 'the end' of that function may be premature. There's no general way of making sure that between the start label and X you'll have enclosed significantly less than the entire program, and even less that positioning such labels in the C source has any relation to the actual address range covered by (fragments of) the function.

And that's only scratching the surface. Next you'll have to worry about other functions being called by the function in question, often without the C source showing any sign of their existence. E.g. on some C-unfriendly architecture, accessing a structure element in an array of structures in "far" memory can easily cost 3 or more calls to "secret" C runtime library functions.
> > There's not even a requirement that a C compiler would have to
> > translate a single function to a single, consecutive block of
> > code. Optimization by function tail merging exists.
> It may not be a requirement, but I've never seen a C compiler > that didn't.
Then, with all due respect, you haven't been looking closely enough.
> This is comp.arch.EMBEDDED. We've got to work with real-world > toolchains here, not some imaginary "could do anything the ISO spec > allows" toolchain. comp.lang.c is that way --->
Let's stay right here, shall we? I hope we can agree that Keil C51 (current version) is quite definitely a real-world toolchain, and as embedded as they come, right? So: inspect what its optimization options like "reuse common entry code", "common block subroutines", and particularly the machinery they call "linker code packing" can do.

Fact is that a single function can fall apart into multiple distinct blocks of code, which the linker can in principle distribute all over the place, if it so wishes, and that toolchains doing this not only actually exist, but are highly relevant in this newsgroup.

--
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:

>> I've been doing embedded C for 20 years on a dozen different
>> target architectures and at least as many toolchains. I've
>> never seen a target/toolchain where what the OP wants isn't
>> easily doable with some trivial linker-script hacking.
>
> OK. Have a look at the Hi-Cross C Compiler. It translates
> every function into a separately linked unit. The units
> are linked in only if there are references to them.
Yup, that's exactly the same as the way gcc/ld works (at least that's how I use it for embedded work).
> The addresses allocated by the linker for consecutive > functions are not consecutive. For the reasons in this > discussion (Flash writer) I tried to find the function sizes, > but failed miserably.
In the linker script isn't there any way to tell it to link in a "unit" and put that "unit" into a specified section of memory?
> The binary file created was also a jumble of criss-crossed
> function-size pieces, so that I had to make an extra
> pre-sorting and merge pass before writing to Flash.
>
> My guess is that the functions were ordered by the stored call
> tree order collapsed in a weird way.
But the link should still put them into the specified output sections, shouldn't it?

--
Grant Edwards (grante at visi.com)
Yow! With YOU, I can be MYSELF... We don't NEED Dan Rather...
On 2005-11-02, Hans-Bernhard Broeker <broeker@physik.rwth-aachen.de> wrote:

>> Nonsense. There is some address X such that all bytes in the
>> function have addresses less than X.
>
> Indeed, such an address X must exist --- but calling 'the end' of that
> function may be premature. There's no general way of making sure that
> between X and that other label, you'll have enclosed significantly
> less than the entire program, and even less that positioning such
> labels in the C source has any relation with the actual address range
> covered by (fragments of) the function.
I never said anything about positioning labels in C source. I thought I was quite explicit that I was talking about using the linker script to place symbols before/after the memory section containing the function.
> And that's only scratching the surface. Next you'll have to > worry about other functions being called by the function in > question,
Of course.
> often without the C source showing any sign of their > existence. E.g. on some C-unfriendly architecture, accessing > a structure element in an array of structures in "far" memory > can easily cost 3 or more calls to "secret" C runtime library > functions.
Then relocating that function probably won't work if the called functions aren't available. It's quite easy to determine if that's the case by looking at the generated assembly.
> Let's stay right here, shall we? I hope we can agree that
> Keil C51 (current version) is quite definitely a real-world
> toolchain, and as embedded as they come, right? So: inspect
> what its optimization options like "reuse common entry code",
> "common block subroutines", and particularly the machinery
> they call "linker code packing" can do.
Then don't use those optimizations.
> Fact is that a single function can fall apart into multiple
> distinct blocks of code, which the linker can in principle
> distribute all over the place, if it so wishes, and that
> toolchains doing this not only actually exist, but are highly
> relevant in this newsgroup.
With all the toolchains I've used, there was always a way to place a function into a distinct section of memory such that that section's start/end addresses could be made globally visible at link-time. It may have required placing the function in a separate file and not using optimizations, but it was never that hard.

If the Keil 51 compiler really is that difficult to work with, then I'm glad I never had to use it.

--
Grant Edwards (grante at visi.com)
Yow! Excuse me, but didn't I tell you there's NO HOPE for the survival of OFFSET PRINTING?
Grant Edwards wrote:

> > In ELF files generated by GCC, the symbol table has not only > > the name, type (FUNC), and address of each symbol, but its > > size as well.
> The ELF file isn't available to the program at run-time.
True, but you could reserve initialized memory space to hold the information and have a post-link tool that extracts the desired information from the ELF header and then stores it in the space you reserved... More basically, the ELF header might give you some idea if the block of code in question is contiguous - ie, if the scheme is going to be possible provided you can provide the addresses at runtime, or if the compiler/linker is going to need adjustment in order to produce contiguous functions.
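As a rough illustration of the post-link idea above, here is a minimal sketch of a host-side lookup that reads a function's recorded size from the ELF symbol table (the size lives in the `.symtab` section rather than in the ELF file header proper). The names `elf64_func_size` and `read_at` are made up for this example. It assumes a 64-bit, unstripped ELF file whose byte order matches the host, and a Linux-style `<elf.h>`; a real tool would also handle 32-bit and cross-endian images and would then patch the value into the reserved space.

```c
#include <elf.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: read `len` bytes at file offset `off` into a
   freshly malloc'd buffer. Returns NULL on any failure. */
static void *read_at(FILE *f, long off, size_t len)
{
    void *buf = malloc(len ? len : 1);
    if (!buf)
        return NULL;
    if (fseek(f, off, SEEK_SET) != 0 || fread(buf, 1, len, f) != len) {
        free(buf);
        return NULL;
    }
    return buf;
}

/* Hypothetical lookup: return the st_size recorded for the FUNC symbol
   `name` in the 64-bit ELF file at `path`, or -1 if it is not found.
   Assumes the file's byte order matches the host's. */
long elf64_func_size(const char *path, const char *name)
{
    long result = -1;
    FILE *f = fopen(path, "rb");
    if (!f)
        return -1;

    Elf64_Ehdr eh;
    if (fread(&eh, 1, sizeof eh, f) != sizeof eh ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64 ||
        eh.e_shentsize != sizeof(Elf64_Shdr)) {
        fclose(f);
        return -1;
    }

    /* Load the section header table, then scan it for symbol tables. */
    Elf64_Shdr *sh = read_at(f, (long)eh.e_shoff,
                             (size_t)eh.e_shnum * sizeof *sh);
    for (size_t i = 0; sh && i < eh.e_shnum && result < 0; i++) {
        if (sh[i].sh_type != SHT_SYMTAB || sh[i].sh_link >= eh.e_shnum)
            continue;

        /* sh_link of a symbol table names its string table section. */
        const Elf64_Shdr *str = &sh[sh[i].sh_link];
        if (str->sh_size == 0)
            continue;

        Elf64_Sym *syms = read_at(f, (long)sh[i].sh_offset, sh[i].sh_size);
        char *names = read_at(f, (long)str->sh_offset, str->sh_size);
        size_t count = sh[i].sh_size / sizeof(Elf64_Sym);

        if (names)
            names[str->sh_size - 1] = '\0'; /* guard against bad tables */

        for (size_t k = 0; syms && names && k < count; k++) {
            if (ELF64_ST_TYPE(syms[k].st_info) == STT_FUNC &&
                syms[k].st_name < str->sh_size &&
                strcmp(names + syms[k].st_name, name) == 0) {
                result = (long)syms[k].st_size;
                break;
            }
        }
        free(syms);
        free(names);
    }
    free(sh);
    fclose(f);
    return result;
}
```

Note that the recorded size only describes the range the toolchain attributes to that one symbol. It says nothing about runtime helpers the function calls, nor about fragments that an optimizer may have moved elsewhere, so it does not settle the contiguity question raised earlier in the thread.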
On 2005-11-02, cs_posting@hotmail.com <cs_posting@hotmail.com> wrote:
> Grant Edwards wrote:
>
>>> In ELF files generated by GCC, the symbol table has not only
>>> the name, type (FUNC), and address of each symbol, but its
>>> size as well.
>
>> The ELF file isn't available to the program at run-time.
>
> True, but you could reserve initialized memory space to hold
> the information and have a post-link tool that extracts the
> desired information from the ELF header and then stores it in
> the space you reserved...
Perhaps, but it's way, way, way easier to do this in the linker script. After all, keeping track of what's where in memory is what linkers _do_. Those ELF headers were created by the linker, and all that information is available in a trivial manner to the linker script.

--
Grant Edwards (grante at visi.com)
Yow! Yow! Maybe I should have asked for my Neutron Bomb in PAISLEY--
"Grant Edwards" ...
> Hans-Bernhard Broeker ...
>
> > There's not even a requirement that a C compiler would have to
> > translate a single function to a single, consecutive block of
> > code. Optimization by function tail merging exists.
>
> It may not be a requirement, but I've never seen a C compiler
> that didn't. This is comp.arch.EMBEDDED. We've got to work
> with real-world toolchains here, not some imaginary "could do
> anything the ISO spec allows" toolchain. comp.lang.c is that
> way --->
Sure: Avocet C51 V3
Not sure anymore (long ago): IAR C51

AFAIK a 8051 is embedded stuff ;-)

Arie de Muynck
Grant Edwards wrote:
> On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:
>
>>>I've been doing embedded C for 20 years on a dozen different
>>>target architectures and at least as many toolchains. I've
>>>never seen a target/toolchain where what the OP wants isn't
>>>easily doable with some trivial linker-script hacking.
>>
>>OK. Have a look at the Hi-Cross C Compiler. It translates
>>every function into a separately linked unit. The units
>>are linked in only if there are references to them.
>
> Yup, that's exactly the same as the way gcc/ld works (at least
> that's how I use it for embedded work).
Nope. The linkable unit in GCC is a module, not a single function. Hi-Cross separates even functions within a module.
>>The addresses allocated by the linker for consecutive
>>functions are not consecutive. For the reasons in this
>>discussion (Flash writer) I tried to find the function sizes,
>>but failed miserably.
>
> In the linker script isn't there any way to tell it to link in
> a "unit" and put that "unit" into a specified section of
> memory?
Hi-Cross does not have such a thing. There is a primitive sectionization, but nothing else.
>>The binary file created was also a jumble of criss-crossed
>>function-size pieces, so that I had to make an extra
>>pre-sorting and merge pass before writing to Flash.
>>
>>My guess is that the functions were ordered by the stored call
>>tree order collapsed in a weird way.
>
> But the link should still put them into the specified output
> sections, shouldn't it?
See above. After sorting, the final output was a sensible binary file in one piece, but the functions that were together in a module in the source were usually no longer together in the final file, except occasionally. It is pretty clear that there is then no way to put in labels to calculate the size of a function.

The code (M68k) is inherently position-independent in small functions, so the only problem was to get the size of the code.

--
Tauno Voipio
tauno voipio (at) iki fi
On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:

>> Yup, that's exactly the same as the way gcc/ld works (at least
>> that's how I use it for embedded work).
>
> Nope. The linkable unit in GCC is a module, not a single
> function. Hi-Cross separates even functions within a module.
So does the GNU toolchain if you tell it to. If you use the -ffunction-sections option, each function goes into its own section. This means that each function is a linkable unit. [For most purposes, it's the same as putting every function in a separate file.] When you link, you specify --gc-sections, and only sections (functions) that are referenced will be linked into the output image.

This also means that in the linker script you can specify a separate output section for a particular function if you want to. Once the function is in its own section, finding the size of that section is trivial using the SIZEOF() operator in the linker script.

In all the other toolchains I've used (which I guess weren't all that representative) you could easily accomplish much the same thing if the function was placed in its own file.

[You can also do the same thing with variables: if you use -fdata-sections when compiling, then unreferenced global variables will not be linked into the output image.]
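For readers who want to see the section-based approach from the C side, here is a minimal sketch assuming GCC and GNU ld on an ELF target. It does not use SIZEOF() in a custom linker script. Instead it relies on the `__start_<name>` and `__stop_<name>` symbols that GNU ld defines automatically for any section whose name is a valid C identifier. The section name `ramfunc` and the functions `add_one`, `ramfunc_size`, and `ramfunc_copy` are hypothetical names chosen for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical section name. It must be a valid C identifier so that
   GNU ld synthesizes the __start_/__stop_ symbols for it. */
#define RAMFUNC __attribute__((section("ramfunc"), noinline, used))

/* Illustrative function whose code size we want to know. */
RAMFUNC int add_one(int x)
{
    return x + 1;
}

/* Provided by GNU ld at link time, not by any C source file. */
extern const unsigned char __start_ramfunc[];
extern const unsigned char __stop_ramfunc[];

/* Size in bytes of everything placed in the "ramfunc" section. */
size_t ramfunc_size(void)
{
    return (size_t)(__stop_ramfunc - __start_ramfunc);
}

/* Copy the section's bytes into `dst` (for example a RAM buffer).
   Returns the number of bytes copied, or 0 if `dst` is too small. */
size_t ramfunc_copy(void *dst, size_t dst_len)
{
    size_t n = ramfunc_size();
    if (n > dst_len)
        return 0;
    memcpy(dst, __start_ramfunc, n);
    return n;
}
```

This measures the section, not the function as such, so it only equals the function's size when nothing else is placed in `ramfunc`. Whether the copied bytes can actually run from another address still depends on position independence and on any helper routines the compiler calls behind the scenes, as discussed elsewhere in this thread.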
>> In the linker script isn't there any way to tell it to link in
>> a "unit" and put that "unit" into a specified section of
>> memory?
>
> Hi-Cross does not have such a thing. There is a primitive
> sectionization, but nothing else.
Hi-Cross sounds like a nasty bit of work.

--
Grant Edwards (grante at visi.com)
Yow! I had pancake makeup for brunch!
Grant Edwards wrote:
> On 2005-11-02, Tauno Voipio <tauno.voipio@INVALIDiki.fi> wrote:
>
>>>Yup, that's exactly the same as the way gcc/ld works (at least
>>>that's how I use it for embedded work).
>>
>>Nope. The linkable unit in GCC is a module, not a single
>>function. Hi-Cross separates even functions within a module.
>
> So does the Gnu toolchain if you tell it to. If you use the
> -ffunction-sections option, each function goes into its own
> section. This means that each function is a linkable unit. [For
> most purposes, it's the same as putting every function in a
> separate file.]
We were not asking for it, but how to avoid it.

If we're talking about GCC and the GNU toolchain, the whole discussion is moot: there are plenty of ways to create the desired effect. IMHO, by far the easiest is to sectionize and locate the function to .data, so we do not need to consider position-dependency issues at all and the start-up code does the loading of the function code to RAM.
>>>In the linker script isn't there any way to tell it to link in
>>>a "unit" and put that "unit" into a specified section of
>>>memory?
>>
>>Hi-Cross does not have such a thing. There is a primitive
>>sectionization, but nothing else.
>
> Hi-Cross sounds like a nasty bit of work.
It is a family of compilers sold for embedded work. The OP's question was about the C language without specifying a compiler. AFAIK, Hi-Cross is not alone in this kind of 'innovation' in the embedded world.

--
Tauno Voipio
tauno voipio (at) iki fi

PS. I learned something from the compiler: my current projects are all using the GNU toolset. TV
