Getting the size of a C function

Started by john January 22, 2010
"Grant Edwards" <invalid@invalid.invalid> wrote in message 
news:hjdjjj$njp$1@reader1.panix.com...
> On 2010-01-23, BGB / cr88192 <cr88192@hotmail.com> wrote:
>
>> in this case, it might actually be better advised to generate the
>> function as a chunk of arch-specific ASM or machine code (ASM is
>> preferable IMO, but requires an assembler...), which could then be
>> located wherever (such as the heap).
>
> IMO, the "right" thing to do is to tell the compiler to put the
> function into a separate section and then have it linked so
> that it's "located" to run in RAM at the proper address but
> stored in ROM.
>
> That way you know the code will work correctly when it's run
> from RAM. Defining appropriate symbols in the linker command
> file will allow the program to refer to the start and end of
> the section's address in ROM.
>
this is a little closer to the second option, of having a secondary image file embedded as data...
> The OP needs to spend some time studying the manuals for his
> compiler and linker.
>
this is, assuming the linker or image format actually supports the "separate section" idea...

dunno about ELF, but PE/COFF would not support this, since it would require breaking some of the internal assumptions of the file format (for example, that the image is continuous from ImageBase to ImageBase+ImageSize, ...).

ELF may have similar restrictions (actually, I think most ELF images are position independent anyways, so one could relocate and adjust the GOT for an image easily enough).

(note that embedding an additional PE/COFF or ELF image would not likely be "that difficult", and the formats are not particularly difficult to work with).

a fixed-address PE/COFF image is likely an easy case, since one can copy the contents of the sections and then call into it.

for fixed-address, producing a raw binary image (supported by GNU ld, ...) is also probably a good option, since in this case the resulting image can be copied as a raw chunk of data (no need to relocate or worry about file-format), and jumped into.

can't say so much about other file formats though...
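
For reference, the separate-section approach Grant describes could look roughly like the sketch below. It assumes GCC or Clang on an ELF target linked with GNU ld or lld, where the linker provides `__start_<name>` and `__stop_<name>` symbols for a section whose name is a valid C identifier. The section name `ramfuncs` and the functions `flash_write_stub`, `ramfuncs_size` and `copy_ramfuncs` are made up for illustration. On a real embedded target the start/end symbols and the ROM and RAM addresses would normally be defined in the linker command file, and the details differ between toolchains.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical routine placed in its own section ("ramfuncs" is an
 * assumed name). noinline/used keep it as a real function in that section. */
__attribute__((section("ramfuncs"), noinline, used))
int flash_write_stub(int x)
{
    return x + 1;
}

/* Assumption: GNU ld / lld generate these for C-identifier section names.
 * With a custom linker script you would define your own symbols instead. */
extern const unsigned char __start_ramfuncs[];
extern const unsigned char __stop_ramfuncs[];

/* Size in bytes of everything the linker placed in the section. */
size_t ramfuncs_size(void)
{
    return (size_t)(__stop_ramfuncs - __start_ramfuncs);
}

/* Copy the section image into a caller-supplied buffer.
 * Returns the number of bytes copied, or 0 if the buffer is too small. */
size_t copy_ramfuncs(unsigned char *dst, size_t dst_len)
{
    size_t n = ramfuncs_size();
    if (n > dst_len) {
        return 0;
    }
    memcpy(dst, __start_ramfuncs, n);
    return n;
}
```

Note that this only gives the size and a byte copy. Whether the copy can be executed still depends on the section having been linked to run at the RAM address (or on the code being position independent), which is the point Grant makes above.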
On 23 Jan, 08:03, jacob navia <ja...@nospam.org> wrote:

...

> > At the expense of a few words of code and a parameter, you could do
> >
> > int MoveMe(...., bool findend){
> >    if(!findend){
> >
> >    // do all the stuff the function is supposed to do
> >
> >    } else Markend();
> >
> > }
> >
> > Where Markend is a function that pulls the return
> > address off the stack and stashes it somewhere
> > convenient.

...

> Sorry Mark but this is totally WRONG!
>
> The return address contains the address where the CPU RETURNS TO
> when the current function is finished, not the end of the
> current function!!!

So in Mark's example what will it be in Markend()?

> The return address will be in the middle of another function, that CALLED
> this one.

i.e. Moveme()?

James
In article <hjeah3$10g$1@speranza.aioe.org>, jacob@nospam.org says...
> Mark Borgerson wrote:
> > In article <pan.2010.01.23.05.08.12.672000@nowhere.com>,
> > nobody@nowhere.com says...
> >> On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:
> >>
> >>> I need to know the size of a function or module because I need to
> >>> temporarily relocate the function or module from flash into sram to
> >>> do firmware updates.
> >> Do you need to be able to run it from RAM? If so, simply memcpy()ing it
> >> may not work. And you would also need to copy anything which the function
> >> calls (just because there aren't any explicit function calls in the source
> >> code, that doesn't mean that there aren't any in the resulting object code).
> >>
> > At the expense of a few words of code and a parameter, you could do
> >
> > int MoveMe(...., bool findend){
> >    if(!findend){
> >
> >    // do all the stuff the function is supposed to do
> >
> >    } else Markend();
> >
> > }
> >
> > Where Markend is a function that pulls the return
> > address off the stack and stashes it somewhere
> > convenient. Markend may have to have some
> > assembly code. External code can then
> > subtract the function address from the address
> > stashed by Markend(), add a safety margin, and
> > know how many bytes to move to RAM.
> >
> > Mark Borgerson
> >
>
> Sorry Mark but this is totally WRONG!
>
> The return address contains the address where the CPU RETURNS TO
> when the current function is finished, not the end of the
> current function!!!
>
> The return address will be in the middle of another function, that CALLED
> this one.
I think you missed a few points:

Inside Markend, the return address on the stack will be the address after the call to Markend----which was purposely located at the end of MoveMe. The next few instructions after the call to Markend will be the return from MoveMe (an RTS or equivalent with stack cleanup).

Inside Markend, the return address on the stack will be an address near the end of MoveMe. It is that address that you need to save and make available for the computation of the function length.

In assembly, the code in Moveme might look like this:

0900 MoveMe: sub.l #8, SP    // make room for 8 bytes of locals
0904         test.l R14      // check the findend parameter in R14
0908         bne lbl1;       // if true, just find end of function
             ....
             ....            // all the work of Moveme goes here
             ....            // and gets executed when findend is zero
             ....
1000         bra lbl2        // skip the markend call
1004 lbl1:   bsr Markend
1008 lbl2:   add.l #8, SP    // clean up 8 bytes of local variables
1012         rts             // return from MoveMe

When Markend is called at 1004, the address 1008 gets pushed on the stack.

Inside Markend, you could do:

2040 Markend: Move SP, NearEnd   // NearEnd is a global variable
2044          RTS

Someplace else, could do

MMLength = NearEnd - (unsigned long)&Moveme + 4;

When I was teaching introductory M68K assembly language, I used to give exam problems with nested subroutine calls like this---some with pushed local variables, and ask the students to show the contents of the stack at some point in the function. Those questions really separated the As from the Bs and Cs!

NOTE: You have to make sure that your compiler doesn't convert the Markend function to an inline sequence of instructions.

Mark Borgerson
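
On GCC or Clang the same idea can be tried without hand-written assembly, using the `__builtin_return_address` builtin. The sketch below is only that: the names `near_end`, `mark_end`, `move_me` and `estimate_move_me_length` are invented here, and it relies on the compiler keeping the call to `mark_end()` near the end of the emitted code for `move_me()`, which nothing in the C language guarantees.

```c
#include <stdint.h>

/* Hypothetical global, playing the role of NearEnd in the listing above. */
volatile uintptr_t near_end;

/* noinline: if the compiler inlined this, there would be no call and
 * therefore no return address pointing into move_me(). */
__attribute__((noinline)) void mark_end(void)
{
    /* GCC/Clang builtin: the address this call will return to. */
    near_end = (uintptr_t)__builtin_return_address(0);
}

__attribute__((noinline)) int move_me(int value, int findend)
{
    if (!findend) {
        /* Stand-in for all the work the function is supposed to do. */
        return value * 2;
    }
    mark_end();   /* hoped to sit near the end of the emitted code */
    return 0;
}

/* Distance from the function's entry point to the recorded return address.
 * This is an estimate only: a safety margin is still needed, and the value
 * is meaningless if the compiler moved, cloned or reordered the code. */
uintptr_t estimate_move_me_length(void)
{
    (void)move_me(0, 1);
    return near_end - (uintptr_t)&move_me;
}
```

As with the assembly version, the result should be checked against a disassembly or the map file for the actual build before it is trusted.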
In article <hjes74.1bk.1@stefan.msgid.phost.de>, stefan.news@arcor.de 
says...
> john wrote:
> > I need to know the size of a function or module because I need to
> > temporarily relocate the function or module from flash into sram to
> > do firmware updates.
> >
> > How can I determine that at runtime?
>
> You can't in standard C, because functions are not contiguous objects.
>
> Most environments have some way of placing a function in a special
> section (using pragmas or things like __attribute__), and a possibility
> to acquire position and size of that section (using linker magic).
>
> In general, you cannot assume a function generates just a single blob of
> assembly code in the ".text" sections. For example, functions containing
> string or floating-point literals, or large switches, often generate
> some data in ".rodata", static variables end up in ".data" or ".bss",
> and if you're doing C++, you'll get some exception handling tables as well.
>
That's a real good point. If the OP's goal was just to move the function code--and not necessarily execute it after movement, he may not care whether the bytes in the .rodata, .data, or .bss segments get moved.

If the function has to be moved and executed, then it had better be able to access the data in the .rodata, .data and .bss segments---or not use data in any of those segments that are in flash memory.

If you're moving the function to RAM because you can't execute from Flash while updating flash, the function being moved could be written to use only variables and data in RAM. This might be the case if the function being moved is the Flash write routine.

Now that I think about it, I may use this approach in writing a firmware update routine for the MSP430---which has the restrictions mentioned above.

Mark Borgerson
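
A rough sketch of what "use only variables and data in RAM" might look like in C follows. Everything in it is hypothetical (`flash_job`, `flash_write_words`); the controller-specific steps a real flash routine needs, such as unlocking and busy-waiting, are left out, and here the "flash" is just whatever memory `dst` points at. The routine takes everything it needs through a pointer argument and uses no string literals, no statics and no explicit library calls.

```c
#include <stddef.h>
#include <stdint.h>

/* All inputs are passed in by the caller, who is expected to keep this
 * structure and the source buffer in RAM. */
typedef struct {
    volatile uint16_t *dst;   /* destination words (flash on a real target) */
    const uint16_t *src;      /* source words, assumed to live in RAM */
    size_t words;             /* number of 16-bit words to write */
} flash_job;

/* Returns 0 on success, -1 if a word does not read back as written. */
int flash_write_words(const flash_job *job)
{
    for (size_t i = 0; i < job->words; i++) {
        /* volatile destination: every store and read-back is emitted. */
        job->dst[i] = job->src[i];
        if (job->dst[i] != job->src[i]) {
            return -1;
        }
    }
    return 0;
}
```

Even with source written this way it is worth inspecting the generated code, since a compiler may still emit calls to helper routines or references to constant tables that were never written explicitly.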
Mark Borgerson <mborgerson@comcast.net> writes:

> At the expense of a few words of code and a parameter, you could do
>
> int MoveMe(...., bool findend){
>    if(!findend){
>
>    // do all the stuff the function is supposed to do
>
>    } else Markend();
>
> }
>
> Where Markend is a function that pulls the return
> address off the stack and stashes it somewhere
> convenient. Markend may have to have some
> assembly code. External code can then
> subtract the function address from the address
> stashed by Markend(), add a safety margin, and
> know how many bytes to move to RAM.

You seem to be assuming that the compiler emits machine code that is in the same order as the corresponding C code, i.e. that the call to Markend() will occur at the end of MoveMe(). This is not a good assumption.

--
"A lesson for us all: Even in trivia there are traps."
--Eric Sosman
In article <MPG.25c4df83ad93857d989a54@news.eternal-september.org>,
mborgerson@comcast.net says...
> [...]
> When Markend is called at 1004, the address 1008 gets pushed on the
> stack.
>
> Inside Markend, you could do:
>
> 2040 Markend: Move SP, NearEnd   // NearEnd is a global variable
> 2044          RTS
Yikes! I'll have to mark myself down 5 points!!! That should be

2040 Markend: Move @SP, NearEnd   // NearEnd is a global variable

I need to save the data pointed to by the stack pointer, not the contents of the stack pointer itself.
>
> Someplace else, could do
>
> MMLength = NearEnd - (unsigned long)&Moveme + 4;
>
> When I was teaching introductory M68K assembly language, I used
> to give exam problems with nested subroutine calls like this---some
> with pushed local variables, and ask the students to show
> the contents of the stack at some point in the function.
> Those questions really separated the As from the Bs and
> Cs!
>
> NOTE: You have to make sure that your compiler doesn't convert
> the Markend function to an inline sequence of instructions.
>
I also realized that, on the MSP430, I don't even need the function call. At the end of the function whose length I want to determine, I simply add the assembly language:

    mov PC, NearEnd

Both these methods do require some assembly language and are processor dependent. The compiler that I'm using on the MSP430 (Imagecraft), allows inline assembly, so the instruction above would be

    asm("mov PC, %NearEnd\n"); // the % is used to reference a C variable

I'm reasonably confident that I can use this technique to move a flash-write routine, but I will have to be very careful about using global variables, since the compiler produces PC relative references to global and static variables. Those references will be hosed when the code is moved.

Mark Borgerson
In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
says...
> Mark Borgerson <mborgerson@comcast.net> writes:
>
> > At the expense of a few words of code and a parameter, you could do
> >
> > int MoveMe(...., bool findend){
> >    if(!findend){
> >
> >    // do all the stuff the function is supposed to do
> >
> >    } else Markend();
> >
> > }
> >
> > Where Markend is a function that pulls the return
> > address off the stack and stashes it somewhere
> > convenient. Markend may have to have some
> > assembly code. External code can then
> > subtract the function address from the address
> > stashed by Markend(), add a safety margin, and
> > know how many bytes to move to RAM.
>
> You seem to be assuming that the compiler emits machine code that
> is in the same order as the corresponding C code, i.e. that the
> call to Markend() will occur at the end of MoveMe(). This is not
> a good assumption.
>
I'll paraphrase the old Reagan maxim: "assume, but verify". I did a test run with an MSP-430 compiler and the call was at the end. For that particular processor, as I later discovered and noted in another post, you don't even need the function call. You can save the contents of the PC at the end of the function with a line of assembly.

This would certainly be a dangerous technique on a processor with multi-threading and possible out-of-order execution. I think it will work OK on the MSP430 that is the CPU where I am working on a flash-burning routine.

Mark Borgerson
"Mark Borgerson" <mborgerson@comcast.net> wrote in message 
news:MPG.25c428eeb3e97212989a50@news.eternal-september.org...
> In article <pan.2010.01.23.05.08.12.672000@nowhere.com>,
> nobody@nowhere.com says...
>> On Fri, 22 Jan 2010 22:53:18 +0000, john wrote:
>>
>> > I need to know the size of a function or module because I need to
>> > temporarily relocate the function or module from flash into sram to
>> > do firmware updates.
>>
>> Do you need to be able to run it from RAM? If so, simply memcpy()ing it
>> may not work. And you would also need to copy anything which the function
>> calls (just because there aren't any explicit function calls in the
>> source code, that doesn't mean that there aren't any in the resulting
>> object code).
>>
> At the expense of a few words of code and a parameter, you could do
>
> int MoveMe(...., bool findend){
>    if(!findend){
>
>    // do all the stuff the function is supposed to do
>
>    } else Markend();
>
> }
>
If you're going to add a special parameter (and assume the return type is compatible with a return address), it might be possible to use gcc's feature of obtaining the address of a label. Then findend can return the address of a label placed near the closing brace of the function (which possibly may be less likely to be rearranged than a function call).

int MoveMe(...., bool findend){
    if(findend) return (int)&&endoffunction;

    // do all the stuff the function is supposed to do

endoffunction:
    return 0;
}

--
Bartc
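
A compilable variant of Bartc's idea might look like the sketch below. It assumes GCC or Clang, since `&&label` (labels as values) is a GNU C extension, and it returns `uintptr_t` instead of `int` so the address is not truncated on targets where pointers are wider than `int`. The names `move_me`, `end_of_function` and the `result` out-parameter are invented for the example. Nothing guarantees that the label ends up at the end of the emitted code, particularly when optimizing, so the returned value still has to be checked against a disassembly.

```c
#include <stdint.h>

/* noinline: the address of a label can differ between inlined or cloned
 * copies of a function, so keep a single out-of-line copy. */
__attribute__((noinline))
uintptr_t move_me(int value, int findend, int *result)
{
    if (findend) {
        /* GNU C extension: address of a label in this function. */
        return (uintptr_t)&&end_of_function;
    }

    /* Stand-in for all the work the function is supposed to do. */
    *result = value + 1;

end_of_function:
    return 0;
}
```

A caller would then compute something like `move_me(0, 1, &dummy) - (uintptr_t)&move_me` and add a safety margin, with the same caveats as for the Markend() approach.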
Mark Borgerson <mborgerson@comcast.net> writes:

> In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu
> says...
>> Mark Borgerson <mborgerson@comcast.net> writes:
>> You seem to be assuming that the compiler emits machine code that
>> is in the same order as the corresponding C code, i.e. that the
>> call to Markend() will occur at the end of MoveMe(). This is not
>> a good assumption.
>
> This would certainly be a dangerous technique on a processor
> with multi-threading and possible out-of-order execution.
> I think it will work OK on the MSP430 that is the CPU where
> I am working on a flash-burning routine.

Threading and out-of-order execution have little if anything to do with it. The issue is the order of the code emitted by the compiler, not the order of the code's execution.

--
Ben Pfaff
http://benpfaff.org
In article <87tyucpp7x.fsf@blp.benpfaff.org>, blp@cs.stanford.edu 
says...
> Mark Borgerson <mborgerson@comcast.net> writes:
>
> > In article <87y6jopsl9.fsf@blp.benpfaff.org>, blp@cs.stanford.edu
> > says...
> >> Mark Borgerson <mborgerson@comcast.net> writes:
> >> You seem to be assuming that the compiler emits machine code that
> >> is in the same order as the corresponding C code, i.e. that the
> >> call to Markend() will occur at the end of MoveMe(). This is not
> >> a good assumption.
> >
> > This would certainly be a dangerous technique on a processor
> > with multi-threading and possible out-of-order execution.
> > I think it will work OK on the MSP430 that is the CPU where
> > I am working on a flash-burning routine.
>
> Threading and out-of-order execution has little if anything to do
> with it. The issue is the order of the code emitted by compiler,
> not the order of the code's execution.
>

But wouldn't an optimizing compiler generating code for a complex processor be more likely to optimize in a way that changed the order of operations? I think that might apply particularly to a call to a function that returns no result to be used in a specific place inside the outer function.

Mark Borgerson