Position independent code with fixed data and bss

Started by Christopher Collins July 6, 2015
MCU: STM32F4 (ARM Cortex M4)
Build environment: arm-none-eabi-gcc 4.8.4 20140725

My goal is to build an image that can be run from any properly-aligned
offset in internal flash (i.e., position-independent).  I found the
following set of gcc flags that achieves this goal:

    # Generate position independent code.
    -fPIC

    # Access bss via the GOT.
    -mno-pic-data-is-text-relative

    # GOT is not PC-relative; store GOT location in a register.
    -msingle-pic-base

    # Store GOT location in r9.
    -mpic-register=r9

This works, but now I am wondering if there is a way to reduce the size of
the resulting binary.  In particular, the above flags cause all global
variables to be accessed via the global offset table (GOT).  However, I
don't need this extra indirection, because the data and bss sections
will always be at fixed offsets in SRAM.  The only part of the image
that needs to be position independent is the code itself.  Ideally, I
would like gcc to treat all accesses to global variables as though it
weren't generating position-independent code.

Any ideas?  All input is greatly appreciated.

Chris

On Mon, 06 Jul 2015 17:44:11 -0700, Christopher Collins wrote:

> MCU: STM32F4 (ARM Cortex M4)
> Build environment: arm-none-eabi-gcc 4.8.4 20140725
>
> My goal is to build an image that can be run from any properly-aligned
> offset in internal flash (i.e., position-independent). I found the
> following set of gcc flags that achieves this goal:
>
>     # Generate position independent code.
>     -fPIC
>
>     # Access bss via the GOT.
>     -mno-pic-data-is-text-relative
>
>     # GOT is not PC-relative; store GOT location in a register.
>     -msingle-pic-base
>
>     # Store GOT location in r9.
>     -mpic-register=r9
>
> This works, but now I am wondering if there is a way to reduce the size
> of the resulting binary. In particular, the above flags cause all
> global variables to be accessed via the global offset table (GOT).
> However, I don't need this extra indirection, because the data and bss
> sections will always be at fixed offsets in SRAM. The only part of the
> image that needs to be position independent is the code itself.
> Ideally, I would like gcc to treat all accesses to global variables
> as though it weren't generating position-independent code.
>
> Any ideas? All input is greatly appreciated.

State your next goal out: why do you want this position-independent code?

You may be trying to solve your real problem with the wrong solution.

It may be that you just want to NOT access the bss via the GOT: if you're
not defining any variables in bss or data then you should be able to just
link against whatever defines the positions of the global data.

If you ARE defining such variables, then I'm not sure what reasonable,
workable thing you're trying to do.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

On 2015-07-07, Tim Wescott <seemywebsite@myfooter.really> wrote:
> On Mon, 06 Jul 2015 17:44:11 -0700, Christopher Collins wrote:
>
>> MCU: STM32F4 (ARM Cortex M4)
>> Build environment: arm-none-eabi-gcc 4.8.4 20140725
>>
>> My goal is to build an image that can be run from any properly-aligned
>> offset in internal flash (i.e., position-independent). I found the
>> following set of gcc flags that achieves this goal:
<snip>
>> In particular, the above flags cause all global variables to be
>> accessed via the global offset table (GOT). However, I don't need
>> this extra indirection, because the data and bss sections will always
>> be at fixed offsets in SRAM.
> State your next goal out: why do you want this position-independent code?
>
> You may be trying to solve your real problem with the wrong solution.

I want the code to be position-independent to allow for multiple image slots in flash. I should be able to build an image without knowing which slot it will ultimately be uploaded to, and the processor needs to be able to run the image directly from whichever slot it was placed in.

> It may be that you just want to NOT access the bss via the GOT: if you're
> not defining any variables in bss or data then you should be able to just
> link against whatever defines the positions of the global data.

Are you suggesting wrapping all global data in accessor functions, and
then building the function modules without the PIC flags? I can see how
that would work.

Maybe I am demanding too much from the compiler and linker, but I was
hoping for a means of specifying that certain sections are at fixed
addresses, without needing to make substantial changes to the C code.

> If you ARE defining such variables, then I'm not sure what reasonable,
> workable thing you're trying to do.

Thanks,
Chris

On Tue, 07 Jul 2015 13:09:28 -0700, Christopher Collins wrote:

> On 2015-07-07, Tim Wescott <seemywebsite@myfooter.really> wrote:
>> On Mon, 06 Jul 2015 17:44:11 -0700, Christopher Collins wrote:
>>
>>> MCU: STM32F4 (ARM Cortex M4)
>>> Build environment: arm-none-eabi-gcc 4.8.4 20140725
>>>
>>> My goal is to build an image that can be run from any properly-aligned
>>> offset in internal flash (i.e., position-independent). I found the
>>> following set of gcc flags that achieves this goal:
> <snip>
>>> In particular, the above flags cause all global variables to be
>>> accessed via the global offset table (GOT). However, I don't need
>>> this extra indirection, because the data and bss sections will always
>>> be at fixed offsets in SRAM.
>> State your next goal out: why do you want this position-independent
>> code?
>>
>> You may be trying to solve your real problem with the wrong solution.
>
> I want the code to be position-independent to allow for multiple image
> slots in flash. I should be able to build an image without knowing
> which slot it will ultimately be uploaded to, and the processor needs to
> be able to run the image directly from whichever slot it was placed in.
>
>> It may be that you just want to NOT access the bss via the GOT: if
>> you're not defining any variables in bss or data then you should be
>> able to just link against whatever defines the positions of the global
>> data.
>
> Are you suggesting wrapping all global data in accessor functions, and
> then building the function modules without the PIC flags? I can see how
> that would work.

I was suggesting that IF the code you wanted to relocate did not need to allocate anything in bss or data, then you could just use whatever locations the "host" code provides.

> Maybe I am demanding too much from the compiler and
> linker, but I was hoping for a means of specifying that certain sections
> are at fixed addresses, without needing to make substantial changes to
> the C code.

Well, if I take you at your word and interpret "sections" to mean the
same thing that the GNU tools do, then there may be a way to finagle the
data and bss sections of some linked chunk to be something else -- i.e.,
a linker directive that says "change all instances of bss to bss_n".
That would require pre-allocating code and RAM together, however.

>> If you ARE defining such variables, then I'm not sure what reasonable,
>> workable thing you're trying to do.

It sounds like you want to implement a shared library the hard way. Why
not use an OS that supports shared libraries, and go that route? I
couldn't guarantee it, but I'd be surprised if you couldn't do something
with micro Linux, and you certainly could with "real" Linux, or vxWorks,
etc.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

On 7/6/2015 5:44 PM, Christopher Collins wrote:
> MCU: STM32F4 (ARM Cortex M4)
> Build environment: arm-none-eabi-gcc 4.8.4 20140725
>
> My goal is to build an image that can be run from any properly-aligned
> offset in internal flash (i.e., position-independent). I found the
> following set of gcc flags that achieves this goal:
>
>     # Generate position independent code.
>     -fPIC
>
>     # Access bss via the GOT.
>     -mno-pic-data-is-text-relative
>
>     # GOT is not PC-relative; store GOT location in a register.
>     -msingle-pic-base
>
>     # Store GOT location in r9.
>     -mpic-register=r9
>
> This works, but now I am wondering if there is a way to reduce the size of
> the resulting binary. In particular, the above flags cause all global
> variables to be accessed via the global offset table (GOT). However, I
> don't need this extra indirection, because the data and bss sections
> will always be at fixed offsets in SRAM. The only part of the image
> that needs to be position independent is the code itself. Ideally, I
> would like gcc to treat all accesses to global variables as though it
> weren't generating position-independent code.

Why not stuff the globals in a special (named) section (e.g., "common") and tell the linkage editor where you want that section loaded? Let the compiler generate absolute references for each...
On 2015-07-07, Don Y <this@is.not.me.com> wrote:
> On 7/6/2015 5:44 PM, Christopher Collins wrote:
>> MCU: STM32F4 (ARM Cortex M4)
>> Build environment: arm-none-eabi-gcc 4.8.4 20140725
>>
>> My goal is to build an image that can be run from any properly-aligned
>> offset in internal flash (i.e., position-independent). I found the
>> following set of gcc flags that achieves this goal:
>>
>>     # Generate position independent code.
>>     -fPIC
>>
>>     # Access bss via the GOT.
>>     -mno-pic-data-is-text-relative
>>
>>     # GOT is not PC-relative; store GOT location in a register.
>>     -msingle-pic-base
>>
>>     # Store GOT location in r9.
>>     -mpic-register=r9
>>
>> This works, but now I am wondering if there is a way to reduce the size of
>> the resulting binary. In particular, the above flags cause all global
>> variables to be accessed via the global offset table (GOT). However, I
>> don't need this extra indirection, because the data and bss sections
>> will always be at fixed offsets in SRAM. The only part of the image
>> that needs to be position independent is the code itself. Ideally, I
>> would like gcc to treat all accesses to global variables as though it
>> weren't generating position-independent code.
>
> Why not stuff the globals in a special (named) section (e.g., "common") and
> tell the linkage editor where you want that section loaded? Let the compiler
> generate absolute references for each...

Thanks, Don. Actually, that is exactly what I'm trying to do. The
issue I am struggling with is: how to tell the compiler to use absolute
references to some sections, but still generate position-independent
code for branches. gcc's position independent options seem to be "all
or nothing."

Chris

On 7/7/2015 3:49 PM, Christopher Collins wrote:

>> Why not stuff the globals in a special (named) section (e.g., "common") and
>> tell the linkage editor where you want that section loaded? Let the compiler
>> generate absolute references for each...
>
> Thanks, Don. Actually, that is exactly what I'm trying to do. The
> issue I am struggling with is: how to tell the compiler to use absolute
> references to some sections, but still generate position-independent
> code for branches. gcc's position independent options seem to be "all
> or nothing."

Tag each variable/struct with a "section" attribute:

    #define INITIAL_VALUE ...

    long int
    my_data __attribute__ ((section ("COMMONDATA"))) = INITIAL_VALUE;

You can cheat and put all of the "items" in a struct wrapper so you
just have to assign the attribute to that *one* struct (instead of
having to chase down EVERY individual variable and thusly tag it.)

Try it on a small scale -- with N different instances of a "Hello, World"
referencing a single "shared" variable. Examine the load map for
each of those instances and see that they are all referencing that
"shared" variable (i.e., not N copies of it!)

On 2015-07-07, Don Y <this@is.not.me.com> wrote:
> On 7/7/2015 3:49 PM, Christopher Collins wrote:
>
>>> Why not stuff the globals in a special (named) section (e.g.,
>>> "common") and tell the linkage editor where you want that section
>>> loaded? Let the compiler generate absolute references for each...
>>
>> Thanks, Don. Actually, that is exactly what I'm trying to do. The
>> issue I am struggling with is: how to tell the compiler to use
>> absolute references to some sections, but still generate
>> position-independent code for branches. gcc's position independent
>> options seem to be "all or nothing."
>
> Tag each variable/struct with a "section" attribute:
>
>     #define INITIAL_VALUE ...
>
>     long int
>     my_data __attribute__ ((section ("COMMONDATA"))) = INITIAL_VALUE;
>
> You can cheat and put all of the "items" in a struct wrapper so you
> just have to assign the attribute to that *one* struct (instead of
> having to chase down EVERY individual variable and thusly tag it.)
>
> Try it on a small scale -- with N different instances of a "Hello, World"
> referencing a single "shared" variable. Examine the load map for
> each of those instances and see that they are all referencing that
> "shared" variable (i.e., not N copies of it!)

The problem is that the compiler does not know that COMMONDATA is at an
absolute address. When I compile the file which accesses my_data, gcc
is in "PIC mode," so it generates code to look up my_data in the GOT.
By the time the linker looks at the linker script and notices that
COMMONDATA is a separate section, it is already too late; the object
files already contain GOT references for each global access.

At least this is my interpretation of what is happening.

Christopher Collins wrote:
> On 2015-07-07, Don Y <this@is.not.me.com> wrote:
>> On 7/7/2015 3:49 PM, Christopher Collins wrote:
>>
>>>> Why not stuff the globals in a special (named) section (e.g.,
>>>> "common") and tell the linkage editor where you want that section
>>>> loaded? Let the compiler generate absolute references for each...
>>>
>>> Thanks, Don. Actually, that is exactly what I'm trying to do. The
>>> issue I am struggling with is: how to tell the compiler to use
>>> absolute references to some sections, but still generate
>>> position-independent code for branches. gcc's position independent
>>> options seem to be "all or nothing."
>>
>> Tag each variable/struct with a "section" attribute:
>>
>>     #define INITIAL_VALUE ...
>>
>>     long int
>>     my_data __attribute__ ((section ("COMMONDATA"))) = INITIAL_VALUE;
>>
>> You can cheat and put all of the "items" in a struct wrapper so you
>> just have to assign the attribute to that *one* struct (instead of
>> having to chase down EVERY individual variable and thusly tag it.)
>>
>> Try it on a small scale -- with N different instances of a "Hello, World"
>> referencing a single "shared" variable. Examine the load map for
>> each of those instances and see that they are all referencing that
>> "shared" variable (i.e., not N copies of it!)
>
> The problem is that the compiler does not know that COMMONDATA is at an
> absolute address. When I compile the file which accesses my_data, gcc
> is in "PIC mode," so it generates code to look up my_data in the GOT.
> By the time the linker looks at the linker script and notices that
> COMMONDATA is a separate section, it is already too late; the object
> files already contain GOT references for each global access.
>
> At least this is my interpretation of what is happening.
>

COMMONDATA should be "fixed up" by the linker, not the compiler
itself. "gcc -c ...." should generate unresolved references to
the variables in COMMONDATA (as viewable by objdump), and
you'll need linker-script-fu to get all that sorted out.

You want to arrange your source code where everything in COMMONDATA
is "extern...". As I recall, the linker for GNU affords
locating as well; you'd absolutely locate COMMONDATA and
not-absolutely locate everything else.

--
Les Cargill

On 2015-07-08, Les Cargill <lcargill99@comcast.com> wrote:
> COMMONDATA should be "fixed up" by the linker, not the compiler
> itself. "gcc -c ...." should generate unresolved references to
> the variables in COMMONDATA (as viewable by objdump), and
> you'll need linker-script-fu to get all that sorted out.
>
> You want to arrange your source code where everything in COMMONDATA
> is "extern...". As I recall, the linker for GNU affords
> locating as well; you'd absolutely locate COMMONDATA and
> not-absolutely locate everything else.

Either I am not explaining myself very well, or I am not understanding
the suggestions. When I tell gcc to compile a C file as
position-independent code, each access to global data is translated into
a series of instructions that is quite different from what I get when I
don't specify any position-independent flags. I wouldn't expect the
linker to be able to convert the first series of instructions into the
second. Here is an example:

    $ cat pic.c
    extern int var; /* Specifying the section here makes no difference. */

    void
    test(void)
    {
        var = 2;
    }

    ### Not position-independent
    $ /usr/local/bin/arm-none-eabi-gcc -O1 -c pic.c
    $ /usr/local/bin/arm-none-eabi-objdump -dS pic.o

    pic.o:     file format elf32-littlearm

    Disassembly of section .text:

    00000000 <test>:
       0:   e3a02002    mov     r2, #2
       4:   e59f3004    ldr     r3, [pc, #4]    ; 10 <test+0x10>
       8:   e5832000    str     r2, [r3]
       c:   e12fff1e    bx      lr
      10:   00000000    .word   0x00000000

    ### position-independent
    $ /usr/local/bin/arm-none-eabi-gcc -fPIC -O1 -c pic.c
    $ /usr/local/bin/arm-none-eabi-objdump -dS pic.o

    pic.o:     file format elf32-littlearm

    Disassembly of section .text:

    00000000 <test>:
       0:   e59f3014    ldr     r3, [pc, #20]   ; 1c <test+0x1c>
       4:   e08f3003    add     r3, pc, r3
       8:   e59f2010    ldr     r2, [pc, #16]   ; 20 <test+0x20>
       c:   e7933002    ldr     r3, [r3, r2]
      10:   e3a02002    mov     r2, #2
      14:   e5832000    str     r2, [r3]
      18:   e12fff1e    bx      lr
      1c:   00000010    .word   0x00000010
      20:   00000000    .word   0x00000000

Are you expecting the linker to change the indirect access in the second
listing to something that looks like the first one, assuming I specify
the correct sections in the linker script and other C files? I have not
had any luck in getting this to happen.

If the answer is that gcc just doesn't support the behavior I'm looking
for, that is OK. I am not demanding a solution from the group :).

Thanks again,
Chris