EmbeddedRelated.com

Position independent code with position dependent data ?

Started by nono240 June 30, 2010
Hi Peter,

Peter Dickerson wrote:
> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
> news:i0gf3s$kou$1@speranza.aioe.org...
>> nono240 wrote:
>>
>>> My CPU has no MMU, very little RAM (8KB), and is running a modified
>>> FreeRTOS. I'd like to have the ability to "load" and run some code
>>> from USART/DATAFLASH to FLASH as a RTOS task. Of course, for
>>> convenience, the compiled code must be fully position independent.
>>> Using the -fPIC or -fpic option, looking at the assembler, the code
>>> seems OK : a dynamically computed offset is applied to every
>>> operations.
>>>
>>> BUT, looking deeply, both DATA and CODE are applied the base offset !
>>> While this is the expected behavior for CODE (running anywhere in
>>> FLASH), moving my CODE over the entire flash space doesn't mean moving
>>> my RAM ! This make only sense when executing everything from SDRAM !
>>>
>>> I'm looking for a solution to generate position independent *code*,
>>> but with position dependent *data* using GCC/LD... Any help ?
>> I find, in resource starved applications, using interpreters
>> is a big win. If you're loading apps dynamically, I suspect
>> the speed penalty would be insignificant (esp with careful
>> choice of language)
>
> Any suggestions for such interpretters, Don? Experiences?
Remember, this is c.a.e so, for the most part, you *know* what the application is -- and what it will *remain* (i.e., we're not looking at an environment where you have to be able to handle infinite variety of applications).

In the past, I've written C-ish, PL/M-ish and BASIC-ish interpreters along with Forth. Note that you can use these as guidelines for a pseudo-language without strictly complying with any formal language definition.

E.g., you can opt to implement integer only math instead of supporting "doubles", etc. You can force limits to be defined for string lengths (static memory allocation). You can discount recursion, etc.

The advantage of interpreters has always seemed to be coming up with really tight representations of algorithms and spend "ROM" instead of needing space in (loadable) RAM...
Hi there ! Thank you for reply !



>If this is the entire
>application and there is nothing else present, no operating
>system for example, then why do you care?
I'm running FreeRTOS. We need "dynamic task loading".
>Unless you want to download multiple tasks like this, and have them
>stored in arbitrary places in flash (and ram), then there is no need for
>position-independent data or code
IT IS my case. It's a (commercial) product that lets the user load multiple (so-named) "tasklets" into FLASH, but only ONE is allowed to run at a time. So, we need those "tasklets" to be CODE position independent, but share DATA.
>However, with sram fixed in one place and flash in another
>place (I'm assuming that's the case as you point out there is
>no MMU), there is no question about the fact that there are
>at least two separate segments in your situation. The base
>address of the flash-located segment might be the PC register
>so that this flash block can be moved around freely and uses
>the PC register as a cheap way to figure out where it's own
>stuff is at (assuming the processor supports that), but that
>won't work for the sram data instance segment which is
>obviously located "elsewhere." Somehow, a base address for
>that region needs to be made available to your code and
>applied at run time. What mechanism is available for that?

Any RODATA are stored in FLASH, and the mechanism used for position independence is a PC-relative offset : before any IO operations, the *real* offset from the original linkage is computed and added automatically :
For example, the following code :

    extern int myarray[]; // @0x4000 (DATA)
    int foo()
    {
        return myarray[0];
    }

gives :

    80018196: lddpc r6, 80018204   <---- R6 = 0x80014198 (Load PC relative)
    80018198: rsub r6,pc           <---- R6 = PC - R6 = 0x4000
    8001819a: ld.w r12,r6[0]       <---- R12= *(uint32_t *) R6
    8001819e: ret
    ....
    80018204:
        .word 0x80014198

So, if I run my code from elsewhere, let's say 16KB farther, the PIC-computed address for myarray is 0x8000, not what I want.

The same is applied for ROM constants (but it's OK in this case).
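The arithmetic in the listing above can be replayed on a host to see why the data address moves with the code. This is only a sketch: `pic_data_address` and `relocated_pc` are hypothetical helpers that model the `lddpc`/`rsub` pair as "run-time PC minus a link-time constant"; they are not generated code, and the two constants are simply read off the disassembly.

```c
#include <stdint.h>

/* Link-time values taken from the disassembly in the post. */
#define LINK_PC       0x80018198u  /* address of the rsub instruction */
#define LINK_LITERAL  0x80014198u  /* .word stored in the literal pool */

/*
 * Hypothetical model of the lddpc/rsub pair: the data address is
 * recovered as (run-time PC - link-time constant), so it follows the code.
 */
uint32_t pic_data_address(uint32_t runtime_pc)
{
    return runtime_pc - LINK_LITERAL;
}

/* Run-time PC of the rsub instruction after moving the code by `shift` bytes. */
uint32_t relocated_pc(uint32_t shift)
{
    return LINK_PC + shift;
}
```

With a shift of 0 the model yields 0x4000, and with a shift of 16 * 1024 it yields 0x8000, the two figures quoted in the post. The RAM itself has not moved, which is exactly the complaint.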
> For example, the following code :
>
> extern int myarray[]; // @0x4000 (DATA)
> int foo()
> {
> return myarray[0];
> }
>
> Give :
>
> 80018196: lddpc r6, 80018204 <---- R6 = 0x80014198 (Load PC
> relative)
> 80018198: rsub r6,pc <---- R6 = PC - R6 = 0x4000
> 8001819a: ld.w r12,r6[0] <---- R12= *(uint32_t *) R6
> 8001819e: ret
> ....
> 80018204:
> .word 0x80014198
>
> So, if I run my code from elsewhere, let's say 16KB farther, the PIC-
> computed address for m myarray is 0x8000, not what I want.
>
> The same is applied for ROM constants (but it's OK in this case).
Why don't you provide a function call to get the address of the shared data in your code, and use that within a tasklet? You might put the relevant data into a structure; ditto ROM data.

Andrew
On Thu, 1 Jul 2010 05:12:36 -0700 (PDT), nono240 <nono240@gmail.com>
wrote:


>>Unless you want to download multiple tasks like this, and have them
>>stored in arbitrary places in flash (and ram), then there is no need for
>>position-independent data or code
>
>
>IT IS my case. It's a (commercial) product, letting the user to load
>multiple (so named) "tasklets" into FLASH, but its only allowed to run
>ONE at a time. So, we need those "tasklets" to be CODE position
>independent, but share DATA.
If you can run only one task at a time, why do you need position independent code ? Just link each program to the same fixed load address. You need PIC code only when there are _multiple_ programs to be loaded somewhere into the RAM.

If you want to share data between these programs, first link the data area to a fixed address and then link each program to that address.

This is how it was done half a century ago. In FORTRAN, create a COMMON area, install it into a fixed address (usually at the top of the core) and then load each "transient program" into low memory, since the whole program could not fit into the core at once. No base/stack pointer relative addressing needed, since the data addresses were known at compile time.

With modern processors with versatile addressing modes, why not reserve one data area pointer at a known location (such as the first or last address in RAM or ROM) and use this to access the shared variables in each program ?

    Data = GetPersistentDataAreaPointer() ;
    ...
    Data->SharedVar1 = Data->SharedVar2 ;
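One way to read the pointer-at-a-known-location suggestion, as a host-runnable sketch rather than a definitive implementation. Only `GetPersistentDataAreaPointer`, `SharedVar1` and `SharedVar2` come from the post; the struct layout, `install_shared_data`, `tasklet_copy` and the `shared_slot` variable are assumptions made for illustration. On the real target the slot would be a single pointer stored at one agreed absolute address, which the thread does not specify.

```c
#include <stddef.h>

/* Layout of the shared area; the fields mirror the names used in the post. */
struct shared_data {
    int SharedVar1;
    int SharedVar2;
};

/*
 * Assumption: on the target this would be one pointer stored at a fixed,
 * agreed address. Here an ordinary variable stands in for that slot so
 * the sketch can run on a host.
 */
static struct shared_data *shared_slot = NULL;

/* Called once by the resident firmware before any tasklet runs (hypothetical). */
void install_shared_data(struct shared_data *area)
{
    shared_slot = area;
}

/* The only thing a tasklet needs to know: where to fetch the pointer. */
struct shared_data *GetPersistentDataAreaPointer(void)
{
    return shared_slot;
}

/* Example tasklet body: touches shared data only through the pointer. */
int tasklet_copy(void)
{
    struct shared_data *Data = GetPersistentDataAreaPointer();
    if (Data == NULL) {
        return -1;  /* shared area not installed yet */
    }
    Data->SharedVar1 = Data->SharedVar2;
    return 0;
}
```

Because the tasklet never names a data symbol directly, there is no data symbol for the compiler to address relative to the code. How the tasklet reaches the getter itself (a fixed entry address, a jump table, or a pointer handed over at start-up) is left open here.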
In article <24ad21f6-9326-4fe8-8434-164dd88fdfd8@b35g2000yqi.googlegroups.com>,
nono240  <nono240@gmail.com> wrote:
>Hi there !
>
>My CPU has no MMU, very little RAM (8KB), and is running a modified
>FreeRTOS. I'd like to have the ability to "load" and run some code
>from USART/DATAFLASH to FLASH as a RTOS task. Of course, for
>convenience, the compiled code must be fully position independent.
>Using the -fPIC or -fpic option, looking at the assembler, the code
>seems OK : a dynamically computed offset is applied to every
>operations.
>
>BUT, looking deeply, both DATA and CODE are applied the base offset !
>While this is the expected behavior for CODE (running anywhere in
>FLASH), moving my CODE over the entire flash space doesn't mean moving
>my RAM ! This make only sense when executing everything from SDRAM !
>
>I'm looking for a solution to generate position independent *code*,
>but with position dependent *data* using GCC/LD... Any help ?
A basic concept in linking (the ld program) is the ``section''. A section is an area of memory belonging together, such that e.g. distances within the section are fixable. A section may have a relocation table identifying the places in the section that still need to be adjusted to the final place it will be used in the program.

Now you want to have different sections behave differently as regards location.

What the linker (ld) does is combine sections from different object modules together into larger sections with names like .bss .text .data, possibly fixing up the relocation table. From that point on, whatever went into such a section will be treated in the same way, i.e. once you have combined DATA and CODE into one section, data and code will either be fixed at a position or have a relocation table. The linker is blind to the difference between code and data; the only information it gets is the naming convention of the input sections. This information is generated by the compiler.

Now you have to understand which sections you have, then tell the linker what to do with them. Using the --verbose option to the linker you get the so-called linker script, which details what the linker does.

What you want can be accomplished by adapting the linker script, which is -- I admit -- not necessarily easy.

Groetjes Albert

--
Albert van der Horst, UTRECHT,THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
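As a small illustration of how input sections get their names, GCC lets the source assign an object to a named section, which a linker script can then place separately. This is a sketch under stated assumptions: the section names `.shared_data` and `.tasklet_rodata` and all identifiers are made up, and the attribute is applied only when building with a GCC-compatible compiler for an ELF target. It only controls placement; whether the compiler then addresses such an object absolutely or PC-relatively under `-fpic` depends on the target's PIC model, which this example does not settle.

```c
/* Apply the section attribute only where it is known to be supported. */
#if defined(__GNUC__) && defined(__ELF__)
#define IN_SECTION(name) __attribute__((section(name)))
#else
#define IN_SECTION(name) /* no-op elsewhere */
#endif

/* Hypothetical section for data that should stay at a fixed RAM address. */
IN_SECTION(".shared_data") int shared_counter = 0;

/* Hypothetical section for constants that travel with the tasklet code. */
IN_SECTION(".tasklet_rodata") const int tasklet_table[3] = {10, 20, 30};

/* Ordinary code; it lands in the compiler's default .text input section. */
int tasklet_sum(void)
{
    int sum = 0;
    for (int i = 0; i < 3; i++) {
        sum += tasklet_table[i];
    }
    shared_counter++;  /* one more call recorded in the shared area */
    return sum;
}
```

Running `objdump -h` on the resulting object file should list the two extra sections alongside `.text`, which is the information the linker script then has to work with.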
Thanks for all your replies !

> If you can run only one task at a time, why do you need position
> independent code ? Just link each program to the same fixed load
> address. You need PIC code only when there are _multiple_ programs to
> be loaded somewhere into the RAM.
Because those "tasklets" are stored in different places in FLASH : we ship the device with 4 embedded tasklets, but a dozen more are available to download and can be uploaded into any of those 4 slots. We don't want the user to have to care about a "link address" ! Moreover, if we move to a CPU with more FLASH, we don't want to deal with multiple versions of each tasklet.
> If you want to share data between these programs, first link the data
> area to a fixed address and then link each program to that address.
>
> This is how it was done half a century ago. In FORTRAN, create a
> COMMON area, install it into a fixed address (usually at the top of
> the core) and then load each "transient program" into low memory,
> since the whole program could not fit into the core at once. No
> base/stack pointer relative addressing needed, since the data
> addresses were known at compile time.
Relocating a tasklet "on demand" to a "predefined fixed area" would prematurely wear out the FLASH, since there's not enough RAM to run the code from...
> With modern processors with versatile addressing modes, why not
> reserve one data area pointer at a known location (such as the first
> or last address in RAM or ROM) and use this to access the shared
> variables in each program ?
>
>   Data = GetPersistentDataAreaPointer() ;
>   ...
>   Data->SharedVar1 = Data->SharedVar2 ;
Because we want the tasklets to be "RTOS" unaware. Our FreeRTOS is running as a "hypervisor" (we did have an MPU though).

I just need a way to tell LD that my DATA section is ABSOLUTE, and not relative to CODE..
>> With modern processors with versatile addressing modes, why not
>> reserve one data area pointer at a known location (such as the first
>> or last address in RAM or ROM) and use this to access the shared
>> variables in each program ?
>>
>> Data = GetPersistentDataAreaPointer() ;
>> ...
>> Data->SharedVar1 = Data->SharedVar2 ;
>
> Because we want the tasklets to be "RTOS" unaware. Our FreeRTOS is
> running as an "hypervisor" (we did have an MPU though).
I don't see that Paul's suggestion makes your tasklet RTOS aware. Furthermore, I would have thought that the tasklets do need to be RTOS aware because they are manipulating a common data area.
> I just need a way to tell LD that my DATA section is ABSOLUTE, and not > relative from CODE..
What does your link script look like at present?

Andrew
"D Yuniskis" <not.going.to.be@seen.com> wrote in message 
news:i0hm4l$c6m$1@speranza.aioe.org...
> Hi Peter,
>
> Peter Dickerson wrote:
>> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
>> news:i0gf3s$kou$1@speranza.aioe.org...
>>> nono240 wrote:
>>>
>>>> My CPU has no MMU, very little RAM (8KB), and is running a modified
>>>> FreeRTOS. I'd like to have the ability to "load" and run some code
>>>> from USART/DATAFLASH to FLASH as a RTOS task. Of course, for
>>>> convenience, the compiled code must be fully position independent.
>>>> Using the -fPIC or -fpic option, looking at the assembler, the code
>>>> seems OK : a dynamically computed offset is applied to every
>>>> operations.
>>>>
>>>> BUT, looking deeply, both DATA and CODE are applied the base offset !
>>>> While this is the expected behavior for CODE (running anywhere in
>>>> FLASH), moving my CODE over the entire flash space doesn't mean moving
>>>> my RAM ! This make only sense when executing everything from SDRAM !
>>>>
>>>> I'm looking for a solution to generate position independent *code*,
>>>> but with position dependent *data* using GCC/LD... Any help ?
>>> I find, in resource starved applications, using interpreters
>>> is a big win. If you're loading apps dynamically, I suspect
>>> the speed penalty would be insignificant (esp with careful
>>> choice of language)
>>
>> Any suggestions for such interpretters, Don? Experiences?
>
> Remember, this is c.a.e so, for the most part, you *know*
> what the application is -- and what it will *remain*
> (i.e., we're not looking at an environment where you have to
> be able to handle infinite variety of applications).
>
> In the past, I've written C-ish, PL/M-ish and BASIC-ish interpreters
> along with Forth. Note that you can use these as guidelines
> for a pseudo-language without strictly complying with any
> formal language definition.
>
> E.g., you can opt to implement integer only math instead of
> supporting "doubles", etc. You can force limits to be defined
> for string lengths (static memory allocation). You can
> discount recursion, etc.
>
> The advantage of interpreters has always seemed to be coming
> up with really tight representations of algorithms and
> spend "ROM" instead of needing space in (loadable) RAM...
OK, different aim. In my case I have a scientific instrument that is making various low level measurements. Users, who are typically chemists or biochemists, want real answers not raw measurements. For this they apply "Methods" that turn instrumental measurements into stuff like concentrations. These methods are all pretty standard but there are lots of them, with the occasional new one turning up. I'd prefer the applications chemists to be able to implement the methods so that I can concentrate on measuring femtoamps. So, I'm looking for something scriptable.

Peter
Hi Peter,

Peter Dickerson wrote:
> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
> news:i0hm4l$c6m$1@speranza.aioe.org...
>> Hi Peter,
>>
>> Peter Dickerson wrote:
>>> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
>>> news:i0gf3s$kou$1@speranza.aioe.org...
>>>> nono240 wrote:
>>>>
>>>>> My CPU has no MMU, very little RAM (8KB), and is running a modified
>>>>> FreeRTOS. I'd like to have the ability to "load" and run some code
>>>>> from USART/DATAFLASH to FLASH as a RTOS task. Of course, for
>>>>> convenience, the compiled code must be fully position independent.
>>>>> Using the -fPIC or -fpic option, looking at the assembler, the code
>>>>> seems OK : a dynamically computed offset is applied to every
>>>>> operations.
>>>>>
>>>>> BUT, looking deeply, both DATA and CODE are applied the base offset !
>>>>> While this is the expected behavior for CODE (running anywhere in
>>>>> FLASH), moving my CODE over the entire flash space doesn't mean moving
>>>>> my RAM ! This make only sense when executing everything from SDRAM !
>>>>>
>>>>> I'm looking for a solution to generate position independent *code*,
>>>>> but with position dependent *data* using GCC/LD... Any help ?
>>>> I find, in resource starved applications, using interpreters
>>>> is a big win. If you're loading apps dynamically, I suspect
>>>> the speed penalty would be insignificant (esp with careful
>>>> choice of language)
>>> Any suggestions for such interpretters, Don? Experiences?
>> Remember, this is c.a.e so, for the most part, you *know*
>> what the application is -- and what it will *remain*
>> (i.e., we're not looking at an environment where you have to
>> be able to handle infinite variety of applications).
>>
>> In the past, I've written C-ish, PL/M-ish and BASIC-ish interpreters
>> along with Forth. Note that you can use these as guidelines
>> for a pseudo-language without strictly complying with any
>> formal language definition.
>>
>> E.g., you can opt to implement integer only math instead of
>> supporting "doubles", etc. You can force limits to be defined
>> for string lengths (static memory allocation). You can
>> discount recursion, etc.
>>
>> The advantage of interpreters has always seemed to be coming
>> up with really tight representations of algorithms and
>> spend "ROM" instead of needing space in (loadable) RAM...
>
> OK, different aim. In my case I have a scientific instrument that is making
> various low level measurements. Users, who are typically chemists or
> biochemists, want real answers not raw measurements. For this the apply
> "Methods" that turn instrumental measurements into stuff like
> concentrations. These methods are all pretty standard but there are lots of
> them, with the occasional new one turning up. I'd prefer the applications
> chemists to be able to implement the methods so that I can concentrate on
> measuring femtoamps. So, I'm looking scriptable.
Yes, we wrote/implemented a "QBASIC" for some of our instruments for just this reason (blood assays). Allowed the customer to design new tests without having to contract with us to code them. I.e., we just provided a device that came up with the raw data and let the customer come up with the means of interpreting that data based on the reagents, etc. that he was using in the assay.

Note that you can do this two different ways:
- *source* level interpreter in the instrument
- "bytecode" interpreter in the instrument with an external "compiler/parser".

(I'm talking *really* limited resources, here)

If you have a more fleshy implementation to work with, look at Lua. Lately I am doing a lot with Inferno/Limbo (but would not suggest it for "end users")
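To make the "bytecode interpreter with an external compiler" option concrete, here is a minimal sketch of a stack machine in the spirit described earlier in the thread: integer-only math, a fixed-size stack, no recursion, and a program that can sit in ROM. The opcode set, the names `vm_run` and `demo_program`, and the one-byte immediate encoding are all assumptions for illustration, not anything the posters describe using. Arithmetic overflow is not checked.

```c
#include <stddef.h>
#include <stdint.h>

enum { VM_STACK_MAX = 16 };  /* fixed limit: no dynamic allocation */

/* Hypothetical opcodes; a real instrument would define its own set. */
enum vm_op { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/*
 * Runs `code` (which may live in ROM/flash) and stores the top of stack
 * in *result. Returns 0 on success, -1 on stack misuse, a bad opcode,
 * or a program that ends without OP_HALT.
 */
int vm_run(const uint8_t *code, size_t len, int32_t *result)
{
    int32_t stack[VM_STACK_MAX];
    size_t sp = 0;  /* number of values currently on the stack */
    size_t pc = 0;  /* index of the next byte to decode */

    while (pc < len) {
        switch (code[pc++]) {
        case OP_PUSH:  /* a one-byte immediate operand follows */
            if (pc >= len || sp >= VM_STACK_MAX) return -1;
            stack[sp++] = (int32_t)code[pc++];
            break;
        case OP_ADD:
            if (sp < 2) return -1;
            stack[sp - 2] = stack[sp - 2] + stack[sp - 1];
            sp--;
            break;
        case OP_MUL:
            if (sp < 2) return -1;
            stack[sp - 2] = stack[sp - 2] * stack[sp - 1];
            sp--;
            break;
        case OP_HALT:
            if (sp < 1) return -1;
            *result = stack[sp - 1];
            return 0;
        default:
            return -1;
        }
    }
    return -1;  /* ran off the end without OP_HALT */
}

/* (2 + 3) * 4, as an external "compiler" might emit it. */
const uint8_t demo_program[] = {
    OP_PUSH, 2, OP_PUSH, 3, OP_ADD, OP_PUSH, 4, OP_MUL, OP_HALT
};
```

Calling `vm_run(demo_program, sizeof demo_program, &r)` leaves 20 in `r`. The only RAM the interpreter needs is its small fixed stack; the program bytes are `const` and can stay in flash, which is the trade of "ROM" for loadable RAM mentioned above.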
"D Yuniskis" <not.going.to.be@seen.com> wrote in message 
news:i0kg59$cqf$1@speranza.aioe.org...
> Hi Peter,
[snip]
> Yes, we wrote/implemented a "QBASIC" for some of our instruments
> for just this reason (blood assays). Allowed the customer to
> design new tests without having to contract with us to code
> them. I.e., we just provided a device that came up with
> the raw data and let the customer come up with the means
> of interpreting that data based on the reagents, etc. that
> he was using in the assay.
>
> Note that you can do this two different ways:
> - *source* level interpreter in the instrument
> - "bytecode" interpreter in the instrument with
> an external "compiler/parser".
Yes, I'd go for source since that is conceptually the simplest. Otherwise I need a bytecode compiler somewhere in the machine or on a PC.
> (I'm talking *really* limited resources, here)
>
> If you have a more fleshy implementation to work with,
> look at Lua. Lately I am doing a lot with Inferno/Limbo
> (but would not suggest it for "end users")
I did get Lua linked in but ran out of memory almost immediately. In particular I couldn't measure anything. The problem seems to be that a lot of stuff gets stored in RAM - dictionaries, strings etc. I'd prefer to be able to keep that stuff in Flash only even at the cost of performance.

Peter
