arm-gcc: pointer to constant string

Started by pozz September 14, 2018
--- 1.c ---
void foo(void) {
   dummy("hello");
}

--- 2.c ---
static const char *s;

void dummy(const char *ss) {
   s = ss;
}



"hello" is a constant string declared inside function foo().  Is there a 
guarantee that its pointer (passed to dummy()) is valid after exiting 
foo()?  In other words, is the string on the stack (in this case it will 
not be valid after foo()) or in Flash (so it is permanent)?
On 14/09/18 09:50, pozz wrote:
> --- 1.c ---
> void foo(void) {
>    dummy("hello");
> }
>
> --- 2.c ---
> static const char *s;
>
> void dummy(const char *ss) {
>    s = ss;
> }
>
> "hello" is a constant string declared inside function foo().  Is there a
> guarantee that its pointer (passed to dummy()) is valid after exiting
> foo()?  In other words, is the string on the stack (in this case it will
> not be valid after foo()) or in Flash (so it is permanent)?
It is not guaranteed by C, AFAIUI - the literal is only valid during its lifetime, which ends when the call to "dummy" returns. However, gcc always places such strings in flash and it will remain valid throughout the program's lifetime. I cannot see that changing in any future version of gcc. But note that the same does not apply to pointers to other objects that might reasonably be put on the stack. And it may not apply to targets like the AVR where strings get copied from flash to ram before use.
pozz <pozzugno@gmail.com> wrote:
> --- 1.c ---
> void foo(void) {
>    dummy("hello");
> }
>
> --- 2.c ---
> static const char *s;
>
> void dummy(const char *ss) {
>    s = ss;
> }
>
> "hello" is a constant string declared inside function foo().  Is there a
> guarantee that its pointer (passed to dummy()) is valid after exiting
> foo()?
Yes. K&R II (The C Programming Language, 2nd Ed), says that (A2.6) "A string has type 'array of characters' and storage class 'static'." and (A4) "Static objects [...] retain their values across exit from and reentry to functions and blocks."
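A minimal sketch of the case under discussion, folded into one file so it can be built on a host compiler. For what it is worth, C11 (N1570) 6.4.5p6 says a string literal initializes an array of static storage duration, which agrees with the K&R II passages quoted above. The helper check_saved_literal() is hypothetical and not part of the original post, and dummy() is made static here only to keep everything in one translation unit.

```c
#include <assert.h>
#include <string.h>

static const char *s;                 /* as in 2.c */

static void dummy(const char *ss)     /* as in 2.c, static only for this one-file sketch */
{
    s = ss;
}

static void foo(void)                 /* as in 1.c */
{
    dummy("hello");
}

/* Hypothetical helper: returns 0 if the pointer saved by dummy()
   still reads back "hello" after foo() has returned. */
int check_saved_literal(void)
{
    foo();
    /* foo() has returned; s still points at the literal's array. */
    assert(s != NULL);
    return strcmp(s, "hello") == 0 ? 0 : 1;
}
```

Calling check_saved_literal() from main() and testing for 0 exercises the pointer after foo() has exited. A passing run on one compiler is not a proof, of course; the guarantee comes from the standard text, not from the experiment.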
> In other words, is the string on the stack (in this case it will
> not be valid after foo()) or in Flash (so it is permanent)?
-- Nils M Holm < n m h @ t 3 x . o r g > www.t3x.org
On 14/09/18 11:47, Nils M Holm wrote:
> pozz <pozzugno@gmail.com> wrote:
>> --- 1.c ---
>> void foo(void) {
>>    dummy("hello");
>> }
>>
>> --- 2.c ---
>> static const char *s;
>>
>> void dummy(const char *ss) {
>>    s = ss;
>> }
>>
>> "hello" is a constant string declared inside function foo().  Is there a
>> guarantee that its pointer (passed to dummy()) is valid after exiting
>> foo()?
>
> Yes. K&R II (The C Programming Language, 2nd Ed), says that
K&R does not define the C language. It was an approximation to a definition until ANSI (then ISO) made the standard, and has not been relevant for decades. The standards copied most of the features and rules given in K&R, but were not absolute about it. Accurate draft versions of the C standards are freely available on the net - the current document of choice is N1570 for C11.
>
> (A2.6) "A string has type 'array of characters' and storage class 'static'."
>
No, it is not - not for decades.  In C, "string" is defined in 7.1.1 of the current standard as "A string is a contiguous sequence of characters terminated by and including the first null character."  In particular, there is /no/ mention of storage class, and strings are independent of the storage class.

Also, here the OP is asking about the practical lifetime of a string literal, not a string.  Literals, like other constants (using the C standards definition of "constant"), do not have lifetimes or scopes like objects - the nearest you have is the period of validity of the temporary pointer to the string literal.  And in the OP's code, that is during the call to "dummy".

A C compiler /can/ implement the function "foo" in this manner:

void foo(void) {
   char s[STRING_LIT_LENGTH_HELLO];   // On stack
   copy_string_literal_from_compressed_storage(&s,
         STRING_LIT_IDENTIFIER_HELLO, STRING_LIT_LENGTH_HELLO);
   dummy(s);
}

That would - AFAIUI - be legal in C.  It is not the way gcc does it on any targets I have seen (not even on the AVR, where the string must be copied to ram in a case like this).  In particular, for the ARM it would be very strange for a compiler to do anything that would not involve having the string literal at a fixed place in read-only flash.
> and
>
> (A4) "Static objects [...] retain their values across exit from and reentry
> to functions and blocks."
>
The current C definition of static objects' lifetime is given in 6.2.4p3, and is more accurate than the old K&R notes. They are not relevant here, however, since string literals are not objects. (And if it is relevant to the OP, the rules for C++ are a bit more complicated. But C-string literals will be handled the same way by the ARM gcc compiler.)
>> In other words, is the string on the stack (in this case it will
>> not be valid after foo()) or in Flash (so it is permanent)?
On 14.9.18 10:50, pozz wrote:
> --- 1.c ---
> void foo(void) {
>    dummy("hello");
> }
>
> --- 2.c ---
> static const char *s;
>
> void dummy(const char *ss) {
>    s = ss;
> }
>
> "hello" is a constant string declared inside function foo().  Is there a
> guarantee that its pointer (passed to dummy()) is valid after exiting
> foo()?  In other words, is the string on the stack (in this case it will
> not be valid after foo()) or in Flash (so it is permanent)?
You can look it up yourself: just add -Wa,-adhlms=myfile.lst to your command line (this is for myfile.c).

Here is your example, slightly changed to prevent the optimizer from making the code disappear entirely:

static const char *s;

void dummy(const char *ss) {
   s = ss;
}

const char *foo(void) {
   dummy("hello");
   return s;         /* added to keep the code */
}

And the compiled code for Cortex-M3 (somewhat shortened):

   1                 .cpu cortex-m3
  11                 .file "pozz.c"
  12                 .section .text.dummy,"ax",%progbits
  13                 .align 1
  14                 .global dummy
  15                 .syntax unified
  16                 .thumb
  17                 .thumb_func
  18                 .fpu softvfp
  20              dummy:
  21                 @ args = 0, pretend = 0, frame = 0
  22                 @ frame_needed = 0, uses_anonymous_args = 0
  23                 @ link register save eliminated.
  24 0000 7047       bx lr
  26                 .section .text.foo,"ax",%progbits
  27                 .align 1
  28                 .global foo
  29                 .syntax unified
  30                 .thumb
  31                 .thumb_func
  32                 .fpu softvfp
  34              foo:
  35                 @ args = 0, pretend = 0, frame = 0
  36                 @ frame_needed = 0, uses_anonymous_args = 0
  37                 @ link register save eliminated.
  38 0000 0048       ldr r0, .L3
  39 0002 7047       bx lr
  40              .L4:
  41                 .align 2
  42              .L3:
  43 0004 00000000   .word .LC0
  45                 .section .rodata.str1.1,"aMS",%progbits,1
  46              .LC0:
  47 0000 68656C6C   .ascii "hello\000"
  47      6F00
  48                 .ident "GCC: (15:6.3.1+svn253039-1build1) 6.3.1 20170620"

---

The string constant goes to its own section (.rodata), which is then located as the linker script directs (often in ROM/Flash).

-- 

-TV
Il 14/09/2018 11:18, David Brown ha scritto:
> On 14/09/18 09:50, pozz wrote:
>> --- 1.c ---
>> void foo(void) {
>>    dummy("hello");
>> }
>>
>> --- 2.c ---
>> static const char *s;
>>
>> void dummy(const char *ss) {
>>    s = ss;
>> }
>>
>> "hello" is a constant string declared inside function foo().  Is there a
>> guarantee that its pointer (passed to dummy()) is valid after exiting
>> foo()?  In other words, is the string on the stack (in this case it will
>> not be valid after foo()) or in Flash (so it is permanent)?
>
> It is not guaranteed by C, AFAIUI - the literal is only valid during
> its lifetime, which ends when the call to "dummy" returns.  However, gcc
> always places such strings in flash and it will remain valid throughout
> the program's lifetime.  I cannot see that changing in any future
> version of gcc.
If I were pedantic and wanted to be sure the pointer to the string is valid after dummy(), what should I do?  Do I declare the string static (even *in* the function)?

void foo(void) {
   static const char s[] = "hello";
   dummy(s);
}
> But note that the same does not apply to pointers to other objects that
> might reasonably be put on the stack.  And it may not apply to targets
> like the AVR where strings get copied from flash to ram before use.
Again the solution should be to explicitly declare objects as static.
On 14/09/18 13:05, pozz wrote:
> On 14/09/2018 11:18, David Brown wrote:
>> On 14/09/18 09:50, pozz wrote:
>>> --- 1.c ---
>>> void foo(void) {
>>>    dummy("hello");
>>> }
>>>
>>> --- 2.c ---
>>> static const char *s;
>>>
>>> void dummy(const char *ss) {
>>>    s = ss;
>>> }
>>>
>>> "hello" is a constant string declared inside function foo().  Is there a
>>> guarantee that its pointer (passed to dummy()) is valid after exiting
>>> foo()?  In other words, is the string on the stack (in this case it will
>>> not be valid after foo()) or in Flash (so it is permanent)?
>>
>> It is not guaranteed by C, AFAIUI - the literal is only valid during
>> its lifetime, which ends when the call to "dummy" returns.  However, gcc
>> always places such strings in flash and it will remain valid throughout
>> the program's lifetime.  I cannot see that changing in any future
>> version of gcc.
>
> If I was pedantic and I want to be sure the pointer to string is valid
> after dummy(), what should I do?  Do I declare the string static (even
> *in* the function)?
>
> void foo(void) {
>    static const char s[] = "hello";
>    dummy(s);
> }
It should, I think, be fine to say:

void foo(void) {
   static const char* s = "hello";
   dummy(s);
}

I believe that will mean that the compiler is obliged to make the string literal available for the lifetime of "s", which is program lifetime.

But I would be confident for arm-gcc to write it as you did, as the compiler implements strings in a way that keeps them all for the lifetime of the program.  (I assume you are not doing anything nasty and undefined, like trying to change the values of strings somewhere.)

And it is also possible that I have this wrong, and that the compiler /must/ keep the string literal around for the lifetime of the program.  comp.lang.c would be the newsgroup to look for extra opinions here.
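To compare the two "static" spellings discussed here side by side, a small host-side sketch follows. The names saved, foo_array, foo_pointer and check_both_variants are made up for this illustration and appear nowhere in the thread. In the array form the characters live in a named static array; in the pointer form only the pointer is the named static object and it refers to the literal's own array.

```c
#include <assert.h>
#include <string.h>

static const char *saved;              /* plays the role of "s" in 2.c */

static void dummy(const char *ss)
{
    saved = ss;
}

/* Variant from pozz: the array itself is a static object. */
static void foo_array(void)
{
    static const char s[] = "hello";
    dummy(s);
}

/* Variant from the reply: a static pointer to the string literal. */
static void foo_pointer(void)
{
    static const char *s = "hello";
    dummy(s);
}

/* Hypothetical helper: returns 0 if both variants leave a pointer
   that still reads back "hello" after the function has returned. */
int check_both_variants(void)
{
    int ok_array;
    int ok_pointer;

    foo_array();
    assert(saved != NULL);
    ok_array = strcmp(saved, "hello") == 0;

    foo_pointer();
    assert(saved != NULL);
    ok_pointer = strcmp(saved, "hello") == 0;

    return (ok_array && ok_pointer) ? 0 : 1;
}
```

One practical difference, as far as I can tell: the array form gives sizeof s == 6 inside foo_array(), while the pointer form only gives the size of a pointer.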
>> But note that the same does not apply to pointers to other objects that
>> might reasonably be put on the stack.  And it may not apply to targets
>> like the AVR where strings get copied from flash to ram before use.
>
> Again the solution should be to explicitly declare objects as static.
For the AVR, the solution is usually to use the macros and attributes needed to access strings directly from flash to avoid wasting ram space. But it's a slightly awkward target.
On 2018-09-14 David Brown wrote in comp.arch.embedded:
> On 14/09/18 11:47, Nils M Holm wrote:
>>
>> Yes. K&R II (The C Programming Language, 2nd Ed), says that
>
> K&R does not define the C language.  It was an approximation to a
> definition until ANSI (then ISO) made the standard, and has not been
> relevant for decades.
You must be talking about K&R; Nils specifically said K&R II.  That edition has "ANSI C" printed in large capitals on the cover.  A quote from the preface:

"This second edition of 'The C Programming Language' describes C as defined by the ANSI standard"

That standard is almost identical to C90 IIRC.  And that standard is, however old, still relevant.  The Keil ARM compiler for instance defaults to C90 and you have to specifically enable C99.  (and nothing newer if you don't want to set it to C++ 2003)

So don't throw out your K&R II just yet!
I have my copy within an arms reach. ;-)

-- 
Stef    (remove caps, dashes and .invalid from e-mail address to reply by mail)

In 1914, the first crossword puzzle was printed in a newspaper.  The creator received $4000 down ... and $3000 across.
On 14/09/18 14:52, Stef wrote:
> On 2018-09-14 David Brown wrote in comp.arch.embedded:
>> On 14/09/18 11:47, Nils M Holm wrote:
>>>
>>> Yes. K&R II (The C Programming Language, 2nd Ed), says that
>>
>> K&R does not define the C language.  It was an approximation to a
>> definition until ANSI (then ISO) made the standard, and has not been
>> relevant for decades.
>
> You must be talking about K&R; Nils specifically said K&R II.
> That edition has "ANSI C" printed in large capitals on the cover.
> A quote from the preface:
> "This second edition of 'The C Programming Language' describes C
> as defined by the ANSI standard"
Exactly - it /describes/ the C language defined by the standard. It does not define the language, or the standard, or give a complete or accurate set of rules for it. That is what the /standard/ is for. K&R (all editions) are /tutorials/, not standards or language definitions. (Equally, the standards are not tutorials, and say as much in their forewords.) K&R was updated for the second edition to contain much of what is often incorrectly termed "ANSI C", but it was never expected to be exact.
>
> That standard is almost identical to C90 IIRC.  And that standard is,
> however old, still relevant.  The Keil ARM compiler for instance defaults
> to C90 and you have to specifically enable C99.  (and nothing newer if you
> don't want to set it to C++ 2003)
>
> So don't throw out your K&R II just yet!
> I have my copy within an arms reach. ;-)
>
K&R is considered a "classic" in terms of quality technical writing.  It is not a good book for learning modern C.  It is not a good book for learning embedded programming.  It is not a good book to use as a reference for the details of the C language.  It is better than many others, but it is badly outdated, contains some horrible advice and examples, is strongly oriented towards command-line Unix software from the 80's and 90's, which is completely inappropriate for embedded development, and C is almost always a poor choice for modern day programming of the sort of tasks targeted in the book.

It is true that many compilers default to C90.  That is not an excuse for using C90 - if you are serious about C development, you should be using /your/ choice of C standard, not the default from the compiler.  It is an unfortunate reality that some people still have to use compilers that don't support C99 - that is, obviously, a good reason for using C90.  Other than that, /you/ should choose your version of the language - and you should choose at least C99 because it lets you write higher quality software.  And if your tool vendor does not support C11 by now, complain to them.

The year is 2018.  Why anyone would think that a 30 year old tutorial book is a better choice of reference than the current official standards is beyond my comprehension.

So K&R (I or II) is an interesting read, and interesting history - but I would not consider it a way to learn good embedded C programming, nor would I consider it a useful reference.

And if you want more readable references then there are plenty available online.  My recommendation (for C and C++) is <https://en.cppreference.com/>.
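On the point above about picking the C standard yourself rather than taking the compiler's default: with gcc the choice is made with -std= (for example -std=c99 or -std=c11), and a source file can report which mode it was built in through the predefined macro __STDC_VERSION__. The sketch below is only an illustration; C_STANDARD_NAME, c_standard_name() and check_standard_name() are names invented here.

```c
#include <assert.h>
#include <string.h>

/* __STDC_VERSION__ is not defined in C90 mode; later standards give
   it a value such as 199901L (C99) or 201112L (C11). */
#if !defined(__STDC_VERSION__)
#define C_STANDARD_NAME "C90"
#elif __STDC_VERSION__ >= 201112L
#define C_STANDARD_NAME "C11 or later"
#elif __STDC_VERSION__ >= 199901L
#define C_STANDARD_NAME "C99"
#else
#define C_STANDARD_NAME "C95"
#endif

/* Returns the name of the language mode this file was compiled in. */
const char *c_standard_name(void)
{
    return C_STANDARD_NAME;
}

/* Hypothetical self-check: returns 0 if a non-empty name was selected. */
int check_standard_name(void)
{
    const char *name = c_standard_name();
    assert(name != NULL);
    return strlen(name) > 0 ? 0 : 1;
}
```

A project that requires C99 could turn the same test into a hard failure with an #error directive in the C90 branch, so a wrong compiler setting is caught at build time.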
On 14/09/18 15:04, David Brown wrote:
> [...]
> And if you want more readable references then there are plenty available
> online.  My recommendation (for C and C++) is
> <https://en.cppreference.com/>.
Mine is the C++ FQA (sic): http://yosefk.com/c++fqa/ The FQA isn't completely up to date; but then the nice thing about C "standards" is that there are so many to choose from.