On Thursday, February 6, 2020 at 7:26:52 AM UTC-5, pozz wrote:
> [1] http://www.nadler.com/embedded/newlibAndFreeRTOS.html

Pozz, you mentioned you're trying to use ST's RubeMX.
Be careful, extremely buggy support libraries and examples!!
Latest unbelievable foolishness here:
https://community.st.com/s/question/0D50X0000CBmXufSQF/newlibmalloc-locking-mechanism-to-be-threadsafe
Avoid the ST suggestions!!!

Aaaaarrrrggggg.....

On 09/02/2020 04:42, Paul Rubin wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> No, it hasn't accessed unallocated memory at all.  It did not access
>> anything.  The compiler could see that either the malloc worked fine
>> and the result would be the value 6,
> 
> It's a question of how the compiler implemented the compile-time
> execution.  I hope that it didn't really allocate 6 bytes of memory at
> compile time, and then write past it.  But it makes me wonder.
> 

I'm lost here.

Are you talking about the mistake I made in allocating 4 bytes, rather 
than 4 * sizeof(int) ?  It makes no difference to the code generated 
when that is corrected, and I don't see where "6 bytes" comes from.

The use of "malloc" in the source code bears no direct relation to 
having to allocate dynamic memory in the compiler.  Compile-time 
execution means figuring out what the effect of the code is, and 
simulating it at compile time - it does /not/ mean executing it 
directly.  And in particular, barring bugs in the compiler, it does not 
mean executing undefined behaviour in the compiler - though it can mean 
ignoring it in the source code.

(It would have been nice if the compiler had spotted my mistake and told 
me about it, however.)

David Brown <david.brown@hesbynett.no> writes:
> No, it hasn't accessed unallocated memory at all.  It did not access
> anything.  The compiler could see that either the malloc worked fine
> and the result would be the value 6,

It's a question of how the compiler implemented the compile-time
execution.  I hope that it didn't really allocate 6 bytes of memory at
compile time, and then write past it.  But it makes me wonder.

On 07/02/2020 17:00, Paul Rubin wrote:
> David Brown <david.brown@hesbynett.no> writes:
>> The point is that because the compiler knows what malloc and free do -
>> they are specified in the standards - it can use that knowledge for
>> optimisation.
> 
> In this case it has accessed unallocated memory.  I wonder if there is
> an exploit.  Hmm.
> 

No, it hasn't accessed unallocated memory at all.  It did not access 
anything.

The compiler could see that either the malloc worked fine and the result 
would be the value 6, or the malloc would fail (and return 0) in which 
case the program would have undefined behaviour (accessing a null 
pointer).  The compiler can assume that the programmer doesn't care what 
happens when executing undefined behaviour, and thus giving a result of 
6 is perfectly acceptable there too.  So the best code is simply to 
return 6 without any work at run time.

Ironically, if I had checked the result of malloc() for a null pointer, 
it could not have made this optimisation!
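
For reference, here is a minimal sketch of the kind of function under discussion.  The test code itself is not quoted in this part of the thread, so the name "test", the element count and the stored values are assumptions, chosen only so that the sum comes to 6.  The null check is left out on purpose to mirror the case described above; real code should check the result of malloc().

```c
#include <stdlib.h>

/* Hypothetical reconstruction for illustration; not the code actually posted. */
int test(void)
{
    /* Corrected size: room for 4 ints, not 4 bytes. */
    int *p = malloc(4 * sizeof *p);

    /* Deliberately no NULL check here, to mirror the discussion above. */
    for (int i = 0; i < 4; i++) {
        p[i] = i;               /* stores 0, 1, 2, 3 */
    }

    int sum = 0;
    for (int i = 0; i < 4; i++) {
        sum += p[i];
    }

    free(p);
    return sum;                 /* 0 + 1 + 2 + 3 */
}
```

Whether a particular compiler folds this down to a constant depends on the compiler version, target and options (for example gcc with -O2), so it is worth looking at the generated assembly rather than assuming.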

David Brown <david.brown@hesbynett.no> writes:
> The point is that because the compiler knows what malloc and free do -
> they are specified in the standards - it can use that knowledge for
> optimisation.

In this case it has accessed unallocated memory.  I wonder if there is
an exploit.  Hmm.

On Friday, February 7, 2020 at 8:31:10 AM UTC-5, pozz wrote:
> After your considerations, why use a printf that uses heap? There are 
> other good implementations that don't use heap at all and so are 
> intrinsically thread-safe.

There's a list of alternate printf implementations on my web page you referenced. Other library functions like strtok use malloc and friends.
Again, whatever you do, check the map and MAKE SURE you don't accidentally
drag in non-thread-safe uses of library malloc family...

Hope that helps,
Best Regards, Dave

Il 07/02/2020 10:36, upsidedown@downunder.com ha scritto:
> On Thu, 6 Feb 2020 13:26:47 +0100, pozz <pozzugno@gmail.com> wrote:
> 
>> Usually arm gcc compiler uses newlib (or newlib-nano) for standard C
>> libraries (memset, malloc, printf, time and so on).
>>
>> I sometimes replace newlib functions, because I don't like them. First
>> of all, I replace snprintf because newlib implementation uses malloc and
>> I don't like to use malloc, mostly if it can be avoided.
>> And for printf-like functions, there are a few implementations that
>> don't use malloc.
> 
> Dynamic memory fragmentation can be a serious issue in systems
> that need to run a long time (years or decades) without reboots. For
> this reason it is a good idea to avoid using malloc and free (or at
> least avoid using free :-). Fragmentation occurs when variable-size
> allocations with different lifetimes are used.
> 
> However, functions like printf may allocate some resources at entry
> and release them at exit, so the heap state is the same after the
> printf call returns as it was before. In this case dynamic
> memory is used in the same way as a stack. Much of the functionality
> could have been implemented using stack allocation. For some
> historical reasons (very small stacks on some early processors),
> C-language malloc/free is used much more frequently compared to other
> languages using stack work space.
> 
> In a single-task system, or in a multitasking environment with private
> heaps, this kind of stack-like usage should not cause
> fragmentation.  However, in a multitasking environment with a single
> shared heap, memory fragmentation can occur if some other task makes
> long-lasting allocations while printf is being executed. So in
> reality, the whole printf function should be protected against task
> switching.

After your considerations, why use a printf that uses heap? There are 
other good implementations that don't use heap at all and so are 
intrinsically thread-safe.

On 07/02/2020 10:04, Paul Rubin wrote:
> David Brown <david.brown@hesbynett.no> writes:
>>     int * p = malloc(N);
> 
> (cough) that allocates N bytes, not N ints.

Just checking that you were paying attention :-)

> 
>> gcc compiles test to:
>>
>> test:
>>   mov eax, 6
>>   ret
> 
> Wow!  I think it saw the consts and basically ran the code at compile
> time.
> 

Yes, exactly.

The point is that because the compiler knows what malloc and free do -
they are specified in the standards - it can use that knowledge for
optimisation.

(The exact point at which it will change from run-time calculation to
compile-time calculation is dependent on the compiler, target, options,
etc.)

On Thu, 6 Feb 2020 13:26:47 +0100, pozz <pozzugno@gmail.com> wrote:

>Usually arm gcc compiler uses newlib (or newlib-nano) for standard C 
>libraries (memset, malloc, printf, time and so on).
>
>I sometimes replace newlib functions, because I don't like them. First 
>of all, I replace snprintf because newlib implementation uses malloc and 
>I don't like to use malloc, mostly if it can be avoided.
>And for printf-like functions, there are a few implementations that 
>don't use malloc.

Dynamic memory fragmentation can be a serious issue in systems
that need to run a long time (years or decades) without reboots. For
this reason it is a good idea to avoid using malloc and free (or at
least avoid using free :-). Fragmentation occurs when variable-size
allocations with different lifetimes are used.

However, functions like printf may allocate some resources at entry
and release them at exit, so the heap state is the same after the
printf call returns as it was before. In this case dynamic
memory is used in the same way as a stack. Much of the functionality
could have been implemented using stack allocation. For some
historical reasons (very small stacks on some early processors),
C-language malloc/free is used much more frequently compared to other
languages using stack work space.

In a single-task system, or in a multitasking environment with private
heaps, this kind of stack-like usage should not cause
fragmentation.  However, in a multitasking environment with a single
shared heap, memory fragmentation can occur if some other task makes
long-lasting allocations while printf is being executed. So in
reality, the whole printf function should be protected against task
switching.
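
To illustrate the point about stack work space, here is a minimal sketch of a formatting helper that uses only a small local buffer and never touches the heap.  The name format_uint and its interface are hypothetical, made up for this example; it is not taken from newlib or from any of the alternative printf implementations mentioned in this thread.

```c
#include <stddef.h>

/*
 * Hypothetical helper: write the decimal form of value into out, which has
 * room for cap bytes.  Returns the number of digits written (excluding the
 * terminating NUL), or 0 if the result does not fit.
 */
size_t format_uint(char *out, size_t cap, unsigned long value)
{
    /* Stack work space: 3 decimal digits per byte is always enough. */
    char scratch[3 * sizeof value + 1];
    size_t n = 0;

    /* Produce the digits in reverse order. */
    do {
        scratch[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (cap < n + 1) {
        return 0;               /* no room for digits plus NUL */
    }

    /* Copy them out in the right order. */
    for (size_t i = 0; i < n; i++) {
        out[i] = scratch[n - 1 - i];
    }
    out[n] = '\0';
    return n;
}
```

Because all state lives in the caller's buffer and on the stack, a function like this needs no locking and cannot fragment a shared heap.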

On 06/02/2020 23:23, pozz wrote:
> Il 06/02/2020 21:41, David Brown ha scritto:
>> On 06/02/2020 16:06, pozz wrote:
>>> Il 06/02/2020 14:58, David Brown ha scritto:
>>
>>>
>>>> nor can you implement memcpy
>>>> or memmove, due to the type aliasing and effective type rules.  If you
>>>> want to be sure of problem-free code that is safe regardless of
>>>> optimisation, link-time optimisation, new generations of compilers,
>>>> etc., then you'll be quite careful and make good use of gcc attributes.
>>>
>>> What about copying byte by byte?
>>> Here[1] you can see newlib memcpy implementation. If
>>> PREFER_SIZE_OVER_SPEED or __OPTIMIZE_SIZE__ is defined, the
>>> implementation is really copying byte by byte.
>>>
>>> I don't really know how newlib used by my compiler (CubeIDE from ST)
>>> was compiled, maybe I'm using dumb version of memcpy already.
>>>
>>> This is an extract from a listing:
>>>
>>> 08025850 <memcpy>:
>>>  8025850:   b510        push    {r4, lr}
>>>  8025852:   1e43        subs    r3, r0, #1
>>>  8025854:   440a        add     r2, r1
>>>  8025856:   4291        cmp     r1, r2
>>>  8025858:   d100        bne.n   802585c <memcpy+0xc>
>>>  802585a:   bd10        pop     {r4, pc}
>>>  802585c:   f811 4b01   ldrb.w  r4, [r1], #1
>>>  8025860:   f803 4f01   strb.w  r4, [r3, #1]!
>>>  8025864:   e7f7        b.n     8025856 <memcpy+0x6>
>>>
>>> I'm not an expert of assembly, but it seems to me it is implemented
>>> in the simple and not optimized way.
>>>
>>
>> It is not the actual copying that is the problem - copying by char is
>> simple and safe (though often inefficient).  The issue is that the C
>> standards say memcpy also copies the effective type in certain
>> circumstances - there is no way to specify that in C, and it is
>> therefore a special feature of the library memcpy.
> 
> Could you make an example? I didn't understand.

Suppose you have a block "b" of memory allocated on the heap with malloc
- it has no "declared type" because it was not part of a C-defined
object.  You are free to store data of any kind in "b", and it takes on
a type based on the access you used to store to "b" (unless you use
character type access, which leaves it untyped).  Let's say you treat
"b" as an array of floats and fill it up - now its effective type is
float[].

Suppose you have another C object or array "s" with a specific type from
somewhere - such as an array of char* pointers.

You want to copy the contents of "s" into "b".

You could do this in several ways:

1. Read from "s" as char* pointers, converting to a float using a union,
and write it to "b".  Then "b" is still an array of floats and the
compiler knows that any access to it as an array of char* pointers is
undefined behaviour - it can assume it can't happen.  Beware the nasal
demons!

2. Make a pointer "char* * p = (char* *) b" and use that as the
destination when copying from "s" to "b".  The compiler knows that "b"
is now an array of char* pointers, and can be accessed as such.  Everything
works, but you need a specific copying function each time.

3. Make a generic function that copies using unsigned char, and call
that to copy from s to b.  Then b takes on the effective type of s, and
so b is now an array of char* pointers.  Everything works, but copying
is inefficient.

4. Make a generic function that copies using uint32_t for speed, and
call that to copy from s to b.  Then b becomes an array of uint32_t, and
accessing it as an array of char* pointers is undefined behaviour.
Nasal demons again.

5. Call the standard library memcpy.  Then b gets the effective type of
s, and everything works.  This is true whether the compiler generates a
local loop, or calls the library function, and it is true whether the
copying is done by byte or in larger lumps.  The library memcpy is
special here - you cannot duplicate that behaviour in standard C.
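
As a rough sketch of option 5: the block below is first written as floats and then overwritten with memcpy from an array of pointers, after which it is read back as pointers.  The function name, the element count and the strings are all made up for the illustration.

```c
#include <stdlib.h>
#include <string.h>

#define COUNT 4  /* hypothetical element count */

/* Returns s[2] as read back through the heap block, or NULL if malloc fails. */
const char *copy_via_memcpy(void)
{
    static const char *s[COUNT] = { "zero", "one", "two", "three" };

    /* Make the block big enough for either element type. */
    size_t elem = sizeof(float) > sizeof s[0] ? sizeof(float) : sizeof s[0];
    void *b = malloc(COUNT * elem);
    if (b == NULL) {
        return NULL;
    }

    /* Storing floats gives the block an effective type of float. */
    float *f = b;
    for (size_t i = 0; i < COUNT; i++) {
        f[i] = (float)i;
    }

    /* Option 5: memcpy carries the effective type of s over to b. */
    memcpy(b, s, sizeof s);

    /* Reading b as pointers to const char is now fine. */
    const char **p = b;
    const char *result = p[2];

    free(b);
    return result;              /* points at a string literal, still valid */
}
```

Replacing the memcpy call with a home-made loop over uint32_t (option 4) would copy the same bytes, but the read through p would then no longer be covered by the effective type rules.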

This kind of thing - type based alias analysis and the effective type
rules in C - is difficult to get right.  And it is not often that the
compiler can use this extra information for optimisation.  But sometimes
it can.  And sometimes it uses it for an optimisation that is correct
according to the C code you wrote, but not according to what you wanted.

Understanding the rules is hard, and sometimes playing by the rules is
even harder, so one solution is to change the rules.  The
"-fno-strict-aliasing" flag in gcc changes the semantics of C to say
that the effective type of an object is always the type used to access
it - this simplifies things a lot here, at the cost of occasionally
missed optimisation opportunities.  For example, the Linux kernel is
always compiled with "-fno-strict-aliasing".

(Note that this flag does not help with the aliasing issue with
home-made malloc, as that's a different thing entirely.)

> 
> Anyway as you can see, newlib just implements memcpy in pure C language
> when compiled without optimizations. Are you saying it's bugged?

No - it can be treated as special because it is the standard library for
your implementation.

> 
> 
>> A homemade memcpy does not have that same feature.  (In a similar
>> vein, there is no way to get memory in standard C that has "no
>> declared type" except via the library malloc and friends - a homemade
>> malloc won't do.)  I am not sure what the best solution is here.
>>
>> Anyway, for memcpy make sure the compiler can use the builtin versions
>> where possible (avoid -ffreestanding, or use -fbuiltin) as this will
>> give far better code.
>>
> 
> I don't use -ffreestanding, but I don't know if I'm using -fbuiltin.
> Anyway you are suggesting to use builtin functions that are functions
> built *in* the compiler and not in the newlib.
> 
> Another reason to consider newlib useless.
> 

When you use one of the common "small" functions in the C standard
library, like memcpy, memset, strcat, etc., the compiler knows what they
do without knowing the source.  If it can make smaller or faster code
inline with the same effect as specified in the standards, then it may
do so.  Typically for memcpy that means the compiler knows the size of
the copy and the alignments at compile time.  For example:

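#include <stdint.h>
#include <string.h>

/* Return the object representation of f as a 32-bit integer. */
uint32_t rawfloat(float f) {
    uint32_t u;

    memcpy(&u, &f, sizeof(u));
    return u;
}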

This will be turned into a register move (if needed, depending on the
cpu), with nothing stored in memory and no library calls made.  And
unlike faffing around with pointer casts, it is correct C code.  And
unlike using a type-punning union, it is correct C++ code as well as
correct C code.

But more general calls to memcpy will be passed on to the library function.
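
As a small sketch of the same idiom in both directions: the function names here are made up for the example, and the bit patterns mentioned in the note afterwards assume float is a 32-bit IEEE 754 type, which the C standard does not guarantee.

```c
#include <stdint.h>
#include <string.h>

/* The trick only makes sense if both types have the same size. */
_Static_assert(sizeof(uint32_t) == sizeof(float), "float must be 32 bits wide");

/* Hypothetical name: float -> raw bits, via a fixed-size memcpy. */
uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Hypothetical name: raw bits -> float, the inverse of the above. */
float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}
```

On an IEEE 754 target, float_to_bits(1.0f) gives 0x3F800000, and a round trip through both functions returns the original value.  Because the size is a compile-time constant, these are the sort of memcpy calls a compiler can expand inline when builtins are enabled.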