pozz <pozzugno@gmail.com> wrote:
> In one of my projects that run on a Cortex-M0+ MCU, I have a few arrays 
> of structs. Now I need to increase the size of the arrays, but I'm out 
> of RAM, so I'm searching for ways to save some space in RAM.
> 
> One simple way is to pack the structs, for example with
> 
>   __attribute__((packed))
> 
> in gcc. I can save some padding bytes (that waste some memory) for each 
> element of the array, so the total amount of saved space could be enough.
> 
> I know the use of a packed struct forces the compiler to generate 
> slower code, because of misaligned accesses to its members.
> However, besides being slower, is the result correct in all cases?
> 

Post the structure with the variable names changed to Greek gods.
Add comments to each line with the range of values stored in each variable.

On 26/10/2021 16:11, David Brown wrote:
> On 26/10/2021 15:22, pozz wrote:
>> On 26/10/2021 13:57, David Brown wrote:
>> [...]
>>> Another alternative is that instead of having an array of structs, you
>>> can split the data up into two or three arrays each containing part of
>>> the data.  (This is often done on big systems with cache, as it can lead
>>> to massive speed increases.)
>>
>> Can you explain this in more detail? Thanks
>>
> 
> Change:
> 
> 	typedef struct {
> 		uint32_t counter;
> 		bool valid;
> 	} counted_thing;
> 
> 	counted_thing things[1000];
> 
> into:
> 
> 	uint32_t thing_counters[1000];
> 	bool thing_valids[1000];
> 
> That turns an 8000 byte array into two arrays of 4000 bytes and 1000
> bytes respectively.

Now I got your point.


> No padding, inefficient packing, or wasted space.  Indeed, with a bit
> more effort the thing_valids[] array could perhaps be packed into 125 bytes.
> 
> This is a big issue in game programming - there has been a move away
> from nice C++ objects that are held in an array, to integrating the
> array handling with the object handling so that the data can be arranged
> in a more cache-friendly manner.  It's especially useful when your
> objects have critical data that is accessed a lot (such as position) and
> less critical data that is more rarely used (such as cost or name) -
> separate arrays means you can run through the entire array of object
> positions without filling up your caches with low-priority name data.
> 
> But in your case, you can use it to save space in RAM (and possibly make
> stepping through the data a little more efficient as the stride sizes
> are more likely to be powers of two).

On 26/10/2021 15:22, pozz wrote:
> On 26/10/2021 13:57, David Brown wrote:
> [...]
>> Another alternative is that instead of having an array of structs, you
>> can split the data up into two or three arrays each containing part of
>> the data.  (This is often done on big systems with cache, as it can lead
>> to massive speed increases.)
> 
> Can you explain this in more detail? Thanks
> 

Change:

	typedef struct {
		uint32_t counter;
		bool valid;
	} counted_thing;

	counted_thing things[1000];

into:

	uint32_t thing_counters[1000];
	bool thing_valids[1000];

That turns an 8000 byte array into two arrays of 4000 bytes and 1000
bytes respectively.

No padding, inefficient packing, or wasted space.  Indeed, with a bit
more effort the thing_valids[] array could perhaps be packed into 125 bytes.

This is a big issue in game programming - there has been a move away
from nice C++ objects that are held in an array, to integrating the
array handling with the object handling so that the data can be arranged
in a more cache-friendly manner.  It's especially useful when your
objects have critical data that is accessed a lot (such as position) and
less critical data that is more rarely used (such as cost or name) -
separate arrays means you can run through the entire array of object
positions without filling up your caches with low-priority name data.

But in your case, you can use it to save space in RAM (and possibly make
stepping through the data a little more efficient as the stride sizes
are more likely to be powers of two).
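
As a rough sketch of the "bit more effort" version: the counters stay in
their own array and the valid flags shrink to one bit each, so 1000 flags
fit in 125 bytes. The names THING_COUNT, thing_valid_bits, thing_is_valid
and thing_set_valid are made up for illustration, and the size checks
assume 8-bit bytes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define THING_COUNT 1000u   /* hypothetical element count from the example */

/* Counters in their own array: 4 bytes each, no padding between elements. */
static uint32_t thing_counters[THING_COUNT];

/* One bit per element, rounded up to whole bytes: 1000 bits -> 125 bytes. */
static uint8_t thing_valid_bits[(THING_COUNT + 7u) / 8u];

static_assert(sizeof thing_counters == 4000, "counters should take 4000 bytes");
static_assert(sizeof thing_valid_bits == 125, "flags should take 125 bytes");

/* Hypothetical helper: read the valid flag of element i. */
static bool thing_is_valid(size_t i)
{
    return ((thing_valid_bits[i / 8u] >> (i % 8u)) & 1u) != 0u;
}

/* Hypothetical helper: set or clear the valid flag of element i. */
static void thing_set_valid(size_t i, bool valid)
{
    uint8_t mask = (uint8_t)(1u << (i % 8u));

    if (valid) {
        thing_valid_bits[i / 8u] |= mask;
    } else {
        thing_valid_bits[i / 8u] &= (uint8_t)~mask;
    }
}
```

The cost is a shift and a mask on every flag access, plus a
read-modify-write when setting a flag, so this only pays off when RAM is
tighter than cycles (and it needs care if an interrupt touches the same
byte).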

On 26/10/2021 13:57, David Brown wrote:
[...]
> Another alternative is that instead of having an array of structs, you
> can split the data up into two or three arrays each containing part of
> the data.  (This is often done on big systems with cache, as it can lead
> to massive speed increases.)

Can you explain this in more detail? Thanks

In one of my projects that run on a Cortex-M0+ MCU, I have a few arrays 
of structs. Now I need to increase the size of the arrays, but I'm out 
of RAM, so I'm searching for ways to save some space in RAM.

One simple way is to pack the structs, for example with

   __attribute__((packed))

in gcc. I can save some padding bytes (that waste some memory) for each 
element of the array, so the total amount of saved space could be enough.

I know the use of a packed struct forces the compiler to generate 
slower code, because of misaligned accesses to its members.
However, besides being slower, is the result correct in all cases?
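
To make the question concrete, here is a minimal sketch with a made-up
struct (natural_thing and packed_thing are hypothetical, not my real
types). It assumes gcc, a 1-byte bool and a 4-byte-aligned uint32_t. As
far as I understand, access through the packed struct type itself is
handled by the compiler, while a plain uint32_t pointer to a misaligned
member is the risky part on a core without unaligned access support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Natural layout: 4 + 1 bytes of data, padded to 8 on typical gcc targets. */
typedef struct {
    uint32_t counter;
    bool valid;
} natural_thing;

/* Packed layout: no padding, so counter sits at the misaligned offset 1. */
typedef struct __attribute__((packed)) {
    bool valid;
    uint32_t counter;
} packed_thing;

static_assert(sizeof(packed_thing) == 5, "packed struct should have no padding");

/* Reading through the packed type: the compiler knows the member may be
 * misaligned and emits accesses that are safe for it. */
static uint32_t packed_get_counter(const packed_thing *t)
{
    return t->counter;
}

/* Writing through the packed type is handled the same way. */
static void packed_bump(packed_thing *t)
{
    t->counter = t->counter + 1u;
}

/* The pattern to avoid (shown only as a comment):
 *
 *     uint32_t *p = &t->counter;   // gcc can warn: -Waddress-of-packed-member
 *     *p = 0;                      // may be compiled as one 32-bit store
 *
 * The pointer type no longer tells the compiler about the misalignment.
 */
```

So the saving here would be 3 bytes per element (8 down to 5), at the
price of byte-wise accesses to counter.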