Suppose I want to have an array that looks something like the following using 8-bit unsigned integers:
uint8_t test[4][6] =
{
    { 5, 0, 1, 2, 3, 4 }, // 5 items numbered 0, 1, 2, 3, 4
    { 4, 1, 2, 3, 4, 0 }, // 4 items numbered 1, 2, 3, 4 with 1x0 fill
    { 1, 2, 0, 0, 0, 0 }, // 1 item numbered 2 with 4x0 fills
    { 2, 3, 4, 0, 0, 0 }  // 2 items numbered 3 and 4 with 3x0 fills
};
The first number in each row says how many of the remaining numbers hold valid values (which can range from 0 to 4), while any unused slots are set to 0.
The thing is that 0 is one of our valid values, but we're also using it as a "fill" value or place-holder for unused items.
Would it be better to employ the following with 8-bit signed integers and -1 as the unused/fill value?
int8_t test[4][6] =
{
    { 5, 0, 1, 2, 3, 4 },      // 5 items numbered 0, 1, 2, 3, 4
    { 4, 1, 2, 3, 4, -1 },     // 4 items numbered 1, 2, 3, 4 with 1x-1 fill
    { 1, 2, -1, -1, -1, -1 },  // 1 item numbered 2 with 4x-1 fills
    { 2, 3, 4, -1, -1, -1 }    // 2 items numbered 3 and 4 with 3x-1 fills
};
Are there any arguments for using one scheme over the other?
Since you have a field that explicitly indicates the number of valid values, zero is fine. If you choose to use -1, then the valid-values field serves no real purpose beyond a small speedup, since you could simply read until you hit -1. I prefer the initial approach of having an explicit field that indicates the number of valid values when space/memory are not a concern. When space/memory are a concern, I would choose the second approach.
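For illustration, here are the two reading styles side by side - a sketch only, reusing rows from the question (the count slot is kept in the sentinel row just to mirror that layout, even though it goes unused):

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    /* Count-field scheme: element 0 says how many of the following entries are valid. */
    const uint8_t row_counted[6] = { 4, 1, 2, 3, 4, 0 };
    for (uint8_t i = 1; i <= row_counted[0]; i++)
        printf("%u ", (unsigned)row_counted[i]);
    printf("\n");

    /* Sentinel scheme: read until -1; the count field becomes redundant. */
    const int8_t row_sentinel[6] = { 4, 1, 2, 3, 4, -1 };
    for (uint8_t i = 1; i < 6 && row_sentinel[i] != -1; i++)
        printf("%d ", row_sentinel[i]);
    printf("\n");

    return 0;
}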
That's a great point (not needing the number if we're using -1 to save space) -- thanks for sharing.
Hi, @hodgec!
I agree with you, but when using negative numbers there are two (perhaps more) approaches, each with its own penalty:
1) If the variable that scans the array is declared signed so it can be compared with negative numbers, its meaning can become confusing. And worse, the array must be declared signed!
2) If the variable is declared unsigned, the test must be done against a value that is not in the set of valid ones, which can hurt the program's readability.
Besides that, you would be comparing an unsigned value (the scan variable) against a signed one (the -1 in the array).
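A small sketch of that second pitfall, illustrative only, using a row from the question's signed array:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    const int8_t row[6] = { 2, 3, 4, -1, -1, -1 };

    /* Pitfall: read the signed element into an unsigned scan variable and  */
    /* -1 becomes 255, so "v == -1" is never true (v promotes to int as     */
    /* 255).  You end up testing against 0xFF, a "magic" value that is not  */
    /* part of the valid set.                                               */
    for (int i = 1; i < 6; i++) {
        uint8_t v = (uint8_t)row[i];
        if (v == 0xFF)            /* "v == -1" would never terminate the scan */
            break;
        printf("%u ", (unsigned)v);
    }
    printf("\n");
    return 0;
}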
Because of this, I also prefer the valid values field approach.
P.S.: I must confess I've already used the negative number approach in the past :-)
Cheers!
You aren't overloading the definition of '0'. The first element is the number of valid data items in that line, so a 0 in the 0th position of a line has a different meaning. Using -1 instead adds no value and, for me, is potentially more confusing. Simply add comments before the data structure definition, refer to them in your subroutine/function header comments to explain the meaning, and it should be fine!
Regards,
Keith Abell
Thanks Keith -- it's always these little things that niggle me and make me think "I wonder what a professional would do?" :-)
I agree with Hodgec. You could also structure it without the zero fill, since you have entry counts. It would make it easy to find all N entries if required. I usually let the highest-frequency use dictate the structure. G.H. <<<)))
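For what it's worth, one possible reading of the "no zero fill" idea is a single packed array where each group is prefixed by its count - this layout is only my interpretation of the suggestion:

#include <stdint.h>
#include <stdio.h>

/* Flat, packed storage: each group starts with its count, no fill values. */
static const uint8_t packed[] = {
    5, 0, 1, 2, 3, 4,   /* 5 items: 0, 1, 2, 3, 4 */
    4, 1, 2, 3, 4,      /* 4 items: 1, 2, 3, 4    */
    1, 2,               /* 1 item:  2             */
    2, 3, 4             /* 2 items: 3 and 4       */
};

int main(void)
{
    for (size_t i = 0; i < sizeof packed; ) {
        uint8_t n = packed[i++];             /* group count */
        for (uint8_t k = 0; k < n; k++)
            printf("%u ", (unsigned)packed[i++]);
        printf("\n");
    }
    return 0;
}

The trade-off is that you lose constant-time access to a given row, since you have to walk the groups to find it.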
Putting a unique, never-valid-data value at the end of a list - usually called a "sentinel" - is a classic trick (see C-style strings). But as always, there are...
Pros: No need for an extra number [counter], less chance of silly bugs
Cons: Prevents one particular value from being used as data, wasteful if you need to count the elements regularly or to scan them in a non-linear order for some reason.
So the "correct" solution depends on the program's needs. But if you choose a dedicated counter, it's better to put it somewhere outside the actual data list - again, to reduce the chance of very annoying bugs.
What about declaring a struct with the number/counter and an array of values -- then declare an array of these structs -- it would use the same amount of memory but would be clearer in the code -- yes? no?
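Something like this is what I have in mind - item_list_t, MAX_ITEMS and the field names are just placeholders:

#include <stdint.h>
#include <stdio.h>

#define MAX_ITEMS 5              /* per-row capacity */

typedef struct {
    uint8_t count;               /* number of valid entries               */
    uint8_t items[MAX_ITEMS];    /* values 0..4; unused slots never read  */
} item_list_t;

static const item_list_t test[4] = {
    { 5, { 0, 1, 2, 3, 4 } },
    { 4, { 1, 2, 3, 4, 0 } },
    { 1, { 2, 0, 0, 0, 0 } },
    { 2, { 3, 4, 0, 0, 0 } },
};

int main(void)
{
    for (int r = 0; r < 4; r++) {
        for (uint8_t i = 0; i < test[r].count; i++)
            printf("%u ", (unsigned)test[r].items[i]);
        printf("\n");
    }
    return 0;
}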
The exact size-in-memory of a struct may depend on the data types, the architecture, the compiler (including keywords/directives like "packed" in C) and even the variable order. If unsure, test - and if sure, test anyway :-)
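A quick sketch you can compile to see the member-order effect - the byte counts in the comment are typical for a 32-bit ABI, not guaranteed:

#include <stdint.h>
#include <stdio.h>

/* Same members, different order.  On a typical 32-bit target the first is  */
/* often 8 bytes and the second 12, purely because of alignment padding;    */
/* the exact numbers depend on the compiler and architecture.               */
struct order_a { uint32_t x; uint8_t y; uint8_t z; };
struct order_b { uint8_t  y; uint32_t x; uint8_t z; };

int main(void)
{
    printf("order_a: %zu bytes\n", sizeof(struct order_a));
    printf("order_b: %zu bytes\n", sizeof(struct order_b));
    return 0;
}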
Thank you for your sage advice :-)
By the way, I found this interesting article about #pragma pack:
Well, there were comments about -1 possibly causing problems with signed/unsigned and so on. But you don't need -1; any value outside the valid range will do as a sentinel - say, 0xff. Then your array remains unsigned, all your variables can be unsigned, and there's no confusion. It doesn't really matter whether you compare for < 0 or == 0xff. Well, it might cost you an extra instruction on some processors, but not on ARM Cortex-??, which is the most ubiquitous core these days. If your range is less than 0x80, then you can use a bit test, like
if ( x & 0x80 )
which might then save that extra insn on those other chips.
Also, if you then #define the special value to something meaningful, like #define EL 0xff for "end of list", it even helps the reader.
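A sketch of how that might look, assuming the count field is dropped since the sentinel makes it redundant - the EL name and 4x5 layout are just for illustration:

#include <stdint.h>
#include <stdio.h>

#define EL 0xFFu    /* "end of list" sentinel, outside the valid range 0..4 */

static const uint8_t test[4][5] = {
    { 0, 1, 2, 3, 4 },           /* full row: no room (or need) for EL */
    { 1, 2, 3, 4, EL },
    { 2, EL, EL, EL, EL },
    { 3, 4, EL, EL, EL },
};

int main(void)
{
    for (int r = 0; r < 4; r++) {
        /* The column bound still matters because a full row has no sentinel. */
        for (int c = 0; c < 5 && test[r][c] != EL; c++)
            printf("%u ", (unsigned)test[r][c]);
        /* With valid values below 0x80, "test[r][c] & 0x80" is an equivalent */
        /* end test that may be cheaper on some cores.                        */
        printf("\n");
    }
    return 0;
}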
I love the way everyone comes up with different solutions to this stuff -- instead of EL, I could use NA (not applicable) or NU (not used). Thanks for the suggestions -- Max
Agree.
But matching signed/unsigned variable types with array values may avoid some compiler warnings and improve program readability.
As for saving a few processor instructions, that concern is losing ground nowadays in favor of program readability, extensibility, maintainability, and scalability.
Cheers!