Do I need this mask and cast?
Started 3 years ago ● 21 replies ● latest reply 3 years ago ● 943 views
This is just a quick question -- take a look at the following function:
uint8_t GetBlue (uint32_t tmpColor)
{
    return (uint8_t) (tmpColor & 0xFF);
}
I'm writing a column -- with regard to the function above, I was going to say:
There are a couple of things to note here. Let’s start with the fact that using (tmpColor & 0xFF) to perform the mask operation is the same as saying (tmpColor & 0x000000FF) because the compiler inserts the leading zeros for us.

The next point is that we don’t actually need to perform the mask operation in the first place, because the most-significant 24 bits of our 32-bit value will be discarded when we perform the cast. The advantage of including the mask is that it makes our intent clear to anyone who has to maintain this program in the future.

Following on from the previous point, believe it or not, we really don’t even need the (uint8_t) cast to return the result as an 8-bit value, because we already declared the GetBlue() function as returning a uint8_t value, which means the compiler will automatically perform this conversion for us. Once again, however, our explicit cast will make what we are trying to do clear to a future reader of this sketch.
But I thought I'd check with you all first -- I don't want to make a complete fool of myself -- so, is everything I said true?
Although the compiler will most likely produce the correct, desired result without the mask and cast, I am in the habit of using those constructs to make sure the intention of the code is clear... other times I have skipped those steps and had hard-to-find bugs to track down. I presume (maybe naively) that the compiler will optimize unnecessary casts out of the code...
Hi Jmdhuse -- I'm with you -- I think it's best to write the code so anyone reading it in the future (and that could easily be me) can understand what it was I was trying to do. That's why I typically use more than my fair share of parentheses so I don't run into precedence problems.
For the purposes of my column, I just wanted to explain why a lot of what I was doing wasn't strictly necessary.
Why not encourage your readers to try it all ways - and to compare the ASM listing to see what (if any) the effects of the mask and cast are.
It is always good for coders (of embedded systems at least) to know how to navigate and understand the ASM listings.
Hi SpiderKenny -- That's a really good idea -- but I'm a hardware engineer by trade -- how do I get to see the ASM listing from a compilation using the Arduino IDE?
Ahhh.. Arduino IDE -- well IIRC that only produces HEX and not the ASM files, so you'd need to capture the hex and pump it through objdump to see the assembly. Unless recent versions of Arduino now allow you to generate assembly listings - it has been a long time since I used Arduino IDE.
Dropping unnecessary casts is part of the abstract machine behavior specified by the standard; it's not part of optimization.
I think you're 95% there. These are language-lawyer questions (FWIW you are better off asking on StackOverflow than here, even though the answers here will likely be kinder). I am not a language lawyer, but I think I can help... some familiarity with the C standard can go a long way (see for example draft N1256 : http://www.open-std.org/jtc2/sc22/wg14/www/docs/n1256.pdf)
You have three points:
First point: 0xff promotes to a 32-bit value -- this is because of the integer promotion rules of C. Most (all?) operators work only on operands of the same type, and there are rules about how the compiler promotes types so it gets operands of the same type. See N1256 section 6.3.1.8 on p. 45.

In this case you have uint32_t for the left-hand operand. The right-hand operand 0xff: "The type of an integer constant is the first of the corresponding list in which its value can be represented." (6.4.4.1 p. 55) and the first on the list for hex constants are "int" and "unsigned int". There are two possibilities here depending on the machine type: (A) either it's a machine where "int" is an 8-bit value (in which case 0xff fits in "unsigned int") or (B) it's a machine where "int" is larger than 8 bits, in which case 0xff fits into an "int". The compiler doesn't "insert the leading zeros for us" -- it associates constants with the smallest type it can fit in.
The compiler follows the rules in 6.3.1.8 to promote that 0xff. In case (A) (machine with 8-bit int), the operands are of type uint32_t and an 8-bit unsigned int, in which case the rule that applies is "Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank." They are both unsigned types, and uint32_t has greater rank than 8-bit unsigned int, which means 0xff is promoted to uint32_t.

In case (B) (machine with larger than 8-bit int), the operands are of type uint32_t and int (mixed sign), and the next conversion rule that applies is "Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type." This applies if "int" is a 16-bit or 32-bit value, and we convert the 0xff to the same type as the left-hand operand uint32_t.

If "int" is a 64-bit value then the next rule applies: "Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type" and the uint32_t is converted to an int64_t.
Summary: It depends on how big an "int" is:
- 8-bits: 0xff interpreted as an 8-bit unsigned int and converted to a uint32_t; the result of the AND operation is uint32_t
- 16-bits or 32-bits: 0xff interpreted as an "int" and converted to a uint32_t; the result of the AND operation is uint32_t
- 64-bits: 0xff is interpreted as an "int" (=int64_t) and the uint32_t value is converted to an int64_t; the result of the AND operation is int64_t.
The second point is that you are returning a uint8_t and casting the result of the AND operation to a uint8_t. If you don't have the cast, the compiler will perform an implicit conversion to uint8_t anyway, so you don't need the cast. Integer conversions are covered in 6.3.1.3: "When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type. Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised."

In this case, because values of neither uint32_t nor int64_t can always be represented by a uint8_t, the second clause applies: "Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type." If you translate from C-standard-ese to math, all this means is that you're taking the expression (the result of the AND operation) modulo 256, and the compiler will throw away any information it has above the least significant 8 bits.
The third point is whether you need the "& 0xff" anyway. You do not, because of the type conversion rule in 6.3.1.3 -- if you're going from uint32_t to uint8_t, the compiler will perform an implicit conversion from uint32_t to uint8_t, treating it modulo 256 and throwing away any information it has above the least significant 8 bits --- which is exactly what you are looking for.
Please note that none of this involves any optimization! It is all about the abstract machine behavior. I suppose that in -O0 it might involve explicit conversions (for example, taking the uint32_t, actually ANDing it with 0xff, and returning the result) whereas in -O1 or greater it will just grab the least significant 8 bits through whatever means are available to it. (e.g. an 8-bit move instruction, or a MOV.B instruction on a 16-bit dsPIC)
If I were reviewing code I would probably recommend the terse version:
uint8_t GetBlue (uint32_t tmpColor)
{
    return tmpColor;
}
The semantics of this are well-defined and clear -- and actually they're clearer than those integer promotion rules that come into play when you just stick an 0xff into your code. Use of the unsigned suffix "U" would be cleaner, and we would all be better off if we remembered to express bitmasks as unsigned constants e.g. 0xffU rather than leaving them as their (potentially signed) defaults.
Casts are more important in cases where there is potential ambiguity or you need them, for example:
uint16_t mul_uq16(uint16_t a, uint16_t b)
{
    return ((uint32_t)a * b) >> 16;
}
If you do not include the (uint32_t) cast, then on a platform with 16-bit int the compiler will treat the expression a*b as having the type of its promoted operands, namely 16-bit unsigned int, and shifting a 16-bit value right by 16 technically produces undefined behavior (!!!) but most compilers will probably just return zero. (On a platform with 32-bit int, a*b promotes to signed int, and the multiplication itself can overflow -- also undefined behavior.)
The cast is needed to promote the multiply to a 32x32 multiply before the shift-right, and most compilers will realize this is a 16x16 unsigned multiply that you just need the top 16 bits of the product.
Quick question -- so if I have an 8-bit signed int (int8_t) containing -128 (0x80), I know that if I cast this to a 32-bit signed int (int32_t), it will be sign extended to 0xFFFFFF80, which is still -128.
But what happens if I cast it to a 32-bit unsigned int (uint32_t)?
You can also experiment with Compiler Explorer.
See for example https://godbolt.org/z/zkRgAA (try it with -O1) or https://godbolt.org/z/jtWpwf
Hi Jms_nh -- I am awed by the amount of work you put into this response -- thank you so much for all of this -- it's very much appreciated -- Max
Curious to know why you think asking on StackOverflow would be better? Is it because there are more chances to receive helpful answers or for other reasons like the interface? All of the above? Thanks!
well.. hmm, that's a good question. I like the community here, it's much more positive and personal. Stack Overflow has gotten impersonal and often mean over the years.
In terms of interface: yeah, I admit I'm not fond of this forum's UI; I know it gets your personal attention + care, but it doesn't have the polish of something like Discourse.
But the bigger issue is that if you are looking for language-lawyer advice, you will find more experts on SO and the level of meticulousness and correctness will likely be higher. If it were other areas of embedded expertise I would not have made that recommendation. Language-lawyer questions are one of those areas, though, that it's important to have good information that cites sources, and I'd give up the positive-personal-community and risk the SO meanness to get an authoritative answer.
Thanks for taking the time to answer - it all makes sense
I agree that the UI has lots of room for improvement. I am gaining lots of experience on that front, building the online conferences websites, and I hope to find the time sooner rather than later to apply some of the things that I've learned to the *Related sites.
Take care Jason.
"If the optimizer doesn't fix it up, you can take any additional execution time out of my vacation."
LOL I'll have to remember that line. But you make a good point -- if the optimizer doesn't perform its task, my superfluous statements could potentially impact execution speed -- a conundrum indeed.
The cast annoys the hell out of me, because it is so self-evidently redundant (even without referring to the return value of the function), and as a result, reduces clarity. In other words, either obtain the byte using the '& 0xFF', or by casting to uint8_t, but by doing both, I instantly get the impression that one of them is wrong.
If your goal is to provide clarity, then you should probably add a right-shift (by zero bits, of course) of tmpColor. I expect you have GetRed and GetGreen (and GetOpaqueness?) equivalents for this function, and they are probably obtaining the corresponding color components from other (higher-order) bytes in tmpColor, meaning that in those functions, a right-shift of tmpColor by 8 or 16 (or 24) bits is necessary. So why not highlight this organization of the color components in GetBlue as well, by including a right-shift of tmpColor by 0 bits?
(In fact, upon seeing the redundant '& 0xFF', my immediate reaction was to ask myself, should that have been a right-shift instead?)
Hi Cprovidenti -- when you say "...it is so self-evidently redundant..." I fear you are looking at this through the eyes of someone who already knows this stuff.
I really don't think it's as obvious to beginners as you believe, but maybe that's just because "I'm a bear of little brain," as Pooh would say :-)
In a safety-critical environment, say in the aerospace industry, code without a proper mask and possibly a cast would not pass code review. In fact, that 0xFF literal would also be subject to scrutiny, as it might be poor practice compared to declaring it a constexpr (if using C++) or at least a #define in a proper location.
Hi Manoweb -- this is very useful feedback -- I've added it to my mental list of "pros" :-)
Not sure if I would agree. (But then again, I'm not familiar with aerospace best practices.)
What would a professional-grade static analysis tool say?