
Style (?) issue

Started by D Yuniskis January 3, 2011
On 01/03/2011 01:50 AM, D Yuniskis wrote:
> Hi,
>
> (sigh) Another *@&^#($ night of "tree sitting". This gets
> old REALLY FAST! :-/
>
> Anyway, between trips outside, I've been dressing up
> some old bits of software -- cleaning up commentary, etc.
>
> One of the things I do to help with debugging as well as
> making interfaces more explicit is to place DEBUG-only
> invariants at the start of each function that "verify"
> the terms of the invocation contract. E.g., if a pointer
> can't be NULL, I'll do something like:
>
> ASSERT( (src != NULL), "Pointer to source must be non-NULL" );
>
> At run time, these go away (usually... depends on the market
> I am serving). So, they don't "cost anything". And, they
> make the interface rules *very* explicit (while the
> commentary could possibly be ambiguous, the conditionals
> in the invariants are hard to argue with!).
>
> So, the start of each function is usually a list of several
> ASSERT()'s -- at least one for each parameter (usually).
> Then, the meat of the function begins.
>
> One of the routines I was dressing up today had some constraints
> placed on a pointer passed to it. In particular, requiring the
> pointer to reference a valid memory location in a certain
> range (defined by manifest constants).
>
> ONCE THROUGH THE INVARIANTS, "live code" would further qualify
> the parameters passed to the routine. In this case, enforcing
> some "arbitrary" alignment criteria. Conceivably, the aligned
> pointer could now reference something *outside* the range of
> memory locations enforced by the invariants. The code clearly
> has to test for this -- since the whole point is to ensure
> those "out of bounds" areas are not referenced at run-time.
>
> Now, the nit pick... *technically*, the caller has complied
> with the contractual requirements of the function (the contract
> doesn't enforce alignment issues -- that is part of the role
> of the function itself). As such, I maintain that the original
> set of invariants are *all* that should be enforced (to
> determine if the invocation "complies"). The *further*,
> non-contractual alignment restriction that the function imposes
> should be treated as a run-time error -- just like trying to
> resolve "ptr[index]" when the location referenced extends
> outside of the valid range.
>
> In other words, I should *not* include a second set of ASSERT's
> to examine the "aligned" parameters after their initial validation.
>
> Agreed? Else, where does it end?? (it is really "clean" having
> the invariants clumped in one cohesive group)
>
> Thx,
> --don
For me the questions are "where is this contract written", "why does the
contract, as written, allow clearly erroneous operation" and "how soon
are you going to fix the contract?".

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
On 03/01/11 17:00, D Yuniskis wrote:
> Hi David,
>
> On 1/3/2011 5:32 AM, David Brown wrote:
>>> So, the start of each function is usually a list of several
>>> ASSERT()'s -- at least one for each parameter (usually).
>>> Then, the meat of the function begins.
>>
>> Where possible, you want to do the checks /before/ the function is
>> called. If the function is called with incorrect parameters, it is the
>> /caller/ that is wrong. Identifying the error at run-time in the called
>> function may be better than nothing, but it is not nearly as good as
>> identifying it at compile-time in the caller. If you can't put the
>> requirements into the "extern" declaration (as above), consider wrapping
>> the real function with an inline function:
>>
>> extern void do_foo(unsigned char *src);
>> static inline void foo(unsigned char *src) {
>>     ASSERT((src != NULL), "src must be non-null");
>>     do_foo(src);
>> }
>>
>> If foo() is compiled with a src that is known to be non-NULL, this
>
> This is rarely the case ------------------^^^^^^^^^^^ hence the need
> for the run-time check.
Actually, I think it is often the case - code is often written in a form such as "if (p) foo(p)", because the caller function doesn't want to bother calling foo if it doesn't have to. It is also very common to write code such as "foo(&x)" - the compiler knows when calling foo that the pointer is non-NULL. Of course, this is entirely dependent on the type of code and functions.
>> results in zero overhead, and if it is known to be NULL, then there will
>> likely be a compile-time error or warning when the caller code is
>> compiled.
>
> The compiler can't always make decisions at compile time.
> I.e., there will be (many) cases where the extra code generated
> by the ASSERT() gets included in the build image. (remember,
> I do this on *every* function I write) This really eats up
> a lot of space (especially for the strings)
Feel free to use a "DEBUG" define to enable or disable the run-time checks.
> The advantage it has is that it "forces" this into header
> files where it technically belongs (if you assume the
> headers to define the interfaces). *And*, makes it more
> obvious where the "bug" lies (since I can __FILE__ &
> __LINE__ to point me directly to the offending invocation
> instead of having to examine the call trace)
Correct.
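[For reference, a sketch of the kind of DEBUG-gated ASSERT being discussed here. The macro name matches the thread's usage, but the reporting hook is illustrative, not any poster's actual code:]

----8<----
/* assert_dbg.h -- DEBUG-gated ASSERT sketch. In a DEBUG build the
 * check runs and reports the failing file/line/condition; in a
 * release build it compiles away entirely. */
#include <stdio.h>
#include <stdlib.h>

#ifdef DEBUG
#define ASSERT(cond, msg)                                       \
    do {                                                        \
        if (!(cond)) {                                          \
            fprintf(stderr, "%s:%d: ASSERT(%s) failed: %s\n",   \
                    __FILE__, __LINE__, #cond, (msg));          \
            abort();   /* or: fall into the debugger */         \
        }                                                       \
    } while (0)
#else
#define ASSERT(cond, msg) ((void)0)   /* vanishes at run time */
#endif
----8<----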
>> For even more advanced usage, you can use something like gcc's
>> "__builtin_constant_p" function to distinguish between compile-time
>> known values - this would let the compiler choose between static
>> assertions and a call to a "safe" version of the function, or a call to
>> a dynamic assertion version of the function (which would have the
>> ASSERTs, then call the safe version).
>>
>> Of course, you would want to wrap all this in some macros to hide the
>> messy bits.
>>
>>> [8< -- original constraints/alignment example snipped]
>>
>> This I don't understand. Either it is an error to call the function with
>> an unaligned pointer, or it is not an error. If it is an error, then it
>
> Not an error (in this case). Think: "align to cache-lines", etc.
If it is not an error, then it is not a problem.
>> is part of the requirements for calling the function, and should be
>> checked and enforced as soon as possible. If it is not an error, then it
>> is not an error. The function may use alignment information to run
>> faster, or it may give you some debugging information to help improve
>> the code, but it is not an error.
>
> The *caller* didn't "make an error". But, the actions that
> the function WILL TAKE with the data provided could result
> in those parameters being modified to what would *now* be
> considered an error (had the caller supplied them in that
> "form" originally).
Then either the caller /is/ making an error, or the called function is in error, or the specifications for the function are in error. If something is going wrong, somebody made a mistake!
> Contrived example:
>
> Caller invokes with argument that resolves to "249".
> Function will ALIGN_UP to 100 byte boundaries so that
> 249 becomes "300". (easier for me to deal with decimal
> numbers at this time of night)
>
> ASSERTs contractually constrain the parameter to the
> range [1,299].
>
> So, HAD THE CALLER PROVIDED THE 300 THAT IS *EFFECTIVELY*
> BEING USED, this would have been a contractual violation.
> (but, he *didn't*).
Your function's specifications are wrong - your claimed requirements on the inputs don't match the actual requirements. If this can be caught at compile time, that's good. (If it can be caught at design time, that's better.) If it can only be caught at run-time, then so be it. It's still an error.
> I am claiming that this should, instead, be a run-time
> "error" condition (i.e., "return FAIL;") instead of
> a contractual violation (i.e., "fall into debugger").
>
> (assume FAIL is a legitimate response from the function)
>
> [time for bed]
I'll not argue there!
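[A sketch of the compile-time dispatch David Brown describes above. GCC-specific, and the wrapper and function names are illustrative assumptions, not code from the thread:]

----8<----
extern void do_foo_safe(unsigned char *src);    /* no checks: src known good */
extern void do_foo_checked(unsigned char *src); /* ASSERTs, then do_foo_safe */

/* If the compiler can prove the argument is a non-NULL constant, the
 * condition below folds to a constant and the dead branch is removed,
 * so the checked path costs nothing. Otherwise the run-time ASSERT
 * path is taken. */
#define foo(src)                                                    \
    do {                                                            \
        if (__builtin_constant_p((src) != NULL) && ((src) != NULL)) \
            do_foo_safe(src);      /* provably valid at build time */ \
        else                                                        \
            do_foo_checked(src);   /* dynamic-assertion version */  \
    } while (0)
----8<----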
Hans-Bernhard Bröker wrote:
> An actual implementation of a static_assert these days (i.e. without
> special support by the language) looks roughly like this:
>
> #define STATIC_ASSERT(condition) do{int a[1-2*(!(condition))];}while(0)
>
> The trick is that if the condition doesn't hold, this will cause a
> compile-time error because it defines an array of negative length. The
> problems with it are:
>
> 1) if the condition holds, this creates an unused variable (which the
> optimizer will hopefully remove again)
I declare an external variable instead, i.e.

#define static_assert(condition) \
    extern int static_assert[(condition) ? 1 : -1]

So even if the compiler insists on a definition of that variable (none
so far did), all I lose is one word.

In C++, I use a typedef instead (i.e. replace 'extern' with 'typedef').
This has the advantage of working even in class definitions. C++ allows
redundant typedefs, so you don't even have to mess with __LINE__.
> 2) if the condition doesn't hold, the error message you'll get from the
> compiler has no apparent connection to the actual problem.
This happens so often, do you really care?

----8<----
char* getConfigValue(const char* key, const char* default)
{
    return "x";
}
----8<----
Error E2188 d.c 1: Expression syntax
Error E2293 d.c 3: ) expected
Error E2141 d.c 3: Declaration syntax error
----8<----

Stefan
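[One common refinement of the macros above -- pasting __LINE__ into the array/typedef name so the compiler's complaint at least points somewhere readable. A sketch, not Stefan's or Hans-Bernhard's actual macro:]

----8<----
#define SA_CAT_(a, b) a##b
#define SA_CAT(a, b)  SA_CAT_(a, b)

/* On failure, the error names "static_assert_failed_at_line_NNN",
 * which connects the message to the offending line. */
#define STATIC_ASSERT(cond) \
    typedef char SA_CAT(static_assert_failed_at_line_, __LINE__)[(cond) ? 1 : -1]

STATIC_ASSERT(sizeof(int) >= 4);   /* usage example */
----8<----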
Hi Tim,

On 1/3/2011 12:44 PM, Tim Wescott wrote:
> On 01/03/2011 01:50 AM, D Yuniskis wrote:
>> One of the things I do to help with debugging as well as
>> making interfaces more explicit is to place DEBUG-only
>> invariants at the start of each function that "verify"
>> the terms of the invocation contract.
>>
>> [8< -- rest of original post snipped]
>
> For me the questions are "where is this contract written", "why does the
> contract, as written, allow clearly erroneous operation" and "how soon
> are you going to fix the contract?".
[Referring to example shown in another post]

For small projects, the contract is formally specified by the ASSERTs
and informally by the associated commentary. I often can't afford to
prepare formal specifications for every function in a project :>

The operation isn't "clearly erroneous". Rather, it's "unable to
provide the desired functionality with the inputs given". Like
"divide(4, denominator)" -- where denominator happens to be zero or
very close thereto. In many systems, an "interface error"; in some,
caught *in* the invocation ("attempt to divide by zero"); and in
others, "successfully" handled as "undefined/infinite".

The contract doesn't need "fixing". The interface will inform the
caller of how it interpreted the inputs handed to it. E.g.,

Result /* PASS, QUAL, FAIL, ARGS, TIME */
foo(
    void **location,
    Size *size,
    ...
)

where "location" and "size" are in/out's. Result tells the caller if
the call succeeded *exactly* as specified (PASS), or by qualifying its
arguments (QUAL -- i.e., check size and location for the actual values
used), FAILed because the service couldn't be provided, had problems
with ARGumentS (like passing in a NULL for either of the above) or
TIMEd out.

In the example given, the caller has complied with the terms of the
contract. And, the "return FAIL" result is something he should be
prepared to handle. Likewise, the "return QUAL" with the details of
the actual "qualifications" that the routine placed on his original
parameters. Hence my belief that this is a rational way to handle what
is actually a pathological case (the alternative is to push extra
restrictions onto every caller).
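[A minimal sketch of the interface style described above. The enum matches the post; the alignment and region constants are illustrative assumptions:]

----8<----
#include <stddef.h>
#include <stdint.h>

typedef enum { PASS, QUAL, FAIL, ARGS, TIME } Result;
typedef size_t Size;

#define ALIGNMENT  64u            /* illustrative alignment */
#define REGION_TOP 0x20000000u    /* illustrative upper bound of valid region */

/* In/out parameters: on QUAL, *location holds the value actually used. */
Result foo(void **location, Size *size)
{
    if (location == NULL || size == NULL)
        return ARGS;              /* caller violated the stated contract */

    uintptr_t p       = (uintptr_t)*location;
    uintptr_t aligned = (p + ALIGNMENT - 1) & ~(uintptr_t)(ALIGNMENT - 1);

    /* Aligning may push the reference past the valid region: the
     * "pathological" case -- an error result, not a contract violation. */
    if (aligned + *size > REGION_TOP)
        return FAIL;

    if (aligned != p) {
        *location = (void *)aligned;  /* tell caller what was actually used */
        return QUAL;
    }
    return PASS;
}
----8<----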
Hi David,

On 1/3/2011 1:28 PM, David Brown wrote:

8<

>>> Where possible, you want to do the checks /before/ the function is
>>> called. If the function is called with incorrect parameters, it is the
>>> /caller/ that is wrong.
>>>
>>> [8< -- inline-wrapper example snipped]
>>>
>>> If foo() is compiled with a src that is known to be non-NULL, this
>>
>> This is rarely the case ------------------^^^^^^^^^^^ hence the need
>> for the run-time check.
>
> Actually, I think it is often the case - code is often written in a form
Sorry, I understood your point. I am commenting that, in my case(s), the compiler can't tell whether "src is non-NULL" (e.g., imagine walking a tree and chasing pointers to next branches; you can't know at compile time which pointers will be NULL or "valid"). So "probe_subtree(pointer)" must either *itself* check that it doesn't dereference "pointer" *or* the caller does so before invoking probe_subtree(). In either case, the compiler can't help.
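[For concreteness, a sketch of the tree-walk situation just described. The node type and traversal are illustrative:]

----8<----
#include <stddef.h>

typedef struct Node {
    struct Node *left, *right;
    int value;
} Node;

/* The compiler can't know which child pointers are NULL here, so the
 * check must happen at run time -- either in the callee... */
static void probe_subtree(const Node *node)
{
    if (node == NULL)            /* callee guards itself */
        return;
    probe_subtree(node->left);
    probe_subtree(node->right);
}

/* ...or in the caller, before each invocation:
 *     if (node->left)  probe_subtree(node->left);
 *     if (node->right) probe_subtree(node->right);
 */
----8<----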
> such as "if (p) foo(p)", because the caller function doesn't want to > bother calling foo if it doesn't have to. It is also very common to > write code such as "foo(&x)" - the compiler knows when calling foo that > the pointer is non-NULL. Of course, this is entirely dependent on the > type of code and functions. > >>> results in zero overhead, and if it is known to be NULL, then there will >>> likely be a compile-time error or warning when the caller code is >>> compiled. >> >> The compiler can't always make decisions at compile time. >> I.e., there will be (many) cases where the extra code generated >> by the ASSERT() gets included in the build image. (remember, >> I do this on *every* function I write) This really eats up >> a lot of space (especially for the strings) > > Feel free to use a "DEBUG" define to enable or disable the run-time checks.
Yes. But, I typically do an entire DEBUG *build* -- so that *everything*
is checked. (E.g., if a module that I am working on references "foo()",
I could conditionally include the prepended checks for all foo()
invocations. But, when foo() references "bar()", I would have to rebuild
foo() with #define DEBUG, etc.) I.e., you never know (prior to
release[1]) the full consequences of a particular invocation.

[1] in some systems, the debug code remains live in the release as it
must be sold "as tested" (the alternative is to remove the DEBUG code
*prior* to test -- but it is often seen as providing greater assurances
that the system being tested is complying at all levels)
>>>> One of the routines I was dressing up today had some constraints
>>>> placed on a pointer passed to it. In particular, requiring the
>>>> pointer to reference a valid memory location in a certain
>>>> range (defined by manifest constants).
>>>>
>>>> [8< -- rest of alignment example snipped]
>>>
>>> This I don't understand. Either it is an error to call the function with
>>> an unaligned pointer, or it is not an error. If it is an error, then it
>>
>> Not an error (in this case). Think: "align to cache-lines", etc.
>
> If it is not an error, then it is not a problem.
The "problem" is a pathological one. What looks to be valid argument proves to be, um, "inappropriate" -- but, only because of the requirements that the called function "later" imposes on it. I think I agree that it is "not a problem" but, rather, a case where the function simply returns "FAIL". It may be a surprise to the caller but not something that he shouldn't be prepared to handle. (e.g., had the below example been "149", the result would have been "PASS" while "349" would have tripped the ASSERTions)
>>> is part of the requirements for calling the function, and should be
>>> checked and enforced as soon as possible. If it is not an error, then it
>>> is not an error. The function may use alignment information to run
>>> faster, or it may give you some debugging information to help improve
>>> the code, but it is not an error.
>>
>> The *caller* didn't "make an error". But, the actions that
>> the function WILL TAKE with the data provided could result
>> in those parameters being modified to what would *now* be
>> considered an error (had the caller supplied them in that
>> "form" originally).
>
> Then either the caller /is/ making an error, or the called function is
> in error, or the specifications for the function are in error. If
> something is going wrong, somebody made a mistake!
>
>> Contrived example:
>>
>> Caller invokes with argument that resolves to "249".
>> Function will ALIGN_UP to 100 byte boundaries so that
>> 249 becomes "300". (easier for me to deal with decimal
>> numbers at this time of night)
>>
>> ASSERTs contractually constrain the parameter to the
>> range [1,299].
>>
>> So, HAD THE CALLER PROVIDED THE 300 THAT IS *EFFECTIVELY*
>> BEING USED, this would have been a contractual violation.
>> (but, he *didn't*).
>
> Your function's specifications are wrong - your claimed requirements on
> the inputs don't match the actual requirements.
I don't see that.

E.g., you are implementing an "add()" routine for an 8 digit calculator.
How do you define valid inputs? "must be no more than 8 digits" So,
add(99999999, 99999999) satisfies the contractual requirements of the
interface. Yet, the result add() returns is "FAIL" (overflow). How else
would you specify the contract, "must be no more than 8 digits and the
sum must also fit in 8 digits"? And, how would you enforce this
*without* the "add" functionality?

Generalizing the example I originally provided: how do you create an
"align()" function whose "output" can be unconditionally used by the
caller? There are some pathological inputs that simply result in "FAIL"
instead of a valid "address".
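[A sketch of the add() example as described -- names are illustrative, the Result enum trimmed to the two states this example needs, and ASSERT is assumed to be something like the DEBUG-gated macro sketched earlier in the thread:]

----8<----
#include <stddef.h>

#ifndef ASSERT   /* e.g., the DEBUG-gated ASSERT sketched earlier */
#define ASSERT(cond, msg) ((void)0)
#endif

#define MAX_8_DIGITS 99999999L

typedef enum { PASS, FAIL } Result;

/* Contract: a and b each fit in 8 digits (enforced by the ASSERTs).
 * Overflow of the *sum* is not a contract violation -- it is a
 * legitimate run-time result the caller must handle. */
Result add8(long a, long b, long *sum)
{
    ASSERT(a >= 0 && a <= MAX_8_DIGITS, "a must fit in 8 digits");
    ASSERT(b >= 0 && b <= MAX_8_DIGITS, "b must fit in 8 digits");
    ASSERT(sum != NULL, "sum must be non-NULL");

    if (a + b > MAX_8_DIGITS)
        return FAIL;       /* overflow: error result, not a contract bug */
    *sum = a + b;
    return PASS;
}
----8<----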
> If this can be caught at compile time, that's good. (If it can be caught
> at design time, that's better.) If it can only be caught at run-time,
> then so be it. It's still an error.
>
>> I am claiming that this should, instead, be a run-time
>> "error" condition (i.e., "return FAIL;") instead of
>> a contractual violation (i.e., "fall into debugger").
>>
>> (assume FAIL is a legitimate response from the function)
(sigh) breakfast. Pasta sounds good!
On Mon, 03 Jan 2011 02:50:07 -0700, D Yuniskis wrote:

> One of the routines I was dressing up today had some constraints placed
> on a pointer passed to it. In particular, requiring the pointer to
> reference a valid memory location in a certain range (defined by
> manifest constants).
>
> ONCE THROUGH THE INVARIANTS, "live code" would further qualify the
> parameters passed to the routine. In this case, enforcing some
> "arbitrary" alignment criteria. Conceivably, the aligned pointer could
> now reference something *outside* the range of memory locations enforced
> by the invariants.
So, the first set of invariants is augmented by the second set of
invariants. Wouldn't the right solution be to reformulate the first set?

From your description I can't think of an example, but I imagine it
could be something like checking that the 4-byte variable starts below
A, and then in the second step constructing an aligned access. If the
variable starts at A-1, the aligned address has to fetch bytes A-4..A-1
and then another fetch of A..A+3. I would argue that the right way is to
modify the first condition to check that the variable start is below A+3.
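[A sketch of the aligned-fetch arithmetic Przemek describes, purely illustrative; assumes a little-endian target and 4-byte aligned word access:]

----8<----
#include <stdint.h>

/* Read a 4-byte value at (possibly unaligned) address p using two
 * aligned 32-bit fetches. */
uint32_t read_unaligned32(uintptr_t p)
{
    uintptr_t lo_addr = p & ~(uintptr_t)3;           /* e.g. A-4 when p == A-1 */
    uint32_t  lo = *(const uint32_t *)lo_addr;       /* fetch A-4..A-1 */
    uint32_t  hi = *(const uint32_t *)(lo_addr + 4); /* fetch A..A+3 */
    unsigned  shift = (unsigned)(p & 3) * 8;

    if (shift == 0)
        return lo;                                   /* already aligned */
    return (lo >> shift) | (hi << (32 - shift));
    /* Note: the second fetch touches up to A+3 -- which is why the
     * range check must allow for it, per Przemek's point above. */
}
----8<----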
Hi Hans-Bernhard,

On 1/3/2011 11:36 AM, Hans-Bernhard Bröker wrote:
> On 03.01.2011 16:36, D Yuniskis wrote:
>> On 1/3/2011 4:58 AM, David Brown wrote:
>
>>> One other thing you can use is static asserts. The only disadvantage
>>> these have is cluttering the code, but it is not much worse than the
>>> typical comment (i.e., you write "static_assert(sizeOfBuffer > 100);"
>>> rather than "/* sizeOfBuffer should be at least 100 */".
>
>> This is what I currently do (hence "ASSERT()" instead of "assert()").
>
> Actually no, you don't. A static assert is resolved at _compile_ time.
> That's not what your following description says you're doing:
Sorry, overtired so my replies have been lacking precision. :<

Most of the checks can't be resolved "at compile time" so static checks
just won't do the trick. E.g., in David's above example, the compiler
doesn't know "sizeOfBuffer" (as "sizeOfBuffer" -- if such a beast
existed -- is computed at run-time).

My "following description" was intended to address his comment re: "The
only disadvantage these have is cluttering the code, ..." as that *is*
the cost my ASSERTs typically bring with them. Furthermore, they act
*as* the "typical comment" (again, his words/example). This makes them a
good mechanism for precisely defining what the function can rely upon --
e.g., location[size-1] won't extend past the end of the region in
question (which might be "all of memory") so the function can use that
knowledge in its implementation. Just like ASSERT(ptr!=NULL) means the
function can freely dereference ptr without having to bear the cost of
*checking* to see if it is NULL. (contracts benefit the caller as well
as the function itself)
>> However, they may or may not generate code in the production release
>> (depending on how I define them). Sometimes they push an error to
>> my "black box" and abort the function call (since the body of the
>> function thereafter EXPECTS never to see input parameters that
>> violate those stated "requirements"). Other times they may just
>> "vanish"
Hi Przemek,

On 1/3/2011 9:19 PM, Przemek Klosowski wrote:
> On Mon, 03 Jan 2011 02:50:07 -0700, D Yuniskis wrote:
>
>> One of the routines I was dressing up today had some constraints placed
>> on a pointer passed to it. In particular, requiring the pointer to
>> reference a valid memory location in a certain range (defined by
>> manifest constants).
>>
>> ONCE THROUGH THE INVARIANTS, "live code" would further qualify the
>> parameters passed to the routine. In this case, enforcing some
>> "arbitrary" alignment criteria. Conceivably, the aligned pointer could
>> now reference something *outside* the range of memory locations enforced
>> by the invariants.
>
> So, the first set of invariants is augmented by the second set of
> invariants. Wouldn't the right solution be to reformulate the first set?
No. I'm arguing that the second set of invariants shouldn't be there.
I.e., that the "problem" they are trying to detect is caused by
something not exported by the function's interface (or its
specification).

Note an example I mentioned "someplace" regarding an 8 digit "add()".
It makes sense to define the interface to ensure both arguments have
fewer than 8 digits. But, it *doesn't* make sense (IMO) to define the
interface such that "the SUM of these two arguments must fit in 8
digits". Rather, if the sum *happens* to not fit, you signal an *error*
(overflow), not a contractual violation.

Otherwise, you start writing "recursive" interface definitions; in
effect, saying, "Use this function but only in a way that the function
applied to the inputs WILL HAVE caused the *actual* function invocation
to work properly..." :-/

I.e., in the example I used (alignments), I would have to export the
details of "alignment" to the user in order to give him the information
he needs to ensure he never passes a pointer to me that, once aligned,
would cause this problem. Doing that forces all of the alignment tests
back into the main body of the code instead of encapsulating them here.
(the interface already provides mechanisms to tell you *how* the
function has interpreted the parameters you provided *if* it wasn't able
to directly use them "as is")
> From your description I can't think of an example, but I imagine it could
> be something like checking that the 4-byte variable starts below A, and
> then in the second step constructing an aligned access. If the variable
> starts at A-1, the aligned address has to fetch bytes A-4..A-1 and then
> another fetch of A..A+3. I would argue that the right way is to modify
> the first condition to check that the variable start is below A+3.
On Mon, 03 Jan 2011 22:26:22 -0700, D Yuniskis wrote:

> No. I'm arguing that the second set of invariants shouldn't be there.
> I.e., that the "problem" they are trying to detect is caused by
> something not exported by the function's interface (or its
> specification).
>
> Note an example I mentioned "someplace" regarding an 8 digit "add()".
> It makes sense to define the interface to ensure both arguments have
> fewer than 8 digits. But, it *doesn't* make sense (IMO) to define the
> interface such that "the SUM of these two arguments must fit in 8
> digits". Rather, if the sum *happens* to not fit, you signal an *error*
> (overflow), not a contractual violation.
I don't really see what the problem is though. It just depends on
whether these cases fall within the overall specification, and whether
there's a benefit in recovering gracefully (which includes that the
caller knows how to handle the error status). Basically, you have 3
choices:

1. ignore and get bad results
2. return error and handle that
3. put in an assert and handle that

Any one of these choices could be acceptable, depending on the specific
situation.
On 04/01/2011 00:22, D Yuniskis wrote:
> Hi David,
>
> On 1/3/2011 1:28 PM, David Brown wrote:
>
> 8<
>
>>>> Where possible, you want to do the checks /before/ the function is
>>>> called. [8< -- inline-wrapper example snipped]
>>>>
>>>> If foo() is compiled with a src that is known to be non-NULL, this
>>>
>>> This is rarely the case ------------------^^^^^^^^^^^ hence the need
>>> for the run-time check.
>>
>> Actually, I think it is often the case - code is often written in a form
>
> Sorry, I understood your point. I am commenting that, in my case(s),
> the compiler can't tell whether "src is non-NULL" (e.g., imagine
> walking a tree and chasing pointers to next branches; you can't
> know at compile time which pointers will be NULL or "valid").
> So "probe_subtree(pointer)" must either *itself* check that it
> doesn't dereference "pointer" *or* the caller does so before
> invoking probe_subtree(). In either case, the compiler can't help.
That is a different sort of function - here you are writing code where the pointer is sometimes null, and sometimes non-null. An "assert" is therefore inappropriate for testing, because both values are valid. It depends on the code whether you want to have the test inside "probe_subtree" or inside the caller, but either way the test is part of the algorithm, not a check for argument validity.
>
>> such as "if (p) foo(p)", because the caller function doesn't want to
>> bother calling foo if it doesn't have to. [...]
>
> [8< -- DEBUG-build discussion and re-quoted alignment example snipped]
>
>>>> This I don't understand. Either it is an error to call the function
>>>> with
>>>> an unaligned pointer, or it is not an error. If it is an error, then it
>>>
>>> Not an error (in this case). Think: "align to cache-lines", etc.
>>
>> If it is not an error, then it is not a problem.
>
> The "problem" is a pathological one. What looks to be a valid
> argument proves to be, um, "inappropriate" -- but, only because
> of the requirements that the called function "later" imposes
> on it.
There should not be a grey area here. Arguments are either within the specification, or they are not. It's okay to treat different argument values differently, and return special codes in some cases - that is part of the specification of your function.
> I think I agree that it is "not a problem" but, rather,
> a case where the function simply returns "FAIL". It may be
> a surprise to the caller but not something that he shouldn't
> be prepared to handle.
>
> [8< -- re-quoted "249"/ALIGN_UP exchange snipped]
>
>> Your function's specifications are wrong - your claimed requirements on
>> the inputs don't match the actual requirements.
>
> I don't see that.
>
> E.g., you are implementing an "add()" routine for an 8 digit
> calculator. How do you define valid inputs? "must be no
> more than 8 digits" So, add(99999999, 99999999) satisfies
> the contractual requirements of the interface. Yet, the
> result add() returns is "FAIL" (overflow). How else would
> you specify the contract, "must be no more than 8 digits
> and the sum must also fit in 8 digits"? And, how would
> you enforce this *without* the "add" functionality?
You have two choices here. One is to specify that the function will take two values that can be summed to an eight digit number, and it will return that sum. The other is to specify that the function will take two values, and will either return their sum if it fits within 8 digits, or FAIL if it exceeds 8 digits. That's your functionality specified. You can use asserts to check the input specifications if you want. What you can't do is claim the function can take any eight digit inputs, then crash or return surprise values because their sum takes more than eight digits.
> Generalizing the example I originally provided: how do
> you create an "align()" function whose "output" can be
> unconditionally used by the caller? There are some
> pathological inputs that simply result in "FAIL" instead
> of a valid "address".
>
>> If this can be caught at compile time, that's good. (If it can be caught
>> at design time, that's better.) If it can only be caught at run-time,
>> then so be it. It's still an error.
>>
>>> I am claiming that this should, instead, be a run-time
>>> "error" condition (i.e., "return FAIL;") instead of
>>> a contractual violation (i.e., "fall into debugger").
>>>
>>> (assume FAIL is a legitimate response from the function)
>
> (sigh) breakfast. Pasta sounds good!