On 04/01/2011 00:22, D Yuniskis wrote:
> Hi David,
>
> On 1/3/2011 1:28 PM, David Brown wrote:
>
> 8<
>
>>>> Where possible, you want to do the checks /before/ the function is
>>>> called. If the function is called with incorrect parameters, it is the
>>>> /caller/ that is wrong. Identifying the error at run-time in the called
>>>> function may be better than nothing, but it is not nearly as good as
>>>> identifying it at compile-time in the caller. If you can't put the
>>>> requirements into the "extern" declaration (as above), consider
>>>> wrapping
>>>> the real function with an inline function:
>>>>
>>>> extern void do_foo(unsigned char *src);
>>>> static inline void foo(unsigned char *src) {
>>>> ASSERT((src != NULL), "src must be non-null");
>>>> do_foo(src);
>>>> }
>>>>
>>>> If foo() is compiled with a src that is known to be non-NULL, this
>>>
>>> This is rarely the case ------------------^^^^^^^^^^^ hence the need
>>> for the run-time check.
>>
>> Actually, I think it is often the case - code is often written in a form
>
> Sorry, I understood your point. I am commenting that, in my case(s),
> the compiler can't tell whether "src is non-NULL" (e.g., imagine
> walking a tree and chasing pointers to next branches; you can't
> know at compile time which pointers will be NULL or "valid").
> So "probe_subtree(pointer)" must either *itself* check that it
> doesn't dereference "pointer" *or* the caller does so before
> invoking probe_subtree(). In either case, the compiler can't help.
That is a different sort of function - here you are writing code where
the pointer is sometimes null, and sometimes non-null. An "assert" is
therefore inappropriate for testing, because both values are valid. It
depends on the code whether you want to have the test inside
"probe_subtree" or inside the caller, but either way the test is part of
the algorithm, not a check for argument validity.
>
>> such as "if (p) foo(p)", because the caller function doesn't want to
>> bother calling foo if it doesn't have to. It is also very common to
>> write code such as "foo(&x)" - the compiler knows when calling foo that
>> the pointer is non-NULL. Of course, this is entirely dependent on the
>> type of code and functions.
>>
>>>> results in zero overhead, and if it is known to be NULL, then there
>>>> will
>>>> likely be a compile-time error or warning when the caller code is
>>>> compiled.
>>>
>>> The compiler can't always make decisions at compile time.
>>> I.e., there will be (many) cases where the extra code generated
>>> by the ASSERT() gets included in the build image. (remember,
>>> I do this on *every* function I write) This really eats up
>>> a lot of space (especially for the strings)
>>
>> Feel free to use a "DEBUG" define to enable or disable the run-time
>> checks.
>
> Yes. But, I typically do an entire DEBUG *build* -- so that
> *everything* is checked (e.g., if a module that I am working
> on references "foo()", I could conditionally include the
> prepended checks for all foo() invocations. But, when foo()
> references "bar()", I would have to rebuild foo() with
> #define DEBUG, etc. I.e., you never know (prior to release[1])
> the full consequences of a particular invocation.
>
> [1] in some systems, the debug code remains live in the
> release as it must be sold "as tested" (the alternative is
> to remove the DEBUG code *prior* to test -- but it is
> often seen as providing greater assurances that the
> system being tested is complying at all levels)
>
>>>>> One of the routines I was dressing up today had some constraints
>>>>> placed on a pointer passed to it. In particular, requiring the
>>>>> pointer to reference a valid memory location in a certain
>>>>> range (defined by manifest constants).
>>>>>
>>>>> ONCE THROUGH THE INVARIANTS, "live code" would further qualify
>>>>> the parameters passed to the routine. In this case, enforcing
>>>>> some "arbitrary" alignment criteria. Conceivably, the aligned
>>>>> pointer could now reference something *outside* the range of
>>>>> memory locations enforced by the invariants. The code clearly
>>>>> has to test for this -- since the whole point is to ensure
>>>>> those "out of bounds" areas are not referenced at run-time.
>>>>>
>>>>> Now, the nit pick... *technically*, the caller has complied
>>>>> with the contractual requirements of the function (the contract
>>>>> doesn't enforce alignment issues -- that is part of the role
>>>>> of the function itself). As such, I maintain that the original
>>>>> set of invariants are *all* that should be enforced (to
>>>>> determine if the invocation "complies"). The *further*,
>>>>> non-contractual alignment restriction that the function imposes
>>>>> should be treated as a run-time error -- just like trying to
>>>>> resolve "ptr[index]" when the location referenced extends
>>>>> outside of the valid range.
>>>>
>>>> This I don't understand. Either it is an error to call the function
>>>> with
>>>> an unaligned pointer, or it is not an error. If it is an error, then it
>>>
>>> Not an error (in this case). Think: "align to cache-lines", etc.
>>
>> If it is not an error, then it is not a problem.
>
> The "problem" is a pathological one. What looks to be valid
> argument proves to be, um, "inappropriate" -- but, only because
> of the requirements that the called function "later" imposes
> on it.
>
There should not be a grey area here. Arguments are either within the
specification, or they are not. It's okay to treat different argument
values differently, and return special codes in some cases - that is
part of the specification of your function.
> I think I agree that it is "not a problem" but, rather,
> a case where the function simply returns "FAIL". It may be
> a surprise to the caller but not something that he shouldn't
> be prepared to handle. (e.g., had the below example been
> "149", the result would have been "PASS" while "349" would
> have tripped the ASSERTions)
>
>>>> is part of the requirements for calling the function, and should be
>>>> checked and enforced as soon as possible. If it is not an error,
>>>> then it
>>>> is not an error. The function may use alignment information to run
>>>> faster, or it may give you some debugging information to help improve
>>>> the code, but it is not an error.
>>>
>>> The *caller* didn't "make an error". But, the actions that
>>> the function WILL TAKE with the data provided could result
>>> in those parameters being modified to what would *now* be
>>> considered an error (had the caller supplied them in that
>>> "form" originally).
>>
>> Then either the caller /is/ making an error, or the called function is
>> in error, or the specifications for the function are in error. If
>> something is going wrong, somebody made a mistake!
>>
>>> Contrived example:
>>>
>>> Caller invokes with argument that resolves to "249".
>>> Function will ALIGN_UP to 100 byte boundaries so that
>>> 249 becomes "300". (easier for me to deal with decimal
>>> numbers at this time of night)
>>>
>>> ASSERTs contractually constrain the parameter to the
>>> range [1,299].
>>>
>>> So, HAD THE CALLER PROVIDED THE 300 THAT IS *EFFECTIVELY*
>>> BEING USED, this would have been a contractual violation.
>>> (but, he *didn't*).
>>
>> Your function's specifications are wrong - your claimed requirements on
>> the inputs don't match the actual requirements.
>
> I don't see that.
>
> E.g., you are implementing an "add()" routine for an 8 digit
> calculator. How do you define valid inputs? "must be no
> more than 8 digits" So, add(99999999, 99999999) satisfies
> the contractual requirements of the interface. Yet, the
> result add() returns is "FAIL" (overflow). How else would
> you specify the contract, "must be no more than 8 digits
> and the sum must also fit in 8 digits"? And, how would
> you enforce this *without* the "add" functionality?
>
You have two choices here. One is to specify that the function will
take two values that can be summed to an eight digit number, and it will
return that sum. The other is to specify that the function will take
two values, and will either return their sum if it fits within 8 digits,
or FAIL if it exceeds 8 digits.
That's your functionality specified. You can use asserts to check the
input specifications if you want.
What you can't do is claim the function can take any eight digit inputs,
then crash or return surprise values because their sum takes more than
eight digits.
> Generalizing the example I originally provided: how do
> you create an "align()" function whose "output" can be
> unconditionally used by the caller? There are some
> pathological inputs that simply result in "FAIL" instead
> of a valid "address".
>
>> If this can be caught at compile time, that's good. (If it can be caught
>> at design time, that's better.) If it can only be caught at run-time,
>> then so be it. It's still an error.
>>
>>> I am claiming that this should, instead, be a run-time
>>> "error" condition (i.e., "return FAIL;") instead of
>>> a contractual violation (i.e., "fall into debugger").
>>>
>>> (assume FAIL is a legitimate response from the function)
>
> (sigh) breakfast. Pasta sounds good!