Ftn I/Os documentation best practices

I add a boilerplate to each function definition that
declares constraints on inputs, expectations of outputs,
performance issues, etc.  I use this to add invariants
to the code to detect/enforce these conditions.

But, there is nothing that ensures that I've done
this -- other than discipline.

I'm looking at ways to create an IDL that will allow
for more specific criteria to be included in the
declaration that could also drive the IDL compiler
to add suitable invariants as applicable.

[This makes RPC much more effective but can also
benefit traditional ftn invocations]

Any pointers to similar schemes?  I've been looking
through CORBA et al. for hints but they seem to
focus on bigger machines (where there is more tolerance
over data types and more overhead expected).

Reply by David Brown ● June 27, 2022

On 26/06/2022 21:35, Don Y wrote:
> I add a boilerplate to each function definition that
> declares constraints on inputs, expectations of outputs,
> performance issues, etc.  I use this to add invariants
> to the code to detect/enforce these conditions.
> 
> But, there is nothing that ensures that I've done
> this -- other than discipline.
> 
> I'm looking at ways to create an IDL that will allow
> for more specific criteria to be included in the
> declaration that could also drive the IDL compiler
> to add suitable invariants as applicable.
> 
> [This makes RPC much more effective but can also
> benefit traditional ftn invocations]
> 
> Any pointers to similar schemes?  I've been looking
> through CORBA et al. for hints but they seem to
> focus on bigger machines (where there is more tolerance
> over data types and more overhead expected).

What programming language are you using?  If your answer is "C", it's wrong.

If you are just putting these things in comments, then they will get out 
of sync with the code.  The best you can do is writing something like a 
Python script that will read the C code and check for the pattern of 
comments.

If you want something really useful, you need a programming language 
that will let you write the contracts in the language itself - then they 
can be checked and enforced.  Ada, D, and Scala are examples.  C++ has a 
Boost.Contract library, and language support for contracts is due in 
C++23 (last I heard - but it might be delayed again).

Reply by Grant Edwards ● June 27, 2022

On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
> On 26/06/2022 21:35, Don Y wrote:
>> I add a boilerplate to each function definition that
>> declares constraints on inputs, expectations of outputs,
>> performance issues, etc.

> What programming language are you using?  If your answer is "C",
> it's wrong.
>
> If you are just putting these things in comments, then they will get out 
> of sync with the code.

I'd have to agree. I've worked with many projects and third-party
libraries over the decades which had a big template of comments for
every function which described the input/output parameters, return
value, global variables used, and so on.

Often these templates generated documents by using something like
Doxygen.

And on _every_single_one_ of those projects and libraries, the
comments were wrong often enough that nobody who knew which way was up
paid any attention to them. If you wanted to know what the parameters
were for, what the function returned, and so on, you read the C code.

A lot of the time, even the numbers and names of the parameters
described in the template didn't match the code.

The auto-generated PDF documents and HTML web site looked nice, though.

--
Grant

Reply by David Brown ● June 27, 2022

On 27/06/2022 17:18, Grant Edwards wrote:
> On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
>> On 26/06/2022 21:35, Don Y wrote:
>>> I add a boilerplate to each function definition that
>>> declares constraints on inputs, expectations of outputs,
>>> performance issues, etc.
> 
>> What programming language are you using?  If your answer is "C",
>> it's wrong.
>>
>> If you are just putting these things in comments, then they will get out
>> of sync with the code.
> 
> I'd have to agree. I've worked with many projects and third-party
> libraries over the decades which had a big template of comments for
> every function which described the input/output parameters, return
> value, global variables used, and so on.
> 
> Often these templates generated documents by using something like
> Doxygen.
> 
> And on _every_single_one_ of those projects and libraries, the
> comments were wrong often enough that nobody who knew which way was up
> paid any attention to them. If you wanted to know what the parameters
> were for, what the function returned, and so on, you read the C code.
> 
> A lot of the time, even the numbers and names of the parameters
> described in the template didn't match the code.
> 
> The auto-generated PDF documents and HTML web site looked nice, though.
> 

Accuracy of such in-code documentation varies, but there is generally no 
way to check it automatically.  That's one of the reasons it is better 
to use constructs in the programming language, where possible, rather 
than documentation and comments.  For preconditions, postconditions and 
invariants, you need a language that has support for contracts.  For 
other languages, usually the best you can do is careful choice of names 
and types, along with assert statements.
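
A minimal sketch of that assert-based fallback in plain C.  The function
days_in_month(), its helper is_leap_year() and the chosen limits are
hypothetical, picked only for illustration.  Note that assert() is
compiled out when NDEBUG is defined, so this documents and tests the
contract rather than enforcing it in release builds.

```c
#include <assert.h>

/* Hypothetical helper: Gregorian leap-year rule. */
static int is_leap_year(int year)
{
    return (year % 4 == 0 && year % 100 != 0) || (year % 400 == 0);
}

/* Hypothetical example function.
 * Precondition:  1 <= month <= 12 and year >= 1583 (Gregorian calendar assumed).
 * Postcondition: 28 <= result <= 31. */
int days_in_month(int month, int year)
{
    static const int days[12] = {31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};

    /* Preconditions, checked on entry. */
    assert(month >= 1 && month <= 12);
    assert(year >= 1583);

    int result = days[month - 1];
    if (month == 2 && is_leap_year(year)) {
        result = 29;
    }

    /* Postcondition, checked just before returning. */
    assert(result >= 28 && result <= 31);
    return result;
}
```

The contract lives next to the code that relies on it, so it is exercised
on every debug-build call instead of sitting in a comment.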

Still, Doxygen-like comments in code are usually better synchronised 
with the code than external documentation!

Reply by Don Y ● June 27, 2022

On 6/27/2022 8:18 AM, Grant Edwards wrote:
> On 2022-06-27, David Brown <david.brown@hesbynett.no> wrote:
>> On 26/06/2022 21:35, Don Y wrote:
>>> I add a boilerplate to each function definition that
>>> declares constraints on inputs, expectations of outputs,
>>> performance issues, etc.
> 
>> What programming language are you using?  If your answer is "C",
>> it's wrong.
>>
>> If you are just putting these things in comments, then they will get out
>> of sync with the code.
> 
> I'd have to agree. I've worked with many projects and third-party
> libraries over the decades which had a big template of comments for
> every function which described the input/output parameters, return
> value, global variables used, and so on.

You perhaps missed the balance of my post:

    "I use this to add invariants to the code to detect/enforce
    these conditions."
    ...
    "I'm looking at ways to create an IDL that will allow for
    more specific criteria to be included in the declaration
    that could also drive the IDL compiler to add suitable
    invariants as applicable."

I.e., a "specification language" (I am currently using an enhanced
form of OCL) FROM WHICH the IDL compiler can create the code -- in
whatever language binding is selected AT COMPILE TIME.

So, if I say:
    month > 0
AND
    month < 13
as constraints *in* the function's "prototype", then
the IDL compiler generates an invariant that throws a
"range error" OR panics (depending on IDL compiler switch)
AT RUN TIME if the function is invoked with the "month"
parameter not compliant with those constraints.
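
A rough sketch of what a generated check for that constraint might look
like in a C binding.  Everything here is assumed for illustration: the
names quarter_of(), idl_status and IDL_PANIC_ON_VIOLATION are made up,
and the compiler switch is modelled as a preprocessor macro.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical status codes a generated stub could return. */
typedef enum { IDL_OK = 0, IDL_RANGE_ERROR = 1 } idl_status;

/* Stand-in for the hand-written function body: maps month 1..12 to quarter 1..4. */
static int quarter_of_impl(int month)
{
    return (month - 1) / 3 + 1;
}

/* Hypothetical generated stub.  The declared constraints
 * "month > 0 AND month < 13" become a check that runs before
 * the body ever sees the argument. */
idl_status quarter_of(int month, int *quarter)
{
    assert(quarter != NULL);  /* caller must supply somewhere to put the result */

    if (!(month > 0 && month < 13)) {
#ifdef IDL_PANIC_ON_VIOLATION
        /* "Panic" mode: report and stop. */
        fprintf(stderr, "quarter_of: month=%d violates 0 < month < 13\n", month);
        abort();
#else
        /* "Range error" mode: reject the call and leave *quarter untouched. */
        return IDL_RANGE_ERROR;
#endif
    }

    *quarter = quarter_of_impl(month);
    return IDL_OK;
}
```

With the macro undefined, a call such as quarter_of(13, &q) returns
IDL_RANGE_ERROR and never reaches the body; building with
-DIDL_PANIC_ON_VIOLATION turns the same violation into an abort.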

The OCL *documents* the calling constraints of the
function (and its return values) in a language neutral
manner.  I.e., you could create an ASM binding for the
IDL compiler's output and the programmer would be
none the wiser.

The advantage of driving the code generator this way is
the "documentation" creates the code -- if you don't
*document* (declare) a constraint, then it isn't enforced.

It ensures the code and documentation agree and that
every bit of documentation has a corresponding bit of
code (but not necessarily the other way around)

> Often these templates generated documents by using something like
> Doxygen.
> 
> And on _every_single_one_ of those projects and libraries, the
> comments were wrong often enough that nobody who knew which way was up
> paid any attention to them. If you wanted to know what the parameters
> were for, what the function returned, and so on, you read the C code.

You *always* read the code.  The OCL declarations *are* effectively
code; the stub generated *will* reference "month" and not "moth"
or "monday" (or whatever).  But, they are formally expressed in a
syntax defined by the "specification language" (~OCL in my case).

Invoking the exemplar with a month of "13" could possibly work
within the body of the function, as implemented -- perhaps treating
this as year++ with month=1 -- but the invariant won't let the
value *into* the function.  Because the intent was *not* to invoke
the function with a bogus month value.

19A0 is not 2000!

The whole point is to encourage the developer to codify (in OCL)
the constraints on the code so that the IDL compiler can create
the actual instruction sequence (in the language bound to that set
of command line switches) to enforce those constraints.

*But*, you are still reliant on discipline; if the developer
doesn't declare those constraints, then the compiler can't create
any code to do this and simply is resigned to creating the code
to marshal arguments and pack the message for transport.

One can casually inspect the IDL files to see if there is an
abundance -- or a dearth -- of constraints without having to
parse countless source files.  The IDL files *generate* the
"header" files so you can't skip that step.

Additionally, it can generate the server-side stubs (in whichever
language binding is appropriate *there*) to unpack and parse
the message, convert the arguments to whatever format is "native"
for the server (knowing that their values are "legitimized" by
the client-side stub) and hand them off to the server-side
function.

[similarly handling the return message]
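
A small sketch of that client/server split in C, under stated
assumptions: the message layout (one byte of month, two bytes of year,
big-endian), the 16-bit limit on year, and the names pack_set_date() and
unpack_set_date() are all invented for illustration and are not the
output of any real IDL compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed wire format: 1 byte month, 2 bytes year (big-endian). */
enum { MSG_SIZE = 3 };

/* Client-side stub: enforce the declared constraint, then marshal.
 * Returns the number of bytes written, or 0 if an argument is rejected. */
size_t pack_set_date(uint8_t buf[MSG_SIZE], int month, unsigned year)
{
    assert(buf != NULL);

    /* month constraint from the declaration; the year limit is an
     * assumption imposed by the 2-byte field in this sketch. */
    if (!(month > 0 && month < 13) || year > 0xFFFFu) {
        return 0;
    }

    buf[0] = (uint8_t)month;
    buf[1] = (uint8_t)(year >> 8);
    buf[2] = (uint8_t)(year & 0xFFu);
    return MSG_SIZE;
}

/* Server-side stub: unpack into native types.  It does not re-check the
 * range, relying on the client-side stub having legitimized the values. */
void unpack_set_date(const uint8_t buf[MSG_SIZE], int *month, unsigned *year)
{
    assert(buf != NULL && month != NULL && year != NULL);

    *month = buf[0];
    *year = ((unsigned)buf[1] << 8) | buf[2];
}
```

Whether the server should trust the client-side check is a design
choice; this sketch follows the description above and does.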

> A lot of the time, even the numbers and names of the parameters
> described in the template didn't match the code.
> 
> The auto-generated PDF documents and HTML web site looked nice, though.

There's no point in generating "prose" from such a specification.
What are you going to do, pretty-print the generated stubs?  Or,
the OCL-expressed constraints?

Reply by Stephen Pelc ● June 28, 2022

On 27 Jun 2022 at 17:18:07 CEST, "Grant Edwards" <invalid@invalid.invalid>
wrote:

>> If you are just putting these things in comments, then they will get out
>> of sync with the code.
> 
> I'd have to agree. I've worked with many projects and third-party
> libraries over the decades which had a big template of comments for
> every function which described the input/output parameters, return
> value, global variables used, and so on.
> 
> Often these templates generated documents by using something like
> Doxygen.

For the last 20 years or so, virtually all our manuals have been created
by our own "literate programming" system called DocGen. DocGen is
optimised for Forth, but it would not be a big job to write a version for C.

DocGen diverges from Doxygen and friends in several ways. In
particular it does not need template blocks. If your C code is so bad
that another programmer cannot read the declaration, you need far
more help than DocGen or Doxygen can give you. The main entry
for a function follows the declaration:

float someFunc( int how, double x, double y )
// *G The purpose of *\c{someFunc} is ...
// ** ...
{
  ...
}

The lines starting // *x are formal comments to be processed by
DocGen. The *X parts are formatting commands, and the *\<name>{}
parts are text macros.

The ideas behind DocGen are that the code and the documentation
are never separated, and that the DocGen portion is not much larger
than the descriptive comments you should have in your code anyway.
Keeping the code in sync with the documentation is a matter of
company culture and management.

Whenever we receive third party code to include in our products,
we *always* DocGen it before release and we *always* find some
bugs. Overall, I estimate that writing the documentation alongside
the code costs about 10% extra, paid for by the reduction in bug level.

Stephen
-- 
Stephen Pelc, stephen@vfxforth.com
MicroProcessor Engineering, Ltd. - More Real, Less Time
133 Hill Lane, Southampton SO15 5AF, England
tel: +44 (0)23 8063 1441, +44 (0)78 0390 3612, +34 649 662 974
http://www.mpeforth.com - free VFX Forth downloads

Reply by David Brown ● June 28, 2022

On 27/06/2022 23:34, Don Y wrote:
> On 6/27/2022 8:18 AM, Grant Edwards wrote:

>>
>> The auto-generated PDF documents and HTML web site looked nice, though.
> 
> There's no point in generating "prose" from such a specification.
> What are you going to do, pretty-print the generated stubs?  Or,
> the OCL-expressed constraints?

That is /exactly/ what you do with tools like Doxygen - it extracts 
/interface/ information (function prototypes, type declarations, etc.), 
strips it of implementation-specific details, merges the comments (which 
should hopefully be in sync with the code), and generates clear, 
readable, searchable, cross-referenced documentation.

You use tools like that precisely so that people using your library or 
code do /not/ read the C code.  You don't even have to read the header 
files.

And if you are formalising your prototypes with some kind of interface 
description language to include preconditions, postconditions and 
invariants, then you want them included in the generated documentation. 
Ideally, that's what people will read, rather than the IDL source code 
or the generated C headers.

The key point of separation of interfaces and implementations is that 
people using the code should /only/ use the documented interfaces, and 
not rely on anything involved in the implementation.  So make the 
information about those interfaces clear and precise - such as good 
quality generated documentation - and make it accurate - such as by 
using an IDL.

Reply by Don Y ● June 28, 2022

On 6/28/2022 1:30 AM, Stephen Pelc wrote:
> On 27 Jun 2022 at 17:18:07 CEST, "Grant Edwards" <invalid@invalid.invalid>
> wrote:
> 
>>> If you are just putting these things in comments, then they will get out
>>> of sync with the code.
>>
>> I'd have to agree. I've worked with many projects and third-party
>> libraries over the decades which had a big template of comments for
>> every function which described the input/output parameters, return
>> value, global variables used, and so on.
>>
>> Often these templates generated documents by using something like
>> Doxygen.
> 
> For the last 20 years or so, virtually all our manuals have been created
> by our own "literate programming" system called DocGen. DocGen is
> optimised for Forth, but it would not be a big job to write a version for C.
> 
> DocGen diverges from Doxygen and friends in several ways. In
> particular it does not need template blocks. If your C code is so bad
> that another programmer cannot read the declaration, you need far
> more help than DocGen or Doxygen can give you. The main entry
> for a function follows the declaration:
> 
> float someFunc( int how, double x, double y )
> // *G The purpose of *\c{someFunc} is ...
> // ** ...
> {
>    ...
> }
> 
> The lines starting // *x are formal comments to be processed by
> DocGen. The *X parts are formatting commands, and the *\<name>{}
> parts are text macros.
> 
> The ideas behind DocGen are that the code and the documentation
> are never separated, and that the DocGen portion is not much larger
> than the descriptive comments you should have in your code anyway.
> Keeping the code in sync with the documentation is a matter of
> company culture and management.
> 
> Whenever we receive third party code to include in our products,
> we *always* DocGen it before release and we *always* find some
> bugs. Overall, I estimate that writing the documentation alongside
> the code costs about 10% extra, paid for by the reduction in bug level.

I do this by using a specific "paragraph tag" in FrameMaker documents
(e.g., "Code") and then have a simple utility that extracts all thusly
tagged paragraphs to create the "source file" -- which is then
compiled <however>.

[FM files are relatively easy to parse and the format has been
consistent for many releases; I wouldn't think of this sort of
approach with MSWord acting as "container"!]

It adds an extra step to the process (because the source doesn't exist
until extracted from the document).

But, it is ill-suited to producing "manuals" as the presentation
must be linear with the code; you can't tangle/weave to arrange
the code in a different order than the documentation.

OTOH, it is excellent for mixing multimedia with "code"; I can put
an illustration between "if" and "then".  Or, a sound snippet to
indicate what a particular (audio) waveform -- expressed as an
array of floats -- *sounds* like adjacent to those constants.
This is particularly helpful with domain-specific constructs,
mechanisms and phenomena with which a generic programmer might
not have prior experience.

I document the "rationale" and "strategy" behind the code, elsewhere.
That can take the "30,000 ft view" of the code and usually needs
infrequent maintenance.  E.g., why was Q12.4 format chosen?  Show
me the error analysis behind that choice relative to other formats.
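
As a small worked example of the kind of arithmetic such an error
analysis rests on, here is a sketch of Q12.4 conversion in C.  It
assumes a signed 16-bit container with 4 fractional bits (Qm.n
conventions vary), and the function names are hypothetical.  With
round-to-nearest, the step is 1/16 = 0.0625 and the worst-case
quantization error is half a step, 1/32 = 0.03125.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed format: signed 16-bit value with 4 fractional bits (step = 1/16). */
enum { Q_FRAC_BITS = 4, Q_SCALE = 1 << Q_FRAC_BITS };

/* Convert to Q12.4, rounding to nearest (ties away from zero). */
int16_t q12_4_from_double(double x)
{
    /* Representable range of the assumed container. */
    assert(x >= -2048.0 && x <= 2047.9375);

    double scaled = x * Q_SCALE + (x >= 0.0 ? 0.5 : -0.5);
    return (int16_t)(long)scaled;  /* the cast truncates toward zero */
}

/* Convert back to floating point; exact for every Q12.4 value. */
double q12_4_to_double(int16_t q)
{
    return (double)q / Q_SCALE;
}

/* Absolute quantization error for one input value. */
double q12_4_abs_error(double x)
{
    double err = q12_4_to_double(q12_4_from_double(x)) - x;
    return err < 0.0 ? -err : err;
}
```

For instance, 3.14159 is stored as 50 (that is, 3.125), an error of about
0.0166, inside the 0.03125 bound.  Comparing that bound against the
requirements is the part that belongs in the rationale document.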

Keeping modules short and supporting other non-text annotations
makes it relatively easy for folks to understand the specifics of
an implementation.

But, all of these techniques (yours included) rely on discipline.
There's nothing that mechanically verifies the code and comments
agree.  Even semi-automatic mechanisms rely on the developer
having *created* them (e.g., #including an audio file that
was generated by extracting those floats and converting them
to audio).  Too often, the "solution" is simply to remove
comments rather than ensuring they are maintained.

Sadly, my experience has been that folks aren't keen on keeping
docs and code in sync and the more documentation, the less it
tends to track the code.  Especially for projects that "evolved"
instead of being "designed".  (each refactor requiring a substantial
reframing of the commentary)

Reply by Stephen Pelc ● June 28, 2022

On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:
>> The ideas behind DocGen are that the code and the documentation
>> are never separated, and that the DocGen portion is not much larger
>> than the descriptive comments you should have in your code anyway.
>> Keeping the code in sync with the documentation is a matter of
>> company culture and management.

> Sadly, my experience has been that folks aren't keen on keeping
> docs and code in sync and the more documentation, the less it
> tends to track the code.  Especially for projects that "evolved"
> instead of being "designed".  (each refactor requiring a substantial
> reframing of the commentary)

As others have said it needs discipline. Discipline comes from
management. As the boss, I have made it quite clear that use
of DocGen is a requirement to work at the company. In turn
it is my job to ensure that people know how to use the tool.

Stephen


Reply by Don Y ● June 28, 2022

On 6/28/2022 7:35 AM, Stephen Pelc wrote:
> On 28 Jun 2022 at 14:49:41 CEST, "Don Y" <blockedofcourse@foo.invalid> wrote:
>>> The ideas behind DocGen are that the code and the documentation
>>> are never separated, and that the DocGen portion is not much larger
>>> than the descriptive comments you should have in your code anyway.
>>> Keeping the code in sync with the documentation is a matter of
>>> company culture and management.
> 
>> Sadly, my experience has been that folks aren't keen on keeping
>> docs and code in sync and the more documentation, the less it
>> tends to track the code.  Especially for projects that "evolved"
>> instead of being "designed".  (each refactor requiring a substantial
>> reframing of the commentary)
> 
> As others have said it needs discipline. Discipline comes from
> management. As the boss, I have made it quite clear that use
> of DocGen is a requirement to work at the company. In turn
> it is my job to ensure that people know how to use the tool.

You can "legislate" the use of a tool or adherence to a standard.
But, these are subjective issues -- not like "derate all caps by
40%" (which can be independently, mathematically verified).  You
rely on individual "employees" for their judgement as to the
effectiveness of their documentation.  Likewise, the efficacy
of their test/validation efforts.

EVERY employer and client I've ever worked with has had formal
standards regarding code "style", documentation, testing, etc.
"The Boss" in these cases has ranged from accountants, to
mechanical engineers, to electrical engineers ("no longer
practicing"), to economists.  I.e., they can mandate but aren't
qualified to evaluate the quality of the work performed.

You can have peers review each other's work.  But, I've not seen
that improve the work of folks who just don't have the drive
to "do better".  (And I can't remember anyone EVER being fired
for incompetence!)

The true test of this is handing the design to another party
(i.e., SELLING the design) and seeing how well the new owner
can come up to speed on the product.  If you have staff available
"later" that can be consulted wrt their previous work on a
design, then folks need not completely rely on print documentation.
