EmbeddedRelated.com
Forums
Memfault Beyond the Launch

C++, Ada, ...

Started by pozz April 17, 2021
On 19/04/2021 18:38, pozz wrote:

> What do you suggest for a poor C embedded developer that wants to try
> C++ on the next project?
>
> I would use gcc on Cortex-M MCUs.
I'm not entirely sure what you are asking - "gcc on Cortex-M" is, I would say, the right answer if you are asking about tools.

Go straight to a new C++ standard - C++17. (If you see anything that mentions C++98 or C++03, run away - it is pointless unless you have to maintain old code.) Lots of things got a lot easier in C++11, and have improved since. Unfortunately the law of backwards compatibility means old cruft still has to work, and is still there in the language. But that doesn't mean you have to use it.

Go straight to a newer gcc - gcc 10 from GNU Arm Embedded. The error messages are much better (or at least less horrendous), and the static checking is better. Be generous with your warnings, and use a good IDE with syntax highlighting and basic checking in the editor.

Disable exceptions and RTTI (-fno-exceptions -fno-rtti), and enable optimisation. C++ (used well) results in massive and incomprehensible assembly unless you have at least -O1.

Don't try to learn everything at once. Some things, like rvalue references, are hard and rarely useful unless you are writing serious template libraries. There are many features of C++ that are needed to write libraries rather than use them.

Don't be afraid of templates - they are great.

Be wary of the bigger parts of the C++ standard library - std::vector and std::unordered_map are very nice for PC programming, but are far too dynamic for small-systems embedded programming. (std::array, however, is extremely useful. And I like std::optional.)

Think about how the code might be implemented - if it seems that a feature or class could be implemented in reasonably efficient object code on a Cortex-M, then it probably will be. If it looks like it will need dynamic memory, it probably does.

Big class inheritance hierarchies, especially with multiple inheritance, virtual functions, etc., are old-fashioned. Where you can, use compile-time (static) polymorphism rather than run-time polymorphism. That means templates, overloaded functions, CRTP, etc.

Keep <https://en.cppreference.com/w/cpp> handy. Same goes for <https://godbolt.org>.

And keep smiling! [](){}(); (That's the C++11 smiley - when you understand what it means, you're laughing!)
On 19/04/2021 22:48, David Brown wrote:
> On 19/04/2021 18:38, pozz wrote:
>
>> What do you suggest for a poor C embedded developer that wants to try
>> C++ on the next project?
>>
>> I would use gcc on Cortex-M MCUs.
>>
>
> I'm not entirely sure what you are asking - "gcc on Cortex-M" is, I
> would say, the right answer if you are asking about tools.
I mentioned the tools just as a starting point. I know almost nothing about C++, but I have coded in C for many years. I think there are some precautions to take when learning C++ in this situation, compared with learning C++ as a first language.
> Go straight to a new C++ standard - C++17. (If you see anything that
> mentions C++98 or C++03, run away - it is pointless unless you have to
> maintain old code.) Lots of things got a lot easier in C++11, and have
> improved since. Unfortunately the law of backwards compatibility means
> old cruft still has to work, and is still there in language. But that
> doesn't mean you have to use it.
>
> Go straight to a newer gcc - gcc 10 from GNU Arm Embedded. The error
> messages are much better (or at least less horrendous), and the static
> checking is better. Be generous with your warnings, and use a good IDE
> with syntax highlighting and basic checking in the editor.
>
> Disable exceptions and RTTI (-fno-exceptions -fno-rtti), and enable
> optimisation. C++ (used well) results in massive and incomprehensible
> assembly unless you have at least -O1.
>
> Don't try and learn everything at once. Some things, like rvalue
> references, are hard and rarely useful unless you are writing serious
> template libraries. There are many features of C++ that are needed to
> write libraries rather than use them.
>
> Don't be afraid of templates - they are great.
>
> Be wary of the bigger parts of the C++ standard library - std::vector
> and std::unordered_map are very nice for PC programming, but are far too
> dynamic for small systems embedded programming. (std::array, however,
> is extremely useful. And I like std::optional.)
>
> Think about how the code might be implemented - if it seems that a
> feature or class could be implemented in reasonably efficient object
> code on a Cortex-M, then it probably will be. If it looks like it will
> need dynamic memory, it probably does.
Dynamic memory... is it possible to have a C++ project without using heap at all?
> Big class inheritance hierarchies, especially with multiple inheritance,
> virtual functions, etc., is old-fashioned. Where you can, use
> compile-time (static) polymorphism rather than run-time polymorphism.
> That means templates, overloaded functions, CRTP, etc.
>
> Keep <https://en.cppreference.com/w/cpp> handy. Same goes for
> <https://godbolt.org>.
>
> And keep smiling! [](){}(); (That's the C++11 smiley - when you
> understand what it means, you're laughing!)
On 2021-04-19 22:20, Paul Rubin wrote:
> Niklas Holsti <niklas.holsti@tidorum.invalid> writes:
>> In Ada... The construction step can assign some initial values that
>> can be defined by default (pointers default to null, for example)...
>>
>> type Zero_Handler is access procedure;
>> type Counter is record ...
>> At_Zero : Zero_Handler; -- Default init to null.
>
> Wow, that is disappointing. I had thought Ada access types were like
> C++ or ML references, i.e. they have to be initialized to point to a
> valid object, so they can never be null.
You can impose that constraint if you want to: if I had defined

   type Zero_Handler is not null access procedure;

then the above declaration of the At_Zero component would be illegal, and the compiler would insist on a non-null initial value. But IMO sometimes you need pointers that can be null, just as you sometimes need null values in a database.

There are also other means in Ada to force the explicit initialization of objects at declaration (the "unspecified discriminants" method).

The state of the art in Ada implementations of critical systems is slowly moving toward using static analysis and proof tools to verify that no run-time check failures, such as accessing a null pointer, can happen. That is already fairly easy to do with the AdaCore tools (CodePeer, SPARK and others). Proving functional correctness still remains hard.
> I plan to spend some time on Rust pretty soon. This is based on
> impression rather than experience, but ISTM that a lot of Rust is
> designed around managing dynamic memory allocation by ownership
> tracking, like C++ unique_ptr on steroids built into the language. That
> lets you write big applications that heavily use dynamic allocation
> while avoiding the usual malloc/free bugs and without using garbage
> collection. Ada on the other hand is built for high assurance embedded
> applications that don't use dynamic allocation much, except maybe at
> program initialization time. So Rust and Ada aim to solve different
> problems.
There is a proposal and an implementation from AdaCore to augment Ada pointers with an "ownership" concept, as in Rust. I believe that SPARK supports that proposal. Again, in some cases you want shared ownership (multiple pointers to the same object), and then the Rust ownership concept is not enough, as I understand it (I am not an expert at all).
On 2021-04-19 22:47, Paul Rubin wrote:
> Dimiter_Popoff <dp@tgi-sci.com> writes:
>> On 4/19/2021 14:04, Niklas Holsti wrote:
>>> ...their HR departments say that they cannot find programmers trained
>>> in Ada. Bah, a competent programmer will pick up the core concepts
>>> quickly, says I.
>>
>> This is valid not just for ADA. An experienced programmer will need days
>> to adjust to this or that language. I guess most if not all of us have
>> been through it.
>
> No it's much worse than that. First of all some languages are really
> different and take considerable conceptual adjustment: it took me quite
> a while as a C and Python programmer to become anywhere near clueful
> about Haskell. But understanding Haskell then demystified parts of C++
> that had made no sense to me at all.
I agree that it takes a mental leap to go from an imperative language to a functional language, or to a logic-programming / declarative language. To maintain a Haskell program, one would certainly prefer to hire a programmer experienced in functional programming over one experienced only in C, C++ or Ada.
> Secondly, being competent in a language now means far more than the
> language itself. There is also a culture and a code corpus out there
> which also have to be assimilated for each language. E.g. Ruby is a
> very simple language, but coming up to speed as a Ruby developer means
> getting used to a decade of Rails hacks, ORM internals, 100's of "gems"
> (packages) scattered over 100s of Github repositories, etc. It's the
> same way with Javascript and the NPM universe plus whatever
> framework-of-the-week your project is using. Python is not yet that
> bad, because it traditionally had a "batteries included" ethic that
> tried to standardize more useful functions than other languages did, but
> it seems to have given up on that in the past few years.
Relying on libraries/packages from the Internet is also a huge vulnerability. Not long ago a large part of the world's programs in one of these languages (unfortunately I forget which -- it was reported on comp.risks) suddenly stopped working because they all depended on real-time download of a small library package from a certain repository, where the maintainer of that package had quarreled with the repository owner/provider and had removed the package. Boom... Fortunately it was a very small piece of SW and was easily replaced. The next step is for a malicious actor to replace some such package with malware... the programs which use it will seem to go on working, but may not do what they are supposed to do.
On 2021-04-19 23:19, David Brown wrote:
> On 19/04/2021 18:16, Niklas Holsti wrote:
>> On 2021-04-19 15:22, David Brown wrote:
>>> On 19/04/2021 12:51, Niklas Holsti wrote:
>>>>
>>>> (I think there should be a "volatile" spec for the "p" object, don't
>>>> you?)
>>>
>>> It might be logical to make it volatile, but the code would not be
>>> different (the inline assembly has memory clobbers already, which force
>>> the memory accesses to be carried out without re-arrangements).
>>
>> So you are relying on the C++ compiler actually respecting the "inline"
>> directive? Are C++ compilers required to do that?
>
> No, it is not relying on the "inline" at all - it is relying on the
> semantics of the inline assembly code (which is compiler-specific,
> though several major compilers support the gcc inline assembly syntax).
>
> Compilers are required to support "inline" correctly, of course - but
> the keyword doesn't actually mean "generate this code inside the calling
> function". It is one of these historical oddities - it was originally
> conceived as a hint to the compiler for optimisation purposes, but what
> it /actually/ means is roughly "It's okay for there to be multiple
> definitions of this function in the program - I promise they will all do
> the same thing, so I don't mind which you use in any given case".
If the call is not in fact inlined, it seems to me that the compilation of the caller does not see the "asm volatile" in the callee, and therefore might reorder non-volatile accesses in the caller with respect to the call. But perhaps such reordering is forbidden in this example, because p is a pointer, and the callee might access the same underlying object (*p) through some other pointer to it, or directly.
> As a matter of style, I really do not like the "declare all variables at
> the start of the block" style, standard in Pascal, C90 (or older), badly
> written (IMHO) newer C, and apparently also Ada. I much prefer to avoid
> defining variables until I know what value they should hold, at least
> initially. Amongst other things, it means I can be much more generous
> about declaring them as "const", there are almost no risks of using
> uninitialised data, and the smaller scope means it is easier to see all
> use of the variable.
I mostly agree. However, there is always the possibility of having an exception, and the question of which variables an exception handler can see and use.

When all variable declarations are collected in one place, as in

   declare
      <local var declarations>
   begin
      <statements>
   exception
      <handlers>
   end

it is easy and safe to say that the handlers can rely on all the variables declared between "declare" and "begin" being in existence when handling some exception from the <statements>. If variables are declared here and there in the <statements>, they might or might not yet exist when the exception happens, and it would not be safe for the local exception handler to access them in any way.

Of course one can nest exception handlers when one nests blocks, but that becomes cumbersome pretty quickly. I don't know how C++ really addresses this problem.
>> (In the next Ada standard -- probably Ada 2022 -- one can write such
>> updating assignments more briefly, as
>>
>>    p.all := @ + z;
>>
>> but the '@' can be anywhere in the right-hand-side expression, in one or
>> more places, which is more flexible than the C/C++ combined
>> assignment-operations like "+=".)
>
> It may be flexible, but I'm not convinced it is clearer nor that it
> would often be useful. But I guess that will be highly related to
> familiarity.
Yes, I have not yet used it at all, although I believe GNAT already implements it.

I imagine one not uncommon use might be in function calls, such as

   x := Foo (@, ...);

I believe that the main reason this new Ada feature was formulated in this "more flexible" way was not the desire for more flexibility, but to avoid the introduction of many more lexical tokens and variations like "+=", or "+:=" as it would have been for Ada.
On 4/19/21 6:03 PM, Niklas Holsti wrote:
> On 2021-04-19 23:19, David Brown wrote:
>> On 19/04/2021 18:16, Niklas Holsti wrote:
>>> On 2021-04-19 15:22, David Brown wrote:
>>>> On 19/04/2021 12:51, Niklas Holsti wrote:
>>>>>
>>>>> (I think there should be a "volatile" spec for the "p" object, don't
>>>>> you?)
>>>>
>>>> It might be logical to make it volatile, but the code would not be
>>>> different (the inline assembly has memory clobbers already, which force
>>>> the memory accesses to be carried out without re-arrangements).
>>>
>>> So you are relying on the C++ compiler actually respecting the "inline"
>>> directive? Are C++ compilers required to do that?
>>
>> No, it is not relying on the "inline" at all - it is relying on the
>> semantics of the inline assembly code (which is compiler-specific,
>> though several major compilers support the gcc inline assembly syntax).
>>
>> Compilers are required to support "inline" correctly, of course - but
>> the keyword doesn't actually mean "generate this code inside the calling
>> function". It is one of these historical oddities - it was originally
>> conceived as a hint to the compiler for optimisation purposes, but what
>> it /actually/ means is roughly "It's okay for there to be multiple
>> definitions of this function in the program - I promise they will all do
>> the same thing, so I don't mind which you use in any given case".
>
> If the call is not in fact inlined, it seems to me that the compilation
> of the caller does not see the "asm volatile" in the callee, and
> therefore might reorder non-volatile accesses in the caller with respect
> to the call. But perhaps such reordering is forbidden in this example,
> because p is a pointer, and the callee might access the same underlying
> object (*p) through some other pointer to it, or directly.
Unless the compiler knows otherwise, it must assume that a call to a function might access volatile information, so it cannot migrate a volatile access across that call. Non-volatile accesses that cannot be affected by the call can be migrated.
On 19/04/21 20:20, Paul Rubin wrote:

> I plan to spend some time on Rust pretty soon. This is based on
> impression rather than experience, but ISTM that a lot of Rust is
> designed around managing dynamic memory allocation by ownership
> tracking, like C++ unique_ptr on steroids built into the language. That
> lets you write big applications that heavily use dynamic allocation
> while avoiding the usual malloc/free bugs and without using garbage
> collection.
It also helps with concurrency. From https://doc.rust-lang.org/book/ch16-00-concurrency.html:

"Initially, the Rust team thought that ensuring memory safety and preventing concurrency problems were two separate challenges to be solved with different methods. Over time, the team discovered that the ownership and type systems are a powerful set of tools to help manage memory safety and concurrency problems! By leveraging ownership and type checking, many concurrency errors are compile-time errors in Rust rather than runtime errors."

I haven't kicked Rust's tyres, but that seems plausible.
pozz <pozzugno@gmail.com> writes:
> I mentioned the tools just as a starting point. I don't know almost
> anything about C++, but I coded C for many years. I think there are
> some precautions to take in this situation to learn C++ respect
> learing C++ as the first language.
In this case use the book I mentioned; also, Stroustrup's introductory book "Programming: Principles and Practice Using C++" is supposed to be good (I haven't used it). He also wrote "The C++ Programming Language", which is more of a reference manual and which I found indispensable. Make sure to get the most recent edition in either case. C++ changed tremendously with C++11 and anything from before that should be considered near-useless. So that means 4th edition or later for the reference manual; I'm not sure for the intro book. Alternatively it might be best to skip C++ entirely and use Ada or Rust. I don't have a clear picture in my mind of how to address that.
> Dynamic memory... is it possible to have a C++ project without using
> heap at all?
Yes, C++ is a superset of C, more or less. You do have to maintain some awareness of where dynamic allocation can happen, to avoid using it, at least after program initialization is finished.
>> And keep smiling! [](){}(); (That's the C++11 smiley - when you
>> understand what it means, you're laughing!)
Heh, if that means what I think it means.
Niklas Holsti <niklas.holsti@tidorum.invalid> writes:
> You can impose that constraint if you want to: if I had defined
>    type Zero_Handler is not null access procedure;
> then the above declaration of the At_Zero component would be illegal,
> and the compiler would insist on a non-null initial value.
Oh, this is nice.
> But IMO sometimes you need pointers that can be null, just as you
> sometimes need null values in a database.
Preferable these days is to use a separate type for a value that is nullable or optional, so failing to check for null gives a compile-time type error. This is 't option in ML, Maybe a in Haskell, and std::optional<T> these days in C++.
> The state of the art in Ada implementations of critical systems is
> slowly becoming to use static analysis and proof tools to verify that
> no run-time check failures, such as accessing a null pointer, can
> happen. That is already fairly easy to do with the AdaCore tools
> (CodePeer, SPARK and others). Proving functional correctness still
> remains hard.
I know about SPARK. Is CodePeer something along the same lines? Is it available through GNU, or is it AdaCore proprietary, or what? I still have Burns & Wellings' book on SPARK on the recommendation of someone here. It looks good but has been in my want-to-read pile since forever. One of these days.
On 20/04/2021 00:03, Niklas Holsti wrote:
> On 2021-04-19 23:19, David Brown wrote:
>> On 19/04/2021 18:16, Niklas Holsti wrote:
>>> On 2021-04-19 15:22, David Brown wrote:
>>>> On 19/04/2021 12:51, Niklas Holsti wrote:
>>>>>
>>>>> (I think there should be a "volatile" spec for the "p" object, don't
>>>>> you?)
>>>>
>>>> It might be logical to make it volatile, but the code would not be
>>>> different (the inline assembly has memory clobbers already, which force
>>>> the memory accesses to be carried out without re-arrangements).
>>>
>>> So you are relying on the C++ compiler actually respecting the "inline"
>>> directive? Are C++ compilers required to do that?
>>
>> No, it is not relying on the "inline" at all - it is relying on the
>> semantics of the inline assembly code (which is compiler-specific,
>> though several major compilers support the gcc inline assembly syntax).
>>
>> Compilers are required to support "inline" correctly, of course - but
>> the keyword doesn't actually mean "generate this code inside the calling
>> function". It is one of these historical oddities - it was originally
>> conceived as a hint to the compiler for optimisation purposes, but what
>> it /actually/ means is roughly "It's okay for there to be multiple
>> definitions of this function in the program - I promise they will all do
>> the same thing, so I don't mind which you use in any given case".
>
> If the call is not in fact inlined, it seems to me that the compilation
> of the caller does not see the "asm volatile" in the callee, and
> therefore might reorder non-volatile accesses in the caller with respect
> to the call. But perhaps such reordering is forbidden in this example,
> because p is a pointer, and the callee might access the same underlying
> object (*p) through some other pointer to it, or directly.
Either the compiler "sees" the definition of the functions, and can tell that there are things that force the memory access to be done in middle (whether these functions are inlined or not), or the compiled does not "see" the definitions and must therefore make pessimistic assumptions about what the functions might do. The compiler can't re-order things unless it can /prove/ that it is safe to do so.
>> As a matter of style, I really do not like the "declare all variables at
>> the start of the block" style, standard in Pascal, C90 (or older), badly
>> written (IMHO) newer C, and apparently also Ada. I much prefer to avoid
>> defining variables until I know what value they should hold, at least
>> initially. Amongst other things, it means I can be much more generous
>> about declaring them as "const", there are almost no risks of using
>> uninitialised data, and the smaller scope means it is easier to see all
>> use of the variable.
>
> I mostly agree. However, there is always the possibility of having an
> exception, and the question of which variables an exception handler can
> see and use.
In C++ (and C99, though that doesn't have exceptions), you don't need to put your variables at the start of a block, but their scope and lifetime will last until the end of the block. So if you need a variable in an exception handler, you have to have declared it before the try block, but you don't need a rigid variables-then-code block structure:

   {
      foo1();
      int x = foo2();   // <- Not at start of block
      foo3(x);
      try {
         foo4(x, x);
      } catch (...) {
         foo5(x);
      }
   }
> When all variable declarations are collected in one place, as in
>
>    declare
>       <local var declarations>
>    begin
>       <statements>
>    exception
>       <handlers>
>    end
>
> it is easy and safe to say that the handlers can rely on all the
> variables declared between "declare" and "begin" being in existence when
> handling some exception from the <statements>. If variables are declared
> here and there in the <statements>, they might or might not yet exist
> when the exception happens, and it would not be safe for the local
> exception handler to access them in any way.
Flexible placement of declarations does not mean /random/ placement, nor does it mean you don't know what is in scope and what is not! And you can be pretty sure the compiler will tell you if you are trying to use a variable outside its scope.
> Of course one can nest exception handlers when one nests blocks, but
> that becomes cumbersome pretty quickly.
>
> I don't know how C++ really addresses this problem.
For the kinds of code I write - on small systems - I disable exceptions in C++. I prefer to write code that doesn't go wrong (unusual circumstances are just another kind of value), and the kind of places where exceptions might be most useful don't turn up often. (I use exceptions in Python in PC programming, but the balance is different there.)
>>> (In the next Ada standard -- probably Ada 2022 -- one can write such
>>> updating assignments more briefly, as
>>>
>>>    p.all := @ + z;
>>>
>>> but the '@' can be anywhere in the right-hand-side expression, in one or
>>> more places, which is more flexible than the C/C++ combined
>>> assignment-operations like "+=".)
>>
>> It may be flexible, but I'm not convinced it is clearer nor that it
>> would often be useful. But I guess that will be highly related to
>> familiarity.
>
> Yes, I have not yet used it at all, although I believe GNAT already
> implements it.
>
> I imagine one not uncommon use might be in function calls, such as
>
>    x := Foo (@, ...);
>
> I believe that the main reason this new Ada feature was formulated in
> this "more flexible" way was not the desire for more flexibility, but to
> avoid the introduction of many more lexical tokens and variations like
> "+=", or "+:=" as it would have been for Ada.
Sounds reasonable.
