EmbeddedRelated.com
The 2026 Embedded Online Conference

validity of ... reasons for preferring C over C++

Started by Nobody October 16, 2014
On 14-10-18 21:10 , David Brown wrote:

> There is some evidence suggesting that the "errors per line of code"
> rates is fairly independent of the programming language - and with all
> other things being equal (which they seldom are), a more compact
> language will have a lower bug rate than a more verbose one.

Stephen F. Zeigler reported in 1995 on an empirical study, done within Rational Software Corporation, of a large set of C and Ada modules being developed and maintained by the same team, with the same people working on both C and Ada modules.

The rate of defects/KSLOC was 0.676 for the C modules, and 0.096 for the Ada modules - about 7 times more defects per line in C than in Ada.

The study makes many other interesting comparisons: http://archive.adaic.com/intro/ada-vs-c/cada_art.html.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
On 18/10/14 19:14, David Brown wrote:
> On 17/10/14 23:11, Paul Rubin wrote:
>> This is called "typeful programming" (search on the phrase) and the idea
>> is it helps the compiler catch errors in the code.
>
> Yes, and it can sometimes be helpful in that way - but it can also mean
> that you need extra code to deal with the conversions, and that means
> extra scope for errors. It works both ways.

My experience (not with Ada) is that requiring extra information (that aids mechanical checking) is a net benefit.

The major disadvantage of "extra verbiage" is that the advantages are lost in cases in which it is automatically inserted or the programmer does "whatever is necessary to shut the compiler up". Nothing can help in that latter case.

The extra information enables the toolset to be much more helpful when exploring/understanding the code, when refactoring the code, and when writing the code. "Autocomplete" operations largely remove the disadvantage of "extra verbiage".
On 18/10/14 19:10, David Brown wrote:
> On 17/10/14 23:44, Wouter van Ooijen wrote:
>> David Brown wrote on 17-Oct-14 11:03 PM:
>>> On 17/10/14 19:15, Niklas Holsti wrote:
>>>> On 14-10-17 13:44 , David Brown wrote:
>>>>> On 17/10/14 11:35, Paul Rubin wrote:
>>>>>> David Brown <david.brown@hesbynett.no> writes:
>>>>>>> I disagree, in the embedded world. In PC programming, I agree - very
>>>>>>> often you are better with a language like Python than with C or C++.
>>>>>>
>>>>>> I have to think that even in MCU programming there have to be better
>>>>>> alternatives to C or C++. Ada seems like a possibility except that the
>>>>>> toolchains are either obscure or very expensive. There have been some C
>>>>>> dialects like Cyclone that were clever but never got traction. And I
>>>>>> guess there are some specialty EDSL's like Atom that aren't so good for
>>>>>> general purpose development. Forth is interesting but it's from a
>>>>>> different world, and still unsafe though with fewer traps than C.
>>>>>
>>>>> In theory, one could do much better than C or C++ - but not in practice.
>>>>> For some systems, Ada is a possible choice. But while Ada is "safer"
>>>>> than C in some ways, it has its own problems
>>>>
>>>> Care to list a few of those? Just to have a good debate? Been a while...
>>>
>>> My experience with Ada is rather limited, so you might end up teaching
>>> me rather than getting a good debate. But of course others will no
>>> doubt join in.
>>>
>>> An obvious (but highly subjective) irritation with Ada is the verbosity
>>> of the language - lots of things need repeated, and many of the
>>> constructs are more wordy than necessary.
>>
>> But do you find it more difficult to *read* Ada because of this
>> verbosity? (One of the basic rules of software engineering is that you
>> must optimize for the reader, not for the author.)
>
> Yes, I find it harder to read (but again, I stress my limited experience
> - any language is easier to use after more practice). When reading C, I
> find the syntax and the common identifiers contrast with function names,
> variables, and other identifiers, making it easier to see the structure
> of the code. Ada just seems to have too many words for my liking - it
> reads like a school essay.

Without commenting on Ada specifically, my experience is that the more information you can give the compiler via the source code, the better the environmental tools can help you. I started with C when the state of the art was multiple green screens, emacs and grep. The power of modern environments with modern languages to recursively and accurately "show me everything that can call this method" and "show me everything reachable from here", plus "what methods can I call on variable x" is remarkable and helpful. For me that's a good tradeoff.
> There is some evidence suggesting that the "errors per line of code"
> rates is fairly independent of the programming language - and with all
> other things being equal (which they seldom are), a more compact
> language will have a lower bug rate than a more verbose one. I believe
> this is simply a matter of the amount of information that you can
> easily see and process at a time - this is why there is a common rule
> of keeping your functions shorter than one screenfull.
>
> Of course I agree with you that making code easy to read and understand
> is important - languages (and identifiers in the language) should not be
> made short to save keystrokes.

And, of course, that extends beyond mere names!
On 14-10-18 21:45 , Paul Rubin wrote:
> Jacob Sparre Andersen <jacob@jacob-sparre.dk> writes:
>>> I've heard of an Ada to C translator
>> The (still working) Vermont Technical College cubesat was developed that
>> way. - Except that they actually developed in SPARK to ensure the
>> absence of run-time errors.
>
> Do you know what Ada-to-C translation tools they used?

I believe they used the AdaMagic compiler from SofCheck (www.sofcheck.com). However, SofCheck has merged with AdaCore, and it seems the Ada-to-C translator is no longer available -- darn. Perhaps if AdaCore gets many requests...

There seems to be another such translator by MapuSoft, http://www.mapusoft.com/ada-to-c-changer/.
> What do the tools do about the Ada runtime?
If the VTC cubesat was done in SPARK, they probably did not use tasking.

For the MapuSoft translator, they claim to support several real-time kernels, and claim to support all of Ada 83 and Ada 95, but the easily viewable descriptions of the translator do not mention tasking and run-times explicitly.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
Tom Gardner <spamjunk@blueyonder.co.uk> wrote:

(snip)

> In my experience (not with Ada) is that requiring extra
> information (that aids mechanical checking) is a net benefit.
>
> The major disadvantage of "extra verbiage" is that the advantages
> are lost in cases in which it is automatically inserted or
> the programmer does "whatever is necessary to shut the compiler up".
> Nothing can help in that latter case.

Reminds me that Java requires the compiler to be able to determine that a scalar variable is given a value before it is referenced. They could have just zeroed all of them, but that is pretty much your "automatic". This way, you have a chance to think about it. But then again, you can just put =0 on all your declarations.
> The extra information enables the toolset to be much more helpful
> when exploring/understanding the code, when refactoring the code,
> and when writing the code. "Autocomplete" operations largely remove
> the disadvantage of "extra verbiage".

Yes, another chance to think about the problem, and possible mistakes.

-- glen
On 19/10/14 00:07, glen herrmannsfeldt wrote:
> Reminds me that Java requires the compiler to be able to determine
> that a scalar variable is given a value before it is referenced.
> They could have just zeroed all of them, but that is pretty much
> your "automatic". This way, you have a chance to think about it.
> But then again, you can just put =0 on all your declarations.

No.

If a primitive (int, long, boolean etc) isn't explicitly initialised then it is given a sensible default value (0, false etc).

Unless explicitly initialised an object also has a sensible default value, null. There's nothing in the compiler that will prevent you trying to "use" the object, but at runtime a NullPointerException is thrown and, if not caught, a stacktrace.

I've seen projects where NullPointerExceptions were regularly thrown, caught, and ignored. I didn't like it one little bit, but since it was a "contractual obligation" project and the program continued to work, there was no point in my making a great big fuss. I'm not sure whether or not to be comforted that the program continued to work :)

There's a story from the early Smalltalk days. Someone was bringing up a VM and everything was working except that scrollbars on windows worked in the opposite direction. Eventually it was traced to a VM primitive bug in the subtraction of negative numbers. On even days I think that illustrates remarkable robustness. On odd days I wonder how anyone could be sure that all VM bugs had been found.
On 14-10-18 00:03 , David Brown wrote:
> On 17/10/14 19:15, Niklas Holsti wrote:
>> On 14-10-17 13:44 , David Brown wrote:
>>> On 17/10/14 11:35, Paul Rubin wrote:
>>>> David Brown <david.brown@hesbynett.no> writes:
>>>>> I disagree, in the embedded world. In PC programming, I agree - very
>>>>> often you are better with a language like Python than with C or C++.
>>>>
>>>> I have to think that even in MCU programming there have to be better
>>>> alternatives to C or C++. Ada seems like a possibility except that the
>>>> toolchains are either obscure or very expensive. There have been some C
>>>> dialects like Cyclone that were clever but never got traction. And I
>>>> guess there are some specialty EDSL's like Atom that aren't so good for
>>>> general purpose development. Forth is interesting but it's from a
>>>> different world, and still unsafe though with fewer traps than C.
>>>
>>> In theory, one could do much better than C or C++ - but not in practice.
>>> For some systems, Ada is a possible choice. But while Ada is "safer"
>>> than C in some ways, it has its own problems
>>
>> Care to list a few of those? Just to have a good debate? Been a while...
>
> My experience with Ada is rather limited, so you might end up teaching
> me rather than getting a good debate. But of course others will no
> doubt join in.

Let the flames roar! :-) However, I will focus all my fire on C. I don't know enough about C++ to debate its merits.
> An obvious (but highly subjective) irritation with Ada is the verbosity
> of the language -

You may be surprised by the comparisons later in this post.
> lots of things need repeated,
Really? If you mean the repetition of "if" in the construct "if" .. "end if" and the similar repetitions of "loop", "case", and "record" in "end loop", "end case", "end record", I believe many programmers find them quite useful, compared to the multiply overloaded meanings of "}" in C. They help an Ada compiler give sensible error messages when some "end" is omitted by mistake.

The only places where you repeat things in Ada, and you do not need to repeat in C, that I can think of now, happen when you want to define the internal codes used for enumeration values or the bit-fields used for structs ("records" in Ada). In C you can specify those without repeating the name of the enumeration literal, or the name of the struct field; in Ada those are repeated, because the "representation clause" is a separate syntactic unit.

C:
   typedef enum {red = 4, blue = 6, green = 9} color_t;

Ada:
   type color_t is (red, blue, green);
   for color_t use (red => 4, blue => 6, green => 9);

To reduce repetition, the second Ada line can be written as

   for color_t use (4, 6, 9);

but that loses the close connection between the name and the corresponding value, and is therefore seldom used.
> and many of the constructs are more wordy than necessary. Using words
> rather than symbols is not necessarily a bad thing, within limits - C++
> arguably relies too much on symbols, as anyone trying to read a lambda
> function will know. But sometimes Ada reads as though you are chatting
> to the computer rather than programming.

The languages certainly have a different textual appearance. But I will stick my neck out and argue that for all but small toy programs, Ada is *less* verbose than C.

How should we measure verbosity? By the number of non-blank characters in the source-code file, or by the number of syntactical tokens? I think both measures are relevant.

If we measure by syntactical tokens, for comparable language constructs there is usually little difference between Ada and C, and in some cases the Ada form is shorter.

Let's look at some common examples, assuming that in each case the lengths of the user-chosen identifiers are the same in Ada and in C (I will argue later below that Ada identifiers should in general be *shorter* than C identifiers, assuming equivalent naming conventions). The C forms below follow the usual recommendations for always using {} in control structures and "break" in switch cases. In both Ada and C I insert more spaces than usual, around separators, to make it easier to count tokens.

Preparing to use another module

C:
   # include "sorting.h"

Ada:
   with Sorting ;

Same number of tokens. Ada is 7 characters shorter.

Typeless number definition

C:
   # define NUM_FROOS 55

Ada:
   Num_Froos : constant := 55 ;

C is shorter by 2 tokens and 5 characters.

Typed constant definition

C:
   const float x = 33.4f ;

Ada:
   x : constant float := 33.4 ;

C is shorter by 1 token and 4 characters.

Variable declaration

C:
   float x ;

Ada:
   x : float ;

C is one character and token shorter.

Type declaration (type naming)

C:
   typedef float voltage_t ;

Ada:
   subtype voltage_t is float ;

C is one token and 2 characters shorter.

Assignment statement

C:
   varname = expression ;

Ada:
   varname := expression ;

Same number of tokens; one character less in C.

Assignment to a variable through a pointer

C:
   * varname = expression ;

Ada:
   varname . all := expression ;

C is one token and 4 characters shorter.

Assignment to a field of a structure through a pointer to the structure

C:
   varname -> field = expression ;

Ada:
   varname . field := expression ;

Same number of tokens and characters in C and Ada.

Conditional statement

C:
   if ( condition ) {
      statements1
   } else {
      statements2
   }

Ada:
   if condition then
      statements1
   else
      statements2
   end if ;

Ada is shorter by two tokens (if we consider "end if" as two tokens) but longer by 4 characters, mainly because of the full-length words in "then" and "end if".

Counted loop from 1 to N

C:
   for ( int i = 1 ; i <= N ; i ++ ) {
      statements
   }

Ada:
   for i in 1 .. N loop
      statements
   end loop ;

Ada is shorter by 6 tokens, but has the same number of characters as C because of the full-length words in "loop" and "end loop".

The Ada and C forms are not exactly equivalent. In Ada, the loop counter "i" automatically inherits the type of N; in C, that is the programmer's responsibility. In C, the incrementation of "i" might overflow or wrap around, causing undefined behaviour or making the loop never ending; in Ada, the loop is sure to end at i = N, even if i+1 would overflow or wrap around.

While loop

C:
   while ( condition ) {
      statements
   }

Ada:
   while condition loop
      statements
   end loop ;

Same number of tokens in Ada and C, but Ada has 8 more characters because of the full-length words in "loop" and "end loop".

Conditional exit from loop

C:
   if ( condition ) break ;

Ada:
   exit when condition ;

Ada is shorter by 2 tokens and 1 character.

Switch-case with 2 cases and a default

C:
   switch ( expr ) {
   case 0 :
      statements
      break ;
   case 1 :
      statements
      break ;
   default :
      statements
   }

Ada:
   case expr is
   when 0 =>
      statements
   when 1 =>
      statements
   when others =>
      statements
   end case ;

Ada is shorter by 3 tokens and 2 characters.

Defining a structure/record type

C:
   typedef struct {
      float x , y , z ;
   } triple_t ;

Ada:
   type triple_t is record
      x , y , z : float ;
   end record ;

C is 2 tokens and 7 characters shorter, mainly because of the full words "record" and "end" in the Ada form.

Procedure (void function) declaration

C:
   void foo ( float x , float * y ) ;

Ada:
   procedure foo ( x : float ; y : out float ) ;

C is 2 tokens (the two colons in the formal parameter list) and 9 characters shorter.

The C and Ada forms are not exactly equivalent, because "float * y" in the C form means pass by address, while "y : out float" in Ada means no value is passed in, and a value is passed out by copy. To make the Ada form exactly the same as C, replace the "out" by "aliased in out", or define a pointer type float_ptr and write "y : in float_ptr".

For this last example, note that a reference to an output parameter, within a function or procedure, is generally shorter in Ada than in C, because in C the parameter is a pointer and therefore one must use the "*" or "->" forms, while in Ada an "out" parameter is not seen as a pointer and does not need the corresponding ".all" notation, even if the compiler chooses to pass the parameter by reference.

I could go on, but I believe the conclusion so far is clear: measured by number of tokens, sometimes the C form is shorter, sometimes the Ada form is shorter. In this collection of examples, the difference was largest for the counted loop, where Ada was shorter by 6 tokens.

Measured by number of non-blank characters, C is usually shorter, but the difference comes from the English keywords in Ada, where C instead uses special characters. For me, it is faster to type the English keywords, partly because on my Finnish keyboard the brackets {} require the Alt Graph key (and a careful glance at the topmost key row to avoid typing the adjacent () or [] instead).

But, to conclude my argument, IMO the clincher in the verbosity comparison is the name-space issue.

Ignoring comments, most of the text of a program consists of user-defined identifiers of constants, types, variables, functions, procedures. Many of these identifiers must be globally visible, for example the identifiers of the types, enumerations, and functions/procedures which a client module needs to see in order to call another module.

Because C lacks a module/namespace construct, in larger C programs one must usually extend these identifiers with module-specific prefixes to avoid accidental clashes of identifiers defined in different modules. For example, the function "display" defined in module "channels" may be called channels_display, or chan_display, or ch_display, depending on the degree of abbreviation desired. Unfortunately, these prefixes must then be used in every occurrence of the identifier, even within the defining module itself.

In contrast, an identifier "display" defined within an Ada package "channels" can be used as such within the package itself (and in all its child packages). Only in other packages is it necessary to write channels.display, or to open direct visibility to the channels package by saying "use channels".

This effect should, on average, make identifiers shorter in an Ada program than in the corresponding C program, assuming equivalent design standards for naming (for example, abbreviation rules) and a program of at least medium size.

Another factor that can shorten Ada identifiers is the possibility to overload function and procedure names (including enumeration literals) by type, while in C overloading is not possible (in C++ it is, but I am incompetent to talk about C++).

(A long argument, perhaps, but *my* verbosity was not at issue :-))
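The prefixing convention described here can be sketched in C as follows. This is only an illustration under assumptions: the "channels" module is squeezed into one file, and the names chan_display, chan_display_first and is_valid, as well as the limit of 8 channels, are invented for the example rather than taken from any real code base.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical "channels" module, shown in a single file for brevity.
   In a real project the public prototypes would live in channels.h. */

/* File-local helper: "static" keeps the name out of the global name
   space, so it can stay short and needs no module prefix. */
static int is_valid(int channel)
{
    return channel >= 0 && channel < 8;   /* 8 channels is an assumption */
}

/* Public function: external identifiers in C share one global name
   space, so the module prefix "chan_" is part of the name. */
int chan_display(int channel, char *buf, size_t len)
{
    if (!is_valid(channel)) {
        return -1;
    }
    return snprintf(buf, len, "channel %d", channel);
}

/* Even inside the defining module, a sibling function has to spell
   out the full prefixed name. */
int chan_display_first(char *buf, size_t len)
{
    return chan_display(0, buf, len);
}
```

A caller in another file would write something like chan_display(3, buf, sizeof buf). The only way to get short, unprefixed names in C is file scope via "static", which is why the helper above can be called is_valid without risk of clashing with another module.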
> Ada programming encourages the use of user-defined types for all sorts
> of things. You are not supposed to hold a "day" in an "int" or
> "uint8_t", you are supposed to define "type Day_type is range 1 .. 31;".

That is entirely a matter of convention. It can hardly be called a problem of the Ada language.
> Sometimes this sort of thing can make code clearer, but it can also
> make it harder to see what is really going on in the program. When you
> see a type like "uint8_t" or "int", you know exactly what it means -
> when the type is "Day_type", you have to think about it much more, and
> perhaps look up the definition.

This is a good example of the main difference in the concept of "type" in Ada and in C, or at least an example of how the meaning of "type" is different for typical programmers in the two languages.

For Ada, and for the typical Ada programmer, the type of an object describes the meaning of the object in the application world, in the problem statement.

For C, and for the typical C programmer, the type of an object describes the internal machine representation of the object.

"uint8_t" describes the machine representation but says nothing of the logical role or meaning of the value, in the problem and the application. In C, that is typically the role of the object's identifier, such as in the declaration

   uint8_t day_in_month;

"Day_type" describes the logical role, but (as such) does not say anything about the machine representation. This means that the object identifier can more easily be used to make the role of the particular object more precise, as in:

   Day_type pay_day;

In fact I believe that C practice is moving towards the Ada use of types. More and more, C programmers are advised to define type names such as

   typedef uint8_t day_t;

However, the rationale usually given for this advice is to help program maintenance in case the type must be changed to something else, which is true but trivial from the conceptual point of view.

Of course, the above does not mean that Ada programmers ignore internal representation issues, just that the two concerns are separated. If the Ada programmer wants to say that a Day_type uses 8 bits, she adds a Size aspect to the type declaration:

   type Day_type is range 1 .. 31 with Size => 8;
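As a rough C-side illustration of separating "role" from "representation": a plain typedef is only an alias, so one common workaround is to wrap the value in a single-member struct. The sketch below assumes a C11 compiler; the names day_alias_t, day_t, month_t, make_day, make_month and is_pay_day are invented for the example, and the one-byte size assertion holds on mainstream ABIs rather than being guaranteed by the language.

```c
#include <stdint.h>

/* A plain typedef is only an alias: day_alias_t and uint8_t are the
   same type, so the compiler accepts any mix of the two. */
typedef uint8_t day_alias_t;

/* Wrapping the value in a struct creates a distinct type: a month_t
   cannot be passed where a day_t is expected. */
typedef struct { uint8_t value; } day_t;
typedef struct { uint8_t value; } month_t;

/* Loose counterpart of "with Size => 8": fail the build if the
   representation is not one byte (true on mainstream ABIs, but not
   promised by the C standard). */
_Static_assert(sizeof(day_t) == 1, "day_t must occupy one byte");

static inline day_t make_day(uint8_t v)     { return (day_t){ v }; }
static inline month_t make_month(uint8_t v) { return (month_t){ v }; }

/* Accepts only day_t; calling it with a month_t is a compile error. */
static inline int is_pay_day(day_t d)
{
    return d.value == 25;   /* 25 is an arbitrary illustrative value */
}
```

Unlike the Ada declaration, nothing here restricts the value to 1 .. 31; the struct only stops accidental mixing of days with months or raw integers.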
> And because type conversions have to be explicit in Ada, you need
> to add lots of them when using these types.

Well, some programmers prefer to avoid automatic conversions. In Ada, you can control whether conversions are automatic or must be explicit, by using subtypes in the former case and types in the latter case. If you declare Day_type not as a type, but as a subtype:

   subtype Day_type is Integer range 1 .. 31;

you are free to write computations that mix Integer and Day_type, you can assign a Day_type variable to an Integer variable, and can assign an Integer expression to a Day_type variable (with, in the last case, an automatic check that the value is in the range 1 .. 31). However, you can still use Day_type'First and Day_type'Last to ask about the range, and you can loop over all days by

   for D in Day_type loop ... end loop;
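For comparison, here is a hand-written C approximation of what the subtype gives for free. It is a sketch under assumptions: DAY_FIRST, DAY_LAST, day_assign and count_days are hypothetical names, and the range check that Ada inserts automatically has to be called explicitly by the C programmer.

```c
#include <stdbool.h>

/* Hypothetical counterparts of Day_type'First and Day_type'Last. */
enum { DAY_FIRST = 1, DAY_LAST = 31 };

/* Plain alias: mixes freely with int, much like an Ada subtype. */
typedef int day_t;

/* Explicit version of the check Ada performs on assignment to the
   subtype: store the value only if it lies in DAY_FIRST .. DAY_LAST. */
static bool day_assign(day_t *dst, int value)
{
    if (value < DAY_FIRST || value > DAY_LAST) {
        return false;   /* the caller decides how to handle the error */
    }
    *dst = value;
    return true;
}

/* Loop over all days, the counterpart of "for D in Day_type loop". */
static int count_days(void)
{
    int n = 0;
    for (day_t d = DAY_FIRST; d <= DAY_LAST; d++) {
        n++;
    }
    return n;
}
```

The important difference is that nothing forces a C caller to go through day_assign; a direct assignment such as d = 99 still compiles, whereas the Ada check cannot be bypassed by accident.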
> C and C++ do more of this automatically, giving clearer code.
The code can be shorter, yes. Whether it is clearer is subjective, of course.
> The run-time overhead in Ada can be an issue - the larger run-time
> library, the run-time checks, exceptions (I don't like them in C++ either).

Ada compilers these days give you a choice of run-times, from zero-run-time to the full standard, with the corresponding restrictions in the Ada code you can execute on that run-time. For the GNAT compiler, the run-time is treated as a normal library at link-time, so any run-time function that is not used can be eliminated from the executable.

And *of course* the run-time checks can be turned off globally or locally, by compiler options or specific pragmas.

Benchmarks vary, but if one writes Ada code that is functionally equivalent to C code (does not compute with dynamically sized objects, does not use tasking, does not use exceptions, disables run-time checks, etc.) one can expect to have close to the same machine code as for the C code.

Note that this does not mean "writing C in Ada", because one still has all the compile-time goodies of Ada: module system, stronger typing, object and type attributes, and so on. And the ability to test the program with run-time checks *on*.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
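The nearest everyday C habit to "test with checks on, ship with checks off" is probably the standard assert macro, which is disabled when NDEBUG is defined. The sketch below is only an analogy: the check is written by hand instead of being generated by the compiler, and the names table, TABLE_LEN and table_get are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define TABLE_LEN 4
static const int table[TABLE_LEN] = { 10, 20, 30, 40 };

/* The assert is active in a normal build and expands to nothing when
   the file is compiled with NDEBUG defined, loosely comparable to
   suppressing an index check in Ada. */
static int table_get(size_t index)
{
    assert(index < TABLE_LEN);
    return table[index];
}
```

In a test build an out-of-range call such as table_get(7) aborts with a diagnostic; with NDEBUG defined the same call silently reads outside the array, which is the trade-off being discussed.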
On 14-10-19 02:36 , Tom Gardner wrote:
> On 19/10/14 00:07, glen herrmannsfeldt wrote:
>> Reminds me that Java requires the compiler to be able to determine
>> that a scalar variable is given a value before it is referenced.
>> They could have just zeroed all of them, but that is pretty much
>> your "automatic". This way, you have a chance to think about it.
>> But then again, you can just put =0 on all your declarations.
>
> No.
>
> If a primitive (int, long, boolean etc) isn't explicitly
> initialised then it is given a sensible default value (0,
> false etc).

Hmm... seems not to be the case.

The Java Specification I found says, in http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.5:

"A local variable (§14.4, §14.14) must be explicitly given a value before it is used, by either initialization (§14.4) or assignment (§15.26), in a way that can be verified using the rules for definite assignment (§16)."

The default initialisation rules, earlier on the same page, do not apply to local variables.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi
      .      @       .
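For contrast with the Java rule quoted above, C makes a similar split between kinds of variables but enforces nothing: objects with static storage duration start out as zero, while an uninitialised automatic (local) variable has an indeterminate value that a program must not rely on. Compilers can often warn about this (GCC has -Wuninitialized, for example), but a warning is not a language requirement. A minimal sketch, with call_count and sign_of as invented names:

```c
/* Static storage duration: zero-initialised by the language, which is
   comparable to Java's default values for fields. */
static int call_count;

/* "result" is an automatic variable: C gives it no default value and
   has no rule that every path must assign it. Here every branch does
   assign it before the read, which is the discipline that Java's
   definite assignment rule makes mandatory. */
static int sign_of(int x)
{
    int result;
    if (x > 0) {
        result = 1;
    } else if (x < 0) {
        result = -1;
    } else {
        result = 0;
    }
    call_count++;
    return result;
}
```

If the final else branch were deleted, a Java compiler would reject the equivalent method, whereas a C compiler is allowed to accept it.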
On 19/10/14 08:55, Niklas Holsti wrote:
> On 14-10-19 02:36 , Tom Gardner wrote:
>> On 19/10/14 00:07, glen herrmannsfeldt wrote:
>>> Reminds me that Java requires the compiler to be able to determine
>>> that a scalar variable is given a value before it is referenced.
>>> They could have just zeroed all of them, but that is pretty much
>>> your "automatic". This way, you have a chance to think about it.
>>> But then again, you can just put =0 on all your declarations.
>>
>> No.
>>
>> If a primitive (int, long, boolean etc) isn't explicitly
>> initialised then it is given a sensible default value (0,
>> false etc).
>
> Hmm... seems not to be the case.
>
> The Java Specification I found says, in
> http://docs.oracle.com/javase/specs/jls/se7/html/jls-4.html#jls-4.12.5:
>
> "A local variable (§14.4, §14.14) must be explicitly given a value
> before it is used, by either initialization (§14.4) or assignment
> (§15.26), in a way that can be verified using the rules for definite
> assignment (§16)."
>
> The default initialisation rules, earlier on the same page, do not apply
> to local variables.

Indeed. I wonder why it is required in that subset of cases. Probably something to do with local classes, which were introduced "relatively recently". This isn't something that concerns me in that the environment requires me to do something that I ought to be doing anyway: it is the safe option without any penalties. Having said that, I am becoming concerned that Java is becoming unnecessarily complex merely in order to save a few keystrokes, to give computer scientists something to do, and to enable marketeers to brag about "improvements". Local classes were one example, generics another, and the stuff in Java 8 another.
On 2014-10-19, Niklas Holsti <niklas.holsti@tidorum.invalid> wrote:
>
> Counted loop from 1 to N
>
> C:
>    for ( int i = 1 ; i <= N ; i ++ ) {
>       statements
>    }
>
> Ada:
>    for i in 1 .. N loop
>       statements
>    end loop ;
>

And in addition, if the goal of the loop is to iterate over the whole of (say) an array, Ada's 'Range attribute can be used so there's no need to explicitly list the loop boundaries within the for statement.

Simon.

--
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
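C has no direct equivalent of 'Range, but a common idiom derives the element count from the array itself with sizeof. The sketch below is one way to write it; ARRAY_LEN, samples and sum_samples are illustrative names, not standard facilities.

```c
#include <stddef.h>

/* Number of elements in a true array (not a pointer). The macro name
   is an illustrative choice, not something the C library provides. */
#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

static const int samples[] = { 3, 1, 4, 1, 5 };

/* Sum every element; the bound is derived from the array itself, so
   adding or removing initialisers needs no change to the loop. */
static int sum_samples(void)
{
    int sum = 0;
    for (size_t i = 0; i < ARRAY_LEN(samples); i++) {
        sum += samples[i];
    }
    return sum;
}
```

The idiom is weaker than the Ada attribute: it only works where the array's declaration is visible, and applied to a pointer (for instance an array passed as a function parameter) it compiles but yields the wrong count.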