
XC8 novice question

Started by rwood September 11, 2016
On 9/12/2016 2:11 PM, David Brown wrote:

>>>> I would NOT, for example, write:
>>>> x=-1;
>>>
>>> Neither would I - I would write "x = -1;". But I believe I am missing
>>> your point with this example.
>>
>> My example would be parsed as:
>> x =- 1 ;
>
> Parsed by who or what? A compiler would parse it as "x = -1;". I assume we
> are still talking about C (or C++) ?
No. Dig out a copy of Whitesmith's or pcc. What is NOW expressed as
"x -= 1" was originally expressed as "x =- 1". Ditto for =+, =%, etc.
So, if you were a lazy typist who considered whitespace redundant, you typed:

   x=-1;

and got:

   x=x-1;

instead of:

   x = (-1);
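(A minimal, compilable sketch of that ambiguity; the pre-K&R reading survives
only in the comments, since any compiler of the last few decades parses
"x=-1" as a plain assignment:)

   #include <stdio.h>

   int main(void) {
       int x = 5;

       x=-1;              /* modern C: parsed as x = -1, so x becomes -1 */
                          /* pre-K&R C: parsed as x =- 1 (today's        */
                          /* x -= 1), so x would have become 4 instead   */
       printf("%d\n", x); /* prints -1 with any modern compiler */

       x = 5;
       x -= 1;            /* the unambiguous modern spelling of "=-" */
       printf("%d\n", x); /* prints 4 */
       return 0;
   }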
> When writing, I would include appropriate spaces so that it is easy to
> see what it means.
My point is that most folks would think x=-1 (or x=+2, etc.) bound the sign to the value more tightly than to the assignment operator. The same sort of reasoning applies to x=y/*p; where the '*' ends up binding more tightly to the '/' to produce the comment introducer, "/*". Note that the problem persists in the language's current form -- but compilers often warn about it (e.g., detecting "nested" comments, etc.)
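(To make the pitfall concrete, a small sketch with hypothetical names x, y
and p:)

   #include <stdio.h>

   int main(void) {
       int x = 10, val = 2;
       int *p = &val;
       int y;

       // Intended: divide x by the value p points to. Written without
       // spaces, the '/' and '*' fuse into a comment opener:
       //
       //     y=x/*p;    <- the compiler sees "y=x" followed by a comment
       //
       y = x / (*p);    // unambiguous: y = 5
       printf("%d\n", y);
       return 0;
   }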
>> Nowadays, you would see this as:
>> x -= 1 ;
>
> That is a completely different statement.
No. It is what the statement above HAS BECOME! (as the language evolved)
>> Would you have noticed that it was NOT assigning "-1" to x?
>> Would you have wasted that precious, limited timeslot that you
>> had access to "The Machine" chasing down a bug that you could have
>> avoided just by adopting a style that, instead, writes this as:
>> x = -1;
>
> I am afraid I still don't get your point. "x = -1;" is exactly how I would
> write it. I believe spacing like that is good style, and helps make it clear
> to the reader and the writer how the statement is parsed. And like in normal
> text, appropriate spacing makes things easier to read.
See above.

Or, better yet, google turns up:

<http://bitsavers.informatik.uni-stuttgart.de/pdf/chromatics/CGC_7900_C_Programmers_Manual_Mar82.pdf>

Perhaps an interesting read for folks who didn't have to write code
"back then".
>>>> And, I *would* write:
>>>> p = &buffer[0];
>>>
>>> So would I - because I think the code looks clearer. When I want p to
>>> be a pointer to the first element of "buffer", that's what I write.
>>
>> You'll more frequently encounter:
>> p = buffer;
>
> I know. But I prefer "p = &buffer[0];", because I think it looks clearer and
> makes more sense. To my reading, "buffer" is an array - it does not make sense
> to assign an array to a pointer-to-int. (I'm guessing at types here, since
> they were never declared - adjust them if necessary if "buffer" was an array
> of char or something else.)
I prefer it because I am more hardware oriented. I think of "objects" (poor choice of words) residing AT memory addresses. So, it is only natural for me to think about "the address of the zeroth element of the array".
> C converts arrays or array operations into pointers and pointer operations in
> certain circumstances. I wish it did not - but I can't change the language.
> But just because the language allows you to write code in a particular way,
> and just because many people /do/ write code in a particular way, does not
> necessarily mean it is a good idea to do so.
>
>>>> I'd use parens in places that you'd consider superfluous -- but that
>>>> made bindings very obvious (cuz the compiler would do what I *told* it
>>>> without suggesting that I might be doing something unintended):
>>>
>>> Me too. But that's for clarity of code for both the reader and the
>>> writer - not because I worry that my tools don't follow the rules of C.
>>
>> The problem isn't that tools don't follow the rules. The problem is that
>> a compiler can be conforming WITHOUT holding your hand and reminding you
>> that what you've typed *is* "valid C" -- just not likely the valid C
>> that you assumed!
>
> Again, I can only say - get a decent compiler, and learn how to use it
> properly. If you cannot use a good compiler (because no good compilers exist
> for your target, at least not in your budget), then use an additional static
> error checker.
For how many different processors have you coded?

I have compilers for processors that never made it into "mass production".
And, for processors that saw very limited/targeted support. If I'm willing
to FUND the development of a compiler -- knowing that I may be the only
customer for that compiler -- then I can "buy" all sorts of capabilities in
that compiler! OTOH, if I have to pass those costs along to a client, he may
be far less excited about how "good the tool is" and, instead, wonder why *I*
can't compensate for the tool's quality.

What were tools like 20 years ago? Could you approach your client/employer
and make the case for investing in the development of a compiler with 2016
capabilities? Running, *effectively*, on 1995 hardware and OS? Or, would
you, instead, argue that the project should be deferred until the tools were
more capable? Or, would you develop work styles that allowed you to produce
reliable products with the tools at your disposal??
> To be used in a safe, reliable, maintainable, and understandable way, you
> need to limit the C features you use and the style you use. (This applies
> to any programming language, but perhaps more to C than other languages.)
> And you use whatever tools you can to help spot errors as early as possible.
>
>>>> y = x / (*p);
>>>
>>> Me too.
>>
>> Because the compiler would gladly think "y=x/*p;" was THREE characters
>> before the start of a comment -- without telling you that it was
>> interpreting it as such. Again, you turn the crank and then spend
>> precious time staring at your code, wondering why its throwing an error
>> when none should exist (in YOUR mind). Or, worse, mangling input:
>>
>> y=/*p /* this is the intended comment */
>> x=3;
>>
>
> Again, spaces are your friend.
Spaces (or parens) are a stylistic choice. The language doesn't mandate
their use. E.g., this is perfectly valid:

   y=/*I want to initialize two variables to the same value
   in a single stmt*/x=3;

It's a bad coding STYLE but nothing that the compiler SHOULD complain about!
>> seen, instead, as (error-free!):
>>
>> y=x=3;
>
> Again, that's why you have to limit the features of C that you use. And get
> whatever help you can from your tools to spot the errors (even if they are
> technically valid C).
>
>>>> And, to catch (some) '=' instead of "==" errors, write tests as:
>>>> if ( CONSTANT == expression ) ...
>>>
>>> No, I would /never/ write that. I would write it in a way that makes
>>> logical sense when reading or writing. You would say "if the number of
>>> apples is 3, the price is $1" - you would never say "if 3 is the number
>>> of the apples, $1 is the price".
>>
>> If the bag yields 3 apples...
>
> Bags don't "yield" apples.
Sure they do! Do you think they HOARD apples?
>>> Thus I would always write:
>>>
>>> if (expression == constant) { .... }
>>>
>>> (I'm not a fan of shouting either - I left all-caps behind with BASIC
>>> long ago.)
>>
>> I ALWAYS use caps for manifest constants.
>
> Many people do. I don't. I really do not think it helps in any way, and it
> gives unnecessary emphasis to a rather minor issue, detracting from the
> readability of the code.
Would you write:

   const zero = 0;
   const nine = 9;
   for (index = zero; index < nine; index++)...

Or:

   const start = zero;
   const end = nine;
   for (index = start; index < end; index++)...

Or:

   for (index = START; index < END; index++)...

The latter makes it abundantly clear to me, without having to chase down
the declaration/definition of "start", "end", "zero" or "nine". (Who is to
say whether "zero" actually maps to '0' vs. "273" -- 0 degrees C, in
kelvins!)
> But I am aware that some people think it is a useful idiom - and that a good
> many people think it is just something you do, because lots of other people
> do it. I'm assuming you are in the first category here, and could give good
> reasons why you like all caps here, but I don't think we need to introduce
> any more points of disagreement and discussion at the moment!
>
>>> What benefit do you think "if (CONSTANT == expression)" gives you? Why
>>> is it useful to invert the logic and make code hard to read, just in
>>> case you make a mistake and write "if (expression = CONSTANT)"? It is
>>> highly unlikely that you would make such a mistake, and even less likely
>>> that you would make it and not find it quickly - and your compiler (or
>>> linter) will spot it for you.
>>
>> Again, you're spoiled by adopting C when it was far more mature
>> and had more mature tools as well as accessibility. Give yourself
>> an hour, today, to use your PC/computer. See how much work you get
>> done (lets forget that your computer is 1000 times more capable).
>> But, don't fret; whatever you DON'T get done you can tackle
>> tomorrow -- in THAT 1 hour timeslot!
>
> And again, you are missing the point. I am not writing code for a 40 year
> old compiler. Nor is the OP. For the most part, I don't need to write
> unpleasant code to deal with outdated tools.
No, YOU are missing the point! I'm writing code with 40 years of
"respectable track record". You're arguing that I should "fix" something
(i.e., my style preferences) that isn't broken. Because it differs from what
your *20* year track record has found to be acceptable. Should we both wait
and see what next year's crop of developers comes up with? Maybe Hungarian
notation will be supplanted by Vietnamese notation? Or, we'll decide that
using identifiers expressed in Esperanto is far more universally readable?
> I have, in the past, used far more limited compilers. I have worked with
> compilers that only supported C90/ANSI, meaning I can't mix declarations and
> statements. I have worked with compilers that have such god-awful code
> generation that I had to think about every little statement, and limit the
> number of local variables I used. Very occasionally, I have to work on old
> projects that were made using such tools - and I have to go back to the dark
> ages for a while.
>
> Perhaps, long ago when the tools I had were poor and the company I worked
> for did not have the budget for a linter, it might have made sense to write
> "if (CONSTANT == expression)". But not now - and I don't think there will
> be many developers working today for which it would still make sense.
I can then argue that we shouldn't bother with such an archaic language, at
all! Look at all the cruft it brings along with it! Why not wait to see
what pops up tomorrow as the language (and development style) du jour? Or,
why risk being early adopters -- let's wait a few *weeks* before adopting
tomorrow's advances!

Programming is still an art, not a science. You rely on techniques that
have given you success in the past. When 25% of product returns are due to
"I couldn't figure out how to USE this PoS!", that suggests current
development styles are probably "lacking".

<https://books.google.com/books?id=pAsCYZCMvOAC&pg=PA130&lpg=PA130&dq=product+returns+confusion&source=bl&ots=P_v4nTI0m8&sig=XZ6VGEOtuyJOG7Kwkcd2SMV2_6w&hl=en&sa=X&ved=0ahUKEwiDwtDKt4vPAhVk0YMKHfO-C-AQ6AEINjAD>
>>>> [would you consider "if (foo==bar())" as preferred to "if
>>>> (bar()==foo)"?]
>>>
>>> I'd usually try to avoid a function call inside an expression inside a
>>> conditional, but if I did then the guiding principle is to make it clear
>>> to read and understand.
>>
>> Which do you claim is clearer?
>> Imagine foo is "errno".
>> After answering, rethink it knowing that errno is actually a function
>> invocation on this platform.
>
> Neither is clear enough.
So, what's your solution?
>>>> Much of this is a consequence of how I learned to "read" (subvocalize)
>>>> code to myself:
>>>> x = y;
>>>> is "x gets the value of y", not "x equals y". (how would you read
>>>> x ::= y ?)
>>>> if ( CONSTANT == expression )
>>>> is "if expression yields CONSTANT". Catching myself saying "if CONSTANT
>>>> gets expression" -- or "if variable gets expression":
>>>> if ( CONSTANT = expression )
>>>> if ( variable = expression )
>>>> is a red flag that I'm PROBABLY not doing what I want to do.
>>>>
>>>> I'll wager that most folks read:
>>>> x = y;
>>>> as:
>>>> x equals y
>>>
>>> I can't say I vocalize code directly - how does one read things like
>>> brackets or operators like -> ? But "x = y" means "set x equal to y".
>>
>> And, x ::= y effectively says the same -- do you pronounce the colons?
>> Instead, use a verb that indicates the action being performed:
>> x gets y
>
> I don't pronounce colons. I also don't use a programming language with a
> ::= operator. But if I did want to pronounce it, then "x gets y" would be
> fine.
Then you come up with alternative ways of conveying the information present
in that symbology.

E.g., "::=" (used to initialize a variable) vs '=' (used to define
*constants*, but not variables, and test for value equality) vs ":=:" (to
test for address equality) vs ":=" (chained assignment) in Algol; '=' vs
"==" in C; ":=" vs. '=' and ':' and "==" in Limbo; etc.

When conversing with someone not fluent in a language, the actual
punctuation plays an important role. Say "x gets y" to someone who isn't
familiar with Algol's "::=" and you'll PROBABLY find them writing "x = y".
When "reading" Limbo code to someone, I resort to "colon-equals", "equals"
and "colon" so I'm sure they know exactly what I'm saying (because the
differences are consequential).
>> I learned to code with left-facing arrows instead of equal signs
>> to make the "assignment" more explicit.
>
> I think that <- or := is a better choice of symbol for assignment. The
> designers of C made a poor choice with "=".
>
>>>> and subtly change their intent, but NOT their words, when encountering:
>>>> if ( x == y )
>>>> as:
>>>> if x equals y
>>>
>>> "if x equals y", or "if x is equal to y".
>>>
>>>> (how would they read "if ( x = y )"?)
>>>
>>> I read it as "warning: suggest parentheses around assignment used as
>>> truth value" :-)
>>
>> No, your *compiler* tells you that. If you've NEVER glossed over
>> such an assignment in your code, you've ALWAYS had a compiler holding
>> your hand for you. I'm glad I didn't have to wait 10 years for such
>> a luxury before *I* started using C.
>
> I have used warnings on such errors for as long as I have used compilers
> that conveniently warned about them. But I can say it is very rare that a
> compiler /has/ warned about it - it is not a mistake I have made more than
> a couple of times. Still, I want my tools to warn me if it were to happen.
I want my tools to warn me of everything that they are capable of detecting. INCLUDING the tool that occupies my cranial cavity!
>> I'll frequently store pointers to functions where you might store
>> a variable that you conditionally examine (and then dispatch to
>> one of N places in your code). In my case, I just invoke the
>> function THROUGH the pointer -- the tests were done when I *chose*
>> which pointer to stuff into that variable (why should I bother to
>> repeat them each time this code fragment is executed?)
>>
>> When you do code reviews, do you all sit around with laptops and
>> live code copies? Or, do you pore over LISTINGS with red pens?
>> Do you expect people to just give the code a cursory once-over?
>> Or, do you expect them to *read* the code and act as "human
>> compilers/processors"?
>
> I expect code to be as simple to understand as possible, so that it takes
> as little time or effort as possible for other people to read. That way
> others can spend their effort on confirming that what the code does is the
> right thing, rather than on figuring out what the code does first.
What's simple to me might not be simple to you! E.g., I naturally think in terms of pointers. Others prefer array indexes. For me to jump/call *through* a pointer is far more obvious than encoding some meaning into a flag at some point in the program. Then, decoding that flag at another point and dispatching based on that decode! The latter requires keeping two pieces of code in sync (encode & decode). The former puts everything in one place (encoding). Should I write code at a level that a newbie developer can understand? Should I limit myself to how expressive and productive I can be out of fear that someone might not be quick to grasp what I've written?
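(A minimal sketch of the two dispatch styles being contrasted here, with
hypothetical handler names:)

   #include <stdio.h>

   static void handle_fast(void) { puts("fast path"); }
   static void handle_slow(void) { puts("slow path"); }

   int main(void) {
       /* Flag style: the meaning is encoded here... */
       int mode = 1;
       /* ...and must be decoded, consistently, at every dispatch site. */
       if (mode == 1) handle_fast(); else handle_slow();

       /* Pointer style: the decision is made exactly once... */
       void (*handler)(void) = handle_fast;
       /* ...and every later dispatch site just calls through the pointer. */
       handler();
       return 0;
   }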
>> There is no such thing as a perfect style. You adopt a style that
>> works for you and measure its success by how buggy the code is (or
>> is not) that you produce and the time that it takes you to produce it.
>
> Yes.
>
>> I suspect if I sat you down with some of these early and one-of-a-kind
>> compilers, you'd spend a helluva lot more time chasing down bugs that
>> the compiler didn't hint at. And, probably run the risk of some creeping
>> into your release (x=y=3) that you'd never caught.
>
> No.
>
> Static checking by the compiler is not a substitute for writing correct
> code.
Where am I writing "INcorrect code"? It meets the specifications of the language. It compiles without errors or warnings. It fulfills the goals of the specification. It's just that *you* don't like my style! Is that what makes it "not correct"? If I replaced the identifiers for manifest constants with lowercase symbols, would that make it MORE correct? Should I use camelcase identifiers? Hungarian notation? Embedded underscores? How do any of these make the code more or less "correct"?
> You make it sound like I program by throwing a bunch of random symbols at
> the compiler, then fixing each point it complains about. The checking is
> automated confirmation that I am following the rules I use when coding,
> plus convenient checking of silly typos that anyone makes on occasion.
>
>> Instead of pushing an update to the customer over the internet, you'd
>> send someone out to their site to install a new set of EPROMs. Or,
>> have them mail the product back to you as their replacement arrived
>> (via USPS). Kind of annoying for the customer when he's got your
>> device on his commercial fishing boat. Or, has to lose a day's production
>> while it's removed from service, updated and then recertified.
>
> I have mailed EPROMs to customers, or had products delivered back for
> program updates. But never due to small mistakes of the sort we are
> discussing here.
I've *never* had to update a product after delivery. Because the cost of
doing so would easily exceed the profit margin in the product! I've had
clients request modifications to a product; or tweaks for specific
customers. But, those aren't "bug fixes", they're effectively "new products"
that leverage existing hardware.

I did some work for a firm in Chicago. Big company with lots of bean
counters. So, they could tell you what everything "costs" (them!). I had
proposed "swapping ROMs" as an upgrade path for the product. They told me
that it costs them $600 to send someone into the city to TOUCH an installed
device:
- gotta schedule the visit (can't just walk in unannounced and hope your
  customer will be eager to greet you AND have the equipment "out of
  service" at that time)
- gotta drive into the city, find a place to park the car, lug your test
  equipment (have to prove to the customer that what you're doing DOES work)
  and documentation/paperwork/workorder to the customer's facility
- gotta spend some time actually making the upgrade, testing it and
  reassuring the customer that it actually works
- gotta schlep all your stuff back to the car, pay the parking fee
- drive back to work and check the vehicle back in
- fill out the trip report and any expense report
- gotta pay the employee for this time (plus his share of "burden")
- gotta make sure you have enough employees with this sort of training on
  hand at all times (cuz you can't simply decide not to handle upgrades
  because Bob quit!)

$600. And that was more than 30 years ago! For a product that *cost* $300
(DM+DL) to build! "No, there will be no upgrades. You will get it right the
first time!"
>>> Consistency is a good thing - there's no doubt there. But it is not the
>>> overriding concern in all cases - breaking with old habits is necessary
>>> in order to improve.
>>
>> People are intimidated by pointers. I should "break that habit" because
>> of their shortcomings? Or because some standards committee decrees?
>> Despite the fact that I produce higher quality code for less money than
>> the adherents of those standards?
>
> Break the habit if it was based on how you had to work with old tools, and
> the habit is not helpful with modern development.
Who is to say it is not helpful? Has the quality of my work suffered? Am I suddenly writing buggier code? If it isn't broke, don't fix it!
>>> And remember, in this thread and context, we are giving advice to a new
>>> programmer. It is one thing to say that /you/ want to write "if (4 ==
>>> count)" to remain consistent with the vast quantities of code you have
>>> written over the last 40 years. It is entirely different to consider
>>> how this /new/ programmer should write the vast quantities of code he
>>> will produce over the /next/ 40 years. /You/ have to live with the
>>> decisions from your past - the new programmer does not.
>>
>> The new programmer has to be aware of why things are the way they
>> are. Or, fail to learn history's lessons. What does he/she do
>> when faced with one of these tools? Or, a build environment that
>> has turned off all warnings (does he even know that has happened)?
>
> Again, you seem to assume that I am recommending programming by luck -
> randomly writing "if (x = 1)" or "if (x == 1)" and relying on the tools to
> fix the mistakes.
No. I'm acknowledging that people make mistakes, keyboards don't always "type" what you've told them to type, etc. I am *very* consistent in my design styles (hardware and software). If I look at something that I created and it doesn't strictly conform to "how I do things", I am instantly on alert: "Why is this not as it SHOULD be?" And, I start searching for an explanation in the accompanying commentary -- lest I be tempted to "fix" it (which will introduce a problem and explains why it WASN'T "as it should be" to begin with!)
> The new programmer should learn to take advantage of new tools, and
> concentrate efforts on problems that are relevant rather than problems that
> are no longer an issue (to the extent that they ever /were/ an issue). And
> if that new programmer is faced with poorer tools, then he or she will have
> to learn the idiosyncrasies at the time. And the lack of warnings on
> "if (x = 1)" is unlikely to be the most important point.
Without the warning, the developer is likely to waste a boatload of time "seeing what he wants to see" and not what the *compiler* sees.
>> There's a reason you can look at preprocessor output and ASM
>> sources. Because they are the only way to understand some nasty
>> bugs that may arise from your misunderstanding of what a tool
>> SHOULD be doing; *or*, from a defective tool! (a broken tool
>> doesn't have to tell you that it's broken!)
>
> I have almost /never/ had occasion to look at pre-processor output. But I
> do recommend that embedded programmers should be familiar enough with the
> assembly for their targets that they can look at and understand the
> assembly listing.
You've probably never "abused" the preprocessor in creative ways! :>
On 9/12/2016 3:04 PM, Dennis wrote:
> On 09/12/2016 04:11 PM, David Brown wrote:
>> On 12/09/16 17:19, Don Y wrote:
>>> On 9/12/2016 5:35 AM, David Brown wrote:
>>>> On 12/09/16 11:41, Don Y wrote:
>>>>> On 9/11/2016 11:55 PM, David Brown wrote:
>>>>>> On 11/09/16 22:15, Don Y wrote:
>>
>> <snip for brevity>
>>
>>>>> I would NOT, for example, write:
>>>>> x=-1;
>>>>
>>>> Neither would I - I would write "x = -1;". But I believe I am missing
>>>> your point with this example.
>>>
>>> My example would be parsed as:
>>> x =- 1 ;
>>
>> Parsed by who or what? A compiler would parse it as "x = -1;". I
>> assume we are still talking about C (or C++) ?
>
> What he is getting at is that in the original K&R C the compound assignment
> operator was of the form =op. It was later changed to the current op=
> because of the ambiguity he noted. See the K&R section of
>
> https://en.wikipedia.org/wiki/K%26R_C
>
> I remember having to change a bunch of code for the "new" compilers.
The issue is that the statement (as I wrote it) is "valid" C -- for BOTH of
those DIFFERENT compilers. No errors encountering it on an early compiler
*or* a 21st century compiler. Yet it has very different semantics!

A ~1985 compiler *might* warn you that "you MAY be expecting an OBSOLESCENT
behaviour". It could flag it as an error (if it was a forward-leaning
implementation). Or, if it is behind the times, might gladly accept the
legacy parse of this statement.

A 2015 compiler may have forgotten all about this alternative interpretation
and merrily generate code that is different from what was ORIGINALLY
intended, without even thinking about alerting you to the historical issue
(the compiler doesn't know the code's ancestry!)

You, the developer/maintainer, have no way of knowing how those who preceded
you behaved -- their "styles" -- which is the only way to guesstimate how the
tools might NOW *mis*behave with their code!
On 13/09/16 00:04, Dennis wrote:
> On 09/12/2016 04:11 PM, David Brown wrote:
>> On 12/09/16 17:19, Don Y wrote:
>>> On 9/12/2016 5:35 AM, David Brown wrote:
>>>> On 12/09/16 11:41, Don Y wrote:
>>>>> On 9/11/2016 11:55 PM, David Brown wrote:
>>>>>> On 11/09/16 22:15, Don Y wrote:
>>
>> <snip for brevity>
>>
>>>>> I would NOT, for example, write:
>>>>> x=-1;
>>>>
>>>> Neither would I - I would write "x = -1;". But I believe I am missing
>>>> your point with this example.
>>>
>>> My example would be parsed as:
>>> x =- 1 ;
>>
>> Parsed by who or what? A compiler would parse it as "x = -1;". I
>> assume we are still talking about C (or C++) ?
>
> What he is getting at is that in the original K&R C the compound
> assignment operator was of the form =op. It was later changed to the
> current op= because of the ambiguity he noted. See the K&R section of
>
> https://en.wikipedia.org/wiki/K%26R_C
>
> I remember having to change a bunch of code for the "new" compilers.
OK, thanks. I came across K&R C briefly at university, but ANSI C was
already the norm by the time I was actually doing anything with it. And I
have no knowledge of C from before the first version of "The C Programming
Language", and have not seen the syntax "x =- 1;" before. History like this
is interesting, but irrelevant to the issue of how to write code /today/.
On Mon, 12 Sep 2016 22:52:00 +0200, Hans-Bernhard Bröker
<HBBroeker@t-online.de> wrote:

> On 11.09.2016 at 15:14, upsidedown@downunder.com wrote:
>> On Sun, 11 Sep 2016 13:36:59 +0200, Hans-Bernhard Bröker
>> <HBBroeker@t-online.de> wrote:
>
>>> It may appear to provide a solution to the issue: "How to keep non-ASCII
>>> characters in a plain char?"
>
>> Unsigned char was the answer for at least 5-10 years before UCS2.
>
> It really wasn't, because as a solution it was incoherent, incomplete,
> and insular. Anyway, there really weren't that many years between C90
> and UCS2.
>
> 8-bit unsigned char for non-ASCII characters created more problems than
> it ever solved. Instead of having one obvious, while painful
> restriction, we now had
>
> *) dozens of conflicting interpretations of the same 256 values,
> *) no standard way of knowing which of those applied to given input
> *) almost no way of combining multiple such streams internally
> *) no useful way of outputting a combined stream
Exactly the same problems existed with the 7 bit ISO 646 from the 1960s/70s.
In ISO 646 some of the US-ASCII code points were replaced by a national
character, such as

   @ [ \ ] { | } _ ^ ~ $ #

The replacement characters were typically different for each country or at
least each language. Trying to combine texts from different languages
produced a mess. This was a problem especially in countries with multiple
official languages. Also writing names of foreign origin caused problems.

Think about Pascal or C programming, with the missing [ \ ] { | } ^
characters, which might have been displayed as Ä Ö Å ä ö å in one code page
but something different in another code page. This was especially
problematic for printing with a language specific printer (e.g. daisy
wheel), giving quite different looking printouts depending on which printer
was used.

For programming, the most convenient way was to switch the terminal to
US-ASCII, but unfortunately the keyboard layout also changed. Digraphs and
trigraphs partially solved this problem but were ugly.

With the introduction of the 8 bit ISO Latin, a large number of languages
and programming could be handled with that single code page, greatly
simplifying data exchange in Europe, the Americas, Africa and Australia.
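(For readers who never met them, a small sketch of the digraph spellings,
which are still standard C; the trigraph forms ??< ??> ??( ??) etc. worked
the same way, though some compilers honor them only in strictly conforming
modes, and they were finally removed in C23:)

   #include <stdio.h>

   /* Digraphs: <% %> for { }, <: :> for [ ] */
   int main(void)
   <%
       int a<:3:> = <% 1, 2, 3 %>;   /* int a[3] = { 1, 2, 3 }; */
       printf("%d\n", a<:0:>);       /* prints 1 */
       return 0;
   %>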
On 9/13/2016 12:11 AM, upsidedown@downunder.com wrote:
> For programming, the most convenient way was to switch the terminal to
> US-ASCII, but unfortunately the keyboard layout also changed.
>
> Digraphs and trigraphs partially solved this problem but were ugly.
>
> With the introduction of the 8 bit ISO Latin, a large number of
> languages and programming could be handled with that single code page,
> greatly simplifying data exchange in Europe, the Americas, Africa and
> Australia.
The problem with Unicode is that it makes the problem space bigger.
It's relatively easy for a developer to decide on appropriate syntax for
file names, etc. with ASCII, Latin1, etc. But, start allowing for all these
other code points and suddenly the developer needs to be a *linguist* in
order to understand what should/might be an appropriate set of constraints
for his particular needs.

Also, we *tend* to associate meaning with each of the (e.g.) ASCII code
points. So, 16r31 is the *digit* '1'. There's consensus on that
interpretation.

However, Unicode just tabulates *glyphs* and deprives many of them of any
particular semantic value. E.g., if I pick a glyph out of U+[2800,28FF]
(the Braille patterns), there's no way a developer would know what my
*intent* was in selecting the glyph at that codepoint. It could "mean" many
different things (the developer has to IMPOSE a particular meaning -- like
deciding that 'A' is not an alphabetic but, rather, a "hexadecimal
character" IN THIS CONTEXT).
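(The 'A' example, in standard C terms: the classification functions show the
same code point satisfying several predicates at once, and only the
surrounding context picks the interpretation:)

   #include <ctype.h>
   #include <stdio.h>

   int main(void) {
       unsigned char c = 'A';

       /* Both predicates are true for the same code point; the     */
       /* surrounding context must IMPOSE which meaning applies.    */
       printf("isalpha('A')  = %d\n", isalpha(c) != 0);   /* 1: a letter  */
       printf("isxdigit('A') = %d\n", isxdigit(c) != 0);  /* 1: hex digit */
       return 0;
   }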
On Tue, 13 Sep 2016 01:11:09 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

> On 9/13/2016 12:11 AM, upsidedown@downunder.com wrote:
>> For programming, the most convenient way was to switch the terminal to
>> US-ASCII, but unfortunately the keyboard layout also changed.
>>
>> Digraphs and trigraphs partially solved this problem but were ugly.
>>
>> With the introduction of the 8 bit ISO Latin, a large number of
>> languages and programming could be handled with that single code page,
>> greatly simplifying data exchange in Europe, the Americas, Africa and
>> Australia.
>
> The problem with Unicode is that it makes the problem space bigger.
> It's relatively easy for a developer to decide on appropriate
> syntax for file names, etc. with ASCII, Latin1, etc. But, start
> allowing for all these other code points and suddenly the
> developer needs to be a *linguist* in order to understand what
> should/might be an appropriate set of constraints for his particular
> needs.
For _file_ names, not a problem: stop scanning at the next white space (or
null in C). Everything in between is the file name, no matter what
characters are used.

For _path_ specifications, there must be some rules for how to separate the
node, path, file name, file extension and file version from each other.
The separator or other syntactic elements are usually chosen from the
original 7 bit ASCII character set. What is between these separators is
irrelevant.
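(A minimal sketch of that rule, assuming '/' as the separator and
hypothetical path contents: the splitter inspects only the ASCII separator
and treats everything in between as opaque bytes, so component names can be
in any encoding:)

   #include <stdio.h>
   #include <string.h>

   int main(void) {
       /* "föö" encoded as UTF-8 bytes; the splitter never looks at it. */
       char path[] = "/home/f\xC3\xB6\xC3\xB6/data.txt";

       for (char *tok = strtok(path, "/"); tok; tok = strtok(NULL, "/"))
           printf("component: %s\n", tok);
       return 0;
   }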
> Also, we *tend* to associate meaning with each of the (e.g.) ASCII
> code points. So, 16r31 is the *digit* '1'. There's consensus on
> that interpretation.
For _numeric_entry_ fields, handling the characters 0-9 requires a fallback
mapping for Arabic numerals. As strange as it sounds, the numerals used in
Arabic countries differ from those used in Europe.
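(A sketch of such a fallback, assuming input already decoded to Unicode
scalar values; the helper name digit_value is hypothetical, while
U+0660..U+0669 being the ARABIC-INDIC digits is a fixed Unicode fact:)

   #include <stdio.h>

   /* Map an ASCII or Arabic-Indic digit to its numeric value, else -1. */
   static int digit_value(unsigned long cp) {
       if (cp >= '0' && cp <= '9')
           return (int)(cp - '0');
       if (cp >= 0x0660UL && cp <= 0x0669UL)   /* U+0660..U+0669 */
           return (int)(cp - 0x0660UL);
       return -1;
   }

   int main(void) {
       printf("%d %d %d\n",
              digit_value('7'),      /* 7 */
              digit_value(0x0663),   /* 3: ARABIC-INDIC DIGIT THREE */
              digit_value('x'));     /* -1 */
       return 0;
   }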
> However, Unicode just tabulates *glyphs* and deprives many of them
> of any particular semantic value. E.g., if I pick a glyph out
> of U+[2800,28FF], there's no way a developer would know what my
> *intent* was in selecting the glyph at that codepoint.
As long as it is just payload data, why should the programmer worry about it ?
> It could
> "mean" many different things (the developer has to IMPOSE a
> particular meaning -- like deciding that 'A' is not an alphabetic
> but, rather, a "hexadecimal character" IN THIS CONTEXT).
In Unicode, there are code points for hexadecimal 0 to F. Very good idea to separate the dual usage for A to F. Has anyone actually used those code points ?
On 13/09/16 06:22, Don Y wrote:
> On 9/12/2016 2:11 PM, David Brown wrote:
>
>>>>> I would NOT, for example, write:
>>>>> x=-1;
>>>>
>>>> Neither would I - I would write "x = -1;". But I believe I am missing
>>>> your point with this example.
>>>
>>> My example would be parsed as:
>>> x =- 1 ;
>>
>> Parsed by who or what? A compiler would parse it as "x = -1;". I
>> assume we are still talking about C (or C++) ?
>
> No. Dig out a copy of Whitesmith's or pcc. What is NOW expressed as
> "x -= 1" was originally expressed as "x =- 1". Ditto for =+, =%, etc.
> So, if you were a lazy typist who considered whitespace redundant, you
> typed:
> x=-1;
> and got:
> x=x-1;
> instead of:
> x = (-1);
>
>> When writing, I would include appropriate spaces so that it is easy to
>> see what it means.
>
> My point is that most folks would think x=-1 (or x=+2, etc.) bound the
> sign to the value more tightly than to the assignment operator.
Of course they think "x=-1" means "x = -1" ! It has been almost forty years
since "x =- 1" has been standard C. Most people also think that television
is in colour, you can communicate to Australia by telephone, and flares are
no longer in fashion. Life moves on.

In this particular case, the number of people who ever learned to write
"x =- 1", and are still working as programmers (or even still alive), is
tiny. And the number of those who failed to learn to use "x -= 1" at least
35 years ago must be tinier still. Sure, you can /remember/ it - and
remember having to change old code to suit new compilers. An old shopkeeper
may remember when he made deliveries with a horse and cart - but he does not
insist that new employees know about grooming horses.

Backwards compatibility, and compatibility with existing code, is important.
That is why we still have many of the poor choices in the design of C as a
language - for compatibility. But with each passing year or decade,
compatibility with the oldest code gets less and less relevant - except to
historians or the occasional very specialist cases. Of all the lines of C
code that are in use today, what fraction were written in pre-K&R C when
"x =- 1" was valid? One in a million? One in a hundred million? If we
exclude code lines that have been converted to later C standards, then I
doubt if it is nearly that many.
>
> The same sort of reasoning applies to
> x=y/*p;
> where the '*' ends up binding more tightly to the '/' to produce the
> comment introducer, "/*". Note that the problem persists in the
> language's current form -- but compilers often warn about it
> (e.g., detecting "nested" comments, etc.)
That is different in that the parsing rules for C are quite clear here, and
are the same as they always have been - /* starts a comment. But unless you
have carefully created a pathological case and use a particularly unhelpful
compiler (and editor - in this century, most programmers use editors with
syntax highlighting), you are going to spot the error very quickly.

C provides enormous opportunity for accidentally writing incorrect code - in
many cases, the result is perfectly acceptable C code and will not trigger
any warnings. If you were to take the top hundred categories of typos and
small mistakes in C code that resulted in compilable but incorrect code,
"x=y/*p" would not feature. It /might/ make it onto a list of the top
thousand mistakes. It really is that irrelevant. And it is preventable by
using spaces. There is a reason that the space bar is the biggest key on
the keyboard.
>
>>> Nowadays, you would see this as:
>>> x -= 1 ;
>>
>> That is a completely different statement.
>
> No. It is what the statement above HAS BECOME! (as the language evolved)
Take your head out of your history books. In C, "x-=1" means "x -= 1", while "x=-1" means "x = -1". That is it. It is a simple fact. It matters little what C used to be, decades before most programmers were born.
>>> Would you have noticed that it was NOT assigning "-1" to x?
>>> Would you have wasted that precious, limited timeslot that you
>>> had access to "The Machine" chasing down a bug that you could have
>>> avoided just by adopting a style that, instead, writes this as:
>>> x = -1;
>>
>> I am afraid I still don't get your point. "x = -1;" is exactly how I
>> would write it. I believe spacing like that is good style, and helps
>> make it clear to the reader and the writer how the statement is parsed.
>> And like in normal text, appropriate spacing makes things easier to read.
>
> See above.
>
> Or, better yet, google turns up:
>
> <http://bitsavers.informatik.uni-stuttgart.de/pdf/chromatics/CGC_7900_C_Programmers_Manual_Mar82.pdf>
>
> Perhaps an interesting read for folks who didn't have to write code
> "back then".
I am not as old as you, but I have been programming for about 35 years. I have had my share of hand-assembling code, burning eeproms, using punched tape, and even setting opcodes with DIP switches with a toggle switch for the clock. But I understand the difference between what I do /now/, and what other programmers do /now/, and what I did long ago.
>>>>> And, I *would* write:
>>>>> p = &buffer[0];
>>>>
>>>> So would I - because I think the code looks clearer. When I want p to
>>>> be a pointer to the first element of "buffer", that's what I write.
>>>
>>> You'll more frequently encounter:
>>> p = buffer;
>>
>> I know. But I prefer "p = &buffer[0];", because I think it looks clearer
>> and makes more sense. To my reading, "buffer" is an array - it does not
>> make sense to assign an array to a pointer-to-int. (I'm guessing at
>> types here, since they were never declared - adjust them if necessary if
>> "buffer" was an array of char or something else.)
>
> I prefer it because I am more hardware oriented. I think of "objects"
> (poor choice of words) residing AT memory addresses. So, it is only
> natural for me to think about "the address of the zeroth element of
> the array".
I am a hardware man too. And I quite appreciate that interpretation as well.
>> C converts arrays or array operations into pointers and pointer
>> operations in certain circumstances. I wish it did not - but I can't
>> change the language. But just because the language allows you to write
>> code in a particular way, and just because many people /do/ write code
>> in a particular way, does not necessarily mean it is a good idea to do so.
>>
>>>> I'd use parens in places that you'd consider superfluous -- but that
>>>> made bindings very obvious (cuz the compiler would do what I *told* it
>>>> without suggesting that I might be doing something unintended):
>>>
>>> Me too. But that's for clarity of code for both the reader and the
>>> writer - not because I worry that my tools don't follow the rules of C.
>>
>> The problem isn't that tools don't follow the rules. The problem is
>> that a compiler can be conforming WITHOUT holding your hand and
>> reminding you that what you've typed *is* "valid C" -- just not likely
>> the valid C that you assumed!
>
>> Again, I can only say - get a decent compiler, and learn how to use it
>> properly. If you cannot use a good compiler (because no good compilers
>> exist for your target, at least not in your budget), then use an
>> additional static error checker.
>
> For how many different processors have you coded?
I can't remember - perhaps 20 or so.

Z80, 6502, 68k, x86, MIPS, COP8, 8051, PIC16, HP43000, ARM, PPC, AVR,
AVR32, MSP430, NIOS, XMOS, TMS430, 56Fxxx

That's 18 - there are several more whose names I can't remember, and some
that I have programmed on without being familiar with the assembly language.
> I have compilers
> for processors that never made it into "mass production". And, for
> processors that saw very limited/targeted support. If I'm willing to FUND
> the development of a compiler -- knowing that I may be the only
> customer for that compiler -- then I can "buy" all sorts of capabilities
> in that compiler!
>
> OTOH, if I have to pass those costs along to a client, he may be far less
> excited about how "good the tool is" and, instead, wonder why *I* can't
> compensate for the tool's quality.
Long ago, anyone wanting to make a C compiler for a new processor would either buy the front end and write their own code generator, or would pay a compiler company to write the code generator to go with their existing front end. Only hobby developers would write their own C front end - for professionals, it was not worth the money unless they were a full-time compiler development company. So you got your front-end already made, with whatever features and warnings it supported. Clearly, the range of features would vary here. And sometimes you wanted to add your own extra features for better support of the target. Now, anyone wanting to make a C compiler for a new processor starts with either gcc or clang, and writes the code generator - again, the front-end is there already.
>
> What were tools like 20 years ago? Could you approach your client/employer
> and make the case for investing in the development of a compiler with
> 2016 capabilities? Running, *effectively*, on 1995 hardware and OS?
> Or, would you, instead, argue that the project should be deferred until
> the tools were more capable?
The tools I used 20 years ago were not as good as the ones I use now. And the tools I used 20 years ago were not as good as the best ones available 20 years ago - the budget did not stretch. But now, the budget /does/ stretch to high quality tools - for most microcontrollers, /everybody's/ budget stretches far enough because high quality compiler tools are free or very cheap. There are a few microcontrollers where that is not the case (the 8051, the unkillable dinosaur, being an example), but tool quality and price is a factor many people consider when choosing a microcontroller. And how relevant are 20 year old tools to the work I do /today/, writing code /today/ ? Not very relevant at all, except for occasional maintenance of old projects.
>
> Or, would you develop work styles that allowed you to produce reliable
> products with the tools at your disposal??
The whole point is that 20 years ago I had to have a style that made sense 20 years ago with the tools of that era. Now I have a style that is suited to the tools of /this/ era. Not a lot has changed, because the guiding principles have been the same, but many details have changed. Function-like macros have morphed into static inline functions, home-made size-specific types have changed to uint16_t and friends, etc. Some of my modern style features, such as heavy use of static assertions, could also have been used 20 years ago - I have learned with time and experience. But I refuse to write modern code of poorer quality and with fewer features simply because those features were not available decades ago - or even because those features are not available on all modern compilers.
>> To be used in a safe, reliable, maintainable, and understandable way,
>> you need to limit the C features you use and the style you use. (This
>> applies to any programming language, but perhaps more to C than other
>> languages.) And you use whatever tools you can to help spot errors as
>> early as possible.
>>
>>>>> y = x / (*p);
>>>>
>>>> Me too.
>>>
>>> Because the compiler would gladly think "y=x/*p;" was THREE characters
>>> before the start of a comment -- without telling you that it was
>>> interpreting it as such. Again, you turn the crank and then spend
>>> precious time staring at your code, wondering why its throwing an error
>>> when none should exist (in YOUR mind). Or, worse, mangling input:
>>>
>>> y=/*p /* this is the intended comment */
>>> x=3;
>>>
>>
>> Again, spaces are your friend.
>
> Spaces (or parens) are a stylistic choice. The language doesn't mandate
> their use. E.g., this is perfectly valid:
>
> y=/*I want to initialize two variables to the same value
> in a single stmt*/x=3;
>
> It's a bad coding STYLE but nothing that the compiler SHOULD complain
> about!
Compilers can, do, and should complain about particularly bad style. It's
important that such complaints are optional - and for compatibility, they
are usually disabled by default.

There is no clear division between what is simply a stylistic choice ("x=3"
vs. "x = 3", for example), and what is a really /bad/ idea, such as putting
comments in the middle of a statement. Thus any complaints about style need
to be configurable. But there is no doubt that such warnings can be helpful
in preventing bugs. Warning on "if (x = 3)" is a fine example. Another is
gcc 6's new "-Wmisleading-indentation" warning that will warn on:

   if (x == 1)
      a = 2;
      b = 3;

Code like that is wrong - it is bad code, even if it is perfectly legitimate
C code, and even if it happens to work. It is a good thing for compilers to
complain about it.
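(The fix the warning points toward: braces make the intended grouping
explicit. With GCC 6 or later, -Wmisleading-indentation is enabled as part
of -Wall:)

   #include <stdio.h>

   int main(void) {
       int x = 1, a = 0, b = 0;

       if (x == 1) {
           a = 2;
           b = 3;    /* now unambiguously inside the conditional */
       }
       printf("%d %d\n", a, b);
       return 0;
   }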
>>>> Thus I would always write:
>>>>
>>>> if (expression == constant) { .... }
>>>>
>>>> (I'm not a fan of shouting either - I left all-caps behind with BASIC
>>>> long ago.)
>>>
>>> I ALWAYS use caps for manifest constants.
>>
>> Many people do. I don't. I really do not think it helps in any way,
>> and it gives unnecessary emphasis to a rather minor issue, detracting
>> from the readability of the code.
>
> Would you write:
> const zero = 0;
> const nine = 9;
> for (index = zero; index < nine; index++)...
> Or:
> const start = zero;
> const end = nine;
> for (index = start; index < end; index++)...
> Or:
> for (index = START; index < END; index++)...
No, I would not write any of that. I would not write "const zero = 0" for
several reasons. First, it is illegal C - it needs a type. Second, such
constants are usually best declared static. Third, it is pointless making
a constant whose name is its value. But I /might/ write:

   static const int start = 0;
   static const int end = 9;

   for (int index = start; index < end; index++) {
       ...
   }

Even that is quite unlikely - "start" and "end" would usually have far
better names, while "index" is almost certainly better written as "i".
(But that is a matter of style :-) ).
> The latter makes it abundantly clear to me, without having to chase down
> the declaration/definition of "start", "end", "zero" or "nine". (Who is
> to say whether "zero" actually maps to '0' vs. "273" -- 0 degrees C, in
> kelvins!)
The only thing the START and END form makes abundantly clear is that you really, really want everyone looking at the code to see at a glance that START and END won't change - and that is far more important than anything else about the code, such as what it does. If "zero" does not map to zero, don't call it "zero". Call it "zeroK", or "lowestTemperature", or whatever.
>> And again, you are missing the point. I am not writing code for a 40
>> year old compiler. Nor is the OP. For the most part, I don't need to
>> write unpleasant code to deal with outdated tools.
>
> No, YOU are missing the point!
Certainly we seem to be talking at cross-purposes here. It is a matter of viewpoint who is "missing the point" - probably both of us.
> I'm writing code with 40 years of
> "respectable track record". You're arguing that I should "fix"
> something (i.e., my style preferences) that isn't broken. Because
> it differs from what your *20* year track record has found to be
> acceptable. Should we both wait and see what next year's
> crop of developers comes up with? Maybe Hungarian notation will be
> supplanted by Vietnamese notation? Or, we'll decide that using identifiers
> expressed in Esperanto is far more universally readable?
Yes, I am arguing that if something in your style is no longer the best
choice for modern programming, then you certainly should consider changing
it. Clearly you will not do so without good reason, which is absolutely
fine.

I am also arguing against recommending new people adopt a style whose
benefits are based on ancient tools and your own personal habits. Modern
programmers should adopt a forward-looking style that lets them take
advantage of modern tools - there is no benefit in adopting your habits or
my habits, simply because /we/ are used to them. There are benefits in
using, or at least being familiar with, common idioms and styles. But that
should not be an overriding concern. Keep good habits, if they are still
good - but drop bad habits.
>> I have, in the past, used far more limited compilers. I have worked
>> with compilers that only supported C90/ANSI, meaning I can't mix
>> declarations and statements. I have worked with compilers that have
>> such god-awful code generation that I had to think about every little
>> statement, and limit the number of local variables I used. Very
>> occasionally, I have to work on old projects that were made using such
>> tools - and I have to go back to the dark ages for a while.
>>
>> Perhaps, long ago when the tools I had were poor and the company I
>> worked for did not have the budget for a linter, it might have made
>> sense to write "if (CONSTANT == expression)". But not now - and I
>> don't think there will be many developers working today for which it
>> would still make sense.
>
> I can then argue that we shouldn't bother with such an archaic language,
> at all! Look at all the cruft it brings along with it! Why not wait to
> see what pops up tomorrow as the language (and development style) du jour?
> Or, why risk being early adopters -- let's wait a few *weeks* before
> adopting tomorrow's advances!
There is a balance between choosing something that is mature, field proven
and familiar, and choosing something that is newer and has benefits such as
efficiency, clarity, flexibility, safety, etc.

I think that the large majority of work done in C would be better written in
a different language, were it not for two factors - existing code written in
C, and existing experience of the programmer in C. For most programming
tasks, C /is/ archaic - it is limited, inflexible, and error prone. For
some tasks, its limitations and its stability as a language are an
advantage. But for many tasks, if one could disregard existing C
experience, it is a poor choice of language. Thus a lot of software on
bigger systems is written in higher level languages, such as Python, Ruby,
etc. A lot of software in embedded systems is written in C++ to keep
maximal run-time efficiency while getting more powerful development
features. New languages such as Go are developed to get a better balance of
the advantages of different languages and features.

For a good deal of embedded development, the way forward is to avoid archaic
and brain-dead microcontrollers such as the 8051 or the PIC. Stick to solid
but modern processors such as ARM or MIPS cores. And move to C++ - /if/ you
are good enough to learn and understand how to use that language well in
embedded systems. I would wait a few years, not weeks, but not decades,
before adopting new languages for embedded programming. Maybe Go will be a
better choice in a few years.

We've been through all this with "C vs. assembly" - and there are plenty of
people that still use assembly programming for embedded systems because "it
was good enough for my grandfather, it's good enough for me", or because
they simply refuse to move forward with the times. Like assembly, C will
never go away - but it /will/ move further and further into niche areas, and
be used "for compatibility with existing code and systems". In the
meantime, we can try and write our C code in the best way that modern tools
allow.
>
> Programming is still an art, not a science. You rely on techniques that
> have given you success in the past. When 25% of product returns are due
> to "I couldn't figure out how to USE this PoS!", that suggests current
> development styles are probably "lacking".
>
> <https://books.google.com/books?id=pAsCYZCMvOAC&pg=PA130&lpg=PA130&dq=product+returns+confusion&source=bl&ots=P_v4nTI0m8&sig=XZ6VGEOtuyJOG7Kwkcd2SMV2_6w&hl=en&sa=X&ved=0ahUKEwiDwtDKt4vPAhVk0YMKHfO-C-AQ6AEINjAD>
We have only been talking about coding styles, which are a small part of development styles. And development styles are only a small part of products as a whole. Learning to use spaces appropriately and not using Yoda-speak for your conditionals will not mean end-users will automatically like your product!
>>>>> [would you consider "if (foo==bar())" as preferred to "if
>>>>> (bar()==foo)"?]
>>>>
>>>> I'd usually try to avoid a function call inside an expression inside a
>>>> conditional, but if I did then the guiding principle is to make it
>>>> clear to read and understand.
>>>
>>> Which do you claim is clearer?
>>> Imagine foo is "errno".
>>> After answering, rethink it knowing that errno is actually a function
>>> invocation on this platform.
>>
>> Neither is clear enough.
>
> So, what's your solution?
It would depend on the rest of the context, which is missing here, but I'd
guess it would be something like:

   int noOfWhatsits = bar();
   if (noOfWhatsits == foo) {
       ...
   }

Local variables are free, and let you divide your code into clear and
manageable parts, and their names let you document your code. I use them a
lot. (I have also used older and weaker compilers that generated poorer
code if you had lots of local variables - I am glad my current style does
not have to handle such tools.)
>>>>> Much of this is a consequence of how I learned to "read" (subvocalize)
>>>>> code to myself:
>>>>> x = y;
>>>>> is "x gets the value of y", not "x equals y". (how would you read
>>>>> x ::= y ?)
>>>>> if ( CONSTANT == expression )
>>>>> is "if expression yields CONSTANT". Catching myself saying "if
>>>>> CONSTANT gets expression" -- or "if variable gets expression":
>>>>> if ( CONSTANT = expression )
>>>>> if ( variable = expression )
>>>>> is a red flag that I'm PROBABLY not doing what I want to do.
>>>>>
>>>>> I'll wager that most folks read:
>>>>> x = y;
>>>>> as:
>>>>> x equals y
>>>>
>>>> I can't say I vocalize code directly - how does one read things like
>>>> brackets or operators like -> ? But "x = y" means "set x equal to y".
>>>
>>> And, x ::= y effectively says the same -- do you pronounce the colons?
>>> Instead, use a verb that indicates the action being performed:
>>> x gets y
>>
>> I don't pronounce colons. I also don't use a programming language with
>> a ::= operator. But if I did want to pronounce it, then "x gets y"
>> would be fine.
>
> Then you come up with alternative ways of conveying the information
> present in that symbology.
>
> E.g., "::=" (used to initialize a variable) vs '=' (used to define
> *constants*, but not variables, and test for value equality) vs ":=:"
> (to test for address equality) vs ":=" (chained assignment) in Algol;
> '=' vs "==" in C; ":=" vs. '=' and ':' and "==" in Limbo; etc.
Different languages have different symbols - yes, I know that.
> When conversing with someone not fluent in a language, the actual
> punctuation plays an important role.
I am lucky enough to have full use of my hands and my eyes, as well as my mouth. The same applies to other people I discuss code with. I would not try to distinguish "x = y" and "x == y" verbally - I would /write/ it.
> Saying "x gets y" to someone > who isn't familiar with Algol's "::=" would PROBABLY find them > writing "x = y". When "reading" Limbo code to someone, I resort > to "colon-equals", "equals" and "colon" so I'm sure they know > exactly what I'm saying (because the differences are consequential) > >>> I learned to code with left-facing arrows instead of equal signs >>> to make the "assignment" more explicit. >> >> I think that <- or := is a better choice of symbol for assignment. The >> designers of C made a poor choice with "=". >> >>>>> and subtly change their intent, but NOT their words, when >>>>> encountering: >>>>> if ( x == y ) >>>>> as: >>>>> if x equals y >>>> >>>> "if x equals y", or "if x is equal to y". >>>> >>>>> (how would they read "if ( x = y )"?) >>>> >>>> I read it as "warning: suggest parentheses around assignment used as >>>> truth value" :-) >>> >>> No, your *compiler* tells you that. If you've NEVER glossed over >>> such an assignment in your code, you've ALWAYS had a compiler holding >>> your hand for you. I'm glad I didn't have to wait 10 years for such >>> a luxury before *I* started using C. >> >> I have used warnings on such errors for as long as I have used >> compilers that >> conveniently warned about them. But I can say it is very rare that a >> compiler >> /has/ warned about it - it is not a mistake I have made more than a >> couple of >> times. Still, I want my tools to warn me if it were to happen. > > I want my tools to warn me of everything that they are capable of > detecting.
Good. Then we agree - make the best use of the best tools available.
> INCLUDING the tool that occupies my cranial cavity!
I agree. That means not distracting it with things that are easily found automatically by compilers and other tools, so that your mind can concentrate on the difficult stuff.
>
>>> I'll frequently store pointers to functions where you might store
>>> a variable that you conditionally examine (and then dispatch to
>>> one of N places in your code).  In my case, I just invoke the
>>> function THROUGH the pointer -- the tests were done when I *chose*
>>> which pointer to stuff into that variable (why should I bother to
>>> repeat them each time this code fragment is executed?)
>>>
>>> When you do code reviews, do you all sit around with laptops and
>>> live code copies?  Or, do you pore over LISTINGS with red pens?
>>> Do you expect people to just give the code a cursory once-over?
>>> Or, do you expect them to *read* the code and act as "human
>>> compilers/processors"?
>>
>> I expect code to be as simple to understand as possible, so that it
>> takes as little time or effort as possible for other people to read.
>> That way others can spend their effort on confirming that what the
>> code does is the right thing, rather than on figuring out what the
>> code does first.
>
> What's simple to me might not be simple to you!  E.g., I naturally
> think in terms of pointers.  Others prefer array indexes.  For me
> to jump/call *through* a pointer is far more obvious than encoding
> some meaning into a flag at one point in the program, then decoding
> that flag at another point and dispatching based on that decode!  The
> latter requires keeping two pieces of code in sync (encode & decode).
> The former puts everything in one place.
Agreed - there is plenty of scope for personal variation and style here.
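
To make that difference concrete, here is a minimal sketch of the
"choose once, dispatch through a pointer" style (all names are
hypothetical, not from anyone's actual code):

    #include <stdio.h>

    typedef void (*handler_t)(void);

    static void handle_text(void)   { puts("text path"); }
    static void handle_binary(void) { puts("binary path"); }

    int main(void)
    {
        int is_binary = 0;                  /* decided once, up front   */

        /* Encode the decision as the pointer itself...                 */
        handler_t handler = is_binary ? handle_binary : handle_text;

        /* ...so every later call site just dispatches through it - no  */
        /* flag to re-test, and no encode/decode pair to keep in sync.  */
        handler();
        handler();

        return 0;
    }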
>
> Should I write code at a level that a newbie developer can understand?
> Should I limit myself to how expressive and productive I can be out of
> fear that someone might not be quick to grasp what I've written?
The right balance here will vary depending on the circumstances - there is no single correct answer (but there are many wrong answers).
>
>>> There is no such thing as a perfect style.  You adopt a style that
>>> works for you and measure its success by how buggy the code is (or
>>> is not) that you produce and the time that it takes you to produce it.
>>
>> Yes.
>>
>>> I suspect if I sat you down with some of these early and one-of-a-kind
>>> compilers, you'd spend a helluva lot more time chasing down bugs that
>>> the compiler didn't hint at.  And, probably run the risk of some
>>> creeping into your release (x=y=3) that you never caught.
>>
>> No.
>>
>> Static checking by the compiler is not a substitute for writing
>> correct code.
>
> Where am I writing "INcorrect code"?  It meets the specifications of
> the language.  It compiles without errors or warnings.  It fulfills
> the goals of the specification.
There is more to writing good code than that (and I know you know that). Whether you call bad code that happens to work "correct" or "incorrect" is up to you. But my point here was that you seem to imply I write code with little regard for it being correct or incorrect, and then rely on the compiler to find my errors.
>
> It's just that *you* don't like my style!  Is that what makes it "not
> correct"?
I suspect that in the great majority of cases where I don't like your
style, it is nothing more than that.  I might think it is not clear or
easy to understand, or not as maintainable as it could be, or simply
ugly and hard to read, or not as efficient as other styles.  I can't
say for sure, since about the only things I know for sure about your
style are that you like to write "if (3 == x)" rather than "if (x == 3)",
and that you like function pointers.  It takes a lot more than that for
me to label code as "incorrect" or "bad" (assuming the final result
does the job required).
> If I replaced the identifiers for manifest constants with lowercase
> symbols, would that make it MORE correct?  Should I use camelcase
> identifiers?  Hungarian notation?  Embedded underscores?  How do any
> of these make the code more or less "correct"?
If a change makes code clearer, then it is a good thing. Visually splitting the words in a multi-word identifier makes code clearer - whether that is done using camelCase or underscores is a minor issue. Small letters are easier to read (that's why they exist!), and avoid unnecessary emphasis - that makes them a good choice in most cases. And there is rarely any benefit in indicating that an identifier is a constant or a macro (assuming it is defined and used sensibly) - so there is no point in making such a dramatic distinction.
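
For illustration, the same constant written in a few of the styles
under discussion (a contrived sketch, not code from this thread):

    #define MAX_RETRIES   5              /* traditional SHOUTING macro     */
    static const int maxRetries  = 5;    /* camelCase, no special emphasis */
    static const int max_retries = 5;    /* underscores, same information  */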
>
>> You make it sound like I program by throwing a bunch of random symbols
>> at the compiler, then fixing each point it complains about.  The
>> checking is automated confirmation that I am following the rules I use
>> when coding, plus convenient checking of silly typos that anyone makes
>> on occasion.
>>
>>> Instead of pushing an update to the customer over the internet, you'd
>>> send someone out to their site to install a new set of EPROMs.  Or,
>>> have them mail the product back to you as their replacement arrived
>>> (via USPS).  Kind of annoying for the customer when he's got your
>>> device on his commercial fishing boat.  Or, has to lose a day's
>>> production while it's removed from service, updated and then
>>> recertified.
>>
>> I have mailed EPROMs to customers, or had products delivered back for
>> program updates.  But never due to small mistakes of the sort we are
>> discussing here.
>
> I've *never* had to update a product after delivery.  Because the cost
> of doing so would easily exceed the profit margin in the product!
>
> I've had clients request modifications to a product; or tweaks for
> specific customers.  But, those aren't "bug fixes", they're effectively
> "new products" that leverage existing hardware.
And those are updates after delivery.  There are many perfectly good
reasons for updating software after delivery.  All I said was that I
have provided updates in a variety of ways, and for a variety of
reasons - but never for the sort of mistakes that you seem to think you
are immune to because you learned to program with limited tools, and
that you think /I/ make all the time because I take advantage of modern
tool features.
> "No, there will be no upgrades. You will get it right the first time!" >
That is fine for some projects.  I have had cards that have been
cemented into the ocean floor - upgrades are not physically possible.
And on other projects, customers want to be able to have new features
or changes at a later date.  I think everyone agrees that shipping
something that does not work correctly, and then updating it with bug
fixes, is always a bad idea - just /how/ bad will vary.
>> The new programmer should learn to take advantage of new tools, and
>> concentrate efforts on problems that are relevant rather than problems
>> that are no longer an issue (to the extent that they ever /were/ an
>> issue).  And if that new programmer is faced with poorer tools, then
>> he or she will have to learn the idiosyncrasies at the time.  And the
>> lack of warnings on "if (x = 1)" is unlikely to be the most important
>> point.
>
> Without the warning, the developer is likely to waste a boatload of
> time "seeing what he wants to see" and not what the *compiler* sees.
Without decent warnings, developers (especially new ones) are likely to spend a good deal more time chasing small bugs than they would if the compiler or linter helped them out. But why do you think this particular issue is so important? New C programmers are often told how important it is to distinguish between = and ==, so it is something they look out for, and avoid in most cases. And the Yoda rule only helps in /some/ cases where you have comparisons - you still need to get your = and == right everywhere else.
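
To make that limit concrete (a small sketch, not code from anyone's
project):

    #include <stdio.h>

    int main(void)
    {
        int x = 0, y = 1;

        /* Constant-first style turns a typo into a hard error:        */
        /*     if (3 = x) ...     <- rejected: 3 is not an lvalue      */
        if (3 == x)
            puts("x is 3");

        /* But it cannot help when both operands are variables - the   */
        /* mistyped form below compiles, silently assigning x to y     */
        /* (a decent compiler will at least warn about it):            */
        /*     if (y = x) ...                                          */
        if (y == x)
            puts("x equals y");

        return 0;
    }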
>
>>> There's a reason you can look at preprocessor output and ASM
>>> sources.  Because they are the only way to understand some nasty
>>> bugs that may arise from your misunderstanding of what a tool
>>> SHOULD be doing; *or*, from a defective tool!  (a broken tool
>>> doesn't have to tell you that it's broken!)
>>
>> I have almost /never/ had occasion to look at pre-processor output.
>> But I do recommend that embedded programmers should be familiar enough
>> with the assembly for their targets that they can look at and
>> understand the assembly listing.
>
> You've probably never "abused" the preprocessor in creative ways!  :>
I have abused preprocessors a bit (any use of ## is abuse!), but I haven't had to look at the output directly to debug that abuse. Maybe I haven't been creative enough in my abuses here.
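
For instance, a deliberately contrived bit of ## token-pasting (names
hypothetical):

    #include <stdio.h>

    /* Manufacture accessor functions by pasting tokens together. */
    #define DEFINE_GETTER(field) \
        static int get_##field(void) { return field; }

    static int speed = 42;
    static int depth = 7;

    DEFINE_GETTER(speed)   /* expands to: static int get_speed(void)... */
    DEFINE_GETTER(depth)

    int main(void)
    {
        printf("%d %d\n", get_speed(), get_depth());
        return 0;
    }

When something like this misbehaves, running the file through the
preprocessor alone (e.g. gcc -E file.c) shows the expanded text, which
is often the quickest way to see why a pasted identifier did not come
out as expected.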
On 13/09/16 12:24, upsidedown@downunder.com wrote:
> On Tue, 13 Sep 2016 01:11:09 -0700, Don Y
> <blockedofcourse@foo.invalid> wrote:
>
>> On 9/13/2016 12:11 AM, upsidedown@downunder.com wrote:
>>> For programming, the most convenient way was to switch the terminal to
>>> US-ASCII, but unfortunately the keyboard layout also changed.
>>>
>>> Digraphs and trigraphs partially solved this problem, but they were
>>> ugly.
>>>
>>> With the introduction of 8-bit ISO Latin, a large number of languages
>>> and programming could be handled with that single code page, greatly
>>> simplifying data exchange in Europe, the Americas, Africa and
>>> Australia.
>>
>> The problem with Unicode is that it makes the problem space bigger.
>> It's relatively easy for a developer to decide on appropriate
>> syntax for file names, etc. with ASCII, Latin-1, etc.  But, start
>> allowing for all these other code points and suddenly the
>> developer needs to be a *linguist* in order to understand what
>> should/might be an appropriate set of constraints for his particular
>> needs.
>
> For _file_ names, not a problem: stop scanning at the next white space
> (or null in C).  Everything in between is the file name, no matter what
> characters are used.
That sounds fine - but what is "white space" in Unicode?  In ASCII,
it's the space, tab, newline and carriage return characters.  In
Unicode, there are far more: invisible spaces, non-breaking spaces,
spaces of different widths, etc.  Did you remember to check for the
Ogham space mark, for those Celtic file names?

Use UTF-8 and stop on a null character.  Just let people put spaces of
any sort in their filenames, and you only have to worry about / (or \
and :) as special characters.
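
A minimal sketch of why that works: UTF-8 guarantees that every byte of
a multi-byte character has its high bit set, so an ASCII byte like '/'
can never occur inside some other character (function and file names
here are hypothetical):

    #include <stdio.h>

    /* Split a UTF-8 path on '/' only, treating everything else -     */
    /* including exotic Unicode spaces - as ordinary filename bytes.  */
    /* In UTF-8, all bytes of multi-byte characters are >= 0x80, so   */
    /* a 0x2F byte can only ever be a real '/'.                       */
    static void print_components(const char *path)
    {
        const char *start = path;
        for (const char *p = path; ; p++) {
            if (*p == '/' || *p == '\0') {
                printf("[%.*s]\n", (int)(p - start), start);
                if (*p == '\0')
                    break;
                start = p + 1;
            }
        }
    }

    int main(void)
    {
        /* "p\xC3\xA1th" is UTF-8 for a name containing a non-ASCII    */
        /* character; the spaces are just filename bytes.              */
        print_components("home/p\xC3\xA1th with spaces/file.txt");
        return 0;
    }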
> For _path_ specifications, there must be some rules for how to separate
> the node, path, file name, file extension and file version from each
> other.  The separator and other syntactic elements are usually chosen
> from the original 7-bit ASCII character set.  What is between these
> separators is irrelevant.
>
>> Also, we *tend* to associate meaning with each of the (e.g.) ASCII
>> code points.  So, 16r31 is the *digit* '1'.  There's consensus on
>> that interpretation.
>
> For _numeric_entry_ fields, including the characters 0-9 requires a
> fallback mapping for the Arabic digits.
>
> As strange as it sounds, the numerals used in Arabic countries differ
> from those used in Europe.
That's because our "Arabic numerals" came from India, not Arabia -
they reached Europe by way of Arabic mathematicians.  I believe that
in Arabic, the term for them translates as "Indian numerals".
>
>> However, Unicode just tabulates *glyphs* and deprives many of them
>> of any particular semantic value.  E.g., if I pick a glyph out
>> of U+[2800,28FF], there's no way a developer would know what my
>> *intent* was in selecting the glyph at that codepoint.
>
> As long as it is just payload data, why should the programmer worry
> about it?
>
>> It could "mean" many different things (the developer has to IMPOSE a
>> particular meaning -- like deciding that 'A' is not an alphabetic
>> but, rather, a "hexadecimal character" IN THIS CONTEXT)
>
> In Unicode, there are code points for hexadecimal 0 to F.  Very good
> idea to separate the dual usage for A to F.
> Has anyone actually used those code points?
There are lots of cases where the same glyph exists at multiple Unicode
code points, for different purposes - for example, Latin 'A' (U+0041),
Greek capital alpha (U+0391) and Cyrillic 'А' (U+0410) are visually
identical in many fonts.  I have no idea how often they are used.
On 09/13/2016 05:25 AM, David Brown wrote:

>
>
> If a change makes code clearer, then it is a good thing.  Visually
> splitting the words in a multi-word identifier makes code clearer -
> whether that is done using camelCase or underscores is a minor issue.
I'll go off on a tangent - it can be an important issue.  I once worked
with a guy who was visually impaired and used a screen reader for much
of his work.  The underscore form would read as (spoken)word
(spoken)underscore (spoken)word..., whereas camelCase would cause the
reader to give up and spell it all out.  We referred to the underscore
form as "easy reader code".  This was over a decade ago, so screen
readers may be smarter now.
> Small letters are easier to read (that's why they exist!), and avoid
> unnecessary emphasis - that makes them a good choice in most cases.
> And there is rarely any benefit in indicating that an identifier is a
> constant or a macro (assuming it is defined and used sensibly) - so
> there is no point in making such a dramatic distinction.
On 13/09/16 17:22, Dennis wrote:
> On 09/13/2016 05:25 AM, David Brown wrote:
>
>> If a change makes code clearer, then it is a good thing.  Visually
>> splitting the words in a multi-word identifier makes code clearer -
>> whether that is done using camelCase or underscores is a minor issue.
>
> I'll go off on a tangent - it can be an important issue.  I once worked
> with a guy who was visually impaired and used a screen reader for much
> of his work.  The underscore form would read as (spoken)word
> (spoken)underscore (spoken)word..., whereas camelCase would cause the
> reader to give up and spell it all out.  We referred to the underscore
> form as "easy reader code".  This was over a decade ago, so screen
> readers may be smarter now.
Unless it is a screen reader specially designed for code, I'd imagine
it would have trouble with camelCase words.  I think Don knows more
about this sort of program.  But you are absolutely right that there
can be particular circumstances that determine our choices here, and
have overriding importance.
>
>> Small letters are easier to read (that's why they exist!), and avoid
>> unnecessary emphasis - that makes them a good choice in most cases.
>> And there is rarely any benefit in indicating that an identifier is a
>> constant or a macro (assuming it is defined and used sensibly) - so
>> there is no point in making such a dramatic distinction.
