This is a group for folks designing and programming embedded systems using the Rabbit Semiconductor C-programmable microcontroller. Rabbit Semi is a spin-off from Z-World who makes a variety of embedded modules and tools. This group is not affiliated with either Rabbit or Z-World, but is a user forum for sharing ideas, asking questions, flaunting knowledge, and other typical user group stuff. The Rabbit is a powerful uC, supported by a full-featured C-compiler.

(unknown) - Steve Trigero - Aug 12 21:35:12 2008

Given the sample code below, what value of 'len' should be passed to CheckSum()?

int len;
char p[20];

strcpy( p, "Hello" );
len = strlen(p);
p[len++] = CheckSum( p, len );
p[len] = NUL;

I have painfully determined that DC will sometimes, and I stress "Sometimes",
increment len before calling CheckSum and sometimes after. It's not even
consistent!

Steve


(You need to be a member of rabbit-semi -- send a blank email to rabbit-semi-subscribe@yahoogroups.com )


RE: (unknown) - Don Starr - Aug 12 22:48:29 2008

> Given the sample code below, what value of 'len' should be pass
> to CheckSum()?
>
> int len;
> char p[20];
>
> strcpy( p, "Hello" );
> len = strlen(p);
> p[len++] = CheckSum( p, len );
> p[len] = NUL;
>
> I have painfully determined that DC will sometimes, and I stress
> "Sometimes", increment len before calling CheckSum and sometimes
> after. It's not even consistent!
>

Undefined behavior, I think.

In:
p[len++] = CheckSum( p, len );

you're both modifying and accessing 'len' in the same expression,
without an intervening sequence point. The call to CheckSum() is
a sequence point, but there's no guarantee that the right-hand
side of the expression will be evaluated before the lvalue on the
left-hand side is calculated.

That line of code should likely be split into two:
p[len] = CheckSum( p, len );
len++;
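
A minimal sketch of the split form as a complete function. The thread never shows CheckSum() itself, so the byte-sum version here is only an assumed stand-in, AppendCheckSum() is a hypothetical wrapper name, and NUL is written as '\0':

```c
#include <string.h>

/* Hypothetical stand-in for the poster's CheckSum(): the thread never
   shows its body, so this sketch just sums the first len bytes. */
char CheckSum(const char *buf, int len)
{
    unsigned char sum = 0;
    int i;

    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + (unsigned char)buf[i]);
    return (char)sum;
}

/* Appends a checksum byte and a terminator to the string in p and
   returns the new length. len is read and modified in separate
   statements, so each step has exactly one meaning. The caller must
   leave room for two extra bytes. */
int AppendCheckSum(char *p)
{
    int len = (int)strlen(p);

    p[len] = CheckSum(p, len);  /* len is only read here */
    len++;                      /* modified in its own statement */
    p[len] = '\0';              /* NUL in the original post */
    return len;
}
```

With p holding "Hello", the checksum is computed over 5 bytes, stored at p[5], and the terminator lands at p[6].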

Re: (unknown) - Tom Collins - Aug 13 0:40:07 2008

On Aug 12, 2008, at 6:35 PM, Steve Trigero wrote:
> Given the sample code below, what value of 'len' should be pass to
> CheckSum()?
>
> int len;
> char p[20];
>
> strcpy( p, "Hello" );
> len = strlen(p);
> p[len++] = CheckSum( p, len );
> p[len] = NUL;
>
> I have painfully determined that DC will sometimes, and I stress
> "Sometimes", increment len before calling CheckSum and sometimes
> after. It's not even consistent!
Perhaps one of the language experts can step in and quote the section
of the C standard that defines the behavior when a variable is used on
both the left- and right-hand sides of an assignment.

Until then, I think that separating the increment is probably the
best idea. If you want to combine two statements, move the increment
to a pre-increment with the NUL.

    p[len] = CheckSum( p, len );
    p[++len] = NUL;

-Tom
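
Tom's two-statement form, sketched as a self-contained function. CheckSum() is again an assumed stand-in (an XOR of the bytes this time, since the real one is not shown in the thread), and AppendCheckSum() is a hypothetical name. The pre-increment is safe here only because it sits in its own statement, where len is modified once and not read anywhere else:

```c
#include <string.h>

/* Hypothetical CheckSum(): XOR of the first len bytes. The real
   function is not shown in the thread. */
char CheckSum(const char *buf, int len)
{
    char x = 0;

    while (len-- > 0)
        x ^= *buf++;
    return x;
}

/* Appends checksum and terminator using the pre-increment form.
   Returns the new length; the caller must leave two spare bytes. */
int AppendCheckSum(char *p)
{
    int len = (int)strlen(p);

    p[len] = CheckSum(p, len);  /* len is only read in this statement */
    p[++len] = '\0';            /* incremented once, then used as the index */
    return len;
}
```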




Re: (unknown) - Steve Trigero - Aug 13 0:42:47 2008

I split the line to get it to work. But what bothers me, is that I've done that
very same thing all over my application, and only the latest iteration of it
has been a problem. Does that mean that there are ticking time-bombs
all over my code that will break one-at-a-time as I modify the code?

It seems to me that the compiler should evaluate the expression the same
way every time. Not one way in one procedure and another way in a
second procedure.

Also, I guess I would disagree with your comment that there is no guarantee
that the right-side will be evaluated before the left side. According to my
K&R, an assignment expression evaluates right-to-left. I suppose some
bureaucrat may have changed that in the final standardization process,
but there is no logical reason to evaluate an assignment left-to-right. It
goes against all programming practice. In my never-too-humble opinion.

Steve


Re: (unknown) - Don Starr - Aug 13 1:49:13 2008

Yes, your code contains ticking time bombs if you're using that type
of construct.

If an expression modifies an object's value, then the object may only
be read in that expression in order to determine the value to be
stored. A common example of this, one explicitly shown in the ISO C
standard (ISO/IEC 9899:1999), is:
a[i++] = i;

My copy of K&R (second edition) has, in its section 2.12, a similar
example:
a[i] = i++;

Even K&R says there's no way to know how the expression will be
evaluated - you don't know what will be used as the "subscript": the
value before the increment or after. Even using a single compiler,
within a single source file, you might get different results in
different places - this can be caused by things like optimizations,
machine register usage, etc.

My K&R also says (in the same section 2.12):
"C, like most languages, does not specify the order in which the
operands of an operator are evaluated. (The exceptions are &&,
||, ?:, and ','.)"

This tells me that the operands of the assignment operator '=' (since
it wasn't listed among the exceptions) can be evaluated in any order.
Even avoiding the problematic undefined behavior above, the
expression:
    a[i++] = j + 1;
could be evaluated in this order:
    1. evaluate j + 1
    2. calculate lvalue a[i]
    3. store result of (1) into lvalue from (2)
    4. increment i
Or, it could be:
    1. calculate lvalue a[i]
    2. increment i
    3. evaluate j + 1
    4. store result of (3) into lvalue from (1)
The language doesn't specify the order. It's up to the
implementation, and the implementation could do it in different ways
depending on various conditions.
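
For a[i++] = j + 1 both orderings store the same value in the same element, because i and j are different objects. The order only becomes visible when the operands have side effects on shared state. A small sketch of that, using hypothetical logging functions f() and g() (nothing here comes from Dynamic C), and of how temporaries pin the order down:

```c
#include <string.h>

/* Hypothetical call log, used only to make evaluation order visible. */
char call_log[8];

/* Appends tag to the log and passes value through. */
static int record(char tag, int value)
{
    size_t n = strlen(call_log);

    if (n + 1 < sizeof call_log) {
        call_log[n] = tag;
        call_log[n + 1] = '\0';
    }
    return value;
}

int f(void) { return record('f', 2); }
int g(void) { return record('g', 3); }

/* The compiler may call f and g in either order here. The sum is the
   same either way, but call_log may read "fg" or "gf". */
int sum_unordered(void)
{
    call_log[0] = '\0';
    return f() + g();
}

/* Each full statement ends at a sequence point, so f always runs
   before g and call_log always reads "fg". */
int sum_ordered(void)
{
    int a, b;

    call_log[0] = '\0';
    a = f();
    b = g();
    return a + b;
}
```

The same trick applies to the checksum line: compute the right-hand side into a temporary, then do the store and the increment as separate statements.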


Re: (unknown) - fairy_dave - Aug 13 3:47:07 2008

I have found similar problems in my own code a couple of times. A
notable one was where I was trying to debounce an input with
something like:

    if (BIT(&inputByte, inputBit) != inValue[inputBit])
    {
        inCount[inputBit]++;
        if (inCount[inputBit] > 10)
        {
            inValue[inputBit] = BIT(&inputByte, inputBit);
            inCount[inputBit] = 0;
        }
    }

Should work fine: wait until the scanned value is different 10 times
in a row, then update the memory buffer. You'd think it should also
work when starting from an unknown state with crazy values in the
memory buffer. It worked fine for months, then I discovered it would
sometimes fail when it began with crazy values in the memory buffer:
they wouldn't get cleared!

Turns out the bit shifting/and'ing/or'ing/etc can seep from one side
of a comparison to the other. You have no idea how shocked I was to add
some debug expressions and get these results:
    BIT(&inputByte, inputBit)                       1
    inValue[inputBit]                               105
    BIT(&inputByte, inputBit) != inValue[inputBit]  0

Had to go back and assign all bit shifting operations to temporary
variables all through my code to fix that one up. Bottom line in my
mind is, DC is very useful, but cannot be trusted even with a simple
comparison.
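
A sketch of the temporary-variable workaround described above. read_bit() is a hypothetical local stand-in for the BIT() call, written so that it returns exactly 0 or 1; the array names and the threshold of 10 follow the snippet, while the array size of 8 and the element types are assumptions:

```c
#define DEBOUNCE_LIMIT 10   /* threshold taken from the snippet above */

unsigned char inValue[8];   /* last accepted state per bit (size assumed) */
unsigned char inCount[8];   /* mismatch count per bit */

/* Hypothetical stand-in for the BIT() call: returns exactly 0 or 1. */
int read_bit(const unsigned char *byte, int bit)
{
    return (*byte >> bit) & 1;
}

/* Debounce one input bit. The bit test is stored in a temporary first,
   so the comparison and the later assignment both use the same plain
   value instead of repeating the bit expression inline. */
void debounce(const unsigned char *inputByte, int inputBit)
{
    int now = read_bit(inputByte, inputBit);

    if (now != inValue[inputBit]) {
        inCount[inputBit]++;
        if (inCount[inputBit] > DEBOUNCE_LIMIT) {
            inValue[inputBit] = (unsigned char)now;
            inCount[inputBit] = 0;
        }
    }
}
```

Starting from a "crazy" stored value such as 105, the count passes the limit on the eleventh mismatching scan and the stored value is replaced by a clean 0 or 1.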


RE: Re: (unknown) - Jaysen Roper - Aug 13 9:41:31 2008

Pre-increment instead of post increment should fix it.

++i instead of i++


Re: Re: (unknown) - Steve Trigero - Aug 13 9:53:41 2008

Why? Pre-incrementing doesn't make sense to me. It seems like it would
ensure that the variable is the wrong one.


Re: Re: (unknown) - Steve Trigero - Aug 13 10:57:48 2008

Don,

Since I've seen it with my own eyes, you must be right.
The same statement can be compiled differently in
different parts of the code. But it still makes no sense
to me. You might as well tell me that coding when the
moon is full will get different results than when there is
no moon.

On page 21 of K&R second edition, referring to
statement "nl = nw = nc = 0;", it says "This is not a
special case, but a consequence of the FACT that
an assignment is an expression with a value and
assignments associate from right to left."
(emphasis mine).

If assignments associate from right to left, the 'len' (in my
example below) should not be incremented until after
the left hand side of the equal sign is evaluated. And
the left side of the equal sign should not be evaluated
until after the right side is evaluated.

As to section 2.12 in K&R that you identified, I read that
differently than you. In the examples they cite, the question
they appear to address is not which side of the equal sign
gets evaluated first, but what is the order of evaluation of
the variables on the RIGHT side of the equal sign. In their
example of x = f() + g();, the question is which function is
evaluated first, f or g? Not whether f gets assigned to x before
g gets evaluated. In their example of a[i] = i++;, the question
is, after evaluating the right side of the equal sign, what value
of i is used as the index. Both of those are valid issues. But
none of their examples suggests, or implies, that at any time
could a variable on the left side of an equal sign be evaluated
before the right side is evaluated. Otherwise, section 2.12
contradicts what was stated in section 1.5.4, page 21.

I would also draw your attention to something else said in section 2.12
of K&R. Referring to the order of evaluation, and how the order is
not specified, it says "different results" can occur with "different compilers."
Notice that it is different compilers that can produce different results.
Not the same line of code in different places in a program using the same
compiler getting different results. This is what bothers me the most.

But, all that said, K&R hits me upside the head with the reminder that
"writing code that depends on order of evaluation is a bad programming
practice..." While this is true as a guideline, it seems to beg the question.
A programmer has to depend on some level of defined evaluation or
programming is nothing more than throwing darts.

x = f() * g() + offset;

If I can't depend on the compiler evaluating the complete right side, with
the multiplication occurring before the addition, before assigning to x,
I'm peeing into the wind.

Now I have to go and grep my files looking for time bombs.

Thanks for your response.

Steve
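
On the precedence worry, a sketch of what C does guarantee, using hypothetical pure functions f() and g(). Precedence and associativity fix how an expression is grouped, so the multiplication always feeds the addition and the whole right side is computed before the store to x. What stays unspecified is only the order in which the operand subexpressions (the two calls, or an index on the left side) are evaluated:

```c
/* Hypothetical pure functions: no shared state, so the order in which
   they are called cannot change the result. */
int f(void) { return 4; }
int g(void) { return 5; }

int scaled(int offset)
{
    int x;

    /* Precedence guarantees this groups as (f() * g()) + offset, and
       the value is complete before it is stored in x. Whether f or g
       is called first is still up to the compiler. */
    x = f() * g() + offset;
    return x;
}

/* Right-to-left associativity is also about grouping: this is
   *nl = (*nw = (*nc = 0)), the K&R "nl = nw = nc = 0" case.
   The three pointers are assumed to refer to different objects. */
int clear_counts(int *nl, int *nw, int *nc)
{
    return *nl = *nw = *nc = 0;
}
```

So x = f() * g() + offset is dependable as long as f and g do not interfere with each other. The checksum line differs because 'len' is modified on one side and read on the other within the same expression.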


Re: (unknown) - eilidhs_daddy - Aug 13 11:31:41 2008

I think it just comes down to instruction optimisation in the compiler.

As it stands, it's definitely undefined behaviour, and as such, it's
almost the /DUTY/ of the compiler to do it differently at different
times, in order to force the developer to catch and correct uses of
undefined behaviour.

When I started out in Rabbit development I got caught out by this
exact problem. We live, and we learn!

-Kenny

> > p[len] = CheckSum( p, len );
> > len++;
>
------------------------------------




RE: Re: (unknown) - Don Starr - Aug 13 11:58:47 2008

>On page 21 of K&R second edition, referring to
>statement "nl = nw = nc = 0;", it says "This is not a
>special case, but a consequence of the FACT that
>an assignment is an expression with a value and
>assignments associate from right to left."
>(emphasis mine).
>
>If assignments associate from right to left, the 'len' (in my
>example below) should not be incremented until after
>the left hand side of the equal sign is evaluated. And
>the left side of the equal sign should not be evaluated
>until after the right side is evaluated.

That text in K&R describes ASSOCIATIVITY, not ORDER OF EVALUATION.
Those are two different things. The right-to-left associativity of
the '=' operator ONLY tells you that after evaluating this shorter
example:
a = b = 1;
both "a" and "b" are 1, instead of "a" having the prior value of "b"
and "b" having 1. That's all associativity tells you.
"Right-to-left associativity" does NOT mean that '1' is evaluated first,
"b" second, and "a" third. That would be "order of evaluation",
which is most certainly not specified.
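
A minimal sketch of that distinction, assuming nothing beyond standard C. The helper names (note, target, value, chained_assignment) are made up for illustration. Associativity fixes how the chained assignment is grouped, so both variables end up as 1; the log only shows which operand this particular compiler happened to evaluate first, and no order is assumed:

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers: each one records that its operand was evaluated. */
static char order_log[8];
static int  order_len = 0;

static void note(char tag)
{
    order_log[order_len++] = tag;
    order_log[order_len] = '\0';
}

static int *target(int *p, char tag) { note(tag); return p; }
static int  value(int v, char tag)   { note(tag); return v; }

/* Grouped as  *target(a) = (*target(b) = value(1))  because '='
 * associates right to left. The stored results are guaranteed;
 * the order recorded in order_log is not. */
static void chained_assignment(int *a, int *b)
{
    order_len = 0;
    *target(a, 'a') = *target(b, 'b') = value(1, '1');

    assert(*a == 1 && *b == 1);   /* what associativity promises */
    printf("operands were evaluated in the order: %s\n", order_log);
}
```

Calling chained_assignment(&x, &y) on two distinct ints always leaves both equal to 1, while the printed order may differ between compilers or optimization settings.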

>But none of their examples suggests, or implies, that at any time
>could a variable on the left side of an equal sign be evaluated
>before the right side is evaluated.

K&R EXPLICITLY says exactly that in section 2.12 that I quoted
previously:
"C, like most languages, does not specify the order in which
the operands of an operator are evaluated. (The exceptions
are &&, ||, ?:, and ','.)"

Remember that the '=' above is an OPERATOR, just like any other
binary operator (+, -, etc.). It takes two operands, performs an
operation, and has a resulting value (about the only thing special
about the '=' operator is that its left-hand operand must be an
lvalue). Its operands are the two expressions that appear on the
left and right side of the operator. The order in which those
expressions are evaluated is unspecified.

>x = f() * g() + offset;
>
>If I can't depend on the compiler evaluating the complete right side, with
>the multiplication occurring before the addition, before assigning to x,
>I'm peeing into the wind.

You can count on the arithmetic "operator precedence". That is, you
can be sure that the product of f() and g() is added to "offset", with
the sum stored in "x". What you CANNOT depend on is f() being called
before g() or either one of them called before "offset" is evaluated.
You can't even count on the product of f() and g() being evaluated
before the compiler evaluates the lvalue of "x".

What you seem to be expecting is this:
1. evaluate f()
2. evaluate g()
3. multiply
4. evaluate "offset"
5. add
6. evaluate "x"
7. store

That's bad. C does not guarantee that order. The language only tells
you that the multiplication will happen before the addition, and the
addition will happen before the assignment. That's it. The compiler is
most certainly free to do the following (using fake register names
only as a general example):
1. evaluate "offset" and store in register R1
2. evaluate "x" and store the "address" in RA
3. call f() and store its return value in register R2
4. call g() and store its return value in register R3
5. multiply R2 and R3 and store product in R2
6. add R2 and R1 and store sum in R1
7. store R1 in address pointed to by RA
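
A small sketch of what can and cannot be relied on here, assuming hypothetical f() and g() that return fixed values and record when they ran. The asserts cover only what the language guarantees (both calls happen, and the arithmetic result); the call order is merely reported:

```c
#include <assert.h>
#include <stdio.h>

static int calls_seen = 0;   /* how many of f()/g() have run so far */
static int f_position = 0;   /* 1 if f() ran first, 2 if it ran second */

/* Hypothetical stand-ins for the f() and g() in the discussion. */
static int f(void) { f_position = ++calls_seen; return 3; }
static int g(void) { ++calls_seen; return 4; }

static int compute(int offset)
{
    int x;

    calls_seen = 0;
    x = f() * g() + offset;   /* precedence fixes the value: 3*4 + offset */

    /* Guaranteed: each function was called once, and the result. */
    assert(calls_seen == 2);
    assert(x == 12 + offset);

    /* Not guaranteed: whether f() ran before g(). */
    printf("f() ran %s g() with this compiler\n",
           f_position == 1 ? "before" : "after");
    return x;
}
```

If the call order matters, storing each result in its own statement first (for example "fv = f(); gv = g();") pins it down, because each semicolon is a sequence point.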

The above "alternate sequence" is precisely why "a[i++] = i" is bad.
The compiler is free to evaluate "a[i++]" on the left side of the
assignment operator BEFORE or AFTER it evaluates "i" on the right
side of the assignment operator. And "i" could be post-incremented
on the left BEFORE or AFTER its value is read on the right side.

The only operators for which "order of evaluation of the operands"
is guaranteed are: &&, ||, ?:, and ','. For example, since the &&
operator has its operands evaluated left-to-right, and since the
operands "stop evaluating" once the result of the expression is
known, this is safe:
if ( p != NULL && *p == 2 )
We know that the subexpression "p != NULL" is evaluated first. We
also know that, if that subexpression is FALSE (p == NULL), the
subexpression on the right of && isn't evaluated - this is because
the result of the expression is already known. Therefore, the code
will never try to get "*p" if "p" is a NULL pointer.
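
The && guarantee can be sketched as a tiny function. The names points_at_two and short_circuit_demo are invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 when p points at the value 2, and 0 otherwise
 * (including when p is NULL). */
static int points_at_two(const int *p)
{
    /* The left operand is evaluated first; if it is false,
     * the right operand (*p) is never evaluated. */
    return p != NULL && *p == 2;
}

static void short_circuit_demo(void)
{
    int two = 2, three = 3;

    assert(points_at_two(&two) == 1);
    assert(points_at_two(&three) == 0);
    assert(points_at_two(NULL) == 0);   /* safe: no NULL dereference */
}
```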

But, that's one very limited case where you know the order of
evaluation. As you pointed out, even K&R says "don't count on it".
------------------------------------




Re: Re: (unknown) - Steve Trigero - Aug 13 13:39:56 2008

If what you say is true, then the following statements:

i = 0;
a[i++] = val;

could also be compiled such that i is incremented first
so that val is stored in location 1 rather than the desired
location 0. The moral of the story then is that variables
should never be incremented or decremented unless they
are a stand alone statement. The above should always
be written:

a[i] = val;
i++;

> What you seem to be expecting is this:
> 1. evaluate f()
> 2. evaluate g()
> 3. multiply
> 4. evaluate "offset"
> 5. add
> 6. evaluate "x"
> 7. store

No. What I expect is that lines 1, 2, and 4 be done in any order, but
be done before anything else, followed by line 3, followed by line 5,
then lines 6 and 7.

Re: Re: (unknown) - Scott Henion - Aug 13 13:49:31 2008

Steve Trigero wrote:
> If what you say is true, then the following statements:
>
> i = 0;
> a[i++] = val;
>
> could also be compiled such that i is incremented first
> so that val is stored in location 1 rather than the desired
> location 0. The moral of the story then is that variables
> should never be incremented or decremented unless they
> are a stand alone statement. The above should always
> be written:
>
> a[i] = val;
> i++;

That will always be stored in location 0 in both cases.

The issue is when a variable is modified and also used elsewhere in the
same expression, such as on both sides of an = operator. Like:

x = x / ++x;

The order is undefined. Each of the two x's to the left of ++x could be
x or x+1 at any time, depending on the compiler.

--
------------------------------------------
| Scott G. Henion| s...@shdesigns.org |
| Consultant | Stone Mountain, GA |
| SHDesigns http://www.shdesigns.org |
------------------------------------------
Rabbit libs: http://www.shdesigns.org/rabbit/
today's fortune
"Been through Hell? Whaddya bring back for me?"
-- A. Brilliant



Re: (unknown) - Bill_CT - Aug 13 13:54:16 2008

--- In r...@yahoogroups.com, "Don Starr" wrote:
>
> > Given the sample code below, what value of 'len' should be pass
> > to CheckSum()?
> >
> > int len;
> > char p[20];
> >
> > strcpy( p, "Hello" );
> > len = strlen(p);
> > p[len++] = CheckSum( p, len );
> > p[len] = NUL;
> >
> > I have painfully determined that DC will sometimes, and I stress
> > "Sometimes", increment len before calling CheckSum and sometimes
> > after. It's not even consistent!
> >
>
> Undefined behavior, I think.
>
> In:
> p[len++] = CheckSum( p, len );
>
> you're both modifying and accessing in the same expression,
> without an intervening sequence point. The call to CheckSum() is
> a sequence point, but there's no guarantee that the right-hand
> side of the expression will be evaluated before the lvalue on the
> left-hand side is calculated.
>
> That line of code should likely be split into two:
> p[len] = CheckSum( p, len );
> len++;

It's true that len is being modified and accessed in the same
expression, but the function call must complete before the assignment
can be made. Since len can't change on the right-hand side, it can't
be unsafe.

The danger would be if someone came along and said let's use a macro
CheckSum to save the call/return overhead. Then there's a problem.

Bill

------------------------------------




Re: (unknown) - Bill_CT - Aug 13 13:57:16 2008

--- In r...@yahoogroups.com, "Don Starr" wrote:
>
> Yes, your code contains ticking time bombs if you're using that type
> of construct.
>
> If an expression modifies an object's value, then the object may only
> be read in that expression in order to determine the value to be
> stored. A common example of this, one explicitly shown in the ISO C
> standard (ISO/IEC 9899:1999), is:
> a[i++] = i;
>
> My copy of K&R (second edition) has, in its section 2.12, a similar
> example:
> a[i] = i++;
>
> Even K&R says there's no way to know how the expression will be
> evaluated - you don't know what will be used as the "subscript": the
> value before the increment or after. Even using a single compiler,
> within a single source file, you might get different results in
> different places - this can be caused by things like optimizations,
> machine register usage, etc.

But we're talking about a function call on the right-hand side. As
you said, it's a sequence point so all side-effects from the function
call *must* be complete.

I disagree with a ticking time bomb (not on these expressions anyway).
If it happens in one place, I'd dig deeper into it. It always seems
that if you don't, these things come back to haunt you at the
worst possible time (a demo, code freeze for beta, etc.).

Bill

------------------------------------




RE: Re: (unknown) - Don Starr - Aug 13 14:49:25 2008

> If what you say is true, then the following statements:
>
> i = 0;
> a[i++] = val;
>
> could also be compiled such that i is incremented first
> so that val is stored in location 1 rather than the desired
> location 0.

No. Let's dissect that expression:
* we have a binary operator '=' that takes two operands, the
right-hand side (RHS) and the left-hand side (LHS)
* the LHS and RHS can be evaluated IN ANY ORDER
* the LHS must evaluate to an lvalue ("someplace where something
can be stored")
* the RHS is a simple rvalue
* the LHS has the subexpression "a[i++]"
* the subexpression "a[i++]" has a subexpression "i++". The result
of that postfix operator is the value of the operand "i". That
subexpression also has a side-effect: AFTER (and "after" is
guaranteed by the language) the result is obtained, the value of
"i" is incremented. This side-effect will be "complete" sometime
between the last sequence point and the next.
* after the full expression is evaluated:
1. the "i-th element of array a" (where "i" here is the
value of "i" before the expression is evaluated) contains
"val"
2. "i" has a value one greater than it did before

In other words, given the two lines of code at the top, you are
GUARANTEED that a[0] will contain "val", and "i" will be equal to 1.

There's no ambiguity in the above. The "i++" does two things:
1. retrieve value of operand "i" for use as a subscript, and
2. some time before the next sequence point (the semicolon), but
AFTER retrieving the value in (1), increment the stored value
of "i"
Since "i" isn't used anywhere else, there's no problem. It doesn't
MATTER when "i" is incremented, as long as the pre-increment value
is used as the subscript - which it is.
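
A minimal sketch of the well-defined case. The function name store_and_advance and its parameters are invented here, and the array is assumed to have at least one element:

```c
#include <assert.h>

/* Stores val at a[0] using the a[i++] form and returns the new i. */
static int store_and_advance(int a[], int val)
{
    int i = 0;

    a[i++] = val;         /* i is modified once and read nowhere else */

    assert(a[0] == val);  /* the pre-increment value was the subscript */
    assert(i == 1);       /* the increment is complete after the ';' */
    return i;
}
```

Both asserts hold on any conforming compiler, because "i" appears only once in the expression.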

Now, if "i" appears ANYWHERE else in the expression, there's a
problem. For example:
a[i++] = i;

Since order of evaluation of the LHS and RHS is not guaranteed, and
the side-effect will be complete "sometime" between the last sequence
point and the next, you have no idea what value will be used for
"i" on the RHS. We can extend that to your original code form:
a[i++] = func(i);
You have no idea what value will be used for "i" on the RHS. Bill A.
says in a (separately addressed) follow-up that the sequence point
generated by the function call has something to do with it - I don't
believe it does.

>>> x = f() * g() + offset;
>>
>> What you seem to be expecting is this:
>> 1. evaluate f()
>> 2. evaluate g()
>> 3. multiply
>> 4. evaluate "offset"
>> 5. add
>> 6. evaluate "x"
>> 7. store
>
> No. What I expect is that lines 1, 2, and 4 be done in any order, but
> be done before anything else, followed by line 3, followed by line 5,
> then lines 6 and 7.

Unfortunately, you can't expect that in C. The language makes no such
guarantee. The only guarantee is that the product of f() and g() will
be added to "offset" and the sum stored in "x". That's it, period. The
"x" lvalue can be evaluated AT ANY TIME prior to the "store". There
is absolutely no guarantee of the order of evaluation of f(), g(),
"offset", or "x". There's NOTHING SPECIAL about the '=' operator that
would make the LHS evaluate before the RHS. The ONLY special things
about the '=' operator are a) the LHS must be an lvalue and b) it
associates (NOT ORDER OF EVALUATION!) right-to-left.


------------------------------------




RE: Re: (unknown) - Don Starr - Aug 13 15:41:31 2008

>But we're talking about a function call on the right-hand side. As
>you said, it's a sequence point so all side-effects from the function
>call *must* be complete.

Well, yes, all "side effects from the function call *must* be
complete" before the function is actually called. However, that
ONLY has to do with the function call's side effects, including
those caused by evaluating its arguments. It has NOTHING to do
with side effects in the rest of the expression, nor does it have
anything to do with WHEN other operands are evaluated relative to
the "function call operand".

Given the O.P.'s original form:

a[len++] = func( len );

All side effects incurred by _evaluating the function's arguments_
must be complete before the function is called. However, that only
affects evaluating that subexpression (the function call itself).
In this case, there are no side effects involved in the function
call or its arguments.

But, the expression on the _left-hand_ side of the '=' operator can
still be evaluated _before_ the right hand side. The sequence point
can only "hit" when the RHS is evaluated, and nothing says it MUST
be evaluated before the LHS.

A better example of the "function call sequence point" might be:

len = 0;
a[len] = func(len++);

Ignoring the UB for a moment...

All we know is that the pre-increment value of "len" is used as an
argument to func() and "len" is incremented (the side effect is
complete) before func() is called (the sequence point). This has
NOTHING to do with when "a[len]" is evaluated. "a[len]" can STILL be
evaluated before anything on the right-hand side of the '=' operator.
When a[len] is evaluated, len could be either 0 or 1 (well, since
it's UB, we can't even predict *that* much).

That's why it's UB - the RHS and LHS can be evaluated in any order,
causing the RHS's sequence point to be before OR AFTER the LHS is
evaluated.

Back to the form of the original problem expression:

a[len++] = func( len );

We only know that within the sub-expression on the right side of the
'=' operator, "func" and "len" are both evaluated, then a sequence
point, then "func" is called. This doesn't say anything about when
a[len++] is evaluated relative to func(len), or when len is incremented
relative to func(len). For the purposes of determining when "len" has
a "new value", the only sequence point that matters is the semicolon,
because that's the only one that's deterministic. Since len was both
used and modified before that semicolon sequence point: UB.

If we thought that the sequence point caused by the function call had
any effect on the expression as a whole (it doesn't), then we'd have
to say that "len" is incremented before the function call (since we'd
be thinking that all side effects in the entire expression have to be
complete before the sequence point). If that were true (it's not), then
a[0] would get the return value of func(1), and len would contain 1.
(NOTE THAT THIS IS EXACTLY THE "PROBLEM" BEHAVIOR DESCRIBED BY THE O.P.
IN HIS ORIGINAL POST. In all of his previous code, "len" was incremented
after it was used as both an argument and a subscript. In the "new
problem", "len" was incremented before being used as an argument.)

But, the sequence point only affects that sub-expression / operand.
It doesn't affect evaluation of the other operand (left-hand side
of the '=' operator). The sequence point for the function call only
hits when that operand is evaluated, and the left-hand side could
be evaluated FIRST. If we KNEW that the right-hand side would be
evaluated first, then the O.P.'s code would be "safe":
len = 0;
a[len++] = func(len);
* func() called with len=0
* return value of func() stored into a[0]
* "len" incremented to 1
However, and I can't stress this enough,
WE DON'T KNOW:
* WHICH OPERAND OF THE '=' OPERATOR IS EVALUATED FIRST (because
order of evaluation here is not specified by the C language)
* WHEN "len" WILL HAVE ITS NEW VALUE (apart from "before the ;")
* WHAT VALUE OF "len" WILL BE PASSED TO func()
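
For reference, a sketch of the split form suggested earlier in the thread, which removes all three unknowns. The body of CheckSum() below is a hypothetical stand-in (the O.P.'s real routine was never posted), append_checksum is an invented wrapper, and '\0' is used in place of the O.P.'s NUL macro:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the O.P.'s CheckSum(): a byte sum
 * reduced to 7 bits. The real routine may differ. */
static char CheckSum(const char *buf, int len)
{
    unsigned sum = 0;
    int k;

    for (k = 0; k < len; k++)
        sum += (unsigned char)buf[k];
    return (char)(sum & 0x7F);
}

/* Appends the checksum and a terminator; returns the new length.
 * The caller must leave room for two more bytes in p. */
static int append_checksum(char *p, int len)
{
    assert(p != NULL && len >= 0);

    p[len] = CheckSum(p, len);   /* len is only read in this statement */
    len++;                       /* and only modified in this one */
    p[len] = '\0';
    return len;
}
```

With the read and the increment in separate statements, the value passed to CheckSum() is always the length before the checksum byte is added.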

------------------------------------




RE: Re: (unknown) - Don Starr - Aug 13 15:56:21 2008

Here's (perhaps) a better illustration of when function calls, sequence
points, side effects, and order-of-evaluation do actually matter:

i = 2;
func(i++) == 1 && i == 3;

* Order of evaluation is guaranteed for the && operator.
* The function call creates a sequence point, at which time the side effect
of incrementing i is "complete".

In this case, the right-hand operand of the && operator is GUARANTEED to
evaluate "true":
* && operator has "order of evaluation", unlike most other operators
(INCLUDING '=' !!!!)
* "func(i++) == 1" is guaranteed to be evaluated before "i == 3"
* function call generates a sequence point
* at the sequence point, the "new value" of "i" (3) has been stored, since
any side effects of evaluating the function call's arguments must be
complete
* if (and only if) "func(2)" returns 1, the right-hand operand of the &&
operator is evaluated
* since the side effect of "i++" is guaranteed to be done because of the
function call sequence point, "i" is now equal to 3.
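
The walk-through above can be checked with a short sketch. Here func() is a hypothetical function that records its argument and returns 1, and sequenced_by_and is an invented wrapper:

```c
#include <assert.h>

static int seen_arg;   /* argument value observed inside func() */

/* Hypothetical func(): remembers its argument and returns 1. */
static int func(int n)
{
    seen_arg = n;
    return 1;
}

static int sequenced_by_and(void)
{
    int i = 2;
    /* && evaluates its left operand first, with a sequence point
     * before the right operand is evaluated. */
    int both = func(i++) == 1 && i == 3;

    assert(seen_arg == 2);   /* func() received the pre-increment value */
    assert(i == 3);          /* the increment finished before "i == 3" */
    return both;
}
```

Unlike "a[i++] = func(i)", this expression is well defined, so sequenced_by_and() returns 1 on any conforming compiler.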
________________________________

From: r...@yahoogroups.com [mailto:r...@yahoogroups.com] On
Behalf Of Don Starr
Sent: Wednesday, 13 August, 2008 12:41
To: r...@yahoogroups.com
Subject: RE: [rabbit-semi] Re: (unknown)

>But we're talking about a function call on the right-hand side. As
>you said, it's a sequence point so all side-effects from the function
>call *must* be complete.

Well, yes, all "side effects from the function call *must* be
complete" before the function is actually called. However, that
ONLY has to do with the function call's side effects, including
those caused by evaluating its arguments. It has NOTHING to do
with side effects in the rest of the expression, nor does it have
anything to do with WHEN other operands are evaluated relative to
the "function call operand".

Given the O.P.'s original form:

a[len++] = func( len );

All side effects incurred by _evaluating the function's arguments_
must be complete before the function is called. However, that only
affects evaluating that subexpression (the function call itself).
In this case, there are no side effects involved in the function
call or its arguments.

But, the expression on the _left-hand_ side of the '=' operator can
still be evaluated _before_ the right hand side. The sequence point
can only "hit" when the RHS is evaluated, and nothing says it MUST
be evaluated before the LHS.

A better example of the "function call sequence point" might be:

len = 0;
a[len] = func(len++);

Ignoring the UB for a moment...

All we know is that the pre-increment value of is used as an
argument to func() and is incremented (the side effect is
complete) before func() is called (the sequence point). This has
NOTHING to do with when "a[len]" is evaluated. "a[len]" can STILL be
evaluated before anything on the right-hand side of the '=' operator.
When a[len] is evaluated, len could be either 0 or 1 (well, since
it's UB, we can't even predict *that* much).

That's why it's UB - the RHS and LHS can be evaluated in any order,
causing the RHS's sequence point to be before OR AFTER the LHS is
evaluated.

Back to the form of the original problem expression:

a[len++] = func( len );

We only know that within the sub-expression on the right side of the
'=' operator, and are both evaluated, then a sequence
point, then is called. This doesn't say anything about when
a[len++] is evaluated relative to func(len), or when len is incremented
relative to func(len). For the purposes of determining when "len" has
a "new value", the only sequence point that matters is the semicolon,
because that's the only one that's deterministic. Since len was both
used and modified before that semicolon sequence point: UB.

If we thought that the sequence point caused by the function call had
any effect on the expression as a whole (it doesn't), then we'd have
to say that is incremented before the function call (since we'd
be thinking that all side effects in the entire expression have to be
complete before the sequence point). If that were true (it's not), then
a[0] would get the return value of func(1), and len would contain 1.
(NOTE THAT THIS IS EXACTLY THE "PROBLEM" BEHAVIOR DESCRIBED BY THE O.P.
IN HIS ORIGINAL POST. In all of his previous code, len was incremented
after it was used as both an argument and a subscript. In the "new
problem", len was incremented before being used as an argument.)

But, the sequence point only affects that sub-expression / operand.
It doesn't affect evaluation of the other operand (left-hand side
of the '=' operator). The sequence point for the function call only
hits when that operand is evaluated, and the left-hand side could
be evaluated FIRST. If we KNEW that the right-hand side would be
evaluated first, then the O.P.'s code would be "safe":
len = 0;
a[len++] = func(len);
* func() called with len=0
* return value of func() stored into a[0]
* len incremented to 1
However, and I can't stress this enough,
WE DON'T KNOW:
* WHICH OPERAND OF THE '=' OPERATOR IS EVALUATED FIRST (because
order of evaluation here is not specified by the C language)
* WHEN len WILL HAVE ITS NEW VALUE (apart from "before the ;")
* WHAT VALUE OF len WILL BE PASSED TO func()
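
One way to remove all three unknowns is to keep the read and the
modification of len in separate statements, as suggested earlier in the
thread. A minimal sketch follows. The O.P.'s CheckSum() was never shown,
so check_sum() here is a hypothetical 8-bit additive checksum, and
append_check_sum() is an illustrative helper name, not anything from
Dynamic C.

```c
#include <string.h>

/* Hypothetical stand-in for the O.P.'s CheckSum(): 8-bit additive sum. */
static unsigned char check_sum(const char *buf, int len)
{
    unsigned char sum = 0;
    int i;

    for (i = 0; i < len; i++)
        sum = (unsigned char)(sum + (unsigned char)buf[i]);
    return sum;
}

/* Appends the checksum byte and a terminator; returns the new length.
   buf must have room for at least strlen(buf) + 2 bytes. */
static int append_check_sum(char *buf)
{
    int len = (int)strlen(buf);

    buf[len] = (char)check_sum(buf, len);  /* len is only read here */
    len++;                                 /* modified in its own statement */
    buf[len] = '\0';
    return len;
}
```

Each ';' is a sequence point, so every compiler has to pass the
pre-increment len to check_sum() and store the result at that same index.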

------------------------------------




Re: (unknown) - Bill_CT - Aug 14 11:57:00 2008

--- In r...@yahoogroups.com, Steve Trigero wrote:
>
> Don,
>
> Since I've seen it with my own eyes, you must be right.
> The same statement can be compiled differently in
> different parts of the code. But it still makes no sense
> to me. You might as well tell me that coding when the
> moon is full will get different results than when there is
> no moon.
>
> On page 21 of K&R second edition, referring to
> statement "nl = nw = nc = 0;", it says "This is not a
> special case, but a consequence of the FACT that
> an assignment is an expression with a value and
> assignments associate from right to left."
> (emphasis mine).
>
> If assignments associate from right to left, the 'len' (in my
> example below) should not be incremented until after
> the left hand side of the equal sign is evaluated. And
> the left side of the equal sign should not be evaluated
> until after the right side is evaluated.
>
> As to section 2.12 in K&R that you identified, I read that
> differently than you. In the examples they cite, the question
> they appear to address is not which side of the equal sign
> gets evaluated first, but what is the order of evaluation of
> the variables on the RIGHT side of the equal sign. In their
> example of x = f() + g();, the question is which function is
> evaluated first, f or g? Not whether f gets assigned to x before
> g gets evaluated. In their example of a[i] = i++;, the question
> is, after evaluating the right side of the equal sign, what value
> of i is used as the index. Both of those are valid issues. But
> none of their examples suggests, or implies, that at any time
> could a variable on the left side of an equal sign be evaluated
> before the right side is evaluated. Otherwise, section 2.12
> contradicts what was stated in section 1.5.4, page 21.
>
> I would also draw your attention to something else said in section 2.12
> of K&R. Referring to the order of evaluation, and how the order is
> not specified, it says "different results" can occur with "different
> compilers." Notice that it is different compilers that can produce
> different results. Not the same line of code in different places in a
> program using the same compiler getting different results. This is
> what bothers me the most.
>
> But, all that said, K&R hits me upside the head with the reminder that
> "writing code that depends on order of evaluation is a bad programming
> practice..." While this is true as a guideline, it seems to beg the
> question. A programmer has to depend on some level of defined
> evaluation or programming is nothing more than throwing darts.
>
> x = f() * g() + offset;
>
> If I can't depend on the compiler evaluating the complete right side,
> with the multiplication occurring before the addition, before
> assigning to x, I'm peeing into the wind.

Operator precedence is guaranteed. However with:

expr1 + expr2 + expr3

the compiler can evaluate these expressions in any order.

I've seen this fail

unsigned int word = fgetc(file) | (fgetc(file) << 8);

because the right side of the | was evaluated first. Many compilers
will evaluate the "harder" side first, including the Softools compiler.
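
A sketch of one way to make the byte order deterministic: read each byte
in its own statement so the two fgetc() calls are separated by a sequence
point. The function name read_le16() and the -1 error return are
illustrative choices, not from any particular library.

```c
#include <stdio.h>

/* Reads a 16-bit little-endian value from file.
   Returns the value (0..65535), or -1 on EOF or read error. */
static long read_le16(FILE *file)
{
    int lo, hi;

    lo = fgetc(file);              /* low byte is always read first */
    if (lo == EOF)
        return -1;

    hi = fgetc(file);              /* high byte is always read second */
    if (hi == EOF)
        return -1;

    return (long)lo | ((long)hi << 8);
}
```

With the reads split up, it no longer matters which operand of '|' the
compiler decides to evaluate first, because neither operand has a side
effect.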

> Now I have to go and grep my files looking for time bombs.

As Don pointed out, only where the left and right-hand sides update
the *same* variable(s). I'm going to see about adding this as a
warning to the Softools compiler - it would be nice if it's not hard
to do.

Bill
------------------------------------




Re: (unknown) - Bill_CT - Aug 14 12:00:13 2008

--- In r...@yahoogroups.com, "Bill_CT" wrote:
>
> --- In r...@yahoogroups.com, "Don Starr" wrote:
> >
> > > Given the sample code below, what value of 'len' should be pass
> > > to CheckSum()?
> > >
> > > int len;
> > > char p[20];
> > >
> > > strcpy( p, "Hello" );
> > > len = strlen(p);
> > > p[len++] = CheckSum( p, len );
> > > p[len] = NUL;
> > >
> > > I have painfully determined that DC will sometimes, and I stress
> > > "Sometimes", increment len before calling CheckSum and sometimes
> > > after. It's not even consistent!
> > >
> >
> > Undefined behavior, I think.
> >
> > In:
> > p[len++] = CheckSum( p, len );
> >
> > you're both modifying and accessing in the same expression,
> > without an intervening sequence point. The call to CheckSum() is
> > a sequence point, but there's no guarantee that the right-hand
> > side of the expression will be evaluated before the lvalue on the
> > left-hand side is calculated.
> >
> > That line of code should likely be split into two:
> > p[len] = CheckSum( p, len );
> > len++;
>
> It's true that len is being modified and accessed in the same
> expression, but the function call must complete before the assignment
> can be made. Since len can't change on the right-hand side, it can't
> be unsafe.
>
> The danger would be if someone came along and said let's use a macro
> CheckSum to save the call/return overhead. Then there's a problem.

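Replying to my own post - I was wrong about function call sequence
points. Bottom line - never update a variable on one side of an
operator and use or update it on the other side (not just
assignments!). The exceptions are &&, ||, ?: and the comma operator,
which do put a sequence point after their first operand.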

a = x-- + ++x;

is also bad (and a dumb example).

Bill

------------------------------------


