C18 Compiler again

Reply by Meindert Sprang ●June 8, 2010

"John Temples" <usenet@xargs-spam.com> wrote in message
news:slrni0rit7.tc1.usenet@xargs-spam.com...
> On 2010-06-07, Meindert Sprang <ms@NOJUNKcustomORSPAMware.nl> wrote:
> > Apparently the optimizer of C18 is not that good. For
> > instance:  LATF = addr >> 16; where addr is an uint32, is compiled into a
> > loop where 4 registers really get shifted 16 times in a loop.
>
> Here's what Hi-Tech's PIC18 compiler does:
>
>    853                           ;t.c: 59: LATF = addr >> 16;
>    854  00FFFA  C0FE  FF8E          movff   _addr+2,3982    ;volatile

And that's how I expected it to be!

Meindert

Reply by Joe Chisolm ●June 8, 2010

On Tue, 08 Jun 2010 11:24:58 +0200, Meindert Sprang wrote:

> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
> news:hujj28$bn5$1@speranza.aioe.org...
>> It would be informative to know what sort of "helper routines" the
>> compiler calls on.  E.g., it might (inelegantly) treat this as "CALL
>> SHIFT_LONG_RIGHT, repeat" -- in which case the 4 temp access is the
>> canned representation of *any* "long int".
> 
> This is the code that does the shift:
> 
>  0FCC8    0E10     MOVLW 0x10
>  0FCCA    90D8     BCF 0xfd8, 0, ACCESS 
>  0FCCC    3203     RRCF 0x3, F, ACCESS
>  0FCCE    3202     RRCF 0x2, F, ACCESS 
>  0FCD0    3201     RRCF 0x1, F, ACCESS 
>  0FCD2    3200     RRCF 0, F, ACCESS
>  0FCD4    06E8     DECF 0xfe8, F, ACCESS 
>  0FCD6    E1F9     BNZ 0xfcca
> 
> The loop is executed 16 times (>>16) and 4 locations are shifted through
> the carry bit, if I understand this correctly.... yuck!
> 
> Meindert

What version of C18 are you using and what is your target device?




-- 
Joe Chisolm
Marble Falls, Tx.

Reply by Grant Edwards ●June 8, 2010

On 2010-06-08, David Brown <david@westcontrol.removethisbit.com> wrote:

> Some compilers will use shifts, some will use byte or word movements.
>
> On the ARM, a compiler will often use shifts because shifts (especially 
> by constants) are very cheap on the ARM architecture,

When combined with another arithmetic operation, they're free!

> while unaligned and non-32-bit memory accesses may be expensive or
> illegal (depending on the ARM variant).
>
> A quick test with avr-gcc shows that it uses byte register movements 
> rather than shifts, although it's not optimal for 32-bit values (it is 
> fine for 16-bit values, which are much more common in an 8-bit world). 
> For your example below of "((ul & 0xFFFFFF) >> 8)" it is close to perfect.

IIRC gcc for both msp430 and H300 does byte/word operations instead of
shifts as well.

>>          ldr     r0, [r3, #0]
>>          mov     r0, r0, asl #8
>>          mov     r0, r0, lsr #16
>>
>>> If it recognizes the last as wanting just the middle word then that
>>> would be impressive.
>>
>> Recognizing the last two as wanting just the middle word is moot because
>> that 16-bit word is misaligned and can't be accessed using a 16-bit load
>> instruction.
>
> That's very nice code generation - faster (on an ARM anyway) than using 
> masking.

Though it does look a bit odd at first glance. ;)

-- 
Grant Edwards               grant.b.edwards        Yow! You can't hurt me!!
                                  at               I have an ASSUMABLE
                              gmail.com            MORTGAGE!!
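
The two-shift sequence quoted above (asl #8 followed by lsr #16) can look odd at first, so here is a minimal C sketch of why it matches both source expressions for a 32-bit value. The function names are made up for illustration and uint32_t stands in for the thread's "ul"; this models the arithmetic only, not what any particular gcc version emits.

```c
#include <stdint.h>

/* First form from the thread: mask off the top byte, then shift. */
uint32_t middle_mask_then_shift(uint32_t ul)
{
    return (ul & 0xFFFFFFu) >> 8;
}

/* Second form from the thread: shift, then mask to 16 bits. */
uint32_t middle_shift_then_mask(uint32_t ul)
{
    return (ul >> 8) & 0xFFFFu;
}

/* What the quoted ARM sequence computes: the left shift by 8 discards
   the top byte, the logical right shift by 16 discards the bottom byte
   and zero-fills, leaving bits 8..23 in the low half. */
uint32_t middle_two_shifts(uint32_t ul)
{
    return (uint32_t)(ul << 8) >> 16;
}
```

All three return bits 8..23 of the input, so no separate AND instruction is needed once the shifts are in place.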

Reply by Grant Edwards ●June 8, 2010

On 2010-06-08, Meindert Sprang <ms@NOJUNKcustomORSPAMware.nl> wrote:
> "D Yuniskis" <not.going.to.be@seen.com> wrote in message
> news:hujj28$bn5$1@speranza.aioe.org...
>> It would be informative to know what sort of "helper routines"
>> the compiler calls on.  E.g., it might (inelegantly) treat this
>> as "CALL SHIFT_LONG_RIGHT, repeat" -- in which case the
>> 4 temp access is the canned representation of *any* "long int".
>
> This is the code that does the shift:
>
>  0FCC8    0E10     MOVLW 0x10
>  0FCCA    90D8     BCF 0xfd8, 0, ACCESS
>  0FCCC    3203     RRCF 0x3, F, ACCESS
>  0FCCE    3202     RRCF 0x2, F, ACCESS
>  0FCD0    3201     RRCF 0x1, F, ACCESS
>  0FCD2    3200     RRCF 0, F, ACCESS
>  0FCD4    06E8     DECF 0xfe8, F, ACCESS
>  0FCD6    E1F9     BNZ 0xfcca
>
> The loop is executed 16 times (>>16) and 4 locations are shifted through the
> carry bit, if I understand this correctly.... yuck!

In my experience, "yuck!" is what anybody trying to use C on a PIC
ought to expect.  [IMO, "yuck!" is what you get using asm on a PIC as
well, but that's probably a little more subjective.]

-- 
Grant Edwards               grant.b.edwards        Yow! Loni Anderson's hair
                                  at               should be LEGALIZED!!
                              gmail.com
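
To make the cost of the quoted loop concrete, here is a rough host-side model of it in C. It assumes, as the listing suggests, that the 32-bit temporary lives little-endian in locations 0 to 3, that 0xfd8 is STATUS (bit 0 = carry) and that 0xfe8 is WREG. The function name and the rotate counter are invented for this sketch and are not anything C18 produces.

```c
#include <stdint.h>

/* Host-side model of the quoted loop. Returns the byte that ends up in
   location 0 and reports how many single-bit rotates were executed. */
uint8_t shift_loop_model(uint32_t addr, unsigned *rotates)
{
    uint8_t loc[4];
    uint8_t wreg = 0x10;                    /* MOVLW 0x10 */
    int i;

    for (i = 0; i < 4; ++i)                 /* temp at 0..3, LSB first */
        loc[i] = (uint8_t)(addr >> (8 * i));

    *rotates = 0;
    do {
        unsigned carry = 0;                 /* BCF STATUS, C */
        for (i = 3; i >= 0; --i) {          /* RRCF 0x3 down to RRCF 0 */
            unsigned out = loc[i] & 1u;
            loc[i] = (uint8_t)((loc[i] >> 1) | (carry << 7));
            carry = out;
            ++*rotates;
        }
        --wreg;                             /* DECF WREG */
    } while (wreg != 0);                    /* BNZ back to the BCF */

    return loc[0];                          /* same as (uint8_t)(addr >> 16) */
}
```

The model performs 64 rotate steps to produce a byte that was already sitting in location 2 before the loop started, which is the point of the complaint.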

Reply by George Neuner ●June 8, 2010

On Tue, 08 Jun 2010 10:03:58 +0200, David Brown
<david@westcontrol.removethisbit.com> wrote:

>On 08/06/2010 04:47, Grant Edwards wrote:
>> On 2010-06-08, George Neuner <gneuner2@comcast.net> wrote:
>>> On Mon, 7 Jun 2010 20:18:35 +0000 (UTC), Grant Edwards
>>> <invalid@invalid.invalid> wrote:
>>>
>>>> On 2010-06-07, George Neuner <gneuner2@comcast.net> wrote:
>>>>
>>>>> I've been programming since 1977 and I have never seen any compiler
>>>>> turn a long word shift (and/or mask) into a corresponding short word
>>>>> or byte access.  Every compiler I have ever worked with would perform
>>>>> the shift.
>>>>
>>>> Really?
>>>>
>>>> I've seen quite a few compilers do that.  For example, gcc for ARM
>>>> does:
>>>
>
>Some compilers will use shifts, some will use byte or word movements.
>
>On the ARM, a compiler will often use shifts because shifts (especially 
>by constants) are very cheap on the ARM architecture, while unaligned 
>and non-32-bit memory accesses may be expensive or illegal (depending on 
>the ARM variant).
>
>A quick test with avr-gcc shows that it uses byte register movements 
>rather than shifts, although it's not optimal for 32-bit values (it is 
>fine for 16-bit values, which are much more common in an 8-bit world). 
>For your example below of "((ul & 0xFFFFFF) >> 8)" it is close to perfect.
>
>>> Interesting.  But now that I think about it, I almost never use shift with a
>>> constant count - it's almost always a computed shift - and even when
>>> the shift is constant, the value is often in a variable anyway due to
>>> surrounding processing.
>>>
>>> - What version of GCC is it?
>>
>> 4.4.3
>>
>>> - What does it do if the shift count is a variable?
>>
>> It uses a shift instruction.  There's not really anything else it could
>> do with a variable shift count.
>>
>>> - What does it do for ((ul & 0xFFFFFF) >> 8)
>>
>>          ldr     r0, [r3, #0]
>>          mov     r0, r0, asl #8
>>          mov     r0, r0, lsr #16
>>
>>> or ((ul >> 8) & 0xFFFF)?
>>
>>          ldr     r0, [r3, #0]
>>          mov     r0, r0, asl #8
>>          mov     r0, r0, lsr #16
>>
>>> If it recognizes the last as wanting just the middle word then that
>>> would be impressive.
>>
>> Recognizing the last two as wanting just the middle word is moot because
>> that 16-bit word is misaligned and can't be accessed using a 16-bit load
>> instruction.
>>
>
>That's very nice code generation - faster (on an ARM anyway) than using 
>masking.


Yes.  It seems that recent versions of GCC do some interesting shift
optimizations ... which is revising upward my opinion of GCC (which
I've only ever considered an adequate compiler).


I've worked with a number of older versions of GCC and with Intel,
Microsoft and Sun compilers over the years.  What I would normally
expect to see from a good compiler is:
  - if the source value and the shift count can be statically
    determined, I expect the compiler to compute the result 
    and inline it,
  - otherwise I expect to see the shift essentially as coded.

Good compilers can often statically determine the values through value
tracking and/or constant propagation, so under high optimization it
isn't unusual to see something like

  unsigned short get_middle( unsigned long ul )
  {
     return ((ul >> 8) & 0xFFFF);
  }

  :
  unsigned short bleh;
  unsigned long blah = 0xBABE;
  :
  blah |= (0xCAFEUL << 16);   /* UL keeps the shift in unsigned long */
  bleh = get_middle( blah );
  :

reduce the get_middle() call to a short constant load of 0xFEBA.

But until I saw GCC 4.4 do it (partly: see my other post), I had never
seen a compiler change a shift into a word (or byte) load from memory.
I have, on occasion, seen shifts changed into register bit field
extraction on chips that have such instructions, but never before into
partial loads from memory.

Seems like I have to start paying more attention to GCC.
George
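
For anyone who wants to try the constant-folding case described above, here is a compilable sketch of the same fragment. The wrapper function folded_example() and the fixed-width types are additions made for this sketch; get_middle() and the constants come from the post. Whether a given compiler actually reduces the call to a constant load depends on the toolchain and optimization level, so the only thing the code itself demonstrates is the expected value.

```c
#include <stdint.h>

/* Extract bits 8..23 of a 32-bit value, as in the post. */
static inline uint16_t get_middle(uint32_t ul)
{
    return (uint16_t)((ul >> 8) & 0xFFFFu);
}

/* Every input is a compile-time constant, so value tracking and
   constant propagation can in principle replace the whole body with
   "return 0xFEBA". */
uint16_t folded_example(void)
{
    uint32_t blah = 0xBABEu;

    blah |= (uint32_t)0xCAFEu << 16;        /* blah is now 0xCAFEBABE */
    return get_middle(blah);
}
```

One way to check what a particular compiler does is to look at the generated assembly, for example with gcc -O2 -S, and see whether folded_example() still contains any shift.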

Reply by Meindert Sprang ●June 9, 2010

"Joe Chisolm" <jchisolm6@earthlink.net> wrote in message
news:5qidnUNFqvYo25PRnZ2dnUVZ_hidnZ2d@earthlink.com...
> On Tue, 08 Jun 2010 11:24:58 +0200, Meindert Sprang wrote:
>
> > "D Yuniskis" <not.going.to.be@seen.com> wrote in message
> > news:hujj28$bn5$1@speranza.aioe.org...
> >> It would be informative to know what sort of "helper routines" the
> >> compiler calls on.  E.g., it might (inelegantly) treat this as "CALL
> >> SHIFT_LONG_RIGHT, repeat" -- in which case the 4 temp access is the
> >> canned representation of *any* "long int".
> >
> > This is the code that does the shift:
> >
> >  0FCC8    0E10     MOVLW 0x10
> >  0FCCA    90D8     BCF 0xfd8, 0, ACCESS
> > >  0FCCC    3203     RRCF 0x3, F, ACCESS
> >  0FCCE    3202     RRCF 0x2, F, ACCESS
> >  0FCD0    3201     RRCF 0x1, F, ACCESS
> >  0FCD2    3200     RRCF 0, F, ACCESS
> >  0FCD4    06E8     DECF 0xfe8, F, ACCESS
> >  0FCD6    E1F9     BNZ 0xfcca
> >
> > The loop is executed 16 times (>>16) and 4 locations are shifted through
> > the carry bit, if I understand this correctly.... yuck!
> >
> > Meindert
>
> What version of C18 are you using and what is your target device?

The latest (V3.35), just downloaded from the Microchip website and the target
is an 18F8720.

Meindert

Reply by Meindert Sprang ●June 9, 2010

"Grant Edwards" <invalid@invalid.invalid> wrote in message
news:huljbj$aft$2@reader1.panix.com...
> In my experience, "yuck!" is what anybody trying to use C on a PIC
> ought to expect.  [IMO, "yuck!" is what you get using asm on a PIC as
> well, but that's probably a little more subjective.]

"Yuck" is what you get when using a PIC at all.....
Whoever designed this architecture should be crucified!!

Meindert

Reply by hamilton ●June 9, 2010

On 6/9/2010 12:51 AM, Meindert Sprang wrote:
> "Grant Edwards" <invalid@invalid.invalid> wrote in message
> news:huljbj$aft$2@reader1.panix.com...
>> In my experience, "yuck!" is what anybody trying to use C on a PIC
>> ought to expect.  [IMO, "yuck!" is what you get using asm on a PIC as
>> well, but that's probably a little more subjective.]
>
> "Yuck" is what you get when using a PIC at all.....
> Whoever designed this architecture should be crucified!!

Yes, and they are laughing all the way to the bank.

Not bad for a "Yuck" design.

hamilton


>
> Meindert
>
>

Reply by Richard Swaby ●June 9, 2010

On Mon, 7 Jun 2010 11:17:34 +0200, "Meindert Sprang"
<ms@NOJUNKcustomORSPAMware.nl> wrote:

>Unbelievable.....
>
>I'm playing around with the Microchip C18 compiler after a hair-splitting
>experience with CCS. Apparently the optimizer of C18 is not that good. For
>instance:  LATF = addr >> 16; where addr is an uint32, is compiled into a
>loop where 4 registers really get shifted 16 times in a loop. Any decent
>compiler should recognise that a shift by 16, stored to an 8 bit port could
>easily be done by simply accessing the 3rd byte.... sheesh....
>
>Meindert
>

Here's the assembler that CC8E generates from:

void main(void)
{
	uns32 addr;
	LATF = addr >> 16;
}	

; CC8E Version 1.3D, Copyright (c) B Knudsen Data
; C compiler for the PIC18 microcontrollers
; ************   9. Jun 2010  14:47  *************

; NOTE: demo edition, assembly is NOT complete.


	processor  PIC18F6310
	radix  DEC

LATF        EQU   0xF8E
addr        EQU   0x00

	GOTO main

  ; FILE cc8e_test.c
			;void main(void)
			;{
main
			;	uns32 addr;
			;	LATF = addr >> 16;
	MOVF  addr+2,W,0
	MOVWF LATF,0
			;}	
	SLEEP
	RESET

	END


; *** KEY INFO ***

; 0x000004    4 word(s)  0 % : main

; RAM usage: 4 bytes (4 local), 764 bytes free
; Maximum call level: 0
; Total of 6 code words (0 %)


Even simpler, the following generates the same assembler code.

void main(void)
{
	uns32 addr;
	LATF = addr.midH8;
}

With CC8E you can easily address individual bytes and bits within
larger variables (see above).


Richard
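
The .midH8 accessor above is a CC8E extension. On a compiler without it, one common way to spell out "just take the third byte" is a union, sketched below. The type and function names are hypothetical, and the code assumes a little-endian layout (least significant byte at the lowest address), which is what addr+2 in the CC8E listing implies. Whether C18 or any other compiler turns this into a single move has not been checked here; the listing file is the place to confirm it.

```c
#include <stdint.h>

/* Overlay a 32-bit value with its four bytes. On a little-endian
   target bytes[0] is the least significant byte. */
typedef union {
    uint32_t value;
    uint8_t  bytes[4];
} u32_bytes;

/* Return bits 16..23 without writing a shift in the source. */
uint8_t bits_16_to_23(uint32_t addr)
{
    u32_bytes u;

    u.value = addr;
    return u.bytes[2];      /* byte offset 2, little-endian assumption */
}
```

On the PIC this would be used as LATF = bits_16_to_23(addr); or, with the union declared at the point of use, simply LATF = u.bytes[2];. On a big-endian target the index would have to change, which is the price of bypassing the shift.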

Reply by D Yuniskis ●June 9, 2010

Hi Meindert,

Meindert Sprang wrote:
> "Grant Edwards" <invalid@invalid.invalid> wrote in message
> news:huljbj$aft$2@reader1.panix.com...
>> In my experience, "yuck!" is what anybody trying to use C on a PIC
>> ought to expect.  [IMO, "yuck!" is what you get using asm on a PIC as
>> well, but that's probably a little more subjective.]
> 
> "Yuck" is what you get when using a PIC at all.....
> Whoever designed this architecture should be crucified!!

Had you seen the *original* PICs (General Instrument) *and*
compared them to what was available from other vendors at
the time, you would have found it amusing:
   "Is this a joke?  You know, one of those April Fool's Day
   bogus advertisements?"
(I had a similar reaction when Motogorilla later introduced
their *one* bit "ICU")

I think the original PICs had 1K of CODE and maybe 32 bytes
of "RAM" (registers) -- Harvard Architecture.  GI was heavy
into making cable converter boxes back then.  I think you'd
be hard pressed to make a four function *calculator* with
one of those!!  :-/

A shame, actually, that it survived where many other "better"
designs slipped by the wayside...

[I think I need to go rummage through old databooks to see what
I've culled over the years]
