mixing C and assembly | page 12

Reply by Chris H ● April 27, 2008

In message <fv1ut9$a74$02$1@news.t-online.com>, Hans-Bernhard Bröker 
<HBBroeker@t-online.de> writes
>Walter Banks wrote:
>> Hans-Bernhard Bröker wrote:
>
>>> All of that is correct, but beside the point.  For *every* piece of C
>>> code anyone can possibly write, in any C compilers, there's assembler
>>> code that ends up as the exact same machine code.  The same is generally
>>> not true for the opposite direction.  So compilers can't produce faster
>>> code than assemblers.
>
>> Compilers can produce some machine code that is exceedingly difficult
>> to write and maintain in asm.
>
>Huh?  Is something wrong with my writing or with your reading?  Where 
>in the above did you see me talking about maintainability or 
>difficulty? The issue at hand is _speed_ and _size_.  No more, no less.

In which case you lose... I can read the C. I can't read the ASM, so I 
won't be able to see that what you have done is the same as the C or 
even correct.... :-)

The whole point is that the C can be as fast and as small as the ASM but 
MUCH easier to read, debug and maintain. Certainly far faster to write.

(BTW I do enjoy writing in asm but that is not the point)

Also, compilers can do some optimisations that humans find difficult 
to do. Some optimisations involve the linker, not just the compiler, so I 
am told by a compiler writer (no, it was not Walter).

So in SOME cases an experienced asm writer MIGHT be able to write smaller, 
faster code than the compiler, but certainly NOT in the same time frame. 
Also, that particular experienced ASM programmer can probably only do 
that for one or two MCUs and not for all types of program.

-- 
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills  Staffs  England     /\/\/\/\/
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Reply by CBFalconer ● April 27, 2008

David Brown wrote:
> 
... snip ...
> 
> A quick test on avr-gcc 4.2.2, using 16-bit and 8-bit ints rather
> than 32-bit and 16-bit (since it's an 8-bit cpu) reveals that
> avr-gcc is smart enough to do a 8-bit x 8-bit -> 16-bit multiply
> as desired.  It's a little harder to see exactly what is
> happening for bigger numbers and for division, since these use
> library calls - certainly the compiler will generalise some of
> these functions.  But for the very common case of the multiply
> like this, you get optimal code.

Defining 'optimal' is a varying target.  Among others, see Knuth. 
In particular, in the past I have compromised on an 8 * 16 -> 24
bit heart, two of which, with an addition, produced a 16 * 16 -> 32
multiplication.  This had, on the machine of interest (an 8080),
significant advantages, i.e. about a 50% decrease in multiplication
times.  Other games are available at the compile stage where one
operand is constant, especially those where the multiplier consists
of some solid string of 1 bits.
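
For readers who want to see the shape of that decomposition, here is a minimal C sketch. It is not the original 8080 routine (that was assembly and is not shown in this thread); the function names are made up for illustration. It builds a 16 * 16 -> 32 multiply from two 8 * 16 -> 24 partial products, and also shows the constant-multiplier trick for a solid run of 1 bits.

```c
#include <stdint.h>

/* 8 x 16 -> 24-bit partial product (held in a 32-bit variable).
   On an 8-bit CPU this would be the hand-tuned core routine. */
static uint32_t mul8x16(uint8_t a, uint16_t b)
{
    return (uint32_t)a * b;   /* at most 255 * 65535, which fits in 24 bits */
}

/* 16 x 16 -> 32: split x into bytes, multiply each by y, shift and add. */
uint32_t mul16x16(uint16_t x, uint16_t y)
{
    uint32_t lo = mul8x16((uint8_t)(x & 0xFFu), y);
    uint32_t hi = mul8x16((uint8_t)(x >> 8), y);
    return (hi << 8) + lo;
}

/* Multiplier that is a solid run of 1 bits: bits m..n-1 set,
   with 0 <= m < n <= 16.
   x * (2^n - 2^m) == (x << n) - (x << m): two shifts and one subtract. */
uint32_t mul_by_run_of_ones(uint16_t x, unsigned m, unsigned n)
{
    return ((uint32_t)x << n) - ((uint32_t)x << m);
}
```

For example, multiplying by 60 (binary 111100, bits 2..5 set) becomes `mul_by_run_of_ones(x, 2, 6)`. Whether either form actually beats a general multiply depends on the target and would need measuring.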

-- 
 [mail]: Chuck F (cbfalconer at maineline dot net) 
 [page]: <http://cbfalconer.home.att.net>
            Try the download section.

** Posted from http://www.teranews.com **

Reply by CBFalconer ● April 27, 2008

Walter Banks wrote:
> CBFalconer wrote:
> 
... snip ...
>>
>> Well, that looks impressive, but you must be losing something.
>> You must be doing something illegal and non-understandable (to a C
>> programmer) with one or more of indentation, braces placement,
>> illegal statements (a call to foo should never enter bar).  I see
>> no reason for bar to exit while foo falls through.
> 
> I should have used fixed point type to make the listing fragment
> clearer. This is the source used in the example.
> 
> void bar (void);
> 
> void foo (void)
>   {
>      NOP();
>      bar();
>   }
> 
> void bar (void)
>   {
>      NOP();
>   }
> 
> void main (void)
>   {
>     foo();
>     bar();
>   }

Well, that executes foo (and thus bar), followed by bar.  I see no
savings there from fall-thru.  See my message of Sat. 11:13 am EDT
-0400.


Reply by CBFalconer ● April 27, 2008

David Brown wrote:
> 
... snip ...
> 
> That's just tail call elimination (changing a "call X; ret" into
> a "jmp X"), which is a standard optimisation technique (some
> assemblers will do that for you).
> 
> A better example would be:
> 
> WriteSpace:
>         ld a, #' '
> WriteChar:
>         st a, outputCharacter
>         ret
> 
> with C code:
> 
> extern volatile char outputCharacter;
> void WriteChar(char c) {
>         outputCharacter = c;
> }
> void WriteSpace(void) {
>         WriteChar(' ');
> }

But that doesn't do anything, because normal C executes a return on
the closing brace.  Am I missing something?


Reply by Walter Banks ● April 27, 2008

Hans-Bernhard Bröker wrote:

> Walter Banks wrote:
> > Hans-Bernhard Bröker wrote:
>
> > These are sequences that are data or address specific that are likely
> > to change or need to be checked each time the code is assembled.
>
> That's why the prudent assembly programmer would secure such tricks with
> assembly-time assertions.  I.e. make the assumptions explicit, and make
> sure that the code fails to translate if any of them is no longer true.

It is this type of check that is already embedded in C compilers.
Programming in asm is an exercise in both application programming
and implementation. In C the focus is on application algorithms
with an implementation outline.
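
As an illustration of making such assumptions explicit on the C side, here is a small sketch. It assumes a C11 compiler (`static_assert` postdates this thread), and the table and function names are hypothetical. The point is the same as with assembly-time assertions: translation fails if an assumption behind a trick stops being true.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>
#include <stdint.h>

/* Hypothetical lookup table that a speed trick indexes with a masked byte. */
#define TABLE_SIZE 16u
static const uint8_t table[TABLE_SIZE] = {
    0, 1, 4, 9, 16, 25, 36, 49, 64, 81, 100, 121, 144, 169, 196, 225
};

/* The masking below is only valid for a power-of-two table size. */
static_assert((TABLE_SIZE & (TABLE_SIZE - 1u)) == 0u,
              "TABLE_SIZE must be a power of two");
static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");

uint8_t square_lookup(uint8_t i)
{
    return table[i & (TABLE_SIZE - 1u)];   /* wraps instead of range-checking */
}
```

Changing `TABLE_SIZE` to 12 would stop the build at the first assertion rather than silently indexing the wrong entries.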

> > The whole reason for HLL is to aid in making application code easier
> > to create.
>
> Agreed.  But you're still missing the point under discussion.

I don't think so. Most of what I have been saying is use the correct
tool for the job. This is not an asm vs C issue. The importance
of the work we did that created the white paper is proof that
C did not have to be at a performance disadvantage to asm.
That said, let's look at the other issues and see where C has
an advantage.

We are increasingly seeing ISAs that were designed specifically
for machine generated code. Our focus has always been on
making the code generation process easier.

Regards

--
Walter Banks
Byte Craft Limited
Tel. (519) 888-6911
http://www.bytecraft.com
walter@bytecraft.com

Reply by Robert Adsett ● April 27, 2008

In article <6fmdnRynGbho0onVRVnyjAA@lyse.net>, David Brown says...
>
> A quick test on avr-gcc 4.2.2, using 16-bit and 8-bit ints rather than 
> 32-bit and 16-bit (since it's an 8-bit cpu) reveals that avr-gcc is 
> smart enough to do a 8-bit x 8-bit -> 16-bit multiply as desired.  

So at least some compilers do so.  Thanks. 

Robert

Reply by Walter Banks ● April 27, 2008

CBFalconer wrote:

> Walter Banks wrote:
> > CBFalconer wrote:
> >
> ... snip ...
> >>
> >> Well, that looks impressive, but you must be losing something.
> >> You must be doing something illegal and non-understandable (to a C
> >> programmer) with one or more of indentation, braces placement,
> >> illegal statements (a call to foo should never enter bar).  I see
> >> no reason for bar to exit while foo falls through.
> >
> > I should have used fixed point type to make the listing fragment
> > clearer. This is the source used in the example.
> >
> > void bar (void);
> >
> > void foo (void)
> >   {
> >      NOP();
> >      bar();
> >   }
> >
> > void bar (void)
> >   {
> >      NOP();
> >   }
> >
> > void main (void)
> >   {
> >     foo();
> >     bar();
> >   }
>
> Well, that executes foo (and thus bar), followed by bar.  I see no
> savings there from fall-thru.  See my message of Sat. 11:13 am EDT
> -0400.

There is a saving.

Look at the listing I posted before. It follows in fixed point type.
Don't start a rant about html please

w..

                           void bar (void);

                           void foo (void)
                             {
0100 9D     NOP               NOP();
                              bar();
                             }

                           void bar (void)
                             {
0101 9D     NOP               NOP();
0102 81     RTS              }

                           void main (void)
                            {
0103 AD FB  BSR    $0100      foo();
0105 20 FA  BRA    $0101      bar();
                            }

                  __MAIN:
FFFE 01 03


Reply by David Brown ● April 27, 2008

CBFalconer wrote:
> David Brown wrote:
> ... snip ...
>> That's just tail call elimination (changing a "call X; ret" into
>> a "jmp X"), which is a standard optimisation technique (some
>> assemblers will do that for you).
>>
>> A better example would be:
>>
>> WriteSpace:
>>         ld a, #' '
>> WriteChar:
>>         st a, outputCharacter
>>         ret
>>
>> with C code:
>>
>> extern volatile char outputCharacter;
>> void WriteChar(char c) {
>>         outputCharacter = c;
>> }
>> void WriteSpace(void) {
>>         WriteChar(' ');
>> }
> 
> But that doesn't do anything, because normal C executes a return on
> the closing brace.  Am I missing something?
> 

You must be missing something :-)  Your example code was not very 
helpful, because your first version implied that foo is a callable 
function in its own right - making a combined fall-through foobar would 
require duplicating the code for foo.  Thus Walter did a direct 
translation to C and generated code that was slightly better than your 
first assembly code.  In the code I've given, I wrote an assembly 
function with two distinct entry points, and the typical equivalent C 
code for it.  The question is, will Walter's C compiler generate a 
fall-through here?

Reply by David Brown ● April 27, 2008

CBFalconer wrote:
> David Brown wrote:
> ... snip ...
>> A quick test on avr-gcc 4.2.2, using 16-bit and 8-bit ints rather
>> than 32-bit and 16-bit (since it's an 8-bit cpu) reveals that
>> avr-gcc is smart enough to do a 8-bit x 8-bit -> 16-bit multiply
>> as desired.  It's a little harder to see exactly what is
>> happening for bigger numbers and for division, since these use
>> library calls - certainly the compiler will generalise some of
>> these functions.  But for the very common case of the multiply
>> like this, you get optimal code.
> 
> Defining 'optimal' is a varying target.  Among others, see Knuth. 
> In particular, in the past I have compromised on an 8 * 16 -> 24
> bit heart, two of which, with an addition, produced a 16 * 16 -> 32
> multiplication.  This had, on the machine of interest (an 8080),
> significant advantages, i.e. about a 50% decrease in multiplication
> times.  Other games are available at the compile stage where one
> operand is constant, especially those where the multiplier consists
> of some solid string of 1 bits.
> 

Yes, "optimal" can mean different things - code size, speed, stack use 
and ram size being the most common points.  "optimal" also depends on 
things like shared library code, and any other information that the 
compiler may have.  That's why I restricted my test to a simple 8x8->16 
multiply on the AVR - the generated code is simple enough to be optimal 
in every way.
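
For reference, the kind of source that such a test would use looks roughly like the sketch below. The function name is made up, and this is not the exact code that was compiled. Whether a given compiler turns it into a single 8x8 hardware multiply is compiler- and target-specific, so the generated listing has to be checked.

```c
#include <stdint.h>

/* Widen one operand before multiplying so the full 16-bit product is kept.
   A compiler may implement this with a single 8x8 hardware multiply under
   the "as if" rule; whether it does is compiler- and target-specific. */
uint16_t mul8x8(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}
```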

Reply by Robert Adsett ● April 27, 2008

In article <481452FF.C82B6E1C@bytecraft.com>, Walter Banks says...
> 
> 
> Robert Adsett wrote:
> 
> >         mul a,b,c       ;  b * c -> (a,b) 16bit x 16bit -> 32bit multiply
> >         div a,d ; (a,b)/d -> a 32bit / 16bit -> 16bit divide
> >
> > It's something I do write in asm to take advantage of a processors
> > scaling capability.
> 
> Robert,
> 
> A lot of approach depends on processor. We use the "as if"
> rule a lot in code generation. In general 8*8->16 bits will
> use a processor 8*8 if we can. Similarly we grab the MS 8bits
> when we multiply two 8 bit fracts rather than casting and using
> a 32 bit multiply.

Good to know, thanks Walter.
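
For anyone following along, the two operations discussed above can be written in portable C roughly as follows. This is only a sketch with made-up function names; it expresses the intent and leaves it to the compiler to pick the narrow hardware multiply and divide, which may or may not happen on a given target.

```c
#include <stdint.h>

/* Scale b by the ratio c/d: 16x16 -> 32-bit product, then 32/16 divide.
   The caller must ensure d != 0 and that the quotient fits in 16 bits. */
uint16_t scale16(uint16_t b, uint16_t c, uint16_t d)
{
    uint32_t product = (uint32_t)b * c;
    return (uint16_t)(product / d);
}

/* Multiply two unsigned 8-bit fractions (value = n / 256) and keep only
   the most significant 8 bits of the 16-bit product. */
uint8_t fract8_mul(uint8_t a, uint8_t b)
{
    return (uint8_t)(((uint16_t)a * b) >> 8);
}
```

With these definitions, `fract8_mul(128, 128)` is 0.5 * 0.5 and gives 64, i.e. 0.25.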

Robert
