EmbeddedRelated.com

And -- pow! My memory was gone.

Started by Tim Wescott February 21, 2011
On 22/02/11 22:12, Tim Wescott wrote:
> On 02/22/2011 12:05 PM, Arlet Ottens wrote:
>> On 02/22/2011 08:05 PM, Tim Wescott wrote:
>>
>>> Mostly I was sharing my amazement about how much of a chunk one
>>> (supposedly) itty bitty mathematical function took up, and how much more
>>> space the gnu embedded library for the ARM takes up than the
>>> alternative, commercial, tool. (With the library sprintf, the thing
>>> compiles to something like 78kB, which is a barrier to progress given
>>> that the processor in question only has 64kB of flash).
>>
>> It makes sense, though. GCC has a large number of target architectures,
>> and therefore these libraries have been written in C to make them easier
>> to port.
>>
>> Commercial tools are usually aimed towards a single target, which makes
>> it a lot easier to hand optimize the math libs in assembly.
>
> Even hand optimizing in C for a specific processor, or doing mostly C
> with assembly just in the spots where it really matters, can do
> considerable good for both size and run time.
>
Once you are working with a 32-bit processor, you are generally unlikely to do much better by specialising your library code per processor, or by using assembly. It makes a much bigger difference for smaller targets.

The exception is if you are able to make use of particular odd instructions or capabilities that your processor has, but that the compiler cannot generate automatically.
On 02/23/2011 01:04 AM, David Brown wrote:
> On 22/02/11 22:12, Tim Wescott wrote:
>> On 02/22/2011 12:05 PM, Arlet Ottens wrote:
>>> On 02/22/2011 08:05 PM, Tim Wescott wrote:
>>> [...]
>>>
>>> It makes sense, though. GCC has a large number of target architectures,
>>> and therefore these libraries have been written in C to make them
>>> easier to port.
>>>
>>> Commercial tools are usually aimed towards a single target, which makes
>>> it a lot easier to hand optimize the math libs in assembly.
>>
>> Even hand optimizing in C for a specific processor, or doing mostly C
>> with assembly just in the spots where it really matters, can do
>> considerable good for both size and run time.
>>
>
> Once you are at the stage of a 32-bit processor, then you are generally
> unlikely to do much better by specialising your library code per
> processor, or using assembly. It makes a much bigger difference for
> smaller targets.
>
> The exception is if you are able to make use of particular odd
> instructions or capabilities that your processor has, but that the
> compiler cannot generate automatically.
Those odd instructions are typically something you can use in math operations. Often a CPU will have math instructions that are useful but don't align directly with C conventions, such as non-standard multiply/divide bit sizes. Assembly will also give you direct access to the carry and overflow flags, and to other odd instructions such as count-leading-zeroes.

A good C compiler will also provide ways to gain access to those, but possibly at the price of having to write convoluted code and frequently check the assembler output. At that point, it's easier just to write the assembly yourself.

I'm not advocating that the average user go write his own math lib in assembly, but for a vendor trying to sell a high-performance toolchain, hiring an expert to optimize the last cycle out of math code can be worth it.
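[Editor's note: as a concrete illustration of the count-leading-zeroes case above, GCC and Clang expose the instruction through the `__builtin_clz` intrinsic, so no inline assembly is needed. A minimal sketch (the wrapper name is ours):

#include <cassert>
#include <cstdint>

// Count leading zeros of a non-zero 32-bit value. On ARM this typically
// compiles down to a single CLZ instruction; on targets without one, the
// compiler substitutes a short library routine.
static inline int count_leading_zeros(std::uint32_t x)
{
    assert(x != 0);            // __builtin_clz is undefined for 0
    return __builtin_clz(x);   // assumes 32-bit unsigned int
}

int main()
{
    assert(count_leading_zeros(1u) == 31);
    assert(count_leading_zeros(0x80000000u) == 0);
    return 0;
}

This keeps the source portable across GCC-family compilers, though the generated code should still be checked per target, as the posters note.]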
Tim Wescott wrote:

> If I could figure out how to keep C++ without using the heap, I'd be a
> happy camper.
Don't use 'new'. Put your objects on the stack and let subroutines use
them through references as function arguments, e.g.:

static void foo(MyClass& c)
{
    c.do_something();
}

int main()
{
    int some_data;
    ... some code ...
    MyClass my_object(some_data);
    ... some code ...
    foo(my_object);
    ... some code ...
}
> I'm not even sure if malloc &c. are getting _called_, or
> just pulled in because C++ uses "new", which uses -- well, you get the
> idea.
If your classes don't call new, your functions never call new, and you don't use classes from other libraries, then malloc() should not be linked in. That's the theory, though :-(
> For embedded I pretty much avoid dynamic deallocation like the
> plague*; while this is against the C++ desktop paradigm, it lets you use
> a very useful subset of the language without a whole 'heap' of trouble.
>
> * Meaning I'll "new" things at system startup but only if they're going
> to live until power-down. This gives one great flexibility in making
> portable libraries while still not fragmenting the heap through
> new-delete-new-delete-new sequences.
For this approach (and placement new, see
http://en.wikipedia.org/wiki/Placement_syntax) malloc is linked in.

bye
Andreas
--
Andreas Hünnebeck | email: acmh@gmx.de
----- privat ---- | www : http://www.huennebeck-online.de
Fax/Anrufbeantworter: 0721/151-284301
GPG-Key: http://www.huennebeck-online.de/public_keys/andreas.asc
PGP-Key: http://www.huennebeck-online.de/public_keys/pgp_andreas.asc
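[Editor's note: a minimal sketch of the placement-new technique referenced above, constructing an object into statically allocated storage. The class and names here are hypothetical illustrations, not from the thread:

#include <cassert>
#include <new>      // declares the placement form of operator new

struct Motor {               // hypothetical example class
    int speed;
    explicit Motor(int s) : speed(s) {}
};

// Statically allocated, suitably aligned raw storage -- the buffer
// itself involves no heap allocation.
alignas(Motor) static unsigned char motor_buf[sizeof(Motor)];

int main()
{
    // Construct the object in the static buffer at startup.
    Motor* m = new (motor_buf) Motor(42);
    assert(m->speed == 42);

    // Objects that live until power-down often skip destruction on
    // embedded targets; otherwise invoke the destructor explicitly.
    m->~Motor();
    return 0;
}

Whether this pattern pulls malloc into the link, as Andreas says it did for him, depends on the toolchain and library; checking the map file is the only sure test.]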
On 23/02/2011 07:59, Arlet Ottens wrote:
> On 02/23/2011 01:04 AM, David Brown wrote:
>> On 22/02/11 22:12, Tim Wescott wrote:
>>> [...]
>>>
>>> Even hand optimizing in C for a specific processor, or doing mostly C
>>> with assembly just in the spots where it really matters, can do
>>> considerable good for both size and run time.
>>
>> Once you are at the stage of a 32-bit processor, then you are generally
>> unlikely to do much better by specialising your library code per
>> processor, or using assembly. It makes a much bigger difference for
>> smaller targets.
>>
>> The exception is if you are able to make use of particular odd
>> instructions or capabilities that your processor has, but that the
>> compiler cannot generate automatically.
>
> Those odd instructions are typically something you can use in math
> operations. Often a CPU will have math instructions that are useful, but
> don't align directly with C conventions, such as non-standard
> multiply/divide bit sizes. Also, assembly will give you direct access to
> carry and overflow flags, and other odd instructions, such as
> count-leading-zeroes.
>
> A good C compiler will also provide ways to gain access to those, but
> may be at a price of having to write convoluted code, and frequently
> checking assembler output. At that point, it's easier just to write the
> assembly yourself.
>
Sometimes the good C compiler will generate odd instructions automatically - in which case it is better to write the more "natural" C code, as this gives the compiler the best chance of generating even better code. You'll want to check the generated assembly, of course.

However, although compilers have got better at this sort of thing, they are far from perfect. They are also hampered by the limitations of C. For example, there is no good way to write a "rol" or "ror" instruction in C, and few if any compilers would generate it from raw C.

I agree with you that writing the assembly by hand is sometimes the easiest and clearest way - "intrinsic" C function wrappers around odd instructions are okay for an instruction or two, but quickly become messy. A spot of inline assembly can be much clearer than convoluted pseudo-C functions.
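[Editor's note: for reference, the usual shift-and-or idiom for a rotate is shown below. Masking both shift counts keeps the code free of undefined behaviour, and many modern compilers do recognise this exact pattern and emit a single rotate instruction - though, as the post says, only inspecting the generated assembly can confirm it for a given toolchain:

#include <cassert>
#include <cstdint>

// 32-bit rotate-left written in plain C++. The "& 31" masks keep every
// shift in the range 0..31 (shifting a 32-bit value by 32 is undefined
// behaviour), including the n == 0 case.
static inline std::uint32_t rotl32(std::uint32_t x, unsigned n)
{
    n &= 31;
    return (x << n) | (x >> ((32 - n) & 31));
}

int main()
{
    assert(rotl32(0x80000001u, 1) == 0x00000003u);  // wraps the top bit
    assert(rotl32(0x12345678u, 0) == 0x12345678u);  // rotate by 0 is a no-op
    return 0;
}
]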
> I'm not advocating that the average user go write his own math lib in
> assembly, but for a vendor trying to sell a high-performance toolchain,
> hiring an expert to optimize the last cycle out of math code can be
> worth it.
>
It can be worth the effort - but not always. You (or the compiler vendor) also have to take into account things like maintenance, testability, and stability in the face of changes to the compiler. It is more important to be correct, and to be sure that your code stays correct, than to be small and fast.

There are also different balances, which makes things difficult - often you can write code that is small /and/ fast, but sometimes it's a choice of small /or/ fast. Then there is accuracy and conformance to standards - do you write your libraries to conform exactly to the standards (for IEEE-754, this means special handling of NaNs, signed zeros, etc., which often means a hardware FPU is of only limited use), or do you write your libraries to conform to what your customers actually need (embedded systems seldom need NaNs, etc.)?

Sometimes it /is/ worth writing your own maths code (though it is seldom worth doing it in assembly). For example, it is typically not hard to write a sine function that is smaller and much faster than the maths library sinf(), and yet which is accurate enough for your motor controller or other application.
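[Editor's note: a sketch of the kind of "good enough" sine the post describes - a truncated odd-power polynomial, valid on roughly [-pi/2, pi/2] (wider ranges need argument reduction first). The coefficients are simply the Taylor series through x^7, giving errors on the order of 1e-4 over that range, which is plenty for many control loops and far cheaper than a fully IEEE-conformant sinf():

#include <cassert>
#include <cmath>

// Polynomial sine: x - x^3/6 + x^5/120 - x^7/5040, evaluated in
// Horner form. Accurate to roughly 1e-4 on [-pi/2, pi/2].
static float fast_sinf(float x)
{
    const float x2 = x * x;
    return x * (1.0f + x2 * (-1.0f / 6.0f
                + x2 * (1.0f / 120.0f
                + x2 * (-1.0f / 5040.0f))));
}

int main()
{
    // Compare against the library sine across the valid range.
    for (float x = -1.5f; x <= 1.5f; x += 0.1f) {
        assert(std::fabs(fast_sinf(x) - std::sin(x)) < 1e-3f);
    }
    return 0;
}

A production version would use minimax rather than Taylor coefficients for a better error bound at the same cost, but the structure is the same.]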
On 23/02/2011 09:20, Andreas Huennebeck wrote:
> Tim Wescott wrote:
>
>> If I could figure out how to keep C++ without using the heap, I'd be a
>> happy camper.
>
> Don't use 'new'. Put your objects on the stack and let subroutines use
> them through references as function arguments, e.g.:
>
> [...]
>
>> I'm not even sure if malloc &c. are getting _called_, or
>> just pulled in because C++ uses "new", which uses -- well, you get the
>> idea.
>
> If your classes don't call new and your functions never call new and if you
> don't use classes from other libraries then malloc() should not be linked in.
> That's theory though :-(
>
>> For embedded I pretty much avoid dynamic deallocation like the
>> plague*; while this is against the C++ desktop paradigm, it lets you use
>> a very useful subset of the language without a whole 'heap' of trouble.
>>
>> * Meaning I'll "new" things at system startup but only if they're going
>> to live until power-down. This gives one great flexibility in making
>> portable libraries while still not fragmenting the heap through
>> new-delete-new-delete-new sequences.
>
> For this approach (and placement new, see
> http://en.wikipedia.org/wiki/Placement_syntax)
> malloc is linked in.
>
It is also possible to use static objects that get constructed before main() - this will not lead to any mallocs and will also allow absolute addressing modes which are more efficient on some processors. Of course, you then have to be sure of the ordering of construction of these static objects (or use a two-part constructor and init method).
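[Editor's note: a minimal sketch of the "two-part constructor and init method" pattern mentioned above. The class and names are hypothetical; the point is that the constructor that runs before main() does nothing order-dependent, while init() is called explicitly from main() in a known order:

#include <cassert>

// Hypothetical peripheral driver using the two-part pattern.
class Uart {
public:
    Uart() : initialized_(false), baud_(0) {}  // trivial, order-independent
    void init(int baud) {                      // called explicitly from main()
        baud_ = baud;
        initialized_ = true;
    }
    bool ready() const { return initialized_; }
    int baud() const { return baud_; }
private:
    bool initialized_;
    int baud_;
};

// Statically allocated: constructed before main(), no malloc involved,
// and on some processors the linker can use absolute addressing for it.
static Uart uart0;

int main()
{
    uart0.init(115200);   // explicit ordering, unlike static constructors
    assert(uart0.ready() && uart0.baud() == 115200);
    return 0;
}
]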
Mel <mwilson@the-wire.com> writes:

> And somebody's delay routine brought in the floating-point package
> so they could take their parameter in seconds.
That's what you get for neither reading the documentation, nor obeying
the #warnings that are printed.

The floating-point calculations are done at *compile time* only, and in
order for them to be optimized away, well, you have to enable
optimizations. But that's all documented ...
--
Joerg Wunsch * Development engineer, Dresden, Germany
Atmel Automotive GmbH, Theresienstrasse 2, D-74027 Heilbronn
Geschaeftsfuehrung: Steven A. Laub, Stephen Cumming, Thomas Hoetzel
Amtsgericht Stuttgart, Registration HRB 106594
On 23/02/2011 13:35, Joerg Wunsch wrote:
> Mel <mwilson@the-wire.com> writes:
>
>> And somebody's delay routine brought in the floating-point package
>> so they could take their parameter in seconds.
>
> That's what you get for neither reading the documentation, nor
> obeying the #warnings that are printed.
>
> The floating-point calculations are done at *compile time* only, and
> in order for them to be optimized away, well, you have to enable
> optimizations. But that's all documented ...
>
I take it you are talking about the delay routines in avr-libc (the default library for avr-gcc, for those that don't know) - they work fine, as the floating point calculations are done at compile time. If someone is getting floating point code using avr-libc delay functions, then they are, as you say, not using the code correctly.

But I've seen other people's delay routines which /try/ to work this way, but fail - for various reasons, they end up doing the calculations at run time, even with optimisations enabled.
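[Editor's note: a host-runnable sketch of the pattern being discussed. The conversion from milliseconds to loop iterations is written in floating point, but when the argument and clock frequency are compile-time constants and the function is inlined with optimization enabled, the whole expression folds to an integer and no float code reaches the target. The F_CPU and cycles-per-loop values are assumptions for illustration, not avr-libc's actual constants:

#include <cassert>

constexpr double F_CPU = 8000000.0;           // 8 MHz clock (hypothetical)
constexpr unsigned long CYCLES_PER_LOOP = 4;  // cost of one delay loop (hypothetical)

// With a constant ms argument and -O enabled, this entire expression is
// evaluated by the compiler; passing a runtime-variable ms instead is
// exactly the mistake that drags the float library into the link.
static inline unsigned long delay_loops(double ms)
{
    return static_cast<unsigned long>(ms * (F_CPU / 1000.0) / CYCLES_PER_LOOP);
}

int main()
{
    // 1 ms at 8 MHz with 4 cycles per iteration -> 2000 iterations.
    assert(delay_loops(1.0) == 2000);
    return 0;
}
]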
On 02/23/2011 12:20 AM, Andreas Huennebeck wrote:
> Tim Wescott wrote:
>
>> If I could figure out how to keep C++ without using the heap, I'd be a
>> happy camper.
>
> Don't use 'new'. Put your objects on the stack and let subroutines use
> them through references as function arguments, e.g.:
>
> [...]
>
>> I'm not even sure if malloc &c. are getting _called_, or
>> just pulled in because C++ uses "new", which uses -- well, you get the
>> idea.
>
> If your classes don't call new and your functions never call new and if you
> don't use classes from other libraries then malloc() should not be linked in.
> That's theory though :-(
>
>> For embedded I pretty much avoid dynamic deallocation like the
>> plague*; while this is against the C++ desktop paradigm, it lets you use
>> a very useful subset of the language without a whole 'heap' of trouble.
>>
>> * Meaning I'll "new" things at system startup but only if they're going
>> to live until power-down. This gives one great flexibility in making
>> portable libraries while still not fragmenting the heap through
>> new-delete-new-delete-new sequences.
>
> For this approach (and placement new, see
> http://en.wikipedia.org/wiki/Placement_syntax)
> malloc is linked in.
>
> bye
> Andreas
void * malloc(size_t n)
{
    assert(false);
    return 0;
}

is pretty small -- maybe I'll use that.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
On 02/23/2011 02:30 AM, David Brown wrote:
> On 23/02/2011 09:20, Andreas Huennebeck wrote:
>> Tim Wescott wrote:
>>
>>> If I could figure out how to keep C++ without using the heap, I'd be a
>>> happy camper.
>>
>> [...]
>>
>> For this approach (and placement new, see
>> http://en.wikipedia.org/wiki/Placement_syntax)
>> malloc is linked in.
>
> It is also possible to use static objects that get constructed before
> main() - this will not lead to any mallocs and will also allow absolute
> addressing modes which are more efficient on some processors. Of course,
> you then have to be sure of the ordering of construction of these static
> objects (or use a two-part constructor and init method).
That's what I'm doing! Unless I'm creating something dynamically and
I've forgotten it (the code is old, and getting resurrected) -- now I
need to go over the code with a fine-toothed comb.

--
Tim Wescott
Wescott Design Services
http://www.wescottdesign.com

Do you need to implement control loops in software?
"Applied Control Theory for Embedded Systems" was written for you.
See details at http://www.wescottdesign.com/actfes/actfes.html
On 23/02/2011 16:47, Tim Wescott wrote:
> On 02/23/2011 12:20 AM, Andreas Huennebeck wrote:
>> Tim Wescott wrote:
>>
>>> If I could figure out how to keep C++ without using the heap, I'd be a
>>> happy camper.
>>
>> [...]
>>
>> For this approach (and placement new, see
>> http://en.wikipedia.org/wiki/Placement_syntax)
>> malloc is linked in.
>>
>> bye
>> Andreas
>
> void * malloc(size_t n)
> {
>     assert(false);
>     return 0;
> }
>
> is pretty small -- maybe I'll use that.
>
void * malloc(size_t n)
{
    return 0;
}

is smaller. "assert" can bring in all sorts of junk, depending on how it
is defined, and what flags you have enabled - if it tries to print out a
message on stderr, then this may be the cause of some of your problems.
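[Editor's note: a related C++-level variant of the same idea, runnable on a host for demonstration. Rather than stubbing malloc() itself (which is awkward to test on a hosted system, where the C library uses it), this sketch replaces the global operator new with one that records any heap use, avoiding assert()'s stderr machinery entirely. On a target, the recording line could instead spin, toggle a pin, or reset:

#include <cassert>
#include <cstdlib>
#include <new>

static bool g_heap_used = false;

// Replacement global operator new: note every allocation. (A strict
// no-heap build would halt here instead of forwarding to malloc.)
void* operator new(std::size_t n)
{
    g_heap_used = true;
    return std::malloc(n);
}
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

int main()
{
    int stack_obj = 42;          // stack object: no operator new call
    (void)stack_obj;
    assert(!g_heap_used);        // nothing so far touched the heap

    int* heap_obj = new int(7);  // this goes through our operator new
    assert(g_heap_used && *heap_obj == 7);
    delete heap_obj;
    return 0;
}

Note this only catches C++ allocations; C-library routines that call malloc() directly (sprintf buffers and the like) still need the malloc stub or a linker-level check.]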
