
What's more important, optimisations or debugging?

Started by Unknown May 30, 2007
"David Brown" <david@westcontrol.removethisbit.com> wrote in message 
news:46666c98$0$15310$8404b019@news.wineasy.se...
> CBFalconer wrote:
>> David Brown wrote:
>>> Wilco Dijkstra wrote:
>>>> "David Brown" <david@westcontrol.removethisbit.com> wrote
>>>>> Remember Knuth's golden rules about optimisation:
>>>>>
>>>>> 1. Don't do it.
>>>>> 2. (For experts only) Don't do it yet.
>>> (Thanks for correcting my source on this quotation, by the way.)
>>>
>>>> I don't agree with this. For small programs it is easy to
>>>> implement an efficient algorithm immediately rather than start
>>>> with an inefficient one. It's hard to improve badly written code,
>>>> so rewriting it from scratch would be better than trying to fix
>>>> it.
>>> As I noted below, and as you already knew, the context of these
>>> rules is for optimisation of the implementation, not choice of
>>> algorithm - the whole point is that choosing a better algorithm
>>> will make a bigger difference to the result than a finely tuned
>>> poor algorithm. Any time spent fine-tuning your bubble-sort
>>> implementation is time wasted - switch to a heap sort or quicksort.
>>
>> Here you are missing an option. Also consider mergesort, combined
>> with a malloced list. For an example of this see wdfreq.c, a part
>> of the hashlib distribution. See:
>>
>> <http://cbfalconer.home.att.net/download/>
>>
>> for the full package.
>>
> I was not intending to give a list of possible sorting algorithms - there
> are many, many more than the four mentioned here, and there are many things
> to consider if you are looking for an "optimal" solution. Sorting
> algorithms are fun to work with, since it is easy to understand and specify
> the problem, while offering a wide variety of algorithms, so a thread on
> the subject (or algorithm design in general) might be of general interest
> in c.a.e. - but it's off-topic for this thread.
Sorting is actually a good example to show what I mean. It's incorrect to think that an algorithm with a better algorithmic complexity is faster than a simpler one. The implementation of a simple algorithm can be optimised more and so wins in most cases (unless the number of elements is huge). QuickSort is the best-known example of this: while it is theoretically slower than all NlogN sorting algorithms, in practice it is much faster.

There is easily a factor 2 to be had in implementation efficiency - after you've chosen the most appropriate algorithms. So optimizing the implementation makes sense - I consider it an important aspect of writing high-quality code. It doesn't mean spending a lot of time micro-optimizing code. It means putting more effort into the design of the implementation, making it as simple as possible. The simplicity translates into fewer lines of code, and thus smaller and faster code.

A bonus is that the extra time spent on the design usually results in a better understanding of the problem and so fewer bugs. In my experience there is a high correlation between the amount of code someone writes for a given problem, and the efficiency and quality of it. As a real-world example, how many lines of code do you need to calculate the number of bits used by a value (ie. integer log2)? I've seen an experienced programmer writing 20 lines of code with 4 mistakes (a few of which were not obvious or easily found by testing). The really worrying bit is that this was in software destined for a well known US attack helicopter...

Wilco
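For reference, the bit count Wilco describes fits in a handful of lines. A minimal sketch, assuming the convention that zero uses zero bits (the function name is made up for illustration, not code from the thread):

unsigned bits_used(unsigned x)   /* bits needed to hold x: floor(log2(x)) + 1, or 0 for x == 0 */
{
    unsigned n = 0;
    while (x) {
        x >>= 1;
        n++;
    }
    return n;
}

A loop like this is easy to test exhaustively against a reference; table-driven or count-leading-zeros versions are faster but no longer than this.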
On 30 May, 23:15, rhapg...@gmail.com wrote:
> I'm trying to get a feel for what people now consider more important,
> optimisations or debugging ability. In the past with such tight memory
> limits I would have said optimisations, but now with expanded-memory
> parts becoming cheaper I would think that debugging ability is more
> important in a development tool. It is not exactly a black and white
> question, debug or optimised, but more a ratio, e.g. 50% debug /
> 50% optimised or 70% debug / 30% optimised, etc.
>
> Any feedback would be greatly appreciated.
Optimising also adds the possibility of introducing new bugs, so it's a stage to skip if you can. With very fast clock speeds now I no longer optimise unless I really need to.

Getting the bugs out has got to be a top priority, otherwise your customer won't come back!

Wilco Dijkstra wrote:

>> I was not intending to give a list of possible sorting algorithms - there
>> are many, many more than the four mentioned here, and there are many things
>> to consider if you are looking for an "optimal" solution. Sorting
>> algorithms are fun to work with, since it is easy to understand and specify
>> the problem, while offering a wide variety of algorithms, so a thread on
>> the subject (or algorithm design in general) might be of general interest
>> in c.a.e. - but it's off-topic for this thread.
>
> Sorting is actually a good example to show what I mean. It's incorrect to
> think that an algorithm with a better algorithmic complexity is faster than
> a simpler one. The implementation of a simple algorithm can be optimised
> more and so wins in most cases (unless the number of elements is huge).
> QuickSort is the best-known example of this: while it is theoretically slower
> than all NlogN sorting algorithms, in practice it is much faster.
>
> There is easily a factor 2 to be had in implementation efficiency - after
> you've chosen the most appropriate algorithms. So optimizing the
> implementation makes sense - I consider it an important aspect of
> writing high-quality code.
Good compilers will not change a bubble sort into a quicksort. Optimization starts with describing what is wanted with good clear code. Good compiler optimization analyses the code presented and creates a solid implementation.

To weigh in on this thread: set the compiler at the optimization level at which the code is going to be shipped, so the code that is debugged is the code that is shipped. There is nothing worse than debugging at one optimization level and shipping at another. The product that is shipped should be the one that is best understood and has had the most experience being used.

Walter Banks
--
Byte Craft Limited
Tel. (519) 888-6911
http://www.bytecraft.com
email walter@bytecraft.com
Wilco Dijkstra wrote:
> "David Brown" <david@westcontrol.removethisbit.com> wrote in message > news:46666c98$0$15310$8404b019@news.wineasy.se... >> CBFalconer wrote: >>> David Brown wrote: >>>> Wilco Dijkstra wrote: >>>>> "David Brown" <david@westcontrol.removethisbit.com> wrote > >>>>>> Remember Knuth's golden rules about optimisation: >>>>>> >>>>>> 1. Don't do it. >>>>>> 2. (For experts only) Don't do it yet. >>>> (Thanks for correcting my source on this quotation, by the way.) >>>> >>>>> I don't agree with this. For small programs it is easy to >>>>> implement an efficient algorithm immediately rather than start >>>>> with an inefficient one. It's hard to improve badly written code, >>>>> so rewriting it from scratch would be better than trying to fix >>>>> it. >>>> As I noted below, and as you already knew, the context of these >>>> rules is for optimisation of the implementation, not choice of >>>> algorithm - the whole point is that choosing a better algorithm >>>> will make a bigger difference to the result than a finely tuned >>>> poor algorithm. Any time spent fine-tuning your bubble-sort >>>> implementation is time wasted - switch to a heap sort or quicksort. >>> Here you are missing an option. Also consider mergesort, combined >>> with a malloced list. For an example of this see wdfreq.c, a part >>> of the hashlib distribution. See: >>> >>> <http://cbfalconer.home.att.net/download/> >>> >>> for the full package. >>> >> I was not intending to give a list of possible sorting algorithms - there are many, many >> more than the four mentioned here, and there are many things to consider if you are >> looking for an "optimal" solution. Sorting algorithms are fun to work with, since it is >> easy to understand and specify the problem, while offering a wide variety of algorithms, >> so a thread on the subject (or algorithm design in general) might be of general interest >> in c.a.e. - but it's off-topic for this thread. > > Sorting is actually a good example to show what I mean. It's incorrect to > think that an algorithm with a better algorithmic complexity is faster than > a simpler one. The implementation of a simple algorithm can be optimised > more and so wins in most cases (unless the number of elements is huge).
A good implementation of a simple algorithm is often the right choice - bubblesort can be faster than quicksort for small numbers of elements, and is much easier to understand and get right in the code. What I am arguing against is "optimising" your source code to the extent that it is no longer good code - if you need to do that, you've got the wrong algorithm (or wrong hardware).

Writing sensible, efficient code is always a good thing - but if you need to go back and make lots of tweaks to your source code for better performance, then either you've written bad code in the first place, you have the wrong algorithm, or you have poor tools. For example, if you are at the level of changing a "for" loop into a "while (n--)" loop, or adding extra local pointer registers, then you are not helping (assuming, of course, that you have a reasonably effective optimising compiler).
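To make that last point concrete, a hedged example (generic code, not from the thread): both functions express the same copy, and a reasonably capable optimising compiler will usually generate comparable code for either, so rewriting one into the other is not a productive source-level "optimisation".

void copy_for(int *dst, const int *src, unsigned n)
{
    for (unsigned i = 0; i < n; i++)     /* straightforward indexed loop */
        dst[i] = src[i];
}

void copy_while(int *dst, const int *src, unsigned n)
{
    while (n--)                          /* "hand-optimised" pointer version */
        *dst++ = *src++;
}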
> QuickSort is the best-known example of this: while it is theoretically slower
> than all NlogN sorting algorithms, in practice it is much faster.
>
QuickSort is not theoretically slower than all NlogN algorithms. It is theoretically faster than most, assuming a completely random distribution of the original data. But it has a worst case O(N^2) complexity, and it hits this worst case when the original data is already sorted. It can be modified to avoid this, if you are expecting to meet nearly sorted data. But if you are expecting to regularly meet nearly sorted data, then some other algorithms (including even bubblesort) can be much faster. This is why sorting is so interesting in algorithm design - depending on the requirements, and the kind and size of data you will be sorting, there are many different algorithms.
> There is easily a factor 2 to be had in implementation efficiency - after
> you've chosen the most appropriate algorithms. So optimizing the
> implementation makes sense - I consider it an important aspect of
> writing high-quality code. It doesn't mean spending a lot of time
> micro-optimizing code. It means putting more effort into the design of the
> implementation, making it as simple as possible. The simplicity translates
> into fewer lines of code, and thus smaller and faster code.
>
There can easily be a factor of two between well-written code and badly written code, or between code compiled with a good optimising compiler and code compiled with a poorer compiler (or a good one with optimising turned off). Writing fast code is certainly part of writing high quality code - but I don't think you can, in general, take good quality code and "optimise" it to run twice as fast while still keeping it as good quality code.
> A bonus is that the extra time spent on the design usually results in a better
> understanding of the problem and so fewer bugs. In my experience there
> is a high correlation between the amount of code someone writes for a
> given problem, and the efficiency and quality of it. As a real-world example,
I don't see that correlation at all. I've seen code that is long-winded, unclear, unmaintainable and bug-ridden. I've also seen code that is tight and compact, and is also unclear, unmaintainable and bug-ridden. I am in full support of the idea of spending extra time on the hard bits of the code, to get a better understanding and to write better code - and that this will often give faster code as well. But it's the speed of the code that is the bonus from writing better code with fewer bugs, not the other way round.
> how many lines of code do you need to calculate the number of bits used
> by a value (ie. integer log2)? I've seen an experienced programmer writing
> 20 lines of code with 4 mistakes (a few of which were not obvious or easily
> found by testing). The really worrying bit is that this was in software destined
> for a well known US attack helicopter...
>
> Wilco

Marra wrote:

>
> Optimising also adds the possibility of introducing new bugs, so it's a
> stage to skip if you can.
> With very fast clock speeds now I no longer optimise unless I really
> need to.
>
> Getting the bugs out has got to be a top priority, otherwise your customer
> won't come back!
Cranking the clock up with a slower implementation is not an option on many volume commercial products, for two reasons:

1) EMI. Most products require EMI testing, and reducing the clock speed is a factor that, all other things being equal, will generally reduce emissions.

2) Power consumption is a function of clock speed. To a first-order approximation, power is a linear function of clock speed. With so many battery-powered handheld products this is an important factor to be considered.

Optimization is not the issue as much as debugging at the optimization level that will be shipped, so the final product is the product that was developed.

Something I see in Asia more than North America is project budgets for code size and execution cycles by function.

Walter Banks
--
Byte Craft Limited
Tel. (519) 888-6911
http://www.bytecraft.com
email walter@bytecraft.com
On 7 Jun, 14:13, Walter Banks <wal...@bytecraft.com> wrote:
> Marra wrote:
>
>> Optimising also adds the possibility of introducing new bugs, so it's a
>> stage to skip if you can.
>> With very fast clock speeds now I no longer optimise unless I really
>> need to.
>>
>> Getting the bugs out has got to be a top priority, otherwise your customer
>> won't come back!
>
> Cranking the clock up with a slower implementation is not an option on
> many volume commercial products, for two reasons:
>
> 1) EMI. Most products require EMI testing, and reducing the clock speed
> is a factor that, all other things being equal, will generally reduce emissions.
>
> 2) Power consumption is a function of clock speed. To a first-order
> approximation, power is a linear function of clock speed. With
> so many battery-powered handheld products this is an important
> factor to be considered.
>
> Optimization is not the issue as much as debugging at the optimization level
> that will be shipped, so the final product is the product that was developed.
>
> Something I see in Asia more than North America is project budgets for
> code size and execution cycles by function.
>
> Walter Banks
> --
> Byte Craft Limited
> Tel. (519) 888-6911
> http://www.bytecraft.com
> email wal...@bytecraft.com
A lot of the PIC's instructions are single cycle anyway.

I have done a lot of projects where the PIC has to be the cheapest possible and has to mop up as much external hardware as possible, i.e. UARTs, ADCs, etc.

Good PCB design should help with EMI. Given that the high clock is internal to the PIC I don't see that the problem should be so severe. External clocks running around the board are quite a different matter. The enclosure is important too, to shield radiation both in and out.
"Walter Banks" <walter@bytecraft.com> wrote in message 
news:4668045C.F6B30DB2@bytecraft.com...
>
> Marra wrote:
>
>> Optimising also adds the possibility of introducing new bugs, so it's a
>> stage to skip if you can.
>> With very fast clock speeds now I no longer optimise unless I really
>> need to.
>>
>> Getting the bugs out has got to be a top priority, otherwise your customer
>> won't come back!
>
> Cranking the clock up with a slower implementation is not an option on
> many volume commercial products, for two reasons:
>
> 1) EMI. Most products require EMI testing, and reducing the clock speed
> is a factor that, all other things being equal, will generally reduce emissions.
>
> 2) Power consumption is a function of clock speed. To a first-order
> approximation, power is a linear function of clock speed. With
> so many battery-powered handheld products this is an important
> factor to be considered.
>
> Optimization is not the issue as much as debugging at the optimization level
> that will be shipped, so the final product is the product that was developed.
Also most compilers are of good quality nowadays. It's always possible to find bugs of course (I've found at least one in every compiler I have used), but you need to throw millions of lines of code at it before you're guaranteed to find something. Also compiler bugs are not exclusively introduced by optimizations. In fact I know a compiler which had many more bugs when you turned optimizations off...

The idea of shipping with debugging off has been obsolete for quite a while - as Walter says, shipping the optimised code you tested is best. Most bugs are in the application rather than caused by the compiler. Of course bad programmers prefer to blame the compiler when a bug goes away when they change optimization level rather than fix their code...

Wilco
David Brown wrote:
>
> ... snip ...
>
> QuickSort is not theoretically slower than all NlogN algorithms.
> It is theoretically faster than most, assuming a completely random
> distribution of the original data. But it has a worst case O(N^2)
> complexity, and it hits this worst case when the original data is
> already sorted. It can be modified to avoid this, if you are
> expecting to meet nearly sorted data. But if you are expecting to
> regularly meet nearly sorted data, then some other algorithms
> (including even bubblesort) can be much faster.
>
> This is why sorting is so interesting in algorithm design -
> depending on the requirements, and the kind and size of data you
> will be sorting, there are many different algorithms.
This is one of the points about mergesort. It is ALWAYS O(nlogn). In addition, it is always (properly coded) a stable sort. Neither applies to quicksort. Mergesort code can be very compact. For an example, see the demo applications in my hashlib package, at:

  <http://cbfalconer.home.att.net/download/>

which is written in standard C and is completely portable. Released under GPL.

--
 <http://www.cs.auckland.ac.nz/~pgut001/pubs/vista_cost.txt>
 <http://www.securityfocus.com/columnists/423>
 <http://www.aaxnet.com/editor/edit043.html>
 <http://kadaitcha.cx/vista/dogsbreakfast/index.html>
 cbfalconer at maineline dot net
Wilco Dijkstra wrote:

> Also most compilers are of good quality nowadays. It's always possible to
> find bugs of course (I've found at least one in every compiler I have
> used), but you need to throw millions of lines of code at it before you're
> guaranteed to find something. Also compiler bugs are not exclusively
> introduced by optimizations. In fact I know a compiler which had many more
> bugs when you turned optimizations off...
>
> The idea of shipping with debugging off has been obsolete for quite a
> while - as Walter says, shipping the optimised code you tested is best.
> Most bugs are in the application rather than caused by the compiler. Of
> course bad programmers prefer to blame the compiler when a bug goes away
> when they change optimization level rather than fix their code...
Let us not forget the aims of software development. It is to "Write a correct, robust, readable, documented program to perform the specified functions. The program should be written so that it can be modified, extended, or re-used in the future by the original author or others. It is good (and in some cases vital) that it demonstrate efficiency at run time in time and space, machine independence, ease of debugging, etc." (paraphrased from "Software Fault Prevention by Language Choice: Why C is Not My Favorite Language" by Richard Fateman, UC Berkeley). Richard happens to like Lisp.

This is why debug support would be higher up my list than optimisation. However, I also consider that a decent development process, one that lets you debug the specification before you begin thinking about the code, is a much better prospect all round.

--
********************************************************************
Paul E. Bennett ....................<email://peb@amleth.demon.co.uk>
Forth based HIDECS Consultancy .....<http://www.amleth.demon.co.uk/>
Mob: +44 (0)7811-639972
Tel: +44 (0)1235-811095
Going Forth Safely ..... EBA. www.electric-boat-association.org.uk..
********************************************************************
"CBFalconer" <cbfalconer@yahoo.com> wrote in message news:4668771B.913E42A5@yahoo.com...
> David Brown wrote:
>
> ... snip ...
>
>> QuickSort is not theoretically slower than all NlogN algorithms.
I meant the worst-case complexity. People too often only look at the theoretical worst case rather than at the amortized average case, which is what matters in practice. Similarly, the constant factor is often overlooked or handwaved away, but again this is what actually matters in practice.

For similar reasons it's much better to use radix trees rather than balanced search trees. However, everybody just codes the terribly complex and inefficient AVL or Red-Black stuff...
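As a hedged illustration of the radix-tree idea (a generic bitwise trie over 32-bit keys, not Wilco's code; all the names here are invented), the point is that there is no rebalancing logic at all, which is where AVL and Red-Black implementations pick up most of their complexity:

#include <stdlib.h>
#include <stdbool.h>
#include <stdint.h>

struct tnode {
    struct tnode *child[2];   /* child[bit] for the next key bit */
    bool present;             /* true if a key terminates here   */
};

/* Insert key by walking its bits from the most significant down.
   Allocation failure handling is omitted to keep the sketch short. */
void trie_insert(struct tnode *root, uint32_t key)
{
    struct tnode *n = root;
    for (int i = 31; i >= 0; i--) {
        int bit = (key >> i) & 1u;
        if (!n->child[bit])
            n->child[bit] = calloc(1, sizeof *n);
        n = n->child[bit];
    }
    n->present = true;
}

bool trie_contains(const struct tnode *root, uint32_t key)
{
    const struct tnode *n = root;
    for (int i = 31; i >= 0; i--) {
        n = n->child[(key >> i) & 1u];
        if (!n)
            return false;
    }
    return n->present;
}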
>> It is theoretically faster than most, assuming a completely random
>> distribution of the original data. But it has a worst case O(N^2)
>> complexity, and it hits this worst case when the original data is
>> already sorted. It can be modified to avoid this, if you are
>> expecting to meet nearly sorted data.
My standard implementation chooses the pivot from the middle, which works perfectly for partially sorted, sorted, and reverse-sorted inputs, and even for the degenerate case of all elements being identical. My point is that not all implementations are equal despite using the same basic algorithm...
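A minimal sketch of that kind of implementation (a generic middle-pivot quicksort, not Wilco's actual code): the middle pivot and the symmetric scans keep sorted, reverse-sorted and all-identical inputs from collapsing into the O(N^2) case David describes.

static void swap_int(int *x, int *y) { int t = *x; *x = *y; *y = t; }

void quicksort(int a[], int lo, int hi)       /* sorts a[lo..hi] inclusive */
{
    if (lo >= hi)
        return;

    int i = lo, j = hi;
    int pivot = a[lo + (hi - lo) / 2];        /* pivot taken from the middle */

    while (i <= j) {
        while (a[i] < pivot) i++;             /* scan for an element >= pivot */
        while (a[j] > pivot) j--;             /* scan for an element <= pivot */
        if (i <= j) {
            swap_int(&a[i], &a[j]);
            i++;
            j--;
        }
    }
    if (lo < j) quicksort(a, lo, j);          /* left partition  */
    if (i < hi) quicksort(a, i, hi);          /* right partition */
}

Calling quicksort(a, 0, n - 1) sorts an n-element array; on already-sorted input both scans meet at the middle, so each level of recursion still splits the range roughly in half.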
>> But if you are expecting to
>> regularly meet nearly sorted data, then some other algorithms
>> (including even bubblesort) can be much faster.
Indeed. You could do a partial insertion sort and only run quicksort if the data didn't end up sorted.
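One hedged way to read that suggestion (illustrative only - the 4*n move budget and the function names are assumptions): run an insertion sort with a cap on element moves, and fall back to a full sort, here the standard library qsort(), only if the cap is hit.

#include <stdlib.h>
#include <stdbool.h>

static int cmp_int(const void *p, const void *q)
{
    int a = *(const int *)p, b = *(const int *)q;
    return (a > b) - (a < b);
}

/* Returns true if the array ended up sorted within the move budget. */
static bool try_insertion_sort(int a[], int n, int max_moves)
{
    int moves = 0;
    for (int i = 1; i < n; i++) {
        int v = a[i], j = i;
        while (j > 0 && a[j - 1] > v) {
            a[j] = a[j - 1];
            j--;
            if (++moves > max_moves) {
                a[j] = v;            /* restore a valid permutation */
                return false;        /* too disordered - give up    */
            }
        }
        a[j] = v;
    }
    return true;
}

void nearly_sorted_aware_sort(int a[], int n)
{
    if (!try_insertion_sort(a, n, 4 * n))
        qsort(a, (size_t)n, sizeof a[0], cmp_int);
}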
>> This is why sorting is so interesting in algorithm design -
>> depending on the requirements, and the kind and size of data you
>> will be sorting, there are many different algorithms.
>
> This is one of the points about mergesort. It is ALWAYS O(nlogn).
> In addition, it is always (properly coded) a stable sort. Neither
> applies to quicksort. Mergesort code can be very compact.
Mergesort works quite well on linked lists but is very inefficient on arrays as it isn't an in-place algorithm. Quicksort needs far fewer data swaps, and that is the main reason it wins over mergesort or heapsort.

Wilco
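To round off the comparison, a compact sketch of mergesort on a singly linked list (a generic illustration, not the hashlib code; the node layout is an assumption). It sorts by rewiring pointers only, needs no auxiliary array, and stays stable because ties are taken from the first half:

#include <stddef.h>

struct lnode {
    int value;
    struct lnode *next;
};

static struct lnode *merge(struct lnode *a, struct lnode *b)
{
    struct lnode head, *tail = &head;
    while (a && b) {
        if (a->value <= b->value) { tail->next = a; a = a->next; }   /* <= keeps the sort stable */
        else                      { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return head.next;
}

struct lnode *merge_sort(struct lnode *list)
{
    if (!list || !list->next)
        return list;

    /* Split the list in two using slow/fast pointers. */
    struct lnode *slow = list, *fast = list->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    struct lnode *second = slow->next;
    slow->next = NULL;

    return merge(merge_sort(list), merge_sort(second));
}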
