# Fastest Math routines - Another Newbie Question

Started by September 7, 2009
I have am builidng a direct digital synthesizer and controlling the
synthesizer with a PIC 16F877A. The selection of frequencies are
determined by an optical decoder "tuning knob" and the frequencies
require 32 bit unsigned numbers. The math routines may be invoked 150
times per second as a tuning knob is turned rapidly. For each encoder
frequency tuning step I need to manipulate the 32 bit frequency value to
create various control words to the synthesizer, etc.

Therefore I need the fastest combination of math routines. I have
three main 'math' steps steps to to obtain these control numbers and I
can see at least two different approaches:

1) 8 bit number x 8 bit number multiplication
2) 8 bit number and 16 bit addition
3) 16 bit x 24 bit multiplication

or

1) 8 bit number x 32 bit number multiplication
2) 8 bit x 24 bit multiplcation
3) 24 bit + 32 bit addition

The first method intuitively seems would be fastest but I don;t want to
write or modify my own math routines and have not been able to find a
16b x 24b routine. Am I correct that using a 32b x 32b multiplication
routine for 16x24b numbers would still be many more clock cycles as I
would need to include the higher significant 8bit clusters with 0x00
dummy values to be included into the routine?

--- In p..., "Jerry O. Stern" wrote:
>
>
> I have am builidng a direct digital synthesizer and controlling the
> synthesizer with a PIC 16F877A. The selection of frequencies are
> determined by an optical decoder "tuning knob" and the frequencies
> require 32 bit unsigned numbers. The math routines may be invoked 150
> times per second as a tuning knob is turned rapidly. For each encoder
> frequency tuning step I need to manipulate the 32 bit frequency value to
> create various control words to the synthesizer, etc.
>
> Therefore I need the fastest combination of math routines. I have
> three main 'math' steps steps to to obtain these control numbers and I
> can see at least two different approaches:
>
>
> 1) 8 bit number x 8 bit number multiplication
> 2) 8 bit number and 16 bit addition
> 3) 16 bit x 24 bit multiplication
>
> or
>
> 1) 8 bit number x 32 bit number multiplication
> 2) 8 bit x 24 bit multiplcation
> 3) 24 bit + 32 bit addition
>
>
> The first method intuitively seems would be fastest but I don;t want to
> write or modify my own math routines and have not been able to find a
> 16b x 24b routine. Am I correct that using a 32b x 32b multiplication
> routine for 16x24b numbers would still be many more clock cycles as I
> would need to include the higher significant 8bit clusters with 0x00
> dummy values to be included into the routine?
>
>
> thanks in advance
>

But aren't the frequencies constant relative to a particular amount of rotation of the knob?

Calculate all possible frequency constants and put the results in a large serial EEPROM. It's pretty easy to get an SPI based 32 Mb flash (4 megabytes!). The SPI interface is very fast.

I use these http://www.digilentinc.com/Products/Detail.cfm?NavPath=2,401,489&Prod=PMOD-SF - the 32 Mb version. I also bought a few M25P32 chips from DigiKey.

If the knob is turning faster than you can fetch frequencies, ignore some counts. At some point it just isn't worth trying to keep up. What if someone is turning the knob with a DeWalt drill motor?

There is no point in calculating constants.

Richard
That's an interesting idea when I get past my first prototype. I need
to look how that would be implemented as I need a resolution of 1 hertz
and for all frequencies between 0 hz to 30,000,000 hz each frequency is
defined by two unique 32 bit numbers. If I understand you correctly I
would therefore need a table that could hold 60 million 32 bit numbers
or over 240 million bytes? I guess I could divide the table up into
various segments to have some sort of look-up algorithm that would cut
down look up time but that's a lot of flash memory to manage but I am
very new to this so perhaps I am missing something??

jerry

-----Original Message-----
From: p... [mailto:p...] On Behalf
Of rtstofer
Sent: Monday, September 07, 2009 8:08 PM
To: p...
Subject: [piclist] Re: Fastest Math routines - Another Newbie Question
--- In p..., "Jerry O. Stern" wrote:
>
>
> I have am builidng a direct digital synthesizer and controlling the
> synthesizer with a PIC 16F877A. The selection of frequencies are
> determined by an optical decoder "tuning knob" and the frequencies
> require 32 bit unsigned numbers. The math routines may be invoked 150

> times per second as a tuning knob is turned rapidly. For each encoder

> frequency tuning step I need to manipulate the 32 bit frequency value
> to create various control words to the synthesizer, etc.
>
> Therefore I need the fastest combination of math routines. I have
> three main 'math' steps steps to to obtain these control numbers and I

> can see at least two different approaches:
>
>
> 1) 8 bit number x 8 bit number multiplication
> 2) 8 bit number and 16 bit addition
> 3) 16 bit x 24 bit multiplication
>
> or
>
> 1) 8 bit number x 32 bit number multiplication
> 2) 8 bit x 24 bit multiplcation
> 3) 24 bit + 32 bit addition
>
>
> The first method intuitively seems would be fastest but I don;t want
> to write or modify my own math routines and have not been able to find

> a 16b x 24b routine. Am I correct that using a 32b x 32b
> multiplication routine for 16x24b numbers would still be many more
> clock cycles as I would need to include the higher significant 8bit
> clusters with 0x00 dummy values to be included into the routine?
>
>
> thanks in advance
>

But aren't the frequencies constant relative to a particular amount of
rotation of the knob?

Calculate all possible frequency constants and put the results in a
large serial EEPROM. It's pretty easy to get an SPI based 32 Mb flash
(4 megabytes!). The SPI interface is very fast.

I use these
http://www.digilentinc.com/Products/Detail.cfm?NavPath=2,401,489&Prod=PM
OD-SF - the 32 Mb version. I also bought a few M25P32 chips from
DigiKey.

If the knob is turning faster than you can fetch frequencies, ignore
some counts. At some point it just isn't worth trying to keep up. What
if someone is turning the knob with a DeWalt drill motor?

There is no point in calculating constants.

Richard

to unsubscribe, go to http://www.yahoogroups.com and follow the
instructions
The speeds you're talking about here should be no problem on a 40MIPS
device with a 8x8 HWmul.
I'm just guessing here but I think 32x32 @ 200Hz on a PIC18 is fully
possible.
But on a PIC16...hm...?
Jerry O. Stern skrev:
>
>
>
> I have am builidng a direct digital synthesizer and controlling the
> synthesizer with a PIC 16F877A. The selection of frequencies are
> determined by an optical decoder "tuning knob" and the frequencies
> require 32 bit unsigned numbers. The math routines may be invoked 150
> times per second as a tuning knob is turned rapidly. For each encoder
> frequency tuning step I need to manipulate the 32 bit frequency value
> to create various control words to the synthesizer, etc.
>
> Therefore I need the fastest combination of math routines. I have
> three main 'math' steps steps to to obtain these control numbers and I
> can see at least two different approaches:
>
>
> 1) 8 bit number x 8 bit number multiplication
> 2) 8 bit number and 16 bit addition
> 3) 16 bit x 24 bit multiplication
>
> or
>
> 1) 8 bit number x 32 bit number multiplication
> 2) 8 bit x 24 bit multiplcation
> 3) 24 bit + 32 bit addition
>
>
> The first method intuitively seems would be fastest but I don;t want
> to write or modify my own math routines and have not been able to find
> a 16b x 24b routine. Am I correct that using a 32b x 32b
> multiplication routine for 16x24b numbers would still be many more
> clock cycles as I would need to include the higher significant 8bit
> clusters with 0x00 dummy values to be included into the routine?
>
>
> thanks in advance
>
>
>

--
*******************************************

LAST UPDATED: 23/08/2003
*******************************************
Regards
Eirik Karlsen
Jerry,

Do you really need to calculate for every increment of the tuning knob? Why not keep track of the knob with an interupt routine and then calculate the values in a background routine based on the stored value of the knob. You should then be able to keep up with even the quickest changes.

dB

----- Original Message -----
From: Jerry O. Stern
To: p...
Sent: Tuesday, September 08, 2009 12:25 AM
Subject: [piclist] Fastest Math routines - Another Newbie Question
I have am builidng a direct digital synthesizer and controlling the synthesizer with a PIC 16F877A. The selection of frequencies are determined by an optical decoder "tuning knob" and the frequencies require 32 bit unsigned numbers. The math routines may be invoked 150 times per second as a tuning knob is turned rapidly. For each encoder frequency tuning step I need to manipulate the 32 bit frequency value to create various control words to the synthesizer, etc.

Therefore I need the fastest combination of math routines. I have three main 'math' steps steps to to obtain these control numbers and I can see at least two different approaches:
1) 8 bit number x 8 bit number multiplication
2) 8 bit number and 16 bit addition
3) 16 bit x 24 bit multiplication

or

1) 8 bit number x 32 bit number multiplication
2) 8 bit x 24 bit multiplcation
3) 24 bit + 32 bit addition
The first method intuitively seems would be fastest but I don;t want to write or modify my own math routines and have not been able to find a 16b x 24b routine. Am I correct that using a 32b x 32b multiplication routine for 16x24b numbers would still be many more clock cycles as I would need to include the higher significant 8bit clusters with 0x00 dummy values to be included into the routine?
dB,

Sounds like a great idea, if the tuning knob is being moved so quickly
no need to do a calculation for every encoder tic. On the other hand I
want to get that nice feel and look of spinning the tuning knob and the
LCD showing the frequency scroll up or down smoothly. I guess some time
function would have to be included so that when the encoder is being
turned slowly each math routine frequency update is shown while beyond a
certain encoder speed perhaps every 5th or 10th encoder tic needs to be
updated. Some of the digital synthesizers that others have designed
(like Craig Johnson, AA0ZZ) have two PIC 16F877A's where the encoder and
math and LCD routines are split up between the PICs and use an interrupt
method but the math routines per encoder tic still need to be done the
same brute way for each 1 hz change in the encoder. I have to dig more
deeply into his code to understand whether the interrupts have these
kind of tuning knob speed adjustments.

jerry

-----Original Message-----
From: p... [mailto:p...] On Behalf
Of BlueStax
Sent: Tuesday, September 08, 2009 4:10 AM
To: p...
Subject: Re: [piclist] Fastest Math routines - Another Newbie Question

Jerry,

Do you really need to calculate for every increment of the tuning knob?
Why not keep track of the knob with an interupt routine and then
calculate the values in a background routine based on the stored value
of the knob. You should then be able to keep up with even the quickest
changes.

dB

----- Original Message -----
From: Jerry O. Stern
To: p...
Sent: Tuesday, September 08, 2009 12:25 AM
Subject: [piclist] Fastest Math routines - Another Newbie Question

I have am builidng a direct digital synthesizer and controlling the
synthesizer with a PIC 16F877A. The selection of frequencies are
determined by an optical decoder "tuning knob" and the frequencies
require 32 bit unsigned numbers. The math routines may be invoked 150
times per second as a tuning knob is turned rapidly. For each encoder
frequency tuning step I need to manipulate the 32 bit frequency value to
create various control words to the synthesizer, etc.

Therefore I need the fastest combination of math routines. I have
three main 'math' steps steps to to obtain these control numbers and I
can see at least two different approaches:

1) 8 bit number x 8 bit number multiplication
2) 8 bit number and 16 bit addition
3) 16 bit x 24 bit multiplication

or

1) 8 bit number x 32 bit number multiplication
2) 8 bit x 24 bit multiplcation
3) 24 bit + 32 bit addition

The first method intuitively seems would be fastest but I don;t want to
write or modify my own math routines and have not been able to find a
16b x 24b routine. Am I correct that using a 32b x 32b multiplication
routine for 16x24b numbers would still be many more clock cycles as I
would need to include the higher significant 8bit clusters with 0x00
dummy values to be included into the routine?

--- In p..., "Jerry O. Stern" wrote:
>
> dB,
>
> Sounds like a great idea, if the tuning knob is being moved so quickly
> no need to do a calculation for every encoder tic. On the other hand I
> want to get that nice feel and look of spinning the tuning knob and the
> LCD showing the frequency scroll up or down smoothly. I guess some time
> function would have to be included so that when the encoder is being
> turned slowly each math routine frequency update is shown while beyond a
> certain encoder speed perhaps every 5th or 10th encoder tic needs to be
> updated. Some of the digital synthesizers that others have designed
> (like Craig Johnson, AA0ZZ) have two PIC 16F877A's where the encoder and
> math and LCD routines are split up between the PICs and use an interrupt
> method but the math routines per encoder tic still need to be done the
> same brute way for each 1 hz change in the encoder. I have to dig more
> deeply into his code to understand whether the interrupts have these
> kind of tuning knob speed adjustments.
>
> jerry
>

Do you even need to multiply? If the last calculation was m * n and the next calculation is m * (n+1), why not just add the increment 'm' to the previous result. Multi-byte adds are pretty quick. There could be a small table of increments - small increments for small steps and large increments for large steps.

Then there are the external math co-processors. This one can do a 32 bit multiply in 210 uS and if you add another couple of hundred uS for transfer, you get something like 500 uS, So it could do about 2000 multiplies per second. Give or take... It's actually capable of doing some pretty sophisticated stuff given the internal register set.

If the resolution is 1 Hz per step, it's going to take a long time to change a MHz or so. I suspect that the step value will be somewhat related to the step rate such that when the knob is turned quickly, larger steps are taken.

I found the project code on the Internet but I didn't study it.

Richard
Jerry,
If you use an interrupt driven encoder routine then this comes more or
less by itself...
on interrupt you disable further interrupts and do all the math, and as
soon as the time consuming
math is done you re-enable encoder interrupts.
I have done this (RF synth and servo/remote positioning systems) and it
works great; at slow to
medium encoder speed each encoder 'tick' is fully processed, at higher
speeds more and more
ticks will be skipped due to lack of time but this was never noticeable,
at least not in my apps which
ran up to some 200Hz loop frequency.
Jerry O. Stern skrev:
>
>
> dB,
>
> Sounds like a great idea, if the tuning knob is being moved so quickly
> no need to do a calculation for every encoder tic. On the other hand
> I want to get that nice feel and look of spinning the tuning knob and
> the LCD showing the frequency scroll up or down smoothly. I guess
> some time function would have to be included so that when the encoder
> is being turned slowly each math routine frequency update is shown
> while beyond a certain encoder speed perhaps every 5th or 10th encoder
> tic needs to be updated. Some of the digital synthesizers that others
> have designed (like Craig Johnson, AA0ZZ) have two PIC 16F877A's where
> the encoder and math and LCD routines are split up between the PICs
> and use an interrupt method but the math routines per encoder tic
> still need to be done the same brute way for each 1 hz change in the
> encoder. I have to dig more deeply into his code to understand
> whether the interrupts have these kind of tuning knob speed adjustments.
>
> jerry
>

--
*******************************************

LAST UPDATED: 23/08/2003
*******************************************
Regards
Eirik Karlsen
Eric,

That won't work. If you disable the interrupts to do slow maths you take the risk of missing fast encoder pulses and also prevent the interrupts from being used for any other purpose. Just use the interrupt to update the encoder count in register(s) and then do the calcs at your leisure. You don't need to calculate for every encoder pulse. This means that you can spin the encoder real fast for big changes and not be held back by the calculation speed.

I fully take your point that missed ticks won't be noticed but keeping the calcs out of the interrupt routine is a better solution especially if you need to use the interrupts for other purposes such as timing.

Jerry,

If you update the LCD after every calc there should be no problem with perception of 'reactiveness'. Just do the calcs in a background loop when nothing else is going on. You'll be amazed at what performance you can get out of even the slowest PICs by doing it this way.

dB.

----- Original Message -----
From: Eirik Karlsen
To: p...
Sent: Tuesday, September 08, 2009 6:08 PM
Subject: Re: [piclist] Fastest Math routines - Another Newbie Question
Jerry,
If you use an interrupt driven encoder routine then this comes more or less by itself...
on interrupt you disable further interrupts and do all the math, and as soon as the time consuming
math is done you re-enable encoder interrupts.
I have done this (RF synth and servo/remote positioning systems) and it works great; at slow to
medium encoder speed each encoder 'tick' is fully processed, at higher speeds more and more
ticks will be skipped due to lack of time but this was never noticeable, at least not in my apps which
ran up to some 200Hz loop frequency.
Jerry O. Stern skrev:

Change settings via the Web (Yahoo! ID required)
Change settings via email: Switch delivery to Daily Digest | Switch format to Traditional
a.. 9New Members
Give Back
Yahoo! for Good

Get inspred

by a good cause.

Y! Toolbar
Get it Free!

easy 1-click access

Yahoo! Groups
Start a group

in 3 easy steps.

Connect with others.
.
--- In p..., "BlueStax" wrote:
>
> Eric,
>
> That won't work. If you disable the interrupts to do slow maths you take the risk of missing fast encoder pulses and also prevent the interrupts from being used for any other purpose. Just use the interrupt to update the encoder count in register(s) and then do the calcs at your leisure. You don't need to calculate for every encoder pulse. This means that you can spin the encoder real fast for big changes and not be held back by the calculation speed.
>
> I fully take your point that missed ticks won't be noticed but keeping the calcs out of the interrupt routine is a better solution especially if you need to use the interrupts for other purposes such as timing.
>
> Jerry,
>
> If you update the LCD after every calc there should be no problem with perception of 'reactiveness'. Just do the calcs in a background loop when nothing else is going on. You'll be amazed at what performance you can get out of even the slowest PICs by doing it this way.

As a rule disabling interrupts should be avoided if at all possible. It might get you out of trouble in the short term but sooner or later it will cause you problems.

Just to clarify:

(1) ... humans can only recognise display updates very slowly compared to the speed at which an MCU can perform them (try looking at a stop watch counting in 10ths of a second)

(2) ... the interrupt routine needs to execute very frequently if it is to catch all events from a rotary encoder (hundreds of times a second for a mechanical encoder being spun quickly)

(3) ... the absolute position of the encoder may be IMPORTANT in some applications so missing events will not be acceptable for these apps (think about the user turning the dial ten times quickly to get near to the value he wants then slowly to get to the exact value - he would want to be able to repeat the process and get the same result each time, not sometimes need to turn the dial 10 times and sometimes 11 times - think of using a finger on the dial to drag the dial around)

(4) ... if the number crunching is happening in the background and the dial update in the interrupt routine, it wont mater how long it takes to crunch the numbers as long as you take a snapshot of the value to be crunched and only crunch that and not the variable that is being updated by the interrupt routine.

(5) ... number crunching in the background will proceed as quickly as it can and dial event counting will also proceed as quickly as it can. This means that if it takes a long time to process the current dial position (lets exaggerate and say 0.5 seconds) and during this time the dial has been turned 10 units (detentes) then the next display update will skip the in between dial updates and just catch the 10th unit. This ends up looking like the display is being updated every 0.5 seconds (if we continue to assume 0.5s number crunching). You may even find that you need to slow down the display refresh rate because the number crunching is way to fast!

(6) ... when you take a snapshot of the dial variable make sure it is not modified by the number crunching or display update process. Next time you go to process the dial variable (in the background) check to see if it has actually changed since the last time you processed it (compare the dial variable with the snapshot). If it hasn't changed then there is no reason to do any number crunching or a display update.

Regards
Sergio Masci