newbie: FIFOs and correlation algorithm in C for TI DSPs

Started by sebastian August 17, 2004
Hi,

I'm a newbie in DSP and embedded programming, and I have some questions that
I hope some of you could kindly answer.

I have to do some profiling to compare DSP and ASIC/FPGA solutions. I'm an
FPGA guy, so I'm lost with this DSP thing.

1)
I'd like to know what you think about the code I wrote, because I
feel it's not very "DSP optimized". I need to implement some sort of
shift register or FIFO, because I have to calculate a correlation, so
here's how I do it (like I'd do in VHDL...).

The input coming from an ADC is 10 bits wide, so it'll be stored in a
"short".

#include....

#define FifoSize(x) ((x)+1)

// FIFO declarations
short FifoN[FifoSize(gN)];
short *FifoN_ptr_write = &FifoN[0];
short *FifoN_ptr_read = &FifoN[1];
short *FifoN_end_ptr = &FifoN[gN];

short input;
int multN;

// check for the end of FIFO (write), if yes, then wrap around
if (FifoN_ptr_write == FifoN_end_ptr)
{
     FifoN_ptr_write = &FifoN[0];
}

// write to the FIFO
*(FifoN_ptr_write++) = input;
	
// check for the end of FIFO (read), if yes, then wrap around
if (FifoN_ptr_read == FifoN_end_ptr)
  {
    FifoN_ptr_read = &FifoN[0];
  }

// calculate input * FIFO_output
multN = input * (*FifoN_ptr_read++);

....

EchR = EchR + multN - (*FifoD1_ptr_read++);


Is this the best way to do it, or is there a better way to implement
the FIFOs?
Should I "fix" the result to 20 bits, discarding the first 12? (I guess
that's not necessary.)
Will the memory used by the FIFO effectively be held in the cache?
Can I use three operands on the right-hand side of the assignment, or
should I split the expression?

2)
How about the ADC doing a DMA transfer to RAM, and then the DSP reading
a whole chunk of data while the ADC performs the next DMA? It is
possible, I guess.

3)
Are there online tutorials or coding style guidelines for C for DSP?
sebastian wrote:

> hi,
>
> im a newbie in DSP and embedded programming and i've some doubts that
> i hope some of you could kindly answer.
>
> i've to do some profiling between DSP and ASIC/FPGA solutions. Im a
> FPGA guy so im lost with this DSP thing.
>
> 1)
> i'd like to know what do you think about the code i wrote, cause i
> feel it's not very "DSP optimized". I need to implement some sort of
> shiftregister or FIFO, because im to calculate a correlation, so
> here's how i do it. (like i'd do in vhdl...)

FIFOs are generally implemented as circular buffers. Your C code appears to do that, but ignores the chip's hardware support for circular buffers. Once the circular data buffer is set up -- you do that once at initialization -- a correlation using the MAC* instruction takes fewer than half a dozen op codes. If your compiler doesn't have a macro to do that, use in-line assembly.

Jerry
_________________________
* Multiply And Accumulate.
--
... the worst possible design that just meets the specification - almost
a definition of practical engineering. .. Chris Bore

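As a rough illustration of the circular-buffer-plus-MAC idea, here is a minimal sketch in portable C. It does not use the chip's hardware circular addressing, which is not reachable from standard C; instead it assumes a power-of-two buffer length so the wrap becomes a bit mask rather than a branch. The names `delay_line_t`, `delay_push`, `delay_correlate` and the length `DELAY_LEN` are made up for this example and do not come from the thread or from any TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DELAY_LEN  8u                 /* hypothetical length; must be a power of two */
#define DELAY_MASK (DELAY_LEN - 1u)

typedef struct {
    int16_t  buf[DELAY_LEN];          /* 10-bit ADC samples held in 16-bit words */
    uint32_t head;                    /* index of the next slot to write */
} delay_line_t;

/* Store the newest sample, overwriting the oldest one. */
static void delay_push(delay_line_t *d, int16_t x)
{
    d->buf[d->head] = x;
    d->head = (d->head + 1u) & DELAY_MASK;   /* wrap without a branch */
}

/* Multiply-accumulate the stored samples, newest first,
   against ref[0] .. ref[DELAY_LEN - 1]. */
static int32_t delay_correlate(const delay_line_t *d, const int16_t *ref)
{
    assert(d != NULL && ref != NULL);

    int32_t  acc = 0;
    uint32_t idx = d->head;
    for (uint32_t k = 0; k < DELAY_LEN; ++k) {
        idx = (idx - 1u) & DELAY_MASK;       /* step back one sample */
        acc += (int32_t)d->buf[idx] * ref[k];
    }
    return acc;
}
```

The loop body is a single multiply-accumulate with no conditional wrap test, which is the shape an optimizing DSP compiler has the best chance of pipelining. Whether it actually maps onto the hardware circular addressing mode depends on the compiler and the part.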
What specific DSP are you using?

1)  In my opinion the two most important things that distinguish a DSP from
a microprocessor are the ability to do a single cycle multiply-accumulate
(MAC) and also the ability to do circular buffering in hardware.  From C you
may not be able to take advantage of either of those.  However, you may not
need to.  Most math-intensive algorithms already have C-callable assembly
functions that will very optimally do your processing.  If there is not a
specific "correlation" function you could always use a convolution function
or an FIR filter function since all 3 of these are doing the same math.

2)  Yes, you can have the DMA fill one buffer while you process another.
This is referred to as "double buffering" or sometimes "ping-pong buffers."
This is very common and the most efficient way to do your processing.
Furthermore, if your ADC is a stereo ADC you can use the DMA to sort your
data into a "left" buffer and "right" buffer.  That way you don't need to
expend CPU cycles sorting it.  By having the DMA buffer up more than one
sample between interrupts you increase your efficiency because you have
fewer cycles lost doing context switches.  You also get efficiency gains due
to pipelining in very tight filter loops, etc.

3)  I don't know that there is any specific guide for writing C code for
DSPs.  Your C code looks good so I wouldn't worry about it.  Just follow
fundamental programming rules, as you already seem to be doing.  A more
general thing to think about as you're coding is that DSPs excel at doing
math, not at jumping/branching all over the place in the code.
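To make the remark in point 1 concrete (correlation, convolution and FIR filtering being the same math), here is a small sketch for real-valued data. It assumes nothing about any vendor library: `fir_at`, `corr_at` and `reverse_taps` are hypothetical helper names. The point it shows is that an FIR routine fed with the reference sequence reversed in time produces correlation values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* FIR / convolution form: y[n] = sum over k of h[k] * x[n - k].
   Only valid for n >= taps - 1, so x[n - k] never runs off the front. */
static int32_t fir_at(const int16_t *x, size_t n, const int16_t *h, size_t taps)
{
    assert(n + 1 >= taps);

    int32_t acc = 0;
    for (size_t k = 0; k < taps; ++k)
        acc += (int32_t)h[k] * x[n - k];
    return acc;
}

/* Correlation form: r[m] = sum over k of ref[k] * x[m + k]. */
static int32_t corr_at(const int16_t *x, size_t m, const int16_t *ref, size_t taps)
{
    int32_t acc = 0;
    for (size_t k = 0; k < taps; ++k)
        acc += (int32_t)ref[k] * x[m + k];
    return acc;
}

/* Time-reverse the reference so that an FIR routine computes a correlation:
   fir_at(x, m + taps - 1, h, taps) == corr_at(x, m, ref, taps). */
static void reverse_taps(const int16_t *ref, int16_t *h, size_t taps)
{
    for (size_t k = 0; k < taps; ++k)
        h[k] = ref[taps - 1 - k];
}
```

For complex (I/Q) data the reference would also need to be conjugated, which this sketch leaves out.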

Brad
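
The double-buffering ("ping-pong") scheme from point 2 can be sketched in plain C as follows. This is only the bookkeeping, under the assumption that some DMA-complete interrupt calls `pingpong_swap`; the DMA setup itself is device-specific and not shown. All names (`pingpong_t`, `pingpong_dma_target`, `pingpong_swap`, `block_energy`, `BLOCK_LEN`) are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_LEN 4u   /* hypothetical block size, in samples */

typedef struct {
    int16_t  buf[2][BLOCK_LEN];   /* "ping" and "pong" */
    unsigned fill;                /* index of the buffer the DMA is filling */
} pingpong_t;

/* Buffer the DMA should write into next. */
static int16_t *pingpong_dma_target(pingpong_t *p)
{
    return p->buf[p->fill];
}

/* Meant to be called when a DMA block completes: swap the roles and
   return the buffer that was just filled, now ready for processing. */
static const int16_t *pingpong_swap(pingpong_t *p)
{
    const int16_t *ready = p->buf[p->fill];
    p->fill ^= 1u;
    return ready;
}

/* Example of per-block work: sum of squares (energy) of one block. */
static int32_t block_energy(const int16_t *blk, size_t n)
{
    assert(blk != NULL);

    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc += (int32_t)blk[i] * blk[i];
    return acc;
}
```

The CPU only ever touches the buffer returned by `pingpong_swap`, while the DMA writes into the other one. On a real part the shared state would also need attention to `volatile` access and cache coherency, which this sketch ignores.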

"Brad Griffis" <bradgriffis@hotmail.com> wrote in message news:<fxzUc.1838$1Z3.1477@newssvr32.news.prodigy.com>...
> What specific DSP are you using?
>
> 1) In my opinion the two most important things that distinguish a DSP from
> a microprocessor are the ability to do a single cycle multiply-accumulate
> (MAC) and also the ability to do circular buffering in hardware. From C you
> may not be able to take advantage of either of those. However, you may not
> need to. Most math-intensive algorithms already have C-callable assembly
> functions that will very optimally do your processing. If there is not a
> specific "correlation" function you could always use a convolution function
> or an FIR filter function since all 3 of these are doing the same math.

Thanks, I'll search for them and take a look.

> 2) Yes, you can have the DMA fill one buffer while you process another.
> This is referred to as "double buffering" or sometimes "ping-pong buffers."
> This is very common and the most efficient way to do your processing.
> Furthermore, if you ADC is a stereo ADC you can use the DMA to sort your
> data into a "left" buffer and "right" buffer. That way you don't need to
> expend CPU cycles sorting it. By having the DMA buffer up more than one
> sample between interrupts you increase your efficiency because you have
> fewer cycles lost doing context switches. You also get efficiency gains due
> to pipelining in very tight filter loops, etc.

Well, it's stereo because I have real and imaginary parts to treat :) though I'm talking about 100 MSPS, not audio stuff. Do you think I could fit baseband signal processing, FFT, etc. in an 1800 MIPS DSP? Because so far I don't see it happening.

> 3) I don't know that there is any specific guide for writing C code for
> DSPs. Your C code looks good so I wouldn't worry about it. Just following
> fundamental programming rules like you seem to already be doing. A more
> general thing to think about as you're coding is that DSPs excel at doing
> math, not at jumping/branching all over the place in the code.

Thanks, I'll keep that in mind and try to "inline" whenever possible.

> Brad

"sebastian" <malaka@email.it> wrote in message
news:6aefd6be.0408180424.52ccdf9e@posting.google.com...
> "Brad Griffis" <bradgriffis@hotmail.com> wrote in message
> news:<fxzUc.1838$1Z3.1477@newssvr32.news.prodigy.com>...
> > What specific DSP are you using?
> >
> > 1) In my opinion the two most important things that distinguish a DSP from
> > a microprocessor are the ability to do a single cycle multiply-accumulate
> > (MAC) and also the ability to do circular buffering in hardware. From C you
> > may not be able to take advantage of either of those. However, you may not
> > need to. Most math-intensive algorithms already have C-callable assembly
> > functions that will very optimally do your processing. If there is not a
> > specific "correlation" function you could always use a convolution function
> > or an FIR filter function since all 3 of these are doing the same math.
>
> thanks, i'll search for them and take a look at it
>
> > 2) Yes, you can have the DMA fill one buffer while you process another.
> > This is referred to as "double buffering" or sometimes "ping-pong buffers."
> > This is very common and the most efficient way to do your processing.
> > Furthermore, if you ADC is a stereo ADC you can use the DMA to sort your
> > data into a "left" buffer and "right" buffer. That way you don't need to
> > expend CPU cycles sorting it. By having the DMA buffer up more than one
> > sample between interrupts you increase your efficiency because you have
> > fewer cycles lost doing context switches. You also get efficiency gains due
> > to pipelining in very tight filter loops, etc.
>
> well it's stereo cause i've real and imaginary parts to treat :)
> though im talking of 100MSPS, not audio stuff. You think i could fit
> baseband signal processing, FFT, etc, in a 1800MIPS DSP? cause so far
> i dont see it happening.

[BG] Not a chance! If you want to do processing at 100 MSPS you'll need a very powerful DSP. If you're doing communications you may want a digital down-converter chip between the ADC and the DSP. That chip will do a lot of the heavy-duty filtering and demodulation to get you down to baseband, which leaves a more reasonable data rate for you to deal with in the DSP. You said you're an FPGA guy, so why not put in a small, relatively inexpensive FPGA to do the first part of the work and then do the rest on a DSP?

I don't know if this will be of interest or not, but TI's 6416 DSP comes in speeds up to 1 GHz and has coprocessors for Turbo and Viterbi decoding. I think you can buy a 6416 DSP Starter Kit (DSK) from Spectrum Digital for $400 or so.

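The "not a chance" verdict follows from simple arithmetic: 1800 MIPS divided by 100 MSPS leaves at most 18 instructions per input sample, and that is a peak figure. The sketch below just does that division and shows how a front-end decimation stage changes the budget. The function name `instr_per_output_sample` and the decimation factor of 16 are assumptions picked for illustration, not numbers from the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Peak instructions available per sample that reaches the DSP, rounded down.
   decimation is the rate reduction done in front of the DSP (1 means none). */
static uint64_t instr_per_output_sample(uint64_t instr_per_sec,
                                        uint64_t samples_per_sec,
                                        uint32_t decimation)
{
    assert(samples_per_sec != 0 && decimation != 0);
    return (instr_per_sec * decimation) / samples_per_sec;
}
```

With the figures mentioned above, `instr_per_output_sample(1800000000u, 100000000u, 1)` gives 18, which is too little for filtering plus an FFT. If a down-converter or FPGA decimated by a hypothetical factor of 16 first, the same call with `16` gives 288 instructions per sample.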
> > 3) I don't know that there is any specific guide for writing C code for
> > DSPs. Your C code looks good so I wouldn't worry about it. Just following
> > fundamental programming rules like you seem to already be doing. A more
> > general thing to think about as you're coding is that DSPs excel at doing
> > math, not at jumping/branching all over the place in the code.
>
> thanks, i'll keep that in mind and try to "inline" whenever it's
> possible

[BG] I think the compiler will often do this for you if you turn on optimization and set an inlining threshold. It's a tradeoff between code size and execution speed...

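For what it's worth, the usual way to hint at inlining from C source is a small `static inline` helper, as in the sketch below. The names `mac16` and `dot16` are made up for the example. Whether the call really disappears still depends on the compiler and its optimization settings, so treat this as a hint to the compiler rather than a guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One multiply-accumulate step; small enough that a compiler with
   optimization enabled will normally expand it in place. */
static inline int32_t mac16(int32_t acc, int16_t a, int16_t b)
{
    return acc + (int32_t)a * b;
}

/* Dot product built on the helper: after inlining, the loop body
   contains no function call and no extra branch. */
static int32_t dot16(const int16_t *a, const int16_t *b, size_t n)
{
    assert(a != NULL && b != NULL);

    int32_t acc = 0;
    for (size_t i = 0; i < n; ++i)
        acc = mac16(acc, a[i], b[i]);
    return acc;
}
```

The code-size cost mentioned above comes from the helper body being copied into every call site.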
"Brad Griffis" <bradgriffis@hotmail.com> wrote in message news:<rXIUc.3219$ZC7.1477@newssvr19.news.prodigy.com>...
> [BG] Not a chance! If you want to do processing at 100 MSPS you'll need a
> very powerful DSP. If you're doing communications you may want a digital
> down-converter chip between the ADC and the DSP. That chip will do a lot of
> the heavy duty filtering and demodulation to get you down to baseband. That
> will leave a more reasonable data rate for you to deal with in the DSP. You
> said you're an FPGA guy so why not put in a small relatively inexpensive
> FPGA to do the first part of the work and then do the rest on a DSP.

Thanks for the suggestion, but what made the DSP attractive was its FP capabilities. Using the FPGA, I would implement my current algorithms (not that there's anything wrong with that! :) ), but they are "limited" in precision.

> I don't know if this will be of interest or not, but TI's 6416 DSP comes in
> speeds up to 1 GHz and has coprocessors for Turbo and Viterbi decoding. I
> think you can buy a 6416 DSP Starter Kit (DSK) from spectrum digital for
> $400 or so.

Thanks! I saw it on TI's site too. It's very interesting for hobby use; too bad it has only audio codecs and not video.

> [BG] I think the compiler will often do this for you if you turn on
> optimization and set a inlining threshold. It's a tradeoff between code
> size and execution speed...

I need speed :)