
Strange problem (crossworks/LPC2366)

Started by drproton2003 May 15, 2008
Sorry for the vague title, but I don't know how to describe this
problem in a few words. I've encountered a very unusual problem
today. After adding some code to an existing project the processor
hangs, usually very quickly. The processor stops in a place that
should have nothing to do with my new code; rather, it stops in a
section (according to the link register) that has been in place for a
long time.

Now it starts to get weird. This problem only seems to occur on one
board (so far I've only tried it on two), and only with the relevant
file set to Level 3 optimization. If I set the file to None, it runs
just fine; I haven't tried other settings yet. I've tracked the
problem down to one specific call to this function:

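inline int SSP1_write_word(short int data)
{
    int ReturnVal;
    SSP1DR = data;                        /* start the transfer */
    while ((SSP1SR & 0x04) == 0x00) {}    /* wait until the receive FIFO is not empty */
    while ((SSP1SR & 0x10)) {}            /* wait while the SSP is busy */
    ReturnVal = SSP1DR;                   /* read the received frame */
    return ReturnVal;
}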

This function is used in several other locations, none of which create
a problem. It also doesn't matter whether or not it is inline. As a
test I moved the function to another file and set both to level 3
optimization. It works just fine, and builds to a slightly smaller
size (900 bytes) than with the function in the same file. This has me
stumped. What could be the cause of some of these issues?

I'm having a hard time figuring out what is going on here. This is
complicated by the fact that with the problem only occurring with
level 3 optimization I can't use the debugger for too much. Any input
is appreciated.


Turn the optimisation off

'Optimisation is a root of evil' (quote)

If you want to figure out what is happening, take a look at the asm
output from your compiler with and without optimisation and compare
the difference. That might give you some clues.

I don't really know, sorry. Perhaps it is something to do with the
cast from short into SSP1DR?
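
On the cast question, a small illustration may help. In C, assigning a negative short to a 32-bit location sign-extends it, so the upper 16 bits become ones. Whether that has anything to do with the hang is not established in this thread; the helper names below are hypothetical and only show the difference between a plain conversion and a masked one.

```c
#include <stdint.h>

/* Hypothetical helper (not from the original post): keep only the low
 * 16 bits of the value before it is widened to a 32-bit word. */
static uint32_t frame_from_short(short data)
{
    return (uint32_t)(uint16_t)data;
}

/* What a plain assignment to a 32-bit word does: the short is first
 * promoted to int (sign-extended), then converted to unsigned. */
static uint32_t frame_sign_extended(short data)
{
    return (uint32_t)data;
}
```

For data = -1 the first helper yields 0x0000FFFF and the second 0xFFFFFFFF; for non-negative values the two agree.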

Regards
Jimbo ( Reading UK )
Hi,

Have you thought about the MAM (Memory Accelerator Module)? Seems like
this has bitten quite a few of our customers lately. Also, are you using
8-bit frames or larger?

Some observations.

First, the high-level code, lightly re-formatted:

    SSP1DR = data;
    while ((SSP1SR & SSP_ReceiverNotEmpty) == 0x00)
        ;
    while (SSP1SR & SSP_Busy)
        ;
    return SSP1DR;

There's no need to test SSP_Busy once the receiver is not empty. When the
receiver isn't empty you're assured that you have something to read *and*,
because you feed the SSP one frame at a time, that the Tx FIFO is empty.

This code can be cut to:

    SSP1DR = data;
    while ((SSP1SR & SSP_ReceiverNotEmpty) == 0x00)
        ;
    return SSP1DR;

This is exactly what I have in some of my code, but also I have code that
will transmit and receive using the whole FIFO which is highly efficient.
I've run this at the maximum speed I can to SD and MMC cards and it works
like a charm even at optimization level 3 on a multitude of evaluation
boards I have here.
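
As a minimal sketch of the reduced loop, here is a variant that also gives up after a bounded number of polls, so a missing frame shows up as an error code rather than a hang. The register-pointer struct, the function name, and the poll budget are assumptions for illustration and are not taken from the code discussed in this thread; on the target the pointers would refer to SSP1DR and SSP1SR from the device header.

```c
#include <stdint.h>

/* Status bit 2: receive FIFO not empty (the 0x04 mask in the original code). */
#define SSP_RNE (1u << 2)

/* Hypothetical register view, passed in so the logic can be exercised
 * off-target with ordinary variables. */
typedef struct {
    volatile uint32_t *dr;   /* data register   */
    volatile uint32_t *sr;   /* status register */
} ssp_regs_t;

/* Write one frame, then wait for the matching received frame.
 * Returns the received frame (0..0xFFFF), or -1 if the receive FIFO
 * stays empty for the whole polling budget. */
static int32_t ssp_transfer_word(const ssp_regs_t *ssp, uint16_t data,
                                 uint32_t max_polls)
{
    *ssp->dr = data;                        /* start the transfer */
    while ((*ssp->sr & SSP_RNE) == 0) {     /* no frame received yet */
        if (max_polls == 0) {
            return -1;                      /* budget exhausted */
        }
        max_polls--;
    }
    return (int32_t)(*ssp->dr & 0xFFFFu);   /* read the received frame */
}
```

The volatile qualifiers matter at higher optimization levels: without them the compiler is free to read the status register once and spin forever on the cached value.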

Regards,

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for ARM, MSP430, AVR, MAXQ, and now Cortex-M3 processors

I modified my function as you suggested and it seems to work.
However, I don't think this was the "real" problem. I discovered
another interesting thing. When I commented out a portion of my
recently added code the problem went away, even though the code in
question was never being called. This section of code was located
around 0xDDXX, the section of code that crashes is located in the
0x100XX region. I believe the crashing code is called from addresses
less than 0x10000. Is it possible that the wraparound from 0xFFFF to
0x10000 creates a problem? I tried setting the file to use long
calls, but this made no difference.


Yeah, and err, have you eliminated the possibility of tripping over the MAM
bug? It looks like this is hitting more customers now.

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for ARM, MSP430, AVR, MAXQ, and now Cortex-M3 processors

Hi

I also discovered something weird (only on some devices from a batch of 100 pcs).

I could not read data from an SPI memory device, and changing the SPI clock speed did not help.

Now here it comes: when I just declared a global volatile variable the problem went away!
I could not find any explanation for this.

Disabling the MAM or setting its clock count to 3 solved the problem.

MAM seems to bite many people...
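
For reference, a minimal sketch of the two workarounds mentioned above (MAM off, or MAM timing set to 3 fetch cycles). The register addresses are the MAMCR and MAMTIM locations as I recall them from the LPC23xx user manual and should be checked against the manual or the device header; the function name and the mode constants are hypothetical.

```c
#include <stdint.h>

/* Assumed addresses; verify against the user manual or device header. */
#define MAMCR_ADDR   0xE01FC000u
#define MAMTIM_ADDR  0xE01FC004u

/* MAMCR mode values: 0 = disabled, 1 = partially enabled, 2 = fully enabled. */
enum { MAM_OFF = 0, MAM_PARTIAL = 1, MAM_FULL = 2 };

/* Set the flash fetch timing and the MAM mode. The MAM is switched off
 * first so the timing is never changed while it is enabled. Register
 * pointers are passed in so the sequence can be checked off-target. */
static void mam_configure(volatile uint32_t *mamcr, volatile uint32_t *mamtim,
                          uint32_t mode, uint32_t fetch_cycles)
{
    *mamcr  = MAM_OFF;        /* disable before changing the timing */
    *mamtim = fetch_cycles;   /* e.g. 3 fetch cycles */
    *mamcr  = mode;           /* leave off, or re-enable */
}

/* On the target this might be called as:
 *   mam_configure((volatile uint32_t *)MAMCR_ADDR,
 *                 (volatile uint32_t *)MAMTIM_ADDR, MAM_OFF, 3);
 */
```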

johan


That IS weird!

I thought the MAM issue depended on code and its placement.
So, assuming the same app on all boards, you would expect to see the problem
either on all boards or on none, not just on a few.

I wouldn't rule out problems other than MAM!

~ Paul Claessen


Hi,

> I thought the MAM issue depended on code and its placement.
> So, assuming the same app on all boards, you would expect to see the
> problem either on all boards or on none, not just on a few.

I believe that many others have reported seeing MAM problems on *some*
boards but not others with identical code and the same chip revision.

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
CrossWorks for ARM, MSP430, AVR, MAXQ, and now Cortex-M3 processors

I can absolutely confirm what Paul said.

Jeff
