How to reliably deal with UART errors

Started by Mike Harrison July 17, 2007
I've been having some problems reliably coping with UART receive errors on the LPC2136/01.
I'm reading a stream of DMX data (250 kbit/s, 8N2, 88 us break indicating frame sync).
The problem seems to be that when the stream is present at startup (i.e. before the interrupt code
is set up), the UART can get itself into a state that I can't recover from, generally permanent
overrun errors. If it gets in sync to start with, it works fine.

What I typically see is that I get UART rx interrupts with timing as expected WRT the incoming
data, but the OE bit is always set regardless of the IRQ code's attempts to empty the FIFO.

On most other micros, there is a 'UART enable' bit which can be turned off and on to clear all UART
errors. The LPC doesn't have this, and the UM is not too clear on how exactly certain errors
are cleared. It generally states that reading the LSR clears the error, but it is not clear whether you
first have to clear the underlying cause of that error to avoid it recurring, e.g. for an overrun
error, do you need to empty the FIFO before re-reading LSR to clear the error?

Using the fifo reset bit in FCR on errors doesn't seem to help.

Here is the rx int code: (only receive ints are enabled)

__irq __arm void u0int(void)
{
    static char itemp, itemp2;

    switch (U0IIR & 0x0f) {
    case 4:   // data available
    case 12:  // timeout
        itemp = U0LSR;
        // good data available (RDR set, no error flagged in FIFO) - read until fifo empty
        while ((itemp & 0x81) == 1) {
            itemp2 = U0RBR;
            itemp = U0LSR;
            if (dmxptr < 512)
                dmxbuf[dmxptr++] = itemp2;
        }
        // error - P1.24 used as debug flag. Assume any error is a DMX break
        while (itemp & 0x9e) {
            FIO1SET = 1 << 24;
            dmxptr = 0;
            U0THR = 0x55;
            itemp = U0RBR;
            itemp = U0LSR;
        }
    } // switch iir

    FIO1CLR = 1 << 24;
    VICVectAddr = 0;
}

UART setup code :

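U0IER=0;          // all UART0 interrupts off during setup
U0LCR=0x87;       // 8 data bits, 2 stop bits, DLAB hi
U0DLM=0;U0DLL=2;  // divisor latch = 2
U0FDR=0x87;       // fractional divider: MULVAL=8, DIVADDVAL=7
U0LCR=7;          // DLAB lo, still 8 data bits + 2 stop bits
U0FCR=0xc7;       // fifo enable, rx+tx fifo clear, rx trigger level 3 (14 chars)
U0IER=1;          // rx int only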
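
For anyone checking the divisor values, they can be run through the LPC213x/01 baud-rate formula, baud = PCLK / (16 x (256 x DLM + DLL) x (1 + DIVADDVAL/MULVAL)). The sketch below is a host-side calculation, not target code. The helper name lpc_uart_baud is made up for illustration, and the 15 MHz PCLK is an assumption: the post does not state PCLK, and 15 MHz is simply the value for which these settings come out at exactly 250 kbit/s.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: baud rate produced by the UART divider chain.
 * fdr layout: bits 3:0 = DIVADDVAL, bits 7:4 = MULVAL. */
static uint32_t lpc_uart_baud(uint32_t pclk_hz, uint8_t dlm, uint8_t dll, uint8_t fdr)
{
    uint32_t divisor = ((uint32_t)dlm << 8) | dll;  /* 16-bit divisor latch */
    uint32_t divadd  = fdr & 0x0Fu;
    uint32_t mul     = (fdr >> 4) & 0x0Fu;

    /* A zero divisor or MULVAL is not a usable setting. */
    assert(divisor != 0 && mul != 0);

    /* PCLK / (16 * divisor * (1 + divadd/mul)), rearranged to stay in integers. */
    return (uint32_t)(((uint64_t)pclk_hz * mul) /
                      (16ull * divisor * (mul + divadd)));
}
```

With the values from the setup code, lpc_uart_baud(15000000, 0, 2, 0x87) evaluates 15 MHz x 8 / (16 x 2 x 15) = 250000.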

Hi-

I am sorry to tell you this, but if the UART is enabled in the middle
of a data stream, there is very little likelihood that it will ever
recover reliably. It's in the nature of the beast. What it is
trying to do is find a "start bit". It cannot distinguish between a
start bit and a logic 0 data bit being sent. So, it will latch onto
the next falling edge at the RxD pin (a start bit is a logic 0) and say:

"AHA! THAT's a start bit", and start trying to assemble characters.

After it's assembled a data word length (specified by the control
registers), it looks for a "stop" bit (logic 1 at the RxD pin).
If it doesn't see one, it sets its "framing error" bit to let
the program know that something was received in error. Then it
goes and looks for another start bit.

So, there are various things that one can do to rectify the situation.

1. One can install hardware or software handshaking (RTS/CTS
signalling or XON/XOFF flow control) and wait for the data to stop,
then restart communications after a character time or two of the
interface being in the stop state. This gives the receiving UART a
chance to start looking for a real start bit again.

2. After data has been sent by the transmitter, the receiving side can
issue a negative acknowledge of some sort to the transmitter to
indicate that the whole data frame was in error and should be resent.
The absence of a UART error doesn't mean that the data was transmitted
error free. Additional logic (checksums, CRCs, etc.) would have to be
applied to the data frame to ensure error-free data transmission.

It might be of interest to write a routine that temporarily reconfigures
the RxD pin as a plain digital input and verifies that the incoming data
has indeed stopped for a sufficient time. One could get clever and use
the RX data pins as external interrupts (in the 2119, RxD0 is also EINT0
and RxD1 is also EINT3). A little clever programming with a timer to look
for an idle (logic high) level on the pin that doesn't change for a
character time (or two) would add code complexity, but not too much CPU
overhead.
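
The idle-line check described above might look roughly like the following. It is a sketch under assumptions and has not been run on hardware: idle is taken to be logic high at the RxD pin, idle_on_edge is a hypothetical hook that an edge interrupt (EINT0, say) would call with the value of a free-running microsecond timer, and the threshold is two 8N2 character times at 250 kbit/s (2 x 11 bits x 4 us = 88 us).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BIT_TIME_US    4u   /* one bit at 250 kbit/s */
#define BITS_PER_CHAR  11u  /* start + 8 data + 2 stop */
#define IDLE_US        (2u * BITS_PER_CHAR * BIT_TIME_US)  /* 88 us */

typedef struct {
    uint32_t last_edge_us;  /* timer value at the most recent edge on RxD */
} idle_detector;

/* Call on every edge seen on the RxD pin (hypothetical interrupt hook). */
static void idle_on_edge(idle_detector *d, uint32_t now_us)
{
    assert(d != NULL);
    d->last_edge_us = now_us;
}

/* True when the pin is high and has not changed for at least IDLE_US.
 * The unsigned subtraction keeps working across timer wrap-around. */
static bool line_is_idle(const idle_detector *d, uint32_t now_us, bool level_high)
{
    assert(d != NULL);
    return level_high && (uint32_t)(now_us - d->last_edge_us) >= IDLE_US;
}
```

The UART receiver would only be set up (FIFO reset, interrupts enabled) once line_is_idle() returns true, so the first edge it sees after that is a genuine start bit.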

So, in summary, it's just the nature of the beast. If you want to
have your UART jump into the middle of a data stream, then you might
find it a challenge to have it come up with some meaningful data.

Hope this helps!

Cheers,

Rich S.

On Tue, 17 Jul 2007 15:33:18 -0000, you wrote:

>Hi-
>
>I am sorry to tell you this, but if the UART is enabled in the middle
>of a data stream, there is very little likelihood that it will ever
>recover reliably. It's in the nature of the beast. What it is
>trying to do is find a "start bit". It cannot distinguish between a
>start bit and a logic 0 data bit being sent. So, it will latch onto
>the next falling edge at the RxD pin (a start bit is a logic 0) and say:
>
>"AHA! THAT's a start bit", and start trying to assemble characters.
>
>After it's assembled a data word length (specified by the control
>registers), it looks for a "stop" bit (logic 1 at the RxD pin).
>If it doesn't see one, it sets its "framing error" bit to let
>the program know that something was received in error. Then it
>goes and looks for another start bit.

And on detection of the framing error, the software resets the UART (if it can figure out how to),
ready for the next start bit. That's what stop bits are for. On most micros you just disable and
re-enable the UART and that always resets everything. I'm just trying to figure out how to
completely reset the LPC UART suitably forcefully in the absence of any concrete guidance in the
User Manual.

I suppose there may be some unfortunate sequences of data that cause continual mis-syncs, but any break
condition or gap longer than 1 byte time should always be recoverable.

In my instance (DMX) there are also gaps and breaks which ensure that sync is always possible,
given a well-behaved UART.

The LPC2136 features break detection, and even an interrupt on break
detection. Why don't you make use of it?

As I see it, assuming every error is a break is dangerous because
there are so many things that could go wrong during a packet.

If there is an error in the middle of the packet, you assume a break
condition and treat the next byte as the first channel (or start code).

If you get an error condition that is not a break (a break shows up as a
framing error with the character == 0), then you should abandon the rest
of the packet.

DMX networks feature LONG cable runs and can go through numerous processors
before the data reaches your device, so assume that there will be genuine
data errors (because there WILL be).

At the moment, you are doing almost no error handling and that is
probably why you are suffering problems with the UART.

It might be handy to set a packet-error flag if you do receive an error
during a packet, then on your next break detection just reset the FIFOs
and start again.
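
A minimal sketch of that idea, written as plain C so the logic can be tried off-target. The LSR bit positions are the standard 16C550-style ones used by the LPC UART; the names classify_lsr, dmx_rx and dmx_rx_step are invented for illustration, and the 513-byte buffer (start code plus 512 channels) is an assumption. The caller is expected to read U0LSR and then U0RBR for each character and pass both in; the actual FIFO reset on a break is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* U0LSR bit positions */
#define LSR_RDR  (1u << 0)  /* receive data ready */
#define LSR_OE   (1u << 1)  /* overrun error */
#define LSR_PE   (1u << 2)  /* parity error */
#define LSR_FE   (1u << 3)  /* framing error */
#define LSR_BI   (1u << 4)  /* break interrupt */

typedef enum { RX_NONE, RX_DATA, RX_BREAK, RX_ERROR } rx_event;

/* A break also sets FE, so BI has to be tested first. */
static rx_event classify_lsr(uint8_t lsr)
{
    if (lsr & LSR_BI)                     return RX_BREAK;
    if (lsr & (LSR_OE | LSR_PE | LSR_FE)) return RX_ERROR;
    if (lsr & LSR_RDR)                    return RX_DATA;
    return RX_NONE;
}

typedef struct {
    uint8_t  buf[513];      /* assumed: start code + 512 channels */
    uint16_t count;         /* bytes stored since the last break */
    bool     packet_error;  /* set on any non-break error */
} dmx_rx;

/* Feed one (LSR, RBR) pair into the receiver state. */
static void dmx_rx_step(dmx_rx *rx, uint8_t lsr, uint8_t data)
{
    assert(rx != NULL);
    switch (classify_lsr(lsr)) {
    case RX_BREAK:          /* new packet: start again with a clean slate */
        rx->count = 0;
        rx->packet_error = false;
        break;
    case RX_ERROR:          /* abandon the rest of this packet */
        rx->packet_error = true;
        break;
    case RX_DATA:
        if (!rx->packet_error && rx->count < sizeof rx->buf)
            rx->buf[rx->count++] = data;
        break;
    case RX_NONE:
        break;
    }
}
```

Keeping the decision logic separate from the register reads means it can be exercised with recorded LSR values before going anywhere near the interrupt handler.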

Good luck,

Brian Sidebotham.

On Tue, 17 Jul 2007 19:03:05 +0100, you wrote:

>The LPC2136 features break detection, and even an interrupt on break
>detection. Why don't you make use of it?

Because I shouldn't need to - a break should cause a received character with a break error which can
be handled in the normal rx interrupt code, and indeed when the UART is synced correctly it works
fine.
I believe the fundamental problem is not related to breaks, but to overrun errors.

>As I see it, assuming every error is a break is dangerous code because
>there are so many things that could go wrong during a packet.

Quite possibly, but at the moment this is for a test/debug setup, not production, so I don't really
care about errors mid-packet and DMX framing - I can add code to handle these easily once the UART
is syncing reliably. I am trying to get the uart to get into byte sync reliably before looking at
ways to do things properly.
I am intending to use the LPC for 'proper' DMX apps in future, so I want to know how I can make the
UART behave properly in all circumstances for when I really need to.

>At the moment, you are doing almost no error handling and that is
>probably why you are suffering problems with the UART.

I am trying to establish the 'correct' way to reliably recover from what appear to be mostly overrun
errors. More detailed handling has to wait until that is sorted...

Recovering from error is more detailed handling though. To 'reset' the
uart Rx, simply flush the Rx fifo (U0FCR &= (1<<1)). You must
distinguish between a break and an error which you are not currently doing.

Good luck with everything, particularly if you're implementing RDM over DMX!

Brian Sidebotham.
Brian Sidebotham wrote:
> Recovering from error is more detailed handling though. To 'reset' the
> uart Rx, simply flush the Rx fifo (U0FCR &= (1<<1)). You must
> distinguish between a break and an error which you are not currently doing.

Apologies, that should read (U0FCR |= (1<<1))

>Recovering from error is more detailed handling though. To 'reset' the
>uart Rx, simply flush the Rx fifo (U0FCR &= (1<<1)). You must
>distinguish between a break and an error which you are not currently doing.

Using the reset-fifo function is one of the things I tried, and it didn't appear to reset things
reliably. At the time I was trying all sorts of things, so it's possible that there were additional
problems. I'll have to look at it in more detail when I get time.

Incidentally, the FCR reset function works by writing a '1' into bit 1, not clearing it. The above
operation would be somewhat dangerous as the FCR is write-only, and shares its address with the
read-only IIR. Although bits 6,7 are mirrored in the IIR, the lower bits are not.
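
One common way round a write-only register is to keep a shadow copy in RAM and always write the full value instead of doing a read-modify-write. The sketch below shows the pattern only; the uart_fcr struct and function names are hypothetical, and the register is reached through a pointer so the code can be tried on a PC (on the target the pointer would be the address of U0FCR, with the pointer type adjusted to match the header in use).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FCR_FIFO_ENABLE  (1u << 0)
#define FCR_RX_RESET     (1u << 1)  /* self-clearing */
#define FCR_TX_RESET     (1u << 2)  /* self-clearing */
#define FCR_RX_TRIG_14   (3u << 6)  /* rx trigger level 3 (14 chars) */

typedef struct {
    volatile uint8_t *fcr;  /* address of the write-only FCR */
    uint8_t shadow;         /* last persistent settings written */
} uart_fcr;

/* Write a complete FCR value and remember the persistent bits.
 * The reset bits are one-shot, so they are not kept in the shadow. */
static void fcr_write(uart_fcr *u, uint8_t value)
{
    assert(u != NULL && u->fcr != NULL);
    u->shadow = (uint8_t)(value & ~(FCR_RX_RESET | FCR_TX_RESET));
    *u->fcr = value;
}

/* Flush the rx FIFO without disturbing enable/trigger settings.
 * Never reads the register: a read at this address returns IIR. */
static void fcr_reset_rx_fifo(uart_fcr *u)
{
    assert(u != NULL && u->fcr != NULL);
    *u->fcr = (uint8_t)(u->shadow | FCR_RX_RESET);
}
```

With the setup value 0xc7 the shadow becomes 0xc1, and a later rx flush writes 0xc3, leaving the FIFO enabled and the trigger level unchanged.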

Yes, again sorry. I quickly scan-read the register description without
observing it was WO.

Best Regards,

Brian.

Mike, the code you posted doesn't enable the RX Line Status interrupt (bit
2 in U0IER). If you want to handle the errors in a timely fashion it may help
to have the UART throw this interrupt rather than waiting until you get
valid data.

Chris
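
As a rough illustration of what that involves: the line status interrupt is enabled with bit 2 of U0IER (so U0IER = 0x05 for rx data plus line status), and it then shows up in U0IIR with identification value 0x06 in bits 3:1. The decoder below is a sketch with invented names (uart_irq, decode_iir), written so it can be checked without hardware.

```c
#include <stdint.h>

/* U0IER interrupt enable bits */
#define IER_RBR   (1u << 0)  /* receive data available (and timeout) */
#define IER_THRE  (1u << 1)  /* transmit holding register empty */
#define IER_RLS   (1u << 2)  /* rx line status: OE, PE, FE, BI */

typedef enum {
    IRQ_NONE, IRQ_THRE, IRQ_RX_DATA, IRQ_LINE_STATUS, IRQ_TIMEOUT, IRQ_UNKNOWN
} uart_irq;

/* Decode a value read from U0IIR. Bit 0 is active low (0 = interrupt
 * pending); bits 3:1 identify the source. Bits 7:6 mirror the FIFO
 * enable and are masked off here. */
static uart_irq decode_iir(uint8_t iir)
{
    if (iir & 0x01u)
        return IRQ_NONE;
    switch (iir & 0x0Eu) {
    case 0x06: return IRQ_LINE_STATUS;  /* highest priority */
    case 0x04: return IRQ_RX_DATA;
    case 0x0C: return IRQ_TIMEOUT;
    case 0x02: return IRQ_THRE;
    default:   return IRQ_UNKNOWN;
    }
}
```

In the handler, the IRQ_LINE_STATUS case would read U0LSR to find out which error occurred, then read U0RBR to discard the offending character.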
