Re: LPC2000 UART drops characters silently?

Started by jayasooriah July 14, 2006
System is LPC2292; XTAL = 14.7456 MHz; PLL disabled; MAM disabled;
VPBDIV = 1; UART = 8-data, 1-stop, no-parity, FIFO enabled.

The symptom is that, for certain (low) baud rates, the UART silently
drops characters on the Rx channel when saturated. There is no
indication of this in the LSR: none of the error bits
(OE|PE|FE|BI|RXFE) is set when this happens.
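
For reference, a polled receive along the following lines would latch any of these flags if the UART ever raised them. This is only a minimal sketch, not the test program: the struct and function names are hypothetical, and the bit positions assume the 16C550-style LSR layout used by the LPC2000 UARTs. Registers are passed as pointers so the routine can be exercised against fake registers on a host.

```c
#include <assert.h>
#include <stdint.h>

/* LSR bits, assuming the 16C550-style layout of the LPC2000 UART. */
#define LSR_RDR   0x01u  /* receive data ready */
#define LSR_OE    0x02u  /* overrun error */
#define LSR_PE    0x04u  /* parity error */
#define LSR_FE    0x08u  /* framing error */
#define LSR_BI    0x10u  /* break interrupt */
#define LSR_RXFE  0x80u  /* error somewhere in the Rx FIFO */
#define LSR_ERRS  (LSR_OE | LSR_PE | LSR_FE | LSR_BI | LSR_RXFE)

/* Hypothetical register handles: point these at the real RBR/LSR on
   target, or at plain variables when testing on a host. */
typedef struct {
    volatile uint32_t *rbr;
    volatile uint32_t *lsr;
} uart_regs_t;

typedef struct {
    uint32_t received;     /* characters read so far */
    uint32_t errors_seen;  /* OR of every error bit ever observed */
} rx_stats_t;

/* Poll once. Returns 1 and stores the character if one was ready,
   otherwise returns 0. */
static int uart_poll_rx(const uart_regs_t *u, rx_stats_t *st, uint8_t *out)
{
    assert(u && st && out);

    /* Read LSR exactly once per poll: the read clears latched error
       flags, so a second read could miss them. */
    uint32_t lsr = *u->lsr;
    st->errors_seen |= lsr & LSR_ERRS;

    if (!(lsr & LSR_RDR))
        return 0;

    *out = (uint8_t)(*u->rbr & 0xFFu);
    st->received++;
    return 1;
}
```

With a loop like this, a silent drop shows up as `received` falling short of the number of characters sent while `errors_seen` stays at zero.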

It appears that the UART logic is failing to copy the RSR (Rx Shift
Register) to the RBR (Rx Buffer Register) for certain baud rates when a
further character follows immediately behind.

More specifically, at 1200 baud, when a sequence of characters is
transmitted 8.33 ms apart, all characters but the last one are dropped
by the UART and the driver is unaware it has lost characters.

[Test program to demonstrate bug available on request -- please drop
me an email.]

A work around for this anomaly appears to have been implemented in the
Philips Boot Loader, and it works under boot loader conditions.

It appears there is no real work around when the UART is used in an
environment where the Rx and Tx channels are subject to saturation.

Does anyone else know of this bug and/or its work around?

Jaya

--- In l..., "jayasooriah" wrote:
>
> System is LPC2292; XTAL = 14.7456 MHz; PLL disabled; MAM disabled;
> VPBDIV = 1; UART = 8-data, 1-stop, no-parity, FIFO enabled.
>
> The symptom is that, for certain (low) baud rates, the UART silently
> drops characters on the Rx channel when saturated. There is no
> indication of this in the LSR: none of the error bits
> (OE|PE|FE|BI|RXFE) is set when this happens.
>
> It appears that UART logic is failing to copy RSR (Rx Shift Register)
> to RBR (Rx Buffer Register) for certain baud rates when a further
> character follows immediately behind.
>
> More specifically, at 1200 baud, when a sequence of characters is
> transmitted 8.33 ms apart, all characters but the last one are dropped
> by the UART and the driver is unaware it has lost characters.
>

This all sounds very odd. Can you post the register settings you are
using for the UART, so others can verify? Also the code used to
process the receive?

By the way, do you mean 8.33 ms between each character in the
sequence? Also, do you mean 833 usec by any chance?

I'd take a look on a scope if I were you: no matter how similar your
system is to something else that's known to work, there could be
some difference not thought of.

> [Test program to demonstrate bug available on request -- please drop
> me an email.]
>
> The work around for this anomaly appears to have been implemented in
> Philips Boot Loader that works under boot loader conditions.

Do you know what this work around is, and can you describe it?

>
> It appears there is no real work around when UART is used in an
> environment where Rx and Tx channels are subject to saturation.
>
> Does anyone else know of this bug and/or its work around?
>

We've done a lot of testing driving both UARTs as hard as we can
over extended periods, and have never seen anything like this:
everything works fine. As you say though, it could be that this is
only seen at very specific bit rates and/or very specific timings.

Brendan

--- In l..., "brendanmurphy37" wrote:
>
> --- In l..., "jayasooriah" wrote:
> >
> > More specifically, at 1200 baud, when a sequence of characters is
> > transmitted 8.33 ms apart, all characters but the last one are dropped

> Can you post the register settings you are
> using for the UART, so others can verify?

My Settings (cut and paste):

    // select divisor
    _UART0.lcr = _UART_DLAB;

    // autobaud count
    _UART0.dll = count;
    _UART0.dlm = count >> 8;

    // line control
    _UART0.lcr = _UART_DATA8 | _UART_STOP1;

    // fifo control
    _UART0.fcr = _UART_FEN | _UART_RFR | _UART_TFR;

    // disable irqs
    _UART0.ier = 0;

> Also the code used to
> process the receive?

See http://groups.yahoo.com/group/lpc2000/message/17581

To verify it you must have the capability to saturate the Rx data channel.

> By the way, do you mean 8.33 ms between each character in the
> sequence? Also, do you mean 833 usec by any chance?

I meant saturating the channel by sending one character every 8.33 ms.
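
The figure follows from the frame length: with 8 data bits, no parity and 1 stop bit there are 10 bit times per character, so a back-to-back stream at 1200 baud delivers one character every 10/1200 s, about 8.33 ms. A small sketch of the arithmetic (the helper names are illustrative only and not part of the test program):

```c
#include <assert.h>
#include <stdint.h>

/* Bits on the wire per character: 1 start bit plus data, parity and
   stop bits. */
static uint32_t frame_bits(uint32_t data_bits, uint32_t parity_bits,
                           uint32_t stop_bits)
{
    return 1u + data_bits + parity_bits + stop_bits;
}

/* Time to send one frame, in microseconds, rounded to nearest. */
static uint32_t char_time_us(uint32_t baud, uint32_t bits_per_frame)
{
    assert(baud > 0);
    return (uint32_t)(((uint64_t)bits_per_frame * 1000000u + baud / 2u)
                      / baud);
}
```

For 8-N-1, `frame_bits(8, 0, 1)` is 10, and `char_time_us(1200, 10)` gives 8333 us; the same frame at 300 baud takes 33333 us.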

> I'd take a look on a scope if I were you: no matter how similar your
> system is to something else that's known to work, there could be
> some difference not thought of.

I did, and the RxD waveform is what I expected it to be.

> > The work around for this anomaly appears to have been implemented in
> > Philips Boot Loader that works under boot loader conditions.
>
> Do you know what this work around is, and can you describe it?

I do not want to make public (on a forum like this) what Philips wants
to be kept secret. It is better if Philips authoritatively says what
it did and why to make its boot loader work at low baud rates.

> We've done a lot of testing driving both UARTs as hard as we can
> over extended periods, and have never seen anything like this:
> everything works fine. As you say though, it could be that this is
> only seen at very specific bit rates and/or very specific timings.

It is well understood that quality is not testable. The error appears
to depend on speed and Tx activity, and will only manifest itself
when the Rx channel is saturated.

I put my finger on it when I saw Stephen's post and decided to play
with the LPC2292 board I got back recently, running it at 300 baud.

Jaya

Hi,

OK - it's clearer now what you mean when you say "saturate",
though it's still not clear what each UART register is set to (I was
thinking more of hex values, rather than variable or macro names,
which could resolve to anything).
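
To illustrate the kind of hex values meant here: on a 16C550-style UART such as the LPC2000's, settings like those posted would normally come out as below. This is a sketch under assumptions, not the poster's actual values. The macro names are hypothetical stand-ins, and the divisor assumes `count` equals the nominal PCLK / (16 * baud) rather than a measured autobaud value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; bit values follow the 16C550-style layout. */
#define LCR_WLEN8  0x03u  /* LCR bits 1:0 = 11: 8 data bits */
#define LCR_STOP1  0x00u  /* LCR bit 2 = 0: 1 stop bit */
#define LCR_DLAB   0x80u  /* LCR bit 7: divisor latch access */
#define FCR_FEN    0x01u  /* FCR bit 0: FIFO enable */
#define FCR_RFR    0x02u  /* FCR bit 1: Rx FIFO reset */
#define FCR_TFR    0x04u  /* FCR bit 2: Tx FIFO reset */

/* Nominal divisor for 16x oversampling, rounded to nearest. */
static uint16_t uart_divisor(uint32_t pclk_hz, uint32_t baud)
{
    assert(baud > 0);
    uint32_t d = (pclk_hz + 8u * baud) / (16u * baud);
    assert(d >= 1u && d <= 0xFFFFu);
    return (uint16_t)d;
}
```

Under those assumptions, PCLK = 14.7456 MHz at 1200 baud gives a divisor of 768 (DLM = 0x03, DLL = 0x00), the line control value is 0x03 and the FIFO control value is 0x07.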

To isolate differences from other systems, it would be useful to
have a very short program that demonstrates the behaviour on any
test system. For example, a small self-contained program that sends
say 100 characters on one UART and receives them on the other UART
(assuming an external loopback). That way, anyone can run it and
confirm the behaviour.
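
The core of such a program could be as small as the sketch below. Everything in it is hypothetical: `port_t` is an invented abstraction that would wrap the two UARTs on a target, and the one-character "wire" is just a host-side stand-in for the external loopback so the logic can be checked without hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical port abstraction wrapping one UART direction. */
typedef struct {
    void (*send)(void *ctx, uint8_t c);   /* blocking transmit */
    int  (*recv)(void *ctx, uint8_t *c);  /* 1 if a character was read */
    void *ctx;
} port_t;

/* Send n characters back to back, draining the receiver as we go,
   and return how many never arrived. */
static size_t loopback_missing(const port_t *tx, const port_t *rx, size_t n)
{
    size_t got = 0;
    uint8_t c;

    for (size_t i = 0; i < n; i++) {
        tx->send(tx->ctx, (uint8_t)('A' + i % 26));
        while (got < n && rx->recv(rx->ctx, &c))
            got++;
    }
    /* Final drain. On real hardware this needs a timeout of at least
       a few character times, since the last frames are still in flight. */
    while (got < n && rx->recv(rx->ctx, &c))
        got++;

    return n - got;
}

/* Host-side stand-in for the loopback cable: a one-character latch. */
typedef struct { uint8_t ch; int full; } wire_t;

static void wire_send(void *ctx, uint8_t c)
{
    wire_t *w = ctx;
    w->ch = c;
    w->full = 1;
}

static int wire_recv(void *ctx, uint8_t *c)
{
    wire_t *w = ctx;
    if (!w->full)
        return 0;
    *c = w->ch;
    w->full = 0;
    return 1;
}
```

A non-zero return from `loopback_missing` on a target, with no error bits reported, would be the behaviour described in this thread.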

Unless you can get an issue down to a very short demonstration
program that can be easily run on a comparable system, it's
difficult to establish that the problem is with the device itself
rather than the way in which it is being used.

For information, our own tests were largely done at 9.6 and/or
115.2 kbps: we've never seen anything like this, regardless of how
hard we drive the interfaces.

I don't doubt you're seeing the problem, but until someone can
reproduce it I would say it's premature to assume it's a problem
with the internal design of the UART.

Brendan.

No need to go looking for the problem. You can verify if it manifests
in your part by writing a simple program as per my code and doing the
experiment I suggested.

That demo is as short as you can get and is sure-fire, even if you
are using Hyperterm on Windows, so long as you run it at 300 baud.

I am interested to know if this anomaly is documented elsewhere given
the Boot Loader has code that (I have verified) works around this issue.

Jaya

Sorry, I cannot get the full picture...

Does that mean:
- The UART works fine at higher baud rates and has bugs at
lower baud rates
- And the only difference between the 2 settings (higher and lower
baud rates) is the baud rate prescaler value?

I'm only interested in whether there are any bugs at the commonly used
9600 to 115200 baud, at about 60 MHz PCLK/CPUCLK.

(One way to verify that bug could be using another uC or PC to tap
the same RX line)

Regards

--- In l..., "jayasooriah" wrote:
> No need to go looking for the problem. You can verify if it manifests
> in your part by writing a simple program as per my code and doing the
> experiment I suggested.

But that's just my point. I haven't seen anything like the problem you
describe in all the testing I've seen done, and nobody else seems to
have either (can anyone else confirm?).

You've supplied some parts of the software that exhibits the issue, but
not others (for example the code that actually reads the FIFO).

In other words, without more information it's difficult to either
verify what you're seeing or comment on the possible cause.

This would be addressed if someone else can confirm the failure mode
you're seeing (which as I understand it is: "data is lost when
receiving at low bit rates if the data is sent too quickly").

>
> That demo is as short as you can get and is a sure fire even if you
> are using Hyperterm on Windows so long as you run it at 300 baud.

I'd strongly advise against using Hyperterm for any testing on serial
ports. It has too many faults, and I've seen too many people tear
their hair out looking for phantom problems which turned out to be
Hyperterm "features".

>
> I am interested to know if this anomaly is documented elsewhere given
> the Boot Loader has code that (I have verified) works around this issue.
>

What is the nature of this work around?

Brendan.

Dear Brendan,

I would like to conclude this dialog with you because this ping-pong
exchange is not getting anywhere.

Please allow me to conclude with the following clarifications:

--- In l..., "brendanmurphy37" wrote:
>
> --- In l..., "jayasooriah" wrote:
> > No need to go looking for the problem. You can verify if it manifests
> > in your part by writing a simple program as per my code and doing the
> > experiment I suggested.
>
> But that's just my point. I haven't seen anything like the problem you
> describe in all the testing I've seen done, and nobody else seems to
> have either (can anyone else confirm?).

Have you tried polling the UART0 set to 8-data 1-stop with no parity,
at 300 baud while the Rx channel is saturated?

If you have, and you are not missing characters, then your part is not
affected. It would be useful to share this information.

If you have not, it is up to you to try. I started this thread in
response to Stephen's post as my observations may be relevant.

If you feel you do not need to do this test, feel free to ignore this
post.

> You've supplied some parts of software that exhibits the issue, but
> not others (for example the code that actually reads the FIFO).
>
> In other words, without more information it's difficult to either
> verify what you're seeing or comment on the possible cause.

I tend to frown on "show me your code" kind of requests. When I
diagnose problems, I look at code only as the very last resort.

I would study the symptoms carefully and form an opinion as to likely
causes. I would then ask the questions that are relevant to these
suspected causes.

I tend to stay away from fishing expeditions, as they are known in this field.

Let's consider what we already know:

* The system works perfectly when characters are sent one at a time
with gaps in between them.

* The problem only manifests itself at slow baud rates and when the
receiver channel is saturated.

* All characters but the last are dropped when they are sent back to back.

One character at a time works. With two characters, the first one is
dropped. With three characters, the first two are dropped. And so on ...

What kind of thing can go wrong such that it works for one character at
a time, or at some baud rates, but not at other baud rates?

Initialisation of the UART? I provided you with this anyway.

The method I used to observe the problem? I provided this too.

While Yahoo is not very good at presenting code, I nonetheless
expect you should be able to work out what that code segment does
from my explanation.

> This would be addressed if someone else can confirm the failure mode
> you're seeing (which as I understand it is: "data is lost when
> receiving at low bit rates if the data is sent too quickly").

I am sorry if I sounded like I wanted your confirmation. Not at all.
I have other ways. Let me say the existence of the problem is
confirmed. Besides, the Boot Loader has code to work around this problem.

I have no idea if this UART problem has been fixed in later silicon
revisions or other variants, given it has not made it into any of
the errata sheets AFAIK.

> > That demo is as short as you can get and is a sure fire even if you
> > are using Hyperterm on Windows so long as you run it at 300 baud.
>
> I'd strongly advise against using Hyperterm for any testing on serial
> ports. It has too many faults, and I've seen too many people tear
> their hair out looking for phantom problems which turned out to be
> Hyperterm "features".

This is exactly the kind of infuriating digression you should avoid.
Replace "Hyperterm" with "Minicom", "NITE" (my own terminal emulator)
or "your favourite terminal emulator" and you will get *exactly* the
same result.

I would stay away from blaming the tools and maintain my focus on the
problem and its symptoms.

> > I am interested to know if this anomaly is documented elsewhere given
> > the Boot Loader has code that (I have verified) works around this issue.
>
> What is the nature of this work around?

If you do not have this problem, why bother? As I said, I do not want
to disclose in an open forum proprietary information that Philips
intends to keep secret.

>
> Brendan.

Thank you Brendan for your interest in this problem, but given it does
not affect you, perhaps it is time to move on.

Kind regards,

Jaya

--- In l..., "unity0724" wrote:
>
> Sorry, I cannot get the full picture...
>
> Does that mean??
> - The UART works fine at higher baud rates and has bugs at
> lower baud rates

Yes.

> - And the only difference between the 2 settings (higher and lower
> baud rates) is the baud rate prescaler value?

Yes.

>
> I'm only interested in whether there are any bugs at the commonly used
> 9600 to 115200 baud, at about 60 MHz PCLK/CPUCLK.

It does not seem to matter if PLL is enabled and CPU clock is 4 times
XTAL clock. The problem manifests itself in the same way.

> (One way to verify that bug could be using another uC or PC to tap
> the same RX line)

The waveforms look okay -- there is not much to distinguish between
one- and two-character traces on the CRO.

When driven by a CPLD-based UART with an extra stop bit inserted between
the characters, the problem goes away, with all else constant.

>
> Regards

In the operational situation where the Tx channel is used while
receiving characters (for XON/XOFF handshake), no errors were seen for
standard baud rates from 921600 down to 230400.

The error rate starts going up from 115200 down to 19200, and then the
problem is deterministic from there on -- one character at a time
always works and two characters at a time always fail.

It was hard to get data from the client's operational system, so I
recreated the same situation with my own (verified) code.

For the controlled situation (an experiment that just reads a line
without echoing, and prints the line on receiving EOL), no errors are
seen at 19200 baud and above.

The above is for a 14.7456 MHz crystal, with data coming from an FTDI
232BM chip running off a 6 MHz crystal.

Jaya

--- In l..., "jayasooriah" wrote:
>
> Dear Brendan,
>
> I like to conclude this dialog with you because this ping-pong
> exchange is not getting anywhere.

I think you're right in that we don't seem to be getting anywhere.

To summarize:

- you have a system that drops characters under certain conditions
(low baud rate, saturated line)

- you claim this is due to a problem with the design of the UART on
the LPC2000

- you ask has anyone seen anything similar

- so far, nobody else has come forward to say they've seen the same
behavior

I responded saying that in all the testing done (including at low
baud rates), I'd never seen anything similar, and asking for further
details so that I might be able to reproduce the problem. My concern is
that there is some problem inherent in the part, so far unseen in my
own systems, but that it might become apparent in the field in
unusual or exceptional conditions. I would imagine others would have
similar concerns.

By the way, I have no interest in your code or how it works. All I
asked for was what values you were configuring the UART with, again
so that others can try and reproduce your problem and confirm or
deny your claim. You seem either unwilling or unable to provide this
information.

I find your statement "I look at code only as the very last resort"
very interesting. My own experience is that there's invariably good
value to be obtained from looking at the code, for the simple reason
that most errors tend ultimately to be found there.

If I had the symptoms you describe in a system, my own suspect list
would be:

- software error on target board
- hardware build or design issue on target board
- hardware issue on the comms link to the test system
- tools issue on test system
- etc. etc.
- at very end of list: previously unknown and undocumented hardware
problem with processor

Maybe you've gone through such a list already. It's a pretty big
claim to make that there's some bug in the internal design of an IC,
without providing sufficient detail on how it might be reproduced
consistently, something like "if you set the UART to this, and do
this, the following happens".

Although it's rare, I've gone through this process a few times,
including with the LPC2000. I supplied Philips with very complete
information on the issue (related to switching between external
interrupt and GPIO), including software to demonstrate it. They
subsequently confirmed the issue. It was only when I had this
confirmation that I posted the relevant information to this group.
Without providing a similar level of detail, I think it unlikely
it'll be taken seriously: there are just too many unknowns in the
information you've provided so far for your claim to be
substantiated.

I don't doubt for a minute that the system you have demonstrates the
behavior you describe, or that your claim might in fact be true.
However, until you or someone else can provide more detailed
information, the claim will have to remain just that: a claim.

Brendan