EmbeddedRelated.com
The 2026 Embedded Online Conference

can't get uart tx fifo to work

Started by fhriley July 31, 2011
A good uart design should allow you to read the fifo levels, then the ISR or
polling can just read the uart Tx fifo level, work out the free space and
set a loop counter limit to transfer bytes without further handshaking.

There is no efficient way to do this on the 16550, which makes maximum
throughput dependent on being able to DMA data or service the ISR with low
latency.

A trivial but important oversight.

From: l... [mailto:l...] On Behalf Of
fhriley
Sent: 01 August 2011 18:26
To: l...
Subject: [lpc2000] Re: can't get uart tx fifo to work

It's the person who wrote the documentation that made it hard for me! Now
that I understand how it works and have a good implementation (only 6 lines
of code for putting a byte and only 5 lines in the interrupt), I actually
find it rather elegant. I can do arbitrary message lengths with hw flow
control, and I don't have to interrupt for every byte in the message. Very
nice.

--- In l... , "Phil
Young" wrote:

> Works much better than trying to code for a broken UART architecture,
> whoever architected the 16550 should have been shot!


On 8/1/2011 1:56 PM, Phil Young wrote:
> A good uart design should allow you to read the fifo levels, then the ISR or
> polling can just read the uart Tx fifo level, work out the free space and
> set a loop counter limit to transfer bytes without further handshaking.

I'm not seeing that as a help.

On read, what you need to know is whether the receive buffer is empty. The
current count of items in the buffer helps only if the time to read the
count and set up a loop counter is less than the time spent checking the
flag in a loop. The time to read the count and set up the loop needs to be
accounted for at least twice, since you need a way to verify that nothing
came in while reading the first n items, so this adds some complexity.
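
For comparison, the flag-checking loop described above might look like the following sketch on a 16550-style part. The `sim16550_rx_t` type and its `read_lsr`/`read_rbr` accessors are hypothetical software stand-ins for the real registers, assumed here only so the loop can run on a host; the LSR data-ready bit (bit 0) is the only detail taken from the 16550 itself. The `lsr_reads` counter exists purely to make the cost visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_FIFO_DEPTH 16u    /* 16550 Rx fifo depth */
#define LSR_DR        0x01u  /* LSR bit 0: receive data ready */

/* Hypothetical simulated receiver, standing in for the real registers. */
typedef struct {
    uint8_t  fifo[RX_FIFO_DEPTH];
    size_t   head;       /* index of the oldest unread byte */
    size_t   level;      /* bytes waiting; not readable on a real 16550 */
    unsigned lsr_reads;  /* number of status reads, for illustration only */
} sim16550_rx_t;

static uint8_t read_lsr(sim16550_rx_t *u)
{
    u->lsr_reads++;
    return (uint8_t)((u->level != 0) ? LSR_DR : 0);
}

static uint8_t read_rbr(sim16550_rx_t *u)
{
    uint8_t b;
    assert(u->level > 0);  /* reading an empty fifo would be a bug */
    b = u->fifo[u->head];
    u->head = (u->head + 1) % RX_FIFO_DEPTH;
    u->level--;
    return b;
}

/* Drain the Rx fifo, testing the data-ready flag before every byte. */
size_t uart16550_rx_drain(sim16550_rx_t *u, uint8_t *dst, size_t room)
{
    size_t n = 0;
    while (n < room && (read_lsr(u) & LSR_DR) != 0) {
        dst[n++] = read_rbr(u);
    }
    return n;
}
```

With three bytes waiting, this loop performs four status reads: one per byte, plus the one that finds the fifo empty. That per-byte status read is the overhead the two approaches are being compared on.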

On write you already know what space is available, so that is a constant:
if the space were not available you wouldn't get an interrupt. The only
place you might gain is on writes smaller than the size of the buffer, where
you could check for room and write before an interrupt occurred, but this
check would need to be protected against interrupts and race conditions. The
larger your message sizes, the lower your potential gain. As long as you
are filling your larger software buffer as fast as or faster than the UART
can empty the HW buffer there is no gain to be made; you will always
have a full buffer to write. I think this is an edge optimization with
little effect.
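
The "space is a constant" argument can be sketched as follows. This is a minimal illustration, not anyone's actual driver: `sim16550_tx_t` and its accessors are hypothetical stand-ins for the THR and LSR registers so the code can run on a host, and the only 16550 facts assumed are the 16-byte Tx fifo and THRE being LSR bit 5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_FIFO_DEPTH 16u    /* 16550 Tx fifo depth */
#define LSR_THRE      0x20u  /* LSR bit 5: Tx holding register / fifo empty */

/* Hypothetical simulated transmitter, standing in for the real registers. */
typedef struct {
    uint8_t fifo[TX_FIFO_DEPTH];
    size_t  level;  /* bytes queued; not readable on a real 16550 */
} sim16550_tx_t;

static uint8_t read_lsr(const sim16550_tx_t *u)
{
    return (uint8_t)((u->level == 0) ? LSR_THRE : 0);
}

static void write_thr(sim16550_tx_t *u, uint8_t b)
{
    assert(u->level < TX_FIFO_DEPTH);  /* writing to a full fifo loses data */
    u->fifo[u->level++] = b;
}

/* Called from the THRE interrupt (or a poll). Returns the bytes queued. */
size_t uart16550_tx_service(sim16550_tx_t *u, const uint8_t *data, size_t len)
{
    size_t n = 0;
    if ((read_lsr(u) & LSR_THRE) == 0) {
        return 0;  /* fifo not empty: free space is unknown, write nothing */
    }
    /* THRE set means the whole fifo is free, so the limit is a constant. */
    while (n < len && n < TX_FIFO_DEPTH) {
        write_thr(u, data[n++]);
    }
    return n;
}
```

Inside the interrupt the loop limit is always the full fifo depth. Outside it, a partly drained fifo looks the same as a full one, so the sketch has to write nothing at all.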

> There is no efficient way to do this on the 16550, which makes maximum
> throughput dependent on being able to DMA data or service the ISR with low
> latency.

Well, maximum throughput on most systems depends on minimal latency to
service the I/O. In the case of the 16550, the receive thresholds act
to reduce the required latency for a receive interrupt by giving time to
respond. They could have done the same for transmit, but that would
have required also adding a transmit buffer full flag (I don't know
offhand if there is room for such a flag in a register that is not modified
on read in the architecture). A buffer byte count wouldn't affect the
latency requirements that I can see. Remember the 16550 was designed as
an enhancement to the 8250, largely aimed at PCs running DOS. The
maximum throughput design on these was interrupt on receive and poll on
transmit even before the 16550A. So a couple of additional flags may
have helped in other situations and architectures, but the extra cost may
have made them undesirable.

Robert

--
http://www.aeolusdevelopment.com/

From the Divided by a Common Language File (Edited to protect the guilty)
ME - "I'd like to get Price and delivery for connector Part # XXXXX"
Dist./Rep - "$X.XX Lead time 37 days"
ME - "Anything we can do about lead time? 37 days seems a bit high."
Dist./Rep - "that is the lead time given because our stock is live....
we currently have stock."

No, that is not true, but you just proved what I said.

On Write you NEVER know what space is available in the TX Buffer unless THRE
is set.

So if you have a periodic ISR and want to handle the UART Tx efficiently
without it generating Tx interrupts you cannot know how much space there is
in the TX fifo.

My point was exactly that in order to drive such a UART efficiently you MUST
drive it in a TX ISR, so you have to have both TX and RX ISR handlers.

If you want to chain Tx and Rx into a single handler run off a timer, or
chained to an alternative ISR you cannot do it with the 16550 architecture.

I've designed and implemented many chips both with the 16550 style uart and
with an enhanced version, there is a huge difference.

If you read the Tx fifo level, then compute the Tx space, you can simply
transfer the required bytes directly to the UART Tx fifo without any more
status polling, so of course it is very much more efficient.

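e.g.

/* Free space in the hardware Tx fifo */
TxSpace = UartSize - pUART->TXFifoLevel;

/* Don't send more than the software buffer holds */
if (TxSpace > TxDataBufferLength) TxSpace = TxDataBufferLength;

TxDataBufferLength -= TxSpace;

/* Copy exactly TxSpace bytes, with no status polling per byte */
while (TxSpace--) {
    pUART->TxData = *(pTxDataBuffer++);
}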

You can add the buffer limit checking, or split a transfer from a circular
buffer into two transfers easily enough, but you should get the idea.
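
The circular-buffer variant mentioned here could be sketched as below. Everything in it is an assumption for illustration: `sim_level_uart_t` models the enhanced UART with a readable `TXFifoLevel`, and `ring_t`, `RING_SIZE` and `uart_tx_from_ring` are hypothetical names, not part of any real driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define UART_FIFO_SIZE 16u  /* assumed hardware Tx fifo depth */
#define RING_SIZE      64u  /* assumed software buffer size */

/* Hypothetical simulated UART whose Tx fifo level can be read back. */
typedef struct {
    uint8_t fifo[UART_FIFO_SIZE];
    size_t  TXFifoLevel;
} sim_level_uart_t;

static void uart_write(sim_level_uart_t *u, uint8_t b)
{
    assert(u->TXFifoLevel < UART_FIFO_SIZE);  /* never overfill the fifo */
    u->fifo[u->TXFifoLevel++] = b;
}

/* Software circular buffer feeding the UART. */
typedef struct {
    uint8_t buf[RING_SIZE];
    size_t  tail;   /* index of the next byte to send */
    size_t  count;  /* bytes waiting to be sent */
} ring_t;

/* Move as many bytes as fit into the fifo. Returns the number moved. */
size_t uart_tx_from_ring(sim_level_uart_t *u, ring_t *r)
{
    size_t space = UART_FIFO_SIZE - u->TXFifoLevel;      /* one level read */
    size_t total = (space < r->count) ? space : r->count;
    size_t first = RING_SIZE - r->tail;  /* contiguous bytes before the wrap */
    size_t i;

    if (first > total) {
        first = total;
    }
    for (i = 0; i < first; i++) {          /* transfer 1: tail to end of buf */
        uart_write(u, r->buf[r->tail + i]);
    }
    for (i = 0; i < total - first; i++) {  /* transfer 2: wrapped remainder */
        uart_write(u, r->buf[i]);
    }
    r->tail = (r->tail + total) % RING_SIZE;
    r->count -= total;
    return total;
}
```

Both copy loops run on a precomputed count, so the wrap costs one extra loop setup and adds no per-byte test.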

Maybe I should take some of the blame as I worked for Nat Semi around the
time they released the 16550, but I never looked at it in detail then.

And it doesn't need extra registers, as the FIFO count on the bus side is
synchronous to the bus; all it needed was a read mux for a 4 bit field. By
uart standards this is 0.001% of the gate count.

On read the same is useful, since accessing the register to test for Rx fifo
empty adds additional cycles to the code that are simply unnecessary if the
architecture is good.

Cortex-M0 code requires reading the status register, ANDing the bit to be
tested with a mask and branching, but you still have to test the Rx buffer
space, so you have some sort of loop counter anyway. Reading the Rx fifo
level and using the same technique for Rx as for Tx also improves efficiency,
even when servicing from the dedicated ISR.

Bear in mind that in an Rx ISR you do not know how much data is available
when the Rx ISR is triggered, regardless of fifo settings, but the code runs
quickly enough that you can transfer all the data indicated by the count and
then check the Rx fifo empty indication at the end for the at most 1
character that might have been received in the meantime.
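
A rough sketch of that Rx scheme, under stated assumptions: `sim_rx_uart_t` is a hypothetical simulated UART with a readable `RXFifoLevel`, a non-zero level stands in for the "not empty" indication, and the `arrive_after_reads`/`late` fields are a test hook that injects one character part-way through the drain.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RX_FIFO_SIZE 16u  /* assumed hardware Rx fifo depth */

/* Hypothetical simulated UART whose Rx fifo level can be read back. */
typedef struct {
    uint8_t  fifo[RX_FIFO_SIZE];
    size_t   head;                /* index of the oldest unread byte */
    size_t   RXFifoLevel;         /* bytes waiting (assumed readable) */
    unsigned arrive_after_reads;  /* inject `late` after this many reads, 0 = never */
    uint8_t  late;                /* the character that arrives mid-drain */
} sim_rx_uart_t;

static uint8_t uart_read(sim_rx_uart_t *u)
{
    uint8_t b;
    assert(u->RXFifoLevel > 0);  /* reading an empty fifo would be a bug */
    b = u->fifo[u->head];
    u->head = (u->head + 1) % RX_FIFO_SIZE;
    u->RXFifoLevel--;
    /* Simulate a character arriving from the line while we are draining. */
    if (u->arrive_after_reads > 0 && --u->arrive_after_reads == 0) {
        u->fifo[(u->head + u->RXFifoLevel) % RX_FIFO_SIZE] = u->late;
        u->RXFifoLevel++;
    }
    return b;
}

/* Drain by count, then check once for a late arrival. Returns bytes read. */
size_t uart_rx_drain(sim_rx_uart_t *u, uint8_t *dst, size_t room)
{
    size_t n = u->RXFifoLevel;  /* single level read sets the loop limit */
    size_t i;

    if (n > room) {
        n = room;
    }
    for (i = 0; i < n; i++) {   /* no per-byte status test in this loop */
        dst[i] = uart_read(u);
    }
    /* One final check picks up a character that arrived during the loop. */
    if (i < room && u->RXFifoLevel != 0) {
        dst[i++] = uart_read(u);
    }
    return i;
}
```

The single trailing check relies on the assumption stated in the text, that the drain loop finishes within roughly one character time; a slower loop would need to repeat the count read instead.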

Regards

Phil.

From: l... [mailto:l...] On Behalf Of
Robert Adsett
Sent: 01 August 2011 19:57
To: l...
Subject: Re: [lpc2000] Re: can't get uart tx fifo to work
