EmbeddedRelated.com
Forums

Linux serial port dropping bytes

Started by Derek Young March 31, 2008
David Brown wrote:
> CBFalconer wrote:
>
> <snip>
>
>> I left the whole thing unsnipped. The time has come for me to
>> crave forgiveness. I think I have been afflicted with age or
>> something. The bits/persec crowd are absolutely correct, and I
>> am wrong.
>
> I don't think you need forgiveness - you just made a mistake.
>
>> So that leaves the real problem handling throughput of
>> approximately 1 char each 10 microsec.
>
> You need to handle an *average* of 1 character per 10 us. But
> the cost of handling each character is peanuts - even if the
> UART is on a slow bus, you should be able to read out characters
> at something like 20 per us. The cost is in the handling of the
> interrupt itself - context switches, cache misses, etc. That's
> why you use a UART with a buffer - it takes virtually the same
> time to read 128 bytes out the buffer during one interrupt, as
> to read 1 byte from the buffer during the interrupt. So if
> you've set your UART to give an interrupt every 100 characters,
> you get an interrupt every ms and read out a block of 100
> characters at a time.
That depends on your CPU speed. Within the interrupt, you have to
handle something like:

   REPEAT
      test for more in FIFO
      take one, stuff in buffer, while checking for buffer full.
      test for overflow or other errors.
      if any, call appropriate handler
   UNTIL FIFO empty
   clear interrupt system
   rearm interrupt
   exit

Note that some operations will require several accesses to the
UART. Those will eat up time. They will be much slower than
on-chip memory access.

--
[mail]: Chuck F (cbfalconer at maineline dot net)
[page]: <http://cbfalconer.home.att.net>
Try the download section.

--
Posted via a free Usenet account from http://www.teranews.com
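The REPEAT..UNTIL pseudocode above can be sketched in C roughly as follows. This is a minimal illustration only: the `struct uart`, its field names, and the buffer sizes are all hypothetical stand-ins for what would be memory-mapped registers on real hardware, where each access crosses the slow peripheral bus Chuck mentions.

```c
#include <assert.h>
#include <stdint.h>

#define RXBUF_SIZE 4096
#define FIFO_SIZE  128

/* Hypothetical UART model: on real hardware these would be
 * memory-mapped registers, and every access to them would be
 * much slower than an on-chip memory access. */
struct uart {
    uint8_t fifo[FIFO_SIZE]; /* rx FIFO contents            */
    int     rd;              /* next byte to read            */
    int     count;           /* bytes currently in the FIFO  */
    int     error_flags;     /* overrun/framing/parity bits  */
};

static uint8_t rxbuf[RXBUF_SIZE];
static int     rxbuf_head;
static int     error_seen;   /* recorded for non-ISR code    */

/* The REPEAT..UNTIL loop, in C: drain until the FIFO is empty,
 * checking buffer space and error flags for each byte. */
void uart_rx_isr(struct uart *u)
{
    while (u->count > 0) {               /* test for more in FIFO */
        uint8_t b = u->fifo[u->rd++ % FIFO_SIZE];
        u->count--;
        if (rxbuf_head < RXBUF_SIZE)     /* check for buffer full */
            rxbuf[rxbuf_head++] = b;
        if (u->error_flags) {            /* overflow/other errors */
            error_seen |= u->error_flags;
            u->error_flags = 0;
        }
    }
    /* on real hardware: clear and rearm the interrupt here */
}
```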
>> So that leaves the real problem handling throughput of
>> approximately 1 char each 10 microsec.
>
> That's where a large FIFO becomes important. Using Linux on an
> XScale (which, IIRC, is what the OP is using), I've done up to
> 460K baud without problems. But, that was using a UART with a
> 1K byte rx FIFO. That UART also allowed 32-bit wide accesses
> to the tx/rx FIFOs so that you could transfer 4 bytes per bus
> cycle.
>
> With a 128 byte FIFO and byte-wide access, the timing
> constraints are quite a bit tighter, but I think it should be
> doable if you carefully vet the other drivers that are running
> on the system.
Unfortunately, I need to rely on drivers written by Arcom. I've sent
some questions to their tech support, but am still waiting for a reply.

I was thinking... at 921.6 kbps (8/N/1 -> 92160 bytes/sec), if the
FIFO interrupt level is set at:

128 bytes: 720 interrupts/sec (1.4 ms/int), 10.8 us allowed for the
ISR to respond and empty the FIFO

64 bytes: 1440 interrupts/sec (694 us/int), ~700 us for the ISR to
respond/finish

(I'm still trying to look through the driver code to figure out where
the interrupt level is set.)

The XScale CPU I'm using runs at 400 MHz. (I've forgotten who asked,
but it's communicating with a TMS320F2812 DSP.) Hardware flow control
is not an option because it's not implemented in the Arcom hardware.

Do these interrupt frequencies sound reasonable for a non-realtime OS,
or is it hopeless as some of my coworkers here have suggested?

BTW, I'm going to give the guy who picked out this particular
hardware/software combo a really hard time. :P

Derek
On 2008-04-02, CBFalconer <cbfalconer@yahoo.com> wrote:

>>> So that leaves the real problem handling throughput of
>>> approximately 1 char each 10 microsec.
>>
>> You need to handle an *average* of 1 character per 10 us. But
>> the cost of handling each character is peanuts - even if the
>> UART is on a slow bus, you should be able to read out
>> characters at something like 20 per us. The cost is in the
>> handling of the interrupt itself - context switches, cache
>> misses, etc. That's why you use a UART with a buffer - it
>> takes virtually the same time to read 128 bytes out the buffer
>> during one interrupt, as to read 1 byte from the buffer during
>> the interrupt. So if you've set your UART to give an interrupt
>> every 100 characters, you get an interrupt every ms and read
>> out a block of 100 characters at a time.
>
> That depends on your CPU speed.
True. The OP is running an XScale with Linux, so I'd guess he's running at a couple hundred MHz.
> Within the interrupt, you have to handle something like:
>
>    REPEAT
>       test for more in FIFO
>       take one, stuff in buffer, while checking for buffer full.
>       test for overflow or other errors.
>       if any, call appropriate handler
>    UNTIL FIFO empty
>    clear interrupt system
>    rearm interrupt
>    exit
>
> Note that some operations will require several accesses to the
> UART. Those will eat up time. They will be much slower than
> on-chip memory access.
People have been supporting 921K bps serial links for ages. You do
have to pay attention to what you're doing, but it's really not that
hard with a sufficiently large FIFO. However, IMO a 128-byte FIFO is
getting close to being insufficiently large.

I wouldn't want to try to support it on a Linux system with interrupt
latencies imposed by a bunch of randomly chosen device drivers. If
it's an embedded system and you've got control over what other ISRs
are running, it should be doable.

--
Grant Edwards    grante at visi.com
Yow! I would like to urinate in an OVULAR, porcelain pool --
Grant Edwards wrote:

> People have been supporting 921K bps serial links for ages. You
> do have to pay attention to what you're doing, but it's really
> not that hard with a sufficiently large FIFO. However, IMO a
> 128 FIFO is getting close to being insufficiently large.
If the board has USB, he could interpose a serial-to-USB converter -
the FT2232 from FTDI has a 384 byte receive buffer, which should get
him down to 4 ms or so per interrupt.
> If the board has USB, he could interpose a serial-to-USB
> converter - the FT2232 from FTDI has a 384 byte receive buffer,
> which should get him down to 4 ms or so per interrupt.
That's a really good suggestion. During early development on a
Windows/Labview machine, I used a Quatech RS-422 to USB 2.0 converter
(with a 2K buffer) to get over this same problem.

But in the current hardware, there's not a lot of room for an adapter,
and I'm worried about getting a working driver for this particular
flavor of Linux. It took a couple of calls to Quatech to get their box
working in Windows. It was initially randomly duplicating bytes, so I
would get packets that were longer than expected!

Derek
On 2008-04-02, Derek Young <edu.mit.LL@dereky.nospam> wrote:

>>> So that leaves the real problem handling throughput of
>>> approximately 1 char each 10 microsec.
>>
>> That's where a large FIFO becomes important. Using Linux on
>> an XScale (which, IIRC, is what the OP is using), I've done
>> up to 460K baud without problems. But, that was using a UART
>> with a 1K byte rx FIFO. That UART also allowed 32-bit wide
>> accesses to the tx/rx FIFOs so that you could transfer 4
>> bytes per bus cycle.
>>
>> With a 128 byte FIFO and byte-wide access, the timing
>> constraints are quite a bit tighter, but I think it should be
>> doable if you carefully vet the other drivers that are
>> running on the system.
>
> Unfortunately, I need to rely on drivers written by Arcom.
> I've sent some questions to their tech support, but am still
> waiting for a reply.
>
> I was thinking... at 921.6 kbps (8/N/1 -> 92160 bytes/sec), if
> the FIFO interrupt level is set at:
>
> 128 bytes: 720 interrupts/sec (1.4 ms/int), 10.8 us allowed
> for the ISR to respond and empty the FIFO
That 10 us latency requirement is probably impossible.
> 64 bytes: 1440 interrupts/sec (694 us/int), ~700 us for the
> ISR to respond/finish
That might be possible as long as you can make sure there aren't any
other ISRs running that take more than a few tens of microseconds.
> (I'm still trying to look through the driver code to figure
> out where the interrupt level is set.)
>
> The XScale CPU I'm using runs at 400 MHz. (I've forgotten who
> asked, but it's communicating with a TMS320F2812 DSP.)
> Hardware flow control is not an option because it's not
> implemented in the Arcom hardware.
>
> Do these interrupt frequencies sound reasonable for a
> non-realtime OS, or is it hopeless as some of my coworkers
> here have suggested?
It's definitely pushing the limits pretty hard. With enough time and
effort you might be able to make it work, but all it would take is one
other ISR that runs for more than a few hundred microseconds and
you've lost data.

Whether the current architecture is acceptable depends on a number of
questions:

1) What are the consequences of losing data? Does somebody die, or is
   there merely a retry?

2) How much schedule risk is acceptable? Is it OK if it takes a year
   of hacking on the source code for a few kernel modules and a
   half-dozen different device drivers to get the overall ISR latency
   down?

3) How much redesign risk is acceptable? Is it OK if you work on it
   for three months before proving that it can't work and a better
   UART or different interface has to be chosen?

--
Grant Edwards    grante at visi.com
Yow! I'm having BEAUTIFUL THOUGHTS about the INSIPID WIVES of smug
and wealthy CORPORATE LAWYERS ...
sprocket wrote:
> Grant Edwards wrote:
>
>> People have been supporting 921K bps serial links for ages.
>> You do have to pay attention to what you're doing, but it's
>> really not that hard with a sufficiently large FIFO. However,
>> IMO a 128 FIFO is getting close to being insufficiently large.
>
> If the board has USB, he could interpose a serial-to-USB
> converter - the FT2232 from FTDI has a 384 byte receive buffer,
> which should get him down to 4 ms or so per interrupt.
I've got a board that uses one of these devices, with a ColdFire running at 150 MHz. I run the UART at about 2.5 Mbps (250 KB per second, if you prefer :-). I'm not running Linux, but there are plenty of other interrupts going on at rates of at least several kHz.
CBFalconer wrote:
> ....
> That depends on your CPU speed. Within the interrupt, you have
> to handle something like:
>
>    REPEAT
>       test for more in FIFO
>       take one, stuff in buffer, while checking for buffer full.
>       test for overflow or other errors.
So far OK, although it sounds more complex than it actually is - 128 bytes will be processed within 1000 CPU cycles easily, which would be taking all the time on a 1 MHz CPU...
> if any, call appropriate handler
You don't call any "handlers" from an IRQ service routine. You set a flag bit somewhere to indicate what happened and let the non-time critical code deal with it.
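That split - record in the ISR, act in background code - can be sketched in a few lines of C. The flag names here are illustrative, not from any real UART.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative error bits; real names depend on the UART. */
enum { ERR_OVERRUN = 1 << 0, ERR_FRAMING = 1 << 1 };

/* Written by the ISR, read and cleared by background code, so
 * it is volatile; on some targets the background read-and-clear
 * below would also need interrupts masked around it. */
static volatile uint8_t uart_errors;

/* ISR side: record what happened and return at once - no calls
 * into handlers from interrupt context. */
void isr_note_error(uint8_t flags)
{
    uart_errors |= flags;
}

/* Background side: pick the flags up when there is time. */
uint8_t poll_errors(void)
{
    uint8_t e = uart_errors;  /* snapshot             */
    uart_errors = 0;          /* clear for next round */
    return e;
}
```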
>    UNTIL FIFO empty
>    clear interrupt system
>    rearm interrupt
>    exit
That "rearm interrupt" is actually part of the return-from-interrupt
opcode on normal processors (perhaps even on Intel?), but generally
this is how it is typically done. The "call this or that" mistake
from a handler seems to be frequently made as well, of course.

Here is how it has to be done to keep latency really low:

begin IRQ handler:
   (save registers which will be changed)
   disable UART interrupt
   enable CPU interrupts
   empty the UART's FIFO into memory and flag error(s), if any detected
   disable CPU interrupts
   enable UART interrupt
   (restore saved registers)
   return from interrupt
end IRQ handler

This must be applied to *all* interrupt handlers in a system in order
to work, of course. The minimum latency gets a little worse, but the
maximum latency - which is the limiting factor - is dramatically
reduced.

Grant suggests this would not be that easy to do even if it is one
person working on it, and while I tend to agree with him based on what
I have witnessed over the last 20+ years, I must say I have been doing
things like that routinely over those 20+ years myself... :-).

Dimiter

------------------------------------------------------
Dimiter Popoff               Transgalactic Instruments
http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
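Dimiter's handler outline can be sketched in C as below. Everything here is a stand-in: on real hardware the enable/disable calls would be single register writes or CPU instructions, and the FIFO would be a device register, not an array. The stubs only track state so the ordering of the pattern can be seen.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for single register writes / CPU instructions on
 * real hardware; here they just record state. */
static int cpu_irqs_on = 0;  /* CPU-wide interrupt enable */
static int uart_irq_on = 1;  /* per-source (UART) enable  */
static void cpu_irq_enable(void)   { cpu_irqs_on = 1; }
static void cpu_irq_disable(void)  { cpu_irqs_on = 0; }
static void uart_irq_enable(void)  { uart_irq_on = 1; }
static void uart_irq_disable(void) { uart_irq_on = 0; }

/* Simulated rx FIFO and memory buffer (no wraparound handling
 * in this sketch). */
static uint8_t fifo[128], buf[4096];
static int fifo_rd, fifo_wr, buf_head;

/* The pattern: mask only this source, reopen the CPU to all
 * other interrupts while draining, then restore on the way out. */
void uart_rx_isr(void)
{
    /* (registers saved on entry by hardware/compiler) */
    uart_irq_disable();  /* this source cannot re-enter us    */
    cpu_irq_enable();    /* other ISRs may now preempt freely */

    while (fifo_rd != fifo_wr)          /* empty the UART FIFO */
        buf[buf_head++] = fifo[fifo_rd++];

    cpu_irq_disable();   /* atomic again for the final handoff */
    uart_irq_enable();
    /* return-from-interrupt restores the CPU interrupt state */
}
```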
CBFalconer wrote:
<snip>
> That depends on your CPU speed. Within the interrupt, you have
> to handle something like:
>
>    REPEAT
>       test for more in FIFO
>       take one, stuff in buffer, while checking for buffer full.
>       test for overflow or other errors.
>       if any, call appropriate handler
>    UNTIL FIFO empty
>    clear interrupt system
>    rearm interrupt
>    exit
>
> Note that some operations will require several accesses to the
> UART. Those will eat up time. They will be much slower than
> on-chip memory access.
This stuff is not magic - it's standard fare for embedded developers.
You seem determined to view the problem from the worst possible angle,
and pick the worst possible solution.

You do *not* have to check for overflows or other receive errors for
each byte (buffered UARTs provide summary flags, and you would
normally use higher level constructs, such as CRC checks, to check
correctness on a fast link). You do *not* have to check for space in
your buffer for each byte. At the start of the ISR, you ask the UART
how many bytes are in the FIFO buffer, and you check how much space
you have in the memory buffer. That tells you how often to execute
your loop.

The requirements for the read loop are so simple that on many 32-bit
microcontrollers, you can set up a DMA controller to handle it.
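David's description - query the fill level once, clamp to the space left, then run an unchecked tight loop - might look like the sketch below. The UART interface here (`uart_rx_level`, `uart_rx_read`) is hypothetical, modelled as plain functions over an array rather than device registers.

```c
#include <assert.h>
#include <stdint.h>

#define BUF_SIZE 4096

/* Hypothetical UART: a fill-level register and a data register,
 * modelled as functions over a plain array here. */
static uint8_t hw_fifo[128];
static int     hw_rd, hw_count;
static int     uart_rx_level(void) { return hw_count; }
static uint8_t uart_rx_read(void)
{
    hw_count--;
    return hw_fifo[hw_rd++];
}

static uint8_t buf[BUF_SIZE];
static int     buf_head;

/* Read the fill level once, clamp it to the space left in the
 * memory buffer, then run a tight loop with no per-byte error
 * or bounds checks. */
void uart_rx_isr(void)
{
    int n     = uart_rx_level();
    int space = BUF_SIZE - buf_head;
    if (n > space)
        n = space;   /* excess stays in the FIFO for now */
    for (int i = 0; i < n; i++)
        buf[buf_head++] = uart_rx_read();
}
```

A per-interrupt summary check of the UART's error flags would go after the loop, as David notes, rather than inside it.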
Grant Edwards wrote:
> On 2008-04-02, Derek Young <edu.mit.LL@dereky.nospam> wrote:
>
>>>> So that leaves the real problem handling throughput of
>>>> approximately 1 char each 10 microsec.
>>>
>>> That's where a large FIFO becomes important. Using Linux on
>>> an XScale (which, IIRC, is what the OP is using), I've done
>>> up to 460K baud without problems. But, that was using a UART
>>> with a 1K byte rx FIFO. That UART also allowed 32-bit wide
>>> accesses to the tx/rx FIFOs so that you could transfer 4
>>> bytes per bus cycle.
>>>
>>> With a 128 byte FIFO and byte-wide access, the timing
>>> constraints are quite a bit tighter, but I think it should be
>>> doable if you carefully vet the other drivers that are
>>> running on the system.
>>
>> Unfortunately, I need to rely on drivers written by Arcom.
>> I've sent some questions to their tech support, but am still
>> waiting for a reply.
>>
>> I was thinking... at 921.6 kbps (8/N/1 -> 92160 bytes/sec),
>> if the FIFO interrupt level is set at:
>>
>> 128 bytes: 720 interrupts/sec (1.4 ms/int), 10.8 us allowed
>> for the ISR to respond and empty the FIFO
>
> That 10 us latency requirement is probably impossible.
>
>> 64 bytes: 1440 interrupts/sec (694 us/int), ~700 us for the
>> ISR to respond/finish
>
> That might be possible as long as you can make sure there
> aren't any other ISRs running that take more than a few tens
> of microseconds.
>
>> (I'm still trying to look through the driver code to figure
>> out where the interrupt level is set.)
>>
>> The XScale CPU I'm using runs at 400 MHz. (I've forgotten who
>> asked, but it's communicating with a TMS320F2812 DSP.)
>> Hardware flow control is not an option because it's not
>> implemented in the Arcom hardware.
>>
>> Do these interrupt frequencies sound reasonable for a
>> non-realtime OS, or is it hopeless as some of my coworkers
>> here have suggested?
>
> It's definitely pushing the limits pretty hard.
> With enough time and effort you might be able to make it work,
> but all it would take is one other ISR that runs for more than
> a few hundred microseconds and you've lost data.
>
> Whether the current architecture is acceptable depends on a
> number of questions:
>
> 1) What are the consequences of losing data? Does somebody
>    die, or is there merely a retry?
>
> 2) How much schedule risk is acceptable? Is it OK if it takes
>    a year of hacking on the source code for a few kernel
>    modules and a half-dozen different device drivers to get
>    the overall ISR latency down?
>
> 3) How much redesign risk is acceptable? Is it OK if you work
>    on it for three months before proving that it can't work
>    and a better UART or different interface has to be chosen?
Absolutely.

So, for me, the dropped bytes mean that I'm losing approximately 3%
of my data packets, worst-case. I'm going to argue that this is okay
for now. It's just data. Each packet can be considered a retry. And I
can easily tell which packets are bad and resync.

I don't think I have the time (or expertise, really) to mess around
any more with the Linux kernel. I'm going to suggest redesigning the
hardware in the next version to avoid using the serial link. I'm also
going to suggest avoiding Linux (or any OS). It's really overkill for
this application and, as it turns out, quite a hassle.

Thanks everybody for all your input and advice.

Derek