atmel_serial driver modifications for 6Mb/s throughput in synchronous mode

Started by wzab October 15, 2011
Hi,

I need to transmit serial data at 6 Mb/s through the AT91SAM9260
USART.
I have successfully modified the driver to work in synchronous mode
(see thread: https://groups.google.com/group/comp.arch.embedded/browse_thread/thread/503dd4dfd723e74f).
However, I then ran into strange problems at high data rates.
I use the USART with DMA. The application sets the tty into "raw" mode.
At 6 Mb/s I found that some data are lost, but no errors were reported
and no messages appeared in the system log.
I suspected a DMA buffer overrun (the 512-byte buffer holds only
ca. 580 µs of data at 6 Mb/s, so the user-space application may have
trouble reading the data in time).
Therefore I changed the length of this buffer (the PDC_BUFFER_SIZE
constant) from the original 512 to 65536.
However, after this modification the transmission was even worse.
Maybe some parts of the code rely on the DMA buffer being, e.g., no
bigger than one memory page?
Maybe some pointers used to access this buffer have only a limited
number of bits?
Has anyone faced similar problems with the atmel_serial driver?

Looking at the amount of code associated with the TTY layer
surrounding the serial driver, I'm considering dropping the
atmel_serial driver for this particular USART port and writing my own
lightweight, optimized, DMA-based driver for high-speed transmission
of raw data. Maybe such a solution already exists?
--
TIA & Regards,
Wojtek
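
As a rough cross-check of the buffer-time estimate in the post above, here is a minimal sketch. The helper name buffer_fill_time_us and the framing values are assumptions for illustration and are not taken from the driver. With 8 bits on the wire per byte, a 512-byte buffer fills in about 683 us at 6 Mb/s; with 10 bits per byte it takes about 853 us. Both are the same order of magnitude as the figure quoted, and a 65536-byte buffer would hold roughly 87 ms of data under the 8-bit assumption.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Time in microseconds needed to fill a receive buffer.
 *   bytes         - buffer length (e.g. PDC_BUFFER_SIZE)
 *   bits_per_char - bits on the wire per byte (ASSUMPTION: 8 without
 *                   framing bits, 10 with start and stop bits)
 *   bit_rate      - line rate in bits per second
 */
double buffer_fill_time_us(uint32_t bytes, uint32_t bits_per_char,
                           double bit_rate)
{
    assert(bit_rate > 0.0);
    return (double)bytes * (double)bits_per_char / bit_rate * 1e6;
}
```

The result is the upper bound on how long user space may take to drain the buffer before the DMA wraps around, assuming the data stream is continuous.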
On 2011-10-15, wzab <wzab01@gmail.com> wrote:
> I have successfully modified the driver to work in synchronous mode
> (see thread: https://groups.google.com/group/comp.arch.embedded/browse_thread/thread/503dd4dfd723e74f
> )
> However then I have faced strange problems at high data rates.
> I use USART with DMA. The application sets tty into "raw" mode.
> Anyway at 6Mb/s I've found, that some data are lost, but neither
> errors nor even messages in system log were generated.

I don't know about the driver you are using, but we've had severe problems
with Atmel chips and SPI running at high speed. Atmel is proud to have DMA
for various peripherals, but they are too stupid to implement any kind of
FIFO buffers. If the MCU bus gets too busy, the DMA system can steal no
bus cycles and it simply drops the received data without telling anyone.
I guess the same problem affects data transmission.

And yes, this really happens. We had to change our hardware and stop using
SPI altogether. The good old Motorola MCUs (the 68k series, e.g.) have both
DMA _and_ FIFO, and so they work reliably. Wish Atmel borrowed a brain and
fixed their chips.

You may try to place the serial buffer in the internal MCU RAM - it has
larger bandwidth and on the newer chips, it can be accessed even while the
external memory bus is busy.

Good luck.

  -jm
On Oct 15, 12:00 pm, Jukka Marin <jma...@pyy.embedtronics.fi> wrote:
> [...]
> If the MCU bus gets too busy, the DMA system can steal no
> bus cycles and it simply drops the received data without telling anyone.
> I guess the same problem affects data transmission.
> [...]
> You may try to place the serial buffer in the internal MCU RAM - it has
> larger bandwidth and on the newer chips, it can be accessed even while the
> external memory bus is busy.
Thanks a lot, it seems that you are right.
So I'll need to move my USART DMA buffer to the internal SRAM.
As far as I can see, there are two areas of 4 KB each at physical
addresses 0x20000-0x20fff and 0x30000-0x30fff.
I have found a structure, at91sam9260_sram_desc, describing how this area
should be mapped by the Linux system:
http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/at91sam9260.c#L37

However, I can't see any support for allocating DMA buffers from this
area :-(. There is no special zone (well, for memory consisting of one
page it makes no sense to define such a zone ;-) ).

Does it mean that I should allocate the buffer myself with something like:

  request_mem_region(AT91_IO_VIRT_BASE - AT91SAM9260_SRAM0_SIZE, PDC_BUFFER_SIZE, "usart")

? I wouldn't like to tie my buffer to the first PDC_BUFFER_SIZE bytes of
this area. (There may be other drivers using the SRAM for their high-speed
buffers, e.g. the MAC.)
Is there any memory management provided for allocating a small part of
the SRAM?

I've found this thread:
http://www.at91.com/forum/viewtopic.php/f,12/t,4991/ ,
but it didn't provide me with all the needed info...

In fact, allocating DMA buffers in SRAM for all USARTs would be
overkill. I have only one USART running at 6 Mb/s, so it seems that I'll
need to customize the platform code for USART allocation anyway...

I'll appreciate any hints...
--
TIA & Regards,
Wojtek
OK. I have found a patch with a full implementation of an Ethernet TX
buffer in SRAM:
ftp://www.at91.com/pub/linux/2.6.27-at91/2.6.27-at91-exp.patch.gz
It makes it a little clearer how to allocate the buffer:

+#if defined(CONFIG_ARCH_AT91) && defined(CONFIG_MACB_TX_SRAM)
+#if  defined(CONFIG_ARCH_AT91SAM9260)
+	if (request_mem_region(AT91SAM9260_SRAM0_BASE, TX_DMA_SIZE, "macb")) {
+		bp->tx_ring_dma = AT91SAM9260_SRAM0_BASE;
+	} else {
+		if (request_mem_region(AT91SAM9260_SRAM1_BASE, TX_DMA_SIZE, "macb")) {
+			bp->tx_ring_dma = AT91SAM9260_SRAM1_BASE;
+		} else {
+			printk(KERN_WARNING "Cannot request SRAM memory for TX ring, already used\n");
+			return -EBUSY;
+		}
+	}
+

However it doesn't explain how to avoid collisions between different
drivers needing to use buffers in SRAM.
E.g. in my case (Ethernet connected data acquisition using USART via
optoisolation to receive data from AD converters) I need both Ethernet
and USART to have buffers in internal SRAM...
--
Regards,
WZab
I have found the patch: http://lists.infradead.org/pipermail/linux-arm-kernel/2011-April/048279.html
which makes use of http://lxr.linux.no/linux+v3.0.4/include/linux/genalloc.h
Maybe this is the right way to go.
I'll share my results if I find a satisfactory solution...
--
Regards,
Wojtek
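
To illustrate the idea behind a shared SRAM pool (several drivers carving non-overlapping buffers out of one 4 KB bank instead of each claiming a fixed address), here is a small user-space model. It is only a sketch: the names sram_pool, sram_pool_alloc and sram_pool_free and the 64-byte granule are hypothetical, and this is not the genalloc API. In the kernel, the corresponding role would be played by the gen_pool_create(), gen_pool_add(), gen_pool_alloc() and gen_pool_free() functions declared in genalloc.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SRAM_SIZE     4096u                 /* one 4 KB SRAM bank */
#define GRANULE       64u                   /* hypothetical allocation unit */
#define NUM_GRANULES  (SRAM_SIZE / GRANULE)

struct sram_pool {
    unsigned char used[NUM_GRANULES];       /* 1 = granule is allocated */
};

void sram_pool_init(struct sram_pool *p)
{
    memset(p->used, 0, sizeof(p->used));
}

/* First-fit allocation. Returns a byte offset into the bank, or -1. */
long sram_pool_alloc(struct sram_pool *p, size_t size)
{
    size_t need = (size + GRANULE - 1) / GRANULE;
    size_t run = 0;

    assert(size > 0);
    for (size_t i = 0; i < NUM_GRANULES; i++) {
        run = p->used[i] ? 0 : run + 1;     /* length of current free run */
        if (run == need) {
            size_t first = i + 1 - need;
            memset(&p->used[first], 1, need);
            return (long)(first * GRANULE);
        }
    }
    return -1;                              /* no free run is long enough */
}

/* Releases a region previously returned by sram_pool_alloc(). */
void sram_pool_free(struct sram_pool *p, long offset, size_t size)
{
    size_t first = (size_t)offset / GRANULE;
    size_t need = (size + GRANULE - 1) / GRANULE;

    assert(offset >= 0 && first + need <= NUM_GRANULES);
    memset(&p->used[first], 0, need);
}
```

In this model an Ethernet driver could take, say, 2048 bytes and the USART driver 512 bytes from the same bank, and each would get a distinct offset without knowing about the other.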
Jukka Marin wrote on 2011-10-15 12:00:
> I don't know about the driver you are using, but we've had severe problems
> with Atmel chips and SPI running at high speed. Atmel is proud to have DMA
> for various peripherals, but they are too stupid to implement any kind of
> FIFO buffers. If the MCU bus gets too busy, the DMA system can steal no
> bus cycles and it simply drops the received data without telling anyone.
> I guess the same problem affects data transmission.
> [...]
The new AT91SAM9Gx5 chips have got rid of the PDC in favour
of a real DMA controller with a built-in FIFO.
This will allow burst access to the SDRAM.
With the price of SDRAM going up, the DDR2 interface
of these chips will make them more cost-effective.

BR
Ulf Samuelsson
On Oct 16, 8:56 pm, Ulf Samuelsson <ulf_samuels...@invalid.telia.com>
wrote:
> The new AT91SAM9Gx5 chips have get rid of the PDC in favour
> of a real DMA controller with built in FIFO.
> This will allow burst access to the SDRAM.
> With the pricing of the SDRAM going up, the DDR2 interface
> of these chips, will make them more cost effective.
>
> BR
> Ulf Samuelsson

Yes, I know, but unfortunately my hardware platform is fixed, and I
really have to force THIS hardware (AT91SAM9260) to provide the required
performance...

BR
Wojtek
wzab wrote on 2011-10-16 23:31:
> Yes, I know, but unfortunately my hardware platform is fixed, and I
> really have to force THIS hardware (AT91SAM9260) to provide required
> performance...
>
> BR
> Wojtek
Anything running at high speed with the PDC has to take into account
that the PDC needs access to the bus *often*.
At 6 Mbps and 10 bits per character (start, 8 data bits, stop),
it takes 1 2/3 us to handle one character.
The PDC requests the bus when there is a byte in the holding
register. This byte needs to be moved to memory before the next
byte is received.
If any group of peripherals occupies the bus for more than 1.67 us,
you lose characters on reception.
If you put the receive buffer in SRAM, the bus matrix
will be of great help, as long as nothing else is eating up
ALL the bandwidth to the bus. A PDC transfer should then
take only 10 ns.

It should be OK to have the transmit buffer in SDRAM, saving
the precious 4 kB of SRAM that normally is used for data.
You might also consider setting up the bus matrix
to prioritize the PDC.
This is not done by default.

If you have to use Ethernet, you might be screwed
unless you put the receive buffer in SRAM.

BR
Ulf Samuelsson
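
The timing argument above can be written down as a small calculation. This is a sketch under a simplified model taken from the post (a single holding register, so a byte is lost whenever the bus is unavailable for longer than one character time); the function names are made up for illustration.

```c
#include <assert.h>

/* Time in nanoseconds that one character occupies on the line. */
double char_time_ns(double bit_rate, unsigned bits_per_char)
{
    assert(bit_rate > 0.0);
    return (double)bits_per_char / bit_rate * 1e9;
}

/*
 * Returns 1 if a bus stall of stall_ns would cause a lost character,
 * 0 otherwise. SIMPLIFIED MODEL: the PDC must move the byte out of the
 * holding register before the next character has been fully received.
 */
int stall_loses_data(double stall_ns, double bit_rate,
                     unsigned bits_per_char)
{
    return stall_ns > char_time_ns(bit_rate, bits_per_char);
}
```

For 6 Mbps and 10 bits per character, char_time_ns() gives about 1667 ns, so under this model a 10 ns transfer fits comfortably while a 2 us bus stall does not.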
> [...]
> If you put the receive buffer in SRAM, then the bus matrix
> will be of great help, as long as nothing else is eating up
> ALL the bandwidth to the bus. A PDC transfer should then
> only take 10 ns.
>
> It should be OK to have the transmit buffer in SDRAM, saving
> the precious 4 kB of SRAM that normally is used for data.
> You might also consider setting up the bus matrix
> to prioritize the PDC.
> This is not done by default.

Well, I have managed to allocate the DMA RX buffer for the particular
USART in SRAM. It required a small modification in atmel_serial.c.
However, now of course I get a kernel panic when atmel_tasklet_func
calls atmel_rx_from_dma, which in turn calls dma_sync_single_for_cpu.
As the mapping does not exist for ioremapped memory, this led to the
kernel panic.

Well, I can check whether the currently serviced USART uses a buffer in
SRAM and bypass the calls to the synchronization functions, but I'm not
sure whether that is safe on this platform.
Aren't there any cache mechanisms active for the SRAM memory?
When the PDC stores something to the SRAM buffer, is it immediately
visible to the CPU, or should I invalidate the associated cache?
--
BR & TIA,
Wojtek
> Aren't there any cache mechanisms active for SRAM memory?
> When the PDC stores something to the SRAM buffer is it immediately
> visible for CPU, or should I invalidate associated cache?

Maybe for the buffer ioremapped from the SRAM memory I should simply
call consistent_sync directly (as is in fact done in the
dma_sync_single_for... routines:
http://lxr.linux.no/linux+v3.0.4/arch/xtensa/include/asm/dma-mapping.h#L101
)?
--
BR & TIA,
Wojtek
