Reply by wzab October 26, 2011
On 26 Oct, 16:56, wzab <wza...@gmail.com> wrote:

> Probably I should get rid of the whole, very good, but from my point
> of view too sophisticated atmel_serial.c driver, and instead I should
> write a minimalistic DMA based driver passing data directly from DMA
> buffer to mmapped memory in the user space...

Oops, of course I was mistaken. It should be a memory buffer allocated in the kernel driver and mmapped by that driver for the user-mode application. To move data directly into user-space memory I would have to use get_user_pages, but that again is too complex for this purpose ;-).
--
BR
Wojtek
Reply by wzab October 26, 2011
On 25 Oct, 00:39, Ulf Samuelsson <ulf_samuels...@invalid.telia.com> wrote:

> Linux does not set up the Matrix.
> This is (should be) done in at91bootstrap.
> The Atmel at91bootstrap hardly touches the matrix.
> The argument being that only the customer knows
> what the priorities should be.
> I checked at91bootstrap v2.13 into openembedded
> (www.openembedded.org), where I set up the matrix
> for the SAM9263.
> It was done as an example, but I have not verified
> the throughput details.
>
> BR
> Ulf Samuelsson
I have put the matrix setup code into the board initialization code
(board-mmnet1000c, probably derived from board-sam9260ek.c):

    static void __init ek_board_init(void)
    {
        /* Serial */
        at91_add_device_serial();
        /* USB Host */
        at91_add_device_usbh(&ek_usbh_data);
        /* USB Device */
        at91_add_device_udc(&ek_udc_data);
        /* SPI */
        at91_add_device_spi(ek_spi_devices, ARRAY_SIZE(ek_spi_devices));
        /* I2S */
        at91_add_device_ssc(AT91SAM9260_ID_SSC, ATMEL_SSC_RX);
        /* NAND */
        at91_add_device_nand(&ek_nand_data);
        /* Ethernet */
        at91_add_device_eth(&ek_macb_data);
        /* MMC */
        at91_add_device_mmc(0, &ek_mmc_data);
        /* I2C */
        at91_add_device_i2c(NULL, 0);
        /* LEDs */
        at91_gpio_leds(ek_leds, ARRAY_SIZE(ek_leds));
        /* Push Buttons */
        ek_add_device_buttons();
        /* Modify the Bus Matrix settings, logging the previous values */
        printk("<1>AT91_MATRIX_SCFG0 was %x \n", at91_sys_read(AT91_MATRIX_SCFG0));
        at91_sys_write(AT91_MATRIX_SCFG0, 0x010a0010);
        printk("<1>AT91_MATRIX_SCFG1 was %x \n", at91_sys_read(AT91_MATRIX_SCFG1));
        at91_sys_write(AT91_MATRIX_SCFG1, 0x010a0010);
        printk("<1>AT91_MATRIX_PRAS0 was %x \n", at91_sys_read(AT91_MATRIX_PRAS0));
        at91_sys_write(AT91_MATRIX_PRAS0, 0x00200300);
        printk("<1>AT91_MATRIX_PRAS1 was %x \n", at91_sys_read(AT91_MATRIX_PRAS1));
        at91_sys_write(AT91_MATRIX_PRAS1, 0x00200300);
    }

I have also checked that my ioremapped buffers do not require DMA synchronization.

Anyway, the problem has not been successfully solved, as the overhead introduced by the whole tty-related layer of the atmel_serial driver is simply too big. When I receive data at up to ca. 170 kB/s, everything works fine. When the data stream reaches the desired 340 kB/s, some data are lost, probably not between the USART and SDC but due to software buffer overruns.

Probably I should get rid of the whole atmel_serial.c driver, which is very good but too sophisticated from my point of view, and instead write a minimalistic DMA-based driver passing data directly from the DMA buffer to mmapped memory in user space...

Thanks for the help in resolving the bus matrix related problems.
--
BR
Wojtek
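
To see what the values written in the board code above select, the register words can be split into fields in a small user-space program. This is only a sketch: the bit positions (SLOT_CYCLE in bits 0-7, DEFMSTR_TYPE in bits 16-17, FIXED_DEFMSTR in bits 18-20, ARBT in bits 24-25, and one 2-bit priority per master every 4 bits in PRAS) are assumed from the at91sam9260_matrix.h header linked elsewhere in this thread and should be checked against the datasheet. The struct and function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a MATRIX_SCFGx word (bit positions are an assumption, see text). */
struct matrix_scfg {
    unsigned slot_cycle;    /* bits 0-7:   maximum burst slot length in cycles */
    unsigned defmstr_type;  /* bits 16-17: default master type */
    unsigned fixed_defmstr; /* bits 18-20: fixed default master number */
    unsigned arbt;          /* bits 24-25: arbitration type */
};

/* Split a slave configuration word into its assumed fields. */
struct matrix_scfg decode_scfg(uint32_t v)
{
    struct matrix_scfg f;

    f.slot_cycle    = v & 0xffu;
    f.defmstr_type  = (v >> 16) & 0x3u;
    f.fixed_defmstr = (v >> 18) & 0x7u;
    f.arbt          = (v >> 24) & 0x3u;
    return f;
}

/* Priority of one master in a MATRIX_PRASx word (assumed: 2 bits every 4 bits). */
unsigned pras_priority(uint32_t v, unsigned master)
{
    assert(master < 8); /* a 32-bit word holds at most eight 4-bit slots */
    return (v >> (4u * master)) & 0x3u;
}
```

Under these assumptions, 0x010a0010 decodes to a slot cycle of 16, default master type 2, fixed default master number 2 and ARBT = 1, while 0x00200300 gives master 2 priority 3 and master 5 priority 2. Which bus master each number stands for has to be taken from the datasheet.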
Reply by Ulf Samuelsson October 24, 2011
wzab wrote 2011-10-23 21:48:
> I have changed the implementation of atmel_rx_from_dma to use
> __dma_single_cpu_to_dev and __dma_single_dev_to_cpu for the USART which
> uses an RX buffer in SRAM. However, I then get the kernel BUG at:
> http://lxr.linux.no/linux+v3.0.4/arch/arm/mm/dma-mapping.c#L451
>
> Investigating the problem more thoroughly, I found that my buffer
> is allocated at 0x200000 (SRAM0) and ioremapped to 0xc4890000, while
> high_memory is at 0xc4000000. Therefore my buffer address fails the
> address validity test at
> http://lxr.linux.no/linux+v3.0.4/arch/arm/include/asm/memory.h#L292
>
> I checked whether the location of my buffer above high_memory
> makes synchronisation unnecessary (in fact I thought that virtual
> memory returned by ioremap is accessed bypassing the cache
> mechanisms), but when I simply skipped the synchronization routines, I
> still got a corrupted data stream.
>
> So either my buffer located in SRAM0 still requires DMA
> synchronization (but how to provide it?), or, even though it is located
> in SRAM, I still lose characters due to "traffic jams" in the Bus
> Matrix (as Ulf suggested a few posts above).
> If the latter is the case, I should probably boost the priority of the
> USART in the Bus Matrix, but this does not seem to be well
> documented :-(.
> I have found http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/include/mach/at91sam9260_matrix.h
> but the only places where the matrix seems to be used are:
> http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/at91sam9260_devices.c#L418
> and http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/at91sam9260_devices.c#L1275
>
> Does this mean that Linux on the AT91SAM9260 uses the Bus Matrix in its
> default state?
> --
> TIA & Regards,
> Wojtek

Linux does not set up the Matrix. This is (should be) done in at91bootstrap. The Atmel at91bootstrap hardly touches the matrix, the argument being that only the customer knows what the priorities should be.

I checked at91bootstrap v2.13 into openembedded (www.openembedded.org), where I set up the matrix for the SAM9263. It was done as an example, but I have not verified the throughput details.

BR
Ulf Samuelsson
Reply by wzab October 23, 2011
I have changed the implementation of atmel_rx_from_dma to use
__dma_single_cpu_to_dev and __dma_single_dev_to_cpu for the USART which
uses an RX buffer in SRAM. However, I then get the kernel BUG at:
http://lxr.linux.no/linux+v3.0.4/arch/arm/mm/dma-mapping.c#L451

Investigating the problem more thoroughly, I found that my buffer
is allocated at 0x200000 (SRAM0) and ioremapped to 0xc4890000, while
high_memory is at 0xc4000000. Therefore my buffer address fails the
address validity test at
http://lxr.linux.no/linux+v3.0.4/arch/arm/include/asm/memory.h#L292
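
The failing test can be modelled in user space with the addresses quoted above. This is only a sketch: it assumes the check has the form PAGE_OFFSET <= addr < high_memory, applied to both the first and the last byte of the buffer, and the value PAGE_OFFSET = 0xc0000000 is an assumption (the usual ARM default), not something stated in this thread. The function name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: default 3G/1G split; the real value depends on the kernel config. */
#define ASSUMED_PAGE_OFFSET 0xc0000000u

/* True if a single kernel virtual address lies in the directly mapped region. */
static bool addr_in_lowmem(uint32_t kaddr, uint32_t high_memory)
{
    return kaddr >= ASSUMED_PAGE_OFFSET && kaddr < high_memory;
}

/* Model of the validity test: both ends of the buffer must be in lowmem. */
bool buffer_passes_lowmem_test(uint32_t kaddr, uint32_t size, uint32_t high_memory)
{
    assert(size > 0);
    return addr_in_lowmem(kaddr, high_memory) &&
           addr_in_lowmem(kaddr + size - 1u, high_memory);
}
```

With high_memory at 0xc4000000, an ioremapped address such as 0xc4890000 fails this test for any buffer length, while an address like 0xc3000000 passes, which matches the behaviour described above.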

I checked whether the location of my buffer above high_memory
makes synchronisation unnecessary (in fact I thought that virtual
memory returned by ioremap is accessed bypassing the cache
mechanisms), but when I simply skipped the synchronization routines, I
still got a corrupted data stream.

So either my buffer located in SRAM0 still requires DMA
synchronization (but how to provide it?), or, even though it is located
in SRAM, I still lose characters due to "traffic jams" in the Bus
Matrix (as Ulf suggested a few posts above).
If the latter is the case, I should probably boost the priority of the
USART in the Bus Matrix, but this does not seem to be well
documented :-(.
I have found http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/include/mach/at91sam9260_matrix.h
but the only places where the matrix seems to be used are:
http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/at91sam9260_devices.c#L418
and http://lxr.linux.no/linux+v3.0.4/arch/arm/mach-at91/at91sam9260_devices.c#L1275

Does this mean that Linux on the AT91SAM9260 uses the Bus Matrix in its
default state?
--
TIA & Regards,
Wojtek
Reply by wzab October 23, 2011
On Oct 22, 11:51 pm, wzab <wza...@gmail.com> wrote:

> Maybe for the buffer ioremapped from the SRAM memory I should simply
> directly call consistent_sync (as it is done in fact in the
> dma_sync_single_for... routines:
> http://lxr.linux.no/linux+v3.0.4/arch/xtensa/include/asm/dma-mapping....
> )?

Oops, sorry, I had not noticed that consistent_sync does not exist for the ARM architecture. The implementation of dma_sync_single... for ARM is different:
http://lxr.linux.no/linux+v3.0.4/arch/arm/include/asm/dma-mapping.h#L503
--
Wojtek
Reply by wzab October 22, 2011
> Aren't there any cache mechanisms active for SRAM memory?
> When the PDC stores something to the SRAM buffer is it immediately
> visible for CPU, or should I invalidate associated cache?

Maybe for the buffer ioremapped from SRAM I should simply call consistent_sync directly (as is in fact done in the dma_sync_single_for... routines:
http://lxr.linux.no/linux+v3.0.4/arch/xtensa/include/asm/dma-mapping.h#L101
)?
--
BR & TIA,
Wojtek
Reply by wzab October 22, 2011
> Anything running high speed with the PDC needs to understand
> that the PDC needs access to the bus *often*.
> At 6 Mbps and 10 bits per character, start, 8 bit data, stop,
> it will take 1+2/3 us to handle one character.
> the PDC will request the bus, when there is a byte in the holding
> register. This byte needs to be moved to the memory before the next
> byte is received.
> If any group of peripheral occupies the bus for more than 1,67 us,
> then you lose characters on reception.
> If you put the receive buffer in SRAM, then the bus matrix
> will be of great help, as long as nothing else is eating up
> ALL the bandwidth to the bus. A PDC transfer should then
> only take 10 ns.
>
> It should be OK to have the transmit buffer in SDRAM, saving
> the precious 4 kB of SRAM that normally is used for data.
> You might also consider setting up the bus matrix
> to prioritize the PDC.
> This is not done by default.

Well, I have managed to allocate the DMA RX buffer for the particular USART in SRAM. It required a small modification in atmel_serial.c. However, now of course I get a kernel panic when atmel_tasklet_func calls atmel_rx_from_dma, which in turn calls dma_sync_single_for_cpu. As no DMA mapping exists for the ioremapped memory, this led to the panic.

I can check whether the currently serviced USART uses a buffer in SRAM and bypass the calls to the synchronization functions, but I'm not sure whether that is safe on this platform. Aren't there any cache mechanisms active for SRAM memory? When the PDC stores something to the SRAM buffer, is it immediately visible to the CPU, or should I invalidate the associated cache?
--
BR & TIA,
Wojtek
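
The "check whether the buffer is in SRAM" step mentioned above could look roughly like the range test below. It is a sketch under stated assumptions: the SRAM0 base 0x200000 is the address quoted elsewhere in this thread, the 4 KiB size is assumed from the "4 kB of SRAM" remark, and the macro and function names are hypothetical. It only decides which code path a driver would take; it says nothing about whether skipping the sync calls is actually safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SRAM0_PHYS_BASE 0x00200000u /* from the thread: buffer allocated at 0x200000 */
#define SRAM0_SIZE      0x1000u     /* assumption: 4 KiB of internal SRAM */

/* True if the whole buffer [phys, phys + len) lies inside SRAM0. */
bool buf_in_sram0(uint32_t phys, uint32_t len)
{
    uint32_t off;

    assert(len > 0);
    if (phys < SRAM0_PHYS_BASE)
        return false;
    off = phys - SRAM0_PHYS_BASE;
    /* Written this way so that phys + len can never overflow. */
    return off < SRAM0_SIZE && len <= SRAM0_SIZE - off;
}
```

A driver could call such a helper once when the RX buffer is set up and store the result in a flag, rather than testing the address on every tasklet run.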
Reply by Ulf Samuelsson October 17, 2011
wzab wrote 2011-10-16 23:31:
> On Oct 16, 8:56 pm, Ulf Samuelsson<ulf_samuels...@invalid.telia.com>
> wrote:
>>
>> The new AT91SAM9Gx5 chips have get rid of the PDC in favour
>> of a real DMA controller with built in FIFO.
>> This will allow burst access to the SDRAM.
>> With the pricing of the SDRAM going up, the DDR2 interface
>> of these chips, will make them more cost effective.
>>
>> BR
>> Ulf Samuelsson
>
> Yes, I know, but unfortunately my hardware platform is fixed, and I
> really have to force THIS hardware (AT91SAM9260) to provide required
> performance...
>
> BR
> Wojtek
Anything running at high speed with the PDC has to take into account that the PDC needs access to the bus *often*. At 6 Mbps and 10 bits per character (start, 8 data bits, stop), it takes 1 2/3 us to handle one character. The PDC requests the bus when there is a byte in the holding register. This byte needs to be moved to memory before the next byte is received. If any group of peripherals occupies the bus for more than 1.67 us, you lose characters on reception.

If you put the receive buffer in SRAM, the bus matrix will be of great help, as long as nothing else is eating up ALL the bandwidth to the bus. A PDC transfer should then take only 10 ns.

It should be OK to have the transmit buffer in SDRAM, saving the precious 4 kB of SRAM that normally is used for data. You might also consider setting up the bus matrix to prioritize the PDC. This is not done by default.

If you have to use Ethernet, you might be screwed unless you put the receive buffer in SRAM.

BR
Ulf Samuelsson
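
The timing budget described above is simple arithmetic and can be reproduced with a few lines of C. This is only an illustrative sketch: the function names are made up, and the 10 ns figure is taken from the post as an estimate, not measured.

```c
#include <assert.h>

/* Time one character occupies the line, in nanoseconds.
 * bits_per_char counts every bit on the wire (start + data + stop). */
double char_time_ns(double bit_rate_bps, unsigned bits_per_char)
{
    assert(bit_rate_bps > 0.0);
    return (double)bits_per_char * 1e9 / bit_rate_bps;
}

/* Fraction of one character time that a single PDC transfer takes. */
double pdc_share_of_char_time(double transfer_ns, double char_ns)
{
    assert(char_ns > 0.0);
    return transfer_ns / char_ns;
}
```

For 6 Mbps and 10 bits per character, char_time_ns gives about 1667 ns, which is the deadline within which the PDC must get the bus. A 10 ns transfer into SRAM uses well under 1 % of that window, so the transfer itself is cheap; the risk comes from other masters holding the bus longer than the window.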
Reply by wzab October 16, 2011
On Oct 16, 8:56 pm, Ulf Samuelsson <ulf_samuels...@invalid.telia.com> wrote:
> The new AT91SAM9Gx5 chips have get rid of the PDC in favour
> of a real DMA controller with built in FIFO.
> This will allow burst access to the SDRAM.
> With the pricing of the SDRAM going up, the DDR2 interface
> of these chips, will make them more cost effective.
>
> BR
> Ulf Samuelsson

Yes, I know, but unfortunately my hardware platform is fixed, and I really have to force THIS hardware (AT91SAM9260) to provide the required performance...

BR
Wojtek
Reply by Ulf Samuelsson October 16, 2011
Jukka Marin wrote 2011-10-15 12:00:
> On 2011-10-15, wzab<wzab01@gmail.com> wrote:
>> I have successfully modified the driver to work in synchronous mode
>> (see thread: https://groups.google.com/group/comp.arch.embedded/browse_thread/thread/503dd4dfd723e74f
>> )
>> However then I have faced strange problems at high data rates.
>> I use USART with DMA. The application sets tty into "raw" mode.
>> Anyway at 6Mb/s I've found, that some data are lost, but neither
>> errors nor even messages in system log were generated.
>
> I don't know about the driver you are using, but we've had severe problems
> with Atmel chips and SPI running at high speed. Atmel is proud to have DMA
> for various peripherals, but they are too stupid to implement any kind of
> FIFO buffers. If the MCU bus gets too busy, the DMA system can steal no
> bus cycles and it simply drops the received data without telling anyone.
> I guess the same problem affects data transmission.
>
> And yes, this really happens. We had to change our hardware and stop using
> SPI at all. The good old Motorola MCUs (68k series, e.g.) have both DMA
> _and_ FIFO and so they work reliably. Wish Atmel borrowed a brain and fixed
> their chips.
>
> You may try to place the serial buffer in the internal MCU RAM - it has
> larger bandwidth and on the newer chips, it can be accessed even while the
> external memory bus is busy.
>
> Good luck.
>
> -jm
The new AT91SAM9Gx5 chips have got rid of the PDC in favour of a real DMA controller with a built-in FIFO. This will allow burst access to the SDRAM. With the price of SDRAM going up, the DDR2 interface of these chips will make them more cost-effective.

BR
Ulf Samuelsson