The 2024 Embedded Online Conference

Flat Memory DDR3 Access on the ARM Cortex A8

Started by Randy Yates January 12, 2016
I'm trying to kick out a simple (ha ha!) assembly language project in
which I perform some simple DRAM tests on a board based on the TI AM3352
using a 512 MB DRAM (256M x 16).

Ideally I'd like to just treat DRAM as a flat address space from
0x80000000 to 0x82000000.

  1. Can I just "turn off" the MMU?

  2. Should I keep data cache turned off as well?

  3. Are there any access sequence restrictions? E.g., can I just
  read/write each 16-bit value sequentially?

  4. Any other gotchas?

I believe I've got the EMIF/DDR3 controller initialized properly (I used
the u-boot code for the beaglebone black), but I can't get the test code
proper running. I'm getting DABORT exceptions on the _second_ time
through the write loop in the following code.

Any help would be appreciated.

--Randy


#include "asm-defs/asm-defs.h"
#include "asm-defs/prcm.h"
#include "asm-defs/emif.h"
#include "asm-defs/control.h"
#include "asm-defs/cm.h"

                .cpu        cortex-a8
                .text

//----------------------------------------
// Randy Yates
// 
// entry:
//   r0 = base of DRAM test. must be even since DRAM is 16-bit words
//   r2 = count of 16-bit words to test
//   r3 = value to write (lower 16-bits)

// routine usage
//   r1 = current offset into r0 (base of dram test)
//   r4 = value read back

// return:
//   r7 = result (boolean: test successful)
//              0x01 = test passed
//              0x00 = test failed
//----------------------------------------

.fun ddr3_value_test
                push        {lr}

                mov         r1, 0

// first write all values
ddr3_value_test_write_loop:
                str.n       r3, [r0, r1]                // <== DABORT error here on second time through the loop 
                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_write_loop

// now read back values
                mov         r1, 0
ddr3_value_test_read_loop:
                ldr.n       r4, [r0, r1]
                subs        r4, r3
                bne         ddr3_value_test_fail

                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_read_loop

// test passed:
                mov         r7, 0x01
                pop         {pc}

// test failed:
ddr3_value_test_fail:
                mov         r7, 0
                pop         {pc}


-- 
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
On 12.1.16 06:21, Randy Yates wrote:
> [...]
This is just a subroutine. Which is the environment it is run
in (Linux)?

If there is an operating system, you have to negotiate the
MMU setup with it.

It depends on the tolerance of the assembler used if the
instruction 'mov r1,0' is accepted and interpreted as you
seem to like. Please try 'mov r1,#0' instead.

There is actually no need to store the link register. If the
system uses the EABI conventions, you should push and pop
an even number of registers to maintain the 8 byte alignment
of the stack pointer.

The standard ARM convention is to transfer first 4 arguments
in registers r0 - r3, and the return value in r0. For the
standard conventions, a subroutine is allowed to clobber
r0 to r3 as it likes.

-- 
-TV
Hi Tauno,

Thank you for your responses. Please see my comments 
below.

Tauno Voipio <tauno.voipio@notused.fi.invalid> writes:

> On 12.1.16 06:21, Randy Yates wrote:
>> [...]
>
> This is just a subroutine. Which is the environment it is run
> in (Linux)?
No, bare metal assembly. No operating system, no C, no C library.
I perform my own startup and minimal system initialization in
assembly, then do the DDR3/EMIF initialization, then the test
code above.
> If there is an operating system, you have to negotiate the
> MMU setup with it.
>
> It depends on the tolerance of the assembler used if the
> instruction 'mov r1,0' is accepted and interpreted as you
> seem to like. Please try 'mov r1,#0' instead.
Agreed that is sloppy syntax, but from the available mov
instructions, I don't see how else it could be interpreted.
I will fix the syntax.

In general I hate the following aspects of the gnu assembler:

  1. It permits sloppy syntax (as above). I want the assembler
  to complain for every wrong syntax. It is the assembly
  language programmer's job to input correct syntax and not
  the assembler's job to relax its syntax checking.

  2. It auto-exports certain symbols. I want to explicitly
  specify which symbols are exported.

  3. It permits identical labels, e.g.,

    l0:     subs        r1, #1
            bne         l0b
            subs        r2, #1
            bne         l0f
            ...
    l0:     ldr         r2, SOME_VAR
            ....

  4. Special syntax like "=" destroys the one-to-one mapping
  of assembly instructions to machine language instructions.

Horrid.
> There is actually no need to store the link register. If the
> system uses the EABI conventions, you should push and pop
> an even number of registers to maintain the 8 byte alignment
> of the stack pointer.
As said, this is all my own assembly language code and I am perfectly free to implement things as I wish.
> The standard ARM convention is to transfer first 4 arguments
> in registers r0 - r3, and the return value in r0. For the
> standard conventions, a subroutine is allowed to clobber
> r0 to r3 as it likes.

Ditto above.
-- 
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
Hello Randy,

Thank you. Mine also below.

On 12.1.16 13:28, Randy Yates wrote:
> [...]
> In general I hate the following aspects of the gnu assembler:
>
>   1. It permits sloppy syntax (as above). I want the assembler
>   to complain for every wrong syntax. It is the assembly
>   language programmer's job to input correct syntax and not
>   the assembler's job to relax its syntax checking.
The sloppy acceptance has come with the ARM unified syntax, as the Thumb and ARM syntaxes were combined.
>   2. It auto-exports certain symbols. I want to explicitly
>   specify which symbols are exported.
Which one? Are you sure that they do not come from the included headers (which I do not have)?
>   3. It permits identical labels, e.g.,
>
>     l0:     subs        r1, #1
>             bne         l0b
>             subs        r2, #1
>             bne         l0f
>             ...
>     l0:     ldr         r2, SOME_VAR
>             ....
These are called explicitly local labels, and the intention is to make simple throwaway labels for short loops etc. As far as I remember, they should be numeric. AFAIK, they have been inherited from the PDP-11 assembler.
>   4. Special syntax like "=" destroys the one-to-one mapping
>   of assembly instructions to machine language instructions.
This is a service of the assembler to keep a supply of constants. It is called literal addressing, and it has been in various assemblers at least since the IBM S/360 era.
> Horrid.
I beg to differ.
>> There is actually no need to store the link register. If the
>> system uses the EABI conventions, you should push and pop
>> an even number of registers to maintain the 8 byte alignment
>> of the stack pointer.
>
> As said, this is all my own assembly language code and I am perfectly
> free to implement things as I wish.
>
>> The standard ARM convention is to transfer first 4 arguments
>> in registers r0 - r3, and the return value in r0. For the
>> standard conventions, a subroutine is allowed to clobber
>> r0 to r3 as it likes.
>
> Ditto above.
Here is GNU assembler output before declaring unified syntax:

dramtest.s: Assembler messages:
dramtest.s:22: Error: unknown pseudo-op: `.fun'
dramtest.s:25: Error: immediate expression requires a # prefix -- `mov r1,0'
dramtest.s:29: Error: unexpected character `n' in type specifier
dramtest.s:29: Error: bad instruction `str.n r3,[r0,r1]'
dramtest.s:35: Error: immediate expression requires a # prefix -- `mov r1,0'
dramtest.s:37: Error: unexpected character `n' in type specifier
dramtest.s:37: Error: bad instruction `ldr.n r4,[r0,r1]'
dramtest.s:46: Error: immediate expression requires a # prefix -- `mov r7,0x01'
dramtest.s:51: Error: immediate expression requires a # prefix -- `mov r7,0'
make: *** [all] Error 1

----

Here is your function without the #includes and twisted to be
accepted by the assembler. I also changed the syntax a bit to
avoid the pre-processor. I had to guess some, hope it is correct:

                .syntax     unified
                .thumb
                .cpu        cortex-a8
                .text

@----------------------------------------
@ Randy Yates
@
@ entry:
@   r0 = base of DRAM test. must be even since DRAM is 16-bit words
@   r2 = count of 16-bit words to test
@   r3 = value to write (lower 16-bits)

@ routine usage
@   r1 = current offset into r0 (base of dram test)
@   r4 = value read back

@ return:
@   r7 = result (boolean: test successful)
@              0x01 = test passed
@              0x00 = test failed
@----------------------------------------

@.fun ddr3_value_test

                .globl      ddr3_value_test
ddr3_value_test:
                push        {lr}

                mov         r1, 0

@ first write all values
ddr3_value_test_write_loop:
                str.n       r3, [r0, r1]                @ <== DABORT error here on second time through the loop
                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_write_loop

@ now read back values
                mov         r1, 0
ddr3_value_test_read_loop:
                ldr.n       r4, [r0, r1]
                subs        r4, r3
                bne         ddr3_value_test_fail

                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_read_loop

@ test passed:
                mov         r7, 0x01
                pop         {pc}

@ test failed:
ddr3_value_test_fail:
                mov         r7, 0
                pop         {pc}

----

Here is the resulting object code dis-assembled:

Disassembly of section .text:

00000000 <ddr3_value_test>:
   0:   b500            push    {lr}
   2:   f04f 0100       mov.w   r1, #0

00000006 <ddr3_value_test_write_loop>:
   6:   5043            str     r3, [r0, r1]
   8:   f101 0102       add.w   r1, r1, #2
   c:   3a01            subs    r2, #1
   e:   f47f affa       bne.w   6 <ddr3_value_test_write_loop>
  12:   f04f 0100       mov.w   r1, #0

00000016 <ddr3_value_test_read_loop>:
  16:   5844            ldr     r4, [r0, r1]
  18:   1ae4            subs    r4, r4, r3
  1a:   f040 8008       bne.w   2e <ddr3_value_test_fail>
  1e:   f101 0102       add.w   r1, r1, #2
  22:   3a01            subs    r2, #1
  24:   f47f aff7       bne.w   16 <ddr3_value_test_read_loop>
  28:   f04f 0701       mov.w   r7, #1
  2c:   bd00            pop     {pc}

0000002e <ddr3_value_test_fail>:
  2e:   f04f 0700       mov.w   r7, #0
  32:   bd00            pop     {pc}

----

The ldr.n and str.n opcodes translate to 32-bit (word) load and
store instructions; the .n suffix only selects the narrow 16-bit
encoding. You step the index by 2, creating unaligned accesses.

The code might work if you change ldr.n to ldrh and str.n
to strh.

-- 
Regards,
Tauno
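
The alignment argument above can be checked with a little host-side arithmetic. What follows is only a C model under stated assumptions, not target code: first_unaligned_iteration and ddr3_value_test_model are hypothetical names invented for this sketch, and the base address 0x80000000 is the one from the original post. The first function walks the same 2-byte offsets as the loop and reports the first pass on which an access of a given size is no longer naturally aligned. The second mirrors the write-then-read-back test using 16-bit accesses, which is what the suggested strh/ldrh change amounts to.

```c
#include <stdint.h>

/* Hypothetical helper: returns the 1-based loop pass on which
 * base + i*step first stops being a multiple of access_size,
 * or 0 if every pass within count is naturally aligned. */
unsigned first_unaligned_iteration(uint32_t base, uint32_t step,
                                   uint32_t access_size, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        uint32_t addr = base + i * step;
        if (addr % access_size != 0)
            return i + 1;
    }
    return 0;
}

/* Hypothetical C model of the value test with halfword accesses.
 * Returns 1 if every 16-bit word reads back as written, else 0. */
int ddr3_value_test_model(volatile uint16_t *base, uint32_t count,
                          uint16_t value)
{
    /* first write all values */
    for (uint32_t i = 0; i < count; i++)
        base[i] = value;

    /* now read back values */
    for (uint32_t i = 0; i < count; i++) {
        if (base[i] != value)
            return 0;
    }
    return 1;
}
```

With a step of 2 and a 4-byte access, first_unaligned_iteration(0x80000000, 2, 4, 8) reports pass 2, which lines up with the abort on the second time through the write loop. With a 2-byte access it reports 0, i.e. no misaligned pass.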
Randy Yates <yates@digitalsignallabs.com> wrote:
> I'm trying to kick out a simple (ha ha!) assembly language project in
> which I perform some simple DRAM tests on board based on the TI AM3352
> using a 512 MB DRAM (256M x 16).
>
> Ideally I'd like to just treat DRAM as a flat address space from
> 0x80000000 to 0x82000000.
>
>   1. Can I just "turn off" the MMU?

Yes.  You need system control coprocessor control register 1:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344h/Bgbciiaf.html

For instance I think this will turn off the MMU:

                MRC         p15, 0, r0, c1, c0, 0
                BIC         r0, #1
                MCR         p15, 0, r0, c1, c0, 0
> 2. Should I keep data cache turned off as well?
Probably wise if you want to test DRAM. Clearing bit 2 should disable the D-cache.
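
Taking the two answers together, bit 0 of control register 1 is the MMU enable and bit 2 is the D-cache enable, so both can be cleared in a single read-modify-write. The C sketch below only models the bit arithmetic on a value that has already been read from the register. The macro and function names are made up for illustration, and on the target the read and write would still go through the MRC/MCR pair.

```c
#include <stdint.h>

/* Bit positions in control register 1 as discussed in this thread.
 * The macro names are hypothetical. */
#define CTRL_M (1u << 0)   /* MMU enable */
#define CTRL_C (1u << 2)   /* data cache enable */

/* Hypothetical helper: given the current register value, return the
 * value to write back with the MMU and D-cache both disabled.
 * All other bits are left untouched. */
uint32_t ctrl_mmu_dcache_off(uint32_t ctrl)
{
    return ctrl & ~(CTRL_M | CTRL_C);
}
```

For example, ctrl_mmu_dcache_off(0x5) yields 0, and a value with neither bit set comes back unchanged.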
>   3. Are there any access sequence restrictions? E.g., can I just
>   read/write each 16-bit value sequentially?
That should work. However the memory controller may coalesce accesses - it will depend on how it is configured.
>   4. Any other gotchas?
Initialising the DDR3 controller is probably the biggest gotcha, so if you've managed that it should be straightforward (probably).
> I believe I've got the EMIF/DDR3 controller initialized properly (I used
> the u-boot code for the beaglebone black), but I can't get the test code
> proper running. I'm getting DABORT exceptions on the _second_ time
> through the write loop in the following code.

I don't know what's going on in the code you supplied, since it depends on
the MMU state and physical memory map.

Don't forget to turn off IRQs and FIQs, since whatever they're expecting is
probably no longer there if you turned off the MMU.

Theo
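
For reference, IRQs and FIQs are masked by setting the I bit (bit 7) and the F bit (bit 6) in the CPSR. The C sketch below models just that bit arithmetic on a CPSR value. The macro and function names are hypothetical, and on the target the masking would be done in assembly (for instance with a `cpsid if` instruction) rather than through a C helper like this.

```c
#include <stdint.h>

/* CPSR interrupt mask bits. The macro names are hypothetical. */
#define CPSR_F (1u << 6)   /* set = FIQ masked */
#define CPSR_I (1u << 7)   /* set = IRQ masked */

/* Hypothetical helper: return the CPSR value with both IRQ and FIQ
 * masked, leaving the mode and all other bits unchanged. */
uint32_t cpsr_mask_irq_fiq(uint32_t cpsr)
{
    return cpsr | CPSR_I | CPSR_F;
}
```

As an example, a CPSR of 0x13 (Supervisor mode, interrupts enabled) becomes 0xD3 once both bits are set.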
Tauno Voipio <tauno.voipio@notused.fi.invalid> writes:
> [...]
> The ldr.n and str.n opcodes translate to 32 bit instructions.
> You step the index by 2, creating unaligned accesses.
>
> The code might work if you change ldr.n to ldrh and str.n
> to strh.

That was precisely the problem, Tauno! Thanks very much for
your feedback! It is working now!
-- 
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com
On 2016-01-12, Randy Yates <yates@digitalsignallabs.com> wrote:
> I'm trying to kick out a simple (ha ha!) assembly language project in
> which I perform some simple DRAM tests on board based on the TI AM3352
> using a 512 MB DRAM (256M x 16).
>
> Ideally I'd like to just treat DRAM as a flat address space from
> 0x80000000 to 0x82000000.
>
>   1. Can I just "turn off" the MMU?
>
>   2. Should I keep data cache turned off as well?
>

If you read closely enough, I think you will find that you actually
cannot use the data cache on the A8 unless the MMU is enabled.

Simon.

-- 
Simon Clubley, clubley@remove_me.eisner.decus.org-Earth.UFP
Microsoft: Bringing you 1980s technology to a 21st century world
