Randy Yates <yates@digitalsignallabs.com> wrote:
> I'm trying to kick out a simple (ha ha!) assembly language project in
> which I perform some simple DRAM tests on a board based on the TI AM3352
> using a 512 MB DRAM (256M x 16). 
> 
> Ideally I'd like to just treat DRAM as a flat address space from
> 0x80000000 to 0x82000000.
> 
>   1. Can I just "turn off" the MMU?

Yes.  You need the control register (c1) of the system control coprocessor (CP15):
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344h/Bgbciiaf.html

For instance I think this will turn off the MMU:
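MRC p15, 0, r0, c1, c0, 0   @ read the CP15 control register into r0
BIC r0, r0, #1              @ clear bit 0 (M), the MMU enable bit
MCR p15, 0, r0, c1, c0, 0   @ write the modified value back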

>   2. Should I keep data cache turned off as well?

Probably wise if you want to test DRAM.
Clearing bit 2 should disable the D-cache.
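
As a rough illustration of the two bit positions mentioned here, the following host-side C model clears bit 0 (MMU) and bit 2 (D-cache) in a register value. It is only a sketch: the helper name sctlr_without_mmu_and_dcache is made up for this example, and on the target the value would come from the MRC and go back with the MCR shown earlier.

```c
#include <stdint.h>

/* Bit positions in the CP15 c1 control register (ARMv7-A). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data cache enable */

/* Hypothetical helper: return the register value with the MMU and
 * the data cache disabled, leaving every other bit unchanged. */
static uint32_t sctlr_without_mmu_and_dcache(uint32_t sctlr)
{
    return sctlr & ~(SCTLR_M | SCTLR_C);
}
```

For a made-up input of 0x00001005 (bits 12, 2 and 0 set), only bit 12 survives.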

>   3. Are there any access sequence restrictions? E.g., can I just
>   read/write each 16-bit value sequentially?

That should work.  However the memory controller may coalesce accesses - it
will depend on how it is configured.

>   4. Any other gotchas?

Initialising the DDR3 controller is probably the biggest gotcha, so if
you've managed that it should be straightforward (probably).

> I believe I've got the EMIF/DDR3 controller initialized properly (I used
> the u-boot code for the beaglebone black), but I can't get the test code
> proper running. I'm getting DABORT exceptions on the _second_ time
> through the write loop in the following code.

I don't know what's going on in the code you supplied, since it depends on
the MMU state and physical memory map.

Don't forget to turn off IRQs and FIQs, since whatever they're expecting is
probably no longer there if you turned off the MMU.

Theo

Hello Randy,

Thank you. My comments are also below.

On 12.1.16 13:28, Randy Yates wrote:
> Hi Tauno,
>
> Thank you for your responses. Please see my comments
> below.
>
> Tauno Voipio <tauno.voipio@notused.fi.invalid> writes:
>
>> On 12.1.16 06:21, Randy Yates wrote:
>>> [original post snipped]
>>
>>
>> This is just a subroutine. Which is the environment it is run
>> in (Linux)?
>
> No, bare metal assembly. No operating system, no C, no C library.
> I perform my own startup and minimal system initialization in
> assembly, then do the DDR3/EMIF initialization, then the test
> code above.
>
>> If there is an operating system, you have to negotiate the
>> MMU setup with it.
>>
>> It depends on the tolerance of the assembler used if the
>> instruction 'mov r1,0' is accepted and interpreted as you
>> seem to like. Please try 'mov r1,#0' instead.
>
> Agreed that is sloppy syntax, but from the available mov
> instructions, I don't see how else it could be interpreted.
> I will fix the syntax.

> In general I hate the following aspects of the gnu assembler:
>
>    1. It permits sloppy syntax (as above). I want the assembler
>    to complain for every wrong syntax. It is the assembly
>    language programmer's job to input correct syntax and not
>    the assembler's job to relax its syntax checking.

The sloppy acceptance has come with the ARM unified syntax, as
the Thumb and ARM syntaxes were combined.

>    2. It auto-exports certain symbols. I want to explicitly
>    specify which symbols are exported.

Which ones? Are you sure that they do not come from the
included headers (which I do not have)?

>    3. It permits identical labels, e.g.,
>
>       l0:        subs      r1, #1
>                  bne       l0b
>                  subs      r2, #1
>                  bne       l0f
>                  ...
>       l0:        ldr       r2, SOME_VAR
>                  ....

These are called explicitly local labels, and the intention
is to make simple throwaway labels for short loops etc. As
far as I remember, they should be numeric.

AFAIK, they have been inherited from the PDP-11 assembler.

>    4. Special syntax like "=" destroys the one-to-one mapping
>    of assembly instructions to machine language instructions.

This is a service of the assembler, which keeps a pool of constants for you.
It is called literal addressing, and it has been in various
assemblers at least since the IBM S/360 era.

> Horrid.

I beg to differ.

>> There is actually no need to store the link register. If the
>> system uses the EABI conventions, you should push and pop
>> an even number of registers to maintain the 8 byte alignment
>> of the stack pointer.
>
> As said, this is all my own assembly language code and I am perfectly
> free to implement things as I wish.
>
>> The standard ARM convention is to transfer first 4 arguments
>> in registers r0 - r3, and the return value in r0. For the
>> standard conventions, a subroutine is allowed to clobber
>> r0 to r3 as it likes.
>
> Ditto above.

Here is GNU assembler output before declaring unified syntax:

dramtest.s: Assembler messages:
dramtest.s:22: Error: unknown pseudo-op: `.fun'
dramtest.s:25: Error: immediate expression requires a # prefix -- `mov r1,0'
dramtest.s:29: Error: unexpected character `n' in type specifier
dramtest.s:29: Error: bad instruction `str.n r3,[r0,r1]'
dramtest.s:35: Error: immediate expression requires a # prefix -- `mov r1,0'
dramtest.s:37: Error: unexpected character `n' in type specifier
dramtest.s:37: Error: bad instruction `ldr.n r4,[r0,r1]'
dramtest.s:46: Error: immediate expression requires a # prefix -- `mov r7,0x01'
dramtest.s:51: Error: immediate expression requires a # prefix -- `mov r7,0'
make: *** [all] Error 1

----

Here is your function without the #includes, adjusted so that the
assembler accepts it. I also changed the comment syntax a bit to
avoid the pre-processor. I had to guess in places; I hope it is
correct:

                  .syntax unified
                  .thumb
                  .cpu        cortex-a8
                  .text

@----------------------------------------
@ Randy Yates
@
@ entry:
@   r0 = base of DRAM test. must be even since DRAM is 16-bit words
@   r2 = count of 16-bit words to test
@   r3 = value to write (lower 16-bits)

@ routine usage
@   r1 = current offset into r0 (base of dram test)
@   r4 = value read back

@ return:
@   r7 = result (boolean: test successful)
@              0x01 = test passed
@              0x00 = test failed
@----------------------------------------

@.fun ddr3_value_test

                  .globl ddr3_value_test
ddr3_value_test:
                  push        {lr}

                  mov         r1, 0

@ first write all values
ddr3_value_test_write_loop:
                  str.n       r3, [r0, r1]                @ <== DABORT error here on second time through the loop
                  add         r1, #2
                  subs        r2, #1
                  bne         ddr3_value_test_write_loop

@ now read back values
                  mov         r1, 0
ddr3_value_test_read_loop:
                  ldr.n       r4, [r0, r1]
                  subs        r4, r3
                  bne         ddr3_value_test_fail

                  add         r1, #2
                  subs        r2, #1
                  bne         ddr3_value_test_read_loop

@ test passed:
                  mov         r7, 0x01
                  pop         {pc}

@ test failed:
ddr3_value_test_fail:
                  mov         r7, 0
                  pop         {pc}

----

Here is the resulting object code, disassembled:

Disassembly of section .text:

00000000 <ddr3_value_test>:
    0:	b500      	push	{lr}
    2:	f04f 0100 	mov.w	r1, #0

00000006 <ddr3_value_test_write_loop>:
    6:	5043      	str	r3, [r0, r1]
    8:	f101 0102 	add.w	r1, r1, #2
    c:	3a01      	subs	r2, #1
    e:	f47f affa 	bne.w	6 <ddr3_value_test_write_loop>
   12:	f04f 0100 	mov.w	r1, #0

00000016 <ddr3_value_test_read_loop>:
   16:	5844      	ldr	r4, [r0, r1]
   18:	1ae4      	subs	r4, r4, r3
   1a:	f040 8008 	bne.w	2e <ddr3_value_test_fail>
   1e:	f101 0102 	add.w	r1, r1, #2
   22:	3a01      	subs	r2, #1
   24:	f47f aff7 	bne.w	16 <ddr3_value_test_read_loop>
   28:	f04f 0701 	mov.w	r7, #1
   2c:	bd00      	pop	{pc}

0000002e <ddr3_value_test_fail>:
   2e:	f04f 0700 	mov.w	r7, #0
   32:	bd00      	pop	{pc}

----

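The ldr.n and str.n opcodes assemble to 16-bit encodings, but they
still transfer 32-bit words. You step the index by 2, so every second
access is unaligned. The first of these is at base+2, which fits the
abort on the second pass through the write loop.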
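
To make the arithmetic concrete, here is a small host-side C model that reports which pass of the loop first produces an address that is not word-aligned. It is only a sketch: the function name is hypothetical, and the base address 0x80000000 used in the note is taken from the original question.

```c
#include <stdint.h>

/* Return the zero-based index of the first pass whose 32-bit access
 * is not 4-byte aligned, or -1 if all `count` passes are aligned.
 * `base` and `step` are byte quantities. */
static long first_unaligned_pass(uint32_t base, uint32_t step, uint32_t count)
{
    for (uint32_t i = 0; i < count; i++) {
        uint32_t addr = base + i * step;
        if (addr % 4u != 0)
            return (long)i;
    }
    return -1;
}
```

With base 0x80000000 and a step of 2 the result is 1, i.e. the second pass. With a step of 4 no pass is unaligned.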

The code might work if you change ldr.n to ldrh and str.n
to strh.
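
For comparison, this is roughly what the test looks like with 16-bit accesses, written as a host-side C sketch (the C analogue of strh/ldrh, not the assembly fix itself). The function name ddr3_value_test16 and its signature are assumptions made for this example. Note also that in the posted assembly r2 appears to be counted down to zero by the write loop and not reloaded before the read loop; the sketch avoids that by giving each pass its own index.

```c
#include <stdint.h>

/* Hypothetical 16-bit version of the value test.
 * Returns 1 if every halfword reads back as `value`, otherwise 0. */
static int ddr3_value_test16(volatile uint16_t *base, uint32_t count,
                             uint16_t value)
{
    /* Write pass: one 16-bit store per element. */
    for (uint32_t i = 0; i < count; i++)
        base[i] = value;

    /* Read-back pass: uses its own index, so the element count is
     * not consumed by the write pass. */
    for (uint32_t i = 0; i < count; i++) {
        if (base[i] != value)
            return 0;
    }
    return 1;
}
```

On the target, base would point at the DRAM region under test. On a host it can be tried against an ordinary array.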

-- 

Regards,
Tauno

Hi Tauno,

Thank you for your responses. Please see my comments 
below.

Tauno Voipio <tauno.voipio@notused.fi.invalid> writes:

> On 12.1.16 06:21, Randy Yates wrote:
>> [original post snipped]
>
>
> This is just a subroutine. Which is the environment it is run
> in (Linux)?

No, bare metal assembly. No operating system, no C, no C library.
I perform my own startup and minimal system initialization in 
assembly, then do the DDR3/EMIF initialization, then the test
code above.

> If there is an operating system, you have to negotiate the
> MMU setup with it.
>
> It depends on the tolerance of the assembler used if the
> instruction 'mov r1,0' is accepted and interpreted as you
> seem to like. Please try 'mov r1,#0' instead.

Agreed that is sloppy syntax, but from the available mov
instructions, I don't see how else it could be interpreted.
I will fix the syntax.

In general I hate the following aspects of the gnu assembler:

  1. It permits sloppy syntax (as above). I want the assembler
  to complain for every wrong syntax. It is the assembly
  language programmer's job to input correct syntax and not
  the assembler's job to relax its syntax checking.

  2. It auto-exports certain symbols. I want to explicitly
  specify which symbols are exported.

  3. It permits identical labels, e.g.,

     l0:        subs      r1, #1
                bne       l0b
                subs      r2, #1
                bne       l0f
                ...
     l0:        ldr       r2, SOME_VAR
                ....

  4. Special syntax like "=" destroys the one-to-one mapping
  of assembly instructions to machine language instructions.

Horrid.

> There is actually no need to store the link register. If the
> system uses the EABI conventions, you should push and pop
> an even number of registers to maintain the 8 byte alignment
> of the stack pointer.

As said, this is all my own assembly language code and I am perfectly
free to implement things as I wish.

> The standard ARM convention is to transfer first 4 arguments
> in registers r0 - r3, and the return value in r0. For the
> standard conventions, a subroutine is allowed to clobber
> r0 to r3 as it likes.

Ditto above.
-- 
Randy Yates, DSP/Embedded Firmware Developer
Digital Signal Labs
http://www.digitalsignallabs.com

On 12.1.16 06:21, Randy Yates wrote:
> I'm trying to kick out a simple (ha ha!) assembly language project in
> which I perform some simple DRAM tests on a board based on the TI AM3352
> using a 512 MB DRAM (256M x 16).
>
> Ideally I'd like to just treat DRAM as a flat address space from
> 0x80000000 to 0x82000000.
>
>    1. Can I just "turn off" the MMU?
>
>    2. Should I keep data cache turned off as well?
>
>    3. Are there any access sequence restrictions? E.g., can I just
>    read/write each 16-bit value sequentially?
>
>    4. Any other gotchas?
>
> I believe I've got the EMIF/DDR3 controller initialized properly (I used
> the u-boot code for the beaglebone black), but I can't get the test code
> proper running. I'm getting DABORT exceptions on the _second_ time
> through the write loop in the following code.
>
> Any help would be appreciated.
>
> --Randy
>
>
> #include "asm-defs/asm-defs.h"
> #include "asm-defs/prcm.h"
> #include "asm-defs/emif.h"
> #include "asm-defs/control.h"
> #include "asm-defs/cm.h"
>
>                  .cpu        cortex-a8
>                  .text
>
> //----------------------------------------
> // Randy Yates
> //
> // entry:
> //   r0 = base of DRAM test. must be even since DRAM is 16-bit words
> //   r2 = count of 16-bit words to test
> //   r3 = value to write (lower 16-bits)
>
> // routine usage
> //   r1 = current offset into r0 (base of dram test)
> //   r4 = value read back
>
> // return:
> //   r7 = result (boolean: test successful)
> //              0x01 = test passed
> //              0x00 = test failed
> //----------------------------------------
>
> .fun ddr3_value_test
>                  push        {lr}
>
>                  mov         r1, 0
>
> // first write all values
> ddr3_value_test_write_loop:
>                  str.n       r3, [r0, r1]                // <== DABORT error here on second time through the loop
>                  add         r1, #2
>                  subs        r2, #1
>                  bne         ddr3_value_test_write_loop
>
> // now read back values
>                  mov         r1, 0
> ddr3_value_test_read_loop:
>                  ldr.n       r4, [r0, r1]
>                  subs        r4, r3
>                  bne         ddr3_value_test_fail
>
>                  add         r1, #2
>                  subs        r2, #1
>                  bne         ddr3_value_test_read_loop
>
> // test passed:
>                  mov         r7, 0x01
>                  pop         {pc}
>
> // test failed:
> ddr3_value_test_fail:
>                  mov         r7, 0
>                  pop         {pc}


This is just a subroutine. Which is the environment it is run
in (Linux)?

If there is an operating system, you have to negotiate the
MMU setup with it.

It depends on the tolerance of the assembler used if the
instruction 'mov r1,0' is accepted and interpreted as you
seem to like. Please try 'mov r1,#0' instead.

There is actually no need to store the link register. If the
system uses the EABI conventions, you should push and pop
an even number of registers to maintain the 8 byte alignment
of the stack pointer.
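
The alignment point is simple arithmetic, sketched here as a host-side C model. The helper names and the sample stack address are made up for illustration. Each pushed register moves the stack pointer down by 4 bytes, so an odd number of registers leaves it 4 bytes off an 8-byte boundary.

```c
#include <stdint.h>

/* Stack pointer after pushing `nregs` 32-bit registers
 * (full-descending stack, as used by push/pop). */
static uint32_t sp_after_push(uint32_t sp, unsigned nregs)
{
    return sp - 4u * nregs;
}

/* The EABI expects the stack pointer to be 8-byte aligned at
 * public interfaces. */
static int sp_is_eabi_aligned(uint32_t sp)
{
    return (sp % 8u) == 0;
}
```

Starting from an aligned value such as 0x80001000, push {lr} alone gives 0x80000FFC (not aligned), while pushing two registers gives 0x80000FF8 (aligned).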

The standard ARM convention is to transfer first 4 arguments
in registers r0 - r3, and the return value in r0. For the
standard conventions, a subroutine is allowed to clobber
r0 to r3 as it likes.

-- 

-TV

I'm trying to kick out a simple (ha ha!) assembly language project in
which I perform some simple DRAM tests on a board based on the TI AM3352
using a 512 MB DRAM (256M x 16). 

Ideally I'd like to just treat DRAM as a flat address space from
0x80000000 to 0x82000000.

  1. Can I just "turn off" the MMU?

  2. Should I keep data cache turned off as well?

  3. Are there any access sequence restrictions? E.g., can I just
  read/write each 16-bit value sequentially?

  4. Any other gotchas?

I believe I've got the EMIF/DDR3 controller initialized properly (I used
the u-boot code for the beaglebone black), but I can't get the test code
proper running. I'm getting DABORT exceptions on the _second_ time
through the write loop in the following code.

Any help would be appreciated.

--Randy


#include "asm-defs/asm-defs.h"
#include "asm-defs/prcm.h"
#include "asm-defs/emif.h"
#include "asm-defs/control.h"
#include "asm-defs/cm.h"

                .cpu        cortex-a8
                .text

//----------------------------------------
// Randy Yates
// 
// entry:
//   r0 = base of DRAM test. must be even since DRAM is 16-bit words
//   r2 = count of 16-bit words to test
//   r3 = value to write (lower 16-bits)

// routine usage
//   r1 = current offset into r0 (base of dram test)
//   r4 = value read back

// return:
//   r7 = result (boolean: test successful)
//              0x01 = test passed
//              0x00 = test failed
//----------------------------------------

.fun ddr3_value_test
                push        {lr}

                mov         r1, 0

// first write all values
ddr3_value_test_write_loop:
                str.n       r3, [r0, r1]                // <== DABORT error here on second time through the loop 
                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_write_loop

// now read back values
                mov         r1, 0
ddr3_value_test_read_loop:
                ldr.n       r4, [r0, r1]
                subs        r4, r3
                bne         ddr3_value_test_fail

                add         r1, #2
                subs        r2, #1
                bne         ddr3_value_test_read_loop

// test passed:
                mov         r7, 0x01
                pop         {pc}

// test failed:
ddr3_value_test_fail:
                mov         r7, 0
                pop         {pc}

