Reply by Ed Prochak March 24, 2009
On Mar 23, 10:07 pm, chakra <narashim...@gmail.com> wrote:
> Hello all,
>
> I am working on a project which involves a simple BRAM, OPB-PLB,
> Microblaze/PPC, and opb_ddr_sdram controller. I am reading 1280 bytes
> of data from bram (32 bits each read, thus a total of 320 reads) and
> writing it to DDR sdram. i am using Xilinx standalone OS. i use the
> command XIo_in32(addr) to read from bram and use XIo_out32(addr,data)
> to write to DDR sdram controller.
>
> here is the c code i use to write to ddr.
>
> for(p=0;p<320;p++) //number of reads from Bram = 320 (0:319)
> {
>  a = XIo_In32(bram_addr);
>  XIo_Out32(ddr_addr+(p*4), a); //P*4 is to make sure we create room for 32 bit data
> }
>
> I use chipscope pro to debug whats happenig inside.
> stepa: It takes 3 opb clock cycles to read the value from bram
> stepb: there is a 5 clock cycle delay (probably Processor is
> calculating p*4),
> stepc: it takes about 9 clock cycles to write to DDR sdram and
> stepd: 7 clock cyles (no clue why it takes so long) to the next cycle
> of read and write.
>
> question 1: is there a way to cutdown the clock cycles (delay clock
> cycles b and d) (i want to bring the total clock cycles for one
> iteration down to about 15 or less)

Step b: Why recalculate the destination address? Just increment to
avoid the multiply.

for(p=0;p<320;p++)
{
 a = XIo_In32(bram_addr);
 XIo_Out32(ddr_addr, a);
 ddr_addr = ddr_addr + 4; // increment by 4 to make sure
                          // we create room for 32 bit data
}

Step d: I'm not familiar with Microblaze/PPC, but is there a prefetch
queue or instruction cache that is being flushed by the jump at the
end of the loop?

HTH,
Ed
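
The increment idea above can also be written with a walking pointer. This is only a sketch under stated assumptions: `io_in32`, `io_out32` and `copy_words` are hypothetical stand-ins (not the Xilinx `XIo_In32`/`XIo_Out32` routines) so that the code compiles on any host, and the source is read at one fixed address on every pass because that is what the posted loop does. Whether it actually shortens step b depends on the compiler and optimization level, since an optimizing compiler may already turn `p*4` into an increment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the 32-bit I/O accessors; on the real
 * target the Xilinx-provided routines would be used instead. */
static inline uint32_t io_in32(const volatile uint32_t *addr)
{
    return *addr;
}

static inline void io_out32(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
}

/* Read `words` 32-bit values from the fixed address `src` and store
 * them to consecutive words starting at `dst`. The destination is a
 * pointer that advances by one word per pass, so no p*4 is computed. */
static void copy_words(const volatile uint32_t *src,
                       volatile uint32_t *dst,
                       size_t words)
{
    assert(src != NULL && dst != NULL);
    while (words--) {
        io_out32(dst++, io_in32(src));
    }
}
```

With the names from the post, a call would look roughly like `copy_words((const volatile uint32_t *)bram_addr, (volatile uint32_t *)ddr_addr, 320);`, assuming both addresses are word aligned.
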
Reply by chakra March 24, 2009
Hello all,

I am working on a project which involves a simple BRAM, OPB-PLB,
MicroBlaze/PPC, and an opb_ddr_sdram controller. I am reading 1280 bytes
of data from BRAM (32 bits per read, so 320 reads in total) and
writing it to DDR SDRAM. I am using the Xilinx standalone OS. I use
XIo_In32(addr) to read from BRAM and XIo_Out32(addr, data)
to write to the DDR SDRAM controller.

Here is the C code I use to write to DDR.

for(p=0;p<320;p++) // number of reads from BRAM = 320 (0:319)
{
 a = XIo_In32(bram_addr);
 // p*4 steps the DDR address by 4 bytes to make room for each 32-bit word
 XIo_Out32(ddr_addr+(p*4), a);
}

I use ChipScope Pro to debug what is happening inside.
step a: it takes 3 OPB clock cycles to read the value from BRAM,
step b: there is a 5 clock cycle delay (probably the processor is
calculating p*4),
step c: it takes about 9 clock cycles to write to DDR SDRAM, and
step d: it takes 7 clock cycles (no clue why it takes so long) to reach
the next cycle of read and write.

Question 1: is there a way to cut down the clock cycles (the delays in
steps b and d)? I want to bring the total clock cycles for one
iteration down to about 15 or less.
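
For reference, the figures above add up as follows. This is a small sketch using only the cycle counts measured in steps a to d; the enum names and helper functions are made up for illustration. It shows that the two bus transfers alone (steps a and c) already cost 12 cycles, so a 15-cycle target leaves about 3 cycles for address update and loop overhead combined.

```c
#include <assert.h>

/* Cycle counts as reported from the ChipScope capture (steps a to d). */
enum {
    READ_CYCLES  = 3,   /* step a: BRAM read               */
    ADDR_CYCLES  = 5,   /* step b: delay before the write  */
    WRITE_CYCLES = 9,   /* step c: DDR SDRAM write         */
    LOOP_CYCLES  = 7,   /* step d: delay before next read  */
    TRANSFERS    = 320  /* 1280 bytes / 4 bytes per word   */
};

/* Cycles for one pass through the loop as measured: 3 + 5 + 9 + 7. */
static int cycles_per_iteration(void)
{
    return READ_CYCLES + ADDR_CYCLES + WRITE_CYCLES + LOOP_CYCLES;
}

/* Cycles spent on the two bus transfers only: 3 + 9. */
static int bus_cycles_per_iteration(void)
{
    return READ_CYCLES + WRITE_CYCLES;
}

/* Total cycles for a whole copy at a given per-iteration cost. */
static long total_cycles(int per_iteration, int transfers)
{
    assert(per_iteration >= 0 && transfers >= 0);
    return (long)per_iteration * transfers;
}
```

With these numbers one iteration costs 24 cycles, so the full 320-word copy takes 7680 cycles as measured versus 4800 cycles at the 15-cycle target.
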

Question 2: is there a way in software to read from the BRAM and
write to DDR SDRAM in burst mode? (I know there is one in hardware,
the OPB burst mode.)

Thanks,
with warm regards,
Chakra.