Hello all,

I am working on a project which involves a simple BRAM, OPB-PLB, Microblaze/PPC, and an opb_ddr_sdram controller. I am reading 1280 bytes of data from BRAM (32 bits per read, so 320 reads in total) and writing it to DDR SDRAM. I am using the Xilinx standalone OS. I use XIo_In32(addr) to read from BRAM and XIo_Out32(addr, data) to write to the DDR SDRAM controller.

Here is the C code I use to write to DDR:

    for(p=0;p<320;p++) //number of reads from BRAM = 320 (0:319)
    {
        a = XIo_In32(bram_addr);
        XIo_Out32(ddr_addr+(p*4), a); //p*4 steps the address by one 32-bit word
    }

I use ChipScope Pro to see what is happening inside:

step a: it takes 3 OPB clock cycles to read the value from BRAM,
step b: there is a 5 clock cycle delay (probably the processor is calculating p*4),
step c: it takes about 9 clock cycles to write to DDR SDRAM, and
step d: it takes 7 clock cycles (no clue why it takes so long) to get to the next read/write iteration.

Question 1: Is there a way to cut down the clock cycles spent in delays b and d? I want to bring the total for one iteration down to about 15 clock cycles or less.

Question 2: Is there a software way of reading from the BRAM and writing to DDR SDRAM in burst mode? (I know there is an OPB burst mode in hardware.)

Thanks, with warm regards,
Chakra.
cutting down opb_clk cycles while read-write BRAM-DDR in FPGA
Started by ●March 24, 2009
Reply by ●March 24, 2009
On Mar 23, 10:07 pm, chakra <narashim...@gmail.com> wrote:
> Hello all,
>
> I am working on a project which involves a simple BRAM, OPB-PLB,
> Microblaze/PPC, and opb_ddr_sdram controller. I am reading 1280 bytes
> of data from bram (32 bits each read, thus a total of 320 reads) and
> writing it to DDR sdram. i am using Xilinx standalone OS. i use the
> command XIo_in32(addr) to read from bram and use XIo_out32(addr,data)
> to write to DDR sdram controller.
>
> here is the c code i use to write to ddr.
>
> for(p=0;p<320;p++) //number of reads from Bram = 320 (0:319)
> {
>  a = XIo_In32(bram_addr);
>  XIo_Out32(ddr_addr+(p*4), a); //P*4 is to make sure we create room for 32 bit data
> }
>
> I use chipscope pro to debug whats happenig inside.
> stepa: It takes 3 opb clock cycles to read the value from bram
> stepb: there is a 5 clock cycle delay (probably Processor is
> calculating p*4),
> stepc: it takes about 9 clock cycles to write to DDR sdram and
> stepd: 7 clock cyles (no clue why it takes so long) to the next cycle
> of read and write.
>
> question 1: is there a way to cutdown the clock cycles (delay clock
> cycles b and d) (i want to bring the total clock cycles for one
> iteration down to about 15 or less)

Step b: Why recalculate the destination address on every pass? Just increment it to avoid the multiply.

    for(p=0;p<320;p++)
    {
        a = XIo_In32(bram_addr);
        XIo_Out32(ddr_addr, a);
        ddr_addr = ddr_addr + 4; // increment by 4 to make sure
                                 // we create room for 32 bit data
    }

Step d: I'm not familiar with Microblaze/PPC, but is there a prefetch queue or instruction cache that is being flushed by the jump at the end of the loop?

HTH,
Ed
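If the jump at the end of the loop really is what costs the cycles in step d, one thing to try is unrolling the loop so the branch is taken once per four words instead of once per word. The following is only a sketch under assumptions: `copy_port_to_mem` is a hypothetical helper name, and it uses plain `volatile` pointer accesses in place of XIo_In32/XIo_Out32, which may or may not be acceptable on your platform (on PowerPC the XIo routines may add synchronization that is omitted here). As in the original loop, the source is read from one fixed address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the Xilinx libraries.
 * Reads n_words 32-bit words from one fixed source address (as the
 * original loop does with bram_addr) and stores them to consecutive
 * destination words. The main loop is unrolled by four so the
 * loop-closing branch is taken once per four words. */
static void copy_port_to_mem(volatile uint32_t *dst,
                             const volatile uint32_t *src_port,
                             size_t n_words)
{
    assert(dst != NULL && src_port != NULL);

    size_t blocks = n_words / 4; /* full groups of four words */
    size_t rest = n_words % 4;   /* leftover words, 0..3 */

    while (blocks--) {
        /* Same read-then-write order as the original loop. */
        dst[0] = *src_port;
        dst[1] = *src_port;
        dst[2] = *src_port;
        dst[3] = *src_port;
        dst += 4; /* pointer increment, no multiply */
    }

    while (rest--) {
        *dst++ = *src_port;
    }
}
```

On the target this would be called as something like `copy_port_to_mem((volatile uint32_t *)ddr_addr, (const volatile uint32_t *)bram_addr, 320);`, assuming both addresses are word aligned and not data-cached. Whether it actually shortens step d depends on the compiler's optimization level and on the processor's branch behaviour, so it is worth checking again in ChipScope.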