This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).
Hopefully, the following tale of how NOT to do something may save someone else a few nights of code rework...

When moving the YARD-1A processor from the XC4010E to the Spartan 2, I changed the memory implementation from a separate instruction ROM and data RAM, built with CLB resources, to a shared block RAM using the variable port width dual port feature (which happened to be the first time I'd used the variable width DP block RAM).

Using the dual port RAM this way allows a "Harvard" processor to read/write instruction space without needing any extra port bypass logic.

For the 32 bit processor, I needed a 16 bit port for instruction fetches, and a 32 bit port for data accesses.

The dual port block RAM was built from four block RAMs, each using 1K x 4 for PORTA and 512 x 8 for PORTB:

    ram_gen: for i in 0 to 3 generate
    begin
      -- per-RAM write enable: active-low write strobe AND active-low lane enable
      d_we(i) <= ( NOT d_wr_l ) AND ( NOT d_wr_en_l(i) );

      RAMn : RAMB4_S4_S8
        port map (
          -- PORTA: 1K x 4, read-only instruction port (one nibble per RAM)
          ADDRA => i_addr( 9 downto 0 ),
          DIA   => ( others => '0' ),
          DOA   => i_dat( (4*i)+3 downto 4*i ),
          CLKA  => clk,
          ENA   => '1',
          WEA   => '0',
          RSTA  => '0',

          -- PORTB: 512 x 8, read/write data port (eight bits per RAM)
          ADDRB => d_addr( 8 downto 0 ),
          DIB   => l_wdat( (8*i)+7 downto 8*i ),
          DOB   => l_rdat( (8*i)+7 downto 8*i ),
          CLKB  => clk,
          ENB   => d_en,
          WEB   => d_we(i),
          RSTB  => '0'
        );
    end generate;

The cross assembler generates the INIT values for the RAM.

The instruction port worked fine, but the data port didn't read pre-initialized values properly; after a bit of head-scratching, I drew a picture of where the INIT values were going, and realized that the wider port needs the data bus connections scrambled to work properly. (By "scrambled", I mean a reordering other than the usual big-endian vs. little-endian byte/word shuffle.)

In other words: the upper block RAM doesn't contain the upper eight bits of the 32 bit data, but rather the four MSB's of one 16 bit word, plus the four MSB's of the other.
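As a rough illustration of that picture, here is a small simulation-only sketch of what the raw 32 bit port sees for two neighbouring 16 bit words. It is not part of the YARD-1A source: the entity name, the function wide_port_view and the test values W0/W1 are made up for the example. It also assumes that, inside each block RAM, the even PORTA address lands in the low nibble of the PORTB byte, which is what the unscramble code in this post implies.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical, simulation-only demo; all names here are illustrative.
entity nibble_layout_demo is
end entity;

architecture sim of nibble_layout_demo is
  -- Two consecutive 16 bit words as seen on the instruction port.
  constant W0 : std_logic_vector(15 downto 0) := x"1234";  -- even PORTA address
  constant W1 : std_logic_vector(15 downto 0) := x"5678";  -- odd PORTA address

  -- Raw (still scrambled) 32 bit PORTB view of those two words.
  -- ASSUMPTION: in RAM i, the even-address nibble is the low nibble
  -- of the PORTB byte and the odd-address nibble is the high nibble.
  function wide_port_view( even_w, odd_w : std_logic_vector(15 downto 0) )
    return std_logic_vector is
    variable v : std_logic_vector(31 downto 0);
  begin
    for i in 0 to 3 loop
      v( 8*i+3 downto 8*i   ) := even_w( 4*i+3 downto 4*i );
      v( 8*i+7 downto 8*i+4 ) := odd_w ( 4*i+3 downto 4*i );
    end loop;
    return v;
  end function;
begin
  process
    variable raw : std_logic_vector(31 downto 0);
  begin
    raw := wide_port_view( W0, W1 );
    -- The top byte holds the MSB nibble of each word (5 and 1),
    -- not the top byte of either word.
    assert raw = x"51627384"
      report "unexpected wide-port layout" severity failure;
    wait;
  end process;
end architecture;
```

With these test values the upper block RAM contributes x"51", i.e. the top nibble of W1 next to the top nibble of W0, which is the interleaving described above.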
The following code unscrambles the 32 bit port for a 1K x 16 / 512 x 32, big-endian RAM:

    d_dat <=   l_rdat( 27 downto 24 ) & l_rdat( 19 downto 16 )
             & l_rdat( 11 downto  8 ) & l_rdat(  3 downto  0 )
             & l_rdat( 31 downto 28 ) & l_rdat( 23 downto 20 )
             & l_rdat( 15 downto 12 ) & l_rdat(  7 downto  4 )
      when ( ( d_rd_l = '0' ) AND ( d_cs_l = '0' ) )
      else (others => 'Z');

    l_wdat <=   d_dat( 15 downto 12 ) & d_dat( 31 downto 28 )
              & d_dat( 11 downto  8 ) & d_dat( 27 downto 24 )
              & d_dat(  7 downto  4 ) & d_dat( 23 downto 20 )
              & d_dat(  3 downto  0 ) & d_dat( 19 downto 16 );

Where "d_dat" is the normally ordered data bus, "l_rdat" is the scrambled read bus, and "l_wdat" is the scrambled write bus.

After this change, pre-initialized values read properly, and 32 bit writes worked, but 16 and 8 bit writes still had problems; after looking at the should-have-been and was-written memory values, I realized I'd forgotten about the byte-write enables when I'd fixed the data bus bits. Each block RAM's WEB now covers one nibble from each of two different bytes, so there's no way to write only the eight bits of a given byte.

I'm currently rewriting the code (and INIT generator) to use x8 ports on both sides of the dual port, with a decode mux for the instruction port to get back to 16 bits; this allows each port to maintain the same 'view' of memory while still having working byte write enables on the wider port.

Summary:

1) If a variable width dual-port block RAM spans more than one block RAM, the output bits of the wider port need shuffling to assemble the proper output data word, beyond the usual big-endian vs. little-endian swaps.

2) If you need byte write enables on the RAM, you can't use the variable width dual port feature. (If you can live with word writes on the wider port, you're OK.)

3) To use byte write enables, both ports need to be at the same port width (8 bits or narrower).

COREGEN notes:

I looked at the coregen EDIF output for a 1Kx16 / 512x32 block RAM; it does the output bit shuffling for you, but always builds a little-endian output word on the wider port. Also, it doesn't allow for byte write enables, which is why I didn't use COREGEN in the first place.

Brian
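For readers wondering what the "x8 ports on both sides" arrangement might look like, here is a minimal sketch. It is one possible reading of the plan described in the post, not Brian's actual rewrite: the entity name, port names and the i_sel register are assumptions for the example, and the INIT contents would have to follow this byte-per-RAM layout. It keeps the big-endian convention used above, with the even 16 bit address in the upper half of the 32 bit word.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

library unisim;
use unisim.vcomponents.all;

-- Hypothetical sketch: four RAMB4_S8_S8, one byte lane per block RAM.
entity shared_ram_x8 is
  port (
    clk    : in  std_logic;
    i_addr : in  std_logic_vector(9 downto 0);   -- 16 bit word address
    i_dat  : out std_logic_vector(15 downto 0);
    d_addr : in  std_logic_vector(8 downto 0);   -- 32 bit word address
    d_en   : in  std_logic;
    d_we   : in  std_logic_vector(3 downto 0);   -- one enable per byte lane
    d_wdat : in  std_logic_vector(31 downto 0);
    d_rdat : out std_logic_vector(31 downto 0)
  );
end entity;

architecture sketch of shared_ram_x8 is
  constant ZERO8 : std_logic_vector(7 downto 0) := (others => '0');
  signal i_word  : std_logic_vector(31 downto 0);
  signal i_sel   : std_logic := '0';
begin
  ram_gen: for i in 0 to 3 generate
    RAMn : RAMB4_S8_S8
      port map (
        -- PORTA: instruction side reads the whole 32 bit word
        ADDRA => i_addr( 9 downto 1 ),
        DIA   => ZERO8,
        DOA   => i_word( (8*i)+7 downto 8*i ),
        CLKA  => clk,
        ENA   => '1',
        WEA   => '0',
        RSTA  => '0',

        -- PORTB: data side, RAM i holds exactly byte lane i
        ADDRB => d_addr,
        DIB   => d_wdat( (8*i)+7 downto 8*i ),
        DOB   => d_rdat( (8*i)+7 downto 8*i ),
        CLKB  => clk,
        ENB   => d_en,
        WEB   => d_we(i),
        RSTB  => '0'
      );
  end generate;

  -- Block RAM reads are synchronous, so delay the half-word select
  -- by one clock to line it up with DOA.
  process (clk)
  begin
    if rising_edge(clk) then
      i_sel <= i_addr(0);
    end if;
  end process;

  -- Decode mux: even 16 bit address = upper half (big-endian).
  i_dat <= i_word(31 downto 16) when i_sel = '0' else i_word(15 downto 0);
end architecture;
```

Because each block RAM now holds exactly one byte lane, no bit shuffling is needed on the data port and each WEB writes only its own byte; the cost is the extra 2:1 mux on the instruction path.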
> -----Original Message-----
> From: [mailto:]
> Sent: Tuesday, February 06, 2001 3:29 AM
> To:
> Subject: [fpga-cpu] How NOT to build variable width dual port Block RAMs
>
> Hopefully, the following tale of how NOT to do something
> may save someone else a few nights of code rework...

[snip]

> 2) If you need byte write enables on the RAM, you can't
> use the variable width dual port feature. ( If you can
> live with word writes on the wider port, you're OK. )

Wow, Brian, you just saved me a lot of grief. I was about to do exactly that!

Best Regards,

Gary Watson
Chief Technology Officer
Nexsan Technologies, Ltd.
Imperial House
East Service Road
Raynesway
Derby DE21 7BF ENGLAND
+44 (0) 1332 5 444 33
http://www.nexsan.com
--- In fpga-cpu@y..., "Gary Watson" <gary@n...> wrote:
>
> > Hopefully, the following tale of how NOT to do something
> > may save someone else a few nights of code rework...
<snip>
>
> Wow, Brian, you just saved me a lot of grief. I was about to
> do exactly that!

Glad that helped; the variable-width problem seemed like one of those things that's very easy to overlook, but fairly obvious once it's been pointed out.

Once I fix the assembler & block RAM code, I'll post them in an update to my "YARD-1" zip file. (Other than getting the YARD-1 code to compile under XST, I haven't had time to tidy up the other loose ends and finish the instruction set testbench, so the post of the entire source code may not be for a while.)

Brian