
This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

floorplanning ... can someone offer guru advice ? - Sean - Mar 23 0:17:00 2002



I'd like to practice floor planning and working with Xilinx tools so I grabbed

http://doc.union.edu/237/Projects/Mips/Vhdl/

an 8-bit constrained MIPS ISA implementation in VHDL by J. Hamblen. I haven't
really looked at the source much and I haven't run any test benches against
it. My primary goal is to get comfortable with place and route and with the
Xilinx floorplanner.

I fixed a couple of lines in the source and compiled it with WebPACK 4.2, and I get
57 MHz in a Spartan-II -6 speed grade part.

The files I used are here

http://www.xyke.com/hamblen-mips.00.tar.gz

-- Where do I look for bottlenecks, either in the design or in the layout? This is
from a place and route and floorplanning perspective, so recognizing where
pipelining would help is good, but I'd rather restructure the existing logic only
minimally and get most of the performance gains from the layout.

-- **HOW** do I look for bottlenecks using the Xilinx tools? I am not looking
for hand-holding, just some suggestions from people who already know.

-- What techniques does one employ with the floorplanner and the automatic
place and route to speed up a design?

-- Are there easy heuristics to apply to this problem, and what are they?

I read the online help for the floorplanner as well as

http://www.xilinx.com/support/techxclusives/timing6.htm

http://support.xilinx.com/support/techtips/documents/timing/presentation/timingcsts3_1i/sld001.htm

and of course, the article linked from fpgacpu.org

http://www.eedesign.com/isd/OEG20020227S0052

This process seems very black-magic and applied: something you learn by doing
rather than by being all Platonic.

Books, articles, suggestions?

It would be great if we fleshed out this floorplanning problem and turned it
from a black art into a previously solved problem.

Thanks, Sean.





(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )

Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 23 0:38:00 2002

Sean wrote:
> I'd like to practice floor planning and working with Xilinx tools so I grabbed
Knowing the hardware and your logic design is the best thing. Floorplanning
may speed things up over a random layout, but only if the design FITS
well in the FPGA. Also, many logic examples could favor features of a
specific FPGA family or supplier. An 8-bit adder may fit in, say, exactly
a row of 8 macro cells and be fast, yet an 8-bit adder with carry out
could need an extra logic cell for buffering and be only half the speed.
It is this type of logic where floorplanning has the most room for
speed improvement. Also, I expect that for speed an FPGA can't be more than about
33% full, as beyond that the FAST lines could all be used up.
--
Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html







Re: floorplanning ... can someone offer guru advice ? - Eric Smith - Mar 23 4:14:00 2002

Ben wrote:
> An 8-bit adder may fit in, say,
> exactly a row of 8 macro cells and be fast, yet an 8-bit adder with carry
> out could need an extra logic cell for buffering and be only half the
> speed.

You're not going to lose half the speed just to get a carry out. If the
tools don't do the right thing for you automatically, just force it to be a
9-bit adder and use the ninth output as the carry out (with zeros for the
9th bit of each input). Unless something really strange is going on, this
shouldn't slow it down more than 13% over the 8-bit adder without carry
out.
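
To make the widening trick concrete, here is a small behavioral model in Python rather than HDL. It is only a sketch of the arithmetic, not of what any synthesis tool emits; the function name `add_with_carry` and the `width` parameter are made up for illustration.

```python
def add_with_carry(a, b, width=8):
    """Model a `width`-bit adder whose carry out is taken from one extra bit.

    Both operands are zero-extended by one bit (their top bit is 0), the
    sum is computed `width + 1` bits wide, and the extra top bit of the
    result is used as the carry out.
    """
    mask = (1 << width) - 1
    if not (0 <= a <= mask and 0 <= b <= mask):
        raise ValueError("operands must fit in `width` bits")
    wide_sum = (a + b) & ((1 << (width + 1)) - 1)  # (width + 1)-bit result
    carry_out = (wide_sum >> width) & 1            # the extra top bit
    return wide_sum & mask, carry_out


if __name__ == "__main__":
    print(add_with_carry(200, 100))  # overflows 8 bits, so carry out is 1
    print(add_with_carry(100, 27))   # fits in 8 bits, so carry out is 0
```

In an HDL the same idea is a 9-bit addition with a zero concatenated onto each 8-bit input, so the carry comes out of the ordinary carry chain instead of separate logic.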

In my experience, unless you're doing something a lot more exotic than
an adder with a carry out, the tools do the mapping just fine. I only
have had to tweak the placement.






Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 23 11:57:00 2002

Eric Smith wrote:
>
> Ben wrote:
> > An 8-bit adder may fit in, say,
> > exactly a row of 8 macro cells and be fast, yet an 8-bit adder with carry
> > out could need an extra logic cell for buffering and be only half the
> > speed.
>
> You're not going to lose half the speed just to get a carry out. If the
> tools don't do the right thing for you automatically, just force it to be a
> 9-bit adder and use the ninth output as the carry out (with zeros for the
> 9th bit of each input). Unless something really strange is going on, this
> shouldn't slow it down more than 13% over the 8-bit adder without carry
> out.
>
> In my experience, unless you're doing something a lot more exotic than
> an adder with a carry out, the tools do the mapping just fine. I only
> have had to tweak the placement.

Like I said, know the hardware. I use the other brand of FPGA. In my case
CLBs are arranged in blocks of 8. With this FPGA, the carry out has to
skip over the next block of macro cells before it can be routed to a CLB.
I have a 24-bit CPU, and no amount of floorplanning can speed things up
because the data path is too wide to fit nicely. The adder example was just
picked at random; every FPGA has its own features and flaws.

--
Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html






Re: floorplanning ... can someone offer guru advice ? - Eric Smith - Mar 23 19:37:00 2002

Ben wrote:
> Like I said, know the hardware. I use the other brand of FPGA. In my
> case CLBs are arranged in blocks of 8. With this FPGA, the carry out
> has to skip over the next block of macro cells before it can be routed
> to a CLB.

Altera doesn't support hardwired carry chain beyond 8 bits, and has
a huge performance penalty for more than 8? I didn't know that, and
am now *much* less likely to ever consider Altera parts. Most of my
designs have counters and adders wider than 8 bits in the critical
paths. It's hard for me to believe that they would really do something
this dumb.

Eric





Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 23 23:06:00 2002

Eric Smith wrote:

> Altera doesn't support hardwired carry chain beyond 8 bits, and has
> a huge performance penalty for more than 8? I didn't know that, and
> am now *much* less likely to ever consider Altera parts. Most of my
> designs have counters and adders wider than 8 bits in the critical
> paths. It's hard for me to believe that they would really do something
> this dumb.

It is not that you can't have adders larger than 8 bits; it is just that
routing for adders could use up the 'fast' lines quickly. It looks like Altera
routes slower but has more consistent delays and could route higher
density designs better.
--
Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html







Re: floorplanning ... can someone offer guru advice ? - Eric Smith - Mar 24 7:36:00 2002

Ben wrote:
> It is not that you can't have adders larger than 8 bits; it is just
> that routing for adders could use up the 'fast' lines quickly. It looks
> like Altera routes slower but has more consistent delays and could
> route higher density designs better.

Yeah, but you said that a 9-bit adder would run at half the speed of
an 8-bit adder. For me, that makes the parts useless. I don't *have*
high-density designs with only 8-bit-wide data paths. I'll stick
to Xilinx, they understand that people want wide data paths.






Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 24 10:09:00 2002

Eric Smith wrote:
> Yeah, but you said that a 9-bit adder would run at half the speed of
> an 8-bit adder. For me, that makes the parts useless. I don't *have*
> high-density designs with only 8-bit-wide data paths. I'll stick
> to Xilinx, they understand that people want wide data paths.

The logic is 1 standard gate delay every 8 bits. 8 bits: 1 unit delay +
ripple carry; 16 bits: 2 units + ripple carry; 32 bits: 4 units + ripple
carry. I have not proved this in any way, but that looks to be the case.
With my 24-bit CPU I can be 98% full and still route, keep a standard pin
layout, and get a reasonable speed. I have not used Xilinx because, when I
got my FPGA development board, Xilinx did not have 1) free development
software that was not crippled or 2) a low-cost FPGA board of about 500 CLBs
in a chip.

Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html






Re: floorplanning ... can someone offer guru advice ? - Jacob Nelson - Mar 24 17:42:00 2002

On Sun, 24 Mar 2002, Eric Smith wrote:

> Yeah, but you said that a 9-bit adder would run at half the speed of
> an 8-bit adder. For me, that makes the parts useless. I don't *have*
> high-density designs with only 8-bit-wide data paths. I'll stick
> to Xilinx, they understand that people want wide data paths.

I don't see a big performance degradation when the carry chain leaves the
LAB.

I built a little variable-width adder with registered inputs and outputs
to explore this. After setting various options to get Max+Plus II to use
the carry chain, and setting the device to the (old and slow) Flex 10K20-4
that I have on my UP1 proto board, this is what the timing analyzer says
it'll do:

8 bits: 101 MHz
9 bits: 97 MHz
16 bits: 74 MHz
18 bits: 67 MHz
32 bits: 50 MHz
36 bits: 45 MHz

The 8 bit version fits in one LAB, so the carry delay path includes 7
inter-LE delays. The 32 bit version fits in 4 LABs, so the carry delay
path includes 28 inter-LE delays, and 3 inter-LAB delays. In the
floorplanner, you can see the carry chain go through one LAB, jump to the
LAB two over, and continue. I don't know how the carry chain is carried
between LABs, but it seems to happen reasonably efficiently. In more
modern Altera parts, a LAB has 10 LEs, I think.
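
The hop counts in the paragraph above can be reproduced with a few lines of Python. This is only a sketch: it assumes the chain starts at the first LE of a LAB and fills each LAB completely before moving on, and the name `carry_hops` and the `les_per_lab` parameter are invented for the example.

```python
def carry_hops(width, les_per_lab=8):
    """Count the hops on a ripple-carry path for a `width`-bit adder.

    Returns (labs, inter_le_hops, inter_lab_hops). A chain of `width`
    LEs has `width - 1` carry hops in total; each boundary crossed
    between LABs turns one of them into an inter-LAB hop.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    labs = -(-width // les_per_lab)          # ceiling division
    inter_lab_hops = labs - 1
    inter_le_hops = (width - 1) - inter_lab_hops
    return labs, inter_le_hops, inter_lab_hops


if __name__ == "__main__":
    for width in (8, 9, 16, 18, 32, 36):
        labs, le_hops, lab_hops = carry_hops(width)
        print(f"{width:2d} bits: {labs} LAB(s), "
              f"{le_hops} inter-LE hops, {lab_hops} inter-LAB hops")
```

For 8 and 32 bits this gives 7 inter-LE hops, and 28 inter-LE plus 3 inter-LAB hops, matching the figures quoted above. Passing `les_per_lab=10` models the 10-LE LABs mentioned for newer parts.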

Considering that the board has a 27 MHz oscillator, I'm not complaining.
Of course, things will slow down when one adds actual functionality....

jake






Re: floorplanning ... can someone offer guru advice ? - Eric Smith - Mar 24 18:30:00 2002

> 8 bits: 101 MHz
> 9 bits: 97 MHz

That seems quite reasonable; less than 4% degradation for the extra
bit.
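
For anyone who wants to check the arithmetic, a quick sketch in Python follows. The helper names are made up, and the only inputs are the two quoted timing-analyzer figures.

```python
def degradation_percent(f_before_mhz, f_after_mhz):
    """Percentage drop in maximum clock frequency."""
    return 100.0 * (f_before_mhz - f_after_mhz) / f_before_mhz


def period_ns(f_mhz):
    """Clock period in nanoseconds for a frequency given in MHz."""
    return 1000.0 / f_mhz


drop = degradation_percent(101, 97)           # 8-bit vs 9-bit adder
extra = period_ns(97) - period_ns(101)        # added delay for the 9th bit

if __name__ == "__main__":
    print(f"{drop:.2f}% slower, {extra:.2f} ns longer period")
    # prints: 3.96% slower, 0.41 ns longer period
```

So the ninth bit, including the jump to the next LAB, costs roughly 0.4 ns on that part.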

Ben, why were you claiming 50%?





Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 24 21:03:00 2002

Eric Smith wrote:
>
> > 8 bits: 101 MHz
> > 9 bits: 97 MHz
>
> That seems quite reasonable; less than 4% degradation for the extra
> bit.
>
> Ben, why were you claiming 50%?
>
A ballpark figure here.
Routing plays a large factor. If the carry out line happens to be
placed on a slow line a large distance away, things slow down. I have not
done floorplanning, but carry paths could be aided by it.

--
Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html






Re: floorplanning ... can someone offer guru advice ? - Ben Franchuk - Mar 24 21:19:00 2002

Jacob Nelson wrote:
>
> On Sun, 24 Mar 2002, Eric Smith wrote:
>
> > Yeah, but you said that a 9-bit adder would run at half the speed of
> > an 8-bit adder. For me, that makes the parts useless. I don't *have*
> > high-density designs with only 8-bit-wide data paths. I'll stick
> > to Xilinx, they understand that people want wide data paths.
>
> I don't see a big performance degradation when the carry chain leaves the
> LAB.
>
> I built a little variable-width adder with registered inputs and outputs
> to explore this. After setting various options to get Max+Plus II to use
> the carry chain, and setting the device to the (old and slow) Flex 10K20-4
> that I have on my UP1 proto board, this is what the timing analyzer says
> it'll do:
>
> 8 bits: 101 MHz
> 9 bits: 97 MHz
> 16 bits: 74 MHz
> 18 bits: 67 MHz
> 32 bits: 50 MHz
> 36 bits: 45 MHz
>
> The 8 bit version fits in one LAB, so the carry delay path includes 7
> inter-LE delays. The 32 bit version fits in 4 LABs, so the carry delay
> path includes 28 inter-LE delays, and 3 inter-LAB delays. In the
> floorplanner, you can see the carry chain go through one LAB, jump to the
> LAB two over, and continue. I don't know how the carry chain is carried
> between LABs, but it seems to happen reasonably efficiently. In more
> modern Altera parts, a LAB has 10 LEs, I think.
>
> Considering that the board has a 27 MHz oscillator, I'm not complaining.
> Of course, things will slow down when one adds actual functionality....

Try 24 bits in a 10K10! With my CPU I have had a speed range from 20
MHz down to 14 MHz due to routing delays, depending on I/O assignments and
how full the FPGA is (95 to 98%).
Note I use 6809-style timing, so that is a 3.5 MHz to 5 MHz real memory
cycle.
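
The memory-cycle figures work out as a divide-by-four of the FPGA clock. The factor of four is inferred here from the numbers quoted (20 to 5, 14 to 3.5) rather than stated outright, so treat `clocks_per_cycle` in this small Python sketch as an assumption.

```python
def memory_cycle_mhz(clock_mhz, clocks_per_cycle=4):
    """Memory cycle rate when one cycle takes `clocks_per_cycle` clocks.

    The default of 4 is an assumption inferred from the quoted figures.
    """
    return clock_mhz / clocks_per_cycle


if __name__ == "__main__":
    for clock in (20, 14):
        rate = memory_cycle_mhz(clock)
        print(f"{clock} MHz clock -> {rate:.1f} MHz memory cycle")
```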
--
Ben Franchuk - Dawn * 12/24 bit cpu *
www.jetnet.ab.ca/users/bfranchuk/index.html




