
This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

Help - Shifter using MUXCYs - Lucian Damoc - May 7 8:03:00 2005


Hello,

I've designed a shifter using only MUXCYs (found in
Xilinx FPGAs). Instead of using the usual 2:1 MUX
(implemented in a LUT), I've used the MUXCY (see the
attached Verilog file).

XST synthesized the design with no errors/warnings
(target device = SpartanIIE, 300K).
The delay (as shown in the .syr file) seems very good.
Synplify gives a totally different delay for the
design (it's worse than using the classic 2:1 MUX).

My questions are:

1) Has anyone done something similar?

2) Which should I trust: XST or Synplify?
(I would like to trust XST, but looking at the
SLICE organization, such a small delay seems
impossible with the MUXCYs connected the way I did.
I expected to see some routing delay on the net
connecting the O output of one MUXCY to the DI input
of the next MUXCY.)

A derivative of this shifter is part of the small
32-bit microprocessor that I'm designing for my
diploma project... so I would REALLY appreciate your
help as I don't have much time left.

Thank you very much.

Lucian, Romania






(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )


Re: Help - Shifter using MUXCYs - Author Unknown - May 7 10:33:00 2005

Don't trust either XST or Synplify. Synthesis tools
can only estimate timing.

Run the Xilinx place and route and look at the timing
results for both of your synthesized designs. That
should give you real results.

-Chris


Re: Help - Shifter using MUXCYs - Rob Finch - May 8 0:05:00 2005

> I've designed a shifter using only MUXCYs (found in
> Xilinx FPGAs). Instead of using the usual 2:1 MUX
> (implemented in a LUT), I've used the MUXCY (see the
> attached Verilog file).
>
> XST synthesized the design with no errors/warnings
> (target device = SpartanIIE, 300K).
> The delay (as shown in the .syr file) seems very good.
> Synplify gives a totally different delay for the
> design (it's worse than using the classic 2:1 MUX).
>
> My questions are:
>
> 1) Has someone done something similar ?
>
> 2) Who should I trust: XST or Synplify ?
> (I would like to trust XST... but by looking at the
> SLICE organization it seems impossible to obtain such
> a small delay... using the MUXCY in the way I did... I
> expected to see some routing-delay for the net
> connecting the output of a MUXCY to the DI input of
> the next MUXCY).
>
> A derivative of this shifter is part of the small
> 32-bit microprocessor that I'm designing for my
> diploma project... so I would REALLY appreciate your
> help as I don't have much time left.
>
> Thank you very much.
>
> Lucian, Romania

Can you post the code (or is it somewhere on the net), or is it too
long? (It looks like the attachment didn't come through.) Just how
small is the delay? If it seems too good to be true...

I've found the XST synthesizer to be not bad at estimating times. One
thing to keep in mind is that depending on how things are put
together, the actual time may be quite a bit worse. A component in a
large system might not have the same timing as it does all by itself,
depending on the placement and other factors.

I've done some experimentation with shifters, and I think using a
4-to-1 mux and shifting zero to three bits at a time is a bit more
efficient than using a cascade of 2-to-1 muxes. (I'm guessing at how
your shifter is organized.) However, I'm using architecture-neutral
code, so I've not tried building the shifter from Xilinx primitives,
which might give better times.
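
For readers who want to try the 4-to-1 idea, here is a minimal architecture-neutral sketch. It is only one possible reading of the suggestion above, not Rob's actual code: the module name shr32_mux4, the port name dout, and the 0-3 / 0-4-8-12 / 0-16 stage split are all assumptions made for illustration, and it handles only a logical right shift.

```verilog
// Hypothetical sketch: 32-bit logical right shifter built from
// two 4:1 mux stages and one 2:1 mux stage (instead of five 2:1 stages).
module shr32_mux4 (
    input  wire [31:0] di,
    input  wire [4:0]  sa,    // shift amount
    output wire [31:0] dout
);
    reg [31:0] s0, s1;

    // Stage 0: shift by 0, 1, 2 or 3 bits, selected by sa[1:0].
    always @* begin
        case (sa[1:0])
            2'd0:    s0 = di;
            2'd1:    s0 = {1'b0, di[31:1]};
            2'd2:    s0 = {2'b0, di[31:2]};
            default: s0 = {3'b0, di[31:3]};
        endcase
    end

    // Stage 1: shift by 0, 4, 8 or 12 bits, selected by sa[3:2].
    always @* begin
        case (sa[3:2])
            2'd0:    s1 = s0;
            2'd1:    s1 = {4'b0,  s0[31:4]};
            2'd2:    s1 = {8'b0,  s0[31:8]};
            default: s1 = {12'b0, s0[31:12]};
        endcase
    end

    // Stage 2: shift by 0 or 16 bits, selected by sa[4].
    assign dout = sa[4] ? {16'b0, s1[31:16]} : s1;
endmodule
```

Whether this actually beats a 2:1 cascade depends on how the synthesizer maps the 4:1 muxes onto the target device, so it is worth comparing post-place-and-route timing for both.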

Are you going to make the details of your micro available?





Re: Help - Shifter using MUXCYs - Lucian Damoc - May 9 5:02:00 2005


Hello again,

Sorry that the attachment didn't come through. I attached an
archive (to that first email) with these files:
>> .v (design)
>> .syr (XST post-synthesis timing)
>> .srr (Synplify post-synthesis timing)

... but it seems the archive didn't make it.

======

I'll list the code (from the .v file) here:

//====================================================
// 32-bit shifter - can shift RIGHT or LEFT
//
// if (left_right == 1'b0) shift_right
// else                    shift_left
//
//====================================================

`timescale 1ns / 10ps

//===================================================

module SHIFTER(left_right, di, sa, do);

input left_right;
input [31:0] di;
input [4:0] sa; // shift amount

output [31:0] do;

// input 2:1 mux
// 0-7
wire [31:0] di_core;
wire [31:0] do_core;

// level 0 wires
wire [31:0] mux_L0;
wire [31:2] mux_L0_copy;
// level 1 wires
wire [31:0] mux_L1;
wire [31:4] mux_L1_copy;
// level 2 wires
wire [31:0] mux_L2;
wire [31:8] mux_L2_copy;
// level 3 wires
wire [31:0] mux_L3;
wire [31:16] mux_L3_copy;

// level 4 wires
//wire [31:0] mux_L4;
//wire [31:2] mux_L4_copy;

// MUXCY_D MUXCY_D_instance_name ( .LO (user_LO), .O (user_O),
//     .CI (user_CI), .DI (user_DI), .S (user_S));
// SOMETHING LIKE: MUX2_1 mux_name( .S(sel), .DI(din0),
//     .CI(din1), .O(dout), .LO(dout_copy) )

// MUXCY MUXCY_instance_name (.O (user_O), .CI (user_CI),
//     .DI (user_DI), .S (user_S));
// SOMETHING LIKE: MUX2_1 mux_name( .S(sel), .DI(din0),
//     .CI(din1), .O(dout) )

// input mux:  if (left_right == 1'b1) reverse the bit order of di[31:0]
// output mux: if (left_right == 1'b1) reverse the bit order of do[31:0]
genvar i, j, k;
generate for(i=0; i<=31; i=i+1) begin: mux_in_out_loop
    MUXCY MUXCY_IN ( .S(left_right),
        .DI(di[i]), .CI(di[31-i]), .O(di_core[i]));
    MUXCY MUXCY_OUT ( .S(left_right),
        .DI(do_core[i]), .CI(do_core[31-i]), .O(do[i]));
end
endgenerate

// LEVEL 0
MUXCY MUXCY_L0_00 ( .S(sa[0]), .DI(di_core[0]),
    .CI(di_core[1]), .O(mux_L0[0]));
MUXCY MUXCY_L0_01 ( .S(sa[0]), .DI(di_core[1]),
    .CI(di_core[2]), .O(mux_L0[1]));

//MUXCY_D MUXCY_L0_02 ( .S(sa[0]), .DI(di_core[2]),
//    .CI(di_core[3]), .O(mux_L0[2]), .LO(mux_L0_copy[2]));
generate for(i=2; i<=30; i=i+1) begin: mux_L0_loop
    MUXCY_D MUXCY_L0 ( .S(sa[0]),
        .DI(di_core[i]), .CI(di_core[i+1]), .O(mux_L0[i]),
        .LO(mux_L0_copy[i]));
end
endgenerate
MUXCY_D MUXCY_L0_31 ( .S(sa[0]), .DI(di_core[31]),
    .CI(1'b0), .O(mux_L0[31]), .LO(mux_L0_copy[31]));

// LEVEL 1
MUXCY MUXCY_L1_00 ( .S(sa[1]), .DI(mux_L0[0]),
    .CI(mux_L0_copy[2]), .O(mux_L1[0]));
MUXCY MUXCY_L1_01 ( .S(sa[1]), .DI(mux_L0[1]),
    .CI(mux_L0_copy[3]), .O(mux_L1[1]));
MUXCY MUXCY_L1_02 ( .S(sa[1]), .DI(mux_L0[2]),
    .CI(mux_L0_copy[4]), .O(mux_L1[2]));
MUXCY MUXCY_L1_03 ( .S(sa[1]), .DI(mux_L0[3]),
    .CI(mux_L0_copy[5]), .O(mux_L1[3]));
//
//MUXCY_D MUXCY_L1_04 ( .S(sa[1]), .DI(mux_L0[4]),
//    .CI(mux_L0_copy[6]), .O(mux_L1[4]), .LO(mux_L1_copy[4]));
//MUXCY_D MUXCY_L1_29 ( .S(sa[1]), .DI(mux_L0[29]),
//    .CI(mux_L0_copy[31]), .O(mux_L1[29]), .LO(mux_L1_copy[29]));
generate for(i=4; i<=29; i=i+1) begin: mux_L1_loop
    MUXCY_D MUXCY_L1 ( .S(sa[1]),
        .DI(mux_L0[i]), .CI(mux_L0_copy[i+2]), .O(mux_L1[i]),
        .LO(mux_L1_copy[i]));
end
endgenerate
MUXCY_D MUXCY_L1_30 ( .S(sa[1]), .DI(mux_L0[30]),
    .CI(1'b0), .O(mux_L1[30]), .LO(mux_L1_copy[30]));
MUXCY_D MUXCY_L1_31 ( .S(sa[1]), .DI(mux_L0[31]),
    .CI(1'b0), .O(mux_L1[31]), .LO(mux_L1_copy[31]));

// LEVEL 2
//
//MUXCY MUXCY_L2_00 ( .S(sa[2]), .DI(mux_L1_00),
//    .CI(mux_L1_04_copy), .O(mux_L2_00));
//MUXCY MUXCY_L2_07 ( .S(sa[2]), .DI(mux_L1_07),
//    .CI(mux_L1_11_copy), .O(mux_L2_07));
generate for(i=0; i<=7; i=i+1) begin: mux_L2_0_7_loop
    MUXCY MUXCY_L2 ( .S(sa[2]), .DI(mux_L1[i]),
        .CI(mux_L1_copy[i+4]), .O(mux_L2[i]));
end
endgenerate

//MUXCY_D MUXCY_L2_08 ( .S(sa[2]), .DI(mux_L1_08),
//    .CI(mux_L1_12_copy), .O(mux_L2_08), .LO(mux_L2_08_copy));
//MUXCY_D MUXCY_L2_27 ( .S(sa[2]), .DI(mux_L1_27),
//    .CI(mux_L1_31_copy), .O(mux_L2_27), .LO(mux_L2_27_copy));
generate for(i=8; i<=27; i=i+1) begin: mux_L2_8_27_loop
    MUXCY_D MUXCY_L2 ( .S(sa[2]),
        .DI(mux_L1[i]), .CI(mux_L1_copy[i+4]), .O(mux_L2[i]),
        .LO(mux_L2_copy[i]));
end
endgenerate

MUXCY_D MUXCY_L2_28 ( .S(sa[2]), .DI(mux_L1[28]),
    .CI(1'b0), .O(mux_L2[28]), .LO(mux_L2_copy[28]));
MUXCY_D MUXCY_L2_29 ( .S(sa[2]), .DI(mux_L1[29]),
    .CI(1'b0), .O(mux_L2[29]), .LO(mux_L2_copy[29]));
MUXCY_D MUXCY_L2_30 ( .S(sa[2]), .DI(mux_L1[30]),
    .CI(1'b0), .O(mux_L2[30]), .LO(mux_L2_copy[30]));
MUXCY_D MUXCY_L2_31 ( .S(sa[2]), .DI(mux_L1[31]),
    .CI(1'b0), .O(mux_L2[31]), .LO(mux_L2_copy[31]));

// LEVEL 3
//
//MUXCY MUXCY_L3_00 ( .S(sa[3]), .DI(mux_L2_00),
//    .CI(mux_L2_08_copy), .O(mux_L3_00));
//MUXCY MUXCY_L3_15 ( .S(sa[3]), .DI(mux_L2_15),
//    .CI(mux_L2_23_copy), .O(mux_L3_15));
generate for(i=0; i<=15; i=i+1) begin: mux_L3_0_15_loop
    MUXCY MUXCY_L3 ( .S(sa[3]), .DI(mux_L2[i]),
        .CI(mux_L2_copy[i+8]), .O(mux_L3[i]));
end
endgenerate
//
//MUXCY_D MUXCY_L3_16 ( .S(sa[3]), .DI(mux_L2_16),
//    .CI(mux_L2_24_copy), .O(mux_L3_16), .LO(mux_L3_16_copy));
//MUXCY_D MUXCY_L3_23 ( .S(sa[3]), .DI(mux_L2_23),
//    .CI(mux_L2_31_copy), .O(mux_L3_23), .LO(mux_L3_23_copy));
generate for(i=16; i<=23; i=i+1) begin: mux_L3_16_23_loop
    MUXCY_D MUXCY_L3 ( .S(sa[3]),
        .DI(mux_L2[i]), .CI(mux_L2_copy[i+8]), .O(mux_L3[i]),
        .LO(mux_L3_copy[i]));
end
endgenerate
//
//MUXCY_D MUXCY_L3_24 ( .S(sa[3]), .DI(mux_L2_24),
//    .CI(1'b0), .O(mux_L3_24), .LO(mux_L3_24_copy));
//MUXCY_D MUXCY_L3_31 ( .S(sa[3]), .DI(mux_L2_31),
//    .CI(1'b0), .O(mux_L3_31), .LO(mux_L3_31_copy));
generate for(i=24; i<=31; i=i+1) begin: mux_L3_24_31_loop
    MUXCY_D MUXCY_L3 ( .S(sa[3]),
        .DI(mux_L2[i]), .CI(1'b0), .O(mux_L3[i]),
        .LO(mux_L3_copy[i]));
end
endgenerate

// LEVEL 4
//
//MUXCY MUXCY_L4_00 ( .S(sa[4]), .DI(mux_L3_00),
//    .CI(mux_L3_16_copy), .O(do_core[0])); //.O(mux_L4_00));
//MUXCY MUXCY_L4_15 ( .S(sa[4]), .DI(mux_L3_15),
//    .CI(mux_L3_31_copy), .O(do_core[15])); //.O(mux_L4_15));
generate for(i=0; i<=15; i=i+1) begin: mux_L4_0_15_loop
    MUXCY MUXCY_L4 ( .S(sa[4]), .DI(mux_L3[i]),
        .CI(mux_L3_copy[i+16]), .O(do_core[i]));
end
endgenerate

//
//MUXCY MUXCY_L4_16 ( .S(sa[4]), .DI(mux_L3_16),
//    .CI(1'b0), .O(do[16])); //.O(mux_L4_16));
//MUXCY MUXCY_L4_31 ( .S(sa[4]), .DI(mux_L3_31),
//    .CI(1'b0), .O(do[31])); //.O(mux_L4_31));
generate for(i=16; i<=31; i=i+1) begin: mux_L4_16_31_loop
    MUXCY MUXCY_L4 ( .S(sa[4]), .DI(mux_L3[i]),
        .CI(1'b0), .O(do_core[i]));
end
endgenerate

endmodule
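
A quick functional check of the listing above can be sketched as a small self-checking testbench. This was not part of the original post: the testbench name SHIFTER_tb and the signal names are hypothetical, it assumes the Xilinx simulation models for MUXCY and MUXCY_D are compiled alongside it, and it assumes the simulator runs in Verilog-2001 mode (the port name "do" is a reserved word in SystemVerilog). It checks function only and says nothing about timing.

```verilog
`timescale 1ns / 10ps

// Hypothetical self-checking testbench for the SHIFTER module above.
module SHIFTER_tb;

reg         left_right;
reg  [31:0] di;
reg  [4:0]  sa;
wire [31:0] result;
reg  [31:0] expected;
reg  [6:0]  n;        // 7 bits so the loop condition n < 64 can become false
integer     errors;

// The output port is named "do" in the posted module.
SHIFTER dut ( .left_right(left_right), .di(di), .sa(sa), .do(result) );

initial begin
    errors = 0;
    // n[5] selects the direction, n[4:0] the shift amount.
    for (n = 0; n < 64; n = n + 1) begin
        left_right = n[5];
        sa         = n[4:0];
        di         = $random;
        // Per the header comment: 0 = shift right, 1 = shift left.
        expected   = left_right ? (di << sa) : (di >> sa);
        #10;  // let the combinational outputs settle
        if (result !== expected) begin
            errors = errors + 1;
            $display("MISMATCH left_right=%b sa=%0d di=%h got=%h expected=%h",
                     left_right, sa, di, result, expected);
        end
    end
    $display("done, %0d mismatches", errors);
    $finish;
end

endmodule
```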

==========================

.syr file (part of it) - contains the delay given by XST:

-------------------------------------------------------------------------
Timing constraint: Default path analysis
  Delay:        12.320ns (Levels of Logic = 10)
  Source:       left_right (PAD)
  Destination:  do<31> (PAD)

  Data Path: left_right to do<31>
                          Gate     Net
  Cell:in->out   fanout   Delay    Delay   Logical Name
  ------------   ------   -----    -----   ---------------------------------------
  IBUF:I->O        64     0.797    4.100   left_right_IBUF
  LUT1:I0->O        1     0.468    0.000   left_right_IBUF_rt
  MUXCY:S->O        0     0.515    0.000   mux_in_out_loop[0].MUXCY_IN
  MUXCY:DI->O       0     0.153    0.000   MUXCY_L0_00
  MUXCY:DI->O       0     0.153    0.000   MUXCY_L1_00
  MUXCY:DI->O       0     0.153    0.000   mux_L2_0_7_loop[0].MUXCY_L2 (mux_L2<0>)
  MUXCY:DI->O       0     0.153    0.000   mux_L3_0_15_loop[0].MUXCY_L3 (mux_L3<0>)
  MUXCY:DI->O       1     0.153    0.000   mux_L4_0_15_loop[0].MUXCY_L4 (do_core<0>)
  MUXCY:DI->O       1     0.153    0.920   mux_in_out_loop[0].MUXCY_OUT (do_0_OBUF)
  OBUF:I->O               4.602            do_0_OBUF (do<0>)
  ------------------------------------------------------------------------------
  Total                  12.320ns (7.300ns logic, 5.020ns route)
                                  (59.3% logic, 40.7% route)
==========

.srr file (part of it) - delay given by Synplify:

===========================================================

Instance / Net                                Pin          Pin           Arrival  No. of
Name                                 Type     Name         Dir   Delay   Time     Fan Out(s)
---------------------------------------------------------------------------------------------
left_right                           Port     left_right   In    0.000    0.000   -
left_right                           Net      -            -     0.000   -        1
left_right_ibuf                      IBUF     I            In    -        0.000   -
left_right_ibuf                      IBUF     O            Out   1.047    1.047   -
left_right_c                         Net      -            -     2.478   -        64
mux_in_out_loop[31].MUXCY_IN_sf      LUT1     I0           In    -        3.525   -
mux_in_out_loop[31].MUXCY_IN_sf      LUT1     O            Out   0.468    3.993   -
mux_in_out_loop\[31\].MUXCY_IN_sf    Net      -            -     0.000   -        1
mux_in_out_loop[31].MUXCY_IN         MUXCY    S            In    -        3.993   -
mux_in_out_loop[31].MUXCY_IN         MUXCY    O            Out   0.464    4.457   -
di_core[31]                          Net      -            -     0.947   -        2
MUXCY_L0_31                          MUXCY_D  DI           In    -        5.404   -
MUXCY_L0_31                          MUXCY_D  LO           Out   0.380    5.784   -
mux_L0_copy[31]                      Net      -            -     0.750   -        1
mux_L1_loop[29].MUXCY_L1             MUXCY_D  CI           In    -        6.534   -
mux_L1_loop[29].MUXCY_L1             MUXCY_D  LO           Out   0.487    7.021   -
mux_L1_copy[29]                      Net      -            -     0.750   -        1
mux_L2_8_27_loop[25].MUXCY_L2        MUXCY_D  CI           In    -        7.771   -
mux_L2_8_27_loop[25].MUXCY_L2        MUXCY_D  LO           Out   0.487    8.258   -
mux_L2_copy[25]                      Net      -            -     0.750   -        1
mux_L3_16_23_loop[17].MUXCY_L3       MUXCY_D  CI           In    -        9.008   -
mux_L3_16_23_loop[17].MUXCY_L3       MUXCY_D  LO           Out   0.487    9.495   -
mux_L3_copy[17]                      Net      -            -     0.750   -        1
mux_L4_0_15_loop[1].MUXCY_L4         MUXCY    CI           In    -       10.245   -
mux_L4_0_15_loop[1].MUXCY_L4         MUXCY    O            Out   0.487   10.732   -
do_core[1]                           Net      -            -     0.947   -        2
mux_in_out_loop[1].MUXCY_OUT         MUXCY    DI           In    -       11.679   -
mux_in_out_loop[1].MUXCY_OUT         MUXCY    O            Out   0.380   12.059   -
do_1[1]                              Net      -            -     0.750   -        1
do_obuf[1]                           OBUF     I            In    -       12.809   -
do_obuf[1]                           OBUF     O            Out   4.851   17.660   -
do[1]                                Net      -            -     0.000   -        1
do[31:0]                             Port     do[1]        Out   -       17.660   -
=============================================================================================
Total path delay (propagation time + setup) of 17.660
is 9.538 (54.0%) logic and 8.122 (46.0%) route.

===========================

I agree about: "A component in a large system might
not have the same timing as it does all by itself,
depending on the placement and other factors."
I learned it ... the hard way.

There are 7 levels of 2:1 MUXes (implemented using
MUXCYs):
- levels 0 and 6 (MUXCY_IN and MUXCY_OUT in the code)
reverse the bit order, which converts the right shift
into a left shift
- levels 1-5 (labelled LEVEL 0 to LEVEL 4 in the code)
do the right shift
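
The structure described in the list above can be written as a short behavioral reference model, which may be handy for comparing against the MUXCY netlist in simulation. This is only a sketch under assumptions: the module name shifter_ref and the function name reverse32 are made up for illustration, and the output port is called dout rather than do because "do" is a reserved word in SystemVerilog.

```verilog
// Hypothetical behavioral model of the reverse / shift right / reverse scheme.
module shifter_ref (
    input  wire        left_right,  // 0 = shift right, 1 = shift left
    input  wire [31:0] di,
    input  wire [4:0]  sa,          // shift amount
    output wire [31:0] dout
);
    // Reverse the bit order of a 32-bit word.
    function [31:0] reverse32;
        input [31:0] x;
        integer      n;
        reg   [31:0] r;
        begin
            for (n = 0; n < 32; n = n + 1)
                r[n] = x[31 - n];
            reverse32 = r;
        end
    endfunction

    // Level 0: optional bit reversal on the way in.
    wire [31:0] core_in  = left_right ? reverse32(di) : di;
    // Levels 1-5: logical right shift by sa.
    wire [31:0] core_out = core_in >> sa;
    // Level 6: optional bit reversal on the way out.
    assign dout = left_right ? reverse32(core_out) : core_out;
endmodule
```

Reversing, shifting right, and reversing again gives the same result as a logical left shift, so dout equals (di << sa) when left_right is 1 and (di >> sa) otherwise.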

In XST each of the levels 1 to 5 gives a delay of
0.153ns (it's pretty good compared to the delay of a
2:1 MUX implemented using a LUT).

There's something else I'm worried about:

Can the MUXCYs REALLY be connected in the way I did
in the shifter design? (I've looked at the technology
view shown by Synplify, and it showed that each MUXCY
from levels 1 to 5 had a buffer on its output;
XST does NOT behave in this way.)

Thank you for your patience & time.

Lucian






RE: Re: Help - Shifter using MUXCYs - Jeffery, Robert - May 9 6:32:00 2005

Hi Lucian.

What are the device details, part and speedgrade?

Cheers.

Robert.


RE: Re: Help - Shifter using MUXCYs - Lucian Damoc - May 9 10:11:00 2005


Hello,

Yes... I forgot about some "details":

1) Technology: Xilinx Spartan IIe
2) Part: XC2S300E
3) Speed: -6
4) Package: FT256

What I don't understand is this:
I've looked at the slice & CLB detailed view in the
VirtexE datasheet (SpartanIIe is a derivative of
VirtexE) and I don't think the net-delay between the O
output of a MUXCY and the DI input of the next MUXCY
can be 0 ns. --- "Jeffery, Robert" <robert_jeffery@robe...>
wrote:

> Hi Lucian.
>
> What are the device details, part and speedgrade?
>
> Cheers.
>
> Robert.
>
> -----Original Message-----
> From: fpga-cpu@fpga...
> [mailto:fpga-cpu@fpga...] On
> Behalf Of Lucian Damoc
> Sent: 09 May 2005 10:03
> To: fpga-cpu@fpga...
> Subject: [fpga-cpu] Re: Help - Shifter using MUXCYs > Hello again,
>
> Sorry if the attachement didn't came... I attatched
> an archive (to that
> first email) with these files:
> >> .v (design),
> >> .syr (XST - after synthesys timing) >> .srr
> file (Synplify - after
> synthesys timing)
>
> ... but it seems the archive didn't make it.
>
> ======
>
> I'll list the code (from the .v file) here: //====================================================
> // 32-bit shifter - can shift RIGHT or LEFT // //
> // if(left_right ==
> 1'b0) shift_right
> // else shift_left
> //
>
//====================================================
>
> `timescale 1ns / 10ps //===================================================
>
> module SHIFTER(left_right, di, sa, do);
>
> input left_right;
> input [31:0] di;
> input [4:0] sa; // shift amount
>
> output [31:0] do;
>
> // input 2:1 mux
> // 0-7
> wire [31:0] di_core;
> wire [31:0] do_core;
>
> // level 0 wires
> wire [31:0] mux_L0;
> wire [31:2] mux_L0_copy;
> // level 1 wires
> wire [31:0] mux_L1;
> wire [31:4] mux_L1_copy;
> // level 2 wires
> wire [31:0] mux_L2;
> wire [31:8] mux_L2_copy;
> // level 3 wires
> wire [31:0] mux_L3;
> wire [31:16] mux_L3_copy;
>
> // level 4 wires
> //wire [31:0] mux_L4;
> //wire [31:2] mux_L4_copy; > // MUXCY_D MUXCY_D_instance_name ( .LO (user_LO), .O
> (user_O), .CI
> (user_CI), .DI (user_DI), .S (user_S)); // SOMETHING
> LIKE: MUX2_1
> mux_name( .S(sel) .DI(din0), .CI(din1), .O(dout),
> .LO(dout_copy) )
>
> // MUXCY MUXCY_instance_name (.O (user_O), .CI
> (user_CI), .DI (user_DI),
> .S (user_S)); // SOMETHING LIKE: MUX2_1 mux_name(
> .S(sel) .DI(din0),
> .CI(din1), .O(dout) )
>
> // input mux: if( left_right == 1'b1) invert
> di[31:0]; // output mux:
> if( left_right == 1'b1) invert do[31:0]; genvar i,
> j, k; generate
> for(i=0; i<=31; i=i+1) begin: mux_in_out_loop
> MUXCY MUXCY_IN ( .S(left_right),
> .DI(di[i]), .CI(di[31-i]),
> .O(di_core[i]));
> MUXCY MUXCY_OUT ( .S(left_right),
> .DI(do_core[i]),
> .CI(do_core[31-i]), .O(do[i]));
> end
> endgenerate > // LEVEL 0
> MUXCY MUXCY_L0_00 ( .S(sa[0]), .DI(di_core[0]),
> .CI(di_core[1]), .O(mux_L0[0]));
> MUXCY MUXCY_L0_01 ( .S(sa[0]), .DI(di_core[1]),
> .CI(di_core[2]), .O(mux_L0[1]));
>
> //MUXCY_D MUXCY_L0_02 ( .S(sa[0]), .DI(di_core[2]),
> .CI(di_core[3]),
> .O(mux_L0[2]), .LO(mux_L0_copy[2])); generate
> for(i=2; i<=30; i=i+1)
> begin: mux_L0_loop
> MUXCY_D MUXCY_L0 ( .S(sa[0]),
> .DI(di_core[i]), .CI(di_core[i+1]), .O(mux_L0[i]),
> .LO(mux_L0_copy[i]));
> end
> endgenerate
> MUXCY_D MUXCY_L0_31 ( .S(sa[0]), .DI(di_core[31]),
> .CI(1'b0) , .O(mux_L0[31]), .LO(mux_L0_copy[31])); > // LEVEL 1
> MUXCY MUXCY_L1_00 ( .S(sa[1]), .DI(mux_L0[0]),
> .CI(mux_L0_copy[2]), .O(mux_L1[0]));
> MUXCY MUXCY_L1_01 ( .S(sa[1]), .DI(mux_L0[1]),
> .CI(mux_L0_copy[3]), .O(mux_L1[1]));
> MUXCY MUXCY_L1_02 ( .S(sa[1]), .DI(mux_L0[2]),
> .CI(mux_L0_copy[4]), .O(mux_L1[2]));
> MUXCY MUXCY_L1_03 ( .S(sa[1]), .DI(mux_L0[3]),
> .CI(mux_L0_copy[5]), .O(mux_L1[3]));
> //
> //MUXCY_D MUXCY_L1_04 ( .S(sa[1]), .DI(mux_L0[4]),
> .CI(mux_L0_copy[6]),
> .O(mux_L1[4]), .LO(mux_L1_copy[4])); //MUXCY_D
> MUXCY_L1_29 ( .S(sa[1]),
> .DI(mux_L0[29]), .CI(mux_L0_copy[31]),
> .O(mux_L1[29]),
> .LO(mux_L1_copy[29])); generate for(i=4; i<=29;
> i=i+1) begin:
> mux_L1_loop
> MUXCY_D MUXCY_L1 ( .S(sa[1]),
> .DI(mux_L0[i]), .CI(mux_L0_copy[i+2]),
> .O(mux_L1[i]),
> .LO(mux_L1_copy[i]));
> end
> endgenerate
> MUXCY_D MUXCY_L1_30 ( .S(sa[1]), .DI(mux_L0[30]),
> .CI(1'b0),
> .O(mux_L1[30]), .LO(mux_L1_copy[30])); MUXCY_D
> MUXCY_L1_31 ( .S(sa[1]),
> .DI(mux_L0[31]), .CI(1'b0), .O(mux_L1[31]),
> .LO(mux_L1_copy[31])); > // LEVEL 2
> //
> //MUXCY MUXCY_L2_00 ( .S(sa[2]), .DI(mux_L1_00),
> .CI(mux_L1_04_copy), .O(mux_L2_00));
> //MUXCY MUXCY_L2_07 ( .S(sa[2]), .DI(mux_L1_07),
> .CI(mux_L1_11_copy), .O(mux_L2_07));
> generate for(i=0; i<=7; i=i+1) begin:
> mux_L2_0_7_loop
> MUXCY MUXCY_L2 ( .S(sa[2]),
> .DI(mux_L1[i]),
> .CI(mux_L1_copy[i+4]), .O(mux_L2[i]));
> end
> endgenerate
>
> //MUXCY_D MUXCY_L2_08 ( .S(sa[2]), .DI(mux_L1_08),
> .CI(mux_L1_12_copy),
> .O(mux_L2_08), .LO(mux_L2_08_copy)); //MUXCY_D
> MUXCY_L2_27 ( .S(sa[2]),
> .DI(mux_L1_27), .CI(mux_L1_31_copy), .O(mux_L2_27),
> .LO(mux_L2_27_copy)); generate for(i=8; i<=27;
> i=i+1) begin:
> mux_L2_8_27_loop
> MUXCY_D MUXCY_L2 ( .S(sa[2]),
> .DI(mux_L1[i]), .CI(mux_L1_copy[i+4]),
> .O(mux_L2[i]),
> .LO(mux_L2_copy[i]));
> end
> endgenerate
>
> MUXCY_D MUXCY_L2_28 ( .S(sa[2]), .DI(mux_L1[28]),
> .CI(1'b0),
> .O(mux_L2[28]), .LO(mux_L2_copy[28])); MUXCY_D
> MUXCY_L2_29 ( .S(sa[2]),
> .DI(mux_L1[29]), .CI(1'b0), .O(mux_L2[29]),
> .LO(mux_L2_copy[29]));
> MUXCY_D MUXCY_L2_30 ( .S(sa[2]), .DI(mux_L1[30]),
> .CI(1'b0),
> .O(mux_L2[30]), .LO(mux_L2_copy[30])); MUXCY_D
> MUXCY_L2_31 ( .S(sa[2]),
> .DI(mux_L1[31]), .CI(1'b0), .O(mux_L2[31]),
> .LO(mux_L2_copy[31]));
>
> // LEVEL 3
> //
>
=== message truncated ===


RE: Re: Help - Shifter using MUXCYs - Jeffery, Robert - May 9 12:28:00 2005

Hi Lucian.

Having taken a look at the code below, it's all got badly wrapped. Could
you try a resend?

Cheers.

Robert.

-----Original Message-----
From: fpga-cpu@fpga... [mailto:fpga-cpu@fpga...] On
Behalf Of Lucian Damoc
Sent: 09 May 2005 15:12
To: fpga-cpu@fpga...
Subject: RE: [fpga-cpu] Re: Help - Shifter using MUXCYs

Hello,

Yes... I forgot about some "details":

1) Technology: Xilinx Spartan IIe
2) Part: XC2S300E
3) Speed: -6
4) Package: FT256

What I don't understand is this:
I've looked at the slice & CLB detailed view in the VirtexE datasheet
(SpartanIIe is a derivative of VirtexE) and I don't think the net-delay
between the O output of a MUXCY and the DI input of the next MUXCY can
be 0 ns.

--- "Jeffery, Robert" <robert_jeffery@robe...> wrote:

> Hi Lucian.
>
> What are the device details, part and speedgrade?
>
> Cheers.
>
> Robert.
>
> -----Original Message-----
> From: fpga-cpu@fpga...
> [mailto:fpga-cpu@fpga...] On
> Behalf Of Lucian Damoc
> Sent: 09 May 2005 10:03
> To: fpga-cpu@fpga...
> Subject: [fpga-cpu] Re: Help - Shifter using MUXCYs
>
> Hello again,
>
> Sorry if the attachment didn't come through... I attached an archive
> (to that first email) with these files:
> >> .v (design),
> >> .syr (XST - timing after synthesis)
> >> .srr (Synplify - timing after synthesis)
>
> ... but it seems the archive didn't make it.
>
> ======
>
> I'll list the code (from the .v file) here:
>
> //====================================================
> // 32-bit shifter - can shift RIGHT or LEFT
> //
> //
> // if(left_right == 1'b0) shift_right
> // else shift_left
> //
> //====================================================
>
> `timescale 1ns / 10ps
>
> //===================================================
>
> module SHIFTER(left_right, di, sa, do);
>
> input left_right;
> input [31:0] di;
> input [4:0] sa; // shift amount
>
> output [31:0] do;
>
> // input 2:1 mux
> // 0-7
> wire [31:0] di_core;
> wire [31:0] do_core;
>
> // level 0 wires
> wire [31:0] mux_L0;
> wire [31:2] mux_L0_copy;
> // level 1 wires
> wire [31:0] mux_L1;
> wire [31:4] mux_L1_copy;
> // level 2 wires
> wire [31:0] mux_L2;
> wire [31:8] mux_L2_copy;
> // level 3 wires
> wire [31:0] mux_L3;
> wire [31:16] mux_L3_copy;
>
> // level 4 wires
> //wire [31:0] mux_L4;
> //wire [31:2] mux_L4_copy;
>
> // MUXCY_D MUXCY_D_instance_name ( .LO (user_LO), .O (user_O), .CI
> (user_CI), .DI (user_DI), .S (user_S)); // SOMETHING
> LIKE: MUX2_1
> mux_name( .S(sel) .DI(din0), .CI(din1), .O(dout),
> .LO(dout_copy) )
>
> // MUXCY MUXCY_instance_name (.O (user_O), .CI (user_CI), .DI
> (user_DI), .S (user_S)); // SOMETHING LIKE: MUX2_1 mux_name(
> .S(sel) .DI(din0),
> .CI(din1), .O(dout) )
>
> // input mux: if( left_right == 1'b1) invert di[31:0]; // output
> mux:
> if( left_right == 1'b1) invert do[31:0]; genvar i, j, k; generate
> for(i=0; i<=31; i=i+1) begin: mux_in_out_loop
> MUXCY MUXCY_IN ( .S(left_right),
> .DI(di[i]), .CI(di[31-i]),
> .O(di_core[i]));
> MUXCY MUXCY_OUT ( .S(left_right), .DI(do_core[i]),
> .CI(do_core[31-i]), .O(do[i]));
> end
> endgenerate
>
> // LEVEL 0
> MUXCY MUXCY_L0_00 ( .S(sa[0]), .DI(di_core[0]),
> .CI(di_core[1]), .O(mux_L0[0]));
> MUXCY MUXCY_L0_01 ( .S(sa[0]), .DI(di_core[1]),
> .CI(di_core[2]), .O(mux_L0[1]));
>
> //MUXCY_D MUXCY_L0_02 ( .S(sa[0]), .DI(di_core[2]), .CI(di_core[3]),
> .O(mux_L0[2]), .LO(mux_L0_copy[2])); generate for(i=2; i<=30; i=i+1)
> begin: mux_L0_loop
> MUXCY_D MUXCY_L0 ( .S(sa[0]), .DI(di_core[i]),
> .CI(di_core[i+1]), .O(mux_L0[i]), .LO(mux_L0_copy[i]));
> end
> endgenerate
> MUXCY_D MUXCY_L0_31 ( .S(sa[0]), .DI(di_core[31]),
> .CI(1'b0) , .O(mux_L0[31]), .LO(mux_L0_copy[31]));
>
> // LEVEL 1
> MUXCY MUXCY_L1_00 ( .S(sa[1]), .DI(mux_L0[0]),
> .CI(mux_L0_copy[2]), .O(mux_L1[0]));
> MUXCY MUXCY_L1_01 ( .S(sa[1]), .DI(mux_L0[1]),
> .CI(mux_L0_copy[3]), .O(mux_L1[1]));
> MUXCY MUXCY_L1_02 ( .S(sa[1]), .DI(mux_L0[2]),
> .CI(mux_L0_copy[4]), .O(mux_L1[2]));
> MUXCY MUXCY_L1_03 ( .S(sa[1]), .DI(mux_L0[3]),
> .CI(mux_L0_copy[5]), .O(mux_L1[3]));
> //
> //MUXCY_D MUXCY_L1_04 ( .S(sa[1]), .DI(mux_L0[4]),
> .CI(mux_L0_copy[6]), .O(mux_L1[4]), .LO(mux_L1_copy[4])); //MUXCY_D
> MUXCY_L1_29 ( .S(sa[1]),
> .DI(mux_L0[29]), .CI(mux_L0_copy[31]), .O(mux_L1[29]),
> .LO(mux_L1_copy[29])); generate for(i=4; i<=29;
> i=i+1) begin:
> mux_L1_loop
> MUXCY_D MUXCY_L1 ( .S(sa[1]), .DI(mux_L0[i]),
> .CI(mux_L0_copy[i+2]), .O(mux_L1[i]), .LO(mux_L1_copy[i]));
> end
> endgenerate
> MUXCY_D MUXCY_L1_30 ( .S(sa[1]), .DI(mux_L0[30]), .CI(1'b0),
> .O(mux_L1[30]), .LO(mux_L1_copy[30])); MUXCY_D
> MUXCY_L1_31 ( .S(sa[1]),
> .DI(mux_L0[31]), .CI(1'b0), .O(mux_L1[31]), .LO(mux_L1_copy[31]));
>
> // LEVEL 2
> //
> //MUXCY MUXCY_L2_00 ( .S(sa[2]), .DI(mux_L1_00),
> .CI(mux_L1_04_copy), .O(mux_L2_00));
> //MUXCY MUXCY_L2_07 ( .S(sa[2]), .DI(mux_L1_07),
> .CI(mux_L1_11_copy), .O(mux_L2_07));
> generate for(i=0; i<=7; i=i+1) begin:
> mux_L2_0_7_loop
> MUXCY MUXCY_L2 ( .S(sa[2]),
> .DI(mux_L1[i]),
> .CI(mux_L1_copy[i+4]), .O(mux_L2[i]));
> end
> endgenerate
>
> //MUXCY_D MUXCY_L2_08 ( .S(sa[2]), .DI(mux_L1_08),
> .CI(mux_L1_12_copy), .O(mux_L2_08), .LO(mux_L2_08_copy)); //MUXCY_D
> MUXCY_L2_27 ( .S(sa[2]),
> .DI(mux_L1_27), .CI(mux_L1_31_copy), .O(mux_L2_27),
> .LO(mux_L2_27_copy)); generate for(i=8; i<=27;
> i=i+1) begin:
> mux_L2_8_27_loop
> MUXCY_D MUXCY_L2 ( .S(sa[2]), .DI(mux_L1[i]),
> .CI(mux_L1_copy[i+4]), .O(mux_L2[i]), .LO(mux_L2_copy[i]));
> end
> endgenerate
>
> MUXCY_D MUXCY_L2_28 ( .S(sa[2]), .DI(mux_L1[28]), .CI(1'b0),
> .O(mux_L2[28]), .LO(mux_L2_copy[28])); MUXCY_D
> MUXCY_L2_29 ( .S(sa[2]),
> .DI(mux_L1[29]), .CI(1'b0), .O(mux_L2[29]), .LO(mux_L2_copy[29]));
> MUXCY_D MUXCY_L2_30 ( .S(sa[2]), .DI(mux_L1[30]), .CI(1'b0),
> .O(mux_L2[30]), .LO(mux_L2_copy[30])); MUXCY_D
> MUXCY_L2_31 ( .S(sa[2]),
> .DI(mux_L1[31]), .CI(1'b0), .O(mux_L2[31]), .LO(mux_L2_copy[31]));
>
> // LEVEL 3
> //
>
=== message truncated ===


Re: Help - Shifter using MUXCYs - Rick Collins - May 9 12:31:00 2005

--- In fpga-cpu@fpga..., "Jeffery, Robert" <robert_jeffery@m...> wrote:
> Hi Lucian.
>
> Having taken a look at the code below, it's all got badly wrapped.
> Could you try a resend?
>
> Cheers.
>
> Robert.
>
> -----Original Message-----
> From: fpga-cpu@fpga... [mailto:fpga-cpu@fpga...] On
> Behalf Of Lucian Damoc
> Sent: 09 May 2005 15:12
> To: fpga-cpu@fpga...
> Subject: RE: [fpga-cpu] Re: Help - Shifter using MUXCYs
>
> Hello,
>
> Yes... I forgot about some "details":
>
> 1) Technology: Xilinx Spartan IIe
> 2) Part: XC2S300E
> 3) Speed: -6
> 4) Package: FT256
>
> What I don't understand is this:
> I've looked at the slice & CLB detailed view in the VirtexE datasheet
> (SpartanIIe is a derivative of VirtexE) and I don't think the net-delay
> between the O output of a MUXCY and the DI input of the next MUXCY can
> be 0 ns.

It can if the delay is included in the other two segments. I believe
this path is either used or not and is not available to any other
routing. So the delay is included in the paths leading into and out
of it. It is sort of moot as to where you put the various delays as
long as they all add up correctly for every usage.
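
To put that bookkeeping in symbols (only a sketch: the names t_in, t_net, t_out and the split factor alpha are illustrative and not taken from any datasheet or timing report), a tool can report the dedicated net as 0 ns and still get the right total, provided the net's real delay is folded into the segments on either side:

```latex
t_{\text{path}}
  = t_{\text{in}} + t_{\text{net}} + t_{\text{out}}
  = \bigl(t_{\text{in}} + \alpha\, t_{\text{net}}\bigr) + 0
    + \bigl(t_{\text{out}} + (1-\alpha)\, t_{\text{net}}\bigr),
\qquad 0 \le \alpha \le 1
```

Whether XST's numbers actually follow this scheme for a MUXCY O-to-DI connection is a separate question; the identity only shows that a 0 ns entry is not by itself a contradiction.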





RE: Re: Help - Shifter using MUXCYs - damocl82 - May 9 13:11:00 2005

Hello,

Yes. I looked at the code... sorry... I was afraid it would happen.

This time it should be ok. I've arranged the code a little:

//====================================================
// 32-bit shifter - can shift left & right
//
// if(left_right == 1'b0) shift_right
// else shift_left
//
//====================================================

`timescale 1ns / 10ps

//===================================================

module SHIFTER(left_right, di, sa, do);

input left_right;
input [31:0] di;
input [4:0] sa; // shift amount

output [31:0] do;

// input 2:1 mux
// 0-7
wire [31:0] di_core;
wire [31:0] do_core;

// level 0 wires
wire [31:0] mux_L0;
wire [31:2] mux_L0_copy;
// level 1 wires
wire [31:0] mux_L1;
wire [31:4] mux_L1_copy;
// level 2 wires
wire [31:0] mux_L2;
wire [31:8] mux_L2_copy;
// level 3 wires
wire [31:0] mux_L3;
wire [31:16] mux_L3_copy;

// level 4 wires
//wire [31:0] mux_L4;
//wire [31:2] mux_L4_copy;

// MUXCY_D i0(.LO(user_LO),.O(user_O),.CI(user_CI),.DI(user_DI),.S(user_S));
// MUX2_1 i0(.S(sel),.DI(din0),.CI(din1),.O(dout),.LO(dout_copy));

// MUXCY i1(.O(user_O),.CI(user_CI),.DI(user_DI),.S(user_S));
// MUX2_1 mux_name(.S(sel),.DI(din0),.CI(din1),.O(dout));
// (both behave like a 2:1 mux: O = S ? CI : DI)

// input mux: if( left_right == 1'b1) reverse the bit order of di[31:0];
// output mux: if( left_right == 1'b1) reverse the bit order of do[31:0];
genvar i, j, k;
generate for(i=0; i<=31; i=i+1) begin: mux_in_out_loop
MUXCY
MUX_IN(.S(left_right),.DI(di[i]),.CI(di[31-i]),.O(di_core[i]));
MUXCY
MUX_OUT(.S(left_right),.DI(do_core[i]),.CI(do_core[31-i]),.O(do[i]));
end
endgenerate

// LEVEL 0
MUXCY
MUXCY_L0_00(.S(sa[0]),.DI(di_core[0]),.CI(di_core[1]),.O(mux_L0[0]));
MUXCY
MUXCY_L0_01(.S(sa[0]),.DI(di_core[1]),.CI(di_core[2]),.O(mux_L0[1]));

generate for(i=2; i<=30; i=i+1) begin: mux_L0_loop
MUXCY_D
MUXCY_L0(.S(sa[0]),.DI(di_core[i]),.CI(di_core[i+1]),.O(mux_L0[i]),
.LO(mux_L0_copy[i]));
end
endgenerate

MUXCY_D
MUXCY_L0_31 (.S(sa[0]),.DI(di_core[31]),.CI(1'b0),.O(mux_L0[31]),
.LO(mux_L0_copy[31]));

// LEVEL 1
MUXCY
MUXCY_L1_00(.S(sa[1]),.DI(mux_L0[0]),.CI(mux_L0_copy[2]),.O(mux_L1[0]));
MUXCY
MUXCY_L1_01(.S(sa[1]),.DI(mux_L0[1]),.CI(mux_L0_copy[3]),.O(mux_L1[1]));
MUXCY
MUXCY_L1_02(.S(sa[1]),.DI(mux_L0[2]),.CI(mux_L0_copy[4]),.O(mux_L1[2]));
MUXCY
MUXCY_L1_03(.S(sa[1]),.DI(mux_L0[3]),.CI(mux_L0_copy[5]),.O(mux_L1[3]));
//
//MUXCY_D
//MUXCY_L1_04(.S(sa[1]),.DI(mux_L0[4]),.CI(mux_L0_copy[6]),.O(mux_L1[4]),.LO(mux_L1_copy[4]));
//MUXCY_D
//MUXCY_L1_29(.S(sa[1]),
//.DI(mux_L0[29]),.CI(mux_L0_copy[31]),.O(mux_L1[29]),.LO(mux_L1_copy[29]));

generate for(i=4; i<=29; i=i+1) begin: mux_L1_loop
MUXCY_D
MUXCY_L1(.S(sa[1]),.DI(mux_L0[i]),
.CI(mux_L0_copy[i+2]),.O(mux_L1[i]),.LO(mux_L1_copy[i]));
end
endgenerate

MUXCY_D
MUXCY_L1_30(.S(sa[1]),.DI(mux_L0[30]),.CI(1'b0),.O(mux_L1[30]),.LO(mux_L1_copy[30]));
MUXCY_D
MUXCY_L1_31(.S(sa[1]),.DI(mux_L0[31]),.CI(1'b0),.O(mux_L1[31]),.LO(mux_L1_copy[31]));

// LEVEL 2
//
//MUXCY
//MUXCY_L2_00(.S(sa[2]),.DI(mux_L1_00),.CI(mux_L1_04_copy),.O(mux_L2_00));
//MUXCY
//MUXCY_L2_07(.S(sa[2]),.DI(mux_L1_07),.CI(mux_L1_11_copy),.O(mux_L2_07));
generate for(i=0; i<=7; i=i+1) begin: mux_L2_0_7_loop
MUXCY
MUXCY_L2(.S(sa[2]),.DI(mux_L1[i]),.CI(mux_L1_copy[i+4]),.O(mux_L2[i]));
end
endgenerate

//MUXCY_D
//MUXCY_L2_08(.S(sa[2]),.DI(mux_L1_08),.CI(mux_L1_12_copy),.O(mux_L2_08),.LO(mux_L2_08_copy));
//MUXCY_D
//MUXCY_L2_27(.S(sa[2]),.DI(mux_L1_27),.CI(mux_L1_31_copy),.O(mux_L2_27),.LO(mux_L2_27_copy));
generate for(i=8; i<=27; i=i+1) begin: mux_L2_8_27_loop
MUXCY_D
MUXCY_L2(.S(sa[2]),.DI(mux_L1[i]),.CI(mux_L1_copy[i+4]),.O(mux_L2[i]),.LO(mux_L2_copy[i]));
end
endgenerate

MUXCY_D
MUXCY_L2_28(.S(sa[2]),.DI(mux_L1[28]),.CI(1'b0),.O(mux_L2[28]),.LO(mux_L2_copy[28]));
MUXCY_D
MUXCY_L2_29(.S(sa[2]),.DI(mux_L1[29]),.CI(1'b0),.O(mux_L2[29]),.LO(mux_L2_copy[29]));
MUXCY_D
MUXCY_L2_30(.S(sa[2]),.DI(mux_L1[30]),.CI(1'b0),.O(mux_L2[30]),.LO(mux_L2_copy[30]));
MUXCY_D
MUXCY_L2_31(.S(sa[2]),.DI(mux_L1[31]),.CI(1'b0),.O(mux_L2[31]),.LO(mux_L2_copy[31]));

// LEVEL 3
//
//MUXCY
//MUXCY_L3_00(.S(sa[3]),.DI(mux_L2_00),.CI(mux_L2_08_copy),.O(mux_L3_00));
//MUXCY
//MUXCY_L3_15(.S(sa[3]),.DI(mux_L2_15),.CI(mux_L2_23_copy),.O(mux_L3_15));
generate for(i=0; i<=15; i=i+1) begin: mux_L3_0_15_loop
MUXCY
MUXCY_L3(.S(sa[3]),.DI(mux_L2[i]),.CI(mux_L2_copy[i+8]),.O(mux_L3[i]));
end
endgenerate
//
//MUXCY_D
//MUXCY_L3_16(.S(sa[3]),.DI(mux_L2_16),.CI(mux_L2_24_copy),.O(mux_L3_16),.LO(mux_L3_16_copy));
//MUXCY_D
//MUXCY_L3_23(.S(sa[3]),.DI(mux_L2_23),.CI(mux_L2_31_copy),.O(mux_L3_23),.LO(mux_L3_23_copy));
generate for(i=16; i<=23; i=i+1) begin: mux_L3_16_23_loop
MUXCY_D
MUXCY_L3(.S(sa[3]),.DI(mux_L2[i]),.CI(mux_L2_copy[i+8]),.O(mux_L3[i]),.LO(mux_L3_copy[i]));
end
endgenerate
//
//MUXCY_D
//MUXCY_L3_24(.S(sa[3]),.DI(mux_L2_24),.CI(1'b0),.O(mux_L3_24),.LO(mux_L3_24_copy));
//MUXCY_D
//MUXCY_L3_31(.S(sa[3]),.DI(mux_L2_31),.CI(1'b0),.O(mux_L3_31),.LO(mux_L3_31_copy));
generate for(i=24; i<=31; i=i+1) begin: mux_L3_24_31_loop
MUXCY_D
MUXCY_L3(.S(sa[3]),.DI(mux_L2[i]),.CI(1'b0),.O(mux_L3[i]),.LO(mux_L3_copy[i]));
end
endgenerate

// LEVEL 4
//
//MUXCY
//MUXCY_L4_00(.S(sa[4]),.DI(mux_L3_00),.CI(mux_L3_16_copy),.O(do_core[0]));

//MUXCY
//MUXCY_L4_15(.S(sa[4]),.DI(mux_L3_15),.CI(mux_L3_31_copy),.O(do_core[15]))
generate for(i=0; i<=15; i=i+1) begin: mux_L4_0_15_loop
MUXCY
MUXCY_L4(.S(sa[4]),.DI(mux_L3[i]),.CI(mux_L3_copy[i+16]),.O(do_core[i]));
end
endgenerate

//
//MUXCY MUXCY_L4_16 ( .S(sa[4]),.DI(mux_L3_16),.CI(1'b0),.O(do[16]));
//MUXCY MUXCY_L4_31 ( .S(sa[4]),.DI(mux_L3_31), .CI(1'b0),.O(do[31]));
generate for(i=16; i<=31; i=i+1) begin: mux_L4_16_31_loop
MUXCY MUXCY_L4(.S(sa[4]),.DI(mux_L3[i]), .CI(1'b0), .O(do_core[i]));
end
endgenerate

endmodule
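
Before comparing the XST and Synplify timing numbers, it may be worth confirming in simulation that the MUXCY network really behaves as a logical shifter. The following is only a minimal sketch, not part of the original design: SHIFTER_REF, SHIFTER_TB, the instance names dut and gold, and the task check are hypothetical names introduced here. It assumes the SHIFTER module above is compiled together with simulation models for MUXCY and MUXCY_D (for example from the Xilinx UNISIM library), and that the simulator runs in Verilog-2001 mode, since `do` is a reserved word in SystemVerilog.

```verilog
`timescale 1ns / 10ps

// Behavioural reference with the same ports as SHIFTER (logical shifts only).
// Hypothetical module, introduced here purely for comparison.
module SHIFTER_REF(left_right, di, sa, do);
  input         left_right;  // 0 = shift right, 1 = shift left
  input  [31:0] di;
  input  [4:0]  sa;          // shift amount
  output [31:0] do;

  assign do = left_right ? (di << sa) : (di >> sa);
endmodule

// Hypothetical testbench: sweeps both directions and all 32 shift amounts.
module SHIFTER_TB;
  reg         left_right;
  reg  [31:0] di;
  reg  [4:0]  sa;
  wire [31:0] do_dut, do_ref;
  integer     n, errors;

  // SHIFTER is the structural MUXCY design from this post.
  SHIFTER     dut  (.left_right(left_right), .di(di), .sa(sa), .do(do_dut));
  SHIFTER_REF gold (.left_right(left_right), .di(di), .sa(sa), .do(do_ref));

  // Apply one input word, let the combinational logic settle, then compare.
  task check;
    input [31:0] value;
    begin
      di = value;
      #10;
      if (do_dut !== do_ref) begin
        errors = errors + 1;
        $display("MISMATCH lr=%b sa=%0d di=%h dut=%h ref=%h",
                 left_right, sa, di, do_dut, do_ref);
      end
    end
  endtask

  initial begin
    errors = 0;
    for (n = 0; n < 64; n = n + 1) begin
      left_right = n[5];       // upper bit of the counter picks the direction
      sa         = n[4:0];     // lower five bits are the shift amount
      check(32'h8000_0001);    // a set bit at each end of the word
      check(32'hFFFF_FFFF);
      check($random);
    end
    $display("done, %0d mismatches", errors);
    $finish;
  end
endmodule
```

A functional pass says nothing about the delay question; it only rules out a wiring mistake in the generate loops as a source of confusion.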

==============

That's all. Bye.



