This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

Re: Have NIOS and Microblaze killed off the fpga-cpu - John Kent - Aug 4 12:00:19 2006


Hi Hellwig,

H...@mni.fh-giessen.de wrote:
>
>
> > I've used a single phase clock on my designs. One clock cycle = one
> > instruction cycle.
>
> That's really nice. How do you supply instructions
> and operands from external memory with that rate? Your
> machine has caches? Instructions and data separately

There is no caching on the 6809 core.
The address is set up 35 nsec after the falling E clock edge, based on
the timing spec I got off the Spartan 2e.
Data must be set up 20 nsec before the next falling clock edge on read
cycles.
Data is set up at the same time as the address bus on write cycles.
I use the clock to gate the SRAM Chip Select so that access occurs
when the clock is high. The SRAM has a 15 nsec access time, so with
a 12.5 MHz clock, that gives an 80 nsec cycle time. 80 - 35 - 20 = 25 nsec
left for the RAM access. It's not good practice to use the clock in
combinational logic for the chip select. Webpack issues warnings about
it but it still works.
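
The timing budget above can be checked with a few lines of Python. This is only a sketch using the figures quoted in this post; the function and constant names are made up for illustration, and it ignores the chip select gating and any board or pad delays.

```python
# Timing budget for one CPU cycle, using the figures quoted in the post.
CLOCK_MHZ = 12.5       # CPU (E) clock frequency
ADDR_SETUP_NS = 35     # address valid this long after the falling E edge
DATA_SETUP_NS = 20     # read data must be valid this long before the next falling edge
SRAM_ACCESS_NS = 15    # SRAM access time

def cycle_time_ns(clock_mhz):
    """Period of one clock cycle in nanoseconds."""
    return 1000.0 / clock_mhz

def ram_window_ns(clock_mhz, addr_setup_ns, data_setup_ns):
    """Time left in one cycle for the RAM to respond."""
    return cycle_time_ns(clock_mhz) - addr_setup_ns - data_setup_ns

window = ram_window_ns(CLOCK_MHZ, ADDR_SETUP_NS, DATA_SETUP_NS)
margin = window - SRAM_ACCESS_NS   # slack between the window and the SRAM access time
print(f"cycle {cycle_time_ns(CLOCK_MHZ):.0f} ns, window {window:.0f} ns, margin {margin:.0f} ns")
```

With these numbers the 15 nsec SRAM fits inside the 25 nsec window with 10 nsec to spare.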

I got the address setup times from webpack ISE when I was using an external
clock running at the CPU clock frequency on the B5-X300 board.
As far as I can work out you can only apply timing constraints to
external pins under Webpack ISE. I changed the clock to 4 x the CPU
clock frequency, because I wanted to implement the 6809 design on a
Spartan 3 starter board which only had a 50MHz external oscillator.
It also meant I could use a synchronous 25MHz pixel clock for the VDU.
This made it impossible to specify timing nets relative to the CPU
clock, so I'm not sure what the set up times are for the Spartan 3.

If you go to DDR SDRAM as on the Spartan 3E starter board, it is really
designed for 2 word, 4 word or 8 word burst access. So far as I can see
it becomes very inefficient to use DDR SDRAM on conventional 8-bit designs
that use random memory access. For this reason I really think you need a
cache with DDR SDRAM memory that will buffer up 8 words at a time, or more.
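
As a rough illustration of why random single-word access wastes burst bandwidth, the sketch below computes what fraction of a burst is actually used. The function name is hypothetical, and the model is an assumption: it ignores row activation, precharge and CAS latency, which make the real picture worse.

```python
def burst_utilisation(words_needed, burst_length):
    """Fraction of the words transferred in one burst that the CPU actually uses."""
    if not 1 <= words_needed <= burst_length:
        raise ValueError("words_needed must be between 1 and burst_length")
    return words_needed / burst_length

# A CPU doing random single-word accesses uses one word per burst.
for burst in (2, 4, 8):
    print(burst, burst_utilisation(1, burst))
```

A cache that consumes the whole 8 word burst brings the utilisation back to 1.0 for each line fill.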

I had the idea that I could implement a 16 way associative memory using an
address shift register and an array of identity comparators. If you hit
an address in the associative memory you would then look up the low 8
words in the dual port cache buffer. If there is an associative cache
miss, then the new address is pushed into the associative memory shift
register. A "dirty" flag is kept for words written to the cache buffer.
The address that is pushed out the end of the associative memory shift
register has its 8 word cache buffer written back to SDRAM based on the
"dirty" bit. The 8 words for the new address in the associative memory
are then read out of SDRAM into the dual ported cache buffer.
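
The scheme described above can be modelled in software to check the behaviour before committing it to HDL. This is a minimal sketch, not a hardware design: the class name `ShiftRegisterCache`, the use of a Python list as stand-in for SDRAM, and the hit/miss counters are all assumptions made for illustration.

```python
from collections import deque

LINE_WORDS = 8   # words per cache line, as in the post
WAYS = 16        # entries in the address shift register

class ShiftRegisterCache:
    """Fully associative cache whose tags live in a FIFO 'shift register'."""

    def __init__(self, memory, ways=WAYS):
        self.memory = memory        # list of words standing in for SDRAM
        self.ways = ways
        self.entries = deque()      # newest first; each entry is [tag, data, dirty]
        self.hits = self.misses = self.writebacks = 0

    def _lookup(self, addr):
        tag = addr // LINE_WORDS
        for entry in self.entries:  # models the array of identity comparators
            if entry[0] == tag:
                self.hits += 1
                return entry
        self.misses += 1
        return self._fill(tag)

    def _fill(self, tag):
        if len(self.entries) == self.ways:
            old_tag, data, dirty = self.entries.pop()   # pushed out the end
            if dirty:                                   # write back only if modified
                base = old_tag * LINE_WORDS
                self.memory[base:base + LINE_WORDS] = data
                self.writebacks += 1
        base = tag * LINE_WORDS
        entry = [tag, self.memory[base:base + LINE_WORDS], False]
        self.entries.appendleft(entry)                  # push the new address in
        return entry

    def read(self, addr):
        return self._lookup(addr)[1][addr % LINE_WORDS]

    def write(self, addr, value):
        entry = self._lookup(addr)
        entry[1][addr % LINE_WORDS] = value
        entry[2] = True                                 # mark the line dirty

mem = list(range(64))
cache = ShiftRegisterCache(mem, ways=2)
cache.write(0, 99)   # miss: line 0 is filled and marked dirty
cache.read(8)        # miss: line 1 is filled
cache.read(16)       # miss: line 0 falls out the end and is written back
print(mem[0], cache.misses, cache.writebacks)
```

The small two entry cache in the usage lines is only there to force an eviction quickly; the post's design would use 16 entries.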

Obviously the CPU or whatever is using the cache will be stalled on misses.
If you have a separate instruction cache, then you don't need write back.

My reading in Wikipedia suggests that associative caches are normally used
in primary caches, whereas a secondary cache normally holds the address you
are looking up and the data you are reading, and I assume you index the
cache using the lower address lines. This means that if you address the
same lower order address with a different higher order address, you will
get a cache miss.
Associative memory reduces the size of the cache buffer required because
you can reference the memory over a larger range of addresses.
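
The conflict described above, where two addresses share the same lower order bits, can be shown with a small model of a cache indexed by the lower address lines. The sizes and names here are assumptions chosen for illustration, not figures from any particular design.

```python
LINE_WORDS = 8    # words per line (assumed, matching the 8 word burst)
NUM_LINES = 16    # lines in the directly indexed cache (assumed)

def split_address(addr):
    """Split a word address into (tag, index, offset)."""
    offset = addr % LINE_WORDS
    line = addr // LINE_WORDS
    index = line % NUM_LINES    # lower address bits select the cache slot
    tag = line // NUM_LINES     # higher address bits are stored and compared
    return tag, index, offset

class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES

    def access(self, addr):
        """Return True on a hit, False on a miss (installing the new tag)."""
        tag, index, _ = split_address(addr)
        if self.tags[index] == tag:
            return True
        self.tags[index] = tag
        return False

cache = DirectMappedCache()
a = 0x0010
b = a + LINE_WORDS * NUM_LINES   # same index as a, different tag
results = [cache.access(x) for x in (a, b, a)]
print(results)   # [False, False, False]: a and b keep evicting each other
```

An associative lookup with at least two entries would hit on the third access, because both addresses could be held at once.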

There are certain refinements you can make to associative memory, such as
having a big mux on the shift register to feed back the last cache hit
so that the most frequently used addresses remain at the front of the
associative memory shift register rather than falling out the end and
being written back out to SDRAM.
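
The effect of feeding the last hit back to the front can be compared against the plain shift register with a short trace. This is a sketch under assumptions: the function names and the access trace are invented for illustration, and only tags are modelled, not data or write back. Moving a hit to the front behaves like a least recently used replacement policy.

```python
def access(tags, tag, ways, move_to_front=True):
    """Update the tag list (newest first) for one access; return (hit, evicted_tag)."""
    if tag in tags:
        if move_to_front:
            tags.remove(tag)
            tags.insert(0, tag)   # the 'big mux' feeding the hit back to the front
        return True, None
    tags.insert(0, tag)           # miss: push the new tag into the register
    evicted = tags.pop() if len(tags) > ways else None
    return False, evicted

def count_misses(trace, ways, move_to_front):
    """Run a trace of line tags through the register and count the misses."""
    tags, misses = [], 0
    for tag in trace:
        hit, _ = access(tags, tag, ways, move_to_front)
        misses += not hit
    return misses

trace = [1, 2, 1, 3, 1, 4, 1, 5, 1]   # line 1 is reused between other accesses
print(count_misses(trace, ways=2, move_to_front=False))   # 7
print(count_misses(trace, ways=2, move_to_front=True))    # 5
```

On this trace the plain shift register keeps pushing the reused line out the end, while the feedback version holds on to it.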

Anyway ... I'm a bit of a novice at caches I must admit, so if anyone has
any concerns about what I have written I am happy to be corrected.

>
> > Porting Linux to a new processor I would imagine would be a big job. I
> > bought the LCC book that Jan recommends on his web site, but there is
> > a lot to work through in it.
>
> Well, it is relatively easy to write a back-end for LCC.
> I did this some time ago for my bigger processor, which
> only exists as an instruction set simulator, but not yet
> as an FPGA design (http://homepages.fh-giessen.de/~hg53/eco32).
> I needed only two weeks for the whole job.
>

Sounds like I should take another look then. The book goes into
a great deal of detail on how the compiler is implemented.
If I don't need to worry about that and just skip to the back-end
then perhaps it's not such a big job.

> I do not necessarily think of Linux to be run on my
> machine (this would request using gcc and its utilities,
> and this is really quite a big task). I already had
> the old UNIX V7 running on the instruction set simulator,
> although it was not very stable and, more important, it
> did not use the paging hardware present in the CPU.
> The big advantage is kernel size: under 10000 lines...
>
> Hellwig

Yes, finding all the assembly language dependencies would be a pain too.

Sorry for the long post.

John.

--
http://members.optushome.com.au/jekent
