This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

RE: compiler ports - Jan Gray - Jul 24 9:49:00 2004

> it is not absolutely necessary to port GCC; any C compiler will do.
> In my ECO32 project I used LCC, which is very small compared to GCC,
> is well documented (Fraser & Hanson book), and uses a back-end generator.
> Included are back-end descriptions for a handful of architectures.
> Porting the compiler to ECO32 (a 32-bit RISC machine) was done in
> about 2 weeks.
>
> Regards,
> Hellwig

Yes, see also http://www.fpgacpu.org/usenet/lcc.html. By choosing LCC you
work with a splendid compiler textbook, and so as a side effect you will
learn much about compiler implementation.

You'll also need an assembler, linker and a runtime library. I wrote a
simple homemade assembler in C. In the past I have built assemblers using
awk. These days you might be happier implementing the assembler in Python
or Perl. I also cut corners and used my assembler as my linker. The outputs
of separately compiled modules were .o's consisting not of binary
instructions with relocation records, but of assembly with appropriate
.global and .extern directives. Then at "link time" I simply concatenated
together the .o's and assembled them. You just need to provide assembly
symbol table functionality so that duplicate local symbol names do not cause
errors. For runtime libraries I just implemented the functions I needed as
I went along.
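
A minimal sketch of the concatenate-then-assemble "linker" described above, in Python. The label syntax (`name:`), the `;` comment character and the `M<n>_` prefix are assumptions made for illustration. It also swaps in a different technique for the local-symbol problem: instead of teaching the assembler's symbol table about module scope, it renames each module's non-.global labels textually before concatenation.

```python
import re

def link(modules):
    """Concatenate assembly-text "object files" into one assembly source.

    modules is a list of (name, text) pairs. Labels named in a .global
    directive keep their names. Any other label defined in a module is
    treated as local and gets a per-module prefix, so two modules can both
    define e.g. "loop" without a duplicate-symbol error.
    """
    out = []
    for index, (name, text) in enumerate(modules):
        lines = text.splitlines()
        exported, defined = set(), set()
        for line in lines:
            m = re.match(r"\s*\.global\s+(\w+)", line)
            if m:
                exported.add(m.group(1))
            m = re.match(r"\s*(\w+):", line)
            if m:
                defined.add(m.group(1))
        local = defined - exported

        def rename(match):
            word = match.group(0)
            return f"M{index}_{word}" if word in local else word

        out.append(f"; module {name}")  # hypothetical comment syntax
        for line in lines:
            if re.match(r"\s*\.(global|extern)\b", line):
                continue  # only needed to decide what is local
            out.append(re.sub(r"\b\w+\b", rename, line))
    return "\n".join(out)
```

The result is a single assembly file that can be handed to the ordinary assembler; references to .extern symbols are left untouched and resolve against the .global label of whichever module defines them.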

I think this approach is perfectly acceptable, particularly if you are then
going to run a higher level runtime environment like Scheme or Squeak on top
of your minimal C runtime.

In contrast, the considerable extra effort/investment of a
binutils/GCC/GAS/libraries/GDB port will pay off if you have a lot of
off-the-shelf C/C++ software that you want to reuse. Even there one might argue
that first doing a quick LCC port while you tune up your instruction set
architecture will save time in the long run.

Jan Gray, Gray Research LLC





(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )


Re: RE: compiler ports - Simon Gornall - Jul 25 11:16:00 2004

Hi Jan,

Well, thought I'd let you know that I've just finished a port of the
gr0040 series to a 32-bit architecture (targeted at Spartan-3) with a
few minor twists along the way. I'm just looking at the final
verification of the interrupt-driven timer as it scrolls up the screen,
and indeed every 64 (or so) clocks, it jumps to the interrupt handler,
resets the count and jumps back to the program :-)

I too wrote an assembler, though I did it in Java rather than a
scripting language - the StreamTokenizer does nice things like give you
a block-comment parser for free just like 'C'/C++ :-) Compiling it to
an executable with 'gcj' makes it a nice fast tool to run, and it took
about a day to write (Saturday, in fact :-)

Looking at the licence for xr16, it's not clear I'm allowed to host a
web-page saying 'here it is, warts and all', so I guess you're all
going to have to build your own [grin] - not that you wouldn't in the
first place, excuse me, I'm just still full of euphoria from actually
getting the thing to simulate verifiably correctly (as in: all the
instructions do what they're supposed to :-))

I modified the interrupt handlers slightly. I have a somewhat
embarrassingly large number of registers (512 in fact, 64 visible at
any one time, split as 32 permanent, 32 banked), so using one for the
interrupt-return wasn't a problem, and since I don't have "mov rdst,
rsrc" (it uses "add rdst, rsrc, r0, 0" instead) it was causing issues
when r0 wasn't 0 :-)

I moved to a 4-operand format (dst, a, b, constant) for the ALU ops, mainly
because I could - there are lots of bits in a 32-bit instruction :-)
Loads and stores are the same format though, "ldl Rdst, imm(Rsrc)",
with everything extended to 32-bits (using the same sort of ideas with
duplication of data on writes and zero-extension on read for smaller
bit-widths).
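
One way to read "duplication of data on writes and zero-extension on read" is sketched below in Python. It is a guess at the scheme, not the actual Verilog: the function names, little-endian byte-lane numbering and naturally aligned accesses are all assumptions. A narrow store copies its data into every lane and lets per-byte write enables pick the lane; a narrow load selects the lane and clears the upper bits.

```python
def store_lanes(addr, value, size):
    """Return (data_word, byte_enables) for a 1-, 2- or 4-byte store."""
    if size == 1:
        data = (value & 0xFF) * 0x01010101      # byte copied to all 4 lanes
        enables = 0b0001 << (addr & 3)
    elif size == 2:
        data = (value & 0xFFFF) * 0x00010001    # halfword copied to both halves
        enables = 0b0011 << (addr & 2)
    else:
        data = value & 0xFFFFFFFF
        enables = 0b1111
    return data, enables

def write_word(old, data, enables):
    """Model a 32-bit memory word with four independent byte write enables."""
    mask = 0
    for lane in range(4):
        if enables & (1 << lane):
            mask |= 0xFF << (8 * lane)
    return (old & ~mask & 0xFFFFFFFF) | (data & mask)

def load_zero_extend(word, addr, size):
    """Select the addressed byte/halfword and zero-extend it to 32 bits."""
    if size == 4:
        return word & 0xFFFFFFFF
    shift = 8 * (addr & (3 if size == 1 else 2))
    return (word >> shift) & (0xFF if size == 1 else 0xFFFF)
```

The appeal of duplicating the store data is that the write path needs no shifter; only the write enables depend on the low address bits.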

Branches now have a 27-bit relative range (25 stored, <<2 for
alignment). The only new instruction at the moment is 'bank', which
selects a second-set-of-32 register bank to use. This defaults to '1'
on reset, but ranges from 1-15. If the high-bit of the register number
is set in any instruction, the core replaces that bit with the current
bank number, giving a 9-bit address for the register location (which is
basically because a blockram is so large :-)
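
The register banking and branch range above can be modelled in a few lines of Python. This is a sketch under stated assumptions: the function names are made up, and the branch field is assumed to be signed and relative to the pc value passed in (whether the real core uses the branch's own address or the next one is not stated).

```python
def physical_register(reg, bank):
    """Map a 6-bit register number to a 9-bit register-file address.

    r0..r31 (high bit clear) are the permanent registers and map to
    addresses 0..31. For r32..r63 the high bit is replaced by the 4-bit
    bank number (1..15), giving addresses 32..511.
    """
    if not 0 <= reg < 64:
        raise ValueError("register number must fit in 6 bits")
    if not 1 <= bank <= 15:
        raise ValueError("bank must be in 1..15")
    if reg & 0x20:
        return (bank << 5) | (reg & 0x1F)
    return reg

def branch_target(pc, field):
    """Apply a 25-bit stored offset, shifted left by 2, to pc.

    Assumes the field is two's-complement; 25 stored bits << 2 gives
    the 27-bit relative range.
    """
    field &= (1 << 25) - 1
    if field & (1 << 24):
        field -= 1 << 25        # sign-extend
    return pc + (field << 2)
```

With this mapping, 32 permanent registers plus 15 banks of 32 account for all 512 entries.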

I am going to add 'ei' and 'di' instructions though to enable/disable
interrupt processing. I want to port a multi-tasking OS onto the chip
when I get a C compiler up and running, and being able to switch off
interrupts when in a critical section will be quite useful :-) It's
rather nice to identify a weakness for what *I* want to do, and be able
to fix it either in 'hardware' or in software :-)

The other point worth mentioning [to bring this slightly back to the
original topic] is that lcc only copes with 32 registers, as I found
out after getting my CPU to work with 64 at a time... Lesson 1: read
the manual. Lesson 2: read it again. So, unless there's a better
candidate out there, it looks like I'll be attempting gcc :-( (or
perhaps :-), who knows :-)

Oh yeah, click-and-go synthesis in WebPack gives me ~56MHz in a
Spartan-3 FG456. Doing some minimal floorplanning (aligning columns,
trying to keep datapaths short) gives a boost to 68-70 MHz (whenever I
get it to ~70, I still tweak it and it drops back to 68 or so :-( I
need to learn more about floorplanning...) It's complicated slightly
because there aren't any tristate buffers any more (I get lots of
warnings :-) which means you need to route the signals. My gut feeling
is that there's a fair bit of congestion in the 'middle' bits.

Also, the design as it stands uses 6 Blockrams (4 for memory, to give
8-bit write-enables, and 2 for dual-issue register files... Well, I've
no other use for them :-) which extends it along the chip a bit...

Anyway, thought you might be interested :-)

Simon






Re: RE: compiler ports - Simon Gornall - Jul 25 14:25:00 2004


On 25 Jul 2004, at 17:16, Simon Gornall wrote:

> I modified the interrupt handlers slightly. I have a somewhat
> embarrassingly large number of registers (512 in fact, 64 visible at
> any one time, split as 32 permanent, 32 banked)

Hmm, I think I've just thought of a way this could be a real advantage
and still use lcc - and I've just been skimming the book, if Jan thinks
this is easy compared to gcc, I don't want to go *near* gcc! The idea
is to do 2 ports :-) The second port only differs from the first in
that it uses r32->r63 whereas the first uses r0->r31.

This means I can write 'system' code that uses r0->r31 and
'application' code that uses r32->r63, giving me 16 (+the kernel) tasks
which can be context-switched using a single instruction (bank N). This
assumes I fold the sp and pc registers into the normal register file
along with the {ccc,ccn,ccv,ccz} vector (at the moment they're
single-instance 'reg's in the verilog code, as in gr0040). That's still
29 general-purpose registers to play with per task, which ought to be
plenty :-) 'System' tasks can interrupt application ones without
conflicting with their registers or state. I think Motorola had
something similar with their 'supervisor mode', which had a separate
set of registers when you were out of 'user mode'.

The only drawback is the limited number of tasks - but I think 16 will
be a good start, I'm not trying to write Linux here :-) I just want to
be able to run more than one task on the cpu, keeping the tasks
separate rather than multiplexing the cpu's actions within a single
program.

Tomorrow's seminar on "How to turn necessity into virtue" will be
brought to you by P.ragmatism. Stay tuned :-)

Simon



