This list is for discussion of the design and implementation of field-programmable gate array based processors and integrated systems. It is also for discussion and community support of the XSOC Project (see http://www.fpgacpu.org/xsoc).

Emulation of Processor - Anand Gopal Shirahatti - Dec 17 23:57:00 2003

Hi All,

Say I want to build a cycle-accurate model of an existing processor, for example the Intel 386. I have access to all the data sheets and plenty of other information, so I have a very good specification as well as a brief description of the internal design.

How exactly should I go about this project so that I make the best use of all the available information, in a systematic way? Please send me suggestions from your experience and any pointers to information on such projects.

Regards,
Anand






(You need to be a member of fpga-cpu -- send a blank email to fpga-cpu-subscribe@yahoogroups.com )


RE: Emulation of Processor - Author Unknown - Dec 18 1:25:00 2003

When you say emulation, do you mean in an FPGA (as per this list), or
are you thinking about a software emulation?





Re: Emulation of Processor - Tommy Thorn - Dec 18 1:30:00 2003

Anand Gopal Shirahatti brought up an excellent point:
> Say I want to build a cycle accurate model of an
> exisiting processor. Say Intel 386 for example. Now I
> have access to all the data sheets and plenty of
> other information. Now I have, a very good
> specifications as well a brief internal design.
>
> How exactly I go about handling this project, so that
> I make best use of the all the information available
> and in very systematic way. Please send me the list
> of suggestions from u r experience and any pointers
> to information on such projects.

That is a very good question.

The answer obviously depends on the level of seriousness
(commercial, research, hobby) and the degree of
accuracy required (perfect cycle-accurate replica,
bug-faithful implementation, "semantically" faithful,
...).
Perfect cycle-accurate replicas are rarely necessary,
except in cases where programs depend on instruction
timing (not practical for any modern architecture).

For most processor work the pivotal element is the
"golden reference model", which can be anything from a
simple cpu simulator written in C to a formal
description in some machine readable form.

Arriving at this model is the first part of the work.
It can be written based on data sheets, but in general
those only provide a first approximation. For the
full detail there is really no alternative to good old
reverse engineering: writing test programs to resolve
ambiguities and unknowns in the data sheets. One
lazy man's approach is to extract execution traces from
a real processor (e.g. by single-stepping through real
programs and dumping the complete CPU state at each
step). This trace can then be used to verify the
reference simulator at each step. For truly low-level
timing relations, you also have to hook up a logic
analyser.
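
The trace-checking idea above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `ToySimulator`, its three made-up instructions, and the trace layout (one dictionary of register values per executed instruction) are all hypothetical and have nothing to do with a real 386. The point is only the shape of the check: step the reference model once per recorded entry and stop at the first register that disagrees.

```python
from dataclasses import dataclass, asdict


@dataclass
class CpuState:
    pc: int = 0
    a: int = 0
    x: int = 0


class ToySimulator:
    """Hypothetical reference model of a three-instruction toy CPU."""

    def __init__(self, program):
        self.program = program
        self.state = CpuState()

    def step(self):
        """Execute one instruction and return the complete state as a dict."""
        op, arg = self.program[self.state.pc]
        if op == "lda":
            self.state.a = arg & 0xFF
        elif op == "add":
            self.state.a = (self.state.a + arg) & 0xFF
        elif op == "inx":
            self.state.x = (self.state.x + 1) & 0xFF
        else:
            raise ValueError(f"unknown opcode {op!r}")
        self.state.pc += 1
        return asdict(self.state)


def first_divergence(sim, trace):
    """Compare the simulator against a recorded trace, one step at a time.

    Returns (step index, field, recorded value, simulated value) for the
    first mismatch, or None if the whole trace agrees.
    """
    for index, expected in enumerate(trace):
        actual = sim.step()
        for field, want in expected.items():
            if actual[field] != want:
                return (index, field, want, actual[field])
    return None


# Made-up program and trace; a real trace would come from single-stepping
# the actual processor and dumping its registers after every instruction.
PROGRAM = [("lda", 5), ("add", 251), ("inx", None)]
RECORDED_TRACE = [
    {"pc": 1, "a": 5, "x": 0},
    {"pc": 2, "a": 0, "x": 0},
    {"pc": 3, "a": 0, "x": 1},
]

print(first_divergence(ToySimulator(PROGRAM), RECORDED_TRACE))
```

Reporting the first mismatching step, rather than only a pass/fail at the end, is what makes the trace useful: the faulty instruction is the one at that index.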

Often you don't have to go that far, though. After
all, what people care about most is that the new
processor can run the same programs, so getting the
model into a shape where it can run real programs is
very helpful. Note that this generally also
requires some amount of external device simulation for
the more useful programs.

Once you have a model you trust, there are several ways
to proceed. In my hobby projects, I evolve the
simulator in stages to include more and more details
of the hardware implementation and eventually
implement it in Verilog. Each refinement is
co-simulated with one of its predecessors to check
for bugs (trust me, it's *much* easier to find and
correct bugs this way than by trying to debug why
some application wasn't executed correctly [1]).
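
As a rough illustration of co-simulating a refinement against its predecessor, the sketch below runs two models in lockstep and compares their architectural state every time the more detailed one retires an instruction. Everything in it is an assumption made for the example: the two-instruction accumulator machine, the class names, and the deliberately injected `buggy` flag. A real setup would compare a C or Verilog model against the trusted one in the same way.

```python
class FunctionalModel:
    """Trusted predecessor: executes one whole instruction per step."""

    def __init__(self, program):
        self.program, self.pc, self.acc = program, 0, 0

    def step(self):
        op, arg = self.program[self.pc]
        if op == "load":
            self.acc = arg & 0xFF
        elif op == "add":
            self.acc = (self.acc + arg) & 0xFF
        self.pc += 1

    def state(self):
        return (self.pc, self.acc)


class TwoCycleModel:
    """Refinement: a fetch clock followed by an execute clock."""

    def __init__(self, program, buggy=False):
        self.program, self.pc, self.acc = program, 0, 0
        self.latched = None      # instruction register
        self.buggy = buggy       # hypothetical bug: 7-bit adder

    def clock(self):
        """Advance one clock; return True when an instruction retires."""
        if self.latched is None:
            self.latched = self.program[self.pc]
            return False
        op, arg = self.latched
        if op == "load":
            self.acc = arg & 0xFF
        elif op == "add":
            mask = 0x7F if self.buggy else 0xFF
            self.acc = (self.acc + arg) & mask
        self.pc += 1
        self.latched = None
        return True

    def state(self):
        return (self.pc, self.acc)


def cosimulate(ref, dut, n_instructions, max_clocks=8):
    """Return (index, ref state, dut state) at the first mismatch, else None."""
    for index in range(n_instructions):
        ref.step()
        for _ in range(max_clocks):
            if dut.clock():
                break
        else:
            raise RuntimeError(f"refined model hung at instruction {index}")
        if ref.state() != dut.state():
            return (index, ref.state(), dut.state())
    return None


program = [("load", 100), ("add", 100)]
print(cosimulate(FunctionalModel(program), TwoCycleModel(program), 2))
print(cosimulate(FunctionalModel(program), TwoCycleModel(program, buggy=True), 2))
```

Comparing only at retirement keeps the check independent of how many clocks each model spends per instruction, so the same harness can be reused as the refinement gains pipeline or bus detail.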

Anyway, that's my take on it. I look forward to other
opinions.

/Tommy

[1] Alexander Klaiber, Sinclair Chau: Automatic
Detection of Logic Bugs in Hardware Designs, Fourth
International Workshop on Microprocessor Test and
Verification, Common Challenges and Solutions (MTV
2003), May 29-30, 2003, Hyatt Town Lake Hotel, Austin,
Texas, USA. IEEE Computer Society 2003.


Re: RE: Emulation of Processor - Anand Gopal Shirahatti - Dec 18 1:49:00 2003

Of course, on an FPGA!

The final goal is VHDL RTL.






Re: Emulation of Processor - Eric Smith - Dec 18 22:07:00 2003

> Say I want to build a cycle accurate model of an exisiting processor. Say
> Intel 386 for example.
[...]
> How exactly I go about handling this project, so that I make best use of
> the all the information available and in very systematic way. Please send
> me the list of suggestions from u r experience and any pointers to
> information on such projects.

With all due respect, you're in way over your head. The 386 is one of
the most complex scalar processors ever. Start with something simpler.
If it has to be an x86, start with an 8086 or 8088. But you'd be better
off doing an 8-bit processor, or a RISC processor. The stuff you learn
doing that will be essential when you are actually ready to tackle
something more complex.





Re: Emulation of Processor - Tommy Thorn - Dec 19 0:15:00 2003

Encouraging words from Eric Smith:
> With all due respect, you're in way over your head. The 386 is one of
> the most complex scalar processors ever. Start with something simpler.
> If it has to be an x86, start with an 8086 or 8088. But you'd be better
> off doing an 8-bit processor, or a RISC processor. The stuff you learn
> doing that will be essential when you are actually ready to tackle
> something more complex.

I don't even know how to start replying to this, so
let me just enumerate:
0) Do you actually know Ananard? I don't, but I don't
jump to conclusions about other people's resources and
abilities.

1) I personally know a handful of persons who could
each pull off a 386 on their own.

2) The 386 is not that complex. It's just not very
orthogonal and there's a lot of detail to describing
it.

3) There is already a pretty good starting point for a
reference model in BOCHS.

4) One reasonable approach would be to identify the
95% most executed instructions and features used on a
variety of benchmarks. Implement those and trap to an
interpreter (written in that subset) for the rest.
The subset actually used when running, say, Linux is a
lot smaller than the full 386.
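
Point 4 amounts to a coverage calculation over an instruction trace. A minimal sketch, assuming the trace is simply a list of mnemonics with one entry per executed instruction (the counts below are invented for illustration, not measured on any benchmark):

```python
from collections import Counter


def hot_subset(trace, coverage=0.95):
    """Pick the most frequently executed mnemonics until `coverage` is met.

    Returns (mnemonics to implement in hardware, fraction actually covered).
    Everything outside the returned list would trap to the interpreter.
    """
    counts = Counter(trace)
    total = sum(counts.values())
    chosen, covered = [], 0
    for mnemonic, n in counts.most_common():
        if covered >= coverage * total:
            break
        chosen.append(mnemonic)
        covered += n
    return chosen, covered / total


# Invented dynamic instruction counts, for illustration only.
trace = ["mov"] * 60 + ["add"] * 25 + ["jmp"] * 11 + ["aam"] * 3 + ["lsl"] * 1
subset, fraction = hot_subset(trace)
print(subset, fraction)
```

Note that this counts dynamic executions, not distinct opcodes: a rarely executed instruction still has to work, it is just allowed to be slow.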

I think that could be a fun project.

/Tommy

Re: Emulation of Processor - Eric Smith - Dec 19 23:24:00 2003

"Tommy Thorn" <> wrote:
> 0) Do you actually know Ananard? I don't, but I don't
> jump to conclusions about other peoples resources and
> abilities.

I have no idea who Ananard is. I don't know Anand Gopal Shirahatti,
but I do know that it takes more than thirty man-years for a team of
expert microprocessor designers to build a fully 386-compatible core.
I've spoken to engineers who were involved in such projects at two
different companies. If it was as easy as you claim, there would have
been a lot more than just a handful of companies (Intel, AMD, Cyrix, Chips
and Technologies, perhaps I've missed one or two) making 386 processors.

So yes, based on the way his posting was written, I did jump to the
conclusion that Anand doesn't want to invest thirty man-years or more
in the project. I'll freely concede that this assumption could be
incorrect.

> 1) I personally know a handful of persons who could
> each pull off a 386 on their own.

By making this statement, you've just demonstrated that you have no grasp
of the complexity of the 386 yourself. However, feel free to tell us
more about these superstars, and what they've actually accomplished.

> 2) The 386 is not that complex.

False. It is a very complex part (though obviously less complex than
the newer x86 parts). The original implementation used approximately
275,000 transistors. And unlike most modern processors (RISC or CISC),
very few of those are RAM. In a modern processor, more than 90% of the
transistor count is RAM.

Even if you're substantially more clever than the original designers of
the 386 (which I rather doubt), you're probably not going to be able to
shave down the transistor count (or the gate count) by more than 20% and
still maintain full compatibility to the extent that Anand wanted. Nor,
by using more transistors (or gates) than the original, are you going to
reduce the magnitude of the required design effort by more than 20%.

> It's just not very orthogonal and there's a lot of detail to
> describing it.

True.

> 3) There is already a pretty good starting point for a
> reference model in BOCHS.

Having a software 386 simulator is certainly useful as a reference, but
tells you very little about how to write a workable RTL model of a 386.
(Or any other sort of hardware model that can meet the project objective.)

> 4) One reasonable approach would be to identify the
> 95% most executed instructions and features used on a
> variety of benchmarks. Implement those and trap to an
> interpreter (written the that subset) for the rest.

The original project as described was "to build a cycle accurate model of
an exisiting processor. Say Intel 386 for example."

The approach you're proposing, while it might yield useful results, does
not produce the goal defined for the project.

Eric





Re: Emulation of Processor - Rob Finch - Dec 21 0:20:00 2003

> Say I want to build a cycle accurate model of an exisiting
> processor. Say Intel 386 for example. Now I have access to all the
> data sheets and plenty of other information. Now I have, a very good
> specifications as well a brief internal design.
>
> How exactly I go about handling this project, so that I make best
> use of the all the information available and in very systematic way.

I have put together a cycle-accurate implementation of the 6502
processor. Whether or not this is a good approach, this is how I
accomplished the task:

First, ignore the cycle accuracy. I'm of the opinion it's better to
get a working processor first, then make it cycle accurate. But keep
the cycle accuracy in the back of your mind. In other words don't put
together a bit-serial implementation of a byte wide processor, a
micro-code version of a RISC cpu, etc. The required conversion later
will probably only cause problems. My advice is to follow a similar
design pattern to begin with. Once you have a working processor, then
go back and patch it up so it's cycle accurate.

Get to cycle accuracy in stages. First try and make the cpu *faster*
than cycle accurate while keeping it simple at the same time. Once
it's faster than the original, it's easier to go back and add in
additional 'nop' cycles to slow the instructions down so they match
the original timing. Reducing the speed of a design is probably a lot
easier than trying to increase the speed of a design that started off
on the wrong foot with the wrong architecture.

Keep the pipelining of the original in mind. If the original
processor is pipelined so that instructions execute in a single
cycle, then you'll have to duplicate that pipelining in order to get
the single cycle instruction execution.

Tackle cycle accuracy on the instructions that are a) easy to make
cycle accurate, and b) the instructions that are likely to be the
critical ones for cycle accuracy. It might be acceptable for other
less critical instructions to be non-cycle accurate.

Cycle accuracy is mostly marketing hype. It's great to be able to say
the processor is 100% cycle-accurate, but it's not normally a
requirement. Code that depends on cycle accuracy is strongly
discouraged because different versions of a processor (even within
the same generation from the same manufacturer) could potentially
have different timings. With today's complex systems involving
overlapped instruction sequences, cache accesses, interrupts, etc.,
almost no one depends on cycle accuracy because it's an unreliable
approach.

Where cycle accuracy has been used in the past is for simple systems
where clock cycles were counted to determine timing delays. Most of
these delays consist of loops that simply decrement a counter. So
critical instructions to make cycle accurate are probably branch /
loop instructions and decrements / increments.
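
To make the delay-loop point concrete, here is a small worked calculation for the classic 6502 sequence `LDX #n` / `DEX` / `BNE loop`. The default cycle counts are the documented NMOS 6502 timings (LDX immediate 2, DEX 2, BNE 3 when taken without crossing a page, 2 when not taken); the function name and the "clone with a fast branch" scenario are assumptions for illustration.

```python
def delay_loop_cycles(count, ldx_imm=2, dex=2, bne_taken=3, bne_not_taken=2):
    """Cycles for: LDX #count / loop: DEX / BNE loop (branch within a page).

    The branch is taken on every iteration except the last one.
    """
    if not 1 <= count <= 255:
        raise ValueError("count must be in 1..255")
    return ldx_imm + count * dex + (count - 1) * bne_taken + bne_not_taken


def microseconds(cycles, clock_hz):
    """Convert a cycle count to a delay at the given clock frequency."""
    return cycles * 1_000_000 / clock_hz


original = delay_loop_cycles(10)
# Hypothetical clone whose taken branch is one cycle faster than the original.
clone = delay_loop_cycles(10, bne_taken=2)
print(original, microseconds(original, 1_000_000))
print(clone, microseconds(clone, 1_000_000))
```

With the original timings the loop costs 5n + 1 cycles, so a one-cycle error in the taken branch alone shortens a ten-iteration delay from 51 to 42 cycles. That is why branches and decrements are the instructions to get right first.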

For my '02 implementation, in the first pass I had many instructions
that took longer than the original. Once I had the processor
basically working, I looked at how I could streamline the cpu. I
streamlined it to reduce all the instructions to the minimum
number of cycles (once again not trying too hard to keep cycle
accuracy). This was the second iteration of the cpu. At this point I
had all instructions executing in the same or fewer clock cycles than
the original. For the third iteration of the processor, I went back
and added additional 'nop' cycles to extend the instructions out to
the same timings as the original.
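
The third iteration described above is essentially a table subtraction: original cycles minus streamlined cycles gives the number of 'nop' cycles to insert per opcode. A sketch, assuming the documented NMOS 6502 timings for a few opcodes and entirely hypothetical cycle counts for the streamlined core (they are not taken from the implementation discussed here):

```python
# Documented NMOS 6502 cycle counts for a handful of opcodes.
ORIGINAL_CYCLES = {
    0xA9: 2,  # LDA #imm
    0xA5: 3,  # LDA zero page
    0xAD: 4,  # LDA absolute
    0x85: 3,  # STA zero page
    0x20: 6,  # JSR absolute
    0x60: 6,  # RTS
}

# Hypothetical cycle counts for a streamlined second-iteration core.
FAST_CORE_CYCLES = {
    0xA9: 1, 0xA5: 2, 0xAD: 3, 0x85: 2, 0x20: 4, 0x60: 6,
}


def padding_table(original, fast):
    """Return {opcode: idle cycles to add} so that fast + padding == original.

    Padding can only add cycles, so an opcode that is still slower than the
    original is reported as an error instead of being silently ignored.
    """
    pad = {}
    for opcode, target in original.items():
        have = fast[opcode]
        if have > target:
            raise ValueError(
                f"opcode {opcode:#04x} takes {have} cycles, original takes {target}"
            )
        pad[opcode] = target - have
    return pad


for opcode, extra in padding_table(ORIGINAL_CYCLES, FAST_CORE_CYCLES).items():
    print(f"{opcode:#04x}: +{extra}")
```

The error case is the reason for getting the core faster than the original first: a deficit cannot be fixed with padding and needs an architectural change instead.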

Note there are different kinds of cycle accuracy as well. My '02 has
instruction timing accuracy, but not bus-cycle by bus-cycle accuracy
(although it's very close).

Note that obtaining cycle accuracy cost about 10% in clock frequency
and 10% in size: the cycle-accurate version runs at a 10% slower clock
frequency and consumes about 10% more fpga resources. (Cycle accuracy
uses the fpga resources less efficiently than they could otherwise be
used in this case.) I have an option to build the code without cycle
accuracy for better performance and size.

=================================================

I spent about a year getting the '02 basically working. It was more
than another year before I had it cycle accurate. These were not
really man-years, but I spent a lot of time on it on weekends and
evenings. It probably represents many man-months of effort anyway (I
can code and get things working very fast....)

The x86 series is a complex processor. Twice I've started an 8086
clone, but then dropped it after just a few hours. I'd estimate it
to be about three or four times more complex than the '02, meaning it
would probably take me about five years to get a decently working
version (without working on it full time). Something like the 386 is
several times more complex than that, so the other poster's comment
about spending 30 man-years isn't an unreasonable time estimate.
Still, if you like a challenge.....

Implementing an existing processor has a lot of attraction because of
the existing base of software and tools.

Depending on what your goals are...... it might be easier to get
386-comparable performance with a much simpler processor. For instance,
isn't the xr16 20 MIPS?

Rob






Re: Re: Emulation of Processor - Eng How Khoo - Dec 21 22:34:00 2003

Hi Rob,

May I ask for more information on how you implemented the 6502? I am also working on a 6502 emulator ....
I plan to emulate the 6502 in Verilog ....
Let's say I have already obtained 6502 Verilog code from somewhere on the web; how do I test it to find the bugs? The code is said to contain bugs .....
Please advise.
Thanks

EH







Re: Re: Emulation of Processor - Tomasz Sztejka - Dec 22 1:53:00 2003

--- Eng How Khoo wrote:
> Hi Rob,
>
> May i ask for more information on how you implement the 6502? cos i
> am doing an emulator on 6502 also ....
> I plan to emulate the 6502 by Verilog ....
> Let say i alredi obtain the 6502 Verilog code somewhere from the web,
> how do i test it to detect the bug? cos the code is said to contain
> bugs .....
> Please advice
> Thanks

Hi,

About testing Verilog cores... I'm currently working on a bridge from
a Verilog simulator to Java (using VPI). It is at an early stage and
so far supports only an Icarus Verilog to Java link on Linux. It lets
you single-step or run your model, get and set each of its registers
and nets, write a trace log, and set breakpoints. Currently it has no
GUI (only a command-line-like interface). The goal is to provide a set
of GUI components (Java beans) that let users build processor-specific
simulators in a flash, including source-level software debugging.

If you are interested, I can put it together and post it to you within
a week (with all sources).

regards,
Tomasz Sztejka


Re: Re: Emulation of Processor - Eng How Khoo - Dec 22 5:06:00 2003

Hi Tomasz Sztejka,

Can you explain more about the bridge .... what is a bridge? Will the simulator only work in a Linux environment? How about Windows? Most of the computers at my uni run Microsoft products ....

I am very interested, thank you .....
By the way, is there any documentation on how to use the simulator?

regards
EH






Re: Re: Emulation of Processor - Tomasz Sztejka - Dec 22 15:11:00 2003

--- Eng How Khoo wrote:
> Hi Tomasz Sztejka,
>
> Can explain more about the bridge .... what is a bridge? The
> simulator will onli work in Linux environment? How about window? cos
> most of the comp at my uni is running on microsoft product ....
>
> I am very interested, thank you .....
> by the way, is there any documentation on how to use the simulator?

Icarus Verilog (http://icarus.com/eda/verilog/) is a free
Verilog compiler plus simulator for Linux, Windows, etc. (It simulates
the Verilog hardware description language; it does not simulate at the
chip level, so as far as I know it won't calculate timings, propagation
delays, etc.)

Like other Verilog simulators, it supports VPI (the Verilog
Procedural Interface), which allows user-written C programs to interact
with the simulation process. Such a program can do anything; in my case
it is "bridge" software that sends information about simulation
progress and data to another program.

So my program is _not_ a simulator. It contains a C part, which
interacts with the simulator, and a Java part for the UI. The C part is
specific to Icarus Verilog and probably to Linux too: I haven't tried
to compile it with gcc under Windows, since I no longer use Windows at
home, and I haven't tried other Verilog simulators, since I don't have
them.

The second part is the Java side, which will actually implement the GUI
(well, currently only a command line, but I'm working on it). The Java
side is completely portable and OS independent. It lets you type get xxx,
set xxx=yy or something like that. In future the user will have a
regular GUI to work with, and the programmer will have a toolbox of Java
classes to tune it for their own needs. As I mentioned, it is still at an
early stage (not for beginners: it is horrible when you don't know who
is wrong, you or the software you use).
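
The get/set command line described above could look roughly like the following. This is a hypothetical sketch, not the bridge's actual code or protocol: the function name, the reply strings, and the plain dictionary standing in for the simulator's registers and nets are all assumptions. In a real bridge the dictionary reads and writes would be replaced by calls into the simulator through VPI.

```python
def run_command(signals, line):
    """Handle one 'get NAME' or 'set NAME=VALUE' command.

    `signals` is a dict standing in for the simulated registers and nets.
    Returns the reply text that would be shown to the user.
    """
    usage = "error: expected 'get NAME' or 'set NAME=VALUE'"
    parts = line.strip().split(None, 1)
    if len(parts) != 2:
        return usage
    verb, rest = parts
    if verb == "get":
        name = rest.strip()
        if name not in signals:
            return f"error: unknown signal {name}"
        return f"{name}={signals[name]:#x}"
    if verb == "set":
        name, sep, value = rest.partition("=")
        name = name.strip()
        if not sep:
            return usage
        if name not in signals:
            return f"error: unknown signal {name}"
        try:
            # Base 0 accepts decimal, 0x hex and 0b binary literals.
            signals[name] = int(value.strip(), 0)
        except ValueError:
            return f"error: bad value {value.strip()!r}"
        return "ok"
    return usage


signals = {"pc": 0, "acc": 0}  # made-up signal names
for command in ("set pc=0x200", "get pc", "get foo"):
    print(command, "->", run_command(signals, command))
```

Keeping the parser free of any simulator calls means the same function can be exercised without a simulator attached, which helps with the "who is wrong, you or the software" problem mentioned above.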

Anyway, if you would like to put together your own Verilog environment,
I would recommend one of these:
- Use some of the free Xilinx or Altera software. It is huge (>200 MB),
slow and in my opinion buggy (esp. Verilog in Xilinx), but it runs on
Windows and is free and complete. I recommend it to beginners equipped
with powerful PCs.
- Use Icarus Verilog as compiler / simulator. The simulator is
command line: you fire it up, it dumps a log to a file and finishes.
Very fast. No interaction. To view the results in a nice wave format
you can use GTKWave, available in most Linux distributions. To actually
get the data to program a chip, you may be able to enter your design
into Xilinx's web-accessible online system and get it back in compiled
form (I haven't got that far yet). The complete software is <20 MB. I
recommend it for Linux freaks :)

regards,
Tomasz Sztejka