EmbeddedRelated.com

Writing a simple assembler

Started by Alex March 6, 2006
On 2006-03-08, 42Bastian Schick <bastian42@yahoo.com> wrote:
> On Tue, 07 Mar 2006 20:24:47 -0000, Grant Edwards <grante@visi.com> wrote:
>
> [use Python! it's got built-in hashing and regular expressions]
>
>>> There are somethings you can ONLY do in assembly ... and on an 8 bit
>>> micro (with 64k address space) every byte counts
>>
>> Ah, the assembler is running on an 8-bit processor. I guess I
>> missed that. I assumed the assembler would be running on a
>> Linux or Windows host.
>
> The OP wrote "for an 8 bit" not "on an 8 bit" CPU.

Then why does the target's 64K byte address space preclude the use of a high-level programming language?

-- 
Grant Edwards (grante at visi.com)
Yow! You must be a CUB SCOUT!! Have you made your MONEY-DROP today??
Grant Edwards wrote:
> On 2006-03-08, David Brown <david@westcontrol.removethisbit.com> wrote:
>> CBFalconer wrote:
>>> Isaac Bosompem wrote:
>>> ... snip ...
>>>> I do need an efficient hashing function, with a low probability of
>>>> collisions (ideally). I will need to code to handle collisions.
>>>>
>>>> Anyways I am open to new ideas hopefully you have some of your own
>>>> to add.
>>> You are welcome to use hashlib, which is under GPL licensing, and
>>> was designed to fill just this sort of need. It is written in pure
>>> ISO C.
>>>
>>> You could open one table and stuff it with the opcodes. Another
>>> could hold the macros, and yet another the symbols. The tables
>>> will all use the same re-entrant hash code, yet can have widely
>>> different content.
>>>
>>> <http://cbfalconer.home.att.net/download/hashlib.zip>
>>>
>> It's for this sort of reason that such programs should be written in
>> languages like Perl or Python (or even PHP). Just as C++ is a poor
>> choice of language for an 8051, so C/C++ is a poor choice of language
>> for an assembler running on a PC.
>
> That's what I said, but apparently the assembler is running on
> an 8-bit target with a 64K address space. So Python is out of
> the question. But C++ and generic libraries are going to fit?
>
>> Both Perl and Python will give you much simpler regular
>> expression engines than any C library, vastly easier (and more
>> flexible, and probably faster) hash dictionaries, and a host
>> of ready-to-use libraries.
>
> Yup.
>
No one is writing an assembler to run *on* the target. Someone jumped into this thread with a discussion about running on an 8-bit target (as far as I've noticed, the OP has not mentioned the type of target), and how we should program efficiently in assembler rather than using high level languages and hash tables. As far as I can figure out, he assumed that the thread was about writing in assembly language rather than writing an assembler, and his confusion has spread.
On 2006-03-08, David Brown <david@westcontrol.removethisbit.com> wrote:

>> That's what I said, but apparently the assembler is running on
>> an 8-bit target with a 64K address space. So Python is out of
>> the question. But C++ and generic libraries are going to fit?
>>
>>> Both Perl and Python will give you much simpler regular
>>> expression engines than any C library, vastly easier (and more
>>> flexible, and probably faster) hash dictionaries, and a host
>>> of ready-to-use libraries.
>>
>> Yup.
>
> No one is writing an assembler to run *on* the target. Someone jumped
> into this thread with a discussion about running on an 8-bit target (as
> far as I've noticed, the OP has not mentioned the type of target), and
> how we should program efficiently in assembler rather than using high
> level languages and hash tables. As far as I can figure out, he is
> assumed that the thread was about writing in assembly language rather
> than writing an assembler, and his confusion has spread.

Ah. If that's the case, then I stand by my original suggestion of using a high-level language like Python to write the assembler.

-- 
Grant Edwards (grante at visi.com)
Yow! Did you find a DIGITAL WATCH in YOUR box of VELVEETA??
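To make the "built-in hashing and regular expressions" point concrete, here is a minimal sketch of how a host-side assembler might split a source line with Python's `re` module and keep symbols in a plain `dict`. The line syntax (`label: MNEMONIC operands ; comment`) and the names `LINE_RE` and `parse_line` are assumptions made for illustration; nobody in the thread specified a syntax.

```python
import re

# Assumed line format: optional "label:", optional mnemonic,
# optional comma-separated operands, optional ";" comment.
LINE_RE = re.compile(
    r"^\s*(?:(?P<label>[A-Za-z_]\w*):)?"   # optional label
    r"\s*(?P<mnemonic>[A-Za-z]+)?"         # optional mnemonic
    r"\s*(?P<operands>[^;]*?)"             # operands, up to any comment
    r"\s*(?:;.*)?$"                        # optional comment
)

def parse_line(text):
    """Split one source line into (label, mnemonic, operand list)."""
    match = LINE_RE.match(text)
    if match is None:
        raise ValueError(f"cannot parse line: {text!r}")
    operands = [op.strip()
                for op in match.group("operands").split(",")
                if op.strip()]
    return match.group("label"), match.group("mnemonic"), operands

# The symbol table is just a dict; hashing and collision handling
# are done by the language runtime.
symbols = {}
label, mnemonic, operands = parse_line("loop: LDA #10 ; load counter")
if label is not None:
    symbols[label] = 0x0100  # address chosen arbitrarily for the example
```

Separate dicts for opcodes, macros and symbols would cover the three-table layout suggested earlier in the thread.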
Grant Edwards wrote:
> ... I stand by my original suggestion of
> using a high-level language like Python to write the assembler.
Yes, or Perl, and perhaps using a lexer/parser library. Even C++ with a decent parser generator. The fact that a thread headed "Writing a simple assembler" is dominated by the topic of hashing indicates that something is seriously off the rails somewhere.
> > -- > Grant Edwards
WOAH
the "thought police"!

> Please do not top-post. It loses all context, and makes no sense.
> You don't have to implement any collision detection, it is all done
> for you.

Completely missing the point. The problem is the NEED for collision detection ... and having to re-allocate twice the memory when the table is filled. Having it "hidden" or "done for you" doesn't resolve the issues above.
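For readers who want to see the two costs being argued about, here is a rough sketch of a hand-written hash table that uses linear probing to handle collisions and doubles its storage when it gets half full. It is not the hashlib code mentioned earlier in the thread; the class name `ProbingSymbolTable` and the half-full growth threshold are assumptions chosen to keep the example short.

```python
class ProbingSymbolTable:
    """Open-addressing hash table: linear probing plus doubling growth."""

    def __init__(self, capacity=8):
        self._slots = [None] * capacity   # each slot: None or (name, value)
        self._count = 0

    @staticmethod
    def _probe(slots, name):
        # Collision handling: start at the hashed slot and step forward
        # until we find either the name or an empty slot.
        i = hash(name) % len(slots)
        while slots[i] is not None and slots[i][0] != name:
            i = (i + 1) % len(slots)
        return i

    def _grow(self):
        # Reallocation: double the storage and re-insert every entry,
        # because slot positions depend on the table size.
        old = self._slots
        self._slots = [None] * (2 * len(old))
        for entry in old:
            if entry is not None:
                self._slots[self._probe(self._slots, entry[0])] = entry

    def define(self, name, value):
        # Keep the table at most half full so probing always finds
        # an empty slot and stays short.
        if 2 * (self._count + 1) > len(self._slots):
            self._grow()
        i = self._probe(self._slots, name)
        if self._slots[i] is None:
            self._count += 1
        self._slots[i] = (name, value)

    def lookup(self, name):
        entry = self._slots[self._probe(self._slots, name)]
        return None if entry is None else entry[1]
```

Both costs are real, but each is confined to one small method. On a PC host the doubling step is unlikely to matter for an assembler's symbol table.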
First, I would like to thank everyone for the responses and advice.

Second, the purpose was to write a simple assembler that generates opcodes on a PC, which then run on my IC (an FPGA is used as the host controller). I understand that the task is trivial for gurus, but being a novice at this, the first thing that came to my mind was simply to make two passes: the first, a preprocessor, detects all the variables etc.; the second does the actual translation, recognising mnemonics and generating opcodes. Obviously it is not the "proper" way to do it (grammar descriptions and so on). That's why I was asking for examples and articles on this issue.



-- 
Alex
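The two-pass scheme Alex describes can be sketched in a few dozen lines. The example below is only an illustration under stated assumptions: the four-instruction set in `OPCODES`, the one-byte operands, and the `label:` / `;` comment syntax are all invented here, since the thread never names the target or its instruction set.

```python
# Hypothetical instruction set: mnemonic -> (opcode byte, operand count).
OPCODES = {
    "NOP": (0x00, 0),
    "LDA": (0x01, 1),
    "ADD": (0x02, 1),
    "JMP": (0x03, 1),
}

def strip_comment(line):
    """Drop everything after ';' and surrounding whitespace."""
    return line.split(";", 1)[0].strip()

def split_label(line):
    """Return (label or None, remainder of the line)."""
    if ":" in line:
        label, rest = line.split(":", 1)
        return label.strip(), rest.strip()
    return None, line

def assemble(source):
    lines = [strip_comment(raw) for raw in source.splitlines()]

    # Pass 1: track the current address and record where each label lands.
    symbols = {}
    address = 0
    for line in lines:
        label, rest = split_label(line)
        if label is not None:
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = address
        if rest:
            mnemonic = rest.split()[0].upper()
            if mnemonic not in OPCODES:
                raise ValueError(f"unknown mnemonic: {mnemonic}")
            address += 1 + OPCODES[mnemonic][1]

    # Pass 2: emit bytes, resolving operands through the symbol table.
    output = bytearray()
    for line in lines:
        _, rest = split_label(line)
        if not rest:
            continue
        parts = rest.split()
        opcode, operand_count = OPCODES[parts[0].upper()]
        operands = parts[1:]
        if len(operands) != operand_count:
            raise ValueError(f"wrong operand count: {rest}")
        output.append(opcode)
        for operand in operands:
            # A label if known, otherwise a numeric literal (e.g. 5 or 0x10).
            value = symbols[operand] if operand in symbols else int(operand, 0)
            output.append(value & 0xFF)  # sketch: truncate to one byte
    return bytes(output)

PROGRAM = """
start:  LDA 5        ; load a constant
        ADD 0x10
        JMP end      ; forward reference, known only after pass 1
        NOP
end:    JMP start
"""
machine_code = assemble(PROGRAM)
```

The forward reference to `end` is the reason for two passes: its address is not known when `JMP end` is first seen, so pass 1 only measures instruction sizes and pass 2 does the encoding.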
Grant Edwards wrote:
> My point exactly. Worrying about hash tables for symbol lookup
> reeks of premature optimization for "a simple assembler".

Speaking from experience ... :-)

First, choosing a good algorithm up front is not "premature optimization." It's simply good design.

Second, despite the best intentions (encapsulation and other good software engineering methods), it's often not so simple to just replace one symbol table search routine with another.

Third, "a simple assembler" today may be a complex assembler tomorrow. Better to do it right the first time around, especially as using a hash table lookup isn't a whole lot more complex than a linear search.

I regret the day I said to myself "heck, this is just a prototype, I'll use a simple linear search right now and fix it in the final version." 82 versions later and over 100,000 lines of code, I can attest that this was the second worst design decision I made for my "prototype assembler" (the #1 bad design decision was using Flex and Bison).

Cheers,
Randy Hyde
randyhyde@earthlink.net <randyhyde@earthlink.net> wrote:

> Second, despite the best intentions (encapsulation and other good
> software engineering methods), it's often not so simple to just replace
> one symbol table search routine with another.

In all fairness, a "symbol table search routine" that fails to be easily replaceable by another should immediately be reported to the nearest Committee on Abusive Nomenclature and Fraudulent Assumption of Titles. If it can't be replaced, it's clearly not a search routine, but a hack.

-- 
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.
Hans-Bernhard Broeker wrote:
>> one symbol table search routine with another.
>
> In all fairness, a "symbol table search routine" that fails to be
> easily replacable by another, should immediately be reported to the
> nearest Committee on Abusive Nomenclature and Fraudulent Assumption of
> Titles. If it can't be replaced, it's clearly not a search routine,
> but a hack.

I couldn't agree more. But more often than not, guess what happens. Better to plan ahead, and be realistic.

Cheers,
Randy Hyde
On 2006-03-08, randyhyde@earthlink.net <randyhyde@earthlink.net> wrote:

>> My point exactly. Worrying about hash tables for symbol lookup
>> reeks of premature optimization for "a simple assembler".
>
> Speaking from exprience ... :-)
>
> First, choosing a good algorithm up front is not "premature
> optimization." It's simply good design.
>
> Second, despite the best intentions (encapsulation and other good
> software engineering methods), it's often not so simple to just replace
> one symbol table search routine with another.

Then you did it wrong. The API for a symbol lookup should be dead simple.

-- 
Grant Edwards (grante at visi.com)
Yow! I feel like I'm in a Toilet Bowl with a thumbtack in my forehead!!
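As an illustration of what a "dead simple" lookup API could look like, the sketch below puts a linear-search table and a hash-based table behind the same two methods, so the rest of the assembler never needs to know which one it has. The class and method names (`LinearSymbolTable`, `HashSymbolTable`, `define`, `lookup`) are hypothetical and not taken from any assembler discussed in the thread.

```python
class LinearSymbolTable:
    """Symbols kept in a list; lookup scans the entries in order."""

    def __init__(self):
        self._entries = []

    def define(self, name, value):
        for i, (existing, _) in enumerate(self._entries):
            if existing == name:
                self._entries[i] = (name, value)   # redefine in place
                return
        self._entries.append((name, value))

    def lookup(self, name):
        for existing, value in self._entries:
            if existing == name:
                return value
        return None


class HashSymbolTable:
    """Same interface, backed by Python's built-in dict."""

    def __init__(self):
        self._entries = {}

    def define(self, name, value):
        self._entries[name] = value

    def lookup(self, name):
        return self._entries.get(name)


def resolve_all(table, names):
    """Caller code depends only on define() and lookup()."""
    return [table.lookup(name) for name in names]
```

If every caller goes through `define` and `lookup`, swapping the prototype's linear search for a hash table is a one-line change where the table is constructed. The difficulty Randy describes tends to appear when callers reach into the table's internals instead.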