Writing a simple assembler

Reply by Grant Edwards ● March 8, 2006

On 2006-03-08, 42Bastian Schick <bastian42@yahoo.com> wrote:
> On Tue, 07 Mar 2006 20:24:47 -0000, Grant Edwards <grante@visi.com>
> wrote:
> > [use Python! it's got built-in hashing and regular expressions]
>
>>> There are some things you can ONLY do in assembly ... and on an 8 bit
>>> micro (with 64k address space) every byte counts
>>
>>Ah, the assembler is running on an 8-bit processor.  I guess I
>>missed that.  I assumed the assembler would be running on a
>>Linux or Windows host.
>
> The OP wrote "for an 8 bit" not "on an 8 bit" CPU.

Then why does the target's 64K byte address space preclude the
use of a high-level programming language?

-- 
Grant Edwards                   grante             Yow!  You must be a CUB
                                  at               SCOUT!! Have you made your
                               visi.com            MONEY-DROP today??

Reply by David Brown ● March 8, 2006

Grant Edwards wrote:
> On 2006-03-08, David Brown <david@westcontrol.removethisbit.com> wrote:
>> CBFalconer wrote:
>>> Isaac Bosompem wrote:
>>> ... snip ...
>>>> I do need an efficient hashing function, with a low probability of
>>>> collisions (ideally). I will need to code to handle collisions.
>>>>
>>>> Anyways I am open to new ideas hopefully you have some of your own
>>>> to add.
>>> You are welcome to use hashlib, which is under GPL licensing, and
>>> was designed to fill just this sort of need.  It is written in pure
>>> ISO C.
>>>
>>> You could open one table and stuff it with the opcodes.  Another
>>> could hold the macros, and yet another the symbols.  The tables
>>> will all use the same re-entrant hash code, yet can have widely
>>> different content.
>>>
>>>   <http://cbfalconer.home.att.net/download/hashlib.zip>
>>>
>> It's for this sort of reason that such programs should be written in 
>> languages like Perl or Python (or even PHP).  Just as C++ is a poor 
>> choice of language for an 8051, so C/C++ is a poor choice of language 
>> for an assembler running on a PC.
> 
> That's what I said, but apparently the assembler is running on
> an 8-bit target with a 64K address space.  So Python is out of
> the question.  But C++ and generic libraries are going to fit?
> 
>> Both Perl and Python will give you much simpler regular
>> expression engines than any C library, vastly easier (and more
>> flexible, and probably faster) hash dictionaries, and a host
>> of ready-to-use libraries.
> 
> Yup.
> 

No one is writing an assembler to run *on* the target.  Someone jumped 
into this thread with a discussion about running on an 8-bit target (as 
far as I've noticed, the OP has not mentioned the type of target), and 
how we should program efficiently in assembler rather than using high 
level languages and hash tables.  As far as I can figure out, he 
assumed that the thread was about writing in assembly language rather 
than writing an assembler, and his confusion has spread.
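
To illustrate the point about built-in regular expressions and hash dictionaries, here is a minimal Python sketch of parsing one line of assembler source. The line format, the `OPCODES` table and the `parse_line` helper are all invented for this example; nothing in the thread specifies the OP's instruction set or syntax.

```python
import re

# Hypothetical opcode table: a plain dict is already a hash table.
OPCODES = {"NOP": 0x00, "LDA": 0x01, "JMP": 0x02}

# Assumed line format: [label:] [mnemonic [operand]] [; comment]
LINE_RE = re.compile(
    r"^\s*(?:(?P<label>[A-Za-z_]\w*):)?"  # optional "label:"
    r"\s*(?P<mnemonic>[A-Za-z]+)?"        # optional mnemonic
    r"\s*(?P<operand>[^;]*?)"             # optional operand text
    r"\s*(?:;.*)?$"                       # optional comment
)

def parse_line(text):
    """Split one source line into (label, opcode, operand)."""
    match = LINE_RE.match(text)
    label = match.group("label")
    mnemonic = match.group("mnemonic")
    operand = match.group("operand") or None
    opcode = None
    if mnemonic is not None:
        # The dict lookup raises KeyError for an unknown mnemonic.
        opcode = OPCODES[mnemonic.upper()]
    return label, opcode, operand

print(parse_line("loop: lda #10 ; load the counter"))
```

The opcode lookup is a single dictionary access, and no hashing or collision code has to be written by hand.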

Reply by Grant Edwards ● March 8, 2006

On 2006-03-08, David Brown <david@westcontrol.removethisbit.com> wrote:

>> That's what I said, but apparently the assembler is running on
>> an 8-bit target with a 64K address space.  So Python is out of
>> the question.  But C++ and generic libraries are going to fit?
>> 
>>> Both Perl and Python will give you much simpler regular
>>> expression engines than any C library, vastly easier (and more
>>> flexible, and probably faster) hash dictionaries, and a host
>>> of ready-to-use libraries.
>> 
>> Yup.
>
> No one is writing an assembler to run *on* the target.  Someone jumped 
> into this thread with a discussion about running on an 8-bit target (as 
> far as I've noticed, the OP has not mentioned the type of target), and 
> how we should program efficiently in assembler rather than using high 
> level languages and hash tables.  As far as I can figure out, he 
> assumed that the thread was about writing in assembly language rather 
> than writing an assembler, and his confusion has spread.

Ah.  

If that's the case, then I stand by my original suggestion of
using a high-level language like Python to write the assembler.

-- 
Grant Edwards                   grante             Yow!  Did you find a
                                  at               DIGITAL WATCH in YOUR box
                               visi.com            of VELVEETA??

Reply by toby ● March 8, 2006

Grant Edwards wrote:
> ... I stand by my original suggestion of
> using a high-level language like Python to write the assembler.

Yes, or Perl, and perhaps using a lexer/parser library. Even C++ with a
decent parser generator.

The fact that a thread headed "Writing a simple assembler" is dominated
by the topic of hashing indicates that something is seriously off the
rails somewhere.

> 
> -- 
> Grant Edwards

Reply by samIam ● March 8, 2006

WOAH
the "thought police"!

> Please do not top-post.  It loses all context, and makes no sense. 


> You don't have to implement any collision detection, it is all done
> for you.

completely missing the point.
the problem is the NEED for collision detection ... and having to
re-allocate twice the memory when the table is filled.

having it "hidden" or "done for you" doesn't resolve the issues above
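
To make the two costs under discussion concrete (handling collisions, and re-allocating a table of twice the size once it fills), here is a minimal sketch of a chained hash table in Python. The class `ChainedTable`, its starting capacity of 8 and its grow-at-load-factor-1.0 rule are assumptions chosen for illustration; this is not the code of hashlib or of any other library mentioned in the thread.

```python
class ChainedTable:
    """Toy hash table: separate chaining, bucket array doubles when full."""

    def __init__(self, capacity=8):
        self._buckets = [[] for _ in range(capacity)]
        self._count = 0

    def _bucket(self, key):
        # Keys whose hashes collide modulo the table size share one bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (existing, _) in enumerate(bucket):
            if existing == key:          # key already present: overwrite
                bucket[i] = (key, value)
                return
        bucket.append((key, value))      # a collision just extends the chain
        self._count += 1
        if self._count > len(self._buckets):
            self._grow()

    def _grow(self):
        # Allocate twice as many buckets and re-insert every entry.
        old = self._buckets
        self._buckets = [[] for _ in range(2 * len(old))]
        for bucket in old:
            for key, value in bucket:
                self._bucket(key).append((key, value))

    def lookup(self, key):
        for existing, value in self._bucket(key):
            if existing == key:
                return value
        raise KeyError(key)


table = ChainedTable()
for i in range(100):
    table.insert(f"sym{i}", i)
print(table.lookup("sym42"), len(table._buckets))
```

Both costs are real, but they are confined to `insert` and `_grow`; the caller only ever sees `insert` and `lookup`.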

Reply by Alex ● March 8, 2006

First, I would like to thank everyone for the responses and advice.

Second, the purpose was to write a simple assembler in order to generate
opcodes on a PC and then run them on my IC (an FPGA is used as the host
controller). I understand that the task is trivial for gurus, but being a
novice at this, the first thing that came to my mind was simply to make
two passes: the first, a preprocessor, detects all the variables etc.;
the second does the actual translation, recognising mnemonics and
generating opcodes. Obviously it is not a "proper" way to do it (grammar
descriptions and so on). That's why I was asking for examples and
articles about this issue.



-- 
Alex
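
The two-pass scheme described above can be sketched in a few lines of Python. Everything specific here is an assumption made for illustration: the three-instruction `OPCODES` table, the `label:` and `;` comment syntax, and the one-byte operands. The structure is the part that matters: pass 1 only assigns addresses to labels, pass 2 emits bytes and resolves operands.

```python
# Hypothetical instruction set: mnemonic -> (opcode byte, operand byte count).
OPCODES = {"NOP": (0x00, 0), "LDA": (0x01, 1), "JMP": (0x02, 1)}

def tokenize(source):
    """Yield (label, mnemonic, operand) for each non-empty source line."""
    for raw in source.splitlines():
        line = raw.split(";", 1)[0].strip()   # drop ';' comments
        if not line:
            continue
        label = None
        if ":" in line:
            label, line = (part.strip() for part in line.split(":", 1))
        fields = line.split()
        mnemonic = fields[0].upper() if fields else None
        operand = fields[1] if len(fields) > 1 else None
        yield label, mnemonic, operand

def assemble(source):
    lines = list(tokenize(source))

    # Pass 1: walk the program and record the address of every label.
    symbols, address = {}, 0
    for label, mnemonic, _ in lines:
        if label is not None:
            if label in symbols:
                raise ValueError(f"duplicate label: {label}")
            symbols[label] = address
        if mnemonic is not None:
            if mnemonic not in OPCODES:
                raise ValueError(f"unknown mnemonic: {mnemonic}")
            address += 1 + OPCODES[mnemonic][1]

    # Pass 2: emit opcode bytes, resolving operands through the symbol table.
    output = bytearray()
    for _, mnemonic, operand in lines:
        if mnemonic is None:
            continue
        opcode, size = OPCODES[mnemonic]
        output.append(opcode)
        if size:
            if operand is None:
                raise ValueError(f"{mnemonic} needs an operand")
            value = symbols[operand] if operand in symbols else int(operand, 0)
            output.append(value & 0xFF)
    return bytes(output)

program = """
        JMP end      ; forward reference, resolved in pass 2
start:  LDA 0x10
end:    NOP
"""
print(assemble(program).hex())
```

Because all labels are known before any byte is emitted, forward references such as `JMP end` need no special handling.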

Reply by rand...@earthlink.net ● March 8, 2006

Grant Edwards wrote:
>
> My point exactly.  Worrying about hash tables for symbol lookup
> reeks of premature optimization for "a simple assembler".
>

Speaking from experience ... :-)

First, choosing a good algorithm up front is not "premature
optimization." It's simply good design.

Second, despite the best intentions (encapsulation and other good
software engineering methods), it's often not so simple to just replace
one symbol table search routine with another.

Third, "a simple assembler" today may be a complex assembler tomorrow.
Better to do it right the first time around, especially as using a hash
table lookup isn't a whole lot more complex than a linear search.

I regret the day I said to myself "heck, this is just a prototype, I'll
use a simple linear search right now and fix it in the final version."
82 versions later and over 100,000 lines of code, I can attest that
this was the second worst design decision I made for my "prototype
assembler" (the #1 bad design decision was using Flex and Bison).
Cheers,
Randy Hyde
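
A rough way to see why the linear search hurts as a project grows is to count comparisons. The sketch below is illustrative only: `linear_lookup` and `total_comparisons` are hypothetical helpers, and the symbol names are generated. It looks up each of n symbols once in an unsorted list, which costs n(n+1)/2 comparisons in total, whereas a hashed table needs roughly one probe per lookup on average.

```python
def linear_lookup(table, name):
    """Search a list of (name, value) pairs; return (value, comparisons)."""
    for comparisons, (key, value) in enumerate(table, start=1):
        if key == name:
            return value, comparisons
    raise KeyError(name)

def total_comparisons(n):
    """Comparisons needed to look up each of n symbols once, linearly."""
    table = [(f"sym{i}", i) for i in range(n)]
    return sum(linear_lookup(table, name)[1] for name, _ in table)

for n in (10, 100, 1000):
    # Grows as n * (n + 1) / 2, i.e. quadratically in the symbol count.
    print(n, total_comparisons(n))
```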

Reply by Hans-Bernhard Broeker ● March 8, 2006

randyhyde@earthlink.net <randyhyde@earthlink.net> wrote:

> Second, despite the best intentions (encapsulation and other good
> software engineering methods), it's often not so simple to just replace
> one symbol table search routine with another.

In all fairness, a "symbol table search routine" that fails to be
easily replaceable by another should immediately be reported to the
nearest Committee on Abusive Nomenclature and Fraudulent Assumption of
Titles.  If it can't be replaced, it's clearly not a search routine,
but a hack.

-- 
Hans-Bernhard Broeker (broeker@physik.rwth-aachen.de)
Even if all the snow were burnt, ashes would remain.

Reply by rand...@earthlink.net ● March 8, 2006

Hans-Bernhard Broeker wrote:
> > one symbol table search routine with another.
>
> In all fairness, a "symbol table search routine" that fails to be
> easily replaceable by another should immediately be reported to the
> nearest Committee on Abusive Nomenclature and Fraudulent Assumption of
> Titles.  If it can't be replaced, it's clearly not a search routine,
> but a hack.

I couldn't agree more. But more often than not, guess what happens.
Better to plan ahead, and be realistic.
Cheers,
Randy Hyde

Reply by Grant Edwards ● March 8, 2006

On 2006-03-08, randyhyde@earthlink.net <randyhyde@earthlink.net> wrote:

>> My point exactly.  Worrying about hash tables for symbol lookup
>> reeks of premature optimization for "a simple assembler".
>
> Speaking from experience ... :-)
>
> First, choosing a good algorithm up front is not "premature
> optimization." It's simply good design.
>
> Second, despite the best intentions (encapsulation and other good
> software engineering methods), it's often not so simple to just replace
> one symbol table search routine with another.

Then you did it wrong.  The API for a symbol lookup should be
dead simple.

-- 
Grant Edwards                   grante             Yow!  I feel like I'm
                                  at               in a Toilet Bowl with a
                               visi.com            thumbtack in my forehead!!
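
As one possible reading of "dead simple", a symbol-table API can be as small as two operations, `define` and `lookup`. In the Python sketch below, the class names `LinearSymbols` and `HashedSymbols` and the helper `resolve_all` are invented for illustration. Code that calls only those two methods does not change when one backend is swapped for the other.

```python
class LinearSymbols:
    """Symbol table backed by an unsorted list (linear search)."""

    def __init__(self):
        self._items = []

    def define(self, name, value):
        if any(key == name for key, _ in self._items):
            raise ValueError(f"duplicate symbol: {name}")
        self._items.append((name, value))

    def lookup(self, name):
        for key, value in self._items:
            if key == name:
                return value
        raise KeyError(name)


class HashedSymbols:
    """Same two-method API, backed by a dict (hash table)."""

    def __init__(self):
        self._items = {}

    def define(self, name, value):
        if name in self._items:
            raise ValueError(f"duplicate symbol: {name}")
        self._items[name] = value

    def lookup(self, name):
        return self._items[name]


def resolve_all(symbols, names):
    """Caller code: depends only on lookup(), not on the backend."""
    return [symbols.lookup(name) for name in names]


for backend in (LinearSymbols, HashedSymbols):
    table = backend()
    table.define("start", 0x0000)
    table.define("loop", 0x0010)
    print(backend.__name__, resolve_all(table, ["loop", "start"]))
```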