Reply by rand...@earthlink.net March 13, 2006
cs_posting@hotmail.com wrote:
> Paul Keinanen wrote:
>
> > A two pass assembler is definitely a practical way of doing a
> > cross-assembler. The first pass must generate the correct amount of
> > code for each instruction, in order to "detect" the locations of all
> > branch target labels in the program. In the second pass do the same
> > thing again and, using the label addresses stored in the symbol table,
> > generate the correct code (especially the branch/jump instructions).
>
> I would imagine things get more complicated when you have both a short
> relative and a long call/jump addressing scheme - you don't know before
> you emit the intervening code how far away your target will be, so you
> don't know how long an instruction word you need to emit, so you don't
> know how far away your target will be...
>
> Of course if the two modes have different mnemonics, then it's up to
> the user to pick the right one, and the assembler only to error if the
> short one is used where it won't work.
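The two-pass scheme Paul describes can be sketched in a few lines of Python. This is a hypothetical toy syntax (labels end in `:`, one `jmp` mnemonic, one word per instruction) invented purely for illustration, so pass one can size everything exactly:

```python
# Toy two-pass assembler sketch: every instruction occupies one word,
# so pass 1 can compute label addresses exactly (hypothetical syntax).

def assemble(lines):
    symtab = {}
    # Pass 1: record the address of every label.
    addr = 0
    for line in lines:
        if line.endswith(":"):           # label definition
            symtab[line[:-1]] = addr
        else:                            # instruction: one word each
            addr += 1
    # Pass 2: emit code, resolving label references from the symbol table.
    output = []
    for line in lines:
        if line.endswith(":"):
            continue
        op, *args = line.split()
        if op == "jmp":                  # branch target looked up in symtab
            output.append(("JMP", symtab[args[0]]))
        else:
            output.append((op.upper(),))
    return output

prog = ["start:", "nop", "jmp start"]
print(assemble(prog))                    # [('NOP',), ('JMP', 0)]
```

The key point is that pass 1 only works because every instruction's size is known up front; variable-length branches are exactly what breaks this, as discussed below in the thread.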
Branch displacement optimization is provably NP-complete, and this is true whether there are only two branch sizes or an arbitrary number of them. This means that the best known solution requires, in the worst case, O(2**n) passes over the program (where n is the number of branches in the program) to produce the optimally sized program.

Some assemblers, such as MASM, Gas, and FASM on the x86, start with long branches and successively reduce their sizes on each pass (where legal) until they make a pass during which no branch sizes change. This does *not* produce, guaranteed, the minimal-sized program. Indeed, it's better in most cases to start off assuming the branches are short and extend them as necessary. Even so, this scheme is not guaranteed to produce the minimum-sized program. In fact, a "greedy" algorithm (make every branch that can be short, short) is not guaranteed to produce the smallest program. You can come up with some (synthetic) examples where making one particular branch longer, though it could be shorter, yields a program that, overall, consumes fewer bytes (think about other objects in the program, besides branches, whose size depends upon the span between two labels).

Nevertheless, a "relaxation" or refinement approach, while not guaranteed to be optimal, still produces very good results. It may take an arbitrary number of passes, though; two or three is rarely enough for a complex program. I once generated a synthetic test file and FASM made nearly 2,000 passes over the source file during optimization.

Cheers,
Randy Hyde
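The relaxation approach Randy describes (start every branch short, widen any that are out of range, repeat until a pass changes nothing) can be sketched like this. The sizes and the signed-byte short range are hypothetical, loosely x86-flavoured, and the data layout is invented for the sketch:

```python
# Relaxation sketch for branch sizing: start every branch short and widen
# any that turn out to be out of range, repeating until a pass changes
# nothing. Instructions are (kind, target_label); sizes are hypothetical:
# a short branch is 1 byte, a long branch 3, everything else 1.

def relax(code, labels):
    size = {i: 1 for i, (kind, _) in enumerate(code) if kind == "br"}
    while True:
        # Compute addresses under the current size assignment.
        addr, pc = [], 0
        for i, _ in enumerate(code):
            addr.append(pc)
            pc += size.get(i, 1)
        where = {lab: addr[i] for lab, i in labels.items()}
        changed = False
        for i, (kind, target) in enumerate(code):
            if kind == "br" and size[i] == 1:
                disp = where[target] - (addr[i] + size[i])
                if not -128 <= disp <= 127:   # out of short range: widen
                    size[i] = 3
                    changed = True
        if not changed:
            return size                       # fixpoint: no branch grew

# 200 filler ops put "end" out of short range of the first branch only.
code = [("br", "end")] + [("op", None)] * 200 + [("br", "end")]
labels = {"end": 201}            # label "end" sits at instruction index 201
sizes = relax(code, labels)
print(sizes[0], sizes[201])      # 3 1  (first branch widened, last stays short)
```

Note this only ever grows branches, which is why it converges; as Randy says, it is not guaranteed minimal, because widening one branch can push others out of range in turn.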
Reply by toby March 13, 2006
randyhyde@earthlink.net wrote:
> toby wrote:
>
> > Clearly HLA falls outside my parameters, or far outside the 'simple'
> > parameter, because it has strict performance requirements
>
> Hmm... That's the first time I've heard anyone claim HLA has "strict"
> performance requirements. :-)
>
> As a matter of fact, HLA is one of the slower assemblers because
> performance was never a design criterion for HLA v1.x (which is a
> prototype used to develop the language).
>
> > -- e.g. it has to process very large symbol tables and a complex
> > macro syntax, and because it's among other things (forgive me) a
> > competitor in a pissing contest, if you've ever frequented
> > alt.lang.asm.
>
> Performance has nothing to do with the "pissing contest" over in ALA.
> On the performance side, only a few assemblers (e.g., NASM) are slower
> than HLA.
>
> > It's also written by an assembly language programmer with an assembly
> > language programmer's preoccupations - cache, etc.
>
> It's written in Flex, Bison, and C, as I've pointed out a couple of
> times in this thread. Anyone concerned about "assembly language
> preoccupations", especially the cache, would steer clear of Flex and
> Bison, for reasons I've pointed out in this thread.
>
> Boy, you seem to be a bit confused about what HLA is.
Yes, I apologise for my ignorance of its implementation. I had gathered from your earlier posts that you had rejected Flex/Bison early on. I gather you will not use them in future rewrites?
> > I am simply defending an alternative approach: Don't microdesign and
> > micromanage, but exploit tools like flex and bison to handle the
> > tedious and move on to focus on 'the real job'.
>
> That was exactly the approach I chose to use when developing the
> prototype for the HLA language. I lived to regret that choice. Granted,
> HLA has some sophisticated features that make it a bit more complex
> than a "simple assembler", but it is exactly those types of features
> for which Flex and Bison are supposed to be ideal. Alas, the moment you
> want to do something like macros, Flex and Bison (well, especially
> Bison) become anchors, holding back the project.
I would agree with that. But again, maybe a macro assembler (especially one with HLA's design goals) is outside the ambit of 'simple'.
> > At one point I compared[1] two implementations of a PAL-III (PDP-8)
> > assembler, one with hand-coded lexer/parser, and one using flex/bison
> > (mine). In the hand-coded case, the code concerned with lexing/parsing
> > was 692 lines, or 58% of the program. In the flex/bison case, it was a
> > mere 179 lines, or 11% of the program (including token definitions,
> > grammar, support code). It seems reasonable to infer that there is
> > correspondingly less to write and debug in the latter case, by a
> > factor of nearly four.
>
> That is an incredible generalization that you cannot make on the basis
> of such a small project.
I'm only talking about small projects, here. PAL-III is a strictly defined syntax: It can't grow, and it doesn't have macros. This is only a case study.
> > And it ends up in a clearer, more maintainable form. But of course
> > this reasoning applies most strongly to "simple" projects.
>
> Which, alas, have a habit of growing. And Bison parsers don't scale up
> well.
>
> Cheers,
> Randy Hyde
Reply by rand...@earthlink.net March 13, 2006
toby wrote:

> Clearly HLA falls outside my parameters, or far outside the 'simple'
> parameter, because it has strict performance requirements
Hmm... That's the first time I've heard anyone claim HLA has "strict" performance requirements. :-)

As a matter of fact, HLA is one of the slower assemblers because performance was never a design criterion for HLA v1.x (which is a prototype used to develop the language).
> -- e.g. it has to process very large symbol tables and a complex macro
> syntax, and because it's among other things (forgive me) a competitor
> in a pissing contest, if you've ever frequented alt.lang.asm.
Performance has nothing to do with the "pissing contest" over in ALA. On the performance side, only a few assemblers (e.g., NASM) are slower than HLA.
> It's also written by an assembly language programmer with an assembly
> language programmer's preoccupations - cache, etc.
It's written in Flex, Bison, and C, as I've pointed out a couple of times in this thread. Anyone concerned about "assembly language preoccupations", especially the cache, would steer clear of Flex and Bison, for reasons I've pointed out in this thread.

Boy, you seem to be a bit confused about what HLA is.
> I am simply defending an alternative approach: Don't microdesign and
> micromanage, but exploit tools like flex and bison to handle the
> tedious and move on to focus on 'the real job'.
That was exactly the approach I chose to use when developing the prototype for the HLA language. I lived to regret that choice. Granted, HLA has some sophisticated features that make it a bit more complex than a "simple assembler", but it is exactly those types of features for which Flex and Bison are supposed to be ideal. Alas, the moment you want to do something like macros, Flex and Bison (well, especially Bison) become anchors, holding back the project.
> At one point I compared[1] two implementations of a PAL-III (PDP-8)
> assembler, one with hand-coded lexer/parser, and one using flex/bison
> (mine). In the hand-coded case, the code concerned with lexing/parsing
> was 692 lines, or 58% of the program. In the flex/bison case, it was a
> mere 179 lines, or 11% of the program (including token definitions,
> grammar, support code). It seems reasonable to infer that there is
> correspondingly less to write and debug in the latter case, by a factor
> of nearly four.
That is an incredible generalization that you cannot make on the basis of such a small project.
> And it ends up in a clearer, more maintainable form. But of course
> this reasoning applies most strongly to "simple" projects.
Which, alas, have a habit of growing. And Bison parsers don't scale up well.

Cheers,
Randy Hyde
Reply by toby March 13, 2006
Walter Banks wrote:
> The Datamation algorithm (Sorry I don't have the exact reference) fails
> to work on many instruction sets, for example the National COP8 family,
> which has some instructions that have a pc offset, offset in the current
> page and absolute addressing modes.
>
> The algorithm essentially says start with the longest instruction form
> and reduce to a shorter version of the instruction until no more
> instructions in the generated code have an "in range" workable form.
"Relaxation" method comes to mind. The Transputer was an interesting case, since you could build offsets of arbitrary magnitude with its general prefix mechanism. (i.e. There were an arbitrary number of jump instruction lengths, not just short and long.)
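The Transputer's arbitrary instruction lengths come from its operand-prefixing scheme, which can be sketched as follows. The function codes and byte layout here follow my recollection of the classic encoding (each byte is function nibble plus data nibble, with pfix accumulating operand nibbles), so treat the details as illustrative:

```python
# Sketch of Transputer-style operand prefixing: each instruction byte is
# (function << 4) | nibble, and pfix (function 2) shifts the accumulated
# operand left 4 bits. A jump to a large offset therefore needs a chain
# of pfix bytes - the instruction length depends on the operand size.

PFIX, J = 0x2, 0x0   # function codes (j = jump), per the usual encoding

def encode(function, operand):
    """Encode a non-negative operand with as many pfix bytes as needed."""
    assert operand >= 0
    nibbles = []
    while True:                      # split operand into 4-bit nibbles
        nibbles.append(operand & 0xF)
        operand >>= 4
        if operand == 0:
            break
    # High nibbles become pfix bytes; the last nibble rides in the opcode.
    out = [(PFIX << 4) | n for n in reversed(nibbles[1:])]
    out.append((function << 4) | nibbles[0])
    return bytes(out)

print(encode(J, 5).hex())      # '05'     - one byte for a small jump
print(encode(J, 0x123).hex())  # '212203' - two pfix bytes, then the jump
```

(Negative operands use nfix on the real hardware; that case is omitted here.) So there really are arbitrarily many jump lengths, one per nibble of the offset, which is what makes the relaxation problem interesting on this architecture.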
> The COP8 case and many others with page oriented instructions start
> to reduce code using the Datamation algorithm a secondary problem
> starts to show as one of the source or destination addresses slides
> over a page boundary.
>
> The freescale 6812 has an interesting similar problem where one
> of the call types has a different return instruction. Then this call is
> used all calls to the subroutine needs to be the same type. In
> automatically generated code we always want to generate the
> shortest form but if any call cannot reach, all the calls need to be
> changed and all the returns for the function must be changed as
> well.
>
> w..
>
> Tauno Voipio wrote:
>
> > There is an algorithm, published in Datamation (1968-1970), which
> > enables the translation of variable-length addresses so that the
> > task of address length selection is left to the assembler.
> >
> > It needs a third pass to resolve the addressing lengths.
Reply by David Brown March 13, 2006
Isaac Bosompem wrote:
> Hi Grant, sorry for the cross posting.
>
> Do you know a good Python IDE? Which one do you use yourself?
IDLE is a simple one that is part of most Python installations. For starting out, or for quick work, it's probably the fastest and easiest. SPE and Eric are Python-specific IDEs. Boa Constructor is a Python IDE and wxPython designer. Most "big" editors are also IDEs and can work with Python - Eclipse, jEdit, (X)Emacs, Vim, etc.
Reply by Tauno Voipio March 13, 2006
Walter Banks wrote:
> The Datamation algorithm (Sorry I don't have the exact reference) fails
> to work on many instruction sets, for example the National COP8 family,
> which has some instructions that have a pc offset, offset in the current
> page and absolute addressing modes.
>
> The algorithm essentially says start with the longest instruction form
> and reduce to a shorter version of the instruction until no more
> instructions in the generated code have an "in range" workable form.
>
> The COP8 case and many others with page oriented instructions start
> to reduce code using the Datamation algorithm a secondary problem
> starts to show as one of the source or destination addresses slides
> over a page boundary.
>
> The freescale 6812 has an interesting similar problem where one
> of the call types has a different return instruction. Then this call is
> used all calls to the subroutine needs to be the same type. In
> automatically generated code we always want to generate the
> shortest form but if any call cannot reach, all the calls need to be
> changed and all the returns for the function must be changed as
> well.
I left out the paged architectures (PDP-8 & co from the years gone by); they belong to the same place as dinosaurs and Fortran - history. The algorithm is extensible to the paged case, but as long as there are several separately translated modules, the code produced is sub-optimal. In any case, the paged architectures are prone to page-zero overflow.

--
Tauno Voipio
tauno voipio (at) iki fi
Reply by Walter Banks March 12, 2006
The Datamation algorithm (sorry, I don't have the exact reference) fails
to work on many instruction sets - for example, the National COP8 family,
which has some instructions that have a pc offset, an offset in the
current page, and absolute addressing modes.

The algorithm essentially says: start with the longest instruction form
and reduce to a shorter version of the instruction until no more
instructions in the generated code have a shorter "in range" workable form.
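This longest-first direction can be sketched like this. The instruction sizes and range are hypothetical, and the helper name is my own; note that shrinking only moves code closer together, so a branch that comes into range stays in range, which is why the iteration terminates:

```python
# Sketch of the longest-first direction: assume every branch long, then
# repeatedly shrink any branch whose target has come into short range,
# stopping when a pass shrinks nothing. Sizes are hypothetical: long
# branch 3 bytes, short branch 1, other instructions 1.

def reduce_longest_first(code, labels):
    size = {i: 3 for i, (kind, _) in enumerate(code) if kind == "br"}
    changed = True
    while changed:
        changed = False
        addr, pc = [], 0
        for i, _ in enumerate(code):          # addresses under current sizes
            addr.append(pc)
            pc += size.get(i, 1)
        where = {lab: addr[i] for lab, i in labels.items()}
        for i, (kind, target) in enumerate(code):
            if kind == "br" and size[i] == 3:
                disp = where[target] - (addr[i] + 1)  # as if it were short
                if -128 <= disp <= 127:               # now reachable: shrink
                    size[i] = 1
                    changed = True
    return size

code = [("br", "top"), ("op", None), ("br", "top")]
labels = {"top": 0}                         # "top" is instruction index 0
print(reduce_longest_first(code, labels))   # {0: 1, 2: 1}
```

Walter's COP8 point is precisely that this sketch is too simple: with paged addressing modes, shrinking one instruction can slide an address across a page boundary and invalidate an earlier decision.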

In the COP8 case, and many others with page-oriented instructions, as the
Datamation algorithm starts to reduce code, a secondary problem starts to
show as one of the source or destination addresses slides over a page
boundary.

The Freescale 6812 has an interesting similar problem, where one of the
call types has a different return instruction. When this call is used,
all calls to the subroutine need to be the same type. In automatically
generated code we always want to generate the shortest form, but if any
call cannot reach, all the calls need to be changed and all the returns
for the function must be changed as well.

w..

Tauno Voipio wrote:

> There is an algorithm, published in Datamation (1968-1970), which
> enables the translation of variable-length addresses so that the
> task of address length selection is left to the assembler.
>
> It needs a third pass to resolve the addressing lengths.
Reply by toby March 12, 2006
toby wrote:
> toby wrote:
>
> > Jim Stewart wrote:
> >
> > > toby wrote:
> > >
> > > > Alex wrote:
> > > >
> > > > > First, I would like to thank everyone for a response and advice.
> > > > > ...
> > >
> > > Myself, I'd just define some macros for
> > > MASM, the old Microsoft DOS assembler.
> >
> > Hi Jim,
> >
> > Now that is a technique I have heard of before! A gentleman Tom Evans
> > clued me into this, and I hope he will allow me the liberty of quoting:
> >
> > There's two ways to "write an assembler in Macro 11".
> > ...
>
> I don't know why I didn't think of this before, since I've been looking
> at it recently, but Mauro Persano's wicked-clever abuse of templates to
Not Mauro Persano's, but actually belonging to mncw, a.k.a. bi: http://freshmeat.net/~mncw/ http://www.advogato.org/person/bi/ Apologies for the misattribution. And I only use the word 'abuse' jokingly :-)
> create a mock-assembler syntax for GNU Lightning is also in this vein.
> I'm wishing for an excuse to try out his froofyjit:
>
> http://freshmeat.net/projects/froofyjit/?branch_id=62508&release_id=218685
> http://fzort.org/bi/sw/froofy/index.html#froofyjit
>
> --Toby
Reply by toby March 11, 2006
toby wrote:
> Jim Stewart wrote:
>
> > toby wrote:
> >
> > > Alex wrote:
> > >
> > > > First, I would like to thank everyone for a response and advice.
> > > > ...
> >
> > Myself, I'd just define some macros for
> > MASM, the old Microsoft DOS assembler.
>
> Hi Jim,
>
> Now that is a technique I have heard of before! A gentleman Tom Evans
> clued me into this, and I hope he will allow me the liberty of quoting:
>
> There's two ways to "write an assembler in Macro 11".
> ...
I don't know why I didn't think of this before, since I've been looking at it recently, but Mauro Persano's wicked-clever abuse of templates to create a mock-assembler syntax for GNU Lightning is also in this vein. I'm wishing for an excuse to try out his froofyjit: http://freshmeat.net/projects/froofyjit/?branch_id=62508&release_id=218685 http://fzort.org/bi/sw/froofy/index.html#froofyjit
--Toby
Reply by Grant Edwards March 11, 2006
On 2006-03-11, Isaac Bosompem <x86asm@gmail.com> wrote:

> Hi Grant, sorry for the cross posting.
Um, you didn't. Cross-post, that is.
> Do you know a good Python IDE? Which one do you use yourself?
Bash and an editor with a good python mode. I use Jed.

--
Grant Edwards                   grante             Yow!  Nice decor!
                                  at
                               visi.com