On 8 Mar 2006 15:56:44 -0800, "toby" <toby@telegraphics.com.au> wrote:

>Jim Stewart wrote:
>> toby wrote:
>>
>> > Alex wrote:
>> >
>> >>First, I would like to thank everyone for a response and advice.
>> >>
>> >>Second - the purpose was to write a simple assembler in order to generate
>> >>an op code on a PC
>> >>and then run it on my IC (fpga is used as a host controller).
>> >>I understand that the task is trivial for gurus, but being a novice in
>> >>this thing first thing that came to
>> >>my mind was simply to make two passes: first - preprocessor, detects all
>> >>the variables etc., second actual
>> >>translation - recognising mnemonics and generate an opcode . Obviously it
>> >>is not a "proper" way to do it
>> >>(grammar descriptions and so on..). That's why I was asking about some
>> >>examples and articles about this issue.
>> >
>> >
>> > It can be argued that since you are in learning mode there is no
>> > 'wrong' way to go about it. Go forth and build your prototype; doing so
>> > is a great way to learn about languages and tools that might be related
>> > or make the task easier. :)
>>
>> Total agreement. In the total time spent
>> posting on this thread, a simple assembler
>> could have been written and debugged :)
>>
>> Myself, I'd just define some macros for
>> MASM, the old Microsoft DOS assembler.
>
>Hi Jim,
>
>Now that is a technique I have heard of before! A gentleman Tom Evans
>clued me into this, and I hope he will allow me the liberty of quoting:
>
> There's two ways to "write an assembler in Macro 11".
>
> The IMP/PACE one I mentioned was "the classic version". It had
> the parser, symbol table management and everything, all lovingly
> coded in individual machine instructions. Lots of them. How
> redundant...
>
> The SC/MP cross assembler (and other ones I've worked on ...
> that generated microcode) consisted of MACROS.
>
> This is cheating big-time. The Macro-11 assembler is being abused
> to assemble and emit code for a different CPU, or sometimes not
> even for a CPU but for a ROM Sequencer or worse. The macros have
> the same names as the target CPU's op-codes and they simply
> generate the appropriate code, (ab)using the symbol table
> management built into Macro-11.
>
> As a huge benefit you can also use all of the powerful macro
> facilities in Macro-11. Try emulating all of that in lex/yacc.
>
> Of course if the targeted CPU uses opcodes with the same names as
> the ones the PDP-11 uses there's a bit of strife, ...
>
> Macro-11 isn't exactly fast when abused like this. It took about
> 5 minutes to make [a] 1023-byte ROM ...
>
>Kids these days have it easy! Lex! Yacc! Puts me in mind of:
>http://www.phespirit.info/montypython/four_yorkshiremen.htm
>
>"SECOND YORKSHIREMAN:
> Luxury. We used to have to get out of the lake at six o'clock in
>the morning, clean the lake, eat a handful of 'ot gravel, work twenty
>hour day at mill for tuppence a month, come home, and Dad would thrash
>us to sleep with a broken bottle, if we were lucky!"

Macro-11 is pretty good and I've read exactly these cases, back in the 1970's when I was using Macro-11. I never wrote Macro-11 code for a cross-assembler, though. Just heard about it.

I've actually used MASM/ML from Microsoft for such things, though. From my vague recollection of Macro-11 macros, MASM/ML's macro facilities aren't nearly as general and can be confusing to figure out, at times. But the linker will actually punch out a .COM file, which is a clean, exact, binary image. MASM/ML will allow you to place things in separate segments so that you can, on the fly, place things into nicely organized sections which will later be fused together as you see fit. (You can generate a .EXE, but you will need another tool to 'fix' it up.)

Jon
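The macro trick quoted above can be imitated in most host languages. The sketch below is only an illustration in Python: each method is named after a target mnemonic and simply emits bytes, while a dict stands in for the host assembler's symbol table. The three-instruction target (`nop`, `ldi`, `jmp`), its encodings, and the names `Image` and `assemble` are all invented here; none of them describe the SC/MP, Macro-11, or MASM.

```python
# Hypothetical 8-bit target: the opcodes and encodings below are invented
# for illustration and do not describe the SC/MP or any real CPU.

class Image:
    """Output buffer plus symbol table, standing in for the host assembler."""

    def __init__(self, symbols=None):
        self.code = bytearray()
        self.symbols = dict(symbols or {})

    def label(self, name):
        # Record the current location counter under this name.
        self.symbols[name] = len(self.code)

    def value(self, operand):
        # Operands are integers or symbol names. An unknown name resolves
        # to 0 on the first pass and to its real address on the second.
        if isinstance(operand, int):
            return operand
        return self.symbols.get(operand, 0)

    # One method per mnemonic, like one macro per op-code.
    def nop(self):
        self.code.append(0x00)

    def ldi(self, operand):
        self.code += bytes([0x10, self.value(operand) & 0xFF])

    def jmp(self, operand):
        self.code += bytes([0x20, self.value(operand) & 0xFF])


def assemble(program):
    """Run the program twice so forward references pick up final addresses."""
    first = Image()
    program(first)                   # pass 1: collect label addresses
    second = Image(first.symbols)
    program(second)                  # pass 2: emit code with known symbols
    return bytes(second.code)


def demo(asm):
    asm.label("start")
    asm.ldi(0x2A)
    asm.jmp("done")                  # forward reference
    asm.nop()
    asm.label("done")
    asm.jmp("start")


rom = assemble(demo)
print(rom.hex())
```

Running the source twice works here only because every instruction has a fixed length, so label addresses are the same on both passes.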
Writing a simple assembler
Started by ●March 6, 2006
Reply by ●March 8, 2006
Reply by ●March 8, 2006
On 2006-03-08, toby <toby@telegraphics.com.au> wrote:

> The SC/MP cross assembler (and other ones I've worked on ...
> that generated microcode) consisted of MACROS.
>
> This is cheating big-time. The Macro-11 assembler is being abused
> to assemble and emit code for a different CPU, or sometimes not
> even for a CPU but for a ROM Sequencer or worse. The macros have
> the same names as the target CPU's op-codes and they simply
> generate the appropriate code, (ab)using the symbol table
> management built into Macro-11.

I haven't done this with either MACRO-11 or MASM, but I have done it a few times with MACRO-32. First time was for FQAM, the QBus adapter for the VAXstation 3520/3540. It was built around a simple state machine implemented with registered EPROMs.

Currently, I'm using a set of MACRO-32 macros that let me migrate microcode assembly for a 2910/29116 based system off an old META29R setup that required a VAX onto an Alpha using MACRO-32. I've managed to retain all the original mnemonics used in the META29R code and quite a bit of the syntax. This let me do automated source code conversion between the two assemblers, which was nice. The result is a .PSECT containing an initialized array that I link against a FORTRAN program to extract the binary in a variety of formats.

> Macro-11 isn't exactly fast when abused like this. It took about
> 5 minutes to make [a] 1023-byte ROM ...

Neither is MACRO-32. It took a MicroVAX 2000 half an hour to assemble the microcode for the FQAM.

--
roger ivie
rivie@ridgenet.net
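The last step described above, turning an assembled array into files a programmer or loader accepts, is small enough to sketch. The post does not say which formats were produced, so the two below (Intel HEX data records and a plain hex listing) are assumptions chosen for illustration, written in Python rather than FORTRAN; the function names are hypothetical.

```python
def intel_hex(data, base=0, record_len=16):
    """Render bytes as Intel HEX text using type-00 data records."""
    if base < 0 or base + len(data) > 0x10000:
        raise ValueError("image does not fit in a 16-bit address space")
    lines = []
    for offset in range(0, len(data), record_len):
        chunk = bytes(data[offset:offset + record_len])
        addr = base + offset
        # Record body: byte count, 16-bit address, record type, data bytes.
        record = bytes([len(chunk), addr >> 8, addr & 0xFF, 0x00]) + chunk
        # Checksum is the two's complement of the sum of the record body.
        checksum = (-sum(record)) & 0xFF
        lines.append(":" + record.hex().upper() + f"{checksum:02X}")
    lines.append(":00000001FF")      # end-of-file record
    return "\n".join(lines)


def hex_listing(data, base=0, width=8):
    """Render bytes as 'ADDR: XX XX ...' lines for a human reader."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        body = " ".join(f"{b:02X}" for b in chunk)
        lines.append(f"{base + offset:04X}: {body}")
    return "\n".join(lines)


image = bytes([0x10, 0x2A, 0x20, 0x05])
print(intel_hex(image))
print(hex_listing(image))
```

Keeping the assembled image as plain bytes and treating each output format as a separate formatter means a new format is just one more function.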
Reply by ●March 8, 2006
toby wrote:

> > Myself, I'd just define some macros for
> > MASM, the old Microsoft DOS assembler.
>
> Hi Jim,
>
> Now that is a technique I have heard of before! A gentleman Tom Evans
> clued me into this, and I hope he will allow me the liberty of quoting:
>

Having a good compile-time language (macro processor plus other goodies) in an assembler is useful for creating all kinds of different languages, not just assemblers. Interested individuals might want to take a look at my chapter on "Domain Specific Languages" in "The Art of Assembly Language" where it discusses how to use HLA's macros and compile-time language to create "mini-languages". Certainly, an assembler would be fairly trivial to write. Indeed, I'm using this technique to create a small assembler for a virtual machine I've created to help with some object code obfuscation.

Cheers,
Randy Hyde
Reply by ●March 8, 2006
On 8 Mar 2006 18:01:13 -0800, "randyhyde@earthlink.net" <randyhyde@earthlink.net> wrote:

>toby wrote:
>> >
>> > Myself, I'd just define some macros for
>> > MASM, the old Microsoft DOS assembler.
>>
>> Hi Jim,
>>
>> Now that is a technique I have heard of before! A gentleman Tom Evans
>> clued me into this, and I hope he will allow me the liberty of quoting:
>
>Having a good compile-time language (macro processor plus other
>goodies) in an assembler is useful for creating all kinds of different
>languages, not just assemblers. Interested individuals might want to
>take a look at my chapter on "Domain Specific Languages" in "The Art of
>Assembly Language" where it discusses how to use HLA's macros and
>compile-time language to create "mini-languages". Certainly, an
>assembler would be fairly trivial to write. Indeed, I'm using this
>technique to create a small assembler for a virtual machine I've
>created to help with some object code obfuscation.

I was hoping you'd pop in on this.

Jon
Reply by ●March 9, 2006
randyhyde@earthlink.net wrote:

> toby wrote:
>
>>>Myself, I'd just define some macros for
>>>MASM, the old Microsoft DOS assembler.
>>
>>Hi Jim,
>>
>>Now that is a technique I have heard of before! A gentleman Tom Evans
>>clued me into this, and I hope he will allow me the liberty of quoting:
>>
>
> Having a good compile-time language (macro processor plus other
> goodies) in an assembler is useful for creating all kinds of different
> languages, not just assemblers. Interested individuals might want to
> take a look at my chapter on "Domain Specific Languages" in "The Art of
> Assembly Language" where it discusses how to use HLA's macros and
> compile-time language to create "mini-languages". Certainly, an
> assembler would be fairly trivial to write. Indeed, I'm using this
> technique to create a small assembler for a virtual machine I've
> created to help with some object code obfuscation.
> Cheers,
> Randy Hyde

Is your HLA v2.0 stable enough yet that it can be used as a macro assembler? I see it had a recent update.

-jg
Reply by ●March 9, 2006
"randyhyde@earthlink.net" wrote:

> toby wrote:
>>>
>>> Myself, I'd just define some macros for
>>> MASM, the old Microsoft DOS assembler.
>>
>> Now that is a technique I have heard of before! A gentleman Tom
>> Evans clued me into this, and I hope he will allow me the liberty
>> of quoting:
>
> Having a good compile-time language (macro processor plus other
> goodies) in an assembler is useful for creating all kinds of
> different languages, not just assemblers. Interested individuals
> might want to take a look at my chapter on "Domain Specific
> Languages" in "The Art of Assembly Language" where it discusses
> how to use HLA's macros and compile-time language to create
> "mini-languages". Certainly, an assembler would be fairly trivial
> to write. Indeed, I'm using this technique to create a small
> assembler for a virtual machine I've created to help with some
> object code obfuscation.

This, with different assemblers, linkers, etc. was precisely the path I took for the first version of machine code generation (as opposed to pcode generation) from PascalP. The result worked, generated excessively bloated code, but verified the ideas. I then wrote a proper code generator, which greatly improved speed, register assignments, bloat, etc.

--
"If you want to post a followup via groups.google.com, don't use the broken "Reply" link at the bottom of the article. Click on "show options" at the top of the article, then click on the "Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell.org/google/>
Also see <http://www.safalra.com/special/googlegroupsreply/>
Reply by ●March 9, 2006
Isaac Bosompem wrote:

> Grant Edwards wrote:
>> On 2006-03-08, Paul Keinanen <keinanen@sci.fi> wrote:
>>
>>>>> If it's just "simple assembler" why not just use a list with
>>>>> O(n) access time.
>>>> Most 8bit micros have anywhere from 40 - 60 instructions (opcodes
>>>> NOT opcodes+addressing formats)
>>>>
>>>> O(n) searches on that set looks like a waste of cpu cycles
>>>>
>>>>> Better yet, use a real programming language with a dictionary
>>>>> type (Python, Smalltalk, whatever).
>>>> There are some things you can ONLY do in assembly ... and on an
>>>> 8 bit micro (with 64k address space) every byte counts
>>> In the 1970's I wrote several cross-assemblers in Fortran running on
>>> PDP-11s (which have 64 KiB address space) for various processors such
>>> as 8080 and 1802. I never bothered to use anything more fancier than
>>> linear search for opcode or symbol table search, since the total time
>>> was dominated by the program load time, the source file read time (two
>>> passes) and the binary output file writing (or even punching to paper
>>> tape for downloading to the target :-).
>> My point exactly. Worrying about hash tables for symbol lookup
>> reeks of premature optimization for "a simple assembler".
>
> Well the hash lookups were not for premature optimization. I first
> decided on doing simple string parsing but I quickly realized it would
> be a nightmare to code. I am simply looking for a solution that is easy
> to code.
>

That would be why people are recommending languages like Python - so that basic building blocks like string parsing and hash tables are as easy as possible.

>>> It would be very hard to write so huge modules for any small
>>> target processor that would require such a huge number of
>>> labels, that the inefficient symbol table search time would
>>> have been of any significance relative to the I/O times.
>> I'm also surprised that somebody thinks they're going to use
>> C++ and generic hashing libraries on an 8-bit target with a 64K
>> address space.
>
> That choice might have been overambitious, but I do feel I need to get
> a grasp of OOP programming.
>

OOP would be a good idea for such a task. May I recommend a good OOP language, such as ... Python? There are actually a fair number of good OOP programming languages that could be used for such a task - C++ isn't really one of them (multiple inheritance, friends, and templates are some of the "features" of C++ OOP that quickly lead to horribly ugly, unreadable and unmaintainable code).

> I must ask though why you guys are so adamant about using TCL or
> Python?! I have not started to code it yet, I am still in the planning
> phase.
>

I don't think anyone has recommended TCL, nor are they likely to. TCL has its place, and can be a useful language - but not for a task such as this. I'm a Python fan, and I have no doubts in recommending Python as a suitable language for writing an assembler. I'm not a Perl expert, but I know enough about it to know it is also up to the job, but I don't think it would be as good a fit. There are plenty of other options that may or may not be good fits - I don't know them well enough (such as Java, Smalltalk, OCAML, Ruby). And I know several languages that could be used, such as C, C++ and Pascal, but which force you to start much nearer the bottom instead of providing the basic building blocks.

Choosing your language is a much more important step than choosing the implementation for a particular data structure, and is definitely part of the planning phase (do you want to choose languages *after* starting coding?).
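To make the "building blocks" point concrete, here is a minimal Python sketch of the two pieces argued about above: opcode lookup (as a linear scan and as a dict) and splitting a source line into label, mnemonic, and operands. The mnemonics, opcode values, and the `label: MNEMONIC op1, op2 ; comment` syntax are assumptions made up for the example, not taken from any assembler mentioned in the thread.

```python
# Hypothetical opcode table: mnemonics and opcode bytes are invented.
OPCODES = [("NOP", 0x00), ("LDI", 0x10), ("JMP", 0x20), ("ADD", 0x30)]


def lookup_linear(mnemonic):
    """O(n) scan of the table; fine for a few dozen mnemonics."""
    for name, code in OPCODES:
        if name == mnemonic:
            return code
    raise KeyError(mnemonic)


# The same table as a dict: hashed lookup with no extra code to write.
OPCODE_DICT = dict(OPCODES)


def parse_line(line):
    """Split 'label: MNEMONIC op1, op2 ; comment' into its parts.

    Simplification: ';' and ':' are not allowed inside operands.
    """
    text = line.split(";", 1)[0].strip()      # drop the comment
    label = None
    if ":" in text:
        label, text = text.split(":", 1)
        label = label.strip()
        text = text.strip()
    if not text:
        return label, None, []                # blank or label-only line
    parts = text.split(None, 1)
    mnemonic = parts[0].upper()
    operands = []
    if len(parts) > 1:
        operands = [p.strip() for p in parts[1].split(",")]
    return label, mnemonic, operands


label, mnemonic, operands = parse_line("loop: ldi 42 ; load the answer")
print(label, mnemonic, operands, hex(OPCODE_DICT[mnemonic]))
```

Both lookups return the same opcode; with a table this small the choice is about convenience, not speed, which is the point made on both sides above.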
Reply by ●March 9, 2006
On 8 Mar 2006 15:56:44 -0800, "toby" <toby@telegraphics.com.au> wrote:

>
> This is cheating big-time. The Macro-11 assembler is being abused
> to assemble and emit code for a different CPU, or sometimes not
> even for a CPU but for a ROM Sequencer or worse. The macros have
> the same names as the target CPU's op-codes and they simply
> generate the appropriate code, (ab)using the symbol table
> management built into Macro-11.
>
> As a huge benefit you can also use all of the powerful macro
> facilities in Macro-11. Try emulating all of that in lex/yacc.

At least RSX-11 also contained a nice table driven parser (TPARS) in the run time library. Macros were used to define the state tables with all the various transitions.

This parser was initially intended for command line parsing, but it was so versatile that I wrote a cross assembler (for a strange 20 bit test instrument processor). This cross assembler was then used to write a Basic interpreter for that 20 bit processor. The parsing tables became quite complex, due to the oddities of the target processor.

The TPARS was also usable for parsing simple high level languages :-).

Paul
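For readers who have not met a table-driven parser, the general idea can be sketched in Python: a table maps (state, token type) to a next state and an action, and the parser itself is a short loop. This is not TPARS or its macro syntax; the states, token types, action names, and the `label: op arg, arg` line format below are invented for illustration.

```python
import re

TOKEN_RE = re.compile(
    r"\s*(?:(?P<num>\d+)|(?P<name>[A-Za-z_]\w*)|(?P<punct>[:,]))")


def tokenize(line):
    """Yield (type, text) pairs; punctuation uses its own text as the type."""
    line = line.rstrip()
    pos, tokens = 0, []
    while pos < len(line):
        m = TOKEN_RE.match(line, pos)
        if not m:
            raise SyntaxError(f"bad character at column {pos}")
        kind = m.lastgroup
        text = m.group(kind)
        tokens.append((text if kind == "punct" else kind, text))
        pos = m.end()
    tokens.append(("end", ""))
    return tokens


# Actions update the result; 'held' is a name not yet known to be label or op.
def hold(r, t): r["held"] = t
def held_is_label(r, t): r["label"] = r.pop("held")
def held_is_op(r, t): r["op"] = r.pop("held")
def held_is_op_add(r, t): r["op"] = r.pop("held"); r["args"].append(t)
def set_op(r, t): r["op"] = t
def add_arg(r, t): r["args"].append(t)
def nothing(r, t): pass


# state -> token type -> (next state, action)
TABLE = {
    "start": {"name": ("name1", hold), "end": ("done", nothing)},
    "name1": {":": ("after_label", held_is_label),
              "name": ("after_arg", held_is_op_add),
              "num": ("after_arg", held_is_op_add),
              "end": ("done", held_is_op)},
    "after_label": {"name": ("have_op", set_op), "end": ("done", nothing)},
    "have_op": {"name": ("after_arg", add_arg),
                "num": ("after_arg", add_arg),
                "end": ("done", nothing)},
    "after_arg": {",": ("want_arg", nothing), "end": ("done", nothing)},
    "want_arg": {"name": ("after_arg", add_arg),
                 "num": ("after_arg", add_arg)},
}


def parse(line):
    """Drive the state table over one source line."""
    result = {"label": None, "op": None, "args": []}
    state = "start"
    for kind, text in tokenize(line):
        try:
            state, action = TABLE[state][kind]
        except KeyError:
            what = text or "end of line"
            raise SyntaxError(f"unexpected {what!r} in state {state}") from None
        action(result, text)
    return result


print(parse("loop: ldi 42"))
```

All of the grammar lives in `TABLE`, so supporting an odd target mostly means adding rows rather than code, which matches the experience described above of the tables growing complex while the parser stayed the same.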
Reply by ●March 9, 2006
Paul Keinanen <keinanen@sci.fi> writes:

> "toby" <toby@telegraphics.com.au> wrote:
> >
> > This is cheating big-time. The Macro-11 assembler is being abused
> > to assemble and emit code for a different CPU, or sometimes not
> > even for a CPU but for a ROM Sequencer or worse. The macros have
> > the same names as the target CPU's op-codes and they simply
> > generate the appropriate code, (ab)using the symbol table
> > management built into Macro-11.
> >
> > As a huge benefit you can also use all of the powerful macro
> > facilities in Macro-11. Try emulating all of that in lex/yacc.
>
> At least RSX-11 also contained a nice table driven parser (TPARS) in
> the run time library. Macros were used to define the state tables with
> all the various transitions.
>
> This parser was initially intended for command line parsing, but it
> was so versatile that I wrote a cross assembler (for a strange 20 bit
> test instrument processor). This cross assembler was then used to
> write a Basic interpreter for that 20 bit processor. The parsing
> tables became quite complex, due to the oddities of the target
> processor.
>
> The TPARS was also useable for parsing simple high level languages:-).

And then there are meta-assemblers that can generate code for any processor...
Reply by ●March 9, 2006
On Thu, 09 Mar 2006 00:18:45 GMT, Jonathan Kirwan <jkirwan@easystreet.com> wrote:

<snip>
>> This is cheating big-time. The Macro-11 assembler is being abused
>> to assemble and emit code for a different CPU, or sometimes not
>> even for a CPU but for a ROM Sequencer or worse. The macros have
>> the same names as the target CPU's op-codes and they simply
>> generate the appropriate code, (ab)using the symbol table
>> management built into Macro-11.
>><snip>
>
>Macro-11 is pretty good and I've read exactly these cases, back in the
>1970's when I was using Macro-11. I never wrote Macro-11 code for a
>cross-assembler, though. Just heard about it.
>
>I've actually used MASM/ML from Microsoft for such things, though.
>From my vague recollection of Macro-11 macros, MASM/ML's macro
>facilities aren't nearly as general and can be confusing to figure
>out, at times. But the linker will actually punch out a .COM file,
>which is a clean, exact, binary image. MASM/ML will allow you to
>place things in separate segments so that you can, on the fly, place
>things into nicely organized sections which will later be fused
>together as you see fit. (You can generate a .EXE, but you will need
>another tool to 'fix' it up.)

Way back in the 70's I wrote a series of macros for 360 ASM F to cross compile object for an IBM 705. The cute part is that the 705 was a character machine, somewhat like a 14xx. Which meant that the binary addresses from the symbol table had to be converted to 4? 5? character leading zero strings. I have forgotten most of the details by now.

--
ArarghMail603 at [drop the 'http://www.' from ->] http://www.arargh.com
BCET Basic Compiler Page: http://www.arargh.com/basic/index.html

To reply by email, remove the garbage from the reply address.
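The conversion mentioned above, from a binary symbol value to a fixed-width string of digit characters, looks roughly like this in Python. The post leaves the width open (4 or 5), so it is a parameter here, and the function name is hypothetical. This only shows zero-padded decimal rendering; whatever the real 705 address encoding required beyond that is not modeled.

```python
def address_field(address, width=4):
    """Render a binary address as a fixed-width, zero-padded decimal string.

    The width is an assumption; the original post is unsure whether the
    target used 4 or 5 characters.
    """
    if not 0 <= address < 10 ** width:
        raise ValueError(f"address {address} does not fit in {width} digits")
    return f"{address:0{width}d}"


print(address_field(42))          # four characters by default
print(address_field(1234, 5))     # five-character variant
```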