
Portable Assembly

Started by rickman May 27, 2017
On 28.05.2017 at 19:47, Dimiter_Popoff wrote:
> On 28.5.2017 г. 19:45, Stefan Reuther wrote:
>> On 27.05.2017 at 23:31, Dimiter_Popoff wrote:
>>> However, they are by far not what a "portable assembler" - existing
>>> under the name Virtual Processor Assembler in our house - is.
>>> And never will be; like any high level language, C is yet another
>>> phrase book - convenient when you need to do a quick interaction
>>> when you don't speak the language - and only then.
>>
>> So, then, what is a portable assembler?
>
> One which is not tied to a particular architecture, rather to an
> idealized machine model.
So, to what *is* it tied then? What is its *concrete* machine model?
> It makes sense to use this assuming that processors evolve towards
> better, larger register sets - which has been the case last few
> decades. It would be impractical to try to assemble something written
> once for say 68k and then assemble it for a 6502 - perhaps doable but
> insane.
Doable and not insane with C. Actually, you can program the 6502 in C++17.
>> One major variable of a processor architecture is the number of
>> registers, and what you can do with them. On one side of the spectrum,
>> we have PICs or 6502 with pretty much no registers, on the other side,
>> there's things like x86_64 or ARM64 with plenty 64-bit registers. Using
>> an abstraction like C to let the compiler handle the distinction (which
>> register to use, when to spill) sounds like a pretty good idea to me.
>
> Using a phrase book is of course a good idea if you want to conduct
> a quick conversation.
> It is a terrible idea if you try to use the language for years and
> choose to stay confined within the phrases you have in the book.
My point being: if you work at assembler level, that is, with registers,
you will never have anything more than a phrase book. A C compiler can use
knowledge from one phrase^Wstatement and carry it into the next, and it
can use grammar to generate not only "a = b + c" and "x = y * z", but also
"a = b + (y*z)".
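A minimal C sketch of that point (function and variable names are mine,
purely illustrative):

/* The compiler sees both statements at once: it can keep y*z in a
   register (or fold the whole thing into one multiply-add on machines
   that have one) instead of storing x and reloading it, as a literal
   line-by-line "phrase book" translation would have to. */
long fuse(long b, long y, long z)
{
    long x = y * z;   /* "x = y * z" */
    long a = b + x;   /* "a = b + c" becomes "a = b + (y*z)" */
    return a;
}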
>> If you were more close to assembler, you'd either limit yourself to an
>> unuseful subset that works everywhere, or to a set that works only in
>> one or two places.
>
> Like I said before, there is no point to write code which can work
> on any processor ever made. I have no time to waste on that, I just need
> my code to be working on what is the best silicon available. This
> used to be 68k, now it is power. You have to program with some
> constraints - e.g. knowing that the "assembler" (which in reality
> is more a compiler) may use r3-r4 as it wishes and not preserve
> them on a per line basis etc.
I am not an expert in either of these two architectures, but the 68k has 8
data + 8 address registers whereas Power has 32 GPRs. If you work at a
virtual pseudo-assembler level you probably ignore most of your Power. A
classic compiler will happily use as many registers as it finds useful.
The only possible gripe with C would be that it has no easy way to write a
memory cell by number. But a simple macro fixes that.

Stefan
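The macro is not spelled out above; a minimal sketch of what is presumably
meant (the names and the sample address are made up):

#include <stdint.h>

/* Write/read a memory cell by number; volatile keeps the compiler from
   optimizing the access away or reordering it. */
#define POKE8(addr, val)  (*(volatile uint8_t *)(uintptr_t)(addr) = (uint8_t)(val))
#define PEEK8(addr)       (*(volatile uint8_t *)(uintptr_t)(addr))

/* Usage, with a hypothetical register address:
   POKE8(0xFFFF8800u, 0x42);  */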
On 29.5.2017 г. 16:06, David Brown wrote:
> .... I would imagine that
> translating 68K assembly into PPC assembly is mostly straightforward -
> unlike translating it into x86, or even ARM. (The extra registers on
> the PPC give you the freedom you need for converting complex addressing
> modes on the 68K into reasonable PPC code - while the ARM has fewer
> registers available.)
Indeed, having more registers is of huge help. But it is not as
straightforward as it might seem at first glance. While at it, I did a lot
more than just emulate the 68k - power has a lot more on offer and I
wanted to take advantage of it: adding syntax to not touch the CCR - which
the 68k unavoidably does on moves, adds and many others - using the
3-address mode - source1, source2, destination - and having this available
not just for registers but for any addressing mode, etc. If you assemble
plain CPU32 code, the resulting power object code size is about 3.5 times
the native CPU32 code size. If you write with power in mind - e.g. you
discourage all the unnecessary CCR (CR on power) updates - code sizes get
pretty close. I designed in a pass for optimizing that automatically; 1.5
decades later it is still waiting to happen... :-). No need for it
pressing enough to justify deflecting me from more urgent issues, I
suppose.
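To make the CCR point concrete, a small C example with plausible codegen
sketched in the comments (hand-written assumptions, not actual VPA or
compiler output; register assignments are illustrative):

/* 68k forms always update the CCR; Power only touches CR0 when the
 * "record" (dot) forms are used, so a translator that suppresses
 * unneeded CCR updates emits the plain forms:
 *
 *   68k:    move.l d1,d0      ; MOVE always sets N/Z, clears V/C
 *           add.l  d2,d0      ; ADD always sets X/N/Z/V/C
 *
 *   Power:  add    r3,r3,r4   ; one 3-address op, CR left alone
 *           (add. r3,r3,r4 would be the record form that sets CR0)
 */
long sum2(long b, long c)
{
    return b + c;
}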
> If it were /portable/ assembly, then your code that works well for the
> PPC would automatically work well for the 68K.
This is semantics - but since user level 68k code assembles directly, I
think it is fair enough to borrow the word "assembler". It is not what
everyone understands by it every time, of course, but it must have sounded
OK to me back then. I am a practical person and tend not to waste much
time on names as long as they do not look outright ugly or misleading
(well, I might go for "ugly" on purpose of course, but have not done so
for vpa).
> The three key points about assembly, compared to other languages, are
> that you know /exactly/ what instructions will be generated, including
> the ordering, register choices, etc.,
Well yes, if we accept that, then we have to accept that VPA (Virtual
Processor Assembler) is not exactly an assembler. But I think the name
tells you well enough what to expect.
> that you can access /all/ features
> of the target cpu, and that you can write code that is as efficient as
> possible for the target.
That is completely possible with vpa for power; nothing stops you from
using native power opcodes (I use rlwinm and rlwimi quite often, realizing
there might be no efficient way to emulate them - but I do what I can do
best. If I get stuck in a world with only x86 processors which have just
the few original 8086 registers, I'll switch occupation to herding
kangaroos or something. Until then I'll change to a new architecture only
if I see why it is better than the one I use now; for me portability is
just a means, not a goal).
> There is simply no way for this to be portable.
> Code written for the 68k may use complex addressing modes - they need
> multiple instructions in PPC assembly.
Yes, but they run in fewer cycles. Apart from the PC-relative mode - there
is no direct access to the PC on power, it takes 2-3 opcodes just to get
at it - the rest works faster. And, say, the An,Dn.l*4 mode can scale by
factors other than just powers of 2... etc.; it is pretty powerful.
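For a concrete picture of such an addressing-mode mapping, here is an
indexed load in C with plausible codegen for both targets sketched in the
comments (hand-written, and the register assignments are assumptions):

#include <stdint.h>

/* tbl[i], i.e. the 68k scaled-index mode, with the pointer in a0/r3
 * and the index in d1/r4:
 *
 *   68k (CPU32/68020+):  move.l (a0,d1.l*4),d0   ; one instruction
 *
 *   Power:               slwi  r5,r4,2           ; r5 = i * 4
 *                        lwzx  r3,r3,r5          ; load from tbl + r5
 *
 * Two Power opcodes for one 68k instruction, but each is a simple
 * single-cycle operation.
 */
int32_t fetch(const int32_t *tbl, uint32_t i)
{
    return tbl[i];
}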
> If you do this mechanically, you
> will know exactly what instructions this generates - but the result will
> not be as efficient as code that re-uses registers or re-orders
> instructions for better pipelining. Code written for the PPC may use
> more registers than are available on the 68K - /something/ has to give.
Oh yes, backward porting would be quite painful. I do use all the
registers I have - rarely resorting to r4-r7, which are used for
addressing mode calculations, intermediate operands etc.; use one of them
and you have given yourself something to work on when porting later. I
still do it at times when I think it is justified... may well bite me one
day.
> Thus your VPA may be a fantastic low-level programming language (having
> never used it or seen it, I can't be sure - but I'm sure you would not
> have stuck with it if it were not good!). But it is not a portable
> assembly language - it cannot let you write assembly-style code for more
> than one target.
Hmmm - for *any* target, no. But for more than one target, without the
code losing efficiency - it certainly can, if the new target is right (as
was the case with 68k -> power). Basically I have never been after a
"universal assembler"; I just wanted to do what you already know I wanted.
What we call it is of secondary interest to me, to be fair :-).
> Some time it might be fun to look at some example functions, compiled
> for either the 68K or the PPC (or, better still, both) and compare both
> the source code and the generated object code to modern C and modern C
> compilers. (Noting that the state of C compilers has changed a great
> deal since you started making VPA.)
Yes, I would also be curious to see that. Not just a function - as it will
likely have been written in assembly by the compiler author - but some
sort of standard thing, say a base64 encoder/decoder or some vnc server
thing etc. (the vnc server under dps is about 8 kilobytes, I just looked
at it; it does one type of compression (RRE misused as RLE) and raw).

Dimiter

======================================================
Dimiter Popoff, TGI             http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
On 5/29/2017 9:43 AM, Stefan Reuther wrote:
> The only possible gripe with C would be that it has no easy way to write
> a memory cell by number. But a simple macro fixes that.
"Only" gripe? Every language choice makes implicit tradeoffs in abstraction management. The sorts of data types and the operations that can be performed on them are baked into the underlying assumptions of the language. What C construct maps to the NS16032's native *bit* array instructions? Or, the test-and-set capability present in many architectures? Or, x86 BCD data types? Support for 12 or 60 bit integers? 24b floats? How is the PSW exposed? Why pointers in some languages and not others? Why do we have to *worry* about atomic operations in the language in a different way than on the underlying hardware? Why doesn't the language explicitly acknowledge the idea of multiple tasks, foreground/background, etc.? Folks designing languages make the 90-10 (%) decisions and hope the 10 aren't unduly burdened by the wins afforded to the 90. Or, that the applications addressed by the 10 can tolerate the contortions they must endure as a necessary cost to gain *any* of the benefits granted to the 90.
On Mon, 29 May 2017 02:33:47 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

> On 5/27/2017 2:52 PM, Theo Markettos wrote:
>> Dimiter_Popoff <dp@tgi-sci.com> wrote:
>>> The need for portability arises when you have megabytes of
>>> sources which are good and need to be moved to another, better
>>> platform. For smallish projects - anything which would fit in
>>> an MCU flash - porting is likely a waste of time, rewriting it
>>> for the new target will be faster if done by the same person
>>> who has already done it once.
>>
>> Back in the 80s, lots of software was written in assembly.
>
> For embedded systems (before we called them that),
When did the "embedded system" term become popular? Of course, there were
some military systems (such as SAGE) that used purpose-built computers in
the 1950s. In the 1970s the PDP-11/34 was very popular as a single-purpose
computer, and the PDP-11/23 in the 1980s. After that the 8080/Z80/6800
became popular as the low end processors.
On 5/29/2017 1:46 PM, upsidedown@downunder.com wrote:
> On Mon, 29 May 2017 02:33:47 -0700, Don Y
> <blockedofcourse@foo.invalid> wrote:
>
>> On 5/27/2017 2:52 PM, Theo Markettos wrote:
>>> Dimiter_Popoff <dp@tgi-sci.com> wrote:
>>>> The need for portability arises when you have megabytes of
>>>> sources which are good and need to be moved to another, better
>>>> platform. For smallish projects - anything which would fit in
>>>> an MCU flash - porting is likely a waste of time, rewriting it
>>>> for the new target will be faster if done by the same person
>>>> who has already done it once.
>>>
>>> Back in the 80s, lots of software was written in assembly.
>>
>> For embedded systems (before we called them that),
>
> When did the "embedded system" term become popular ?
No idea. I was "surprised" when told that this is what I did for a living (and HAD been doing all along!). I now tell people that I design "computers that don't LOOK like computers" (cuz everyone thinks they KNOW what a "computer" looks like!) "things that you know have a computer *in* them but don't look like the stereotype you think of..."
> Of course, there were some military systems (such as SAGE) that used
> purpose-built computers in the 1950s.
>
> In the 1970s the PDP-11/34 was very popular as a single-purpose
> computer, and the PDP-11/23 in the 1980s. After that the 8080/Z80/6800
> became popular as the low end processors.
The 11s were used a lot as they were reasonably affordable and widely
available (along with folks who could code for them). E.g., the Therac was
11-based. The i4004 was the first real chance to put "smarts" into
something that didn't also have a big, noisy box attached. I recall
thinking the i8080 (and 85) were pure luxury coming from that more
crippled world ("Oooh! Kilobytes of memory!!!")
On 29/05/17 19:02, Dimiter_Popoff wrote:
> On 29.5.2017 г. 16:06, David Brown wrote:
<snipped some interesting stuff about VPA>
>
>> Some time it might be fun to look at some example functions, compiled
>> for either the 68K or the PPC (or, better still, both) and compare both
>> the source code and the generated object code to modern C and modern C
>> compilers. (Noting that the state of C compilers has changed a great
>> deal since you started making VPA.)
>
> Yes, I would also be curious to see that. Not just a function - as it
> will likely have been written in assembly by the compiler author - but
> some sort of standard thing, say a base64 encoder/decoder or some
> vnc server thing etc. (the vnc server under dps is about 8 kilobytes,
> just looked at it. Does one type of compression (RRE misused as RLE) and
> raw).
To be practical, it /should/ be a function - or no more than a few functions. (I don't know why you think functions might be written in assembly by the compiler author - the compiler author is only going to provide compiler-assist functions such as division routines, floating point emulation, etc.) And it should be something that has a clear algorithm, so no one can "cheat" by using a better algorithm for the job.
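Something along these lines - a minimal, deliberately unoptimized sketch -
is presumably the kind of fixed-algorithm function meant: no room to
"cheat" with a better algorithm, and small enough to compare the generated
code instruction by instruction:

#include <stddef.h>

/* Plain base64 encoder: a well-known, fixed algorithm, so different
   compilers (or VPA) can be compared on code generation alone. */
static const char B64[64] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

size_t base64_encode(const unsigned char *src, size_t len, char *dst)
{
    size_t o = 0;

    while (len >= 3) {                    /* whole 24-bit groups */
        dst[o++] = B64[src[0] >> 2];
        dst[o++] = B64[((src[0] & 0x03u) << 4) | (src[1] >> 4)];
        dst[o++] = B64[((src[1] & 0x0Fu) << 2) | (src[2] >> 6)];
        dst[o++] = B64[src[2] & 0x3Fu];
        src += 3;
        len -= 3;
    }
    if (len == 1) {                       /* one trailing byte */
        dst[o++] = B64[src[0] >> 2];
        dst[o++] = B64[(src[0] & 0x03u) << 4];
        dst[o++] = '=';
        dst[o++] = '=';
    } else if (len == 2) {                /* two trailing bytes */
        dst[o++] = B64[src[0] >> 2];
        dst[o++] = B64[((src[0] & 0x03u) << 4) | (src[1] >> 4)];
        dst[o++] = B64[(src[1] & 0x0Fu) << 2];
        dst[o++] = '=';
    }
    return o;                             /* characters written */
}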
On 2017-05-27 3:39 PM, rickman wrote:
> Someone in another group is thinking of using a portable assembler to
> write code for an app that would be ported to a number of different
> embedded processors including custom processors in FPGAs. I'm
> wondering how useful this will be in writing code that will require
> few changes across CPU ISAs and manufacturers.
>
> I am aware that there are many aspects of porting between CPUs that
> is assembly language independent, like writing to Flash memory. I'm
> more interested in the issues involved in trying to use a universal
> assembler to write portable code in general. I'm wondering if it
> restricts the instructions you can use or if it works more like a
> compiler where a single instruction translates to multiple target
> instructions when there is no one instruction suitable.
>
> Or do I misunderstand how a portable assembler works? Does it
> require a specific assembly language source format for each target
> just like using the standard assembler for the target?
I have done a few portable assemblers of the general type you're
describing. There are two approaches.

One is to write macros for the instruction set of the target processor and
effectively assemble processor A into processor B with macros. This might
work for architecturally close processors, but even then it has
significant problems. To give an example, 6805 to 6502: the carry
following the subtract of 0 - 0 is different.

The other approach, which I have used and which does work reasonably well,
is to assemble processor A into a functionally rich intermediate code and
compile the intermediate code into processor B. The resulting code is
quite portable between the processors and it is capable of supporting
diverse architectures quite well. I have done mostly 8-bit processors this
way: 6808 (3 major families) to PIC (many varieties: 12, 14, 14x, 16
families). In all cases I set up the translation so I could go either way.
I have also targeted some 16, 24, and 32 bit processors.

For pure code this has worked quite well, with a low penalty for the
translation. Application code usually has processor-specific I/O, which
can actually be detected by the translator but generally needs some hand
intervention.

w..
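The 6805-vs-6502 carry remark can be made precise with a small
self-checking sketch (flag conventions as documented for the two families:
the 6502 carry after a subtract is an inverted borrow, the Motorola carry
is the borrow itself):

#include <stdint.h>
#include <stdio.h>

/* Carry flag after "A - B" under each architecture's own convention. */
static unsigned carry_6502(uint8_t a, uint8_t b)
{
    return a >= b;    /* 6502 SBC/CMP: C = 1 means "no borrow" */
}

static unsigned carry_6805(uint8_t a, uint8_t b)
{
    return a < b;     /* 6805 SUB/SBC: C = 1 means "borrow occurred" */
}

int main(void)
{
    /* 0 - 0: the same subtract leaves opposite carry flags, which is
       why one-instruction-to-one-macro translation goes wrong. */
    printf("0-0: 6502 C=%u, 6805 C=%u\n",
           carry_6502(0, 0), carry_6805(0, 0));   /* prints 1 vs 0 */
    return 0;
}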
On 30.5.2017 г. 00:13, David Brown wrote:
> On 29/05/17 19:02, Dimiter_Popoff wrote:
>> On 29.5.2017 г. 16:06, David Brown wrote:
>
> <snipped some interesting stuff about VPA>
>
>>
>>> Some time it might be fun to look at some example functions, compiled
>>> for either the 68K or the PPC (or, better still, both) and compare both
>>> the source code and the generated object code to modern C and modern C
>>> compilers. (Noting that the state of C compilers has changed a great
>>> deal since you started making VPA.)
>>
>> Yes, I would also be curious to see that. Not just a function - as it
>> will likely have been written in assembly by the compiler author - but
>> some sort of standard thing, say a base64 encoder/decoder or some
>> vnc server thing etc. (the vnc server under dps is about 8 kilobytes,
>> just looked at it. Does one type of compression (RRE misused as RLE) and
>> raw).
>>
>
> To be practical, it /should/ be a function - or no more than a few
> functions. (I don't know why you think functions might be written in
> assembly by the compiler author - the compiler author is only going to
> provide compiler-assist functions such as division routines, floating
> point emulation, etc.) And it should be something that has a clear
> algorithm, so no one can "cheat" by using a better algorithm for the job.
I am pretty sure I have seen - or read about - compiler generated code
where the compiler detects what you want to do and inserts some prewritten
piece of assembly code. It was something about CRC or a tcp checksum, I am
not sure - and it was something someone said, I don't know it from direct
experience. But if the compiler does this it will be obvious enough.

Anyway, a function would do - if complex and long enough to be close to
real life, i.e. a few hundred lines. But I don't see why we should not
compare already written stuff. I just checked again on that vnc server for
dps - not 8k, closer to 11k (the 8k I saw was a half-baked version, no
keyboard tables inside it etc.; the complete version also includes a
screen mask to allow it to ignore mouse clicks at certain areas, that sort
of thing). Add to it some menu (it is command line option driven only) - a
much more complex menu than the windows and android RealVNC have - and it
adds up to 25k. Compare this to the 350k exe for windows or to the 4M for
Android (and the android one does only raw...) and the picture is clear
enough, I think.

Dimiter

======================================================
Dimiter Popoff, TGI             http://www.tgi-sci.com
======================================================
http://www.flickr.com/photos/didi_tgi/
On Mon, 29 May 2017 18:43:01 +0200, Stefan Reuther
<stefan.news@arcor.de> wrote:
> On 28.05.2017 at 19:47, Dimiter_Popoff wrote:
>> On 28.5.2017 г. 19:45, Stefan Reuther wrote:
>>> On 27.05.2017 at 23:31, Dimiter_Popoff wrote:
>
> I am not an expert in either of these two architectures, but the 68k has 8
> data + 8 address registers whereas Power has 32 GPRs. If you work at a
> virtual pseudo-assembler level you probably ignore most of your Power.
Unless it uses a push/pop architecture like Java bytecode, which can get
'assembled' to any number of registers.

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail client: http://www.opera.com/mail/
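A minimal sketch of the idea in the post above (a hypothetical
three-instruction IR, not real JVM bytecode handling): because the operand
stack depth is known at every point during translation, stack slot i can
simply become register i, for whatever register count the target has:

#include <stdio.h>

/* Hypothetical stack IR: push a variable, add the top two, pop to a
   variable. */
enum op { OP_PUSH, OP_ADD, OP_POP };
struct insn { enum op op; int var; };

/* Translate by tracking the virtual stack depth: slot i -> register ri. */
static void translate(const struct insn *code, int n)
{
    int depth = 0;
    for (int i = 0; i < n; i++) {
        switch (code[i].op) {
        case OP_PUSH:
            printf("  load  r%d, var%d\n", depth, code[i].var);
            depth++;
            break;
        case OP_ADD:
            depth--;   /* consume two slots, produce one */
            printf("  add   r%d, r%d, r%d\n", depth - 1, depth - 1, depth);
            break;
        case OP_POP:
            depth--;
            printf("  store r%d, var%d\n", depth, code[i].var);
            break;
        }
    }
}

int main(void)
{
    /* a = b + c:  push b; push c; add; pop a */
    const struct insn prog[] = {
        { OP_PUSH, 1 }, { OP_PUSH, 2 }, { OP_ADD, 0 }, { OP_POP, 0 }
    };
    translate(prog, 4);   /* emits 4 register ops, no stack traffic */
    return 0;
}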
David Brown <david.brown@hesbynett.no> writes:

> Writing a game involves a great deal more than just the coding.
> Usually, the coding is in fact just a small part of the whole effort -
> all the design of the gameplay, the storyline, the graphics, the music,
> the algorithms for interaction, etc., is inherently cross-platform. The
> code structure and design is also mostly cross-platform. Some parts
> (the graphics and the music) need adapted to suit the limitations of the
> different target platforms. The final coding in assembly would be done
> by hand for each target.
I've sometimes wondered what kind of development systems were used for
those early 1980s home computers. Unreliable, slow and small storage media
would've made it pretty awful to do development on the target systems.
I've read Commodore used a VAX for ROM development, so they probably had a
cross-assembler there, but other than that I have not much idea.