Reply by Albert van der Horst February 1, 2010
In article <Noe8n.32589$Ym4.6873@text.news.virginmedia.com>,
bartc <bartc@freeuk.com> wrote:
> >"David Brown" <david@westcontrol.removethisbit.com> wrote in message >news:4b615962$0$3853$8404b019@news.wineasy.se... >> On 27/01/2010 21:15, Jon Kirwan wrote: > >>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html > >> Thanks - that makes a /huge/ difference! Of course, now I just need the >> time to read it. As I've only had a brief look at it (I read most of the >> chapter on thunks), I may be misjudging it here, but I have difficulty >> seeing the relevance of the book at this time. I can see the point of a >> DOS assembly book long ago, and I can see the point of a general book on >> assembly for multiple architectures. But I can't think of any reason >> (other than for fun) why anyone would write software in assembly for the >> x86 - and certainly not for Windows. There are certainly times when you >> might want to /use/ assembly on an x86 - speeding up critical loops, for >> example - but not to write entire programs. The HLA concept strikes me as >> a waste of time in this day and age. > >Assembler is 100% flexible compared to any HLL, even C, so sometimes it >makes life easier. > >While you probably wouldn't write applications in it, there are types of >programs which do have a big proportion of assembler (in my case, these are >interpreters). > >Hyde's HLA is not for everyone, but I use of form of HLA (inline assembler >within a HLL) which makes writing large amounts of assembler much less >painful.
That is not HLA. HLA is IMHO a bizarre concept: it is abstraction added on top of an assembler, and assembler is the pinnacle of concreteness. If you go for assembler, you should complement it with macros. Macros may be a pain to use, but they are extremely flexible, and the result remains horribly concrete.

I don't think this is just an opinion, and I would add it to Wikipedia, but I have some stakes here.

1] I have a Forth system where macros let me keep the same source for 16/32/64 bit and for linux/msdos/mswindows/standalone. It is legitimate for a language implementation to be written in assembler (maybe complemented with parts written in the language itself). (The alternative is using C as a portable assembler, but that leaves C itself to be written basically in assembler.)

2] I also have a (dis)assembler that doesn't hide the difference (i86) between move-from-register-a-to-b and move-to-register-b-from-a for MOV BX,AX (no, not LEA), which makes it suitable for reverse engineering stealth viruses.
>
>And, if you are working on a language product which generates assembler
>code, then you need to understand how it works even if you are not manually
>writing the code yourself.
This notion, too, should be more prominent on the Wikipedia pages. Sometimes they give the impression that those using assembler are behind the times, rather than supplying the foundation for the whole IT industry.
>
>--
>Bartc
>
--
Albert van der Horst, UTRECHT, THE NETHERLANDS
Economic growth -- being exponential -- ultimately falters.
albert@spe&ar&c.xs4all.nl &=n http://home.hccnet.nl/a.w.m.van.der.horst
Reply by Walter Banks January 28, 2010

bartc wrote:

> Assembler is 100% flexible compared to any HLL, even C, so sometimes it
> makes life easier.
>
> While you probably wouldn't write applications in it, there are types of
> programs which do have a big proportion of assembler (in my case, these are
> interpreters).
>
> Hyde's HLA is not for everyone, but I use a form of HLA (inline assembler
> within a HLL) which makes writing large amounts of assembler much less
> painful.
Embedded assembly in a C program becomes a lot more manageable when the assembler and C compiler share the same symbol table. Getting rid of address mangling between C and asm, and giving the asm full access to C's symbol table, makes it a lot easier to add embedded assembler to an application.

Regards,

Walter..

--
Walter Banks
Byte Craft Limited
http://www.bytecraft.com
Reply by David Brown January 28, 2010
On 28/01/2010 12:19, bartc wrote:
> > "David Brown" <david@westcontrol.removethisbit.com> wrote in message > news:4b615962$0$3853$8404b019@news.wineasy.se... >> On 27/01/2010 21:15, Jon Kirwan wrote: > >>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html > >> Thanks - that makes a /huge/ difference! Of course, now I just need >> the time to read it. As I've only had a brief look at it (I read most >> of the chapter on thunks), I may be misjudging it here, but I have >> difficulty seeing the relevance of the book at this time. I can see >> the point of a DOS assembly book long ago, and I can see the point of >> a general book on assembly for multiple architectures. But I can't >> think of any reason (other than for fun) why anyone would write >> software in assembly for the x86 - and certainly not for Windows. >> There are certainly times when you might want to /use/ assembly on an >> x86 - speeding up critical loops, for example - but not to write >> entire programs. The HLA concept strikes me as a waste of time in this >> day and age. > > Assembler is 100% flexible compared to any HLL, even C, so sometimes it > makes life easier. > > While you probably wouldn't write applications in it, there are types of > programs which do have a big proportion of assembler (in my case, these > are interpreters). > > Hyde's HLA is not for everyone, but I use of form of HLA (inline > assembler within a HLL) which makes writing large amounts of assembler > much less painful. > > And, if you are working on a language product which generates assembler > code, then you need to understand how it works even if you are not > manually writing the code yourself. >
I am not saying there is no place for assembly - for small systems, assembly can still be a good choice (and I have done a lot of assembly programming on small systems through the years). There are also parts of large systems that are best done in assembly. And of course you should understand assembly when working with embedded systems, and as you say, a compiler writer is going to have to be an assembly expert.

But the days of writing large applications in x86 assembly for PCs (this book is targeting x86 assembly for Windows) are long gone, bar a few specialist applications or keen enthusiasts.
Reply by Jon Kirwan January 28, 2010
On Thu, 28 Jan 2010 11:19:09 GMT, "bartc" <bartc@freeuk.com>
wrote:

> >"David Brown" <david@westcontrol.removethisbit.com> wrote in message >news:4b615962$0$3853$8404b019@news.wineasy.se... >> On 27/01/2010 21:15, Jon Kirwan wrote: > >>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html > >> Thanks - that makes a /huge/ difference! Of course, now I just need the >> time to read it. As I've only had a brief look at it (I read most of the >> chapter on thunks), I may be misjudging it here, but I have difficulty >> seeing the relevance of the book at this time. I can see the point of a >> DOS assembly book long ago, and I can see the point of a general book on >> assembly for multiple architectures. But I can't think of any reason >> (other than for fun) why anyone would write software in assembly for the >> x86 - and certainly not for Windows. There are certainly times when you >> might want to /use/ assembly on an x86 - speeding up critical loops, for >> example - but not to write entire programs. The HLA concept strikes me as >> a waste of time in this day and age. > >Assembler is 100% flexible compared to any HLL, even C, so sometimes it >makes life easier. > >While you probably wouldn't write applications in it, there are types of >programs which do have a big proportion of assembler (in my case, these are >interpreters). > >Hyde's HLA is not for everyone, but I use of form of HLA (inline assembler >within a HLL) which makes writing large amounts of assembler much less >painful. > >And, if you are working on a language product which generates assembler >code, then you need to understand how it works even if you are not manually >writing the code yourself.
This last paragraph makes an excellent point, regardless of how one may take the rest of what you say (which I also consider well-spoken.) Jon
Reply by Jon Kirwan January 28, 2010
On Thu, 28 Jan 2010 10:30:58 +0100, David Brown
<david@westcontrol.removethisbit.com> wrote:

>On 27/01/2010 21:15, Jon Kirwan wrote: >> On Wed, 27 Jan 2010 14:26:35 +0100, David Brown >> <david@westcontrol.removethisbit.com> wrote: >> >>> On 27/01/2010 00:14, Jon Kirwan wrote: >>>> On Tue, 26 Jan 2010 23:31:15 +0100, David Brown >>>> <david.brown@hesbynett.removethisbit.no> wrote: >>>> >>>>> Jon Kirwan wrote: >>>>>> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown >>>>>> <david@westcontrol.removethisbit.com> wrote: >>>>>> >>>>>>> On 25/01/2010 21:42, Jon Kirwan wrote: >>>>>>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown >>>>>>>> <david@westcontrol.removethisbit.com> wrote: >>> >>>>> <snip> >>>>> >>>>>>>> As an aside, I have a lot of other things I'd like in c or >>>>>>>> c++ which just aren't there. For example, I dearly miss >>>>>>>> having access to thunking semantics in c or c++ (which does >>>>>>>> NOT break the c/c++ program model in any way, shape, or form >>>>>>>> and could easily be implemented as part of either language >>>>>>>> with no dire impacts at all. I might use this for efficient >>>>>>>> iterators (don't imagine that I'm talking about std library >>>>>>>> iterators here, which are somewhat similar in use but in no >>>>>>>> way similar in their implementation details -- they are much >>>>>>>> less efficient), so also do I miss this. There is no good >>>>>>>> reason I can think of not to have it and its utility is >>>>>>>> wonderful. (I'd be so happy to talk about it, at some point, >>>>>>>> as the examples are excellent and easily shown.) >>>>>>> What do you mean by "thunking" in this context? The term has several >>>>>>> meanings, as far as I know. If your answer is going to take more than a >>>>>>> dozen lines (you are better known for your in-depth explanations than >>>>>>> your short summaries!), it should probably be in its own thread. >>>>>> >>>>>> There are some excellent discussions already available. 
See, >>>>>> for example, the Metaware C (and Pascal) compiler manuals and >>>>>> their implemention of an iterator semantic, as well as their >>>>>> extensive discussion (with well-made examples) of its >>>>>> abundant benefits. There is also a somewhat different, but >>>>>> also interesting discussion made by Randall Hyde in The Art >>>>>> of Assembly manuals he generously put together some years >>>>>> back. (Don't worry yourself thumbing through Microsoft's use >>>>>> of the term.) >>>>>> >>>>>> ... >>>>>> >>>>>> The following case below is not nearly as well thought out, >>>>>> syntax wise, as Metaware's implementation (in other words, >>>>>> don't mistake it as a complete syntax designed by experts) >>>>>> but it may get the seed of a point across. >>>>>> >>>>>> for ( p in Primes( 20, 70 ) ) >>>>>> printf( "%d\n", p ); >>>>>> >>>>>> The code for Primes() might be, in short-hand, something like >>>>>> this below. (Please excuse my abuse of basic knowledge about >>>>>> prime numbers by using an increment by 1 in the for loop or >>>>>> any other tests for starting or ending on an even number only >>>>>> because I want to keep the code with as few lines as is >>>>>> reasonable to make the point. The idea is the point, not the >>>>>> implementation here.) >>>>>> >>>>>> integer Primes( int a, int b ) { >>>>>> int i; >>>>>> for ( i= a; i<= b; ++i ) >>>>>> if ( IsPrime( i ) ) >>>>>> yield( i ); >>>>>> return; >>>>>> } >>>>>> >>>>> >>>>> Ah, what you are talking about is often called a generator (for example, >>>>> in Python or JavaScript), or perhaps a closure. A generator is somewhat >>>>> like a lazily evaluated list, although it could be generalised (for >>>>> example, do the parameter values have to be the same for repeat calls as >>>>> they are for the initial call?). >>>>> >>>>>> I'm intentionally being brief, as well as leaving out a short >>>>>> discussion of each line. Partly, because longer examples may >>>>>> interfere with the central points. 
Partly, because you >>>>>> shouldn't need any discussion and should be able to almost >>>>>> instantly see what the above code "means" without that extra. >>>>>> It's in a form that should be plain without a manual. >>>>>> >>>>>> The place to focus isn't so much on examining Primes(), but >>>>>> instead more by imagining a wide variety of the types of >>>>>> for() loops which may require the mechanism itself. (Not my >>>>>> poor example, which may or may not be useful to anyone.) >>>>>> >>>>>> For example, you might prefer to imagine that Primes() is >>>>>> instead a routine that yields all the nodes of some arbitrary >>>>>> tree or graph using some very specific walking mechanism. If >>>>>> you use your imagination and let it take you for a ride, then >>>>>> perhaps the point may be clarified. >>>>>> >>>>>> If that gets you towards where I'm pointing, then the next >>>>>> question is how would you implement this in assembly code, >>>>>> consistent with c compilers you are aware of and in such a >>>>>> way that does _not_ break an existing linker in the process? >>>>>> >>>>>> On the other hand, if this doesn't do it for you at all -- if >>>>>> in short, your imagination isn't moved by those examples >>>>>> beyond the short distance I walked with them -- then let me >>>>>> commend again Metaware's c/pascal implementations and Hyde's >>>>>> AofA documentation before further discussion continues. >>>>>> >>>>>> But less than the above made my imagination literally spin >>>>>> with ideas when I first came across it. So maybe the above >>>>>> is enough. I hope so. It's also closely connected in a >>>>>> round about way to the idea of nested functions, aka Pascal. >>>>>> (If you see the connection, then I think you've probably got >>>>>> the larger picture.) >>>>>> >>>>>> Jon >>>>> >>>>> I've used generators in Python - they are a very nice way to solve some >>>>> kinds of problems. 
Unfortunately, they are not easy to implement >>>>> cleanly in a stack-based language because a general implementation >>>>> requires that the generator (or "thunk", if you prefer) has its own >>>>> stack for arbitrary local variables and state. Thus most languages that >>>>> implement generators use garbage collection. >>>> >>>> You are way off the reservation, already. Let me suggest you >>>> think about the _implementation_ for a moment. Maybe that >>>> will clarify what I'm pointing towards. For one thing, >>>> garbage collection has _NOTHING_ whatever to do with it. If >>>> you think so, you are far off-point. >>>> >>>>> You can get some of the features of generators in C++ using a class to >>>>> encapsulate the generator's state in class data. The newer lambda >>>>> syntax makes it a little neater, and comes somewhat closer to generators >>>>> or closures - but there are limitations. You can't implement them in >>>>> general without substantial helper code (as seen in boost's libraries) >>>>> or a major change in the structure of the language (including garbage >>>>> collection). >>>>> >>>>> In short, generators (and closures) are a very nice high-level concept - >>>>> you need a high level language (or, somewhat ironically, assembly) to >>>>> use them, and C / C++ are not suitable. >>>> >>>> I recommend that reading I suggested. Take the AofA one >>>> first. It's easier to get ahold of. It gets into the >>>> details of implementation, but does NOT deal with it as a c >>>> level concept. So you will have to fend for yourself there >>>> until you can get ahold of Metaware docs. (Or, I might be >>>> tempted to copy some of them for this purpose.) >>>> >>> >>> I don't have these books and manuals you are referring to, nor do I have >>> easy access to them. If you have web links of interest then I'll >>> happily look at them - but I am not going to find and order books just >>> to read a few pages. 
This discussion is interesting, but there's a >>> limit to what is a practical and appropriate use of time and money for a >>> discussion. >> >> Go here: >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html >> >> Then download and read all of Volume 5. >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/Volume5.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/Thunks.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/Iterators.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/Coroutines.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/ParameterImplementation.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/LexicalNesting.pdf >> http://webster.cs.ucr.edu/AoA/Windows/PDFs/V5Questions.pdf >> >> I apologize for not doing this earlier. I had expected that >> you already knew about AofA and it's availability on the web. >> Had I known you didn't know about it, I would have >> immediately provided you the links. Again, my sincere >> apologies for not doing this earlier. >> > >Thanks - that makes a /huge/ difference! Of course, now I just need the >time to read it. As I've only had a brief look at it (I read most of >the chapter on thunks), I may be misjudging it here, but I have >difficulty seeing the relevance of the book at this time. I can see the >point of a DOS assembly book long ago, and I can see the point of a >general book on assembly for multiple architectures. But I can't think >of any reason (other than for fun) why anyone would write software in >assembly for the x86 - and certainly not for Windows. There are >certainly times when you might want to /use/ assembly on an x86 - >speeding up critical loops, for example - but not to write entire >programs. The HLA concept strikes me as a waste of time in this day and >age. > >Having said that, some of the concepts (such as in the chapters you have >indicated) are interesting and have wider applications. 
Coroutines are >useful devices - it's just that the implementation details of how to use >them with HLA x86 assembly are irrelevant to reality. Had the author >shown how to use them in C, Java, Python, or even in an artificial HLL, >it would have been more useful. > > >Anyway, now I see what you mean by the term "thunk" - and it is clear >from the book that these are limited devices that are basically >equivalent to a C++ class with initialisation of private data values and >a single method (or alternatively a C function that takes a struct >pointer). Useful, but hardly revolutionary. Your proposed syntax for >them in C is, however, neat and elegant - that would be a useful >addition to the C language. > >>>> Anyway, I can see I've sent you spinning in the wrong >>>> direction. Take a breath, read AofA on the topic of thunks >>>> and the nearby related chapters to it. That should provide >>>> an idea about implementation. Not the _whole_ idea, by the >>>> way. As it might be done in c, it involves the concept of >>>> nested functions (which you clearly don't yet see) without >>>> the use of the specific syntax you are used to seeing for >>>> them (it's entirely hidden at the c language level, but >>>> explicit at the assembly level.) If you _see_ this much, we >>>> are probably on the same page. >>> >>> Nested functions are perfectly possible in some extensions to C - in >>> particular, gcc supports them (since gcc also supports Ada, which has >>> nested functions, much of the gcc structure already supports nested >>> functions, and thus the C and C++ front-ends can get them almost for free). >> >> Yes, but as a general rule I don't always have the option of >> using gcc. Customers sometimes already have existing tools >> they want used, for example. There are other reasons, too. >> So it's not a general solution. Just an interesting one. >> > >Agreed. 
I use various gcc extensions if I think they improve the code >(with the emphasis here on improving the source code rather than the >target code - that's a bonus). I haven't used nested functions - they >often make code less readable because it is often unclear where >different functions start and end. > >>> Nested functions, C++ classes, the new C++ lambda syntax, etc., are all >>> ways to implement a limited form of generator or iterator. Compiler >>> extensions can be used to make a nicer syntax, and to automate some of >>> the manual work involved. But without some sort of multiple stack >>> system or garbage collection, you have serious limitations. I don't >>> mean to say that these ideas are not useful despite the limitations - >>> just that you cannot add proper flexible generators to a language with >>> the sort of structure of C or C++ without fundamental changes to the way >>> the language works - the compiler would need to be free to allocate (and >>> free) dynamic memory as needed, rather than through explicit malloc / >>> new calls. >>> >>> It could well be that what you call "thunking" really means "generators >>> with various limitations", in which case you are right that garbage >>> collection is not needed, and it's reasonably easy to figure out several >>> good implementations. But the term "thunking" is not a well known or >>> well-defined expression, and is used in many different ways by different >>> people - I have no idea how the author of a particular book you've read >>> happens to use it. >> >> Hopefully, the above will tell you more. 
>>
>>> To look at some more general generators, and see why they can be used
>>> much more freely in a language like Python than they can in C or C++,
>>> let's vary your Primes function, using Python syntax so that we have
>>> actual working code:
>>>
>>> def IsPrime(i) :
>>>     if i < 2 :
>>>         return False
>>>     for j in range(2, int(math.sqrt(i)) + 1) :
>>>         if (i % j) == 0 :
>>>             return False
>>>     return True
>>>
>>> def Primes(a, b) :
>>>     i = a
>>>     if (i == 2) :
>>>         yield i
>>>     if (i % 2 == 0) :
>>>         i = i + 1
>>>     while ( i <= b ) :
>>>         if (IsPrime(i)) :
>>>             yield i
>>>         i = i + 1
>>>     return
>>>
>>> for p in Primes(1, 20) :
>>>     print p
>>>
>>> To make this implementation of Primes work, the Primes closure has to
>>> include information about where the execution currently is in the Primes
>>> function, and in general it must also track any local variables. This
>>> becomes increasingly difficult for more complex functions, especially
>>> when doing it manually - you have to define a struct (or C++ class) to
>>> hold all the local data, as well as a current "state" which is used for
>>> jumping to the correct re-entry point on later calls. A language
>>> extension could hide and automate much of this, however.
>>
>> In fact, that's what is vital. In the implementation done by
>> Metaware's compilers, it was very well handled and the
>> implementation was quite general and nestable to any depth
>> without the programmer worrying over details such as that.
>> It's all simply kept as stack frame contexts, just as normal
>> functions do. The difference is that a thunk is used,
>> instead, to move back and forth in order to preserve the
>> stack context while the iterator remains "live." Once the
>> iterator completes, though, the stack is unwound in the usual
>> way and the context disappears just as you would expect for
>> any function call.
>> > >I agree here that such a syntax and compiler-aided handling of the >details would give you a very nice way to use these "thunks" - much more >convenient that doing things manually in C or C++. I suspect you could >get a fair way with "normal C" using a system similar to Adam Dunkels' >protothreads - but integrating it into the language would be best. > >>> If you were using such generators in a real program, you would want to >>> use structured and modular programming - and then things get difficult. >> >> Things do not get difficult. I've used metaware's compiler >> tools and there was NO difficulty involved. It's _exactly_ >> like using c, except you've got a wonderful additional >> semantic to handle, in beautiful and efficient ways, concepts >> like walking graphs. The idea of "data hiding" is expanded >> to also now include "algorithm hiding," but in a very light >> weight fashion that is entirely consistent with the c >> worldview. >> >>> To use generators in a stack-based language, you would have to >>> allocate a structure containing all the local data on the caller >>> function's stack - that means you (either the programmer figuring out >>> the closure data manually, or the extended compiler) need access to the >>> implementation when declaring the generation and using it. With a >>> garbage collecting language, the generator itself would allocate space >>> on the heap as needed - the caller need not know anything about the details. >> >> Got it. I understand your point now about garbage collection >> -- makes sense. But the way Metaware handles it is beautiful >> and doesn't require any of that. It's entirely handled >> within the standard c style program model with a single stack >> and all the usual, normal stack frame elements. The body of > >The Metaware implementation, as far as I can see, is limited to >situations where the thunk's frame can be allocated on the stack (or >possibly as a statically allocated region). 
That is certainly the >situation described in AofA. That is, of course, an entirely reasonable >limitation for an extension to C. > >My view of such concepts has come down from higher-level languages like >Python (and also functional programming languages), in which you have >much more general capability in how you work with function-like objects >and closures. From that angle, these "thunks" look limited, because you >need a compiler and run-time that handles dynamic memory (typically some >sort of garbage collection, but that's not absolutely necessary) to >implement the capabilities I assume. But when you are thinking of these >as an upwards extension of C, I can see these being a very useful >addition to the language. > >> a for loop is placed into a separate, nested function within >> the body of the enclosing function. The iterator is called >> using all the usual means, but includes a pointer to the >> body. The iterator may itself call any number of other >> functions, as well as other iterators if it likes, which may >> be nested down the stack to any depth you want. When a yield >> takes place, it is really a call to the for-body nested >> function but with the stack frame pointer set to the >> enclosing function so all the usual local variables are >> appropriately accessible off of the base pointer reference >> that all c compilers may normally use. The nested function >> returns rather normally, restoring the frame back to the down >> stream end of the stack. If the for-body temporarily stores >> on the stack, it does so at the end of course and obviously >> must restore it before returning. But that's just basic, >> anyway. >> >>> You start getting really high-level programming when you can pass >>> generators around as parameters and return values. 
This is something >>> that cannot be done with a stack model - if a function returns a >>> generator (or any function which requires closure data), the closure >>> data must exist even after the calling stack frame has exited. There >>> are ways to implement this without a general garbage collection facility >>> (for example, a pointer to a clean-up function could be passed up or >>> down the call chain while the closure itself is on the heap). But >>> basically, complex function manipulation like this needs more advanced >>> automatic control of the memory that you get in C or C++. >> >> So let me think about this for a second. Passing a generator >> would involve being able to return not only a pointer to >> code, but also its entire current context (and any such >> context of all activation records still active at the time?) >> In other words, it's not just a generator at its initial >> point, but one that may have already been used for a bit but >> hasn't yet completed and so it can be returned to a caller >> for more continued use? Interesting, and I gather the >> additional value here. >> > >That's correct. You can see here how this requires the generator's >local frame to remain valid after the function that created it has >exited - that means it has to exist outside the main stack. Thus for >this sort of thing to be handled directly by the language and the >compiler, rather than through explicit "new" or "malloc" calls, the >compiler has to have a direct understanding and control of dynamic memory. > >> However, as an _embedded_ programmer usually working on tiny >> micros, not workstation-competent board-level systems, I'm >> focused upon very modest but very useful extensions where I >> _know_ I have good application and where I don't have to pay >> for it with a significant change to the existing models I >> have grown to know well and fully understand and trust. 
>> > >I agree here - and I would appreciate the addition to C of the sort of >capabilities you have been describing. For embedded systems, it is >important to be able to understand the implementation for the code you >write, and that is possible for "thunks" as you have described them. On >a PC, it (typically) doesn't matter if the software takes a few extra MB >of run space, and runs through a byte code virtual machine - thus I >program in Python and take advantage of the language's power to write >shorter code. > >> That said, I'd be interested in seeing how to implement >> something like that which would work in ways where the >> run-time execution duration is entirely predictable and >> invariant (knowing, obviously, the initial conditions for the >> generator.) I think you hint towards this, but I'd need to >> see a specific implementation. >> >> In the meantime, you might look at the PDFs I've referred you >> towards. They aren't that long and are quite detailed. They >> do NOT show you the implementation used by Metaware, but I >> can talk about that. >> >> Jon
Sweet. Now if we can just convince those c-standard folks!

By the way, just so you know, Metaware's founder (one of them, at least) was Dr. Frank DeRemer. He is well known for his Ph.D. thesis on LALR parsing, "Practical Translators for LR(k) Languages," MIT, Cambridge, Massachusetts, 1969. He and Dr. Tom Pennello went on to write some tools for compiler compilers, and an article called "Efficient Computation of LALR(1) Look-Ahead Sets," TOPLAS, vol. 4, no. 4, October 1982 -- which was around the time, I think, that Metaware was becoming a reality of sorts. I very much enjoyed my conversations with them and learned a few things (especially from Tom) back around that time. They were generous with their time and help and willingness to teach.

If you wonder how a thesis that doesn't use LALR in its title is about LALR parsing, take a look at the wiki page here:

http://en.wikipedia.org/wiki/LALR_parser_generator

Dr. DeRemer invented LALR.

Jon
Reply by bartc January 28, 2010
"David Brown" <david@westcontrol.removethisbit.com> wrote in message 
news:4b615962$0$3853$8404b019@news.wineasy.se...
> On 27/01/2010 21:15, Jon Kirwan wrote:
>> http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html
> Thanks - that makes a /huge/ difference! Of course, now I just need the
> time to read it. As I've only had a brief look at it (I read most of the
> chapter on thunks), I may be misjudging it here, but I have difficulty
> seeing the relevance of the book at this time. I can see the point of a
> DOS assembly book long ago, and I can see the point of a general book on
> assembly for multiple architectures. But I can't think of any reason
> (other than for fun) why anyone would write software in assembly for the
> x86 - and certainly not for Windows. There are certainly times when you
> might want to /use/ assembly on an x86 - speeding up critical loops, for
> example - but not to write entire programs. The HLA concept strikes me as
> a waste of time in this day and age.
Assembler is 100% flexible compared to any HLL, even C, so sometimes it makes life easier. While you probably wouldn't write applications in it, there are types of programs which do have a big proportion of assembler (in my case, these are interpreters). Hyde's HLA is not for everyone, but I use a form of HLA (inline assembler within a HLL) which makes writing large amounts of assembler much less painful. And, if you are working on a language product which generates assembler code, then you need to understand how it works even if you are not manually writing the code yourself. -- Bartc
Reply by David Brown January 28, 2010
On 27/01/2010 21:15, Jon Kirwan wrote:
> On Wed, 27 Jan 2010 14:26:35 +0100, David Brown > <david@westcontrol.removethisbit.com> wrote: > >> On 27/01/2010 00:14, Jon Kirwan wrote: >>> On Tue, 26 Jan 2010 23:31:15 +0100, David Brown >>> <david.brown@hesbynett.removethisbit.no> wrote: >>> >>>> Jon Kirwan wrote: >>>>> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown >>>>> <david@westcontrol.removethisbit.com> wrote: >>>>> >>>>>> On 25/01/2010 21:42, Jon Kirwan wrote: >>>>>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown >>>>>>> <david@westcontrol.removethisbit.com> wrote: >> >>>> <snip> >>>> >>>>>>> As an aside, I have a lot of other things I'd like in c or >>>>>>> c++ which just aren't there. For example, I dearly miss >>>>>>> having access to thunking semantics in c or c++ (which does >>>>>>> NOT break the c/c++ program model in any way, shape, or form >>>>>>> and could easily be implemented as part of either language >>>>>>> with no dire impacts at all. I might use this for efficient >>>>>>> iterators (don't imagine that I'm talking about std library >>>>>>> iterators here, which are somewhat similar in use but in no >>>>>>> way similar in their implementation details -- they are much >>>>>>> less efficient), so also do I miss this. There is no good >>>>>>> reason I can think of not to have it and its utility is >>>>>>> wonderful. (I'd be so happy to talk about it, at some point, >>>>>>> as the examples are excellent and easily shown.) >>>>>> What do you mean by "thunking" in this context? The term has several >>>>>> meanings, as far as I know. If your answer is going to take more than a >>>>>> dozen lines (you are better known for your in-depth explanations than >>>>>> your short summaries!), it should probably be in its own thread. >>>>> >>>>> There are some excellent discussions already available. 
See, >>>>> for example, the Metaware C (and Pascal) compiler manuals and >>>>> their implemention of an iterator semantic, as well as their >>>>> extensive discussion (with well-made examples) of its >>>>> abundant benefits. There is also a somewhat different, but >>>>> also interesting discussion made by Randall Hyde in The Art >>>>> of Assembly manuals he generously put together some years >>>>> back. (Don't worry yourself thumbing through Microsoft's use >>>>> of the term.) >>>>> >>>>> ... >>>>> >>>>> The following case below is not nearly as well thought out, >>>>> syntax wise, as Metaware's implementation (in other words, >>>>> don't mistake it as a complete syntax designed by experts) >>>>> but it may get the seed of a point across. >>>>> >>>>> for ( p in Primes( 20, 70 ) ) >>>>> printf( "%d\n", p ); >>>>> >>>>> The code for Primes() might be, in short-hand, something like >>>>> this below. (Please excuse my abuse of basic knowledge about >>>>> prime numbers by using an increment by 1 in the for loop or >>>>> any other tests for starting or ending on an even number only >>>>> because I want to keep the code with as few lines as is >>>>> reasonable to make the point. The idea is the point, not the >>>>> implementation here.) >>>>> >>>>> integer Primes( int a, int b ) { >>>>> int i; >>>>> for ( i= a; i<= b; ++i ) >>>>> if ( IsPrime( i ) ) >>>>> yield( i ); >>>>> return; >>>>> } >>>>> >>>> >>>> Ah, what you are talking about is often called a generator (for example, >>>> in Python or JavaScript), or perhaps a closure. A generator is somewhat >>>> like a lazily evaluated list, although it could be generalised (for >>>> example, do the parameter values have to be the same for repeat calls as >>>> they are for the initial call?). >>>> >>>>> I'm intentionally being brief, as well as leaving out a short >>>>> discussion of each line. Partly, because longer examples may >>>>> interfere with the central points. 
Partly, because you >>>>> shouldn't need any discussion and should be able to almost >>>>> instantly see what the above code "means" without that extra. >>>>> It's in a form that should be plain without a manual. >>>>> >>>>> The place to focus isn't so much on examining Primes(), but >>>>> instead more by imagining a wide variety of the types of >>>>> for() loops which may require the mechanism itself. (Not my >>>>> poor example, which may or may not be useful to anyone.) >>>>> >>>>> For example, you might prefer to imagine that Primes() is >>>>> instead a routine that yields all the nodes of some arbitrary >>>>> tree or graph using some very specific walking mechanism. If >>>>> you use your imagination and let it take you for a ride, then >>>>> perhaps the point may be clarified. >>>>> >>>>> If that gets you towards where I'm pointing, then the next >>>>> question is how would you implement this in assembly code, >>>>> consistent with c compilers you are aware of and in such a >>>>> way that does _not_ break an existing linker in the process? >>>>> >>>>> On the other hand, if this doesn't do it for you at all -- if >>>>> in short, your imagination isn't moved by those examples >>>>> beyond the short distance I walked with them -- then let me >>>>> commend again Metaware's c/pascal implementations and Hyde's >>>>> AofA documentation before further discussion continues. >>>>> >>>>> But less than the above made my imagination literally spin >>>>> with ideas when I first came across it. So maybe the above >>>>> is enough. I hope so. It's also closely connected in a >>>>> round about way to the idea of nested functions, aka Pascal. >>>>> (If you see the connection, then I think you've probably got >>>>> the larger picture.) >>>>> >>>>> Jon >>>> >>>> I've used generators in Python - they are a very nice way to solve some >>>> kinds of problems. 
Unfortunately, they are not easy to implement >>>> cleanly in a stack-based language because a general implementation >>>> requires that the generator (or "thunk", if you prefer) has its own >>>> stack for arbitrary local variables and state. Thus most languages that >>>> implement generators use garbage collection. >>> >>> You are way off the reservation, already. Let me suggest you >>> think about the _implementation_ for a moment. Maybe that >>> will clarify what I'm pointing towards. For one thing, >>> garbage collection has _NOTHING_ whatever to do with it. If >>> you think so, you are far off-point. >>> >>>> You can get some of the features of generators in C++ using a class to >>>> encapsulate the generator's state in class data. The newer lambda >>>> syntax makes it a little neater, and comes somewhat closer to generators >>>> or closures - but there are limitations. You can't implement them in >>>> general without substantial helper code (as seen in boost's libraries) >>>> or a major change in the structure of the language (including garbage >>>> collection). >>>> >>>> In short, generators (and closures) are a very nice high-level concept - >>>> you need a high level language (or, somewhat ironically, assembly) to >>>> use them, and C / C++ are not suitable. >>> >>> I recommend that reading I suggested. Take the AofA one >>> first. It's easier to get ahold of. It gets into the >>> details of implementation, but does NOT deal with it as a c >>> level concept. So you will have to fend for yourself there >>> until you can get ahold of Metaware docs. (Or, I might be >>> tempted to copy some of them for this purpose.) >>> >> >> I don't have these books and manuals you are referring to, nor do I have >> easy access to them. If you have web links of interest then I'll >> happily look at them - but I am not going to find and order books just >> to read a few pages. 
This discussion is interesting, but there's a >> limit to what is a practical and appropriate use of time and money for a >> discussion. > > Go here: > http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html > > Then download and read all of Volume 5. > http://webster.cs.ucr.edu/AoA/Windows/PDFs/Volume5.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/Thunks.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/Iterators.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/Coroutines.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/ParameterImplementation.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/LexicalNesting.pdf > http://webster.cs.ucr.edu/AoA/Windows/PDFs/V5Questions.pdf > > I apologize for not doing this earlier. I had expected that > you already knew about AofA and it's availability on the web. > Had I known you didn't know about it, I would have > immediately provided you the links. Again, my sincere > apologies for not doing this earlier. >
Thanks - that makes a /huge/ difference! Of course, now I just need the time to read it. As I've only had a brief look at it (I read most of the chapter on thunks), I may be misjudging it here, but I have difficulty seeing the relevance of the book at this time. I can see the point of a DOS assembly book long ago, and I can see the point of a general book on assembly for multiple architectures. But I can't think of any reason (other than for fun) why anyone would write software in assembly for the x86 - and certainly not for Windows. There are certainly times when you might want to /use/ assembly on an x86 - speeding up critical loops, for example - but not to write entire programs. The HLA concept strikes me as a waste of time in this day and age. Having said that, some of the concepts (such as in the chapters you have indicated) are interesting and have wider applications. Coroutines are useful devices - it's just that the implementation details of how to use them with HLA x86 assembly are irrelevant to reality. Had the author shown how to use them in C, Java, Python, or even in an artificial HLL, it would have been more useful. Anyway, now I see what you mean by the term "thunk" - and it is clear from the book that these are limited devices that are basically equivalent to a C++ class with initialisation of private data values and a single method (or alternatively a C function that takes a struct pointer). Useful, but hardly revolutionary. Your proposed syntax for them in C is, however, neat and elegant - that would be a useful addition to the C language.
>>> Anyway, I can see I've sent you spinning in the wrong >>> direction. Take a breath, read AofA on the topic of thunks >>> and the nearby related chapters to it. That should provide >>> an idea about implementation. Not the _whole_ idea, by the >>> way. As it might be done in c, it involves the concept of >>> nested functions (which you clearly don't yet see) without >>> the use of the specific syntax you are used to seeing for >>> them (it's entirely hidden at the c language level, but >>> explicit at the assembly level.) If you _see_ this much, we >>> are probably on the same page. >> >> Nested functions are perfectly possible in some extensions to C - in >> particular, gcc supports them (since gcc also supports Ada, which has >> nested functions, much of the gcc structure already supports nested >> functions, and thus the C and C++ front-ends can get them almost for free). > > Yes, but as a general rule I don't always have the option of > using gcc. Customers sometimes already have existing tools > they want used, for example. There are other reasons, too. > So it's not a general solution. Just an interesting one. >
Agreed. I use various gcc extensions if I think they improve the code (with the emphasis here on improving the source code rather than the target code - that's a bonus). I haven't used nested functions - they often make code less readable because it is often unclear where different functions start and end.
>> Nested functions, C++ classes, the new C++ lambda syntax, etc., are all >> ways to implement a limited form of generator or iterator. Compiler >> extensions can be used to make a nicer syntax, and to automate some of >> the manual work involved. But without some sort of multiple stack >> system or garbage collection, you have serious limitations. I don't >> mean to say that these ideas are not useful despite the limitations - >> just that you cannot add proper flexible generators to a language with >> the sort of structure of C or C++ without fundamental changes to the way >> the language works - the compiler would need to be free to allocate (and >> free) dynamic memory as needed, rather than through explicit malloc / >> new calls. >> >> It could well be that what you call "thunking" really means "generators >> with various limitations", in which case you are right that garbage >> collection is not needed, and it's reasonably easy to figure out several >> good implementations. But the term "thunking" is not a well known or >> well-defined expression, and is used in many different ways by different >> people - I have no idea how the author of a particular book you've read >> happens to use it. > > Hopefully, the above will tell you more. 
> >> To look at some more general generators, and see why they can be used >> much more freely in a language like Python than they can in C or C++, >> let's vary your Primes function, using Python syntax so that we have >> actual working code: >> >> def IsPrime(i) : >> if i< 2 : >> return False >> for j in range(2, int(math.sqrt(i)) + 1) : >> if (i % j) == 0 : >> return False >> return True >> >> def Primes(a, b) : >> i = a >> if (i == 2) : >> yield i >> if (i % 2 == 0) : >> i = i + 1 >> while ( i<= b ) : >> if (IsPrime(i)) : >> yield i >> i = i + 1 >> return >> >> for p in Primes(1, 20) : >> print p >> >> To make this implementation of Primes work, the Primes closure has to >> include information about where the execution currently is in the Primes >> function, and in general it must also track any local variables. This >> becomes increasingly difficult for more complex functions, especially >> when doing it manually - you have to define a struct (or C++ class) to >> hold all the local data, as well as a current "state" which is used for >> jumping to the correct re-entry point on later calls. A language >> extension could hide and automate much of this, however. > > In fact, that's what is vital. In the implementation done by > Metaware's compilers, it was very well handled and the > implementation was quite general and nestable to any depth > without the programmer worrying over details such as that. > It's all simply kept as stack frame contexts, just as normal > functions do. The difference is that a thunk is used, > instead, to move back and forth in order to preserve the > stack context while the iterator remains "live." Once the > iterator completes, though, the stack is unwound in the usual > way and the context disappears just as you would expect for > any function call. >
I agree here that such a syntax and compiler-aided handling of the details would give you a very nice way to use these "thunks" - much more convenient than doing things manually in C or C++. I suspect you could get a fair way with "normal C" using a system similar to Adam Dunkels' protothreads - but integrating it into the language would be best.
>> If you were using such generators in a real program, you would want to >> use structured and modular programming - and then things get difficult. > > Things do not get difficult. I've used metaware's compiler > tools and there was NO difficulty involved. It's _exactly_ > like using c, except you've got a wonderful additional > semantic to handle, in beautiful and efficient ways, concepts > like walking graphs. The idea of "data hiding" is expanded > to also now include "algorithm hiding," but in a very light > weight fashion that is entirely consistent with the c > worldview. > >> To use generators in a stack-based language, you would have to >> allocate a structure containing all the local data on the caller >> function's stack - that means you (either the programmer figuring out >> the closure data manually, or the extended compiler) need access to the >> implementation when declaring the generation and using it. With a >> garbage collecting language, the generator itself would allocate space >> on the heap as needed - the caller need not know anything about the details. > > Got it. I understand your point now about garbage collection > -- makes sense. But the way Metaware handles it is beautiful > and doesn't require any of that. It's entirely handled > within the standard c style program model with a single stack > and all the usual, normal stack frame elements. The body of
The Metaware implementation, as far as I can see, is limited to situations where the thunk's frame can be allocated on the stack (or possibly as a statically allocated region). That is certainly the situation described in AofA. That is, of course, an entirely reasonable limitation for an extension to C. My view of such concepts has come down from higher-level languages like Python (and also functional programming languages), in which you have much more general capability in how you work with function-like objects and closures. From that angle, these "thunks" look limited, because you need a compiler and run-time that handles dynamic memory (typically some sort of garbage collection, but that's not absolutely necessary) to implement the capabilities I assume. But when you are thinking of these as an upwards extension of C, I can see these being a very useful addition to the language.
> a for loop is placed into a separate, nested function within > the body of the enclosing function. The iterator is called > using all the usual means, but includes a pointer to the > body. The iterator may itself call any number of other > functions, as well as other iterators if it likes, which may > be nested down the stack to any depth you want. When a yield > takes place, it is really a call to the for-body nested > function but with the stack frame pointer set to the > enclosing function so all the usual local variables are > appropriately accessible off of the base pointer reference > that all c compilers may normally use. The nested function > returns rather normally, restoring the frame back to the down > stream end of the stack. If the for-body temporarily stores > on the stack, it does so at the end of course and obviously > must restore it before returning. But that's just basic, > anyway. > >> You start getting really high-level programming when you can pass >> generators around as parameters and return values. This is something >> that cannot be done with a stack model - if a function returns a >> generator (or any function which requires closure data), the closure >> data must exist even after the calling stack frame has exited. There >> are ways to implement this without a general garbage collection facility >> (for example, a pointer to a clean-up function could be passed up or >> down the call chain while the closure itself is on the heap). But >> basically, complex function manipulation like this needs more advanced >> automatic control of the memory that you get in C or C++. > > So let me think about this for a second. Passing a generator > would involve being able to return not only a pointer to > code, but also its entire current context (and any such > context of all activation records still active at the time?) 
> In other words, it's not just a generator at its initial > point, but one that may have already been used for a bit but > hasn't yet completed and so it can be returned to a caller > for more continued use? Interesting, and I gather the > additional value here. >
That's correct. You can see here how this requires the generator's local frame to remain valid after the function that created it has exited - that means it has to exist outside the main stack. Thus for this sort of thing to be handled directly by the language and the compiler, rather than through explicit "new" or "malloc" calls, the compiler has to have a direct understanding and control of dynamic memory.
> However, as an _embedded_ programmer usually working on tiny > micros, not workstation-competent board-level systems, I'm > focused upon very modest but very useful extensions where I > _know_ I have good application and where I don't have to pay > for it with a significant change to the existing models I > have grown to know well and fully understand and trust. >
I agree here - and I would appreciate the addition to C of the sort of capabilities you have been describing. For embedded systems, it is important to be able to understand the implementation for the code you write, and that is possible for "thunks" as you have described them. On a PC, it (typically) doesn't matter if the software takes a few extra MB of run space, and runs through a byte code virtual machine - thus I program in Python and take advantage of the language's power to write shorter code.
> That said, I'd be interested in seeing how to implement > something like that which would work in ways where the > run-time execution duration is entirely predictable and > invariant (knowing, obviously, the initial conditions for the > generator.) I think you hint towards this, but I'd need to > see a specific implementation. > > In the meantime, you might look at the PDFs I've referred you > towards. They aren't that long and are quite detailed. They > do NOT show you the implementation used by Metaware, but I > can talk about that. > > Jon
Reply by Jon Kirwan January 27, 2010
On Wed, 27 Jan 2010 14:26:35 +0100, David Brown
<david@westcontrol.removethisbit.com> wrote:

>On 27/01/2010 00:14, Jon Kirwan wrote: >> On Tue, 26 Jan 2010 23:31:15 +0100, David Brown >> <david.brown@hesbynett.removethisbit.no> wrote: >> >>> Jon Kirwan wrote: >>>> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown >>>> <david@westcontrol.removethisbit.com> wrote: >>>> >>>>> On 25/01/2010 21:42, Jon Kirwan wrote: >>>>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown >>>>>> <david@westcontrol.removethisbit.com> wrote: > >>> <snip> >>> >>>>>> As an aside, I have a lot of other things I'd like in c or >>>>>> c++ which just aren't there. For example, I dearly miss >>>>>> having access to thunking semantics in c or c++ (which does >>>>>> NOT break the c/c++ program model in any way, shape, or form >>>>>> and could easily be implemented as part of either language >>>>>> with no dire impacts at all. I might use this for efficient >>>>>> iterators (don't imagine that I'm talking about std library >>>>>> iterators here, which are somewhat similar in use but in no >>>>>> way similar in their implementation details -- they are much >>>>>> less efficient), so also do I miss this. There is no good >>>>>> reason I can think of not to have it and its utility is >>>>>> wonderful. (I'd be so happy to talk about it, at some point, >>>>>> as the examples are excellent and easily shown.) >>>>> What do you mean by "thunking" in this context? The term has several >>>>> meanings, as far as I know. If your answer is going to take more than a >>>>> dozen lines (you are better known for your in-depth explanations than >>>>> your short summaries!), it should probably be in its own thread. >>>> >>>> There are some excellent discussions already available. See, >>>> for example, the Metaware C (and Pascal) compiler manuals and >>>> their implemention of an iterator semantic, as well as their >>>> extensive discussion (with well-made examples) of its >>>> abundant benefits. 
There is also a somewhat different, but >>>> also interesting discussion made by Randall Hyde in The Art >>>> of Assembly manuals he generously put together some years >>>> back. (Don't worry yourself thumbing through Microsoft's use >>>> of the term.) >>>> >>>> ... >>>> >>>> The following case below is not nearly as well thought out, >>>> syntax wise, as Metaware's implementation (in other words, >>>> don't mistake it as a complete syntax designed by experts) >>>> but it may get the seed of a point across. >>>> >>>> for ( p in Primes( 20, 70 ) ) >>>> printf( "%d\n", p ); >>>> >>>> The code for Primes() might be, in short-hand, something like >>>> this below. (Please excuse my abuse of basic knowledge about >>>> prime numbers by using an increment by 1 in the for loop or >>>> any other tests for starting or ending on an even number only >>>> because I want to keep the code with as few lines as is >>>> reasonable to make the point. The idea is the point, not the >>>> implementation here.) >>>> >>>> integer Primes( int a, int b ) { >>>> int i; >>>> for ( i= a; i<= b; ++i ) >>>> if ( IsPrime( i ) ) >>>> yield( i ); >>>> return; >>>> } >>>> >>> >>> Ah, what you are talking about is often called a generator (for example, >>> in Python or JavaScript), or perhaps a closure. A generator is somewhat >>> like a lazily evaluated list, although it could be generalised (for >>> example, do the parameter values have to be the same for repeat calls as >>> they are for the initial call?). >>> >>>> I'm intentionally being brief, as well as leaving out a short >>>> discussion of each line. Partly, because longer examples may >>>> interfere with the central points. Partly, because you >>>> shouldn't need any discussion and should be able to almost >>>> instantly see what the above code "means" without that extra. >>>> It's in a form that should be plain without a manual. 
>>>> >>>> The place to focus isn't so much on examining Primes(), but >>>> instead more by imagining a wide variety of the types of >>>> for() loops which may require the mechanism itself. (Not my >>>> poor example, which may or may not be useful to anyone.) >>>> >>>> For example, you might prefer to imagine that Primes() is >>>> instead a routine that yields all the nodes of some arbitrary >>>> tree or graph using some very specific walking mechanism. If >>>> you use your imagination and let it take you for a ride, then >>>> perhaps the point may be clarified. >>>> >>>> If that gets you towards where I'm pointing, then the next >>>> question is how would you implement this in assembly code, >>>> consistent with c compilers you are aware of and in such a >>>> way that does _not_ break an existing linker in the process? >>>> >>>> On the other hand, if this doesn't do it for you at all -- if >>>> in short, your imagination isn't moved by those examples >>>> beyond the short distance I walked with them -- then let me >>>> commend again Metaware's c/pascal implementations and Hyde's >>>> AofA documentation before further discussion continues. >>>> >>>> But less than the above made my imagination literally spin >>>> with ideas when I first came across it. So maybe the above >>>> is enough. I hope so. It's also closely connected in a >>>> round about way to the idea of nested functions, aka Pascal. >>>> (If you see the connection, then I think you've probably got >>>> the larger picture.) >>>> >>>> Jon >>> >>> I've used generators in Python - they are a very nice way to solve some >>> kinds of problems. Unfortunately, they are not easy to implement >>> cleanly in a stack-based language because a general implementation >>> requires that the generator (or "thunk", if you prefer) has its own >>> stack for arbitrary local variables and state. Thus most languages that >>> implement generators use garbage collection. >> >> You are way off the reservation, already. 
Let me suggest you >> think about the _implementation_ for a moment. Maybe that >> will clarify what I'm pointing towards. For one thing, >> garbage collection has _NOTHING_ whatever to do with it. If >> you think so, you are far off-point. >> >>> You can get some of the features of generators in C++ using a class to >>> encapsulate the generator's state in class data. The newer lambda >>> syntax makes it a little neater, and comes somewhat closer to generators >>> or closures - but there are limitations. You can't implement them in >>> general without substantial helper code (as seen in boost's libraries) >>> or a major change in the structure of the language (including garbage >>> collection). >>> >>> In short, generators (and closures) are a very nice high-level concept - >>> you need a high level language (or, somewhat ironically, assembly) to >>> use them, and C / C++ are not suitable. >> >> I recommend that reading I suggested. Take the AofA one >> first. It's easier to get ahold of. It gets into the >> details of implementation, but does NOT deal with it as a c >> level concept. So you will have to fend for yourself there >> until you can get ahold of Metaware docs. (Or, I might be >> tempted to copy some of them for this purpose.) >> > >I don't have these books and manuals you are referring to, nor do I have >easy access to them. If you have web links of interest then I'll >happily look at them - but I am not going to find and order books just >to read a few pages. This discussion is interesting, but there's a >limit to what is a practical and appropriate use of time and money for a >discussion.
Go here: http://webster.cs.ucr.edu/AoA/Windows/PDFs/0_PDFIndexWin.html Then download and read all of Volume 5. http://webster.cs.ucr.edu/AoA/Windows/PDFs/Volume5.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/Thunks.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/Iterators.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/Coroutines.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/ParameterImplementation.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/LexicalNesting.pdf http://webster.cs.ucr.edu/AoA/Windows/PDFs/V5Questions.pdf I apologize for not doing this earlier. I had expected that you already knew about AofA and its availability on the web. Had I known you didn't know about it, I would have immediately provided you the links. Again, my sincere apologies for not doing this earlier.
>> Anyway, I can see I've sent you spinning in the wrong >> direction. Take a breath, read AofA on the topic of thunks >> and the nearby related chapters to it. That should provide >> an idea about implementation. Not the _whole_ idea, by the >> way. As it might be done in c, it involves the concept of >> nested functions (which you clearly don't yet see) without >> the use of the specific syntax you are used to seeing for >> them (it's entirely hidden at the c language level, but >> explicit at the assembly level.) If you _see_ this much, we >> are probably on the same page. > >Nested functions are perfectly possible in some extensions to C - in >particular, gcc supports them (since gcc also supports Ada, which has >nested functions, much of the gcc structure already supports nested >functions, and thus the C and C++ front-ends can get them almost for free).
Yes, but as a general rule I don't always have the option of using gcc. Customers sometimes already have existing tools they want used, for example. There are other reasons, too. So it's not a general solution. Just an interesting one.
>Nested functions, C++ classes, the new C++ lambda syntax, etc., are all >ways to implement a limited form of generator or iterator. Compiler >extensions can be used to make a nicer syntax, and to automate some of >the manual work involved. But without some sort of multiple stack >system or garbage collection, you have serious limitations. I don't >mean to say that these ideas are not useful despite the limitations - >just that you cannot add proper flexible generators to a language with >the sort of structure of C or C++ without fundamental changes to the way >the language works - the compiler would need to be free to allocate (and >free) dynamic memory as needed, rather than through explicit malloc / >new calls. > >It could well be that what you call "thunking" really means "generators >with various limitations", in which case you are right that garbage >collection is not needed, and it's reasonably easy to figure out several >good implementations. But the term "thunking" is not a well known or >well-defined expression, and is used in many different ways by different >people - I have no idea how the author of a particular book you've read >happens to use it.
Hopefully, the above will tell you more.
>To look at some more general generators, and see why they can be used >much more freely in a language like Python than they can in C or C++, >let's vary your Primes function, using Python syntax so that we have >actual working code: > >def IsPrime(i) : > if i < 2 : > return False > for j in range(2, int(math.sqrt(i)) + 1) : > if (i % j) == 0 : > return False > return True > >def Primes(a, b) : > i = a > if (i == 2) : > yield i > if (i % 2 == 0) : > i = i + 1 > while ( i <= b ) : > if (IsPrime(i)) : > yield i > i = i + 1 > return > >for p in Primes(1, 20) : > print p > >To make this implementation of Primes work, the Primes closure has to >include information about where the execution currently is in the Primes >function, and in general it must also track any local variables. This >becomes increasingly difficult for more complex functions, especially >when doing it manually - you have to define a struct (or C++ class) to >hold all the local data, as well as a current "state" which is used for >jumping to the correct re-entry point on later calls. A language >extension could hide and automate much of this, however.
In fact, that's what is vital. In the implementation done by Metaware's compilers, it was very well handled and the implementation was quite general and nestable to any depth without the programmer worrying over details such as that. It's all simply kept as stack frame contexts, just as normal functions do. The difference is that a thunk is used, instead, to move back and forth in order to preserve the stack context while the iterator remains "live." Once the iterator completes, though, the stack is unwound in the usual way and the context disappears just as you would expect for any function call.
>If you were using such generators in a real program, you would want to >use structured and modular programming - and then things get difficult.
Things do not get difficult. I've used Metaware's compiler tools and there was NO difficulty involved. It's _exactly_ like using c, except you've got a wonderful additional semantic to handle, in beautiful and efficient ways, concepts like walking graphs. The idea of "data hiding" is expanded to also now include "algorithm hiding," but in a very lightweight fashion that is entirely consistent with the c worldview.
> To use generators in a stack-based language, you would have to >allocate a structure containing all the local data on the caller >function's stack - that means you (either the programmer figuring out >the closure data manually, or the extended compiler) need access to the >implementation when declaring the generation and using it. With a >garbage collecting language, the generator itself would allocate space >on the heap as needed - the caller need not know anything about the details.
Got it. I understand your point now about garbage collection -- makes sense. But the way Metaware handles it is beautiful and doesn't require any of that. It's entirely handled within the standard c-style program model with a single stack and all the usual, normal stack frame elements. The body of a for loop is placed into a separate, nested function within the body of the enclosing function. The iterator is called using all the usual means, but is passed a pointer to the body. The iterator may itself call any number of other functions, as well as other iterators if it likes, which may be nested down the stack to any depth you want. When a yield takes place, it is really a call to the for-body nested function but with the stack frame pointer set to the enclosing function's frame, so all the usual local variables are appropriately accessible off the base-pointer reference that all c compilers may normally use. The nested function returns rather normally, restoring the frame back to the downstream end of the stack. If the for-body temporarily stores on the stack, it does so at the end of course and obviously must restore it before returning. But that's just basic, anyway.
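A rough portable sketch of that shape -- my own approximation, not Metaware's actual code generation -- replaces the borrowed frame pointer with an explicit context pointer, so each yield becomes a plain call into the for-body:

```c
#include <stdbool.h>

/* Illustrative sketch only: the for-body becomes a callback, and the
   enclosing function's locals are reached through an explicit context
   pointer -- a portable stand-in for the frame-pointer trick described
   above.  All names here are mine, not Metaware's. */

static bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

/* The iterator: each "yield(i)" is a direct call into the loop body. */
static void primes(int a, int b,
                   void (*body)(int p, void *ctx), void *ctx) {
    for (int i = a; i <= b; ++i)
        if (is_prime(i))
            body(i, ctx);   /* yield(i) */
}

/* One possible for-body: sum the yielded primes via the context. */
static void sum_body(int p, void *ctx) {
    *(int *)ctx += p;
}
```

A call such as `int sum = 0; primes(20, 70, sum_body, &sum);` then leaves the sum of the yielded primes in `sum`; the iterator never needs to know what the body does with each value.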
>You start getting really high-level programming when you can pass >generators around as parameters and return values. This is something >that cannot be done with a stack model - if a function returns a >generator (or any function which requires closure data), the closure >data must exist even after the calling stack frame has exited. There >are ways to implement this without a general garbage collection facility >(for example, a pointer to a clean-up function could be passed up or >down the call chain while the closure itself is on the heap). But >basically, complex function manipulation like this needs more advanced >automatic control of the memory that you get in C or C++.
So let me think about this for a second. Passing a generator would involve being able to return not only a pointer to code, but also its entire current context (and any such context of all activation records still active at the time?) In other words, it's not just a generator at its initial point, but one that may have already been used for a bit but hasn't yet completed and so it can be returned to a caller for more continued use? Interesting, and I gather the additional value here. However, as an _embedded_ programmer usually working on tiny micros, not workstation-competent board-level systems, I'm focused upon very modest but very useful extensions where I _know_ I have good application and where I don't have to pay for it with a significant change to the existing models I have grown to know well and fully understand and trust. That said, I'd be interested in seeing how to implement something like that which would work in ways where the run-time execution duration is entirely predictable and invariant (knowing, obviously, the initial conditions for the generator.) I think you hint towards this, but I'd need to see a specific implementation. In the meantime, you might look at the PDFs I've referred you towards. They aren't that long and are quite detailed. They do NOT show you the implementation used by Metaware, but I can talk about that. Jon
Reply by David Brown January 27, 2010
On 27/01/2010 00:14, Jon Kirwan wrote:
> On Tue, 26 Jan 2010 23:31:15 +0100, David Brown > <david.brown@hesbynett.removethisbit.no> wrote: > >> Jon Kirwan wrote: >>> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown >>> <david@westcontrol.removethisbit.com> wrote: >>> >>>> On 25/01/2010 21:42, Jon Kirwan wrote: >>>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown >>>>> <david@westcontrol.removethisbit.com> wrote:
>> <snip> >> >>>>> As an aside, I have a lot of other things I'd like in c or >>>>> c++ which just aren't there. For example, I dearly miss >>>>> having access to thunking semantics in c or c++ (which does >>>>> NOT break the c/c++ program model in any way, shape, or form >>>>> and could easily be implemented as part of either language >>>>> with no dire impacts at all. I might use this for efficient >>>>> iterators (don't imagine that I'm talking about std library >>>>> iterators here, which are somewhat similar in use but in no >>>>> way similar in their implementation details -- they are much >>>>> less efficient), so also do I miss this. There is no good >>>>> reason I can think of not to have it and its utility is >>>>> wonderful. (I'd be so happy to talk about it, at some point, >>>>> as the examples are excellent and easily shown.) >>>> What do you mean by "thunking" in this context? The term has several >>>> meanings, as far as I know. If your answer is going to take more than a >>>> dozen lines (you are better known for your in-depth explanations than >>>> your short summaries!), it should probably be in its own thread. >>> >>> There are some excellent discussions already available. See, >>> for example, the Metaware C (and Pascal) compiler manuals and >>> their implemention of an iterator semantic, as well as their >>> extensive discussion (with well-made examples) of its >>> abundant benefits. There is also a somewhat different, but >>> also interesting discussion made by Randall Hyde in The Art >>> of Assembly manuals he generously put together some years >>> back. (Don't worry yourself thumbing through Microsoft's use >>> of the term.) >>> >>> ... >>> >>> The following case below is not nearly as well thought out, >>> syntax wise, as Metaware's implementation (in other words, >>> don't mistake it as a complete syntax designed by experts) >>> but it may get the seed of a point across. 
>>> >>> for ( p in Primes( 20, 70 ) ) >>> printf( "%d\n", p ); >>> >>> The code for Primes() might be, in short-hand, something like >>> this below. (Please excuse my abuse of basic knowledge about >>> prime numbers by using an increment by 1 in the for loop or >>> any other tests for starting or ending on an even number only >>> because I want to keep the code with as few lines as is >>> reasonable to make the point. The idea is the point, not the >>> implementation here.) >>> >>> integer Primes( int a, int b ) { >>> int i; >>> for ( i= a; i<= b; ++i ) >>> if ( IsPrime( i ) ) >>> yield( i ); >>> return; >>> } >>> >> >> Ah, what you are talking about is often called a generator (for example, >> in Python or JavaScript), or perhaps a closure. A generator is somewhat >> like a lazily evaluated list, although it could be generalised (for >> example, do the parameter values have to be the same for repeat calls as >> they are for the initial call?). >> >>> I'm intentionally being brief, as well as leaving out a short >>> discussion of each line. Partly, because longer examples may >>> interfere with the central points. Partly, because you >>> shouldn't need any discussion and should be able to almost >>> instantly see what the above code "means" without that extra. >>> It's in a form that should be plain without a manual. >>> >>> The place to focus isn't so much on examining Primes(), but >>> instead more by imagining a wide variety of the types of >>> for() loops which may require the mechanism itself. (Not my >>> poor example, which may or may not be useful to anyone.) >>> >>> For example, you might prefer to imagine that Primes() is >>> instead a routine that yields all the nodes of some arbitrary >>> tree or graph using some very specific walking mechanism. If >>> you use your imagination and let it take you for a ride, then >>> perhaps the point may be clarified. 
>>> >>> If that gets you towards where I'm pointing, then the next >>> question is how would you implement this in assembly code, >>> consistent with c compilers you are aware of and in such a >>> way that does _not_ break an existing linker in the process? >>> >>> On the other hand, if this doesn't do it for you at all -- if >>> in short, your imagination isn't moved by those examples >>> beyond the short distance I walked with them -- then let me >>> commend again Metaware's c/pascal implementations and Hyde's >>> AofA documentation before further discussion continues. >>> >>> But less than the above made my imagination literally spin >>> with ideas when I first came across it. So maybe the above >>> is enough. I hope so. It's also closely connected in a >>> round about way to the idea of nested functions, aka Pascal. >>> (If you see the connection, then I think you've probably got >>> the larger picture.) >>> >>> Jon >> >> I've used generators in Python - they are a very nice way to solve some >> kinds of problems. Unfortunately, they are not easy to implement >> cleanly in a stack-based language because a general implementation >> requires that the generator (or "thunk", if you prefer) has its own >> stack for arbitrary local variables and state. Thus most languages that >> implement generators use garbage collection. > > You are way off the reservation, already. Let me suggest you > think about the _implementation_ for a moment. Maybe that > will clarify what I'm pointing towards. For one thing, > garbage collection has _NOTHING_ whatever to do with it. If > you think so, you are far off-point. > >> You can get some of the features of generators in C++ using a class to >> encapsulate the generator's state in class data. The newer lambda >> syntax makes it a little neater, and comes somewhat closer to generators >> or closures - but there are limitations. 
You can't implement them in >> general without substantial helper code (as seen in boost's libraries) >> or a major change in the structure of the language (including garbage >> collection). >> >> In short, generators (and closures) are a very nice high-level concept - >> you need a high level language (or, somewhat ironically, assembly) to >> use them, and C / C++ are not suitable. > > I recommend that reading I suggested. Take the AofA one > first. It's easier to get ahold of. It gets into the > details of implementation, but does NOT deal with it as a c > level concept. So you will have to fend for yourself there > until you can get ahold of Metaware docs. (Or, I might be > tempted to copy some of them for this purpose.) >
I don't have these books and manuals you are referring to, nor do I have easy access to them. If you have web links of interest then I'll happily look at them - but I am not going to find and order books just to read a few pages. This discussion is interesting, but there's a limit to what is a practical and appropriate use of time and money for a discussion.
> Anyway, I can see I've sent you spinning in the wrong > direction. Take a breath, read AofA on the topic of thunks > and the nearby related chapters to it. That should provide > an idea about implementation. Not the _whole_ idea, by the > way. As it might be done in c, it involves the concept of > nested functions (which you clearly don't yet see) without > the use of the specific syntax you are used to seeing for > them (it's entirely hidden at the c language level, but > explicit at the assembly level.) If you _see_ this much, we > are probably on the same page. >
Nested functions are perfectly possible in some extensions to C - in particular, gcc supports them (since gcc also supports Ada, which has nested functions, much of the gcc structure already supports nested functions, and thus the C and C++ front-ends can get them almost for free). Nested functions, C++ classes, the new C++ lambda syntax, etc., are all ways to implement a limited form of generator or iterator. Compiler extensions can be used to make a nicer syntax, and to automate some of the manual work involved. But without some sort of multiple stack system or garbage collection, you have serious limitations. I don't mean to say that these ideas are not useful despite the limitations - just that you cannot add proper flexible generators to a language with the sort of structure of C or C++ without fundamental changes to the way the language works - the compiler would need to be free to allocate (and free) dynamic memory as needed, rather than through explicit malloc / new calls. It could well be that what you call "thunking" really means "generators with various limitations", in which case you are right that garbage collection is not needed, and it's reasonably easy to figure out several good implementations. But the term "thunking" is not a well known or well-defined expression, and is used in many different ways by different people - I have no idea how the author of a particular book you've read happens to use it. 
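As a concrete illustration of the gcc extension (GNU C only -- this is not portable C, and clang rejects it), a nested function can serve as the loop body for a callback-style iterator, capturing the enclosing function's locals directly:

```c
#include <stdbool.h>

/* GNU C nested functions (a gcc extension; not portable C).  The
   nested "body" captures the enclosing function's local variable,
   which is one limited way to get the yield shape under discussion. */

static bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

/* The iterator: each yield is a call into the supplied body. */
static void primes(int a, int b, void (*body)(int p)) {
    for (int i = a; i <= b; ++i)
        if (is_prime(i))
            body(i);
}

static int count_primes(int a, int b) {
    int count = 0;              /* local of the enclosing function */
    void body(int p) {          /* nested function (gcc only) */
        ++count;                /* captures 'count' by reference */
    }
    primes(a, b, body);         /* gcc builds a trampoline here */
    return count;
}
```

Note that taking the nested function's address makes gcc build a stack trampoline, which needs an executable stack on some targets -- one reason this extension never made it into standard C.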
To look at some more general generators, and see why they can be used much more freely in a language like Python than they can in C or C++, let's vary your Primes function, using Python syntax so that we have actual working code:

def IsPrime(i) :
    if i < 2 :
        return False
    for j in range(2, int(math.sqrt(i)) + 1) :
        if (i % j) == 0 :
            return False
    return True

def Primes(a, b) :
    i = a
    if (i == 2) :
        yield i
    if (i % 2 == 0) :
        i = i + 1
    while ( i <= b ) :
        if (IsPrime(i)) :
            yield i
        i = i + 1
    return

for p in Primes(1, 20) :
    print p

To make this implementation of Primes work, the Primes closure has to include information about where the execution currently is in the Primes function, and in general it must also track any local variables. This becomes increasingly difficult for more complex functions, especially when doing it manually - you have to define a struct (or C++ class) to hold all the local data, as well as a current "state" which is used for jumping to the correct re-entry point on later calls. A language extension could hide and automate much of this, however.

If you were using such generators in a real program, you would want to use structured and modular programming - and then things get difficult. To use generators in a stack-based language, you would have to allocate a structure containing all the local data on the caller function's stack - that means you (either the programmer figuring out the closure data manually, or the extended compiler) need access to the implementation when declaring the generator and using it. With a garbage collecting language, the generator itself would allocate space on the heap as needed - the caller need not know anything about the details.

You start getting really high-level programming when you can pass generators around as parameters and return values.
This is something that cannot be done with a stack model - if a function returns a generator (or any function which requires closure data), the closure data must exist even after the calling stack frame has exited. There are ways to implement this without a general garbage collection facility (for example, a pointer to a clean-up function could be passed up or down the call chain while the closure itself is on the heap). But basically, complex function manipulation like this needs more advanced automatic control of the memory that you get in C or C++.
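For what that manual struct-plus-state translation looks like in practice, here is a hand-rolled C version of the Primes generator above (a sketch; the names are mine, and a compiler extension would generate this boilerplate):

```c
#include <stdbool.h>

/* Hand-rolled generator: the struct is the "closure" (the generator's
   locals plus its resume position), and primes_next() picks up where
   the previous call left off.  All names are illustrative. */

static bool is_prime(int n) {
    if (n < 2) return false;
    for (int d = 2; d * d <= n; ++d)
        if (n % d == 0) return false;
    return true;
}

typedef struct {
    int i;   /* current position -- the saved "where was I" state */
    int b;   /* upper bound */
} primes_gen;

static void primes_init(primes_gen *g, int a, int b) {
    g->i = a;
    g->b = b;
}

/* Yield the next prime into *out; return false when exhausted. */
static bool primes_next(primes_gen *g, int *out) {
    while (g->i <= g->b) {
        int i = g->i++;
        if (is_prime(i)) {
            *out = i;
            return true;
        }
    }
    return false;
}
```

The caller must see the struct's definition to place it in its own stack frame -- exactly the loss of encapsulation described above; a garbage-collected language hides the equivalent allocation on the heap.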
Reply by Jon Kirwan January 26, 2010
On Tue, 26 Jan 2010 23:31:15 +0100, David Brown
<david.brown@hesbynett.removethisbit.no> wrote:

>Jon Kirwan wrote: >> On Tue, 26 Jan 2010 09:14:50 +0100, David Brown >> <david@westcontrol.removethisbit.com> wrote: >> >>> On 25/01/2010 21:42, Jon Kirwan wrote: >>>> On Mon, 25 Jan 2010 13:13:37 +0100, David Brown >>>> <david@westcontrol.removethisbit.com> wrote: >>>> ><snip> > >>>> In addition, these public link-time constants can be used to >>>> conditionally include or exclude code sections. In fact, >>>> almost every compiler uses this fact in one way or the other. >>>> CRT0, in particular, may take advantage of such features to >>>> conditionally include or exclude initialization code for >>>> libraries which may, or may not, have been linked in. And >>>> most linkers support the concept in some fashion -- because >>>> it is needed. >>>> >>>> And yes, I'd sometimes like c-level access to it. >>> On devices for which I write my own CRT0 or other pre-C startup code, I >>> write it in C. Typically there's a couple of lines of assembly to set >>> the stack pointer and jump to the C startup function, but things like >>> clearing .bss and copying .data are all done in C. If I wanted sections >>> that may or may not be included, I'd use C - either with #if's, or by >>> relying on the compiler to eliminate dead code. And some symbols, such >>> as section start and end points, are passed from the linker into the C >>> code - but only those that /must/ be passed in that way. >> >> Let's just leave it here as an agreement to disagree, then. I >> believe you are sincere in considering what I've already >> written, probably doing me more justice than I may deserve. >> But since you still cannot gather, I have to assume it's my >> fault in writing poorly and that the effort may require more >> time than I care to have for a semantic only of some small >> value. We've already made more of it than the newsgroup's >> time is worth. So I'm fine leaving the topic behind and just >> saying that I have used the semantic before to good use and >> sometimes miss it in c. 
There's no payoff here. Let's just >> leave it as a mild disagreement over a not-terribly-important >> issue. > >Okay. I'm not sure whether I am misunderstanding you, or disagreeing >with you, but I'm happy to leave it for now. Maybe the topic will turn >up another time, and it will all suddenly become obvious.
I only introduced it as an aside, to start. It's just not that important.
><snip> > >>>> As an aside, I have a lot of other things I'd like in c or >>>> c++ which just aren't there. For example, I dearly miss >>>> having access to thunking semantics in c or c++ (which does >>>> NOT break the c/c++ program model in any way, shape, or form >>>> and could easily be implemented as part of either language >>>> with no dire impacts at all. I might use this for efficient >>>> iterators (don't imagine that I'm talking about std library >>>> iterators here, which are somewhat similar in use but in no >>>> way similar in their implementation details -- they are much >>>> less efficient), so also do I miss this. There is no good >>>> reason I can think of not to have it and its utility is >>>> wonderful. (I'd be so happy to talk about it, at some point, >>>> as the examples are excellent and easily shown.) >>> What do you mean by "thunking" in this context? The term has several >>> meanings, as far as I know. If your answer is going to take more than a >>> dozen lines (you are better known for your in-depth explanations than >>> your short summaries!), it should probably be in its own thread. >> >> There are some excellent discussions already available. See, >> for example, the Metaware C (and Pascal) compiler manuals and >> their implemention of an iterator semantic, as well as their >> extensive discussion (with well-made examples) of its >> abundant benefits. There is also a somewhat different, but >> also interesting discussion made by Randall Hyde in The Art >> of Assembly manuals he generously put together some years >> back. (Don't worry yourself thumbing through Microsoft's use >> of the term.) >> >> ... >> >> The following case below is not nearly as well thought out, >> syntax wise, as Metaware's implementation (in other words, >> don't mistake it as a complete syntax designed by experts) >> but it may get the seed of a point across. 
>> >> for ( p in Primes( 20, 70 ) ) >> printf( "%d\n", p ); >> >> The code for Primes() might be, in short-hand, something like >> this below. (Please excuse my abuse of basic knowledge about >> prime numbers by using an increment by 1 in the for loop or >> any other tests for starting or ending on an even number only >> because I want to keep the code with as few lines as is >> reasonable to make the point. The idea is the point, not the >> implementation here.) >> >> integer Primes( int a, int b ) { >> int i; >> for ( i= a; i <= b; ++i ) >> if ( IsPrime( i ) ) >> yield( i ); >> return; >> } >> > >Ah, what you are talking about is often called a generator (for example, >in Python or JavaScript), or perhaps a closure. A generator is somewhat >like a lazily evaluated list, although it could be generalised (for >example, do the parameter values have to be the same for repeat calls as >they are for the initial call?). > >> I'm intentionally being brief, as well as leaving out a short >> discussion of each line. Partly, because longer examples may >> interfere with the central points. Partly, because you >> shouldn't need any discussion and should be able to almost >> instantly see what the above code "means" without that extra. >> It's in a form that should be plain without a manual. >> >> The place to focus isn't so much on examining Primes(), but >> instead more by imagining a wide variety of the types of >> for() loops which may require the mechanism itself. (Not my >> poor example, which may or may not be useful to anyone.) >> >> For example, you might prefer to imagine that Primes() is >> instead a routine that yields all the nodes of some arbitrary >> tree or graph using some very specific walking mechanism. If >> you use your imagination and let it take you for a ride, then >> perhaps the point may be clarified. 
>> >> If that gets you towards where I'm pointing, then the next >> question is how would you implement this in assembly code, >> consistent with c compilers you are aware of and in such a >> way that does _not_ break an existing linker in the process? >> >> On the other hand, if this doesn't do it for you at all -- if >> in short, your imagination isn't moved by those examples >> beyond the short distance I walked with them -- then let me >> commend again Metaware's c/pascal implementations and Hyde's >> AofA documentation before further discussion continues. >> >> But less than the above made my imagination literally spin >> with ideas when I first came across it. So maybe the above >> is enough. I hope so. It's also closely connected in a >> round about way to the idea of nested functions, aka Pascal. >> (If you see the connection, then I think you've probably got >> the larger picture.) >> >> Jon > >I've used generators in Python - they are a very nice way to solve some >kinds of problems. Unfortunately, they are not easy to implement >cleanly in a stack-based language because a general implementation >requires that the generator (or "thunk", if you prefer) has its own >stack for arbitrary local variables and state. Thus most languages that >implement generators use garbage collection.
You are way off the reservation, already. Let me suggest you think about the _implementation_ for a moment. Maybe that will clarify what I'm pointing towards. For one thing, garbage collection has _NOTHING_ whatever to do with it. If you think so, you are far off-point.
>You can get some of the features of generators in C++ using a class to >encapsulate the generator's state in class data. The newer lambda >syntax makes it a little neater, and comes somewhat closer to generators >or closures - but there are limitations. You can't implement them in >general without substantial helper code (as seen in boost's libraries) >or a major change in the structure of the language (including garbage >collection). > >In short, generators (and closures) are a very nice high-level concept - >you need a high level language (or, somewhat ironically, assembly) to >use them, and C / C++ are not suitable.
I recommend that reading I suggested. Take the AofA one first. It's easier to get ahold of. It gets into the details of implementation, but does NOT deal with it as a c level concept. So you will have to fend for yourself there until you can get ahold of Metaware docs. (Or, I might be tempted to copy some of them for this purpose.) Anyway, I can see I've sent you spinning in the wrong direction. Take a breath, read AofA on the topic of thunks and the nearby related chapters to it. That should provide an idea about implementation. Not the _whole_ idea, by the way. As it might be done in c, it involves the concept of nested functions (which you clearly don't yet see) without the use of the specific syntax you are used to seeing for them (it's entirely hidden at the c language level, but explicit at the assembly level.) If you _see_ this much, we are probably on the same page. Jon