Digression on Iterators

Started by Jon Kirwan October 18, 2011
Thought I'd add a discussion of a c implementation iterators,
since we are back on my favorite subject. ;)

Suppose I wanted to iterate through a binary tree that
includes a left and right branch. (Not a graph, let's say.)

[Iterators are even better in ways when the iteration
algorithm isn't so simple (so not so easy to code
repetitively where needed) abd is quite complex and not
recursive -- as it might be, say, in iterating through
overlapping objects in memory; something a c compiler might
want to do, for example.]

But back to the tree.

typedef struct node {
struct node * left;
struct node * right;
int something_interesting;
} * node_T;

void each_leaf ( node_T n ) -> ( node_T n ) {
void P ( node_T n ) {
if ( n == 0 ) return;
if ( n->left == 0 && n->right == 0 ) yield ( n );
P( n->left );
P( n->right );
}
P( n );
}

.
.
.

int c= 0;
for ( n <- each_leaf( mytree ) )
c++;

[The type of 'n' in the for loop is known because of the
declaration (implicit in the definition) of the iterator
function. So it's not necessary to state there.]

The above extension of the function type includes a
specification of the yield values, with type checking. And
I've avoided the use of 'in' as a keyword, keeping better to
the 'c' way of doing things, too. No additional keyword was
used in the above, breaking old code.

The algorithm above does something that isn't possible in c,
normally. It uses a recursive (powerful) method. This is a
huge increase in expressive power in c. Not trivial. It
could be supported only by explicitly using a stack to hold
the state of the recursion, but why should _I_ do that when
the compiler can? (This gets back to the basic argument
about c vs assembly, I suppose. I love arguing both sides to
the middle.)

Let's take it differently, without the nested function but
keeping iterators.

typedef struct node {
struct node * left;
struct node * right;
int something_interesting;
} * node_T;

void each_leaf ( node_T n ) -> ( node_T n ) {
if ( n == 0 ) return;
if ( n->left == 0 && n->right == 0 ) yield ( n );
for ( l_n <- each_leaf( n->left ) )
yield( l_n );
for ( r_n <- each_leaf( n->left ) )
yield( r_n );
}
}

.
.
.

int c= 0;
for ( n <- each_leaf( mytree ) )
c++;

Same thing, no nested function. But the first way separates
the tree walk itself from the yielding in a way the 2nd
doesn't. The 2nd form uses yield three times, for example.

There is no reason why yield cannot yield more than one
value, by the way. For example, in the case of compilers
looking for overlapping objects, I might write:

for ( o, p <- each_overlapping_object( obj ) )
something_interesting( o, p );
There is incredible power to it.

Already in c++, the _this pointer is implicitly passed to
non-static member functions, for example. Using an iterator
merely means including a hidden parameter that is a thunking
object (basically, an address and the activation frame
pointer.)

An iterator can yield everything, including integers,
functions, and even iterators themselves.

Passing iterators requires a special syntax to differentiate
the pointer. Metaware used ! for that. Taking the above
example will help illustrate what I mean.

In:

void each_leaf ( node_T n ) -> ( node_T n );

The c compiler will, as in c++ non-static member functions,
rewrite this as something more like:

void each_leaf ( void yield( node_T n )!, node_T n );

Here, ! indicates this new pointer type. I don't much like
the syntax. I might have preferred:

void each_leaf ( void (!yield)( node_T n ), node_T n );

But that's the way they did it. In effect, each_leaf is
passed a thunking pointer. In the for() loop that uses the
iterator, the iterator is passed the thunk as the first
parameter as indicated above. So there really are two
parameters here. Like in c++ non-member functions where
_this is passed along. The body of the for() loop is
compiled as a nested function of sorts, but where it expects
in the prologue and epilogue to do something useful with the
activation frame pointer, as well. Nothing fancy, really.
Very basic stuff.

Here's a graph traversal prototype looking for strongly
connected components, for example:

void each_SCC (
void each_node() -> ( node_T )!,
void R( node_T )) -> ( node_T )! ) ->
( node_T, void each_node_in_SCC()->( node_T )! );

Yeah, don't bitch at me about the trailing !. Make it better
if you like. Anyway, this might be used as in:

for ( root, each <- each_SCC( each_node, relation ) ) {
printf( " SCC found. It's members are: \n" );
for ( n <- each() )
printf( " %d", n );
printf( "\n" );
}

Very powerful stuff. And the cost is miniscule. It amounts
to a special kind of call that includes the activation frame
pointer. A thunk. So what? It's well worth the cost in
terms of the new expressive power. And it entirely works
cleanly within the simple, basic program model of just one
stack.

I construct this stuff on my own, in c. It provides
expressive power that is unparalleled without breaking
anything. I have yet to find a c compiler that I can't
readily adapt to achieve it. It takes a little digging as to
the implementation in hand and some assembly code. But once
I have the macros set up, it's clean and easy to understand
and use. The use is much like adding an exception module,
really. But without the muss and fuss of a separate linked
list of setjmp buffers, one per thread.

In assembly, I get all the same power and it is dead easy to
use. Of course, when I'm using this kind of power I probably
have an application that I'd rather be doing in c. So I find
using this paradigm much more common when I'm using c as the
language. Which means I'm stuck implementing it. Which
means I don't do it unless I can justify the extra fluff in
front of an angry crowd of programmers with pitch forks. So
I don't use it all that much in the end.

The power of this is exceptional and it breaks nothing in c
if careful about not creating a new keyword and is very easy
to incorporate in existing compilation technology. So I'd
argue, anyway.

Jon

Beginning Microcontrollers with the MSP430

That was far enough over my head that my ears bled a little just from
reading it. It was, however, intriguing. I like having my mind blown. And I
have the feeling that if I just fight a little longer I might win. ,o)

Could I humbly ask for a tad more instruction?

1. Although I generally understand <- and ->, do you have a favorite
tutorial on really explaining what they do and why and how to use them?

2. What is "yield()"? We aren't talking about this:
http://en.wikipedia.org/wiki/Coroutine
are we?

3. Do you have examples of macros for Code Composer C or other C compilers
that implement this? I'm only asking on the assumption that after nailing
down 1 and 2 I will have a hope of understanding the macros.

Finally, are you going to put this stuff up on a web page somewhere and if
not, can I?

--
James Newton
1-970-462-7764

_____

From: m... [mailto:m...] On Behalf Of
Jon Kirwan
Sent: Tuesday, October 18, 2011 12:41
To: MSP430 List
Subject: [msp430] Digression on Iterators

Thought I'd add a discussion of a c implementation iterators,
since we are back on my favorite subject. ;)

Suppose I wanted to iterate through a binary tree that
includes a left and right branch. (Not a graph, let's say.)

[Iterators are even better in ways when the iteration
algorithm isn't so simple (so not so easy to code
repetitively where needed) abd is quite complex and not
recursive -- as it might be, say, in iterating through
overlapping objects in memory; something a c compiler might
want to do, for example.]

But back to the tree.

typedef struct node {
struct node * left;
struct node * right;
int something_interesting;
} * node_T;

void each_leaf ( node_T n ) -> ( node_T n ) {
void P ( node_T n ) {
if ( n == 0 ) return;
if ( n->left == 0 && n->right == 0 ) yield ( n );
P( n->left );
P( n->right );
}
P( n );
}

.
.
.

int c= 0;
for ( n <- each_leaf( mytree ) )
c++;

[The type of 'n' in the for loop is known because of the
declaration (implicit in the definition) of the iterator
function. So it's not necessary to state there.]

The above extension of the function type includes a
specification of the yield values, with type checking. And
I've avoided the use of 'in' as a keyword, keeping better to
the 'c' way of doing things, too. No additional keyword was
used in the above, breaking old code.

The algorithm above does something that isn't possible in c,
normally. It uses a recursive (powerful) method. This is a
huge increase in expressive power in c. Not trivial. It
could be supported only by explicitly using a stack to hold
the state of the recursion, but why should _I_ do that when
the compiler can? (This gets back to the basic argument
about c vs assembly, I suppose. I love arguing both sides to
the middle.)

Let's take it differently, without the nested function but
keeping iterators.

typedef struct node {
struct node * left;
struct node * right;
int something_interesting;
} * node_T;

void each_leaf ( node_T n ) -> ( node_T n ) {
if ( n == 0 ) return;
if ( n->left == 0 && n->right == 0 ) yield ( n );
for ( l_n <- each_leaf( n->left ) )
yield( l_n );
for ( r_n <- each_leaf( n->left ) )
yield( r_n );
}
}

.
.
.

int c= 0;
for ( n <- each_leaf( mytree ) )
c++;

Same thing, no nested function. But the first way separates
the tree walk itself from the yielding in a way the 2nd
doesn't. The 2nd form uses yield three times, for example.

There is no reason why yield cannot yield more than one
value, by the way. For example, in the case of compilers
looking for overlapping objects, I might write:

for ( o, p <- each_overlapping_object( obj ) )
something_interesting( o, p );

There is incredible power to it.

Already in c++, the _this pointer is implicitly passed to
non-static member functions, for example. Using an iterator
merely means including a hidden parameter that is a thunking
object (basically, an address and the activation frame
pointer.)

An iterator can yield everything, including integers,
functions, and even iterators themselves.

Passing iterators requires a special syntax to differentiate
the pointer. Metaware used ! for that. Taking the above
example will help illustrate what I mean.

In:

void each_leaf ( node_T n ) -> ( node_T n );

The c compiler will, as in c++ non-static member functions,
rewrite this as something more like:

void each_leaf ( void yield( node_T n )!, node_T n );

Here, ! indicates this new pointer type. I don't much like
the syntax. I might have preferred:

void each_leaf ( void (!yield)( node_T n ), node_T n );

But that's the way they did it. In effect, each_leaf is
passed a thunking pointer. In the for() loop that uses the
iterator, the iterator is passed the thunk as the first
parameter as indicated above. So there really are two
parameters here. Like in c++ non-member functions where
_this is passed along. The body of the for() loop is
compiled as a nested function of sorts, but where it expects
in the prologue and epilogue to do something useful with the
activation frame pointer, as well. Nothing fancy, really.
Very basic stuff.

Here's a graph traversal prototype looking for strongly
connected components, for example:

void each_SCC (
void each_node() -> ( node_T )!,
void R( node_T )) -> ( node_T )! ) ->
( node_T, void each_node_in_SCC()->( node_T )! );

Yeah, don't bitch at me about the trailing !. Make it better
if you like. Anyway, this might be used as in:

for ( root, each <- each_SCC( each_node, relation ) ) {
printf( " SCC found. It's members are: \n" );
for ( n <- each() )
printf( " %d", n );
printf( "\n" );
}

Very powerful stuff. And the cost is miniscule. It amounts
to a special kind of call that includes the activation frame
pointer. A thunk. So what? It's well worth the cost in
terms of the new expressive power. And it entirely works
cleanly within the simple, basic program model of just one
stack.

I construct this stuff on my own, in c. It provides
expressive power that is unparalleled without breaking
anything. I have yet to find a c compiler that I can't
readily adapt to achieve it. It takes a little digging as to
the implementation in hand and some assembly code. But once
I have the macros set up, it's clean and easy to understand
and use. The use is much like adding an exception module,
really. But without the muss and fuss of a separate linked
list of setjmp buffers, one per thread.

In assembly, I get all the same power and it is dead easy to
use. Of course, when I'm using this kind of power I probably
have an application that I'd rather be doing in c. So I find
using this paradigm much more common when I'm using c as the
language. Which means I'm stuck implementing it. Which
means I don't do it unless I can justify the extra fluff in
front of an angry crowd of programmers with pitch forks. So
I don't use it all that much in the end.

The power of this is exceptional and it breaks nothing in c
if careful about not creating a new keyword and is very easy
to incorporate in existing compilation technology. So I'd
argue, anyway.

Jon



On Thu, 3 Nov 2011 15:27:37 -0700, James wrote:

>That was far enough over my head that my ears bled a little just from
>reading it. It was, however, intriguing. I like having my mind blown. And I
>have the feeling that if I just fight a little longer I might win. ,o)

You make me a little sad. I'm sure that Paul and Richard and
Anders and Michel and others who make some of their living
developing compilers may read your comment and would now say
to me, "See. That's why we don't do it, Jon. Good idea, no
market. Way over everyone's head. Who cares? So stop
bothering us about it."

I suppose it's only that they are too polite or that they've
already relegated me to the heap bin and didn't even notice
(which is worse, but I'd understand too.)

>Could I humbly ask for a tad more instruction?

Hehe. Of course. ;) I will write a little more below in a
footnote. I will try and not restate myself too much.

>1. Although I generally understand <- and ->, do you have a favorite
>tutorial on really explaining what they do and why and how to use them?

That syntax came from Metaware's compiler. I don't know if
you know who these folks were, but Dr. Frank DeRemer
developed (I think as his Ph.D. thesis) the LALR parsing
concept, which was a meaningful advance at the time. See
"Practical translators for LR(k) languages," MIT, Cambridge,
Massachusetts, 1969.

http://en.wikipedia.org/wiki/LALR_parser_generator

Dr. DeRemer headed up Metaware. Their earlier products were
compiler compilers, before getting into specific compilers,
per se. Dr. Tom Pennello, the guy I spent more time talking
to, went on to write some tools for compiler compilers and an
article called "Efficient Computation of LALR(1) Look-Ahead
Set," TOPLAS, vol 4, no 4, in October 1982. Which was around
the time, I think, that Metaware was becoming a reality of
sorts. This was one of the things I enjoyed learning at the
time. And Tom was generous to a fault with his time and
would repair problems I found in less than 24 hours, most
times.

I didn't like the religious trappings they stuffed into their
manuals. It felt "unbusiness like" to me. But I was always
treated very well and the fact that I would occasionally trip
up on some pithy Bible verse while reading their manuals
didn't annoy me too much. They never brought it up to me on
the phone. Anyway, I was learning way too much from them to
really care. Except that I felt it might put an unneeded
kink into their business model.

The specific syntax was chosen in the same vein that guided
the C standards committee members towards C89, I felt.
Namely, that they didn't introduce any new keywords. I
believe they scouted around for symbols that could be
'repurposed' and determined that these wouldn't unduly
complicate their parsers and wouldn't break code that already
existed. That was the impression I took from talking about
it, back then. There could have been a lot more to the
story, though.

>2. What is "yield()"? We aren't talking about this:
>http://en.wikipedia.org/wiki/Coroutine
>are we?

Yes and no. The term 'coroutine' is very broad. A program
running on Windows is a coroutine in every real sense of the
word. But this is nothing like the yield() feature, as I was
discussing it. yield() is a very specific semantic (and like
C++ inheritance handled with vtables, nearly always
implemented in a specific way. So it is like saying "a
3-inch Hillman 47850 Star Drive Deck Screw" as opposed to
just saying "a screw."

I'll explain more under #3.

>3. Do you have examples of macros for Code Composer C or other C compilers
>that implement this? I'm only asking on the assumption that after nailing
>down 1 and 2 I will have a hope of understanding the macros.

It requires some assembly code. There is no escaping it that
I'm aware of. You can get kind of close with setjmp and
longjmp, but not close enough because longjmp unwinds the
stack -- which is a problem.

>Finally, are you going to put this stuff up on a web page somewhere and if
>not, can I?

I suppose I could do that. It just takes work to document
and explain. Porting is required (due to assembly and a need
to know details about activation frames and register use.)
And I had no idea there was any strong interest. But if
there is, I can pony express something up onto a web page.

This whole thing would be a whole lot better added to gcc. If
I connect up with someone who can help speed up my learning
curve a bit, I might be able to add it. Though I suspect I'd
have put in less work than they in the end. But I'm not
completely ignorant, either.

Jon

P.S. On the implementation details.

Let's take a simple case: walking a tree using infix
traversal. You start out designing this with a sorted binary
tree, but you might change your mind at some point and make
it a sorted multiway tree.

For now, the structure looks like:

typedef struct node_s {
int sortkey;
void * morestuff;
struct node_s * left;
struct node_s * right;
} node_t;

Let's say you are in some function and suddenly need to visit
the nodes in sorted order, so you want to use an infix
traversal of the nodes. In this case, let's just say that
this function prints out a debug file listing the nodes in
sorted order. Before I get to a more traditional approach,
which has its own problems, let's decide that this routine
uses only local instance variables, no statics. It might get
called pre-emptively, since let's also say for the moment,
you have a nice thread-based, pre-emptive operating system
running your embedded code. (So it's important that there
are no statics, or that if there are statics used that you
guard them with a critical region. In this case, no
statics.) The tree can be of an arbitrary depth. How do you
solve this?

Well, while traversing down one side it will be helpful to
save parent nodes along the way so you can continue along. So
you might create a stack object, pushing those nodes for a
later visit. But because this is of arbitrary depth, you
don't use a fixed sized array. You decide to use a linked
list of nodes. To do that, you create another structure:

typedef struct nodelist_s {
node_t * node;
struct nodelist_s * next;
} nodelist_t;

Then within some body part of the function that needs to walk
the tree:

node_t *n= mytree;
nodelist_t *list= NULL, *a;
for ( ; ; ) {
while ( n != NULL ) {
a= (nodelist_t *) malloc( sizeof( nodelist_t ) );
a->node= n;
a->next= list;
list= a;
n= n->left;
}
if ( list == NULL )
break;
a= list;
list= a->next;
n= a->node;
fprintf( flun, "%d\n", n->sortkey );
free( a );
n= n->right;
}

All I really wanted to do was something like this:

for ( n <- treewalk( mytree ) )
fprintf( flun, "%d\n", n->sortkey );

But you see what I had to do? Worse, if I needed to walk
this tree somewhere else, I'd have to replicate all that dang
code over and over again. And The Great Sky Tree forbid that
I should ever _change_ that tree structure to a multiway!

Here's how a smarter C programmer might handle this:

void treewalk( node_t * t, void (*f)( node_t * ) ) {
if( t != NULL ) {
treewalk( t->left, f );
(*f)( t );
treewalk( t->right, f );
}
}

In that case, though, you'd need to create a function called
printtree:

void printtree( node_t * n ) {
fprintf( flun, "%d\n", n->sortkey );
return;
}

And then do the loop like this:

treewalk( mytree, printtree );

The only problem here is that for each and every kind of
thing you might want to do with the nodes, you'd need to
pollute the namespace with a new function and put in the code
you need there.

So, suppose I now just wanted to count the nodes? Well:

static int count;
void counttree( node_t * n ) {
++count;
return;
}

And then handle it like this:

count= 0;
treewalk( mytree, counttree );
// do something with count!

Problem is, you've just been forced to create a static
variable, count, create a new function to update it, must
initialize it before calling treewalk, pollute the namespace
with count and counttree(), and it's not safe for preemptive
threads since you now have a static variable involved. Oh,
and besides all of that, countree() doesn't even have access
to any of the local variables within the code that is wanting
to walk the tree. Suppose you wanted to do something a lot
more complicated that requires access to some local variable
information, as well? You'd somehow have to export that
data. How?

Well, something like this?

void treewalk( node_t *t, void (*f)(node_t *), void *x ) {
if( t != NULL ) {
treewalk( t->left, f );
(*f)( t, x );
treewalk( t->right, f );
}
}

And now every function you create that can be passed to
treewalk must also support a pointer to void that will be
recast to whatever data structure whose pointer you also now
have to pass in. And you now have to pack it all into a
structure, as well. It can't just be a loose list of
variables or else you'd have to support a variable length
parameter list on top of everything else.

Point is, it is damned complicated to do all this. WAY MORE
complicated than it needs to be. And making it thread safe
is a downright pain in the ass on top of everything else.

Again, why can't I just write:

int count= 0;
for ( n <- treewalk( mytree ) )
++count;
// do something with count

It's so much more sensible. It's clean. It's neat. It is
readable. It is maintainable. It is short. And it is safe
as a thread.

(Now. Try writing a compiler without this! I _know_ that
those folks who actually have spent serious time inside a
compiler know just how much this could help.)

The basic idea here in data abstraction regarding this tree
concept is to hide the details of walking it from those parts
of the code that need to walk it. They should simply walk
through it. Not have to know how that walking gets done.
This way you can make changes in the structure and then
simply make ONE change to the walking code and be done with
it.

Now, in C++ STL, they go to great extents to provide very
heavy weight types of iterators that can walk containers and
templates to help (with partial template specialization you
hope to gosh also exists.) But C++ really isn't a good idea
for embedded work I do. And I usually couldn't afford the
STL, anyway. And all that mess is just to make something so
darned simple to conceive _and_ implement?

Which gets me to the implementation.

What does C (or should C) do with:

int count= 0;
for ( n <- treewalk( mytree ) )
++count;

Well, it certainly needs to allocate count on the local
activation frame -- whether in a register or on the stack.
(Though I suspect that most C compilers implementing this
would likely place these locals on the stack, as you will see
why shortly.) The single line "++count;" becomes a
subroutine. I'll write this in 32-bit x86 code:

_special_for_body_function:
push ebp
mov ebp, [esp+8]
inc count[ebp]
pop ebp
ret

The for() loop would look something like:

push mytree
push _special_for_body_function
push ebp
call treewalk
add esp, 8

That's it. A hidden parameter is passed to treewalk, the
code of which probably looks like:

void treewalk( node_t * n ) -> ( node_t * ) {
void P ( node_t * n ) {
if ( n == 0 ) return;
if ( n->left == 0 && n->right == 0 ) yield ( n );
P( n->left );
P( n->right );
}
P( n );
}

That hidden parameter is EBP, which is just a copy of the
activation frame pointer being used by the caller's code, and
the address of the for() loop body code (which I called above
as _special_for_body_function. P() has access to it as it is
nested within treewalk(). So when yield(n) takes place the
code looks like this:

push n
push hiddenEBP
call _special_for_body_function
add esp, 8

This just calls the for() loop body as a function, but also
passes along the EBP of the caller so that the for() loop
body code has access to the local variables there as well as
any passed parameters within the yield() parameter list. This
connects the two together, beautifully. The for() loop body
does its thing and then returns from the yield() call back
into the full context of the iterator code, which still has
all of its instance variables.

It's really slick.

And fast.

And it doesn't break even old style linkers (doesn't need
anything special), require special stacks, etc. It _does_
require that the compiler compile out bodies as functions.
But remember the example above? I had to construct C
functions to pass to the treewalk() routine when I don't have
access to iterator semantics within the C syntax. But I have
to do it manually. Why? Why shouldn't the compiler do that
automatically for me?

Is the idea clearer? Is my anger at being forced to do this
over and over on my own for no good reason at all, palpable
now? ;)

Jon
On Thu, 03 Nov 2011 23:54:34 -0700, I wrote:

> void treewalk( node_t *t, void (*f)(node_t *), void *x ) {
> if( t != NULL ) {
> treewalk( t->left, f );
> (*f)( t, x );
> treewalk( t->right, f );
> }
> }

Sorry, that's not quite correct. This would be better:

void treewalk( node_t *t, void (*f)(node_t *, void *), void
*x ) {

for the first line.

Jon
On Thu, 03 Nov 2011 23:54:34 -0700, I wrote:

> void treewalk( node_t *t, void (*f)(node_t *), void *x ) {
> if( t != NULL ) {
> treewalk( t->left, f );
> (*f)( t, x );
> treewalk( t->right, f );
> }
> }

Should be:

void treewalk( node_t *t, void (*f)(node_t *, void *), void
*x ) {
if( t != NULL ) {
treewalk( t->left, f, x );
(*f)( t, x );
treewalk( t->right, f, x );
}
return;
}
On 04/11/11 07:54, Jon Kirwan wrote:
> On Thu, 3 Nov 2011 15:27:37 -0700, James wrote:
>
> >That was far enough over my head that my ears bled a little just from
> >reading it. It was, however, intriguing. I like having my mind blown.
> And I
> >have the feeling that if I just fight a little longer I might win. ,o)
>
> You make me a little sad. I'm sure that Paul and Richard and
> Anders and Michel and others who make some of their living
> developing compilers may read your comment and would now say
> to me, "See. That's why we don't do it, Jon. Good idea, no
> market. Way over everyone's head. Who cares? So stop
> bothering us about it."
>

I doubt very much that this is the reason that iterator, yield (or more
general coroutine) support is missing from the development tools. After
all, there are plenty of things in C that are above the heads of most C
programmers (and a few C compiler writers), such as the details of
strict aliasing, ordering of volatile and non-volatile memory accesses,
etc. And there are /many/ features of C++ that are above the heads of
the users.

You know as well as me that the cost of implementing such a major change
to the C (or C++) programming paradigm is very high, as are the barriers
towards getting it included as part of the C (or C++) standards.

Users are not the problem. If the features are useful, efficiently
implemented, widely available (not just on one weird archaic compiler),
safe to use (low probability of accidentally writing incorrect code),
and well documented with examples and tutorials, then advanced
programmers will learn them and use them. Sure, it will be "above their
heads" at the start - until they learn them and get used to them.

Iterators and yield were added to Python, and people took to that
without problems. But for many reasons it was far easier to add such
features to Python than it would be to C (or C++).
There is hope, however. The standard method of implementing major
changes seems to be to put together a C++ class/template implementation
in the Boost libraries. Usually that means an incredibly ugly, and
often inefficient, implementation - but not too bad to use. If the
library is popular and often used, then it gets proposed for a "native"
implementation in future C++ standards so that it can get a better
implementation and perhaps a nicer syntax. That seems to have been
successful with lambda functions at least, and I'm sure it could work
for coroutines.

Hi Jon,

> The only problem here is that for each and every kind of thing you
> might want to do with the nodes, you'd need to pollute the namespace
> with a new function and put in the code you need there.
>
> So, suppose I now just wanted to count the nodes? Well:
>
> static int count;
> void counttree( node_t * n ) {
> ++count;
> return;
> }
>
> And then handle it like this:
>
> count= 0;
> treewalk( mytree, counttree );
> // do something with count!
>
> Problem is, you've just been forced to create a static variable, count,
> create a new function to update it, must initialize it before calling
> treewalk, pollute the namespace with count and counttree(), and it's
> not safe for preemptive threads since you now have a static variable
> involved. Oh, and besides all of that, countree() doesn't even have
> access to any of the local variables within the code that is wanting to
> walk the tree. Suppose you wanted to do something a lot more
> complicated that requires access to some local variable information, as
> well? You'd somehow have to export that data. How?
>
> Well, something like this?
>
> void treewalk( node_t *t, void (*f)(node_t *), void *x ) {
> if( t != NULL ) {
> treewalk( t->left, f );
> (*f)( t, x );
> treewalk( t->right, f );
> }
> }
>
> And now every function you create that can be passed to treewalk must
> also support a pointer to void that will be recast to whatever data
> structure whose pointer you also now have to pass in. And you now have
> to pack it all into a structure, as well. It can't just be a loose
> list of variables or else you'd have to support a variable length
> parameter list on top of everything else.

This just shows how bad C is for this particular type of problem. In Lisp,
possibly my most favoured language of all time (though Smalltalk comes
close), there is no problem, it is highly elegant. :-) I would say that
the languages I liked most, of the Lisp/Scheme class, are T and Common Lisp.

http://en.wikipedia.org/wiki/T_(programming_language)

C++11 now has lambdas. As lambdas can access free variables (in
Lisp/Church/lambda calculus speak), your count example actually comes out
fairly well using a lambda.

Everything in C++ with lambdas is ugly compared to Lisp. Currying is
abysmal in C++.

Everybody should be able to speak at least three different classes of
programming language to be rounded as a programmer. You don't need to be
proficient, but learning another class of programming language expands your
capabilities and allows you to think differently in the language you
practice most.

My particular approximate timeline (well-known languages only):

FORTRAN -> BASIC -> Assembly -> Pascal -> LISP+Scheme -> Modula-2 -> C ->
C++
Algol-68 -> FORTH -> Java
-> COBOL -> Prolog
-> Smalltalk

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
SolderCore running Defender... http://www.vimeo.com/25709426

David,

> There is hope, however. The standard method of implementing major
> changes seems to be to put together a C++ class/template implementation
> in the Boost libraries. Usually that means an incredibly ugly, and
> often inefficient, implementation - but not too bad to use. If the
> library is popular and often used, then it gets proposed for a "native"
> implementation in future C++ standards so that it can get a better
> implementation and perhaps a nicer syntax. That seems to have been
> successful with lambda functions at least, and I'm sure it could work
> for coroutines.

Lambdas in C++11 are pretty ugly compared to lambdas in other languages.
However, it's hard to come up with something pleasing for lambda in C++, so
what we now have is probably the best we can do. Still, putting lambdas to
use also needs a good understanding of the features offered by templates so
that you can deploy templates and lambdas together to get some things done.
:-(

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
SolderCore running Defender... http://www.vimeo.com/25709426



On Fri, 4 Nov 2011 12:01:48 -0000, Paul wrote:

>This just shows how bad C is for this particular type of problem.

But Paul, it is such a COMMON problem! Trees??? Come on.
Who doesn't have to do one, once in a while? With it being
so hard to do in C, folks go to great lengths to avoid them
even when that data structure is the natural and correct fit
for the problem at hand.

I completely understand the early K&R elements as to why the
concept wasn't included -- it was being developed for an
operating system port soon to come and this just wasn't on
the radar scope then. But it seems inexcusable to me that
C89 missed including the concept. It had already been
implemented by an existing compiler team, fielded as a
product in some use by the time, and it was a good team who
knew their business and understood the entire context well.

I don't want to over-emphasize trees, per se. There are so
many other situations where this very simple concept fits
very well, as I'm sure you are aware. The SCC example I gave
before is only one of many such.

But this lack terribly hobbles the language and any language
missing it, including c++.

>In Lisp,
>possibly my most favoured language of all time (though Smalltalk comes
>close), there is no problem, it is highly elegant. :-) I would say that
>the languages I liked most, of the Lisp/Scheme class, are T and Common Lisp.

But I would _never_ consider using Lisp in many of my
embedded projects. And I'm sure you can do a better job
listing out more good reasons why than I could manage.
>http://en.wikipedia.org/wiki/T_(programming_language)

Thanks. I wasn't aware of it.

>C++11 now has lambdas. As lambdas can access free variables (in
>Lisp/Church/lambda calculus speak), your count example actually comes out
>fairly well using a lambda.

Would you care to provide a quality example? I am only
vaguely familiar with lambdas, as I haven't yet picked up c#
for anything serious yet (I do NOT enjoy being a Windows
programmer, though I am still forced to it at times, but I am
developing .NET code right now and probably will delve in
soon.)

>Everything in C++ with lambdas is ugly compared to Lisp. Currying is
>abysmal in C++.

I think I spotted something like currying in f#, when
skimming it some months back.

>Everybody should be able to speak at least three different classes of
>programming language to be rounded as a programmer. You don't need to be
>proficient, but learning another class of programming language expands your
>capabilities and allows you to think differently in the language you
>practice most.

While I don't disagree with the thrust you make, it's not
addressing the point I'm making. I use assembly and C for
most of my embedded programming. You know why better than I
do. I don't have the option of using Lisp or C++11 or c# or
f#. Sorry. Those aren't possible, nor will they be. But I
do have to deal with the real kinds of problems I've laid out
and do so in C. I have in the past and I will in the future.

The point is that the semantic I laid out is so useful and so
simple to maintain and explain and use and so powerful. It
doesn't violate for a fraction of a second any of C's own
motivations which are simple and efficient compiler tools. It
doesn't break any existing C code. And it solves very
frequently encountered, REAL problems that everyday C
programmers encounter. And it does ALL OF THIS in a fashion
that would work with the smaller micros around just like C
itself does. It breaks nothing at all and isn't even that
difficult to modify an existing compiler (so said Pennello
when I spoke with him about it -- that was in 1986, or so, so
we are NOT talking about the 25 years of later improvements
and refinements to linkers and compilers. It was easy then.)

You could differentiate your product, Paul, by including
syntax for the semantic. ;)

>My particular approximate timeline (well-known languages only):
>
>FORTRAN -> BASIC -> Assembly -> Pascal -> LISP+Scheme -> Modula-2 -> C ->
>C++
>Algol-68 -> FORTH -> Java
> -> COBOL -> Prolog
> -> Smalltalk

Mine is almost identical, except that C was in 1978, so would
be put before Pascal but after assembly code. I've used
Lisp, but not very much -- at Lockheed in Burbank, CA. And I
used Algol-60 as well as Algol-68 when it became available.
Forth would be around the late 1970's, as well. RPG-II would
be included (IBM System 3, memory serving, and some Bouroughs
computer I forget the name of.) COBOL, of course. Fortran
included the -II, the -IV, and the -77 varieties. Nothing
since. No smalltalk outside of reading articles. Never used
it. Lots of C++ since around 1989-ish. Also, my list
includes PL/M and lots of Bliss-32.

Jon
> On Fri, 4 Nov 2011 12:01:48 -0000, Paul wrote:
>
> >This just shows how bad C is for this particular type of problem.
>
> But Paul, it is such a COMMON problem! Trees??? Come on.
> Who doesn't have to do one, once in a while? With it being so hard to
> do in C, folks go to great lengths to avoid them even when that data
> structure is the natural and correct fit for the problem at hand.
>
> I completely understand the early K&R elements as to why the concept
> wasn't included -- it was being developed for an operating system port
> soon to come and this just wasn't on the radar scope then. But it
> seems inexcusable to me that
> C89 missed including the concept. It had already been implemented by
> an existing compiler team, fielded as a product in some use by the
> time, and it was a good team who knew their business and understood the
> entire context well.

Perhaps this would have been preferred over the wreck that is the imaginary
floating point type of C99!

If you're keen for this, the way to make it happen is to push the national
body, make an implementation, and try to gain acceptance. Just talking
about it is not enough to persuade anybody to implement it on your behalf,
or even adopt it, I'm afraid.

> >In Lisp,
> >possibly my most favoured language of all time (though Smalltalk comes
> >close), there is no problem, it is highly elegant. :-) I would say
> >that the languages I liked most, of the Lisp/Scheme class, are T and
> Common Lisp.
>
> But I would _never_ consider using Lisp in many of my embedded
> projects. And I'm sure you can do a better job listing out more good
> reasons why than I could manage.

Yes, but I wasn't commenting on using Lisp as an embedded programming
language. I'd agree that Lisp in a small embedded system doesn't work well.
However, my first exposure to Lisp was on a Commodore PET with a humble 6502
running the Lisp interpreter in RAM--not fast enough for anything serious,
but fun to dabble with.

> Thanks. I wasn't aware of it.
>
> >C++11 now has lambdas. As lambdas can access free variables (in
> >Lisp/Church/lambda calculus speak), your count example actually comes
> >out fairly well using a lambda.
>
> Would you care to provide a quality example? I am only vaguely
> familiar with lambdas, as I haven't yet picked up c# for anything
> serious yet (I do NOT enjoy being a Windows programmer, though I am
> still forced to it at times, but I am developing .NET code right now
> and probably will delve in soon.)

There are loads of C++11 lambda examples which you can find using Google.
You create a lambda with a capture list, a parameter list, and the body.
You still need an iterator, but the lambda means that you don't need to
create a stand-alone function, allocate a static, and so on. I use
iterators, templates and the like a *lot* in our linker in order to optimize
and rewrite code, and lambdas would help here. If I used coroutines, my
code would not be portable. However, C++11 is new, so there's no point
rewriting existing, working code!

> >Everything in C++ with lambdas is ugly compared to Lisp. Currying is
> >abysmal in C++.
>
> I think I spotted something like currying in f#, when skimming it some
> months back.
>
> >Everybody should be able to speak at least three different classes of
> >programming language to be rounded as a programmer. You don't need to
> >be proficient, but learning another class of programming language
> >expands your capabilities and allows you to think differently in the
> >language you practice most.
>
> While I don't disagree with the thrust you make, it's not addressing
> the point I'm making. I use assembly and C for most of my embedded
> programming. You know why better than I do. I don't have the option
> of using Lisp or C++11 or c# or f#. Sorry.

Yes, you do. You can use them whenever you want to program for fun on a PC.
I'm just comparing high-level features here, not the application to embedded
systems. There are so many things about that you can play with if you just
had the time...

> Those aren't possible, nor
> will they be. But I do have to deal with the real kinds of problems
> I've laid out and do so in C. I have in the past and I will in the
> future.

> The point is that the semantic I laid out is so useful and so simple to
> maintain and explain and use and so powerful. It doesn't violate for a
> fraction of a second any of C's own motivations which are simple and
> efficient compiler tools. It doesn't break any existing C code. And it
> solves very frequently encountered, REAL problems that everyday C
> programmers encounter. And it does ALL OF THIS in a fashion that would
> work with the smaller micros around just like C itself does. It breaks
> nothing at all and isn't even that difficult to modify an existing
> compiler (so said Pennello when I spoke with him about it -- that was
> in 1986, or so, so we are NOT talking about the 25 years of later
> improvements and refinements to linkers and compilers. It was easy
> then.)
>
> You could differentiate your product, Paul, by including syntax for the
> semantic. ;)

Nobody would use it as it has no industry backing; there is no business
motivation to do this. It doesn't matter how hard you argue, you're pushing
uphill. If you really need it, do what I do, get down and write your own
tools!

--
Paul Curtis, Rowley Associates Ltd http://www.rowley.co.uk
SolderCore running Defender... http://www.vimeo.com/25709426