EmbeddedRelated.com
Forums

Making Fatal Hidden Assumptions

Started by CBFalconer March 6, 2006
Andrew Reilly wrote:
> On Tue, 14 Mar 2006 03:13:08 +0000, Dik T. Winter wrote:
>
> > In article <pan.2006.03.13.22.35.43.776370@areilly.bpc-users.org> Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
> > > On Mon, 13 Mar 2006 15:31:35 +0000, Dik T. Winter wrote:
> > ...
> > > > On the other hand for every machine instruction there should be a
> > > > construct in the assembler to get that instruction. With that in
> > > > mind C doesn't fit either.
> > >
> > > Well, since we're talking about a "universal assembler" (a reasonably
> > > commonly used term), that's obviously something different from the usual
> > > machine-specific assembler, which does indeed usually have that property.
> > > (Although I've met assemblers where the only way to get certain
> > > instructions was to insert the data for the op-code in-line. Instruction
> > > coverage sometimes lags behind features added to actual implementations.)
> >
> > I have met one such, but not for the reason you think. In this case the
> > assembler knew the instruction, and translated it, but completely wrong.
> > Apparently an instruction never used, but nevertheless published. And I
> > needed it.
> >
> > But what is (in your opinion) a universal assembler? What properties should
> > it have to contrast it with a HLL?
>
> I posted a page-long description of what I conceive a universal assembler
> to be in a previous message in the thread. Perhaps it didn't get to your
> news server? Google has it here:
> http://groups.google.com/group/comp.lang.c/msg/a91a898c08457481?hl=en&
>
> The main properties that it would have, compared to a C (some other HLLs
> do have some of these properties) are:
>
> a) Rigidly defined functionality, without "optimization", except for
> instruction scheduling, in support of VLIW or (some) superscalar cores.
> (Different ways of expressing a particular algorithm, which perform more
> or less efficiently on different architectures should be coded as such,
> and selected at compile/configuration time, or built using
> meta-programming techniques.) This is opposed to the HLL view which is
> something like: express the algorithm in a sufficiently abstract way and
> the compiler will figure out an efficient way to code it, perhaps. Yes,
> compilers are really quite good at that, now, but that's not really the
> point. This aspect is a bit like my suggestion in the linked post as
> being something a bit like the Java spec, but without objects. Tao's
> "Intent" VM is perhaps even closer. Not stack based. I would probably
> still be happy if limited common-subexpression-elimination (factoring) was
> allowed, to paper-over the array index vs pointer/cursor coding style vs
> architecture differences.
If I can summarize this as:
-- the source code changes when the underlying processor architecture changes
then I agree this is a key reason why I consider C a glorified assembler.
>
> b) Very little or no "language" level support for control structures or
> calling conventions, but made-up-for with powerful compile-time
> meta-programming facilities, and a standard "macro" library that provides
> most of the expected facilities found in languages like C or Pascal. Much
> of what are now thought of as compiler implementation features would wind
> up in macro libraries. The advantage of this would be that code could be
> written to *rely* on specific transformation performance and existence,
> instead of just saying "hope that your compiler is clever enough to
> recognize this idiom", in the documentation. It would also make possible
> the sorts of small code factorizations that happen all the time in
> assembly language, but which single-value-return, unnested function call
> conventions in C make close to impossible. Or different coding styles,
> like threaded interpreters, reasonable without language extensions.
Interesting features. I'm not sure how different multiple value returns would be from values returned via reference parameters (pointers). It sounds like a good idea.
>
> I imagine something like LLVM (http://llvm.cs.uiuc.edu/), but with a
> powerful symbolic compile-time macro language on top (eg scheme...), an
> algebraic (infix) operator syntax, and an expression parser.
>
> In the mean time, "C", not as defined by the standard, but as implemented
> in the half dozen or so compilers that I regularly use, is not so far from
> what I want, to make me put in the effort to build my universal assembler
> myself.
>
> Cheers,
>
> --
> Andrew
Thanks for the contribution to the discussion. Ed
"Al Balmer" <albalmer@att.net> wrote in message 
news:sb0s02l3tqn0ifj6jb3geri6qveeb4qnjr@4ax.com...
> On Wed, 08 Mar 2006 08:14:37 +1100, Andrew Reilly
> <andrew-newspost@areilly.bpc-users.org> wrote:
>
>>On Tue, 07 Mar 2006 13:29:11 -0500, Eric Sosman wrote:
>>> The Rationale says the Committee considered defining
>>> the effects at both ends (bilateral dispensations?), but rejected it for
>>> efficiency reasons. Consider an array of large elements -- structs of
>>> 32KB size, say. A system that actually performed hardware checking of
>>> pointer values could accommodate the one-past-the-end rule by allocating
>>> just one extra byte after the end of the array, a byte that the special
>>> pointer value could point at without setting off the hardware's alarms.
>>> But one-before-the-beginning would require an extra 32KB, just to hold
>>> data that could never be used ...
>>
>>So the standards body broke decades of practice and perfectly safe and
>>reasonable code to support a *hypothetical* implementation that was so
>>stupid that it checked pointer values, rather than pointer *use*?
>>Amazing.
>
> Decades of use? This isn't a new rule.
>
> An implementation might choose, for valid reasons, to prefetch the
> data that pointer is pointing to. If it's in a segment not allocated
> ...
If a system traps on a prefetch, it's fundamentally broken. However, a system that traps when an invalid pointer is loaded is not broken, and the AS/400 is the usual example. Annoying, but not broken.

Why IBM did it that way, I'm not sure, but my guess is they found it was cheaper to do validity/permission checks when the address was loaded than when it was used since the latter has a latency impact.

S

-- 
Stephen Sprunk, CCIE #3723, K5SSS
"Stupid people surround themselves with smart people. Smart people
surround themselves with smart people who disagree with them."
  --Aaron Sorkin
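To make the one-past-the-end rule quoted above concrete, here is a minimal sketch of a backward walk over an array that never forms a pointer before the first element. The function name `sum_backward` is hypothetical and chosen only for illustration. The idea is to start from `a + n`, which is valid to form though not to dereference, and to decrement before each use, so that a machine that checks pointer values when they are loaded would have nothing to complain about.

```c
#include <stddef.h>

/* Sum an array from its last element to its first.
 * The pointer starts one past the end (valid to form, not to
 * dereference) and is decremented before each read, so the
 * value (a - 1) is never computed. */
static long sum_backward(const int *a, size_t n)
{
    long total = 0;
    const int *p = a + n;   /* one past the end */

    while (p != a) {
        --p;                /* now points at a real element */
        total += *p;
    }
    return total;
}
```

The more familiar `for (p = a + n - 1; p >= a; --p)` form ends by computing `a - 1`, which is the case the standard leaves undefined; on most flat-memory machines it happens to work, which is why the assumption stays hidden.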
"Rod Pemberton" <do_not_have@sorry.bitbucket.cmm> wrote in message 
news:dumjdr$blip$1@news3.infoave.net...
> "Gerry Quinn" <gerryq@DELETETHISindigo.ie> wrote in message
> news:MPG.1e78c38d984332398ac06@news1.eircom.net...
>> In article <0001HW.C033E3BE0194AEB7F0386550@news.verizon.net>,
>> randyhoward@FOOverizonBAR.net says...
>> > Does everything have to become a racism experiment?
>>
>> Of course there are those who object to every figure in which the
>> adjective 'black' has negative connotations.
>
> True. Mostly black people.
In my experience (which is more limited to recent years than many others' here), it is typically do-gooder whites that are offended by words like "black" or "oriental" or "Indian" (referring to the US domestic variety).

I recall an interview of Nelson Mandela by (I think) Dan Rather shortly after the former's first election, and he was asked "How does it feel to be the first African-American president of South Africa?" Mandela was understandably confused, but the interviewer simply couldn't bring himself to say the word "black". Mandela finally figured it out and answered, but he had to come away from that thinking all Americans are complete dolts.

S

-- 
Stephen Sprunk, CCIE #3723, K5SSS
"Stupid people surround themselves with smart people. Smart people
surround themselves with smart people who disagree with them."
  --Aaron Sorkin
On Tue, 21 Mar 2006 14:51:21 -0800, Ed Prochak wrote:
> If I can summarize this as:
> -- the source code changes when the underlying processor architecture
> changes
> then I agree this is a key reason why i consider C a glorified
> assembler.
That's pretty close. I think that the link to Dan Bernstein's page on the topic said it better than me: you can use different code and different approaches where it matters to both the program and to the target processor, but you can also use a simpler, generic approach that will just work anywhere, when absolute maximum performance isn't necessary.
>> b) Very little or no "language" level support for control structures or
>> calling conventions, but made-up-for with powerful compile-time
>> meta-programming facilities, and a standard "macro" library that
>> provides most of the expected facilities found in languages like C or
>> Pascal. Much of what are now thought of as compiler implementation
>> features would wind up in macro libraries. The advantage of this would
>> be that code could be written to *rely* on specific transformation
>> performance and existence, instead of just saying "hope that your
>> compiler is clever enough to recognize this idiom", in the
>> documentation. It would also make possible the sorts of small code
>> factorizations that happen all the time in assembly language, but which
>> single-value-return, unnested function call conventions in C make close
>> to impossible. Or different coding styles, like threaded interpreters,
>> reasonable without language extensions.
>
> Interesting features. I'm not sure how much different multiple value
> returns would be from values returned via reference parameters
> (pointers). it sounds like a good idea.
The significant difference is that reference parameters (pointers) can't be in registers. (Not to mention the inefficiency of repeatedly pushing the reference onto the call stack...)

Say you have a few to half a dozen pieces of state in some algorithm, and the algorithm operates through a pattern of "mutations" of that state, such that some or all of the state changes as a result of each operation. The only way to code that in C is either to write out the code that comprises the element operations of each pattern long-hand, or use preprocessor macros.

The most obvious concrete example of this sort of thing is the pattern where you have one or more "cursors" into a data structure, and code that walks through it, producing results at the same time. You want your "codelets" to return both their result *and* change the cursor to point to the next element in the list to be processed. In C, you can't have both the result and the pointer in registers, but that's how you would code it in assembly.

Cheers,

-- 
Andrew
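The cursor pattern described above can be sketched in C in two ways. All names here (`next_by_pointer`, `next_by_value`, `struct step`, and the two `sum_*` callers) are hypothetical and exist only for illustration. The first codelet updates the cursor through a pointer, which means the caller's cursor has its address taken. The second returns the result and the advanced cursor together in a small struct. Whether either form ends up entirely in registers is not something the C standard promises; it depends on the compiler, the optimization level, inlining, and the platform's calling convention, so treat this as a sketch of the two coding styles rather than a statement about generated code.

```c
#include <stddef.h>

/* Style 1: the result is the return value; the cursor is advanced
 * through a pointer to the caller's variable. */
static int next_by_pointer(const int **cursor)
{
    int value = **cursor;
    ++*cursor;
    return value;
}

/* Style 2: result and new cursor come back together by value. */
struct step {
    int value;
    const int *next;
};

static struct step next_by_value(const int *cursor)
{
    struct step s = { *cursor, cursor + 1 };
    return s;
}

/* Walk an array with style 1. */
static long sum_by_pointer(const int *a, size_t n)
{
    const int *cursor = a;
    const int *end = a + n;
    long total = 0;

    while (cursor != end)
        total += next_by_pointer(&cursor);   /* address of cursor escapes */
    return total;
}

/* Walk the same array with style 2. */
static long sum_by_value(const int *a, size_t n)
{
    const int *cursor = a;
    const int *end = a + n;
    long total = 0;

    while (cursor != end) {
        struct step s = next_by_value(cursor);
        total += s.value;
        cursor = s.next;                     /* cursor's address is never taken */
    }
    return total;
}
```

Some calling conventions do return a two-word struct like `struct step` in a pair of registers, which gets close to the assembly idiom, but that is a property of the particular ABI and not of the language.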
On Tue, 21 Mar 2006 19:02:55 -0600, Stephen Sprunk wrote:
> If a system traps on a prefetch, it's fundamentally broken. However, a > system that traps when an invalid pointer is loaded is not broken, and the > AS/400 is the usual example. Annoying, but not broken.
And I still say that constraining C for everyone so that it could fit the AS/400, rather than making C-on-AS/400 jump through a few more hoops to match traditional C behaviour, was the wrong trade-off. I accept that this may well be a minority view.

-- 
Andrew
Stephen Sprunk wrote:
>
... snip ...
>
> Why IBM did it that way, I'm not sure, but my guess is they found
> it was cheaper to do validity/permission checks when the address
> was loaded than when it was used since the latter has a latency
> impact.
A single pointer check can validate the pointer for multiple dereferences. This is much cheaper than checking it at each dereference.

-- 
"Churchill and Bush can both be considered wartime leaders, just
as Secretariat and Mr Ed were both horses." - James Rhodes.
"We have always known that heedless self-interest was bad
morals. We now know that it is bad economics" - FDR
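The cost argument can be illustrated with a small software model. This is only a sketch: `check` is a hypothetical stand-in for a hardware validity/permission check (here it merely tests for a null pointer and counts how often it runs), and nothing below describes how the AS/400 actually works. Checking at every dereference costs one check per element read, while checking once when the pointer is loaded covers all the reads that follow.

```c
#include <stddef.h>

/* Number of simulated validity checks performed so far. */
static unsigned long checks;

/* Hypothetical stand-in for a hardware pointer check. */
static int check(const int *p)
{
    ++checks;
    return p != NULL;
}

/* Model of check-on-use: the pointer is validated before every read. */
static long sum_check_each_use(const int *p, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; ++i) {
        if (!check(p))
            return 0;
        total += p[i];
    }
    return total;
}

/* Model of check-on-load: the pointer is validated once, then reused. */
static long sum_check_on_load(const int *p, size_t n)
{
    long total = 0;

    if (!check(p))
        return 0;
    for (size_t i = 0; i < n; ++i)
        total += p[i];
    return total;
}
```

For an array of four elements the first function performs four simulated checks and the second performs one; the gap grows with the number of dereferences per load.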
In article <pan.2006.03.22.00.24.22.758255@areilly.bpc-users.org> Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
...
 > > Interesting features. I'm not sure how much different multiple value
 > > returns would be from values returned via reference parameters
 > > (pointers).  it sounds like a good idea.
 > 
 > The significant difference is that reference parameters (pointers) can't
 > be in registers.

Why not?  I have worked with a lot of implementations where the first few
parameters were passed through registers.  (Depending on the processor,
from four to eight.)  And in many cases there is no need at all to put those
pointers on the stack.
-- 
dik t. winter, cwi, kruislaan 413, 1098 sj  amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn  amsterdam, nederland; http://www.cwi.nl/~dik/
Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
> On Tue, 21 Mar 2006 19:02:55 -0600, Stephen Sprunk wrote:
>> If a system traps on a prefetch, it's fundamentally broken. However, a
>> system that traps when an invalid pointer is loaded is not broken, and the
>> AS/400 is the usual example. Annoying, but not broken.
>
> And I still say that constraining C for everyone so that it could fit the
> AS/400, rather than making C-on-AS/400 jump through a few more hoops to
> match traditional C behaviour, was the wrong trade-off. I accept that
> this may well be a minority view.
It is. The C standard wouldn't just have to forbid an implementation from trapping when it loads an invalid address; it would have to define the behavior of any program that uses such an address.

A number of examples have been posted here where that could cause serious problems for some implementations other than the AS/400.

-- 
Keith Thompson (The_Other_Keith) kst-u@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
On Wed, 22 Mar 2006, Dik T. Winter wrote:
> Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
> ...
> > > Interesting features. I'm not sure how much different multiple value
> > > returns would be from values returned via reference parameters
> > > (pointers). it sounds like a good idea.
> >
> > The significant difference is that reference parameters (pointers) can't
> > be in registers.
>
> Why not? I have worked with a lot of implementations where the first few
> parameters were passed through registers. (Depending on the processor,
> from four to eight.) And i many cases no need at all to put those pointers
> on the stack.
I believe Andrew means

    void foo(int & x)
    {
        use(x);
    }

    void bar()
    {
        register int a;
        foo(a);  /* Will C++ accept this? */
    }

I don't know whether standard C++ would accept the above code, or whether it would, like standard C, insist that the programmer can't take the address of a 'register' variable, even implicitly. But in any case, it would be hard for the compiler to put the variable 'a' into a machine register when it compiles 'bar', because it needs to pass its address to 'foo' later on.

HTH,
-Arthur
On 2006-03-22, Keith Thompson <kst-u@mib.org> wrote:
> Andrew Reilly <andrew-newspost@areilly.bpc-users.org> writes:
>> On Tue, 21 Mar 2006 19:02:55 -0600, Stephen Sprunk wrote:
>>> If a system traps on a prefetch, it's fundamentally broken. However, a
>>> system that traps when an invalid pointer is loaded is not broken, and the
>>> AS/400 is the usual example. Annoying, but not broken.
>>
>> And I still say that constraining C for everyone so that it could fit the
>> AS/400, rather than making C-on-AS/400 jump through a few more hoops to
>> match traditional C behaviour, was the wrong trade-off. I accept that
>> this may well be a minority view.
>
> It is. The C standard wouldn't just have to forbid an implementation
> from trapping when it loads an invalid address; it would have to
> define the behavior of any program that uses such an address.
Why? It's not that difficult to define the behavior of a program that "uses" such an address other than by dereferencing it, and it's no problem to leave the behavior undefined for dereferencing.
> A number of examples have been posted here where that could cause > serious problems for some implementations other than the AS/400.