
Language feature selection

Started by Don Y March 5, 2017
On Fri, 10 Mar 2017 23:25:20 +0000, Tom Gardner
<spamjunk@blueyonder.co.uk> wrote:

>On 10/03/17 19:33, Don Y wrote:
>>
>> ... You can discipline yourself to avoid relying on undocumented
>> behaviors or "letting the compiler choose" (among possible
>> interpretations).  Of course, writing PORTABLE code under those
>> constraints is considerably harder (how many folks actually
>> worry about exceeding the implementation-specific limits for
>> particular data types?  Or, verify that their code will continue
>> to function properly *if* those limits are considerably higher
>> than "nominal" from the Standard?)
>
>Not in C/C++. The languages remove the ability of any
>compiler to determine what is and is not aliased. That
>precludes many optimisations unless you assert there
>won't be aliasing.
Now that is an overstatement.  It is merely excruciatingly difficult
for a C/C++ compiler to determine aliasing ... it is _not_ impossible.
There are at least 2 compilers which do whole program alias analysis.

George
On 11/03/17 01:22, George Neuner wrote:
> On Fri, 10 Mar 2017 23:25:20 +0000, Tom Gardner
> <spamjunk@blueyonder.co.uk> wrote:
>
>> On 10/03/17 19:33, Don Y wrote:
>>>
>>> ... You can discipline yourself to avoid relying on undocumented
>>> behaviors or "letting the compiler choose" (among possible
>>> interpretations).  Of course, writing PORTABLE code under those
>>> constraints is considerably harder (how many folks actually
>>> worry about exceeding the implementation-specific limits for
>>> particular data types?  Or, verify that their code will continue
>>> to function properly *if* those limits are considerably higher
>>> than "nominal" from the Standard?)
>>
>> Not in C/C++. The languages remove the ability of any
>> compiler to determine what is and is not aliased. That
>> precludes many optimisations unless you assert there
>> won't be aliasing.
How do they do that if the program includes a library for which the source is not available, and for which the compiler flags are not known?
George Neuner <gneuner2@comcast.net> writes:
> Now that is an overstatement.  It is merely excruciatingly difficult
> for a C/C++ compiler to determine aliasing ... it is _not_ impossible.
> There are at least 2 compilers which do whole program alias analysis.
The general case is assuredly impossible by Rice's theorem.  Some
compilers may do conservative analyses that can help specific
programs.  And if x and y are of different datatypes, the compiler is
allowed to assume that they aren't aliased (the "strict aliasing"
rule).  It's probably easier to make use of that in C++ than in C.
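C99's "restrict" qualifier is the explicit form of that no-aliasing
assertion.  A minimal sketch (the names are illustrative; whether the
optimization actually fires is compiler-dependent):

#include <stddef.h>

/* Without restrict, the compiler must assume a store through dst
 * could modify *factor, so *factor is reloaded on every iteration.
 * The restrict qualifiers promise that dst overlaps neither src nor
 * *factor, letting the load of *factor be hoisted out of the loop. */
void scale(float *restrict dst, const float *restrict src,
           const float *factor, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * *factor;
}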
On 2017-03-10 6:02 PM, Don Y wrote:
> On 3/10/2017 3:42 PM, Walter Banks wrote:
>> On 2017-03-10 4:10 PM, Don Y wrote:
>>>> What's wrong with a single set of sources that defines an
>>>> application, no command line options or linker scripts just an
>>>> application including the definition of the target, files and
>>>> libraries it needs.  Compilation is both faster by many factors
>>>> and there is a simple self contained project that can be
>>>> easily re-created after a decade or more.
>>>
>>> That would depend on the size and complexity of the project,
>>> right?  I have 192 processors (each with multiple cores) in my
>>> current design.  It would be *delightful* if <something> could
>>> sort out how best to allocate resources at run-time instead of my
>>> crude metrics.
>>>
>>> But, those tools don't exist and aren't likely to any time soon.
>>
>> Most of my time now is working on both tools and ISA's.  There has
>> been some really significant changes in both approaches to
>> compiling for heterogeneous parallel environments and execution
>> environments that have hundreds to thousands of processors in
>> them.
>
> But they are (largely) *static* environments (?).  The toolchain
> doesn't have to decide when to bring another processor on-line... or,
> when it can retire a running processor and migrate its workload to
> some OTHER processor, etc.  Or, which aspects of an application
> should be bound to specific processors (nearness of related I/Os) and
> which aspects should AVOID particular processors (as they were in
> insecure locations).
It is not a static environment.  The compiler DOES allocate which
processor is suitable for some particular part of the application
(the compiler has heterogeneous processor support).  Most of the
application distribution IS determined at compile time.

The compiler tool work is an evolution of the named address space
work we did in Japan in the early 90's (ISO/IEC TR 18037): first to
named processor spaces, and now to compiler-allocated named processor
spaces.
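For readers who haven't seen TR 18037, named address spaces look
roughly like this.  The qualifier names are implementation-defined
(these two are borrowed from common embedded compilers), and the
__cpu() qualifier at the end is a purely HYPOTHETICAL sketch of what
a named processor space might look like, not shipping syntax:

/* ISO/IEC TR 18037-style named address spaces.  __flash and __xdata
 * are implementation-defined qualifiers, not portable C. */
__flash const short coeffs[4] = {1, 2, 2, 1};   /* placed in ROM      */
__xdata unsigned char frame[256];               /* external data RAM  */

/* HYPOTHETICAL: a qualifier naming the processor that should own
 * this function, so the compiler can place the code and route calls
 * accordingly.  Not real syntax from any shipping compiler. */
__cpu(dsp0) int filter(int sample);

w..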
Walter Banks <walter@bytecraft.com> writes:
>> Do you mean C?
>
> More like the whole crop of interpreted languages now being used.
Oh ok, but those languages are generally sugared-up re-inventions of
Lisp, which is even older than C, and which the cognoscenti have been
using all along ;-).  E. W. Dijkstra in his Turing Award lecture back
in 1972 had already observed:

    With a few very basic principles at its foundation, it [LISP] has
    shown a remarkable stability.  Besides that, LISP has been the
    carrier for a considerable number of in a sense our most
    sophisticated computer applications.  LISP has jokingly been
    described as "the most intelligent way to misuse a computer".  I
    think that description a great compliment because it transmits
    the full flavour of liberation: it has assisted a number of our
    most gifted fellow humans in thinking previously impossible
    thoughts.

By all means give the interpreters a try if you haven't.  They make
programming more productive along several axes, at the cost of some
hardware resources (cpu and memory) that are generally plentiful with
today's computers.
> I tend to think of C as something of our generation.
Yes.  I was less surprised that good stuff was being done in
interpreted languages than that it's now relatively rare for even
their expert users to have ever used C for anything.
Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>> There are at least 2 compilers which do whole program alias analysis.
>
> How do they do that if the program includes a library for which the
> source is not available, and for which the compiler flags are not
> known?
"Whole program" means the compiler has all of the source code and can munch it all as a single piece. All kinds of added optimizations are then possible.
On Sunday, March 5, 2017 at 8:43:28 PM UTC-6, Don Y wrote:
> A quick/informal/UNSCIENTIFIC poll:
>
> What *single* (non-traditional) language feature do you find most
> valuable in developing code?  (and, applicable language if unique
> to *a* language or class of languages)
A plug for array operators, as in NumPy, IDL/PV-WAVE, APL and Julia:
that is, array and vector operators baked into the language.  I've
found that programming at this level yields shorter programs with
less debugging: you wind up making your data structures and
algorithms use the fewest operators possible/practical.  Loops and
subscript expressions mostly disappear!  In theory, such renditions
of algorithms can be optimized and/or parallelized by the compiler to
a greater extent than normal code.

There is a theoretical vantage point for this style of programming
(which I call "programming in the large" as opposed to "programming
in the small"): the fundamental data structure is, say, an
eight-dimensional array (possibly non-contiguous).  A scalar is such
an array with the extent in each dimension equal to one.  A vector
has one dimension with the extent greater than one, etc.  Of course,
the array element can be something other than a scalar (integer,
float, etc.), such as a record or list or hash table.
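For contrast, here is the element-at-a-time C rendition of what an
array language does in one line (a sketch; the function and names are
illustrative):

#include <stddef.h>

/* Explicit loop and subscripts -- the machinery array operators
 * hide.  In an array language this entire function collapses to
 * something like:  y = a*x + b  */
void saxpy(double *y, const double *x, double a, double b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        y[i] = a * x[i] + b;
}

Jim Brakefield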
On Fri, 10 Mar 2017 16:49:21 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

>On 3/10/2017 11:43 AM, George Neuner wrote:
>>
>> You asked if closures introduced a GC penalty - the answer to which is
>> "not necessarily".  Now you are complaining that functions referenced
>> by a closure need to remain available for its entire lifetime.
>
>Yes -- so how does the tool and/or developer ensure it provides the
>intended functionality when required?  How do you imbue the tool
>with the "smarts" to be able to analyze these cases (e.g., impose
>a "persistent" implementation) in light of the different ways
>that the developer can opt to use it?
Dependency (import) analysis shows what is needed - but not when it
is needed.  *Safe* dynamic unloading requires some form of GC - but
the form GC takes depends on many factors.

E.g., using stack-based closures, or in the absence of closures
altogether, it is relatively simple for a compiler to identify uses
of a module and effect OS-based reference counting to control its
residency.  This works transitively if all modules obey the same
conventions.

[This is just like using COM or CORBA interfaces, but the mechanism
would be built into the primary language compiler instead of being a
separate tool.  More later.]

Once you introduce heap-based closures, the lifetime of closures
becomes indefinite: it can be *estimated* by a compiler - e.g., using
region analysis - but runtime GC generally is better at noticing
disuse in a timely fashion.  Module reference counting by the
compiler is no longer sufficient and must be augmented by runtime GC.
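To make the problem concrete, a heap-based closure is just captured
state plus a code pointer.  A hand-rolled sketch in plain C (not any
particular language's mechanism):

#include <stdlib.h>

/* Captured environment plus code pointer.  Once this escapes to the
 * heap, the module providing add_captured() must stay resident until
 * the last such closure is freed -- exactly the reference-counting /
 * GC problem described above. */
typedef struct closure {
    int (*call)(struct closure *self, int arg);
    int captured;                     /* the "environment" */
} closure;

static int add_captured(struct closure *self, int arg)
{
    return arg + self->captured;
}

closure *make_adder(int n)            /* the closure escapes its creator */
{
    closure *c = malloc(sizeof *c);
    c->call = add_captured;
    c->captured = n;
    return c;                         /* caller must eventually free() it */
}

A caller can hold the result of make_adder() indefinitely, so no
compile-time analysis alone can decide when the module containing
add_captured() may be retired.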
>Alternatively, how do you give the developer tools that let *him*
>convey those dependencies to the tool?
>
>Unless you restrict how they can be used, I don't see how you can
>address each possibility...
It is up to the programmer to structure her modules so that dynamic
(un)loading is most effective.  Modules have management overhead, and
many small modules could be worse than fewer, larger ones.  There is
no formula for figuring out the best structure.  There may be
systemic usage patterns which suggest certain structurings, but such
patterns necessarily are usage dependent.
>> A module has not "served its purpose" if its functions may yet be
>> called in the future.  That, in general, is undecidable.
>>
>> The GC standard of "reachability" is conservative - GC does not
>> consider whether an object will be used again because it can't know
>> that.  A compiler is in a better position to figure out usage, but
>> even there the only way to know for certain is to simulate execution
>> of the program and observe that it terminates without ever again
>> referencing the object in question.
>> [Hint: indefinite loops can't be guaranteed to terminate.]
>
>Exactly.  I "avoid" this issue by requiring the developer to
>handle the "resource reclamation".  This means he can make a mistake
>and shoot off both feet (so the OS has to know to catch these
>sorts of problems, indirectly).
If modules are stateless - iow, all state is in the client - then modules can come and go at the whim of the OS memory manager. Not necessarily very performant, but always correct. However, stateless modules likely will seem unnatural to many programmers, who may find them difficult to create even if there is a language (or other mechanism) that makes them [relatively] easy to use.
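The classic libc illustration of the stateless style is strtok()
(hidden static state inside the library) versus POSIX's strtok_r()
(all state held by the caller).  Only the latter kind of module can
vanish and reappear at the memory manager's whim:

#include <stdio.h>
#include <string.h>   /* strtok_r is POSIX, not ISO C */

int main(void)
{
    char buf[] = "a,b,c";
    char *save = NULL;            /* ALL parser state lives here,
                                     in the client, not the module */
    for (char *tok = strtok_r(buf, ",", &save); tok != NULL;
         tok = strtok_r(NULL, ",", &save))
        puts(tok);
    return 0;
}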
>
>>> [If initialize_heap RETURNS "memory manager"s that are then used to
>>> manage each individual heap, then initialize_heap() itself is still
>>> dynamically bound to those objects.  If you unload the module
>>> containing it, then the memory manager objects (in this example) that
>>> it created for the callers also disappears.]
>>
>> Not necessarily - it depends on the module structure.
>> E.g.,
>>
>>               +----------------+
>>              -| heap functions |
>>             / +----------------+
>>            /         |
>>           /          v
>>          /   +-----------------+
>> <closure> <-- | initialize_heap |
>>               +-----------------+
>>
>> initialize_heap can be in a module separate from other heap control
>> functions.  Once the (heap interface) closure is created, the module
>> containing initialize_heap is unneeded and could be unloaded.
>
>That's how I "manually" implement these things.  But *I* have to
>keep track of whether the functions are loaded and *where* they
>are loaded.  E.g., if I want to migrate something to another
>node, then I need a handle by which the heap functions (and the
>bindings made at "initialization") can be invoked "later".
>
>I just don't see how a compiler can be aware of this sort of thing
>(without PREVENTING me from doing these sorts of things).
In the simple case it is just a matter of identifying call sites and
tracking which modules are supplying the called function.  The main
difficulty stems from reachability of code in the face of
indeterminate loops.  The compiler pretty much has to assume that any
call made to the module will actually happen.

The analysis can be simplified if the language has constructs that
guarantee termination even in the face of errors or exceptions: e.g.,
Lisp's unwind-protect or Scheme's dynamic-wind.
>I think a lot of these sorts of things have an underlying assumption that
>they are part of a persistent, single "program unit" so the compiler doesn't
>have to "worry" about lifespan.
Most compilers assume libraries/modules are always available.  There
is no reason that they must assume this - it's simply easier to do
so.

Consider the ancient overlay compilers ... dynamic module handling is
not really any different [unless you have to consider the bin-packing
aspect of fitting arbitrary load sequences into memory].
>>> Does the language magically enforce this dependence?  Or, does the developer
>>> have to be aware of the potential "gotcha"?  Or, does something else reference
>>> count, etc.?
Modular languages handle dependence - they arrange to load "library"
modules in the same way a C program loads a DLL.  However, most
systems have no provisions to automatically unload unreferenced
modules.

E.g., COM and CORBA, when used via IDL, normally hold a reference to
the library until the program ends (secondary references are added
and deleted in the course of using the library).  With COM (not sure
about CORBA) you can step outside the IDL and explicitly release the
initial reference, which - if there are no other references held by
the program - will unlink the library and remove it from the
program's address space.  But that doesn't affect other clients using
the library - the COM server won't terminate the library until all
clients have released it.
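POSIX's dlopen/dlclose follows the same reference-counting discipline
at module granularity.  A minimal sketch (Linux library name shown;
link with -ldl):

#include <dlfcn.h>    /* POSIX dynamic loading */
#include <stdio.h>

int main(void)
{
    /* Each dlopen() of the same library bumps a reference count;
     * the library is unmapped only when the count reaches zero. */
    void *lib = dlopen("libm.so.6", RTLD_NOW);  /* Linux-specific name */
    if (!lib) { fprintf(stderr, "%s\n", dlerror()); return 1; }

    double (*cosine)(double) = (double (*)(double))dlsym(lib, "cos");
    if (cosine)
        printf("cos(0) = %f\n", cosine(0.0));

    dlclose(lib);     /* last release: module may now be unloaded */
    return 0;
}

George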
On Fri, 10 Mar 2017 18:08:26 -0800, Paul Rubin
<no.email@nospam.invalid> wrote:

>George Neuner <gneuner2@comcast.net> writes:
>> Now that is an overstatement.  It is merely excruciatingly difficult
>> for a C/C++ compiler to determine aliasing ... it is _not_ impossible.
>> There are at least 2 compilers which do whole program alias analysis.
>
>The general case is assuredly impossible from Rice's theorem.  Some
>compilers may do some conservative analyses that can help specific
>programs.  Or if x and y are of different datatypes, then iirc the
>compiler is allowed to assume that they aren't aliased.  It's probably
>easier to make use of that in C++ than in C.
Yes, that is correct.  However, you can go quite a long way by taking
into account both type and scope of visibility.

George
On 11/03/17 02:16, Paul Rubin wrote:
> Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>>> There are at least 2 compilers which do whole program alias analysis.
>>
>> How do they do that if the program includes a library for which the
>> source is not available, and for which the compiler flags are not
>> known?
>
> "Whole program" means the compiler has all of the source code and can
> munch it all as a single piece.  All kinds of added optimizations are
> then possible.
I thought you would say that. It means such alias analysis is impossible in many (most?) applications.