
Language feature selection

Started by Don Y March 5, 2017
On 11/03/17 09:22, George Neuner wrote:
> On Fri, 10 Mar 2017 18:08:26 -0800, Paul Rubin
> <no.email@nospam.invalid> wrote:
>
>> George Neuner <gneuner2@comcast.net> writes:
>>> Now that is an overstatement. It is merely excruciatingly difficult
>>> for a C/C++ compiler to determine aliasing ... it is _not_ impossible.
>>> There are at least 2 compilers which do whole program alias analysis.
>>
>> The general case is assuredly impossible from Rice's theorem. Some
>> compilers may do some conservative analyses that can help specific
>> programs. Or if x and y are of different datatypes, then iirc the
>> compiler is allowed to assume that they aren't aliased. It's probably
>> easier to make use of that in C++ than in C.
>
> Yes, that is correct. However you can go quite a long way by taking
> into account both type and scope of visibility.
Is that a long way to making the program fast, or to making it correct? A tool that can mutate a program so that it doesn't have to produce a correct result (as is visible in the source code) is a very simple tool. Even I could write one!
Tom Gardner <spamjunk@blueyonder.co.uk> writes:
> Is that a long way to making the program fast or to making it correct?
It's for speed. If the compiler can figure out that *x and *y are in separate memory locations, then it knows updating *x won't change *y. That means more temporaries can be kept in registers, avoiding some memory loads and stores. The speedups are significant.
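A minimal C++ sketch of that point (the function names are illustrative; __restrict is the common compiler spelling of C99's restrict, supported as an extension by GCC, Clang and MSVC):

    // Without a no-alias guarantee, the compiler must assume the store
    // through x can change y[i], so *x must be re-read and written back
    // on every iteration.
    void accumulate(int *x, const int *y, int n) {
        for (int i = 0; i < n; ++i)
            *x += y[i];
    }

    // With the promise that x and y never alias, the running sum can
    // stay in a register and be stored once at the end.
    void accumulate_fast(int *__restrict x, const int *__restrict y, int n) {
        for (int i = 0; i < n; ++i)
            *x += y[i];
    }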
Tom Gardner <spamjunk@blueyonder.co.uk> writes:
> It means such alias analysis is impossible in many (most?)
> applications.
Global alias analysis, yes, but some local analysis is possible.
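One such local analysis is the type-based assumption quoted earlier in the thread. A minimal C++ sketch (illustrative names; this relies on the strict-aliasing rule):

    // Because i and f have different types, the compiler may assume
    // they do not alias, so *i need not be reloaded after the store.
    int type_based(int *i, float *f) {
        int a = *i;
        *f = 1.0f;      // presumed not to modify *i
        return a + *i;  // the earlier load of *i can be reused
    }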
On 11/03/17 10:22, Paul Rubin wrote:
> Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>> Is that a long way to making the program fast or to making it correct?
>
> It's for speed. If the compiler can figure out that *x and *y are
> in separate memory locations, then it knows updating *x won't change *y.
> That means more temporaries can be kept in registers, avoiding some
> memory loads and stores. The speedups are significant.
My compiler meets that constraint by producing very fast and very small code. The downside is that the result is always 42 :)
On 11/03/17 10:23, Paul Rubin wrote:
> Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>> It means such alias analysis is impossible in many (most?)
>> applications.
>
> Global alias analysis, yes, but some local analysis is possible.
With the emphasis on "some". Significant difficulties arise with non-primitive data that is shared rather than copied. In the C/C++ /language/, aliasing is a pig, resulting in pessimised code. Typically, unlike other languages, you have to resort to giving extra assertions to the /tools/; those assertions are the source of quite a few subtle problems.
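For instance (a sketch, not from the thread): the restrict qualifier is exactly such an assertion to the tools, and it shows where the subtle problems come from, since nothing checks the promise:

    #include <cstddef>

    // dst and src are asserted never to overlap, so the compiler may
    // reorder and vectorise freely. An overlapping call is silently
    // undefined behaviour -- the "subtle problem" in question.
    void copy_n(int *__restrict dst, const int *__restrict src,
                std::size_t n) {
        for (std::size_t i = 0; i < n; ++i)
            dst[i] = src[i];
    }

    // copy_n(buf, buf + 1, 8);  // overlapping: promise broken, UB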
On 17-03-10 23:32 , Paul Rubin wrote:
> Jacob Sparre Andersen <jacob@jacob-sparre.dk> writes:
>> The Ada variant is:
>> Some_Variable : Some_Type with Address => 16#dead_beef#;
A small nit-pick: as the Ada type System.Address is "private" (that is, opaque), it is necessary to apply the To_Address function to the integer literal:

   ... with Address => To_Address (16#dead_beef#);
> Do you need something like a volatile declaration? Does Ada have that?
Yes. This can be specified by adding "Volatile" to the declaration:

   Some_Variable : Some_Type with Volatile, Address => To_Address (16#dead_beef#);

However, for I/O registers it is often desirable to ensure atomic reads and writes by using the aspect "Atomic" instead of "Volatile". The Atomic aspect implies the Volatile aspect, so it is not necessary to specify both.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
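For comparison, a rough C++ analogue of that declaration might look like the following sketch (the address is just the thread's example value, not a real device register):

    #include <cstdint>

    // Every access through this pointer is a real memory access, but --
    // unlike Ada's Atomic aspect -- volatile alone does not guarantee
    // that each access is a single indivisible bus operation.
    volatile std::uint32_t *const some_register =
        reinterpret_cast<volatile std::uint32_t *>(0xDEADBEEFu);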
On 2017-03-11 3:59 AM, Tom Gardner wrote:
> On 11/03/17 02:16, Paul Rubin wrote:
>> Tom Gardner <spamjunk@blueyonder.co.uk> writes:
>>>> There are at least 2 compilers which do whole program alias
>>>> analysis.
>>>
>>> How do they do that if the program includes a library for which
>>> the source is not available, and for which the compiler flags are
>>> not known?
>>
>> "Whole program" means the compiler has all of the source code and
>> can munch it all as a single piece. All kinds of added
>> optimizations are then possible.
>
> I thought you would say that.
>
> It means such alias analysis is impossible in many (most?)
> applications.
Yes and no. Libraries can still be precompiled and don't need sources as long as the object format has the information the compiler needs for its analysis.

w..
On 17-03-11 04:46 , jim.brakefield@ieee.org wrote:
> On Sunday, March 5, 2017 at 8:43:28 PM UTC-6, Don Y wrote:
>> A quick/informal/UNSCIENTIFIC poll:
>>
>> What *single* (non-traditional) language feature do you find most
>> valuable in developing code? (and, applicable language if unique
>> to *a* language or class of languages)
>
> A plug for array operators: as in Numpy, IDL/PV~wave, APL and Julia.
> That is: array and vector operators baked into the language.
In a language that allows operator overloading, programmers can define their own array operators.
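A small C++ sketch of that (the overload and the names are illustrative, not a library API):

    #include <array>
    #include <cstddef>

    // Elementwise addition over std::array -- a user-defined array
    // operator in the NumPy/APL spirit.
    template <typename T, std::size_t N>
    std::array<T, N> operator+(const std::array<T, N> &a,
                               const std::array<T, N> &b) {
        std::array<T, N> r{};
        for (std::size_t i = 0; i < N; ++i)
            r[i] = a[i] + b[i];
        return r;
    }

    // Usage: std::array<int, 3> c = a + b;  // no explicit loop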
> I've found that programming at this level yields shorter programs with
> less debugging: you wind up making your data structures and algorithms
> use the fewest operators possible/practical.
I agree that array operators are useful, but only for relatively simple cases such as the basic arithmetic operations on arrays. However, taking it to the APL extreme with complex vector/matrix restructurings (outer products, lamination, ...) can create code that is hard for others to understand.
> There is a theoretical vantage point for this style of programming
> (which I call "programming in the large" as opposed to "programming
> in the small"):
You may call it that, but I hope you know that most people understand these large/small terms differently; see https://en.wikipedia.org/wiki/Programming_in_the_large_and_programming_in_the_small.

--
Niklas Holsti
Tidorum Ltd
niklas holsti tidorum fi . @ .
On 2017-03-10 8:14 PM, Paul Rubin wrote:
> Walter Banks <walter@bytecraft.com> writes:
>>> Do you mean C?
>> More like the whole crop of interpreted languages now being used.
>
> Oh ok, but those languages are generally sugared-up re-inventions of
> Lisp, which is even older than C, and which the cognoscenti have
> been using all along ;-). E. W. Dijkstra in his Turing Award lecture
> back in 1972 had already observed:
>
> With a few very basic principles at its foundation, it [LISP] has
> shown a remarkable stability. Besides that, LISP has been the carrier
> for a considerable number of in a sense our most sophisticated
> computer applications. LISP has jokingly been described as "the most
> intelligent way to misuse a computer". I think that description a
> great compliment because it transmits the full flavour of liberation:
> it has assisted a number of our most gifted fellow humans in thinking
> previously impossible thoughts.
>
> By all means give the interpreters a try if you haven't. They make
> programming more productive along several axes, at the cost of some
> hardware resources (cpu and memory) that are generally plentiful
> with today's computers.
>
>> I tend to think of C as something of our generation.
>
> Yes, I was less surprised that good stuff was being done in
> interpreted languages, as that it's now relatively rare for even
> their expert users to have ever used C for anything.
It is our generation that is obsessed with optimizing execution time and data space. Some of the programs I have seen developed by these people tend to use algorithms that trade our sense of optimization for application performance (VR applications, for example).

Related to that, we have almost always treated processors as a scarce resource. When, on some of the massively parallel systems I have been working on, I changed that mindset to treating processors as just another resource to be managed like memory, I suddenly saw huge leaps in application performance.

w..
On 3/10/2017 7:14 PM, Walter Banks wrote:
> On 2017-03-10 6:02 PM, Don Y wrote:
>> On 3/10/2017 3:42 PM, Walter Banks wrote:
>>> On 2017-03-10 4:10 PM, Don Y wrote:
>>>>> What's wrong with a single set of sources that defines an
>>>>> application, no command line options or linker scripts just an
>>>>> application including the definition of the target, files and
>>>>> libraries it needs. Compilation is both faster by many factors
>>>>> and there is a simple self contained project that can be
>>>>> easily re-created after a decade or more.
>>>>
>>>> That would depend on the size and complexity of the project,
>>>> right? I have 192 processors (each with multiple cores) in my
>>>> current design. It would be *delightful* if <something> could
>>>> sort out how best to allocate resources at run-time instead of my
>>>> crude metrics.
>>>>
>>>> But, those tools don't exist and aren't likely to any time soon.
>>>
>>> Most of my time now is working on both tools and ISA's. There have
>>> been some really significant changes in both approaches to
>>> compiling for heterogeneous parallel environments and execution
>>> environments that have hundreds to thousands of processors in
>>> them.
>>
>> But they are (largely) *static* environments (?). The toolchain
>> doesn't have to decide when to bring another processor on-line... or,
>> when it can retire a running processor and migrate its workload to
>> some OTHER processor, etc. Or, which aspects of an application
>> should be bound to specific processors (nearness of related I/Os) and
>> which aspects should AVOID particular processors (as they were in
>> insecure locations).
>
> It is not a static environment. The compiler DOES allocate which
> processor (the compiler has heterogeneous processor support) is suitable
> for some particular part of the application. Most of the application
> distribution IS determined at compile time.
>
> The compiler tool work is an evolution of the named address space work
> we did in Japan in the early 90's (ISO/IEC 18037) to named processor
> space to compiler allocated named processor space.
I don't see how that can be applied to anything but a generic environment.

E.g., I track resources associated with each node (especially "unique" I/O's). Then, based on the *events* encountered in the environment, decide which of those resources NEED to be brought on-line and how resources can then be redistributed to meet the current workload. (and, conversely, when I can "shed" resources -- to conserve *power* and/or improve communication overhead)

So, for example, if it's "daytime", the node that supports the video camera that monitors folks approaching the front door is powered up -- because I want to notice "visitors" in order to ANNOUNCE them (I don't accept visitors "after hours"). In this case, there is a NEED for that particular set of I/O's (i.e., the cameras facing the back yard are not capable of watching the front door approach!) along with a need for some additional compute resources (real-time image analysis). There is a COST associated with this: the power required to run that node and those hardware resources.

*Where* the code that analyzes the imagery executes is determined by the locations of the resources (CPU+memory) required to perform that task along with the communication costs to/from that application's physical location and the I/O's that it requires along with the clients with which it interacts.

If/when a visitor is "detected", then a means of informing the occupants of its identity is required. The most obvious solution being to power up another node proximate to an occupant and dispatch a live video feed to the *display* served by that node. An occupant can elect to interact with the visitor ("intercom") *or* direct the system to interact with them on their behalf (i.e., so they don't have to disclose their presence to the visitor: "Who are you? Whaddya want?"). Of course, this means tasks dedicated to synthesizing the required prompts need to be brought on-line and a channel opened by which that audio can be fed to the visitor and the visitor's reply captured and relayed to the occupant.

If the house is unoccupied, that video feed might, instead, be spooled to a media tank. Or, pushed over an internet/phone connection to the occupant(s) at a remote location. The audio prompts can be triggered from a "house_unoccupied()" script and responses similarly captured/dispatched. When the visitor departs, all of this mechanism can be taken down to conserve power.

Later that night, that idle (cold!) node might be deliberately powered up, its camera left OFF and the CPU+memory assigned to "off-line/batch" processing of commercial detection in some OTA video broadcast captured earlier in the day. Or, the resources used to refine the speech recognizer's training set for UserA based on the stored audio for the voice commands issued during that day.

It's not possible to come up with an "ideal" resource (re)allocation strategy -- even having detailed knowledge of the *current* workload. I don't see how a tool can know these usage patterns or even possibilities at compile/build time!

[How does it know the cost of migrating taskA to node4 to accommodate taskM's MORE EFFICIENT use of node4's hardware resources (I/O's) in order to factor that into its decision as to whether node27 should, instead, be powered up and taskM spawned there (incurring higher communication costs to PROXY stubs on node4 to twiddle those I/O's)? How does it know the communication costs for taskM's interactions with those proxies? etc.]
[[I do this with a combination of crude metrics and heuristics "learned" over time -- by the system observing itself and how well it meets its performance requirements and deadlines with particular (task,node) bindings. And, I never know if I've got the *ideal* configuration for any set of nodes and tasks...]]