
Language feature selection

Started by Don Y March 5, 2017
Hi Don,

On Wed, 8 Mar 2017 11:23:48 -0700, Don Y <blockedofcourse@foo.invalid>
wrote:

>On 3/8/2017 10:20 AM, George Neuner wrote:
>>
>> A closure can be defined over a function anywhere the function is in
>> scope.  A function F exported from module X may be used by a closure
>> in module Y which imports X.  Similarly a closure defined in module X
>> may be exported from X as an opaque object.
>
>Two heaps created, each by referencing a function:
>   instantiate_heap(memory, metrics, allocation_policy, release_policy, ...)
>
>Module that *defines* that function must remain "loaded" as it defines the
>memory managers (directly or indirectly, depending on what is return-ed)
>for each heap.
>
>Any "choices" that are stored (again, avoiding "static") in the function's
>definition need to persist beyond the *invocation* of instantiate_heap.
>
>> Recall that a module may require "initialization" when it is imported.
>> Closures defined for export would be created at that time.
>
>Yes.  But who ensures they remain present <somewhere> after the module
>itself has served its purpose?  I.e., the state is "hidden" and not
>obvious to the "invoker".
You're moving the goal post again.  You asked if closures introduced a
GC penalty - the answer to which is "not necessarily".  Now you are
complaining that functions referenced by a closure need to remain
available for its entire lifetime.  I'd say "duh" but that would be
redundant.

A module has not "served its purpose" if its functions may yet be
called in the future.  That, in general, is undecidable.

The GC standard of "reachability" is conservative - GC does not
consider whether an object will be used again because it can't know
that.  A compiler is in a better position to figure out usage, but
even there the only way to know for certain is to simulate execution
of the program and observe that it terminates without ever again
referencing the object in question.  [Hint: indefinite loops can't be
guaranteed to terminate.]
>[If initialize_heap RETURNS "memory manager"s that are then used to
>manage each individual heap, then initialize_heap() itself is still
>dynamically bound to those objects.  If you unload the module
>containing it, then the memory manager objects (in this example) that
>it created for the callers also disappears.]
Not necessarily - it depends on the module structure.  E.g.,

              +----------------+
             -| heap functions |
            / +----------------+
           /         |
          /          v
         /    +-----------------+
<closure> <-- | initialize_heap |
              +-----------------+

initialize_heap can be in a module separate from other heap control
functions.  Once the (heap interface) closure is created, the module
containing initialize_heap is unneeded and could be unloaded.

Your imagination is failing.
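A minimal C sketch of that structure, using the usual
function-pointer-plus-context idiom as a stand-in for a true closure
(all names and the trivial bump-allocator policy here are invented for
illustration):

    #include <stddef.h>
    #include <stdlib.h>

    /* ---- notional module "heap functions": stays loaded ---------- */
    typedef struct heap {              /* per-heap "captured" state   */
        char *base, *next, *limit;
    } heap_t;

    void *heap_alloc(heap_t *h, size_t n) {
        if ((size_t)(h->limit - h->next) < n)
            return NULL;
        void *p = h->next;             /* trivial bump allocator      */
        h->next += n;
        return p;
    }

    void heap_release(heap_t *h, void *p) {
        (void)h; (void)p;              /* bump allocators don't free  */
    }

    /* the "closure": function pointers plus captured environment     */
    typedef struct {
        void *(*alloc)(heap_t *, size_t);
        void  (*release)(heap_t *, void *);
        heap_t *state;
    } heap_mgr_t;

    /* ---- notional module "initialize_heap": unloadable ----------- */
    heap_mgr_t instantiate_heap(void *memory, size_t size) {
        heap_t *h = malloc(sizeof *h); /* outlives this call          */
        h->base = h->next = memory;
        h->limit = h->base + size;

        heap_mgr_t m = { heap_alloc, heap_release, h };
        return m;                      /* everything in m points into
                                          "heap functions" or the heap
                                          state -- nothing pins THIS
                                          module after it returns     */
    }

A caller then does, e.g., m = instantiate_heap(arena, sizeof arena);
p = m.alloc(m.state, 64); with no further reference to the instantiator
module -- exactly the separation the diagram shows.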
>> You can do this even with stack bound closures.  Consider that
>> imported namespaces need to be available before the importing module's
>> code can execute.  However, even in the case of the 1st (top) module,
>> the *process* invoking it already exists, and therefore there
>> already is a stack.
>>
>> With appropriate language support, a module which exports closures can
>> construct them on the process stack at the point when the module is
>> 1st imported.  Then they would be available anywhere "below" the site
>> of the import (subject to visibility).
>
>So, the memory manager example would necessitate loading that
>"system object" (memory MANAGER) into the user's process space.
>
>Or, keeping process/task specific "state" like that in some
>protected per-process portion of privileged memory.
Depends on how the system is structured. Certainly functions that are needed would have to remain accessible, but proper structuring can reduce incidental retention of things that are unneeded.
>>> [E.g., consider the heap instantiator example: why does *it* have
>>> to persist just so something it provides remains accessible?]
It doesn't.
>> In any case, the functions involved cannot be unloaded (at least not
>> easily) if they will be needed by something that is still running.
>
>Does the language magically enforce this dependence?  Or, does the developer
>have to be aware of the potential "gotcha"?  Or, does something else reference
>count, etc.?
What "language" are we talking about? Dependencies can be enforced at many levels.
>E.g., I rely on manual handling of pointers to implement these sorts
>of mechanisms.  But, it is clear to me that I am storing *pointers*
>and, as such, have to ensure that the object pointed AT remains
>accessible throughout its potential use.  (like an object WITHIN
>a particular module)
>
>[This makes for interesting design tradeoffs as you decide which objects
>and portions of objects can migrate and which must remain /in situ/.
>It *also* makes security analysis more difficult as I have to consider
>the possibility of "untrusted" code being executed /at privilege/.]
>
>If a dependence exists on a "foreign" module, then the OS inherently
>tracks the object reference and will prevent the referenced object
>from being deleted/unloaded while the handle is still active.
>(though it doesn't guarantee that the object is deleted as soon
>as the last handle is removed)
>
>> My point, though, is that the function and a closure that uses it are
>> 2 different things.  Their lifetimes necessarily are linked, but their
>> visibility scopes may be very different.
George
On 3/10/2017 11:10 AM, Don Y wrote:
> By asking in more abstract terms ("valuable"), I let others
> decide what is important (in terms of criteria) AND, then,
> what features they deem "best suiting" THAT particular goal.
>
> My followup comments are merely to draw out more detail and
> "backup" for how they came to that conclusion, not as an
> assessment *of* their conclusion ("what's your favorite
> color, and why?")
E.g., one reply that I received (elsewhere) resulted in a lengthy discussion as to deficiencies in "list" implementations; why lists were valuable to some types of use (application) but problematic for others (in the discussion) because of implementation shortcomings. This sort of back-and-forth refines the "value" that is being placed (or not!) on the "feature" in the minds of the different folks discussing it.
Don Y <blockedofcourse@foo.invalid> writes:

> What *single* (non-traditional) language feature do you find most
> valuable in developing code?  (and, applicable language if unique
> to *a* language or class of languages)
In my daily work I would say strong typing (as done in Ada and SPARK).

When I'm doing embedded systems, being able to declare representations
for a type (especially for enumerations) is very valuable.

What really made me say "wow" to a programming language feature was
when I first read about having tasking built into a language (and not
as an add-on library) in a book about Ada 83.

Greetings,

Jacob
--
"We will be restoring normality as soon
 as we are sure what is normal anyway."
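For comparison, the closest (much weaker) C analogue of an Ada
enumeration representation clause is assigning explicit enumerator
values; the encodings below are invented for illustration:

    /* Tie each name to a hardware encoding, as a representation
       clause would.  Unlike Ada, C cannot portably pin the *size*
       of the enum itself -- that still takes a fixed-width shadow
       type, or a C23 underlying-type declaration. */
    typedef enum {
        BAUD_115200 = 0x0,     /* hypothetical divisor-select field */
        BAUD_38400  = 0x1,
        BAUD_19200  = 0x2,
        BAUD_9600   = 0x3
    } baud_sel_t;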
On 10/03/17 18:24, Don Y wrote:
> On 3/10/2017 1:17 AM, Tom Gardner wrote:
>> On 09/03/17 23:54, Les Cargill wrote:
>>> Walter Banks wrote:
>>>> On 2017-03-05 5:03 PM, Don Y wrote:
>>>>> A quick/informal/UNSCIENTIFIC poll:
>>>>>
>>>>> What *single* (non-traditional) language feature do you find most
>>>>> valuable in developing code?  (and, applicable language if unique to
>>>>> *a* language or class of languages)
>>>>
>>>> The @ in various forms to tie a physical address to a symbolic variable.
>>>> This construct more that any other single thing allows many high level
>>>> languages have the ability broaden the range of potential applications
>>>> from high level to close to the machine.
>>>>
>>>
>>> Hi Walter!  No offense, but...
>>>
>>> This is utterly 1) inappropriate and 2) unnecessary.  It's a *terrible*
>>> extension to ( I presume C, where I have seen it ).
>>>
>>> Even back on Borland, Microsoft, Tektronix and ... another VAX C compiler
>>> 30ish years ago, you'd use a linker/locater to put specific structures
>>> at specific locations.
>>>
>>> The source code itself doesn't need to know about where variables go[1].
>>>
>>> Its part of the responsibility of tools that are invoked later
>>> in the build cycle.
>>
>> This is a philosophical difference.
>>
>> If something is important to the correct operation of
>> the program, then I like it to be visible in the source
>> code.  A useful benefit is that the information is easily
>> found and analysed by the IDEs and/or other source code
>> manipulation tools around.
>
> I guess it depends on what you consider "source code".
> How do you treat makefiles, linker scripts, etc.?  Clearly
> they are all important to the *intended* operation of
> the program -- as are the actual tools, themselves.
Keep it simple...

Source code => as defined in the language standard.

If something is tool-specific, then it is not part
of the source code.  Hence compiler arguments are
not part of the source code.

Code inspection tools such as browsers, analysers,
and compliance checkers work on the source code.
That's important.
> How much of this cruft do you clutter the "sources" with
> in the attempt to ensure they accompany the sources?
> What about applications wherein multiple languages are
> combined; how do you nail down the "implementation defined"
> behavior of their interfaces?  What order will your Java/C/Python/foo
> function build its stack frame?  How will your FORTRAN/Pascal/ASM/bar
> module interface to it??
>
>> In the same vein, in C I dislike having correct code
>> operation being dependent on combinations of command
>> line compiler arguments.
>
> There's usually a difference between "correct" and "desired".
Not really. If there is a distinction between "desired" and "correct" then I can instantly rewrite the program to be much faster and much smaller.
> It's unfortunate when "correct" relies on command line
> arguments to resolve some "implementation defined behavior"
> with which the compiler could, otherwise, take liberties.
That is always the case with C/C++, unless the program uses no separately compiled libraries and turns off all optimisations. Other languages are much better in that regard.
> Likewise, if the order and locations at which objects
> can be bound can arbitrarily be altered and affect operation
That is just one small consideration in this context.
> [These should be eschewed, IMO]
>
> (This is an issue on many processors, without concern for the
> actual I/O's)
>
> Of course, there's no way for the tool to know/enforce these constraints
> other than a suitable note to the future developer!
And that's undesirable.
>>>> It is language independent and very easy to add to compilers without
>>>> changing the basic form of the language.
>>>
>>> it very nearly destroys the portability of code that uses it...
>>
>> That seems unimportant to me.  I cannot think of a
>> reason why you would need to nail down addresses
>> in portable code.  Of course "portable" is not a
>> black and white concept!
>
> Therein lies the rub.  Code can be "portable" yet still tied to
> a particular processor (but a different implementation).  E.g.,
> reset_processor()...
More than just a processor, consider different boards with the same processor.
>
>> Any examples?
On 3/10/2017 12:09 PM, Tom Gardner wrote:

>>>> The source code itself doesn't need to know about where variables go[1].
>>>>
>>>> Its part of the responsibility of tools that are invoked later
>>>> in the build cycle.
>>>
>>> This is a philosophical difference.
>>>
>>> If something is important to the correct operation of
>>> the program, then I like it to be visible in the source
>>> code.  A useful benefit is that the information is easily
>>> found and analysed by the IDEs and/or other source code
>>> manipulation tools around.
>>
>> I guess it depends on what you consider "source code".
>> How do you treat makefiles, linker scripts, etc.?  Clearly
>> they are all important to the *intended* operation of
>> the program -- as are the actual tools, themselves.
>
> Keep it simple...
>
> Source code => as defined in the language standard.
So, you're requiring everything "that is important to the correct operation of the program" to reside *in* that "source code", NOT handled by the linkage editor (?) [The linkage editor is outside the scope of the "language standard"]
> If something is tool-specific, then it is not part
> of the source code.  Hence compiler arguments are
> not part of the source code.
>
> Code inspection tools such as browsers, analysers,
> and compliance checkers work on the source code.
> That's important.
>
>> How much of this cruft do you clutter the "sources" with
>> in the attempt to ensure they accompany the sources?
>> What about applications wherein multiple languages are
>> combined; how do you nail down the "implementation defined"
>> behavior of their interfaces?  What order will your Java/C/Python/foo
>> function build its stack frame?  How will your FORTRAN/Pascal/ASM/bar
>> module interface to it??
>>
>>> In the same vein, in C I dislike having correct code
>>> operation being dependent on combinations of command
>>> line compiler arguments.
>>
>> There's usually a difference between "correct" and "desired".
>
> Not really.
>
> If there is a distinction between "desired" and
> "correct" then I can instantly rewrite the program
> to be much faster and much smaller.
"Desired" implies some acknowledgement of the application and tools involved. Ages ago, it wasn't uncommon to have applications that did bank switching, overlays, etc. WHERE things resided was a crucial part of how they would work IN THAT ENVIRONMENT. E.g., if the trampoline code could ever be mapped *out* of the address space, then the bank-switching feature came to an unceremonious halt. Move the exact same application to a machine with a larger physical address space, tweek the linker script accordingly and the code is still "correct". In the first case, where things are located can also have a marked impact on size and space ("desired" behavior); by avoiding the overhead of "distant" references for those things that can (or must) benefit from the "near" efficiencies.
>> It's unfortunate when "correct" relies on command line
>> arguments to resolve some "implementation defined behavior"
>> with which the compiler could, otherwise, take liberties.
>
> That is always the case with C/C++, unless the program
> uses no separately compiled libraries and turns off
> all optimisations.
No. You can discipline yourself to avoid relying on undocumented behaviors or "letting the compiler choose" (among possible interpretations). Of course, writing PORTABLE code under those constraints is considerably harder (how many folks actually worry about exceeding the implementation-specific limits for particular data types? Or, verify that their code will continue to function properly *if* those limits are considerably higher than "nominal" from the Standard?)
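That discipline can be partly mechanized in portable C: pin down the
widths you actually rely on with <stdint.h>, and turn the remaining
assumptions into compile-time failures rather than silent run-time
surprises.  A small sketch (the particular checks are just examples of
the idea):

    #include <stdint.h>
    #include <limits.h>
    #include <assert.h>              /* C11 static_assert           */

    /* exact-width type instead of "whatever int happens to be"     */
    typedef int32_t sample_t;

    /* implementation assumptions, stated and compiler-checked      */
    static_assert(CHAR_BIT == 8, "code assumes 8-bit bytes");
    static_assert(sizeof(long) >= 4, "long must hold 32 bits");

    /* behaves identically whether int is 16, 32, or 128 bits wide  */
    static inline uint16_t low16(uint32_t x)
    {
        return (uint16_t)(x & 0xFFFFu);
    }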
> Other languages are much better in that regard.
Sure. But that comes at the expense of either requiring more work of the processor (given that the language isn't tailored to a particular processor) *or* rendering extra processor capabilities moot. [E.g., imagine all ints were 16b and you developed an application on a 128b processor in <whatever> language. Or, that all ints were 128b and you wanted to run the application on a 16b processor. etc.]
>> Likewise, if the order and locations at which objects
>> can be bound can arbitrarily be altered and affect operation
>
> That is just one small consideration in this context.
Of course! I was merely drawing attention to the fact (which could easily have been overlooked) that *order* can affect the DESIRED operation (e.g., performance).
>> [These should be eschewed, IMO]
>>
>> (This is an issue on many processors, without concern for the
>> actual I/O's)
>>
>> Of course, there's no way for the tool to know/enforce these constraints
>> other than a suitable note to the future developer!
>
> And that's undesirable.
But largely unavoidable.  It's also undesirable for developers to
write crappy code, fail to document their algorithms/implementations,
fail to include comprehensive test suites, etc.  But, there is a point
at which you have to assume "professional" means more than "paid to do
a job".
>>>>> It is language independent and very easy to add to compilers without
>>>>> changing the basic form of the language.
>>>>
>>>> it very nearly destroys the portability of code that uses it...
>>>
>>> That seems unimportant to me.  I cannot think of a
>>> reason why you would need to nail down addresses
>>> in portable code.  Of course "portable" is not a
>>> black and white concept!
>>
>> Therein lies the rub.  Code can be "portable" yet still tied to
>> a particular processor (but a different implementation).  E.g.,
>> reset_processor()...
>
> More than just a processor, consider different
> boards with the same processor.
I addressed "different boards" with my "different implementation" reference. E.g., relocating a UART to a different location in (memory/IO) space, altering the "sense" of the address lines to that device (i.e., so consecutive registers are NOT in consecutive locations *or* are presented in an entirely different order), scrambling the data lines (e.g., so 0x53 is returned when the character '5' is received), etc. The "reset_processor()" reference intending to suggest the fact that most processors "come out of reset" at a particular (fixed!) point in their address space, regardless of the rest of the board around them. [This sort of thing is increasingly common with SoC targets where many of the design choices (e.g., address map) have been taken away from the *hardware* designer]
>>> Any examples?
Walter Banks <walter@bytecraft.com> writes:

> Far less complex.  A simple way to declare a variable at a specific
> physical address.  Every other variable attribute remains untouched.
Definitely a nice feature for embedded applications.

<plug>
The Ada variant is:

   Some_Variable : Some_Type
     with Address => 16#dead_beef#;
</plug>

Greetings,

Jacob
--
"Any politician with a live opposition does not
 understand how to make proper use of the true
 instruments of politics."
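The everyday C counterparts, for comparison -- no language extension
required (the address is a placeholder, not a real register):

    #include <stdint.h>

    /* idiom 1: hard-code the address in the source with a cast */
    #define SOME_VARIABLE (*(volatile uint32_t *)0xDEADBEEFu)

    /* idiom 2: leave it to the linker -- the C source carries no
       address at all; a linker-script symbol assignment (e.g.,
       some_variable = 0xDEADBEEF; in GNU ld) supplies it per target */
    extern volatile uint32_t some_variable;

Which of the two is "better" is, of course, precisely the argument
running through this thread.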
On 10.03.2017 at 09:17, Tom Gardner wrote:
> This is a philosophical difference.
>
> If something is important to the correct operation of
> the program, then I like it to be visible in the source
> code.
That's just it: it is hardly essential for the operation of such code
_where_ exactly that register is; what really matters is that the
variable describing it is a) properly structured and named, and b)
ultimately correctly located.  So how is it better to have that
address hard-coded into the driver source code, as opposed to getting
to decide it at link-time?  Among other things, this allows the same,
pre-compiled driver code to be used for different micros, even if they
mapped the same peripheral IP module to different addresses.

Practically speaking, the @ extension has caused my employer a good
deal more grief than equivalent methods in other toolchains we use.
That happened primarily because it _is_ an extension that quite a
number of our other tools that read C source don't fully recognize.
The worst aspect of it is that no amount of preprocessor work gets the
@ syntax reliably hidden from those tools' view.
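The link-time approach being described might look like this (register
layout and addresses are invented; the linker-script line uses GNU
ld's documented symbol-assignment syntax):

    /* uart_driver.c -- compiled ONCE, knows nothing about addresses */
    #include <stdint.h>

    typedef struct {
        volatile uint32_t DATA;
        volatile uint32_t STATUS;
        volatile uint32_t CTRL;
    } uart_regs_t;

    extern uart_regs_t UART0;          /* located by the linker */

    void uart_enable(void)
    {
        UART0.CTRL |= 1u;
    }

    /* each target's linker script then pins the symbol, e.g.:
     *     UART0 = 0x40001000;
     * so the identical object file serves micros that map the
     * peripheral differently. */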
> A useful benefit is that the information is easily
> found and analysed by the IDEs and/or other source code
> manipulation tools around.
An IDE that cannot find text in linker/locator scripts or make files isn't worth being in use.
> In the same vein, in C I dislike having correct code
> operation being dependent on combinations of command
> line compiler arguments.
Well, most code whose behaviour is changed by compiler flags is anything but correct --- its authors usually fail to see that, though. OTOH, using language extensions like this '@' makes your code dependent not just on a compiler option that might toggle it on/off, but even on the compiler _having_ such support in the first place, for that option to enable.
> I cannot think of a reason why you would need to nail down addresses
> in portable code.
For platforms like ARM that have multiple compiler toolchains
available, it's highly preferable to allow a single code base to work
on all of them.  That's one kind of portability the '@' extension
breaks almost immediately, first by not being supported on all
compilers, second by hard-coding addresses that aren't necessarily the
same on all implementations of the CPU platform.
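Code that tries to keep '@'-style placement while staying
multi-toolchain typically collapses into conditional compilation along
these lines (the vendor spellings are quoted from memory and should be
treated as illustrative; the address is invented):

    #include <stdint.h>

    #if defined(__ICCARM__)           /* IAR: '@' placement        */
      __no_init volatile uint32_t ctrl @ 0x40001000u;
    #elif defined(__CC_ARM)           /* armcc: attribute form     */
      volatile uint32_t ctrl __attribute__((at(0x40001000u)));
    #else                             /* GCC et al.: no equivalent;
                                         fall back to a cast       */
      #define ctrl (*(volatile uint32_t *)0x40001000u)
    #endif

-- which is arguably worse than either a plain cast or a linker-script
symbol, and illustrates the portability cost being described.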
On 2017-03-10, Jacob Sparre Andersen <jacob@jacob-sparre.dk> wrote:
> Don Y <blockedofcourse@foo.invalid> writes:
>
>> What *single* (non-traditional) language feature do you find most
>> valuable in developing code?  (and, applicable language if unique
>> to *a* language or class of languages)
>
> In my daily work I would say strong typing (as done in Ada and SPARK).
When you say "strong typing" you are also assuming "static typing"? -- Grant Edwards grant.b.edwards Yow! I love ROCK 'N ROLL! at I memorized the all WORDS gmail.com to "WIPE-OUT" in 1965!!
On 3/10/2017 12:03 PM, Jacob Sparre Andersen wrote:
> Don Y <blockedofcourse@foo.invalid> writes:
>
>> What *single* (non-traditional) language feature do you find most
>> valuable in developing code?  (and, applicable language if unique
>> to *a* language or class of languages)
>
> In my daily work I would say strong typing (as done in Ada and SPARK).
>
> When I'm doing embedded systems, being able to declare representations
> for a type (especially for enumerations) is very valuable.
Yes.  Though I want a mechanism by which I can "override" the type of
the object in this reference (even if that has to be done by invoking
a user-defined "cast operator").  And, the ability to apply operations
to each of these types in a somewhat orthogonal manner.  (why can't I
reference the N'th element of a CONST list?  or, ALTER an element of a
non-const list?  I.e., why make lists less capable than arrays?)

I think it takes a bit of "experience" (time in the trenches) to truly
appreciate strong typing.  Early in your career, you are likely to
view it as a nuisance (hail, Pascal!).

There are also often significant conflicts trying to make "general
purpose" languages that can address application development AND
operating system/driver development; the concepts that make sense in
one domain don't always directly translate to the other.
> What really made me say "wow" to a programming language feature was when
> I first read about having tasking built into a language (and not as an
> add-on library) in a book about Ada 83.
The flip side of this is that it requires more of the run-time; a
greater set of constraints on the environment in which the code can
execute.

E.g., a language supporting tasking can "arrange" for atomic
references by simply not allowing the scheduler (that *it* controls)
to be invoked within such regions.  It can elect to do this without
the developer's awareness -- which can have unexpected (though
"correct") outcomes if the developer isn't aware of those
behind-the-scenes activities.

   foo: array[HUGE_NUMBER] of rational
   ...
   bar := foo

The same is true of integrating IPC/RPC, GC, etc. in the language.
Each constrains how the application can be deployed -- and requires
more intimate knowledge of the contract that the compiler is willing
to make with the support platform.

Increasingly, languages seem to want to bring their own sandbox to the
party -- in the guise of "helping" the developer.  If you can't use
things "as is", then you're faced with a BIGGER problem as the
language designer may not have planned for the sandbox to be
dissociated from the code generator.

[I.e., often, it's better to piece together smaller capabilities than
it is to try to extract those capabilities from a gestalt for
reimplementation]
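What that hidden "arrangement" amounts to, sketched in C against a
hypothetical kernel API (scheduler_lock/scheduler_unlock and the sizes
are invented for illustration):

    #include <string.h>

    #define HUGE_NUMBER 100000
    typedef struct { long num, den; } rational;

    rational foo[HUGE_NUMBER], bar[HUGE_NUMBER];

    /* stubs so the sketch compiles standalone; a tasking-aware
       compiler emits the equivalent of these hooks around
       "bar := foo" without being asked */
    void scheduler_lock(void)   { /* disable preemption    */ }
    void scheduler_unlock(void) { /* re-enable preemption  */ }

    void assign_atomically(void)
    {
        scheduler_lock();              /* no preemption from here...  */
        memcpy(bar, foo, sizeof foo);  /* copy looks atomic to tasks  */
        scheduler_unlock();            /* ...to here -- however long
                                          HUGE_NUMBER makes that take */
    }

The "unexpected (though correct) outcome" is visible immediately:
every other task's worst-case latency just grew by the time needed to
copy HUGE_NUMBER rationals.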
On 2017-03-10 12:24 PM, Don Y wrote:
>>
>> If something is important to the correct operation of the program,
>> then I like it to be visible in the source code.  A useful benefit
>> is that the information is easily found and analysed by the IDEs
>> and/or other source code manipulation tools around.
>
> I guess it depends on what you consider "source code".  How do you
> treat makefiles, linker scripts, etc.?  Clearly they are all
> important to the *intended* operation of the program -- as are the
> actual tools, themselves.
>
> How much of this cruft do you clutter the "sources" with in the
> attempt to ensure they accompany the sources?  What about applications
> wherein multiple languages are combined; how do you nail down the
> "implementation defined" behavior of their interfaces?  What order
> will your Java/C/Python/foo function build its stack frame?  How will
> your FORTRAN/Pascal/ASM/bar module interface to it??
There are lots of tools issues that should be re-examined.  There are
many cases where the tool execution order is backwards to what
generates the best code.  The compile-link sequence, where some key
information about the target ISA or processor architecture is not
known until link time, sometimes forces the compiler to use a subset
of the actual processor's ISA rather than take advantage of a specific
member feature.  The recent discussion of the Mill belt length is a
good example of how important this is.  This type of compiler approach
can encapsulate the specific processor variations and make
application-wide optimizations with relative ease.

Switching this around, so the compiler is focused on creating code for
a well-defined target, rarely requires anything more than including a
device-specific header file in the application (as a side effect,
eliminating the link step).

What's wrong with a single set of sources that defines an application:
no command-line options or linker scripts, just an application
including the definition of the target, files, and libraries it needs?
Compilation is faster by many factors, and there is a simple,
self-contained project that can be easily re-created after a decade or
more.  (The oldest project we have helped customers re-create in the
last year was archived by the customer in 1988; we have copies of
every released tool set.  Start to finish, recreating an identical HEX
file took < 2 hours from receiving the customer support request
email.)

The brouhaha about "@" and C is really more about having supporting
syntax to be able to explain what is desired without needing an
indirect definition.  That is my real argument, not that it can't be
done.  Most languages have some way to access the underlying machine;
fewer of these languages do so in a simple, clean way.

Don, I am not arguing to create a more complex world.  But in the area
of language design, why are many tool sets burdened with solutions to
the computer limitations of 1980?

It is a real eyeopener to spend some time with some of the current
crop of programmers who are using what many of us would consider a toy
language to actually achieve some pretty remarkable results.  It took
me a long time to respect what they are doing.

w..
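The "definition of the target lives in the sources" idea can be
pictured in plain C (device name, addresses, and registers below are
all invented for illustration, not any vendor's actual syntax):

    /* mcu_xyz123.h -- hypothetical device-specific header.  Everything
       the build must know about the target is stated IN the sources,
       so the project carries no linker script and no command-line
       configuration to lose over a decade of archival. */
    #ifndef MCU_XYZ123_H
    #define MCU_XYZ123_H

    #include <stdint.h>

    /* memory map (illustrative values) */
    #define FLASH_BASE  0xE000u
    #define RAM_BASE    0x0040u

    /* special-function registers, fixed by the silicon */
    #define PORTA  (*(volatile uint8_t *)0x00u)
    #define DDRA   (*(volatile uint8_t *)0x04u)

    #endif /* MCU_XYZ123_H */

    /* application.c -- the whole project: this file plus that header */
    #include "mcu_xyz123.h"

    int main(void)
    {
        DDRA  = 0xFFu;        /* all of port A as outputs */
        PORTA = 0x55u;
        for (;;) { }
    }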