On 18.3.17 08:25, Don Y wrote:
> On 3/17/2017 8:59 PM, George Neuner wrote:
>> On Thu, 16 Mar 2017 17:23:11 -0700, Don Y
>> <blockedofcourse@foo.invalid> wrote:
>>
>>> On 3/15/2017 6:05 PM, George Neuner wrote:
>>>
>>>> For one "applet" (process / security context) to allocate memory on
>>>> behalf of another, that memory has to be supplied by something that
>>>> exists outside of either of them.
>>>
>>> Actually, the memory can exist anywhere: in some global space, in
>>> the "allocating entity" or in the "client entity". As can the code
>>> that does the actual allocating (at the behest of the allocator).
>>>
>>> [Consider client environment not supporting arbitrary pointers.
>>> Unless something *gives* you a reference/handle to a piece of memory,
>>> there's no way you can access its contents (short of an exploit).
>>> E.g., no way you can go looking up/down the stack even though you
>>> know there *is* a stack supporting the implementation language]
>>
>> Moving the goal-post again.
>>
>> From previous discussions, I know your system works similarly to Mach
>> ... and while I don't know your system, I *do* know Mach. You're
>> conflating things when you talk about "allocating entities" (aka Mach
>> servers), and introducing complexity where none should exist.
>>
>> E.g.,
>>
>> The Mach memory manager (MM) grabs most of the RAM for itself and
>> creates a heap from which it parcels blocks to other processes.
>>
>> A user-allocated memory block physically exists IN THE MM's HEAP, but
>> semantically it *belongs* to the user process: it is mapped into the
>> address space of the user process, else that process could not use it.
>> When you go to rehost this process, you need to consult the MM only so
>> far as to know the attributes of the block. The block itself is IN
>> THE USER PROCESS, and what it contains has to be replicated at the
>> same VMM "linear" address in the new process on the new host.
>>
>> The MM itself need not be involved in the replication. In fact you
>> probably would not want it to be because you only want to copy live
>> data and not the entire address space of the rehosting process.
>>
>> The access/call and identity mechanisms: i.e. the "handles" by which
>> the MM is called, and by which the block is known to the MM - mostly
>> are beside the point.
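
[A minimal sketch of the rehosting idea above, in Python. The point is
that only the pages actually mapped into the client's linear space need
replicating at the same linear addresses; the MM itself isn't involved.
All names and the page granularity are invented for illustration.]

```python
PAGE = 4096

class Process:
    """A process whose linear space is represented sparsely:
    only touched (mapped) pages are stored."""
    def __init__(self):
        self.pages = {}                  # linear page number -> bytes

    def touch(self, linear_addr, data):
        # Demand-map a page on first touch.
        self.pages[linear_addr // PAGE] = data

def rehost(src):
    """Replicate only live (mapped) pages, at the same linear
    positions, into a fresh process image on the new host."""
    dst = Process()
    for page_no, data in src.pages.items():
        dst.pages[page_no] = data
    return dst

p = Process()
p.touch(0x1000, b"live data")            # one mapped page out of the space
q = rehost(p)                            # copies one page, not the heap
```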
>
> (sigh) You're fixating on AN EXAMPLE of an OBJECT that can need things
> like "allocation_policy" bound to it. (rewind the thread to see where I
> introduced a "heap"). Reread the thread with "potato" substituted
> for all instances of "heap", "mashing_policy" for "allocation_policy",
> "weight" for "size", etc.
>
> THEN, lecture me as to how the "OS" owns this object and how *its*
> primitives (VMM in your example) apply to its management.
>
> [Should I, in the future, talk STRICTLY in terms of abstractions and avoid
> ANY pleas to reify those concepts? For fear that a reader will seize on an
> example as representative of the entire universe of possibilities? How
> much "imagination" should I expect of readers?]
>
>>>> Ok, so you have a block of system memory, and you created one (or
>>>> more) heaps having certain properties within the block - heaps which
>>>> are "managed" locally in the process by code from a dynamic module.
>>>> And now you want to rehost the process.
>>>
>>> Objects (heaps in this example) are managed by <something>. That
>>> something may be the entity "owning" the object, a proxy or the
>>> OS acting as the "proxy of last resort".
>>
>> Now you are getting completely away from the "memory" question you
>> initially asked and into objects managed by servers on behalf of
>> clients.
>>
>> Again with the goal post.
>
> "Imagination". Remember, we're talking about potatoes...
>
>> An IN-PROCESS server allocates from memory already owned by the client
>> process. The client DOES NOT OWN an object which it can't touch
>> directly, and which it can only affect via RPC to another process.
>
> An in-process server allocates RESOURCES from resources that it
> can REFERENCE from ANY OTHER SERVICE (which may be the OS or any
> other object server to which the "in-process server" has rights).
>
> The potato server can call upon a "masher" supplied by a "mashing service"
> (and the poor OS is completely clueless as to the concept of "mashing").
> The handle for the object that it then instantiates is all the client
> ever needs to manipulate this particular potato.
>
> The potato server can migrate to another node. The client referencing
> the potato can migrate. The client can give the potato to someone else.
> The potato server can pass management of the potato to a different
> instance of a *compatible* potato server.
>
> The potato can be "serialized" to a persistent medium and later
> reconstituted ("instant potatoes"! :> )
>
> The application need be aware of none of these things.
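
[A sketch of the "instant potatoes" point: a client holds only a handle,
so the server can serialize the object and a *compatible* server instance
can reconstitute it later, without the client knowing. The server API
here is invented for illustration.]

```python
import json

class PotatoServer:
    def __init__(self):
        self._objects = {}               # handle -> object state
        self._next = 1

    def create(self, weight, mashing_policy):
        h = self._next; self._next += 1
        self._objects[h] = {"weight": weight,
                            "mashing_policy": mashing_policy}
        return h                         # the handle is all a client holds

    def serialize(self, handle):
        # "Serialize" to a persistent medium (here, a JSON string).
        return json.dumps(self._objects[handle])

    def reconstitute(self, blob):
        # A different, compatible server instance takes over management.
        h = self._next; self._next += 1
        self._objects[h] = json.loads(blob)
        return h

s1, s2 = PotatoServer(), PotatoServer()
h = s1.create(weight=300, mashing_policy="coarse")
blob = s1.serialize(h)                   # persist...
h2 = s2.reconstitute(blob)               # ..."instant potatoes"
```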
>
>> Replicating the memory state of the process on the new host
>> automagically handles the case of an in-process server. It also
>> replicates client side state for anything controlled on its behalf by
>> an out-of-process server.
>>
>> The state of an out-of-process server does not factor directly into
>> rehosting a client. All the server needs to know (if even this much)
>> is that the client's "session" is to be suspended pending reconnection
>> rather than terminated.
>
> Replicating "memory" doesn't replicate the state of the machine because
> other machines are involved.
>
> Copy the memory image of *my* PC's "shell", at this moment, onto your PC.
> Now, tell me which files I have open. Which TCP connections are live.
> etc.
> The "state" is spread around in several places *besides* the process
> itself. Including the mail server at the other end of this SMTP connection
> (when I click SEND).
>
> [I.e., unless the passphrase for my SMTP server persists somewhere in
> this process's memory, your copy of its memory won't help you access
> that connection that exists *now*]
>
>>> The actions involved in management may reside in the OS, a service
>>> (created by <something> -- including the client in question!) or the
>>> client. At the direct or *implied* request of the client, etc.
>>
>> You are failing to realize that it doesn't matter who manages the
>> data: it matters only who *owns* the data.
>>
>>> Likewise, if some of the "properties" reside *in* the client (e.g.,
>>> the algorithm by which allocations will be performed -- under the
>>> supervision of some remote/system service that actually *does* them),
>>> then you have to drag that/those things along with you.
>>
>> Or simply provide them in the new setting. E.g., there is no reason
>> to copy in-memory code if the new host has access to it via a file
>> system. If the new host has a different architecture, you can't copy
>> the code anyway and would have to provide checkpoint mechanisms
>> whereby the *program* (not "process") can be stopped and restarted
>> cleanly.
>
> The applets run in a virtual machine. So, I can pick them up and
> move them to another heterogeneous host. (The "servers" are currently
> designed for a homogeneous environment. But, the same concepts apply)
>
>>>> 2) Rehosting a running process necessarily requires checkpointing,
>>>> copying or reconstructing its entire dynamic state: heaps, stacks,
>>>> globals, loaded modules, kernel [meta]data, etc. - far more than one
>>>> memory block.
>>>>
>>>> Every piece of distributed state must be identifiable as belonging to
>>>> the process - regardless of what "module" may have created or is
>>>> currently managing it.
>>>>
>>>> You have to copy all data belonging to the process, so having heap
>>>> properties stored separately from the heap itself is not a problem.
>>>
>>> In my "portable" case, this is handled by moving the object handles.
>>> The system can then opt to "optimize" execution by (later?) moving the
>>> actual object instances (to minimize communication delays or take
>>> better advantage of processing power on some particular node -- which
>>> may differ from the source or destination nodes)
>>>
>>> But, you can (currently!) create "non-portable" objects.
>>
>> So they are lost. So what? You knew they were non-portable when you
>> created them.
>
> But the apps that contain them can be portable! You might have
> *chosen* to make a non-portable potato (by implementing a potato
> server as one of the threads in your process). The potato is
> bound to the process. But, the process isn't (necessarily) bound
> to the *node* (or process group, etc.).
>
> E.g., I can create a mutex to allow the local threads to cooperatively
> access <something> (maybe just a 4 byte counter). As long as the
> threads, the counter (or its accessor) AND the mutex all remain bound
> in that process container, the whole entity (process) can be moved at will.
>
> If some other process/agent needs to access that mutex, then it
> needs a handle that is visible from outside of the local process.
> If that other process can migrate to a different node, then it
> can retain the ability to interact with the mutex in this first
> process still residing on the original node.
>
> So, you end up with very different use/performance cases:
> - two or more colocated threads accessing the object
> - two or more LOCAL *processes* accessing the object
> - two or more REMOTE processes accessing the object
>
> [Note, of course, that any other *object* can also indirectly be accessing
> that object/potato]
>
> In the first case, you can conceivably optimize the access to that of
> a conventional mutex (in this example -- remember potatoes!)
> implementation.
>
> In the second case, one process might be able to access the object
> "from within" (a direct call to the functions that implement the
> object) while another process (same node) accesses it "from without".
>
> In the third case, one may be from within, or without, or "remote".
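
[A toy illustration of the three cases above: the same object handle can
be bound, at run time, to very different access paths. The dispatch and
names below are purely illustrative; real IPC/RPC is only simulated.]

```python
class Counter:
    """The shared object (the 'potato')."""
    def __init__(self):
        self.value = 0
        self.path_log = []               # records which path each call took

    def incr(self, via):
        self.value += 1
        self.path_log.append(via)

def bind(counter, locality):
    """The run-time picks the access path; the caller just gets
    something callable, regardless of where the object lives."""
    if locality == "colocated":
        return lambda: counter.incr("direct")     # plain function call
    if locality == "local":
        return lambda: counter.incr("local-ipc")  # same-node IPC (simulated)
    return lambda: counter.incr("rpc")            # cross-node RPC (simulated)

c = Counter()
for where in ("colocated", "local", "remote"):
    bind(c, where)()                     # identical client code, three paths
```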
>
> I don't consider it a "good idea" to be forced to make these "mobility"
> decisions at compile time. For the same reason that you don't decide
> where in memory a piece of code should execute. Or, whether it WILL be
> optimized by the compiler, or not (compiler command line options aren't
> reflected in the sources)
>
> Just like I don't define which functions/methods should be accessible
> remotely (let the IDL compiler address that based on what I tell *it* to
> do)
>
>>> And, objects
>>> that can't (easily) be shared. (e.g., for a "conventional" heap, you
>>> can let the default policies available in a "heap manager" govern
>>> the way the heap operates. *But*, you can't take advantage of any
>>> "enhanced capabilities" in much the same way that you can't in a
>>> more conventional process container, etc.)
>>
>> There's no reason code can't use "enhanced capabilities" - as long as
>> the result is the same (for some definition). E.g., if you can ignore
>> timing, then debug code is interchangeable with optimized code.
>
> That's been my goal! But, it means the code has to reconfigure itself AT
> RUNTIME. Because the compiler doesn't know where an instance of a potato
> will reside at *compile* time.
>
>>> The problem with this (apparently arbitrary) distinction between
>>> portable/shareable objects and "legacy" variety implementations
>>> is that the developer has to explicitly decide what can be migrated,
>>> shared, and how (because the developer has to take extra steps at
>>> compile and link time to make those capabilities available).
>>
>> It is completely arbitrary, and IMO unnecessary.
>>
>> If there is something - a device possibly - that only exists on one
>> host, then the module that interacts DIRECTLY with that <something> is
>> not portable. Any objects it creates as a server on behalf of a
>> client likewise are not portable.
>
> No. A "something" that CAN only exist on one host (node) is not portable.
> A "something" that HAPPENS TO only exist on one node but CAN be migrated
> to another node doesn't suffer those same restrictions.
>
> I put a thin layer immediately above the hardware devices on each node
> (think of it as a sort of HAL). At the top interface of this layer, I
> make a deliberate choice as to whether or not that interface shall be
> exportable across nodes.
>
> For example, there is little need to bind anything that uses the
> "irrigation valve" hardware (HAL layer) to the node that actually
> has the hammer drivers for those valve actuators. They are inherently
> slow and largely sacrificial. So, that interface is exported. The
> "irrigation control task" can reside anywhere in the system -- including
> nowhere (zombie).
>
> OTOH, the interface to each of the video cameras is relatively high
> bandwidth. The task(s) that directly interact with them *should*
> reside on those nodes (*if* you assume the camera output will see
> much consumption). So, those interfaces are not exported.
>
> OToOH, the interface to the *compressed* video streams provided by those
> tasks *can* be exported. Their bandwidth requirements might be
> considerably less than the "raw" video coming off each individual
> camera. So, a "motion detection" task can run on the IRRIGATION
> NODE. Or, moved to the LAUNDRY NODE if/when the irrigation node
> is taxed with some other responsibility (perhaps it is taken out of
> service as the valves -- and the proxies that run them -- are no longer
> needed)
>
> I.e., you gain the most flexibility by making everything exportable.
> But, that means you either incur the costs associated with that
> overhead -- even for the "local" case (e.g., motion detection task
> running on the same node as the camera!) -- *or* do run-time
> optimization of the communication path so that the remote costs
> are only incurred when necessary (and factored into the workload
> manager's reassignment of tasks to nodes)
>
>> So if this module is to be used by portable applications, it should
>> NOT be implemented as an in-process server. Q.E.D.
>>
>>> So, where these sorts of bindings get stored (and how they get
>>> tracked) varies based on this "other" information that the developer
>>> supplies.
>>
>> This is where you and I have parted company in the past. You like the
>> kernel capability approach whereas I prefer the user capability
>> approach ... precisely BECAUSE user capabilities avoid many of these
>> semantic problems by virtue of being ordinary data.
>
> The value of the in-kernel capability is that it inherently tracks
> references. If the capability is "just data", then you have to
> add a second mechanism to track *where* every copy of the data
> happens to reside! There's no way for the system to know that
> you've just handed a COPY of a capability to something, deleted
> YOUR copy, then retrieved a copy of that copy! (i.e., the
> "reference" is still "live", even though it looks like you
> deleted it). The "system" has no idea who has outstanding
> references (handles/capabilities) nor how MANY.
>
> By contrast, if the kernel manages the capabilities, then they
> can't (by definition) be copied or transferred or deleted
> without the kernel being an active part of that operation.
> The kernel (and the manager for that particular capability)
> can mediate how each is used ("No, you can't give that
> capability away! If you try to do so, I will abend your
> process and/or just silently DELETE the capability and let
> you and the other party wonder why that operation that
> you intended to have performed *isn't*!")
>
> "Is anyone using this video camera? No, there are no active
> capabilities in existence ANYWHERE in the system so we can
> shut it down. Is anyone (process) using the node that hosts
> that camera? No, so we can shut down the node, as well!"
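
[A minimal sketch of kernel-mediated capabilities: because every copy,
transfer, and delete must go through the "kernel", outstanding references
can always be counted exactly, so "is anyone using this camera?" has a
definite answer. All names are invented for illustration.]

```python
class CapabilityKernel:
    def __init__(self):
        self._refs = {}                  # object name -> live capability count

    def grant(self, obj):
        # Hand out a capability; the kernel records the reference.
        self._refs[obj] = self._refs.get(obj, 0) + 1
        return obj                       # an opaque token, in a real system

    def copy(self, cap):
        # Copying CANNOT bypass the kernel, so the count stays accurate.
        self._refs[cap] += 1
        return cap

    def delete(self, cap):
        self._refs[cap] -= 1

    def in_use(self, obj):
        return self._refs.get(obj, 0) > 0

k = CapabilityKernel()
cam = k.grant("video-camera-3")          # client A gets a capability
dup = k.copy(cam)                        # A hands a copy to client B
k.delete(cam)                            # A deletes ITS copy...
# ...but the kernel still knows B's copy is live: the camera
# cannot be shut down yet.
```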
>
>> Everything other than kernel level process and address space control
>> can be done in user space, and rehosting then is simply a matter of
>> replicating the memory state [and reconstructing the kernel state].
>>
>>>>> The "consistent" solution is to make all of these first class
>>>>> objects and let the OS manage them and their locations/instantiations.
>>>>> But, that adds considerably to the cost of *using* them.
>>>>
>>>> Yes, that certainly is the ridiculous, overcompensating solution.
>>>>
>>>> The "consistent" solution is the one done already by any decent OS:
>>>> keep a record of all the kernel structures and memory pages belonging
>>>> to the process.
>>>>
>>>> Program objects then are just ordinary data in memory blocks "owned"
>>>> by the process, or by the kernel on behalf of the process.
>>>
>>> But that ignores all of the "other" dependencies that can be in
>>> play at any given time (i.e., my "solution" being these portable
>>> handles that the OS tracks for the application/developer). So,
>>> <something> knows that there is an aspect of the current applet's
>>> instantiation that relies upon something else *or* that is relied
>>> upon BY something else.
>>
>> No it doesn't.
>>
>> You have created a whole set of artificial dependencies through your
>> (ab)use of kernel capabilities. Nothing in your system is - or even
>> can be - self contained.
>
> A process need never export a handle to an object that it creates
> and manages -- for itself. The OS needn't even be aware that the
> object exists. Does the OS know that I have a 4 byte counter
> *here*? Or here?
>
> OTOH, if something *else* DEPENDS on that counter (potato), then
> you want that dependency to be visible to the system. You don't
> want the potato to silently disappear without the dependent entity
> knowing about it, *now* (not at some later date when it expects it
> to STILL exist).
>
> Likewise, if it "moves", you don't want the dependent entity to have to
> scramble to *find* it when it needs it -- for much the same reason as
> above.
>
>>> [My approach lets the run-time know of these possible external
>>> references by the presence of object handles held -- or exported -- by
>>> the applet in question. *Absent* these, the task is just a block of
>>> memory and "processor state" that can be copied anywhere and "resumed".]
>>
>> But by doing that, you've inserted the operating system into all kinds
>> of things where IMO it has no business being. You've created the MCC
>> from "Tron".
>
> An OS manages resources. I consider dependencies to be resources.
> And, the permissions associated with them.
>
> I don't want a user-level task to have to keep track of where the
> things that it needs happen to be, *now* -- or if they even happen
> to EXIST!
>
> Nor do I want to have to require the developer to never try to do
> something with a particular thing/object (potato) that it *shouldn't*.
> Just like the OS shouldn't let process A peek into process B's memory.
> Or, access unmapped portions of memory, etc.
>
> You'd not advocate allowing a user to "add 3" to an arbitrary string.
> You'd expect the *compiler* to enforce the rules of the language on
> his *use* of the language.
>
> I expect the OS to enforce the rules of the *system* on the users
> of that system in much the same way!
>
>>>> You can somewhat mitigate [rehosting] by making large allocations
>>>> piecewise: abusing VMM to make them appear address contiguous.
>>>>
>>>> E.g., if a process asks for 100MB, reserve the address space but
>>>> instantiate just a few megabytes at a time, on demand as the process
>>>> touches unbacked addresses.
>>>>
>>>> [Because (I assume) you don't want to overcommit memory, you need to
>>>> reserve requested address space both locally in the process, and
>>>> globally so that other processes can't accidentally grab it.]
>>>
>>> Resource constraints vary with the resource and the consumer/provider.
>>> You can overcommit many resources because many "jobs" don't exploit
>>> their worst-case resource needs.
>>>
>>> But, you do so at the risk of delaying the availability of those
>>> resources
>>> to the job in question. Or, some other job interested in those
>>> resources.
>>
>> No, you are misunderstanding. The process owns the space ... but in
>> the scenario I presented, the physical memory backing the space is
>> mapped on demand - which gives the OS more visibility into the process
>> if/when it needs to rehost the process.
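
[A sketch of the demand-backing scenario above: the full linear range is
reserved up front, but "physical" pages are mapped only as addresses are
touched, so the OS knows exactly which pages would need copying on a
rehost. Page size and names are illustrative.]

```python
PAGE = 4096

class AddressSpace:
    def __init__(self, reserve_bytes):
        # The reservation belongs to the process; no backing yet.
        self.reserved = reserve_bytes
        self.backed = set()              # page numbers with physical backing

    def touch(self, addr):
        if addr >= self.reserved:
            raise MemoryError("access outside the reservation")
        self.backed.add(addr // PAGE)    # demand-map on first touch

    def resident_bytes(self):
        # Only THIS much would need replication on another host.
        return len(self.backed) * PAGE

a = AddressSpace(100 * 1024 * 1024)      # "give me 100MB": reservation only
a.touch(0)                               # first page faults in
a.touch(5 * PAGE)                        # a second, sparse touch
```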
>
> The "handles" give the workload manager (via the OS) insight into how and
> where it can/should rehost the process (or any other object!). It knows
> what the process "talks to" because it manages the capabilities FOR that
> process. And, it knows the other (more tangible) resources that the
> process requires (RAM, MIPS) by examining its current ledger and historical
> performance. So, it can balance the costs of "stretching" certain
> communication channels (RPC/RMI being more expensive than the local case)
> against the costs that some *other* process/object (a process is just a
> different sort of object, managed by similar handles) would incur if
> *this* process is not migrated.
>
> [You can't come up with an "optimal" solution. But, you can remember the
> solutions you've tried and how they performed -- because the OS is
> involved in these communications and can measure them! So, you can
> avoid solutions that performed poorly in favor of those that performed
> *well* or that have yet to be tried.]
>
>> Remember that there are 3 forms of VMM addresses: "linear", "virtual"
>> and "physical".
>> - "Linear" space is what is seen by an application.
>
> "Logical" -- it need not be linear/contiguous.
>
>> - "Physical" space is the actual RAM as seen by the system.
>> - "Virtual" space is what ties the other 2 together.
>
> And, how does this pertain to potatoes? :>
>
>> In a system using MMU only for process isolation, the "virtual" and
>> "physical" spaces would be identical - mapped 1:1. Overcommitting
>> memory is only possible when "virtual" space > "physical" space.
>
> Exactly. Recall that there are multiple objects sharing that physical
> space on a particular node. The physical space determines the maximum
> amount that can be mapped at a given time to the set of objects
> currently making demands on it. Those demands can (do) change, over time.
>
> E.g., a process may load many modules and, by executing code *in*
> each of them, cause pages to be mapped to instantiate those "opcodes"
> JIT. If a portion of a module is never actively referenced, then
> the memory required to back it never gets mapped -- not taxing the
> physical resources currently available. If a module gets UNloaded,
> then all of the mapped pages that were referenced can be released for
> other uses (other potatoes). Likewise, for the pages that were
> never actually mapped!
>
> E := load Encyclopaedia
> ...
> article := E.lookup(Da Vinci)
> ...
> unload(E)
>
> You wouldn't foolishly load the entire encyclopaedia into physical
> (or virtual) memory to do this. Instead, you'd load some accessor
> code that could, itself, load PORTIONS of the "index" *when* you
> initiated a search. This would require some number of pages of
> physical memory to be mapped to hold the resulting "article".
> Anything beyond that would be waste!
>
> E := load Encyclopaedia
> ...
> article := E.lookup(Da Vinci)
> ...
> article := E.lookup(R Crumb)
> ...
> unload(E)
>
> Would only require resources proportional to the LARGER of the Da Vinci
> and R Crumb entries (the resources held by the first "article" are
> freed when the second "article" is defined).
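
[The encyclopaedia example can be sketched like so: each lookup maps only
the pages for one article, and rebinding `article` frees the previous
one, so peak use is proportional to the LARGER of the two articles, not
their sum. Sizes and names are invented for illustration.]

```python
PAGE = 4096

class Encyclopaedia:
    def __init__(self, sizes):
        self.sizes = sizes               # article name -> size in bytes
        self.resident = 0                # bytes currently mapped
        self.peak = 0                    # worst-case resident footprint

    def lookup(self, name, previous=None):
        if previous is not None:
            self.resident -= self.sizes[previous]   # free the old article
        self.resident += self.sizes[name]           # map the new one
        self.peak = max(self.peak, self.resident)
        return name

E = Encyclopaedia({"Da Vinci": 40 * PAGE, "R Crumb": 3 * PAGE})
article = E.lookup("Da Vinci")
article = E.lookup("R Crumb", previous=article)     # Da Vinci freed first
```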
>
>> When an application requests 100MB [e.g., (s)brk] from the OS, then
>> [assuming the request can be granted] the OS adds 100MB to the linear
>> space of the process, and *reserves* 100MB in the virtual space of the
>> system.
>
> No. I don't *impose* that requirement! Because it makes the system
> brittle. The application handles this at a different level dictated
> by the CM system.
>
> If, in your example, you ultimately *use* all 100MB of that memory
> (which may be mmap'd in page-at-a-time increments) and then need
> (or, someone else needs) 50MB *more*, the system decides how best to
> provide those resources based on parameters from the CM system.
>
> If "more resources" are available (though not necessarily "50" MB),
> the system can opt to satisfy the request and let you continue to
> mmap (by your usage patterns) additional pages.
>
> If there are NO more resources available *when* you mmap another
> page (indirectly by your actions), the system decides how best to
> handle that request.
>
> Can I unmap some portion of a loaded module (knowing that I can
> always reload that portion of the module AS IF it hadn't *yet*
> been referenced/mmap'd)? Can I pageout some "really low" portion
> of the stack (on the assumption that it won't be accessed RSN)?
> Can I *migrate* some other task off of this node and reclaim the
> physical resources that it currently holds? Can I migrate *this*
> task to some other node that has more resources available? Can
> I bring another node on-line and migrate "something" to it? Can
> I rely on some redundant (potato!) server to assume responsibility
> for managing some objects that happen to reside on the local system
> and, as such, free up the space they occupy *and* the space
> required for the code to manage them? Can I *kill* some task
> (and associated objects) that are just running opportunistically,
> presently (let's do some commercial detection in that OTA TV
> broadcast we recorded two hours ago so it's ready for viewing before
> the user comes home and wants to see it!)? etc.
>
> You don't want to make those decisions at compile time. Because you
> don't know what will be active in the system at *run* time! And,
> you want to build as much flexibility in how/where resources can
> reside so that you have more run-time options. You don't want to
> overprovision just to handle a "worst case" that, in practice, will
> never occur!
>
> [Repeat this substituting "MIPS" for "memory", etc.]
>
>> Now the reserved space belongs to the requesting process - no other
>> process can take it.
>
> See above. If a process (potato) *needs* to be sure that resource
> (memory, MIPS, etc.) *is* available, "sitting and waiting", then it
> has to place an active "reservation" through the CM system. The
> CM system dictates limits for what individual tasks/"users" can
> consume.
>
> Reserves are A Bad Thing as they tie up those resources regardless
> of what other objects might happen to place similar demands on the
> system's resources. What's to stop "everyone" from demanding EVERYTHING
> that they *might* need "just in case"?
>
> If, instead, you allow elasticity in how the resources are supplied,
> you can afford to loosen the constraints on individual "consumers"
> because they have less ACTUAL impact on other consumers.
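
[The reserve-vs-elastic tradeoff above, as a toy model: a hard reserve
ties up resources whether or not they are used, while elastic grants are
best-effort against what actually remains. The CM interface here is
invented for illustration.]

```python
class ResourceManager:
    def __init__(self, total):
        self.total = total
        self.reserved = 0                # hard "sitting and waiting" reserves
        self.in_use = 0                  # elastically granted and consumed

    def reserve(self, amount):
        # A hard reservation must fit; it counts against everyone,
        # used or not -- which is why reserves are A Bad Thing.
        if self.reserved + amount > self.total:
            return False
        self.reserved += amount
        return True

    def elastic_use(self, amount):
        # Best effort: grant whatever actually remains.
        free = self.total - self.reserved - self.in_use
        granted = min(amount, free)
        self.in_use += granted
        return granted

rm = ResourceManager(total=100)
rm.reserve(60)                           # a hard reserve of 60 units
```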
>
> Joe User (not Bob Developer) could write a sloppy script that
> consumes "limitless" resources. Should he be allowed to bring the
> system to its knees -- because he is the "owner" of the device?
>
> Similarly, should Bob Developer be allowed free rein at the risk
> of compromising core services (implemented by *me*)?
>
> It's a lot easier to ensure your system can meet all its goals if
> you provision *for* those goals. But, with an open/expandable system
> that sees huge variations between actual and potential use, that
> translates into dollars that consumers aren't keen on spending.
>
> (why do you have more VM configured in your PC than physical memory?)
>
>> However, the OS can choose whether or not to immediately map the new
>> linear space to physical space. By mapping physical space on demand
>> as it is touched by the process, the OS has a much better picture of
>> the *actual* memory use of the process.
>
> Yes. Though that can vary with other "input data" that this instance
> of the process (potato) sees. E.g., the resources required for that
> lookup(R Crumb) are probably far less than the earlier lookup(Da Vinci).
> Do you *know* what the user will opt to want to "lookup" when you
> instantiate that task? :>
>
> You can either say "you can lookup ANYTHING" -- and provision for it.
> Or, "you can look up anything as long as the article doesn't exceed X"
> (and impose a disconcerting constraint on the user: "How do I know
> how big the article is LIKELY to be?"). Or, you can make a best
> effort and trade your response based on your current needs and those
> just added by the user.
>
>> When the time comes to replicate the process's memory on another host,
>> only mapped physical space needs to be copied - not the entire virtual
>> address space reservation.
>
> Again, that assumes the (active/mapped) memory represents the entire
> "state" of the process and all of its dependencies. (what about any
> network connections -- managed by the OS -- that may be active at the
> time? the contents of any NIC buffers AS the process is in transit?
> file handles, my "object handles", etc.)
>
> [Think about it. Send me an ISO with an image of your PC's memory
> at this point in time. I'll let you include the OS's state in that
> image, as well! Will I be able to read this USENET post at exactly
> THIS --> . <-- point when I load the image into my machine's memory?
> What if you mark the thread as read before the ISO gets loaded on
> my machine? Or, if YOUR server expires the article? Or, if
> your NNTP server happens to be offline? That's all state that is part
> of your "reading this USENET post" activity -- yet, you've not captured
> it well enough for me to pick up where you left off! :> ]
Just get the book, Andrew S. Tanenbaum, "Operating Systems: Design and
Implementation", read and understand it, then come back.
--
-TV