*IN GENERAL*, what dogma would you suggest regarding addressing
potential ("current") resource inadequacies when starting a task
(or, offering that capability *to* start that task to a user)?
Keep in mind that resource availability varies before, during and
after a potential task/activity is initiated. And, a particular
activity may result in other activities (automatically started
or likely to *want* to be started -- by the user).
E.g., if you were asked to copy a file, you'd probably stat() the
file to ascertain its size (*hope* that size remains constant
during the following operation) and check to be sure you have that
much free-space on the target. I.e., the copy operation would
tend to be fast enough that the user couldn't remedy a "no space
on device" error before it was signaled.
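That stat()-then-check preflight can be sketched in Python for concreteness (the function name and the `slack` margin are invented for illustration; note the result is only a snapshot -- both the file size and the free space can change immediately after the check):

```python
import os
import shutil

def preflight_copy_check(src: str, dst_dir: str, slack: float = 0.05) -> bool:
    """Return True if dst_dir *currently* has room for src, plus a small
    safety margin. This is a snapshot only: the source may grow and the
    free space may shrink before (or while) the copy actually runs."""
    size = os.stat(src).st_size              # size *now*; may change later
    free = shutil.disk_usage(dst_dir).free   # free space *now*; ditto
    return free >= size * (1 + slack)        # margin for metadata and races
```

The `slack` factor only papers over small races; it cannot protect against another process consuming space mid-copy, which is exactly the problem discussed below.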
OTOH, if the transport medium was of sufficiently low bandwidth,
you could allow the operation to start and warn the user that
he/she WILL run out of space at the target *if* nothing changes
(the implication being that the user should be that agent of change).
The 1960's approach, of course, is just to naively start the operation
and then have it abend when it hits that brick wall!
Note that each action that the user takes implicitly consumes resources
and, as such, can hinder other actions that he/she may want to take.
And, that your device can also have autonomous resource needs that
are incurred alongside the user's actions (e.g., daemons).
Do you, for example, let the user consume battery in a futile attempt
to perform some operation -- and end up jeopardizing his ability to
do some more valuable operation later (e.g., back up his device before
power fails)?
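One way to make that "don't jeopardize the backup" policy concrete is a guard that reserves the last xx% of the battery for a designated task. This is a toy sketch; the class name, the task labels, and the 10% floor are all invented, not from any real power-management API:

```python
# Sketch: only the designated "backup" task may draw the battery below
# a reserved floor; everything else must leave the reserve untouched.
class BatteryGuard:
    def __init__(self, floor_pct: float = 10.0):
        self.floor_pct = floor_pct   # percentage held back for backup

    def may_start(self, task: str, battery_pct: float,
                  est_drain_pct: float) -> bool:
        if task == "backup":
            # the backup is allowed to invade the reserve
            return battery_pct >= est_drain_pct
        # ordinary tasks must finish above the floor
        return battery_pct - est_drain_pct >= self.floor_pct
```

Note this presumes each task can estimate its own drain up front -- the very "every process must state its needs" requirement debated below.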
Again, these are only examples. The question is what criteria do you use
for alerting (and/or inhibiting!) the user when you know that it is likely
that he won't be able to perform the desired task WITH THE SYSTEM IN ITS
CURRENT STATE -- and *when* do you impose those notifications?
Forewarning of resource inadequacies
Started by Don Y ● April 7, 2016
Reply by George Neuner ● April 8, 2016
Hi Don,

On Thu, 07 Apr 2016 14:10:04 -0700, Don Y <blockedofcourse@foo.invalid> wrote:

>*IN GENERAL*, what dogma would you suggest regarding addressing potential ("current") resource inadequacies when starting a task (or, offering that capability *to* start that task to a user)?

I suggest you watch the movie "Dogma", think about it for a while, and then try asking your question again.
http://www.imdb.com/title/tt0120655/

8-)

>E.g., if you were asked to copy a file, you'd probably stat() the file to ascertain its size (*hope* that size remains constant during the following operation) and check to be sure you have that much free-space on the target. I.e., the copy operation would tend to be fast enough that the user couldn't remedy a "no space on device" error before it was signaled.
>
>The 1960's approach, of course, is just to naively start the operation and then have it abend when it hits that brick wall!

That's also the 2016 approach, because it's just too complex to figure out for most cases.

What you essentially are trying to do is predict successful (or not) completion of resource-limited scheduling with an open-ended set of processes, priorities and resources. That's an impossible situation: resource scheduling is a bin-packing problem that's solvable only in a closed system.

Given a known set of resources and an enumeration of the needs of each process, you can predict completion of any given process. But process priorities complicate scheduling, and any predictions of success go out the window the moment a higher priority process enters the mix.

And every process must be able to state the resources it needs to perform any given operation: e.g., I need xxx KB of RAM, nnn file buffers, yyy KB on disk Q, etc. ad nauseam. And if the file system is remote, implicitly add network connections, etc.

This does _not_ have to be done statically prior to execution [that is just the simplest case] ... but to be effective the process must communicate with the scheduler and participate in (re)scheduling whenever its resource needs change.

It is complicated when the needs are ad hoc, like with your file copy where the space on the target device is unknown until the source file is examined. The best you can do in such situations is to preempt the running process until scheduling shows that it can complete, and tell the user so she can halt the process if necessary.

>Note that each action that the user takes implicitly consumes resources and, as such, can hinder other actions that he/she may want to take.

Yes. Either you must defer a new process until scheduling says it can complete, or you introduce a "higher priority" process.

>And, that your device can also have autonomous resource needs that are incurred alongside the user's actions (e.g., daemons).

Daemon resources should be known and limited. There's a reason, e.g., that only a superuser can take the last process slot. But if you're trying to do resource scheduling, then every process must have known limits.

>Do you, for example, let the user consume battery in a futile attempt to perform some operation -- and end up jeopardizing his ability to do some more valuable operation later (e.g., back up his device before power fails)?

You fix it so only the backup process can invade the last xx% of the battery. Then watch it blow up due to WiFi retries because there is interference, or signal strength is poor because the device is too far from the AP.

>Again, these are only examples. The question is what criteria do you use for alerting (and/or inhibiting!) the user when you know that it is likely that he won't be able to perform the desired task WITH THE SYSTEM IN ITS CURRENT STATE -- and *when* do you impose those notifications?

If you can implement a reasonably effective resource-based scheduler, then you can warn the user that some program can't run now, but will run when <some set of> currently executing processes finish -- and ask whether the user wants to defer it until then. But as I said previously, everything goes out the window when a higher priority process enters the mix.

Even doing that much is incredibly hard. In an open system, too many resource needs are ad hoc, and it may not be possible to even enumerate every resource that *might* be involved. I.e., the scheduler may have to consider not only dynamically appearing resources, but new resources that it didn't even know of yesterday. And in a distributed system a centralized scheduler will be a bottleneck, but without it there's no way to predict the effects on a shared resource of processes executing on different nodes.

I know this didn't really help and that you've thought of all or most of it already.

YMMV,
George
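The "can't run now, but will run when these finish" warning George describes amounts to a simple admission test. A toy sketch (all names are invented; it assumes each task has declared its maximum needs and that running tasks release their resources in a known order when they finish):

```python
# Admission test: given declared maximum needs, either admit a task now,
# name the running tasks whose completion would free enough resources,
# or reject it outright.
def admit_or_defer(need, free, running):
    """need/free: dicts of resource -> amount.
    running: list of (task_name, held_resources_dict), assumed to
    finish in list order.
    Returns ("run", None), ("defer", [tasks to wait for]),
    or ("reject", None)."""
    if all(free.get(r, 0) >= n for r, n in need.items()):
        return ("run", None)
    wait_for, avail = [], dict(free)
    for name, held in running:
        wait_for.append(name)
        for r, v in held.items():          # resources freed when it ends
            avail[r] = avail.get(r, 0) + v
        if all(avail.get(r, 0) >= n for r, n in need.items()):
            return ("defer", wait_for)
    return ("reject", None)                # hopeless even when all finish
```

As George notes, the answer is invalidated the moment a higher priority task arrives and changes the completion order the test assumed.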
Reply by Don Y ● April 8, 2016
Hi George,

On 4/8/2016 4:08 AM, George Neuner wrote:

> Hi Don,
>
> On Thu, 07 Apr 2016 14:10:04 -0700, Don Y <blockedofcourse@foo.invalid> wrote:
>
>> *IN GENERAL*, what dogma would you suggest regarding addressing potential ("current") resource inadequacies when starting a task (or, offering that capability *to* start that task to a user)?
>
> I suggest you watch the movie "Dogma", think about it for a while, and then try asking your question again.
> http://www.imdb.com/title/tt0120655/
>
> 8-)

<frown> Doesn't look like the sort of thing in which I'd be interested. _RED_ and _RED2_ last night. _Flushed Away_ before that. Finestkind.

>> E.g., if you were asked to copy a file, you'd probably stat() the file to ascertain its size (*hope* that size remains constant during the following operation) and check to be sure you have that much free-space on the target. I.e., the copy operation would tend to be fast enough that the user couldn't remedy a "no space on device" error before it was signaled.
>>
>> The 1960's approach, of course, is just to naively start the operation and then have it abend when it hits that brick wall!
>
> That's also the 2016 approach because it's just too complex to figure out for most cases.

In an RT system, it is explicitly known for each task -- as indicated by its (numerical) deadline. In my case, the brick wall occurs only for HRT tasks: the deadline handler kills the task and frees all held resources. (For SRT tasks, the deadline handler cooks the books and decides if the task should be continued "at lesser value".)

[Obviously, the goal is to convert HRT tasks into SRT tasks wherever possible]

> What you essentially are trying to do is predict successful (or not) completion of a resource limited scheduling with an open ended set of processes, priorities and resources. That's an impossible situation: resource scheduling is a bin packing problem that's solvable only in a closed system.
>
> Given a known set of resources and an enumeration of the needs of each process, you can predict completion of any given process. But process priorities complicate scheduling and any predictions of success go out the window the moment a higher priority process enters the mix.

Yes. But in an RT system, processes (tasks/threads/whatever) have temporal constraints -- deadlines. You can exploit these in your scheduling algorithms to ensure resources are where they should be.

> And every process must be able to state the resources it needs to perform any given operation: e.g., I need xxx KB of RAM, nnn file buffers, yyy KB on disk Q, etc. ad nauseam. And if the file system is remote, implicitly add network connections, etc.

Yes: reservations (aka "reserves"). I can ensure "resources" are "ready and waiting" for tasks that place appropriate reservations. So, you needn't block indefinitely waiting for a piece of memory that you need to perform your task (memory in use by some other can't be forcefully freed without restarting that "other").

OTOH, the "CPU cycles" that you need can be reserved for you WHEN you need them -- yet freely given to any other (lower priority) tasks that come along before you do.

Battery power is, of course, related to CPU cycles (if you're burning them, you're eating battery). So, reserves can have some impact on physical resources (memory, battery) required by tasks but minimal impact on other resources.

> This does _not_ have to be done statically prior to execution [that is just the simplest case] ... but to be effective the process must communicate with the scheduler and participate in (re)scheduling whenever its resource needs change.
>
> It is complicated when the needs are ad hoc, like with your file copy where the space on the target device is unknown until the source file is examined. The best you can do in such situations is to preempt the running process until scheduling shows that it can complete, and tell the user so she can halt the process if necessary.

The "visible" aspect is what I am trying to address. I want the user to be able to know (and accept as "reasonable") how ANY such shortage will be handled.

Returning to the file copy (it's easy to internalize), think of how different systems will address this. E.g., copy a 1G file to a volume with < 1G free space and the operation isn't even attempted (Windows). OTOH, copy a *set* of files and it is treated as a (unordered, for all practical purposes via the GUI) set of individual operations, the first of which that fails aborts ALL the remaining. So, an 800M file would copy OK but the next 300M file wouldn't -- and would prevent the 100M file that follows it from being copied as well!

I know I've scurried to make space available when doing big transfers over remote procedures (e.g., FTP), where I can manually delete files that I've spontaneously decided I could "live without" ($TEMP) in order to ensure a long transfer completes. And, I'm sure I could manually start another process that CONSUMES free space after a file copy is started -- and upset that previously started operation ("WTF? There WAS sufficient free space when I started this copy operation. But, fwrite() just signaled an error!")

I.e., the user experience isn't consistent.

>> Note that each action that the user takes implicitly consumes resources and, as such, can hinder other actions that he/she may want to take.
>
> Yes. Either you must defer a new process until scheduling says it can complete, or you introduce a "higher priority" process.

But the user interacts with each of those. So, you've an opportunity to inform the user of the consequences of his actions (e.g., "you are now burning more CPU cycles, so the previous operation will take longer -- but still complete!" OR "you are now consuming more memory, so the previous operation may abend!").

>> And, that your device can also have autonomous resource needs that are incurred alongside the user's actions (e.g., daemons).
>
> Daemon resources should be known and limited. There's a reason, e.g., that only a superuser can take the last process slot.

That's where reserves come into play. The process *will* run as expected (or won't, and will be handled as per the criteria encoded in its deadline handler). I brought this up, here, as a reminder that a system is rarely static; resources that appear to exist NOW can magically disappear, later -- without any deliberate action on the user's part!

> But if you're trying to do resource scheduling, then every process must have known limits.
>
>> Do you, for example, let the user consume battery in a futile attempt to perform some operation -- and end up jeopardizing his ability to do some more valuable operation later (e.g., back up his device before power fails)?
>
> You fix it so only the backup process can invade the last xx% of the battery. Then watch it blow up due to WiFi retries because there is interference or signal strength is poor because the device is too far from the AP.

Again, just offered as an example of what can creep into the above calculus. (And, handled with reserves, in my case.)

>> Again, these are only examples. The question is what criteria do you use for alerting (and/or inhibiting!) the user when you know that it is likely that he won't be able to perform the desired task WITH THE SYSTEM IN ITS CURRENT STATE -- and *when* do you impose those notifications?
>
> If you can implement a reasonably effective resource based scheduler, then you can warn the user that some program can't run now, but will run when <some set of> currently executing processes finish, and does the user want to defer it until then? But as I said previously, everything goes out the window when a higher priority process enters the mix.

If the higher priority process was there all along (and its reserves thus known to the system), it can be addressed when the user task is started. If the *user* starts a higher priority task, then you have another opportunity to inform the user that his current actions will impact his previous actions (or vice versa). The trick is not confusing the user: why won't *this* run, but this other (nearly) identical thing will?

> Even doing that much is incredibly hard. In an open system, too many resource needs are ad hoc, and it may not be possible to even enumerate every resource that *might* be involved. I.e., the scheduler may have to consider not only dynamically appearing resources, but new resources that it didn't even know of yesterday.

Yes. As I can bring more resources online on-demand, the workload scheduler always has a changing mix of resources to evaluate. But it doesn't need to track all of the existing jobs running on the various processors. Rather, it just sees a set of processors with varying capabilities ("surplus resources") as likely candidates for the newest workload to be dispatched. The schedulers on each node then handle the finer-grained scheduling of the resources *on* that node.

[The only magic involved deals with the decision -- by the workload scheduler -- to bring another node on-line and possibly re-shuffle the locations of currently running loads. And, the inverse operation of moving load off of underutilized nodes so they can be powered down.]

> And in a distributed system a centralized scheduler will be a bottleneck, but without it there's no way to predict effects on a shared resource of processes executing on different nodes.

Clump tasks together based on how much they share (IPC vs RPC, SHM vs DSM, etc.). So, any tight coupling is handled within the node and can be ignored by the workload scheduler (hopefully). For simple resources (CPU, battery, memory, etc.) there is no "interactive" sharing but, rather, just a resource limitation that can't be exceeded.

> I know this didn't really help and that you've thought of all or most of it already.

"If it was easy..." :>

Thanks! A colleague sent me some materials on how they design UIs at their shop. Hopefully it will contain some mantras pertinent to this...
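Don's distinction between reserves that can be lent out (CPU cycles, reclaimable on demand) and those that cannot (memory, which stays held) could be sketched as a toy reserve table (all names are invented for illustration):

```python
# Toy model of Don's "reserves": memory reservations are withheld from
# the free pool outright, while reserved CPU share is lent to other
# work whenever the reserving task isn't actually running.
class ReserveTable:
    def __init__(self, total_mem: int):
        self.total_mem = total_mem
        self.mem_reserved = {}   # task -> bytes (never lent out)
        self.cpu_reserved = {}   # task -> share (lent until claimed)

    def reserve(self, task, mem=0, cpu=0.0):
        if mem > self.free_mem():
            raise MemoryError("insufficient unreserved memory")
        self.mem_reserved[task] = mem
        self.cpu_reserved[task] = cpu

    def free_mem(self):
        # memory behind a reserve never appears "available"
        return self.total_mem - sum(self.mem_reserved.values())

    def lendable_cpu(self, now_running):
        # CPU shares reserved by tasks that are NOT currently running
        # can be given to lower-priority work and stolen back later
        return sum(s for t, s in self.cpu_reserved.items()
                   if t not in now_running)
```

The asymmetry is the point: an admission test sees reserved memory as gone, but sees reserved cycles as usable slack until the reserving task wakes.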
Reply by George Neuner ● April 11, 2016
Hi Don,

Sorry for the delay ... busy weekend.

On Fri, 08 Apr 2016 11:58:28 -0700, Don Y <blockedofcourse@foo.invalid> wrote:

>>> E.g., if you were asked to copy a file, you'd probably stat() the file to ascertain its size (*hope* that size remains constant during the following operation) and check to be sure you have that much free-space on the target. I.e., the copy operation would tend to be fast enough that the user couldn't remedy a "no space on device" error before it was signaled.
>>>
>>> The 1960's approach, of course, is just to naively start the operation and then have it abend when it hits that brick wall!
>>
>> That's also the 2016 approach because it's just too complex to figure out for most cases.
>
> In an RT system, it is explicitly known for each task -- as indicated by its (numerical) deadline. In my case, the brick wall occurs only for HRT tasks: the deadline handler kills the task and frees all held resources. (For SRT tasks, the deadline handler cooks the books and decides if the task should be continued "at lesser value".)

The deadline of the RT task is just one of its resource limitations.

> [Obviously, the goal is to convert HRT tasks into SRT tasks wherever possible]
>
>> What you essentially are trying to do is predict successful (or not) completion of a resource limited scheduling with an open ended set of processes, priorities and resources. That's an impossible situation: resource scheduling is a bin packing problem that's solvable only in a closed system.
>>
>> Given a known set of resources and an enumeration of the needs of each process, you can predict completion of any given process. But process priorities complicate scheduling and any predictions of success go out the window the moment a higher priority process enters the mix.
>
> Yes. But in an RT system, processes (tasks/threads/whatever) have temporal constraints -- deadlines. You can exploit these in your scheduling algorithms to ensure resources are where they should be.

But a task that requires more resources than are available can't run regardless of having an execution deadline. The RT aspect of your system is in some sense a diversion ... time really is just another resource constraint on task execution.

>> And every process must be able to state the resources it needs to perform any given operation: e.g., I need xxx KB of RAM, nnn file buffers, yyy KB on disk Q, etc. ad nauseam. And if the file system is remote, implicitly add network connections, etc.
>
> Yes: reservations (aka "reserves"). I can ensure "resources" are "ready and waiting" for tasks that place appropriate reservations. So, you needn't block indefinitely waiting for a piece of memory that you need to perform your task (memory in use by some other can't be forcefully freed without restarting that "other").
>
> OTOH, the "CPU cycles" that you need can be reserved for you WHEN you need them -- yet freely given to any other (lower priority) tasks that come along before you do.

The problem is when those lower priority tasks are still running at the time the higher priority task needs to start. What if, given the needs of the task, there aren't sufficient resources to execute it? Do you kill some lower priority task?

> Battery power is, of course, related to CPU cycles (if you're burning them, you're eating battery).
>
> So, reserves can have some impact on physical resources (memory, battery) required by tasks but minimal impact on other resources.
>
>> This does _not_ have to be done statically prior to execution [that is just the simplest case] ... but to be effective the process must communicate with the scheduler and participate in (re)scheduling whenever its resource needs change.
>>
>> It is complicated when the needs are ad hoc, like with your file copy where the space on the target device is unknown until the source file is examined. The best you can do in such situations is to preempt the running process until scheduling shows that it can complete, and tell the user so she can halt the process if necessary.
>
> The "visible" aspect is what I am trying to address. I want the user to be able to know (and accept as "reasonable") how ANY such shortage will be handled.
>
> Returning to the file copy (it's easy to internalize), think of how different systems will address this. E.g., copy a 1G file to a volume with < 1G free space and the operation isn't even attempted (Windows). OTOH, copy a *set* of files and it is treated as a (unordered, for all practical purposes via the GUI) set of individual operations, the first of which that fails aborts ALL the remaining. So, an 800M file would copy OK but the next 300M file wouldn't -- and would prevent the 100M file that follows it from being copied as well!

Yes, but that's just a failure of imagination on the part of whoever designed the copy mechanism. A group of files can be copied in many different ways: largest->smallest, smallest->largest, alphabetically by name, forward directory order, reverse directory order, inode order, randomly, etc. There are plenty of copy utilities that provide more choices in handling.

But again, the copy example is a distraction ... whatever you come up with has to work for any program and situation. There are too many potential interactions in an open system. Now you'll argue that the system is closed ... but it isn't. Your system is distributed, so only inside a given node can there be a closed system. As soon as you try to deal with off-node resources ... e.g., the shared filesystem, sprinkler valves, etc. 8-) ... the resource scheduling problem becomes exponentially more difficult. And again, RT is only one aspect of the problem.

> I know I've scurried to make space available when doing big transfers over remote procedures (e.g., FTP) where I can manually delete files that I've spontaneously decided I could "live without" ($TEMP) in order to ensure a long transfer completes. And, I'm sure I could manually start another process that CONSUMES free space after a file copy is started -- and upset that previously started operation ("WTF? There WAS sufficient free space when I started this copy operation. But, fwrite() just signaled an error!")
>
> I.e., the user experience isn't consistent.
>
>>> Note that each action that the user takes implicitly consumes resources and, as such, can hinder other actions that he/she may want to take.
>>
>> Yes. Either you must defer a new process until scheduling says it can complete, or you introduce a "higher priority" process.
>
> But the user interacts with each of those. So, you've an opportunity to inform the user of the consequences of his actions (e.g., you are now burning more CPU cycles so the previous operation will take longer -- but still complete! OR you are now consuming more memory so the previous operation may abend!)

Right. Which is why I said the solution in 2016 is no different from the solution in 1960. There are too many variables, and too many of them are hidden.

>>> And, that your device can also have autonomous resource needs that are incurred alongside the user's actions (e.g., daemons).
>>
>> Daemon resources should be known and limited. There's a reason, e.g., that only a superuser can take the last process slot.
>
> That's where reserves come into play. The process *will* run as expected (or won't, and will be handled as per the criteria encoded in its deadline handler). I brought this up, here, as a reminder that a system is rarely static; resources that appear to exist NOW can magically disappear, later -- without any deliberate action on the user's part!

Right, but there's more to the issue of subsystems than just that. They should be suitably limited in their allowed resource use. On most systems you can keep opening sockets until you run out of descriptors. However, if it is known that your system is a server that can handle a maximum of 1000 connections, there is no reason to allow more ... doing so invites DOS failures.

>> But if you're trying to do resource scheduling, then every process must have known limits.
>
>>> Again, these are only examples. The question is what criteria do you use for alerting (and/or inhibiting!) the user when you know that it is likely that he won't be able to perform the desired task WITH THE SYSTEM IN ITS CURRENT STATE -- and *when* do you impose those notifications?
>>
>> If you can implement a reasonably effective resource based scheduler, then you can warn the user that some program can't run now, but will run when <some set of> currently executing processes finish, and does the user want to defer it until then? But as I said previously, everything goes out the window when a higher priority process enters the mix.
>
> If the higher priority process was there all along (and its reserves thus known to the system), it can be addressed when the user task is started.

That doesn't work for tasks which execute in response to outside events. Such tasks can't easily (or at all) timeshare their reserved resources with others - the maximums needed must be kept available for the duration of the task.

>> Even doing that much is incredibly hard. In an open system, too many resource needs are ad hoc, and it may not be possible to even enumerate every resource that *might* be involved. I.e. the scheduler may have to consider not only dynamically appearing resources, but new resources that it didn't even know of yesterday.
>
> Yes. As I can bring more resources online on-demand, the workload scheduler always has a changing mix of resources to evaluate. But, it doesn't need to track all of the existing jobs running on the various processors. Rather, it just sees a set of processors with varying capabilities ("surplus resources") as likely candidates for the newest workload to be dispatched. The schedulers on each node then handle the finer grained scheduling of the resources *on* that node.

Which is fine for cycles - i.e. "compute" resources - but not necessarily so great for others. Does this node have enough memory? How far (network span) is the node from storage? Etc.

> [The only magic involved deals with the decision -- by the workload scheduler -- to bring another node on-line and possibly re-shuffle the locations of currently running loads. And, the inverse operation of moving load off of underutilized nodes so they can be powered down.]
>
>> And in a distributed system a centralized scheduler will be a bottleneck, but without it there's no way to predict effects on a shared resource of processes executing on different nodes.
>
> Clump tasks together based on how much they share (IPC vs RPC, SHM vs DSM, etc.) So, any tight coupling is handled within the node and can be ignored by the workload scheduler (hopefully). For simple resources (CPU, battery, memory, etc.) there is no "interactive" sharing but, rather, just a resource limitation that can't be exceeded.

Unifying to find a local minimum on the task set? Yeesh!!!!

Still, the problem is you need a fairly complete enumeration of the maximum resources needed by every task. That's effectively impossible ... there are too many dynamic and hidden variables.

>> I know this didn't really help and that you've thought of all or most of it already.
>
> "If it was easy..." :>
>
> Thanks! A colleague sent me some materials on how they design UIs at their shop. Hopefully it will contain some mantras pertinent to this...

The UI is secondary to the guidelines for notifying the user. I think meaningful guidelines in your system will necessarily be heuristic and hard to pin down.

YMMV,
George
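George's "cap the server at the 1000 connections it can actually handle" point can be sketched as a toy admission cap (illustrative only; a real server would combine this with the listen() backlog, descriptor limits, and the like):

```python
import threading

# Toy connection cap: refuse the N+1th connection outright rather than
# let descriptor exhaustion take down unrelated parts of the system.
class ConnectionCap:
    def __init__(self, limit: int = 1000):
        self._slots = threading.BoundedSemaphore(limit)

    def try_accept(self) -> bool:
        # non-blocking: False means "refuse now", not "queue and hope"
        return self._slots.acquire(blocking=False)

    def release(self):
        # called when a connection closes, returning its slot
        self._slots.release()
```

The BoundedSemaphore also catches a mismatched release() (more releases than accepts), which is the kind of accounting bug that silently raises the effective limit.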
Reply by Don Y ● April 11, 2016
Hi George, On 4/10/2016 10:47 PM, George Neuner wrote:> Sorry for the delay ... busy weekend.Ditto. Big time rain so I took advantage of the softer soil to pull up the wildflowers. Tonight making choc covered almonds for the chocoholic. <frown> Shes had a rough couple of weeks so I figure worth a few hours of my time...>>>> E.g., if you were asked to copy a file, you'd probably stat() the >>>> file to ascertain its size (*hope* that size remains constant >>>> during the following operation) and check to be sure you have that >>>> much free-space on the target. I.e., the copy operation would >>>> tend to be fast enough that the user couldn't remedy a "no space >>>> on device" error before it was signaled. >>>> >>>> The 1960's approach, of course, is just to naively start the operation >>>> and then have it abend when it hits that brick wall! >>> >>> That's also the 2016 approach because it's just too complex to figure >>> out for most cases. >> >> In an RT system, it is explicitly known for each task -- as indicated >> by it's (numerical) deadline. In my case, the brick wall occurs only >> for HRT tasks: deadline handler kills task and frees all held resources. >> (for SRT tasks, deadline handler cooks the books and decides if the task >> should be continued "at lesser value"). > > The deadline of the RT task is just one of its resource limitations.It's the point at which you "hit the brick wall". I.e., if you haven't acquired the resources you need by then, you die (if HRT). So, you implicitly have a timespan in which to alert the user to any anticipated difficulties.>> [Obviously, the goal is to convert HRT tasks into SRT tasks wherever >> possible] >> >>> What you essentially are trying to do is predict successful (or not) >>> completion of a resource limited scheduling with an open ended set of >>> processes, priorities and resources. That's an impossible situation: >>> resource scheduling is a bin packing problem that's solvable only in a >>> closed system. 
>>> >>> Given a known set of resources and a enumeration of the needs of each >>> process, you can predict completion of any given process. But process >>> priorities complicate scheduling and any predictions of success go out >>> the window the moment a higher priority process enters the mix. >> >> Yes. But in an RT system, processes (tasks/threads/whatever) have temporal >> constraints -- deadlines. You can exploit these in your scheduling algorithms >> to ensure resources are where they should be. > > But a task that requires more resources than are available can't run > regardless of having an execution deadline. The RT aspect of your > system is in some sense a diversion ... time really is just another > resource constraint on task execution.If the resources aren't available when the task is "invoked" (by the user), then you can tell the user that the task can't be started (made ready). What you have to be concerned with is resources that can "slip away" after the task has been accepted for execution. This poses a UI problem as this can happen some time after the user's attention has moved away from the "start task A" activity -- you effectively have to tell him that something he THOUGHT was going to proceed "OK" some time ago will *not*; based on new information that you (the system) now have.>>> And every process must be able to state the resources it needs to >>> perform any given operation: e.g., I need xxx KB of RAM, nnn file >>> buffers, yyy KB on disk Q, etc. ad nauseam. And if the file system is >>> remote, implicitly add network connections, etc. >> >> Yes: reservations (aka "reserves"). I can ensure "resources" are >> "ready and waiting" for tasks that place appropriate reservations. >> So, you needn't block indefinitely waiting for a piece of memory >> that you need to perform your task (memory in use by some other >> can't be forcefully freed without restarting that "other"). 
>>
>> OTOH, the "CPU cycles" that you need can be reserved for you WHEN you
>> need them -- yet freely given to any other (lower priority) tasks that
>> come along before you do.
>
> The problem is when those lower priority tasks are still running at
> the time the higher priority task needs to start.  What if, given the
> needs of the task, there aren't sufficient resources to execute it?  Do
> you kill some lower priority task?

That's where the reserves come into play.  Conceptually, the reserved
resources don't appear as "available".  So, when a task tries to be
started, you look and see that there are insufficient resources and don't
let it start.

CPU cycles are a special case -- you CAN make them available to other
tasks (other than the task that has placed the reserve on them) as you
can quickly "steal them back" when the reserving task actually starts.

[This can cause the expected completion time of the "low priority" (for
want of a better name) task to slip further into the future as it will
now get less work done per unit time.  This raises the "how do I tell the
user that his task -- started some time earlier under 'better
conditions' -- is now potentially compromised?"  Do I complain, NOW?  Or,
do I see if conditions will improve in the future, thereby giving it a
better chance of meeting its deadline?]

Memory, OTOH, is a physical resource that can't be "rescinded" when the
task that has reserved it becomes ready.  So, it can't be given to the
"lower priority" task even though it looks like it is PRESENTLY unused.

[An exception is that the memory can be allocated as "anonymous" memory
that can actually benefit the lower priority task(s) -- *if* the system
allocates it (i.e., NOT the task in question) for roles that the system
can effectively rescind.  E.g., the system can allocate it in place of VM
to allow that low priority task to make fewer round trips to the backing
store.
When the task that has reserved the memory comes ready, the system has to
figure out how to free up that memory -- flush to backing store if not
already there *or* just free the pages if they are already available on
the backing store.]

>> Battery power is, of course, related to CPU cycles (if you're burning
>> them, you're eating battery).
>>
>> So, reserves can have some impact on physical resources (memory,
>> battery) required by tasks but minimal impact on other resources.
>>
>>> This does _not_ have to be done statically prior to execution [that is
>>> just the simplest case] ... but to be effective the process must
>>> communicate with the scheduler and participate in (re)scheduling
>>> whenever its resource needs change.
>>>
>>> It is complicated when the needs are ad hoc, like with your file copy
>>> where the space on the target device is unknown until the source file
>>> is examined.  The best you can do in such situations is to preempt the
>>> running process until scheduling shows that it can complete, and tell
>>> the user so she can halt the process if necessary.
>>
>> The "visible" aspect is what I am trying to address.  I want the user
>> to be able to know (and accept as "reasonable") how ANY such shortage
>> will be handled.
>>
>> Returning to the file copy (it's easy to internalize), think of how
>> different systems will address this.  E.g., copy a 1G file to a volume
>> with < 1G free space and the operation isn't even attempted (Windows).
>> OTOH, copy a *set* of files and it is treated as an (unordered, for all
>> practical purposes via the GUI) set of individual operations, the first
>> of which that fails aborts ALL the remaining.  So, an 800M file would
>> copy OK but the next 300M file wouldn't -- and would prevent the 100M
>> file that follows it from being copied as well!
>
> Yes, but that's just a failure of imagination on the part of whoever
> designed the copy mechanism.
> A group of files can be copied in many different ways:
> largest->smallest, smallest->largest, alphabetically by name, forward
> directory order, reverse directory order, inode order, randomly, etc.
> There are plenty of copy utilities that provide more choices in
> handling.

Exactly.  The UI didn't see consistency as a design goal.  Had the copies
all been treated as transactional, then the behavior for a single file
copy (pass/fail) could be identical to the behavior of the "group" file
copy.

> But again, the copy example is a distraction ... whatever you come up
> with has to work for any program and situation.
>
> There are too many potential interactions in an open system.  Now
> you'll argue that the system is closed ... but it isn't.  Your system
> is distributed, so only inside a given node can there be a closed
> system.  As soon as you try to deal with off-node resources ... e.g.,
> the shared filesystem, sprinkler valves, etc. 8-) ... the resource
> scheduling problem becomes exponentially more difficult.
>
> And again, RT is only one aspect of the problem.

The workload scheduler looks at the system as a whole.  It knows the
resources available at each node (MIPS, RAM, etc.) at any particular
instant.  This reflects the "in use" resources as well as the resources
that have been reserved on that node (possibly for a task that isn't
running anywhere in the system, *now*).  The scheduler on each node
handles finer-grained resource scheduling for that node -- because it
knows what the resource requirements are for the tasks (and objects) *on*
that node.

The workload scheduler can opt to allocate all of the reserves on
nodes that are currently not powered up.
(This is the most efficient in terms of "hardware in use" criteria.)
If a task has to be activated that requires X resources and those
resources are not available *now* on a running node, then the powered
down node on which they have been (conceptually) "allocated" by the
workload scheduler is powered up and those "real" resources are now
available for that task.  The surplus "real" resources on that node are
available for immediate use by any OTHER task that comes online (or that
is already running, somewhere).

[E.g., the workload scheduler can opt to move some task to this node to
allow the tasks executing on that other node to run more effectively]

>>>> Note that each action that the user takes implicitly consumes
>>>> resources and, as such, can hinder other actions that he/she may
>>>> want to take.
>>>
>>> Yes.  Either you must defer a new process until scheduling says it can
>>> complete, or you introduce a "higher priority" process.
>>
>> But the user interacts with each of those.  So, you've an opportunity
>> to inform the user of the consequences of his actions (e.g., you are
>> now burning more CPU cycles so the previous operation will take longer
>> -- but still complete!  OR you are now consuming more memory so the
>> previous operation may abend!)
>
> Right.  Which is why I said the solution in 2016 is no different from
> the solution in 1960.  There are too many variables and too many of
> them are hidden.

But they aren't!  That's the point of the reservations.  You write a
program to run on box X under OS Y.  Is it a *surprise* to you when you
actually run the program and see that it needs more memory, CPU, etc.
than box X can provide?  Are you surprised that the time required to
compute the next Fibonacci number is many days?  Are you surprised that
stack penetration exceeds the amount of RAM available in the box?
Of course not!  You've engineered your solution to require resources
A, B and C.
And, you know how much those will "bend" based on the actual hardware
characteristics (i.e., if the clock is slowed down, the amount of time
required will increase; if the physical memory is decreased, VM
requirements will increase -- along with the execution time required to
use that VM).

The workload scheduler effectively creates a machine on each physical
node with the requirements of the task in mind.  When the workload
scheduler has allocated all the resources available in the system, then
the system is running at capacity.  Just like a single "program" using
all of the available resources on a physical machine (X) tailored to its
needs.

>>>> And, that your device can also have autonomous resource needs that
>>>> are incurred alongside the user's actions (e.g., daemons).
>>>
>>> Daemon resources should be known and limited.  There's a reason, e.g.,
>>> that only a superuser can take the last process slot.
>>
>> That's where reserves come into play.  The process *will* run as
>> expected (or won't, and will be handled as per the criteria encoded in
>> its deadline handler).  I brought this up, here, as a reminder that
>> a system is rarely static; resources that appear to exist NOW can
>> magically disappear, later -- without any deliberate action on the
>> user's part!
>
> Right, but there's more to the issue of subsystems than just that.
> They should be suitably limited in their allowed resource use.

Yes.  You reserve what you need.  If you don't need it, reserving it is
"unsportsman-like" -- just like allocating 10MB for a buffer that will be
used to store a single float.  "The Market" eventually penalizes your
sh*tty implementation -- by NOT running it!

The system is accommodating in that it WILL let you effectively use
resources that you don't "need" (e.g., surplus memory for backing store,
surplus CPU cycles, etc.)
because it makes no sense to discard (make unavailable) those resources
when they *can* be used -- when the only cost of doing so lies in system
complexity (i.e., the task doesn't see this cost).

> On most systems you can keep opening sockets until you run out of
> descriptors.  However, if it is known that your system is a server that
> can handle a maximum of 1000 connections, there is no reason to allow
> more ... doing so invites DOS failures.

Yes.  You'd not reserve 1001 sockets if you only need 1000.  And, you'd
KNOW that you *would* have 1000 sockets available to you -- whether you
needed them "up front" or "down the road".

You wouldn't reserve N MIPS-seconds/interval (I'm still struggling to
come up with a unit of measure for "workload") if you only needed half of
that.  (If you need it for half as long, then specify an interval that is
half as long.)

With all of these "requirements" laid out, the workload scheduler can
know what "minimum" resources are required at each node -- based on the
tasks that it has dispatched to those nodes.  Anything left over (on
powered UP nodes as well as powered DOWN nodes) is available for future
tasks -- "reserves" plus "surplus".

>>> But if you're trying to do resource scheduling, then every process
>>> must have known limits.
>>
>>>> Again, these are only examples.  The question is what criteria do you
>>>> use for alerting (and/or inhibiting!) the user when you know that it
>>>> is likely that he won't be able to perform the desired task WITH THE
>>>> SYSTEM IN ITS CURRENT STATE -- and *when* do you impose those
>>>> notifications?
>>>
>>> If you can implement a reasonably effective resource based scheduler,
>>> then you can warn the user that some program can't run now, but will
>>> run when <some set of> currently executing processes finish, and does
>>> the user want to defer it until then?  But as I said previously,
>>> everything goes out the window when a higher priority process enters
>>> the mix.
>>
>> If the higher priority process was there all along (and its reserves
>> thus known to the system), it can be addressed when the user task is
>> started.
>
> That doesn't work for tasks which execute in response to outside
> events.  Such tasks can't easily (or at all) timeshare their reserved
> resources with others - the maximums needed must be kept available for
> the duration of the task.

Yes.  "Reserves".  If you assume all of the nodes are powered up 24/7 and
then visualize them as a single processor, then that processor's
resources are managed (by the workload scheduler in concert with the
individual per-node schedulers) so that the RESERVED resources for
any/all tasks are sitting there, ready to be used.

In my case, there are delays involved as I may have to bring up a node
and/or move existing tasks around based on some "outside event".  But,
these are fixed constants that can be factored into the scheduling
decisions -- just like loading a task from disk, etc.

E.g., if someone approaches the front door, the "doorbell" task needs to
be "made ready".  If not already present on a real node at this time,
then it must be loaded from the persistent store and activated on *some*
node.  If the node on which it should be activated is not currently
powered up, then a node must be powered up before the code can be
installed on the node, etc.  But, all of the resources that it requires
are available *somewhere* in the system -- because they have been
RESERVED.

Note that this can cause some "low priority" task to be swapped out (if
it was using "physical" resources that have been reserved for that
doorbell task).  So, the low priority task now is in danger of not
meeting its deadline -- in a manner similar to losing processor share to
a higher priority task (in a single CPU system).

There's no way to address requirements that exceed capabilities.  So, the
sum of all of the reservations must be compatible with the resources
available in the system as a whole.
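[Editorial aside: the reservation bookkeeping described here -- reserved resources simply don't appear as "available", and a task is refused at activation if the sum of all reservations would exceed system capacity -- might be sketched as follows.  The class and method names are invented for illustration and do not describe the actual system:]

```python
class ReservationLedger:
    """Toy sketch of admission-by-reservation: a reservation is granted
    only if, together with all existing reserves, it still fits within
    total capacity.  Reserved amounts are invisible to later askers."""

    def __init__(self, capacity):
        self.capacity = dict(capacity)   # e.g. {"ram_kb": 64}
        self.reserved = {}               # task -> {resource: amount}

    def available(self, resource):
        # Capacity minus everything already reserved (whether or not the
        # reserving task is running anywhere, *now*).
        held = sum(r.get(resource, 0) for r in self.reserved.values())
        return self.capacity.get(resource, 0) - held

    def try_reserve(self, task, needs):
        # Refuse up front rather than letting the task start and abend.
        if all(self.available(res) >= amt for res, amt in needs.items()):
            self.reserved[task] = dict(needs)
            return True
        return False
```

[This also illustrates George's over-commit objection: with 64KB of capacity, two tasks each reserving 48KB cannot both be admitted -- the second is turned away when invoked.]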
In much the same way, N tasks executing on a single processor are
constrained to use only the resources available on that single processor.
This is something that you know at design time.  E.g., Windows makes
reservations for itself.  If you try to run more "programs" than the
available hardware can support, *Windows* still runs but your programs
(none of which have contractual guarantees!) suffer.

>>> Even doing that much is incredibly hard.  In an open system, too many
>>> resource needs are ad hoc, and it may not be possible to even
>>> enumerate every resource that *might* be involved.  I.e. the scheduler
>>> may have to consider not only dynamically appearing resources, but new
>>> resources that it didn't even know of yesterday.
>>
>> Yes.  As I can bring more resources online on-demand, the workload
>> scheduler always has a changing mix of resources to evaluate.  But, it
>> doesn't need to track all of the existing jobs running on the various
>> processors.  Rather, it just sees a set of processors with varying
>> capabilities ("surplus resources") as likely candidates for the newest
>> workload to be dispatched.  The schedulers on each node then handle the
>> finer grained scheduling of the resources *on* that node.
>
> Which is fine for cycles - i.e. "compute" resources - but not
> necessarily so great for others.  Does this node have enough memory?
> How far (network span) is the node from storage?  Etc.

See above.

>> [The only magic involved deals with the decision -- by the workload
>> scheduler -- to bring another node on-line and possibly re-shuffle the
>> locations of currently running loads.  And, the inverse operation of
>> moving load off of underutilized nodes so they can be powered down.]
>>
>>> And in a distributed system a centralized scheduler will be a
>>> bottleneck, but without it there's no way to predict effects on a
>>> shared resource of processes executing on different nodes.
>>
>> Clump tasks together based on how much they share (IPC vs RPC, SHM vs
>> DSM, etc.)  So, any tight coupling is handled within the node and can
>> be ignored by the workload scheduler (hopefully).  For simple resources
>> (CPU, battery, memory, etc.) there is no "interactive" sharing but,
>> rather, just a resource limitation that can't be exceeded.
>
> Unifying to find a local minimum on the task set?  Yeesh!!!!
>
> Still, the problem is you need a fairly complete enumeration of the
> maximum resources needed by every task.  That's effectively impossible
> ... there are too many dynamic and hidden variables.

I disagree.  Don't you know how many resources YOUR programs require?
If you design a microwave oven, do you expect it to signal an "out of
memory" error some months after a consumer has purchased it?

The same applies to "open" systems; when you try to run too many programs
on your desktop, you become disappointed with the performance of the
system and learn not to try to do X, Y and Z at the same time.  The
difference is, there is no way for you to tell your PC that X and Z are
important and Y should just get table scraps.  *You* have to explicitly
do that by terminating those things that you *think* are eating more
resources than they deserve.

>>> I know this didn't really help and that you've thought of all or most
>>> of it already.
>>
>> "If it was easy..."  :>
>>
>> Thanks!  A colleague sent me some materials on how they design UIs at
>> their shop.  Hopefully it will contain some mantras pertinent to this...
>
> The UI is secondary to the guidelines for notifying the user.  I think
> meaningful guidelines in your system will necessarily be heuristic and
> hard to pin down.

I think some generalizations can probably be made.  E.g., "small efforts"
with long deadlines can probably be accepted and allowed to "struggle"
(if needed) for a long time before complaining to the user.
Other "large efforts" with short deadlines should probably be turned away if the resources are not available at the time the task is activated. I think I can also "take notes" as to how things turn out and use that to learn which tasks might defy the odds. E.g., diarization and voice characterization is expensive ("large effort"). OTOH, it has no hard deadline -- it can be done "offline" and as a low priority task (even though it can require a boatload of resources!). It can tolerate being swapped out -- even for a really low priority task! -- without affecting performance. The only real deadline is the next time that speaker is encountered (which might be immediately -- or never!) And, it can also be killed! That decision can be handled by a "user level" task. E.g., if we already have a voice characterization for that person on hand, we might elect to abandon that effort if some *other* voice needs to be characterized (that we don't have already). But, in those cases, the "user" is the system itself (an intelligent agent acting on its behalf) Likewise, if the doorbell task's deadline handler sees that the bell isn't handled before the initial deadline (SRT so the deadline handler can afford to give the task a "second chance"), the system can decide the task (and its reserves) should be "on-line" at all times (instead of loading from persistent store into a recently powered up node). Ditto for incoming phone calls, etc. I.e., for places that see lots of visitors but few (phone) callers, you'd allocate (online) resources differently than places that have few visitors but lots of callers. <shrug> Dunno. I think I'll just have to make a best (educated) guess at how to convey this information to the user and hope it's "least surprising". _P&tB_ marathon tonight. Time to rot my brain! :> --don
Reply by ● April 12, 2016
On Mon, 11 Apr 2016 02:06:03 -0700, Don Y <blockedofcourse@foo.invalid>
wrote:

>On 4/10/2016 10:47 PM, George Neuner wrote:
>
>> The deadline of the RT task is just one of its resource limitations.
>
>It's the point at which you "hit the brick wall".  I.e., if you haven't
>acquired the resources you need by then, you die (if HRT).  So, you
>implicitly have a timespan in which to alert the user to any anticipated
>difficulties.

To a point ... the problem is that there are multiple deadlines involved
for a single task: the deadline by which it must finish (obviously) but
also the deadline by which it must have acquired its resources and
*started*.  But when there are multiple resources, and time necessary to
acquire them, you have a situation where:

 - if I can't get memory by T1 ...
 - if I can't get a network connection by T2 ...
 - if I can't do ... by T3 ...
 :
etc. ad nauseam.  Where is the "drop dead" point?

>>> [Obviously, the goal is to convert HRT tasks into SRT tasks wherever
>>> possible]

A reasonable thing to do whenever possible.

>> But a task that requires more resources than are available can't run
>> regardless of having an execution deadline.  The RT aspect of your
>> system is in some sense a diversion ... time really is just another
>> resource constraint on task execution.
>
>If the resources aren't available when the task is "invoked" (by the
>user), then you can tell the user that the task can't be started (made
>ready).  What you have to be concerned with is resources that can "slip
>away" after the task has been accepted for execution.  This poses a UI
>problem as this can happen some time after the user's attention has
>moved away from the "start task A" activity

Yes.  The problem though is that you can't necessarily account for all
the resources that may "slip away".  Disk space, network bandwidth,
battery life ... all are things you can't reserve.  Ok, you *can* reserve
disk space if you know how much you'll need.
But many programs that produce data don't know up front.  And no,
priority token networks do not guarantee that you will get any specified
bandwidth - they only guarantee that you will get access according to
your priority, and then only if the network is still functional when that
time comes.  And with dynamic nodes, any time the network reconfigures
(if it does) all your access time calculations go out the window.

>-- you effectively have to tell him that something he THOUGHT was going
>to proceed "OK" some time ago will *not*; based on new information that
>you (the system) now have.

Worse, the user may have to clean up a partly finished mess.  Unless you
can undo everything and anything [e.g., transactional ala IBM's
Quicksilver OS].

>>> ... reservations (aka "reserves").  I can ensure "resources" are
>>> "ready and waiting" for tasks that place appropriate reservations.
>>> So, you needn't block indefinitely waiting for a piece of memory
>>> that you need to perform your task (memory in use by some other
>>> can't be forcefully freed without restarting that "other").
>>
>> The problem is when those lower priority tasks are still running at
>> the time the higher priority task needs to start.  What if, given the
>> needs of the task, there aren't sufficient resources to execute it?  Do
>> you kill some lower priority task?
>
>That's where the reserves come into play.  Conceptually, the reserved
>resources don't appear as "available".  So, when a task tries to be
>started, you look and see that there are insufficient resources and
>don't let it start.

But you have been talking about over-committing resources: e.g., 2
programs each need up to 48KB but there's only 64KB available.  Great if
the 2 programs never need to run simultaneously, but impossible
otherwise.

>[This can cause the expected completion time of the "low priority" (for
>want of a better name) task to slip further into the future as it will
>now get less work done per unit time.
>This raises the "how do I tell the user that his task -- started some
>time earlier under 'better conditions' -- is now potentially
>compromised?"  Do I complain, NOW?  Or, do I see if conditions will
>improve in the future thereby giving it a better chance of meeting its
>deadline?]

That depends greatly on the UI.  E.g., a flurry of console error messages
about things the user started 17 minutes ago may not even make sense
depending on the user's current state of consciousness.  Some kind of
graphic where you can represent the task as an object in a colored heat
map would draw more attention to problems.  Everything green?  Good.

>Memory, OTOH, is a physical resource that can't be "rescinded" when the
>task that has reserved it becomes ready.  So, it can't be given to the
>"lower priority" task even though it looks like it is PRESENTLY unused.

Right, but what about the high(er) priority task that needs to run now
but wasn't active and so hasn't reserved memory?  Or do you figure
reservations at install and limit concurrent instances?

>[An exception is that the memory can be allocated as "anonymous" memory
>that can actually benefit the lower priority task(s) -- *if* the system
>allocates it (i.e., NOT the task in question) for roles that the system
>can effectively rescind.  E.g., the system can allocate it in place of
>VM to allow that low priority task to make fewer round trips to the
>backing store.  When the task that has reserved the memory comes ready,
>the system has to figure out how to free up that memory -- flush to
>backing store if not already there *or* just freeing the pages if they
>are already available on the backing store.]

Distributed memory.  Need more address space, activate another node.
<grin>

>>> The "visible" aspect is what I am trying to address.  I want the user
>>> to be able to know (and accept as "reasonable") how ANY such shortage
>>> will be handled.

Yeah, but what shortages exist and what can be done about them are
constantly changing.
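[Editorial aside: George's heat-map suggestion amounts to mapping each task's remaining deadline slack to a color, instead of spewing console errors minutes after the user's attention has moved on.  A minimal sketch, with arbitrary cutoffs chosen purely for illustration:]

```python
def task_color(slack_fraction):
    """Map a task's remaining slack (fraction of its original deadline
    margin still unspent, 0.0..1.0) to a heat-map color.  The 0.5 and
    0.1 thresholds are invented example values."""
    if slack_fraction >= 0.5:
        return "green"    # comfortably on schedule: nothing to say
    if slack_fraction >= 0.1:
        return "yellow"   # compromised since it was started; draw the eye
    return "red"          # likely to miss its deadline under current load
```

[A display built on this answers the "do I complain NOW?" question passively: the user glances at the map when (and if) he cares, rather than being interrupted.]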
You need some way to represent the state of the whole (distributed)
system at once.

>The workload scheduler looks at the system as a whole.  It knows the
>resources available at each node (MIPS, RAM, etc.) at any particular
>instant.  This reflects the "in use" resources as well as the resources
>that have been reserved on that node (possibly for a task that isn't
>running anywhere in the system, *now*).

What if no active node can handle the task now?  E.g., too much CPU
crunching.  And what if there's no inactive node that can be brought up
to run it (wrong CPUs, not enough RAM, etc.)?

>The workload scheduler can opt to allocate all of the reserves on nodes
>that are currently not powered up.  (This is the most efficient in terms
>of "hardware in use" criteria.)  If a task has to be activated that
>requires X resources and those resources are not available *now* on a
>running node, then the powered down node on which they have been
>(conceptually) "allocated" by the workload scheduler is powered up and
>those "real" resources are now available for that task.  The surplus
>"real" resources on that node are available for immediate use by any
>OTHER task that comes online (or that is already running, somewhere)
>
>[E.g., the workload scheduler can opt to move some task to this node to
>allow the tasks executing on that other node to run more effectively]

This kind of distributed "meta-scheduling" is a mess.  Quite a bit of
research - no really good solutions.  Task migration ability doesn't
actually make meta-scheduling easier - in fact it makes it harder as
there are more potential states to consider.  Fortunately, dynamically
powering up/down nodes doesn't really change the complexity if the goal
is to minimize the number of CPUs in use.

>> ... Which is why I said the solution in 2016 is no different from
>> the solution in 1960.  There are too many variables and too many of
>> them are hidden.
>
>But they aren't!
>That's the point of the reservations.

How are you reserving sockets (ports, whatever) for priority programs to
use?  How are you reserving bandwidth in your network?  How are you
reserving space on the NAS?  How are you reserving battery charge?

There always are hidden variables in a distributed system.  No matter how
hard you try, there *always* will be something you can't account for.
You're deluding yourself if you think you can.

Since you're playing with expert systems: a hidden state ANN is able to
take into account unknowns in the system - but it can't identify the
unknowns or tell you which of them is out of whack so that you could tell
a user.  There's no free lunch.

>The workload scheduler effectively creates a machine on each physical
>node with the requirements of the task in mind.  When the workload
>scheduler has allocated all the resources available in the system, then
>the system is running at capacity.  Just like a single "program" using
>all of the available resources on a physical machine (X) tailored to its
>needs.

But it's over-committing "machines" on the nodes.  M1 is active on nodeA,
so M2 can't run simultaneously on nodeA.  Both M1 and M2 could run
simultaneously on nodeB, but nodeB is powered down.  So I power up nodeB
and start M2.  But do I leave M1 where it is on nodeA?  It will run
faster if I leave it, but running both nodes uses more power.  It's more
efficient if I move M1 to nodeB and power down nodeA.  Meta-scheduling.

>With all of these "requirements" laid out, the workload scheduler can
>know what "minimum" resources are required at each node -- based on the
>tasks that it has dispatched to those nodes.  Anything left over (on
>powered UP nodes as well as powered DOWN nodes) is available for future
>tasks -- "reserves" plus "surplus"

I'm saying that you can't account for all the requirements.
You only *think* you can.

>If you assume all of the nodes are powered up 24/7 and then visualize
>them as a single processor, then that processor's resources are managed
>(by the workload scheduler in concert with the individual per-node
>schedulers) so that the RESERVED resources for any/all tasks are sitting
>there, ready to be used.

I understand.  That works only if all the CPUs are homogeneous.  Your
virtual machine architecture smooths over some of the differences but
does not eliminate them - native programs require their particular
environment.  And VM programs are highly unlikely to be RT (even low
latency SRT) unless the VM JIT compiles - which is just more memory use
you can't account for.

>In my case, there are delays involved as I may have to bring up a node
>and/or move existing tasks around based on some "outside event".  But,
>these are fixed constants that can be factored into the scheduling
>decisions -- just like loading a task from disk, etc.

That isn't a problem.

>E.g., if someone approaches the front door, the "doorbell" task needs to
>be "made ready".  If not already present on a real node at this time,
>then it must be loaded from the persistent store and activated on *some*
>node.  If the node on which it should be activated is not currently
>powered up, then a node must be powered up before the code can be
>installed on the node, etc.  But, all of the resources that it requires
>are available *somewhere* in the system -- because they have been
>RESERVED.
>
>Note that this can cause some "low priority" task to be swapped out (if
>it was using "physical" resources that have been reserved for that
>doorbell task).  So, the low priority task now is in danger of not
>meeting its deadline -- in a manner similar to losing processor share to
>a higher priority task (in a single CPU system).
>
>There's no way to address requirements that exceed capabilities.
>So, the sum of all of the reservations must be compatible with the
>resources available in the system as a whole.  In much the same way,
>N tasks executing on a single processor are constrained to use only the
>resources available on that single processor.

Which means reservations (at least many of them) have to be made
statically at install, and there can't ever be N+1 instances running when
only N reservations have been made.

>This is something that you know at design time.  E.g., Windows makes
>reservations for itself.  If you try to run more "programs" than the
>available hardware can support, *Windows* still runs but your programs
>(none of which have contractual guarantees!) suffer.

Yes, but your system is "open" in the sense that you want others to
design for it.  You can't expect the same attention to detail from
someone just dabbling in home automation with your system.  You have to
expect that other people won't be able to design a new component so that
it plays nicely with everything else.

>>> Clump tasks together based on how much they share (IPC vs RPC, SHM vs
>>> DSM, etc.)  So, any tight coupling is handled within the node and can
>>> be ignored by the workload scheduler (hopefully).  For simple
>>> resources (CPU, battery, memory, etc.) there is no "interactive"
>>> sharing but, rather, just a resource limitation that can't be
>>> exceeded.
>>
>> Unifying to find a local minimum on the task set?  Yeesh!!!!
>>
>> Still, the problem is you need a fairly complete enumeration of the
>> maximum resources needed by every task.  That's effectively impossible
>> ... there are too many dynamic and hidden variables.
>
>I disagree.  Don't you know how many resources YOUR programs require?

No.  Because I am currently working mainly with a JIT compiling VM
implementation of a GC'd language.  It has both kernel and green
threading, and in either case the threads are just abstract CPUs with no
way to tell how many resources they are consuming.  Oops ...
so are your users programming a VM. I have an loose idea of how much memory the VM needs to run my program - it is difficult to bound tightly - and similarly I have some idea of the space used by the process ... ... but I have no idea what's happening inside the VM when I, e.g., open a TCP connection - how much memory or how many CPU cycles are dedicated to the illusion of the connection as a "file stream". I could find out ... by searching through source code for the VM and its libraries ... but that still wouldn't tell me what Windows or Linux is doing underneath the VM. Again, for Linux I could find out by wading through source code. For Windows I'm out of luck. Similarly your users will have only a vague notion of what their programs cost in terms of the system services they use. Even the designers of native applications (new components, etc,) are not necessarily going to know what resources are consumed by system services their programs may try to use. George
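The admission-control scheme debated above -- static reservations whose sum must fit within total capacity, with an N+1th instance refused -- can be reduced to a small sketch. The resource names, capacities, and task names are invented for illustration; they are not the actual system's tables.

```python
# Hedged sketch of reservation-based admission control: each task states
# a static reservation, and an instance is admitted only if the sum of
# all outstanding reservations stays within total system capacity.

class Admitter:
    def __init__(self, capacity):
        self.capacity = dict(capacity)   # e.g. {"cpu": 2.0, "mem_mb": 1024}
        self.admitted = []               # list of (name, reservation) pairs

    def try_admit(self, name, reservation):
        # Reject if ANY resource would be oversubscribed.
        for res, amount in reservation.items():
            in_use = sum(r.get(res, 0) for _, r in self.admitted)
            if in_use + amount > self.capacity.get(res, 0):
                return False             # the N+1th instance is refused
        self.admitted.append((name, reservation))
        return True

adm = Admitter({"cpu": 2.0, "mem_mb": 1024})
print(adm.try_admit("doorbell", {"cpu": 0.5, "mem_mb": 256}))  # True
print(adm.try_admit("camera",   {"cpu": 1.5, "mem_mb": 512}))  # True
print(adm.try_admit("extra",    {"cpu": 0.5, "mem_mb": 256}))  # False (cpu)
```

George's objection maps directly onto the `reservation` argument: the mechanism is trivial, but filling in honest worst-case numbers for a JIT-compiled, GC'd program sitting on an opaque OS is the part that is "effectively impossible".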
Reply by ●April 12, 2016
On 4/12/2016 12:22 AM, George Neuner wrote:

[offlist as much of this is implementation specific -- before
someone complains about length (yet can't restrain themselves
from reading it! :> )]
Reply by ●April 12, 2016
On 12.4.2016 14:22, Don Y wrote:
> On 4/12/2016 12:22 AM, George Neuner wrote:
>
> [offlist as much of this is implementation specific -- before
> someone complains about length (yet can't restrain themselves
> from reading it! :> )]

Hi Don,

If someone complains, why not just ignore him? I did not read the
messages entirely, but I did read most of them and enjoyed the posts of
you both.

Dimiter
Reply by ●April 12, 2016
Hi Dimiter,

On 4/12/2016 5:39 AM, Dimiter_Popoff wrote:
> If someone complains why not just ignore him. I did not read the
> messages entirely but I did read most of them and enjoyed the posts
> of you both.

George and I correspond. There are details that I will explain to him
that I wouldn't post publicly. And, having a bit of a "history", I know
which things I need to "explain" and which he'll understand without
further exposition. Likewise, I can relate ideas to his work (which I'd
respectfully not disclose without his prior public disclosure) or some
of my prior work. He also has a healthy understanding of the lengths to
which I take some of my approaches/implementations, having previously
discussed some of those with him.

Taking lengthy/detailed discussions offlist just makes it easier to talk
without having to consider everything one says/types and how third
parties are receiving it.

Second day of rain, here. I may have to move to your neck of the woods
for WARMTH! Or, is it STILL freezing? (does it EVER warm up, there?) ;-)

Regards to L,
--don
Reply by ●April 13, 2016
On Thu, 07 Apr 2016 23:10:04 +0200, Don Y <blockedofcourse@foo.invalid> wrote:

> *IN GENERAL*, what dogma would you suggest

I suggest rejecting all dogma. (I can suggest doctrines, however.)

> regarding addressing
> potential ("current") resource inadequacies when starting a task
> (or, offering that capability *to* start that task to a user)?

Depends who is initiating the task and who is responsible for resource
availability.

1. In an open, user-controlled environment, leave it to the user. This
   means giving the user the tools to determine resource usage and the
   option to shoot himself in the foot.
2. In more closed/rigid environments, make a best effort to inform the
   user. If system damage is possible, prevent it.

> Keep in mind that resource availability varies before, during and
> after a potential task/activity is initiated. And, a particular
> activity may result in other activities (automatically started
> or likely to *want* to be started -- by the user).
>
> E.g., if you were asked to copy a file, you'd probably stat() the
> file to ascertain its size (*hope* that size remains constant
> during the following operation)

Or lock the file.

> and check to be sure you have that
> much free-space on the target.

Or reserve the space.

> I.e., the copy operation would
> tend to be fast enough that the user couldn't remedy a "no space
> on device" error before it was signaled.

What if you're recursively copying thousands of files from potentially
multiple partitions to potentially multiple partitions, potentially
containing hardlinks and/or symlinks?

> OTOH, if the transport medium was of sufficiently low bandwidth,
> you could allow the operation to start and warn the user that
> he/she WILL run out of space at the target *if* nothing changes
> (the implication being that the user should be that agent of change).

You could, but what if the user's mom calls right when he is about to
free some space? It should be possible to pause the operation.

> The 1960's approach, of course, is just to naively start the operation
> and then have it abend when it hits that brick wall!
>
> Note that each action that the user takes implicitly consumes resources
> and, as such, can hinder other actions that he/she may want to take.

As with everything in life. Your questions are as old as human society.

> And, that your device can also have autonomous resource needs that
> are incurred alongside the user's actions (e.g., daemons).
>
> Do you, for example, let the user consume battery in a futile attempt
> to perform some operation -- and end up jeopardizing his ability to
> do some more valuable operation later (e.g., back up his device before
> power fails)?

If you already know that the user is likely to perform a backup, then
the resources for that should already be reserved before "some
operation" can be initiated. In that case, there is no jeopardy.

> Again, these are only examples. The question is what criteria do you use
> for alerting (and/or inhibiting!) the user when you know that it is likely
> that he won't be able to perform the desired task WITH THE SYSTEM IN ITS
> CURRENT STATE -- and *when* do you impose those notifications?

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/
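The copy precheck debated in this exchange -- stat() the source, compare against free space on the target, and (on a slow transport) warn rather than refuse, since the user has time to act -- can be sketched as below. The thirty-second grace threshold and the three-way ok/warn/deny policy are assumptions made for the example, not any poster's actual design.

```python
# Illustrative sketch of the copy precheck discussed in this thread.
# The size check is inherently racy: we *hope* the size stays constant
# and that nothing else eats the free space (the "reserve the space"
# suggestion is the non-racy alternative).
import os
import shutil

def precheck_copy(src, dst_dir, bandwidth_bytes_per_s=None, grace_s=30):
    size = os.stat(src).st_size                  # hope it stays constant...
    free = shutil.disk_usage(dst_dir).free
    if size <= free:
        return "ok"
    if bandwidth_bytes_per_s and size / bandwidth_bytes_per_s > grace_s:
        # Slow transport: start anyway and warn -- the user has time
        # to become the "agent of change" and free some space.
        return "warn"
    return "deny"                                # a fast copy would just abend
```

The recursive-copy objection above is exactly what this sketch cannot handle: with thousands of files spread over several source and target partitions, a single size-versus-free comparison per filesystem is the best a precheck can do, and hardlinks make even the total size ambiguous.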







