
Managing "capabilities" for security

Started by Don Y November 1, 2013
On 11/7/13, 2:27 AM, Don Y wrote:
> Hi Richard, > > On 11/6/2013 10:43 PM, Richard Damon wrote: > >> My complaint was to your statement that the ONLY proper way to revoke a >> permission is asynchronously. > > Sorry, that wasn't my intent. What I was trying to address was the > *practical* aspect of all this. > > *I* have to create the mechanisms that will ultimately be used > throughout the system. Run the thought experiment(s):
Ok, have you thought this far enough to be able to say that for *ALL* systems that need to be able to revoke permissions, the *ONLY* valid method is asynchronous revocation without prior warning? That is, that it is *NEVER* proper for the permission system to give prior warning and give the actor holding the permission an opportunity to clean up and indicate it is done? This is my objection: the categorical statement that only one method is right.
> > -Imagine I make a system that notifies, waits "some" time, then revokes. > -Imagine I make a system that just revokes -- and notifies after the > fact. > > I then tried to present possible scenarios for what *might* happen > in each case. I.e., notification gets lost/delayed/ignored -- or > he was "blocked" while the notification came in. In each case, how > does that affect the eventual actions of the "holder"? Ans: he > has to deal with NOT having the capability when he opts to use it. > I.e., he can't blindly *assume* it will "work". > > So, he has to code for both cases: that he received the notification > and is going to try to comply in an orderly fashion; and, that he > didn't have enough warning (or *any* warning) when the notification > arrived and has effectively *lost* the resource before/during its > intended use.
No, if he writes his code to be able to respond in time, then most of the time a revocation will have the (likely much lower) cost of an orderly shutdown, and only in the rare cases where something has gone wrong will he suffer the higher cost of "random" revocation. If the cost differential is high enough, the holder may need to use a different, less efficient algorithm (maybe check-pointing information often to allow resumption in case of failure) to minimize the cost of random revocation, even if it should only be rare. Now, perhaps in the situation you are talking about, where communication is so unreliable that everything already has so much error checking, random revocation isn't an issue; but I think that is actually the rare case, and in such a system you probably can't be promising much performance anyway. In other situations the trades work differently, and in some of them a cooperative process of revocation may give much benefit.
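In rough C, the shape of that trade looks something like this (resource_use(), checkpoint_save() and checkpoint_restore() are hypothetical stand-ins, not any real API):

#define OK               0
#define FAIL_PERMISSION  1        /* capability yanked out from under us */
#define TOTAL_STEPS      1000
#define CHECKPOINT_EVERY 10

int  resource_use(int step);      /* touches the held resource */
void checkpoint_save(int step);   /* persist enough state to resume */
int  checkpoint_restore(void);    /* step to resume from (0 if none) */

int do_work(void)
{
    /* resume from the last checkpoint if a prior run was revoked */
    for (int step = checkpoint_restore(); step < TOTAL_STEPS; step++) {
        if (resource_use(step) == FAIL_PERMISSION)
            return FAIL_PERMISSION;   /* "random" revocation: we lose only
                                         the work since the last checkpoint */
        if (step % CHECKPOINT_EVERY == 0)
            checkpoint_save(step);    /* bounds the cost of a surprise */
    }
    return OK;
}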
> > My *opinion* is that this extra complexity -- both "in the system" > and in the "applications" -- will end up wasted. That to be > effective, it would require even *more* mechanism than we have > discussed (e.g., negotiating an "early warning" interval, deciding > how to handle the case when that interval can't be met, etc.). > > Given that "holders" will have to tolerate the case of the capability > "going away", it seems easier to just handle that case and make > folks aware of it in the API. > > Remember, these are "exceptional conditions". You *expect* to > be able to hold a capability that you have requested and been > granted. I'm just not willing to make that a *guarantee*. So, > I need a way to "change my mind" -- BECAUSE I HAVE A GOOD REASON > FOR DOING SO, NOW (just like I can preempt your execution if I > have a good reason!).
This becomes a cost trade-off. A cooperative system is much better able to make good trade-offs. The cooperative system may want the asynchronous method as a backup to handle extreme cases, or as a fallback for a process not meeting the requirements. If you ONLY have the extreme cases, then maybe you don't need the cooperative system.
> >> My position is that you can't make such a >> statement and that you need to apply design to the situation, and that >> in some conditions revocation should have a defined notification before >> it is to be revoked. > > What if I can't *guarantee* that notification arrives sufficiently > early for you to do anything about it? If you will be able to > cope with this (by implementing your algorithm differently -- even > if it requires a complete rollback), then why shouldn't I opt for > this as the normal behavior? > > If you *insist* on this, then I may need other concessions from > you to ensure the level of performance is met. E.g., maybe you > can only hold a capability for a fixed period of time -- that > way, *I* know all I have to do is wait and I get it back > automatically? But, this complicates your work in other ways... > >> Let me put a real world example, generally one's right to drive a vehicle >> is a privilege granted by the government, and the government has the >> power to revoke this privilege, but if it does so, there are >> notification requirements so that you do know your privilege is being >> revoked. This means that once you have gone through the procedures to >> get the privilege to drive, you can safely do so, knowing that if for >> some reason your privilege is revoked, you will be given sufficient >> notice so that you don't get in trouble. > > Ah, but they will only *try* to give you notification! If that > notification doesn't make it to you (you've moved, were out of > town, etc.) and you later encounter a police officer, you're in > the same situation as if you *had* been notified and chose to > ignore it. > > [Sure, you could go to court and hope you get a rational judge > but it's not The State's responsibility to ensure you have been > notified -- only that they "made a concerted attempt".] >
The "try" generally includes positive indication that you have received the message or not. If you don't get it because you moved, then you have broken the protocol, as you are required to give notification of the move (and if you fail, it is your fault for not do so). For notifications with serious consequences, there often IS a requirement that some official of the court (Police officer or some other trusted individual) be able to attest that they delivered the notice or performed the legally required attempts before you can be punished for not knowing about the action. In the process model, the revoker has the obligation to make a good faith attempt at delivering the notification, and the holder the responsibility to listen for and act on the notification. If the holder doesn't keep his end up, he can't complain about the cost of not reacting to the message. Even if the communication channel isn't totally reliable, hopefully it is reliable enough
> >> Imagine instead, that the government reserved the right to revoke your >> privilege without notice (but did give you a way for you to check if >> your right has been revoked), also check points were established at >> random to check that you DO have current privilege to drive, and that >> driving without privilege was a capital offense. Would you want to >> drive? IF you did have to, I bet you would want to spend a lot of effort >> checking that you haven't been revoked. > > But you're assuming there *is* some "really bad consequence". > What if you rarely drive? What if there are few police officers > in the parts of town that you frequent? > > What if you are approached by a cop while *walking* and he asks > for an ID. He sees that your license is expired and confiscates > it. > > Or, you go to cash a check at a bank and the bank officer does > this on behalf of The State? > > I.e., any time you would *normally* use that credential you run > the risk of it NOT being honored -- even if you aren't "punished" > for this (your "punishment" is not being allowed to USE it) >
An "expired" credential is something that you can know of ahead of time that it is coming up, and if you do your job right, you know to renew it so you have the renewal in time to continue acting. The case present was a case where you had to use a resource/capability, but you couldn't know for sure you still have it, and the cost for using it when you didn't was high. This is a possible case. Pointing out cases where the cost is low does NOT negate the cost in the cases where the cost is high.
>> This is exactly like the case that can happen for some forms of >> privilege, like access to shared memory, if this sort of access is to be >> revoked asynchronously, it generally means that process doing it will be >> aborted, or the process needs to not treat it as shared memory but use >> some sort of kernel call to check the permission and do the access >> atomically (instead of just accessing the memory). > > Protect shared memory with a mutex. Hold it as long as you want. > If I want to control that with a capability, I can wrap the > mutex access with the capability: so, you can't *take* the > lock without permission but, once held, I can't interfere with > your holding the lock.
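In rough C, the wrapping described above might look like this (capability_check(), handle_t and RIGHT_LOCK are illustrative stand-ins; a real kernel would make the check-and-take atomic):

#include <pthread.h>
#include <errno.h>

typedef int handle_t;
#define RIGHT_LOCK 0x1

int capability_check(handle_t cap, unsigned right);  /* hypothetical */

static pthread_mutex_t region_lock = PTHREAD_MUTEX_INITIALIZER;

int acquire_region(handle_t cap)
{
    if (!capability_check(cap, RIGHT_LOCK))
        return EPERM;                   /* can't *take* the lock... */
    pthread_mutex_lock(&region_lock);   /* ...but once held, revocation
                                           can't pull it out from under you */
    return 0;
}

void release_region(void)
{
    pthread_mutex_unlock(&region_lock);
}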
I.e., you are admitting that there are some capabilities that can't just be asynchronously revoked.
> > It's a capability. I can make it "control" whatever I choose. > And, implement whatever *else* I choose to ensure that this > control makes sense. > > E.g., if I revoke access to a piece of memory, I could opt to > *suspend* your process at the same time. Then, make a copy of > the memory while someone else accesses it. Then, restore the > original before resuming your process (and restoring your > capability).
Hopefully you document that any process that gets this resource is subject to being randomly blocked for a period of time.
> > I.e., you are *always* at the mercy of the kernel. I just have to > ensure that I uphold any contracts that I have agreed to with you. > And vice versa (of course, if *you* cheat, I can bitch-slap you! :> )
Yes, the kernel is capable of doing anything it wants to your process, BUT if it is to be considered a "working" kernel, it shouldn't. Every process should have a programming contract with the kernel: what the process can expect from the kernel, what it can ask for, and what can happen. You also are assuming that there IS a kernel that is the master over the machine. In many systems there is not this sort of kernel; the kernel just coordinates various actors, under the assumption that each actor follows the rules and plays nice. It costs resources to place each actor in their own "jail" to make sure they behave, and if you have control over the code in the machine, sometimes it is just better to play nice.
> >> I agree, that in SOME cases, the asynchronous revocation is a good >> model, but not all. In most cases where a notification/cooperative >> revocation system makes sense, for reliability concerns, a backup >> asynchronous method make sense, to allow you to revoke a malfunctioning >> process, but at that point, since it is already malfunctioning (since it >> didn't complete the cooperative revocation method), the problems imposed >> on the task are likely reasonable. This doesn't mean that the >> cooperative method was worthless. > > I'm not claiming "one is good" and "the other is bad". I'm just trying > to look at the realistic consequences of each approach. How to > balance complexity, resources, etc. against "convenience" (for want > of a better word :< ) I suspect most folks will just code as if > they could lose a resource prior to using it or *while* using it. > I imagine the result code from accessing the service/resource will > be *all* they look at. And, that any signal handler for "resource > revocation" will simply be undefined. It's just the least effort > approach (it should be obvious that I expect folks to be lazy in > their implementations!). > > When faced with this sort of condition, I *also* expect these > folks to just report "FAIL" for their activities and not even > *try* to get things "right" (i.e., "as good as possible in > the circumstances")
You seem to assume that testing every usage for revocation is easy, or that having the capability randomly removed will always be of relatively low cost.
> >> Also, non-backwards compatible specification changes ARE expensive. That >> is just the way things work, at least if you want to be able to talk >> about software having correctness. This does mean that you do want to >> put some effort into defining your requirements, to put into them the >> things you need to verify/prove correctness, but not things that you >> don't need that add unneeded future limitations. > > This is why I am spending the effort *now* considering how various > scenarios are likely to be handled. I don't want to have to make > a change down the road because I "discovered" something that "can't > work". > > I'm not vain enough to think I can come up with the Right way to > handle every situation. But, I *do* think I can come up with a > practical way that handles most situations economically and *all* > situations "properly", even if not efficiently. > > I can always decide *not* to revoke a capability! Then, *none* of > the mechanism gets invoked.
But you have decided that it *NEVER* makes sense to have a protocol to ask the holder of a privilege to quickly finish up and release it (since it is always better to use asynchronous revocation). Yes, there are cases where you know that you now have a need for a resource that is so severe and so immediate that the cost imposed on the actor holding the resource doesn't really matter: you need it back. In this case an asynchronous revocation makes sense. Other times you may come across a need for something that isn't immediately urgent, but is important to get done before the actor might normally finish with what it is doing. This is where a cooperative protocol adds value.
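In outline, such a cooperative protocol with the asynchronous backstop might look like this (a sketch; every name here is hypothetical):

typedef int handle_t;
#define RELEASED 0

void notify_holder(handle_t cap);                 /* "please finish up" */
int  wait_for_release(handle_t cap, unsigned ms); /* poll/block for release */
void force_revoke(handle_t cap);                  /* the meat cleaver */

void revoke(handle_t cap, unsigned grace_ms)
{
    notify_holder(cap);                           /* cooperative phase */
    if (wait_for_release(cap, grace_ms) == RELEASED)
        return;                                   /* orderly, cheap path */
    force_revoke(cap);                            /* deadline missed (or
                                                     nobody listening):
                                                     asynchronous fallback */
}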
Hi Richard,

On 11/10/2013 5:00 PM, Richard Damon wrote:
> On 11/7/13, 2:27 AM, Don Y wrote: >> On 11/6/2013 10:43 PM, Richard Damon wrote: >> >>> My complaint was to your statement that the ONLY proper way to revoke a >>> permission is asynchronously. >> >> Sorry, that wasn't my intent. What I was trying to address was the >> *practical* aspect of all this. >> >> *I* have to create the mechanisms that will ultimately be used >> throughout the system. Run the thought experiment(s): > > Ok, have you thought this far enough to be able to say that for *ALL* > systems that need to be able to revoke permissions, the *ONLY* valid > method is asynchronous revocation without prior warning? That is, that > it is *NEVER* proper for the permission system to give prior > warning and give the actor holding the permission an opportunity to clean > up and indicate it is done? This is my objection: the categorical > statement that only one method is right.
Let's try this again... "*I* have to create the mechanisms that will ultimately be used throughout the system. Run the thought experiment(s):" I.e., *I* am the system architect. It is my responsibility to design the environment in which all actors will operate. It's not a "homework assignment" where I can talk theoreticals. It has to *work* in The Real World.

You can design systems lots of different ways. You can implicitly trust every actor and every developer and assume they will ALWAYS have the needs of others in mind. I.e., they will never "hog" a resource unless they absolutely need to (including the CPU itself!). You can assume they will be very technically competent and know all the right ways of doing things to maximize their cooperation, etc.

In such a system, you need very little mechanism. The "mechanism" is already implicit -- each actor (and developer) KNOWS that other agencies will only ask for something when they need it. *OTOH*, the other agency also knows that an actor who chooses NOT to relinquish a resource does so because *he* has decided (fully aware of your request!) that *HE* needs it more! (I.e., no need for an asynchronous revocation mechanism at all! And, arguably, no need to *request* revocation of a resource, either -- the actor holding it WILL release it immediately after he is done with it! If you need something, just WAIT FOR IT -- it will turn up when it "should" -- and no sooner!)

Such systems are inherently limited in size. It's just not practical to keep track of every use and *hope* they all sort themselves out at run time. They are also *closed*. Adding something (another actor) to one requires too much knowledge about EVERYTHING in the system in order to know what you can expect *from* it. They also IMPLICITLY trust all actors (and developers). A hostile actor can too readily compromise the entire system. THE SAME APPLIES TO A ROGUE ACTOR (not intending to be hostile but, having a bug or operating in a failed state). I.e., these systems are uninteresting in the real world -- except as examples of how badly things can fail.

As complex systems are becoming increasingly more "open" (bad choice of words as it suggests "free", "easy to inspect", etc.), you can neither expect to NOT encounter hostile/rogue actors -- nor can you expect to encounter "highly skilled/aware/cooperative" developers, etc. Esp if the system is exploited commercially! Unless you interpose some active agency to "qualify" the sorts of software *allowed* (authorized) to run on/in the system. [If you think otherwise, we are simply talking at cross purposes]

So, I have taken the approach of putting clamps on damn near everything. Who can talk to whom, which memory belongs to each, who can use what, etc. At considerable cost (resources). Because I believe this is a key part of making systems that "stay up forever" in spite of the inadequacies of the components thrown at it/into it. If a component has a defect, the defect should penalize the component, not the rest of the system.

So, how do you implement "permissions" in such a system? One approach is as above: trust everyone to Do The Right Thing. But, that won't work in anything other than a "Perfect World". Perfect Developers creating Perfect Actors with plentiful resources. You can also "Trust but Verify" -- hope everything happens in an amicable fashion and include a mechanism to deal with cases where it *sometimes* doesn't (i.e., still giving actors the benefit of the doubt).
Each of these assumes a benign environment with largely cooperating actors/developers and just the occasional "fluke". A more realistic approach may be "Trust and Enforce" -- hope folks do what they should and extract a penalty when they don't! So, lazy/inept actors suffer for their laziness/errors and those who "behave" are effectively rewarded (!punished). This allows a rogue/hostile actor to do damage -- but, hopefully, reduces his capability to do so, over time (by allowing penalties to escalate -- possibly infinitely!).

I've opted for an approach closest to this last -- IN TERMS OF THE SYSTEM'S ABILITIES! I.e., the system doesn't *rely* on cooperation. The implementor of a resource can choose to be as cooperative as *he* wants. Or, as heavy-handed! If you implement a "math facility", you can choose to allow actors to have exclusive control over when -- and if -- they relinquish that resource ONCE THEY POSSESS IT. Or, perhaps you will expect them to honor requests from you that it be released (because you know you have another client waiting to use it). Perhaps you allow them a certain time interval? Or, a certain number of "operations"? Or, some other criteria negotiated (and tracked BY YOU) when the resource was initially granted?

But, as The System is the only entity that can actually *effect* changes made to a "capability" (Handle), The System still needs to be able to revoke a capability unceremoniously. BECAUSE THERE WILL BE TIMES WHEN THIS IS NECESSARY. I.e., NOT having this ability means an actor can impose an unbreakable deadlock ("Sorry, we can't shut the system down yet because Task A won't release Resource X").

You want to use a resource? You adhere to the contract governing its use. If you don't want to deal with the possibility that the resource can be asynchronously revoked, then DON'T USE IT! Because you *know* it *will* be revoked, sooner or later (in a 24/7/365 application).

GIVEN THAT SYSTEM ABILITY, wanna bet most resource *implementors* opt for asynchronous revocation? No, they won't *eagerly* do this... they will try to allow you to keep a resource as long as you NEED it (but, why would you hold a resource any *longer* than that? Are you being inconsiderate??). But, when they want/need those resources back, they will simply *take* them -- and tell you that they have done so (assuming you don't discover this by trying to access the resource at that exact time and receiving a FAIL_PERMISSION error).

Remember, each implementor uses resources that may have been granted to *him* by someone else further up the food chain. So, for *you* to be kind to your clients, *he* would have to be kind to you! (The presence of any "impatient" resource implementor in your supply line means *you* have to be similarly impatient.)

I don't see this as a problem. If you can't tolerate the possibility of an asynchronous revocation, don't use the resource. If you need 100% of the CPU, then you can't run. If you need more memory than is available, you can't run. Find some other environment to operate in.

Ask yourself how EVERY application running on your Linux box can *magically* tolerate a "kill -SIGKILL". Those that can't (e.g., those "flying the aircraft") can't run in that environment! (And there's nothing wrong with *that*, either!) If a (Linux) app wants to be able to recover from something asynchronous like this, *it* bears the cost of doing so. It doesn't force the rest of the system to bear it on its behalf.
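The Linux analogy, concretely: SIGTERM is the cooperative "finish up" request an app *may* catch; SIGKILL is the unceremonious revocation that can't be caught, blocked, or handled. A minimal sketch (do_some_work() and release_resources() are placeholders):

#include <signal.h>
#include <stdlib.h>

static volatile sig_atomic_t shutting_down = 0;

static void on_term(int sig)
{
    (void)sig;
    shutting_down = 1;              /* main loop notices and cleans up */
}

void do_some_work(void);            /* placeholder */
void release_resources(void);       /* placeholder */

int main(void)
{
    struct sigaction sa = {0};
    sa.sa_handler = on_term;
    sigaction(SIGTERM, &sa, NULL);  /* SIGKILL can't be registered --
                                       the kernel reserves the right to
                                       revoke *you* unceremoniously */
    while (!shutting_down)
        do_some_work();

    release_resources();            /* orderly path; a SIGKILL'd process
                                       simply never gets here */
    return EXIT_SUCCESS;
}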
Hi Don,

On Thu, 07 Nov 2013 17:40:46 -0700, Don Y <This.is@not.Me> wrote:

>[eliding a lot for fear of hitting upper message length limit]
Don't know what the limit is, but I've seen messages several thousand lines long in various groups. If everyone edits judiciously, ISTM that it would be hard to get there in any reasonable discussion. The ridiculously long messages I have seen often were the result of repeated top-postings or "me too"ing with no attempts made at editing.
>On 11/7/2013 2:27 PM, George Neuner wrote: >> In the original version yes ... later they went to a 256-bit ticket to >> include more end-point information and better crypto-signing. > >OK. But that just changes the size of the copy. It still allows you >to create as many copies as you want -- without anyone knowing about >them. And, makes "a certain bit pattern" effectively the same as >another copy of that capability!
Yes. However, the enlarged capability was an improvement over the original because it carried information on client(s) authorized to use the capability.
>Once the kernel has decided that you *can* "talk" to that service >(the one that backs the object in question), the IPC/RPC proceeds >(marshall arguments, push the message across the comm media, await >reply, etc.). > >On the receiving end, the service sees your request come in. Knows >the object to which it applies (because of which "wire" it came in on), >identifies the action you want to perform (because of the IPC/RPC >payload) and *decides* if you have been allowed to do that!
Server has an addressable port per managed object? Seems like overkill.
> ... unlike Amoeba and other >ticket-based systems, the number of "authorizations" isn't defined >by a bitfield *in* the "ticket/key". Rather, it's whatever the >Handler considers to be important.
Yes. As I noted previously, when the set of "authorizations" is arbitrary, the role of the ticket has to be demoted from self-contained capability to some kind of capability selector. But it doesn't require kernel involvement - it could be done all in user-space.
>Much of the implementation is Mach-inspired. Think of Handles as >port+authorizations.
Understood.
>User-space capabilities allow the kernel to get out of the loop. >But, mean that the kernel can't *do* anything to control the >proliferation of copies, etc.
Yes. However, capabilities can be managed in user-space by the services themselves - which IMO actually makes more sense if the set of authorizations they control are wildly different. All that is necessary at the kernel level is to validate "port send" permission. But in any case, we're back to how it is granted 8-)
>>> If H truly *is* backed ("handled") by B, then the kernel allows the >>> transaction -- calling on B to enforce any finer grained authorities >>> (besides "access"). I.e., B knows which authorities are available >>> *in* H and can verify that the action requested is one of those allowed. >> >> What "transaction"? The set of possible objects and the actions that >> might need to be performed on them both are unbounded. > >Yes. Kernel cares not about *what* A is asking B to do on H. ... > >> A generic "do-it" kernel API that can evaluate every possible action >> on any object is a major bottleneck and a PITA to work with. Even if >> the high level programmer has a sweet wrapper API, the low level >> programmer has to deal with absolutely anything that can be pushed >> through the nonspecific interface. > >Handlers and Holders conspire as to what actions they want/need to >support. If you want to be able to erase every odd byte on the raw >disk device, then *someone* has to write the code to do that! >If you want to ensure this action isn't casually initiated, then >someone has to enforce some "rules" as to who can use it -- and >even *how*/when (e.g., you might have authorization to do this, >but the Handler only lets it happen on Fridays at midnight). >Let the Handler and Holder decide what makes sense to them!
Previously you had said that your kernel was able to prevent clients from making connections on the basis of complex permissions like the right to "erase every odd byte" of the object. That's why I asked how the kernel knows what the client wants. Now you are saying that the kernel only checks the client's "port send" authority and leaves more complex decisions to the server. Which is it?
>I wanted to keep the kernel out of the "policy" issues and just >let it provide/enforce "mechanism". > >Unfortunately, it makes the kernel a bottleneck as all IPC/RPC >has to be authenticated there. But, it gives me a stranglehold >on "who can do what". It also gives Handlers the ability to >decide what constitutes abuse of privilege -- *its* privilege! >And, provides far more refined ideas of what those privileges >actually *are*.
See, here you seem to be saying again that the kernel can make decisions based on fairly intimate knowledge of the client's intentions.
>In my case, a namespace binds a name to a Handler. What that Handler >does and how it does it can have absolutely nothing in common with >any other Handler in the system. > >The *namespace* "object" has operations that can be performed on >it (methods defined in the IDL that can be applied to any Handle >that references that particular *flavor* of namespace). E.g., >resolve(), create(), delete(), etc. But, it has no sense of >reading/writing *to* the Handles that it manages. > >> Unfortunately, doing this generically is a PITA (so no one does it). >> If you are familiar with COM or Corba, it amounts to the server >> returning an IDL specification, and the program [somehow] being able >> to interpret/use the IDL spec to make specific requests. > >I don't implement a full-fledged factory. Rather, I assume you know >everything there is to know about the objects with which you are >interacting. That you and their Handlers have conspired beforehand >to some set of agreed upon methods (abilities? trying to avoid >using the word "capabilities").
So implementing the IDL gives you capability? Or just potential? I.e. you can assume that an imposter task has implemented the IDL for the service/object it wants to hijack.
>> System-wide synchronous revocation is impractical, but revocation can >> be done asynchronously if master capabilities are versioned and >> derived capabilities indicate which version of the master was in force >> when they were issued. > >No need for versioning. Handles are unique -- not "reused" (until all >references to it are known to be gone). As they can't be duplicated >(without the kernel's involvement), it knows when it is safe to reuse a >stale Handle. (A task can *try* to hold onto it but the kernel that >serves that task *knows* it doesn't exist anymore. "File descriptor >27 is no longer attached to a file -- regardless of what you may >*think*!")
But if you are using "handles" ("indexes", "selectors", whatever) to represent arbitrary collections of authorities, you're going to run out of them pretty quickly unless the handle objects are fairly large. I.e. 4 billion [32-bit handles] seems like a really large number until you actually start parceling it out: e.g., if "objX,read" is distinct from "objX,read,append" is distinct from "objY,read,append", etc. That's part of the reason Amoeba used wide tickets. [1st version used 80-bits without the crypto-signing field, 128 bits in all. 2nd version capabilities were 256 bits]. George
Hi George,

On 11/11/2013 10:20 PM, George Neuner wrote:
> On Thu, 07 Nov 2013 17:40:46 -0700, Don Y <This.is@not.Me> wrote: > >> [eliding a lot for fear of hitting upper message length limit] > > Don't know what the limit is, but I've seen messages several thousand > lines long in various groups. If everyone edits judiciously, ISTM > that it would be hard to get there in any reasonable discussion. The > ridiculously long messages I have seen often were the result of > repeated top-postings or "me too"ing with no attempts made at editing.
I think NNTP servers are free to impose their own limits. I've previously bumped up against it and found it annoying to have to edit my own reply before being allowed to send it...
>> On 11/7/2013 2:27 PM, George Neuner wrote: >>> In the original version yes ... later they went to a 256-bit ticket to >>> include more end-point information and better crypto-signing. >> >> OK. But that just changes the size of the copy. It still allows you >> to create as many copies as you want -- without anyone knowing about >> them. And, makes "a certain bit pattern" effectively the same as >> another copy of that capability! > > Yes. However, the enlarged capability was an improvement over the > original because it carried information on client(s) authorized to use > the capability.
So, how are surrogates handled? E.g., send capability to X and X wants to delegate some or all of it to Y. I.e., it can create a new capability from a subset of its own (which Y can then do for Z, etc.) but how do you track down all derived capabilities (or, just not recycle "identifiers" so any stale copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE)?
>> Once the kernel has decided that you *can* "talk" to that service >> (the one that backs the object in question), the IPC/RPC proceeds >> (marshall arguments, push the message across the comm media, await >> reply, etc.). >> >> On the receiving end, the service sees your request come in. Knows >> the object to which it applies (because of which "wire" it came in on), >> identifies the action you want to perform (because of the IPC/RPC >> payload) and *decides* if you have been allowed to do that! > > Server has an addressable port per managed object? Seems like > overkill.
Yes. But think about how many managed objects you are likely to have. E.g., only *open* files need to have handles...
>> ... unlike AMoeba and other >> ticket-based systems, the number of "authorizations" isn't defined >> by a bitfield *in* the "ticket/key". Rather, its whatever the >> Handler considers to be important. > > Yes. As I noted previously, when the set of "authorizations" is > arbitrary, the role of the ticket has to be demoted from > self-contained capability to some kind of capability selector. But it > doesn't require kernel involvement - it could be done all in > user-space.
Yup. In my case, the Handler provides "policy"... and can decide whatever authorizations make sense for this instance of this object. Kernel provides communications and "Handle-related" operations (i.e., think of the Handles as objects in their own right -- not just REPRESENTATIVES of other objects).
>> Much of the implementation is Mach-inspired. Think of Handles as >> port+authorizations. > > Understood. > >> User-space capabilities allow the kernel to get out of the loop. >> But, mean that the kernel can't *do* anything to control the >> proliferation of copies, etc. > > Yes. However, capabilities can be managed in user-space by the > services themselves - which IMO actually makes more sense if the set > of authorizations they control are wildly different. All that is > necessary at the kernel level is to validate "port send" permission.
Correct. And ensure the "messages" (IDL) destined for each "object" (Handle/port) get routed to the right Handler for that object.
> But in any case, we're back to how it is granted 8-)
Initially, everything is hierarchical. So, whatever *I* create, *I* can give to others (e.g., my offspring -- directly or indirectly). But, they are free to create *their* own objects and act as Handlers for them -- and, give them to other actors that they are made aware of (e.g., via a directory service, their explicit namespaces, etc.)
>>>> If H truly *is* backed ("handled") by B, then the kernel allows the >>>> transaction -- calling on B to enforce any finer grained authorities >>>> (besides "access"). I.e., B knows which authorities are available >>>> *in* H and can verify that the action requested is one of those allowed. >>> >>> What "transaction"? The set of possible objects and the actions that >>> might need to be performed on them both are unbounded. >> >> Yes. Kernel cares not about *what* A is asking B to do on H. ... >> >>> A generic "do-it" kernel API that can evaluate every possible action >>> on any object is a major bottleneck and a PITA to work with. Even if >>> the high level programmer has a sweet wrapper API, the low level >>> programmer has to deal with absolutely anything that can be pushed >>> through the nonspecific interface. >> >> Handlers and Holders conspire as to what actions they want/need to >> support. If you want to be able to erase every odd byte on the raw >> disk device, then *someone* has to write the code to do that! >> If you want to ensure this action isn't casually initiated, then >> someone has to enforce some "rules" as to who can use it -- and >> even *how*/when (e.g., you might have authorization to do this, >> but the Handler only lets it happen on Fridays at midnight). >> Let the Handler and Holder decide what makes sense to them! > > Previously you had said that your kernel was able to prevent clients > from making connections on the basis of complex permissions like the > right to "erase every odd byte" of the object. That's why I asked how > the kernel knows what the client wants. > > Now you are saying that the kernel only checks the client's "port > send" authority and leaves more complex decisions to the server. > > Which is it?
Nothing talks to anything without the kernel's involvement. If you don't hold a send right for a port, then you can't *send* to it! So, you can repeatedly trap to the kernel -- but never get past that point. (Send rights are not forgeable.) Whatever *backs* the object (the "Handler" behind the "Handle") decides how to interpret each communication (e.g., IDL).

But, the Kernel acts as Handler for certain object types as well! E.g., a Task is a container for resources and Threads. You may want to operate *on* a task (e.g., change its priority, scheduling algorithm, stack allocation, kill it, etc.). So, each Task has a (at least one) Handle. When someone wants to SUSPEND a task, it takes a Handle for that Task and passes it to the task_suspend() IDL. If the caller has permission to *talk* to that object (i.e., the Task), the kernel routes the IPC/RPC message to the Handler for that object -- namely, the kernel itself! If the permissions recorded *by* the Handler for that instance of that Handle include the ability to SUSPEND the task, then the Handler (i.e., the kernel) suspends the task and returns SUCCESS to the caller.

If an object is a page of memory and there is some (wacky) operation supported on memory pages that allows "every odd byte" to be erased, then anyone holding a Handle to a memory page object for which that "authorization" has been granted can invoke the "erase odd bytes" method on that object. (The kernel has some involvement with memory though not exclusive.)
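In rough C, the two stages look something like this (names illustrative, not my actual interfaces):

#define FAIL_PERMISSION  (-1)
#define FAIL_UNSUPPORTED (-2)
#define TASK_SUSPEND     1
#define AUTH_SUSPEND     0x1

struct task;
struct handler;
struct message { int method; /* marshalled arguments follow */ };
struct port { int send_right; struct handler *handler; };

struct port *lookup_port(struct task *caller, int handle);
int  deliver(struct handler *h, struct port *p, struct message *m);
int  handle_allows(struct port *p, unsigned auth);
struct task *object_of(struct port *p);
int  suspend_task(struct task *t);

int ipc_send(struct task *caller, int handle, struct message *msg)
{
    struct port *p = lookup_port(caller, handle);  /* per-task table */
    if (p == NULL || !p->send_right)
        return FAIL_PERMISSION;     /* gatekeeper: you can't even *talk* */
    return deliver(p->handler, p, msg);  /* pure routing -- no policy */
}

/* ...and in the Handler for Task objects (the kernel itself): */
int task_handler(struct port *p, struct message *msg)
{
    if (msg->method == TASK_SUSPEND) {
        if (!handle_allows(p, AUTH_SUSPEND))
            return FAIL_PERMISSION; /* policy lives with the Handler */
        return suspend_task(object_of(p));
    }
    /* ...other methods... */
    return FAIL_UNSUPPORTED;
}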
>> I wanted to keep the kernel out of the "policy" issues and just >> let it provide/enforce "mechanism". >> >> Unfortunately, it makes the kernel a bottleneck as all IPC/RPC >> has to be authenticated there. But, it gives me a stranglehold >> on "who can do what". It also gives Handlers the ability to >> decide what constitutes abuse of privilege -- *its* privilege! >> And, provides far more refined ideas of what those privileges >> actually *are*. > > See, here you seem to be saying again that the kernel can make > decisions based on fairly intimate knowledge of the client's > intentions.
Kernel acts as initial gatekeeper. Implements communication and transport *mechanism* along with the "port capabilities" -- send and receive (plus others not discussed here).

To each actor, you're always talking to the kernel -- your Handle resides *in* the kernel, the communication that it represents is implemented *by* the kernel, notifications, operations on those Handles, etc. Actor has no knowledge of who is backing the object (Handling the Handle). To it, everything LOOKS like a kernel interface.

This is different from Amoeba where actors are conscious of the fact that they are actually talking to other actors. In Amoeba, you could pass a capability from Task A to Task B using USMail (or whatever). The kernel didn't need to be involved! Or, if it was, it could just provide a *pipe* -- no real checking going on, there.

Since my Handles are implemented *in* the kernel, the kernel has to be involved in every communication. But, this is what I want -- I don't want Task A to be able to *bother* Task B unless it has previously been authorized to do so! And, if Task A turns out to be hostile or goes rogue, then Task B can revoke Task A's ability to "send" to it and effectively isolate it.

If Task B only notices this annoying behavior on a couple of Handles that it provides to Task A, it can disconnect those Handles (ports) without affecting other Handles that Task A may currently hold (that are backed by Task B). I.e., I can implement fine-grained damage control instead of taking a meat cleaver to Task A.
>> In my case, a namespace binds a name to a Handler. What that Handler >> does and how it does it can have absolutely nothing in common with >> any other Handler in the system. >> >> The *namespace* "object" has operations that can be performed on >> it (methods defined in the IDL that can be applied to any Handle >> that references that particular *flavor* of namespace). E.g., >> resolve(), create(), delete(), etc. But, it has no sense of >> reading/writing *to* the Handles that it manages. >> >>> Unfortunately, doing this generically is a PITA (so no one does it). >>> If you are familiar with COM or Corba, it amounts to the server >>> returning an IDL specification, and the program [somehow] being able >>> to interpret/use the IDL spec to make specific requests. >> >> I don't implement a full-fledged factory. Rather, I assume you know >> everything there is to know about the objects with which you are >> interacting. That you and their Handlers have conspired beforehand >> to some set of agreed upon methods (abilities? trying to avoid >> using the word "capabilities"). > > So implementing the IDL gives you capability? Or just potential? I.e. > you can assume that an imposter task has implemented the IDL for the > service/object it wants to hijack.
IDL is just a collection of bytes that tell the recipient of the message (the envelope that contains the bytes) what they mean. You can "say" whatever you want -- no need to go through the stubs generated by the IDL.

I.e., you can fabricate a message that says "write to file" and push it to a Handle (port). If it happens to agree with the correct form for a "write to file" message *and* the Handle happens to be backed by a "file Handler" *and* that instance of that Handle allows write authorization, then you will cause the file to be written! The IDL stubs are just a convenience to save you this trouble.

OTOH, if you send that message to a Handle that represents a *motor*, it won't make sense. *Or*, can mean something entirely different for a motor (perhaps it means APPLY BRAKE). If you don't have APPLY BRAKE authorization for the motor that is backed by that Handle, then the IPC/RPC will fail. If you *do* have authorization to APPLY BRAKE, then the brake will be applied -- even though you *thought* you were fabricating a message to cause a "file write" operation!

In reality, this is minimized because you tend not to create your own messages. And, message IDs are disjoint. You would have to really work hard to create a message that works on an object of type X while thinking you were dealing with an object of type Y!

[Sorry, sloppy explanation but I think you can imagine what the machinery looks like. Bottom line is the content of the message can't change the things you are "authorized" to do nor the things on which you are authorized to act!]
>>> System-wide synchronous revocation is impractical, but revocation can >>> be done asynchronously if master capabilities are versioned and >>> derived capabilities indicate which version of the master was in force >>> when they were issued. >> >> No need for versioning. Handles are unique -- not "reused" (until all >> references to it are known to be gone). As they can't be duplicated >> (without the kernel's involvement), it knows when it is safe to reuse a >> stale Handle. (A task can *try* to hold onto it but the kernel that >> serves that task *knows* it doesn't exist anymore. "File descriptor >> 27 is no longer attached to a file -- regardless of what you may >> *think*!") > > But if you are using "handles" ("indexes", "selectors", whatever) to > represent arbitrary collections of authorities, you're going to run > out of them pretty quickly unless the handle objects are fairly large. > > I.e. 4 billion [32-bit handles] seems like a really large number until > you actually start parceling it out: e.g., if "objX,read" is distinct > from "objX,read,append" is distinct from "objY,read,append", etc.
Again, think about the sort of applications and the things that are big enough AND IN PLAY to require a Handle. E.g., the email_addr_t example I've enjoyed playing with... you only need to represent an email_addr_t as a "live object" (i.e., a Handle backed by a Handler) when it actually *is* "live". You can have tens of thousands of email addresses in your address book (RDBMS) but only those that have been instantiated for live references/operations need Handles!
> That's part of the reason Amoeba used wide tickets. [1st version used > 80-bits without the crypto-signing field, 128 bits in all. 2nd > version capabilities were 256 bits].
But Amoeba also allowed persistence for capabilities. So, you *could* store a capability in the RDBMS alongside each of those thousands of email addresses! Or, one for every file on the disk (bullet server).

But, you don't have thousands of file descriptors (Handles!) in your code! You don't fopen(2C) every file in the file system when your program starts -- "just in case". Instead, you create fd's as you happen to need them and the kernel (in most OS's) keeps track of what *actual* file each pertains to. When you close a file, the descriptor ceases to exist (in all practical terms) and the resources (kernel memory) that were associated with it can be reused for some *other* file reference.

Make sense? It's not a "lean" way of doing things but I think it's the only way I can get all the isolation I want between (possibly hostile and/or rogue) actors.

Gotta go finish building a machine to deliver tomorrow. Still have a few apps to install and snapshots to take before I will feel "confident" letting others screw with it! :>

--don
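Sketched as a kernel-side table (illustrative only -- the real structures are more involved):

#define MAX_HANDLES      64
#define OK               0
#define FAIL_PERMISSION  (-1)

struct object;

struct handle_slot {
    struct object *backing;      /* NULL once closed/revoked */
};

/* per-task, in *kernel* memory -- the task never touches it */
static struct handle_slot table[MAX_HANDLES];

int use_handle(int h)
{
    if (h < 0 || h >= MAX_HANDLES || table[h].backing == NULL)
        return FAIL_PERMISSION;  /* the task may *think* 27 is still a
                                    file; the kernel *knows* better */
    /* ...route the request to the object's Handler... */
    return OK;
}

void close_handle(int h)
{
    table[h].backing = NULL;     /* only now can the slot be recycled:
                                    no reference remains */
}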
Hi Don,

On Tue, 12 Nov 2013 00:04:16 -0700, Don Y <this@isnotme.com> wrote:

>>> On 11/7/2013 2:27 PM, George Neuner wrote: >>>> In the original version yes ... later they went to a 256-bit ticket to >>>> include more end-point information and better crypto-signing. >>> >>> OK. But that just changes the size of the copy. It still allows you >>> to create as many copies as you want -- without anyone knowing about >>> them. And, makes "a certain bit pattern" effectively the same as >>> another copy of that capability! >> >> Yes. However, the enlarged capability was an improvement over the >> original because it carried information on client(s) authorized to use >> the capability. > >So, how are surrogates handled? E.g., send capability to X >and X wants to delegate some or all of it to Y. I.e., it can >create a new capability from a subset of its own (which Y can >then do for Z, etc.) but how do you track down all derived >capabilities (or, just not recycle "identifiers" so any stale >copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE)?
You don't track them down, you just invalidate them at the source.

Some refresher background:

In Amoeba, tickets are public objects, but capabilities are server held *private* objects. Tickets are cryptographically signed to prevent forging. The signing function is kernel based (for uniformity). There is a public server API (not discussed here) to ask for tickets you don't have. For brevity here I am focusing only on how tickets are created, validated and revoked. Tickets may carry additional information beyond object access rights which I will not discuss here. [but see further below]

Definitions:

- "capability" is a tuple of { capID, rights, check#, ... } which is associated by a server/service with a managed object.

- "ticket" is a tuple of { svrID, capID, rights, signature, ... }. The svrID identifies the server/service. The capID references a particular capability offered by the service.

- "rights" are N-bit wide fields. The meanings of the bits are defined by the issuing server.

The actual sizes of these data are version dependent on the capability subsystem. Both capabilities and tickets adopted new functionality over time. It's important to understand that Amoeba capabilities are, in fact, "versioned", though versioning is neither sequential nor readily predictable.

When an object manager [server] creates a new capability, it generates two large random numbers to be used as the capability ID and as a "check" number associated with it. The ID will be made public; the check number is private, kept secret by the manager. The rights specified in the manager's capability tuple reflect the full set of privileges *this* capability can offer - which is not necessarily the complete set of privileges offered by the object.

The capability ID, rights, and check number all are passed into the signing function to generate a signature. An "owner" ticket then is constructed from the ID, the rights, and the signature (the check number remains private to the manager).

A "non-owner" ticket having reduced privileges is constructed by first determining a value for the ticket's rights field. "Owner" and "non-owner" tickets are distinguished by whether the rights field in the ticket _exactly_ matches the rights field in the manager's capability. The reduced rights value then is "combined" with the capability check number to create a derived check number. [Amoeba XOR'd them but any deterministic method will work] A derived signature is generated (as above) using the ID, reduced rights and the derived check number, and the new "non-owner" ticket is created from the ID, the reduced rights and the derived signature.

[Signatures (and rights for issued non-owner tickets) can be stored to optimize server side ticket validation, but all the signatures could be recomputed if necessary using data from capabilities and tickets.]

To validate a ticket, the object manager finds the specified capability using the ID field of the ticket. If the ticket's rights exactly match those of the capability (i.e. an "owner" ticket), the manager uses the check number to compute the expected signature and compares the value to the signature field of the ticket. If the ticket's rights don't exactly match the capability (i.e. a "non-owner" ticket), as above a derived check number and derived signature are computed, and the ticket is checked against the derived signature.

------------

At this point, it should be clear that every issued ticket is tied to a specific "version" of a capability by the capability's secret check number.
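Compressed into C, the scheme above looks roughly like this (sign() stands in for the crypto signing function; field widths are illustrative, not the actual 80/128/256-bit layouts):

#include <stdint.h>
#include <stdbool.h>

struct capability {              /* server-held, PRIVATE */
    uint64_t id;                 /* public, large random */
    uint32_t rights;             /* full set this capability can offer */
    uint64_t check;              /* SECRET large random -- the "version" */
};

struct ticket {                  /* public, freely copyable */
    uint64_t svr_id, cap_id;
    uint32_t rights;
    uint64_t signature;
};

uint64_t sign(uint64_t id, uint32_t rights, uint64_t check); /* stand-in */

/* "owner" ticket: rights match the capability's exactly */
uint64_t owner_sig(const struct capability *c)
{
    return sign(c->id, c->rights, c->check);
}

/* "non-owner" ticket: reduced rights folded into a derived check number */
uint64_t derived_sig(const struct capability *c, uint32_t reduced)
{
    uint64_t derived_check = c->check ^ reduced;   /* Amoeba XOR'd them */
    return sign(c->id, reduced, derived_check);
}

bool validate(const struct capability *c, const struct ticket *t)
{
    if (t->rights == c->rights)                    /* "owner" ticket */
        return t->signature == owner_sig(c);
    return t->signature == derived_sig(c, t->rights);  /* "non-owner" */
}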
If the capability is versioned - i.e. the check number modified - or if the capability record is deleted, then every ticket issued referencing that (no longer existing) capability is immediately rendered invalid.

Of course, there is the possibility that the same pairing of ID and check# for an existing or past (deleted) capability could recur for an unrelated object. Amoeba used per-server [not global] capabilities and *large* randomly generated ID and check values to minimize the chances of that occurring.

------------

So how to handle surrogates?

The meanings of bits in the rights field of the ticket are completely defined by the issuing server: the value may be an enum or a bitmap, there may be subfields ... whatever the implementer chooses. One bit can be defined as meaning "this is a surrogate ticket".

A surrogate ticket holder would be permitted to ask the server to create a new reduced capability for the managed object. The new capability maximally would allow only those privileges that were granted to the surrogate, allowing the surrogate independently to delegate by issuing "non-owner" tickets based on its own capability. The surrogate capability might also permit the surrogate to name group peers, fail-over alternates, etc. by transferring its "owner" ticket if other factors allow this (see following).

Because capabilities are kept private by the issuing server, surrogate capabilities can be linked to the owner's capability, allowing the owner to void delegate and/or surrogate tickets by versioning/deleting the appropriate capability. The surrogate, of course, can void delegate tickets by versioning/deleting its own capability.

Further, having tickets encode who is authorized to use them permits more restrictions, e.g., preventing delegates from enabling peers by copying the ticket. All versions of Amoeba's tickets specified a server (or service) ID - the field wasn't sign protected because the ID might be a task instance, but it allowed servers to immediately reject tickets they couldn't possibly have issued. Later versions of the capability system widened tickets to include an authorized user/group ID field protected by a 2nd crypt signature. [And also enlarged the rights field.]
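Revocation at the source is then a one-liner -- no hunt for outstanding tickets. Using the illustrative struct capability from the sketch above (fresh_random64() stands in for a CSPRNG):

uint64_t fresh_random64(void);   /* stand-in for a CSPRNG */

void revoke_all_issued_tickets(struct capability *c)
{
    c->check = fresh_random64(); /* every ticket derived from the old
                                    check number now fails validate()
                                    on its next presentation */
}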
>Kernel acts as initial gatekeeper. Implements communication and >transport *mechanism* along with the "port capabilities" -- send >and receive (plus others not discussed here). > >To each actor, you're always talking to the kernel -- your Handle >resides *in* the kernel, the communication that it represents >is implemented *by* the kernel, notifications, operations on >those Handles, etc. > >Actor has no knowledge of who is backing the object (Handling the >Handle). To it, everything LOOKS like a kernel interface.
Understood. IDL based RPC mechanism.
>This is different from Amoeba where actors are conscious of the >fact that they are actually talking to other actors.
Well, servers managing the objects anyway.
>In Amoeba, you could pass a capability from Task A to Task B >using USMail (or whatever). The kernel didn't need to be involved! >Or, if it was, it could just provide a *pipe* -- no real >checking going on, there. > >Since my Handles are implemented *in* the kernel, the kernel >has to be involved in every communication. But, this is what >I want -- I don't want Task A to be able to *bother* Task B >unless it has previously been authorized to do so!
Does the kernel recognize DOS attacks on itself?
>And, if Task A turns out to be hostile or goes rogue, then Task >B can revoke Task A's ability to "send" to it and effectively >isolate it. > >If Task B only notices this annoying behavior on a couple of >Handles that it provides to Task A, it can disconnect those >Handles (ports) without affecting other Handles that Task A >may currently hold (that are backed by Task B). > >I.e., I can implement fine-grained damage control instead of >taking a meat cleaver to Task A.
Amoeba v2 effectively could do the same.
>> [lotsa objects] reason Amoeba used wide tickets. > >But Amoeba also allowed persistence for capabilities. So, you >*could* store a capability in the RDBMS alongside each of those >thousands of email addresses! Or, one for every file on the >disk (bullet server).
Anyone could persist a *ticket* - but the referenced capability might no longer exist when the ticket is presented for use: e.g., following a restart or after version management performed by the capability owner.
>But, you don't have thousands of file descriptors (Handles!) in >your code! You don't fopen(2C) every file in the file system >when your program starts -- "just in case". Instead, you create >fd's as you happen to need them and the kernel (in most OS's) >keeps track of what *actual* file each pertains to.
Ever see an image based OS? No files (or, at least, none the user can perceive): just a virtual space containing "program" functions and "document" data structures with a directory for finding things. All "programs" and "documents" available at all times. Like working in a Lisp or Smalltalk system but extended to encompass all activity. Current NV memory based systems, e.g., for tablets, appear to work similarly, but they still perceptually are "file" oriented.
>--don
George
Hi George,

On 11/14/2013 2:55 PM, George Neuner wrote:
>>>> On 11/7/2013 2:27 PM, George Neuner wrote: >>>>> In the original version yes ... later they went to a 256-bit ticket to >>>>> include more end-point information and better crypto-signing. >>>> >>>> OK. But that just changes the size of the copy. It still allows you >>>> to create as many copies as you want -- without anyone knowing about >>>> them. And, makes "a certain bit pattern" effectively the same as >>>> another copy of that capability! >>> >>> Yes. However, the enlarged capability was an improvement over the >>> original because it carried information on client(s) authorized to use >>> the capability. >> >> So, how are surrogates handled? E.g., send capability to X >> and X wants to delegate some or all of it to Y. I.e., it can >> create a new capability from a subset of its own (which Y can >> then do for Z, etc.) but how do you track down all derived >> capabilities (or, just not recycle "identifiers" so any stale >> copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE)? > > You don't track them down, you just invalidate them at the source.
(by source, we agree to mean the server "handling" the object)
> Some refresher background: > > In Amoeba, tickets are public objects, but capabilities are server > held *private* objects. Tickets are cryptographically signed to > prevent forging. The signing function is kernel based (for > uniformity). There is a public server API (not discussed here) to ask > for tickets you don't have.
Yes. Tickets can be freely copied and passed around. Nothing *prevents* that. The capabilities (object, authorizations) behind them are "protected". ~dgy/.profile can be *known* to many, yet inaccessible to damn near *all*!

The signing function could similarly be implemented within the "service" for a particular ticket -- or, in addition to. I.e., anything that needs to know that "secret" can perform that duty.

By contrast, "Handles" (ports) in my scheme are just "small integers" in much the same way that file descriptors are "small integers". And, while nothing prevents you from *copying* a particular "small integer", the integer itself is neither the ticket nor the capability. Rather, *like* a file descriptor, it acts as a "name" for a particular Handle IN A PARTICULAR CONTEXT! (That of the task holding that handle!)

E.g., "23" interpreted as a file descriptor in task (process) A can refer to a particular pty. Passing "23" to some other task breaks the association with that particular pty. "23" is *just* "23" -- nothing more. OTOH, 0xDEADBEEF010204302893740 passed from Amoeba task A to Amoeba task B carries "rights" with it. Encoded within the cryptographic envelope!
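The lookup, roughly (illustrative structures):

struct object;
struct task { int n_handles; struct object **handles; };

struct object *resolve(struct task *t, int handle)
{
    /* "23" indexes *this* task's table; the same 23 in another task's
       table names something else entirely (or nothing at all). No
       rights travel with the integer itself. */
    if (handle < 0 || handle >= t->n_handles)
        return NULL;
    return t->handles[handle];
}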
> - "capability" is a tuple of { capID, rights, check#, ... } which is > associated by a server/service with a managed object.
In my case, the capability is embedded in the Handle and implemented by the Handler. The Handler could conceivably *change* how it interprets a set of "capabilities" (terms are getting WAY overloaded, here!) on the fly. Doing so without the actor's awareness could be challenging :>
> - "ticket" is a tuple of { svrID, capID, rights, signature, ... }. The > svrID identifies the server/service. The capID references a particular > capability offered by the service.
There is no concept of a "ticket" in my scheme. A "Handle" only exists in a specific context. Remove it from that context and it loses all meaning -- it's just a bunch of bits.
> - "rights" are N-bit wide fields. The meanings of the bits are defined > by the issuing server.
What I've called "authorizations". Except there is no visible "bit field" in my implementation. Each Handler decides how it wants to implement a set of "authorizations". E.g., a file server could have two threads (groups of threads) that are responsible for read or write access to a file. Files opened for read access are serviced (Handled) by thread R while those that are opened for write access are handled by thread W. Read requests are never *seen* by thread W and vice versa! (because the endpoint of the eventual read/write RPC differs -- wired differently when the open() is granted!) (I also have "communication rights" beneath the "rights" that are associated with the object being managed)
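A sketch of that wiring (hypothetical names, assuming per-thread message queues as the kernel endpoints): the authorization decision is made exactly once, when open() is granted, by choosing which queue the Handle is bound to.

    /* Sketch: authorization by *wiring*, not by a bit test on every
       request.  At open() time the Handle's endpoint is bound to the
       queue of the thread entitled to service it, so thread W never
       even sees read traffic (and vice versa). */
    struct msg_queue;                       /* a kernel IPC endpoint        */

    extern struct msg_queue *reader_queue;  /* drained by thread(s) R       */
    extern struct msg_queue *writer_queue;  /* drained by thread(s) W       */

    struct file_handle {
        struct msg_queue *endpoint;         /* fixed when open() is granted */
    };

    enum open_mode { OPEN_READ, OPEN_WRITE };

    void
    wire_handle(struct file_handle *h, enum open_mode mode)
    {
        /* The routing decision is made exactly once, here. */
        h->endpoint = (mode == OPEN_READ) ? reader_queue : writer_queue;
    }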
> It's important to understand that Amoeba capabilities are, in fact, > "versioned", though versioning is neither sequential nor readily > predictable.
This can just be seen as extending the namespace of the capability in a manner that makes for easier management. I.e., so a Handler (server) can opt to ignore "older" rights (because it can't vacuum memory to find and remove all instances of a particular "ticket")
> When an object manager [server] creates a new capability, it generates > two large random numbers to be used as the capability ID and as a > "check" number associated with it. The ID will be made public, the > check number is private, kept secret by the manager. > > The rights specified in the manager's capability tuple reflect the > full set of privileges *this* capability can offer - which is not > necessarily the complete set of privileges offered by the object.
Correct. As I don't have to give you read *and* write access to a file. Or, could opt to only grant APPEND access. Or, any other operator that I choose to implement (remove_duplicate_lines(), compress_in_place(), etc.)
> The capability ID, rights, and check number all are passed into the > signing function to generate a signature. An "owner" ticket then is > constructed from the ID, the rights, and the signature (the check > number remains private to the manager). > > A "non-owner" ticket having reduced privileges is constructed by first > determining a value for the ticket's rights field. "Owner" and > "non-owner" tickets are distinguished by whether the rights field in > the ticket _exactly_ matches the rights field in the manager's > capability. > > The reduced rights value then is "combined" with the capability check > number to create a derived check number. [Amoeba XOR'd them but any > deterministic method will work] A derived signature is generated (as > above) using the ID, reduced rights and the derived check number, and > the new "non-owner" ticket is created from the ID, the reduced rights > and the derived signature.
But, there is no direct *tie* to the original ticket from which this one (subset) was created! (or the one before *that*; or the one before *that*; etc.) So, when a ticket is presented, you can't look at the ticket and decide that the ticket from which it was created was revoked and, therefore, so should this one! (hence versioning. But, now the handler/server needs to keep track of which version is current for each outstanding ticket!)
> [Signatures (and rights for issued non-owner tickets) can be stored to > optimize server side ticket validation, but all the signatures could > be recomputed if necessary using data from capabilities and tickets.] > > To validate a ticket, the object manager finds the specified > capability using the ID field of the ticket. If the ticket's rights > exactly match those of the capability (i.e. an "owner" ticket), the > manager uses the check number to compute the expected signature and > compares the value to the signature field of the ticket. > > If the ticket's rights don't exactly match the capability (i.e. a > "non-owner" ticket), as above a derived check number and derived > signature are computed, and the ticket is checked against the derived > signature. > > ------------ > > At this point, it should be clear that every issued ticket is tied to > a specific "version" of a capability by the capability's secret check > number. If the capability is versioned - i.e. the check number > modified- or if the capability record is deleted, then every ticket > issued referencing that (no longer existing) capability is immediately > rendered invalid.
Yes. So, the Handler/server needs to effectively treat the version as the ID of the actor to which a ticket is granted IF IT WANTS TO REVOKE THOSE CAPABILITIES and only those WITHOUT AFFECTING OTHER TICKET-HOLDERS.
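For concreteness, the issue/validate scheme described above can be sketched in C. This is not Amoeba's actual code: XOR is assumed for the "combining" step, a toy mixing function stands in for the kernel's cryptographic signing (a real implementation would use a keyed MAC), and all names are invented. Versioning-as-revocation falls out at the end.

    /* Hypothetical sketch of the ticket scheme described above. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef uint64_t num;

    struct capability {          /* server-PRIVATE                          */
        num      id;             /* public, random                          */
        unsigned rights;         /* full rights this capability offers      */
        num      check;          /* SECRET, random -- the "version"         */
    };

    struct ticket {              /* public, freely copyable                 */
        num      id;
        unsigned rights;
        num      signature;
    };

    /* Toy stand-in for the kernel signing function -- NOT real crypto. */
    static num sign(num id, unsigned rights, num check)
    {
        num h = 1469598103934665603ULL;
        h = (h ^ id)     * 1099511628211ULL;
        h = (h ^ rights) * 1099511628211ULL;
        h = (h ^ check)  * 1099511628211ULL;
        return h;
    }

    /* "Owner" ticket: rights exactly match the capability's. */
    static struct ticket issue_owner(const struct capability *c)
    {
        struct ticket t = { c->id, c->rights,
                            sign(c->id, c->rights, c->check) };
        return t;
    }

    /* "Non-owner" ticket: reduced rights, derived check number. */
    static struct ticket issue_nonowner(const struct capability *c,
                                        unsigned reduced)
    {
        num derived = c->check ^ (num)reduced;
        struct ticket t = { c->id, reduced,
                            sign(c->id, reduced, derived) };
        return t;
    }

    /* Validation distinguishes owner/non-owner by exact rights match. */
    static bool validate(const struct capability *c, const struct ticket *t)
    {
        num check = (t->rights == c->rights)
                  ? c->check                     /* owner ticket            */
                  : c->check ^ (num)t->rights;   /* derived check number    */
        return t->signature == sign(t->id, t->rights, check);
    }

    /* Revocation by "versioning": replace the secret check number and
       every outstanding ticket -- the owner's included -- now fails
       validate(), so a fresh owner ticket must be issued at once. */
    static struct ticket version_capability(struct capability *c, num fresh)
    {
        c->check = fresh;                        /* random, in practice     */
        return issue_owner(c);
    }

Note how a "non-owner" ticket verifies without the server storing anything about it: the derived check number is recomputed on demand from the secret and the reduced rights.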
> So how to handle surrogates? > > The meanings of bits in the rights field of the ticket are completely > defined by the issuing server: the value may be an enum or a bitmap, > there may be subfields ... whatever the implementer chooses.
Of course.
> One bit can be defined as meaning "this is a surrogate ticket". A > surrogate ticket holder would be permitted to ask the server to create > a new reduced capability for the managed object.
Of course. This is my "do not duplicate" attribute.
> The new capability maximally would allow only those privileges that > were granted to the surrogate, allowing the surrogate independently to > delegate by issuing "non-owner" tickets based on its own capability. > > The surrogate capability might also permit the surrogate to name group > peers, fail-over alternates, etc. by transferring its "owner" ticket > if other factors allow this (see following). > > Because capabilities are kept private by the issuing server, > surrogate capabilities can be linked to the owner's capability, > allowing the owner to void delegate and/or surrogate tickets by > versioning/deleting the appropriate capability. The surrogate, of > course, can void delegate tickets by versioning/deleting its own > capability.
But that means the handler/server has to do all this work! "Remembering".
> Further having tickets encode who is authorized to use them permits > more restrictions, e.g., preventing delegates from enabling peers by > copying the ticket.
In my case, the copying has to be done *in* the kernel. All that's exposed is the "small integer" so an actor can't do squat with it.
> All versions of Amoeba's tickets specified a server (or service) ID - > the field wasn't sign protected because the ID might be a task > instance, but it allowed servers to immediately reject tickets they > couldn't possibly have issued.
In my case, the Handler/server never is *presented* the communication in the first place -- unless the path has been previously created by possession of the Handle. Just like you can't write to a file without a file descriptor having been created. The file server never sees your actions; they are blocked *in* the kernel.
> Later versions of the capability system widened tickets to include an > authorized user/group ID field protected by a 2nd crypt signature. > [And also enlarged the rights field.] > >> Kernel acts as initial gatekeeper. Implements communication and >> transport *mechanism* along with the "port capabilities" -- send >> and receive (plus others not discussed here). >> >> To each actor, you're always talking to the kernel -- your Handle >> resides *in* the kernel, the communication that it represents >> is implemented *by* the kernel, notifications, operations on >> those Handles, etc. >> >> Actor has no knowledge of who is backing the object (Handling the >> Handle). To it, everything LOOKS like a kernel interface. > > Understood. IDL based RPC mechanism.
Yes. And the IDL is just a convenience service provided to the developer. Sort of like the difference between using the native X API and one of the widget sets. (the latter just encapsulates the former)
>> This is different from Amoeba where actors are conscious of the >> fact that they are actually talking to other actors. > > Well, servers managing the objects anyway.
Yes. In my approach, you are always "talking" to the kernel, as *it* is responsible for validating (the communication portion) and implementing the actual RPC/IPC/kernel trap (*all* look the same to the actor).
>> In Amoeba, you could pass a capability from Task A to Task B >> using USMail (or whatever). The kernel didn't need to be involved! >> Or, if it was, it could just provide a *pipe* -- no real >> checking going on, there. >> >> Since my Handles are implemented *in* the kernel, the kernel >> has to be involved in every communication. But, this is what >> I want -- I don't want Task A to be able to *bother* Task B >> unless it has previously been authorized to do so! > > Does the kernel recognize DOS attacks on itself?
It currently doesn't. Nor do I see a need to do so in the future. Any attacks on services (Handlers) *through* the kernel (as the communication medium) just come out of the attacker's resource share. I.e., *your* timeslice is being consumed while the kernel is trying to determine if you are entitled to this action. If that's how you want to spend your time... <shrug> You could just as ridiculously spend it spinning in a tight while(1) {}! A "direct" attack (i.e., asking the kernel to perform an action that is known to be *backed* by the kernel itself) has the same net result. It's your dime; if you think this a wise way to spend it, then so be it! Of course, the *system* ends up losing performance because it's supporting a task that is "doing nothing productive". But, how does an independent agency make that distinction? Do I kill a task because it has tried to do something it isn't entitled to do? What if that ability has been revoked? Do I penalize the task for this? What if the only realistic recovery mechanism for an "unavailable (at this time)" resource is to "try, try again"? When do I decide the actor is attacking vs. normal behavior? Kernel tries REALLY HARD not to implement policy. Let the services and handlers make that definition AS BEFITTING THE APPLICATION (or portion thereof)
>>> [lotsa objects] reason Amoeba used wide tickets. >> >> But Amoeba also allowed persistence for capabilities. So, you >> *could* store a capability in the RDBMS alongside each of those >> thousands of email addresses! Or, one for every file on the >> disk (bullet server). > > Anyone could persist a *ticket* - but the referenced capability might > no longer exist when the ticket is presented for use: e.g., following > a restart or after version management performed by the capability > owner.
Yes. In my case, I don't support persistence of "Handles". I.e., they can't be created -- nor recreated -- from a store. Instead, everything (with the exception of bootstrap) is built dynamically and persists until explicitly killed/revoked *or* the system shuts down.
>> But, you don't have thousands of file descriptors (Handles!) in >> your code! You don't fopen(3C) every file in the file system >> when your program starts -- "just in case". Instead, you create >> fd's as you happen to need them and the kernel (in most OS's) >> keeps track of what *actual* file each pertains to. > > Ever see an image based OS? No files (or, at least, none the user can > perceive): just a virtual space containing "program" functions and > "document" data structures with a directory for finding things. All > "programs" and "documents" available at all times. > > Like working in a Lisp or Smalltalk system but extended to encompass > all activity.
But the actors don't hold "Handles" to all of those objects! E.g., I can have millions of files in the file store -- yet only need *dozens* of Handles to interact with the dozens of objects that are "live" at the present time. The Handler's role is to create "live" objects -- "however". If that means mapping some blocks on a disk to a particular Handle, so be it. If it means wrapping one of thousands of email addresses *in* an email_addr_t, likewise. The "problem" with my approach is that all of these things -- for the complete set of tasks executing on a host -- are contained in the kernel. Amoeba (et al.) allows the references to be moved *out* of the kernel into task-space (whether that's user-land or not). REGARDLESS OF WHETHER AN OBJECT IS LIVE OR NOT. One of the Mach problems, IMnsHO, was their desire/goal of trying to reimplement UN*X. So, any "impedance mismatches" between their model and the one used by the UN*X implementors was a performance or conceptualization "hit" (hence my deliberate choice of "impedance mismatch"). None of these things were "deal breakers" but they conspired to make it a bad fit, overall. I'm looking at the mechanisms in a different light. To address a different class of problems FROM THE START instead of trying to back-fill to an existing implementation. E.g., the Standard Library wasn't reentrant. Users had to take pains to preserve "static" members (thereby exposing bits that should have remained *hidden* within the library!). Or, functions had to be redefined to export these entities. In my case, I can implement the libraries as a *service* that you "connect to" ("load library"). That service can take it upon itself to instantiate thread-specific copies of all these statics. Without exposing any of this to the application. Of course, UNIX could do likewise! But, now the library had to be aware of the details of the process/thread model *in* UNIX. In my case, I just create a tuple binding the "connection" (handle) to its specific "statics" WITHIN the "library SERVER"!
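A sketch of that last point (hypothetical names; the strtok()-style saved pointer is just an illustrative example of would-be "static" state): the library server binds each client connection to its own copy of the statics, hidden from the client.

    /* Sketch: a "library server" keeps per-connection copies of what
       would otherwise be static library state, so nothing leaks
       between clients and nothing is exposed to them. */
    #include <stdlib.h>

    struct lib_statics {
        char *strtok_save;                  /* per-client hidden state      */
        int   lib_errno;
    };

    struct connection {
        int                 handle;         /* the client's Handle          */
        struct lib_statics *statics;        /* bound at "load library" time */
    };

    /* Client "loads the library": fresh statics, tied to the connection. */
    struct connection *lib_connect(int handle)
    {
        struct connection *c = malloc(sizeof *c);
        if (!c) return NULL;
        c->handle  = handle;
        c->statics = calloc(1, sizeof *c->statics);
        if (!c->statics) { free(c); return NULL; }
        return c;
    }

    /* Handle forfeited: the statics die with the connection. */
    void lib_disconnect(struct connection *c)
    {
        if (c) { free(c->statics); free(c); }
    }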
> Current NV memory based systems, e.g., for tablets, appear to work > similarly, but they are still perceptually "file" oriented.
An afterthought...

On 11/14/2013 3:57 PM, Don Y wrote:

>>> So, how are surrogates handled? E.g., send capability to X >>> and X wants to delegate some or all of it to Y. I.e., it can >>> create a new capability from a subset of its own (which Y can >>> then do for Z, etc.) but how do you track down all derived >>> capabilities (or, just not recycle "identifiers" so any stale >>> copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE) >> >> You don't track them down, you just invalidate them at the source.
The flip side of this is also important: -- how do you know when all outstanding "capabilities" (object, permission tuples) FOR A PARTICULAR "object" are "gone"? I.e., how does a Handler know that no one is interested in the "live" object any longer (if "nothing" is tracking "outstanding" Handles/tickets/capabilities?) How do you know when to free the resources set aside to implement/manage that object? Keep "delegated" capabilities in mind, as well. Ignoring these "niggly issues" is how smaller, simpler, faster kernels get their "performance edge". :(
Hi Don,

On Thu, 14 Nov 2013 15:57:27 -0700, Don Y <This.is@not.Me> wrote:


>E.g., "23" interpreted as a file descriptor in task (process) A can >refer to a particular pty. Passing "23" to some other task breaks the >association with that particular pty. "23" is *just* "23" -- nothing >more.
On many systems, that actually is not true - the descriptor is a global identifier for the referenced object. Opening the same object in separate processes you may discover that the descriptors all have the same value.
>OTOH, 0xDEADBEEF010204302893740 passed from Amoeba task A to Amoeba >task B carries "rights" with it. Encoded within the cryptographic >envelope!
So what? The envelope validates the contents. Encoded rights are useless if the envelope is broken or even just sent to the wrong server.
>> - "rights" are N-bit wide fields. The meanings of the bits are defined >> by the issuing server. > >What I've called "authorizations". Except there is no visible >"bit field" in my implementation. Each Handler decides how it wants >to implement a set of "authorizations".
Amoeba chose a public ticket representation specifically to facilitate distributed agents. The entire point of it was that agents be able to delegate modified rights on their own.
>> It's important to understand that Amoeba capabilities are, in fact, >> "versioned", though versioning is neither sequential nor readily >> predictable. > >This can just be seen as extending the namespace of the capability >in a manner that makes for easier management. I.e., so a Handler >(server) can opt to ignore "older" rights (because it can't vacuum >memory to find and remove all instances of a particular "ticket")
I think "extending" is the wrong term. A particular version of a capability may be seen as defining a name space in which its referencing tickets exist, but the effect of versioning the capability is to destroy one space (and everything in it) and to open another. Moreover, the capability itself exists within a name space which is not changed by versioning the capability.
>But, there is no direct *tie* to the original ticket from which >this one (subset) was created! (or the one before *that*; or the one >before *that*; etc.)
The *tie* is through the capability. The ticket is only a reference to it.
>So, when a ticket is presented, you can't look at the ticket and >decide that the ticket from which it was created was revoked and, >therefore, so should this one!
You're misunderstanding something. Tickets don't create tickets, servers do. You can't modify the rights of an existing ticket and copy it to someone else - it simply won't work. Creating a valid ticket requires knowledge of the capability's secret check (version) number, which only the server has. You have to present your ticket to the server holding the capability and request issue of a new ticket having the desired rights. The server will tell you to go to hell if you don't have the privilege to create new tickets. If you have the privilege, you may ask the server to create an *additional* capability for the original object that you can administer independently. This does not alter the original capability that created your ticket. Only the holder of an "owner" ticket can request the server to version the capability. That invalidates *every* ticket referencing the old version ... including the owner's own ticket! When versioning a capability, a server must immediately issue a new "owner" ticket in response.
>(hence versioning. But, now the handler/server needs to keep track >of which version is current for each outstanding ticket!)
No. A server needs to track only those (versions of) capabilities for which it intends to honor tickets. It does *not* need to retain any information regarding revoked capabilities.
>> At this point, it should be clear that every issued ticket is tied to >> a specific "version" of a capability by the capability's secret check >> number. If the capability is versioned - i.e. the check number >> modified- or if the capability record is deleted, then every ticket >> issued referencing that (no longer existing) capability is immediately >> rendered invalid. > >So, the Handler/server needs to effectively treat the version >as the ID of the actor to which a ticket is granted IF IT WANTS TO >REVOKE THOSE CAPABILITIES and only those WITHOUT AFFECTING OTHER >TICKET-HOLDERS.
To a 1st approximation, yes. A particular version of a capability is known by the pairing of its public identifier and private check number. A different version is, in a real sense, a different capability. However, it makes little sense to have multiple versions of a capability, all of which have the same public identifier. If you want simultaneously *valid* tickets to reference unique capabilities, you create simultaneous capabilities with unique identifiers. The multiple version issue is avoided and irrelevant.
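In sketch form (a variant of the earlier hypothetical types; random_num() is an assumed CSPRNG, not a real API): simultaneously-valid capabilities are independent (ID, check) pairs that merely reference the same object, so deleting one voids only the tickets issued against it.

    /* Sketch: several capabilities, each independently revocable,
       all referencing the SAME managed object. */
    #include <stdint.h>

    typedef uint64_t num;
    extern num random_num(void);            /* hypothetical CSPRNG          */

    struct object;                          /* the one managed object       */

    struct capability {
        num            id, check;           /* unique public ID, secret chk */
        unsigned       rights;
        struct object *obj;                 /* several caps may share this  */
    };

    /* Give each delegate group its own capability; revoking one group
       (deleting its capability) leaves the others' tickets untouched. */
    struct capability make_capability(struct object *o, unsigned rights)
    {
        struct capability c = { random_num(), random_num(), rights, o };
        return c;
    }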
>> Because capabilities are kept private by the issuing server, >> surrogate capabilities can be linked to the owner's capability, >> allowing the owner to void delegate and/or surrogate tickets by >> versioning/deleting the appropriate capability. The surrogate, of >> course, can void delegate tickets by versioning/deleting its own >> capability. > >But that means the handler/server has to do all this work! "Remembering".
In your case the kernel does all this work for _all_ of the servers.
>> All versions of Amoeba's tickets specified a server (or service) ID - >> the field wasn't sign protected because the ID might be a task >> instance, but it allowed servers to immediately reject tickets they >> couldn't possibly have issued. > >In my case, the Handler/server never is *presented* the communication >in the first place -- unless the path has been previously created by >possession of the Handle.
In Amoeba's case, the sender needs a ticket that says it can ride the communication channel. 6 of one ... It's easy enough to create separate channels for each service with unique tickets needed to access them.
>The "problem" with my approach is that all of these things -- for >the complete set of tasks executing on a host -- are contained in the >kernel. Amoeba (et al.) allows the references to be moved *out* >of the kernel into task-space (whether that's user-land or not). >REGARDLESS OF WHETHER AN OBJECT IS LIVE OR NOT.
But under your system, the kernel has to track capabilities for every live object even though it isn't *responsible* for them.
>One of the Mach problems, IMnsHO, was their desire/goal of trying to >reimplement UN*X. So, any "impedance mismatches" between their >model and the one used by the UN*X implementors was a performance >or conceptualization "hit" (hence my deliberate choice of "impedance >mismatch"). None of these things were "deal breakers" but they >conspired to make it a bad fit, overall.
That's a misconception ... I know someone who worked on the original implementation. The designers did not *want* to reimplement Unix - rather they believed that [recompile] compatibility with Unix was necessary for Mach to gain acceptance while a critical mass of native programs was being written. Mostly due to time pressure [Mach was developed on a grant] they took a disastrous shortcut. Realizing that they couldn't *quickly* implement a *full* Unix API translation on top of native services, they grabbed an existing BSD Unix kernel, modified it into a Mach server task and hacked the system libraries to RPC the "compatibility server". Their problem was that too much of the BSD kernel they used was serial non-reentrant code. Fixing it would have taken as much effort as doing a real API translation - effort they weren't prepared to give at the beginning. Running as one task among many, the kernel inside the compatibility server was painfully slow. Mach 3.0 brought out an almost complete, lightweight API translation library, and saw a complete rewrite of the compatibility service to offer only those few Unix things that had no analogue in Mach. But by then it was too late. Creating the compatibility service in the first place turned out to be Mach's fatal mistake. Because Mach could run Unix programs, few people bothered to port anything to the native API. And because Mach ran their Unix programs slowly, people judged the OS itself to be unworthy. [Which was unfair: native Mach programs were faster than equivalent Unix programs on the same hardware.] YMMV, George
On Thu, 14 Nov 2013 18:41:55 -0700, Don Y <This.is@not.Me> wrote:

>An afterthought... > >On 11/14/2013 3:57 PM, Don Y wrote: > >>>> So, how are surrogates handled? E.g., send capability to X >>>> and X wants to delegate some or all of it to Y. I.e., it can >>>> create a new capability from a subset of its own (which Y can >>>> then do for Z, etc.) but how do you track down all derived >>>> capabilities (or, just not recycle "identifiers" so any stale >>>> copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE) >>> >>> You don't track them down, you just invalidate them at the source. > >The flip side of this is also important: > -- how do you know when all outstanding "capabilities" > (object, permission tuples) FOR A PARTICULAR "object" are "gone"?
It doesn't matter - if the object is destroyed, all of the capabilities associated with it are destroyed. Capabilities only control access to an object, not its existence. They can be created and destroyed independently of the object.
>I.e., how does a Handler know that no one is interested in the "live" >object any longer (if "nothing" is tracking "outstanding" >Handles/tickets/capabilities?) How do you know when to free the >resources set aside to implement/manage that object?
When the owner destroys it. The handler should not care whether or not someone will try to access a nonexistent object in the future.
>Keep "delegated" capabilities in mind, as well.
Any signatory to a joint account can close it. If an object can have multiple owners, then any of them should be able to destroy the object and its associated capabilities (which should be automatic). What's important is to be able to distinguish owners from their agents if/when necessary.
>Ignoring these "niggly issues" i5 how smaller, simpler, faster kernels >get their "performance edge". :(
Too often I think you perceive complexity where none really exists. George
Hi George,

On 11/17/2013 5:22 AM, George Neuner wrote:
> On Thu, 14 Nov 2013 18:41:55 -0700, Don Y <This.is@not.Me> wrote: >> An afterthought... >> >> On 11/14/2013 3:57 PM, Don Y wrote: >> >>>>> So, how are surrogates handled? E.g., send capability to X >>>>> and X wants to delegate some or all of it to Y. I.e., it can >>>>> create a new capability from a subset of its own (which Y can >>>>> then do for Z, etc.) but how do you track down all derived >>>>> capabilities (or, just not recycle "identifiers" so any stale >>>>> copies eventually find their IDs invalid WHEN PRESENTED FOR SERVICE) >>>> >>>> You don't track them down, you just invalidate them at the source. >> >> The flip side of this is also important: >> -- how do you know when all outstanding "capabilities" >> (object, permission tuples) FOR A PARTICULAR "object" are "gone"? > > It doesn't matter - if the object is destroyed, all of the > capabilities associated with it are destroyed. > > Capabilities only control access to an object, not its existence. They > can be created and destroyed independently of the object.
Agreed. But, you missed the point of my question. Capabilities (Handles) are (object, permission) tuples. Each capability references an object. What happens when every reference to an object disappears? How does the Handler (server) BACKING the object know that all references to it (along with their particular permissions) have "disappeared"? How does it know that the resources that it has set aside to manage it can now be released for other uses?
>> I.e., how does a Handler know that no one is interested in the "live" >> object any longer (if "nothing" is tracking "outstanding" >> Handles/tickets/capabilities?) How do you know when to free the >> resources set aside to implement/manage that object? > > When the owner destroys it. > > The handler should not care whether or not someone will try to access > a nonexistent object in the future.
But the object hasn't been destroyed; just the last outstanding REFERENCE to this LIVE INSTANCE of it! Five actors hold Handles to a particular "file". Each has some set of permissions enabled by *their* particular Handle (capability). The Handler backing that file (i.e., file server) has set aside some resources to implement that live instance of that object. E.g., read and/or write buffers and/or a mmap()-ed view of the actual file-on-disk. Synchronization primitives within the server to ensure actions by actors are serialized in some predictable manner. How does the file server know when the last reference to this object disappears? I.e., the last actor holding a capability has terminated or died (unceremoniously). No one knows how many outstanding capabilities may still exist for that (Amoeba) object so the file handler never knows when it can forget about the object. [Rather than digress into a discussion about the bullet server and its oddities, replace "file" with "motor", above. How does the Motor Server know when it can afford to power down the translator/motor driver for a particular motor because no one has any OUTSTANDING live references to it? The motor still exists. A "non-live" reference to it still exists from which *live* references could be created in the future (e.g., from the soap server). It hasn't been "deleted" by its "owner". Just everyone has currently lost interest in it -- for the time being!] The email Handler instantiates a particular email_addr_t. It looks up the "human" representation of an email address in the RDBMS (i.e., that's a privilege that only *it* has). It copies this into memory somewhere inside itself and returns a Handle to some actor that allows that actor to do certain things to/with that email_addr_t. Of course, what really happens when the actor wants to do something with that email_addr_t is the email Handler is called upon to perform that particular action under the authority granted by that particular Handle (capability). Actor gives some subset of his permissions to another actor. One or both of them invoke actions (methods) on the email_addr_t through their respective Handles. Eventually, both "lose interest" in that particular email_addr_t ("close()" file). When the last such "open Handle" into a particular email_addr_t instance is released (closed), the Email Handler can free its resources set aside for that email_addr_t. To "close" an email_addr_t in my scheme, all you do is forfeit your Handle. Because the Handle is implemented in the kernel, it knows every such reference to the object. It can notify the Handler when/if an actor holding a Handle *dies* -- without previously having explicitly told the server that it was no longer interested in the object backed by that particular Handle! With Amoeba's tickets, *anyone* could hold a valid ticket for an object. AND BE DOING SO LEGITIMATELY! How does the server backing a particular object referenced by those N copies of that ticket know that it is NOW safe to free the resources set aside to implement that object? It has no way of knowing what N is at any instant. Or, when it goes to 0! It has to rely on an actor explicitly saying, "close" the object represented by this ticket -- and all other future references to that object that may come along. If the actor responsible for doing this dies, there's no one to clean up the zombie objects! You'd have to implement a keep-alive policy so the server could automatically "shut down" objects that haven't been referenced recently.
And, actors would have to deliberately "tickle" every object for which they hold tickets just to be sure those objects didn't get closed due to inactivity! [IIRC, this was how the bullet server dealt with the possibility. And, that was a situation where it would be relatively *easy* for a client (ticket holder) to be reasonably expected to tickle the object regularly -- as the object (file) had to be created locally before being sent to the bullet server for "commitment" to media. The same sort of GC was required of all "live" objects -- under policies defined by their services. E.g., each time the GC was invoked, any "untouched" objects (objects for which tickets had not been presented in the previous N GC cycles) were deinstantiated. If an actor happened to be too sluggish to use a ticket (or, was perhaps BLOCKED from doing so), then the object could go away! If the ticket's capabilities didn't allow him to recreate the object...]
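That GC/keep-alive policy is easy to sketch (hypothetical names; the idle threshold is an invented policy constant):

    /* Sketch of the keep-alive alternative: objects remember the sweep
       cycle in which a ticket was last presented; anything idle for
       more than N cycles is deinstantiated -- even if some slow (or
       blocked!) holder still has a perfectly valid ticket. */
    #include <stddef.h>

    #define MAX_IDLE_CYCLES 3               /* policy, per service          */

    struct gc_object {
        unsigned last_touched;              /* cycle of last ticket use     */
        int      live;
    };

    static unsigned gc_cycle;               /* advanced by each sweep       */

    /* Every validated ticket presentation "tickles" the object. */
    void touch(struct gc_object *o) { o->last_touched = gc_cycle; }

    /* The periodic sweep. */
    void gc_sweep(struct gc_object *objs, size_t n)
    {
        ++gc_cycle;
        for (size_t i = 0; i < n; i++)
            if (objs[i].live &&
                gc_cycle - objs[i].last_touched > MAX_IDLE_CYCLES)
                objs[i].live = 0;           /* deinstantiate                */
    }

The failure mode described above is visible in the code: a holder that cannot reach touch() in time loses the object despite holding a valid ticket.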
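By contrast, the exact bookkeeping described above -- possible only because every Handle lives in a kernel table -- might look like this sketch (all names hypothetical):

    /* Sketch: exact reference counting.  The Handler gets its "last
       reference gone" upcall even when a holder dies without closing
       anything, because the kernel reaps the dead task's table. */
    #include <stddef.h>

    #define MAX_HANDLES 64

    struct handler;                       /* the service backing an object  */

    struct kobject {
        struct handler *backer;
        unsigned        refs;             /* live Handles, counted exactly  */
    };

    /* Upcall into the Handler: free whatever backs this object. */
    extern void handler_last_ref_gone(struct handler *, struct kobject *);

    struct task {
        struct kobject *handles[MAX_HANDLES];
    };

    static void kobject_unref(struct kobject *o)
    {
        if (o && --o->refs == 0)
            handler_last_ref_gone(o->backer, o);
    }

    /* Explicit forfeit ("close")... */
    void handle_close(struct task *t, int h)
    {
        if (h < 0 || h >= MAX_HANDLES) return;
        kobject_unref(t->handles[h]);
        t->handles[h] = NULL;
    }

    /* ...and the kernel reaper does the same for a task that died
       unceremoniously, so zombie objects cannot accumulate. */
    void reap_task_handles(struct task *t)
    {
        for (int i = 0; i < MAX_HANDLES; i++)
            handle_close(t, i);
    }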
>> Keep "delegated" capabilities in mind, as well. > > Any signatory to a joint account can close it. If an object can have > multiple owners, then any of them should be able to destroy the object > and it's associated capabilities (which should be automatic). > > What's important is to be able to distinguish owners from their agents > if/when necessary.
Again, you're missing the point. How does the bank know when both account holders have DIED?? (bank is a bad example because there are undoubtedly laws governing this) I.e., they can't just *spend* the monies in that account -- cuz either account holder may show up to claim them! Do they put the monies in a box for all eternity? JUST IN CASE someone shows up with a valid credential 50 years hence? Instead, they garbage collect. Accounts that haven't been referenced in N years are automatically closed (and monies go ???). Or, mailed statements that are returned by USPS as "undeliverable" trigger similar response. I.e., you have to have some periodic activity that FORCES TICKET HOLDERS to show that their tickets (capabilities) are still "of interest".
>> Ignoring these "niggly issues" i5 how smaller, simpler, faster kernels >> get their "performance edge". :( > > Too often I think you perceive complexity where none really exists.
<grin> In this case, you appear to have overlooked some complexity! :> IIRC, the Hurd people went through a similar "challenge" when they looked at moving to L4. It, being one of those "smaller, simpler, faster" kernels, didn't provide the same sorts of mechanism that Mach afforded so, trying to *emulate* those behaviors *on* L4 ended up making the L4 implementation as sluggish as the Mach approach! You always trade away something as you move down in complexity. I can do a context switch in near zero time -- *if* I don't have to preserve any process state!! :> (while this *sounds* ridiculous, you can actually be very effective in creating applications with this model! But, you have to be very disciplined, as well -- cuz *it* doesn't do much FOR you!) Tea time... Then, The Pork Dish! (yummmm!)
