On 6/30/2020 2:45 AM, George Neuner wrote:
> On Mon, 29 Jun 2020 23:24:14 -0700, Don Y
> <blockedofcourse@foo.invalid> wrote:
>
>>> On 6/27/2020 10:01 PM, George Neuner wrote:
>
>> I'd prefer the VMM system to be based on "variable sized pages" (akin
>> to segments) as you can emulate "variable sized protection zones" as
>> collections of one or more such "pages". Though I don't claim to need
>> "single byte resolution" on such page sizes.
>
> Mill allocates by cache lines (though what that means is model
> dependent). Is that a granularity you could live with?
That depends. At the end of the day, the cache determines performance,
so anything finer seems wasted.
But, I'm not concerned with performance as much as functionality;
I'd like to be able to do vm_allocate()s in the same way that I can
build buffer pools -- nothing prevents me from building 48-byte buffers
so why shouldn't I (in a world where hardware could do whatever I wanted)
be able to create 48-byte "pages"? Then, a 4100-byte page to hold this
executable module and a set of 1000 9000-byte pages to hold jumbo packets?
I.e., I'm not artificially constrained by what I can do in software...
just by what the hardware will accommodate to match my needs.
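Something like this, say (a sketch only -- vm_allocate_sized() is
invented; no current MMU backs it):

    #include <sys/mman.h>    /* just borrowing the PROT_* bits */
    #include <stddef.h>

    /* Hypothetical: allocate a protection-granular "page" of exactly
     * the size the object needs, the way a buffer pool would.
     */
    void *vm_allocate_sized(size_t bytes, unsigned prot);

    void example(void) {
        void *buf    = vm_allocate_sized(48,   PROT_READ|PROT_WRITE);
        void *module = vm_allocate_sized(4100, PROT_READ|PROT_EXEC);
        void *jumbo  = vm_allocate_sized(9000, PROT_READ|PROT_WRITE);
        (void)buf; (void)module; (void)jumbo;
    }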
>> But, the present trend towards larger page sizes renders them less
>> useful for many things. E.g., the 512B VAX page would be an oddity
>> in today's world -- I think ARM has some products that offer "tiny"
>> 1KB pages but even those are off-the-beaten track.
>
> Anything under 4KB is an oddball today. Some 64-bit chips have 1GB
> pages now ... handy for huge databases and science simulations but not
> much else.
Exactly. And, even if you have a 1GB object, there's no guarantee that you
would want to dedicate resources to having it completely mapped at any given
time. I.e., if you only want to map a quarter of it, you have to move to
smaller page sizes --> more levels of page tables.
(presumably, you could discipline your software to only access parts that it
KNOWS are mapped... but, doesn't that sort of defeat the purpose?)
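To put numbers on it (x86-64-ish sizes, purely for illustration):
mapping just a quarter -- 256MB -- of that 1GB object works out to

    256MB / 2MB pages =   128 entries  (one extra level of tables)
    256MB / 4KB pages = 65536 entries  (two extra levels)

...versus a single entry had the whole 1GB been mapped by one huge page.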
>>> So long as segments are only used as
>>> protection zones within the address space, they can overlap in any
>>> way.
>>
>> Then you need a means of resolving which segment has priority at a particular
>> logical address. Do you expect that to be free?
>
> Yes, and the right way to do that is to maintain a segment stack:
> e.g.,
>
> - the whole process space
> - the process data space
> - some heap
> - :
> - this 100 byte buffer
>
> etc., with the protection being the intersection of all permissions
> present in the stack. Since this naturally coincides with program
> scopes, the API only needs to manipulate the top of the stack.
But you need this for every disjoint 100 (or 300!) byte buffer
(or similar object managed as a segment). I.e., every accessible
segment has to be visible/resolvable in that structure in order
to know what ACLs apply to it, NOW.
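For concreteness, a minimal sketch of the sort of structure George
describes (names are mine, not the Mill's) -- note that every live,
disjoint buffer needs its own entry here:

    #include <stdint.h>
    #include <stddef.h>

    /* Nested segments; effective rights at an address are the
     * intersection (AND) of every entry that covers it.
     */
    struct seg {
        uintptr_t base;
        size_t    len;
        unsigned  perm;          /* R/W/X permission bits */
    };

    struct seg seg_stack[8];     /* "8 entries likely is overkill" */
    int        seg_top;

    unsigned effective_perm(uintptr_t addr) {
        unsigned p = ~0u;        /* start fully permissive */
        for (int i = 0; i <= seg_top; i++)
            if (addr - seg_stack[i].base < seg_stack[i].len)
                p &= seg_stack[i].perm;
        return p;
    }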
> For most programming a segment stack needs only a handful of entries.
> Even 8 entries likely is overkill for the most demanding cases.
>
> But this is a protection mechanism separate from address space
> allocation, which can be done by pages of any convenient size.
>
>>>> The advantage that fixed size (even if there is a selection of sizes
>>>> to choose from) pages offers is each page has a particular location
>>>> into which it fits. You don't have to worry that some *other* page
>>>> partially overlaps it or that it will overlap another.
>>>
>>> You do if the space is shared.
>>
>> So, all processes share a single address space? And, segments resolve
>> whether or not the "current process" can access the particular segment's
>> contents? How does that scale?
>
> No, I mean if the same location is shared between processes [or even
> threads in your model] that are using different page sizes.
[threads exist in a shared container, so they always have the same address space]
But nothing CAN overlap it that isn't intended to be accessible
in a given process. E.g., if "foo" resides in a 16K page in process A
and an 8K portion of that same physical memory is mapped into process B,
then A and B can each access foo -- at potentially different logical
addresses. The "other" 8K of the 16K that is accessible in A need not
be mapped in B -- some other (8K) page can appear in that relative
location.
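In POSIX terms, roughly (the shm object name and the sizes are just
for the example; error handling elided):

    #include <sys/mman.h>
    #include <fcntl.h>

    /* In A: map the full 16K -- "foo" plus the "other" 8K. */
    void *map_in_A(void) {
        int fd = shm_open("/foo_obj", O_RDWR, 0600);
        return mmap(NULL, 16384, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    }

    /* In B: map only the 8K holding "foo", at whatever logical address
     * suits B; the adjacent logical addresses remain free for some
     * other, unrelated page.
     */
    void *map_in_B(void) {
        int fd = shm_open("/foo_obj", O_RDWR, 0600);
        return mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
    }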
>> If I want a particular process to have access to the payload of a network
>> packet, do I have to create a "subsegment" that encompasses JUST the payload
>> so that I can leave the framing intact -- yet inaccessible? If that process
>> tries to access that portion of the packet that is "locked", it gets
>> blocked -- and doesn't that leak information (i.e., the fact that there IS
>> framing information surrounding the packet implies that it wasn't sourced
>> by a *local* process)?
>
> A subsegment of the program's space certainly. How deep to make the
> hierarchy largely is up to the programmer and how much she wants the
> hardware to check.
We're talking about hypothetical hardware so there's no reason it shouldn't
"do it all"!
> Yes, it may reveal there is something the programmer can't look at. So
> what? People have been frustrated by locked doors for thousands of
> years. Besides which, your system is open source, so anybody can go
> in, peek behind the curtain, and potentially remove any restrictions
> they don't like.
Wrong point.
You don't want the code -- at run time -- to be able to deduce anything that
isn't explicitly disclosed to it (FOSS just makes this more damning).
>> [By contrast, if I could create "memory units" that were of any particular
>> size, I'd create one that was sized to shrink-wrap to the payload with another
>> that encompassed the entire packet. The first would be mapped into the
>> aforementioned process's address space IMMEDIATELY ADJACENT to any other
>> packets that were part of the message (as an example). The larger unit
>> would be mapped into the address space of the process that was concerned with
>> the framing information.]
>
> I don't see how that's an improvement. Unless you provide byte sized
> "pages" then mapping over an existing data structure - e.g., to copy
> it - potentially will leak stuff at the ends.
Yes. So pages that are typically considerably larger than the sorts of buffer
you are inclined to use will tend to leak MORE.
You can work-around this by scrubbing pages (and buffers) after use. But,
you still have to rely on discipline to ensure a buffer doesn't get rewritten
before being considered "done" (when it can be scrubbed).
E.g., I scrub all "messages" at the end of each RPC and return them to the
"page pool" to ensure nothing leaks between uses. So, pages that are
significantly larger than what is needed for a message represent a wasted
effort (you have to scrub the whole page because you don't know if
the callee scribbled something on it).
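Schematically (scrub_and_release() and the pool call are my own,
hypothetical, names):

    #include <string.h>

    #define PAGE_SIZE 4096               /* whatever the hardware gives */

    extern void page_pool_free(void *);  /* hypothetical pool return */

    void scrub_and_release(void *page) {
        /* The callee may have scribbled ANYWHERE in the page, so all
         * of it gets scrubbed -- the waste scales with page size, not
         * message size.
         */
        memset(page, 0, PAGE_SIZE);
        page_pool_free(page);
    }

(In practice you'd want something like explicit_bzero() so the
compiler can't optimize the scrub away.)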
But, I can't guarantee that pages passed out-of-band won't leak information
(though I'm working on an architectural change to address that).
>> I wonder how "big" most processes are (in c.a.e products) -- assuming
>> the whole process can be mapped into a single contiguous page?
>>
>> Conversely, I wonder how many "smaller objects" need to be moved between
>> address spaces in such products (assuming, of course, that they operate
>> under those sorts of protection mechanisms)?
>
> I suspect many are single applications in a single address space.
By contrast, I've been working with disjoint address spaces for
much of my career (though usually not with hardware protection of
those address spaces). E.g., bank-switching TEXT and BSS/DATA/STACK
on a per-task basis so each task appears to have its own address space
separated from (most of) the other tasks (if a task is small enough,
it can share an address space with another task(s)) while the kernel
hides in another "hidden" bank.
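Schematically (the bank-select latches and their addresses are
invented, but this is the shape of it):

    #include <stdint.h>

    /* Per-task bank selects: a context switch repoints the mapping
     * hardware so the incoming task sees ITS code and data; the
     * kernel sits in a bank never selected for a user task.
     */
    struct task {
        uint8_t text_bank;    /* physical bank holding its TEXT */
        uint8_t data_bank;    /* ...and its DATA/BSS/STACK      */
        /* saved registers, etc. */
    };

    #define BANK_SEL_TEXT (*(volatile uint8_t *)0xFF00)  /* invented */
    #define BANK_SEL_DATA (*(volatile uint8_t *)0xFF01)  /* invented */

    void bank_switch_to(const struct task *t) {
        BANK_SEL_TEXT = t->text_bank;
        BANK_SEL_DATA = t->data_bank;
        /* restore CPU state and resume the task */
    }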
But, even those COULD benefit from some resource reclamation -- if
executing out of RAM (loaded from FLASH). E.g., reclaiming the memory
that had been used for initialization (i.e., if you need a KB or so to
set things up, then you could reuse that KB for your pushdown stack
or run-time buffers).
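E.g., with a GCC-style section attribute (section/symbol names
illustrative; pool_add() is hypothetical):

    #include <stddef.h>

    /* One-shot setup code lives in its own section... */
    __attribute__((section(".init_once")))
    void hw_setup(void) {
        /* bring up clocks, peripherals, fill tables, ... */
    }

    /* ...and once it has run, that RAM is handed to the run-time.
     * The bracketing symbols come from the linker script.
     */
    extern char __init_once_start[], __init_once_end[];
    extern void pool_add(void *base, size_t len);

    void reclaim_init_ram(void) {
        pool_add(__init_once_start,
                 (size_t)(__init_once_end - __init_once_start));
    }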
>> Regardless (or, "Irregardless", as sayeth The Rat!), I'm stuck with
>> the fixed size pages that vendors currently offer. So, there can't
>> be any "policy" inherent in the crafting of my code as it can't know
>> whether it will be able to avail itself of "tiny" pages or if it
>> will be packaged in a more wasteful container.
>
> Then what was the point of *this* discussion? Exercise?
Indicating why I think variable-sized pages are of value.
E.g., the "architectural change" that I alluded to, above, will
effectively emulate the variable-sized pages that I desire.
But it will do so at the expense of CPU cycles.
<shrug>
I've adopted that philosophy throughout the design -- performance
always improves (for a given cost) so why not "spend" it on features
and mechanisms that make coding easier and more robust? Christ, the
system will STILL spend most of its time twiddling its thumbs!
It's the same sort of reasoning that lets my processes decide which
of their pages to "swap out" instead of letting the kernel make those
decisions blindly. (it's just more opcode fetches!)
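The shape of that upcall, roughly (all of the names are mine):

    #include <stddef.h>

    typedef struct { void *addr; size_t len; } region_t;

    static region_t scratch;   /* regenerable scratch, set up elsewhere */

    /* Hypothetical kernel upcall under memory pressure: the process
     * nominates pages it can best afford to lose RIGHT NOW -- it
     * knows what's cheap to regenerate; the kernel would just guess.
     */
    int pick_victims(size_t needed, region_t *out, int max) {
        (void)needed;
        if (max < 1)
            return 0;
        out[0] = scratch;
        return 1;              /* count of regions offered */
    }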
[But, using 4K -- or larger -- pages for 500-byte objects just
reeks of waste.]
I can't purchase a battery-backed, solar-powered 120-port network
switch with PoE (2000W) and PTP support (along with protection
against malevolent actors trying to physically damage the switch
via exposed connectors), but I can EMULATE one using COTS parts
(and leverage that as an opportunity to add *other* value!).
"Some day" the hardware (CPU, switch) will move in a direction that is
more accommodating than present day. If not, my current hardware
will only end up FASTER which means the current implementation
will just more closely emulate (performance-wise) that conceptual
hardware that might have been available "today"!
[Time for C's morning walk -- while it's still < 90F.]