Reply by Dombo December 27, 2016
On 21-Nov-16 at 23:31, Clifford Heath wrote:
> On 14/11/16 05:34, Don Y wrote:
>> Is there a *simple* means (I've already found a complicated one)
>> of noting the changes made to the registry by installing X vs. Y?
>>
>> (Consider the different cases: install X, then Y vs. Y then X)
>
> This has become an extraordinarily long thread for something
> that is answered by a ten-second Google search:
>
> <https://www.google.com.au/search?q=site%3Amsdn.microsoft.com+track+registry+changes>
My impression of Don Y is that when he posts a question here he is not
interested in the answer but just wants to have a conversation; one
doesn't get that from a Google search.
Reply by Clifford Heath November 21, 2016
On 14/11/16 05:34, Don Y wrote:
> Is there a *simple* means (I've already found a complicated one)
> of noting the changes made to the registry by installing X vs. Y?
>
> (Consider the different cases: install X, then Y vs. Y then X)
This has become an extraordinarily long thread for something that is
answered by a ten-second Google search:

<https://www.google.com.au/search?q=site%3Amsdn.microsoft.com+track+registry+changes>

Clifford Heath.
Reply by Don Y November 21, 2016
Hi Ed,

On 11/21/2016 5:52 AM, Ed Prochak wrote:
> On Tuesday, November 15, 2016 at 8:02:04 AM UTC-5, Don Y wrote:
> []
>> I suspect you'd have the same sort of problem I'm currently
>> encountering with the Windows Registry: how could you
>> "painlessly" track changes made to your "global store",
>> "who" (which process) made them AND easily "undo" them.
>>
>> I've been amused to discover that this *is* possible under
>> Windows; but, much harder under my formal RDBMS implementation!
>> Especially the "undo them" (well, maybe I could build a giant
>> "transaction" around everything but I'm willing to bet the
>> RDBMS would die when faced with an open-ended issue like that!)
>
> Since you are working in an RDBMS, I am surprised you have not yet
> tried a transaction history. About 15 years ago, I was contracting
> on an Oracle project. The system build managed a set of transaction
> tables that logged who, what, and when changes were made.
Yes, but a transaction log consumes resources -- FAST! (Fine if you
have lots of spinning media or tertiary storage that you can call upon;
not so fine when your data store is sized for the data that it
*stores*! :> )

To be clear, I've only considered how I would tackle it in my RDBMS
*after* having had to face the problem in Windows' Registry ("Gee, how
would I do this in *my* system?").

The Registry is relatively dense (at least in its "on-disk" form), so
it's conceivable to take a snapshot on Day X and compare against another
snapshot on Day Y. The individual actions that caused it to transit
from X to Y (through a variety of other states) wouldn't be apparent.
But, you could easily see what was added/removed/changed.

As I use the RDBMS as my sole persistent store (i.e., IN PLACE OF a
filesystem), that snapshot is a lot "bigger" in my case. And, it
carries a lot of metadata that isn't really known (a priori) to be of
use in that eventual comparison. (OTOH, as the RDBMS relies on it for
its operation, it *may* well be! So, you can't dismiss it outright.)
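For what it's worth, a minimal sketch of that snapshot-and-compare idea
in SQL -- the settings table and its columns are hypothetical stand-ins
for whatever the store actually holds:

  -- Capture the store's state on two different days.
  CREATE TABLE snapshot_x AS SELECT key, value FROM settings;   -- Day X
  -- ...installs, reconfigurations, etc. happen in between...
  CREATE TABLE snapshot_y AS SELECT key, value FROM settings;   -- Day Y

  -- Entries added or changed between X and Y:
  SELECT * FROM snapshot_y EXCEPT SELECT * FROM snapshot_x;

  -- Entries removed or overwritten:
  SELECT * FROM snapshot_x EXCEPT SELECT * FROM snapshot_y;

As noted, this shows the net delta only; the intermediate states are
lost.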
> We were able to use that tracking storage for the work I was doing,
> which was "fixing" data errors. The initial data populated into the
> database was pretty bad quality, from an old ISAM-style system --
> mainly address data, and rural addresses at that. So we iteratively
> identified address corrections, applied them, and then found where a
> change caused other issues; we could back out the whole change, or
> even just subsets.
But, it's unlikely that each changed datum saw *multiple* changes (?).
I.e., you found a faulty "address" and fixed it. It's not like you then
decided that the residence had been demolished -- or, placed on a
flat-bed truck and relocated two blocks farther south... So, a
before-and-after comparison would summarize all of your "net" changes
(though without any chronological history or audit trail).
> It does not necessarily have to cost a lot more storage: a binary
> mapping of what changes in a row of your working table, and a
> separate table of those changed columns. Time-stamps and user IDs and
> such can be in the tracking tables also.
>
> I am guessing the RDBMS exists on larger nodes in your system that
> have more resources. But if you are resource limited, then your
> options are limited also and this may not be a possible option. But
> I hope it helps.
I have exactly one RDBMS "server" -- the rest of the nodes are 100%
volatile. The whole point was to ensure that "stuff" wasn't being
stored in a bunch of distributed nodes, which could complicate
reconfiguration and/or replacement of individual nodes ("Wait! Don't
throw that node out, yet -- I've got to extract some data from it...").

My Windows solution really only needed to know before-after deltas. So,
any mechanism I used to achieve that overall goal would be functionally
equivalent (before-after snapshots, incremental logging, etc.). I just
wanted to know if applications X and Y were in conflict with each other
in any particular way (i.e., if they both tried to manipulate some
setting and "last change wins").

Thinking about how to do this with the RDBMS requires a definite idea
of *what* the goal is to be: just noticing aggregate deltas can tell me
that the DB is in a state that I hadn't expected -- but with no idea as
to HOW it got into that state (I can look at the ACLs to see how it
MIGHT have made it into that state, but wouldn't have a definitive
answer).

Knowing that <something> specific has moved to an unexpected/undesired
state, I can add mechanisms to filter any changes and only log the
details of actions that affect that <something>. But, I then have to
hope the action is repeatable. If, for example, a user deliberately
changed a setting by some action of *choice*, I have no way of ensuring
he will make that same choice at a future date.

For example, if an executable's image is changed, that *should* only be
the result of an intentional "software update" -- not some rogue
process scribbling in the middle of such an object. As those sorts of
changes are "significant" (and infrequent), logging details about them
is prudent. OTOH, logging changes to the OGM for the answering machine
is probably a waste of resources...
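A minimal sketch of that kind of selective watch as a PostgreSQL
trigger -- the file_associations table, its handler column, and the log
table are all hypothetical:

  -- Log changes to one specific, "significant" setting; ignore the rest.
  CREATE TABLE change_log (
      changed_at  timestamptz DEFAULT now(),
      changed_by  text        DEFAULT current_user,
      old_value   text,
      new_value   text
  );

  CREATE FUNCTION log_handler_change() RETURNS trigger AS $$
  BEGIN
      -- Only record the details when the watched column actually moves.
      IF NEW.handler IS DISTINCT FROM OLD.handler THEN
          INSERT INTO change_log (old_value, new_value)
          VALUES (OLD.handler, NEW.handler);
      END IF;
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER watch_handler
      AFTER UPDATE ON file_associations
      FOR EACH ROW EXECUTE PROCEDURE log_handler_change();

Everything else in the store changes without cost; only the watched
column pays for logging.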
Reply by Ed Prochak November 21, 2016
On Tuesday, November 15, 2016 at 8:02:04 AM UTC-5, Don Y wrote:
[]
> I suspect you'd have the same sort of problem I'm currently
> encountering with the Windows Registry: how could you
> "painlessly" track changes made to your "global store",
> "who" (which process) made them AND easily "undo" them.
>
> I've been amused to discover that this *is* possible under
> Windows; but, much harder under my formal RDBMS implementation!
> Especially the "undo them" (well, maybe I could build a giant
> "transaction" around everything but I'm willing to bet the
> RDBMS would die when faced with an open-ended issue like that!)
Hi Don,

Since you are working in an RDBMS, I am surprised you have not yet
tried a transaction history. About 15 years ago, I was contracting on
an Oracle project. The system build managed a set of transaction tables
that logged who, what, and when changes were made.

We were able to use that tracking storage for the work I was doing,
which was "fixing" data errors. The initial data populated into the
database was pretty bad quality, from an old ISAM-style system --
mainly address data, and rural addresses at that. So we iteratively
identified address corrections, applied them, and then found where a
change caused other issues; we could back out the whole change, or even
just subsets.

It does not necessarily have to cost a lot more storage: a binary
mapping of what changes in a row of your working table, and a separate
table of those changed columns. Time-stamps and user IDs and such can
be in the tracking tables also.

I am guessing the RDBMS exists on larger nodes in your system that have
more resources. But if you are resource limited, then your options are
limited also and this may not be a possible option. But I hope it
helps.

Ed
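A minimal sketch of tracking tables of that shape (all names are
hypothetical; the actual Oracle schema isn't shown in the thread):

  -- Working table, plus a parallel history table recording who/what/when.
  CREATE TABLE addresses (
      id     integer PRIMARY KEY,
      street text,
      city   text
  );

  CREATE TABLE addresses_history (
      row_id      integer,     -- which row changed
      column_name text,        -- which column changed
      old_value   text,
      new_value   text,
      changed_by  text,
      changed_at  timestamp,
      batch_id    integer      -- groups one correction run so the whole
                               -- change (or a subset) can be backed out
  );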
Reply by Don Y November 18, 2016
On 11/13/2016 11:34 AM, Don Y wrote:
> Is there a *simple* means (I've already found a complicated one)
> of noting the changes made to the registry by installing X vs. Y?
>
> (Consider the different cases: install X, then Y vs. Y then X)
Well, the simplest solution is to install X in a sandbox and trap the registry (as well as filesystem) changes. Then, Y in a sandbox *within* that sandbox for similar reasons. Dump both sandboxes and repeat, swapping the order of X and Y.
Reply by Clifford Heath November 16, 2016
On 17/11/16 04:33, Don Y wrote:
> On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
>> On 16.11.2016 04:30, Clifford Heath wrote:
>>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>
>>> What it seems you need is the kind of reliable storage that
>>> SQLite provides
>>
>> Sort of yes, but I am more after a hierarchical thing, like a
>> directory tree. I guess I'll be fine as I have made it; if I have to
>> take steps to compress it further than it is now, I know how to do it.
>> But from your feedback - and that of Don - it seems the extra few bytes
>> spilled per entry does not bother you much.
>
> In the case of PostgreSQL, it's not JUST an "extra few bytes"! :<
> There's a fair bit of (data) overhead that gets dragged in
> with the data. Especially if you want to optimize access to
> that data or tie it to other data (primary/foreign keys).
>
>>> ... - and does so better than *anything* else you'll
>>> find, with a code size that is smaller than anything of comparable
>>> reliability.
>>
>> I did not read enough to get to the code size, could you please
>> post some figure? Just for reference, I'd be curious to know, and
>> other people reading this might be as well.
They have an entire web page for that - 2nd hit in Google: <https://www.sqlite.org/footprint.html>
> I think about a quarter megabyte -- depending on which features
> you include (exclude).
A quarter (all OMIT options) up to half a megabyte (no OMITs).

Yes, it's quite a lot. Yes, it would be possible to do the same in a
much smaller library. No, no-one has actually done that in a
widely-used, well-tested library. db/ndb comes close.

FWIW, almost all such systems since 1990 were built by following the
instructions from just one book - "Transaction Processing: Concepts and
Techniques" by Gray and Reuter. Incredibly influential book. This kind
of reliability requires techniques that are very non-obvious - much
more so if you want to maximise concurrent processing - and those
techniques were closely-guarded trade secrets until this book was
published. I can highly recommend it to anyone interested in making any
composite action appear to be atomic (this is the core idea of a
"transaction").
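For readers who haven't met the idea, a minimal SQL illustration of a
composite action made to appear atomic (the accounts table is
hypothetical):

  -- Both updates take effect together, or neither does -- even if the
  -- process dies between them.
  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE owner = 'alice';
  UPDATE accounts SET balance = balance + 100 WHERE owner = 'bob';
  COMMIT;    -- or ROLLBACK; to undo both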
> For PostgreSQL, that number increases about 5-fold.
Postgres is not suitable for this. It requires periodic "vacuum"ing and other maintenance.
> E.g., do you have to support concurrent readers/writers and guarantee
> atomic access? Or, can you afford to do that in a "wrapper" around
> the "global store" (a "monitor", of sorts).
SQLite uses a global lock, so each transaction is single-threaded.
That's why it's so much smaller than, e.g., Postgres. However, it's
still *really* hard to guarantee atomicity across failure restarts. The
method that gives you this also gives you roll-back for free.
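A minimal SQLite sketch of that dual use (the settings table is
hypothetical): the journal that makes a transaction atomic across a
crash is the same machinery that services ROLLBACK.

  PRAGMA journal_mode = DELETE;   -- classic rollback-journal mode

  BEGIN;
  UPDATE settings SET value = 'new' WHERE key = 'handler';
  -- A power failure here leaves the journal on disk; on restart SQLite
  -- replays it and the UPDATE disappears -- exactly what ROLLBACK does:
  ROLLBACK;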
> Sizewise, your approach will almost always "win" -- because you're just
> special-casing the filesystem code (e.g., to handle your namespace).
But you're then *totally* reliant on the filesystem to provide atomicity across unexpected restarts - and that is almost certainly not the case.
> By contrast, SQLite/PostgreSQL don't really use the underlying file
> system for anything more than "bulk storage".
And even as "bulk storage" those are significant possible points of
failure. Consider that a database "page", which might be 8K, is made up
of multiple sectors, and a power failure can result in a block that has
been only partly written (so you get a so-called "torn page").
Torn-page detection involves writing a page checksum to the start and
end of each page, and checking both against the actual data on every
read.

Drives (including SSDs) do so much caching and write rescheduling that
you cannot really know which "completed" writes have actually
completed. Newer drives provide special modes and operations which can
be used to increase reliability for a DBMS, in addition to global
"flush all writes" type features.

Clifford Heath
Reply by Don Y November 16, 2016
On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
> On 16.11.2016 04:30, Clifford Heath wrote:
>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> What it seems you need is the kind of reliable storage that
>> SQLite provides
>
> Sort of yes, but I am more after a hierarchical thing, like a
> directory tree. I guess I'll be fine as I have made it; if I have to
> take steps to compress it further than it is now, I know how to do it.
> But from your feedback - and that of Don - it seems the extra few bytes
> spilled per entry does not bother you much.
In the case of PostgreSQL, it's not JUST an "extra few bytes"! :<
There's a fair bit of (data) overhead that gets dragged in with the
data. Especially if you want to optimize access to that data or tie it
to other data (primary/foreign keys).
>> ... - and does so better than *anything* else you'll
>> find, with a code size that is smaller than anything of comparable
>> reliability.
>
> I did not read enough to get to the code size, could you please
> post some figure? Just for reference, I'd be curious to know, and
> other people reading this might be as well.
I think about a quarter megabyte -- depending on which features you
include (exclude). For PostgreSQL, that number increases about 5-fold.
[Given that "investment", I try to leverage as much functionality out
of the RDBMS as is conceivable!]

But, these are addressing different sorts of problems. I suspect an
even "simpler" (name,value) store (e.g., ndbm) would be considerably
smaller -- and less feature-full.

E.g., do you have to support concurrent readers/writers and guarantee
atomic access? Or, can you afford to do that in a "wrapper" around the
"global store" (a "monitor", of sorts)? Do you have to support
"transactions" and be able to roll back operations on the store based
on conflicts encountered with "later" operations in that same
transaction? Etc.

Sizewise, your approach will almost always "win" -- because you're just
special-casing the filesystem code (e.g., to handle your namespace). If
you wanted to ensure no two clients (threads/processes) could compete
for a particular "setting/parameter", you could implement that
mechanism with simple file locking: pend on lock, make change (to value
or directory), release lock. Etc.

By contrast, SQLite/PostgreSQL don't really use the underlying file
system for anything more than "bulk storage".
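As an aside, that pend/change/release pattern has a direct analogue
inside PostgreSQL -- advisory locks. A sketch, with an arbitrary lock
key and a hypothetical settings table:

  -- Serialize competing writers around one logical setting.
  SELECT pg_advisory_lock(42);                              -- pend on lock
  UPDATE settings SET value = 'new' WHERE key = 'volume';   -- make change
  SELECT pg_advisory_unlock(42);                            -- release lock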
Reply by Don Y November 16, 2016
On 11/16/2016 4:43 AM, Dimiter_Popoff wrote:
> On 16.11.2016 05:44, Don Y wrote:
>> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>>
>>>>> In dps a directory can just be copied like any other file
>>>>> and it will still point to the correct file entries at that
>>>>> moment; it is not done automatically but I have done it in
>>>>> the past. Then one can copy the directory file and set the
>>>>> type of the copy to be non-directory so it won't confuse
>>>>> the system by its duplicate pointers etc.
>>>>> It is probably quite different from other filesystems as I
>>>>> have done it without looking at many of them.
>>>>
>>>> I suspect you'd have the same sort of problem I'm currently
>>>> encountering with the Windows Registry: how could you
>>>> "painlessly" track changes made to your "global store",
>>>> "who" (which process) made them AND easily "undo" them.
>>>
>>> well, maybe. But being on top of the filesystem gives me more
>>> options, e.g. I have a "diff" script which compares all files
>>> in a directory to those in another; I could make it recursive
>>> like some others I have already (e.g. byte count (bc *.sa <top path>)
>>> etc. It lists differing files and lists "not found" ones in
>>> the second directory (second on the command line). I don't
>>> think it will be a huge issue to locate a change this way;
>>> it may be the changes are too many and I have to navigate
>>> through all detected ones.
>>
>> There are several issues involved:
>> - finding the change
>> - reporting it in a meaningful way
>> - identifying the "culprit"
>
> Finding the change would be easy. Reporting it in a meaningful way
> depends on the entity it is reported to, i.e. what it finds
> "meaningful" - if it is a human, this will depend on their knowledge.
In my case (the reason for the post), it's a way of identifying changes
made to the system that I might not EXPECT to have occurred (e.g., an
application remapping certain file extensions to its own handler
instead of the handler that I'd previously "been happy with"). It's
easier to just *see* what it has added/changed than to stumble across
the consequences of those changes -- maybe days or weeks/months later
(then having to sort out how to undo them, and any *further* changes).

Then, to see how ANOTHER application potentially mucks with the
settings put in place by the first. Or, how a *newer* version of an
application changes settings from an earlier version, etc.

In Windows, this sucks because there are so few data types, and the
documentation for each registry setting is usually nonexistent for any
particular application.
> Identifying the culprit may or may not be possible - logging every
> event means logging the logging events involved and so on into
> infinity, so we should draw the line some place :-).
Again, a difference in our expectations of the store. In my case, as it
is the sole means of storing stuff, it is *huge* (terabytes). E.g.,
every executable is retrieved from the store and loaded, on demand, at
runtime. Every song you want to play, the time of every incoming phone
call, voice recordings of those calls (think: answering machine),
surveillance video, etc.

A notion of "tablespaces" (i.e., store THIS table on THAT physical
medium) lets me present a unified interface to the store yet still move
data objects around "behind the scenes". E.g., settings that change
frequently should be backed by BBSRAM; OTOH, it's silly to store music
(which is largely immutable) there -- NAND FLASH would be a better
choice; and surveillance video on magnetic disks, etc.

So, logging "process ID", "time of change", and "change" isn't a huge
resource hog. :> And, a log need not be boundless; you can elect to
just save the last N transactions, etc. But, having ACLs in place means
I can already narrow down the list of potential offenders: who had
*permission* to make that change?
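For reference, that table-to-medium mapping looks like this in
PostgreSQL (mount points and table names are hypothetical):

  -- Map logical tables onto physically different media.
  CREATE TABLESPACE bbsram    LOCATION '/mnt/bbsram';
  CREATE TABLESPACE nandflash LOCATION '/mnt/nand';

  CREATE TABLE live_settings (key text, value text) TABLESPACE bbsram;
  CREATE TABLE music (title text, audio bytea)      TABLESPACE nandflash;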
>> The Windows registry just supports a few different data types: >> - binary >> - string >> - dword >> - qword > > I also think I have seen them store strings. The dps global store > understands all the types and units an "indexed parameter" has > known for 15+ years now (hopefully not 20 yet but I am not sure). > Looking at the source actually I see it IS 20 years old now.... > (all capitals, i.e. it has been written for my asm32 which ran > on a different machine....). Here: > > ************************************** > * * > * Transgalactic * > * Instruments * > * * > ************************************** > * > * PARAMETER OBJECT RELATED EQUATES > * > * > ORG 0
Is this a hack to effectively define:

  PAR$FL0   EQU  0   * byte = 1 byte
  <unused>  EQU  1   * byte = 1 byte
  PAR$DIX   EQU  2   * long = 4 bytes
  PAR$DIM   EQU  6   * word = 2 bytes
  PAR$MULT  EQU  8
  ...

*Or*, are these all "members" of a "parameter object"?
> PAR$FL0   DO.B   1    FLAG 0 (R/W ETC.)
>           DO.B   1    RESERVED
> PAR$DIX   DO.L   1    DEVICE DEPENDENT INDEX
> PAR$DIM   DO.W   1    DIMENSION
> PAR$MULT  DO.W   1    POWER OF 10 (SIGNED) MULTIPLIER
> PAR$TYPE  DO.W   1    PARAMETER TYPE (DATA TYPE, BYTE, REAL, TEXT, ETC.)
> PAR$DATA  EQU    *    DATA FOLLOWING
> *
> *  PAR$FL0 BIT DEFINITIONS
> *
> PR0$RD    EQU    0    CAN BE READ
> PR0$WR    EQU    1    CAN BE WRITTEN TO
> *
>           IFUDF  PT$BU
> *
> *  DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
> *  TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
> *  MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
> *
>           ORG    0
E.g., here, it looks like you are using this as a "trick" to define a bunch of mutually exclusive values (LIST, VAR, BU, BS...) as constants.
> PT$LIST   DO.B   1    OBJECT LIST AT (A5)
> PT$VAR    DO.B   1    A2 -> VAR NAME, D1=@VAR
> PT$BU     DO.B   1    UNSIGNED BYTE
> PT$BS     DO.B   1    SIGNED BYTE
> PT$WU     DO.B   1    UNSIGNED WORD
> PT$WS     DO.B   1    SIGNED WORD
> PT$LU     DO.B   1    UNSIGNED LONG
> PT$LS     DO.B   1    SIGNED LONG
> PT$DU     DO.B   1    UNSIGNED DUAL
> PT$DS     DO.B   1    SIGNED DUAL (64-BIT)
> PT$QU     DO.B   1    UNSIGNED QUAD
> PT$QS     DO.B   1    SIGNED QUAD
> PT$FS     DO.B   1    FP SINGLE PRECISION
> PT$FD     DO.B   1    FP DUAL PRECISION
> PT$FX     DO.B   1    FP .X
> pt$alst   do.b   1    allocated list (same as pt$list but can be
>                       deallocated)
> *
> *
>           ENDC
> *
> *  units
> *
>           ORG    0
> DIM$NUMB  DO.B   1    UNDEFINED - JUST A NUMBER
> DIM$SEC   DO.B   1    SECONDS
> DIM$MIN   DO.B   1    MINUTES
> DIM$HOUR  DO.B   1    HOURS
> DIM$DAY   DO.B   1    DAY
> DIM$WEEK  DO.B   1    WEEK
> DIM$MON   DO.B   1    MONTH
> DIM$YEAR  DO.B   1    YEAR; FURTHER WITH MULTIPLIER
> DIM$METR  DO.B   1    METER
> DIM$DEG   DO.B   1    DEGREE
> DIM$RAD   DO.B   1    RADIAN
> DIM$GRAD  DO.B   1    GRAD (100GR=90 DEG)
> DIM$VOLT  DO.B   1    VOLT
> DIM$AMP   DO.B   1    AMPERE
> DIM$OHM   DO.B   1    OHM
> DIM$FAR   DO.B   1    FARAD
> DIM$HEN   DO.B   1    HENRY
> DIM$BQ    DO.B   1    BECQUEREL
> DIM$CU    DO.B   1    CURIE
> DIM$ROE   DO.B   1    ROENTGEN
> DIM$SLIC  DO.B   1    SYSTEM TIME SLICES
> DIM$CELS  DO.B   1    DEGREE CELSIUS
> DIM$FRNH  DO.B   1    DEGREE FAHRENHEIT
> dim$hz    do.b   1    hertz
> *
> *
>           END
>
> The "parameter object" has been abandoned ages ago, probably
> never used since it was first introduced. But maybe some code
> using it is still in use. The types and units are widely deployed.
The RDBMS that I'm using supports a variety of "natural" types:

<https://www.postgresql.org/docs/9.5/static/datatype.html#DATATYPE-TABLE>

and adjusts the storage required (as well as the sanity checks that are
applied to values -- e.g., a "date" has different rules than a "point")
accordingly.

But, I can freely augment the list of data types with types of my own.
E.g., I have a "Bezier" type that is used to represent cubic Bezier
curves. Another that is used to represent ISBN identifiers. Etc.

Additionally, I can define operators that apply to those particular
data types. E.g., publisher() yields the publisher code of a particular
ISBN identifier:

<https://en.wikipedia.org/wiki/List_of_group-0_ISBN_publisher_codes>

And, is_line() tells me if a particular Bezier is actually a straight
line segment, etc.

Additionally, I can impose constraints on data items that the RDBMS
will enforce. I.e., "ensure the time of an incoming phone call is LATER
than the time of the call that preceded it". As such, the RDBMS can be
seen as a contract enforcer for its clients. I consider the store to be
incorruptible; if something is accepted by the store, it will always
return that same value (or, the most recent value for that "thing").
So, clients don't have to check the sanity of anything coming FROM the
store.
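A minimal sketch of the shape such definitions take -- the isbn10
domain and books table below are hypothetical simplifications, not the
actual Bezier/ISBN types described above:

  -- A constrained scalar type: malformed ISBN-10 strings are rejected
  -- at insert time, so nothing coming FROM the store needs re-checking.
  CREATE DOMAIN isbn10 AS text
      CHECK (VALUE ~* '^[0-9]{9}[0-9X]$');

  CREATE TABLE books (
      title  text,
      number isbn10
  );

  -- A cross-row rule ("each call later than its predecessor") needs a
  -- trigger or an exclusion constraint rather than a simple CHECK.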
> The "units" (called "dim" because I have not thought of the correct
> English word, in Bulgarian a "unit" is called a "dimension") contains
> entries which were never used but it is a 16-bit field so no
> problem expected soon out of that.
>
> The "list" type is quite generic, it can be any sequence of
> dps inherent objects (lowest level objects, like horizontal line,
> text string etc., not the "object" I refer to elsewhere which
> is the basis of the dps runtime object system, the latter is
> "extobj", one of the many low level objects). But I mostly use
> it for text strings, these can be "pasted" (de-encapsulated
> and written to some memory address).
I can't support "bags" (groups of objects of inconsistent types), but I
can support an array of any standard type:

<https://www.postgresql.org/docs/9.5/static/arrays.html>
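For instance (hypothetical table):

  -- A column holding a homogeneous array; mixed-type "bags" won't fit.
  CREATE TABLE readings (sensor text, samples integer[]);
  INSERT INTO readings VALUES ('thermistor', '{71, 72, 75}');
  SELECT samples[2] FROM readings;   -- arrays are 1-indexed: returns 72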
> What comes with all the types is the check for overflow (hence
> the signed and unsigned types); then setting a parameter will
> fail if the supplied type and unit are not as expected. In the global
> store this is not the case, the type/unit will be overwritten
> with the latest (I think...).
I can specify a valid range of values for certain types:

<https://www.postgresql.org/docs/9.5/static/rangetypes.html>

For other types, I have to bear some cost for creating the constraints
that apply to that type. E.g., if I wanted to ensure a particular IP
address was in a particular subnet...
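That subnet case happens to be cheap in PostgreSQL, since the inet type
carries a "contained within" operator; a sketch with hypothetical
tables:

  -- Built-in range type: reject setpoints outside a sane band.
  CREATE TABLE thermostat (
      setpoint numeric CHECK (setpoint <@ numrange(10, 35))
  );

  -- inet's "is contained within" operator (<<) enforces the subnet rule.
  CREATE TABLE nodes (
      addr inet CHECK (addr << inet '192.168.1.0/24')
  );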
>> So, there is very little "information" conveyed if you report
>> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
>> key is added, that might convey some information (as they
>> HOPEFULLY have descriptive names).
>
> Paths and names are what I rely on for meaning, ownership etc.
> indeed.
Yes. And, for the identifiers (and positions in the "namespace
hierarchy") that YOU choose, this will probably work. But, in my case,
I can't rely on a future developer (or USER!) to pick good names. So, I
want to facilitate that effort by imposing the fewest impediments to
picking arbitrary names (within the confines of the SQL language).
>> The advantage to a "real" database (I am playing fast and
>> loose with my definition of "real") is that you tend to have
>> more explicit types. And, the datum (field) indicates its type.
>>
>> So, if a byte in my "persistent store" (RDBMS) changes from
>> 0x12 to 0x13, I can see that this was part of a MAC address...
>> or, a "currency" value, or a "text string", or a "book title"
>> (if I define a type that is used to represent book titles!)
>> or a UPC code, etc.
>>
>> And, I can identify who (process) changed it -- as well as
>> knowing who CAN'T have changed it (due to the ACL's in place
>> for that object).
>
> Hmmm, identifying the process which changed it may be useful, but
> maybe not so straightforward for that purpose. What if
> it has been modified by a process (task) which was killed and
> then another ran in its place? In dps this is countered by
> identifying tasks not just by their task descriptor ID (offset
> to access it really) but in addition by their spawn moment
> (system time). This does not survive reset though.... one would
> have to include the starting moment of the boot session.
Yes. In my case, if I start a "job" many times over the lifetime of the system, identifying which instance is the culprit is problematic. But, you'd really only need this ability as a diagnostic tool; "who screwed with this?". It would be nigh on impossible to catch a one-time event. But, if it is a repeatable "bug", you should be able to manually fix a setting (object) and then watch (instrument) to see how/when it changes thereafter. I can conceivably write a trigger that does this watching for me (if I know what to look for) and turns on a red light when it is tripped, etc.
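A sketch of that watch-and-red-light trigger in PostgreSQL, using its
asynchronous notification channel (table and channel names are
hypothetical):

  -- Raise an asynchronous alarm whenever the watched object changes.
  CREATE FUNCTION alarm_on_change() RETURNS trigger AS $$
  BEGIN
      PERFORM pg_notify('red_light',
                        'setting ' || NEW.key || ' changed by ' || current_user);
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER watch_setting
      AFTER UPDATE ON settings
      FOR EACH ROW
      WHEN (OLD.value IS DISTINCT FROM NEW.value)
      EXECUTE PROCEDURE alarm_on_change();

  -- An interested client simply issues:  LISTEN red_light;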
>> Rescued some more toys, today, so a long night sorting stuff out... :>
>
> Hah, sounds like you will have some fun :-).
<frown> Until I have to decide what to DISCARD to accommodate the NEW
ADDITIONS! :-/ "Simplify" <big frown>

I rescued a second (i.e., "spare") 2KW UPS:

<https://www.amazon.com/APC-SMT2200RM2U-2200VA-120V-Smart-UPS/dp/B004F09D0O>

for my automation system. Coupled with an "expansion battery pack" (48V
@ 15AHr), I should be able to keep the system "up" for a few hours
without shedding loads... the better part of a *day* with active load
management! (Beyond that, you've got bigger problems! :> )
Reply by Dimiter_Popoff November 16, 2016
On 16.11.2016 05:44, Don Y wrote:
> Hi Dimiter,
>
> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>
>>>> In dps a directory can just be copied like any other file
>>>> and it will still point to the correct file entries at that
>>>> moment; it is not done automatically but I have done it in
>>>> the past. Then one can copy the directory file and set the
>>>> type of the copy to be non-directory so it won't confuse
>>>> the system by its duplicate pointers etc.
>>>> It is probably quite different from other filesystems as I
>>>> have done it without looking at many of them.
>>>
>>> I suspect you'd have the same sort of problem I'm currently
>>> encountering with the Windows Registry: how could you
>>> "painlessly" track changes made to your "global store",
>>> "who" (which process) made them AND easily "undo" them.
>>
>> well, maybe. But being on top of the filesystem gives me more
>> options, e.g. I have a "diff" script which compares all files
>> in a directory to those in another; I could make it recursive
>> like some others I have already (e.g. byte count (bc *.sa <top path>)
>> etc. It lists differing files and lists "not found" ones in
>> the second directory (second on the command line). I don't
>> think it will be a huge issue to locate a change this way;
>> it may be the changes are too many and I have to navigate
>> through all detected ones.
>
> There are several issues involved:
> - finding the change
> - reporting it in a meaningful way
> - identifying the "culprit"
Finding the change would be easy. Reporting it in a meaningful way
depends on the entity it is reported to, i.e. what it finds
"meaningful" - if it is a human, this will depend on their knowledge.

Identifying the culprit may or may not be possible - logging every
event means logging the logging events involved and so on into
infinity, so we should draw the line some place :-).
> The Windows registry just supports a few different data types:
> - binary
> - string
> - dword
> - qword
I also think I have seen them store strings. The dps global store
understands all the types and units an "indexed parameter" has known
for 15+ years now (hopefully not 20 yet but I am not sure). Looking at
the source actually I see it IS 20 years old now.... (all capitals,
i.e. it has been written for my asm32 which ran on a different
machine....). Here:

**************************************
*                                    *
*           Transgalactic            *
*            Instruments             *
*                                    *
**************************************
*
*  PARAMETER OBJECT RELATED EQUATES
*
*
          ORG    0
*
PAR$FL0   DO.B   1    FLAG 0 (R/W ETC.)
          DO.B   1    RESERVED
PAR$DIX   DO.L   1    DEVICE DEPENDENT INDEX
PAR$DIM   DO.W   1    DIMENSION
PAR$MULT  DO.W   1    POWER OF 10 (SIGNED) MULTIPLIER
PAR$TYPE  DO.W   1    PARAMETER TYPE (DATA TYPE, BYTE, REAL, TEXT, ETC.)
PAR$DATA  EQU    *    DATA FOLLOWING
*
*  PAR$FL0 BIT DEFINITIONS
*
PR0$RD    EQU    0    CAN BE READ
PR0$WR    EQU    1    CAN BE WRITTEN TO
*
          IFUDF  PT$BU
*
*  DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
*  TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
*  MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
*
          ORG    0
PT$LIST   DO.B   1    OBJECT LIST AT (A5)
PT$VAR    DO.B   1    A2 -> VAR NAME, D1=@VAR
PT$BU     DO.B   1    UNSIGNED BYTE
PT$BS     DO.B   1    SIGNED BYTE
PT$WU     DO.B   1    UNSIGNED WORD
PT$WS     DO.B   1    SIGNED WORD
PT$LU     DO.B   1    UNSIGNED LONG
PT$LS     DO.B   1    SIGNED LONG
PT$DU     DO.B   1    UNSIGNED DUAL
PT$DS     DO.B   1    SIGNED DUAL (64-BIT)
PT$QU     DO.B   1    UNSIGNED QUAD
PT$QS     DO.B   1    SIGNED QUAD
PT$FS     DO.B   1    FP SINGLE PRECISION
PT$FD     DO.B   1    FP DUAL PRECISION
PT$FX     DO.B   1    FP .X
pt$alst   do.b   1    allocated list (same as pt$list but can be
                      deallocated)
*
*
          ENDC
*
*  units
*
          ORG    0
DIM$NUMB  DO.B   1    UNDEFINED - JUST A NUMBER
DIM$SEC   DO.B   1    SECONDS
DIM$MIN   DO.B   1    MINUTES
DIM$HOUR  DO.B   1    HOURS
DIM$DAY   DO.B   1    DAY
DIM$WEEK  DO.B   1    WEEK
DIM$MON   DO.B   1    MONTH
DIM$YEAR  DO.B   1    YEAR; FURTHER WITH MULTIPLIER
DIM$METR  DO.B   1    METER
DIM$DEG   DO.B   1    DEGREE
DIM$RAD   DO.B   1    RADIAN
DIM$GRAD  DO.B   1    GRAD (100GR=90 DEG)
DIM$VOLT  DO.B   1    VOLT
DIM$AMP   DO.B   1    AMPERE
DIM$OHM   DO.B   1    OHM
DIM$FAR   DO.B   1    FARAD
DIM$HEN   DO.B   1    HENRY
DIM$BQ    DO.B   1    BECQUEREL
DIM$CU    DO.B   1    CURIE
DIM$ROE   DO.B   1    ROENTGEN
DIM$SLIC  DO.B   1    SYSTEM TIME SLICES
DIM$CELS  DO.B   1    DEGREE CELSIUS
DIM$FRNH  DO.B   1    DEGREE FAHRENHEIT
dim$hz    do.b   1    hertz
*
*
          END

The "parameter object" has been abandoned ages ago, probably never used
since it was first introduced. But maybe some code using it is still in
use. The types and units are widely deployed.

The "units" (called "dim" because I have not thought of the correct
English word; in Bulgarian a "unit" is called a "dimension") contain
entries which were never used, but it is a 16-bit field so no problem
is expected soon out of that.

The "list" type is quite generic; it can be any sequence of dps
inherent objects (lowest level objects, like horizontal line, text
string etc., not the "object" I refer to elsewhere which is the basis
of the dps runtime object system, the latter is "extobj", one of the
many low level objects). But I mostly use it for text strings; these
can be "pasted" (de-encapsulated and written to some memory address).

What comes with all the types is the check for overflow (hence the
signed and unsigned types); then setting a parameter will fail if the
supplied type and unit are not as expected. In the global store this is
not the case, the type/unit will be overwritten with the latest (I
think...).
> So, there is very little "information" conveyed if you report
> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
> key is added, that might convey some information (as they
> HOPEFULLY have descriptive names).
Paths and names are what I rely on for meaning, ownership etc. indeed.
> The advantage to a "real" database (I am playing fast and
> loose with my definition of "real") is that you tend to have
> more explicit types. And, the datum (field) indicates its type.
>
> So, if a byte in my "persistent store" (RDBMS) changes from
> 0x12 to 0x13, I can see that this was part of a MAC address...
> or, a "currency" value, or a "text string", or a "book title"
> (if I define a type that is used to represent book titles!)
> or a UPC code, etc.
>
> And, I can identify who (process) changed it -- as well as
> knowing who CAN'T have changed it (due to the ACL's in place
> for that object).
Hmmm, identifying the process which changed it may be useful, but maybe
not so straightforward for that purpose. What if it has been modified
by a process (task) which was killed and then another ran in its place?
In dps this is countered by identifying tasks not just by their task
descriptor ID (offset to access it, really) but in addition by their
spawn moment (system time). This does not survive reset though.... one
would have to include the starting moment of the boot session.
> Rescued some more toys, today, so a long night sorting stuff out... :>
Hah, sounds like you will have some fun :-).

Dimiter
Reply by Dimiter_Popoff November 16, 2016
On 16.11.2016 04:30, Clifford Heath wrote:
> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> I looked into SQL as Clifford suggested and it turned out to be
>> a language for relational databases of sorts.
>
> I suggested SQLite. The SQL language it uses is a *distraction*
> and you don't need it... however...
I looked at it (its Wikipedia entry) now and I see it is a different
animal indeed.
> What it seems you need is the kind of reliable storage that
> SQLite provides
Sort of yes, but I am more after a hierarchical thing, like a directory
tree. I guess I'll be fine as I have made it; if I have to take steps
to compress it further than it is now, I know how to do it. But from
your feedback - and that of Don - it seems the extra few bytes spilled
per entry does not bother you much.
>... - and does so better than *anything* else you'll
> find, with a code size that is smaller than anything of comparable
> reliability.
I did not read enough to get to the code size, could you please post
some figure? Just for reference, I'd be curious to know, and other
people reading this might be as well.

Dimiter