Reply by Dombo December 27, 2016
On 21-Nov-16 at 23:31, Clifford Heath wrote:
> On 14/11/16 05:34, Don Y wrote:
>> Is there a *simple* means (I've already found a complicated one)
>> of noting the changes made to the registry by installing X vs. Y?
>>
>> (Consider the different cases: install X, then Y vs. Y then X)
>
> This has become an extraordinarily long thread for something
> that is answered by a ten-second Google search:
>
> <https://www.google.com.au/search?q=site%3Amsdn.microsoft.com+track+registry+changes>
My impression of Don Y is that when he posts a question here he is not
interested in the answer but just wants to have a conversation; one
doesn't get that from a Google search.
Reply by Clifford Heath November 21, 2016
On 14/11/16 05:34, Don Y wrote:
> Is there a *simple* means (I've already found a complicated one)
> of noting the changes made to the registry by installing X vs. Y?
>
> (Consider the different cases: install X, then Y vs. Y then X)
This has become an extraordinarily long thread for something that is
answered by a ten-second Google search:

<https://www.google.com.au/search?q=site%3Amsdn.microsoft.com+track+registry+changes>

Clifford Heath.
Reply by Don Y November 21, 2016
Hi Ed,

On 11/21/2016 5:52 AM, Ed Prochak wrote:
> On Tuesday, November 15, 2016 at 8:02:04 AM UTC-5, Don Y wrote:
> []
>> I suspect you'd have the same sort of problem I'm currently
>> encountering with the Windows Registry: how could you
>> "painlessly" track changes made to your "global store",
>> "who" (which process) made them AND easily "undo" them.
>>
>> I've been amused to discover that this *is* possible under
>> Windows; but, much harder under my formal RDBMS implementation!
>> Especially the "undo them" (well, maybe I could build a giant
>> "transaction" around everything but I'm willing to bet the
>> RDBMS would die when faced with an open-ended issue like that!)
>
> Since you are working in an RDBMS, I am surprised you have not yet
> tried a transaction history. About 15 years ago, I was contracting
> on an Oracle project. The system build managed a set of transaction
> tables that logged who, what, and when changes were made.
Yes, but a transaction log consumes resources -- FAST! (Fine if you
have lots of spinning media or tertiary storage that you can call upon;
not so fine when your data store is sized for the data that it
*stores*! :> )

To be clear, I've only considered how I would tackle it in my RDBMS
*after* having had to face the problem in Windows' Registry ("Gee, how
would I do this in *my* system?").

The Registry is relatively dense (at least in its "on-disk" form), so
it's conceivable to take a snapshot on Day X and compare against another
snapshot on Day Y. The individual actions that caused it to transit
from X to Y (through a variety of other states) wouldn't be apparent.
But, you could easily see what was added/removed/changed.

As I use the RDBMS as my sole persistent store (i.e., IN PLACE OF a
filesystem), that snapshot is a lot "bigger" in my case. And, it
carries a lot of metadata that isn't really known (a priori) to be of
use in that eventual comparison. (OTOH, as the RDBMS relies on it for
its operation, it *may* well be! So, you can't dismiss it outright.)
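For what it's worth, a minimal sketch of that snapshot-and-compare idea
in SQL -- the settings table and its columns are hypothetical stand-ins
for whatever the store actually holds:

  -- Capture the store's state on two different days.
  CREATE TABLE snapshot_x AS SELECT key, value FROM settings;   -- Day X
  -- ...installs, reconfigurations, etc. happen in between...
  CREATE TABLE snapshot_y AS SELECT key, value FROM settings;   -- Day Y

  -- Entries added or changed between X and Y:
  SELECT * FROM snapshot_y EXCEPT SELECT * FROM snapshot_x;

  -- Entries removed or overwritten:
  SELECT * FROM snapshot_x EXCEPT SELECT * FROM snapshot_y;

As noted, this shows the net delta only; the intermediate states are
lost.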
> We were able to use that tracking storage for the work I was doing,
> which was "fixing" data errors. The initial data populated into the
> database was pretty bad quality, from an old ISAM-style system --
> mainly address data, and rural addresses at that. So we iteratively
> identified address corrections, applied them, and then found where a
> change caused other issues; we could back out the whole change, or
> even just subsets.
But, it's unlikely that each changed datum saw *multiple* changes (?).
I.e., you found a faulty "address" and fixed it. It's not like you then
decided that the residence had been demolished -- or, placed on a
flat-bed truck and relocated two blocks farther south... So, a
before-and-after comparison would summarize all of your "net" changes
(though without any chronological history or audit trail).
> It does not necessarily have to cost a lot more storage: a binary
> mapping of what changes in a row of your working table, and a
> separate table of those changed columns. Time-stamps and user IDs and
> such can be in the tracking tables also.
>
> I am guessing the RDBMS exists on larger nodes in your system that
> have more resources. But if you are resource limited, then your
> options are limited also and this may not be a possible option. But
> I hope it helps.
I have exactly one RDBMS "server" -- the rest of the nodes are 100%
volatile. The whole point was to ensure that "stuff" wasn't being
stored in a bunch of distributed nodes, which could complicate
reconfiguration and/or replacement of individual nodes ("Wait! Don't
throw that node out, yet -- I've got to extract some data from it...").

My Windows solution really only needed to know before-after deltas. So,
any mechanism I used to achieve that overall goal would be functionally
equivalent (before-after snapshots, incremental logging, etc.). I just
wanted to know if applications X and Y were in conflict with each other
in any particular way (i.e., if they both tried to manipulate some
setting and "last change wins").

Thinking about how to do this with the RDBMS requires a definite idea
of *what* the goal is to be: just noticing aggregate deltas can tell me
that the DB is in a state that I hadn't expected -- but with no idea as
to HOW it got into that state (I can look at the ACLs to see how it
MIGHT have made it into that state, but wouldn't have a definitive
answer).

Knowing that <something> specific has moved to an unexpected/undesired
state, I can add mechanisms to filter any changes and only log the
details of actions that affect that <something>. But, I then have to
hope the action is repeatable. If, for example, a user deliberately
changed a setting by some action of *choice*, I have no way of ensuring
he will make that same choice at a future date.

For example, if an executable's image is changed, that *should* only be
the result of an intentional "software update" -- not some rogue
process scribbling in the middle of such an object. As those sorts of
changes are "significant" (and infrequent), logging details about them
is prudent. OTOH, logging changes to the OGM for the answering machine
is probably a waste of resources...
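A minimal sketch of that kind of selective watch as a PostgreSQL
trigger -- the file_associations table, its handler column, and the log
table are all hypothetical:

  -- Log changes to one specific, "significant" setting; ignore the rest.
  CREATE TABLE change_log (
      changed_at  timestamptz DEFAULT now(),
      changed_by  text        DEFAULT current_user,
      old_value   text,
      new_value   text
  );

  CREATE FUNCTION log_handler_change() RETURNS trigger AS $$
  BEGIN
      -- Only record the details when the watched column actually moves.
      IF NEW.handler IS DISTINCT FROM OLD.handler THEN
          INSERT INTO change_log (old_value, new_value)
          VALUES (OLD.handler, NEW.handler);
      END IF;
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER watch_handler
      AFTER UPDATE ON file_associations
      FOR EACH ROW EXECUTE PROCEDURE log_handler_change();

Everything else in the store changes without cost; only the watched
column pays for logging.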
Reply by Ed Prochak November 21, 2016
On Tuesday, November 15, 2016 at 8:02:04 AM UTC-5, Don Y wrote:
[]
> I suspect you'd have the same sort of problem I'm currently
> encountering with the Windows Registry: how could you
> "painlessly" track changes made to your "global store",
> "who" (which process) made them AND easily "undo" them.
>
> I've been amused to discover that this *is* possible under
> Windows; but, much harder under my formal RDBMS implementation!
> Especially the "undo them" (well, maybe I could build a giant
> "transaction" around everything but I'm willing to bet the
> RDBMS would die when faced with an open-ended issue like that!)
Hi Don,

Since you are working in an RDBMS, I am surprised you have not yet
tried a transaction history. About 15 years ago, I was contracting on
an Oracle project. The system build managed a set of transaction tables
that logged who, what, and when changes were made.

We were able to use that tracking storage for the work I was doing,
which was "fixing" data errors. The initial data populated into the
database was pretty bad quality, from an old ISAM-style system --
mainly address data, and rural addresses at that. So we iteratively
identified address corrections, applied them, and then found where a
change caused other issues; we could back out the whole change, or even
just subsets.

It does not necessarily have to cost a lot more storage: a binary
mapping of what changes in a row of your working table, and a separate
table of those changed columns. Time-stamps and user IDs and such can
be in the tracking tables also.

I am guessing the RDBMS exists on larger nodes in your system that have
more resources. But if you are resource limited, then your options are
limited also and this may not be a possible option. But I hope it
helps.

Ed
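A minimal sketch of tracking tables of that shape (all names are
hypothetical; the actual Oracle schema isn't shown in the thread):

  -- Working table, plus a parallel history table recording who/what/when.
  CREATE TABLE addresses (
      id     integer PRIMARY KEY,
      street text,
      city   text
  );

  CREATE TABLE addresses_history (
      row_id      integer,     -- which row changed
      column_name text,        -- which column changed
      old_value   text,
      new_value   text,
      changed_by  text,
      changed_at  timestamp,
      batch_id    integer      -- groups one correction run so the whole
                               -- change (or a subset) can be backed out
  );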
Reply by Don Y November 18, 2016
On 11/13/2016 11:34 AM, Don Y wrote:
> Is there a *simple* means (I've already found a complicated one)
> of noting the changes made to the registry by installing X vs. Y?
>
> (Consider the different cases: install X, then Y vs. Y then X)
Well, the simplest solution is to install X in a sandbox and trap the registry (as well as filesystem) changes. Then, Y in a sandbox *within* that sandbox for similar reasons. Dump both sandboxes and repeat, swapping the order of X and Y.
Reply by Clifford Heath November 16, 2016
On 17/11/16 04:33, Don Y wrote:
> On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
>> On 16.11.2016 04:30, Clifford Heath wrote:
>>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>
>>> What it seems you need is the kind of reliable storage that
>>> SQLite provides
>>
>> Sort of yes, but I am more after a hierarchical thing, like a
>> directory tree. I guess I'll be fine as I have made it; if I have to
>> take steps to compress it further than it is now, I know how to do it.
>> But from your feedback - and that of Don - it seems the extra few bytes
>> spilled per entry does not bother you much.
>
> In the case of PostgreSQL, it's not JUST an "extra few bytes"! :<
> There's a fair bit of (data) overhead that gets dragged in
> with the data. Especially if you want to optimize access to
> that data or tie it to other data (primary/foreign keys).
>
>>> ... - and does so better than *anything* else you'll
>>> find, with a code size that is smaller than anything of comparable
>>> reliability.
>>
>> I did not read enough to get to the code size, could you please
>> post some figure? Just for reference, I'd be curious to know, and
>> other people reading this might be as well.
They have an entire web page for that - 2nd hit in Google: <https://www.sqlite.org/footprint.html>
> I think about a quarter megabyte -- depending on which features
> you include (exclude).
A quarter (all OMIT options) up to half a megabyte (no OMITs).

Yes, it's quite a lot. Yes, it would be possible to do the same in a
much smaller library. No, no-one has actually done that in a
widely-used, well-tested library. db/ndb comes close.

FWIW, almost all such systems since 1990 were built by following the
instructions from just one book - "Transaction Processing: Concepts and
Techniques" by Gray and Reuter. Incredibly influential book. This kind
of reliability requires techniques that are very non-obvious - much
more so if you want to maximise concurrent processing - and those
techniques were closely-guarded trade secrets until this book was
published. I can highly recommend it to anyone interested in making any
composite action appear to be atomic (this is the core idea of a
"transaction").
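For readers who haven't met the idea, a minimal SQL illustration of a
composite action made to appear atomic (the accounts table is
hypothetical):

  -- Both updates take effect together, or neither does -- even if the
  -- process dies between them.
  BEGIN;
  UPDATE accounts SET balance = balance - 100 WHERE owner = 'alice';
  UPDATE accounts SET balance = balance + 100 WHERE owner = 'bob';
  COMMIT;    -- or ROLLBACK; to undo both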
> For PostgreSQL, that number increases about 5-fold.
Postgres is not suitable for this. It requires periodic "vacuum"ing and other maintenance.
> E.g., do you have to support concurrent readers/writers and guarantee
> atomic access? Or, can you afford to do that in a "wrapper" around
> the "global store" (a "monitor", of sorts).
SQLite uses a global lock, so each transaction is single-threaded.
That's why it's so much smaller than, e.g., Postgres. However, it's
still *really* hard to guarantee atomicity across failure restarts. The
method that gives you this also gives you roll-back for free.
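A minimal SQLite sketch of that dual use (the settings table is
hypothetical): the journal that makes a transaction atomic across a
crash is the same machinery that services ROLLBACK.

  PRAGMA journal_mode = DELETE;   -- classic rollback-journal mode

  BEGIN;
  UPDATE settings SET value = 'new' WHERE key = 'handler';
  -- A power failure here leaves the journal on disk; on restart SQLite
  -- replays it and the UPDATE disappears -- exactly what ROLLBACK does:
  ROLLBACK;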
> Sizewise, your approach will almost always "win" -- because you're just
> special-casing the filesystem code (e.g., to handle your namespace).
But you're then *totally* reliant on the filesystem to provide atomicity across unexpected restarts - and that is almost certainly not the case.
> By contrast, SQLite/PostgreSQL don't really use the underlying file
> system for anything more than "bulk storage".
And even as "bulk storage" those are significant possible points of
failure. Consider that a database "page", which might be 8K, is made up
of multiple sectors, and a power failure can result in a block that has
been only partly written (so you get a so-called "torn page").
Torn-page detection involves writing a page checksum to the start and
end of each page, and checking both against the actual data on every
read.

Drives (including SSDs) do so much caching and write rescheduling that
you cannot really know which "completed" writes have actually
completed. Newer drives provide special modes and operations which can
be used to increase reliability for a DBMS, in addition to global
"flush all writes" type features.

Clifford Heath
Reply by Don Y November 16, 2016
On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
> On 16.11.2016 04:30, Clifford Heath wrote:
>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> What it seems you need is the kind of reliable storage that
>> SQLite provides
>
> Sort of yes, but I am more after a hierarchical thing, like a
> directory tree. I guess I'll be fine as I have made it; if I have to
> take steps to compress it further than it is now, I know how to do it.
> But from your feedback - and that of Don - it seems the extra few bytes
> spilled per entry does not bother you much.
In the case of PostgreSQL, it's not JUST an "extra few bytes"! :<
There's a fair bit of (data) overhead that gets dragged in with the
data. Especially if you want to optimize access to that data or tie it
to other data (primary/foreign keys).
>> ... - and does so better than *anything* else you'll
>> find, with a code size that is smaller than anything of comparable
>> reliability.
>
> I did not read enough to get to the code size, could you please
> post some figure? Just for reference, I'd be curious to know, and
> other people reading this might be as well.
I think about a quarter megabyte -- depending on which features you
include (exclude). For PostgreSQL, that number increases about 5-fold.
[Given that "investment", I try to leverage as much functionality out
of the RDBMS as is conceivable!]

But, these are addressing different sorts of problems. I suspect an
even "simpler" (name,value) store (e.g., ndbm) would be considerably
smaller -- and less feature-full.

E.g., do you have to support concurrent readers/writers and guarantee
atomic access? Or, can you afford to do that in a "wrapper" around the
"global store" (a "monitor", of sorts)? Do you have to support
"transactions" and be able to roll back operations on the store based
on conflicts encountered with "later" operations in that same
transaction? Etc.

Sizewise, your approach will almost always "win" -- because you're just
special-casing the filesystem code (e.g., to handle your namespace). If
you wanted to ensure no two clients (threads/processes) could compete
for a particular "setting/parameter", you could implement that
mechanism with simple file locking: pend on lock, make change (to value
or directory), release lock. Etc.

By contrast, SQLite/PostgreSQL don't really use the underlying file
system for anything more than "bulk storage".
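As an aside, that pend/change/release pattern has a direct analogue
inside PostgreSQL -- advisory locks. A sketch, with an arbitrary lock
key and a hypothetical settings table:

  -- Serialize competing writers around one logical setting.
  SELECT pg_advisory_lock(42);                              -- pend on lock
  UPDATE settings SET value = 'new' WHERE key = 'volume';   -- make change
  SELECT pg_advisory_unlock(42);                            -- release lock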
Reply by Don Y November 16, 2016
On 11/16/2016 4:43 AM, Dimiter_Popoff wrote:
> On 16.11.2016 05:44, Don Y wrote:
>> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>>
>>>>> In dps a directory can just be copied like any other file
>>>>> and it will still point to the correct file entries at that
>>>>> moment; it is not done automatically but I have done it in
>>>>> the past. Then one can copy the directory file and set the
>>>>> type of the copy to be non-directory so it won't confuse
>>>>> the system by its duplicate pointers etc.
>>>>> It is probably quite different from other filesystems as I
>>>>> have done it without looking at many of them.
>>>>
>>>> I suspect you'd have the same sort of problem I'm currently
>>>> encountering with the Windows Registry: how could you
>>>> "painlessly" track changes made to your "global store",
>>>> "who" (which process) made them AND easily "undo" them.
>>>
>>> well, maybe. But being on top of the filesystem gives me more
>>> options, e.g. I have a "diff" script which compares all files
>>> in a directory to those in another; I could make it recursive
>>> like some others I have already (e.g. byte count (bc *.sa <top path>)
>>> etc. It lists differing files and lists "not found" ones in
>>> the second directory (second on the command line). I don't
>>> think it will be a huge issue to locate a change this way;
>>> it may be the changes are too many and I have to navigate
>>> through all detected ones.
>>
>> There are several issues involved:
>> - finding the change
>> - reporting it in a meaningful way
>> - identifying the "culprit"
>
> Finding the change would be easy. Reporting it in a meaningful way
> depends on the entity it is reported to, i.e. what it finds
> "meaningful" - if it is a human, this will depend on their knowledge.
In my case (the reason for the post), it's a way of identifying changes
made to the system that I might not EXPECT to have occurred (e.g., an
application remapping certain file extensions to its own handler
instead of the handler that I'd previously "been happy with"). It's
easier to just *see* what it has added/changed than to stumble across
the consequences of those changes -- maybe days or weeks/months later
(then having to sort out how to undo them, and any *further* changes).

Then, to see how ANOTHER application potentially mucks with the
settings put in place by the first. Or, how a *newer* version of an
application changes settings from an earlier version, etc.

In Windows, this sucks because there are so few data types, and the
documentation for each registry setting is usually nonexistent for any
particular application.
> Identifying the culprit may or may not be possible - logging every
> event means logging the logging events involved and so on into
> infinity, so we should draw the line some place :-).
Again, a difference in our expectations of the store. In my case, as it
is the sole means of storing stuff, it is *huge* (terabytes). E.g.,
every executable is retrieved from the store and loaded, on demand, at
runtime. Every song you want to play, the time of every incoming phone
call, voice recordings of those calls (think: answering machine),
surveillance video, etc.

A notion of "tablespaces" (i.e., store THIS table on THAT physical
medium) lets me present a unified interface to the store yet still move
data objects around "behind the scenes". E.g., settings that change
frequently should be backed by BBSRAM; OTOH, it's silly to store music
(which is largely immutable) there -- NAND FLASH would be a better
choice; and surveillance video on magnetic disks, etc.

So, logging "process ID", "time of change", and "change" isn't a huge
resource hog. :> And, a log need not be boundless; you can elect to
just save the last N transactions, etc. But, having ACLs in place means
I can already narrow down the list of potential offenders: who had
*permission* to make that change?
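For reference, that table-to-medium mapping looks like this in
PostgreSQL (mount points and table names are hypothetical):

  -- Map logical tables onto physically different media.
  CREATE TABLESPACE bbsram    LOCATION '/mnt/bbsram';
  CREATE TABLESPACE nandflash LOCATION '/mnt/nand';

  CREATE TABLE live_settings (key text, value text) TABLESPACE bbsram;
  CREATE TABLE music (title text, audio bytea)      TABLESPACE nandflash;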
>> The Windows registry just supports a few different data types: >> - binary >> - string >> - dword >> - qword > > I also think I have seen them store strings. The dps global store > understands all the types and units an "indexed parameter" has > known for 15+ years now (hopefully not 20 yet but I am not sure). > Looking at the source actually I see it IS 20 years old now.... > (all capitals, i.e. it has been written for my asm32 which ran > on a different machine....). Here: > > ************************************** > * * > * Transgalactic * > * Instruments * > * * > ************************************** > * > * PARAMETER OBJECT RELATED EQUATES > * > * > ORG 0
Is this a hack to effectively define:

  PAR$FL0   EQU  0   * byte = 1 byte
  <unused>  EQU  1   * byte = 1 byte
  PAR$DIX   EQU  2   * long = 4 bytes
  PAR$DIM   EQU  6   * word = 2 bytes
  PAR$MULT  EQU  8
  ...

*Or*, are these all "members" of a "parameter object"?
> PAR$FL0   DO.B   1    FLAG 0 (R/W ETC.)
>           DO.B   1    RESERVED
> PAR$DIX   DO.L   1    DEVICE DEPENDENT INDEX
> PAR$DIM   DO.W   1    DIMENSION
> PAR$MULT  DO.W   1    POWER OF 10 (SIGNED) MULTIPLIER
> PAR$TYPE  DO.W   1    PARAMETER TYPE (DATA TYPE, BYTE, REAL, TEXT, ETC.)
> PAR$DATA  EQU    *    DATA FOLLOWING
> *
> *  PAR$FL0 BIT DEFINITIONS
> *
> PR0$RD    EQU    0    CAN BE READ
> PR0$WR    EQU    1    CAN BE WRITTEN TO
> *
>           IFUDF  PT$BU
> *
> *  DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
> *  TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
> *  MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
> *
>           ORG    0
E.g., here, it looks like you are using this as a "trick" to define a bunch of mutually exclusive values (LIST, VAR, BU, BS...) as constants.
> PT$LIST   DO.B   1    OBJECT LIST AT (A5)
> PT$VAR    DO.B   1    A2 -> VAR NAME, D1=@VAR
> PT$BU     DO.B   1    UNSIGNED BYTE
> PT$BS     DO.B   1    SIGNED BYTE
> PT$WU     DO.B   1    UNSIGNED WORD
> PT$WS     DO.B   1    SIGNED WORD
> PT$LU     DO.B   1    UNSIGNED LONG
> PT$LS     DO.B   1    SIGNED LONG
> PT$DU     DO.B   1    UNSIGNED DUAL
> PT$DS     DO.B   1    SIGNED DUAL (64-BIT)
> PT$QU     DO.B   1    UNSIGNED QUAD
> PT$QS     DO.B   1    SIGNED QUAD
> PT$FS     DO.B   1    FP SINGLE PRECISION
> PT$FD     DO.B   1    FP DUAL PRECISION
> PT$FX     DO.B   1    FP .X
> pt$alst   do.b   1    allocated list (same as pt$list but can be
>                       deallocated)
> *
> *
>           ENDC
> *
> *  units
> *
>           ORG    0
> DIM$NUMB  DO.B   1    UNDEFINED - JUST A NUMBER
> DIM$SEC   DO.B   1    SECONDS
> DIM$MIN   DO.B   1    MINUTES
> DIM$HOUR  DO.B   1    HOURS
> DIM$DAY   DO.B   1    DAY
> DIM$WEEK  DO.B   1    WEEK
> DIM$MON   DO.B   1    MONTH
> DIM$YEAR  DO.B   1    YEAR; FURTHER WITH MULTIPLIER
> DIM$METR  DO.B   1    METER
> DIM$DEG   DO.B   1    DEGREE
> DIM$RAD   DO.B   1    RADIAN
> DIM$GRAD  DO.B   1    GRAD (100GR=90 DEG)
> DIM$VOLT  DO.B   1    VOLT
> DIM$AMP   DO.B   1    AMPERE
> DIM$OHM   DO.B   1    OHM
> DIM$FAR   DO.B   1    FARAD
> DIM$HEN   DO.B   1    HENRY
> DIM$BQ    DO.B   1    BECQUEREL
> DIM$CU    DO.B   1    CURIE
> DIM$ROE   DO.B   1    ROENTGEN
> DIM$SLIC  DO.B   1    SYSTEM TIME SLICES
> DIM$CELS  DO.B   1    DEGREE CELSIUS
> DIM$FRNH  DO.B   1    DEGREE FAHRENHEIT
> dim$hz    do.b   1    hertz
> *
> *
>           END
>
> The "parameter object" has been abandoned ages ago, probably
> never used since it was first introduced. But maybe some code
> using it is still in use. The types and units are widely deployed.
The RDBMS that I'm using supports a variety of "natural" types:

<https://www.postgresql.org/docs/9.5/static/datatype.html#DATATYPE-TABLE>

and adjusts the storage required (as well as the sanity checks that are
applied to values -- e.g., a "date" has different rules than a "point")
accordingly.

But, I can freely augment the list of data types with types of my own.
E.g., I have a "Bezier" type that is used to represent cubic Bezier
curves. Another that is used to represent ISBN identifiers. Etc.

Additionally, I can define operators that apply to those particular
data types. E.g., publisher() yields the publisher code of a particular
ISBN identifier:

<https://en.wikipedia.org/wiki/List_of_group-0_ISBN_publisher_codes>

And, is_line() tells me if a particular Bezier is actually a straight
line segment, etc.

Additionally, I can impose constraints on data items that the RDBMS
will enforce. I.e., "ensure the time of an incoming phone call is LATER
than the time of the call that preceded it". As such, the RDBMS can be
seen as a contract enforcer for its clients. I consider the store to be
incorruptible; if something is accepted by the store, it will always
return that same value (or, the most recent value for that "thing").
So, clients don't have to check the sanity of anything coming FROM the
store.
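A minimal sketch of the shape such definitions take -- the isbn10
domain and books table below are hypothetical simplifications, not the
actual Bezier/ISBN types described above:

  -- A constrained scalar type: malformed ISBN-10 strings are rejected
  -- at insert time, so nothing coming FROM the store needs re-checking.
  CREATE DOMAIN isbn10 AS text
      CHECK (VALUE ~* '^[0-9]{9}[0-9X]$');

  CREATE TABLE books (
      title  text,
      number isbn10
  );

  -- A cross-row rule ("each call later than its predecessor") needs a
  -- trigger or an exclusion constraint rather than a simple CHECK.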
> The "units" (called "dim" because I have not thought of the correct
> English word, in Bulgarian a "unit" is called a "dimension") contains
> entries which were never used but it is a 16-bit field so no
> problem expected soon out of that.
>
> The "list" type is quite generic, it can be any sequence of
> dps inherent objects (lowest level objects, like horizontal line,
> text string etc., not the "object" I refer to elsewhere which
> is the basis of the dps runtime object system, the latter is
> "extobj", one of the many low level objects). But I mostly use
> it for text strings, these can be "pasted" (de-encapsulated
> and written to some memory address).
I can't support "bags" (groups of objects of inconsistent types), but I
can support an array of any standard type:

<https://www.postgresql.org/docs/9.5/static/arrays.html>
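For instance (hypothetical table):

  -- A column holding a homogeneous array; mixed-type "bags" won't fit.
  CREATE TABLE readings (sensor text, samples integer[]);
  INSERT INTO readings VALUES ('thermistor', '{71, 72, 75}');
  SELECT samples[2] FROM readings;   -- arrays are 1-indexed: returns 72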
> What comes with all the types is the check for overflow (hence
> the signed and unsigned types); then setting a parameter will
> fail if the supplied type and unit are not as expected. In the global
> store this is not the case, the type/unit will be overwritten
> with the latest (I think...).
I can specify a valid range of values for certain types:

<https://www.postgresql.org/docs/9.5/static/rangetypes.html>

For other types, I have to bear some cost for creating the constraints
that apply to that type. E.g., if I wanted to ensure a particular IP
address was in a particular subnet...
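That subnet case happens to be cheap in PostgreSQL, since the inet type
carries a "contained within" operator; a sketch with hypothetical
tables:

  -- Built-in range type: reject setpoints outside a sane band.
  CREATE TABLE thermostat (
      setpoint numeric CHECK (setpoint <@ numrange(10, 35))
  );

  -- inet's "is contained within" operator (<<) enforces the subnet rule.
  CREATE TABLE nodes (
      addr inet CHECK (addr << inet '192.168.1.0/24')
  );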
>> So, there is very little "information" conveyed if you report
>> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
>> key is added, that might convey some information (as they
>> HOPEFULLY have descriptive names).
>
> Paths and names are what I rely on for meaning, ownership etc.
> indeed.
Yes. And, for the identifiers (and positions in the "namespace
hierarchy") that YOU choose, this will probably work. But, in my case,
I can't rely on a future developer (or USER!) to pick good names. So, I
want to facilitate that effort by imposing the fewest impediments to
picking arbitrary names (within the confines of the SQL language).
>> The advantage to a "real" database (I am playing fast and
>> loose with my definition of "real") is that you tend to have
>> more explicit types. And, the datum (field) indicates its type.
>>
>> So, if a byte in my "persistent store" (RDBMS) changes from
>> 0x12 to 0x13, I can see that this was part of a MAC address...
>> or, a "currency" value, or a "text string", or a "book title"
>> (if I define a type that is used to represent book titles!)
>> or a UPC code, etc.
>>
>> And, I can identify who (process) changed it -- as well as
>> knowing who CAN'T have changed it (due to the ACL's in place
>> for that object).
>
> Hmmm, identifying the process which changed it may be useful, but
> maybe not so straightforward for that purpose. What if
> it has been modified by a process (task) which was killed and
> then another ran in its place? In dps this is countered by
> identifying tasks not just by their task descriptor ID (offset
> to access it really) but in addition by their spawn moment
> (system time). This does not survive reset though.... one would
> have to include the starting moment of the boot session.
Yes. In my case, if I start a "job" many times over the lifetime of the system, identifying which instance is the culprit is problematic. But, you'd really only need this ability as a diagnostic tool; "who screwed with this?". It would be nigh on impossible to catch a one-time event. But, if it is a repeatable "bug", you should be able to manually fix a setting (object) and then watch (instrument) to see how/when it changes thereafter. I can conceivably write a trigger that does this watching for me (if I know what to look for) and turns on a red light when it is tripped, etc.
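A sketch of that watch-and-red-light trigger in PostgreSQL, using its
asynchronous notification channel (table and channel names are
hypothetical):

  -- Raise an asynchronous alarm whenever the watched object changes.
  CREATE FUNCTION alarm_on_change() RETURNS trigger AS $$
  BEGIN
      PERFORM pg_notify('red_light',
                        'setting ' || NEW.key || ' changed by ' || current_user);
      RETURN NEW;
  END;
  $$ LANGUAGE plpgsql;

  CREATE TRIGGER watch_setting
      AFTER UPDATE ON settings
      FOR EACH ROW
      WHEN (OLD.value IS DISTINCT FROM NEW.value)
      EXECUTE PROCEDURE alarm_on_change();

  -- An interested client simply issues:  LISTEN red_light;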
>> Rescued some more toys, today, so a long night sorting stuff out... :>
>
> Hah, sounds like you will have some fun :-).
<frown> Until I have to decide what to DISCARD to accommodate the NEW
ADDITIONS! :-/ "Simplify" <big frown>

I rescued a second (i.e., "spare") 2KW UPS:

<https://www.amazon.com/APC-SMT2200RM2U-2200VA-120V-Smart-UPS/dp/B004F09D0O>

for my automation system. Coupled with an "expansion battery pack" (48V
@ 15AHr), I should be able to keep the system "up" for a few hours
without shedding loads... the better part of a *day* with active load
management! (Beyond that, you've got bigger problems! :> )
Reply by Dimiter_Popoff November 16, 2016
On 16.11.2016 05:44, Don Y wrote:
> Hi Dimiter,
>
> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>
>>>> In dps a directory can just be copied like any other file
>>>> and it will still point to the correct file entries at that
>>>> moment; it is not done automatically but I have done it in
>>>> the past. Then one can copy the directory file and set the
>>>> type of the copy to be non-directory so it won't confuse
>>>> the system by its duplicate pointers etc.
>>>> It is probably quite different from other filesystems as I
>>>> have done it without looking at many of them.
>>>
>>> I suspect you'd have the same sort of problem I'm currently
>>> encountering with the Windows Registry: how could you
>>> "painlessly" track changes made to your "global store",
>>> "who" (which process) made them AND easily "undo" them.
>>
>> well, maybe. But being on top of the filesystem gives me more
>> options, e.g. I have a "diff" script which compares all files
>> in a directory to those in another; I could make it recursive
>> like some others I have already (e.g. byte count (bc *.sa <top path>)
>> etc. It lists differing files and lists "not found" ones in
>> the second directory (second on the command line). I don't
>> think it will be a huge issue to locate a change this way;
>> it may be the changes are too many and I have to navigate
>> through all detected ones.
>
> There are several issues involved:
> - finding the change
> - reporting it in a meaningful way
> - identifying the "culprit"
Finding the change would be easy. Reporting it in a meaningful way
depends on the entity it is reported to, i.e. what it finds
"meaningful" - if it is a human, this will depend on their knowledge.

Identifying the culprit may or may not be possible - logging every
event means logging the logging events involved and so on into
infinity, so we should draw the line some place :-).
> The Windows registry just supports a few different data types:
> - binary
> - string
> - dword
> - qword
I also think I have seen them store strings. The dps global store
understands all the types and units an "indexed parameter" has known
for 15+ years now (hopefully not 20 yet but I am not sure). Looking at
the source actually I see it IS 20 years old now.... (all capitals,
i.e. it has been written for my asm32 which ran on a different
machine....). Here:

**************************************
*                                    *
*           Transgalactic            *
*            Instruments             *
*                                    *
**************************************
*
*  PARAMETER OBJECT RELATED EQUATES
*
*
          ORG    0
*
PAR$FL0   DO.B   1    FLAG 0 (R/W ETC.)
          DO.B   1    RESERVED
PAR$DIX   DO.L   1    DEVICE DEPENDENT INDEX
PAR$DIM   DO.W   1    DIMENSION
PAR$MULT  DO.W   1    POWER OF 10 (SIGNED) MULTIPLIER
PAR$TYPE  DO.W   1    PARAMETER TYPE (DATA TYPE, BYTE, REAL, TEXT, ETC.)
PAR$DATA  EQU    *    DATA FOLLOWING
*
*  PAR$FL0 BIT DEFINITIONS
*
PR0$RD    EQU    0    CAN BE READ
PR0$WR    EQU    1    CAN BE WRITTEN TO
*
          IFUDF  PT$BU
*
*  DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
*  TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
*  MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
*
          ORG    0
PT$LIST   DO.B   1    OBJECT LIST AT (A5)
PT$VAR    DO.B   1    A2 -> VAR NAME, D1=@VAR
PT$BU     DO.B   1    UNSIGNED BYTE
PT$BS     DO.B   1    SIGNED BYTE
PT$WU     DO.B   1    UNSIGNED WORD
PT$WS     DO.B   1    SIGNED WORD
PT$LU     DO.B   1    UNSIGNED LONG
PT$LS     DO.B   1    SIGNED LONG
PT$DU     DO.B   1    UNSIGNED DUAL
PT$DS     DO.B   1    SIGNED DUAL (64-BIT)
PT$QU     DO.B   1    UNSIGNED QUAD
PT$QS     DO.B   1    SIGNED QUAD
PT$FS     DO.B   1    FP SINGLE PRECISION
PT$FD     DO.B   1    FP DUAL PRECISION
PT$FX     DO.B   1    FP .X
pt$alst   do.b   1    allocated list (same as pt$list but can be
                      deallocated)
*
*
          ENDC
*
*  units
*
          ORG    0
DIM$NUMB  DO.B   1    UNDEFINED - JUST A NUMBER
DIM$SEC   DO.B   1    SECONDS
DIM$MIN   DO.B   1    MINUTES
DIM$HOUR  DO.B   1    HOURS
DIM$DAY   DO.B   1    DAY
DIM$WEEK  DO.B   1    WEEK
DIM$MON   DO.B   1    MONTH
DIM$YEAR  DO.B   1    YEAR; FURTHER WITH MULTIPLIER
DIM$METR  DO.B   1    METER
DIM$DEG   DO.B   1    DEGREE
DIM$RAD   DO.B   1    RADIAN
DIM$GRAD  DO.B   1    GRAD (100GR=90 DEG)
DIM$VOLT  DO.B   1    VOLT
DIM$AMP   DO.B   1    AMPERE
DIM$OHM   DO.B   1    OHM
DIM$FAR   DO.B   1    FARAD
DIM$HEN   DO.B   1    HENRY
DIM$BQ    DO.B   1    BECQUEREL
DIM$CU    DO.B   1    CURIE
DIM$ROE   DO.B   1    ROENTGEN
DIM$SLIC  DO.B   1    SYSTEM TIME SLICES
DIM$CELS  DO.B   1    DEGREE CELSIUS
DIM$FRNH  DO.B   1    DEGREE FAHRENHEIT
dim$hz    do.b   1    hertz
*
*
          END

The "parameter object" has been abandoned ages ago, probably never used
since it was first introduced. But maybe some code using it is still in
use. The types and units are widely deployed.

The "units" (called "dim" because I have not thought of the correct
English word; in Bulgarian a "unit" is called a "dimension") contain
entries which were never used, but it is a 16-bit field so no problem
is expected soon out of that.

The "list" type is quite generic; it can be any sequence of dps
inherent objects (lowest level objects, like horizontal line, text
string etc., not the "object" I refer to elsewhere which is the basis
of the dps runtime object system, the latter is "extobj", one of the
many low level objects). But I mostly use it for text strings; these
can be "pasted" (de-encapsulated and written to some memory address).

What comes with all the types is the check for overflow (hence the
signed and unsigned types); then setting a parameter will fail if the
supplied type and unit are not as expected. In the global store this is
not the case, the type/unit will be overwritten with the latest (I
think...).
> So, there is very little "information" conveyed if you report
> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
> key is added, that might convey some information (as they
> HOPEFULLY have descriptive names).
Paths and names are what I rely on for meaning, ownership etc. indeed.
> The advantage to a "real" database (I am playing fast and
> loose with my definition of "real") is that you tend to have
> more explicit types. And, the datum (field) indicates its type.
>
> So, if a byte in my "persistent store" (RDBMS) changes from
> 0x12 to 0x13, I can see that this was part of a MAC address...
> or, a "currency" value, or a "text string", or a "book title"
> (if I define a type that is used to represent book titles!)
> or a UPC code, etc.
>
> And, I can identify who (process) changed it -- as well as
> knowing who CAN'T have changed it (due to the ACL's in place
> for that object).
Hmmm, identifying the process which changed it may be useful, but maybe
not so straightforward for that purpose. What if it has been modified
by a process (task) which was killed and then another ran in its place?
In dps this is countered by identifying tasks not just by their task
descriptor ID (offset to access it, really) but in addition by their
spawn moment (system time). This does not survive reset though.... one
would have to include the starting moment of the boot session.
> Rescued some more toys, today, so a long night sorting stuff out... :>
Hah, sounds like you will have some fun :-).

Dimiter
Reply by Dimiter_Popoff November 16, 2016
On 16.11.2016 04:30, Clifford Heath wrote:
> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> I looked into SQL as Clifford suggested and it turned out to be
>> a language for relational databases of sorts.
>
> I suggested SQLite. The SQL language it uses is a *distraction*
> and you don't need it... however...
I looked at it (its Wikipedia entry) now and I see it is a different
animal indeed.
> What it seems you need is the kind of reliable storage that
> SQLite provides
Sort of yes, but I am more after a hierarchical thing, like a directory
tree. I guess I'll be fine as I have made it; if I have to take steps
to compress it further than it is now, I know how to do it. But from
your feedback - and that of Don - it seems the extra few bytes spilled
per entry does not bother you much.
>... - and does so better than *anything* else you'll
> find, with a code size that is smaller than anything of comparable
> reliability.
I did not read enough to get to the code size, could you please post
some figure? Just for reference, I'd be curious to know, and other
people reading this might be as well.

Dimiter