
Windows registry diffs

Started by Don Y November 13, 2016
On 15.11.2016 г. 15:02, Don Y wrote:
> On 11/14/2016 4:19 AM, Dimiter_Popoff wrote:
>> On 14.11.2016 г. 01:52, Clifford Heath wrote:
>>> On 14/11/16 07:35, Dimiter_Popoff wrote:
>>>> Recently I introduced a sort of equivalent of it
>>> ...
>>>> I have implemented it on top of the dps file system,
>>>
>>> I don't know DPS, but that doesn't sound completely nuts,
>>> as long as DPS is journalled (though I expect, only on
>>> the directory structure, not the contents).
>>
>> In dps a directory can just be copied like any other file
>> and it will still point to the correct file entries at that
>> moment; it is not done automatically but I have done it in
>> the past. Then one can copy the directory file and set the
>> type of the copy to be non-directory so it won't confuse
>> the system by its duplicate pointers etc.
>> It is probably quite different from other filesystems as I
>> have done it without looking at many of them.
>
> I suspect you'd have the same sort of problem I'm currently
> encountering with the Windows Registry: how could you
> "painlessly" track changes made to your "global store",
> "who" (which process) made them AND easily "undo" them.
Hi Don, well, maybe. But being on top of the filesystem gives me more options, e.g. I have a "diff" script which compares all files in a directory to those in another; I could make it recursive like some others I have already (e.g. byte count: bc *.sa <top path>, etc.). It lists differing files and lists "not found" ones in the second directory (the second one on the command line). I don't think it will be a huge issue to locate a change this way; it may be that the changes are too many and I have to navigate through all the detected ones.
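[To illustrate the kind of recursive directory diff described above - a minimal sketch on a conventional filesystem, since dps itself is not public; the paths and the report format are stand-ins:]

    import os
    import filecmp

    def diff_trees(left, right, rel=""):
        """Compare two directory trees recursively: list files whose
        contents differ, and files present under `left` but missing
        under `right` (the second tree on the command line)."""
        ldir, rdir = os.path.join(left, rel), os.path.join(right, rel)
        for name in sorted(os.listdir(ldir)):
            lpath, rpath = os.path.join(ldir, name), os.path.join(rdir, name)
            relname = os.path.join(rel, name)
            if not os.path.exists(rpath):
                print("not found:", relname)
            elif os.path.isdir(lpath):
                diff_trees(left, right, relname)      # recurse into subdirs
            elif not filecmp.cmp(lpath, rpath, shallow=False):
                print("differs:  ", relname)          # byte-for-byte compare

    diff_trees("/store/snapshot", "/store/current")   # hypothetical paths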
> I've been amused to discover that this *is* possible under
> Windows; but, much harder under my formal RDBMS implementation!
> Especially the "undo them" (well, maybe I could build a giant
> "transaction" around everything but I'm willing to bet the
> RDBMS would die when faced with an open-ended issue like that!)
If I want to be able to undo something I would just make a copy of the "global store" at some point I might need to return to; making each operation "undoable" is not worth the overhead it would cost, I suppose.
> A solution for *you* might be to physically make a recursive
> copy of your global store's root folder "off to the side,
> somewhere" -- then, use that to replace the modified store
> later, to return it to its original contents.  (?)
Oh yes, the root and other directories. I have done that and various other things recovering from disasters; mainly of my own doing: I often write system code and run it on the same machine... I have had my moments, of course (I am tempted to say "not in the last 5 or 10 years" but I'd rather not pull the devil by the tail).
>>> Personally if I had to use a filesystem I'd use a
>>> "content store" of immutable objects like the back end
>>> of GIT though.
>>
>> I believe my equivalent to this would be to use the directory
>> entry to store data into rather than a pointer to a file.
>> A directory entry in dps can store two SDW (segment descriptor
>> word, starting_block:length each, a total of 4 longwords); I
>> did this ages ago in order to be able to access files which
>> are in up to two pieces directly from the directory entry;
>> if more pieces are involved then the directory entry points
>> to a RIB (retrieve information block, a list of SDW-s).
>
> This approach is used to store "symbolic links"; i.e., use
> the directory entry to store a text pathname to the linked
> file (if the pathname is short enough to fit *in* a
> directory entry) without incurring other filesystem costs
> (i.e., a real data block).
>
> E.g., if a directory entry is supposed to symbolically reference
> some other point in the filesystem hierarchy:
>     file1
>     folder/
>         file2 -> /file1
>         file3
> the "link" to "/file1" is stored in the dirent for "file2"
> *as* "/file1" so it carries no real overhead beyond that of its
> name ("file2")
I have seen the "link" type directory entries on unix machines, but if I were to do the same I'd have to store the name in a file: the directory entry is not long enough to hold the name of the target file (up to 16 bytes - the 2 SDW-s - plus another 8 in the EOF entry, while the name length can be up to 255 characters), let alone a complete path. But it is a practical thing to have, I may add it one of these days.
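[The "fast symlink" scheme Don describes, reduced to a toy sketch: keep the target path inline when it fits in the fixed entry, spill to a data block only when it does not. The 24-byte budget mirrors the dps figures above; everything else here is invented:]

    INLINE_BUDGET = 24      # 16 bytes of SDWs + 8 in the EOF entry

    def make_symlink(target: str):
        """Return a (kind, payload) pair for a directory entry: the
        target path is stored inline when it fits, otherwise in an
        external block (as a unix "fast symlink" would)."""
        raw = target.encode("ascii")
        if len(raw) <= INLINE_BUDGET:
            return ("inline-link", raw)        # no data block consumed
        return ("external-link", allocate_block(raw))

    def allocate_block(data: bytes) -> int:
        """Stand-in for real block allocation; returns a fake block no."""
        return hash(data) & 0xFFFF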
>> Now this would save me the 64 bytes for a file; but I'll have
>> to introduce another file type (there is room for that but
>> obviously I am cautious with such a step) which would
>> be treated as "null file". Not such a huge step now that I
>> think of it. But I'll leave it for later, it will not be
>> hard to change to it when I want to get rid of the 64 byte
>> file contents (the longnamed directory entry is very efficient
>> in that it is of variable length, depending on the name length,
>> e.g. a 7 character name takes 3 longwords, an 11 char. name
>> takes 4 etc.).
>>
>>> However, by far the most widespread solution used to this
>>> incredibly common problem is to use SQLite. Never mind that
>>> it's SQL, the underlying storage technology is some of the
>>> most heavily tested and reliable code that has ever been
>>> written. It's fully journalled, and the automated test
>>> platform simulates *every* point of failure (every error
>>> path) in the entire codebase, for every platform, on every
>>> release.
>>>
>>> SQLite is small enough that it gets used in almost every
>>> phone app, as well as larger things like web browsers.
>>> I don't think I've ever heard of a failure that resulted
>>> initially from corruption of such a database, and that's
>>> almost incredible by itself.
>>>
>>> Read more about their testing strategy here, it's admirable:
>>> <https://www.sqlite.org/testing.html>
>>
>> Thanks for the pointer (to SQL), I'll look into what it does
>> for ideas. I won't use it - so far all dps code is my own and
>> I want it to stay like this for now - but I can certainly have
>> a look at it to see how other people do it.
>
> There are other "lightweight, connectionless" schemes to maintain
> simple data in a "file-like hive". Some ~20 years ago, I was
> fond of "db" (and dbm/ndbm). But, they didn't have the full
> relational capabilities of more modern implementations
> (I'm not sure that capability would benefit you in your usage;
> it's helpful for me as very few actions do NOT involve a variety
> of JOINs)
I looked into SQL as Clifford suggested and it turned out to be a language for relational databases of sorts. I can see the appeal of that, reading here and in other posts of yours what you are doing (your song/artist/album etc. example), but I do not need all that for my "global store"; it does not need to be searchable very efficiently. Generally it is more like a locker room: you know where your locker is if you have to deal with it. It can be searched on top of what it is, of course - by indexing etc. for acceleration as needed - but at the moment I do not need that.

Dimiter
On 16/11/16 09:39, Dimiter_Popoff wrote:
> I looked into SQL as Clifford suggested and it turned out to be
> a language for relational databases of sorts.
I suggested SQLite. The SQL language it uses is a *distraction* and you don't need it... however... What it seems you need is the kind of reliable storage that SQLite provides - and it does so better than *anything* else you'll find, with a code size that is smaller than anything of comparable reliability.

Clifford Heath.
Hi Dimiter,

On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:

>>> In dps a directory can just be copied like any other file
>>> and it will still point to the correct file entries at that
>>> moment; it is not done automatically but I have done it in
>>> the past. Then one can copy the directory file and set the
>>> type of the copy to be non-directory so it won't confuse
>>> the system by its duplicate pointers etc.
>>> It is probably quite different from other filesystems as I
>>> have done it without looking at many of them.
>>
>> I suspect you'd have the same sort of problem I'm currently
>> encountering with the Windows Registry: how could you
>> "painlessly" track changes made to your "global store",
>> "who" (which process) made them AND easily "undo" them.
>
> well, maybe. But being on top of the filesystem gives me more
> options, e.g. I have a "diff" script which compares all files
> in a directory to those in another; I could make it recursive
> like some others I have already (e.g. byte count: bc *.sa
> <top path>, etc.). It lists differing files and lists "not
> found" ones in the second directory (the second one on the
> command line). I don't think it will be a huge issue to locate
> a change this way; it may be that the changes are too many and
> I have to navigate through all the detected ones.
There are several issues involved:
- finding the change
- reporting it in a meaningful way
- identifying the "culprit"

The Windows registry just supports a few different data types:
- binary
- string
- dword
- qword

So, there is very little "information" conveyed if you report that a dword changed from 0x1234 to 0x4343. OTOH, if a new key is added, that might convey some information (as they HOPEFULLY have descriptive names).

The advantage to a "real" database (I am playing fast and loose with my definition of "real") is that you tend to have more explicit types. And, the datum (field) indicates its type.

So, if a byte in my "persistent store" (RDBMS) changes from 0x12 to 0x13, I can see that this was part of a MAC address... or, a "currency" value, or a "text string", or a "book title" (if I define a type that is used to represent book titles!) or a UPC code, etc. And, I can identify who (process) changed it -- as well as knowing who CAN'T have changed it (due to the ACLs in place for that object).

(i.e., byte 27 changed from 0x88 to 0x73 probably doesn't help you recognize *significant* changes)
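[A concrete, if simplified, illustration of "finding the change": snapshot a registry subtree into a dict and diff two snapshots. This sketch uses Python's standard winreg module; the subtree path is chosen arbitrarily:]

    import winreg

    def snapshot(root, path):
        """Recursively read all values under a registry key into a
        flat dict mapping 'subkey\\name' -> (type, data)."""
        result = {}
        with winreg.OpenKey(root, path) as key:
            nsubkeys, nvalues, _ = winreg.QueryInfoKey(key)
            for i in range(nvalues):
                name, data, vtype = winreg.EnumValue(key, i)
                result[path + "\\" + name] = (vtype, data)
            for i in range(nsubkeys):
                sub = winreg.EnumKey(key, i)
                result.update(snapshot(root, path + "\\" + sub))
        return result

    before = snapshot(winreg.HKEY_CURRENT_USER, r"Software\SomeVendor")
    # ... install or run the suspect application here ...
    after = snapshot(winreg.HKEY_CURRENT_USER, r"Software\SomeVendor")

    for k in after.keys() - before.keys():
        print("added:  ", k, after[k])
    for k in before.keys() & after.keys():
        if before[k] != after[k]:
            print("changed:", k, before[k], "->", after[k])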
>> I've been amused to discover that this *is* possible under
>> Windows; but, much harder under my formal RDBMS implementation!
>> Especially the "undo them" (well, maybe I could build a giant
>> "transaction" around everything but I'm willing to bet the
>> RDBMS would die when faced with an open-ended issue like that!)
>
> If I want to be able to undo something I would just make a copy
> of the "global store" at some point I might need to return to;
> making each operation "undoable" is not worth the overhead it
> would cost, I suppose.
In my case, taking a snapshot of the database is expensive (because it is hundreds of tables, each of which can occupy many files, etc.). So, a "before and after" comparison isn't really practical. By contrast, the Windows registry is reasonably self-contained.
>>>> SQLite is small enough that it gets used in almost every
>>>> phone app, as well as larger things like web browsers.
>>>> I don't think I've ever heard of a failure that resulted
>>>> initially from corruption of such a database, and that's
>>>> almost incredible by itself.
>>>>
>>>> Read more about their testing strategy here, it's admirable:
>>>> <https://www.sqlite.org/testing.html>
>>>
>>> Thanks for the pointer (to SQL), I'll look into what it does
>>> for ideas. I won't use it - so far all dps code is my own and
>>> I want it to stay like this for now - but I can certainly have
>>> a look at it to see how other people do it.
>>
>> There are other "lightweight, connectionless" schemes to maintain
>> simple data in a "file-like hive". Some ~20 years ago, I was
>> fond of "db" (and dbm/ndbm). But, they didn't have the full
>> relational capabilities of more modern implementations
>> (I'm not sure that capability would benefit you in your usage;
>> it's helpful for me as very few actions do NOT involve a variety
>> of JOINs)
>
> I looked into SQL as Clifford suggested and it turned out to be
> a language for relational databases of sorts.
Yes, though you can use it in flat "databases" just as a means of binding "names" to "values".

In my case, as I had already opted to bear the cost of the RDBMS, it was silly NOT to avail myself of other features that it provides. If I had provided a filesystem interface, applications would undoubtedly just go about creating a bunch of ad hoc data files with very little in common -- in terms of formats, syntax, parsing code, etc.

Instead, I make it easier for applications to assign *meanings* to the data they store -- and to retrieve that information without having to do all the legwork to ensure the data hasn't been corrupted (by a user with a text editor who is careless or ignorant of the rules for THIS particular file). And, hopefully, allow applications to build on the mechanisms and meanings of the data created and maintained by others.

E.g., the music database example could easily be augmented by an application that tracks how OFTEN you play each song. Or, the most recent *time* that you played it, etc. Had the music "database" been some ad hoc file created by the "music player" application, the "music player TRACKER" application would have a harder time being implemented (poor documentation on the file formats, incomplete parsing algorithms for THAT file format vs. the "video player" application, etc.)
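[That flat name/value use in miniature, with Python's bundled sqlite3; the file name, table, and keys are invented for the example:]

    import sqlite3

    db = sqlite3.connect("settings.db")   # hypothetical store file
    db.execute("CREATE TABLE IF NOT EXISTS settings"
               " (name TEXT PRIMARY KEY, value)")

    def put(name, value):
        # INSERT OR REPLACE gives simple last-writer-wins semantics
        with db:                           # commits, rolls back on error
            db.execute("INSERT OR REPLACE INTO settings VALUES (?, ?)",
                       (name, value))

    def get(name, default=None):
        row = db.execute("SELECT value FROM settings WHERE name = ?",
                         (name,)).fetchone()
        return row[0] if row else default

    put("network/mtu", 1500)
    print(get("network/mtu"))              # -> 1500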
> I can see the appeal of that, reading here and in other posts
> of yours what you are doing (your song/artist/album etc.
> example), but I do not need all that for my "global store";
> it does not need to be searchable very efficiently. Generally
> it is more like a locker room: you know where your locker is
> if you have to deal with it. It can be searched on top of what
> it is, of course - by indexing etc. for acceleration as needed -
> but at the moment I do not need that.
As I said, I've been leveraging my decision to use the RDBMS to the point where I no longer have "const data" in my applications. All of that stuff gets fetched from tables at run-time.

Want to know what a '0' looks like vs. an 'O'? Load the templates for each of them! Want to know how *Bob* draws an 'O' vs. the way *Tom* draws it? Load Bob's O-template and compare it to Tom's!

This changes the performance criteria on the store as now it's part of algorithms with timeliness guarantees (instead of just "initialization" data).

Rescued some more toys, today, so a long night sorting stuff out... :>
On 16.11.2016 г. 04:30, Clifford Heath wrote:
> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> I looked into SQL as Clifford suggested and it turned out to be
>> a language for relational databases of sorts.
>
> I suggested SQLite. The SQL language it uses is a *distraction*
> and you don't need it... however...
I looked at it (its Wikipedia entry) now and I see it is a different animal indeed.
> What it seems you need is the kind of reliable storage that
> SQLite provides
Sort of, yes, but I am more after a hierarchical thing, like a directory tree. I guess I'll be fine with it as I have made it; if I have to take steps to compress it further than it is now, I know how to do it. But from your feedback - and that of Don - it seems the extra few bytes spilled per entry do not bother you much.
> ... - and does so better than *anything* else you'll
> find, with a code size that is smaller than anything of
> comparable reliability.
I did not read enough to get to the code size; could you please post some figures? Just for reference - I'd be curious to know, and other people reading this might be as well.

Dimiter
On 16.11.2016 г. 05:44, Don Y wrote:
> Hi Dimiter,
>
> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>
>>>> In dps a directory can just be copied like any other file
>>>> and it will still point to the correct file entries at that
>>>> moment; it is not done automatically but I have done it in
>>>> the past. Then one can copy the directory file and set the
>>>> type of the copy to be non-directory so it won't confuse
>>>> the system by its duplicate pointers etc.
>>>> It is probably quite different from other filesystems as I
>>>> have done it without looking at many of them.
>>>
>>> I suspect you'd have the same sort of problem I'm currently
>>> encountering with the Windows Registry: how could you
>>> "painlessly" track changes made to your "global store",
>>> "who" (which process) made them AND easily "undo" them.
>>
>> well, maybe. But being on top of the filesystem gives me more
>> options, e.g. I have a "diff" script which compares all files
>> in a directory to those in another; I could make it recursive
>> like some others I have already (e.g. byte count: bc *.sa
>> <top path>, etc.). It lists differing files and lists "not
>> found" ones in the second directory (the second one on the
>> command line). I don't think it will be a huge issue to locate
>> a change this way; it may be that the changes are too many and
>> I have to navigate through all the detected ones.
>
> There are several issues involved:
> - finding the change
> - reporting it in a meaningful way
> - identifying the "culprit"
Finding the change would be easy. Reporting it in a meaningful way depends on the entity it is reported to, i.e. on what it finds "meaningful" - if it is a human, this will depend on their knowledge. Identifying the culprit may or may not be possible - logging every event means logging the logging events involved, and so on into infinity, so we should draw the line some place :-).
> The Windows registry just supports a few different data types:
> - binary
> - string
> - dword
> - qword
I also think I have seen them store strings. The dps global store understands all the types and units an "indexed parameter" has known for 15+ years now (hopefully not 20 yet but I am not sure). Looking at the source I actually see it IS 20 years old now.... (all capitals, i.e. it has been written for my asm32 which ran on a different machine....). Here:

**************************************
*                                    *
*            Transgalactic           *
*            Instruments             *
*                                    *
**************************************
*
* PARAMETER OBJECT RELATED EQUATES
*
         ORG     0
*
PAR$FL0  DO.B    1       FLAG 0 (R/W ETC.)
         DO.B    1       RESERVED
PAR$DIX  DO.L    1       DEVICE DEPENDENT INDEX
PAR$DIM  DO.W    1       DIMENSION
PAR$MULT DO.W    1       POWER OF 10 (SIGNED) MULTIPLYER
PAR$TYPE DO.W    1       PARAMETER TYPE (DATA TYPE, BYTE,REAL,TEXT, ETC.)
PAR$DATA EQU     *       DATA FOLLOWING
*
* PAR$FL0 BIT DEFINITIONS
*
PR0$RD   EQU     0       CAN BE READ
PR0$WR   EQU     1       CAN BE WRITTEN TO
*
         IFUDF   PT$BU
*
* DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
* TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
* MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
*
         ORG     0
PT$LIST  DO.B    1       OBJECT LIST AT (A5)
PT$VAR   DO.B    1       A2 -> VAR NAME, D1=@VAR
PT$BU    DO.B    1       UNSIGNED BYTE
PT$BS    DO.B    1       SIGNED BYTE
PT$WU    DO.B    1       UNSIGNED WORD
PT$WS    DO.B    1       SIGNED WORD
PT$LU    DO.B    1       UNSIGNED LONG
PT$LS    DO.B    1       SIGNED LONG
PT$DU    DO.B    1       UNSIGNED DUAL
PT$DS    DO.B    1       SIGNED DUAL (64-BIT)
PT$QU    DO.B    1       UNSIGNED QUAD
PT$QS    DO.B    1       SIGNED QUAD
PT$FS    DO.B    1       FP SINGLE PRECISION
PT$FD    DO.B    1       FP DUAL PRECISION
PT$FX    DO.B    1       FP .X
pt$alst  do.b    1       allocated list (same as pt$list but can be deallocated)
*
         ENDC
*
* units
*
         ORG     0
DIM$NUMB DO.B    1       UNDEFINED - JUST A NUMBER
DIM$SEC  DO.B    1       SECONDS
DIM$MIN  DO.B    1       MINUTES
DIM$HOUR DO.B    1       HOURS
DIM$DAY  DO.B    1       DAY
DIM$WEEK DO.B    1       WEEK
DIM$MON  DO.B    1       MONTH
DIM$YEAR DO.B    1       YEAR; FURHTER WITH MULTIPLIER
DIM$METR DO.B    1       METER
DIM$DEG  DO.B    1       DEGREE
DIM$RAD  DO.B    1       RADIAN
DIM$GRAD DO.B    1       GRAD (100GR=90 DEG)
DIM$VOLT DO.B    1       VOLT
DIM$AMP  DO.B    1       AMPERE
DIM$OHM  DO.B    1       OHM
DIM$FAR  DO.B    1       FARAD
DIM$HEN  DO.B    1       HENRY
DIM$BQ   DO.B    1       BECQUEREL
DIM$CU   DO.B    1       CURIE
DIM$ROE  DO.B    1       ROENTGEN
DIM$SLIC DO.B    1       SYSTEM TIME SLICES
DIM$CELS DO.B    1       DEGREE CELSIUS
DIM$FRNH DO.B    1       DEGREE FAHRENHEIT
dim$hz   do.b    1       hertz
*
         END

The "parameter object" has been abandoned ages ago, probably never used since it was first introduced. But maybe some code using it is still in use. The types and units are widely deployed.

The "units" (called "dim" because I had not thought of the correct English word; in Bulgarian a "unit" is called a "dimension") contain entries which were never used, but it is a 16-bit field so no problem is expected soon out of that.

The "list" type is quite generic, it can be any sequence of dps inherent objects (lowest level objects, like horizontal line, text string etc., not the "object" I refer to elsewhere which is the basis of the dps runtime object system; the latter is "extobj", one of the many low level objects). But I mostly use it for text strings, which can be "pasted" (de-encapsulated and written to some memory address).

What comes with all the types is the check for overflow (hence the signed and unsigned types); setting a parameter will fail if the supplied type and unit are not as expected. In the global store this is not the case, the type/unit will be overwritten with the latest (I think...).
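[For readers who don't speak 68k-style assembly: the equates above describe a 12-byte header preceding the data. A purely illustrative Python rendering, assuming big-endian layout and the field sizes in the listing:]

    import struct

    # PAR$FL0 (byte), reserved (byte), PAR$DIX (long), PAR$DIM (word),
    # PAR$MULT (signed word), PAR$TYPE (word) -- big-endian, per above
    PAR_HEADER = struct.Struct(">BBIHhH")

    PR0_RD, PR0_WR = 0x01, 0x02        # bit 0 / bit 1 of PAR$FL0

    hdr = PAR_HEADER.pack(PR0_RD | PR0_WR,  # readable and writable
                          0,                # reserved
                          42,               # device dependent index
                          1,                # dimension, e.g. DIM$SEC
                          -3,               # power-of-10 multiplier: milli
                          4)                # parameter type, e.g. uns. word
    print(PAR_HEADER.size, hdr.hex())       # 12-byte header before PAR$DATA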
> So, there is very little "information" conveyed if you report
> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
> key is added, that might convey some information (as they
> HOPEFULLY have descriptive names).
Paths and names are what I rely on for meaning, ownership etc. indeed.
> The advantage to a "real" database (I am playing fast and
> loose with my definition of "real") is that you tend to have
> more explicit types. And, the datum (field) indicates its type.
>
> So, if a byte in my "persistent store" (RDBMS) changes from
> 0x12 to 0x13, I can see that this was part of a MAC address...
> or, a "currency" value, or a "text string", or a "book title"
> (if I define a type that is used to represent book titles!)
> or a UPC code, etc.
>
> And, I can identify who (process) changed it -- as well as
> knowing who CAN'T have changed it (due to the ACLs in place
> for that object).
Hmmm, identifying the process which changed it may be useful but maybe not so straightforward for that purpose. What if it has been modified by a process (task) which was killed and then another ran in its place? In dps this is countered by identifying tasks not just by their task descriptor ID (the offset to access it, really) but in addition by their spawn moment (system time). This does not survive reset though... one would have to include the starting moment of the boot session.
> Rescued some more toys, today, so a long night sorting stuff out... :>
Hah, sounds like you will have some fun :-).

Dimiter
On 11/16/2016 4:43 AM, Dimiter_Popoff wrote:
> On 16.11.2016 г. 05:44, Don Y wrote:
>> On 11/15/2016 3:39 PM, Dimiter_Popoff wrote:
>>
>>>>> In dps a directory can just be copied like any other file
>>>>> and it will still point to the correct file entries at that
>>>>> moment; it is not done automatically but I have done it in
>>>>> the past. Then one can copy the directory file and set the
>>>>> type of the copy to be non-directory so it won't confuse
>>>>> the system by its duplicate pointers etc.
>>>>> It is probably quite different from other filesystems as I
>>>>> have done it without looking at many of them.
>>>>
>>>> I suspect you'd have the same sort of problem I'm currently
>>>> encountering with the Windows Registry: how could you
>>>> "painlessly" track changes made to your "global store",
>>>> "who" (which process) made them AND easily "undo" them.
>>>
>>> well, maybe. But being on top of the filesystem gives me more
>>> options, e.g. I have a "diff" script which compares all files
>>> in a directory to those in another; I could make it recursive
>>> like some others I have already (e.g. byte count: bc *.sa
>>> <top path>, etc.). It lists differing files and lists "not
>>> found" ones in the second directory (the second one on the
>>> command line). I don't think it will be a huge issue to locate
>>> a change this way; it may be that the changes are too many and
>>> I have to navigate through all the detected ones.
>>
>> There are several issues involved:
>> - finding the change
>> - reporting it in a meaningful way
>> - identifying the "culprit"
>
> Finding the change would be easy. Reporting it in a meaningful way
> depends on the entity it is reported to, i.e. on what it finds
> "meaningful" - if it is a human, this will depend on their knowledge.
In my case (the reason for the post), it's a way of identifying changes made to the system that I might not EXPECT to have occurred (e.g., an application remapping certain file extensions to its own handler instead of the handler that I'd previously "been happy with"). It's easier to just *see* what it has added/changed than to stumble across the consequences of those changes -- maybe days or weeks/months later (then having to sort out how to undo them and any *further* changes).

Then, to see how ANOTHER application potentially mucks with the settings put in place by the first. Or, how a *newer* version of an application changes settings from an earlier version, etc.

In Windows, this sucks because there are so few data types and the documentation for each registry setting is usually nonexistent for any particular application.
> Identifying the culprit may or may not be possible - logging every
> event means logging the logging events involved, and so on into
> infinity, so we should draw the line some place :-).
Again, a difference in our expectations of the store. In my case, as it is the sole means of storing stuff, it is *huge* (terabytes). E.g., every executable is retrieved from the store and loaded, on demand, at runtime. Every song you want to play, the time of every incoming phone call, voice recordings of those calls (think: answering machine), surveillance video, etc.

A notion of "tablespaces" (i.e., store THIS table on THAT physical medium) lets me present a unified interface to the store yet still move data objects around "behind the scenes". E.g., settings that change frequently should be backed by BBSRAM; OTOH, silly to store music (which is largely immutable) there -- NAND FLASH would be a better choice; and, surveillance video on magnetic disks, etc.

So, logging "process ID", "time of change", and "change" isn't a huge resource hog. :> And, a log need not be boundless; you can elect to just save the last N transactions, etc.

But, having ACLs in place means I can already narrow down the list of potential offenders: who had *permission* to make that change?
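[A toy sketch of that kind of bounded who/when/what change log, again on sqlite3; the field names and the N=1000 cap are invented for the example:]

    import os, time, sqlite3

    db = sqlite3.connect("store.db")
    db.executescript("""
    CREATE TABLE IF NOT EXISTS settings (name TEXT PRIMARY KEY, value);
    CREATE TABLE IF NOT EXISTS audit (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        who INTEGER, at REAL, name TEXT, oldval, newval);
    """)

    MAX_LOG = 1000   # keep only the last N transactions

    def put(name, value):
        with db:     # one atomic transaction: change + log + prune
            old = db.execute("SELECT value FROM settings WHERE name=?",
                             (name,)).fetchone()
            db.execute("INSERT OR REPLACE INTO settings VALUES (?,?)",
                       (name, value))
            db.execute("INSERT INTO audit (who, at, name, oldval, newval)"
                       " VALUES (?,?,?,?,?)",
                       (os.getpid(), time.time(), name,
                        old[0] if old else None, value))
            db.execute("DELETE FROM audit WHERE id <= "
                       "(SELECT max(id) FROM audit) - ?", (MAX_LOG,))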
>> The Windows registry just supports a few different data types:
>> - binary
>> - string
>> - dword
>> - qword
>
> I also think I have seen them store strings. The dps global store
> understands all the types and units an "indexed parameter" has
> known for 15+ years now (hopefully not 20 yet but I am not sure).
> Looking at the source I actually see it IS 20 years old now....
> (all capitals, i.e. it has been written for my asm32 which ran
> on a different machine....). Here:
>
> **************************************
> *                                    *
> *            Transgalactic           *
> *            Instruments             *
> *                                    *
> **************************************
> *
> * PARAMETER OBJECT RELATED EQUATES
> *
>          ORG     0
Is this a hack to effectively define:

    PAR$FL0  EQU 0    * byte = 1 byte
    <unused> EQU 1    * byte = 1 byte
    PAR$DIX  EQU 2    * long = 4 bytes
    PAR$DIM  EQU 6    * word = 2 bytes
    PAR$MULT EQU 8
    ...

*Or*, are these all "members" of a "parameter object"?
> PAR$FL0  DO.B    1       FLAG 0 (R/W ETC.)
>          DO.B    1       RESERVED
> PAR$DIX  DO.L    1       DEVICE DEPENDENT INDEX
> PAR$DIM  DO.W    1       DIMENSION
> PAR$MULT DO.W    1       POWER OF 10 (SIGNED) MULTIPLYER
> PAR$TYPE DO.W    1       PARAMETER TYPE (DATA TYPE, BYTE,REAL,TEXT, ETC.)
> PAR$DATA EQU     *       DATA FOLLOWING
> *
> * PAR$FL0 BIT DEFINITIONS
> *
> PR0$RD   EQU     0       CAN BE READ
> PR0$WR   EQU     1       CAN BE WRITTEN TO
> *
>          IFUDF   PT$BU
> *
> * DEVICE DRIVER (OR OTHER) INTERACTION TYPE DEFINITIONS
> * TYPE PASSED/RETURNED IN D5,(INDEX IN D6),DATA IN D1 UP TO D4,AS
> * MUCH AS IT TAKES IF NUMERIC; IF LIST OF OBJECTS, A5 ->
> *
>          ORG     0
E.g., here, it looks like you are using this as a "trick" to define a bunch of mutually exclusive values (LIST, VAR, BU, BS...) as constants.
> PT$LIST  DO.B    1       OBJECT LIST AT (A5)
> PT$VAR   DO.B    1       A2 -> VAR NAME, D1=@VAR
> PT$BU    DO.B    1       UNSIGNED BYTE
> PT$BS    DO.B    1       SIGNED BYTE
> PT$WU    DO.B    1       UNSIGNED WORD
> PT$WS    DO.B    1       SIGNED WORD
> PT$LU    DO.B    1       UNSIGNED LONG
> PT$LS    DO.B    1       SIGNED LONG
> PT$DU    DO.B    1       UNSIGNED DUAL
> PT$DS    DO.B    1       SIGNED DUAL (64-BIT)
> PT$QU    DO.B    1       UNSIGNED QUAD
> PT$QS    DO.B    1       SIGNED QUAD
> PT$FS    DO.B    1       FP SINGLE PRECISION
> PT$FD    DO.B    1       FP DUAL PRECISION
> PT$FX    DO.B    1       FP .X
> pt$alst  do.b    1       allocated list (same as pt$list but can be
>                          deallocated)
> *
>          ENDC
> *
> * units
> *
>          ORG     0
> DIM$NUMB DO.B    1       UNDEFINED - JUST A NUMBER
> DIM$SEC  DO.B    1       SECONDS
> DIM$MIN  DO.B    1       MINUTES
> DIM$HOUR DO.B    1       HOURS
> DIM$DAY  DO.B    1       DAY
> DIM$WEEK DO.B    1       WEEK
> DIM$MON  DO.B    1       MONTH
> DIM$YEAR DO.B    1       YEAR; FURHTER WITH MULTIPLIER
> DIM$METR DO.B    1       METER
> DIM$DEG  DO.B    1       DEGREE
> DIM$RAD  DO.B    1       RADIAN
> DIM$GRAD DO.B    1       GRAD (100GR=90 DEG)
> DIM$VOLT DO.B    1       VOLT
> DIM$AMP  DO.B    1       AMPERE
> DIM$OHM  DO.B    1       OHM
> DIM$FAR  DO.B    1       FARAD
> DIM$HEN  DO.B    1       HENRY
> DIM$BQ   DO.B    1       BECQUEREL
> DIM$CU   DO.B    1       CURIE
> DIM$ROE  DO.B    1       ROENTGEN
> DIM$SLIC DO.B    1       SYSTEM TIME SLICES
> DIM$CELS DO.B    1       DEGREE CELSIUS
> DIM$FRNH DO.B    1       DEGREE FAHRENHEIT
> dim$hz   do.b    1       hertz
> *
>          END
>
> The "parameter object" has been abandoned ages ago, probably
> never used since it was first introduced. But maybe some code
> using it is still in use. The types and units are widely deployed.
The RDBMS that I'm using supports a variety of "natural" types:
<https://www.postgresql.org/docs/9.5/static/datatype.html#DATATYPE-TABLE>
and adjusts the storage required (as well as the sanity checks that are applied to values -- e.g., a "date" has different rules than a "point") accordingly.

But, I can freely augment the list of data types with types of my own. E.g., I have a "Bezier" type that is used to represent cubic bezier curves. Another that is used to represent ISBN identifiers. Etc.

Additionally, I can define operators that apply to those particular data types. E.g., publisher() yields the publisher code of a particular ISBN identifier:
<https://en.wikipedia.org/wiki/List_of_group-0_ISBN_publisher_codes>
And, is_line() tells me if a particular Bezier is actually a straight line segment, etc.

Additionally, I can impose constraints on data items that the RDBMS will enforce. I.e., "ensure the time of an incoming phone call is LATER than the time of the call that preceded it". As such, the RDBMS can be seen as a contract enforcer for its clients.

I consider the store to be incorruptible; if something is accepted by the store, it will always return that same value (or, the most recent value for that "thing"). So, clients don't have to check the sanity of anything coming FROM the store.
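[The "contract enforcer" idea in miniature - not Don's PostgreSQL custom types, just a sqlite3 CHECK constraint standing in for the same principle; the MAC-address table is invented:]

    import sqlite3

    db = sqlite3.connect(":memory:")
    # The CHECK clause makes the store itself reject malformed values,
    # so readers never need to re-validate what they fetch.
    db.execute("""
    CREATE TABLE nic (
        name TEXT PRIMARY KEY,
        mac  TEXT CHECK (mac GLOB
             '[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:'
          || '[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]')
    )""")

    db.execute("INSERT INTO nic VALUES ('eth0', '00:1a:2b:3c:4d:5e')")
    try:
        db.execute("INSERT INTO nic VALUES ('eth1', 'not-a-mac')")
    except sqlite3.IntegrityError as e:
        print("rejected by the store:", e)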
> The "units" (called "dim" because I have not thought of the correct > English word, in Bulgarian a "unit" is called a "dimension") contains > entries which were never used but it is a 16-bit field so no > problem expected soon out of that. > > The "list" type is quite generic, it can be any sequence of > dps inherent objects (lowest level objects, like horizontal line, > text string etc., not the "object" I refer to elsewhere which > is the basis of the dps runtime object system, the latter is > "extobj", one of the many low level objects). But I mostly use > it for text strings, these can be "pasted" (de-encapsulated > and written to some memory address).
I can't support "bags" (groups of objects of inconsistent types) but can support an array of any standard type: <https://www.postgresql.org/docs/9.5/static/arrays.html>
> What comes with all the types is the check for overflow (hence
> the signed and unsigned types); setting a parameter will
> fail if the supplied type and unit are not as expected. In the
> global store this is not the case, the type/unit will be
> overwritten with the latest (I think...).
I can specify a valid range of values for certain types:
<https://www.postgresql.org/docs/9.5/static/rangetypes.html>
For other types, I have to bear some cost for creating the constraints that apply to that type. E.g., if I wanted to ensure a particular IP address was in a particular subnet...
>> So, there is very little "information" conveyed if you report
>> that a dword changed from 0x1234 to 0x4343. OTOH, if a new
>> key is added, that might convey some information (as they
>> HOPEFULLY have descriptive names).
>
> Paths and names are what I rely on for meaning, ownership etc.
> indeed.
Yes. And, for the identifiers (and positions in the "namespace hierarchy") that YOU choose, this will probably work. But, in my case, I can't rely on a future developer (or USER!) to pick good names. So, I want to facilitate that effort by imposing the fewest impediments to picking arbitrary names (within the confines of the SQL language)
>> The advantage to a "real" database (I am playing fast and
>> loose with my definition of "real") is that you tend to have
>> more explicit types. And, the datum (field) indicates its type.
>>
>> So, if a byte in my "persistent store" (RDBMS) changes from
>> 0x12 to 0x13, I can see that this was part of a MAC address...
>> or, a "currency" value, or a "text string", or a "book title"
>> (if I define a type that is used to represent book titles!)
>> or a UPC code, etc.
>>
>> And, I can identify who (process) changed it -- as well as
>> knowing who CAN'T have changed it (due to the ACLs in place
>> for that object).
>
> Hmmm, identifying the process which changed it may be useful
> but maybe not so straightforward for that purpose. What if
> it has been modified by a process (task) which was killed and
> then another ran in its place? In dps this is countered by
> identifying tasks not just by their task descriptor ID (the
> offset to access it, really) but in addition by their spawn
> moment (system time). This does not survive reset though...
> one would have to include the starting moment of the boot
> session.
Yes. In my case, if I start a "job" many times over the lifetime of the system, identifying which instance is the culprit is problematic. But, you'd really only need this ability as a diagnostic tool; "who screwed with this?". It would be nigh on impossible to catch a one-time event. But, if it is a repeatable "bug", you should be able to manually fix a setting (object) and then watch (instrument) to see how/when it changes thereafter. I can conceivably write a trigger that does this watching for me (if I know what to look for) and turns on a red light when it is tripped, etc.
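[Such a "tripwire" can be sketched with an ordinary SQL trigger; here on sqlite3, with the watched key and the alerts table invented for the example:]

    import sqlite3

    db = sqlite3.connect(":memory:")
    db.executescript("""
    CREATE TABLE settings (name TEXT PRIMARY KEY, value);
    CREATE TABLE alerts (at TEXT DEFAULT CURRENT_TIMESTAMP,
                         name TEXT, oldval, newval);

    -- the "red light": record any change to the watched setting
    CREATE TRIGGER watch_setting AFTER UPDATE ON settings
    WHEN NEW.name = 'file_assoc/.pdf'
    BEGIN
        INSERT INTO alerts (name, oldval, newval)
        VALUES (NEW.name, OLD.value, NEW.value);
    END;
    """)

    db.execute("INSERT INTO settings VALUES ('file_assoc/.pdf', 'good_viewer')")
    db.execute("UPDATE settings SET value = 'evil_viewer'"
               " WHERE name = 'file_assoc/.pdf'")
    print(db.execute("SELECT * FROM alerts").fetchall())  # tripwire fired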
>> Rescued some more toys, today, so a long night sorting stuff out... :>
>
> Hah, sounds like you will have some fun :-).
<frown>  Until I have to decide what to DISCARD to accommodate the NEW ADDITIONS!  :-/  "Simplify"  <big frown>

I rescued a second (i.e., "spare") 2KW UPS:
<https://www.amazon.com/APC-SMT2200RM2U-2200VA-120V-Smart-UPS/dp/B004F09D0O>
for my automation system. Coupled with an "expansion battery pack" (48V @ 15AHr), I should be able to keep the system "up" for a few hours without shedding loads... the better part of a *day* with active load management! (beyond that, you've got bigger problems! :> )
On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
> On 16.11.2016 г. 04:30, Clifford Heath wrote:
>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>> What it seems you need is the kind of reliable storage that
>> SQLite provides
>
> Sort of, yes, but I am more after a hierarchical thing, like a
> directory tree. I guess I'll be fine with it as I have made it;
> if I have to take steps to compress it further than it is now,
> I know how to do it. But from your feedback - and that of Don -
> it seems the extra few bytes spilled per entry do not bother
> you much.
In the case of PostgreSQL, it's not JUST an "extra few bytes"! :< There's a fair bit of (data) overhead that gets dragged in with the data. Especially if you want to optimize access to that data or tie it to other data (primary/foreign keys).
>> ... - and does so better than *anything* else you'll
>> find, with a code size that is smaller than anything of
>> comparable reliability.
>
> I did not read enough to get to the code size; could you please
> post some figures? Just for reference - I'd be curious to know,
> and other people reading this might be as well.
I think about a quarter megabyte -- depending on which features you include (exclude). For PostgreSQL, that number increases about 5-fold. [Given that "investment", I try to leverage as much functionality out of the RDBMS as is conceivable!]

But, these are addressing different sorts of problems. I suspect an even "simpler" (name,value) store (e.g., ndbm) would be considerably smaller -- and less feature-full.

E.g., do you have to support concurrent readers/writers and guarantee atomic access? Or, can you afford to do that in a "wrapper" around the "global store" (a "monitor", of sorts). Do you have to support "transactions" and be able to roll back operations on the store based on conflicts encountered with "later" operations in that same transaction? Etc.

Sizewise, your approach will almost always "win" -- because you're just special-casing the filesystem code (e.g., to handle your namespace). If you wanted to ensure no two clients (threads/processes) could compete for a particular "setting/parameter", you could implement that mechanism with simple file locking: pend on lock, make change (to value or directory), release lock. Etc.

By contrast, SQLite/PostgreSQL don't really use the underlying filesystem for anything more than "bulk storage".
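[That lock/change/release wrapper, sketched portably - the lockfile convention, timeout, and setting path are invented; on POSIX one could use fcntl.flock instead:]

    import os, time, tempfile

    class SettingLock:
        """Advisory per-setting lock: an O_EXCL lockfile beside the value."""
        def __init__(self, path, timeout=5.0):
            self.lockpath = path + ".lock"
            self.timeout = timeout
        def __enter__(self):
            deadline = time.monotonic() + self.timeout
            while True:
                try:    # O_EXCL creation is atomic: only one winner
                    os.close(os.open(self.lockpath,
                                     os.O_CREAT | os.O_EXCL))
                    return self
                except FileExistsError:
                    if time.monotonic() > deadline:
                        raise TimeoutError("lock held: " + self.lockpath)
                    time.sleep(0.01)          # pend on lock
        def __exit__(self, *exc):
            os.remove(self.lockpath)          # release lock

    def set_value(path, data: bytes):
        with SettingLock(path):
            # write-then-rename so readers never see a half-written value
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
            os.write(fd, data)
            os.close(fd)
            os.replace(tmp, path)

    set_value("mtu.setting", b"1500")         # hypothetical setting file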
On 17/11/16 04:33, Don Y wrote:
> On 11/16/2016 4:03 AM, Dimiter_Popoff wrote:
>> On 16.11.2016 г. 04:30, Clifford Heath wrote:
>>> On 16/11/16 09:39, Dimiter_Popoff wrote:
>
>>> What it seems you need is the kind of reliable storage that
>>> SQLite provides
>>
>> Sort of, yes, but I am more after a hierarchical thing, like a
>> directory tree. I guess I'll be fine with it as I have made it;
>> if I have to take steps to compress it further than it is now,
>> I know how to do it. But from your feedback - and that of Don -
>> it seems the extra few bytes spilled per entry do not bother
>> you much.
>
> In the case of PostgreSQL, it's not JUST an "extra few bytes"! :<
> There's a fair bit of (data) overhead that gets dragged in
> with the data. Especially if you want to optimize access to
> that data or tie it to other data (primary/foreign keys).
>
>>> ... - and does so better than *anything* else you'll
>>> find, with a code size that is smaller than anything of
>>> comparable reliability.
>>
>> I did not read enough to get to the code size; could you please
>> post some figures? Just for reference - I'd be curious to know,
>> and other people reading this might be as well.
They have an entire web page for that - 2nd hit in Google: <https://www.sqlite.org/footprint.html>
> I think about a quarter megabyte -- depending on which features
> you include (exclude).
A quarter (all OMIT options) up to half a megabyte (no OMITs). Yes, it's quite a lot. Yes, it would be possible to do the same in a much smaller library. No, no-one has actually done that in a widely-used, well-tested library. db/ndbm comes close.

FWIW, almost all such systems since 1990 were built by following the instructions from just one book: "Transaction Processing: Concepts and Techniques" by Gray and Reuter. An incredibly influential book. This kind of reliability requires techniques that are very non-obvious - much more so if you want to maximise concurrent processing - and those techniques were closely-guarded trade secrets until this book was published. I can highly recommend it to anyone interested in making any composite action appear to be atomic (this is the core idea of a "transaction").
> For PostgreSQL, that number increases about 5-fold.
Postgres is not suitable for this. It requires periodic "vacuum"ing and other maintenance.
> E.g., do you have to support concurrent readers/writers and
> guarantee atomic access? Or, can you afford to do that in a
> "wrapper" around the "global store" (a "monitor", of sorts).
SQLite uses a global lock, so each transaction is single-threaded. That's why it's so much smaller than e.g. Postgres. However, it's still *really* hard to guarantee atomicity across failure restarts. The method that gives you this also gives you roll-back for free.
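[What "atomic across failure restarts" buys you, in miniature: with sqlite3, either both writes below land or neither does - on an exception here, or (via the journal) on a crash mid-transaction. The account rows are invented:]

    import sqlite3

    db = sqlite3.connect("ledger.db")
    db.execute("CREATE TABLE IF NOT EXISTS acct"
               " (name TEXT PRIMARY KEY, cents INTEGER)")
    db.executemany("INSERT OR IGNORE INTO acct VALUES (?, ?)",
                   [("savings", 10_000), ("checking", 0)])

    try:
        with db:   # BEGIN ... COMMIT; any failure rolls the whole thing back
            db.execute("UPDATE acct SET cents = cents - 2500"
                       " WHERE name = 'savings'")
            raise RuntimeError("power fails here")   # simulated mid-txn death
            db.execute("UPDATE acct SET cents = cents + 2500"
                       " WHERE name = 'checking'")   # never reached
    except RuntimeError:
        pass

    print(db.execute("SELECT * FROM acct ORDER BY name").fetchall())
    # -> [('checking', 0), ('savings', 10000)] -- the half-done debit undone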
> Sizewise, your approach will almost always "win" -- because
> you're just special-casing the filesystem code (e.g., to handle
> your namespace).
But you're then *totally* reliant on the filesystem to provide atomicity across unexpected restarts - and that is almost certainly not the case.
> By contrast, SQLite/PostgreSQL don't really use the underlying
> filesystem for anything more than "bulk storage".
And even as "bulk storage" those are significant possible points of failure. Consider that a database "page" which might be 8K is made up of multiple sectors, and a power fail can result in a block that has been only part written (so you get a so-called "torn page"). Torn page detection involves writing a page checksum to the start and end of each page, and checking both against the actual data on every read. Drives (including SSD) do so much caching and write rescheduling that you cannot really know which "completed" writes have actually completed. Newer drives provide special modes and operations which can be used to increase reliability for DBMS, in addition to global flush all writes type features. Clifford Heath
On 11/13/2016 11:34 AM, Don Y wrote:
> Is there a *simple* means (I've already found a complicated one)
> of noting the changes made to the registry by installing X vs. Y?
>
> (Consider the different cases: install X, then Y vs. Y then X)
Well, the simplest solution is to install X in a sandbox and trap the registry (as well as filesystem) changes. Then, Y in a sandbox *within* that sandbox for similar reasons. Dump both sandboxes and repeat, swapping the order of X and Y.
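[One low-tech way to dump and compare each sandbox's registry state is Windows' own "reg export" plus a text diff; a sketch, with the hive path and file names arbitrary:]

    import subprocess, difflib

    def dump(hive_path, outfile):
        # 'reg export' writes a .reg text rendering of the subtree
        subprocess.run(["reg", "export", hive_path, outfile, "/y"],
                       check=True)

    dump(r"HKCU\Software", "before.reg")  # snapshot, then install X (or Y)
    dump(r"HKCU\Software", "after.reg")

    # .reg exports are typically UTF-16 with a BOM
    with open("before.reg", encoding="utf-16") as a, \
         open("after.reg", encoding="utf-16") as b:
        for line in difflib.unified_diff(a.readlines(), b.readlines(),
                                         "before.reg", "after.reg"):
            print(line, end="")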
On Tuesday, November 15, 2016 at 8:02:04 AM UTC-5, Don Y wrote:
[]
> I suspect you'd have the same sort of problem I'm currently
> encountering with the Windows Registry: how could you
> "painlessly" track changes made to your "global store",
> "who" (which process) made them AND easily "undo" them.
>
> I've been amused to discover that this *is* possible under
> Windows; but, much harder under my formal RDBMS implementation!
> Especially the "undo them" (well, maybe I could build a giant
> "transaction" around everything but I'm willing to bet the
> RDBMS would die when faced with an open-ended issue like that!)
Hi Don,

Since you are working in an RDBMS, I am surprised you have not yet tried a transaction history. About 15 years ago, I was contracting on an Oracle project. The system build managed a set of transaction tables that logged who, what, and when changes were made. We were able to use that tracking storage for the work I was doing, which was "fixing" data errors.

The initial data populated into the database was of pretty bad quality, coming from an old ISAM-style system. Mainly address data, and rural addresses at that. So we iteratively identified address corrections, applied them, and then found where a change caused other issues and could back out the whole change, or even just subsets.

It does not necessarily have to cost a lot more storage: a binary mapping of what changes in a row of your working table, and a separate table of those changed columns. Timestamps and user IDs and such can be in the tracking tables also.

I am guessing the RDBMS exists on larger nodes in your system that have more resources. But if you are resource limited, then your options are limited also and this may not be a possible option. But I hope it helps.

Ed
