On a sunny day (Thu, 13 Nov 2014 13:30:18 -0500) it happened George Neuner
<gneuner2@comcast.net> wrote in <mbs96adq3ae7v38q888p0kdn1dflsq32o7@4ax.com>:

>On Tue, 04 Nov 2014 15:50:31 GMT, Jan Panteltje
><pNaonStpealmtje@yahoo.com> wrote:
>
>> I'm sure [Don] will not invent a better gzip...
>
>FYI: gzip is _not_ the last word in general purpose compression.
>
>gzip uses on-the-fly LZH dictionary compression. gzip is pretty good,
>but 7z's LZMA usually does better.
>
>However, no on-the-fly compressor can do as well as a tool that
>performs batch analysis of the file(s) prior to compression and
>creates a dictionary customized for the batch.
>
>There used to be a number of batch oriented compression tools, but
>their 2-pass approach made them ever less suitable for handling ever
>larger batches. When LZ was introduced, streaming compression became
>"good enough" for general purpose and so the batch approach fell out
>of favor.
>
>While reserving judgment on whether Don could beat gzip for general
>purpose, he certainly should be able to beat it for his specialized
>purpose.
>
>George

Only if he knows something about the filesystem, but he claims 'agnostic'.
Yes, if you make assumptions about the data you can beat gzip IF your assumptions are right.

Long time ago there was a fun discussion in sci.crypt, and I wanted to make a joke about infinitely long files (with random numbers): "Just zip it".
Now if you think that was simple... I took that question to sci.math, and after finding out about many types of infinities replaced the whole file by '00', one token...

There is more to it; let Don fight with it, humanity will appreciate his better compressor when it is released as open source.
It is not only the filesystem, it is also the sort of data stored.
Disk imaging strategy
Started by ●November 2, 2014
Reply by ●November 13, 2014
Reply by ●November 23, 2014
[This is REALLY LONG, so if not genuinely interested in the problem, click NEXT. OTOH, I wrote it slowly for those of you who can't read fast... :>

The followups are even LONGER and more rambling as they reflect free-form comments while I was experimenting with the data. They represent my working notes so they don't attempt to be focused, grammatically correct, etc. But, rather, they suggest ways in which the topic can LATER be described. This post, itself, is also written with that document in mind!]

On 11/2/2014 8:25 AM, Don Y wrote:
> I'm writing a bit of code to image disk contents REGARDLESS OF THE
> FILESYSTEM(s) contained thereon.
> I.e., without knowledge of the specific filesystem(s) involved, you don't
> know how to recognize live data from deleted data.
>
> The *hack* that I am currently evaluating is to invoke a trivial executable
> UNDER THE NATIVE OS that simply creates large "blank" (i.e., highly
> compressible) files until the volume is "full", then unlinks them all.
> Doing this while the system is reasonably quiescent isn't guaranteed to
> "vacuum" all available space but would make a big dent in it (if the
> system is brought down shortly thereafter).
> Then, dd | compress (on bare iron).

OK, I've accumulated data from most of the boxes that I have, here, plus one (make/model) laptop from one of my pro bono gigs.

These confirm that the "trivial" approach I mentioned will work without requiring any "filesystem-aware" code *or* a bulky OS installed solely to "restore" the image (i.e., format the potentially corrupt media, recreate empty filesystem(s), restore file content and any special "attributes", verify the filesystem(s)' integrity, etc.). I.e., *MY* restore algorithm is a few KB instead of many MB!

Typical data when processing "binaries" below. Data for the SPARC's and NAS boxes are similar. The trivial algorithm always yields the SMALLEST filesystem-independent image (dump/tar/partclone all require an *appropriate* filesystem to be recreated prior to restore):

+++++++++++++++ Executive Summary (KB rounded up) ++++++++++++++++
 medium            /Archive   /Playpen   /Playpen   a laptop
 notes             sources    70% full   16% full   NTFS only!

 "as was"
 partition         82576160     524633     524633   72742320
 dd | gzip          2402297     242157     175348   42242153

 filesystem aware
 live data est     13897220     328698      76335    8516172
 tar               13322170     330420      78670    8464310
 tar | gzip         2219766     154853      27264    5162615
 dump              14083020     332660      80460        [1]
 dump | gzip        2268348     155175      27447        [1]

 fill w/big files
 dd | gzip          2404488     175370      48030    5532906
 walki             14499543     354469     105030    8661819
 walki | gzip       2322514     175216      47666    5480802

 fill w/many files
 dd | gzip          2404631     175370      47932        [2]
 walki             14479387     354466     105032        [2]
 walki | gzip       2302045     175215      47564        [2]

 Clonezilla         3005784     189120      50958    5168376
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

[1] requires a fabricated fstab(5) in the live CD image to test; but
    dump(8)'s performance tends to be on a par with tar(1)
[2] skipped this in order to get the laptop off to a student sooner
    since "fill many" results closely track "fill big" results.

[NB: This is just for a "magic string" of "all zeroes". Results for walki don't vary when that magic string is changed. But, gzip's efforts on the raw device worsen as the string becomes "less regular". FWIW, maximum compression that gzip (without "-9") can achieve is 1029:1 on *long* stretches of "zeroes"; walki does 4096:1 on *512B* stretches!]
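In case it helps to see the shape of the "trivial executable" described in the quoted post, a minimal sketch of the idea follows, assuming a 512-byte magic block read from a file like ~/zeroes. The real fill.c isn't posted here, so the file name, the C11 "wx" open mode, and the error handling are illustrative only:

/* Hypothetical sketch of "fill": pour copies of a 512-byte "magic"
 * block into a file until the volume fills, then unlink the file so
 * the now-free space is left holding known, highly regular data.    */
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
    char magic[512];
    FILE *m = fopen(argc > 1 ? argv[1] : "zeroes", "rb");  /* 512B magic block */

    if (m == NULL || fread(magic, 1, sizeof magic, m) != sizeof magic) {
        perror("magic file");
        return 1;
    }
    fclose(m);

    const char *name = "A";        /* the real fill draws names from [A-Z0-9]+ */
    FILE *f = fopen(name, "wx");   /* "x" (C11): refuse to clobber a real file */
    if (f == NULL) {
        perror(name);
        return 1;
    }

    printf("Creating '%s'\n", name);
    while (fwrite(magic, 1, sizeof magic, f) == sizeof magic)
        ;                          /* stops when the filesystem reports "full" */
    fclose(f);

    unlink(name);                  /* free the space; its contents are now known */
    return 0;
}

A production version would loop over several names (the "fill w/many files" rows above) and be run while the system is quiescent, per the quoted post.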
The executables involved:

# for command in dd gzip tar dump; \
  do loc=`which $command`; \
  ls -al $loc; \
  done
-r-xr-xr-x  1 root  wheel   27150 Apr 12  2014 /bin/dd
-r-xr-xr-x  4 root  wheel   35991 Apr 12  2014 /usr/bin/gzip
-r-xr-xr-x  3 root  wheel  133411 Apr 12  2014 /bin/tar
-r-xr-xr-x  2 root  wheel   64878 Apr 12  2014 /sbin/dump

[NB: aside from dump(8), all executables (mine and the system's) are dynamically linked. However, my library reliance is essentially just fopen/fclose and fread/fwrite. The Windows executables (below) are linked as compact exe's]

Clonezilla is hard to "size" as it relies on a whole OS *under* it (many many megabytes) so it's a silly comparison. Deploying it risks latent bugs consequential to its sheer size! By contrast, walki is intentionally "trivial" for that reason (see below)! Note tar(1) and dump(8) also rely on other capabilities being in place before a restore can take place. By contrast, walki reflects the complexity of its decompressor, as well!

(on bare iron)
# ls -al ~
-rwxr-xr-x  1 root  wheel   5839 Oct 21 00:43 const
-rw-r--r--  1 root  wheel    532 Oct 21 00:38 const.c
-rwxr-xr-x  1 root  wheel   7757 Oct 21 10:51 fill
-rw-r--r--  1 root  wheel  16587 Oct 21 10:35 fill.c
-rwxr-xr-x  1 root  wheel   7757 Oct 25 17:27 fillx
-rw-r--r--  1 root  wheel   3162 Oct 21 10:47 magic.h
-rw-r--r--  1 root  wheel    512 Oct 21 09:24 noise
-rw-r--r--  1 root  wheel    512 Oct 21 00:41 random
-rw-r--r--  1 root  wheel   3162 Oct 21 00:39 random.h
-rw-r--r--  1 root  wheel    512 Oct 21 00:41 regular
-rw-r--r--  1 root  wheel   2835 Oct 21 00:40 regular.h
-rwxr-xr-x  1 root  wheel   6414 Oct 21 00:02 walki
-rw-r--r--  1 root  wheel   4312 Oct 21 00:02 walki.c
-rw-r--r--  1 root  wheel    512 Oct 21 00:43 zeroes
-rw-r--r--  1 root  wheel   2871 Oct 21 00:43 zeroes.h

C:\FFFF> DIR
10/21/2014  10:21 AM            48,858 bigfill.exe
10/25/2014  12:52 PM            48,893 manyfill.exe

There are enough pro bono machines to make it worth (my!) while to automate this process. So, I would manually install the OS, drivers, updates, applications and configure the box prior to having the image created and "installed" on an unused partition (along with my "restorer"). A copy of the image can then be archived so that other identical make/model machines can be built, quickly (from the image for the first machine of its type)!

Guesstimating ~8GB for the compressed image, I can handle about 120 different images with a 1TB repository. At 20-40 different images per school year, I could probably cut the repository to a 500G and still be "good" for 3+ years (and migrate the oldest images off the repository each year thereafter).

[Note losing the "master image" is just an inconvenience. I could always manually recreate the original image from the notes in my logs. If push came to shove, I could chase down a laptop having that particular image and reclaim the image from its "restore partition". I'm not keen on chasing down homeless students so it may be better to mirror the repository and pray for the best]

Creating an image requires a pass over the partition's contents followed by some crunching. While not a huge undertaking, it is, nonetheless, time consuming. And, much of that time is just spent "waiting" -- relatively easy to automate for UNATTENDED operation (I have no desire to stare at a screen waiting for a system prompt to reappear!)

But, I've got a GREAT opportunity to gather data to quantify the *actual* performance of this algorithm; I can record the size of the medium, amount of "live data" on it and the size of the final image.
Of course, these will tend to be very similar numbers for "Windows machines"; other numbers for Mac's; laptops will differ from desktops; etc. And, there will be some slight differences from model to model owing to driver differences and other per-user "customizations". This gives me another 20-40 data points (assuming I see 10 of each make/model in the 200-400 yearly donations) to add to the few dozen machines that I have, here.

[Also remember that the algorithm applies to *partitions*, not *machines* or *disks*. As most machines have more than one partition, nowadays, this adds to the number of datapoints.]

But, I can ALSO explore how the other approaches that I evaluated WOULD HAVE performed on each of these machines! I.e., show why my approach is "better" with hard numbers beyond the data that I have collected for *my* machines with *my* (biased?) disk contents.

However, stepping back a bit, I also have access to the machines in their *donated* conditions/configurations! These should exhibit much more variety than the machines that I will be "producing". At the very least, they will typically have *some* "user files" -- even if only things like browser caches! Even machines from the same (business) donor will have differences in the "user files" and usage history that are evident on the media. ESPECIALLY in the "empty" (deleted files) portions -- which pose the biggest problem for a filesystem-agnostic imager: the compressibility of "deleted data"!

So, I would like to run the same sorts of experiments that I've run on my machines on those donated machines *prior* to cleaning them up and formally imaging them. I.e., look at them "as is" and explore the different approaches that I've already evaluated. Then, tabulate the data from those machines in their original condition "as was" along with the resulting data after their formal imaging.

This gives me *another* 200-400 data points to add to those already mentioned! And, another 200-400 NEXT school year... and the year after that... etc.

Instead of just documenting my algorithm and its derivation, I can present data that puts its performance in (some) context... with machines with which I have had no previous influence (i.e., in the "as donated" state)! Instead of just hand-waving other potential performance scenarios as "your mileage may vary".

Given that this *images* drives (i.e., lossless), I can also collect samples from colleagues on additional "non-Windows" machines (e.g., see how it fares on Alphas, SGI's, oddball OS's, other appliances, etc.) -- it's non-destructive so no risk to the data, there!

Now, the problem:

Running the experiments that I've run on *my* machines is time consuming. You're not just imaging the medium ONCE but, rather, several times! With different algorithms, etc. Even though you are discarding the resulting images (i.e., after harvesting the statistics from them), the time to create them is something you have to live with.

Imaging 20-40 make/model donated machines, no big deal; 200-400, still manageable -- though painful. But, to evaluate *multiple* strategies on *each* of them is just a MONUMENTAL effort!

In a followup to *this* post, I'll post a sample of my notes from when I started exploring this problem. It illustrates the sorts of computational effort that goes into evaluating algorithms on *one* machine ("partition").

[It also shows the effort I put into a question *before* I post it!]

[NB: I'm not really interested in any comments re: my process.
I've already made note of most of the obvious improvements to speed things up and figure judicious use of some pipe fittings can make a significant impact in the amount of data moved and processed. E.g., since the compressors evaluated thus far operate in streaming mode, I can pull data off the medium and feed all compressors IN PARALLEL. This allows me to read the disk *once* and tabulate several results.]

Given those sorts of computational efforts JUST FOR THIS 'AS WAS' DATA COLLECTION, where is the best allocation of resources in the "test fixture" to minimize the cost (time) of gathering this data?

Remember, I can tailor the *final* production images to further reduce the size of the images by creating an "empty" partition (D:) in which the students can keep their "user files", thereby reducing the size of the *imaged* (system) partition. *But*, I can't do that with the media contents AS DONATED! I may be stuck imaging 160G, 250G or even larger "single partition" systems MANY times just to see how the algorithms perform! It's "wild data" so I want to exploit it before *losing* (discarding) it!

Keep in mind the costs of "manual intervention". E.g., if I have to pull a drive from a laptop and install it in a fixture, then that adds to the cost (considerably, because I can't do that if I'm not available at the instant that it needs to be done). Or, if I have to cycle power to the test fixture to install that drive, then any data collection that is "in process" either has to be restarted or checkpointed.

I see two possible physical configurations:
- laptops/desktops tethered to test fixture via network cables
- (pulled) drives tethered in external USB drive enclosures

External enclosures can cheaply be replaced/discarded; installing drives *in* a hot/cold server bay leads to lots of wear-and-tear on the server's hardware. Network connection is relatively low wear-and-tear because the machine is replaced by its successor when done! (the connector on the machine never being needed in subsequent tests on OTHER machines!)

Each of these allows me to support SATA and PATA drives without letting that influence the choice/design of the test fixture.

The external USB enclosure route has the downside that the number of such enclosures that I have available PER DRIVE TECHNOLOGY limits the number of drives that I can process "in parallel". And, servers tend not to be known for their "high performance" USB implementations!

(I will have handled the SCA/SCSI/FC drives that I use *here* with other hardware. Nor will I have to deal with different CPU families, endian issues, etc. -- just PC/Mac laptops/desktops)

Remember, I can't babysit this box. Ideally, I want to plug in a bunch of machines/drives and walk away -- returning when I am reasonably sure they are "done"... so *I* am not waiting for *it*! I budget 10 hours per week for pro bono stuff and am not keen on letting that number rise -- especially if it is because I am twiddling my thumbs *waiting* for a test to finish!

I lean towards the "tethered via network" approach as it is easily expandable (add another NIC on the server) in the test fixture. And, can offload some of the processing *to* the laptops. I.e., PXE boot them and let them run the tests on their own drives with their own data using their own CPU's! But, thinking *harder* about this, laptops tend not to be as "resource-ful" as servers. Less RAM, slower processors, AND SLOWER DISKS!
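Regarding the "feed all compressors IN PARALLEL" note above, the plumbing could look something like this -- a sketch only, assuming a shell with process substitution (e.g., bash or ksh93); the particular compressors and output file names are illustrative, not anything actually used in the thread:

# read the raw device once; tee the stream into several compressors
# and keep only the resulting sizes for the comparison table
dd if=/dev/rwd0k bs=1008b \
  | tee >(gzip -c  | wc -c > size.gzip) \
        >(bzip2 -c | wc -c > size.bzip2) \
  | xz -c | wc -c > size.xz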
The number of *desktop* systems will dwindle to zero as they just aren't very portable for kids with no permanent place of residence. And, desktops would each require a keyboard/monitor (or, share *one* via KVM/sneakernet) to be available during this testing.

Given that this would require reading the disk multiple times (i.e., laptops are not likely to have enough RAM to be able to cache the entire disk's/partition's contents!), leaving the only access to that media hiding behind the laptop's CPU may be a poor choice. OTOH, shipping the data from the disk across the network could put a similar bottleneck in place (e.g., laptops that only have 100Mb NIC's are effectively moving data at USB2 PCI speeds). As well as dramatically increase the amount of primary/secondary storage required on the server (imagine testing 3 or 4 laptops each with 160G drives -- in addition to the server's needs)

This *suggests* using a combination of approaches:
- PXE boot <something> that *starts* the test process
- the first thing that <something> does is dd | gzip -> server
  this makes a copy of the raw device available to the server and
  does so at a reduced network utilization factor
- server sucks up gzip'd images and starts *its* processing on them
- meanwhile, <something> runs other tests on the system under test,
  as resources permit

Ideally, let the server and (each) laptop cooperate to determine which portions of the testing each should perform. Whether you:
- drive the testing from the <something> on the laptop (i.e., the laptop
  RJE's tests to the server and, if it refuses to perform them, it
  performs them itself)
OR
- drive the testing from the server (i.e., if it's too busy, RJE the
  test to the laptop)
This would allow the resources available (laptop(s) + server) to be reallocated dynamically.

What I envision as my "process" is:
- configure laptop for PXE boot and other "setup" options
- connect to (a standalone!) network
- initial application loads and presents *me* (at the laptop's keyboard)
  with a menu
- I specify/select an identifier/description for this machine (the server
  already has its MAC address in the DHCP exchange before PXE). The MAC
  could reduce the list of potential "known machines" that are offered
  to me as possible choices; I can create a new one.
- I indicate how I want this laptop handled:
  + get it into production state ASAP (so I can power it down and
    disconnect it and give it to a waiting student). Offload a copy of
    its disk image and process that on the server when time permits.
  + let the laptop do whatever part of the experiments makes sense
    given the current load on the server by other such laptops
  + let it do ALL of the experiments (because I think it has ample
    resources to do so effectively and would rather leave the server's
    resources for other laptops to use)
  + allow its resources to be used by others when its experiments are
    concluded (i.e., prior to installing its final image)
- I walk away
- as tests are completed on the data present on the laptop's media, the
  results are added to a database (running on the test server *or*
  another box) by whatever "ordered" the test (server vs laptop)
- when the tests are complete *and* the laptop's resources are no
  longer needed elsewhere, the laptop's image is created and/or loaded
  from the "image server" (which may be another box) and the laptop
  powered down

Now the question hinted at, initially:

So, what resources would be best to have on the *server*? Gobs of RAM?
(i.e., keep an *entire* disk image resident in RAM so that different algorithms can run at "CPU speed" instead of being I/O bound) Raw horsepower (i.e., run the compression algorithms in the least amount of time)? Some balance of the two (e.g., SSD as "slow RAM/fast disk" with real RAM+CPU dedicated to the actual crunching)?

Remember that I want (need?) to process several laptops at a time. I figure I need to *complete* 10 machines each week -- so, probably copy the contents of the first machine onto the server so I can start manually building that "production system" while the other 9 machines are undergoing their "AS WAS" experiments. Then, install the production image on all 10 machines and "call it a day"! :>

[Keep in mind that a test fixture can be exercising them over the course of that entire week, if need be, and *my* time can still be capped at 10 hours! I'd prefer not running a power-hungry server any longer than necessary, though!]

Copying the disk contents from *all* machines onto the server would require a secondary store big enough for 10 "as was" images -- i.e., ten times the size of the disks in the laptops, regardless of how much/little of that disk eventually is used in the production image! This can be an issue for a server-side SSD to expedite the operations!

I'm looking for criteria to use in picking a suitable bit of kit to rescue for this job. My notes claim I've got a 64G DL580 tucked away and a 32G? R900. The BladeCenter has gobs of horsepower but the electric bill would be insane (though Winter is coming so we could possibly use it as an "electric space heater"! :-/ ) It might just be easier to find something else with this particular problem in mind instead of trying to fit it to kit-on-hand!

I'll run this by friends who run server farms to see what sort of guidance they can offer, as well. They tend to be *amazingly* good at counting hidden system calls and tweaking scripts to save all those little inefficiencies that creep in "between the keystrokes"! I guess, in their environment, when you're running code millions of times a day, EVERY day, all those "little things" add up!

Hopefully, I can have something in place after the holidays and get back to work on this after the New Year... Now, back to my holiday baking! :>

Thanks!
--don
Reply by ●November 23, 2014
Hi Dimiter,

On 11/13/2014 12:46 AM, Dimiter_Popoff wrote:
>>
>>> BTW the netmca runs a complete DPS on it, shell windows and all.
>>
>> Ah, OK. So, you don't have an "embedded" version of it with
>> reduced capabilities/features.
>
> No need for that. Much of the functionality even fits in 2M flash....
> how thinkable is that (about 1.5 to 2M lines of VPA code which is
> not generous with CRLF, unlike certain HLL-s :-) ). But booting off
> flash is intended just to be able to restore your HDD via the net
> if you mess it up.

So, a network stack and some utilities...?

> I have smaller versions of course, e.g. I am now tortured by
> a small coldfire (mcf52211) which has a tiny derivative of dps
> (mainly the scheduler and some library calls, about 7 kilobytes
> total). Bloody thing won't go into low power mode which is
> sort of specified to at least halve the consumption, nothing
> of the sort, *zero* effect of entering that mode by the core.
> Cost me two days so far to zero result. Not that I can't live
> without that mode but why does it not work, drives me mad.

Heh heh heh... you'll find it. Then, curse yourself for OVERLOOKING it. Or, the manufacturer for not *documenting* it! :-/
Reply by ●November 23, 2014
[As promised, this is *also* really long!]

On 11/23/2014 1:52 AM, Don Y wrote:
> Typical data when processing "binaries" below. Data for the SPARC's
> and NAS boxes are similar. The trivial algorithm always yields the
> SMALLEST filesystem-independent image (dump/tar/partclone all require
> an *appropriate* filesystem to be recreated prior to restore):
>
> +++++++++++++++ Executive Summary (KB rounded up) ++++++++++++++++
>  medium            /Archive   /Playpen   /Playpen   a laptop
>  notes             sources    70% full   16% full   NTFS only!
>
>  "as was"
>  partition         82576160     524633     524633   72742320
>  dd | gzip          2402297     242157     175348   42242153
>
>  filesystem aware
>  live data est     13897220     328698      76335    8516172
>  tar               13322170     330420      78670    8464310
>  tar | gzip         2219766     154853      27264    5162615
>  dump              14083020     332660      80460        [1]
>  dump | gzip        2268348     155175      27447        [1]
>
>  fill w/big files
>  dd | gzip          2404488     175370      48030    5532906
>  walki             14499543     354469     105030    8661819
>  walki | gzip       2322514     175216      47666    5480802
>
>  fill w/many files
>  dd | gzip          2404631     175370      47932        [2]
>  walki             14479387     354466     105032        [2]
>  walki | gzip       2302045     175215      47564        [2]
>
>  Clonezilla         3005784     189120      50958    5168376
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> [1] requires a fabricated fstab(5) in the live CD image to test; but
> dump(8)'s performance tends to be on a par with tar(1)
> [2] skipped this in order to get the laptop off to a student sooner
> since "fill many" results closely track "fill big" results.
>
> [NB: This is just for a "magic string" of "all zeroes". Results
> for walki don't vary when that magic string is changed. But, gzip's
> efforts on the raw device worsen as the string becomes "less regular".
> FWIW, maximum compression that gzip (without "-9") can achieve is
> 1029:1 on *long* stretches of "zeroes"; walki does 4096:1 on *512B*
> stretches!]
>
> The executables involved:
>
> # for command in dd gzip tar dump; \
>   do loc=`which $command`; \
>   ls -al $loc; \
>   done
> -r-xr-xr-x  1 root  wheel   27150 Apr 12  2014 /bin/dd
> -r-xr-xr-x  4 root  wheel   35991 Apr 12  2014 /usr/bin/gzip
> -r-xr-xr-x  3 root  wheel  133411 Apr 12  2014 /bin/tar
> -r-xr-xr-x  2 root  wheel   64878 Apr 12  2014 /sbin/dump
>
> # ls -al ~
> -rwxr-xr-x  1 root  wheel   5839 Oct 21 00:43 const
> -rw-r--r--  1 root  wheel    532 Oct 21 00:38 const.c
> -rwxr-xr-x  1 root  wheel   7757 Oct 21 10:51 fill
> -rw-r--r--  1 root  wheel  16587 Oct 21 10:35 fill.c
> -rwxr-xr-x  1 root  wheel   7757 Oct 25 17:27 fillx
> -rw-r--r--  1 root  wheel   3162 Oct 21 10:47 magic.h
> -rw-r--r--  1 root  wheel    512 Oct 21 09:24 noise
> -rw-r--r--  1 root  wheel    512 Oct 21 00:41 random
> -rw-r--r--  1 root  wheel   3162 Oct 21 00:39 random.h
> -rw-r--r--  1 root  wheel    512 Oct 21 00:41 regular
> -rw-r--r--  1 root  wheel   2835 Oct 21 00:40 regular.h
> -rwxr-xr-x  1 root  wheel   6414 Oct 21 00:02 walki
> -rw-r--r--  1 root  wheel   4312 Oct 21 00:02 walki.c
> -rw-r--r--  1 root  wheel    512 Oct 21 00:43 zeroes
> -rw-r--r--  1 root  wheel   2871 Oct 21 00:43 zeroes.h

> In a followup to *this* post, I'll post a sample of my notes from
> when I started exploring this problem. It illustrates the sorts of
> computational effort that goes into evaluating algorithms on *one*
> machine ("partition").
>
> [It also shows the effort I put into a question *before* I post it!]
>
> [NB: I'm not really interested in any comments re: my process.
> I've already made note of most of the obvious improvements to speed
> things up and figure judicious use of some pipe fittings can make
> a significant impact in the amount of data moved and processed.
> E.g., since the compressors evaluated thus far operate in streaming
> mode, I can pull data off the medium and feed all compressors IN
> PARALLEL. This allows me to read the disk *once* and tabulate
> several results.]

[This describes the initial experiments I ran on /Archive. No attempt made to be efficient. But, hopefully *accurate*!]

Keep the filesystem essentially stagnant -- with nothing (besides my activities, here) accessing it.

# disklabel wd0
16 partitions:
#        size    offset     fstype  [fsize bsize cpg/sgs]
 k: 167772528 183503376     4.2BSD    1024  8192     0    # (Cyl. 182047 - 348487)

# mount /dev/wd0k /Archive
# df
Filesystem  512-blocks      Used     Avail %Cap Mounted on
/dev/wd0k    165152320  27794440 129100264  17% /Archive

This suggests a "live data" size of approximately 14,230,753,280 bytes ("not to exceed" 27794440*512). The volume is ~80G so 13G is probably a good approximation of what one of my "pro bono" machines would see as a utilization portion. Of course, this isn't a Windows machine so the data here is undoubtedly more compressible (e.g., it is a SOURCE archive so has gobs of TEXT in it!) than Windows executables would be! [Note to self: explore that as well!]

# cd /Archive
# tar cplf - . | wc -c
13641902080

(Note: invoking this from the filesystem root as "tar cplf - /Archive" would unnecessarily increase the size of the tarball!)

The 14,230,753,280 upper limit was high by AT LEAST 588,851,200 bytes. This represents file system overhead (second and third level indirects) *plus* unused fractions of disk blocks (partial fragments) *minus* tar(1) overhead.

# tar cplf - . | gzip | wc -c
2273039988

The best gzip(1) can do with this (ahem) "portable" archive is compress it to 16.7% of its original (tarball) size. This is the approximate target for our filesystem-agnostic approach as it encodes *just* the data (while we will *also* have to encode the filesystem implementation *in* our image).

Given that the "live data" represents 17% of the medium, this overall rate can look impressive compared to blindly "duplicating the image" (i.e., it represents 2.7% of the raw image's size). But, that's a silly comparison; you just wouldn't do a byte-for-byte image in *any* case! Too wasteful!

But, *without* knowledge of the filesystem, all you can hope to do is byte-for-byte! Though you can *compress* it and hope the data has some repeatable nature:

# dd if=/dev/rwd0k | gzip | wc -c
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
2459951780

[grrrr... to be pedantic, I should have umount'ed the filesystem before doing this. But, in reality, it won't noticeably affect the results.]

This is the low end of our performance target -- we can always compress the raw image and use that! Unfortunately, this result will tend to vary with media utilization factor, content of dirty blocks, etc. So, it's not a good solution -- it needs more constraint!

[i.e., fill the unused portion of the medium with truly random data and watch this gag, embarrassingly! However, when you encounter a filesystem "in the wild", you have no idea what the content of that unused portion may be as it reflects whatever was there *previously*]

This approaches the efficiency of the gzip'd tarball at 108.2% of it (i.e., 18.0% of the non-gzip'd tarball's size instead of 17%!)

Note that running gzip on "big data" is really taxing (time consuming). Ideally, we want to limit what is being compressed to only that which *needs* to be compressed. I.e., don't compress "empty" parts of the medium!
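Skipping the "empty" parts is exactly what walki (used later in these notes) is for. Judging from its output -- the "Magic:"/"Live:" sector counts, live sectors passed through untouched, and a one-bit-per-sector bitmap accounted for separately -- it presumably does something like the sketch below. The real walki.c isn't shown in the thread; where the bitmap lands and the bit ordering are assumptions:

/* Sketch of a walki-like pass: read the raw device from stdin in
 * 512-byte sectors, mark sectors that equal the "magic" block in a
 * one-bit-per-sector map, pass every other (live) sector through to
 * stdout, and report the tallies.                                    */
#include <stdio.h>
#include <string.h>

#define SECTOR 512

int main(int argc, char **argv)
{
    char magic[SECTOR], buf[SECTOR];
    unsigned long nmagic = 0, nlive = 0;
    unsigned char bits = 0;
    int nbits = 0;

    FILE *m   = fopen(argc > 1 ? argv[1] : "zeroes", "rb");
    FILE *map = fopen("bitmap", "wb");              /* 1 bit per sector */
    if (m == NULL || map == NULL || fread(magic, 1, SECTOR, m) != SECTOR) {
        perror("setup");
        return 1;
    }
    fclose(m);

    while (fread(buf, 1, SECTOR, stdin) == SECTOR) {
        if (memcmp(buf, magic, SECTOR) == 0) {
            bits |= (unsigned char)(1 << nbits);    /* "empty": 1 bit ~ 4096:1  */
            nmagic++;
        } else {
            fwrite(buf, 1, SECTOR, stdout);         /* live sector passes through */
            nlive++;
        }
        if (++nbits == 8) { fputc(bits, map); bits = 0; nbits = 0; }
    }
    if (nbits)
        fputc(bits, map);                           /* flush a partial final byte */

    fprintf(stderr, "Magic: %lu Live: %lu\n", nmagic, nlive);
    fclose(map);
    return 0;
}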
Instead of an 80GB partition, imagine what this would be like at 160G or 250G! <cringe>

[Note to self: before writing this up, rerun each operation under time(1) to gather data on the cost of each operation. Probably also monitor other resources (RAM) for a more effective comparison of each algorithm beyond just its "resulting image size". Consider how you will do this in production so you can easily gather statistics from each machine you eventually encounter. This can help to improve the resulting approach by highlighting ACTUAL costs!]

My "trivial" solution attempts to bias the data to make compression more effective (i.e., make the empty portions easier (less taxing) to compress and easier to *identify* -- which walki will exploit!):

# mkdir XXX
# cd XXX
# ~/fill ~/zeroes
Creating 'A'
/Archive: write failed, file system is full

[Creating a subdirectory in which to store all the "bogus" files that "fill" creates simplifies verifying their correct content, manually, and eventually removing them! Carelessly invoking fill in a "populated directory" can make MANUALLY undoing its effects tedious (i.e., you can't just "rm -r *")! However, *when* you screw up and discover that you've done this, keep in mind that fill's filenames fit regex's that are relatively easy to recreate. E.g., [A-Z0-9]+ But, fill won't overwrite an existing file so make sure those regex's don't happen to conflict with your files! E.g., 'A' would be a really bad filename!! :> In production, fill should unlink them as well so this would be a turnkey operation -- but, only unlink those that *it* creates (i.e., consider the case of 'A' mentioned above!!]

# df
Filesystem  512-blocks      Used    Avail %Cap Mounted on
/dev/wd0k    165152320 165023808 -8129104 105% /Archive

# ls -al
-rw-r--r--  1 root  wheel  70244237312 Oct 23 11:37 A

<ASIDE>=================================================
Just for yucks, see how well gzip can compress this file (knowing its contents are "zeroes") -- I expect a final size of 68,330,970 bytes (70244237312/1028) as that's the *best* gzip can do (without -9):

# cat A | gzip | wc -c
68273881

Hmmm... apparently gzip can do (almost) 1029:1! :>

# expr 70244237312 / 512
137195776
# expr 137195776 / 8
17149472

I.e., the trivial algorithm should be able to reduce the 70,244,237,312 byte file to 17,149,472 bytes (~4000:1)!

Take the size of the gzip'd tarball (2,273,039,988 bytes) as representative of the compressible size of the live data. Add the space required by the trivial algorithm to represent "blank" data (17,149,472 bytes) and the estimated performance of the compressed trivial algorithm should be 2,290,189,460 bytes or 169,762,320 SMALLER than gzip was able to do with the raw disk image! And, it would STILL contain all of the filesystem-specific information (which the raw image *implicitly* preserves) *without* any visible knowledge of the filesystem's structure!

We'd have improved on gzip's performance by almost 7% (1 - 2273039988/2459951780). The improvement will vary with the choice of compressor -- the better the compressor is at handling "highly regular" data, the less we will be able to improve on it (without really working at it!).

[OTOH, we can craft a version of fill that makes any compressor look much worse (e.g., use "uncompressible" data -- but make the synthesis of that uncompressible data known to *our* decompressor!). But, this doesn't do anything towards improving the compressibility of the images that we will be creating so it's just an intellectual diversion...]
The (relatively) poor compression rate that gzip exhibits for the "filled file" (i.e., "A", above) makes the trivial algorithm viable. It can better handle such highly repetitive data as it doesn't have to be "general purpose" (like gzip)! And, the cost of the trivial algorithm is peanuts *because* the deck is stacked in its favor!

Of course, the relative savings (in the overall image) will vary with the live data "utilization" rate -- heavily used media will have results that approach those of gzip (including *worse* than gzip) because there will be fewer opportunities for the trivial algorithm to outperform gzip. But, as we aren't restricted to stream compression, we can evaluate the relative cost of adding the trivial algorithm's "pre-compression" and, if it will NOT improve the resulting image size, just store a single bit in the image that effectively enables/disables the precompression step (so, worst case, we incur one extra bit of storage in our image that gzip would not have required -- and, we're STILL filesystem-agnostic!) We'll have to see if this holds true...
===================================================</ASIDE>

OK, we see that the medium is essentially "consumed". Now clear off all that cruft:

# rm -r *
# df
Filesystem  512-blocks      Used     Avail %Cap Mounted on
/dev/wd0k    165152320  27794496 129100208  17% /Archive

So, we're back with the original LIVE contents of the medium, but the "erased" areas now have *known* characteristics -- which was not the case prior to this operation! Let's see how well the raw image compresses now that gzip has been given a more cooperative load with which to work! But, first, let's see if we can expedite the operation a bit:

# factor 167772528
167772528: 2 2 2 2 3 3 7 11 15131
# expr 2 \* 2 \* 2 \* 2 \* 3 \* 3 \* 7
1008
# expr 1008 \* 512
516096
# dd if=/dev/rwd0k bs=1008b | gzip | wc -c
166441+0 records in
166441+0 records out
85899534336 bytes transferred
2462195081

[NB: 63 is the nominal sectors/track and 16 the nominal tracks/cylinder, yielding 1008 as the nominal sectors/cylinder. Of course, this bears no real relationship to the physical drive geometry! But, it reduces the overhead -- 1,240,341 cylinders processed vs 167,772,528 *sectors*!]

That the compression did not improve significantly (i.e., 2,462,195,081 vs. the previous 2,459,951,780) suggests that the "empty" data may already have been "blank". The *worsening* of compression (2,243,301 bytes larger; 4,381 sectors) is probably attributable to the number of empty blocks that were dirtied in their roles supporting second and third level indirects. There is no way for me to scrub these as they are part of the file's overhead *WHILE* THE (huge) FILE EXISTS!

[Note to self: explore the performance with smaller fill files as that changes the complexion of the second and third level indirects.]

Note this is not the same as "filling" the medium prior to installing the files/filesystem! *Unless* building the filesystem only touches small portions of the medium *and* installing the files does NOT require rewrites/deletes of temporary files in the process (e.g., the bane of a Windows machine).

[Note to self: explore this option to highlight the costs of Windows' "tentative"/"hesitating" use of medium in contrast to NetBSD, Solaris, etc. Show the downside of that apparently "simple solution" to this compressibility issue!]
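As for the "single bit" fallback mentioned in the ASIDE -- keep the pre-compressed (walki'd) image only when it actually wins -- the decision could be scripted roughly as below. A sketch only: it reads the device twice for clarity (a real run would reuse the feed-everything-in-parallel trick), the bitmap bytes would still have to be added to walki's side of the comparison as done by hand in these notes, and the file names and flag encoding are made up:

# build both candidate images, keep the smaller, and remember which
# (the "one extra bit" a restorer would consult)
dd if=/dev/rwd0k bs=1008b | gzip -c                    > plain.img.gz
dd if=/dev/rwd0k bs=1008b | ~/walki ~/zeroes | gzip -c > walki.img.gz

if [ "$(wc -c < walki.img.gz)" -lt "$(wc -c < plain.img.gz)" ]; then
    echo 1 > precompressed.flag; rm plain.img.gz
else
    echo 0 > precompressed.flag; rm walki.img.gz
fi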
Now, let's see how much we can actually improve upon gzip's results:

# dd if=/dev/rwd0k | ~/walki ~/zeroes | wc -c
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
Magic: 138814403 Live: 28958125
14826560000

To this, add the 20,971,566 bytes (167772528/8) of *uncompressed* (for now) bitmap overhead, yielding an overall image size of 14,847,531,566 bytes.

Notice walki *has* reduced the size of the image. Contrast its size to that of the (uncompressed) tarball (14,847,531,566 vs. 13,641,902,080 bytes). Note that it stores the same data as the tarball but in a different form, and *also* represents the filesystem implementation details (which the tarball requires an external program to recreate on its behalf!). So, we have 85,899,534,336 bytes worth of "information" packed into these 14,847,531,566 (i.e., the filesystem-specific information is present in addition to the live data).

So, why not apply gzip -- or ANY general purpose compressor -- to the data that walki has NOT compressed (i.e., the live data that it has simply replicated) in much the same way that we did for the tarball??

# dd if=/dev/rwd0k | ~/walki ~/zeroes | gzip | wc -c
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
Magic: 138814403 Live: 28958125
2357282149

Add the 20,971,566 bytes (still uncompressed) bitmap overhead to this, yielding a final image size of 2,378,253,715. <grin> Not a huge gain but an improvement, nonetheless!

How does this compare with the earlier shirt-cuff prediction? We guessed 2,290,189,460 bytes and attained 2,357,282,149 bytes. So, we're about 4% higher than our estimate (wanna bet we could pick that up if we compress the bitmap??). At 83,941,366 bytes (2462195081-2378253715) *SMALLER* than gzip's stab at compressing the raw device -- just 105,213,727 (2378253715-2273039988) bigger than the gzip'd portable archive (which requires something else to build -- and verify -- the specific filesystem UNDER it prior to its restoration! That adds code, complexity AND unreliability to the mix)

[Note to self: see how the other archivers/compressors compare on each dataset/image. There may be subtle advantages of one over the others, especially given the "compress once; decompress multiple" nature of the problem! OTOH, we can't get obscene with compressor implementations as time isn't unlimited! We don't want to spend hours or days trying to optimize a single image -- we need *an* image to quickly deploy to 10-20 machines before moving on to creating the next, *different* image.]

Repeat the fill with a multitude of smaller files (i.e., instead of that ginormous, 70GB 'A' file) to see if this has any noticeable impact on the operation(s). This will consume more inodes but those are already set aside in the filesystem (?) so they shouldn't result in additional free space being tarnished. OTOH, it could eliminate some number of the third level indirect blocks, which might have some merit (yet it seems wrong to expect the third level implementation to cost *more* than the "multiple smaller files" would cost!??? Check McKusick to be sure...):

# sh
# cd
# cc -DFILE_CLAMP=\(100*1024*1024\) -o fillx fill.c
# ls -al fillx
-rwxr-xr-x  1 root  wheel  7757 Oct 25 17:27 fillx
# exit
# pwd
/Archive
# df
Filesystem  512-blocks      Used     Avail %Cap Mounted on
/dev/wd0k    165152320  27794496 129100208  17% /Archive
# cd XXX
# ~/fillx ~/zeroes
Creating 'A'
Creating 'B'
Creating 'C'
...
Creating 'Z'
Creating '0'
Creating '1'
...
Creating '9'
Creating 'AA'
Creating 'BA'
Creating 'CA'
...
Creating 'VR'
/Archive: write failed, file system is full
# ls | wc -w
670
# ls -al | tail
-rw-r--r--  1 root  wheel  104858112 Oct 25 19:45 ZH
-rw-r--r--  1 root  wheel  104858112 Oct 25 20:00 ZI
-rw-r--r--  1 root  wheel  104858112 Oct 25 20:16 ZJ
-rw-r--r--  1 root  wheel  104858112 Oct 25 20:31 ZK
-rw-r--r--  1 root  wheel  104858112 Oct 25 20:46 ZL
-rw-r--r--  1 root  wheel  104858112 Oct 25 21:02 ZM
-rw-r--r--  1 root  wheel  104858112 Oct 25 21:17 ZN
-rw-r--r--  1 root  wheel  104858112 Oct 25 21:32 ZO
-rw-r--r--  1 root  wheel  104858112 Oct 25 21:48 ZP
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:03 ZQ

Hmmm... I should have considered sort order when creating the lists of valid filename characters! I.e., 'ZQ' is probably NOT the last file created (because it would be an odd coincidence that it would end up exactly the same size as the other files! And, I also happen to know the arbitrary order in which I listed the legal filename characters *in* fill.c! :> )

OK, try this, instead:

# ls -alrt | tail
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:13 NR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:14 OR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:14 PR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:15 QR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:15 RR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:15 SR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:16 TR
-rw-r--r--  1 root  wheel  104858112 Oct 25 22:16 UR
-rw-r--r--  1 root  wheel   67780608 Oct 25 22:16 VR

<grin> "Smarter than the average bear!"

So, we've replaced the one ginormous (70G) file with 670 smaller (100M) files. And, consumed roughly the same amount of space on the medium.

# df
Filesystem  512-blocks      Used    Avail %Cap Mounted on
/dev/wd0k    165152320 165023808 -8129104 105% /Archive

Note this should not affect the times of the tar(1) -- or dump(8), to follow, presently -- results as the "fill" files are deleted and, as such, not seen by tar or dump! But, how do the gzip and walki algorithms fare with smaller files and more inodes "dirtied"?

# rm -r *
# dd if=/dev/rwd0k | gzip | wc -c
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
2462341439

Slightly worse (i.e., 146,358 bytes larger) than the previous "big fill" attempt (which was 2,462,195,081 bytes). Feh...

# dd if=/dev/rwd0k | ~/walki ~/zeroes | wc -c
Magic: 138813754 Live: 28958774
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
14826892288

To this, add the 20,971,566 bytes (167772528/8) of *uncompressed* (for now) bitmap overhead, yielding an overall image size of 14,847,863,854 bytes. Virtually identical to the previous "big fill" attempt (which was 14,847,531,566 bytes).

# dd if=/dev/rwd0k | ~/walki ~/zeroes | gzip | wc -c
Magic: 138813754 Live: 28958774
167772528+0 records in
167772528+0 records out
85899534336 bytes transferred
2357293609

Again, add in the 20,971,566 bytes of *uncompressed* bitmap for an image size of 2,378,265,175 -- nearly identical to the "big fill" 2,378,253,715 byte image.

Finally, examine the "right" tools (for this job) for their performance:

# cd /Archive
# cat /dev/null > /etc/dumpdates
# dump -0 -f - . | wc -c
  DUMP: Dumping sub files/directories from /Archive
  DUMP: Dumping file/directory .
  DUMP: Found /dev/rwd0k on /Archive in /etc/fstab
  DUMP: Date of this level 0 dump: Mon Oct 27 13:18:39 2014
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping a subset of /dev/rwd0k (a subset of /Archive) to standard output
  DUMP: Label: none
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 14084291 tape blocks.
  DUMP: Volume 1 started at: Mon Oct 27 13:24:43 2014
  DUMP: dumping (Pass III) [directories]
  DUMP: 1.68% done, finished in 4:53
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 7.53% done, finished in 2:02
  DUMP: 13.43% done, finished in 1:36
  DUMP: 19.53% done, finished in 1:22
  DUMP: 25.63% done, finished in 1:12
  DUMP: 31.77% done, finished in 1:04
  DUMP: 37.92% done, finished in 0:57
  DUMP: 44.01% done, finished in 0:50
  DUMP: 49.98% done, finished in 0:45
  DUMP: 55.81% done, finished in 0:39
  DUMP: 61.62% done, finished in 0:34
  DUMP: 67.51% done, finished in 0:28
  DUMP: 73.52% done, finished in 0:23
  DUMP: 79.55% done, finished in 0:17
  DUMP: 85.61% done, finished in 0:12
  DUMP: 91.69% done, finished in 0:07
  DUMP: 97.85% done, finished in 0:01
  DUMP: 14083025 tape blocks
  DUMP: Volume 1 completed at: Mon Oct 27 14:51:28 2014
  DUMP: Volume 1 took 1:26:45
  DUMP: Volume 1 transfer rate: 2705 KB/s
  DUMP: Date of this level 0 dump: Mon Oct 27 13:18:39 2014
  DUMP: Date this dump completed:  Mon Oct 27 14:51:28 2014
  DUMP: Average transfer rate: 2705 KB/s
  DUMP: DUMP IS DONE
14421012480

Notice the estimate of 14,084,291 tape blocks and how that correlates with the 14,083,025 *actual* (1k) blocks and the 27,794,440 (512) blocks listed as "Used" by df(1) as well as the 28,958,125 "Live" (512) sectors that walki reported; the 14,421,012,480 actual bytes in the dump compared to the 13,641,902,080 bytes (26,644,340*512) in the tarball and the size of walki's uncompressed image (14,847,531,566). Note, also, the initial time estimate for the operation (2:02) vs. the actual time expended (1:33).

Now, repeat the same operation to see how well a dump compresses:

# cat /dev/null > /etc/dumpdates
# dump -0 -f - . | gzip | wc -c
  DUMP: Dumping sub files/directories from /Archive
  DUMP: Dumping file/directory .
  DUMP: Found /dev/rwd0k on /Archive in /etc/fstab
  DUMP: Date of this level 0 dump: Mon Oct 27 14:57:07 2014
  DUMP: Date of last level 0 dump: the epoch
  DUMP: Dumping a subset of /dev/rwd0k (a subset of /Archive) to standard output
  DUMP: Label: none
  DUMP: mapping (Pass I) [regular files]
  DUMP: mapping (Pass II) [directories]
  DUMP: estimated 14084291 tape blocks.
  DUMP: Volume 1 started at: Mon Oct 27 15:03:10 2014
  DUMP: dumping (Pass III) [directories]
  DUMP: 1.64% done, finished in 5:00
  DUMP: dumping (Pass IV) [regular files]
  DUMP: 6.13% done, finished in 2:33
  DUMP: 10.84% done, finished in 2:03
  DUMP: 15.47% done, finished in 1:49
  DUMP: 19.97% done, finished in 1:40
  DUMP: 24.58% done, finished in 1:32
  DUMP: 29.18% done, finished in 1:24
  DUMP: 33.50% done, finished in 1:19
  DUMP: 37.97% done, finished in 1:13
  DUMP: 42.51% done, finished in 1:07
  DUMP: 46.81% done, finished in 1:02
  DUMP: 51.35% done, finished in 0:56
  DUMP: 55.81% done, finished in 0:51
  DUMP: 60.35% done, finished in 0:45
  DUMP: 64.83% done, finished in 0:40
  DUMP: 69.11% done, finished in 0:35
  DUMP: 73.52% done, finished in 0:30
  DUMP: 77.88% done, finished in 0:25
  DUMP: 81.80% done, finished in 0:21
  DUMP: 86.40% done, finished in 0:15
  DUMP: 90.98% done, finished in 0:10
  DUMP: 95.67% done, finished in 0:04
  DUMP: 14083025 tape blocks
  DUMP: Volume 1 completed at: Mon Oct 27 16:57:43 2014
  DUMP: Volume 1 took 1:54:33
  DUMP: Volume 1 transfer rate: 2049 KB/s
  DUMP: Date of this level 0 dump: Mon Oct 27 14:57:07 2014
  DUMP: Date this dump completed:  Mon Oct 27 16:57:43 2014
  DUMP: Average transfer rate: 2049 KB/s
  DUMP: DUMP IS DONE
2322788135

Notice that the time estimate changed (2:33 vs. 2:02) -- no doubt because the system was more heavily loaded (by the gzip process). But, the size estimate remained the same!

Of course, dump(8) isn't very useful outside of *this* host system (e.g., no joy for Windows or any of the appliances!). And, dump(8) requires a preexisting filesystem onto which to restore(8)! So, it's no better than tar(1) in that regard!

Now, try Clonezilla!

<frown> CZ's handling of NetBSD disklabels leaves a lot to be desired! *Guess* at which pseudo-partition is likely to correspond with /dev/wd0k (I'm going to guess sda1 is the NetBSD partition, sda5 is the wd0a slice, sda6 the swap slice, sda7 the wd0e slice, sda8 the wd0f, sda9 the wd0g, sda10 the wd0h, sda11 the wd0i, sda12 the wd0j, leaving sda13 as wd0k)

[The "names" given for all of the NetBSD slices are all identical! I.e., "_ufs(In_TOSHIBA_MK6459GS)_ata-TOSHIBA_MK6459GSX_1184D0EIB". So, it's a pretty silly way to sort out what you really want to image! :-/ ]

Accept all "Expert" default options offered for an external USB drive selected for the "partimag" filesystem. Examining the output/image created:

-rw-r--r-- 1 root root         69 Oct 27  2014 clonezilla-img
-rw-r--r-- 1 root root      12249 Oct 27  2014 Info-dmi.txt
-rw-r--r-- 1 root root      14722 Oct 27  2014 Info-lshw.txt
-rw-r--r-- 1 root root       1643 Oct 27  2014 Info-lspci.txt
-rw-r--r-- 1 root root        171 Oct 27  2014 Info-packages.txt
-rw-r--r-- 1 root root          6 Oct 27  2014 parts
-rw------- 1 root root 2097152000 Oct 27  2014 sda13.ufs-ptcl-img.gz.aa
-rw------- 1 root root  980770626 Oct 27  2014 sda13.ufs-ptcl-img.gz.ab
-rw-r--r-- 1 root root         37 Oct 27  2014 sda-chs.sf
-rw-r--r-- 1 root root      31744 Oct 27  2014 sda-hidden-data-after-mbr
-rw-r--r-- 1 root root        512 Oct 27  2014 sda-mbr
-rw-r--r-- 1 root root        267 Oct 27  2014 sda-pt.parted
-rw-r--r-- 1 root root        656 Oct 27  2014 sda-pt.sf

Clonezilla builds a 3,077,922,626 byte (2097152000+980770626) image (i.e., 3005783.814453125 KiB)
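For completeness, the "few KB" restorer mentioned at the top of the thread would presumably just invert walki: walk the per-sector bitmap, emit the magic block wherever a sector was marked "empty", and copy live sectors back from the (gunzip'd) image stream. A sketch under assumed file names, bitmap layout and bit order -- none of which are spelled out in the thread:

/* Hypothetical restore pass: rebuild the raw device from a bitmap
 * (1 bit per 512-byte sector) plus a stream of live sectors, e.g.
 *     gunzip < image.live.gz | ./restore
 * A real restorer would also cap writes at the partition's sector
 * count; trailing pad bits in the last bitmap byte are ignored here. */
#include <stdio.h>

#define SECTOR 512

int main(void)
{
    static char magic[SECTOR];        /* all zeroes, or whatever string was used */
    char buf[SECTOR];
    FILE *bitmap = fopen("image.bitmap", "rb");
    FILE *live   = stdin;
    FILE *disk   = fopen("/dev/rwd0k", "wb");     /* target raw device */
    int c;

    if (bitmap == NULL || disk == NULL) {
        perror("open");
        return 1;
    }

    while ((c = fgetc(bitmap)) != EOF) {
        for (int bit = 0; bit < 8; bit++) {
            if (c & (1 << bit)) {
                fwrite(magic, 1, SECTOR, disk);   /* sector was "empty" when imaged */
            } else {
                if (fread(buf, 1, SECTOR, live) != SECTOR) {
                    perror("image stream");
                    return 1;
                }
                fwrite(buf, 1, SECTOR, disk);     /* live sector, copied back as-is */
            }
        }
    }
    fclose(disk);
    return 0;
}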
Reply by ●November 23, 2014
On 23.11.2014 г. 10:54, Don Y wrote:
> Hi Dimiter,
>
> On 11/13/2014 12:46 AM, Dimiter_Popoff wrote:
>>>
>>>> BTW the netmca runs a complete DPS on it, shell windows and all.
>>>
>>> Ah, OK. So, you don't have an "embedded" version of it with
>>> reduced capabilities/features.
>>
>> No need for that. Much of the functionality even fits in 2M flash....
>> how thinkable is that (about 1.5 to 2M lines of VPA code which is
>> not generous with CRLF, unlike certain HLL-s :-) ). But booting off
>> flash is intended just to be able to restore your HDD via the net
>> if you mess it up.
>
> So, a network stack and some utilities...?

Oh a lot more than that. The spectroscopy software works too - not its latest version but good enough to test newly baked boards without the hdd attached etc.

>
>> I have smaller versions of course, e.g. I am now tortured by
>> a small coldfire (mcf52211) which has a tiny derivative of dps
>> (mainly the scheduler and some library calls, about 7 kilobytes
>> total). Bloody thing won't go into low power mode which is
>> sort of specified to at least halve the consumption, nothing
>> of the sort, *zero* effect of entering that mode by the core.
>> Cost me two days so far to zero result. Not that I can't live
>> without that mode but why does it not work, drives me mad.
>
> Heh heh heh... you'll find it. Then, curse yourself for OVERLOOKING
> it. Or, the manufacturer for not *documenting* it! :-/
>

I found it. I rarely curse myself, there can always be someone else to blame after all.

This time the reason was that the period of the "force task out if it does volunteer for reschedule" timer was set to 100 (or was it 10) uS rather than the wanted 10mS during initialization. Since no person on Earth can mess with my development tools over the net, nor has anyone been close to my keyboard, it must have been some alien bugger. Can't have been me. :D

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI             http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
Reply by ●November 23, 2014
On 11/23/2014 2:35 AM, Dimiter_Popoff wrote:
> On 23.11.2014 г. 10:54, Don Y wrote:

>> Heh heh heh... you'll find it. Then, curse yourself for OVERLOOKING
>> it. Or, the manufacturer for not *documenting* it! :-/
>
> I found it. I rarely curse myself, there can always be someone
> else to blame after all.
> This time the reason was the period of the "force task out if it does volunteer
> for reschedule" timer, was set to 100 (or was it 10)
> uS rather than to the wanted 10mS during initialization.
> Since no person on Earth can mess with my development tools over
> the net, nor has anyone been close to my keyboard it must have
> been some alien bugger. Can't have been me.

"Gremlins". Had a friend ages ago who was convinced these things actually exist. As proof, she offered up all the *matches* that mysteriously "go missing". She concluded that the Gremlins are fascinated by fire and steal any matches they can find!

This, in her mind, was the ONLY way to explain the sheer volume of matches that she would lose in any given week! :>

I've had other friends similarly claim the existence of Gremlins... but, use *socks* as proof! Contending that they are amazed at these simple foot coverings and always try to steal them from laundry baskets -- hence the reason you ALWAYS have an oddball sock missing its mate!

I'll admit "changing timeout values" had never occurred to me as a similar rationalization for their existence... <grin>
Reply by ●November 23, 2014
On Sun, 23 Nov 2014 02:52:25 -0700, Don Y <this@is.not.me.com> Gave us:

>On 11/23/2014 2:35 AM, Dimiter_Popoff wrote:
>> On 23.11.2014 г. 10:54, Don Y wrote:
>
>>> Heh heh heh... you'll find it. Then, curse yourself for OVERLOOKING
>>> it. Or, the manufacturer for not *documenting* it! :-/
>>
>> I found it. I rarely curse myself, there can always be someone
>> else to blame after all.
>> This time the reason was the period of the "force task out if it does volunteer
>> for reschedule" timer, was set to 100 (or was it 10)
>> uS rather than to the wanted 10mS during initialization.
>> Since no person on Earth can mess with my development tools over
>> the net, nor has anyone been close to my keyboard it must have
>> been some alien bugger. Can't have been me.
>
>"Gremlins". Had a friend ages ago who was convinced these things
>actually exist. As proof, she offered up all the *matches* that
>mysteriously "go missing". She concluded that the Gremlins are
>fascinated by fire and steal any matches they can find!
>
>This, in her mind, was the ONLY way to explain the sheer volume
>of matches that she would lose in any given week! :>
>
>I've had other friends similarly claim the existence of Gremlins...
>but, use *socks* as proof! Contending that they are amazed at these
>simple foot coverings and always try to steal them from laundry
>baskets -- hence the reason you ALWAYS have an oddball sock missing
>its mate!
>
>I'll admit "changing timeout values" had never occurred to me as
>a similar rationalization for their existence... <grin>
>

You're gonna want to see Mel Blanc's version of a gremlin.

<https://www.youtube.com/watch?v=jljAMQNbl4Y>
Reply by ●November 23, 2014
On 23.11.2014 г. 11:52, Don Y wrote:
> On 11/23/2014 2:35 AM, Dimiter_Popoff wrote:
>> On 23.11.2014 г. 10:54, Don Y wrote:
>
>>> Heh heh heh... you'll find it. Then, curse yourself for OVERLOOKING
>>> it. Or, the manufacturer for not *documenting* it! :-/
>>
>> I found it. I rarely curse myself, there can always be someone
>> else to blame after all.
>> This time the reason was the period of the "force task out if it does
>> volunteer
>> for reschedule" timer, was set to 100 (or was it 10)
>> uS rather than to the wanted 10mS during initialization.
>> Since no person on Earth can mess with my development tools over
>> the net, nor has anyone been close to my keyboard it must have
>> been some alien bugger. Can't have been me.
>
> "Gremlins". Had a friend ages ago who was convinced these things
> actually exist. As proof, she offered up all the *matches* that
> mysteriously "go missing". She concluded that the Gremlins are
> fascinated by fire and steal any matches they can find!
>
> This, in her mind, was the ONLY way to explain the sheer volume
> of matches that she would lose in any given week! :>
>
> I've had other friends similarly claim the existence of Gremlins...
> but, use *socks* as proof! Contending that they are amazed at these
> simple foot coverings and always try to steal them from laundry
> baskets -- hence the reason you ALWAYS have an oddball sock missing
> its mate!
>
> I'll admit "changing timeout values" had never occurred to me as
> a similar rationalization for their existence... <grin>
>
>

That with the too many missing matches I would not take that lightly, mad as it may sound, you know. Gremlins or whatever, things do go missing sometimes in an inexplicable way for me, too - usually only to reappear at the location I initially looked for after minutes - sometimes hours - of search. Not very often but times enough to rule out being just in dreamland when looking there first.

I just have no explanation about it but it does happen to me. May be not to everybody... Obviously even now if someone would tell me that even I would consider a mental issue but... it does happen to me.

Who knows, may be one day we'll discover that even "changing timeout values" can also go into that inexplicable category :D :D :D.

Just now I had another which wasted me an hour. I wish it were inexplicable but it was just Chinese.... Bought a multimeter from ebay to measure current consumptions of things, looked for an analog one which could do 1A at least; found one with 2.5A. Came with no 2.5A, just 0.25A max. OK, no time to deal with this, just put a negative ebay feedback and moved on. Used it at 250mA for the current thing (the one with the timeout values, it is a HV source).

While messing with it (soldering live a 470uF at the incoming 12V past a tiny 10uH choke) something died, consumption went way above 250mA. Unsoldered quite a few parts from the board to see what did die only to discover the deadman was the shunt resistor or something within the multimeter...... Consumption had not risen at all. I must have shorted the incoming 12V briefly - for a few tens of milliseconds - and the thing had died... They must have used 0402 resistors for the shunt :D (no time to investigate, just made an external one for 1A, back to work).

Dimiter

------------------------------------------------------
Dimiter Popoff, TGI             http://www.tgi-sci.com
------------------------------------------------------
http://www.flickr.com/photos/didi_tgi/sets/72157600228621276/
Reply by ●November 23, 2014
Hi Dimiter,

On 11/23/2014 4:55 AM, Dimiter_Popoff wrote:
> On 23.11.2014 г. 11:52, Don Y wrote:

>> "Gremlins". Had a friend ages ago who was convinced these things
>> actually exist. As proof, she offered up all the *matches* that
>> mysteriously "go missing". She concluded that the Gremlins are
>> fascinated by fire and steal any matches they can find!
>>
>> This, in her mind, was the ONLY way to explain the sheer volume
>> of matches that she would lose in any given week! :>

> That with the too many missing matches I would not take that lightly,
> mad as it may sound, you know.

Well, when you *need* a match/light, it can quickly lead to PANIC!!

> Gremlins or whatever, things do go
> missing sometimes in an inexplicable way for me, too - usually only
> to reappear at the location I initially looked for after
> minutes - sometimes hours - of search.

I've recently become distressed over having too many pairs of *shoes*! Seems like I can never find the pair that I am *supposed* to be wearing; always one of the pairs that I'm *not*! <frown>

> Not very often but times enough to rule out being just in dreamland
> when looking there first.

Actually, L has been moving things when you're not looking; then, placing them back just AFTER you've searched a place! If you listen carefully, while you're running around "hunting", she's quietly giggling in the other room!

> I just have no explanation about it but it does happen to me.
> May be not to everybody... Obviously even now if someone would tell
> me that even I would consider a mental issue but... it does happen
> to me.
>
> Who knows, may be one day we'll discover that even "changing
> timeout values" can also go into that inexplicable category :D :D :D.
>
> Just now I had another which wasted me an hour. I wish it were
> inexplicable but it was just Chinese.... Bought a multimeter from
> ebay to measure current consumptions of things, looked for an analog
> one which could do 1A at least; found one with 2.5A.
> Came with no 2.5A, just 0.25A max. OK, no time to deal with this,
> just put a negative ebay feedback and moved on. Used it at 250mA for the
> current thing (the one with the timeout values, it is a HV source).
> While messing with it (soldering live a 470uF at the incoming 12V past
> a tiny 10uH choke) something died, consumption went way above
> 250mA. Unsoldered quite a few parts from the board to see what
> did die only to discover the deadman was the shunt resistor
> or something within the multimeter...... Consumption had not
> risen at all. I must have shorted the incoming 12V briefly - for a few
> tens of milliseconds - and the thing had died...

If you didn't notice at the time, it could have been milliseconds or *weeks*! OTOH, when you *do* notice, it's within ohnoseconds!

> They must have
> used 0402 resistors for the shunt :D (no time to investigate, just
> made an external one for 1A, back to work).

Dunno. I've a couple of cheapie DVM's but rarely use them for measuring current (other than to verify charging current flowing into a battery, etc.). Always amazing how inexpensively they can make the things!

There's a place, here, that frequently gives them away so you'll find people with 5 or 6 of the same make/model lying in their shop and you KNOW where they got them! :-/







