EmbeddedRelated.com
Forums
The 2026 Embedded Online Conference

Disk imaging strategy

Started by Don Y November 2, 2014
Don Y wrote:
> On 11/3/2014 10:56 AM, Stefan Reuther wrote:
>>> Sure you can dd that partition, but now you really are in trouble.
>>> It's safer to tar a filesystem (that should NOT be currently running,
>>> else you are in trouble too); you can always untar it into another
>>> filesystem (ext2, ext4, reiserfs, etc.) that is compatible.
>>
>> It depends.
>>
>> 'dd'ing the raw partition is almost guaranteed to produce a working
>> image after unpacking. If you 'tar' a mounted file system, the operating
>> system you run the 'tar' on must support all nuances of the file system
>> you want to clone. Back in Win95 days, cloning (or backup/restore) a
>> Win95 installation using 'tar' from Linux did not work, because it did
>> not restore all required file attributes. I wouldn't expect Linux 'tar'
>> to capture all NTFS attributes (like "compressed", ACLs, ADS) either.
>> Copying the partition blockwise would not have all these problems.
>
> Exactly. But, you don't want your image to HAVE TO BE as large as the
> original. Esp as most disks have a fair bit of unused space.
>
> Hence the problem I posed: how do you sort out what is "unused space"
> from "used space" -- in a manner that allows you to ignore the actual
> metadata/etc. imposed by the particular filesystem implementation.
I have often used the "fill the file system with a file that is all zeroes" trick you already mentioned, so I cannot add anything new for that, other than a "+1, yes this works". Stefan
In article <m3a0q6$g74$1@dont-email.me>,
David Brown  <david.brown@hesbynett.no> wrote:

>As to the effect of the fragmentation, a key point is the effectiveness
>of the OS's caching and read-ahead. If these work well (as they do in
>Linux, and to a fair extent in modern Windows, but not in older
>Windows), fragmentation does not matter even when reading longer files.
My rule of thumb is that if you're running a system which tends to be accessing a fair number of files semi-simultaneously (and so is "switching" between reading/writing one file and another on almost every physical I/O), then as long as the contiguous disk-sector extents are larger than the size of the reads/writes you're sending to the drive, fragmentation doesn't affect performance very much. You'll end up having to do a seek for almost every read or write anyhow.

You gain a lot, of course, by making these reads and writes as large as possible (and having contiguous-extent sizes to match). As you say, modern operating systems tend to do a pretty good job of this, by using usage-sensitive adaptive readahead, and write-behind buffering systems with write consolidation. "Throwing memory at the problem" can make a big difference, by increasing the size of the read-ahead / write-behind buffers.

Back when I was doing filesystem work for a DVR company, I occasionally had to go rant at the application-writers a bit to make sure that they were budgeting their memory and I/O behavior properly. The rule of thumb I shot for was that a "completely busy" disk really ought to be spending at least 50% of its time "on-sector" - that is, actually transferring data to/from the platter... this is "useful" time. Arm-seek and rotational delay ought not to be more than about 25-30% of the time budget, with 15-25% being held back as reserve time (for retries, error correction, and the sort of "adjacent-track maintenance" that modern hard drives require to deal with partial adjacent-track erasure).

"Do few I/Os, and big ones" was the mantra.
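That time budget translates directly into I/O sizes. As a back-of-the-envelope sketch (the function and the numbers are illustrative, not from the DVR system): if every I/O pays one positioning delay and then transfers at the media rate, then with a 10 ms positioning cost and a 100 MB/s platter, 1 MB I/Os land exactly at the 50% on-sector mark, while 4 KB I/Os leave the disk seeking more than 99% of the time:

```python
def effective_throughput(seek_ms, media_mb_s, io_kb):
    """Effective MB/s when every I/O of io_kb kilobytes pays one
    positioning delay (seek + rotational latency) of seek_ms."""
    io_mb = io_kb / 1024.0
    transfer_s = io_mb / media_mb_s          # time spent "on-sector"
    total_s = seek_ms / 1000.0 + transfer_s  # plus positioning overhead
    return io_mb / total_s

# 10 ms positioning, 100 MB/s media rate:
small = effective_throughput(10, 100, 4)     # 4 KB I/Os: well under 1 MB/s
big = effective_throughput(10, 100, 1024)    # 1 MB I/Os: 50% on-sector
```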
Am 04.11.2014 um 16:11 schrieb Don Y:

> The appeal of the fs-agnostic approach is that it (should) work
> universally.
And the true problem of it is that it cannot possibly exist. Your approach is _not_ fs-agnostic. On the contrary. It doesn't even manage to be OS-agnostic.

You can't even begin to implement any such "fill disk before creating image" approach without violating that "fs-agnostic" idea. You have to know what that FS is, then you have to be running some OS that knows not just what the FS is, but actually knows how to _write_ to it without corrupting anything. Then you have to run (or even create) a program that writes data to that filesystem, on that OS platform.

Any overall procedure that can reliably tell which parts of a disk partition are currently used by the file system, and which aren't, must contain some part that _is_, for all practical intents and purposes, an implementation of that file system.

You can throw the problem over the nearest fence and claim it's gone, but you can't actually make it go away. E.g. using Linux or the FS's usual host OS to mount the partition and write to it does just that: it "removes" the need for knowing the FS's innards by pretending that Linux can do the job without needing that knowledge. Well: it couldn't.

In short: it is brutally obvious that it is impossible to do what you're trying without diluting that "FS-agnostic" promise down to the kind of meaningless gobbledygook we engineers learned to expect from the marketroids.
On a sunny day (Tue, 04 Nov 2014 21:35:22 +0100) it happened
=?windows-1252?Q?Hans-Bernhard_Br=F6ker?= <HBBroeker@t-online.de> wrote in
<cbsrjrF5kqeU1@mid.dfncis.de>:

>Am 04.11.2014 um 16:11 schrieb Don Y:
>
>> The appeal of the fs-agnostic approach is that it (should) work
>> universally.
>
>[...]
>
>In short: it is brutally obvious that it is impossible to do what you're
>trying without diluting that "FS-agnostic" promise down to the kind of
>meaningless gobbledygook we engineers learned to expect from the
>marketroids.
Agreed!!!!!
On 2014-11-02, Don Y <this@is.not.me.com> wrote:
> Hi,
>
> I'm writing a bit of code to image disk contents REGARDLESS OF THE
> FILESYSTEM(s) contained thereon.
>
> This doesn't have to be "ideal" (defined as "effortless", "minimal
> image size", etc.) but should be pretty close.
>
> It is not intended to be performed often -- "write once, read multiple"
> (i.e., RESTORE *far* more often than IMAGE).
>
> The challenge comes in the filesystem(s) neutral aspect. E.g., I
> should be able to image a disk containing FAT32, NTFS, FFSv1/2, QFS,
> individual RAID* volumes, little/BIG endian, etc. -- with the same
> executable!
>
> A naive approach to this would be to plumb dd to a compressor -- running
> both OUTSIDE the native OS. But, for large/dirty volumes, this gives you
> an unacceptably large resulting image -- because you end up having to store
> "discarded data" which could potentially be HUGE (consider a large volume
> that has seen lots of write/delete cycles) esp in comparison with the
> actual precious data!
You're making the assumption that all those zeros will fill up the disk.

If the filesystem does data compression, you're just burning processor cycles for no gain. If the file system does hash-based storage sharing, you're running in place and the task will only complete when you run out of file names.

--
umop apisdn
Op Tue, 04 Nov 2014 19:52:15 +0100 schreef Stefan Reuther  
<stefan.news@arcor.de>:
> David Brown wrote:
>> On 04/11/14 03:10, DecadentLinuxUserNumeroUno wrote:
>>> On Tue, 4 Nov 2014 01:31:50 +0000 (UTC), Andrew Smallshaw
> However, when you write a single large file to a blank FAT partition, it
> will not fragment. All data will arrive on the disk without any gaps
> inbetween.
I've seen 'really old' DOS/Windows create gaps, when in some sort of "fast write mode". It would reserve a bunch of free cylinders and write wherever the write heads are, incrementing the cylinder each revolution. If the disk spins fast enough, gaps appear. At least this is what I deduced from the patterns when looking at the defrag screen. :)

--
(Remove the obvious prefix to reply privately.)
Made with Opera's e-mail program: http://www.opera.com/mail/
On Wed, 05 Nov 2014 16:07:06 +0100, "Boudewijn Dijkstra"
<sp4mtr4p.boudewijn@indes.com> Gave us:

>Op Tue, 04 Nov 2014 19:52:15 +0100 schreef Stefan Reuther
><stefan.news@arcor.de>:
>> David Brown wrote:
>>> On 04/11/14 03:10, DecadentLinuxUserNumeroUno wrote:
>>>> On Tue, 4 Nov 2014 01:31:50 +0000 (UTC), Andrew Smallshaw
>> However, when you write a single large file to a blank FAT partition, it
>> will not fragment. All data will arrive on the disk without any gaps
>> inbetween.
>
>I've seen 'really old' DOS/Windows create gaps, when in some sort of "fast
>write mode". It would reserve a bunch of free cylinders and write
>wherever the write heads are, incrementing the cylinder each revolution.
>If the disk spins fast enough, gaps appear. At least this is what I
>deduced from the patterns when looking at the defrag screen. :)
Awe-righty then... I am... truly... in awe... that you garnered what you have thought was your knowledge about hard drive physical platter population from guesswork based on a visual representation of some version of some drive defrag utility. And, you are serious? Yeah... there was a gap created alright...
David Brown wrote:
> On 04/11/14 03:10, DecadentLinuxUserNumeroUno wrote:
>> On Tue, 4 Nov 2014 01:31:50 +0000 (UTC), Andrew Smallshaw
>> <andrews@sdf.lonestar.org> Gave us:
>>
>>> No, on _any_ general purpose file system. It is inevitable. You
>>> seem to be of the impression that Linux is somehow the ultimate
>>> system,
>>
>> I was "under the impression" that the ext series of file systems, as
>> well as MS' NTFS fought such occurrences with a different operation and
>> management paradigm, than the old, FAT type method.
>
> There are three major things that affect the fragmentation rate on a
> filesystem, and a number of things that influence the effect of the
> fragmentation.
>
> One is the type of filesystem. Some, such as FAT, have a very simple
> structure that is prone to fragmentation. Others, such as ext4 or xfs,
> have features such as "extents" and "allocation groups" that greatly
> reduce the fragmentation.
>
> Number two is the OS and the filesystem implementation. DOS and early
> Windows versions had very poor caching, and very short write cache times
> (since it was expected that the system could crash any second), leading
> to massive fragmentation. Linux, and newer Windows versions, take a bit
> more time and allocate more sensibly, thus leading to far less
> fragmentation. Linux still does a better job than Windows, but since
> Win7 the difference is hardly noticeable.
>
> Number three is the pattern of filesystem usage. Fairly obviously, if
> you write many large files at once, or do lots of modification of files,
> the chances of fragmentation increase.
>
> As to the effect of the fragmentation, a key point is the effectiveness
> of the OS's caching and read-ahead. If these work well (as they do in
> Linux, and to a fair extent in modern Windows, but not in older
> Windows), fragmentation does not matter even when reading longer files.
>
> Fragmentation has always been here (well, not always - there were older
> filesystems that did not allow file fragmentation at all, but they
> suffered from other problems), and fragmentation will always be with us.
> But apart from a few extreme cases, it has never been a problem in
> Linux, and it is not a problem in modern Windows. The /problems/ of
> fragmentation are a thing of the past.
Some early home computers were a real pain in the ass when it came to fragmentation on low capacity floppy disks.

--
Anyone wanting to run for any political office in the US should have to have a DD214, and an honorable discharge.
Hi Dimiter,

On 11/3/2014 11:50 PM, Dimiter_Popoff wrote:
> On 03.11.2014 г. 23:22, Don Y wrote:
>> First, can a user create files having arbitrary names and contents
>> under your FS?
>
> Yes of course. Pretty much everything you would expect from a
> filesystem.
To be clear, by "user" I mean can a human being walk up and enter "arbitrary text" into a file, etc.? I.e., I had assumed your filesystem handled files that the *instrument* created (e.g., observational data, instrument generated reports, etc.). Could a purchaser store his email addresses in a file called MYADDRS.TXT? Could he, likewise, create a file filled with repeated strings of "Kilroy was here!"?
>> Can he copy & rename files?
>
> Yes, what filesystem would it be without that :D .
Again, my questions are meant to clarify that a *user* can do these things on demand -- not just the *instrument* deciding that it needs to do a "COPY", etc. (for its own purposes)
>> E.g., could he <somehow> introduce a file having some particular
>> contents (like "DELETEDDELETEDDELETEDDELETED...") to your FS?
>
> I'd be tempted to go to all 0 files for the "highly compressed"
> pattern - for no good reason really, except perhaps because disks
> come as all 0 from the factory. But you will want to fill them
> up anyway so this is not a consideration.
>
>> Then, could he replicate it many times? (copy to a different
>> filename)
>>
>> Having done that until the copy failed ("No space left on device"),
>> presumably, he could delete each of them? (perhaps made simpler
>> by creating them all in a single subdirectory/folder and then
>> just deleting the folder AND its contents)
>
> Yes, all of the above. Making multiple copies of a single file will
> take 2-3 lines of script, to increment the name somehow. Deleting
> all in a directory goes the usual del * way, if you want recursion
> there is a script doing it (rm path/ -R ). I have deliberately
> kept recursive disk operations in scripts, makes new bugs show up,
> costs no overhead to speak of, can be retried/resumed, prompts me
> to write necessary extensions when there is some new need etc.
OK. So, I *could* fill YOUR disk with files containing a specific 512 character string (or larger). Then, once the OS complains "no space left on device", I could delete them all thereby freeing up all that space -- yet, leaving that 512 character string on the media (in the deleted files)
>> Could I then examine your disk AT THE SECTOR LEVEL and expect to find
>> lots of "DELETEDDELETEDDELETED..." in sectors?
>
> Well you can have the disk image as a file and do with it whatever
> you please under any OS. Or under DPS, but you don't have a dps machine.
>
>> In doing so, effectively know which sectors are currently "unused"?
>> (or, at the very least, safe to restore with "DELETEDDELETEDDELETED..."
>> as their contents WITHOUT actually having to store an instance of
>> "DELETEDDELETEDDELETED..." for EACH such sector?)
>
> I am not sure I get this, we may be thinking somewhat differently on
> how you will implement that. My understanding of your idea is that
> you will take the disk image - say a 20G one - and copy it elsewhere
> in 512 byte (say) pieces by skipping those 512 byte pieces
> which are all 0 (or all deleteddeleted whatever you opt for). You just
> write (say in another file) where these blocks were, position:length.
I would encode their locations "in-line". So, you open a *bit* stream and start pulling bits out of it in ~512 *byte* chunks. Any sector that was OBSERVED as having the "magic string" in it would be represented by a single bit in this bitstream.

So, when *restoring* it, if the bit is set, you generate a copy of the "magic string" that you then store in this sector on the medium. If the bit is NOT set, you take the next 512 bytes worth of *bits* and store them as the "live data". Then, move to the next sector and repeat the process.

I.e., a sector either takes a (single) bit or 512*8 bits to store in an image. (You can compress this bitstream separately to achieve even higher compression ratios)
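The scheme described above can be sketched concretely. The names and the particular 512-byte magic string below are invented for illustration, and the sketch assumes the image is a whole number of sectors:

```python
MAGIC = b"DELETEDDELETED!!" * 32   # an illustrative 512-byte magic string
SECTOR = 512

def encode(image):
    """One '1' bit per magic sector; a '0' bit followed by the sector's
    512*8 data bits otherwise. Assumes len(image) is a multiple of 512."""
    bits = []
    for off in range(0, len(image), SECTOR):
        sector = image[off:off + SECTOR]
        if sector == MAGIC:
            bits.append("1")                 # whole sector costs one bit
        else:
            bits.append("0")
            bits.extend(f"{b:08b}" for b in sector)
    stream = "".join(bits)
    stream += "0" * (-len(stream) % 8)       # pad to a byte boundary
    packed = bytes(int(stream[i:i + 8], 2) for i in range(0, len(stream), 8))
    return packed, len(image) // SECTOR

def decode(packed, nsectors):
    """Walk the bitstream, re-expanding each '1' bit into MAGIC."""
    stream = "".join(f"{b:08b}" for b in packed)
    out, pos = [], 0
    for _ in range(nsectors):
        if stream[pos] == "1":
            out.append(MAGIC)
            pos += 1
        else:
            pos += 1
            out.append(bytes(int(stream[pos + 8 * i:pos + 8 * i + 8], 2)
                             for i in range(SECTOR)))
            pos += SECTOR * 8
    return b"".join(out)
```

A partition whose free space was pre-filled with MAGIC thus costs one bit per free sector in the image (plus whatever the live sectors occupy), and the packed stream can still be handed to a conventional compressor afterwards.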
> Oops, I think I get it now. You can know which sectors have been
> full of "deleteddeleted" but how will you know you don't have to
> restore them?
I *don't* know that! I *do* restore them. But, I don't need to save that "magic string" IN the image (for each such "deleted sector"). Instead, I can save a single bit (or less) and know that this bit represents that "magic string" -- regardless of whether or not the magic string is part of a deleted file (as would most typically be the case -- given that I would have written them deliberately as the contents of those files that I used to fill the medium).
> What if there has been an allocated sector full of
> exactly this pattern?
Then I will have recreated that pattern! See above.

Note that you MUST restore the entire contents of the disk as you have no idea what sort of "corruption/damage" it may have experienced (i.e., to have necessitated the "restore" operation). You can't even count on the disk to have any legitimate vestiges of the previous filesystem remaining (intact) on it!

You don't want to have to "format" the medium and build a filesystem BEFORE you restore the data. That would require a specific algorithm for each filesystem that you needed to recreate -- BEFORE you even started restoring the DATA! (e.g., it is not uncommon to have two or three different filesystems on a SINGLE Windows laptop's disk)

[So, if you are only going to selectively restore parts of the image, then you need another operation that will ensure any other disk structures required by the SYSTEM (incl filesystem) are present, as well! E.g., do a "format" before doing the restore...]
> May be it is still practical to come up
> with some 512 byte pattern which will never occur on normal
> disks but this will be a vulnerability, perhaps an acceptable
I don't think it is worth the effort. Having to reexpand the "single bit" into that magic string AND write that string onto the physical disk eats up time during the restore operation. But, even 100+GB isn't going to take forever!

Remember, I am contrasting this with having to rebuild a system from scratch! (how long does it take to reformat the disk, install the OS, install all the applications, configure things, etc.) And, this operation could be performed unattended (which would not be possible if rebuilding the system from scratch!)

The key here is that I am imaging the system AFTER it is "built". It is not an ongoing operation: build system, image, add more apps, image again, add even more apps, image yet again, change some part of the configuration, image once more... So, you can size the partition (medium) to fit the "live data". Let the user store *their* data on another partition (that I don't have to be concerned with).
> one. Clearly the "all 0" I thought earlier of is not applicable then.
> But I'd go for restoring all data (at least have the option to
> do if the shortcut does not get me anywhere). At say 20M/S a 20G
> disk would get restored in what, 1000 seconds. Not that long
> for such a massive intervention I suppose.
This doesn't necessarily want to be a "free" operation. If you are using it often, there is something wrong with your usage habits! E.g., I've set up one of our laptops that we use EXCLUSIVELY for on-line financial transactions such that it wipes the disk after each reboot. Essentially equivalent to running off R/O media (as the machine is only powered up to perform the necessary transaction(s) and then powered off/scrubbed)
>> [BTW, we are now below 25C. Almost feels comfortable!! :> ]
>
> Hey, if 25C is "almost" comfortable then I don't know what is
> "really comfortable" in your book :D . We already hover between
> 0 and 5C, sometimes 10C, winter is coming... (we do hate it).
Well, it's been above 30 (35 to 40-ish) for MONTHS so being able to stand outside without melting is a huge improvement! :> Already amusing to see the "faint hearted" wearing jackets, gloves, etc. :-/

Unfortunately, too many other activities come with the improving temperatures so it gets harder to keep control of your time... :< I usually reserve the last quarter (of the year) for equipment/tool upgrades/replacements, organizing the accumulated mess in the office, closing out the books, etc. Hard to do any "real work" as there are lots of activities that eat into your waking hours (parties, friends returning to town for the cooler weather, community events, etc.). And, a fair bit of time baking for the holidays.

I've got four specs that I would be *thrilled* to have formalized by year end so I can start work on implementing them after the new year. Plus a couple of significant chores that I need to tackle before the weather gets much cooler (pour some concrete, do some outdoor painting, finish laying some irrigation line, some body work on SWMBO's vehicle, etc.)

(sigh) No time for "work"! ;-)

--don
On 11/5/2014 3:16 AM, Jasen Betts wrote:
> On 2014-11-02, Don Y <this@is.not.me.com> wrote:
>> Hi,
>>
>> I'm writing a bit of code to image disk contents REGARDLESS OF THE
>> FILESYSTEM(s) contained thereon.
>>
>> [...]
>>
>> A naive approach to this would be to plumb dd to a compressor -- running
>> both OUTSIDE the native OS. But, for large/dirty volumes, this gives you
>> an unacceptably large resulting image -- because you end up having to store
>> "discarded data" which could potentially be HUGE (consider a large volume
>> that has seen lots of write/delete cycles) esp in comparison with the
>> actual precious data!
>
> You're making the assumption that all those zeros will fill up the disk.
"Blank" doesn't mean "full of zeroes". Rather, it means devoid of data.

I explored various different "magic strings" to fill the "to be unused" portions of the disk. One algorithm takes a fixed (const) 512-byte array and repeatedly pushes it into file(s) until the current file's write(2)'s fail and/or further creat(2)'s fail. Then, unlinks all of this. I've tried with "compressible" patterns in that array as well as "less compressible" patterns (i.e., where the compressor would have to operate over longer distances -- between sectors instead of within *that* sector).

I have another that creates "random" data to fill the "to be unused" portions of the medium (the goal being to make compression *hard*). In each case, those portions of the medium are "blank" when the files are later unlinked. Yet, completely restorable! The amount of space they require in the image can vary significantly -- hence the purpose of the experiment(s).
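The difference between those two fill strategies is easy to quantify with a stock compressor. In the sketch below (zlib standing in for the COTS compander, a seeded PRNG as the "random" fill; all names are invented for illustration), the seeded-noise fill defeats the generic compressor, yet the imager can restore it exactly because it only needs to remember the seed:

```python
import random
import zlib

SECTOR, NSECTORS = 512, 1024

def fill_from_seed(seed, nbytes):
    """Regenerable 'random' fill: incompressible to a generic compander,
    but the imager can re-create it exactly from just the seed."""
    rng = random.Random(seed)
    return bytes(rng.randrange(256) for _ in range(nbytes))

zeros = b"\x00" * (SECTOR * NSECTORS)            # trivially compressible fill
noise = fill_from_seed(1234, SECTOR * NSECTORS)  # "hard" fill, 512 KB

zeros_cost = len(zlib.compress(zeros, 9))   # a few hundred bytes
noise_cost = len(zlib.compress(noise, 9))   # roughly the full 512 KB
```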
> If the filesystem does data compression you're just burning processor
> cycles for no gain,
See above. You can push uncompressible data into those "empty"/unused parts of the media -- yet still restore it from a HIGHLY compressed form! You just have to choose data that the COTS compander can't easily compress -- but that *you* can!! :>
> If the file system does hash-based storage sharing you're running in place
> and the task will only complete when you run out of file names,
You can't come up with a naive "one-size fits all" approach.

E.g., some filesystems may place limits on how large a file can be. So, you have to be able to create multiple files! (e.g., imagine FAT32 with >2G of space available) Some filesystems may place limitations on how many names can fit in a container. So, you need to be able to create *new* containers (to let you restart the "per container" name count). Some filesystems may have limits on the character set used for the names and the number of characters in an identifier. So, you have to be able to adjust this to support all of the above.

But, it's still an amazingly simple piece of code! And, doesn't need to understand any of the particulars of the filesystem it is deployed on. It just needs to be able to create/write/unlink files!

Remember, any filesystem has to be *usable*! FAT16 would be silly on a TB medium! So, if my "fill" code was run on that medium, you wouldn't fault it for not being able to fully consume the available free space!