Forums

DOSFS questions/comments

Started by John Williams July 22, 2007
Hello,

We are working on a project that is using the Larwe DOSFS 1.03 library, 
over an SD/MMC card in a standalone embedded platform.  I have 
implemented a generic buffer cache between DOSFS and the physical media, 
which works well and gives the expected performance improvements.

The access pattern in this application is typically just sequential 
writes of ~3MB files.  However, we've found a very strange behaviour, 
whereby DOSFS will only allow us to write as many files to the disk as 
were present when the system was initialised.

For example, if we pre-populate the filesystem with files foo1, foo2, 
and foo3, then start up our app and attempt to write foo1, foo2, foo3 
and foo4 in that sequence, we end up with

foo1, foo2, foo4

on the disk. foo3 exists after it is first written, but then foo4 
"replaces it".  The clusters of foo3 are still allocated, but it has no 
directory entry.

It's alomst as though DOSFS is refusing to expand the root directory 
beyond the unmber of entries that existed there in the first place.

Has anybody seen anything like this, or have any pointers on where to 
start looking for the problem?  We see this behaviour with or without 
our caching enabled.

Secondly, more of a comment / suggestion -  I found that the file write 
time increased linearly with the number of allocated clusters on the 
disk.  I traced this down to the linear, start-from-the-beginning search 
in DFS_GetFreeFAT().

for (i = 2; i < volinfo->numclusters; i++) {
     result = DFS_GetFAT(...)
     ...
}

As the number of allocated clusters grows, so will this search time.  I 
did a little hack whereby we remember the cluster of the previous free 
FAT, and start our search from there next time.  This seems to work 
nicely, and completely "levels out" the write time for these successive 
file writes.

Can anybody confirm if this will do invalid things to the disk like 
leaving holes in the FAT, if there are file deletes in between writes? 
I wrap the starting point around if it reaches volinfo->numclusters, so 
those holes should be filled eventually.

Running fsck.msdos on the resulting file systems doesn't report any nasties.

There doesn't seem to be any link betwen this optimisation and the 
problem I describe above - it occurs regardless.

Thanks for any comments or suggestions,

John


> We are working on a project that is using the Larwe DOSFS 1.03 library, > ...
Don't know that particular implementation, but having dealt with MSDOS <-> DPS filesystem translation, here are some comments while waiting for the backup to finish - well, it just finished but that won't get in my way :-).
> Secondly, more of a comment / suggestion - I found that the file write > time increased linearly with the number of allocated clusters on the > disk. I traced this down to the linear, start-from-the-beginning search > in DFS_GetFreeFAT().
Searching from the beginning is normal for many (most?) systems. The major issue with FAT is that it is 12-16-32 times less efficient on allocation than bit-per-cluster schemes - so if you have a large FAT and a tiny processor this might be noticeable. The other major issue I have seen with FAT filesystems is that they don't seem to have heard about worst fit allocation algorithms... I think they all still do first fit. If your system is indeed small and dedicated your hack - allocate starting from last taken - will probably be more than enough.
> Can anybody confirm if this will do invalid things to the disk like > leaving holes in the FAT, if there are file deletes in between writes?
You can always expect "holes" in the FAT, if holes means non-allocated clusters. Nothing wrong with that, any OS should be able to handle it, actually it would not work if it did not. Dimiter ------------------------------------------------------ Dimiter Popoff Transgalactic Instruments http://www.tgi-sci.com ------------------------------------------------------ On Jul 23, 3:19 am, John Williams <jwilli...@itee.uq.edu.au> wrote:
> Hello, > > We are working on a project that is using the Larwe DOSFS 1.03 library, > over an SD/MMC card in a standalone embedded platform. I have > implemented a generic buffer cache between DOSFS and the physical media, > which works well and gives the expected performance improvements. > > The access pattern in this application is typically just sequential > writes of ~3MB files. However, we've found a very strange behaviour, > whereby DOSFS will only allow us to write as many files to the disk as > were present when the system was initialised. > > For example, if we pre-populate the filesystem with files foo1, foo2, > and foo3, then start up our app and attempt to write foo1, foo2, foo3 > and foo4 in that sequence, we end up with > > foo1, foo2, foo4 > > on the disk. foo3 exists after it is first written, but then foo4 > "replaces it". The clusters of foo3 are still allocated, but it has no > directory entry. > > It's alomst as though DOSFS is refusing to expand the root directory > beyond the unmber of entries that existed there in the first place. > > Has anybody seen anything like this, or have any pointers on where to > start looking for the problem? We see this behaviour with or without > our caching enabled. > > Secondly, more of a comment / suggestion - I found that the file write > time increased linearly with the number of allocated clusters on the > disk. I traced this down to the linear, start-from-the-beginning search > in DFS_GetFreeFAT(). > > for (i = 2; i < volinfo->numclusters; i++) { > result = DFS_GetFAT(...) > ... > > } > > As the number of allocated clusters grows, so will this search time. I > did a little hack whereby we remember the cluster of the previous free > FAT, and start our search from there next time. This seems to work > nicely, and completely "levels out" the write time for these successive > file writes. > > Can anybody confirm if this will do invalid things to the disk like > leaving holes in the FAT, if there are file deletes in between writes? > I wrap the starting point around if it reaches volinfo->numclusters, so > those holes should be filled eventually. > > Running fsck.msdos on the resulting file systems doesn't report any nasties. > > There doesn't seem to be any link betwen this optimisation and the > problem I describe above - it occurs regardless. > > Thanks for any comments or suggestions, > > John
John Williams wrote:

> Can anybody confirm if this will do invalid things to the disk like > leaving holes in the FAT, if there are file deletes in between writes? > I wrap the starting point around if it reaches volinfo->numclusters, so > those holes should be filled eventually.
Aside from performance differences, there shouldn't be any problem with your solution - you can start allocating free clusters where-ever you like on a FAT file system. Regards, -- Mark McDougall, Engineer Virtual Logic Pty Ltd, <http://www.vl.com.au> 21-25 King St, Rockdale, 2216 Ph: +612-9599-3255 Fax: +612-9599-3266
John,

On Jul 22, 8:19 pm, John Williams <jwilli...@itee.uq.edu.au>

This message is mostly just an ack to let you know that I read c.a.e
regularly and though I don't have time tonight to give this the
attention it deserves, I will try to look into it.

> For example, if we pre-populate the filesystem with files foo1, foo2, > and foo3, then start up our app and attempt to write foo1, foo2, foo3 > and foo4 in that sequence, we end up with > > foo1, foo2, foo4
This sounds like a dirent overwriting bug. You are _creating_ foo4, and it sounds like you're creating it over the top of foo3 because of a dirent search bug in DOSFS. Can you email me (this email address is valid) a dd if=/dev/hdxx of=dump.img bs=512 count=yyyy from a card where it is known to happen. (I want the MBR as well the filesystem). Please gzip it first and obviously don't include anything proprietary.
> It's alomst as though DOSFS is refusing to expand the root directory > beyond the unmber of entries that existed there in the first place.
Not really; it's more like the "find first free dirent" code in DOSFS is incorrectly reporting the last used dirent as being the first free one :)
> Secondly, more of a comment / suggestion - I found that the file write > time increased linearly with the number of allocated clusters on the
Correct. Various strategies exist to work around this, and your approach is one of them. In a system with enough RAM to cache the FAT, your approach isn't ideal. The DOSFS implementation is intended to be a vanilla reference, ready to drop in and work - and then to optimize for the application. I've worked on - but not released yet - a special set of flags for file access which has a more general workaround specific to embedded systems - it assumes the disk has contiguous free space and just keeps incrementing the cluster number until you close the file, whereupon it writes out the whole chain. I'm constrained from releasing this code at the moment (it's a bit of a complicated situation).
> Can anybody confirm if this will do invalid things to the disk like > leaving holes in the FAT, if there are file deletes in between writes?
Sure it will. Hence you get fragmentation. Various MS-DOS versions do the same thing. They also cache the FAT to improve performance overall, but that wouldn't fit in DOSFS' RAM requirement. That is why I say in the documentation that this is a working example, but platform-specific optimizations may significantly improve speed, particularly write speed. (That's not verbatim what I put in the readme, but it's darned close). PS: Qld, eh? I'm from Adelaide myself, though I lived most of my life in Melbourne. I would kill for a Carlton Cold, it's unobtainable here in the US. The only readily available "Australian" beer is Fosters, brewed in Canada so they can call it an import.
Hi Lwein,

Thanks for the quick reply - comments inline below:

larwe wrote:

> This message is mostly just an ack to let you know that I read c.a.e > regularly and though I don't have time tonight to give this the > attention it deserves, I will try to look into it.
Thanks - and also I should have said originally, thanks for supporting and releasing DOSFS in the first place!
>>For example, if we pre-populate the filesystem with files foo1, foo2, >>and foo3, then start up our app and attempt to write foo1, foo2, foo3 >>and foo4 in that sequence, we end up with >> >>foo1, foo2, foo4 > > This sounds like a dirent overwriting bug. You are _creating_ foo4, > and it sounds like you're creating it over the top of foo3 because of > a dirent search bug in DOSFS. Can you email me (this email address is > valid) a dd if=/dev/hdxx of=dump.img bs=512 count=yyyy from a card > where it is known to happen. (I want the MBR as well the filesystem). > Please gzip it first and obviously don't include anything proprietary.
Will send this direct shortly.
>>It's alomst as though DOSFS is refusing to expand the root directory >>beyond the unmber of entries that existed there in the first place. > > > Not really; it's more like the "find first free dirent" code in DOSFS > is incorrectly reporting the last used dirent as being the first free > one :)
Indeed.
>>Secondly, more of a comment / suggestion - I found that the file write >>time increased linearly with the number of allocated clusters on the > > Correct. Various strategies exist to work around this, and your > approach is one of them. In a system with enough RAM to cache the FAT, > your approach isn't ideal.
I was quite surprised by the extent of the increase. With 3MB files, the first would take about 1.5 sec to write, the 2nd about 2.5 sec, and so on - an extra second per file, just spent trawling the FAT. I'll need abit moer of a think about my cache replacement strategy and possibly optimise it for this workload - we do have plenty of RAM for caching in this app.
>>Can anybody confirm if this will do invalid things to the disk like >>leaving holes in the FAT, if there are file deletes in between writes? > > Sure it will. Hence you get fragmentation. Various MS-DOS versions do > the same thing. They also cache the FAT to improve performance > overall, but that wouldn't fit in DOSFS' RAM requirement.
OK. I don't think fragmentation is a killer in this case, sitting on top of solid state storage media with negligible (or rather, constant) seek time.
> PS: Qld, eh? I'm from Adelaide myself, though I lived most of my life > in Melbourne. I would kill for a Carlton Cold, it's unobtainable here > in the US. The only readily available "Australian" beer is Fosters, > brewed in Canada so they can call it an import.
Didn't realise you were an expat - yes the beer situation can get dire although the US microbrewery scene is a lot more vibrant (== viable) than here in Oz. Cheers, John
Hi again,

larwe wrote:

> On Jul 22, 8:19 pm, John Williams <jwilli...@itee.uq.edu.au> >
>>For example, if we pre-populate the filesystem with files foo1, foo2, >>and foo3, then start up our app and attempt to write foo1, foo2, foo3 >>and foo4 in that sequence, we end up with >> >>foo1, foo2, foo4 > > > This sounds like a dirent overwriting bug. You are _creating_ foo4, > and it sounds like you're creating it over the top of foo3 because of > a dirent search bug in DOSFS. Can you email me (this email address is > valid) a dd if=/dev/hdxx of=dump.img bs=512 count=yyyy from a card > where it is known to happen. (I want the MBR as well the filesystem). > Please gzip it first and obviously don't include anything proprietary.
OK I think I've fixed this one - in DFS_GetNext there are two successful exit paths - the first is around line 708, where we find a deleted entry and the DFS_DI_BLANKENT mask is set: if (dirent->name[0] == 0) { // no more files in this directory // If this is a "find blank" then we can reuse this name. if (dirinfo->flags & DFS_DI_BLANKENT) return DFS_OK; else return DFS_EOF; } There's a problem here - di->currententry is not incremented. So, when we return to the caller (DFS_OpenFile), it overwrites the next-to-last dirent, and that gives the behaviour I reported earlier today. Adding di->currententry++ in this exit path gets it all working as expected. Make sense? Is this a generic issue or am I triggering some corner case? Thanks, John