
Multithreaded disk access

Started by Don Y October 15, 2021
On 10/25/2021 5:23 PM, Richard Damon wrote:
> On 10/25/21 7:19 AM, Don Y wrote:
>> On 10/18/2021 1:46 PM, Don Y wrote:
>>>> I think this will give you plenty of an idea how to go about it.
>>>> Once you know the limit you can run at some reasonable figure
>>>> below it and be happy. Getting more precise figures about all
>>>> that is neither easy nor will it buy you anything.
>>>
>>> I suspect "1" is going to end up as the "best compromise". So,
>>> I'm treating this as an exercise in *validating* that assumption.
>>> I'll see if I can salvage some of the performance monitoring code
>>> from the sanitizer to give me details from which I might be able
>>> to ferret out "opportunities". If I start by restricting my
>>> observations to non-destructive synthetic loads, then I can
>>> pull a drive and see how it fares in a different host while
>>> running the same code, etc.
>>
>> Actually, '2' turns out to be marginally better than '1'.
>> Beyond that, its hard to generalize without controlling some
>> of the other variables.
>>
>> '2' wins because there is always the potential to make the
>> disk busy, again, just after it satisfies the access for
>> the 1st thread (which is now busy using the data, etc.)
>>
>> But, if the first thread finishes up before the second thread's
>> request has been satisfied, then the presence of a THIRD thread
>> would just be clutter. (i.e., the work performed correlates
>> with the number of threads that can have value)
>
> Actually, 2 might be slower than 1, because the new request from the second
> thread is apt to need a seek, while a single thread making all the calls is
> more apt to sequentially read much more of the disk.
Second thread is another instance of first thread; same code, same data. So, it is just "looking ahead" -- i.e., it WILL do what the first thread WOULD do if the second thread hadn't gotten to it, first!

The strategy is predicated on tightly coupled actors so they each can see what the other has done/is doing.

If the layout of the object that thread 1 is processing calls for a seek, then thread 2 will perform that seek as if thread 1 were to do it "when done chewing on the past data". If the disk caches data, then thread 2 will reap the benefits of that cache AS IF it was thread 1 acting later. Etc.

A second thread just hides the cost of the data processing.
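
[A minimal pthreads sketch of that "second instance" idea, assuming the work is a file consumed in fixed-size blocks claimed from a shared counter; the block size, the file argument and the crunch() placeholder are illustrative assumptions, not anything taken from the thread:]

#include <fcntl.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK_SIZE (1024 * 1024)   /* 1 MiB per request (assumption) */

static atomic_long next_block;     /* the "tight coupling": both instances
                                      claim their next unit of work here */

static void crunch(const unsigned char *buf, ssize_t n)
{
    /* placeholder for whatever per-block processing the real code does */
    (void)buf; (void)n;
}

static void *worker(void *arg)
{
    int fd = open((const char *)arg, O_RDONLY);
    if (fd < 0) { perror("open"); return NULL; }

    unsigned char *buf = malloc(BLOCK_SIZE);
    if (buf == NULL) { close(fd); return NULL; }

    for (;;) {
        long blk = atomic_fetch_add(&next_block, 1);   /* claim next block */
        ssize_t n = pread(fd, buf, BLOCK_SIZE, (off_t)blk * BLOCK_SIZE);
        if (n <= 0)
            break;                 /* EOF or error: this instance stops */
        crunch(buf, n);            /* disk is free for the other instance
                                      while this one chews on its data */
    }
    free(buf);
    close(fd);
    return NULL;
}

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <file>\n", argv[0]); return 1; }

    pthread_t t[2];                /* "2" identical instances, per the post */
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, argv[1]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return 0;
}

[With two instances, whenever one is inside crunch() the other is typically already blocked in pread(), which is the overlap being described; once the processing is short enough that the first instance finishes before the second's read completes, extra instances stop adding anything.]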
> The controller, if not given a new chained command, might choose to
> automatically start reading the next sector of the cylinder, which could be
> likely the next one asked for.
>
> The real optimum is likely a single process doing asynchronous requests queuing
> up a series of requests, and then distributing the data as it comes in to
> processing threads to do what ever crunching needs to be done.
>
> These threads then send the data to a single thread that does asynchronous
> writes of the results.
>
> You can easily tell if the input or output processes are I/O bound or not and
> use that to adjust the number of crunching threads in the middle.
>
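
[For contrast, a rough sketch of the pipeline described in that quote (one reader issuing requests, a pool of crunchers, one writer), reduced to the bounded queue that would connect the stages; the queue depth, the NCRUNCH pool size and the thread bodies summarized in the closing comment are assumptions for illustration:]

#include <pthread.h>
#include <stdbool.h>

#define QDEPTH  8      /* buffers in flight between stages (assumption) */
#define NCRUNCH 3      /* size of the crunching pool (assumption) */

struct item { void *buf; long blk; };   /* one block of data in flight */

struct queue {
    struct item slot[QDEPTH];
    int head, tail, count;
    bool done;                          /* producer finished; drain and stop */
    pthread_mutex_t m;
    pthread_cond_t notfull, notempty;
};

void q_init(struct queue *q)
{
    q->head = q->tail = q->count = 0;
    q->done = false;
    pthread_mutex_init(&q->m, NULL);
    pthread_cond_init(&q->notfull, NULL);
    pthread_cond_init(&q->notempty, NULL);
}

void q_put(struct queue *q, struct item it)
{
    pthread_mutex_lock(&q->m);
    while (q->count == QDEPTH)               /* back-pressure on the producer */
        pthread_cond_wait(&q->notfull, &q->m);
    q->slot[q->tail] = it;
    q->tail = (q->tail + 1) % QDEPTH;
    q->count++;
    pthread_cond_signal(&q->notempty);
    pthread_mutex_unlock(&q->m);
}

bool q_get(struct queue *q, struct item *it)
{
    pthread_mutex_lock(&q->m);
    while (q->count == 0 && !q->done)
        pthread_cond_wait(&q->notempty, &q->m);
    if (q->count == 0) {                     /* drained and producer is done */
        pthread_mutex_unlock(&q->m);
        return false;
    }
    *it = q->slot[q->head];
    q->head = (q->head + 1) % QDEPTH;
    q->count--;
    pthread_cond_signal(&q->notfull);
    pthread_mutex_unlock(&q->m);
    return true;
}

void q_close(struct queue *q)                /* producer: "no more items" */
{
    pthread_mutex_lock(&q->m);
    q->done = true;
    pthread_cond_broadcast(&q->notempty);
    pthread_mutex_unlock(&q->m);
}

/* Thread bodies, omitted here:
 *   reader    pread()s blocks in order and q_put()s them on a `raw` queue
 *   crunchers (NCRUNCH of them) q_get() from `raw`, process, q_put() on `cooked`
 *   writer    q_get()s from `cooked` and writes the results out
 * How often the reader blocks in q_put() versus how often the writer blocks
 * in q_get() shows whether the pipeline is I/O bound or compute bound, which
 * is the knob for adjusting NCRUNCH, as the quote suggests. */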
Dimiter_Popoff <dp@tgi-sci.com> wrote:
>
> Now this is where the worst fit allocation strategy becomes the game
> changer. A newly created file is almost certainly contiguously
> allocated; fragmentation occurs when it is appended to (and the
> space past its last block has been already allocated in the meantime).
> I think I saw somewhere (never really got interested) that mainstream
> operating systems of today do just first fit - which means once you
> delete a file, no matter how small, its space will be allocated as
> part of the next request etc., no idea why they do it (if they do so,
> my memory on that is not very certain) in such a primitive manner
> but here they are.
There was some research and first fit turned out to be pretty good. But it is implemented slightly differently: there is a moving pointer, and the search advances it. So, after a deallocation you wait until the moving pointer arrives at the hole. That is way better than immediately filling the hole: there is a reasonable chance that holes will coalesce into a bigger free area. Another point is that you refuse allocation when the disc is too full (say at 95% utilization). The two things together mean that normally fragmentation is not a problem.

--
Waldek Hebisch
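
[A toy sketch of that roving-pointer ("next fit") scheme over a one-flag-per-block map; the 95% cutoff is from the post, while the map size and function names are made up for illustration:]

#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS 100000              /* size of the toy "disk" (assumption) */

static bool   used[NBLOCKS];        /* one flag per block: true = allocated */
static size_t cursor;               /* roving pointer: next search starts here */
static size_t nused;                /* running count of allocated blocks */

/* Allocate one block; return its number, or -1 if the disk is "too full". */
long alloc_block(void)
{
    if (nused * 100 >= (size_t)NBLOCKS * 95)
        return -1;                  /* refuse above ~95% utilization */

    for (size_t scanned = 0; scanned < NBLOCKS; scanned++) {
        size_t i = (cursor + scanned) % NBLOCKS;
        if (!used[i]) {
            used[i] = true;
            nused++;
            cursor = (i + 1) % NBLOCKS;   /* keep moving forward; do not jump
                                             back to freshly freed holes */
            return (long)i;
        }
    }
    return -1;                      /* nothing free */
}

void free_block(long i)
{
    used[i] = false;    /* the cursor is NOT moved back, so this hole gets a
                           chance to coalesce with neighbours before reuse */
    nused--;
}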
On 10/31/2021 21:21, antispam@math.uni.wroc.pl wrote:
> Dimiter_Popoff <dp@tgi-sci.com> wrote:
>>
>> Now this is where the worst fit allocation strategy becomes the game
>> changer. A newly created file is almost certainly contiguously
>> allocated; fragmentation occurs when it is appended to (and the
>> space past its last block has been already allocated in the meantime).
>> I think I saw somewhere (never really got interested) that mainstream
>> operating systems of today do just first fit - which means once you
>> delete a file, no matter how small, its space will be allocated as
>> part of the next request etc., no idea why they do it (if they do so,
>> my memory on that is not very certain) in such a primitive manner
>> but here they are.
>
> There was some research and first fit turned out to be pretty good.
> But it is implemented slightly differently: there is moving pointer,
> search advances it. So, after deallocation you wait till moving
> pointer arrives to the hole. Way better than immediately filling
> hole: there is reasonable chance that holes will coalece into
> bigger free area. Another point is that you refuse allocation
> when disc is too full (say at 95% utilization). The two things
> together mean that normally fragmentation is not a problem.
>
This might be somewhat better than plain first fit but it is no match for worst fit. They probably don't do worst fit because they would have huge amounts of memory to dig through, due to poor filesystem design (they have to go through 32 bits or more for each cluster). In DPS the CAT is one bit per cluster, which makes worst fit doable without any problems like that.

I enhanced it a little (many years ago) after noticing that the partition could come to a point where it has two almost equally sized empty parts; you append to a file and it gets some clusters from one of these; the next time you append to it, the *other* part is the largest, so it takes from that one... :). So now DPS first checks whether there are empty clusters immediately following those already allocated to the file; if there are none, it does worst fit.
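
[A toy sketch of how that could look, assuming a one-bit-per-cluster map and the "extend in place first, else worst fit" order described above; the names and sizes are invented and the real DPS code certainly differs:]

#include <stddef.h>
#include <stdint.h>

#define NCLUSTERS 65536                       /* toy partition size (assumption) */

static uint8_t cat[NCLUSTERS / 8];            /* 1 bit per cluster, as described */

static int  cat_get(size_t i) { return (cat[i >> 3] >> (i & 7)) & 1; }
static void cat_set(size_t i, int bit)
{
    if (bit) cat[i >> 3] |=  (uint8_t)(1u << (i & 7));
    else     cat[i >> 3] &= (uint8_t)~(1u << (i & 7));
}

/* Allocate one cluster for a file whose current last cluster is `last`
 * (-1 for a brand new file).  Returns the new cluster number, or -1. */
long alloc_cluster(long last)
{
    /* 1. Prefer to extend the file in place, keeping it contiguous. */
    if (last >= 0 && last + 1 < NCLUSTERS && !cat_get((size_t)last + 1)) {
        cat_set((size_t)last + 1, 1);
        return last + 1;
    }

    /* 2. Otherwise worst fit: find the largest run of free clusters and
     *    take from its beginning. */
    size_t best_start = 0, best_len = 0, run_start = 0, run_len = 0;
    for (size_t i = 0; i < NCLUSTERS; i++) {
        if (!cat_get(i)) {
            if (run_len == 0)
                run_start = i;
            run_len++;
            if (run_len > best_len) {
                best_len = run_len;
                best_start = run_start;
            }
        } else {
            run_len = 0;
        }
    }
    if (best_len == 0)
        return -1;                            /* nothing free at all */
    cat_set(best_start, 1);
    return (long)best_start;
}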