On Thursday 28 February 2008, Jan Kara wrote:
> > On Thursday 28 February 2008, Ric Wheeler wrote:
> >
> > [ fsync batching can be slow ]
> >
> > > One more thought - what we really want here is to have a sense of the
> > > latency of the device. In the S-ATA disk case, this optimization works
> > > well for batching since we "spend" an extra 4ms worst case in the
> > > chance of combining multiple, slow 18ms operations.
> > >
> > > With the clariion box we tested, the optimization fails badly since the
> > > cost is only 1.3 ms so we optimize by waiting 3-4 times longer than it
> > > would take to do the operation immediately.
> > >
> > > This problem has also seemed to me to be the same problem that IO
> > > schedulers do with plugging - we want to dynamically figure out when to
> > > plug and unplug here without hard coding in device specific tunings.
> > >
> > > If we bypass the snippet for multi-threaded writers, we would probably
> > > slow down this workload on normal S-ATA/ATA drives (or even higher
> > > performance non-RAID disks).
> >
> > It probably makes sense to keep track of the average number of writers we
> > are able to gather into a transaction. There are lots of similar
> > workloads where we have a pool of procs doing fsyncs and the size of the
> > transaction or the number of times we joined a running transaction will
> > be fairly constant.
>
> I'm probably missing something, but what are you trying to say? Either we
> wait for writers and the number of writes is higher, or we don't wait and
> the number of writes in a transaction is lower...

The common workload would be N mail server threads servicing incoming
requests at a fairly constant rate. Right now we sleep for a bit and
wait for the number of writers to increase. My guess is that if we
record the average number of times a writer joins an existing
transaction, or if we record the average size of the transactions,
we'll end up with a fairly constant number.
So, we can skip the sleep if the transaction has already grown close to
that number. This would avoid the latencies Ric is seeing.

-chris