On Wed, Nov 02, 2022 at 01:57:58AM -0700, Christoph Hellwig wrote:
> On Mon, Oct 31, 2022 at 10:27:16AM +0000, Matthew Wilcox wrote:
> > > Byte range granularity is probably overkill for block based
> > > filesystems - all we need is a couple of extra bits per block to be
> > > stored in the mapping tree alongside the folio....
> >
> > I think it's overkill for network filesystems too. By sending a
> > sector-misaligned write to the server, you force the server to do a
> > R-M-W before it commits the write to storage. That assumes the file
> > has fallen out of the server's cache, and a sufficiently busy server
> > probably doesn't have the memory capacity for the working set of all
> > of its clients.
>
> That really depends on your server. For NFS there are definitely
> servers that can deal with unaligned writes fairly well because they
> just log the data in non-volatile memory. That being said, I'm not
> sure it's really worth optimizing the Linux page cache for that
> particular use case.
>
> > Anyway, Dave's plan for dirty tracking (as I understand the current
> > iteration) is to not store it linked from folio->private at all, but
> > to store it in a per-file tree of writes. Then we wouldn't walk the
> > page cache looking for dirty folios, but walk the tree of writes
> > choosing which ones to write back and delete from the tree. I don't
> > know how this will perform in practice, but it'll be generic enough
> > to work for any filesystem.
>
> Yes, this would be generic. But having multiple tracking trees might
> not be super optimal - it always reminds me of the btrfs I/O code
> that is lost in a maze of trees and performs rather suboptimally.

Yep, that's kinda what I'm trying to see if we can avoid....

-Dave.
--
Dave Chinner
david@xxxxxxxxxxxxx
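[Editor's note: the "per-file tree of writes" idea discussed above can be sketched in userspace. This is a hypothetical illustration, not kernel code - the class and method names are invented. Each file keeps a sorted, merged set of dirty byte ranges; writeback walks that set instead of scanning the page cache for dirty folios.]

```python
import bisect

class DirtyRangeTree:
    """Illustrative per-file dirty tracking: a sorted list of
    non-overlapping [start, end) byte ranges, merged on insert."""

    def __init__(self):
        self._ranges = []  # sorted, non-overlapping (start, end) tuples

    def record_write(self, start, end):
        """Record a dirty range [start, end), merging with any
        overlapping or adjacent ranges already in the tree."""
        i = bisect.bisect_left(self._ranges, (start,))
        # Step back if the previous range touches the new one.
        if i > 0 and self._ranges[i - 1][1] >= start:
            i -= 1
        # Absorb every range that overlaps or abuts [start, end).
        while i < len(self._ranges) and self._ranges[i][0] <= end:
            s, e = self._ranges.pop(i)
            start = min(start, s)
            end = max(end, e)
        self._ranges.insert(i, (start, end))

    def take_dirty(self):
        """Writeback path: hand back all dirty ranges and clear the
        tree, so each range is written back exactly once."""
        ranges, self._ranges = self._ranges, []
        return ranges

t = DirtyRangeTree()
t.record_write(0, 4096)
t.record_write(8192, 12288)
t.record_write(2048, 9000)   # bridges the two earlier ranges
print(t.take_dirty())        # one merged range covering all writes
```

A real kernel implementation would use an rbtree or maple tree under the inode's lock rather than a Python list, but the shape of the idea is the same: dirty state lives beside the file, not beside each folio.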