On Mon, Oct 31, 2022 at 10:27:16AM +0000, Matthew Wilcox wrote:
> > Byte range granularity is probably overkill for block based
> > filesystems - all we need is a couple of extra bits per block to be
> > stored in the mapping tree alongside the folio....
>
> I think it's overkill for network filesystems too.  By sending a
> sector-misaligned write to the server, you force the server to do a R-M-W
> before it commits the write to storage.  Assuming that the file has fallen
> out of the server's cache, and a sufficiently busy server probably doesn't
> have the memory capacity for the working set of all of its clients.

That really depends on your server.  For NFS there are definitely servers
that can deal with unaligned writes fairly well because they just log the
data in non-volatile memory.  That being said, I'm not sure it is really
worth optimizing the Linux pagecache for that particular use case.

> Anyway, Dave's plan for dirty tracking (as I understand the current
> iteration) is to not store it linked from folio->private at all, but to
> store it in a per-file tree of writes.  Then we wouldn't walk the page
> cache looking for dirty folios, but walk the tree of writes choosing
> which ones to write back and delete from the tree.  I don't know how
> this will perform in practice, but it'll be generic enough to work for
> any filesystem.

Yes, this would be generic.  But having multiple tracking trees might not
be super optimal - it always reminds me of the btrfs I/O code, which is
lost in a maze of trees and performs rather suboptimally.
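
FWIW, a rough userspace sketch of the per-file "tree of writes" idea
being discussed, purely illustrative: all the names are made up, a
sorted list stands in for whatever rbtree/interval tree a real
implementation would use, and there is no locking.  The point is just
that writeback walks the recorded ranges directly instead of scanning
the pagecache for dirty folios:

#include <stdio.h>
#include <stdlib.h>

struct dirty_range {
        unsigned long long start;       /* byte offset, inclusive */
        unsigned long long end;         /* byte offset, exclusive */
        struct dirty_range *next;       /* sorted by ->start */
};

/* Record a write, merging with any overlapping or adjacent ranges. */
static void record_write(struct dirty_range **head,
                         unsigned long long pos, unsigned long long len)
{
        struct dirty_range **p = head, *r;

        /* Skip ranges that end strictly before this write begins. */
        while (*p && (*p)->end < pos)
                p = &(*p)->next;

        if (*p && (*p)->start <= pos + len) {
                /* Overlaps or abuts: extend, then absorb followers. */
                r = *p;
                if (pos < r->start)
                        r->start = pos;
                if (pos + len > r->end)
                        r->end = pos + len;
                while (r->next && r->next->start <= r->end) {
                        struct dirty_range *victim = r->next;

                        if (victim->end > r->end)
                                r->end = victim->end;
                        r->next = victim->next;
                        free(victim);
                }
        } else {
                /* Disjoint: insert a new range, keeping the list sorted. */
                r = malloc(sizeof(*r));
                r->start = pos;
                r->end = pos + len;
                r->next = *p;
                *p = r;
        }
}

/* Writeback walks the tracked ranges and deletes them as it goes. */
static void writeback_all(struct dirty_range **head)
{
        while (*head) {
                struct dirty_range *r = *head;

                printf("writeback [%llu, %llu)\n", r->start, r->end);
                *head = r->next;
                free(r);
        }
}

int main(void)
{
        struct dirty_range *dirty = NULL;

        record_write(&dirty, 4096, 512);        /* sub-block write */
        record_write(&dirty, 4608, 512);        /* adjacent: merges */
        record_write(&dirty, 100, 50);          /* disjoint range */
        writeback_all(&dirty);
        return 0;
}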