On Mon, Oct 31, 2022 at 03:43:24AM +0000, Matthew Wilcox wrote:
> On Sat, Oct 29, 2022 at 08:04:22AM +1100, Dave Chinner wrote:
> > As it is, we already have the capability for the mapping tree to
> > have multiple indexes pointing to the same folio - perhaps it's time
> > to start thinking about using filesystem blocks as the mapping tree
> > index rather than PAGE_SIZE chunks, so that the page cache can then
> > track dirty state on filesystem block boundaries natively and
> > this whole problem goes away. We have to solve this sub-folio dirty
> > tracking problem for multi-page folios anyway, so it seems to me
> > that we should solve the sub-page block size dirty tracking problem
> > the same way....
>
> That's an interesting proposal. From the page cache's point of
> view right now, there is only one dirty bit per folio, not per page.

Per folio, yes, but I thought we also had a dirty bit per index
entry in the mapping tree. Writeback code uses the
PAGECACHE_TAG_DIRTY mark to find the dirty folios efficiently (i.e.
the write_cache_pages() iterator), so it's not like this is
something new. i.e. we already have coherent, external dirty bit
tracking mechanisms outside the folio itself that filesystems use.

That's kinda what I'm getting at here - we already have coherent
dirty state tracking outside of the individual folios themselves.
Hence if we have to track sub-folio up-to-date state, sub-folio
dirty state and, potentially, sub-folio writeback state outside the
folio itself, why not do it by extending the existing coherent dirty
state tracking that is built into the mapping tree itself?

Folios + Xarray have given us the ability to disconnect the size of
the cached item at any given index from the index granularity - why
not extend that down to sub-page folio granularity in addition to
the scaling up we've been doing for large (multipage) folio
mappings?
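
To make the idea above concrete, here is a minimal userspace sketch,
assuming a toy mapping that is indexed by filesystem block rather than
by PAGE_SIZE chunk. It is not kernel code and does not use the real
XArray or page cache APIs: struct toy_folio, struct toy_mapping and
every toy_*() helper are hypothetical names invented for illustration.
Several block indexes point at the same folio, each index carries its
own dirty tag, and a writeback-style walk finds dirty runs from the
tags alone, loosely in the spirit of how write_cache_pages() follows
PAGECACHE_TAG_DIRTY.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_BLOCKS 64 /* hypothetical: blocks covered by the toy mapping */

/* Hypothetical stand-in for a folio: only records the block range it caches. */
struct toy_folio {
	size_t first_block;
	size_t nr_blocks;
};

/* Hypothetical stand-in for the mapping tree, indexed by filesystem block. */
struct toy_mapping {
	struct toy_folio *slot[TOY_NR_BLOCKS]; /* several indexes may share one folio */
	uint64_t dirty;                        /* one dirty tag per block index */
};

/* Point every block index covered by the folio at that folio. */
static int toy_mapping_add(struct toy_mapping *m, struct toy_folio *f)
{
	if (f->nr_blocks == 0 || f->first_block >= TOY_NR_BLOCKS ||
	    f->nr_blocks > TOY_NR_BLOCKS - f->first_block)
		return -1;
	for (size_t i = 0; i < f->nr_blocks; i++)
		m->slot[f->first_block + i] = f;
	return 0;
}

/* Tag blocks [first, first + nr) dirty; indexes with no folio are skipped. */
static void toy_mark_dirty(struct toy_mapping *m, size_t first, size_t nr)
{
	for (size_t b = first; b < TOY_NR_BLOCKS && b - first < nr; b++)
		if (m->slot[b])
			m->dirty |= (uint64_t)1 << b;
}

/* Test the dirty tag of a single block index. */
static bool toy_block_dirty(const struct toy_mapping *m, size_t b)
{
	return b < TOY_NR_BLOCKS && ((m->dirty >> b) & 1);
}

/*
 * Find the next run of contiguous dirty blocks at or after 'start' that all
 * belong to the same folio. Returns false when no dirty block remains.
 */
static bool toy_next_dirty_run(const struct toy_mapping *m, size_t start,
			       size_t *first, size_t *len)
{
	size_t b = start;

	while (b < TOY_NR_BLOCKS && !toy_block_dirty(m, b))
		b++;
	if (b >= TOY_NR_BLOCKS)
		return false;

	*first = b;
	while (b < TOY_NR_BLOCKS && toy_block_dirty(m, b) &&
	       m->slot[b] == m->slot[*first])
		b++;
	*len = b - *first;
	return true;
}

/* Clear the dirty tags for blocks [first, first + nr) after writeback. */
static void toy_clear_dirty(struct toy_mapping *m, size_t first, size_t nr)
{
	for (size_t b = first; b < TOY_NR_BLOCKS && b - first < nr; b++)
		m->dirty &= ~((uint64_t)1 << b);
}
```

A writeback loop over this toy would call toy_next_dirty_run()
repeatedly, write out each run, and then call toy_clear_dirty() on
it. The point of the sketch is only that the dirty state lives in the
index, so nothing has to be hung off the folio to track it.
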

Then we don't need any sort of filesystem specific "add-on" that
sits alongside the mapping tree that tries to keep track of dirty
state in addition to the folio and the mapping tree tracking that
already exists...

> We have a number of people looking at the analogous problem for network
> filesystems right now. Dave Howells' netfs infrastructure is trying
> to solve the problem for everyone (and he's been looking at iomap as
> inspiration for what he's doing). I'm kind of hoping we end up with one
> unified solution that can be used for all filesystems that want sub-folio
> dirty tracking. His solution is a bit more complex than I really want
> to see, at least partially because he's trying to track dirtiness at
> byte granularity, no matter how much pain that causes to the server.

Byte range granularity is probably overkill for block based
filesystems - all we need is a couple of extra bits per block to be
stored in the mapping tree alongside the folio....

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx