On Monday 16 March 2009 09:41:35 Daniel Phillips wrote:
> Hi Ted,
>
> > So the really unfortunate thing about allocating the block as soon as
> > the page is dirty is that it spikes out delayed allocation. By
> > delaying the physical allocation of the logical->physical mapping as
> > long as possible, the filesystem can select the best possible physical
> > location.
>
> Tux3 does not dirty the metadata until data cache is flushed, so the
> allocation decisions for data and metadata are made at the same time.
> That is the reason for the distinction between physical metadata above,
> and logical metadata such as directory data and bitmaps, which are
> delayed. Though physical metadata is positioned when first dirtied,
> physical metadata dirtying is delayed until delta commit.
>
> Implementing this model (we are still working on it) requires taking
> care of a lot of subtle details that are specific to the Tux3 cache
> model. I have a hard time imagining those allocation decisions driven
> by callbacks from a buffer-like library.

The filesystem can get pagecache-block-dirty events in a few ways (often
a combination of): write_begin/write_end, set_page_dirty, page_mkwrite,
etc.

Short of implementing your own write path entirely (and even then you
need to hook at least page_mkwrite to catch mmapped writes, for
completeness), I don't see why a get_block(BLOCK_DIRTY) kind of callback
is much harder for you to imagine than any of the other callbacks.
Actually, I imagine the block-based callback should be easier for
filesystems that support any block size != page size, because all the
others are page based.

I would definitely like to hear firm details about any problems, because
I would like to try to make it more generic even if your filesystem
won't use it :)

Now this is not to say the current buffer APIs are totally _optimal_.
As I said, I would like to see at least something along the lines of "we are about to dirty range (x,y)" callback in the higher level generic write code. But that's another story (which I am planning to get to).
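
To make the block-based idea a bit more concrete, here is a rough userspace sketch (not kernel code, and not an existing API). It assumes a hypothetical get_block-style callback taking a BLOCK_DIRTY request, and a hypothetical generic helper that is told "we are about to dirty range (pos, len)" and notifies the filesystem once per affected block. The names notify_dirty_range, get_block_fn, dirty_log and the 1 KiB block size are all made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block size for illustration: 1 KiB blocks (smaller than a 4 KiB page). */
enum { BLOCK_SIZE_BYTES = 1024 };

/* Hypothetical request type, named after the get_block(BLOCK_DIRTY) idea. */
enum block_request { BLOCK_DIRTY };

/* Hypothetical per-block callback supplied by the filesystem. */
typedef void (*get_block_fn)(void *fs_private, unsigned long block,
                             enum block_request req);

/*
 * Generic-layer helper (hypothetical): a write is about to dirty bytes
 * [pos, pos + len) of a file. Call the filesystem once per affected
 * block and return the number of blocks reported.
 */
static unsigned long notify_dirty_range(get_block_fn get_block, void *fs_private,
                                        unsigned long long pos, size_t len)
{
    assert(len > 0);
    unsigned long first = (unsigned long)(pos / BLOCK_SIZE_BYTES);
    unsigned long last = (unsigned long)((pos + len - 1) / BLOCK_SIZE_BYTES);

    for (unsigned long b = first; b <= last; b++)
        get_block(fs_private, b, BLOCK_DIRTY);
    return last - first + 1;
}

/* Example filesystem side: just record which blocks were reported dirty. */
struct dirty_log {
    unsigned long blocks[16];
    size_t count;
};

static void log_block_dirty(void *fs_private, unsigned long block,
                            enum block_request req)
{
    struct dirty_log *log = fs_private;

    if (req == BLOCK_DIRTY && log->count < 16)
        log->blocks[log->count++] = block;
}
```

With these assumed sizes, a write of 1000 bytes at offset 1500 reports blocks 1 and 2 only, whereas a page-based hook would only say that page 0 was dirtied and leave the filesystem to work out which blocks inside it matter.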