On Tue 03-06-14 15:37:39, Daniel Phillips wrote:
> On Tuesday, June 3, 2014 8:21:55 AM PDT, Jan Kara wrote:
> >On Tue 03-06-14 07:14:44, Christoph Hellwig wrote:
> >>On Tue, Jun 03, 2014 at 04:05:31PM +0200, Jan Kara wrote:
> >And I agree we went for per-bdi
> >flushing to avoid two threads congesting a single device leading to
> >suboptimal IO patterns during background writeback.
>
> A proposal is on the table to implement s_ops->writeback() as a per-sb
> operation in such a way that nothing changes in the current per-inode path.
> Good or bad approach?
  Having s_ops->writeback() is fine. But I just hate how you hook in that
callback into the writeback code (and Dave has expressed similar concerns).
The way you hook it up, filesystem still has to have one inode in the dirty
list so that __writeback_inodes_wb() can find that inode, get superblock from
it and ask for writeback to be done via callback. That's really ugly and
no-go for me.

> >So currently I'm convinced we want to go for per-sb dirty tracking. That
> >also makes some speedups in that code noticeably simpler. I'm not convinced
> >about the per-sb flushing thread - if we don't regress the multiple sb on
> >bdi case when we just let the threads from different superblocks contend
> >for IO, then that would be a natural thing to do. But once we have to
> >introduce some synchronization between threads to avoid regressions, I
> >think it might be easier to just stay with per-bdi thread which switches
> >between superblocks.
>
> Could you elaborate on the means of switching between superblocks? Do
> you mean a new fs-writeback path just for data=journal class filesystems,
> or are you suggesting changing the way all filesystems are driven?
  So I suggest changing the way all filesystems are driven to a per-sb one.
That makes sense for other reasons as well and allows clean incorporation of
your writeback callback (either a fs will use generic callback which would do
what generic writeback code does now, only with per-sb dirty lists instead
of current per-bdi ones, or the fs can do its own stuff). I can write
something (the switch isn't really that complex) but it will need quite some
testing to verify we don't regress somewhere...

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
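
For illustration only, the shape of the per-sb scheme discussed above (per-sb
dirty tracking, an optional per-sb writeback callback with a generic
fallback, and a single per-bdi flusher that switches between superblocks)
can be modelled in a few lines of userspace C. This is a toy sketch and not
kernel code: every name in it (toy_sb, toy_sb_ops, toy_bdi, toy_bdi_flush and
so on) is hypothetical, the per-sb dirty list is reduced to a page counter,
and the round-robin slice policy is an assumption, not something proposed in
this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Toy userspace model, NOT kernel code. All names are hypothetical. */
struct toy_sb;

struct toy_sb_ops {
	/* Optional per-sb hook; returns the number of pages written. */
	long (*writeback)(struct toy_sb *sb, long nr_pages);
};

struct toy_sb {
	const char *name;
	long dirty_pages;		/* stands in for a per-sb dirty list */
	const struct toy_sb_ops *ops;	/* NULL means "use generic path" */
	struct toy_sb *next;		/* next superblock on the same bdi */
};

struct toy_bdi {
	struct toy_sb *sbs;		/* superblocks sharing this device */
};

/* Generic path: write up to nr_pages from this sb's own dirty set. */
static long toy_generic_writeback(struct toy_sb *sb, long nr_pages)
{
	long n = sb->dirty_pages < nr_pages ? sb->dirty_pages : nr_pages;

	sb->dirty_pages -= n;
	return n;
}

/* Example of a fs doing "its own stuff": flush everything in one go. */
static long toy_journal_writeback(struct toy_sb *sb, long nr_pages)
{
	long n = sb->dirty_pages;

	(void)nr_pages;			/* this fs ignores the hint */
	sb->dirty_pages = 0;
	return n;
}

/* Dispatch per sb: fs callback if provided, generic code otherwise. */
static long toy_writeback_sb(struct toy_sb *sb, long nr_pages)
{
	if (sb->ops && sb->ops->writeback)
		return sb->ops->writeback(sb, nr_pages);
	return toy_generic_writeback(sb, nr_pages);
}

/*
 * One flusher per bdi: visit each sb in turn, one slice at a time, so
 * superblocks on the same device never issue IO concurrently.
 */
static long toy_bdi_flush(struct toy_bdi *bdi, long budget, long slice)
{
	long total = 0;
	int progress = 1;

	assert(slice > 0);
	while (budget > 0 && progress) {
		progress = 0;
		for (struct toy_sb *sb = bdi->sbs; sb && budget > 0;
		     sb = sb->next) {
			long want = slice < budget ? slice : budget;
			long done = toy_writeback_sb(sb, want);

			if (done > 0)
				progress = 1;
			total += done;
			budget -= done;
		}
	}
	return total;
}
```

Note that in this model the flusher finds work by walking superblocks
directly, so a superblock with its own callback does not need an inode on
any shared dirty list just to be discovered.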